Testing Workflow

This document describes the testing workflow in the Engineering AI Agent system, focusing on how tests are designed, implemented, and executed to ensure code quality and reliability.

Overview

The testing workflow encompasses several stages:

  1. Test planning and strategy
  2. Test implementation
  3. Test execution and automation
  4. Bug identification and reporting
  5. Verification and regression testing

Workflow Stages

1. Test Planning and Strategy

Primary Roles: RD (Research & Development), QA (Quality Assurance)

Before implementing tests, the agents:

  • Review requirements to identify test scenarios
  • Determine appropriate test types (unit, integration, etc.)
  • Define test scope and coverage goals
  • Establish pass/fail criteria
  • Identify edge cases and error conditions

Example test plan for authentication system:

Test Plan: User Authentication System

Test Types:
- Unit tests for individual components
- Integration tests for API endpoints
- Security tests for authentication flow

Test Scenarios:
1. Successful login with valid credentials
2. Failed login with invalid credentials
3. Password reset flow
4. Account lockout after multiple failed attempts
5. Token expiration handling
6. JWT token validation
7. Password strength validation

Coverage Goals:
- 90% line coverage for authentication code
- 100% coverage for security-critical functions

2. Test Implementation

Primary Role: RD (Research & Development)

The RD agent implements tests following best practices:

  • Writes clear, deterministic tests
  • Uses descriptive test names that explain behavior
  • Follows the Arrange-Act-Assert pattern
  • Implements appropriate mocks and test doubles
  • Organizes tests in a logical structure

Example test implementation:

import pytest
from unittest.mock import patch
from app.auth import AuthService
from app.exceptions import InvalidCredentialsError


class TestAuthService:
    @pytest.fixture
    def auth_service(self):
        return AuthService()

    def test_login_success(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "correct_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Valid hash for correct_password
                "role": "user"
            }

            # Act
            result = auth_service.authenticate(username, password)

            # Assert
            assert result["authenticated"] is True
            assert "token" in result
            assert result["user_id"] == 123

    def test_login_invalid_credentials(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "wrong_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Hash doesn't match wrong_password
                "role": "user"
            }

            # Act & Assert
            with pytest.raises(InvalidCredentialsError):
                auth_service.authenticate(username, password)

3. Test Execution and Automation

Primary Roles: RD (Research & Development), CI/CD Pipeline

Tests are executed:

  • Locally during development
  • Automatically when code is pushed to repositories
  • In CI/CD pipelines before deployment
  • As scheduled regression tests

The automation process:

  1. Tests run when code is pushed to a branch
  2. Test results determine whether a pull request can be merged
  3. Coverage reports are generated
  4. Test metrics are tracked over time

Example CI configuration in GitHub Actions:

name: Test Suite

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          pip install pytest pytest-cov
      - name: Run tests with coverage
        run: |
          pytest --cov=app --cov-report=xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3

4. Bug Identification and Reporting

Primary Role: QA (Quality Assurance)

When tests reveal bugs, the QA agent:

  • Documents the issue with steps to reproduce
  • Classifies the bug by severity and priority
  • Creates a ticket in the project management system
  • Assigns the ticket to the appropriate developer
  • Tracks bug resolution status

Example bug report:

Bug Report: User Authentication Failure

Severity: High
Priority: High

Description:
Users are unable to log in when the username contains special characters.

Steps to Reproduce:
1. Create a user with username "test.user@example.com"
2. Attempt to log in with this username and correct password
3. System returns "Invalid credentials" error

Expected Behavior:
Login should succeed with correct credentials regardless of special characters in username

Actual Behavior:
Login fails with "Invalid credentials" even when password is correct

Logs:
[Attached relevant logs showing error]

Affected Components:
- AuthenticationController
- UserRepository.findByUsername method

5. Verification and Regression Testing

Primary Roles: RD (Research & Development), QA (Quality Assurance)

After bugs are fixed:

  • Tests are written to verify the fix
  • Regression tests ensure other features aren't broken
  • Fixed bugs are tested in multiple environments
  • The automated test suite is updated to include the new regression case

Example verification workflow:

  1. Developer fixes the bug
  2. Developer adds a test case demonstrating the fix (see the sketch below)
  3. CI pipeline verifies the fix
  4. QA agent validates in test environment
  5. The fix is approved for production
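
For example, step 2 above might produce a regression test like the following sketch. It reuses the AuthService and mocking pattern from the earlier unit-test example and targets the special-character username bug from the sample report; the user id and password hash are placeholders.

import pytest
from unittest.mock import patch
from app.auth import AuthService


class TestSpecialCharacterUsernames:
    @pytest.fixture
    def auth_service(self):
        return AuthService()

    def test_login_succeeds_with_special_characters_in_username(self, auth_service):
        # Arrange: a username containing dots and an @ sign, as in the bug report
        username = "test.user@example.com"
        password = "correct_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 456,  # placeholder id
                "username": username,
                "password_hash": "$2b$12$...",  # assumed to match correct_password
                "role": "user"
            }

            # Act
            result = auth_service.authenticate(username, password)

            # Assert: authentication succeeds despite the special characters
            assert result["authenticated"] is True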

Test Types and Coverage

Unit Testing

  • Tests individual components in isolation
  • Mocks external dependencies
  • Focuses on function/method-level behavior
  • Aims for high coverage (90%+)
  • Fast execution (milliseconds)

Integration Testing

  • Tests interactions between components
  • Focuses on API contracts and data flow
  • May use test databases or mock services
  • Tests entire features or workflows
  • Medium execution speed (seconds)
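
As an illustration, a minimal integration-test sketch is shown below. It assumes the application exposes a FastAPI app object, a /auth/login route, and a test database seeded with valid_user; the module path, route, and credentials are assumptions, and any framework with an HTTP test client would work the same way.

import pytest
from fastapi.testclient import TestClient
from app.main import app  # hypothetical application entry point


@pytest.fixture
def client():
    # Drives the real routing, validation, and authentication layers together
    return TestClient(app)


def test_login_endpoint_returns_token_for_valid_credentials(client):
    response = client.post(
        "/auth/login",
        json={"username": "valid_user", "password": "correct_password"},
    )
    assert response.status_code == 200
    assert "token" in response.json()


def test_login_endpoint_rejects_invalid_credentials(client):
    response = client.post(
        "/auth/login",
        json={"username": "valid_user", "password": "wrong_password"},
    )
    assert response.status_code == 401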

End-to-End Testing

  • Tests complete user workflows
  • Simulates real user interactions
  • Tests across multiple systems
  • Verifies business requirements
  • Slower execution (minutes)
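
A hedged sketch of an API-level end-to-end test follows. The base URL, endpoints, and test account are assumptions; the point is that the test exercises a deployed environment exactly as a real client would, covering the full login-then-access workflow.

import os
import requests

# Assumed environment variable pointing at a deployed test environment
BASE_URL = os.environ.get("E2E_BASE_URL", "https://staging.example.com")


def test_user_can_log_in_and_access_profile():
    # Step 1: authenticate through the public API, as a real client would
    login = requests.post(
        f"{BASE_URL}/auth/login",
        json={"username": "e2e_user", "password": "e2e_password"},
        timeout=10,
    )
    assert login.status_code == 200
    token = login.json()["token"]

    # Step 2: use the issued token to reach a protected resource
    profile = requests.get(
        f"{BASE_URL}/users/me",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    assert profile.status_code == 200
    assert profile.json()["username"] == "e2e_user"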

Performance Testing

  • Tests system under load
  • Measures response times and throughput
  • Identifies bottlenecks
  • Establishes performance baselines
  • Typically run on schedule or before releases
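
For instance, a load-test sketch using Locust might look like the following; the endpoint and credentials are assumptions.

from locust import HttpUser, task, between


class AuthenticationLoadTest(HttpUser):
    # Simulated users pause 1-3 seconds between requests
    wait_time = between(1, 3)

    @task
    def login(self):
        # Locust records response times and throughput for this request
        self.client.post(
            "/auth/login",
            json={"username": "load_test_user", "password": "load_test_password"},
        )

Such a test would typically be pointed at a staging environment on a schedule or before a release, with the measured response times and throughput compared against established baselines.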

Best Practices

Test Structure

  • Follow the Arrange-Act-Assert pattern
  • One assertion or concept per test
  • Descriptive test names (test_should_deny_access_when_token_expired)
  • Shared setup in fixtures or setup methods
  • Clean isolation between tests
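
Put together, these conventions look roughly like the sketch below; TokenValidator and TokenExpiredError are illustrative names, not components defined elsewhere in this document.

import pytest
from app.auth import TokenValidator            # illustrative component
from app.exceptions import TokenExpiredError   # illustrative exception


@pytest.fixture
def validator():
    # Shared setup lives in a fixture rather than in each test
    return TokenValidator()


def test_should_deny_access_when_token_expired(validator):
    # Arrange: a token whose expiry timestamp is in the past (placeholder value)
    expired_token = "expired.jwt.token"

    # Act & Assert: the test verifies exactly one concept
    with pytest.raises(TokenExpiredError):
        validator.validate(expired_token)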

Test Reliability

  • Avoid flaky tests with intermittent failures
  • Don't rely on execution order
  • Clean up test data after execution
  • Use stable test doubles instead of actual services
  • Set explicit timeouts for async operations
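
Two of these practices, cleaning up test data and setting explicit timeouts for async operations, are sketched below. The helper functions and the notification module are hypothetical, and the async test assumes the pytest-asyncio plugin is installed.

import asyncio
import pytest
from app.testing import create_test_user, delete_test_user  # hypothetical helpers


@pytest.fixture
def temp_user():
    # Create isolated test data and always clean it up, even if the test fails
    user = create_test_user(username="reliability_test_user")
    yield user
    delete_test_user(user["id"])


@pytest.mark.asyncio
async def test_welcome_email_is_sent_within_timeout(temp_user):
    from app.notifications import send_welcome_email  # hypothetical async operation

    # Explicit timeout so a hung dependency fails fast instead of stalling the suite
    await asyncio.wait_for(send_welcome_email(temp_user["id"]), timeout=5)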

Test Coverage

  • Aim for high code coverage (lines, branches)
  • Focus on critical path and edge cases
  • Don't just test the happy path
  • Include error handling and edge conditions
  • Balance coverage goals with maintenance costs

Example Workflow

  1. Feature Development:

    • RD agent creates tests for new feature
    • Implements feature until tests pass
    • Ensures adequate test coverage
  2. Continuous Integration:

    • Tests run automatically on push
    • Coverage reports generated
    • Merging of the pull request is blocked if tests fail
  3. Bug Detection:

    • Tests identify a regression
    • QA agent creates a bug report
    • RD agent assigned to fix the issue
  4. Bug Resolution:

    • RD agent adds a test case reproducing the bug
    • Fixes the code until the test passes
    • PR created with the fix and test case
  5. Verification:

    • Automated tests verify the fix
    • QA agent confirms resolution
    • Bug marked as fixed