Testing Workflow

This document describes the testing workflow in the Engineering AI Agent system, focusing on how tests are designed, implemented, and executed to ensure code quality and reliability.

Overview

The testing workflow encompasses several stages:

  1. Test planning and strategy
  2. Test implementation
  3. Test execution and automation
  4. Bug identification and reporting
  5. Verification and regression testing

Workflow Stages

1. Test Planning and Strategy

Primary Roles: RD (Research & Development), QA (Quality Assurance)

Before implementing tests, the agents:

  • Review requirements to identify test scenarios
  • Determine appropriate test types (unit, integration, etc.)
  • Define test scope and coverage goals
  • Establish pass/fail criteria
  • Identify edge cases and error conditions

Example test plan for authentication system:

Test Plan: User Authentication System

Test Types:
- Unit tests for individual components
- Integration tests for API endpoints
- Security tests for authentication flow

Test Scenarios:
1. Successful login with valid credentials
2. Failed login with invalid credentials
3. Password reset flow
4. Account lockout after multiple failed attempts
5. Token expiration handling
6. JWT token validation
7. Password strength validation

Coverage Goals:
- 90% line coverage for authentication code
- 100% coverage for security-critical functions

2. Test Implementation

Primary Role: RD (Research & Development)

The RD agent implements tests following best practices:

  • Writes clear, deterministic tests
  • Uses descriptive test names that explain behavior
  • Follows the Arrange-Act-Assert pattern
  • Implements appropriate mocks and test doubles
  • Organizes tests in a logical structure

Example test implementation:

import pytest
from unittest.mock import patch
from app.auth import AuthService
from app.exceptions import InvalidCredentialsError


class TestAuthService:
    @pytest.fixture
    def auth_service(self):
        return AuthService()

    def test_login_success(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "correct_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Valid hash for correct_password
                "role": "user"
            }

            # Act
            result = auth_service.authenticate(username, password)

            # Assert
            assert result["authenticated"] is True
            assert "token" in result
            assert result["user_id"] == 123

    def test_login_invalid_credentials(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "wrong_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Hash doesn't match wrong_password
                "role": "user"
            }

            # Act & Assert
            with pytest.raises(InvalidCredentialsError):
                auth_service.authenticate(username, password)

3. Test Execution and Automation

Primary Roles: RD (Research & Development), CI/CD Pipeline

Tests are executed:

  • Locally during development
  • Automatically when code is pushed to repositories
  • In CI/CD pipelines before deployment
  • As scheduled regression tests

The automation process:

  1. Tests run when code is pushed to a branch
  2. Test results determine whether a pull request can be merged
  3. Coverage reports are generated
  4. Test metrics are tracked over time

Example CI configuration in GitHub Actions:

name: Test Suite

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          pip install pytest pytest-cov
      - name: Run tests with coverage
        run: |
          pytest --cov=app --cov-report=xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3

4. Bug Identification and Reporting

Primary Role: QA (Quality Assurance)

When tests reveal bugs, the QA agent:

  • Documents the issue with steps to reproduce
  • Classifies the bug by severity and priority
  • Creates a ticket in the project management system
  • Assigns the ticket to the appropriate developer
  • Tracks bug resolution status

Example bug report:

Bug Report: User Authentication Failure

Severity: High
Priority: High

Description:
Users are unable to log in when the username contains special characters.

Steps to Reproduce:
1. Create a user with username "test.user@example.com"
2. Attempt to log in with this username and correct password
3. System returns "Invalid credentials" error

Expected Behavior:
Login should succeed with correct credentials regardless of special characters in username

Actual Behavior:
Login fails with "Invalid credentials" even when password is correct

Logs:
[Attached relevant logs showing error]

Affected Components:
- AuthenticationController
- UserRepository.findByUsername method

5. Verification and Regression Testing

Primary Roles: RD (Research & Development), QA (Quality Assurance)

After bugs are fixed:

  • Tests are written to verify the fix
  • Regression tests ensure other features aren't broken
  • Fixed bugs are tested in multiple environments
  • The automated test suite is updated to include the new regression case

Example verification workflow:

  1. Developer fixes the bug
  2. Developer adds a test case demonstrating the fix (see the sketch below)
  3. CI pipeline verifies the fix
  4. QA agent validates in test environment
  5. The fix is approved for production
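
For example, step 2 above might produce a regression test like the following sketch. It reuses the AuthService and mocking pattern from the earlier unit-test example and targets the special-character username bug from the sample report; the user id and password hash are placeholders.

import pytest
from unittest.mock import patch
from app.auth import AuthService


class TestSpecialCharacterUsernames:
    @pytest.fixture
    def auth_service(self):
        return AuthService()

    def test_login_succeeds_with_special_characters_in_username(self, auth_service):
        # Arrange: a username containing dots and an @ sign, as in the bug report
        username = "test.user@example.com"
        password = "correct_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 456,  # placeholder id
                "username": username,
                "password_hash": "$2b$12$...",  # assumed to match correct_password
                "role": "user"
            }

            # Act
            result = auth_service.authenticate(username, password)

            # Assert: authentication succeeds despite the special characters
            assert result["authenticated"] is True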

Test Types and Coverage

Unit Testing

  • Tests individual components in isolation
  • Mocks external dependencies
  • Focuses on function/method-level behavior
  • Aims for high coverage (90%+)
  • Fast execution (milliseconds)

Integration Testing

  • Tests interactions between components
  • Focuses on API contracts and data flow
  • May use test databases or mock services
  • Tests entire features or workflows
  • Medium execution speed (seconds)
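
As an illustration, a minimal integration-test sketch is shown below. It assumes the application exposes a FastAPI app object, a /auth/login route, and a test database seeded with valid_user; the module path, route, and credentials are assumptions, and any framework with an HTTP test client would work the same way.

import pytest
from fastapi.testclient import TestClient
from app.main import app  # hypothetical application entry point


@pytest.fixture
def client():
    # Drives the real routing, validation, and authentication layers together
    return TestClient(app)


def test_login_endpoint_returns_token_for_valid_credentials(client):
    response = client.post(
        "/auth/login",
        json={"username": "valid_user", "password": "correct_password"},
    )
    assert response.status_code == 200
    assert "token" in response.json()


def test_login_endpoint_rejects_invalid_credentials(client):
    response = client.post(
        "/auth/login",
        json={"username": "valid_user", "password": "wrong_password"},
    )
    assert response.status_code == 401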

End-to-End Testing

  • Tests complete user workflows
  • Simulates real user interactions
  • Tests across multiple systems
  • Verifies business requirements
  • Slower execution (minutes)
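
A hedged sketch of an API-level end-to-end test follows. The base URL, endpoints, and test account are assumptions; the point is that the test exercises a deployed environment exactly as a real client would, covering the full login-then-access workflow.

import os
import requests

# Assumed environment variable pointing at a deployed test environment
BASE_URL = os.environ.get("E2E_BASE_URL", "https://staging.example.com")


def test_user_can_log_in_and_access_profile():
    # Step 1: authenticate through the public API, as a real client would
    login = requests.post(
        f"{BASE_URL}/auth/login",
        json={"username": "e2e_user", "password": "e2e_password"},
        timeout=10,
    )
    assert login.status_code == 200
    token = login.json()["token"]

    # Step 2: use the issued token to reach a protected resource
    profile = requests.get(
        f"{BASE_URL}/users/me",
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    assert profile.status_code == 200
    assert profile.json()["username"] == "e2e_user"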

Performance Testing

  • Tests system under load
  • Measures response times and throughput
  • Identifies bottlenecks
  • Establishes performance baselines
  • Typically run on schedule or before releases
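
For instance, a load-test sketch using Locust might look like the following; the endpoint and credentials are assumptions.

from locust import HttpUser, task, between


class AuthenticationLoadTest(HttpUser):
    # Simulated users pause 1-3 seconds between requests
    wait_time = between(1, 3)

    @task
    def login(self):
        # Locust records response times and throughput for this request
        self.client.post(
            "/auth/login",
            json={"username": "load_test_user", "password": "load_test_password"},
        )

Such a test would typically be pointed at a staging environment on a schedule or before a release, with the measured response times and throughput compared against established baselines.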

Best Practices

Test Structure

  • Follow the Arrange-Act-Assert pattern
  • One assertion or concept per test
  • Descriptive test names (test_should_deny_access_when_token_expired)
  • Shared setup in fixtures or setup methods
  • Clean isolation between tests
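
Put together, these conventions look roughly like the sketch below; TokenValidator and TokenExpiredError are illustrative names, not components defined elsewhere in this document.

import pytest
from app.auth import TokenValidator            # illustrative component
from app.exceptions import TokenExpiredError   # illustrative exception


@pytest.fixture
def validator():
    # Shared setup lives in a fixture rather than in each test
    return TokenValidator()


def test_should_deny_access_when_token_expired(validator):
    # Arrange: a token whose expiry timestamp is in the past (placeholder value)
    expired_token = "expired.jwt.token"

    # Act & Assert: the test verifies exactly one concept
    with pytest.raises(TokenExpiredError):
        validator.validate(expired_token)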

Test Reliability

  • Avoid flaky tests with intermittent failures
  • Don't rely on execution order
  • Clean up test data after execution
  • Use stable test doubles instead of actual services
  • Set explicit timeouts for async operations
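
Two of these practices, cleaning up test data and setting explicit timeouts for async operations, are sketched below. The helper functions and the notification module are hypothetical, and the async test assumes the pytest-asyncio plugin is installed.

import asyncio
import pytest
from app.testing import create_test_user, delete_test_user  # hypothetical helpers


@pytest.fixture
def temp_user():
    # Create isolated test data and always clean it up, even if the test fails
    user = create_test_user(username="reliability_test_user")
    yield user
    delete_test_user(user["id"])


@pytest.mark.asyncio
async def test_welcome_email_is_sent_within_timeout(temp_user):
    from app.notifications import send_welcome_email  # hypothetical async operation

    # Explicit timeout so a hung dependency fails fast instead of stalling the suite
    await asyncio.wait_for(send_welcome_email(temp_user["id"]), timeout=5)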

Test Coverage

  • Aim for high code coverage (lines, branches)
  • Focus on critical path and edge cases
  • Don't just test the happy path
  • Include error handling and edge conditions
  • Balance coverage goals with maintenance costs

Example Workflow

  1. Feature Development:

    • RD agent creates tests for new feature
    • Implements feature until tests pass
    • Ensures adequate test coverage
  2. Continuous Integration:

    • Tests run automatically on push
    • Coverage reports generated
    • Merging of the pull request is blocked if tests fail
  3. Bug Detection:

    • Tests identify a regression
    • QA agent creates a bug report
    • RD agent assigned to fix the issue
  4. Bug Resolution:

    • RD agent adds a test case reproducing the bug
    • Fixes the code until the test passes
    • PR created with the fix and test case
  5. Verification:

    • Automated tests verify the fix
    • QA agent confirms resolution
    • Bug marked as fixed