Testing Workflow
This document describes the testing workflow in the Engineering AI Agent system, focusing on how tests are designed, implemented, and executed to ensure code quality and reliability.
Overview
The testing workflow encompasses several stages:
- Test planning and strategy
- Test implementation
- Test execution and automation
- Bug identification and reporting
- Verification and regression testing
Workflow Stages
1. Test Planning and Strategy
Primary Roles: RD (Research & Development), QA (Quality Assurance)
Before implementing tests, the agents:
- Review requirements to identify test scenarios
- Determine appropriate test types (unit, integration, etc.)
- Define test scope and coverage goals
- Establish pass/fail criteria
- Identify edge cases and error conditions
Example test plan for an authentication system:
Test Plan: User Authentication System
Test Types:
- Unit tests for individual components
- Integration tests for API endpoints
- Security tests for authentication flow
Test Scenarios:
1. Successful login with valid credentials
2. Failed login with invalid credentials
3. Password reset flow
4. Account lockout after multiple failed attempts
5. Token expiration handling
6. JWT token validation
7. Password strength validation
Coverage Goals:
- 90% line coverage for authentication code
- 100% coverage for security-critical functions
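Coverage goals like these can be enforced mechanically rather than checked by hand. A minimal sketch, assuming pytest with the pytest-cov plugin (the package path app.auth and test directory are illustrative):
# Fail the run if line coverage of the auth package drops below the 90% goal
pytest tests/auth --cov=app.auth --cov-branch --cov-fail-under=90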
2. Test Implementation
Primary Role: RD (Research & Development)
The RD agent implements tests following best practices:
- Writes clear, deterministic tests
- Uses descriptive test names that explain behavior
- Follows the Arrange-Act-Assert pattern
- Implements appropriate mocks and test doubles
- Organizes tests in a logical structure
Example test implementation:
import pytest
from unittest.mock import patch

from app.auth import AuthService
from app.exceptions import InvalidCredentialsError


class TestAuthService:
    @pytest.fixture
    def auth_service(self):
        return AuthService()

    def test_login_success(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "correct_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Valid hash for correct_password
                "role": "user"
            }

            # Act
            result = auth_service.authenticate(username, password)

            # Assert
            assert result["authenticated"] is True
            assert "token" in result
            assert result["user_id"] == 123

    def test_login_invalid_credentials(self, auth_service):
        # Arrange
        username = "valid_user"
        password = "wrong_password"
        with patch('app.auth.user_repository.get_user') as mock_get_user:
            mock_get_user.return_value = {
                "id": 123,
                "username": username,
                "password_hash": "$2b$12$...",  # Hash doesn't match wrong_password
                "role": "user"
            }

            # Act & Assert
            with pytest.raises(InvalidCredentialsError):
                auth_service.authenticate(username, password)
3. Test Execution and Automation
Primary Roles: RD (Research & Development), CI/CD Pipeline
Tests are executed:
- Locally during development
- Automatically when code is pushed to repositories
- In CI/CD pipelines before deployment
- As scheduled regression tests
The automation process:
- Tests run when code is pushed to a branch
- Test results determine whether a PR can be merged
- Coverage reports are generated
- Test metrics are tracked over time
Example CI configuration in GitHub Actions:
name: Test Suite

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main, develop ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          pip install pytest pytest-cov
      - name: Run tests with coverage
        run: |
          pytest --cov=app --cov-report=xml
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
4. Bug Identification and Reporting
Primary Role: QA (Quality Assurance)
When tests reveal bugs, the QA agent:
- Documents the issue with steps to reproduce
- Classifies the bug by severity and priority
- Creates a ticket in the project management system
- Assigns the ticket to the appropriate developer
- Tracks bug resolution status
Example bug report:
Bug Report: User Authentication Failure
Severity: High
Priority: High
Description:
Users are unable to log in when the username contains special characters.
Steps to Reproduce:
1. Create a user with username "test.user@example.com"
2. Attempt to log in with this username and correct password
3. System returns "Invalid credentials" error
Expected Behavior:
Login should succeed with correct credentials regardless of special characters in username
Actual Behavior:
Login fails with "Invalid credentials" even when password is correct
Logs:
[Attached relevant logs showing error]
Affected Components:
- AuthenticationController
- UserRepository.findByUsername method
5. Verification and Regression Testing
Primary Roles: RD (Research & Development), QA (Quality Assurance)
After bugs are fixed:
- Tests are written to verify the fix
- Regression tests ensure other features aren't broken
- Fixed bugs are tested in multiple environments
- Automated tests are updated to include the test case
Example verification workflow:
- Developer fixes the bug
- Developer adds a test case demonstrating the fix (see the sketch below)
- CI pipeline verifies the fix
- QA agent validates in test environment
- The fix is approved for production
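For instance, the username bug reported above could be pinned down with a regression test before the fix ships. A minimal sketch in the style of the earlier test examples (the patch target and placeholder hash follow those examples, and the fix itself is assumed to be in place):
from unittest.mock import patch

from app.auth import AuthService


def test_login_succeeds_with_special_characters_in_username():
    # Regression test for the reported bug: usernames containing '.' and '@'
    auth_service = AuthService()
    username = "test.user@example.com"
    with patch('app.auth.user_repository.get_user') as mock_get_user:
        mock_get_user.return_value = {
            "id": 456,
            "username": username,
            "password_hash": "$2b$12$...",  # placeholder: valid hash for correct_password
            "role": "user",
        }

        result = auth_service.authenticate(username, "correct_password")

        # Before the fix this raised InvalidCredentialsError; it must now succeed
        assert result["authenticated"] is True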
Test Types and Coverage
Unit Testing
- Tests individual components in isolation
- Mocks external dependencies
- Focuses on function/method-level behavior
- Aims for high coverage (90%+)
- Fast execution (milliseconds)
Integration Testing
- Tests interactions between components
- Focuses on API contracts and data flow
- May use test databases or mock services
- Tests entire features or workflows
- Medium execution speed (seconds)
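As an illustration, an integration test might exercise the login flow through a real HTTP layer rather than calling the service directly. A sketch assuming a FastAPI application (app.main and the /login route are hypothetical, not a documented API):
from fastapi.testclient import TestClient

from app.main import app  # hypothetical application entry point

client = TestClient(app)


def test_login_endpoint_issues_token():
    # Exercises routing, request validation, and the auth service together
    response = client.post(
        "/login",
        json={"username": "valid_user", "password": "correct_password"},
    )
    assert response.status_code == 200
    assert "token" in response.json()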
End-to-End Testing
- Tests complete user workflows
- Simulates real user interactions
- Tests across multiple systems
- Verifies business requirements
- Slower execution (minutes)
Performance Testing
- Tests system under load
- Measures response times and throughput
- Identifies bottlenecks
- Establishes performance baselines
- Typically run on schedule or before releases
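One common way to script such load tests is a tool like Locust. A minimal sketch, reusing the hypothetical /login endpoint from the integration example:
from locust import HttpUser, between, task


class AuthUser(HttpUser):
    # Each simulated user waits 1-3 seconds between requests
    wait_time = between(1, 3)

    @task
    def login(self):
        # Locust records response times and throughput for this request
        self.client.post(
            "/login",
            json={"username": "valid_user", "password": "correct_password"},
        )
Pointing a run like this at a staging host before each release is one way to establish and track the performance baselines mentioned above.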
Best Practices
Test Structure
- Follow the Arrange-Act-Assert pattern
- One assertion or concept per test
- Descriptive test names (e.g., test_should_deny_access_when_token_expired)
- Shared setup in fixtures or setup methods
- Clean isolation between tests
Test Reliability
- Avoid flaky tests with intermittent failures
- Don't rely on execution order
- Clean up test data after execution
- Use stable test doubles instead of actual services (see the sketch after this list)
- Set explicit timeouts for async operations
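One way to get that stability is a hand-rolled fake rather than patching at every call site. A minimal sketch, assuming AuthService can accept an injected repository (constructor injection here is an assumption, not the documented API):
class FakeUserRepository:
    """In-memory stand-in for the real user repository: deterministic, fast, isolated."""

    def __init__(self, users):
        self._users = {u["username"]: u for u in users}

    def get_user(self, username):
        return self._users.get(username)


# Hypothetical usage, assuming AuthService(user_repository=...) is supported:
# service = AuthService(user_repository=FakeUserRepository([test_user]))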
Test Coverage
- Aim for high code coverage (lines, branches)
- Focus on critical path and edge cases
- Don't just test the happy path
- Include error handling and edge conditions (illustrated below)
- Balance coverage goals with maintenance costs
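Parametrized tests are a compact way to cover the happy path alongside error handling and edge conditions. A sketch against a hypothetical password-strength validator (validate_password_strength is illustrative, not an existing function in the codebase):
import pytest

from app.auth import validate_password_strength  # hypothetical helper


@pytest.mark.parametrize("password, expected", [
    ("Str0ng!passw0rd", True),    # happy path
    ("Sh0rt!", False),            # edge case: below minimum length
    ("", False),                  # edge case: empty input
    ("alllowercase123", False),   # error condition: missing symbols/uppercase
])
def test_password_strength_validation(password, expected):
    assert validate_password_strength(password) is expected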
Example Workflow
1. Feature Development:
   - RD agent creates tests for the new feature
   - Implements the feature until tests pass
   - Ensures adequate test coverage
2. Continuous Integration:
   - Tests run automatically on push
   - Coverage reports are generated
   - The PR is blocked from merging if tests fail
3. Bug Detection:
   - Tests identify a regression
   - QA agent creates a bug report
   - RD agent is assigned to fix the issue
4. Bug Resolution:
   - RD agent adds a test case reproducing the bug
   - Fixes the code until the test passes
   - A PR is created with the fix and test case
5. Verification:
   - Automated tests verify the fix
   - QA agent confirms resolution
   - Bug is marked as fixed