Skip to main content

Deployment Workflow

This document describes the deployment workflow in the Engineering AI Agent system, focusing on how code changes move from development to production environments.

Overview

The deployment workflow encompasses several stages:

  1. Environment preparation
  2. Build process
  3. Testing in staging
  4. Production deployment
  5. Monitoring and rollback procedures

Workflow Stages

1. Environment Preparation

Primary Roles: RD (Research & Development), SRE (Site Reliability Engineering)

Before deployment, the environment is prepared:

  • Configuration files are updated for the target environment
  • Environment variables are set appropriately
  • Infrastructure is provisioned if needed
  • Dependencies are updated to required versions

Example environment configuration:

# config/environment/staging.yaml
environment: staging
logging:
level: info
format: json
database:
host: ${DB_HOST}
port: ${DB_PORT}
name: engineering_agent_staging
pool_size: 10
services:
slack:
enabled: true
github:
enabled: true
jira:
enabled: true
clickup:
enabled: false

2. Build Process

Primary Role: CI/CD Pipeline

The build process prepares the application for deployment:

  • Source code is fetched from the repository
  • Dependencies are installed
  • Code is compiled or bundled if necessary
  • Tests are run to verify functionality
  • Artifacts are packaged for deployment

Example build workflow:

name: Build

on:
push:
branches: [main, develop]
pull_request:
branches: [main]

jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements-dev.txt
- name: Run tests
run: pytest
- name: Build package
run: |
pip install build
python -m build
- name: Upload artifact
uses: actions/upload-artifact@v3
with:
name: dist
path: dist/

3. Testing in Staging

Primary Roles: QA (Quality Assurance), SRE (Site Reliability Engineering)

Before production deployment, changes are tested in a staging environment:

  • Built artifacts are deployed to the staging environment
  • Integration with external services is verified
  • Performance and security testing is performed
  • User acceptance testing may be conducted
  • Smoke tests verify basic functionality

Example staging deployment workflow:

name: Deploy to Staging

on:
workflow_run:
workflows: ["Build"]
branches: [develop]
types: [completed]

jobs:
deploy-staging:
runs-on: ubuntu-latest
if: ${{ github.event.workflow_run.conclusion == 'success' }}
steps:
- name: Download artifact
uses: actions/download-artifact@v3
with:
name: dist
path: dist/
- name: Deploy to staging
env:
DEPLOY_TOKEN: ${{ secrets.STAGING_DEPLOY_TOKEN }}
run: |
./scripts/deploy.sh --env staging --token $DEPLOY_TOKEN
- name: Run smoke tests
run: |
./scripts/smoke_tests.sh --env staging

4. Production Deployment

Primary Roles: SRE (Site Reliability Engineering), RD (Research & Development)

Once verified in staging, changes are deployed to production:

  • Deployment may follow different strategies (blue/green, canary, etc.)
  • Database migrations are applied if necessary
  • Configuration is updated for production
  • Traffic is gradually shifted to the new version
  • Deployment is monitored for issues

Example production deployment workflow:

name: Deploy to Production

on:
workflow_dispatch:
inputs:
version:
description: 'Version to deploy'
required: true
strategy:
description: 'Deployment strategy'
required: true
default: 'blue-green'
type: choice
options:
- blue-green
- canary
- rolling

jobs:
deploy-production:
runs-on: ubuntu-latest
steps:
- name: Checkout deploy scripts
uses: actions/checkout@v3
with:
sparse-checkout: scripts/
- name: Deploy to production
env:
DEPLOY_TOKEN: ${{ secrets.PROD_DEPLOY_TOKEN }}
VERSION: ${{ github.event.inputs.version }}
STRATEGY: ${{ github.event.inputs.strategy }}
run: |
./scripts/deploy.sh --env production --token $DEPLOY_TOKEN --version $VERSION --strategy $STRATEGY
- name: Verify deployment
run: |
./scripts/verify_deployment.sh --env production

5. Monitoring and Rollback

Primary Role: SRE (Site Reliability Engineering)

After deployment, the system is monitored:

  • Performance metrics are tracked
  • Error rates are monitored
  • User-impacting issues are identified
  • Rollback procedures are executed if necessary

Example monitoring alert configuration:

# prometheus-alerts.yaml
groups:
- name: engineering_agent_alerts
rules:
- alert: HighErrorRate
expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
for: 2m
labels:
severity: critical
annotations:
summary: High error rate detected
description: Error rate exceeded 5% for more than 2 minutes

- alert: SlowResponses
expr: http_request_duration_seconds{quantile="0.9"} > 2
for: 5m
labels:
severity: warning
annotations:
summary: Slow API responses detected
description: 90th percentile response time is above 2 seconds for 5 minutes

Deployment Strategies

Blue-Green Deployment

  • Two identical environments are maintained (Blue and Green)
  • Changes are deployed to the inactive environment
  • Tests verify the new deployment
  • Traffic is switched from the active to the inactive environment
  • Allows for immediate rollback by switching traffic back

Canary Deployment

  • New version is deployed alongside the old version
  • Small percentage of traffic is routed to the new version
  • Traffic percentage gradually increases if no issues are found
  • Allows for early detection of issues with minimal impact
  • Rollback requires scaling down the new version

Rolling Deployment

  • New version is deployed gradually across instances
  • One or more instances are updated at a time
  • Healthchecks verify each updated instance
  • Process continues until all instances are updated
  • Balances deployment speed with risk mitigation

Versioning and Release Management

Semantic Versioning

The system follows semantic versioning (SemVer):

  • Format: MAJOR.MINOR.PATCH (e.g., 1.2.3)
  • MAJOR: Incompatible API changes
  • MINOR: New features (backward compatible)
  • PATCH: Bug fixes (backward compatible)

Release Branching Strategy

  • main branch represents the current production state
  • develop branch is used for ongoing development
  • Feature branches branch off from and merge back to develop
  • Release branches are created from develop
  • Hotfixes branch directly from main and merge to both main and develop

Release Process

  1. Create a release branch (release/vX.Y.Z) from develop
  2. Perform final testing and bug fixes in the release branch
  3. Merge the release branch to main
  4. Tag the release in main (vX.Y.Z)
  5. Merge the release branch back to develop

Best Practices

Configuration Management

  • Environment-specific configuration remains separate from code
  • Sensitive values are stored in secure vaults
  • Configuration is versioned and auditable
  • Changes to configuration follow the same review process as code

Database Changes

  • Database migrations are automated
  • Migrations are backward compatible when possible
  • Large data migrations are performed in batches
  • Rollback procedures are defined for database changes

Deployment Safety

  • Feature flags control functionality exposure
  • Circuit breakers prevent cascading failures
  • Rate limiting protects against unexpected load
  • Automated rollbacks trigger on error thresholds

Documentation and Communication

  • Release notes document all changes
  • Stakeholders are notified before significant deployments
  • Post-deployment status is communicated to the team
  • Deployment schedules respect low-traffic periods

Example Workflow

  1. Release Planning:

    • Features are prioritized for the next release
    • Version number is assigned based on changes
    • Release branch is created from develop
  2. Preparation:

    • Final testing is performed in the release branch
    • Release notes are prepared
    • Deployment plan is reviewed
  3. Staging Deployment:

    • Release is deployed to staging environment
    • Integration and acceptance testing is performed
    • Issues are fixed directly in the release branch
  4. Production Deployment:

    • Stakeholders are notified of the upcoming deployment
    • Deployment is scheduled during a low-traffic period
    • Release is deployed following the chosen strategy
  5. Post-Deployment:

    • System is monitored for anomalies
    • Success or issues are communicated to the team
    • Lessons learned are documented for future deployments