Dev Execution

From feature brief to working code in a single session

How It Works

Dev Execution runs nine steps across two phases. User approval gates appear at four points: after requirements analysis, after architecture, after the phase plan, and after the full implementation report.

Planning Phase

Requirements Analysis

Advocate and Strategist read the brief. Advocate extracts user stories and acceptance criteria. Strategist frames the business problem and maps it to the codebase.

Architecture

Architect produces an architecture decision record covering component boundaries, data flow, and technology choices.

Architecture Review

Sentinel and Provocateur challenge the ADR independently. Gaps, risks, and missed dependencies are surfaced and resolved before any plan is written.

Phase Plan

Strategist and Operator break the work into delivery phases, each with a scope and success condition.

Task Breakdown

Operator decomposes each phase into discrete, assignable tasks with domain labels and dependency order.

Implementation Phase

Engineering Lead Spawns Workers

The engineering-lead agent reads the task breakdown and spawns domain-scoped worker agents, one per task cluster. Workers are assigned only the files and tools relevant to their scope.

Worker Implementation

Each worker executes its assigned tasks and writes code to the working directory. Workers run in dependency order; independent clusters run in parallel.

Sentinel Code Review

Sentinel reviews all changed files against the requirements and the architecture decision record. Flagged issues are returned to the responsible worker for revision.

Test Verification and Synthesis

Tests run against the written code. The orchestrator then assembles a final implementation report covering what was built, what was changed, and the test results.
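The verification step can be pictured as a bounded retry loop. The sketch below is illustrative only: `run_tests` and `revise` are hypothetical stand-ins for harness internals, and the reading that "2 retries" means one initial run plus up to two re-runs is an assumption based on the Configuration and Limitations sections.

```python
from typing import Callable, List

MAX_TEST_RETRIES = 2  # default listed in the Configuration section


def verify_with_retries(
    run_tests: Callable[[], List[str]],
    revise: Callable[[List[str]], None],
    max_retries: int = MAX_TEST_RETRIES,
) -> dict:
    """Run the suite, letting workers revise up to max_retries times.

    run_tests returns the names of failing tests (empty list = all passed).
    revise hands the failures back to the responsible workers.
    Both callables are hypothetical placeholders, not harness APIs.
    """
    failures = run_tests()
    retries_used = 0
    while failures and retries_used < max_retries:
        revise(failures)
        retries_used += 1
        failures = run_tests()
    return {
        "passed": not failures,
        "retries_used": retries_used,
        "failures": failures,
    }


if __name__ == "__main__":
    # Simulated suite: fails twice, then passes after the second revision.
    outcomes = iter([["test_429"], ["test_429"], []])
    print(verify_with_retries(lambda: next(outcomes), lambda failed: None))
    # {'passed': True, 'retries_used': 2, 'failures': []}
```

If the suite still fails once the retries are used up, the result carries the remaining failures, which mirrors how the session surfaces them in the implementation report.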

Usage

aos run dev-execution --brief feature.md --domain saas

The --brief flag takes a path to a Markdown file following the brief format described below. The --domain flag applies a domain knowledge pack to all agents for the duration of the session.

Writing a Brief

The dev-execution workflow requires four sections in the brief. A brief that is missing any of them fails validation before the session starts.

`## Feature / Change`

Describe what you are building. Be specific about the user-facing behavior or system behavior you want. Do not describe implementation — that is the agent team's job.

## Feature / Change

Add rate limiting to the public API. Each API key should be limited to 1000
requests per hour. Requests that exceed the limit should return HTTP 429 with
a Retry-After header.

`## Context`

Describe the current state of the codebase relevant to this change. List the files, modules, or services that will be affected. If there is existing behavior the change must preserve, note it here.

## Context

The public API lives in `src/api/`. Auth middleware is in `src/middleware/auth.ts`.
We use Redis for session storage (client in `src/lib/redis.ts`). No rate limiting
exists today. The API currently returns 401 for auth failures and 400 for
malformed requests -- the new 429 should follow the same error response shape.

`## Constraints`

List hard constraints the implementation must respect. This includes timeline, technical debt limits, dependencies on other work, infrastructure boundaries, and any libraries or patterns that are off-limits.

## Constraints

- Must not introduce new infrastructure dependencies (Redis is already available)
- Must not break existing integration tests in `tests/integration/api/`
- Rate limit state must survive a service restart (use Redis, not in-memory)
- No new npm packages -- use the existing `ioredis` client

`## Success Criteria`

Define what done looks like. Include the specific tests that should pass, the behaviors that should be observable, and any non-functional requirements that must be met.

## Success Criteria

- `tests/integration/api/rate-limit.test.ts` passes (will be created as part of this work)
- Existing integration tests continue to pass
- A request that exceeds the rate limit receives HTTP 429 with a valid Retry-After header
- Rate limit resets after the window expires
- API key rate limit counts are isolated (one key's usage does not affect another's)
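To check a brief locally before starting a session, something like the following could work. It is a minimal sketch that assumes validation only looks for the four level-two headings named above; the harness's actual validation rules are not documented here, and `missing_sections` is a hypothetical helper.

```python
import re
from typing import List

# The four section names required by the dev-execution workflow.
REQUIRED_SECTIONS = [
    "Feature / Change",
    "Context",
    "Constraints",
    "Success Criteria",
]


def missing_sections(brief_text: str) -> List[str]:
    """Return the required section names that have no '## ' heading."""
    found = {
        match.group(1).strip()
        for match in re.finditer(
            r"^##[ \t]+(.+?)[ \t]*$", brief_text, flags=re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


if __name__ == "__main__":
    brief = """## Feature / Change
Add rate limiting to the public API.

## Context
The public API lives in src/api/.

## Constraints
- No new npm packages
"""
    print(missing_sections(brief))  # ['Success Criteria']
```

An empty list means all four headings are present. The check says nothing about whether the content under each heading is adequate.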

The Engineering Lead

The engineering-lead agent is the coordinator for the implementation phase. It does not write code directly.

When the planning phase completes, the engineering lead receives the task breakdown produced by the Operator. It reads the dependency graph, groups tasks into domain-scoped clusters (frontend, backend, data, infrastructure, etc.), and spawns a worker agent for each cluster. Each worker is initialized with:

- The tasks assigned to its cluster
- Domain-scoped file and tool permissions (from domain enforcement)
- The relevant artifacts from the planning phase (ADR, requirements, phase plan)

The engineering lead tracks completion across workers. When a dependency exists between clusters, it holds the dependent worker until the upstream cluster reports done. When independent clusters are ready simultaneously, it dispatches them in parallel.
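The grouping and sequencing described above can be sketched with Python's standard `graphlib` module. This is an illustration, not the engineering lead's actual scheduler: the task data, the tuple layout, and both function names are hypothetical, and the sketch simplifies dispatch into discrete waves.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# Hypothetical task breakdown: task id -> (domain label, upstream task ids).
TASKS: Dict[str, Tuple[str, List[str]]] = {
    "schema": ("data", []),
    "limiter": ("backend", ["schema"]),
    "middleware": ("backend", ["limiter"]),
    "dashboard": ("frontend", []),
}


def cluster_dependencies(
    tasks: Dict[str, Tuple[str, List[str]]]
) -> Dict[str, Set[str]]:
    """Group tasks by domain and lift task dependencies to cluster level."""
    deps: Dict[str, Set[str]] = {}
    for domain, upstream in tasks.values():
        cluster_deps = deps.setdefault(domain, set())
        for other in upstream:
            other_domain = tasks[other][0]
            if other_domain != domain:  # ignore dependencies inside a cluster
                cluster_deps.add(other_domain)
    return deps


def dispatch_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Return clusters in waves; clusters in the same wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished before continuing
    return waves


if __name__ == "__main__":
    print(dispatch_waves(cluster_dependencies(TASKS)))
    # [['data', 'frontend'], ['backend']]
```

Here the backend cluster is held until the data cluster reports done, while data and frontend start together. A real coordinator would likely release each dependent cluster as soon as its own upstream finishes rather than waiting for a whole wave.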

Once all workers report completion, the engineering lead collects their outputs and passes a consolidated change summary to Sentinel for code review.

The engineering lead never touches source files directly. Its role is coordination, dependency sequencing, and result collection.

What You Get

When the session completes, Dev Execution produces:

Implementation Report

A structured summary of what was built, what files were changed, which tasks were completed, and which (if any) were skipped or deferred.

Code Changes in Your Working Directory

All files written or modified by worker agents are present in your project. The harness does not stage or commit them.

Test Results

Output from the test run triggered at the end of the implementation phase, including pass/fail counts and any failures.

Synthesis

The orchestrator's final assessment of how well the implementation satisfies the brief's success criteria, with notes on anything that diverged from the plan.

Code is written to your working directory. You decide when to review it, when to run additional checks, and when to commit.

Configuration

Dev Execution uses the `dev-execution` profile, which sets these default constraints:

| Setting | Default | Notes |
| --- | --- | --- |
| Max duration | 240 minutes | Gate wait time counts against this limit |
| Max rounds | 30 | Applies to each deliberation phase |
| Max test retries | 2 | Per test suite, not per individual test |
| Gate wait | Unlimited | No per-gate timeout on user approval; waiting still counts toward max duration |

To customize these limits, create a derived profile that extends dev-execution and override the constraints block:

# core/profiles/my-dev-execution/profile.yaml
extends: dev-execution
constraints:
  max_duration_minutes: 120
  max_rounds: 20
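One way to think about a derived profile is as a shallow override of the parent's constraints: keys you set replace the defaults, and keys you omit keep them. The sketch below assumes that merge behavior, which this page does not spell out. The key `max_test_retries` is an assumed name; only `max_duration_minutes` and `max_rounds` appear in the example profile.

```python
from typing import Dict

# Defaults from the table above. "max_test_retries" is an assumed key name.
BASE_CONSTRAINTS: Dict[str, int] = {
    "max_duration_minutes": 240,
    "max_rounds": 30,
    "max_test_retries": 2,
}


def resolve_constraints(
    base: Dict[str, int], overrides: Dict[str, int]
) -> Dict[str, int]:
    """Shallow merge: override keys win, unspecified keys keep the base value."""
    return {**base, **overrides}


if __name__ == "__main__":
    derived = resolve_constraints(
        BASE_CONSTRAINTS,
        {"max_duration_minutes": 120, "max_rounds": 20},
    )
    print(derived)
    # {'max_duration_minutes': 120, 'max_rounds': 20, 'max_test_retries': 2}
```

Under this assumption, the example profile would shorten the session and the round limit while leaving the retry limit at its default.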

Limitations

No automatic git commits. The harness writes files to your working directory and stops. Staging and committing are your responsibility.

Max 2 test retries. If tests fail after two worker revision cycles, the session halts and surfaces the failures in the implementation report. Manual intervention is required.

Gate wait counts against session time. The 240-minute limit includes time spent waiting at approval gates. Long review pauses can exhaust the session budget before implementation completes.

Single repo only. Dev Execution operates on a single working directory. Cross-repository changes are not supported.
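To make the gate-wait limitation concrete, here is a small worked calculation. It is a sketch of the arithmetic only: `remaining_minutes` is a hypothetical helper and the timings are invented for illustration.

```python
from typing import Iterable


def remaining_minutes(
    work_minutes: int,
    gate_waits: Iterable[int],
    max_duration_minutes: int = 240,
) -> int:
    """Session budget left after agent work and time spent at approval gates."""
    used = work_minutes + sum(gate_waits)
    return max(0, max_duration_minutes - used)


if __name__ == "__main__":
    # 80 minutes of agent work, plus three approval gates that each sat
    # waiting for review (30, 45, and 60 minutes).
    print(remaining_minutes(80, [30, 45, 60]))  # 25
```

In this example, 135 of the 240 minutes go to waiting at gates, leaving 25 minutes for the rest of the implementation phase. Reviewing gates promptly, or raising `max_duration_minutes` in a derived profile, keeps more of the budget for actual work.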