by Agent-Field
Autonomous software engineering fleet of AI agents for production-grade PRs on AgentField: plan, code, test, and ship.
# Add to your Claude Code skills
git clone https://github.com/Agent-Field/SWE-AF
Pronounced: "swee-AF" (one word)
One API call → full engineering team → shipped code.
One API call spins up a full autonomous engineering team — product managers, architects, coders, reviewers, testers — that scopes, builds, adapts, and ships complex software end to end. SWE-AF is a first step toward autonomous software engineering factories, scaling from simple goals to hard multi-issue programs with hundreds to thousands of agent invocations.
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-H "Content-Type: application/json" \
-d @- <<'JSON'
{
"input": {
"goal": "Refactor and harden auth + billing flows",
"repo_url": "https://github.com/user/my-project",
"config": {
"runtime": "claude_code",
"models": {
"default": "sonnet",
"coder": "opus",
"qa": "opus"
},
"enable_learning": true
}
}
}
JSON
Swap models.default and any role key (coder, qa, architect, etc.) to any model your runtime supports.
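The same request can be assembled from Python when the per-role models need to vary between builds. This is a minimal sketch using only the standard library; the endpoint and field names are copied from the curl example, while the helper names (`build_payload`, `make_request`, `submit`) are hypothetical and not part of SWE-AF. The response format is not described here, so `submit` returns the raw body.

```python
import json
import urllib.request

# Endpoint copied from the curl example; adjust host and port for your deployment.
BUILD_URL = "http://localhost:8080/api/v1/execute/async/swe-planner.build"


def build_payload(goal, repo_url, runtime="claude_code", models=None, enable_learning=False):
    """Assemble a request body in the same shape as the curl example."""
    config = {"runtime": runtime, "models": models or {"default": "sonnet"}}
    if enable_learning:
        config["enable_learning"] = True
    return {"input": {"goal": goal, "repo_url": repo_url, "config": config}}


def make_request(payload, url=BUILD_URL):
    """Build (but do not send) the POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(payload, url=BUILD_URL, timeout=30):
    """Send the request and return the raw response text."""
    with urllib.request.urlopen(make_request(payload, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


payload = build_payload(
    goal="Refactor and harden auth + billing flows",
    repo_url="https://github.com/user/my-project",
    models={"default": "sonnet", "coder": "opus", "qa": "opus"},
    enable_learning=True,
)
print(json.dumps(payload, indent=2))
```

Call `submit(payload)` once the service is listening on the URL above.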
SWE-AF works in two modes: point it at a single repository, or orchestrate coordinated changes across multiple repos in one build.
The default. Pass repo_url (remote) or repo_path (local) and SWE-AF handles everything:
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-H "Content-Type: application/json" \
-d '{
"input": {
"goal": "Add JWT auth",
"repo_url": "https://github.com/user/my-project"
}
}'
When your work spans multiple codebases — a primary app plus shared libraries, monorepo sub-projects, or dependent microservices — pass config.repos as an array with roles:
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-H "Content-Type: application/json" \
-d '{
"input": {
"goal": "Add JWT auth across API and shared-lib",
"config": {
"repos": [
{
"repo_url": "https://github.com/org/main-app",
"role": "primary"
},
{
"repo_url": "https://github.com/org/shared-lib",
"role": "dependency"
}
],
"runtime": "claude_code",
"models": {
"default": "sonnet"
}
}
}
}'
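The multi-repo body above can also be built and sanity-checked before it is sent. The sketch below assumes the only valid roles are the two used in this README (`primary` and `dependency`) and that a build has exactly one primary repo; both checks are assumptions for illustration, not documented SWE-AF validation rules, and `multi_repo_payload` is a hypothetical helper.

```python
import json

# The two roles that appear in this README; treated here as the full set (assumption).
VALID_ROLES = {"primary", "dependency"}


def multi_repo_payload(goal, repos, runtime="claude_code", default_model="sonnet"):
    """Build a multi-repo request body from (repo_url, role) pairs."""
    roles = [role for _, role in repos]
    unknown = sorted(set(roles) - VALID_ROLES)
    if unknown:
        raise ValueError(f"unknown role(s): {unknown}")
    # Assumption: exactly one repo drives the build.
    if roles.count("primary") != 1:
        raise ValueError("expected exactly one primary repo")
    return {
        "input": {
            "goal": goal,
            "config": {
                "repos": [{"repo_url": url, "role": role} for url, role in repos],
                "runtime": runtime,
                "models": {"default": default_model},
            },
        }
    }


body = multi_repo_payload(
    "Add JWT auth across API and shared-lib",
    [
        ("https://github.com/org/main-app", "primary"),
        ("https://github.com/org/shared-lib", "dependency"),
    ],
)
print(json.dumps(body, indent=2))
```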
Roles:
- primary: The main application. Changes here drive the build; failures block progress.
- dependency: Libraries or services modified to support the primary repo. Failures are captured but don't block.

Use cases:
Rust-based Python compiler benchmark (built autonomously):
| Metric                 | CPython (subprocess) | RustPython (SWE-AF)          | Improvement         |
| ---------------------- | -------------------- | ---------------------------- | ------------------- |
| Steady-state execution | Baseline (~19ms)     | Optimized in-process runtime | 88.3x-602.3x faster |
| Geometric mean         | 1.0x baseline        | 253.8x                       | 253.8x              |
| Peak throughput        | ~52 ops/s            | 31,807 ops/s                 | ~612x               |
Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
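The throughput figures follow from simple arithmetic on the numbers quoted above. The sketch below only reworks those published values; it does not reproduce the benchmark. The ~612x figure appears to come from dividing by the rounded ~52 ops/s baseline, whereas the unrounded baseline gives a slightly lower ratio.

```python
# Figures quoted in the table and the note above.
spawn_ms = 19        # approximate cost of one CPython subprocess call
pool_ops = 31_807    # peak ops/s reported for the pre-warmed RustPython pool

# One call every ~19 ms gives the subprocess baseline in ops/s.
subprocess_ops = 1000 / spawn_ms

# Ratio against the rounded baseline (52 ops/s) and against the unrounded one.
ratio_vs_rounded = pool_ops / 52
ratio_vs_exact = pool_ops / subprocess_ops

print(f"subprocess baseline: {subprocess_ops:.1f} ops/s")
print(f"speedup vs 52 ops/s: {ratio_vs_rounded:.1f}x")
print(f"speedup vs unrounded baseline: {ratio_vs_exact:.1f}x")
```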
Artifact trail includes 175 tracked autonomous agents across planning, coding, review, merge, and verification.
Most agent frameworks wrap a single coder loop. SWE-AF is a coordinated engineering factory — planning, execution, and governance agents run as a control stack that adapts in real time.
- Models are set per role (for example coder: opus, qa: haiku). Works with Claude, OpenRouter, OpenAI, and Google.
- With enable_learning=true, conventions and failure patterns discovered early are injected into downstream issues.
- Builds can continue via resume_build after crashes or interruptions.

PR #179: Go SDK DID/VC Registration, built entirely by SWE-AF (Claude runtime with haiku-class models). One API call, zero human code.
| Metric | Value |
| ------------------- | ------------------ |
| Issues completed | 10/10 |
| Tests passing | 217 |
| Acceptance criteria | 34/34 |
| Agent invocations | 79 |
| Model | claude-haiku-4-5 |
| Total cost | $19.23 |
| Role                               | Cost  | %     |
| ---------------------------------- | ----- | ----- |
| Coder                              | $5.88 | 30.6% |
| Code Reviewer                      | $3.48 | 18.1% |
| QA                                 | $1.78 | 9.2%  |
| GitHub PR                          | $1.66 | 8.6%  |
| Integration Tester                 | $1.59 | 8.3%  |
| Merger                             | $1.22 | 6.3%  |
| Workspace Ops                      | $1.77 | 9.2%  |
| Planning (PM + Arch + TL + Sprint) | $0.79 | 4.1%  |
| Verifier + Finalize                | $0.34 | 1.8%  |
| Synthesizer                        | $0.05 | 0.2%  |
79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.
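For a rough sense of scale, the headline numbers can be turned into per-unit averages. This is plain arithmetic on the figures reported above (total cost, invocations, turns, issues); averages hide the large spread between roles shown in the cost table.

```python
# Figures reported for the PR #179 build.
total_cost_usd = 19.23
invocations = 79
turns = 2070
issues = 10

cost_per_invocation = total_cost_usd / invocations
cost_per_issue = total_cost_usd / issues
turns_per_invocation = turns / invocations

print(f"cost per invocation: ${cost_per_invocation:.2f}")
print(f"cost per issue: ${cost_per_issue:.2f}")
print(f"turns per invocation: {turns_per_invocation:.1f}")
```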
Claude & open-source models supported: Run builds with either runtime and tune models per role in one flat config map.
- runtime: "claude_code" maps to the Claude backend.
- runtime: "open_code" maps to the OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs).

SWE-AF uses three nested control loops to adapt to task difficulty in real time:
| Loop | Scope | Trigger | Action |
| ----------- | ------------- | -------------------- | ---------------------------------------------------------------------------------- |
| Inner loop | Single issue | QA/review fails | Coder retries with feedback |
| Middle loop | Single issue | Inner loop exhausted | run_issue_advisor retries with a new approach, splits work, or accepts with debt |
| Outer loop | Remaining DAG | Escalated failures | run_replanner restructures remaining issues and dependencies |
This is the core factory-control behavior: control agents supervise worker agents and continuously reshape the plan as reality changes.
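The escalation order in the table can be illustrated with a toy simulation. Everything here is hypothetical: the `Issue` model, the retry budgets, and the stand-in functions are invented for illustration and do not reflect how `run_issue_advisor` or `run_replanner` are implemented. The advisor stand-in only models "retry with a new approach", and the replanner stand-in simply splits a failed issue in two.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Issue:
    name: str
    attempts_needed: int  # hypothetical: coder attempts before QA/review passes
    attempts_made: int = 0


def inner_loop(issue: Issue, max_retries: int) -> bool:
    """Inner loop: the coder retries with feedback until QA passes or retries run out."""
    for _ in range(max_retries):
        issue.attempts_made += 1
        if issue.attempts_made >= issue.attempts_needed:
            return True
    return False


def middle_loop(issue: Issue, max_retries: int, advisor_rounds: int) -> bool:
    """Middle loop: an advisor stand-in grants fresh inner-loop rounds."""
    for _ in range(advisor_rounds):
        if inner_loop(issue, max_retries):
            return True
    return False  # escalate to the outer loop


def replan(failed: Issue, remaining: List[Issue]) -> List[Issue]:
    """Outer-loop stand-in: replace the failed issue with two smaller ones."""
    half = max(1, failed.attempts_needed // 2)
    parts = [Issue(f"{failed.name}-a", half), Issue(f"{failed.name}-b", half)]
    return parts + remaining


def run_build(issues: List[Issue], max_retries: int = 2, advisor_rounds: int = 2,
              max_replans: int = 3) -> List[Tuple[str, str]]:
    """Process issues in order, escalating through the three loops."""
    queue, log, replans = list(issues), [], 0
    while queue:
        issue = queue.pop(0)
        if middle_loop(issue, max_retries, advisor_rounds):
            log.append((issue.name, "done"))
        elif replans < max_replans:
            replans += 1
            log.append((issue.name, "replanned"))
            queue = replan(issue, queue)
        else:
            log.append((issue.name, "failed"))
    return log


log = run_build([Issue("easy", 1), Issue("medium", 3), Issue("hard", 6)])
for name, outcome in log:
    print(f"{name}: {outcome}")
```

In this run the easy issue passes on the first attempt, the medium one needs a second advisor round, and the hard one exhausts both inner and middle loops before the replanner stand-in splits it into two issues that then complete.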