A structured development methodology for AI coding agents, inspired by Qian Xuesen's engineering cybernetics. Closed-loop feedback control keeps plans aligned with reality. Pure Markdown, any platform.
# Add to your Claude Code skills
git clone https://github.com/zhu1090093659/spec_driven_develop
A structured methodology for AI coding agents. Pure Markdown. Any platform. Architecture-first.
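Spec-Driven Develop is a platform-agnostic AI agent plugin that ships two complementary skills: Spec-Driven Develop, a phased pipeline for large development tasks, and Deep Discuss, a structured discussion workflow for thinking a problem through before acting.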
No SDK. No runtime. No dependencies. Just Markdown files that any AI coding agent can read and execute.
When you tell your agent something like "rewrite this project in Rust" or "migrate to a microservice architecture", Spec-Driven Develop kicks in with a pipeline that runs from Phase 0 through Phase 6:
Phase 0 Quick Intent Capture Capture high-level direction (1-2 sentences)
|
Phase 1 Deep Analysis Analyze architecture, inventory modules,
| assess risks — with S.U.P.E.R health evaluation
|
Phase 2 Intent Refinement Ask targeted questions grounded in analysis,
| confirm scope, priorities, and constraints
|
Phase 3 Task Decomposition Break work into phases, tasks, parallel lanes —
| each task annotated with S.U.P.E.R design drivers
| + create GitHub Issues, Milestones & Project board
|
Phase 4 Progress Tracking Generate MASTER.md as GitHub index or local tracker
|
Phase 5 Confirm & Execute Present plan summary, get confirmation,
| then execute all tasks (parallel or sequential)
| with adaptive control feedback loop
|
Phase 6 Archive Preserve all artifacts for traceability
When a GitHub repository is detected, the workflow automatically creates GitHub Issues for every task, organized with Milestones (one per phase), Labels (priority, size, lane), and optionally a GitHub Projects board. Each task executor works in an isolated worktree, creates a PR linked to its Issue (closes #N), and the Issue auto-closes on merge.
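As a rough illustration of the Git and GitHub steps a task executor might run, the Python sketch below only builds the two commands involved. The `task/<slug>` branch naming, the `../worktrees` directory, and both helper names are hypothetical choices for this example, not conventions defined by the skill; the skill itself remains plain Markdown.

```python
from typing import List


def worktree_command(task_slug: str, base_dir: str = "../worktrees") -> List[str]:
    """Build the command that creates an isolated worktree on a new branch.

    The "task/<slug>" branch name and the base directory are assumptions
    made for this sketch.
    """
    branch = f"task/{task_slug}"
    path = f"{base_dir}/{task_slug}"
    return ["git", "worktree", "add", "-b", branch, path]


def pull_request_command(title: str, issue_number: int) -> List[str]:
    """Build the command that opens a PR whose body links it to its Issue."""
    # A "closes #N" keyword in the PR body lets GitHub close the Issue
    # when the PR is merged into the default branch.
    body = f"closes #{issue_number}"
    return ["gh", "pr", "create", "--title", title, "--body", body]
```

Each returned list can be passed to `subprocess.run(...)` from inside the repository. Keeping command construction separate from execution makes the steps easy to inspect before anything touches the repo.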
Three modes are auto-detected based on environment:
| Mode | What You Get |
|:-----|:-------------|
| GITHUB_FULL | Issues + Milestones + Labels + Project board + worktree + PR |
| GITHUB_STANDARD | Issues + Milestones + Labels + worktree + PR (no board) |
| LOCAL_ONLY | Original Markdown-based workflow (no GitHub dependency) |
The workflow gracefully degrades — if gh CLI is unavailable or the repo isn't on GitHub, it falls back to local-only mode automatically.
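One way to picture the fallback logic is the sketch below. It is an assumption-laden illustration, not the skill's actual detection procedure: the function names are hypothetical, and treating "Project board available" as the only difference between the two GitHub modes is inferred from the table above.

```python
import shutil
import subprocess


def select_mode(gh_available: bool, on_github: bool, board_available: bool) -> str:
    """Pick a workflow mode from three environment facts."""
    if not (gh_available and on_github):
        return "LOCAL_ONLY"  # graceful fallback: no GitHub dependency
    return "GITHUB_FULL" if board_available else "GITHUB_STANDARD"


def origin_is_github() -> bool:
    """Best-effort check that the 'origin' remote points at github.com."""
    try:
        result = subprocess.run(
            ["git", "remote", "get-url", "origin"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:  # git itself is not installed
        return False
    return result.returncode == 0 and "github.com" in result.stdout


def detect_mode(board_available: bool = False) -> str:
    """Probe the environment, then delegate to the pure decision function.

    How board access would be detected is left to the caller here.
    """
    gh_available = shutil.which("gh") is not None
    return select_mode(gh_available, origin_is_github(), board_available)
```

Separating the pure decision (`select_mode`) from the environment probes keeps the degradation rule testable without a real repository.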
Inspired by engineering cybernetics (工程控制论), the workflow now includes a closed-loop feedback control system that observes execution reality and automatically corrects course when plans drift:
Set Point (Phase 2 spec)
│
▼
┌────────────┐ drift_score + telemetry
│ Controller │◄────────────────────────────┐
│ (SKILL) │ │
└─────┬──────┘ │
│ task instructions │ observe
▼ │
┌────────────┐ │
│ Executor │──── actual effort/SUPER/deps ┘
│ (Agent) │
└─────┬──────┘
│ code changes
▼
┌────────────┐
│ Codebase │
└────────────┘
After every task, the agent collects execution telemetry (actual effort vs. estimated, S.U.P.E.R compliance delta, unplanned dependencies) and updates a cumulative drift_score. When drift exceeds percentage-based thresholds:
| Drift Level | Threshold | Automatic Response |
|:------------|:----------|:-------------------|
| Mild | ≥ 20% of phase tasks | Annotate next task with warning |
| Significant | ≥ 40% | Halt and re-decompose remaining tasks |
| Severe | ≥ 60% | Return to Phase 2 for scope re-evaluation |
This ensures the workflow self-corrects instead of blindly executing a plan that no longer matches reality.
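The threshold table can be read as a small classifier. The sketch below is a minimal interpretation that assumes the drift score is the share of a phase's tasks that drifted; how the skill actually aggregates telemetry into `drift_score` is not specified here, and the names `DriftLevel` and `classify_drift` are hypothetical.

```python
from enum import Enum


class DriftLevel(Enum):
    NONE = "continue as planned"
    MILD = "annotate next task with warning"
    SIGNIFICANT = "halt and re-decompose remaining tasks"
    SEVERE = "return to Phase 2 for scope re-evaluation"


# Percentage thresholds from the table, checked from most to least severe.
THRESHOLDS = [
    (60, DriftLevel.SEVERE),
    (40, DriftLevel.SIGNIFICANT),
    (20, DriftLevel.MILD),
]


def classify_drift(drifted_tasks: int, phase_tasks: int) -> DriftLevel:
    """Map the share of drifted tasks in a phase to a drift level.

    Assumes drift is measured as drifted_tasks / phase_tasks.
    """
    if phase_tasks <= 0:
        raise ValueError("phase_tasks must be positive")
    for percent, level in THRESHOLDS:
        # Integer comparison avoids floating-point edge cases at the boundaries.
        if drifted_tasks * 100 >= percent * phase_tasks:
            return level
    return DriftLevel.NONE
```

For example, under this reading a phase of 10 tasks with 4 drifted tasks classifies as `DriftLevel.SIGNIFICANT`, whose value names the response from the table.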
When you describe a problem, a technical puzzle, or say things like "let's discuss", "help me analyze", "I'm stuck on a decision" — Deep Discuss kicks in with a 7-phase structured discussion:
Phase 1 Receive Information Listen, restate, confirm understanding
|
Phase 2 Problem Audit Validate the problem, check info sufficiency,
| surface hidden issues (Critical Thinking)
|
Phase 3 Deep Analysis Multi-angle root cause analysis
| with explicit confidence levels
|
Phase 4 Solution Design 2-3 options with trade-offs and recommendations
|
Phase 5 Self-Review Proactive first review of proposed solutions
|
Phase 6 Final Review Completeness check, risk mitigation, verification plan
|
Phase 7 Execution (optional) Only when user explicitly says "go"
The core philosophy: don't rush to answers — think the problem through first. Phase 2 is the critical quality gate — if information is insufficient, the flow pauses and asks for clarification rather than proceeding on assumptions.
S.U.P.E.R is not a footnote — it is the design philosophy that drives every phase of the workflow and every line of code the agent produces.
Write code like building with LEGO — each brick has a single job, a standard interface, a clear direction, runs anywhere, and can be swapped at will.
| Principle | Meaning | How It's Enforced |
|:----------|:--------|:------------------|
| Single Purpose | One module, one job | Analysis phase rates each module's single-responsibility compliance. Tasks that span multiple concerns get decomposed further. |
| Unidirectional Flow | Data flows one way | Architecture health check flags circular dependencies. Dependencies must point inward: outer layers depend on inner, never the reverse. |
| Ports over Implementation | Contracts before code | Module inventory evaluates whether I/O is schema-defined. Task breakdown requires interface contracts before implementation tasks. |
| Environment-Agnostic | Runs anywhere | Risk assessment catches hardcoded config and platform-specific assumptions. Config must come from environment variables or config files. |
| Replaceable Parts | Swap without ripple | Each module is rated by replacement cost. If swapping a component causes cascading changes, the architecture is broken. |
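To make the principles concrete, here is a small, hypothetical Python example of code written in this style. Every name in it (`Storage`, `MemoryStorage`, `Settings`, `APP_NAMESPACE`, `record_note`) is invented for illustration and does not come from the skill.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Protocol


class Storage(Protocol):
    """Port: the contract callers depend on (Ports over Implementation)."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> str: ...


class MemoryStorage:
    """One interchangeable implementation of the port (Replaceable Parts)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._items[key] = value

    def load(self, key: str) -> str:
        return self._items[key]


@dataclass(frozen=True)
class Settings:
    namespace: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read config from the environment, never hardcoded (Environment-Agnostic)."""
    return Settings(namespace=env.get("APP_NAMESPACE", "default"))


def record_note(storage: Storage, settings: Settings, note_id: str, text: str) -> str:
    """Do one job: store a note under a namespaced key (Single Purpose).

    Data flows input -> processing -> output, and this function depends
    only on the Storage contract (Unidirectional Flow).
    """
    key = f"{settings.namespace}:{note_id}"
    storage.save(key, text)
    return key
```

Swapping `MemoryStorage` for a file-backed or database-backed class would require no change to `record_note`, which is the "swap without ripple" property the last row describes.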
S.U.P.E.R isn't just a reference document the agent might read — it's woven into the workflow at every level:
Phase 1 — Analysis: Every module gets a per-principle compliance score (S🟢 U🟡 P🔴 E🟢 R🟡). The risk assessment includes a S.U.P.E.R Architecture Health Summary with violation hotspots.
Phase 2 — Intent Refinement: Analysis findings are presented to the user so they can make informed decisions about scope and S.U.P.E.R priorities before task decomposition begins.
Phase 3 — Planning: Each task is annotated with its S.U.P.E.R design drivers (which principles matter most for that task). Early phases prioritize fixing violation hotspots before building new features.
Phase 5 — Execution: The S.U.P.E.R Code Review Checklist is run after every task before marking it complete. The adaptive control loop collects S.U.P.E.R compliance scores as part of execution telemetry:
| Check | Principle |
|:------|:----------|
| Every new module/file has exactly one responsibility | S |
| No function does more than one conceptual thing | S |
| Data flows input → processing → output, no reverse deps | U |
| No circular imports introduced | U |
| Cross-module interfaces are schema-defined | P |
| Module I/O is serializable | P |
| No hardcoded paths, URLs, keys, or config values | E |
| All new dependencies explicitly declared | E |
| New modules can be replaced without changes to others | R |
| All tests pass after the change | - |
All pass = proceed. 1-2 fail = fix before marking complete. 3+ fail = stop and refactor.
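The pass/fail rule above amounts to counting failed checks. A minimal sketch, assuming each checklist item is recorded as a boolean; the function name and the returned labels are hypothetical:

```python
from typing import Mapping


def review_verdict(results: Mapping[str, bool]) -> str:
    """Turn checklist results (check name -> passed?) into a next action."""
    failures = sum(1 for passed in results.values() if not passed)
    if failures == 0:
        return "proceed"
    if failures <= 2:
        return "fix before marking complete"
    return "stop and refactor"
```

For instance, a review in which only "No circular imports introduced" fails would return `"fix before marking complete"`.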
The SKILL prompt is written in a generic, platform-neutral way. It gracefully degrades on platforms without certain capabilities — for example, if sub-agents aren't available, it falls back to sequential execution automatically.
Tested platforms with install scripts: