by gmickel
Spec-driven AI workflow plugin for Claude Code, OpenAI Codex, and Factory Droid. Zero-dep task tracking, worker subagents, Ralph autonomous mode, cross-model reviews.
# Add to your Claude Code skills
git clone https://github.com/gmickel/flow-next
Last scanned: 5/12/2026
{
"issues": [],
"status": "PASSED",
"scannedAt": "2026-05-12T06:40:15.754Z",
"semgrepRan": false,
"npmAuditRan": true,
"pipAuditRan": true
}
flow-next is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by gmickel. It has 650 GitHub stars.
flow-next passed SkillsLLM's automated security scan, a dependency vulnerability audit plus prompt-injection heuristics, with no high-severity issues.
To install, clone the repository with "git clone https://github.com/gmickel/flow-next" and add it to your Claude Code skills directory.
flow-next is primarily written in Python. It is open-source under gmickel on GitHub, so you can review or fork the full source.
The workflow layer for AI coding agents: durable specs, re-anchored workers, adversarial reviews, receipts.
Everything lives in your repo. Zero external dependencies. Uninstall: rm -rf .flow/.
📖 Full doc index → · 🌐 flow-next.dev · 👥 Teams guide · 💬 Discord
Agentic engineering compresses implementation from weeks to hours — and quietly removes every safety valve pre-agentic Agile relied on. The standups, the hallway clarification, the mid-flight course correction that used to finish a vague ticket over a two-week cycle: gone. When an agent can ship the task in one sitting, a rough ticket plus a chat scrollback is the whole work surface.
That work surface fails predictably. Agents drift mid-task, forget requirements, overfit to recent context, and hand reviewers 10K-line diffs with no focus signal. The bottleneck didn't disappear — it moved upstream, to requirements, review, and verification. The spec has to carry the weight.
Flow-Next fixes the operating model, not just the prompt. It turns rough intent into durable specs, specs into context-sized task graphs, task graphs into re-anchored worker runs, and implementation into reviewed PRs with receipts. Between idea and merge it defines six named handover objects — each reviewable on its own, verified by a different model, and frozen at handover.
The artifact chain is not bureaucracy. It is the conversation that would otherwise be missing.
Flow-Next is an AI agent orchestration plugin: 28 agent-native skills covering the full lifecycle — idea → spec → tasks → review → ship → maintain — layered on a bundled pure-stdlib Python CLI (flowctl). The host agent is the intelligence; flowctl is the deterministic plumbing. No external services, no SaaS, no global config.
| Tenet | What it means |
|---|---|
| Spec-driven | Intent survives the chat. The unit of work is the spec — not the ticket, not the transcript, not the PR title. One durable document at .flow/specs/<id>.md, evolving through layers. |
| Context-fit planning | Right-sized task slices. Specs decompose into dependency-ordered tasks, each sized to one fresh ~100k-token context window. |
| Re-anchored work | Fresh context per task. Every worker subagent re-reads the spec, the task, and git state before touching code — no token bleed, no stale assumptions. |
| Adversarial gates | Fix until SHIP. A different model (RepoPrompt / Codex / Copilot / Cursor) reviews every plan and every implementation. Different models make different mistakes — the disagreement surface is where the gaps live. |
| Receipts | "Done" means there is proof. Commits, tests, review verdicts, and evidence recorded per task — never narration. |
| Multi-harness | One workflow everywhere. First-class on Claude Code, OpenAI Codex, and Factory Droid; runs on Grok Build and Cursor; community OpenCode port. |
| Self-improving | Compounds as you work. Memory, glossary, decision records, and strategy grow as side-effects of the workflow you already run — no manual "refresh" ceremony, ever. |
And one tenet about trust: everything lives in your repo under .flow/. Specs, tasks, memory, receipts — all of it is yours, in git, code-reviewable. Uninstall is rm -rf .flow/.
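The "Re-anchored work" tenet can be pictured as rebuilding the worker's context from disk on every task instead of carrying chat history forward. The following is a minimal sketch of that idea, not flow-next's actual worker code: `build_anchor`, the section titles, and the throwaway file names are all hypothetical, and the git state is passed in as a plain string (in practice it could come from something like `git status --porcelain`).

```python
import tempfile
from pathlib import Path


def build_anchor(spec_path: Path, task_path: Path, git_state: str) -> str:
    """Assemble a fresh context from files on disk, never from chat history.

    Hypothetical helper for illustration; not part of flowctl.
    """
    sections = [
        ("SPEC", spec_path.read_text(encoding="utf-8")),
        ("TASK", task_path.read_text(encoding="utf-8")),
        ("GIT STATE", git_state or "(clean working tree)"),
    ]
    return "\n\n".join(f"## {title}\n{body.strip()}" for title, body in sections)


# Demo with throwaway files; a real spec would live at .flow/specs/<id>.md.
with tempfile.TemporaryDirectory() as tmp:
    spec = Path(tmp) / "spec.md"
    task = Path(tmp) / "task.md"
    spec.write_text("Users can export reports as CSV.\n", encoding="utf-8")
    task.write_text("Add the CSV serializer.\n", encoding="utf-8")
    anchor = build_anchor(spec, task, git_state="")

print(anchor)
```

Because the function takes nothing but paths and the current git state, two workers given the same repo state see the same context, which is the property the tenet describes.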
Claude Code:
/plugin marketplace add \
https://github.com/gmickel/flow-next
/plugin install flow-next
/reload-plugins
/flow-next:setup
OpenAI Codex:
git clone https://github.com/gmickel/flow-next.git
cd flow-next
./scripts/install-codex.sh flow-next
# then: /flow-next:setup
Factory Droid:
droid plugin marketplace add \
https://github.com/gmickel/flow-next
# /plugins → install flow-next
Why a script for Codex? Codex's plugin protocol only registers skills from plugin.json — not custom .toml agents or hooks. install-codex.sh merges all 21 agents + hooks into ~/.codex/config.toml. Idempotent — safe to re-run. Full platform matrix + community ports in docs/platforms.md.
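To illustrate what "idempotent" means for a config merge like this, here is a small Python sketch. It is not the logic of install-codex.sh: the `merge_agents` function, the `agents` key, and the sample entries are all hypothetical, chosen only to show that writing entries by name (rather than appending) makes a second run a no-op.

```python
def merge_agents(config: dict, agents: dict) -> dict:
    """Return a new config in which each agent appears exactly once.

    Hypothetical illustration of an idempotent merge; the real script's
    config layout is not shown in this document.
    """
    merged = {**config, "agents": dict(config.get("agents", {}))}
    for name, definition in agents.items():
        merged["agents"][name] = definition  # keyed write: overwrite, never append
    return merged


# Sample data (made up for the example).
existing = {"model": "example-model", "agents": {"custom": {"description": "user-defined"}}}
bundled = {"scout": {"description": "hypothetical bundled agent"}}

once = merge_agents(existing, bundled)
twice = merge_agents(once, bundled)

print(once == twice)  # running the merge again changes nothing
```

The user's own `custom` entry survives the merge, and the input dictionary is left untouched, which are the two properties that make a re-run safe.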
Grok Build (xAI)? If flow-next is already installed in Claude Code, Grok Build picks it up automatically — grok inspect shows the skills + hook loaded, zero extra setup. The /flow-next:* commands run when typed and the multi-agent flows work (a full /flow-next:plan fanned out all seven scout subagents end-to-end, verified). Grok's slash autocomplete + grok inspect just under-list flow-next's commands/agents — cosmetic, they work when invoked. (Don't grok plugin install the repo — it's a marketplace, not a single plugin.) See docs/platforms.md.
/flow-next:capture # 1. Synthesize conversation → .flow/specs/<id>.md
/flow-next:plan <spec-id> # 2. Break the spec into dependency-ordered tasks
/flow-next:work <spec-id> # 3. Execute tasks in fresh-context worker subagents
/flow-next:make-pr <spec-id> # 4. Render a cognitive-aid PR body (9 input streams)
/flow-next:resolve-pr <PR#> # 5. Fetch review threads → triage → resolve
That's the inner loop. Branch in (/flow-next:prospect for ranked candidates, /flow-next:interview for structured discovery), branch out (/flow-next:pilot + /flow-next:land for the autonomous assembly line, /flow-next:ralph-init for hardened overnight runs, /flow-next:audit for memory garbage collection).
A /flow-next:plan result: dependency-ordered tasks, cross-model review iterated to SHIP, key decisions documented.
flowchart LR
Idea([💡 Idea]) --> P[/flow-next:prospect/]
Idea --> C[/flow-next:capture/]
P --> C
P -.->|direct via promote| L[/flow-next:plan/]
C --> L
C --> I[/flow-next:interview/]
I --> L
L --> W[/flow-next:work/]
W --> R[/flow-next:impl-review/]
R -->|SHIP| Q[/flow-next:qa/]
R -->|NEEDS_WORK| W
Q -->|YES| Done([🚀 Ship])
Q -->|NO| W
Done -.maintenance.-> A[/flow-next:audit/]
A -.-> M[(.flow/memory/)]
/flow-next:qa is an opt-in live-app QA stage (after work, before make-pr). It drives the deployed app like a real user and only runs when there's a live deploy + a driver; with neither it surfaces the limitation rather than blocking. It augments, never replaces, CI/staging/manual QA: the cheap first live pass that catches obvious runtime breakage before a human opens the PR. Run it by hand, or wire it into the autonomous loop as the optional pipeline.qa pilot stage (flowctl config set pipeline.qa on, default off): plan → plan-review → work → qa → make-pr.
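The gating behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not flowctl's implementation: `pipeline_stages`, `decide_qa`, and `QaDecision` are hypothetical names, and the sketch assumes that with QA off the pipeline simply omits the qa stage.

```python
from dataclasses import dataclass


@dataclass
class QaDecision:
    run: bool
    note: str


def pipeline_stages(qa_enabled: bool) -> list[str]:
    """Stage order for the autonomous loop; qa is included only when switched on."""
    stages = ["plan", "plan-review", "work"]
    if qa_enabled:
        stages.append("qa")
    stages.append("make-pr")
    return stages


def decide_qa(has_live_deploy: bool, has_driver: bool) -> QaDecision:
    """Run QA only with both a live deploy and a driver; otherwise report, don't block."""
    if has_live_deploy and has_driver:
        return QaDecision(True, "driving the deployed app")
    missing = [
        name
        for name, present in (("live deploy", has_live_deploy), ("driver", has_driver))
        if not present
    ]
    return QaDecision(False, "QA skipped, missing: " + ", ".join(missing))


print(pipeline_stages(qa_enabled=True))
print(decide_qa(has_live_deploy=True, has_driver=False).note)
```

Returning a decision with a note, rather than raising an error, mirrors the "surfaces the limitation rather than blocking" behavior: the pipeline continues to make-pr either way.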
The loop is spec-driven. Each step below maps to one skill; click through to flow-next.dev for the full page.
Either synthesize an existing conversation into a structured spec, or — when starting from scratch — generate ranked candidate ideas grounded in the repo. Both land in .flow/specs/<id>.md. Capture source-tags every acceptance criterion as [user] / [paraphrase] / [inferred] and runs a mandatory read-back — you see exactly how much of the spec the agent invented before anything is written.
/flow-next:capture # from a conversation
/flow-next:prospect <focus-hint> # from a focus hint (concept, path, constraint, volume)
→ flow-next.dev/skills/capture · flow-next.dev/skills/prospect
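The source tags make it possible to quantify how much of a spec was invented by the agent. The sketch below counts tags over a list of acceptance criteria; the line layout (`R1 [user] ...`), the sample criteria, and the `tag_counts` helper are hypothetical, since the exact spec file format is not shown here.

```python
import re
from collections import Counter

# The three tag names come from the capture description above.
TAG_PATTERN = re.compile(r"\[(user|paraphrase|inferred)\]")


def tag_counts(criteria: list[str]) -> Counter:
    """Count acceptance criteria by source tag; lines without a tag are 'untagged'."""
    counts: Counter = Counter()
    for line in criteria:
        match = TAG_PATTERN.search(line)
        counts[match.group(1) if match else "untagged"] += 1
    return counts


# Made-up criteria for illustration.
criteria = [
    "R1 [user] Export finishes within 5 seconds",
    "R2 [paraphrase] Export supports CSV",
    "R3 [inferred] Export is logged for audit",
    "R4 [inferred] Large exports run in the background",
]

counts = tag_counts(criteria)
inferred_share = counts["inferred"] / len(criteria)

print(dict(counts))
print(f"{inferred_share:.0%} of criteria were inferred by the agent")
```

A read-back step could present exactly this kind of summary, so the reviewer knows which criteria to scrutinize before the spec is written.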
Deep Q&A pass over a spec or task: lead-with-recommendation, confidence tiers, codebase-first investigation. Use it to flesh out an ambiguous spec before breaking it down. --scope=business|technical|both symmetrically narrows the pass — the same skill serves the PO filling the business layer and the tech lead filling the technical layer, on the same spec file.
/flow-next:interview <spec-id>
→ flow-next.dev/skills/interview
Research the codebase via parallel scouts, then write the spec + tasks together. Tasks fn-N.M declare blockers, inherit context from the parent spec, and declare which acceptance criteria they satisfy (satisfies: [R1, R3]). This skill does not write code — only the plan.
/flow-next:plan <spec-id> # or <free-form text>
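A plan of this shape is a dependency graph plus a coverage map, and both can be checked mechanically. The sketch below uses Python's standard `graphlib` to order tasks by their blockers and to list acceptance criteria no task claims to satisfy. The dictionary layout, the `blockers` key, and the sample tasks are hypothetical; only the `fn-N.M` id style and the `satisfies` field come from the text above.

```python
from graphlib import TopologicalSorter

# Made-up plan: each task lists the tasks that block it and the criteria it satisfies.
tasks = {
    "fn-1.1": {"blockers": [], "satisfies": ["R1"]},
    "fn-1.2": {"blockers": ["fn-1.1"], "satisfies": ["R2"]},
    "fn-1.3": {"blockers": ["fn-1.1", "fn-1.2"], "satisfies": ["R1", "R3"]},
}
spec_criteria = {"R1", "R2", "R3", "R4"}


def execution_order(tasks: dict) -> list[str]:
    """Order tasks so that every blocker runs before the task it blocks."""
    graph = {task_id: set(task["blockers"]) for task_id, task in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


def uncovered(tasks: dict, criteria: set[str]) -> list[str]:
    """Return acceptance criteria that no task declares it satisfies."""
    covered = {req for task in tasks.values() for req in task["satisfies"]}
    return sorted(criteria - covered)


order = execution_order(tasks)
missing = uncovered(tasks, spec_criteria)

print(order)
print(missing)
```

`TopologicalSorter` raises `graphlib.CycleError` if two tasks block each other, so a circular plan fails loudly instead of producing an arbitrary order.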
Execute tasks systematically: each runs in a fresh-context worker subagent, re-anchors against the spec before starting, th