by nelsonwerd
Composable Agent Skills (Claude + OpenAI Codex) for taking an idea from fuzzy → validated → sequenced build → shipped — a manual tier (ideate, deep-dive, prompt-pack) and an autonomous tier (autopilot, build-loop).
# Add to your Claude Code skills
git clone https://github.com/nelsonwerd/idea-to-ship-skillsGuides for using ai agents skills like idea-to-ship-skills.
idea-to-ship-skills is an open-source ai agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by nelsonwerd. Composable Agent Skills (Claude + OpenAI Codex) for taking an idea from fuzzy → validated → sequenced build → shipped — a manual tier (ideate, deep-dive, prompt-pack) and an autonomous tier (autopilot, build-loop). It has 61 GitHub stars.
idea-to-ship-skills's catalog security scan is still queued. You can run an instant dependency and prompt-injection check now with the "Scan for vulnerabilities" button above.
Clone the repository with "git clone https://github.com/nelsonwerd/idea-to-ship-skills" and add it to your Claude Code skills directory (see the Installation section above).
idea-to-ship-skills is primarily written in Shell. It is open-source under nelsonwerd on GitHub, so you can review or fork the full source.
Yes. SkillsLLM lists many other AI Agents skills you can browse and compare side by side. Open the AI Agents category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh idea-to-ship-skills against similar tools.
No comments yet. Be the first to share your thoughts!
Unlocks once the catalog security scan passes (runs nightly).
The deep catalog scan for this skill is still queued. Run an instant dependency check now instead.
Composable Agent Skills — for Claude and OpenAI Codex — that take an idea from fuzzy → validated → sequenced build → shipped: by hand, or on autopilot.
Most "build with AI" workflows skip the hard half. They jump straight to code — and skip deciding what's actually worth building, validating it honestly, and planning the build so it ships in safe, verifiable steps. idea-to-ship is that missing front half — plus the autonomous build loop on the far side of it: a small, sharp suite of Agent Skills (they run in Claude and OpenAI Codex), reverse-engineered from real idea→ship journeys (including the mistakes those journeys made), so you don't repeat them.
They're separate, composable skills on purpose — sharp triggers, lean context, independent use. Run them by hand (the manual tier), or let autopilot fly the whole line for you (the autonomous tier).
Who it's for. Solo and small-team builders who tend to start coding before deciding what's actually worth building — working in Claude or OpenAI Codex. If you've shipped a feature nobody used, rebuilt something and lost the parts that worked, or watched a big change stall halfway, this suite front-loads the discipline that prevents it. Each skill also earns its keep alone. (deep-dive is token-hungry by design — see its note below; it shines most when you're not token-constrained, e.g. on a Claude Max plan.)
ideate — turn a fuzzy idea (or an existing thing you want to improve) into a locked concept + roadmap, captured in one living CONCEPT_BRIEF.md. A blunt, honest co-founder: it forces a success metric and a kill criterion, refuses to spec before the concept survives an honest pressure-test, and hands off cleanly to prompt-pack.deep-dive — the rigor engine ideate leans on for high-stakes validation (and that you can run directly on any codebase, strategy, design, or research question): parallel specialist agents → synthesis → adversarial red-team → a plain-English verdict with honest 1–10 confidence.prompt-pack — turn a settled concept into a sequence of self-contained, independently-shippable build prompts: each does one unit, verifies itself, and leaves the app working before the next. Run them in one chat or spread across many — each prompt is self-contained, so any unit moves cleanly to a fresh chat whenever you want (or need) one. Also writes paste-ready handoffs to resume a chat or relay work to another tool.build-loop — drive a build past "it compiles" to near-finish-line craft: it sees and exercises the running app — render → screenshot → critique → rebuild, multi-pass — and checks the machine facts (build, tests, flows, console, a11y) until acceptance criteria pass or a stop-condition fires (no infinite thrash). When feel is load-bearing the visual design loop runs every iteration. Honest bound: objective craft + a self-graded taste pass, ~80% of the way — not a finished or validated product.→ drive each step yourself, or let autopilot fly the whole line autonomously — in character as a grounded founder-persona — handing back a near-finish-line first draft plus an honest ledger of what only a human or the market can finish.
| If you want to… | Type this | You get back |
|---|---|---|
| Decide what to build | "I have an idea for X — help me decide if it's worth building." | docs/CONCEPT_BRIEF.md |
| Investigate something rigorously | "Do a standard design evaluation of X. Research-only." | research/<topic>/ + an executive briefing |
| Turn settled scope into a build plan | "Make a prompt pack from docs/CONCEPT_BRIEF.md." (or "X is too big for one chat — make me a prompt pack.") |
docs/<TOPIC>_PROMPT_PACK.md |
| Drive a build to near-finish-line craft | "Loop on this until the core flows pass and the UI holds its design bar." | a self-verified, iterated build + an honest craft ledger |
| Fly the whole pipeline autonomously | "Run autopilot on X." | a CONCEPT_BRIEF, a build pack, a first-draft app + an honest hand-off |
Each works standalone; run them in sequence — or on autopilot — for the full idea→ship pipeline.
ideate → docs/CONCEPT_BRIEF.md (excerpt — the locked concept + honest verdict, edited in place across the session, not regenerated):
- Confidence verdict: 7/10 — would move to 8 if 5 target users confirm the triage pain in interviews; down to 4 if they already tolerate shared Gmail.
- One-line promise: Every client message handled by the right person, fast — without anyone owning a chaotic shared inbox.
- Beachhead persona: 2–6 person creative/client-service studios. (Secondary: solo freelancers — not v1.)
- Success metric: % of client messages with a clear owner + reply within 1 business day.
- Kill criterion: If 5 target studios won't try a 2-week pilot, shelve it.
- Scope OUT / deferred: Outlook, analytics dashboard, mobile app — each named with a one-line reason.
- LOCKED: Layer on existing email, don't replace it — lowers switching cost (the wedge).
deep-dive → research/<topic>/NN-executive-briefing.md (excerpt — verdict-first, after parallel specialists + an adversarial red-team):
TL;DR. Sound core; one blocker before you ship. Confidence: 6/10 — 4 of 7 load-bearing findings externally verified (tests +
git); the rest rest on model judgment.
- [Blocker] Currency rounding diverges between server and client —
pricing.ts:142vsformat.ts:88.- [High] No regression test covers the refund path; a silent change there ships unnoticed.
- Should you proceed? Fix the blocker, add the refund test, then ship Phase 1.
prompt-pack → docs/<TOPIC>_PROMPT_PACK.md (excerpt — one self-contained, independently-shippable unit; reads the brief above):
P2 · Add per-currency rounding — Risk: HIGH Read first:
CLAUDE.md,docs/CONCEPT_BRIEF.md,pricing.ts— verify file:line before editing. What MUST NOT change: the publicformatAmount()signature; existing USD output. Verification:npm test pricing+ manual matrix (regression: USD unchanged · new: JPY 0-decimal, BHD 3-decimal). When done: report files changed + test results. Do not commit — wait for explicit go.
Manual tier — drive each step yourself.
Fuzzy idea → locked concept + roadmap. Two modes: greenfield (a new idea) and refinement (evaluate/improve an existing thing). Triggers: "help me figure out what to build", "is this idea any good", "should I rebuild X", "turn my idea into a plan". → ideate-skill
Multi-agent investigative analysis for questions that deserve more than a one-shot answer: audits, strategy/viability evaluations, design reviews, open research. Triggers: "do a deep dive", "thorough audit", "evaluate this strategy", "is this sound/safe". → deep-dive-skill
Note —
deep-diveis token-hungry by design. A full run fans out 4–6 specialist agents (each writing thousands of words), then synthesis, follow-up verification, a red-team pass, and a briefing — easily 10+ agent calls and tens of thousands of tokens for one analysis. That's the right trade for a high-stakes call, and a great fit on a Claude Max plan (or any setup where you're not token-constrained). On a smaller plan, reach for it deliberately: lean on its built-in Scale heuristics (2–3 lanes for narrow scope, skip the red-team for low-stakes work), or ask for a single-pass review instead.ideateandprompt-packare far lighter.
A big job → ordered, self-contained prompts, each shippable on its own, plus handoffs. Run them in one chat or across many. Triggers: "make a prompt pack", "break this into phases", "I'm running out of context", "write me a handoff". → prompt-pack-skill
Autonomous tier — the pipeline drives itself.
Experimental — and honest about why. The autonomous tier is an early, lightly-proven experiment: genuinely capable and a lot of fun to watch, but not battle-tested — treat it as a promising prototype, not a production tool. It produces a near-finish-line-aimed first draft a human finishes — not a finished or market-validated product. Three limits it doesn't escape: a ~80% craft ceiling with a last-mile correctness/security/taste tail; the grounding firewall (real data may discover the problem and seed the build, but a synthetic persona's reaction never counts as validation); and judgment quality isn't cleanly measurable — its go/kill calls are a signal a human weighs, never proof. Market validation stays the human handoff. We