by maxritter
Make Claude Code production-ready — spec-driven plans, enforced quality gates, persistent knowledge
# Add to your Claude Code skills
git clone https://github.com/maxritter/pilot-shellFrom requirement to production-grade code — planned, tested, verified. Spec-driven plans. Enforced quality gates. Persistent knowledge.
curl -fsSL https://raw.githubusercontent.com/maxritter/pilot-shell/main/install.sh | bash
macOS · Linux · Windows (WSL2) — installs in under 2 minutes.
Claude Code writes code fast — but without structure, it skips tests, loses context, and produces inconsistent results. Other frameworks add complexity (dozens of agents, thousands of lines of config) without meaningfully better output.
Pilot Shell is different. Every component solves a real problem:
/spec — plans, implements, and verifies features end-to-end with TDD/prd — turns vague ideas into clear requirements with optional deep researchNo comments yet. Be the first to share your thoughts!
Run pilot for Spec-Driven Development with /spec, or pilot bot for 24/7 automations.
Claude Code: Install Claude Code using the native installer before setting up Pilot Shell. If you have the npm or brew version installed, uninstall it first. If no Claude Code installation is detected, the Pilot installer will attempt to set it up for you.
Claude Subscription: Solo developers should choose Max 5x for moderate usage or Max 20x for heavy usage. Teams should use Team Premium (6.25x usage per member, SSO, admin tools, billing management). Companies with stricter compliance or procurement requirements should use Enterprise (API based pricing applies per usage).
Terminal (Recommended): cmux works great with Pilot Shell — its vertical tab layout lets you run multiple sessions side by side. Any modern terminal works: Ghostty, iTerm2, or the built-in macOS/Linux terminal.
Claude Chrome (Recommended): Install the Claude Code Chrome extension for browser automation and E2E testing. Pilot automatically detects it and uses it as the preferred tool. When Chrome isn't available, Pilot falls back to playwright-cli (reliable element targeting, persistent sessions, tracing) or agent-browser (lightweight, fast startup).
Codex Plugin (Included): The Codex plugin is installed automatically with Pilot. It provides adversarial code review powered by OpenAI Codex — an independent second opinion during /spec planning and verification. Run /codex:setup once to authenticate, then enable reviewers in Console Settings → Reviewers. A ChatGPT Plus subscription ($20/mo) covers the Codex API usage.
Works with any existing project. Pilot Shell is installed on top of Claude Code and uses its built-in concepts like commands, rules, hooks, skills, subagents, MCP, LSP and optimized settings to improve your experience:
curl -fsSL https://raw.githubusercontent.com/maxritter/pilot-shell/main/install.sh | bash
Installs globally on macOS, Linux, and Windows (WSL2). All tools and rules go to ~/.pilot/ and ~/.claude/. After installation, cd into any project and run pilot or ccp to start.
If you encounter an issue or unfixed bug in the latest version, you can always go back to a previous version (see releases):
export VERSION=8.0.7
curl -fsSL https://raw.githubusercontent.com/maxritter/pilot-shell/main/install.sh | bash
Removes the Pilot binary, plugin files, managed commands/rules, settings and shell aliases:
curl -fsSL https://raw.githubusercontent.com/maxritter/pilot-shell/main/uninstall.sh | bash
Pilot Shell works inside Dev Containers. Copy the .devcontainer folder from this repository into your project, adapt it to your needs (base image, extensions, dependencies), and run the installer inside the container. The installer auto-detects the container environment and skips system-level dependencies like Homebrew.
7-step installer with progress tracking, rollback on failure, and idempotent re-runs:
~/.claude/ plugin — rules, commands, hooks, MCP servers.nvmrc and project configpilot aliasexport VERSION=8.0.7
curl -fsSL https://raw.githubusercontent.com/maxritter/pilot-shell/main/install.sh | bash
Just chat — no plan, no approval gate. Quality hooks and TDD enforcement still apply. Best for small tasks and exploration. For anything that needs a plan, use /spec — not Claude Code's built-in plan mode.
/spec replaces Claude Code's built-in plan mode (Shift+Tab). It provides a complete planning workflow with TDD, verification, and code review — use /spec instead of plan mode for all planned work.
Features, bug fixes, refactoring — describe it and /spec handles the rest. Auto-detects whether it's a feature or a bugfix and adapts the workflow. Specs are saved to docs/plans/ and visible in the Console's Specification tab.
pilot
> /spec "Add user authentication with OAuth and JWT tokens" # → feature mode
> /spec "Fix the crash when deleting nodes with two children" # → bugfix mode (auto-detected)
Discuss → Plan → Approve → Implement (TDD) → Verify → Done
↑ ↓
└── Loop──┘
Full exploration workflow for new functionality, refactoring, or architectural changes.
Plan: Explores codebase with semantic search → asks clarifying questions → writes detailed spec with scope, tasks, and definition of done → for UI features, writes E2E test scenarios (step-by-step, browser-executable) that become the verification contract → spec-review sub-agent validates completeness → waits for your approval. Optional Codex adversarial review provides an independent second opinion when enabled.
Implement: Creates an isolated git worktree → implements each task with strict TDD (RED → GREEN → REFACTOR) → quality hooks auto-lint, format, and type-check every edit → full test suite after each task.
Verify: Full test suite + actual program execution → unified review sub-agent (compliance + quality + goal) → for UI features, executes each E2E scenario step-by-step via browser automation (pass/fail tracked, results written to plan) → auto-fixes findings → squash merges to main on success.
Investigation-first workflow for targeted fixes. Finds the root cause before touching any code.
Investigate: Reproduces the bug → traces backward through the call chain to find the root cause at a specific file:line → compares against working code patterns → states the fix with confidence level. If 3+ hypotheses fail, escalates as an architectural problem.
Test-Before-Fix: Writes a regression test that FAILS on current code → implements the minimal fix at the root cause → ve