Dive into Claude Code

A comprehensive source-level architectural analysis of Claude Code (v2.1.88, ~1,900 TypeScript files, ~512K lines of code), combined with a curated collection of community analyses, a design-space guide for agent builders, and cross-system comparisons.

[!TIP] TL;DR -- Only 1.6% of Claude Code's codebase is AI decision logic. The other 98.4% is deterministic infrastructure -- permission gates, context management, tool routing, and recovery logic. The agent loop is a simple while-loop; the real engineering complexity lives in the systems around it. This repo dissects that architecture and distills it into actionable design guidance for anyone building AI agent systems.

From Our Paper

🌟 Key Highlights
📖 Reading Guide
🏗️ Architecture at a Glance
🧭 Values and Design Principles
🔄 The Agentic Query Loop
🛡️ Safety and Permissions
🧩 Extensibility
🧠 Context and Memory
👥 Subagent Delegation
💾 Session Persistence

Beyond the Paper

🛰️ New Signals in the Agent Design Space
🛠️ Build Your Own AI Agent: A Design Guide
⚖️ Cross-System Comparison: Claude Code vs OpenClaw vs Hermes-Agent
🌐 Community Projects & Research
🚀 Other Notable AI Agent Projects
🔖 Citation

Key Highlights

98.4% Infrastructure, 1.6% AI -- The agent loop is a simple while-loop; the real complexity is permission gates, context management, and recovery logic.
5 Values → 13 Principles → Implementation -- Every design choice traces back to human authority, safety, reliability, capability, and adaptability.
Defense in Depth with Shared Failure Modes -- 7 safety layers, but all share performance constraints. 50+ subcommands bypass security analysis.
4 CVEs Reveal a Pre-Trust Window -- Extensions execute before the trust dialog appears.
The Cross-Cutting Harness Resists Reimplementation -- The loop is easy to copy; hooks, classifier, compaction, and isolation are not.

Reading Guide

| If you are a... | Start here | Then read |
|:----------------|:-----------|:----------|
| Agent Builder | Build Your Own Agent | Architecture Deep Dive |
| Security Researcher | Safety and Permissions | Architecture: Safety Layers |
| Product Manager | Key Highlights | Values and Principles |
| Researcher | Full Paper (arXiv) | Community Resources |

Architecture at a Glance

1,884 files · ~512K lines · v2.1.88 · 7 safety layers · 5 compaction stages · 54 tools · 27 hook events · 4 extension mechanisms · 7 permission modes

Claude Code answers four design questions that every production coding agent must face:

| Question | Claude Code's Answer |
|:---------|:---------------------|
| Where does reasoning live? | Model reasons; harness enforces. ~1.6% AI, 98.4% infrastructure. |
| How many execution engines? | One queryLoop for all interfaces (CLI, SDK, IDE). |
| Default safety posture? | Deny-first: deny > ask > allow. Strictest rule wins. |
| Binding resource constraint? | ~200K (older models) / 1M (Claude 4.6 series) context window. 5 compaction layers before every model call. |

The system decomposes into 7 components (User → Interfaces → Agent Loop → Permission System → Tools → State & Persistence → Execution Environment) across 5 architectural layers.

[!NOTE] For the full architectural deep dive -- 7 safety layers, 9-step turn pipeline, 5-layer compaction, and more -- see docs/architecture.md.

Values and Design Principles

The architecture traces from 5 human values through 13 design principles to implementation:

| Value | Core Idea |
|:------|:----------|
| Human Decision Authority | Humans retain control via a principal hierarchy. When a 93% prompt-approval rate revealed approval fatigue, the response was to restructure boundaries, not to add more warnings. |
| Safety, Security, Privacy | The system protects even when human vigilance lapses. 7 independent safety layers. |
| Reliable Execution | Does what was meant. Gather-act-verify loop. Graceful recovery. |
| Capability Amplification | "A Unix utility, not a product." 98.4% is deterministic infrastructure enabling the model. |
| Contextual Adaptability | CLAUDE.md hierarchy, graduated extensibility, trust trajectories that evolve over time. |

| Principle | Design Question |
|:----------|:----------------|
| Deny-first with human escalation | Should unrecognized actions be allowed, blocked, or escalated? |
| Graduated trust spectrum | Fixed permission level, or spectrum users traverse over time? |
| Defense in depth | Single safety boundary, or multiple overlapping ones? |
| Externalized programmable policy | Hardcoded policy, or externalized configs with lifecycle hooks? |
| Context as scarce resource | Single-pass truncation or graduated pipeline? |
| Append-only durable state | Mutable state, snapshots, or append-only logs? |
| Minimal scaffolding, maximal harness | Invest in scaffolding or operational infrastructure? |
| Values over rules | Rigid procedures or contextual judgment with deterministic guardrails? |
| Composable multi-mechanism extensibility | One API or layered mechanisms at different costs? |
| Reversibility-weighted risk assessment | Same oversight for all, or lighter for reversible actions? |
| Transparent file-based config and memory | Opaque DB, embeddings, or user-visible files? |
| Isolated subagent boundaries | Shared context/permissions, or isolation? |
| Graceful recovery and resilience | Fail hard, or recover silently? |

The paper also applies a sixth evaluative lens -- long-term capability preservation -- citing evidence that developers in AI-assisted conditions score 17% lower on comprehension tests.

The Agentic Query Loop

The core is a ReAct-pattern while-loop: assemble context → call model → dispatch tools → check permissions → execute → repeat. It is implemented as an AsyncGenerator that yields streaming events.
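The loop above can be sketched as an async generator. This is a minimal illustration under stated assumptions, not Claude Code's source: `agentLoop`, `LoopEvent`, `Model`, `Tools`, and `PermissionCheck` are hypothetical names, the context is a plain string array, and the model is whatever function the caller passes in.

```typescript
// Hypothetical types; none of these names come from Claude Code's source.
type ToolCall = { name: string; input: string };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type LoopEvent =
  | { kind: "assistant"; text: string }
  | { kind: "tool_result"; tool: string; output: string }
  | { kind: "denied"; tool: string }
  | { kind: "stop"; reason: "no_tool_use" | "max_turns" };

type Model = (history: string[]) => Promise<ModelReply>;
type Tools = Record<string, (input: string) => Promise<string>>;
type PermissionCheck = (call: ToolCall) => boolean;

async function* agentLoop(
  prompt: string,
  model: Model,
  tools: Tools,
  isAllowed: PermissionCheck,
  maxTurns = 8,
): AsyncGenerator<LoopEvent> {
  const history: string[] = [prompt]; // assemble context
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(history); // call model
    history.push(reply.text);
    yield { kind: "assistant", text: reply.text };

    if (reply.toolCalls.length === 0) {
      yield { kind: "stop", reason: "no_tool_use" };
      return;
    }
    for (const call of reply.toolCalls) { // dispatch tools
      const tool = tools[call.name];
      if (tool === undefined || !isAllowed(call)) { // check permissions
        history.push(`denied: ${call.name}`);
        yield { kind: "denied", tool: call.name };
        continue;
      }
      const output = await tool(call.input); // execute
      history.push(output);
      yield { kind: "tool_result", tool: call.name, output };
    }
  }
  yield { kind: "stop", reason: "max_turns" };
}
```

A caller consumes the events with `for await (const event of agentLoop(...))`, which is what lets one loop serve several front ends: each interface only decides how to render the stream.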

Before every model call, five compaction shapers run sequentially (cheapest first): Budget Reduction → Snip → Microcompact → Context Collapse → Auto-Compact.
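A cheapest-first pipeline of this kind can be sketched as an ordered list of shapers. The three shapers below are toy stand-ins and do not reproduce the five real stages; `Shaper`, `shapeContext`, the length-based cost, and the early exit once the context fits are all assumptions of this sketch.

```typescript
// Hypothetical shapes: a "message" is a string and its cost is its length.
type Shaper = { name: string; apply: (messages: string[]) => string[] };

const size = (messages: string[]): number =>
  messages.reduce((total, m) => total + m.length, 0);

// Toy stand-ins, ordered cheapest first.
const shapers: Shaper[] = [
  { name: "truncate-long", apply: (ms) => ms.map((m) => m.slice(0, 20)) },
  { name: "drop-oldest-half", apply: (ms) => ms.slice(Math.floor(ms.length / 2)) },
  { name: "summarize-all", apply: (ms) => [`summary of ${ms.length} messages`] },
];

function shapeContext(messages: string[], budget: number) {
  const applied: string[] = [];
  let current = messages;
  for (const shaper of shapers) {
    if (size(current) <= budget) break; // assumption: stop once the context fits
    current = shaper.apply(current);
    applied.push(shaper.name);
  }
  return { messages: current, applied };
}
```

Ordering by cost means the lossy, expensive stages only run when the cheap ones could not bring the context under budget.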

9-step pipeline per turn: Settings resolution → State init → Context assembly → 5 pre-model shapers → Model call → Tool dispatch → Permission gate → Tool execution → Stop condition

Two execution paths:

StreamingToolExecutor -- begins executing tools as they stream in (latency optimization)
Fallback runTools -- classifies tools as concurrent-safe or exclusive
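One way to act on a concurrent-safe versus exclusive classification is to plan batches before executing anything. The sketch below is an assumption about how such batching could work, not the behavior of `runTools` itself; `PlannedCall` and `planBatches` are hypothetical names.

```typescript
// Hypothetical shape: each call carries its own classification.
type PlannedCall = { tool: string; concurrentSafe: boolean };

// Group consecutive concurrent-safe calls; give each exclusive call its own batch.
function planBatches(calls: PlannedCall[]): PlannedCall[][] {
  const batches: PlannedCall[][] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    const canJoin =
      call.concurrentSafe &&
      last !== undefined &&
      last.every((c) => c.concurrentSafe);
    if (canJoin && last !== undefined) {
      last.push(call);
    } else {
      batches.push([call]);
    }
  }
  return batches;
}
```

Each batch could then be run with `Promise.all`, with batches awaited in order, so that reads overlap while a write never runs alongside anything else.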

Recovery: Max output token escalation (3 retries), reactive compaction (once per turn), prompt-too-long handling, streaming fallback, fallback model
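The first recovery path, output-token escalation with a bounded number of retries, can be sketched as follows. The retry cap of 3 comes from the text; the doubling schedule, the `truncated` flag, and the names `Attempt` and `callWithEscalation` are assumptions made for illustration.

```typescript
// Hypothetical result shape: the model reports whether its output was cut off.
type Attempt = { text: string; truncated: boolean };

function callWithEscalation(
  call: (maxTokens: number) => Attempt,
  initialLimit: number,
  maxRetries = 3,
): { attempt: Attempt; limitsTried: number[] } {
  const limitsTried: number[] = [];
  let limit = initialLimit;
  let attempt = call(limit);
  limitsTried.push(limit);
  for (let retry = 0; retry < maxRetries && attempt.truncated; retry++) {
    limit *= 2; // assumption: the real escalation schedule is not given in the text
    attempt = call(limit);
    limitsTried.push(limit);
  }
  return { attempt, limitsTried };
}
```

Bounding the retries keeps the recovery path from turning a truncated reply into an unbounded loop; after the last retry the caller still receives the truncated attempt and can fall through to another strategy.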

5 stop conditions: No tool use, max turns, context overflow, hook intervention, explicit abort

Safety and Permissions

7 permission modes form a graduated trust spectrum: plan → default → acceptEdits → auto (ML classifier) → dontAsk → bypassPermissions (+ internal bubble).
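The spectrum can be modeled as an ordered tuple so that modes are comparable. The six mode names are taken from the list above (the internal bubble mode is left out); `TRUST_SPECTRUM`, `trustRank`, and `atLeast` are hypothetical helpers, not identifiers from Claude Code.

```typescript
// Ordered from most restrictive to most permissive, following the text.
const TRUST_SPECTRUM = [
  "plan",
  "default",
  "acceptEdits",
  "auto",
  "dontAsk",
  "bypassPermissions",
] as const;

type PermissionMode = (typeof TRUST_SPECTRUM)[number];

// Position on the spectrum: 0 is the most restrictive mode.
function trustRank(mode: PermissionMode): number {
  return TRUST_SPECTRUM.indexOf(mode);
}

// True when `mode` grants at least as much autonomy as `floor`.
function atLeast(mode: PermissionMode, floor: PermissionMode): boolean {
  return trustRank(mode) >= trustRank(floor);
}
```

Deriving the type from the tuple keeps the ordering and the set of valid names in one place, so a misspelled mode is a compile-time error.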

Deny-first: A broad deny always overrides a narrow allow. 7 independent safety layers from tool pre-filtering through shell sandboxing to hook interception. Permissions are never restored on resume -- trust is re-established per session.
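The "strictest rule wins" ordering (deny > ask > allow) can be sketched as a small resolver. The rule shape, the regular-expression matching, and the fallback to `"ask"` for unmatched commands are assumptions of this sketch; `Rule`, `strictness`, and `resolve` are hypothetical names.

```typescript
type Decision = "deny" | "ask" | "allow";
type Rule = { pattern: RegExp; decision: Decision };

// Higher number means stricter; the strictest matching rule wins.
const strictness: Record<Decision, number> = { allow: 0, ask: 1, deny: 2 };

function resolve(command: string, rules: Rule[]): Decision {
  let result: Decision | undefined;
  for (const rule of rules) {
    if (!rule.pattern.test(command)) continue;
    if (result === undefined || strictness[rule.decision] > strictness[result]) {
      result = rule.decision;
    }
  }
  return result ?? "ask"; // assumption: unmatched actions escalate to the human
}
```

Because the resolver compares strictness rather than specificity or rule order, a broad deny such as `/^git /` still overrides a narrow allow such as `/^git status$/`.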

[!WARNING] Shared failure modes: Defense-in-depth degrades when layers share constraints. Per-subcommand parsing can starve the event loop, so commands exceeding 50 subcommands bypass security analysis.
