by Jakedismo
A 100% Rust implementation of code GraphRAG with blazing-fast AST + FastML parsing, a SurrealDB backend, and advanced agentic code analysis tools exposed through MCP for efficient code-agent context management.
# Add to your Claude Code skills
git clone https://github.com/Jakedismo/codegraph-rust
Your codebase, understood.
CodeGraph transforms your entire codebase into a semantically searchable knowledge graph that AI agents can actually reason about—not just grep through.
Ready to get started? Jump to the Installation Guide for step-by-step setup instructions.
Already set up? See the Usage Guide for tips on getting the most out of CodeGraph with your AI assistant.
AI coding assistants are powerful, but they're flying blind. They see files one at a time, grep for patterns, and burn tokens trying to understand your architecture. Every conversation starts from zero.
What if your AI assistant already knew your codebase?
Most semantic search tools create embeddings and call it a day. CodeGraph builds a real knowledge graph:
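Your Code → Build Context → AST + FastML → LSP Resolution → Enrichment → Graph + Embeddings

| Stage | What it contributes |
|-------|---------------------|
| Build Context | Packages, features, targets |
| AST + FastML | Nodes/edges, fast patterns, spans |
| LSP Resolution | Type-aware linking, definitions |
| Enrichment | API surface, module graph, dataflow/docs |
| Graph | Graph traversal |
| Embeddings | Semantic search (hybrid) |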
When you search, you don't just get "similar code"—you get code with its relationships intact. The function that matches your query, plus what calls it, what it depends on, and where it fits in the architecture.
Indexing enrichment adds:
- Dataflow edges (defines, uses, flows_to, returns, mutates) for impact analysis
- Docs and schema sources: README.md, docs/**/*.md, and schema/**/*.surql

Indexing is tiered so you can choose between speed/storage and graph richness. The default is fast.
| Tier | What it enables | Typical use |
|------|-----------------|-------------|
| fast | AST nodes + core edges only (no LSP or enrichment) | Quick indexing, low storage |
| balanced | LSP symbols + docs/enrichment + module linking | Good agentic results without full cost |
| full | All analyzers + LSP definitions + dataflow + architecture | Maximum accuracy/richness |
Tier behavior details:
- fast: disables build context, LSP, enrichment, module linking, dataflow, docs/contracts, and architecture; filters out Uses/References edges.
- balanced: enables build context, LSP symbols, enrichment, module linking, and docs/contracts; filters out References edges.
- full: enables all analyzers and LSP definitions; no edge filtering.

Configure the tier:

- CLI: codegraph index --index-tier balanced
- Env: CODEGRAPH_INDEX_TIER=balanced
- Config: [indexing] tier = "balanced"

When the tier enables LSP (balanced/full), indexing fails fast if required external tools are missing.
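The tier rules above can be read as a lookup from tier to enabled features. The following is a minimal sketch, not CodeGraph's actual types: `IndexTier`, `TierFeatures`, and every field name are hypothetical, and any feature the description does not list for `balanced` (LSP definitions, dataflow, architecture) is assumed to be off there.

```rust
/// The three indexing tiers named in the README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexTier {
    Fast,
    Balanced,
    Full,
}

/// Hypothetical flag set; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TierFeatures {
    pub build_context: bool,
    pub lsp_symbols: bool,
    pub lsp_definitions: bool,
    pub enrichment: bool,
    pub module_linking: bool,
    pub docs_contracts: bool,
    pub dataflow: bool,
    pub architecture: bool,
    /// Edge kinds dropped at this tier.
    pub filtered_edges: &'static [&'static str],
}

const FAST_FILTERED: &[&str] = &["Uses", "References"];
const BALANCED_FILTERED: &[&str] = &["References"];
const FULL_FILTERED: &[&str] = &[];

impl IndexTier {
    /// Parse a tier name such as the value given to `--index-tier`.
    pub fn parse(value: &str) -> Option<Self> {
        match value.trim() {
            "fast" => Some(Self::Fast),
            "balanced" => Some(Self::Balanced),
            "full" => Some(Self::Full),
            _ => None,
        }
    }

    /// Map a tier to the features described in the list above.
    pub fn features(self) -> TierFeatures {
        // Everything is off for `fast`; the other tiers switch features on.
        let (core, deep, filtered_edges) = match self {
            Self::Fast => (false, false, FAST_FILTERED),
            Self::Balanced => (true, false, BALANCED_FILTERED),
            Self::Full => (true, true, FULL_FILTERED),
        };
        TierFeatures {
            build_context: core,
            lsp_symbols: core,
            enrichment: core,
            module_linking: core,
            docs_contracts: core,
            // Assumption: these three are only enabled by `full`.
            lsp_definitions: deep,
            dataflow: deep,
            architecture: deep,
            filtered_edges,
        }
    }
}
```

For example, `IndexTier::parse("balanced")` yields `Some(IndexTier::Balanced)`, whose features enable LSP symbols but still filter `References` edges.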
Required tools by language:
- Rust: rust-analyzer
- TypeScript/JavaScript: node and typescript-language-server
- Python: node and pyright-langserver
- Go: gopls
- Java: jdtls
- C/C++: clangd

If indexing appears to stall during LSP resolution, you can adjust the per-request timeout:

- CODEGRAPH_LSP_REQUEST_TIMEOUT_SECS (default 600, minimum 5)

If LSP resolution fails immediately and the error includes something like Unknown binary 'rust-analyzer' in official toolchain ..., your rust-analyzer is a rustup shim without an installed binary. Install a runnable rust-analyzer (e.g. via brew install rust-analyzer or by switching to a toolchain that provides it).
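The timeout setting mentioned above (default 600, minimum 5) can be modeled as a small parsing helper. This is a sketch under assumptions: the function names are hypothetical, and it assumes that an unset or unparsable value falls back to the default and that a value below the minimum is raised to it, which the README does not spell out.

```rust
use std::time::Duration;

/// Default and minimum as stated in the README.
const DEFAULT_TIMEOUT_SECS: u64 = 600;
const MIN_TIMEOUT_SECS: u64 = 5;

/// Resolve the per-request timeout from a raw environment value.
///
/// Assumptions: missing or non-numeric input yields the default;
/// values under the minimum are raised to the minimum.
pub fn resolve_lsp_timeout_secs(raw: Option<&str>) -> u64 {
    raw.and_then(|value| value.trim().parse::<u64>().ok())
        .map(|secs| secs.max(MIN_TIMEOUT_SECS))
        .unwrap_or(DEFAULT_TIMEOUT_SECS)
}

/// Read the documented variable from the process environment.
pub fn lsp_timeout_from_env() -> Duration {
    let raw = std::env::var("CODEGRAPH_LSP_REQUEST_TIMEOUT_SECS").ok();
    Duration::from_secs(resolve_lsp_timeout_secs(raw.as_deref()))
}
```

Keeping the parsing separate from the environment read makes the clamping rule easy to check without touching process state.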
If you want CodeGraph to flag forbidden package dependencies, add codegraph.boundaries.toml at the project root:
[[deny]]
from = "your_crate"
to = "forbidden_crate"
reason = "explain the boundary"
Indexing will emit violates_boundary edges when a depends_on relationship matches a deny rule.
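To make the deny-rule behavior concrete, here is a minimal sketch of how `depends_on` relationships could be checked against `[[deny]]` entries. The struct and function names are hypothetical, and exact string equality on package names is an assumption; the README does not say whether CodeGraph supports patterns or prefixes.

```rust
/// One `[[deny]]` entry from `codegraph.boundaries.toml`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DenyRule {
    pub from: String,
    pub to: String,
    pub reason: String,
}

/// A `depends_on` relationship between two packages.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DependsOn {
    pub from: String,
    pub to: String,
}

/// A `violates_boundary` edge carrying the reason from the matching rule.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ViolatesBoundary {
    pub from: String,
    pub to: String,
    pub reason: String,
}

/// Emit one violation for every dependency that matches a deny rule.
/// Assumption: a rule matches when both package names are equal.
pub fn boundary_violations(deps: &[DependsOn], rules: &[DenyRule]) -> Vec<ViolatesBoundary> {
    let mut violations = Vec::new();
    for dep in deps {
        for rule in rules {
            if rule.from == dep.from && rule.to == dep.to {
                violations.push(ViolatesBoundary {
                    from: dep.from.clone(),
                    to: dep.to.clone(),
                    reason: rule.reason.clone(),
                });
            }
        }
    }
    violations
}
```

With the example rule above, a dependency from `your_crate` to `forbidden_crate` produces a single violation whose reason is "explain the boundary"; any other dependency produces none.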
CodeGraph doesn't return a list of files and wish you luck. It ships 4 consolidated agentic tools that do the thinking:
| Tool | What It Actually Does |
|------|----------------------|
| agentic_context | Gathers the context you need—searches code, builds comprehensive context, answers semantic questions |
| agentic_impact | Maps change impact—dependency chains, call flows, what breaks if you touch something |
| agentic_architecture | The big picture—system structure, API surfaces, architectural patterns |
| agentic_quality | Risk assessment—complexity hotspots, coupling metrics, refactoring priorities |
Each tool accepts an optional focus parameter for precision when needed:
| Tool | Focus Values | Default Behavior |
|------|-------------|-----------------|
| agentic_context | "search", "builder", "question" | Auto-selects based on query |
| agentic_impact | "dependencies", "call_chain" | Analyzes both |
| agentic_architecture | "structure", "api_surface" | Provides both |
| agentic_quality | "complexity", "coupling", "hotspots" | Comprehensive assessment |
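The focus table above amounts to a per-tool allow-list. A client that wants to reject a bad `focus` before calling a tool could check it as sketched below. The function names and error strings are hypothetical; only the tool names and focus values come from the table.

```rust
const CONTEXT_FOCUS: &[&str] = &["search", "builder", "question"];
const IMPACT_FOCUS: &[&str] = &["dependencies", "call_chain"];
const ARCHITECTURE_FOCUS: &[&str] = &["structure", "api_surface"];
const QUALITY_FOCUS: &[&str] = &["complexity", "coupling", "hotspots"];

/// Focus values accepted by each consolidated tool.
pub fn allowed_focus(tool: &str) -> Option<&'static [&'static str]> {
    match tool {
        "agentic_context" => Some(CONTEXT_FOCUS),
        "agentic_impact" => Some(IMPACT_FOCUS),
        "agentic_architecture" => Some(ARCHITECTURE_FOCUS),
        "agentic_quality" => Some(QUALITY_FOCUS),
        _ => None,
    }
}

/// Validate an optional focus value.
/// `None` is always valid: the tool then uses its default behavior.
pub fn validate_focus(tool: &str, focus: Option<&str>) -> Result<(), String> {
    let allowed = allowed_focus(tool).ok_or_else(|| format!("unknown tool: {}", tool))?;
    match focus {
        None => Ok(()),
        Some(value) if allowed.contains(&value) => Ok(()),
        Some(value) => Err(format!(
            "invalid focus {:?} for {}; expected one of {:?}",
            value, tool, allowed
        )),
    }
}
```

Here `validate_focus("agentic_impact", Some("call_chain"))` succeeds, while passing `"search"` to `agentic_quality` returns an error listing the accepted values.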
Each tool runs a reasoning agent that plans, searches, analyzes graph relationships, and synthesizes an answer. Not a search result—an answer.
View Agent Context Gathering Flow - Interactive diagram showing how agents use graph tools to gather context.

CodeGraph implements agents using Rig, the default and recommended choice (the legacy react and lats architectures, implemented with autoagents, still work). The architecture is selectable at runtime, e.g. CODEGRAPH_AGENT_ARCHITECTURE=rig.
Why Rig is Default: The Rig-based backend delivers the best performance with modern thinking and reasoning models. It is a native Rust implementation that supports internal sub-architectures and provides features like True Token Streaming and Automatic Recovery.
Internal Rig Sub-Architectures:
When using the rig backend, the system automatically maps the consolidated agentic tools to the most effective reasoning strategy:
- agentic_architecture (structure), agentic_quality, and agentic_context (question).
- agentic_context (search/builder), agentic_impact, and agentic_architecture (api_surface).

Agents can start with lightweight project context so their first tool calls are not blind. Enable via env:

- CODEGRAPH_ARCH_BOOTSTRAP=true: includes a brief directory/structure bootstrap plus the contents of README.md and CLAUDE.md + AGENTS.md or GEMINI.md (if present) in the agent’s initial context.
- CODEGRAPH_ARCH_PRIMER="<primer text>": optional custom primer injected into startup instructions (e.g., areas to focus on).

Why? Faster, more relevant early steps, fewer wasted graph/semantic queries, and better architecture answers on large repos.
Notes:
- Project selection uses CODEGRAPH_PROJECT_ID or the current working directory.

# Use Rig for best performance with thinking and reasoning models (recommended)
CODEGRAPH_AGENT_ARCHITECTURE=rig ./codegraph start stdio
# Use default ReAct for traditional instruction models
./codegraph start stdio
# Use LATS for complex analysis
CODEGRAPH_AGENT_ARCHITECTURE=lats ./codegraph start stdio
All architectures use the same 4 consolidated agentic tools (backed by 6 internal graph analysis tools) and tier-aware prompting—only the reasoning strategy differs.
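For readers wiring this into their own launcher, the runtime selection can be sketched as parsing the `CODEGRAPH_AGENT_ARCHITECTURE` value into an enum. This is illustrative only: the type name is hypothetical, the `react` spelling is inferred from the prose rather than shown in a command, and trimming plus case-insensitive matching is an assumption. What happens when the variable is unset is left to the caller.

```rust
use std::str::FromStr;

/// The three agent architectures named in the README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentArchitecture {
    Rig,
    React,
    Lats,
}

impl FromStr for AgentArchitecture {
    type Err = String;

    /// Assumption: values are matched case-insensitively after trimming.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value.trim().to_ascii_lowercase().as_str() {
            "rig" => Ok(Self::Rig),
            "react" => Ok(Self::React),
            "lats" => Ok(Self::Lats),
            other => Err(format!("unknown agent architecture: {}", other)),
        }
    }
}

/// Read the variable if it is set; `None` means the caller picks a fallback.
pub fn architecture_from_env() -> Option<Result<AgentArchitecture, String>> {
    std::env::var("CODEGRAPH_AGENT_ARCHITECTURE")
        .ok()
        .map(|value| value.parse())
}
```

Returning `Option<Result<..>>` keeps "not set" distinct from "set to something unrecognized", so a launcher can warn on typos instead of silently falling back.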
Here's something clever: CodeGraph automatically adjusts its behavior based on the LLM's context window that you configured for the codegraph agent.
Running a small local model? Get focused, efficient queries.
Using GPT-5.1 or Claude with 200K context? Get comprehensive, exploratory analysis.
Using grok-4-1-fast-reasoning with 2M context? Get detailed analysis with intelligent result management.
The agent uses only as many steps as it needs to produce the answer, so tool execution times vary with the query and the amount of data indexed in the database.
During development, the agent used 3-6 steps on average to produce answers for test scenarios.
The agent is stateless: it keeps conversational memory only for the span of a single tool execution and does not accumulate context or memory across chained tool calls. Your client of choice already accumulates that context, so CodeGraph only needs to provide answers.
| Your Model | CodeGraph's Behavior |
|------------|---------------------|
| < 50K tokens | Terse prompts, max 3 steps |
| 50K-150K | Balanced analysis, max 5 steps |
| 150K-500K | Detailed exploration, max 6 steps |
| > 500K (Grok, etc.) | Comprehensive analysis, max 8 steps |
Hard cap: Maximum 8 steps regardless of tier (10 with env override). This prevents runaway costs and context overflow while still allowing thorough analysis.
Same tool, automatically optimized for your setup.
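The context-window table can be sketched as a function from window size to a step budget. Several details here are assumptions: the tier names are hypothetical, "K" is read as 1,000 tokens, the README does not say which tier a window of exactly 50K, 150K, or 500K falls into, and the override is modeled as a plain requested step count capped at 10 because the override mechanism is not described.

```rust
/// Hypothetical names for the four rows of the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContextTier {
    Small,
    Medium,
    Large,
    Massive,
}

/// Classify a context window, in tokens.
/// Assumption: boundary values belong to the row that names them first.
pub fn context_tier(context_window_tokens: u64) -> ContextTier {
    match context_window_tokens {
        0..=49_999 => ContextTier::Small,
        50_000..=150_000 => ContextTier::Medium,
        150_001..=500_000 => ContextTier::Large,
        _ => ContextTier::Massive,
    }
}

/// Step budget for a tier.
/// Without an override the budget never exceeds 8; with one it never exceeds 10.
pub fn max_steps(tier: ContextTier, override_steps: Option<u32>) -> u32 {
    const HARD_CAP: u32 = 8;
    const OVERRIDE_CAP: u32 = 10;

    let tier_default = match tier {
        ContextTier::Small => 3,
        ContextTier::Medium => 5,
        ContextTier::Large => 6,
        ContextTier::Massive => 8,
    };

    match override_steps {
        // Assumption: an override replaces the tier default, up to the cap.
        Some(requested) => requested.min(OVERRIDE_CAP),
        None => tier_default.min(HARD_CAP),
    }
}
```

Under these assumptions a 200K-token model gets a budget of 6 steps, a 2M-token model gets 8, and an override asking for 25 steps is held to 10.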
CodeGraph includes multi-layer protection against context overflow—preven