by jgravelle
The leading, most token-efficient MCP server for GitHub source code exploration via tree-sitter AST parsing
```shell
# Add to your Claude Code skills
git clone https://github.com/jgravelle/jcodemunch-mcp
```
Quickstart - https://github.com/jgravelle/jcodemunch-mcp/blob/main/QUICKSTART.md
A crapload of detailed info: http://jcodemunch.com/
Use it to make money, and Uncle J. gets a taste. Fair enough?
| Doc | What it covers |
|-----|----------------|
| QUICKSTART.md | Zero-to-indexed in three steps |
| USER_GUIDE.md | Full tool reference, workflows, and best practices |
| AGENT_HOOKS.md | Agent hooks and prompt policies |
| CONFIGURATION.md | JSONC config file reference, migration from env vars |
| GROQ.md | Groq Remote MCP integration, deployment, gcm CLI |
| HEADLESS.md | Using jCodeMunch with claude -p (and the jragmunch CLI) |
| ARCHITECTURE.md | Internal design, storage model, and extension points |
| LANGUAGE_SUPPORT.md | Supported languages and parsing details |
| CONTEXT_PROVIDERS.md | dbt, Git, and custom context provider docs |
| TROUBLESHOOTING.md | Common issues and fixes |
Most AI agents explore repositories the expensive way:
open entire files → skim thousands of irrelevant lines → repeat.
That is not “a little inefficient.” That is a token incinerator.
jCodeMunch indexes a codebase once and lets agents retrieve only the exact code they need: functions, classes, methods, constants, outlines, and tightly scoped context bundles, with byte-level precision.
In retrieval-heavy workflows, that routinely cuts code-reading token usage by 95%+ because the agent stops brute-reading giant files just to find one useful implementation.
| Task | Traditional approach | With jCodeMunch |
|------|----------------------|-----------------|
| Find a function | Open and scan large files | Search symbol → fetch exact implementation |
| Understand a module | Read broad file regions | Pull only relevant symbols and imports |
| Explore repo structure | Traverse file after file | Query outlines, trees, and targeted bundles |
Index once. Query cheaply. Keep moving. Precision context beats brute-force context.
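The workflow in the table above can be sketched in a few lines. This is a toy in-memory stand-in, not the real MCP wiring: the tool names (`search_symbols`, `get_symbol_source`) mirror the docs, but the substring match here is only a placeholder for the server's real BM25/fuzzy search.

```python
class SymbolIndex:
    """Toy in-memory index; a stand-in for the real jCodeMunch server."""

    def __init__(self, symbols):
        self.symbols = symbols  # name -> source text

    def search_symbols(self, query):
        # crude substring match standing in for BM25/fuzzy search
        return [{"name": n} for n in self.symbols if query in n]

    def get_symbol_source(self, name):
        return self.symbols[name]


def explore(index, query, top_k=3):
    """Search the symbol index, then fetch only the matching bodies."""
    hits = index.search_symbols(query)[:top_k]  # cheap: names + metadata only
    return {h["name"]: index.get_symbol_source(h["name"]) for h in hits}


idx = SymbolIndex({
    "get_user": "def get_user(uid): ...",
    "get_user_email": "def get_user_email(uid): ...",
    "delete_user": "def delete_user(uid): ...",
})
found = explore(idx, "get_user")  # fetches two small bodies, not whole files
```

The agent pays for a handful of symbol bodies instead of every file that happens to mention the query.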
Retrieval decides what to send. MUNCH decides how to pack it.
Every tool response can be emitted in a purpose-built compact wire format instead of verbose JSON. Path prefixes are interned to short handles, homogeneous lists of dicts pack into single-character-tagged CSV rows, and per-column types are preserved so the decode is lossless.
```python
# any tool call accepts format=
find_references(identifier="get_user", format="auto")

# auto    — emit compact if savings ≥ 15%, otherwise JSON
# compact — always compact
# json    — never compact (back-compat passthrough)
```
Benchmark (v1.56.0): median 45.5% bytes saved across 6 representative tools, peaks at 55.4% on graph and outline responses. Full spec in SPEC_MUNCH.md; numbers and harness in TOKEN_SAVINGS.md.
Encoding savings stack on top of retrieval savings — every byte off the wire is a byte the agent doesn't pay to read.
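One of the ideas above — interning repeated path prefixes into short handles — can be illustrated in isolation. The real wire format is specified in SPEC_MUNCH.md and is considerably richer (typed CSV rows, lossless decode); this sketch shows only the prefix-interning step, and the `@0` handle syntax is illustrative, not the spec's.

```python
import os

def intern_paths(paths):
    """Intern the shared directory prefix into a short handle.

    Toy illustration of one MUNCH idea; handle syntax is invented here.
    """
    prefix = os.path.commonpath(paths)
    table = {"@0": prefix}  # handle -> interned prefix, sent once
    packed = ["@0/" + os.path.relpath(p, prefix) for p in paths]
    return table, packed


table, packed = intern_paths([
    "src/app/models/user.py",
    "src/app/models/order.py",
    "src/app/views/user_view.py",
])
# "src/app" is stored once; each row carries only "@0/<relative path>"
```

On responses dominated by long repeated paths — graph and outline tools — this kind of deduplication is where the larger byte savings come from.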
Commercial licenses
jCodeMunch-MCP is free for non-commercial use.
Commercial use requires a paid license.
jCodeMunch-only licenses
- Builder — $79 — 1 developer
- Studio — $349 — up to 5 developers
- Platform — $1,999 — org-wide internal deployment
Want both code and docs retrieval?
Stop paying your model to read the whole damn file.
jCodeMunch turns repo exploration into structured retrieval.
Instead of forcing an agent to open giant files and wade through imports, boilerplate, comments, helpers, and unrelated code, jCodeMunch lets it navigate by what the code is and retrieve only what matters.
Concretely, it indexes your codebase once using tree-sitter, stores structured symbol metadata plus byte offsets into the original source, and retrieves exact implementations on demand instead of re-reading entire files.
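The byte-offset retrieval described above amounts to a seek-and-slice. A minimal sketch, assuming an index that maps each symbol to `(path, start, end)` byte offsets of the kind a tree-sitter parse would yield — the helper name mirrors the docs, but this implementation is illustrative:

```python
import tempfile

def get_symbol_source(path, start, end):
    """Fetch one symbol by byte offsets instead of reading the whole file."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).decode("utf-8")


# one-time "index": symbol -> (path, start byte, end byte)
source = b"def a():\n    pass\n\ndef b():\n    return 42\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".py") as tmp:
    tmp.write(source)
index = {"b": (tmp.name, source.index(b"def b"), len(source))}

snippet = get_symbol_source(*index["b"])  # reads ~25 bytes, not the file
```

Because offsets point into the original source, retrieval is exact: no re-parsing, no approximate line ranges.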
Recent releases have made that retrieval workflow sharper and more useful in real engineering work:

- **Search** — BM25-based symbol search, fuzzy matching, semantic/hybrid search (opt-in, zero mandatory dependencies), decorator-aware search and filtering
- **Context assembly** — query-driven token-budgeted context assembly (get_ranked_context), multi-symbol context bundles with token budgets
- **Code health** — dead code detection (find_dead_code), untested symbol detection (get_untested_symbols), hotspot detection (complexity × churn), dependency cycles and coupling metrics
- **Git awareness** — git-diff-to-symbol mapping (get_changed_symbols), symbol provenance archaeology (get_symbol_provenance — full git lineage, semantic commit classification, evolution narrative), unified PR risk profiling (get_pr_risk_profile — composite risk score fusing blast radius, complexity, churn, test gaps, and volume)
- **Graphs** — AST-derived call graphs and call hierarchy traversal, dependency graphs, class hierarchy traversal, architectural centrality ranking (get_symbol_importance, PageRank), blast-radius depth scoring with source snippets
- **Agent integration** — session-aware routing (plan_turn, turn budgets, negative evidence), agent config auditing, complexity-based model routing (Agent Selector), enforcement hooks (PreToolUse/PostToolUse/PreCompact)
- **Freshness** — live watch-based reindexing, automatic Claude Code worktree discovery (watch-claude), registry-wide auto-reindexing with one-command login-service install (watch-all + watch-install / watch-uninstall / watch-status; also exposed as MCP tool get_watch_status), auto-watch on demand (when watch: true in config, the server automatically indexes and watches any repo a tool is called against — ensuring fresh results from the first call)
- **Safety** — trusted-folder access controls, automatic response secret redaction (AWS/GCP/Azure/JWT/GitHub tokens scrubbed before reaching the LLM context window)
- **Refactoring** — edit-ready refactoring plans (plan_refactoring) for rename, move, extract, and signature change operations
- **Structural queries** — cross-language AST pattern matching (search_ast — 10 preset anti-pattern detectors + custom mini-DSL for structural queries like call:*.unwrap, string:/password/i, nesting:5+; works across all 70+ languages with universal node-type mapping)
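To make the `nesting:5+` style of structural query concrete, here is a toy depth detector using Python's own `ast` module. This is an assumption-laden miniature: the real search_ast walks tree-sitter node types across 70+ languages, not Python's AST.

```python
import ast

def max_nesting(source):
    """Deepest chain of nested block statements in Python source.

    Toy analogue of a `nesting:N+` query; jCodeMunch's search_ast
    uses tree-sitter node types, not Python's ast module.
    """
    blocks = (ast.If, ast.For, ast.While, ast.With, ast.Try)

    def depth(node, d=0):
        d += isinstance(node, blocks)  # count this node if it's a block stmt
        return max([d] + [depth(c, d) for c in ast.iter_child_nodes(node)])

    return depth(ast.parse(source))


code = """
def f(xs):
    for x in xs:
        if x:
            while x:
                x -= 1
"""
```

Here `max_nesting(code)` reports 3 (for → if → while); a `nesting:5+` query would flag only symbols at depth five or more.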
Measured with tiktoken cl100k_base across three public repos. Workflow: search_symbols (top 5) + get_symbol_source × 3 per query. Baseline: all source files concatenated (minimum cost for an agent that reads everything). Full methodology and harness →
| Repository | Files | Symbols | Baseline tokens | jCodeMunch tokens | Reduction |
|------------|------:|--------:|----------------:|------------------:|----------:|
| expressjs/express | 34 | 117 | 73,838 | ~1,300 avg | 98.4% |
| fastapi/fastapi | 156 | 1,359 | 214,312 | ~15,600 avg | 92.7% |
| gin-gonic/gin | 40 | 805 | 84,892 | ~1,730 avg | 98.0% |
| Grand total (15 task-runs) | | | 1,865,210 | 92,515 | 95.0% |
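The shape of that measurement is simple: count baseline tokens (everything concatenated) versus tokens actually retrieved. A minimal sketch — using a crude whitespace tokenizer as a stand-in for the tiktoken cl100k_base encoding the published numbers use:

```python
def count_tokens(text):
    """Whitespace proxy; the real benchmark uses tiktoken cl100k_base."""
    return len(text.split())


def reduction_pct(baseline_texts, retrieved_texts):
    """Percent of code-reading tokens avoided by targeted retrieval."""
    base = sum(count_tokens(t) for t in baseline_texts)
    used = sum(count_tokens(t) for t in retrieved_texts)
    return 100.0 * (1 - used / base)


# shape of the benchmark: whole-repo baseline vs. a few fetched symbols
pct = reduction_pct(["tok " * 1000], ["tok " * 50])  # 95.0
```

Swapping in `tiktoken.get_encoding("cl100k_base")` for `count_tokens` reproduces the published methodology.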
Pe