by elara-labs
Save 94% on Claude Code tokens. Index your codebase and the AI searches the index instead of reading files. Local MCP server. Free, open source.
# Add to your Claude Code skills
git clone https://github.com/elara-labs/code-context-engine
| Use case | How CCE helps |
|----------|---------------|
| Reduce Claude Code costs | 94% fewer input tokens per session |
| Keep code private | Everything local, no cloud indexing |
| Multi-editor teams | One index across Claude Code, Cursor, VS Code, Gemini CLI |
| Cross-session memory | Decisions and context survive restarts |
| Faster responses | Less context = faster Claude replies |
| Track actual savings | Dollar amounts, not estimates |
uv tool install code-context-engine
cd /path/to/your/project
cce init
That's it. Claude now searches your index instead of reading entire files. No config needed.
Prerequisites: a C/C++ compiler and cmake (needed to build tree-sitter grammars).

| Platform | Setup |
|----------|-------|
| macOS | xcode-select --install (provides compiler and cmake) |
| Ubuntu/Debian | sudo apt install build-essential cmake |
| Fedora/RHEL | sudo dnf install gcc gcc-c++ cmake |
| Windows | Install Visual Studio Build Tools (C++ workload) and CMake |
Tested on all three platforms in CI (macOS, Linux, Windows × Python 3.11/3.12/3.13).
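If you want to sanity-check those prerequisites before indexing, a quick script like the one below works. It only looks for the tools on PATH and is not part of CCE itself; the tool names are the common compiler/driver binaries per platform.

```python
import shutil

# Illustrative pre-flight check (not part of CCE): confirm cmake and a
# C/C++ compiler are visible on PATH before the tree-sitter grammars build.
for tool in ("cmake", "cc", "gcc", "clang", "cl"):
    path = shutil.which(tool)
    print(f"{tool:>5}: {path or 'not found'}")
```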
uv tool install code-context-engine # or: pipx install code-context-engine
cd /path/to/your/project
cce init # index, install hooks, register MCP server
Embedding backends: CCE auto-detects the best available backend. If you have Ollama running, it uses nomic-embed-text with zero extra dependencies. For offline/local embedding without Ollama, install the [local] extra:
uv tool install "code-context-engine[local]" # includes fastembed + ONNX Runtime
Restart your editor. Done. Every question now hits the index instead of re-reading files.
cce init auto-detects your editor and writes the right config:
| Editor | Config written | Instructions |
|--------|---------------|--------------|
| Claude Code | .mcp.json | CLAUDE.md |
| VS Code / Copilot | .vscode/mcp.json | |
| Cursor | .cursor/mcp.json | .cursorrules |
| Gemini CLI | .gemini/settings.json | GEMINI.md |
| OpenAI Codex | ~/.codex/config.toml (user-global, per-project section) | |
| OpenCode | opencode.json | |
| Tabnine | .tabnine/agent/settings.json | TABNINE.md |
Multiple editors in the same project? All get configured in one command.
Codex note: Codex CLI reads MCP servers from ~/.codex/config.toml only; it has no per-project config. cce init adds one [mcp_servers.cce-<project>-<hash>] section per project so multiple projects coexist; cce uninstall removes only the section for the current project.
Example savings summary (my-project, 38 queries, 94% tokens saved):

| | Tokens | Cost |
|---|--------|------|
| Without CCE | 48.0k | $0.14 |
| With CCE | 3.4k | $0.01 |
| Saved | 44.6k | $0.13 |

Cost estimate based on Sonnet input pricing ($3/1M tokens).
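The dollar figures are just the token counts multiplied by that $3 per million input tokens; a quick check:

```python
PRICE_PER_TOKEN = 3 / 1_000_000  # Sonnet input pricing quoted above

for label, tokens in [("Without CCE", 48_000), ("With CCE", 3_400), ("Saved", 44_600)]:
    print(f"{label:>11}: {tokens / 1000:.1f}k tokens = ${tokens * PRICE_PER_TOKEN:.2f}")
```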
Input tokens are 85-95% of your Claude Code bill. CCE cuts them by 94% (benchmarked on FastAPI).
Without CCE: Claude reads payments.py + shipping.py = 45,000 tokens
With CCE: context_search "payment flow" = 800 tokens
| | Without CCE | With CCE |
|---|-------------|----------|
| Session startup | Re-reads files every time | Queries the index |
| Finding a function | Read entire 800-line file | Get the 40-line function |
| Cross-session memory | None | Decisions + code areas persisted |
| Token cost (Sonnet, medium project) | ~$0.14/session | ~$0.04/session |
We benchmarked CCE against FastAPI (53 source files, 180K tokens) with 20 real coding questions. No cherry-picking, no synthetic queries.
Methodology: For each query, "without CCE" means reading the full content of every file the query touches. "With CCE" means the relevant chunks after compression.
Important baseline note: The 94% number is measured against full-file reads, not against what Claude Code actually does. In practice, Claude Code already uses grep, partial file reads, and targeted tools, so the real-world savings compared to normal Claude Code behavior will be lower than 94%. We use full-file as the baseline because it's reproducible and deterministic (no agent behavior variability). The benchmark measures CCE's retrieval efficiency, not a head-to-head comparison with Claude Code's built-in exploration.
| Metric | Result |
|--------|--------|
| Retrieval savings | 94% (83,681 → 4,927 tokens/query) |
| Compression (additional, on retrieved chunks) | 89% (4,927 → 523 tokens/query) |
| Recall@10 (found the right files) | 0.90 |
| Latency p50 | 0.4ms |
| Queries tested | 20 |
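The two headline percentages are plain ratios of the per-query token counts:

```python
def savings(before: int, after: int) -> float:
    """Fraction of tokens removed going from `before` to `after`."""
    return 1 - after / before


print(f"Retrieval:   {savings(83_681, 4_927):.0%}")  # 94%
print(f"Compression: {savings(4_927, 523):.0%}")     # 89%
```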
| Layer | What it does | Savings | Method |
|-------|--------------|---------|--------|
| Retrieval | Full files → relevant code chunks | 94% | measured |
| Chunk Compression | Raw chunks → signatures + docstrings | 89% | measured |
| Grammar | Drops articles/fillers from memory text | 13% | measured |
Output compression (reducing Claude's reply length) provides additional savings (~65% estimated) but is not included in the headline number above.
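To make "signatures + docstrings" concrete: the idea is to keep a function or class header plus the first line of its docstring and drop the body. CCE's indexer builds on tree-sitter grammars (hence the cmake prerequisite above); the sketch below uses Python's ast module purely to illustrate the transformation, not CCE's implementation.

```python
import ast


def compress_chunk(source: str) -> str:
    """Reduce a Python chunk to signatures + docstrings (illustration only)."""
    out = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            out.append(f"class {node.name}:")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            out.append(f"def {node.name}({args}):")
        else:
            continue
        doc = ast.get_docstring(node)
        if doc:
            out.append(f'    """{doc.splitlines()[0]}"""')
    return "\n".join(out)


chunk = '''
def charge(card, amount):
    """Charge a card and return the payment id."""
    validate(card)
    return gateway.submit(card, amount)
'''
print(compress_chunk(chunk))  # keeps the signature and docstring, drops the body
```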
| Repo | Language | Files | Retrieval savings | Recall@10 |
|------|----------|-------|-------------------|-----------|
| FastAPI | Python | 53 | 94% | 0.90 |
| chi | Go | 94 | 76% | 0.67 |
| fiber | Go (monorepo) | 396 | 93% | 0.07 |
Go's shorter files reduce the retrieval headroom (a smaller baseline). Monorepos dilute recall at top-10 (fiber). Middleware queries, where each feature lives in its own file, consistently hit Recall@10 = 1.00.
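Recall@10 is read here as: for each query, the fraction of its relevant files that appear in the top-10 retrieved results, averaged over queries (the code in benchmarks/ is the authoritative definition). A minimal per-query version, with a hypothetical query reusing the payments/shipping example above:

```python
def recall_at_10(relevant: set[str], retrieved: list[str]) -> float:
    """Fraction of a query's relevant files found in the top-10 results."""
    hits = relevant & set(retrieved[:10])
    return len(hits) / len(relevant)


# Hypothetical query: two relevant files, one surfaces in the top 10.
print(recall_at_10({"payments.py", "shipping.py"},
                   ["payments.py", "models.py", "api.py"]))  # 0.5
```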
Reproduce it yourself:
pip install code-context-engine
python benchmarks/run_benchmark.py --repo https://github.com/fastapi/fastapi.git --source-dir fastapi
python benchmarks/run_benchmark.py --repo https://github.com/go-chi/chi.git --source-dir .
Full results in benchmarks/results/. Queries and methodology in benchmarks/.