by elara-labs
Save 94% on AI coding tokens. Index your codebase, agents search instead of reading files. Works with Claude Code, Codex, Copilot, Cursor, Gemini CLI. Local MCP server, free, open source.
# Add to your Claude Code skills
git clone https://github.com/elara-labs/code-context-engine
Last scanned: 5/30/2026
{
"issues": [],
"status": "PASSED",
"scannedAt": "2026-05-30T16:14:58.900Z",
"npmAuditRan": true,
"pipAuditRan": true
}
code-context-engine is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by elara-labs. It has 180 GitHub stars.
code-context-engine passed SkillsLLM's automated security scan (a dependency vulnerability audit plus prompt-injection heuristics) with no high-severity issues.
code-context-engine is primarily written in Python. It is open-source under elara-labs on GitHub, so you can review or fork the full source.
| | Use case | How CCE helps |
|---|---|---|
| 💰 | Reduce Claude Code costs | 94% fewer input tokens per session |
| 🔒 | Keep code private | Everything local, no cloud indexing |
| 🔄 | Multi-editor teams | One index across Claude Code, Cursor, VS Code, Gemini CLI |
| 🧠 | Cross-session memory | Decisions and context survive restarts |
| ⚡ | Faster responses | Less context = faster Claude replies |
| 📊 | Track actual savings | Dollar amounts, not estimates |
One command. 30 seconds.
uvx --from "code-context-engine[local]" cce init # install + index + configure, one shot
Or if you prefer a persistent install:
uv tool install "code-context-engine[local]" # or: pipx install "code-context-engine[local]"
cd /path/to/your/project
cce init
Restart your editor. Done. Every question now hits the index instead of re-reading files.
Already have Ollama? Skip `[local]` and use `uv tool install code-context-engine` instead. CCE auto-detects Ollama at localhost:11434 and uses `nomic-embed-text`.
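To check in advance whether the Ollama path applies to your machine, a minimal sketch like the following can probe the default port. It is not CCE's own detection logic, which this page does not show. It assumes Ollama's standard `/api/tags` endpoint, and the function name `ollama_available` is hypothetical.

```python
import urllib.request


def ollama_available(base_url: str = "http://localhost:11434", timeout: float = 0.5) -> bool:
    """Return True if an Ollama server answers at base_url (hypothetical helper)."""
    try:
        # /api/tags lists locally installed models; a 200 response means the server is up.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False


if __name__ == "__main__":
    if ollama_available():
        print("Ollama detected: the plain install should be enough")
    else:
        print("Ollama not found: install with the [local] extra")
```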
Requires Python 3.11+ and a C compiler (for tree-sitter grammars).
| Platform | Setup |
|---|---|
| macOS | xcode-select --install |
| Ubuntu/Debian | sudo apt install build-essential cmake |
| Fedora/RHEL | sudo dnf install gcc gcc-c++ cmake |
| Windows | Visual Studio Build Tools (C++ workload) + CMake |
Tested on macOS, Linux, and Windows with Python 3.11, 3.12, and 3.13.
cce init auto-detects your editor and writes the right config. To target a
specific agent, use --agent claude, --agent codex, --agent copilot, or
--agent all.
| Editor | Config written | Instructions |
|---|---|---|
| Claude Code | .mcp.json | CLAUDE.md |
| VS Code / Copilot | .vscode/mcp.json | .github/copilot-instructions.md |
| Cursor | .cursor/mcp.json | .cursorrules |
| Gemini CLI | .gemini/settings.json | GEMINI.md |
| OpenAI Codex | ~/.codex/config.toml (user-global, per-project section) | AGENTS.md |
| OpenCode | opencode.json | |
| Tabnine | .tabnine/agent/settings.json | TABNINE.md |
Multiple editors in the same project? All get configured in one command.
Codex note: Codex CLI reads MCP servers from ~/.codex/config.toml only —
it has no per-project config. cce init adds one [mcp_servers.cce-<project>-<hash>]
section per project so multiple projects coexist; cce uninstall removes only
the section for the current project.
my-project · 38 queries · last query 5m ago
⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ 88% tokens saved
Input savings 1.9M tokens $27.78
Output savings 4.8k tokens $0.36
──────────────────────────────────────────
Total saved 1.9M tokens $28.15
Breakdown:
retrieval 84% ▰▰▰▰▰▰▰▰▰▰ 1.8M $26.76 · 12 calls
chunk compression 3% ▰▱▱▱▱▱▱▱▱▱ 68.5k $1.03 · 12 calls
output compression* <1% ▰▱▱▱▱▱▱▱▱▱ 4.8k $0.36 · 12 calls
Cost estimate based on Opus pricing (input $15.0/1M, output $75.0/1M)
Supports Anthropic, OpenAI, and Google model pricing. Configure via pricing.model in ~/.cce/config.yaml.
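The dollar figures in the report follow from token counts and per-million prices. As a rough sketch of that arithmetic, using the Opus prices quoted above and taking the rounded token counts shown in the report at face value (the helper name `dollars_saved` is hypothetical, not part of CCE):

```python
from decimal import Decimal, ROUND_HALF_UP

# Opus prices quoted in the report footer, in USD per 1M tokens.
INPUT_PRICE_PER_M = "15.0"
OUTPUT_PRICE_PER_M = "75.0"


def dollars_saved(tokens: int, price_per_million: str) -> Decimal:
    """Convert a token count into dollars at the given per-million price."""
    cost = Decimal(tokens) * Decimal(price_per_million) / Decimal(1_000_000)
    # Round to cents the way a price display usually would.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # 68.5k input tokens saved by chunk compression, 4.8k output tokens saved.
    print(dollars_saved(68_500, INPUT_PRICE_PER_M))
    print(dollars_saved(4_800, OUTPUT_PRICE_PER_M))
```

Because the report shows rounded token counts (for example 1.9M), recomputing the larger rows from the displayed numbers will not match to the cent.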
Input tokens are 85-95% of your Claude Code bill. CCE cuts them by up to 94% compared with full-file reads (benchmarked on FastAPI).
Without CCE: Claude reads payments.py + shipping.py = 45,000 tokens
With CCE: context_search "payment flow" = 800 tokens
| | Without CCE | With CCE |
|---|---|---|
| Session startup | Re-reads files every time | Queries the index |
| Finding a function | Read entire 800-line file | Get the 40-line function |
| Cross-session memory | None | Decisions + code areas persisted |
| Token cost (Sonnet, medium project) | ~$0.14/session | ~$0.04/session |
We benchmarked CCE against FastAPI (53 source files, 180K tokens) with 20 real coding questions. No cherry-picking, no synthetic queries.
Methodology: For each query, "without CCE" means reading the full content of every file the query touches. "With CCE" means the relevant chunks after compression.
Important baseline note: The 94% number is measured against full-file reads, not against what Claude Code actually does. In practice, Claude Code already uses grep, partial file reads, and targeted tools, so the real-world savings compared to normal Claude Code behavior will be lower than 94%. We use full-file as the baseline because it's reproducible and deterministic (no agent behavior variability). The benchmark measures CCE's retrieval efficiency, not a head-to-head comparison with Claude Code's built-in exploration.
| Metric | Result |
|---|---|
| Retrieval savings | 94% (83,681 → 4,927 tokens/query) |
| Compression (additional, on retrieved chunks) | 89% (4,927 → 523 tokens/query) |
| Recall@10 (found the right files) | 0.90 |
| Latency p50 | 0.4ms |
| Queries tested | 20 |
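The percentages in the table can be recomputed from the per-query token counts. A minimal sketch of that arithmetic follows (the helper name `savings_pct` is hypothetical). Because compression is applied to the already retrieved chunks, the two stages compose, which is why the last line compares the final count against the full-file baseline.

```python
def savings_pct(before: int, after: int) -> float:
    """Percentage of tokens removed when going from `before` to `after`."""
    return 100.0 * (1 - after / before)


if __name__ == "__main__":
    retrieval = savings_pct(83_681, 4_927)    # full files -> retrieved chunks
    compression = savings_pct(4_927, 523)     # retrieved chunks -> compressed chunks
    end_to_end = savings_pct(83_681, 523)     # full files -> compressed chunks
    print(f"retrieval:   {retrieval:.1f}%")
    print(f"compression: {compression:.1f}%")
    print(f"end to end:  {end_to_end:.1f}%")
```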
| Layer | What it does | Savings | Method |
|---|---|---|---|
| Retrieval | Full files → relevant code chunks | 94% | measured |
| Chunk Compression | Raw chunks → signatures + docstrings | 89% | measured |
| Grammar | Drops articles/fillers from memory text | 13% | measured |
Output compression (reducing Claude's reply length) provides additional savings (~65% estimated) but is not included in the headline number above.
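To give a feel for what the Chunk Compression row means by "signatures + docstrings", here is a simplified sketch for Python sources. It uses the standard-library `ast` module rather than the tree-sitter grammars CCE depends on, so it illustrates the idea and is not CCE's implementation. The function name `compress_chunk` and the sample code are hypothetical.

```python
import ast

SAMPLE = '''
def charge(amount: int, currency: str = "usd") -> bool:
    """Charge the customer.

    Long explanation that the compressed form drops.
    """
    total = amount * 100
    return total > 0
'''


def compress_chunk(source: str) -> str:
    """Keep only function/class headers plus the first docstring line."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            header = f"{prefix} {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                header += f" -> {ast.unparse(node.returns)}"
        elif isinstance(node, ast.ClassDef):
            header = f"class {node.name}"
        else:
            continue  # skip everything that is not a definition
        doc = ast.get_docstring(node)
        summary = doc.splitlines()[0] if doc else ""
        lines.append(f"{header}: ...  # {summary}" if summary else f"{header}: ...")
    return "\n".join(lines)


if __name__ == "__main__":
    print(compress_chunk(SAMPLE))
```

The body and the long docstring paragraph are dropped, leaving the signature and a one-line summary, which is usually enough for an agent to decide whether it needs the full chunk.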
| Repo | Language | Files | Retrieval savings | Recall@10 |
|---|---|---|---|---|
| FastAPI | Python | 53 | 94% | 0.90 |
| chi | Go | 94 | 76% | 0.67 |
| fiber | Go (monorepo) | 396 | 93% | 0.07 |
Go's shorter files reduce the retrieval headroom (smaller baseline). Monorepos dilute recall at top-10 (fiber). Middleware queries with one-feature-per-file hit R=1.00 consistently.
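For readers unfamiliar with the metric, Recall@10 is commonly computed as the share of expected files that appear among the top 10 results. The sketch below shows that common definition. The benchmark script's exact definition is not stated on this page, so treat this as an assumption, and the file names are made up.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of relevant files that appear in the top-k retrieved results."""
    if not relevant:
        raise ValueError("relevant must contain at least one file")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


if __name__ == "__main__":
    # Hypothetical query: two files are expected, one of them is retrieved.
    retrieved = ["middleware/cors.py", "routing.py", "utils.py"]
    relevant = {"middleware/cors.py", "security/oauth2.py"}
    print(recall_at_k(retrieved, relevant))
```

Under this definition, a query whose answer lives in a single file scores either 0 or 1, which fits the observation that one-feature-per-file middleware queries reach 1.00.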
Reproduce it yourself:
pip install code-context-engine
python benchmarks/run_benchmark.py --repo https://github.com/fastapi/fastapi.git