by chopratejas
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
```bash
# Add to your Claude Code skills
git clone https://github.com/chopratejas/headroom
```
The context compression layer for AI agents
Headroom compresses everything your AI agent reads (tool outputs, logs, RAG chunks, files, and conversation history) before it reaches the LLM. Same answers, fraction of the tokens.
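To make the idea concrete, here is a toy sketch of what structure-aware compression of a large JSON tool output can look like: keep a sample of repetitive array items and truncate long strings. This is illustrative only, not Headroom's actual SmartCrusher algorithm.

```python
import json

def crush_json(raw: str, keep: int = 3, max_str: int = 80) -> str:
    """Toy structure-aware compressor: keep the first/last few items of
    long arrays and truncate long strings. Not Headroom's real logic."""
    def crush(node):
        if isinstance(node, list) and len(node) > 2 * keep:
            omitted = len(node) - 2 * keep
            return ([crush(x) for x in node[:keep]]
                    + [f"... {omitted} items omitted ..."]
                    + [crush(x) for x in node[-keep:]])
        if isinstance(node, dict):
            return {k: crush(v) for k, v in node.items()}
        if isinstance(node, str) and len(node) > max_str:
            return node[:max_str] + "..."
        return node
    return json.dumps(crush(json.loads(raw)))

# A 500-result tool output shrinks to a representative sample:
big = json.dumps({"results": [{"file": f"src/mod_{i}.py", "line": i}
                              for i in range(500)]})
small = crush_json(big)
print(len(small), "<", len(big))
```

The point is that agents rarely need every row of a search result or log dump verbatim; a structure-preserving sample usually answers the same question.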
- `compress(messages)` in Python or TypeScript: inline in any app
- `headroom proxy --port 8787`: zero code changes, any language
- `headroom wrap claude|codex|cursor|aider|copilot`: one command
- `headroom_compress`, `headroom_retrieve`, `headroom_stats`: for any MCP client
- `headroom learn`: mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md

```
Your agent / app
(Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
        │ prompts · tool outputs · logs · RAG results · files
        ▼
┌──────────────────────────────────────────────────────┐
│ Headroom (runs locally: your data stays here)        │
│                                                      │
│   CacheAligner → ContentRouter → CCR                 │
│     ├─ SmartCrusher   (JSON)                         │
│     ├─ CodeCompressor (AST)                          │
│     └─ Kompress-base  (text, HF)                     │
│                                                      │
│   Cross-agent memory · headroom learn · MCP          │
└──────────────────────────────────────────────────────┘
        │ compressed prompt + retrieval tool
        ▼
LLM provider (Anthropic · OpenAI · Bedrock · …)
```
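The ContentRouter stage can be pictured as a dispatch on content shape: structured JSON goes to the JSON crusher, source code to the AST compressor, everything else to the text model. This is a hypothetical sketch; the real router's heuristics are Headroom internals and certainly richer.

```python
import json

def route(payload: str) -> str:
    """Hypothetical sketch of ContentRouter-style dispatch by content shape."""
    try:
        json.loads(payload)
        return "SmartCrusher"        # structured tool output -> JSON crusher
    except (ValueError, TypeError):
        pass
    if any(tok in payload for tok in ("def ", "class ", "import ", "function ")):
        return "CodeCompressor"      # source code -> AST-based compressor
    return "Kompress-base"           # free text -> transformer compressor

print(route('{"results": []}'))               # SmartCrusher
print(route("def main():\n    pass"))         # CodeCompressor
print(route("The incident began at 03:14."))  # Kompress-base
```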
Compressed details stay retrievable: the agent can call `headroom_retrieve` if it needs them.

```bash
# 1. Install
pip install "headroom-ai[all]"   # Python
npm install headroom-ai          # Node / TypeScript

# 2. Pick your mode
headroom wrap claude             # wrap a coding agent
headroom proxy --port 8787       # drop-in proxy, zero code changes
# or: from headroom import compress   # inline library

# 3. See the savings
headroom stats
```
Granular extras: [proxy], [mcp], [ml], [agno], [langchain], [evals]. Requires Python 3.10+.
Savings on real agent workloads:
| Workload                  | Before (tokens) | After (tokens) | Savings |
|---------------------------|----------------:|---------------:|--------:|
| Code search (100 results) |          17,765 |          1,408 |     92% |
| SRE incident debugging    |          65,694 |          5,118 |     92% |
| GitHub issue triage       |          54,174 |         14,761 |     73% |
| Codebase exploration      |          78,502 |         41,254 |     47% |
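The savings column is simply `1 - after/before`, rounded to the nearest percent, which you can check directly from the table:

```python
# (before, after) token counts from the workload table above
rows = {
    "Code search (100 results)": (17_765, 1_408),
    "SRE incident debugging":    (65_694, 5_118),
    "GitHub issue triage":       (54_174, 14_761),
    "Codebase exploration":      (78_502, 41_254),
}
for name, (before, after) in rows.items():
    savings = round((1 - after / before) * 100)
    print(f"{name}: {savings}%")   # 92, 92, 73, 47
```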
Accuracy preserved on standard benchmarks:
| Benchmark  | Category |   N | Baseline | Headroom | Delta           |
|------------|----------|----:|---------:|---------:|-----------------|
| GSM8K      | Math     | 100 |    0.870 |    0.870 | ±0.000          |
| TruthfulQA | Factual  | 100 |    0.530 |    0.560 | +0.030          |
| SQuAD v2   | QA       | 100 |      n/a |      97% | 19% compression |
| BFCL       | Tools    | 100 |      n/a |      97% | 32% compression |
Reproduce: `python -m headroom.evals suite --tier 1` · Full benchmarks & methodology
| Agent       | `headroom wrap` | Notes                            |
|-------------|:---------------:|----------------------------------|
| Claude Code | ✓               | `--memory` · `--code-graph`      |
| Codex       | ✓               | shares memory with Claude        |
| Cursor      | ✓               | prints config; paste once        |
| Aider       | ✓               | starts proxy + launches          |
| Copilot CLI | ✓               | starts proxy + launches          |
| OpenClaw    | ✓               | installs as ContextEngine plugin |
Any OpenAI-compatible client works via `headroom proxy`. MCP-native: `headroom mcp install`.
Great fit if you…
Skip it if you…
| Your setup             | Hook in with                                                     |
|------------------------|------------------------------------------------------------------|
| Any Python app         | `compress(messages, model=…)`                                    |
| Any TypeScript app     | `await compress(messages, { model })`                            |
| Anthropic / OpenAI SDK | `withHeadroom(new Anthropic())` · `withHeadroom(new OpenAI())`   |
| Vercel AI SDK          | `wrapLanguageModel({ model, middleware: headroomMiddleware() })` |
| LiteLLM                | `litellm.callbacks = [HeadroomCallback()]`                       |
| LangChain              | `HeadroomChatModel(your_llm)`                                    |
| Agno                   | `HeadroomAgnoModel(your_model)`                                  |
| Strands                | Strands guide                                                    |
| ASGI apps              | `app.add_middleware(CompressionMiddleware)`                      |
| Multi-agent            | `SharedContext().put / .get`                                     |
| MCP clients            | `headroom mcp install`                                           |
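The multi-agent row suggests a small key-value surface for cross-agent state. A minimal dict-backed sketch of that idea follows; the real `SharedContext` is Headroom's own (it shares memory across agents, which this in-process toy does not), so treat the details as assumptions.

```python
import threading

class SharedContext:
    """Toy stand-in for a shared put/get context: a thread-safe key-value
    store that agents in one process can read and write. Hypothetical;
    Headroom's real SharedContext API and storage may differ."""
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._store: dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        with self._lock:
            self._store[key] = value

    def get(self, key: str, default=None):
        with self._lock:
            return self._store.get(key, default)

# One agent records a finding; another picks it up later.
ctx = SharedContext()
ctx.put("repo:entrypoint", "src/headroom/cli.py")
print(ctx.get("repo:entrypoint"))
```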
`headroom learn`: plugin-based failure mining for Claude, Codex, Gemini. Headroom exposes one stable request lifecycle across `compress()`, the SDK, and t