Hook-based token compressor for 5 AI CLI hosts (Claude Code, Copilot CLI, OpenCode, Gemini CLI, Codex CLI). Up to 95% bash compression, signature-mode for code reads, cross-call dedup, MCP server, self-teaching protocol. Zero runtime deps.
# Add to your Claude Code skills
git clone https://github.com/claudioemmanuel/squeezGuides for using mcp servers skills like squeez.
Last scanned: 5/30/2026
{
"issues": [],
"status": "PASSED",
"scannedAt": "2026-05-30T16:21:28.897Z",
"npmAuditRan": true,
"pipAuditRan": true
}No comments yet. Be the first to share your thoughts!
Top skills in this category by stars
30 days in the Featured rail · terms & refunds
End-to-end token optimizer for Claude Code, GitHub Copilot CLI, OpenCode, Gemini CLI, and OpenAI Codex CLI. Compresses bash output up to 95%, collapses redundant calls, and injects a terse prompt persona — automatically, with zero new runtime dependencies.
Three methods — all produce the same result (binary at ~/.claude/squeez/bin/squeez, hooks registered).
curl -fsSL https://raw.githubusercontent.com/claudioemmanuel/squeez/main/install.sh | sh
Windows: requires Git Bash. Run the command above inside Git Bash — PowerShell/CMD are not supported.
# Install globally
npm install -g squeez
# Or run once without installing
npx squeez
Downloads the correct pre-built binary for your platform (macOS universal, Linux x86_64/aarch64, Windows x86_64). Requires Node ≥ 16.
cargo install squeez
Builds from crates.io. Requires Rust stable. On Windows you also need MSVC C++ Build Tools.
squeez setup auto-detects every CLI present on disk and registers the hooks. squeez uninstall removes them. Session data and config.ini are preserved so reinstall is lossless.
| Host | Memory file | Bash wrap | Session memory | Budget inject (Read/Grep) | Notes |
|---|---|---|---|---|---|
| Claude Code | ~/.claude/CLAUDE.md |
✅ native | ✅ native | ✅ native | Restart Claude Code to pick up hooks |
| Copilot CLI | ~/.copilot/copilot-instructions.md |
✅ native | ✅ native | ✅ native | Restart Copilot CLI after setup |
| OpenCode | ~/.config/opencode/AGENTS.md |
✅ native | ✅ native | ✅ native | Plugin at ~/.config/opencode/plugins/squeez.js; MCP tool calls skip hooks (upstream sst/opencode#2319) |
| Gemini CLI | ~/.gemini/GEMINI.md |
✅ native | ✅ native | 🟡 soft via GEMINI.md |
BeforeTool rewrite schema pending upstream docs (google-gemini/gemini-cli#25629) |
| Codex CLI | ~/.codex/AGENTS.md |
✅ native | ✅ native | 🟡 soft via AGENTS.md |
apply_patch hooks landed in 0.123.0 (#18391); updatedInput + read_file/grep hook surface still pending (openai/codex#18491) |
| Pi | ~/.pi/agent/skills/squeez/SKILL.md |
✅ native | ✅ via skill | ✅ native | TypeScript extension at ~/.pi/agent/extensions/squeez/index.ts; restart Pi after setup |
squeez setup # register into every detected host
squeez setup --host=<slug> # register into one host
squeez uninstall # remove squeez entries from every detected host
squeez uninstall --host=<slug>
Slugs: claude-code / copilot / opencode / gemini / codex / pi.
After install, restart the CLI you use to pick up the new hooks.
squeez uninstall # preserves session data + config.ini
bash ~/.claude/squeez/uninstall.sh # (legacy) full wipe, if the script exists
squeez update # download latest binary + verify SHA256
squeez update --check # check for update without installing
squeez update --insecure # skip checksum (not recommended)
| Feature | Description |
|---|---|
| Bash compression | Intercepts every command via PreToolUse hook, applies smart filter → dedup → grouping → truncation. Up to 95% reduction. |
| Context engine | Cross-call redundancy with two paths: exact-hash match (FNV-1a, fast) and fuzzy trigram-shingle Jaccard ≥0.85 (whitespace, timestamps, single-line edits no longer defeat dedup). |
| Summarize fallback | Outputs exceeding 500 lines are replaced with a ≤40-line dense summary (top errors, files, test result, tail). Benign outputs get 2× the threshold so successful builds stay verbatim. |
| Adaptive intensity | Truly adaptive: Full (×0.6 limits) below 80% of token budget, Ultra (×0.3) above. Used to be always-Ultra; now actually responds to session pressure. |
| MCP server | squeez mcp runs a JSON-RPC 2.0 server over stdio exposing 13 read-only tools so any MCP-compatible LLM can query session memory directly. Hand-rolled, no mcp.server dependency. |
| Auto-teach payload | squeez protocol (or the squeez_protocol MCP tool) prints a 2.4 KB self-describing payload — the LLM learns squeez's markers and protocol on first call. |
| Caveman persona | Injects an ultra-terse prompt at session start so the model responds with fewer tokens. |
| Memory-file compression | squeez compress-md compresses CLAUDE.md / AGENTS.md / copilot-instructions.md in-place — pure Rust, zero LLM. i18n-aware: set lang = pt (or --lang pt) for pt-BR article/filler/phrase dropping and Unicode-correct matching. |
| Session memory | On SessionStart, injects a structured summary of the previous session: files investigated, learned facts (errors + git events), completed work (builds, test passes), and next steps (unresolved errors, failing tests). Summaries carry temporal validity (valid_from/valid_to). |
| Token tracking | Every PostToolUse result (Bash, Read, Grep, Glob, Monitor, SubagentStop) feeds a SessionContext so squeez knows what the agent has already seen. Read/Grep/Glob/Monitor outputs are also rewritten via updatedToolOutput (Claude Code v2.1.119+) when content is redundant or oversized. |
| Token economy | Sub-agent cost tracking (~200K tokens/spawn), burn rate prediction ([budget: ~N calls left]), session efficiency scoring, tool result size budgets. |
| Auto-calibration | squeez calibrate runs benchmarks on install and generates an optimized config.ini (aggressive / balanced / conservative profiles). |
There are now several token-reduction tools targeting AI coding CLIs. They make different bets — the right one depends on what you care about: zero deps, lossless filtering, structural reformatting, or task-conditioned ML.
| Tool | Approach | Hosts | Deps | Key wins | Trade-off |
|---|---|---|---|---|---|
| squeez (this project) | Hook + 4-stage filter pipeline + context engine (MinHash dedup, summarize, adaptive intensity) + MCP server | Claude Code, Copilot CLI, OpenCode, Gemini CLI, Codex CLI | Zero runtime deps (libc only on Unix) |
Up to 95% on bash; cross-call dedup; signature-mode for source files; TOON re-encoder for JSON outputs; 14 MCP tools; enterprise (Bedrock/Vertex) USD-saved estimate | Heuristic, not ML — no per-task understanding |
| rtk-ai/rtk | Hook proxy that rewrites bash commands (git status → rtk git status), then compresses 100+ command outputs |
Claude Code, Cursor | Zero deps (Rust) | 60-90% on 100+ commands; rtk read -l aggressive for signature mode |
rtk#582: aggressive rewriting can increase total cost by 18% because Claude emits +50% more output tokens to compensate for stripped context. squeez ships a guard against this regime. |
| KRLabsOrg/squeez | Task-conditioned ML (Qwen 2B / ModernBERT 150M) — pipe tool output + task description, get back only relevant lines | Any (CLI tool) | Python, PyTorch / vLLM server | 92% compression, F1 0.80; task-aware (same log slices kept differently per query) | Requires running an LLM locally; not zero-dep. Same project name, different design. |
| ojuschugh1/sqz | CLI context compressor | Any | Python | Single-command compression | Lower coverage than the others. |
| LLMLingua-2 (Microsoft) | Neural prompt compressor that removes 50-80% of a prompt while preserving meaning | API / library | Python, transformers | Strong on long static prompts | Latency + model dep; not a CLI hook. |
| TOON | Schema-aware JSON replacement (users[100]{id,name,role}:) — ~40% fewer tokens on arrays of uniform objects |
Library, not a CLI | TypeScript SDK | Lossless on the right shape; squeez embeds a TOON encoder for gh/kubectl/aws/gcloud/az JSON outputs |
Only helps on uniform JSON shapes. |
If you want a CLI hook that just works, never needs a Python runtime, and never silently inflates your output tokens, squeez is the safe default. If you can run an LLM next to your shell and want task-aware filtering, KRLabsOrg/squeez is worth a look as a complement. The two squeez projects share a name but are independent.
squeez optimizes what it can reach — the surfaces exposed by each host's hook API. It cannot fix token leaks outside those surfaces.
| Surface | How | When | Supported hosts |
|---|---|---|---|
| Bash stdout/stderr | PreToolUse wraps command w/ 4-sta |