by cnighswonger
Fixes prompt cache regression in Claude Code that causes up to 20x cost increase on resumed sessions
The deep catalog scan for this skill is still queued. Run an instant dependency check now instead.
# Add to your Claude Code skills
git clone https://github.com/cnighswonger/claude-code-cache-fixGuides for using cli tools skills like claude-code-cache-fix.
No comments yet. Be the first to share your thoughts!
Top skills in this category by stars
English | 中文 | 한국어 | Português
Cache optimization proxy for Claude Code. Fixes prompt cache bugs that cause excessive quota burn, stabilizes the request prefix, and monitors for silent regressions. Works with all CC versions including the v2.1.113+ Bun binary.
v3.0.3 — Local HTTP proxy with 7 hot-reloadable extensions. A/B tested on v2.1.117: 95.5% cache hit rate through proxy vs 82.3% direct on first warm turn. Full release notes →
Opus 4.7 advisory: Metered data shows 4.7 burns Q5h quota at ~2.4x the rate of 4.6 for equivalent visible token counts (independently confirmed by @ArkNill). Two factors: a new tokenizer (up to 35% more tokens, documented) and adaptive thinking overhead (~105%, not documented in usage response). The Q5h impact compounds into Q7d — the weekly quota ceiling that most heavy users will hit first. Workaround:
CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1reduces burn by ~3.3x but may reduce quality on complex tasks. See Discussion #25 (initial observation) and Discussion #42 (controlled A/B data + Q7d analysis).
The proxy works with any CC version — Node.js or Bun binary. It sits between Claude Code and the Anthropic API, applying cache fixes as hot-reloadable extensions.
# Install
npm install -g claude-code-cache-fix
# Start the proxy (runs on localhost:9801)
node "$(npm root -g)/claude-code-cache-fix/proxy/server.mjs" &
# Launch Claude Code through it
ANTHROPIC_BASE_URL=http://127.0.0.1:9801 claude
That's it. The proxy applies all 7 cache-fix extensions automatically. No wrapper scripts, no NODE_OPTIONS, no preload.
On every /v1/messages request, 9 extensions run in order (one opt-in):
| Extension | What it fixes |
|-----------|--------------|
| fingerprint-strip | Removes unstable cc_version fingerprint from system prompt |
| sort-stabilization | Deterministic ordering of tool and MCP definitions |
| ttl-management | Detects server TTL tier, injects correct cache_control markers |
| identity-normalization | Normalizes message identity fields for prefix stability |
| fresh-session-sort | Fixes non-deterministic ordering on first turn |
| cache-control-normalize | Normalizes cache_control markers across messages |
| cache-telemetry | Extracts cache stats from response headers → ~/.claude/quota-status/{account.json,sessions/<id>.json} |
| session-health | Observes per-session thinking-desync risk (context size + thinking-block count) and warns before a session reaches the danger zone. Read-only |
| thinking-block-sanitize | Drops omitted (empty-text) thinking blocks to head off the CC thinking-desync 400 (#63147). Opt-in (CACHE_FIX_THINKING_SANITIZE=on) |
Extensions are hot-reloadable — add, remove, or modify .mjs files in proxy/extensions/ and changes apply to the next request without restarting. Configuration in proxy/extensions.json.
Developing a new extension? See docs/parallel-proxy-test-harness.md for the pattern we use to test extensions end-to-end against real claude -p traffic without disturbing the production proxy.
Recommended (Linux/macOS) — install-service subcommand:
cache-fix-proxy install-service
Detects your platform and writes the appropriate config:
~/.config/systemd/user/cache-fix-proxy.service (systemd user unit)~/Library/LaunchAgents/com.cnighswonger.cache-fix-proxy.plist (launchd agent)The output prints the next-step commands to enable and start the service. On Linux:
systemctl --user daemon-reload
systemctl --user enable --now cache-fix-proxy
systemctl --user enable --now cache-fix-proxy-healthcheck.timer # auto-recovery — see below
sudo loginctl enable-linger $USER # optional: start on boot, not just on login
Auto-recovery (Linux): install-service also drops a healthcheck companion (cache-fix-proxy-healthcheck.service + .timer). The timer fires every 2 minutes; the oneshot service runs curl -fs http://127.0.0.1:<port>/health and systemctl --user start cache-fix-proxy.service if the probe fails. This recovers the proxy from any stop — clean or unclean, expected or unexpected — within 2 minutes. Background: Restart=on-failure doesn't fire on clean stops, so before this companion existed, a systemctl stop from any source (including unidentified ones during an Anthropic outage on 2026-04-25) would leave the proxy down indefinitely. macOS doesn't need the companion — launchd's KeepAlive already auto-restarts on any exit.
On macOS:
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.cnighswonger.cache-fix-proxy.plist
launchctl enable gui/$(id -u)/com.cnighswonger.cache-fix-proxy
launchctl kickstart gui/$(id -u)/com.cnighswonger.cache-fix-proxy
The installed config picks up CACHE_FIX_PROXY_PORT, CACHE_FIX_PROXY_UPSTREAM, and CACHE_FIX_DEBUG from the env at install time. Re-run install-service --force to regenerate after env changes, or edit the service file directly. Pair with cache-fix-proxy uninstall-service to remove cleanly (stops, disables, deletes).
The service runs cache-fix-proxy server in the foreground, which is just the proxy without the wrapper-mode claude launcher.
Manual (any platform):
nohup cache-fix-proxy server > /tmp/cache-fix-proxy.log 2>&1 &
echo 'export ANTHROPIC_BASE_URL=http://127.0.0.1:9801' >> ~/.bashrc
A multi-arch (amd64, arm64) container image is published to GitHub Container Registry on every release tag.
docker run -d --name cache-fix-proxy \
--restart=always \
-p 9801:9801 \
ghcr.io/cnighswonger/claude-code-cache-fix:latest
# Then in your shell:
export ANTHROPIC_BASE_URL=http://127.0.0.1:9801
Use --restart=always instead of the systemd healthcheck companion — Docker handles auto-recovery natively. Mount nothing; the container is stateless. Override the default port with -e CACHE_FIX_PROXY_PORT=.... Override the upstream (e.g. to chain through llm-relay) with -e CACHE_FIX_PROXY_UPSTREAM=http://host.docker.internal:8080. The image runs as the unprivileged node user (uid 1000) and exposes a HEALTHCHECK Docker can use for liveness.
For corporate environments behind an SSL-inspecting proxy, mount your CA bundle and set the env vars:
docker run -d --name cache-fix-proxy --restart=always -p 9801:9801 \
-e HTTPS_PROXY=http://proxy.corp.example:8080 \
-e CACHE_FIX_PROXY_CA_FILE=/etc/ssl/corp-ca.pem \
-v /path/to/zscaler-root.pem:/etc/ssl/corp-ca.pem:ro \
ghcr.io/cnighswonger/claude-code-cache-fix:latest
Image tags: latest, 3, 3.2, 3.2.1 (semver-ladder, so 3 always points to the newest 3.x). latest always tracks the newest tagged release.
Linux note: the chained-upstream host.docker.internal example below is automatic on Docker Desktop (macOS / Windows). On plain Linux Docker Engine you usually need --add-host=host.docker.internal:host-gateway so the name resolves to the host bridge. Without it, the container's name lookup fails and the proxy can't reach the upstream service running on the host. Example chaining cache-fix proxy through llm-relay running on the host:
docker run -d --name cache-fix-proxy --restart=always -p 9801:9801 \
--add-host=host.docker.internal:host-gateway \
-e CACHE_FIX_PROXY_UPSTREAM=http://host.docker.internal:8080 \
ghcr.io/cnighswonger/claude-code-cache-fix:latest
curl http://127.0.0.1:9801/health
# {"status":"ok"}
All proxy settings are controlled via environment variables. Set them before starting the proxy server.
| Variable | Default | Description |
|----------|---------|-------------|
| CACHE_FIX_PROXY_PORT | 9801 | Listen port |
| CACHE_FIX_PROXY_BIND | 127.0.0.1 | Bind address |
| CACHE_FIX_PROXY_UPSTREAM | https://api.anthropic.com | Upstream URL. Change to chain another proxy (e.g. http://localhost:8080) |
| CACHE_FIX_PROXY_TIMEOUT | 600000 | Request timeout in milliseconds |
| CACHE_FIX_EXTENSIONS_DIR | proxy/extensions/ | Directory for extension .mjs files |
| CACHE_FIX_EXTENSIONS_CONFIG | proxy/extensions.json | Extension configuration file |
| CACHE_FIX_DEBUG | 0 | Enable debug logging |
The proxy honors the following environment variables when forwarding to api.anthropic.com. Behind Zscaler / Netskope / Forcepoint / Bluecoat / corporate squid, set these in the proxy's environment.
| Variable | Effect |
|----------|--------|
| HTTPS_PROXY / HTTP_PROXY (and lowercase variants) | Routes upstream requests through the corporate HTTP CONNECT proxy. |
| NO_PROXY | Comma-separated host list to bypass the proxy. Supports * and .suffix.example.com. |
| CACHE_FIX_PROXY_CA_FILE | Path to a PEM file with one or more extra