cost-xray

Name: cost-xray
Author: tigerless-labs

Verified

See what Claude Code and Codex actually send to the API — and what each part costs.

319 stars

35 forks

Python

Installation

# Add to your Claude Code skills
git clone https://github.com/tigerless-labs/cost-xray


Security Report: Verified

Last scanned: 6/17/2026

{
  "issues": [
    {
      "file": "README.md",
      "line": 34,
      "type": "remote-install",
      "message": "Install command (remote install script piped to a shell — review the source before running): \"curl -fsSL https://raw.githubusercontent.com/tigerless-labs/cost-xray/master/ins\"",
      "severity": "low"
    }
  ],
  "status": "PASSED",
  "scannedAt": "2026-06-17T09:02:38.913Z",
  "npmAuditRan": true,
  "pipAuditRan": false,
  "promptInjectionRan": true
}

README.md

Frequently Asked Questions

What is cost-xray?

cost-xray is an open-source CLI tools skill for the Claude Code and Codex coding agents, built by tigerless-labs. It shows what Claude Code and Codex actually send to the API, and what each part costs. It has 319 GitHub stars.

Is cost-xray safe to use?

Yes. cost-xray passed SkillsLLM's automated security scan — a dependency vulnerability audit plus prompt-injection heuristics — with no high-severity issues. You can read the full report in the Security Report section on this page.

How do I install cost-xray?

Clone the repository with "git clone https://github.com/tigerless-labs/cost-xray" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is cost-xray written in?

cost-xray is primarily written in Python. It is open-source under tigerless-labs on GitHub, so you can review or fork the full source.

Are there alternatives to cost-xray?

Yes. SkillsLLM lists many other CLI Tools skills you can browse and compare side by side. Open the CLI Tools category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh cost-xray against similar tools.



Most usage tools read local logs. That shows the total cost of a call or session, but it misses the request-time context assembled before the model is invoked: system prompts, tool schemas, MCP blocks, tool results, cache reads/writes, and previous thinking blocks.

cost-xray captures the actual local API traffic for Claude Code and Codex, then attributes tokens and dollars back to the sources inside the request. It shows not just how much a turn cost, but why it cost that much.

Requirements

A supported coding agent — Claude Code or Codex
macOS or Linux
No API keys, no account, no config changes to your agent — capture is a transparent local hop

Install

curl -fsSL https://raw.githubusercontent.com/tigerless-labs/cost-xray/master/install.sh | bash

The installer asks which agent(s) to capture — Claude Code, Codex, or both — and prompts even under curl … | bash.

Skip the prompt (e.g. CI): set COST_XRAY_AGENTS=claude|codex|all.
Already cloned the repo? Run ./install.sh.

Then open a new terminal and run claude / codex exactly as before — capture is automatic, no flags and no base-URL change. It's forward-only: runs started in that new shell are captured, not past history. Open the live TUI from anywhere:

cx

Capture runs as a background service (auto-start on boot, self-healing, port-adaptive) and doesn't change what your agent does, its results, or its cost — pause anytime with cx stop. See docs/install.md for systemd details, GUI agents (Cursor base-URL setup), manual (no-systemd) mode, and troubleshooting.

Usage

cx                  # open the live cost-xray TUI (from any directory)
cx status           # services' state, live ports, sessions captured
cx stop             # stop monitoring — proxies down; agents run direct (uncaptured)
cx start            # resume monitoring
cx restart          # restart the proxies (after a config change)
cx install          # (re)install — authoritative: installs the chosen agents, removes the rest
cx uninstall        # remove services + shell wrappers (keeps captured data)

cx is the whole CLI and works from any directory. Change port/upstream by editing ~/.cost-xray/env, then cx restart. (./run.sh <cmd> from the repo still works too — cx just calls it for you from anywhere.)

In the TUI, drill from the top down: agent → project → session → category → MCP server → tool → per-turn call → the real output. Every cell carries its cache split (read / write / fresh / output $).

Supported Agents

Agent	Status	Capture
Claude Code	Supported	reverse proxy (base-URL override) — no certificate
Codex	Supported	forward proxy + scoped local CA (self-healing wrapper)

The wire is decoded by a thin per-agent adapter — the only place code forks by agent (docs/architecture.md; per-agent capture + tokenizer notes under docs/providers/). Adding an agent is one small module; anything speaking the Anthropic or OpenAI-Responses wire shape is close to drop-in.

Features

Cost attribution, below the tool

cost-xray traces every token's cost, split into fresh / cache-read / cache-write / output, to the source that caused it, and down to the individual call: the cost of this Read invocation and its output, not a session sum. The other tools we have looked at aggregate to a request- or session-level total, at most grouped by tool type ("Read cost $X this session"). As far as we've found, nothing else prices below the tool, call by call.
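As a rough illustration of the four-way split, the sketch below prices a single call from its token buckets. It is not cost-xray's own code: the CallTokens type, the call_cost function, and the per-million-token prices are all hypothetical placeholders, and real prices depend on the model.

```python
from dataclasses import dataclass

# Hypothetical USD prices per million tokens; substitute your model's real rates.
PRICES = {"fresh": 3.00, "cache_read": 0.30, "cache_write": 3.75, "output": 15.00}


@dataclass
class CallTokens:
    """Token counts for one tool call, split by billing bucket (illustrative type)."""
    source: str
    fresh: int = 0
    cache_read: int = 0
    cache_write: int = 0
    output: int = 0


def call_cost(call: CallTokens, prices: dict = PRICES) -> dict:
    """Return dollars per bucket for one call, plus a 'total' entry."""
    split = {bucket: getattr(call, bucket) * prices[bucket] / 1_000_000
             for bucket in prices}
    split["total"] = sum(split.values())
    return split


# One Read invocation: mostly cached prefix, a little fresh input, a short output.
read_call = CallTokens(source="Read", fresh=2_000, cache_read=40_000, output=500)
for bucket, dollars in call_cost(read_call).items():
    print(f"{read_call.source:5s} {bucket:12s} ${dollars:.4f}")
```

Keeping the buckets separate per call is what lets a view sum them by tool, MCP server, or session afterwards without losing the cache split.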

Window occupancy

What is taking space in the context window right now — system prompt, every tool schema, MCP servers, messages, and generated output — decomposed into source-level rows. Prompt caching makes a stable 40k-token schema block cheap on cache read, but it still crowds out the code and conversation that matter; cost-xray shows you the occupancy, not just the bill.

Unused MCP waste

Configured servers and tools that are injected into every request's prefix but never actually called. Their tool-schema overhead is paid on every turn, and cost-xray flags the dead weight.
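The idea can be sketched as a set difference between the tools declared in the request prefix and the tools that were actually called. This is a minimal sketch under assumptions: the unused_tools function, the tool names, and the token counts are made up for illustration and do not come from cost-xray.

```python
def unused_tools(schema_tokens: dict, called: set, turns: int) -> dict:
    """Map each never-called tool to the schema tokens it carried across all turns.

    schema_tokens: tool name -> tokens its schema occupies in one request prefix
    called:        names of tools invoked at least once in the session
    turns:         number of requests that carried the full prefix
    """
    return {name: tokens * turns
            for name, tokens in schema_tokens.items()
            if name not in called}


# Hypothetical session: three declared tools, only Read was ever used.
schemas = {"mcp__github__create_issue": 900, "mcp__github__list_prs": 700, "Read": 400}
waste = unused_tools(schemas, called={"Read"}, turns=10)
for name, tokens in sorted(waste.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {tokens} schema tokens carried, never called")
```

Most of those tokens would be billed as cache reads on a stable prefix, so the dollar figure may be small, but they still occupy the window on every turn.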

Tokenization accuracy

Claude's tokenizer is private — no official or open-source tokenizer exists — and calling Anthropic's count_tokens API for everything would add load and hit its limits. So for Claude, cost-xray uses an estimator plus proportional calibration: tiktoken sizes each part, the parts it mis-sizes most (thinking, tool schemas) get targeted corrections — pinned with count_tokens when you're logged in, a fixed ratio otherwise — and the rest is scaled so the calibrated total matches the provider's own usage. The total, and therefore the bill, is exact; only the split between sources in the same request is approximate. We benchmark those residuals continuously (CONTRIBUTING.md) and they're small enough for attribution work.
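The proportional calibration step can be sketched as follows. This is a simplified reading of the description above, not the project's implementation: the calibrate function and the numbers are hypothetical, and the pinned values stand in for whatever targeted correction (a count_tokens result or a fixed ratio) has already been applied to a part.

```python
def calibrate(estimates: dict, pinned: dict, provider_total: int) -> dict:
    """Scale unpinned estimates so that all parts sum to the provider's total.

    estimates:      part name -> estimator's token count
    pinned:         part name -> corrected count that must be kept as-is
    provider_total: input tokens reported in the provider's usage block
    """
    free = {k: v for k, v in estimates.items() if k not in pinned}
    remaining = provider_total - sum(pinned.values())
    free_sum = sum(free.values())
    if remaining < 0 or free_sum <= 0:
        raise ValueError("pinned parts exceed the total, or nothing left to scale")
    scale = remaining / free_sum
    result = dict(pinned)
    result.update({k: v * scale for k, v in free.items()})
    return result


# The estimator under-sizes tool schemas; a pinned count of 10,000 replaces its 8,000.
parts = calibrate(
    estimates={"system": 3_000, "tool_schemas": 8_000, "messages": 9_000},
    pinned={"tool_schemas": 10_000},
    provider_total=23_200,
)
print(parts)
print(sum(parts.values()))
```

By construction the parts add up to the provider's total, which is why the bill stays exact while the split between unpinned sources remains an estimate.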

Coming soon: an opt-in exact mode — every part sized by full count_tokens differencing — manually enabled, for users who need maximum per-source precision.

Self-healing capture

The wrappers are per-command and self-healing: if the proxy is down, the wrapper restarts it and routes through; if it can't, the agent runs direct — never broken. Stop monitoring anytime with cx stop (agents then run direct).

Capture everything, store almost nothing

cost-xray keeps the complete raw API traffic — but a long session re-sends its whole history every turn, so the capture is hugely repetitive. We deduplicate it: each unique block (message, schema, tool result) is stored once, with a tiny per-turn delta. Full per-turn bytes rebuild on demand. Disk stays small even across million-token sessions.
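One common way to get this kind of deduplication is content-addressed storage, sketched below. It is an assumption about the general technique rather than a description of cost-xray's on-disk format: the BlockStore class is hypothetical, it lives in memory, and it records a full list of block hashes per turn where a real store could record only the delta against the previous turn.

```python
import hashlib


class BlockStore:
    """Store each unique block once; each turn keeps only an ordered list of hashes."""

    def __init__(self) -> None:
        self.blocks = {}   # sha256 hex digest -> block bytes
        self.turns = []    # one list of digests per captured turn

    def add_turn(self, blocks: list) -> int:
        """Record a turn and return how many of its blocks were new."""
        digests, new = [], 0
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new += 1
            digests.append(digest)
        self.turns.append(digests)
        return new

    def rebuild(self, turn: int) -> list:
        """Reassemble the full block sequence of a turn on demand."""
        return [self.blocks[d] for d in self.turns[turn]]


store = BlockStore()
store.add_turn([b"system prompt", b"user: hi"])
added = store.add_turn([b"system prompt", b"user: hi", b"assistant: hello"])
print(added, len(store.blocks))
```

Because each turn re-sends the earlier history, the second turn above adds one block while still rebuilding to all three.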

Why not just read the logs?

Log-based tools (ccusage, codeburn, and similar) are excellent at local-first session analytics: they read local transcripts, classify turns by tool usage, and price the session by model, day, or task. That answers how much did I spend. cost-xray answers the lower-level question: what bytes did the model actually receive, and which source owns those tokens?

The difference is the data source. Log readers see the transcript after the agent has run — but the system prompt, injected tool schemas, MCP schemas, reminders, and provider-added blocks are assembled at request time and never written to the transcript. In real coding-agent requests, that invisible prefix can be roughly half the context or more. cost-xray reads the raw API request, so it can compute source-level tokens and attribute cost to schemas, MCP servers, tools, and message buckets.

Question	Log / usage tools	cost-xray wire capture
How much did the session cost?	Yes	Yes
Which task/tool was active?	Yes	Yes
System prompt and injected schemas visible?	No	Yes
How many tokens does each tool schema occupy?	No, schemas aren't in logs	Yes
Which MCP server is dead weight in the prefix?	Estimate	Exact
Cache read/write/fresh dollars per source/tool?	No, usage is request-level	Yes, by span and cache boundary
Live view of the current request window?	No	Yes

Reading the dashboard

cost-xray surfaces the data; you read the story. A few patterns worth knowing:

Signal you see	What it might mean
A 40k-token MCP schema block on every turn	A configured server crowding the prefix — drill it to see if any tool was called
Cache-read dollars dwarf fresh input	Stable prefix is working; the spend is in what's new each turn
Repeated cache-write on the same source	The prefix is being rewritten — something upstream of it changed
Thinking tokens dominate output cost	Long reasoning turns; check whether they earned their keep
A tool's schema costs more than the tool is ever used	Candidate to drop from the agent's tool-set
Big `tool_result` rows on `Read`/`Bash`	Uncapped output bloating the window — cap it at the source

These are starting points, not verdicts. One experimental session looking odd is fine; the same pattern across weeks of work is a config issue.

How it works

cost-xray runs mitmproxy as a local capture hop and keeps all analysis off the request path.

agent ──HTTP──▶ mitmproxy ──HTTPS──▶ model API
                     │
                     └── redacted raw request/response → ~/.cost-xr