by postrv
Code Mode-inspired, local, sandboxed MCP gateway that collapses N servers x M tools into 2 tools (~1,000 tokens).
```sh
# Add to your Claude Code skills
git clone https://github.com/postrv/forgemax
```
Instead of dumping every tool schema into the LLM's context window, Forgemax exposes exactly two MCP tools:
- `search` — query a capability manifest to discover tools (read-only, sandboxed)
- `execute` — run JavaScript against the tool API in a sandboxed V8 isolate

Additional sandbox APIs (the MCP surface stays at exactly 2 tools):
- `forge.readResource(server, uri)` — read MCP resources from downstream servers
- `forge.stash` — session-scoped key-value store for sharing data across executions
- `forge.parallel(calls, opts)` — bounded concurrent execution of tool/resource calls

The LLM writes JavaScript that calls through typed proxy objects. Credentials, file paths, and internal state never leave the host — the sandbox only sees opaque bindings. TypeScript definitions (`forge.d.ts`) are compiled into the binary and served in MCP server instructions, giving LLMs full type awareness.
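To make the shape of an `execute()` payload concrete, here is a sketch of the kind of code an LLM might submit. The `forge` object below is a minimal stub standing in for the real host bindings (so the snippet runs outside the sandbox); the server name and URIs are illustrative, not taken from a real manifest.

```javascript
// Stub of the sandbox surface described above — NOT the real bindings.
const forge = {
  stash: new Map(), // session-scoped key-value store (stubbed with a Map)
  readResource: async (server, uri) => `<contents of ${uri} from ${server}>`, // stubbed
  parallel: async (calls) => Promise.all(calls.map((fn) => fn())), // unbounded stub
};

async function run() {
  // Fan out two resource reads, then stash one result so a later
  // execution in the same session can reuse it without re-fetching.
  const [readme, license] = await forge.parallel([
    () => forge.readResource("github", "forgemax/README.md"),
    () => forge.readResource("github", "forgemax/LICENSE"),
  ]);
  forge.stash.set("readme", readme);
  return { readme, license };
}
```

The point of the pattern: one `execute()` round-trip performs several downstream calls, instead of one LLM turn per tool call.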
Forgemax's Code Mode approach draws inspiration from Cloudflare's sandbox tool-calling pattern — their implementation of sandboxed code execution for MCP tool orchestration is excellent and well worth studying. We encourage supporting their work.
| Traditional MCP | Forgemax Code Mode |
|---|---|
| 76 tools = ~15,000 tokens of schema | 2 tools = ~1,000 tokens |
| 5-10 sequential round-trips | 1 execute() call with chaining |
| Every new tool widens the context | Tool count is invisible to the LLM |
LLMs are trained on billions of lines of code. They're better at writing `narsil.symbols.find({pattern: "handle_*"})` than picking the right tool from a 76-item JSON schema list.
| Scenario | Raw MCP (tokens) | Forgemax (tokens) | Savings |
|----------|------------------|-------------------|---------|
| 10 tools | ~4,200 | ~1,100 | 73% |
| 50 tools | ~20,700 | ~1,100 | 94% |
| 76 tools | ~33,100 | ~1,100 | 96% |
| 150 tools | ~61,800 | ~1,100 | 98% |
Forgemax schema size is constant (~1,100 tokens) regardless of how many tools are connected.
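The savings column follows directly from the constant schema size. A quick sanity check (floor-rounded, matching the table's percentages):

```javascript
// Savings = 1 - forgemax_tokens / raw_tokens, with Forgemax's
// schema held constant at ~1,100 tokens.
const savings = (raw, forgemax = 1100) =>
  Math.floor((1 - forgemax / raw) * 100);

console.log(savings(4200), savings(20700), savings(33100), savings(61800));
// → 73 94 96 98
```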
Run the benchmark yourself:

```sh
cargo run -p forge-manifest --example token_savings
```
| Crate | Description |
|-------|-------------|
| `forgemax` | Binary entry point (stdio MCP transport) |
| `forge-config` | TOML config loading, env var expansion, file watching |
| `forge-client` | MCP client connections (stdio + HTTP/SSE), routing |
| `forge-server` | MCP server handler (`search` + `execute` via rmcp) |
| `forge-sandbox` | V8 sandbox (deno_core, AST validator, worker pool) |
| `forgemax-worker` | Isolated child process for V8 execution |
| `forge-manifest` | Capability manifest, LiveManifest, TypeScript defs |
| `forge-error` | Typed `DispatchError` enum, structured errors, fuzzy matching |
| `forge-audit` | Audit event types and structured logging |
| `forge-test-server` | Mock MCP server for integration tests |
The core innovation is the `forge-sandbox` crate, which uses deno_core to run LLM-generated JavaScript in a locked-down V8 isolate:
- AST validation that rejects indirect `eval` (`const e = eval; e("code")`) and `globalThis` destructured through multiple assignment hops
- Structured error types (`Timeout`, `HeapLimit`, `JsError`) preserved across the IPC boundary
- Optional warm worker pool (`worker-pool` feature)
- Bounded parallelism (`forge.parallel()`) with concurrency caps
- Optional execution metrics (`metrics` feature)

The `forgemax-worker` binary is an isolated child process for production execution. It communicates with the parent via length-delimited JSON IPC over stdin/stdout and starts with a clean environment — no env vars, no inherited file descriptors. Even a V8 zero-day is contained at the OS process boundary.
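To make the validator's job concrete, here is the kind of aliasing it is described as catching — a sketch of the attack patterns themselves, not of the validator:

```javascript
// Illustrative evasion patterns. A checker that only scans for the
// literal token eval("...") misses both of these; an AST-level pass
// can follow the aliases through the assignments.
const e = eval;              // direct alias of eval
let g = globalThis;          // hop 1
let h = g;                   // hop 2
const { eval: hidden } = h;  // eval destructured off globalThis

// Both aliases are the very same function object as eval:
console.log(e === eval, hidden === eval); // → true true
```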
Queryable index of all tools across all connected MCP servers. Supports progressive discovery.
Built dynamically from live tools/list responses when downstream servers connect. LiveManifest provides lock-free reads via arc-swap with atomic swap for background refresh — periodic re-discovery on a configurable interval, plus SIGHUP-triggered refresh on Unix. TypeScript definitions (forge.d.ts) are compiled into the binary at build time and served in MCP server instructions.
MCP client connections to downstream servers. Supports stdio and HTTP/SSE transports. RouterDispatcher routes callTool(server, tool, args) to the correct downstream connection with pre-dispatch tool name validation — misspelled tools return TOOL_NOT_FOUND with Levenshtein-based suggestions before ever hitting the upstream server. ReconnectingClient decorator auto-reconnects on permanent transport failures (broken pipe, channel overflow) with exponential backoff — default enabled for stdio transports.
Implements ServerHandler from rmcp. Exposes search and execute as MCP tools, wires them to the sandbox executor, and serves over stdio. Key operations are instrumented with tracing spans for structured observability.
Typed `DispatchError` enum replacing `anyhow::Error` across all dispatchers. Variants: `ServerNotFound`, `ToolNotFound`, `Timeout`, `CircuitOpen`, `GroupPolicyDenied`, `Upstream`, `TransportDead`, `RateLimit`, `Internal`. `TransportDead` distinguishes permanent transport failures (broken pipe, channel closed) from transient upstream errors — it triggers the circuit breaker but is not retryable without reconnection. Includes fuzzy matching — `find_symbls` suggests `find_symbols` via Levenshtein distance. Errors serialize to structured JSON with `{error, code, message, retryable, suggested_fix}`.
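Under that contract, a misspelled tool call might serialize to something like the following (field values are illustrative, not captured from real output):

```json
{
  "error": "ToolNotFound",
  "code": "TOOL_NOT_FOUND",
  "message": "no tool named 'find_symbls' on server 'narsil'",
  "retryable": false,
  "suggested_fix": "did you mean 'find_symbols'?"
}
```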
Audit event types for structured logging. Every sandbox execution is logged with code hash, tool calls, duration, outcome, worker reuse status, and pool size at acquisition. Code previews are redacted before logging.
TOML configuration with environment variable expansion (${GITHUB_TOKEN}). Configures downstream servers, transports, sandbox limits, and execution mode. Per-server reconnect and max_reconnect_backoff_secs fields control auto-reconnection on transport death (default: enabled for stdio). Optional config file watching via notify crate with debounced reload (requires config-watch feature). Startup concurrency is configurable (startup_concurrency, default 8) for parallel server connections.
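As a sketch of those knobs in TOML — field names are taken from the description above, but their exact placement in the schema is an assumption:

```toml
# Parallel server connections at startup (default 8)
startup_concurrency = 8

[servers.narsil]
command = "narsil-mcp"
args = ["--repos", "."]
transport = "stdio"
reconnect = true                  # auto-reconnect on transport death (default for stdio)
max_reconnect_backoff_secs = 60   # illustrative cap on exponential backoff
```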
npm (recommended):

```sh
npm install -g forgemax
```

Homebrew (macOS/Linux):

```sh
brew tap postrv/forgemax && brew install forgemax
```

Shell installer (macOS/Linux):

```sh
curl -fsSL https://raw.githubusercontent.com/postrv/forgemax/main/install.sh | bash
```

PowerShell (Windows):

```powershell
irm https://raw.githubusercontent.com/postrv/forgemax/main/install.ps1 | iex
```

Scoop (Windows):

```powershell
scoop bucket add forgemax https://github.com/postrv/scoop-forgemax
scoop install forgemax
```

Cargo (from source):

```sh
cargo install forgemax
```

From source:

```sh
cargo build --release
# Binaries: target/release/forgemax + target/release/forgemax-worker
```
```sh
# 1. Generate a config file
forgemax init

# 2. Edit forge.toml to add your servers and tokens

# 3. Validate your setup
forgemax doctor

# 4. Run (serves MCP over stdio)
RUST_LOG=info forgemax

# Run tests (development)
cargo test --workspace
```
| Command | Description |
|---------|-------------|
| `forgemax` | Start the MCP gateway server (default) |
| `forgemax serve` | Explicit alias for server mode |
| `forgemax doctor` | Validate configuration and connectivity |
| `forgemax manifest` | Inspect the capability manifest |
| `forgemax run <file>` | Execute a JavaScript file against servers |
| `forgemax init` | Generate a starter config file |
Copy the example config and add your tokens:
```sh
cp forge.toml.example forge.toml
```
The example includes pre-configured connections for 11 reputable MCP servers:
| Server | Company | Transport | Auth |
|--------|---------|-----------|------|
| narsil | — | stdio | None |
| github | GitHub | stdio (Docker) | Personal access token |
| playwright | Microsoft | stdio (npx) | None |
| sentry | Sentry | stdio (npx) | Auth token |
| cloudflare | Cloudflare | SSE | OAuth |
| supabase | Supabase | stdio (npx) | Access token |
| notion | Notion | stdio (npx) | Integration token |
| figma | Figma | SSE | OAuth |
| stripe | Stripe | stdio (npx) | Secret key |
| linear | Linear | SSE | OAuth |
| atlassian | Atlassian | SSE | OAuth |
Uncomment only the servers you need. Environment variables are expanded (`${GITHUB_TOKEN}`).
```toml
[servers.narsil]
command = "narsil-mcp"
args = ["--repos", "."]
transport = "stdio"

[sandbox]
timeout_secs = 5
max_heap_mb = 64
execution_mode = "child_process"

# Per-s
```