by elusznik
An MCP server that executes Python code in isolated rootless containers, with optional proxying of other MCP servers. An implementation of Anthropic's and Cloudflare's ideas for reducing the context bloat of MCP tool definitions.
# Add to your Claude Code skills
```
git clone https://github.com/elusznik/mcp-server-code-execution-mode
```

Stop paying 30,000 tokens per query. This bridge implements Anthropic's discovery pattern with rootless security, reducing MCP context from 30K to roughly 200 tokens while proxying any stdio server.
This bridge implements the "Code Execution with MCP" pattern, a convergence of ideas from industry leaders.
Instead of exposing hundreds of individual tools to the LLM (which consumes massive context and confuses the model), this bridge exposes one tool: run_python. The LLM writes Python code to discover, call, and compose other tools.
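For illustration, here is roughly what that single tool call looks like on the wire. `tools/call` is the standard MCP JSON-RPC method and `run_python` is the tool name from above; the `code` argument key is an assumption about this bridge's schema:

```python
# Hypothetical shape of the one tool call an MCP client sends to this
# bridge. "tools/call" is the standard MCP method; the "code" argument
# name is an assumption. The submitted script is whatever the LLM wrote.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_python",
        "arguments": {
            "code": (
                "from mcp import runtime\n"
                "matches = await runtime.search_tool_docs('calendar events')\n"
                "print(matches)\n"
            ),
        },
    },
}
```

Every other server's tools are reached through this one entry point, so the client-side schema cost never grows with the number of proxied servers.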
While there are JavaScript-based alternatives (like universal-tool-calling-protocol/code-mode), this project is built for Data Science and Security:
| Feature | This Project (Python) | JS Code Mode (Node.js) |
| :--- | :--- | :--- |
| Native Language | Python (The language of AI/ML) | TypeScript/JavaScript |
| Data Science | Native (pandas, numpy, scikit-learn) | Impossible / Hacky |
| Isolation | Hard (Podman/Docker Containers) | Soft (Node.js VM) |
| Security | Enterprise (Rootless, No Net, Read-Only) | Process-level |
| Philosophy | Infrastructure (Standalone Bridge) | Library (Embeddable) |
Choose this if: You want your agent to analyze data, generate charts, use scientific libraries, or if you require strict container-based isolation for running untrusted code.
Connect Claude to 11 MCP servers exposing ~100 tools and roughly 30,000 tokens of tool schemas get loaded into every prompt. That's about $0.09 per query before you ask a single question. Scale to 50 servers and your context window breaks.
Traditional MCP (Context-Bound)

```
┌─────────────────────────────┐
│ LLM Context (30K tokens)    │
│ - serverA.tool1: {...}      │
│ - serverA.tool2: {...}      │
│ - serverB.tool1: {...}      │
│ - … (dozens more)           │
└─────────────────────────────┘
              ↓
        LLM picks tool
              ↓
        Tool executes
```
This Bridge (Discovery-First)

```
┌─────────────────────────────┐
│ LLM Context (≈200 tokens)   │
│ “Use discovered_servers(),  │
│  query_tool_docs(),         │
│  search_tool_docs()”        │
└─────────────────────────────┘
              ↓
     LLM discovers servers
              ↓
     LLM hydrates schemas
              ↓
      LLM writes Python
              ↓
   Bridge proxies execution
```
Result: constant overhead. Whether you manage 10 or 1000 tools, the system prompt stays right-sized and schemas flow only when requested.
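Inside the sandbox, the two discovery stages above might look like the sketch below. The method names come from the diagram; treating `query_tool_docs` as taking a server name, and passing `runtime` in explicitly to keep the sketch self-contained, are assumptions:

```python
async def hydrate_schemas(runtime, server: str):
    """Two-stage discovery sketch. Stage 1 lists server names only
    (cheap); stage 2 pulls full tool schemas for one server on demand.
    `runtime` stands in for the `mcp.runtime` object the bridge injects."""
    servers = await runtime.discovered_servers()    # stage 1: names only
    if server not in servers:
        raise ValueError(f"unknown server: {server}")
    return await runtime.query_tool_docs(server)    # stage 2: schemas on demand
```

The schemas for untouched servers never enter the context at all, which is what keeps the overhead constant.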
| Capability | Docker MCP Gateway | Cloudflare Code Mode | Research Patterns | This Bridge |
|------------|--------------------|----------------------|-------------------|--------------|
| Solves token bloat | ❌ Manual preload | ❌ Fixed catalog | ❌ Theory only | ✅ Discovery runtime |
| Universal MCP proxying | ✅ Containers | ⚠️ Platform-specific | ❌ Not provided | ✅ Any stdio server |
| Rootless security | ⚠️ Optional | ✅ V8 isolate | ❌ Not addressed | ✅ Cap-dropped sandbox |
| Auto-discovery | ⚠️ Catalog-bound | ❌ N/A | ❌ Not implemented | ✅ 12+ config paths |
| Tool doc search | ❌ | ❌ | ⚠️ Conceptual | ✅ search_tool_docs() |
| Production hardening | ⚠️ Depends on you | ✅ Managed service | ❌ Prototype | ✅ Tested bridge |
Speakeasy's Dynamic Toolsets use a 3-step flow: `search_tools` → `describe_tools` → `execute_tool`. While this saves tokens, it forces the agent into a "chatty" loop: `search_tools("create_issue")`, then `describe_tools`, then `execute_tool("create_issue", …)`, with an LLM round-trip between each step.

This Bridge (Code-First) collapses that loop:
Given a single instruction like "use `mcp_github`, search for 'issues', and create one if missing", the agent writes a single Python script that performs discovery, logic, and execution in one round-trip. It's faster, cheaper (fewer intermediate LLM calls), and handles complex logic (loops, retries) that a simple "execute" tool cannot.
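A sketch of the script such a prompt could produce, under stated assumptions: `runtime` is the bridge-injected `mcp.runtime`, and the `call_tool` helper plus the GitHub tool names `search_issues`/`create_issue` are hypothetical stand-ins for whatever discovery actually returns:

```python
async def ensure_issue(runtime, title: str):
    """Discover, search, and create-if-missing in one pass.
    `call_tool` and the GitHub tool names are assumed, not part of
    this bridge's documented API."""
    hits = await runtime.search_tool_docs("issues", limit=10)
    tools = {h["tool"]: h["server"] for h in hits}   # tool name -> server

    existing = await runtime.call_tool(
        tools["search_issues"], "search_issues", {"query": title}
    )
    if existing:
        return existing[0]            # issue already present, nothing to do
    return await runtime.call_tool(   # otherwise create it in the same run
        tools["create_issue"], "create_issue", {"title": title}
    )
```

The conditional and both tool calls execute inside the sandbox; the LLM is consulted once, not once per step.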
OneMCP provides a "Handbook" chat interface where you ask questions and it plans execution. This is great for simple queries but turns the execution into a black box.
This Bridge gives the agent raw, sandboxed control. The agent isn't asking a black box to "do it"; the agent is the programmer, writing the exact code to interact with the API. This allows for precise edge-case handling and complex data processing that a natural language planner might miss.
- Two-stage discovery – `discovered_servers()` reveals what exists; `query_tool_docs(name)` loads only the schemas you need.
- Fuzzy search across servers – let the model find tools without memorising catalog names:

  ```python
  from mcp import runtime

  matches = await runtime.search_tool_docs("calendar events", limit=5)
  for hit in matches:
      print(hit["server"], hit["tool"], hit.get("description", ""))
  ```

- Zero-copy proxying – every tool call stays within the sandbox, mirrored over stdio with strict timeouts.
- Rootless by default – Podman/Docker containers run with `--cap-drop=ALL`, a read-only root, `no-new-privileges`, and explicit memory/PID caps.
- Compact + TOON output – minimal plain-text responses for most runs, with deterministic TOON blocks available via `MCP_BRIDGE_OUTPUT_MODE=toon`.
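The hardening defaults listed above translate into a container invocation roughly like the following. The flag set mirrors the feature list; the image name and the exact limit values are illustrative assumptions, not the bridge's real defaults:

```python
# Sketch of the hardened rootless invocation described above. Flags
# mirror the feature list (cap drop, read-only root, no new privileges,
# no network, memory/PID caps); image and limit values are illustrative.
cmd = [
    "podman", "run", "--rm", "--interactive",
    "--cap-drop=ALL",                        # drop every Linux capability
    "--security-opt", "no-new-privileges",   # block privilege escalation
    "--read-only",                           # read-only root filesystem
    "--network=none",                        # no network access
    "--memory=512m",                         # explicit memory cap
    "--pids-limit=128",                      # explicit PID cap
    "docker.io/library/python:3.12-alpine",
    "python", "-",                           # read the script from stdin
]
# subprocess.run(cmd, input=script_bytes) would execute it (podman required)
```

Because Podman runs rootless, even a full container escape lands in an unprivileged user account rather than root on the host.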
This server aligns with the philosophy that you might not need MCP at all for every little tool. Instead of building rigid MCP servers for simple tasks, you can use this server to give your agent raw, sandboxed access to Bash and Python.