Enterprise AI bastion host for secure AI API and MCP access, with unified proxying, RBAC, audit logs, rate limiting, and cost tracking across OpenAI, Anthropic, Gemini, and self-hosted LLMs.
# Add to your Claude Code skills
git clone https://github.com/ThinkWatchProject/ThinkWatchThe enterprise-grade secure gateway for AI. Secure, audit, and govern every AI API call and MCP tool invocation across your organization — from a single control plane.
Just as an SSH secure gateway is the single gateway through which all server access must flow, ThinkWatch is the single gateway through which all AI access must flow. Every model request. Every tool call. Every token. Authenticated, authorized, rate-limited, logged, and accounted for.
┌──────────────────────────────────────┐
Claude Code ──────>│ │──> OpenAI
Cursor ───────────>│ Gateway :3000 │──> Anthropic
Custom Agent ─────>│ AI API + MCP Unified Proxy │──> Google Gemini
CI/CD Pipeline ───>│ │──> Azure OpenAI / AWS Bedrock
└──────────────────────────────────────┘
┌──────────────────────────────────────┐
Admin Browser ────>│ Console :3001 │
│ Management UI + Admin API │
└──────────────────────────────────────┘
As AI agents proliferate across engineering teams, organizations face a growing governance challenge:
.env files, shared in Slack, rotated neverNo comments yet. Be the first to share your thoughts!
ThinkWatch solves all of this with a single deployment.
/v1/chat/completions), Anthropic Messages (/v1/messages), and OpenAI Responses (/v1/responses) APIs on a single port; works as a drop-in replacement for Cursor, Continue, Cline, Claude Code, and the OpenAI/Anthropic SDKsgpt-/o1-/o3-/o4- for OpenAI, claude- for Anthropic, gemini- for Google) route automatically; Azure and Bedrock require explicit model registrationtw- keys; the same tw- token works on both the AI gateway and the MCP gateway via a per-key surfaces allowlistinput_multiplier / output_multipliergithub__create_issue, postgres__query — no tool name collisionsdeleted_at column) with automatic purge after 30 dayssystem_settings table), configurable via Web UI (Admin > Settings with 7 category tabs)/setup wizard creates the super_admin account, configures the site, and optionally adds the first provider and API key/gateway/guide page in the web console with copy-paste setup instructions for Claude Code, Cursor, Continue, Cline, OpenAI SDK, Anthropic SDK, and cURL; auto-detects the gateway URLGET /metrics endpoint on the gateway port (3000) exposing gateway_requests_total, gateway_request_duration_seconds, gateway_tokens_total, gateway_rate_limited_total, circuit_breaker_state, and more/health/live (liveness probe), /health/ready (readiness probe with PostgreSQL and Redis checks), /api/health (detailed latency and pool statistics)ThinkWatch enforces two parallel kinds of quota at every gateway request, both managed from the same admin UI:
| | Sliding-window rate limits | Natural-period budget caps |
|---|---|---|
| What it counts | Requests OR weighted tokens, depending on the rule's metric | Weighted tokens only |
| Window shape | Rolling 60-bucket window: 1m / 5m / 1h / 5h / 1d / 1w | Calendar-aligned: daily / weekly / monthly (resets on the period boundary) |
| Backing store | Redis ZSET-style buckets | Redis INCR counters keyed by subject:period:bucket_id |
| When it fires | Pre-flight (requests metric) AND post-flight (tokens metric) | Post-flight only |
| Hard or soft? | Hard for requests metric, soft for tokens metric | Soft cap — exactly one request can push you over before subsequent calls in the same period are rejected |
A single request can be subject to multiple rules and budgets at once. The
engine resolves the request to a set of (subject_kind, subject_id) tuples
and runs every enabled rule against all of them in one atomic Lua check.
Any rule rejecting → the request is rejected. All-or-nothing INCR.
| Subject | Rate limit rules | Budget caps |
|---|---|---|
| user | ✅ ai_gateway / mcp_gateway | ✅ |
| api_key | ✅ ai_gateway / mcp_gateway | ✅ |
| provider | ✅ ai_gateway only | ✅ |
| mcp_server | ✅ mcp_ga