by hanzili
Let any AI agent use the local browser
# Add to your Claude Code skills
git clone https://github.com/hanzili/hanzi-browse

The context layer for browsing agents.
Your browsing agent keeps failing on real sites — X uses Draft.js, LinkedIn hides the connect button, Gmail needs keyboard shortcuts. Hanzi Browse ships 24 site playbooks — hints for the LLM, not brittle scripts — so it actually finishes the task.
Same 24 site playbooks underneath. Two install paths depending on who's driving.
One command. npx hanzi-browse setup detects every AI agent on your machine (Claude Code, Cursor, Codex, and 9 more) and wires Hanzi Browse in as an MCP tool. Your main agent delegates browser work; a sub-agent runs the loop — read page → plan next action → click/type/scroll → observe → repeat until done — and returns a clean answer. Site playbooks auto-load by URL so the model already knows the quirks.
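The sub-agent loop described above can be sketched roughly as follows. This is an illustrative outline, not Hanzi Browse's actual implementation: the `Action`, `Page`, and `Planner` types and the `runLoop` function are hypothetical names, and the real agent calls an LLM where this sketch takes a plain `plan` function.

```typescript
// Hypothetical types; none of these names come from the Hanzi Browse codebase.
type Action =
  | { kind: "click"; target: string }
  | { kind: "type"; target: string; text: string }
  | { kind: "scroll"; dy: number }
  | { kind: "done"; answer: string };

interface Page {
  read(): string;              // observe: return a text snapshot of the page
  apply(action: Action): void; // act: click, type, or scroll
}

// Stand-in for the LLM call: given the task, the latest snapshot, and the
// actions taken so far, decide the next action.
type Planner = (task: string, snapshot: string, history: Action[]) => Action;

function runLoop(task: string, page: Page, plan: Planner, maxSteps = 20): string {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = page.read();                 // read page
    const action = plan(task, snapshot, history); // plan next action
    if (action.kind === "done") return action.answer;
    page.apply(action);                           // click / type / scroll
    history.push(action);                         // result is observed on the next pass
  }
  throw new Error(`Gave up after ${maxSteps} steps`);
}
```

The step cap is a design choice of this sketch: without it, a planner that never emits `done` would loop forever.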
Your backend calls runTask({ task: "…" }). Your users' own Chrome executes it, signed in as themselves. Same 24 playbooks as the CLI, exposed as a REST API and @hanzi-browse/sdk. Free tools on tools.hanzilla.co are built on this SDK.
npx hanzi-browse setup
One command does everything:
npx hanzi-browse setup
│
├── 1. Detect browsers ──── Chrome, Brave, Edge, Arc, Chromium
│
├── 2. Install extension ── Opens Chrome Web Store, waits for install
│
├── 3. Detect AI agents ─── Claude Code, Cursor, Codex, Windsurf,
│                           VS Code, Gemini CLI, Amp, Cline, Roo Code
│
├── 4. Configure MCP ────── Merges hanzi-browse into each agent's config
│
├── 5. Install skills ───── Copies browser skills into each agent
│
└── 6. Choose AI mode ───── Managed ($0.05/task) or BYOM (free forever)
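Step 4 merges an entry into each agent's config rather than overwriting the file. A minimal sketch of that kind of non-destructive merge, assuming the config is a JSON object with an `mcpServers` map (a shape several MCP clients use; the real per-agent formats may differ). The `mergeMcpServer` helper and the `command`/`args` values are hypothetical placeholders, not what the wizard actually writes.

```typescript
// Assumed config shape; individual agents may store MCP servers differently.
interface McpServer {
  command: string;
  args: string[];
}

interface AgentConfig {
  mcpServers?: Record<string, McpServer>;
  [key: string]: unknown;
}

// Returns a new config with `name` added or replaced. Every other key and
// every other server entry is preserved, and the input is not mutated.
function mergeMcpServer(config: AgentConfig, name: string, server: McpServer): AgentConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: server },
  };
}

const existing: AgentConfig = {
  theme: "dark",
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};

// Placeholder command and args, for illustration only.
const merged = mergeMcpServer(existing, "hanzi-browse", {
  command: "npx",
  args: ["hanzi-browse"],
});

console.log(Object.keys(merged.mcpServers ?? {}));
```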
"Go to Gmail and unsubscribe from all marketing emails from the last week"
"Apply for the senior engineer position on careers.acme.com"
"Log into my bank and download last month's statement"
"Find AI engineer jobs on LinkedIn in San Francisco"
Hanzi Browse has two distribution channels. Both use the same browser automation engine and site domain knowledge:
Skills — for users who run Hanzi Browse locally through their AI agent. The setup wizard installs skills directly into your agent (Claude Code, Cursor, etc.). Each skill teaches the agent when and how to use the browser for a specific workflow.
Free Tools — hosted web apps that anyone can try without installing anything. Each tool is a standalone app built on the Hanzi Browse API that demonstrates a use case. Every skill can become a free tool.
Installed automatically during npx hanzi-browse setup. Your agent reads these as markdown files.
| Skill | Description |
|-------|-------------|
| hanzi-browse | Core skill — when and how to use browser automation |
| e2e-tester | Test your app in a real browser, report bugs with screenshots |
| social-poster | Draft per-platform posts, publish from your signed-in accounts |
| linkedin-prospector | Find prospects, send personalized connection requests |
| a11y-auditor | Run accessibility audits in a real browser |
| data-extractor | Extract structured data from websites into CSV/JSON |
| x-marketer | Twitter/X marketing workflows |
Open source — add your own.
Try them at tools.hanzilla.co. No account needed — just install the extension and go.
| Tool | What it does | Try it |
|------|-------------|--------|
| X Marketing | AI finds relevant conversations on X, drafts personalized replies, posts from your Chrome | tools.hanzilla.co/x-marketing |
Both CLI and SDK rely on a shared set of site playbooks — verified interaction recipes for complex websites. They teach the LLM how async loading works on X, which selector hides LinkedIn's connect button, that Gmail responds to keyboard shortcuts, and how to sidestep anti-bot detection on ~20 other sites.
Hints for the LLM, not brittle scripts. The model stays in control; we just hand it the cheat sheet. When the DOM shifts, the agent adapts — no adapter to rebuild.
Currently supports 24 sites: X, LinkedIn, Gmail, GitHub, Notion, Figma, Slack, Reddit, Amazon, eBay, Walmart, Target, Zillow, Apartments.com, Craigslist, Indeed, Google Docs, Sheets, Calendar, Drive, ChatGPT, Claude.ai, Stack Overflow.
All playbooks live in server/src/agent/domain-skills.json as a single shared JSON array. To add a site, open a PR appending a { domain, skill } entry.
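To illustrate how a `{ domain, skill }` array could be auto-loaded by URL, here is a small lookup sketch. The field names come from the description above; the sample entries, the `findPlaybook` helper, and the subdomain-matching rule are assumptions, and the real loader may match differently.

```typescript
interface DomainSkill {
  domain: string;
  skill: string;
}

// Sample entries for illustration only; not the contents of domain-skills.json.
const playbooks: DomainSkill[] = [
  { domain: "x.com", skill: "Example hint text for X." },
  { domain: "linkedin.com", skill: "Example hint text for LinkedIn." },
];

function findPlaybook(url: string, entries: DomainSkill[]): DomainSkill | undefined {
  const host = new URL(url).hostname.toLowerCase();
  // Match the domain itself or any subdomain of it (www.linkedin.com, etc.).
  return entries.find((e) => host === e.domain || host.endsWith("." + e.domain));
}

console.log(findPlaybook("https://www.linkedin.com/in/someone", playbooks)?.domain);
```

Matching on `"." + domain` rather than a bare suffix keeps an unrelated host such as `notx.com` from picking up the `x.com` playbook.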
Embed browser automation in your product. Your app calls the Hanzi Browse API, a real browser executes the task, you get the result back.
/pair/{token}): they click it and auto-pair
POST /v1/tasks with a task and browser session ID
GET /v1/tasks/:id until complete, or use runTask() which blocks

import { HanziClient } from '@hanzi-browse/sdk';
const client = new HanziClient({ apiKey: process.env.HANZI_API_KEY });
const { pairingToken } = await client.createPairingToken();
const sessions = await client.listSessions();
const result = await client.runTask({
  browserSessionId: sessions[0].id,
  task: 'Read the patient chart on the current page',
});
console.log(result.answer);
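The SDK's runTask() blocks until the task finishes. If you call the REST API directly instead, the "poll GET /v1/tasks/:id until complete" step might look like the sketch below. The `TaskStatus` shape, the status values, and the `pollTask` helper are assumptions for illustration. The fetch itself is injected as `getTask` so that no base URL or auth header has to be guessed here.

```typescript
// Assumed response shape; the real fields returned by the API may differ.
interface TaskStatus {
  status: "running" | "complete" | "error";
  answer?: string;
}

// Caller supplies the function that performs GET /v1/tasks/:id.
type GetTask = (taskId: string) => Promise<TaskStatus>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollTask(
  getTask: GetTask,
  taskId: string,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<TaskStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const task = await getTask(taskId);
    if (task.status !== "running") return task; // finished or failed
    await sleep(intervalMs);                    // wait before the next poll
  }
  throw new Error(`Task ${taskId} still running after ${maxAttempts} polls`);
}
```

A fixed interval keeps the sketch short; a production client would likely add backoff and a wall-clock timeout.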
| Tool | Description |
|------|-------------|
| browser_start | Run a task. Blocks until complete. |
| browser_message | Send follow-up to an existing session. |
| browser_status | Check progress. |
| browser_stop | Stop a task. |
| browser_screenshot | Capture current page as image. |
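Agents normally invoke these tools for you, but it can help to see what a call looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below builds one for `browser_start`. The `toolCall` helper is hypothetical, and `task` is an assumed argument name, so check the tool's published input schema for the real parameters.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++, // each request gets a fresh id so responses can be matched
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// `task` is an assumed argument name, not confirmed by the tool table above.
const request = toolCall("browser_start", {
  task: "Find AI engineer jobs on LinkedIn in San Francisco",
});

console.log(JSON.stringify(request, null, 2));
```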
| | Managed | BYOM |
|--|---------|------|
| Price | $0.05/task (20 free/month) | Free forever |
| AI model | We handle it (Gemini) | Your own key |
| Data | Processed on Hanzi Browse servers | Never leaves your machine |
| Billing | Only completed tasks. Errors are free. | N/A |
Building a product? Contact us for volume pricing.
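For a rough estimate of a Managed bill, the pricing above reduces to simple arithmetic. This sketch assumes the 20 free tasks are applied to completed tasks first and that failed tasks never count toward either total. `managedMonthlyCostCents` is a hypothetical helper, and amounts are kept in cents to avoid floating-point rounding.

```typescript
// Figures from the pricing table: $0.05 per task, 20 free tasks per month.
const PRICE_PER_TASK_CENTS = 5;
const FREE_TASKS_PER_MONTH = 20;

// Only completed tasks are passed in; errored tasks are free and excluded.
function managedMonthlyCostCents(completedTasks: number): number {
  const billable = Math.max(0, completedTasks - FREE_TASKS_PER_MONTH);
  return billable * PRICE_PER_TASK_CENTS;
}

console.log(managedMonthlyCostCents(15));  // 0: within the free allowance
console.log(managedMonthlyCostCents(120)); // 500 cents, i.e. $5.00
```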
Prerequisites: Node.js 18+, Docker Desktop (must be running before make fresh).
git clone https://github.com/hanzili/hanzi-browse
cd hanzi-browse
make fresh
Performs full setup: installs deps, builds server/dashboard/extension, starts Postgres, runs migrations, and launches the dev server (~90s).
make dev
Starts the backend services (Postgres + migrations + API server) and serves the dashboard UI.
The defaults in .env.example are enough to run the server.
Optional services:
GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET