Bah - an AI browser (Perplexity Comet-style) by VilelaLab. Type plain-language commands and the agent clicks and types for you. Works free out of the box, or use your own key (DeepSeek/Mistral/NVIDIA) or a local model (Ollama). Electron + React + TypeScript.
# Add to your Claude Code skills
git clone https://github.com/alexvilelabah/bah-browser
bah-browser is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by alexvilelabah. It has 50 GitHub stars.
bah-browser is primarily written in TypeScript. It is open-source under alexvilelabah on GitHub, so you can review or fork the full source.
You give natural-language commands ("open gmail and delete the spam") and the AI operates the browser in your place: reading the screen, clicking with a real mouse, typing, and continuing until it's done.
Free out of the box: no key, no GPU. Chat and image generation work the moment you open it, with no API key. For the full autonomous agent (or more speed), add a cheap key: DeepSeek (recommended), Mistral, or NVIDIA NIM. Prefer 100% local + offline? Run a model with Ollama.

Download the full video: the agent searching and operating the web on its own. (The GIF above is a preview; GitHub doesn't play embedded video in the README.)
I just want to use it (Windows): grab the installer (the Bah-Setup-*.exe file), then double-click it to install.
Want it 100% local & free? No terminal needed. Set up the AI right inside the browser: open the AI panel → Local AI → type a model name → Download, and Bah pulls the Ollama model for you. (Prefer the cloud? A cheap DeepSeek key works too, and needs no GPU.)
Auto-updates: after installing, Bah checks for new versions, downloads them in the background, and offers "Restart now" to apply them, with no reinstalling.
Windows shows a blue "protected your PC" screen (the app isn't code-signed with a paid certificate yet). Click More info → Run anyway; this is normal for new open-source apps. (Later updates install without that warning.)
I want to hack on the code: clone and run it; see Running it below.
… sendInputEvent (not synthetic events; goes through React, Vue, Angular without being ignored)
… navigator.webdriver) so sites don't wrongly block the browser

| Layer | Tech |
|---|---|
| Browser shell | Electron 42 + Chromium |
| UI | React 19 + TypeScript + Vite |
| AI (cloud) | DeepSeek (recommended) · Mistral · NVIDIA NIM · free no-key fallback (Pollinations) |
| AI (local) | Ollama |
| Adblock | @ghostery/adblocker-electron |
| Webview | `<webview>` tag with persistent partition |
USER → "open gmail and delete the spam"
  │
  ▼
┌───────────────────────────────────────────────┐
│ for step in 1..25:                            │
│   1. observePage(webview)                     │
│      → { url, title, interactive_elements }   │
│   2. captureScreenshot()                      │
│   3. AI decides ONE action:                   │
│      { action: { type, ...params } }          │
│   4. execute action via REAL OS input         │
│   5. wait, re-observe, self-evaluate          │
│   6. if action == 'done' → return             │
└───────────────────────────────────────────────┘
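The loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not Bah's actual code: the `AgentDeps` interface, the `runAgent` name, and the result shape are assumptions made for the example, and the real observation and action types may differ. Only the step names (`observePage`, `captureScreenshot`, the `done` action) and the 25-step budget come from the diagram.

```typescript
// Hypothetical shapes for illustration; the real Bah types may differ.
interface Observation {
  url: string;
  title: string;
  interactive_elements: { id: number; text: string }[];
}

interface AgentAction {
  type: string; // e.g. "click_ref", "fill_ref", "done"
  [param: string]: unknown;
}

// Everything the loop needs is injected, so the sketch runs without a browser.
interface AgentDeps {
  observePage(): Promise<Observation>;
  captureScreenshot(): Promise<string>; // e.g. a base64-encoded image
  decide(task: string, obs: Observation, screenshot: string): Promise<{ action: AgentAction }>;
  execute(action: AgentAction): Promise<void>;
}

interface AgentResult {
  finished: boolean;
  steps: number;
  reason: string;
}

const MAX_STEPS = 25; // matches "for step in 1..25" in the diagram

async function runAgent(task: string, deps: AgentDeps, maxSteps: number = MAX_STEPS): Promise<AgentResult> {
  for (let step = 1; step <= maxSteps; step++) {
    // 1-2. Observe the page and capture a screenshot.
    const obs = await deps.observePage();
    const screenshot = await deps.captureScreenshot();

    // 3. The model picks exactly one action.
    const { action } = await deps.decide(task, obs, screenshot);

    // 6. "done" ends the loop (checked before executing, since it has no side effect).
    if (action.type === "done") {
      return { finished: true, steps: step, reason: String(action["reason"] ?? "") };
    }

    // 4. Execute the action; the next iteration re-observes the page (step 5).
    await deps.execute(action);
  }
  return { finished: false, steps: maxSteps, reason: "step budget exhausted" };
}
```

Injecting the four operations keeps the control flow testable on its own; in a real build they would be backed by the webview and the chosen AI provider.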
| Action | What it does |
|---|---|
| `click_ref(N)` | Clicks the element with id N from the observed list |
| `fill_ref(N, value)` | Fills input id N with value (and verifies it took) |
| `click_text(text)` | Finds by visible text and clicks |
| `click_at(x, y)` | Clicks at exact coordinates (visual fallback) |
| `type(text)` / `press(key)` | Types into the focused field / sends a key |
| `navigate(url)` / `scroll(dir)` | Goes to a URL / scrolls |
| `new_tab` / `switch_tab` / `close_tab` | Tab management |
| `done(reason, success)` | Ends the loop |
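One way to model the action table is a TypeScript discriminated union keyed on `type`. This is only a sketch: the action names come from the table, but the field names (`ref`, `dir`, `index`) and the `describeAction` helper are assumptions for illustration, not Bah's actual schema.

```typescript
// Action names follow the table; field names are illustrative assumptions.
type Action =
  | { type: "click_ref"; ref: number }
  | { type: "fill_ref"; ref: number; value: string }
  | { type: "click_text"; text: string }
  | { type: "click_at"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "press"; key: string }
  | { type: "navigate"; url: string }
  | { type: "scroll"; dir: "up" | "down" }
  | { type: "new_tab" }
  | { type: "switch_tab"; index: number }
  | { type: "close_tab" }
  | { type: "done"; reason: string; success: boolean };

// Hypothetical helper: turns an action into a human-readable log line.
function describeAction(a: Action): string {
  switch (a.type) {
    case "click_ref":
      return `click element #${a.ref}`;
    case "fill_ref":
      return `fill element #${a.ref} with "${a.value}"`;
    case "click_text":
      return `click the element showing "${a.text}"`;
    case "click_at":
      return `click at (${a.x}, ${a.y})`;
    case "type":
      return `type "${a.text}"`;
    case "press":
      return `press ${a.key}`;
    case "navigate":
      return `go to ${a.url}`;
    case "scroll":
      return `scroll ${a.dir}`;
    case "new_tab":
      return "open a new tab";
    case "switch_tab":
      return `switch to tab ${a.index}`;
    case "close_tab":
      return "close the current tab";
    case "done":
      return `finish (${a.success ? "success" : "failure"}): ${a.reason}`;
    default: {
      // Compile-time exhaustiveness check: adding a new action type
      // without handling it here becomes a type error.
      const unreachable: never = a;
      return unreachable;
    }
  }
}
```

With a union like this, the compiler rejects an action that is missing a required parameter, which is useful when the action object is produced by parsing model output.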
Clicks happen through webContents.sendInputEvent in the main process: a real Chromium mouse event, not a synthetic one, so React/Vue/anti-bot sites respond normally. The AI prefers the DOM-first path (click_ref), falling back to text and then coordinates.
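The sketch below illustrates both ideas under stated assumptions. `clickAt` sends a move/down/up sequence through a `sendInputEvent`-shaped method; the event fields (`type`, `x`, `y`, `button`, `clickCount`) mirror Electron's mouse input events, but the `InputTarget` interface is a local stand-in so the example runs without Electron. `resolveClickPoint`, `ObservedElement`, and `ClickRequest` are hypothetical names showing one possible ref → text → coordinates fallback; in Bah the model chooses the action, so the real logic may look quite different.

```typescript
// Local stand-in for the subset of Electron's mouse input event used here.
interface MouseInput {
  type: "mouseMove" | "mouseDown" | "mouseUp";
  x: number;
  y: number;
  button?: "left" | "middle" | "right";
  clickCount?: number;
}

// Anything exposing sendInputEvent; an Electron WebContents should fit this shape.
interface InputTarget {
  sendInputEvent(event: MouseInput): void;
}

// Sends a left click as real input events rather than a synthetic DOM click.
function clickAt(target: InputTarget, x: number, y: number): void {
  const px = Math.round(x); // input events expect integer coordinates
  const py = Math.round(y);
  target.sendInputEvent({ type: "mouseMove", x: px, y: py });
  target.sendInputEvent({ type: "mouseDown", x: px, y: py, button: "left", clickCount: 1 });
  target.sendInputEvent({ type: "mouseUp", x: px, y: py, button: "left", clickCount: 1 });
}

// Hypothetical observed element: id plus the center point of its bounding box.
interface ObservedElement {
  id: number;
  text: string;
  x: number;
  y: number;
}

interface ClickRequest {
  ref?: number;
  text?: string;
  x?: number;
  y?: number;
}

type ClickPoint = { x: number; y: number; via: "ref" | "text" | "coords" };

// DOM-first resolution: element id, then visible text, then raw coordinates.
function resolveClickPoint(req: ClickRequest, elements: ObservedElement[]): ClickPoint | null {
  if (req.ref !== undefined) {
    const byRef = elements.find((e) => e.id === req.ref);
    if (byRef) return { x: byRef.x, y: byRef.y, via: "ref" };
  }
  if (req.text !== undefined) {
    const needle = req.text.toLowerCase();
    const byText = elements.find((e) => e.text.toLowerCase().includes(needle));
    if (byText) return { x: byText.x, y: byText.y, via: "text" };
  }
  if (req.x !== undefined && req.y !== undefined) {
    return { x: req.x, y: req.y, via: "coords" };
  }
  return null; // nothing to click
}
```

Resolving to a point first and clicking second keeps all three paths on the same real-input route, so a coordinate fallback behaves like a ref-based click from the page's point of view.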
git clone https://github.com/alexvilelabah/bah-browser.git
cd bah-browser
npm install
npm run build
npm start # or: npx electron .
Windows shortcut: double-click Abrir-Bah.bat.
… 127.0.0.1:11434). Then download a model from inside Bah (Local AI → type a name → Download) or in a terminal (e.g. ollama pull qwen3:14b). Local works offline, but the cloud (DeepSeek) is more reliable.
The agent runs with full browser privileges, so it's worth being clear about what it does and doesn't do:
You're in control, and responsible. Bah acts in your real session, on your account. Use it within each site's terms and the law. Sensitive actions (paying, buying, deleting, entering card data) always ask for your confirmation first.
Audit it yourself. The full threat model, a verify-it-yourself checklist (each protection → the exact file), what leaves your machine, and an honest list of accepted tradeoffs all live in SECURITY.md. A map of the codebase (every file, the live agent loop, the build) is in ARCHITECTURE.md. The whole of src/ is ~30 files.
It's your real session. The browser uses a persistent partition (persist:browser), so cookies and logins are saved. If you're logged into Gmail in Bah, so is the agent. The AI can access anything you could access manually. Don't log into accounts you wouldn't trust an assistant with.
Safety brake on sensitive actions. Before paying, buying, deleting, or entering card data, the agent pauses and asks for your confirmation, and this works on every path (model clicks, coordinate clicks, Enter on a checkout page, learned shortcuts, and repeated automations). It never does those silently.
Stop means stop. The Stop button cancels immediately, even mid model-call or mid-loop; a late response won't "resurrect" a cancelled task.
No fake success. After a fill/type the agent checks the field actually holds the value; if an acti