AI cursor assistant for macOS, Windows, Linux. Hold a hotkey, ask a question about what's on your screen, get an instant answer. Vision-capable AI overlay with OpenRouter, Anthropic, OpenAI, Gemini. Local-first, BYO key, no telemetry.
If AIPointer ⦿ is useful to you,
a star 🌟 on GitHub helps the project stay alive.
What it is
AIPointer is an open-source desktop overlay. You hold a key (default: Right-Cmd on macOS, Right-Ctrl on Windows/Linux), a glassmorphism box pops up next to your cursor, you ask a question, and a vision-capable LLM answers about whatever's around the cursor. A screenshot of the cursor region, your prompt, and (optionally) your clipboard get sent to the provider you configured. You keep your own API key, you pay for your own tokens, nothing is logged anywhere.
BSL-1.1 source-available, no framework lock-in, no cloud account. For longer autonomous tasks, AIPointer points you at Skales, the same author's larger AI agent.
Common use cases
AIPointer is useful when you want to ask an AI about something on your screen without copying, pasting, or switching apps:
Quick translation of text you're reading in any app or website
Explain code in your editor without leaving it
Identify an object, product, landmark, or chart from a screenshot
Summarize a long article or document you have open
Get a reply suggestion for a message visible on screen
Define a word without leaving your current app
Solve a math or logic problem by pointing at it
Voice queries when typing isn't convenient
It works as an alternative to switching to ChatGPT, Claude, or Gemini in a browser tab — you stay where you are, the answer comes to you.
First-time install walkthrough for Claude Code, Codex CLI, and ChatGPT.
Cursor-anchored. It answers about what you're already looking at, not what you have to describe in text.
Fast. Vision-capable Gemini 3 Flash by default. Sub-2-second answers for most questions.
Multi-provider. Bring your own OpenRouter key (recommended) or a direct Anthropic, OpenAI, or Google Gemini key. Fallback chain handles outages automatically.
Agentic when it helps. Seven built-in tools: fetch a URL, open a URL, copy text, save the answer as a styled document, reveal your workspace, read clipboard, and (new in v1.1.0) launch a desktop app from a curated whitelist. All action tools sit behind a green-check / red-x approval.
Child Mode (v1.1.0). Kid-friendly response layer. PIN-locked switch, per-language safe-browsing whitelist (EN+DE), stricter HTML rendering, restricted tool set, voice-first by default with slower TTS. Visually identical to Adult mode — pure behavior layer.
Onboarding wizard (v1.1.0). Separate framed window on first launch. Five steps: welcome, mode pick, PIN setup, autostart preferences, provider + API key.
Voice commands (v1.1.1). Say "open settings" or "einstellungen öffnen" — works in EN, DE, FR, ES, IT, PT, NL. Pre-LLM phrase matcher with Levenshtein tolerance for transcription wobble. New voice tools: "play " / "spiele Bach" (YouTube / YouTube Kids in Child Mode), "search for X" / "such nach X" (DuckDuckGo / Kiddle), "louder / quieter / mute / volume 30" / "lauter / leiser / stumm" for system volume. Say "stop" / "halt" / "sei ruhig" to interrupt the voice loop.
Chat-only mode (v1.1.2). Settings → Behaviour. Toggle on to open AIPointer as a pure chat window — no screenshot auto-attached. A small camera button inside the prompt lets you attach one per query when you want visual context. Good for follow-up questions, quick lookups, and people who already know what they want to ask.
Cursor accent picker (v1.1.2). Settings → Appearance → Cursor accent. Three brand presets — lime (default), macOS system blue, or cyberpunk magenta. The comet trail + halo and every accent-tinted UI surface re-tint live without a restart.
Floating chip bubbles (v1.1.2). Screenshot and Clipboard chips now render as toast-style rounded-full pills floating above the BottomPill — same frosted micro-grain + premium shadow as the pill, high-opacity tint with white text so they stay readable over any backdrop (light AIPointer on a dark website, dark AIPointer on a bright editor — all fine).
TTS pause / resume (v1.1.2). The Read button under each response is now a real play / pause toggle: click to play, click to pause at the exact position, click again to resume. Stop-words ("stop" / "halt" / "sei ruhig"), ESC, and a new query still fully end playback.
Activation section in Settings (v1.1.2). Keyboard hotkey + mouse-wiggle on/off toggle, both live-applied. The wiggle has been there since v1.1.0 but had no UI control — now it does.
Screenshot + Clipboard chips (v1.1.1). macOS-blue "Screenshot attached" chip (default on, dismiss with ×) and a paperclip toggle for opt-in clipboard. Explicit per-query control replaces the v1.0 always-on clipboard polling.
Theme picker (v1.1.1). Settings → Appearance: System (follows your OS), Light, or Dark. Pick AIPointer's appearance independently of macOS / Windows — keep your system on light and run AIPointer dark (or vice versa). The picked theme survives macOS appearance changes (fixed in v1.1.2).
Premium pill surface (v1.1.1). The BottomPill / PromptInput render as a macOS-Control-Center-style tile with a fractalNoise micro-grain and a five-layer shadow for perceived depth. Same warm-grey light surface (#f5f5f5) and Claude-sidebar dark surface (#262626) macOS users expect.
Mouse-wiggle activation (v1.1.0). Wiggle the cursor left-right-left within ~700ms to summon AIPointer. Toggle off in Settings.
Voice in and voice out. Microphone input with auto-stop on silence. Text-to-speech read-aloud for the answer. Optional voice-first conversation mode: trigger opens listening, transcript auto-submits, answer is read back, mic re-opens for follow-up. Hands-free loop.
Region screenshots. While the box is open, hold the trigger key again and drag a rectangle to capture exactly the region you want. Replaces the default cursor-centered crop for that query.
API key tester. A Test button per provider checks the key against the live endpoint. For OpenRouter the result includes your remaining credit balance.
Templates that work./summary, /brief, /translate, /explain, /code, /improve, /define, /solve, /reply, /identify. Plain prose works too. AIPointer detects intent across languages.
Save the answer. A4-styled HTML, PDF, or Markdown. Print-ready.
Auto-updater. Quiet check on launch and once per day. Downloads the next version when ready, prompts you on restart. Disable in settings if you prefer manual updates.
Local-first. No telemetry, no analytics, no crash reporting. Your config lives on your machine, your sessions stay on disk if you opt in.
Light and dark theme. Adapts to your system in real time.
Install
Download a signed build for your OS at aipointer.app, or build from source:
git clone https://github.com/gonemedia/aipointer
cd aipointer
npm install
npm run dev
On first launch you'll be prompted to grant Accessibility (for the global key listener) and Screen Recording (for the cursor-region screenshot). Both are required on macOS. The Settings panel opens automatically the first time so you can paste an API key. Get one at:
openrouter.ai/keys recommended, single key, automatic routing across providers.
Hold Right-Cmd (macOS) or Right-Ctrl (Windows / Linux) for 200 ms over the content you want help with. Or click the small status pill at the bottom of your screen.
The box opens beside your cursor. Type a question, press Enter.
Use / to see the commands. Plain language works too. AIPointer auto-detects the intent (summarise, translate, explain, code, etc.) across languages.
Hover the answer for Read (text-to-speech), Save (HTML/PDF/MD), Copy.
Press ESC to dismiss. Open /settings any time to adjust providers, hotkey, voice, profile, workspace.
Commands
/web <question> force a web-grounded answer for this one query.
/summary <topic> long structured doc, ready to save as A4 PDF.
/brief <topic> TL;DR + 3-5 bullets.
/translate [language] <text> translate visible, selected, or pasted text.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.