by agent-sh
Isolated Linux desktop workspaces for AI agents — a hidden, agent-owned desktop and browser over MCP, so an agent can do GUI and web work without touching your real desktop.
# Add to your Claude Code skills
git clone https://github.com/agent-sh/agent-workspace-linux
agent-workspace-linux is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by agent-sh. It has 58 GitHub stars.
agent-workspace-linux is primarily written in Rust. It is open-source under agent-sh on GitHub, so you can review or fork the full source.
An isolated, hidden Linux desktop that an AI agent fully controls — over MCP — without ever touching your real mouse, keyboard, focus, or browser.
Agents that "use a computer" normally take over your screen — they move your mouse, steal focus, and drive your logged-in browser. agent-workspace-linux gives the agent its own desktop instead: a headless X11 display with its own window manager, apps, clipboard, and browser. The agent launches apps, types, clicks, screenshots, and browses there; you can watch (and pause) through a small floating viewer. It speaks MCP over stdio, so it drops into Claude Code, Codex, and other MCP hosts.
It is deliberately not a tool for driving your actual desktop — for that, use its sibling computer-use-linux. This one is the separate, agent-owned environment; the two are complements.
Requires Linux. Install the runtime dependencies, then build and install in one step:
sudo apt install xvfb openbox xdotool xauth x11-utils imagemagick xclip \
bubblewrap pkg-config libxkbcommon-x11-dev
./install.sh
./install.sh builds the release binary, installs it to ~/.local/bin/, and installs the bundled skill to ~/.codex/skills/ by default. It is safe to rerun. Codex MCP registration is opt-in: use --codex-configure only for generic MCP-host workflows. In Codex for Linux, use the dedicated Agent Workspaces feature page to configure the backend and permission ceiling so the generic MCP settings/configuration pages stay clean. If an older install still appears in generic MCP/configuration pages, run ./install.sh --clean-codex-config to remove the stale agent-workspace-linux server and tool tables. See install.sh --help for flags (--permissions, --clean-codex-config, --skills-dir, --no-skill, --dry-run).
It builds from source straight from git — no crates.io needed. Install the system dependencies above, then:
# latest from main
cargo install --git https://github.com/agent-sh/agent-workspace-linux
# or pin a tagged release
cargo install --git https://github.com/agent-sh/agent-workspace-linux --tag v0.1.6
That puts agent-workspace-linux on your PATH. Unlike install.sh, it installs only the binary — register it with your MCP host manually (below), and copy skills/agent-workspace-linux/ into your skills directory if you want the bundled skill.
For MCP hosts that read .mcp.json:
{
  "mcpServers": {
    "agent-workspace-linux": {
      "command": "/home/YOU/.local/bin/agent-workspace-linux",
      "args": ["mcp"]
    }
  }
}
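Because `command` must be an absolute path and `/home/YOU` is only a placeholder, a small shell sketch can write the file with your real home directory filled in. It assumes the binary lives in ~/.local/bin/ (where install.sh puts it) and that your host reads .mcp.json from the project directory. The helper name `write_mcp_json` is hypothetical and not part of the project.

```shell
# Sketch: write a .mcp.json that points at the installed binary.
# write_mcp_json is a hypothetical helper, not something the project ships.
write_mcp_json() {
  local project_dir="${1:-.}"
  local bin_path="${2:-$HOME/.local/bin/agent-workspace-linux}"

  # Overwrites any existing .mcp.json; merge by hand if you list other servers.
  # Assumption: bin_path contains no quotes or backslashes that need JSON escaping.
  cat > "$project_dir/.mcp.json" <<EOF
{
  "mcpServers": {
    "agent-workspace-linux": {
      "command": "$bin_path",
      "args": ["mcp"]
    }
  }
}
EOF
}
```

Calling `write_mcp_json /path/to/project` would produce the same structure as the snippet above, with the expanded path in place of `/home/YOU`.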
Or install the npm wrapper, which downloads the matching prebuilt Linux binary:
npm install -g @agent-sh/agent-workspace-linux
The npm wrapper downloads agent-workspace-linux-<target> from the matching GitHub Release and verifies the required agent-workspace-linux-<target>.sha256 sidecar before installing it.
Prebuilt x86_64 and aarch64 Linux binaries are also attached to each GitHub Release with their .sha256 sidecars: download the one for your architecture, verify it with sha256sum -c, chmod +x, and put it on your PATH.
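The manual steps (verify, make executable, put on PATH) can be sketched as one function. This is a minimal illustration, not the project's installer. It assumes the sidecar sits next to the downloaded binary, is named `<binary>.sha256`, and uses the standard `sha256sum` line format with a bare filename. The name `install_verified` and the default destination are assumptions made for the example.

```shell
# Sketch: verify a downloaded release binary against its .sha256 sidecar,
# then install it. install_verified is a hypothetical helper name.
install_verified() {
  local binary="$1"                        # e.g. ./agent-workspace-linux-<target>
  local dest_dir="${2:-$HOME/.local/bin}"  # assumed default; any PATH directory works

  # Run the check from the binary's directory so the bare filename in the
  # sidecar resolves. The subshell keeps the caller's working directory intact.
  (
    cd "$(dirname "$binary")" &&
    sha256sum -c "$(basename "$binary").sha256"
  ) || { echo "checksum verification failed: $binary" >&2; return 1; }

  # Only reached when the checksum matched: copy with the executable bit set.
  mkdir -p "$dest_dir" &&
  install -m 0755 "$binary" "$dest_dir/agent-workspace-linux"
}
```

Nothing is copied unless the checksum matches, so a truncated or tampered download never lands on your PATH.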
# 1. Ask the runtime what this machine can do (deps, display, sandbox backends)
agent-workspace-linux doctor
# 2. Preview a workspace without creating anything
agent-workspace-linux workspace start --dry-run
# 3. Create the hidden workspace (explicit acknowledgement required)
agent-workspace-linux workspace start --ack-hidden-workspace --purpose "QA run"
# 4. Watch it in the floating viewer
agent-workspace-linux viewer
# 5. Launch an app, see it, then stop the workspace
agent-workspace-linux workspace launch --name editor -- xterm
agent-workspace-linux workspace observe --screenshot --output /tmp/ws.png
agent-workspace-linux workspace stop
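When scripting these steps instead of typing them, it helps to guarantee that the workspace is stopped even if a middle step fails. The sketch below uses only the subcommands shown above. It assumes each subcommand exits non-zero on failure and that `workspace launch` returns once the app has started; neither is confirmed here. `AWL_BIN` is a hypothetical override variable introduced for the example.

```shell
# Sketch: run the quick-start steps as one function and always stop the workspace.
# AWL_BIN is a hypothetical override; by default the binary is found on PATH.
workspace_smoke_test() {
  local bin="${AWL_BIN:-agent-workspace-linux}"
  local screenshot="${1:-/tmp/ws.png}"
  local status=0

  # Bail out early if the machine is not ready or the workspace cannot start.
  "$bin" doctor || return 1
  "$bin" workspace start --ack-hidden-workspace --purpose "smoke test" || return 1

  # Assumption: each subcommand exits non-zero on failure.
  "$bin" workspace launch --name editor -- xterm &&
    "$bin" workspace observe --screenshot --output "$screenshot" ||
    status=$?

  # Stop runs whether or not the steps above succeeded.
  "$bin" workspace stop || status=$?
  return "$status"
}
```

Running `workspace_smoke_test /tmp/ws.png` returns the first failing step's exit status, or 0 when everything succeeded.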
Through an MCP host you don't run these by hand — the agent calls the matching tools. Start it via the bundled skill so the agent loads only the tools it needs.
The single most important thing to understand is who sets the limits in each scenario — and the project is explicit about it:
| Scenario | Who sets the boundary | What is enforced | Can it be overridden at runtime? |
|---|---|---|---|
| Default (no --permissions) | Your agent host (Claude Code, Codex, …) | The MCP adds no ceiling of its own and defers to the host's approval flow. One explicit hidden-workspace acknowledgement scopes workspace-local actions to that environment. | Yes: the host/user owns approvals. |
| Developer ceiling (--permissions file.json or AGENT_WORKSPACE_PERMISSIONS env) | The developer / operator who launched the MCP | Network mode, mount paths, and an app allowlist, enforced at both the MCP front-end and the workspace daemon's IPC socket, so even workspace-launched apps and other same-uid processes are capped. | No: only by restarting the MCP with new config. This is the authoritative boundary. |
| Live viewer control (pause / read-only) | The human watching, in real time | Honors a runtime pause when the shared control state is readable; if the daemon cannot read that state, mutating IPC fails closed while inspection and stop remain available. | It's a convenience layer, not the security boundary — the ceiling above is. |
| Workspace vs. host | The runtime | Input, screenshots, windows, clipboard, and browser control target the hidden workspace only — never your real desktop or host Chrome. | Leakage to the host is a reportable bug. |
In short: by default the agent host owns permission, a developer can lock a hard, daemon-enforced ceiling via flag or env, and the viewer gives a human live pause/read-only/stop controls — layered, not conflicting. See docs/permission-model.md and SECURITY.md for the full model and trust assumptions.
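Since the ceiling is fixed when the MCP process starts, an operator may want a launcher that refuses to start without one. The sketch below is an illustration under stated assumptions: it treats `AGENT_WORKSPACE_PERMISSIONS` as holding the path to the JSON file, which the text above does not spell out, and it uses the `mcp` subcommand from the .mcp.json example. `start_mcp_with_ceiling` and `AWL_BIN` are hypothetical names. The file's schema is not shown because it is documented in docs/permission-model.md, not here.

```shell
# Sketch: start the MCP server only when a permissions ceiling file is readable.
# start_mcp_with_ceiling and AWL_BIN are hypothetical names for illustration.
start_mcp_with_ceiling() {
  local permissions_file="$1"
  local bin="${AWL_BIN:-agent-workspace-linux}"

  # Fail closed in the wrapper itself, whatever the binary would do with a
  # missing file, so a typo in the path cannot start a server with no ceiling.
  if [ ! -r "$permissions_file" ]; then
    echo "refusing to start: no readable ceiling at $permissions_file" >&2
    return 1
  fi

  # Assumption: the variable holds the path to the permissions JSON file.
  AGENT_WORKSPACE_PERMISSIONS="$permissions_file" "$bin" mcp
}
```

Changing the ceiling then means editing the file and restarting through the same wrapper, which matches the "restart with new config" rule in the table.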
Xvfb display + window manager + control socket. Apps launched into it attach to that display, not your session. Creating one requires --ack-hidden-workspace so it is never silent.
network, mounts, apps). When set, it is enforced for the life of the MCP process. Mount and network isolation are applied with bubblewrap when available.
profile template project-dev or browser-session.
For Slack/GitHub account-session validation, keep the only human step explicit: approve the Chrome/Chromium user-data directory, then run the opt-in helper:
node scripts/mcp_real_profile_browser_session_dogfood.js \
--approved-user-data-dir ~/.config/google-chrome \
--site github
The helper refuses obvious active-profile hazards, copies the approved profile to
a disposable directory, launches the current browser-session template through
the repo-owned MCP path, and proves the logged-in page with workspace browser
tools instead of the host Chrome bridge. Use --site slack for Slack, set
BROWSER_BIN=chromium when needed, and rerun with --expect-text for account
or workspace-specific page text.
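The variations described above (Slack instead of GitHub, a Chromium binary, expected page text) can be combined in a thin wrapper. This sketch assumes `--expect-text` takes the expected text as its value and that you run it from the repository root. It accepts only the two sites the text mentions, though the helper itself may support more. `validate_account_session` is a hypothetical name.

```shell
# Sketch: wrap the opt-in helper so the site and approved profile are explicit.
# validate_account_session is a hypothetical name; run it from the repo root.
validate_account_session() {
  local site="$1"            # github or slack
  local profile_dir="$2"     # the user-data directory you approved
  local expect_text="${3:-}" # optional account- or workspace-specific text

  case "$site" in
    github|slack) ;;
    *) echo "unsupported site: $site" >&2; return 2 ;;
  esac

  # Start from the required flags, then append the optional one.
  set -- --approved-user-data-dir "$profile_dir" --site "$site"
  # Assumption: --expect-text takes the expected page text as its value.
  if [ -n "$expect_text" ]; then
    set -- "$@" --expect-text "$expect_text"
  fi

  node scripts/mcp_real_profile_browser_session_dogfood.js "$@"
}
```

For example, `BROWSER_BIN=chromium validate_account_session slack "$HOME/.config/chromium" "Acme"` would check a Slack session from a Chromium profile, where "Acme" stands in for text you expect on the logged-in page.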
The MCP exposes ~86 tools. To avoid dumping them all into the agent's context, it ships a skill at skills/agent-workspace-linux/SKILL.md. Only the skill's short description stays loaded; when a task needs an isolated desktop or browser, the agent reads the skill and it routes to the right tools per phase (orient → start → observe → act → stop), loading tool schemas on demand. ./install.sh installs it to