agent-workspace-linux

Name: agent-workspace-linux
Author: agent-sh

Isolated Linux desktop workspaces for AI agents — a hidden, agent-owned desktop and browser over MCP, so an agent can do GUI and web work without touching your real desktop.

58 stars

7 forks

Rust

Installation

# Add to your Claude Code skills
git clone https://github.com/agent-sh/agent-workspace-linux

Frequently Asked Questions

What is agent-workspace-linux?

agent-workspace-linux is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by agent-sh. It provides isolated Linux desktop workspaces for AI agents: a hidden, agent-owned desktop and browser over MCP, so an agent can do GUI and web work without touching your real desktop. It has 58 GitHub stars.

Is agent-workspace-linux safe to use?

agent-workspace-linux's catalog security scan is still queued. You can run an instant dependency and prompt-injection check now with the "Scan for vulnerabilities" button above.

How do I install agent-workspace-linux?

Clone the repository with "git clone https://github.com/agent-sh/agent-workspace-linux" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is agent-workspace-linux written in?

agent-workspace-linux is primarily written in Rust. It is open-source under agent-sh on GitHub, so you can review or fork the full source.

agent-workspace-linux

An isolated, hidden Linux desktop that an AI agent fully controls — over MCP — without ever touching your real mouse, keyboard, focus, or browser.

Agents that "use a computer" normally take over your screen — they move your mouse, steal focus, and drive your logged-in browser. agent-workspace-linux gives the agent its own desktop instead: a headless X11 display with its own window manager, apps, clipboard, and browser. The agent launches apps, types, clicks, screenshots, and browses there; you can watch (and pause) through a small floating viewer. It speaks MCP over stdio, so it drops into Claude Code, Codex, and other MCP hosts.

Why this project

Use it when an agent needs to QA a GUI app or a website but must not hijack your live desktop or Chrome session.
Use it when you want browser/web/shopping automation in a throwaway, isolated profile — observable and stoppable.
Use it when you need a clean Linux desktop to run, screenshot, and inspect an app, then tear it down.
Use it when a long-running or headless agent needs a desktop it can drive without a human babysitting the real one.

It is deliberately not a tool for driving your actual desktop — for that, use its sibling computer-use-linux. This one is the separate, agent-owned environment; the two are complements.

Install

Requires Linux. Install the runtime dependencies, then build and install in one step:

sudo apt install xvfb openbox xdotool xauth x11-utils imagemagick xclip \
    bubblewrap pkg-config libxkbcommon-x11-dev
./install.sh

./install.sh builds the release binary, installs it to ~/.local/bin/, and installs the bundled skill to ~/.codex/skills/ by default. It is safe to rerun.

Codex MCP registration is opt-in: pass --codex-configure only for generic MCP-host workflows. In Codex for Linux, configure the backend and permission ceiling on the dedicated Agent Workspaces feature page instead, so the generic MCP settings/configuration pages stay clean. If an older install still appears in those generic pages, run ./install.sh --clean-codex-config to remove the stale agent-workspace-linux server and tool tables. See ./install.sh --help for the available flags (--permissions, --clean-codex-config, --skills-dir, --no-skill, --dry-run).
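Before building, it can help to confirm that the runtime tools from the apt line are actually on PATH. The following is a minimal sketch, not part of the project: the package-to-executable mapping is an assumption based on what these packages usually ship on Debian/Ubuntu, and the build-only packages (pkg-config, libxkbcommon-x11-dev) are left out. Once the binary is installed, `agent-workspace-linux doctor` reports dependencies itself.

```python
import shutil
from typing import Callable, Optional

# Assumed mapping from the apt packages above to one executable each package
# usually provides on Debian/Ubuntu. Illustrative only, not the project's list.
PACKAGE_BINARIES = {
    "xvfb": "Xvfb",
    "openbox": "openbox",
    "xdotool": "xdotool",
    "xauth": "xauth",
    "x11-utils": "xdpyinfo",
    "imagemagick": "convert",
    "xclip": "xclip",
    "bubblewrap": "bwrap",
}


def missing_packages(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the packages whose representative executable is not on PATH."""
    return sorted(pkg for pkg, exe in PACKAGE_BINARIES.items() if which(exe) is None)


if __name__ == "__main__":
    missing = missing_packages()
    print("all runtime tools found" if not missing else "missing: " + " ".join(missing))
```

The lookup function is injectable so the check can be exercised without touching the real PATH.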

Install with cargo (from source)

cargo can build and install the binary straight from the git repository, so no crates.io release is needed. Install the system dependencies above, then:

# latest from main
cargo install --git https://github.com/agent-sh/agent-workspace-linux
# or pin a tagged release
cargo install --git https://github.com/agent-sh/agent-workspace-linux --tag v0.1.6

That puts agent-workspace-linux on your PATH. Unlike install.sh, it installs only the binary — register it with your MCP host manually (below), and copy skills/agent-workspace-linux/ into your skills directory if you want the bundled skill.

For MCP hosts that read .mcp.json:

{
  "mcpServers": {
    "agent-workspace-linux": {
      "command": "/home/YOU/.local/bin/agent-workspace-linux",
      "args": ["mcp"]
    }
  }
}
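If the host already has a .mcp.json with other servers, the entry above needs to be merged rather than pasted over the file. A minimal sketch of that merge, assuming the `mcpServers` layout shown above; the helper names `mcp_config` and `merge_into` are hypothetical and not part of the project:

```python
import json
from pathlib import Path


def mcp_config(binary: Path) -> dict:
    """Build the .mcp.json structure shown above for a given binary path."""
    return {
        "mcpServers": {
            "agent-workspace-linux": {
                "command": str(binary),
                "args": ["mcp"],
            }
        }
    }


def merge_into(existing: dict, binary: Path) -> dict:
    """Add or replace only this server entry, keeping other servers intact."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(mcp_config(binary)["mcpServers"])
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # ~/.local/bin is where install.sh places the binary by default.
    default_binary = Path.home() / ".local" / "bin" / "agent-workspace-linux"
    print(json.dumps(mcp_config(default_binary), indent=2))
```

Resolving the path with `Path.home()` avoids hand-editing the `/home/YOU/` placeholder.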

Or install the npm wrapper, which downloads the matching prebuilt Linux binary:

npm install -g @agent-sh/agent-workspace-linux

The npm wrapper downloads agent-workspace-linux-<target> from the matching GitHub Release and verifies the required agent-workspace-linux-<target>.sha256 sidecar before installing it.

Prebuilt x86_64 and aarch64 Linux binaries are also attached to each GitHub Release with their .sha256 sidecars — download the one for your architecture, verify it with sha256sum -c, chmod +x, and put it on your PATH.
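The manual check above is what `sha256sum -c` does: hash the file and compare it with the digest recorded in the sidecar. A minimal sketch of the same comparison, assuming the sidecar uses the usual sha256sum line format (hex digest, whitespace, file name); the function names are hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(binary: Path, sidecar: Path) -> bool:
    """Check a file against a sha256sum-style sidecar: "<hex digest>  <name>"."""
    fields = sidecar.read_text().split()
    if not fields:
        return False  # an empty sidecar never verifies
    expected = fields[0].lower()
    return len(expected) == 64 and sha256_of(binary) == expected
```

Only mark the download executable and move it onto PATH once this returns True.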

Quick start

# 1. Ask the runtime what this machine can do (deps, display, sandbox backends)
agent-workspace-linux doctor

# 2. Preview a workspace without creating anything
agent-workspace-linux workspace start --dry-run

# 3. Create the hidden workspace (explicit acknowledgement required)
agent-workspace-linux workspace start --ack-hidden-workspace --purpose "QA run"

# 4. Watch it in the floating viewer
agent-workspace-linux viewer

# 5. Launch an app, see it, then stop the workspace
agent-workspace-linux workspace launch --name editor -- xterm
agent-workspace-linux workspace observe --screenshot --output /tmp/ws.png
agent-workspace-linux workspace stop

Through an MCP host you don't run these by hand — the agent calls the matching tools. Start it via the bundled skill so the agent loads only the tools it needs.
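For scripted runs outside an MCP host, the same sequence can be expressed as argument lists rather than shell strings, which avoids quoting problems in values such as the purpose text. This is a minimal sketch that only uses the subcommands and flags shown above; the function name is hypothetical, and the interactive viewer step is omitted because it is meant for a human.

```python
import shlex


def quick_start_plan(purpose: str, app: list[str], screenshot: str,
                     name: str = "editor") -> list[list[str]]:
    """Return the quick-start steps above as argv lists, in order."""
    cli = "agent-workspace-linux"
    return [
        [cli, "doctor"],
        [cli, "workspace", "start", "--dry-run"],
        [cli, "workspace", "start", "--ack-hidden-workspace", "--purpose", purpose],
        [cli, "workspace", "launch", "--name", name, "--", *app],
        [cli, "workspace", "observe", "--screenshot", "--output", screenshot],
        [cli, "workspace", "stop"],
    ]


if __name__ == "__main__":
    # Print the plan as copy-pasteable shell lines; nothing is executed here.
    for argv in quick_start_plan("QA run", ["xterm"], "/tmp/ws.png"):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` so a failing step stops the run.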

Who controls the boundaries

The single most important thing to understand is who sets the limits in each scenario — and the project is explicit about it:

| Scenario | Who sets the boundary | What is enforced | Can it be overridden at runtime? |
| --- | --- | --- | --- |
| Default (no `--permissions`) | Your agent host (Claude Code, Codex, …) | The MCP adds no ceiling of its own and defers to the host's approval flow. One explicit hidden-workspace acknowledgement scopes workspace-local actions to that environment. | Yes. The host/user owns approvals. |
| Developer ceiling (`--permissions file.json` or `AGENT_WORKSPACE_PERMISSIONS` env) | The developer / operator who launched the MCP | Network mode, mount paths, and an app allowlist, enforced at both the MCP front-end and the workspace daemon's IPC socket, so even workspace-launched apps and other same-uid processes are capped. | No. Only by restarting the MCP with new config. This is the authoritative boundary. |
| Live viewer control (pause / read-only) | The human watching, in real time | Honors a runtime pause when the shared control state is readable; if the daemon cannot read that state, mutating IPC fails closed while inspection and stop remain available. | It's a convenience layer, not the security boundary; the ceiling above is. |
| Workspace vs. host | The runtime | Input, screenshots, windows, clipboard, and browser control target the hidden workspace only, never your real desktop or host Chrome. | Leakage to the host is a reportable bug. |

In short: by default the agent host owns permission, a developer can lock a hard, daemon-enforced ceiling via flag or env, and the viewer gives a human live pause/read-only/stop controls — layered, not conflicting. See docs/permission-model.md and SECURITY.md for the full model and trust assumptions.
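The layering can be pictured as a short decision function. This is a toy model under stated assumptions, not the project's implementation or its permission schema: the `Ceiling` class, the action names, and the rule that only mutating actions need host approval are all invented for illustration, and only the app allowlist part of the ceiling is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ceiling:
    """Hypothetical stand-in for a developer ceiling; only the app allowlist is modeled."""
    allowed_apps: frozenset = frozenset()


def decide(action: str, app: Optional[str], ceiling: Optional[Ceiling],
           pause_state: Optional[bool], host_approved: bool) -> str:
    """Return "allow" or a "deny: ..." reason for one workspace action.

    action: "inspect", "stop", "launch", or "input" (illustrative names).
    pause_state: True/False from the viewer, or None if the state is unreadable.
    """
    mutating = action in {"launch", "input"}
    # Layer 1: a configured ceiling is checked first and has no runtime override.
    if ceiling is not None and action == "launch" and app not in ceiling.allowed_apps:
        return "deny: app not in ceiling allowlist"
    # Layer 2: unreadable control state fails closed for mutating actions only.
    if mutating and pause_state is None:
        return "deny: control state unreadable"
    if mutating and pause_state:
        return "deny: paused by viewer"
    # Layer 3: with no other objection, the agent host's approval flow decides.
    if mutating and not host_approved:
        return "deny: host did not approve"
    return "allow"
```

With no ceiling configured, `decide("launch", "xterm", None, False, True)` returns "allow", which mirrors the default row of the table: the host's approval is the only gate.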

Core concepts

Hidden workspace — a private Xvfb display + window manager + control socket. Apps launched into it attach to that display, not your session. Creating one requires --ack-hidden-workspace so it is never silent.
Permission ceiling — optional, declared in JSON (network, mounts, apps). When set, it is enforced for the life of the MCP process. Mount and network isolation are applied with bubblewrap when available.
Profiles — reusable workspace definitions (mounts, network mode, setup commands, startup apps), e.g. profile template project-dev or browser-session.
Viewer — a small GPUI window that shows workspace state and a live screen view, with pause / read-only / stop controls. It is not always-on-top by default.
Workspace browser — workspace-owned Chrome/Chromium reached over a loopback DevTools endpoint, so browser automation never attaches to your host Chrome.
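The loopback property of the workspace browser is easy to check from client code before attaching to a DevTools endpoint. A minimal sketch, assuming the endpoint is available as a URL; the function name is hypothetical and the example URL in the usage note is a placeholder, not the project's actual address:

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_endpoint(url: str) -> bool:
    """True when the URL's host is a loopback address or the name "localhost"."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is not treated as loopback
```

For example, `is_loopback_endpoint("http://127.0.0.1:9222/json/version")` is True, while an endpoint on a LAN address is rejected.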

Real-profile browser-session dogfood

For Slack/GitHub account-session validation, keep the only human step explicit: approve the Chrome/Chromium user-data directory, then run the opt-in helper:

node scripts/mcp_real_profile_browser_session_dogfood.js \
  --approved-user-data-dir ~/.config/google-chrome \
  --site github

The helper refuses obvious active-profile hazards, copies the approved profile to a disposable directory, launches the current browser-session template through the repo-owned MCP path, and proves the logged-in page with workspace browser tools instead of the host Chrome bridge. Use --site slack for Slack, set BROWSER_BIN=chromium when needed, and rerun with --expect-text for account or workspace-specific page text.
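The refuse-then-copy step can be sketched as follows. This is an illustration of the idea, not the helper's actual logic: which hazards the real script checks is not documented here, so the sketch assumes the `Singleton*` entries that Chrome/Chromium on Linux keep in a user-data directory while it is in use. A stale entry left by a crash would also trigger the refusal.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Entries Chrome/Chromium on Linux keep in a user-data directory while a
# browser instance is using it (assumed here to be the "active" signal).
LOCK_NAMES = ("SingletonLock", "SingletonSocket", "SingletonCookie")


def copy_profile_to_disposable(approved_dir: Path) -> Path:
    """Copy an approved user-data directory to a fresh temporary directory.

    Raises RuntimeError when the source is missing or looks like it is in use.
    """
    approved_dir = approved_dir.expanduser()
    if not approved_dir.is_dir():
        raise RuntimeError(f"not a directory: {approved_dir}")
    # lexists also catches an entry that is a dangling symlink.
    active = [n for n in LOCK_NAMES if os.path.lexists(approved_dir / n)]
    if active:
        raise RuntimeError(f"profile appears active ({', '.join(active)}); close the browser first")
    target = Path(tempfile.mkdtemp(prefix="ws-profile-")) / "user-data"
    shutil.copytree(approved_dir, target, symlinks=True)
    return target
```

The original directory is only read, so the workspace browser works on the copy and the approved profile is left unchanged.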

The skill (progressive tool loading)

The MCP exposes ~86 tools. To avoid dumping them all into the agent's context, it ships a skill at skills/agent-workspace-linux/SKILL.md. Only the skill's short description stays loaded; when a task needs an isolated desktop or browser, the agent reads the skill and it routes to the right tools per phase (orient → start → observe → act → stop), loading tool schemas on demand. ./install.sh installs it to