OpenGUI

Name: OpenGUI
Author: Core-Mate

Verified

OpenGUI is an Android GUI agent framework for phone-use AI that can see, plan, and operate real mobile apps through the GUI.

Stars: 624

Forks: 20

Language: Kotlin

Installation

# Add to your Claude Code skills
git clone https://github.com/Core-Mate/OpenGUI

Security Report (Verified)

Last scanned: 5/30/2026

{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-05-30T16:18:49.321Z",
  "npmAuditRan": true,
  "pipAuditRan": true
}
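
Because the report is plain JSON, a script can gate installation on it. The following is a minimal TypeScript sketch, assuming the field names shown above stay stable. The `ScanReport` type and `isCleanScan` helper are hypothetical names for illustration, not part of SkillsLLM or OpenGUI.

```typescript
// Shape of the scan report as printed above; the type name is hypothetical.
interface ScanReport {
  issues: unknown[];
  status: string;
  scannedAt: string;
  npmAuditRan: boolean;
  pipAuditRan: boolean;
}

// Treat a report as clean only if it passed, lists no issues,
// and both dependency audits actually ran.
function isCleanScan(r: ScanReport): boolean {
  return (
    r.status === "PASSED" &&
    r.issues.length === 0 &&
    r.npmAuditRan &&
    r.pipAuditRan
  );
}

const scanReport: ScanReport = JSON.parse(`{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-05-30T16:18:49.321Z",
  "npmAuditRan": true,
  "pipAuditRan": true
}`);

console.log(isCleanScan(scanReport)); // true
```

Requiring both audit flags is a deliberate choice in this sketch: a "PASSED" status means less if one of the audits never ran.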

Frequently Asked Questions

What is OpenGUI?

OpenGUI is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by Core-Mate. It is an Android GUI agent framework for phone-use AI that can see, plan, and operate real mobile apps through the GUI. It has 624 GitHub stars.

Is OpenGUI safe to use?

Yes. OpenGUI passed SkillsLLM's automated security scan, which combines a dependency vulnerability audit with prompt-injection heuristics, and the scan found no high-severity issues. You can read the full report in the Security Report section on this page.

How do I install OpenGUI?

Clone the repository with "git clone https://github.com/Core-Mate/OpenGUI" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is OpenGUI written in?

OpenGUI is primarily written in Kotlin. It is open-source under Core-Mate on GitHub, so you can review or fork the full source.

Are there alternatives to OpenGUI?

Yes. SkillsLLM lists many other AI Agents skills that you can browse and compare with OpenGUI side by side.

Demo

OpenGUI reads a real Android app UI, plans the next step, takes mobile actions, and returns structured results.

Quick Start

The fastest way to try OpenGUI is to let Claude Code or Codex bootstrap it for you.

Read ./skills/open-gui-bootstrap/SKILL.md and help me run OpenGUI. Only ask me for phone-side actions.

You will need:

an Android phone or emulator
USB debugging enabled
AccessibilityService enabled
model API keys for real task execution

OpenGUI will use the repository scripts to start the backend and install the Android client:

cd server
./start.sh

cd client
./start.sh

After the backend and Android client are running, send a first task:

cd server
pnpm opengui -- devices --json
pnpm opengui -- do "Observe the current Android screen and summarize what you see" --json

Manual setup guide: docs/get-started.md.
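
When the CLI is driven from a script rather than typed by hand, it helps to build the argument list programmatically. The following is a minimal TypeScript sketch that uses only the subcommands and the `--json` flag shown above. The `OpenGuiCommand` type and `buildPnpmArgs` helper are hypothetical names, not part of the OpenGUI codebase.

```typescript
// Hypothetical helper: builds the argument list for the
// `pnpm opengui -- ...` commands shown in this section.
type OpenGuiCommand =
  | { kind: "devices" }
  | { kind: "do"; task: string };

function buildPnpmArgs(command: OpenGuiCommand): string[] {
  const base = ["opengui", "--"];
  switch (command.kind) {
    case "devices":
      return [...base, "devices", "--json"];
    case "do":
      if (command.task.trim() === "") {
        throw new Error("task description must not be empty");
      }
      // The task stays a single argument, so no shell quoting is needed.
      return [...base, "do", command.task, "--json"];
  }
}

console.log(buildPnpmArgs({ kind: "devices" }).join(" "));
// opengui -- devices --json
```

The resulting array can be passed to Node's `child_process.spawn("pnpm", args, { cwd: "server" })`, which avoids shell escaping of the task text.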

Recent Updates

[2026.5.16] Added Codex / Claude Code remote control through a local REST API, the pnpm opengui -- ... CLI, and the open-gui-remote-control Skill, which dispatches Android app tasks from coding agents.
[2026.5.9] Added a Discord IM channel for remote Android task dispatch, including prefix commands, slash commands, allowlists, and guild-scoped command registration.
[2026.5.7] Hardened local startup to avoid common PostgreSQL and Redis port conflicts during Docker-based backend setup.
[2026.5.1] Improved backend onboarding with .env.example, startup checks, and graph-agent VLM environment configuration.

What You Can Do with OpenGUI

OpenGUI provides an Android GUI agent stack for screen understanding, task planning, action execution, review, and recovery.

You can use the same repository in five practical ways:

Operate mainstream Android apps: let AI handle mobile tasks inside X, Reddit, Hacker News, Telegram, WeChat, Weibo, Xiaohongshu, and other Android apps on a real phone.
Run shipped workflows: the repository already includes a runnable backend, Android client, standby dispatch path, and a set of built-in task capabilities.
Let Claude or Codex bootstrap it for you: point the model at skills/open-gui-bootstrap/SKILL.md, describe the goal in plain language, and let it handle setup, build, install, and local debugging.
Let Codex or Claude Code control Android apps: after OpenGUI is running, point Codex or Claude Code at skills/open-gui-remote-control/SKILL.md to list devices, dispatch tasks, and track executions through the local CLI.
Operate phones as remote workers: dispatch tasks through Feishu, Telegram, Discord, or REST API, keep devices on standby, and get structured results back from the backend.

Highlights

Built for long-running tasks: OpenGUI is designed for mobile workflows that may run for hours, with progress, review, and recovery kept inside the system.
Plan before action, summarize after execution: before touching an app, OpenGUI breaks the goal into executable steps; after the run, it returns a structured summary of what happened, what worked, and what still needs attention.
The task can keep moving: Plan Supervisor maintains task state and continuation, Executor Graph runs screenshot, vision, action, and call-user loops on top of live device state, and Summarizer closes the run with a structured result.
Phones can stay on standby: the standby dispatch path lets devices receive remote work through Feishu, Telegram, Discord, or REST entry points.
Models can be assigned by role: model routing separates planning from VLM execution so teams can choose providers by job.
The system is organized around real mobile workflows: the graph, device execution path, and model split already exist in the source tree.
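
The plan, execute, and summarize cycle described in these highlights can be sketched in a few lines of TypeScript. This is only an illustration of the control flow under simplifying assumptions: every name below (`PhoneDevice`, `plan`, `execute`, `summarize`) is hypothetical, the planner is a string-splitting stub instead of a model call, and nothing here mirrors the actual code in mobile-agent.graph.ts or executor.graph.ts.

```typescript
// Hypothetical types: none of these names come from the OpenGUI source tree.
type StepResult = { step: string; ok: boolean; note: string };

interface PhoneDevice {
  // Stands in for one screenshot -> vision -> action round trip.
  perform(step: string): boolean;
}

// "Plan before action": split a goal into executable steps.
// A real planner would call a model; this stub splits on " then ".
function plan(goal: string): string[] {
  return goal.split(" then ").map((s) => s.trim()).filter((s) => s.length > 0);
}

// Executor loop with a simple retry as the recovery policy.
function execute(steps: string[], device: PhoneDevice, maxRetries = 1): StepResult[] {
  const results: StepResult[] = [];
  for (const step of steps) {
    let ok = false;
    let attempts = 0;
    while (!ok && attempts <= maxRetries) {
      ok = device.perform(step);
      attempts++;
    }
    results.push({ step, ok, note: `${attempts} attempt(s)` });
    if (!ok) break; // stop here so the caller can decide how to continue
  }
  return results;
}

// "Summarize after execution": what worked and what still needs attention.
function summarize(steps: string[], results: StepResult[]) {
  const completed = results.filter((r) => r.ok).map((r) => r.step);
  const needsAttention = steps.filter((s) => completed.indexOf(s) === -1);
  return { completed, needsAttention };
}

// A device that fails only on its very first action.
let calls = 0;
const flakyDevice: PhoneDevice = {
  perform: () => {
    calls++;
    return calls !== 1;
  },
};

const steps = plan("open the app then search for a topic then collect posts");
const results = execute(steps, flakyDevice);
console.log(summarize(steps, results));
// all three steps are completed; nothing needs attention
```

Stopping at the first unrecoverable step keeps the summary honest: the remaining steps show up under `needsAttention` instead of being attempted against an unknown screen state.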

Why OpenGUI Is Different

OpenGUI is built as a mobile operator system with explicit orchestration layers.

The source code currently exposes these pieces:

server/apps/backend/src/modules/graph-agent/graph/mobile-agent.graph.ts for the main graph
server/apps/backend/src/modules/graph-agent/graph/executor.graph.ts for the device-side execution loop
server/apps/backend/src/common/ws/standby.gateway.ts for standby device dispatch
client/core_network/.../StandbySocketManager.kt for persistent device standby connections
client/core_accessibility/.../GestureService.kt for Android-side action execution

Dimension	Typical phone-agent demo	OpenGUI
Execution model	Short interactive loop	Main graph plus executor subgraph
Task state	Usually local and session-bound	Task state managed in the backend graph
Device path	Often laptop-driven control	Android client with standby and execution sockets
Model usage	One model does most of the work	Planning and VLM paths can be split across providers
Remote operation	Optional add-on	Feishu, Telegram, Discord, REST API, and standby dispatch are built into the backend
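
To make the standby idea concrete, here is a small in-memory model of it in TypeScript: devices register when they connect, the backend hands a task to the first idle device, and the device is released when the task ends. This is a sketch under stated assumptions, not the logic in standby.gateway.ts or StandbySocketManager.kt. The `StandbyRegistry` class and its method names are hypothetical.

```typescript
// Hypothetical in-memory model of standby dispatch; not the actual gateway code.
type DeviceState = "standby" | "busy";

class StandbyRegistry {
  private devices: Record<string, DeviceState> = {};

  // Called when a device opens its persistent standby connection.
  register(deviceId: string): void {
    this.devices[deviceId] = "standby";
  }

  // Called when the connection drops.
  unregister(deviceId: string): void {
    delete this.devices[deviceId];
  }

  // Pick the first idle device, mark it busy, and return the assignment.
  dispatch(task: string): { deviceId: string; task: string } | undefined {
    for (const id of Object.keys(this.devices)) {
      if (this.devices[id] === "standby") {
        this.devices[id] = "busy";
        return { deviceId: id, task };
      }
    }
    return undefined; // no idle device: the caller can queue or reject the task
  }

  // Called when the device reports that its task has finished.
  release(deviceId: string): void {
    if (deviceId in this.devices) {
      this.devices[deviceId] = "standby";
    }
  }
}

const registry = new StandbyRegistry();
registry.register("phone-a");
console.log(registry.dispatch("Open X and collect recent posts for a topic"));
// assigns the task to phone-a
console.log(registry.dispatch("Summarize a Hacker News thread"));
// undefined, because phone-a is busy
```

A real gateway also has to handle reconnects, heartbeats, and task timeouts, which this sketch leaves out.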

Typical Use Cases

Open X and collect recent posts for a topic
Read and summarize Reddit or Hacker News threads on a live phone
Trigger Android tasks remotely from Feishu, Telegram, Discord, or REST API
Execute repetitive mobile workflows on Android devices
Run long mobile workflows that need state, review, and recovery over many hours

Current Limitations

Requires an Android device or emulator.
Requires USB debugging and AccessibilityService permissions.
Execution quality depends on the model, app UI, network state, and task length.
Not an always-on OS-level assistant yet; tasks are currently triggered manually or through configured dispatch channels.
Long-running tasks are supported by the system design, but reliability still needs more real-world testing.
More ready-to-run task examples and benchmarks are still needed.

Roadmap

Add a short demo video and more real app examples.
Improve one-command local setup.
Add more ready-to-run phone-use task templates.
Improve execution recovery and failure reporting.
Add benchmark tasks for Android GUI agent reliability.
Expand docs for model configuration and cost-saving profiles.

How to Use OpenGUI

1. With Claude or Codex

Start with skills/open-gui-bootstrap/SKILL.md.

The intended flow is simple:

point Claude or Codex at the skill
describe the task in plain language
let the model handle backend bootstrap, APK build, install, and local debugging

It should only stop for:

connecting a phone or starting an emulator
approving USB debugging
enabling AccessibilityService
granting overlay or battery permissions
providing API keys or bot credentials

After the backend and Android client are running, use skills/open-gui-remote-control/SKILL.md to let Codex or Claude Code control the phone through the local CLI:

cd server
pnpm opengui -- devices --json
pnpm opengui -- do "Observe the current Android screen and summarize what you see" --json
pnpm opengui -- status <executionId> --json
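
A coding agent that dispatches a task usually needs to poll the status command until the execution finishes. The following is a minimal TypeScript sketch of that loop. The `ExecutionStatus` shape, including the `state` and `summary` fields, is an assumption for illustration, because the JSON printed by the status command is not documented here. `fetchStatus` and `sleep` are injected so that the loop can be tested without a device.

```typescript
// Hypothetical status shape: the real JSON printed by
// `pnpm opengui -- status <executionId> --json` may use different fields.
type ExecutionStatus = {
  state: "running" | "succeeded" | "failed";
  summary?: string;
};

// Poll until the execution leaves the "running" state or the poll budget runs out.
async function waitForExecution(
  fetchStatus: () => Promise<ExecutionStatus>,
  sleep: (ms: number) => Promise<void>,
  maxPolls = 60,
  intervalMs = 5000
): Promise<ExecutionStatus> {
  for (let i = 0; i < maxPolls; i++) {
    const current = await fetchStatus();
    if (current.state !== "running") {
      return current; // terminal state reached
    }
    await sleep(intervalMs);
  }
  throw new Error(`execution still running after ${maxPolls} polls`);
}

// Fake status source: reports "running" twice, then "succeeded".
let polls = 0;
const fakeFetch = async (): Promise<ExecutionStatus> =>
  polls++ < 2 ? { state: "running" } : { state: "succeeded", summary: "done" };
const noSleep = async (_ms: number): Promise<void> => {};

waitForExecution(fakeFetch, noSleep).then((final) => console.log(final.state));
// succeeded
```

In real use, `fetchStatus` would run the status command and parse its output, and `sleep` would wrap a timer. The poll budget matters for long-running tasks: choose `maxPolls` and `intervalMs` to match the expected task length.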

Recommended profiles:

High-performance profile

Use the latest Claude Opus model family across planning, supervision, review, and vision when you want the strongest overall quality.

This is the easiest way to get the best execution quality, but it is also the most expensive path.

Cost-saving mixed profile

Use Qwen 3.6 Plus for text-side roles such as Planner and Supervisor, and use Doubao Pro for the VLM side.

This usually keeps the same architecture and role split while lowering model cost by roughly 10x to 15x compared with an all-Opus setup, depending on task length, screenshot volume, and token mix.
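
The two profiles amount to a mapping from role to model. The following is a minimal TypeScript sketch of that mapping, assuming the four roles named above. The `profiles` object and `modelFor` helper are hypothetical and do not reflect OpenGUI's actual configuration format, and the strings are display names rather than provider model identifiers.

```typescript
// Roles named in this section; the profile object shape is hypothetical.
type Role = "planner" | "supervisor" | "review" | "vision";

// Display names only: real provider model identifiers will differ.
const profiles: Record<string, Record<Role, string>> = {
  "high-performance": {
    planner: "Claude Opus",
    supervisor: "Claude Opus",
    review: "Claude Opus",
    vision: "Claude Opus",
  },
  "cost-saving": {
    planner: "Qwen 3.6 Plus",
    supervisor: "Qwen 3.6 Plus",
    // Assumption: review counts as a text-side role; the text above
    // names only Planner and Supervisor explicitly.
    review: "Qwen 3.6 Plus",
    vision: "Doubao Pro",
  },
};

// Look up which model serves a role under a given profile.
function modelFor(profileName: string, role: Role): string {
  const profile = profiles[profileName];
  if (profile === undefined) {
    throw new Error(`unknown profile: ${profileName}`);
  }
  return profile[role];
}

console.log(modelFor("cost-saving", "vision")); // Doubao Pro
```

Keeping the role-to-model mapping in one place makes it straightforward to swap a single role, for example the vision model, without touching the rest of the setup.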

Recommended prompts:

Run it

Read ./skills/open-gui-bootstrap/SKILL.md and help me run OpenGUI. Only ask me for phone-side actions.

Use Claude Opus everywhere

Read ./skills/open-gui-bootstrap/SKILL.md and bootstrap OpenGUI with the latest Claude Opus model family for planning, supervision, review, and vision.

Use Qwen + Doubao to save cost

Read ./skills/open-gui-bootstrap/SKILL.md and set up OpenGUI with Qwen 3.6 Plus for Planner and Supervisor, and Doubao Pro for VLM execution.

Use my own APIs

Read ./skills/open-gui-bootstrap/SKILL.md and use m