Fully Autonomous AI Research System with Self-Evolution, built natively on Claude Code
# Add to your Claude Code skills
git clone https://github.com/Sibyl-Research-Team/sibyl-research-system
Inspired by the pioneering work of The AI Scientist, FARS, and AutoResearch, Sibyl takes the vision further by building natively on Claude Code to fully leverage its agent ecosystem: skills, plugins, MCP servers, and multi-agent teams.
Sibyl is a fully autonomous AI scientist that drives end-to-end ML research — from literature survey and hypothesis generation to GPU experiment execution and conference-ready paper writing. It operates as an autonomous research organization: 20+ specialized AI agents debate ideas, design and run GPU experiments, write papers, and critically review their own work — all without human intervention.
Key capabilities: automated literature review, multi-agent idea debate, experiment planning & GPU-parallel execution, multi-agent paper writing & peer review, autonomous iteration with quality gates, and cross-project self-evolution. Supports NeurIPS/ICML/ICLR-level output with LaTeX compilation.
What truly sets Sibyl apart is its dual-loop architecture:
The fastest way to set up Sibyl is to let Claude Code do it for you. Clone the repo, open it in Claude Code, and ask:
git clone https://github.com/Sibyl-Research-Team/sibyl-research-system.git
cd sibyl-research-system
tmux new -s sibyl # recommended: persistent session
claude --plugin-dir ./plugin --dangerously-skip-permissions
⚠️ --dangerously-skip-permissions grants Claude Code unrestricted execution (shell commands, file I/O, MCP calls) without confirmation. It is strongly recommended for Sibyl's autonomous multi-agent workflow (hundreds of tool calls per iteration), but should only be used on dedicated research machines. See Manual Setup for full details and mitigation advice.
Then tell Claude:
"Help me set up Sibyl Research System. Read docs/setup-guide.md and configure everything."
Claude will automatically check your environment, install dependencies, configure MCP servers, create config files, and ask you only for what it can't detect (GPU server IP, username, etc.). The setup guide is a step-by-step checklist designed for Claude to follow.
Once setup is complete, run the init command inside Claude Code to verify the installation and prepare your first workspace:
/sibyl-research:init
ANTHROPIC_API_KEY environment variable
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable
brew install tmux (macOS) / apt install tmux (Linux)
git clone https://github.com/Sibyl-Research-Team/sibyl-research-system.git
cd sibyl-research-system
chmod +x setup.sh && ./setup.sh # Interactive: creates venv, installs deps, configures MCP
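Before running setup.sh, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of Sibyl: the function name missing_prereqs is hypothetical, and it checks only the two environment variables and tmux named in this section.

```shell
# Hypothetical helper (not shipped with Sibyl): print the prerequisites that are missing.
missing_prereqs() {
  missing=""
  # Both environment variables come from the prerequisites list above.
  [ -n "${ANTHROPIC_API_KEY:-}" ] || missing="$missing ANTHROPIC_API_KEY"
  [ "${CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS:-}" = "1" ] || missing="$missing CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"
  # tmux must be resolvable as a command.
  command -v tmux >/dev/null 2>&1 || missing="$missing tmux"
  # Drop the leading space; the output is empty when nothing is missing.
  echo "${missing# }"
}
```

Calling missing_prereqs in the shell you plan to launch Claude Code from prints a space-separated list of anything that still needs attention, or an empty line when everything is present.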
setup.sh also adds or updates export SIBYL_ROOT="..." in your shell rc file so workspace-root Claude sessions can still resolve the repo plugin and tools.
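If you skip setup.sh, the same add-or-update behavior can be approximated by hand. This is a sketch under assumptions, not the code setup.sh actually runs: the function name set_sibyl_root and its two arguments (rc file path, repo path) are hypothetical.

```shell
# Hypothetical helper: add or replace the SIBYL_ROOT export in a shell rc file.
# This is an illustration, not the logic used by setup.sh.
set_sibyl_root() {
  rc_file="$1"
  root="$2"
  touch "$rc_file"
  tmp="$(mktemp)"
  # Keep every line except a previous SIBYL_ROOT export (grep exits 1 if nothing is left).
  grep -v '^export SIBYL_ROOT=' "$rc_file" > "$tmp" || true
  # Append the current value so exactly one export remains.
  printf 'export SIBYL_ROOT="%s"\n' "$root" >> "$tmp"
  # Overwrite in place to preserve the rc file's permissions.
  cat "$tmp" > "$rc_file"
  rm -f "$tmp"
}
```

For example, set_sibyl_root ~/.zshrc "$PWD" run from the repo root leaves a single export line no matter how many times it is repeated.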
Two MCP servers are required. setup.sh configures them interactively, but for manual setup the preferred path is claude mcp add --scope local ... so the configuration stays repo-scoped:
claude mcp add --scope local ssh-mcp-server -- npx -y @fangjunjie/ssh-mcp-server \
--host YOUR_GPU_IP --port 22 --username YOUR_USER --privateKey ~/.ssh/id_ed25519
claude mcp add --scope local arxiv-mcp-server -- /ABSOLUTE/PATH/TO/sibyl-research-system/.venv/bin/python3 -m arxiv_mcp_server
If you already manage Claude Code MCP servers through JSON, update the existing MCP config instead of creating a second source of truth:
{
  "mcpServers": {
    "ssh-mcp-server": {
      "command": "npx",
      "args": ["-y", "@fangjunjie/ssh-mcp-server",
               "--host", "YOUR_GPU_IP", "--port", "22",
               "--username", "YOUR_USER",
               "--privateKey", "~/.ssh/id_ed25519"]
    },
    "arxiv-mcp-server": {
      "command": "/ABSOLUTE/PATH/TO/sibyl-research-system/.venv/bin/python3",
      "args": ["-m", "arxiv_mcp_server"]
    }
  }
}
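Both registration routes depend on the server keys ssh-mcp-server and arxiv-mcp-server being spelled exactly. A minimal sketch for checking a JSON config for those keys follows; the function name check_mcp_names is hypothetical, and the path of your MCP config file is something you supply, since it is not stated here.

```shell
# Hypothetical helper: confirm a JSON MCP config defines both required server keys.
check_mcp_names() {
  config_file="$1"
  status=0
  for name in ssh-mcp-server arxiv-mcp-server; do
    # Match the name as a quoted JSON key followed by a colon, so the
    # "@fangjunjie/ssh-mcp-server" package argument does not count.
    if ! grep -Eq "\"$name\"[[:space:]]*:" "$config_file"; then
      echo "missing server: $name"
      status=1
    fi
  done
  return "$status"
}
```

The function prints one line per missing key and returns non-zero if any are absent, so it can gate a launch script. It is a text match rather than a JSON parse, which keeps it dependency-free but means it does not verify nesting under mcpServers.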
Server names must be exact: "ssh-mcp-server" and "arxiv-mcp-server".
Create config.yaml at project root (git-ignored):
ssh_server: "default"
remote_base: "/home/user/sibyl_system"
max_gpus: 4
language: zh
codex_enabled: false
Use ssh_server: "default" when ssh-mcp-server was registered with explicit --host/--username arguments. If your MCP setup resolves a named SSH host alias instead, use that alias.
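The config above can also be written from the shell. The sketch below assumes the five example values shown are acceptable starting points and refuses to overwrite an existing file; the function name write_default_config is hypothetical.

```shell
# Hypothetical helper: write a starter config.yaml unless one already exists.
write_default_config() {
  target="${1:-config.yaml}"
  if [ -e "$target" ]; then
    echo "kept existing $target"
    return 0
  fi
  # Values are copied from the example above; edit them for your GPU server.
  cat > "$target" <<'EOF'
ssh_server: "default"
remote_base: "/home/user/sibyl_system"
max_gpus: 4
language: zh
codex_enabled: false
EOF
  echo "wrote $target"
}
```

Run write_default_config from the project root, then adjust remote_base and max_gpus to match your machine.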
# `setup.sh` normally writes this for you; set it manually only if you skipped setup.sh
export SIBYL_ROOT=/path/to/sibyl-system
# Repo root: setup, init, status, migrate, evolve
cd "$SIBYL_ROOT"
tmux new -s sibyl-admin
claude --plugin-dir "$SIBYL_ROOT/plugin" --dangerously-skip-permissions
# Workspace root: actual project execution (recommended)
cd "$SIBYL_ROOT/workspaces/my-project"
tmux new -s sibyl-my-project
claude --plugin-dir "$SIBYL_ROOT/plugin" --dangerously-skip-permissions
# Inside Claude Code (repo root) — run once after installation:
/sibyl-research:init # Verify installation and prepare first workspace
# Inside Claude Code launched from workspaces/my-project:
/sibyl-research:start spec.md # New project from this workspace's spec
/sibyl-research:continue . # Resume the current workspace
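Because the commands above behave differently at the repo root and at a workspace root, a small guard can tell you where you are before launching Claude. This is a sketch assuming the layout $SIBYL_ROOT/workspaces/<project>; the function name sibyl_dir_kind and its labels are hypothetical.

```shell
# Hypothetical helper: classify a directory relative to SIBYL_ROOT.
sibyl_dir_kind() {
  dir="${1:-$PWD}"
  dir="${dir%/}"            # ignore a trailing slash
  root="${SIBYL_ROOT%/}"
  if [ -z "$root" ]; then
    echo "SIBYL_ROOT is not set" >&2
    return 1
  fi
  case "$dir" in
    "$root")                 echo "repo-root" ;;        # setup and maintenance
    "$root"/workspaces/*/*)  echo "inside-workspace" ;; # too deep, e.g. .../current
    "$root"/workspaces/*)    echo "workspace-root" ;;   # recommended for research runs
    *)                       echo "other" ;;
  esac
}
```

Running sibyl_dir_kind with no argument classifies the current directory; a result of workspace-root is the one you want before /sibyl-research:start or /sibyl-research:continue.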
Why tmux? Sibyl experiments can run for hours. Running inside tmux ensures the session persists through terminal disconnections. The Sentinel watchdog (auto-launched by /sibyl-research:start) runs in a sibling tmux pane and automatically restarts Claude Code if it crashes or goes idle, which enables truly unattended autonomous research.
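Since sessions are meant to outlive your terminal, rerunning tmux new -s with a name that already exists will fail. A minimal sketch for creating a session only when it is absent follows; the function name ensure_tmux_session is hypothetical, and it relies only on the standard tmux has-session and new-session subcommands.

```shell
# Hypothetical helper: create a detached tmux session only if it does not exist yet.
ensure_tmux_session() {
  name="$1"
  if tmux has-session -t "$name" 2>/dev/null; then
    echo "reusing $name"
  else
    # -d starts the session detached so the helper can be used in scripts.
    tmux new-session -d -s "$name" && echo "created $name"
  fi
}
```

After ensure_tmux_session sibyl, attach with tmux attach -t sibyl and launch Claude Code inside it as shown earlier.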
Which directory should Claude start in? Use the repo root only for setup and global maintenance (/sibyl-research:init, :status, :migrate, :evolve). For an actual research run, start Claude from the target workspace root (workspaces/<project>/), not from the repo root and not from workspaces/<project>/current. This makes Claude load the workspace-specific CLAUDE.md, .claude/links, Ralph prompt, and project memory d