by BAAI-Agents
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
The Cradle framework empowers nascent foundation models to perform complex computer tasks via the same unified interface humans use, i.e., screenshots as input and keyboard & mouse operations as output.

Cradle currently supports OpenAI's and Claude's APIs. Please create a .env file in the root of the repository to store the keys (one provider is enough).
Sample .env file containing private information:
OA_OPENAI_KEY = "abc123abc123abc123abc123abc123ab"
RF_CLAUDE_AK = "abc123abc123abc123abc123abc123ab" # Access Key for Claude
RF_CLAUDE_SK = "123abc123abc123abc123abc123abc12" # Secret Access Key for Claude
AZ_OPENAI_KEY = "123abc123abc123abc123abc123abc12"
AZ_BASE_URL = "https://abc123.openai.azure.com/"
IDE_NAME = "Code"
OA_OPENAI_KEY is the OpenAI API key. You can get it from OpenAI.
AZ_OPENAI_KEY is the Azure OpenAI API key. You can get it from the Azure Portal.
OA_CLAUDE_KEY is the Anthropic Claude API key. You can get it from Anthropic.
RF_CLAUDE_AK and RF_CLAUDE_SK are the AWS RESTful API access key and secret access key for the Claude API.
IDE_NAME refers to the IDE environment in which the repository's code runs, such as PyCharm or Code (VSCode). It is primarily used to enable automatic switching between the IDE and the target environment.
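As a quick sanity check before running anything, the sketch below parses a .env file in the format shown above and reports which providers have all of their keys set. It is a minimal illustration, not how Cradle itself loads the file: the helper names `parse_env_file` and `configured_providers` and the grouping of keys into providers are assumptions made for this example.

```python
from pathlib import Path

# Key names come from the sample .env and the descriptions above.
# The grouping into "providers" is an assumption made for this example.
PROVIDER_KEYS = {
    "openai": ["OA_OPENAI_KEY"],
    "azure_openai": ["AZ_OPENAI_KEY", "AZ_BASE_URL"],
    "claude": ["OA_CLAUDE_KEY"],
    "claude_restful": ["RF_CLAUDE_AK", "RF_CLAUDE_SK"],
}


def parse_env_file(path):
    """Parse lines of the form KEY = "value" (optionally followed by a # comment)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, full-line comments, and anything without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep only the text between the matching quotes.
            end = value.find(value[0], 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop any trailing comment.
            value = value.split("#", 1)[0].strip()
        values[key.strip()] = value
    return values


def configured_providers(values):
    """Return the providers for which every required key has a non-empty value."""
    return [
        name
        for name, keys in PROVIDER_KEYS.items()
        if all(values.get(k) for k in keys)
    ]
```

Calling `configured_providers(parse_env_file(".env"))` from the repository root should return at least one provider name; an empty list suggests a key is missing or misspelled.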
Please set up your Python environment and install the required dependencies as follows:
# Clone the repository
git clone https://github.com/BAAI-Agents/Cradle.git
cd Cradle
# Create a new conda environment
conda create --name cradle-dev python=3.10
conda activate cradle-dev
pip install -r requirements.txt
1. Option 1
# Download best-matching version of specific model for your spaCy installation
python -m spacy download en_core_web_lg
or
# pip install .tar.gz archive or .whl from path or URL
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.1/en_core_web_lg-3.7.1.tar.gz
2. Option 2
# Copy this url https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.1/en_core_web_lg-3.7.1.tar.gz
# Paste it in the browser and download the file to res/spacy/data
cd res/spacy/data
pip install en_core_web_lg-3.7.1.tar.gz
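Whichever option you use, the model ends up installed as a regular Python package named `en_core_web_lg`, so its presence can be checked without loading it. The snippet below is a small sketch of such a check using only the standard library; the helper name `is_package_installed` is an assumption for this example and is not part of Cradle or spaCy.

```python
import importlib.util


def is_package_installed(name: str) -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


if is_package_installed("en_core_web_lg"):
    print("en_core_web_lg is installed.")
else:
    print("en_core_web_lg is missing; install it with one of the options above.")
```

Run it inside the activated `cradle-dev` environment, since the result depends on which interpreter executes the check.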
Because games and software applications differ widely, we provide the specific settings for each of them below.
Since some users may want to apply our framework to new games, this section primarily showcases the core directories and organizational structure of Cradle. We will highlight in "⭐⭐⭐" the modules related to migrating to new games, and provide detailed explanations later.
Cradle
├── cache # Cache for the GroundingDino model and the bert-base-uncased model
├── conf # ⭐⭐⭐ The configuration files for the environment and the LLM
│   ├── env_config_dealers.json
│   ├── env_config_rdr2_main_storyline.json
│   ├── env_config_rdr2_open_ended_mission.json
│   ├── env_config_skylines.json
│   ├── env_config_stardew_cultivation.json
│   ├── env_config_stardew_farm_clearup.json
│   ├── env_config_stardew_shopping.json
│   ├── openai_config.json
│   ├── claude_config.json
│   ├── restful_claude_config.json
│   └── ...
├── deps # The dependencies for the Cradle framework, ignore this folder
├── docs # The documentation for the Cradle framework, ignore this folder
├── res # The resources for the Cradle framework
│   ├── models # Ignore this folder
│   ├── tool # Subfinder for RDR2
│   ├── [game or software] # ⭐⭐⭐ The resources for the game, e.g. rdr2, dealers, skylines, stardew, outlook, chrome, capcut, meitu, feishu
│   │   ├── prompts # The prompts for the game
│   │   │   └── templates
│   │   │       ├── action_planning.prompt
│   │   │       ├── information_gathering.prompt
│   │   │       ├── self_reflection.prompt
│   │   │       └── task_inference.prompt
│   │   ├── skills # The skills JSON for the game; it is generated automatically
│   │   ├── icons # In-game icons that are difficult for GPT-4 to recognize; the icon replacer can substitute text for them to improve recognition
│   │   └── saves # Save files for the game
│   └── ...
├── requirements.txt # The requirements for the Cradle framework
├── runner.py # The main entry point for the Cradle framework
├── cradle # Cradle's core modules
│   ├── config # The configuration for the Cradle framework
│   ├── environment # The environment for the Cradle framework
│   │   ├── [game or software] # ⭐⭐⭐ The environment for the game, e.g. rdr2, dealers, skylines, stardew, outlook, chrome, capcut, meitu, feishu
│   │   │   ├── __init__.py # The initialization file for the environment
│   │   │   ├── atomic_skills # Atomic skills in the game. Users should customise them to suit the needs of the game or software, e.g. character movement
│   │   │   ├── composite_skills # Skills composed from atomic skills in games or software
│   │   │   ├── skill_registry.py # The skill registry for the game. Registers all atomic skills and composite skills
│   │   │   └── ui_control.py # The UI control for the game. Defines functions to pause the game and switch to the game window
│   │   └── ...
│   ├── gameio # Interfaces that directly wrap the skill registry and UI control in the environment
│   ├── log # The log for the Cradle framework
│   ├── memory # The memory for the Cradle framework
│   ├── module # Currently contains only the skill execution module. Action planning, self-reflection and other modules will be migrated here later from planner and provider
│   ├── planner # The planner for the Cradle framework. Unified interface for action planning, self-reflection and other modules. This module will be deleted later and its contents moved to cradle/module
│   ├── runner # ⭐⭐⭐ The logical flow of execution for each game and software. All game and software processes will later be unified into a single runner
│   ├── utils # Helper functions, such as saving and loading JSON
│   └── provider # The provider for the Cradle framework. We have semantically decomposed most of the execution flow in the runner into providers
│       ├── augment # Methods for image augmentation
│       ├── llm # Calls to the LLM, e.g. OpenAI's GPT-4o, Claude, etc.
│       ├── module # ⭐⭐⭐ The modules for the Cradle framework, e.g. action planning, self-reflection and other modules. It will be migrated to cradle/module later
│       ├── object_detect # Methods for object detection
│       ├── process # ⭐⭐⭐ Methods for pre-processing and post-processing for action planning, self-reflection and other modules
│       ├── video # Methods for video processing
│       ├── others # Methods for other operations, e.g. saving and loading coordinates for skylines
│       ├── circle_detector.py # The circle detector for rdr2
│       ├── icon_replacer.py # Methods for replacing icons with text
│       ├── sam_provider.py # Segment Anything for software
│       └── ...
└── ...
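When preparing a new game, it can help to check that the per-game folders and files named in the tree above exist before wiring anything together. The sketch below is one way to do that. It is not a tool shipped with Cradle: the function names and the game name `mygame` are hypothetical, and the list covers only the per-game paths that the tree names explicitly (it skips the `conf` and `runner` entries, whose file naming for a new game is not specified here).

```python
from pathlib import Path


def expected_game_paths(game: str) -> list[str]:
    """Per-game paths named in the directory tree, relative to the repo root."""
    return [
        f"res/{game}/prompts/templates",
        f"res/{game}/skills",
        f"res/{game}/icons",
        f"res/{game}/saves",
        f"cradle/environment/{game}/__init__.py",
        f"cradle/environment/{game}/atomic_skills",
        f"cradle/environment/{game}/composite_skills",
        f"cradle/environment/{game}/skill_registry.py",
        f"cradle/environment/{game}/ui_control.py",
    ]


def missing_game_paths(repo_root, game: str) -> list[str]:
    """Return the expected paths that do not exist yet under repo_root."""
    root = Path(repo_root)
    return [p for p in expected_game_paths(game) if not (root / p).exists()]
```

For example, `missing_game_paths(".", "mygame")` run from the repository root lists what still has to be created for a hypothetical game called `mygame`. Since the skills JSON is generated automatically, a missing `skills` entry may be expected before the first run.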
Since each game's settings and the operating systems they are compatible with are different, Cradle cannot simply replace one game name to migrate to a