OpenSkill

Name: OpenSkill
Author: OpenLAIR


Open-World Self-Evolution for LLM Agents — agents that build both their skills and their own verification signals from scratch, with no target-task supervision. (Code coming soon.)

58 stars

4 forks

Installation

# Add to your Claude Code skills
git clone https://github.com/OpenLAIR/OpenSkill

README.md

Frequently Asked Questions

What is OpenSkill?

OpenSkill is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by OpenLAIR. It targets open-world self-evolution for LLM agents: agents that build both their skills and their own verification signals from scratch, with no target-task supervision. The code has not been released yet. The repository has 58 GitHub stars.

Is OpenSkill safe to use?

OpenSkill's catalog security scan is still queued, so no scan result is available yet.

How do I install OpenSkill?

Clone the repository with "git clone https://github.com/OpenLAIR/OpenSkill" and add it to your Claude Code skills directory (see the Installation section above).

Are there alternatives to OpenSkill?

Yes. SkillsLLM lists many other AI Agents skills that you can browse and compare side by side with OpenSkill.

🧭 OpenSkill

Open-World Self-Evolution for LLM Agents

An agent that builds both its skills and its own verification signals from scratch — using only a task prompt and open-world resources, with no target-task supervision.

Note: Code is on the way. This repository currently hosts the project overview and release plan. Star ⭐ and watch 👀 to be notified when the code, skills, and benchmark drop. See the roadmap.

TL;DR

Self-evolving agents need to adapt after deployment — but existing methods assume a usable learning loop is already there: curated skills, successful trajectories, or verifier signals. Real open-world deployments may offer none of these, only a task prompt.

OpenSkill studies open-world self-evolution: an agent must build both its skills and its own verification signals from scratch, drawing on open-world resources but no target-task supervision. Target-task supervision is reserved strictly for final evaluation.

The Idea — a new paradigm for self-evolving skills

Unlike human-curated, LLM-generated, or supervised self-evolution, OpenSkill acquires skills from the open world and verifies them with self-built virtual tasks — making it simultaneously scalable, grounded, and supervision-free. Prior paradigms each miss at least one of these properties.

How OpenSkill works

Given only a task prompt, a base model, tool access, and open-world resources, OpenSkill bootstraps a learning loop from scratch in three stages.

Stage	Name	What happens
01	Open-world knowledge acquisition	Retrieves task-relevant knowledge and independent verification anchors from docs, repos, papers, and the web — then drafts a structured skill plan.
02	Leakage-free skill evolution	Drafts skills and refines them in a sandbox against self-built virtual tests grounded in the anchors, fixing bugs and knowledge gaps over up to three rounds.
03	Zero-shot target evaluation	Deploys the frozen skill to the target agent. Ground-truth tests are unlocked only here, at final evaluation — never during construction.
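The three stages above can be sketched as a small control loop. This is only an illustration under stated assumptions: the OpenSkill code has not been released, so every name here (`Skill`, `Check`, `evolve_skill`, `evaluate`, `toy_refine`) is hypothetical, and the keyword checks stand in for real sandboxed virtual tests. What the sketch does take from the overview is the shape of the process: refinement sees only self-built checks, it runs for at most three rounds, and ground-truth checks are applied only to a frozen skill.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical names throughout: the OpenSkill code is not released yet.

@dataclass
class Skill:
    text: str
    frozen: bool = False

@dataclass
class Check:
    name: str
    passes: Callable[[Skill], bool]

MAX_ROUNDS = 3  # the overview states "up to three rounds"

def evolve_skill(draft: Skill,
                 virtual_checks: List[Check],
                 refine: Callable[[Skill, List[str]], Skill],
                 max_rounds: int = MAX_ROUNDS) -> Skill:
    """Stage 02: refine against self-built checks only, then freeze."""
    skill = draft
    for _ in range(max_rounds):
        failures = [c.name for c in virtual_checks if not c.passes(skill)]
        if not failures:
            break  # every virtual check passes, stop early
        skill = refine(skill, failures)
    return Skill(text=skill.text, frozen=True)

def evaluate(skill: Skill, ground_truth: List[Check]) -> float:
    """Stage 03: ground-truth checks run only on a frozen skill."""
    if not skill.frozen:
        raise ValueError("ground-truth checks may only run on a frozen skill")
    return sum(c.passes(skill) for c in ground_truth) / len(ground_truth)

# Toy stand-ins: keyword checks instead of sandboxed tests,
# and a canned patch instead of an LLM revision step.
virtual_checks = [
    Check("mentions_units", lambda s: "units" in s.text),
    Check("mentions_edge_cases", lambda s: "edge cases" in s.text),
]
PATCHES = {
    "mentions_units": " Check units.",
    "mentions_edge_cases": " Handle edge cases.",
}

def toy_refine(skill: Skill, failures: List[str]) -> Skill:
    # Fix one failing check per round.
    return Skill(skill.text + PATCHES[failures[0]])

final = evolve_skill(Skill("Parse the input."), virtual_checks, toy_refine)
score = evaluate(final, [Check("gt_units", lambda s: "units" in s.text)])
print(final.text, score)
```

Keeping the freeze as an explicit flag makes the leakage constraint checkable in code: any attempt to score an unfrozen draft against ground truth raises an error instead of silently leaking supervision into construction.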

Results — best automated pass rate on every setting

On SkillsBench (11 domains), OpenSkill beats the strongest closed-world baseline by +8.9 points on Opus 4.6 and +8.8 points on GPT 5.2, and lands within about 1 to 3 points of the human upper bound (0.9 and 2.7 points, respectively), while honoring the no-supervision constraint.

Metric	Value
Overall pass rate on Opus 4.6	43.6% (+8.9 over best baseline)
Overall pass rate on GPT 5.2	42.1% (+8.8 over best baseline)
GT test intents covered by self-built verifier	88.9%
Domains best / tied-best on Opus 4.6	8 / 11

SkillsBench — overall average pass rate (%) (Human = reference upper bound, excluded from ranking)

Target agent	No Skill	Self-Gen	CoT	Skill-Creator	AutoSkill	Memento	OpenSkill	Human
Opus 4.6 (Claude Code)	25.5	23.9	23.9	34.7	24.7	30.1	43.6	44.5
GPT 5.2 (Codex)	25.0	32.2	33.3	29.2	11.2	15.6	42.1	44.8
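The headline margins can be rechecked directly from the table above. The short script below copies the pass rates from the table, picks the strongest automated baseline per target agent, and reports OpenSkill's lead over it and its gap to the human reference. The helper name `margins` is illustrative and not part of any released code.

```python
# Pass rates (%) copied from the SkillsBench table above.
RESULTS = {
    "Opus 4.6 (Claude Code)": {
        "No Skill": 25.5, "Self-Gen": 23.9, "CoT": 23.9, "Skill-Creator": 34.7,
        "AutoSkill": 24.7, "Memento": 30.1, "OpenSkill": 43.6, "Human": 44.5,
    },
    "GPT 5.2 (Codex)": {
        "No Skill": 25.0, "Self-Gen": 32.2, "CoT": 33.3, "Skill-Creator": 29.2,
        "AutoSkill": 11.2, "Memento": 15.6, "OpenSkill": 42.1, "Human": 44.8,
    },
}

def margins(row):
    """Return (best baseline, OpenSkill lead over it, gap to Human)."""
    # Human is a reference upper bound and is excluded from ranking.
    baselines = {k: v for k, v in row.items() if k not in ("OpenSkill", "Human")}
    best = max(baselines, key=baselines.get)
    lead = round(row["OpenSkill"] - baselines[best], 1)
    gap = round(row["Human"] - row["OpenSkill"], 1)
    return best, lead, gap

for agent, row in RESULTS.items():
    best, lead, gap = margins(row)
    print(f"{agent}: best baseline {best}, lead +{lead}, gap to human {gap}")
```

Note that the strongest baseline differs by target agent (Skill-Creator on Opus 4.6, CoT on GPT 5.2), which is why the comparison is made per row rather than against a single fixed method.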

Beyond SkillsBench, OpenSkill is also the best automated method on SocialMaze (82.7% / 70.7%) and ScienceWorld (90.0% / 85.3%) across both target agents.

RQ1: Transferability. Skills generated by Opus 4.6 transfer as-is to four weaker models, improving by 5.5 to 14.8 points over the no-skill baseline with no model-specific adaptation.

RQ2: Virtual verifier quality. Without ever seeing ground-truth tests, the verifier reaches 80.5% recall against GT-positive outcomes, 60.7% overall agreement, and covers 88.9% of GT test intents.

RQ3: Component contribution. On SocialMaze, reward peaks at three refinement rounds; open-world query and the virtual verifier each improve over a parametric-only baseline and are largely complementary.
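To make the two RQ2 verifier metrics concrete, here is a minimal sketch of how recall against GT-positive outcomes and overall agreement could be computed from paired verdicts. The definitions are the standard ones and are an assumption here: the overview does not spell out the paper's exact formulas. The toy verdicts are made up for illustration and are not data from the paper.

```python
from typing import List, Tuple

def verifier_metrics(pairs: List[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (recall on GT-positive runs, overall agreement).

    Each pair is (virtual_verdict, ground_truth_verdict) for one run.
    """
    if not pairs:
        raise ValueError("need at least one verdict pair")
    virtual_on_gt_pos = [v for v, g in pairs if g]
    if not virtual_on_gt_pos:
        raise ValueError("recall is undefined without GT-positive runs")
    # Recall: of the runs ground truth accepts, how many the verifier accepts.
    recall = sum(virtual_on_gt_pos) / len(virtual_on_gt_pos)
    # Agreement: fraction of runs where both verdicts match.
    agreement = sum(v == g for v, g in pairs) / len(pairs)
    return recall, agreement

# Illustrative verdicts only, not results from the paper.
toy = [(True, True), (True, True), (False, True), (True, False), (False, False)]
recall, agreement = verifier_metrics(toy)
print(f"recall={recall:.3f} agreement={agreement:.3f}")
```

Under these definitions a verifier can have high recall and still only moderate agreement, as in the reported numbers, when it also accepts some runs that ground truth rejects.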

🗺️ Roadmap

Releases ship in phases. ⭐ the repo to get notified as each lands.

🟢 Now

Project page & overview — openlair.github.io/openskill
Paper preprint (arXiv) — arXiv:2606.06741

🟡 Next

Core OpenSkill framework code (knowledge acquisition → skill evolution → evaluation)
Reproduction scripts for the SkillsBench main results

Citation

@misc{yan2026openskillopenworldselfevolutionllm,
  title         = {OpenSkill: Open-World Self-Evolution for LLM Agents},
  author        = {Zhiling Yan and Dingjie Song and Hanrong Zhang and Wei Liang and Yuxuan Zhang and Yutong Dai and Lifang He and Philip S. Yu and Ran Xu and Xiang Li and Lichao Sun},
  year          = {2026},
  eprint        = {2606.06741},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI},
  url           = {https://arxiv.org/abs/2606.06741}
}

Zhiling Yan1,*, Dingjie Song1,*, Hanrong Zhang2, Wei Liang1, Yuxuan Zhang3,4, Yutong Dai5, Lifang He1, Philip S. Yu2, Ran Xu5, Xiang Li6, Lichao Sun1,†

1 Lehigh University · 2 University of Illinois Chicago · 3 University of British Columbia · 4 Vector Institute · 5 Salesforce AI Research · 6 Massachusetts General Hospital & Harvard Medical School

* Equal contribution † Corresponding author