by AgentSkillOS
Making ANY Software Skill-Native -- Auto-generate production-ready AI Agent Skills for Claude Code, OpenClaw, Codex, and more.
```bash
# Add to your Claude Code skills
git clone https://github.com/AgentSkillOS/SkillAnything.git ~/.claude/skills/skill-anything
```

Automatically generate production-ready Skills for any target — software, API, CLI tool, library, workflow, or web service. SkillAnything runs a 7-phase pipeline that analyzes your target, designs the skill architecture, implements it, generates test cases, benchmarks performance, optimizes the description, and packages for multiple agent platforms.
Fully automated (one command):
Give SkillAnything a target and it handles everything:
- "Create a skill for the jq CLI tool"
- "Generate a skill for the Stripe API"
- "Turn this workflow into a multi-platform skill"
The pipeline runs all 7 phases automatically. Results land in sa-workspace/.
Phase 1: Analyze → Detect target type, extract capabilities → analysis.json
Phase 2: Design → Map capabilities to skill architecture → architecture.json
Phase 3: Implement → Generate SKILL.md + scripts + references → complete skill directory
Phase 4: Test Plan → Auto-generate eval cases + trigger queries → evals.json
Phase 5: Evaluate → Benchmark with/without skill, grade results → benchmark.json
Phase 6: Optimize → Improve description via train/test loop → optimized SKILL.md
Phase 7: Package → Multi-platform distribution packages → dist/
See METHODOLOGY.md for the full pipeline specification.
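In auto mode the phases chain by feeding each artifact into the next script. A minimal sketch of how a driver could assemble those invocations — the module paths and flags come from the CLI examples in this README, but the driver function itself is illustrative, not part of the repo:

```python
import shlex

def phase_commands(target: str, skill_name: str, out: str = "./out") -> list[str]:
    """Build the seven per-phase invocations documented in this README."""
    skill_path = f"{out}/{skill_name}"
    return [
        f"python -m scripts.analyze_target --target {shlex.quote(target)} --output analysis.json",
        "python -m scripts.design_skill --analysis analysis.json --output architecture.json",
        f"python -m scripts.init_skill {skill_name} --template cli --output {out}",
        f"python -m scripts.generate_tests --analysis analysis.json --skill-path {skill_path}",
        f"python -m scripts.run_eval --eval-set evals.json --skill-path {skill_path}",
        f"python -m scripts.run_loop --eval-set trigger-evals.json --skill-path {skill_path}",
        f"python -m scripts.package_multiplatform {skill_path} --platforms claude-code,openclaw,codex",
    ]

for cmd in phase_commands("jq", "jq-skill"):
    print(cmd)
```

Each command consumes the artifact produced by the one before it (analysis.json → architecture.json → skill directory → evals.json → benchmark.json → dist/).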
Runs all 7 phases end-to-end. Provide the target and SkillAnything does the rest:
```
Target: "the httpie CLI tool"
  → Analyzes httpie --help output, designs command structure, generates skill,
    creates tests, benchmarks, optimizes, packages for 4 platforms
```
One target in, production-ready Skills out.
SkillAnything is a Skill that generates Skills. Give it any target -- a CLI tool, REST API, Python library, workflow, or web service -- and it runs a fully automated 7-phase pipeline:
```
Target: "jq"
    |
    v
[Analyze] -> [Design] -> [Implement] -> [Test] -> [Benchmark] -> [Optimize] -> [Package]
    |                                                                              |
    v                                                                              v
analysis.json                                                                    dist/
                                                                                 ├── claude-code/
                                                                                 ├── openclaw/
                                                                                 ├── codex/
                                                                                 └── generic/
```
No manual prompt engineering. No copy-paste between platforms. Just tell it what you want a skill for.
```bash
# Claude Code
git clone https://github.com/AgentSkillOS/SkillAnything.git ~/.claude/skills/skill-anything

# OpenClaw
git clone https://github.com/AgentSkillOS/SkillAnything.git ~/.openclaw/skills/skill-anything

# Codex
git clone https://github.com/AgentSkillOS/SkillAnything.git ~/.codex/skills/skill-anything
```
In Claude Code, just say:
> Create a skill for the httpie CLI tool
> Generate a multi-platform skill for the Stripe API
> Turn this data pipeline workflow into a skill
SkillAnything handles the rest.
```bash
# Phase 1: Analyze a target
python -m scripts.analyze_target --target "jq" --output analysis.json

# Phase 2: Design architecture
python -m scripts.design_skill --analysis analysis.json --output architecture.json

# Phase 3: Scaffold skill
python -m scripts.init_skill my-skill --template cli --output ./out

# Phase 4: Generate test cases
python -m scripts.generate_tests --analysis analysis.json --skill-path ./out/my-skill

# Phase 5: Run evaluation
python -m scripts.run_eval --eval-set evals.json --skill-path ./out/my-skill

# Phase 6: Optimize description
python -m scripts.run_loop --eval-set trigger-evals.json --skill-path ./out/my-skill --model claude-sonnet-4-20250514

# Phase 7: Package for all platforms
python -m scripts.package_multiplatform ./out/my-skill --platforms claude-code,openclaw,codex
```
Set auto_mode: false in config.yaml. SkillAnything then pauses after each phase for review.
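A minimal config.yaml fragment for interactive mode (only the one key needs to change):

```yaml
pipeline:
  auto_mode: false   # pause after each phase for review
```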
Run any phase independently:
```bash
python -m scripts.analyze_target --target "jq" --output analysis.json
python -m scripts.design_skill --analysis analysis.json --output architecture.json
python -m scripts.init_skill my-skill --template cli --output ./out
python -m scripts.generate_tests --analysis analysis.json --skill-path ./out/my-skill
python -m scripts.run_eval --eval-set evals.json --skill-path ./out/my-skill
python -m scripts.run_loop --eval-set trigger-evals.json --skill-path ./out/my-skill --model <model>
python -m scripts.package_multiplatform ./out/my-skill --platforms claude-code,openclaw,codex
```
Edit config.yaml to customize the pipeline. Key settings:
| Setting | Default | Description |
|---------|---------|-------------|
| pipeline.auto_mode | true | Run all phases or pause for review |
| target.type | auto | Force target type: api, cli, library, workflow, service |
| platforms.enabled | all 4 | Which platforms to package for |
| platforms.primary | claude-code | Primary output platform |
| eval.max_optimization_iterations | 5 | Max description optimization rounds |
| obfuscation.enabled | false | Obfuscate original scripts with PyArmor |
See references/schemas.md for the complete configuration schema.
| Platform | Install Path | Package Format |
|----------|-------------|----------------|
| Claude Code | ~/.claude/skills/<name>/ | Directory |
| OpenClaw | ~/.openclaw/skills/<name>/ | Directory |
| Codex | ~/.codex/skills/<name>/ | Directory + openai.yaml |
| Generic | anywhere | .skill zip |
See references/platform-formats.md for platform-specific format details.
SkillAnything uses the same eval system as the Anthropic skill-creator:
- agents/grader.md: eval assertion grading
- benchmark.json: benchmark results from Phase 5
- eval-viewer/generate_review.py: interactive eval review UI

The eval loop is optional (skip_eval: true in config) for rapid prototyping.
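The Phase 6 description loop is a propose-evaluate-keep cycle over a train set, with a held-out test set reported at the end. The grading and proposal internals live in run_loop.py and improve_description.py; this is only a shape sketch with a toy substring-based scorer standing in for the real grader:

```python
def optimize_description(desc, propose, score, train, test, max_iters=5):
    """Keep a proposed description only if it improves the train-set score;
    report held-out test accuracy at the end (cf. eval.max_optimization_iterations)."""
    best, best_score = desc, score(desc, train)
    for _ in range(max_iters):
        candidate = propose(best)
        s = score(candidate, train)
        if s > best_score:
            best, best_score = candidate, s
    return best, score(best, test)

# Toy scorer: fraction of trigger queries whose keyword appears in the description
def score(desc, queries):
    return sum(q in desc for q in queries) / len(queries)

train = ["json", "filter", "query"]
test = ["json", "transform"]
best, test_acc = optimize_description(
    "Process data",
    propose=lambda d: d + " json filter query transform",
    score=score,
    train=train,
    test=test,
)
```

Splitting trigger queries into train and test sets keeps the optimizer from overfitting the description to the exact queries it was graded on.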
| Script | Phase | Purpose |
|--------|-------|---------|
| analyze_target.py | 1 | Auto-detect and analyze target |
| design_skill.py | 2 | Generate skill architecture from analysis |
| init_skill.py | 3 | Scaffold skill directory from templates |
| generate_tests.py | 4 | Auto-generate test cases and trigger queries |
| run_eval.py | 5 | Test description triggering accuracy |
| aggregate_benchmark.py | 5 | Aggregate benchmark statistics |
| generate_report.py | 5-6 | Generate HTML optimization report |
| improve_description.py | 6 | AI-powered description improvement |
| run_loop.py | 6 | Full eval + improve optimization loop |
| quick_validate.py | 7 | Validate SKILL.md structure |
| package_skill.py | 7 | Package for single platform |
| package_multiplatform.py | 7 | Package for all enabled platforms |
| obfuscate.py | - | PyArmor wrapper for code protection |
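quick_validate.py checks SKILL.md structure; the exact rules it enforces aren't shown here, but a minimal sketch of that kind of validation — assuming the usual YAML frontmatter with name and description fields, and the under-500-lines budget mentioned in the project layout — might look like:

```python
import re

def validate_skill_md(text: str) -> list[str]:
    """Return a list of structural problems (empty list means it looks valid)."""
    errors = []
    m = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if not m:
        return ["missing YAML frontmatter block"]
    front = m.group(1)
    for field in ("name", "description"):
        if not re.search(rf"^{field}:\s*\S", front, re.MULTILINE):
            errors.append(f"frontmatter missing '{field}'")
    if len(text.splitlines()) > 500:  # SKILL.md should stay under 500 lines
        errors.append("SKILL.md exceeds 500 lines")
    return errors
```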
Read these when spawning specialized subagents:
| Agent | Purpose |
|-------|---------|
| agents/analyzer.md | Phase 1: Target analysis instructions |
| agents/designer.md | Phase 2: Skill architecture design |
| agents/implementer.md | Phase 3: Skill content writing |
| agents/grader.md | Phase 5: Eval assertion grading |
| agents/comparator.md | Phase 5: Blind A/B output comparison |
| agents/optimizer.md | Phase 6: Description optimization orchestration |
| agents/packager.md | Phase 7: Multi-platform packaging instructions |
SkillAnything auto-detects the target type and adapts its analysis:
| Type | Detection | Analysis Method |
|------|-----------|-----------------|
| API | URL with /api, OpenAPI spec, swagger | Fetch spec, extract endpoints |
| CLI | Executable name, --help output | Run help, parse subcommands |
| Library | Package name, import path | Read docs, parse public API |
| Workflow | Step descriptions, sequence | Parse steps, map data flow |
| Service | URL, web interface | Scrape docs, identify actions |
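The detection heuristics in the table can be sketched as a simple priority chain. This is illustrative only — the real analyze_target.py logic may order or weight the checks differently:

```python
import shutil

def detect_target_type(target: str) -> str:
    """Guess a target type using the heuristics from the detection table."""
    t = target.lower()
    if t.startswith(("http://", "https://")):
        # URLs mentioning an API spec are APIs; other URLs are web services
        if "/api" in t or "openapi" in t or "swagger" in t:
            return "api"
        return "service"
    if shutil.which(target):             # resolvable executable -> CLI tool
        return "cli"
    if "->" in target or " then " in t:  # step sequence -> workflow
        return "workflow"
    return "library"                     # fall back: assume installable package
```

Because heuristics can misfire, a --target-type override lets you force the result.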
- Wrong auto-detection? Force the type with the --target-type override.
- Platform-specific packaging details: references/platform-formats.md
- Obfuscation requires PyArmor: pip install pyarmor

MIT License. See NOTICE for third-party attributions (CLI-Anything, Dazhuang Skill Creator, Anthropic Skill Creator).
Inspired by CLI-Anything's methodology, adapted for Skill generation:
| Phase | Name | What It Does | Output |
|:-----:|------|-------------|--------|
| 1 | Analyze | Auto-detect target type, extract capabilities | analysis.json |
| 2 | Design | Map capabilities to skill architecture | architecture.json |
| 3 | Implement | Generate SKILL.md + scripts + references | Complete skill directory |
| 4 | Test Plan | Auto-generate eval cases + trigger queries | evals.json |
| 5 | Evaluate | Benchmark with/without skill, grade results | benchmark.json |
| 6 | Optimize | Improve description via train/test loop | Optimized SKILL.md |
| 7 | Package | Multi-platform distribution packages | dist/ |
| Target Type | Detection Method | Example |
|-------------|-----------------|---------|
| CLI Tool | which <name> + --help parsing | jq, httpie, ffmpeg |
| REST API | URL with OpenAPI/Swagger spec | Stripe API, GitHub API |
| Library | Package name via pip/npm | pandas, lodash |
| Workflow | Step-by-step description | ETL pipeline, CI/CD flow |
| Service | URL with web docs | Slack, Notion |
```
SkillAnything/
├── SKILL.md                      # Main entry point (< 500 lines)
├── METHODOLOGY.md                # Full 7-phase pipeline spec
├── config.yaml                   # Pipeline configuration
│
├── agents/                       # Subagent instructions
│   ├── analyzer.md               # Phase 1: Target analysis
│   ├── designer.md               # Phase 2: Skill design
│   ├── implementer.md            # Phase 3: Content writing
│   ├── grader.md                 # Phase 5: Eval grading
│   ├── comparator.md             # Blind A/B comparison
│   ├── optimizer.md              # Phase 6: Description optimization
│   └── packager.md               # Phase 7: Multi-platform packaging
│
├── scripts/                      # Python automation core
│   ├── analyze_target.py         # [NEW] Target auto-detection
│   ├── design_skill.py           # [NEW] Architecture generation
│   ├── init_skill.py             # [NEW] Skill scaffolding
│   ├── generate_tests.py         # [NEW] Auto test generation
│   ├── package_multiplatform.py  # [NEW] Multi-platform packaging
│   ├── obfuscate.py              # [NEW] PyArmor wrapper
│   ├── run_eval.py               # Trigger evaluation
│   ├── improve_description.py    # AI-powered optimization
│   ├── run_loop.py               # Eval + improve loop
│   ├── aggregate_benchmark.py    # Benchmark statistics
│   └── ...                       # + validators, reporters
│
├── references/                   # Documentation
│   ├── platform-formats.md       # Platform-specific specs
│   ├── schemas.md                # JSON schemas
│   └── pipeline-phases.md        # Phase details
│
├── templates/                    # Generation templates
│   ├── skill-scaffold/           # Skill directory template
│   └── platform-adapters/        # Platform-specific adapters
│
└── eval-viewer/                  # Interactive eval review UI
    └── generate_review.py
```
> Create a skill for the jq CLI tool
```
Phase 1: Analyzing jq... detected as CLI tool (confidence: 0.95)
Phase 2: Designing skill architecture... tool-augmentation pattern
Phase 3: Generating SKILL.md + 2 scripts + 1 reference
Phase 4: Created 5 test cases + 20 trigger queries
Phase 5: Benchmark: 87% pass rate (vs 42% baseline)
Phase 6: Description optimized: 18/20 trigger accuracy
Phase 7: Packaged for claude-code, openclaw, codex, generic

Done! Skill at: sa-workspace/dist/
```
> Generate a skill for the Stripe API, focus on payments
```
Phase 1: Fetching Stripe OpenAPI spec... 247 endpoints found
Phase 2: Focusing on payment_intents, customers, charges
Phase 3: Generated SKILL.md with auth setup + endpoint references
...
```
> Turn this into a skill: fetch from Postgres, clean with pandas, upload to S3
```
Phase 1: Detected workflow with 3 steps
Phase 2: workflow-orchestrator pattern, 3 dependencies
Phase 3: Step-by-step SKILL.md with error handling guidance
...
```
Edit config.yaml:
```yaml
pipeline:
  auto_mode: true    # Full automation or interactive
  skip_eval: false   # Skip phases 5-6 for rapid prototyping

platforms:
  enabled: [claude-code, openclaw, codex, generic]
  primary: claude-code

eval:
  max_optimization_iterations: 5
  runs_per_query: 3

obfuscation:
  enabled: false     # PyArmor protection for core scripts
```
SkillAnything supports code obfuscation for commercial distribution:
```bash
# Obfuscate original scripts (Apache 2.0 derived files are excluded)
python -m scripts.obfuscate --config config.yaml

# Output: dist-protected/ with PyArmor-protected core + readable adapted scripts
```
| Category | Files | Protection |
|----------|-------|------------|
| SkillAnything Original | 6 scripts | PyArmor obfuscated |
| Anthropic Adapted | 9 scripts | Source (Apache 2.0 requires it) |
| Agent Instructions | 7 .md files | Readable (required by agents) |
Built on the shoulders of giants:
| Project | License | What We Used |
|---------|---------|--------------|