by tripleyak
Intelligent skill router and creator for Claude Code and Codex. Analyzes any input to recommend existing skills, improve them, or create new ones from scratch.
# Add to your Claude Code skills
git clone https://github.com/tripleyak/SkillForge
name: skillforge
description: "Intelligent skill router and creator. Analyzes ANY input to recommend existing skills, improve them, or create new ones. Uses deep iterative analysis with 11 thinking models, regression questioning, evolution lens, and multi-agent synthesis panel. Phase 0 triage ensures you never duplicate existing functionality."
license: MIT
model: claude-opus-4-5-20251101
user-invocable: true
allowed-tools:
Analyzes ANY input to find, improve, or create the right skill.
Any input works. SkillForge will intelligently route to the right action:
# These all work - SkillForge figures out what you need:
SkillForge: create a skill for automated code review
→ Creates new skill (after checking no duplicates exist)
help me debug this TypeError
→ Recommends ErrorExplainer skill (existing)
improve the testgen skill to handle React components better
→ Enters improvement mode for TestGen
do I have a skill for database migrations?
→ Recommends DBSchema, database-migration skills
TypeError: Cannot read property 'map' of undefined
→ Routes to debugging skills (error detected)
- SkillForge: {goal} - Full autonomous skill creation
- create skill - Natural language activation
- design skill for {purpose} - Purpose-first creation
- ultimate skill - Emphasize maximum quality
- skillforge --plan-only - Generate specification without execution
- {any input} - Analyzes and routes automatically
- do I have a skill for - Searches existing skills
- which skill - Recommends matching skills

From Art to Engineering: A Manifesto for AI Skill Creation.

The central challenge in AI development isn't a lack of ideas, but the inconsistent process of turning them into robust, reliable skills. Current methods are often ad-hoc, brittle, and difficult to scale—resembling more of an art form than a predictable engineering discipline.

Quality is built in, not bolted on.
SkillForge is a methodology where rigor is integrated into every step of the creation process, from initial conception to final validation. It's a fundamental shift from reactive testing to proactive engineering.

v5.1 builds on the v5.0 context-efficient redesign and adds stronger frontmatter support, hooks guidance, validation coverage, and packaging safety.
The foundation from v5.0 remains: the context window is a public good. Every line in SKILL.md competes with the user's actual work.
- references/ where they're loaded only when needed
- description field for pre-load routing

Skills now use only name and description in frontmatter. The description field is the primary triggering mechanism: it determines when a skill activates, so all "when to use" information belongs there.
- what skill - Recommends matching skills
- improve {skill-name} skill - Enters improvement mode
- help me with / I need to - Detects task and routes

| Input | Output | Quality Gate |
|-------|--------|--------------|
| Any input | Triage → Route → Action | Phase 0 analysis |
| Explicit create | New skill | Unanimous panel approval |
| Task/question | Skill recommendation | Match confidence ≥60% |
ANY USER INPUT
(prompt, error, code, URL, question, task request)
│
▼
┌─────────────────────────────────────────────────────┐
│ Phase 0: SKILL TRIAGE (NEW) │
│ • Classify input type (create/improve/question/task)│
│ • Scan 250+ skills in ecosystem │
│ • Match against existing skills with confidence % │
│ • Route to: USE | IMPROVE | CREATE | COMPOSE │
├─────────────────────────────────────────────────────┤
│ ↓ USE_EXISTING ↓ IMPROVE ↓ CREATE │
│ [Recommend] [Load & Enhance] [Continue] │
└─────────────────────────────────────────────────────┘
│ (if CREATE_NEW or IMPROVE_EXISTING)
▼
┌─────────────────────────────────────────────────────┐
│ Phase 1: DEEP ANALYSIS │
│ • Expand requirements (explicit, implicit, unknown) │
│ • Apply 11 thinking models + Automation Lens │
│ • Question until no new insights (3 empty rounds) │
│ • Identify automation/script opportunities │
├─────────────────────────────────────────────────────┤
│ Phase 2: SPECIFICATION │
│ • Generate XML spec with all decisions + WHY │
│ • Include scripts section (if applicable) │
│ • Validate timelessness score ≥ 7 │
├─────────────────────────────────────────────────────┤
│ Phase 3: GENERATION │
│ • Write SKILL.md with fresh context │
│ • Generate references/, assets/, and scripts/ │
├─────────────────────────────────────────────────────┤
│ Phase 4: SYNTHESIS PANEL │
│ • 3-4 Opus agents review independently │
│ • Script Agent added when scripts present │
│ • All agents must approve (unanimous) │
│ • If rejected → loop back with feedback │
└─────────────────────────────────────────────────────┘
│
▼
Production-Ready Agentic Skill
Key principles:
Start with least privilege (Read, Glob, Grep, Write, Edit).
Only add higher-risk tools when explicitly required:
- Bash for deterministic local scripts that cannot be replaced with file edits
- WebFetch / WebSearch only when external facts are required
- Task only for true parallel sub-agent orchestration

| Command | Action |
|---------|--------|
| SkillForge: {goal} | Full autonomous execution |
| SkillForge --plan-only {goal} | Generate specification only |
| SkillForge --quick {goal} | Reduced depth (not recommended) |
| SkillForge --triage {input} | Run Phase 0 triage only |
| SkillForge --improve {skill} | Enter improvement mode for existing skill |
Before creating anything, SkillForge intelligently analyzes your input to determine the best action.
┌────────────────────────────────────────────────────────────────────┐
│ ANY USER INPUT │
│ (prompt, error, code, URL, question, task request, anything) │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Step 1: INPUT CLASSIFICATION │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ explicit_create │ │ explicit_improve│ │ skill_question │ │
│ │ "create skill" │ │ "improve skill" │ │ "do I have..." │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ task_request │ │ error_message │ │ code_snippet │ │
│ │ "help me with" │ │ "TypeError..." │ │ [pasted code] │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Step 2: SKILL ECOSYSTEM SCAN │
│ • Load index of 250+ skills (discover_skills.py) │
│ • Match input against all skills with confidence scoring │
│ • Identify top matches with reasons │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Step 3: DECISION MATRIX │
│ │
│ Match ≥80% + explicit create → CLARIFY (duplicate warning) │
│ Match ≥80% + other input → USE_EXISTING (recommend skill) │
│ Match 50-79% → IMPROVE_EXISTING (enhance match) │
│ Match <50% + explicit create → CREATE_NEW (proceed to Phase 1) │
│ Multi-domain detected → COMPOSE (suggest skill chain) │
│ Ambiguous input → CLARIFY (ask for more info) │
└────────────────────────────────────────────────────────────────────┘
| Action | When | Result |
|--------|------|--------|
| USE_EXISTING | Match ≥80% | Recommends existing skill(s) to invoke |
| IMPROVE_EXISTING | Match 50-79% | Loads skill and enters enhancement mode |
| CREATE_NEW | Match <50% | Proceeds to Phase 1 (Deep Analysis) |
| COMPOSE | Multi-domain | Suggests skill chain via SkillComposer |
| CLARIFY | Ambiguous or duplicate | Asks user to clarify intent |
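The decision matrix above can be sketched as a small function. This is an illustration only, not the logic inside `scripts/triage_skill_request.py`: the name `route`, its parameters, the precedence of the ambiguous and multi-domain checks, and the CLARIFY fallback for a sub-50% match without an explicit create request (a case the matrix does not define) are all assumptions.

```python
def route(match_pct: float, explicit_create: bool = False,
          multi_domain: bool = False, ambiguous: bool = False) -> str:
    """Map the best match confidence (0-100) to a triage action."""
    if ambiguous:
        return "CLARIFY"
    if multi_domain:
        return "COMPOSE"
    if match_pct >= 80:
        # A strong match plus an explicit create request is a likely duplicate.
        return "CLARIFY" if explicit_create else "USE_EXISTING"
    if match_pct >= 50:
        return "IMPROVE_EXISTING"
    # Assumption: the matrix only defines <50% for explicit create requests,
    # so anything else falls back to asking the user.
    return "CREATE_NEW" if explicit_create else "CLARIFY"
```

For example, `route(85, explicit_create=True)` yields the duplicate warning path, mirroring the code review example shown in the triage script output.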
# Run triage on any input
python scripts/triage_skill_request.py "help me debug this error"
# JSON output for automation
python scripts/triage_skill_request.py "create a skill for payments" --json
# Examples:
python scripts/triage_skill_request.py "TypeError: Cannot read property 'map'"
# → USE_EXISTING: Recommends ErrorExplainer (92%)
python scripts/triage_skill_request.py "create a skill for code review"
# → CLARIFY: CodeReview skill exists (85%), create anyway?
python scripts/triage_skill_request.py "help me with API and auth and testing"
# → COMPOSE: Multi-domain, suggests APIDesign + AuthSystem + TestGen chain
Phase 0 uses a pre-built index of all skills:
# Rebuild skill index (run periodically or after installing new skills)
python scripts/discover_skills.py
# Index location: ~/.cache/skillrecommender/skill_index.json
# Scans: ~/.claude/skills/, plugins/marketplaces/*, plugins/cache/*
Before distribution, validate your skill:
# Quick validation (required for packaging)
python scripts/quick_validate.py ~/.claude/skills/my-skill/
# Full structural validation
python scripts/validate-skill.py ~/.claude/skills/my-skill/
# Package for distribution
python scripts/package_skill.py ~/.claude/skills/my-skill/ ./dist
Skills must use only these allowed frontmatter properties:
| Property | Required | Description |
|----------|----------|-------------|
| name | Yes | Hyphen-case, max 64 chars |
| description | Yes | Max 1024 chars, no angle brackets |
| license | No | MIT, Apache-2.0, etc. |
| allowed-tools | No | Restrict tool access (comma-separated or YAML list) |
| model | No | Specific Claude model (e.g., claude-sonnet-4-20250514) |
| context | No | Set to fork for isolated sub-agent context |
| agent | No | Agent type when context: fork (Explore, Plan, general-purpose) |
| hooks | No | Lifecycle hooks (PreToolUse, PostToolUse, Stop) |
| user-invocable | No | Show in slash menu (default: true) |
| metadata | No | Custom fields (version, author, domains, etc.) |
Basic Example:
---
name: my-skill
description: What this skill does and when to use it
license: MIT
model: claude-opus-4-5-20251101
user-invocable: true
metadata:
  version: 1.0.0
  author: your-name
---
Advanced Example (with forked context and hooks):
---
name: isolated-analyzer
description: Runs analysis in isolated context with validation hooks
license: MIT
model: claude-opus-4-5-20251101
context: fork
agent: Explore
user-invocable: true
allowed-tools:
  - Read
  - Glob
  - Grep
hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: "./scripts/validate.sh"
metadata:
  version: 1.0.0
---
Field Details:
| Field | Values | Notes |
|-------|--------|-------|
| context | fork | Creates isolated sub-agent with separate conversation history |
| agent | Explore, Plan, general-purpose | Only valid when context: fork |
| user-invocable | true, false | false hides from slash menu but Claude can still auto-invoke |
| hooks | Object | See Hooks Integration section |
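The property rules in the tables above can be expressed as a short check. This is a minimal sketch, not the code in `scripts/quick_validate.py`: the function name `check_frontmatter`, the error messages, and the reading of "hyphen-case" as lowercase letters and digits joined by single hyphens are assumptions.

```python
import re
from typing import List

# Property names taken from the allowed-properties table.
ALLOWED = {"name", "description", "license", "allowed-tools", "model",
           "context", "agent", "hooks", "user-invocable", "metadata"}
# Assumption: hyphen-case means lowercase alphanumerics joined by hyphens.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_frontmatter(fm: dict) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    errors = []
    for key in sorted(set(fm) - ALLOWED):
        errors.append(f"unknown property: {key}")
    name = fm.get("name", "")
    if not name:
        errors.append("name is required")
    elif len(name) > 64 or not NAME_RE.match(name):
        errors.append("name must be hyphen-case, max 64 chars")
    desc = fm.get("description", "")
    if not desc:
        errors.append("description is required")
    elif len(desc) > 1024 or "<" in desc or ">" in desc:
        errors.append("description must be max 1024 chars, no angle brackets")
    if "agent" in fm and fm.get("context") != "fork":
        errors.append("agent is only valid when context: fork")
    return errors
```

The input is assumed to be the frontmatter already parsed into a dictionary; parsing the YAML itself is left out to keep the sketch dependency-free.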
~/.claude/skills/{skill-name}/
├── SKILL.md # Main entry point (required)
├── references/ # Deep documentation (optional)
│ ├── patterns.md
│ └── examples.md
├── assets/ # Templates (optional)
│ └── templates/
└── scripts/ # Automation scripts (optional)
    ├── validate.py # Validation/verification
    ├── generate.py # Artifact generation
    └── state.py # State management
Scripts enable skills to be agentic - capable of autonomous operation with self-verification.
| Category | Purpose | When to Include |
|----------|---------|-----------------|
| Validation | Verify outputs meet standards | Skill produces artifacts |
| Generation | Create artifacts from templates | Repeatable artifact creation |
| State Management | Track progress across sessions | Long-running operations |
| Transformation | Convert/process data | Data processing tasks |
| Calculation | Compute metrics/scores | Scoring or analysis |
Script Requirements:
- Result dataclass pattern for structured returns

Skills can define lifecycle hooks for validation, logging, and safety:
---
name: secure-skill
hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: "./scripts/validate-input.sh"
  PostToolUse:
    - matcher: "Write"
      hooks:
        - type: command
          command: "./scripts/log-output.sh"
          once: true
  Stop:
    - hooks:
        - type: command
          command: "./scripts/cleanup.sh"
---
Hook Types:
| Hook | When Triggered | Use Case |
|------|----------------|----------|
| PreToolUse | Before tool execution | Input validation, security checks |
| PostToolUse | After tool execution | Output logging, verification |
| Stop | When skill completes | Cleanup, state persistence |
Hook Configuration:
| Field | Description |
|-------|-------------|
| matcher | Tool name pattern to match (e.g., "Bash", "Write", "Bash(python:*)") |
| type | Hook type: command (shell) or prompt (Claude evaluation) |
| command | Shell command to execute (for type: command) |
| once | If true, run only once per session (default: false) |
Read $TOOL_INPUT / $TOOL_OUTPUT inside hook scripts from environment variables.
Do not interpolate untrusted tool payloads directly into shell command strings.
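A hook script that follows this advice might look like the sketch below. It assumes, as the text above states, that the payload arrives in the `TOOL_INPUT` environment variable. The deny-list contents, the function names, and the use of exit code 2 to mean "block" are illustrative assumptions, not SkillForge's actual rules.

```python
import os
import sys

# Assumption: an illustrative deny-list, not a complete security policy.
BLOCKED = ("rm -rf /", "mkfs", ":(){")


def check_tool_input(payload: str) -> int:
    """Return 0 to allow the tool call, 2 to block it."""
    for token in BLOCKED:
        if token in payload:
            print(f"blocked: payload contains {token!r}", file=sys.stderr)
            return 2
    return 0


def main() -> int:
    # The payload is read as plain data from the environment. It is never
    # spliced into a shell command string, so it cannot inject commands.
    return check_tool_input(os.environ.get("TOOL_INPUT", ""))
```

A real hook script would end with `sys.exit(main())` so the exit status reaches the caller.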
When to Use Hooks:
| Scenario | Hook Type | Example |
|----------|-----------|---------|
| Validate script inputs | PreToolUse | Check parameters before python scripts/*.py |
| Log generated artifacts | PostToolUse | Record files created by Write tool |
| Security gate | PreToolUse | Block dangerous bash commands |
| Cleanup temp files | Stop | Remove intermediate artifacts |
Example: Script Validation Hook
For skills with scripts, add input validation:
hooks:
  PreToolUse:
    - matcher: "Bash(python:scripts/*)"
      hooks:
        - type: command
          command: "python scripts/quick_validate.py . 2>/dev/null || true"
          once: true
| Avoid | Why | Instead |
|-------|-----|---------|
| Duplicate skills | Bloats registry | Check existing first |
| Single trigger | Hard to discover | 3-5 varied phrases |
| No verification | Can't confirm success | Measurable outcomes |
| Over-engineering | Complexity without value | Start simple |
| Missing WHY | Can't evolve | Document rationale |
| Invalid frontmatter | Can't package | Use allowed properties only |
After creation:
- < or >
- python scripts/quick_validate.py passes
- python scripts/check_docs_safety.py passes

Transform user's goal into comprehensive requirements:
USER INPUT: "Create a skill for X"
│
▼
┌─────────────────────────────────────────────────────────┐
│ EXPLICIT REQUIREMENTS │
│ • What the user literally asked for │
│ • Direct functionality stated │
├─────────────────────────────────────────────────────────┤
│ IMPLICIT REQUIREMENTS │
│ • What they probably expect but didn't say │
│ • Standard quality expectations │
│ • Integration with existing patterns │
├─────────────────────────────────────────────────────────┤
│ UNKNOWN UNKNOWNS │
│ • What they don't know they need │
│ • Expert-level considerations they'd miss │
│ • Future needs they haven't anticipated │
├─────────────────────────────────────────────────────────┤
│ DOMAIN CONTEXT │
│ • Related skills that exist │
│ • Patterns from similar skills │
│ • Lessons from skill failures │
└─────────────────────────────────────────────────────────┘
Check for overlap with existing skills:
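ls ~/.claude/skills/
# Grep for similar triggers in existing SKILL.md files
# ("code review" is a placeholder phrase; substitute your own trigger)
grep -il "code review" ~/.claude/skills/*/SKILL.md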
| Match Score | Action |
|-------------|--------|
| >7/10 | Use existing skill instead |
| 5-7/10 | Clarify distinction before proceeding |
| <5/10 | Proceed with new skill |
Apply all 11 thinking models systematically:
| Lens | Core Question | Application |
|------|---------------|-------------|
| First Principles | What's fundamentally needed? | Strip convention, find core |
| Inversion | What guarantees failure? | Build anti-patterns |
| Second-Order | What happens after the obvious? | Map downstream effects |
| Pre-Mortem | Why did this fail? | Proactive risk mitigation |
| Systems Thinking | How do parts interact? | Integration mapping |
| Devil's Advocate | Strongest counter-argument? | Challenge every decision |
| Constraints | What's truly fixed? | Separate real from assumed |
| Pareto | Which 20% delivers 80%? | Focus on high-value features |
| Root Cause | Why is this needed? (5 Whys) | Address cause not symptom |
| Comparative | How do options compare? | Weighted decision matrix |
| Opportunity Cost | What are we giving up? | Explicit trade-offs |
Minimum requirement: All 11 lenses scanned, at least 5 applied in depth.
Iterative self-questioning until no new insights emerge:
ROUND N:
│
├── "What am I missing?"
├── "What would an expert in {domain} add?"
├── "What would make this fail?"
├── "What will this look like in 2 years?"
├── "What's the weakest part of this design?"
└── "Which thinking model haven't I applied?"
│
└── New insights > 0?
│
├── YES → Incorporate and loop
└── NO → Check termination criteria
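The loop above can be sketched in a few lines. The defaults of 3 empty rounds and 7 rounds overall come from the Phase 1 diagram and the configuration block in this document; the function name `run_questioning` and the `ask_round` callback, which stands in for one pass through the six questions, are hypothetical.

```python
from typing import Callable, List


def run_questioning(ask_round: Callable[[int], List[str]],
                    empty_rounds_to_stop: int = 3,
                    max_rounds: int = 7) -> List[str]:
    """Collect insights until several consecutive rounds add nothing new."""
    insights: List[str] = []
    empty_streak = 0
    for round_no in range(1, max_rounds + 1):
        new: List[str] = []
        for item in ask_round(round_no):
            if item not in insights and item not in new:
                new.append(item)
        if new:
            insights.extend(new)  # YES branch: incorporate and loop
            empty_streak = 0
        else:
            empty_streak += 1     # NO branch: check termination criteria
            if empty_streak >= empty_rounds_to_stop:
                break
    return insights
```

Counting only consecutive empty rounds means a single late insight resets the streak, which is one reasonable reading of "until no new insights emerge".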
Termination Criteria:
Identify opportunities for scripts that enable agentic operation:
FOR EACH operation in the skill:
│
├── Is this operation repeatable?
│ └── YES → Consider generation script
│
├── Does this produce verifiable output?
│ └── YES → Consider validation script
│
├── Does this need state across sessions?
│ └── YES → Consider state management script
│
├── Does this involve external tools?
│ └── YES → Consider integration script
│
└── Can Claude verify success autonomously?
└── NO → Add self-verification script
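The per-operation checks in the tree above map naturally onto a lookup. The sketch below is one way to encode it; the `Operation` dataclass, its field names, and the `suggest_scripts` function are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operation:
    """Answers to the five questions asked for each operation."""
    repeatable: bool = False
    verifiable_output: bool = False
    needs_state: bool = False
    uses_external_tools: bool = False
    self_verifiable: bool = True


def suggest_scripts(op: Operation) -> List[str]:
    """Return the script categories the tree would suggest considering."""
    checks = [
        (op.repeatable, "generation"),
        (op.verifiable_output, "validation"),
        (op.needs_state, "state management"),
        (op.uses_external_tools, "integration"),
        # The last branch fires on NO: success cannot be verified autonomously.
        (not op.self_verifiable, "self-verification"),
    ]
    return [category for flag, category in checks if flag]
```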
Automation Lens Questions:
| Question | Script Category if YES |
|----------|----------------------|
| What operations will be repeated identically? | Generation |
| What outputs require validation? | Validation |
| What state needs to persist? | State Management |
| Can the skill run overnight autonomously? | All categories |
| How will Claude verify correct execution? | Verification |
Decision: Script vs No Script
| Create Script When | Skip Script When |
|-------------------|------------------|
| Operation is deterministic | Requires human judgment |
| Output can be validated | One-time setup |
| Will be reused across invocations | Simple text output |
| Enables autonomous operation | No verification needed |
| External tool integration | Pure Claude reasoning |
The specification captures all analysis insights in XML format:
<skill_specification>
<metadata>
<name>skill-name</name>
<analysis_iterations>N</analysis_iterations>
<timelessness_score>X/10</timelessness_score>
</metadata>
<context>
<problem_statement>What + Why + Who</problem_statement>
<existing_landscape>Related skills, distinctiveness</existing_landscape>
</context>
<requirements>
<explicit>What user asked for</explicit>
<implicit>Expected but unstated</implicit>
<discovered>Found through analysis</discovered>
</requirements>
<architecture>
<pattern>Selected pattern with WHY</pattern>
<phases>Ordered phases with verification</phases>
<decision_points>Branches and defaults</decision_points>
</architecture>
<scripts>
<decision_summary>needs_scripts + rationale</decision_summary>
<script_inventory>name, category, purpose, patterns</script_inventory>
<agentic_capabilities>autonomous, self-verify, recovery</agentic_capabilities>
</scripts>
<evolution_analysis>
<timelessness_score>X/10</timelessness_score>
<extension_points>Where skill can grow</extension_points>
<obsolescence_triggers>What would break it</obsolescence_triggers>
</evolution_analysis>
<anti_patterns>
<pattern>What to avoid + WHY + alternative</pattern>
</anti_patterns>
<success_criteria>
<criterion>Measurable + verification method</criterion>
</success_criteria>
</skill_specification>
Before proceeding to Phase 3:
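One gate named in the Phase 2 diagram, the timelessness score of at least 7, can be checked mechanically against the XML specification. The sketch below assumes the score is stored as `X/10` under `metadata/timelessness_score`, as in the template above; the function name `timelessness_ok` is hypothetical.

```python
import xml.etree.ElementTree as ET


def timelessness_ok(spec_xml: str, minimum: int = 7) -> bool:
    """Return True when the spec's timelessness score meets the minimum."""
    root = ET.fromstring(spec_xml)
    text = root.findtext("metadata/timelessness_score", default="").strip()
    if not text:
        return False  # a missing score is treated as a failed gate
    # Scores are written as "X/10"; keep only the numerator.
    score = int(text.split("/")[0])
    return score >= minimum
```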
Context: Fresh, clean (no analysis artifacts polluting)
Standard: Zero errors, every section verified before proceeding
1. Create directory structure
mkdir -p ~/.claude/skills/{skill-name}/references
mkdir -p ~/.claude/skills/{skill-name}/assets/templates
mkdir -p ~/.claude/skills/{skill-name}/scripts # if scripts needed
2. Write SKILL.md
• Frontmatter (YAML - allowed properties only)
• Title and brief intro
• Quick Start section
• Triggers (3-5 varied phrases)
• Quick Reference table
• How It Works overview
• Commands
• Scripts section (if applicable)
• Validation section
• Anti-Patterns
• Verification criteria
• Deep Dive sections (in <details> tags)
3. Generate reference documents (if needed)
• Deep documentation for complex topics
• Templates for generated artifacts
• Checklists for validation
4. Create assets (if needed)
• Templates for skill outputs
5. Create scripts (if needed)
• Use script-template.py as base
• Include Result dataclass pattern
• Add self-verification
• Document exit codes
• Test before finalizing
| Check | Requirement |
|-------|-------------|
| Frontmatter | Only allowed properties (name, description, license, allowed-tools, model, context, agent, hooks, user-invocable, metadata) |
| Name | Hyphen-case, ≤64 chars |
| Description | ≤1024 chars, no angle brackets |
| Triggers | 3-5 distinct, natural language |
| Phases | 1-3 max, not over-engineered |
| Verification | Concrete, measurable |
| Tables over prose | Structured information in tables |
| No placeholder text | Every section fully written |
| Scripts (if present) | Shebang, docstring, argparse, exit codes, Result pattern |
| Script docs | Scripts section in SKILL.md with usage examples |
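The script requirements in the last rows (shebang, docstring, argparse, exit codes, Result pattern) might combine as in the sketch below. It is not the contents of `assets/templates/script-template.py`; the `Result` fields, the exit code values, and the `check_skill_dir` helper are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Check that a skill directory contains SKILL.md (illustrative sketch)."""
import argparse
from dataclasses import dataclass, field
from pathlib import Path

EXIT_OK, EXIT_FAIL = 0, 1  # exit codes: 0 = passed, 1 = validation failed


@dataclass
class Result:
    """Structured return value instead of bare booleans or strings."""
    success: bool
    message: str
    errors: list = field(default_factory=list)


def check_skill_dir(path: Path) -> Result:
    """Self-verification step: the entry point file must exist."""
    if not (path / "SKILL.md").is_file():
        return Result(False, "validation failed", [f"missing {path / 'SKILL.md'}"])
    return Result(True, "SKILL.md found")


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("skill_dir", type=Path)
    args = parser.parse_args(argv)
    result = check_skill_dir(args.skill_dir)
    print(result.message, *result.errors, sep="\n")
    return EXIT_OK if result.success else EXIT_FAIL
```

Returning the exit code from `main` rather than calling `sys.exit` inside it keeps the function easy to test; a script entry point would wrap it as `sys.exit(main())`.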
Panel: 3-4 Opus agents with distinct evaluative lenses
Requirement: Unanimous approval (all agents)
Fallback: Return to Phase 1 with feedback (max 5 iterations)
| Agent | Focus | Key Criteria | When Active |
|-------|-------|--------------|-------------|
| Design/Architecture | Structure, patterns, correctness | Pattern appropriate, phases logical, no circular deps | Always |
| Audience/Usability | Clarity, discoverability, completeness | Triggers natural, steps unambiguous, no assumed knowledge | Always |
| Evolution/Timelessness | Future-proofing, extension, ecosystem | Score ≥7, extension points clear, ecosystem fit | Always |
| Script/Automation | Agentic capability, verification, quality | Scripts follow patterns, self-verify, documented | When scripts present |
The Script Agent is activated when the skill includes a scripts/ directory. Focus areas:
| Criterion | Checks |
|-----------|--------|
| Pattern Compliance | Result dataclass, argparse, exit codes |
| Self-Verification | Scripts can verify their own output |
| Error Handling | Graceful failures, actionable messages |
| Documentation | Usage examples in SKILL.md |
| Agentic Capability | Can run autonomously without human intervention |
Script Agent Scoring:
| Score | Meaning |
|-------|---------|
| 8-10 | Fully agentic, self-verifying, production-ready |
| 6-7 | Functional but missing some agentic capabilities |
| <6 | Requires revision - insufficient automation quality |
Each agent produces:
## [Agent] Review
### Verdict: APPROVED / CHANGES_REQUIRED
### Scores
| Criterion | Score (1-10) | Notes |
|-----------|--------------|-------|
### Strengths
1. [Specific with evidence]
### Issues (if CHANGES_REQUIRED)
| Issue | Severity | Required Change |
|-------|----------|-----------------|
### Recommendations
1. [Even if approved]
IF all agents APPROVED (3/3 or 4/4):
→ Finalize skill
→ Run validate-skill.py
→ Update registry
→ Complete
ELSE:
→ Collect all issues (including script issues)
→ Return to Phase 1 with issues as input
→ Re-apply targeted questioning
→ Regenerate skill and scripts
→ Re-submit to panel
IF 5 iterations without consensus:
→ Flag for human review
→ Present all agent perspectives
→ User makes final decision
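The consensus protocol above amounts to a bounded review loop. In the sketch below, reviewers and the revision step are plain callables standing in for the panel agents and the return trip through Phase 1; the names `run_panel`, `Verdict`, and the returned status strings are hypothetical.

```python
from typing import Callable, List, Tuple

Verdict = Tuple[bool, List[str]]  # (approved, issues raised)


def run_panel(reviewers: List[Callable[[str], Verdict]],
              revise: Callable[[str, List[str]], str],
              skill: str, max_iterations: int = 5) -> Tuple[str, str]:
    """Loop until every reviewer approves or the iteration budget runs out."""
    for _ in range(max_iterations):
        verdicts = [review(skill) for review in reviewers]
        if all(approved for approved, _issues in verdicts):
            return "APPROVED", skill
        # Collect all issues from every agent and feed them back as input.
        issues = [issue for _ok, found in verdicts for issue in found]
        skill = revise(skill, issues)
    # No consensus within the budget: escalate to the user.
    return "HUMAN_REVIEW", skill
```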
Every skill is evaluated through the evolution lens:
| Timeframe | Key Question |
|-----------|--------------|
| 6 months | How will usage patterns evolve? |
| 1 year | What ecosystem changes are likely? |
| 2 years | What new capabilities might obsolete this? |
| 5 years | Is the core problem still relevant? |
| Score | Description | Verdict |
|-------|-------------|---------|
| 1-3 | Transient, will be obsolete in months | Reject |
| 4-6 | Moderate, depends on current tooling | Revise |
| 7-8 | Solid, principle-based, extensible | Approve |
| 9-10 | Timeless, addresses fundamental problem | Exemplary |
Requirement: All skills must score ≥7.
| Do | Don't |
|----|-------|
| Design around principles | Hardcode implementations |
| Document the WHY | Only document the WHAT |
| Include extension points | Create closed systems |
| Abstract volatile dependencies | Direct coupling |
| Version-agnostic patterns | Pin specific versions |
Select based on task complexity:
| Pattern | Use When | Structure |
|---------|----------|-----------|
| Single-Phase | Simple linear tasks | Steps 1-2-3 |
| Checklist | Quality/compliance audits | ☐ Item verification |
| Generator | Creating artifacts | Input → Transform → Output |
| Multi-Phase | Complex ordered workflows | Phase 1 → Phase 2 → Phase 3 |
| Multi-Agent Parallel | Independent subtasks | Launch agents concurrently |
| Multi-Agent Sequential | Dependent subtasks | Agent 1 → Agent 2 → Agent 3 |
| Orchestrator | Coordinating multiple skills | Meta-skill chains |
Is it a simple procedure?
├── Yes → Single-Phase
└── No → Does it produce artifacts?
    ├── Yes → Generator
    └── No → Does it verify/audit?
        ├── Yes → Checklist
        └── No → Are subtasks independent?
            ├── Yes → Multi-Agent Parallel
            └── No → Multi-Agent Sequential or Multi-Phase
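The selection tree above reads directly as a chain of early returns. The function name `select_pattern` and its boolean parameters are hypothetical; the order of the checks follows the tree.

```python
def select_pattern(simple: bool, produces_artifacts: bool = False,
                   audits: bool = False,
                   independent_subtasks: bool = False) -> str:
    """Walk the pattern selection tree from top to bottom."""
    if simple:
        return "Single-Phase"
    if produces_artifacts:
        return "Generator"
    if audits:
        return "Checklist"
    if independent_subtasks:
        return "Multi-Agent Parallel"
    # The tree leaves this final choice open between two patterns.
    return "Multi-Agent Sequential or Multi-Phase"
```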
SKILLCREATOR_CONFIG:
  mode: autonomous
  depth: maximum # always
  core_lens: evolution_timelessness
  analysis:
    min_lens_depth: 5
    max_questioning_rounds: 7
    termination_empty_rounds: 3
  synthesis:
    panel_size: 3
    require_unanimous: true
    max_iterations: 5
    escalate_to_human: true
  evolution:
    min_timelessness_score: 7
    min_extension_points: 2
    require_temporal_projection: true
  model:
    primary: claude-opus-4-5-20251101
    subagents: claude-opus-4-5-20251101
| Skill | Relationship |
|-------|--------------|
| skill-composer | Can orchestrate created skills |
| claude-authoring-guide | Deeper patterns reference |
| codereview | Pattern for multi-agent panels |
| maker-framework | Zero error standard source |
- references/multi-lens-framework.md
- assets/templates/
- references/script-patterns-catalog.md
- model, context, agent, hooks, user-invocable
- scripts/_constants.py for shared validation constants
- scripts/quick_validate.py with extended property validation
- scripts/validate-skill.py with hooks and agent validation
- scripts/discover_skills.py to extract version from metadata.version
- references/script-integration-framework.md
- references/script-patterns-catalog.md
- assets/templates/script-template.py
- <scripts> section

---
name: my-skill
description: What this skill does and when to use it. Include trigger scenarios.
---
A new design concept for matching instruction specificity to task fragility:
New init_skill.py creates rich skill templates with TODO placeholders, organizational pattern suggestions, and example resource files:
python scripts/init_skill.py my-new-skill --path ~/.codex/skills
Iteration is now built into Phase 3. Skills improve through real usage, not just synthesis panel review.
v5.1 expands skill metadata support and documentation:
- model, context, agent, hooks, and user-invocable
- PreToolUse, PostToolUse, and Stop

v5.1 adds stronger guardrails for safe distribution:
- .skillignore enforcement restored in packaging

SkillForge implements its philosophy through a rigorous, autonomous 4-phase architecture. This structure ensures that every skill undergoes comprehensive analysis, thorough specification, clean generation, and objective approval before it is complete.

Before creating anything, SkillForge analyzes your input to determine the best action:
# These all work - SkillForge routes automatically:
SkillForge: create a skill for automated code review
→ Creates new skill (Phase 1-4)
help me debug this TypeError
→ Recommends debugging skills
do I have a skill for Excel?
→ Searches and recommends matching skills
Maximum depth before a single line is generated.
Every problem is systematically deconstructed through 11 distinct thinking lenses, with degrees of freedom assessed for each design decision.

The 11 lenses include: First Principles, Inversion, Second-Order Effects, Pre-Mortem, Systems Thinking, Devil's Advocate, Constraints, Pareto, Root Cause, Comparative, and Opportunity Cost.
Translating deep analysis into a flawless build.
The insights from analysis are codified into a structured XML specification, then used to generate the skill with fresh context. Phase 3 now includes an explicit iteration step — review output against spec, identify gaps, and refine before panel review.

A panel of experts demands unanimous approval.
A generated skill is submitted to a panel of specialized agents, each evaluating against distinct criteria. Approval must be unanimous.

The panel includes:
Skill quality is not enough on day one. The system must stay maintainable and extensible as the skill ecosystem grows.


| Principle | Implementation |
|-----------|----------------|
| Engineer for Agents | Standardized directory structure, simplified frontmatter, automated validation |
| Systematize Rigor | 4-phase architecture, regression questioning, 11 thinking lenses, multi-agent synthesis |
| Design for Evolution | Dedicated Evolution agent, mandatory ≥7/10 timelessness score, degrees of freedom assessment |
SkillForge is designed so skills can execute repeatable work, validate outputs, and support autonomous operation where appropriate.

skillforge/
├── SKILL.md # Main skill definition (< 500 lines)
├── LICENSE # MIT License
├── references/ # Loaded into context when needed
│ ├── regression-questions.md
│ ├── multi-lens-framework.md
│ ├── specification-template.md
│ ├── evolution-scoring.md
│ ├── synthesis-protocol.md
│ ├── script-integration-framework.md
│ ├── script-patterns-catalog.md
│ ├── degrees-of-freedom.md
│ └── iteration-guide.md
├── assets/ # Used in output, never loaded into context
│ └── templates/
│     ├── skill-spec-template.xml
│     ├── skill-md-template.md
│     └── script-template.py
└── scripts/ # Automated quality gates
    ├── init_skill.py
    ├── triage_skill_request.py
    ├── discover_skills.py
    ├── match_skills.py
    ├── verify_recommendation.py
    ├── validate-skill.py
    ├── quick_validate.py
    └── package_skill.py

Key distinction: references/ = loaded into context to inform the model's reasoning. assets/ = used in output, never loaded into context.

# Install (excludes repo-only files like README.md automatically)
git clone https://github.com/tripleyak/SkillForge.git /tmp/skillforge
# Codex install
cp -r /tmp/skillforge ~/.codex/skills/skillforge
rm -rf ~/.codex/skills/skillforge/{README.md,LICENSE,.git,.gitignore,.skillignore}
# Claude Code install
cp -r /tmp/skillforge ~/.claude/skills/skillforge
rm -rf ~/.claude/skills/skillforge/{README.md,LICENSE,.git,.gitignore,.skillignore}
# Or package as .skill file (respects .skillignore)
python scripts/package_skill.py /tmp/skillforge ./dist
# Full autonomous execution
SkillForge: {goal}
# Natural language activation
create skill for {purpose}
# Generate specification only
skillforge --plan-only
# Scaffold a new skill
python scripts/init_skill.py my-skill --path ~/.codex/skills
Note:
README.md, LICENSE, and assets/images/ are for GitHub browsing only. They are excluded from .skill packages via .skillignore and should not be copied into your skills directory.

SkillForge is a systematic methodology for quality and repeatability.
By codifying expert analysis, rigorous specification, and multi-agent peer review into a fully autonomous system, SkillForge provides a blueprint for building the next generation of robust, reliable, and evolution-aware AI skills.
It transforms skill creation from an art into an engineering discipline.
MIT License — see LICENSE
~/.codex/skills) with uppercase SKILL.md support