metaharness

Name: metaharness
Author: ruvnet

🛠️ The meta-harness for AI agents — scaffold your own focused, branded agent harness with its own npx CLI, MCP server, memory, learning loop, and witness-signed releases. Works with Claude Code, Codex, pi.dev, Hermes, OpenClaw, and RVM (hardware-isolated sandbox).

344 stars

35 forks

TypeScript

Installation

# Add to your Claude Code skills
git clone https://github.com/ruvnet/metaharness

Frequently Asked Questions

What is metaharness?

metaharness is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by ruvnet. It scaffolds a focused, branded agent harness with its own npx CLI, MCP server, memory, learning loop, and witness-signed releases, and works with Claude Code, Codex, pi.dev, Hermes, OpenClaw, and RVM (a hardware-isolated sandbox). It has 344 GitHub stars.

How do I install metaharness?

Clone the repository with "git clone https://github.com/ruvnet/metaharness" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is metaharness written in?

metaharness is primarily written in TypeScript. It is open-source under ruvnet on GitHub, so you can review or fork the full source.

MetaHarness

Mint a custom AI agent harness from any repo.

npx metaharness · open the Studio →

(Repo: ruvnet/agent-harness-generator · CLI: metaharness · Library: @ruvnet/agent-harness-generator)

What this is

Every serious repo deserves its own agent. A repo-aware CLI, a repo-aware coding agent, a local MCP server, memory scoped to the project, skills generated from the actual file layout, governance policy, release verification, witness-signed provenance.

metaharness mints those, on demand, from a GitHub URL or a blank slate. It is not another agent framework. It is a factory for agent frameworks.

The model is replaceable. The harness is the product.

What it gives you

In under 60 seconds, in your browser, with nothing leaving your machine:

A custom AI agent harness for your repo (or any repo)
Recommended agents, skills, slash commands, MCP tools
A scoped memory namespace + governance policy
Witness-signed provenance + release gates
Drops into Claude Code, OpenAI Codex, pi.dev, Hermes, OpenClaw, or RVM — pick one or all

Output is an npm-publishable .zip with your name on it, your branding, your npx <your-name> CLI.

New

Score any repo before you build it. npx metaharness score <repo> reads the repo (never runs it) and prints a one-screen report card — how well a harness fits, how likely it is to build, how safe the tools are, and the rough cost per run — so you know what you'll get before scaffolding.
Pick the cheapest model that's good enough. @metaharness/router routes each request to the right model from your own results — same quality, far less spend. Works out of the box with zero native deps; train it on your data for a sharper fit (npm i @metaharness/router). Add the optional @ruvector/tiny-dancer to train a fast native model instead — same training data, no API change.
Let your harness improve itself. Every scaffold now ships with Darwin Mode (@metaharness/darwin) wired in — run npm run evolve and the harness mutates its own config, tests each change in a sandbox, and keeps only what measurably improves. The model stays frozen; the harness evolves. Safe by default (no network, no API key; pure refactor/tuning behind a safety gate). Validated on real SWE-bench Lite bug-fixing. --no-darwin to skip.
Distil the cheap tier instead of escalating to a frontier model. Weight-EFT (@metaharness/weight-eft, metaharness weight-eft) takes the complementary lever to Darwin's gradient-free evolution: it exports the harness's gold-resolved archive into standard SFT/DPO sets and LoRA-tunes the open cheap tier (GLM/Qwen), so the cost-cascade escalates to Opus/GPT less often. It attacks cost (fewer $0.50 escalations), not the frontier ceiling — and stays honest about it. Strict train/eval-disjointness + reward-hacking filters keep the lift real; the tune is a gene Darwin can prune if it overfits. See ADR-198. ($0 / GPU-gated.)
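
The "mutate, test, keep only what measurably improves" loop described for Darwin Mode can be pictured as a greedy search over harness config. The sketch below is illustrative only: the `HarnessConfig` shape, the `evolve` function, and the toy scoring are all hypothetical and are not the `@metaharness/darwin` API, which may work quite differently (for example with sandboxed test runs instead of an in-process score function).

```typescript
// Hypothetical config shape for illustration; not the real harness schema.
interface HarnessConfig {
  maxToolCalls: number;
}

interface EvolveResult<C> {
  config: C;
  score: number;
  accepted: number; // how many mutations were kept
}

// Greedy evolution: propose a mutation, score it, and keep it only if it
// strictly improves on the best score seen so far. Configs are treated as
// immutable: mutate must return a new object.
function evolve<C>(
  initial: C,
  mutate: (config: C, step: number) => C,
  score: (config: C) => number,
  steps: number,
): EvolveResult<C> {
  let best = initial;
  let bestScore = score(best);
  let accepted = 0;
  for (let step = 0; step < steps; step++) {
    const candidate = mutate(best, step);
    const candidateScore = score(candidate);
    if (candidateScore > bestScore) {
      best = candidate;
      bestScore = candidateScore;
      accepted++;
    }
  }
  return { config: best, score: bestScore, accepted };
}

// Toy benchmark (made up): the score peaks when maxToolCalls is 8.
const toyScore = (c: HarnessConfig): number => -Math.abs(c.maxToolCalls - 8);

// Deterministic mutation for the demo: alternate +1 and -1.
const toyMutate = (c: HarnessConfig, step: number): HarnessConfig => ({
  maxToolCalls: c.maxToolCalls + (step % 2 === 0 ? 1 : -1),
});

const evolved = evolve<HarnessConfig>({ maxToolCalls: 5 }, toyMutate, toyScore, 10);
```

In this toy run the loop climbs from 5 to 8 and rejects every later mutation, since none of them beats the best score. A real evolution loop would replace `toyScore` with an actual evaluation run, which is where the sandbox and safety gate mentioned above would sit.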

Tune it to your project — then ship it as your own npm

A generated harness is a starting point you own, not a fixed framework. Open it and make it yours:

Keep only what your repo needs. Delete the agents, skills, slash commands, and MCP servers you won't use — the scaffold ships a recommended set, but a payments service and a docs site want very different harnesses. A smaller, targeted harness is faster, cheaper, and easier to reason about. harness doctor / harness validate keep it healthy as you trim.
Optimize the model routing for your work. Swap the per-task model tiers, tighten the governance policy, point the memory namespace at your domain. The harness is config you control, not a black box.
Publish it as your own package for the whole org. Rename it, set your scope, and npm publish — now anyone on your team runs npx @your-org/your-harness and gets the same repo-tuned agent. One command, org-wide, versioned like any other dependency. (The 19 @metaharness/* examples are exactly this pattern, published live.)

Make older, cheaper models punch like frontier ones. The right harness isn't a pile of extra steps bolted onto an expensive model — it's putting the right model on each task and getting out of the way. Our DRACO benchmark proves it: a small, cheap model delivers frontier-quality research at roughly one-tenth the cost, and a smart router squeezes out the rest. Stop paying frontier prices for work a $0.10 model does just as well.

That router ships as @metaharness/router — route(query) returns the cheapest model predicted to clear your quality bar, learned from your own eval logs. npm i @metaharness/router.
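
As a rough picture of what "the cheapest model predicted to clear your quality bar" means, here is a minimal, self-contained sketch. It does not use `@metaharness/router`; the `ModelStats` type, the `pickCheapestGoodEnough` function, and the cost and quality numbers are hypothetical stand-ins, and the real `route(query)` presumably derives its predictions from eval logs rather than from a hand-written table.

```typescript
// Hypothetical per-model statistics, e.g. aggregated from your own eval logs.
interface ModelStats {
  name: string;
  costPerRun: number;       // USD per run (illustrative values only)
  predictedQuality: number; // 0..1, estimated for this kind of query
}

// Return the cheapest model whose predicted quality clears the bar.
// If no model clears it, fall back to the highest-quality model available.
function pickCheapestGoodEnough(models: ModelStats[], qualityBar: number): ModelStats {
  if (models.length === 0) {
    throw new Error("no models configured");
  }
  const eligible = models.filter((m) => m.predictedQuality >= qualityBar);
  if (eligible.length > 0) {
    return eligible.reduce((best, m) => (m.costPerRun < best.costPerRun ? m : best));
  }
  return models.reduce((best, m) => (m.predictedQuality > best.predictedQuality ? m : best));
}

const candidates: ModelStats[] = [
  { name: "small", costPerRun: 0.1, predictedQuality: 0.82 },
  { name: "mid", costPerRun: 0.25, predictedQuality: 0.9 },
  { name: "frontier", costPerRun: 0.5, predictedQuality: 0.97 },
];

// A relaxed bar selects the cheap model; a strict bar escalates.
const relaxed = pickCheapestGoodEnough(candidates, 0.8);
const strict = pickCheapestGoodEnough(candidates, 0.95);
```

The fallback branch is a design choice of this sketch: when nothing clears the bar, it prefers quality over cost instead of failing outright.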

Try it in 30 seconds

# In the browser — zero install, nothing leaves the page
open https://ruvnet.github.io/agent-harness-generator/

# Or in the terminal — the same harness (behaviourally equivalent output)
npx metaharness my-bot --template vertical:coding --host claude-code
cd my-bot && npx . --help

Don't know what to pick? Run the wizard:

npx metaharness --wizard

Already have a repo you want a harness for?

harness analyze-repo .                       # local — deterministic analysis only
harness analyze-repo . --scaffold my-bot     # materialise the recommended harness

No repository code is executed. Inferred build/test commands are emitted as trust: inferred · execution: disabled.
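
To make the "inferred, never executed" idea concrete, the sketch below derives build and test commands from the text of a `package.json` and tags each one with the two labels quoted above. Everything here is an assumption for illustration: the `InferredCommand` record and the `inferCommands` function are hypothetical, and the actual analysis performed by `harness analyze-repo` is not documented in this section.

```typescript
// Hypothetical record shape, modelled on the "trust: inferred" and
// "execution: disabled" labels; not the tool's real output format.
interface InferredCommand {
  kind: "build" | "test";
  command: string;
  trust: "inferred";
  execution: "disabled";
}

// Infer build/test commands by reading package.json text only.
// Nothing is spawned or evaluated; the scripts are treated as data.
function inferCommands(packageJsonText: string): InferredCommand[] {
  const pkg = JSON.parse(packageJsonText) as { scripts?: Record<string, string> };
  const scripts = pkg.scripts ?? {};
  const kinds: Array<"build" | "test"> = ["build", "test"];
  return kinds
    .filter((kind) => typeof scripts[kind] === "string")
    .map((kind) => ({
      kind,
      command: `npm run ${kind}`,
      trust: "inferred" as const,
      execution: "disabled" as const,
    }));
}

// Only "build" is inferred here: there is no "test" script, and "lint" is ignored.
const inferred = inferCommands('{"scripts": {"build": "tsc", "lint": "eslint ."}}');
```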

📖 Read the plain-language user guide →

Hosts

The same harness output runs on nine agent hosts — eight interactive, plus GitHub Actions (CI/CD):

| Host | What ships | Notes |
| --- | --- | --- |
| Claude Code | MCP server + hooks + 3-scope settings | Richest surface; Ruflo-native |
| OpenAI Codex | MCP via `~/.codex/config.toml` | TOML, no hooks |
| pi.dev | Pi extension via `pi.registerTool()` | No MCP by design |
| Hermes | MCP runtime, `<think>` scrubbing | Per Hermes issue #741 |
| OpenClaw | `~/.openclaw/openclaw.json` + workspace skills | Personal-AI gateway |
| RVM | Bare-metal microhypervisor + capability tokens | Hardware isolation for untrusted peers |
| GitHub Copilot | MCP via `.vscode/mcp.json` | VSCode 1.99+ (ADR-032) |
| OpenCode | MCP via `.opencode/opencode.json` | sst/opencode TUI (ADR-036) |
| GitHub Actions | `.github/workflows/` + composite `action.yml` | Non-interactive CI/CD; default-deny via `permissions:` (ADR-033) |

See ADR-004 — Host integration model and ADR-033 — GitHub Actions host.

MCP — modular, default-deny

MCP is included as a first-class adapter surface, not the identity. It is gated and default-deny (ADR-022):

Modes: off · local (stdio) · remote (HTTPS + auth)
Emits src/mcp/{server,tools,resources,prompts,policy,audit}.ts + a scannable .harness/mcp-policy.json
Safe defaults: no network, no shell, no file-write, approve-dangerous, 30s timeout, 8 calls/turn, audit on
harness mcp-scan <path> — "npm audit for agent tools": static-only scan flagging shell/network grants, missing audit/timeouts, wildcard permissions, unguarded secrets, unpinned deps. Exits 1 on any HIGH.
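
The default-deny posture and the "exit 1 on any HIGH" rule can be sketched as a small static check over a policy object. This is a hedged illustration only: the `McpPolicy` fields, the severity assigned to each finding, and the `scanPolicy` and `exitCode` functions are all assumptions, and the real `.harness/mcp-policy.json` schema and `harness mcp-scan` rules may differ.

```typescript
// Hypothetical policy shape; the real mcp-policy.json schema may differ.
interface McpPolicy {
  network: boolean;
  shell: boolean;
  fileWrite: boolean;
  approveDangerous: boolean;
  audit: boolean;
  timeoutMs: number | null;
  maxCallsPerTurn: number;
  allowedTools: string[];
}

// Mirrors the safe defaults listed above: everything risky is off.
const defaultPolicy: McpPolicy = {
  network: false,
  shell: false,
  fileWrite: false,
  approveDangerous: true,
  audit: true,
  timeoutMs: 30000,
  maxCallsPerTurn: 8,
  allowedTools: [],
};

type Severity = "HIGH" | "MEDIUM";

interface Finding {
  severity: Severity;
  message: string;
}

// Static-only scan: inspects the policy data and never runs any tool.
// The severity levels below are assumed for this sketch.
function scanPolicy(policy: McpPolicy): Finding[] {
  const findings: Finding[] = [];
  if (policy.shell) findings.push({ severity: "HIGH", message: "shell access granted" });
  if (policy.network) findings.push({ severity: "HIGH", message: "network access granted" });
  if (policy.allowedTools.includes("*")) {
    findings.push({ severity: "HIGH", message: "wildcard tool permission" });
  }
  if (!policy.audit) findings.push({ severity: "MEDIUM", message: "audit logging disabled" });
  if (policy.timeoutMs === null) findings.push({ severity: "MEDIUM", message: "no timeout set" });
  return findings;
}

// Exit code 1 if any finding is HIGH, otherwise 0.
function exitCode(findings: Finding[]): 0 | 1 {
  return findings.some((f) => f.severity === "HIGH") ? 1 : 0;
}

const cleanFindings = scanPolicy(defaultPolicy);
const riskyFindings = scanPolicy({ ...defaultPolicy, shell: true, audit: false });
```

Starting from `defaultPolicy` and spreading in overrides keeps every grant explicit, which is the point of default-deny: a reviewer only has to read the overrides.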

Verticals (19 vertical templates, plus minimal)

npx metaharness --list
npx metaharness my-bot --template vertical:coding

| Category | Templates |
| --- | --- |
| Starter / Operations | `minimal`, `vertical:devops` |
| Engineering | `vertical:coding`, `vertical:ai`, `vertical:repo-maintainer` (iter 113) |
| Knowledge | `vertical:research`, `vertical:ruview`, `vertical:education` |
| Finance / Pro | `vertical:trading`, `vertical:legal`, `vertical:health` |
| Customer / Growth | `vertical:support`, `vertical:crm`, `vertical:marketing`, `vertical:advertising`, `vertical:sales` |
| Business / Frontier | `vertical:business`, `vertical:agentics`, `vertical:gaming`, `vertical:exotic` |

Each ships bespoke domain agents (with system prompts), skills, commands, and per-host settings — all default-deny.

One-command examples

Don't want to pick flags? Each host and vertical has a dedicated @metaharness/* wrapper — published, one npx away, no template/