by boshu2
The operational layer for coding agents. Memory, validation, and feedback loops that compound between sessions.
# Add to your Claude Code skills
git clone https://github.com/boshu2/agentops
AgentOps sits on top of the coding harness you already use (Claude Code, Codex, Cursor, OpenCode) and adds the parts an engineering team would notice missing: a record of what was tried, gates between phases, and a corpus of learnings that survives the next session. State lives in .agents/ next to your code, and you can mix Claude, Codex, or any model per phase.
This repo was built with AgentOps. As of 2026-05-04 its .agents/ held ~1,842 learnings, ~186 patterns, ~80 planning rules, and ~3,867 cited decisions captured during development. Re-run anytime: bash scripts/corpus-stats.sh.
New here? Start with what AgentOps 3.0 is — the hookless-first CDLC north star.
Most teams run coding agents as isolated chat sessions. Prior attempts, warnings, decisions, and fixes scatter across chats, commits, and human memory, so the same mistakes recur and nothing leaves a reviewable trail.
AgentOps breaks intent into bounded slices, gives each slice a first failing test and a write scope, and makes every phase boundary a gate that records evidence.
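A minimal sketch of what "a slice with a first failing test, a write scope, and a gate that records evidence" could look like in code. The `Slice` shape, the `gate` function, and the glob-based scope check are all illustrative assumptions, not the AgentOps schema or implementation.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Slice:
    """One bounded unit of work (hypothetical shape, not the AgentOps schema)."""
    behavior: str
    failing_test: str
    write_scope: list[str]  # glob patterns the slice is allowed to modify
    evidence: list[str] = field(default_factory=list)


def gate(slice_: Slice, changed_files: list[str], test_passed: bool) -> bool:
    """Pass only if the test passes and every changed file is inside the write scope."""
    out_of_scope = [
        path for path in changed_files
        if not any(fnmatch(path, pattern) for pattern in slice_.write_scope)
    ]
    # Record evidence whether or not the gate passes, so the trail stays reviewable.
    result = "pass" if test_passed else "fail"
    slice_.evidence.append(f"test {slice_.failing_test}: {result}")
    slice_.evidence.append(f"out-of-scope changes: {out_of_scope or 'none'}")
    return test_passed and not out_of_scope
```

The point of the sketch is that the gate returns a hard yes/no and appends evidence either way, so a blocked slice still leaves a record of why it was blocked.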
A skill loads the corpus before writing a line of code:
> /research add rate limiting to /login
[research] loading context from .agents/...
[corpus] 3 prior auth decisions cited
- .agents/decisions/2026-04-12-session-tokens.md
- .agents/decisions/2026-03-08-rate-limit-policy.md
- .agents/decisions/2026-02-19-redis-as-state-store.md
[corpus] 2 planning rules apply: rate-limit-jitter, redis-fallback-paths
[corpus] 1 learning: 2026-03-08 — token bucket without jitter caused thundering-herd at 5/min
[findings] middleware/auth.go owns /login; no rate limiting present
[findings] internal/cache already wires Redis with a fallback path
[plan] token bucket, 5/min per IP, Redis-backed, jittered per 2026-03-08 learning
[recorded] .agents/runs/2026-05-08-rate-limit/research.md
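Because the corpus is plain Markdown on disk, the "prior decisions cited" step in the transcript above can be approximated with a few lines of file scanning. This is only a sketch: the function name and the keyword-matching rule are assumptions, and the real skill presumably ranks and filters more carefully. The only details taken from the transcript are the .agents/decisions/ directory and the date-prefixed filenames.

```python
from pathlib import Path


def cite_prior_decisions(corpus_root: Path, keywords: list[str]) -> list[Path]:
    """Return decision files that mention any keyword, newest filename first."""
    decisions_dir = corpus_root / "decisions"
    if not decisions_dir.is_dir():
        return []
    lowered = [k.lower() for k in keywords]
    hits = []
    for path in decisions_dir.glob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if any(k in text for k in lowered):
            hits.append(path)
    # Filenames start with an ISO date, so a reverse lexical sort is newest-first.
    return sorted(hits, key=lambda p: p.name, reverse=True)
```

Called as `cite_prior_decisions(Path(".agents"), ["rate limit", "login"])`, it returns the matching decision files in the same newest-first order the transcript lists them.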
Skills run in the harness you already use (Claude Code, Codex, Cursor, OpenCode). The corpus is the differentiator: the agent starts loaded with prior decisions, planning rules, and learnings instead of starting cold. From here, /implement or /rpi carries the same context into the build phase.
For a second opinion before shipping, run /council --mixed:
> /council --mixed validate this PR
[council] evidence packet sealed -> 6 judges across Claude Code and Codex CLI
[claude/judge-1] WARN - rate limiting missing on /login endpoint
[claude/judge-2] PASS - Redis integration follows middleware pattern
[codex/judge-1] WARN - token bucket refill lacks jitter under burst
[codex/judge-2] PASS - backoff bounds match retry policy
Consensus: WARN - fix /login rate limit and add refill jitter before shipping
Recorded: .agents/council/<run-id>/verdict.md
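The README does not spell out how per-judge verdicts are combined. One rule that is consistent with the transcript above (two WARN, two PASS, consensus WARN) is "the most severe verdict wins". The sketch below assumes that rule; the `Verdict` enum, the `FAIL` level, and the `consensus` function are hypothetical names, not part of AgentOps.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by severity so max() picks the worst outcome.
    PASS = 0
    WARN = 1
    FAIL = 2  # assumed level; the transcript only shows PASS and WARN


def consensus(verdicts: dict[str, Verdict]) -> Verdict:
    """Assumed rule: the most severe verdict from any judge wins."""
    if not verdicts:
        raise ValueError("a council needs at least one judge")
    return max(verdicts.values())


# The four judge lines shown in the transcript.
votes = {
    "claude/judge-1": Verdict.WARN,
    "claude/judge-2": Verdict.PASS,
    "codex/judge-1": Verdict.WARN,
    "codex/judge-2": Verdict.PASS,
}
print(consensus(votes).name)  # prints "WARN"
```

A worst-wins rule is conservative by design: a single judge from a different model family can hold the change back, which is the reason to mix models in the first place.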
Pick the runtime you use.
Claude Code
claude plugin marketplace add boshu2/agentops
claude plugin install agentops@agentops-marketplace
Codex CLI on macOS, Linux, or WSL
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.sh | bash
AgentOps installs hookless — workflow is guided by skills + the ao CLI, and CI is the authoritative gate. If you want runtime hooks, author your own with the hooks-authoring skill.
Codex CLI on Windows PowerShell
irm https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.ps1 | iex
OpenCode
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-opencode.sh | bash
Other skills-compatible agents
npx skills@latest add boshu2/agentops --cursor -g
Restart your agent after install. Then type /quickstart in your agent chat.
The ao CLI is optional but recommended: repo-native bookkeeping, retrieval, health checks, and terminal workflows.
macOS
brew tap boshu2/agentops https://github.com/boshu2/homebrew-agentops
brew install agentops
ao version
Windows PowerShell
irm https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-ao.ps1 | iex
ao version
You can also install from release binaries or build from source. Troubleshooting: docs/troubleshooting.md. Configuration: docs/ENV-VARS.md.
AgentOps degrades gracefully — skills check for a tool before using it. The only hard requirement is a coding-agent runtime plus git; everything else is optional and adds capability.
| Tool | Class | Purpose | Required? |
|------|-------|---------|-----------|
| claude / codex / opencode | Agent runtime | The harness AgentOps sits on top of | Required (one of) |
| git | Required | Version control — .agents/ state lives next to your code | Required |
| ao | Recommended | The AgentOps CLI: bookkeeping, retrieval, health, the loops | Recommended |
| bd (beads) + Dolt backend | Tracking | Git-native issue tracking (the mandatory task surface) | Optional |
| gc (Gas City) | Orchestration | Out-of-session substrate that runs whole ao rpi/ao evolve loops — see using-gc | Optional (out-of-session) |
| gh | PR / CI | Open PRs, query CI status | Optional |
| go | Build-from-source | Build cli/bin/ao from source (go 1.26) | Optional |
| jq, rg/ripgrep, curl, openssl, sha256sum, tmux, cass | Utilities | JSON parsing, search, downloads, hashing, sessions, history | Optional |
Full purpose / required-vs-optional / fallback-if-absent for every tool: docs/dependencies.md.
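The "check for a tool before using it" behavior can be illustrated with a small PATH probe. This is a sketch, not how the skills actually do it: the tool names come from the table above, while the grouping constants and the `probe` and `ready` helpers are assumptions made for the example. `which` is injectable only so the logic can be exercised without touching the real PATH.

```python
import shutil

# Tool names come from the dependency table; the grouping is illustrative.
REQUIRED_ANY = ["claude", "codex", "opencode"]  # at least one agent runtime
REQUIRED_ALL = ["git"]
OPTIONAL = ["ao", "bd", "gc", "gh", "go", "jq", "rg", "curl", "tmux"]


def probe(which=shutil.which) -> dict:
    """Report which tools are on PATH, grouped by how much they matter."""
    def present(name: str) -> bool:
        return which(name) is not None

    return {
        "runtime": [t for t in REQUIRED_ANY if present(t)],
        "missing_required": [t for t in REQUIRED_ALL if not present(t)],
        "optional_available": [t for t in OPTIONAL if present(t)],
    }


def ready(report: dict) -> bool:
    """Hard requirement: one agent runtime plus git. Everything else is a bonus."""
    return bool(report["runtime"]) and not report["missing_required"]
```

Running `ready(probe())` on a machine answers the "can I start at all?" question, and `optional_available` shows which extra capabilities are unlocked.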
Coding agents are non-deterministic workers. Engineering already has a long history of getting reliable output from non-deterministic workers: disciplined process around them, with checks at the boundaries. Each software-engineering primitive has a counterpart in the coding-agent world:
| Software Engineering | Coding-Agent World |
|---|---|
| Source code | Context (corpus, planning rules, learnings) |
| SDLC | CDLC (Context Development Life Cycle) |
| Libraries (Maven, npm, crates.io) | Context libraries (the .agents/ corpus) |
| Compilers | Context compilers (ao compile → wiki) |
| Code review | Multi-model councils |
| CI/CD | Validation gates (/vibe, /pre-mortem) |
| Postmortems | Automated postmortems (/post-mortem → learnings) |
| Runbooks | Skills + planning rules |
| Software factories | The in-session loop (/rpi, /evolve); out-of-session runs on a substrate (Gas City reference City) |
| Markdown / Git / Linux (open primitives) | LLM Wiki of Markdown |
| Open-source corpus | Your private corpus (.agents/ in your repo) |
You can't tune the model; that's the vendor's job. You can engineer the context you feed it. AgentOps treats that engineering as a Context Development Life Cycle (CDLC), with the same discipline DevOps brought to delivery.
The narrow waist is small on purpose:
| Practice | What AgentOps uses it for |
|---|---|
| BDD / Gherkin | Dense intent in observable behavior terms |
| DDD | Shared names, bounded contexts, and a vocabulary humans and agents can both use |
| Hexagonal architecture | Ports and adapters that keep runtime, tool, and vendor details outside the core loop |
| TDD | A local executable done condition for each slice |
Everything else plugs into that waist: CI/CD repeats the proof, SRE/DORA measures fitness, ADRs and provenance preserve why-memory, wikis and ratchets preserve durable learning, and Agile/XP keeps work in vertical slices. The atomic unit is one behavior, one bounded context, one failing test, one write scope, and one acceptance proof. A new learning is added only when it would change a future run.
The waist is executable. GOALS.md declares intent as directives; the /scenario skill and the ao goals render command turn each directive into Gherkin acceptance examples; ao rpi phased --domain <name> builds inside one bounded context with a declared read scope; and ao goals measure scores whether the result satisfies the spec. Intent, behavior, build, and validation stay linked end-to-end.
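To make "a directive rendered as a Gherkin acceptance example" concrete, here is a small formatter that emits standard Gherkin keywords. It is a sketch only: the `Example` dataclass and `to_gherkin` function are hypothetical, and the actual GOALS.md format and the output of the AgentOps commands are not described in this README.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One acceptance example (hypothetical shape, not the GOALS.md format)."""
    name: str
    given: list[str]
    when: str
    then: list[str]


def to_gherkin(feature: str, examples: list[Example]) -> str:
    """Render examples as Gherkin text; extra steps in a group become 'And'."""
    lines = [f"Feature: {feature}"]
    for ex in examples:
        lines.append("")
        lines.append(f"  Scenario: {ex.name}")
        for i, step in enumerate(ex.given):
            lines.append(f"    {'Given' if i == 0 else 'And'} {step}")
        lines.append(f"    When {ex.when}")
        for i, step in enumerate(ex.then):
            lines.append(f"    {'Then' if i == 0 else 'And'} {step}")
    return "\n".join(lines)


print(to_gherkin("Login rate limiting", [
    Example(
        name="Sixth attempt inside a minute is rejected",
        given=["a client IP has made 5 login attempts in the last minute"],
        when="the client submits another login attempt",
        then=["the response status is 429"],
    ),
]))
```

An example in this form doubles as the first failing test for the slice: it names one observable behavior and one acceptance condition.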
Full treatment: docs/cdlc.md.
Four layers that compound:
| Layer | Problem | What changes |
|-------|---------|--------------|
| Bookkeeping | Agents forget what they tried, why they changed course, and what evidence mattered | .agents/ captures run packets, handoffs, findings, citations, decisions, verdicts, retros, and post-mortems |
| Context Compiler | Every session starts cold | ao context assemble builds phase-scoped packets; ao lookup retrieves decay-ranked knowledge on demand; skills and execution packets make context explicit; runtime hooks are deliberately custom, not the default path |
| Validation Gates | Agents ship confident garbage | /pre-mortem, /vibe, /council: multi-model consensus validates plans before build and code before commit; gates block, not advise |
| Knowledge Flywheel | Lessons disappear between sessions | /forge extracts learnings from the bookkeeping trail, ao flywheel close-loop scores and promotes them, /evolve runs a bounded reconciliation loop, /dream prepares compounding runs |
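"Decay-ranked" retrieval in the Context Compiler row can be pictured as relevance discounted by age. The README does not say which decay function ao lookup uses, so the exponential half-life below is an assumption, as are the `Learning` shape, the 90-day default, and the function names.

```python
import math
from dataclasses import dataclass


@dataclass
class Learning:
    title: str
    relevance: float  # 0..1 match against the query; how it is computed is out of scope
    age_days: float


def decayed_score(item: Learning, half_life_days: float = 90.0) -> float:
    """Relevance discounted exponentially: the score halves every half-life."""
    return item.relevance * math.pow(0.5, item.age_days / half_life_days)


def rank(items: list[Learning], half_life_days: float = 90.0) -> list[Learning]:
    """Order learnings by decayed score, best first."""
    return sorted(items, key=lambda i: decayed_score(i, half_life_days), reverse=True)
```

With a 90-day half-life, a perfect match that is 180 days old scores 0.25 and loses to a fresh partial match at 0.5, which is the trade-off a decay ranking is meant to express.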
All state lives in local .agents/: plain text you can grep, diff, and review. No AgentOps-managed telemetry or hosted control plane. Runtime-neutral across Claude Code, Codex CLI, Cursor, and OpenCode.
| Notion / Confluence | AgentOps .agents/ |
|---|--