Kyoko

Name: Kyoko
Author: kayba-ai


🔨 Kyoko is the all-in-one, fully local tool for debugging and improving your AI agents.

53 stars

4 forks

Python

Installation

# Add to your Claude Code skills
git clone https://github.com/kayba-ai/Kyoko


Frequently Asked Questions

What is Kyoko?

Kyoko is an open-source AI agents skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by kayba-ai. It is an all-in-one, fully local tool for debugging and improving your AI agents. It has 53 GitHub stars.

Is Kyoko safe to use?

Kyoko's catalog security scan is still queued, so no scan result is available yet. Until it completes, review the source and its dependencies yourself before installing.

How do I install Kyoko?

Clone the repository with "git clone https://github.com/kayba-ai/Kyoko" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is Kyoko written in?

Kyoko is primarily written in Python. It is open-source under kayba-ai on GitHub, so you can review or fork the full source.

Are there alternatives to Kyoko?

Yes. SkillsLLM lists many other skills in the AI Agents category that you can browse and compare side by side with Kyoko.

Kyoko

Kyoko is a fully local system for measuring, debugging, and improving AI agents.

Add telemetry, run your agent. Kyoko shows where performance breaks across runs. It groups recurring failures into evidence-backed issues, lets Codex or Claude Code draft fixes, and only applies changes after checks and evals pass.

Built around the manual dev workflow. Inspect traces, understand the failure, patch the prompt, context, or harness, rerun evals, and decide what ships. Kyoko makes that workflow repeatable while keeping you in control.

Local by default. Traces, issues, proposals, evals, database, and dashboard stay on your machine.

Works with your existing coding-agent subscription. Kyoko can use the Codex or Claude Code CLI you already have, so there is no separate Kyoko model API key or hosted service.

Why Kyoko

Finds the failures that repeat across runs. Kyoko looks across runs, groups recurring problems into evidence-backed issues, and shows where each one happened.
Turns issues into fixes. Accepted issues become proposed changes to your agent context, skills, or harness.
Measures whether fixes worked. Kyoko reruns failing traces, runs deterministic checks, and compares eval results before applying a fix.
Keeps the developer in control. Review every issue, proposal, and apply decision manually, or automate only the parts that pass the gate.
Uses the tools you already have. Codex, Claude Code, OpenClaw, Hermes, or a generic command can analyze evidence and draft fixes through existing CLI auth.
Runs locally by default. SQLite, loopback dashboard, local traces, local proposals, and explicit external calls.
Connects to real agent stacks. OTLP/GenAI, Python and TypeScript SDKs, importers, JSON CLI, dashboard, and MCP.
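
To make "measures whether fixes worked" concrete, here is a minimal sketch of a before/after failure-rate comparison. The EvalRun class, fix_improved function, and min_drop threshold are hypothetical names chosen for illustration; they are not part of Kyoko's API, and Kyoko's own comparison may weigh more evidence than a single rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRun:
    """Hypothetical summary of one eval pass over a fixed set of traces."""
    total: int
    failed: int

    @property
    def failure_rate(self) -> float:
        # Guard against an empty eval set.
        return self.failed / self.total if self.total else 0.0


def fix_improved(before: EvalRun, after: EvalRun, min_drop: float = 0.0) -> bool:
    """Return True when the failure rate dropped by more than min_drop."""
    return (before.failure_rate - after.failure_rate) > min_drop


before = EvalRun(total=20, failed=5)  # 5 of 20 traces failing before the fix
after = EvalRun(total=20, failed=1)   # 1 of 20 traces failing after the fix
print(fix_improved(before, after))    # True
```

Comparing rates on the same set of traces keeps the comparison fair: a fix that only looks better because the eval set changed would not count.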

The loop

        ┌─────────────────┐           ┌─────────────────┐
        │  1. Analyse     │ ───────▶  │  2. Issues      │
        │  traces in      │           │  recurring      │
        │                 │           │  failures       │
        └─────────────────┘           └─────────────────┘
                 ▲                            │
                 │ measure                    │ accept
                 │                            ▼
        ┌─────────────────┐  ┌──────┐ ┌─────────────────┐
        │  4. Evals       │◀─┤ gate ├─│  3. Proposals   │
        │  failure rate   │  └──────┘ │  fixes          │
        │                 │   apply   │                 │
        └─────────────────┘           └─────────────────┘

Kyoko keeps the repair loop explicit. Every step creates something you can inspect in the dashboard or CLI.

Analyse: Kyoko reads real traces from your agent and looks across runs for repeated behavior: tool mistakes, missing context, policy drift, brittle routing, bad handoffs, or eval failures.
Issues: recurring failures become evidence-backed issues with category, severity, occurrence count, and links to the spans where they happened.
Proposals: accepted issues become concrete fixes to your agent context, skills, evals, or harness. The fix stays reviewable before it can apply.
Evals: Kyoko reruns failing traces, runs deterministic checks, and compares eval results so the gate can decide whether the fix worked.
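
The Analyse and Issues steps can be pictured as a grouping pass over failures observed in traces. The sketch below is an illustration under stated assumptions, not Kyoko's implementation: the Failure and Issue classes, the signature field, and the min_runs threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Failure:
    """Hypothetical record of one failure observed in one span."""
    run_id: str
    span_id: str
    category: str   # e.g. "tool_mistake" or "missing_context"
    signature: str  # normalized description of what went wrong


@dataclass
class Issue:
    """Hypothetical issue: one recurring failure plus its evidence."""
    category: str
    signature: str
    span_links: list = field(default_factory=list)  # (run_id, span_id) pairs

    @property
    def occurrence_count(self) -> int:
        return len(self.span_links)


def group_recurring(failures, min_runs=2):
    """Group failures by (category, signature); keep those seen in >= min_runs runs."""
    buckets = defaultdict(list)
    for failure in failures:
        buckets[(failure.category, failure.signature)].append(failure)

    issues = []
    for (category, signature), items in buckets.items():
        distinct_runs = {f.run_id for f in items}
        if len(distinct_runs) >= min_runs:
            links = [(f.run_id, f.span_id) for f in items]
            issues.append(Issue(category, signature, links))

    # Most frequent issues first.
    issues.sort(key=lambda issue: issue.occurrence_count, reverse=True)
    return issues


failures = [
    Failure("run1", "s1", "tool_mistake", "search called with empty query"),
    Failure("run2", "s7", "tool_mistake", "search called with empty query"),
    Failure("run3", "s2", "tool_mistake", "search called with empty query"),
    Failure("run1", "s4", "missing_context", "no user timezone"),
]
issues = group_recurring(failures)
print(len(issues), issues[0].occurrence_count)  # 1 3
```

A failure seen in only one run stays out of the queue here, which mirrors the emphasis on problems that repeat across runs.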

The gate is the control point. It applies a fix only when checks, replay evidence, autonomy policy, and human locks allow it.

Run it your way. The same loop, the same gate. You pick the autonomy level:

Human-in-the-loop: Kyoko surfaces issues and drafts fixes, and you review and approve each change before it applies.
Fully autonomous: the policy auto-applies any change that clears replay, evals, and human locks, and parks anything that doesn't for you to look at.
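
The gate and the two autonomy levels can be summarized as one decision function. This is a sketch under assumptions: the Autonomy enum, the Evidence fields, and the returned strings are hypothetical labels for the conditions described above, not Kyoko's actual types or states.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    HUMAN_IN_THE_LOOP = "human_in_the_loop"
    FULLY_AUTONOMOUS = "fully_autonomous"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical inputs to the gate for one proposed fix."""
    checks_passed: bool           # deterministic checks
    replay_passed: bool           # failing traces rerun and evals compared
    human_locked: bool            # a human lock blocks any apply
    human_approved: bool = False  # explicit approval from a reviewer


def gate(evidence: Evidence, autonomy: Autonomy) -> str:
    """Return "apply", "awaiting_review", or "parked" for a proposed fix."""
    if evidence.human_locked:
        return "parked"
    if not (evidence.checks_passed and evidence.replay_passed):
        return "parked"
    if autonomy is Autonomy.FULLY_AUTONOMOUS:
        return "apply"
    # Human-in-the-loop: passing evidence is necessary but not sufficient.
    return "apply" if evidence.human_approved else "awaiting_review"


passing = Evidence(checks_passed=True, replay_passed=True, human_locked=False)
print(gate(passing, Autonomy.FULLY_AUTONOMOUS))   # apply
print(gate(passing, Autonomy.HUMAN_IN_THE_LOOP))  # awaiting_review
```

Both modes share the same evidence requirements; the autonomy level only decides whether a passing fix still waits for a person.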

Quick demo

Try Kyoko without wiring up an agent. The demo creates a local database, loads bundled fixture runs, and serves the dashboard.

pipx install kyoko
kyoko demo --db /tmp/kyoko-demo.db --json
kyoko serve --db /tmp/kyoko-demo.db

Open http://127.0.0.1:8765.

Requires Python 3.12 or newer. No live model, framework adapter, or replay server is needed for the demo.
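
Because the demo database is a local SQLite file, it can also be inspected with Python's standard library. The sketch below lists table names without assuming anything about Kyoko's schema; list_tables is a hypothetical helper, and the tables it reports depend on the Kyoko version that created the file.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the table names in a SQLite file, sorted alphabetically."""
    # Open read-only so inspection cannot modify the database.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

After running the demo commands above, calling list_tables("/tmp/kyoko-demo.db") prints whatever tables that demo created. Read-only mode raises an error if the file does not exist instead of silently creating an empty one.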

Get started

Run these from the root of your agent project (for example, the repo of your own AI agent, Hermes, or OpenClaw). Requires Python 3.12 or newer:

pipx install kyoko
kyoko project-bootstrap
kyoko serve

Open http://127.0.0.1:8765. Installing with pip install kyoko or uv tool install kyoko works too; see docs/INSTALL.md.

Bootstrap writes a local .kyoko/ workspace: database, scaffolds, MCP config, and operator presets. Every later kyoko command finds that database automatically, so no --db flags are needed inside your project.
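
One common way for a CLI to find a project workspace automatically is to walk up from the current directory until it finds a marker directory. The sketch below illustrates that pattern for a .kyoko/ directory; it is an assumption about how such discovery could work, not a description of Kyoko's actual lookup, and find_workspace is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def find_workspace(start: Path, marker: str = ".kyoko") -> Optional[Path]:
    """Return the nearest marker directory at or above start, or None."""
    start = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / marker
        if candidate.is_dir():
            return candidate
    return None
```

Calling find_workspace(Path.cwd()) from any subdirectory of a bootstrapped project would return that project's .kyoko/ path under this scheme, which is why commands run deeper in the tree would not need an explicit --db flag.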

Then wire up telemetry. This is the step that makes everything else work: Kyoko can only find and fix what it can see. The easiest way is to let your coding agent do the wiring:

kyoko install-skill   # then run /kyoko-instrument in your coding agent

This installs the bundled /kyoko-instrument skill into .claude/skills/ and .agents/skills/, where Claude Code and Codex pick it up automatically; for Cursor or other agents, kyoko install-skill --print prints the same playbook to paste in. The skill finds your agent's entry point, records one real run, and verifies it shows up in Kyoko.

To connect your agent over MCP instead, or to wire telemetry by hand (Python or TypeScript SDK, OTLP, importers), see Getting Started.

What you get

Run capture: Python SDK, TypeScript SDK, generated source adapters, OTLP/GenAI JSON, Hermes import, and OpenClaw import.
Issue queue: recurring failures grouped into evidence-backed issues with category, severity, occurrence count, and span links.
Fix proposals: accepted issues become validated LearningProposal records for context, skills, evals, o