by mnemon-dev
LLM-supervised persistent memory for AI agents — graph-based recall, cross-session knowledge, single binary. Works with Claude Code, OpenClaw, and any CLI agent.
# Add to your Claude Code skills
git clone https://github.com/mnemon-dev/mnemon
LLM agents forget everything between sessions. Context compaction drops critical decisions, cross-session knowledge vanishes, and long conversations push early information out of the window.
Mnemon gives your agent persistent, cross-session memory — a four-graph knowledge store with intent-aware recall, importance decay, and automatic deduplication. Single binary, zero API keys, one setup command.
Claude Max / Pro subscriber? Mnemon works entirely through your existing subscription — no separate API key required. Your LLM subscription is the intelligence layer. Two commands and you're done.
Most memory tools embed their own LLM inside the pipeline. Mnemon takes a different approach: your host LLM is the supervisor. The binary handles deterministic computation (storage, graph indexing, search, decay); the LLM makes judgment calls (what to remember, how to link, when to forget). No middleman, no extra inference cost.
| Pattern | LLM Role | Representative |
|---|---|---|
| LLM-Embedded | Executor inside the pipeline | Mem0, Letta |
| File-Based | None — reads file at session start | Claude Code Memory |
| MCP-Based | Tool provider via MCP protocol | claude-mem |
| LLM-Supervised | External supervisor of a standalone binary | Mnemon |
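The supervisor split can be sketched in a few lines (illustrative Python, not Mnemon's actual Go internals; `MemoryStore`, `supervised_remember`, and the decay formula are all assumptions): the store side is pure deterministic computation, including importance decay, while the judgment call is a callback standing in for the host LLM.

```python
import time

class MemoryStore:
    """Deterministic side: storage and importance decay (illustrative only)."""
    def __init__(self, half_life_days=30):
        self.items = []  # (text, importance, stored_at) tuples
        self.half_life = half_life_days * 86400  # seconds

    def remember(self, text, importance):
        self.items.append((text, importance, time.time()))

    def recall(self, now=None):
        now = now or time.time()
        # Exponential decay: importance halves every half_life seconds.
        scored = [(imp * 0.5 ** ((now - t) / self.half_life), text)
                  for text, imp, t in self.items]
        return [text for _, text in sorted(scored, reverse=True)]

def supervised_remember(store, candidate, judge):
    """LLM side: `judge` stands in for the host LLM's judgment call."""
    importance = judge(candidate)  # e.g. 0.0 (not worth keeping) .. 1.0 (critical)
    if importance > 0:
        store.remember(candidate, importance)

store = MemoryStore()
supervised_remember(store, "We chose SQLite as the first backend", lambda s: 0.9)
supervised_remember(store, "lunch was pizza", lambda s: 0.0)  # judged disposable
print(store.recall())  # only the judged-important memory survives
```

The point of the split is that the deterministic half never needs inference: decay and ranking run the same way whether the supervisor is Opus, a smaller model, or a test stub.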
Mnemon also addresses a gap in the protocol stack. MCP standardizes how LLMs discover and invoke tools. ODBC/JDBC standardizes how applications access databases. But how LLMs interact with databases using memory semantics — this layer has no protocol. Mnemon's three primitives — remember, link, recall — form an intent-native protocol: command names map to the LLM's cognitive vocabulary (remember not INSERT, recall not SELECT), and output is structured JSON with signal transparency rather than raw database rows.
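To make "structured JSON with signal transparency" concrete, here is a hypothetical recall response (every field name below is an assumption for illustration, not Mnemon's documented schema):

```python
import json

# Hypothetical recall output -- field names are illustrative only.
response = json.loads("""
{
  "query": "database backend decision",
  "results": [
    {
      "text": "We chose SQLite as the first storage adapter",
      "signals": {"relevance": 0.92, "importance": 0.80, "recency_decay": 0.95}
    }
  ]
}
""")

# Signal transparency: the agent sees *why* a memory ranked where it did,
# rather than receiving opaque database rows.
for r in response["results"]:
    print(r["text"], r["signals"]["relevance"])
```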
Memory has a compound interest effect — the longer it accumulates, the greater its value. LLM engines iterate constantly, skill files cost nearly nothing to write, but memory is a private asset that grows with the user. It is the only component in the agent ecosystem worth deep investment.
See Design & Architecture for details.
Homebrew (macOS / Linux):
brew install mnemon-dev/tap/mnemon
Go install:
go install github.com/mnemon-dev/mnemon@latest
From source:
git clone https://github.com/mnemon-dev/mnemon.git && cd mnemon
make install
Verify installation:
mnemon --version
mnemon setup
mnemon setup auto-detects Claude Code, then interactively deploys skill, hooks, and behavioral guide. Start a new session — memory just works.
mnemon setup --target openclaw --yes
One command deploys skill, hook, plugin, and behavioral guide to ~/.openclaw/. Restart the OpenClaw gateway to activate.
NanoClaw runs agents inside Linux containers. Use the /add-mnemon skill to integrate:
/add-mnemon — Claude Code will modify the Dockerfile, add a container skill, and set up volume mounts. The skill is available at .claude/skills/add-mnemon/ in the NanoClaw repo.
mnemon setup --eject
Once set up, memory operates transparently — you use your LLM CLI as usual. Mnemon integrates via Claude Code's hook system, injecting memory operations at key lifecycle points:
Session starts
│
▼
Prime (SessionStart) ─── prime.sh ──→ load guide.md (memory execution manual)
│
▼
User sends message
│
▼
Remind (UserPromptSubmit) ─── user_prompt.sh ──→ remind agent to recall & remember
│
▼
LLM generates response (guided by skill + guide.md rules)
│
▼
Nudge (Stop) ─── stop.sh ──→ remind agent to remember
│
▼
(when context compacts)
Compact (PreCompact) ─── compact.sh ──→ extract critical insights to remember
Four hooks drive the memory lifecycle. Prime loads the behavioral guide — a detailed execution manual for recall, remember, and sub-agent delegation. Remind prompts the agent to evaluate recall and remember before starting work. Nudge reminds the agent to consider remember after finishing work. Compact instructs the agent to extract and save critical insights before context compression. The skill file teaches command syntax. The guide (~/.mnemon/prompt/guide.md) defines the detailed rules for when to recall, what to remember, and how to delegate.
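As a mental model only (this is a dispatch-table sketch of the diagram above, not Claude Code's actual hook API), the lifecycle maps each event to a script and the prompt it injects:

```python
# Simplified model of the four-hook lifecycle. Script names come from the
# diagram above; the dispatch mechanism itself is illustrative.
HOOKS = {
    "SessionStart":     ("prime.sh",       "load ~/.mnemon/prompt/guide.md"),
    "UserPromptSubmit": ("user_prompt.sh", "remind agent to recall & remember"),
    "Stop":             ("stop.sh",        "remind agent to remember"),
    "PreCompact":       ("compact.sh",     "extract critical insights to remember"),
}

def fire(event):
    script, effect = HOOKS[event]
    return f"{script}: {effect}"

print(fire("SessionStart"))
```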
You don't run mnemon commands yourself. The agent does — driven by hooks and guided by the skill and behavioral guide.
- The three primitives (remember, link, recall) map to the LLM's cognitive vocabulary, not database syntax; output is structured JSON with signal transparency.
- remember auto-detects duplicates and conflicts, then skips or auto-replaces.

All your local agentic AIs — across sessions and frameworks — sharing one pool of live memory:
Claude Code ──┐
│
OpenClaw ─────┤
│
NanoClaw ─────┤
├──▶ ~/.mnemon ◀── shared memory
OpenCode ─────┤
│
Gemini CLI ───┘
The foundation is in place: a single ~/.mnemon database that any agent can read and write. Claude Code's hook integration is the reference implementation; OpenClaw uses a plugin-based approach; NanoClaw integrates via container skills and volume mounts. The same pattern can be replicated for any LLM CLI that supports event hooks or system prompts.
The longer-term direction is a memory gateway: protocol decoupled from storage engine. The current SQLite backend is the first adapter; the protocol surface (remember / link / recall) can sit on top of PostgreSQL, Neo4j, or any graph database. Agent-side optimization (when to recall, what to remember) and storage-side optimization (indexing, graph algorithms) evolve independently. See Future Direction for details.
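That decoupling is a classic adapter pattern. A minimal sketch, assuming a two-method protocol surface (names and schema are hypothetical; an in-memory SQLite table stands in for the current backend):

```python
import sqlite3
from typing import Protocol

class MemoryBackend(Protocol):
    """Hypothetical protocol surface: remember/recall stay fixed while the
    storage engine behind them can be swapped out."""
    def remember(self, text: str) -> int: ...
    def recall(self, query: str) -> list: ...

class SQLiteBackend:
    """First adapter: the shape a SQLite store could fill (illustrative)."""
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, text TEXT)")

    def remember(self, text):
        cur = self.db.execute("INSERT INTO memories (text) VALUES (?)", (text,))
        return cur.lastrowid

    def recall(self, query):
        cur = self.db.execute(
            "SELECT text FROM memories WHERE text LIKE ?", (f"%{query}%",))
        return [row[0] for row in cur.fetchall()]

# A PostgreSQL or Neo4j adapter would implement the same two methods;
# agent-side behavior never changes.
backend = SQLiteBackend()
backend.remember("use named stores per project")
print(backend.recall("named stores"))
```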
Do different sessions share memory?
Yes. By default, all sessions use the same default store — a decision remembered in one session is available in every future session.
Can I isolate memory per project or agent? Yes. Use named stores to separate memory:
mnemon store create work # create a new store
mnemon store set work # set as default
MNEMON_STORE=work mnemon recall "query" # or use env var per-process
Different agents/processes can use different stores via the MNEMON_STORE environment variable — no global state contention.
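The precedence rule is simple: the per-process environment variable wins over the configured default. A sketch (`resolve_store` is a hypothetical illustration, not Mnemon's code):

```python
def resolve_store(env, default="default"):
    """Per-process MNEMON_STORE wins; otherwise fall back to the
    configured default store. Illustrative sketch of the precedence rule."""
    return env.get("MNEMON_STORE", default)

print(resolve_store({"MNEMON_STORE": "work"}))  # -> work
print(resolve_store({}))                        # -> default
```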
Local or global mode?
mnemon setup defaults to local (project-scoped .claude/), recommended for most users. Global (mnemon setup --global, installed to ~/.claude/) activates mnemon across all projects — convenient if you want other frameworks (e.g., OpenClaw) to share memory by forwarding requests through Claude Code CLI, but may add maintenance overhead.
How do I customize the behavior?
Edit ~/.mnemon/prompt/guide.md. This file controls when the agent recalls memories and what it considers worth remembering. The skill file (SKILL.md) is auto-deployed and should not need manual editing.
What is sub-agent delegation?
Memory writes don't happen in the main conversation. The host LLM (e.g., Opus) decides what to remember, then delegates the actual mnemon remember call to a sub-agent.