by CodeAbra
The best-benchmarked open-source memory system for AI coding assistants
# Add to your Claude Code skills
git clone https://github.com/CodeAbra/iai-personal-memory-engine
Independent Autistic Intelligence — a local memory layer for Claude (and other MCP-compatible assistants).
A local server that speaks the MCP protocol and gives Claude, and any other MCP-compatible assistant, a long-term memory. It captures every turn of every session verbatim, organizes those captures over time into a personal map of who you are, and serves a small slice of relevant memory back at the start of each new conversation. You never have to say "remember this" or "what did we say last time?"
I built this for myself. It worked. I've been running it daily for months, and now I'm sharing it. The benchmarks were mostly for my own curiosity. I wanted to know if it actually works or if I'd just gotten used to it.
Windows and Linux are not supported yet, but I'm working on it.
git clone https://github.com/CodeAbra/iai-mcp.git
cd iai-mcp
bash scripts/install.sh
The installer creates a Python venv, installs dependencies (LanceDB, sentence-transformers, torch-hd, NetworkX, igraph), builds the TypeScript MCP wrapper, pre-downloads the default embedding model (~130 MB), symlinks the CLI to ~/.local/bin/iai-mcp, and on macOS registers the daemon with launchd.
Make sure ~/.local/bin is on your PATH:
export PATH="$HOME/.local/bin:$PATH" # add to ~/.zshrc or ~/.bashrc
iai-mcp --version
The capture hooks are what make memory ambient. Without them, iai-mcp reads memory but never writes conversation content and never injects recall at session start. One command wires all three:
iai-mcp capture-hooks install # copies all three hooks + patches ~/.claude/settings.json
iai-mcp capture-hooks status # verify: should print "status: ACTIVE"
iai-mcp capture-hooks uninstall # clean removal if ever needed
For Codex:
iai-mcp capture-hooks install --target codex
To install both:
iai-mcp capture-hooks install --target all
What the install does:
Copies the three hook scripts from deploy/hooks/ to ~/.claude/hooks/ (chmod +x):
- iai-mcp-turn-capture.sh (UserPromptSubmit, timeout 5s): appends each prompt and the preceding assistant turn(s) to a per-session buffer as pure file IO. Zero daemon RPC during the session.
- iai-mcp-session-capture.sh (Stop, timeout 35s): at session end, rolls the buffer over for the daemon to drain, and runs iai-mcp capture-transcript --no-spawn as a safety net.
- iai-mcp-session-recall.sh (SessionStart, timeout 30s): calls iai-mcp session-start and pipes the assembled memory prefix to stdout, which Claude Code injects as additionalContext before the first prompt. Fail-safe: an empty store or unreachable daemon yields empty stdout, so session start is never blocked.

What happens at runtime:
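The turn-capture hook is described as pure file IO with no daemon RPC. A minimal Python sketch of that idea follows; the shipped hook is a shell script, and the buffer file name, the JSON-lines layout, and the `append_turn` helper are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


def append_turn(buffer_dir: Path, session_id: str, role: str, text: str) -> Path:
    """Append one turn to a per-session buffer file and return the buffer path.

    Only the local filesystem is touched; nothing talks to the daemon.
    """
    buffer_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one JSON-lines file per session.
    buffer_path = buffer_dir / f"{session_id}.jsonl"
    record = {"ts": time.time(), "role": role, "text": text}
    # Append mode leaves earlier turns untouched.
    with buffer_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return buffer_path
```

Because each call only appends a line, the hook can finish well inside a short timeout, and the daemon can drain the file later at its own pace.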
Claude Code:
claude mcp add iai-mcp -- node "$(pwd)/mcp-wrapper/dist/index.js"
Or edit ~/.claude.json directly:
{
  "mcpServers": {
    "iai-mcp": {
      "command": "node",
      "args": ["/absolute/path/to/iai-mcp/mcp-wrapper/dist/index.js"]
    }
  }
}
Use the absolute path. ~ and $HOME won't expand here.
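If you prefer to generate the entry rather than type it, the sketch below resolves the wrapper path to an absolute one before building the config. The `with_iai_mcp_server` helper is hypothetical and not part of iai-mcp; it only builds the dictionary and leaves writing ~/.claude.json to you.

```python
from pathlib import Path


def with_iai_mcp_server(config: dict, repo_dir: str) -> dict:
    """Return a copy of `config` with an iai-mcp entry that uses an absolute path."""
    # expanduser() and resolve() run here because the client will not expand ~ or $HOME.
    entry_point = Path(repo_dir).expanduser().resolve() / "mcp-wrapper" / "dist" / "index.js"
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["iai-mcp"] = {"command": "node", "args": [str(entry_point)]}
    updated["mcpServers"] = servers
    return updated
```

Calling `with_iai_mcp_server(existing_config, "~/iai-mcp")` keeps any other servers already present and adds the iai-mcp entry next to them.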
For Claude Desktop (untested), edit ~/Library/Application Support/Claude/claude_desktop_config.json.
Codex CLI:
[mcp_servers.iai-mcp]
command = "node"
args = ["/absolute/path/to/iai-mcp/mcp-wrapper/dist/index.js"]
[mcp_servers.iai-mcp.env]
IAI_MCP_PYTHON = "/absolute/path/to/iai-mcp/.venv/bin/python"
IAI_MCP_STORE = "/Users/you/.iai-mcp"
TRANSFORMERS_VERBOSITY = "error"
TOKENIZERS_PARALLELISM = "false"
Codex hooks are stable in current Codex CLI builds. If hooks are disabled by
local policy or an older install, enable [features].hooks = true in
~/.codex/config.toml.
iai-mcp doctor
iai-mcp daemon status
Restart Claude Code. Start a session, do some work, exit. Then:
tail ~/.iai-mcp/logs/capture-$(date -u +%Y-%m-%d).log
You should see an rc=0 line. That's your first memory.
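The same check can be scripted. This sketch builds the dated log path the way the `tail` command does (UTC date) and looks for an rc=0 token; the exact layout of a log line is not documented here, so the pattern match is an assumption.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def capture_log_path(store: Path, now: datetime) -> Path:
    """Path of the capture log for the UTC date of `now` (pass a timezone-aware datetime)."""
    day = now.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return store / "logs" / f"capture-{day}.log"


def has_successful_capture(log_text: str) -> bool:
    """True if any line contains a standalone rc=0 token (assumed log convention)."""
    return any(re.search(r"\brc=0\b", line) for line in log_text.splitlines())
```

For example, `has_successful_capture(capture_log_path(Path.home() / ".iai-mcp", datetime.now(timezone.utc)).read_text())` would report whether today's log contains a successful capture.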
You do not call iai-mcp directly during a session. Once it's connected:
Capture is automatic. Every turn, yours and the assistant's, is recorded verbatim with timestamps and session metadata. You don't say "remember this."
Recall is automatic. When a new session starts, the daemon assembles a small relevant slice of your history and injects it into the conversation prefix. You don't say "what did we say."
Consolidation runs idle. Between sessions, the daemon merges duplicates, strengthens recall pathways for things retrieved often, and prunes weak edges. The system gets quietly better at remembering you over time.
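Automatic recall is only safe if it can never block a session from starting. A rough Python sketch of that fail-safe behaviour is shown here; the real hook is a shell script, and the `recall_prefix` wrapper and its defaults are illustrative assumptions (only the `iai-mcp session-start` command and the 30 s timeout come from the hook description).

```python
import subprocess
from typing import Sequence


def recall_prefix(
    command: Sequence[str] = ("iai-mcp", "session-start"),
    timeout_s: float = 30.0,
) -> str:
    """Return the memory prefix printed by `command`, or "" on any failure."""
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, unreachable daemon, or a hang: degrade to no recall.
        return ""
    # A non-zero exit is treated the same way as an empty store.
    return result.stdout if result.returncode == 0 else ""
```

Every failure path collapses to an empty string, so the worst case is a session that starts without injected memory.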
After a few weeks of regular use the difference becomes noticeable. The assistant stops asking the same orientation questions, references things you mentioned in passing, and adapts to your style without being told.
The daemon is a Python process that runs in the background. Your MCP client connects to it via a Unix socket. No network exposure.
Memory is stored in three tiers:
Episodic is verbatim, timestamped fragments of what was said. Write-once, never overwritten or rewritten.
Semantic is summaries induced from clusters of related episodes during idle-time consolidation.
Procedural is a small set of stable parameters about you, learned over time: preferences, style cues, recurring patterns. Eleven sealed knobs that shift based on what works.
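One way to picture the three tiers is as three small data types. This is a sketch under assumptions: the class and field names are invented for illustration, the knob names are not documented here, and the real store lives in LanceDB, not in a Python dict.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Tuple


@dataclass(frozen=True)
class Episode:
    """Verbatim, timestamped fragment. Frozen, so it cannot be edited in place."""
    episode_id: str
    text: str
    recorded_at: datetime


@dataclass
class SemanticSummary:
    """Summary induced from a cluster of related episodes."""
    summary: str
    source_episode_ids: Tuple[str, ...]


@dataclass
class ProceduralProfile:
    """Small set of learned parameters; the knob names are placeholders."""
    knobs: Dict[str, float] = field(default_factory=dict)


class EpisodicStore:
    """Write-once store: new ids can be added, existing ids are never replaced."""

    def __init__(self) -> None:
        self._episodes: Dict[str, Episode] = {}

    def add(self, episode: Episode) -> None:
        if episode.episode_id in self._episodes:
            raise ValueError(f"episode {episode.episode_id!r} already exists")
        self._episodes[episode.episode_id] = episode

    def get(self, episode_id: str) -> Episode:
        return self._episodes[episode_id]
```

The frozen dataclass and the duplicate-id check together model the "write-once, never overwritten" property of the episodic tier.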
A background pass runs periodically (sleep cycles): it clusters episodes, builds semantic summaries, decays old unreinforced connections, and reinforces frequently co-retrieved paths. Things you haven't revisited fade naturally. There's an optional "insight of the day" step that makes one Anthropic API call, but it's off by default.
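The decay, reinforce, and prune steps can be sketched on a plain dictionary of edge weights. The exponential half-life, the additive boost, and the pruning threshold below are assumed values chosen for illustration, not the daemon's actual parameters.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def consolidate_edges(
    weights: Dict[Edge, float],
    co_retrieved: Iterable[Edge],
    days_idle: float,
    half_life_days: float = 30.0,   # assumed decay half-life
    boost: float = 0.1,             # assumed reinforcement step
    prune_below: float = 0.05,      # assumed pruning threshold
) -> Dict[Edge, float]:
    """Return new edge weights after one consolidation pass."""
    decay = 0.5 ** (days_idle / half_life_days)
    reinforced = set(co_retrieved)
    updated: Dict[Edge, float] = {}
    for edge, weight in weights.items():
        if edge in reinforced:
            # Frequently co-retrieved paths get stronger, capped at 1.0.
            new_weight = min(1.0, weight + boost)
        else:
            # Unreinforced connections fade with idle time.
            new_weight = weight * decay
        if new_weight >= prune_below:
            updated[edge] = new_weight
    return updated
```

With a 30-day half-life, an unreinforced edge of weight 0.5 drops to 0.25 after 30 idle days, while an edge at 0.08 falls to 0.04 and is pruned.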
Recall combines three signals, semantic similarity, graph-link strength, and recency, into a single ranking.
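A simple way to rank on three signals at once is a weighted sum. The sketch below assumes that form; the weights, the recency half-life, and the `Candidate` type are illustrative assumptions, and the actual scoring function may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    memory_id: str
    similarity: float     # semantic similarity, assumed scaled to [0, 1]
    link_strength: float  # graph-link strength, assumed scaled to [0, 1]
    age_days: float       # time since the memory was last touched


def recall_score(
    c: Candidate,
    w_sim: float = 0.6,
    w_link: float = 0.25,
    w_recency: float = 0.15,
    half_life_days: float = 14.0,
) -> float:
    """Combine the three signals into one score (assumed weighted sum)."""
    recency = 0.5 ** (c.age_days / half_life_days)
    return w_sim * c.similarity + w_link * c.link_strength + w_recency * recency


def rank(candidates: List[Candidate], top_k: int = 5) -> List[Candidate]:
    """Return the top_k candidates by combined score, best first."""
    return sorted(candidates, key=recall_score, reverse=True)[:top_k]
```

Under these weights a fresh, well-linked memory can outrank an older one with higher raw similarity, which is the point of ranking the signals together.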
All records are encrypted at rest with AES-256-GCM. The key lives in ~/.iai-mcp/.key (mode 0600). Back it up. Lose the key, lose the memories.
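Since losing the key means losing the store, a small backup routine is worth having. This sketch checks the 0600 mode and copies the key file; the `backup_key` helper, the backup file name, and the choice to refuse a key with looser permissions are assumptions, not part of the iai-mcp CLI.

```python
import os
import shutil
import stat
from pathlib import Path


def key_file_is_private(path: Path) -> bool:
    """True if the file's permission bits are exactly 0600."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600


def backup_key(key_path: Path, backup_dir: Path) -> Path:
    """Copy the key into backup_dir with mode 0600 and return the copy's path."""
    if not key_file_is_private(key_path):
        raise PermissionError(f"{key_path} is not mode 0600")
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / "iai-mcp.key.bak"  # hypothetical backup file name
    shutil.copy2(key_path, dest)
    os.chmod(dest, 0o600)  # keep the copy as private as the original
    return dest
```

A call such as `backup_key(Path.home() / ".iai-mcp" / ".key", Path("/Volumes/backup/keys"))` would place a private copy on another volume; the destination path here is only an example.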
Everything lives at ~/.iai-mcp/. Embeddings are computed locally with bge-small-en-v1.5. The only data that leaves the machine is your normal conversation with whatever LLM API your client uses.
Claude Code <--MCP-stdio--> TypeScript wrapper <--UNIX socket--> Python daemon <--> LanceDB
I made these because I wanted honest numbers. Every harness ships in bench/. Run them on your machine, get your own results.
| Metric | Target | Measured |
|---|---|---|
| Verbatim recall (byte-exact) | >=99% | >=99% at N=10k |
| Recall p95 latency | <100 ms | <100 ms at N=10k |
| RAM at steady state | <=300 MB | ~150-300 MB |
| Session-start tokens (warm cache) | <=3,000 | <=3,000 |
| Session-start tokens (cold) | <=8,000 | <=8,000 |
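For readers unfamiliar with the p95 figure: it is the latency that 95% of recall calls stay at or below. The sketch uses the nearest-rank definition, which is an assumption; the harness in bench/ may compute its percentile differently.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

With 20 latency samples, the p95 is the 19th smallest value, so a single slow outlier does not move it but two would.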
python -m bench.verbatim # verbatim fidelity
python -m bench.neural_map # recall latency
python -m bench.memory_footprint # RAM usage
python -m bench.tokens # session-start cost
python -m bench.total_session_cost # full 10-turn cost
pyth