A living memory system that ingests long-horizon data to infer insights, enabling more decisive action, all while running on a single SQLite file locally.
# Add to your Claude Code skills
git clone https://github.com/buildingjoshbetter/TrueMemoryLast scanned: 6/7/2026
{
"issues": [],
"status": "PASSED",
"scannedAt": "2026-06-07T07:56:25.878Z",
"npmAuditRan": true,
"pipAuditRan": true
}No comments yet. Be the first to share your thoughts!
30 days in the Featured rail
Step 1. Open a terminal.
Cmd + Space, type TerminalCtrl + Alt + T
Step 2. Paste this and press Enter:
Mac / Linux:
curl -LsSf https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.sh | sh
Windows (PowerShell):
irm https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.ps1 | iex
Step 3. Wait 3-5 minutes for installation.
Step 4. Quit Claude completely and reopen it (Mac: Cmd+Q, Windows: right-click the Claude tray icon → Quit).
Step 5. Type "Set up TrueMemory" and pick Edge, Base, or Pro.
That's it. TrueMemory remembers your conversations automatically from here.
uv tool upgrade truememory in Terminal, then restart Claudeuv tool uninstall truememory⭐ Tip: Quit your Claude sessions when you're done prompting. TrueMemory's memory hook runs when the session ends, so it can capture and store the full conversation.
Installs uv (Astral's Python tool manager) if needed, fetches a managed Python 3.12, installs TrueMemory with all tier models into an isolated tool environment, registers the MCP server, wires up lifecycle hooks, and merges instructions into ~/.claude/CLAUDE.md. Your system Python is never touched. No sudo, no venvs, no pip struggle.
It's ~200 lines of shell, no sudo, stays entirely under $HOME:
curl -LsSf https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.sh -o install.sh && less install.sh && sh install.sh
If you're embedding TrueMemory in your own Python project (requires Python 3.10+):
pip install truememory
pip installinstalls the Python library only. It does NOT register the MCP server, install hooks, or configure Claude. For Claude Code / Claude Desktop, always use thecurl | shinstaller above.
from truememory import Memory
m = Memory()
m.add("Prefers dark mode and TypeScript", user_id="alex")
m.add("Allergic to peanuts", user_id="alex")
results = m.search("What are Alex's preferences?", user_id="alex")
print(results[0]["content"])
# "Prefers dark mode and TypeScript"
The database is created automatically at ~/.truememory/memories.db.
Same architecture, three tiers. All included in a single install — switch anytime.
| | Edge | Base | Pro | |---|------|------|-----| | LoCoMo (3-run mean) | 89.6% | 92.0% | 93.0% | | LongMemEval (3-run mean) | | | 92.0% | | BEAM-1M (3-run mean) | | | 76.6% (SOTA) | | BEAM-10M (single run) | | | 65.0% | | Embedder | Model2Vec potion-base-8M (8M params, 256d) | Qwen3-Embedding-0.6B (600M params, 256d) | Qwen3-Embedding-0.6B (600M params, 256d) | | Reranker | MiniLM-L-6-v2 (22M) | gte-reranker-modernbert (149M) | gte-reranker-modernbert (149M) | | HyDE | off | off | on (requires LLM API key) | | Runs on | Any machine, CPU only | 4GB+ RAM, CPU or GPU | 4GB+ RAM + LLM API key | | Model size | ~30MB | ~1.5GB | ~1.5GB |
Edge works everywhere. Base is the strongest fully-offline tier. Pro adds HyDE query expansion for the highest scores.
TrueMemory uses a 6-layer retrieval pipeline inspired by how the brain encodes and recalls memory, plus an encoding gate that decides what gets stored.
Every conversation flows through three stages before anything is stored:
| Stage | What it does | |-------|-------------| | LLM Extractor | Pulls atomic facts from raw conversation text. Classifies each as personal, preference, decision, correction, temporal, technical, or relationship. | | Encoding Gate | Three-signal filter: compression novelty + speech-act salience + embedding pair-diff prediction error. Rejects noise, keeps signal. | | Dedup | ADD / UPDATE / SKIP against existing memories using vector similarity + word-overlap Jaccard matching. Prevents duplicates and catches rephrased versions of the same fact. |
Three signals decide whether a fact gets stored or skipped:
| Signal | What it measures | |--------|-----------------| | Compression Novelty | gzip-based information gain. How much new information does this fact add compared to what's already stored? | | Speech-Act Salience | Rule-based scorer for short messages, learned scorer for longer text. Filters out greetings, reactions, and filler. | | Embedding Pair-Diff | Embedding divergence between the message and existing memories on the same topic. Detects when someone says something different about a known subject. |
The weighted sum of all three must exceed a threshold (default 0.30) to be stored. Everything below gets skipped. One SQLite file at ~/.truememory/memories.db. Portable. Backupable. cp it anywhere.
When you ask a question, six layers work together:
| Layer | Name | What it does | |-------|------|-------------| | L0 | Personality | Char-n-gram style vectors + entity profiles. Answers "what kind of person is X?" questions that keyword search can't touch. | | L2 | Episodic | FTS5 full-text keyword search with temporal filtering. Fast, broad recall. | | L3 | Semantic | Dense vector search (Model2Vec or Qwen3 by tier) + RRF fusion + cross-encoder reranking. The heavy lifter. | | L4 | Salience | Noise filtering + entity boosting. Learned 13-feature logistic regression scorer trained on retrieval-utility labels. | | L5 | Consolidation | Structured fact summaries, contradiction resolution, and surprise-weighted reranking (alpha=0.2). | | + | Reranker | Cross-encoder reranking (MiniLM or gte-modernbert by tier) for final precision. |
Tested on LoCoMo (1,540 questions, 10 conversations), LongMemEval (500 questions, multi-session), BEAM-1M (700 questions, 35 conversations at 1M+ tokens), and BEAM-10M (200 questions, 10 conversations at 10M tokens). All systems share the same answer model (GPT-4.1-mini), judge (GPT-4o-mini, 3x majority vote), and scoring pipeline.
| Tier | Overall | Single-hop | Multi-hop | Temporal | Open-domain | |------|---------|------------|-----------|----------|-------------| | Edge | 89.6% | 88.7% | 88.5% | 79.2% | 91.4% | | Base | 92.0% | 91.5% | 91.3% | 82.3% | 93.9% | | Pro | 93.0% | 92.6% | 90.0% | 86.5% | 95.4% |
| Category | Score | |----------|-------| | Preference following | 97.1% | | Contradiction resolution | 91.4% | | Information extraction | 91.4% | | Summarization | 89.5% | | Instruction following | 84.8% | | Abstention | 82.4% | | Knowledge update | 77.6% | | Multi-session reasoning | 67.1% | | Temporal reasoning | 64.8% | | Event ordering | 19.5% | | Overall | 76.6% |
| Variant | Accuracy | Correct/500 | |---------|----------|-------------| | Oracle | 92.0% | 460 | | Strict (_s) | 87.8% | 439 |