by qualixar
The world's first local-only AI memory to break 74% retrieval (and 60% zero-LLM) on LoCoMo. No cloud, no APIs, no data leaves your machine. Mode C (cloud LLM) additionally reaches 87.7% on LoCoMo. Research-backed: arXiv 2603.14588.
# Add to your Claude Code skills
git clone https://github.com/qualixar/superlocalmemory

name: superlocalmemory
description: "AI agent memory with mathematical foundations. Store, recall, search, and manage memories locally with zero cloud dependency."
version: "3.3.23"
author: "Varun Pratap Bhardwaj"
license: Elastic-2.0
homepage: https://superlocalmemory.com
repository: https://github.com/qualixar/superlocalmemory
triggers:
AI agent memory that runs 100% locally. Four-channel retrieval (semantic, graph, BM25, temporal) with mathematical similarity scoring. No cloud, no API keys, EU AI Act compliant.
pip install superlocalmemory
# or
npm install -g superlocalmemory
slm remember "Alice works at Google as a Staff Engineer" --json
slm recall "Who is Alice?" --json
slm status --json
All data-returning commands support --json for structured agent-native output.
slm remember "<content>" --json # Store a memory
slm remember "<content>" --tags "a,b" --json
slm recall "<query>" --json # Semantic search
slm recall "<query>" --limit 5 --json
slm list --json -n 20 # List recent memories
slm forget "<query>" --json # Preview matches (add --yes to delete)
slm forget "<query>" --json --yes # Delete matching memories
slm delete <fact_id> --json --yes # Delete specific memory by ID
slm update <fact_id> "<content>" --json # Update a memory
slm status --json # System status (mode, profile, DB)
slm health --json # Math layer health
slm trace "<query>" --json # Recall with per-channel breakdown
slm mode --json # Get current mode
slm mode a --json # Set mode (a=local, b=ollama, c=cloud)
slm profile list --json # List profiles
slm profile switch <name> --json # Switch profile
slm profile create <name> --json # Create profile
slm connect --json # Auto-configure IDEs
slm connect --list --json # List supported IDEs
slm setup # Interactive setup wizard
slm mcp # Start MCP server (for IDE integration)
slm dashboard # Open web dashboard
slm warmup # Pre-download embedding model
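Because every data-returning command takes --json, an agent can drive the CLI programmatically. A minimal sketch, assuming `slm` is on PATH after installation (the helper names `slm_argv` and `slm_json` are mine, not part of the package):

```python
import json
import shutil
import subprocess

def slm_argv(*args):
    """Build the argv for an agent-native slm call, always appending --json."""
    return ["slm", *args, "--json"]

def slm_json(*args):
    """Run an slm command and parse its JSON output.

    Raises FileNotFoundError when the slm CLI is not installed.
    """
    if shutil.which("slm") is None:
        raise FileNotFoundError("slm CLI not found on PATH")
    proc = subprocess.run(slm_argv(*args), capture_output=True, text=True)
    return json.loads(proc.stdout)

# Example (requires an installed slm):
# slm_json("recall", "Who is Alice?", "--limit", "5")
```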
Every --json response follows a consistent envelope:
{
"success": true,
"command": "recall",
"version": "3.0.22",
"data": {
"results": [
{"fact_id": "abc123", "score": 0.87, "content": "Alice works at Google"}
],
"count": 1,
"query_type": "semantic"
},
"next_actions": [
{"command": "slm list --json", "description": "List recent memories"}
]
}
Error responses:
{
"success": false,
"command": "recall",
"version": "3.0.22",
"error": {"code": "ENGINE_ERROR", "message": "Description of what went wrong"}
}
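Given that envelope shape, client code can branch on `success` before touching `data`. A minimal sketch (the helper name `unwrap_envelope` is mine, not part of the package; the sample envelope is copied from the docs above):

```python
import json

def unwrap_envelope(raw):
    """Parse an slm --json envelope, returning its `data` field on success.

    Raises RuntimeError carrying the documented error code and message.
    """
    envelope = json.loads(raw) if isinstance(raw, str) else raw
    if not envelope.get("success"):
        err = envelope.get("error", {})
        raise RuntimeError(f"{err.get('code')}: {err.get('message')}")
    return envelope.get("data", {})

# Success envelope, shape taken from the example above:
ok = ('{"success": true, "command": "recall", "version": "3.0.22", '
      '"data": {"results": [{"fact_id": "abc123", "score": 0.87, '
      '"content": "Alice works at Google"}], "count": 1, '
      '"query_type": "semantic"}}')
data = unwrap_envelope(ok)
```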
| Mode | Description | Cloud Required |
|------|-------------|----------------|
| A | Local Guardian -- zero cloud, zero LLM, EU AI Act compliant | None |
| B | Smart Local -- local Ollama LLM, data stays on your machine | Local only |
| C | Full Power -- cloud LLM for maximum accuracy | Yes |
SuperLocalMemory works via both MCP and CLI:
--json for scripts, CI/CD, agent frameworks (OpenClaw, Codex, Goose)

Part of Qualixar | Author: Varun Pratap Bhardwaj (qualixar.com | varunpratap.com)
Every major AI memory system — Mem0, Zep, Letta, EverMemOS — sends your data to cloud LLMs for core operations. That means latency on every query, cost on every interaction, and after August 2, 2026, a compliance problem under the EU AI Act.
SuperLocalMemory V3 takes a different approach: mathematics instead of cloud compute. Three techniques from differential geometry, algebraic topology, and stochastic analysis replace the work that other systems need LLMs to do — similarity scoring, contradiction detection, and lifecycle management. The result is an agent memory that runs entirely on your machine, on CPU, with no API keys, and still outperforms funded alternatives.
The numbers (evaluated on LoCoMo, the standard long-conversation memory benchmark):
| System | Score | Cloud Required | Open Source | Funding |
|:-------|:-----:|:--------------:|:-----------:|:-------:|
| EverMemOS | 92.3% | Yes | No | — |
| Hindsight | 89.6% | Yes | No | — |
| SLM V3 Mode C | 87.7% | Optional | Yes (EL2) | $0 |
| Zep v3 | 85.2% | Yes | Deprecated | $35M |
| SLM V3 Mode A | 74.8% | No | Yes (EL2) | $0 |
| Mem0 | 64.2% | Yes | Partial | $24M |
Mode A scores 74.8% with zero cloud dependency — outperforming Mem0 by 16 percentage points without a single API call. On open-domain questions, Mode A scores 85.0% — the highest of any system in the evaluation, including cloud-powered ones. Mode C reaches 87.7%, matching enterprise cloud systems.
Mathematical layers contribute +12.7 percentage points on average across 6 conversations (n=832 questions), with up to +19.9pp on the most challenging dialogues. This isn't more compute — it's better math.
Upgrading from V2 (2.8.6)? V3 is a complete architectural reinvention — new mathematical engine, new retrieval pipeline, new storage schema. Your existing data is preserved but requires migration. After installing V3, run slm migrate to upgrade your data. Read the Migration Guide before upgrading. A backup is created automatically.
V3.3 gives your memory a lifecycle. Memories strengthen when used, fade when neglected, compress when idle, and consolidate into reusable patterns — all automatically, all locally. Your agent gets smarter the longer it runs.
# Run a memory lifecycle review — strengthens active memories, archives neglected ones
slm decay
# Run smart compression — adapts embedding precision to memory importance
slm quantize
# Extract reusable patterns from memory clusters
slm consolidate --cognitive
# View auto-learned patterns that get injected into agent context
slm soft-prompts
# Clean up orphaned SLM processes
slm reap
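Those maintenance commands can be chained into a periodic review job. A hedged sketch of one possible sequence (the command names come from the list above; the `runner` hook exists only so the sequence can be exercised without the CLI installed):

```python
import subprocess

# Lifecycle steps in the order listed above.
LIFECYCLE_STEPS = [
    ["slm", "decay"],
    ["slm", "quantize"],
    ["slm", "consolidate", "--cognitive"],
    ["slm", "reap"],
]

def run_lifecycle_review(runner=subprocess.run):
    """Run each lifecycle step, stopping at the first non-zero exit.

    Returns the names of the subcommands that completed successfully.
    """
    completed = []
    for argv in LIFECYCLE_STEPS:
        if runner(argv).returncode != 0:
            break
        completed.append(argv[1])
    return completed
```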
| Tool | Description |
|:-----|:------------|
| forget | Programmatic memory archival via lifecycle rules |
| quantize | Trigger smart compression on demand |
| consolidate_cognitive | Extract and store patterns from memory clusters |
| get_soft_prompts | Retrieve auto-learned patterns for context injection |
| reap_processes | Clean orphaned SLM processes |
| get_retention_stats | Memory lifecycle analytics |
| Metric | V3.2 | V3.3 | Change |
|:-------|:----:|:----:|:------:|
| RAM usage (Mode A/B) | ~4GB | ~40MB | 100x reduction |
| Retrieval channels | 5 | 6 | +Hopfield completion |
| MCP tools | 29 | 35 | +6 new |
| CLI commands | 21 | 26 | +5 new |
| Dashboard tabs | 20 | 23 | +3 new |
| API endpoints | 9 | 16 | +7 new |
Embedding migration happens automatically when you switch modes — no manual steps needed.
Three new tabs: Memory Lifecycle (retention curves, decay stats), Compression (storage savings, precision distribution), and Patterns (auto-learned soft prompts, consolidation history). Seven new API endpoints power the new views.
All new features default OFF. Zero breaking changes. Opt in when ready:
# Turn on adaptive memory lifecycle
slm config set lifecycle.enabled true
# Turn on smart compression
slm config set quantization.enabled true
# Turn on cognitive consolidation
slm config set consolidation.cognitive.enabled true
# Turn on pattern learning (soft prompts)
slm config set soft_prompts.enabled true
# Turn on Hopfield retrieval (6th channel)
slm config set retrieval.hopfield.enabled true
# Or enable everything at once
slm config set v33_features.all true
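Since each feature maps to a single config key, the opt-in commands above can also be generated by a script. A small sketch (the flag names are taken verbatim from the commands above; `enable_commands` is a hypothetical helper, not part of the package):

```python
# V3.3 feature flags, as listed above; each defaults to off.
V33_FLAGS = [
    "lifecycle.enabled",
    "quantization.enabled",
    "consolidation.cognitive.enabled",
    "soft_prompts.enabled",
    "retrieval.hopfield.enabled",
]

def enable_commands(flags=V33_FLAGS):
    """Return one `slm config set <flag> true` invocation per feature."""
    return [["slm", "config", "set", flag, "true"] for flag in flags]
```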
Fully backward compatible. All existing MCP tools, CLI commands, and configs work unchanged. New tables are created automatically on first run. No migration needed.
100x faster recall (<10ms at 10K facts), automatic memory surfacing, associative retrieval (5th channel), temporal intelligence with bi-temporal validity, sleep-time consolidation, and core memory blocks. All features default OFF, zero breaking changes.
| Metric | V3.0 | V3.2 | Change |
|:-------|:----:|:----:|:------:|
| Recall latency (10K facts) | ~500ms | <10ms | 100x faster |
| Retrieval channels | 4 | 5 | +spreading activation |
| MCP tools | 24 | 29 | +5 new |
| DB tables | 9 | 18 | +9 new |
Enable with slm config set v32_features.all true. See the V3.2 Overview wiki page for details.
npm install -g superlocalmemory
slm setup # Choose mode (A/B/C)
slm doctor # Verify everything is working
slm warmup # Pre-download embedding model (~500MB, optional)
pip install superlocalmemory
slm remember "Alice works at Google as a Staff Engineer"
slm recall "What does Alice do?"
slm status
{
"mcpServers": {
"superlocalmemory": {
"command": "slm",
"args": ["mcp"]
}
}
}
35 MCP tools + 7 resources available. Works with Claude Code, Cursor, Windsurf, VS Code Copilot, Continue, Cody, ChatGPT Desktop, Gemini CLI, JetBrains, Zed, and 17+ AI tools. V3.3: Adaptive lifecycle, smart compression, and pattern learning.
SLM works everywhere -- from IDEs to CI pipelines to Docker containers. The only AI memory system with both MCP and agent-native CLI.
| Need | Use | Example |
|------|-----|---------|
| IDE integration | MCP | Auto-configured for 17+ IDEs |