Single-file memory layer for AI agents, sub-millisecond RAG on Apple Silicon. Metal-optimized, on-device. No Server. No API. One File. Pure Swift.
# Add to your Claude Code skills
git clone https://github.com/christopherkarani/Wax

Wax is a Swift-native persistence engine for AI agents. It stores documents, embeddings, and structured knowledge in a single portable .wax file.
The goal is simple: keep memory local, keep setup light, and make recall fast enough that it can stay in the loop.
| Feature | Wax | SQLite (FTS5) | Cloud Vector DBs |
|:-----------------|:-----------------------|:-----------------------|:-----------------------|
| Search | Hybrid (Text + Vector) | Text Only* | Vector Only* |
| Latency | ~6ms (p95) | ~10ms (p95) | 150ms - 500ms+ |
| Privacy | 100% Local | 100% Local | Cloud-hosted |
| Setup | Zero Config | Low | Complex (API Keys) |
| Architecture | Apple Silicon Native | Generic | Varies |
.wax file?

Most RAG setups end up with a database, a vector store, and a file server. Wax reduces the number of moving pieces by bundling documents, metadata, and indexes into one binary file.
Wax is tuned for M-series hardware and local recall.
Lower is better. Measured in milliseconds.
Wax (Hybrid) |██ 6.1ms
SQLite (Text) |████ 12ms
Cloud RAG |██████████████████████████████████████████████████ 150ms+
Lower is better. Measured in milliseconds.
Wax |███ 9.2ms
Traditional |██████████████████████████████████████ 120ms+
> [!TIP]
> Ingest Throughput: Wax handles 85.9 docs/s with full hybrid indexing on an M3 Max. Full benchmark report: Resources/docs/benchmarks/2026-03-06-performance-results.md
Wax uses a frame-based container format and embeds the search engines it needs inside the main file: SQLite FTS5 for text and a Metal-accelerated HNSW index for vectors.
┌──────────────────────────────────────────────────────────────────────────┐
│ Dual Header Pages (A/B) │
│ (Magic, Version, Generation, Pointers to WAL & TOC, Checksums) │
├──────────────────────────────────────────────────────────────────────────┤
│ WAL (Write-Ahead Log) │
│ (Atomic ring buffer for crash-resilient uncommitted mutations) │
├──────────────────────────────────────────────────────────────────────────┤
│ Compressed Data Frames │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Frame 0 (LZ4) │ │ Frame 1 (LZ4) │ │ Frame 2 (LZ4) │ ... │
│ │ [Raw Document] │ │ [Metadata/JSON] │ │ [System Info] │ │
│ └──────────────────┘ └──────────────────┘ └──────────────────┘ │
├──────────────────────────────────────────────────────────────────────────┤
│ Hybrid Search Indices │
│ ┌──────────────────────────────┐ ┌──────────────────────────────┐ │
│ │ SQLite FTS5 Blob │ │ Metal HNSW Index │ │
│ │ (Text Search + EAV Facts) │ │ (Vector Search) │ │
│ └──────────────────────────────┘ └──────────────────────────────┘ │
├──────────────────────────────────────────────────────────────────────────┤
│ TOC (Table of Contents) │
│ (Index of all frames, parent-child relations, and engine manifests) │
└──────────────────────────────────────────────────────────────────────────┘
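The dual header pages in the layout above suggest a familiar crash-safety pattern: write the new header to the inactive slot, then on open pick whichever slot passes its checksum and carries the higher generation. The sketch below illustrates that pattern only. `HeaderPage`, its fields, `activeHeader`, and the FNV-1a checksum are hypothetical stand-ins chosen for the example, not Wax's actual on-disk format or API.

```swift
// Hypothetical header page. Field names and the checksum are illustrative
// assumptions, not Wax's real layout.
struct HeaderPage {
    var generation: UInt64   // bumped on every committed header write
    var tocOffset: UInt64    // where the table of contents starts
    var checksum: UInt32     // covers the two fields above

    /// Toy checksum (FNV-1a over the little-endian bytes of both fields).
    /// A real format would more likely use CRC32 or a cryptographic hash.
    static func computeChecksum(generation: UInt64, tocOffset: UInt64) -> UInt32 {
        var hash: UInt32 = 2166136261
        for value in [generation, tocOffset] {
            for shift in stride(from: 0, to: 64, by: 8) {
                hash ^= UInt32((value >> UInt64(shift)) & 0xFF)
                hash = hash &* 16777619
            }
        }
        return hash
    }

    init(generation: UInt64, tocOffset: UInt64) {
        self.generation = generation
        self.tocOffset = tocOffset
        self.checksum = Self.computeChecksum(generation: generation, tocOffset: tocOffset)
    }

    /// A page is valid when its stored checksum matches its contents.
    var isValid: Bool {
        checksum == Self.computeChecksum(generation: generation, tocOffset: tocOffset)
    }
}

/// Returns the valid page with the highest generation, or nil if both are corrupt.
func activeHeader(_ a: HeaderPage, _ b: HeaderPage) -> HeaderPage? {
    [a, b].filter(\.isValid).max { $0.generation < $1.generation }
}
```

Under these assumptions, a torn write to one slot leaves the other slot intact, so a reader falls back to the previous generation instead of failing to open the file.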
```swift
import Wax

// Use a sandbox-safe, writable location (works in apps and CLI tools)
let url = URL.documentsDirectory.appending(path: "agent.wax")

// 1. Open a memory store
let memory = try await Memory(at: url)

// 2. Save a memory
try await memory.save("The user is building a habit tracker in SwiftUI.")

// 3. Search with hybrid recall (text + vector)
let results = try await memory.search("What is the user building?")
if let best = results.items.first {
    print("Found: \(best.text)")
    print("Document ID: \(best.metadata["id"] ?? "unknown")")
    // → "Found: The user is building a habit tracker in SwiftUI."
}

try await memory.close()
```
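Hybrid recall has to merge a text ranking and a vector ranking into one list. This README does not say how Wax combines them, so the sketch below uses reciprocal rank fusion (RRF), a common general-purpose technique, purely as an illustration. The function name, the document IDs, and the constant `k = 60` are assumptions for the example and are not taken from Wax.

```swift
/// Reciprocal rank fusion: every ranking contributes 1 / (k + rank) to a
/// document's score, where rank starts at 1. Illustrative only; not Wax's API.
func reciprocalRankFusion(_ rankings: [[String]], k: Double = 60) -> [(id: String, score: Double)] {
    var scores: [String: Double] = [:]
    for ranking in rankings {
        for (index, id) in ranking.enumerated() {
            scores[id, default: 0] += 1.0 / (k + Double(index + 1))
        }
    }
    // Highest fused score first; ties are broken by ID so the order is stable.
    return scores
        .map { (id: $0.key, score: $0.value) }
        .sorted { $0.score != $1.score ? $0.score > $1.score : $0.id < $1.id }
}

// Hypothetical result lists, best match first.
let textHits = ["doc-a", "doc-b", "doc-c"]     // e.g. from a full-text index
let vectorHits = ["doc-b", "doc-d", "doc-a"]   // e.g. from a vector index

let fused = reciprocalRankFusion([textHits, vectorHits])
for hit in fused {
    print(hit.id, hit.score)
}
```

Documents that appear in both lists (`doc-a`, `doc-b`) end up ahead of documents found by only one engine, which is the behavior a hybrid search is usually after.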
```swift
import SwiftUI
import Wax

struct ContentView: View {
    @State private var result = "Searching…"

    var body: some View {
        Text(result)
            .task {
                do {
                    let url = URL.documentsDirectory.appending(path: "agent.wax")
                    let memory = try await Memory(at: url)
                    try await memory.save("The user is building a habit tracker in SwiftUI.")
                    let context = try await memory.search("What is the user building?")
                    result = context.items.first?.text ?? "Nothing found"
                    try await memory.close()
                } catch {
                    result = "Error: \(error.localizedDescription)"
                }
            }
    }
}
```
```swift
import Wax

@main
struct AgentMemory {
    static func main() async throws {
        let url = URL.documentsDirectory.appending(path: "agent.wax")
        let memory = try await Memory(at: url)
        try await memory.save("The user is building a habit tracker in SwiftUI.")
        let results = try await memory.search("What is the user building?")
        if let best = results.items.first {
            print("Found: \(best.text)")
        }
        try await memory.close()
    }
}
```
Looking to store persistent facts and long-term reasoning? See Structured Memory.
For repeated CLI vector work, Wax CLI now auto-starts and reuses a local broker that owns
the long-term store and broker-managed session stores for commands such as remember,
recall, and search --mode hybrid.
You can still run the broker directly when you want an explicit long-lived session:
```bash
wax-cli daemon --store-path ~/.wax/memory.wax
```
Send JSON lines such as:
```json
{"id":"1","command":"remember","content":"An automobile needs periodic maintenance."}
{"id":"2","command":"search","query":"car service","mode":"hybrid","topK":3}
{"id":"3","command":"shutdown"}
```
Simple text-only usage still runs one-shot. If vector search is unavailable, hybrid/vector commands now fail loudly instead of silently dropping to text-only mode.
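If you drive the broker from Swift, the JSON lines shown above can be produced with `Codable` rather than hand-built strings. This is a minimal sketch: `BrokerRequest` and `jsonLine` are hypothetical helpers, not part of Wax, and the field names are copied from the example lines. It assumes the broker accepts any key order, as standard JSON parsers do, because `.sortedKeys` emits keys alphabetically.

```swift
import Foundation

/// One request line for the broker. The fields mirror the JSON examples in
/// this README; the type itself is a hypothetical helper, not a Wax API.
struct BrokerRequest: Codable {
    var id: String
    var command: String
    var content: String? = nil   // used by "remember"
    var query: String? = nil     // used by "search"
    var mode: String? = nil      // e.g. "hybrid"
    var topK: Int? = nil         // maximum number of results
}

/// Encodes a request as a single line of JSON. Optional fields that are nil
/// are omitted, and keys are sorted so the output is deterministic.
func jsonLine(_ request: BrokerRequest) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(request)
    return String(decoding: data, as: UTF8.self)
}

let requests = [
    BrokerRequest(id: "1", command: "remember",
                  content: "An automobile needs periodic maintenance."),
    BrokerRequest(id: "2", command: "search",
                  query: "car service", mode: "hybrid", topK: 3),
    BrokerRequest(id: "3", command: "shutdown"),
]

for request in requests {
    print(try jsonLine(request))
}
```

Each printed line can then be written to the daemon's input, one request per line.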
If you use an AI coding assistant like Claude Code, Cursor, or Windsurf, there are two good setup paths:
Install the MCP server (Claude Code):
```bash
npx -y waxmcp@latest mcp install --scope user
```
This install flow stages the bundled Wax runtime into a stable local directory and
registers the staged wax-mcp binary with Claude Code. npx is only used for install/bootstrap.
Install the skill (Claude Code):
```bash
# From within your project directory
claude install-skill https://github.com/christopherkarani/Wax/tree/main/Resources/skills/public/wax
```
Once installed, your assistant