by mage0535
Agent-agnostic memory sidecar for AI coding agents. One install, multi-agent shared recall, production-grade. A persistent memory system for AI agents: a shared memory layer across multiple agents, built for production deployment.
A production-grade sidecar memory system for AI agents.
Every AI coding agent session starts blank. Claude Code, Cursor, Codex, Hermes — none have persistent long-term memory out of the box. You close a session and everything it learned about your project, your preferences, your ongoing work — gone.
Running multiple agents on the same project? Each one starts from zero, with no shared context, no institutional memory. The agent frameworks don't fix this because it's not their job. But if you're running agents in production, you hit this wall every day.
Memory Sidecar v3.0 is a sidecar memory system that sits alongside your agent. It does not patch the agent's core. Instead, it captures what the agent learned, indexes it, and makes it available to the next session — and to every other agent on the same server.
Multi-agent support: all scripts use the AGENT_HOME environment variable (backward compatible with HERMES_HOME).
Mount the sidecar to any agent by setting AGENT_HOME to the agent's data directory.
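As a rough illustration of the environment-variable fallback described above, a script could resolve its data directory along these lines. The helper name `resolve_agent_home` and the precedence (AGENT_HOME first, HERMES_HOME second) are assumptions made for this sketch, not taken from the sidecar's source.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_agent_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the agent data directory (hypothetical helper).

    Assumption: AGENT_HOME wins when both variables are set, and
    HERMES_HOME is only consulted as a legacy fallback.
    """
    env = os.environ if env is None else env
    for name in ("AGENT_HOME", "HERMES_HOME"):
        value = env.get(name, "").strip()
        if value:
            return Path(value).expanduser()
    raise RuntimeError("Set AGENT_HOME (or the legacy HERMES_HOME) first")


if __name__ == "__main__":
    # Legacy-only environment: falls back to HERMES_HOME.
    print(resolve_agent_home({"HERMES_HOME": "/root/.hermes"}) / "scripts")
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.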
| Scenario | What the sidecar does |
|----------|----------------------|
| Cross-session continuity | Agent remembers project decisions, user preferences, ongoing tasks across restarts |
| Multi-agent team | Hermes + Claude Code + Codex share the same memory layer, with no silos |
| Production deployment | Health checks, acceptance test suite, backlog remediation for self-healing |
| Bilingual teams | First-class Chinese + English support from day one, 6 multilingual embedding models |
| Knowledge management | Session archives → governance objects → focused dossiers → tiered retrieval |
Agent Core
└─ writes state.db + session JSON
Sidecar Capture Layer
└─ session_to_gbrain.py — incremental session ingestion → gbrain
Sidecar Governance Layer
├─ memory_family_registry.py — query intent classification + focus profiles
├─ memory_governance_rebuild.py — canonical objects, hubs, multi-version status, vector index
└─ memory_guardian.py — capacity monitoring, consolidation drain, stuck-op recovery
Sidecar Recall Layer
└─ tiered_context_injector.py — layered retrieval (L1/L2/L3), RRF fusion, rerank
Sidecar Maintenance + Acceptance
├─ memory_maintenance_cycle.py — orchestrator: archive → rebuild → drain → recall → health
└─ sidecar_acceptance_check.py — production verification suite
See ARCHITECTURE.md for the full technical breakdown.
git clone https://github.com/mage0535/hermes-memory-installer.git
cd hermes-memory-installer
python3 installer/install.py
Non-interactive install with explicit embedding model:
python3 installer/install.py --noninteractive --embedding intfloat/multilingual-e5-small
The installer deploys the supported sidecar scripts into $AGENT_HOME/scripts/, patches $AGENT_HOME/config.yaml, and writes install metadata to $AGENT_HOME/memory-sidecar/install-profile.json.
export AGENT_HOME=/home/user/.my-agent
python3 installer/install.py --noninteractive
Backward compatible: --hermes-home and HERMES_HOME env var also work.
export AGENT_HOME=/root/.hermes
python3 $AGENT_HOME/scripts/memory_maintenance_cycle.py
python3 $AGENT_HOME/scripts/sidecar_acceptance_check.py
The supported v3.0 sidecar runtime consists of these 7 scripts:
- memory_family_registry.py
- memory_governance_rebuild.py
- memory_guardian.py
- memory_maintenance_cycle.py
- session_to_gbrain.py
- sidecar_acceptance_check.py
- tiered_context_injector.py

These are the scripts used in the validated production deployment.
The agent writes state.db and session JSON files normally.
The sidecar reads them incrementally and tracks progress with a checkpoint.
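Checkpointed incremental reading can be sketched as follows. This is a minimal illustration only: the `sessions` table, its `id` and `title` columns, and the JSON checkpoint file are hypothetical stand-ins, and the real schema of state.db and the checkpoint format used by session_to_gbrain.py may differ.

```python
import json
import sqlite3
from pathlib import Path


def read_new_sessions(db_path, checkpoint_path):
    """Return rows added since the last run and advance the checkpoint.

    Assumption: sessions live in a table with a monotonically
    increasing integer id (hypothetical schema for illustration).
    """
    checkpoint = Path(checkpoint_path)
    last_id = 0
    if checkpoint.exists():
        last_id = json.loads(checkpoint.read_text())["last_id"]

    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT id, title FROM sessions WHERE id > ? ORDER BY id",
            (last_id,),
        ).fetchall()
    finally:
        conn.close()

    if rows:
        # Persist the highest id seen so the next run skips these rows.
        checkpoint.write_text(json.dumps({"last_id": rows[-1][0]}))
    return rows
```

Writing the checkpoint only after a successful read means a crashed run is simply retried from the previous position.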
session_to_gbrain.py converts high-value sessions into gbrain pages, applies tags, writes timeline entries, and links sessions to topic hubs.
memory_governance_rebuild.py rebuilds:
- canonical objects
- hubs
- multi-version status (active / superseded) and time validity (valid_from / valid_to)
- the vector index (when EMBEDDING_API_URL is configured)

It also maintains repair infrastructure:
- orphan_messages: orphan message audit trail
- session_repair_map: message-to-session repair mapping
- session_lineage_repair: session parent-chain repair
- recovered_fragments: archive of unassignable memory fragments
- memory_aliases / memory_relations: alias and relation graph
- sessions_effective view: repaired session view layer

tiered_context_injector.py classifies the query intent and fuses the results of its layered retrieval (L1/L2/L3) using RRF fusion and a rerank step.
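For readers unfamiliar with RRF (reciprocal rank fusion), which the architecture overview lists for the recall layer, here is a generic sketch of the technique. It is not the code in tiered_context_injector.py: the function name `rrf_fuse` is made up for this example, and `k=60` is the constant commonly used in the literature, not a value confirmed for the sidecar.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id scores the sum of 1 / (k + rank) over every list it
    appears in, where rank starts at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # e.g. results from a keyword path
    vector_hits = ["b", "d", "a"]   # e.g. results from a vector path
    print(rrf_fuse([keyword_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

RRF only needs ranks, not comparable scores, which is why it suits mixing keyword and vector results.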
memory_guardian.py reports health, trend data, duplicate counts, sync lag, and consolidation backlog signals.
It includes safe remediation logic for sticky consolidation backlogs and stuck operation detection.
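One simple way to detect stuck operations is an age threshold on operations that are still marked as running. The sketch below is an assumption-laden illustration: the `Operation` record, the `"running"` status value, and the 30-minute default are all hypothetical and not taken from memory_guardian.py.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Operation:
    """Hypothetical record of a long-running sidecar operation."""
    op_id: str
    status: str
    started_at: datetime


def find_stuck(operations, now, max_age=timedelta(minutes=30)):
    """Return ids of operations still running for longer than max_age."""
    return [
        op.op_id
        for op in operations
        if op.status == "running" and now - op.started_at > max_age
    ]


if __name__ == "__main__":
    now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    ops = [
        Operation("consolidate-1", "running", now - timedelta(hours=2)),
        Operation("consolidate-2", "running", now - timedelta(minutes=5)),
        Operation("consolidate-3", "done", now - timedelta(hours=3)),
    ]
    print(find_stuck(ops, now))  # ['consolidate-1']
```

Detection is kept separate from remediation here, so a report-only mode can run without changing any state.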
v3.0 introduces the Focused Dossier concept.
A dossier is a first-class memory profile for an important person, relationship, project, event, or topic.
The production deployment includes a validated relationship dossier, and the shared registry supports extending to more dossiers.
Embedding models enable semantic vector search as an additional retrieval layer in L3 recall.
When EMBEDDING_API_URL is set, the governance rebuild automatically generates 384–1024 dimensional embeddings for each active memory_object and stores them in the canonical_semantic_index table. During recall, tiered_context_injector.py can query this index via cosine similarity alongside keyword-based FTS5 and LIKE paths.
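The cosine-similarity lookup can be pictured with a small pure-Python sketch. It assumes the index has already been loaded into a dict of object id to vector. How the sidecar actually reads canonical_semantic_index is not shown here, and the names `cosine` and `top_k` are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=3):
    """Return the k object ids whose vectors are most similar to query_vec.

    index: mapping of object id -> embedding vector (assumed in memory).
    """
    scored = [(cosine(query_vec, vec), obj_id) for obj_id, vec in index.items()]
    # Most similar first; ties broken by id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [obj_id for _, obj_id in scored[:k]]


if __name__ == "__main__":
    index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], index, k=2))  # ['a', 'c']
```

When embeddings are already normalized to unit length, the dot product alone gives the same ranking.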
The sidecar does not bundle an embedding server. You run one independently and point the sidecar to it via EMBEDDING_API_URL.
Quick start with sentence-transformers (recommended for development):
pip install sentence-transformers flask
Create a minimal server that serves the OpenAI-compatible /v1/embeddings endpoint.
A reference implementation is included in the community scripts:
# embedding_server.py (example: serve with your chosen model)
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-small")


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        texts = body.get("input", [])
        if isinstance(texts, str):
            # OpenAI-style clients may send a single string instead of a list.
            texts = [texts]
        # Encode to unit-length vectors and convert to plain lists for JSON.
        emb = model.encode(texts, normalize_embeddings=True).tolist()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"data": [{"embedding": e} for e in emb]}).encode())


# Listen on localhost only, on the port used in EMBEDDING_API_URL.
HTTPServer(("127.0.0.1", 8766), Handler).serve_forever()
Then set the environment variable and run a governance rebuild:
export EMBEDDING_API_URL=http://127.0.0.1:8766/v1/embeddings
python3 $AGENT_HOME/scripts/memory_maintenance_cycle.py
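Before running a full rebuild, it can help to confirm that the endpoint answers. The sketch below uses only the standard library and assumes the response shape produced by the example server above (`{"data": [{"embedding": [...]}]}`). The helper names are illustrative and are not part of the sidecar.

```python
import json
import urllib.request


def build_request(url, texts):
    """Build a POST request carrying the texts as an OpenAI-style payload."""
    payload = json.dumps({"input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(raw_body):
    """Extract the list of vectors from a /v1/embeddings-style response."""
    return [item["embedding"] for item in json.loads(raw_body)["data"]]


def embed(url, texts, timeout=30):
    """Send texts to the embedding server and return one vector per text."""
    with urllib.request.urlopen(build_request(url, texts), timeout=timeout) as resp:
        return parse_embeddings(resp.read())


if __name__ == "__main__":
    vectors = embed("http://127.0.0.1:8766/v1/embeddings", ["hello", "你好"])
    print(len(vectors), len(vectors[0]))
```

The printed pair is the number of vectors returned and the embedding dimension, which depends on the model you serve.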
When EMBEDDING_API_URL is not set, the sidecar runs entirely without embeddings — all text-based retrieval (FTS5 / LIKE / hindsight / gbrain) continues to work normally.
During installation, the installer either: