AI coding agents have two memory problems that CLAUDE.md and Rules files don't solve: (1) long conversations get compressed and context silently disappears — the agent forgets decisions made 2 hours ago in the same session, and (2) memory is locked to one tool on one machine. Switch from Claude Code to Cursor, or from your laptop to your desktop, and everything is gone.
I built hmem to fix both. It's an MCP server that gives AI agents persistent, hierarchical memory stored in a local SQLite file. The same .hmem file works across Claude Code, Cursor, Windsurf, OpenCode, and Gemini CLI — on any machine. Your agent's knowledge is portable.
The key idea is borrowed from how human memory works: you remember rough outlines first and recall details on demand. hmem has 5 depth levels. At session start, the agent loads only Level 1 summaries (~20 tokens). It drills deeper into specific memories only when needed — L2 for context, L3-L5 for raw details. Unlike a flat MEMORY.md that gets injected wholesale (3000-8000 tokens every time), hmem loads only what's relevant.
Install: `npx hmem-mcp init` (interactive setup — detects your installed tools and writes the MCP config).
This is beta software. I've been using it in production across two machines with 100+ memory entries and it's been stable, but the API surface may still change. Would appreciate feedback.
GitHub: https://github.com/Bumblebiber/hmem npm: https://www.npmjs.com/package/hmem-mcp License: MIT