Ephemeral agents with persistent memory
Every OpenClaw agent wakes up with amnesia. That’s actually a feature.
Stateless sessions mean no stale context, no runaway token costs, and easy horizontal scaling. But it also means your agent forgets every preference, every correction, every piece of context from yesterday.
Most OpenClaw users solve this with MEMORY.md — a flat file loaded into context every session. It works until it doesn’t. Once that file grows past a few hundred lines, you’re burning tokens on irrelevant context and your agent starts hallucinating connections between unrelated notes.
There’s a better pattern: ephemeral agents backed by semantic memory. Your agent spins up clean, recalls only what’s relevant, does its work, and stores what matters. No state to manage, no files to prune.
The problem with file-based memory
Here’s what MEMORY.md looks like after a month of active use:
- User prefers dark mode
- Project X uses PostgreSQL
- Don't use tabs, use spaces
- Meeting with Sarah moved to Thursday
- Actually meeting cancelled
- User's cat is named Pixel
- API endpoint changed to /v2/...
- User hates when I say "Great question!"
- Project X migrated to MySQL (March 2026)
Every session, your agent loads all of this. The PostgreSQL note contradicts the MySQL note. The meeting info is stale. Half of it isn’t relevant to the current task.
And you’re paying for it. Every token in that file eats context window space your agent could use for actual work.
Ephemeral + memory
The idea is simple:
- Session starts — agent recalls relevant memories based on the current conversation
- Session runs — agent works normally, with recalled context informing its responses
- Session ends — agent stores new learnings, corrections, and preferences
No persistent files. No growing context dumps. Your agent asks for what it needs and remembers what matters.
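The three phases above can be sketched in a few lines. This is only an illustration under loud assumptions: `InMemoryStore`, `Memory`, and `run_session` are hypothetical names, the store lives in process memory, and relevance is approximated by counting shared words rather than by real semantic search. It is not the MemoClaw API.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    importance: float = 0.5


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a memory service (not the MemoClaw API)."""
    memories: list[Memory] = field(default_factory=list)

    def store(self, text: str, importance: float = 0.5) -> None:
        self.memories.append(Memory(text, importance))

    def recall(self, query: str, limit: int = 5) -> list[Memory]:
        # Crude relevance: number of words shared by query and memory.
        # A semantic store would compare embeddings instead.
        query_words = set(query.lower().split())
        scored = []
        for memory in self.memories:
            overlap = len(query_words & set(memory.text.lower().split()))
            if overlap > 0:
                scored.append((overlap, memory.importance, memory))
        # Rank by overlap first, then by importance.
        scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
        return [memory for _, _, memory in scored[:limit]]


def run_session(store: InMemoryStore, user_message: str, learnings: list[str]) -> list[str]:
    # 1. Session starts: recall only what relates to this conversation.
    context = [memory.text for memory in store.recall(user_message)]
    # 2. Session runs: the agent would answer using `context` here.
    # 3. Session ends: persist what was learned.
    for learning in learnings:
        store.store(learning)
    return context
```

The point of the shape is that nothing survives between calls to `run_session` except what was explicitly stored, which is what keeps the agent itself stateless.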
Setting it up
Install the MemoClaw skill on your OpenClaw agent:
clawhub install anajuliabit/memoclaw
Or install the CLI globally with npm:
npm install -g memoclaw
Your agent gets access to store and recall tools. That’s the whole interface.
Recall on start
In your agent’s AGENTS.md or startup instructions, add something like:
On session start:
1. Use memoclaw recall to search for memories related to the current user/project
2. Use recalled context to inform your responses
3. Don't load MEMORY.md — your memories are in MemoClaw now
When a user says “let’s work on Project X,” your agent recalls:
memoclaw recall "Project X context and preferences"
It gets back the 5-10 most relevant memories, not a 200-line dump.
Storing what matters
Not everything deserves to be a memory. Teach your agent to be selective:
- User corrections get high importance (0.8-1.0). “Actually, use MySQL, not Postgres” should stick.
- Preferences sit at medium importance (0.5-0.7). “I prefer bullet points over paragraphs.”
- Project facts are medium too (0.5-0.7). “This project uses Next.js 14.”
- Casual conversation? Don’t store it. Nobody needs “user said good morning” in long-term memory.
memoclaw store "User corrected: Project X uses MySQL, not PostgreSQL" \
--importance 0.9 \
--tags "project-x,database,correction"
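One way to make that selectivity mechanical is a small lookup from the kind of memory to an importance value. The sketch below is an assumption-heavy illustration: the category names and the choice of the range midpoint are hypothetical, and only the `store` subcommand plus the `--importance` and `--tags` flags shown above are used when building the command.

```python
from typing import Optional

# Ranges come from the list above; the category names are hypothetical.
IMPORTANCE_RANGES = {
    "correction": (0.8, 1.0),
    "preference": (0.5, 0.7),
    "project_fact": (0.5, 0.7),
}


def importance_for(kind: str) -> Optional[float]:
    """Return the midpoint of the range for a kind, or None to skip storing."""
    bounds = IMPORTANCE_RANGES.get(kind)
    if bounds is None:
        return None  # e.g. casual conversation: do not store it
    low, high = bounds
    return round((low + high) / 2, 2)


def build_store_command(text: str, kind: str, tags: list[str]) -> Optional[list[str]]:
    """Build the argv for a store call, or None if the memory should be dropped."""
    importance = importance_for(kind)
    if importance is None:
        return None
    return [
        "memoclaw", "store", text,
        "--importance", str(importance),
        "--tags", ",".join(tags),
    ]
```

Returning an argument list rather than a shell string avoids quoting problems if the result is later passed to something like `subprocess.run`.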
Namespaces for isolation
If your agent works across multiple projects or users, namespaces keep memories separate:
memoclaw recall "deployment process" --namespace project-x
memoclaw store "Deploy via GitHub Actions to Railway" --namespace project-x
Recalling “deployment process” in Project X won’t pull in memories from Project Y. Clean separation without any file management.
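To make the isolation concrete, here is a toy model of namespaced storage. It is a sketch only: `NamespacedStore` is a hypothetical class, matching is plain word overlap instead of semantic search, and the sample memories in the usage lines are invented for illustration.

```python
import re
from collections import defaultdict


def _words(text: str) -> set[str]:
    # Lowercase alphanumeric tokens, so punctuation does not block a match.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class NamespacedStore:
    """Hypothetical in-memory model: one independent list per namespace."""

    def __init__(self) -> None:
        self._by_namespace: dict[str, list[str]] = defaultdict(list)

    def store(self, text: str, namespace: str = "default") -> None:
        self._by_namespace[namespace].append(text)

    def recall(self, query: str, namespace: str = "default") -> list[str]:
        # Only the requested namespace is searched; others are never read.
        candidates = self._by_namespace.get(namespace, [])
        query_words = _words(query)
        return [text for text in candidates if query_words & _words(text)]


store = NamespacedStore()
store.store("Deployment process runs through GitHub Actions", namespace="project-x")
store.store("Deployment process uses a manual upload", namespace="project-y")
project_x_hits = store.recall("deployment process", namespace="project-x")
```

Here `project_x_hits` contains only the Project X memory, even though the Project Y memory matches the same query.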
What this costs
Each store or recall costs $0.005. If your agent does 10 recalls and 5 stores per session, that’s 15 calls, or $0.075 per session. At roughly one such session a day, most personal agent use comes to $2-3/month.
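The arithmetic is easy to adapt to your own usage. In this sketch the per-call price and the 10-recall, 5-store session come from the text; the figure of 30 sessions per month is an assumption chosen for illustration. `Decimal` is used so the cents come out exact.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.005")  # from the text: each store or recall


def session_cost(recalls: int, stores: int) -> Decimal:
    """Cost of one session in dollars."""
    return PRICE_PER_CALL * (recalls + stores)


def monthly_cost(recalls: int, stores: int, sessions_per_month: int) -> Decimal:
    """Cost of a month of identical sessions in dollars."""
    return session_cost(recalls, stores) * sessions_per_month


print(session_cost(10, 5))      # 0.075
print(monthly_cost(10, 5, 30))  # 2.250 (30 sessions/month is an assumption)
```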
You get 100 free API calls per wallet to start, no registration required. After that, you pay per call with USDC on Base via x402. Yes, you need a crypto wallet. That’s real friction if you’re not already in crypto.
The free endpoints (list, delete, stats, export) don’t count against paid usage. Browsing and managing memories is always free.
Where this pattern breaks down
To be honest about the limits:
- MemoClaw caps at 8,192 characters per memory. It’s not a document store. If you need to remember a whole codebase, use a RAG solution.
- There are no websockets. If two agents write memories at the same time, last write wins. No conflict resolution.
- Never store API keys, passwords, or tokens. MemoClaw isn’t a vault.
- The first recall of a session adds a network round-trip. Usually under 500ms, but it’s not zero.
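Two of these limits can be checked before anything is sent. The guard below is a minimal sketch: the 8,192-character cap is from the list above, while `check_memory` and the credential pattern are hypothetical and deliberately crude. It will miss many real secrets and should not be treated as a scanner.

```python
import re

MAX_MEMORY_CHARS = 8192  # cap stated in the text

# Hypothetical heuristic: a credential-like word followed by ':' or '='.
SECRET_HINTS = re.compile(
    r"(api[_-]?key|password|secret|token)\s*[:=]", re.IGNORECASE
)


def check_memory(text: str) -> list[str]:
    """Return a list of reasons not to store this text (empty means OK)."""
    problems = []
    if len(text) > MAX_MEMORY_CHARS:
        problems.append(f"too long: {len(text)} > {MAX_MEMORY_CHARS} characters")
    if SECRET_HINTS.search(text):
        problems.append("looks like it contains a credential")
    return problems
```

An agent could call `check_memory` before every store and skip, truncate, or summarize anything that comes back with problems.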
Migrating from MEMORY.md
Already have a MEMORY.md full of useful context? MemoClaw’s migrate feature imports markdown files:
memoclaw migrate --file ./MEMORY.md
This costs $0.01 per call since it uses GPT-4o-mini to parse and chunk your file. One-time cost to make your existing memories searchable.
After migration, archive your MEMORY.md and let semantic search handle it.
What actually changes
Before: agent loads 4,000 tokens of MEMORY.md every session. Half is irrelevant. User repeats corrections. Old notes conflict with new ones.
After: agent recalls 5-10 relevant memories per session, maybe 500 tokens. Corrections overwrite old facts. Context window stays clean for actual work.
Your agent stays ephemeral. No state to manage, no files to sync. But it remembers what it needs to.
That’s the whole trick. Forget everything, then remember what you need.