Your MEMORY.md is eating your tokens
If your MEMORY.md is 3,000 tokens and you're on Claude Sonnet (at $3 per million input tokens), you're paying about $0.009 per message just to load memories. Not to use them. Not to think about them. Just to have them sitting in context.
$0.009 doesn't sound like much. Multiply it by 50 messages a day: $0.45. Over a month: ~$13.50. That's the memory tax: tokens spent on context your agent skimmed past to find the one line that actually mattered.
Here's the thing: $13.50/month isn't going to bankrupt anyone. But it's $13.50 for something that gets worse over time, not better. And the token cost is only part of the problem.
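The arithmetic behind that figure is easy to rerun with your own numbers. The snippet below is a minimal sketch, assuming the $3 per million input tokens price quoted later in this post and a 30-day month; the function name and parameters are made up for illustration.

```python
# Assumption: $3 per million input tokens (the Claude Sonnet figure used in this post).
INPUT_PRICE_PER_MILLION_USD = 3.00

def memory_tax(memory_tokens, messages_per_day, days=30):
    """Cost in USD of re-sending a memory file with every message."""
    per_message = memory_tokens * INPUT_PRICE_PER_MILLION_USD / 1_000_000
    per_day = per_message * messages_per_day
    return {
        "per_message": per_message,
        "per_day": per_day,
        "per_month": per_day * days,
    }

tax = memory_tax(memory_tokens=3_000, messages_per_day=50)
for label, usd in tax.items():
    print(f"{label}: ${usd:.3f}")
```

Swap in your own file size and message volume to see how quickly the monthly figure moves.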
How MEMORY.md works (and why it doesn't scale)
Every OpenClaw agent follows roughly the same pattern. AGENTS.md says "read MEMORY.md at session start." The agent loads the whole file into context. Every session. Every message.
When MEMORY.md is 20 lines, this is fine. After a month of daily use, it's 2,000-4,000 tokens. I've seen files over 10,000.
Here's a typical token budget for an active session:
System prompt: ~500 tokens
AGENTS.md: ~800 tokens
MEMORY.md: ~3,000 tokens
USER.md: ~200 tokens
Conversation so far: ~2,000 tokens
MEMORY.md is almost half the input. On any given message, maybe 5% of those memories are relevant to what the agent is actually doing.
Less room for the actual conversation. Less room for tool outputs. Less room for thinking.
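To check the "almost half" claim, here is a small sketch that totals the budget above and prints each component's share. The numbers are the ones listed in this post; the 5% relevance figure is the rough estimate from the previous paragraph, not a measurement.

```python
# Token budget from the list above.
budget = {
    "System prompt": 500,
    "AGENTS.md": 800,
    "MEMORY.md": 3_000,
    "USER.md": 200,
    "Conversation so far": 2_000,
}

total = sum(budget.values())
shares = {name: tokens / total for name, tokens in budget.items()}

for name, share in shares.items():
    print(f"{name:<20} {share:6.1%}")

# Rough estimate from the text: about 5% of the memory file is relevant per message.
relevant_tokens = 0.05 * budget["MEMORY.md"]
ignored_tokens = budget["MEMORY.md"] - relevant_tokens
print(f"MEMORY.md tokens likely ignored per message: {ignored_tokens:.0f}")
```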
What selective recall looks like
MemoClaw stores memories as individual records with vector embeddings. Instead of loading everything, your agent queries for what's relevant:
memoclaw recall "user's deployment preferences" --limit 3
Returns:
┌────┬─────────────────────────────────────────┬────────────┬───────┐
│ ID │ Content                                 │ Importance │ Score │
├────┼─────────────────────────────────────────┼────────────┼───────┤
│ 42 │ Deploy to Railway, not Fly.io.          │ 0.8        │ 0.93  │
│ 15 │ Always use Dockerfile, never buildpacks │ 0.7        │ 0.87  │
│ 73 │ Production branch is 'main', not        │ 0.5        │ 0.81  │
│    │ 'master'. Changed on 2026-02-10.        │            │       │
└────┴─────────────────────────────────────────┴────────────┴───────┘
Three memories. ~80 tokens. Exactly what the agent needed for this conversation, nothing else.
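For intuition about what a vector recall does, here is a toy sketch of similarity search: score every stored memory against the query vector and keep the top few. This is not MemoClaw's implementation. The 3-dimensional vectors are hand-picked stand-ins for real embeddings, and the two "filler" memories are invented so there is something irrelevant to filter out.

```python
import math

# Toy 3-dimensional vectors stand in for real embeddings (assumption).
# IDs 42, 15, and 73 mirror the example output; 101 and 102 are invented filler.
MEMORIES = [
    {"id": 42, "content": "Deploy to Railway, not Fly.io.", "vec": [0.9, 0.1, 0.0]},
    {"id": 15, "content": "Always use Dockerfile, never buildpacks", "vec": [0.8, 0.3, 0.1]},
    {"id": 73, "content": "Production branch is 'main', not 'master'.", "vec": [0.7, 0.2, 0.3]},
    {"id": 101, "content": "(filler) Prefers dark mode.", "vec": [0.0, 0.1, 0.9]},
    {"id": 102, "content": "(filler) Standup is at 09:30.", "vec": [0.1, 0.9, 0.1]},
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall(query_vec, limit=3):
    """Return the `limit` memories most similar to the query vector."""
    ranked = sorted(MEMORIES, key=lambda m: cosine(query_vec, m["vec"]), reverse=True)
    return ranked[:limit]

# Pretend this vector is the embedding of "user's deployment preferences".
query = [1.0, 0.1, 0.0]
for memory in recall(query, limit=3):
    print(memory["id"], memory["content"])
```

Only the three deployment-related records come back; the filler entries never reach the context window, which is the whole point of querying instead of loading the file.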
The math
Let's compare a realistic day. Your agent handles 40 messages, and about half need memory context.
MEMORY.md:
- 3,000 tokens loaded every message (whether needed or not)
- 40 messages × 3,000 tokens = 120,000 tokens/day on memory
- Claude Sonnet at $3/M input: $0.36/day
- Monthly: ~$10.80
MemoClaw:
- 20 recalls/day × $0.005 = $0.10/day
- Each recall returns ~100-200 tokens instead of 3,000
- Monthly recall cost: ~$3.00
The savings scale with file size. A 6,000-token MEMORY.md doubles the waste. A 10,000-token file more than triples it. MemoClaw costs stay flat because you're only pulling what's relevant.
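The comparison can be sketched as two small functions. Prices and volumes are the ones used in this section; `tokens_per_recall=150` is an assumed midpoint of the 100-200 range. Unlike the bullet list, this sketch also charges the input tokens that each recall injects into context, so the MemoClaw figure comes out slightly above $3.00.

```python
INPUT_PRICE_PER_TOKEN = 3.00 / 1_000_000  # $3 per million input tokens (from this post)
RECALL_PRICE = 0.005                      # USD per recall (from this post)

def memory_md_monthly(file_tokens, messages_per_day=40, days=30):
    """Monthly cost of loading the whole file with every message."""
    return file_tokens * messages_per_day * INPUT_PRICE_PER_TOKEN * days

def memoclaw_monthly(recalls_per_day=20, tokens_per_recall=150, days=30):
    """Monthly cost of recall fees plus the input tokens the recalls add."""
    fees = recalls_per_day * RECALL_PRICE
    injected = recalls_per_day * tokens_per_recall * INPUT_PRICE_PER_TOKEN
    return (fees + injected) * days

for size in (3_000, 6_000, 10_000):
    file_cost = memory_md_monthly(size)
    recall_cost = memoclaw_monthly()
    print(f"{size:>6} tokens: MEMORY.md ${file_cost:.2f}/mo vs recall ${recall_cost:.2f}/mo")
```

The file-based cost grows linearly with the file, while the recall cost depends only on how often the agent queries.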
Speed, not just cost
More input tokens means longer time-to-first-token. If your agent has been getting slower over the weeks, check how big MEMORY.md has gotten. Shaving 2,800 tokens off every request won't make responses instant, but it adds up, especially on models with higher latency per input token.
A MemoClaw recall takes ~200-400ms. That's a network round trip the agent didn't have before. But the model's thinking time is measured in seconds, so the recall latency is usually invisible in practice.
Setting it up
Install the skill:
clawhub install anajuliabit/memoclaw
Update your AGENTS.md. Replace:
Read MEMORY.md at session start.
With:
Use memoclaw recall to fetch relevant context when needed.
Don't load all memories at once; query for what's relevant to the current task.
Your agent now has store and recall tools. It calls recall when it needs context, store when it learns something new. No bulk file loading.
Migrating existing memories
If you've got a MEMORY.md with useful stuff in it, you don't have to start from scratch. MemoClaw can ingest markdown files:
memoclaw migrate ~/path/to/MEMORY.md
This parses the file into individual memories with importance scores and stores them. After migration, each memory is a separate searchable record instead of a line in a big file.
Review the results after migration. The parser does a reasonable job, but you'll probably want to adjust some importance scores and clean up a few entries that didn't split well.
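To see why some entries might not split cleanly, it helps to picture what a splitter has to do. The sketch below is a hypothetical, simplified parser, not the one MemoClaw ships: it treats each bullet or plain line as one memory, remembers the nearest heading, and assigns an invented default importance of 0.5. The headings in the sample are made up for illustration.

```python
import re

def split_memories(markdown, default_importance=0.5):
    """Split a markdown memory file into one record per bullet or text line."""
    records = []
    section = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = re.match(r"^#{1,6}\s+(.*)$", line)
        if heading:
            section = heading.group(1)  # remember context for the lines that follow
            continue
        bullet = re.match(r"^[-*+]\s+(.*)$", line)
        content = bullet.group(1) if bullet else line
        records.append(
            {"content": content, "section": section, "importance": default_importance}
        )
    return records

sample = """# Deployment
- Deploy to Railway, not Fly.io.
- Always use Dockerfile, never buildpacks

# Git
- Production branch is 'main', not 'master'.
"""

records = split_memories(sample)
for record in records:
    print(record)
```

A line-based approach like this breaks on multi-line notes and nested lists, which is the kind of entry worth checking by hand after a real migration.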
The free tier
MemoClaw gives you 100 free API calls. Store and recall each cost one call. At 20 recalls/day, you'll hit the limit in about 5 days, or sooner if your agent is also storing new memories.
After that, it's pay-per-request with USDC on Base. No subscription. $0.005 per recall, $0.005 per store. The payment happens automatically via x402; your agent's wallet handles it.
memoclaw status
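A quick way to estimate how long the free tier lasts and what comes after is sketched below. The 100 free calls and the $0.005 price are from this section; the `stores_per_day` values are assumptions you should replace with your own usage.

```python
import math

FREE_CALLS = 100        # free API calls (from this post)
PRICE_PER_CALL = 0.005  # USD per store or recall (from this post)

def free_tier_days(recalls_per_day, stores_per_day=0):
    """Days until the free calls run out at a steady daily rate."""
    calls_per_day = recalls_per_day + stores_per_day
    if calls_per_day <= 0:
        return math.inf
    return FREE_CALLS / calls_per_day

def monthly_cost_after_free(recalls_per_day, stores_per_day=0, days=30):
    """Monthly spend in USD once every call is paid."""
    return (recalls_per_day + stores_per_day) * PRICE_PER_CALL * days

print(free_tier_days(20))                             # recalls only
print(free_tier_days(20, stores_per_day=5))           # assumed 5 stores/day
print(monthly_cost_after_free(20, stores_per_day=5))  # paid usage per month
```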
When to stick with MEMORY.md
If your file is under 500 tokens and isn't growing, the file approach works fine. The overhead is small enough that the simplicity wins.
The crossover point is around 1,000-1,500 tokens. Below that, a flat file is simpler. Above that, you're paying an increasing tax on every message for context your agent mostly ignores.
If your MEMORY.md is already past 3,000 tokens, you're past the crossover. The longer you wait, the more tokens you burn.
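If you want to locate the crossover for your own workload, a back-of-the-envelope sketch follows. It finds the file size at which loading MEMORY.md on every message costs the same as paying for recalls. Defaults come from the example day in this post, with `tokens_per_recall=150` as an assumed midpoint. On pure cost, those defaults land a little under the range quoted above, and the result shifts a lot with message volume, so treat it as a rough guide.

```python
def break_even_tokens(messages_per_day=40, recalls_per_day=20,
                      recall_price=0.005, input_price_per_million=3.00,
                      tokens_per_recall=150):
    """File size (in tokens) where MEMORY.md and recall cost the same per day."""
    price_per_token = input_price_per_million / 1_000_000
    # Daily recall spend: per-call fee plus the input tokens each recall injects.
    recall_daily = recalls_per_day * (recall_price + tokens_per_recall * price_per_token)
    # Daily file spend per token of file size: one load per message.
    file_daily_per_token = messages_per_day * price_per_token
    return recall_daily / file_daily_per_token

print(round(break_even_tokens()))                     # example day from this post
print(round(break_even_tokens(messages_per_day=20)))  # a quieter agent
```

A quieter agent pushes the break-even higher, because the file is loaded fewer times per recall paid for.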
Your agent's context window is expensive real estate. MEMORY.md fills it with everything whether you need it or not. MemoClaw fills it with just what's relevant. The difference shows up in your API bill and your response times.
Start free at memoclaw.com or install the OpenClaw skill.