Context Rot Is Real: Why Bigger Context Windows Won't Save Your Agent
Every few months, a new model drops with a bigger context window. 200K tokens. A million. The announcement always implies the same thing: more context means better agents.
It doesn’t.
I’ve been running OpenClaw agents for a while now, and the pattern is always the same. The agent starts sharp. By hour three of a long session, it’s referencing things I corrected two hours ago. It’s not hallucinating. It’s reading stale information from earlier in the conversation and treating it as current.
That’s context rot.
What context rot actually looks like
Here’s a real example. Early in a session, I told my agent I wanted deployments on Railway. An hour later, I changed my mind and said Fly.io. The agent acknowledged the change. Twenty minutes after that, it generated a Dockerfile optimized for Railway.
The correction was there in the context. The model just didn’t prioritize it over the earlier, more detailed discussion about Railway. The “lost in the middle” research from Stanford and UC Berkeley showed that LLMs perform worst on information buried in the center of long contexts. Your correction at message 47 is exactly the kind of thing that gets ignored.
Bigger windows make this worse, not better. More tokens means more places for the right answer to hide.
The MEMORY.md version of this problem
OpenClaw agents typically use MEMORY.md for persistence. The agent reads it at the start of every session. This works when the file is small.
But memories accumulate. After a few weeks of active use, MEMORY.md can hit 30-50KB. That’s thousands of tokens loaded every single session, whether or not any of it is relevant to what you’re doing right now.
Some of those memories contradict each other. You changed your preferred language from Python to TypeScript three weeks ago, but both entries are still in the file. The agent sees both. Sometimes it picks the old one.
That’s context rot in slow motion.
The token math
A 40KB MEMORY.md file is roughly 10,000 tokens. At Claude-class input pricing of about $3 per million tokens, that’s about $0.03 per session just to load memories. Sounds cheap until you realize your agent runs 20 sessions a day. That’s $0.60/day on memory loading alone, or about $18/month.
Most of those tokens are irrelevant to the current task. You’re paying to load your pizza topping preferences while debugging a deployment pipeline.
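The arithmetic above can be checked directly. This sketch assumes roughly 4 bytes per token for English prose and $3 per million input tokens; both are estimates, not exact tokenizer or billing figures:

```python
# Back-of-the-envelope cost of loading a MEMORY.md file every session.
# Assumptions: ~4 bytes per token, ~$3 per million input tokens.
FILE_SIZE_BYTES = 40 * 1024   # 40KB MEMORY.md
BYTES_PER_TOKEN = 4           # rough average for English prose
PRICE_PER_MTOK = 3.00         # USD per million input tokens
SESSIONS_PER_DAY = 20

tokens_per_session = FILE_SIZE_BYTES / BYTES_PER_TOKEN
cost_per_session = tokens_per_session / 1_000_000 * PRICE_PER_MTOK
monthly_cost = cost_per_session * SESSIONS_PER_DAY * 30

print(f"{tokens_per_session:.0f} tokens/session, "
      f"${cost_per_session:.3f}/session, ${monthly_cost:.2f}/month")
# → 10240 tokens/session, $0.031/session, $18.43/month
```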
What changes with semantic recall
With MemoClaw, the agent doesn’t load everything. It asks for what it needs.
Instead of reading a 10,000-token file, the agent calls recall with a query like “deployment preferences” and gets back 3-5 relevant memories. Maybe 200 tokens total. The memories come back ranked by relevance and importance score.
That correction about switching from Railway to Fly.io? If you stored it with a high importance score (say 0.9), it surfaces first. The old Railway preference either got deleted or sits at a lower importance. No ambiguity.
The agent sees a clean, current picture. Not a timeline of every decision you ever made.
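A toy sketch shows why the high-importance correction outranks the stale entry. The scores and the relevance-times-importance ranking formula are illustrative assumptions, not MemoClaw's actual internals:

```python
# Toy ranking sketch: why a high-importance correction surfaces first.
# Relevance and importance values are made up for illustration.
memories = [
    {"text": "User wants deployments on Railway", "importance": 0.5, "relevance": 0.90},
    {"text": "User switched deployments to Fly.io", "importance": 0.9, "relevance": 0.88},
    {"text": "User prefers pineapple on pizza", "importance": 0.3, "relevance": 0.05},
]

# Rank by semantic similarity weighted by stored importance.
ranked = sorted(memories, key=lambda m: m["relevance"] * m["importance"], reverse=True)
print(ranked[0]["text"])  # → User switched deployments to Fly.io
```

The stale Railway entry scores 0.90 × 0.5 = 0.45, while the correction scores 0.88 × 0.9 = 0.79, so the correction wins even though the old entry matches the query slightly better.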
A real session comparison
Without MemoClaw (MEMORY.md loaded every session):
The agent reads 10,000 tokens of accumulated context. Some of it conflicts. The model picks whichever entry its attention mechanism lands on. You spend part of every session re-correcting things you already corrected.
With MemoClaw (semantic recall on demand):
The agent starts clean. When it needs to know your deployment preference, it recalls it. Gets back “User switched to Fly.io (importance: 0.9, stored 2 days ago)” and the older Railway entry doesn’t even show up because it was either deleted or outranked.
Total tokens loaded: a few hundred instead of ten thousand.
Setting this up in OpenClaw
Install the MemoClaw skill and your agent gets store and recall tools. In your AGENTS.md, replace the “read MEMORY.md every session” instruction with something like:
```
When you need context about user preferences, project details, or past decisions,
use the recall tool to search for relevant memories. Store important corrections
with importance 0.9 or higher.
```
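In practice, a user correction would translate into a store tool call shaped something like the following. The field names here follow the instruction text above but are assumptions; check the skill docs for the real schema:

```python
# Hypothetical shape of the store tool call an agent might emit after a
# correction. Field names are assumptions for illustration, not the real schema.
store_call = {
    "tool": "store",
    "input": {
        "content": "User switched deployment target from Railway to Fly.io",
        "importance": 0.9,  # corrections get high importance per the instruction
        "namespace": "my-agent",
    },
}
print(store_call["input"]["content"])
```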
If you have an existing MEMORY.md, the CLI can migrate it:
```shell
memoclaw migrate --file MEMORY.md --namespace my-agent
```
This imports each entry as a separate memory with embeddings. After migration, your old text file becomes searchable by meaning, not just keywords.
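Conceptually, the migration splits the file into entries and attaches an embedding to each. This is a minimal sketch, assuming entries are separated by blank lines; `embed` is a stand-in for a real embedding call, not MemoClaw's actual pipeline:

```python
# Conceptual sketch of MEMORY.md migration: split the file into entries and
# attach an embedding to each so later recall can match by meaning.
def migrate(memory_md: str) -> list[dict]:
    entries = [e.strip() for e in memory_md.split("\n\n") if e.strip()]
    return [{"content": e, "embedding": embed(e)} for e in entries]

def embed(text: str) -> list[float]:
    # Placeholder: a real implementation calls an embedding model here.
    return [float(len(text))]

memories = migrate("Prefers TypeScript over Python\n\nDeploys to Fly.io")
print(len(memories))  # → 2
```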
When bigger context windows do help
I want to be fair. Large context windows are useful for some things. Reading a full codebase in one shot. Processing long documents. Maintaining coherence in extended conversations.
But they don’t solve the memory problem. Stuffing more information into the window doesn’t help if the model can’t reliably find the right piece at the right time. External memory with semantic search takes a different approach: give the model less context, but make sure it’s the right context.
The cost difference
Monthly costs for an agent running 20 sessions per day:
- MEMORY.md approach (40KB file): ~10,000 tokens/session = $18/month in context loading
- MemoClaw approach (5 recalls/session): ~200 tokens/session + $0.025/session in API costs = $15/month total
The costs are roughly similar, but the MemoClaw approach gives you relevant context instead of everything. As your memory grows, the gap widens. MEMORY.md costs scale linearly with file size. MemoClaw recall costs stay flat regardless of how many memories you have stored.
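The scaling claim can be made concrete under the same assumptions as before (~4 bytes per token, $3 per million input tokens, 600 sessions per month) plus an assumed flat $0.025 per recall-backed session:

```python
# Compare monthly context-loading cost as stored memory grows.
# Assumptions: ~4 bytes/token, $3 per million input tokens, 600 sessions/month,
# recall returning ~200 tokens plus a flat $0.025/session API fee.
SESSIONS = 20 * 30
PRICE_PER_MTOK = 3.00

def memory_md_monthly(file_kb: float) -> float:
    tokens = file_kb * 1024 / 4
    return tokens / 1_000_000 * PRICE_PER_MTOK * SESSIONS

def memoclaw_monthly() -> float:
    return (200 / 1_000_000 * PRICE_PER_MTOK + 0.025) * SESSIONS

for kb in (10, 40, 160):
    print(f"{kb:>3}KB file: ${memory_md_monthly(kb):6.2f}  vs  "
          f"MemoClaw: ${memoclaw_monthly():5.2f}")
```

Under these assumptions the file-based cost quadruples every time the file quadruples, while the recall-based cost stays fixed at about $15/month.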
What I’d actually recommend
Start with MEMORY.md. It’s simple and it works for small setups. When you notice your agent referencing outdated information, or when the file passes 10KB, that’s your signal to migrate.
The migration takes about 10 minutes with the CLI. Your agent gets better recall with less context. And you stop paying to load your entire history every session.
Context rot is a real problem. Bigger windows won’t fix it. Better retrieval will.
MemoClaw is memory-as-a-service for AI agents. 100 free API calls per wallet, no registration required. Docs | Skill