Why Your OpenClaw Agent Forgets Everything (And How to Fix It)


You’ve been working with your OpenClaw agent for weeks. It knows your preferences, your projects, your communication style. Then you start a new session and it asks you the same questions it asked on day one.

Sound familiar?

This isn’t a bug. It’s a fundamental limitation of how LLMs work — and it’s solvable. But first, let’s understand why it happens.

The Context Window Illusion

Large language models don’t have memory. They have context windows.

When you chat with your OpenClaw agent, every message — yours and the agent’s — gets packed into a context window. The model processes this entire window on every turn. It looks like the agent remembers your conversation, but it’s actually re-reading the entire transcript every time it responds.

This creates two problems:

Problem 1: Sessions Are Isolated

When a session ends, the context window is gone. The next session starts with a blank slate. Your agent doesn’t remember that you prefer TypeScript over Python, that your project uses PostgreSQL, or that you hate verbose output.

Problem 2: Context Windows Have Limits

Even within a session, there’s a ceiling. Every model has a finite context limit (128K tokens for GPT-4o, 200K for Claude), and once you hit it, older messages get dropped. Long conversations lose their beginning.

The bigger your conversation, the more expensive each turn becomes (you’re paying for the model to re-read everything). And the model’s attention to older context degrades — information at the edges is recalled less reliably than recent messages.
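A quick back-of-envelope sketch of why this gets expensive: because each turn re-reads the full transcript, total processing grows quadratically with conversation length. The numbers below are illustrative, not tied to any particular model:

```python
def tokens_processed(turns: int, tokens_per_message: int) -> int:
    """Total tokens the model reads across a session, assuming every
    turn adds one user message and one reply of equal size."""
    total = 0
    window = 0
    for _ in range(turns):
        window += tokens_per_message  # your message enters the window
        total += window               # the model re-reads the whole window
        window += tokens_per_message  # the reply enters the window too
    return total

# 20 turns of ~200-token messages already means 80,000 tokens processed.
print(tokens_processed(20, 200))
```

Double the conversation length and the total roughly quadruples, which is why long sessions cost disproportionately more.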

The MEMORY.md Approach

OpenClaw has a built-in solution: MEMORY.md and daily memory files.

# MEMORY.md

- User prefers concise responses
- Main project: content-engine (TypeScript, Node.js)
- Timezone: UTC+1
- Deploys to Railway
- Prefers dark mode in all tools

Your agent reads this file at the start of every session. It works. It’s simple. And for many users, it’s enough.

But it has real limitations:

Every Line Eats Context

MEMORY.md gets loaded into the context window. A 50-line memory file might be 2,000 tokens. A 200-line file: 8,000 tokens. That’s context space that could be used for your actual conversation.

# Context budget for a 128K model:
System prompt:     ~2,000 tokens
MEMORY.md:         ~4,000 tokens  ← competing with your conversation
SOUL.md:           ~1,000 tokens
Conversation:      ~121,000 tokens

As your agent learns more about you, the memory file grows, and the available conversation space shrinks. You’re paying (literally — in API costs) for the model to re-read your entire memory on every single turn.

All or Nothing

MEMORY.md is a flat file. Your agent reads the whole thing or nothing. It can’t search for “what does the user think about React?” and get just the relevant memories. It gets everything — your React preferences, your lunch habits, your SSH config notes — all loaded into context whether relevant or not.

Manual Maintenance

Someone has to maintain the file. Either you write entries manually, or your agent appends to it (and the file grows forever). There’s no automatic pruning, no importance ranking, no way to say “this memory matters more than that one.”

No Sharing Between Agents

If you run multiple OpenClaw agents — a coding agent, a research agent, a writing agent — each has its own MEMORY.md. Knowledge doesn’t flow between them unless you manually copy it.

Enter Semantic Memory

MemoClaw takes a different approach. Instead of dumping everything into a text file that gets loaded wholesale, it stores memories as vectors and retrieves only what’s relevant.

Here’s the difference:

MEMORY.md approach:

Session starts → Load entire MEMORY.md (200 lines, 8K tokens) → Start conversation

MemoClaw approach:

Session starts → Agent encounters a topic → Recall relevant memories (3-5 results, ~500 tokens) → Continue

Instead of front-loading everything, MemoClaw lets your agent pull memories on demand. The context stays lean. Only relevant information gets loaded.

How It Works in Practice

Install the MemoClaw skill on your OpenClaw agent:

clawhub install anajuliabit/memoclaw

Now your agent can store and recall memories:

# Store a memory with importance and tags
memoclaw store "User's project uses Next.js 14 with App Router" \
  --importance 0.8 \
  --tags tech-stack,frontend

# Store a preference
memoclaw store "User hates long-winded explanations. Keep it concise." \
  --importance 0.9 \
  --tags preferences,communication

# Store project context
memoclaw store "content-engine deploys to Railway, auto-deploys from main branch" \
  --importance 0.7 \
  --tags project,infra,content-engine

When context is needed, the agent recalls semantically:

# Agent is working on a frontend task
memoclaw recall "what frontend framework does the user use"
# → "User's project uses Next.js 14 with App Router" (score: 0.94)

# Agent is writing a response
memoclaw recall "how does the user like responses formatted"
# → "User hates long-winded explanations. Keep it concise." (score: 0.91)

The recall is semantic, not keyword-based. “What frontend framework” matches a memory about “Next.js 14 with App Router” because MemoClaw understands the meaning, not just the words.
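Under the hood, this style of retrieval compares embedding vectors: the query is embedded, scored against each stored memory vector by cosine similarity, and the closest matches win. A minimal sketch of the idea, with hand-picked toy vectors standing in for a real embedding model (MemoClaw’s actual internals may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend these vectors came from an embedding model.
memories = {
    "User's project uses Next.js 14 with App Router": [0.9, 0.1, 0.0],
    "User hates long-winded explanations":            [0.1, 0.9, 0.1],
}

def recall(query_vector, top_k=1):
    """Score every memory against the query and return the best matches."""
    scored = [(cosine(query_vector, vec), text) for text, vec in memories.items()]
    return sorted(scored, reverse=True)[:top_k]

# A query about "frontend framework" embeds near the Next.js memory:
print(recall([0.8, 0.2, 0.1]))
```

The key property: nothing in the query shares a keyword with “Next.js 14 with App Router”, yet the vectors land close together, so the right memory surfaces.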

Importance Scoring

Not all memories are equal. Your preference for concise responses matters more than the fact that you had pizza last Tuesday. MemoClaw lets you (or your agent) assign importance scores:

# Critical — agent should almost always recall this
memoclaw store "NEVER push directly to main. Always use PRs." \
  --importance 1.0 \
  --tags workflow,git

# Useful but not critical
memoclaw store "User usually works between 9am-6pm UTC+1" \
  --importance 0.5 \
  --tags schedule

# Low importance — nice to know
memoclaw store "User's favorite color is blue" \
  --importance 0.2 \
  --tags personal

When recalling, higher-importance memories get prioritized. Your agent surfaces the critical stuff first.
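How importance interacts with similarity isn’t spelled out here, but one plausible ranking scheme is a weighted blend of the two scores. The formula and weights below are assumptions for illustration, not MemoClaw’s documented behavior:

```python
def rank(candidates, importance_weight=0.3):
    """candidates: list of (similarity, importance, text) tuples.
    Blend similarity with importance so critical memories win ties
    (and sometimes beat slightly better raw matches)."""
    scored = [
        ((1 - importance_weight) * sim + importance_weight * imp, text)
        for sim, imp, text in candidates
    ]
    return sorted(scored, reverse=True)

candidates = [
    (0.80, 1.0, "NEVER push directly to main. Always use PRs."),
    (0.85, 0.2, "User's favorite color is blue"),
]

# The critical workflow rule outranks the trivia despite a lower
# raw similarity score:
print(rank(candidates)[0][1])
```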

Namespaces for Projects

Working on multiple projects? Use namespaces to keep memory organized:

# Memories for the content engine project
memoclaw store "Blog posts go to blog.memoclaw.com, not /blog on main site" \
  --namespace content-engine \
  --importance 0.9

# Memories for a different project
memoclaw store "API uses GraphQL, not REST" \
  --namespace client-project \
  --importance 0.8

When your agent works on the content engine, it recalls from that namespace. When it switches to the client project, it gets that project’s context. Clean separation, no cross-contamination.
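Conceptually, a namespace is just a filter applied before ranking. A toy sketch of the idea (not MemoClaw’s actual storage layer):

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    namespace: str
    importance: float

store = [
    Memory("Blog posts go to blog.memoclaw.com", "content-engine", 0.9),
    Memory("API uses GraphQL, not REST", "client-project", 0.8),
]

def recall(namespace: str):
    """Only memories in the active project's namespace are candidates."""
    return [m.text for m in store if m.namespace == namespace]

# Working on the content engine? Client-project memories never surface.
print(recall("content-engine"))
```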

MEMORY.md + MemoClaw: Better Together

Here’s the thing — you don’t have to choose. The best setup uses both:

MEMORY.md for the essentials your agent needs every session:

# MEMORY.md
- Name: Ana
- Timezone: UTC+1
- Tone: Direct, no fluff

Keep it short. 10-20 lines. The stuff that’s relevant 100% of the time.

MemoClaw for everything else:

  • Project-specific context (recalled when working on that project)
  • Detailed preferences (recalled when relevant)
  • Historical decisions (recalled when the topic comes up)
  • Corrections and lessons learned (recalled when the agent is about to make the same mistake)

This way, your context window stays lean (small MEMORY.md), but your agent has access to a deep well of searchable, prioritized memory through MemoClaw.
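The hybrid flow can be sketched in a few lines of Python. `load_memory_md` and `memoclaw_recall` are hypothetical helpers standing in for reading the file and shelling out to the CLI:

```python
def load_memory_md():
    """Stand-in for reading the slim, always-loaded MEMORY.md."""
    return "- Name: Ana\n- Timezone: UTC+1\n- Tone: Direct, no fluff"

def memoclaw_recall(topic):
    """Stand-in for a `memoclaw recall` call; returns top matches."""
    return ["(memory relevant to: %s)" % topic]

def build_context(topic=None):
    context = [load_memory_md()]           # essentials, every session
    if topic:
        context += memoclaw_recall(topic)  # deep memory, on demand only
    return context

# Small talk loads only the essentials; a frontend task pulls more.
print(build_context("frontend"))
```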

Migration: From MEMORY.md to MemoClaw

Already have a long MEMORY.md? MemoClaw can ingest it:

# Migrate an existing memory file
memoclaw migrate ./MEMORY.md

This parses the file, splits it into individual memories, generates embeddings, and stores them. You can then trim your MEMORY.md down to just the essentials and let MemoClaw handle the rest.
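Stripped of the LLM step, the chunking stage amounts to splitting the file into one memory per bullet. A naive sketch of that idea (the real `memoclaw migrate` uses GPT-4o-mini to chunk more intelligently):

```python
def split_memories(memory_md: str) -> list:
    """Turn each '- ' bullet line into a standalone memory string,
    skipping headers and blank lines."""
    memories = []
    for line in memory_md.splitlines():
        line = line.strip()
        if line.startswith("- "):
            memories.append(line[2:].strip())
    return memories

sample = """# MEMORY.md
- User prefers concise responses
- Main project: content-engine (TypeScript, Node.js)
"""

# Each bullet becomes one memory, ready to be embedded and stored.
print(split_memories(sample))
```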

The migration costs $0.01 per file (it uses GPT-4o-mini to parse and chunk intelligently). A one-time cost to upgrade from flat-file memory to semantic search.

The Numbers

Let’s compare context usage:

|                   | MEMORY.md Only                     | MemoClaw + Slim MEMORY.md              |
|-------------------|------------------------------------|----------------------------------------|
| Baseline context  | 8,000 tokens (200 lines)           | 800 tokens (20 lines)                  |
| Per-query memory  | 0 (already loaded)                 | ~500 tokens (5 results)                |
| Relevance         | Everything, always                 | Only what’s needed                     |
| Cost per session  | Higher (larger context every turn) | Lower (lean context + targeted recalls)|
| Search            | Ctrl+F (exact match)               | Semantic (meaning-based)               |
| Multi-agent       | Copy files manually                | Shared via wallet                      |

For an agent that makes 20 turns per session, loading 8K tokens of memory on every turn means 160K tokens of memory re-processing per session. With MemoClaw, re-reading the 800-token baseline across 20 turns costs 16,000 tokens, plus maybe 2,500 tokens of recalled memories across 5 recall operations, for roughly 18,500 tokens total. That’s nearly an order of magnitude less.
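The same arithmetic as code, assuming the memory context is re-read on every one of the 20 turns in both setups:

```python
TURNS = 20

# Full 8K-token memory file, re-read on every turn:
memory_md_only = TURNS * 8_000

# Slim 800-token baseline each turn, plus 5 recalls of ~500 tokens each:
memoclaw_total = TURNS * 800 + 5 * 500

print(memory_md_only, memoclaw_total)  # 160000 18500
```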

Getting Started

The fix is three commands:

# Install the skill
clawhub install anajuliabit/memoclaw

# Authenticate
memoclaw auth

# Migrate your existing memories
memoclaw migrate ./MEMORY.md

Then trim your MEMORY.md to the bare essentials and let MemoClaw handle the deep memory. Your first 100 API calls are free — enough to migrate a file and test recall for a few sessions before deciding if it’s worth the (very small) ongoing cost.

Your agent doesn’t have to forget everything. It just needs a better place to remember.