Building a self-correcting OpenClaw agent with MemoClaw
You correct your agent. It apologizes. It does the right thing. Next session, it makes the same mistake.
This is the default experience with every AI agent that relies on context windows for memory. Corrections live in the conversation. When the conversation ends, the correction vanishes.
Some agents work around this with MEMORY.md or correction logs. The agent writes down what it got wrong and reads the file next session. This works until the file gets long enough that corrections get buried. The agent reads fifty corrections every session and somehow still picks the wrong deployment target.
The fix is semantic recall. Store corrections as memories, tag them appropriately, and let the agent pull relevant corrections before acting. The correction about Railway vs Fly.io surfaces when the agent is working on deployment. It stays quiet when the agent is writing documentation.
What a correction memory looks like
When your agent gets corrected, you want it to store something specific:
memoclaw store "CORRECTION: Do not use console.log for debugging in production code. Use the structured logger at lib/logger.ts instead. Ana corrected this on 2026-02-20." \
--tags correction,coding,logging --importance 0.95
A few things matter here.
High importance. Corrections should be 0.9 or above. When your agent recalls memories, importance affects ranking. A correction at 0.95 will outrank a casual preference at 0.6 when both are relevant to the same query.
The CORRECTION prefix. This isn’t required by MemoClaw, but it helps the agent parse what it’s reading. When a recall returns a memory that starts with “CORRECTION:”, the agent knows this is something it got wrong before and should pay attention to.
Specific tags. The tags correction and coding let you filter recalls. The tag logging makes this memory surface when the agent is working on anything log-related.
Date and source. Including when and who corrected the agent helps with ordering. If two corrections conflict, the newer one wins.
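Stored this way, the memory comes back on any semantically related query, not just ones that repeat its wording. A sketch of the matching recall, using only the flags shown above (the query text is illustrative):

```shell
# Semantic recall: the query shares no keywords with the stored text,
# but "debug output" and console.log are related concepts, so the
# logging correction should still surface.
memoclaw recall "adding debug output to the request handler" --tags correction,logging
```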
Setting up automatic correction storage
You can add instructions to your AGENTS.md that tell the agent to store corrections automatically:
When the user corrects you:
1. Acknowledge the correction
2. Store it as a memory:
memoclaw store "CORRECTION: <what was wrong> → <what is correct>. Corrected <date>." \
--tags correction,<relevant-topic> --importance 0.95
3. Recall before acting to check for prior corrections:
memoclaw recall "<brief task description>" --tags correction
Step 3 is where the value compounds. Before the agent starts any task, it checks whether it’s been corrected on something related. The recall is semantic, so the agent doesn’t need to know the exact wording of past corrections. “Working on deployment config” will surface the Railway/Fly.io correction even though the stored memory doesn’t contain the phrase “deployment config.”
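If you want the store and recall steps to be one command each, they can be wrapped in small shell helpers. This is a sketch, assuming `memoclaw` is on PATH; the function names are my own, not part of MemoClaw:

```shell
#!/usr/bin/env sh
# Illustrative wrappers for the two hooks described above.

# Store a correction using the conventions from this article:
# CORRECTION prefix, today's date, high importance, topic tag.
store_correction() {
  # $1 = "old behavior -> new behavior", $2 = topic tag (e.g. "api")
  memoclaw store "CORRECTION: $1. Corrected $(date +%Y-%m-%d)." \
    --tags "correction,$2" --importance 0.95
}

# Check for prior corrections before starting a task.
pre_task_check() {
  # $1 = brief task description
  memoclaw recall "$1" --tags correction
}
```

With these in the agent's shell environment, step 2 becomes `store_correction "use HTTP 422 for validation errors, not 400" api` and step 3 becomes `pre_task_check "writing a new REST API endpoint"`.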
Preferences vs corrections
Not every piece of feedback is a correction. Some things are preferences. The difference matters for importance scoring.
Corrections are things the agent got wrong. Wrong deployment target, wrong file format, wrong API endpoint. These should be importance 0.9-0.95.
memoclaw store "CORRECTION: API responses must use HTTP 422 for validation errors, not 400. 400 is for malformed requests only." \
--tags correction,api,http --importance 0.95
Preferences are choices that aren’t wrong, just not what the user wants. Tab width, commit message style, whether to use semicolons. These should be importance 0.7-0.8.
memoclaw store "PREFERENCE: Use 2-space indentation in all TypeScript files. No tabs." \
--tags preference,coding,typescript --importance 0.8
Both types benefit from semantic recall. The difference is in how they rank against other memories. A correction at 0.95 will appear above a casual project note at 0.6. A preference at 0.8 might rank below a detailed technical decision at 0.85, which is usually the right behavior.
The recall pattern
Here’s how this works in practice. Say your agent is about to write a new API endpoint.
Before writing any code, it runs:
memoclaw recall "writing a new REST API endpoint" --tags correction
This returns any corrections related to API work. Maybe it gets back:
- “CORRECTION: API responses must use HTTP 422 for validation errors, not 400.”
- “CORRECTION: Always include request-id header in API responses. Was missing from the payments endpoint.”
Two memories, maybe 200 tokens total. Compare this to reading a corrections log file that contains every correction the agent has ever received.
The agent can also do a broader recall without the tag filter:
memoclaw recall "writing a new REST API endpoint"
This might return the corrections above plus relevant technical decisions and conventions. Still focused on what’s relevant, still under 1000 tokens.
Why this beats a corrections file
A flat corrections file has two problems.
First, it grows without bound. After six months of active use, you might have hundreds of corrections. The agent reads all of them every session. Most are irrelevant to the current task.
Second, it’s text matching, not semantic matching. If your correction says “don’t use console.log” and the agent is working on “adding debug output,” a text search for “debug” won’t find the correction. Semantic search will, because the embeddings understand that debug output and console.log are related concepts.
MemoClaw’s semantic search handles both problems. It returns only relevant memories, and it matches on meaning rather than keywords.
Compounding over time
This pattern gets more valuable the longer you use it. After a month, your agent has accumulated dozens of corrections and preferences. Each one fires only when relevant. The agent that used to make the same five mistakes now catches them before they happen.
After a few months, you start seeing something interesting. The agent makes fewer new mistakes because the old corrections inform its general approach. The correction about HTTP 422 vs 400 teaches the agent to be more careful about HTTP status codes in general, not just that one specific case.
The context window stays clean because none of these corrections are loaded unless they’re relevant. A hundred stored corrections cost nothing when the agent is working on something unrelated.
Handling contradictions
Sometimes you change your mind. You told the agent to use Jest in January and switched to Vitest in March. Both corrections exist as memories.
The date in the memory helps here. When the agent recalls both, it can see which one is newer. Add this to your AGENTS.md:
When recalls return contradictory corrections, prefer the most recent one.
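You can also make the supersession explicit at store time: when you reverse a decision, name the old correction in the new one, so a single recalled memory carries both the contradiction and its resolution. A sketch using only the flags shown earlier:

```shell
memoclaw store "CORRECTION: Use Vitest, not Jest, for all tests. Supersedes the January 2026 Jest correction. Corrected 2026-03-10." \
  --tags correction,testing --importance 0.95
```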
You can also update the old memory. MemoClaw supports memory deletion, so you can clean up outdated corrections:
memoclaw recall "testing framework preference" --tags correction
# Find the old Jest correction, note its ID
memoclaw delete <memory-id>
In practice, I find that the date-based approach works well enough. The agent sees “Jest, corrected January 2026” and “Vitest, corrected March 2026” and picks the newer one. Cleaning up is good housekeeping but not strictly necessary.
The minimal setup
If you want to start with self-correcting behavior today, add these lines to your AGENTS.md:
## Corrections
When corrected, store it:
memoclaw store "CORRECTION: <old behavior> → <new behavior>. <date>." \
--tags correction --importance 0.95
Before starting any task, check for relevant corrections:
memoclaw recall "<task description>" --tags correction
That’s two instructions. The agent does the rest. Over time, it builds a library of things it’s learned, and each lesson surfaces exactly when it’s needed.
No file management. No growing context windows. Just an agent that remembers what it got wrong.