Cross-session learning loops — teaching your OpenClaw agent to spot patterns over time
Your agent handles 10 sessions a day. Each one starts fresh. Each one makes the same discoveries, hits the same walls, asks the same clarifying questions. Session 47 is no smarter than session 1.
This is the default for most OpenClaw setups, and it’s wasteful. The agent has already figured out that your deploy script needs a --force flag on Fridays, that you prefer bullet points over paragraphs, that the staging API returns 500s after 6pm UTC. But without persistent memory, every session starts from zero.
Here’s how to build learning loops that make your agent smarter over time.
What a learning loop looks like
The concept is simple: store observations during a session, recall them in future sessions, and let the accumulated context change behavior. In practice, it looks like this:
- Agent encounters something (a user correction, a failed command, a preference)
- Agent stores it with appropriate importance and tags
- Future sessions recall relevant memories before acting
- Agent behavior changes based on what it remembers
The tricky part isn’t the mechanics. It’s deciding what to store, how to tag it, and how to recall it so the agent actually learns instead of just accumulating noise.
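The shape of the loop is easier to see in code. The sketch below is a hypothetical stand-in — a toy in-memory store with tag-overlap recall — not MemoClaw's actual implementation, which runs server-side:

```python
# Toy sketch of the store/recall loop. Memory and MemoryStore are
# hypothetical stand-ins for MemoClaw's store/recall commands.
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    importance: float
    tags: set[str]

@dataclass
class MemoryStore:
    memories: list[Memory] = field(default_factory=list)

    def store(self, text: str, importance: float, tags: set[str]) -> None:
        self.memories.append(Memory(text, importance, tags))

    def recall(self, tags: set[str], limit: int = 5) -> list[Memory]:
        # Filter by tag overlap, rank by importance, return the top matches
        hits = [m for m in self.memories if m.tags & tags]
        hits.sort(key=lambda m: m.importance, reverse=True)
        return hits[:limit]

# Session 1 stores a correction; a later session recalls it before acting
store = MemoryStore()
store.store("user does not want email summaries", 0.95, {"correction", "preferences"})
store.store("staging API returns 500s after 6pm UTC", 0.8, {"failure", "api"})

relevant = store.recall({"correction"})
```

The point of the sketch is the ordering: storage is tagged and scored at write time, and recall filters and ranks at read time, so the agent sees the most important relevant memories first.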
Setting up the feedback loop
Install the MemoClaw skill if you haven’t:
clawhub install anajuliabit/memoclaw
You get 100 free API calls per wallet — no payment or registration required. That’s enough to prototype your learning loop and validate the pattern before funding with USDC on Base. See the getting started guide for full setup details.
The learning loop has two halves: storing lessons and recalling them.
Storing lessons
The most valuable memories come from corrections. When a user says “no, use the v2 endpoint” or “I already told you I don’t want email summaries”, that’s a direct signal about expected behavior.
memoclaw store "user does not want email summaries - asked twice to stop" \
--importance 0.95 --tags correction,preferences
Importance scoring matters here. A casual preference (“I like dark mode”) is different from an explicit correction (“stop doing that”). Score corrections at 0.9 or higher and casual preferences around 0.7.
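One way to keep scoring consistent across sessions is to fix a band per signal type. The mapping below is an illustrative convention of this article's categories, not part of MemoClaw:

```python
# Illustrative importance bands per signal type (a convention, not a MemoClaw API)
IMPORTANCE = {
    "correction": 0.95,   # explicit "stop doing that" signals
    "preference": 0.7,    # casual likes and dislikes
    "failure": 0.8,       # commands or APIs that broke
    "pattern": 0.7,       # repeated requests noticed over time
    "outcome": 0.55,      # one-off operational facts that age quickly
}

def score(signal_type: str) -> float:
    # Fall back to a neutral mid-band score for unclassified signals
    return IMPORTANCE.get(signal_type, 0.6)
```

A fixed table like this keeps the agent from drifting — without it, session 12 might score the same kind of correction at 0.6 that session 3 scored at 0.95.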
Failures are the second-best learning signal. When a command fails, when an API returns an unexpected error, when a file isn’t where the agent expected it:
memoclaw store "gh pr merge fails on this repo without --admin flag due to branch protection" \
--importance 0.8 --tags git,workflow,repo-xyz
The third category is patterns you notice through repetition. These are harder because the agent needs to recognize “I’ve done this before” without having memory of doing it before. The workaround: store observations after every session and let recall surface the patterns.
memoclaw store "user asked for weekly metrics report - third Monday in a row" \
--importance 0.7 --tags patterns,reporting
Recalling before acting
Recall is the other half of the loop: before your agent takes action, it should check what it already knows.
Add this to the startup section of your AGENTS.md:
# Recall corrections — highest priority
memoclaw recall "things the user corrected or told me to stop doing" --limit 5
# Recall patterns relevant to today
memoclaw recall "recurring tasks and schedules" --limit 3 --tags patterns
For task-specific recall, query based on what the agent is about to do:
# Before working on a repo
memoclaw recall "repo-xyz quirks and gotchas" --tags repo-xyz --limit 5
# Before writing content
memoclaw recall "content style preferences and past feedback" --tags content --limit 5
Tags as a taxonomy
Tags are what turn a pile of memories into a searchable knowledge base. Without consistent tags, recall returns noisy results. Here’s a tagging scheme that works.
Tag by type (correction, preference, failure, pattern, outcome), by domain (git, deploy, content, api, scheduling), and by project (repo-xyz, blog, infra, memoclaw). Combine them freely — a memory like “CI pipeline needs Node 20, not Node 18, for this repo” gets tagged failure,ci,repo-xyz. When the agent works on that repo next, recalling with --tags repo-xyz surfaces it.
memoclaw store "CI pipeline needs Node 20 - builds fail silently on Node 18" \
--importance 0.85 --tags failure,ci,repo-xyz
# Later, before CI work on that repo:
memoclaw recall "CI issues" --tags repo-xyz --limit 3
Detecting patterns across sessions
This is where it gets interesting. Individual memories are useful. Patterns across memories are powerful.
Say your agent stores these over three weeks:
- “user asked for deploy status at 9am Monday” (week 1)
- “user asked for deploy status at 9am Monday” (week 2)
- “user asked for deploy report Monday morning” (week 3)
When the agent recalls “recurring Monday tasks”, all three come back. A well-instructed agent can read these and proactively offer the deploy status on Monday mornings without being asked.
You can make this explicit in your agent instructions:
## Pattern recognition
After recalling memories, look for repetition:
- If 3+ memories describe similar requests, treat it as a recurring pattern
- If a correction appears multiple times, treat it as a hard rule
- If a failure keeps recurring, flag it rather than retrying silently
The agent doesn’t need special ML for this. It just needs enough stored observations and the instruction to look for repetition in recall results.
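If you want to see what that repetition check amounts to mechanically, it can be approximated with a word-overlap grouping — a sketch, where the Jaccard similarity threshold and the greedy grouping are assumptions, not anything MemoClaw computes:

```python
def find_recurring(memories: list[str], min_count: int = 3,
                   min_overlap: float = 0.5) -> list[list[str]]:
    # Greedily group memories whose word overlap (Jaccard similarity) with a
    # group's first member meets min_overlap, then keep groups with at least
    # min_count members as recurring patterns.
    groups: list[list[str]] = []
    for text in memories:
        words = set(text.lower().split())
        for group in groups:
            anchor = set(group[0].lower().split())
            if len(words & anchor) / len(words | anchor) >= min_overlap:
                group.append(text)
                break
        else:
            groups.append([text])
    return [g for g in groups if len(g) >= min_count]

recalled = [
    "user asked for deploy status at 9am Monday",
    "user asked for deploy status at 9am Monday",
    "user asked for deploy report Monday morning",
    "CI pipeline needs Node 20",
]
patterns = find_recurring(recalled)  # one group: the three Monday requests
```

In practice the agent does this fuzzily by reading the recall results, which handles paraphrases better than any fixed threshold would.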
Importance decay and memory hygiene
Not all memories age well. “The staging server is down” was critical on March 3rd. By March 10th it’s noise.
MemoClaw doesn’t have automatic decay, but you can manage this manually. During periodic maintenance (a heartbeat task or scheduled cron), have your agent clean up:
# List old memories
memoclaw list --tags outcome --limit 20
# Delete stale ones by ID
memoclaw delete MEMORY_ID
A practical approach: store operational facts (server status, one-time fixes) at importance 0.5-0.6. Store durable knowledge (user preferences, repo patterns) at 0.8+. When recalling, the higher-importance memories rank higher in results, so stale low-importance memories naturally get pushed down.
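If you want actual decay rather than manual deletion, one option is to discount importance by age at recall time in your own tooling — an add-on sketch, not a MemoClaw feature:

```python
from datetime import datetime, timezone

def effective_importance(importance: float, stored_at: datetime,
                         half_life_days: float = 14.0) -> float:
    # Exponentially discount a memory's importance by its age: a 0.6
    # operational fact halves in two weeks, while a 0.95 durable preference
    # stays above typical recall thresholds for months.
    age_days = (datetime.now(timezone.utc) - stored_at).total_seconds() / 86400
    return importance * 0.5 ** (age_days / half_life_days)
```

You'd apply this to recall results before deciding what to act on, or use it during the maintenance pass to pick deletion candidates (anything whose effective importance has dropped below, say, 0.2).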
What this looks like in practice
After a month of consistent storing and recalling, your agent has built up something like institutional knowledge. It knows:
- Your preferred output format (learned from three corrections in week 1)
- That the staging API goes down during deploys (learned from two failures)
- That you want a Monday morning status report (learned from four repeated requests)
- That PR reviews on repo-xyz need to check for missing migrations (learned from a code review failure)
None of this was programmed. The agent learned it from running, failing, getting corrected, and storing what happened.
The cost is modest. If your agent stores 5 memories and recalls 5 per session at $0.005 each, that’s $0.05 per session. Over a month of daily use, about $1.50 for an agent that actually remembers what you’ve taught it. And remember, the first 100 calls are free — so you’re not paying anything until the loop is already working.
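That arithmetic generalizes into a back-of-envelope cost model — the per-call price and volumes are the ones quoted above, assumed flat:

```python
# Back-of-envelope cost model for the learning loop
STORES_PER_SESSION = 5
RECALLS_PER_SESSION = 5
PRICE_PER_CALL = 0.005          # USD, assumed flat per API call
SESSIONS_PER_MONTH = 30         # daily use

per_session = (STORES_PER_SESSION + RECALLS_PER_SESSION) * PRICE_PER_CALL
monthly_cost = per_session * SESSIONS_PER_MONTH  # $1.50
```

Plug in your own session counts to see where your setup lands; even heavy use stays in single-digit dollars per month at this price point.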
The real shift
Most agent setups treat sessions as independent. Each one is a fresh start with a clean slate. Cross-session learning changes that model: each session builds on the last.
The agent doesn’t get smarter by accident. You have to be deliberate about what gets stored, how it’s tagged, and when it’s recalled. But the mechanics are straightforward — store, tag, recall, act differently. After a few weeks, the difference is concrete: your agent offers the Monday deploy status before you ask for it, avoids the mistakes you corrected once, and knows which repos need special handling. That’s the loop working.