Importance scoring strategies
Not all memories are equal. Here’s how to stop treating them like they are.
Every time your OpenClaw agent stores a memory in MemoClaw, it can attach an importance score — a float between 0 and 1. Most agents ignore this completely, hardcoding 0.8 for everything.
That’s like highlighting every sentence in a textbook. If everything is important, nothing is.
How importance affects recall
When you call memoclaw recall, the system runs semantic similarity against your query. But similarity isn’t the only factor. Importance acts as a weight that boosts or dampens results.
A memory with importance 0.95 that’s somewhat relevant can outrank one with importance 0.3 that’s slightly more relevant on pure text similarity. This is by design. If your user said “I hate tabs, always use spaces,” that correction should surface even when the query is loosely related.
# Agent stored these memories:
# "User prefers VS Code" — importance: 0.5
# "User said: NEVER suggest Vim, I hate it" — importance: 0.95
# "User mentioned trying Neovim once" — importance: 0.3
memoclaw recall "text editor recommendations" --namespace my-agent
# Result order:
# 1. "User said: NEVER suggest Vim, I hate it" (importance boost)
# 2. "User prefers VS Code"
# 3. "User mentioned trying Neovim once"
The high-importance correction surfaces first. Without proper scoring, the Neovim mention might rank alongside the preference, and now your agent’s response is a coin flip.
The scoring heuristic
Four tiers. Covers 90% of cases.
Tier 1: Critical (0.9–1.0)
Explicit corrections. When a user says “No, that’s wrong” or “Actually, I prefer X.” These override previous assumptions.
memoclaw store "User corrected: their name is Ana, not Anna" \
--importance 0.95 \
--tags "correction,personal" \
--namespace my-agent
Hard boundaries. “Never do X” or “Always do Y.” Non-negotiable.
memoclaw store "User said: never send emails without asking first" \
--importance 1.0 \
--tags "boundary,rule" \
--namespace my-agent
Tier 2: Strong preferences (0.7–0.9)
Stated preferences. “I like dark mode,” “Use TypeScript, not JavaScript.” Explicit but less urgent than corrections.
memoclaw store "User prefers TypeScript over JavaScript for all projects" \
--importance 0.8 \
--tags "preference,technical" \
--namespace my-agent
Personal info. Timezone, job role, tech stack. Context your agent should reliably surface.
memoclaw store "User is a senior backend engineer at a fintech startup, timezone UTC-3" \
--importance 0.75 \
--tags "personal,context" \
--namespace my-agent
Tier 3: Useful context (0.4–0.7)
Session summaries. “Today we worked on migrating the auth service.” Helpful for continuity, not critical for any single recall.
memoclaw store "Session 2026-03-07: migrated auth service from MongoDB to Postgres, \
updated connection pooling config" \
--importance 0.6 \
--tags "session,project-auth" \
--namespace my-agent
Project details. Repo names, deployment targets, architecture decisions. Important in the right context, noise otherwise.
memoclaw store "Project Aurora uses Railway for API, Neon for Postgres, \
Vercel for frontend" \
--importance 0.55 \
--tags "project,infrastructure" \
--namespace my-agent
Tier 4: Background (0.1–0.4)
Casual observations. “User mentioned upcoming vacation.” Nice-to-have, shouldn’t crowd out real preferences.
memoclaw store "User mentioned upcoming vacation March 15-22" \
--importance 0.35 \
--tags "personal,scheduling" \
--namespace my-agent
Ephemeral context. Things true now but not in a month. Current sprint goals, temporary workarounds.
memoclaw store "Current sprint focus: fix payment webhook reliability" \
--importance 0.3 \
--tags "sprint,temporary" \
--namespace my-agent
Building automatic scoring
Manually assigning scores for every memory doesn’t scale. Instead, build scoring logic into your agent’s instructions. Drop this in your AGENTS.md:
## Memory scoring rules
When storing memories in MemoClaw, assign importance based on type:
| Type | Score | Example |
|------|-------|---------|
| User corrections | 0.95 | "No, I said X not Y" |
| Hard boundaries | 1.0 | "Never do X" |
| Stated preferences | 0.8 | "I prefer X" |
| Personal info | 0.75 | Name, timezone, role |
| Session summaries | 0.6 | "Today we worked on..." |
| Project context | 0.55 | Repo names, stack info |
| Observations | 0.35 | "User seemed interested in..." |
| Ephemeral notes | 0.3 | Sprint goals, temp workarounds |
Default to 0.5 if uncertain. When in doubt, score higher — easier to find
and downgrade an over-scored memory than to miss an under-scored one.
Your agent reads these instructions and applies them at store time. No code changes needed.
Keyword-based scoring hints
For agents that need more precision, add triggers:
## Scoring triggers
Boost to 0.9+ when the user says:
- "never", "always", "don't ever", "make sure you"
- "I told you", "I already said", "no, it's actually"
- "important:", "remember this"
Reduce to 0.3-0.4 when storing:
- Daily summaries, weather checks, routine logs
- "FYI", "by the way", "not a big deal but"
The “everything is 0.8” problem
Here’s what happens when you don’t differentiate:
Recall query: "How does the user like their code formatted?"
Without scoring (all 0.8):
1. "User mentioned they use Prettier" (relevant)
2. "Session 2026-02-14: discussed code review workflow" (noise)
3. "User's project uses ESLint with Airbnb config" (somewhat relevant)
4. "User said 'always use tabs, 4-width, no exceptions'" (CRITICAL — buried at #4)
With proper scoring:
1. "User said 'always use tabs, 4-width, no exceptions'" (0.95 — surfaces first)
2. "User mentioned they use Prettier" (0.7)
3. "User's project uses ESLint with Airbnb config" (0.55)
4. "Session 2026-02-14: discussed code review workflow" (0.4 — correctly deprioritized)
The explicit instruction is what matters. Proper scoring makes it surface first.
Updating scores over time
Memories don’t have static importance forever. Something urgent today may be irrelevant next month. MemoClaw lets you update scores:
# Sprint goal that's now completed
memoclaw update <memory-id> --importance 0.1
# Preference that got reinforced
memoclaw update <memory-id> --importance 0.9
A good practice: have your agent periodically review low-recall memories and adjust scores. If a memory hasn’t been recalled in 30 days and it’s scored above 0.5, it might be worth demoting.
Practical tips
The four-tier system covers most cases. Don’t over-engineer scoring until you actually see recall problems.
For the first week, have your agent mention what memories it pulled. Something like “Based on your preference for TypeScript…” in its responses. This lets you spot scoring mistakes early.
If you do nothing else from this article, score corrections at 0.95+. An agent that remembers “actually it’s X, not Y” is noticeably better than one that keeps making the same mistake.
Importance scoring and tag filtering work together. A memory tagged correction with score 0.95 and another tagged preference with score 0.8 gives you two axes to filter on. Use both.
One thing to watch: every store and recall costs one API call. With 100 free calls, be intentional about what you store early on. Once you’ve set up payments ($0.005 per call), you can be more liberal.
# Get started
clawhub install anajuliabit/memoclaw
npm install -g memoclaw
Full docs and API reference at docs.memoclaw.com.