Importance scoring that actually works


Every time your OpenClaw agent stores a memory in MemoClaw, it can attach an importance score — a float between 0 and 1. Most agents ignore this, hardcoding 0.8 for everything and calling it a day.

That’s like highlighting every sentence in a textbook. If everything is important, nothing is.

Here’s how to actually use importance scoring: what scores to assign, how to automate the decision, and how it changes recall quality.

How importance affects recall

When you call memoclaw recall, the system runs semantic similarity against your query. But similarity isn’t the only factor — importance acts as a weight that boosts or dampens results.

A memory with importance 0.95 that’s somewhat relevant can outrank one with importance 0.3 that’s a closer text match. This is intentional: if your user said “I hate tabs, always use spaces,” that correction should surface even when the query is loosely related.

# Agent stored these memories:
# "User prefers VS Code" — importance: 0.5
# "User said: NEVER suggest Vim, I hate it" — importance: 0.95
# "User mentioned trying Neovim once" — importance: 0.3

memoclaw recall "text editor recommendations" --namespace my-agent
# Result order:
# 1. "User said: NEVER suggest Vim, I hate it" (importance boost)
# 2. "User prefers VS Code"
# 3. "User mentioned trying Neovim once"

The high-importance correction surfaces first, preventing the agent from suggesting something the user explicitly rejected. Without proper scoring, the Neovim mention might rank alongside the preference, and the agent’s response becomes a coin flip.

The four-tier system

Tier 1: Critical (0.9-1.0)

Explicit corrections. When a user says “No, that’s wrong” or “Actually, I prefer X,” store it near max importance. These override previous assumptions.

memoclaw store "User corrected: their name is Ana, not Anna" \
  --importance 0.95 --tags correction,personal

Hard boundaries. “Never do X” or “Always do Y” statements. Non-negotiable.

memoclaw store "User said: never send emails without asking first" \
  --importance 1.0 --tags boundary,rule

Tier 2: Strong preferences (0.7-0.9)

Stated preferences. “I like dark mode,” “I prefer concise responses,” “Use TypeScript, not JavaScript.”

memoclaw store "User prefers TypeScript over JavaScript for all projects" \
  --importance 0.8 --tags preference,technical

Personal information. Timezone, job role, tech stack, team members.

memoclaw store "User is a senior backend engineer at a fintech startup, timezone UTC-3" \
  --importance 0.75 --tags personal,context

Tier 3: Useful context (0.4-0.7)

Session summaries. “Today we worked on migrating the auth service to Postgres.” Helpful for continuity, not critical for any single recall.

memoclaw store "Session 2026-03-07: migrated auth service from MongoDB to Postgres, \
  updated connection pooling config" \
  --importance 0.6 --tags session,project-auth

Project details. Repository names, deployment targets, architecture decisions. Important in the right context, noise otherwise.

memoclaw store "Project Aurora uses Railway for API, Neon for Postgres, \
  Vercel for frontend" \
  --importance 0.55 --tags project,infrastructure

Tier 4: Background (0.1-0.4)

Casual observations. “User mentioned going on vacation next week.” Nice-to-have, shouldn’t crowd out real preferences.

memoclaw store "User mentioned upcoming vacation March 15-22" \
  --importance 0.35 --tags personal,scheduling

Ephemeral context. Things that are true now but won’t be in a month. Current sprint goals, temporary workarounds, meeting prep notes.

memoclaw store "Current sprint focus: fix payment webhook reliability" \
  --importance 0.3 --tags sprint,temporary

Building automatic scoring

Manually assigning scores for every memory isn’t practical. Instead, wire scoring logic into your agent’s instructions. Add this to your AGENTS.md:

## Memory scoring rules

When storing memories in MemoClaw, assign importance based on type:

| Type | Score | Example |
|------|-------|---------|
| User corrections | 0.95 | "No, I said X not Y" |
| Hard boundaries | 1.0 | "Never do X" |
| Stated preferences | 0.8 | "I prefer X" |
| Personal info | 0.75 | Name, timezone, role |
| Session summaries | 0.6 | "Today we worked on..." |
| Project context | 0.55 | Repo names, stack info |
| Observations | 0.35 | "User seemed interested in..." |
| Ephemeral notes | 0.3 | Sprint goals, temp workarounds |

Default to 0.5 if uncertain. When in doubt, score higher — it's easier to
find and downgrade an over-scored memory than to miss an under-scored one.

Your agent reads these instructions and applies them at store time. No code changes — just prompt engineering.

Keyword triggers

For agents that need more precision, add triggers:

## Scoring triggers

Boost to 0.9+ when the user says:
- "never", "always", "don't ever", "make sure you"
- "I told you", "I already said", "no, it's actually"
- "important:", "remember this"

Reduce to 0.3-0.4 when storing:
- Daily summaries, weather checks, routine logs
- "FYI", "by the way", "not a big deal but"

This gives your agent a heuristic without needing a separate scoring model.

What happens when you don’t differentiate

Recall query: "How does the user like their code formatted?"

Without scoring (all 0.8):
1. "User mentioned they use Prettier" (relevant)
2. "Session 2026-02-14: discussed code review workflow" (noise)
3. "User's project uses ESLint with Airbnb config" (somewhat relevant)
4. "User said 'always use tabs, 4-width, no exceptions'" (CRITICAL — buried at #4)

With proper scoring:
1. "User said 'always use tabs, 4-width, no exceptions'" (0.95 — surfaces first)
2. "User mentioned they use Prettier" (0.7)
3. "User's project uses ESLint with Airbnb config" (0.55)
4. "Session 2026-02-14: discussed code review workflow" (0.4 — deprioritized)

The explicit instruction is what matters. Proper scoring makes sure it rises to the top.

Updating scores over time

Memories don’t have static importance forever. Something urgent today may be irrelevant next month. MemoClaw lets you update scores:

# Sprint goal that's now completed
memoclaw update <memory-id> --importance 0.1

# Preference that got reinforced
memoclaw update <memory-id> --importance 0.9

A good practice: have your agent periodically review low-recall memories and adjust scores. If a memory hasn’t been recalled in 30 days and it’s scored above 0.5, it might be worth demoting. Keeps your memory clean and recall sharp.

Practical tips

The four-tier system covers 90% of cases. Don’t over-engineer scoring until you see recall problems.

For the first week, have your agent mention what memories it pulled — something like “Based on your preference for TypeScript…” in responses. This lets you verify the right memories are surfacing and catch scoring mistakes early.

If you do nothing else from this article: score corrections at 0.95+. An agent that remembers “actually it’s X, not Y” is noticeably better than one that keeps making the same mistake.

Importance scoring and tag filtering work together. A correction tagged correction at 0.95 and a preference tagged preference at 0.8 gives you two axes to filter on. Use both.

One thing to watch: every store and recall costs one API call. With 100 free calls, be intentional about what you store early on. After you’ve set up payments ($0.005 per call), you can be more liberal.

# Get started
clawhub install anajuliabit/memoclaw
npm install -g memoclaw

Full docs at docs.memoclaw.com.