Importance scoring strategies


Not all memories are equal. Here’s how to stop treating them like they are.


Every time your OpenClaw agent stores a memory in MemoClaw, it can attach an importance score — a float between 0 and 1. Most agents ignore this completely, hardcoding 0.8 for everything.

That’s like highlighting every sentence in a textbook. If everything is important, nothing is.

How importance affects recall

When you call memoclaw recall, the system runs semantic similarity against your query. But similarity isn’t the only factor. Importance acts as a weight that boosts or dampens results.

A memory with importance 0.95 that’s somewhat relevant can outrank one with importance 0.3 that’s slightly more relevant on pure text similarity. This is by design. If your user said “I hate tabs, always use spaces,” that instruction should surface even when the query is only loosely related.

# Agent stored these memories:
# "User prefers VS Code" — importance: 0.5
# "User said: NEVER suggest Vim, I hate it" — importance: 0.95
# "User mentioned trying Neovim once" — importance: 0.3

memoclaw recall "text editor recommendations" --namespace my-agent
# Result order:
# 1. "User said: NEVER suggest Vim, I hate it" (importance boost)
# 2. "User prefers VS Code"
# 3. "User mentioned trying Neovim once"

The high-importance correction surfaces first. Without proper scoring, the Neovim mention might rank alongside the preference, and now your agent’s response is a coin flip.
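To make the weighting concrete, here is a minimal Python sketch of one way similarity and importance could be blended. This is not MemoClaw's actual ranking formula, which isn't documented here. The `weighted_score` function, the `floor` parameter, and the similarity values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float  # 0.0 to 1.0, set at store time
    similarity: float  # hypothetical similarity to the query, 0.0 to 1.0


def weighted_score(memory: Memory, floor: float = 0.5) -> float:
    """Scale similarity by importance (illustrative formula, an assumption)."""
    # floor keeps a low-importance memory from being wiped out entirely.
    return memory.similarity * (floor + (1.0 - floor) * memory.importance)


def rank(memories):
    """Sort memories from highest to lowest weighted score."""
    return sorted(memories, key=weighted_score, reverse=True)


# Similarity numbers below are invented for the example.
memories = [
    Memory("User prefers VS Code", 0.5, 0.80),
    Memory("User said: NEVER suggest Vim, I hate it", 0.95, 0.72),
    Memory("User mentioned trying Neovim once", 0.3, 0.78),
]

for m in rank(memories):
    print(f"{weighted_score(m):.3f}  {m.text}")
```

With these invented numbers the Vim rule has the lowest raw similarity of the three but still ranks first, which is the behavior described above.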

The scoring heuristic

Four tiers. They cover most cases.

Tier 1: Critical (0.9–1.0)

Explicit corrections. When a user says “No, that’s wrong” or “Actually, I prefer X.” These override previous assumptions.

memoclaw store "User corrected: their name is Ana, not Anna" \
  --importance 0.95 \
  --tags "correction,personal" \
  --namespace my-agent

Hard boundaries. “Never do X” or “Always do Y.” Non-negotiable.

memoclaw store "User said: never send emails without asking first" \
  --importance 1.0 \
  --tags "boundary,rule" \
  --namespace my-agent

Tier 2: Strong preferences (0.7–0.9)

Stated preferences. “I like dark mode,” “Use TypeScript, not JavaScript.” Explicit but less urgent than corrections.

memoclaw store "User prefers TypeScript over JavaScript for all projects" \
  --importance 0.8 \
  --tags "preference,technical" \
  --namespace my-agent

Personal info. Timezone, job role, tech stack. Context your agent should reliably surface.

memoclaw store "User is a senior backend engineer at a fintech startup, timezone UTC-3" \
  --importance 0.75 \
  --tags "personal,context" \
  --namespace my-agent

Tier 3: Useful context (0.4–0.7)

Session summaries. “Today we worked on migrating the auth service.” Helpful for continuity, not critical for any single recall.

memoclaw store "Session 2026-03-07: migrated auth service from MongoDB to Postgres, \
  updated connection pooling config" \
  --importance 0.6 \
  --tags "session,project-auth" \
  --namespace my-agent

Project details. Repo names, deployment targets, architecture decisions. Important in the right context, noise otherwise.

memoclaw store "Project Aurora uses Railway for API, Neon for Postgres, \
  Vercel for frontend" \
  --importance 0.55 \
  --tags "project,infrastructure" \
  --namespace my-agent

Tier 4: Background (0.1–0.4)

Casual observations. “User mentioned upcoming vacation.” Nice-to-have, shouldn’t crowd out real preferences.

memoclaw store "User mentioned upcoming vacation March 15-22" \
  --importance 0.35 \
  --tags "personal,scheduling" \
  --namespace my-agent

Ephemeral context. Things true now but not in a month. Current sprint goals, temporary workarounds.

memoclaw store "Current sprint focus: fix payment webhook reliability" \
  --importance 0.3 \
  --tags "sprint,temporary" \
  --namespace my-agent
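The tier ranges above share their boundary values (0.9, 0.7, 0.4), so any code that maps a score back to a tier has to pick a side. A minimal Python sketch, assuming each boundary belongs to the higher tier and that anything below 0.1 also counts as background. The `tier_for` helper is hypothetical and not part of MemoClaw.

```python
def tier_for(importance: float) -> str:
    """Map an importance score to one of the four tiers described above."""
    if not 0.0 <= importance <= 1.0:
        raise ValueError(f"importance must be between 0 and 1, got {importance}")
    # Assumption: a boundary value belongs to the higher tier.
    if importance >= 0.9:
        return "critical"
    if importance >= 0.7:
        return "strong preference"
    if importance >= 0.4:
        return "useful context"
    return "background"


for score in (1.0, 0.8, 0.55, 0.3):
    print(score, tier_for(score))
```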

Building automatic scoring

Manually assigning scores for every memory doesn’t scale. Instead, build scoring logic into your agent’s instructions. Drop this in your AGENTS.md:

## Memory scoring rules

When storing memories in MemoClaw, assign importance based on type:

| Type | Score | Example |
|------|-------|---------|
| User corrections | 0.95 | "No, I said X not Y" |
| Hard boundaries | 1.0 | "Never do X" |
| Stated preferences | 0.8 | "I prefer X" |
| Personal info | 0.75 | Name, timezone, role |
| Session summaries | 0.6 | "Today we worked on..." |
| Project context | 0.55 | Repo names, stack info |
| Observations | 0.35 | "User seemed interested in..." |
| Ephemeral notes | 0.3 | Sprint goals, temp workarounds |

Default to 0.5 if the type is unclear. When torn between two scores, pick
the higher one: it is easier to find and downgrade an over-scored memory
than to notice one that was under-scored and never surfaced.

Your agent reads these instructions and applies them at store time. No code changes needed.
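If you store memories from your own code rather than through prompt instructions, the same table works as a lookup. The sketch below is a hedged Python example: the type keys and the `importance_for` and `store_command` helpers are hypothetical, and it only builds the command string using the flags shown earlier in this article, without running it.

```python
import shlex

# Scores copied from the table above; the type keys are hypothetical labels.
SCORE_BY_TYPE = {
    "correction": 0.95,
    "boundary": 1.0,
    "preference": 0.8,
    "personal": 0.75,
    "session": 0.6,
    "project": 0.55,
    "observation": 0.35,
    "ephemeral": 0.3,
}
DEFAULT_SCORE = 0.5  # fallback when the type is unknown


def importance_for(memory_type: str) -> float:
    """Look up the importance score for a memory type."""
    return SCORE_BY_TYPE.get(memory_type, DEFAULT_SCORE)


def store_command(text: str, memory_type: str, namespace: str) -> str:
    """Build (but do not run) a memoclaw store command line."""
    argv = [
        "memoclaw", "store", text,
        "--importance", str(importance_for(memory_type)),
        "--tags", memory_type,
        "--namespace", namespace,
    ]
    return shlex.join(argv)  # quotes the memory text safely for a shell


print(store_command(
    "User prefers TypeScript over JavaScript for all projects",
    "preference",
    "my-agent",
))
```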

Keyword-based scoring hints

For agents that need more precision, add triggers:

## Scoring triggers

Boost to 0.9+ when the user says:
- "never", "always", "don't ever", "make sure you"
- "I told you", "I already said", "no, it's actually"
- "important:", "remember this"

Reduce to 0.3-0.4 when storing:
- Daily summaries, weather checks, routine logs
- "FYI", "by the way", "not a big deal but"
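The phrase triggers above can also be checked in code before a store call. A minimal Python sketch under a few assumptions: boost phrases win over reduce phrases when both appear, "0.9+" is applied as a floor of 0.9, and "0.3-0.4" is applied as a cap of 0.35. The `adjust_importance` helper is hypothetical. It only covers phrase matching, not type-based cases like daily summaries or routine logs.

```python
import re

BOOST_PHRASES = [
    "never", "always", "don't ever", "make sure you",
    "I told you", "I already said", "no, it's actually",
    "important:", "remember this",
]
REDUCE_PHRASES = ["FYI", "by the way", "not a big deal but"]


def _compile(phrase: str) -> re.Pattern:
    """Case-insensitive match with word boundaries on alphanumeric edges."""
    start = r"\b" if phrase[0].isalnum() else ""
    end = r"\b" if phrase[-1].isalnum() else ""
    return re.compile(start + re.escape(phrase) + end, re.IGNORECASE)


BOOST = [_compile(p) for p in BOOST_PHRASES]
REDUCE = [_compile(p) for p in REDUCE_PHRASES]


def adjust_importance(text: str, base: float) -> float:
    """Raise or lower a base score according to trigger phrases in the text."""
    normalized = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    if any(p.search(normalized) for p in BOOST):
        return max(base, 0.9)   # assumption: boost takes precedence
    if any(p.search(normalized) for p in REDUCE):
        return min(base, 0.35)  # assumption: midpoint of the 0.3-0.4 range
    return base


print(adjust_importance("Never send emails without asking first", 0.5))
print(adjust_importance("FYI, I tried Neovim once", 0.5))
print(adjust_importance("Nevertheless, the build works", 0.5))
```

The word-boundary check keeps "nevertheless" from triggering the "never" rule, which plain substring matching would get wrong.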

The “everything is 0.8” problem

Here’s what happens when you don’t differentiate:

Recall query: "How does the user like their code formatted?"

Without scoring (all 0.8):
1. "User mentioned they use Prettier" (relevant)
2. "Session 2026-02-14: discussed code review workflow" (noise)
3. "User's project uses ESLint with Airbnb config" (somewhat relevant)
4. "User said 'always use tabs, 4-width, no exceptions'" (CRITICAL — buried at #4)

With proper scoring:
1. "User said 'always use tabs, 4-width, no exceptions'" (0.95 — surfaces first)
2. "User mentioned they use Prettier" (0.7)
3. "User's project uses ESLint with Airbnb config" (0.55)
4. "Session 2026-02-14: discussed code review workflow" (0.4 — correctly deprioritized)

The explicit instruction is what matters. Proper scoring makes it surface first.

Updating scores over time

Importance isn’t static. Something urgent today may be irrelevant next month. MemoClaw lets you update scores:

# Sprint goal that's now completed
memoclaw update <memory-id> --importance 0.1

# Preference that got reinforced
memoclaw update <memory-id> --importance 0.9

A good practice: have your agent periodically review low-recall memories and adjust scores. If a memory hasn’t been recalled in 30 days and it’s scored above 0.5, it might be worth demoting.
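The 30-day rule can be sketched as a small filter. This Python example assumes you have a last-recalled timestamp for each memory. Whether MemoClaw exposes one is not confirmed here, so `last_recalled`, `StoredMemory`, and `demotion_candidates` are hypothetical names, and the demoted value of 0.4 is an arbitrary choice. It prints the update commands for review instead of running them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredMemory:
    memory_id: str
    importance: float
    last_recalled: datetime  # hypothetical field; track it yourself if needed


def demotion_candidates(memories, now, max_idle_days=30, threshold=0.5):
    """Return memories scored above threshold that have not been recalled recently."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        m for m in memories
        if m.importance > threshold and m.last_recalled < cutoff
    ]


now = datetime(2026, 3, 7)  # fixed date so the example is reproducible
memories = [
    StoredMemory("mem-a", 0.6, datetime(2026, 1, 10)),  # idle and scored above 0.5
    StoredMemory("mem-b", 0.8, datetime(2026, 3, 1)),   # recalled recently
    StoredMemory("mem-c", 0.3, datetime(2025, 12, 1)),  # idle but already low
]

for m in demotion_candidates(memories, now):
    # Print the command for a human or the agent to review before running it.
    print(f"memoclaw update {m.memory_id} --importance 0.4")
```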

Practical tips

The four-tier system covers most cases. Don’t over-engineer scoring until you actually see recall problems.

For the first week, have your agent mention which memories it pulled, for example by opening a response with “Based on your preference for TypeScript…”. This lets you spot scoring mistakes early.

If you do nothing else from this article, score corrections at 0.95+. An agent that remembers “actually it’s X, not Y” is noticeably better than one that keeps making the same mistake.

Importance scoring and tag filtering work together. A memory tagged correction with score 0.95 and another tagged preference with score 0.8 give you two axes to filter on. Use both.

One thing to watch: every store and recall costs one API call. With 100 free calls, be intentional about what you store early on. Once you’ve set up payments ($0.005 per call), you can be more liberal.
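A quick way to budget for this is a rough estimate in Python. It assumes the 100 free calls are used up first and every call after that is billed at $0.005. Check the docs for the actual billing rules. The `estimated_cost_usd` helper is hypothetical.

```python
FREE_CALLS = 100            # from this article
PRICE_PER_CALL_USD = 0.005  # from this article


def estimated_cost_usd(stores: int, recalls: int) -> float:
    """Estimate spend, assuming free calls are consumed before paid ones."""
    total_calls = stores + recalls
    billable = max(0, total_calls - FREE_CALLS)
    return round(billable * PRICE_PER_CALL_USD, 3)


print(estimated_cost_usd(stores=20, recalls=50))    # 0.0
print(estimated_cost_usd(stores=300, recalls=700))  # 4.5
```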


# Get started
clawhub install anajuliabit/memoclaw
npm install -g memoclaw

Full docs and API reference at docs.memoclaw.com.