Memory poisoning defense: protecting your agent from bad data
Your agent stores a fact. That fact is wrong. Now every future recall is contaminated.
This is memory poisoning, and it’s different from prompt injection. Prompt injection tricks an agent in a single session. Memory poisoning persists. The bad data sits in your agent’s memory store, surfacing in future recalls, influencing decisions across sessions, compounding over time. A single poisoned entry can corrupt downstream reasoning indefinitely.
If you’re building production agents on MemoClaw, here’s what you can do about it right now.
The threat model
Memory poisoning happens when your agent stores information that is:
- Incorrect — a hallucination gets stored as fact (“user’s timezone is UTC+9” when it’s UTC-3)
- Manipulated — someone feeds the agent false info that gets persisted (“always use rm -rf for cleanup”)
- Stale — something that was true six months ago but isn’t anymore, still ranking high in recall
The attack surface depends on where your agent gets data. User messages, tool outputs, web scrapes, other agents — each is an ingestion point where bad data can enter.
You can’t fix this by improving your system prompt. The data is already in the store.
Pin your ground truth
MemoClaw’s pinned flag marks memories as exempt from decay. Use this for verified, high-confidence facts that should always surface in recall.
memoclaw store "Production database is PostgreSQL 16 on AWS RDS us-east-1" \
--importance 0.95 \
--tags verified,infrastructure \
--memory-type decision \
--pinned true
Pinned memories don’t decay, so they consistently outrank stale or low-importance entries. Think of them as your agent’s source of truth.
The discipline: don’t pin everything. Pin the things that, if wrong, would cause real damage. Infrastructure details, user identity facts, security constraints. Pin too much and you dilute the signal.
Importance scoring as trust weighting
Every memory in MemoClaw has an importance score from 0 to 1. This directly affects recall ranking. Use it as a proxy for how much you trust the source.
# High trust — you verified this yourself
memoclaw store "API rate limit is 100 req/min per key" \
--importance 0.95 --tags verified,api --memory-type decision
# Medium trust — reliable source, unverified
memoclaw store "User mentioned they prefer dark mode" \
--importance 0.7 --tags preferences --memory-type preference
# Low trust — extracted automatically, not verified
memoclaw store "Meeting scheduled for Thursday per email scan" \
--importance 0.4 --tags calendar --memory-type observation
When your agent recalls, higher-importance memories rank higher. On the same topic, a poisoned memory at 0.4 gets buried by a verified one at 0.95.
The pattern: assign importance based on how the data entered the system, not how important the topic feels. User-verified data gets 0.8+. Auto-extracted data starts at 0.4-0.5 until confirmed.
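One way to keep importance tied to provenance is to compute it from the ingestion channel instead of letting the agent pick a number per memory. The sketch below is a minimal Python illustration, not part of MemoClaw: the channel names, the `BASE_IMPORTANCE` table, and the `importance_for` helper are all hypothetical, and the values simply mirror the ranges given above.

```python
# Hypothetical trust table: maps how data entered the system to a starting
# importance. The channel names are illustrative, not MemoClaw concepts.
BASE_IMPORTANCE = {
    "operator_verified": 0.95,  # you checked it yourself
    "user_stated": 0.7,         # reliable source, unverified
    "auto_extracted": 0.4,      # email scans, scrapes, tool output
}

CONFIRMED_FLOOR = 0.8  # mirrors "User-verified data gets 0.8+"


def importance_for(source: str, confirmed: bool = False) -> float:
    """Return an importance score based on provenance, not topic."""
    # Unknown channels get the lowest trust level instead of raising.
    base = BASE_IMPORTANCE.get(source, min(BASE_IMPORTANCE.values()))
    if confirmed:
        return max(base, CONFIRMED_FLOOR)
    return base


if __name__ == "__main__":
    print(importance_for("auto_extracted"))
    print(importance_for("auto_extracted", confirmed=True))
    print(importance_for("some_new_channel"))
```

The result would be passed as the `--importance` value when storing. Defaulting unknown channels to the lowest score is a design choice: a new, unreviewed feed should have to earn trust, not inherit it.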
Namespace isolation
If your agent ingests data from sources with different trust levels, don’t dump everything into default. Use namespaces to isolate them.
# Trusted internal knowledge
memoclaw store "Deploy process requires approval from #ops channel" \
--namespace internal --importance 0.9 --memory-type decision
# Data from external sources
memoclaw store "Competitor launched a new API yesterday" \
--namespace external-intel --importance 0.5 --memory-type observation
When recalling, you control which namespaces to query:
# Only recall from trusted namespace
memoclaw recall "deployment process" --namespace internal
This won’t prevent poisoning within a namespace, but it limits blast radius. Bad data in external-intel can’t contaminate recalls against internal.
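If the agent builds its own `memoclaw store` calls, the namespace can be chosen by a routing table so that no source writes to default by accident. This is a rough Python sketch under stated assumptions: the source names, `NAMESPACE_BY_SOURCE`, and `store_command` are hypothetical, and only the flags already shown above (`--namespace`, `--importance`, `--memory-type`) are used.

```python
import shlex

# Hypothetical routing table: which namespace each ingestion source writes to.
NAMESPACE_BY_SOURCE = {
    "ops_runbook": "internal",
    "web_scrape": "external-intel",
    "other_agent": "external-intel",
}


def store_command(text: str, source: str, importance: float, memory_type: str) -> list[str]:
    """Build the argv for a `memoclaw store` call, refusing unrouted sources."""
    if source not in NAMESPACE_BY_SOURCE:
        # Fail closed: never fall back to the default namespace silently.
        raise ValueError(f"no namespace configured for source {source!r}")
    return [
        "memoclaw", "store", text,
        "--namespace", NAMESPACE_BY_SOURCE[source],
        "--importance", str(importance),
        "--memory-type", memory_type,
    ]


if __name__ == "__main__":
    argv = store_command(
        "Competitor launched a new API yesterday",
        "web_scrape", 0.5, "observation",
    )
    # Print a copy-pasteable command; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Returning an argument list, not a formatted string, means the memory text never passes through shell parsing, which matters when that text comes from an untrusted source.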
Validate before storing
MemoClaw doesn’t validate the truthfulness of what you store. That’s your agent’s job. Build validation into your storage pipeline.
A pattern that works for OpenClaw agents:
- Before storing, recall existing memories on the same topic
- If the new fact contradicts a pinned or high-importance memory, flag it instead of storing blindly
- Use the contradicts relation type to track conflicts
# Check what you already know
memoclaw recall "database provider" --namespace internal --limit 3
# If new info contradicts existing high-importance memory,
# store with lower importance and mark the conflict
memoclaw store "Someone mentioned we're migrating to MySQL" \
--importance 0.3 --tags unverified,needs-review --memory-type observation
# Create a contradiction relation
memoclaw relations create <new-memory-id> <existing-memory-id> contradicts
Your agent now has both pieces of information but knows they conflict. A well-designed recall strategy prefers the higher-importance, pinned version while flagging the contradiction for human review.
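The check-before-store step can be sketched as a small gate that runs on the recall results. Everything here is illustrative: the `Memory` shape, the `HIGH_TRUST` threshold, and the caller-supplied `contradicts` predicate are assumptions, and detecting a contradiction is the hard part that this sketch leaves to you (an LLM call, a rule set, or a human).

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Memory:
    """Assumed minimal shape of a recalled memory."""
    id: str
    content: str
    importance: float
    pinned: bool = False


HIGH_TRUST = 0.8  # assumed cutoff for "high-importance"


def conflicts_to_flag(
    new_fact: str,
    recalled: Iterable[Memory],
    contradicts: Callable[[str, str], bool],
) -> list[str]:
    """Return ids of trusted memories that the new fact contradicts."""
    return [
        m.id
        for m in recalled
        if (m.pinned or m.importance >= HIGH_TRUST)
        and contradicts(new_fact, m.content)
    ]


if __name__ == "__main__":
    recalled = [
        Memory("m1", "Production database is PostgreSQL 16 on AWS RDS us-east-1", 0.95, pinned=True),
        Memory("m2", "Staging might still be on PostgreSQL 14", 0.3),
    ]

    # Toy predicate for the demo only; a real check needs actual reasoning.
    def toy_contradicts(new: str, old: str) -> bool:
        return "MySQL" in new and "PostgreSQL" in old

    print(conflicts_to_flag("Someone mentioned we're migrating to MySQL", recalled, toy_contradicts))
```

An empty result means the fact can be stored normally. A non-empty result means storing it at low importance with the needs-review tag and creating one contradicts relation per returned id.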
Regular memory audits
MemoClaw’s list and export endpoints are free. Use them.
# List recent memories
memoclaw list --limit 50 --namespace default
# Check for stale memories
memoclaw suggested --category stale --limit 20
# Check for decaying memories
memoclaw suggested --category decaying --limit 20
Build a periodic audit into your agent’s routine. In OpenClaw, set up a cron job or heartbeat task that:
- Lists memories stored in the last 24 hours
- Flags any with importance below 0.5 that contradict higher-importance memories
- Reviews memories tagged needs-review
- Runs consolidation to merge duplicates
# Consolidate duplicates (dry run first)
memoclaw consolidate --namespace default --dry-run
# If the clusters look right, merge them
memoclaw consolidate --namespace default
Duplicates are another form of noise. Consolidation keeps things clean.
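The flagging part of that audit routine could look something like the following. This is only a sketch over an already-fetched list: the record fields (`id`, `importance`, `tags`, `created_at`, `contradicts`) are an assumed shape, not the documented output of `memoclaw list`, and `audit_queue` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def audit_queue(memories, now, window=timedelta(hours=24)):
    """Pick memories a reviewer should look at, with the reason for each."""
    by_id = {m["id"]: m for m in memories}
    queue = []
    for m in memories:
        if "needs-review" in m["tags"]:
            queue.append((m["id"], "tagged needs-review"))
            continue
        recent = now - m["created_at"] <= window
        low = m["importance"] < 0.5
        # A conflict counts only if the other memory is more trusted.
        outranked = any(
            by_id[other]["importance"] > m["importance"]
            for other in m["contradicts"]
            if other in by_id
        )
        if recent and low and outranked:
            queue.append((m["id"], "low-importance contradiction"))
    return queue


if __name__ == "__main__":
    now = datetime(2025, 1, 2, tzinfo=timezone.utc)
    memories = [
        {"id": "a", "importance": 0.95, "tags": ["verified"],
         "created_at": now - timedelta(days=200), "contradicts": []},
        {"id": "b", "importance": 0.3, "tags": [],
         "created_at": now - timedelta(hours=2), "contradicts": ["a"]},
        {"id": "c", "importance": 0.4, "tags": ["needs-review"],
         "created_at": now - timedelta(days=30), "contradicts": []},
    ]
    for memory_id, reason in audit_queue(memories, now):
        print(memory_id, reason)
```

Memories tagged needs-review are queued regardless of age, so a flagged entry does not silently drop out of the audit once it is older than the window.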
Memory types and decay
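MemoClaw’s memory types have built-in decay half-lives: 14 days for observations, 30 days for project context, 90 days for decisions, and 180 days for preferences and corrections. A half-life is the time it takes a memory to lose half its ranking weight, not the point at which it disappears.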
Use this deliberately. Data from untrusted sources should be stored as observation type — if it’s wrong, it’ll naturally lose ranking weight. Verified facts should be decision or correction type with higher importance, so they persist longer.
# Unverified external data — decays fast
memoclaw store "Blog post claims 10x performance improvement" \
--memory-type observation --importance 0.3
# Verified architecture decision: slow-decaying type, and pinned so it is exempt from decay
memoclaw store "We chose pgvector over Pinecone for cost reasons" \
--memory-type decision --importance 0.9 --pinned true
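To get a feel for what those half-lives mean in practice, here is a small worked example. It assumes plain exponential decay applied to the importance score, which is the usual meaning of "half-life" but is not confirmed to be MemoClaw's actual ranking formula. The `effective_weight` helper and the `project` key are hypothetical names; the day counts come from the list above.

```python
HALF_LIFE_DAYS = {
    "observation": 14,
    "project": 30,       # "project context"; the exact type name is assumed
    "decision": 90,
    "preference": 180,
    "correction": 180,
}


def effective_weight(importance, memory_type, age_days, pinned=False):
    """Importance scaled by exponential half-life decay (assumed model)."""
    if pinned:
        return importance  # pinned memories are exempt from decay
    half_life = HALF_LIFE_DAYS[memory_type]
    return importance * 0.5 ** (age_days / half_life)


if __name__ == "__main__":
    # Unverified blog claim stored as an observation, four weeks later.
    print(round(effective_weight(0.3, "observation", 28), 3))
    # Unpinned decision after one half-life.
    print(round(effective_weight(0.9, "decision", 90), 3))
    # The same decision, pinned.
    print(round(effective_weight(0.9, "decision", 90, pinned=True), 3))
```

Under this model, the 0.3 observation drops to a quarter of its weight after 28 days (two half-lives), while the pinned 0.9 decision is unchanged. The gap between trusted and untrusted data therefore widens over time without any manual cleanup.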
Layering it all together
No single pattern stops memory poisoning completely. The defense is layered:
- Pin verified ground truth so it always wins in recall
- Use importance as a trust signal, not a topic-importance signal
- Isolate sources with namespaces so one bad feed doesn’t contaminate everything
- Validate new facts against existing high-trust memories before storing
- Audit regularly with the free list/suggested endpoints
- Let decay do its work — untrusted data should have short half-lives
Memory poisoning is a real risk for any agent that persists information across sessions. If you’re already using MemoClaw, you have the primitives. You just need to use them deliberately.