How X-Mem Thinks: A Mental Model for Belief-Aware Memory


This is the third post in a series on X-Mem. Blog 1 made the case that existing memory tools do retrieval, not belief management. Blog 2 went deep on the architecture. This post sits between them: a conceptual map for developers who want to understand how X-Mem thinks before they start building with it.
If you've integrated a memory layer into an agent before, you've probably written some version of this pipeline: embed the user message, search a vector store, stuff the top-k chunks into the context window, call the model. It works until it doesn't — until the user changes their mind, until a fact from three sessions ago quietly contradicts a fact from yesterday, until your agent confidently acts on information that's no longer true.
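For concreteness, here is that naive pipeline as a minimal Python sketch. The `Chunk` shape and the injected `embed` and `llm` callables are generic stand-ins, not X-Mem code:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Chunk:
    text: str
    vector: List[float]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def naive_pipeline(message: str,
                   store: List[Chunk],
                   embed: Callable[[str], List[float]],
                   llm: Callable[[str], str],
                   k: int = 5) -> str:
    query = embed(message)
    top = sorted(store, key=lambda c: cosine(query, c.vector), reverse=True)[:k]
    context = "\n".join(c.text for c in top)
    # Nothing here asks whether a chunk is still true: a fact the user
    # retracted three sessions ago ranks exactly as well as its correction.
    return llm(f"Context:\n{context}\n\nUser: {message}")
```

Every line of that function is about similarity. None of it is about validity, which is exactly the gap the rest of this post is about.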
X-Mem is designed around a different premise: memory is not a search problem. It's a state management problem. The system needs to track not just what it knows, but whether that knowledge is still valid — and how to reconcile beliefs when they conflict.
Think of how a good version control system works. It doesn't just store code — it stores the commit message, the reason for the change, the history of what came before. Reverting a commit doesn't delete the old state; it marks it superseded, with a pointer to what replaced it. X-Mem does that for beliefs.
Here's how that translates into a mental model.
The Data Model: Five Layers, Each With a Job
X-Mem processes raw conversation through a five-layer hierarchy. Understanding what each layer is for is more useful than knowing how it's implemented; a rough code sketch of the shapes follows the descriptions below.
Events are individual conversation turns — the raw input, uninterpreted.
Batches group events into coherent processing units. The grouping is deterministic: the same conversation always produces the same batches, making the pipeline reproducible and auditable.
Episodes are narrative summaries of each batch — "what happened in this session." They give the system a way to reason about a conversation without re-reading every raw turn.
Facts are where the interesting design work lives. A fact is a single atomic belief about the user or their context. X-Mem distinguishes four types by lifespan: preference (persists indefinitely), personal (biographical background), decision (project-lifetime rationale), and context (session-scoped, ephemeral).
Every fact carries provenance (which conversation it came from, which episode) and a lifecycle state: ACTIVE, REVISED, or RETRACTED. That lifecycle state is the key design choice that separates X-Mem from a system that just accumulates embeddings.
Artifacts are versioned outputs — code, documents, plans — organized into version chains. Each artifact automatically generates descriptor facts that make it searchable through the same system as everything else.
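To make the hierarchy concrete, here is one way those layers could be shaped in Python. Every class and field name below is illustrative, inferred from the descriptions above rather than taken from the X-Mem schema:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class FactType(Enum):
    PREFERENCE = "preference"  # persists indefinitely
    PERSONAL = "personal"      # biographical background
    DECISION = "decision"      # project-lifetime rationale
    CONTEXT = "context"        # session-scoped, ephemeral

class FactState(Enum):
    ACTIVE = "active"
    REVISED = "revised"        # replaced by a newer belief
    RETRACTED = "retracted"    # withdrawn, nothing replaced it

@dataclass
class Event:
    turn_id: str
    content: str               # raw conversation turn, uninterpreted

@dataclass
class Batch:
    events: List[Event]        # deterministic grouping: same input, same batches

@dataclass
class Episode:
    episode_id: str
    summary: str               # "what happened in this session"

@dataclass
class Fact:
    statement: str             # one atomic belief
    fact_type: FactType
    state: FactState = FactState.ACTIVE
    source_conversation: str = ""        # provenance
    source_episode: str = ""             # provenance
    superseded_by: Optional[str] = None  # set when state is REVISED

@dataclass
class Artifact:
    artifact_id: str
    version: int
    previous_version: Optional[str] = None   # version chain
    descriptor_facts: List[Fact] = field(default_factory=list)
```

Notice that a `Fact` never stands alone: it always knows where it came from and whether it's still in force.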
How Conflicts Get Resolved
The hard problem in memory isn't storage. It's what happens when new information contradicts existing information.
X-Mem applies the AGM framework (Alchourrón, Gärdenfors, and Makinson) from belief revision theory. Three cases, each sketched in code below:
Expansion: new fact arrives, no conflict. Stored as ACTIVE, nothing else touched.
Revision: new fact contradicts an existing active fact. An LLM-based resolver determines the outcome — new fact wins, the two beliefs merge, they represent temporal progression, or they're actually compatible and both stay. Whatever the outcome, the resolution is auditable: you can trace exactly why a belief changed and what replaced it.
Contraction: explicit denial with no replacement — "I don't have that dog anymore." The old fact is marked RETRACTED, not deleted. The distinction matters: RETRACTED means the belief was withdrawn without replacement; REVISED means it was replaced.
One design principle worth internalizing: the system is deliberately harder to accidentally mislead than to deliberately update. User-stated facts outrank LLM-inferred facts.
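Sketched as a dispatch, the three operations might look like this. It builds on the illustrative `Fact` and `FactState` shapes from the data model sketch above, and the `resolver` callable stands in for the LLM-based resolver; none of this is X-Mem's actual API:

```python
from enum import Enum
from typing import Callable, List, Optional

class Resolution(Enum):
    SUPERSEDE = "supersede"    # new fact wins
    MERGE = "merge"            # the two beliefs combine
    TEMPORAL = "temporal"      # old belief -> new belief progression
    COMPATIBLE = "compatible"  # no real conflict; both stay ACTIVE

def apply_update(store: List[Fact],
                 new_fact: Fact,
                 conflict: Optional[Fact],
                 resolver: Callable[[Fact, Fact], Resolution],
                 is_denial: bool = False) -> None:
    # Contraction: explicit denial with no replacement. Mark, never delete.
    if is_denial and conflict is not None:
        conflict.state = FactState.RETRACTED
        return
    # Expansion: nothing to reconcile; store as ACTIVE, touch nothing else.
    if conflict is None:
        store.append(new_fact)
        return
    # Revision: the resolver decides the outcome, which is recorded so the
    # change stays auditable. (Inside the resolver, user-stated facts would
    # outrank LLM-inferred ones.)
    outcome = resolver(conflict, new_fact)
    if outcome is not Resolution.COMPATIBLE:
        conflict.state = FactState.REVISED
        conflict.superseded_by = new_fact.statement
    store.append(new_fact)
```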
One honest note on benchmarking: X-Mem's architectural predecessor scores above full-context retrieval on the LoCoMo benchmark, a bar usually treated as the theoretical ceiling. But the lead engineer is candid: when the system was tuned to store more selectively, scores dropped, because current benchmarks reward recall volume, not revision quality. "Current memory benchmarks are a bit rigged," in his words. X-Mem-specific numbers will ship with the API launch.
Retrieval: The Goal Is Understanding, Not Proximity
Once beliefs are stored correctly, retrieval needs to surface the right ones — not just the semantically nearest ones.
X-Mem retrieval runs in three stages. Stage 1 is semantic search: find facts similar to the query. Stage 2 is a sufficiency check: are those facts actually enough to answer the question? If not, the system generates targeted follow-up queries and iterates — enabling multi-hop reasoning across sessions without requiring the caller to pre-structure anything. Stage 3 is enrichment: retrieved facts are reconnected to their parent episodes and related artifacts, producing a structured context package rather than a flat list.
That last point matters. A flat list of disconnected facts puts a significant interpretation burden on the consuming model. Grouping facts by episode — preserving the conversation they came from — is what makes them interpretable. Facts without context are just assertions. Facts inside their episode are evidence.
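As a sketch, the loop is short. Every callable here (`search`, `sufficient`, `expand_queries`, `enrich`) is a hypothetical stand-in for a stage, not an X-Mem API:

```python
from typing import Callable, List

def retrieve(query: str,
             search: Callable[[str], List[Fact]],
             sufficient: Callable[[str, List[Fact]], bool],
             expand_queries: Callable[[str, List[Fact]], List[str]],
             enrich: Callable[[List[Fact]], dict],
             max_hops: int = 3) -> dict:
    facts = search(query)                     # stage 1: semantic search
    for _ in range(max_hops):
        if sufficient(query, facts):          # stage 2: sufficiency check
            break
        for follow_up in expand_queries(query, facts):
            facts.extend(search(follow_up))   # iterate: multi-hop reasoning
    return enrich(facts)                      # stage 3: reconnect facts to
                                              # episodes and artifacts
```

The caller issues one query; the hops and the episode grouping happen inside the loop.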
The Context Agent: Memory With Instructions
Retrieval gets you the right facts. The Context Agent solves a different problem: knowing which facts a particular external agent needs for a particular task — and making sure that agent doesn't hallucinate around the gaps in what it's been given.
The flow: at the start of a task, the Context Agent interviews the user to understand scope, then assembles a context space from long-term memory — not everything, just what's applicable to the task at hand.
The key output is what the X-Mem team calls a contract: a structured document that describes exactly what's in the context space and how to use it. In the words of the system's architect: "We literally tell you what is inside this context space, how to use it, what to expect. A lot of times you just directly stack the long-term memory to external agent. They don't even know what's inside, and start hallucinating. We're really letting them know: what is inside, and because we know what the external agent's task is, we also tell you: use it for XYZ when you are not sure about the user decision."
Think of it as the difference between handing someone a stack of files and handing them a briefing document alongside those files. The contract transforms memory from a raw retrieval dump into a documented interface — the agent knows what it's been given, what it's for, and where the edges are.
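Concretely, a contract might carry fields like these. The shape is an assumption based on the description above, not X-Mem's actual format:

```python
# Hypothetical contract shape; every field name here is illustrative.
contract = {
    "task": "Refactor the billing service",   # what the agent was asked to do
    "contents": {
        "facts": 12,                          # grouped by episode
        "episodes": 3,
        "artifacts": ["billing_plan_v2"],     # hypothetical artifact id
    },
    "usage": [
        "Prefer these decisions over your own defaults when they conflict.",
        "Consult the DECISION facts when unsure about a user choice.",
    ],
    "edges": [
        "No facts cover deployment; ask the user rather than guessing.",
    ],
}
```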
This is also what makes X-Mem agent-agnostic: the contract describes the context in terms of what the agent can expect and how to use it, so the memory system doesn't need to know the specifics of the consuming agent. It just needs to know the task. The Context Agent ships as an MCP server, which means integration is typically configuration, not code.
Encrypted by Default
The kinds of facts X-Mem stores — decisions, preferences, project rationale, biographical context — are sensitive by nature. A memory system that knows how you work, what you've chosen, and why, accumulates something more sensitive than most enterprise data.
X-Mem is built on encrypted infrastructure. Content is AES-256 encrypted client-side before it leaves the machine. The server computes similarity scores on encrypted vectors and never decrypts the content. Each organization gets cryptographic isolation via separate keys — not just access controls — so the isolation holds even if the underlying infrastructure is compromised.
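As a rough illustration of the client-side step, here is AES-256-GCM encryption with the Python `cryptography` package. Key management and everything around it is simplified; this shows only where encryption happens, not X-Mem's implementation:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_for_upload(plaintext: str, key: bytes) -> bytes:
    """Encrypt content on the client; the server only ever sees this
    ciphertext (plus vectors it can score without decrypting)."""
    aesgcm = AESGCM(key)        # 32-byte key = AES-256
    nonce = os.urandom(12)      # unique per message, prepended for decryption
    return nonce + aesgcm.encrypt(nonce, plaintext.encode(), None)

org_key = AESGCM.generate_key(bit_length=256)  # one key per organization
blob = encrypt_for_upload("User prefers tabs over spaces.", org_key)
```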
There's an explicit tradeoff: computing similarity over encrypted vectors costs some retrieval precision relative to plaintext search. That's an engineering decision, not an oversight.
What's Next
X-Mem's API is in private development. When it launches, the documentation will cover endpoints, SDK methods, and integration patterns — the things this post has deliberately set aside.
What this post has tried to give you is the mental model that makes those API docs useful from the first read: the data hierarchy, the belief lifecycle, the retrieval logic, the contract pattern. If you understand how X-Mem thinks about memory, you'll be ready to build with it when the API is available.
The architecture described here exists. The benchmark results exist. The implementation is not vaporware. If you're building agent infrastructure and want early access, reach out — the team is talking to developers now.
Frequently Asked Questions
How does X-Mem handle conflicting information in AI agent memory?
X-Mem uses the AGM belief revision framework to resolve conflicts through three operations: expansion (new fact, no conflict), revision (new fact contradicts existing belief, resolved by an LLM-based resolver), and contraction (explicit denial with no replacement). Old beliefs aren't deleted. They're marked REVISED or RETRACTED, with full provenance preserved so every change is auditable. User-stated facts outrank LLM-inferred facts, making the system harder to accidentally mislead than to deliberately update.
What is the difference between vector search and AI memory management?
Vector search is a retrieval problem: embed a query, find the nearest chunks, return them. AI memory management is a state management problem: track what's known, whether it's still valid, and how to reconcile beliefs when they conflict. A vector store will happily return a fact from three sessions ago that contradicts yesterday's update. A memory system like X-Mem tracks belief lifecycles (ACTIVE, REVISED, RETRACTED), preserves provenance, and resolves contradictions before they reach the agent. Search finds similar text. Memory maintains truth over time.
What is a context contract in an AI memory system?
A context contract is a structured document that tells an external agent exactly what's in its context space and how to use it. Instead of dumping retrieved memory into an agent's prompt and hoping it figures out the gaps, the contract specifies what facts are included, what task they're for, and where their edges are. This prevents the common failure mode where agents hallucinate around missing information because they don't know what they don't have. In X-Mem, the Context Agent generates contracts automatically and ships as an MCP server, making the memory system agent-agnostic.
Part 1 of this series: Your AI Agent Doesn't Have Memory. It Has a List.
Part 2 of this series: Every Tool Solving the AI Memory Problem Is Solving a Different Problem
