April 20, 2026

Why Your Vector Database Can Read Your Data And What To Do About It

XTrace is the first production-ready encrypted vector database SDK that performs nearest-neighbor search on fully encrypted data, so the server never sees your vectors, queries, or documents. Built for healthcare, legal, enterprise R&D, and AI agents handling sensitive context.

Every vector database in production today makes the same implicit bargain: hand us your data, and we'll search it fast. The transport is encrypted, the disk is encrypted, the access controls are real. But the server reads your vectors. It reads your documents. It knows what you're searching for. That's the deal.

For many workloads, this is fine. For a growing category — healthcare systems, law firms, enterprises with genuinely sensitive IP, AI agents accumulating personal context — it isn't. And until recently, there wasn't a production-ready alternative.

XTrace is an attempt to break that bargain: an encrypted vector database SDK where the server performs nearest-neighbor search over fully encrypted data, without ever decrypting anything. This post explains how that works, why it's hard, and what it enables.

The Vector DB Landscape: Fast, Capable, and Architecturally Blind to Privacy

The current vector database market is mature and competitive. Pinecone is the easiest managed option at scale, with strong performance and SOC 2 / HIPAA attestations. Weaviate offers hybrid keyword-plus-vector search and solid multi-tenancy. Qdrant, written in Rust, has the fastest raw throughput and excellent filtering. pgvector lets Postgres shops add vector search without adding infrastructure. Milvus handles billion-scale deployments.

Each of these is a serious piece of engineering. And every single one stores your embedding vectors — and usually your document chunks — in plaintext on the server side.

"Encrypted at rest" is commonly misread as stronger than it is. It means the disk is encrypted. It does not mean the database process can't read your data — it obviously can, because it needs to read vectors to search them. If the server is compromised, or subpoenaed, or the vendor's employees have access that shouldn't extend to your data, your vectors are exposed. And embedding vectors aren't opaque: research has repeatedly shown that high-dimensional embeddings can be reverse-engineered, partially or fully, into the source text that generated them.

For a law firm storing privileged client documents, or a hospital system indexing clinical notes, or an enterprise with proprietary R&D data, this is not a theoretical risk. It's a compliance exposure and, in some cases, a legal one.

There are emerging players addressing this. HEVEC, for example, launched in early 2026 with a homomorphic-encryption-backed approach that reports searching roughly 1M vectors in ~187 ms. But the space is still nascent. Most teams building RAG pipelines, AI agents, or semantic search today accept the plaintext tradeoff as the cost of functionality.

XTrace's thesis is that the tradeoff is no longer necessary.

How XTrace Actually Works

The core idea: encrypt everything client-side before it leaves your machine, in a way that lets the server still do useful math on the ciphertexts. The server gets encrypted blobs. It returns encrypted results. It never sees your data.

Here's the concrete flow:

1. Document encryption. Before upload, your document text is AES-256 encrypted locally. The ciphertext goes to the server; the plaintext never does.

2. Vector encryption. Your embedding model produces a float vector. XTrace converts this to a binary (Hamming-distance) representation and encrypts it using Paillier homomorphic encryption before upload. The server receives a ciphertext it cannot decrypt.

3. Encrypted query. When you search, your query vector is similarly Paillier-encrypted on the client. The ciphertext is sent to the server.

4. Server-side encrypted math. Here's the non-obvious part: the server does the distance arithmetic directly on the ciphertexts, combining the encrypted query with each stored encrypted vector by homomorphic addition. The results decode to exact Hamming distances, so the rankings are correct even though the server never sees the underlying vectors. The algebraic property that makes this work is explained below.

5. Local decryption. The server returns ranked chunk IDs — still encrypted. You decrypt locally. The server learns which chunks matched; it does not learn what the query was, what the vectors contain, or what the documents say.
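Step 2 does not say how a float embedding becomes a bit vector. The sketch below assumes simple sign thresholding (one bit per dimension), which is a common choice but not necessarily what XTrace does, and shows the plaintext version of the ranking the encrypted pipeline is meant to reproduce. The function names are illustrative, not part of the SDK.

```python
from typing import Dict, List, Sequence


def binarize(vec: Sequence[float]) -> int:
    """Pack one bit per dimension into an int: 1 if the component is positive."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Count the positions where two packed bit vectors differ."""
    return bin(a ^ b).count("1")


def rank(query_bits: int, stored: Dict[str, int]) -> List[str]:
    """Return chunk IDs ordered from nearest to farthest by Hamming distance."""
    return sorted(stored, key=lambda chunk_id: hamming(query_bits, stored[chunk_id]))


# Hypothetical 4-dimensional embeddings, just to exercise the functions.
stored = {
    "chunk-a": binarize([0.3, -0.1, 0.8, -0.5]),
    "chunk-b": binarize([-0.3, 0.1, -0.8, 0.5]),
}
query = binarize([0.2, -0.4, 0.9, -0.1])
print(rank(query, stored))  # ['chunk-a', 'chunk-b']
```

In the encrypted flow, the XOR inside `hamming` is what the server can never compute directly, which is why the encoding and the homomorphic addition described later are needed.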

The private key never leaves the client. The server is, in a precise cryptographic sense, blind to the contents of your vectors, queries, and documents.

One nuance: metadata tags (used for pre-search filtering) are stored in plaintext. If metadata privacy also matters, use opaque identifiers — don't put "patient: Jane Doe" in a tag field.
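One way to produce opaque identifiers is a keyed hash computed on the client. This is a minimal sketch, assuming HMAC-SHA256 with a secret that stays on the client; `opaque_tag` and the field names are hypothetical and not part of the XTrace SDK.

```python
import hashlib
import hmac


def opaque_tag(secret_key: bytes, field: str, value: str) -> str:
    """Derive a stable, keyed identifier for a metadata value."""
    message = f"{field}={value}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return digest[:32]  # truncate to 128 bits to keep tags short


client_secret = b"replace-with-a-random-32-byte-key"  # placeholder for the demo
tag = opaque_tag(client_secret, "patient", "Jane Doe")
print(tag)  # a 32-character hex string with no readable name in it
```

Because the same input always maps to the same tag, equality filters still work. The flip side is that the server can see which records share a tag and how often each tag occurs, so this hides values, not patterns.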

The Engineering: Why Paillier, Why Hamming, and Why It's Not Slow

The mechanism is Paillier homomorphic encryption combined with a specific binary encoding scheme. Understanding why these two choices fit together is the key to understanding XTrace.

Paillier is additively homomorphic. That means: multiplying two Paillier ciphertexts corresponds to adding their underlying plaintexts. You can do arithmetic on encrypted data without decrypting it. This is the property that makes server-side distance computation possible.
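In symbols, for ciphertexts under the same public key with modulus n: Dec(E(m1) · E(m2) mod n²) = m1 + m2 mod n. The toy implementation below demonstrates that property with textbook Paillier and deliberately tiny primes. It is a teaching sketch only, not XTrace's code, and the key size is far too small to be secure.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two small primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # common simple generator
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    """c = g^m * r^n mod n^2, with a fresh random r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


pub, priv = keygen(1009, 1013)
c1 = encrypt(pub, 12)
c2 = encrypt(pub, 30)
c_sum = (c1 * c2) % (pub[0] ** 2)   # the only step a server would run
print(decrypt(pub, priv, c_sum))    # 42
```

The multiplication on the second-to-last line needs only the public modulus, which is why it can run on a machine that holds no private key.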

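Hamming distance fits the math. Hamming distance, the count of positions where two binary vectors differ, is the right similarity metric for binary embeddings. Crucially, XTrace engineered an encoding scheme where Hamming distance survives homomorphic addition. Vector bits are interleaved into an "odd positions only" padded layout, so every data bit has a zero padding bit next to it. When two encoded vectors are added, a position where the bits differ sums to 1 and leaves its odd bit set, while a position where both bits are 1 sums to 2 and carries into the padding bit instead of a neighboring data bit. After the server performs encrypted addition and the client decrypts, a popcount on odd bit positions yields the exact Hamming distance. No approximation, no loss of ranking quality.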
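The encoding trick can be checked on plain integers, with no encryption involved. This sketch assumes that "odd positions" counts from 1, so data bit i lands on bit 2i when counting from 0 and the bit above it is padding. That layout is an interpretation of the description here, not a confirmed detail of XTrace's format.

```python
def spread(bits: int, width: int) -> int:
    """Place data bit i at bit 2*i, leaving a zero padding bit above each one."""
    out = 0
    for i in range(width):
        out |= ((bits >> i) & 1) << (2 * i)
    return out


def hamming_from_sum(total: int, width: int) -> int:
    """Popcount over the data positions of a sum of two spread vectors."""
    mask = 0
    for i in range(width):
        mask |= 1 << (2 * i)
    # Each 2-bit slot holds 0, 1, or 2; its low bit is set only when the
    # two original bits differed, which is exactly the XOR.
    return bin(total & mask).count("1")


a = 0b10110010
b = 0b01110110
total = spread(a, 8) + spread(b, 8)   # what the client sees after decrypting
print(hamming_from_sum(total, 8))     # 3
```

Under Paillier, the addition on the `total` line is the step that would happen on ciphertexts. The padded integer has to stay below the Paillier modulus, which is presumably why vectors are handled in plaintext chunks.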

The production problem with textbook Paillier. Standard Paillier is cryptographically sound but computationally brutal at batch scale. Three bottlenecks hit hard:

Encryption requires a fresh large modular exponentiation per ciphertext

Randomization (required for semantic security) adds another expensive exponentiation

Decryption also requires a large-exponent modular exponentiation

Run this naively on thousands of 512-dimensional vectors and you'll be waiting.

What XTrace did instead. The engineering paper (Litchev, March 2026) describes a set of concrete optimizations that make this production-viable:

Reduced-exponent decryption. Key generation is restructured so the decryption exponent is derived from the LCM of two constrained subgroup orders — a materially smaller value. Less GPU work per decrypt.

Lookup-table encryption. Instead of computing g^m mod n² from scratch for each ciphertext, XTrace precomputes a table indexed by the base-2⁸ digits of the plaintext chunk. Online encryption becomes a product of cached table lookups. The PaillierLookupClient exposes this and is the recommended path for production.

GPU acceleration. Big-integer arithmetic runs on GPU via CGBN-backed kernels. Tables stay resident on GPU across calls. The speedup is roughly 20× over CPU for batch operations.

Precomputed randomness. Random blinding terms are precomputed and cached rather than generated fresh per ciphertext.

Native GMP objects. Ciphertexts are returned as GMP-backed Python objects, not hex strings. This eliminates repeated serialization overhead across pipeline stages.
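Two of the optimizations above, lookup-table encryption and precomputed randomness, can be sketched on the CPU in a few lines. This is an illustration of the general idea under toy parameters, not the PaillierLookupClient implementation. All function names here are hypothetical.

```python
import math
import secrets
from typing import List


def build_table(g: int, n2: int, num_digits: int) -> List[List[int]]:
    """table[i][d] = g^(d * 256^i) mod n^2 for every base-256 digit d."""
    table = []
    base = g % n2                      # g^(256^i) for the current digit index i
    for _ in range(num_digits):
        row = [1] * 256
        for d in range(1, 256):
            row[d] = (row[d - 1] * base) % n2
        table.append(row)
        base = pow(base, 256, n2)
    return table


def g_pow_lookup(table: List[List[int]], m: int, n2: int) -> int:
    """Compute g^m mod n^2 as a product of one cached entry per digit of m."""
    if m < 0 or m >> (8 * len(table)):
        raise ValueError("plaintext does not fit the table")
    acc = 1
    for row in table:
        acc = (acc * row[m & 0xFF]) % n2
        m >>= 8
    return acc


def make_blinding_pool(n: int, n2: int, size: int) -> List[int]:
    """Precompute r^n mod n^2 terms ahead of time, off the hot path."""
    pool = []
    while len(pool) < size:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            pool.append(pow(r, n, n2))
    return pool


def encrypt_fast(table, pool: List[int], m: int, n2: int) -> int:
    """Online encryption: table lookups times one blinding term, used once."""
    return (g_pow_lookup(table, m, n2) * pool.pop()) % n2


n = 1009 * 1013            # toy modulus, far too small for real use
n2 = n * n
g = n + 1
table = build_table(g, n2, num_digits=3)
pool = make_blinding_pool(n, n2, size=4)
print(g_pow_lookup(table, 123456, n2) == pow(g, 123456, n2))  # True
```

The sketch pops each blinding term so it is never reused. Two ciphertexts that share a blinding term reveal the difference of their plaintexts, so any caching scheme has to keep that guarantee.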

The SDK exposes three clients: PaillierClient (standard), PaillierLookupClient (optimized, recommended for production), and GoldwasserMicaliClient (experimental). A GPU backend is available for teams with the throughput requirements to warrant it.

The critical property: nowhere in this flow does the server handle a plaintext vector or plaintext query.

Who This Is For

Healthcare. Clinical notes, diagnostic summaries, patient histories — all are candidates for AI-powered retrieval. HIPAA requires not just access controls but genuine data protection. A traditional vector DB backed by a cloud provider means the provider can read your patients' records, even if they contractually agree not to. Encrypted vector search means the infrastructure is provably blind.

Legal. Attorney-client privilege is not just a regulatory checkbox — it's a legal obligation with teeth. M&A deal rooms, litigation strategy documents, and client communications indexed for AI-assisted research cannot pass through a server that can read them. XTrace lets a law firm run semantic search over privileged matter without that matter ever appearing in plaintext outside the firm's perimeter.

Enterprise R&D and internal knowledge. Proprietary process documentation, internal code, research notes, competitive analysis — the data companies most want to make searchable is often the data they least want on a cloud provider's servers. An encrypted vector store lets teams build powerful internal retrieval without extending trust to the infrastructure layer.

AI agents and personalized memory. This is the most forward-looking use case. AI agents that are genuinely useful accumulate context: preferences, working patterns, past reasoning, personal facts. That context is among the most sensitive data a system can hold. As agents become more capable and more deeply personalized, the question of where that memory lives — and who can read it — becomes critical. An encrypted vector store lets an agent remember, without that memory being readable by anyone but the user it belongs to.

The Foundation for What Comes Next

The reason all of this matters beyond search is that it's the cryptographic foundation for X-Mem, XTrace's forthcoming AI memory layer. X-Mem stores beliefs, facts, preferences, and reasoning artifacts from AI conversations: the long-term context that makes an agent genuinely useful across time rather than perpetually amnesiac. That kind of data — the full contents of someone's working memory with an AI system — is among the most sensitive data a platform can hold. Building the memory layer on top of an encrypted vector store means persistent, portable AI memory doesn't have to be a privacy compromise. We'll cover the memory layer architecture in a follow-up post.

For now: if you're building RAG pipelines, AI agents, or semantic search over data that can't live on a plaintext server, XTrace is worth a close look. The cryptographic overhead that once made this impractical has been engineered away.

Frequently Asked Questions

What is an encrypted vector database?

An encrypted vector database performs nearest-neighbor search directly on encrypted data, without the server ever decrypting the vectors, queries, or documents. Traditional vector databases such as Pinecone, Weaviate, or Qdrant encrypt data at rest but must read it in plaintext to search it. An encrypted vector database such as XTrace uses homomorphic encryption so the server stays cryptographically blind to the underlying data.

How is XTrace different from Pinecone, Weaviate, or Qdrant?

Pinecone, Weaviate, and Qdrant store embedding vectors and document chunks in plaintext on the server, which means the database process, vendor employees, or a compromised server can read your data. XTrace uses Paillier homomorphic encryption to run semantic search on fully encrypted vectors, so the server never sees your queries or documents. The private key never leaves the client.

When should you use an encrypted vector database?

Use an encrypted vector database when your data cannot legally or contractually live on a plaintext server. Common fits include healthcare systems indexing clinical notes under HIPAA, law firms running retrieval over privileged client documents, enterprises with proprietary R&D, and AI agents accumulating personal user context. For public or low-sensitivity workloads, a standard vector database is usually sufficient.
