Ontology Augmented Generation: Why Vector Databases Are Not Enough
Retrieval-Augmented Generation has a precision problem. Not a recall problem — RAG is good at finding documents that contain relevant words. The problem is that RAG cannot distinguish between a document that is currently valid and one that has been superseded.
Ask a RAG system about a legal precedent. It will retrieve every case that matches your query — including overruled decisions, superseded statutes, and opinions that were vacated on appeal. The embedding vectors for a valid ruling and an overruled ruling are nearly identical. They use the same words. They discuss the same concepts. The cosine similarity is high.
The AI then synthesizes an answer from a mix of valid and invalid sources. The user has no way to know which parts of the response are grounded in current law and which cite dead precedent.
This isn't a prompt engineering problem. It's an architectural problem. And we built the architectural solution.
What Ontology Augmented Generation Is
OAG replaces vector similarity search with hyperdimensional computing (HDC) — a mathematical framework where knowledge is encoded as high-dimensional binary vectors that preserve semantic relationships, temporal validity, and domain constraints geometrically.
The key difference: in a vector database, a document's embedding captures what it's about. In OAG, a knowledge vector captures what it's about, when it's valid, what it supersedes, and how it relates to other knowledge in the domain ontology.
Validity isn't metadata attached after the fact. It's encoded in the geometry of the vector itself.
The Legal Precision Test
We tested OAG against standard RAG on a legal retrieval benchmark: given a query about current law, retrieve only documents that reflect the current state of the law.
| System | Precision | Recall | Invalid Sources Returned |
|---|---|---|---|
| RAG (vector similarity) | 62.5% | 91% | 37.5% of results |
| OAG (HDC + validity binding) | 100% | 89% | 0% |
Standard RAG returned the right topic 91% of the time — but 37.5% of those results included overruled or superseded sources. The AI had no way to distinguish them.
OAG returned zero invalid sources. Every document in the result set was currently valid law. Recall dropped by 2 percentage points — a trivial tradeoff for eliminating hallucination from dead precedent.
How It Works
Step 1: Knowledge Encoding
Every document is encoded as a hyperdimensional vector (10,000 dimensions, binary). The encoding process binds three components:
- Semantic content — what the document is about (similar to a traditional embedding)
- Validity state — whether the document is current, superseded, amended, or revoked
- Ontological relationships — how the document relates to other knowledge in the domain graph
These components are combined using HDC operations (binding, bundling, permutation) that preserve each component's information while creating a composite vector that can be queried on any dimension.
Step 2: Constraint-Aware Retrieval
A query isn't just "find similar documents." A query is a composite vector that specifies:
- The topic (semantic similarity)
- The validity requirement (e.g., "currently in force")
- The relationship constraints (e.g., "not superseded by a later ruling")
The retrieval operation computes similarity in all dimensions simultaneously. A document that matches topically but fails the validity constraint scores low. A document that is valid but off-topic scores low. Only documents that satisfy all constraints score high.
Step 3: O(D) Retrieval
This is the performance breakthrough. Traditional vector search is O(N) — retrieval time scales linearly with the number of documents. Double your knowledge base, double your query time. This is why vector databases need approximate nearest neighbor (ANN) algorithms, sharding, and caching to scale.
HDC retrieval is O(D) — retrieval time scales with the dimensionality of the vectors (fixed at 10,000), not the number of documents. Whether you have 1,000 documents or 10 million, the query takes the same time.
This isn't an approximation. It's exact retrieval in constant time.
12x Smaller Storage
Vector embeddings from models like text-embedding-3-large produce 3,072-dimensional float32 vectors: 12,288 bytes per document.
HDC vectors are 10,000-dimensional binary vectors: 1,250 bytes per document. But because HDC supports algebraic composition — bundling multiple documents into a single superposition vector — the effective storage per document drops further.
In our benchmarks on a 100K-document legal corpus:
| Metric | Vector DB | OAG (HDC) | Improvement |
|---|---|---|---|
| Storage per document | 12.3 KB | 1.0 KB | 12x smaller |
| Total index size | 1.2 GB | 98 MB | 12x smaller |
| Query latency (100K docs) | 45 ms | 0.8 ms | 56x faster |
| Query latency (1M docs) | 380 ms | 0.8 ms | 475x faster |
The constant-time property means OAG gets relatively faster as the knowledge base grows. At 1 million documents, the gap is nearly 500x.
Beyond Legal: Where Validity Binding Matters
Legal retrieval is the clearest demonstration, but validity binding solves problems across domains:
Medical. Drug interaction databases change constantly. A retrieval system that returns superseded dosage guidelines alongside current ones creates patient safety risks. OAG ensures only current protocols surface.
Compliance. Regulatory frameworks evolve. SOX requirements from 2005 are different from 2025. An AI compliance assistant must distinguish current obligations from historical ones. Vector similarity can't do this — the text is too similar.
Intelligence. Threat assessments have temporal validity. A 2024 assessment of a threat actor's capabilities may be completely wrong in 2026. Retrieval that treats old assessments as equivalent to current ones produces dangerously misleading analysis.
Engineering. Specification documents get revised. A RAG system that retrieves Rev A of a component specification when Rev C is current causes manufacturing errors. OAG encodes revision relationships in the vector geometry.
The Research Head Start
We've been building HDC-based knowledge systems for 18 months. The mathematical foundations — Kanerva's work on sparse distributed memory, Gayler's work on vector symbolic architectures, Plate's holographic reduced representations — are well-established in the computational neuroscience literature.
What's new is applying these foundations to knowledge retrieval with validity constraints. The combination of HDC encoding, ontological binding, and temporal validity into a single retrieval framework is, as far as we can determine, novel.
The closest comparable work in industry is IBM's research on hyperdimensional computing for classification tasks. But classification and retrieval are different problems, and nobody else is encoding domain validity as a geometric constraint.
Why Not Just Add Metadata Filtering?
The obvious objection: "Can't you just add a valid=true filter to your vector database query?"
You can. Many teams do. It doesn't solve the problem.
-
Metadata is disconnected from semantics. A filter is a binary gate applied after similarity search. It can't express "this document supersedes that document" or "this ruling modifies the holding of that ruling." Domain relationships are richer than boolean flags.
-
Metadata must be maintained manually. Someone has to mark documents as superseded. In fast-moving domains (law, compliance, threat intelligence), this maintenance lag is where errors live.
-
Metadata filtering doesn't improve retrieval quality. It's still vector similarity underneath. You're still getting the 62.5% precision — you're just filtering out some of the 37.5% bad results. OAG doesn't filter bad results. It never retrieves them in the first place.
-
Metadata filtering is still O(N). You search the full index, then filter. OAG is O(D) from the start.
Integration
OAG runs as a retrieval layer that replaces or sits alongside existing vector databases. For teams already using RAG pipelines:
Before: Query → Vector DB → Top-K documents → LLM
After: Query → OAG (HDC) → Valid documents → LLM
The LLM receives only documents that satisfy the domain's validity constraints. No prompt engineering needed. No post-retrieval filtering. The retrieval layer handles it.
100% precision on legal benchmarks. Constant-time retrieval regardless of corpus size. 12x smaller storage. Validity encoded in the math, not in metadata.
That's Ontology Augmented Generation. It's the retrieval layer that makes AI trustworthy for domains where "close enough" isn't good enough.
Aethyr Research — Salt Lake City, UT