Tags: ai, knowledge-graph, rag, wiki, agents

Your AI Agent Now Has a Long-Term Memory That Thinks

OIDO Team·June 1, 2026

The Problem With Knowledge Bases

Every company has knowledge scattered across documents, Slack exports, meeting transcripts, articles, and internal wikis. The standard playbook for making this useful to an AI agent is RAG: chunk the documents, store the chunks as embeddings, retrieve the most relevant chunks when the agent needs context.

RAG works. But it has a ceiling.

Chunk retrieval finds paragraphs that match your query. It cannot tell that one document contradicts another, that a concept in your product spec derives from a decision in a six-month-old meeting, or that three separate documents mention the same entity with nothing linking them. You get relevant text back, but not a picture of how your knowledge connects.

OIDO's Wiki is built around a different premise: don't just retrieve knowledge, understand its structure.

What the Wiki Does

You feed the Wiki raw content — articles, transcripts, documents, chat exports. A background worker processes each submission through an LLM extraction pass, pulling out structured pages and the relationships between them.

Raw content in
      │
      ▼
LLM extraction
      │
      ├── entity pages    (people, orgs, products)
      ├── concept pages   (ideas, patterns, theories)
      ├── source pages    (the document itself, summarized)
      └── synthesis pages (cross-cutting analysis)
      │
      ▼
Knowledge graph stored in Postgres + pgvector

Every page gets typed. Every relationship between pages gets typed. After ingest, you don't have a folder of documents — you have a graph that the agent can traverse.

The Four Page Types

The extraction step classifies every piece of knowledge into one of four types:

Entity — A discrete thing: a company, a person, a product, a system. "Acme Corp", "the payments service", "GPT-4". Entities are the nouns of your knowledge base.

Concept — An idea, pattern, or theory. "Zero-downtime deployment", "eventual consistency", "the CAP theorem". Concepts are the verbs and adjectives — the recurring patterns across your work.

Source — The document or transcript being ingested, summarized and preserved. Sources are the anchors: everything extracted from them traces back to where it came from.

Synthesis — Cross-cutting analysis that combines multiple concepts or entities. These don't come from one document; they emerge when the extraction sees patterns that span several inputs. "The tension between our reliability goals and our deployment velocity" is a synthesis, not a source.
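As a rough picture of how the four types might look as a data model, here is a minimal Go sketch. The names `PageType`, `Page`, and `ParsePageType` and all field names are hypothetical and chosen for illustration; they are not OIDO's actual schema.

```go
package main

import "fmt"

// PageType is a hypothetical enum for the four page types described above.
type PageType string

const (
	PageEntity    PageType = "entity"
	PageConcept   PageType = "concept"
	PageSource    PageType = "source"
	PageSynthesis PageType = "synthesis"
)

// Page is an illustrative shape for one node in the graph.
// Field names are assumptions, not the real Wiki schema.
type Page struct {
	ID        string
	OrgID     string
	Type      PageType
	Title     string
	Body      string
	SourceIDs []string // source pages this page traces back to
}

// ParsePageType validates a type label, for example one emitted
// by an LLM extraction pass, before a page is stored.
func ParsePageType(s string) (PageType, error) {
	switch t := PageType(s); t {
	case PageEntity, PageConcept, PageSource, PageSynthesis:
		return t, nil
	default:
		return "", fmt.Errorf("unknown page type %q", s)
	}
}

func main() {
	t, err := ParsePageType("concept")
	fmt.Println(t, err)

	_, err = ParsePageType("memo")
	fmt.Println(err)
}
```

Validating the label at the boundary keeps a malformed extraction result from introducing a fifth, untyped kind of page.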

Typed Relationships

Pages connect via four relationship types:

Relationship	Meaning
`supports`	This page provides evidence for the target
`contradicts`	This page conflicts with the target
`mentions`	This page references the target in passing
`derives_from`	This page's content was built on top of the target

These aren't decorative. When your agent asks "what do we know about X," it gets back not just X's page, but the graph around it: what supports the claims, what contradicts them, what was derived from this work, what other documents reference it.
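The "graph around it" can be pictured as grouping a page's incoming edges by relationship type. The sketch below assumes an in-memory edge list and hypothetical names (`RelType`, `Edge`, `Neighborhood`); the real Wiki stores its graph in Postgres and its query path is not described here.

```go
package main

import "fmt"

// RelType is a hypothetical enum for the four relationship types.
type RelType string

const (
	RelSupports    RelType = "supports"
	RelContradicts RelType = "contradicts"
	RelMentions    RelType = "mentions"
	RelDerivesFrom RelType = "derives_from"
)

// Edge is a directed, typed link from one page to another.
type Edge struct {
	From, To string
	Rel      RelType
}

// Neighborhood groups the pages that point at pageID by relationship type:
// what supports it, what contradicts it, what derives from it, what mentions it.
func Neighborhood(edges []Edge, pageID string) map[RelType][]string {
	out := make(map[RelType][]string)
	for _, e := range edges {
		if e.To == pageID {
			out[e.Rel] = append(out[e.Rel], e.From)
		}
	}
	return out
}

func main() {
	edges := []Edge{
		{From: "postmortem-2025", To: "blue-green-deploys", Rel: RelSupports},
		{From: "legacy-runbook", To: "blue-green-deploys", Rel: RelContradicts},
		{From: "rollout-checklist", To: "blue-green-deploys", Rel: RelDerivesFrom},
		{From: "blue-green-deploys", To: "payments-service", Rel: RelMentions},
	}
	for rel, pages := range Neighborhood(edges, "blue-green-deploys") {
		fmt.Println(rel, pages)
	}
}
```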

Auto-linking runs on every ingest: after new pages are created, the worker scans existing pages for mentions of the new titles and creates `mentions` edges automatically. Your knowledge graph connects itself as it grows.
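A minimal sketch of that auto-linking scan follows. It assumes a case-insensitive substring match on page bodies, which is a guess at the simplest workable rule; the actual matching logic may differ. `AutoLink`, `Page`, and `Edge` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Page and Edge are illustrative shapes, not the real Wiki schema.
type Page struct {
	ID, Title, Body string
}

type Edge struct {
	From, To, Rel string
}

// AutoLink returns a "mentions" edge from every existing page whose body
// contains the title of a newly created page.
// Assumption: matching is a case-insensitive substring test.
func AutoLink(existing, created []Page) []Edge {
	var edges []Edge
	for _, n := range created {
		title := strings.ToLower(strings.TrimSpace(n.Title))
		if title == "" {
			continue // an empty title would match every page
		}
		for _, p := range existing {
			if p.ID == n.ID {
				continue // never link a page to itself
			}
			if strings.Contains(strings.ToLower(p.Body), title) {
				edges = append(edges, Edge{From: p.ID, To: n.ID, Rel: "mentions"})
			}
		}
	}
	return edges
}

func main() {
	existing := []Page{
		{ID: "p1", Title: "Q3 planning", Body: "We renewed the Acme Corp contract."},
		{ID: "p2", Title: "On-call notes", Body: "Nothing relevant here."},
	}
	created := []Page{{ID: "n1", Title: "Acme Corp", Body: "A customer."}}
	for _, e := range AutoLink(existing, created) {
		fmt.Printf("%s -[%s]-> %s\n", e.From, e.Rel, e.To)
	}
}
```

A plain substring test will over-match short titles, so a production version would likely want word-boundary checks or a minimum title length.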

Hybrid Search With Reranking

Finding the right page uses three retrieval layers plus an optional reranking pass:

1. Keyword search — Full-text search against titles and page bodies. Fast, precise, handles exact terminology well.

2. Vector search — Semantic similarity via pgvector. Handles synonyms, paraphrases, and conceptually related content that doesn't share keywords.

3. Hybrid fusion — Both results are combined using Reciprocal Rank Fusion, with configurable keyword/vector weights (default: 40% keyword, 60% vector). Pages that rank well in both searches score highest.

4. Reranking — An optional final pass runs a cross-encoder model over the top candidates. A cross-encoder scores the query and each page together, so it re-orders candidates by relevance to the query rather than by embedding similarity alone. Vector similarity and actual relevance often diverge; reranking narrows that gap.

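// Hybrid search: fuse keyword and vector rankings, then return the top 10 pages.
results, err := wiki.Search(ctx, orgID, SearchParams{
    Query:         "zero downtime deployment strategy",
    Limit:         10,
    KeywordWeight: 0.4, // share given to the keyword ranking (default)
    VectorWeight:  0.6, // share given to the vector ranking (default)
})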
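The fusion step in layer 3 can be sketched as weighted Reciprocal Rank Fusion: each page earns `weight / (k + rank)` from every list it appears in, and the contributions are summed. The smoothing constant `k = 60` is the conventional choice for RRF; how the Wiki applies its weights internally is not stated, so treat the weighting below, along with the names `FuseRRF` and `Scored`, as assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the conventional RRF smoothing constant.
const rrfK = 60.0

// Scored pairs a page ID with its fused score.
type Scored struct {
	ID    string
	Score float64
}

// FuseRRF merges two ranked ID lists (best first) with weighted RRF.
// A page at 1-based rank r in a list contributes weight / (rrfK + r).
func FuseRRF(keyword, vector []string, kwWeight, vecWeight float64) []Scored {
	scores := make(map[string]float64)
	for i, id := range keyword {
		scores[id] += kwWeight / (rrfK + float64(i+1))
	}
	for i, id := range vector {
		scores[id] += vecWeight / (rrfK + float64(i+1))
	}

	out := make([]Scored, 0, len(scores))
	for id, s := range scores {
		out = append(out, Scored{ID: id, Score: s})
	}
	// Highest score first; break ties by ID so the order is deterministic.
	sort.Slice(out, func(a, b int) bool {
		if out[a].Score != out[b].Score {
			return out[a].Score > out[b].Score
		}
		return out[a].ID < out[b].ID
	})
	return out
}

func main() {
	keyword := []string{"deploy-runbook", "blue-green", "canary"}
	vector := []string{"blue-green", "rolling-update", "deploy-runbook"}
	for _, r := range FuseRRF(keyword, vector, 0.4, 0.6) {
		fmt.Printf("%-16s %.5f\n", r.ID, r.Score)
	}
}
```

In this example `blue-green` ranks first because it sits near the top of both lists, while `canary`, which appears only in the keyword list, ranks last.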

Quality: The Lint Pass

A knowledge base that grows without maintenance degrades. OIDO's Wiki includes a lint command that audits your organization's graph for structural problems:

Orphan pages — pages with no edges in or out; knowledge that isn't connected to anything
Contradictions — pages connected by `contradicts` edges that haven't been resolved
Stale content — pages that haven't been updated or referenced in a configurable time window
Empty pages — pages with no meaningful body content
Coverage gaps — entity or concept types that are under-represented given the volume of source material

The lint report gives you a summary with counts by page type and relationship type, plus a prioritized list of issues. Schedule it with cron and you get a weekly health check on your knowledge base.
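Two of these checks are simple enough to sketch: orphan detection and the per-type counts in the summary. The code below works on in-memory slices with hypothetical names (`Orphans`, `CountByType`); it illustrates the idea and is not the lint command's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// Page and Edge are illustrative shapes, not the real Wiki schema.
type Page struct {
	ID, Type string
}

type Edge struct {
	From, To, Rel string
}

// Orphans returns the sorted IDs of pages that appear in no edge, in or out.
func Orphans(pages []Page, edges []Edge) []string {
	linked := make(map[string]bool)
	for _, e := range edges {
		linked[e.From] = true
		linked[e.To] = true
	}
	var out []string
	for _, p := range pages {
		if !linked[p.ID] {
			out = append(out, p.ID)
		}
	}
	sort.Strings(out)
	return out
}

// CountByType tallies pages per page type for the report summary.
func CountByType(pages []Page) map[string]int {
	counts := make(map[string]int)
	for _, p := range pages {
		counts[p.Type]++
	}
	return counts
}

func main() {
	pages := []Page{
		{ID: "acme-corp", Type: "entity"},
		{ID: "q3-review", Type: "source"},
		{ID: "old-idea", Type: "concept"},
	}
	edges := []Edge{{From: "q3-review", To: "acme-corp", Rel: "mentions"}}

	fmt.Println("orphans:", Orphans(pages, edges))
	fmt.Println("by type:", CountByType(pages))
}
```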

How It Fits Into Agent Workflows

The Wiki isn't a standalone feature — it's the long-term memory layer that agents query when they need context beyond the current conversation.

Support agents can query the Wiki for product knowledge, known issues, and past resolutions before composing a response.

Engineering agents can pull in architecture decisions, API contracts, and design patterns specific to your codebase — not from a static prompt, but from a living knowledge graph that updates as your documentation does.

Research agents can ingest articles and papers, then traverse the graph to surface contradictions and synthesis across sources — work that would take a human analyst hours.

The ingest pipeline accepts four source types out of the box: `article`, `transcript`, `document`, and `chat_export`. Feed it whatever your team already produces. The graph builds itself.

The Technical Underpinning

The Wiki runs on Postgres with the pgvector extension — no separate vector database to manage, no additional infrastructure. Ingest jobs run through a background worker pool (up to four concurrent jobs) with automatic recovery for stuck jobs. Embeddings are stored per-page with support for both remote embedding APIs and local embedding models.
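The concurrency cap can be illustrated with a buffered channel used as a semaphore, a common Go pattern for bounding in-flight work. This is a sketch under assumptions: `RunJobs` and `maxConcurrentJobs` are hypothetical names, the job is a placeholder function, and stuck-job recovery is left out.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxConcurrentJobs mirrors the cap of four concurrent ingest jobs.
const maxConcurrentJobs = 4

// RunJobs processes every job with at most limit running at once
// and returns the peak concurrency it observed.
func RunJobs(jobs []string, limit int, process func(job string)) int32 {
	sem := make(chan struct{}, limit) // one slot per allowed in-flight job
	var wg sync.WaitGroup
	var running, peak int32

	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already in flight
		go func(j string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when the job ends

			n := atomic.AddInt32(&running, 1)
			for { // record a new peak if this job raised it
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			process(j)
			atomic.AddInt32(&running, -1)
		}(job)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	jobs := []string{"a.md", "b.md", "c.md", "d.md", "e.md", "f.md"}
	peak := RunJobs(jobs, maxConcurrentJobs, func(job string) {
		time.Sleep(10 * time.Millisecond) // stand-in for extraction and embedding
	})
	fmt.Println("peak concurrency:", peak)
}
```

Acquiring the slot before starting the goroutine means the submitting loop itself applies backpressure, so a burst of submissions cannot spawn more than four workers.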

Everything is organization-scoped. Your knowledge graph is private to your org and never crosses tenant boundaries.

Start Building Your Knowledge Graph

Every document your team creates, every transcript from a customer call, every design decision in Notion — all of it can feed the Wiki. Over time, your agents stop working from static prompts and start working from a knowledge base that understands your company, your codebase, and your decisions.

Sign up at oidostudio.com →
