Replaces dual-agent demo with a full personal knowledge base system where Claude reads source documents and incrementally builds and maintains a structured, interlinked wiki of markdown pages. - tools/ingest.py: reads a source, extracts knowledge, updates wiki pages - tools/query.py: queries the wiki with Claude, optionally files answers back - tools/lint.py: health-checks the wiki (orphans, contradictions, gaps) - tools/build_graph.py: two-pass graph builder (wikilinks + Claude inference) with Louvain community detection and vis.js interactive HTML output - CLAUDE.md: schema and workflow instructions for the LLM - wiki/: starter index, log, and overview pages - raw/, graph/: directory scaffolding Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
LLM Wiki Agent
A personal knowledge base that builds and maintains itself. Drop in source documents — articles, papers, notes — and the LLM reads them, extracts the knowledge, and integrates everything into a persistent, interlinked wiki. You never write the wiki. The LLM does.
Unlike RAG systems that re-derive knowledge from scratch on every query, LLM Wiki Agent compiles knowledge once and keeps it current. Cross-references are pre-built. Contradictions are flagged at ingest time. Every new source makes the wiki richer.
How It Works
You drop a source → LLM reads it → wiki pages are created/updated → graph is rebuilt
You ask a question → LLM reads relevant wiki pages → synthesizes answer with citations
Three layers:
raw/— your source documents (immutable, you own this)wiki/— LLM-maintained markdown pages (Claude writes, you read)graph/— auto-generated knowledge graph visualization
Quick Start
git clone https://github.com/SamurAIGPT/GPT-Agent.git
cd GPT-Agent
pip install -r requirements.txt
export ANTHROPIC_API_KEY=your_key_here
Add your first source:
# Drop a source document into raw/
cp my-article.md raw/articles/my-article.md
# Ingest it — LLM reads, extracts, and files knowledge into the wiki
python tools/ingest.py raw/articles/my-article.md
Query the wiki:
python tools/query.py "What are the main themes across all sources?"
python tools/query.py "How does X relate to Y?" --save # save answer back to wiki
Build the knowledge graph:
python tools/build_graph.py --open # opens graph.html in browser
Health-check the wiki:
python tools/lint.py --save # checks for orphans, contradictions, gaps
Architecture
raw/ ← your sources (never modified by LLM)
wiki/
index.md ← catalog of all pages (updated on every ingest)
log.md ← append-only operation log
overview.md ← living synthesis across all sources
sources/ ← one page per source document
entities/ ← people, companies, projects
concepts/ ← ideas, frameworks, methods
syntheses/ ← answers to queries, filed back as pages
graph/
graph.json ← node/edge data (SHA256-cached)
graph.html ← interactive vis.js visualization
tools/
ingest.py ← process a new source
query.py ← ask a question
lint.py ← health-check the wiki
build_graph.py ← rebuild the knowledge graph
CLAUDE.md ← schema and workflow instructions for the LLM
Tools
| Command | What it does |
|---|---|
python tools/ingest.py <file> |
Read a source, update wiki pages, append to log |
python tools/query.py "<question>" |
Search wiki, synthesize answer with citations |
python tools/query.py "<question>" --save |
Same, and file the answer back as a wiki page |
python tools/lint.py |
Check for orphans, broken links, contradictions, gaps |
python tools/build_graph.py |
Build graph.json + graph.html from wiki |
python tools/build_graph.py --no-infer |
Build graph without semantic inference (faster) |
python tools/build_graph.py --open |
Build and open in browser |
The Graph
build_graph.py runs two passes:
- Deterministic — parse all
[[wikilinks]]in every page → explicit edges taggedEXTRACTED - Semantic — Claude infers implicit relationships not captured by wikilinks → edges tagged
INFERRED(with confidence) orAMBIGUOUS
Community detection (Louvain) clusters nodes by topic. The output is a self-contained graph.html — open it in any browser. SHA256 caching means only changed pages are reprocessed.
CLAUDE.md
CLAUDE.md is the schema document — it tells the LLM how to maintain the wiki. It defines page formats, ingest/query/lint workflows, naming conventions, and log format. This is the key configuration file. Edit it to customize behavior for your domain.
What Makes This Different from RAG
| RAG | LLM Wiki Agent |
|---|---|
| Re-derives knowledge every query | Compiles once, keeps current |
| Raw chunks as retrieval unit | Structured wiki pages |
| No cross-references | Cross-references pre-built |
| Contradictions surface at query time (maybe) | Flagged at ingest time |
| No accumulation | Every source makes the wiki richer |
Use Cases
- Research — go deep on a topic over weeks; every paper/article updates the same wiki
- Reading — build a companion wiki as you read a book; by the end you have a rich reference
- Personal knowledge — file journal entries, health notes, goals; build a structured picture over time
- Business — feed in meeting transcripts, Slack threads, docs; LLM does the maintenance no one wants to do
Tips
- Use Obsidian to read/browse the wiki — follow links, check graph view
- Use Obsidian Web Clipper to clip web articles directly to
raw/ - The wiki is a git repo — you get version history for free
- File good query answers back with
--save— your explorations compound just like ingested sources
License
MIT License — see LICENSE for details.
Related
- graphify — graph-based knowledge extraction skill (inspiration for the graph layer)
- Vannevar Bush's Memex (1945) — the original vision this is related to in spirit