Rewrite README for clarity and impact
- Lead with one-sentence hook + output structure upfront - Add What You Get section naming concrete deliverables - Consolidate agent compatibility into schema file table - Add tech stack one-liner - Streamline use cases, quick start, and graph sections
This commit is contained in:
370
README.md
370
README.md
@@ -2,125 +2,199 @@
|
|||||||
|
|
||||||
[](LICENSE)
|
[](LICENSE)
|
||||||
|
|
||||||
**A personal knowledge base that builds and maintains itself.** Drop in source documents — articles, papers, notes — and the LLM reads them, extracts the knowledge, and integrates everything into a persistent, interlinked wiki. You never write the wiki. Claude does.
|
**A coding agent skill.** Drop source documents into `raw/` and type `/wiki-ingest` — the agent reads them, extracts knowledge, and builds a persistent interlinked wiki. Every new source makes the wiki richer. You never write it.
|
||||||
|
|
||||||
Unlike RAG systems that re-derive knowledge from scratch on every query, LLM Wiki Agent compiles knowledge once and keeps it current. Cross-references are pre-built. Contradictions are flagged at ingest time. Every new source makes the wiki richer.
|
> Most knowledge tools make you search your own notes. This one reads everything you've collected and writes a structured wiki that compounds over time — cross-references already built, contradictions already flagged, synthesis already done.
|
||||||
|
|
||||||
## How It Works
|
|
||||||
|
|
||||||
```
|
```
|
||||||
You drop a source → Claude reads it → wiki pages are created/updated → graph is rebuilt
|
/wiki-ingest raw/papers/attention-is-all-you-need.md
|
||||||
|
|
||||||
You ask a question → Claude reads relevant wiki pages → synthesizes answer with citations
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Three layers:
|
```
|
||||||
|
wiki/
|
||||||
|
├── index.md catalog of all pages — updated on every ingest
|
||||||
|
├── log.md append-only record of every operation
|
||||||
|
├── overview.md living synthesis across all sources
|
||||||
|
├── sources/ one summary page per source document
|
||||||
|
├── entities/ people, companies, projects — auto-created
|
||||||
|
├── concepts/ ideas, frameworks, methods — auto-created
|
||||||
|
└── syntheses/ query answers filed back as wiki pages
|
||||||
|
graph/
|
||||||
|
├── graph.json persistent node/edge data (SHA256-cached)
|
||||||
|
└── graph.html interactive vis.js visualization — open in any browser
|
||||||
|
```
|
||||||
|
|
||||||
- **`raw/`** — your source documents (immutable, you own this)
|
## Install
|
||||||
- **`wiki/`** — Claude-maintained markdown pages (Claude writes, you read)
|
|
||||||
- **`graph/`** — auto-generated knowledge graph visualization
|
|
||||||
|
|
||||||
## Quick Start
|
**Requires:** [Claude Code](https://claude.ai/code), [Codex](https://openai.com/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), or any agent that reads a config file.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://github.com/SamurAIGPT/llm-wiki-agent.git
|
git clone https://github.com/SamurAIGPT/llm-wiki-agent.git
|
||||||
cd llm-wiki-agent
|
cd llm-wiki-agent
|
||||||
```
|
```
|
||||||
|
|
||||||
Open it in your coding agent — **no API key or Python setup needed**:
|
Open in your agent — no API key or Python setup needed:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
claude # Claude Code
|
claude # reads CLAUDE.md + .claude/commands/
|
||||||
codex # OpenAI Codex
|
codex # reads AGENTS.md
|
||||||
opencode # OpenCode / Pear AI
|
opencode # reads AGENTS.md
|
||||||
gemini # Gemini CLI
|
gemini # reads GEMINI.md
|
||||||
```
|
```
|
||||||
|
|
||||||
Each agent reads its config file automatically (`CLAUDE.md`, `AGENTS.md`, or `GEMINI.md`) and follows the same workflows. Then just talk to it:
|
## Usage
|
||||||
|
|
||||||
```
|
```
|
||||||
# Claude Code — slash commands:
|
/wiki-ingest raw/papers/my-paper.md # ingest a source into the wiki
|
||||||
/wiki-ingest raw/articles/my-article.md
|
/wiki-ingest raw/articles/my-article.md # works on any markdown file
|
||||||
/wiki-query what are the main themes across all sources?
|
|
||||||
|
/wiki-query "what are the main themes?" # synthesize answer from wiki pages
|
||||||
|
/wiki-query "how does X relate to Y?" # with [[wikilink]] citations
|
||||||
|
|
||||||
|
/wiki-lint # find orphans, contradictions, gaps
|
||||||
|
/wiki-graph # build graph.html from all wikilinks
|
||||||
|
```
|
||||||
|
|
||||||
|
Plain English also works with any agent:
|
||||||
|
```
|
||||||
|
"Ingest this paper: raw/papers/llama2.md"
|
||||||
|
"What does the wiki say about attention mechanisms?"
|
||||||
|
"Check for contradictions across sources"
|
||||||
|
"Build the knowledge graph and tell me the most connected nodes"
|
||||||
|
```
|
||||||
|
|
||||||
|
Works with any markdown source — articles, papers, book chapters, meeting notes, journal entries, research summaries.
|
||||||
|
|
||||||
|
## What You Get
|
||||||
|
|
||||||
|
**Persistent wiki** — structured markdown pages that accumulate across sessions. Unlike chat, nothing is lost.
|
||||||
|
|
||||||
|
**Entity pages** — auto-created for every person, company, or project mentioned across sources. Updated each time a new source references them.
|
||||||
|
|
||||||
|
**Concept pages** — auto-created for every key idea or framework. Cross-referenced to every source that discusses them.
|
||||||
|
|
||||||
|
**Living overview** — `wiki/overview.md` is revised on every ingest to reflect the current synthesis across everything you've read.
|
||||||
|
|
||||||
|
**Contradiction flags** — when a new source contradicts an existing claim, it's flagged at ingest time, not buried until query time.
|
||||||
|
|
||||||
|
**Knowledge graph** — `graph.html` shows every wiki page as a node, every `[[wikilink]]` as an edge, and Claude-inferred implicit relationships as dotted edges. Community detection clusters related topics.
|
||||||
|
|
||||||
|
**Lint reports** — orphan pages, broken links, missing entity pages, data gaps with suggested sources to fill them.
|
||||||
|
|
||||||
|
## Use Cases
|
||||||
|
|
||||||
|
### Research
|
||||||
|
|
||||||
|
Going deep on a topic over weeks — reading papers, articles, reports.
|
||||||
|
|
||||||
|
```
|
||||||
|
/wiki-ingest raw/papers/attention-is-all-you-need.md
|
||||||
|
/wiki-ingest raw/papers/llama2.md
|
||||||
|
/wiki-ingest raw/papers/rag-survey.md
|
||||||
|
|
||||||
|
# Wiki builds entity pages (Meta AI, Google Brain) and
|
||||||
|
# concept pages (Attention, RLHF, Context Window) automatically.
|
||||||
|
|
||||||
|
/wiki-query "What are the main approaches to reducing hallucination?"
|
||||||
|
/wiki-query "How has context window size evolved across models?"
|
||||||
|
|
||||||
/wiki-lint
|
/wiki-lint
|
||||||
/wiki-graph
|
# → "No sources on mixture-of-experts — consider the Mixtral paper"
|
||||||
|
|
||||||
# Any agent — plain English works too:
|
|
||||||
"Ingest this paper: raw/papers/my-paper.md"
|
|
||||||
"What does the wiki say about X?"
|
|
||||||
"Check for contradictions"
|
|
||||||
"Build the knowledge graph"
|
|
||||||
```
|
```
|
||||||
|
|
||||||
| Agent | Config file |
|
By the end you have a structured, interlinked reference — not a folder of PDFs you'll never reopen.
|
||||||
|---|---|
|
|
||||||
| [Claude Code](https://claude.ai/code) | `CLAUDE.md` + `.claude/commands/` |
|
|
||||||
| [OpenAI Codex](https://openai.com/codex) | `AGENTS.md` |
|
|
||||||
| OpenCode / Pear AI | `AGENTS.md` |
|
|
||||||
| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `GEMINI.md` |
|
|
||||||
|
|
||||||
> **Standalone use** (without a coding agent): `pip install -r requirements.txt`, set `ANTHROPIC_API_KEY`, then use `python tools/ingest.py`, `python tools/query.py`, etc.
|
---
|
||||||
|
|
||||||
## Architecture
|
### Reading a Book
|
||||||
|
|
||||||
|
File each chapter as you go. Build out pages for characters, themes, arguments.
|
||||||
|
|
||||||
```
|
```
|
||||||
raw/ ← your sources (never modified by LLM)
|
/wiki-ingest raw/book/chapter-01.md
|
||||||
wiki/
|
/wiki-ingest raw/book/chapter-02.md
|
||||||
index.md ← catalog of all pages (updated on every ingest)
|
|
||||||
log.md ← append-only operation log
|
# Wiki creates entity and theme pages automatically.
|
||||||
overview.md ← living synthesis across all sources
|
|
||||||
sources/ ← one page per source document
|
/wiki-query "How has the protagonist's motivation evolved?"
|
||||||
entities/ ← people, companies, projects
|
/wiki-query "What contradictions exist in the author's argument so far?"
|
||||||
concepts/ ← ideas, frameworks, methods
|
|
||||||
syntheses/ ← answers to queries, filed back as pages
|
/wiki-graph # → graph.html shows every character/theme and how they connect
|
||||||
graph/
|
|
||||||
graph.json ← node/edge data (SHA256-cached)
|
|
||||||
graph.html ← interactive vis.js visualization
|
|
||||||
tools/
|
|
||||||
ingest.py ← process a new source
|
|
||||||
query.py ← ask a question
|
|
||||||
lint.py ← health-check the wiki
|
|
||||||
build_graph.py ← rebuild the knowledge graph
|
|
||||||
CLAUDE.md ← schema and workflow instructions for the LLM
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Commands
|
Think fan wikis like Tolkien Gateway — built as you read, with the agent doing all the cross-referencing.
|
||||||
|
|
||||||
### Claude Code (primary — no API key)
|
---
|
||||||
|
|
||||||
| Slash command | What it does |
|
### Personal Knowledge Base
|
||||||
|---|---|
|
|
||||||
| `/wiki-ingest <file>` | Read a source, update wiki pages, append to log |
|
|
||||||
| `/wiki-query <question>` | Search wiki, synthesize answer with citations |
|
|
||||||
| `/wiki-lint` | Check for orphans, broken links, contradictions, gaps |
|
|
||||||
| `/wiki-graph` | Build knowledge graph (`graph.json` + `graph.html`) |
|
|
||||||
|
|
||||||
Or describe what you want in plain English — Claude Code follows `CLAUDE.md` and does the right thing.
|
Track goals, health, habits, self-improvement — file journal entries, articles, podcast notes.
|
||||||
|
|
||||||
### Standalone Python (optional — requires `ANTHROPIC_API_KEY`)
|
```
|
||||||
|
/wiki-ingest raw/journal/2026-01-week1.md
|
||||||
|
/wiki-ingest raw/articles/huberman-sleep-protocol.md
|
||||||
|
/wiki-ingest raw/articles/atomic-habits-summary.md
|
||||||
|
|
||||||
| Command | What it does |
|
/wiki-query "What patterns show up in my journal entries about energy?"
|
||||||
|---|---|
|
/wiki-query "What habits have I tried and what was the outcome?"
|
||||||
| `python tools/ingest.py <file>` | Ingest a source |
|
```
|
||||||
| `python tools/query.py "<question>"` | Query the wiki |
|
|
||||||
| `python tools/query.py "<question>" --save` | Query and file answer back |
|
The wiki builds a structured picture over time. Concepts like "Sleep", "Exercise", "Deep Work" accumulate evidence from every source filed.
|
||||||
| `python tools/lint.py` | Lint the wiki |
|
|
||||||
| `python tools/build_graph.py` | Build graph |
|
---
|
||||||
| `python tools/build_graph.py --no-infer` | Build graph (skip inference, faster) |
|
|
||||||
| `python tools/build_graph.py --open` | Build and open in browser |
|
### Business / Team Intelligence
|
||||||
|
|
||||||
|
Feed in meeting transcripts, project docs, customer calls.
|
||||||
|
|
||||||
|
```
|
||||||
|
/wiki-ingest raw/meetings/q1-planning-transcript.md
|
||||||
|
/wiki-ingest raw/docs/product-roadmap-2026.md
|
||||||
|
/wiki-ingest raw/calls/customer-interview-acme.md
|
||||||
|
|
||||||
|
/wiki-query "What feature requests have come up most across customer calls?"
|
||||||
|
/wiki-query "What decisions were made in Q1 and what was the rationale?"
|
||||||
|
|
||||||
|
/wiki-lint
|
||||||
|
# → "Project X mentioned in 5 pages but no dedicated page"
|
||||||
|
# → "Roadmap contradicts customer interview on priority of feature Y"
|
||||||
|
```
|
||||||
|
|
||||||
|
The wiki stays current because the agent does the maintenance no one wants to do.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Competitive Analysis
|
||||||
|
|
||||||
|
Track a company, market, or technology over time.
|
||||||
|
|
||||||
|
```
|
||||||
|
/wiki-ingest raw/competitors/openai-announcements.md
|
||||||
|
/wiki-ingest raw/market/ai-funding-report-q1.md
|
||||||
|
|
||||||
|
/wiki-query "How do OpenAI and Anthropic differ on safety approach?"
|
||||||
|
/wiki-query "Which companies announced multimodal models in the last 6 months?"
|
||||||
|
/wiki-query "Competitive landscape summary as of today" --save
|
||||||
|
```
|
||||||
|
|
||||||
## The Graph
|
## The Graph
|
||||||
|
|
||||||
`build_graph.py` runs two passes:
|
Two-pass build:
|
||||||
|
|
||||||
1. **Deterministic** — parse all `[[wikilinks]]` in every page → explicit edges tagged `EXTRACTED`
|
1. **Deterministic** — parses all `[[wikilinks]]` across wiki pages → edges tagged `EXTRACTED`
|
||||||
2. **Semantic** — Claude infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence) or `AMBIGUOUS`
|
2. **Semantic** — agent infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence score) or `AMBIGUOUS`
|
||||||
|
|
||||||
Community detection (Louvain) clusters nodes by topic. The output is a self-contained `graph.html` — open it in any browser. SHA256 caching means only changed pages are reprocessed.
|
Louvain community detection clusters nodes by topic. SHA256 cache means only changed pages are reprocessed. Output is a self-contained `graph.html` — no server, opens in any browser.
|
||||||
|
|
||||||
## CLAUDE.md
|
## CLAUDE.md / AGENTS.md
|
||||||
|
|
||||||
`CLAUDE.md` is the schema document — it tells the LLM how to maintain the wiki. It defines page formats, ingest/query/lint workflows, naming conventions, and log format. This is the key configuration file. Edit it to customize behavior for your domain.
|
The schema file tells the agent how to maintain the wiki — page formats, ingest/query/lint/graph workflows, naming conventions. This is the key config file. Edit it to customize behavior for your domain.
|
||||||
|
|
||||||
|
| Agent | Schema file |
|
||||||
|
|---|---|
|
||||||
|
| Claude Code | `CLAUDE.md` |
|
||||||
|
| Codex / OpenCode | `AGENTS.md` |
|
||||||
|
| Gemini CLI | `GEMINI.md` |
|
||||||
|
|
||||||
## What Makes This Different from RAG
|
## What Makes This Different from RAG
|
||||||
|
|
||||||
@@ -132,141 +206,23 @@ Community detection (Louvain) clusters nodes by topic. The output is a self-cont
|
|||||||
| Contradictions surface at query time (maybe) | Flagged at ingest time |
|
| Contradictions surface at query time (maybe) | Flagged at ingest time |
|
||||||
| No accumulation | Every source makes the wiki richer |
|
| No accumulation | Every source makes the wiki richer |
|
||||||
|
|
||||||
## Use Cases
|
|
||||||
|
|
||||||
### Research
|
|
||||||
|
|
||||||
Going deep on a topic over weeks or months — reading papers, articles, reports.
|
|
||||||
|
|
||||||
```
|
|
||||||
# Each paper you read gets ingested:
|
|
||||||
/wiki-ingest raw/papers/attention-is-all-you-need.md
|
|
||||||
/wiki-ingest raw/papers/llama2.md
|
|
||||||
/wiki-ingest raw/papers/rag-survey.md
|
|
||||||
|
|
||||||
# Wiki builds up entity pages (e.g. "Meta AI", "Google Brain") and
|
|
||||||
# concept pages (e.g. "Attention Mechanism", "RLHF") automatically.
|
|
||||||
|
|
||||||
# Ask synthesis questions across everything you've read:
|
|
||||||
/wiki-query "What are the main approaches to reducing hallucination?"
|
|
||||||
/wiki-query "How has context window size evolved across models?"
|
|
||||||
|
|
||||||
# Check where your knowledge has gaps:
|
|
||||||
/wiki-lint
|
|
||||||
# → "No sources on mixture-of-experts — consider reading the Mixtral paper"
|
|
||||||
```
|
|
||||||
|
|
||||||
By the end of a research project you have a structured, interlinked reference that reflects everything you've read — not a folder of PDFs you'll never reopen.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Reading a Book
|
|
||||||
|
|
||||||
File each chapter as you go. Build out pages for characters, themes, plot threads.
|
|
||||||
|
|
||||||
```
|
|
||||||
# After each chapter:
|
|
||||||
/wiki-ingest raw/book/chapter-01-the-beginning.md
|
|
||||||
/wiki-ingest raw/book/chapter-02-the-conflict.md
|
|
||||||
|
|
||||||
# Wiki creates pages like:
|
|
||||||
# entities/ElonMusk.md, entities/Tesla.md
|
|
||||||
# concepts/FirstPrinciplesThinking.md
|
|
||||||
|
|
||||||
# Mid-book:
|
|
||||||
/wiki-query "How has the protagonist's motivation evolved?"
|
|
||||||
/wiki-query "What contradictions exist in the author's argument so far?"
|
|
||||||
|
|
||||||
# End of book — build the graph:
|
|
||||||
/wiki-graph
|
|
||||||
# Open graph.html → see every character/theme/event and how they connect
|
|
||||||
```
|
|
||||||
|
|
||||||
Think fan wikis like the Tolkien Gateway — thousands of interlinked pages. You can build something like that as you read, with the agent doing all the cross-referencing.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Personal Knowledge Base
|
|
||||||
|
|
||||||
Track goals, health, psychology, self-improvement — file journal entries, articles, podcast notes.
|
|
||||||
|
|
||||||
```
|
|
||||||
# File your journal entries:
|
|
||||||
/wiki-ingest raw/journal/2026-01-week1.md
|
|
||||||
/wiki-ingest raw/journal/2026-01-week2.md
|
|
||||||
|
|
||||||
# File articles and podcast notes that resonated:
|
|
||||||
/wiki-ingest raw/articles/huberman-sleep-protocol.md
|
|
||||||
/wiki-ingest raw/articles/atomic-habits-summary.md
|
|
||||||
|
|
||||||
# Ask introspective questions:
|
|
||||||
/wiki-query "What patterns show up in my journal entries about energy levels?"
|
|
||||||
/wiki-query "What habits have I tried and what was the outcome?"
|
|
||||||
|
|
||||||
# The wiki builds a structured picture of you over time —
|
|
||||||
# entities like "Sleep", "Exercise", "Deep Work" accumulate evidence
|
|
||||||
# from every source you've filed.
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Business / Team Intelligence
|
|
||||||
|
|
||||||
Feed in meeting transcripts, Slack exports, project docs, customer calls.
|
|
||||||
|
|
||||||
```
|
|
||||||
# Onboard new context:
|
|
||||||
/wiki-ingest raw/meetings/q1-planning-transcript.md
|
|
||||||
/wiki-ingest raw/docs/product-roadmap-2026.md
|
|
||||||
/wiki-ingest raw/calls/customer-interview-acme.md
|
|
||||||
|
|
||||||
# Wiki creates pages for projects, people, decisions, recurring themes.
|
|
||||||
|
|
||||||
# Ask strategic questions:
|
|
||||||
/wiki-query "What feature requests have come up most across customer calls?"
|
|
||||||
/wiki-query "What decisions were made in Q1 planning and what was the rationale?"
|
|
||||||
|
|
||||||
# Lint catches things like:
|
|
||||||
# → "Project X mentioned in 5 pages but no dedicated page"
|
|
||||||
# → "Roadmap contradicts customer interview on priority of feature Y"
|
|
||||||
```
|
|
||||||
|
|
||||||
The wiki stays current because the agent does the maintenance no one on the team wants to do.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Competitive Analysis / Due Diligence
|
|
||||||
|
|
||||||
Track a company, market, or technology area over time.
|
|
||||||
|
|
||||||
```
|
|
||||||
# Feed in everything you find:
|
|
||||||
/wiki-ingest raw/competitors/openai-announcements.md
|
|
||||||
/wiki-ingest raw/competitors/anthropic-blog-posts.md
|
|
||||||
/wiki-ingest raw/market/ai-funding-report-q1.md
|
|
||||||
|
|
||||||
# Wiki builds entity pages per company, concept pages per technology.
|
|
||||||
|
|
||||||
# Ask comparison questions:
|
|
||||||
/wiki-query "How do OpenAI and Anthropic differ in their approach to safety?"
|
|
||||||
/wiki-query "Which companies have announced multimodal models in the last 6 months?"
|
|
||||||
|
|
||||||
# Save the answer back as a reusable synthesis:
|
|
||||||
/wiki-query "Competitive landscape summary as of today" --save
|
|
||||||
```
|
|
||||||
|
|
||||||
## Tips
|
## Tips
|
||||||
|
|
||||||
- Use [Obsidian](https://obsidian.md) to read/browse the wiki — follow links, check graph view
|
- Use [Obsidian](https://obsidian.md) to browse the wiki — follow links, check graph view, use Dataview for frontmatter queries
|
||||||
- Use [Obsidian Web Clipper](https://obsidian.md/clipper) to clip web articles directly to `raw/`
|
- Use [Obsidian Web Clipper](https://obsidian.md/clipper) to clip web articles directly to `raw/`
|
||||||
- The wiki is a git repo — you get version history for free
|
|
||||||
- File good query answers back with `--save` — your explorations compound just like ingested sources
|
- File good query answers back with `--save` — your explorations compound just like ingested sources
|
||||||
|
- The wiki is a git repo — version history for free
|
||||||
|
- Standalone Python scripts in `tools/` work without a coding agent (require `ANTHROPIC_API_KEY`)
|
||||||
|
|
||||||
## License
|
## Tech Stack
|
||||||
|
|
||||||
MIT License — see [LICENSE](LICENSE) for details.
|
NetworkX + Louvain + Claude + vis.js. No server, no database, runs entirely locally. Everything is plain markdown files.
|
||||||
|
|
||||||
## Related
|
## Related
|
||||||
|
|
||||||
- [graphify](https://github.com/safishamsi/graphify) — graph-based knowledge extraction skill (inspiration for the graph layer)
|
- [graphify](https://github.com/safishamsi/graphify) — graph-based knowledge extraction skill (inspiration for the graph layer)
|
||||||
- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this is related to in spirit
|
- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this resembles
|
||||||
|
|
||||||
|
## License
|
||||||
|
|
||||||
|
MIT License — see [LICENSE](LICENSE) for details.
|
||||||
|
|||||||
Reference in New Issue
Block a user