Rewrite README for clarity and impact

- Lead with one-sentence hook + output structure upfront
- Add What You Get section naming concrete deliverables
- Consolidate agent compatibility into schema file table
- Add tech stack one-liner
- Streamline use cases, quick start, and graph sections
2026-04-07 08:21:48 +05:30
parent e94fdbdafe
commit d8ac6107bf
1 changed files with 163 additions and 207 deletions
--- a/README.md
+++ b/README.md
@@ -2,125 +2,199 @@

 [![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

-**A personal knowledge base that builds and maintains itself.** Drop in source documents — articles, papers, notes — and the LLM reads them, extracts the knowledge, and integrates everything into a persistent, interlinked wiki. You never write the wiki. Claude does.
+**A coding agent skill.** Drop source documents into `raw/` and type `/wiki-ingest` — the agent reads them, extracts knowledge, and builds a persistent interlinked wiki. Every new source makes the wiki richer. You never write it.

-Unlike RAG systems that re-derive knowledge from scratch on every query, LLM Wiki Agent compiles knowledge once and keeps it current. Cross-references are pre-built. Contradictions are flagged at ingest time. Every new source makes the wiki richer.
-
-## How It Works
+> Most knowledge tools make you search your own notes. This one reads everything you've collected and writes a structured wiki that compounds over time — cross-references already built, contradictions already flagged, synthesis already done.

 ```
-You drop a source → Claude reads it → wiki pages are created/updated → graph is rebuilt
-
-You ask a question → Claude reads relevant wiki pages → synthesizes answer with citations
+/wiki-ingest raw/papers/attention-is-all-you-need.md
 ```

-Three layers:
+```
+wiki/
+├── index.md          catalog of all pages — updated on every ingest
+├── log.md            append-only record of every operation
+├── overview.md       living synthesis across all sources
+├── sources/          one summary page per source document
+├── entities/         people, companies, projects — auto-created
+├── concepts/         ideas, frameworks, methods — auto-created
+└── syntheses/        query answers filed back as wiki pages
+graph/
+├── graph.json        persistent node/edge data (SHA256-cached)
+└── graph.html        interactive vis.js visualization — open in any browser
+```

- **`raw/`** — your source documents (immutable, you own this)
- **`wiki/`** — Claude-maintained markdown pages (Claude writes, you read)
- **`graph/`** — auto-generated knowledge graph visualization
+## Install

-## Quick Start
+**Requires:** [Claude Code](https://claude.ai/code), [Codex](https://openai.com/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), or any agent that reads a config file.

 ```bash
 git clone https://github.com/SamurAIGPT/llm-wiki-agent.git
 cd llm-wiki-agent
 ```

-Open it in your coding agent — **no API key or Python setup needed**:
+Open the repo in your agent; no API key or Python setup is needed:

 ```bash
-claude      # Claude Code
-codex       # OpenAI Codex
-opencode    # OpenCode / Pear AI
-gemini      # Gemini CLI
+claude      # reads CLAUDE.md + .claude/commands/
+codex       # reads AGENTS.md
+opencode    # reads AGENTS.md
+gemini      # reads GEMINI.md
 ```

-Each agent reads its config file automatically (`CLAUDE.md`, `AGENTS.md`, or `GEMINI.md`) and follows the same workflows. Then just talk to it:
+## Usage

 ```
-# Claude Code — slash commands:
-/wiki-ingest raw/articles/my-article.md
-/wiki-query what are the main themes across all sources?
+/wiki-ingest raw/papers/my-paper.md          # ingest a source into the wiki
+/wiki-ingest raw/articles/my-article.md      # works on any markdown file
+
+/wiki-query "what are the main themes?"      # synthesize answer from wiki pages
+/wiki-query "how does X relate to Y?"        # with [[wikilink]] citations
+
+/wiki-lint                                   # find orphans, contradictions, gaps
+/wiki-graph                                  # build graph.html from all wikilinks
+```
+
+Plain English also works with any agent:
+```
+"Ingest this paper: raw/papers/llama2.md"
+"What does the wiki say about attention mechanisms?"
+"Check for contradictions across sources"
+"Build the knowledge graph and tell me the most connected nodes"
+```
+
+Works with any markdown source — articles, papers, book chapters, meeting notes, journal entries, research summaries.
+
+## What You Get
+
+**Persistent wiki**: structured markdown pages that accumulate across sessions. Unlike a chat session, nothing is lost when the conversation ends.
+
+**Entity pages**: auto-created for every person, company, or project mentioned across sources, and updated each time a new source references them.
+
+**Concept pages**: auto-created for every key idea or framework, and cross-referenced to every source that discusses them.
+
+**Living overview**: `wiki/overview.md` is revised on every ingest to reflect the current synthesis across everything you've read.
+
+**Contradiction flags**: when a new source contradicts an existing claim, it's flagged at ingest time, not buried until query time.
+
+**Knowledge graph**: `graph.html` shows every wiki page as a node, every `[[wikilink]]` as an edge, and agent-inferred implicit relationships as dotted edges. Community detection clusters related topics.
+
+**Lint reports**: orphan pages, broken links, missing entity pages, and data gaps, with suggested sources to fill the gaps.
+
+## Use Cases
+
+### Research
+
+Going deep on a topic over weeks — reading papers, articles, reports.
+
+```
+/wiki-ingest raw/papers/attention-is-all-you-need.md
+/wiki-ingest raw/papers/llama2.md
+/wiki-ingest raw/papers/rag-survey.md
+
+# Wiki builds entity pages (Meta AI, Google Brain) and
+# concept pages (Attention, RLHF, Context Window) automatically.
+
+/wiki-query "What are the main approaches to reducing hallucination?"
+/wiki-query "How has context window size evolved across models?"
+
 /wiki-lint
-/wiki-graph
-
-# Any agent — plain English works too:
-"Ingest this paper: raw/papers/my-paper.md"
-"What does the wiki say about X?"
-"Check for contradictions"
-"Build the knowledge graph"
+# → "No sources on mixture-of-experts — consider the Mixtral paper"
 ```

-| Agent | Config file |
-|---|---|
-| [Claude Code](https://claude.ai/code) | `CLAUDE.md` + `.claude/commands/` |
-| [OpenAI Codex](https://openai.com/codex) | `AGENTS.md` |
-| OpenCode / Pear AI | `AGENTS.md` |
-| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `GEMINI.md` |
+By the end you have a structured, interlinked reference — not a folder of PDFs you'll never reopen.

-> **Standalone use** (without a coding agent): `pip install -r requirements.txt`, set `ANTHROPIC_API_KEY`, then use `python tools/ingest.py`, `python tools/query.py`, etc.
+---

-## Architecture
+### Reading a Book
+
+File each chapter as you go. Build out pages for characters, themes, arguments.

 ```
-raw/                    ← your sources (never modified by LLM)
-wiki/
-  index.md              ← catalog of all pages (updated on every ingest)
-  log.md                ← append-only operation log
-  overview.md           ← living synthesis across all sources
-  sources/              ← one page per source document
-  entities/             ← people, companies, projects
-  concepts/             ← ideas, frameworks, methods
-  syntheses/            ← answers to queries, filed back as pages
-graph/
-  graph.json            ← node/edge data (SHA256-cached)
-  graph.html            ← interactive vis.js visualization
-tools/
-  ingest.py             ← process a new source
-  query.py              ← ask a question
-  lint.py               ← health-check the wiki
-  build_graph.py        ← rebuild the knowledge graph
-CLAUDE.md               ← schema and workflow instructions for the LLM
+/wiki-ingest raw/book/chapter-01.md
+/wiki-ingest raw/book/chapter-02.md
+
+# Wiki creates entity and theme pages automatically.
+
+/wiki-query "How has the protagonist's motivation evolved?"
+/wiki-query "What contradictions exist in the author's argument so far?"
+
+/wiki-graph   # → graph.html shows every character/theme and how they connect
 ```

-## Commands
+Think of fan wikis like Tolkien Gateway, except built as you read, with the agent doing all the cross-referencing.

-### Claude Code (primary — no API key)
+---

-| Slash command | What it does |
-|---|---|
-| `/wiki-ingest <file>` | Read a source, update wiki pages, append to log |
-| `/wiki-query <question>` | Search wiki, synthesize answer with citations |
-| `/wiki-lint` | Check for orphans, broken links, contradictions, gaps |
-| `/wiki-graph` | Build knowledge graph (`graph.json` + `graph.html`) |
+### Personal Knowledge Base

-Or describe what you want in plain English — Claude Code follows `CLAUDE.md` and does the right thing.
+Track goals, health, habits, self-improvement — file journal entries, articles, podcast notes.

-### Standalone Python (optional — requires `ANTHROPIC_API_KEY`)
+```
+/wiki-ingest raw/journal/2026-01-week1.md
+/wiki-ingest raw/articles/huberman-sleep-protocol.md
+/wiki-ingest raw/articles/atomic-habits-summary.md

-| Command | What it does |
-|---|---|
-| `python tools/ingest.py <file>` | Ingest a source |
-| `python tools/query.py "<question>"` | Query the wiki |
-| `python tools/query.py "<question>" --save` | Query and file answer back |
-| `python tools/lint.py` | Lint the wiki |
-| `python tools/build_graph.py` | Build graph |
-| `python tools/build_graph.py --no-infer` | Build graph (skip inference, faster) |
-| `python tools/build_graph.py --open` | Build and open in browser |
+/wiki-query "What patterns show up in my journal entries about energy?"
+/wiki-query "What habits have I tried and what was the outcome?"
+```
+
+The wiki builds a structured picture over time. Concepts like "Sleep", "Exercise", and "Deep Work" accumulate evidence from every source you file.
+
+---
+
+### Business / Team Intelligence
+
+Feed in meeting transcripts, project docs, customer calls.
+
+```
+/wiki-ingest raw/meetings/q1-planning-transcript.md
+/wiki-ingest raw/docs/product-roadmap-2026.md
+/wiki-ingest raw/calls/customer-interview-acme.md
+
+/wiki-query "What feature requests have come up most across customer calls?"
+/wiki-query "What decisions were made in Q1 and what was the rationale?"
+
+/wiki-lint
+# → "Project X mentioned in 5 pages but no dedicated page"
+# → "Roadmap contradicts customer interview on priority of feature Y"
+```
+
+The wiki stays current because the agent does the maintenance no one wants to do.
+
+---
+
+### Competitive Analysis
+
+Track a company, market, or technology over time.
+
+```
+/wiki-ingest raw/competitors/openai-announcements.md
+/wiki-ingest raw/market/ai-funding-report-q1.md
+
+/wiki-query "How do OpenAI and Anthropic differ on safety approach?"
+/wiki-query "Which companies announced multimodal models in the last 6 months?"
+/wiki-query "Competitive landscape summary as of today" --save
+```

 ## The Graph

-`build_graph.py` runs two passes:
+Two-pass build:

-1. **Deterministic** — parse all `[[wikilinks]]` in every page → explicit edges tagged `EXTRACTED`
-2. **Semantic** — Claude infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence) or `AMBIGUOUS`
+1. **Deterministic** — parses all `[[wikilinks]]` across wiki pages → edges tagged `EXTRACTED`
+2. **Semantic** — agent infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence score) or `AMBIGUOUS`

-Community detection (Louvain) clusters nodes by topic. The output is a self-contained `graph.html` — open it in any browser. SHA256 caching means only changed pages are reprocessed.
+Louvain community detection clusters nodes by topic. A SHA256 cache means only changed pages are reprocessed. The output is a self-contained `graph.html` that needs no server and opens in any browser.

-## CLAUDE.md
+## CLAUDE.md / AGENTS.md / GEMINI.md

-`CLAUDE.md` is the schema document — it tells the LLM how to maintain the wiki. It defines page formats, ingest/query/lint workflows, naming conventions, and log format. This is the key configuration file. Edit it to customize behavior for your domain.
+The schema file tells the agent how to maintain the wiki: it defines page formats, ingest/query/lint/graph workflows, and naming conventions. This is the key config file. Edit it to customize behavior for your domain.
+
+| Agent | Schema file |
+|---|---|
+| Claude Code | `CLAUDE.md` |
+| Codex / OpenCode | `AGENTS.md` |
+| Gemini CLI | `GEMINI.md` |

 ## What Makes This Different from RAG

@@ -132,141 +206,23 @@ Community detection (Louvain) clusters nodes by topic. The output is a self-cont
 | Contradictions surface at query time (maybe) | Flagged at ingest time |
 | No accumulation | Every source makes the wiki richer |

-## Use Cases
-
-### Research
-
-Going deep on a topic over weeks or months — reading papers, articles, reports.
-
-```
-# Each paper you read gets ingested:
-/wiki-ingest raw/papers/attention-is-all-you-need.md
-/wiki-ingest raw/papers/llama2.md
-/wiki-ingest raw/papers/rag-survey.md
-
-# Wiki builds up entity pages (e.g. "Meta AI", "Google Brain") and
-# concept pages (e.g. "Attention Mechanism", "RLHF") automatically.
-
-# Ask synthesis questions across everything you've read:
-/wiki-query "What are the main approaches to reducing hallucination?"
-/wiki-query "How has context window size evolved across models?"
-
-# Check where your knowledge has gaps:
-/wiki-lint
-# → "No sources on mixture-of-experts — consider reading the Mixtral paper"
-```
-
-By the end of a research project you have a structured, interlinked reference that reflects everything you've read — not a folder of PDFs you'll never reopen.
-
---
-
-### Reading a Book
-
-File each chapter as you go. Build out pages for characters, themes, plot threads.
-
-```
-# After each chapter:
-/wiki-ingest raw/book/chapter-01-the-beginning.md
-/wiki-ingest raw/book/chapter-02-the-conflict.md
-
-# Wiki creates pages like:
-# entities/ElonMusk.md, entities/Tesla.md
-# concepts/FirstPrinciplesThinking.md
-
-# Mid-book:
-/wiki-query "How has the protagonist's motivation evolved?"
-/wiki-query "What contradictions exist in the author's argument so far?"
-
-# End of book — build the graph:
-/wiki-graph
-# Open graph.html → see every character/theme/event and how they connect
-```
-
-Think fan wikis like the Tolkien Gateway — thousands of interlinked pages. You can build something like that as you read, with the agent doing all the cross-referencing.
-
---
-
-### Personal Knowledge Base
-
-Track goals, health, psychology, self-improvement — file journal entries, articles, podcast notes.
-
-```
-# File your journal entries:
-/wiki-ingest raw/journal/2026-01-week1.md
-/wiki-ingest raw/journal/2026-01-week2.md
-
-# File articles and podcast notes that resonated:
-/wiki-ingest raw/articles/huberman-sleep-protocol.md
-/wiki-ingest raw/articles/atomic-habits-summary.md
-
-# Ask introspective questions:
-/wiki-query "What patterns show up in my journal entries about energy levels?"
-/wiki-query "What habits have I tried and what was the outcome?"
-
-# The wiki builds a structured picture of you over time —
-# entities like "Sleep", "Exercise", "Deep Work" accumulate evidence
-# from every source you've filed.
-```
-
---
-
-### Business / Team Intelligence
-
-Feed in meeting transcripts, Slack exports, project docs, customer calls.
-
-```
-# Onboard new context:
-/wiki-ingest raw/meetings/q1-planning-transcript.md
-/wiki-ingest raw/docs/product-roadmap-2026.md
-/wiki-ingest raw/calls/customer-interview-acme.md
-
-# Wiki creates pages for projects, people, decisions, recurring themes.
-
-# Ask strategic questions:
-/wiki-query "What feature requests have come up most across customer calls?"
-/wiki-query "What decisions were made in Q1 planning and what was the rationale?"
-
-# Lint catches things like:
-# → "Project X mentioned in 5 pages but no dedicated page"
-# → "Roadmap contradicts customer interview on priority of feature Y"
-```
-
-The wiki stays current because the agent does the maintenance no one on the team wants to do.
-
---
-
-### Competitive Analysis / Due Diligence
-
-Track a company, market, or technology area over time.
-
-```
-# Feed in everything you find:
-/wiki-ingest raw/competitors/openai-announcements.md
-/wiki-ingest raw/competitors/anthropic-blog-posts.md
-/wiki-ingest raw/market/ai-funding-report-q1.md
-
-# Wiki builds entity pages per company, concept pages per technology.
-
-# Ask comparison questions:
-/wiki-query "How do OpenAI and Anthropic differ in their approach to safety?"
-/wiki-query "Which companies have announced multimodal models in the last 6 months?"
-
-# Save the answer back as a reusable synthesis:
-/wiki-query "Competitive landscape summary as of today" --save
-```
-
 ## Tips

- Use [Obsidian](https://obsidian.md) to read/browse the wiki — follow links, check graph view
+- Use [Obsidian](https://obsidian.md) to browse the wiki — follow links, check graph view, use Dataview for frontmatter queries
 - Use [Obsidian Web Clipper](https://obsidian.md/clipper) to clip web articles directly to `raw/`
- The wiki is a git repo — you get version history for free
 - File good query answers back with `--save` — your explorations compound just like ingested sources
+- The wiki is a git repo, so you get version history for free
+- Standalone Python scripts in `tools/` work without a coding agent: run `pip install -r requirements.txt` and set `ANTHROPIC_API_KEY`

-## License
+## Tech Stack

-MIT License — see [LICENSE](LICENSE) for details.
+NetworkX + Louvain + Claude + vis.js. No server and no database: the wiki is plain markdown files, and the graph is a static `graph.json` plus `graph.html`.

 ## Related

 - [graphify](https://github.com/safishamsi/graphify) — graph-based knowledge extraction skill (inspiration for the graph layer)
- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this is related to in spirit
+- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this resembles
+
+## License
+
+MIT License — see [LICENSE](LICENSE) for details.