Rewrite README for clarity and impact

- Lead with one-sentence hook + output structure upfront
- Add What You Get section naming concrete deliverables
- Consolidate agent compatibility into schema file table
- Add tech stack one-liner
- Streamline use cases, quick start, and graph sections
2026-04-07 08:21:48 +05:30
parent e94fdbdafe
commit d8ac6107bf
1 changed file with 163 additions and 207 deletions
--- a/README.md
+++ b/README.md
@@ -2,125 +2,199 @@
 [![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
-**A personal knowledge base that builds and maintains itself.** Drop in source documents — articles, papers, notes — and the LLM reads them, extracts the knowledge, and integrates everything into a persistent, interlinked wiki. You never write the wiki. Claude does.
+**A coding agent skill.** Drop source documents into `raw/` and type `/wiki-ingest` — the agent reads them, extracts knowledge, and builds a persistent interlinked wiki. Every new source makes the wiki richer. You never write it.
-Unlike RAG systems that re-derive knowledge from scratch on every query, LLM Wiki Agent compiles knowledge once and keeps it current. Cross-references are pre-built. Contradictions are flagged at ingest time. Every new source makes the wiki richer.
+> Most knowledge tools make you search your own notes. This one reads everything you've collected and writes a structured wiki that compounds over time — cross-references already built, contradictions already flagged, synthesis already done.
 ## How It Works
 ```
-You drop a source → Claude reads it → wiki pages are created/updated → graph is rebuilt
+/wiki-ingest raw/papers/attention-is-all-you-need.md
 You ask a question → Claude reads relevant wiki pages → synthesizes answer with citations
 ```
-Three layers:
+```
 wiki/
 ├── index.md          catalog of all pages — updated on every ingest
 ├── log.md            append-only record of every operation
 ├── overview.md       living synthesis across all sources
 ├── sources/          one summary page per source document
 ├── entities/         people, companies, projects — auto-created
 ├── concepts/         ideas, frameworks, methods — auto-created
 └── syntheses/        query answers filed back as wiki pages
 graph/
 ├── graph.json        persistent node/edge data (SHA256-cached)
 └── graph.html        interactive vis.js visualization — open in any browser
 ```
- **`raw/`** — your source documents (immutable, you own this)
+## Install
 - **`wiki/`** — Claude-maintained markdown pages (Claude writes, you read)
 - **`graph/`** — auto-generated knowledge graph visualization
-## Quick Start
+**Requires:** [Claude Code](https://claude.ai/code), [Codex](https://openai.com/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), or any agent that reads a config file.
 ```bash
 git clone https://github.com/SamurAIGPT/llm-wiki-agent.git
 cd llm-wiki-agent
 ```
-Open it in your coding agent — **no API key or Python setup needed**:
+Open in your agent — no API key or Python setup needed:
 ```bash
-claude      # Claude Code
+claude      # reads CLAUDE.md + .claude/commands/
-codex       # OpenAI Codex
+codex       # reads AGENTS.md
-opencode    # OpenCode / Pear AI
+opencode    # reads AGENTS.md
-gemini      # Gemini CLI
+gemini      # reads GEMINI.md
 ```
-Each agent reads its config file automatically (`CLAUDE.md`, `AGENTS.md`, or `GEMINI.md`) and follows the same workflows. Then just talk to it:
+## Usage
 ```
-# Claude Code — slash commands:
+/wiki-ingest raw/papers/my-paper.md          # ingest a source into the wiki
-/wiki-ingest raw/articles/my-article.md
+/wiki-ingest raw/articles/my-article.md      # works on any markdown file
-/wiki-query what are the main themes across all sources?
+
 /wiki-query "what are the main themes?"      # synthesize answer from wiki pages
 /wiki-query "how does X relate to Y?"        # with [[wikilink]] citations
 /wiki-lint                                   # find orphans, contradictions, gaps
 /wiki-graph                                  # build graph.html from all wikilinks
 ```
 Plain English also works with any agent:
 ```
 "Ingest this paper: raw/papers/llama2.md"
 "What does the wiki say about attention mechanisms?"
 "Check for contradictions across sources"
 "Build the knowledge graph and tell me the most connected nodes"
 ```
 Works with any markdown source — articles, papers, book chapters, meeting notes, journal entries, research summaries.
 ## What You Get
 **Persistent wiki** — structured markdown pages that accumulate across sessions. Unlike chat, nothing is lost.
 **Entity pages** — auto-created for every person, company, or project mentioned across sources. Updated each time a new source references them.
 **Concept pages** — auto-created for every key idea or framework. Cross-referenced to every source that discusses them.
 **Living overview** — `wiki/overview.md` is revised on every ingest to reflect the current synthesis across everything you've read.
 **Contradiction flags** — when a new source contradicts an existing claim, it's flagged at ingest time, not buried until query time.
 **Knowledge graph** — `graph.html` shows every wiki page as a node, every `[[wikilink]]` as an edge, and Claude-inferred implicit relationships as dotted edges. Community detection clusters related topics.
 **Lint reports** — orphan pages, broken links, missing entity pages, data gaps with suggested sources to fill them.
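To make the orphan and broken-link checks concrete, here is a minimal sketch of how they could be computed from `[[wikilinks]]` alone. It is an illustration under assumptions, not the logic in `tools/lint.py`: the function name `lint_links`, the `exempt` set, and the in-memory `pages` mapping are all hypothetical, and contradiction or gap detection (which needs the agent) is not covered.

```python
import re

# Matches [[Page]], [[Page|alias]] and [[Page#section]]; captures only the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def lint_links(pages, exempt=frozenset({"index", "log", "overview"})):
    """Return (orphans, broken) for a wiki.

    pages  -- dict mapping page name to its markdown text (hypothetical layout)
    exempt -- top-level pages that are allowed to have no inbound links (assumption)
    """
    inbound = {name: 0 for name in pages}
    broken = set()
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target in pages:
                if target != name:  # a self-link does not rescue a page from orphan status
                    inbound[target] += 1
            else:
                broken.add((name, target))  # link points at a page that does not exist
    orphans = {n for n, count in inbound.items() if count == 0 and n not in exempt}
    return orphans, broken


if __name__ == "__main__":
    sample = {
        "index": "See [[Attention]] and [[RLHF|alignment]].",
        "Attention": "Introduced in [[Transformer]].",
        "RLHF": "Related to [[Attention]].",
        "Scratch": "No links here.",
    }
    orphans, broken = lint_links(sample)
    print("orphans:", sorted(orphans))
    print("broken:", sorted(broken))
```

A broken link in this sketch doubles as a "missing page" hint: every `(source, target)` pair names a page that is referenced but not yet written.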
 ## Use Cases
 ### Research
 Going deep on a topic over weeks — reading papers, articles, reports.
 ```
 /wiki-ingest raw/papers/attention-is-all-you-need.md
 /wiki-ingest raw/papers/llama2.md
 /wiki-ingest raw/papers/rag-survey.md
 # Wiki builds entity pages (Meta AI, Google Brain) and
 # concept pages (Attention, RLHF, Context Window) automatically.
 /wiki-query "What are the main approaches to reducing hallucination?"
 /wiki-query "How has context window size evolved across models?"
 /wiki-lint
-/wiki-graph
+# → "No sources on mixture-of-experts — consider the Mixtral paper"
 # Any agent — plain English works too:
 "Ingest this paper: raw/papers/my-paper.md"
 "What does the wiki say about X?"
 "Check for contradictions"
 "Build the knowledge graph"
 ```
-| Agent | Config file |
+By the end you have a structured, interlinked reference — not a folder of PDFs you'll never reopen.
 |---|---|
 | [Claude Code](https://claude.ai/code) | `CLAUDE.md` + `.claude/commands/` |
 | [OpenAI Codex](https://openai.com/codex) | `AGENTS.md` |
 | OpenCode / Pear AI | `AGENTS.md` |
 | [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `GEMINI.md` |
-> **Standalone use** (without a coding agent): `pip install -r requirements.txt`, set `ANTHROPIC_API_KEY`, then use `python tools/ingest.py`, `python tools/query.py`, etc.
+---
-## Architecture
+### Reading a Book
 File each chapter as you go. Build out pages for characters, themes, arguments.
 ```
-raw/                    ← your sources (never modified by LLM)
+/wiki-ingest raw/book/chapter-01.md
-wiki/
+/wiki-ingest raw/book/chapter-02.md
-  index.md              ← catalog of all pages (updated on every ingest)
+
-  log.md                ← append-only operation log
+# Wiki creates entity and theme pages automatically.
-  overview.md           ← living synthesis across all sources
+
-  sources/              ← one page per source document
+/wiki-query "How has the protagonist's motivation evolved?"
-  entities/             ← people, companies, projects
+/wiki-query "What contradictions exist in the author's argument so far?"
-  concepts/             ← ideas, frameworks, methods
+
-  syntheses/            ← answers to queries, filed back as pages
+/wiki-graph   # → graph.html shows every character/theme and how they connect
 graph/
  graph.json            ← node/edge data (SHA256-cached)
  graph.html            ← interactive vis.js visualization
 tools/
  ingest.py             ← process a new source
  query.py              ← ask a question
  lint.py               ← health-check the wiki
  build_graph.py        ← rebuild the knowledge graph
 CLAUDE.md               ← schema and workflow instructions for the LLM
 ```
-## Commands
+Think fan wikis like Tolkien Gateway — built as you read, with the agent doing all the cross-referencing.
-### Claude Code (primary — no API key)
+---
-| Slash command | What it does |
+### Personal Knowledge Base
 |---|---|
 | `/wiki-ingest <file>` | Read a source, update wiki pages, append to log |
 | `/wiki-query <question>` | Search wiki, synthesize answer with citations |
 | `/wiki-lint` | Check for orphans, broken links, contradictions, gaps |
 | `/wiki-graph` | Build knowledge graph (`graph.json` + `graph.html`) |
-Or describe what you want in plain English — Claude Code follows `CLAUDE.md` and does the right thing.
+Track goals, health, habits, self-improvement — file journal entries, articles, podcast notes.
-### Standalone Python (optional — requires `ANTHROPIC_API_KEY`)
+```
 /wiki-ingest raw/journal/2026-01-week1.md
 /wiki-ingest raw/articles/huberman-sleep-protocol.md
 /wiki-ingest raw/articles/atomic-habits-summary.md
-| Command | What it does |
+/wiki-query "What patterns show up in my journal entries about energy?"
-|---|---|
+/wiki-query "What habits have I tried and what was the outcome?"
-| `python tools/ingest.py <file>` | Ingest a source |
+```
-| `python tools/query.py "<question>"` | Query the wiki |
+
-| `python tools/query.py "<question>" --save` | Query and file answer back |
+The wiki builds a structured picture over time. Concepts like "Sleep", "Exercise", "Deep Work" accumulate evidence from every source filed.
-| `python tools/lint.py` | Lint the wiki |
+
-| `python tools/build_graph.py` | Build graph |
+---
-| `python tools/build_graph.py --no-infer` | Build graph (skip inference, faster) |
+
-| `python tools/build_graph.py --open` | Build and open in browser |
+### Business / Team Intelligence
 Feed in meeting transcripts, project docs, customer calls.
 ```
 /wiki-ingest raw/meetings/q1-planning-transcript.md
 /wiki-ingest raw/docs/product-roadmap-2026.md
 /wiki-ingest raw/calls/customer-interview-acme.md
 /wiki-query "What feature requests have come up most across customer calls?"
 /wiki-query "What decisions were made in Q1 and what was the rationale?"
 /wiki-lint
 # → "Project X mentioned in 5 pages but no dedicated page"
 # → "Roadmap contradicts customer interview on priority of feature Y"
 ```
 The wiki stays current because the agent does the maintenance no one wants to do.
 ---
 ### Competitive Analysis
 Track a company, market, or technology over time.
 ```
 /wiki-ingest raw/competitors/openai-announcements.md
 /wiki-ingest raw/market/ai-funding-report-q1.md
 /wiki-query "How do OpenAI and Anthropic differ on safety approach?"
 /wiki-query "Which companies announced multimodal models in the last 6 months?"
 /wiki-query "Competitive landscape summary as of today" --save
 ```
 ## The Graph
-`build_graph.py` runs two passes:
+Two-pass build:
-1. **Deterministic** — parse all `[[wikilinks]]` in every page → explicit edges tagged `EXTRACTED`
+1. **Deterministic** — parses all `[[wikilinks]]` across wiki pages → edges tagged `EXTRACTED`
-2. **Semantic** — Claude infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence) or `AMBIGUOUS`
+2. **Semantic** — agent infers implicit relationships not captured by wikilinks → edges tagged `INFERRED` (with confidence score) or `AMBIGUOUS`
-Community detection (Louvain) clusters nodes by topic. The output is a self-contained `graph.html` — open it in any browser. SHA256 caching means only changed pages are reprocessed.
+Louvain community detection clusters nodes by topic. SHA256 cache means only changed pages are reprocessed. Output is a self-contained `graph.html` — no server, opens in any browser.
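The deterministic pass and the SHA256 cache described above can be sketched with the standard library only. This is an assumption-laden illustration, not the code in `tools/build_graph.py`: the names `page_digest` and `update_edges`, the cache layout, and the edge dictionary fields are hypothetical. The semantic pass and Louvain clustering are left out because they depend on the agent and on a graph library.

```python
import hashlib
import re

# Matches [[Page]], [[Page|alias]] and [[Page#section]]; captures only the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def page_digest(text):
    """SHA256 hex digest of a page's text, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def update_edges(pages, cache):
    """Re-extract wikilink edges only for pages whose digest changed.

    pages -- dict mapping page name to markdown text
    cache -- dict mapping page name to {"sha256": ..., "edges": [...]} (hypothetical
             layout), mutated in place
    Returns (all_edges, reprocessed_page_names).
    """
    reprocessed = []
    for name, text in pages.items():
        digest = page_digest(text)
        entry = cache.get(name)
        if entry is None or entry["sha256"] != digest:
            targets = sorted({t.strip() for t in WIKILINK.findall(text)})
            cache[name] = {
                "sha256": digest,
                "edges": [
                    {"source": name, "target": t, "type": "EXTRACTED"} for t in targets
                ],
            }
            reprocessed.append(name)
    # Forget pages that were deleted from the wiki.
    for name in list(cache):
        if name not in pages:
            del cache[name]
    edges = [edge for name in sorted(cache) for edge in cache[name]["edges"]]
    return edges, reprocessed


if __name__ == "__main__":
    pages = {"A": "links to [[B]] and [[C|see C]]", "B": "back to [[A]]"}
    cache = {}
    print(update_edges(pages, cache)[1])  # first run: every page is processed
    print(update_edges(pages, cache)[1])  # second run: nothing changed
```

Keying the cache on a content hash rather than a modification time means a page that is rewritten with identical text is not reprocessed.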
-## CLAUDE.md
+## CLAUDE.md / AGENTS.md
-`CLAUDE.md` is the schema document — it tells the LLM how to maintain the wiki. It defines page formats, ingest/query/lint workflows, naming conventions, and log format. This is the key configuration file. Edit it to customize behavior for your domain.
+The schema file tells the agent how to maintain the wiki — page formats, ingest/query/lint/graph workflows, naming conventions. This is the key config file. Edit it to customize behavior for your domain.
 | Agent | Schema file |
 |---|---|
 | Claude Code | `CLAUDE.md` |
 | Codex / OpenCode | `AGENTS.md` |
 | Gemini CLI | `GEMINI.md` |
 ## What Makes This Different from RAG
@@ -132,141 +206,23 @@ Community detection (Louvain) clusters nodes by topic. The output is a self-cont
 | Contradictions surface at query time (maybe) | Flagged at ingest time |
 | No accumulation | Every source makes the wiki richer |
 ## Use Cases
 ### Research
 Going deep on a topic over weeks or months — reading papers, articles, reports.
 ```
 # Each paper you read gets ingested:
 /wiki-ingest raw/papers/attention-is-all-you-need.md
 /wiki-ingest raw/papers/llama2.md
 /wiki-ingest raw/papers/rag-survey.md
 # Wiki builds up entity pages (e.g. "Meta AI", "Google Brain") and
 # concept pages (e.g. "Attention Mechanism", "RLHF") automatically.
 # Ask synthesis questions across everything you've read:
 /wiki-query "What are the main approaches to reducing hallucination?"
 /wiki-query "How has context window size evolved across models?"
 # Check where your knowledge has gaps:
 /wiki-lint
 # → "No sources on mixture-of-experts — consider reading the Mixtral paper"
 ```
 By the end of a research project you have a structured, interlinked reference that reflects everything you've read — not a folder of PDFs you'll never reopen.
 ---
 ### Reading a Book
 File each chapter as you go. Build out pages for characters, themes, plot threads.
 ```
 # After each chapter:
 /wiki-ingest raw/book/chapter-01-the-beginning.md
 /wiki-ingest raw/book/chapter-02-the-conflict.md
 # Wiki creates pages like:
 # entities/ElonMusk.md, entities/Tesla.md
 # concepts/FirstPrinciplesThinking.md
 # Mid-book:
 /wiki-query "How has the protagonist's motivation evolved?"
 /wiki-query "What contradictions exist in the author's argument so far?"
 # End of book — build the graph:
 /wiki-graph
 # Open graph.html → see every character/theme/event and how they connect
 ```
 Think fan wikis like the Tolkien Gateway — thousands of interlinked pages. You can build something like that as you read, with the agent doing all the cross-referencing.
 ---
 ### Personal Knowledge Base
 Track goals, health, psychology, self-improvement — file journal entries, articles, podcast notes.
 ```
 # File your journal entries:
 /wiki-ingest raw/journal/2026-01-week1.md
 /wiki-ingest raw/journal/2026-01-week2.md
 # File articles and podcast notes that resonated:
 /wiki-ingest raw/articles/huberman-sleep-protocol.md
 /wiki-ingest raw/articles/atomic-habits-summary.md
 # Ask introspective questions:
 /wiki-query "What patterns show up in my journal entries about energy levels?"
 /wiki-query "What habits have I tried and what was the outcome?"
 # The wiki builds a structured picture of you over time —
 # entities like "Sleep", "Exercise", "Deep Work" accumulate evidence
 # from every source you've filed.
 ```
 ---
 ### Business / Team Intelligence
 Feed in meeting transcripts, Slack exports, project docs, customer calls.
 ```
 # Onboard new context:
 /wiki-ingest raw/meetings/q1-planning-transcript.md
 /wiki-ingest raw/docs/product-roadmap-2026.md
 /wiki-ingest raw/calls/customer-interview-acme.md
 # Wiki creates pages for projects, people, decisions, recurring themes.
 # Ask strategic questions:
 /wiki-query "What feature requests have come up most across customer calls?"
 /wiki-query "What decisions were made in Q1 planning and what was the rationale?"
 # Lint catches things like:
 # → "Project X mentioned in 5 pages but no dedicated page"
 # → "Roadmap contradicts customer interview on priority of feature Y"
 ```
 The wiki stays current because the agent does the maintenance no one on the team wants to do.
 ---
 ### Competitive Analysis / Due Diligence
 Track a company, market, or technology area over time.
 ```
 # Feed in everything you find:
 /wiki-ingest raw/competitors/openai-announcements.md
 /wiki-ingest raw/competitors/anthropic-blog-posts.md
 /wiki-ingest raw/market/ai-funding-report-q1.md
 # Wiki builds entity pages per company, concept pages per technology.
 # Ask comparison questions:
 /wiki-query "How do OpenAI and Anthropic differ in their approach to safety?"
 /wiki-query "Which companies have announced multimodal models in the last 6 months?"
 # Save the answer back as a reusable synthesis:
 /wiki-query "Competitive landscape summary as of today" --save
 ```
 ## Tips
- Use [Obsidian](https://obsidian.md) to read/browse the wiki — follow links, check graph view
+- Use [Obsidian](https://obsidian.md) to browse the wiki — follow links, check graph view, use Dataview for frontmatter queries
 - Use [Obsidian Web Clipper](https://obsidian.md/clipper) to clip web articles directly to `raw/`
 - The wiki is a git repo — you get version history for free
 - File good query answers back with `--save` — your explorations compound just like ingested sources
 - The wiki is a git repo — version history for free
 - Standalone Python scripts in `tools/` work without a coding agent (require `ANTHROPIC_API_KEY`)
-## License
+## Tech Stack
-MIT License — see [LICENSE](LICENSE) for details.
+NetworkX + Louvain + Claude + vis.js. No server, no database, runs entirely locally. Everything is plain markdown files.
 ## Related
 - [graphify](https://github.com/safishamsi/graphify) — graph-based knowledge extraction skill (inspiration for the graph layer)
- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this is related to in spirit
+- [Vannevar Bush's Memex (1945)](https://en.wikipedia.org/wiki/Memex) — the original vision this resembles
 ## License
 MIT License — see [LICENSE](LICENSE) for details.