chore: ignore raw and wiki, update remote

2026-04-16 13:13:32 +08:00
parent b02eb12d1d
commit 753f7841e8
11 changed files with 1038 additions and 155 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,2 @@
+raw/
+wiki/
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,78 +1,132 @@
-# LLM Wiki Agent — Schema & Workflow Instructions
+# LLM Wiki Agent — Schema & Workflow Instructions (Enhanced Chinese Edition)

-This wiki is maintained entirely by Claude Code. No API key or Python scripts needed — just open this repo in Claude Code and talk to it.
+This wiki is maintained entirely by Claude Code. No API key or Python scripts are needed: just open this repo in Claude Code and talk to it.

-## Slash Commands (Claude Code)
+---
+# 🔴 Global Mandatory Rules (CRITICAL)

-| Command | What to say |
-|---|---|
-| `/wiki-ingest` | `ingest raw/my-article.md` |
-| `/wiki-query` | `query: what are the main themes?` |
-| `/wiki-lint` | `lint the wiki` |
-| `/wiki-graph` | `build the knowledge graph` |
+## 1. Output Language (Mandatory)

-Or just describe what you want in plain English:
- *"Ingest this file: raw/papers/attention-is-all-you-need.md"*
- *"What does the wiki say about transformer models?"*
- *"Check the wiki for orphan pages and contradictions"*
- *"Build the graph and show me what's connected to RAG"*
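+- All output must be in **Simplified Chinese**
+- Proper nouns may stay in English, but the first occurrence must include a Chinese explanation
+- If the original file name is Chinese, name the source page in Chinese where possible rather than in pinyin; special characters may be dropped
+- No mixed Chinese-English sentences (technical terms excepted)
+- No English-only summaries or analyses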

-Claude Code reads this file automatically and follows the workflows below.
+Example:
+
+Transformer（变压器模型，一种基于注意力机制的神经网络架构）

 ---

-## Directory Layout
+## 2. Output Style (Strictly Enforced)

-```
-raw/          # Immutable source documents — never modify these
-wiki/         # Claude owns this layer entirely
-  index.md    # Catalog of all pages — update on every ingest
-  log.md      # Append-only chronological record
-  overview.md # Living synthesis across all sources
-  sources/    # One summary page per source document
-  entities/   # People, companies, projects, products
-  concepts/   # Ideas, frameworks, methods, theories
-  syntheses/  # Saved query answers
-graph/        # Auto-generated graph data
-tools/        # Optional standalone Python scripts (require ANTHROPIC_API_KEY)
-```
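+All output must:
+
+- Drop rhetoric (no narrative style)
+- Drop vagueness (no hedging words such as "可能" (maybe) or "大概" (roughly))
+- Maximize information density
+- Serve structured knowledge, not reading experience
+
+Priority:
+
+Structure > Relations > Conclusions > Description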

 ---

-## Page Format
+## 3. Structured Semantics (Mandatory)

-Every wiki page uses this frontmatter:
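+Every page must follow the structured-semantics rules:
+
+- Summary must use the fixed fields
+- Claims must follow the standard grammar
+- Connections must use relation types
+- No free-form improvisation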
+
+---
+
+# Slash Commands (Claude Code)
+
+| Command        | Usage                       |
+| -------------- | --------------------------- |
+| `/wiki-ingest` | `ingest raw/your-file.md`   |
+| `/wiki-query`  | `query: your question`      |
+| `/wiki-lint`   | `lint the wiki`             |
+| `/wiki-graph`  | `build the knowledge graph` |
+
+---
+
+## Natural-Language Examples
+
+- ingest raw/papers/attention-is-all-you-need.md
+- query: What is the core mechanism of the Transformer?
+- lint the wiki
+- build the graph and analyze RAG
+
+Claude Code reads this file automatically and runs the workflows below.
+
+
+
+---
+
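+# Directory Layout
+
+```
+raw/          # Source documents (never modify)
+wiki/         # Knowledge layer (maintained entirely by Claude)
+  index.md    # Page index (must be updated on every ingest)
+  log.md      # Append-only log
+  overview.md # Global knowledge summary
+  sources/    # One page per source document
+  entities/   # Entities (people / companies / products / projects)
+  concepts/   # Concepts (methods / theories / frameworks)
+  syntheses/  # Saved query results
+graph/        # Auto-generated graph data
+tools/        # Optional Python tools (require ANTHROPIC_API_KEY)
+```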
+
+
+---
+
+# Page Format
+
+Every page must include:

 ```yaml
 ---
+id: unique_id
 title: "Page Title"
 type: source | entity | concept | synthesis
 tags: []
-sources: []       # list of source slugs that inform this page
+sources: []       # sources that inform this page
 last_updated: YYYY-MM-DD
 ---
-```
+````

-Use `[[PageName]]` wikilinks to link to other wiki pages.
+Links must use `[[PageName]]` wikilinks.

 ---

-## Ingest Workflow
+# Ingest Workflow
+**Important:** Follow the ingest workflow strictly. For every page analyzed, create or update the source page and the related entity and concept pages. Nothing may be skipped!

-Triggered by: *"ingest <file>"* or `/wiki-ingest`
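+Triggered by:
+- `/wiki-ingest`
+- or: `ingest <file>`
+
+## Steps (Strict Order)
+1. Read the source document in full using the Read tool
+2. Read `wiki/index.md` and `wiki/overview.md`
+3. Write `wiki/sources/<original-Chinese-name>.md` (use `slug.md` for non-Chinese names)
+4. Update `wiki/index.md`
+5. Update `wiki/overview.md` (if needed)
+6. Create or update Entity pages
+7. Create or update Concept pages
+8. Detect and record conflicts
+9. Append to `wiki/log.md`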

-Steps (in order):
-1. Read the source document fully using the Read tool
-2. Read `wiki/index.md` and `wiki/overview.md` for current wiki context
-3. Write `wiki/sources/<slug>.md` — use the source page format below
-4. Update `wiki/index.md` — add entry under Sources section
-5. Update `wiki/overview.md` — revise synthesis if warranted
-6. Update/create entity pages for key people, companies, projects mentioned
-7. Update/create concept pages for key ideas and frameworks discussed
-8. Flag any contradictions with existing wiki content
-9. Append to `wiki/log.md`: `## [YYYY-MM-DD] ingest | <Title>`
+---

-### Source Page Format
+# Source Page Format (Enhanced Structure)

 ```markdown
 ---
@@ -80,32 +134,46 @@ title: "Source Title"
 type: source
 tags: []
 date: YYYY-MM-DD
-source_file: raw/...
 ---

+## Source File
+- [[raw/...]]
+
 ## Summary
-2–4 sentence summary.
+- Core topic:
+- Problem domain:
+- Method / mechanism:
+- Conclusion / value:

 ## Key Claims
- Claim 1
- Claim 2
+- (must follow: subject + mechanism + result)

 ## Key Quotes
-> "Quote here" — context
+> "Quoted text" — context note
+
+## Key Concepts
+- [[ConceptName]]: definition
+
+## Key Entities
+- [[EntityName]]: role

 ## Connections
- [[EntityName]] — how they relate
- [[ConceptName]] — how it connects
+- [[A]] ← depends_on ← [[B]]
+- [[C]] ← extends ← [[D]]

 ## Contradictions
- Contradicts [[OtherPage]] on: ...
+- Conflicts with [[OtherPage]]:
+  - Point of conflict:
+  - Current view:
+  - Other page's view:
 ```

-### Domain-Specific Templates
+---

-If the source falls into a specific domain (e.g., personal diary, meeting notes), the agent should use a specialized template instead of the default generic one above:
+# Domain-Specific Templates
+
+## Diary / Journal

-#### Diary / Journal Template
 ```markdown
 ---
 title: "YYYY-MM-DD Diary"
@@ -114,18 +182,16 @@ tags: [diary]
 date: YYYY-MM-DD
 ---
 ## Event Summary
-...
 ## Key Decisions
-...
 ## Energy & Mood
-...
 ## Connections
-...
 ## Shifts & Contradictions
-...
 ```

-#### Meeting Notes Template
+---
+
+## Meeting Notes
+
 ```markdown
 ---
 title: "Meeting Title"
@@ -134,97 +200,153 @@ tags: [meeting]
 date: YYYY-MM-DD
 ---
 ## Goal
-...
 ## Key Discussions
-...
 ## Decisions Made
-...
 ## Action Items
-...
 ```

 ---

-## Query Workflow
+# Entity & Concept Rules (Key Enhancement)

-Triggered by: *"query: <question>"* or `/wiki-query`
+## Entity

-Steps:
-1. Read `wiki/index.md` to identify relevant pages
-2. Read those pages with the Read tool
-3. Synthesize an answer with inline citations as `[[PageName]]` wikilinks
-4. Ask the user if they want the answer filed as `wiki/syntheses/<slug>.md`
+Creation criteria:
+- Appears ≥ 2 times, or
+- Has a key influence on the topic
+
+Types:
+- Person / company / product / project

 ---

-## Lint Workflow
+## Concept
+Creation criteria:
+- Can be abstracted
+- Reusable
+- Not a concrete instance
+
+---

-Triggered by: *"lint the wiki"* or `/wiki-lint`
+## Naming Rules (Mandatory)
+- Use a single canonical name
+- List every alias on the page:

-Use Grep and Read tools to check for:
- **Orphan pages** — wiki pages with no inbound `[[links]]` from other pages
- **Broken links** — `[[WikiLinks]]` pointing to pages that don't exist
- **Contradictions** — claims that conflict across pages
- **Stale summaries** — pages not updated after newer sources
- **Missing entity pages** — entities mentioned in 3+ pages but lacking their own page
- **Data gaps** — questions the wiki can't answer; suggest new sources
-
-Output a lint report and ask if the user wants it saved to `wiki/lint-report.md`.
+```markdown
+## Aliases
+- GPT4
+- GPT-4
+```

 ---

-## Graph Workflow
+## Deduplication (Mandatory)

-Triggered by: *"build the knowledge graph"* or `/wiki-graph`
-
-When the user asks to build the graph, run `tools/build_graph.py` which:
- Pass 1: Parses all `[[wikilinks]]` → deterministic `EXTRACTED` edges
- Pass 2: Infers implicit relationships → `INFERRED` edges with confidence scores
- Runs Louvain community detection
- Outputs `graph/graph.json` + `graph/graph.html`
-
-If the user doesn't have Python/dependencies set up, instead generate the graph data manually:
-1. Use Grep to find all `[[wikilinks]]` across wiki pages
-2. Build a node/edge list
-3. Write `graph/graph.json` directly
-4. Write `graph/graph.html` using the vis.js template
+Before creating a page:
+1. Search the index
+2. Determine whether the page already exists
+3. If it exists, update it

 ---

-## Naming Conventions
+# Query Workflow

- Source slugs: `kebab-case` matching source filename
- Entity pages: `TitleCase.md` (e.g. `OpenAI.md`, `SamAltman.md`)
- Concept pages: `TitleCase.md` (e.g. `ReinforcementLearning.md`, `RAG.md`)
- Source pages: `kebab-case.md`
+Triggered by:
+- `/wiki-query`
+- or: `query: <question>`

-## Index Format
+---
+
+## Steps
+
+1. Read the index
+2. Find the relevant pages
+3. Load them with the Read tool
+4. Output a structured answer
+5. Cite with `[[Page]]`
+6. Ask whether to save the answer as a synthesis
+
+---
+
+# Lint Workflow
+
+Checks:
+
+- Orphan pages
+- Broken links
+- Contradictions
+- Stale content
+- Missing Entity pages
+- Missing Concept pages
+- Knowledge gaps
+
+---
+
+# Graph Workflow
+
+Triggered by:
+- `/wiki-graph`
+
+---
+
+Execution:
+- Prefer running `tools/build_graph.py`
+- Otherwise build the graph manually:
+
+Steps:
+1. Extract all `[[links]]`
+2. Build nodes and edges
+3. Write `graph.json`
+
+---
+
+# Naming Conventions
+- Source: keep the original Chinese name (special characters removed); use kebab-case for non-Chinese names
+- Entity: TitleCase
+- Concept: TitleCase
+
+---
+
+# Index Format

 ```markdown
 # Wiki Index

 ## Overview
- [Overview](overview.md) — living synthesis
+- [Overview](overview.md)

 ## Sources
- [Source Title](sources/slug.md) — one-line summary
+- [Title](sources/<original-Chinese-name>.md)

 ## Entities
- [Entity Name](entities/EntityName.md) — one-line description
+- [Entity](entities/Entity.md)

 ## Concepts
- [Concept Name](concepts/ConceptName.md) — one-line description
+- [Concept](concepts/Concept.md)

 ## Syntheses
- [Analysis Title](syntheses/slug.md) — what question it answers
+- [Title](syntheses/slug.md)
 ```

-## Log Format
+---

-Each entry starts with `## [YYYY-MM-DD] <operation> | <title>` so it's grep-parseable:
+# Log Format

 ```
-grep "^## \[" wiki/log.md | tail -10
+## [YYYY-MM-DD] ingest | <title>
 ```

-Operations: `ingest`, `query`, `lint`, `graph`
+---
+
+# ✅ End Goal
+
+This system is used for:
+
+- Knowledge accumulation
+- Structured understanding
+- Automatic graph construction
+- Agent reasoning support
+
+---
+
+# END
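
The manual fallback in the Graph Workflow of the new `CLAUDE.md` (extract all `[[links]]`, build nodes and edges, write `graph.json`) could be sketched roughly as below. This is an illustration only, not code from this commit: `build_graph`, the in-memory `pages` mapping, and the node/edge JSON shape are assumptions, and the handling of `|` and `#` inside a link is a guess at common wikilink syntax rather than something the schema specifies.

```python
import json
import re

# Captures the link target in [[Target]]; stopping at "|" or "#" is an assumption
# about alias/anchor syntax that CLAUDE.md itself does not define.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def build_graph(pages: dict[str, str]) -> dict:
    """Build a node/edge list from {page_name: markdown_text} (hypothetical helper)."""
    nodes = set(pages)
    edges = set()
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target and target != name:
                nodes.add(target)          # linked pages become nodes even if missing
                edges.add((name, target))  # directed edge: page -> linked page
    return {
        "nodes": [{"id": n} for n in sorted(nodes)],
        "edges": [{"source": s, "target": t} for s, t in sorted(edges)],
    }


# Made-up sample pages, for illustration only.
pages = {
    "RAG": "Builds on [[Transformer]] and [[VectorSearch]].",
    "Transformer": "See [[RAG]].",
}
graph = build_graph(pages)
print(json.dumps(graph, ensure_ascii=False, indent=2))
```

In a real run the `pages` mapping would be filled by reading the files under `wiki/`, and the resulting dictionary would be written to `graph/graph.json`.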
--- a/CLAUDE.md.bak
+++ b/CLAUDE.md.bak
@@ -0,0 +1,230 @@
+# LLM Wiki Agent — Schema & Workflow Instructions
+
+This wiki is maintained entirely by Claude Code. No API key or Python scripts needed — just open this repo in Claude Code and talk to it.
+
+## Slash Commands (Claude Code)
+
+| Command | What to say |
+|---|---|
+| `/wiki-ingest` | `ingest raw/my-article.md` |
+| `/wiki-query` | `query: what are the main themes?` |
+| `/wiki-lint` | `lint the wiki` |
+| `/wiki-graph` | `build the knowledge graph` |
+
+Or just describe what you want in plain English:
+- *"Ingest this file: raw/papers/attention-is-all-you-need.md"*
+- *"What does the wiki say about transformer models?"*
+- *"Check the wiki for orphan pages and contradictions"*
+- *"Build the graph and show me what's connected to RAG"*
+
+Claude Code reads this file automatically and follows the workflows below.
+
+---
+
+## Directory Layout
+
+```
+raw/          # Immutable source documents — never modify these
+wiki/         # Claude owns this layer entirely
+  index.md    # Catalog of all pages — update on every ingest
+  log.md      # Append-only chronological record
+  overview.md # Living synthesis across all sources
+  sources/    # One summary page per source document
+  entities/   # People, companies, projects, products
+  concepts/   # Ideas, frameworks, methods, theories
+  syntheses/  # Saved query answers
+graph/        # Auto-generated graph data
+tools/        # Optional standalone Python scripts (require ANTHROPIC_API_KEY)
+```
+
+---
+
+## Page Format
+
+Every wiki page uses this frontmatter:
+
+```yaml
+---
+title: "Page Title"
+type: source | entity | concept | synthesis
+tags: []
+sources: []       # list of source slugs that inform this page
+last_updated: YYYY-MM-DD
+---
+```
+
+Use `[[PageName]]` wikilinks to link to other wiki pages.
+
+---
+
+## Ingest Workflow
+
+Triggered by: *"ingest <file>"* or `/wiki-ingest`
+
+Steps (in order):
+1. Read the source document fully using the Read tool
+2. Read `wiki/index.md` and `wiki/overview.md` for current wiki context
+3. Write `wiki/sources/<slug>.md` — use the source page format below
+4. Update `wiki/index.md` — add entry under Sources section
+5. Update `wiki/overview.md` — revise synthesis if warranted
+6. Update/create entity pages for key people, companies, projects mentioned
+7. Update/create concept pages for key ideas and frameworks discussed
+8. Flag any contradictions with existing wiki content
+9. Append to `wiki/log.md`: `## [YYYY-MM-DD] ingest | <Title>`
+
+### Source Page Format
+
+```markdown
+---
+title: "Source Title"
+type: source
+tags: []
+date: YYYY-MM-DD
+source_file: raw/...
+---
+
+## Summary
+2–4 sentence summary.
+
+## Key Claims
+- Claim 1
+- Claim 2
+
+## Key Quotes
+> "Quote here" — context
+
+## Connections
+- [[EntityName]] — how they relate
+- [[ConceptName]] — how it connects
+
+## Contradictions
+- Contradicts [[OtherPage]] on: ...
+```
+
+### Domain-Specific Templates
+
+If the source falls into a specific domain (e.g., personal diary, meeting notes), the agent should use a specialized template instead of the default generic one above:
+
+#### Diary / Journal Template
+```markdown
+---
+title: "YYYY-MM-DD Diary"
+type: source
+tags: [diary]
+date: YYYY-MM-DD
+---
+## Event Summary
+...
+## Key Decisions
+...
+## Energy & Mood
+...
+## Connections
+...
+## Shifts & Contradictions
+...
+```
+
+#### Meeting Notes Template
+```markdown
+---
+title: "Meeting Title"
+type: source
+tags: [meeting]
+date: YYYY-MM-DD
+---
+## Goal
+...
+## Key Discussions
+...
+## Decisions Made
+...
+## Action Items
+...
+```
+
+---
+
+## Query Workflow
+
+Triggered by: *"query: <question>"* or `/wiki-query`
+
+Steps:
+1. Read `wiki/index.md` to identify relevant pages
+2. Read those pages with the Read tool
+3. Synthesize an answer with inline citations as `[[PageName]]` wikilinks
+4. Ask the user if they want the answer filed as `wiki/syntheses/<slug>.md`
+
+---
+
+## Lint Workflow
+
+Triggered by: *"lint the wiki"* or `/wiki-lint`
+
+Use Grep and Read tools to check for:
+- **Orphan pages** — wiki pages with no inbound `[[links]]` from other pages
+- **Broken links** — `[[WikiLinks]]` pointing to pages that don't exist
+- **Contradictions** — claims that conflict across pages
+- **Stale summaries** — pages not updated after newer sources
+- **Missing entity pages** — entities mentioned in 3+ pages but lacking their own page
+- **Data gaps** — questions the wiki can't answer; suggest new sources
+
+Output a lint report and ask if the user wants it saved to `wiki/lint-report.md`.
+
+---
+
+## Graph Workflow
+
+Triggered by: *"build the knowledge graph"* or `/wiki-graph`
+
+When the user asks to build the graph, run `tools/build_graph.py` which:
+- Pass 1: Parses all `[[wikilinks]]` → deterministic `EXTRACTED` edges
+- Pass 2: Infers implicit relationships → `INFERRED` edges with confidence scores
+- Runs Louvain community detection
+- Outputs `graph/graph.json` + `graph/graph.html`
+
+If the user doesn't have Python/dependencies set up, instead generate the graph data manually:
+1. Use Grep to find all `[[wikilinks]]` across wiki pages
+2. Build a node/edge list
+3. Write `graph/graph.json` directly
+4. Write `graph/graph.html` using the vis.js template
+
+---
+
+## Naming Conventions
+
+- Source slugs: `kebab-case` matching source filename
+- Entity pages: `TitleCase.md` (e.g. `OpenAI.md`, `SamAltman.md`)
+- Concept pages: `TitleCase.md` (e.g. `ReinforcementLearning.md`, `RAG.md`)
+- Source pages: `kebab-case.md`
+
+## Index Format
+
+```markdown
+# Wiki Index
+
+## Overview
+- [Overview](overview.md) — living synthesis
+
+## Sources
+- [Source Title](sources/slug.md) — one-line summary
+
+## Entities
+- [Entity Name](entities/EntityName.md) — one-line description
+
+## Concepts
+- [Concept Name](concepts/ConceptName.md) — one-line description
+
+## Syntheses
+- [Analysis Title](syntheses/slug.md) — what question it answers
+```
+
+## Log Format
+
+Each entry starts with `## [YYYY-MM-DD] <operation> | <title>` so it's grep-parseable:
+
+```
+grep "^## \[" wiki/log.md | tail -10
+```
+
+Operations: `ingest`, `query`, `lint`, `graph`
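
The log heading format described in `CLAUDE.md.bak` above (`## [YYYY-MM-DD] <operation> | <title>`) is meant to be grep-parseable. A Python equivalent of that `grep` might look like the sketch below; `parse_log` and the sample entries are hypothetical and serve only to show the shape of the format.

```python
import re

# One log heading: "## [YYYY-MM-DD] <operation> | <title>"
LOG_HEADING = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def parse_log(text: str) -> list[dict]:
    """Return one dict per log heading, ignoring all other lines (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        match = LOG_HEADING.match(line)
        if match:
            date, operation, title = match.groups()
            entries.append({"date": date, "operation": operation, "title": title.strip()})
    return entries


# Made-up log content, for illustration only.
sample = """## [2026-04-15] ingest | Attention Is All You Need
Some free-form notes under the heading.
## [2026-04-16] lint | Weekly check
"""
entries = parse_log(sample)
print(entries)
```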
--- a/1
+++ b/1
@@ -0,0 +1 @@
+/Users/weishen/Workspace/nexus/raw
--- a/raw/.gitkeep
+++ b/raw/.gitkeep
--- a/tools/pycache/sync.cpython-311.pyc
+++ b/tools/pycache/sync.cpython-311.pyc
--- a/tools/sync.py
+++ b/tools/sync.py
@@ -0,0 +1,567 @@
+#!/usr/bin/env python3
+"""
+Wiki ↔ Raw three-way sync tool
+
+Features:
+  - Detect file changes under raw/ (added / modified / deleted)
+  - Call ingest.py automatically to sync
+  - Maintain the manifest.json state mapping
+  - Detect orphan entity/concept pages (report only, never delete)
+
+Usage:
+    python tools/sync.py --check        Preview changes (nothing is executed)
+    python tools/sync.py --sync         Run the sync
+    python tools/sync.py --rebuild      Rebuild wiki/index from the manifest (fallback)
+    python tools/sync.py --bootstrap    Generate the manifest from existing wiki sources (first-time use; skips files already ingested)
+
+manifest.json format:
+{
+  "version": 1,
+  "updated_at": "ISO timestamp",
+  "files": {
+    "relative/path/to/file.md": {
+      "hash": "first 16 hex chars of sha256",
+      "modified": "ISO timestamp",
+      "slug": "wiki-source-slug",
+      "source_path": "wiki/sources/slug.md",
+      "ingested": true
+    }
+  }
+}
+"""
+
+import os
+import sys
+import json
+import hashlib
+import subprocess
+from pathlib import Path
+from datetime import datetime, timezone
+
+
+REPO_ROOT = Path(__file__).parent.parent
+WIKI_DIR = REPO_ROOT / "wiki"
+MANIFEST_FILE = WIKI_DIR / "manifest.json"
+SCHEMA_FILE = REPO_ROOT / "CLAUDE.md"
+
+
+# ─── Utilities ───────────────────────────────────────────────
+
+def green(text):
+    return f"\033[92m{text}\033[0m"
+
+def yellow(text):
+    return f"\033[93m{text}\033[0m"
+
+def red(text):
+    return f"\033[91m{text}\033[0m"
+
+def dim(text):
+    return f"\033[2m{text}\033[0m"
+
+def bold(text):
+    return f"\033[1m{text}\033[0m"
+
+
+def log(msg, style="normal"):
+    prefixes = {
+        "normal":   "  ",
+        "info":     "  ℹ ",
+        "success":  "  ✓ ",
+        "warn":     "  ⚠ ",
+        "error":    "  ✗ ",
+        "section":  "\n── ",
+    }
+    print(f"{prefixes.get(style, '  ')}{msg}")
+
+
+def sha256_file(path: Path) -> str:
+    """Return the first 16 hex characters of the file's SHA-256 digest."""
+    h = hashlib.sha256()
+    h.update(path.read_bytes())
+    return h.hexdigest()[:16]
+
+
+def iso_now():
+    return datetime.now(timezone.utc).isoformat()
+
+
+def load_manifest() -> dict:
+    if MANIFEST_FILE.exists():
+        try:
+            return json.loads(MANIFEST_FILE.read_text(encoding="utf-8"))
+        except (json.JSONDecodeError, IOError):
+            pass
+    return {"version": 1, "updated_at": iso_now(), "files": {}}
+
+
+def save_manifest(manifest: dict):
+    manifest["updated_at"] = iso_now()
+    MANIFEST_FILE.write_text(json.dumps(manifest, ensure_ascii=False, indent=2), encoding="utf-8")
+
+
+def scan_raw() -> dict[str, dict]:
+    """Return {relative_path: {hash, modified, size, abs_path}} for every .md file under raw/."""
+    raw_dir = REPO_ROOT / "raw"
+    result = {}
+    if not raw_dir.exists():
+        return result
+    for p in raw_dir.rglob("*.md"):
+        if p.is_file() and not p.name.startswith("."):
+            rel = str(p.relative_to(REPO_ROOT))
+            stat = p.stat()
+            result[rel] = {
+                "hash": sha256_file(p),
+                "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
+                "size": stat.st_size,
+                "abs_path": str(p),
+            }
+    return result
+
+
+def build_slug_from_path(rel_path: str) -> str:
+    """Build a slug from a relative path, keeping letters, digits, and Chinese characters and replacing other symbols with '-'."""
+    name = Path(rel_path).stem
+    name = name.replace(" ", "-").replace("/", "-").replace("\\", "-")
+    name = "".join(c if c.isalnum() or c in ("-", "_", "·") else "-" for c in name)
+    name = name.strip("-")
+    return name or "untitled"
+
+
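+def call_ingest(source_path: str, slug: str | None = None) -> dict:
+    """Run ingest.py on source_path and return its result.
+
+    slug is accepted from callers but is not passed on to ingest.py.
+    """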
+    cmd = [sys.executable, str(REPO_ROOT / "tools" / "ingest.py"), source_path]
+    try:
+        result = subprocess.run(
+            cmd,
+            capture_output=True,
+            text=True,
+            timeout=300,
+            cwd=str(REPO_ROOT),
+        )
+        return {
+            "success": result.returncode == 0,
+            "stdout": result.stdout,
+            "stderr": result.stderr,
+        }
+    except subprocess.TimeoutExpired:
+        return {"success": False, "stdout": "", "stderr": "Timeout (>5min)"}
+    except Exception as e:
+        return {"success": False, "stdout": "", "stderr": str(e)}
+
+
+def find_orphan_entity_concept(manifest: dict) -> tuple[list, list]:
+    """Find entity and concept pages that no source page references."""
+    # Collect [[wikilinks]] from the content of every source page
+    import re
+    wikilink_pattern = re.compile(r"\[\[([^\]]+)\]\]")
+
+    sources_dir = WIKI_DIR / "sources"
+    referenced_entities = set()
+    referenced_concepts = set()
+
+    if sources_dir.exists():
+        for src in sources_dir.glob("*.md"):
+            content = src.read_text(encoding="utf-8")
+            for link in wikilink_pattern.findall(content):
+                name = link.strip()
+                if name.startswith("entities/"):
+                    referenced_entities.add(Path(name).stem)
+                elif name.startswith("concepts/"):
+                    referenced_concepts.add(Path(name).stem)
+                elif "/" not in name:
+                    # Bare wikilink: could be an entity or a concept
+                    referenced_entities.add(name)
+                    referenced_concepts.add(name)
+
+    # Check the entities directory
+    orphan_entities = []
+    entities_dir = WIKI_DIR / "entities"
+    if entities_dir.exists():
+        for f in entities_dir.glob("*.md"):
+            if f.stem not in referenced_entities:
+                orphan_entities.append(f.name)
+
+    # Check the concepts directory
+    orphan_concepts = []
+    concepts_dir = WIKI_DIR / "concepts"
+    if concepts_dir.exists():
+        for f in concepts_dir.glob("*.md"):
+            if f.stem not in referenced_concepts:
+                orphan_concepts.append(f.name)
+
+    return orphan_entities, orphan_concepts
+
+
+# ─── Core sync logic ───────────────────────────────────────────────
+
+def check_changes(manifest: dict, raw_files: dict) -> dict:
+    """Compare the manifest against the actual raw files and return the changes."""
+    changes = {"new": [], "updated": [], "deleted": [], "unchanged": []}
+    manifest_files = manifest.get("files", {})
+
+    # Walk the current raw files
+    for rel_path, info in raw_files.items():
+        if rel_path not in manifest_files:
+            changes["new"].append({"rel_path": rel_path, **info})
+        elif info["hash"] != manifest_files[rel_path]["hash"]:
+            changes["updated"].append({
+                "rel_path": rel_path,
+                "old_hash": manifest_files[rel_path]["hash"],
+                **info,
+            })
+        else:
+            changes["unchanged"].append(rel_path)
+
+    # Walk the manifest to find deleted files
+    for rel_path in manifest_files:
+        abs_path = REPO_ROOT / rel_path
+        if not abs_path.exists():
+            changes["deleted"].append({
+                "rel_path": rel_path,
+                "slug": manifest_files[rel_path].get("slug", build_slug_from_path(rel_path)),
+                "source_path": manifest_files[rel_path].get("source_path"),
+            })
+
+    return changes
+
+
+def run_sync(dry_run: bool = False, verbose: bool = False):
+    print(f"\n{bold('=== Wiki Sync')}\n")
+    print(f"  Date:    {datetime.now().strftime('%Y-%m-%d %H:%M')}")
+    print(f"  Raw:     {REPO_ROOT / 'raw'}")
+    print(f"  Wiki:    {WIKI_DIR}")
+    print(f"  Mode:    {'DRY-RUN (preview only)' if dry_run else 'LIVE SYNC'}")
+    print()
+
+    # Step 1: load manifest
+    manifest = load_manifest()
+    log("manifest.json loaded", "info")
+
+    # Step 2: scan raw/
+    raw_files = scan_raw()
+    log(f"raw/ scan: {len(raw_files)} .md files found", "info")
+
+    # Step 3: check changes
+    changes = check_changes(manifest, raw_files)
+    total_changes = len(changes["new"]) + len(changes["updated"]) + len(changes["deleted"])
+
+    if total_changes == 0:
+        log("No changes detected — wiki is up to date.", "success")
+        return
+
+    # ─── Report ───
+    print(f"\n{bold('--- Changes ---')}")
+    print(f"  {green('+')} New:     {len(changes['new'])}")
+    print(f"  {yellow('~')} Updated: {len(changes['updated'])}")
+    print(f"  {red('-')} Deleted: {len(changes['deleted'])}")
+
+    if verbose or not dry_run:
+        if changes["new"]:
+            print(f"\n  {bold('New Files:')}")
+            for f in changes["new"]:
+                log(f"{green('[+]')} {f['rel_path']}", "normal")
+
+        if changes["updated"]:
+            print(f"\n  {bold('Updated Files:')}")
+            for f in changes["updated"]:
+                log(f"{yellow('[~]')} {f['rel_path']} (hash changed)", "normal")
+
+        if changes["deleted"]:
+            print(f"\n  {bold('Deleted Files:')}")
+            for f in changes["deleted"]:
+                log(f"{red('[-]')} {f['rel_path']}", "normal")
+
+    if dry_run:
+        log("\nDry-run complete. Run with --sync to apply.", "warn")
+        return
+
+    # ─── Apply Sync ───
+    print(f"\n{bold('--- Applying Sync ---')}")
+
+    updated_manifest = manifest.copy()
+    updated_manifest["files"] = manifest.get("files", {}).copy()
+
+    # ① New → ingest
+    for f in changes["new"]:
+        rel_path = f["rel_path"]
+        abs_path = f["abs_path"]
+        slug = build_slug_from_path(rel_path)
+        print(f"\n  {green('[+]')} New: {rel_path}")
+        print(f"      slug: {slug}")
+
+        result = call_ingest(abs_path, slug)
+        if result["success"]:
+            log(f"Ingested: {slug}.md", "success")
+            updated_manifest["files"][rel_path] = {
+                "hash": f["hash"],
+                "modified": f["modified"],
+                "slug": slug,
+                "source_path": f"wiki/sources/{slug}.md",
+                "ingested": True,
+                "ingested_at": iso_now(),
+            }
+        else:
+            log(f"Failed: {result['stderr'][:200]}", "error")
+            # Record the entry anyway (avoids repeated ingest attempts)
+            updated_manifest["files"][rel_path] = {
+                "hash": f["hash"],
+                "modified": f["modified"],
+                "slug": slug,
+                "source_path": f"wiki/sources/{slug}.md",
+                "ingested": False,
+                "ingested_at": None,
+                "error": result["stderr"][:500],
+            }
+
+    # ② Modified → re-ingest
+    for f in changes["updated"]:
+        rel_path = f["rel_path"]
+        abs_path = f["abs_path"]
+        old_slug = manifest["files"].get(rel_path, {}).get("slug") or build_slug_from_path(rel_path)
+        print(f"\n  {yellow('[~]')} Updated: {rel_path}")
+
+        result = call_ingest(abs_path, old_slug)
+        if result["success"]:
+            log(f"Re-ingested: {old_slug}.md", "success")
+            updated_manifest["files"][rel_path] = {
+                **updated_manifest["files"].get(rel_path, {}),
+                "hash": f["hash"],
+                "modified": f["modified"],
+                "slug": old_slug,
+                "source_path": f"wiki/sources/{old_slug}.md",
+                "ingested": True,
+                "ingested_at": iso_now(),
+            }
+        else:
+            log(f"Failed: {result['stderr'][:200]}", "error")
+
+    # ③ Deleted → keep the wiki content and only drop the manifest entry (orphans are kept, as the user requested)
+    for f in changes["deleted"]:
+        rel_path = f["rel_path"]
+        source_path = f.get("source_path")
+        print(f"\n  {red('[-]')} Deleted: {rel_path}")
+        if source_path:
+            # source_path is repo-relative ("wiki/sources/<slug>.md"), so resolve it against REPO_ROOT
+            sp = REPO_ROOT / source_path
+            log(f"      Wiki source kept: {sp}", "warn")
+        # Remove from the manifest (the wiki file itself is not deleted)
+        if rel_path in updated_manifest["files"]:
+            del updated_manifest["files"][rel_path]
+
+    # Step 4: Save manifest
+    save_manifest(updated_manifest)
+    log(f"\nmanifest.json updated ({len(updated_manifest['files'])} entries)", "success")
+
+    # Step 5: Orphan detection
+    orphan_entities, orphan_concepts = find_orphan_entity_concept(updated_manifest)
+    if orphan_entities or orphan_concepts:
+        print(f"\n{bold('--- Orphan Report (kept as requested) ---')}")
+        if orphan_entities:
+            print(f"  {bold('Orphan Entities')} ({len(orphan_entities)}):")
+            for e in sorted(orphan_entities):
+                print(f"    {dim('?')} {e}")
+        if orphan_concepts:
+            print(f"  {bold('Orphan Concepts')} ({len(orphan_concepts)}):")
+            for c in sorted(orphan_concepts):
+                print(f"    {dim('?')} {c}")
+        log("\nOrphan pages are kept (not deleted per user request).", "info")
+    else:
+        log("No orphan entity/concept detected.", "success")
+
+    print(f"\n{bold('Done.')}")
+
+
+def run_bootstrap():
+    """Build the manifest from existing wiki sources, skipping files already ingested."""
+    import re
+
+    print(f"\n{bold('=== Wiki Bootstrap')}\n")
+    print(f"  Scanning existing wiki sources to build manifest ...\n")
+
+    sources_dir = WIKI_DIR / "sources"
+    if not sources_dir.exists():
+        print(f"  {red('✗')} No wiki/sources/ directory found. Nothing to bootstrap.")
+        return
+
+    wikilink_pattern = re.compile(r"\[\[?raw/([^\]\s]+\.md)\]?]?", re.IGNORECASE)
+
+    manifest = {"version": 1, "updated_at": iso_now(), "files": {}}
+    raw_dir = (REPO_ROOT / "raw").resolve()  # resolve symlinks to the real path
+    bootstrapped = 0
+    skipped_not_found = 0
+    skipped_no_source_field = 0
+
+    for src in sources_dir.glob("*.md"):
+        content = src.read_text(encoding="utf-8")
+
+        # Try to extract the original path from the "## Source File" section
+        match = wikilink_pattern.search(content)
+        if not match:
+            skipped_no_source_field += 1
+            continue
+
+        # raw_rel looks like "Agent/usecases/xxx.md" (without the raw/ prefix)
+        raw_rel = match.group(1).lstrip("/")
+        # Join onto the resolved raw_dir (follows symlinks)
+        raw_path = raw_dir / raw_rel
+
+        if not raw_path.exists():
+            # The raw file is gone: keep the source page but leave it out of the manifest
+            skipped_not_found += 1
+            continue
+
+        stat = raw_path.stat()
+        file_hash = sha256_file(raw_path)
+        slug = src.stem
+
+        # The manifest key uses the "raw/Agent/xxx.md" form (path relative to REPO_ROOT)
+        manifest_key = f"raw/{raw_rel}"
+        manifest["files"][manifest_key] = {
+            "hash": file_hash,
+            "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
+            "slug": slug,
+            "source_path": f"wiki/sources/{slug}.md",
+            "ingested": True,
+            "ingested_at": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
+        }
+        bootstrapped += 1
+
+    save_manifest(manifest)
+
+    print(f"  {bold('Result:')}")
+    print(f"    {green('✓')} Manifest entries created: {bootstrapped}")
+    print(f"    {yellow('~')} Skipped (source file deleted): {skipped_not_found}")
+    print(f"    {dim('-')} Skipped (no source_file field): {skipped_no_source_field}")
+    print(f"\n  {green('✓')} manifest.json created at: {MANIFEST_FILE}")
+    print(f"\n  Run now: {bold('python tools/sync.py --check')}  to preview new/updated files.\n")
+
+
+def run_check():
+    """Preview changes only; nothing is executed."""
+    manifest = load_manifest()
+    raw_files = scan_raw()
+    changes = check_changes(manifest, raw_files)
+    total = len(changes["new"]) + len(changes["updated"]) + len(changes["deleted"])
+
+    print(f"\n{bold('=== Wiki Sync Check')} (preview mode)\n")
+    print(f"  Raw files:      {len(raw_files)}")
+    print(f"  Manifest entries: {len(manifest.get('files', {}))}")
+    print(f"  {green('+')} New:     {len(changes['new'])}")
+    print(f"  {yellow('~')} Updated: {len(changes['updated'])}")
+    print(f"  {red('-')} Deleted: {len(changes['deleted'])}")
+
+    if total > 0:
+        if changes["new"]:
+            print(f"\n  {bold('New Files:')}")
+            for f in changes["new"]:
+                print(f"    {green('[+]')} {f['rel_path']}")
+        if changes["updated"]:
+            print(f"\n  {bold('Updated Files:')}")
+            for f in changes["updated"]:
+                print(f"    {yellow('[~]')} {f['rel_path']} (was {f['old_hash']}, now {f['hash']})")
+        if changes["deleted"]:
+            print(f"\n  {bold('Deleted Files:')}")
+            for f in changes["deleted"]:
+                print(f"    {red('[-]')} {f['rel_path']}")
+    else:
+        print(f"\n  {green('No changes — wiki is in sync.')}")
+
+    print()
+
+
+def run_rebuild():
+    """Rebuild wiki/index.md from the manifest (fallback option)."""
+    manifest = load_manifest()
+    print(f"\n{bold('=== Wiki Rebuild from Manifest')}\n")
+    print(f"  Manifest entries: {len(manifest.get('files', {}))}")
+    print("  Rebuilding index.md ...\n")
+
+    index_lines = [
+        "# Wiki Index\n",
+        "\n## Overview\n",
+        "- [Overview](overview.md) — living synthesis\n",
+        "\n## Sources\n",
+    ]
+
+    files = manifest.get("files", {})
+    # Sort by modified time, newest first
+    sorted_files = sorted(files.items(), key=lambda x: x[1].get("modified", ""), reverse=True)
+
+    for rel_path, info in sorted_files:
+        slug = info.get("slug", build_slug_from_path(rel_path))
+        source_md_path = WIKI_DIR / "sources" / f"{slug}.md"
+        if source_md_path.exists():
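+            # Use the first Markdown H1 as the title; fall back to the slug (e.g. when the page opens with front matter)
+            text = source_md_path.read_text(encoding="utf-8")
+            title = next((ln[2:].strip() for ln in text.splitlines() if ln.startswith("# ")), slug)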
+            index_lines.append(f"- [{title}](sources/{slug}.md)\n")
+        else:
+            index_lines.append(f"- [{slug}](sources/{slug}.md) — (source missing)\n")
+
+    index_lines.append("\n## Entities\n\n## Concepts\n\n## Syntheses\n")
+
+    index_file = WIKI_DIR / "index.md"
+    index_file.write_text("".join(index_lines), encoding="utf-8")
+    print(f"  {green('✓')} index.md rebuilt with {len(sorted_files)} sources")
+
+    # Orphan report
+    orphan_entities, orphan_concepts = find_orphan_entity_concept(manifest)
+    if orphan_entities:
+        print(f"  {dim('?')} Orphan entities: {len(orphan_entities)}")
+    if orphan_concepts:
+        print(f"  {dim('?')} Orphan concepts: {len(orphan_concepts)}")
+
+    print("\nDone.")
+
+
+# ─── CLI entry point ────────────────────────────────────────
+
+if __name__ == "__main__":
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        description="Wiki ↔ Raw three-way sync tool",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument(
+        "--check",
+        action="store_true",
+        help="Preview changes without syncing",
+    )
+    parser.add_argument(
+        "--sync",
+        action="store_true",
+        help="Run a full sync (new/updated/deleted files plus orphan detection)",
+    )
+    parser.add_argument(
+        "--rebuild",
+        action="store_true",
+        help="Rebuild wiki/index.md from the manifest (fallback option)",
+    )
+    parser.add_argument(
+        "--bootstrap",
+        action="store_true",
+        help="Generate the manifest from existing wiki sources (first-time setup; already-ingested files are skipped)",
+    )
+    parser.add_argument(
+        "--verbose", "-v",
+        action="store_true",
+        help="Verbose output",
+    )
+
+    args = parser.parse_args()
+
+    if args.bootstrap:
+        run_bootstrap()
+    elif args.rebuild:
+        run_rebuild()
+    elif args.check:
+        run_check()
+    elif args.sync:
+        run_sync(dry_run=False, verbose=args.verbose)
+    else:
+        parser.print_help()
+        print("\nExamples:")
+        print("  python tools/sync.py --check       # preview changes")
+        print("  python tools/sync.py --sync        # run the sync")
+        print("  python tools/sync.py --sync -v     # verbose mode")
+        print("  python tools/sync.py --rebuild     # rebuild the index")
+        print("  python tools/sync.py --bootstrap   # first run: generate the manifest from wiki sources")
--- a/1
+++ b/1
@@ -0,0 +1 @@
+/Users/weishen/Workspace/nexus/wiki
--- a/wiki/index.md
+++ b/wiki/index.md
@@ -1,14 +0,0 @@
-# Wiki Index
-
-This file is maintained by the LLM. Updated on every ingest.
-
-## Overview
-- [Overview](overview.md) — living synthesis across all sources
-
-## Sources
-
-## Entities
-
-## Concepts
-
-## Syntheses
--- a/wiki/log.md
+++ b/wiki/log.md
@@ -1,9 +0,0 @@
-# Wiki Log
-
-Append-only chronological record of all operations.
-
-Format: `## [YYYY-MM-DD] <operation> | <title>`
-
-Parse recent entries: `grep "^## \[" wiki/log.md | tail -10`
-
----
--- a/wiki/overview.md
+++ b/wiki/overview.md
@@ -1,17 +0,0 @@
----
-title: "Overview"
-type: synthesis
-tags: []
-sources: []
-last_updated: ""
----
-
-# Overview
-
-*This page is maintained by the LLM. It is updated on every ingest to reflect the current synthesis across all sources.*
-
-No sources ingested yet. Add your first source with:
-
-```bash
-python tools/ingest.py raw/your-source.md
-```