Update nexus: fix conflicts and sync local changes

2026-04-26 12:06:50 +08:00
parent 191797c01b
commit f09834b5a5
2443 changed files with 254323 additions and 255154 deletions
--- a/wiki/sources/semantic-memory-search.md
+++ b/wiki/sources/semantic-memory-search.md
@@ -1,46 +1,46 @@
----
-title: "Semantic Memory Search"
-type: source
-tags: [memory, semantic-search, vector-db, openclaw]
-date: 2026-04-22
----
-
-## Source File
-- [[Agent/usecases/semantic-memory-search]]
-
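-## Summary
-- Core topic: adding vector-based semantic search to OpenClaw's Markdown memory files
-- Problem domain: OpenClaw stores memory as plain Markdown, which becomes impossible to search effectively as it accumulates; grep only matches keywords and cannot capture meaning
-- Method/mechanism: uses the memsearch library (backed by the Milvus vector database) for hybrid search (dense vectors + BM25) with RRF reranking; SHA-256 content hashes enable incremental indexing; a file watcher re-indexes automatically
-- Conclusion/value: a natural-language question (e.g. "Which caching solution did we choose?") finds the relevant content without recalling the exact wording; a local mode works without an API key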
-
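-## Key Claims
-- As the OpenClaw memory store grows, plain Markdown cannot be searched semantically; users need to find past decisions by meaning rather than by keyword
-- Hybrid search (dense vectors + BM25) with RRF reranking captures both semantic similarity and exact keyword matches, outperforming pure vector search
-- SHA-256 content hashes ensure that only new or changed content is embedded, avoiding repeated API calls and reducing cost
-- The Markdown files are the single source of truth; the vector index is only a derived cache that can be rebuilt at any time with `memsearch index`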
-
-## Key Quotes
-> "Markdown stays the source of truth. The vector index is just a derived cache — you can rebuild it anytime with `memsearch index`. Your memory files are never modified." (Core idea: the original documents are immutable.)
-> "Hybrid search beats pure vector search. Combining semantic similarity (dense vectors) with keyword matching (BM25) via Reciprocal Rank Fusion catches both meaning-based and exact-match queries." (The advantage of hybrid search.)
-> "Smart dedup saves money. Each chunk is identified by a SHA-256 content hash. Re-running `index` only embeds new or changed content, so you can run it as often as you like without wasting embedding API calls." (Incremental indexing reduces cost.)
-
-## Key Concepts
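-- [[Semantic Memory Search]]: semantic search over memory files via vector embeddings, rather than keyword matching alone
-- [[Hybrid Search]]: a hybrid retrieval strategy combining dense vectors (semantic similarity) with BM25 (exact keyword matching)
-- [[Reciprocal Rank Fusion (RRF)]]: merges multiple retrieval result lists by fusing their ranks, improving search quality
-- [[Content Hashing]]: identifies content chunks by SHA-256 hash so that only new or changed content is re-embedded
-- [[File Watcher]]: watches memory files for changes and automatically triggers incremental re-indexing, keeping the index up to date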
-
-## Key Entities
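-- [[memsearch]]: an open-source vector semantic search CLI/library from ZillizTech that provides semantic search for OpenClaw memory, built on the Milvus vector database
-- [[Milvus]]: open-source vector database backend; memsearch's vector storage and retrieval engine
-- [[OpenClaw]]: a multi-agent framework with a built-in Markdown memory system; the host application framework for this use case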
-
-## Connections
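-- [[OpenClaw]] ← extends ← [[Semantic Memory Search]]: this use case layers vector semantic search on top of OpenClaw's plain Markdown memory
-- [[Knowledge-Base-RAG]] ← related_to ← [[Semantic Memory Search]]: both involve vector embedding retrieval and are different scenarios within the RAG stack
-- [[Second Brain]] ← related_to ← [[Semantic Memory Search]]: the Second Brain's memory persistence and semantic retrieval complement each other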
-
-## Contradictions
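-- No conflict with [[Knowledge-Base-RAG]]; the two are different implementations within the same stack: Knowledge Base RAG focuses on ingesting URLs dropped into Telegram/Slack, while this use case focuses on semantic indexing of existing Markdown files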
+---
+title: "Semantic Memory Search"
+type: source
+tags: [memory, semantic-search, vector-db, openclaw]
+date: 2026-04-22
+---
+
+## Source File
+- [[Agent/usecases/semantic-memory-search]]
+
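+## Summary
+- Core topic: adding vector-based semantic search to OpenClaw's Markdown memory files
+- Problem domain: OpenClaw stores memory as plain Markdown, which becomes impossible to search effectively as it accumulates; grep only matches keywords and cannot capture meaning
+- Method/mechanism: uses the memsearch library (backed by the Milvus vector database) for hybrid search (dense vectors + BM25) with RRF reranking; SHA-256 content hashes enable incremental indexing; a file watcher re-indexes automatically
+- Conclusion/value: a natural-language question (e.g. "Which caching solution did we choose?") finds the relevant content without recalling the exact wording; a local mode works without an API key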
+
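+## Key Claims
+- As the OpenClaw memory store grows, plain Markdown cannot be searched semantically; users need to find past decisions by meaning rather than by keyword
+- Hybrid search (dense vectors + BM25) with RRF reranking captures both semantic similarity and exact keyword matches, outperforming pure vector search
+- SHA-256 content hashes ensure that only new or changed content is embedded, avoiding repeated API calls and reducing cost
+- The Markdown files are the single source of truth; the vector index is only a derived cache that can be rebuilt at any time with `memsearch index`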
+
+## Key Quotes
+> "Markdown stays the source of truth. The vector index is just a derived cache — you can rebuild it anytime with `memsearch index`. Your memory files are never modified." (Core idea: the original documents are immutable.)
+> "Hybrid search beats pure vector search. Combining semantic similarity (dense vectors) with keyword matching (BM25) via Reciprocal Rank Fusion catches both meaning-based and exact-match queries." (The advantage of hybrid search.)
+> "Smart dedup saves money. Each chunk is identified by a SHA-256 content hash. Re-running `index` only embeds new or changed content, so you can run it as often as you like without wasting embedding API calls." (Incremental indexing reduces cost.)
+
+## Key Concepts
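+- [[Semantic Memory Search]]: semantic search over memory files via vector embeddings, rather than keyword matching alone
+- [[Hybrid Search]]: a hybrid retrieval strategy combining dense vectors (semantic similarity) with BM25 (exact keyword matching)
+- [[Reciprocal Rank Fusion (RRF)]]: merges multiple retrieval result lists by fusing their ranks, improving search quality
+- [[Content Hashing]]: identifies content chunks by SHA-256 hash so that only new or changed content is re-embedded
+- [[File Watcher]]: watches memory files for changes and automatically triggers incremental re-indexing, keeping the index up to date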
+
+## Key Entities
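+- [[memsearch]]: an open-source vector semantic search CLI/library from ZillizTech that provides semantic search for OpenClaw memory, built on the Milvus vector database
+- [[Milvus]]: open-source vector database backend; memsearch's vector storage and retrieval engine
+- [[OpenClaw]]: a multi-agent framework with a built-in Markdown memory system; the host application framework for this use case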
+
+## Connections
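+- [[OpenClaw]] ← extends ← [[Semantic Memory Search]]: this use case layers vector semantic search on top of OpenClaw's plain Markdown memory
+- [[Knowledge-Base-RAG]] ← related_to ← [[Semantic Memory Search]]: both involve vector embedding retrieval and are different scenarios within the RAG stack
+- [[Second Brain]] ← related_to ← [[Semantic Memory Search]]: the Second Brain's memory persistence and semantic retrieval complement each other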
+
+## Contradictions
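+- No conflict with [[Knowledge-Base-RAG]]; the two are different implementations within the same stack: Knowledge Base RAG focuses on ingesting URLs dropped into Telegram/Slack, while this use case focuses on semantic indexing of existing Markdown files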
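
The note describes two mechanisms that are easy to illustrate in isolation: SHA-256 content hashing for incremental indexing, and Reciprocal Rank Fusion for merging dense-vector and BM25 rankings. The following is a minimal sketch of both ideas, not memsearch's actual implementation. The function names `chunk_id`, `chunks_to_embed`, and `rrf_fuse` are hypothetical, and the constant `k=60` is an assumption (a commonly used default for RRF), not a value taken from the source.

```python
import hashlib


def chunk_id(text: str) -> str:
    """Identify a chunk by the SHA-256 hex digest of its content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunks_to_embed(chunks, indexed_ids):
    """Return only the chunks whose content hash is not already indexed.

    `indexed_ids` is a set of hashes from a previous indexing run
    (hypothetical; a real index would persist these).
    """
    return [c for c in chunks if chunk_id(c) not in indexed_ids]


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of document ids.

    Each document scores sum(1 / (k + rank)) across the lists it appears
    in, with rank starting at 1. Ties are broken by document id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Incremental indexing: only the unseen chunk needs embedding.
    already_indexed = {chunk_id("We chose Redis for caching.")}
    pending = chunks_to_embed(
        ["We chose Redis for caching.", "Deploys run on Fridays."],
        already_indexed,
    )
    print(pending)

    # Hybrid search: fuse a dense-vector ranking with a BM25 ranking.
    dense = ["a", "b", "c"]
    bm25 = ["b", "d", "a"]
    print(rrf_fuse([dense, bm25]))  # ['b', 'a', 'd', 'c']
```

Because RRF works on ranks rather than raw scores, the dense and BM25 scores never need to be normalized against each other, which is one reason it is a common choice for hybrid retrieval.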