nexus/wiki/sources/semantic-memory-search.md at 8c909c9c0890da1f775aba2c27583e50916074d7

weishen e823c78a9b Auto-sync: 2026-04-23 00:02

2026-04-23 00:02:55 +08:00


Source File

Agent/usecases/semantic-memory-search

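Summary

Core topic: adding vector-based semantic search to OpenClaw's Markdown memory files
Problem domain: OpenClaw stores memory as plain Markdown; as the files accumulate they become impossible to search effectively, because grep only matches keywords and has no semantic understanding
Method/mechanism: use the memsearch library (backed by the Milvus vector database) to build hybrid search (dense vectors + BM25) with RRF reranking; SHA-256 content hashes enable incremental indexing; a file watcher rebuilds the index automatically
Conclusion/value: a natural-language question (e.g. "Which caching solution did we choose?") finds the relevant content without recalling the exact wording; a local mode works without an API key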
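
The RRF step named in the summary can be sketched in a few lines. This is an illustrative sketch of Reciprocal Rank Fusion in general, not memsearch's own code: the function name, the example file paths, and the constant `k=60` (a commonly used default for RRF) are assumptions, and the source does not say which value memsearch uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and the per-list scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
dense_hits = ["notes/cache.md", "notes/db.md", "notes/auth.md"]
bm25_hits = ["notes/redis.md", "notes/cache.md", "notes/auth.md"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that both retrievers return (here `notes/cache.md` and `notes/auth.md`) rise above documents that only one of them returns, which is the behavior the hybrid approach relies on.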

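Key Claims

As the OpenClaw memory store grows, plain Markdown cannot be searched semantically; users need to find past decisions by meaning rather than by keyword
Hybrid search (dense vectors + BM25) combined with RRF reranking captures both semantic similarity and exact keyword matches, and outperforms pure vector search
SHA-256 content hashing ensures that only new or changed content is embedded, which avoids repeated API calls and saves cost
The Markdown files are the single source of truth; the vector index is only a derived cache that can be rebuilt at any time with memsearch index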
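
The last two claims can be illustrated with a minimal sketch of hash-based incremental indexing. It assumes chunks are plain strings and that the index remembers the set of hashes it already holds; `chunk_hash` and `plan_incremental_index` are hypothetical helper names, not part of the memsearch API.

```python
import hashlib


def chunk_hash(text: str) -> str:
    """Return the SHA-256 hex digest that identifies a chunk by its content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental_index(chunks, indexed_hashes):
    """Decide what an index run has to do.

    Returns (to_embed, stale): the chunks whose hash is not indexed yet,
    and the indexed hashes that no longer match any current chunk.
    """
    current = {chunk_hash(c): c for c in chunks}
    to_embed = [c for h, c in current.items() if h not in indexed_hashes]
    stale = set(indexed_hashes) - set(current)
    return to_embed, stale


# First run indexed these two chunks.
old_chunks = ["We chose Redis for caching.", "Auth uses OAuth2."]
indexed = {chunk_hash(c) for c in old_chunks}

# Later, one chunk was edited in the Markdown file.
new_chunks = ["We chose Redis for caching.", "Auth uses OIDC."]
to_embed, stale = plan_incremental_index(new_chunks, indexed)
print(to_embed)    # only the edited chunk needs an embedding call
print(len(stale))  # the old version of that chunk can be dropped
```

Because the plan is computed purely from the Markdown content, the index can be thrown away and rebuilt from an empty hash set without touching the source files.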

Key Quotes

"Markdown stays the source of truth. The vector index is just a derived cache — you can rebuild it anytime with memsearch index. Your memory files are never modified." (Core idea: the original documents are immutable.)
"Hybrid search beats pure vector search. Combining semantic similarity (dense vectors) with keyword matching (BM25) via Reciprocal Rank Fusion catches both meaning-based and exact-match queries." (Why hybrid search is superior.)
"Smart dedup saves money. Each chunk is identified by a SHA-256 content hash. Re-running index only embeds new or changed content, so you can run it as often as you like without wasting embedding API calls." (Incremental indexing saves cost.)

Key Concepts

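Semantic Memory Search: searching memory files by meaning through vector embeddings, rather than by keyword matching alone
Hybrid Search: a retrieval strategy that combines dense vectors (semantic similarity) with BM25 (exact keyword matching)
Reciprocal Rank Fusion (RRF): merges and reranks the results of multiple retrievers by fusing their ranks, improving search quality
Content Hashing: identifies each content chunk by its SHA-256 hash, so only new or changed content is re-embedded
File Watcher: monitors the memory files for changes and automatically triggers an incremental re-index, keeping the index up to date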
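
The file-watcher concept can be approximated with a simple polling loop. The source does not describe how memsearch implements its watcher, so the following is only a sketch under stated assumptions: it polls rather than using OS-level file events, and `snapshot`, `diff_snapshots`, and `watch` are hypothetical names.

```python
import hashlib
import tempfile
import time
from pathlib import Path


def snapshot(root):
    """Map each Markdown file under root to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.md"))
    }


def diff_snapshots(before, after):
    """Return (changed, removed): new or edited files, and deleted files."""
    changed = sorted(p for p in after if before.get(p) != after[p])
    removed = sorted(p for p in before if p not in after)
    return changed, removed


def watch(root, on_change, interval=2.0):
    """Poll forever and call on_change(changed, removed) when files differ."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        changed, removed = diff_snapshots(previous, current)
        if changed or removed:
            on_change(changed, removed)
        previous = current


# Demonstrate one polling step on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp)
    (memory / "a.md").write_text("We chose Redis.")
    (memory / "old.md").write_text("Obsolete note.")
    before = snapshot(memory)

    (memory / "a.md").write_text("We chose Memcached.")  # edited
    (memory / "b.md").write_text("New decision.")        # added
    (memory / "old.md").unlink()                          # deleted
    changed, removed = diff_snapshots(before, snapshot(memory))

print(changed, removed)
```

In a real setup the `on_change` callback would trigger the incremental re-index for the changed files and remove index entries for the deleted ones.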

Key Entities

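memsearch: an open-source vector semantic search CLI/library from ZillizTech that gives OpenClaw memory semantic search, built on the Milvus vector database
Milvus: an open-source vector database; the vector storage and retrieval engine behind memsearch
OpenClaw: a multi-agent framework with a built-in Markdown memory system; the host application framework for this use case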

Connections

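OpenClaw ← extends ← Semantic Memory Search: this use case layers vector semantic search on top of OpenClaw's plain Markdown memory
Knowledge-Base-RAG ← related_to ← Semantic Memory Search: both involve retrieval over vector embeddings and cover different scenarios within the RAG technology stack
Second Brain ← related_to ← Semantic Memory Search: a second brain's memory persistence and semantic retrieval complement each other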

Contradictions

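No conflict with Knowledge-Base-RAG; the two are different implementations within the same technology stack: Knowledge Base RAG focuses on ingesting URLs delivered via Telegram/Slack, while this use case focuses on semantic indexing of existing Markdown files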
