Update nexus: fix conflicts and sync local changes
@@ -1,46 +1,46 @@
---
title: "Semantic Memory Search"
type: source
tags: [memory, semantic-search, vector-db, openclaw]
date: 2026-04-22
---

## Source File
- [[Agent/usecases/semantic-memory-search]]
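
## Summary
- Core topic: adding vector-based semantic search to OpenClaw's Markdown memory files.
- Problem domain: OpenClaw stores memory as plain Markdown. As the files accumulate they become hard to search, because grep only matches keywords and cannot capture meaning.
- Method/mechanism: the memsearch library (backed by the Milvus vector database) provides hybrid search (dense vectors + BM25) with RRF reranking; SHA-256 content hashes enable incremental indexing; a file watcher rebuilds the index automatically.
- Conclusion/value: a natural-language question (e.g. "Which caching solution did we pick?") finds the relevant content without recalling the exact wording; a local mode works without an API key.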
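
## Key Claims
- As OpenClaw's memory store grows, plain Markdown offers no semantic retrieval; users need to find past decisions by meaning rather than by keyword.
- Hybrid search (dense vectors + BM25) with RRF reranking captures both semantic similarity and exact keyword matches, and outperforms pure vector search.
- SHA-256 content hashes ensure that only new or changed content is embedded, which avoids repeated API calls and saves cost.
- The Markdown files are the single source of truth; the vector index is only a derived cache that can be rebuilt at any time with `memsearch index`.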
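
The hybrid-search claim rests on Reciprocal Rank Fusion, which merges ranked lists using only each item's position in each list. The sketch below is a generic illustration of that technique, not memsearch's actual code: the function name `rrf_fuse`, the file names, and the constant `k = 60` (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. `k` is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to the ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense-vector search and a BM25 search.
dense_hits = ["caching-decision.md", "redis-notes.md", "standup.md"]
bm25_hits = ["redis-notes.md", "infra-costs.md", "caching-decision.md"]

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still appears in the fused result.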

## Key Quotes
> "Markdown stays the source of truth. The vector index is just a derived cache — you can rebuild it anytime with `memsearch index`. Your memory files are never modified." (Core idea: the original documents are immutable.)
> "Hybrid search beats pure vector search. Combining semantic similarity (dense vectors) with keyword matching (BM25) via Reciprocal Rank Fusion catches both meaning-based and exact-match queries." (Why hybrid search is superior.)
> "Smart dedup saves money. Each chunk is identified by a SHA-256 content hash. Re-running `index` only embeds new or changed content, so you can run it as often as you like without wasting embedding API calls." (Incremental indexing saves cost.)
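
## Key Concepts
- [[Semantic Memory Search]]: semantic search over memory files via vector embeddings, rather than keyword matching alone.
- [[Hybrid Search]]: a retrieval strategy that combines dense vectors (semantic similarity) with BM25 (exact keyword matching).
- [[Reciprocal Rank Fusion (RRF)]]: merges multiple retrieval result lists by fusing their ranks, improving search quality.
- [[Content Hashing]]: identifies content chunks by SHA-256 hash so that only new or changed content is re-embedded.
- [[File Watcher]]: watches memory files for changes and triggers incremental reindexing automatically, keeping the index up to date.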
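
Content hashing is what makes reindexing cheap: a chunk whose SHA-256 digest is already known does not need to be embedded again. The following is a minimal sketch of that idea under stated assumptions. The helper names `chunk_hash` and `plan_embedding` and the in-memory set of known hashes are hypothetical; the source does not describe how memsearch chunks files or stores its hashes.

```python
import hashlib


def chunk_hash(text: str) -> str:
    """Return the SHA-256 hex digest of a chunk's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_embedding(chunks, indexed_hashes):
    """Return (hash, text) pairs for chunks that are not indexed yet.

    `indexed_hashes` stands in for whatever the real index stores;
    here it is just a set of hex digests (an assumption).
    """
    seen = set(indexed_hashes)
    pending = []
    for text in chunks:
        digest = chunk_hash(text)
        if digest not in seen:
            pending.append((digest, text))
            seen.add(digest)  # also skips duplicates within one run
    return pending


# First run: nothing is indexed, so every chunk is pending.
first_run = ["We chose Redis for caching.", "Deploys happen on Fridays."]
indexed = {digest for digest, _ in plan_embedding(first_run, set())}

# Second run: one chunk changed, so only that chunk needs embedding.
second_run = ["We chose Redis for caching.", "Deploys happen on Tuesdays."]
todo = plan_embedding(second_run, indexed)
print([text for _, text in todo])
```

A file watcher can call the same planning step on every change event, since unchanged chunks cost nothing beyond a hash computation.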
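
## Key Entities
- [[memsearch]]: open-source vector semantic search CLI/library from ZillizTech, built on the Milvus vector database, that provides semantic search for OpenClaw memory.
- [[Milvus]]: open-source vector database backend; memsearch's vector storage and retrieval engine.
- [[OpenClaw]]: multi-agent framework with a built-in Markdown memory system; the host application framework for this use case.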

## Connections
- [[OpenClaw]] ← extends ← [[Semantic Memory Search]]: this use case layers vector semantic search on top of OpenClaw's plain-Markdown memory.
- [[Knowledge-Base-RAG]] ← related_to ← [[Semantic Memory Search]]: both involve vector-embedding retrieval and are different scenarios within the RAG stack.
- [[Second Brain]] ← related_to ← [[Semantic Memory Search]]: a second brain's memory persistence and semantic retrieval complement each other.
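
## Contradictions
- No conflict with [[Knowledge-Base-RAG]]; the two are different implementations within the same stack: Knowledge Base RAG focuses on ingesting URLs dropped into Telegram/Slack, while this use case focuses on semantic indexing of existing Markdown files.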