Batch 9: Obsidian plugins / open-source AI alternatives / Coze training / TK shipping labels / Ubuntu proxy setup
- Sources: 5 new documents
- Concepts: ProxyChains, SOCKS5 proxy, Docker daemon proxy
- Index: updated to Batch 9
- Cumulative sources: 108/182
26
wiki/concepts/增量索引.md
Normal file
@@ -0,0 +1,26 @@
---
title: "Incremental Indexing (增量索引)"
type: concept
tags: [indexing, efficiency, vector-search]
date: 2026-04-16
---

## Definition
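Incremental indexing uses a content hash (SHA-256) to identify unchanged files and rebuilds the index only for files that are new or whose content has changed, so unchanged content is never recomputed.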
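A minimal sketch of the comparison described above, assuming the previous run stored a mapping from file path to SHA-256 digest. The names `sha256_hex`, `classify`, `current`, and `previous` are hypothetical and only illustrate the idea; file contents are held in memory here instead of being read from disk.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify(current: dict[str, bytes], previous: dict[str, str]) -> dict[str, list[str]]:
    """Split paths into new, changed, and unchanged by comparing content hashes.

    current  maps path -> file content (in-memory stand-in for disk reads)
    previous maps path -> digest recorded by the last indexing run
    """
    result: dict[str, list[str]] = {"new": [], "changed": [], "unchanged": []}
    for path, data in sorted(current.items()):
        digest = sha256_hex(data)
        if path not in previous:
            result["new"].append(path)
        elif previous[path] != digest:
            result["changed"].append(path)
        else:
            result["unchanged"].append(path)
    return result


# Illustrative data: a.md is untouched, b.md was edited, c.md is new.
previous = {"a.md": sha256_hex(b"alpha"), "b.md": sha256_hex(b"beta")}
current = {"a.md": b"alpha", "b.md": b"beta v2", "c.md": b"gamma"}
report = classify(current, previous)
print(report)
```

Only the paths under `new` and `changed` would need to be re-embedded; everything under `unchanged` can be skipped.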

## Why It Matters
- Embedding API calls are expensive; incremental indexing can cut API costs by 90% or more when most files are unchanged between runs
- A file watcher triggers incremental indexing in real time, keeping the index up to date
- Zero waste: every token is spent on content that actually changed
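To make the cost point concrete, here is a rough estimate that assumes embedding cost is proportional to the number of tokens sent to the API. The function name `embedding_savings` and the token counts are made-up illustrations, not measurements.

```python
def embedding_savings(total_tokens: int, changed_tokens: int) -> float:
    """Fraction of embedding tokens saved compared with re-embedding everything."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if not 0 <= changed_tokens <= total_tokens:
        raise ValueError("changed_tokens must be between 0 and total_tokens")
    return 1 - changed_tokens / total_tokens


# Hypothetical corpus: 1,000,000 tokens in total, 50,000 of them in changed files.
savings = embedding_savings(total_tokens=1_000_000, changed_tokens=50_000)
print(f"{savings:.0%}")
```

Under this assumption the saving equals the share of tokens that did not change, so a figure above 90% requires that fewer than 10% of tokens change between runs.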

## Implementation
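```python
import hashlib

# Hash the content, then compare against the record from the last indexing run.
# file_content (bytes), last_index, and embed_and_index are supplied by the caller.
content_hash = hashlib.sha256(file_content).hexdigest()
if content_hash not in last_index:
    embed_and_index(file_content)  # only new or changed content is embedded
```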
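The snippet above leaves `last_index` and `embed_and_index` undefined. One possible way to fill them in is sketched below, assuming the record of the last run is a JSON manifest mapping file name to digest. The manifest name, `run_incremental_index`, and the stub `embed_and_index` (which only records which files it was called for) are all hypothetical; a real version would call an embedding API and write to a vector store.

```python
import hashlib
import json
import tempfile
from pathlib import Path

MANIFEST_NAME = ".index_manifest.json"  # hypothetical manifest file name


def embed_and_index(path: Path, content: bytes, calls: list[str]) -> None:
    """Stub standing in for a real embedding call plus vector-store write."""
    calls.append(path.name)


def run_incremental_index(root: Path, calls: list[str]) -> None:
    """Index only the .md files whose SHA-256 digest differs from the manifest."""
    manifest_path = root / MANIFEST_NAME
    last_index = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new_index = {}
    for path in sorted(root.glob("*.md")):
        content = path.read_bytes()
        digest = hashlib.sha256(content).hexdigest()
        if last_index.get(path.name) != digest:
            embed_and_index(path, content, calls)
        new_index[path.name] = digest
    # Files deleted since the last run drop out, because only existing files are recorded.
    manifest_path.write_text(json.dumps(new_index))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("alpha")
    (root / "b.md").write_text("beta")

    first, second, third = [], [], []
    run_incremental_index(root, first)   # first run: every file is new
    run_incremental_index(root, second)  # second run: nothing changed
    (root / "b.md").write_text("beta v2")
    run_incremental_index(root, third)   # third run: only b.md changed

print(first, second, third)
```

Keying the manifest by file name, instead of keeping a bare set of hashes, is a design choice of this sketch: it lets a changed file be distinguished from a new one and makes deletions visible.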

## Connections
- [[memsearch]]: a concrete implementation of incremental indexing
- [[向量数据库]] (vector database): the storage backend for the incremental index