---
id: rag
title: "RAG"
type: concept
tags: [LLM, retrieval, augmentation]
sources:
- "[[RAG从入门到精通系列1:基础RAG]]"
- "[[LLM Terms Framework]]"
last_updated: 2025-12-18
---

## Definition

RAG (Retrieval-Augmented Generation) is a technique that combines a retrieval system with LLM generation. It addresses the fact that an LLM on its own lacks up-to-date and private data.

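## Three-Step Process

1. **Indexing**: split documents into chunks, convert each chunk into an embedding vector, and store the vectors in a vector database.
2. **Retrieval**: embed the question and retrieve the document chunks that are semantically closest to it.
3. **Generation**: pass the question together with the retrieved chunks to the LLM to generate the answer.
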
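
The three steps above can be sketched end to end in a few lines. This is a minimal illustration, not a reference implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the names `build_index`, `retrieve`, and `build_prompt` are hypothetical. The generation step only assembles the prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words count vector, not a real model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1, indexing: embed every chunk and keep (chunk, vector) pairs in memory."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Step 2, retrieval: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Step 3, generation: assemble the prompt that would be sent to an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {question}"


# Pre-chunked example documents (made up for illustration).
chunks = [
    "Qdrant is a vector database for storing embeddings.",
    "The office cafeteria opens at nine in the morning.",
    "Embeddings map text to numeric vectors.",
]
question = "Which vector database stores embeddings?"

index = build_index(chunks)
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Because the retrieved chunks are pasted into the prompt verbatim, the answer can be traced back to the chunks that produced it. In a real system the toy pieces would be replaced by an embedding model and a vector store, but the control flow stays the same.
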
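## Key Components

- **Embedding**: converts text into numeric vectors.
- **Vector database**: stores and retrieves vector representations (e.g. Qdrant).
- **Document chunking**: splits long documents into chunks that fit the embedding model's input window.
- **Context Window**: the limit on how much context a model can accept (typically 512 to 8192 tokens, depending on the model).
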
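
Document chunking can be illustrated with a small sliding-window splitter. This is a sketch under stated assumptions: whitespace-separated words stand in for the model's real tokens, the function name `chunk_tokens` is hypothetical, and the overlap between neighbouring chunks is a common practice added here for illustration rather than something the source prescribes.

```python
def chunk_tokens(text: str, max_tokens: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace tokens.

    Neighbouring chunks share `overlap` tokens so that a sentence cut at a
    chunk boundary still appears intact in one of the two chunks.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    tokens = text.split()  # assumption: whitespace tokens approximate model tokens
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# A made-up 20-token document: w0 w1 ... w19
doc = " ".join(f"w{i}" for i in range(20))
pieces = chunk_tokens(doc, max_tokens=8, overlap=2)
for piece in pieces:
    print(len(piece.split()), piece)
```

In practice `max_tokens` would be set from the embedding model's input window and counted with that model's own tokenizer, since word counts and token counts can differ noticeably.
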
## Why It Matters

RAG mitigates LLM hallucination by letting the model:
- Access up-to-date information
- Use private data
- Give answers that can be traced back to their sources

## Connections
- [[LLM]] ← uses ← [[RAG]]
- [[RAG]] ← includes ← [[索引]]
- [[RAG]] ← includes ← [[检索]]
- [[RAG]] ← includes ← [[生成]]
- [[RAG]] ← extends ← [[LLM]]