diff --git a/wiki/entities/BAAI.md b/wiki/entities/BAAI.md
new file mode 100644
index 00000000..af0d94ac
--- /dev/null
+++ b/wiki/entities/BAAI.md
@@ -0,0 +1,27 @@
+---
+title: "BAAI"
+type: entity
+tags: [embedding, open-source, chinese-optimized]
+sources: ["RAG从入门到精通系列1:基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: Research Institute (publisher of the BGE embedding model series)
+- **Source**: RAG从入门到精通系列1:基础RAG
+
+## Definition
+BAAI (Beijing Academy of Artificial Intelligence) publishes an open-source series of embedding models (e.g., the BAAI/bge series) that convert text into embedding vectors for use in RAG systems.
+
+## Key Models
+- **BAAI BGE Series**: Chinese-optimized open-source embedding models
+- Models convert text into fixed-length embedding vectors
+- Context window is typically 512-8192 tokens
+
+## Applications
+- [[Embedding]]: BAAI models are used to create embedding vectors
+- [[RAG]]: BAAI embeddings enable semantic search in RAG systems
+
+## Related Concepts
+- [[Embedding]]: The technology BAAI models implement
+- [[向量数据库]]: Where BAAI embeddings are stored
diff --git a/wiki/entities/LangChain.md b/wiki/entities/LangChain.md
new file mode 100644
index 00000000..3b4930c5
--- /dev/null
+++ b/wiki/entities/LangChain.md
@@ -0,0 +1,37 @@
+---
+title: "LangChain"
+type: entity
+tags: [llm, framework, rag, document-loading]
+sources: ["RAG从入门到精通系列1:基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Framework
+- **Source**: RAG从入门到精通系列1:基础RAG
+
+## Definition
+LangChain is a framework for building LLM applications. It provides more than 160 document loaders for pulling data from various sources, plus components for building RAG pipelines.
+
+## Key Features
+- **Document Loaders**: 160+ loaders for various data sources
+- **Chain Abstraction**: Links retrieval and generation components together
+- **Retriever Interface**: Unified abstraction for retrieval components
+- **PromptTemplate**: Template system for constructing prompts
+- **Integration**: Works with various LLMs (Qwen, GPT-4, Claude, etc.) and vector databases (Qdrant, Chroma, Pinecone, etc.)
+
+## Applications in RAG
+- Loading external documents via document loaders
+- Splitting documents into chunks (Splits)
+- Creating retrievers from vector stores
+- Chaining retrieval and generation into a unified pipeline
+- Converting raw AIMessage outputs to clean string results
+
+## Related Concepts
+- [[RAG]]: LangChain is commonly used to build RAG pipelines
+- [[LlamaIndex]]: Alternative framework for building LLM applications
+- [[向量数据库]]: Vector stores integrated with LangChain
+- [[Qdrant]]: Vector database mentioned in RAG tutorials
+
+## Related Entities
+- [[Qwen]]: LLM often used with LangChain
diff --git a/wiki/entities/LangSmith.md b/wiki/entities/LangSmith.md
new file mode 100644
index 00000000..65e3a7af
--- /dev/null
+++ b/wiki/entities/LangSmith.md
@@ -0,0 +1,27 @@
+---
+title: "LangSmith"
+type: entity
+tags: [llm, debugging, monitoring, production]
+sources: ["RAG从入门到精通系列1:基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Platform
+- **Source**: RAG从入门到精通系列1:基础RAG
+
+## Definition
+LangSmith is a platform for building production-grade LLM applications. It lets teams closely monitor and evaluate their applications so they can ship quickly and with confidence.
+
+## Key Capabilities
+- **Tracing**: Track LLM applications through the entire pipeline
+- **Debugging**: Understand LLM calls and other parts of application logic
+- **Evaluation**: Evaluate application performance
+- **Monitoring**: Observe application behavior in production
+
+## Use Case
+LangSmith visualizes, step by step, how the entire RAG pipeline is connected, which is useful for debugging and understanding RAG workflows.
+
+## Related Concepts
+- [[RAG]]: LangSmith can be used to monitor RAG pipelines
+- [[LangChain]]: LangChain integrates with LangSmith for debugging
diff --git a/wiki/entities/LlamaIndex.md b/wiki/entities/LlamaIndex.md
new file mode 100644
index 00000000..df6f68ed
--- /dev/null
+++ b/wiki/entities/LlamaIndex.md
@@ -0,0 +1,23 @@
+---
+title: "LlamaIndex"
+type: entity
+tags: [llm, framework, rag]
+sources: ["RAG从入门到精通系列1:基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Framework
+- **Source**: RAG从入门到精通系列1:基础RAG
+
+## Definition
+LlamaIndex is a framework for building LLM applications around data connectors. The source mentions it alongside LangChain as a way to simplify RAG pipeline construction.
+
+## Relationship with LangChain
+- Both LangChain and LlamaIndex are frameworks for building LLM applications
+- Both can be used to construct RAG pipelines
+- Both provide abstractions for document loading, splitting, embedding, and retrieval
+
+## Related Concepts
+- [[RAG]]: LlamaIndex is used for building RAG pipelines
+- [[LangChain]]: Alternative/companion framework
diff --git a/wiki/entities/Qdrant.md b/wiki/entities/Qdrant.md
new file mode 100644
index 00000000..0a4e978f
--- /dev/null
+++ b/wiki/entities/Qdrant.md
@@ -0,0 +1,35 @@
+---
+title: "Qdrant"
+type: entity
+tags: [vector-database, rag, rust, open-source]
+sources: ["RAG从入门到精通系列1:基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: Vector Database
+- **Source**: RAG从入门到精通系列1:基础RAG
+
+## Definition
+Qdrant is an open-source vector database written in Rust, designed for storing and searching high-dimensional embedding vectors with high performance.
+
+## Key Features
+- **Written in Rust**: High performance and memory safety
+- **Vector Search**: Supports similarity search with various metrics
+- **Open Source**: Freely available for self-hosting
+- **RAG Integration**: Commonly used as the vector store in RAG pipelines
+
+## Technical Details
+- Implements several similarity metrics for comparing embedding vectors
+- Supports Top-k retrieval (returning the k most similar results)
+- Can store metadata alongside vectors
+
+## Related Concepts
+- [[向量数据库]]: Qdrant is a specific vector database implementation
+- [[Embedding]]: Qdrant stores embedding vectors
+- [[RAG]]: Qdrant serves as the storage layer in RAG systems
+- [[LangChain]]: LangChain can integrate with Qdrant as a vector store
+
+## Related Entities
+- [[BAAI]]: Embedding models that feed data into Qdrant
+- [[Qwen]]: LLM that consumes the chunks retrieved from Qdrant
diff --git a/wiki/index.md b/wiki/index.md
index b645fe05..c75f5ee6 100644
--- a/wiki/index.md
+++ b/wiki/index.md
@@ -78,7 +78,8 @@
 - [LLMs、RAG、AI Agent 三个到底什么区别?](sources/LLMs-RAG-AI-Agent-三个到底什么区别.md) — LLM/RAG/AI Agent hierarchy and collaboration patterns
 - [Multi-Agent System Reliability](sources/Multi-Agent-System-Reliability.md) — 4 multi-agent reliability architecture patterns
 - [如何写出完美的Prompt(提示词)?](sources/如何写出完美的Prompt(提示词)?.md) — structured prompt construction methodology and workplace skill development
-- [RAG从入门到精通系列1:基础RAG](sources/RAG从入门到精通系列1:基础RAG.md) — three-stage Indexing-Retrieval-Generation pipeline
+- [RAG从入门到精通系列1:基础RAG](sources/RAG从入门到精通系列1:基础RAG.md) — three-stage Indexing-Retrieval-Generation pipeline; hands-on with Qwen + BAAI + LangChain + Qdrant
+- [RAG从入门到精通系列1-基础RAG](sources/RAG从入门到精通系列1-基础RAG.md) — wiki source page: basic RAG concepts, key technology stack, hands-on workflow notes
 - [How to Get the RSS Feed For Any YouTube Channel](sources/How-to-Get-the-RSS-Feed-For-Any-YouTube-Channel.md) — how to get a YouTube channel's RSS feed
 - [Nano Banana 提示词框架](sources/Nano-Banana结构化提示词框架.md) — Google's 9-layer structured prompt framework for image generation
 - [Claude Code 调用方法总结](sources/Claude-Code调用方法总结.md) — Print Mode / TMUX interactive mode and Skill loading
@@ -112,6 +113,11 @@
 - [LaunchDarkly](entities/LaunchDarkly.md) — feature flag management platform; 86% of customers can recover within a day; HP/Dior cut rollbacks from hours to seconds
 - [HP](entities/HP.md) — cut rollback time from hours to minutes with LaunchDarkly
 - [Christian Dior](entities/Christian-Dior.md) — turned a 15-minute rollback into an instant toggle with LaunchDarkly
+- [LangChain](entities/LangChain.md) — LLM application framework with 160+ document loaders, used to build RAG pipelines
+- [LlamaIndex](entities/LlamaIndex.md) — LLM application framework; a RAG pipeline builder alongside LangChain
+- [LangSmith](entities/LangSmith.md) — monitoring and debugging platform for LLM applications; visualizes the whole RAG pipeline
+- [BAAI](entities/BAAI.md) — Beijing Academy of Artificial Intelligence; open-source embedding model series (BAAI/bge)
+- [Qdrant](entities/Qdrant.md) — open-source vector database written in Rust; high-performance storage layer for RAG

 ## Entities (2026-04-16 Batch 3)
 - [tukuai](entities/tukuai.md) — independent researcher who proposed a mathematical formalization of recursively self-optimizing generative systems
diff --git a/wiki/sources/RAG从入门到精通系列1-基础RAG.md b/wiki/sources/RAG从入门到精通系列1-基础RAG.md
new file mode 100644
index 00000000..93efbf54
--- /dev/null
+++ b/wiki/sources/RAG从入门到精通系列1-基础RAG.md
@@ -0,0 +1,57 @@
+# RAG从入门到精通系列1:基础RAG
+
+## Metadata
+
+- **Date**: 2025-12-18
+- **Source**: https://mp.weixin.qq.com/s/TlFNOw7_3Q8qywKLpVUmfg
+- **Category**: AI / RAG
+
+## Key Insights
+
+- RAG (Retrieval Augmented Generation) connects LLMs with external data sources to produce more relevant, up-to-date responses
+- Basic RAG consists of three stages: Indexing (document processing), Retrieval (finding relevant docs), and Generation (LLM answer synthesis)
+- Documents must be split into chunks (Splits) to fit within embedding models' limited Context Window (512-8192 tokens)
+- Embedding models convert text into numerical Embedding Vectors that can be compared with methods like cosine similarity
+- Vector databases like Qdrant store embedding vectors and enable efficient similarity search
+- LangChain and LlamaIndex are frameworks that simplify RAG pipeline construction
+- LangSmith helps monitor and debug RAG pipelines in production
+
+## Summary
+
+RAG (Retrieval Augmented Generation) is a method for connecting Large Language Models with external data sources, allowing them to generate responses based on private or up-to-date data. The basic RAG workflow consists of three stages: Indexing, Retrieval, and Generation.
+
+In the Indexing stage, external documents are loaded using document loaders (like those in LangChain), split into smaller chunks that fit within embedding models' context windows, and converted into embedding vectors stored in a vector database like Qdrant.
+
+During Retrieval, the user's question is converted into an embedding vector, and the vector store is searched with a similarity measure such as cosine similarity to find the k most relevant document chunks.
+
+In the Generation stage, the original question and the retrieved context chunks are combined through a prompt template and fed to an LLM (like Qwen) to generate a grounded, accurate response with citations to the source material.
+
+## Key Entities
+
+- **LLM (Large Language Model)**: Powerful AI model that generates text; doesn't always have access to task-relevant or latest data
+- **RAG (Retrieval Augmented Generation)**: Framework connecting LLMs with external data sources for grounded generation
+- **Qwen**: LLM referenced in the tutorial for RAG implementation
+- **BAAI**: Publisher of the embedding model series used to create embedding vectors (e.g., BAAI/bge series)
+- **Qdrant**: Open-source vector database written in Rust for storing and searching embedding vectors
+- **LangChain**: Framework providing 160+ document loaders and components for building LLM applications
+- **LlamaIndex**: Framework for building LLM applications with data connectors (mentioned alongside LangChain)
+- **LangSmith**: Platform for monitoring, debugging, and evaluating production LLM applications
+- **Vector Store**: Database system for storing embedding vectors with similarity search capabilities
+- **Retriever**: Component that searches the indexed documents and returns the chunks relevant to a question
+
+## Key Concepts
+
+- **Indexing**: Process of loading external documents, splitting them into chunks, and storing their embedding vectors in a vector database
+- **Retrieval**: Process of converting a question to an embedding vector and finding the k most similar document chunks in the vector store
+- **Generation**: Process of combining the question and retrieved context into a prompt and generating an answer via an LLM
+- **Embedding Vector**: Fixed-length numerical representation of text that captures semantic meaning, generated by embedding models
+- **Context Window**: Maximum number of tokens an embedding model can process at once (typically 512-8192 tokens)
+- **Token**: Basic unit for representing text in models; roughly 1 Chinese character or 3-4 English letters per token
+- **Cosine Similarity**: Method measuring similarity between vectors using the cosine of the angle between them
+- **Chunking/Splitting**: Breaking documents into smaller pieces to fit within embedding model context windows
+- **Chain**: Linking retrieval and generation components into a unified pipeline (e.g., LangChain's Chain abstraction)
+
+## Related Sources
+
+- [Qdrant:使用Rust编写的开源向量数据库&向量搜索引擎](https://mp.weixin.qq.com/s?__biz=MzI2ODUyMTQyNA==&mid=2247493427&idx=1&sn=75181307c395cd1d51ccfaafac340866&scene=21#wechat_redirect)
+- [GitHub: RAG Tutorial](https://github.com/realyinchen/RAG/blob/main/01_Indexing_Retrieval_Generation.ipynb)
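
The Indexing-Retrieval-Generation flow summarized in the source page above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tutorial's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model such as the BAAI/bge series, an in-memory list stands in for Qdrant, chunks are measured in words rather than tokens, and every function name here is invented for the example.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 12, overlap: int = 4) -> list[str]:
    """Indexing, step 1: split text into overlapping windows of words.

    Words stand in for tokens here; a real splitter would count model tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing, step 2: store each chunk next to its vector (in-memory 'vector store')."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Retrieval: embed the question and return the k most similar chunks."""
    query_vector = toy_embed(question)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Generation: combine question and retrieved context into the prompt sent to an LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    docs = [
        "Qdrant is a vector database written in Rust",
        "LangChain provides document loaders",
        "Cosine similarity compares embedding vectors",
    ]
    index = build_index(docs)
    question = "What is Qdrant written in?"
    print(build_prompt(question, retrieve(index, question, k=1)))
```

In a real pipeline the prompt would then be passed to an LLM such as Qwen; that call is left out because it depends on the deployment. Swapping `toy_embed` for a real embedding model and the list for a vector database changes the components but not the three-stage shape.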