wiki-ingest: RAG从入门到精通系列1

2026-04-16 03:47:33 +08:00
parent 821be5e431
commit 997ad92e81
7 changed files with 213 additions and 1 deletions
--- a/wiki/entities/BAAI.md
+++ b/wiki/entities/BAAI.md
@@ -0,0 +1,27 @@
+---
+title: "BAAI"
+type: entity
+tags: [embedding, open-source, chinese-optimized]
+sources: ["RAG从入门到精通系列1：基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: Embedding Model Series
+- **Source**: RAG从入门到精通系列1：基础RAG
+
+## Definition
+BAAI (Beijing Academy of Artificial Intelligence) provides an open-source series of embedding models (e.g., BAAI/bge series) that convert text into embedding vectors for use in RAG systems.
+
+## Key Models
+- **BAAI BGE Series**: Chinese-optimized open-source embedding models
+- **Output**: Fixed-length embedding vectors produced from input text
+- **Context Window**: Typically 512 to 8192 tokens, depending on the model
+
+## Applications
+- [[Embedding]]：BAAI models are used to create embedding vectors
+- [[RAG]]：BAAI embeddings enable semantic search in RAG systems
+
+## Related Concepts
+- [[Embedding]]：The technology BAAI models implement
+- [[向量数据库]]：Where BAAI embeddings are stored
--- a/wiki/entities/LangChain.md
+++ b/wiki/entities/LangChain.md
@@ -0,0 +1,37 @@
+---
+title: "LangChain"
+type: entity
+tags: [llm, framework, rag, document-loading]
+sources: ["RAG从入门到精通系列1：基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Framework
+- **Source**: RAG从入门到精通系列1：基础RAG
+
+## Definition
+LangChain is a framework for building LLM applications. It provides more than 160 document loaders for ingesting data from various sources, along with components for building RAG pipelines.
+
+## Key Features
+- **Document Loaders**: 160+ loaders for various data sources
+- **Chain Abstraction**: Link retrieval and generation components together
+- **Retriever Interface**: Unified abstraction for retrieval components
+- **PromptTemplate**: Template system for constructing prompts
+- **Integration**: Works with various LLMs (Qwen, GPT-4, Claude, etc.) and vector databases (Qdrant, Chroma, Pinecone, etc.)
+
+## Applications in RAG
+- Loading external documents via document loaders
+- Splitting documents into chunks (Splits)
+- Creating retrievers from vector stores
+- Chaining retrieval and generation into a unified pipeline
+- Converting raw `AIMessage` outputs into plain strings
+
+## Related Concepts
+- [[RAG]]：LangChain is commonly used to build RAG pipelines
+- [[LlamaIndex]]：Alternative framework for building LLM applications
+- [[向量数据库]]：Vector stores integrated with LangChain
+- [[Qdrant]]：Vector database mentioned in RAG tutorials
+
+## Related Entities
+- [[Qwen]]：LLM often used with LangChain
--- a/wiki/entities/LangSmith.md
+++ b/wiki/entities/LangSmith.md
@@ -0,0 +1,27 @@
+---
+title: "LangSmith"
+type: entity
+tags: [llm, debugging, monitoring, production]
+sources: ["RAG从入门到精通系列1：基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Platform
+- **Source**: RAG从入门到精通系列1：基础RAG
+
+## Definition
+LangSmith is a platform for building production-grade LLM applications. It supports close monitoring and evaluation of those applications, so they can be shipped quickly and with confidence.
+
+## Key Capabilities
+- **Tracing**: Follow each step of an LLM application through the entire pipeline
+- **Debugging**: Understand LLM calls and other parts of application logic
+- **Evaluation**: Evaluate application performance
+- **Monitoring**: Observe application behavior in production
+
+## Use Case
+LangSmith visualizes how the steps of a RAG pipeline connect to one another, which is useful for debugging and understanding RAG workflows.
+
+## Related Concepts
+- [[RAG]]：LangSmith can be used to monitor RAG pipelines
+- [[LangChain]]：LangChain integrates with LangSmith for debugging
--- a/wiki/entities/LlamaIndex.md
+++ b/wiki/entities/LlamaIndex.md
@@ -0,0 +1,23 @@
+---
+title: "LlamaIndex"
+type: entity
+tags: [llm, framework, rag]
+sources: ["RAG从入门到精通系列1：基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: LLM Application Framework
+- **Source**: RAG从入门到精通系列1：基础RAG
+
+## Definition
+LlamaIndex is a framework for building LLM applications that provides data connectors. The source mentions it alongside LangChain as a way to simplify the construction of complex RAG pipelines.
+
+## Relationship with LangChain
+- Both LangChain and LlamaIndex are frameworks for building LLM applications
+- Both can be used to construct RAG pipelines
+- Both provide abstractions for document loading, splitting, embedding, and retrieval
+
+## Related Concepts
+- [[RAG]]：LlamaIndex is used for building RAG pipelines
+- [[LangChain]]：Alternative/companion framework
--- a/wiki/entities/Qdrant.md
+++ b/wiki/entities/Qdrant.md
@@ -0,0 +1,35 @@
+---
+title: "Qdrant"
+type: entity
+tags: [vector-database, rag, rust, open-source]
+sources: ["RAG从入门到精通系列1：基础RAG"]
+last_updated: 2026-04-16
+---
+
+## Basic Information
+- **Type**: Vector Database
+- **Source**: RAG从入门到精通系列1：基础RAG
+
+## Definition
+Qdrant is an open-source vector database written in Rust, designed for storing and searching high-dimensional embedding vectors with high performance.
+
+## Key Features
+- **Written in Rust**: High performance and memory safety
+- **Vector Search**: Supports similarity search with various metrics
+- **Open Source**: Freely available for self-hosting
+- **RAG Integration**: Commonly used as the vector store in RAG pipelines
+
+## Technical Details
+- Compares embedding vectors using several similarity metrics (e.g., cosine similarity, dot product, Euclidean distance)
+- Supports top-k retrieval (returning the k most similar results)
+- Can store metadata (called a payload in Qdrant) alongside vectors
+
+## Related Concepts
+- [[向量数据库]]：Qdrant is a specific vector database implementation
+- [[Embedding]]：Qdrant stores embedding vectors
+- [[RAG]]：Qdrant serves as the storage layer in RAG systems
+- [[LangChain]]：LangChain can integrate with Qdrant as a vector store
+
+## Related Entities
+- [[BAAI]]：Embedding models whose output vectors are stored in Qdrant
+- [[Qwen]]：LLM that generates answers from the context retrieved from Qdrant
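
The top-k retrieval listed under Technical Details in the Qdrant page can be illustrated with a minimal sketch. This is plain Python, not Qdrant's API: the function names `cosine_similarity` and `top_k`, the sample vectors, and the `doc` metadata key are all hypothetical and chosen only for illustration. A real vector database would use an approximate index rather than the brute-force scan shown here.

```python
import math
from typing import Any, Dict, List, Tuple

Vector = List[float]
Point = Tuple[Vector, Dict[str, Any]]  # (embedding vector, metadata)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Vector, points: List[Point], k: int) -> List[Tuple[float, Dict[str, Any]]]:
    """Brute-force search: score every stored vector, keep the k most similar."""
    scored = [(cosine_similarity(query, vec), meta) for vec, meta in points]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical stored points: 2-dimensional vectors with metadata attached.
points: List[Point] = [
    ([1.0, 0.0], {"doc": "a"}),
    ([0.0, 1.0], {"doc": "b"}),
    ([0.7, 0.7], {"doc": "c"}),
]

results = top_k([1.0, 0.1], points, k=2)
for score, meta in results:
    print(f"{meta['doc']}: {score:.3f}")
```

Each result carries the metadata stored with its vector, which is how a RAG pipeline maps a matching vector back to the original text chunk.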