Auto-sync: 2026-04-16 21:08

2026-04-16 21:08:55 +08:00
parent be7e39a4d0
commit 0dc7e71539
37 changed files with 846 additions and 3 deletions
--- a/wiki/concepts/Scrapy.md
+++ b/wiki/concepts/Scrapy.md
@@ -0,0 +1,20 @@
+---
+title: "Scrapy"
+type: concept
+tags: [爬虫, Python, Scrapy]
+date: 2025-11-11
+---
+
+## Definition
+Scrapy is a fast, high-level web crawling and scraping framework written in Python, used to extract structured data from websites.
+
+## Key Features
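+- Lightweight and efficient, with a rich plugin ecosystem; can be deployed in Docker containers
+- Weak support for JavaScript-rendered pages (Scrapy does not execute JavaScript itself), so such pages need Splash or Playwright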
+
+## Role
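+In the e-commerce data collection system, Scrapy handles structured scraping, pagination scheduling, and media downloads, and outputs JSON or CSV files for n8n to consume.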
+
+## Connections
+- [[Scrapy]] ← depends_on ← [[Playwright]]
+- [[n8n]] ← orchestrates ← [[Scrapy]]
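
The Role section above says Scrapy writes JSON or CSV files and downloads media. One way this might be configured is through a project's `settings.py`, sketched below. This is a minimal sketch and not the configuration used by the system described in the page: the output paths are hypothetical placeholders, and the choice of `FilesPipeline` for media is an assumption. `FEEDS`, `ITEM_PIPELINES`, and `FILES_STORE` are standard Scrapy settings.

```python
# Hypothetical settings.py fragment for a Scrapy project.
# All paths below are placeholders, not taken from the wiki page.

# Feed exports: write every scraped item to both a JSON and a CSV file.
FEEDS = {
    "output/products.json": {"format": "json", "encoding": "utf8", "overwrite": True},
    "output/products.csv": {"format": "csv", "overwrite": True},
}

# Media downloads: enable Scrapy's built-in files pipeline (assumed choice).
# It downloads the URLs listed in each item's `file_urls` field.
ITEM_PIPELINES = {
    "scrapy.pipelines.files.FilesPipeline": 1,
}

# Directory where the files pipeline stores downloaded media.
FILES_STORE = "output/media"
```

With settings like these, a downstream consumer such as n8n would only need to read the exported files from the output directory; how that handoff is triggered is not specified in the page.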