Auto-sync: 2026-04-16 21:08

2026-04-16 21:08:55 +08:00
parent be7e39a4d0
commit 0dc7e71539
37 changed files with 846 additions and 3 deletions

wiki/concepts/Scrapy.md Normal file

@@ -0,0 +1,20 @@
---
title: "Scrapy"
type: concept
tags: [crawler, Python, Scrapy]
date: 2025-11-11
---
## Definition
Scrapy is a fast, high-level web crawling framework written in Python, used to extract structured data from websites.
## Key Features
- Lightweight and efficient, with a rich plugin ecosystem; can be deployed in Docker
- Weak support for JS-rendered pages; needs to be paired with Splash or Playwright
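
As a rough illustration of the Playwright pairing, the fragment below sketches the project settings used by the `scrapy-playwright` plugin. The note does not say which integration is used, so the choice of this plugin is an assumption; the setting names follow that plugin's documented configuration.

```python
# settings.py (fragment)
# Assumption: the scrapy-playwright plugin is installed; the note itself
# only says that Scrapy is paired with Splash or Playwright.

# Route HTTP and HTTPS downloads through the Playwright-backed handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Individual requests then opt in to browser rendering with
# meta={"playwright": True}; other requests use the handler's plain HTTP path.
```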
## Role
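In the e-commerce data collection system, Scrapy handles structured scraping, pagination scheduling, and media downloads, and outputs JSON or CSV files for n8n to consume.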
## Connections
- [[Scrapy]] ← depends_on ← [[Playwright]]
- [[n8n]] ← orchestrates ← [[Scrapy]]