Auto-sync: 2026-04-16 21:08
wiki/concepts/Scrapy.md
@@ -0,0 +1,20 @@
---
title: "Scrapy"
type: concept
tags: [crawler, Python, Scrapy]
date: 2025-11-11
---
## Definition
Scrapy is a fast, high-level web crawling framework written in Python, used to extract structured data from websites.
## Key Features
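- Lightweight and efficient, with a rich plugin ecosystem; can be deployed in Docker
- Weak support for JavaScript-rendered pages; needs to be paired with Splash or Playwright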
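
One common way to pair Scrapy with Playwright is the third-party `scrapy-playwright` plugin. The fragment below is a minimal sketch of the project settings that plugin expects, assuming it is installed; this note does not say which integration is actually used, and `JS_PAGE_META` is a hypothetical helper name introduced only for illustration.

```python
# settings.py (sketch): assumes the third-party scrapy-playwright package.

# Route HTTP and HTTPS downloads through the Playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Hypothetical constant: request meta that opts a single request into
# browser rendering, e.g. scrapy.Request(url, meta=JS_PAGE_META).
JS_PAGE_META = {"playwright": True}
```

Only requests that carry `"playwright": True` in their meta are rendered in a browser; everything else goes through Scrapy's regular downloader, which keeps the cost of rendering limited to the pages that need it.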
## Role
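In an e-commerce data collection system, Scrapy handles structured scraping, pagination scheduling, and media downloads, and outputs JSON or CSV files for n8n to consume.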
## Connections
- [[Scrapy]] ← depends_on ← [[Playwright]]
- [[n8n]] ← orchestrates ← [[Scrapy]]