---
title: "Scrapy"
type: concept
tags: [crawler, Python, Scrapy]
date: 2025-11-11
---

## Definition
Scrapy is a fast, high-level web crawling and scraping framework written in Python, used to extract structured data from websites.

## Key Features
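- Lightweight and efficient, with a rich plugin ecosystem; can be deployed in Docker containers
- Does not execute JavaScript on its own, so JS-rendered pages need a rendering backend such as Splash or Playwright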
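
The Playwright pairing mentioned above is usually wired in through project settings. The fragment below is a minimal sketch that assumes the third-party `scrapy-playwright` plugin is installed; the note itself does not say which integration is used, so treat the plugin choice and the browser type as assumptions.

```python
# settings.py (fragment)
# Assumption: the scrapy-playwright plugin is the chosen integration.

# Route HTTP and HTTPS downloads through the plugin's download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires Twisted's asyncio-based reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Browser engine to launch (illustrative choice; chromium is the plugin default).
PLAYWRIGHT_BROWSER_TYPE = "chromium"
```

With these settings, a request opts in to browser rendering by passing `meta={"playwright": True}`; requests without that key still go through Scrapy's regular downloader, so only the pages that need JavaScript pay the rendering cost.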

## Role
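In an e-commerce data collection system, Scrapy handles structured scraping, pagination scheduling, and media downloads, and outputs JSON or CSV files for n8n to consume.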
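
The output and media parts of this role can be expressed with Scrapy's built-in feed exports and image pipeline. The fragment below is a sketch under stated assumptions: the `output/` paths are hypothetical, and using `ImagesPipeline` for media is one possible choice, not something the note specifies.

```python
# settings.py (fragment); all paths are hypothetical placeholders.

# Feed exports: write every scraped item to both a JSON and a CSV file.
# "overwrite" (Scrapy 2.4+) replaces the file on each run, so a rerun does not
# append a second JSON array to the old one and leave n8n with invalid JSON.
FEEDS = {
    "output/products.json": {"format": "json", "encoding": "utf8", "overwrite": True},
    "output/products.csv": {"format": "csv", "overwrite": True},
}

# Built-in image pipeline (needs Pillow). It downloads each URL found in an
# item's `image_urls` field and records the results in an `images` field.
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}
IMAGES_STORE = "output/images"
```

An n8n workflow could then read the files under `output/` after each crawl finishes. Pagination is not a setting; it lives in the spider, which typically yields a follow-up request for the next page link from its parse callback.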

## Connections
- [[Scrapy]] ← depends_on ← [[Playwright]]
- [[n8n]] ← orchestrates ← [[Scrapy]]