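- Others: ChinaTextbook, Obsidian notes series, YouTube Channel ID, TikTok PM Django
- Skills: GOG CLI, Last30Days, baoyu-skills
- Vibe Coding: Cursor 2.0, Trae remote development, Vibe-Kanban+OpenCode, vibe coding experience
- WeChat Official Account: Shrimp Farming Diary 1-5, Making Money in the AI Era
- Cross-border e-commerce: TikTok data scraping, product selection strategy, Superset Dashboard
- AI directory additions: 20 files

Source pages: 51

Entities: TapXWorld, VibeKanban, OpenCode, Trae, SourceGrounding, etc.

Concepts: bootstrapped meta generation, five core design principles, MD5 deduplication, hybrid search, etc.
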
---
title: "TikTok Data Scraping"
type: source
tags: []
date: 2025-12-19
---

## Source File
- [[raw/跨境电商/Scrapy + Playwright 抓取TikTok Shop Data.md]]

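## Summary
- Core topic: a technical guide to scraping TikTok Shop data with Scrapy + Playwright
- Problem domain: e-commerce data collection, dynamic page scraping
- Method/mechanism: create a venv, install `scrapy` and `scrapy-playwright`, then run `playwright install chromium`
- Conclusion/value: enables automated scraping of TikTok Shop store data
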
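## Key Claims
- Creating a virtual environment (venv) is recommended to isolate dependencies and avoid conflicts with the system Python
- scrapy-playwright combines the strengths of Scrapy and Playwright and supports scraping dynamic pages
- Docker containers require additional configuration of the venv environment
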
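To make the scrapy-playwright claim concrete, a Scrapy project typically enables the plugin through a few settings and then opts individual requests into browser rendering. The fragment below is a sketch based on the settings documented in the scrapy-playwright README, not on the source guide itself; the helper `playwright_request_kwargs` is a hypothetical name introduced for illustration.

```python
# settings.py fragment (a sketch, assuming a standard Scrapy project layout)

# Route HTTP and HTTPS downloads through the scrapy-playwright handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Matches the browser fetched by `playwright install chromium`.
PLAYWRIGHT_BROWSER_TYPE = "chromium"


def playwright_request_kwargs(url: str) -> dict:
    """Build keyword arguments for scrapy.Request so the page renders in the browser."""
    # Hypothetical helper: only requests whose meta sets "playwright" to True
    # are rendered by Playwright; all others use Scrapy's default downloader.
    return {"url": url, "meta": {"playwright": True}}
```

Inside a spider, `scrapy.Request(**playwright_request_kwargs(url))` would then fetch the page after its JavaScript has run. Keeping the opt-in per request lets static pages skip the slower browser path.
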
## Key Concepts
- [[动态页面抓取]] (dynamic page scraping): Playwright handles JavaScript rendering
- [[Python虚拟环境]] (Python virtual environment): venv isolates project dependencies

## Key Entities
- [[Scrapy]]: Python web-crawling framework
- [[Playwright]]: browser automation tool
- [[TikTokShop]]: e-commerce platform that serves as the data source

## Connections
- [[Scrapy]] ← enhanced_by ← [[Playwright]]

## Contradictions
- No conflicts