---
title: "Transcript-Based Summarization"
type: concept
tags: [Transcript, Summarization, YouTube, Content-Processing, AI]
sources: [daily-youtube-digest]
last_updated: 2026-04-22
---
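
## Definition

Transcript-Based Summarization is the process of extracting subtitles or a transcript from video or audio content and then using AI to condense it into a structured summary of key points. It turns long videos and podcasts from something there is "no time to finish" into something whose essentials can be grasped in 5 minutes.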

## Workflow

1. **Transcript Extraction**: Fetch the subtitles via an API ([[TranscriptAPI.com]]) or a CLI tool (yt-dlp).
2. **AI Summarization**: An LLM processes the transcript text and outputs key points, highlight quotes, and timestamps.
3. **Structured Output**: Produce structured formats such as bullet points, key quotes, and timestamps.
4. **Delivery**: Integrate the result into [[Daily-Digest]] or [[second-brain]].
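
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind [[Daily-Digest]]: it assumes the transcript has already been saved as WebVTT text (one of the subtitle formats yt-dlp can write), and the LLM is passed in as a plain callable `llm`, so no particular model API is implied. All names here (`Cue`, `parse_vtt`, `chunk_cues`, `fmt_ts`, `build_prompt`, `summarize`) and the `max_chars` limit are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    text: str


# Matches the start of a WebVTT timing line, e.g. "00:01:02.500 --> ...".
# The hours field is optional in WebVTT, so "01:02.500 --> ..." also matches.
_TIMING = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->")


def parse_vtt(vtt: str) -> list[Cue]:
    """Turn WebVTT text into a list of timestamped cues."""
    cues: list[Cue] = []
    for block in re.split(r"\n\s*\n", vtt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = _TIMING.match(line)
            if m:
                h, mins, secs, ms = m.groups()
                start = int(h or 0) * 3600 + int(mins) * 60 + int(secs) + int(ms) / 1000
                text = " ".join(ln.strip() for ln in lines[i + 1:] if ln.strip())
                if text:
                    cues.append(Cue(start, text))
                break  # one timing line per block
    return cues


def chunk_cues(cues: list[Cue], max_chars: int = 4000) -> list[list[Cue]]:
    """Group cues into chunks small enough for a single LLM call."""
    chunks: list[list[Cue]] = []
    current: list[Cue] = []
    size = 0
    for cue in cues:
        if current and size + len(cue.text) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(cue)
        size += len(cue.text)
    if current:
        chunks.append(current)
    return chunks


def fmt_ts(seconds: float) -> str:
    """Format seconds as HH:MM:SS for use in the summary."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"


def build_prompt(chunk: list[Cue]) -> str:
    """Ask for the structured output described in step 3."""
    body = "\n".join(f"[{fmt_ts(c.start)}] {c.text}" for c in chunk)
    return (
        "Summarize this transcript excerpt as bullet points, "
        "key quotes, and timestamps.\n\n" + body
    )


def summarize(vtt: str, llm: Callable[[str], str], max_chars: int = 4000) -> str:
    """Run steps 1-3: parse, chunk, and summarize each chunk with `llm`."""
    parts = [llm(build_prompt(chunk)) for chunk in chunk_cues(parse_vtt(vtt), max_chars)]
    return "\n\n".join(parts)
```

Passing the model in as a callable keeps the sketch independent of any provider and makes it easy to test with a stub. Auto-generated captions may contain inline markup or repeated lines, which this sketch does not clean up.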

## TranscriptAPI vs yt-dlp

| Criteria | yt-dlp | TranscriptAPI.com |
|---|---|---|
| Output format | Verbose CLI logs | Clean JSON |
| Cloud compatibility | Fails on GCP and other cloud hosts | ✅ Works everywhere |
| Caching | None | ✅ Cached results |
| Rate limiting | Unpredictable blocks | ✅ Reliable, millions served |
| Dependencies | Requires a local binary | HTTP API only |
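
The caching row can be made concrete with a small sketch of a local file cache layered over any transcript fetcher. This is an illustration under assumptions, not a feature of yt-dlp or of TranscriptAPI.com: `cached_fetch`, the `fetch` callable, and the one-JSON-file-per-video layout are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cached_fetch(video_id: str, fetch: Callable[[str], dict], cache_dir: Path) -> dict:
    """Return the transcript for `video_id`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the ID so the file name is safe regardless of the ID's characters.
    key = hashlib.sha256(video_id.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(video_id)  # hypothetical fetcher: yt-dlp wrapper, HTTP client, etc.
    path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
    return data
```

With a wrapper like this, repeated digest runs over the same video would reuse the stored transcript instead of requesting it again, which also reduces exposure to rate limiting.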

## Applications

- [[Daily YouTube Digest]]: new channel videos → transcript extraction → key-point summary → push delivery
- [[Podcast Production Pipeline]]: podcast audio → transcript → timestamped notes → social media clips
- [[youtube-content-pipeline]]: YouTube video → transcript → blog post / newsletter

## Connections
- [[Transcript-Based Summarization]] ← uses ← [[TranscriptAPI.com]]
- [[Daily-Digest]] ← incorporates ← [[Transcript-Based Summarization]]