---
title: "Transcript-Based Summarization"
type: concept
tags: [Transcript, Summarization, YouTube, Content-Processing, AI]
sources: [daily-youtube-digest]
last_updated: 2026-04-22
---
## Definition
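Transcript-Based Summarization is a processing pipeline that extracts subtitles or a transcript from video or audio content, then uses AI to compress the text into a structured summary of key points. It turns long videos and podcasts from "no time to finish" into "the essentials in 5 minutes".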
## Workflow
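1. **Transcript Extraction**: Fetch the transcript through an API ([[TranscriptAPI.com]]) or a CLI tool (yt-dlp)
2. **AI Summarization**: An LLM processes the transcript text and outputs key points, notable quotes, and timestamps
3. **Structured Output**: Produce structured formats such as bullet points, key quotes, and timestamps
4. **Delivery**: Integrate the result into [[Daily-Digest]] or [[second-brain]]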
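Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the transcript arrives as WebVTT text, and `Cue`, `parse_vtt`, `build_prompt`, `summarize`, and the JSON keys (`key_points`, `quotes`, `timestamps`) are all hypothetical names chosen for the example. The LLM is passed in as a plain callable, so no particular provider or SDK is implied.

```python
import json
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    text: str


# Start time of a WebVTT cue timing line, e.g. "00:01:02.500 --> 00:01:05.000".
# The hours field is optional in WebVTT.
_TIMING = re.compile(r"^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->")


def parse_vtt(vtt: str) -> List[Cue]:
    """Tiny WebVTT reader: keeps cue start times and text, drops inline tags."""
    cues: List[Cue] = []
    current = None
    for line in vtt.splitlines():
        line = line.strip()
        match = _TIMING.match(line)
        if match:
            hours, minutes, seconds, millis = match.groups()
            start = (int(hours or 0) * 3600 + int(minutes) * 60
                     + int(seconds) + int(millis) / 1000)
            current = Cue(start, "")
            cues.append(current)
        elif not line:
            current = None  # a blank line ends the current cue
        elif current is not None:
            text = re.sub(r"<[^>]+>", "", line)  # strip tags such as <c>...</c>
            current.text = (current.text + " " + text).strip()
    return [cue for cue in cues if cue.text]


def build_prompt(cues: List[Cue], max_chars: int = 12000) -> str:
    """Prefix each cue with its start time so the model can cite timestamps."""
    transcript = "\n".join(f"[{int(c.start)}s] {c.text}" for c in cues)
    return (
        "Summarize the transcript below. Reply with JSON only, using the keys "
        '"key_points" (list of strings), "quotes" (list of strings) and '
        '"timestamps" (list of {"seconds": int, "label": str}).\n\n'
        + transcript[:max_chars]  # crude truncation; assumed limit, not a real model limit
    )


def summarize(cues: List[Cue], llm: Callable[[str], str]) -> dict:
    """`llm` is any function mapping a prompt string to the model's reply text."""
    data = json.loads(llm(build_prompt(cues)))
    return {
        "key_points": list(data.get("key_points", [])),
        "quotes": list(data.get("quotes", [])),
        "timestamps": list(data.get("timestamps", [])),
    }
```

Injecting the model as a callable keeps the sketch testable with a stub and leaves the choice of LLM open. A real pipeline would likely chunk long transcripts instead of truncating them.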
## TranscriptAPI vs yt-dlp
| Criteria | yt-dlp | TranscriptAPI.com |
|---|---|---|
| Output format | Verbose CLI logs | Clean JSON |
| Cloud compatibility | Fails on GCP/cloud | ✅ Works everywhere |
| Caching | None | ✅ Cached results |
| Rate limiting | Random blocks | ✅ Reliable, millions served |
| Dependencies | Binary required | HTTP API only |
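For the yt-dlp column, a subtitle-only download might look like the sketch below. The flags are standard yt-dlp options. The helper names `ytdlp_subtitle_cmd` and `fetch_subtitles`, the default language, and the output template are assumptions made for this example. The TranscriptAPI.com request format is not described on this page, so it is not sketched here.

```python
import subprocess
from typing import List


def ytdlp_subtitle_cmd(url: str, lang: str = "en",
                       out_template: str = "%(id)s.%(ext)s") -> List[str]:
    """Build a yt-dlp command that fetches subtitles without the video."""
    return [
        "yt-dlp",
        "--skip-download",      # do not download the media itself
        "--write-subs",         # uploader-provided subtitles, if any
        "--write-auto-subs",    # fall back to auto-generated captions
        "--sub-langs", lang,
        "--sub-format", "vtt",
        "-o", out_template,
        url,
    ]


def fetch_subtitles(url: str, lang: str = "en") -> None:
    """Run yt-dlp; requires the yt-dlp executable on PATH and network access."""
    subprocess.run(ytdlp_subtitle_cmd(url, lang), check=True)
```

With `check=True`, a blocked or failed request raises `subprocess.CalledProcessError`, which is where the cloud-hosting failures noted in the table would surface.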
## Applications
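- [[Daily YouTube Digest]]: new channel videos → transcript extraction → key-point summary → push delivery
- [[Podcast Production Pipeline]]: podcast audio → transcript → timestamped notes → social media clips
- [[youtube-content-pipeline]]: YouTube video → transcript → blog post/newsletter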
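The "timestamped notes" stage in these pipelines could be rendered along the following lines. This is a hedged sketch: `format_timestamp` and `render_notes` are hypothetical helpers, and the summary is assumed to be a dict with `key_points` (strings) and `timestamps` (`{"seconds", "label"}` entries). The `youtu.be/<id>?t=<seconds>` link form applies to YouTube videos only.

```python
from typing import Dict, List


def format_timestamp(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS once the value reaches an hour."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours:d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:d}:{secs:02d}"


def render_notes(title: str, video_id: str, summary: Dict) -> str:
    """Turn a summary dict into a Markdown note with deep links into the video."""
    lines: List[str] = [f"### {title}", ""]
    for point in summary.get("key_points", []):
        lines.append(f"- {point}")
    for entry in summary.get("timestamps", []):
        secs = int(entry["seconds"])
        link = f"https://youtu.be/{video_id}?t={secs}"
        lines.append(f"- [{format_timestamp(secs)}]({link}) {entry['label']}")
    return "\n".join(lines)
```

The resulting Markdown can be appended to a digest note or trimmed into social media snippets.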
## Connections
- [[Transcript-Based Summarization]] ← uses ← [[TranscriptAPI.com]]
- [[Daily-Digest]] ← incorporates ← [[Transcript-Based Summarization]]