nexus/wiki/concepts/TranscriptProcessing.md

---
title: "TranscriptProcessing"
type: concept
tags: []
last_updated: 2026-04-22
---

# TranscriptProcessing

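Technical methods by which an AI agent processes meeting transcripts, covering text parsing, speaker identification, key-content extraction, and information structuring. It is the core technical step in automating [[MeetingNotes]].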

## Definition

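TranscriptProcessing addresses one core problem: raw transcripts (VTT, SRT, TXT) contain a lot of noise (filler words, repetitions, pauses), so the AI has to interpret the text and extract the information that matters. The process involves:
- Format parsing: recognize VTT/SRT timestamps and speaker labels
- Noise removal: strip filler words, repetitions, and pause markers
- Speaker attribution: attribute each statement to a specific person
- Topic segmentation: find the boundaries between discussion topics
- Key extraction: decisions, action items, questions, and follow-ups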
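
The first three steps (format parsing, noise removal, speaker attribution) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `parse_cues`, `Cue`, and the `FILLERS` list are hypothetical names, and the sketch assumes speakers appear either as a WebVTT voice span (`<v Alice>`) or as a `Name: text` prefix. The prefix rule is a heuristic and will misfire on sentences that happen to contain an early colon.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical filler list; tune it for the language and domain of your meetings.
FILLERS = re.compile(r"\b(?:um+|uh+|er+|you know)\b[,.]?\s*", re.IGNORECASE)
# Cue timing line; accepts "." (VTT) or "," (SRT) before the milliseconds.
TIMING = re.compile(
    r"((?:\d{2,}:)?\d{2}:\d{2}[.,]\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}[.,]\d{3})"
)
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)*\s+([^>]+)>")  # WebVTT voice span
NAME_PREFIX = re.compile(r"^([^:<>]{1,40}):\s+(.*)$", re.DOTALL)  # heuristic


@dataclass
class Cue:
    start: str
    end: str
    speaker: Optional[str]
    text: str


def parse_cues(raw: str) -> list[Cue]:
    """Parse VTT/SRT text into cleaned cues with a best-effort speaker."""
    cues = []
    # Cues are separated by blank lines in both formats.
    for block in re.split(r"\n\s*\n", raw.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        idx = next((i for i, ln in enumerate(lines) if TIMING.search(ln)), None)
        if idx is None:
            continue  # file header, NOTE blocks, and similar non-cue blocks
        timing = TIMING.search(lines[idx])
        text = " ".join(ln.strip() for ln in lines[idx + 1:])

        speaker = None
        voice = VOICE_TAG.search(text)
        if voice:
            speaker = voice.group(1).strip()
        else:
            prefix = NAME_PREFIX.match(text)
            if prefix:
                speaker, text = prefix.group(1).strip(), prefix.group(2)

        text = re.sub(r"<[^>]+>", "", text)       # drop remaining inline tags
        text = FILLERS.sub("", text)              # remove filler words
        text = re.sub(r"\s+", " ", text).strip()  # normalize whitespace
        if text:
            cues.append(Cue(timing.group(1), timing.group(2), speaker, text))
    return cues
```

Topic segmentation and key extraction are left out on purpose: they depend on the model and prompt in use, whereas the steps above are deterministic and cheap to run before any text reaches the agent.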

## Recommended Input Formats

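| Format | Source | Advantage |
|--------|--------|-----------|
| VTT | Zoom/Google Meet caption export | Includes timestamps, which help with speaker attribution |
| SRT | Video subtitle export | Timestamps support multi-speaker identification |
| TXT | Otter.ai export | Plain text, already cleaned up |
| JSON | Otter.ai API | Structured data (speaker, words, timing) |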
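
Structured input makes speaker attribution much simpler, because the speaker is already a field rather than something to infer. The sketch below merges consecutive segments from the same speaker into a single turn. The segment shape (`speaker`, `start`, `end`, `text`) and the function name `merge_turns` are assumptions made for illustration; they are not taken from any real export schema, so map the actual field names onto this shape first.

```python
from itertools import groupby


def merge_turns(segments):
    """Collapse consecutive segments by the same speaker into one turn.

    Each segment is assumed to be a dict with the hypothetical keys
    "speaker", "start", "end", and "text", already in chronological order.
    """
    turns = []
    # groupby only groups adjacent items, which is exactly what a turn is.
    for speaker, group in groupby(segments, key=lambda s: s["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],
            "end": group[-1]["end"],
            "text": " ".join(s["text"] for s in group),
        })
    return turns
```

Merging into turns before extraction keeps each statement attached to one speaker and one time range, which is usually the unit an agent needs when attributing decisions or action items.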

## Key Insight

> "VTT/SRT subtitle files from Zoom or Google Meet work great as input — they include timestamps which help the agent attribute statements to speakers."

## Related Concepts
- [[MeetingNotes]]: the main application scenario for transcript processing
- [[ActionItemTracking]]: extracts action items from the processed output

## Related Sources
- [[meeting-notes-action-items]]