---
title: "StructuredTranscriptJSON"
type: concept
tags: ["voice-ai", "transcription", "json-schema", "interoperability"]
last_updated: 2026-05-02
---
# StructuredTranscriptJSON (Structured Transcript JSON)
## Definition
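Structured Transcript JSON is the stable-schema JSON format emitted by the transcription pipeline. It contains per-segment timestamps, speaker labels, confidence scores, the full text, and metadata. Design principle: **add fields, never remove or rename them**, because downstream consumers (CMS, LLM agents, CI tools) depend on schema stability.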
## Schema Design
```json
{
  "schema_version": "1.0",
  "metadata": {
    "source_file": "...",
    "duration": 3600.5,
    "language": "en",
    "transcription_date": "2026-05-02"
  },
  "segments": [
    {
      "index": 0,
      "start": 0.0,
      "end": 5.2,
      "duration": 5.2,
      "speaker": "SPEAKER_00",
      "text": "Hello, welcome to the meeting.",
      "confidence": -0.31
    }
  ],
  "full_text": "Hello, welcome to the meeting...",
  "speakers": ["SPEAKER_00", "SPEAKER_01"],
  "total_duration": 3600.5
}
```
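
The schema above implies a few internal relationships that a producer could check before publishing a document. The following is a minimal sketch, not part of the pipeline itself: the function name `check_transcript` and the invariants it enforces (`index` matches list position, `duration` equals `end - start`, every segment `speaker` appears in `speakers`) are assumptions inferred from the example, not rules stated by the schema.

```python
from typing import Any

# Top-level fields taken from the example document above.
REQUIRED_TOP_LEVEL = ("schema_version", "metadata", "segments", "full_text", "speakers")


def check_transcript(doc: dict[str, Any], tolerance: float = 1e-6) -> list[str]:
    """Return human-readable consistency problems; an empty list means none found."""
    problems: list[str] = []

    for key in REQUIRED_TOP_LEVEL:
        if key not in doc:
            problems.append(f"missing top-level field: {key}")

    known_speakers = set(doc.get("speakers", []))
    for position, seg in enumerate(doc.get("segments", [])):
        # Assumed invariant: index mirrors the position in the list.
        if seg["index"] != position:
            problems.append(f"segment {position}: index is {seg['index']}")
        # Assumed invariant: a segment cannot end before it starts.
        if seg["end"] < seg["start"]:
            problems.append(f"segment {position}: end precedes start")
        # Assumed invariant: duration is derived from start and end.
        if abs(seg["duration"] - (seg["end"] - seg["start"])) > tolerance:
            problems.append(f"segment {position}: duration does not match end - start")
        # Assumed invariant: speakers lists every label used in segments.
        if seg["speaker"] not in known_speakers:
            problems.append(f"segment {position}: unknown speaker {seg['speaker']}")

    return problems
```

A check like this only reads fields and tolerates extra ones, so it stays compatible with the add-only design principle.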
## Schema Versioning Rules
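- **Backward compatible**: adding fields is safe (consumers ignore unknown fields)
- **Breaking change**: removing or renaming a field breaks every consumer and requires a major version bump
- **Version declaration**: every document must include a `schema_version` field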
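
On the consumer side, these rules might translate into a reader that rejects an unfamiliar major version and silently drops fields it does not know. The sketch below is illustrative only: `SUPPORTED_MAJOR`, `KNOWN_SEGMENT_FIELDS`, and `read_segments` are hypothetical names, and it assumes `schema_version` is a dotted string whose first component is the major version.

```python
import json

# Hypothetical consumer settings: the major version this reader understands
# and the segment fields it actually uses.
SUPPORTED_MAJOR = 1
KNOWN_SEGMENT_FIELDS = ("index", "start", "end", "duration", "speaker", "text", "confidence")


def read_segments(raw: str) -> list[dict]:
    """Parse a transcript document and return segments restricted to known fields."""
    doc = json.loads(raw)

    version = doc.get("schema_version")
    if version is None:
        raise ValueError("document has no schema_version field")

    # Assumes a "major.minor" string such as "1.0".
    major = int(str(version).split(".")[0])
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported major schema version: {version}")

    # Unknown (newer) fields are ignored rather than treated as errors.
    return [
        {key: seg[key] for key in KNOWN_SEGMENT_FIELDS if key in seg}
        for seg in doc["segments"]
    ]
```

With this shape, a document at a later minor version that adds a field still parses, while a major bump fails loudly instead of producing wrong output.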
## Output Format Variants
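| Format | Use | Contents |
|--------|-----|----------|
| JSON | LLM agents, CMS APIs | Full structure (timestamps, speakers, confidence) |
| SRT | Subtitle files, video embedding | Timestamps + speaker prefix + text |
| VTT | Web subtitles | Timestamps + speaker prefix + text (WebVTT format) |
| TXT | Quick reference | Plain text, no metadata |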
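
As an illustration of how the SRT variant can be derived from the JSON segments, here is a minimal sketch. The function names and the `SPEAKER: text` prefix style are assumptions; the page does not specify how the speaker prefix is rendered.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues, each with a speaker prefix (assumed style)."""
    blocks = []
    # SRT cue numbers start at 1, unlike the zero-based segment index.
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text']}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

A VTT writer would differ mainly in the `WEBVTT` header line and in using a period instead of a comma before the milliseconds.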
## Downstream Consumers
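- [[LangChain]] / CrewAI: JSON is used as `Document` / `Conversation` input
- CMS (Drupal/WordPress): JSON is stored as `field_transcript_json`; `full_text` is stored as the body
- GitHub Actions: JSON is published as a CI artifact that triggers follow-up processing pipelines
- [[LLMHandoff]]: formats the `segments` in the JSON into timestamped text lines for LLM summarization and Q&A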
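
The segment-to-text step used for LLM handoff could look roughly like the sketch below. The `[MM:SS] SPEAKER: text` line layout and the name `format_for_llm` are assumptions made for illustration; the actual format is defined by [[LLMHandoff]].

```python
def format_for_llm(segments: list[dict]) -> str:
    """Turn segments into one timestamped text line per segment."""
    lines = []
    for seg in segments:
        # Truncate the start time to whole seconds; minutes are not capped at 59.
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['speaker']}: {seg['text']}")
    return "\n".join(lines)
```

Keeping one segment per line lets the model cite a timestamp when it answers a question or summarizes a passage.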
## Related Concepts
- [[PIIRedaction]]: PII redaction applied before output
- [[SpeakerDiarization]]: data source for the `speaker` field
- [[LLMHandoff]]: standard interface for consuming Structured JSON
## Related Sources
- [[engineering-voice-ai-integration-engineer]]