---
title: "EBUR128LoudnessNormalization"
type: concept
tags: ["audio-processing", "loudness", "ffmpeg", "ebur128"]
last_updated: 2026-05-02
---
# EBUR128LoudnessNormalization (EBU R128 Loudness Normalization)
## Definition
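EBU R128 is the loudness normalization recommendation published by the European Broadcasting Union (EBU). Its purpose is to give audio from different sources a consistent perceived loudness. In pipelines that feed Whisper-style transcription models, R128 normalization keeps the input loudness stable and avoids the accuracy loss caused by level differences between recordings.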
## Standard Parameters
```bash
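# loudnorm filter argument in a full ffmpeg call; input.wav and output.wav are placeholder names
ffmpeg -i input.wav -af "loudnorm=I=-16:TP=-1.5:LRA=11" output.wav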
```
| Parameter | Meaning | Standard value |
|------|------|--------|
| `I` | Integrated loudness | -16 LUFS |
| `TP` | True peak | -1.5 dBTP |
| `LRA` | Loudness range | 11 LU |
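
When the targets need to vary per pipeline, the filter string can be assembled from the three parameters in the table. The following is a minimal sketch; `loudnorm_filter` is a hypothetical helper name that does not appear in the original page, and its defaults simply mirror the table values.

```shell
# Hypothetical helper: builds the argument for ffmpeg's loudnorm filter.
# Arguments (all optional): integrated loudness (LUFS), true peak (dBTP), loudness range (LU).
# Defaults mirror the table above: I=-16, TP=-1.5, LRA=11.
loudnorm_filter() {
  i=${1:--16}
  tp=${2:--1.5}
  lra=${3:-11}
  printf 'loudnorm=I=%s:TP=%s:LRA=%s\n' "$i" "$tp" "$lra"
}
```

A call such as `ffmpeg -i input.wav -af "$(loudnorm_filter)" output.wav` would then use the defaults, where `input.wav` and `output.wav` are placeholder file names.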
## Why -16 LUFS?
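- Broadcast (TV/streaming): -24 LUFS (older standard) → -16 LUFS (newer trend, Netflix/YouTube). EBU R128 itself specifies a programme target of -23 LUFS, so -16 LUFS is a louder, delivery-oriented choice rather than the R128 default.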
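- Podcasts and dialogue content: -16 LUFS is a better fit for speech-dominated material.
- An integrated loudness target that is too high (above -14 LUFS) can cause compression distortion in speech.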
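
The -14 LUFS ceiling from the list above can be enforced before a target is passed to the filter. This is a sketch only; `target_too_loud` is a hypothetical helper name, and the threshold is taken directly from this page rather than from any standard.

```shell
# Hypothetical guard: succeeds (exit status 0) when the requested integrated
# loudness target is above -14 LUFS, the level this page flags as risky for speech.
# awk is used because POSIX shell arithmetic does not handle decimals.
target_too_loud() {
  awk -v t="$1" 'BEGIN { exit !(t > -14) }'
}
```

For example, `target_too_loud -12 && echo "warning: target above -14 LUFS"` prints the warning, while a target of -16 passes silently.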
## Pipeline Context
```
Raw audio → format detection (ffprobe) → EBU R128 normalization → resample to 16 kHz → mono
```
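
The stages above could be wired together roughly as follows. This is a sketch under assumptions: the function names and the `DRY_RUN` switch are hypothetical, and the original page does not say how the steps are actually invoked. The output sample rate is set explicitly with `-ar` because the `loudnorm` filter may upsample internally in its dynamic mode.

```shell
# Hypothetical helpers; set DRY_RUN=1 to print the commands instead of running them.

# Step 1: format detection. Reports codec, sample rate and channel count
# of the first audio stream in the file given as $1.
probe_audio() {
  ${DRY_RUN:+echo} ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name,sample_rate,channels \
    -of default=noprint_wrappers=1 "$1"
}

# Steps 2-4: loudness normalization, then conversion to 16 kHz mono.
# $1 is the input file, $2 the output file (overwritten because of -y).
normalize_for_asr() {
  ${DRY_RUN:+echo} ffmpeg -nostdin -y -i "$1" \
    -af "loudnorm=I=-16:TP=-1.5:LRA=11" \
    -ar 16000 -ac 1 "$2"
}
```

Running `DRY_RUN=1 normalize_for_asr in.wav out.wav` prints the ffmpeg command line without touching any files, which is convenient for checking the arguments first.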
## Why It Matters for Whisper
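Whisper is not immune to loudness variation. For the same speech, a recording at -16 LUFS yields a lower WER (Word Error Rate) than one at -30 LUFS, because loudness normalization reduces the dynamic range and leaves the model less distracted when it handles soft and loud segments.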
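
To put the -30 LUFS and -16 LUFS figures in perspective, the level difference can be worked out as a simple static gain. This is an illustrative calculation only: the symbols $G$ and $a$ are introduced here, and `loudnorm` in its dynamic mode does not apply one fixed gain across the whole file.

```latex
G = L_{\text{target}} - L_{\text{measured}} = -16 - (-30) = 14\ \text{LU}
\qquad
a = 10^{G/20} = 10^{0.7} \approx 5.0
```

The quieter recording would therefore need roughly a fivefold increase in amplitude to reach the target level.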
## Related Concepts
- [[VoiceActivityDetection]]: post-processing step applied after normalization
- [[FasterWhisper]]: consumer of the normalized audio
## Related Sources
- [[engineering-voice-ai-integration-engineer]]