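---
title: "Calibration Testing"
type: concept
tags: [ml-ops, evaluation, calibration]
sources: [specialized-model-qa]
last_updated: 2026-04-20
---

## Definition

Calibration testing evaluates whether a model's predicted probabilities agree with the observed frequencies of the outcomes they predict. For example, among cases assigned a probability of 0.8, roughly 80% should turn out positive.

## Common Methods

- Hosmer-Lemeshow test
- Brier score
- Reliability diagrams

## Use in Model QA

- Check whether probability outputs can be trusted
- Compare calibration differences across subgroups
- Assess the stability of predicted probabilities under distribution shift

## Related Concepts

- [[Model Audit]]
- [[Discrimination Metrics]]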
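
Two of the methods listed under Common Methods, the Brier score and the binning behind a reliability diagram, can be sketched in a few lines of standard-library Python. This is a minimal illustration for binary outcomes, not a reference implementation: the function names `brier_score` and `reliability_bins`, the equal-width binning, and the default of 10 bins are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 means every prediction was exactly 0 or 1 and correct.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width probability bins (an assumed scheme).

    Returns one (mean predicted probability, observed positive rate, count)
    tuple per non-empty bin. Plotting the first value against the second gives
    a reliability diagram; a well-calibrated model lies near the diagonal.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have equal length")
    prob_sums = [0.0] * n_bins
    positive_counts = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(y_true, y_prob):
        # Clamp so that p == 1.0 falls into the last bin instead of overflowing.
        idx = min(int(p * n_bins), n_bins - 1)
        prob_sums[idx] += p
        positive_counts[idx] += y
        counts[idx] += 1
    return [
        (prob_sums[i] / counts[i], positive_counts[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.4, 0.6, 0.9]
    print("Brier score:", brier_score(labels, probs))
    for mean_p, observed, n in reliability_bins(labels, probs, n_bins=2):
        print(f"mean predicted={mean_p:.2f} observed rate={observed:.2f} n={n}")
```

To compare calibration across subgroups, the same two functions can be applied to each subgroup's slice of labels and probabilities, and likewise to data from different time windows when checking stability under distribution shift.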