---
title: "Calibration Testing"
type: concept
tags: [ml-ops, evaluation, calibration]
sources: [specialized-model-qa]
last_updated: 2026-04-20
---

## Definition
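Calibration testing evaluates whether a model's predicted probabilities agree with the observed frequencies of the outcomes they describe.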
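
As a formal sketch of the definition above: the symbols $\hat{p}$ (the model's predicted probability), $X$ (the input) and $Y$ (the binary outcome) are notation assumed here, not taken from the source notes. Under that notation, a binary classifier is perfectly calibrated when

```latex
\Pr\bigl(Y = 1 \mid \hat{p}(X) = p\bigr) = p \qquad \text{for all } p \in [0, 1]
```

For instance, among all cases scored at $p = 0.8$, roughly 80% should turn out positive.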

## Common Methods
- Hosmer-Lemeshow test
- Brier score
- Reliability diagrams
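
Two of the methods listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `brier_score` and `reliability_bins`, the equal-width binning, and the default of 10 bins are all assumptions made here. The Hosmer-Lemeshow test is left out because it additionally needs a chi-square distribution.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def reliability_bins(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Data behind a reliability diagram.

    Returns one (mean predicted probability, observed positive rate, count)
    tuple per non-empty equal-width bin, ordered from low to high probability.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must be of equal length")
    sum_p = [0.0] * n_bins
    sum_y = [0] * n_bins
    count = [0] * n_bins
    for p, y in zip(probs, labels):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sum_p[idx] += p
        sum_y[idx] += y
        count[idx] += 1
    return [
        (sum_p[i] / count[i], sum_y[i] / count[i], count[i])
        for i in range(n_bins)
        if count[i] > 0
    ]
```

A lower Brier score is better, with 0 meaning every prediction was exactly 0 or 1 and correct. For the bins, plotting observed rate against mean predicted probability gives the reliability diagram; points near the diagonal indicate good calibration.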

## Use in Model QA
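- Check whether probability outputs can be trusted
- Compare calibration differences across subgroups
- Assess the stability of predicted probabilities under distribution shift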
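
One way to carry out the subgroup and drift checks above is to compute a single calibration number per slice of the data and compare the slices. The sketch below uses expected calibration error (ECE) with equal-width bins. That metric, the function names `expected_calibration_error` and `ece_by_group`, and the group labels in the usage note are assumptions chosen for illustration, not something the source notes prescribe.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Count-weighted mean of |observed rate - mean predicted prob| over bins."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    bins: Dict[int, List[Tuple[float, int]]] = defaultdict(list)
    for p, y in zip(probs, labels):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins.values():
        mean_p = sum(p for p, _ in members) / len(members)
        rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(rate - mean_p)
    return ece


def ece_by_group(
    probs: Sequence[float],
    labels: Sequence[int],
    groups: Sequence[Hashable],
    n_bins: int = 10,
) -> Dict[Hashable, float]:
    """ECE computed separately for each distinct value in `groups`."""
    split: Dict[Hashable, Tuple[List[float], List[int]]] = defaultdict(
        lambda: ([], [])
    )
    for p, y, g in zip(probs, labels, groups):
        split[g][0].append(p)
        split[g][1].append(y)
    return {
        g: expected_calibration_error(ps, ys, n_bins)
        for g, (ps, ys) in split.items()
    }
```

Passing a demographic or segment label as `groups` gives a per-subgroup comparison. Passing a time window label instead (for example a hypothetical `"2026-03"` versus `"2026-04"`) turns the same call into a rough check of how calibration holds up as the data distribution shifts. Small groups produce noisy estimates, so counts are worth reporting alongside the scores.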

## Related Concepts
- [[Model Audit]]
- [[Discrimination Metrics]]