---
title: "Model QA Specialist"
type: entity
tags: [agent, the-agency, ml-ops, model-audit]
last_updated: 2026-04-20
---
## Aliases
- Model QA Specialist
## Summary
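An independent model-audit specialist agent that performs end-to-end quality assessments of machine learning and statistical models. Its core principle is to treat every model as "guilty until proven sound".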
## Core Mission
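Runs a 10-stage audit process on ML and statistical models, covering documentation and governance, data reconstruction, feature analysis, model replication, calibration testing, interpretability analysis, and fairness auditing.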
## Key Characteristics
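- **Role**: Independent auditor; does not audit models it built itself
- **Personality**: Skeptical but collaborative; argues from evidence rather than opinion
- **Domain Expertise**: Finance, healthcare, e-commerce, advertising, insurance, manufacturing, and other industries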
## 10-Stage Audit Process
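1. Documentation and governance review
2. Data reconstruction and quality
3. Target/label analysis
4. Segmentation and cohort assessment
5. Feature analysis and engineering
6. Model replication and construction
7. Calibration testing
8. Performance and monitoring
9. Interpretability and fairness
10. Business impact and communication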
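
As an illustration of stage 7, the following is a minimal sketch of a Hosmer-Lemeshow style calibration statistic in plain Python. The function name, the equal-size grouping, and the handling of degenerate groups are assumptions made for this example, not the agent's actual implementation. It returns only the statistic; a p-value would additionally require a chi-square distribution, which is omitted here.

```python
def hosmer_lemeshow_statistic(y_true, probs, n_groups=10):
    """Hosmer-Lemeshow statistic over equal-size groups of sorted predictions.

    y_true: 0/1 outcomes; probs: predicted probabilities of the positive class.
    Hypothetical helper for illustration only.
    """
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("y_true and probs must be non-empty and equal in length")
    # Sort by predicted probability only (stable sort keeps ties in input order).
    pairs = sorted(zip(probs, y_true), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[n * g // n_groups : n * (g + 1) // n_groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed positives in the group
        expected = sum(p for p, _ in chunk)   # expected positives in the group
        denom = expected * (1 - expected / size)
        if denom <= 0:
            # Assumption: skip groups whose predictions are all 0 or all 1.
            continue
        stat += (observed - expected) ** 2 / denom
    return stat
```

A larger value indicates a wider gap between predicted and observed event rates across the groups.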
## Technical Deliverables
- Population Stability Index (PSI) calculation
- Discrimination metrics: Gini, KS, AUC
- Hosmer-Lemeshow calibration test
- SHAP global/local explanation analysis
- Partial Dependence Plots (PDP)
- Fairness audit report
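
To make the first two deliverables concrete, here is a rough standard-library sketch of PSI and the three discrimination metrics. All function names, the quantile binning, and the `eps` floor for empty bins are assumptions chosen for this example; an actual audit would more likely rely on an established numerical library.

```python
import math
from bisect import bisect_right


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the baseline `expected`."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    # Bin edges are quantiles of the baseline sample; ties collapse into one edge.
    ordered = sorted(expected)
    edges = sorted({ordered[int(len(ordered) * i / n_bins)] for i in range(1, n_bins)})

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor each share at eps so an empty bin does not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def _split(y_true, scores):
    pos = sorted(s for y, s in zip(y_true, scores) if y == 1)
    neg = sorted(s for y, s in zip(y_true, scores) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    return pos, neg


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos, neg = _split(y_true, scores)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(y_true, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(y_true, scores) - 1


def ks_statistic(y_true, scores):
    """Largest gap between the score CDFs of the positive and negative classes."""
    pos, neg = _split(y_true, scores)
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(scores)
    )
```

For example, `psi(baseline, baseline)` is zero for any non-empty sample, and the value grows as the current distribution drifts away from the baseline. The pairwise `auc` here is quadratic in sample size, which is fine for a sketch but not for large datasets.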
## Connections
- Part of: [[The Agency]]
- Techniques used: [[SHAP Analysis]], [[Population Stability Index (PSI)]], [[Calibration Testing]]
- Application domain: [[ML Ops]]
- Audit target: [[Responsible AI]]
## Evidence Standard
Every finding must include:
1. Observation
2. Evidence
3. Impact Assessment
4. Recommendation

Severity levels: High / Medium / Low / Info
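
One way to enforce this standard in code is a small record type that refuses incomplete findings. The sketch below is illustrative only: the `Finding` and `Severity` names and the validation rule are assumptions, not part of the agent's definition.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    INFO = "Info"


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one audit finding with its four required parts."""
    observation: str
    evidence: str
    impact_assessment: str
    recommendation: str
    severity: Severity

    def __post_init__(self):
        # Reject findings that leave any of the four required parts blank.
        for name in ("observation", "evidence", "impact_assessment", "recommendation"):
            if not getattr(self, name).strip():
                raise ValueError(f"finding is missing its {name}")
        if not isinstance(self.severity, Severity):
            raise TypeError("severity must be a Severity member")
```

A finding built with an empty `evidence` string raises `ValueError`, which keeps opinion-only findings out of a report.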