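---
title: "Population Stability Index"
type: concept
tags: [model-monitoring, feature-drift, model-governance]
last_updated: 2026-04-25
---

## Definition

The Population Stability Index (PSI) quantifies the difference between two distributions, typically the development (baseline) sample and the actual (current) sample. It is widely used to monitor distribution drift in a machine learning model's input features and output scores, and is a core monitoring tool in model lifecycle management.

## Algorithm

$$\text{PSI} = \sum_{i=1}^{n} (act_i - exp_i) \times \ln\left(\frac{act_i}{exp_i}\right)$$

where:
- $n$ = number of bins
- $act_i$ = proportion of the actual (current) sample falling in bin $i$
- $exp_i$ = proportion of the expected (baseline) sample falling in bin $i$
- **Laplace smoothing** (add-one smoothing) is applied to the bin counts so that no proportion is zero, which avoids division by zero and taking the logarithm of zero

## Interpretation Thresholds

| PSI Range | Interpretation | Recommended Action |
|-----------|----------------|--------------------|
| < 0.10 | 🟢 No significant drift | No action needed |
| 0.10 to < 0.25 | 🟡 Moderate drift | Investigate the cause, monitor closely |
| ≥ 0.25 | 🔴 Significant drift | **Act immediately**, consider retraining |

## Implementation

```python
import numpy as np
import pandas as pd


def compute_psi(expected: pd.Series, actual: pd.Series, bins: int = 10) -> float:
    """
    Compute Population Stability Index between two distributions.

    Interpretation:
        < 0.10          -> No significant shift (green)
        0.10 to < 0.25  -> Moderate shift, investigation recommended (amber)
        >= 0.25         -> Significant shift, action required (red)
    """
    expected = expected.dropna()
    actual = actual.dropna()

    # Bin edges are quantiles of the baseline; ties can produce duplicate edges.
    breakpoints = np.linspace(0, 100, bins + 1)
    edges = np.unique(np.percentile(expected, breakpoints))
    if len(edges) < 3:
        raise ValueError("baseline has too few distinct values for quantile binning")
    # Open the outer bins so values outside the baseline range are still counted.
    edges[0], edges[-1] = -np.inf, np.inf
    n_bins = len(edges) - 1

    expected_counts = np.histogram(expected, bins=edges)[0]
    actual_counts = np.histogram(actual, bins=edges)[0]

    # Laplace (add-one) smoothing keeps every proportion strictly positive
    exp_pct = (expected_counts + 1) / (expected_counts.sum() + n_bins)
    act_pct = (actual_counts + 1) / (actual_counts.sum() + n_bins)

    psi = np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct))
    return round(float(psi), 6)


def variable_stability_report(
    df: pd.DataFrame,
    date_col: str,
    variables: list[str],
    psi_threshold: float = 0.25,
) -> pd.DataFrame:
    """Monthly stability report for model features.

    The earliest value in `date_col` is used as the baseline period.
    Returns a variable-by-period table of PSI values.
    """
    periods = sorted(df[date_col].unique())
    baseline = df[df[date_col] == periods[0]]

    results = []
    for var in variables:
        for period in periods[1:]:
            current = df[df[date_col] == period]
            psi = compute_psi(baseline[var], current[var])
            results.append({
                "variable": var,
                "period": period,
                "psi": psi,
                # Recorded per row; the pivot below returns PSI values only.
                "flag": "🔴" if psi >= psi_threshold else ("🟡" if psi >= 0.10 else "🟢"),
            })

    return pd.DataFrame(results).pivot_table(
        index="variable", columns="period", values="psi"
    ).round(4)
```

## Application in Model QA

The Model QA Specialist applies PSI in the following scenarios:

1. **Feature stability monitoring**: compute PSI for every feature each month to catch the earliest warning signs of drift
2. **Score distribution monitoring**: compute PSI on the model's output scores to detect changes in the overall prediction distribution
3. **Segmented PSI**: compute PSI separately on subpopulations to find drift confined to specific segments (local problems that the overall PSI masks)
4. **Retraining trigger**: treat PSI ≥ 0.25 as a hard trigger for automatic retraining

## Relationship

- **Depended on by** [[SHAP]]: PSI identifies distribution drift; SHAP analyzes how feature contributions change after the drift
- **Depended on by** [[Discrimination-Metrics]]: PSI drift usually appears before a decline in AUC/Gini, which makes it an early-warning indicator
- **Depended on by** [[Calibration-Testing]]: feature distribution drift (as measured by PSI) is one of the root causes of calibration failure
- **Supports** [[specialized-model-qa]] (Source): a core metric in the Model QA Specialist's monitoring framework

## Key Insights

- **Directionality trap**: PSI reflects only the magnitude of the distribution difference, not its direction (a high→low shift and a low→high shift both register as drift)
- **Threshold dependence**: the 0.10/0.25 thresholds are an industry convention; actual thresholds should be adjusted to the business risk
- **Feature vs. score PSI**: feature PSI tends to move before score PSI, so it is the more sensitive early warning
- **Monitoring frequency**: production models should be checked at least monthly; weekly or even daily checks are recommended for business-critical models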
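
The directionality point under Key Insights can be checked numerically. The sketch below is illustrative only: the bin proportions are made up, and `psi_from_proportions` and `mean_bin_shift` are hypothetical helper names, not part of the implementation above. `mean_bin_shift` is one simple way to recover a direction (the change in the mean bin index); it is an assumption of this example rather than a standard companion metric.

```python
import math


def psi_from_proportions(exp_pct, act_pct):
    """PSI from two lists of bin proportions (same bins, all values > 0)."""
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_pct, act_pct))


def mean_bin_shift(exp_pct, act_pct):
    """Hypothetical direction helper: change in the mean bin index.

    Positive means mass moved toward higher bins, negative toward lower bins.
    """
    return sum(i * (a - e) for i, (e, a) in enumerate(zip(exp_pct, act_pct)))


# Illustrative proportions, not real data.
baseline = [0.2, 0.3, 0.3, 0.2]
shifted_up = [0.1, 0.2, 0.3, 0.4]    # mass moves toward the top bin
shifted_down = [0.4, 0.3, 0.2, 0.1]  # mirror image: mass moves toward the bottom bin

psi_up = psi_from_proportions(baseline, shifted_up)
psi_down = psi_from_proportions(baseline, shifted_down)

# Both shifts give the same PSI (about 0.2485); only the signed helper differs.
print(f"up:   PSI={psi_up:.4f}, bin shift={mean_bin_shift(baseline, shifted_up):+.2f}")
print(f"down: PSI={psi_down:.4f}, bin shift={mean_bin_shift(baseline, shifted_down):+.2f}")
```

Because the two shifts are mirror images around a symmetric baseline, PSI cannot tell them apart, so a monitoring report that needs direction has to carry a signed statistic alongside PSI.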