nexus/wiki/concepts/Defect-Prediction.md at 111bc65b7b36796956b7061a1e7f0e859eedb9ba

ishenwei/nexus

weishen 111bc65b7b Update nexus wiki content

2026-05-03 05:42:12 +08:00

Definition

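Defect prediction: using machine learning models trained on code metrics and historical defect data to predict which code regions are most likely to contain defects, so that testing resources can be directed where they matter most.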

Approach

Feature Engineering (from TestResultsAnalyzer)

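from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Features: code metrics. extract_code_metrics() and load_historical_defect_data()
# are project helpers that are not defined in this snippet.
features = extract_code_metrics()        # cyclomatic complexity, lines of code, change frequency, etc.
historical_defects = load_historical_defect_data()  # historical defect labels, one per row of features

# Train/test split (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    features, historical_defects, test_size=0.2, random_state=42
)

# Random Forest classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predictions with confidence, feature importance, and held-out accuracy
# Note: features includes the training rows, so use these probabilities for
# ranking code regions, not for evaluating the model.
predictions = model.predict_proba(features)      # class probabilities, shape (n_samples, n_classes)
feature_importance = model.feature_importances_  # one weight per feature column
accuracy = model.score(X_test, y_test)           # accuracy on the test split only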
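
The snippet above treats `historical_defects` as ready-made labels. If the underlying target is defect density, a classifier still needs discrete labels, and one possible way to derive them is to threshold density per module. The following is a minimal sketch under that assumption: the function names, the module tuples, and the threshold of 2.0 defects per KLOC are all hypothetical and do not come from TestResultsAnalyzer.

```python
from typing import List, Sequence


def defect_density(defect_count: int, lines_of_code: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defect_count * 1000 / lines_of_code


def label_defect_prone(densities: Sequence[float], threshold: float) -> List[int]:
    """Return 1 for modules whose density exceeds the threshold, else 0."""
    return [1 if d > threshold else 0 for d in densities]


# Hypothetical (module name, defect count, lines of code) records.
modules = [("auth.py", 6, 1200), ("billing.py", 1, 2500), ("utils.py", 0, 400)]
densities = [defect_density(count, loc) for _, count, loc in modules]
labels = label_defect_prone(densities, threshold=2.0)  # assumed threshold
```

The resulting `labels` list could stand in for `historical_defects`, provided its rows are in the same order as the feature matrix. The threshold is a project decision and would need to be chosen from real data.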

Key Metrics

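Prediction Accuracy: the model's accuracy on the test set; target ≥ 85%.
Feature Importance: which code metrics (cyclomatic complexity, change frequency, lines of code, etc.) carry the most predictive power for defects.
Confidence Score: every prediction comes with a confidence score.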
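
As an illustration of how two of these metrics might be checked in practice, the sketch below computes test-set accuracy against the 85% target and ranks code regions by confidence score. It uses only the standard library and takes predicted labels and probabilities as plain lists; the helper names, module names, and numbers are hypothetical sample values, not results from a real model.

```python
from typing import List, Sequence, Tuple

ACCURACY_TARGET = 0.85  # target stated in the Key Metrics list


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of held-out samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def meets_target(acc: float, target: float = ACCURACY_TARGET) -> bool:
    """True when accuracy reaches the target."""
    return acc >= target


def rank_by_confidence(
    modules: Sequence[str], defect_probs: Sequence[float], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k modules with the highest predicted defect probability."""
    pairs = sorted(zip(modules, defect_probs), key=lambda pair: pair[1], reverse=True)
    return pairs[:top_k]


# Hypothetical held-out labels and predictions (9 of 10 match).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 1]
acc = accuracy(y_true, y_pred)

# Hypothetical per-module defect probabilities.
modules = ["auth.py", "billing.py", "utils.py", "report.py"]
defect_probs = [0.82, 0.35, 0.10, 0.67]
top = rank_by_confidence(modules, defect_probs, top_k=2)
```

With a scikit-learn classifier, the defect probabilities would typically be the positive-class column of `predict_proba`. Accuracy alone can look good when defective modules are rare, so the 85% target is best read alongside the confidence ranking.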

Connections

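Statistical-Analysis: model validation requires statistical significance testing.
Test-Coverage-Analysis: predicted high-risk regions get priority for additional test coverage.
Release-Readiness-Assessment: defect prediction results feed into the overall release readiness assessment.
Quality-Metrics: defect density is the prediction model's target variable.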
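
The note does not say which significance test is meant for model validation. One common option is a label permutation test, which asks how often shuffled labels would match the predictions at least as well as the real labels do. The sketch below is a standard-library illustration of that idea; the function name, the sample data, and the 0.05 cutoff mentioned afterwards are assumptions.

```python
import random
from typing import Sequence


def permutation_p_value(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """Estimate how likely the observed accuracy is under randomly shuffled labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    rng = random.Random(seed)
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and predictions
        acc = sum(t == p for t, p in zip(shuffled, y_pred)) / n
        if acc >= observed:
            at_least_as_good += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical held-out labels: one informative predictor, one constant predictor.
labels = [0, 1] * 10
p_informative = permutation_p_value(labels, labels)
p_constant = permutation_p_value(labels, [0] * 20)
```

A small p-value (for example below 0.05) suggests the accuracy is unlikely to arise by chance, while a predictor that always outputs the same class gets a p-value of 1.0. scikit-learn's `permutation_test_score` applies the same idea with cross-validated refitting and may be a better fit for a real pipeline.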
