---
title: "Confidence Score"
type: concept
tags: ["identity-resolution", "decision-making", "threshold", "multi-agent"]
sources: ["identity-graph-operator"]
last_updated: 2026-04-25
---
# Confidence Score
## Definition
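The core metric for identity-resolution decisions: a merge confidence obtained by combining all field-level match evidence into a weighted average. It is the dividing metric that separates the three possible decisions: auto-merge, propose for review, or create a new entity.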
## Calculation
```
confidence = Σ(score_i × weight_i) / Σ(weight_i)
```
Here `score_i` is the field-level fuzzy/exact match score (in the range 0 to 1) and `weight_i` is the field's reliability weight.
### Example (from the Identity Graph Operator source code)
| Field | Record A value | Record B value | Normalizer | Comparator | Score | Weight |
|-------|----------------|----------------|------------|------------|-------|--------|
| email | wsmith@acme.com | wsmith@acme.com | email | exact | 1.0 | High |
| last_name | Smith | Smith | name | exact | 1.0 | High |
| first_name | William | Bill | name | nickname | 0.82 | Medium |
| phone | +155****0142 | +155****0142 | phone | exact | 1.0 | High |
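Combined confidence = `1.0×0.3 + 1.0×0.3 + 0.82×0.2 + 1.0×0.2` = 0.964 ≈ **0.96** (the weights sum to 1.0, so dividing by the total weight leaves the value unchanged).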
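
The worked example can be reproduced with a minimal Python sketch of the weighted-average formula. The function name `weighted_confidence` and the tuple layout are hypothetical and not taken from the Identity Graph Operator source; the numeric weights (0.3, 0.3, 0.2, 0.2) are the ones used in the calculation above.

```python
from typing import Iterable, Tuple


def weighted_confidence(evidence: Iterable[Tuple[float, float]]) -> float:
    """Return sum(score * weight) / sum(weight) over (score, weight) pairs.

    Hypothetical helper: the name and signature are illustrative only.
    """
    pairs = list(evidence)
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(score * weight for score, weight in pairs) / total_weight


# (score, weight) pairs from the worked example.
example = [
    (1.0, 0.3),   # email, exact match
    (1.0, 0.3),   # last_name, exact match
    (0.82, 0.2),  # first_name, nickname match
    (1.0, 0.2),   # phone, exact match
]

confidence = weighted_confidence(example)
print(round(confidence, 3))  # 0.964
```

Dividing by the total weight means the weights do not have to sum to 1; scaling all of them by the same factor leaves the result unchanged.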
## Decision Thresholds
```
confidence > 0.95 → auto-merge (single agent, high confidence)
0.60 ≤ confidence ≤ 0.95 → propose for review (multi-agent collaboration)
confidence < 0.60 → create new entity
```
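
As a sketch, the three bands above map onto a small classification function. The names `Decision` and `decide` are hypothetical, chosen for illustration; only the threshold values 0.95 and 0.60 and the boundary handling (0.95 itself falls into the review band) come from the rules above.

```python
from enum import Enum


class Decision(Enum):
    AUTO_MERGE = "auto_merge"
    PROPOSE_REVIEW = "propose_review"
    CREATE_NEW_ENTITY = "create_new_entity"


# Threshold values taken from the decision rules above.
AUTO_MERGE_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.60


def decide(confidence: float) -> Decision:
    """Map a confidence in [0, 1] to one of the three decisions."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > AUTO_MERGE_THRESHOLD:      # strictly greater than 0.95
        return Decision.AUTO_MERGE
    if confidence >= REVIEW_THRESHOLD:         # 0.60 to 0.95 inclusive
        return Decision.PROPOSE_REVIEW
    return Decision.CREATE_NEW_ENTITY          # below 0.60


print(decide(0.96).value)  # auto_merge
```

With these boundaries, the example confidence of 0.96 lands in the auto-merge band, while a score of exactly 0.95 would still be routed to review.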
## Field Reliability Weights
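| Field | Weight | Rationale |
|-------|--------|-----------|
| Email | High | Nearly unique; changing it requires deliberate action |
| Phone | High | Requires verification; costly to change |
| Name | Medium | Different people often share a name; must be combined with other fields |
| Address | Low | Addresses change often (e.g. moving house) |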
## Why Thresholds Matter
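- **Prevent false positives (False Merge)**: wrongly merging two different people, such as two people both named "John Smith". A high threshold combined with field-level evidence guards against this.
- **Prevent false negatives (Missed Match)**: treating the same person, such as "Bill Smith" / "William Smith", as different entities. The mid-range threshold triggers a review proposal instead of an outright rejection.
- **Explainability**: per-field evidence lets other agents and humans audit each decision.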
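
To illustrate the explainability point, here is one possible shape for a per-field evidence record and an audit report built from it. This is a sketch under assumptions: `FieldEvidence` and `explain` are hypothetical names, the attributes simply mirror the columns of the example table, and the numeric weights are the ones from the worked calculation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldEvidence:
    """One row of match evidence (hypothetical structure)."""
    field: str
    normalizer: str
    comparator: str
    score: float
    weight: float


def explain(evidence: List[FieldEvidence]) -> List[str]:
    """Return one human-readable line per field with its share of the total."""
    total_weight = sum(e.weight for e in evidence)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    lines = []
    for e in evidence:
        # Each field's contribution to the final weighted average.
        contribution = e.score * e.weight / total_weight
        lines.append(
            f"{e.field}: {e.comparator} match, score={e.score:.2f}, "
            f"weight={e.weight:.2f}, contribution={contribution:.3f}"
        )
    return lines


report = explain([
    FieldEvidence("email", "email", "exact", 1.0, 0.3),
    FieldEvidence("last_name", "name", "exact", 1.0, 0.3),
    FieldEvidence("first_name", "name", "nickname", 0.82, 0.2),
    FieldEvidence("phone", "phone", "exact", 1.0, 0.2),
])
for line in report:
    print(line)
```

Keeping the comparator and contribution next to each score lets a reviewer see, for instance, that the `first_name` match rested on a nickname comparison and not an exact one.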