Update nexus: fix conflicts and sync local changes
@@ -1,52 +1,52 @@
---
title: "Confidence Score"
type: concept
tags: ["identity-resolution", "decision-making", "threshold", "multi-agent"]
sources: ["identity-graph-operator"]
last_updated: 2026-04-25
---

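# Confidence Score

## Definition
The core metric for identity-resolution decisions: a merge confidence obtained by combining all field-level match evidence into a normalized weighted sum (a weighted average). It is the value that separates the three possible decisions: auto-merge, propose for review, or create a new entity.
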
## Calculation

```
confidence = Σ(score_i × weight_i) / Σ(weight_i)
```

Here `score_i` is the field-level fuzzy or exact match score (0 to 1) and `weight_i` is the reliability weight of that field.

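
The formula above can be sketched in a few lines of Python. This is only an illustration, not the Identity Graph Operator's code: the function name `confidence` and the dictionary-based inputs are assumptions made for the example.

```python
from typing import Mapping


def confidence(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of field-level match scores.

    `scores` maps field name -> match score in [0, 1].
    `weights` maps field name -> reliability weight (any positive scale).
    Only fields present in `scores` contribute to the result.
    """
    total_weight = sum(weights[field] for field in scores)
    if total_weight <= 0:
        raise ValueError("compared fields must have a positive total weight")
    weighted_sum = sum(scores[field] * weights[field] for field in scores)
    return weighted_sum / total_weight
```

Dividing by the total weight keeps the result between 0 and 1 even when the weights do not sum to 1, for instance when a field is missing from one record and is left out of the comparison.
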
### Example (from the Identity Graph Operator source)

| Field | Record A value | Record B value | Normalizer | Comparator | Score | Weight |
|-------|----------------|----------------|------------|------------|-------|--------|
| email | wsmith@acme.com | wsmith@acme.com | email | exact | 1.0 | High |
| last_name | Smith | Smith | name | exact | 1.0 | High |
| first_name | William | Bill | name | nickname | 0.82 | Medium |
| phone | +155****0142 | +155****0142 | phone | exact | 1.0 | High |

Combined confidence = `1.0×0.3 + 1.0×0.3 + 0.82×0.2 + 1.0×0.2` = 0.964 ≈ **0.96**. The numeric weights used here (0.3, 0.3, 0.2, 0.2, in row order) sum to 1.0, so the denominator `Σ(weight_i)` is 1.

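
The arithmetic can be checked with a short script. The scores and numeric weights are copied from the example; the variable names are illustrative only.

```python
# (field, score, weight) rows taken from the example table and its arithmetic.
rows = [
    ("email", 1.0, 0.3),
    ("last_name", 1.0, 0.3),
    ("first_name", 0.82, 0.2),
    ("phone", 1.0, 0.2),
]

weighted_sum = sum(score * weight for _, score, weight in rows)
total_weight = sum(weight for _, _, weight in rows)  # 1.0 for this example
result = weighted_sum / total_weight

print(round(result, 3))  # 0.964
```
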
## Decision Thresholds

```
confidence > 0.95          → auto-merge (single agent, high confidence)
0.60 ≤ confidence ≤ 0.95   → propose for review (multi-agent collaboration)
confidence < 0.60          → create new entity
```

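
A minimal sketch of how these bands might be applied in code. The `Decision` enum, the constant names, and the `decide` function are hypothetical; only the 0.95 and 0.60 cut-offs and their boundary handling come from the rules above.

```python
from enum import Enum


class Decision(Enum):
    AUTO_MERGE = "auto_merge"
    PROPOSE_REVIEW = "propose_review"
    CREATE_NEW = "create_new"


AUTO_MERGE_ABOVE = 0.95    # strictly greater than this -> auto-merge
REVIEW_AT_OR_ABOVE = 0.60  # from here up to 0.95 inclusive -> review


def decide(confidence: float) -> Decision:
    """Map a confidence score in [0, 1] to one of the three decisions."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence > AUTO_MERGE_ABOVE:
        return Decision.AUTO_MERGE
    if confidence >= REVIEW_AT_OR_ABOVE:
        return Decision.PROPOSE_REVIEW
    return Decision.CREATE_NEW
```

Note the boundary cases: a score of exactly 0.95 or exactly 0.60 goes to review, never to auto-merge or to a new entity.
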
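## Field Reliability Weights

| Field | Weight | Rationale |
|-------|--------|-----------|
| Email | High | Nearly unique; changing it takes deliberate action |
| Phone | High | Requires verification; costly to change |
| Name | Medium | Different people often share a name, so it must be combined with other fields |
| Address | Low | Addresses change often (people move) |
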
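
One way to turn these qualitative tiers into numbers is sketched below. The numeric tier values (3, 2, 1) are purely hypothetical and do not come from the source; because the confidence formula divides by the total weight, only the ratios between tiers affect the result.

```python
# Hypothetical numeric values for the qualitative tiers (assumption, not from the source).
TIER_VALUES = {"high": 3.0, "medium": 2.0, "low": 1.0}

# Tier assignments taken from the table above.
FIELD_TIERS = {"email": "high", "phone": "high", "name": "medium", "address": "low"}


def normalized_weights(fields):
    """Return weights for the given fields, scaled so that they sum to 1."""
    raw = {field: TIER_VALUES[FIELD_TIERS[field]] for field in fields}
    total = sum(raw.values())
    return {field: value / total for field, value in raw.items()}
```

For example, comparing only `email`, `name`, and `address` gives email half of the total weight under these assumed tier values.
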
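## Why Thresholds Matter
- **Prevent false positives** (False Merge): wrongly merging two different people, for example two people who are both named "John Smith". A high auto-merge threshold combined with field-level evidence guards against this.
- **Prevent false negatives** (Missed Match): failing to recognize the same person, for example "Bill Smith" and "William Smith", and leaving them as separate entities. The middle band triggers a review proposal instead of an outright rejection.
- **Explainability**: per-field evidence makes each decision auditable by other agents and by humans.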
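
To illustrate the explainability point, per-field evidence could be kept as a small record and rendered as an audit trail. The `FieldEvidence` class and `explain` function are hypothetical names for this sketch, not part of the Identity Graph Operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEvidence:
    name: str        # field name, e.g. "email"
    value_a: str     # value in record A
    value_b: str     # value in record B
    comparator: str  # e.g. "exact" or "nickname"
    score: float     # match score in [0, 1]
    weight: float    # reliability weight


def explain(evidence: list[FieldEvidence]) -> list[str]:
    """Describe each field's share of the final confidence, largest first."""
    total_weight = sum(e.weight for e in evidence)
    ranked = sorted(evidence, key=lambda e: e.score * e.weight, reverse=True)
    return [
        f"{e.name}: {e.value_a!r} vs {e.value_b!r} via {e.comparator}, "
        f"score={e.score:.2f}, contribution={e.score * e.weight / total_weight:.3f}"
        for e in ranked
    ]
```

The contributions add up to the overall confidence, so a reviewer can see which fields drove a merge proposal.
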