---
title: "Identity Resolution"
type: concept
tags: ["identity", "entity-resolution", "multi-agent", "data-matching"]
sources: ["identity-graph-operator"]
last_updated: 2026-04-25
---
# Identity Resolution
## Definition
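The process of consolidating multiple records from different sources under a single **canonical entity_id**, so that each real-world entity (person, company, or product) maps to exactly one unique identifier in the system and every Agent shares the same canonical view.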
## Core Workflow
```
Raw records → Normalize → Blocking → Scoring → Clustering → Canonical Entity
```
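1. **Normalize**: lowercase email addresses, format phone numbers as E.164, and expand nicknames (Bill → William).
2. **Blocking**: use a blocking key (email domain / phone prefix / name Soundex) to select candidate pairs quickly and avoid an O(n²) scan over all pairs.
3. **Scoring**: compute a field-level weighted similarity (for example, email exact match = 1.0, name fuzzy match = 0.82).
4. **Clustering**: group high-confidence candidates into the same cluster and generate the canonical entity_id.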
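
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Identity Graph Operator's actual implementation: the nickname table, the 10-digit US phone rule, the field weights, the use of `difflib.SequenceMatcher` as the fuzzy name measure, the email domain as the only blocking key, and the choice of the smallest record id as the cluster representative are all hypothetical.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical nickname table; a real system would use a much larger one.
NICKNAMES = {"bill": "william", "bob": "robert"}

def normalize(record):
    """Lowercase the email, reduce the phone to '+digits', expand nicknames."""
    email = record.get("email", "").strip().lower()
    digits = re.sub(r"\D", "", record.get("phone", ""))
    # Assumption: a bare 10-digit number is a US number. Real E.164
    # formatting needs a country-aware phone library.
    if len(digits) == 10:
        digits = "1" + digits
    phone = "+" + digits if digits else ""
    parts = record.get("name", "").lower().split()
    name = " ".join(NICKNAMES.get(p, p) for p in parts)
    return {"id": record["id"], "email": email, "phone": phone, "name": name}

def blocking_key(rec):
    """Email domain as the blocking key (one of the keys listed above)."""
    return rec["email"].rsplit("@", 1)[-1] if "@" in rec["email"] else ""

def candidate_pairs(records):
    """Yield pairs only within a block instead of comparing all n*(n-1)/2 pairs."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec)
    for key, block in blocks.items():
        if key:  # records without a usable key are not paired in this sketch
            yield from combinations(block, 2)

def score(a, b, w_email=0.6, w_name=0.4):
    """Field-level weighted similarity; the weights are illustrative."""
    email_sim = 1.0 if a["email"] and a["email"] == b["email"] else 0.0
    name_sim = SequenceMatcher(None, a["name"], b["name"]).ratio()
    return w_email * email_sim + w_name * name_sim

def cluster(records, threshold=0.95):
    """Union-find over candidate pairs scoring above the threshold."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in candidate_pairs(records):
        if score(a, b) > threshold:
            ra, rb = find(a["id"]), find(b["id"])
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)  # smallest id represents the cluster
    return {r["id"]: find(r["id"]) for r in records}

raw = [
    {"id": "r1", "email": "Bill.Gates@Example.com", "phone": "(425) 555-0100", "name": "Bill Gates"},
    {"id": "r2", "email": "bill.gates@example.com", "phone": "+1 425 555 0100", "name": "William Gates"},
    {"id": "r3", "email": "ann@example.com", "phone": "", "name": "Ann Lee"},
]
records = [normalize(r) for r in raw]
print(cluster(records))  # {'r1': 'r1', 'r2': 'r1', 'r3': 'r3'}
```

All three records share the `example.com` block, but only `r1` and `r2` agree on both email and normalized name, so only they end up in the same cluster.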
## Key Properties
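- **Deterministic**: the same input must always return the same entity_id (enforced by the Identity Graph Operator).
- **Evidence-driven**: every merge decision must carry per-field evidence; "looks similar" assertions are rejected.
- **Concurrency-safe**: optimistic locking (a version field) prevents conflicting concurrent writes.
- **Auditable**: a complete event history is kept (entity.created / merged / split / updated).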
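
As a rough illustration of how determinism, optimistic locking, and the audit trail might fit together, here is a small in-memory sketch. `EntityStore`, `VersionConflict`, and `deterministic_entity_id` are hypothetical names, and hashing the sorted member record ids is only one possible way to obtain a stable entity_id; the source does not say how the Identity Graph Operator does it.

```python
import hashlib
from dataclasses import dataclass

class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""

@dataclass
class Entity:
    entity_id: str
    attrs: dict
    version: int = 0

def deterministic_entity_id(member_record_ids):
    """Same set of member records gives the same id, regardless of order."""
    joined = "\n".join(sorted(member_record_ids))
    return "ent_" + hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]

class EntityStore:
    def __init__(self):
        self._entities = {}
        self.events = []  # append-only event history for auditing

    def create(self, entity_id, attrs):
        self._entities[entity_id] = Entity(entity_id, dict(attrs))
        self.events.append(("entity.created", entity_id, 0))

    def get(self, entity_id):
        e = self._entities[entity_id]
        return Entity(e.entity_id, dict(e.attrs), e.version)  # snapshot copy

    def update(self, entity_id, attrs, expected_version):
        current = self._entities[entity_id]
        # Optimistic lock: reject the write if someone else updated first.
        # In a real datastore this check and the write must be one atomic step.
        if current.version != expected_version:
            raise VersionConflict(
                f"{entity_id}: expected v{expected_version}, found v{current.version}"
            )
        current.attrs.update(attrs)
        current.version += 1
        self.events.append(("entity.updated", entity_id, current.version))
        return current.version

store = EntityStore()
eid = deterministic_entity_id(["r2", "r1"])
store.create(eid, {"name": "william gates"})
agent_a = store.get(eid)  # both agents read version 0
agent_b = store.get(eid)
store.update(eid, {"phone": "+14255550100"}, agent_a.version)  # succeeds
try:
    store.update(eid, {"name": "bill gates"}, agent_b.version)  # stale version
except VersionConflict as exc:
    print("rejected:", exc)
```

The second writer has to re-read the entity and retry against the new version, which is what keeps concurrent Agents from silently overwriting each other.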
## Confidence Thresholds
| Confidence | Action |
|------------|--------|
| > 0.95 | Merge directly (single Agent, high confidence) |
| 0.60–0.95 | Propose for review (multi-Agent collaboration) |
| < 0.60 | Create a new entity |
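
The table translates directly into a small routing function. The function name and the returned action labels below are hypothetical; the boundaries follow the table, so scores of exactly 0.95 and 0.60 fall into the review band.

```python
def route(confidence: float) -> str:
    """Map a match confidence to the action given in the threshold table."""
    if confidence > 0.95:
        return "auto_merge"       # single Agent, high confidence
    if confidence >= 0.60:
        return "propose_review"   # multi-Agent collaboration
    return "create_new_entity"

for c in (0.97, 0.82, 0.41):
    print(c, route(c))
```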
## Relationship to Related Concepts
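- [[Identity Resolution]] ⊂ [[Master-Data-Management]] (MDM): identity resolution is the distributed implementation of MDM in multi-Agent systems, adding a concurrency-coordination dimension.
- [[Identity Resolution]] at the application layer: contact deduplication in [[Personal CRM]]; enterprise-grade multi-Agent coordination in [[Identity-Graph-Operator]].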
## Related Agents
- [[identity-graph-operator]]: packages the Identity Resolution capability as an Agent for multi-Agent systems.