nexus/wiki/concepts/Data-Contract.md
2026-05-03 05:42:12 +08:00

---
title: "Data Contract"
type: concept
tags: [data-engineering, data-quality, schema, SLA]
sources: [engineering-data-engineer]
last_updated: 2026-05-02
---
## Definition
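A Data Contract is an explicit agreement between data producers and data consumers that defines a dataset's expected schema, data types, SLA, ownership, and consumers. Data contracts are the core quality-assurance mechanism for the Silver→Gold layers of a Medallion Architecture.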
## Components
### Schema Contract
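- Field names, types, and constraints (not_null, unique, foreign key)
- Schema evolution rules: adding nullable fields is allowed; removing fields or changing their types is forbidden
- `mergeSchema=true`: allows schema evolution, but the change must raise an alert rather than silently polluting downstream data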
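The evolution rules above can be sketched as a small compatibility check. This is a minimal illustration, not a dbt or Delta Lake feature: the `Field` type, the `check_schema_evolution` function, and the example columns `amount` and `coupon_code` are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column in a schema contract (hypothetical helper type)."""
    dtype: str
    nullable: bool = True


def check_schema_evolution(old: dict, new: dict) -> list:
    """Return contract violations; an empty list means the change is allowed."""
    violations = []
    for name, field in old.items():
        if name not in new:
            violations.append(f"removed field: {name}")
        elif new[name].dtype != field.dtype:
            violations.append(
                f"type change on {name}: {field.dtype} -> {new[name].dtype}"
            )
    for name, field in new.items():
        # Only additive, nullable fields are permitted.
        if name not in old and not field.nullable:
            violations.append(f"new field must be nullable: {name}")
    return violations


old = {"order_id": Field("string", nullable=False), "amount": Field("double")}
new = {
    "order_id": Field("string", nullable=False),
    "amount": Field("decimal"),       # type change: forbidden
    "coupon_code": Field("string"),   # new nullable field: allowed
}
print(check_schema_evolution(old, new))
# ['type change on amount: double -> decimal']
```

A non-empty result is the signal to alert on, so that drift is reported instead of flowing downstream unnoticed.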
### SLA Contract
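- Refresh frequency (e.g., "refreshed every 15 minutes")
- Data freshness threshold (e.g., "new data must arrive within 1 hour")
- Availability commitment (e.g., "99.9% availability for the Gold layer")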
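As a rough sketch of how these SLA terms translate into checks, the snippet below tests a freshness threshold and converts an availability commitment into a downtime budget. The function names `is_fresh` and `downtime_budget` are assumptions made for illustration, as is the 30-day measurement window.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(last_loaded_at: datetime, max_age: timedelta,
             now: Optional[datetime] = None) -> bool:
    """True if the newest data is no older than the freshness threshold."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


def downtime_budget(availability: float, period: timedelta) -> timedelta:
    """Downtime allowed within `period` under an availability commitment."""
    return period * (1.0 - availability)


now = datetime(2026, 5, 2, 12, 0, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)

# Data loaded 40 minutes ago meets a 1-hour threshold; 90 minutes ago does not.
print(is_fresh(datetime(2026, 5, 2, 11, 20, tzinfo=timezone.utc), one_hour, now))
print(is_fresh(datetime(2026, 5, 2, 10, 30, tzinfo=timezone.utc), one_hour, now))

# Downtime allowed by a 99.9% commitment over an assumed 30-day window.
print(downtime_budget(0.999, timedelta(days=30)))
```

Under these assumptions, a 99.9% commitment leaves about 43 minutes of downtime per 30 days, which is a useful number to state explicitly in the contract.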
### Ownership Contract
- Data Owner
- Data Consumer
- Support Contact
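One possible way to make ownership machine-checkable is to record it as a small structure and refuse contracts with blanks. The `OwnershipContract` class and the team names below are hypothetical; the source does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class OwnershipContract:
    """Ownership section of a data contract (hypothetical structure)."""
    owner: str
    support_contact: str
    consumers: list = field(default_factory=list)

    def missing_fields(self) -> list:
        """Names of ownership entries that are still empty."""
        missing = []
        if not self.owner.strip():
            missing.append("owner")
        if not self.support_contact.strip():
            missing.append("support_contact")
        if not self.consumers:
            missing.append("consumers")
        return missing


# Example values are placeholders, not real teams.
contract = OwnershipContract(
    owner="orders-data-team",
    support_contact="",
    consumers=["finance-reporting"],
)
print(contract.missing_fields())
# ['support_contact']
```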
## Enforcement
### dbt Contract Enforcement
```yaml
models:
  - name: silver_orders
    config:
      contract:
        enforced: true  # Enforce the schema contract; the build fails on a type mismatch
    columns:
      - name: order_id
        data_type: string
        constraints:
          - type: not_null
          - type: unique
```
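### Great Expectations (Data Quality Validation)
- Row-level data quality scores must be attached at the Gold layer
- Null-rate alert threshold (e.g., the `customer_id` null rate jumping from 0.1% to 4.2% triggers a PagerDuty alert)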
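The null-rate check can be illustrated in plain Python, without relying on the Great Expectations API. The helper names `null_rate` and `should_alert` and the 1% threshold are assumptions chosen for the example; in practice the boolean result would be routed to an alerting system such as PagerDuty.

```python
def null_rate(rows: list, column: str) -> float:
    """Fraction of rows in which `column` is missing or None."""
    if not rows:
        return 0.0
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows)


def should_alert(rate: float, threshold: float = 0.01) -> bool:
    """True when the null rate exceeds the (assumed) 1% threshold."""
    return rate > threshold


# 1,000 rows: a baseline batch with 1 null (0.1%) and a bad batch with 42 (4.2%).
baseline = [{"customer_id": None}] * 1 + [{"customer_id": "c1"}] * 999
bad_batch = [{"customer_id": None}] * 42 + [{"customer_id": "c1"}] * 958

print(should_alert(null_rate(baseline, "customer_id")))   # False
print(should_alert(null_rate(bad_batch, "customer_id")))  # True
```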
## Key Rules
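- **Schema drift must trigger an alert**: it must never silently corrupt downstream data
- **Null handling must be explicit**: nulls must not propagate implicitly into the Gold layer
- **Confirm with consumers before release**: Gold-layer pipelines may be deployed only after the data contract has been signed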
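One way to make null handling explicit is to quarantine rows that lack required values rather than letting them flow into Gold. The sketch below assumes rows are plain dictionaries; `split_on_required` is a hypothetical helper, not part of dbt or Great Expectations.

```python
def split_on_required(rows: list, required: list) -> tuple:
    """Separate rows into (clean, quarantined) by presence of required columns."""
    clean, quarantined = [], []
    for row in rows:
        if any(row.get(col) is None for col in required):
            quarantined.append(row)  # held back for review, never written to Gold
        else:
            clean.append(row)
    return clean, quarantined


rows = [
    {"order_id": "o1", "customer_id": "c1"},
    {"order_id": "o2", "customer_id": None},
]
clean, quarantined = split_on_required(rows, ["order_id", "customer_id"])
print(len(clean), len(quarantined))
# 1 1
```

Keeping the quarantined rows, instead of dropping them, preserves the evidence needed to trace the problem back to its producer.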
## Related Concepts
- [[Medallion Architecture]]
- [[Great Expectations]] (data quality validation tool)
- [[Data Lineage]]
- [[SCD Type 2]]