Update nexus wiki content
wiki/concepts/Data-Contract.md
@@ -0,0 +1,61 @@
---
title: "Data Contract"
type: concept
tags: [data-engineering, data-quality, schema, SLA]
sources: [engineering-data-engineer]
last_updated: 2026-05-02
---

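## Definition

A data contract is an explicit agreement between data producers and data consumers that defines a dataset's expected schema, data types, SLA, ownership, and consumers. In a Medallion Architecture, data contracts are the core mechanism for quality assurance across the Silver→Gold transition.
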
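## Components

### Schema Contract
- Field names, types, and constraints (not_null, unique, foreign key)
- Schema evolution rules: adding nullable fields is allowed; removing fields or changing their types is prohibited
- `mergeSchema=true`: permits schema evolution, but the change must raise an alert rather than silently polluting downstream data
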
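The evolution rules above can be sketched as a small check. This is a minimal illustration in plain Python, not part of any library: the `Field` type, the `check_schema_evolution` function, and every sample column other than `order_id` are hypothetical names chosen for the example.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    """Hypothetical description of one column in a schema contract."""
    dtype: str
    nullable: bool


def check_schema_evolution(old: Dict[str, Field], new: Dict[str, Field]) -> List[str]:
    """Return contract violations; an empty list means the change is allowed."""
    violations = []
    # Existing fields may not be removed or change type.
    for name, old_field in old.items():
        if name not in new:
            violations.append(f"removed field: {name}")
        elif new[name].dtype != old_field.dtype:
            violations.append(
                f"type change on {name}: {old_field.dtype} -> {new[name].dtype}"
            )
    # Newly added fields must be nullable.
    for name, new_field in new.items():
        if name not in old and not new_field.nullable:
            violations.append(f"new field must be nullable: {name}")
    return violations


old = {"order_id": Field("string", False), "amount": Field("double", True)}
ok_change = {**old, "coupon_code": Field("string", True)}
bad_change = {"order_id": Field("bigint", False), "channel": Field("string", False)}

print(check_schema_evolution(old, ok_change))   # []
print(check_schema_evolution(old, bad_change))  # three violations
```

A check along these lines could run in CI or at ingestion time, so that a violating change raises an alert before it reaches downstream tables.
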
### SLA Contract
- Refresh frequency (e.g. "refreshed every 15 minutes")
- Data freshness threshold (e.g. "new data must arrive within 1 hour")
- Availability commitment (e.g. "99.9% availability for the Gold layer")

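To make the SLA terms concrete, the sketch below checks a freshness threshold and converts an availability percentage into a downtime budget. It uses only the standard library; the function name `is_fresh`, the fixed timestamps, and the 30-day window are assumptions made for illustration, while the 1-hour and 99.9% figures are the examples from the list above.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(latest_record_at: datetime, now: datetime, max_staleness: timedelta) -> bool:
    """True when the newest record is no older than the freshness threshold."""
    return now - latest_record_at <= max_staleness


FRESHNESS_THRESHOLD = timedelta(hours=1)  # "new data must arrive within 1 hour"

now = datetime(2026, 5, 2, 12, 0, tzinfo=timezone.utc)
recent = datetime(2026, 5, 2, 11, 30, tzinfo=timezone.utc)
stale = datetime(2026, 5, 2, 10, 15, tzinfo=timezone.utc)

print(is_fresh(recent, now, FRESHNESS_THRESHOLD))  # True
print(is_fresh(stale, now, FRESHNESS_THRESHOLD))   # False

# 99.9% availability leaves 1/1000 of the period as downtime budget.
# The 30-day window is an assumption; the contract should state its own period.
downtime_budget = timedelta(days=30) / 1000
print(downtime_budget)  # 0:43:12
```

Stating the budget as a duration (about 43 minutes per 30 days) tends to be easier for producers and consumers to reason about than the percentage alone.
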
### Ownership Contract
- Data Owner
- Data Consumer
- Support Contact

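One way to keep the three roles above from being left blank is to record them as a typed structure and validate it. The following is only a sketch: the `OwnershipContract` class, its field names, and the team names and address are all hypothetical; only `silver_orders` comes from this page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OwnershipContract:
    """Hypothetical record of who owns, consumes, and supports a dataset."""
    dataset: str
    data_owner: str
    data_consumers: Tuple[str, ...]
    support_contact: str

    def missing_roles(self) -> List[str]:
        """Names of roles that were left empty."""
        missing = []
        if not self.data_owner.strip():
            missing.append("data_owner")
        if not self.data_consumers:
            missing.append("data_consumers")
        if not self.support_contact.strip():
            missing.append("support_contact")
        return missing


complete = OwnershipContract(
    dataset="silver_orders",
    data_owner="orders-platform-team",            # hypothetical team name
    data_consumers=("finance-analytics",),        # hypothetical consumer
    support_contact="orders-oncall@example.com",  # hypothetical address
)
incomplete = OwnershipContract("silver_orders", "orders-platform-team", (), "")

print(complete.missing_roles())    # []
print(incomplete.missing_roles())  # ['data_consumers', 'support_contact']
```
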
## Enforcement

### dbt Contract Enforcement
```yaml
models:
  - name: silver_orders
    config:
      contract:
        enforced: true  # Enforce the schema contract; the build fails on a type mismatch
    columns:
      - name: order_id
        data_type: string
        constraints:
          - type: not_null
          - type: unique
```

### Great Expectations (data quality validation)
- Row-level data quality scores must be attached at the Gold layer
- Null-rate alert thresholds (e.g. the `customer_id` null rate jumping from 0.1% to 4.2% → triggers PagerDuty)

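The null-rate alert above can be illustrated without any particular tool. The sketch below is plain Python and does not use the Great Expectations API; `null_rate`, `check_null_rate`, the 1% threshold, and the list-based `alert` callback (standing in for a paging integration) are assumptions for the example. The 0.1% and 4.2% rates are the ones quoted in the list.

```python
from typing import Callable, List, Optional, Sequence


def null_rate(values: Sequence[Optional[object]]) -> float:
    """Fraction of values that are None; 0.0 for an empty column."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def check_null_rate(
    column: str,
    values: Sequence[Optional[object]],
    threshold: float,
    alert: Callable[[str], None],
) -> bool:
    """Return True when the column passes; call `alert` and return False otherwise."""
    rate = null_rate(values)
    if rate > threshold:
        alert(f"{column} null rate {rate:.1%} exceeds threshold {threshold:.1%}")
        return False
    return True


THRESHOLD = 0.01  # hypothetical 1% limit; the contract should define the real value

healthy = [None] * 1 + ["c"] * 999     # 0.1% nulls
degraded = [None] * 42 + ["c"] * 958   # 4.2% nulls

alerts: List[str] = []
print(check_null_rate("customer_id", healthy, THRESHOLD, alerts.append))   # True
print(check_null_rate("customer_id", degraded, THRESHOLD, alerts.append))  # False
print(alerts)
```

In a real pipeline the callback would forward the message to the on-call system, and the threshold would come from the contract rather than a constant in code.
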
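## Key Rules

- **Schema drift must raise an alert**: it must never silently corrupt downstream data
- **Null handling must be explicit**: nulls must not be implicitly propagated to the Gold layer
- **Confirm with consumers before release**: the Gold-layer pipeline may be deployed only after the data contract is signed
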
## Related Concepts
- [[Medallion Architecture]]
- [[Great Expectations]] (data quality validation tool)
- [[Data Lineage]]
- [[SCD Type 2]]