Update nexus wiki content

2026-05-03 05:42:06 +08:00
parent 90f3811b83
commit 111bc65b7b
707 changed files with 32306 additions and 7289 deletions

wiki/entities/dbt.md Normal file

@@ -0,0 +1,62 @@
---
title: "dbt (data build tool)"
type: entity
tags: [data-engineering, data-transformation, SQL, data-quality]
sources: [engineering-data-engineer]
last_updated: 2026-05-02
---
## Overview
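dbt (data build tool) is a SQL-first tool for data transformation and data quality management. It lets analysts and engineers define transformations in SQL and declare tests and quality contracts alongside them. The Data Engineer Agent uses dbt Cloud to define schema contracts and data quality tests for the Silver layer of the Medallion Architecture.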
## Core Capabilities
### Data Transformation
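- Define transformation models as SQL files under `models/`
- Supports Jinja2-templated SQL so that logic can be reused
- Incremental models (`incremental` materialization) reduce full recomputation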
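
As a rough illustration of the incremental materialization listed above, the fragment below configures a model to build incrementally from its properties file. The `unique_key` and the `merge` strategy are assumptions made for this sketch (strategy support varies by adapter). The model's SQL would normally also guard its new-rows filter with dbt's `is_incremental()` macro.

```yaml
# Sketch only: silver_orders reuses the model name from this page;
# unique_key and the strategy are illustrative assumptions.
models:
  - name: silver_orders
    config:
      materialized: incremental     # build incrementally instead of recreating the table each run
      unique_key: order_id          # rows with a matching key are updated rather than duplicated
      incremental_strategy: merge   # assumption: the adapter supports merge (e.g. on Delta Lake)
      on_schema_change: fail        # stop the run if the model's columns drift
```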
### Schema Contract Enforcement
```yaml
models:
  - name: silver_orders
    config:
      contract:
        enforced: true  # build fails when the schema does not match
    columns:
      - name: order_id
        data_type: string
        constraints:
          - type: not_null
          - type: unique
        tests:
          - not_null
          - unique
      - name: revenue
        data_type: decimal(18, 2)
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1000000
```
### Data Testing
- Column tests: `not_null`, `unique`, `relationships`
- dbt Expectations extension (value ranges, distributions, freshness)
- Recency tests, e.g. "new data must arrive within the last hour"
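
The tests above could be declared roughly as follows. This is a sketch, not the project's actual configuration: the `updated_at` and `customer_id` columns and the `silver_customers` model are hypothetical, and the recency check assumes the `dbt_utils` package is installed.

```yaml
models:
  - name: silver_orders
    tests:
      - dbt_utils.recency:            # fails if the newest row is older than the interval
          datepart: hour
          field: updated_at           # hypothetical timestamp column
          interval: 1                 # "new data within the last hour"
    columns:
      - name: customer_id             # hypothetical foreign-key column
        tests:
          - relationships:            # every value must exist in the parent model
              to: ref('silver_customers')   # hypothetical parent model
              field: customer_id
```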
### Semantic Layer (dbt Cloud)
- Define metrics once and reuse them across multiple BI tools
- Unify business metric definitions and eliminate duplicated logic in the BI layer
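
A metric defined once might look like the sketch below, which assumes the MetricFlow-based YAML spec introduced in dbt 1.6 (the exact keys can differ between versions). The `order_date` column, the semantic model name, and the `total_revenue` metric are hypothetical. `revenue` and `order_id` reuse columns from the contract example on this page.

```yaml
semantic_models:
  - name: orders                       # hypothetical semantic model name
    model: ref('silver_orders')
    defaults:
      agg_time_dimension: order_date   # hypothetical date column
    entities:
      - name: order_id
        type: primary
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: revenue
        agg: sum

metrics:
  - name: total_revenue                # hypothetical metric, queried by name from BI tools
    label: Total revenue
    type: simple
    type_params:
      measure: revenue
```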
## Integration with Lakehouse
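- **dbt + Spark/Delta Lake**: cleansing and conforming data in the Silver layer
- **dbt + Kafka**: combined with streaming writes, enables near-real-time Silver layer updates
- **dbt + Databricks**: native Unity Catalog integration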
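
For the Databricks case, a connection profile for the `dbt-databricks` adapter could look roughly like this when running dbt from the command line. The profile name, catalog, schema, and environment variable names are all placeholder assumptions. In dbt Cloud the equivalent settings are entered in the connection configuration rather than in a file.

```yaml
# profiles.yml sketch; every value below is a placeholder assumption
nexus_lakehouse:                 # hypothetical profile name
  target: dev
  outputs:
    dev:
      type: databricks           # provided by the dbt-databricks adapter
      catalog: main              # Unity Catalog catalog that models are built into
      schema: silver             # schema holding the Silver layer models
      host: "{{ env_var('DATABRICKS_HOST') }}"
      http_path: "{{ env_var('DATABRICKS_HTTP_PATH') }}"
      token: "{{ env_var('DATABRICKS_TOKEN') }}"
```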
## Related Concepts
- [[Medallion Architecture]]
- [[Data Contract]]
- [[Delta Lake]]