Update nexus wiki content
62
wiki/entities/dbt.md
Normal file
@@ -0,0 +1,62 @@
---
title: "dbt (data build tool)"
type: entity
tags: [data-engineering, data-transformation, SQL, data-quality]
sources: [engineering-data-engineer]
last_updated: 2026-05-02
---

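## Overview

dbt (data build tool) is a SQL-first tool for data transformation and data quality management. It lets analysts and engineers define transformations in SQL and declare tests and quality contracts in YAML alongside the models. The Data Engineer Agent uses dbt Cloud to define schema contracts and data quality tests for the Silver layer of the Medallion Architecture.
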
## Core Capabilities

### Data Transformation
- Define transformation models as SQL files under `models/`
- Jinja2-templated SQL for reusing logic across models
- Incremental models (`incremental` materialization) avoid recomputing the full table on every run

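As a rough illustration of the incremental materialization listed above, a model can be switched to incremental builds from a properties file. This is a minimal sketch under assumptions: the file path is hypothetical, and using `order_id` as the `unique_key` is borrowed from the `silver_orders` contract example on this page rather than confirmed by the source.

```yaml
# models/silver/_silver_models.yml (hypothetical path)
models:
  - name: silver_orders
    config:
      materialized: incremental   # build incrementally instead of recreating the table each run
      unique_key: order_id        # assumed key; rows with a matching key are updated, not duplicated
```

The model's SQL would still typically guard its filter with `{% if is_incremental() %}` so that later runs select only new rows. Without such a filter, the query reads the full source on every run even though the materialization is incremental.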
### Schema Contract Enforcement
```yaml
models:
  - name: silver_orders
    config:
      contract:
        enforced: true  # the build fails when the schema does not match
    columns:
      - name: order_id
        data_type: string
        constraints:
          - type: not_null
          - type: unique
        tests:
          - not_null
          - unique
      - name: revenue
        data_type: decimal(18, 2)
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 1000000
```

### Data Testing
- Column tests (`not_null`, `unique`, `relationships`)
- dbt Expectations extension (value ranges, distributions, freshness)
- Recency tests (for example, "new data must arrive within the last hour")

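The test types above might be declared roughly as follows. This sketch assumes the `dbt_utils` package is installed (its `recency` test is one way to express the one-hour rule; the source does not say which package is used). The `updated_at` and `customer_id` columns and the `silver_customers` model are hypothetical names chosen for illustration.

```yaml
models:
  - name: silver_orders
    tests:
      # Model-level recency check: fails if the newest updated_at is more than 1 hour old.
      - dbt_utils.recency:
          datepart: hour
          field: updated_at                  # hypothetical timestamp column
          interval: 1
    columns:
      - name: customer_id                    # hypothetical foreign-key column
        tests:
          - not_null
          - relationships:
              to: ref('silver_customers')    # hypothetical parent model
              field: customer_id
```

The `relationships` test fails when a `customer_id` in `silver_orders` has no matching row in the referenced model, which is how referential integrity is usually checked in dbt.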
### Semantic Layer (dbt Cloud)
- Define metrics once and reuse them across multiple BI tools
- Unify business metric definitions and eliminate duplicated logic in the BI layer

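A "define once" metric could look something like the sketch below. It assumes the MetricFlow-based YAML spec introduced with dbt 1.6; exact keys differ between dbt versions, so treat it as indicative only. The `order_date` column and all semantic-model, measure, and metric names are hypothetical.

```yaml
semantic_models:
  - name: orders                       # hypothetical semantic model name
    model: ref('silver_orders')
    defaults:
      agg_time_dimension: order_date
    entities:
      - name: order
        type: primary
        expr: order_id
    dimensions:
      - name: order_date               # hypothetical date column
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: revenue_total
        agg: sum
        expr: revenue

metrics:
  - name: total_revenue                # single definition that BI tools would query
    label: Total revenue
    type: simple
    type_params:
      measure: revenue_total
```

Because the aggregation lives in the dbt project, a BI tool connected to the Semantic Layer would request `total_revenue` by name instead of re-implementing the `SUM(revenue)` logic.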
## Integration with Lakehouse

- **dbt + Spark/Delta Lake**: cleaning and conforming data in the Silver layer
- **dbt + Kafka**: combined with streaming writes for near-real-time Silver layer updates
- **dbt + Databricks**: native Unity Catalog integration

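For the Databricks case, a connection profile for the `dbt-databricks` adapter might look like the sketch below. It assumes a dbt Core setup (dbt Cloud configures the connection in its UI instead). The profile name, catalog, schema, and environment variable names are all placeholders, not values taken from the source.

```yaml
# ~/.dbt/profiles.yml (dbt Core only)
nexus_lakehouse:                # hypothetical profile name; must match `profile:` in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: main             # Unity Catalog catalog; assumed name
      schema: silver            # schema for Silver-layer models; assumed name
      host: "{{ env_var('DATABRICKS_HOST') }}"
      http_path: "{{ env_var('DATABRICKS_HTTP_PATH') }}"
      token: "{{ env_var('DATABRICKS_TOKEN') }}"
```

The `catalog` key is where the Unity Catalog integration shows up: models are created under `catalog.schema`, so governance rules set in Unity Catalog apply to the tables dbt builds.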
## Related Concepts
- [[Medallion Architecture]]
- [[Data Contract]]
- [[Delta Lake]]