Auto-sync: 2026-04-21 00:02

2026-04-21 00:02:55 +08:00
parent 177469a1cd
commit cb7c11e14f
235 changed files with 16567 additions and 237 deletions

@@ -1,23 +1,25 @@
---
title: "ML Ops"
type: concept
tags: [DevOps, ML, operations]
sources: [public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machine-learning-20240206]
last_updated: 2024-02-06
tags: [machine-learning, operations, lifecycle]
sources: [specialized-model-qa]
last_updated: 2026-04-20
---
## Summary
ML Ops combines machine learning and operations, spanning people, technology, and processes to deliver collaborative ML solutions.
## Definition
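ML Ops (Machine Learning Operations) is the practice of applying DevOps principles to machine learning systems, including the automation and monitoring of data pipelines, training pipelines, and inference pipelines.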
ML Ops is the discipline of operationalizing machine learning models across development, deployment, monitoring, and governance.
## Key Attributes
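- **Core components**: data pipelines, training pipelines, inference pipelines
- **Focus areas**: data provenance, model management, deployment workflows
- **Tools**: Amazon SageMaker, MLflow, Kubeflow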
## Core Areas
- Data pipelines
- Training and deployment
- Monitoring and drift detection
- Governance and auditability
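
As an illustration of the "Monitoring and drift detection" area above, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), computed between a reference sample (for example, training data) and a current sample (for example, recent inference inputs). The note itself does not prescribe a method; the function name `population_stability_index`, the bin count, and the 0.2 alert threshold are assumptions chosen for the example, with 0.2 being a commonly cited rule of thumb rather than a standard.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Hypothetical helper: PSI of one numeric feature between two samples.

    Bins are equal-width over the range of the reference sample; values in
    the current sample that fall outside that range are clamped to the
    first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values
            counts[idx] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / total, eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Assumed alert threshold for this sketch; tune it per feature in practice.
DRIFT_THRESHOLD = 0.2

reference_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
drift_detected = psi_shifted > DRIFT_THRESHOLD
```

A check like this would typically run on a schedule per feature, with the resulting scores stored alongside the model version so that audits and retraining decisions can refer back to them.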
## Connections
- [[DevOps]] ← extends ← [[ML Ops]]
- [[ML Ops]] ← manages ← [[Machine Learning]]
- [[Amazon SageMaker]] ← implements ← [[ML Ops]]
## Relevance to Model QA
- Provides the operational context for audits
- Supplies monitoring and reproducibility artifacts
- Supports remediation and retraining loops
## Related Concepts
- [[Model Audit]]
- [[Discrimination Metrics]]