Auto-sync: 2026-04-21 00:02
@@ -1,23 +1,25 @@
---
title: "ML Ops"
type: concept
tags: [DevOps, ML, operations]
sources: [public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machine-learning-20240206]
last_updated: 2024-02-06
tags: [machine-learning, operations, lifecycle]
sources: [specialized-model-qa]
last_updated: 2026-04-20
---
## Summary
ML Ops combines machine learning and operations, spanning people, technology, and process to deliver collaborative ML solutions.
## Definition
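ML Ops (Machine Learning Operations) is the practice of applying DevOps principles to machine learning systems, including the automation and monitoring of data pipelines, training pipelines, and inference pipelines.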
ML Ops is the discipline of operationalizing machine learning models across development, deployment, monitoring, and governance.
## Key Attributes
- **Core components**: data pipelines, training pipelines, inference pipelines
- **Focus areas**: data lineage, model management, deployment workflows
- **Tools**: Amazon SageMaker, MLflow, Kubeflow
## Core Areas
- Data pipelines
- Training and deployment
- Monitoring and drift detection
- Governance and auditability
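
As an illustration of the monitoring and drift detection item above, here is a minimal sketch of one common drift score, the population stability index (PSI), which compares the binned distribution of a feature in a reference sample against a current sample. The function name, the equal-width binning, the `bins` default, and the `eps` floor are all assumptions made for this example and are not taken from the source note; production tools typically offer more robust variants.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,      # assumed default; not from the source note
    eps: float = 1e-6,   # assumed floor that avoids log(0) for empty bins
) -> float:
    """Return the PSI between two samples of one numeric feature."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    # Bin edges are derived from the reference sample only.
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped to the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    # Every term is non-negative, so the score is 0 only when the
    # binned fractions match exactly.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A score of zero means the binned distributions match; larger values indicate a larger shift. Any alerting threshold would need to be chosen per feature and is not specified by the note.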
## Connections
- [[DevOps]] ← extends ← [[ML Ops]]
- [[ML Ops]] ← manages ← [[Machine Learning]]
- [[Amazon SageMaker]] ← implements ← [[ML Ops]]
## Relevance to Model QA
- Provides the operational context for audits
- Supplies monitoring and reproducibility artifacts
- Supports remediation and retraining loops
## Related Concepts
- [[Model Audit]]
- [[Discrimination Metrics]]