---
title: "MLOps"
type: concept
tags: [ML, DevOps, machine-learning, operations, CI/CD]
sources:
- public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin
last_updated: 2026-05-12
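---

## Definition
MLOps (Machine Learning Operations) combines machine learning with operations. It spans people, technology, and process in order to deliver ML solutions collaboratively. MLOps depends on a cross-functional team and a culture that encourages collaboration, and it extends the principles and practices of DevOps.

## Key Components

### Three Pipelines

#### 1. Data Pipeline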
- Data collection
- Data integration
- Data preparation
- **Tools**: Amazon S3, Amazon Redshift
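The three stages above can be sketched in plain Python. This is a minimal, hypothetical illustration: in-memory records stand in for sources such as Amazon S3 or Amazon Redshift, and the function names (`collect`, `integrate`, `prepare`) and sample fields are assumptions made for the example, not part of any AWS API.

```python
from statistics import mean


def collect():
    """Data collection: return raw records from two hypothetical sources."""
    clicks = [
        {"user_id": 1, "clicks": 12},
        {"user_id": 2, "clicks": None},  # missing value, handled in prepare()
        {"user_id": 3, "clicks": 7},
    ]
    profiles = [
        {"user_id": 1, "plan": "pro"},
        {"user_id": 2, "plan": "free"},
        {"user_id": 3, "plan": "free"},
    ]
    return clicks, profiles


def integrate(clicks, profiles):
    """Data integration: join the two sources on user_id."""
    plan_by_user = {p["user_id"]: p["plan"] for p in profiles}
    return [{**c, "plan": plan_by_user.get(c["user_id"])} for c in clicks]


def prepare(rows):
    """Data preparation: fill missing click counts with the observed mean."""
    observed = [r["clicks"] for r in rows if r["clicks"] is not None]
    fill = mean(observed)
    return [
        {**r, "clicks": r["clicks"] if r["clicks"] is not None else fill}
        for r in rows
    ]


dataset = prepare(integrate(*collect()))
```

Keeping each stage as a separate function makes it possible to test, rerun, or replace one stage without touching the others, which is the main reason to treat the data path as a pipeline.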

#### 2. Training Pipeline
- Feature engineering
- Model training
- Hyperparameter tuning
- **Tools**: Amazon SageMaker
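As a rough sketch of how these three steps fit together, the example below standardizes one feature, fits a one-dimensional ridge regression in closed form, and picks the regularization strength by validation error. The model, the toy data, and every name here (`make_scaler`, `train`, `mse`, `lam`) are assumptions chosen for illustration. The code does not call Amazon SageMaker, which would typically run such steps as managed jobs.

```python
from statistics import mean, pstdev


def make_scaler(xs):
    """Feature engineering: standardize using training-set statistics only."""
    mu, sigma = mean(xs), pstdev(xs)
    return lambda x: (x - mu) / sigma


def train(zs, ys, lam):
    """Model training: 1-D ridge regression, intercept left unpenalized."""
    b = mean(ys)
    num = sum(z * (y - b) for z, y in zip(zs, ys))
    den = sum(z * z for z in zs) + lam
    return num / den, b


def mse(model, zs, ys):
    """Mean squared error of a (weight, intercept) model on a dataset."""
    w, b = model
    return mean([(w * z + b - y) ** 2 for z, y in zip(zs, ys)])


# Toy data generated from y = 2x + 1 (hypothetical).
train_x, train_y = [1, 2, 3, 4, 5, 6], [3, 5, 7, 9, 11, 13]
val_x, val_y = [7, 8], [15, 17]

scale = make_scaler(train_x)
train_z = [scale(x) for x in train_x]
val_z = [scale(x) for x in val_x]

# Hyperparameter tuning: grid search over lam, scored on the validation set.
results = {
    lam: mse(train(train_z, train_y, lam), val_z, val_y)
    for lam in (0.0, 1.0, 10.0)
}
best_lam = min(results, key=results.get)
best_model = train(train_z, train_y, best_lam)
```

The scaler is fitted on the training split and reused on the validation split, so no validation statistics leak into feature engineering.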

#### 3. Inference Pipeline
- Model deployment
- Model monitoring
- **Tools**: Amazon SageMaker Real-time Endpoints
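Deployment and monitoring are closely linked: once a model serves requests, each call is also a source of metrics. The sketch below shows the idea with a hypothetical `MonitoredModel` wrapper around a toy predict function. It is a local stand-in for a real-time endpoint, not the SageMaker API, and the metric names are assumptions.

```python
import time
from statistics import mean


class MonitoredModel:
    """Wraps a predict function and records simple serving metrics."""

    def __init__(self, predict_fn, version):
        self.predict_fn = predict_fn
        self.version = version
        self.latencies_ms = []
        self.predictions = []

    def predict(self, features):
        start = time.perf_counter()
        result = self.predict_fn(features)
        self.latencies_ms.append((time.perf_counter() - start) * 1000)
        self.predictions.append(result)
        return result

    def metrics(self):
        return {
            "version": self.version,
            "requests": len(self.predictions),
            "mean_prediction": mean(self.predictions) if self.predictions else None,
            "max_latency_ms": max(self.latencies_ms, default=None),
        }


# "Deploy" a toy model: a fixed linear function standing in for a trained artifact.
endpoint = MonitoredModel(lambda x: 2 * x + 1, version="v1")
outputs = [endpoint.predict(x) for x in (1, 2, 3)]
report = endpoint.metrics()
```

Tagging every metric with the model version makes it possible to compare releases after a rollout.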

## Key Challenges
- Data provenance
- Model management
- Deployment workflows
- Continuous integration / continuous deployment (CI/CD)
- Monitoring and observability
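Data provenance and model management both come down to recording exactly which data and settings produced a given model. One possible approach, sketched here with only the standard library, is to fingerprint the dataset with a content hash and store it in a lineage record. The record layout and the names `fingerprint` and `lineage_record` are hypothetical.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a SHA-256 digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def lineage_record(dataset, params, parent=None):
    """Describe a model run by the exact data and settings that produced it."""
    return {
        "data_sha256": fingerprint(dataset),
        "params": params,
        "parent_run": parent,
    }


data_v1 = [{"x": 1, "y": 3}, {"x": 2, "y": 5}]
data_v2 = data_v1 + [{"x": 3, "y": 7}]

run_a = lineage_record(data_v1, {"lam": 0.0})
# A later run points back to the run it was derived from.
run_b = lineage_record(data_v2, {"lam": 0.0}, parent=fingerprint(run_a))
```

Because the hash is computed over sorted keys, two datasets with the same content but different key order get the same fingerprint, while any added or changed row produces a different one.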

## Relationship to DevOps
MLOps builds on DevOps practices and adds concerns that are specific to ML:
- Model versioning
- Experiment tracking
- A/B testing
- Model performance monitoring
- Data drift detection
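Of the items above, data drift detection is the easiest to show in a few lines. The sketch below uses the Population Stability Index (PSI), one common drift metric; the source does not name a specific method, so this choice, the equal-width binning, and the smoothing constant `eps` are all assumptions.

```python
import math


def psi(expected, actual, bins=4, eps=1e-6):
    """Population Stability Index between a baseline and a live sample.

    Assumes the baseline values are not all identical (bin width > 0).
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so live values outside the baseline range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [1, 2, 3, 4, 5, 6, 7, 8]      # feature values seen at training time
same = [1, 2, 3, 4, 5, 6, 7, 8]          # live sample with the same distribution
shifted = [6, 7, 7, 8, 8, 9, 9, 10]      # live sample that has moved upward

psi_same = psi(baseline, same)
psi_shifted = psi(baseline, shifted)
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the threshold is a convention and should be tuned per feature.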

## Related Concepts
- [[DevOps]]
- [[Amazon-SageMaker]]
- [[Foundation-Models]]
- [[Responsible-AI]]

## Related Entities
- [[AWS]]

## Sources
- [[public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin]]