---
title: "MLOps"
type: concept
tags: [ML, DevOps, machine-learning, operations, CI/CD]
sources:
- public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin
last_updated: 2026-05-12
---
## Definition
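MLOps (Machine Learning Operations) combines machine learning with operations, bringing together people, technology, and processes to deliver collaborative ML solutions. MLOps requires a diverse team and a culture that encourages collaboration, and it extends the principles and practices of DevOps.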
## Key Components
### Three Pipelines
#### 1. Data Pipeline
- Data collection
- Data integration
- Data preparation
- **Tools**: Amazon S3, Amazon Redshift
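The three data pipeline stages can be sketched in plain Python. This is a minimal illustration that does not use Amazon S3 or Amazon Redshift; the `Record` type, the sample rows, and the function names `collect`, `integrate`, and `prepare` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    user_id: int
    age: Optional[float]
    clicks: int


def collect():
    """Data collection: stand-ins for two raw sources (hypothetical data)."""
    profiles = [
        {"user_id": 1, "age": 34.0},
        {"user_id": 2, "age": None},
        {"user_id": 3, "age": 51.0},  # no matching events
    ]
    events = [
        {"user_id": 1, "clicks": 12},
        {"user_id": 2, "clicks": 3},
        {"user_id": 4, "clicks": 7},  # no matching profile
    ]
    return profiles, events


def integrate(profiles, events):
    """Data integration: inner join of the two sources on user_id."""
    clicks_by_user = {e["user_id"]: e["clicks"] for e in events}
    return [
        Record(p["user_id"], p["age"], clicks_by_user[p["user_id"]])
        for p in profiles
        if p["user_id"] in clicks_by_user
    ]


def prepare(records):
    """Data preparation: fill missing ages with the mean of the known ages."""
    known = [r.age for r in records if r.age is not None]
    mean_age = sum(known) / len(known)
    return [
        Record(r.user_id, r.age if r.age is not None else mean_age, r.clicks)
        for r in records
    ]


def run_data_pipeline():
    profiles, events = collect()
    return prepare(integrate(profiles, events))
```

Keeping each stage as a separate function makes it possible to test and rerun stages independently, which is the main reason to treat the data flow as a pipeline.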
#### 2. Training Pipeline
- Feature engineering
- Model training
- Hyperparameter tuning
- **Tools**: Amazon SageMaker
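As a rough sketch of how the three training stages fit together, the example below standardizes one feature, fits a one-dimensional ridge regression in closed form, and picks the regularization strength by grid search on a validation set. It does not use Amazon SageMaker; the model choice, the function names, and the parameter `lam` are assumptions made for illustration.

```python
from statistics import mean, pstdev


def make_scaler(xs):
    """Feature engineering: standardize using statistics from the training set."""
    mu, sigma = mean(xs), pstdev(xs)
    return lambda x: (x - mu) / sigma


def train_ridge(xs, ys, lam):
    """Model training: closed-form 1-D ridge regression.

    Assumes xs are standardized (zero mean), so the unpenalized
    intercept is simply the mean of ys.
    """
    y_bar = mean(ys)
    num = sum(x * (y - y_bar) for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den, y_bar


def mse(model, xs, ys):
    """Mean squared error of a (weight, intercept) model."""
    w, b = model
    return mean([(w * x + b - y) ** 2 for x, y in zip(xs, ys)])


def tune(train, valid, grid):
    """Hyperparameter tuning: grid search over lam, scored on validation data."""
    scale = make_scaler([x for x, _ in train])
    tx, ty = [scale(x) for x, _ in train], [y for _, y in train]
    vx, vy = [scale(x) for x, _ in valid], [y for _, y in valid]
    scored = [(mse(train_ridge(tx, ty, lam), vx, vy), lam) for lam in grid]
    best_mse, best_lam = min(scored)
    return best_lam, best_mse
```

The scaler is fitted on the training split only and then reused on the validation split, so that no validation statistics leak into training.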
#### 3. Inference Pipeline
- Model deployment
- Model monitoring
- **Tools**: Amazon SageMaker Real-time Endpoints
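To illustrate how deployment and monitoring relate, the sketch below wraps a model in an in-process object that serves predictions and records basic metrics. It is not the Amazon SageMaker API; the `MonitoredEndpoint` class and its fields are hypothetical stand-ins for a real endpoint and its metrics backend.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MonitoredEndpoint:
    """In-process stand-in for a deployed model endpoint (hypothetical)."""
    model: Callable[[float], float]
    latencies_s: list = field(default_factory=list)
    predictions: list = field(default_factory=list)
    errors: int = 0

    def predict(self, x: float) -> float:
        start = time.perf_counter()
        try:
            y = self.model(x)
        except Exception:
            self.errors += 1
            raise
        finally:
            # Latency is recorded for failed requests as well.
            self.latencies_s.append(time.perf_counter() - start)
        self.predictions.append(y)
        return y

    def report(self) -> dict:
        n = len(self.latencies_s)
        return {
            "requests": n,
            "errors": self.errors,
            "error_rate": self.errors / n if n else 0.0,
            "mean_prediction": (
                sum(self.predictions) / len(self.predictions)
                if self.predictions else None
            ),
            "max_latency_s": max(self.latencies_s, default=0.0),
        }
```

A shift in `mean_prediction` or a rising `error_rate` over time is the kind of signal that model monitoring is meant to surface.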
## Key Challenges
- Data provenance
- Model management
- Deployment workflows
- Continuous integration / continuous deployment (CI/CD)
- Monitoring and observability
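One way to approach data provenance and model management together is to record, for every trained model, a fingerprint of the exact data, parameters, and code version that produced it. The sketch below uses only the Python standard library; the record layout and the names `fingerprint` and `lineage_record` are assumptions, not a specific registry's format.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def lineage_record(dataset, params, code_version):
    """Tie a model version to the data, parameters, and code behind it."""
    record = {
        "data_sha256": fingerprint(dataset),
        "params": params,
        "code_version": code_version,
    }
    # Hypothetical convention: the model ID is derived from its lineage,
    # so identical inputs always map to the same ID.
    record["model_id"] = fingerprint(record)[:12]
    return record
```

Because the ID is derived from the inputs, any change to the training data or hyperparameters yields a new model ID, which makes silent retraining on different data easy to spot.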
## Relationship to DevOps
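MLOps builds on DevOps practices and adds challenges specific to ML:
- Model versioning
- Experiment tracking
- A/B testing
- Model performance monitoring
- Data drift detection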
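Data drift detection has no direct DevOps counterpart, so a small example may help. The sketch below computes the Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. PSI is one of several possible drift metrics and is chosen here as an assumption; the bin count and the `eps` floor are illustrative defaults.

```python
import math


def psi(reference, current, bins=4, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the range of the reference sample.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width) if width else 0
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps to avoid log(0) for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A PSI of zero means the binned distributions match. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, though a suitable threshold depends on the feature and the use case.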
## Related Concepts
- [[DevOps]]
- [[Amazon-SageMaker]]
- [[Foundation-Models]]
- [[Responsible-AI]]
## Related Entities
- [[AWS]]
## Sources
- [[public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin]]