---
title: "MLOps"
type: concept
tags: [ML, DevOps, machine-learning, operations, CI/CD]
sources:
  - public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin
last_updated: 2026-05-12
---

## Definition

MLOps (Machine Learning Operations) combines machine learning with operations, bringing together people, technology, and processes to deliver collaborative ML solutions. MLOps requires a diverse team and a culture that encourages collaboration, and it extends the principles and practices of DevOps.

## Key Components

### Three Pipelines

#### 1. Data Pipeline

- Data Collection
- Data Integration
- Data Preparation
- **Tools**: Amazon S3, Amazon Redshift

#### 2. Training Pipeline

- Feature Engineering
- Model Training
- Hyperparameter Tuning
- **Tools**: Amazon SageMaker

#### 3. Inference Pipeline

- Model Deployment
- Model Monitoring
- **Tools**: Amazon SageMaker Real-time Endpoints

## Key Challenges

- Data Provenance
- Model Management
- Deployment Workflows
- Continuous Integration / Continuous Deployment (CI/CD)
- Monitoring and Observability

## Relationship to DevOps

MLOps builds on DevOps practices and adds concerns that are specific to ML:

- Model versioning
- Experiment tracking
- A/B testing
- Model performance monitoring
- Data drift detection

## Related Concepts

- [[DevOps]]
- [[Amazon-SageMaker]]
- [[Foundation-Models]]
- [[Responsible-AI]]

## Related Entities

- [[AWS]]

## Sources

- [[public-cloud-learning-sessions-introduction-to-artificial-intelligence-ai-machin]]
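
## Example: Data Drift Check

The data drift detection item listed under Relationship to DevOps can be illustrated with a minimal sketch. It assumes drift on a single numeric feature is measured with the Population Stability Index (PSI), a technique the source does not name. The function name `population_stability_index`, the bin count, and the smoothing constant `eps` are all hypothetical choices made for illustration, not part of any AWS or SageMaker API.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Return the PSI between two samples of one numeric feature.

    Illustrative assumption: bins are equal-width and derived from the
    baseline (training-time) sample only.
    """
    if len(baseline) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 when the baseline feature is constant.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped to the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Replace empty bins with eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))
```

A monitoring job might call this with a training-time sample as `baseline` and a recent window of inference inputs as `current`, then raise an alert when the score passes a chosen threshold. A value around 0.2 is a commonly quoted rule of thumb for significant shift, but the right threshold depends on the feature and the model.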