Auto-sync: 2026-04-26 16:02
52
wiki/concepts/Cloud-Computing.md
Normal file
@@ -0,0 +1,52 @@
---
title: "Cloud Computing"
type: concept
tags: [cloud, infrastructure, iaas, paas, saas]
sources: [the-myths-and-misconceptions-about-cloud-computing-linkedin, what-i-know-about-cloud-service-delivery-1, cloud-maturity-model-a-detailed-guide-for-cloud-adoption]
last_updated: 2025-03-02
---

## Definition

Cloud computing is the delivery of computing services (servers, storage, databases, networking, software, analytics, and intelligence) over the Internet ("the cloud") to offer faster innovation, flexible resources, and economies of scale.

## Service Models

- **IaaS (Infrastructure as a Service)**: Provides virtualized computing resources over the internet (e.g., AWS EC2, Azure VMs)
- **PaaS (Platform as a Service)**: Provides a platform for developing, running, and managing applications without dealing with infrastructure (e.g., AWS Elastic Beanstalk, Azure App Service)
- **SaaS (Software as a Service)**: Provides software applications over the internet on a subscription basis (e.g., Microsoft 365, Salesforce)

## Key Characteristics

- **On-demand self-service**: Provision resources as needed without requiring human interaction with the provider
- **Broad network access**: Access services over the network via standard mechanisms
- **Resource pooling**: Multiple customers share infrastructure with logical separation
- **Rapid elasticity**: Scale resources up or down dynamically
- **Measured service**: Resource usage is metered and reported, which enables pay-as-you-go pricing
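
Rapid elasticity is often implemented as a target-tracking rule: the replica count is scaled in proportion to how far a load metric is from its target. The sketch below is a minimal illustration of that idea, not any provider's API. The function name, the metric, and the `min_replicas`/`max_replicas` bounds are assumptions; the proportional formula has the same shape as the one documented for the Kubernetes Horizontal Pod Autoscaler.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Scale the replica count in proportion to load, within fixed bounds."""
    if current_replicas < 1 or target_metric <= 0:
        raise ValueError("need at least one replica and a positive target")
    # Proportional rule: more load per replica than the target means scale out.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp so a metric spike cannot scale without limit (hypothetical bounds).
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90.0, 60.0))   # load above target: scale out
print(desired_replicas(4, 15.0, 60.0))   # load below target: scale in
```

Clamping the result keeps the rule predictable for cost purposes, which ties elasticity back to measured, pay-as-you-go billing.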

## Common Misconceptions

According to [[the-myths-and-misconceptions-about-cloud-computing-linkedin]], the following misconceptions are prevalent:

1. **Cloud is not secure** → Reality: Major providers invest heavily in security (encryption, MFA, ISO 27001, HIPAA, GDPR compliance)
2. **Cloud is just "someone else's computer"** → Reality: Cloud is a sophisticated network of data centers with redundancy and high availability
3. **Cloud is too expensive** → Reality: Pay-as-you-go model with proper management can be cost-effective
4. **You lose control of your data** → Reality: Cloud provides robust data governance and control tools
5. **Cloud is only for large enterprises** → Reality: SMBs can leverage enterprise-grade technology without large upfront investments
6. **Migration is too complex** → Reality: Phased migration and hybrid cloud solutions mitigate risks
7. **Cloud performance is unreliable** → Reality: SLAs often guarantee 99.99%+ uptime

## Related Concepts

- [[Hybrid-Cloud]]: Combining on-premises infrastructure with public cloud
- [[Multi-Cloud]]: Using multiple cloud providers simultaneously
- [[Cloud-Migration]]: The process of moving workloads to the cloud
- [[Cloud-Security]]: Security practices in cloud environments
- [[Pay-as-you-go]]: Cost model based on actual usage
- [[High-Availability]]: Design principle for minimizing downtime
- [[Serverless-Computing]]: Event-driven computing without server management

## Aliases

- Cloud
- 云计算
51
wiki/concepts/DevOps-Maturity-Model.md
Normal file
@@ -0,0 +1,51 @@
---
title: "DevOps Maturity Model"
type: concept
tags: [DevOps, Maturity Assessment, CI/CD]
sources: [devops-maturity-model-from-traditional-it-to-advanced-devops]
last_updated: 2026-04-26
---

## Definition

The DevOps Maturity Model is a structured framework for assessing an organization's current level of DevOps practice, identifying areas for improvement, and planning the path to higher maturity levels.

The model covers five core assessment dimensions: **culture and strategy**, **automation**, **structure and processes**, **collaboration and sharing**, and **technology**. It measures an organization's DevOps capability across five progressive phases.
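
## Five Maturity Phases

| Phase | Name | Key characteristics |
|-------|------|---------------------|
| Phase 1 | Initial / ad hoc | Waterfall development, siloed teams, manual processes, reactive monitoring |
| Phase 2 | Localized pilots | Small-scale DevOps practices, version control introduced, unit and integration testing |
| Phase 3 | Automated and defined | Infrastructure automation, agile cross-team collaboration, integrated security scanning |
| Phase 4 | Highly optimized | CI/CD pipelines, immutable infrastructure, third-party dependency management |
| Phase 5 | Fully mature | Continuous deployment, zero manual intervention, data-driven decision-making |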
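
## Key Metrics

- **Deployment Frequency**: How often code is deployed within a set period
- **Lead Time for Changes**: Time from code commit to deployment
- **Change Failure Rate**: Proportion of deployments that cause a failure or require a rollback
- **Mean Time to Recovery (MTTR)**: Time taken to restore normal operation after a failure
- **Error Budget**: The permissible rate of errors and failures in production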
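
As a rough illustration, the first four metrics above can be derived from a log of deployments. The sketch below assumes a hypothetical `Deployment` record with commit, deploy, and restore timestamps; the field names and the sample data are invented for the example, and MTTR is approximated by treating an outage as starting at deploy time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # only set for failed deployments


def summarize(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute four delivery metrics over one reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_hours = [(d.deployed_at - d.committed_at).total_seconds() / 3600
                  for d in deployments]
    restore_minutes = [(d.restored_at - d.deployed_at).total_seconds() / 60
                       for d in deployments
                       if d.failed and d.restored_at is not None]
    failures = sum(1 for d in deployments if d.failed)
    return {
        "deployments_per_day": len(deployments) / period_days,
        "mean_lead_time_hours": mean(lead_hours),
        "change_failure_rate": failures / len(deployments),
        # Assumption: the outage starts at deploy time and ends at restored_at.
        "mttr_minutes": mean(restore_minutes) if restore_minutes else 0.0,
    }


start = datetime(2026, 4, 1, 9, 0)
history = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=4),
               failed=True,
               restored_at=start + timedelta(days=1, hours=4, minutes=30)),
    Deployment(start + timedelta(days=2), start + timedelta(days=2, hours=6)),
    Deployment(start + timedelta(days=3), start + timedelta(days=3, hours=4)),
]
report = summarize(history, period_days=4)
print(report)
```

In practice these timestamps would come from the CI/CD system and the incident tracker rather than a hand-built list.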
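
## Core Assessment Dimensions

1. **Culture and strategy**: Team collaboration, transparency, customer-centric product thinking
2. **Automation**: CI/CD pipelines, infrastructure as code, test automation
3. **Structure and processes**: Standardized processes, small-batch work, elimination of waste
4. **Collaboration and sharing**: Development and operations working together, knowledge sharing, shared goals
5. **Technology**: Toolchain integration, monitoring and alerting, containerization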

## Common Barriers to Progress

- Poor communication between teams
- Lack of clear goals and strategy
- Resistance to change
- Insufficient investment
- Weak governance
- Rigid processes

## Sources
- [[devops-maturity-model-from-traditional-it-to-advanced-devops]]
@@ -1,79 +1,63 @@
# Error Budget

## Definition
Error Budget is the permissible rate of errors and failures that a system can tolerate within a defined period without violating its reliability targets. It represents the "budget" of allowed failures before the reliability target (SLO) is breached.

Error Budget = 100% - Reliability Target (SLO)

Example: If your target is 99.9% uptime, your error budget is 0.1% of the period, or about 43 minutes of downtime in a 30-day month.

## Role in DevOps Maturity

The DevOps Maturity Model explicitly lists Error Budget as one of the key metrics for measuring DevOps maturity.

### Error Budget Across Maturity Levels
| Maturity | Error Budget Usage |
|----------|-------------------|
| Phase 1 | No error budget concept: teams react to failures as they occur |
| Phase 2 | Awareness growing: teams begin to understand the cost of failures |
| Phase 3 | Error budgets not explicitly managed: standardization helps, but nothing is measured |
| Phase 4 | Error budgets tracked: continuous monitoring enables measurement |
| Phase 5 | Error budgets actively used to drive deployment decisions, balancing innovation against reliability |

## How Error Budgets Work

### The Concept
If your SLO target is:
- **99.9% uptime**: 8.76 hours of downtime allowed per year (43.8 minutes per month)
- **99.99% uptime**: 52.6 minutes of downtime allowed per year (4.38 minutes per month)

The error budget is this allowance of bad events. Once it is depleted, deployment velocity must slow down until reliability improves.
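
The downtime figures above follow directly from the formula (100% minus the target) multiplied by the length of the period. A small sketch, assuming a 365-day year and a month taken as one twelfth of it; the function name is illustrative:

```python
def allowed_downtime_minutes(slo_percent: float, period_hours: float) -> float:
    """Minutes of downtime a target such as 99.9 permits over period_hours."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    return (100 - slo_percent) / 100 * period_hours * 60


HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year

for slo in (99.9, 99.99):
    yearly = allowed_downtime_minutes(slo, HOURS_PER_YEAR)
    monthly = yearly / 12  # average month, matching the per-month figures above
    print(f"{slo}%: {yearly:.1f} min/year, {monthly:.2f} min/month")
```

Using a 30-day month instead of a twelfth of a year gives 43.2 minutes for 99.9%, which is why some sources quote 43 rather than 43.8.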

### Error Budget Policy Example
- If error budget is >50% remaining: Deploy freely (encourage experimentation)
- If error budget is 25-50%: Proceed with caution, require additional testing
- If error budget is <25%: Pause non-critical deployments until budget recovers
- If error budget is exhausted: Stop all deployments, focus on reliability
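
A policy like this is easy to encode as a gate in a deployment pipeline. The sketch below is one possible reading of the thresholds above. The function name and return strings are hypothetical, and treating exactly 25% and exactly 50% as "caution" is an assumption, since the policy text does not say which band the boundaries belong to.

```python
def deployment_policy(budget_remaining: float) -> str:
    """Map the fraction of error budget left (1.0 = untouched) to an action."""
    if budget_remaining > 1.0:
        raise ValueError("budget_remaining cannot exceed 1.0")
    if budget_remaining <= 0.0:
        # Zero or overspent budget is treated as exhausted.
        return "stop all deployments, focus on reliability"
    if budget_remaining < 0.25:
        return "pause non-critical deployments"
    if budget_remaining <= 0.50:
        return "proceed with caution, require additional testing"
    return "deploy freely"


for remaining in (0.8, 0.4, 0.1, 0.0):
    print(f"{remaining:.0%} remaining -> {deployment_policy(remaining)}")
```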

## Error Budget and SLOs

| Concept | Role |
|---------|------|
| **SLO (Service Level Objective)** | The target reliability level (e.g., 99.9%) |
| **Error Budget** | The allowable failure budget derived from the SLO |
| **SLI (Service Level Indicator)** | The actual reliability measured |

Error Budgets operationalize SLOs by creating concrete incentives for balancing innovation and reliability.
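
To make the relationship concrete, the sketch below computes an SLI from request counts and compares it with an SLO to see how much of the error budget has been spent. It assumes a request-based SLI (good requests divided by total requests); the function name and the sample numbers are invented for illustration.

```python
def error_budget_report(good_events: int, total_events: int, slo: float) -> dict:
    """Compare a measured SLI with an SLO given as a fraction, e.g. 0.999."""
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("event counts are inconsistent")
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    sli = good_events / total_events              # measured reliability
    allowed_bad = (1 - slo) * total_events        # the error budget, in events
    actual_bad = total_events - good_events
    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad,
        "budget_consumed": actual_bad / allowed_bad,  # 1.0 means fully spent
        "slo_met": sli >= slo,
    }


# Hypothetical month: 1,000,000 requests, 400 of them failed, SLO of 99.9%.
report = error_budget_report(good_events=999_600, total_events=1_000_000, slo=0.999)
print(report)
```

With these numbers roughly 40% of the budget is spent, so the SLO is still met with room left for releases.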

## Business Impact

### Benefits of Error Budget Thinking
1. **Incentivizes reliability**: Teams are motivated to maintain system health
2. **Enables calculated risk-taking**: Clear budget allows confident experimentation
3. **Prevents over-engineering**: Don't build for 99.999% when 99.9% is the target
4. **Aligns business and engineering**: Both understand the reliability-investment trade-off

### Risks Without Error Budgets
- Over-investment in reliability beyond business needs
- Under-investment leading to frequent customer-facing failures
- Conflicting priorities between feature delivery and reliability
- No clear signal for when to slow down

## Error Budget vs Change Failure Rate

| Metric | Measures |
|--------|----------|
| **Error Budget** | Total allowable failures over a time period |
| **Change Failure Rate** | Percentage of deployments causing failures |

These metrics work together: a low Change Failure Rate (CFR) preserves the error budget, and a depleted error budget signals a need to improve CFR.

## Sources
- [[sources/devops-maturity-model-from-traditional-it-to-advanced-devops.md]]

## Related Concepts
- [[concepts/SLO]]
- [[concepts/Change-Failure-Rate]]
- [[concepts/DORA-Metrics]]
- [[concepts/High-Availability]]
- [[concepts/DevOps-Maturity]]
---
title: "Error Budget"
type: concept
tags: [SRE, Reliability, DevOps Metrics]
sources: [devops-maturity-model-from-traditional-it-to-advanced-devops]
last_updated: 2026-04-26
---

## Definition

An error budget is the number or proportion of errors and failures that a system is allowed to incur within a given period. It is a risk-management tool for balancing reliability targets against the pace of innovation.

## Core Concept

Error budgets originate from Site Reliability Engineering (SRE). The core idea is:

> If your service's reliability target is 99.9%, you have a 0.1% "error budget" to spend on experiments and releases.

## Calculation

```
Error Budget = (1 - Reliability SLO) × Time Period

Example:
- Monthly SLO = 99.9%
- Monthly error budget = 0.1% × 30 days × 24 hours = 0.72 hours (about 43 minutes)
```

## Place in the DevOps Maturity Model

Error budget is an important metric in the DevOps maturity measurement framework:

> "Error Budget — The permissible rate of errors and failures in production."

How error budgets are used varies with the DevOps maturity phase:

| Maturity phase | How error budgets are used |
|----------------|----------------------------|
| Phase 1-2 | No formal error budget concept |
| Phase 3 | SLOs start to be defined, but error budgets are not fully used |
| Phase 4 | Explicit error budget policy, used to balance innovation and reliability |
| Phase 5 | Data-driven decisions; teams spend the error budget on experiments autonomously |
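
## Relationship to Other Concepts

- [[MTTR]]: Error budget and MTTR together characterize a system's reliability profile
- [[Change Failure Rate]]: A high change failure rate consumes the error budget quickly
- [[Deployment Frequency]]: A high deployment frequency needs error budget management to keep reliability targets
- [[DevOps Maturity Model]]: Error budget is one of the key metrics for measuring organizational maturity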
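
## Error Budget Policy Example

```yaml
SLO: 99.9% (43 minutes of error budget per month)
Policy:
- Budget healthy (>50% remaining): release and experiment freely
- Budget moderate (25-50% remaining): release with caution
- Budget low (<25% remaining): freeze releases and focus on reliability
- Budget exhausted: stop all non-critical changes
```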

## Sources
- [[devops-maturity-model-from-traditional-it-to-advanced-devops]]
@@ -1,75 +1,72 @@
# Immutable Infrastructure

## Definition
Immutable Infrastructure is an approach where components are never modified after deployment. Instead of updating existing components, new versions are created and replaced entirely.

## Concept
Immutable infrastructure is a deployment strategy in which servers and infrastructure components are not modified once deployed. Any change requires building a new version and replacing the whole component.

## Core Principles

### 1. Never Modify Running Systems
- Do not change configuration directly in production
- Make every change through redeployment
- Use versioned configuration and templates

### 2. Replace, Don't Modify
- New version = new environment
- Old versions are destroyed outright
- Consistency is guaranteed

### 3. Infrastructure as Code
- Define all infrastructure as code
- Keep all configuration under version control
- Use a repeatable deployment process

## Benefits for DevSecOps

### Security Benefits
- **Smaller attack surface**: No interactive access to production
- **Guaranteed consistency**: Every environment is identical
- **Fast rollback**: Switch back quickly when a problem is found
- **Simpler auditing**: The code is the record

### Operational Benefits
- Consistent environments
- Predictable deployments
- Simpler troubleshooting
- Easier scaling

## Implementation Patterns

### Container-Based Approach
```
Container image = application + dependencies + configuration
Each change → new image version → rolling update
```

### Cloud Infrastructure
- AWS: use AMIs with Auto Scaling
- Kubernetes: recreate Pods
- Terraform: manage immutable configuration

## Best Practices

1. **Use tags to manage versions**
2. **Automate the build process**
3. **Keep previous image versions**
4. **Use blue-green deployments or rolling updates**
5. **Monitor immutable resources for changes**

## Related Concepts
- [[DevSecOps]]: Immutable infrastructure is an important part of a secure architecture
- [[Policy-as-Code]]: Expressing policy as code
- [[Container-Lifecycle-Hardening]]: Container security hardening
- [[Blue-Green-Deployment]]: Blue-green deployment pattern
- [[Infrastructure-as-Code]]: Infrastructure as code

## Tools
- Packer: image build tool
- Terraform: IaC tool
- Kubernetes: container orchestration
- Docker: containerization

## Sources
- [[what-is-devsecops-best-practices-benefits-and-tools]]
---
title: "Immutable Infrastructure"
type: concept
tags: [Infrastructure as Code, DevOps, Cloud Native]
sources: [devops-maturity-model-from-traditional-it-to-advanced-devops]
last_updated: 2026-04-26
---

## Definition

Immutable infrastructure is an infrastructure management paradigm in which servers are never modified in place once deployed. When configuration must change or a problem must be fixed, the whole server is replaced with a new version instead of being patched or updated.

## Core Principles

1. **Do not modify deployed servers**: Every change produces a new server image
2. **Deploy complete images**: Deployments use a prebuilt image in full
3. **Automate replacement**: An automated pipeline handles the server lifecycle
4. **Keep environments consistent**: All environments use the same base image
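
The "replace, do not modify" idea can be modelled in a few lines. The sketch below is a toy model, not a deployment tool: `Server`, `roll_out`, and the image tags are hypothetical, and a frozen dataclass stands in for a server that rejects in-place changes.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Server:
    """A deployed server whose image is fixed at creation time."""
    name: str
    image: str


def roll_out(fleet: tuple[Server, ...], new_image: str) -> tuple[Server, ...]:
    """Build a replacement fleet from new_image; the old fleet is untouched."""
    return tuple(replace(server, image=new_image) for server in fleet)


old_fleet = (Server("web-1", "myapp:v1.0"), Server("web-2", "myapp:v1.0"))
new_fleet = roll_out(old_fleet, "myapp:v2.0")

try:
    old_fleet[0].image = "myapp:v2.0"  # patching in place is not allowed
except FrozenInstanceError:
    print("in-place modification rejected")

active = new_fleet   # cut traffic over to the replacement fleet
active = old_fleet   # rollback is just switching back to the old fleet
print([server.image for server in active])
```

Because the old fleet still exists unchanged, rollback needs no reverse patching, which is the main operational argument for the approach.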

## Place in the DevOps Maturity Model

Immutable infrastructure is a key characteristic of **Phase 4 (highly optimized)**:

> "Immutable infrastructure replaces old servers rather than updating them."

In this phase, organizations manage infrastructure and code updates through pipelines and no longer rely on manual server changes.
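
## Immutable vs Mutable Infrastructure

| Dimension | Immutable infrastructure | Mutable infrastructure |
|-----------|--------------------------|------------------------|
| Update method | Replace the entire server | Patch the existing server in place |
| Consistency | All environments highly consistent | Environments may differ |
| Rollback difficulty | Simple (switch back to the old image) | Hard (requires reverse patching) |
| Debugging complexity | Low (state is fixed by the snapshot) | High (changes accumulate) |
| Deployment speed | Fast (prebuilt image) | Slow (incremental updates) |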
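
## Implementation

### Containers (recommended)
```dockerfile
# Each build produces a new image.
# Pin the base image tag (the tag here is hypothetical) instead of "latest"
# so that rebuilding the same commit yields the same image.
FROM base-image:1.0
WORKDIR /app
COPY . .
RUN ./build.sh
# At deploy time, pull the new image; do not modify the running container.
```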

### Virtual Machine Images
```bash
# Build the image with Packer
packer build template.json
# Terraform replaces the old instances with ones built from the new AMI
terraform apply
```

### Cloud Infrastructure
```yaml
# Kubernetes: treat Pods as immutable and roll out a new image tag
spec:
  containers:
    - name: myapp
      image: myapp:v2.0  # replace the image rather than modifying the container
```

## Relationship to Other Concepts

- [[Infrastructure as Code]]: Immutable infrastructure usually relies on IaC tools (Terraform, CloudFormation)
- [[CI/CD Pipeline]]: Immutable infrastructure is built and deployed automatically through CI/CD pipelines
- [[DevOps Maturity Model]]: A core characteristic of Phase 4 (highly optimized)
- [[Container-Lifecycle-Hardening]]: Containers naturally fit the immutable paradigm, and combining the two improves security and consistency

## Sources
- [[devops-maturity-model-from-traditional-it-to-advanced-devops]]
49
wiki/concepts/MVP.md
Normal file
@@ -0,0 +1,49 @@
---
title: "MVP"
type: concept
tags: [Product Development, Agile, Lean Startup]
sources: [devops-maturity-model-from-traditional-it-to-advanced-devops]
last_updated: 2026-04-26
---

## Definition

An MVP (Minimum Viable Product) is a version of a product with the smallest possible feature set: only the core functionality needed to satisfy early users and collect validating feedback.
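
## Core Characteristics

- **Minimal feature set**: Implement only what is required to solve the core problem
- **Fast validation**: Release early to get feedback from real users
- **Learning-oriented**: Prioritize market validation data over feature completeness
- **Iterative evolution**: Improve quickly based on feedback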

## Relationship to DevOps Maturity

In the DevOps Maturity Model, the MVP is a key practice of **Phase 4 (highly optimized)**:

> "Use of MVPs and management of tech debt to speed up releases."

In this phase, organizations have mature CI/CD pipelines and can:
1. Build and deploy MVPs quickly
2. Collect real feedback from production
3. Shorten the cycle from idea to validation
4. Reduce the risk of large feature releases
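
## MVP vs Full Product

| Dimension | MVP | Full product |
|-----------|-----|--------------|
| Feature scope | Minimal core features | Complete feature set |
| Goal | Validate hypotheses | Meet needs comprehensively |
| Release timing | As early as possible | After features are complete |
| Feedback source | Early adopters | Broad user base |
| Risk | Low investment, high learning | High investment, high risk |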

## Relationship to Other Concepts

- [[Agile]]: The MVP is a core agile practice that supports fast iteration
- [[Technical Debt]]: An MVP strategy must balance fast delivery against technical debt management
- [[DevOps Maturity Model]]: In Phase 4 (highly optimized), MVPs are used to speed up the release cycle

## Sources
- [[devops-maturity-model-from-traditional-it-to-advanced-devops]]