Update nexus wiki content
51
wiki/concepts/Autoscaling.md
Normal file
@@ -0,0 +1,51 @@
---
title: "Autoscaling"
type: concept
tags: [sre, cloud, scalability, reliability, kubernetes]
last_updated: 2026-04-20
---

# Autoscaling

Autoscaling is the mechanism by which a cloud-native system automatically adjusts resource capacity in response to load. It differs fundamentally from true elasticity.

## Definition
Autoscaling automatically adds or removes compute resources according to predefined rules (for example, CPU utilization or request queue length). It is a **passive, reactive** mechanism.
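
A minimal sketch of such a predefined rule follows. It uses the proportional formula documented for the Kubernetes Horizontal Pod Autoscaler; the function name `desired_replicas` and its parameters are illustrative assumptions, not part of any library. The rule acts only on a metric that has already been observed, which is why the mechanism is reactive.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Scale the replica count by the ratio of the observed metric to its target.

    Hypothetical helper for illustration; no caps or safety checks are applied here.
    """
    if current_replicas <= 0 or target_metric <= 0:
        raise ValueError("current_replicas and target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# CPU at 90% against a 60% target on 4 replicas asks for 6 replicas.
print(desired_replicas(4, 90.0, 60.0))
```

Because nothing bounds the result, a metric that stays far above target keeps multiplying the replica count on every evaluation.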

## Key Limitation
> "Autoscaling is reactive, not resilient. Without caps, metrics, or overrides, it can worsen failures." — David Iyanu Jonathan

Without the following safeguards, autoscaling can **make a failure worse**:
- **Caps**: prevent unbounded scale-out
- **Metrics**: ensure scaling decisions rest on reliable data
- **Overrides**: allow manual intervention
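
The three safeguards above can be sketched as a single guarded decision. Everything here is an assumption made for illustration: the `Guardrails` fields, the staleness threshold, and the choice to let a manual override win over the caps.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Guardrails:
    min_replicas: int                      # floor for scale-in
    max_replicas: int                      # cap for scale-out
    max_metric_age_s: float                # older samples are treated as unreliable
    manual_override: Optional[int] = None  # operator-pinned replica count


def guarded_replicas(current: int, metric: float, target: float,
                     metric_age_s: float, g: Guardrails) -> int:
    """Apply override, metric-freshness, and cap checks to a proportional rule."""
    if g.manual_override is not None:
        return g.manual_override           # override: a human decision takes precedence
    if metric_age_s > g.max_metric_age_s:
        return current                     # metrics: hold steady on stale data
    raw = math.ceil(current * metric / target)
    return max(g.min_replicas, min(g.max_replicas, raw))  # caps: clamp the result


g = Guardrails(min_replicas=2, max_replicas=10, max_metric_age_s=60.0)
print(guarded_replicas(8, 300.0, 60.0, metric_age_s=5.0, g=g))  # clamped to the cap
```

Holding the current count on stale data is one possible policy; another system might prefer to fall back to a fixed safe size instead.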

## Autoscaling vs. Elasticity

| Aspect | Autoscaling | [[Elasticity]] |
|--------|-------------|----------------|
| Nature | Passive, reactive | Proactive, forward-looking |
| Trigger | Metric thresholds | Policy and planning |
| Safeguards | May be missing | Required |
| Behavior during failure | May worsen the failure | Designed to keep the failure from spreading |

## Anti-Patterns
- **Autoscaling to Death**: the system scales out without bound during a load peak and exhausts resources
- **No Upper Limits**: missing caps lead to runaway cost
- **Metrics Blindness**: relying on a single metric while ignoring overall system health
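
A toy model can illustrate the first two anti-patterns. It assumes the scaling metric stays at a fixed multiple of its target no matter how many replicas exist, as might happen when the real bottleneck is a saturated dependency. The function `simulate_scaling` and its parameters are hypothetical.

```python
import math
from typing import List, Optional


def simulate_scaling(steps: int, start_replicas: int, load_ratio: float,
                     max_replicas: Optional[int] = None) -> List[int]:
    """Return the replica count after each evaluation of a proportional rule.

    load_ratio is observed metric / target metric, assumed constant because
    adding replicas does not relieve the (assumed) downstream bottleneck.
    """
    history = [start_replicas]
    replicas = start_replicas
    for _ in range(steps):
        replicas = math.ceil(replicas * load_ratio)
        if max_replicas is not None:
            replicas = min(replicas, max_replicas)
        history.append(replicas)
    return history


print(simulate_scaling(4, 2, 2.0))                   # [2, 4, 8, 16, 32]
print(simulate_scaling(4, 2, 2.0, max_replicas=10))  # [2, 4, 8, 10, 10]
```

Without a cap the count doubles on every evaluation; with a cap the growth stops, although the underlying problem still needs a fix outside the autoscaler.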
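
## Best Practices
1. Set sensible upper limits for scale-out and lower limits for scale-in
2. Configure multi-dimensional metrics (not just CPU)
3. Establish a manual override mechanism
4. Test scaling policies in non-production environments
5. Monitor the behavior of the autoscaler itself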
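
Practices 1, 2, and 5 can be sketched together. The first function combines several metrics by taking the largest proposal, which is how the Kubernetes Horizontal Pod Autoscaler is documented to handle multiple metrics, and then clamps the result. The second counts direction reversals in a replica history as one possible signal that the autoscaler is flapping. Both function names and the metric names are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple


def multi_metric_replicas(current: int, metrics: Dict[str, Tuple[float, float]],
                          min_replicas: int, max_replicas: int) -> int:
    """metrics maps a name to (observed, target); the largest proposal wins."""
    if not metrics:
        raise ValueError("at least one metric is required")
    proposals = [math.ceil(current * observed / target)
                 for observed, target in metrics.values()]
    return max(min_replicas, min(max_replicas, max(proposals)))


def count_flaps(history: List[int]) -> int:
    """Count reversals (scale-out followed by scale-in, or the opposite)."""
    flaps = 0
    last_direction = 0
    for prev, curr in zip(history, history[1:]):
        direction = (curr > prev) - (curr < prev)  # +1 up, -1 down, 0 unchanged
        if direction != 0:
            if last_direction != 0 and direction != last_direction:
                flaps += 1
            last_direction = direction
    return flaps


# CPU alone would ask for 2 replicas; queue depth asks for 8.
print(multi_metric_replicas(4, {"cpu": (30.0, 60.0), "queue_depth": (200.0, 100.0)}, 2, 20))
print(count_flaps([2, 4, 2, 4, 4, 3]))  # 3 reversals
```

A high reversal count over a short window suggests thresholds or cooldowns that need tuning, which is the kind of behavior worth alerting on.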

## Related Concepts
- [[Elasticity]]
- [[Scalability]]
- [[Cluster-Autoscaler]]
- [[Cost-Optimization]]

## Source
- SRE Weekly Issue #513 — [[sre-weekly-issue-513]]