Auto-sync: 2026-04-19 14:51
@@ -0,0 +1,57 @@
---
title: "CTP Topic 59 Achieving reliability with Amazon EKS"
type: source
tags: [AWS, EKS, Kubernetes, Reliability, CTP]
date: 2026-04-14
---

## Source File
- [[raw/Cloud & DevOps/Public-Cloud-Learning-Sessions/04_EKS/ctp-topic-59-achieving-reliability-with-amazon-eks.md]]

## Summary
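- Core topic: reliability practices for Amazon EKS, covering container service selection, the shared responsibility model, and a three-layer reliability design
- Problem domain: how to build a highly reliable Kubernetes cluster on EKS
- Methods/mechanisms: application reliability (Pod distribution, HPA, VPA, probes), control plane reliability (monitoring, authentication, cluster upgrades), and data plane reliability (node problem detection, resource reservation, QoS)
- Conclusion/value: EKS reliability has to be addressed across the application, control plane, and data plane layers, with responsibilities split between AWS and the customer under the shared responsibility model

## Key Claims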
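- ECS suits users who are new to containers; EKS suits users already familiar with the Kubernetes ecosystem
- Reliability means that a system still behaves predictably when failures occur
- AWS manages the control plane (API Server, etcd, Scheduler, Controller Manager); the customer is responsible for the data plane (worker nodes, OS, application configuration)
- With Fargate, the customer does not need to manage nodes or handle node patching and upgrades
- Application reliability is achieved through Pod anti-affinity, topology spread constraints, HPA/VPA, probes, and Pod disruption budgets
- Control plane reliability is achieved through monitoring control plane metrics, secure authentication, carefully configured webhooks, and cluster upgrades
- Data plane reliability is achieved through the node problem detector, system resource reservation, and QoS resource quotas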
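
The QoS item in the data plane claim can be made concrete with a small sketch. The source does not spell out the rules, so the following assumes the standard Kubernetes QoS classes (`Guaranteed`, `Burstable`, `BestEffort`) and derives the class from each container's CPU and memory requests and limits. The function name `qos_class` and the dict layout are hypothetical, and quantities are compared as plain strings, which is a simplification (`"500m"` and `"0.5"` would be treated as different).

```python
from typing import Dict, List

# Hypothetical layout: {"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Classify a Pod by the Kubernetes QoS rules from its cpu/memory settings."""
    any_set = False          # does any container set a request or a limit?
    all_guaranteed = True    # does every container have request == limit for both?
    for container in containers:
        limits = container.get("limits", {})
        # Kubernetes defaults a missing request to the limit when a limit is set.
        requests = {**limits, **container.get("requests", {})}
        for resource in ("cpu", "memory"):
            req, lim = requests.get(resource), limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            if lim is None or req != lim:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


if __name__ == "__main__":
    pod = [{"requests": {"cpu": "250m", "memory": "128Mi"},
            "limits": {"cpu": "500m", "memory": "128Mi"}}]
    print(qos_class(pod))
```

Under node memory pressure the kubelet tends to evict `BestEffort` Pods first and `Guaranteed` Pods last, which is why setting requests and limits is usually discussed alongside system resource reservation.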

## Key Quotes
> "Reliability in a system means it offers predictable behavior even when failures occur." — Surav Paul

> "ECS is a more AWS opinionated way of running containers." — Surav Paul

> "With Fargate, you don't have to worry about managing the nodes or worrying about patching or upgrading the nodes." — Surav Paul

## Key Concepts
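- [[EKS 可靠性|EKS reliability]]: the system still behaves predictably when failures occur
- [[共享责任模型|Shared responsibility model]]: AWS manages the control plane; the customer is responsible for the data plane and applications
- [[Pod 反亲和性|Pod anti-affinity]]: keeps Pods from being placed on the same node or in the same Availability Zone
- [[拓扑分布约束|Topology spread constraints]]: fine-grained control over how Pods are distributed across Availability Zones
- [[HPA]]: Horizontal Pod Autoscaler, which scales the number of Pods automatically based on CPU/memory
- [[VPA]]: Vertical Pod Autoscaler, which automatically adjusts Pod resource requests
- [[探针|Probes]]: liveness, readiness, and startup probes used for Pod health checks
- [[Pod 中断预算|Pod disruption budget]]: ensures a minimum service level is still provided during maintenance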
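
As a rough illustration of how several of these concepts look in practice, the sketch below builds the corresponding Kubernetes manifest fragments as Python dictionaries and adds the replica calculation documented for the HPA. The field names are standard Kubernetes API fields; the app label, image, probe paths, port, and object names are placeholders that do not come from the source, and the HPA helper ignores the tolerance band and min/max replica bounds that the real controller applies.

```python
import json
import math

APP_LABELS = {"app": "web"}  # hypothetical label used by all selectors below


def pod_template() -> dict:
    """Pod template fragment: zone spread, node anti-affinity, and probes."""
    return {
        "metadata": {"labels": APP_LABELS},
        "spec": {
            # Keep the per-zone Pod count within 1 of every other zone.
            "topologySpreadConstraints": [{
                "maxSkew": 1,
                "topologyKey": "topology.kubernetes.io/zone",
                "whenUnsatisfiable": "DoNotSchedule",
                "labelSelector": {"matchLabels": APP_LABELS},
            }],
            # Prefer not to co-locate replicas on the same node.
            "affinity": {"podAntiAffinity": {
                "preferredDuringSchedulingIgnoredDuringExecution": [{
                    "weight": 100,
                    "podAffinityTerm": {
                        "topologyKey": "kubernetes.io/hostname",
                        "labelSelector": {"matchLabels": APP_LABELS},
                    },
                }],
            }},
            "containers": [{
                "name": "web",
                "image": "example.com/web:1.0",  # placeholder image
                "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080},
                                   "periodSeconds": 5},
                "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080},
                                  "periodSeconds": 10},
            }],
        },
    }


def pod_disruption_budget(min_available: int) -> dict:
    """PDB that keeps at least `min_available` Pods up during voluntary disruptions."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": "web-pdb"},  # placeholder name
        "spec": {"minAvailable": min_available,
                 "selector": {"matchLabels": APP_LABELS}},
    }


def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float) -> int:
    """desired = ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * current_metric / target_metric)


if __name__ == "__main__":
    print(json.dumps(pod_disruption_budget(2), indent=2))
    print(hpa_desired_replicas(4, 90, 60))
```

The dictionaries can be serialized to JSON and passed to `kubectl apply -f -`, since Kubernetes accepts JSON manifests as well as YAML. A soft (`preferred...`) anti-affinity rule is used here on the assumption that scheduling should still succeed when nodes are scarce; a hard rule would trade availability of scheduling for stricter separation.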

## Key Entities
- [[Surav Paul]]: AWS Senior Solutions Architect and the speaker for this topic
- [[AWS]]: public cloud platform that provides the EKS service
- [[EKS]]: Elastic Kubernetes Service, AWS's managed Kubernetes service
- [[ECS]]: Elastic Container Service, AWS's container service
- [[Fargate]]: AWS's serverless container runtime environment

## Connections
- [[EKS]] ← uses [[共享责任模型|Shared responsibility model]] ← [[AWS]]
- [[Surav Paul]] ← presented [[CTP Topic 59 Achieving reliability with Amazon EKS]]
- [[CTP Topic 59 Achieving reliability with Amazon EKS]] ← depends on [[EKS]]
- [[CTP Topic 70 EKS Deployment using IAC]] ← related topic

## Contradictions
- (None yet)