change folder

2026-04-18 16:17:24 +08:00
parent 218d592979
commit 60d2f8254b
179 changed files with 190 additions and 190 deletions
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-1-gruntwork-landing-zone-architecture.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-1-gruntwork-landing-zone-architecture.md
@@ -1,89 +0,0 @@
---
-title: "CTP Topic 1 Gruntwork Landing Zone Architecture"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Landing-Zone
-  - Gruntwork
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 1_ Gruntwork Landing Zone Architecture.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 1 Gruntwork Landing Zone Architecture
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 1_ Gruntwork Landing Zone Architecture.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
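-## Summary
-
-This session covered the Gruntwork-based Landing Zone architecture for the cloud platform and how to apply best practices in a cloud transformation project. Gruntwork is an organization with a large body of Terraform code that has been proven in repeated real-world use and is regarded as best practice. The session introduced the concepts of the Reference Architecture and the Landing Zone, and how each is implemented across environments and accounts. The Reference Architecture is a starting point: it contains core accounts such as Shared, Logs, and Security, plus workload accounts such as Prod, Stage, and Dev. The Landing Zone is based on Gruntwork but does not include concrete ECS clusters or RDS databases; those are defined by the product teams. The Security account uses federated users, with AD groups mapped to IAM roles. The session also stressed the role of Jenkins in the CI/CD process: each Landing Zone has a Jenkins server that deploys infrastructure changes, and each product team has its own Jenkins jobs for the infrastructure it owns. The Git workflow was discussed as well, with development on feature branches merged to the main branch through pull requests. Finally, the session introduced Gruntwork's Terraform AWS Service Catalog, stressing that a service should carry business context rather than being a bare resource.
-
---
-
-## Key Concepts
-
- **Reference Architecture**: A collection of best practices that serves as the starting point for a cloud platform deployment; it includes core accounts and workload accounts.
- **Landing Zone**: A Gruntwork-based cloud platform environment containing core accounts such as Security, Shared, and Logs, plus workload accounts defined by the product teams.
- **Federated User**: Accesses cloud platform resources through AD groups mapped to IAM roles, replacing traditional IAM users.
- **CI/CD process**: Uses Jenkins for continuous integration and continuous delivery; infrastructure changes are managed through feature branches, pull requests, and an approval flow.
- **Terraform AWS Service Catalog**: A collection of Terraform modules provided by Gruntwork for building cloud services that carry business context.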
-
---
-
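-## Action Items
-
- [ ] Get familiar with Gruntwork's Terraform AWS Service Catalog and the modules and services it offers.
- [ ] Follow the Git workflow: develop on feature branches and merge to the main branch through pull requests.
- [ ] Understand the role of Jenkins in the CI/CD process and how to configure Jenkins jobs that deploy infrastructure changes.
- [ ] Get familiar with how federated users are configured and how AD groups map to IAM roles.
- [ ] Confirm the specific Active Directory connection configuration, in particular whether it is corp.joml or swing throw.
-
---
-
-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-XX-git-workflow.md]]: explains Git workflow best practices in detail.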
-
-
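-## Key Concepts
-
- **Reference Architecture**: A best-practice starting point made up of core accounts (Shared/Logs/Security) and workload accounts (Prod/Stage/Dev)
- **Landing Zone**: An infrastructure deployment unit based on the Gruntwork repositories; each zone has its own GitHub repository for managing IaC
- **Federated Access**: Federated identity access through AD groups mapped to IAM roles, which simplifies management of the Security account
- **Gruntwork Modules**: Battle-tested Terraform modules that provide business context and a suitable level of granularity
- **CI/CD Pipeline**: An automated infrastructure-change flow built on feature branches, PRs, and Jenkins
-
---
-
-## Action Items
-
- [ ] Get familiar with the Gruntwork Terraform AWS Service Catalog and its available modules
- [ ] Adopt a feature-branch development flow and merge to the main branch through PRs
- [ ] Configure Jenkins pipelines to automate Terraform plan/apply
- [ ] Explore Terratest for automated testing of infrastructure changes
- [ ] Decide on the specific configuration for Active Directory federated access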
-
---
-
-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-2-git.md]]: Git version control basics (a prerequisite for CI/CD)
-> [[ctp-topic-3-deploy-and-maintain-infrastructure.md]]: deploying and maintaining infrastructure with Terraform
-> [[ctp-topic-9-ci-cd-with-gruntwork.md]]: Gruntwork CI/CD pipeline practices
-> [[ctp-topic-10-aws-landing-zone-lz-data-collection-tagging-related-security.md]]: Landing Zone security configuration
-
---
-
-*Last updated: 2026-04-15*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-10-aws-landing-zone-lz-data-collection-tagging-related-security.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-10-aws-landing-zone-lz-data-collection-tagging-related-security.md
@@ -1,66 +0,0 @@
---
-title: "CTP Topic 10 AWS Landing Zone (LZ) Data Collection, Tagging Related Security"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Landing-Zone
-  - Tagging
-  - Security
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 10_ AWS  Landing Zone (LZ) Data Collection, Tagging _ Related Security.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 10 AWS Landing Zone (LZ) Data Collection, Tagging Related Security
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 10_ AWS  Landing Zone (LZ) Data Collection, Tagging _ Related Security.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
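-## Summary
-
-> This video is a weekly technical session of the Cloud Transformation Program. It focuses on the deployment process for **AWS Landing Zones**, the data collection strategy, and how tagging and security policies are used to build a modern cloud security architecture. Presented by Steve Jarman and Pradeep, the session aims to help teams understand the shift from traditional network security to cloud-native security based on identity and metadata.
->
-> The content falls into three parts. First, Steve covered Landing Zone planning and automation. He stressed that before deployment the team must understand the business unit's (BU) asset inventory, IP address space, and data sensitivity in order to define an appropriate security posture. Creation of foundational services such as DNS and Transit Gateway is now highly automated by the SRE team.
->
-> Second, the video explained the **tag-based security control mechanism** in detail. Unlike traditional IP-based firewall rules, this approach uses AWS tags as security credentials. To keep users from bypassing security auditing by tampering with tags, the architecture introduces **OUs (Organizational Units)** and **SCPs (Service Control Policies)**. Through the "explicit deny" logic of SCPs, the system enforces the tagging standard so that resources carry the correct ownership (BU, product, environment, and so on) from the moment they are created.
->
-> Finally, Pradeep demonstrated the "Ordered Layer" logic in the **Check Point firewall**. The firewall filters traffic in layers according to tags, including geo-blocking, BU isolation, product isolation, and environment isolation (for example, separating development from production). This design ensures that traffic crossing VPCs, reaching on-premises systems, or going to the internet is subject to fine-grained policy, while still allowing legitimate access to shared services such as PSDC.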
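
As a rough illustration of the explicit-deny tag enforcement described in the summary, the following Python sketch builds an SCP document that denies `ec2:RunInstances` when a required tag is absent from the request. The tag keys (`BU`, `Product`, `Environment`) and the restriction to EC2 instances are assumptions made for illustration; the session notes do not show the actual policies.

```python
import json

# Hypothetical required tag keys; the notes mention BU, product, and environment.
REQUIRED_TAGS = ["BU", "Product", "Environment"]


def build_tag_enforcement_scp(required_tags):
    """Return an SCP document that denies EC2 launches missing any required tag."""
    statements = []
    for key in required_tags:
        statements.append({
            "Sid": f"DenyRunInstancesWithout{key}",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            # "Null": "true" matches when the tag key is absent from the request.
            "Condition": {"Null": {f"aws:RequestTag/{key}": "true"}},
        })
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_tag_enforcement_scp(REQUIRED_TAGS)
print(json.dumps(policy, indent=2))
```

Because an explicit deny in an SCP overrides any allow granted inside the member account, a policy of this shape would block untagged launches regardless of the caller's IAM permissions.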
-
---
-
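-## Key Concepts
-
- **AWS Landing Zones**: A foundational framework for quickly setting up a secure, multi-account AWS environment according to best practices.
- **Tagging Methodology**: Standardized metadata defined on resources (such as Owner, BU, Product, Environment) that forms the basis for automated management and security policy enforcement.
- **SCP (Service Control Policies)**: An organization policy type for managing permissions across an organization; in this video it is used to enforce tag compliance and prevent unauthorized tag changes.
- **OU (Organizational Unit)**: A container for grouping accounts in AWS Organizations, used to apply security policies (SCPs) hierarchically.
- **Check Point Firewall**: A virtual firewall deployed in the cloud environment that integrates with AWS tags for dynamic object identification and traffic filtering.
- **Transit Gateway**: The network hub connecting VPCs with the on-premises network; a key point where cross-environment traffic passes through firewall inspection.
- **Ordered Layer**: A way of organizing firewall policy that applies geo-blocking, BU isolation, environment isolation, and similar logic in sequence.
- **SRE (Site Reliability Engineering)**: Responsible for writing automation scripts and maintaining the infrastructure in Landing Zone deployments.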
-
---
-
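-## Related Videos
-
-> [!info]+ Cross-references
-> [[AWS_Organizations_and_SCP_Deep_Dive]]: a deeper look at writing and applying SCPs to strengthen account security.
-> [[SRE_Automation_Services_Overview]]: the DNS and network automation tooling the SRE team built for the Landing Zone.
-> [[Hybrid_Cloud_Connectivity_Guide]]: details how Transit Gateway connects the AWS environment with on-premises data centers.
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*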
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-14-octane-hub-on-aws-real-life-experience-moving-production-services-i.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-14-octane-hub-on-aws-real-life-experience-moving-production-services-i.md
@@ -1,85 +0,0 @@
---
-title: "CTP Topic 14 Octane Hub on AWS Real life experience moving production services into the new land"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Octane-Hub
-  - Migration
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 14_ Octane Hub on AWS_ Real life experience moving production services into the new land.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 14 Octane Hub on AWS: Real-Life Experiences
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 14_ Octane Hub on AWS_ Real life experience moving production services into the new land.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status: ✅ Summary completed**
-
---
-
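-## Summary
-
-Holger Rode (lead of the Octane Hub CTO Software Factory team) shared Octane Hub's cloud design considerations, learning curve, network and security requirements, and common pitfalls. The Octane Hub team runs mainly on Docker containers and was previously hosted in the Bibling Lab on three physical servers and several virtual machines.
-
-### Existing Workloads
-
-These containers run various web applications, including QuickSee, Release Manager, Patch Manager, and a security program dashboard. They also handle background jobs such as support integration, data replication, and internal idle search. The team also manages about 10 TB of file storage and a large MSSQL Server database.
-
-### Drivers for the Cloud Migration
-
-The impending closure of the Bibling data center made the cloud migration urgent. The Cloud Transformation Program provided help: the team received access to a proof-of-concept Landing Zone account around May, followed by the production account in June. The team aimed for a seamless transition that closely mirrored the existing setup, to avoid major technical changes during go-live.
-
-### Technology Choices and Challenges
-
- Used the AWS Pricing Calculator to understand service costs
- Initially considered EFS for storage, but it was unsuitable because of performance issues (the database could not run directly on EFS)
- Switched to EBS for the live database, with EFS for backups
- Deployment evolved from console scripts to building AMIs with Packer and deploying with Terraform/Terragrunt
- Network issues required multiple PCS requests and were resolved together with the network team
- Uses VPC Transit Gateway and a tagging system to manage access
- DNS setup: a CNAME points to the AWS software infra.net domain, managed through Route 53
-
-### Next Steps
-
- Improve DR and high availability
- Optimize cost by choosing the best-fitting storage option (S3)
- Migrate from MSSQL to Postgres
- Possibly migrate to the AWS ECS service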
-
---
-
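-## Key Concepts
-
- **Docker containerization**: Octane Hub's main deployment model
- **Packer + Terraform/Terragrunt**: The infrastructure-as-code deployment flow
- **VPC Transit Gateway**: AWS's network interconnect solution
- **Tagging system**: Manages resource access by role and environment
- **EFS vs EBS**: The performance difference between file storage and block storage
-
---
-
-## Action Items
-
- [ ] Assess whether existing workloads are suitable for containerization
- [ ] Plan the database migration path from MSSQL to Postgres
- [ ] Check whether the EBS/EFS storage choices are appropriate
- [ ] Draw up a plan for improving DR and high availability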
-
---
-
-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-7-saas-landing-zone-design.md]]: SaaS Landing Zone design
-> [[ctp-topic-25-labs-landing-zone-overview-itom-teams.md]]: Labs Landing Zone overview
-
---
-
-*Last updated: 2026-04-15*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-17-active-directory-services-in-gruntwork-aws-lzs.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-17-active-directory-services-in-gruntwork-aws-lzs.md
@@ -1,62 +0,0 @@
---
-title: "CTP Topic 17 Active Directory Services in Gruntwork AWS LZs"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Landing-Zone
-  - AD
-  - Gruntwork
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 17_ Active Directory Services in Gruntwork AWS LZs.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 17 Active Directory Services in Gruntwork AWS LZs
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 17_ Active Directory Services in Gruntwork AWS LZs.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status: 🟡 Awaiting Whisper transcription → Summary**
-
---
-
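-## Summary
-
-> This video is part of the DevOps cloud learning series and focuses on core practices for integrating and managing Active Directory (AD) services in the Gruntwork AWS Landing Zones architecture. The presenter, Paul, described the domain configuration for the two main environments: the R&D Labs use the `swinford.net` domain, while the production and staging SaaS environments use `intsas.local`. The video states that the legacy `infra` and `AST` domains are deprecated in the new Gruntwork Landing Zones, and gives users a migration path and guidance on ownership.
-> 
-> On the technical side, the video explained how to use prebuilt AMIs (Amazon Machine Images) provided by the SRE team to automate the domain join. By calling the built-in scripts from `user_data` in Terraform, Windows instances get automatic naming, administrator permission assignment, and cleanup of stale objects; Linux instances support secure dynamic updates to register their DNS A records automatically. The video also introduced self-service tools (such as MIM) and support channels (such as the SMACKS ticketing system) for the different environments, to help developers join systems to the domain more efficiently and with more automation while staying within security and compliance requirements.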
-
---
-
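-## Key Concepts
-
- **Gruntwork Landing Zones**: A preconfigured, best-practice AWS infrastructure framework, split into two environment types: R&D Labs and SaaS.
- **swinford.net**: The Active Directory domain dedicated to the R&D Labs environment; supports self-service management.
- **intsas.local**: The internal Active Directory domain for production and staging SaaS environments, with an emphasis on resource ownership and auditing.
- **SRE-provided AMIs**: Machine images prebuilt by the SRE team, with built-in PowerShell and shell scripts for automatic domain join.
- **User Data**: Script data executed when an AWS instance launches; in this video it triggers the automated domain join.
- **MIM (Microsoft Identity Manager)**: The self-service solution for security group management and permission requests in the R&D environment.
- **SMACKS Ticket**: The internal service management ticketing system, used to request new accounts, reset passwords, or handle complex production changes.
- **Secure Dynamic Updates**: A security mechanism that lets Linux systems securely register their A records with the Windows DNS server when joining the domain.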
-
---
-
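-## Related Videos
-
-> [!info]+ Cross-references
-> [[Gruntwork AWS Landing Zones Overview]]: background on the infrastructure the AD services run in
-> [[SRE Standard AMIs and Image Building]]: the build standard for AMIs with built-in domain-join scripts
-> [[Terraform Single Server Module Deep Dive]]: a closer look at the Terraform module used in the video to deploy instances
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*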
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-25-labs-landing-zone-overview-itom-teams.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-25-labs-landing-zone-overview-itom-teams.md
@@ -1,71 +0,0 @@
---
-title: CTP Topic 25 Labs Landing Zone overview - ITOM teams
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - Labs
-  - ITOM
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 25 Labs Landing Zone overview - ITOM teams
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Labs Landing Zone Overview
-
The Labs landing zone is based on the Gruntwork reference architecture and AWS standards, utilizing a multi-account strategy. The entire stack is managed through infrastructure as code (Terraform), using a library of common functions accessible for review and modification. *Everything should be managed using Terraform or some other code-based mechanism.*
-
-Key components include:
-
*   **Shared Account:** Hosts the Jenkins master for the CI/CD pipeline (Gruntwork production grade), hardened AMIs, and a Docker container store.
-*   **Logs Account:** Secure storage for AWS Config and CloudTrail logs, with access controlled by the security team.
-*   **Security Account:** Manages user accounts and access, primarily for cross-account access and shared accounts, with most access being federated.
-*   **Core Accounts:**
-    *   Active Directory: Manages Windows instances and IDPs (all in Swimford.net).
-    *   DNS: Manages AWS Swimford.net, allowing for local domains or referencing the wider infrastructure.
*   **Network Account:** Central hub for network communication, managing traffic via Transit Gateway and the Check Point firewall. All internet access is routed through here, managed by the network team via tags. Pulse VPN access is also managed here, providing access to the Micro Focus network.
*   **Shared Service Accounts:** Provide access to services like monitoring (ArcSight) and Qualys.
-*   **Product Account:** The primary working environment, built to standard infrastructure-as-code modules. It can have multiple accounts (production, staging, development). Logs are shipped to the logs account, and Jenkins manages automation within the account.
-
-When deploying a product account, key requirements include defining IP address ranges and agreeing on specific tags with the network team for firewall access. *Access through that firewall is all managed by tags.* The team recommends using their Terraform modules for deploying subnets.
-
-The standard Jenkins-based pipelines scan GitHub Enterprise repositories for changes, running Terragrunt plans or applies based on the branch. Internet connectivity is restricted; access to specific corporate network locations requires a request to the network services team. The pipelines are continuously being improved for robustness and security, including pre-commit checks and Fortify scans.
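
The branch-dependent behaviour of the pipeline can be sketched in a few lines of Python. This is only an illustration: the notes do not say which branch triggers an apply, so the `main` default and the `-auto-approve` flag below are assumptions, and `terragrunt_command` is a hypothetical helper, not part of the actual pipeline.

```python
def terragrunt_command(branch, apply_branch="main"):
    """Return the Terragrunt invocation a pipeline might run for `branch`."""
    if branch == apply_branch:
        # Assumed: changes merged to the protected branch are applied.
        return ["terragrunt", "apply", "-auto-approve"]
    # Assumed: every other branch only gets a dry-run plan.
    return ["terragrunt", "plan"]


for branch in ("main", "feature/add-subnet"):
    print(branch, "->", " ".join(terragrunt_command(branch)))
```

Keeping the decision in one small function makes it straightforward to unit test the rule separately from the Jenkins job definition.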
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-25-labs-landing-zone-overview-itom-teams.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-25-labs-landing-zone-overview-itom-teams.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 25 Labs Landing Zone overview - ITOM teams
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - Labs
-  - ITOM
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 25 Labs Landing Zone overview - ITOM teams
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
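-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*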
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-26-standard-ami-build-publish-share-processes.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-26-standard-ami-build-publish-share-processes.md
@@ -1,64 +0,0 @@
---
-title: "CTP Topic 26 Standard AMI – build, publish, share processes"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - AMI
-  - Build-Process
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 26_ Standard AMI – build, publish, share processes.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 26 Standard AMI – build, publish, share processes
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 26_ Standard AMI – build, publish, share processes.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
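-## Summary
-
-> This session is part of the Weekly Transformation Program and focuses on how the **Foundation AMI (foundation Amazon Machine Image)** is built, hardened, and distributed. Presented by Srihari, Alan, and Praveen, it aims to show product teams how standardized images improve security and operational efficiency.
->
-> The core content covers full lifecycle management of the Foundation AMI. The Foundation AMI is a deeply hardened image based on mainstream operating systems (CentOS, Ubuntu, Windows, and others) that integrates CIS security benchmarks, antivirus (McAfee ePO), log management (syslog-ng), and single sign-on (AD integration). Its main advantage is that it is ready to use: every instance complies with Micro Focus security and compliance standards from first boot, and comes with the SSM Agent and SiteScope monitoring prerequisites preinstalled.
->
-> On the implementation side, the team uses a build pipeline based on **HashiCorp Packer** and **Jenkins**, fully automating image creation. To optimize cost and distribution speed, images are stored in a central repository and distributed to multiple regions worldwide (such as Oregon, Frankfurt, and Sydney) by cross-account sharing rather than physical copying. Images are updated every two months and follow an N-2 version retention policy.
->
-> Finally, the session emphasized the shared responsibility model: the CCOE provides the secure base image, while product teams are encouraged to build their own product-specific AMIs on top of the Foundation AMI to meet application-specific needs, and are responsible for the subsequent lifecycle of those images.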
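
To make the N-2 retention policy concrete, here is a minimal Python sketch. It assumes N-2 means "keep the newest build plus the two before it"; the image names and build dates are invented for the example, and `split_by_retention` is a hypothetical helper rather than part of the team's pipeline.

```python
from datetime import date


def split_by_retention(images, keep_previous=2):
    """Split images into (retained, expired): the newest build plus `keep_previous` older ones."""
    ordered = sorted(images, key=lambda image: image["built"], reverse=True)
    cutoff = keep_previous + 1
    return ordered[:cutoff], ordered[cutoff:]


# Hypothetical bi-monthly builds of one Foundation AMI flavour.
images = [
    {"name": "foundation-ubuntu-v1", "built": date(2021, 1, 1)},
    {"name": "foundation-ubuntu-v2", "built": date(2021, 3, 1)},
    {"name": "foundation-ubuntu-v3", "built": date(2021, 5, 1)},
    {"name": "foundation-ubuntu-v4", "built": date(2021, 7, 1)},
]

retained, expired = split_by_retention(images)
print("retained:", [image["name"] for image in retained])
print("expired:", [image["name"] for image in expired])
```

With a two-month build cadence, a policy of this shape would keep roughly the last six months of images available to product teams.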
-
---
-
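-## Key Concepts
-
- **Foundation AMI**: The base machine image: an operating system template that is security-hardened, integrated with standard components, and preconfigured.
- **OS Hardening**: Reducing the system's attack surface by disabling unnecessary services, tuning kernel parameters, and applying security patches.
- **CIS Benchmarks**: Security configuration baselines published by the Center for Internet Security, used to measure whether an image follows industry best practice.
- **HashiCorp Packer**: An open-source tool for automatically creating consistent machine images for multiple cloud platforms from a single source configuration.
- **SSM Agent**: The AWS Systems Manager agent, used for remote instance management, automated patching, and configuration synchronization.
- **AMI Sharing**: Granting other accounts access to the central image, which avoids the extra storage cost of cross-account copies.
- **Central Repository**: The single place where validated official images are stored and managed, ensuring one source of distribution.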
-
---
-
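-## Related Videos
-
-> [!info]+ Cross-references
-> [[Cloud Transformation Program Overview]]: background and overall framework of the transformation program
-> [[Guardrail Rules and Compliance]]: the role of the Foundation AMI in compliance checks
-> [[CCOE Portal User Guide]]: how to subscribe to AMI update notifications in the portal
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*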
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-28-aws-tag-validation-tool.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-28-aws-tag-validation-tool.md
@@ -1,62 +0,0 @@
---
-title: "CTP Topic 28 AWS Tag Validation Tool"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Tagging
-  - Validation
-  - Tool
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 28_ AWS Tag Validation Tool.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 28 AWS Tag Validation Tool
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 28_ AWS Tag Validation Tool.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
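-## Summary
-
-> Presented by Lewis Brown, this video introduces the **AWS Tag Validation Tool** developed by the SRE team. In this AWS architecture, tags are more than simple metadata (key-value pairs): they directly affect network security. The organization's Check Point firewall reads the tag values on EC2 instances, security groups, and load balancers to configure network access; if a tag is invalid or missing, the firewall blocks the related traffic.
-> 
-> Service Control Policies (SCPs) already block creation of non-compliant resources at the organization level (currently applied mainly to SaaS accounts), but the large number of existing resources still needs an effective audit mechanism. For this, Lewis demonstrated the tool, which is built on Python and Boto3. It reads a YAML configuration file containing the predefined valid tag values for each account, automatically scans EC2 instances, security groups, load balancers, and Lambda functions in the specified account, and compares the results with the expected values.
-> 
-> The tool produces a detailed CSV report listing the resource ID and specific problem for every resource with missing or incorrect tag values, which greatly improves audit efficiency. In the demo, Lewis showed how to manage the Python environment with Poetry and run the script. The video also discussed the potential future use of tags for costing, that is, using tags to separate the resource consumption of different products within the same account.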
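
The comparison step at the heart of such a tool can be sketched as follows. This is not the SRE team's code: the resource list is hard-coded instead of being fetched with Boto3, the allowed values are a plain dictionary standing in for the YAML file, and every name and value here is hypothetical.

```python
import csv
import io

# Hypothetical allowed values, standing in for the tool's YAML configuration.
ALLOWED_TAGS = {
    "BU": ["itom"],
    "Environment": ["dev", "prod"],
}


def find_tag_issues(resources, allowed_tags):
    """Return (resource_id, tag_key, problem) rows for missing or invalid tags."""
    issues = []
    for resource in resources:
        tags = resource.get("tags", {})
        for key, allowed_values in allowed_tags.items():
            if key not in tags:
                issues.append((resource["id"], key, "missing"))
            elif tags[key] not in allowed_values:
                issues.append((resource["id"], key, f"invalid value: {tags[key]}"))
    return issues


def to_csv(issues):
    """Render the issue rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["resource_id", "tag_key", "problem"])
    writer.writerows(issues)
    return buffer.getvalue()


# Sample inventory; a real run would collect this from the AWS APIs.
resources = [
    {"id": "i-0001", "tags": {"BU": "itom", "Environment": "prod"}},
    {"id": "sg-0002", "tags": {"BU": "itom"}},
    {"id": "i-0003", "tags": {"BU": "unknown", "Environment": "dev"}},
]

issues = find_tag_issues(resources, ALLOWED_TAGS)
print(to_csv(issues))
```

Separating the comparison from the AWS calls keeps the rule logic testable without credentials; only the inventory-gathering step needs account access.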
-
---
-
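-## Key Concepts
-
- **AWS Tags**: Metadata attached to AWS resources as key-value pairs, used to identify and manage resources.
- **Check Point Firewall**: A network security solution that, in this case, reads resource tags to dynamically configure and enforce network access policy.
- **Service Control Policies (SCPs)**: An AWS Organizations policy type for centrally managing the maximum available permissions of all accounts in an organization; used here to enforce the tagging standard.
- **Boto3**: The AWS SDK for Python, which lets developers call AWS service APIs from Python code.
- **Poetry**: A Python dependency management and packaging tool, used to keep development environments consistent and simplify installing and running the tool.
- **variables.yaml**: The tool's core configuration file, defining the valid tag keys expected for a given AWS account and the list of allowed values for each.
- **SRE Tools Repository**: The internal code repository that holds this validation tool and other SRE automation scripts.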
-
---
-
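-## Related Videos
-
-> [!info]+ Cross-references
-> [[CTP Topic 10 - AWS Tagging Deep Dive]]: the related video mentioned by the presenter, covering the technical details and standards for tags.
-> [[AWS Landing Zone Governance]]: related because it discusses the background of tag-based governance in the old and new Landing Zones.
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*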
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-34-azure-landing-zone-architecture-overview.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-34-azure-landing-zone-architecture-overview.md
@@ -1,59 +0,0 @@
---
-title: CTP Topic 34 Azure Landing Zone Architecture Overview
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - Azure
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 34 Azure Landing Zone Architecture Overview
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Azure Landing Zone Architecture Overview
-
-Kishore Garlopati presents an overview of the upcoming Azure Landing Zones implementation within Micro Focus, detailing how it will simplify Azure adoption for various teams and enable them to deploy workloads to the Azure cloud. The primary goal is to minimize cross-team dependencies through automation, granting teams greater independence in deploying innovative solutions within the Azure environment.
-
-The architecture begins with enrollment into Azure Enterprise, utilizing Azure Active Directory for user authentication. Azure employs management groups, similar to parent directories in Windows, to organize the entities within Micro Focus. These are divided into four areas: platform, landing zones, decommission, and sandbox. The platform includes identity management and connectivity subscriptions, each with a specific purpose and managed by dedicated teams to enhance security. *The core reason of these individual or isolated subscriptions is you are basically containing a subscription for a specific purpose.*
-
Identity subscriptions manage access policies, while connectivity subscriptions serve as a central hub for all inbound and outbound Azure traffic, incorporating security measures like DDoS protection and Check Point firewalls. Landing zones are designed to be scalable, modular, and fully automated, providing a template-based approach for new projects. These zones emphasize identity access management, auditing, compliance, security monitoring, and networking. Decommissioned subscriptions are for unused resources, and sandbox subscriptions offer isolated environments for experimentation. *This sandbox is an interesting one because these landings on subscriptions allows your workloads.*
-
-Privileged Identity Management (PIM) and privileged access groups manage user access, ensuring appropriate role and policy enforcement. Terraform Cloud is used for infrastructure automation, leveraging Terraform states to manage dependencies between subscriptions. This layered approach allows teams to access necessary data without exposing sensitive information.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-34-azure-landing-zone-architecture-overview.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-34-azure-landing-zone-architecture-overview.md.bak
@@ -1,50 +0,0 @@
---
-title: CTP Topic 34 Azure Landing Zone Architecture Overview
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - Azure
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 34 Azure Landing Zone Architecture Overview
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
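-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*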
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-35-aws-landing-zone-design-refresher-saas-labs.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-35-aws-landing-zone-design-refresher-saas-labs.md
@@ -1,59 +0,0 @@
---
-title: CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - SaaS
-  - Labs
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## AWS Landing Zone Design Refresher
-
This session provides an overview of AWS Landing Zones, focusing on their design, updates, and differences between SaaS and Labs environments. The primary goal of landing zones is to support diverse AWS use cases while ensuring reuse, control, auditing, and management. *Our AWS landing zones, they're built infrastructure as code as you'd expect on Terraform templates using the Gruntwork framework.*

AWS SaaS landing zones offer customer-dedicated environments with product accounts for each product area, such as Snacks. These accounts connect to shared services accounts for security, logging, and networking. The core accounts group includes Active Directory, DNS, and network accounts to support IT services within the Micro Focus infrastructure. The shared service accounts host services like Artifactory, Qualys, McAfee ePO, ArcSight, and monitoring. Gruntwork accounts manage AMIs, logs, and security across all accounts. Product accounts host IT products, projects, applications, and supporting AWS resources, managed by individual project teams.

Recent changes to the landing zones include network segmentation to block direct connectivity to SaaS workloads, decommissioning of the Gruntwork CloudTrail in favor of the CCOE's CloudTrail, and proposed rerouting of ingress traffic via Check Point firewalls in the network account. Native AWS Backup is likely to be mandated, and management VPCs may be removed for new accounts. The key difference between SaaS and Labs is that SaaS is for production, while Labs is for development, with plans to introduce internet access into Labs. *Basically, the only answer is that SaaS is production, Labs is development.* The PoC landing zone will be combined with Labs to maximize shared resources. The Cloud Technology Design Forum aims to standardize and centralize Micro Focus's cloud delivery offering, including landing zone designs.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-35-aws-landing-zone-design-refresher-saas-labs.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-35-aws-landing-zone-design-refresher-saas-labs.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - SaaS
-  - Labs
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
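-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*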
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-40-saas-database-architecture-on-aws-cloud.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-40-saas-database-architecture-on-aws-cloud.md
@@ -1,63 +0,0 @@
---
-title: CTP Topic 40 SaaS Database Architecture On AWS Cloud
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - SaaS
-  - Database
-  - Architecture
-  - AWS
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 40 SaaS Database Architecture On AWS Cloud
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## SaaS Database Architecture on AWS Cloud

The SaaS database team is a global team located in the US, Canada, India, and Israel, providing 24/7 support. The team consists of certified professionals, including Oracle certified professionals, DBAs, and security professionals. They manage over 500 databases and more than 1,000 DB servers on-premises and in the public cloud, having migrated numerous DB servers and databases to the public cloud.

The team supports various regions, including Sacramento and Reading for on-premises data centers, and AWS regions like Canada, Frankfurt, London, Oregon, North Virginia, and Sydney. They support database flavors such as Oracle, Vertica, Postgres, DynamoDB, SQL Server, MongoDB, and MySQL, utilizing AWS technologies like Aurora PostgreSQL, Elasticsearch, AWS RDS, EFS, S3, and EBS. Databases reside mostly on application VPCs with integrated security measures.

For database monitoring, performance tuning, and gap analysis, tools like Micro Focus SiteScope, Oracle OEM, Ignite, AWS CloudWatch, and Quest Foglight are used. Day-to-day operations are managed through a ticketing tool, with an on-call DBA resource. The team actively participates in squads and executes a minimum of 10 changes a month, handling 400-500 SSRs and IMs monthly. They provide layer 1 and layer 3 support, using technologies like shell scripting, Terraform, AWS CLI, and PowerShell for automation. *Data center migrations and cloud provisioning were key automation projects.*

Key projects include data center migrations, onboarding new customers, database security enhancements, DB-AD integrations, SOX compliance, database consolidation, and DB patching. The team is also working on Oracle GoldenGate for multi-tenancy, adopting cloud-native technologies, and enhancing the Pretty Tool for on-demand backups and database migrations. Future plans involve new AMI automations, storage compression, Reserved Instance (RI) optimization, AWS cloud-native backups, and enhancements to the DB apps tool. *The idea was to move those databases seamless without downtime or with minimum downtime.*

For high availability, Oracle uses Data Guard technology, Postgres uses a classic active-passive mechanism (with plans to move to active-active), and RDS uses RDS high availability. Databases run across three availability zones within a region: a primary database in one zone, a standby database in the second, and a witness in the third to observe and manage failovers. Reporting databases have a read-only warehouse in the third availability zone, with secure VPN access for customers to run operational warehousing queries.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-40-saas-database-architecture-on-aws-cloud.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-40-saas-database-architecture-on-aws-cloud.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 40 SaaS Database Architecture On AWS Cloud
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - SaaS
-  - Database
-  - Architecture
-  - AWS
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 40 SaaS Database Architecture On AWS Cloud
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
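-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*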
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-44-aws-backup-in-micro-focus.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-44-aws-backup-in-micro-focus.md
@@ -1,87 +0,0 @@
---
-title: "CTP Topic 44 AWS Backup in Micro Focus"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - Backup
-  - DR
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 44_ AWS Backup in Micro Focus.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 44 AWS Backup in Micro Focus
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 44_ AWS Backup in Micro Focus.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** ✅ Done (Gemini summary)
-
---
-
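-## Summary
-
-AWS Backup is a managed service that centralizes and automates data protection in the AWS cloud. It supports cross-account and cross-region backups and provides immutability to guard against threats such as ransomware. The service offers point-in-time recovery for services such as S3 and RDS, with a potential recovery time of under one second. It also supports legal holds, which isolate specific backups and retain them to meet compliance requirements.
-
-### Disaster Recovery Strategies
-
-Disaster recovery strategies differ by recovery time objective (RTO) and recovery point objective (RPO). The four main strategies are:
-
- **Backup and restore**: Suited to low-priority cases where recovery takes hours.
- **Pilot Light**: Data is replicated to the DR region, allowing recovery within one hour.
- **Warm Standby**: The application runs at reduced scale in both the production and DR regions and can recover within minutes.
- **Active-Active**: Provides near-zero downtime and data loss at the highest cost; the application runs in both regions simultaneously.
-
-### Current AWS Backup Process
-
-Currently, application owners manage snapshots of resources such as EC2, EBS, EFS, S3, and databases. These snapshots are registered in the CCIE portal and replicated to the DR region based on tags. For customer-managed keys, a conversion process is performed. The CCIE portal updates tags to track the backup process and provides error notifications.
-
-### Gaps in the Current Process
-
-The current backup process is decentralized and involves multiple teams, which increases the risk of errors. Snapshots are stored in the same account as the resources, which is a risk if that account is compromised. CCIE vault replicates snapshots to a different region, but retention is limited to three days for cost reasons. Backups are not immutable, and CCIE vault needs new plugins to support new AWS services. Retention periods are inconsistent across product groups.
-
-### AWS Backup Details
-
-AWS Backup encrypts all backups, both at rest and in transit. One limitation is that it cannot exclude specific volumes attached to an EC2 instance, so all attached volumes are backed up. Amazon no longer recommends hot backups for databases, noting that snapshots are crash-consistent and support incremental backups.
-
-### Demo Highlights
-
-The demo showed creating a backup vault (used to encrypt AWS backups), creating a backup plan (daily, hourly, or a custom schedule), enabling point-in-time recovery for S3 and RDS, taking on-demand backups, and restoring from the backup vault (creating a new EBS volume or RDS instance). The service supports role-based access control and can be monitored with CloudWatch.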
-
---
-
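-## Key Concepts
-
- **AWS Backup**: AWS-managed, centralized data protection service
- **Immutability**: Prevents backups from being tampered with or deleted
- **RTO (Recovery Time Objective)**: The target time within which service must be restored
- **RPO (Recovery Point Objective)**: The maximum acceptable window of data loss, measured back from the failure
- **Backup and restore**: The most basic DR strategy, suited to low-priority scenarios
- **Legal Holds**: Retain specific backups for compliance purposes
- **CCIE portal**: The internal platform that currently manages snapshots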
-
---
-
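-## Action Items
-
- [ ] Assess whether the existing backup process should be migrated to AWS Backup
- [ ] Review whether the current backup RTO/RPO meets business requirements
- [ ] Consider cross-account and cross-region backups to improve resilience
- [ ] Check whether database backups still use hot backup mode (not recommended)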
-
---
-
-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-72-implementing-an-enterprise-dr-strategy-using-aws-backup.md]]: Implementing an enterprise DR strategy
-> [[ctp-topic-73-aws-backup-implementation-of-the-cloud-transformation-program.md]]: CTP AWS Backup implementation
-
---
-
-*Last updated: 2026-04-15*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-46-netapps-on-aws.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-46-netapps-on-aws.md
@@ -1,97 +0,0 @@
---
-title: CTP Topic 46 NetApps on AWS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - NetApp
-  - AWS
-  - Storage
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 46 NetApps on AWS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## NetApp on AWS: A Cloud Transformation Program Learning Session
-
-Sandeep and Yael presented a training session on NetApp, covering basic components, architecture, data tiering, security, backup/DR strategy, migration from on-prem to cloud, current NetApp usage, architecture, and a demonstration.
-
-### Traditional NetApp
-
-NetApp is a storage system that runs the ONTAP operating system. It features controller nodes connected to disk enclosures, supporting SSD, SATA, SAS, and FC disks. NetApp primarily supports the SMB, NFS, FC, FCoE, and iSCSI protocols and is often configured as a single node or an HA (high availability) pair.
-
-Key components include:
-*   **Aggregate:** A collection of disks forming a RAID group.
-*   **Volume (FlexVolume):** A data container hosted on top of an aggregate, presented to hosts for data storage, accessible via NFS or CIFS.
-*   **Qtree:** A further segmentation of a volume, similar to directories in UNIX or folders in Windows, with special attributes like permissions and quota management.
-*   **LUN (Logical Unit Number):** A logical representation of storage, hosted on a volume or Qtree, presented to hosts via FC or iSCSI as block-level storage.
-*   **Logical Interface (LIF):** An interface on top of a physical network card, hosting an IP address or WWPN, used for node management, inter-cluster replication, cluster management, and data serving.
-*   **Storage Virtual Machine (SVM):** A virtual segmentation of a NetApp system, enabling multi-tenancy, treating each SVM as a separate operating system with no data flow between them. *At least one SVM is needed for a cluster.*
-
-### NetApp in AWS (Cloud Volume ONTAP - CVO)
-
-CVO is a software-only storage appliance hosted on EC2 instances, which function as its nodes. It can be deployed as a single node or an HA pair, with a mediator instance assisting during takeover and giveback. The nodes are deployed across multiple availability zones with synchronous replication. EBS disks (gp3, gp2, io1, st1) are used as storage, managed via Cloud Manager.
-
-High availability is maintained through a floating IP concept, where clients access data via a unique IP address that migrates to the serving node in case of failure. Takeover and giveback refer to the surviving node taking over services from a failed node and relinquishing them when the failed node recovers.
-
-### Data Tiering
-
-Data tiering involves using various storage media to optimize cost, performance, and availability. NetApp in AWS stores active data on EBS and inactive data on S3. Data inactive for 30 days or more is automatically moved to S3 and pulled back to EBS when accessed. *NetApp stores the active data in EBS and inactive data to S3.*
-
-### Data Security
-
-NetApp supports encryption via AWS Key Management Service and NetApp Encryption Solution (volume or aggregate encryption), both offering 256-bit encryption. Virus scanning is integrated with McAfee Antivirus (VSES), using an external scan server. Scanning options include on-access (for SMB/CIFS) and on-demand (for NFS) scanning.
-
-### Backup and DR
-
-Snapshots are point-in-time, read-only file system images that create copies of volumes using pointers, minimizing space consumption. SnapMirror is a tool for replicating data between NetApps, copying volumes and their snapshots. It requires peering relationships between clusters and SVMs, with optional encryption. Baseline copies perform initial full data replication, while subsequent updates copy only the changes. Destination volumes in a SnapMirror relationship are read-only.
-
-### Migration
-
-Tools for migrating from on-prem to AWS include:
-*   **SnapMirror:** Fast, block-level replication that preserves deduplication and compression.
-*   **NetApp XCP:** File-based tool, copying data at the file level with concurrent sessions.
-*   **NetApp Cloud Sync:** Used for AWS migrations, supporting NetApp to NetApp, NFS, SMB, NetApp to S3/EFS, and EFS/S3 to NetApp.
-*   **AWS DataSync:** AWS-provided file-based tool for NetApp to EFS or S3 migrations.
-*   **Silver Peak:** A WAN optimizer for compressing packets.
-
-### Current NetApp Usage and Future Plans
-
-The organization has around 15 NetApp clusters in various AWS regions, hosting approximately 1.3 petabytes of data. Cloud Manager is used for central management, with storage operations maintaining and supporting the NetApps. Monitoring is currently done through SiteScope and WebTool, with plans to use AWS native services. S3 tiering is enabled for most NetApps, and FSx for NetApp ONTAP is under POC. There are also plans to use Terraform for deploying NetApps.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-46-netapps-on-aws.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-46-netapps-on-aws.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 46 NetApps on AWS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - NetApp
-  - AWS
-  - Storage
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 46 NetApps on AWS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
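-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*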
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-47-enterprise-architecture-cloud-standards.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-47-enterprise-architecture-cloud-standards.md
@@ -1,64 +0,0 @@
---
-title: CTP Topic 47 Enterprise Architecture Cloud Standards
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - Enterprise-Architecture
-  - Cloud-Standards
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 47 Enterprise Architecture Cloud Standards
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Enterprise Architecture Cloud Standards
-
-[slide:N]
-The session will cover landing zones, their purpose, the role of enterprise architecture in cloud environments, guardrails, and the need for community input. The speaker, Lindsay, an enterprise architect with a development background, aims to provide a learner's perspective on cloud architecture.
-
-A landing zone is a framework for hosting cloud workloads, focusing on security, compliance, and manageability. Key components include account structure, networking, security, access management, and telemetry. *The account structure aligns with environments (dev, staging, production), and roles define access based on zero trust and least privilege principles.* The landing zone provides pre-configured networking and security, reducing the security review burden on application teams. Centralized logging and auditing are provided within the framework.
-
-Benefits of using landing zones include a pre-designed security model, pre-built compliance, and visible cost control. Infrastructure automation, using Terraform, enables efficient environment configuration. *Terraform allows specifying the desired environment in code, promoting standardization and testability.* Terragrunt, a wrapper for Terraform, aids in generating different environments. The framework eliminates reinvention, allowing application teams to focus on application-specific tasks.
-
-Enterprise architecture helps articulate the cloud architecture, informing application teams about available resources and requirements. Guardrails capture mandatory requirements and optimal practices for scalability, cost minimization, and flexibility. The enterprise architecture team has created a page on the intranet site with business architecture concepts, data connections, application information, and technology roadmaps.
-
-The cloud guardrails document covers design concepts, capabilities, and best practices. Key design concepts include cloud-first, leveraging well-architected frameworks, infrastructure as code (Terraform), and resource tagging. The document provides guidance on executable packaging, functional partitioning, capacity management, and identity management.
-
-Executable packaging prioritizes using existing cloud services and managed services to minimize custom code. Functional partitioning involves breaking monolithic applications into smaller, independent blocks or serverless functions. The speaker emphasizes the need for input from application teams to refine the guardrails and incorporate real-world experiences. *We want your knowledge collected here for reuse and to help other app developers down the road.*
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-47-enterprise-architecture-cloud-standards.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-47-enterprise-architecture-cloud-standards.md.bak
@@ -1,50 +0,0 @@
---
-title: CTP Topic 47 Enterprise Architecture Cloud Standards
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - Enterprise-Architecture
-  - Cloud-Standards
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 47 Enterprise Architecture Cloud Standards
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
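-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*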
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-50-ami-roadmap-for-aws-amis.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-50-ami-roadmap-for-aws-amis.md
@@ -1,62 +0,0 @@
---
-title: CTP Topic 50 AMI Roadmap for AWS AMIs
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - AMI
-  - Roadmap
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 50 AMI Roadmap for AWS AMIs
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## AMI Roadmap for AWS AMIs
-
-The Cloud Transformation Program held a learning session to discuss the AMI roadmap for AWS AMIs. The session covered the CCOE AMI roadmap, end-of-life operating systems, AMI notifications, change logs, new features, the process for adding new AMIs, current supported AMIs, and the roadmap.
-
-The CCOE provides hardened AMIs on a bi-monthly basis aligned with security standards. The session focused on the roadmap, not the hardened AMIs themselves. The currently available AMIs include three versions of Ubuntu, CentOS 7 and 8, Red Hat 8.4 ARM, Amazon Linux 2, and four versions of Windows operating systems.
-
-The roadmap includes planned releases for new operating systems. In November, SLES 15 and Red Hat 9 will be released. In January 2023, openSUSE 15 and Amazon Linux 2022 will be added. In March 2023, Rocky 8 and Rocky 9 will be available. May 2023 will see Red Hat 9.4 ARM and Ubuntu 22.04 ARM. *Starting May 2023, all ARM processors related to AMIs will be released.* The order was driven mainly by ADM requirements. Any request to change the prioritization of the roadmap should go through the demand pipeline process.
-
-Windows Server 2008 and 2008 R2 have been end-of-life since January 2020 and CentOS 8 since December 2021; Windows Server 2012 will reach end-of-life by October 2023. Red Hat 7 and CentOS 7 will reach end-of-life by June 2024. AMI notifications are sent via email to those on the CCOE notifications PDL. A change log is now available in the CCRE portal, representing the latest changes from the previous release. *This change log focuses on changes done by CCRE.*
-
-The features contained in the AMIs include domain join services, enabling SSHR, integrating McAfee antivirus services, enabling DNS settings, updating the cloud init process, enabling the SSM client, and edge installations. The process of adding new AMI integration and validation involves integrating services, enabling features, and undergoing a build and test process. The AMIs are shared with every account in the organization, including the AMI itself, EBS volumes, and KMS keys.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-50-ami-roadmap-for-aws-amis.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-50-ami-roadmap-for-aws-amis.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 50 AMI Roadmap for AWS AMIs
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - AMI
-  - Roadmap
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 50 AMI Roadmap for AWS AMIs
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
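-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*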
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-51-architecting-with-aws-purpose-built-databases.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-51-architecting-with-aws-purpose-built-databases.md
@@ -1,68 +0,0 @@
---
-title: CTP Topic 51 Architecting with AWS purpose-built databases
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Database
-  - Purpose-Built
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 51 Architecting with AWS purpose-built databases
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Architecting with AWS Purpose-Built Databases
-
-Femi George, a database sales specialist from AWS, discussed purpose-built databases for modern applications, covering modern applications, the rationale for purpose-built databases, key AWS databases, and the evolving role of DBAs/developers in the cloud.
-
-Modern applications have evolved from client-server models due to changing customer requirements, new devices, diverse data types, and economic considerations. Key questions include scalability, global delivery with low latency, and developer access. The approach involves starting with the use case and selecting the best tool for the job, avoiding a one-size-fits-all approach. *We need to start thinking of the right purpose built database for the right application.*
-
-Considerations for purpose-built databases include application scale, user numbers, access patterns, usage spikes, and performance requirements like latency and availability. Duolingo uses DynamoDB for personalized data, ElastiCache for common words/phrases, and Aurora for transactional data. AWS offers a range of purpose-built databases, including relational (e.g., RDS, Aurora) and NoSQL (key-value, document, in-memory, graph) options, along with time series, ledger, and wide-column databases.
-
-Relational databases are suitable for fixed schemas and maintaining referential integrity. Amazon RDS provides fully managed traditional and open-source databases, handling backups and patching. Data endpoints in RDS facilitate easy application access. Amazon Aurora, a cloud-native database, offers MySQL and PostgreSQL compatibility with enhanced performance, scalability, and security. *Amazon Aurora has two flavors, MySQL and PostgreSQL.* Aurora separates storage and compute, improving IO and availability.
-
-Key-value data is popular among developers and forms the basis of NoSQL databases. Amazon DynamoDB is a key-value and document database with single-digit millisecond performance at any scale, supporting trillions of requests per day. Netflix uses DynamoDB for resilience and low-latency access to JSON documents. Document databases extend key-value stores by enabling deeper querying within JSON files. Amazon DocumentDB is compatible with MongoDB and offers flexible schemas.
-
-Apache Cassandra, a wide-column database, is used for large-scale applications with unstructured schemas. Amazon Keyspaces is a managed service for Cassandra-compatible databases, offering serverless options. In-memory databases, like Amazon ElastiCache (Redis, Memcached), are used for caching, media streaming, session stores, and real-time analytics. Peloton uses ElastiCache Redis for immediate feedback to customers.
-
-Graph databases (e.g., Amazon Neptune) are suitable for fraud detection, social networking, and recommendations. They help uncover correlations that relational databases struggle with. Time series databases (e.g., Amazon Timestream) are designed for high-volume, time-based data analysis, such as data from IoT devices.
-
-The role of the DBA is evolving in the cloud. While AWS manages much of the platform, DBAs still handle tasks like restoring databases, managing access, and optimizing queries. The focus shifts from platform management to application innovation.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-51-architecting-with-aws-purpose-built-databases.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-51-architecting-with-aws-purpose-built-databases.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 51 Architecting with AWS purpose-built databases
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Database
-  - Purpose-Built
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 51 Architecting with AWS purpose-built databases
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
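-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*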
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-58-aws-ec2-image-builder.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-58-aws-ec2-image-builder.md
@@ -1,64 +0,0 @@
---
-title: CTP Topic 58 AWS EC2 image builder
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - EC2
-  - Image-Builder
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 58 AWS EC2 image builder
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## AWS EC2 Image Builder
-
-AWS EC2 Image Builder is a managed AWS service to automate the creation, management, and distribution of AMIs and Docker images using components like image pipelines, image recipes, and infrastructure configurations. Image pipelines define how AMIs are published, including installations, security hardening, and distribution schedules.
-
-Image recipes, written in YAML, define the source AMI for creating an output AMI, while container recipes support Docker images. Components are individual steps executed within the source AMI, such as installing packages or running shell commands. *A component is basically just a particular step that you want to execute in order to achieve the output AMI.* Infrastructure configurations define instance attributes like instance type, VPC, subnet, and security groups. Distribution settings manage the distribution of AMIs across different regions and accounts.
-
-The current AMI publishing process involves OS-specific hardening scripts in GitLab repositories and Jenkins pipelines launching Packer to build and share images. Some product teams have developed parallel image bakeries, while others use manual processes with limited automation. The current approach has shortcomings, including longer turnaround times for modifications, AMI compatibility issues across landing zones, and limited automation in manual image bakeries. *Due to these limitations and these things what happens is eventually the product teams try to cater to their requirements by developing some kind of workflow or CI CD pipelines wherein they consume that CCOE AMI and they try to update or install whatever packages they require for their requirement or try to fulfill the functionalities which were lacking in the base AMI.*
-
-Image Builder offers advantages such as increased productivity through automation, efficient image testing during the build process, incorporation of hardening standards, and easy image distribution. It integrates with AWS Organizations and AWS RAM for distributing AMIs across managed accounts. Supported OSes include Amazon Linux, Windows Server, Red Hat Linux, CentOS, Ubuntu, and SUSE, with the list expected to expand.
-
-A POC has implemented end-to-end pipelines for CentOS 7 and Ubuntu 18, using CCOE hardening scripts converted into individual components. Terraform modules are in place for creating resources, with a consolidated module simplifying consumption for product teams. Testing scenarios are incorporated within components to validate execution, and Amazon Inspector is integrated for AMI scanning against security standards. A Lambda workflow triggers scans, sends email notifications, and uploads reports to S3, maintaining a historical record of published AMIs. Qualys scan integration is under evaluation.
-
-Product groups can use a service module to add components to the golden AMI. A component is a script, and components should be added in alphabetical order. The HCL file is used to create and manage components. Logs are published in CloudWatch. The image builder process requires approval, and the approval process is still under development.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-58-aws-ec2-image-builder.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-58-aws-ec2-image-builder.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 58 AWS EC2 image builder
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - EC2
-  - Image-Builder
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 58 AWS EC2 image builder
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
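-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*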
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-66-exposing-the-differences-between-postgresql-rds-and-aurora.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-66-exposing-the-differences-between-postgresql-rds-and-aurora.md
@@ -1,92 +0,0 @@
---
-title: CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - RDS
-  - Aurora
-  - PostgreSQL
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## RDS vs. Aurora: Key Differences
-
-Greg Klau presented a detailed comparison of PostgreSQL on Amazon RDS and Aurora, focusing on performance, cost, and use cases. The session covered choosing between the two, running blue-green and cross-region operations, monitoring, and network performance tweaks for high availability.
-
-### Key Differences and Considerations
-
-*   **Minimum Size and Cost:** RDS offers smaller, cheaper instances suitable for small databases, while Aurora has a higher minimum size and cost due to its architecture.
-*   **Maximum Size and Performance:** Aurora scales to larger databases and offers better IO performance, making it suitable for databases exceeding 10-20 terabytes.
-*   **Auto Scaling:** Aurora offers auto-scaling (Serverless v2) but with limitations on instance shapes, versions, and regions.
-*   **Recovery Time Objective (RTO):** Aurora boasts a 30-second RTO, compared to RDS's two minutes in the event of an AZ failure.
-*   **Storage Flexibility:** RDS provides more storage options (GP2, GP3, provisioned IOPS, magnetic), while Aurora charges per IO.
-*   *With RDS, you get to choose multiple different storage mechanisms.*
-*   *Aurora IO is generally unbounded because they're motivated to give you as much IO as you can consume because they're charging you per IO.*
-
-### Architectural Comparison
-
-*   **RDS:** Uses compute with attached storage (EBS). Multi-AZ setup involves another compute and storage node for failover. Replication across regions is asynchronous.
-*   **Aurora:** Employs six EBS volumes across three availability zones, managed by Amazon. Adding compute uses the same cluster volume, avoiding data replication for read replicas. Aurora Global allows multi-region setups with asynchronous replication.
-*   *With Aurora, you get six EBS volumes. They're spread across three availability zones.*
-*   **Endpoints:** RDS has one endpoint per cluster, while Aurora has separate writer and reader endpoints.
-
-### Database Switchover and Failover
-
-*   **RDS:** Requires blocking access, forcing a new primary, destroying the old cluster, and rebuilding it as a standby.
-*   **Aurora:** Allows clean, managed switchovers using Aurora Global, without re-replication. Failover involves promoting a secondary region and re-adding the failed region as a new global cluster after it recovers.
-
-### Blue-Green Deployments (Aurora MySQL Only)
-
-*   Aurora MySQL supports blue-green deployments for major version upgrades, creating a duplicate environment for testing before switching over. This involves logical replication to a green environment, with guardrails to prevent data loss.
-
-### Monitoring
-
-*   Both RDS and Aurora offer monitoring options via CloudWatch, Grafana, and Performance Insights. Performance Insights provides a view of database load, query performance, and wait times.
-*   Aurora utilizes free local storage (ephemeral SSD) for temporary work, which is fixed per instance type. RDS uses EBS for temporary storage.
-
-### High Availability Performance Tweaks
-
-*   Lower DNS time to live (TTL) to one second for faster failover.
-*   Adjust TCP Keep-Alive settings to detect database failures quickly.
-*   Use JDBC connection string overloading with reader and writer endpoints for resilience.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-66-exposing-the-differences-between-postgresql-rds-and-aurora.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-66-exposing-the-differences-between-postgresql-rds-and-aurora.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - RDS
-  - Aurora
-  - PostgreSQL
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-68-introduction-to-redshift.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-68-introduction-to-redshift.md
@@ -1,60 +0,0 @@
---
-title: CTP Topic 68 Introduction to Redshift
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Redshift
-  - Data-Warehouse
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 68 Introduction to Redshift
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> ## AWS Redshift Architecture and Components
-
-This learning session covers AWS Redshift, focusing on its architecture, management, and key components. The session aims to provide a foundational understanding of Redshift, including columnar versus row-based storage, MPP (Massively Parallel Processing), data compression, and the significance of distribution (dist) and sort keys.
-
-Redshift is a fully managed, petabyte-scale data warehouse solution in the cloud. *It is designed for data warehousing, enabling quick data retrieval from large datasets.* It supports online analytical processing (OLAP) and offers advantages such as easy installation, maintenance of backups, point-in-time recovery, and cross-region disaster recovery.
-
-Redshift architecture involves client applications communicating with Redshift clusters via JDBC and ODBC drivers, connecting to a leader node. The leader node manages schema, warehouse metadata, and query planning, distributing instructions to compute nodes. Compute nodes, determined by the instance type, execute queries across slices, processing data and returning results to the leader node. *The leader node then stores results in buffers for quick retrieval, enhancing performance.* Instance types include dense compute, dense storage, and RA3, each offering varying levels of compute power, RAM, and storage capacity. RA3 is noted for its cost-effectiveness and large storage capacity, utilizing AWS-managed NVMe storage.
-
-Key features of Redshift include MPP, which enables parallel processing of queries across multiple compute nodes, improving query speed and response times. Data storage can be columnar or row-based; columnar storage is optimized for data warehouse operations due to faster performance and efficient memory usage. Data compression techniques, including LZO, further enhance performance by reducing data size. The sort key and dist key play a crucial role in optimizing queries and managing data distribution across compute nodes.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-68-introduction-to-redshift.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-68-introduction-to-redshift.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 68 Introduction to Redshift
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Redshift
-  - Data-Warehouse
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 68 Introduction to Redshift
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-7-saas-landing-zone-design.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-7-saas-landing-zone-design.md
@@ -1,97 +0,0 @@
---
-title: CTP Topic 7 SaaS Landing Zone design
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - SaaS
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 7 SaaS Landing Zone design
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> ## SaaS Landing Zone Design
-
-The session covers the high-level design for the new production SaaS Landing Zone, emphasizing a single landing zone for all products to reduce overhead and costs, a departure from the per-product-group (PG) landing zones used in dev labs. The design incorporates AWS accounts, Terraform modules, and Terragrunt for deployment.
-
-Key components include core accounts (shared, logs, security), baseline accounts (network, DNS, Active Directory), shared services accounts (software factory, cyber, ARC site, monitoring), and product accounts.
-
-*The SaaS landing zone will use a single landing zone for all the product groups.*
-
-### Core Accounts
-
-These accounts are based on the Gruntwork Reference Architecture and include:
-
-*   **Shared Account:** Hosts hardened AMIs and a master Jenkins server for managing deployments. The master Jenkins initiates Lambda functions within each account to trigger Jenkins slaves, enhancing security by preventing direct exposure of the master Jenkins to jobs or credentials.
-*   **Logs Account:** A centralized account for logs from every account (CloudTrail, Config, VPC Flow Logs), accessible primarily to the security team, with read access for products to their own logs.
-*   **Security Account:** Hosts IAM roles inherited within each account, with the ability for account owners to attach additional policies to restrict role usage.
-
-### Baseline Accounts
-
-These accounts are essential for product functionality and include:
-
-*   **Network Account:** Contains a regional transit gateway connecting all accounts, with a Check Point appliance that monitors traffic using a tagging approach. Resources require specific tags to reach destinations such as the internet or on-prem networks.
-*   **DNS Account:** Hosts Route 53, with each product having its own hosted zone for managing DNS records.
-*   **Active Directory Account:** Includes two AD nodes for domain joining and controlling resource access.
-
-### Shared Services Accounts
-
-These accounts provide internal production services to product accounts:
-
-*   Software Factory accounts (45 hubs, Octane Hub, Artifactory).
-*   Cyber account (Qualys).
-*   ArcSight account.
-*   Monitoring account (OBM, potentially SiteScope).
-
-### Product Accounts
-
-Each product account features a public subnet for internet exposure via a load balancer and internet gateway, while workloads reside in private subnets. A web application firewall (WAF) monitors incoming traffic, and CloudFront is available as a CDN.
-
-*The workload itself is going to be under private subnet.*
-
-### Automation and Deployment
-
-Terraform is used for automation, with each account having its own GitHub repository. Changes to Terraform code trigger Jenkins via a GitHub hook, initiating a deployment process through the management VPC, Lambda, and ECS cluster. A review process, including code review and plan output review, is implemented before applying changes, with staging environments used for testing before production deployment.
-
-### Remote Access
-
-Remote access is transitioning from Check Point VPN to Pulse VPN, requiring operators to use a VPN client and authenticate against AD. Future plans involve SD1 replacing some network components.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-7-saas-landing-zone-design.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-7-saas-landing-zone-design.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 7 SaaS Landing Zone design
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Landing-Zone
-  - SaaS
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 7 SaaS Landing Zone design
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-72-implementing-an-enterprise-dr-strategy-using-aws-backup.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-72-implementing-an-enterprise-dr-strategy-using-aws-backup.md
@@ -1,65 +0,0 @@
---
-title: CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - DR
-  - Backup
-  - Enterprise
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> ## Implementing an Enterprise DR Strategy Using AWS Backup
-
-Sabith from AWS discusses disaster recovery (DR) strategies using AWS Backup, differentiating between high availability and disaster recovery. He recaps basic concepts like RTO and RPO, introduces AWS Backup, and presents reference architectures.
-
-*We should always be prepared for a situation where everything fails all the time.* The shared responsibility model defines AWS's and the customer's roles in ensuring a resilient cloud environment. Human errors, technical failures, and natural disasters are the major categories to consider when creating DR plans.
-
-High availability ensures a system performs its functions, measured by mean time between failures. Disaster recovery focuses on data loss prevention and recovery, while high availability focuses on system uptime and service availability.
-
-Recovery Point Objective (RPO) defines the acceptable data loss, while Recovery Time Objective (RTO) defines the acceptable downtime. Architectural patterns range from multi-site active-active (minimal interruption, high cost) to backup and restore (lower cost, longer interruption). AWS Backup is a fully managed, policy-based backup service that simplifies data protection. It supports numerous resource types and integrates with AWS Organizations for cross-account backup copies.
-
-AWS Backup uses backup plans to define what, when, and how to back up, storing recovery points in backup vaults. It integrates with IAM policies for access control and AWS Backup Audit Manager (BAM) for compliance reporting. AWS Backup integrates with underlying services through data plane and control plane integrations. Full backups capture all data, while incremental backups only capture changes since the last backup.
-
-AWS Backup offers immutable recovery points, automated scalability, and compliance features. Vault Lock in compliance mode prevents even root users from deleting recovery points until their lifecycle ends, deterring ransomware. Customers often use a vault or bunker account for storing backup copies, separate from workload accounts, to protect against compromises. A forensic account can be used to regularly test recovery points and scan for malware.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-72-implementing-an-enterprise-dr-strategy-using-aws-backup.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-72-implementing-an-enterprise-dr-strategy-using-aws-backup.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - DR
-  - Backup
-  - Enterprise
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-73-aws-backup-implementation-of-the-cloud-transformation-program.md
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-73-aws-backup-implementation-of-the-cloud-transformation-program.md
@@ -1,57 +0,0 @@
---
-title: CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Backup
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> The session covers the AWS Backup implementation of the Cloud Transformation Program (CTP), focusing on the CTP backup strategy, AWS Backup Audit Manager, and the AWS Backup module. The SRE core, SRE product, and architecture teams collaborated on a design that gives product groups flexibility in their backup strategies.
-
-Key points include the assumed backup policy for production workloads, which requires customer data to be backed up regularly (at least once every 24 hours), retained for at least 30 days, and stored in two backup locations. AWS Backup was adopted as the strategic backup tool in AWS for the Cloud Transformation Program to standardize backup processes. An SRE module was developed to let product groups create and control their own backups in line with the assumed backup policy, enabling independent backup and restore operations in their DR accounts.
-
-AWS Backup was chosen because it is a native service managed by AWS that simplifies data protection at scale and supports multiple AWS resource types. It supports tag-based backup plans, cross-account and cross-region backups, immutable backups, out-of-the-box audit reports and frameworks, and point-in-time recovery for S3 and RDS. The design takes initial backups within the source accounts and copies them to a remote account and region, ideally a dedicated DR account for each production workload account. *This keeps backups within the DR account for immediate restore, avoiding time-consuming data copies.* If a DR account is unavailable, a Databunker account can serve as a centralized account for storing backups. The SRE backup module simplifies the adoption of AWS Backup by creating AWS Backup plans, selections, local AWS Backup vaults, KMS key policies, additional vaults in the DR account, Enroll policies, lifecycle policies, SNS topics, audit reports, and optional point-in-time restore for S3 and RDS. *The SRE modules were adjusted to optionally create custom KMS keys, which is a fundamental requirement for having a remote account and region for the AWS Backup processes.*
-
-AWS Backup Audit Manager provides out-of-the-box reports and compliance reports. Reports can be exported to an S3 bucket in CSV or JSON format and show the status of backups, the resources backed up, creation date, recovery point, backup duration, and size. SNS notifications can be configured to send alerts on backup status. The AWS Backup Audit Manager framework includes controls that evaluate backup practices and feed the compliance reports. The controls check that resources are protected by a backup plan, that minimum frequency and retention are met, that recovery points cannot be deleted manually, that recovery points are encrypted, and that cross-region and cross-account copies are scheduled.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/ctp-topic-73-aws-backup-implementation-of-the-cloud-transformation-program.md.bak
+++ b/SRE/01_AWS-Landing-Zone/ctp-topic-73-aws-backup-implementation-of-the-cloud-transformation-program.md.bak
@@ -1,50 +0,0 @@
---
-title: CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/01_AWS-Landing-Zone
-tags:
-  - AWS
-  - Backup
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/01_AWS-Landing-Zone/learning-sessions-standard-amis-updates-20231205-160324-meeting-recording-2.md
+++ b/SRE/01_AWS-Landing-Zone/learning-sessions-standard-amis-updates-20231205-160324-meeting-recording-2.md
@@ -1,26 +0,0 @@
-
-
-# Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)
-
-## Standard AMI Updates and Overview
-
-The session provides a high-level overview of, and updates on, the standard Amazon Machine Images (AMIs). The standard AMIs are based on AWS AMIs but add OS hardening, the latest patches, and security updates. They also include domain-join support, security tools, endpoint protection, access integration, a Qualys agent, the SSM agent, DNS settings, Microsoft Edge on Windows AMIs, and GP3 EBS storage.
-
-The AMIs are built, tested, and shared to all AWS accounts every two months, and are immediately available as private AMIs. Currently, 23 different AMIs are supported, including various versions of Amazon Linux, CentOS, Oracle Enterprise Linux, Red Hat, Rocky Linux, SUSE Linux, Ubuntu, and Windows servers. The latest three releases are available in 12 regions, and older AMIs are archived for 12 months.
-
-The AMI release process follows a standard software release process, with changes developed on feature branches and merged into an integration branch. Jenkins multi-branch pipelines are used for building and testing the AMIs, including scripted tests and AWS Inspector. The publishing process involves copying the AMIs to different regions and sharing them to multiple organizations, with encryption and automatic creation of necessary grants. *The AMIs are then thrown through all of the test suites, and we'll see a couple of those as they come up in later slides, and then we verify that nothing seems to have regressed at that point.*
-
-## Roadmap, Notifications, and End-of-Life
-
-The current roadmap includes a future release of Amazon Linux 2023 (x64), planned for January. New AMI requests must go through the demand pipeline and take approximately 60 days to release. AMI notifications are sent out with each release, including links to relevant documents and the portal. A change log in the portal details the changes included in each release.
-
-Several operating systems are reaching end-of-life, including CentOS 7 and Red Hat 7 in June 2024. *CentOS 7 will be replaced by Rocky Linux, which is already available as a standard AMI.* OpenSUSE Leap 15 and OEL 7 will reach end-of-life in December 2024.
-
-## New Features and Validation
-
-New features are injected into the release cycles based on various inputs, such as the migration from Trellix to SentinelOne. The AMIs are designed to work across multiple landing zones and domain controller environments. The new landing zone uses secrets instead of parameter stores, and all automations now use cloud-init. AMI utilization is monitored to track how many AMIs are in use and how frequently they are used.
-
-Robot Framework has been integrated to automate basic test cases and validations, reducing the validation time for one AMI from three to four days to 60 minutes. An SSM patching solution is available for long-running instances that cannot be refreshed frequently. The AMIs are validated and tested against the highest security standards, with penetration testing conducted periodically.
-via model google/gemini-2.0-flash
-
-Cached · google/gemini-2.0-flash
--- a/SRE/01_AWS-Landing-Zone/learning-sessions-standard-amis-updates-20231205-160324-meeting-recording-2.md.bak
+++ b/SRE/01_AWS-Landing-Zone/learning-sessions-standard-amis-updates-20231205-160324-meeting-recording-2.md.bak
@@ -1,51 +0,0 @@
---
-title: "Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/01_AWS-Landing-Zone"
-tags:
-  - AWS
-  - AMI
-  - Updates
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4"
-audio-source: ""
-status: raw
---
-
-# Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4`
-
-**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> 待转录后由 LLM 生成
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/02_IAM/ctp-topic-11-ad-integration-and-login-using-ad-accounts.md
+++ b/SRE/02_IAM/ctp-topic-11-ad-integration-and-login-using-ad-accounts.md
@@ -1,61 +0,0 @@
---
-title: "CTP Topic 11 AD Integration, and Login using AD accounts"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/02_IAM"
-tags:
-  - AWS
-  - AD
-  - IAM
-  - SSO
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 11_ AD Integration, and Login using AD accounts.mp4"
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 11 AD Integration, and Login using AD accounts
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 11_ AD Integration, and Login using AD accounts.mp4`
-
-**Type:** VIDEO | **Category:** 02_IAM
-
-**Status:** ✅ 已完成（Gemini 摘要）
-
---
-
-## 摘要
-
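-> This DevOps Cloud Learning Session, presented by Niranjan, focuses on improving Jenkins authentication and on automated quality checks for Terraform code. The video first introduces the integration of Jenkins with the SW Infra Active Directory (AD). With this integration, the team no longer creates local users by hand, and users log in with their AD accounts. This simplifies account management during onboarding and offboarding and lays the groundwork for future role-based access control (RBAC). Authentication is integrated today; the next step is fine-grained permission management through AD groups (for example read-only, read-write, and pipeline-creation permissions).
-> 
-> The second part shows how to use the `pre-commit` framework to embed automated checks in the CI/CD pipeline so that "bad code" or security vulnerabilities do not reach production. Niranjan demonstrates three core tools: `terraform fmt` for consistent code formatting, `TFLint` for validating configuration logic and argument completeness, and `Checkov` for static security analysis (for example, detecting security groups that are not attached to any instance).
-> 
-> For the workflow design, the presenter stresses the "shift-left" idea: each commit on a feature branch triggers only the automated checks; a pull request (PR) triggers the checks plus `terraform plan`; and the final `terraform apply` runs only after the code has been merged into the Master branch and manually reviewed. This layered governance greatly improves the security and stability of infrastructure as code (IaC).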
-
---
-
-## 关键概念
-
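- **Active Directory (AD) Integration**: Connects the Jenkins Security Realm to the corporate Active Directory so that user identities are authenticated centrally and managed automatically.
- **RBAC (Role-Based Access Control)**: Role-based access control, in which AD groups determine the specific operations a user may perform in Jenkins.
- **Pre-commit Framework**: A framework for managing and maintaining multi-language pre-commit hooks, intended to catch simple problems before code is committed to the repository.
- **terraform fmt**: Terraform's built-in formatter, which rewrites configuration files into the canonical format defined by the official style conventions.
- **TFLint**: A static analysis tool for Terraform that checks code for human errors, deprecated syntax, and missing arguments.
- **Checkov**: A static code analysis tool that scans infrastructure as code (IaC) for security and compliance misconfigurations.
- **Static Analysis**: The process of finding potential bugs or security vulnerabilities by inspecting source code without running it.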
-
---
-
-## 相关视频
-
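-> [!info]+ Cross-reference
-> [[GitHub and Jenkins Integration]]: the prerequisite session referenced in this video, covering how GitHub repositories trigger Jenkins pipelines and how results are reported back.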
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/02_IAM/ctp-topic-5-aws-identity-and-access-management-iam.md
+++ b/SRE/02_IAM/ctp-topic-5-aws-identity-and-access-management-iam.md
@@ -1,79 +0,0 @@
---
-title: CTP Topic 5 - AWS Identity and Access Management (IAM)
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/02_IAM
-tags:
-  - AWS
-  - IAM
-  - Security
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4
-audio-source: ""
-status: summarized (Gemini 摘要)
---
-
-# CTP Topic 5 - AWS Identity and Access Management (IAM)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4`
-
-**Type:** VIDEO | **Category:** 02_IAM
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## 摘要
-
-> ## AWS Identity and Access Management (IAM) Explained
-
-This session covers AWS Identity and Access Management (IAM), focusing on users, groups, roles, and policies, and how they relate to accessing AWS via the CLI and federation. The discussion emphasizes accessing landing zone accounts and determining the appropriate method.
-
-Key points include:
-*   IAM dashboard resources: users, groups, customer managed policies, roles, and identity providers.
-*   Federated access: Users gain access to accounts via Active Directory (AD) groups, which grant specific roles.
-*   `accounts.json`: This file, located in the root of every landing zone, contains a list of account numbers.
-*   IAM users are primarily for service accounts; federation is the preferred method for user management.
-*   User groups are less relevant due to the focus on federated user management.
-*   Roles are used by services or users and tie together permissions.
-*   Policies define permissions, specifying what actions are allowed or denied on resources.
-*   *Roles don't enable actions; they tie together who can do something and what they can do.*
-*   Policies can be AWS-managed or customer-managed.
-
-Federated users log in via their organization's AD, which maps to an IAM role. Command-line access via federation requires a tool called PFSSO. *We only want to allow the access that is strictly required.* This is the least-privilege model: grant only the permissions that are necessary.
-
-Configuring permissions typically involves a service accessing AWS resources, which requires a role and a policy. Terraform modules can define IAM roles, including an assume role policy and inline policy blocks. Policies should be fine-grained, limiting access to only the required resources. Inline policies are tied to a specific role, while managed policies can be reused across multiple roles.
-
-Key takeaways:
-*   Federation is the primary method for user access.
-*   Roles and policies are central to managing permissions.
-*   Least privilege is a guiding principle when defining policies.
-*   Consider using inline policies for role-specific permissions and managed policies for reusable permissions.
-*   When defining Terragrunt modules, ensure policies are not too wide open.
-*   VSM requests are required to gain account access through Federation.
-*   User attributes beyond usernames are supported, including additional STS values and tags.
-*   Cross-account role assumption is possible, where principals in specified accounts can assume a role.
-
-
---
-
-## 关键概念
-
-
-
---
-
-## 行动项
-
-
-
---
-
-## 相关视频
-
-> 配对视频笔记链接（生成后填入）
-
---
-
-*最后更新: 2026-04-14*
--- a/SRE/02_IAM/ctp-topic-5-aws-identity-and-access-management-iam.md.bak
+++ b/SRE/02_IAM/ctp-topic-5-aws-identity-and-access-management-iam.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 5 - AWS Identity and Access Management (IAM)
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/02_IAM
-tags:
-  - AWS
-  - IAM
-  - Security
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 5 - AWS Identity and Access Management (IAM)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4`
-
-**Type:** VIDEO | **Category:** 02_IAM
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
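-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*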
--- a/SRE/02_IAM/learning-sessions-identity-governance-vsm-replacement-20231128-160326-meeting-re.md
+++ b/SRE/02_IAM/learning-sessions-identity-governance-vsm-replacement-20231128-160326-meeting-re.md
@@ -1,32 +0,0 @@
---
-title: "Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/02_IAM"
-tags:
-  - Identity-Governance
-  - VSM
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4`
-
-**Type:** VIDEO | **Category:** 02_IAM
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Identity Governance and VSM Replacement
-
-The learning session covers identity governance, focusing on the replacement of Virtual SM (VSM), a DXC tool, with Identity Governance (IG). The objectives are to understand identity governance and why it is needed, Micro Focus IG, how it is used with Control Tower and counter-automation, the plan to replace VSM with IG, and how to use the IGA portal.
-
-Identity governance is a framework for managing digital identities efficiently, minimizing risk, and maintaining compliance. Key questions addressed by identity governance include: *who currently has access to our systems, who should have access, and how is the access being done?* It comprises identity management, access management, and identity auditing. Micro Focus's IGA governs access to resources, providing workflows for approving and revoking access, as well as for monitoring and auditing it. IG is used to provide access to both internal and external users, including contractors, with time-limited access.
-
-IG integrates with AWS Identity Center to provide access to resources via IAM. Groups in Active Directory represent roles, and IG governs access to these groups. A bridge is established using Azure AD Domain Services for authentication. IG controls Active Directory groups and workflows, while IAM connects to Azure to the Cobdom domain. The plan is to replace VSM with IG for all accounts, using the same architecture as VSM, but with IG connected to the Coptum domain. Changes include adding owner information to Active Directory groups and automating the account owner as the first-level approver. A POC is underway to validate the architecture and process. Gaining access involves searching for the resource in the IG portal, requesting access, and filling out a form. The request goes through an approval flow, and upon approval, access is granted automatically.
--- a/SRE/02_IAM/learning-sessions-identity-governance-vsm-replacement-20231128-160326-meeting-re.md.bak
+++ b/SRE/02_IAM/learning-sessions-identity-governance-vsm-replacement-20231128-160326-meeting-re.md.bak
@@ -1,50 +0,0 @@
---
-title: "Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/02_IAM"
-tags:
-  - Identity-Governance
-  - VSM
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4"
-audio-source: ""
-status: raw
---
-
-# Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4`
-
-**Type:** VIDEO | **Category:** 02_IAM
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
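-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*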
--- a/SRE/03_Terraform/ctp-topic-12-using-ses-smtp-service-terraform-module.md
+++ b/SRE/03_Terraform/ctp-topic-12-using-ses-smtp-service-terraform-module.md
@@ -1,66 +0,0 @@
---
-title: "CTP Topic 12 Using SES SMTP service terraform module"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - AWS
-  - Terraform
-  - SES
-  - Email
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 12_ Using SES SMTP service terraform module.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 12 Using SES SMTP service terraform module
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 12_ Using SES SMTP service terraform module.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
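-## Summary
-
-> This session covers how Micro Focus, as part of its cloud transformation, uses AWS SES (Simple Email Service, occasionally rendered as "ACS" in the transcript) to replace the traditional on-premises (on-prem) SMTP gateway. It is presented by Christian Deckelmann and Filos Christolakis and covers the background of SES, the technical architecture, the Terraform module-based deployment approach, and caveats for use.
->
-> Christian notes that, as workloads move to the cloud, relying on the on-prem SMTP gateway (such as `smtbmicrofocus.com`) is no longer efficient, and that SES is currently the only cloud email-sending solution approved by the cybersecurity department. Filos walks through the SES Terraform module the team developed. It wraps the SMTP endpoint configuration so that existing applications can integrate over standard SMTP without being refactored to use the SES API.
->
-> In terms of implementation, the solution requires a VPC endpoint in the application VPC for network security, and uses IAM user credentials as the SMTP authentication details, which are stored securely in AWS Secrets Manager. The module also automates DKIM verification and the creation of DNS records in Infoblox.
->
-> The session stresses two manual follow-up steps. The first is requesting removal from the SES sandbox (Sandbox Mode) to raise sending limits and allow mail to external addresses. The second is manually updating the DNS TXT record to verify domain ownership, because Terraform currently struggles to append to the same TXT record value when multiple AWS accounts share one domain. In future, the module is planned to gain security enhancements such as recipient address restrictions and credential rotation.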
-
---
-
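-## Key Concepts
-
- **AWS SES (Simple Email Service)**: AWS's cloud-based email-sending service, which supports sending email through either an API or an SMTP interface.
- **SMTP Endpoint**: A regional SMTP endpoint provided by SES that lets traditional applications reach the cloud email service over standard SMTP.
- **Sandbox Mode**: The default restricted state of AWS SES, which only allows a small volume of mail to verified addresses; production access must be requested through a support ticket.
- **DKIM (DomainKeys Identified Mail)**: An email authentication standard that adds a digital signature to messages to prevent spoofing and ensure message integrity.
- **Infoblox**: The company's internal DNS management system, used to store and manage the DNS records needed to verify domain ownership.
- **VPC Endpoint**: A private endpoint through which applications communicate with the SES SMTP service without traversing the public internet, for security.
- **IAM User for SES**: An identity created specifically for SES; its Access Key and Secret Key are converted and used as the SMTP username and password.
- **Secrets Manager**: AWS's credential management service, used to securely store and retrieve the SES SMTP authentication details.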
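The conversion mentioned under "IAM User for SES" can be sketched as follows. This follows the derivation published in the AWS SES documentation, in which the access key ID serves as the SMTP username unchanged and the SMTP password is derived from the secret access key and the region. The function name and the sample key are placeholders, and whether the team's Terraform module performs the derivation this way is not stated in the session.

```python
import base64
import hashlib
import hmac

# Constants from the AWS-documented SES SMTP password derivation (Signature Version 4).
DATE = "11111111"
SERVICE = "ses"
MESSAGE = "SendRawEmail"
TERMINAL = "aws4_request"
VERSION = 0x04


def _sign(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the signing chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def smtp_password_from_secret(secret_access_key: str, region: str) -> str:
    """Derive the SES SMTP password for an IAM user's secret access key in one region."""
    signature = _sign(("AWS4" + secret_access_key).encode("utf-8"), DATE)
    signature = _sign(signature, region)
    signature = _sign(signature, SERVICE)
    signature = _sign(signature, TERMINAL)
    signature = _sign(signature, MESSAGE)
    # Prefix the version byte, then base64-encode the result.
    return base64.b64encode(bytes([VERSION]) + signature).decode("utf-8")


if __name__ == "__main__":
    # Placeholder secret; a real key would be read from Secrets Manager, never hard-coded.
    print(smtp_password_from_secret("EXAMPLE-SECRET-KEY", "us-east-1"))
```

Because the region is part of the derivation, a password generated for one region is not valid against the SMTP endpoint of another.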
-
---
-
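-## Related Videos
-
-> [!info]+ Cross-references
-> [[VPC Wrapper Module Session]]: the SES module discussed in this video depends on the SMTP VPC endpoint pre-configured by the VPC Wrapper module.
-> [[Landing Zone Architecture Overview]]: as a common service component, the SES module is integrated into different Landing Zone environments such as DevLab and SAS.
-> [[Terraform & Terragrunt Best Practices]]: the video discusses using Terragrunt pre-hooks and post-hooks to handle the manual DNS verification step in SES deployments.
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*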
--- a/SRE/03_Terraform/ctp-topic-16-cross-account-terraform-modules.md
+++ b/SRE/03_Terraform/ctp-topic-16-cross-account-terraform-modules.md
@@ -1,62 +0,0 @@
---
-title: "CTP Topic 16 Cross-account Terraform modules"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - Terraform
-  - Cross-Account
-  - Modules
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 16_ Cross-account Terraform modules.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 16 Cross-account Terraform modules
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 16_ Cross-account Terraform modules.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
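-## Summary
-
-> This session, presented by Fibos, focuses on how to implement and manage **cross-account Terraform modules** in a multi-account AWS environment. In complex cloud architectures, a single module often needs to create resources across several accounts (for example, configuring DNS in the Infoblox account while deploying the application in a workload account). However, the original Gruntwork pipeline was designed mainly for a single account, and granting accounts direct access to one another carries a significant security risk (a compromise of one account could spread to the whole estate).
->
-> To address this, the team designed a centralized deployment scheme based on the **Shared Account**. The core idea is to use the Shared Account, which hosts Jenkins, as a relay. When Jenkins detects a `cross-account.json` marker file in a module directory, it triggers the ECS Deploy Runner in the Shared Account. This runner is granted special permissions that let it assume two key roles in the target account: the `TF state bucket accessor`, used to read the state file, and the `cross-account ECS deploy runner role`, used to deploy resources.
->
-> The architecture achieves three goals. The first is **security**: it avoids direct trust between workload accounts and concentrates permission control in the strictly audited Shared Account. The second is **automation**: Jenkins automatically identifies the module type and selects the correct deployment path. The third is **reusability**: module code no longer hard-codes roles for specific accounts, which makes it more flexible. Fibos also demonstrated in detail how to modify the root `terragrunt.hcl` configuration file to support this global role-switching logic, and briefly covered how role handling differs between local development and automated Jenkins deployments.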
-
---
-
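-## Key Concepts
-
- **Cross-account Modules**: Terraform modules that configure multiple providers so that a single module can create or manage resources in several AWS accounts at once.
- **Shared Account**: The core management account of the Landing Zone, which hosts Jenkins, the image registry and other shared services, and acts as the trust source for cross-account deployments.
- **ECS Deploy Runner (EDR)**: A Docker container running on ECS that executes the actual Terraform plan and apply commands; it is the execution unit of the pipeline.
- **TF state bucket accessor**: A dedicated IAM role that only allows the deployment tooling to access the Terraform state files stored in the target account's S3 bucket.
- **Cross-account ECS deploy runner role**: A role deployed in the target account that lets the Shared Account's runner switch roles to gain permission to create resources in that account.
- **cross-account.json**: A marker file, by convention placed in the module directory, that tells Jenkins the module needs the cross-account deployment logic.
- **Root Terragrunt HCL**: The global Terragrunt configuration file, which defines the remote state storage and role-switching logic shared by all modules.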
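The marker-file convention described above amounts to a simple routing decision. The sketch below shows the idea only: the function name and the two returned labels are hypothetical, and the real Jenkins pipeline logic is not shown in the session notes.

```python
from pathlib import Path

# The marker file name comes from the session; everything else here is illustrative.
MARKER_FILE = "cross-account.json"


def choose_deploy_path(module_dir: str) -> str:
    """Return which deployment path a module should take, based on the marker file."""
    if (Path(module_dir) / MARKER_FILE).is_file():
        # Marker present: deploy through the runner in the Shared Account.
        return "cross-account"
    # No marker: use the standard single-account pipeline.
    return "single-account"


if __name__ == "__main__":
    print(choose_deploy_path("."))
```

Keeping the decision in a marker file rather than in module code means that a module opts in to cross-account deployment without hard-coding any account-specific role.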
-
---
-
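-## Related Videos
-
-> [!info]+ Cross-references
-> [[Gruntwork Pipeline Deep Dive]]: explains how the basic single-account Gruntwork pipeline works.
-> [[AWS Multi-account Security Best Practices]]: discusses why direct access between accounts should be restricted (blast radius control).
-> [[Terragrunt Advanced Configuration]]: a deeper look at using Terragrunt's inheritance mechanism to manage complex environments.
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*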
--- a/SRE/03_Terraform/ctp-topic-48-terraform-vs-terragrunt.md
+++ b/SRE/03_Terraform/ctp-topic-48-terraform-vs-terragrunt.md
@@ -1,68 +0,0 @@
---
-title: CTP Topic 48 Terraform vs Terragrunt
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/03_Terraform
-tags:
-  - Terraform
-  - Terragrunt
-  - IaC
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 48 Terraform vs Terragrunt
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Terraform vs. Terragrunt
-
-Bob, an AWS Solutions Architect and Tech Lead, contrasts Terraform and Terragrunt, emphasizing the importance of understanding their differentiation for both high-level strategy/design roles and low-level development/debugging roles.
-
-Terraform, developed by HashiCorp, is a Go application used to provision, change, and version-control resources across various environments. A key selling point is its cloud-agnostic nature. The `plan` command lets users preview changes before they are applied, which is a distinct advantage. *To run Terraform consistently, it ties the desired state to the existing environment using a state file.* For enterprise-scale use, storing this file in a safe, accessible location is crucial, and cloud vendors offer persistence solutions for it.
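The relationship between the desired configuration, the state file and a plan can be illustrated with a toy model. This is a deliberately simplified sketch and not Terraform's actual algorithm: resources are plain dictionaries, and the function name `plan` and the sample resources are assumptions made for illustration.

```python
def plan(desired: dict, recorded: dict) -> list:
    """Compare desired configuration with recorded state and list actions without applying them."""
    actions = []
    for name in sorted(desired.keys() | recorded.keys()):
        if name not in recorded:
            actions.append(("create", name))   # declared but not yet in state
        elif name not in desired:
            actions.append(("destroy", name))  # in state but no longer declared
        elif desired[name] != recorded[name]:
            actions.append(("update", name))   # declared attributes differ from state
    return actions


if __name__ == "__main__":
    desired = {"vpc": {"cidr": "10.0.0.0/16"}, "db": {"size": "large"}}
    recorded = {"vpc": {"cidr": "10.0.0.0/16"}, "db": {"size": "small"}, "old_bucket": {}}
    for action, name in plan(desired, recorded):
        print(action, name)
```

The toy also shows why the state file must be stored somewhere safe and shared: without the recorded side of the comparison, every declared resource would look new.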
-
-Terragrunt is presented as a thin wrapper around Terraform that promotes the DRY (don't repeat yourself) principle. All Terraform commands work with Terragrunt; `terraform plan` becomes `terragrunt plan`. The language, including blocks and attributes, remains consistent. Terragrunt helps manage provider and remote state blocks, which can be complex and error-prone when declared multiple times across different environments. *Terragrunt offers a way to use information in a repeatable way without hard coding values.*
-
-Terraform and Terragrunt have similar commands and languages, but differ in their approach to reusability and state management. Terraform's core is cloud-agnostic, while its vendor-specific parts require separate modules for each cloud provider. Terragrunt helps streamline configurations across environments.
-
-Additional points:
-*   Terraform Enterprise is a CI platform with workspaces.
-*   Gruntwork offers pre-built, customizable modules and a Terraform-native AWS landing zone.
-*   Atlantis integrates Terraform with GitHub for infrastructure provisioning.
-*   Tools like tfsec aid in maintaining security through static code analysis.
-*   Terratest enables test automation for improved stability and velocity in the software delivery pipeline.
-*   Cloud cost estimation tools can help visualize the cost implications of changes before deployment.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/03_Terraform/ctp-topic-48-terraform-vs-terragrunt.md.bak
+++ b/SRE/03_Terraform/ctp-topic-48-terraform-vs-terragrunt.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 48 Terraform vs Terragrunt
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/03_Terraform
-tags:
-  - Terraform
-  - Terragrunt
-  - IaC
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 48 Terraform vs Terragrunt
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
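-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*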
--- a/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-20230808-183322-meeting-recordi.md
+++ b/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-20230808-183322-meeting-recordi.md
@@ -1,30 +0,0 @@
---
-title: "Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - Terraform
-  - CTP
-  - IaC
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-The learning session focuses on ECS deployment using infrastructure as code, presented by JP and Raja M. The session is part of a weekly series on Tuesdays, emphasizing interactive learning with Q&A opportunities. Recordings and presentations are available on a SharePoint site, with notifications sent beforehand.
-
-JP discusses the business and technology background of ECS, while Raja details the ECS module developed within CTP and SRE. The industry faces challenges like unpredictability and the need for agility, pushing businesses towards infrastructure as code. *Businesses have to thrive in the middle of all these challenges and it is forged by code.* Dynamic scaling is crucial due to unpredictable load patterns, requiring technologies to evolve. ECS (Elastic Container Service) is an AWS proprietary technology that integrates with AWS services, offering advantages and challenges compared to EKS or native Kubernetes.
-
-The ECS module, built on the Gruntwork repository, allows creating Docker containers as logical units and supports EC2 instances or Fargate deployments. It features auto-scaling, auto-healing, and canary deployments. The module supports a listener approach for centralized ECS management and integrates with AWS services. *We have implemented the listener approach because we have seen many of the products are, you know, downloading the code from Gruntwork and using it locally.* Prerequisites for using the module include a VPC, an ELB security group, and EFS volume mounting. Configurations can be passed via YAML or JSON, with integration support for AWS CloudWatch, Splunk, Grafana, and Prometheus.
--- a/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-20230808-183322-meeting-recordi.md.bak
+++ b/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-20230808-183322-meeting-recordi.md.bak
@@ -1,50 +0,0 @@
---
-title: "Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - Terraform
-  - CTP
-  - IaC
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
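-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*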
--- a/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-deploying-rds-via-terraform.md
+++ b/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-deploying-rds-via-terraform.md
@@ -1,31 +0,0 @@
---
-title: "Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - Terraform
-  - RDS
-  - IaC
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-Greg from the DBRE team discusses deploying RDS via Terraform, advocating its use over the console for deploying RDS instances of any size on AWS. The presentation covers why infrastructure as code is helpful, clarifies the use of Gruntwork modules, and introduces SRE core modules. It also includes technical details and live demos of deployment, maintenance, upgrades, and monitoring and alarming.
-
-Key benefits of infrastructure as code include speed, flexibility, consistency, disaster recovery, documentation, and automation. *The code is the documentation.* There are two main options for deploying RDS: the bare-bones RDS module and the more comprehensive RDS service. The Gruntwork RDS service is recommended due to its pre-built features like KMS key encryption and CloudWatch alarming. The SRE core modules are less fully featured than the Gruntwork service.
-
-To deploy an RDS database, use Terragrunt, a wrapper around Terraform, to keep code clean and avoid repeating variables. *We use Terragrunt, which is basically it's a wrapper around Terraform, and it allows you to keep your code clean and you're not repeating your variables all the time.* Use a tagged release instead of the master branch for stability. Basic variables include the VPC, database type (Oracle, Postgres), port, and license model. For day-two operations like scaling, patching, and major version upgrades, changes are made in the Terragrunt file and applied via GitHub pull requests and Atlantis. Monitoring is achieved through CloudWatch dashboards and alarms, with considerations for burstable instance shapes and CPU credits.
--- a/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-deploying-rds-via-terraform.md.bak
+++ b/SRE/03_Terraform/learning-sessions-cloud-transformation-programme-deploying-rds-via-terraform.md.bak
@@ -1,51 +0,0 @@
---
-title: "Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - Terraform
-  - RDS
-  - IaC
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4"
-audio-source: ""
-status: raw
---
-
-# Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
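-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*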
--- a/SRE/03_Terraform/learning-sessions-ecs-deployment-using-iac-20230808-183322-meeting-recording.md
+++ b/SRE/03_Terraform/learning-sessions-ecs-deployment-using-iac-20230808-183322-meeting-recording.md
@@ -1,32 +0,0 @@
---
-title: "Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - AWS
-  - ECS
-  - IaC
-  - Terraform
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-The learning session focuses on ECS deployment using infrastructure as code, presented by JP and Raja M. The session is part of a weekly series on Tuesdays, emphasizing interactive learning with Q&A opportunities. Recordings and presentations are available on a SharePoint site, with notifications sent beforehand.
-
-JP discusses the business and technology background of ECS, while Raja details the ECS module developed within CTP and SRE. The industry faces challenges like unpredictability and the need for agility, pushing businesses towards infrastructure as code. *Businesses have to thrive in the middle of all these challenges and it is forged by code.* Dynamic scaling is crucial due to unpredictable load patterns, requiring technologies to evolve. ECS (Elastic Container Service) is an AWS proprietary technology that integrates with AWS services, offering advantages and challenges compared to EKS or native Kubernetes.
-
-The ECS module, built on the Gruntwork repository, allows creating Docker containers as logical units and supports EC2 instances or Fargate deployments. It features auto-scaling, auto-healing, and canary deployments. The module supports a listener approach for centralized ECS management and integrates with AWS services. *We have implemented the listener approach because we have seen many of the products are, you know, downloading the code from Gruntwork and using it locally.* Prerequisites for using the module include a VPC, an ELB security group, and EFS volume mounting. Configurations can be passed via YAML or JSON, with integration support for AWS CloudWatch, Splunk, Grafana, and Prometheus.
--- a/SRE/03_Terraform/learning-sessions-ecs-deployment-using-iac-20230808-183322-meeting-recording.md.bak
+++ b/SRE/03_Terraform/learning-sessions-ecs-deployment-using-iac-20230808-183322-meeting-recording.md.bak
@@ -1,52 +0,0 @@
---
-title: "Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/03_Terraform"
-tags:
-  - AWS
-  - ECS
-  - IaC
-  - Terraform
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 03_Terraform
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
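-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*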
--- a/SRE/04_EKS/ctp-topic-29-cloud-monitoring-saas-lz-accounts.md
+++ b/SRE/04_EKS/ctp-topic-29-cloud-monitoring-saas-lz-accounts.md
@@ -1,59 +0,0 @@
---
-title: CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Monitoring
-  - SaaS
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## AWS Cloud Monitoring with OpsBridge
-
-The session covers AWS cloud monitoring using Micro Focus OpsBridge, focusing on a new Cloud Monitoring feature. This containerized solution can be deployed on-prem or on AWS EKS and supports monitoring of over 20 AWS data services, with data stored in the OPTIC Data Lake, which uses Vertica, for performance dashboarding and reporting. The architecture collects CloudWatch metrics using read-only access to the monitored accounts, correlates the data, and updates the configuration management database.
-
-Key points include deployment, monitoring setup, and operations. Cloud Monitoring is enabled within OpsBridge, requiring a one-time IAM role setup in customer accounts for read-only access. *Tag-based monitoring is emphasized as a best practice, with automation to identify missing tags.* The solution uses a single instance to monitor multiple accounts and regions.
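The automation for identifying missing tags is not described in detail in the session, but the core check can be sketched in a few lines. The required tag names, the shape of the resource records and the function name are all assumptions made for illustration; a real implementation would read its inventory from AWS rather than from an in-memory list.

```python
# Hypothetical set of tags that monitoring relies on.
REQUIRED_TAGS = frozenset({"Owner", "Environment"})


def find_missing_tags(resources, required=REQUIRED_TAGS):
    """Map each resource ID to the sorted list of required tags it lacks."""
    missing = {}
    for resource in resources:
        absent = set(required) - set(resource.get("tags", {}))
        if absent:
            missing[resource["id"]] = sorted(absent)
    return missing


if __name__ == "__main__":
    inventory = [
        {"id": "i-0001", "tags": {"Owner": "team-a", "Environment": "prod"}},
        {"id": "i-0002", "tags": {"Owner": "team-b"}},
        {"id": "vol-0003"},
    ]
    print(find_missing_tags(inventory))
```

A report like this can be fed back to the owning teams so that untagged resources do not silently drop out of tag-based monitoring.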
-
-Data consumption occurs via event dashboards, topology views, and performance dashboards. The solution is being developed in collaboration with the product R&D team, with new reporting features expected in the next release. The demo showcased event perspectives, performance dashboards, and topology views, highlighting event details, historical usage, and hierarchical resource presentation. The operational model's impact on application teams was discussed, including data feedback, OpsBridge expertise, and outage detection capabilities.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-29-cloud-monitoring-saas-lz-accounts.md.bak
+++ b/SRE/04_EKS/ctp-topic-29-cloud-monitoring-saas-lz-accounts.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Monitoring
-  - SaaS
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
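-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*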
--- a/SRE/04_EKS/ctp-topic-39-implementing-eks-in-the-aws-lab-landing-zone.md
+++ b/SRE/04_EKS/ctp-topic-39-implementing-eks-in-the-aws-lab-landing-zone.md
@@ -1,66 +0,0 @@
---
-title: CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> Spencer and Guy discuss implementing Elastic Kubernetes Service (EKS) in the AWS landing zone, focusing on a use case with Octane, a Micro Focus SaaS application that is IP-hungry. They faced challenges with the limited range of IP addresses available in AWS labs that run on the Micro Focus network.
-
-The solution involved creating a private subnet within their own address space, not connected to the main subnet, to provide a large number of IPs for EKS to use. *The problem was that this wasn't supported in the EKS sort of solution that was given to us.* They used Terraform and Terragrunt modules to create the lab, working with SRE to enable EKS to create its own subnet and use its own IPs within each pod.
-
-Key points:
-*   The EKS module has a flag for custom networking configuration to control IP allocation.
-*   They demonstrated how to call the EKS module within Terraform code, specifying the subnet and mappings between federated accounts/roles.
-*   They showed how to access the EKS cluster, get pods, and access both internal Micro Focus network resources and external resources from within a pod.
-*   *Within the spec configuration, we basically have to put host network equals true* (that is, `hostNetwork: true` in the pod spec).
-*   They addressed a question about container hardening guidelines, explaining that they had discussions with security teams and implemented strong security measures.
-*   They mentioned that AWS may have contributed to the idea of this solution.
-*   Atlantis cannot currently deploy EKS clusters; a Terragrunt module on Jenkins is used instead.
-*   Mapping roles allows connection to the cluster and visibility of EKS components in the AWS console.
-*   The number of node groups is currently hardcoded but will be made configurable in future versions.
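The "host network equals true" point refers to the `hostNetwork` field of a Kubernetes pod spec. The sketch below generates a minimal pod manifest with that field set; the pod name and container image are placeholders, and the manifest is an illustration rather than the configuration used in the session.

```python
import json


def host_network_pod(name: str, image: str) -> dict:
    """Build a minimal Pod manifest that shares the node's network namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # With hostNetwork enabled, the pod uses the node's network stack.
            "hostNetwork": True,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; JSON manifests are accepted wherever YAML ones are.
    print(json.dumps(host_network_pod("network-test", "busybox"), indent=2))
```

Because a pod on the host network shares the node's ports and interfaces, it is usually worth limiting this setting to the workloads that genuinely need it.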
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-39-implementing-eks-in-the-aws-lab-landing-zone.md.bak
+++ b/SRE/04_EKS/ctp-topic-39-implementing-eks-in-the-aws-lab-landing-zone.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Landing-Zone
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
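-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*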
--- a/SRE/04_EKS/ctp-topic-42-grafana-observability-dashboard.md
+++ b/SRE/04_EKS/ctp-topic-42-grafana-observability-dashboard.md
@@ -1,72 +0,0 @@
---
-title: CTP Topic 42 Grafana Observability dashboard
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - Grafana
-  - Observability
-  - Dashboard
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 42 Grafana Observability dashboard
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Grafana Observability and Dashboards
-
-Grafana is an open-source web application used for data visualization through charts and dashboards. It supports various data sources, including metrics (CPU load, memory usage) and logs (timestamps, debug levels). Data producers like Jenkins, CA servers, and AWS CloudWatch inject data into these sources, which Grafana then visualizes. Grafana does not hold any data of its own; everything it displays has to come from one of these data sources.
-
-The infrastructure architecture has users reaching Grafana through a load balancer in front of an auto-scaling group. Grafana is installed in a monitoring account and configured to access other product team AWS accounts via IAM role policies. A Grafana monitoring role, defined in a Terraform service catalog repo, is assumed to gain access to the various landing zone source accounts.
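The cross-account access described above can be sketched as an IAM trust policy that each source account attaches to its Grafana role. This is an illustration only: the account ID and role name are hypothetical placeholders, and the session did not show the actual policy documents.

```python
import json


def grafana_trust_policy(monitoring_role_arn: str) -> dict:
    """Trust policy letting the monitoring account's role assume a source-account role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": monitoring_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical ARN of the role Grafana runs under in the monitoring account.
MONITORING_ROLE = "arn:aws:iam::111111111111:role/grafana-monitoring"

policy = grafana_trust_policy(MONITORING_ROLE)
print(json.dumps(policy, indent=2))
```

The ARN of the role created in each source account is then what a Grafana data source would reference, matching the notes' remark that data sources are created with specific ARNs.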
-
-Grafana offers user-level and team-level access controls, with roles like editor, viewer, and admin. Data sources are created with specific ARNs to access AWS accounts. Dashboards are dynamic, fetching data based on product team access. A sample dashboard includes CPU, I/O, network, EBS, and estimated charges monitoring. Alerting systems can be configured to notify channels like Microsoft Teams of high CPU usage or service downtime.
-
-### Terraform and Automation
-
-Terraform is used to automate Grafana resource provisioning. Modules exist for data sources and Grafana organizations. A demo scenario simulates onboarding Grafana for a new product group account using LZSAP. The process involves creating folders, calling modules, and using JSON input variables to define organization names and user access.
-
-Dashboards are provisioned with data sources and regions as inputs. Grafana offers flexibility in dashboard layout and data visualization. Product teams can leverage these modules and customize dashboards with application-specific logs or custom CloudWatch metrics.
-
-### Network Monitoring and Roadmap
-
-Network monitoring is achieved using Prometheus as a data source for Check Point and firewall instances. A tool called norm is referenced to fetch metrics via the SNMP protocol. Key dashboards display packet in/out transfers, interface metrics, and CPU/disk usage.
-
-The roadmap includes implementing alerting and notification rules, refining network monitoring dashboards, building application-specific dashboards, and enabling product groups to consume Grafana Terraform modules. The goal is to replace Micro Focus tools with Grafana for end-to-end monitoring. *We would like to build application specific dashboards which can basically give us key insight with respect to our applications that are running over there.*
-
-Grafana offers open-source and paid versions (Grafana Enterprise and Grafana Cloud). User management is currently within the Grafana database but will move to LDAP or SSO.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-42-grafana-observability-dashboard.md.bak
+++ b/SRE/04_EKS/ctp-topic-42-grafana-observability-dashboard.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 42 Grafana Observability dashboard
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - Grafana
-  - Observability
-  - Dashboard
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 42 Grafana Observability dashboard
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
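-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*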
--- a/SRE/04_EKS/ctp-topic-54-esm-saas-log-analytics.md
+++ b/SRE/04_EKS/ctp-topic-54-esm-saas-log-analytics.md
@@ -1,66 +0,0 @@
---
-title: CTP Topic 54 ESM SaaS Log Analytics
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - Log-Analytics
-  - SaaS
-  - ESM
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 54 ESM SaaS Log Analytics
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## ESM SaaS Log Analytics
-
-Jackie, an ITOM ESM SaaS architect, discusses Log Analytics, covering concepts, architecture, regional setup, provisioning, security, and a demo of the solution. He also briefly compares different solutions.
-
-The presentation begins with an overview of the ELK stack (Elasticsearch, Logstash, Kibana) and its open-source alternative, OpenSearch. Beats agents collect logs from the applications; Logstash then aggregates and parses them so that each field carries meaning, and the result is stored in Elasticsearch or OpenSearch. Kibana serves as the front end for log visualization and analysis.
-
-*The application collects your log, it's called the Beats.* The architecture involves two VPCs: one for the application and another for logging. Filebeat, running as a container, continuously ships logs from the application VPC to the logging VPC. Logstash processes these logs, and OpenSearch stores them. End users view logs via Kibana, connecting from a specified network. Redis can optionally be placed in front of Logstash as a buffer to keep it from being overloaded.
-
-For legal reasons such as GDPR, farms are split by region, with farms in Oregon, the US, and Europe. Provisioning is done via CloudFormation or Terraform, but security hardening and continuous optimization remain challenging. Security measures include encryption at rest (encrypted nodes and hardware-level encryption on NVMe devices) and in transit (TLS 1.2). Traffic between VPCs stays private and does not cross the internet. Index-based access control and RBAC are implemented for different user roles.
-
-A demo shows how to search for specific IDs or services within the logs. The session then compares Logz.io, AWS OpenSearch, self-hosted ELK, and Micro Focus OBA. Logz.io is a managed ELK solution. ELK is easy to configure but complex to manage, and it supports fine-grained access control. OBA is a more mature commercial option with automated clustering and column-level access control.
-
-Cost estimates are provided based on a single farm usage with 14 days retention and 100GB processed daily. Logz.io costs around $4,000, while AWS OpenSearch costs around $1,500 or less. Self-hosted options can be very low cost but require more maintenance. Availability SLAs vary, with Logz.io offering 99.8% and AWS OpenSearch offering 99.9%. Disaster recovery is covered by the vendor for Logz.io, while AWS OpenSearch automatically captures snapshots.
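The sizing basis quoted above (100 GB processed per day, 14 days of retention) implies a steady-state data volume that is easy to work out. A rough sketch: the replica factor is an assumption added for illustration, and index overhead and compression are ignored.

```python
def retained_gb(daily_gb: float, retention_days: int, replicas: int = 0) -> float:
    """Steady-state volume held in the cluster once the retention window is full."""
    return daily_gb * retention_days * (1 + replicas)


primary_only = retained_gb(100, 14)       # primary copies only
with_replica = retained_gb(100, 14, 1)    # assumes one replica per shard
print(primary_only, with_replica)
```

With these inputs the primary data alone comes to 1,400 GB, which is the figure a storage estimate for any of the compared options would start from.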
-
-Recommendations for starting with Log Analytics include beginning with Logz.io for its trial period, then transitioning to AWS OpenSearch or self-hosted options for more control. The presentation concludes with a Q&A session covering GDPR requirements, log acquisition, cost details, scaling, and comparisons to other solutions. *We have already built up all the farms.*
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-54-esm-saas-log-analytics.md.bak
+++ b/SRE/04_EKS/ctp-topic-54-esm-saas-log-analytics.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 54 ESM SaaS Log Analytics
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - Log-Analytics
-  - SaaS
-  - ESM
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 54 ESM SaaS Log Analytics
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
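-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*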
--- a/SRE/04_EKS/ctp-topic-59-achieving-reliability-with-amazon-eks.md
+++ b/SRE/04_EKS/ctp-topic-59-achieving-reliability-with-amazon-eks.md
@@ -1,65 +0,0 @@
---
-title: CTP Topic 59 Achieving reliability with Amazon EKS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Reliability
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 59 Achieving reliability with Amazon EKS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## EKS Reliability with AWS
-
-Surav Paul, a Senior Solutions Architect from AWS, presented on EKS (Elastic Kubernetes Service), covering container offerings and reliability practices. The session aimed to be interactive, encouraging questions about shared responsibility models, reliability-based practices, application reliability, and data plane reliability.
-
-When considering container offerings on AWS, users can choose between Amazon Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS). ECS is recommended for those starting their container adoption journey, offering a simple interface with native AWS service integrations. EKS is suitable for those familiar with the Kubernetes ecosystem, providing flexibility with open community initiatives. *ECS is a more AWS opinionated way of running containers.* Both ECS and EKS offer multiple compute options, including virtual machines (EC2 instances), serverless deployments (AWS Fargate), and on-prem deployments.
-
-Reliability in a system means it offers predictable behavior even when failures occur. Key concerns include failure detection, graceful service degradation, deterministic failure modes, self-healing capabilities, and on-demand scaling. Reliability concerns are grouped under application, control plane, and data plane categories. The shared responsibility model dictates that AWS manages control plane components (state store, scheduler, controller manager, API servers), while customers manage aspects like worker nodes, operating systems, and application configurations. *With Fargate, you don't have to worry about managing the nodes or worrying about patching or upgrading the nodes.*
-
-Application reliability involves avoiding singleton pods and spreading application pods across availability zones using pod anti-affinity or topology spread constraints. Topology spread constraints offer finer-grained control over workload distribution. Collecting metrics via the metrics server is crucial for scaling, with HPA (Horizontal Pod Autoscaler) using CPU utilization and memory consumption by default, and custom/external metrics available. VPA (Vertical Pod Autoscaler) can right-size pods, but runtime adjustments cause restarts. Deployment strategies include rolling upgrades, blue-green deployments, and canary deployments, each with different levels of control and complexity. Liveness, readiness, and startup probes are essential for monitoring pod health, and pod disruption budgets ensure minimum service levels during maintenance.
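The effect of a topology spread constraint can be illustrated with a small calculation. This is a simplified sketch of the skew rule, not the scheduler's implementation: skew is taken as the pod count in a zone after placement minus the current minimum across zones, and the zone names and counts are hypothetical.

```python
from typing import Dict, List


def schedulable_zones(pods_per_zone: Dict[str, int], max_skew: int) -> List[str]:
    """Zones where placing one more pod keeps the skew within max_skew."""
    floor = min(pods_per_zone.values())  # least-loaded zone sets the baseline
    return sorted(
        zone
        for zone, count in pods_per_zone.items()
        if count + 1 - floor <= max_skew
    )


# Hypothetical replica counts per availability zone.
zones = {"zone-a": 2, "zone-b": 1, "zone-c": 1}
allowed = schedulable_zones(zones, max_skew=1)
print(allowed)
```

With a maximum skew of 1, the next replica may only land in the two less-loaded zones, which is how the constraint keeps replicas spread evenly instead of piling up in one zone.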
-
-Control plane reliability involves monitoring control plane metrics (API server requests, etcd state store size) to catch issues early. Cluster authentication should be secured by creating a dedicated user with the super admin role. Admission webhooks should be carefully configured and tested so that they do not obstruct the control plane. Cluster upgrades have control plane and data plane phases, with EKS platform versions handling patch releases transparently. Each minor version has a 14-month support cycle, after which automatic upgrades occur.
-
-Data plane reliability involves using tools like node problem detector, reserving system resources, implementing quality of service, and configuring resource quotas and limit ranges. Pod priority and preemption are also important.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-59-achieving-reliability-with-amazon-eks.md.bak
+++ b/SRE/04_EKS/ctp-topic-59-achieving-reliability-with-amazon-eks.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 59 Achieving reliability with Amazon EKS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Reliability
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 59 Achieving reliability with Amazon EKS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
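-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*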
--- a/SRE/04_EKS/ctp-topic-60-monitor-aws-using-hyperscale-observability-with-grafana.md
+++ b/SRE/04_EKS/ctp-topic-60-monitor-aws-using-hyperscale-observability-with-grafana.md
@@ -1,65 +0,0 @@
---
-title: CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Grafana
-  - Observability
-  - Hyperscale
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Monitoring AWS Using Hyperscale Observability with Grafana
-
-This session continues an earlier session about Grafana and focuses on capabilities and features that have recently become available. Vinay presents in place of Sashi, who is on leave.
-
-The session recaps previous discussions, including the effective use of Grafana with different data sources, creating queries, and customizing visualizations. It highlights provisioning Grafana resources with Terraform modules (dashboard as code) and the use of Grafana for SNMP-based network infrastructure monitoring. The move from the open-source version of Grafana to the Enterprise licensed version is emphasized as the way to use Grafana's full potential.
-
-Key highlights explored through demonstrations include data source integration, event tracking, alert integrations, instance monitoring, and resource tracking. Optic DR, an internal monitoring solution and plugin of Vertica DB, is crucial for pulling data into Grafana dashboards. *OpsBridge monitoring solutions use a dashboard to display events triggered by the monitoring systems.* Grafana's alert system is flexible and can be configured to use different notification channels, with the ability to forward alerts to OpsBridge to create incidents. Instance monitoring helps identify resource utilization, and resource tagging categorizes resources for effective management.
-
-The session covers a Terraform module for product teams, which creates Grafana organizations, users, folders, IAM roles, and dashboards for AWS services. *The product team can consume the modules by using sample Terragrunt HCL file.* Default dashboards are provided for accounts onboarded to code, with prerequisites outlined in a readme file. Several default dashboards are offered to product teams, such as billing information dashboards that display resource utilization and EC2 dashboards that can be customized. Customized dashboards can consolidate all services into a single view, though this is typically limited to one account and one region.
-
-EC2 inventory dashboards, using data from Optic DR, provide a view of running and non-running EC2 instances and identify whether resources are tagged. Event dashboards display daily active events triggered by OpsBridge AWS monitoring solutions, with ongoing integration of alerts generated by Grafana. Future roadmap items include SSO authentication, reporting capabilities, URL monitoring, process monitoring, log monitoring, and integration with other products like PagerDuty and Slack Manager.
-
-The session concludes with a discussion of next steps and collaboration, encouraging users to leverage available dashboards and provide feedback or enhancement requests. The team also addresses questions about the cost impact of joining the service, clarifying that default metrics do not incur additional costs, but custom metrics may.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-60-monitor-aws-using-hyperscale-observability-with-grafana.md.bak
+++ b/SRE/04_EKS/ctp-topic-60-monitor-aws-using-hyperscale-observability-with-grafana.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Grafana
-  - Observability
-  - Hyperscale
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
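-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*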
--- a/SRE/04_EKS/ctp-topic-64-scaling-out-with-amazon-eks.md
+++ b/SRE/04_EKS/ctp-topic-64-scaling-out-with-amazon-eks.md
@@ -1,71 +0,0 @@
---
-title: CTP Topic 64 Scaling out with Amazon EKS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Scaling
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 64 Scaling out with Amazon EKS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Scaling Out with Amazon EKS
-
-The 64th Cloud Transformation Program session covers scaling out with Amazon EKS, with a special guest presenter from AWS. The session is interactive and encourages questions, with a survey link to be shared for feedback.
-
-Surav Paul, a senior solutions architect from AWS, discusses scaling workloads using the horizontal pod autoscaler (HPA), event-driven autoscaling with KEDA, capacity autoscaling (cluster autoscaler and Karpenter), addressing IP exhaustion, and scaling cluster components like DNS.
-
-The horizontal pod autoscaler (HPA) is the standard Kubernetes mechanism for scaling application workloads, using metrics to determine replica requirements. It supports CPU and memory utilization out of the box via a metrics server. Custom and external metrics, such as those from load balancers or messaging middleware, can also be used. *The horizontal pod autoscaler is going to pull the metrics and it is going to calculate how many replicas are required for your application workload.* The speaker notes that the gap between the target threshold and 100% utilization is important, and addresses flapping via the `periodSeconds` and `stabilizationWindowSeconds` settings. HPA currently considers resource consumption only at the pod level, not at the container level.
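The replica calculation mentioned in the quote above follows the formula in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1 (0.1 by default). A minimal sketch; the function name and sample numbers are illustrative, and stabilization windows and scaling policies are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,
) -> int:
    """Replica count from the documented HPA ratio, ignoring stabilization."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    return math.ceil(current_replicas * ratio)


scale_out = desired_replicas(4, current_metric=90, target_metric=60)
no_change = desired_replicas(4, current_metric=63, target_metric=60)
scale_in = desired_replicas(4, current_metric=30, target_metric=60)
print(scale_out, no_change, scale_in)
```

A lower target leaves more room between the target and 100% utilization, so existing pods have headroom while new replicas start; that is one way to read the speaker's point about the gap.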
-
-KEDA allows scaling application workloads based on external events, using a custom resource definition called a scaled object. It can scale applications from zero replicas, or publish metrics for the horizontal pod autoscaler to use.
-
-Capacity autoscaling can be achieved using Fargate or EC2 instances. For EC2 instances, cluster autoscaler or Karpenter can be used. Cluster autoscaler is tied to auto scaling groups and node groups, updating the desired capacity of the auto scaling group based on the number of pending pods. It considers CPU and memory requests, and supports mixed instances policies. *The scaling decision that is made by the cluster auto scaler, it is done on the number of pending pods in the cluster.* Auto-discovery is recommended, and changes to min/max configuration should be made at the managed node group or auto scaling group level.
-
-Karpenter is an open-source, Kubernetes-native capacity autoscaler that interacts directly with the EC2 API, offering dynamic on-demand provisioning and improved speed. It does not depend on pre-configured node groups or auto scaling groups. Karpenter uses the concept of a provisioner to define requirements for EC2 instances, matched with workload requirements using node selectors and affinity terms. Reclamation is disabled by default, so TTL or cluster consolidation must be enabled. Karpenter is recommended for clusters with varying capacity and workload requirements.
-
-To address IP exhaustion, switching to IPv6 addressing is recommended. If that is not possible, custom networking can be used with carrier-grade NAT. For IPv6, a dual-stack VPC is recommended, with nodes supporting dual-stack IP addresses but pods having only IPv6 addresses. Interaction between IPv6 pods and IPv4 destinations is handled by NAT at two different layers.
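The scale of the address-space difference behind this advice can be checked with Python's standard `ipaddress` module. The prefixes below are illustrative assumptions: a typical /24 IPv4 subnet, the 100.64.0.0/10 shared address space reserved for carrier-grade NAT (RFC 6598), and a /64 from the IPv6 documentation range.

```python
import ipaddress

# Illustrative prefixes, not taken from the session.
ipv4_subnet = ipaddress.ip_network("10.0.0.0/24")
cgnat_space = ipaddress.ip_network("100.64.0.0/10")   # RFC 6598 shared space
ipv6_subnet = ipaddress.ip_network("2001:db8::/64")   # documentation prefix

print("IPv4 /24 addresses:", ipv4_subnet.num_addresses)
print("CGNAT /10 addresses:", cgnat_space.num_addresses)
print("IPv6 /64 addresses:", ipv6_subnet.num_addresses)
```

A single IPv6 /64 holds 2^64 addresses, so pod IP exhaustion stops being a practical concern, while the carrier-grade NAT range only postpones it by adding a few million IPv4 addresses for pods.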
-
-Additional considerations for scaling include enabling API server priority and fairness metrics, enabling caching and disabling compression, removing underutilized nodes, and limiting scaling spikes. Scaling the DNS component (CoreDNS) and installing node local DNS cache are also important.
-
-The presentation concludes by recommending the EKS best practices guides, specifically the scalability section.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-64-scaling-out-with-amazon-eks.md.bak
+++ b/SRE/04_EKS/ctp-topic-64-scaling-out-with-amazon-eks.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 64 Scaling out with Amazon EKS
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - Kubernetes
-  - Scaling
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 64 Scaling out with Amazon EKS
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
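-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*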
--- a/SRE/04_EKS/ctp-topic-67-cloud-native-observability-using-opentelemetry.md
+++ b/SRE/04_EKS/ctp-topic-67-cloud-native-observability-using-opentelemetry.md
@@ -1,58 +0,0 @@
---
-title: CTP Topic 67 Cloud native observability using OpenTelemetry
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - OpenTelemetry
-  - Observability
-  - Cloud-Native
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using  OpenTelemetry.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 67 Cloud native observability using OpenTelemetry
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using  OpenTelemetry.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> Surav from AWS presented a session on observability for Amazon EKS, covering the need for observability, code instrumentation using OpenTelemetry, defining pipelines, AWS Distro for OpenTelemetry collector deployment patterns, and observability deployment options on EKS and ECS.
-
-Observability is essential for managing complexity as systems evolve. *Building observable applications is a developer responsibility.* Key signals to collect include traces, metrics, and logs, enabling reactive and proactive troubleshooting. AWS offers native options like CloudWatch and X-Ray, alongside open-source solutions such as Jaeger, Zipkin, Prometheus, and Grafana, either self-hosted or managed. The AWS Distro for OpenTelemetry (ADOT) is a secure, production-ready distribution with AWS-developed components, and AWS offers support for operational issues.
-
-OpenTelemetry provides a vendor-agnostic instrumentation library, simplifying code instrumentation. The OpenTelemetry collector uses receivers, processors, and exporters to manage signals. Receivers collect signals, processors transform them, and exporters send them to destinations. *A trace captures the processing time taken at individual layers in your application call stack.* ADOT includes the AWS SigV4 extension for seamless integration with AWS services. Collecting metrics from both application and infrastructure layers allows comprehensive application views, including business-level metrics, service maps from X-Ray traces, and application logs. Correlation IDs, like the X-Ray trace ID, enable deep links to trace views from log events.
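The receiver, processor, and exporter stages described above can be pictured as a simple data flow. The toy model below is plain Python and deliberately does not use the OpenTelemetry SDK or collector configuration; every name in it is hypothetical and only illustrates the order in which signals pass through a pipeline.

```python
from typing import Callable, Dict, Iterable, List

Signal = Dict[str, object]
Processor = Callable[[List[Signal]], List[Signal]]
Exporter = Callable[[List[Signal]], None]


def run_pipeline(
    received: Iterable[Signal],
    processors: List[Processor],
    exporters: List[Exporter],
) -> List[Signal]:
    """Pass received signals through each processor, then hand them to every exporter."""
    batch = list(received)
    for process in processors:
        batch = process(batch)
    for export in exporters:
        export(batch)
    return batch


def drop_health_checks(batch: List[Signal]) -> List[Signal]:
    """Filter-style processor: discard noisy health-check spans."""
    return [s for s in batch if s.get("route") != "/healthz"]


def add_cluster_attribute(batch: List[Signal]) -> List[Signal]:
    """Enrichment-style processor: attach a resource attribute to every signal."""
    return [{**s, "cluster": "demo"} for s in batch]


exported: List[Signal] = []  # stands in for a backend such as a tracing service
result = run_pipeline(
    [{"route": "/orders", "ms": 42}, {"route": "/healthz", "ms": 1}],
    [drop_health_checks, add_cluster_attribute],
    [exported.extend],
)
print(result)
```

In a real collector the same ordering is declared in configuration rather than code, and several exporters can receive the same processed batch, which is what makes the instrumentation vendor-agnostic.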
-
-ADOT is a repackaged OpenTelemetry collector with AWS-developed components. It offers receivers like Prometheus and X-Ray, processors like batch and filter, and exporters like X-Ray, CloudWatch, Prometheus, and EMF. In ECS deployments, the AWS ECS container metrics receiver collects infrastructure metrics, while the Prometheus remote write exporter sends metrics to Prometheus. The SigV4 auth extension is used for AWS API calls. ADOT can be deployed as a sidecar container or a separate task, with configurations for scraping targets and defining pipelines. Deployment patterns include sidecar, separate task, DaemonSet, and high-availability replicas. The ADOT add-on for EKS simplifies deployment with an operator and Terraform module, including prebuilt Grafana dashboards. Costs depend on the destination service, such as metric storage for Prometheus or trace ingestion for X-Ray. An observability workshop and best practices site offer further guidance.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-67-cloud-native-observability-using-opentelemetry.md.bak
+++ b/SRE/04_EKS/ctp-topic-67-cloud-native-observability-using-opentelemetry.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 67 Cloud native observability using OpenTelemetry
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - OpenTelemetry
-  - Observability
-  - Cloud-Native
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using  OpenTelemetry.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 67 Cloud native observability using OpenTelemetry
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using  OpenTelemetry.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
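-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*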
--- a/SRE/04_EKS/ctp-topic-70-eks-deployment-using-iac.md
+++ b/SRE/04_EKS/ctp-topic-70-eks-deployment-using-iac.md
@@ -1,76 +0,0 @@
---
-title: CTP Topic 70 EKS deployment using IAC
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - IaC
-  - Kubernetes
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 70 EKS deployment using IAC
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## EKS Deployment Using Infrastructure As Code
-
-This session covers EKS cluster deployment via Infrastructure as Code (IaC), focusing on managing containers and worker nodes using the SRE EKS module. Key capabilities include cluster autoscaling, an ingress controller, and custom networking. The agenda includes comparing containers and VMs, discussing EKS features, and demonstrating EKS deployment via Terraform and Service Catalog. Monitoring the EKS stack and containers for proactive alerting is also covered.
-
-The discussion begins with the differences between VMs and containers, highlighting the benefits of containers such as reduced boot time, memory efficiency, and portability. Kubernetes is presented as a framework for running distributed systems resiliently, automating rollouts/rollbacks, load balancing, and horizontal pod scaling.
-
-EKS, a managed Kubernetes service by Amazon, offers features like fully managed control planes and autoscaling worker nodes. *Zero downtime rolling deployments for worker node updates* and IAM RBAC mapping for least privilege access are implemented. The SRE EKS module integrates an ALB ingress controller for traffic management and ENI custom networking for pods to handle CIDR limitations.
-
-### Deployment Methods
-
-Two deployment methods are detailed:
-
-1.  **Terraform:** Using a `terragrunt.hcl` file, users can define environment variables, the EKS cluster version, and worker node types (CPU, GPU, or default). Integration with AWS Secrets Manager is included for engineering contact notifications.
-2.  **Service Catalog:** This method allows users to create EKS clusters via a module with version selection and worker node type configuration. It provides more control over security and permissions.
-
-*Service Catalog allows creating, organizing, and governing AWS resources with permission control.*
-
-### Custom Networking and Autoscaling
-
-Custom networking for pods addresses CIDR limitations by adding a virtual ENI (elastic network interface) from which IP addresses are assigned to pods. The Kubernetes Cluster Autoscaler automatically scales worker nodes based on resource needs. Karpenter is being considered as a future addition, to select instance types more efficiently based on pod requirements.
-
-### Monitoring
-
-Monitoring is achieved using the CloudWatch agent and Fluent Bit, both deployed as DaemonSets. Container Insights must be enabled to publish metrics to CloudWatch. The process involves applying manifest files within the cluster to set up CloudWatch logs and metrics. AWS Distro for OpenTelemetry can also be used for monitoring. Centralized Grafana instances are available for visualizing metrics via templated dashboards, including an EKS-specific dashboard.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-70-eks-deployment-using-iac.md.bak
+++ b/SRE/04_EKS/ctp-topic-70-eks-deployment-using-iac.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 70 EKS deployment using IAC
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - EKS
-  - IaC
-  - Kubernetes
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 70 EKS deployment using IAC
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
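-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*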
--- a/SRE/04_EKS/ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid.md
+++ b/SRE/04_EKS/ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid.md
@@ -1,60 +0,0 @@
---
-title: CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Monitoring
-  - Observability
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Summary
-
-> ## Cloud Monitoring Using OBM Implementation
-
-The session covers the implementation of cloud monitoring using Micro Focus Operations Bridge Manager (OBM), a solution designed to address gaps in existing monitoring systems such as SiteScope, especially given the increasing shift towards public cloud environments. OBM offers a dynamic monitoring solution for AWS core services, with enhanced security and better dynamic capabilities than SiteScope.
-
-The current architecture involves data collection from various sources (infrastructure, servers, applications, hardware, and networks) using data collectors like SiteScope, HPCM, and norm, feeding into regional OBMs. These regional OBMs then send data to a global OBM, which acts as a manager of managers. The global OBM integrates with SMAX, enabling the OSE team to escalate and create tickets for events. A new regional OBM setup is planned for AWS cloud monitoring in a lab landing zone environment in Frankfurt. The OBM account will be part of the digital factory landing zone, interacting with core accounts such as the shared, logs, and security accounts. The regional OBM collects data from different AWS accounts through an Operations Agent and the CloudWatch API, forwarding it to the on-premises global OBM.
-
-The architecture includes an OBM AWS account with an OBM application, a Postgres RDS database, and a separate instance running an Operations Agent. The agent collects data using OBM management packs, specifically the AWS management pack, which instructs the agent to gather data from different accounts. *The agent uses role-based access to collect data from the CloudWatch API, eliminating the need to install servers in customer accounts and share sensitive access keys.* The management pack solution uses policies to define monitoring intervals, specific metrics, and data collection from specific accounts, matching data against thresholds to trigger events. *Whenever new instances are added, policies are automatically deployed, and monitoring begins, offering dynamic monitoring capabilities.*
-
-To onboard a new customer, an IAM role with CloudWatch read-only access must be created in the customer account, and the AWS account where OBM and the Operations Agent reside must be added to that role's trust relationship. The role ARN is then added to a policy on the OBM account's IAM role, which is attached to the agent node. The process involves specifying the role ARN, account ID, namespaces/services to be monitored, metrics, thresholds, monitoring frequency, and title format. The title format is enriched to provide useful information for the service center team, facilitating escalation and runbook execution. CloudWatch custom metrics can be used for metrics not exposed by default. The OBM management pack solution can monitor any public cloud vendor (Amazon, Azure, Google Cloud) and any AWS service with data exposed to CloudWatch metrics, using both metrics and logs. The solution is dynamic and customizable, with all data collected from the OBM account without requiring any installations in customer accounts.
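The onboarding step above can be sketched as two IAM policy documents. This is a minimal illustration, not the team's actual configuration: the account ID and role name are hypothetical, and pairing the customer role with the AWS managed policy `CloudWatchReadOnlyAccess` is an assumption about how "CloudWatch read-only access" would be granted.

```python
import json

# Hypothetical 12-digit ID of the account that hosts OBM and its agent.
OBM_ACCOUNT_ID = "111122223333"
# Hypothetical ARN of the read-only role created in a customer account.
CUSTOMER_ROLE_ARN = "arn:aws:iam::444455556666:role/obm-cloudwatch-readonly"


def build_trust_policy(monitoring_account_id: str) -> dict:
    """Trust policy for the customer role: lets the OBM account assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{monitoring_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_assume_role_policy(role_arns: list) -> dict:
    """Policy for the agent node's role: allows assuming each customer role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": role_arns}
        ],
    }


trust_policy = build_trust_policy(OBM_ACCOUNT_ID)
agent_policy = build_assume_role_policy([CUSTOMER_ROLE_ARN])
print(json.dumps(trust_policy, indent=2))
print(json.dumps(agent_policy, indent=2))
```

Each newly onboarded account would add one ARN to the agent-side policy, which is why no access keys need to be shared.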
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/04_EKS/ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid.md.bak
+++ b/SRE/04_EKS/ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/04_EKS
-tags:
-  - AWS
-  - Monitoring
-  - Observability
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
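-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*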
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-1-of-3-compute-optimization.md
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-1-of-3-compute-optimization.md
@@ -1,60 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Karpenter
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## EKS Optimization with Karpenter
-
-This session introduces Karpenter, an open-source compute infrastructure management tool for Kubernetes clusters, addressing challenges associated with the traditional Cluster Autoscaler. Karpenter offers native integration with Kubernetes, direct EC2 Fleet API communication, and intelligent workload placement and consolidation based on cost and utilization.
-
-Key differences between Karpenter and Cluster Autoscaler:
-*   Karpenter integrates with Kubernetes workload scheduling constructs.
-*   It directly communicates with the EC2 Fleet API, reducing latency.
-*   It provides native experiences for workload placement and node consolidation.
-
-Karpenter has two core components: node pools and node classes. Node pools define scheduling constraints and capacity limits, while node classes define instance provisioning details like subnets, node roles, and AMIs.
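The split between node pools and node classes can be sketched as two manifests, built here as Python dictionaries and printed as JSON (which `kubectl` also accepts). Field names follow the Karpenter v1 API as I understand it and should be checked against the installed version; the role name, cluster tag, and limits are hypothetical.

```python
import json

# Node class: how instances are provisioned (role, AMI, subnets, security groups).
node_class = {
    "apiVersion": "karpenter.k8s.aws/v1",
    "kind": "EC2NodeClass",
    "metadata": {"name": "default"},
    "spec": {
        "role": "KarpenterNodeRole-example",  # hypothetical IAM role name
        "amiSelectorTerms": [{"alias": "al2023@latest"}],
        "subnetSelectorTerms": [
            {"tags": {"karpenter.sh/discovery": "example-cluster"}}
        ],
        "securityGroupSelectorTerms": [
            {"tags": {"karpenter.sh/discovery": "example-cluster"}}
        ],
    },
}

# Node pool: scheduling constraints and capacity limits for the nodes it creates.
node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "general"},
    "spec": {
        "template": {
            "spec": {
                "nodeClassRef": {
                    "group": "karpenter.k8s.aws",
                    "kind": "EC2NodeClass",
                    "name": node_class["metadata"]["name"],
                },
                "requirements": [
                    {
                        "key": "karpenter.sh/capacity-type",
                        "operator": "In",
                        "values": ["spot", "on-demand"],
                    },
                    {
                        "key": "kubernetes.io/arch",
                        "operator": "In",
                        "values": ["amd64", "arm64"],
                    },
                ],
            }
        },
        "limits": {"cpu": "100"},  # hypothetical cap on total provisioned CPU
        "weight": 10,  # higher-weight pools are tried first
    },
}

print(json.dumps([node_class, node_pool], indent=2))
```

Keeping provisioning details in the node class lets several node pools with different scheduling rules share one set of subnets, roles, and AMIs.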
-
-Karpenter supports Kubernetes scheduling constraints like node selectors, affinity, taints, tolerations, and topology spread, along with AWS placement requirements such as purchasing options, processor architectures, and availability zones. It can identify zonal requirements based on volume claims and storage classes, simplifying workload definitions compared to Cluster Autoscaler.
-
-_*Karpenter has native integration with Kubernetes and it complements the native Kubernetes spot pod scheduling constraints that is available for your workloads.*_
-
-Karpenter natively supports spot interruptions without requiring additional components like the AWS Node Termination Handler. It uses EventBridge and SQS to handle spot interruption notifications, instance rebalance notifications, health events, and instance state change events.
-
-Node pools can be designed for various scenarios, including single node pools, mixed compute/accelerated nodes, or isolated node pools based on cost, security, or multi-tenancy. Weighted node pools can prioritize instances based on existing commitments or reservations.
-
-Karpenter simplifies data plane management by removing pain points associated with node groups, integrating node termination handling, and providing native integration with Kubernetes scheduling constraints. It also helps consolidate compute instances for greater cost efficiency.
-
-_*Karpenter not only does the auto-scaling bit, but it also removes the pain points of working with node groups.*_
-
-Karpenter can automatically upgrade AMIs or use defined AMIs, referring to the SSM Parameter Store for the latest EKS-optimized AMIs for the corresponding control plane version. It identifies drift between the desired state and running machines, rolling out changes in a rolling upgrade fashion.
-
-AMI selection can be pinned to specific versions or use custom AMIs. The AMI family setting tells Karpenter what user data to inject when spinning up instances.
-
-Consolidation policies can be configured with fine-grained budgets, such as preventing consolidation during peak business hours or limiting the percentage of instances disrupted at a time.
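A disruption block with both kinds of budget mentioned above might look like the following sketch. The field names are from the Karpenter v1 `NodePool` API as I understand it; the business-hours window and the 10% figure are illustrative, and the rule that `schedule` and `duration` go together is my reading of the documentation rather than something stated in the session.

```python
import json

# Fragment of NodePool.spec.disruption (values are illustrative assumptions).
disruption = {
    "consolidationPolicy": "WhenEmptyOrUnderutilized",
    "consolidateAfter": "5m",
    "budgets": [
        # At most 10% of this pool's nodes may be disrupted at any one time.
        {"nodes": "10%"},
        # No voluntary disruption for 8 hours starting 09:00 on weekdays.
        {"nodes": "0", "schedule": "0 9 * * mon-fri", "duration": "8h"},
    ],
}


def validate_budget(budget: dict) -> None:
    """Reject budgets that set only one of schedule/duration (assumed rule)."""
    if ("schedule" in budget) != ("duration" in budget):
        raise ValueError("schedule and duration must be set together")


for budget in disruption["budgets"]:
    validate_budget(budget)

print(json.dumps(disruption, indent=2))
```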
-
-Karpenter publishes logs and emits Prometheus metrics for observability, with community-maintained dashboards available for visualization.
-
-Onboarding is simple, requiring Karpenter to be deployed on nodes not managed by Karpenter, such as a small node group or Fargate. Migration guides are available for migrating from Cluster Autoscaler.
-
-The session is the first in a series of three, with subsequent sessions covering the Bottlerocket operating system and EKS Auto Mode.
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-1-of-3-compute-optimization.md.bak
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-1-of-3-compute-optimization.md.bak
@@ -1,51 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Karpenter
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
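-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*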
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-2-of-3-running-containers-w.md
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-2-of-3-running-containers-w.md
@@ -1,35 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Bottlerocket
-  - OS
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## EKS Optimization: Running Containers with Bottlerocket OS
-
-This session focuses on Bottlerocket OS and its benefits for running containerized workloads in EKS. Bottlerocket is a Linux-based operating system designed specifically for hosting containers, differing from general-purpose OSes by including only essential components. It is free, open-source, and maintained on GitHub, with AWS as a core maintainer and sponsor. Bottlerocket can be run on laptops, workstations, or in data centers, and is designed to be minimal, enforce safe updates, and be security-focused.
-
-Bottlerocket is minimal because it lacks unnecessary software, drivers, and tools. It does not include a package manager, default shell interpreter, or default SSH access. Only essential kernel components are packaged into the OS image during build time. To accommodate specific workload needs like GPU resources, Bottlerocket uses variants, which are combinations of platform, processor architecture, and necessary binary components. These variants are built with specific packages, drivers, and tools included. *A variant is basically a combination of platform, supported platform, the processor architecture and the necessary binary components that are supported by the processor architecture and any additional packages and drivers that are required for your specific workloads.* Configuration is managed through an API interface or TOML-formatted user data.
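As a rough illustration of TOML-formatted user data, the helper below renders a minimal Bottlerocket settings document. The `settings.kubernetes` keys and the admin host container setting are taken from the Bottlerocket settings reference as I recall it; the function name and all values are hypothetical, and a real node would typically need more settings than shown.

```python
def render_user_data(cluster_name: str, api_server: str, cluster_ca_b64: str,
                     enable_admin: bool = False) -> str:
    """Render minimal Bottlerocket user data as a TOML string."""
    lines = [
        "[settings.kubernetes]",
        f'cluster-name = "{cluster_name}"',
        f'api-server = "{api_server}"',
        f'cluster-certificate = "{cluster_ca_b64}"',
        "",
        # The admin host container is how interactive access is opted into,
        # since the OS ships without a default shell or SSH access.
        "[settings.host-containers.admin]",
        f"enabled = {'true' if enable_admin else 'false'}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values; in practice they come from the EKS cluster description.
user_data = render_user_data(
    cluster_name="example-cluster",
    api_server="https://EXAMPLE.eks.amazonaws.com",
    cluster_ca_b64="BASE64-ENCODED-CA",
)
print(user_data)
```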
-
-Safe updates are enforced through in-place updates and node replacement. In-place updates involve downloading a new image version to an inactive partition and switching the active partition upon reboot, ensuring system consistency. The data volume caches container images and can be pre-populated with images via snapshots. Security is enhanced through secure boot, cryptographic verification of the root file system using dm-verity, and an immutable root file system. The `/etc` directory is a temporary file system, and SELinux is enabled by default in enforcing mode. *The root file system is by default immutable, you cannot change anything there.* Bottlerocket has a dedicated CIS benchmark for hardening, and comprehensive security guidance is available.
-
-Bottlerocket integrates with EKS through optimized variants and is supported across self-managed node groups, managed node groups, and Karpenter node pools. It can be configured using tools like eksctl and Karpenter, with best practices including pinning the AMI to a specific version.
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-2-of-3-running-containers-w.md.bak
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-2-of-3-running-containers-w.md.bak
@@ -1,51 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Bottlerocket
-  - OS
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
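-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*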
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-3-of-3-introduction-to-eks-.md
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-3-of-3-introduction-to-eks-.md
@@ -1,42 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Auto-Mode
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## EKS Optimization: Introduction to EKS Auto Mode
-
-This session focuses on EKS Auto Mode, the third part of a series on EKS optimization. EKS Auto Mode extends the management responsibilities of the EKS service to the data plane, managing instances, operating systems, patches, and security updates. It leverages core capabilities like Karpenter for infrastructure management, a managed EBS CSI driver for stateful workloads, and the AWS Load Balancer Controller.
-
-Key benefits of EKS Auto Mode include increased agility, automatic consolidation, dynamic instance determination, and optimized compute costs. *With Auto Mode, a majority of the operational concerns are being managed by the EKS service.* Core capabilities are managed within instances provisioned inside the EKS account, while customers retain control over VPC infrastructure, cluster configuration, add-ons, and workload configurations.
-
-EKS Auto Mode offers an easier interface for working with EKS, providing data plane management in addition to control plane management. It supports a wide range of EC2 instances (excluding bare metal) and is fully compatible with Kubernetes-compliant workloads. Security is enhanced through the use of the Bottlerocket operating system and automated patch management. The core cluster capabilities are grouped under compute (Karpenter controller), networking (AWS Load Balancer Controller), storage (EBS CSI controller), and security (pod identity associations).
-
-By default, Auto Mode includes two node pools (general purpose and system) and one node class. The default node pools are immutable and configured with zero weight, allowing custom node pools to be prioritized. The general purpose node pool is locked to AMD64 architecture, while custom node pools can be defined for Graviton instances. Instances in the system node pool have a taint applied, requiring corresponding tolerations for system add-ons.
-
-Networking in Auto Mode includes CoreDNS packaged with every node as a system service, the VPC CNI as a system service, and kube-proxy set up in iptables mode. Prefix delegation is enabled by default. The AWS Load Balancer Controller is available as a core capability, using an EKS Auto Mode-specific load balancer class. The packaged CSI controller requires a storage class referring to the EBS CSI EKS provisioner.
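A storage class that points at the Auto Mode provisioner could be sketched as below, again as a Python dictionary printed as JSON. The provisioner name `ebs.csi.eks.amazonaws.com` is what I understand the EKS documentation to specify for Auto Mode; the class name and parameters are hypothetical choices.

```python
import json

# StorageClass for the EBS CSI capability built into EKS Auto Mode.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "auto-ebs-sc"},  # hypothetical name
    # Auto Mode provisioner (differs from the self-managed driver's name).
    "provisioner": "ebs.csi.eks.amazonaws.com",
    # Delay volume creation until a pod is scheduled, so the volume lands
    # in the same availability zone as the node.
    "volumeBindingMode": "WaitForFirstConsumer",
    "parameters": {"type": "gp3", "encrypted": "true"},
}

print(json.dumps(storage_class, indent=2))
```

Workloads that reference a storage class written for the self-managed EBS CSI driver would need a class like this one instead.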
-
-Version upgrades in Auto Mode are initiated by an operator for the control plane. *Once the control plane version gets upgraded, then the compute controller, which is running as a core capability, will identify that the control plane version has changed and it will try to pull the current AMI version for that new control plane version.* The compute controller then rolls out the new AMI across the cluster through a rolling upgrade.
-
-While the controllers are managed by the EKS service, users can investigate custom resources and deploy node diagnostic CRDs. Observability can be achieved through the CloudWatch agent, AWS Distro for OpenTelemetry, or other collectors.
-
-For every instance spun up in an Auto Mode cluster, there is a 12% premium charged for the automatic management of those instances.
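To make the premium concrete, the sketch below applies the 12% figure quoted in the session to an instance price. The example price is hypothetical, and whether the premium is computed on the on-demand price or on the discounted price is not stated in the summary, so treat the base as an assumption.

```python
AUTO_MODE_PREMIUM = 0.12  # management premium quoted in the session


def hourly_cost_with_auto_mode(ec2_hourly_price: float) -> float:
    """Instance price plus the Auto Mode management premium."""
    return ec2_hourly_price * (1 + AUTO_MODE_PREMIUM)


example_price = 0.10  # hypothetical USD per hour for one instance
total = hourly_cost_with_auto_mode(example_price)
monthly_premium = (total - example_price) * 730  # roughly 730 hours per month
print(f"hourly with premium: {total:.4f} USD")
print(f"monthly premium per instance: {monthly_premium:.2f} USD")
```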
--- a/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-3-of-3-introduction-to-eks-.md.bak
+++ b/SRE/04_EKS/public-cloud-learning-sessions-eks-optimization-part-3-of-3-introduction-to-eks-.md.bak
@@ -1,50 +0,0 @@
---
-title: "Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - AWS
-  - EKS
-  - Auto-Mode
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
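-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*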
--- a/SRE/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md
+++ b/SRE/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md
@@ -1,43 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - OpenTelemetry
-  - Observability
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Observability with OpenTelemetry
-
-Jay Comer, Solutions Architect with AWS, presented an overview of observability with OpenTelemetry, including changes and updates within the AWS observability ecosystem since the last session a year ago. The session included a demo showing how to piece together the components and how to instrument an application with OpenTelemetry.
-
-Observability is defined as *a measure of how well internal states of a system can be inferred from knowledge of its external outputs.* These outputs include logs, metrics, and traces, which are correlated with the application's health. As systems move to microservice-based architectures, the observability challenge becomes more prominent due to increasing complexity. Downtime costs significant money and effort, with Gartner estimating an average of 87 hours of downtime per year at a cost of $42,000 per hour.
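Multiplying the two figures quoted above gives the implied annual cost of downtime. This is plain arithmetic on the numbers in the summary, not an additional data point from the session.

```python
DOWNTIME_HOURS_PER_YEAR = 87      # estimate quoted in the session
COST_PER_HOUR_USD = 42_000        # estimate quoted in the session

annual_downtime_cost = DOWNTIME_HOURS_PER_YEAR * COST_PER_HOUR_USD
print(f"${annual_downtime_cost:,} per year")  # prints "$3,654,000 per year"
```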
-
-The three signals used for observability are metrics, logs, and traces. Metrics are aggregated source statistics, logs help determine the root cause of problems, and traces provide a holistic view of a specific request within the system. A trace span includes a start time, a duration, and metadata such as a log.
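The shape of a span can be illustrated with a standard-library sketch. This is not the OpenTelemetry SDK; `Span` and `start_span` are hypothetical names that only mirror the fields described above (start time, duration, and attached metadata).

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    start_time: float                 # wall-clock start, seconds since epoch
    duration: float = 0.0             # filled in when the span ends
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)  # log-like entries on the span


@contextmanager
def start_span(name, finished, **attributes):
    """Time the enclosed block and append the finished span to `finished`."""
    span = Span(name=name, start_time=time.time(), attributes=dict(attributes))
    started = time.perf_counter()
    try:
        yield span
    finally:
        span.duration = time.perf_counter() - started
        finished.append(span)


finished_spans = []
with start_span("GET /orders", finished_spans, tenant_id="tenant-a") as span:
    span.events.append("cache miss")
    time.sleep(0.01)  # stand-in for the work being measured

print(finished_spans[0].name, f"{finished_spans[0].duration:.3f}s")
```

A real trace links many such spans through shared trace and parent identifiers, which is what gives the end-to-end view of a single request.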
-
-The AWS observability landscape includes AWS native services like CloudWatch and X-Ray, as well as managed services of open-source implementations like Grafana, OpenSearch, Prometheus, and OpenTelemetry. OpenTelemetry aims to solve the problem of disparate SDKs and tooling for different components within the observability landscape by providing an instrumentation language with different SDKs per language. It offers an end-to-end implementation for making telemetry data accessible and usable and is vendor-agnostic.
-
-OpenTelemetry defines a data format, provides SDKs for 11 languages, and supports automatic instrumentation. The OpenTelemetry Collector standardizes and transforms data into the OpenTelemetry Protocol (OTLP) format and exports it to different destinations. The Collector includes receivers (AWS-specific or open source), processors (filtering, transformations), exporters (AWS native, open source, or third-party), and extensions (SigV4 for authorization, health check).
-
-The AWS Distro for OpenTelemetry (ADOT) is a unified agent for collecting traces, metrics, and logs. It includes an operator that automatically instruments applications by detecting the language used and creating pre-configured OpenTelemetry collectors. Custom attributes, such as tenant IDs, can be added to OpenTelemetry items.
-
-Recent announcements focused on security and compliance, scale and region expansion, and a single pane of glass with an improved user experience. The managed collector for Amazon Managed Service for Prometheus provides a serverless, agentless scraper that automatically discovers and pulls Prometheus-compatible metrics. Log support was added to the AWS Distro for OpenTelemetry, and Amazon Managed Grafana now supports community plugins.
-
-The demo showcased a sample application running on EKS, using Fluent Bit for collecting logs and forwarding them to the OpenTelemetry container. The OpenTelemetry container collects traces and metrics from the application, sending logs, traces, and metrics to Amazon OpenSearch Service via an ingestion pipeline. The source code included Fluent Bit and OpenTelemetry YAML configuration files. *The output that Fluent Bit is sending the individual logs to is the OpenTelemetry endpoint on the port 55681.* On a code level, the implementation involves importing the OpenTelemetry SDKs, configuring a tracer provider, and starting a span with the tracer at each point where instrumentation and request duration measurement are needed.
-
-OpenSearch dashboards can display latency by trace group and an application composition map, showing where bottlenecks are appearing.
--- a/SRE/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md.bak
+++ b/SRE/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md.bak
@@ -1,49 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/04_EKS"
-tags:
-  - OpenTelemetry
-  - Observability
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 04_EKS
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
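-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*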
--- a/SRE/05_FinOps/ctp-topic-13-cloud-finops-micro-focus-policies-best-practices-to-optimize-the-co.md
+++ b/SRE/05_FinOps/ctp-topic-13-cloud-finops-micro-focus-policies-best-practices-to-optimize-the-co.md
@@ -1,102 +0,0 @@
---
-title: "CTP Topic 13 Cloud FinOps Micro Focus Policies best practices to optimize the costs"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - FinOps
-  - Cost-Optimization
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 13_ Cloud FinOps_ Micro Focus Policies _ best practices to optimize the costs.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 13 Cloud FinOps Micro Focus Policies best practices to optimize the costs
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 13_ Cloud FinOps_ Micro Focus Policies _ best practices to optimize the costs.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
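-## Summary
-
-The topic of this cloud transformation learning session is "Cloud FinOps: Micro Focus policies and best practices to optimize the costs". It is presented by Uday and Vinay of the PCG (Public Cloud Governance) team.
-
-### Core Content
-
-1. **PCG service tiers**
-   - **Cost management**: bill payment, showback/chargeback, budget management
-   - **Cost optimization**: organization-level and account-level optimization, including purchasing Reserved Instances and identifying underutilized resources
-   - **Governance and automation**: centralized onboarding, policy development, automated reporting
-
-2. **Core policies**
-   - **Visibility**: ensure bills are visible
-   - **Tag compliance**: enforce tagging requirements
-   - **Budget accountability**: account owners are responsible for staying within budget
-   - **Centralized management**: Reserved Instances and Savings Plans are managed centrally
-   - **Region restrictions**: limit region usage to optimize cost, security, and management
-
-3. **Security policies**
-   - Guardrails are pre-installed
-   - Access is managed through federated identity (MFA coming soon)
-   - Vendor alerts are redirected to the security team
-   - Account owners must provide a public distribution list (PDL) for alerts
-   - A Cloud Security Posture Management tool is being implemented
-
-4. **Best practices**
-   - Use the calculator to understand costs
-   - Review the resource inventory
-   - Monitor the monthly bill
-   - **CloudHealth**: a key tool providing resource inventory, cost analysis, and monthly bill insights
-   - Standardize instance types:
-     - M series: general-purpose use
-     - T series: burstable workloads
-     - C series: compute-intensive applications
-     - R/X series: memory-optimized workloads
-   - Use Graviton instances to save costs
-
-5. **Optimizing R&D environments**
-   - Use burstable instances
-   - Use the instance scheduler
-   - Use Spot instances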
-
---
-
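-## Key Concepts
-
-- **PCG (Public Cloud Governance)**: the public cloud governance framework, providing guidance on workload placement, cost, and optimization
-- **Showback/Chargeback**: mechanisms for allocating cloud costs back to teams
-- **CloudHealth**: cloud cost analysis and monitoring tool
-- **Guardrails**: pre-installed security controls
-- **Reserved Instances / Savings Plans**: commitment plans used for cost optimization
-- **Graviton**: AWS Arm-based processors, cheaper than comparable Intel instances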
-
---
-
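-## Action Items
-
-- [ ] Learn how to use the CloudHealth tool
-- [ ] Review and standardize instance type selection
-- [ ] Ensure all resources carry the correct tags
-- [ ] Explore Graviton instances for compatible workloads
-- [ ] Implement the instance scheduler in R&D environments
-- [ ] Check the monthly bill and identify optimization opportunities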
-
---
-
-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-63-optimise-resource-cost-using-automation.md]]: an in-depth look at optimizing resource cost through automated scheduling
-> [[ctp-topic-27-aws-instance-scheduler.md]]: AWS Instance Scheduler in detail
-> [[ctp-topic-71-pcgs-guide-to-rightsizing-why-how-when.md]]: rightsizing best practices
-
---
-
-*Last updated: 2026-04-15*
--- a/SRE/05_FinOps/ctp-topic-27-aws-instance-scheduler.md
+++ b/SRE/05_FinOps/ctp-topic-27-aws-instance-scheduler.md
@@ -1,62 +0,0 @@
---
-title: "CTP Topic 27 AWS Instance Scheduler"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Instance-Scheduler
-  - Cost-Optimization
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 27_ AWS Instance Scheduler.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 27 AWS Instance Scheduler
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 27_ AWS Instance Scheduler.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
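-## Summary
-
-> This session was presented by Gustavo and focused on **AWS Instance Scheduler**, a cost optimization tool provided by AWS and integrated by the CCOE (Cloud Center of Excellence) into the Guardrails deployment. Its core goal is to control the running state of EC2 and RDS instances through automated scheduled tasks, reducing cloud costs in non-production environments such as development and test.
-
-> Technically, the solution is deployed with CloudFormation and uses CloudWatch Events to trigger a Lambda function every 15 minutes (the default configuration). The Lambda function reads the schedule configuration stored in DynamoDB (time zone, working hours, and periods) and decides, based on specific tags on each instance, whether to start or stop it. In the demo, Gustavo showed how to set the `Schedule` and `Period` tags to associate instances with different office hours (for example Seattle or UK office hours).
-
-> The session also covered several key operational details. First, an instance's shutdown behavior must be set to "Stop" rather than "Terminate" so that data is preserved. Second, for RDS instances, the tool handles the mandatory maintenance window that occurs every seven days, ensuring that the instance returns to its expected scheduled state once maintenance completes. In the Q&A, Gustavo clarified that the tool is triggered by a schedule rather than by idle time, and confirmed that, through Guardrails, the feature already automatically covers the vast majority of the company's AWS accounts that spend more than 10 US dollars per month.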
-
---
-
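-## Key Concepts
-
-- **AWS Instance Scheduler**: a solution provided by AWS that automatically starts and stops EC2 and RDS instances to save costs.
-- **Guardrails**: an automated compliance and governance framework implemented by the company's CCOE team; Instance Scheduler is deployed automatically as its cost-control component.
-- **CloudWatch Events**: the system's trigger, which invokes the Lambda function at a preset interval (for example every 15 minutes).
-- **DynamoDB Config Table**: the table that stores schedule definitions (Schedules) and period definitions (Periods); it is the logical core of the scheduling.
-- **Tagging**: users attach specific tags (such as `Schedule`) to an instance to associate it with a predefined schedule.
-- **RDS Maintenance Window**: the maintenance window specific to RDS; Instance Scheduler recognizes and works with it, ensuring the database is correctly shut down after maintenance.
-- **Override Status**: an advanced setting that lets an administrator force an instance to stay stopped, even during its scheduled running period.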
-
---
-
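-## Related Videos
-
-> [!info]+ Cross-references
-> [[AWS Guardrails Overview]]: the underlying governance framework through which Instance Scheduler is deployed
-> [[Cloud Cost Optimization Strategies]]: cloud cost optimization techniques beyond scheduled start and stop
-> [[AWS Lambda and Serverless Architecture]]: a closer look at the Lambda trigger mechanism used in this solution
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*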
--- a/SRE/05_FinOps/ctp-topic-63-optimise-resource-cost-using-automation.md
+++ b/SRE/05_FinOps/ctp-topic-63-optimise-resource-cost-using-automation.md
@@ -1,97 +0,0 @@
---
-title: "CTP Topic 63 Optimise resource cost using automation"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Cost-Optimization
-  - Automation
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 63_ Optimise resource cost using automation.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 63 Optimise resource cost using automation
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 63_ Optimise resource cost using automation.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
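-## Summary
-
-The topic of this cloud transformation learning session was "Optimise resource cost using automation." It focused on reducing cloud resource costs through standardization, sensible instance type selection, commitment plans, and automated scheduling.
-
-### Core Content
-
-1. **Approved Regions**
-   - Specific cloud regions are recommended (AWS: Oregon, North Virginia, Frankfurt, London, Sydney, Singapore)
-   - Benefits: improved security, standardized management, easier cost optimization
-
-2. **Instance type selection**
-   - General purpose: M6i/M6g (Arm-based Graviton is 20-25% cheaper than Intel)
-   - Economy: T3/T4g (suited to R&D development and test environments)
-   - Compute optimized: C family
-   - Memory optimized: R family/X family
-   - **Key example**: at an equivalent configuration, switching from the M family to the R family can save 35% of the on-demand price
-
-3. **Commitment Plans**
-   - 1-year commitment: roughly 40% discount
-   - 3-year commitment: roughly 60-64% discount
-   - Can be combined with EDP to reduce costs further
-
-4. **Storage optimization**
-   - Migrating from GP2 to GP3: a direct 20% saving
-   - Delete unused EBS volumes and snapshots promptly
-   - Avoid over-provisioning storage
-
-5. **Automated scheduling (Scheduler)**
-   - Tag-based start/stop of EC2 and RDS instances
-   - Potential savings: running only 10 hours per day can save 70% of the cost
-   - Supports teams in different time zones
-   - Implemented with Lambda + EventBridge
-
-### Demo
-
-Pushka demonstrated how to configure the scheduler with a Terraform module, using tags (such as `auto shutdown = yes`) to stop instances automatically.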
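-
-As a rough check on the 70% figure, assume (this is an assumption, not stated in the session) that an instance runs 10 hours on each of 5 working days and is stopped the rest of the week. A week has 168 hours, so the share of on-demand compute hours avoided is:
-
-$$
-\text{savings} = 1 - \frac{10 \times 5}{24 \times 7} = 1 - \frac{50}{168} \approx 0.70
-$$
-
-If the instance instead ran 10 hours on all 7 days, the same calculation gives $1 - 70/168 \approx 0.58$, so the 70% figure appears to rely on weekends being switched off as well. The estimate covers compute hours only; attached EBS storage is still billed while an instance is stopped.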
-
---
-
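-## Key Concepts
-
-- **Approved Regions**: the recommended regions for deploying cloud resources; they help improve security, standardize management, and optimize cost.
-- **Instance type selection**: choose the instance family that fits the workload (for example M, T, C, or R) to balance performance and cost.
-- **Commitment plans**: discounted pricing in exchange for committing to cloud usage for a period (for example one or three years).
-- **Automated scheduling**: scheduled tasks that start and stop cloud resources automatically to avoid paying for them outside working hours.
-- **Storage optimization**: lower storage costs by choosing the right storage type (for example GP3 instead of GP2), cleaning up unused storage promptly, and sizing storage sensibly.
-- **Graviton**: AWS's own Arm-based processors, 20-25% cheaper than Intel at the same size and mature enough for production use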
-
---
-
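-## Action Items
-
-- [ ] Assess current cloud resource usage and identify resources that can be moved to approved regions.
-- [ ] Analyze the resource needs of each workload, choose suitable instance types, and run a cost-benefit analysis.
-- [ ] Assess the utilization of existing cloud resources and consider purchasing commitment plans to reduce cost.
-- [ ] Implement automated scheduling in R&D environments, with scheduled tasks that start and stop instances automatically.
-- [ ] Regularly clean up unused storage volumes and snapshots to optimize storage cost.
-- [ ] Explore Graviton instances for compatible workloads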
-
---
-
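-## Related Videos
-
-> [!info]+ Cross-references
-> [[ctp-topic-XX-instance-types.md]]: details the use cases and cost-effectiveness of different instance types.
-> [[ctp-topic-XX-ri-savings-plan.md]]: an in-depth look at the types of commitment plans and how to choose between them.
-> [[ctp-topic-XX-scheduler-demo.md]]: demonstrates how to use automated scheduling tools to optimize resource cost.
-
---
-
-*Last updated: 2026-04-15*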
--- a/SRE/05_FinOps/ctp-topic-71-pcgs-guide-to-rightsizing-why-how-when.md
+++ b/SRE/05_FinOps/ctp-topic-71-pcgs-guide-to-rightsizing-why-how-when.md
@@ -1,51 +0,0 @@
---
-title: "CTP Topic 71 PCG's guide to RightSizing, why, how when"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - RightSizing
-  - Cost-Optimization
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 71_ PCG's guide to RightSizing, why, how _ when.mp4"
-audio-source: ""
-status: raw
---
-
-# CTP Topic 71 PCG's guide to RightSizing, why, how when
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 71_ PCG's guide to RightSizing, why, how _ when.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
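-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*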
--- a/SRE/05_FinOps/public-cloud-learning-sessions-best-practices-for-ec2-cost-optimization-in-aws-2.md
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-best-practices-for-ec2-cost-optimization-in-aws-2.md
@@ -1,40 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - EC2
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## EC2 Cost Optimization in AWS: Best Practices
-
-Mike Dukes and Steele Taylor, AWS experts, presented a learning session on EC2 cost optimization, covering compute efficiency, Graviton, EC2 Spot, and cost-effective container deployments. The session was interactive, and questions were welcomed.
-
-Efficiency in the cloud involves architectural best practices and leveraging AWS services and instance types for optimal workload performance. Technical advantages include high availability, elastic usage, and innovation adoption. Benefits include cost efficiency, leveraging purchase options, and reducing carbon footprint. *When we start talking about architecting and using best practice efficiency in the cloud, you effectively only pay for what you use when you use AWS.*
-
-EC2 offers over 750 instance types tailored for various workloads. AWS's Nitro system enhances efficiency by externalizing network, storage, and security components. AWS Graviton processors provide price performance benefits. Purchase options include on-demand, savings plans, and spot instances, each suited for different workload types.
-
-Graviton instances offer up to 40% better price performance than comparable x86 instances. Graviton is based on ARM64 and has extensive software support across Linux distributions, ISVs, and open-source software, with sustainability benefits through reduced power consumption. AWS now offers the fourth generation of Graviton. Graviton is available across instance families, including compute-optimized, memory-optimized, and general-purpose. AWS services such as RDS, Aurora, and Lambda also support Graviton, and migrating managed services such as RDS and Aurora to Graviton is relatively straightforward. *Graviton3 actually uses up to 60% less power consumption than comparable x86-based instances.*
-
-EC2 Spot Instances offer discounts of up to 90% compared to on-demand pricing by using spare capacity. Workloads suited to Spot are fault tolerant, flexible, and stateless. Diversifying across instance types and Availability Zones is crucial for Spot usage. Spot Instances can be interrupted when the capacity is needed for on-demand instances, with a notification sent before termination. Integrations with AWS services such as Auto Scaling, EKS, and ECS support automated responses to interruptions.
-
-Spot instances are suitable for web services, containers, HPC batch processing, big data, and CI/CD, while Graviton is beneficial for most of these except stateful services like databases. Spot and Graviton can be used together with containers, provided instance pools are not overly restricted.
-
-Spot Invaders, a fault-tolerant chaos engineering game powered by EKS and EC2 Spot, demonstrates best practices for running resilient applications on EKS while optimizing costs. The game involves shooting aliens to simulate pod failures and whales to trigger spot interruptions, showcasing the ability to maintain service availability despite disruptions.
--- a/SRE/05_FinOps/public-cloud-learning-sessions-best-practices-for-ec2-cost-optimization-in-aws-2.md.bak
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-best-practices-for-ec2-cost-optimization-in-aws-2.md.bak
@@ -1,50 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - EC2
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
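-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*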
--- a/SRE/05_FinOps/public-cloud-learning-sessions-budget-control-20240319-160204-meeting-recording.md
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-budget-control-20240319-160204-meeting-recording.md
@@ -1,52 +0,0 @@
---
-title: "Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Budget-Control
-  - FinOps
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Budget Control Automation
-
-The SRE Core team (Daniela, Evan, and Alan) presented a learning session on budget control, a new automation providing detailed data to manage budgets and costs within AWS accounts. The session covered the new budget control's value, diagrams, detailed cost reports, AWS budget alerts/actions, and source identity implementation.
-
-The budget control automation aims to address uncontrolled AWS account sprawl and unsustainable cost reduction efforts. It provides account owners with detailed alerts, including information on account spending and cost drivers, enabling them to identify areas for cost reduction. Enforcement will involve attaching an SCP to block new resource creation. The initial scope is limited to lab accounts, with other accounts continuing to receive standard out-of-budget alerts.
-
-An example alert email includes account details, alert details, warning messages, and detailed reports. There are four types of email alerts: forecast, actual, severe, and enforcement. The alert flow includes forecast alerts at 100% threshold with no action, and actual alerts at 80%, 90%, 95%, and 98% thresholds with escalating recipient lists. At 100%, a severe or enforcement alert is triggered based on a scoring system, with enforcement initially via manual approval and later automated. Budget increases can be requested through an Oli workflow.
-
-*The source identity must be tracked.* Challenges during development included tracking source identity, customizing AWS budget alerts, choosing an enforcement method (SCP), and providing a grace period before enforcement. Budgets are evaluated every eight hours, and disabled budget actions result in no spend control until the next month. Currently, 80 lab accounts exceed their budgets, and around 100 are expected to exceed 80% of their budget threshold.
-
-The implementation will be gradual, starting with alerts only on April 1st. Manual enforcement will follow upon FinOps' approval, with automatic enforcement as the next step.
-
-## Diagrams and Detailed Cost Reports
-
-Daniel discussed diagrams and cost reports attached to email alerts, explaining their creation and content. Libraries for lambdas were created to improve code visibility and simplify deployment. The *top services of recent months* report helps managers understand cost drivers, showing the percentage of budget spent on specific services over time. The *top users of current months* diagram allows account owners to monitor daily spending by users. A detailed Excel report provides granular information on resource IDs, creators, and associated costs, separated by month.
-
-*This is the first time that we were able to get to this level of granularity.* Data for the top services report is generated from Athena, while the user's diagram uses data from Cost Explorer.
-
-## AWS Budget Alerts and Actions
-
-Alan discussed the implementation of AWS budget alerts and actions. The AWS budget service is primitive in terms of customization, so the team had to parse the bodies of the emails received from it. The budget alert system sends messages to an SNS topic, which triggers a Lambda function. The Lambda extracts data from the email and uses it to create a more detailed message. The step function enriches the data with account information, budget details, and owner/manager contacts.
-
-AWS allows actions to be applied based on alert thresholds. A budget action at the 100% threshold triggers either a severe or an enforcement email, depending on the scoring system. If budget enforcement is enabled, an SCP is applied to block resource creation. The FinOps group receives a notification and decides whether to apply the action immediately or negotiate with the account owner.
-
-The scoring system and grace period calculations aim to avoid penalizing accounts that slightly exceed their budget near the end of the month. The scoring considers account size and proximity to the end of the month. Smaller accounts have a better grace period.
-
-FinOps has classified accounts based on cost range. The budgets were last updated on February 23rd. The source identity attribute was implemented to track user activity within AWS accounts, even when assuming different roles. Federated logins use NetIQ access manager to authenticate users and provide access to AWS accounts. The source identity ensures that the original login identity is maintained across role changes, allowing CloudTrail and other services to track user activity accurately.
--- a/SRE/05_FinOps/public-cloud-learning-sessions-budget-control-20240319-160204-meeting-recording.md.bak
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-budget-control-20240319-160204-meeting-recording.md.bak
@@ -1,50 +0,0 @@
---
-title: "Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Budget-Control
-  - FinOps
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
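-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*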
--- a/SRE/05_FinOps/public-cloud-learning-sessions-reducing-cloud-costs-20250318-170100-meeting-reco.md
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-reducing-cloud-costs-20250318-170100-meeting-reco.md
@@ -1,42 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Cost-Optimization
-  - FinOps
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Reducing Cloud Costs
-
-Vinay from the FinOps team presented a session on reducing cloud costs, focusing on workload and rate optimization. The session covered modernization, right sizing, and best practices for cost reduction.
-
-### Workload Optimization via Modernization and Right Sizing
-
-Modernization involves using newer generations of services, such as EC2 instances. While there is a perception that newer instances are more expensive, the latest families are generally cheaper and offer better performance. *Whenever there's a new family launched by the hyperscaler, the latest families are almost cheaper.* However, AWS slightly changed its pricing model after M6, making M7 and M8 somewhat more expensive. Moving from Intel to AMD can save around 6-10% on on-demand prices for Windows and Linux workloads. Graviton instances can offer even greater savings (a 20-25% reduction in on-demand cost) for Linux workloads, and these savings combine with EDP discounts and commitment plans.
-
-Upgrading storage from GP2 to GP3 offers a direct 20% cost benefit without downtime. For Amazon EKS clusters, upgrading to the latest versions is crucial to avoid extended support costs, which are significantly higher. *Rather than spending unnecessary money on extended support, you can deploy an additional four or five clusters, right.* Spot instances can provide a discount of up to 90% compared to on-demand and suit big data, CI/CD pipelines, web servers, and HPC.
-
-Right sizing involves identifying the correct resource configuration for workload performance and capacity needs. The EC2 right sizing recommendation report captures CPU usage, memory, and network data to provide recommendations. Configuring instance schedules is useful for non-production environments, allowing instances to be powered on/off based on business hours, potentially reducing costs to 40% of on-demand prices. Identifying and deleting idle load balancers, unassociated elastic IPs, and underutilized EBS volumes are also key to cost savings. Old snapshots and CloudWatch logs also contribute to unnecessary costs. Using cheaper regions like Oregon or North Virginia can reduce costs if there are no specific regional requirements.
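-
-To illustrate what a schedule has to look like to reach the 40% figure, here is a back-of-the-envelope estimate. The 13-hour weekday schedule is an illustrative assumption, not a number from the session. With $h$ running hours in a 168-hour week:
-
-$$
-\frac{h}{168} = 0.40 \;\Rightarrow\; h \approx 67 \text{ hours per week}, \qquad \frac{13 \times 5}{168} = \frac{65}{168} \approx 0.39
-$$
-
-In other words, roughly 13 hours per working day with weekends off lands close to 40% of the always-on compute cost. Storage attached to stopped instances is still billed, so the saving on the total bill will be somewhat smaller.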
-
-### Rate Optimization
-
-Rate optimization involves commitment-based discounts. Hyperscalers offer discounts for committing to resource usage or spending for a term (1-3 years). There are two categories: resource-level commitment (better discount with limitations) and flexible commitment (standard discount with flexibility). AWS offers Savings Plans (EC2 and Compute) and reservations for various services like RDS, ElastiCache, and CloudFront.
-
-The rate optimization workflow includes pre-work (right sizing), analysis (identifying workloads requiring 24/7 uptime), communication (sharing details with finance), approval (from the account owner), and reporting (monitoring utilization). Only the FinOps team can implement commitment plans. All commitment plans will be purchased using the no-upfront payment option only. The minimum transaction value is 5k per annum.
--- a/SRE/05_FinOps/public-cloud-learning-sessions-reducing-cloud-costs-20250318-170100-meeting-reco.md.bak
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-reducing-cloud-costs-20250318-170100-meeting-reco.md.bak
@@ -1,50 +0,0 @@
---
-title: "Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Cost-Optimization
-  - FinOps
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
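-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*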
--- a/SRE/05_FinOps/public-cloud-learning-sessions-storage-cost-optimization-20240305-160037-meeting.md
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-storage-cost-optimization-20240305-160037-meeting.md
@@ -1,46 +0,0 @@
---
-title: "Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Storage
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
-## Storage Cost Optimization
-
-This session covers storage cost optimization best practices across various AWS storage services: Amazon EBS, Amazon EFS, Amazon FSx, and Amazon S3. It includes an optimization example from ADM.
-
-Key points include choosing the right storage for your workload, considering API costs and data transfer costs in addition to price per gigabyte, and understanding the different tiers available within each service.
-
-### Amazon EBS
-
-EBS has SSD and HDD volumes. GP3 volumes are recommended as the default for general-purpose SSD because they are 20% more cost-effective than GP2. *With GP3, you can scale IOPS and throughput independently of the volume size.* To migrate from GP2 to GP3, automation tools should be updated to create GP3 volumes by default. EBS snapshots have standard and archive tiers; the archive tier costs 75% less but has longer restore times and a 90-day minimum retention period. Automation via Data Lifecycle Manager (DLM) or AWS Backup is recommended for managing snapshots, including setting retention policies and moving snapshots to the archive tier.
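-
-The 20% figure can be sanity-checked against list prices. As an assumption for illustration, take storage prices of about \$0.10 per GB-month for GP2 and \$0.08 per GB-month for GP3 (roughly the us-east-1 list prices; prices differ by region and change over time):
-
-$$
-1 - \frac{0.08}{0.10} = 0.20, \qquad 1000\ \text{GB} \times (0.10 - 0.08) = \$20 \text{ saved per month}
-$$
-
-This covers the storage charge only. GP3 includes a baseline of 3,000 IOPS and 125 MB/s, and provisioning performance above that baseline is billed separately, so volumes that need more should be priced with those extras included.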
-
-### Amazon EFS and FSx
-
-FSx considerations include data deduplication, compression, and tiering. EFS offers standard, one-zone, and infrequent access tiers, with lifecycle policies to move files between tiers. The infrequent tier has a minimum billable object size of 128KB. EFS archive is a new tier, similar to Glacier, with a 90-day minimum duration and a 128KB minimum billable object size. FSx for NetApp ONTAP has SSD and HDD tiers (capacity pool), with automatic tiering between them.
-
-### Amazon S3
-
-Choosing the right storage class is crucial for S3 cost optimization. S3 Standard is for frequently accessed objects, with no retrieval fees, minimum retention, or minimum billable object size. Glacier tiers (Instant Retrieval, Flexible Retrieval, Deep Archive) are for rarely accessed data, with varying retrieval times and costs. S3 Intelligent-Tiering automatically moves data between tiers based on access patterns, with no transition fees between tiers within Intelligent-Tiering. *With Intelligent-Tiering we can automatically move data from warmer to colder storage tiers, and it will be based on the object's last access date.* Lifecycle policies can transition objects between tiers, expire non-current versions, and delete incomplete multipart uploads. Data transfer charges should be considered, and PrivateLink can be used to keep traffic within the AWS network. Storage Lens, CloudWatch, S3 Inventory, and access logs can be used to monitor and optimize S3 usage.
-
-### ADM Optimization Example
-
-ADM migrated NetApp file shares from on-premises to AWS. The initial migration to OpenZFS was inefficient. A second migration to self-managed NetApp on EC2 instances incurred high data transfer costs. The final migration to Amazon FSx for NetApp ONTAP resulted in a 60% cost reduction.
--- a/SRE/05_FinOps/public-cloud-learning-sessions-storage-cost-optimization-20240305-160037-meeting.md.bak
+++ b/SRE/05_FinOps/public-cloud-learning-sessions-storage-cost-optimization-20240305-160037-meeting.md.bak
@@ -1,50 +0,0 @@
---
-title: "Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/05_FinOps"
-tags:
-  - AWS
-  - Storage
-  - Cost-Optimization
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4"
-audio-source: ""
-status: raw
---
-
-# Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4`
-
-**Type:** VIDEO | **Category:** 05_FinOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
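-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*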
--- a/SRE/06_CI_CD_GitOps/ctp-topic-15-working-with-renovatebot.md
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-15-working-with-renovatebot.md
@@ -1,62 +0,0 @@
---
-title: "CTP Topic 15 Working with Renovatebot"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/06_CI_CD_GitOps"
-tags:
-  - Renovatebot
-  - Dependency-Update
-  - GitOps
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 15_ Working with Renovatebot.mp4"
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 15 Working with Renovatebot
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 15_ Working with Renovatebot.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
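-## Summary
-
-> This session was presented by Paul Hopkins and centered on using **Renovate Bot** to automate dependency updates in cloud-native infrastructure. In a complex cloud architecture, dependencies are everywhere: Docker base images, Maven dependencies, Terraform modules, Kubernetes Helm charts, and more. Paul noted that, in maintaining a large number of Gruntwork-based Terraform modules and Terragrunt configurations, the team found manual version bumps time-consuming and prone to falling behind.
-
-> To address this "dependency hell," the team introduced Renovate Bot. The tool scans repositories, identifies outdated version tags (using Semantic Versioning), and automatically opens pull requests. Paul walked through Renovate's core features, such as the **Dependency Dashboard**, which lists every pending update in a single GitHub issue and gives a global view. For implementation, the team defines its management policy in a `renovate.json` file in each repository, with support for Terraform, Terragrunt, Docker, pre-commit hooks, and other stacks.
-
-> The solution is now integrated into the Jenkins pipeline. Early on the team hit problems with GitHub Enterprise compatibility and with Jenkins struggling under a large number of concurrent PRs, but by running Renovate in a local Podman container and configuring sensible rate limiting, it achieved automated, standardized dependency updates. This improves infrastructure security (vulnerabilities are patched promptly) and keeps development and production configurations consistent.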
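-
-A minimal `renovate.json` sketch of the kind of policy described above. The values (the limits and the manager list) are illustrative assumptions, not the team's actual configuration; the option names are standard Renovate configuration options.
-
-```json
-{
-  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
-  "extends": ["config:recommended"],
-  "dependencyDashboard": true,
-  "enabledManagers": ["terraform", "terragrunt", "dockerfile", "pre-commit"],
-  "prHourlyLimit": 2,
-  "prConcurrentLimit": 5
-}
-```
-
-Capping `prHourlyLimit` and `prConcurrentLimit` is one way to keep a CI server such as Jenkins from being flooded with pull request builds when Renovate first scans a repository.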
-
---
-
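-## Key Concepts
-
-- **Renovate Bot**: an open-source dependency update automation tool that scans code and opens PRs automatically to keep dependencies current.
-- **Dependency Management**: the process of tracking, updating, and maintaining the versions of the external libraries, modules, and images a project references.
-- **Terragrunt**: a thin wrapper around Terraform for handling multi-environment configuration, reducing duplicated code (DRY), and managing remote state.
-- **Semantic Versioning (SemVer)**: a versioning scheme that typically uses the `MAJOR.MINOR.PATCH` format; Renovate uses it to determine the level of an update.
-- **Dependency Dashboard**: an issue that Renovate creates automatically in the GitHub repository to summarize dependency status, pending PRs, and update options.
-- **Managers**: Renovate's plugin mechanism for detecting and handling specific types of dependency files (for example, the `terraform` manager handles `.tf` files and the `dockerfile` manager handles image tags).
-- **Rate Limiting**: to keep the tool from opening so many PRs at once that the CI/CD system is overwhelmed, Renovate can limit the number of PRs opened per hour or open at the same time.
-- **Pre-commit Hooks**: scripts that run before code is committed; Renovate can also update the versions of these hooks automatically.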
-
---
-
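-## Related Videos
-
-> [!info]+ Cross-references
-> [[Pre-commit Hooks and Linting Sessions]]: Paul mentions in the video that Neurangin previously covered pre-commit formatting and security scanning; Renovate also keeps those hook versions up to date.
-> [[Terraform and Terragrunt Best Practices]]: this video explores how to automate maintenance of module references for these Infrastructure as Code (IaC) tools.
-
-## Related Videos
-
-> Links to paired video notes (to be filled in once generated)
-
---
-
-*Last updated: 2026-04-14*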
--- a/SRE/06_CI_CD_GitOps/ctp-topic-2-git.md
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-2-git.md
@@ -1,50 +0,0 @@
---
-title: "CTP Topic 2 Git"
-type: cloud-learning
-source-type: video
-category: "DevOps & SRE/06_CI_CD_GitOps"
-tags:
-  - Git
-  - VCS
-  - CTP
-date-added: 2026-04-14
-video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 2_ Git.mp4"
-audio-source: ""
-status: raw
---
-
-# CTP Topic 2 Git
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 2_ Git.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
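-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*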
--- a/SRE/06_CI_CD_GitOps/ctp-topic-3-deploy-and-maintain-infrastructure.md
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-3-deploy-and-maintain-infrastructure.md
@@ -1,64 +0,0 @@
---
-title: CTP Topic 3 Deploy and maintain infrastructure
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - IaC
-  - Deployment
-  - CI/CD
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 3 Deploy and maintain infrastructure
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
-## Summary
-
-> ## Deploying and Maintaining Infrastructure
-
-The session focuses on deploying and maintaining infrastructure, clarifying Terraform, Terragrunt, modules, and service catalogs within the landing zone context. It emphasizes the structure of Git repositories and how Terraform and Terragrunt files interact.
-
-When a landing zone is provisioned, product teams are grouped, each having a landing zone and workload accounts. A product team, such as DevTools, deploys infrastructure to meet specific requirements across accounts like Artifactory and Active Directory. This involves multiple Git repositories, including the core landing zone repository, Terraform service catalog, and a product team service catalog.
-
-A service module consists of a main.tf file that references other repositories, grouping modules to fulfill a business requirement, such as an Active Directory or DNS service. *When deploying infrastructure, Terragrunt HCL files are used to reference these services, targeting specific versions rather than the master branch.* These files may declare dependencies to reference values across services; dependencies are preferred over reading state files directly.
-
-When referencing modules within the current codebase, a relative path can be used, but the preferred approach is to have a dedicated service catalog with a modules directory. This allows for independent release cycles and better maintainability. Modules can be used within one account, reused within a product team (in the product team service catalog), or used across product teams (in the Terraform service catalog).
-
-*A service is a business requirement, while a regular module is a technical requirement.* A service deploys a set of multiple modules, abstracting them. The higher up the chain, the fewer configuration options are available, similar to an object-oriented approach.
-
-Terragrunt fetches all references before running, using a Terragrunt cache directory to store cloned repositories. Terragrunt can be run at the directory level, considering dependencies, but applying without verification is discouraged. Jenkins jobs can be enhanced for debugging, and documentation should be comprehensive, referencing Gruntwork as a model. Versioning modules should follow major, minor, and patch conventions.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/06_CI_CD_GitOps/ctp-topic-3-deploy-and-maintain-infrastructure.md.bak
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-3-deploy-and-maintain-infrastructure.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 3 Deploy and maintain infrastructure
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - IaC
-  - Deployment
-  - CI/CD
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 3 Deploy and maintain infrastructure
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
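-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*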
--- a/SRE/06_CI_CD_GitOps/ctp-topic-32-using-atlantis-cicd-for-infrastructure-deployments.md
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-32-using-atlantis-cicd-for-infrastructure-deployments.md
@@ -1,61 +0,0 @@
---
-title: CTP Topic 32 Using Atlantis CICD for infrastructure deployments
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - Atlantis
-  - CI/CD
-  - IaC
-  - Terraform
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 32 Using Atlantis CICD for infrastructure deployments
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
-## Summary
-
-> ## Atlantis CICD: Replacing Jenkins for Infrastructure Deployments
-
-The presentation introduces Atlantis, an automation tool that lets teams collaborate on Terraform code and is intended to replace Jenkins for infrastructure deployments. Atlantis addresses the speed and complexity problems of the current pipeline. *The current pipeline is practically very slow* because of long initialization times, repeated cloning of code, sequential testing, and ECS deployer provisioning. Its complexity stems from continuous tweaking to integrate more features and cover edge cases, which has led to fragility and drift.
-
-Atlantis is standalone, self-hosted, free, and open source, with an active community. It offers a better collaboration model, simplified networking, and cost savings by removing the need for numerous VPC endpoints. Atlantis applies changes before merging, which keeps the code in sync with the infrastructure. The workflow is simpler: users talk to Atlantis directly from GitHub by commenting on pull requests, so no separate accounts or integrations are needed.
-
-Atlantis is hosted on a single EC2 instance in each landing zone's shared account and is notified by GitHub Enterprise through webhooks. It uses service accounts to interact with GitHub: posting comments, merging, and closing PRs. Cross-account access is managed through deployed key roles in each account, which are used for both simple and cross-account module deployments. User management is controlled on GitHub, and build logs are stored in comments for auditing. Atlantis enforces apply requirements, such as mergeability and peer approval, before applying changes. Auto-merge is enabled, so a pull request is merged automatically after a successful apply. Parallel builds are supported, running plan and apply commands concurrently for multiple modules.
-
-Atlantis locking prevents conflicts: *when a plan is run, the directory of each module is locked until the pull request holding the lock is merged or closed, or the plan is manually discarded.* Module and data file dependencies can be declared so that a plan is triggered when a dependency changes. Documentation, troubleshooting guides, and a list of migrated repositories are available to assist users.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/06_CI_CD_GitOps/ctp-topic-32-using-atlantis-cicd-for-infrastructure-deployments.md.bak
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-32-using-atlantis-cicd-for-infrastructure-deployments.md.bak
@@ -1,52 +0,0 @@
---
-title: CTP Topic 32 Using Atlantis CICD for infrastructure deployments
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - Atlantis
-  - CI/CD
-  - IaC
-  - Terraform
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 32 Using Atlantis CICD for infrastructure deployments
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
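-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*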
--- a/SRE/06_CI_CD_GitOps/ctp-topic-33-an-introduction-to-gitops.md
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-33-an-introduction-to-gitops.md
@@ -1,72 +0,0 @@
---
-title: CTP Topic 33 An introduction to GitOps
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - GitOps
-  - CI/CD
-  - Git
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4
-audio-source: ""
-status: summarized (Gemini summary)
---
-
-# CTP Topic 33 An introduction to GitOps
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** ✅ Completed (Gemini summary)
-
---
-
-## Summary
-
-> Victor Etkin presents an introduction to GitOps, explaining how it complements DevOps. GitOps applies software development principles to deployment processes, potentially resolving challenges like failed deployments and configuration inconsistencies.
-
-Key benefits of GitOps:
-*   Increased developer productivity using familiar tools.
-*   Minimized failed deployments with easy rollback capabilities.
-*   Faster feature releases.
-*   Real-time auditing and improved security through Git's features.
-
-GitOps uses Git workflows, CD pipelines, and infrastructure as code. Observability is crucial for ensuring the desired and actual states align. GitOps is often used with Kubernetes but can be applied elsewhere.
-
-The four principles of GitOps are declarative configuration, version control, CD process separation, and incremental infrastructure implementation. Git serves as the primary tool, storing both deployment infrastructure and application configurations. A GitOps controller reconciles the state declared in Git with the actual system state. *The only tool a developer needs to know is Git.*
-
-The goal is full automation, with code changes deployed safely in minutes. CI and CD should be decoupled. A basic GitOps workflow for Kubernetes involves developers committing code, creating container images, storing deployment configurations in Git, monitoring changes via a GitOps agent, and rolling out images to environments.
-
-CI focuses on building and analyzing code, while CD focuses on deploying binaries. Separating CI and CD enhances security. CD tools can run inside container platforms like Kubernetes for added security.
-
-GitOps enables on-demand incremental deployment, which benefits microservices architectures. CD processes require an idempotent platform such as Kubernetes. *An idempotent operation is one that can be applied multiple times without changing the result beyond the initial application.*
-
-CD processes can be implemented using push or pull models. The pull model, which monitors both Git and the target system, is recommended for GitOps. Human intervention is still needed for issues like resource failures. GitOps simplifies operations, allowing developers to focus on more valuable activities.
-
-GitOps is a logical evolution of DevOps, simplifying adoption and enhancing portability. Git commit logs become audit trails, streamlining compliance.
-
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*
--- a/SRE/06_CI_CD_GitOps/ctp-topic-33-an-introduction-to-gitops.md.bak
+++ b/SRE/06_CI_CD_GitOps/ctp-topic-33-an-introduction-to-gitops.md.bak
@@ -1,51 +0,0 @@
---
-title: CTP Topic 33 An introduction to GitOps
-type: cloud-learning
-source-type: video
-category: DevOps & SRE/06_CI_CD_GitOps
-tags:
-  - GitOps
-  - CI/CD
-  - Git
-  - CTP
-date-added: 2026-04-14
-video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4
-audio-source: ""
-status: raw
---
-
-# CTP Topic 33 An introduction to GitOps
-
-**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4`
-
-**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
-
-**Status:** 🟡 Awaiting Whisper transcription → Summary
-
---
-
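-## Summary
-
-> To be generated by an LLM after transcription
-
---
-
-## Key Concepts
-
-
-
---
-
-## Action Items
-
-
-
---
-
-## Related Videos
-
-> Paired video note links (to be filled in after generation)
-
---
-
-*Last updated: 2026-04-14*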