Auto-sync: wiki-ingest 3 sources (2026-04-16)
@@ -11,7 +11,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 1_ Gruntwork Landing Zone Architecture.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 1 Gruntwork Landing Zone Architecture
@@ -12,7 +12,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 10_ AWS Landing Zone (LZ) Data Collection, Tagging _ Related Security.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 10 AWS Landing Zone (LZ) Data Collection, Tagging Related Security
@@ -21,7 +21,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary
**Status:** ✅ Completed (Gemini summary)

---
@@ -11,7 +11,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 14_ Octane Hub on AWS_ Real life experience moving production services into the new land.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 14 Octane Hub on AWS: Real-Life Experiences
@@ -20,7 +20,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** ✅ Summary completed
**Status: ✅ Summary completed**

---
@@ -12,7 +12,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 17_ Active Directory Services in Gruntwork AWS LZs.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 17 Active Directory Services in Gruntwork AWS LZs
@@ -21,7 +21,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary
**Status: 🟡 Awaiting Whisper transcription → Summary**

---
@@ -1,8 +1,8 @@
---
title: "CTP Topic 25 Labs Landing Zone overview - ITOM teams"
title: CTP Topic 25 Labs Landing Zone overview - ITOM teams
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- AWS
- Landing-Zone
@@ -10,9 +10,9 @@ tags:
- ITOM
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 25 Labs Landing Zone overview - ITOM teams
@@ -27,7 +27,26 @@ status: raw
## Summary

> To be generated by an LLM after transcription
> ## Labs Landing Zone Overview

The Labs landing zone is based on the Gruntwork reference architecture and AWS standards, and it uses a multi-account strategy. The entire stack is managed as infrastructure as code (Terraform), using a library of common functions that is open for review and modification. *Everything should be managed using Terraform or some other code-based mechanism.*

Key components include:

* **Shared Account:** Hosts the Jenkins master for the CI/CD pipeline (Gruntwork production grade), hardened AMIs, and a Docker container store.
* **Logs Account:** Secure storage for AWS Config and CloudTrail logs, with access controlled by the security team.
* **Security Account:** Manages user accounts and access, primarily for cross-account access and shared accounts; most access is federated.
* **Core Accounts:**
    * Active Directory: Manages Windows instances and IdPs (all in Swimford.net).
    * DNS: Manages AWS Swimford.net, allowing for local domains or references to the wider infrastructure.
* **Network Account:** Central hub for network communication, managing traffic via Transit Gateway and a Check Point firewall. All internet access is routed through this account and is managed by the network team via tags. Pulse VPN access is also managed here and provides access to the Micro Focus network.
* **Shared Service Accounts:** Provide access to services such as monitoring (ArcSight) and Qualys.
* **Product Account:** The primary working environment, built from the standard infrastructure-as-code modules. A product can have multiple accounts (production, staging, development). Logs are shipped to the logs account, and Jenkins manages automation within the account.

When deploying a product account, the key requirements are to define IP address ranges and to agree on specific tags with the network team for firewall access. *Access through that firewall is all managed by tags.* The team recommends using their Terraform modules for deploying subnets.

The standard Jenkins-based pipelines scan GitHub Enterprise repositories for changes and run a Terragrunt plan or apply depending on the branch. Internet connectivity is restricted; access to specific corporate network locations requires a request to the network services team. The pipelines are continuously being improved for robustness and security, including pre-commit checks and Fortify scans.

---
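The branch-dependent pipeline behaviour described in the Labs summary could be sketched roughly as follows. This is only an illustration: the session does not say which branches trigger an apply, so `APPLY_BRANCHES`, `PipelineStep`, and `terragrunt_step` are hypothetical names, and `main` is an assumed convention.

```python
from dataclasses import dataclass

# Assumption: only this branch triggers an apply. The session does not name
# the branches, so adjust the set to the real convention.
APPLY_BRANCHES = frozenset({"main"})


@dataclass(frozen=True)
class PipelineStep:
    """The Terragrunt invocation chosen for one branch (hypothetical type)."""
    branch: str
    command: tuple


def terragrunt_step(branch: str) -> PipelineStep:
    """Pick `terragrunt apply` for apply branches and `terragrunt plan` otherwise."""
    action = "apply" if branch in APPLY_BRANCHES else "plan"
    return PipelineStep(branch=branch, command=("terragrunt", action))


if __name__ == "__main__":
    for name in ("main", "feature/add-subnet"):
        print(name, "->", " ".join(terragrunt_step(name).command))
```

Keeping the decision in one small function makes it easy to review which branches are allowed to change real infrastructure.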
@@ -0,0 +1,52 @@
---
title: CTP Topic 25 Labs Landing Zone overview - ITOM teams
type: cloud-learning
source-type: video
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- AWS
- Landing-Zone
- Labs
- ITOM
- CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4
audio-source: ""
status: raw
---

# CTP Topic 25 Labs Landing Zone overview - ITOM teams

**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 25_ Labs Landing Zone overview - ITOM teams.mp4`

**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary

---

## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
@@ -11,7 +11,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 26_ Standard AMI – build, publish, share processes.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 26 Standard AMI – build, publish, share processes
@@ -20,7 +20,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary
**Status:** ✅ Completed (Gemini summary)

---
@@ -12,7 +12,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 28_ AWS Tag Validation Tool.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 28 AWS Tag Validation Tool
@@ -21,7 +21,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary
**Status:** ✅ Completed (Gemini summary)

---
@@ -1,16 +1,16 @@
---
title: "CTP Topic 34 Azure Landing Zone Architecture Overview"
title: CTP Topic 34 Azure Landing Zone Architecture Overview
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- Azure
- Landing-Zone
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 34 Azure Landing Zone Architecture Overview
@@ -25,7 +25,16 @@ status: raw
## Summary

> To be generated by an LLM after transcription
> ## Azure Landing Zone Architecture Overview

Kishore Garlopati presents an overview of the upcoming Azure Landing Zones implementation within Micro Focus, explaining how it will simplify Azure adoption for teams and let them deploy workloads to the Azure cloud. The primary goal is to minimize cross-team dependencies through automation, giving teams more independence when deploying solutions in the Azure environment.

The architecture begins with enrollment into Azure Enterprise, with Azure Active Directory used for user authentication. Azure uses management groups, similar to parent directories in Windows, to organize the entities within Micro Focus. These are divided into four areas: platform, landing zones, decommission, and sandbox. The platform includes identity management and connectivity subscriptions, each with a specific purpose and managed by a dedicated team to enhance security. *The core reason of these individual or isolated subscriptions is you are basically containing a subscription for a specific purpose.*

Identity subscriptions manage access policies, while connectivity subscriptions serve as the central hub for all inbound and outbound Azure traffic and incorporate security measures such as DDoS protection and Check Point firewalls. Landing zones are designed to be scalable, modular, and fully automated, providing a template-based approach for new projects. They emphasize identity and access management, auditing, compliance, security monitoring, and networking. Decommissioned subscriptions hold unused resources, and sandbox subscriptions offer isolated environments for experimentation. *This sandbox is an interesting one because these landings on subscriptions allows your workloads.*

Privileged Identity Management (PIM) and privileged access groups manage user access, ensuring that the appropriate roles and policies are enforced. Terraform Cloud is used for infrastructure automation, using Terraform states to manage dependencies between subscriptions. This layered approach lets teams access the data they need without exposing sensitive information.

---
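The management-group layout described in the Azure summary can be pictured as a small lookup table. The sketch below is illustrative only: the four top-level groups and the two platform subscriptions come from the summary, while `example-workload`, `MANAGEMENT_GROUPS`, and `management_group_of` are hypothetical names.

```python
from typing import Dict, List, Optional

# Top-level groups as listed in the session summary. The subscription names
# under "landing-zones" are placeholders, not real subscriptions.
MANAGEMENT_GROUPS: Dict[str, List[str]] = {
    "platform": ["identity", "connectivity"],
    "landing-zones": ["example-workload"],
    "decommission": [],
    "sandbox": [],
}


def management_group_of(subscription: str) -> Optional[str]:
    """Return the management group that contains a subscription, if any."""
    for group, subscriptions in MANAGEMENT_GROUPS.items():
        if subscription in subscriptions:
            return group
    return None


if __name__ == "__main__":
    print(management_group_of("connectivity"))
```

Because each subscription sits under exactly one group, policies attached to a group apply to a clearly bounded set of subscriptions.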
@@ -0,0 +1,50 @@
---
title: CTP Topic 34 Azure Landing Zone Architecture Overview
type: cloud-learning
source-type: video
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- Azure
- Landing-Zone
- CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4
audio-source: ""
status: raw
---

# CTP Topic 34 Azure Landing Zone Architecture Overview

**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 34_ Azure Landing Zone Architecture Overview.mp4`

**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary

---

## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
@@ -1,8 +1,8 @@
---
title: "CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)"
title: CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- AWS
- Landing-Zone
@@ -10,9 +10,9 @@ tags:
- Labs
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
@@ -27,7 +27,14 @@ status: raw
## Summary

> To be generated by an LLM after transcription
> ## AWS Landing Zone Design Refresher

This session provides an overview of the AWS landing zones, focusing on their design, recent updates, and the differences between the SaaS and Labs environments. The primary goal of the landing zones is to support diverse AWS use cases while ensuring reuse, control, auditing, and management. *Our AWS landing zones, they're built infrastructure as code as you'd expect on Terraform templates using the Gruntwork framework.*

AWS SaaS landing zones offer customer-dedicated environments with product accounts for each product area, such as Snacks. These accounts connect to shared services accounts for security, logging, and networking. The core accounts group includes the Active Directory, DNS, and network accounts that support IT services within the Micro Focus infrastructure. The shared service accounts host services such as Artifactory, Cyber Qualys, Cyber ePO, ArcSight, and monitoring. Gruntwork accounts manage AMIs, logs, and security across all accounts. Product accounts host IT products, projects, applications, and supporting AWS resources, and are managed by the individual project teams.

Recent changes to the landing zones include network segmentation to block direct connectivity to SaaS workloads, decommissioning of the Gruntwork CloudTrail in favor of the CCoE's CloudTrail, and a proposed rerouting of ingress traffic via Check Point firewalls in the network account. Native AWS Backup is likely to be mandated, and management VPCs may be removed for new accounts. The key difference between SaaS and Labs is that SaaS is for production while Labs is for development, with plans to introduce internet access into Labs. *Basically, the only answer is that SaaS is production, Labs is development.* The PoC landing zone will be combined with Labs to maximize shared resources. The Cloud Technology Design Forum aims to standardize and centralize Micro Focus's cloud delivery offering, including landing zone designs.

---
@@ -0,0 +1,52 @@
---
title: CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)
type: cloud-learning
source-type: video
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- AWS
- Landing-Zone
- SaaS
- Labs
- CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4
audio-source: ""
status: raw
---

# CTP Topic 35 AWS Landing Zone Design Refresher (SaaS Labs)

**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 35_ AWS Landing Zone Design Refresher (SaaS _ Labs).mp4`

**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary

---

## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
@@ -1,8 +1,8 @@
---
title: "CTP Topic 40 SaaS Database Architecture On AWS Cloud"
title: CTP Topic 40 SaaS Database Architecture On AWS Cloud
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- SaaS
- Database
@@ -10,9 +10,9 @@ tags:
- AWS
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 40 SaaS Database Architecture On AWS Cloud
@@ -27,7 +27,18 @@ status: raw
## Summary

> To be generated by an LLM after transcription
> ## SaaS Database Architecture on AWS Cloud

The SaaS database team is a global team located in the US, Canada, India, and Israel, providing 24/7 support. The team consists of certified professionals, including Oracle certified professionals, DBAs, and security professionals. They manage over 500 databases and more than 1,000 DB servers on-premises and in the public cloud, and have migrated numerous DB servers and databases to the public cloud.

The team supports several locations: Sacramento and Reading for on-premises data centers, and AWS regions including Canada, Frankfurt, London, Oregon, North Virginia, and Sydney. They support database flavors such as Oracle, Vertica, Postgres, DynamoDB, SQL Server, MongoDB, and MySQL, and use AWS technologies such as Aurora PostgreSQL, Elasticsearch, AWS RDS, EFS, S3, and EBS. Databases reside mostly in application VPCs with integrated security measures.

For database monitoring, performance tuning, and gap analysis, the team uses tools such as Micro Focus SiteScope, Oracle OEM, Ignite, AWS CloudWatch, and Quest Foglight. Day-to-day operations are managed through a ticketing tool, with an on-call DBA resource. The team actively participates in squads and executes a minimum of 10 changes a month, handling 400-500 SSRs and IMs monthly. They provide layer 1 and layer 3 support, using shell scripting, Terraform, the AWS CLI, and PowerShell for automation. *Data center migrations and cloud provisioning were key automation projects.*

Key projects include data center migrations, onboarding new customers, database security enhancements, DB-AD integrations, SOX compliance, database consolidation, and DB patching. The team is also working on Oracle GoldenGate for multi-tenancy, adopting cloud-native technologies, and enhancing the Pretty Tool for on-demand backups and database migrations. Future plans involve new AMI automations, storage compression, Reserved Instance (RI) optimization, AWS cloud-native backups, and enhancements to the DB apps tool. *The idea was to move those databases seamless without downtime or with minimum downtime.*

For high availability, Oracle uses Data Guard, Postgres uses a classic active-passive mechanism (with plans to move to active-active), and RDS uses RDS high availability. Databases run across three availability zones within a region: the primary database in one zone, a standby database in the second, and a witness in the third that observes the pair and manages failovers. Reporting databases have a read-only warehouse in the third availability zone, with secure VPN access for customers to run operational warehousing queries.

---
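The primary, standby, and witness arrangement in the high-availability paragraph can be illustrated with a toy failover rule. This is a conceptual sketch under stated assumptions, not how Data Guard or any specific product decides a failover; `Node`, `failover`, and the zone names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """One database or observer instance in an availability zone (hypothetical)."""
    name: str
    zone: str
    role: str  # "primary", "standby", or "witness"
    healthy: bool = True


def failover(nodes: List[Node]) -> List[Node]:
    """Swap roles when the witness is up, the primary is down, and the standby is up."""
    by_role = {node.role: node for node in nodes}
    primary = by_role["primary"]
    standby = by_role["standby"]
    witness = by_role["witness"]
    if witness.healthy and not primary.healthy and standby.healthy:
        primary.role, standby.role = "standby", "primary"
    return nodes


if __name__ == "__main__":
    cluster = [
        Node("db-a", "az-1", "primary", healthy=False),
        Node("db-b", "az-2", "standby"),
        Node("observer", "az-3", "witness"),
    ]
    for node in failover(cluster):
        print(node.name, node.zone, node.role)
```

Placing the witness in a third zone means that losing any single zone still leaves two members that can agree on which database should be primary.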
@@ -0,0 +1,52 @@
---
title: CTP Topic 40 SaaS Database Architecture On AWS Cloud
type: cloud-learning
source-type: video
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- SaaS
- Database
- Architecture
- AWS
- CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4
audio-source: ""
status: raw
---

# CTP Topic 40 SaaS Database Architecture On AWS Cloud

**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 40_ SaaS Database Architecture On AWS Cloud.mp4`

**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary

---

## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
@@ -11,7 +11,7 @@ tags:
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 44_ AWS Backup in Micro Focus.mp4"
audio-source: ""
status: summarized
status: summarized (Gemini summary)
---

# CTP Topic 44 AWS Backup in Micro Focus
@@ -20,7 +20,7 @@ status: summarized
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** ✅ Summary completed
**Status:** ✅ Completed (Gemini summary)

---
@@ -1,17 +1,17 @@
---
title: "CTP Topic 46 NetApps on AWS"
title: CTP Topic 46 NetApps on AWS
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- NetApp
- AWS
- Storage
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 46 NetApps on AWS
@@ -26,7 +26,53 @@ status: raw
## Summary

> To be generated by an LLM after transcription
> ## NetApp on AWS: A Cloud Transformation Program Learning Session

Sandeep and Yael presented a training session on NetApp covering basic components, architecture, data tiering, security, backup and DR strategy, migration from on-premises to cloud, current NetApp usage, and a demonstration.

### Traditional NetApp

NetApp is a storage system with ONTAP as its operating system. It consists of controller nodes connected to disk enclosures and supports SSD, SATA, SAS, and FC disks. NetApp primarily supports the SMB, NFS, FC, FCoE, and iSCSI protocols and is typically configured as a single node or as a high-availability (HA) pair.

Key components include:

* **Aggregate:** A collection of disks forming a RAID group.
* **Volume (FlexVol):** A data container hosted on top of an aggregate and presented to hosts for data storage, accessible via NFS or CIFS.
* **Qtree:** A further segmentation of a volume, similar to directories in UNIX or folders in Windows, with special attributes such as permissions and quota management.
* **LUN (Logical Unit Number):** A logical representation of storage, hosted on a volume or qtree and presented to hosts via FC or iSCSI as block-level storage.
* **Logical Interface (LIF):** An interface on top of a physical network card that hosts an IP address or WWPN and is used for node management, inter-cluster replication, cluster management, and data serving.
* **Storage Virtual Machine (SVM):** A virtual segmentation of a NetApp system that enables multi-tenancy. Each SVM is treated as a separate operating system, with no data flow between SVMs. *At least one SVM is needed for a cluster.*

### NetApp in AWS (Cloud Volumes ONTAP - CVO)

CVO is a software-only storage appliance hosted on EC2 instances, which function as the nodes. It can be deployed as a single node or as an HA pair, with a mediator instance that assists during takeover and giveback. The nodes of an HA pair are deployed across multiple availability zones with synchronous replication. EBS disks (gp3, gp2, io1, st1) are used as storage and are managed via Cloud Manager.

High availability is maintained through a floating IP: clients access data via a single IP address that migrates to the serving node if a node fails. Takeover is the process in which the surviving node takes over services from a failed node, and giveback is the process in which it relinquishes them once the failed node recovers.

### Data Tiering

Data tiering uses different storage media to optimize cost, performance, and availability. NetApp in AWS stores active data on EBS and inactive data on S3. Data that has been inactive for 30 days or more is automatically moved to S3 and is pulled back to EBS when it is accessed. *NetApp stores the active data in EBS and inactive data to S3.*
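The 30-day rule above can be expressed as a tiny placement function. This is a minimal sketch of the policy as the session describes it, not NetApp's implementation; `COOLING_PERIOD` and `storage_tier` are hypothetical names.

```python
from datetime import datetime, timedelta

# Inactivity threshold stated in the session: 30 days or more moves data to S3.
COOLING_PERIOD = timedelta(days=30)


def storage_tier(last_access: datetime, now: datetime) -> str:
    """Return "S3" for data inactive for 30 days or more, otherwise "EBS"."""
    return "S3" if now - last_access >= COOLING_PERIOD else "EBS"


if __name__ == "__main__":
    today = datetime(2026, 4, 16)
    print(storage_tier(today - timedelta(days=45), today))
    print(storage_tier(today - timedelta(days=3), today))
```

Reading cold data counts as an access, so in this model a block that is read again would return to the EBS tier on the next evaluation.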

### Data Security

NetApp supports encryption via AWS Key Management Service and via the NetApp encryption solution (volume or aggregate encryption), both offering 256-bit encryption. Virus scanning is integrated with McAfee Antivirus (VSES) and uses an external scan server. Scanning options are on-access scanning (for SMB/CIFS) and on-demand scanning (for NFS).

### Backup and DR

Snapshots are point-in-time, read-only file system images that copy volumes using pointers, which minimizes space consumption. SnapMirror is a tool for replicating data between NetApp systems; it copies volumes together with their snapshots. It requires peering relationships between clusters and between SVMs, and encryption is optional. The baseline copy performs the initial full data replication, while subsequent updates copy only the changes. Destination volumes in a SnapMirror relationship are read-only.
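The difference between a baseline copy and a later update can be shown with a toy block map. This is a conceptual model only and does not reflect SnapMirror's actual snapshot-based mechanism; `replicate` and the integer block IDs are hypothetical.

```python
from typing import Dict


def replicate(source: Dict[int, bytes], destination: Dict[int, bytes]) -> int:
    """Copy new or changed blocks to the destination and return how many were sent."""
    transferred = 0
    for block_id, data in source.items():
        if destination.get(block_id) != data:
            destination[block_id] = data
            transferred += 1
    # Drop blocks that no longer exist on the source.
    for block_id in list(destination.keys() - source.keys()):
        del destination[block_id]
    return transferred


if __name__ == "__main__":
    volume = {0: b"alpha", 1: b"beta", 2: b"gamma"}
    mirror: Dict[int, bytes] = {}
    print("baseline blocks sent:", replicate(volume, mirror))
    volume[1] = b"beta-v2"
    print("update blocks sent:", replicate(volume, mirror))
```

The first call behaves like a baseline because the destination is empty, so every block is sent; the second call sends only the block that changed.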

### Migration

Tools for migrating from on-premises to AWS include:

* **SnapMirror:** Fast, block-level replication that preserves deduplication and compression.
* **NetApp XCP:** File-based tool that copies data at the file level using concurrent sessions.
* **NetApp Cloud Sync:** Used for AWS migrations; supports NetApp to NetApp, NFS, SMB, NetApp to S3/EFS, and EFS/S3 to NetApp.
* **AWS DataSync:** AWS-provided file-based tool for NetApp to EFS or S3 migrations.
* **Silver Peak:** A WAN optimizer that compresses packets.

### Current NetApp Usage and Future Plans

The organization has around 15 NetApp clusters in various AWS regions, hosting approximately 1.3 petabytes of data. Cloud Manager is used for central management, and storage operations maintains and supports the NetApp systems. Monitoring is currently done through SiteScope and WebTool, with plans to use AWS native services. S3 tiering is enabled for most NetApp systems, and FSx for NetApp is under proof of concept. There are also plans to use Terraform for deploying NetApp systems.

---
@@ -0,0 +1,51 @@
---
title: CTP Topic 46 NetApps on AWS
type: cloud-learning
source-type: video
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- NetApp
- AWS
- Storage
- CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4
audio-source: ""
status: raw
---

# CTP Topic 46 NetApps on AWS

**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 46_ NetApps on AWS.mp4`

**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone

**Status:** 🟡 Awaiting Whisper transcription → Summary

---

## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
@@ -1,16 +1,16 @@
---
title: "CTP Topic 47 Enterprise Architecture Cloud Standards"
title: CTP Topic 47 Enterprise Architecture Cloud Standards
type: cloud-learning
source-type: video
category: "DevOps & SRE/01_AWS-Landing-Zone"
category: DevOps & SRE/01_AWS-Landing-Zone
tags:
- Enterprise-Architecture
- Cloud-Standards
- CTP
date-added: 2026-04-14
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4"
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4
audio-source: ""
status: raw
status: summarized (Gemini summary)
---

# CTP Topic 47 Enterprise Architecture Cloud Standards
@@ -25,7 +25,21 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## Enterprise Architecture Cloud Standards
|
||||
|
||||
[slide:N]
|
||||
The session will cover landing zones, their purpose, the role of enterprise architecture in cloud environments, guardrails, and the need for community input. The speaker, Lindsay, an enterprise architect with a development background, aims to provide a learner's perspective on cloud architecture.
|
||||
|
||||
A landing zone is a framework for hosting cloud workloads, focusing on security, compliance, and manageability. Key components include account structure, networking, security, access management, and telemetry. *The account structure aligns with environments (dev, staging, production), and roles define access based on zero trust and least privilege principles.* The landing zone provides pre-configured networking and security, reducing the security review burden on application teams. Centralized logging and auditing are provided within the framework.
|
||||
|
||||
Benefits of using landing zones include a pre-designed security model, pre-built compliance, and visible cost control. Infrastructure automation, using Terraform, enables efficient environment configuration. *Terraform allows specifying the desired environment in code, promoting standardization and testability.* Terragrunt, a wrapper for Terraform, aids in generating different environments. The framework eliminates reinvention, allowing application teams to focus on application-specific tasks.
|
||||
|
||||
Enterprise architecture helps articulate the cloud architecture, informing application teams about available resources and requirements. Guardrails capture mandatory requirements and optimal practices for scalability, cost minimization, and flexibility. The enterprise architecture team has created a page on the intranet site with business architecture concepts, data connections, application information, and technology roadmaps.
|
||||
|
||||
The cloud guardrails document covers design concepts, capabilities, and best practices. Key design concepts include cloud-first, leveraging well-architected frameworks, infrastructure as code (Terraform), and resource tagging. The document provides guidance on executable packaging, functional partitioning, capacity management, and identity management.
|
||||
|
||||
Executable packaging prioritizes using existing cloud services and managed services to minimize custom code. Functional partitioning involves breaking monolithic applications into smaller, independent blocks or serverless functions. The speaker emphasizes the need for input from application teams to refine the guardrails and incorporate real-world experiences. *We want your knowledge collected here for reuse and to help other app developers down the road.*
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: CTP Topic 47 Enterprise Architecture Cloud Standards
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- Enterprise-Architecture
|
||||
- Cloud-Standards
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 47 Enterprise Architecture Cloud Standards
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 47_Enterprise Architecture Cloud Standards.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 50 AMI Roadmap for AWS AMIs"
|
||||
title: CTP Topic 50 AMI Roadmap for AWS AMIs
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- AMI
|
||||
- Roadmap
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 50 AMI Roadmap for AWS AMIs
|
||||
@@ -26,7 +26,18 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## AMI Roadmap for AWS AMIs
|
||||
|
||||
The Cloud Transformation Program held a learning session to discuss the AMI roadmap for AWS AMIs. The session covered the CCOE AMI roadmap, end-of-life operating systems, AMI notifications, change logs, new features, the process for adding new AMIs, current supported AMIs, and the roadmap.
|
||||
|
||||
The CCOE provides hardened AMIs on a bi-monthly basis, aligned with security standards. The session focused on the roadmap, not the hardened AMIs themselves. The currently available AMIs include three versions of Ubuntu, CentOS 7 and 8, Red Hat 8.4 ARM, Amazon Linux 2, and four versions of Windows operating systems.
|
||||
|
||||
The roadmap includes planned releases for new operating systems. In November, SLES 15 and Red Hat 9 will be released. In January 2023, openSUSE 15 and Amazon Linux 2022 will be added. In March 2023, Rocky 8 and Rocky 9 will be available. May 2023 will see Red Hat 9.4 ARM and Ubuntu 22.04 ARM. *Starting May 2023, all ARM processors related to AMIs will be released.* The order was driven mainly by ADM requirements. Requests to change the roadmap's prioritization should go through the demand pipeline process.
|
||||
|
||||
Windows Server 2008 and 2008 R2 have been end-of-life since January 2020 and CentOS 8 since December 2021; Windows Server 2012 will reach end-of-life by October 2023. Red Hat 7 and CentOS 7 will both be end-of-life by June 2024. AMI notifications are sent via email to those on the CCOE notifications PDL. A change log is now available in the CCRE portal, listing the changes since the previous release. *This change log focuses on changes done by CCRE.*
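
The end-of-life dates above lend themselves to a simple lookup that a build pipeline could consult before launching an instance. This is a sketch only: the summary gives months, not days, so each date is pinned to the first of the month as an assumption, and `is_end_of_life` is a hypothetical helper name.

```python
from datetime import date

# End-of-life months as stated in the session summary. The day of the month is
# not given there, so the first of the month is assumed for illustration.
END_OF_LIFE = {
    "Windows Server 2008": date(2020, 1, 1),
    "Windows Server 2008 R2": date(2020, 1, 1),
    "CentOS 8": date(2021, 12, 1),
    "Windows Server 2012": date(2023, 10, 1),
    "Red Hat 7": date(2024, 6, 1),
    "CentOS 7": date(2024, 6, 1),
}


def is_end_of_life(os_name: str, on: date) -> bool:
    """Return True if the operating system is past its end-of-life date on `on`."""
    if os_name not in END_OF_LIFE:
        raise KeyError(f"no end-of-life date recorded for {os_name!r}")
    return on >= END_OF_LIFE[os_name]
```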
|
||||
|
||||
The features contained in the AMIs include domain join services, enabling SSHR, integrating McAfee antivirus services, enabling DNS settings, updating the cloud-init process, enabling the SSM client, and edge installations. Adding a new AMI goes through integration and validation: integrating services, enabling features, and running a build and test process. The AMIs are shared with every account in the organization, including the AMI itself, EBS volumes, and KMS keys.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 50 AMI Roadmap for AWS AMIs
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- AMI
|
||||
- Roadmap
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 50 AMI Roadmap for AWS AMIs
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 50_ AMI Roadmap for AWS AMIs.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 51 Architecting with AWS purpose-built databases"
|
||||
title: CTP Topic 51 Architecting with AWS purpose-built databases
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Database
|
||||
- Purpose-Built
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 51 Architecting with AWS purpose-built databases
|
||||
@@ -26,7 +26,24 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## Architecting with AWS Purpose-Built Databases
|
||||
|
||||
Femi George, a database sales specialist from AWS, discussed purpose-built databases, covering what characterizes modern applications, the rationale for purpose-built databases, key AWS databases, and the evolving role of DBAs/developers in the cloud.
|
||||
|
||||
Modern applications have evolved from client-server models due to changing customer requirements, new devices, diverse data types, and economic considerations. Key questions include scalability, global delivery with low latency, and developer access. The approach involves starting with the use case and selecting the best tool for the job, avoiding a one-size-fits-all approach. *We need to start thinking of the right purpose-built database for the right application.*
|
||||
|
||||
Considerations for purpose-built databases include application scale, user numbers, access patterns, usage spikes, and performance requirements like latency and availability. Duolingo uses DynamoDB for personalized data, ElastiCache for common words/phrases, and Aurora for transactional data. AWS offers a range of purpose-built databases, including relational (e.g., RDS, Aurora) and NoSQL (key-value, document, in-memory, graph) options, along with time series, ledger, and wide-column databases.
|
||||
|
||||
Relational databases are suitable for fixed schemas and maintaining referential integrity. Amazon RDS provides fully managed traditional and open-source databases, handling backups and patching. Data endpoints in RDS facilitate easy application access. Amazon Aurora, a cloud-native database, offers MySQL and PostgreSQL compatibility with enhanced performance, scalability, and security. *Amazon Aurora has two flavors, MySQL and PostgreSQL.* Aurora separates storage and compute, improving IO and availability.
|
||||
|
||||
Key-value data is popular among developers and forms the basis of NoSQL databases. Amazon DynamoDB is a key-value and document database with single-digit millisecond performance at any scale, supporting trillions of requests per day. Netflix uses DynamoDB for resilience and low-latency access to JSON documents. Document databases extend key-value stores by enabling deeper querying within JSON files. Amazon DocumentDB is compatible with MongoDB and offers flexible schemas.
|
||||
|
||||
Apache Cassandra, a wide-column database, is used for large-scale applications with unstructured schemas. Amazon Keyspaces is a managed service for Cassandra-compatible databases, offering serverless options. In-memory databases, like Amazon ElastiCache (Redis, Memcached), are used for caching, media streaming, session stores, and real-time analytics. Peloton uses ElastiCache Redis for immediate feedback to customers.
|
||||
|
||||
Graph databases (e.g., Amazon Neptune) are suitable for fraud detection, social networking, and recommendations. They help uncover correlations that relational databases struggle with. Time series databases (e.g., Amazon Timestream) are designed for high-volume, time-based data analysis, such as data from IoT devices.
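
The "start from the use case" guidance amounts to a lookup from data model to candidate service. The sketch below only restates the services named in this summary; the function name `candidate_services` and the category labels are illustrative assumptions, and a real selection would also weigh scale, access patterns, and latency needs.

```python
# Data model -> AWS services named in the session summary.
SERVICES_BY_MODEL = {
    "relational": ["Amazon RDS", "Amazon Aurora"],
    "key-value": ["Amazon DynamoDB"],
    "document": ["Amazon DocumentDB", "Amazon DynamoDB"],
    "wide-column": ["Amazon Keyspaces"],
    "in-memory": ["Amazon ElastiCache"],
    "graph": ["Amazon Neptune"],
    "time-series": ["Amazon Timestream"],
}


def candidate_services(data_model: str) -> list:
    """Return the services from the summary that match a data model."""
    key = data_model.strip().lower()
    if key not in SERVICES_BY_MODEL:
        raise ValueError(f"unknown data model: {data_model!r}")
    return list(SERVICES_BY_MODEL[key])
```

For example, the Duolingo case described earlier combines three entries from this table: key-value, in-memory, and relational.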
|
||||
|
||||
The role of the DBA is evolving in the cloud. While AWS manages much of the platform, DBAs still handle tasks like restoring databases, managing access, and optimizing queries. The focus shifts from platform management to application innovation.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 51 Architecting with AWS purpose-built databases
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Database
|
||||
- Purpose-Built
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 51 Architecting with AWS purpose-built databases
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 51_ Architecting with AWS purpose-built databases.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 58 AWS EC2 image builder"
|
||||
title: CTP Topic 58 AWS EC2 image builder
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- EC2
|
||||
- Image-Builder
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 58 AWS EC2 image builder
|
||||
@@ -26,7 +26,20 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## AWS EC2 Image Builder
|
||||
|
||||
AWS EC2 Image Builder is a managed AWS service to automate the creation, management, and distribution of AMIs and Docker images using components like image pipelines, image recipes, and infrastructure configurations. Image pipelines define how AMIs are published, including installations, security hardening, and distribution schedules.
|
||||
|
||||
Image recipes, written in YAML, define the source AMI for creating an output AMI, while container recipes support Docker images. Components are individual steps executed within the source AMI, such as installing packages or running shell commands. *A component is basically just a particular step that you want to execute in order to achieve the output AMI.* Infrastructure configurations define instance attributes like instance type, VPC, subnet, and security groups. Distribution settings manage the distribution of AMIs across different regions and accounts.
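
To make the notion of a component concrete, the sketch below builds the data structure for a single-step component. The layout (`schemaVersion`, `phases`, `steps`, the `ExecuteBash` action with `inputs.commands`) follows the Image Builder component document format as I understand it; the helper `make_component`, the component name, and the command are hypothetical, and the official documentation should be checked before relying on any field.

```python
ALLOWED_PHASES = {"build", "validate", "test"}


def make_component(name: str, commands: list, phase: str = "build") -> dict:
    """Build a one-step component document as a plain dictionary."""
    if phase not in ALLOWED_PHASES:
        raise ValueError(f"unsupported phase: {phase!r}")
    if not commands:
        raise ValueError("a component needs at least one command")
    return {
        "name": name,
        "schemaVersion": 1.0,
        "phases": [
            {
                "name": phase,
                "steps": [
                    {
                        "name": f"{name}-step",
                        "action": "ExecuteBash",  # run shell commands on the build instance
                        "inputs": {"commands": list(commands)},
                    }
                ],
            }
        ],
    }
```

Serialized to YAML, a dictionary like this corresponds to one "step you want to execute" on the way from the source AMI to the output AMI.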
|
||||
|
||||
The current AMI publishing process involves OS-specific hardening scripts in GitLab repositories and Jenkins pipelines launching Packer to build and share images. Some product teams have developed parallel image bakeries, while others use manual processes with limited automation. The current approach has shortcomings, including longer turnaround times for modifications, AMI compatibility issues across landing zones, and limited automation in manual image bakeries. *Because of these limitations, product teams eventually cater to their own requirements by developing some kind of workflow or CI/CD pipeline in which they consume the CCOE AMI and update or install whatever packages they require, or fill in the functionality that was lacking in the base AMI.*
|
||||
|
||||
Image Builder offers advantages such as increased productivity through automation, efficient image testing during the build process, incorporation of hardening standards, and easy image distribution. It integrates with AWS Organizations and AWS RAM for distributing AMIs across managed accounts. Supported OSes include Amazon Linux, Windows Server, Red Hat Linux, CentOS, Ubuntu, and SUSE, with the list expected to expand.
|
||||
|
||||
A POC has implemented end-to-end pipelines for CentOS 7 and Ubuntu 18, using CCOE hardening scripts converted into individual components. Terraform modules are in place for creating resources, with a consolidated module simplifying consumption for product teams. Testing scenarios are incorporated within components to validate execution, and Amazon Inspector is integrated for scanning AMIs against security standards. A Lambda workflow triggers scans, sends email notifications, and uploads reports to S3, maintaining a historical record of published AMIs. Qualys scan integration is under evaluation.
|
||||
|
||||
Product groups can use a service module to add components to the golden AMI. A component is a script, and components should be added in alphabetical order. An HCL file is used to create and manage components. Logs are published to CloudWatch. The Image Builder process requires approval, and the approval process is still under development.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 58 AWS EC2 image builder
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- EC2
|
||||
- Image-Builder
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 58 AWS EC2 image builder
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 58_ AWS EC2 image builder.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora"
|
||||
title: CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- RDS
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- PostgreSQL
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
|
||||
@@ -27,7 +27,47 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## RDS vs. Aurora: Key Differences
|
||||
|
||||
Greg Klau presented a detailed comparison of PostgreSQL on Amazon RDS and Aurora, focusing on performance, cost, and use cases. The session covered choosing between the two, running blue-green and cross-region operations, monitoring, and network performance tweaks for high availability.
|
||||
|
||||
### Key Differences and Considerations
|
||||
|
||||
* **Minimum Size and Cost:** RDS offers smaller, cheaper instances suitable for small databases, while Aurora has a higher minimum size and cost due to its architecture.
|
||||
* **Maximum Size and Performance:** Aurora scales to larger databases and offers better IO performance, making it suitable for databases exceeding 10-20 terabytes.
|
||||
* **Auto Scaling:** Aurora offers auto-scaling (Serverless v2) but with limitations on instance shapes, versions, and regions.
|
||||
* **Recovery Time Objective (RTO):** Aurora boasts a 30-second RTO, compared to RDS's two minutes in the event of an AZ failure.
|
||||
* **Storage Flexibility:** RDS provides more storage options (GP2, GP3, provisioned IOPS, magnetic), while Aurora charges per IO.
|
||||
* *With RDS, you get to choose multiple different storage mechanisms.*
|
||||
* *Aurora IO is generally unbounded because they're motivated to give you as much IO as you can consume because they're charging you per IO.*
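
The considerations in this list can be condensed into a rough first-pass rule. The thresholds below are taken from the figures quoted in the summary (the lower end of the 10-20 TB range, and the 30-second and two-minute AZ-failover RTOs); the function `suggest_engine` itself is a hypothetical simplification that ignores cost, storage options, and Serverless v2 limits.

```python
AURORA_SIZE_THRESHOLD_TB = 10    # lower end of the 10-20 TB range quoted above
AURORA_AZ_FAILOVER_RTO_S = 30    # Aurora RTO quoted for an AZ failure
RDS_AZ_FAILOVER_RTO_S = 120      # RDS RTO quoted for an AZ failure


def suggest_engine(db_size_tb: float, required_rto_s: int) -> str:
    """Suggest RDS or Aurora from database size and required recovery time."""
    if required_rto_s < AURORA_AZ_FAILOVER_RTO_S:
        return "neither: required RTO is below both quoted figures"
    if db_size_tb > AURORA_SIZE_THRESHOLD_TB or required_rto_s < RDS_AZ_FAILOVER_RTO_S:
        return "Aurora"
    return "RDS"
```

A small database with a relaxed recovery target lands on RDS, which matches the point about RDS offering smaller, cheaper instances.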
|
||||
|
||||
### Architectural Comparison
|
||||
|
||||
* **RDS:** Uses compute with attached storage (EBS). Multi-AZ setup involves another compute and storage node for failover. Replication across regions is asynchronous.
|
||||
* **Aurora:** Employs six EBS volumes across three availability zones, managed by Amazon. Adding compute uses the same cluster volume, avoiding data replication for read replicas. Aurora Global allows multi-region setups with asynchronous replication.
|
||||
* *With Aurora, you get six EBS volumes. They're spread across three availability zones.*
|
||||
* **Endpoints:** RDS has one endpoint per cluster, while Aurora has separate writer and reader endpoints.
|
||||
|
||||
### Database Switchover and Failover
|
||||
|
||||
* **RDS:** Requires blocking access, forcing a new primary, destroying the old cluster, and rebuilding it as a standby.
|
||||
* **Aurora:** Allows clean, managed switchovers using Aurora Global, without re-replication. Failover involves promoting a secondary region and re-adding the failed region as a new global cluster after it recovers.
|
||||
|
||||
### Blue-Green Deployments (Aurora MySQL Only)
|
||||
|
||||
* Aurora MySQL supports blue-green deployments for major version upgrades, creating a duplicate environment for testing before switching over. This involves logical replication to a green environment, with guardrails to prevent data loss.
|
||||
|
||||
### Monitoring
|
||||
|
||||
* Both RDS and Aurora offer monitoring options via CloudWatch, Grafana, and Performance Insights. Performance Insights provides a view of database load, query performance, and wait times.
|
||||
* Aurora utilizes free local storage (ephemeral SSD) for temporary work, which is fixed per instance type. RDS uses EBS for temporary storage.
|
||||
|
||||
### High Availability Performance Tweaks
|
||||
|
||||
* Lower DNS time to live (TTL) to one second for faster failover.
|
||||
* Adjust TCP Keep-Alive settings to detect database failures quickly.
|
||||
* Use JDBC connection string overloading with reader and writer endpoints for resilience.
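
One way to read the last two tweaks is a multi-host JDBC URL with keep-alive enabled. The sketch below assembles such a URL as a string; it assumes the PostgreSQL JDBC driver and its `targetServerType`, `tcpKeepAlive`, `connectTimeout`, and `socketTimeout` connection parameters, none of which the summary names explicitly. The host names and timeout values are placeholders.

```python
from urllib.parse import urlencode


def build_jdbc_url(writer_host: str, reader_host: str, database: str, port: int = 5432) -> str:
    """Assemble a multi-host PostgreSQL JDBC URL listing writer and reader endpoints."""
    hosts = f"{writer_host}:{port},{reader_host}:{port}"
    params = {
        "targetServerType": "primary",  # only settle on a host that accepts writes
        "tcpKeepAlive": "true",         # let the OS probe for dead connections
        "connectTimeout": "2",          # seconds; placeholder value
        "socketTimeout": "30",          # seconds; placeholder value
    }
    return f"jdbc:postgresql://{hosts}/{database}?{urlencode(params)}"
```

The DNS TTL tweak is separate from the URL: on the JVM it is typically controlled through the `networkaddress.cache.ttl` security property rather than a driver parameter.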
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- RDS
|
||||
- Aurora
|
||||
- PostgreSQL
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 66 Exposing the differences between PostgreSQL RDS and Aurora
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 66_ Exposing the differences between PostgreSQL RDS and Aurora.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 68 Introduction to Redshift"
|
||||
title: CTP Topic 68 Introduction to Redshift
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Redshift
|
||||
- Data-Warehouse
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 68 Introduction to Redshift
|
||||
@@ -26,7 +26,16 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## AWS Redshift Architecture and Components
|
||||
|
||||
This learning session covers AWS Redshift, focusing on its architecture, management, and key components. The session aims to provide a foundational understanding of Redshift, including columnar versus row-based storage, MPP (Massively Parallel Processing), data compression, and the significance of distribution and sort keys.
|
||||
|
||||
Redshift is a fully managed, petabyte-scale data warehouse solution in the cloud. *It is designed for data warehousing, enabling quick data retrieval from large datasets.* It supports online analytical processing (OLAP) and offers advantages such as easy installation, maintenance of backups, point-in-time recovery, and cross-region disaster recovery.
|
||||
|
||||
Redshift architecture involves client applications communicating with Redshift clusters via JDBC and ODBC drivers, connecting to a leader node. The leader node manages schema, warehouse metadata, and query planning, distributing instructions to compute nodes. Compute nodes, determined by the instance type, execute queries across slices, processing data and returning results to the leader node. *The leader node then stores results in buffers for quick retrieval, enhancing performance.* Instance types include dense compute, dense storage, and RA3, each offering varying levels of compute power, RAM, and storage capacity. RA3 is noted for its cost-effectiveness and large storage capacity, utilizing AWS-managed NVMe storage.
|
||||
|
||||
Key features of Redshift include MPP, which enables parallel processing of queries across multiple compute nodes, improving query speed and response times. Data storage can be columnar or row-based; columnar storage is optimized for data warehouse operations due to faster performance and efficient memory usage. Data compression techniques, including LZO, further enhance performance by reducing data size. The sort key and dist key play a crucial role in optimizing queries and managing data distribution across compute nodes.
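
The interplay between the dist key and MPP can be illustrated with a toy model: rows are assigned to slices by hashing the dist key, each slice computes a partial result, and a leader combines the partials. This is a conceptual sketch only. Redshift uses its own hashing and runs slices concurrently, whereas this code uses CRC32 and a sequential loop, and the names `distribute` and `parallel_sum` are hypothetical.

```python
import zlib
from collections import defaultdict


def distribute(rows: list, dist_key: str, num_slices: int) -> dict:
    """Assign each row to a slice by hashing its dist key value."""
    slices = defaultdict(list)
    for row in rows:
        slot = zlib.crc32(str(row[dist_key]).encode()) % num_slices
        slices[slot].append(row)
    return slices


def parallel_sum(rows: list, dist_key: str, value_key: str, num_slices: int = 4) -> int:
    """Sum a column the way a leader node would: per-slice partials, then a final merge."""
    slices = distribute(rows, dist_key, num_slices)
    partials = [sum(r[value_key] for r in part) for part in slices.values()]
    return sum(partials)
```

Rows sharing a dist key value always land on the same slice, which is why a skewed key can overload one slice while the others sit idle.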
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 68 Introduction to Redshift
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Redshift
|
||||
- Data-Warehouse
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 68 Introduction to Redshift
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 68_ Introduction to Redshift.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 7 SaaS Landing Zone design"
|
||||
title: CTP Topic 7 SaaS Landing Zone design
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Landing-Zone
|
||||
- SaaS
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 7 SaaS Landing Zone design
|
||||
@@ -26,7 +26,53 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## SaaS Landing Zone Design
|
||||
|
||||
The session covers the high-level design for the new production SaaS Landing Zone, emphasizing a single landing zone for all products to reduce overhead and costs, a departure from the per-product group (PG) landing zones used in dev labs. The design incorporates AWS accounts, Terraform modules, and Terragrunt for deployment.
|
||||
|
||||
Key components include core accounts (shared, logs, security), baseline accounts (network, DNS, Active Directory), shared services accounts (software factory, cyber, ArcSight, monitoring), and product accounts.
|
||||
|
||||
*The SaaS landing zone will use a single landing zone for all the product groups.*
|
||||
|
||||
### Core Accounts
|
||||
|
||||
These accounts are based on the Gruntwork reference architecture and include:
|
||||
|
||||
* **Shared Account:** Hosts hardened AMIs and a master Jenkins server for managing deployments. The master Jenkins initiates Lambda functions within each account to trigger Jenkins slaves, enhancing security by preventing direct exposure of the master Jenkins to jobs or credentials.
|
||||
* **Logs Account:** A centralized account for logs from every account (CloudTrail, Config, VPC Flow Logs), accessible primarily to the security team, with read access for products to their specific logs.
|
||||
* **Security Account:** Hosts IAM roles inherited within each account, with the ability for account owners to attach additional policies to restrict role usage.
|
||||
|
||||
### Baseline Accounts
|
||||
|
||||
These accounts are essential for product functionality and include:
|
||||
|
||||
* **Network Account:** Contains a regional transit gateway connecting all accounts, with a Check Point appliance for monitoring traffic based on a tagging approach. Resources require specific tags to access destinations like the internet or on-prem networks.
|
||||
* **DNS Account:** Hosts Route 53, with each product having its own hosted zone for managing DNS records.
|
||||
* **Active Directory Account:** Includes two AD nodes for domain joining and controlling resource access.
|
||||
|
||||
### Shared Services Accounts
|
||||
|
||||
These accounts provide internal production services to product accounts:
|
||||
|
||||
* Software Factory accounts (45 hubs, Octane Hub, Artifactory).
|
||||
* Cyber account (Qualys).
|
||||
* ArcSight account.
|
||||
* Monitoring account (OBM, potentially SiteScope).
|
||||
|
||||
### Product Accounts
|
||||
|
||||
Each product account features a public subnet for internet exposure via a load balancer and internet gateway, while workloads reside in private subnets. A web application firewall (WAF) monitors incoming traffic, and CloudFront is available as a CDN.
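
The public/private split described here can be planned with Python's standard `ipaddress` module. This is a sketch under assumed values: the VPC CIDR, the `/24` subnet size, and the number of public subnets are placeholders, and `plan_subnets` is a hypothetical helper, not part of the landing zone tooling.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, new_prefix: int = 24, public_count: int = 2) -> dict:
    """Carve a VPC range into equal subnets and label the first few as public."""
    network = ipaddress.ip_network(vpc_cidr)
    subnets = list(network.subnets(new_prefix=new_prefix))
    if len(subnets) <= public_count:
        raise ValueError("VPC range too small to leave room for private subnets")
    return {
        "public": subnets[:public_count],    # load balancers, internet gateway routes
        "private": subnets[public_count:],   # workloads, no direct internet exposure
    }
```

Calling `plan_subnets("10.0.0.0/22")` yields two public and two private `/24` ranges, with the workloads confined to the private ones.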
|
||||
|
||||
*The workload itself is going to be under private subnet.*
|
||||
|
||||
### Automation and Deployment
|
||||
|
||||
Terraform is used for automation, with each account having its own GitHub repository. Changes to Terraform code trigger Jenkins via a GitHub hook, initiating a deployment process through the management VPC, Lambda, and ECS cluster. A review process, including code review and plan output review, is implemented before applying changes, with staging environments used for testing before production deployment.
|
||||
|
||||
### Remote Access
|
||||
|
||||
Remote access is transitioning from Check Point VPN to Pulse VPN, requiring operators to use a VPN client and authenticate against the AD. Future plans involve SD-WAN replacing some network components.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 7 SaaS Landing Zone design
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Landing-Zone
|
||||
- SaaS
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 7 SaaS Landing Zone design
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 7_ SaaS Landing Zone design.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Action Items
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Links to paired video notes (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup"
|
||||
title: CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- DR
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Enterprise
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
|
||||
@@ -27,7 +27,20 @@ status: raw
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription
|
||||
> ## Implementing an Enterprise DR Strategy Using AWS Backup
|
||||
|
||||
Sabith from AWS discusses disaster recovery (DR) strategies using AWS Backup, differentiating between high availability and disaster recovery. He recaps basic concepts like RTO and RPO, introduces AWS Backup, and presents reference architectures.
|
||||
|
||||
*We should always be prepared for a situation that everything fails all the time.* The shared responsibility model defines AWS's and the customer's roles in ensuring a resilient cloud environment. Human errors, technical failures, and natural disasters are major categories to consider when creating DR plans.
|
||||
|
||||
High availability ensures a system performs its functions, measured by mean time between failures. Disaster recovery focuses on data loss prevention and recovery, while high availability focuses on system uptime and service availability.
|
||||
|
||||
Recovery Point Objective (RPO) defines the acceptable data loss, while Recovery Time Objective (RTO) defines the acceptable downtime. Architectural patterns range from multi-site active-active (minimal interruption, high cost) to backup and restore (lower cost, longer interruption). AWS Backup is a fully managed, policy-based backup service that simplifies data protection. It supports numerous resource types and integrates with AWS Organizations for cross-account backup copies.
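To make the RPO and RTO definitions above concrete, here is a minimal Python sketch that compares a backup schedule and a restore estimate against the two objectives. The class and function names and the sample numbers are hypothetical and are not taken from the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo_hours: float  # maximum acceptable data loss, expressed as time
    rto_hours: float  # maximum acceptable downtime


def meets_objectives(
    objectives: RecoveryObjectives,
    backup_interval_hours: float,
    estimated_restore_hours: float,
) -> dict:
    """Compare a backup schedule and restore estimate against RPO and RTO.

    Worst-case data loss equals the gap between two consecutive backups,
    because a failure can occur just before the next backup runs.
    """
    return {
        "rpo_met": backup_interval_hours <= objectives.rpo_hours,
        "rto_met": estimated_restore_hours <= objectives.rto_hours,
    }


# Hypothetical workload: backups every 12 hours, restore takes about 6 hours.
result = meets_objectives(
    RecoveryObjectives(rpo_hours=24, rto_hours=4),
    backup_interval_hours=12,
    estimated_restore_hours=6,
)
print(result)
```

In this made-up case the schedule satisfies the RPO but the restore estimate misses the RTO, which is the kind of gap that pushes a design from backup and restore toward a warmer, more expensive pattern.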
|
||||
|
||||
AWS Backup uses backup plans to define what, when, and how to back up, storing recovery points in backup vaults. It integrates with IAM policies for access control and AWS Backup Audit Manager (BAM) for compliance reporting. AWS Backup integrates with underlying services through data plane and control plane integrations. Full backups capture all data, while incremental backups only capture changes since the last backup.
|
||||
|
||||
AWS Backup offers immutable recovery points, automated scalability, and compliance features. Vault Lock in compliance mode prevents even root users from deleting recovery points until their lifecycle ends, deterring ransomware. Customers often use a vault or bunker account for storing backup copies, separate from workload accounts, to protect against compromises. A forensic account can be used to regularly test recovery points and scan for malware.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- DR
|
||||
- Backup
|
||||
- Enterprise
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 72 Implementing an Enterprise DR Strategy using AWS Backup
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 72_ Implementing an Enterprise DR Strategy using AWS Backup.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -1,16 +1,16 @@
|
||||
---
|
||||
title: "CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program"
|
||||
title: CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Backup
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
|
||||
@@ -25,7 +25,14 @@ status: raw
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription
|
||||
> The session covers the AWS Backup implementation of the Cloud Transformation Program (CTP), focusing on the CTP backup strategy, AWS Backup Audit Manager, and the AWS Backup module. The SRE core, SRE product, and architecture teams collaborated on a design to provide product groups with flexibility in their backup strategies.
|
||||
|
||||
Key points include the assumed backup policy for production workloads, which requires customer data to be backed up regularly (at least once every 24 hours), retained for at least 30 days, and stored in two backup locations. AWS Backup was adopted as the strategic backup tool in AWS for the Cloud Transformation Program to standardize backup processes. An SRE module was developed to allow product groups to create and control their own backups, aligned with the assumed backup policy, enabling independent backup and restore operations in their DRA accounts.
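The three requirements of the assumed backup policy (a backup at least every 24 hours, at least 30 days of retention, two backup locations) can be expressed as a small check. This is only an illustrative Python sketch; the `BackupConfig` class, the function name, and the location labels are hypothetical and do not come from the SRE module.

```python
from dataclasses import dataclass, field


@dataclass
class BackupConfig:
    interval_hours: float          # time between two scheduled backups
    retention_days: int            # how long recovery points are kept
    locations: set = field(default_factory=set)  # distinct places holding copies


def policy_violations(cfg: BackupConfig) -> list:
    """Return the list of ways a config breaks the assumed backup policy."""
    problems = []
    if cfg.interval_hours > 24:
        problems.append("backups run less often than once every 24 hours")
    if cfg.retention_days < 30:
        problems.append("retention is shorter than 30 days")
    if len(cfg.locations) < 2:
        problems.append("fewer than two backup locations")
    return problems


# Hypothetical configurations for illustration only.
compliant = BackupConfig(24, 35, {"source-account/eu-west-1", "dr-account/eu-central-1"})
weak = BackupConfig(48, 7, {"source-account/eu-west-1"})

print(policy_violations(compliant))
print(policy_violations(weak))
```

A check like this could run in a pipeline before a backup plan is applied, although the session describes AWS Backup Audit Manager as the tool that reports on these controls.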
|
||||
|
||||
AWS Backup was chosen because it is a native service managed by AWS, simplifying data protection at scale and supporting multiple AWS resources. It supports tag-based backup plans, cross-account and cross-region backups, immutability for backups, out-of-the-box audit reports and frameworks, and point-in-time recovery for S3 and RDS. The design involves taking initial backups within the source accounts and copying them to a remote account and region, ideally a dedicated DR account for each production workload account. *This keeps backups within the DR account for immediate restore, avoiding time-consuming data copies.* If a DR account is unavailable, a data bunker account can be used as a centralized account for storing backups. The SRE backup module simplifies the adoption of AWS Backup by creating AWS Backup plans, selections, local AWS Backup vaults, KMS key policies, additional vaults in the DR account, Enroll policies, lifecycle policies, SNS topics, audit reports, and optional point-in-time restore for S3 and RDS. *The SRE modules were adjusted to optionally create custom KMS keys, which is a fundamental requirement for having a remote account and region for the AWS Backup processes.*
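The design above (a local backup in the source account plus a copy to a vault in a remote account and region) can be sketched as a plain data structure. The Python below builds a dict that mirrors the `BackupPlan` structure of the AWS Backup API; it makes no AWS calls and is not the SRE module, which the session describes only at a high level. The plan name, vault names, account ID, region, and schedule are placeholder assumptions.

```python
import json


def build_backup_plan(name, local_vault, dr_vault_arn, retention_days=30):
    """Return a dict shaped like the AWS Backup `BackupPlan` input structure."""
    lifecycle = {"DeleteAfterDays": retention_days}
    return {
        "BackupPlanName": name,
        "Rules": [
            {
                "RuleName": f"{name}-daily",
                # Initial backup lands in a vault inside the source account.
                "TargetBackupVaultName": local_vault,
                # Placeholder schedule: once a day at 05:00 UTC.
                "ScheduleExpression": "cron(0 5 ? * * *)",
                "Lifecycle": lifecycle,
                # Each recovery point is then copied to the remote (DR) vault.
                "CopyActions": [
                    {
                        "DestinationBackupVaultArn": dr_vault_arn,
                        "Lifecycle": dict(lifecycle),
                    }
                ],
            }
        ],
    }


backup_plan = build_backup_plan(
    name="example-prod",
    local_vault="example-local-vault",
    dr_vault_arn="arn:aws:backup:eu-central-1:111122223333:backup-vault:example-dr-vault",
)
print(json.dumps(backup_plan, indent=2))
```

Keeping the copy action inside the same rule as the local backup means every recovery point gets a remote copy, which matches the "two backup locations" requirement of the assumed policy.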
|
||||
|
||||
AWS Backup Audit Manager provides out-of-the-box reports and compliance reports. Reports can be exported to an S3 bucket in CSV or JSON format, providing insights into the status of backups, resources backed up, creation date, recovery point, backup duration, and size. SNS notifications can be configured to receive alerts regarding the status of backups. The AWS Backup Audit Manager framework includes controls that help evaluate backup practices, providing compliance reports. Controls include ensuring backup resources are protected by a backup plan, minimum frequency and retention, prevention of manual deletion of recovery points, encryption of recovery points, and scheduled cross-region and cross-account backups.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/01_AWS-Landing-Zone
|
||||
tags:
|
||||
- AWS
|
||||
- Backup
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 73 AWS Backup implementation of the Cloud Transformation Program
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 73_ AWS Backup implementation of the Cloud Transformation Program.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -1,51 +1,26 @@
|
||||
---
|
||||
title: "Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
tags:
|
||||
- AWS
|
||||
- AMI
|
||||
- Updates
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4`
|
||||
# learning sessions standard amis updates 20231205 160324 meeting recording 2
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
## Standard AMI Updates and Overview
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
The session provides a high-level overview and updates regarding Amazon Machine Images (AMIs). The standard AMIs are based on AWS AMIs but include OS hardening, the latest patches, and security updates. These AMIs also support domain joining, security tools, endpoint protection, access integration, a Qualys agent, SSM agent, DNS settings, Microsoft Edge for Windows AMIs, and gp3 EBS storage.
|
||||
|
||||
---
|
||||
The AMIs are built, tested, and shared to all AWS accounts every two months, and are immediately available as private AMIs. Currently, 23 different AMIs are supported, including various versions of Amazon Linux, CentOS, Oracle Enterprise Linux, Red Hat, Rocky Linux, SUSE Linux, Ubuntu, and Windows servers. The latest three releases are available in 12 regions, and older AMIs are archived for 12 months.
|
||||
|
||||
## Summary
|
||||
The AMI release process follows a standard software release process, with changes developed on feature branches and merged into an integration branch. Jenkins multi-branch pipelines are used for building and testing the AMIs, including scripted tests and AWS Inspector. The publishing process involves copying the AMIs to different regions and sharing them to multiple organizations, with encryption and automatic creation of necessary grants. *The AMIs are then thrown through all of the test suites, and we'll see a couple of those as they come up in later slides, and then we verify that nothing seems to have regressed at that point.*
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
## Roadmap, Notifications, and End-of-Life
|
||||
|
||||
---
|
||||
The current roadmap includes a future release of Amazon Linux 2023 (x64), planned for January. New AMI requests must go through the demand pipeline and take approximately 60 days to release. AMI notifications are sent out with each release, including links to relevant documents and the portal. A change log is available in the portal, detailing the changes included in each release.
|
||||
|
||||
## Key Concepts
|
||||
Several operating systems are reaching end-of-life, including CentOS 7 and Red Hat 7 in June 2024. *CentOS 7 will be replaced by Rocky Linux, which is already available as a standard AMI.* OpenSUSE Leap 15 and OEL 7 will reach end-of-life in December 2024.
|
||||
|
||||
-
|
||||
## New Features and Validation
|
||||
|
||||
---
|
||||
New features are injected into the release cycles based on various inputs, such as the migration from Trellix to SentinelOne. The AMIs are designed to work across multiple landing zones and domain controller environments. The new landing zone uses secrets instead of Parameter Store, and all automations now use cloud-init. AMI utilization is monitored to track how frequently and how many AMIs are being used.
|
||||
|
||||
## Action Items
|
||||
Robot Framework has been integrated to automate basic test cases and validations, reducing the validation time for one AMI from three to four days to 60 minutes. An SSM patching solution is available for long-running instances that cannot be refreshed frequently. The AMIs are validated and tested according to the highest security standards, with penetration testing conducted periodically.
|
||||
via model google/gemini-2.0-flash
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
Cached · google/gemini-2.0-flash
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: "Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/01_AWS-Landing-Zone"
|
||||
tags:
|
||||
- AWS
|
||||
- AMI
|
||||
- Updates
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions Standard AMIs Updates - 20231205 160324-Meeting Recording (2)
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Standard AMIs Updates - 20231205_160324-Meeting Recording (2).mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 01_AWS-Landing-Zone
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -12,7 +12,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 11_ AD Integration, and Login using AD accounts.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 11 AD Integration, and Login using AD accounts
|
||||
@@ -21,7 +21,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 02_IAM
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
**Status:** ✅ Completed (Gemini summary)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 5 - AWS Identity and Access Management (IAM)"
|
||||
title: CTP Topic 5 - AWS Identity and Access Management (IAM)
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/02_IAM"
|
||||
category: DevOps & SRE/02_IAM
|
||||
tags:
|
||||
- AWS
|
||||
- IAM
|
||||
- Security
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 5 - AWS Identity and Access Management (IAM)
|
||||
@@ -26,7 +26,35 @@ status: raw
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription
|
||||
> ## AWS Identity and Access Management (IAM) Explained
|
||||
|
||||
This session covers AWS Identity and Access Management (IAM), focusing on users, groups, roles, and policies, and how they relate to accessing AWS via the CLI and federation. The discussion emphasizes accessing landing zone accounts and determining the appropriate method.
|
||||
|
||||
Key points include:
|
||||
* IAM dashboard resources: users, groups, customer managed policies, roles, and identity providers.
|
||||
* Federated access: Users gain access to accounts via Active Directory (AD) groups, which grant specific roles.
|
||||
* `accounts.json`: This file, located in the root of every landing zone, contains a list of account numbers.
|
||||
* IAM users are primarily for service accounts; federation is the preferred method for user management.
|
||||
* User groups are less relevant due to the focus on federated user management.
|
||||
* Roles are used by services or users and tie together permissions.
|
||||
* Policies define permissions, specifying what actions are allowed or denied on resources.
|
||||
* *Roles don't enable actions; they tie together who can do something and what they can do.*
|
||||
* Policies can be AWS-managed or customer-managed.
|
||||
|
||||
Federated users log in via their organization's AD, which maps them to an IAM role. Command-line access via federation requires a tool called PFSSO. The least-privilege model is crucial, meaning only the necessary permissions are granted: *We only want to allow the access that is strictly required.*
|
||||
|
||||
Configuring permissions typically involves a service accessing AWS resources, requiring a role and policy. Terraform modules can define IAM roles, including an assume role policy and inline policy blocks. Policies should be fine-grained, limiting access to only the required resources. Inline policies are tied to a specific role, while managed policies can be reused across multiple roles.
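The pairing of an assume role policy (who may use the role) with a fine-grained permissions policy (what the role may do) can be sketched as two JSON documents. The Python below only builds and inspects the documents; the bucket name, the choice of the ECS tasks service principal, and the `is_too_broad` heuristic are assumptions made for illustration, not the session's Terraform code.

```python
import json


def assume_role_policy(service: str) -> dict:
    """Trust policy: lets the named AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def read_only_bucket_policy(bucket: str) -> dict:
    """Permissions policy limited to reading one specific bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"],
             "Resource": [f"arn:aws:s3:::{bucket}/*"]},
            {"Effect": "Allow", "Action": ["s3:ListBucket"],
             "Resource": [f"arn:aws:s3:::{bucket}"]},
        ],
    }


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_too_broad(policy: dict) -> bool:
    """Flag Allow statements with wildcard actions or a bare '*' resource."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        actions = _as_list(statement["Action"])
        resources = _as_list(statement["Resource"])
        if any(a == "*" or a.endswith(":*") for a in actions) or "*" in resources:
            return True
    return False


trust = assume_role_policy("ecs-tasks.amazonaws.com")
permissions = read_only_bucket_policy("example-bucket")
print(json.dumps(trust, indent=2))
print(is_too_broad(permissions))
```

The wildcard check is deliberately crude; it is meant to show the least-privilege idea (name the actions and resources explicitly) rather than to replace a real policy linter.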
|
||||
|
||||
Key takeaways:
|
||||
* Federation is the primary method for user access.
|
||||
* Roles and policies are central to managing permissions.
|
||||
* Least privilege is a guiding principle when defining policies.
|
||||
* Consider using inline policies for role-specific permissions and managed policies for reusable permissions.
|
||||
* When defining Terragrunt modules, ensure policies are not too wide open.
|
||||
* VSM requests are required to gain account access through Federation.
|
||||
* User attributes beyond usernames are supported, including additional STS values and tags.
|
||||
* Cross-account role assumption is possible, where principals in specified accounts can assume a role.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 5 - AWS Identity and Access Management (IAM)
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/02_IAM
|
||||
tags:
|
||||
- AWS
|
||||
- IAM
|
||||
- Security
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 5 - AWS Identity and Access Management (IAM)
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 5 - AWS Identity and Access Management (IAM).mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 02_IAM
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)
|
||||
@@ -23,28 +23,10 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
## Identity Governance and VSM Replacement
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
The learning session covers identity governance, focusing on the replacement of Virtual SM (VSM), a DXC tool, with identity governance (IG). The objective is to understand identity governance, its necessity, Micro Focus IG, its utilization with Control Tower and counter-automation, the plan to replace VSM with IG, and how to use the IGA portal.
|
||||
|
||||
---
|
||||
Identity governance is a framework for managing digital identities efficiently, minimizing risk, and maintaining compliance. Key questions addressed by identity governance include: *who currently has access to our systems, who should have access, and how is the access being done?* It comprises identity management, access management, and identity auditing. Micro Focus's IGA governs access to resources, providing workflows for approving and revoking access, as well as monitoring and auditing access. IG is used to provide access to both internal and external users, including contractors, with time-limited access.
|
||||
|
||||
## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
IG integrates with AWS Identity Center to provide access to resources via IAM. Groups in Active Directory represent roles, and IG governs access to these groups. A bridge is established using Azure AD domain services for authentication. IG controls Active Directory groups and workflows, while IAM connects to Azure to Cobdom domain. The plan is to replace VSM with IG for all accounts, using the same architecture as VSM, but with IG connected to Coptum domain. Changes include adding owner information to Active Directory groups and automating the account owner as the first-level approver. A POC is underway to validate the architecture and process. Gaining access involves searching for the resource in the IG portal, requesting access, and filling out a form. The request goes through an approval flow, and upon approval, access is granted automatically.
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/02_IAM"
|
||||
tags:
|
||||
- Identity-Governance
|
||||
- VSM
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions Identity Governance VSM replacement -20231128 160326-Meeting Recording (1)
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Identity Governance VSM replacement -20231128_160326-Meeting Recording (1).mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 02_IAM
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -12,7 +12,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 12_ Using SES SMTP service terraform module.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 12 Using SES SMTP service terraform module
|
||||
@@ -21,7 +21,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
**Status:** ✅ Completed (Gemini summary)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 16_ Cross-account Terraform modules.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 16 Cross-account Terraform modules
|
||||
@@ -20,7 +20,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
**Status:** ✅ Completed (Gemini summary)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 48 Terraform vs Terragrunt"
|
||||
title: CTP Topic 48 Terraform vs Terragrunt
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/03_Terraform"
|
||||
category: DevOps & SRE/03_Terraform
|
||||
tags:
|
||||
- Terraform
|
||||
- Terragrunt
|
||||
- IaC
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# CTP Topic 48 Terraform vs Terragrunt
|
||||
@@ -26,7 +26,24 @@ status: raw
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription
|
||||
> ## Terraform vs. Terragrunt
|
||||
|
||||
Bob, an AWS Solutions Architect and Tech Lead, contrasts Terraform and Terragrunt, emphasizing the importance of understanding their differentiation for both high-level strategy/design roles and low-level development/debugging roles.
|
||||
|
||||
Terraform, created by HashiCorp, is a Go application used to provision, change, and version-control resources across various environments. A key selling point is its cloud-agnostic nature. The `plan` command lets users preview changes before they are applied, which is a distinct advantage. *To run Terraform consistently, it ties the desired state to the existing environment using a state file.* For enterprise-scale use, storing this file in a safe, accessible location is crucial, and cloud vendors offer persistence solutions for it.
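The relationship between desired configuration, state file, and plan can be illustrated with a toy model. The Python below is a deliberately simplified sketch of the idea (compare what is declared with what the state records, and list the differences); it is not Terraform's actual algorithm, and the resource addresses and attributes are made up.

```python
def plan(desired: dict, state: dict) -> dict:
    """Preview the changes needed to move recorded state to the desired config."""
    to_create = sorted(k for k in desired if k not in state)
    to_delete = sorted(k for k in state if k not in desired)
    to_update = sorted(
        k for k in desired if k in state and desired[k] != state[k]
    )
    return {"create": to_create, "update": to_update, "delete": to_delete}


# What the configuration declares (hypothetical resources).
desired = {
    "aws_s3_bucket.logs": {"versioning": True},
    "aws_instance.web": {"type": "t3.small"},
}

# What the state file says currently exists.
state = {
    "aws_instance.web": {"type": "t3.micro"},
    "aws_sqs_queue.old": {},
}

preview = plan(desired, state)
print(preview)
```

Because the preview is computed against the state rather than against memory or guesswork, losing or corrupting the state file breaks the comparison, which is why the session stresses keeping it in a safe, shared location.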
|
||||
|
||||
Terragrunt is presented as a thin wrapper around Terraform, promoting the DRY (don't repeat yourself) principle. All Terraform commands work with Terragrunt; `terraform plan` becomes `terragrunt plan`. The language, including blocks and attributes, remains consistent. Terragrunt helps manage provider and remote state blocks, which can be complex and error-prone when declared multiple times across different environments. *Terragrunt offers a way to use information in a repeatable way without hard coding values.*
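The DRY idea can be shown without any Terragrunt syntax: shared settings are declared once and each environment supplies only what differs. The Python below is an analogy under that assumption, not Terragrunt itself (which achieves something comparable with an included parent configuration); the bucket name, region, and environment names are hypothetical.

```python
# Settings every environment shares, declared exactly once.
COMMON = {
    "region": "eu-west-1",
    "state_bucket": "example-tf-state",
}


def render_environment(name: str, overrides: dict) -> dict:
    """Merge shared settings with per-environment overrides.

    The state key is derived from the environment name instead of being
    hard-coded, so two environments cannot share a state file by accident.
    """
    config = {**COMMON, **overrides}
    config["state_key"] = f"{name}/terraform.tfstate"
    return config


staging = render_environment("staging", {"instance_type": "t3.small"})
production = render_environment(
    "production", {"instance_type": "m5.large", "region": "eu-central-1"}
)
print(staging)
print(production)
```

Changing the shared bucket or default region then requires one edit rather than one per environment, which is the maintenance problem the session attributes to repeated provider and remote state blocks.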
|
||||
|
||||
Terraform and Terragrunt have similar commands and languages, but differ in their approach to reusability and state management. Terraform's core is cloud-agnostic, while its vendor-specific parts require separate modules for each cloud provider. Terragrunt helps streamline configurations across environments.
|
||||
|
||||
Additional points:
|
||||
* Terraform Enterprise is a CI platform with workspaces.
|
||||
* Gruntwork offers pre-built, customizable modules and a Terraform native AWS landing zone.
|
||||
* Atlantis integrates Terraform with GitHub for infrastructure provisioning.
|
||||
* Tools like tfsec aid in maintaining security through static code analysis.
|
||||
* Terratest enables test automation for improved stability and velocity in the software delivery pipeline.
|
||||
* Cloud cost estimation tools can help visualize the cost implications of changes before deployment.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 48 Terraform vs Terragrunt
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/03_Terraform
|
||||
tags:
|
||||
- Terraform
|
||||
- Terragrunt
|
||||
- IaC
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 48 Terraform vs Terragrunt
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 48_ Terraform vs Terragrunt.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini summary)
|
||||
---
|
||||
|
||||
# Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording
|
||||
@@ -23,28 +23,8 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
The learning session focuses on ECS deployment using infrastructure as code, presented by JP and Raja M. The session is part of a weekly series on Tuesdays, emphasizing interactive learning with Q&A opportunities. Recordings and presentations are available on a SharePoint site, with notifications sent beforehand.
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
JP discusses the business and technology background of ECS, while Raja details the ECS module developed within CTP and SRE. The industry faces challenges like unpredictability and the need for agility, pushing businesses towards infrastructure as code. *Businesses have to thrive in the middle of all these challenges and it is forged by code.* Dynamic scaling is crucial due to unpredictable load patterns, requiring technologies to evolve. ECS (Elastic Container Service) is an AWS proprietary technology that integrates with AWS services, offering advantages and challenges compared to EKS or native Kubernetes.
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Link to the paired video notes (to be filled in once generated)

---

*Last updated: 2026-04-14*
|
||||
The ECS module, built on the Gruntwork repository, allows creating Docker containers as logical units and supports EC2 or Fargate deployments. It features auto-scaling, auto-healing, and canary deployments. The module supports a listener approach for centralized ECS management and integrates with AWS services. *We have implemented the listener approach because we have seen that many of the products are downloading the code from Gruntwork and using it locally.* Prerequisites for using the module include a VPC, an ELB security group, and EFS volume mounting. Configurations can be passed via YAML or JSON, with integration support for AWS CloudWatch, Splunk, Grafana, and Prometheus.
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/03_Terraform"
|
||||
tags:
|
||||
- Terraform
|
||||
- CTP
|
||||
- IaC
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions Cloud Transformation Programme-20230808 183322-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-20230808_183322-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform
|
||||
@@ -24,28 +24,8 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
Greg from the DBRE team discusses deploying RDS via Terraform, advocating it over the console for deploying RDS instances of any size into AWS. The presentation covers why infrastructure as code is helpful, clarifies the use of Gruntwork modules, and introduces the SRE core modules. It also includes technical details and live demos of deployment, maintenance, upgrades, and monitoring/alarming.
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
Key benefits of infrastructure as code include speed, flexibility, consistency, disaster recovery, documentation, and automation. *The code is the documentation.* There are two main options for deploying RDS: the bare-bones RDS module and the more comprehensive RDS service. The Gruntwork RDS service is recommended because of its pre-built features, such as KMS key encryption and CloudWatch alarming. The SRE core modules are less fully featured than the Gruntwork service.
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
To deploy an RDS database, use Terragrunt, a wrapper around Terraform, to keep code clean and avoid repeating variables. *We use Terragrunt, which is basically it's a wrapper around Terraform, and it allows you to keep your code clean and you're not repeating your variables all the time.* Use a tagged release instead of the master branch for stability. Basic variables include the VPC, database engine (Oracle, Postgres), port, and license model. For day-two operations like scaling, patching, and major version upgrades, changes are made in the Terragrunt file and applied via GitHub pull requests and Atlantis. Monitoring is achieved through CloudWatch dashboards and alarms, with considerations for burstable instance classes and CPU credits.
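The CPU-credit consideration for burstable instances can be illustrated with a small calculation. This is a minimal sketch, not the method shown in the session: it relies only on the AWS definition that one CPU credit equals one vCPU running at 100% for one minute. The earn rate, vCPU count, and balances used below are hypothetical; real values depend on the instance class, and in practice the balance is read from the CloudWatch `CPUCreditBalance` metric rather than computed.

```python
def credit_balance_after(start_balance: float,
                         earn_per_hour: float,
                         vcpus: int,
                         avg_utilization: float,
                         hours: int,
                         max_balance: float) -> float:
    """Estimate the CPU credit balance of a burstable instance.

    One credit = one vCPU at 100% for one minute, so an instance spends
    vcpus * avg_utilization * 60 credits per hour.
    All inputs are illustrative assumptions, not values from the session.
    """
    spend_per_hour = vcpus * avg_utilization * 60
    balance = start_balance
    for _ in range(hours):
        # Apply the hourly net change, clamped between 0 and the maximum balance.
        balance = balance + earn_per_hour - spend_per_hour
        balance = min(max_balance, max(0.0, balance))
    return balance


if __name__ == "__main__":
    # Hypothetical instance: 2 vCPUs, earns 24 credits/hour, runs at 50% CPU.
    remaining = credit_balance_after(100, 24, 2, 0.5, hours=2, max_balance=576)
    print(f"Credits left after 2 hours: {remaining}")
```

A balance that trends towards zero under normal load is a sign that the instance class is too small, which is the kind of condition a CloudWatch alarm on the credit balance is meant to catch.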
|
||||
|
||||
@@ -1,23 +1,24 @@
|
||||
---
|
||||
title: "Learning Sessions - FY24Q1 Cost Optimisation - 20230912"
|
||||
title: "Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform"
|
||||
type: cloud-learning
|
||||
source-type: pptx
|
||||
category: "DevOps & SRE/05_FinOps"
|
||||
source-type: video
|
||||
category: "DevOps & SRE/03_Terraform"
|
||||
tags:
|
||||
- Cost-Optimization
|
||||
- FinOps
|
||||
- FY24
|
||||
- Terraform
|
||||
- RDS
|
||||
- IaC
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions - FY24Q1 Cost Optimisation - 20230912.pptx"
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions - FY24Q1 Cost Optimisation - 20230912
|
||||
# Learning Sessions Cloud Transformation Programme-Deploying RDS via Terraform
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions - FY24Q1 Cost Optimisation - 20230912.pptx`
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ Cloud Transformation Programme-Deploying RDS via Terraform.mp4`
|
||||
|
||||
**Type:** PPTX | **Category:** 05_FinOps
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
@@ -12,7 +12,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording
|
||||
@@ -25,28 +25,8 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
The learning session focuses on ECS deployment using infrastructure as code, presented by JP and Raja M. The session is part of a weekly series on Tuesdays, emphasizing interactive learning with Q&A opportunities. Recordings and presentations are available on a SharePoint site, with notifications sent beforehand.
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
JP discusses the business and technology background of ECS, while Raja details the ECS module developed within CTP and SRE. The industry faces challenges like unpredictability and the need for agility, pushing businesses towards infrastructure as code. *Businesses have to thrive in the middle of all these challenges and it is forged by code.* Dynamic scaling is crucial because load patterns are unpredictable, so the technology has to evolve with them. ECS (Amazon Elastic Container Service) is an AWS-proprietary technology that integrates with other AWS services, offering both advantages and challenges compared to EKS or native Kubernetes.
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
The ECS module, built on the Gruntwork repository, allows creating Docker containers as logical units and supports EC2 or Fargate deployments. It features auto-scaling, auto-healing, and canary deployments. The module supports a listener approach for centralized ECS management and integrates with AWS services. *We have implemented the listener approach because we have seen many of the products are, you know, downloading the code from Gruntwork and using it locally.* Prerequisites for using the module include a VPC, an ELB security group, and EFS volume mounting. Configurations can be passed via YAML or JSON, with integration support for AWS CloudWatch, Splunk, Grafana, and Prometheus.
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: "Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/03_Terraform"
|
||||
tags:
|
||||
- AWS
|
||||
- ECS
|
||||
- IaC
|
||||
- Terraform
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Learning Sessions ECS Deployment using IAC -20230808 183322-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Learning Sessions _ ECS Deployment using IAC -20230808_183322-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 03_Terraform
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 29 Cloud Monitoring – SaaS LZ accounts"
|
||||
title: CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Monitoring
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Landing-Zone
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
|
||||
@@ -27,7 +27,14 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## AWS Cloud Monitoring with OpsBridge
|
||||
|
||||
The session covers AWS cloud monitoring using Micro Focus OpsBridge, focusing on a new Cloud Monitoring feature. This containerized solution can be deployed on-prem or on AWS EKS and supports monitoring over 20 AWS data services, with data stored in the OPTIC Data Lake, which uses Vertica for performance dashboarding and reporting. The architecture collects data from CloudWatch metrics using read-only access to the monitored accounts, correlates the data, and updates the configuration management database.
|
||||
|
||||
Key points include deployment, monitoring setup, and operations. Cloud Monitoring is enabled within OpsBridge, requiring a one-time IAM role setup in customer accounts for read-only access. *Tag-based monitoring is emphasized as a best practice, with automation to identify missing tags.* The solution uses a single instance to monitor multiple accounts and regions.
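The automation that identifies missing tags can be sketched in a few lines. This is an illustrative example only: the required tag keys and the resource records are hypothetical, and the session does not describe how the actual automation is implemented.

```python
from typing import Dict, Iterable, List, Sequence

# Hypothetical required tag keys; the session does not list the real ones.
REQUIRED_TAGS: Sequence[str] = ("Owner", "Environment", "Product")


def find_missing_tags(resources: Iterable[dict],
                      required: Sequence[str] = REQUIRED_TAGS) -> Dict[str, List[str]]:
    """Return {resource_id: [missing tag keys]} for resources that lack tags.

    A tag counts as missing when the key is absent or its value is empty.
    """
    report: Dict[str, List[str]] = {}
    for resource in resources:
        tags = resource.get("tags", {})
        missing = [key for key in required if not tags.get(key)]
        if missing:
            report[resource["id"]] = missing
    return report


if __name__ == "__main__":
    # Made-up inventory records for illustration.
    inventory = [
        {"id": "i-aaa", "tags": {"Owner": "team-a", "Environment": "prod", "Product": "x"}},
        {"id": "i-bbb", "tags": {"Owner": "team-b"}},
        {"id": "i-ccc", "tags": {}},
    ]
    for resource_id, missing in find_missing_tags(inventory).items():
        print(resource_id, "is missing:", ", ".join(missing))
```

Resources that show up in such a report would not be picked up by tag-based monitoring until their tags are completed.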
|
||||
|
||||
Data consumption occurs via event dashboards, topology views, and performance dashboards. The solution is being developed in collaboration with the product R&D team, with new reporting features expected in the next release. The demo showcased event perspectives, performance dashboards, and topology views, highlighting event details, historical usage, and hierarchical resource presentation. The operational model's impact on application teams was discussed, including data feedback, OpsBridge expertise, and outage detection capabilities.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Monitoring
|
||||
- SaaS
|
||||
- Landing-Zone
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 29 Cloud Monitoring – SaaS LZ accounts
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 29_ Cloud Monitoring – SaaS LZ accounts.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone"
|
||||
title: CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Landing-Zone
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
|
||||
@@ -27,7 +27,21 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> Spencer and Guy discuss implementing Elastic Kubernetes Service (EKS) in the AWS landing zone, focusing on a use case with Octane, a Micro Focus SaaS application that is IP-hungry. They faced challenges with the limited range of IP addresses available in AWS labs that run on the Micro Focus network.
|
||||
|
||||
The solution involved creating a private subnet within their own address space, not connected to the main subnet, to provide a large number of IPs for EKS to use. *The problem was that this wasn't supported in the EKS sort of solution that was given to us.* They used Terraform and Terragrunt modules to create the lab, working with SRE to enable EKS to create its own subnet and use its own IPs within each pod.
|
||||
|
||||
Key points:
|
||||
* The EKS module has a flag for custom networking configuration to control IP allocation.
|
||||
* They demonstrated how to call the EKS module within Terraform code, specifying the subnet and mappings between federated accounts/roles.
|
||||
* They showed how to access the EKS cluster, get pods, and access both internal Micro Focus network resources and external resources from within a pod.
|
||||
* *Within the spec configuration, we basically have to put host network equals true.* (In a Kubernetes pod spec, this corresponds to the `hostNetwork: true` field.)
|
||||
* They addressed a question about container hardening guidelines, explaining that they had discussions with security teams and implemented strong security measures.
|
||||
* They mentioned that AWS may have contributed to the idea of this solution.
|
||||
* Atlantis cannot currently deploy EKS clusters; a Terragrunt module on Jenkins is used instead.
|
||||
* Mapping roles allows connection to the cluster and visibility of EKS components in the AWS console.
|
||||
* The number of node groups is currently hardcoded but will be made configurable in future versions.
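The "host network equals true" setting mentioned in the list above maps to the `hostNetwork` field of a Kubernetes pod spec. The sketch below builds a minimal pod manifest with that field set; the pod name and image are hypothetical placeholders, and the actual manifests used in the session were not shown in this summary.

```python
import json


def build_pod_manifest(name: str, image: str, host_network: bool = True) -> dict:
    """Build a minimal Kubernetes Pod manifest as a plain dictionary.

    `hostNetwork: true` makes the pod share the node's network namespace.
    The name and image arguments are illustrative placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostNetwork": host_network,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    manifest = build_pod_manifest("network-test", "busybox:latest")
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(manifest, indent=2))
```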
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Kubernetes
|
||||
- Landing-Zone
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 39 Implementing EKS in the AWS Lab Landing Zone
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 39_ Implementing EKS in the AWS Lab Landing Zone.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 42 Grafana Observability dashboard"
|
||||
title: CTP Topic 42 Grafana Observability dashboard
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- Grafana
|
||||
- Observability
|
||||
- Dashboard
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 42 Grafana Observability dashboard
|
||||
@@ -26,7 +26,28 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## Grafana Observability and Dashboards
|
||||
|
||||
Grafana is an open-source web application used for data visualization through charts and dashboards. It supports various data sources, including metrics (CPU load, memory usage) and logs (timestamps, debug levels). Data producers like Jenkins, CA servers, and AWS CloudWatch inject data into these sources, which Grafana then visualizes. *Grafana does not exist differently data source by itself. It needs to be expressed from the data, all kinds of data sources.*
|
||||
|
||||
The infrastructure architecture involves users accessing Grafana through a load balancer and auto-scaling groups. Grafana is installed in a monitoring account and configured to access other product team AWS accounts via IAM role policies. A Grafana monitoring role is assumed from a Terraform service catalog repo, granting access to various landing zone source accounts.
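The cross-account access described above depends on each source account trusting the Grafana monitoring role. As a rough sketch, the trust policy attached to the role in a source account could look like the document generated below. The account ID and role name are placeholders, and the policies actually provisioned by the Terraform service catalog repo are not shown in the session summary.

```python
import json


def build_trust_policy(monitoring_role_arn: str) -> dict:
    """Return an IAM trust policy that lets one principal assume a role.

    The ARN passed in is a placeholder for the Grafana monitoring role.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": monitoring_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical monitoring-account role ARN.
    arn = "arn:aws:iam::111111111111:role/grafana-monitoring"
    print(json.dumps(build_trust_policy(arn), indent=2))
```

The role in the source account would additionally need a read-only permissions policy for the data Grafana queries, such as CloudWatch metrics.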
|
||||
|
||||
Grafana offers user-level and team-level access controls, with roles like editor, viewer, and admin. Data sources are created with specific ARNs to access AWS accounts. Dashboards are dynamic, fetching data based on product team access. A sample dashboard includes CPU, I/O, network, EBS, and estimated charges monitoring. Alerting systems can be configured to notify channels like Microsoft Teams of high CPU usage or service downtime.
|
||||
|
||||
### Terraform and Automation
|
||||
|
||||
Terraform is used to automate Grafana resource provisioning. Modules exist for data sources and Grafana organizations. A demo scenario simulates onboarding Grafana for a new product group account using LZSAP. The process involves creating folders, calling modules, and using JSON input variables to define organization names and user access.
|
||||
|
||||
Dashboards are provisioned with data sources and regions as inputs. Grafana offers flexibility in dashboard layout and data visualization. Product teams can leverage these modules and customize dashboards with application-specific logs or custom CloudWatch metrics.
|
||||
|
||||
### Network Monitoring and Roadmap
|
||||
|
||||
Network monitoring is achieved using Prometheus as a data source for Check Point and firewall instances. A tool called norm is referenced to fetch metrics via the SNMP protocol. Key dashboards display inbound and outbound packet transfers, interface metrics, and CPU/disk usage.
|
||||
|
||||
The roadmap includes implementing alerting and notification rules, refining network monitoring dashboards, building application-specific dashboards, and enabling product groups to consume Grafana Terraform modules. The goal is to replace Micro Focus tools with Grafana for end-to-end monitoring. *We would like to build application specific dashboards which can basically give us key insight with respect to our applications that are running over there.*
|
||||
|
||||
Grafana offers open-source and paid versions (Grafana Enterprise and Grafana Cloud). User management is currently within the Grafana database but will move to LDAP or SSO.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 42 Grafana Observability dashboard
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- Grafana
|
||||
- Observability
|
||||
- Dashboard
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 42 Grafana Observability dashboard
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 54 ESM SaaS Log Analytics"
|
||||
title: CTP Topic 54 ESM SaaS Log Analytics
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- Log-Analytics
|
||||
- SaaS
|
||||
- ESM
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 54 ESM SaaS Log Analytics
|
||||
@@ -26,7 +26,22 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## ESM SaaS Log Analytics
|
||||
|
||||
Jackie, an ITOM ESM SaaS architect, discusses Log Analytics, covering concepts, architecture, regional setup, provisioning, security, and a demo of a counter solution. He also briefly compares different solutions.
|
||||
|
||||
The presentation begins with an overview of the ELK stack (Elasticsearch, Logstash, Kibana) and its open-source alternative, OpenSearch. Applications ship logs via Beats; Logstash then aggregates and processes them to give meaning to each column before they are stored in Elasticsearch or OpenSearch. Kibana is used as a front end for log visualization and analysis.
|
||||
|
||||
*The application collects your log, it's called the BEATS.* The architecture involves two VPCs: one for the application and another for logging. Filebeat, running as a container, continuously ships logs from the application VPC to the logging VPC. Logstash processes these logs, and OpenSearch stores them. End users can view logs via Kibana, connecting from a specified network. Redis is used as an optional buffer to prevent Logstash overload.
|
||||
|
||||
Due to legal reasons like GDPR, farms are split regionally, with farms in Oregon, the US, and Europe. Provisioning is done via CloudFormation or Terraform, but security hardening and continuous optimization pose challenges. Security measures include encryption at rest (using encrypted nodes and hardware-level encryption on NVMe devices) and in transit (using TLS 1.2). Traffic between VPCs is private, not over the internet. Index-based access control and RBAC are implemented for different user roles.
|
||||
|
||||
A demo shows how to search for specific IDs or services within the logs. The session then compares Logz.io, AWS OpenSearch, self-hosted ELK, and Micro Focus OBA. Logz.io is a managed ELK solution. ELK is easy to configure but complex to manage, while OBA is more mature, with commercial options and automated clustering. ELK supports fine-grained access control, while OBA supports column-level access control.
|
||||
|
||||
Cost estimates are provided based on a single farm usage with 14 days retention and 100GB processed daily. Logz.io costs around $4,000, while AWS OpenSearch costs around $1,500 or less. Self-hosted options can be very low cost but require more maintenance. Availability SLAs vary, with Logz.io offering 99.8% and AWS OpenSearch offering 99.9%. Disaster recovery is covered by the vendor for Logz.io, while AWS OpenSearch automatically captures snapshots.
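The sizing assumption behind these estimates (14 days of retention at 100 GB per day) translates into a steady-state volume that is easy to work out. The sketch below is a back-of-the-envelope calculation only: it ignores compression and index overhead, and the replica count is a hypothetical parameter that was not stated in the session.

```python
def retained_volume_gb(daily_ingest_gb: float,
                       retention_days: int,
                       replicas: int = 0) -> float:
    """Estimate the log volume held once retention reaches steady state.

    Primary data is daily_ingest_gb * retention_days; each replica adds
    one more full copy. Compression and index overhead are ignored.
    """
    primary = daily_ingest_gb * retention_days
    return primary * (1 + replicas)


if __name__ == "__main__":
    # Figures from the session: 100 GB/day, 14 days retention.
    print("Primary data (GB):", retained_volume_gb(100, 14))
    # Hypothetical: one replica copy for availability.
    print("With one replica (GB):", retained_volume_gb(100, 14, replicas=1))
```

With the session's figures, a single farm holds roughly 1.4 TB of primary log data at any time, which is the quantity the storage portion of each option's cost scales with.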
|
||||
|
||||
Recommendations for starting with Log Analytics include beginning with Logz.io for its trial period, then transitioning to AWS OpenSearch or self-hosted options for more control. The presentation concludes with a Q&A session covering GDPR requirements, log acquisition, cost details, scaling, and comparisons to other solutions. *We have already built up all the farms.*
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 54 ESM SaaS Log Analytics
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- Log-Analytics
|
||||
- SaaS
|
||||
- ESM
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 54 ESM SaaS Log Analytics
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 54_ ESM SaaS Log Analytics.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 59 Achieving reliability with Amazon EKS"
|
||||
title: CTP Topic 59 Achieving reliability with Amazon EKS
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Reliability
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 59 Achieving reliability with Amazon EKS
|
||||
@@ -27,7 +27,20 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## EKS Reliability with AWS
|
||||
|
||||
Surav Paul, a Senior Solutions Architect from AWS, presented on EKS (Elastic Kubernetes Service), covering container offerings and reliability practices. The session aimed to be interactive, encouraging questions about shared responsibility models, reliability-based practices, application reliability, and data plane reliability.
|
||||
|
||||
When considering container offerings on AWS, users can choose between Amazon Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS). ECS is recommended for those starting their container adoption journey, offering a simple interface with native AWS service integrations. EKS is suitable for those familiar with the Kubernetes ecosystem, providing flexibility with open community initiatives. *ECS is a more AWS opinionated way of running containers.* Both ECS and EKS offer multiple compute options, including VM images, serverless deployments (AWS Fargate), and on-prem deployments.
|
||||
|
||||
Reliability in a system means it offers predictable behavior even when failures occur. Key concerns include failure detection, graceful service degradation, deterministic failure modes, self-healing capabilities, and on-demand scaling. Reliability concerns are grouped under application, control plane, and data plane categories. The shared responsibility model dictates that AWS manages control plane components (state store, scheduler, controller manager, API servers), while customers manage aspects like worker nodes, operating systems, and application configurations. *With Fargate, you don't have to worry about managing the nodes or worrying about patching or upgrading the nodes.*
|
||||
|
||||
Application reliability involves avoiding singleton pods and spreading application pods across availability zones using pod anti-affinity or topology spread constraints. Topology spread constraints offer finer-grained control over workload distribution. Collecting metrics via the metrics server is crucial for scaling, with HPA (Horizontal Pod Autoscaler) using CPU utilization and memory consumption by default, and custom/external metrics available. VPA (Vertical Pod Autoscaler) can right-size pods, but runtime adjustments cause restarts. Deployment strategies include rolling upgrades, blue-green deployments, and canary deployments, each with different levels of control and complexity. Liveness, readiness, and startup probes are essential for monitoring pod health, and pod disruption budgets ensure minimum service levels during maintenance.
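The scaling decision made by the Horizontal Pod Autoscaler follows the formula given in the Kubernetes documentation: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below applies that formula with minimum and maximum bounds. It is a simplification: the real controller also applies a tolerance band and stabilization windows, and the numbers in the usage example are made up.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 10) -> int:
    """Compute the replica count the HPA formula would ask for.

    Simplified: omits the controller's tolerance and stabilization logic.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    desired = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as the HPA does with minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical: 4 pods averaging 90% CPU against a 60% target.
    print(hpa_desired_replicas(4, current_metric=90, target_metric=60))
```

Keeping `min_replicas` at two or more also supports the advice above about avoiding singleton pods.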
|
||||
|
||||
Control plane reliability involves monitoring control plane metrics (API server requests, etcd state store size) to prevent issues. Securing cluster authentication by creating a secure user with the super admin role is crucial. Admission webhooks should be carefully configured and tested to avoid obstructing the control plane. Cluster upgrades have control plane and data plane phases, with EKS platform versions handling patch releases transparently. Minor versions have a 14-month support cycle before automatic upgrades occur.
|
||||
|
||||
Data plane reliability involves using tools like node problem detector, reserving system resources, implementing quality of service, and configuring resource quotas and limit ranges. Pod priority and preemption are also important.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 59 Achieving reliability with Amazon EKS
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Kubernetes
|
||||
- Reliability
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 59 Achieving reliability with Amazon EKS
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 59_ Achieving reliability with Amazon EKS.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana"
|
||||
title: CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Grafana
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Hyperscale
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
|
||||
@@ -27,7 +27,20 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## Monitoring AWS Using Hyperscale Observability with Grafana
|
||||
|
||||
This session is a continuation of a previous session about Grafana. It focuses on recent capabilities and features now available. Vinay covers the session, in place of Sashi, who is on leave.
|
||||
|
||||
The session recaps previous discussions, including the effective use of Grafana with different data sources, creating queries, and customizing visualizations. Grafana's ability to provision infrastructure and applications using Terraform modules (dashboard as code) is highlighted, along with its use for SNMP-based network infrastructure monitoring. The move from the open-source version of Grafana to the enterprise license version is emphasized to leverage the full potential of Grafana.
|
||||
|
||||
Key highlights explored through demonstrations include data source integration, event tracking, alert integrations, instance monitoring, and resource tracking. Optic DR, an internal monitoring solution and plugin of VaticaDB, is crucial for pulling data into Grafana dashboards. *OpsBridge monitoring solutions use a dashboard to display events triggered by the monitoring systems.* Grafana's alert system is flexible and can be configured to use different notification channels, with the ability to forward alerts to OpsBridge to create incidents. Instance monitoring helps identify resource utilization, and resource tagging categorizes resources for effective management.
|
||||
|
||||
The session covers the use of a Terraform module for product teams, which creates Grafana organizations, users, folders, IAM roles, and dashboards for AWS services. *The product team can consume the modules by using a sample Terragrunt HCL file.* Default dashboards are provided for accounts onboarded to code, with prerequisites outlined in a readme file. Several default dashboards are offered to product teams, such as billing information dashboards that display resource utilization and EC2 dashboards that can be customized. Customized dashboards can consolidate all services into a single view, though this is typically limited to one account and one region.
|
||||
|
||||
EC2 inventory dashboards, using data from Optic DR, provide a view of running and non-running EC2 instances and identify whether resources are tagged. Event dashboards display daily active events triggered by OpsBridge AWS monitoring solutions, with ongoing integration of alerts generated by Grafana. Future roadmap items include SSO authentication, reporting capabilities, URL monitoring, process monitoring, log monitoring, and integration with other products like PagerDuty and Slack Manager.
|
||||
|
||||
The session concludes with a discussion of next steps and collaboration, encouraging users to leverage available dashboards and provide feedback or enhancement requests. The team also addresses questions about the cost impact of joining the service, clarifying that default metrics do not incur additional costs, but custom metrics may.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Grafana
|
||||
- Observability
|
||||
- Hyperscale
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 60 Monitor AWS using Hyperscale Observability with Grafana
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 60_ Monitor AWS using Hyperscale Observability with Grafana.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 64 Scaling out with Amazon EKS"
|
||||
title: CTP Topic 64 Scaling out with Amazon EKS
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Scaling
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 64 Scaling out with Amazon EKS
|
||||
@@ -27,7 +27,26 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## Scaling Out with Amazon EKS
|
||||
|
||||
The 64th Cloud Transformation Program session covers scaling out with Amazon EKS, with a special guest presenter from AWS. The session is interactive and encourages questions, with a survey link to be shared for feedback.
|
||||
|
||||
Suravpul, a senior solutions architect from AWS, discusses scaling workloads using the Horizontal Pod Autoscaler (HPA), event-driven autoscaling with KEDA, capacity autoscaling (Cluster Autoscaler and Karpenter), addressing IP exhaustion, and scaling cluster components like DNS.
|
||||
|
||||
The Horizontal Pod Autoscaler (HPA) is the standard Kubernetes mechanism for scaling application workloads, using metrics to determine replica requirements. It supports CPU and memory utilization out of the box via a metrics server. Custom and external metrics, such as those from load balancers or messaging middleware, can also be used. *The horizontal pod autoscaler is going to pull the metrics and it is going to calculate how many replicas are required for your application workload.* The speaker notes that the gap between the target threshold and 100% utilization is important because it is the headroom available while new replicas start, and addresses flapping via the `periodSeconds` and `stabilizationWindowSeconds` settings. HPA currently considers resource consumption only at the pod level, not at the container level.
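
The replica calculation described above can be sketched as follows. This is a minimal model of the formula published in the Kubernetes documentation (desired = ceil(current × currentMetric / targetMetric), skipped when the ratio is within a tolerance of 1.0); the function name and the default tolerance of 0.1 are assumptions for illustration, and the min/max replica clamp and stabilization window are left out.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count the HPA would aim for, before min/max clamping."""
    ratio = current_metric / target_metric
    # Within the tolerance band the replica count is left unchanged,
    # which avoids scaling on small metric fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, four replicas averaging 90% CPU against a 60% target give a ratio of 1.5 and a desired count of six. A lower target leaves more headroom for the time it takes new pods to become ready.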
|
||||
|
||||
KEDA allows scaling application workloads based on external events, using a custom resource definition called a `ScaledObject`. It can scale applications from zero replicas, or publish metrics for the horizontal pod autoscaler to use.
|
||||
|
||||
Capacity autoscaling can be achieved using Fargate or EC2 instances. For EC2 instances, Cluster Autoscaler or Karpenter can be used. Cluster Autoscaler is tied to auto scaling groups and node groups, updating the desired capacity of the auto scaling group based on the number of pending pods. It considers CPU and memory requests, and supports mixed instances policies. *The scaling decision that is made by the cluster auto scaler, it is done on the number of pending pods in the cluster.* Auto-discovery is recommended, and changes to min/max configuration should be made at the managed node group or auto scaling group level.
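
To make the "pending pods drive desired capacity" idea concrete, here is a deliberately simplified sketch. It packs pending pods onto identical new nodes with a first-fit-decreasing heuristic and clamps the result to the group's min/max. The real Cluster Autoscaler simulates the Kubernetes scheduler and is considerably more involved; every name below (`Resources`, `nodes_needed`, `new_desired_capacity`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Resources:
    cpu_millis: int
    memory_mib: int


def nodes_needed(pending: List[Resources], node: Resources) -> int:
    """Count identical new nodes needed to fit the pending pods' requests."""
    free: List[Resources] = []  # remaining capacity on each new node
    ordered = sorted(pending, key=lambda p: (p.cpu_millis, p.memory_mib), reverse=True)
    for pod in ordered:
        if pod.cpu_millis > node.cpu_millis or pod.memory_mib > node.memory_mib:
            raise ValueError("pod does not fit on an empty node of this type")
        for i, slot in enumerate(free):
            if pod.cpu_millis <= slot.cpu_millis and pod.memory_mib <= slot.memory_mib:
                free[i] = Resources(slot.cpu_millis - pod.cpu_millis,
                                    slot.memory_mib - pod.memory_mib)
                break
        else:
            # No existing new node has room, so open another one.
            free.append(Resources(node.cpu_millis - pod.cpu_millis,
                                  node.memory_mib - pod.memory_mib))
    return len(free)


def new_desired_capacity(current: int, pending: List[Resources], node: Resources,
                         min_size: int, max_size: int) -> int:
    """Desired capacity for the auto scaling group, clamped to its bounds."""
    return max(min_size, min(max_size, current + nodes_needed(pending, node)))
```

Note that the clamp uses the group's own min/max, which is why the session recommends changing those values on the managed node group or auto scaling group rather than elsewhere.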
|
||||
|
||||
Karpenter is an open-source, Kubernetes-native capacity autoscaler that interacts directly with the EC2 API, offering dynamic on-demand provisioning and improved speed. It does not depend on pre-configured node groups or auto scaling groups. Karpenter uses the concept of a provisioner to define requirements for EC2 instances, which are matched with workload requirements using node selectors and affinity terms. Reclamation is disabled by default, so TTL or cluster consolidation must be enabled. Karpenter is recommended for clusters with varying capacity and workload requirements.
|
||||
|
||||
To address IP exhaustion, switching to IPv6 addressing is recommended. If that is not possible, custom networking can be used with carrier-grade NAT. For IPv6, a dual-stack VPC is recommended, with nodes supporting dual-stack IP addresses but pods having only IPv6 addresses. Interaction between IPv6 pods and IPv4 destinations is configured by using NAT at two different layers.
|
||||
|
||||
Additional considerations for scaling include enabling API server priority and fairness metrics, enabling caching and disabling compression, removing underutilized nodes, and limiting scaling spikes. Scaling the DNS component (CoreDNS) and installing node local DNS cache are also important.
|
||||
|
||||
The presentation concludes by recommending the EKS best practices guides, specifically the scalability section.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 64 Scaling out with Amazon EKS
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Kubernetes
|
||||
- Scaling
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 64 Scaling out with Amazon EKS
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 64_ Scaling out with Amazon EKS.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 67 Cloud native observability using OpenTelemetry"
|
||||
title: CTP Topic 67 Cloud native observability using OpenTelemetry
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- OpenTelemetry
|
||||
- Observability
|
||||
- Cloud-Native
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using OpenTelemetry.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using OpenTelemetry.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 67 Cloud native observability using OpenTelemetry
|
||||
@@ -26,7 +26,14 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> Surav from AWS presented a session on observability for Amazon EKS, covering the need for observability, code instrumentation using OpenTelemetry, defining pipelines, AWS Distro for OpenTelemetry collector deployment patterns, and observability deployment options on EKS and ECS.
|
||||
|
||||
Observability is essential for managing complexity as systems evolve. *Building observable applications is a developer responsibility.* Key signals to collect include traces, metrics, and logs, enabling reactive and proactive troubleshooting. AWS offers native options like CloudWatch and X-Ray, alongside open-source solutions such as Jaeger, Zipkin, Prometheus, and Grafana, either self-hosted or managed. The AWS Distro for OpenTelemetry (ADOT) is a secure, production-ready solution with AWS-developed components, offering support for operational issues.
|
||||
|
||||
OpenTelemetry provides a vendor-agnostic instrumentation library, simplifying code instrumentation. The OpenTelemetry collector uses receivers, processors, and exporters to manage signals. Receivers collect signals, processors transform them, and exporters send them to destinations. *A trace captures the processing time taken at individual layers in your application call stack.* ADOT includes the AWS SigV4 extension for seamless integration with AWS services. Collecting metrics from both application and infrastructure layers allows comprehensive application views, including business-level metrics, service maps from X-Ray traces, and application logs. Correlation IDs, like the X-Ray trace ID, enable deep links to trace views from log events.
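
The receiver, processor, exporter flow can be pictured with a toy pipeline. This is a conceptual sketch only: it does not use the OpenTelemetry SDK or the collector's configuration format, and all names (`run_pipeline`, `app_traces`, `drop_health_checks`, `tag_environment`) are invented for the example.

```python
from typing import Callable, Dict, Iterable, List

Signal = Dict[str, object]
Receiver = Callable[[], Iterable[Signal]]
Processor = Callable[[List[Signal]], List[Signal]]
Exporter = Callable[[List[Signal]], None]


def run_pipeline(receivers: List[Receiver], processors: List[Processor],
                 exporters: List[Exporter]) -> List[Signal]:
    """Collect from every receiver, apply processors in order, fan out to exporters."""
    batch = [signal for receive in receivers for signal in receive()]
    for process in processors:
        batch = process(batch)
    for export in exporters:
        export(batch)
    return batch


def app_traces() -> List[Signal]:
    # Stand-in for a receiver; real receivers listen on a protocol endpoint.
    return [{"name": "GET /orders", "duration_ms": 120},
            {"name": "GET /healthz", "duration_ms": 2}]


def drop_health_checks(batch: List[Signal]) -> List[Signal]:
    # Filter-style processor: discard noisy health-check spans.
    return [s for s in batch if s["name"] != "GET /healthz"]


def tag_environment(batch: List[Signal]) -> List[Signal]:
    # Attribute-style processor: enrich every signal with an extra field.
    return [{**s, "env": "demo"} for s in batch]


exported: List[Signal] = []  # stand-in for a backend such as X-Ray or Prometheus
run_pipeline([app_traces], [drop_health_checks, tag_environment], [exported.extend])
```

Because each stage only sees a batch of signals, the same instrumented application can be pointed at a different backend by swapping the exporter, which is the vendor-agnostic property the session highlights.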
|
||||
|
||||
ADOT is a repackaged OpenTelemetry collector with AWS-developed components. It offers receivers like Prometheus and X-Ray, processors like batch and filter, and exporters like X-Ray, CloudWatch, Prometheus, and EMF. In ECS deployments, the AWS ECS container metrics receiver collects infrastructure metrics, while the Prometheus remote write exporter sends metrics to Prometheus. The SigV4 auth extension is used for AWS API calls. ADOT can be deployed as a sidecar container or a separate task, with configurations for scraping targets and defining pipelines. Deployment patterns include sidecar, separate task, DaemonSet, and high-availability replicas. The ADOT add-on for EKS simplifies deployment with an operator and Terraform module, including prebuilt Grafana dashboards. Costs depend on the destination service, such as metric storage for Prometheus or trace ingestion for X-Ray. An observability workshop and best practices site offer further guidance.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 67 Cloud native observability using OpenTelemetry
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- OpenTelemetry
|
||||
- Observability
|
||||
- Cloud-Native
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using OpenTelemetry.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 67 Cloud native observability using OpenTelemetry
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 67_ Cloud native observability using OpenTelemetry.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 70 EKS deployment using IAC"
|
||||
title: CTP Topic 70 EKS deployment using IAC
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Kubernetes
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 70 EKS deployment using IAC
|
||||
@@ -27,7 +27,31 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## EKS Deployment Using Infrastructure As Code
|
||||
|
||||
This session covers EKS cluster deployment via Infrastructure as Code (IAC), focusing on managing containers and worker nodes using the SRE EKS module. Key capabilities include cluster autoscaling, ingress controller, and custom networking. The agenda includes comparing containers and VMs, discussing EKS features, and demonstrating EKS deployment via Terraform and Service Catalog. Monitoring the EKS stack and containers for proactive alerting is also covered.
|
||||
|
||||
The discussion begins with the differences between VMs and containers, highlighting the benefits of containers such as reduced boot time, memory efficiency, and portability. Kubernetes is presented as a framework for running distributed systems resiliently, automating rollouts/rollbacks, load balancing, and horizontal pod scaling.
|
||||
|
||||
EKS, a managed Kubernetes service by Amazon, offers features like fully managed control planes and autoscaling worker nodes. *Zero downtime rolling deployments for worker node updates* and IAM RBAC mapping for least privilege access are implemented. The SRE EKS module integrates an ALB ingress controller for traffic management and ENI custom networking for pods to handle CIDR limitations.
|
||||
|
||||
### Deployment Methods
|
||||
|
||||
Two deployment methods are detailed:
|
||||
|
||||
1. **Terraform:** Using a `terragrunt.hcl` file, users can define environment variables, EKS cluster version, and worker node types (CPU, GPU, or default). Integration with AWS Secrets Manager is included for engineering contact notifications.
|
||||
2. **Service Catalog:** This method allows users to create EKS clusters via a module with version selection and worker node type configuration. It provides more control over security and permissions.
|
||||
|
||||
*Service Catalog allows creating, organizing, and governing AWS resources with permission control.*
|
||||
|
||||
### Custom Networking and Autoscaling
|
||||
|
||||
Custom networking for pods addresses CIDR limitations by adding a virtual ENI (elastic network interface) to assign IP addresses to pods. The Kubernetes Cluster Autoscaler automatically scales worker nodes based on resource needs. Future implementation of Karpenter is being considered for more efficient instance type creation based on pod requirements.
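
One practical consequence of custom networking is a lower pod density per node, because the primary network interface no longer hands out pod IPs. The sketch below applies the max-pods arithmetic documented for the Amazon VPC CNI without prefix delegation; the function name is hypothetical, and the interface counts must be looked up per instance type.

```python
def max_pods(enis: int, ipv4_per_eni: int, custom_networking: bool = False) -> int:
    """Estimate schedulable pods per node for the VPC CNI (no prefix delegation)."""
    # With custom networking the primary interface is reserved for the node itself.
    usable_enis = enis - 1 if custom_networking else enis
    # One address per interface is the interface's own; +2 covers host-network pods.
    return usable_enis * (ipv4_per_eni - 1) + 2
```

For an instance type with 3 interfaces and 10 IPv4 addresses per interface (for example `m5.large`), this gives 29 pods by default and 20 with custom networking, so node counts may need to be revisited when the feature is enabled.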
|
||||
|
||||
### Monitoring
|
||||
|
||||
Monitoring is achieved using the CloudWatch agent and Fluent Bit deployed as DaemonSets. Container Insights needs to be enabled to publish metrics to CloudWatch. The process involves applying manifest files within the cluster to set up CloudWatch logs and metrics. AWS Distro for OpenTelemetry can also be used for monitoring. Centralized Grafana instances are available for visualizing metrics via templated dashboards, including an EKS-specific dashboard.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 70 EKS deployment using IAC
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- IaC
|
||||
- Kubernetes
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 70 EKS deployment using IAC
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 70_ EKS deployment using IAC.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol"
|
||||
title: CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Monitoring
|
||||
- Observability
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
|
||||
@@ -26,7 +26,16 @@ status: raw
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
> ## Cloud Monitoring Using OBM Implementation
|
||||
|
||||
The session covers the implementation of cloud monitoring using Micro Focus's Operations Bridge Manager (OBM), a solution designed to address gaps in existing monitoring systems like SiteScope, especially with the increasing shift towards public cloud environments. Compared to SiteScope, OBM offers dynamic monitoring of AWS core services and enhanced security.
|
||||
|
||||
The current architecture involves data collection from various sources (infrastructure, servers, applications, hardware, and networks) using data collectors like SiteScope, HPCM, and norm, feeding into regional OBMs. These regional OBMs then send data to a global OBM, which acts as a manager of managers. The global OBM integrates with SMAX, enabling the OSE team to escalate and create tickets for events. A new regional OBM setup is planned for AWS cloud monitoring in a lab landing zone environment in Frankfurt. The OBM account will be part of the digital factory landing zone, interacting with core accounts like shared, logs, and security accounts. The regional OBM collects data from different AWS accounts through an operation agent and the CloudWatch API, forwarding it to the on-premises global OBM.
|
||||
|
||||
The architecture includes an OBM AWS account with an OBM application, a Postgres RDS database, and a separate instance with an operation agent. The operation agent collects data using OBM management packs, specifically the AWS management pack, which instructs the agent to gather data from different accounts. *The agent uses role-based access to collect data from CloudWatch API, eliminating the need to install servers in customer accounts and share sensitive access keys.* The management pack solution uses policies to define monitoring intervals, specific metrics, and data collection from specific accounts, matching data against thresholds to trigger events. *Whenever new instances are added, policies are automatically deployed, and monitoring begins, offering dynamic monitoring capabilities.*
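
The "match collected data against thresholds to trigger events" step can be sketched in a few lines. This is not the OBM management pack's policy format, which the session does not show; the `ThresholdPolicy` fields, the `evaluate` function, and the event title layout are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ThresholdPolicy:
    account_id: str   # AWS account the metric is collected from
    namespace: str    # CloudWatch namespace, e.g. "AWS/EC2"
    metric: str       # CloudWatch metric name
    threshold: float  # an event fires when the value exceeds this
    severity: str


def evaluate(policy: ThresholdPolicy, datapoints: Dict[str, float]) -> List[str]:
    """Return one event title per resource whose latest value breaches the policy."""
    events = []
    for resource_id, value in sorted(datapoints.items()):
        if value > policy.threshold:
            events.append(
                f"[{policy.severity}] {policy.namespace}/{policy.metric} on "
                f"{resource_id} in account {policy.account_id}: "
                f"{value} > {policy.threshold}"
            )
    return events
```

Because the policy is evaluated over whatever resources appear in the collected datapoints, a newly launched instance is covered as soon as its metrics show up, which mirrors the dynamic monitoring behaviour described above.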
|
||||
|
||||
For onboarding new customers, an IAM role with CloudWatch read-only access needs to be created, and the AWS account where the OBM and operation agent reside must be added to the trust relationship tab. The role ARN is then added as a policy in the OBM account's IAM role, attached to the agent node. The process involves specifying the role ARN, account ID, namespaces/services to be monitored, metrics, thresholds, monitoring frequency, and title format. The title format is enriched to provide useful information for the service center team, facilitating escalation and runbook execution. CloudWatch custom metrics can be used for metrics not exposed by default. The OBM management pack solution can monitor any public cloud vendor (Amazon, Azure, Google Cloud) and any AWS service with data exposed to CloudWatch metrics, using both metrics and logs. The solution is dynamic and customizable, with all data collected from the OBM account without requiring any installations in customer accounts.
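
The two IAM documents involved in onboarding can be sketched as plain JSON built in Python. The account IDs and role name are placeholders, and the helper names are hypothetical; the policy grammar itself (`Version`, `Statement`, `sts:AssumeRole`) is standard IAM. The customer-side role would also carry a read-only CloudWatch permission such as the AWS managed policy `CloudWatchReadOnlyAccess`.

```python
import json
from typing import Dict


def build_trust_policy(monitoring_account_id: str) -> Dict:
    """Trust relationship for the customer-side role: lets the OBM account assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{monitoring_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def build_assume_role_policy(customer_role_arn: str) -> Dict:
    """Policy for the agent node's role in the OBM account: allows assuming the customer role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": customer_role_arn,
        }],
    }


# Placeholder identifiers, not real accounts.
trust = build_trust_policy("111111111111")
assume = build_assume_role_policy("arn:aws:iam::222222222222:role/obm-cloudwatch-readonly")
print(json.dumps(trust, indent=2))
```

With this pairing no access keys are exchanged: the agent obtains temporary credentials through the role, and the customer can revoke access at any time by editing the trust relationship.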
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/04_EKS
|
||||
tags:
|
||||
- AWS
|
||||
- Monitoring
|
||||
- Observability
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 8 Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 8_ Implementation of Cloud monitoring using Micro Focus Operations Bridge Monitoring Sol.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording
|
||||
@@ -24,28 +24,37 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## EKS Optimization with Karpenter
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
This session introduces Karpenter, an open-source compute infrastructure management tool for Kubernetes clusters that addresses challenges associated with the traditional Cluster Autoscaler. Karpenter offers native integration with Kubernetes, direct EC2 Fleet API communication, and intelligent workload placement and consolidation based on cost and utilization.
|
||||
|
||||
---
|
||||
Key differences between Karpenter and Cluster Autoscaler:
|
||||
* Karpenter integrates with Kubernetes workload scheduling constructs.
|
||||
* It directly communicates with the EC2 Fleet API, reducing latency.
|
||||
* It provides native experiences for workload placement and node consolidation.
|
||||
|
||||
## 关键概念
|
||||
Karpenter has two core components: node pools and node classes. Node pools define scheduling constraints and capacity limits, while node classes define instance provisioning details like subnets, node roles, and AMIs.
|
||||
|
||||
-
|
||||
Karpenter supports Kubernetes scheduling constraints like node selectors, affinity, taints, tolerations, and topology spread, along with AWS placement requirements such as purchasing options, processor architectures, and availability zones. It can identify zonal requirements based on volume claims and storage classes, simplifying workload definitions compared to Cluster Autoscaler.
|
||||
|
||||
---
|
||||
_*Karpenter has native integration with Kubernetes and it complements the native Kubernetes spot pod scheduling constraints that is available for your workloads.*_
|
||||
|
||||
## 行动项
|
||||
Karpenter natively supports spot interruptions without requiring additional components like the node termination handler. It uses EventBridge and SQS to handle spot interruption notifications, instance rebalance notifications, health events, and instance state change events.
|
||||
|
||||
-
|
||||
Node pools can be designed for various scenarios, including single node pools, mixed compute/accelerated nodes, or isolated node pools based on cost, security, or multi-tenancy. Weighted node pools can prioritize instances based on existing commitments or reservations.
|
||||
|
||||
---
|
||||
Karpenter simplifies data plane management by removing pain points associated with node groups, integrating node termination handling, and providing native integration with Kubernetes scheduling constraints. It also helps consolidate compute instances for greater cost efficiency.
|
||||
|
||||
## 相关视频
|
||||
_*Karpenter not only does the auto-scaling bit, but it also removes the pain points of working with node groups.*_
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
Karpenter can automatically upgrade AMIs or use defined AMIs, referring to the parameter store for the latest EKS optimized AMIs for the corresponding control plane version. It identifies drift between the desired state and running machines, rolling out changes in a rolling upgrade fashion.
|
||||
|
||||
---
|
||||
AMI selection can be pinned to specific versions or use custom AMIs. The AMI family setting tells Karpenter what user data to inject when spinning up instances.
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
Consolidation policies can be configured with fine-grained budgets, such as preventing consolidation during peak business hours or limiting the percentage of instances disrupted at a time.
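
A percentage-based disruption budget can be reasoned about with a small calculation. The sketch assumes the behaviour described in the Karpenter disruption documentation, where the allowance is the rounded-up percentage of nodes minus nodes already being deleted or not ready; treat the formula and the `allowed_disruptions` name as assumptions rather than Karpenter's actual code.

```python
import math


def allowed_disruptions(total_nodes: int, budget_percent: int,
                        deleting: int = 0, not_ready: int = 0) -> int:
    """Nodes that may still be voluntarily disrupted under a percentage budget."""
    ceiling = math.ceil(total_nodes * budget_percent / 100)
    # Nodes already unavailable count against the budget; never go below zero.
    return max(0, ceiling - deleting - not_ready)
```

A budget of 0 for a scheduled window would model the "no consolidation during peak business hours" case mentioned above, since the function then returns 0 regardless of cluster size.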
|
||||
|
||||
Karpenter publishes logs and emits Prometheus metrics for observability, with community-maintained dashboards available for visualization.
|
||||
|
||||
Onboarding is simple: Karpenter itself must be deployed on capacity it does not manage, such as a small node group or Fargate. Migration guides are available for moving from Cluster Autoscaler.
|
||||
|
||||
The session is the first in a series of three, with subsequent sessions covering the Bottlerocket operating system and EKS Auto Mode.
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Karpenter
|
||||
- Cost-Optimization
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204 170113-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 1 of 3 - Compute Optimization with Karpenter - 20250204_170113-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording
|
||||
@@ -24,28 +24,12 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## EKS Optimization: Running Containers with Bottlerocket OS
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
This session focuses on Bottlerocket OS and its benefits for running containerized workloads in EKS. Bottlerocket is a Linux-based operating system designed specifically for hosting containers; unlike general-purpose OSes, it includes only essential components. It is free, open-source, and maintained on GitHub, with AWS as a core maintainer and sponsor. Bottlerocket can be run on laptops, workstations, or in data centers, and is designed to be minimal, enforce safe updates, and be security-focused.
|
||||
|
||||
---
|
||||
Bottlerocket is minimal because it lacks unnecessary software, drivers, and tools. It does not include a package manager, default shell interpreter, or default SSH access. Only essential kernel components are packaged into the OS image at build time. To accommodate specific workload needs such as GPU resources, Bottlerocket uses variants, each built with the specific packages, drivers, and tools it needs. *A variant is basically a combination of platform, supported platform, the processor architecture and the necessary binary components that are supported by the processor architecture and any additional packages and drivers that are required for your specific workloads.* Configuration is managed through an API interface or TOML-formatted user data.
|
||||
|
||||
## 关键概念
|
||||
Safe updates are enforced through in-place updates and node replacement. An in-place update downloads the new image version to an inactive partition and switches the active partition on reboot, which keeps the system consistent. The data volume caches container images and can be pre-populated with images via snapshots. Security is enhanced through secure boot, cryptographic verification of the root file system using dm-verity, and an immutable root file system. The `/etc` directory is a temporary file system, and SELinux is enabled by default in enforcing mode. *The root file system is by default immutable, you cannot change anything there.* Bottlerocket has a dedicated CIS benchmark for hardening, and comprehensive security guidance is available.
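The in-place update flow described above (write the new image to the inactive partition, then switch on reboot) can be illustrated with a small model. This is only a sketch under assumptions: the class `TwoSlotHost`, the slot names `A`/`B`, and the version strings are hypothetical and do not correspond to any Bottlerocket API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TwoSlotHost:
    """Hypothetical model of a host with an active and an inactive image slot."""
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"A": "1.0.0", "B": None}
    )
    active: str = "A"
    pending: Optional[str] = None  # slot that becomes active on next reboot

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage_update(self, version: str) -> None:
        # The new image is written to the inactive slot only,
        # so the running system is never modified in place.
        slot = self.inactive()
        self.slots[slot] = version
        self.pending = slot

    def reboot(self) -> None:
        # The active slot flips only if an update was staged.
        if self.pending is not None:
            self.active = self.pending
            self.pending = None

    def running_version(self) -> Optional[str]:
        return self.slots[self.active]


host = TwoSlotHost()
host.stage_update("1.1.0")
before_reboot = host.running_version()  # still the old image
host.reboot()
after_reboot = host.running_version()   # new image after the switch
print(before_reboot, after_reboot)
```

Because the previous image stays untouched in the other slot, the same switch can in principle be reversed, which is the property that makes this style of update safe.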
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
Bottlerocket integrates with EKS through EKS-optimized variants and is supported across self-managed node groups, managed node groups, and Karpenter node pools. It can be configured using tools like eksctl and Karpenter, with best practices including pinning the AMI to a specific version.
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Bottlerocket
|
||||
- OS
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218 170127-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 2 of 3 - Running Containers with Bottlerocket OS - 20250218_170127-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording
|
||||
@@ -23,28 +23,20 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## EKS Optimization: Introduction to EKS Auto Mode
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
This session focuses on EKS Auto Mode, the third part of a series on EKS optimization. EKS Auto Mode extends the management responsibilities of the EKS service to the data plane, managing instances, operating systems, patches, and security updates. It leverages core capabilities like Karpenter for infrastructure management, a managed EBS CSI driver for stateful workloads, and the AWS Load Balancer Controller.
|
||||
|
||||
---
|
||||
Key benefits of EKS Auto Mode include increased agility, automatic consolidation, dynamic instance determination, and optimized compute costs. *With Auto Mode, a majority of the operational concerns are being managed by the EKS service.* Core capabilities are managed within instances provisioned inside the EKS account, while customers retain control over VPC infrastructure, cluster configuration, add-ons, and workload configurations.
|
||||
|
||||
## 关键概念
|
||||
EKS Auto Mode offers an easier interface for working with EKS, providing data plane management in addition to control plane management. It supports a wide range of EC2 instances (excluding bare metal) and is fully compatible with Kubernetes-compliant workloads. Security is enhanced through the use of the Bottlerocket operating system and automated patch management. The core cluster capabilities are grouped under compute (Karpenter controller), networking (AWS Load Balancer Controller), storage (EBS CSI controller), and security (pod identity associations).
|
||||
|
||||
-
|
||||
By default, Auto Mode includes two node pools (general purpose and system) and one node class. The default node pools are immutable and configured with zero weight, allowing custom node pools to be prioritized. The general purpose node pool is locked to AMD64 architecture, while custom node pools can be defined for Graviton instances. Instances in the system node pool have a taint applied, requiring corresponding tolerations for system add-ons.
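The effect of the zero-weight default pools can be sketched as a priority ordering. The sketch assumes Karpenter's convention that a higher weight means higher priority. The pool names and weights for the custom pools are hypothetical, and real scheduling also considers taints, requirements, and limits, which are ignored here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NodePool:
    name: str
    weight: int                     # higher weight is tried first (assumed convention)
    architectures: Tuple[str, ...]  # CPU architectures the pool may launch


def order_by_priority(pools: List[NodePool]) -> List[NodePool]:
    # sorted() is stable, so pools with equal weight keep their input order.
    return sorted(pools, key=lambda p: p.weight, reverse=True)


def first_match(pools: List[NodePool], arch: str) -> Optional[str]:
    """Return the name of the highest-priority pool that supports `arch`."""
    for pool in order_by_priority(pools):
        if arch in pool.architectures:
            return pool.name
    return None


default_pools = [NodePool("general-purpose", 0, ("amd64",))]
custom_pools = [
    NodePool("custom-graviton", 10, ("arm64",)),  # hypothetical custom pool
    NodePool("custom-amd64", 50, ("amd64",)),     # hypothetical custom pool
]

only_defaults = first_match(default_pools, "amd64")
with_custom_amd64 = first_match(default_pools + custom_pools, "amd64")
with_custom_arm64 = first_match(default_pools + custom_pools, "arm64")
print(only_defaults, with_custom_amd64, with_custom_arm64)
```

Because the built-in pool has weight zero, any custom pool with a positive weight is considered first, and the default pool acts as a fallback.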
|
||||
|
||||
---
|
||||
Networking in Auto Mode includes CoreDNS packaged with every node as a system service, the VPC CNI as a system service, and kube-proxy set up in iptables mode. Prefix delegation is enabled by default. The AWS Load Balancer Controller is available as a core capability, using an EKS Auto Mode-specific load balancer class. The packaged CSI controller requires a storage class that refers to the EKS-specific EBS CSI provisioner.
|
||||
|
||||
## 行动项
|
||||
Version upgrades in Auto Mode are initiated by an operator for the control plane. *Once the control plane version gets upgraded, then the compute controller, which is running as a core capability, will identify that the control plane version has changed and it will try to pull the current AMI version for that new control plane version.* The compute controller then rolls out the new AMI across the cluster through a rolling upgrade.
|
||||
|
||||
-
|
||||
While the controllers are managed by the EKS service, users can inspect the custom resources and deploy node diagnostic CRDs. Observability can be achieved through the CloudWatch agent, the AWS Distro for OpenTelemetry, or other collectors.
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
For every instance spun up in an Auto Mode cluster, there is a 12% premium charged for the automatic management of those instances.
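The premium can be made concrete with a small calculation. The 12% rate comes from the session; the hourly price in the example is a made-up figure, and the sketch assumes the premium is applied to whatever instance price is passed in, which the session did not specify.

```python
def auto_mode_hourly_cost(instance_hourly_price: float, premium_rate: float = 0.12) -> float:
    """Instance price plus the Auto Mode management premium."""
    return instance_hourly_price * (1 + premium_rate)


# Hypothetical instance price of $0.10 per hour.
example_cost = round(auto_mode_hourly_cost(0.10), 4)
print(example_cost)  # 0.112
```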
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
tags:
|
||||
- AWS
|
||||
- EKS
|
||||
- Auto-Mode
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304 170115-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - EKS Optimization part 3 of 3 - Introduction to EKS Auto Mode - 20250304_170115-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -9,7 +9,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording
|
||||
@@ -22,28 +22,22 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## Observability with OpenTelemetry
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
Jay Comer, Solutions Architect with AWS, presented an overview of observability with OpenTelemetry, including changes and updates within the AWS observability ecosystem since the last session a year ago. The session included a demo showing how to piece together the components and how to instrument an application with OpenTelemetry.
|
||||
|
||||
---
|
||||
Observability is defined as *a measure of how well internal states of a system can be inferred from knowledge of its external outputs.* These outputs include logs, metrics, and traces, which are correlated with the application's health. As systems move to microservice-based architectures, the observability challenge becomes more prominent because of the added complexity. Downtime is costly in both money and effort: Gartner estimates an average of 87 hours of downtime per year, at a cost of $42,000 per hour.
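Multiplying the two Gartner figures quoted above gives the implied annual cost of downtime. This is plain arithmetic on the numbers from the session, not an additional data point.

```python
DOWNTIME_HOURS_PER_YEAR = 87  # average downtime quoted in the session
COST_PER_HOUR_USD = 42_000    # hourly cost quoted in the session

annual_downtime_cost = DOWNTIME_HOURS_PER_YEAR * COST_PER_HOUR_USD
print(f"${annual_downtime_cost:,}")  # $3,654,000
```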
|
||||
|
||||
## 关键概念
|
||||
The three signals used for observability are metrics, logs, and traces. Metrics are aggregated source statistics, logs help determine the root cause of problems, and traces provide a holistic view of a specific request within the system. A trace span includes a start time, a duration, and metadata such as a log.
|
||||
|
||||
-
|
||||
The AWS observability landscape includes AWS native services like CloudWatch and X-Ray, as well as managed services built on open-source projects such as Grafana, OpenSearch, Prometheus, and OpenTelemetry. OpenTelemetry addresses the problem of disparate SDKs and tooling across the observability landscape by providing a common instrumentation standard with an SDK for each programming language. It offers a vendor-agnostic, end-to-end implementation for making telemetry data accessible and usable.
|
||||
|
||||
---
|
||||
OpenTelemetry defines a common telemetry data format, provides SDKs for 11 languages, and supports automatic instrumentation. The OpenTelemetry Collector standardizes and transforms data into the OpenTelemetry Protocol (OTLP) format and exports it to different destinations. The collector is built from receivers (AWS-specific or open source), processors (filtering, transformations), exporters (AWS native, open source, or third-party), and extensions (SigV4 for authorization, health check).
|
||||
|
||||
## 行动项
|
||||
The AWS Distro for OpenTelemetry is a unified agent for collecting traces, metrics, and logs. It includes an operator that automatically instruments applications by detecting the language used and creating pre-configured OpenTelemetry collectors. Custom attributes, such as tenant IDs, can be added to the telemetry that OpenTelemetry emits.
|
||||
|
||||
-
|
||||
Recent announcements focused on security and compliance, scale and region expansion, and a single pane of glass with an improved user experience. The managed collector for Amazon Managed Service for Prometheus provides a serverless, agentless scraper that automatically discovers and pulls Prometheus-compatible metrics. Log support was added to the AWS Distro for OpenTelemetry, and Amazon Managed Grafana now supports community plugins.
|
||||
|
||||
---
|
||||
The demo showcased a sample application running on EKS, with Fluent Bit collecting logs and forwarding them to the OpenTelemetry Collector container. The collector also receives traces and metrics from the application, and sends logs, traces, and metrics to Amazon OpenSearch Service via an ingestion pipeline. The source code included Fluent Bit and OpenTelemetry YAML configuration files. *The output that Fluent Bit is sending the individual logs to is the OpenTelemetry endpoint on the port 55681.* At the code level, the implementation involves importing the OpenTelemetry SDK, configuring a tracer provider, and starting a span with the tracer at each point where instrumentation and request duration measurement are needed.
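The code-level pattern (obtain a tracer, then wrap each operation of interest in a span that records its start time, duration, and attributes) can be sketched with the standard library alone. `MiniTracer`, `Span`, and `handle_request` are hypothetical stand-ins for illustration; they are not the OpenTelemetry API, and a real application would use the OpenTelemetry SDK for its language instead.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """A unit of work: name, start time, duration, and metadata."""
    name: str
    start_time: float
    duration: Optional[float] = None
    attributes: Dict[str, str] = field(default_factory=dict)


class MiniTracer:
    """Hypothetical stand-in for a tracer; collects finished spans in memory."""

    def __init__(self) -> None:
        self.finished: List[Span] = []

    @contextmanager
    def start_span(self, name: str, **attributes: str):
        span = Span(name=name, start_time=time.time(), attributes=dict(attributes))
        started = time.perf_counter()
        try:
            yield span
        finally:
            # The duration is recorded even if the wrapped code raises.
            span.duration = time.perf_counter() - started
            self.finished.append(span)


tracer = MiniTracer()


def handle_request(tenant_id: str) -> str:
    # Custom attributes such as a tenant ID travel with the span.
    with tracer.start_span("handle_request", tenant_id=tenant_id) as span:
        span.attributes["outcome"] = "ok"
        return "done"


result = handle_request("tenant-42")
recorded = tracer.finished[0]
print(recorded.name, recorded.attributes)
```

In a real setup the finished spans would be exported to a collector rather than kept in a list.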
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
OpenSearch dashboards can display latency by trace group and an application composition map, showing where bottlenecks are appearing.
|
||||
|
||||
@@ -0,0 +1,49 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/04_EKS"
|
||||
tags:
|
||||
- OpenTelemetry
|
||||
- Observability
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402 160113-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Observability with OpenTelemetry - 20240402_160113-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 04_EKS
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 13_ Cloud FinOps_ Micro Focus Policies _ best practices to optimize the costs.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 13 Cloud FinOps Micro Focus Policies best practices to optimize the costs
|
||||
@@ -20,7 +20,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** ✅ 已完成摘要
|
||||
**Status:** ✅ 已完成(Gemini 摘要)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 27_ AWS Instance Scheduler.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 27 AWS Instance Scheduler
|
||||
@@ -20,7 +20,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
**Status:** ✅ 已完成(Gemini 摘要)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 63_ Optimise resource cost using automation.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 63 Optimise resource cost using automation
|
||||
@@ -20,7 +20,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** ✅ 已完成摘要
|
||||
**Status:** ✅ 已完成(Gemini 摘要)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording
|
||||
@@ -23,28 +23,18 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## EC2 Cost Optimization in AWS: Best Practices
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
Mike Dukes and Steele Taylor, AWS experts, presented a learning session on EC2 cost optimization, covering compute efficiency, Graviton usage, EC2 spot leveraging, and cost-effective container deployments. The session emphasized interactive participation and welcomed questions.
|
||||
|
||||
---
|
||||
Efficiency in the cloud involves architectural best practices and leveraging AWS services and instance types for optimal workload performance. Technical advantages include high availability, elastic usage, and innovation adoption. Benefits include cost efficiency, leveraging purchase options, and reducing carbon footprint. *When we start talking about architecting and using best practice efficiency in the cloud, you effectively only pay for what you use when you use AWS.*
|
||||
|
||||
## 关键概念
|
||||
EC2 offers over 750 instance types tailored for various workloads. AWS's Nitro system enhances efficiency by externalizing network, storage, and security components. AWS Graviton processors provide price performance benefits. Purchase options include on-demand, savings plans, and spot instances, each suited for different workload types.
|
||||
|
||||
-
|
||||
Graviton instances offer up to 40% better price performance than comparable x86 instances. Graviton is based on ARM64 and has extensive software support across Linux distributions, ISVs, and open-source software, with sustainability benefits through reduced power consumption. AWS now offers the fourth generation of Graviton. Graviton is available across instance families, including compute-optimized, memory-optimized, and general-purpose. AWS services like RDS, Aurora, and Lambda also support Graviton, and migrating managed services such as RDS and Aurora to Graviton is relatively straightforward. *Graviton3 actually uses up to 60% less power consumption than comparable x86-based instances.*
|
||||
|
||||
---
|
||||
EC2 Spot instances offer discounts of up to 90% compared to on-demand pricing by using spare capacity. Key considerations for Spot instances include fault tolerance, flexibility, and statelessness. Diversification across instance types and availability zones is crucial for Spot usage. Spot instances can be interrupted when the capacity is needed for on-demand instances, with a notification provided before termination. Integrations with AWS services like Auto Scaling, EKS, and ECS support automated responses to interruptions.
|
||||
|
||||
## 行动项
|
||||
Spot instances are suitable for web services, containers, HPC batch processing, big data, and CI/CD, while Graviton is beneficial for most of these except stateful services like databases. Spot and Graviton can be used together with containers, provided instance pools are not overly restricted.
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
Spot Invaders, a fault-tolerant chaos engineering game powered by EKS and EC2 Spot, demonstrates best practices for running resilient applications on EKS while optimizing costs. The game involves shooting aliens to simulate pod failures and whales to trigger spot interruptions, showcasing the ability to maintain service availability despite disruptions.
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/05_FinOps"
|
||||
tags:
|
||||
- AWS
|
||||
- EC2
|
||||
- Cost-Optimization
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529 160242-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Best practices for EC2 cost optimization in AWS - 20240529_160242-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
|
||||
---
|
||||
|
||||
## 关键概念
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 行动项
|
||||
|
||||
-
|
||||
|
||||
---
|
||||
|
||||
## 相关视频
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
|
||||
---
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording
|
||||
@@ -23,28 +23,30 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## Budget Control Automation
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
The SRE Core team (Daniela, Evan, and Alan) presented a learning session on budget control, a new automation providing detailed data to manage budgets and costs within AWS accounts. The session covered the new budget control's value, diagrams, detailed cost reports, AWS budget alerts/actions, and source identity implementation.
|
||||
|
||||
---
|
||||
The budget control automation aims to address uncontrolled AWS account sprawl and unsustainable cost reduction efforts. It provides account owners with detailed alerts, including information on account spending and cost drivers, enabling them to identify areas for cost reduction. Enforcement will involve attaching an SCP to block new resource creation. The initial scope is limited to lab accounts, with other accounts continuing to receive standard out-of-budget alerts.
|
||||
|
||||
## 关键概念
|
||||
An example alert email includes account details, alert details, warning messages, and detailed reports. There are four types of email alerts: forecast, actual, severe, and enforcement. The alert flow includes forecast alerts at 100% threshold with no action, and actual alerts at 80%, 90%, 95%, and 98% thresholds with escalating recipient lists. At 100%, a severe or enforcement alert is triggered based on a scoring system, with enforcement initially via manual approval and later automated. Budget increases can be requested through an Oli workflow.
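The alert flow above can be read as a simple classification of budget usage. In this sketch the thresholds are the ones stated in the session, while the function names, the returned labels, and the `enforce` flag (standing in for the outcome of the scoring system, whose details were not given) are assumptions for illustration.

```python
from typing import Optional

# Thresholds for "actual" alerts, as listed in the session.
ACTUAL_THRESHOLDS = (80, 90, 95, 98)


def classify_actual_spend(percent_of_budget: float, enforce: bool) -> Optional[str]:
    """Map actual spend (as % of budget) to an alert label, or None if no alert."""
    if percent_of_budget >= 100:
        # `enforce` is a placeholder for the scoring-system decision.
        return "enforcement" if enforce else "severe"
    crossed = [t for t in ACTUAL_THRESHOLDS if percent_of_budget >= t]
    if crossed:
        return f"actual-{crossed[-1]}"  # highest threshold reached so far
    return None


def classify_forecast(forecast_percent: float) -> Optional[str]:
    """Forecast alerts fire at the 100% threshold and trigger no action."""
    return "forecast" if forecast_percent >= 100 else None


print(classify_actual_spend(92, enforce=False))
print(classify_actual_spend(100, enforce=True))
```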
|
||||
|
||||
-
|
||||
*The source identity must be tracked.* Challenges during development included tracking source identity, customizing AWS budget alerts, choosing an enforcement method (SCP), and providing a grace period before enforcement. Budgets are evaluated every eight hours, and disabled budget actions result in no spend control until the next month. Currently, 80 lab accounts exceed their budgets, and around 100 are expected to exceed 80% of their budget threshold.
|
||||
|
||||
---
|
||||
The implementation will be gradual, starting with alerts only on April 1st. Manual enforcement will follow upon FinOps' approval, with automatic enforcement as the next step.
|
||||
|
||||
## 行动项
|
||||
## Diagrams and Detailed Cost Reports
|
||||
|
||||
-
|
||||
Daniel discussed diagrams and cost reports attached to email alerts, explaining their creation and content. Libraries for lambdas were created to improve code visibility and simplify deployment. The *top services of recent months* report helps managers understand cost drivers, showing the percentage of budget spent on specific services over time. The *top users of current months* diagram allows account owners to monitor daily spending by users. A detailed Excel report provides granular information on resource IDs, creators, and associated costs, separated by month.
|
||||
|
||||
---
|
||||
*This is the first time that we were able to get to this level of granularity.* Data for the top services report is generated from Athena, while the top users diagram uses data from Cost Explorer.
|
||||
|
||||
## 相关视频
|
||||
## AWS Budget Alerts and Actions
|
||||
|
||||
> 配对视频笔记链接(生成后填入)
|
||||
Alan discussed the implementation of AWS budget alerts and actions. The AWS budget service is primitive in terms of customization, so the team had to parse the bodies of the emails received from it. The budget alert system sends messages to an SNS topic, which triggers a Lambda function. The Lambda extracts data from the email and uses it to create a more detailed message. The step function enriches the data with account information, budget details, and owner/manager contacts.
|
||||
|
||||
---
|
||||
AWS allows actions to be applied based on alert thresholds. A budget action at 100% triggers either a severe or an enforcement email, depending on the scoring system. If budget enforcement is enabled, an SCP is applied to block resource creation. The FinOps group receives a notification and decides whether to apply the action immediately or negotiate with the account owner.
|
||||
|
||||
*最后更新: 2026-04-14*
|
||||
The scoring system and grace period calculations aim to avoid penalizing accounts that slightly exceed their budget near the end of the month. The scoring considers account size and proximity to the end of the month. Smaller accounts have a better grace period.
|
||||
|
||||
FinOps has classified accounts based on cost range. The budgets were last updated on February 23rd. The source identity attribute was implemented to track user activity within AWS accounts, even when assuming different roles. Federated logins use NetIQ access manager to authenticate users and provide access to AWS accounts. The source identity ensures that the original login identity is maintained across role changes, allowing CloudTrail and other services to track user activity accurately.
|
||||
|
||||
@@ -1,24 +1,23 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions - Budget Control - 20240319 160204-Presentation"
|
||||
title: "Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: pptx
|
||||
source-type: video
|
||||
category: "DevOps & SRE/05_FinOps"
|
||||
tags:
|
||||
- AWS
|
||||
- Budget-Control
|
||||
- FinOps
|
||||
- Presentation
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Presentation.pptx"
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions - Budget Control - 20240319 160204-Presentation
|
||||
# Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Presentation.pptx`
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4`
|
||||
|
||||
**Type:** PPTX | **Category:** 05_FinOps
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording
|
||||
@@ -23,28 +23,20 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## 摘要
|
||||
## Reducing Cloud Costs
|
||||
|
||||
> 待转录后由 LLM 生成
|
||||
Vinay from the FinOps team presented a session on reducing cloud costs, focusing on workload and rate optimization. The session covered modernization, right sizing, and best practices for cost reduction.
|
||||
|
||||
---
|
||||
### Workload Optimization via Modernization and Right Sizing
|
||||
|
||||
## 关键概念
|
||||
Modernization involves using newer generations of services, such as EC2 instances. While there is a perception that newer instances are more expensive, the latest families are generally cheaper and offer better performance. *Whenever there's a new family launched by the hyperscaler, the latest families are almost cheaper.* However, AWS slightly changed its pricing model after M6, making M7 and M8 somewhat more expensive. Moving from Intel to AMD can save around 6-10% on on-demand prices for Windows and Linux workloads. Graviton instances can offer even greater savings (a 20-25% reduction in on-demand cost) for Linux workloads, and these savings combine with EDP discounts and commitment plans.
|
||||
|
||||
-
|
||||
Upgrading storage from gp2 to gp3 offers a 20% direct cost benefit without downtime. For Amazon EKS clusters, upgrading to the latest versions is crucial to avoid extended support costs, which are significantly higher. *Rather than spending up unnecessary money on the extended support, you can deploy additional four or five cluster, right.* Spot instances can provide up to a 90% discount compared to on-demand and are suitable for big data, CI/CD pipelines, web servers, and HPC.
|
||||
|
||||
---
|
||||
Right sizing involves identifying the correct resource configuration for workload performance and capacity needs. The EC2 right sizing recommendation report captures CPU usage, memory, and network data to provide recommendations. Configuring instance schedules is useful for non-production environments, allowing instances to be powered on/off based on business hours, potentially reducing costs to 40% of on-demand prices. Identifying and deleting idle load balancers, unassociated elastic IPs, and underutilized EBS volumes are also key to cost savings. Old snapshots and CloudWatch logs also contribute to unnecessary costs. Using cheaper regions like Oregon or North Virginia can reduce costs if there are no specific regional requirements.
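The saving from instance schedules follows from the share of the week an instance actually runs. The sketch below assumes compute is billed only while the instance is running (attached storage is still billed when it is stopped), and the schedule of 12 hours on 5 days is a hypothetical example rather than a figure from the session.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def scheduled_cost_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on compute cost paid under a weekly schedule."""
    return (hours_per_day * days_per_week) / HOURS_PER_WEEK


# Hypothetical schedule: 12 hours a day, Monday to Friday.
fraction = round(scheduled_cost_fraction(12, 5), 3)
print(fraction)  # 0.357
```

A business-hours schedule of roughly this shape lands in the same range as the 40% figure mentioned above.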
|
||||
|
||||
## Action Items
|
||||
### Rate Optimization
|
||||
|
||||
-
|
||||
Rate optimization relies on commitment-based discounts. Hyperscalers offer discounts in return for committing to resource usage or spend for a fixed term (typically one or three years). There are two categories: resource-level commitments (a better discount, with limitations) and flexible commitments (a standard discount, with flexibility). AWS offers Savings Plans (EC2 Instance and Compute) and reservations for services such as RDS, ElastiCache, and CloudFront.
|
||||
|
||||
---
|
||||
|
||||
## Related Videos
|
||||
|
||||
> Paired video note link (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
The rate optimization workflow consists of pre-work (right sizing), analysis (identifying workloads that require 24/7 uptime), communication (sharing details with finance), approval (from the account owner), and reporting (monitoring utilization). Only the FinOps team can implement commitment plans. All commitment plans are purchased with the no-upfront payment option only. The minimum transaction value is 5k per annum.
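The pre-work and analysis steps above amount to a filter over candidate workloads. The following sketch is one way to express that filter; the `Workload` fields, the sample data, and the reading of "5k per annum" as a threshold on the annual commitment value are assumptions made for illustration, not part of the session material.

```python
from dataclasses import dataclass

MIN_ANNUAL_VALUE = 5_000  # minimum transaction value per annum, as quoted above


@dataclass
class Workload:
    name: str
    right_sized: bool               # pre-work step completed
    runs_24x7: bool                 # analysis step: needs continuous uptime
    annual_commitment_value: float  # hypothetical estimate of the plan value


def commitment_candidates(workloads: list[Workload]) -> list[str]:
    """Names of workloads that pass pre-work, analysis, and the minimum value."""
    return [
        w.name
        for w in workloads
        if w.right_sized and w.runs_24x7 and w.annual_commitment_value >= MIN_ANNUAL_VALUE
    ]


# Hypothetical inventory used only to exercise the filter.
inventory = [
    Workload("prod-api", True, True, 42_000),
    Workload("dev-sandbox", True, False, 8_000),
    Workload("prod-batch", False, True, 15_000),
    Workload("small-service", True, True, 1_200),
]
print(commitment_candidates(inventory))
```

Candidates that pass the filter would still need the communication and approval steps before the FinOps team purchases a plan.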
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/05_FinOps"
|
||||
tags:
|
||||
- AWS
|
||||
- Cost-Optimization
|
||||
- FinOps
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318 170100-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions- Reducing Cloud Costs - 20250318_170100-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
@@ -10,7 +10,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording
|
||||
@@ -23,28 +23,24 @@ status: raw
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
## Storage Cost Optimization
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
This session covers storage cost optimization best practices across various AWS storage services: Amazon EBS, Amazon EFS, Amazon FSx, and Amazon S3. It includes an optimization example from ADM.
|
||||
|
||||
---
|
||||
Key points include choosing the right storage for your workload, considering API costs and data transfer costs in addition to price per gigabyte, and understanding the different tiers available within each service.
|
||||
|
||||
## Key Concepts
|
||||
### Amazon EBS
|
||||
|
||||
-
|
||||
EBS offers SSD and HDD volumes. gp3 is the recommended default for general-purpose SSD because it is 20% more cost-effective than gp2. *With GP3, you can scale IOPS and throughput independently of the volume size.* When migrating from gp2 to gp3, automation tools should also be updated to create gp3 volumes by default. EBS snapshots have standard and archive tiers; the archive tier costs 75% less but has longer restore times and a 90-day minimum retention period. Automation via Data Lifecycle Manager (DLM) or AWS Backup is recommended for managing snapshots, including setting retention policies and moving snapshots to the archive tier.
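Whether archiving a snapshot pays off depends on the 90-day minimum and on how much data ends up in the archive tier. The sketch below compares the two tiers under simplifying assumptions: a single flat price per GB-day, no retrieval charges, and an archived snapshot stored as a full copy while the standard tier stores only the incremental blocks. All names and the example sizes are hypothetical.

```python
ARCHIVE_DISCOUNT = 0.75  # archive tier is 75% cheaper per GB, as stated above
ARCHIVE_MIN_DAYS = 90    # minimum retention billed in the archive tier


def snapshot_cost(gb: float, days: int, price_per_gb_day: float, archived: bool = False) -> float:
    """Storage cost of one snapshot for its retention period."""
    if archived:
        billed_days = max(days, ARCHIVE_MIN_DAYS)  # short retention still bills 90 days
        return gb * billed_days * price_per_gb_day * (1 - ARCHIVE_DISCOUNT)
    return gb * days * price_per_gb_day


def archive_is_cheaper(incremental_gb: float, full_gb: float, days: int,
                       price_per_gb_day: float) -> bool:
    """Compare a full archived copy against the incremental standard copy."""
    archived = snapshot_cost(full_gb, days, price_per_gb_day, archived=True)
    standard = snapshot_cost(incremental_gb, days, price_per_gb_day)
    return archived < standard


# Hypothetical 100 GB volume whose snapshot is mostly unique data, kept one year.
print(archive_is_cheaper(incremental_gb=100, full_gb=100, days=365, price_per_gb_day=1.0))
# Same volume, but the snapshot holds only 10 GB of changed blocks.
print(archive_is_cheaper(incremental_gb=10, full_gb=100, days=365, price_per_gb_day=1.0))
```

The second call shows why small incremental snapshots are often better left in the standard tier.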
|
||||
|
||||
---
|
||||
### Amazon EFS and FSx
|
||||
|
||||
## Action Items
|
||||
FSx considerations include data deduplication, compression, and tiering. EFS offers Standard, One Zone, and Infrequent Access tiers, with lifecycle policies that move files between tiers. The Infrequent Access tier has a minimum billable object size of 128 KB. EFS Archive is a newer tier, similar to Glacier, with a 90-day minimum storage duration and a 128 KB minimum billable object size. FSx for NetApp ONTAP has an SSD tier and an HDD tier (the capacity pool), with automatic tiering between them.
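The 128 KB minimum matters most for file systems with many small files, because each file is billed as if it were at least that large. A minimal sketch of the billable-size calculation, assuming 128 KB means 128 x 1024 bytes and using hypothetical file sizes:

```python
MIN_BILLABLE_BYTES = 128 * 1024  # minimum billable object size in the colder tiers


def billable_bytes(file_sizes: list[int]) -> int:
    """Total bytes billed when every file is rounded up to the minimum size."""
    return sum(max(size, MIN_BILLABLE_BYTES) for size in file_sizes)


# Hypothetical directory: 1,000 files of 4 KiB each.
small_files = [4 * 1024] * 1000
actual = sum(small_files)
billed = billable_bytes(small_files)
print(f"actual {actual / 1024:.0f} KiB, billed {billed / 1024:.0f} KiB")
```

In this example the billed size is 32 times the actual size, so tiering small files can cost more than leaving them in the Standard tier.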
|
||||
|
||||
-
|
||||
### Amazon S3
|
||||
|
||||
---
|
||||
Choosing the right storage class is crucial for S3 cost optimization. S3 Standard is for frequently accessed objects, with no retrieval fees, minimum retention, or minimum billable object size. The Glacier tiers (Instant Retrieval, Flexible Retrieval, Deep Archive) are for rarely accessed data, with varying retrieval times and costs. Intelligent-Tiering automatically moves data between tiers based on access patterns, with no transition fees between tiers within Intelligent-Tiering. *With Intelligent-Tiering we can automatically move data from warmer to colder storage tiers, and it will be based on the object's last-access date.* Lifecycle policies can transition objects between tiers, expire non-current versions, and delete incomplete multipart uploads. Data transfer charges should be considered, and PrivateLink can be used to keep traffic within the AWS network. Storage Lens, CloudWatch, S3 Inventory, and access logs can be used to monitor and optimize S3 usage.
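The three lifecycle actions mentioned above (tier transitions, expiry of non-current versions, and cleanup of incomplete multipart uploads) can be combined in one rule. The sketch below only builds the configuration dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the rule ID and all day thresholds are hypothetical defaults, not recommendations from the session.

```python
def build_lifecycle_configuration(prefix: str = "",
                                  ia_after_days: int = 30,
                                  glacier_after_days: int = 90,
                                  noncurrent_expire_days: int = 30,
                                  abort_multipart_days: int = 7) -> dict:
    """Build an S3 lifecycle configuration with one combined rule."""
    return {
        "Rules": [
            {
                "ID": "tier-and-clean-up",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move objects to colder storage classes as they age.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Expire old versions in versioned buckets.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_expire_days},
                # Remove parts left behind by failed multipart uploads.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_multipart_days},
            }
        ]
    }


config = build_lifecycle_configuration(prefix="logs/")
# To apply it (requires boto3 and AWS credentials; bucket name is a placeholder):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=config)
print(config["Rules"][0]["Transitions"])
```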
|
||||
|
||||
## Related Videos
|
||||
### ADM Optimization Example
|
||||
|
||||
> Paired video note link (to be filled in after generation)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-04-14*
|
||||
ADM migrated NetApp file shares from on-premises to AWS. The initial migration to OpenZFS was inefficient. A second migration to self-managed NetApp on EC2 instances incurred high data transfer costs. The final migration to Amazon FSx for NetApp ONTAP resulted in a 60% cost reduction.
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording"
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/05_FinOps"
|
||||
tags:
|
||||
- AWS
|
||||
- Storage
|
||||
- Cost-Optimization
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4"
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# Public Cloud Learning Sessions-Storage Cost Optimization - 20240305 160037-Meeting Recording
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions-Storage Cost Optimization - 20240305_160037-Meeting Recording.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 05_FinOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
@@ -11,7 +11,7 @@ tags:
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 15_ Working with Renovatebot.mp4"
|
||||
audio-source: ""
|
||||
status: summarized
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 15 Working with Renovatebot
|
||||
@@ -20,7 +20,7 @@ status: summarized
|
||||
|
||||
**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
**Status:** ✅ Completed (Gemini summary)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 3 Deploy and maintain infrastructure"
|
||||
title: CTP Topic 3 Deploy and maintain infrastructure
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/06_CI_CD_GitOps"
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- IaC
|
||||
- Deployment
|
||||
- CI/CD
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 3 Deploy and maintain infrastructure
|
||||
@@ -26,7 +26,20 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## Deploying and Maintaining Infrastructure
|
||||
|
||||
The session focuses on deploying and maintaining infrastructure, clarifying Terraform, Terragrunt, modules, and service catalogs within the landing zone context. It emphasizes the structure of Git repositories and how Terraform and Terragrunt files interact.
|
||||
|
||||
When a landing zone is provisioned, product teams are grouped, each having a landing zone and workload accounts. A product team, such as DevTools, deploys infrastructure to meet specific requirements across accounts like Artifactory and Active Directory. This involves multiple Git repositories, including the core landing zone repository, Terraform service catalog, and a product team service catalog.
|
||||
|
||||
A service module consists of a main.tf file that references other repositories, grouping modules to fulfill a business requirement such as an Active Directory or DNS service. *When deploying infrastructure, Terragrunt HCL files are used to reference these services, targeting specific versions rather than the master branch.* These files may declare dependencies to reference values across services; dependencies are preferred over reading state files directly.
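The rule of targeting a version rather than the master branch can be checked mechanically. The sketch below inspects a Terraform or Terragrunt style source string with a `?ref=` query parameter and reports whether it is pinned to a version tag. The repository URLs and the tag pattern (`v1.2.3` or `1.2.3`) are assumptions for illustration.

```python
import re
from typing import Optional
from urllib.parse import parse_qs

VERSION_TAG = re.compile(r"^v?\d+\.\d+\.\d+$")  # assumed tag convention


def pinned_ref(source: str) -> Optional[str]:
    """Return the value of the ref query parameter, or None if it is absent."""
    _, _, query = source.partition("?")
    refs = parse_qs(query).get("ref", [])
    return refs[0] if refs else None


def is_version_pinned(source: str) -> bool:
    """True when the source targets a version tag instead of a branch."""
    ref = pinned_ref(source)
    return ref is not None and VERSION_TAG.match(ref) is not None


# Hypothetical service catalog sources.
sources = [
    "git::git@github.com:example-org/service-catalog.git//services/dns?ref=v1.4.0",
    "git::git@github.com:example-org/service-catalog.git//services/dns?ref=master",
    "git::git@github.com:example-org/service-catalog.git//services/dns",
]
for s in sources:
    print(is_version_pinned(s), pinned_ref(s))
```

A check like this could run in CI to flag sources that follow a branch or omit the ref entirely.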
|
||||
|
||||
When referencing modules within the current codebase, a relative path can be used, but the preferred approach is to have a dedicated service catalog with a modules directory. This allows for independent release cycles and better maintainability. Modules can be used within one account, reused within a product team (in the product team service catalog), or used across product teams (in the Terraform service catalog).
|
||||
|
||||
*A service is a business requirement, while a regular module is a technical requirement.* A service deploys a set of multiple modules and abstracts them. The higher up the chain, the fewer configuration options are available, similar to an object-oriented approach.
|
||||
|
||||
Terragrunt fetches all references before running and stores the cloned repositories in a Terragrunt cache directory. Terragrunt can be run at the directory level, taking dependencies into account, but applying without verification is discouraged. Jenkins jobs can be enhanced for debugging, and documentation should be comprehensive, with Gruntwork's documentation as a model. Module versions should follow the major, minor, and patch convention.
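The major, minor, and patch convention can be illustrated with a small helper that computes the next version of a module. This is a generic sketch of semantic version bumping; the function name and the optional leading `v` are assumptions and not taken from the session.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping the major, minor, or patch part."""
    major, minor, patch = (int(x) for x in version.lstrip("v").split("."))
    if part == "major":    # breaking change: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":    # backwards-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if part == "patch":    # backwards-compatible fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


print(bump("1.4.2", "patch"))
print(bump("1.4.2", "minor"))
print(bump("v1.4.2", "major"))
```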
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 3 Deploy and maintain infrastructure
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- IaC
|
||||
- Deployment
|
||||
- CI/CD
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 3 Deploy and maintain infrastructure
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 3_ Deploy and maintain infrastructure.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
@@ -1,8 +1,8 @@
|
||||
---
|
||||
title: "CTP Topic 32 Using Atlantis CICD for infrastructure deployments"
|
||||
title: CTP Topic 32 Using Atlantis CICD for infrastructure deployments
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/06_CI_CD_GitOps"
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- Atlantis
|
||||
- CI/CD
|
||||
@@ -10,9 +10,9 @@ tags:
|
||||
- Terraform
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 32 Using Atlantis CICD for infrastructure deployments
|
||||
@@ -27,7 +27,16 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## Atlantis CICD: Replacing Jenkins for Infrastructure Deployments
|
||||
|
||||
The presentation introduces Atlantis, a new automation tool for teams collaborating on Terraform code, which is intended to replace Jenkins for infrastructure deployments. Atlantis addresses the speed and complexity issues of the current pipeline. *The current pipeline is practically very slow* because of significant initialization time, repeated code cloning, sequential testing, and ECS deployer provisioning. The existing pipeline's complexity stems from continuous tweaking to integrate more features and cover edge cases, which has led to fragility and drift.
|
||||
|
||||
Atlantis is standalone, self-hosted, free, and open source, with an active community. It offers a better collaboration model, simplified networking, and cost savings by removing the need for numerous VPC endpoints. Atlantis applies changes before merging, which keeps the code in sync with the infrastructure. The workflow is simplified: users talk to Atlantis directly from GitHub through comments on pull requests, with no need for separate accounts and integrations.
|
||||
|
||||
Atlantis is hosted on a single EC2 instance in each landing zone's shared account and is notified by GitHub Enterprise through webhooks. It uses service accounts to interact with GitHub, post comments, merge, and close PRs. Cross-account access is managed through deployed key roles in each account, which are used for both simple and cross-account module deployments. User management is controlled on GitHub, and build logs are stored in comments for auditing. Atlantis enforces apply requirements, such as mergeability and peer approval, before applying changes. Auto-merge is enabled, so a pull request is merged automatically after a successful apply. Parallel builds are supported, running plan and apply commands concurrently for multiple modules.
|
||||
|
||||
Atlantis locking prevents conflicts by locking the directory of each module when a plan is run, until the pull request is merged, closed, or the plan is discarded. *When a plan is run, the directory of each module is locked until the pull request that has this folder locked is merged or closed, or the plan is manually discarded.* Module and data file dependencies can be declared so that plans are triggered when dependencies change. Documentation, troubleshooting guides, and a list of migrated repositories are available to assist users.
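The locking behaviour described above can be pictured as a map from module directory to the pull request that holds the lock. The following toy model is only an illustration of that idea; it is not Atlantis's implementation, and the class, method names, and directory paths are hypothetical.

```python
class DirectoryLocks:
    """Toy model: one lock per module directory, held by a pull request."""

    def __init__(self) -> None:
        self._locks: dict[str, int] = {}  # directory -> PR number holding the lock

    def plan(self, directory: str, pr: int) -> bool:
        """Try to plan a directory; succeeds if unlocked or already held by this PR."""
        holder = self._locks.setdefault(directory, pr)
        return holder == pr

    def release(self, pr: int) -> None:
        """Drop all locks of a PR (merged, closed, or plan discarded)."""
        self._locks = {d: p for d, p in self._locks.items() if p != pr}


locks = DirectoryLocks()
print(locks.plan("accounts/shared/dns", pr=101))  # first plan takes the lock
print(locks.plan("accounts/shared/dns", pr=102))  # second PR is blocked
locks.release(pr=101)                             # PR 101 is merged or closed
print(locks.plan("accounts/shared/dns", pr=102))  # now PR 102 can plan
```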
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,52 @@
|
||||
---
|
||||
title: CTP Topic 32 Using Atlantis CICD for infrastructure deployments
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- Atlantis
|
||||
- CI/CD
|
||||
- IaC
|
||||
- Terraform
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 32 Using Atlantis CICD for infrastructure deployments
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 32_ Using Atlantis CICD for infrastructure deployments.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 33 An introduction to GitOps"
|
||||
title: CTP Topic 33 An introduction to GitOps
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/06_CI_CD_GitOps"
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- GitOps
|
||||
- CI/CD
|
||||
- Git
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 33 An introduction to GitOps
|
||||
@@ -26,7 +26,28 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> Victor Etkin presents an introduction to GitOps, explaining how it complements DevOps. GitOps applies software development principles to deployment processes, potentially resolving challenges like failed deployments and configuration inconsistencies.
|
||||
|
||||
Key benefits of GitOps:
|
||||
* Increased developer productivity using familiar tools.
|
||||
* Minimized failed deployments with easy rollback capabilities.
|
||||
* Faster feature releases.
|
||||
* Real-time auditing and improved security through Git's features.
|
||||
|
||||
GitOps uses Git workflows, CD pipelines, and infrastructure as code. Observability is crucial for ensuring the desired and actual states align. GitOps is often used with Kubernetes but can be applied elsewhere.
|
||||
|
||||
The four principles of GitOps are declarative configuration, version control, separation of the CD process, and incremental infrastructure implementation. Git serves as the primary tool, storing deployment infrastructure and application configurations. A GitOps controller reconciles the state in Git with the actual system state. *The only tool a developer needs to know is Git.*
|
||||
|
||||
The goal is full automation, with code changes deployed safely in minutes. CI and CD should be decoupled. A basic GitOps workflow for Kubernetes involves developers committing code, creating container images, storing deployment configurations in Git, monitoring changes via a GitOps agent, and rolling out images to environments.
|
||||
|
||||
CI focuses on building and analyzing code, while CD focuses on deploying binaries. Separating CI and CD enhances security. CD tools can run inside container platforms like Kubernetes for added security.
|
||||
|
||||
GitOps enables on-demand incremental deployment, which benefits microservices architectures. CD processes require an idempotent platform such as Kubernetes. *An idempotent operation is one that can be applied multiple times without changing the result beyond the initial application.*
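The definition of idempotency above can be shown with two small functions: one that declares a desired state and one that applies a relative change. The dictionaries stand in for a cluster's state and are purely illustrative; the function names are hypothetical.

```python
def apply_desired_state(state: dict, desired: dict) -> dict:
    """Idempotent: sets each key to its desired value, however often it runs."""
    return {**state, **desired}


def scale_up_by(state: dict, name: str, delta: int) -> dict:
    """Not idempotent: every call changes the result again."""
    return {**state, name: state.get(name, 0) + delta}


current = {"web-replicas": 2}
desired = {"web-replicas": 3}

once = apply_desired_state(current, desired)
twice = apply_desired_state(once, desired)
print(once == twice)  # declaring the target state twice changes nothing

inc_once = scale_up_by(current, "web-replicas", 1)
inc_twice = scale_up_by(inc_once, "web-replicas", 1)
print(inc_once == inc_twice)  # repeating a relative change keeps drifting
```

Declarative configuration fits GitOps because a retried or repeated deployment converges on the same state.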
|
||||
|
||||
CD processes can be implemented using a push or a pull model. The pull model, which monitors both Git and the target system, is recommended for GitOps. Human intervention is still needed for issues such as resource failures. GitOps simplifies operations and lets developers focus on more valuable activities.
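In the pull model, an agent repeatedly compares the desired state in Git with the actual state of the target system and derives the actions needed to close the gap. The sketch below shows one reconciliation step over plain dictionaries; it is a simplified illustration, not the behaviour of any particular GitOps controller, and the resource names are hypothetical.

```python
def reconcile(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """Return the (action, resource) pairs that move actual towards desired."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical states: Git declares two services, the cluster runs two others.
desired_state = {"api": {"image": "api:1.2.0"}, "web": {"image": "web:2.0.0"}}
actual_state = {"web": {"image": "web:1.9.0"}, "legacy": {"image": "legacy:0.1.0"}}

print(reconcile(desired_state, actual_state))
print(reconcile(desired_state, desired_state))  # nothing to do once converged
```

An agent would run this comparison on a timer or on change notifications and then execute the resulting actions.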
|
||||
|
||||
GitOps is a logical evolution of DevOps, simplifying adoption and enhancing portability. Git commit logs become audit trails, streamlining compliance.
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 33 An introduction to GitOps
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- GitOps
|
||||
- CI/CD
|
||||
- Git
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 33 An introduction to GitOps
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 33_ An introduction to GitOps.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
@@ -1,17 +1,17 @@
|
||||
---
|
||||
title: "CTP Topic 56 Automated infrastructure testing"
|
||||
title: CTP Topic 56 Automated infrastructure testing
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: "DevOps & SRE/06_CI_CD_GitOps"
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- Testing
|
||||
- IaC
|
||||
- Automation
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: "nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 56_ Automated infrastructure testing.mp4"
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 56_ Automated infrastructure testing.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
status: summarized (Gemini 摘要)
|
||||
---
|
||||
|
||||
# CTP Topic 56 Automated infrastructure testing
|
||||
@@ -26,7 +26,22 @@ status: raw
|
||||
|
||||
## Summary
|
||||
|
||||
> To be generated by an LLM after transcription
|
||||
> ## Automated Infrastructure Testing
|
||||
|
||||
Mark Francis discusses automated infrastructure testing, emphasizing its value and practical application for engineers. The session aims to provide actionable insights for immediate use.
|
||||
|
||||
Key points covered:
|
||||
|
||||
* Integration tests are crucial for validating deployed infrastructure functionality, going beyond syntax checks to ensure the actual deployment matches expectations.
|
||||
* *Let's leave the repetitive things for the computers to do and use our brains for the complex human things.*
|
||||
* Terratest, a Go library, automates the apply-test-destroy cycle, streamlining the testing process.
|
||||
* Test-driven development (TDD) involves writing tests before implementing features, ensuring focused development and building a comprehensive test suite.
|
||||
* A new workflow is proposed, integrating test writing as a primary step and removing manual validation, aiming for automated validation suites and increased confidence in deployments.
|
||||
|
||||
The presentation introduces Terratest and its role in automating infrastructure testing. It highlights a repository with basic examples that demonstrate how Terratest applies Terraform configurations, validates outputs, and destroys resources. The benefits of this approach include automating manual checks, testing complex modules, and increasing confidence in code changes.
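The apply-test-destroy cycle can be summarized as a pattern in which cleanup always runs, even when validation fails. The sketch below shows that pattern in Python with stand-in functions; it is not Terratest code (Terratest is a Go library), and the fake apply, validate, and destroy callables and the output name are hypothetical.

```python
from typing import Callable


def run_infrastructure_test(apply: Callable[[], dict],
                            validate: Callable[[dict], None],
                            destroy: Callable[[], None]) -> None:
    """Apply the infrastructure, validate its outputs, and always destroy it."""
    try:
        outputs = apply()
        validate(outputs)
    finally:
        destroy()  # runs even if apply or validate raised


calls: list[str] = []


def fake_apply() -> dict:
    calls.append("apply")
    return {"bucket_name": "example-bucket"}  # stand-in for Terraform outputs


def check_outputs(outputs: dict) -> None:
    calls.append("validate")
    assert outputs["bucket_name"].startswith("example-")


def fake_destroy() -> None:
    calls.append("destroy")


run_infrastructure_test(fake_apply, check_outputs, fake_destroy)
print(calls)
```

In Go, the same guarantee is usually obtained by deferring the destroy step before running the apply step.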
|
||||
|
||||
The discussion also covers the challenges of infrastructure testing, such as time investment and the maturity of testing tools. However, it argues that the long-term benefits, including reduced bugs and increased confidence, outweigh the initial difficulties. The session concludes with a proposed workflow that integrates testing as a core component of infrastructure development, emphasizing the importance of treating tests as first-class citizens. *I'm just extending the value of putting stuff as code.*
|
||||
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: CTP Topic 56 Automated infrastructure testing
|
||||
type: cloud-learning
|
||||
source-type: video
|
||||
category: DevOps & SRE/06_CI_CD_GitOps
|
||||
tags:
|
||||
- Testing
|
||||
- IaC
|
||||
- Automation
|
||||
- CTP
|
||||
date-added: 2026-04-14
|
||||
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 56_ Automated infrastructure testing.mp4
|
||||
audio-source: ""
|
||||
status: raw
|
||||
---
|
||||
|
||||
# CTP Topic 56 Automated infrastructure testing
|
||||
|
||||
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 56_ Automated infrastructure testing.mp4`
|
||||
|
||||
**Type:** VIDEO | **Category:** 06_CI_CD_GitOps
|
||||
|
||||
**Status:** 🟡 Awaiting Whisper transcription → Summary
|
||||
|
||||
---
|
||||
|
||||
## Summary

> To be generated by an LLM after transcription

---

## Key Concepts

-

---

## Action Items

-

---

## Related Videos

> Paired video note link (to be filled in after generation)

---

*Last updated: 2026-04-14*
|
||||
Some files were not shown because too many files have changed in this diff.