| title | type | source-type | category | tags | date-added | video-source | audio-source | status |
|---|---|---|---|---|---|---|---|---|
| Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording | cloud-learning | video | DevOps & SRE/05_FinOps | | 2026-04-14 | nas:///volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4 | | summarized (Gemini summary) |
Public Cloud Learning Sessions - Budget Control - 20240319 160204-Meeting Recording
Source: NAS /volume2/work/Public Cloud Learning Sessions/Public Cloud Learning Sessions - Budget Control - 20240319_160204-Meeting Recording.mp4
Type: VIDEO | Category: 05_FinOps
Status: 🟡 Awaiting Whisper transcription → Summary
Budget Control Automation
The SRE Core team (Daniela, Evan, and Alan) presented a learning session on budget control, a new automation providing detailed data to manage budgets and costs within AWS accounts. The session covered the new budget control's value, diagrams, detailed cost reports, AWS budget alerts/actions, and source identity implementation.
The budget control automation aims to address uncontrolled AWS account sprawl and unsustainable cost reduction efforts. It provides account owners with detailed alerts, including information on account spending and cost drivers, enabling them to identify areas for cost reduction. Enforcement will involve attaching an SCP to block new resource creation. The initial scope is limited to lab accounts, with other accounts continuing to receive standard out-of-budget alerts.
An example alert email includes account details, alert details, warning messages, and detailed reports. There are four types of email alerts: forecast, actual, severe, and enforcement. Forecast alerts fire at the 100% threshold and trigger no action. Actual alerts fire at the 80%, 90%, 95%, and 98% thresholds, with the recipient list escalating at each step. At 100% of actual spend, either a severe or an enforcement alert is triggered, depending on a scoring system. Enforcement will initially require manual approval and will be automated later. Budget increases can be requested through an Oli workflow.
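The alert flow above can be sketched as a small classifier. This is a minimal illustration, not the team's code: the threshold values come from the session, while the `Alert` type, the `classify_alert` function, and the `enforce` flag (a stand-in for the outcome of the scoring system) are assumptions.

```python
from dataclasses import dataclass

# Thresholds taken from the session summary; everything else is illustrative.
ACTUAL_THRESHOLDS = (80, 90, 95, 98)


@dataclass(frozen=True)
class Alert:
    kind: str        # "forecast", "actual", "severe", or "enforcement"
    threshold: int   # percentage of the budget that triggered the alert


def classify_alert(percent_of_budget: float, forecast: bool, enforce: bool = False):
    """Return the alert for a budget reading, or None if no threshold is crossed.

    `enforce` is a hypothetical stand-in for the scoring system's decision,
    which the summary does not specify.
    """
    if forecast:
        # Forecast alerts fire at 100% and carry no action.
        return Alert("forecast", 100) if percent_of_budget >= 100 else None
    if percent_of_budget >= 100:
        return Alert("enforcement" if enforce else "severe", 100)
    crossed = [t for t in ACTUAL_THRESHOLDS if percent_of_budget >= t]
    return Alert("actual", max(crossed)) if crossed else None
```

For example, `classify_alert(92, forecast=False)` reports an actual alert at the 90% threshold, the highest one crossed so far.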
Challenges during development included tracking the source identity of each user, customizing AWS budget alerts, choosing an enforcement method (SCP), and providing a grace period before enforcement. Budgets are evaluated every eight hours, and a disabled budget action leaves the account without spend control until the next month. Currently, 80 lab accounts exceed their budgets, and around 100 are expected to cross the 80% threshold of their budget.
The implementation will be gradual, starting with alerts only on April 1st. Manual enforcement will follow upon FinOps' approval, with automatic enforcement as the next step.
Diagrams and Detailed Cost Reports
Daniela discussed the diagrams and cost reports attached to email alerts, explaining how they are created and what they contain. The team created shared libraries for the Lambda functions to improve code visibility and simplify deployment. The "top services of recent months" report helps managers understand cost drivers by showing the percentage of the budget spent on specific services over time. The "top users of the current month" diagram lets account owners monitor daily spending by user. A detailed Excel report provides granular information on resource IDs, creators, and associated costs, separated by month.
According to the presenters, this is the first time the team has reached this level of granularity. Data for the top services report is generated from Athena, while the top users diagram uses data from Cost Explorer.
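As a rough sketch of how a "percentage of budget per service" report could be derived from query results, the function below aggregates cost rows by month and service. The row shape `(month, service, cost)`, the function name, and the sample figures are all assumptions made for illustration; the session does not describe the actual query output.

```python
from collections import defaultdict


def service_share_of_budget(rows, monthly_budget):
    """Aggregate cost rows into {month: {service: percent_of_budget}}.

    `rows` is an iterable of (month, service, cost) tuples, a hypothetical
    shape for the query results.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for month, service, cost in rows:
        totals[month][service] += cost
    return {
        month: {
            # Largest cost driver first within each month.
            service: round(100 * cost / monthly_budget, 1)
            for service, cost in sorted(by_service.items(), key=lambda kv: -kv[1])
        }
        for month, by_service in totals.items()
    }


# Illustrative sample data only, not figures from the session.
rows = [
    ("2024-02", "AmazonEC2", 300.0),
    ("2024-02", "AmazonS3", 50.0),
    ("2024-03", "AmazonEC2", 200.0),
    ("2024-03", "AmazonEC2", 250.0),
]
report = service_share_of_budget(rows, monthly_budget=1000.0)
```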
AWS Budget Alerts and Actions
Alan discussed the implementation of AWS budget alerts and actions. The AWS Budgets service offers little customization, so the team had to parse the bodies of the emails it sends. The budget alert system publishes messages to an SNS topic, which triggers a Lambda function. The Lambda function extracts data from the email body and uses it to build a more detailed message. A step function then enriches the data with account information, budget details, and owner/manager contacts.
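The parsing step could look roughly like the sketch below. The `Records[].Sns.Message` path is the standard shape of an SNS-triggered Lambda event; the "Key: Value" layout of the notification body and the `SAMPLE` text are assumptions, since the session does not show the real message format, and the function names are hypothetical.

```python
import re

# Matches "Key: Value" lines in the notification text.
FIELD_RE = re.compile(r"^\s*([A-Za-z ]+):\s*(.+?)\s*$")

# Assumed example of a notification body, for illustration only.
SAMPLE = """AWS Budget Notification
Budget Name: lab-account-budget
Budgeted Amount: $100.00
Alert Type: ACTUAL
Alert Threshold: > $80.00
ACTUAL Amount: $85.00
"""


def parse_budget_message(body: str) -> dict:
    """Extract "Key: Value" lines from a budget notification body."""
    fields = {}
    for line in body.splitlines():
        match = FIELD_RE.match(line)
        if match:
            key = match.group(1).strip().lower().replace(" ", "_")
            fields[key] = match.group(2)
    return fields


def handler(event, context):
    """Lambda entry point: parse every SNS record in the incoming event."""
    return [parse_budget_message(r["Sns"]["Message"]) for r in event["Records"]]
```

The parsed dictionaries would then be passed to the enrichment workflow, which adds the account, budget, and contact details.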
AWS allows actions to be applied based on alert thresholds. A budget action at 100% triggers either a severe or an enforcement email, depending on the scoring system. If budget enforcement is enabled, an SCP is applied to block resource creation. The FinOps group receives a notification and decides whether to apply the action immediately or negotiate with the account owner.
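A deny-style SCP of the kind described here might be generated as follows. This is a sketch under stated assumptions: the list of blocked actions and the `Sid` are hypothetical, because the session does not say which actions the team's policy denies.

```python
import json

# Hypothetical list of resource-creating actions; the session does not say
# which actions the team's SCP actually denies.
BLOCKED_ACTIONS = ["ec2:RunInstances", "rds:CreateDBInstance", "lambda:CreateFunction"]


def build_enforcement_scp(blocked_actions=BLOCKED_ACTIONS) -> str:
    """Return a JSON SCP document that denies the given create actions."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyResourceCreationOverBudget",
                "Effect": "Deny",
                "Action": sorted(blocked_actions),
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

A policy like this is typically attached to the account through AWS Organizations, either directly or by a budget action configured to apply an SCP. Because it only denies creation calls, existing resources keep running.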
The scoring system and grace period calculations aim to avoid penalizing accounts that slightly exceed their budget near the end of the month. The score considers the account's size and how close the date is to the end of the month. Smaller accounts receive a more generous grace period.
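The session does not give the scoring formula, so the following is only a hypothetical illustration of the idea: enforcement is skipped when the month is nearly over, and smaller accounts get more leeway. The tier names, the grace-period values, and the `should_enforce` function are all invented for this sketch.

```python
import calendar
from datetime import date

# Hypothetical grace periods in days per account size tier.
GRACE_DAYS = {"small": 7, "medium": 4, "large": 2}


def should_enforce(today: date, tier: str) -> bool:
    """Enforce only if more days remain in the month than the tier's grace period."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    days_remaining = days_in_month - today.day
    return days_remaining > GRACE_DAYS[tier]
```

Under these assumed values, a small account that hits 100% on March 27 has four days left, fewer than its seven-day grace period, so it would receive a severe alert rather than enforcement. A large account on the same date would be enforced.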
FinOps has classified accounts based on cost range. The budgets were last updated on February 23rd.
The source identity attribute was implemented to track user activity within AWS accounts, even when users assume different roles. Federated logins use NetIQ Access Manager to authenticate users and provide access to AWS accounts. The source identity preserves the original login identity across role changes, allowing CloudTrail and other services to attribute activity to the right user.
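For a federated role to carry a source identity, its trust policy generally has to allow `sts:SetSourceIdentity` next to the assume-role action. The sketch below builds such a trust policy for a SAML identity provider. The provider ARN and function name are placeholders, and the session does not show the team's actual policy, so treat this as an assumed shape.

```python
import json


def build_saml_trust_policy(saml_provider_arn: str) -> str:
    """Return a role trust policy that lets a SAML IdP set the source identity."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": saml_provider_arn},
                # sts:SetSourceIdentity is allowed alongside the assume-role
                # action so the session can carry a source identity.
                "Action": ["sts:AssumeRoleWithSAML", "sts:SetSourceIdentity"],
                "Condition": {
                    "StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder ARN for illustration only.
trust_policy = build_saml_trust_policy(
    "arn:aws:iam::123456789012:saml-provider/ExampleIdP"
)
```

Once set, the source identity cannot be changed for the rest of the session, including across chained role assumptions, which is what lets CloudTrail records be tied back to the original login.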