---
title: CTP Topic 41 NFRs and Error Budgets
type: cloud-learning
source-type: video
category: DevOps & SRE/10_OpenText-Series
tags:
- uncategorized
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 41_ NFRs and Error Budgets.mp4
audio-source: ""
status: summarized (Gemini summary)
---
# CTP Topic 41 NFRs and Error Budgets
**Source:** NAS `/volume2/work/Public Cloud Learning Sessions/CTP _ Topic 41_ NFRs and Error Budgets.mp4`
**Type:** VIDEO | **Category:** 10_OpenText-Series
**Status:** 🟡 Awaiting Whisper transcription → Summary
---
## Summary
> ## NFRs and Error Budgets
Brendan Standing, head of SRE at Micro Focus, discusses non-functional requirements (NFRs) and error budgets in the context of cloud and agile development. The goal is to drive collaboration between product groups and operations so that customer expectations are met, operational requirements are satisfied in an agile manner, and error budget boundaries are understood well enough to deliver features quickly and reliably.
An NFR is a criterion used to judge how a system operates, while an error budget is the maximum time a system can fail without consequences. Historically, NFRs in on-premises data centers were complex and slowed progress, but the focus now is on agile implementation in the cloud.
The cloud landscape shifts ownership, with cloud providers handling infrastructure as a service, platform as a service, or software as a service. AWS's shared responsibility model means the company no longer manages data centers but must architect and manage services in the cloud to meet NFRs. *We want to drive collaboration across our product groups and operations to ensure our obligation to our customers.*
An epic for NFR templates aims to integrate NFRs into sprint backlogs so that they are considered for any major change. NFRs should be more prescriptive in the cloud, leveraging cloud-native services. Examples include specific backup procedures using AWS Backup with defined cadences and testing, as well as DR planning with quarterly testing and infrastructure as code.
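To make "prescriptive and testable" concrete, a minimal sketch follows that models an NFR as a record with a required test cadence. The class name `CadenceNfr`, the field names, and the 30-day and 92-day intervals are assumptions for illustration; the talk only states that backups have defined cadences and that DR is tested quarterly.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CadenceNfr:
    """Hypothetical NFR record: an activity that must be exercised on a fixed cadence."""
    name: str
    max_interval_days: int  # assumed value; the source gives no exact numbers

    def is_met(self, last_tested: date, today: date) -> bool:
        """True if the last successful test falls within the required cadence."""
        return today - last_tested <= timedelta(days=self.max_interval_days)


# Illustrative entries modeled on the backup and DR examples in the text.
BACKUP_RESTORE = CadenceNfr("backup restore test", max_interval_days=30)
DR_EXERCISE = CadenceNfr("DR failover test", max_interval_days=92)  # roughly quarterly

today = date(2026, 4, 14)
last_run = date(2026, 2, 1)  # 72 days before `today`
print(DR_EXERCISE.is_met(last_tested=last_run, today=today))     # True
print(BACKUP_RESTORE.is_met(last_tested=last_run, today=today))  # False
```

A check like this could run in a pipeline so that an overdue test shows up as a failing backlog item rather than a forgotten document.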
Error budgets measure the amount of unreliability a service can have before impacting customers. Developers can take more risks while the service is within budget but must make safer choices once it is not. Error budgets normalize failure and bridge the gap between development and operations. They are derived from service level objectives (SLOs) and measured by service level indicators (SLIs).
SLIs are quantifiable measures of reliability, SLOs define how a service should perform, and SLAs are customer-level agreements. The error budget equals one minus the availability SLO. For example, with a 99.9% uptime SLO, the error budget is 0.1%. *Error budgets normalize failure as part of the development process.*
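The arithmetic can be sketched in a few lines. The function names and the 30-day window below are assumptions chosen for illustration; the talk only gives the relation between the budget and the SLO.

```python
def error_budget(slo: float) -> float:
    """Return the error budget as a fraction, given an availability SLO in [0, 1]."""
    if not 0.0 <= slo <= 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    return 1.0 - slo


def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Translate the budget into minutes of full downtime over a window (assumed 30 days)."""
    window_minutes = window_days * 24 * 60
    return error_budget(slo) * window_minutes


print(f"budget: {error_budget(0.999):.3%}")                                   # budget: 0.100%
print(f"downtime: {allowed_downtime_minutes(0.999):.1f} min per 30 days")     # downtime: 43.2 min per 30 days
```

Expressing the budget in minutes tends to make the trade-off easier to discuss: a 99.9% SLO leaves roughly 43 minutes of total outage in a 30-day window.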
Perfect availability is 100%, and the error budget falls between the SLO and 100%. Monitoring capabilities are crucial to measure whether error budgets are met or exceeded. Smaller iterations of changes and well-tested deployments are essential. Monitoring should quickly show whether error budgets are underutilized or exceeded.
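As a rough sketch of what such monitoring might compute, the snippet below derives budget consumption from request counts (a request-based SLI) and maps it to a release posture. The function names, the request-based SLI, and the 0.75 caution threshold are assumptions for illustration, not something the talk prescribes.

```python
def budget_consumed(total_requests: int, failed_requests: int, slo: float) -> float:
    """Fraction of the error budget used so far (1.0 means fully spent)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 <= slo < 1.0:
        raise ValueError("slo must be in [0, 1)")
    allowed_failures = (1.0 - slo) * total_requests
    return failed_requests / allowed_failures


def release_posture(consumed: float, caution_threshold: float = 0.75) -> str:
    """Map consumption to a posture; the 0.75 threshold is an arbitrary illustration."""
    if consumed >= 1.0:
        return "exceeded: prioritize reliability work"
    if consumed >= caution_threshold:
        return "caution: prefer smaller, safer changes"
    return "within budget: room to take more risk"


# 400 failures out of 1,000,000 requests against a 99.9% SLO uses about 40% of the budget.
used = budget_consumed(total_requests=1_000_000, failed_requests=400, slo=0.999)
print(f"{used:.0%} of budget used -> {release_posture(used)}")
```

A dashboard built on numbers like these shows both cases the text mentions: a budget that is barely touched, which suggests room for faster delivery, and one that is overspent.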
Chaos engineering involves intentionally causing faults to test system resilience and ensure NFRs are met. NFRs should be testable and automated. The next steps involve working with product groups to integrate NFRs into backlogs, refine them, and develop SLOs.
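A toy version of such an experiment is sketched below: faults are injected into a stand-in dependency at a fixed rate, and the success rate observed through a simple retry wrapper can then be compared against an availability target. All names here (`InjectedFault`, `flaky`, `call_with_retries`, `measured_success_rate`) and the 10% fault rate are hypothetical; real chaos experiments act on live infrastructure rather than an in-process function.

```python
import random


class InjectedFault(Exception):
    """Raised by the fault injector in place of a real dependency failure."""


def flaky(func, failure_rate: float, rng: random.Random):
    """Wrap func so that each call fails with the given probability."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("injected fault")
        return func(*args, **kwargs)
    return wrapper


def call_with_retries(func, attempts: int = 3):
    """Resilience mechanism under test: retry a failing call a few times."""
    for attempt in range(attempts):
        try:
            return func()
        except InjectedFault:
            if attempt == attempts - 1:
                raise  # out of retries, surface the failure


def measured_success_rate(failure_rate: float, trials: int = 10_000, seed: int = 41) -> float:
    """Run the experiment and return the fraction of calls that succeeded."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    dependency = flaky(lambda: "ok", failure_rate, rng)
    successes = 0
    for _ in range(trials):
        try:
            call_with_retries(dependency)
            successes += 1
        except InjectedFault:
            pass
    return successes / trials


rate = measured_success_rate(failure_rate=0.1)
print(f"observed success rate with 10% injected faults: {rate:.4f}")
```

Because the experiment is automated and seeded, it can run as a regular test, which matches the point that NFRs should be testable and automated.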
---
## Key Concepts
-
---
## Action Items
-
---
## Related Videos
> Links to paired video notes (to be filled in once generated)
---
*Last updated: 2026-04-14*