nexus/wiki/sources/ctp-topic-41-nfrs-and-error-budgets.md
2026-04-19 06:32:15 +08:00

---
id: ctp-topic-41-nfrs-and-error-budgets
title: "CTP Topic 41 NFR's and Error Budgets"
type: source
tags: [cloud-learning, devops, sre]
date: 2026-04-14
sources:
- raw/Cloud & DevOps/Public-Cloud-Learning-Sessions/10_OpenText-Series/ctp-topic-41-nfrs-and-error-budgets.md
---
## Source File
- [[raw/Cloud & DevOps/Public-Cloud-Learning-Sessions/10_OpenText-Series/ctp-topic-41-nfrs-and-error-budgets.md]]
## Summary
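- Core topic: applying NFRs (non-functional requirements) and error budgets in cloud and agile development
- Problem domain: how to balance rapid feature delivery against system reliability requirements
- Methods/mechanisms: SRE practices, the SLI/SLO/SLA framework, chaos engineering
- Conclusion/value: error budgets normalize failure and bridge the gap between development and operations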
## Key Claims
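- NFRs (Non-Functional Requirements) are the criteria for judging how a system operates; they determine attributes such as availability, performance, and security
- An error budget is the maximum amount of time a system can be unreliable without affecting customers
- Error Budget = 1 - availability SLO; for example, a 99.9% SLO corresponds to a 0.1% error budget
- Chaos engineering tests system resilience by deliberately inducing failures, to ensure the NFRs are met
- Under the AWS shared responsibility model, the enterprise must architect and manage its cloud services itself in order to meet its NFRs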
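
The error budget formula above can be turned into a concrete time allowance. The following is a minimal sketch, not something taken from the session: the function name `error_budget` and the 30-day window are assumptions chosen for illustration.

```python
from datetime import timedelta


def error_budget(slo: float, window: timedelta = timedelta(days=30)) -> tuple[float, timedelta]:
    """Return (budget fraction, allowed unreliable time) for an availability SLO.

    `slo` is a fraction such as 0.999. The 30-day default window is an
    assumption for this sketch; the source does not name a window.
    """
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")
    budget = 1.0 - slo  # Error Budget = 1 - availability SLO
    return budget, window * budget


fraction, allowed = error_budget(0.999)
print(f"{fraction:.3%} budget, {allowed.total_seconds() / 60:.1f} minutes per 30 days")
# prints: 0.100% budget, 43.2 minutes per 30 days
```

Under these assumptions a 99.9% SLO leaves roughly 43 minutes of unreliability per 30 days.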
## Key Quotes
> "We want to drive collaboration across our product groups and operations to ensure our obligation to our customers." — Brendan Standing
> "Error budgets normalize failure as part of the development process." — Brendan Standing
> "Perfect availability is 100%, and the error budget falls between the SLO and 100%." — Brendan Standing
## Key Concepts
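- [[NFR非功能需求]]: the criteria for judging how a system operates, such as availability, performance, and security
- [[Error Budget错误预算]]: the allowed amount of time a system can be unreliable without affecting customers
- [[SLI服务等级指标]]: a quantifiable measure of reliability
- [[SLO服务等级目标]]: the performance/reliability target a service should meet
- [[SLA服务等级协议]]: the formal, customer-level agreement
- [[混沌工程]]: the practice of proactively injecting failures to test system resilience
- [[SRE站点可靠性工程]]: the discipline of applying software engineering methods to operations problems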
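
To make the SLI/SLO/SLA distinction concrete, here is a rough sketch of how a measured SLI might be compared against the two targets. All names and numbers are hypothetical. Setting the SLA looser than the SLO is a common convention assumed here, not something stated in the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReliabilityTargets:
    slo: float  # internal objective (hypothetical value, e.g. 0.999)
    sla: float  # contractual commitment (hypothetical value, e.g. 0.995)


def availability_sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of events that were served successfully."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def assess(sli: float, targets: ReliabilityTargets) -> str:
    """Classify a measured SLI against the SLO first, then the SLA."""
    if sli >= targets.slo:
        return "within SLO"
    if sli >= targets.sla:
        return "SLO missed, SLA still met"
    return "SLA breached"


targets = ReliabilityTargets(slo=0.999, sla=0.995)
sli = availability_sli(good_events=99_800, total_events=100_000)
status = assess(sli, targets)
print(status)  # prints: SLO missed, SLA still met
```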
## Key Entities
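- [[Brendan Standing]]: head of SRE at Micro Focus; the speaker
- [[AWS]]: Amazon Web Services, the cloud service provider; shared responsibility model
- [[Micro Focus]]: software company; the organization the SRE team belongs to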
## Connections
- [[SRE]] ← implements ← [[NFR非功能需求]]
- [[SRE]] ← uses ← [[Error Budget错误预算]]
- [[SLO服务等级目标]] ← derives ← [[Error Budget错误预算]]
- [[SLI服务等级指标]] ← measures ← [[SLO服务等级目标]]
- [[混沌工程]] ← validates ← [[NFR非功能需求]]
## Contradictions
- (None yet)
## Notes
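- NFR Epic goal: integrate the NFR template into the Sprint backlog so that every major change takes NFRs into account
- NFRs should be more standardized in the cloud, making use of cloud-native services (for example, using AWS Backup to define backup policies and test frequency)
- Monitoring capability is essential for measuring whether the error budget has been exhausted
- Next steps: work with product teams to integrate NFRs into the backlog and define SLOs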
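
The point about monitoring can be illustrated with a small check of how much error budget is left, given request counts from a monitoring system. This is a sketch under assumptions: the request-based accounting, the function name `budget_remaining`, and the sample numbers are illustrative and were not described in the session.

```python
def budget_remaining(slo: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent).

    Assumes a request-based budget: a (1 - slo) share of all requests
    in the window is allowed to fail.
    """
    if total <= 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_failures = (1.0 - slo) * total
    if allowed_failures == 0:
        # A 100% SLO leaves no budget at all.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


# Hypothetical monitoring figures for one window.
remaining = budget_remaining(slo=0.999, total=1_000_000, failed=400)
exhausted = remaining <= 0
print(f"remaining budget: {remaining:.0%}, exhausted: {exhausted}")
# prints: remaining budget: 60%, exhausted: False
```

A team could use a value like this to decide whether to keep shipping features or to shift effort toward reliability work.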