---
title: "Recovery Assurance"
type: concept
tags: [dr, reliability, sre]
---

## Definition
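Recovery Assurance is an architectural approach that builds recoverability into a system at design time. Unlike traditional disaster recovery (DR), it emphasizes proactive design rather than reactive response.

## Key Differences from DR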
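- **DR (Disaster Recovery)**: reactive; recovery is attempted only after an incident has occurred.
- **Recovery Assurance**: proactive; recoverability is considered from the design phase onward.

## Core Principles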
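1. **Design for Failure**: assume components will fail, and design fault-tolerance mechanisms accordingly.
2. **Observability**: continuously monitor system health.
3. **Automation**: detect failures and recover from them automatically.
4. **Test-Driven**: verify recoverability through testing.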
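The principles above can be sketched as a small detect-and-recover loop. This is a minimal illustration, not a reference implementation: `Component`, `ensure_recovered`, and `make_flaky_component` are hypothetical names introduced here, and the recovery action stands in for whatever a real system would do (restart, failover, and so on).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Component:
    """Hypothetical component exposing a health probe and a recovery action."""
    name: str
    is_healthy: Callable[[], bool]  # Observability: a health probe
    recover: Callable[[], None]     # Automation: e.g. restart or failover


def ensure_recovered(component: Component, max_attempts: int = 3) -> bool:
    """Probe the component and run its recovery action until it is healthy.

    Returns True if the component is healthy on exit, False if it is still
    unhealthy after max_attempts recovery attempts.
    """
    for _ in range(max_attempts):
        if component.is_healthy():
            return True
        # Design for Failure: failure is expected, so recovery is a normal path.
        component.recover()
    return component.is_healthy()


def make_flaky_component(failures_before_recovery: int) -> Component:
    """Test double: unhealthy until recover() has been called N times."""
    state = {"recoveries": 0}

    def is_healthy() -> bool:
        return state["recoveries"] >= failures_before_recovery

    def recover() -> None:
        state["recoveries"] += 1

    return Component("flaky-service", is_healthy, recover)
```

Calling `ensure_recovered(make_flaky_component(2))` exercises the Test-Driven principle: the test double simulates a failure, and the assertion is that the recovery path actually restores health within the attempt budget.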
## Related Metrics
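- [[RTO]] (Recovery Time Objective): the target maximum time to restore service after a disruption.
- [[RPO]] (Recovery Point Objective): the maximum tolerable data loss, expressed as a time window before the disruption.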
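As a rough illustration of how these two metrics might be checked after a recovery drill, the sketch below compares measured timestamps against the objectives. `DrillResult` and `meets_objectives` are hypothetical names, and the sketch assumes the last recoverable point is a backup or replica timestamp taken before the failure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    """Hypothetical record of one recovery drill."""
    failure_at: datetime      # when the outage started
    last_backup_at: datetime  # last recoverable point before the failure
    restored_at: datetime     # when service was restored


def meets_objectives(result: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Compare measured recovery time and data-loss window with RTO and RPO."""
    recovery_time = result.restored_at - result.failure_at
    data_loss_window = result.failure_at - result.last_backup_at
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }
```

For example, a drill with a failure at 12:00, the last backup at 11:50, and service restored at 12:25 would meet a 30-minute RTO but miss a 5-minute RPO, since ten minutes of data fall inside the loss window.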
## Related Concepts
- [[Self-Healing Systems]]
- [[SRE]]: Site Reliability Engineering
- [[Observability Engineering]]
- [[Chaos Engineering]]