Auto-sync: 2026-04-21 17:12
@@ -1,28 +0,0 @@
---
title: "Disaster Recovery"
type: concept
tags: [infrastructure, resilience, backup]
last_updated: 2026-04-21
---

## Definition
Disaster Recovery is a set of strategies and processes for restoring IT systems and data after a catastrophic event, with the goal of maintaining business continuity.

## Core Metrics
- **RTO (Recovery Time Objective)**: the maximum downtime the system can tolerate
- **RPO (Recovery Point Objective)**: the maximum acceptable data loss, usually expressed as a window of time before the incident
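
The two objectives can be checked against a concrete incident timeline. The sketch below is illustrative only: the function names and the 24-hour and 1-hour targets are hypothetical, not values from this note.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, incident: datetime) -> timedelta:
    """Span of writes that cannot be restored from the last good backup."""
    return incident - last_backup


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """True if the data lost since the last backup stays within the RPO."""
    return data_loss_window(last_backup, incident) <= rpo


def meets_rto(incident: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if service was restored within the RTO."""
    return restored - incident <= rto


if __name__ == "__main__":
    # Hypothetical timeline: backup 6 hours before the incident, 2-hour outage.
    incident = datetime(2026, 4, 21, 17, 12)
    last_backup = incident - timedelta(hours=6)
    restored = incident + timedelta(hours=2)

    print(meets_rpo(last_backup, incident, rpo=timedelta(hours=24)))  # True
    print(meets_rto(incident, restored, rto=timedelta(hours=1)))      # False
```

In this example the backup schedule satisfies the assumed RPO, but the two-hour outage misses the assumed one-hour RTO, so the recovery procedure is the part that would need work.
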

## Key Components
- **Backup strategy**: create encrypted backups on a regular schedule and store them in S3
- **Recovery procedures**: documented recovery runbooks that have been tested
- **Automated recovery**: scripted automatic failover
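
As a rough illustration of scripted failover, the sketch below picks the first endpoint that passes a health check, in priority order. The note does not describe how its failover scripts work, so `choose_active`, the endpoint names, and the injected `is_healthy` probe are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def choose_active(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint that passes its health check, in priority order.

    Returns None when every endpoint fails, so the caller can escalate.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical state: the primary is down, the standby is up.
    status = {"primary": False, "standby": True}
    print(choose_active(["primary", "standby"], lambda name: status[name]))
```

Passing the probe in as a function keeps the selection logic testable without a network; a real script would replace the lambda with an HTTP or TCP check.
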

## Implementation
In The Agency project, the [[Support Infrastructure Maintainer]] implements:
- Automated backup scripts (GPG encryption + S3 upload)
- 30-day local retention plus S3 lifecycle management
- Backup verification and Slack notifications
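
Two of these pieces, local retention and backup verification, can be sketched without external services. This is not the project's actual script: the `.gpg` file suffix, the SHA-256 checksum as the verification method, and all function names are assumptions. GPG encryption, the S3 upload, and the Slack notification are left out.

```python
import hashlib
import time
from pathlib import Path
from typing import List, Optional

RETENTION_DAYS = 30  # local retention period stated in the note


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large archives fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path: Path, expected_sha256: str) -> bool:
    """Assumed verification: the file exists and matches a recorded checksum."""
    return path.is_file() and sha256_of(path) == expected_sha256


def prune_local_backups(
    directory: Path,
    now: Optional[float] = None,
    retention_days: int = RETENTION_DAYS,
) -> List[Path]:
    """Delete local *.gpg backups older than the retention period.

    Returns the paths that were removed, which a caller could report.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for path in sorted(directory.glob("*.gpg")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A checksum only shows that the file is intact. It does not show that the archive can be decrypted and restored, so a periodic test restore is still needed.
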

## Related Concepts
- [[Feature Flag(特性开关)|Feature Flag]]: controls code paths without redeployment, enabling rollback within seconds
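- [[ITSM(IT 服务管理)|ITSM (IT Service Management)]]: has evolved from a ticketing system into a strategic enabler of operational excellence and risk mitigation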