1. Objective
Ensure business continuity and data protection for the customer by implementing a disaster recovery (DR) strategy built on EFS replication, RDS point-in-time recovery (PITR), and tiered failover methods.
2. DR Scenarios & Recovery Options
| Service | Method | RDS Recovery | EFS Recovery | Failover Steps | Estimated Downtime (RTO) | RPO | Cost Impact |
|---|---|---|---|---|---|---|---|
| DR Basic Service | Cold Backup-Restore | Snapshot (6h) | Backup Restore (6h) | 1. Restore RDS from snapshot (6h) 2. Restore EFS from snapshot (6h) 3. Recover EKS (4h) | 24 hours | 4 hours | Base Cost |
| DR Premium Service | EFS Replica Only (RDS PITR) | PITR (6h) | EFS Replica + Restore (0.2h) | 1. RDS recovery from PITR (6h) 2. Stop EFS sync (0.2h) 3. Full EKS recovery | 6 hours | 15 min | +30% Cost |
3. Downtime Estimation & RTO Considerations
- EFS Replica Only (RDS PITR)
  - 6-hour RTO, significantly reducing downtime compared to the 24-hour cold restore.
  - 15-minute RPO ensures minimal data loss.
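As a rough cross-check of these figures, the step durations from the recovery options table can be summed per tier. The sketch below is illustrative only: the durations are copied from the table, the Premium EKS step is treated as 0 hours because Section 4.3 describes it as immediate, and the plan does not state whether steps run sequentially or in parallel, so both cases are shown as assumptions.

```python
# Step durations in hours, copied from the recovery options table.
# Assumption: Premium EKS recovery is 0 h ("immediate" in Section 4.3).
STEPS = {
    "DR Basic Service": {"rds": 6.0, "efs": 6.0, "eks": 4.0},
    "DR Premium Service": {"rds": 6.0, "efs": 0.2, "eks": 0.0},
}
STATED_RTO = {"DR Basic Service": 24.0, "DR Premium Service": 6.0}


def sequential_hours(steps):
    """Worst case: every step waits for the previous one to finish."""
    return sum(steps.values())


def parallel_hours(steps):
    """Assumed best case: RDS and EFS recover in parallel, then EKS."""
    return max(steps["rds"], steps["efs"]) + steps["eks"]


for tier, steps in STEPS.items():
    seq = sequential_hours(steps)
    par = parallel_hours(steps)
    print(f"{tier}: sequential {seq:.1f} h, parallel {par:.1f} h, "
          f"stated RTO {STATED_RTO[tier]:.0f} h")
```

Under these assumptions the Basic tier's 24-hour RTO leaves headroom over the 16-hour sequential total, while the Premium tier meets its 6-hour RTO only if the EFS step overlaps with the RDS restore (6.2 hours if run strictly in sequence).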
4. DR Execution Plan
4.1 Pre-DR Readiness Checks
- Ensure EFS replication is active and functioning correctly.
- Verify RDS PITR backups and retention policies.
- Pre-configure EKS deployment templates (Velero) for rapid recovery.
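The first two checks above could be automated along the following lines. This is a minimal sketch, not the team's actual tooling: it assumes boto3 and AWS credentials are available, the 7-day retention minimum is a hypothetical threshold, the 15-minute lag limit simply mirrors the Premium RPO, and the function names are made up for illustration. The Velero template check is not covered here.

```python
from datetime import datetime, timedelta, timezone


def efs_replication_healthy(response):
    """True if every destination in a DescribeReplicationConfigurations
    response reports Status == "ENABLED" (and at least one exists)."""
    destinations = [
        dest
        for replication in response.get("Replications", [])
        for dest in replication.get("Destinations", [])
    ]
    return bool(destinations) and all(
        dest.get("Status") == "ENABLED" for dest in destinations
    )


def rds_pitr_ready(response, now, min_retention_days=7,
                   max_lag=timedelta(minutes=15)):
    """True if automated backups are retained long enough and the latest
    restorable time is within max_lag of now. Both thresholds are assumptions."""
    instance = response["DBInstances"][0]
    if instance.get("BackupRetentionPeriod", 0) < min_retention_days:
        return False
    latest = instance.get("LatestRestorableTime")
    return latest is not None and now - latest <= max_lag


def run_checks(file_system_id, db_instance_id):
    """Fetch live state with boto3 (imported lazily so the pure checks
    above can be used without it)."""
    import boto3

    efs = boto3.client("efs")
    rds = boto3.client("rds")
    efs_resp = efs.describe_replication_configurations(
        FileSystemId=file_system_id)
    rds_resp = rds.describe_db_instances(
        DBInstanceIdentifier=db_instance_id)
    now = datetime.now(timezone.utc)
    return {
        "efs_replication": efs_replication_healthy(efs_resp),
        "rds_pitr": rds_pitr_ready(rds_resp, now),
    }
```

Keeping the evaluation logic separate from the API calls makes it possible to test the thresholds against recorded responses without touching a live account.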
4.2 Disaster Recovery Trigger
- DR activation is initiated upon a major failure event in the primary environment.
- Decision criteria include regional failure, prolonged service outage, or severe data corruption.
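The decision criteria can be written down as an explicit check so that the activation decision is recorded with its reasons. The sketch below is an assumption-laden illustration: the `IncidentStatus` fields and the 60-minute threshold for a "prolonged" outage are hypothetical, since the plan does not define them.

```python
from dataclasses import dataclass


@dataclass
class IncidentStatus:
    region_unavailable: bool      # regional failure
    outage_minutes: int           # length of the current service outage
    data_corruption_severe: bool  # severe data corruption confirmed


# Hypothetical threshold; the plan does not define "prolonged".
PROLONGED_OUTAGE_MINUTES = 60


def dr_trigger_reasons(status, threshold=PROLONGED_OUTAGE_MINUTES):
    """Return the criteria that are met; an empty list means no DR trigger."""
    reasons = []
    if status.region_unavailable:
        reasons.append("regional failure")
    if status.outage_minutes >= threshold:
        reasons.append("prolonged service outage")
    if status.data_corruption_severe:
        reasons.append("severe data corruption")
    return reasons


status = IncidentStatus(region_unavailable=False, outage_minutes=95,
                        data_corruption_severe=False)
print(dr_trigger_reasons(status))  # ['prolonged service outage']
```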
4.3 Execution Steps
EFS Replica Only (RDS PITR)
- Recover RDS from PITR (6 hours).
- Stop EFS replication sync (0.2 hours).
- Recover EKS cluster and validate application (immediate).
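The first two steps might be scripted roughly as follows. This is a sketch under stated assumptions, not the plan's confirmed procedure: it assumes boto3, it assumes "stop EFS replication sync" maps to deleting the replication configuration (which makes the destination file system writable), and all identifiers and the `region` argument are placeholders. A cross-Region restore would need different RDS parameters, which are not shown, and the EKS recovery step is left out.

```python
def pitr_restore_params(source_db_id, target_db_id, restore_time=None):
    """Build keyword arguments for rds.restore_db_instance_to_point_in_time.

    With no restore_time, restore to the latest restorable time.
    """
    params = {
        "SourceDBInstanceIdentifier": source_db_id,
        "TargetDBInstanceIdentifier": target_db_id,
    }
    if restore_time is None:
        params["UseLatestRestorableTime"] = True
    else:
        params["RestoreTime"] = restore_time
    return params


def execute_failover(source_db_id, target_db_id, source_fs_id, region,
                     restore_time=None):
    """Start the RDS restore, stop EFS replication, then wait for RDS."""
    import boto3  # imported lazily; requires AWS credentials

    rds = boto3.client("rds", region_name=region)
    efs = boto3.client("efs", region_name=region)

    # Step 1: start the point-in-time restore (the call returns immediately).
    rds.restore_db_instance_to_point_in_time(
        **pitr_restore_params(source_db_id, target_db_id, restore_time))

    # Step 2: end replication so the destination file system becomes writable.
    efs.delete_replication_configuration(SourceFileSystemId=source_fs_id)

    # Wait for the restored instance. The default waiter gives up after about
    # 30 minutes, so poll every 60 s for up to 8 h to cover a ~6 h restore.
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=target_db_id,
        WaiterConfig={"Delay": 60, "MaxAttempts": 480},
    )
```

Starting the restore before stopping EFS replication lets the 0.2-hour EFS step overlap with the 6-hour RDS restore instead of adding to it.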
4.4 Post-Failover Validation
- Confirm data consistency between DR and primary environments.
- Validate application services and connectivity.
- Communicate DR activation and service restoration to stakeholders.
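For the connectivity part of this validation, a basic TCP reachability check against the recovered endpoints is one possible starting point. The sketch below uses only the Python standard library; the endpoint names and hosts in the usage comment are hypothetical, and a successful connection shows only that a port is open, not that the data is consistent or the application is healthy.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def validate_endpoints(endpoints, timeout=3.0):
    """Check each named (host, port) pair and return name -> reachable."""
    return {
        name: tcp_reachable(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }


# Example usage with placeholder endpoints (replace with the DR endpoints):
# results = validate_endpoints({
#     "database": ("dr-db.example.internal", 5432),
#     "application": ("app.dr.example.internal", 443),
# })
# failed = [name for name, ok in results.items() if not ok]
```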
5. DR Testing & Cost Estimation
- An annual DR validation test is required; it adds an estimated 2 months of AWS costs.
- EFS Replica Only (RDS PITR):
  - $20.8K/month
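The cost figures can be combined into a quick estimate. The arithmetic below rests on two assumptions that the plan does not confirm: that the two extra months for the annual test are billed at the $20.8K monthly rate, and that $20.8K already includes the "+30% Cost" uplift over the Basic tier.

```python
PREMIUM_MONTHLY_USD = 20_800   # "$20.8K/month" from the plan
TEST_MONTHS = 2                # annual DR test adds ~2 months of AWS costs
PREMIUM_UPLIFT_PERCENT = 30    # "+30% Cost" from the recovery options table

# Assumption: test months are billed at the Premium monthly rate.
annual_test_cost = PREMIUM_MONTHLY_USD * TEST_MONTHS

# Assumption: $20.8K is the Basic tier cost plus the 30% uplift.
implied_base_monthly = (
    PREMIUM_MONTHLY_USD * 100 / (100 + PREMIUM_UPLIFT_PERCENT)
)

print(f"Annual DR test cost: ${annual_test_cost:,}")            # $41,600
print(f"Implied Basic monthly cost: ${implied_base_monthly:,.0f}")  # $16,000
```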