# Rollback Rate

## Definition

Rollback Rate is the proportion of production deployments that are reverted (rolled back) to a previous stable version. It measures how often a deployment fails badly enough that reverting becomes necessary.

The DevOps Maturity Model explicitly lists Rollback Rate as one of the metrics for measuring DevOps maturity.

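
As a minimal sketch of the definition above, the rate can be computed as rolled-back deployments divided by total deployments. The `Deployment` record, its field names, and the sample history are hypothetical and only serve the illustration; treating an empty history as a rate of zero is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    version: str
    rolled_back: bool


def rollback_rate(deployments: list[Deployment]) -> float:
    """Return rolled-back deployments divided by total deployments."""
    if not deployments:
        return 0.0  # assumption: an empty history counts as a rate of zero
    rolled_back = sum(1 for d in deployments if d.rolled_back)
    return rolled_back / len(deployments)


# Illustrative data only, not taken from any real system.
history = [
    Deployment("1.0.0", rolled_back=False),
    Deployment("1.1.0", rolled_back=True),
    Deployment("1.1.1", rolled_back=False),
    Deployment("1.2.0", rolled_back=False),
]
print(f"Rollback rate: {rollback_rate(history):.0%}")  # 1 of 4 -> 25%
```

In practice the time window (per week, per release train) matters as much as the ratio itself, so it helps to fix the window before comparing teams or periods.
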
## Why Rollback Rate Matters

A high rollback rate indicates:
- Deployment quality issues
- Insufficient testing before deployment
- Gap between staging and production environments
- Unstable or risky deployment processes

A low rollback rate indicates:
- High confidence in the deployment pipeline
- Comprehensive pre-production testing
- Stable deployment processes

## Across DevOps Maturity Levels

| Maturity | Rollback Rate Characteristic |
|----------|------------------------------|
| Phase 1 | High rollback rate: manual deployments, no automated testing, siloed teams, manual infrastructure |
| Phase 2 | Improving: automation reduces some risks, but manual interventions still cause rollbacks |
| Phase 3 | Lower: automated infrastructure and security scans reduce failures before deployment |
| Phase 4 | Reduced: performance testing, immutable infrastructure, dependency vulnerability management |
| Phase 5 | Minimal: zero human intervention, real-time decisions, rollback automation for fast recovery |

## Relationship with Other Metrics

### Rollback Rate and Change Failure Rate
- **Change Failure Rate (CFR)**: All deployments that cause failures (regardless of rollback)
- **Rollback Rate**: Only deployments where the team explicitly chose to roll back

A high CFR with a low Rollback Rate could mean that failures were fixed without rolling back (fix-forward). A low CFR with a high Rollback Rate suggests that teams are overly cautious, rolling back deployments that were not counted as failures.

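
The distinction between the two metrics can be sketched with a small calculation. The `Outcome` record, the `share` helper, and the sample numbers below are hypothetical; they assume each deployment is labelled with whether it caused a failure and whether it was rolled back.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hypothetical per-deployment outcome; field names are illustrative."""
    caused_failure: bool  # counts toward Change Failure Rate
    rolled_back: bool     # counts toward Rollback Rate


def share(outcomes, predicate):
    """Fraction of outcomes matching predicate; 0.0 for an empty list."""
    if not outcomes:
        return 0.0
    return sum(1 for o in outcomes if predicate(o)) / len(outcomes)


# Illustrative data: 10 deployments, 3 failures, 1 of them rolled back.
outcomes = (
    [Outcome(caused_failure=False, rolled_back=False)] * 7
    + [Outcome(caused_failure=True, rolled_back=False)] * 2
    + [Outcome(caused_failure=True, rolled_back=True)] * 1
)

cfr = share(outcomes, lambda o: o.caused_failure)
rollback = share(outcomes, lambda o: o.rolled_back)
# Failures resolved without a rollback (fix-forward).
fixed_forward = share(outcomes, lambda o: o.caused_failure and not o.rolled_back)
# Rollbacks of deployments that were never counted as failures.
cautious = share(outcomes, lambda o: o.rolled_back and not o.caused_failure)

print(f"CFR {cfr:.0%}, Rollback Rate {rollback:.0%}")
print(f"Fixed forward {fixed_forward:.0%}, rolled back without failure {cautious:.0%}")
```

Tracking the two extra shares alongside CFR and Rollback Rate makes it easier to tell a fix-forward habit apart from overly cautious rollbacks.
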
### Rollback Rate and MTTR
- Rollback is often a strategy for reducing MTTR (mean time to recovery)
- Fast rollback mechanisms enable quick recovery
- Organizations with mature CI/CD pipelines have both low rollback rates and fast rollback capabilities

## How to Reduce Rollback Rate

### Technical Strategies
- Comprehensive pre-production testing
- Feature flags for gradual rollouts
- Canary deployments (route a small percentage of traffic to the new version)
- Blue-green deployments
- Comprehensive observability to detect issues before users notice
- A/B testing in production

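
The canary idea listed above, routing a small percentage of traffic to the new version, can be sketched with stable hash-based bucketing. This is one possible approach, not a description of any particular load balancer or feature-flag product; the function name `routes_to_canary` and the user ID format are hypothetical.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Assign a user to one of 100 stable buckets; low buckets get the canary."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_percent


# Illustrative traffic: 10,000 synthetic user IDs with a 5% canary.
users = [f"user-{i}" for i in range(10_000)]
canary_share = sum(routes_to_canary(u, 5) for u in users) / len(users)
print(f"Share routed to canary: {canary_share:.1%}")  # close to 5%, not exact
```

Because the bucket is derived from the user ID, the same user keeps seeing the same version while the percentage is unchanged, and raising the percentage only adds users to the canary group.
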
### Process Improvements
- Small batch deployments to limit blast radius
- Strict deployment criteria (all tests green, no open severity-1 bugs)
- Deployment freeze periods for critical systems
- Change advisory board for high-risk changes

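
Strict deployment criteria such as those above can be encoded as an automated gate. The sketch below is an assumption about how such a check might look; `ReleaseCandidate`, its fields, and `deployment_blockers` are hypothetical names, not part of any specific CI/CD tool.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCandidate:
    """Hypothetical snapshot of a release's readiness signals."""
    tests_passed: int
    tests_total: int
    open_sev1_bugs: int
    in_freeze_window: bool = False


def deployment_blockers(rc: ReleaseCandidate) -> list[str]:
    """Return the reasons a release may not deploy; empty means go."""
    blockers = []
    if rc.tests_passed < rc.tests_total:
        blockers.append(f"{rc.tests_total - rc.tests_passed} failing test(s)")
    if rc.open_sev1_bugs > 0:
        blockers.append(f"{rc.open_sev1_bugs} open severity-1 bug(s)")
    if rc.in_freeze_window:
        blockers.append("deployment freeze in effect")
    return blockers


candidate = ReleaseCandidate(tests_passed=118, tests_total=120, open_sev1_bugs=1)
for reason in deployment_blockers(candidate):
    print("Blocked:", reason)
```

Returning the list of reasons, rather than a bare yes or no, gives the team a concrete record of why a deployment was held back.
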
### Cultural Factors
- Psychological safety to admit when a deployment is failing
- Clear criteria for when to roll back and when to fix forward
- Blameless post-mortems to learn from rollbacks
- On-call engineers empowered to make rollback decisions

## Sources
- [[sources/devops-maturity-model-from-traditional-it-to-advanced-devops.md]]

## Related Concepts
- [[concepts/Change-Failure-Rate]]
- [[concepts/MTTR]]
- [[concepts/DORA-Metrics]]
- [[concepts/Continuous-Deployment]]
- [[concepts/DevOps-Maturity]]