nexus/wiki/concepts/Rollback-Rate.md
2026-04-21 20:03:06 +08:00

Rollback Rate

Definition

Rollback Rate is the proportion of deployments that are reverted (rolled back) to a previous stable version after being deployed to production. It measures how often deployments fail to the point where reverting becomes necessary.

The DevOps Maturity Model explicitly lists Rollback Rate as one of the metrics for measuring DevOps maturity.
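
The definition above amounts to a simple ratio: rolled-back deployments divided by total production deployments over some period. A minimal sketch follows, assuming each deployment is recorded with a flag saying whether it was reverted. The `Deployment` record and its field names are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    id: str
    rolled_back: bool  # True if it was reverted to a previous stable version


def rollback_rate(deployments: list[Deployment]) -> float:
    """Fraction of deployments that were rolled back (0.0 if there were none)."""
    if not deployments:
        return 0.0
    reverted = sum(1 for d in deployments if d.rolled_back)
    return reverted / len(deployments)


history = [
    Deployment("d1", False),
    Deployment("d2", True),
    Deployment("d3", False),
    Deployment("d4", False),
]
print(f"{rollback_rate(history):.0%}")  # 1 of 4 deployments -> 25%
```

The measurement window (per week, per release train, per service) is a choice the team has to make. The ratio is only comparable over time if that window stays consistent.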

Why Rollback Rate Matters

A high rollback rate can indicate one or more of the following:

  • Deployment quality issues
  • Insufficient testing before deployment
  • Gap between staging and production environments
  • Unstable or risky deployment processes

A low rollback rate indicates:

  • High confidence in the deployment pipeline
  • Comprehensive pre-production testing
  • Stable deployment processes

Across DevOps Maturity Levels

| Maturity | Rollback Rate Characteristic |
|----------|------------------------------|
| Phase 1 | High rollback rate: manual deployments, no automated testing, siloed teams, manual infrastructure |
| Phase 2 | Improving: automation reduces some risks, but manual interventions still cause rollbacks |
| Phase 3 | Lower: automated infrastructure and security scans reduce failures before deployment |
| Phase 4 | Reduced: performance testing, immutable infrastructure, dependency vulnerability management |
| Phase 5 | Minimal: zero human intervention, real-time decisions, rollback automation for fast recovery |

Relationship with Other Metrics

Rollback Rate and Change Failure Rate

  • Change Failure Rate: All deployments that cause failures (regardless of rollback)
  • Rollback Rate: Only deployments where the team explicitly chose to roll back

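A high CFR combined with a low Rollback Rate could mean that failures were fixed forward rather than rolled back. A low CFR combined with a high Rollback Rate suggests teams are overly cautious, reverting deployments that had not actually caused a failure.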
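
The distinction can be made concrete by computing both metrics from the same deployment records. The sketch below assumes each record carries two independent flags, `caused_failure` and `rolled_back`. Both field names are hypothetical, and the sample data is invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Deployment:
    """Hypothetical deployment record with two independent outcomes."""
    id: str
    caused_failure: bool  # the deployment caused a failure in production
    rolled_back: bool     # the team chose to revert it


def rate(deployments: list[Deployment],
         predicate: Callable[[Deployment], bool]) -> float:
    """Fraction of deployments matching the predicate (0.0 if there were none)."""
    if not deployments:
        return 0.0
    return sum(1 for d in deployments if predicate(d)) / len(deployments)


def summarize(deployments: list[Deployment]) -> dict[str, float]:
    return {
        "change_failure_rate": rate(deployments, lambda d: d.caused_failure),
        "rollback_rate": rate(deployments, lambda d: d.rolled_back),
        # Failures handled without reverting (fix-forward).
        "fixed_forward": rate(
            deployments, lambda d: d.caused_failure and not d.rolled_back),
        # Reverts of deployments that had not caused a failure.
        "precautionary_rollbacks": rate(
            deployments, lambda d: d.rolled_back and not d.caused_failure),
    }


# Invented sample: 10 deployments, 3 failures, only 1 of them rolled back.
sample = (
    [Deployment(f"ok{i}", False, False) for i in range(7)]
    + [Deployment("f1", True, True)]
    + [Deployment("f2", True, False), Deployment("f3", True, False)]
)
print(summarize(sample))
```

In this sample the CFR is 0.3 while the Rollback Rate is only 0.1, which is the "high CFR, low Rollback Rate" pattern: most failures were fixed forward.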

Rollback Rate and MTTR

  • Rollback is often a strategy for reducing MTTR
  • Fast rollback mechanisms enable quick recovery
  • Organizations with mature CI/CD pipelines tend to have both low rollback rates and fast rollback capabilities: they seldom need to roll back, and recover quickly when they do

How to Reduce Rollback Rate

Technical Strategies

  • Comprehensive pre-production testing
  • Feature flags for gradual rollouts
  • Canary deployments (route a small percentage of traffic to the new version before rolling it out fully)
  • Blue-green deployments
  • Comprehensive observability to detect issues before users notice
  • A/B testing in production
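
The canary idea from the list above can be sketched as two small functions: one that decides which requests go to the new version, and one that compares the canary's error rate with the baseline before the rollout continues. This is an illustrative sketch, not a production implementation. The function names, the hash-based bucketing, and the thresholds (`max_ratio`, `min_requests`) are all assumptions chosen for the example.

```python
import hashlib


def in_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically assign a user to the canary via a hash bucket (0-99)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < canary_percent


def canary_verdict(canary_errors: int,
                   canary_requests: int,
                   baseline_error_rate: float,
                   max_ratio: float = 2.0,      # assumed tolerance, tune per service
                   min_requests: int = 100) -> str:
    """Return 'wait', 'abort', or 'promote' for the canary version."""
    if canary_requests < min_requests:
        return "wait"  # not enough traffic yet to judge
    canary_error_rate = canary_errors / canary_requests
    if canary_error_rate > baseline_error_rate * max_ratio:
        return "abort"  # stop the rollout before it reaches all users
    return "promote"


print(canary_verdict(30, 500, baseline_error_rate=0.01))  # abort
print(canary_verdict(5, 500, baseline_error_rate=0.01))   # promote
```

Hashing the user ID keeps each user on the same version across requests. Aborting at a small traffic percentage limits the impact of a bad release, so fewer changes need a full production rollback.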

Process Improvements

  • Small batch deployments to limit blast radius
  • Strict deployment criteria (all tests green, no open severity-1 bugs)
  • Deployment freeze periods for critical systems
  • Change advisory board for high-risk changes
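
Several of these process rules can be expressed as an automated pre-deployment gate. The sketch below is one possible shape for such a check, assuming the pipeline can supply the test status, the count of open severity-1 bugs, a list of freeze windows, and an approval flag for high-risk changes. The `ReleaseCandidate` type and all field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ReleaseCandidate:
    """Hypothetical summary of a change that is about to be deployed."""
    tests_passed: bool
    open_sev1_bugs: int
    high_risk: bool
    cab_approved: bool  # change advisory board sign-off


def gate_failures(rc: ReleaseCandidate,
                  today: date,
                  freeze_windows: list[tuple[date, date]]) -> list[str]:
    """Return the reasons a deployment is blocked; an empty list means it may go."""
    reasons = []
    if not rc.tests_passed:
        reasons.append("tests are not all green")
    if rc.open_sev1_bugs > 0:
        reasons.append(f"{rc.open_sev1_bugs} open severity-1 bug(s)")
    if any(start <= today <= end for start, end in freeze_windows):
        reasons.append("deployment freeze in effect")
    if rc.high_risk and not rc.cab_approved:
        reasons.append("high-risk change lacks change advisory board approval")
    return reasons


freeze = [(date(2026, 12, 20), date(2027, 1, 2))]  # example window, invented
rc = ReleaseCandidate(tests_passed=True, open_sev1_bugs=1,
                      high_risk=True, cab_approved=False)
for reason in gate_failures(rc, date(2026, 12, 24), freeze):
    print("blocked:", reason)
```

Returning every reason, rather than stopping at the first, lets the team see all outstanding blockers at once.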

Cultural Factors

  • Psychological safety to admit when a deployment is failing
  • Clear criteria for when to roll back versus fix forward
  • Blameless post-mortems to learn from rollbacks
  • On-call engineers empowered to make rollback decisions

Sources