# Change Failure Rate

## Definition

Change Failure Rate (CFR) is the percentage of deployments that cause failures in production, such as service outages, degraded performance, or incidents requiring hotfixes, rollbacks, or patches.

Change Failure Rate is one of the four core **DORA metrics** used to measure DevOps performance.

## Why Change Failure Rate Matters

A low change failure rate indicates:

- High confidence in the deployment process
- Robust testing and quality assurance
- Effective risk management
- Mature operational practices

A high change failure rate means:

- Frequent production incidents
- Unstable deployments
- Low team confidence
- Customer impact

## Across DevOps Maturity Levels

| Maturity | Change Failure Rate Characteristic |
|----------|-----------------------------------|
| Phase 1 | High: manual processes, no automated testing, siloed teams, security only at release |
| Phase 2 | Improving: unit, integration, and end-to-end tests implemented, but security separate |
| Phase 3 | Lower: automated infrastructure, security scans integrated throughout development |
| Phase 4 | Significantly reduced: performance/load testing, immutable infrastructure, dependency vulnerability management |
| Phase 5 | 0-15% (elite): zero human intervention, real-time data decisions, high-level security integration prevents non-compliant code |

## Elite Performance Benchmark (DORA)

- **Elite performers**: 0-15% change failure rate
- **High performers**: 16-30% change failure rate
- **Medium performers**: 16-30% change failure rate
- **Low performers**: 31-100% change failure rate

## Types of Failed Changes

- Production outages
- Service degradations
- Data corruption
- Security vulnerabilities introduced
- Performance regressions
- Failed rollbacks

## How to Reduce Change Failure Rate

### Technical Practices

- Comprehensive test automation (unit, integration, E2E)
- Feature flags for gradual rollouts
- Canary deployments
- Blue-green deployments
- Automated rollback mechanisms
- Chaos engineering to find weaknesses before production

### Process Improvements

- Code review requirements
- Security scanning in CI/CD pipeline
- Staging environment parity with production
- Small batch sizes to limit blast radius
- Dependency management and vulnerability scanning

### Cultural Factors

- Blameless post-mortems
- Learning from failures
- Psychological safety to report issues
- Shared ownership of reliability

## Relationship with Other DORA Metrics

- **Deployment Frequency**: Higher frequency with lower CFR indicates elite performance
- **Lead Time**: Shorter lead times with maintained/low CFR = high performance
- **MTTR**: Lower CFR means fewer incidents, contributing to lower overall MTTR

## Sources

- [[sources/devops-maturity-model-from-traditional-it-to-advanced-devops.md]]
- [[sources/cloud-devop-maturity-guideline.md]]

## Related Concepts

- [[concepts/DORA-Metrics]]
- [[concepts/Continuous-Deployment]]
- [[concepts/DevOps-Maturity]]
- [[concepts/Error-Budget]]
- [[concepts/Rollback-Rate]]
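
## Example: Computing CFR

The definition on this page (failed deployments as a percentage of all production deployments) can be sketched in a few lines of Python. This is an illustrative sketch, not a standard implementation: the `Deployment` record, its `caused_failure` flag, and the helper names are hypothetical, and how a deployment gets marked as failed (rollback, hotfix, incident link) is left to the team's own tooling. The 15% threshold is the elite band quoted in the benchmark section of this page.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    id: str
    caused_failure: bool  # assumed to be set when a rollback, hotfix, or incident followed


def change_failure_rate(deployments):
    """Return CFR as a percentage: failed deployments / total deployments * 100."""
    if not deployments:
        raise ValueError("CFR is undefined when there are no deployments")
    failed = sum(1 for d in deployments if d.caused_failure)
    return 100.0 * failed / len(deployments)


def within_elite_band(cfr_percent, threshold=15.0):
    """Check a CFR value against the 0-15% elite band quoted on this page."""
    return 0.0 <= cfr_percent <= threshold


# 40 deployments in the measurement window, 3 of which caused a failure.
deployments = [Deployment(f"d{i}", caused_failure=(i < 3)) for i in range(40)]
cfr = change_failure_rate(deployments)
print(f"CFR: {cfr:.1f}%")                 # CFR: 7.5%
print("Elite band:", within_elite_band(cfr))  # Elite band: True
```

The measurement window matters: computing CFR over too few deployments makes the percentage swing sharply with a single failure, so a rolling window (for example, the last 30 or 90 days) is a common choice.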