Update nexus: fix conflicts and sync local changes

2026-04-26 12:06:50 +08:00
parent 191797c01b
commit f09834b5a5
2443 changed files with 254323 additions and 255154 deletions
--- a/wiki/concepts/MTTD.md
+++ b/wiki/concepts/MTTD.md
@@ -1,66 +1,66 @@
-# MTTD (Mean Time to Detect)
-
-## Definition
-MTTD (Mean Time to Detect) is the average time required to identify that a problem or failure has occurred in a system. It measures the effectiveness of monitoring, alerting, and observability practices.
-
-MTTD is a component of MTTR and represents the first phase of incident response.
-
-## Why MTTD Matters
-
-A short MTTD means:
- Failures are caught before they cascade into larger outages
- Customer impact is minimized
- The team can begin recovery faster
- Root cause analysis starts sooner
-
-Long MTTD means:
- Problems can escalate undetected
- User experience degrades for longer periods
- More customers are affected
- Root cause analysis becomes harder as the incident grows
-
-## Across DevOps Maturity Levels
-
-| Maturity | Detection Capability |
-|----------|---------------------|
-| Phase 1 | Long MTTD — outages reported by users, no proactive monitoring, reactive approach |
-| Phase 2 | Better MTTD — essential monitoring tools alert teams as soon as issues affect users |
-| Phase 3 | Improved detection — automated monitoring continues, security scans added earlier in pipeline |
-| Phase 4 | Continuous monitoring — tracks system health for early problem detection and root cause analysis |
-| Phase 5 | Minimal MTTD — max uptime with high collaboration and continuous monitoring, no customer interruptions |
-
-## Key Practices for Low MTTD
-
-### Monitoring & Alerting
- Comprehensive application performance monitoring (APM)
- Infrastructure monitoring
- Log aggregation and analysis
- Real-user monitoring (RUM)
- Synthetic monitoring
-
-### Alerting Best Practices
- Meaningful alert thresholds (avoid alert fatigue)
- Alert routing to appropriate on-call staff
- Clear alert context for rapid triage
- Correlation of related alerts
-
-### Observability
- Structured logging
- Distributed tracing
- Metrics dashboards
- Error tracking
-
-## MTTD vs Other Metrics
- **MTTR**: MTTD is a component of MTTR (MTTR = MTTD + MTTA + Mean Time to Repair)
- **Availability**: High availability depends partly on short MTTD
- **Change Failure Rate**: Fewer failures reaching production reduces MTTD pressure
-
-## Sources
- [[sources/devops-maturity-model-from-traditional-it-to-advanced-devops.md]]
-
-## Related Concepts
- [[concepts/MTTR]]
- [[concepts/MTTA]]
- [[concepts/DORA-Metrics]]
- [[concepts/APM]]
- [[concepts/DevOps-Maturity]]
+# MTTD (Mean Time to Detect)
+
+## Definition
+MTTD (Mean Time to Detect) is the average time required to identify that a problem or failure has occurred in a system. It measures the effectiveness of monitoring, alerting, and observability practices.
+
+MTTD is a component of MTTR and represents the first phase of incident response.
+
+## Why MTTD Matters
+
+A short MTTD means:
+- Failures are caught before they cascade into larger outages
+- Customer impact is minimized
+- The team can begin recovery faster
+- Root cause analysis starts sooner
+
+A long MTTD means:
+- Problems can escalate undetected
+- User experience degrades for longer periods
+- More customers are affected
+- Root cause analysis becomes harder as the incident grows
+
+## Across DevOps Maturity Levels
+
+| Maturity | Detection Capability |
+|----------|---------------------|
+| Phase 1 | Long MTTD — outages reported by users, no proactive monitoring, reactive approach |
+| Phase 2 | Better MTTD — essential monitoring tools alert teams as soon as issues affect users |
+| Phase 3 | Improved detection — automated monitoring continues, security scans added earlier in pipeline |
+| Phase 4 | Continuous monitoring — tracks system health for early problem detection and root cause analysis |
+| Phase 5 | Minimal MTTD — max uptime with high collaboration and continuous monitoring, no customer interruptions |
+
+## Key Practices for Low MTTD
+
+### Monitoring & Alerting
+- Comprehensive application performance monitoring (APM)
+- Infrastructure monitoring
+- Log aggregation and analysis
+- Real-user monitoring (RUM)
+- Synthetic monitoring
+
+### Alerting Best Practices
+- Meaningful alert thresholds (avoid alert fatigue)
+- Alert routing to appropriate on-call staff
+- Clear alert context for rapid triage
+- Correlation of related alerts
+
+### Observability
+- Structured logging
+- Distributed tracing
+- Metrics dashboards
+- Error tracking
+
+## MTTD vs Other Metrics
+- **MTTR**: MTTD is the first component of end-to-end MTTR (MTTR = MTTD + MTTA + Mean Time to Repair)
+- **Availability**: High availability depends partly on short MTTD
+- **Change Failure Rate**: Fewer failures reaching production reduces MTTD pressure
+
+## Sources
+- [[sources/devops-maturity-model-from-traditional-it-to-advanced-devops.md]]
+
+## Related Concepts
+- [[concepts/MTTR]]
+- [[concepts/MTTA]]
+- [[concepts/DORA-Metrics]]
+- [[concepts/APM]]
+- [[concepts/DevOps-Maturity]]
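
The definition of MTTD and the decomposition MTTR = MTTD + MTTA + Mean Time to Repair in the wiki page above can be illustrated with a small calculation. The following is a minimal sketch, not part of the commit: the `Incident` record, its field names, and the sample timestamps are all hypothetical, and it assumes each incident has a known start time, which in practice often has to be estimated after the fact.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions made for this sketch.
    started_at: datetime       # when the failure actually began
    detected_at: datetime      # when monitoring or a user report flagged it
    acknowledged_at: datetime  # when a responder picked it up
    resolved_at: datetime      # when service was restored


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time from failure start to detection."""
    return mean_delta([i.detected_at - i.started_at for i in incidents])


def mtta(incidents: list[Incident]) -> timedelta:
    """Mean time from detection to acknowledgement."""
    return mean_delta([i.acknowledged_at - i.detected_at for i in incidents])


def mean_time_to_repair(incidents: list[Incident]) -> timedelta:
    """Mean time from acknowledgement to resolution."""
    return mean_delta([i.resolved_at - i.acknowledged_at for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean end-to-end time from failure start to resolution."""
    return mean_delta([i.resolved_at - i.started_at for i in incidents])


# Made-up sample data, for illustration only.
INCIDENTS = [
    Incident(
        started_at=datetime(2026, 1, 1, 12, 0),
        detected_at=datetime(2026, 1, 1, 12, 4),
        acknowledged_at=datetime(2026, 1, 1, 12, 6),
        resolved_at=datetime(2026, 1, 1, 12, 36),
    ),
    Incident(
        started_at=datetime(2026, 1, 2, 9, 0),
        detected_at=datetime(2026, 1, 2, 9, 10),
        acknowledged_at=datetime(2026, 1, 2, 9, 14),
        resolved_at=datetime(2026, 1, 2, 9, 44),
    ),
]

if __name__ == "__main__":
    print("MTTD:", mttd(INCIDENTS))
    print("MTTR:", mttr(INCIDENTS))
```

Because all four averages are taken over the same set of incidents, the three phase means add up to the end-to-end mean. That identity stops holding if some incidents are missing a timestamp and are dropped from only one of the averages.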