chore: sync local project changes

2026-04-27 16:26:07 +08:00
parent dfcf7de003
commit 5854781fa8
144 changed files with 12849 additions and 12330 deletions
--- a/wiki/concepts/AlertManagement.md
+++ b/wiki/concepts/AlertManagement.md
@@ -1,54 +1,54 @@
----
-title: "Alert Management"
-type: concept
-tags: [monitoring, alerting, devops, sre]
-last_updated: 2026-04-26
----
-
-## Alert Management (告警管理)
-
-**Chinese name:** 告警管理
-
-**Type:** Operations process and methodology
-
-**Aliases:**
-- 告警管理 (alert management)
-- 告警分发 (alert dispatch)
-- Alert Routing
-
----
-
-## Definition
-
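-Alert management is the full-lifecycle process that takes an alert through **generation → reception → classification → dispatch → response → closure**. Its goal is to notify the right people promptly when a critical system misbehaves, while avoiding alert storms and alert fatigue.
-
-**Alert lifecycle:**
-1. **Generate:** The monitoring system (Prometheus) evaluates alerting rules to decide whether an alert fires
-2. **Forward:** Prometheus sends firing alerts to Alertmanager through the Alertmanager API
-3. **Filter (Dismiss):** Alertmanager applies inhibition, grouping, and silencing
-4. **Route:** Alerts are routed to the matching notification channel by label and severity
-5. **Respond:** The on-call engineer receives the notification and handles it
-6. **Resolve:** The alert clears automatically once the problem is fixed
-
-**Alert governance best practices:**
-- **SLO/SLA-driven:** Tie alerts to business-critical indicators rather than infrastructure details
-- **Severity tiers:** Use three levels (Critical / Warning / Info) so that not every alert is treated as equally urgent
-- **Inhibition rules:** Automatically suppress derived alerts while the root-cause alert is firing
-- **Silences:** Temporarily mute alerts during maintenance windows
-- **On-call rotation:** Rotate on-call duty so that someone can respond 24/7
-
-**Golden rule for evaluating alerts:** Every alert must come with clear handling steps; an alert that cannot be acted on immediately should be suppressed or downgraded.
-
----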
-
-## Prometheus Alert Management Architecture
-
-```
-Prometheus (rule evaluation) → Alertmanager (inhibition/grouping/routing) → notification channels (email/Slack/PagerDuty/phone)
-```
-
----
-
-## Related Sources
-- [[家庭监控方案-prometheus-grafana-node-exporter-cadvisor-blackbox]]
-- [[ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid]]
+---
+title: "Alert Management"
+type: concept
+tags: [monitoring, alerting, devops, sre]
+last_updated: 2026-04-26
+---
+
+## Alert Management (告警管理)
+
+**Chinese name:** 告警管理
+
+**Type:** Operations process and methodology
+
+**Aliases:**
+- 告警管理 (alert management)
+- 告警分发 (alert dispatch)
+- Alert Routing
+
+---
+
+## Definition
+
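+Alert management is the full-lifecycle process that takes an alert through **generation → reception → classification → dispatch → response → closure**. Its goal is to notify the right people promptly when a critical system misbehaves, while avoiding alert storms and alert fatigue.
+
+**Alert lifecycle:**
+1. **Generate:** The monitoring system (Prometheus) evaluates alerting rules to decide whether an alert fires
+2. **Forward:** Prometheus sends firing alerts to Alertmanager through the Alertmanager API
+3. **Filter (Dismiss):** Alertmanager applies inhibition, grouping, and silencing
+4. **Route:** Alerts are routed to the matching notification channel by label and severity
+5. **Respond:** The on-call engineer receives the notification and handles it
+6. **Resolve:** The alert clears automatically once the problem is fixed
+
+**Alert governance best practices:**
+- **SLO/SLA-driven:** Tie alerts to business-critical indicators rather than infrastructure details
+- **Severity tiers:** Use three levels (Critical / Warning / Info) so that not every alert is treated as equally urgent
+- **Inhibition rules:** Automatically suppress derived alerts while the root-cause alert is firing
+- **Silences:** Temporarily mute alerts during maintenance windows
+- **On-call rotation:** Rotate on-call duty so that someone can respond 24/7
+
+**Golden rule for evaluating alerts:** Every alert must come with clear handling steps; an alert that cannot be acted on immediately should be suppressed or downgraded.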
+
+---
+
+## Prometheus Alert Management Architecture
+
+```
+Prometheus (rule evaluation) → Alertmanager (inhibition/grouping/routing) → notification channels (email/Slack/PagerDuty/phone)
+```
+
+---
+
+## Related Sources
+- [[家庭监控方案-prometheus-grafana-node-exporter-cadvisor-blackbox]]
+- [[ctp-topic-8-implementation-of-cloud-monitoring-using-micro-focus-operations-brid]]
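
Reviewer note: the inhibit-then-route flow described in the page can be illustrated with a small, self-contained Python sketch. It is not Alertmanager's implementation or configuration format. The `Alert` class, the `ROUTES` table, the `INHIBIT_RULES` entry, and the alert names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    name: str
    severity: str  # "critical", "warning", or "info"
    labels: dict = field(default_factory=dict)


# Hypothetical severity-to-channel table (assumption, not from the page).
ROUTES = {"critical": "pagerduty", "warning": "slack", "info": "email"}

# Hypothetical inhibition rule: while the source alert fires, suppress the
# target alert when both carry the same value for the given label.
INHIBIT_RULES = [("NodeDown", "ServiceUnreachable", "instance")]


def inhibit(alerts):
    """Drop derived alerts whose root-cause alert is also firing."""
    kept = []
    for alert in alerts:
        suppressed = any(
            alert.name == target
            and any(
                other.name == source
                and other.labels.get(equal) == alert.labels.get(equal)
                for other in alerts
            )
            for source, target, equal in INHIBIT_RULES
        )
        if not suppressed:
            kept.append(alert)
    return kept


def route(alerts):
    """Group the surviving alert names by notification channel."""
    routed = {}
    for alert in inhibit(alerts):
        channel = ROUTES.get(alert.severity, "email")
        routed.setdefault(channel, []).append(alert.name)
    return routed


if __name__ == "__main__":
    firing = [
        Alert("NodeDown", "critical", {"instance": "a"}),
        Alert("ServiceUnreachable", "warning", {"instance": "a"}),
        Alert("ServiceUnreachable", "warning", {"instance": "b"}),
        Alert("DiskFilling", "info", {"instance": "a"}),
    ]
    print(route(firing))
```

In this sketch the `ServiceUnreachable` alert on instance `a` is suppressed because `NodeDown` is firing for the same instance, while the one on instance `b` is still routed to the warning channel.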