Update nexus wiki content
wiki/concepts/Cost-As-Distributed-Systems-Bug.md
Normal file
@@ -0,0 +1,45 @@
---
title: "Cost As Distributed Systems Bug"
type: concept
tags: [sre, finops, observability, reliability, cost-optimization]
last_updated: 2026-04-20
---

# Cost As Distributed Systems Bug

"Cost is a distributed systems bug": a cost anomaly (cost explosion) is not only a financial problem but also a reliability problem.

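## Core Thesis
A sudden increase in cost often foreshadows an impending system failure. **A cost spike should be treated as an alert signal** that triggers an incident investigation, not just a financial review.
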
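## Why Cost Signals Matter
1. **Indicator of resource leaks**: memory leaks and connection pool exhaustion often show up as gradually rising cost
2. **Marker of abnormal traffic**: DDoS or abuse can cause costs to explode
3. **Misconfiguration**: incorrect resource configuration can lead to excessive resource use
4. **Precursor to cascading effects**: a failure in one component can push other components into overload
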
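The "gradually rising cost" pattern from the first item can be sketched as a trend check. This is a minimal illustration, not a method taken from the source: the function names `daily_cost_slope` and `looks_like_leak` and the 2% default for `max_daily_growth` are assumptions chosen for the example.

```python
from typing import Sequence


def daily_cost_slope(costs: Sequence[float]) -> float:
    """Least-squares slope of cost versus day index (cost units per day)."""
    n = len(costs)
    if n < 2:
        raise ValueError("need at least two daily cost points")
    mean_x = (n - 1) / 2
    mean_y = sum(costs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(costs))
    variance_x = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance_x


def looks_like_leak(costs: Sequence[float], max_daily_growth: float = 0.02) -> bool:
    """Flag a steady upward drift: slope exceeds a fraction of the mean daily cost.

    `max_daily_growth` is a hypothetical tuning knob, not a recommended value.
    """
    mean_cost = sum(costs) / len(costs)
    if mean_cost <= 0:
        return False
    return daily_cost_slope(costs) / mean_cost > max_daily_growth
```

A slow drift like this rarely trips a spike threshold, which is why a separate trend check may be worth having alongside it.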
## Alerting Strategy
```
IF cost_increase > threshold:
    ALERT("Cost anomaly detected - investigate system health")
```

Integrate cost monitoring into the SRE alerting system rather than treating it only as after-the-fact FinOps analysis.

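One way to make the rule above concrete is sketched below. It assumes that `cost_increase` means the relative increase of the current cost over the mean of a recent window. The source does not define that, so the function name `check_cost`, the window, and the default `threshold` of 0.5 are all illustrative.

```python
from statistics import mean
from typing import Optional, Sequence


def check_cost(
    history: Sequence[float], current: float, threshold: float = 0.5
) -> Optional[str]:
    """Return an alert message if `current` exceeds the recent baseline by more
    than `threshold` (a fraction, so 0.5 means +50%), otherwise None."""
    if not history:
        return None  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    if baseline <= 0:
        return None  # a relative increase is undefined for a zero baseline
    cost_increase = (current - baseline) / baseline
    if cost_increase > threshold:
        return "Cost anomaly detected - investigate system health"
    return None
```

In practice the returned message would be routed to the same paging or ticketing path as other reliability alerts, so that the on-call engineer sees it and not only the finance team.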
## Key Principles
- **Cost as Signal**: treat cost metrics as signals of system health
- **Proactive Monitoring**: set alerts before cost gets out of control
- **Correlation Analysis**: correlate cost changes with other system metrics

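The correlation principle can be sketched with a plain Pearson coefficient between a daily cost series and candidate system metrics. The metric names in the usage note and the helper `rank_suspects` are hypothetical. A high coefficient only suggests where to look first; it does not establish a cause.

```python
from math import sqrt
from typing import List, Mapping, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rank_suspects(
    cost: Sequence[float], metrics: Mapping[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Order metrics by how strongly they move with cost (absolute correlation)."""
    scored = [(name, pearson(cost, series)) for name, series in metrics.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

For example, calling `rank_suspects(cost, {"open_connections": [...], "request_rate": [...]})` with aligned daily samples returns the metric names ordered from most to least correlated with cost.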
## Relationship to FinOps
FinOps is not only a cost-optimization tool but also a reliability tool for SRE. Cost observability is an important part of modern SRE practice.

## Related Concepts
- [[Cost-Optimization]]
- [[Observability]]
- [[FinOps]]
- [[Distributed-Systems]]
- [[Reliability]]

## Source
- SRE Weekly Issue #513 — [[sre-weekly-issue-513]]