---
title: "Failover"
type: concept
tags: [cloud-computing, reliability, high-availability]
date: 2025-03-02
---

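# Failover

**Failover** is a core mechanism of high-availability systems: when the primary system fails, traffic switches automatically to a standby system so that the service stays available.
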
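## Definition

Failover is an automated redundancy mechanism. When monitoring detects that the primary node has failed, traffic or workload is redirected to a standby node, usually with little or no visible impact on users.
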
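## Key Characteristics

- **Automated**: detection and switchover happen without manual intervention
- **Fast recovery**: switchover time can shrink from minutes to seconds
- **Transparent switchover**: users notice no interruption, or only a very brief one
- **Health checks**: the primary node's health is monitored continuously
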
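The characteristics above can be sketched as a small monitor that probes the primary and promotes the standby after repeated failures. This is a minimal illustration, not a production design: `FailoverMonitor`, the `check` callback, the node names, and the threshold of three consecutive failures are all assumptions made for the example.

```python
from typing import Callable


class FailoverMonitor:
    """Tracks primary health and promotes the standby after repeated failures."""

    def __init__(self, primary: str, standby: str,
                 check: Callable[[str], bool], failure_threshold: int = 3) -> None:
        self.primary = primary
        self.standby = standby
        self.check = check                          # hypothetical probe: True if the node is healthy
        self.failure_threshold = failure_threshold  # assumed value, tune per system
        self.active = primary
        self.consecutive_failures = 0

    def tick(self) -> str:
        """Run one health check and return the node that should receive traffic."""
        if self.active != self.primary:
            return self.active  # already failed over; failback is out of scope here
        if self.check(self.primary):
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.active = self.standby  # automatic switchover, no operator involved
        return self.active


# Simulated probe results: one success, then the primary stops responding.
results = iter([True, False, False, False, True])
monitor = FailoverMonitor("primary-node", "standby-node", check=lambda node: next(results))
history = [monitor.tick() for _ in range(5)]
```

Requiring several consecutive failures before switching is a common way to avoid failing over on a single dropped probe. The trade-off is slower detection, which is one reason switchover times range from seconds to minutes.
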
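## Failover Patterns in Cloud

| Pattern | Description |
|---------|-------------|
| **Active-Passive** | The primary node serves traffic while a standby node waits; traffic switches to the standby on failure |
| **Active-Active** | Multiple nodes serve traffic at the same time; a failed node is removed from rotation automatically |
| **Geo-Failover** | Failover across geographic regions |
| **Multi-Region** | Deployment across multiple regions, so that a failure in one region does not affect the others |
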
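To make the Active-Active row concrete, here is a minimal sketch of a round-robin pool that skips nodes marked unhealthy. `ActiveActivePool` and the node names are hypothetical. A real load balancer would drive `mark_down` from its own health checks rather than from manual calls.

```python
from typing import Iterable, List, Set


class ActiveActivePool:
    """Round-robin routing across nodes, skipping any node marked unhealthy."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.nodes: List[str] = list(nodes)
        self.healthy: Set[str] = set(self.nodes)
        self._next = 0  # index of the next node to try

    def mark_down(self, node: str) -> None:
        self.healthy.discard(node)

    def mark_up(self, node: str) -> None:
        if node in self.nodes:
            self.healthy.add(node)

    def route(self) -> str:
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


pool = ActiveActivePool(["node-a", "node-b", "node-c"])
before = [pool.route() for _ in range(3)]
pool.mark_down("node-b")  # simulate a failed health check
after = [pool.route() for _ in range(4)]
```

Unlike Active-Passive, nothing is promoted here. The surviving nodes simply absorb the failed node's share of traffic, so they need enough spare capacity to do so.
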
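## Cloud Myths Context

Failover is a key mechanism for rebutting the "the cloud is unreliable" misconception:
- Cloud providers use globally distributed architectures to fail over across regions
- Automated failover underpins availability SLAs such as 99.99%; the exact figure depends on the provider, service, and deployment configuration
- Traditional on-premises deployments struggle to achieve the same level of failover capability
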
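To put the 99.99% figure in perspective, an availability target can be converted into a yearly downtime budget. The helper below is a simple arithmetic sketch. The function name is made up for the example, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Maximum downtime per year that still meets the availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} minutes/year")
```

At 99.99%, the budget is roughly 52.6 minutes per year. A manual failover that takes tens of minutes would spend most of that budget in one incident, which is why automated switchover matters.
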
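## Implementation Components

- **Load Balancer**: health checks plus traffic distribution
- **Health Checks**: periodic probes of service availability
- **DNS Failover**: DNS-level switchover with services such as Route 53 or Cloud DNS
- **Database Replication**: synchronous or asynchronous replication at the database level
- **Auto Scaling Groups**: automatic replacement of failed instances
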
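The choice between synchronous and asynchronous replication determines what a failover can lose. The toy model below is a hypothetical sketch, not a real database API. In synchronous mode a write is acknowledged only after the replica has it. In asynchronous mode writes are acknowledged immediately and shipped later.

```python
from typing import List


class ReplicatedStore:
    """Toy primary/replica pair used to compare replication modes."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.primary_log: List[str] = []
        self.replica_log: List[str] = []
        self.pending: List[str] = []  # acknowledged writes not yet on the replica

    def write(self, record: str) -> None:
        self.primary_log.append(record)
        if self.synchronous:
            self.replica_log.append(record)  # acknowledge only once the replica has it
        else:
            self.pending.append(record)      # acknowledge now, replicate later

    def ship_pending(self) -> None:
        """Background replication step used in asynchronous mode."""
        self.replica_log.extend(self.pending)
        self.pending.clear()

    def fail_over(self) -> List[str]:
        """Primary is lost; return the writes the replica never received."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


sync_store = ReplicatedStore(synchronous=True)
sync_store.write("order-1")
sync_store.write("order-2")
lost_sync = sync_store.fail_over()

async_store = ReplicatedStore(synchronous=False)
async_store.write("order-1")
async_store.ship_pending()      # the first write reaches the replica
async_store.write("order-2")    # the primary fails before this one ships
lost_async = async_store.fail_over()
```

In this sketch the synchronous store loses nothing on failover, while the asynchronous store loses the write that had not yet shipped. Real systems pay for the synchronous guarantee with higher write latency.
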
## Related Concepts

- [[High-Availability]]: high availability
- [[cloud-computing]]: cloud computing
- [[Scalability]]: scalability
- [[Disaster-Recovery]]: disaster recovery
- [[cloud-migration]]: cloud migration

## Sources

- [[The Myths and Misconceptions About Cloud Computing (LinkedIn)|sources/the-myths-and-misconceptions-about-cloud-computing-linkedin]]