---
title: "Failover"
type: concept
tags: [cloud-computing, reliability, high-availability]
date: 2025-03-02
---
# Failover
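**Failover** is a core mechanism of high-availability systems: when the primary system fails, the workload switches automatically to a standby system so that the service keeps running.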
## Definition
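Failover is an automated redundancy mechanism: once the monitoring system detects that the primary node has failed, it automatically redirects traffic or workloads to a standby node, usually without users noticing.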
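The detect-then-switch behavior described here can be sketched in a few lines. This is a minimal illustration, not a production design: `Node`, `FailoverController`, and the boolean `healthy` flag (a stand-in for a real probe result) are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    healthy: bool = True  # assumption: stand-in for the result of a real health probe


class FailoverController:
    """Keeps traffic on the active node and switches only when it becomes unhealthy."""

    def __init__(self, primary: Node, standbys: List[Node]):
        self.primary = primary
        self.standbys = standbys
        self.active = primary

    def check(self) -> Node:
        """Run one monitoring cycle and return the node that should receive traffic."""
        if self.active.healthy:
            return self.active
        # The active node failed: promote the first healthy candidate.
        for node in [self.primary, *self.standbys]:
            if node.healthy:
                self.active = node
                return node
        raise RuntimeError("no healthy node available")
```

Note that this sketch does not fail back automatically: once a standby is active, it stays active until it fails itself. Whether to fail back is a design choice that real systems make explicitly.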
## Key Characteristics
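- **Automated**: detection and switchover happen without human intervention
- **Fast recovery**: switchover time can drop from minutes to seconds
- **Transparent switchover**: users notice no interruption, or only a very brief one
- **Health checks**: the primary node's health is monitored continuously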
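How quickly a failure is detected depends largely on how health checks are tuned. The sketch below assumes a common scheme in which a node is marked unhealthy only after several consecutive failed probes; `HealthTracker`, its thresholds, and `worst_case_detection_seconds` are hypothetical names chosen for illustration.

```python
class HealthTracker:
    """Marks a node unhealthy after `failure_threshold` consecutive failed probes."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 2):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.healthy = True
        self._failures = 0
        self._successes = 0

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the current health verdict."""
        if probe_ok:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.success_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self.healthy and self._failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


def worst_case_detection_seconds(interval_s: float, failure_threshold: int) -> float:
    """Approximate upper bound on detection time, ignoring probe timeouts."""
    return interval_s * failure_threshold
```

With a 10-second probe interval and a threshold of 3, detection takes up to roughly 30 seconds. Shortening the interval or lowering the threshold speeds up detection at the cost of more false positives from transient errors.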
## Failover Patterns in Cloud
| Pattern | Description |
|---------|-------------|
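| **Active-Passive** | The primary node handles traffic while a standby node waits idle; traffic switches to the standby on failure |
| **Active-Active** | Multiple nodes handle traffic at the same time; a failed node is removed from rotation automatically |
| **Geo-Failover** | Failover across geographic regions |
| **Multi-Region** | Deployment in multiple regions, so a single-region failure does not affect the others |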
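To contrast with Active-Passive, the Active-Active pattern can be sketched as round-robin routing that skips unhealthy nodes. `ActiveActivePool` and its methods are hypothetical names for this illustration; real load balancers offer many other distribution strategies.

```python
from typing import Dict, List


class ActiveActivePool:
    """Distributes requests round-robin across all nodes that are currently healthy."""

    def __init__(self, nodes: List[str]):
        self.health: Dict[str, bool] = {n: True for n in nodes}
        self._order = list(nodes)
        self._cursor = 0

    def mark(self, node: str, healthy: bool) -> None:
        """Record the latest health verdict for a node."""
        self.health[node] = healthy

    def pick(self) -> str:
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self._order)):
            node = self._order[self._cursor]
            self._cursor = (self._cursor + 1) % len(self._order)
            if self.health[node]:
                return node
        raise RuntimeError("no healthy node available")
```

Here there is no separate switchover step: a failed node simply stops receiving requests, and the remaining nodes absorb its share of the traffic.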
## Cloud Myths Context
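Failover is a key mechanism for rebutting the misconception that "the cloud is unreliable":
- Cloud providers achieve cross-region failover through globally distributed architectures
- Automated failover underpins SLAs that promise 99.99% availability
- Traditional on-premises deployments struggle to reach the same level of failover capability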
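To put the 99.99% figure in perspective, an availability percentage can be converted into a downtime budget. The helper below is a small arithmetic sketch; the function name and the 365-day year are assumptions made for this example, and actual SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime budget, in minutes, for a given availability over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# 99.99% over a 365-day year leaves about 52.6 minutes of downtime.
print(round(allowed_downtime_minutes(99.99), 1))
# 99.9% over the same year leaves about 525.6 minutes (roughly 8.8 hours).
print(round(allowed_downtime_minutes(99.9), 1))
```

A budget of under an hour per year leaves little room for manual intervention, which is why automated failover matters at this availability level.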
## Implementation Components
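- **Load Balancer**: health checks plus traffic distribution
- **Health Checks**: periodic probes of service availability
- **DNS Failover**: DNS-level switching with Route 53 / Cloud DNS
- **Database Replication**: synchronous or asynchronous replication at the database level
- **Auto Scaling Groups**: automatic replacement at the instance level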
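The replication mode matters at failover time: an asynchronous replica may lag behind the primary, so promoting it can lose the most recent writes. One common way to limit that loss is to promote the healthy replica that has applied the most changes. The sketch below assumes each replica reports a monotonically increasing log position; `Replica`, `applied_position`, and `choose_promotion_candidate` are hypothetical names, not the API of any particular database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    healthy: bool
    applied_position: int  # assumption: last replicated log position this replica applied


def choose_promotion_candidate(replicas: List[Replica]) -> Optional[Replica]:
    """Pick the healthy replica that is most up to date, or None if none is healthy."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.applied_position)
```

With synchronous replication the replicas are not expected to lag on acknowledged writes, so the choice matters less; with asynchronous replication it directly bounds how much data a failover can lose.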
## Related Concepts
- [[High-Availability]]
- [[cloud-computing]]
- [[Scalability]]
- [[Disaster-Recovery]]
- [[cloud-migration]]
## Sources
- [[The Myths and Misconceptions About Cloud Computing (LinkedIn)|sources/the-myths-and-misconceptions-about-cloud-computing-linkedin]]