---
title: "Kubernetes"
type: entity
tags:
- cloud
- container
- orchestration
- devops
created: 2026-04-25
---
# Kubernetes
## Definition
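Kubernetes (K8s) is a **container orchestration platform** open-sourced by Google that automates the deployment, scaling, and management of containerized applications. It is core infrastructure for cloud-native architecture and the primary target environment for Agentic AI self-healing.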
## Aliases
- K8s
- Kubernetes
- Container Orchestration Platform
## Major Cloud Implementations
| Provider | Service | Description |
|----------|---------|-------------|
| AWS | EKS (Elastic Kubernetes Service) | Managed Kubernetes on AWS |
| GCP | GKE (Google Kubernetes Engine) | Managed Kubernetes on GCP |
| Azure | AKS (Azure Kubernetes Service) | Managed Kubernetes on Azure |
## Kubernetes Self-Healing Capabilities
Kubernetes natively provides basic self-healing capabilities:
```yaml
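# Kubernetes native self-healing mechanisms
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app          # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: example-app
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: app
          image: nginx:stable   # placeholder image
# Built-in mechanisms:
# - Automatically restart failed containers
# - Replace unhealthy Pods
# - Rolling updates keep the service available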
```
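The restart behavior listed above is usually driven by health probes. The manifest below is a minimal sketch of that idea, not a recommended configuration: the Pod name, image, probe path, and timing values are illustrative assumptions.

```yaml
# Sketch only: name, image, path, and timings are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  restartPolicy: Always         # the default; containers that exit are restarted
  containers:
    - name: web
      image: nginx:stable       # placeholder image
      ports:
        - containerPort: 80
      livenessProbe:            # repeated failures cause the kubelet to restart the container
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:           # failures remove the Pod from Service endpoints, no restart
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
```

Separating the two probes lets a temporarily overloaded container stop receiving traffic without being restarted, while a container that stops responding entirely is still restarted.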
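Agentic AI builds on these native capabilities to provide **more advanced self-healing**:
| Capability | Kubernetes Native | Agentic AI Enhanced |
|------|---------------|-------------------|
| Pod restart | ✅ Automatically restarts crashed containers | ✅ Intelligent root cause analysis + preventive restarts |
| Scaling | ✅ HPA based on metrics | ✅ Predictive scaling |
| Node recovery | ✅ Reschedules Pods away from failed nodes | ✅ Proactive health checks + preventive migration |
| Configuration repair | ❌ Requires manual intervention | ✅ AI automatically corrects ConfigMap/Secret |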
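For reference, the native metric-based scaling in the table can be expressed as a HorizontalPodAutoscaler. This is a hedged sketch: it targets a hypothetical Deployment named `example-app`, and the replica bounds and 70% CPU target are assumed values chosen for illustration.

```yaml
# Sketch only: target name, replica bounds, and threshold are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app-hpa         # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app           # hypothetical Deployment to scale
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```

CPU utilization is measured against each container's CPU request, so the target Pods need `resources.requests.cpu` set and the cluster needs a metrics source such as metrics-server. The HPA reacts to metrics it has already observed, which is the gap the predictive scaling column refers to.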
## Agentic AI Monitoring Targets
```
┌─────────────────────────────────────────────────┐
│ Agentic AI for Kubernetes                       │
├─────────────────────────────────────────────────┤
│ Monitoring layer                                │
│ ├── Pod Metrics (CPU/Memory/Network)            │
│ ├── Workload Health (Deployment/ReplicaSet)     │
│ ├── Node Status (Ready/Condition)               │
│ └── Cluster Components (etcd, API Server)       │
│                                                 │
│ Decision layer                                  │
│ ├── Anomaly Detection (AI)                      │
│ ├── Root Cause Analysis (AI)                    │
│ └── Action Planning (AI)                        │
│                                                 │
│ Execution layer                                 │
│ ├── kubectl API (restart/migrate/scale)         │
│ ├── HPA Override (AI-driven scaling)            │
│ └── Config Updates (AI-driven fixes)            │
└─────────────────────────────────────────────────┘
```
## Example
> An AI agent monitoring AWS EKS clusters detects high CPU usage caused by a rogue pod:
> - Pod `payment-service-v2-abc123` CPU usage: 95%
> - AI correlates the spike with the recent deployment timestamp
> - AI identifies the cause: a memory leak in the new version
> - AI actions:
>   1. Scale the deployment to 3 replicas (distribute load)
>   2. Create a rollback ticket
>   3. Notify the team via Slack
>   4. Roll back automatically after approval
## Related Concepts
- [[Self-Healing Systems]]: Kubernetes is the primary vehicle for self-healing
- [[Cloud-Native]]: Kubernetes is the core of cloud-native architecture
- [[Deployment Automation]]: automation of Kubernetes deployments
- [[Container Lifecycle Hardening]]: container security hardening
## Related Entities
- [[Agentic AI]]: Kubernetes is what Agentic AI manages
- EKS, GKE, AKS: provider-specific managed implementations
## Related Sources
- [[how-agentic-ai-can-help-for-cloud-devops]]
- [[ctp-topic-70-eks-deployment-using-iac]]