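---
title: "Public Cloud Learning Sessions - Observability with OpenTelemetry - 20240402"
type: source
tags:
  - OpenTelemetry
  - Observability
  - AWS
  - EKS
  - DevOps
date: 2024-04-02
---

## Source File

- [[raw/Cloud & DevOps/Public-Cloud-Learning-Sessions/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md]]

## Summary

- Core topic: applying the OpenTelemetry observability framework in AWS environments
- Problem domain: system observability challenges in microservice architectures
- Approach/mechanism: the three signals (Metrics, Logs, Traces), OpenTelemetry as a unified collection layer, and AWS Distro for OpenTelemetry (ADOT)
- Conclusion/value: OpenTelemetry provides a vendor-neutral standard for end-to-end telemetry collection, reducing the complexity of multi-vendor integration

## Key Claims

- Observability measures how well a system's internal state can be inferred from its external outputs. Gartner estimates an average of 87 hours of downtime per year, at a cost of $42,000 per hour.
- OpenTelemetry addresses the problem of each component shipping its own SDK by providing a unified data format and SDKs for 11 languages.
- AWS Distro for OpenTelemetry is a unified agent that can automatically detect the application's language and create a preconfigured Collector.
- The OpenTelemetry Collector standardizes data through four component types: Receivers, Processors, Exporters, and Extensions.

## Key Quotes

> "Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs."
> (Definition given in the OpenText learning session)

> "The output that Fluent Bit is sending the individual logs to is the OpenTelemetry endpoint on the port 55681."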
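> (Demo walkthrough)

## Key Concepts

- [[OpenTelemetry]]: vendor-neutral telemetry collection framework that provides a unified data format and cross-language SDKs
- [[OpenTelemetry-Collector]]: data standardization component made up of Receivers, Processors, Exporters, and Extensions
- [[ADOT]]: AWS Distro for OpenTelemetry, a unified agent that automatically detects the application's language and creates a preconfigured Collector
- [[OTLP]]: OpenTelemetry Protocol, the unified data format
- [[Fluent-Bit]]: lightweight log processor used to collect and forward logs in EKS environments
- [[Metrics]]: aggregated statistics from a source, used for monitoring and alerting
- [[Logs]]: log records, used for root cause analysis
- [[Traces]]: traces, which give a complete view of a request as it moves through the system
- [[Trace-Span]]: a span within a trace, carrying a start time, a duration, and metadata
- [[可观测性三大支柱|Three Pillars of Observability]]: Metrics, Logs, and Traces, and how they relate to one another

## Key Entities

- [[AWS]]: cloud provider offering observability services such as CloudWatch, X-Ray, and ADOT
- [[Amazon-OpenSearch-Service]]: managed OpenSearch service used to store and visualize log and trace data
- [[EKS]]: Amazon Elastic Kubernetes Service, the platform that runs the demo application
- [[Grafana]]: open-source visualization platform that can display metrics and logs
- [[Prometheus]]: open-source monitoring system used for metrics collection
- [[CloudWatch]]: AWS-native monitoring service
- [[X-Ray]]: AWS distributed tracing service
- [[Jay-Comer]]: AWS Solutions Architect and the session's speaker

## Connections

- [[AWS]] ← provides ← [[ADOT]]
- [[EKS]] ← hosts ← [[Fluent-Bit]] → forwards to → [[OpenTelemetry-Collector]]
- [[OpenTelemetry-Collector]] → exports to → [[Amazon-OpenSearch-Service]]
- [[Prometheus]] ← integrates with ← [[Grafana]]
- [[Metrics]] ← part of ← [[可观测性三大支柱|Three Pillars of Observability]]
- [[Logs]] ← part of ← [[可观测性三大支柱|Three Pillars of Observability]]
- [[Traces]] ← part of ← [[可观测性三大支柱|Three Pillars of Observability]]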
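
The Fluent Bit → Collector → OpenSearch flow listed above can be illustrated with a toy model of how Receivers, Processors, and Exporters hand data to one another. This is only a conceptual sketch in Python: the real Collector is a Go binary configured through YAML, and every name here (`fluent_bit_receiver`, `add_cluster_attribute`, `InMemoryExporter`, `run_pipeline`, the `demo-eks` value) is hypothetical and does not come from the session. Extensions are left out because they sit outside the data path.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List

# In this toy model a telemetry record is a plain dict with a body and attributes.
Record = Dict[str, Any]
Processor = Callable[[Record], Record]


def fluent_bit_receiver(lines: Iterable[str]) -> List[Record]:
    """Hypothetical receiver: wraps raw log lines as records."""
    return [{"body": line, "attributes": {}} for line in lines]


def add_cluster_attribute(record: Record) -> Record:
    """Hypothetical processor: tags a record with an assumed cluster name."""
    attributes = dict(record["attributes"])
    attributes["k8s.cluster.name"] = "demo-eks"  # made-up value for illustration
    return {**record, "attributes": attributes}


@dataclass
class InMemoryExporter:
    """Hypothetical exporter: stands in for a backend such as Amazon OpenSearch Service."""
    exported: List[Record] = field(default_factory=list)

    def export(self, records: List[Record]) -> None:
        self.exported.extend(records)


def run_pipeline(
    lines: Iterable[str],
    processors: List[Processor],
    exporter: InMemoryExporter,
) -> List[Record]:
    """Receive records, apply each processor in order, then export the result."""
    records = fluent_bit_receiver(lines)
    for process in processors:
        records = [process(record) for record in records]
    exporter.export(records)
    return records


exporter = InMemoryExporter()
run_pipeline(["GET /health 200", "POST /orders 500"], [add_cluster_attribute], exporter)
print(len(exporter.exported))  # 2
```

Keeping each stage a small, independent piece mirrors the point made in the session: sources and backends can be swapped without touching the stages in between.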