---
title: "Public Cloud Learning Sessions - Observability with OpenTelemetry - 20240402"
type: source
tags:
- OpenTelemetry
- Observability
- AWS
- EKS
- DevOps
date: 2024-04-02
---

## Source File
- [[raw/Cloud & DevOps/Public-Cloud-Learning-Sessions/04_EKS/public-cloud-learning-sessions-observability-with-opentelemetry-20240402-160113-.md]]

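## Summary
- Core topic: applying the OpenTelemetry observability framework in AWS environments
- Problem domain: system observability challenges in microservice architectures
- Method/mechanism: the three signals (Metrics, Logs, Traces) + unified collection with OpenTelemetry + AWS Distro for OpenTelemetry (ADOT)
- Conclusion/value: OpenTelemetry provides a vendor-neutral standard for end-to-end telemetry collection, reducing the complexity of multi-vendor integration
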
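## Key Claims
- Observability measures how well a system's internal state can be inferred from its external outputs. Gartner estimates an average of 87 hours of downtime per year, at a cost of $42,000 per hour
- OpenTelemetry addresses the problem of every component shipping its own SDK by providing a unified data format and SDKs for 11 languages
- AWS Distro for OpenTelemetry is a unified agent that can auto-detect the application language and create a preconfigured Collector
- The OpenTelemetry Collector standardizes data through four component types: Receivers, Processors, Exporters, and Extensions
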
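The Collector claim above can be made concrete with a toy model of the data path. This is a minimal sketch, not the Collector's real API: the `Pipeline` class, the function names, and the `demo-eks` attribute are all hypothetical and chosen only for illustration. Extensions are left out because they provide supporting capabilities (such as health checks) rather than handling telemetry directly.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Toy model only: these names are hypothetical and are not the Collector's API.
Record = dict


@dataclass
class Pipeline:
    receivers: list[Callable[[], Iterable[Record]]]
    processors: list[Callable[[Record], Record]]
    exporters: list[Callable[[Record], None]]

    def run(self) -> int:
        """Pull from every receiver, apply processors in order, fan out to exporters."""
        count = 0
        for receive in self.receivers:
            for record in receive():
                for process in self.processors:
                    record = process(record)
                for export in self.exporters:
                    export(record)
                count += 1
        return count


def app_log_receiver() -> Iterable[Record]:
    # Stand-in for a receiver that accepts logs from an agent such as Fluent Bit.
    yield {"body": "checkout failed", "level": "error"}
    yield {"body": "checkout ok", "level": "info"}


def add_cluster_attribute(record: Record) -> Record:
    # Stand-in for a processor that enriches records with shared metadata.
    return {**record, "cluster": "demo-eks"}


# Stand-in for an exporter: collect records in memory instead of shipping them.
exported: list[Record] = []

pipeline = Pipeline(
    receivers=[app_log_receiver],
    processors=[add_cluster_attribute],
    exporters=[exported.append],
)
processed = pipeline.run()
```

Because each stage only sees plain records, a backend can be swapped by changing the exporter list alone, which is the vendor-neutrality argument the session makes.
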
## Key Quotes
> "Observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs." - definition given in the OpenText learning session

> "The output that Fluent Bit is sending the individual logs to is the OpenTelemetry endpoint on the port 55681." - demo walkthrough

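To give a feel for what a log arriving at that endpoint might look like, here is a rough sketch that wraps one log line in an OTLP/JSON-style envelope and prepares an HTTP POST for it. Only the port number (55681) comes from the session; the host, the `/v1/logs` path, the `demo-app` service name, and the helper function names are assumptions for illustration. It is not a reproduction of what Fluent Bit itself emits in the demo.

```python
import json
import time
import urllib.request

# Assumptions for illustration: the host, service name, and the /v1/logs path.
# Only the port number (55681) comes from the session notes.
OTLP_LOGS_URL = "http://localhost:55681/v1/logs"


def build_log_payload(message: str, service_name: str = "demo-app") -> dict:
    """Wrap one log line in an OTLP/JSON-style logs envelope."""
    return {
        "resourceLogs": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeLogs": [
                    {
                        "logRecords": [
                            {
                                "timeUnixNano": str(time.time_ns()),
                                "severityText": "INFO",
                                "body": {"stringValue": message},
                            }
                        ]
                    }
                ],
            }
        ]
    }


def build_request(payload: dict, url: str = OTLP_LOGS_URL) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST carrying the payload."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )


request = build_request(build_log_payload("order 42 created"))
# To actually send it, a Collector must be listening on that address:
# urllib.request.urlopen(request)
```

The request is built but deliberately not sent, so the sketch runs without a Collector present.
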
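## Key Concepts
- [[OpenTelemetry]]: vendor-neutral telemetry collection framework that provides a unified data format and cross-language SDKs
- [[OpenTelemetry-Collector]]: data standardization component made up of Receivers, Processors, Exporters, and Extensions
- [[ADOT]]: AWS Distro for OpenTelemetry, a unified agent that auto-detects the application language and creates a preconfigured Collector
- [[OTLP]]: OpenTelemetry Protocol, the unified data format
- [[Fluent-Bit]]: lightweight log processor used to collect and forward logs in EKS environments
- [[Metrics]]: aggregated statistics from the source, used for monitoring and alerting
- [[Logs]]: log records, used for root-cause analysis
- [[Traces]]: traces, which give an end-to-end view of a request as it moves through the system
- [[Trace-Span]]: a span within a trace, carrying a start time, a duration, and metadata
- [[可观测性三大支柱|Three Pillars of Observability]]: Metrics, Logs, and Traces and how they relate to one another
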
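The span description above (start time, duration, metadata) can be sketched as a small data structure. This is a hypothetical minimal model, not the OpenTelemetry SDK's span class: the `trace_id`, `span_id`, and `parent_id` fields and all sample values are assumptions added so that several spans can be tied into one trace.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    # Hypothetical minimal model, not the OpenTelemetry SDK's Span class.
    name: str
    trace_id: str
    span_id: str
    start_ms: int
    duration_ms: int
    parent_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)  # free-form metadata

    @property
    def end_ms(self) -> int:
        return self.start_ms + self.duration_ms


def trace_duration_ms(spans: list[Span]) -> int:
    """Wall-clock length of a trace: earliest start to latest end."""
    return max(s.end_ms for s in spans) - min(s.start_ms for s in spans)


spans = [
    Span("GET /checkout", "t1", "a", start_ms=0, duration_ms=120),
    Span("charge-card", "t1", "b", start_ms=10, duration_ms=80, parent_id="a",
         attributes={"payment.provider": "example"}),
    Span("write-order", "t1", "c", start_ms=95, duration_ms=20, parent_id="a"),
]
total = trace_duration_ms(spans)
children_of_root = [s.name for s in spans if s.parent_id == "a"]
```

Grouping spans by a shared trace identifier and linking them through a parent reference is what lets a trace show the full path of one request across services.
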
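## Key Entities
- [[AWS]]: cloud provider offering observability services such as CloudWatch, X-Ray, and ADOT
- [[Amazon-OpenSearch-Service]]: managed OpenSearch service used to store and visualize logs and trace data
- [[EKS]]: Amazon Elastic Kubernetes Service, the platform running the demo application
- [[Grafana]]: open-source visualization platform for displaying metrics and logs
- [[Prometheus]]: open-source monitoring system used for metrics collection
- [[CloudWatch]]: AWS-native monitoring service
- [[X-Ray]]: AWS distributed tracing service
- [[Jay-Comer]]: AWS Solutions Architect, the speaker
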
## Connections
- [[AWS]] → provides → [[ADOT]]
- [[EKS]] → hosts → [[Fluent-Bit]] → forwards to → [[OpenTelemetry-Collector]]
- [[OpenTelemetry-Collector]] → exports to → [[Amazon-OpenSearch-Service]]
- [[Prometheus]] → integrates with → [[Grafana]]
- [[Metrics]] → part of → [[可观测性三大支柱|Three Pillars of Observability]]
- [[Logs]] → part of → [[可观测性三大支柱|Three Pillars of Observability]]
- [[Traces]] → part of → [[可观测性三大支柱|Three Pillars of Observability]]