title: CTP Topic 42 Grafana Observability dashboard
type: cloud-learning
source-type: video
category: DevOps & SRE/04_EKS
tags: Grafana, Observability, Dashboard, CTP
date-added: 2026-04-14
video-source: nas:///volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4
audio-source:
status: summarized (Gemini summary)

CTP Topic 42 Grafana Observability dashboard

Source: NAS /volume2/work/Public Cloud Learning Sessions/CTP _ Topic 42_ Grafana_Observability dashboard.mp4

Type: VIDEO | Category: 04_EKS

Status: 🟡 Awaiting Whisper transcription → Summary


Summary

Grafana Observability and Dashboards

Grafana is an open-source web application for visualizing data through charts and dashboards. It supports various data sources, including metrics (CPU load, memory usage) and logs (timestamps, debug levels). Data producers such as Jenkins, CA servers, and AWS CloudWatch feed data into these sources, which Grafana then visualizes. Grafana does not store data itself; everything it displays is queried from its configured data sources.

In the infrastructure architecture, users reach Grafana through a load balancer and auto-scaling groups. Grafana is installed in a monitoring account and configured to access other product team AWS accounts via IAM role policies. A Grafana monitoring role, provisioned from a Terraform service catalog repo, is assumed to gain access to the various landing zone source accounts.
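The cross-account access described above usually rests on an IAM trust policy in each product team account. The following is a minimal sketch of what such a policy document might look like, assuming the role in the monitoring account is called `grafana-monitoring`; the account ID and role name are placeholders, not values from the session.

```python
import json


def build_trust_policy(monitoring_account_id: str, grafana_role_name: str) -> dict:
    """Build a trust policy for a role that lives in a product team account.

    The policy lets one named role in the monitoring account call
    sts:AssumeRole on the role this policy is attached to.
    """
    principal_arn = f"arn:aws:iam::{monitoring_account_id}:role/{grafana_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and role name (assumptions for illustration).
    policy = build_trust_policy("111111111111", "grafana-monitoring")
    print(json.dumps(policy, indent=2))
```

In the setup the session describes, a document like this would presumably be rendered by Terraform rather than by hand; the Python version only illustrates the shape of the policy.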

Grafana offers user-level and team-level access controls, with roles like editor, viewer, and admin. Data sources are created with specific ARNs to access AWS accounts. Dashboards are dynamic, fetching data based on product team access. A sample dashboard includes CPU, I/O, network, EBS, and estimated charges monitoring. Alerting systems can be configured to notify channels like Microsoft Teams of high CPU usage or service downtime.
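As an illustration of a data source created with a specific ARN, here is a hedged sketch that builds one CloudWatch data source definition per product team. The `jsonData` keys (`authType`, `defaultRegion`, `assumeRoleArn`) follow Grafana's CloudWatch data source settings; the team names, account IDs, and role name are hypothetical.

```python
def cloudwatch_datasource(team: str, account_id: str, role_name: str, region: str) -> dict:
    """Return a CloudWatch data source definition that assumes a role
    in the given product team account."""
    return {
        "name": f"cloudwatch-{team}",
        "type": "cloudwatch",
        "access": "proxy",
        "jsonData": {
            "authType": "default",
            "defaultRegion": region,
            "assumeRoleArn": f"arn:aws:iam::{account_id}:role/{role_name}",
        },
    }


# Hypothetical team-to-account mapping, for illustration only.
TEAM_ACCOUNTS = {
    "payments": "222222222222",
    "search": "333333333333",
}

datasources = [
    cloudwatch_datasource(team, account_id, "grafana-monitoring", "eu-west-1")
    for team, account_id in sorted(TEAM_ACCOUNTS.items())
]

if __name__ == "__main__":
    for ds in datasources:
        print(ds["name"], "->", ds["jsonData"]["assumeRoleArn"])
```

Keeping one data source per account is what would let dashboards stay generic while the data they show depends on which data source a team can access.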

Terraform and Automation

Terraform is used to automate Grafana resource provisioning. Modules exist for data sources and Grafana organizations. A demo scenario simulates onboarding Grafana for a new product group account using LZSAP. The process involves creating folders, calling modules, and using JSON input variables to define organization names and user access.
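The session does not show the exact shape of the JSON input variables, so the following is only a sketch under an assumed schema (`org_name` plus a `users` list with `email` and `role`). It shows the kind of validation that could run before the input is handed to the Terraform modules; the role names mirror Grafana's Admin, Editor, and Viewer roles.

```python
import json

ALLOWED_ROLES = {"Admin", "Editor", "Viewer"}


def parse_org_input(raw: str) -> dict:
    """Parse and validate a hypothetical onboarding JSON document.

    Users without an explicit role default to Viewer, the least
    privileged of the three roles.
    """
    data = json.loads(raw)
    org_name = str(data.get("org_name", "")).strip()
    if not org_name:
        raise ValueError("org_name is required")

    users = []
    for entry in data.get("users", []):
        role = entry.get("role", "Viewer")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role {role!r} for {entry.get('email')}")
        users.append({"email": entry["email"], "role": role})
    return {"org_name": org_name, "users": users}


if __name__ == "__main__":
    sample = """
    {
      "org_name": "payments",
      "users": [
        {"email": "alice@example.com", "role": "Editor"},
        {"email": "bob@example.com"}
      ]
    }
    """
    print(parse_org_input(sample))
```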

Dashboards are provisioned with data sources and regions as inputs. Grafana offers flexibility in dashboard layout and data visualization. Product teams can leverage these modules and customize dashboards with application-specific logs or custom CloudWatch metrics.
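To make "data sources and regions as inputs" concrete, here is a simplified sketch that generates a dashboard definition with one CPU panel per region. The field names loosely follow Grafana's dashboard JSON model and CloudWatch query format and may need adjusting for a given Grafana version; the dashboard title and data source name are assumptions.

```python
import json


def cpu_panel(datasource: str, region: str, panel_id: int) -> dict:
    """One time series panel showing average EC2 CPU utilization in a region."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": f"EC2 CPU utilization ({region})",
        "datasource": datasource,
        "targets": [
            {
                "refId": "A",
                "region": region,
                "namespace": "AWS/EC2",
                "metricName": "CPUUtilization",
                "statistic": "Average",
            }
        ],
    }


def build_dashboard(title: str, datasource: str, regions: list) -> dict:
    """Assemble a dashboard with one CPU panel per input region."""
    panels = [
        cpu_panel(datasource, region, panel_id)
        for panel_id, region in enumerate(regions, start=1)
    ]
    return {"title": title, "panels": panels}


if __name__ == "__main__":
    dashboard = build_dashboard(
        "Payments overview", "cloudwatch-payments", ["eu-west-1", "us-east-1"]
    )
    print(json.dumps(dashboard, indent=2))
```

A product team could extend the same pattern with panels for application logs or custom CloudWatch metrics.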

Network Monitoring and Roadmap

Network monitoring is achieved using Prometheus as a data source for checkpoint and firewall instances. A tool called norm is referenced to fetch metrics via the SNMP protocol. Key dashboards display packet in/out transfers, interface metrics, and CPU/disk usage.
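Packet and interface metrics collected over SNMP are typically monotonically increasing counters, so a dashboard shows their rate of change rather than the raw value. The sketch below illustrates that arithmetic for two samples, assuming a 32-bit counter that wraps around; it is not the tooling used in the session, and Prometheus's own `rate()` function performs a comparable calculation over a range of samples.

```python
def counter_rate(prev_value: int, prev_ts: float,
                 curr_value: int, curr_ts: float,
                 counter_bits: int = 32) -> float:
    """Per-second rate between two samples of a wrapping counter.

    Timestamps are in seconds. A drop in value is interpreted as a
    single wraparound of a counter with `counter_bits` bits.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")

    delta = curr_value - prev_value
    if delta < 0:
        # The counter passed its maximum and restarted from zero.
        delta += 2 ** counter_bits
    return delta / elapsed


if __name__ == "__main__":
    # 3000 packets in 30 seconds.
    print(counter_rate(1000, 0.0, 4000, 30.0))
    # Counter wrapped between the two samples.
    print(counter_rate(2 ** 32 - 100, 0.0, 200, 10.0))
```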

The roadmap includes implementing alerting and notification rules, refining network monitoring dashboards, building application-specific dashboards, and enabling product groups to consume Grafana Terraform modules. The goal is to replace Micro Focus tools with Grafana for end-to-end monitoring. The application-specific dashboards are meant to surface key insights about the applications running in the monitored accounts.

Grafana offers open-source and paid versions (Grafana Enterprise and Grafana Cloud). User management is currently within the Grafana database but will move to LDAP or SSO.


Key Concepts


Action Items


Related Videos

Paired video note link (to be filled in after generation)


Last updated: 2026-04-14