Observability-Basic

Uploaded by

suresh

Observability
What is Observability?

Observability helps us understand the internal state of a system or application in
detail. It provides insights into applications, infrastructure, and network details,
acting like a "CCTV" that monitors everything and delivers information through tools
like Prometheus, Grafana, and others. Observability not only shows the current state
of an application but also helps determine why it is in that state and how to fix issues
effectively.

Use Cases of Observability:

1. Monitoring disk utilization for a specific node.

2. Checking CPU utilization of a system or application.
3. Measuring memory usage of an application or node.
4. Analyzing HTTP requests to see how many succeeded or failed out of a given
set (e.g., 100 requests).
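
As a minimal sketch of the last use case, the snippet below counts successes and failures in a batch of HTTP status codes. The function name, the sample data, and the rule that any code of 400 or above counts as a failure are assumptions made for illustration.

```python
from collections import Counter


def summarize_status_codes(status_codes):
    """Count successes and failures in a list of HTTP status codes.

    Assumption: codes below 400 are successes, 400 and above are failures.
    """
    outcomes = Counter(
        "success" if code < 400 else "failure" for code in status_codes
    )
    total = len(status_codes)
    # Guard against an empty batch to avoid dividing by zero.
    success_rate = outcomes["success"] / total if total else 0.0
    return {
        "total": total,
        "success": outcomes["success"],
        "failure": outcomes["failure"],
        "success_rate": success_rate,
    }


# Hypothetical batch of 100 requests: 93 OK, 4 not found, 3 server errors.
sample = [200] * 93 + [404] * 4 + [500] * 3
summary = summarize_status_codes(sample)
print(summary)
```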

The Three Pillars of Observability:

To understand why certain events occur, we need to explore the three main pillars of
observability:

1. Metrics
Metrics provide a numerical representation of an application's state, showing
what is currently happening inside it. For example, metrics can display CPU
utilization, memory usage, or the number of requests processed. They also
include historical data, allowing us to examine the application's state at a
specific time and date and to understand the health of the system over time.

2. Logs
Logs are detailed records of system events, helping us understand why a
particular state occurred. They provide granular details about what happened
during a specific event or error.

3. Traces
Traces illustrate how requests flow through the application, showing the path
taken and pinpointing where issues arise. For example, they can track the flow
of a request from the client to the backend services and identify bottlenecks or
failures in the process.

Example:

Suppose you want to investigate what happened yesterday at 10:00 AM with a
specific request:

● Metrics: You can check metrics to analyze the resource utilization (e.g., CPU,
memory) during that time.
● Logs: You can examine the logs to find out why the application was in a
particular state and identify the exact issue.
● Traces: You can trace the request flow to see the sequence of interactions and
pinpoint where the issue occurred. For instance:
Client → Load Balancer → Frontend → Backend → Database
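
The request path above can be modeled as a list of spans, one per hop after the client. This is only a sketch: the Span class, the service names, and the durations are hypothetical, and real tracing systems record far more detail (trace IDs, parent spans, timestamps).

```python
from dataclasses import dataclass


@dataclass
class Span:
    """One hop of a request: which service handled it and how it went."""
    service: str
    duration_ms: float
    ok: bool = True


def slowest_span(trace):
    """Return the span that took the longest (a likely bottleneck)."""
    return max(trace, key=lambda span: span.duration_ms)


def first_failure(trace):
    """Return the first span that failed, or None if every hop succeeded."""
    return next((span for span in trace if not span.ok), None)


# Hypothetical trace for one request, in the order the hops were visited.
trace = [
    Span("load-balancer", 2.0),
    Span("frontend", 15.0),
    Span("backend", 40.0),
    Span("database", 950.0, ok=False),
]

print("slowest hop:", slowest_span(trace).service)
print("first failure:", first_failure(trace).service)
```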

Monitoring vs. Observability

● Monitoring
Monitoring involves setting up metrics, alerts, and dashboards to track the
performance and health of an application or system. It focuses on predefined
metrics and conditions, such as CPU utilization, memory usage, or disk space.
Monitoring is primarily reactive, helping you detect and respond to known issues
based on pre-configured thresholds.
● Observability
Observability is a broader concept that encompasses everything about an
application. It not only collects metrics but also includes logs, traces, and other
data to provide a deep understanding of the system's internal state.
Observability is proactive, enabling you to investigate and resolve unknown
issues, uncover root causes, and understand the system's behavior in depth.

In summary:

● Monitoring helps you track and respond to known issues.

● Observability helps you understand and diagnose both known and unknown
issues comprehensively.

Metrics Use Cases

Metrics provide numerical data that helps monitor and analyze the performance and
health of applications, infrastructure, and systems. Here are some key use cases:

1. CPU Utilization: Track the CPU usage of a node or an EC2 instance in a cluster.
2. Memory Usage: Monitor memory consumption of an AWS virtual machine or
instance.
3. Pod Status: Check for specific Kubernetes pod statuses, such as
"CrashLoopBackOff."
4. Replica Count: Observe the number of replicas running at a particular time.
5. HTTP Requests: Monitor how many HTTP requests are received by an
application.
6. New User Signups: Count how many new users signed up during a specific
period.
7. Signup Time: Track when signups happen, for example as a count per hour
(the exact timestamp of a single signup is usually better recorded in a log).

Data Collection Mechanisms

Metrics are collected using two primary mechanisms:

● Push Mechanism
○ Systems or applications actively send (push) metrics data to a central
monitoring system.
● Pull Mechanism (Scraping)
○ The monitoring system periodically retrieves (pulls) metrics data from
applications or services. Tools like Prometheus use this mechanism to
scrape metrics endpoints.
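
To make the pull mechanism concrete, here is a small sketch of what a scraped metrics endpoint returns. It renders two made-up metrics in the Prometheus text exposition format (HELP and TYPE comment lines followed by the sample). The metric names and values are hypothetical, and a plain function call stands in for the HTTP request a real scraper would make.

```python
def render_metrics(metrics):
    """Render metrics in the Prometheus text exposition format.

    `metrics` maps a metric name to a (type, help text, value) tuple.
    """
    lines = []
    for name, (metric_type, help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def scrape(target):
    """Pull model: the monitoring side asks the target for its metrics.

    Assumption: `target` is any callable returning the metrics text; in a
    real setup this would be an HTTP GET against the metrics endpoint.
    """
    return target()


# Hypothetical application metrics.
app_metrics = {
    "http_requests_total": ("counter", "Total HTTP requests received.", 1027),
    "memory_usage_bytes": ("gauge", "Resident memory in bytes.", 52428800),
}

body = scrape(lambda: render_metrics(app_metrics))
print(body)
```

In the push model the direction is reversed: the application would send this same text to the monitoring system instead of waiting to be asked.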

Monitoring and Alerts

Monitoring systems use metrics to create alerts and notify teams when certain
conditions are met.

Example Alerts:

● CPU utilization exceeds 70%.

● Disk utilization is above 60%.
● Application latency exceeds an acceptable threshold more than 30 times in a day.

Alert Notifications:
When an alert condition is met, the monitoring system sends notifications to channels
like Slack, Gmail, or other messaging tools, allowing teams to respond promptly.
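
The threshold checks behind such alerts can be sketched as follows. The metric names and current readings are hypothetical, and a real monitoring system would also deliver the resulting messages to a channel rather than just print them.

```python
def evaluate_alerts(current, thresholds):
    """Return a message for every metric whose value exceeds its threshold."""
    fired = []
    for name, limit in thresholds.items():
        value = current.get(name)
        # Skip metrics with no reading; fire only when strictly above the limit.
        if value is not None and value > limit:
            fired.append(f"{name} is {value}, above threshold {limit}")
    return fired


# Thresholds taken from the example alerts above.
thresholds = {"cpu_percent": 70, "disk_percent": 60, "slow_requests_per_day": 30}

# Hypothetical current readings.
current = {"cpu_percent": 82, "disk_percent": 41, "slow_requests_per_day": 30}

alerts = evaluate_alerts(current, thresholds)
for message in alerts:
    print("ALERT:", message)
```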

Prometheus
Prometheus is an open-source monitoring and alerting tool designed for collecting and
analyzing metrics from various sources.

How Prometheus Works

● Pull Metrics
Prometheus collects metrics by pulling (scraping) data from:
○ Exporters: Specialized tools that expose metrics from systems like
databases, servers, or hardware.
○ Endpoints: Applications or services expose metrics over HTTP endpoints.
○ Kube State Metrics: Kubernetes-specific metrics, such as pod states,
node conditions, and deployment statuses.

● Storing and Querying Data

The scraped metrics are stored in Prometheus’s time-series database. You can
query this data using PromQL, Prometheus's query language.
● Dashboard Representation
While Prometheus itself has a basic UI, it is often integrated with tools like
Grafana to create more advanced and visually appealing dashboards. These
dashboards help visualize metrics data for analysis.

● Alerting
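○ Alerting rules are defined in Prometheus and evaluated against the stored
metrics (e.g., CPU utilization > 70%). When a rule fires, Prometheus sends
the alert to Alertmanager, which groups, silences, and routes alerts.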
○ Alertmanager can send notifications to various channels, such as Slack,
email, PagerDuty, or others, ensuring timely responses to issues.
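
To illustrate what querying time-series data means, the sketch below computes the average per-second increase of a counter from a handful of scraped samples. It is a simplified stand-in for the kind of calculation PromQL's rate() performs: it ignores counter resets and the extrapolation Prometheus applies, and the sample values are hypothetical.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over (timestamp, value) samples.

    Simplification: uses only the first and last sample and assumes the
    counter never resets within the window.
    """
    if len(samples) < 2:
        return None
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    if t_last <= t_first:
        return None
    return (v_last - v_first) / (t_last - t_first)


# Hypothetical request counter scraped every 15 seconds: (seconds, total).
samples = [(0, 100), (15, 130), (30, 175), (45, 190), (60, 220)]

rate = per_second_rate(samples)
print("requests per second:", rate)
```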

Example Workflow:

● Prometheus scrapes metrics from a Kubernetes pod’s metrics endpoint.

● The data is visualized in a Grafana dashboard, showing trends like CPU or
memory usage.
● An alerting rule in Prometheus fires if CPU utilization exceeds 80%, and
Alertmanager notifies the team via Slack.

Conclusion
Observability is essential for managing and optimizing modern, complex systems. By
leveraging metrics, logs, and traces, it provides engineers with the tools to gain deeper
insights into system performance and behavior.
Thank you for today. Tomorrow I will cover how to set up Prometheus and Grafana for a
cluster.
