monitoring tool (1)
This activity requires implementing a monitoring tool for the Justice Sector servers'
system resources, such as CPU usage, memory consumption, I/O, network, disk usage,
processes, and application availability. We will collect metrics from all servers and
application systems operated by the Justice Sector stakeholders. These metrics support
capacity planning by showing how server resources are actually used. The tool also
automates server monitoring and helps identify performance issues such as high
resource utilization, application downtime, and slow response times.
To avoid learning about issues from users days later, we need an overview system that
collects metrics from all systems and answers questions such as:
• Is the server/service alive?
• How is the Requests Per Second (RPS) changing?
• What is the response time?
• How long is the SSL certificate for the web application still valid?
• Is there enough memory and CPU?
• How much disk space is left?
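Two of the questions above (remaining disk space and remaining certificate validity)
can be illustrated with a minimal Python sketch that uses only the standard library.
This is not the deployed tooling; the function names disk_free_percent and
cert_days_remaining are hypothetical and chosen for illustration only.

```python
import shutil
import ssl
import time


def disk_free_percent(path="/"):
    """Return the percentage of free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def cert_days_remaining(not_after, now=None):
    """Return the number of days until a certificate expires.

    `not_after` is the expiry string in the format found in the "notAfter"
    field of ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    `now` is a Unix timestamp; it defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


if __name__ == "__main__":
    print(f"Free disk space on /: {disk_free_percent('/'):.1f}%")
```

A negative result from cert_days_remaining means the certificate has already expired,
which is the case an alert would need to catch well in advance.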
For this purpose, we chose a popular solution: collecting metrics using Prometheus,
displaying them on dashboards in Grafana, and sending alerts to support emails and the
Zammad tool through Prometheus Alertmanager. Going forward, we plan to make it easy to
escalate alerts to the mobile phones of the key responsible individuals.
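As an illustration of how the "is the server alive?" question can be answered from
Prometheus data, the sketch below parses the JSON body returned by an instant query
for the built-in `up` metric (for example, GET /api/v1/query?query=up on the Prometheus
server). The function name down_targets and the sample instance names are hypothetical;
only the response layout follows the Prometheus HTTP API.

```python
import json

# Hypothetical response body for the instant query `up`; host names are made up.
SAMPLE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "instance": "web01:9100", "job": "node"},
       "value": [1700000000.0, "1"]},
      {"metric": {"__name__": "up", "instance": "app01:9100", "job": "node"},
       "value": [1700000000.0, "0"]}
    ]
  }
}
"""


def down_targets(response_body):
    """Return the instances whose `up` sample is 0, i.e. failed scrapes."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    down = []
    for sample in payload["data"]["result"]:
        # Each sample value is a [timestamp, "value-as-string"] pair.
        _timestamp, value = sample["value"]
        if float(value) == 0.0:
            down.append(sample["metric"].get("instance", "unknown"))
    return down


if __name__ == "__main__":
    print(down_targets(SAMPLE))
```

In practice this check would normally live in a Prometheus alerting rule rather than
in a script; the sketch only shows what the underlying data looks like.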
The following metrics are collected and visualized in three different dashboards:
1. Server metrics
a. CPU load
b. Memory consumption
c. Disk usage
2. Application metrics
a. Application state
b. Number of processed requests
c. HTTP response codes
d. Request processing time
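The number of processed requests is typically exposed as an ever-increasing counter,
so the request rate has to be derived from two samples. The sketch below shows that
arithmetic, including the case where the counter restarts from zero after an
application restart. It is a simplified illustration, not Prometheus's own rate()
implementation, and the function name requests_per_second is hypothetical.

```python
def requests_per_second(prev_count, prev_time, curr_count, curr_time):
    """Average request rate between two samples of a request counter.

    Counts are cumulative totals; times are Unix timestamps in seconds.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_count - prev_count
    if delta < 0:
        # The counter went backwards, so assume it was reset to zero
        # (e.g. the application restarted) and count from there.
        delta = curr_count
    return delta / elapsed


if __name__ == "__main__":
    # 300 requests handled over 60 seconds.
    print(requests_per_second(100, 0, 400, 60))
```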
The following dashboards are currently implemented:
1. Main Dashboard
2. Server Dashboard
3. Application Dashboard
Road Map