Prometheus with Grafana

Prometheus is an open-source tool that can monitor nearly everything, and it comes with a large number of integrations.

Before you start, please have a look at the config files needed here:

https://github.com/globalsquadproject/monitoring


# Download the Prometheus Server package
wget https://github.com/prometheus/prometheus/releases/download/v2.40.1/prometheus-2.40.1.linux-amd64.tar.gz
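Before extracting the archive, it can be worth comparing it against the checksum published alongside the release (the release page normally includes a sha256sums.txt file). The helper below is only a minimal sketch: the function name verify_sha256 is hypothetical, and the expected hash has to be copied from the release page, since it is not reproduced here.

```shell
# verify_sha256 FILE EXPECTED_HASH
# Hypothetical helper: succeeds only when the SHA-256 of FILE equals EXPECTED_HASH.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Usage (EXPECTED_HASH is a placeholder for the value from sha256sums.txt):
# verify_sha256 prometheus-2.40.1.linux-amd64.tar.gz EXPECTED_HASH && echo "checksum OK"
```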

# Add Prometheus User
useradd --no-create-home --shell /bin/false prometheus

# Create the directories and change ownership
mkdir /etc/prometheus
mkdir /var/lib/prometheus
chown prometheus:prometheus /etc/prometheus
chown prometheus:prometheus /var/lib/prometheus

# Extract the Prometheus archive and rename the directory
tar -xvzf prometheus-2.40.1.linux-amd64.tar.gz
mv prometheus-2.40.1.linux-amd64 prometheuspackage

# Copy “prometheus” and “promtool” binary and change ownership

cp prometheuspackage/prometheus /usr/local/bin/
cp prometheuspackage/promtool /usr/local/bin/
chown prometheus:prometheus /usr/local/bin/prometheus
chown prometheus:prometheus /usr/local/bin/promtool

# Copy “consoles” and “console_libraries”
cp -r prometheuspackage/consoles /etc/prometheus
cp -r prometheuspackage/console_libraries /etc/prometheus
chown -R prometheus:prometheus /etc/prometheus/consoles
chown -R prometheus:prometheus /etc/prometheus/console_libraries

# Create the configuration file with the following content
vi /etc/prometheus/prometheus.yml

global:
  scrape_interval: 10s

scrape_configs:
  - job_name: 'prometheus_master'
    scrape_interval: 5s
    static_configs:
      - targets: ['SERVER_IP:9090']


# Change the ownership
chown prometheus:prometheus /etc/prometheus/prometheus.yml
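A syntax error in prometheus.yml will keep the server from starting, so it can help to validate the file before creating the service. The promtool binary copied to /usr/local/bin earlier has a "check config" subcommand for this. The wrapper below is a small sketch, and the function name check_prometheus_config is hypothetical.

```shell
# check_prometheus_config [FILE]
# Hypothetical wrapper around "promtool check config".
# Defaults to the configuration path used in this post.
check_prometheus_config() {
    local config="${1:-/etc/prometheus/prometheus.yml}"
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found in PATH" >&2
        return 127
    fi
    promtool check config "$config"
}

# Usage:
# check_prometheus_config && echo "config looks valid"
```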

# Create the systemd unit file with the following content
vi /etc/systemd/system/prometheus.service

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target

# Start the service
systemctl daemon-reload
systemctl start prometheus
systemctl status prometheus

Access the Prometheus server using this URL: http://SERVER_IP:9090/graph
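If you prefer to confirm from the command line that the server is up, Prometheus exposes a /-/ready endpoint that answers with success once it is ready to serve traffic. The loop below is a rough sketch; the function name wait_for_prometheus and the default of 10 attempts are assumptions made for illustration.

```shell
# wait_for_prometheus BASE_URL [ATTEMPTS]
# Hypothetical helper: polls BASE_URL/-/ready once per second until it
# answers with success or the attempts run out.
wait_for_prometheus() {
    local base_url="$1" attempts="${2:-10}" i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -fsS -o /dev/null "$base_url/-/ready"; then
            echo "Prometheus is ready"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "Prometheus did not become ready after $attempts attempts" >&2
    return 1
}

# Usage (SERVER_IP is the placeholder used throughout this post):
# wait_for_prometheus http://SERVER_IP:9090
```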

Node Exporter

The node exporter is an important companion to Prometheus. It collects metrics from the node it runs on and exposes them on an HTTP endpoint that Prometheus scrapes to fetch the data it needs.

# Download and install Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz

tar -xvzf node_exporter-1.4.0.linux-amd64.tar.gz
useradd -rs /bin/false nodeusr
mv node_exporter-1.4.0.linux-amd64/node_exporter /usr/local/bin/

vim /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=nodeusr
Group=nodeusr
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl start node_exporter
systemctl enable node_exporter

Access the node exporter interface using this URL: http://SERVER_IP:9100/metrics
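The /metrics endpoint returns plain text with one sample per line, mixed with "# HELP" and "# TYPE" comment lines. A quick way to confirm that a particular metric is being exported is to filter that output. The function name metric_value below is hypothetical, and this sketch only handles metrics without labels.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# label-less metric NAME. Comment lines never match because their first
# field is "#".
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Usage:
# curl -s http://SERVER_IP:9100/metrics | metric_value node_memory_MemFree_bytes
```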

Log in to the Prometheus server again and add the following job under scrape_configs:
vim /etc/prometheus/prometheus.yml
  - job_name: 'node_exporter_centos'
    scrape_interval: 5s
    static_configs:
      - targets: ['CLIENT_IP:9100']


systemctl restart prometheus

Check that the new target is listed as UP: http://SERVER_IP:9090/targets

Then, on the graph page, select the metric:
node_memory_MemFree_bytes
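The same metric can also be fetched from the command line through the Prometheus HTTP API, which accepts instant queries at /api/v1/query. The helper below is a minimal sketch; the function name prom_query is hypothetical. Note that metric names are case-sensitive, so the query uses the lowercase name node_memory_MemFree_bytes.

```shell
# prom_query BASE_URL EXPRESSION
# Hypothetical helper: sends an instant query to the Prometheus HTTP API
# and prints the JSON response. --data-urlencode takes care of escaping
# characters such as { } and " in the expression.
prom_query() {
    local base_url="$1" expr="$2"
    curl -fsS --get --data-urlencode "query=$expr" "$base_url/api/v1/query"
}

# Usage:
# prom_query http://SERVER_IP:9090 'node_memory_MemFree_bytes'
# prom_query http://SERVER_IP:9090 'up{job="node_exporter_centos"}'
```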

Grafana

Grafana is an open-source tool for visualizing metrics. In our case, we are going to use Prometheus as the data source.


# Download the Grafana package
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.2.4-1.x86_64.rpm

# Install the package
sudo yum install grafana-enterprise-9.2.4-1.x86_64.rpm

# Start the service
sudo systemctl start grafana-server
sudo systemctl status grafana-server

Access Grafana using this URL: http://SERVER_IP:3000

Once you reach the Grafana login page, use the default credentials:
User: admin
Pass: admin
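The Prometheus data source can be added in the Grafana UI, or through Grafana's HTTP API at /api/datasources. The sketch below builds the JSON payload and posts it with basic authentication. The function names and the data source name "Prometheus" are assumptions for illustration, and the admin:admin credentials only work while the default password has not been changed.

```shell
# datasource_payload PROMETHEUS_URL
# Prints the JSON body for a Prometheus data source.
# The name "Prometheus" is an arbitrary choice for this example.
datasource_payload() {
    printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1"
}

# add_prometheus_datasource GRAFANA_URL PROMETHEUS_URL
# Hypothetical helper: posts the payload to Grafana's data source API.
add_prometheus_datasource() {
    local grafana_url="$1" prometheus_url="$2"
    curl -fsS -u admin:admin \
        -H 'Content-Type: application/json' \
        -X POST "$grafana_url/api/datasources" \
        -d "$(datasource_payload "$prometheus_url")"
}

# Usage:
# add_prometheus_datasource http://SERVER_IP:3000 http://SERVER_IP:9090
```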

After that, you can create your own dashboard or import one from Grafana’s website: https://grafana.com/grafana/dashboards/
I used this one:

https://grafana.com/grafana/dashboards/14513-linux-exporter-node/

You can see in the screenshot below how easy it is to identify strange behaviour in your application (the huge spike in the graph).

You can use the stress command to simulate that spike:

[root@prometheus01 ~]# stress --cpu 2 --timeout 60
stress: info: [20691] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
stress: info: [20691] successful run completed in 60s
[root@prometheus01 ~]#
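To see the spike without Grafana, you can ask Prometheus for CPU usage directly. The function below prints a commonly used PromQL expression for the percentage of non-idle CPU time per instance, based on the node exporter metric node_cpu_seconds_total. The function name cpu_busy_expr and the 1m default window are assumptions, and this is not necessarily the exact query the imported dashboard uses.

```shell
# cpu_busy_expr [WINDOW]
# Hypothetical helper: prints a PromQL expression for the percentage of
# non-idle CPU time per instance, averaged over WINDOW (default 1m).
cpu_busy_expr() {
    local window="${1:-1m}"
    printf '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[%s])) * 100)\n' "$window"
}

# Usage (run while stress is active):
# curl -fsS --get --data-urlencode "query=$(cpu_busy_expr 1m)" http://SERVER_IP:9090/api/v1/query
```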

That’s it, I hope you have learned something new. If you have any questions please don’t hesitate to reach out.
