[Monitoring] 02. 프로메테우스(Prometheus) 설치 (CentOS Stream 8 기준)
카테고리: MONITORING
태그: monitoring linux
02. 프로메테우스(Prometheus) 설치
■ prometheus 사용자 생성
{
useradd -m -s /bin/false prometheus
id prometheus
}
■ prometheus 디렉토리 생성
{
mkdir /etc/prometheus
mkdir /var/lib/prometheus
chown prometheus /var/lib/prometheus
}
■ prometheus 패키지 다운로드
- https://prometheus.io/download/#prometheus
{
dnf -y install wget
wget https://github.com/prometheus/prometheus/releases/download/v2.46.0/prometheus-2.46.0.linux-amd64.tar.gz -P /tmp
cd /tmp/
tar -zxpvf prometheus-2.46.0.linux-amd64.tar.gz
cd prometheus-2.46.0.linux-amd64
cp -pr prometheus /usr/local/bin/
cp -pr promtool /usr/local/bin/
cp -pr prometheus.yml /etc/prometheus/
}
■ 서비스 파일 생성
cat <<EOF > /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Time Series Collection and Processing Server
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \\
--config.file /etc/prometheus/prometheus.yml \\
--storage.tsdb.path /var/lib/prometheus/ \\
--web.console.templates=/etc/prometheus/consoles \\
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
EOF
{
systemctl daemon-reload
systemctl enable prometheus --now
systemctl status prometheus
netstat -nlp | grep 9090
}
■ 방화벽 설정
{
firewall-cmd --add-port=9090/tcp --permanent
firewall-cmd --reload
}
02. node_exporter 설치
-
node exporter는 Linux 시스템의 메트릭 데이터(CPU/Memory/Disk/Network Traffic 등)를 수집하고, 제공한다.
-
https://prometheus.io/download/#node_exporter
■ node_exporter 사용자 생성
useradd -m -s /bin/false node_exporter
■ node_exporter 패키지 다운로드
{
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
tar -zxpvf node_exporter-1.6.1.linux-amd64.tar.gz
cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
chown node_exporter:node_exporter /usr/local/bin/node_exporter
}
■ 서비스 파일 생성
cat <<EOF > /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOF
{
systemctl daemon-reload
systemctl enable node_exporter --now
systemctl status node_exporter
netstat -nlp | grep 9100
}
■ 방화벽 설정
{
firewall-cmd --add-port=9100/tcp --permanent
firewall-cmd --reload
}
■ node_exporter 노드 정보 추가 (프로메테우스 서버)
vi /etc/prometheus/prometheus.yml
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
# - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: "prometheus"
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ["localhost:9090"]
# 추가
- job_name: "node"
static_configs:
- targets: ["localhost:9100"]
■ prometheus 서비스를 재기동 (프로메테우스 서버)
systemctl restart prometheus
■ metric 정보확인
curl http://localhost:9100/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.4137e-05
go_gc_duration_seconds{quantile="0.25"} 3.4137e-05
go_gc_duration_seconds{quantile="0.5"} 8.1547e-05
go_gc_duration_seconds{quantile="0.75"} 8.1547e-05
go_gc_duration_seconds{quantile="1"} 8.1547e-05
go_gc_duration_seconds_sum 0.000115684
go_gc_duration_seconds_count 2
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.20.6"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 2.481704e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.30988e+06
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.446544e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 49214
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 8.281024e+06
...
...
03. 결과 확인
■ “http://prometheus_server_ip:9090”에 액세스하면 다음과 같은 UI가 표시된다.
■ 해당 표시를 클릭하면 시계열 데이터를 보기 위한 쿼리들이 있는 것을 확인할 수 있다.
■ node_procs_running을 쿼리를 실행
■ [Status]-[Targets]를 클릭하면 각 노드의 상태를 확인할 수 있다.
댓글 남기기