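Monitoring a Redis cluster with Prometheus follows the usual pattern: run an exporter.
The exporter collects the metrics and exposes them over HTTP for Prometheus to scrape, and Grafana charts those metrics. Prometheus also evaluates the collected data against the alerting rules you define and forwards firing alerts to Alertmanager, which then decides whether to send a notification.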
redis_exporter repository: https://github.com/oliver006/redis_exporter
Deploy redis_exporter
Download the redis_exporter release archive
wget https://github.com/oliver006/redis_exporter/releases/download/v1.66.0/redis_exporter-v1.66.0.linux-amd64.tar.gz
Deploy the redis_exporter service
# Unpack the archive
tar -xf redis_exporter-v1.66.0.linux-amd64.tar.gz && mv redis_exporter-v1.66.0.linux-amd64 /usr/local/redis_exporter
# Create the systemd unit file
cat > /usr/lib/systemd/system/redis_exporter.service <<EOF
[Unit]
Description=redis_exporter
After=network.target
[Service]
Restart=on-failure
ExecStart=/usr/local/redis_exporter/redis_exporter \
--redis.addr=10.66.111.31:6381 \
--redis.password=Red@Qacd!
[Install]
WantedBy=multi-user.target
EOF
Start the redis_exporter service
systemctl enable --now redis_exporter
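Before wiring the exporter into Prometheus, it can help to confirm that it is serving metrics and can reach Redis. The sketch below is a small helper, not part of redis_exporter: the function name redis_up_value is made up for illustration, and it assumes the exporter listens on its default port 9121 (the port used later in the scrape configuration).

```shell
# Print the value of the redis_up metric from Prometheus text-format
# metrics read on stdin. Returns non-zero if the metric is absent.
# (redis_up_value is a hypothetical helper name.)
redis_up_value() {
  awk '
    # Match "redis_up" with or without a label set; skip "# HELP"/"# TYPE" lines.
    $1 == "redis_up" || index($1, "redis_up{") == 1 { print $NF; found = 1 }
    END { exit !found }
  '
}
```

Usage, assuming the exporter runs on 10.66.111.31: `curl -s http://10.66.111.31:9121/metrics | redis_up_value`. A value of 1 means the exporter could connect to the Redis instance given in --redis.addr; 0 means it could not.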
Configure the Prometheus scrape jobs
Create prometheus-additional.yaml with the following content:
# Scrape every Redis node through the single exporter's /scrape endpoint
- job_name: 'redis_exporter_targets'
  static_configs:
    - targets:
      - redis://10.66.111.31:6381
      - redis://10.66.111.31:6382
      - redis://10.66.111.32:6381
      - redis://10.66.111.32:6382
      - redis://10.66.111.33:6381
      - redis://10.66.111.33:6382
  metrics_path: /scrape
  relabel_configs:
    # Pass the Redis address to the exporter as the ?target= parameter
    - source_labels: [__address__]
      target_label: __param_target
    # Keep the Redis address as the instance label
    - source_labels: [__param_target]
      target_label: instance
    # Send the actual HTTP request to the exporter
    - target_label: __address__
      replacement: 10.66.111.31:9121
# Scrape the exporter's own metrics
- job_name: "redis-exporter"
  static_configs:
    - targets:
      - 10.66.111.31:9121
Hot-reload the configuration file
kubectl create secret generic additional-configs \
  --from-file=prometheus-additional.yaml \
  --dry-run=client \
  -o yaml | kubectl replace -f - -n monitoring
Add the Grafana dashboard. The dashboard ID is 21914.
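The relabeling in prometheus-additional.yaml turns each redis:// target into a request against the exporter's /scrape endpoint, with the Redis address passed as the target query parameter. To see what that request looks like, and to test a single node by hand, a minimal sketch follows. The helper name scrape_url is made up for illustration. Prometheus URL-encodes the parameter, while this helper leaves it as-is, which curl generally accepts.

```shell
# Build the URL that results from the relabel_configs for one Redis node.
#   $1: exporter address (host:port), $2: Redis target (redis://host:port)
# (scrape_url is a hypothetical helper name.)
scrape_url() {
  local exporter="$1" target="$2"
  printf 'http://%s/scrape?target=%s\n' "$exporter" "$target"
}
```

For example, `curl -s "$(scrape_url 10.66.111.31:9121 redis://10.66.111.32:6381)" | grep '^redis_up'` asks the exporter on 10.66.111.31 to scrape the node at 10.66.111.32:6381 and shows whether that node was reachable.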
Configure the Redis cluster alerting rules
Create prometheus-redisRule.yaml with the following content:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: prometheus-k8s-redis-rules
  namespace: monitoring
spec:
  groups:
  - name: Redis
    rules:
    - alert: RedisDown
      expr: redis_up == 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Redis down (instance {{ $labels.instance }})"
        description: "Redis instance is down\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: MissingBackup
      # No successful RDB save in the last 24 hours
      expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Missing backup (instance {{ $labels.instance }})"
        description: "Redis has not been backed up for 24 hours\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: OutOfMemory
      expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Out of memory (instance {{ $labels.instance }})"
        description: "Redis is running out of memory (> 90%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: ReplicationBroken
      expr: delta(redis_connected_slaves[1m]) < 0
      # The 1m delta is negative only briefly after a replica is lost,
      # so a 5m hold would keep this alert from ever firing.
      for: 0m
      labels:
        severity: error
      annotations:
        summary: "Replication broken (instance {{ $labels.instance }})"
        description: "Redis instance lost a slave\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: TooManyConnections
      expr: redis_connected_clients > 1000
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Too many connections (instance {{ $labels.instance }})"
        description: "Redis instance has too many connections\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: NotEnoughConnections
      expr: redis_connected_clients < 5
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Not enough connections (instance {{ $labels.instance }})"
        description: "Redis instance should have more connections (> 5)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
    - alert: RejectedConnections
      expr: increase(redis_rejected_connections_total[1m]) > 0
      for: 5m
      labels:
        severity: error
      annotations:
        summary: "Rejected connections (instance {{ $labels.instance }})"
        description: "Some connections to Redis have been rejected\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
Create the rule resource
kubectl create -f prometheus-redisRule.yaml
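To get a feel for the OutOfMemory and MissingBackup thresholds before relying on the alerts, the same arithmetic can be run by hand against raw exporter output. This is only a rough sketch: the function name check_redis_rules and its output format are made up for illustration, and it ignores the `for:` hold time that Prometheus applies before an alert actually fires.

```shell
# Evaluate two of the rule expressions against Prometheus text-format
# metrics read on stdin.
#   $1: current Unix time in seconds (defaults to now)
# (check_redis_rules is a hypothetical helper name.)
check_redis_rules() {
  local now="${1:-$(date +%s)}"
  awk -v now="$now" '
    $1 == "redis_memory_used_bytes"               { used  = $NF }
    $1 == "redis_total_system_memory_bytes"       { total = $NF }
    $1 == "redis_rdb_last_save_timestamp_seconds" { saved = $NF }
    END {
      # OutOfMemory: used / total * 100 > 90
      if (total > 0) {
        pct = used / total * 100
        printf "memory_used_percent %.1f %s\n", pct, (pct > 90 ? "FIRING" : "ok")
      }
      # MissingBackup: time() - last save > 60 * 60 * 24
      age = now - saved
      printf "seconds_since_last_save %d %s\n", age, (age > 86400 ? "FIRING" : "ok")
    }
  '
}
```

Usage, assuming the exporter from the earlier steps: `curl -s http://10.66.111.31:9121/metrics | check_redis_rules`. The sketch expects the three metrics to appear without labels; if your exporter adds labels to them, the name matching would need to be adjusted.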