Adding WeCom (Enterprise WeChat) alert notifications to Alertmanager

Reader submission · 2022-10-30


Prometheus machine: 172.27.143.155
Alertmanager machine: 172.27.143.150

I. Prometheus and Grafana have already been set up on the .155 machine; next, configure the Alertmanager service.

1. wget ./alertmanager &
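For the WeCom (Enterprise WeChat) notifications named in the title, Alertmanager needs a receiver that uses wechat_configs. The following is a minimal sketch of an alertmanager.yml, not a verified production configuration. The receiver name, the grouping intervals, and every ID and secret are placeholders chosen for illustration and must be replaced with the values from your own WeCom admin console.

```yaml
# alertmanager.yml (sketch; all IDs and secrets below are placeholders)
route:
  receiver: wechat              # hypothetical receiver name
  group_by: ['alertname']
  group_wait: 30s               # assumed timing values, tune as needed
  group_interval: 5m
  repeat_interval: 1h

receivers:
  - name: wechat
    wechat_configs:
      - corp_id: 'YOUR_CORP_ID'         # placeholder: WeCom corporation ID
        agent_id: 'YOUR_AGENT_ID'       # placeholder: ID of the WeCom application
        api_secret: 'YOUR_APP_SECRET'   # placeholder: secret of that application
        to_party: '1'                   # placeholder: department ID that receives alerts
        send_resolved: true             # also notify when an alert resolves
```

With a file like this in place, Alertmanager can be started with ./alertmanager --config.file=alertmanager.yml & so that the receiver is loaded at startup.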

Create a rules directory containing two files: one for host monitoring and one for container monitoring.

1) cat host_sys.yml

# Several thresholds below (> 2, > 0.05, > 0.3) are lower than the
# percentages quoted in the descriptions; align them before production use.
groups:
- name: Host
  rules:
  - alert: Memory Usage
    expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 2
    for: 1m
    labels:
      name: Memory
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host memory usage exceeds 80%."
      value: "{{ $value }}"
  - alert: CPU Usage
    expr: sum(avg without (cpu)(irate(node_cpu_seconds_total{mode!='idle'}[5m]))) by (instance,appname) > 0.05
    for: 1m
    labels:
      name: CPU
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host CPU usage exceeds 65%."
      value: "{{ $value }}"
  - alert: HostLoad
    expr: node_load5 > 4
    for: 1m
    labels:
      name: Load
      severity: Warning
    annotations:
      summary: "{{ $labels.appname }} "
      description: "Host 5-minute load average exceeds 4."
      value: "{{ $value }}"
  - alert: Filesystem Usage
    expr: 1-(node_filesystem_free_bytes / node_filesystem_size_bytes) > 0.3
    for: 1m
    labels:
      name: Disk
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host partition [ {{ $labels.mountpoint }} ] usage exceeds 80%."
      value: "{{ $value }}%"
  - alert: Diskio writes
    expr: irate(node_disk_writes_completed_total{job=~"Host"}[1m]) > 50
    for: 1m
    labels:
      name: Diskio
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host disk [{{ $labels.device }}] 1-minute average write I/O load is high."
      value: "{{ $value }}iops"
  - alert: Diskio reads
    expr: irate(node_disk_reads_completed_total{job=~"Host"}[1m]) > 5
    for: 1m
    labels:
      name: Diskio
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host disk [{{ $labels.device }}] 1-minute average read I/O load is high."
      value: "{{ $value }}iops"
  - alert: Network_receive
    expr: irate(node_network_receive_bytes_total{device!~"lo|bond[0-9]|cbr[0-9]|veth.|virbr.|ovs-system"}[5m]) / 1048576 > 5
    for: 1m
    labels:
      name: Network_receive
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host NIC [{{ $labels.device }}] 5-minute average receive traffic exceeds 5Mbps."
      value: "{{ $value }}Mbps"
  - alert: Network_transmit
    expr: irate(node_network_transmit_bytes_total{device!~"lo|bond[0-9]|cbr[0-9]|veth.|virbr.|ovs-system"}[5m]) / 1048576 > 5
    for: 1m
    labels:
      name: Network_transmit
      severity: Warning
    annotations:
      summary: " {{ $labels.appname }} "
      description: "Host NIC [{{ $labels.device }}] 5-minute average transmit traffic exceeds 5Mbps."
      value: "{{ $value }}Mbps"

2) cat container_sys.yml

groups:
- name: Container
  rules:
  - alert: CPU Usage
    expr: (sum by(name,instance) (rate(container_cpu_usage_seconds_total{image!=""}[5m]))*100) > 80
    for: 1m
    labels:
      name: CPU
      severity: Warning
    annotations:
      summary: "{{ $labels.name }} "
      description: "Container CPU usage exceeds 80%."
      value: "{{ $value }}%"
  - alert: Memory Usage
    expr: (container_memory_usage_bytes{name=~".+"} - container_memory_cache{name=~".+"}) / container_spec_memory_limit_bytes{name=~".+"} * 100 > 80
    for: 1m
    labels:
      name: Memory
      severity: Warning
    annotations:
      summary: "{{ $labels.name }} "
      description: "Container memory usage exceeds 80%."
      value: "{{ $value }}%"
  - alert: Network_receive
    expr: irate(container_network_receive_bytes_total{name=~".+",interface=~"eth.+"}[5m]) / 1048576 > 5
    for: 1m
    labels:
      name: Network_receive
      severity: Warning
    annotations:
      summary: "{{ $labels.name }} "
      # the container network metrics carry an "interface" label, not "device"
      description: "Container NIC [{{ $labels.interface }}] 5-minute average receive traffic exceeds 5Mbps."
      value: "{{ $value }}Mbps"
  - alert: Network_transmit
    expr: irate(container_network_transmit_bytes_total{name=~".+",interface=~"eth.+"}[5m]) / 1048576 > 5
    for: 1m
    labels:
      name: Network_transmit
      severity: Warning
    annotations:
      summary: "{{ $labels.name }} "
      # the container network metrics carry an "interface" label, not "device"
      description: "Container NIC [{{ $labels.interface }}] 5-minute average transmit traffic exceeds 5Mbps."
      value: "{{ $value }}Mbps"

Once the configuration is complete, restart the Prometheus service.
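For the restart to have any effect, Prometheus has to know where the rule files live and where Alertmanager is listening. A minimal sketch of the relevant prometheus.yml fragment follows; it assumes the rules directory sits next to prometheus.yml and that Alertmanager runs on its default port 9093 on the .150 machine. Neither detail is stated in the original setup, so adjust both to match your installation.

```yaml
# prometheus.yml fragment (sketch; path and port are assumptions)
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['172.27.143.150:9093']   # Alertmanager machine, default port assumed

rule_files:
  - 'rules/*.yml'   # assumed location of host_sys.yml and container_sys.yml
```

Running promtool check rules rules/host_sys.yml rules/container_sys.yml before the restart can catch indentation or expression errors in the rule files.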

Container monitoring is now set up.
