EKS Training Camp - Health Checks (3)


Introduction

By default, Kubernetes automatically restarts a container that goes down for any reason. You can build on this by configuring health checks, which include liveness probes and readiness probes. For a detailed explanation of how they work, see the official Kubernetes health check documentation.
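As a quick orientation before the hands-on steps, here is a minimal sketch of my own (not from the workshop) showing where the two probe types sit in a Pod spec; the pod name, image, and probe targets are illustrative assumptions chosen to mirror the examples used later:

# probe-demo is a hypothetical pod, for illustration only
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: nginx               # any image with an HTTP endpoint and a shell
    livenessProbe:             # failure here makes the kubelet restart the container
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
    readinessProbe:            # failure here only marks the pod as not Ready
      exec:
        command: ["cat", "/usr/share/nginx/html/index.html"]
      initialDelaySeconds: 5
      periodSeconds: 5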

Liveness probes

1. Create the working directory

mkdir -p ~/environment/healthchecks

Create the YAML file:

cd ~/environment/healthchecks
cat <<EoF > liveness-app.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-app
spec:
  containers:
  - name: liveness
    image: brentley/ecsdemo-nodejs
    livenessProbe:
      httpGet:
        path: /health
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 5
EoF
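Optionally (this check is my addition, not part of the original steps), you can validate the manifest locally before applying it:

# Client-side validation only; nothing is created in the cluster
kubectl apply --dry-run=client -f liveness-app.yaml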

2. Deploy and confirm the pod is Ready

cd ~/environment/healthchecks/
kubectl apply -f liveness-app.yaml
kubectl get pod liveness-app

The output will look similar to the following:

NAME           READY   STATUS              RESTARTS   AGE
liveness-app   0/1     ContainerCreating   0          1s
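If you prefer to block until the pod is actually Ready instead of repeatedly running kubectl get, one option (an addition of mine, not in the original text) is:

# Wait up to 60 seconds for the pod's Ready condition to become true
kubectl wait --for=condition=Ready pod/liveness-app --timeout=60s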

Next, check the pod's event history:

kubectl describe pod liveness-app

The output will look similar to the following:

Events:
  Type    Reason     Age   From                                                  Message
  ----    ------     ----  ----                                                  -------
  Normal  Scheduled  36s   default-scheduler                                     Successfully assigned default/liveness-app to ip-172-31-34-171.eu-west-1.compute.internal
  Normal  Pulling    35s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Pulling image "brentley/ecsdemo-nodejs"
  Normal  Pulled     34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Successfully pulled image "brentley/ecsdemo-nodejs" in 877.182203ms
  Normal  Created    34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Created container liveness
  Normal  Started    34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Started container liveness

3. Force a health-check failure

kubectl get pod liveness-app
kubectl exec -it liveness-app -- /bin/kill -s SIGUSR1 1
kubectl get pod liveness-app
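To confirm that the kubelet actually acted on the failed probe, the commands below (my own additions, not part of the original walkthrough) show the liveness-probe failure events and the container restart count:

# Show the liveness-probe failure events recorded for the pod
kubectl describe pod liveness-app | grep -i -A2 unhealthy

# Show how many times the container has been restarted
kubectl get pod liveness-app -o jsonpath='{.status.containerStatuses[0].restartCount}{"\n"}'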

4. Trace the logs

After running the commands in the previous step, the Node.js application enters debug mode and no longer responds to the health-check requests, so the liveness probe fails and the kubelet treats the pod as broken and restarts it. We can follow the details in the logs:

kubectl logs liveness-app
kubectl logs liveness-app --previous

There is a lot of log output; the --previous flag shows the logs of the terminated container instance, and in it you will find a section similar to the following:

::ffff:172.31.34.171 - - [21/May/2021:06:49:06 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:11 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:16 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:21 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:26 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:31 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:36 +0000] "GET /health HTTP/1.1" 200 18 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:41 +0000] "GET /health HTTP/1.1" 200 18 "-" "kube-probe/1.20+"
Starting debugger agent.
Debugger listening on [::]:5858

Readiness probes

1. Create the deployment

cd ~/environment/healthchecks/
cat <<EoF > readiness-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readiness-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: readiness-deployment
  template:
    metadata:
      labels:
        app: readiness-deployment
    spec:
      containers:
      - name: readiness-deployment
        image: alpine
        command: ["sh", "-c", "touch /tmp/healthy && sleep 86400"]
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 5
          periodSeconds: 3
EoF

2. Deploy and check the deployment

cd ~/environment/healthchecks/
kubectl apply -f readiness-deployment.yaml
kubectl get pods -l app=readiness-deployment
kubectl describe deployment readiness-deployment | grep Replicas:

The replica status should look like this:

Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
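If the replicas are not all available yet, one way to wait for the rollout to finish (a convenience of mine, not shown in the original walkthrough) is:

# Block until all replicas report Ready, or fail after 60 seconds
kubectl rollout status deployment/readiness-deployment --timeout=60s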

3. Force a health-check failure

Manually deleting the /tmp/healthy file that the probe reads will cause the readiness check to fail. Note that the pod name in the commands below will be different in your cluster, so substitute one of your own pod names.

# kubectl exec -it <POD_NAME> -- rm /tmp/healthy
kubectl exec -it readiness-deployment-644f56898d-4mcdk -- rm /tmp/healthy
kubectl get pods -l app=readiness-deployment
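Since the pod name differs in every cluster, here is a minimal sketch of my own (using standard kubectl flags) that picks one pod from the deployment by label instead of copying the name by hand:

# Grab the name of the first pod carrying the deployment's label
POD_NAME=$(kubectl get pods -l app=readiness-deployment -o jsonpath='{.items[0].metadata.name}')

# Delete the file the readiness probe checks for inside that pod
kubectl exec -it "$POD_NAME" -- rm /tmp/healthy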

Now check the replica status again:

kubectl describe deployment readiness-deployment | grep Replicas:

You will see that one replica is now unavailable:

Replicas: 3 desired | 3 updated | 3 total | 2 available | 1 unavailable

4. Fix the failure

To bring the health check back to normal, simply exec into that pod and recreate the file by hand:

kubectl exec -it readiness-deployment-644f56898d-4mcdk -- touch /tmp/healthy
kubectl get pods -l app=readiness-deployment
kubectl describe deployment readiness-deployment | grep Replicas:

After recovery, all 3 replicas are available again:

Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable

Clean up

When you no longer need this environment, delete it as follows:

cd ~/environment/healthchecks/
kubectl delete -f liveness-app.yaml
kubectl delete -f readiness-deployment.yaml
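To confirm everything is gone, an optional check of mine (not part of the original steps):

# Both commands should return nothing once the resources are deleted
kubectl get pod liveness-app --ignore-not-found
kubectl get pods -l app=readiness-deployment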
