EKS Training Camp: Health Checks (3)
Introduction
By default, Kubernetes automatically restarts a container that crashes, regardless of the cause. On top of that, you can configure health checks, including pod liveness probes and readiness probes, to let the cluster detect and react to unhealthy applications. If you want to understand the mechanics in detail, see the official Kubernetes documentation on health checks.
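Before diving into the workshop files, here is a generic, minimal sketch of the two probe types on a single container (the pod name, image, port, and paths below are placeholders, not part of this workshop): a liveness probe failure makes the kubelet restart the container, while a readiness probe failure only marks the pod NotReady and removes it from Service endpoints.
# Generic illustration only; all names and values are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: web
    image: my-app:latest          # placeholder image
    ports:
    - containerPort: 8080
    livenessProbe:                # failure -> kubelet restarts the container
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 5
    readinessProbe:               # failure -> pod removed from Service endpoints; no restart
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5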
Liveness probes
1. Create the project directory
mkdir -p ~/environment/healthchecks
Create the YAML file:
cd ~/environment/healthchecks
cat <<EoF > liveness-app.yaml    # NOTE: the manifest body and the closing EoF were lost when the page was scraped; see the sketch below
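The manifest itself is missing from the scraped text. Based on the pod name, container name, image, probe path, and the 5-second probe spacing that appear later in this walkthrough, it likely looks like the following sketch; the container port 3000 and the initialDelaySeconds value are assumptions.
# liveness-app.yaml -- reconstructed sketch; port and initialDelaySeconds are assumed
apiVersion: v1
kind: Pod
metadata:
  name: liveness-app
spec:
  containers:
  - name: liveness
    image: brentley/ecsdemo-nodejs
    livenessProbe:
      httpGet:
        path: /health            # matches the "GET /health" requests seen in the logs below
        port: 3000               # assumed application port
      initialDelaySeconds: 5     # assumed
      periodSeconds: 5           # matches the 5-second probe spacing seen in the logs below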
2. Deploy and confirm the pod is Ready
cd ~/environment/healthchecks/
kubectl apply -f liveness-app.yaml
kubectl get pod liveness-app
The output should look similar to:
NAME           READY   STATUS              RESTARTS   AGE
liveness-app   0/1     ContainerCreating   0          1s
Then look at the pod's event history:
kubectl describe pod liveness-app
The output should look similar to:
Events:
  Type    Reason     Age   From                                                  Message
  ----    ------     ----  ----                                                  -------
  Normal  Scheduled  36s   default-scheduler                                     Successfully assigned default/liveness-app to ip-172-31-34-171.eu-west-1.compute.internal
  Normal  Pulling    35s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Pulling image "brentley/ecsdemo-nodejs"
  Normal  Pulled     34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Successfully pulled image "brentley/ecsdemo-nodejs" in 877.182203ms
  Normal  Created    34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Created container liveness
  Normal  Started    34s   kubelet, ip-172-31-34-171.eu-west-1.compute.internal  Started container liveness
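Once the image is pulled and the container starts, the pod should move to 1/1 Running. As an optional extra (not part of the original steps), you can block until the pod reports Ready:
# Optional: wait until the pod reports Ready (gives up after 60s)
kubectl wait --for=condition=Ready pod/liveness-app --timeout=60s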
3. Manually trigger a health check failure
kubectl get pod liveness-app
kubectl exec -it liveness-app -- /bin/kill -s SIGUSR1 1
kubectl get pod liveness-app
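Sending SIGUSR1 to PID 1 (the node process) switches the Node.js app into debug mode, so it stops answering the /health requests; once the probe fails enough times in a row (3 consecutive failures by default), the kubelet restarts the container. As an optional extra, you can watch the restart happen:
# Optional: watch until the RESTARTS column increments, then press Ctrl+C
kubectl get pod liveness-app --watch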
4. Follow the logs
After running the commands in the previous step, the Node.js application enters debug mode and no longer responds to health check requests, so the liveness probe fails and the container gets restarted. We can trace the details in the logs (--previous shows the logs of the container instance that was just replaced):
kubectl logs liveness-app
kubectl logs liveness-app --previous
There will be many log lines; among them you will see a section similar to the following:
::ffff:172.31.34.171 - - [21/May/2021:06:49:06 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:11 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:16 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:21 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:26 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:31 +0000] "GET /health HTTP/1.1" 200 17 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:36 +0000] "GET /health HTTP/1.1" 200 18 "-" "kube-probe/1.20+"
::ffff:172.31.34.171 - - [21/May/2021:06:49:41 +0000] "GET /health HTTP/1.1" 200 18 "-" "kube-probe/1.20+"
Starting debugger agent.
Debugger listening on [::]:5858
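The last two lines show the app dropping into the debugger, which is when it stops serving /health. As an optional extra, you can confirm the container really was restarted by printing its restart count:
# Optional: print the container's restart count (should be 1 after the forced failure)
kubectl get pod liveness-app -o jsonpath='{.status.containerStatuses[0].restartCount}'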
Readiness probes
1. Create the deployment manifest
cd ~/environment/healthchecks/
cat <<EoF > readiness-deployment.yaml    # NOTE: the manifest body and the closing EoF were lost when the page was scraped; see the sketch below
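The manifest body is missing from the scraped text. The deployment name, the 3 replicas, the app=readiness-deployment label, and the fact that readiness depends on the file /tmp/healthy existing are all confirmed by the rest of this walkthrough; the image, startup command, and probe timings below are assumptions.
# readiness-deployment.yaml -- reconstructed sketch; image, command, and timings are assumed
apiVersion: apps/v1
kind: Deployment
metadata:
  name: readiness-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: readiness-deployment
  template:
    metadata:
      labels:
        app: readiness-deployment
    spec:
      containers:
      - name: readiness-deployment
        image: alpine                                               # assumed base image
        command: ["sh", "-c", "touch /tmp/healthy && sleep 86400"]  # assumed startup command
        readinessProbe:
          exec:
            command: ["cat", "/tmp/healthy"]   # pod counts as Ready only while this file exists
          initialDelaySeconds: 5               # assumed
          periodSeconds: 3                     # assumed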
2. Deploy and check the service
cd ~/environment/healthchecks/
kubectl apply -f readiness-deployment.yaml
kubectl get pods -l app=readiness-deployment
kubectl describe deployment readiness-deployment | grep Replicas:
The replica status should look like this:
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
3. Manually trigger a health check failure
We manually delete the /tmp/healthy file that the probe reads, which makes the readiness check for that pod fail. Substitute the name of one of your readiness-deployment pods below; the command was truncated in the original, but from the fix in step 4 it removes /tmp/healthy:
kubectl exec -it <YOUR-READINESS-POD-NAME> -- rm /tmp/healthy
Now check the replica status again:
kubectl describe deployment readiness-deployment | grep Replicas:
You will see that one replica is now unavailable:
Replicas: 3 desired | 3 updated | 3 total | 2 available | 1 unavailable
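Unlike a liveness failure, a readiness failure does not restart the container; the pod simply stops counting as available. To see which pod is affected (and to get the pod name you need for the next step), list the pods again; the broken one shows 0/1 in the READY column:
# The NotReady pod is the one reporting 0/1 under READY
kubectl get pods -l app=readiness-deployment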
4. Fix the failure
To bring the health check back, exec into that pod and recreate the file (your pod name will differ from the one shown here):
kubectl exec -it readiness-deployment-644f56898d-4mcdk -- touch /tmp/healthy
kubectl get pods -l app=readiness-deployment
kubectl describe deployment readiness-deployment | grep Replicas:
After recovery, all 3 replicas are available again:
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
Clean up
When you no longer need this environment, delete it as follows:
cd ~/environment/healthchecks/
kubectl delete -f liveness-app.yaml
kubectl delete -f readiness-deployment.yaml