grpc: the connection is unavailable

Error log

A pod cannot be created and stays in ContainerCreating.

[root@master-62 ~]# kubectl get po --all-namespaces -owide |grep -v "Running"
liangmingb                test-1858107638-nrw6d                           0/1       ContainerCreating   0          1h        <none>            slave-143


[root@master-62 ~]# kubectl describe po test-1858107638-nrw6d -nliangmingb
Events:
  FirstSeen LastSeen    Count   From            SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----            -------------   --------    ------      -------
  1h        5s      1762    kubelet, slave-143          Warning     FailedSync  Error syncing pod
  1h        4s      1762    kubelet, slave-143          Normal      SandboxChanged  Pod sandbox changed, it will be killed and re-created

The kubelet log on the node shows the following errors:

12月 11 20:07:09 slave-143 kubelet[23490]: WARNING:1211 20:07:09.376601   23490 cni.go:258] CNI failed to retrieve network namespace path: Error: No such container: 9adaef08d85b827e78600cc0a170df27617481442463962f3f30495ce878cc1f
12月 11 20:07:09 slave-143 kubelet[23490]: ERROR:1211 20:07:09.622003   23490 docker_sandbox.go:239] Failed to stop sandbox "9adaef08d85b827e78600cc0a170df27617481442463962f3f30495ce878cc1f": Error response from daemon: {"message":"No such container: 9adaef08d85b827e78600cc0a170df27617481442463962f3f30495ce878cc1f"}
12月 11 20:07:09 slave-143 kubelet[23490]: ERROR
12月 11 20:07:09 slave-143 kubelet[23490]: ERROR
12月 11 20:07:09 slave-143 kubelet[23490]: ERROR
12月 11 20:07:09 slave-143 kubelet[23490]: ERROR:1211 20:07:09.878784   23490 remote_runtime.go:91] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = failed to start sandbox container for pod "test-1858107638-nrw6d": Error response from daemon: {"message":"grpc: the connection is unavailable"}
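
To gauge how often the node hits this error, the log lines can be counted. The following is a minimal sketch, not part of the original troubleshooting session: count_grpc_unavailable is a hypothetical helper name, and it only matches the literal error string shown in the log above.

```shell
# Hypothetical helper (not from the original post).
# Reads log text on stdin and prints how many lines contain the
# error string seen in the kubelet log above.
count_grpc_unavailable() {
    # -F: match the string literally, -c: print the count of matching lines.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function's exit status at 0 while the count "0" is still printed.
    grep -c -F 'grpc: the connection is unavailable' || true
}
```

On the affected node it could be fed from the journal, for example: journalctl -u kubelet --since "1 hour ago" | count_grpc_unavailable. The unit name kubelet is an assumption here; adjust it if the service is named differently.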

Impact of this error

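# When kubelet reports this error, the effects are:
1. Existing pods on the affected node cannot be stopped or deleted.
  Note: pods that are already running stay usable; their IP addresses can still be pinged from other nodes.
2. New pods scheduled to the node cannot be created either.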

[root@slave-143 ~]# docker ps |grep filebeat
07646329caa3        reg.enncloud.cn/enncloud/filebeat@sha256:8869c3fcd0eadfe6202407b06eec8e672f37de3d093031bc01c03e5736e842d9                       "./run.sh"               34 hours ago        Up 34 hours                             k8s_filebeat_filebeat-hvrz9_kube-system_79e95933-fc20-11e8-885a-5254c2cdf2fd_0
9ab9c478ea3d        reg.enncloud.cn/enncloud/pause-amd64:3.0                                                                                        "/pause"                 34 hours ago        Up 34 hours                             k8s_POD_filebeat-hvrz9_kube-system_79e95933-fc20-11e8-885a-5254c2cdf2fd_0

[root@slave-143 ~]# docker stop 07646329caa3
Error response from daemon: Cannot stop container 07646329caa3: Cannot kill container 07646329caa3e77b64f76707df5f69242f753c678a9c344dc83ff25cdd0cdb2f: rpc error: code = 14 desc = grpc: the connection is unavailable

[root@slave-143 ~]# docker rm -f 07646329caa3
Error response from daemon: Could not kill running container 07646329caa3e77b64f76707df5f69242f753c678a9c344dc83ff25cdd0cdb2f, cannot remove - Cannot kill container 07646329caa3e77b64f76707df5f69242f753c678a9c344dc83ff25cdd0cdb2f: rpc error: code = 14 desc = grpc: the connection is unavailable


# In other words, evicting pods with kubectl drain does not work either, because the pods cannot be deleted at all.

Solution

systemctl restart docker
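
After the restart, it may help to confirm that no pods on the node are still stuck. The sketch below is an illustration only: not_running_on_node is a hypothetical helper, and it assumes the eight-column layout of kubectl get po --all-namespaces -owide seen earlier (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE, IP, NODE). Newer kubectl versions print extra columns, so the field numbers may need adjusting.

```shell
# Hypothetical helper (not from the original post).
# Reads `kubectl get po --all-namespaces -owide` output on stdin and prints
# namespace/name for every pod on the given node whose STATUS is not Running.
# Assumes the column layout shown earlier: field 4 is STATUS, field 8 is NODE.
not_running_on_node() {
    node="$1"
    awk -v node="$node" '
        $1 == "NAMESPACE" { next }                  # skip the header row if present
        $4 != "Running" && $8 == node { print $1 "/" $2 }
    '
}
```

For the node in this post, the call would look like: kubectl get po --all-namespaces -owide | not_running_on_node slave-143. Empty output means every pod listed for that node is Running.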

Related issue on GitHub

https://www.infoq.cn/article/2017%2F02%2FDocker-Containerd-RunC
    Original author: 开始懂了90
    Original URL: https://www.jianshu.com/p/3222f720d7a4