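In Rancher, I noticed that one of the physical hosts (Ubuntu 16.04.4) had lost its connection.
I logged in through iTerm to take a look and found that the Docker service was down.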
$ docker ps
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
Check the service status:
$ service docker status
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2018-02-07 15:52:38 CST; 1 day 18h ago
Docs: https://docs.docker.com
Process: 1889 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Main PID: 1889 (code=exited, status=1/FAILURE)
Feb 07 15:52:37 dockerhost1 systemd[1]: Starting Docker Application Container Engine...
Feb 07 15:52:37 dockerhost1 dockerd[1889]: time="2018-02-07T15:52:37.363005434+08:00" level=info msg="libcontainerd: new containerd process, pid: 1897"
Feb 07 15:52:38 dockerhost1 dockerd[1889]: time="2018-02-07T15:52:38.412381618+08:00" level=info msg="[graphdriver] using prior storage driver \"aufs\""
Feb 07 15:52:38 dockerhost1 dockerd[1889]: time="2018-02-07T15:52:38.446398584+08:00" level=fatal msg="Error starting daemon: layer does not exist"
Feb 07 15:52:38 dockerhost1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Feb 07 15:52:38 dockerhost1 systemd[1]: Failed to start Docker Application Container Engine.
Feb 07 15:52:38 dockerhost1 systemd[1]: docker.service: Unit entered failed state.
Feb 07 15:52:38 dockerhost1 systemd[1]: docker.service: Failed with result 'exit-code'.
Feb 07 15:52:38 dockerhost1 systemd[1]: docker.service: Start request repeated too quickly.
Feb 07 15:52:38 dockerhost1 systemd[1]: Failed to start Docker Application Container Engine.
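The line that matters in the output above is the single level=fatal entry. As a minimal sketch, the fatal message can be pulled out of the daemon log with a small filter. The function name extract_fatal is my own, and the sketch assumes the log lines end with msg="..." as they do in the output above.

```shell
# Print the msg="..." text of every level=fatal line read from stdin.
# extract_fatal is a hypothetical helper name, not part of Docker or systemd.
extract_fatal() {
  grep 'level=fatal' | sed -n 's/.*msg="\(.*\)".*/\1/p'
}

# Usage on the affected host (assumes the systemd journal holds the Docker logs):
#   journalctl -u docker.service --no-pager | extract_fatal
```

On this host it would print only "Error starting daemon: layer does not exist", which is the error to search for.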
Solution
- Create the file
/etc/systemd/system/docker.service.d/overlay.conf
with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay
- Reload systemd so the file takes effect
$ sudo systemctl daemon-reload
- Check that the configuration was loaded
$ systemctl show --property=ExecStart docker
- Restart Docker
$ sudo systemctl restart docker
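The steps above can be sketched as one small script. The empty ExecStart= line clears the command inherited from the packaged unit, so the second ExecStart= replaces it instead of adding a second one. The function name write_overlay_dropin and its optional directory argument are my own additions (the argument makes it possible to try the function in a scratch directory), and the sketch assumes the daemon binary is /usr/bin/dockerd, as shown in the status output earlier.

```shell
# Write the overlay drop-in into the given directory.
# Defaults to the systemd drop-in directory for docker.service.
write_overlay_dropin() {
  dir=${1:-/etc/systemd/system/docker.service.d}
  mkdir -p "$dir" || return 1
  # The first ExecStart= resets the inherited command; the second sets the new one.
  cat > "$dir/overlay.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay
EOF
}

# Usage, as root, on the affected host:
#   write_overlay_dropin
#   systemctl daemon-reload
#   systemctl show --property=ExecStart docker
#   systemctl restart docker
```

Nothing is written until the function is called, so the file can be sourced and tested against a temporary directory first.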
If the restart fails with:
Failed to restart docker.service: The name org.freedesktop.PolicyKit1 was not provided by any .service files
See system logs and 'systemctl status docker.service' for details.
check whether PolicyKit is installed, and install it if it is missing:
$ sudo apt-get update
$ sudo apt-get install policykit-1
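To check first instead of installing blindly, something like the following could work. It is only a sketch: pkg_installed is a hypothetical helper name, and it assumes a Debian-family system where dpkg-query is available.

```shell
# Return 0 if the named package is fully installed, non-zero otherwise.
# Returns 2 when dpkg-query itself is missing (not a Debian-family system).
pkg_installed() {
  command -v dpkg-query >/dev/null 2>&1 || return 2
  status=$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null) || return 1
  [ "$status" = "install ok installed" ]
}

# Usage:
#   pkg_installed policykit-1 || sudo apt-get install policykit-1
```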
If it fails with:
docker Error starting daemon: layer does not exist
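one of the images is probably corrupted.
You can fix this by deleting all of Docker's data (this removes every image, container, and volume stored under /var/lib/docker):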
$ sudo rm -rf /var/lib/docker
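Since rm -rf cannot be undone, a more cautious variant is to move the directory aside and let Docker start with an empty one. This is a sketch of that idea, not what I ran on the host: move_aside is a hypothetical helper name and the timestamped .bak suffix is an arbitrary choice.

```shell
# Rename a directory to <dir>.bak.<timestamp> and print the new path.
# Fails without touching anything if the directory does not exist.
move_aside() {
  src=$1
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  dest="$src.bak.$(date +%Y%m%d%H%M%S)"
  mv "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage, as root, with the Docker service stopped:
#   systemctl stop docker
#   move_aside /var/lib/docker
#   systemctl start docker
```

Once the daemon runs again and nothing from the old directory is needed, the backup can be deleted to free the disk space.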
Alternatively, upgrade to Docker v1.13.0 or later, where this bug is fixed (the command below installs 1.13.1):
$ sudo apt-get -y install docker-engine=1.13.1-0~ubuntu-xenial --allow-unauthenticated
See: https://github.com/coreos/bugs/issues/1808
https://github.com/moby/moby/issues/32170
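To tell whether a host already runs a version with the fix, the installed version can be compared against 1.13.0. A minimal sketch, with two assumptions: that docker --version prints a line of the form "Docker version X.Y.Z, build ...", and that sort supports -V (GNU coreutils, which Ubuntu ships). Both helper names are my own.

```shell
# Extract the version number from a `docker --version` style line.
parse_docker_version() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([^,]*\),.*/\1/p'
}

# Return 0 if version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   v=$(parse_docker_version "$(docker --version)")
#   version_ge "$v" 1.13.0 && echo "fix included" || echo "upgrade needed"
```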
Problem solved.
References:
https://stackoverflow.com/questions/37227349/unable-to-start-docker-service-in-ubuntu-16-04
http://m635674608.iteye.com/blog/2377517
http://blog.csdn.net/ronnyjiang/article/details/72638839
http://sparkgis.com/java/2017/11/docker报错(ubuntu16-04)-原-docker报错(ubuntu16-04)-m睡意zzz/
http://www.docker.org.cn/thread/72.html
https://docs.docker.com/config/daemon/systemd/#start-automatically-at-system-boot