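Fixing one scenario in which docker commands hang
1. Problem overview
This post records a docker problem I ran into during development, written down here to share.
Symptoms:
All docker commands hang without returning, or print the following message.
`Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?`
Environment: a VirtualBox virtual machine running Linux ubuntu-18 4.15.0-20-generic #21-Ubuntu
Cause: while I was using the Windows 10 host, a serial port problem caused 4 blue screen crashes.
Log output: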
xxxxxx@ubuntu-18:~$ sudo journalctl -eu docker
9月 22 14:28:54 ubuntu-18 dockerd[2862]: time="2024-09-22T14:28:54+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/df765b43add0b445562203c020d6389
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.051941623+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers:
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.052345154+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 name
9月 22 14:28:55 ubuntu-18 dockerd[2862]: time="2024-09-22T14:28:55+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/52c8d58b7dd87312ab3b11fa91606db
9月 22 14:28:55 ubuntu-18 dockerd[2862]: time="2024-09-22T14:28:55+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/1cad9287079b11726af87f6ae2fdfaf
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.382099670+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers:
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.382135266+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 name
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.409340048+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers:
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.409374464+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 name
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.970998310+08:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers:
9月 22 14:28:55 ubuntu-18 dockerd[2844]: time="2024-09-22T14:28:55.971133054+08:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 name
9月 22 14:28:56 ubuntu-18 dockerd[2844]: panic: runtime error: invalid memory address or nil pointer dereference
9月 22 14:28:56 ubuntu-18 dockerd[2844]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x5647f2ecb0af]
9月 22 14:28:56 ubuntu-18 dockerd[2844]: goroutine 218 [running]:
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/vendor/github.com/docker/libnetwork.(*endpoint).addServiceInfoToCluster(0xc4200a31e0, 0xc42068de00, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/agent.go:586 +0xbf
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/vendor/github.com/docker/libnetwork.(*sandbox).EnableService(0xc42068de00, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/sandbox.go:686 +0x18f
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).ActivateContainerServiceBinding(0xc420086480, 0xc4201c2cc0, 0x10, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/container_operations.go:1087 +0x89
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).connectToNetwork(0xc420086480, 0xc4202f9200, 0x5647f369ff6d, 0x6, 0xc4205803c0, 0x5647f4cb0400, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/container_operations.go:796 +0xa3e
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).allocateNetwork(0xc420086480, 0xc4202f9200, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/container_operations.go:540 +0xbb3
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).initializeNetworking(0xc420086480, 0xc4202f9200, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/container_operations.go:922 +0xa2
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).containerStart(0xc420086480, 0xc4202f9200, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/start.go:150 +0x2cf
9月 22 14:28:56 ubuntu-18 dockerd[2844]: github.com/docker/docker/daemon.(*Daemon).restore.func2(0xc420ab67e0, 0xc420086480, 0xc4201179b0, 0xc4202f9200, 0xc4204d2420)
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/daemon.go:401 +0x30c
9月 22 14:28:56 ubuntu-18 dockerd[2844]: created by github.com/docker/docker/daemon.(*Daemon).restore
9月 22 14:28:56 ubuntu-18 dockerd[2844]: /go/src/github.com/docker/docker/daemon/daemon.go:381 +0x12e5
9月 22 14:28:56 ubuntu-18 systemd[1]: docker.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
9月 22 14:28:56 ubuntu-18 systemd[1]: docker.service: Failed with result 'exit-code'.
9月 22 14:28:56 ubuntu-18 systemd[1]: Failed to start Docker Application Container Engine.
9月 22 14:28:56 ubuntu-18 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
9月 22 14:28:56 ubuntu-18 systemd[1]: Stopped Docker Application Container Engine.
9月 22 14:28:56 ubuntu-18 systemd[1]: docker.service: Start request repeated too quickly.
9月 22 14:28:56 ubuntu-18 systemd[1]: docker.service: Failed with result 'exit-code'.
9月 22 14:28:56 ubuntu-18 systemd[1]: Failed to start Docker Application Container Engine.
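The journal output above is long, and the decisive part is the Go panic in the middle of it. As a rough sketch, the panic and the lines that follow it can be filtered out of the journal text. The function name `show_docker_panic` is made up here for illustration and is not part of docker or systemd.

```shell
#!/bin/sh
# show_docker_panic: hypothetical helper, not part of docker or systemd.
# Reads journal text on stdin and prints each "panic:" line plus the
# N lines that follow it (default 5), which is usually enough to see
# which function crashed.
show_docker_panic() {
    grep -A "${1:-5}" 'panic:'
}

# Example usage (needs a systemd host with a docker unit):
#   sudo journalctl -u docker --no-pager | show_docker_panic 8
```

In the log above, this would surface the nil pointer dereference and the `libnetwork` frames directly, without scrolling past the repeated DNS messages.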
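2. Solution
I don't know docker in depth; I can carry out some operations and understand the general picture. I first rebooted the virtual machine several times and the PC once. After confirming that rebooting did not help, I searched online for answers.
Similar problems are reported online, though the cause may differ from person to person. I found 3 approaches.
Method 1: restart the docker service.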
sudo systemctl restart docker.service
Method 2: remove the old docker, then install a new one.
systemctl stop docker
rm -rf /var/lib/docker
systemctl start docker
# uninstall/install docker
Method 3: delete docker's runtime files and restart docker.
# danger, read the entire text around this code before running
# you will lose data
sudo -s
systemctl stop docker
rm -rf /var/lib/docker
systemctl start docker
exit
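I tried method 1 and it did not solve my problem. Method 2 seemed like too much trouble, so I went straight to method 3, which solved my problem.
When I carried out method 3, `rm -rf /var/lib/docker` ran into resources that were still in use or mounted. So I disabled the docker service first, rebooted the virtual machine, and then deleted the directory, which went smoothly. After deleting, I enabled the docker service again and rebooted the virtual machine once more. The docker service then tested fine, but as the comments say, all previous docker images and containers are lost. If you cannot accept that, work out a way to preserve them beforehand.
My procedure was: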
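To see whether anything is still mounted under the directory before deleting it, the mount table can be checked. The following is only a sketch: the helper name `mounts_under` is hypothetical, and it assumes a Linux system where `/proc/mounts` lists the mount point in the second field.

```shell
#!/bin/sh
# mounts_under: hypothetical helper for illustration.
# Prints every mount point located at or under the given directory.
# $1 = directory, $2 = mounts table to read (defaults to /proc/mounts).
mounts_under() {
    awk -v prefix="$1" '
        # Field 2 of /proc/mounts is the mount point.
        index($2, prefix "/") == 1 || $2 == prefix { print $2 }
    ' "${2:-/proc/mounts}"
}

# Example usage: an empty result suggests nothing is mounted there any more.
#   mounts_under /var/lib/docker
```

If this still prints entries, deleting the directory is likely to hit the "in use or mounted" problem described above, which is what disabling the service and rebooting avoided in my case.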
# danger, read the entire text around this code before running
# you will lose data
sudo systemctl disable docker
# reboot the virtual machine
sudo rm -rf /var/lib/docker
sudo systemctl enable docker
# reboot the virtual machine again
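For readers who cannot afford to lose the old data outright, one variation that I did not use myself is to rename the directory instead of deleting it, so the daemon starts from a fresh data directory while the old files stay on disk. This is a sketch under assumptions: the helper name `move_aside` is hypothetical, the rename frees no disk space, and there is no guarantee that images or containers can be recovered from the renamed copy.

```shell
#!/bin/sh
# move_aside: hypothetical helper for illustration.
# Renames a directory to "<dir>.bak-<timestamp>" instead of deleting it,
# and prints the new name. Fails if the directory does not exist.
move_aside() {
    dir=${1%/}                       # drop a trailing slash, if any
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    backup="$dir.bak-$(date +%Y%m%d-%H%M%S)"
    mv -- "$dir" "$backup" && printf '%s\n' "$backup"
}

# Example usage, as root, with the docker service disabled and the VM
# rebooted so that nothing under the directory is in use:
#   move_aside /var/lib/docker
```

The renamed directory can be deleted later once it is clear that nothing in it is needed.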
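3. Conclusion
The Windows host crashed with blue screens, which left the running docker containers in an abnormal state. That in turn broke the docker service, and every docker command hung. The fix was to delete the data docker had produced (containers and images) and start again, at the cost of losing those containers and images.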