Issue
I need help understanding my networking logs due to docker-compose
networking.
I'm ssh'd into a VM, and I have two projects with docker-compose. The first is launched simply with docker-compose up
. When I try to launch the second, my ssh session freezes, and I can no longer ssh into the VM. After lots of trial and error, and after reading this I tried to append to my 2nd project's docker-compose.yml
file the following:
networks:
default:
external:
name: abcdef_default
where abcdef_default
is the name of the network created by docker-compose up
of the 1st project. With this, the docker-compose up
on the 2nd project doesn't kick me out of the ssh session.
I tailed the logs in /var/log/*.log
, and here's the output with the networks section in the docker-compose.yml
file (without the timestamp prefix: Jan 19 09:13:42 hostname kernel: [420096.305357]
):
aufs au_opts_verify:1597:dockerd[13813]: dirperm1 breaks the protection by the permission bits on the lower branch
device veth6a84537 entered promiscuous mode
IPv6: ADDRCONF(NETDEV_UP): veth6a84537: link is not ready
eth0: renamed from veth2480623
IPv6: ADDRCONF(NETDEV_CHANGE): veth6a84537: link becomes ready
br-fe0deb0149df: port 18(veth6a84537) entered forwarding state
br-fe0deb0149df: port 18(veth6a84537) entered forwarding state
aufs au_opts_verify:1597:dockerd[25317]: dirperm1 breaks the protection by the permission bits on the lower branch
device veth1a3c1e3 entered promiscuous mode
IPv6: ADDRCONF(NETDEV_UP): veth1a3c1e3: link is not ready
br-fe0deb0149df: port 22(veth1a3c1e3) entered forwarding state
br-fe0deb0149df: port 22(veth1a3c1e3) entered forwarding state
eth0: renamed from veth54e576d
IPv6: ADDRCONF(NETDEV_CHANGE): veth1a3c1e3: link becomes ready
br-fe0deb0149df: port 22(veth1a3c1e3) entered disabled state
veth54e576d: renamed from eth0
br-fe0deb0149df: port 22(veth1a3c1e3) entered disabled state
device veth1a3c1e3 left promiscuous mode
br-fe0deb0149df: port 22(veth1a3c1e3) entered disabled state
br-fe0deb0149df: port 18(veth6a84537) entered forwarding state
and here's the output without the networks
section (i.e. when I get kicked out of the ssh session):
IPv6: ADDRCONF(NETDEV_UP): br-55349b03453a: link is not ready
aufs au_opts_verify:1597:dockerd[26982]: dirperm1 breaks the protection by the permission bits on the lower branch
aufs au_opts_verify:1597:dockerd[26982]: dirperm1 breaks the protection by the permission bits on the lower branch
aufs au_opts_verify:1597:dockerd[3051]: dirperm1 breaks the protection by the permission bits on the lower branch
device veth7a1bcde entered promiscuous mode
IPv6: ADDRCONF(NETDEV_UP): veth7a1bcde: link is not ready
br-55349b03453a: port 1(veth7a1bcde) entered forwarding state
br-55349b03453a: port 1(veth7a1bcde) entered forwarding state
br-55349b03453a: port 1(veth7a1bcde) entered disabled state
eth0: renamed from veth5d8a2ea
IPv6: ADDRCONF(NETDEV_CHANGE): veth7a1bcde: link becomes ready
br-55349b03453a: port 1(veth7a1bcde) entered forwarding state
br-55349b03453a: port 1(veth7a1bcde) entered forwarding state
IPv6: ADDRCONF(NETDEV_CHANGE): br-55349b03453a: link becomes ready
aufs au_opts_verify:1597:dockerd[13814]: dirperm1 breaks the protection by the permission bits on the lower branch
aufs au_opts_verify:1597:dockerd[13814]: dirperm1 breaks the protection by the permission bits on the lower branch
aufs au_opts_verify:1597:dockerd[13922]: dirperm1 breaks the protection by the permission bits on the lower branch
device veth3253bd4 entered promiscuous mode
IPv6: ADDRCONF(NETDEV_UP): veth3253bd4: link is not ready
br-55349b03453a: port 2(veth3253bd4) entered forwarding state
br-55349b03453a: port 2(veth3253bd4) entered forwarding state
br-55349b03453a: port 2(veth3253bd4) entered disabled state
eth0: renamed from veth9c8aaa3
IPv6: ADDRCONF(NETDEV_CHANGE): veth3253bd4: link becomes ready
br-55349b03453a: port 2(veth3253bd4) entered forwarding state
br-55349b03453a: port 2(veth3253bd4) entered forwarding state
br-55349b03453a: port 2(veth3253bd4) entered disabled state
veth9c8aaa3: renamed from eth0
br-55349b03453a: port 2(veth3253bd4) entered disabled state
device veth3253bd4 left promiscuous mode
br-55349b03453a: port 2(veth3253bd4) entered disabled state
br-55349b03453a: port 1(veth7a1bcde) entered forwarding state
br-55349b03453a: port 1(veth7a1bcde) entered disabled state
veth5d8a2ea: renamed from eth0
br-55349b03453a: port 1(veth7a1bcde) entered disabled state
device veth7a1bcde left promiscuous mode
br-55349b03453a: port 1(veth7a1bcde) entered disabled state
I don't really understand how to read these logs.
Here is my ifconfig
also.
Can someone help me read the logs and figure out what the problem is?
Solution
Diagnosis
Our team is using AWS EC2 instances running Ubuntu 18.04 as devservers. We recently got reports that docker-compose broke SSH connections. Even after restarting, the devservers are still inaccessible. So I started investigation.
I was able to exclude the cause of docker-compose by reproducing using docker only.
ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
aca5884d60f146cef81ac55c8cccd231a43f40927d645168642d9b28c5e009a6
ubuntu@ip-172-31-115-116:~$ docker network prune
WARNING! This will remove all custom networks not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Networks:
my-bridge-network
ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
f0a7a06a9627bc2de00eb60091a92010451690626d95e077f622f3058cc3a07c
ubuntu@ip-172-31-115-116:~$ docker network prune
WARNING! This will remove all custom networks not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Networks:
my-bridge-network
ubuntu@ip-172-31-115-116:~$ docker network create -d bridge my-bridge-network
Connection reset by 172.31.115.116 port 22
Then the root cause occurred to me.
Root cause
- Our docker-compose files are using the bridge network mode which will create a new bridge network by default. When
docker-compose down
ordocker network prune
is run, the bridge network will be torn down. And the nextdocker-compose run
ordocker network create
will create a new bridge network. - The default IP range for the docker0 bridge adapter is
172.17.0.0/16
. - When I first ran the
docker network create -d bridge my-bridge-network
command, it created a new bridge adapter for172.18.0.0/16
. - The second bridge adapter was created for
172.19.0.0/16
. - Naturally, the 3rd bridge adapter is created for
172.20.0.0/16
. However, that is our Engineering VPN IP range. Therefore the overlap caused the server unable to communicate with our laptop.
Solutions
The solution is to make sure new docker bridge networks will skip our VPN IP range.
Temp solution
If we add the skipped IP ranges to system route table, docker will automatically skip them. Therefore, we can run the below script whenever the devserver got rebooted.
sudo route add -net [our VPN IP range] netmask 255.255.0.0 gw [our gateway]
This solution is imperfect that the new routes will be discarded after restarting the machine.
Main solution
We should permanently apply the route changes to all devservers.
echo " routes:" | sudo tee -a /etc/netplan/50-cloud-init.yaml
echo " - to: [our VPN IP range]" | sudo tee -a /etc/netplan/50-cloud-init.yaml
echo " via: [our gateway]" | sudo tee -a /etc/netplan/50-cloud-init.yaml
sudo netplan apply
Docker IP changes
We also plan to modify the docker default-address-pools to redefine docker IP ranges. Refer to https://github.com/docker/compose/issues/4336#issuecomment-457326123. I would say modifying /etc/docker/daemon.json
is better.
Answered By - Shan Huang