Michael Evans
2019-Jan-12 03:59 UTC
[Bridge] regression in 4.20.0: physical to VETH (LXC) network bridge (unsure if cgroups is related)
After updating to Linux 4.20.0 (along with a full system update), my BRIDGED network connections to some LXC containers stopped working. I have since reverted to the working LTS kernel image offered by Arch Linux (4.19.13), but I am willing to re-test and gather additional data during a couple of lower-use time periods during the week.

Attempting to troubleshoot this issue also produced extremely odd results; I think offhand that it MIGHT have caused network packets to fill up some kind of memory buffer instead of being relayed or dropped. There are additional details in the serverfault question and the LXC issue that I filed, as it was initially (and still is) unclear where the actual issue is.

At this time I am unsure whether it is related to netdev (bridge, veth), cgroups, or some changed default that now needs to be configured differently than before.

https://serverfault.com/questions/947848/linux-bridge-broken-after-upgrade-out-of-ideas-places-to-look-now-4-20-0-arc
https://github.com/lxc/lxc/issues/2769

Also now filed:
https://bugzilla.kernel.org/show_bug.cgi?id=202235

* It is NOT related to IP forwarding. This is a BRIDGED connection, not a routed one, and it works on older kernels without that enabled.
* Physical network to bridge works (and will stay connected for a few minutes after later troubleshooting steps, even if ARP caches / ping flake out and stop responding).
* VETH (within LXC) can ping the host IP on the bridge if manually assigned a static address, but it cannot ping the gateway (which the host can reach before this step). Doing this seems to cause general instability and a timed-out SSH session.

This led me to reboot between each round of testing to ensure I had a clean slate to start with.

The major settings I checked are covered in the other two bug reports, but I'm open to checking other values and/or performing different kinds of tests occasionally over a given week.
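For comparing state between the working 4.19.13 boot and the failing 4.20.0 boot, a snapshot of a few bridge-related values could be captured on each kernel and diffed. The following is only a minimal sketch and was not part of the original testing. The function name `bridge_snapshot`, the bridge name `br0`, and the choice of values are assumptions made for illustration. It reads the standard sysfs and procfs locations (`/sys/class/net/<bridge>/brif`, `operstate`, and the `net.ipv4.ip_forward` and `net.bridge.bridge-nf-call-*` sysctls), and reports `None` for anything that does not exist, for example when `br_netfilter` is not loaded.

```python
from pathlib import Path


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def bridge_snapshot(bridge, sys_root="/sys", proc_root="/proc"):
    """Collect bridge-related values for comparison across kernel versions.

    `bridge` is the bridge interface name (hypothetical example: "br0").
    `sys_root` and `proc_root` are overridable so the function can be
    exercised against a fake directory tree.
    """
    net = Path(sys_root) / "class" / "net"
    brif = net / bridge / "brif"
    # Each entry under brif/ is named after an interface enslaved to the bridge.
    ports = sorted(p.name for p in brif.iterdir()) if brif.is_dir() else []
    sysctl = Path(proc_root) / "sys" / "net"
    return {
        "bridge": bridge,
        "bridge_operstate": read_text(net / bridge / "operstate"),
        "ports": {p: read_text(net / p / "operstate") for p in ports},
        "ip_forward": read_text(sysctl / "ipv4" / "ip_forward"),
        # These exist only when the br_netfilter module is loaded.
        "bridge-nf-call-iptables": read_text(
            sysctl / "bridge" / "bridge-nf-call-iptables"
        ),
        "bridge-nf-call-arptables": read_text(
            sysctl / "bridge" / "bridge-nf-call-arptables"
        ),
    }
```

Running something like `print(json.dumps(bridge_snapshot("br0"), indent=2, sort_keys=True))` (with `import json`) under each kernel and diffing the two outputs would show whether a port went missing, changed operstate, or whether one of the netfilter bridge sysctls differs between boots.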
Responses won't be immediate but I'll try to check on this frequently over the next two weeks.