Jonathan Rudenberg
2014-Jul-26 21:05 UTC
[libvirt-users] lxc arp doesn't work at start for several seconds
I’m running into an issue with libvirt-lxc networking. I have an init program that configures the eth0 interface with an IP and gateway when the container starts. I noticed that programs running in the container encountered “no route to host” errors and looked into it further. What I found is that ARP packets are not reaching the gateway during the first few seconds of the container’s life.

I have created a repro case that demonstrates this issue: https://github.com/titanous/libvirt-arp-bug

All it does is configure eth0 and then ARP for the bridge IP. There are no ARP responses for ~4s, and then everything starts working. The linked repo also contains pcap files from the veth and the bridge showing that the ARP packets are sent over the veth but don’t show up on the bridge.

I’ve tested this and run into the issue 100% of the time on Ubuntu 14.04 with libvirt 1.2.2 and Linux 3.13.0-32, as well as on Fedora 20 with libvirt 1.1.3.5 and Linux 3.15.6-200.

Any help would be appreciated, and I’m happy to provide more details if they would be useful.

Thanks,
Jonathan
Jonathan Rudenberg
2014-Jul-27 01:09 UTC
Re: [libvirt-users] lxc arp doesn't work at start for several seconds
On Jul 26, 2014, at 2:05 PM, Jonathan Rudenberg <jonathan@titanous.com> wrote:

> I’m running into an issue with libvirt-lxc networking. I have an init program that configures the eth0 interface with an IP and gateway when the container starts. I noticed that programs running in the container encountered “no route to host” errors and looked into it further. What I found is that ARP packets are not making it onto the gateway during the first few seconds of the container’s life.

I found the issue: STP was enabled on virbr0, which by default spends 2s in each of the Listening and Learning states before enabling the interface.

A simple `brctl stp virbr0 off` solves this issue.

Jonathan
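As a rough check on the timing described above, the sketch below adds up the two STP waiting states. It assumes the standard spanning-tree sequence in which a new port passes through Listening and then Learning, each lasting one forward delay, before it forwards frames. The function name is hypothetical and only for illustration; the 2s forward delay is the figure quoted in the message.

```python
def seconds_until_forwarding(forward_delay_s: float, stp_enabled: bool = True) -> float:
    """Estimate how long a newly attached bridge port waits before forwarding.

    Assumption: with STP on, the port spends one forward delay in Listening
    and one in Learning. With STP off, it forwards immediately.
    """
    if not stp_enabled:
        return 0.0
    listening = forward_delay_s
    learning = forward_delay_s
    return listening + learning


if __name__ == "__main__":
    # 2s per state, as quoted in the message above
    print(seconds_until_forwarding(2.0))
    # STP disabled, e.g. after `brctl stp virbr0 off`
    print(seconds_until_forwarding(2.0, stp_enabled=False))
```

With a 2s forward delay this gives 4s, which lines up with the ~4s of missing ARP responses reported in the original message.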
Laine Stump
2014-Jul-29 15:39 UTC
Re: [libvirt-users] lxc arp doesn't work at start for several seconds
On 07/26/2014 09:09 PM, Jonathan Rudenberg wrote:

> On Jul 26, 2014, at 2:05 PM, Jonathan Rudenberg <jonathan@titanous.com> wrote:
>
>> I’m running into an issue with libvirt-lxc networking. I have an init program that configures the eth0 interface with an IP and gateway when the container starts. I noticed that programs running in the container encountered “no route to host” errors and looked into it further. What I found is that ARP packets are not making it onto the gateway during the first few seconds of the container’s life.
>
> I found the issue: STP was enabled on the virbr0 which spends 2s in each of the Listening and Learning states by default before enabling the interface.
>
> A simple `brctl stp virbr0 off` solves this issue.

To avoid needing to set it manually in the future, you can set it in libvirt's default network configuration. Just edit it:

    virsh net-edit default

and change the <bridge> line to this:

    <bridge name='virbr0' stp='off'/>

I'm surprised that leaving stp='on' with delay='0' would still create this behavior - can you verify that this is the current setting for your default network? (Use "virsh net-dumpxml default" to see the current setting.)
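For checking the setting asked about above, a minimal sketch of reading the stp and delay attributes out of network XML might look like this. The sample XML is an illustrative stand-in, not output captured from a real host, and the helper name is hypothetical. The fallback values used when an attribute is absent (stp 'on', delay '0') reflect my understanding of libvirt's documented defaults and should be treated as an assumption.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for the output of `virsh net-dumpxml default`
SAMPLE_XML = """
<network>
  <name>default</name>
  <bridge name='virbr0' stp='on' delay='0'/>
</network>
"""


def bridge_stp_settings(network_xml: str) -> dict:
    """Return the name, stp and delay attributes of the <bridge> element."""
    root = ET.fromstring(network_xml)
    bridge = root.find("bridge")
    if bridge is None:
        raise ValueError("no <bridge> element found in network XML")
    return {
        "name": bridge.get("name"),
        # Assumed defaults when the attribute is omitted
        "stp": bridge.get("stp", "on"),
        "delay": bridge.get("delay", "0"),
    }


if __name__ == "__main__":
    print(bridge_stp_settings(SAMPLE_XML))
```

On a real host, the XML string could instead come from running `virsh net-dumpxml default` and passing its output to the same function.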