Ihar Hrachyshka
2018-May-10 18:53 UTC
[libvirt-users] e1000 network interface takes a long time to set the link ready
Hi,

In kubevirt, we discovered [1] that whenever e1000 is used for vNIC, link on the interface becomes ready several seconds after 'ifup' is executed, which for some buggy images like cirros may slow down boot process for up to 1 minute [2]. If we switch from e1000 to virtio, the link is brought up and ready almost immediately.

For the record, I am using the following versions:
- L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
- libvirt: 3.7.0-4.fc27
- guest kernel: 4.4.0-28-generic #47-Ubuntu

Is there something specific about e1000 that makes it initialize the link too slowly on libvirt or guest side?

[1] https://github.com/kubevirt/kubevirt/issues/936
[2] https://bugs.launchpad.net/cirros/+bug/1768955

Ihar
Daniel Romero
2018-May-10 18:57 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
Hi,

try to use virtio instead...

Atte.
Daniel Romero P.

On Thu, May 10, 2018 at 3:53 PM, Ihar Hrachyshka <ihrachys@redhat.com> wrote:
> Hi,
>
> In kubevirt, we discovered [1] that whenever e1000 is used for vNIC,
> link on the interface becomes ready several seconds after 'ifup' is
> executed, which for some buggy images like cirros may slow down boot
> process for up to 1 minute [2]. If we switch from e1000 to virtio, the
> link is brought up and ready almost immediately.
>
> For the record, I am using the following versions:
> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
> - libvirt: 3.7.0-4.fc27
> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>
> Is there something specific about e1000 that makes it initialize the
> link too slowly on libvirt or guest side?
>
> [1] https://github.com/kubevirt/kubevirt/issues/936
> [2] https://bugs.launchpad.net/cirros/+bug/1768955
>
> Ihar
Ihar Hrachyshka
2018-May-10 19:06 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
On Thu, May 10, 2018 at 11:57 AM, Daniel Romero <romero.cl@gmail.com> wrote:
> Hi,
>
> try to use virtio instead...

That is exactly what I tried, and indeed the link is ready almost immediately. But there are some issues with virtio, like missing drivers in default Windows images. Kubevirt doesn't currently allow choosing the type of the NIC (arguably it should), and its current default type is e1000. Just switching the default to virtio won't fly with Windows, so unless we expose the choice of the NIC type to kubevirt users, they are affected by one issue (slow cirros boot) or another (no networking in Windows machines).

While we are going to explore exposing the choice in the next kubevirt versions, it would be nice to have cirros behave correctly regardless of NIC type.

Ihar
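For illustration, one way a guest-side script could tolerate a slow link regardless of NIC type is to poll the carrier flag that Linux exposes in sysfs before starting DHCP. This is only a sketch and not how cirros actually brings up its network; the function name, the timeout values and the `sysfs_root` parameter (there so the code can be tried against a scratch directory) are assumptions made for the example.

```python
import time
from pathlib import Path


def wait_for_carrier(iface: str, timeout: float = 10.0, interval: float = 0.1,
                     sysfs_root: str = "/sys/class/net") -> bool:
    """Poll <sysfs_root>/<iface>/carrier until it reads "1" or the timeout expires.

    Returns True if the link came up in time, False otherwise.
    sysfs_root is a hypothetical knob for testing; on a real guest the
    default path is the one the kernel provides.
    """
    carrier = Path(sysfs_root) / iface / "carrier"
    deadline = time.monotonic() + timeout
    while True:
        try:
            if carrier.read_text().strip() == "1":
                return True
        except OSError:
            # The read fails while the interface is administratively down
            # or does not exist yet; treat that as "no link yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Wait up to 10 seconds for eth0 before handing over to the DHCP client.
    print("link ready" if wait_for_carrier("eth0") else "link still down")
```

With a wait like this, a 2 second delay on e1000 costs 2 seconds rather than a full retry cycle of whatever runs next.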
Laine Stump
2018-May-10 21:07 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
On 05/10/2018 02:53 PM, Ihar Hrachyshka wrote:
> Hi,
>
> In kubevirt, we discovered [1] that whenever e1000 is used for vNIC,
> link on the interface becomes ready several seconds after 'ifup' is
> executed

What is your definition of "becomes ready"? Are you looking at the output of "ip link show" in the guest? Or are you watching "brctl showstp" for the bridge device on the host? Or something else?

> which for some buggy images like cirros may slow down boot
> process for up to 1 minute [2]. If we switch from e1000 to virtio, the
> link is brought up and ready almost immediately.
>
> For the record, I am using the following versions:
> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
> - libvirt: 3.7.0-4.fc27
> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>
> Is there something specific about e1000 that makes it initialize the
> link too slowly on libvirt or guest side?

There isn't anything libvirt could do that would cause the link to go IFF_UP any faster or slower, so if there is an issue it's elsewhere. Since switching to the virtio device eliminates the problem, my guess would be that it's something about the implementation of the emulated device in qemu that is causing a delay in the e1000 driver in the guest. That's just a guess though.

>
> [1] https://github.com/kubevirt/kubevirt/issues/936
> [2] https://bugs.launchpad.net/cirros/+bug/1768955

(I discount the idea of the stp delay timer having an effect, as suggested in one of the comments on github that points to my explanation of STP in a libvirt bugzilla record, because that would cause the same problem for e1000 or virtio).

I hesitate to suggest this, because the rtl8139 code in qemu is considered less well maintained and lower performance than e1000, but have you tried setting that model to see how it behaves? You may be forced to make that the default when virtio isn't available.

Another thought - I guess the virtio driver in Cirros is always available? Perhaps kubevirt could use libosinfo to auto-decide what device to use for networking based on OS.
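The idea of picking the NIC model per guest OS could look roughly like the sketch below. It does not query libosinfo; a small hand-written table stands in for that lookup, and the table contents, the function name and the bridge name "br0" are all assumptions for the example. The only libvirt-specific part is the `<model type='...'/>` element inside `<interface>`, which is how the domain XML selects the emulated NIC.

```python
import xml.etree.ElementTree as ET

# Hypothetical preference table standing in for a libosinfo query.
# virtio where the guest is known to ship the driver, e1000 otherwise.
NIC_MODEL_BY_OS = {
    "cirros": "virtio",
    "linux": "virtio",
    "windows": "e1000",
}
DEFAULT_MODEL = "e1000"  # the current default mentioned in this thread


def interface_xml(os_family: str, bridge: str = "br0") -> str:
    """Build a libvirt <interface> element with a NIC model chosen by guest OS."""
    model = NIC_MODEL_BY_OS.get(os_family.lower(), DEFAULT_MODEL)
    iface = ET.Element("interface", {"type": "bridge"})
    ET.SubElement(iface, "source", {"bridge": bridge})
    ET.SubElement(iface, "model", {"type": model})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    for guest in ("cirros", "windows", "unknown-os"):
        print(guest, "->", interface_xml(guest))
```

A static table like this goes stale as images change, which is the argument for asking libosinfo (or letting the user tag the NIC type explicitly) instead.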
Ihar Hrachyshka
2018-May-10 21:44 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
On Thu, May 10, 2018 at 2:07 PM, Laine Stump <laine@redhat.com> wrote:
> On 05/10/2018 02:53 PM, Ihar Hrachyshka wrote:
>> Hi,
>>
>> In kubevirt, we discovered [1] that whenever e1000 is used for vNIC,
>> link on the interface becomes ready several seconds after 'ifup' is
>> executed
>
> What is your definition of "becomes ready"? Are you looking at the
> output of "ip link show" in the guest? Or are you watching "brctl
> showstp" for the bridge device on the host? Or something else?

I was watching the guest dmesg for the following messages:

[ 4.773275] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 6.769235] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 6.771408] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

For e1000, there are 2 seconds in between those messages; for virtio, it's near instant. Interestingly, it happens only on the very first ifup; when I do it a second time after the guest has booted, it's instant.

>
>> which for some buggy images like cirros may slow down boot
>> process for up to 1 minute [2]. If we switch from e1000 to virtio, the
>> link is brought up and ready almost immediately.
>>
>> For the record, I am using the following versions:
>> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
>> - libvirt: 3.7.0-4.fc27
>> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>>
>> Is there something specific about e1000 that makes it initialize the
>> link too slowly on libvirt or guest side?
>
> There isn't anything libvirt could do that would cause the link to
> IFF_UP up any faster or slower, so if there is an issue it's elsewhere.
> Since switching to the virtio device eliminates the problem, my guess
> would be that it's something about the implementation of the emulated
> device in qemu that is causing a delay in the e1000 driver in the guest.
> That's just a guess though.
>
>>
>> [1] https://github.com/kubevirt/kubevirt/issues/936
>> [2] https://bugs.launchpad.net/cirros/+bug/1768955
>
> (I discount the idea of the stp delay timer having an effect, as
> suggested in one of the comments on github that points to my explanation
> of STP in a libvirt bugzilla record, because that would cause the same
> problem for e1000 or virtio).

Yes, it's not STP, and I also tried to explicitly set all bridge timers to 0 with no result. I also did "tcpdump -i any" inside the container that hosts the VM VIF, and there was no relevant traffic on the tap device.

>
> I hesitate to suggest this, because the rtl8139 code in qemu is
> considered less well maintained and lower performance than e1000, but
> have you tried setting that model to see how it behaves? You may be
> forced to make that the default when virtio isn't available.

Indeed rtl8139 is near instant too:

[ 4.156872] 8139cp 0000:07:01.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
[ 4.177520] 8139cp 0000:07:01.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1

Thanks for the tip, we will consider it too (also thanks for the background info about the driver support state).

>
> Another thought - I guess the virtio driver in Cirros is always
> available? Perhaps kubevirt could use libosinfo to auto-decide what
> device to use for networking based on OS.
>

This, or we can introduce explicit tags for NICs / guest type to use.

Thanks a lot for the reply,
Ihar
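To compare NIC models without eyeballing timestamps, the delay between the two ADDRCONF messages can be pulled out of dmesg output with a few lines of Python. This is a sketch that assumes the message wording shown in the e1000 excerpt above; the function name and the sample constant are made up for the example, and other kernels may word these messages differently.

```python
import re

# Matches "[ 4.773275] message" style dmesg lines.
LINE_RE = re.compile(r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)")

SAMPLE_DMESG = """\
[ 4.773275] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 6.769235] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 6.771408] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
"""


def link_ready_delay(dmesg_text, iface="eth0"):
    """Return seconds between 'link is not ready' and 'link becomes ready'.

    Returns None if either message is missing for the given interface.
    """
    not_ready = None
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts = float(match.group("ts"))
        msg = match.group("msg")
        if not_ready is None and f"ADDRCONF(NETDEV_UP): {iface}: link is not ready" in msg:
            not_ready = ts
        elif not_ready is not None and f"ADDRCONF(NETDEV_CHANGE): {iface}: link becomes ready" in msg:
            return ts - not_ready
    return None


if __name__ == "__main__":
    delay = link_ready_delay(SAMPLE_DMESG)
    print(f"link ready after {delay:.3f} s")  # about 2 seconds for the e1000 sample
```

Running the same function over dmesg captured with each NIC model gives one number per model to attach to a bug report.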
Daniel P. Berrangé
2018-May-11 08:42 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
On Thu, May 10, 2018 at 11:53:23AM -0700, Ihar Hrachyshka wrote:
> Hi,
>
> In kubevirt, we discovered [1] that whenever e1000 is used for vNIC,
> link on the interface becomes ready several seconds after 'ifup' is
> executed, which for some buggy images like cirros may slow down boot
> process for up to 1 minute [2]. If we switch from e1000 to virtio, the
> link is brought up and ready almost immediately.
>
> For the record, I am using the following versions:
> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
> - libvirt: 3.7.0-4.fc27
> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>
> Is there something specific about e1000 that makes it initialize the
> link too slowly on libvirt or guest side?

Try the e1000e device instead perhaps.

If all other NIC models work, then this is likely to be a QEMU problem and should be reported as a bug to them. I notice you're running Fedora 27 though, so before reporting bugs please try with latest upstream QEMU releases (2.12) to see if that's better.

> [1] https://github.com/kubevirt/kubevirt/issues/936
> [2] https://bugs.launchpad.net/cirros/+bug/1768955

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Ihar Hrachyshka
2018-May-11 18:34 UTC
Re: [libvirt-users] e1000 network interface takes a long time to set the link ready
On Fri, May 11, 2018 at 1:42 AM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, May 10, 2018 at 11:53:23AM -0700, Ihar Hrachyshka wrote:
>> Hi,
>>
>> In kubevirt, we discovered [1] that whenever e1000 is used for vNIC,
>> link on the interface becomes ready several seconds after 'ifup' is
>> executed, which for some buggy images like cirros may slow down boot
>> process for up to 1 minute [2]. If we switch from e1000 to virtio, the
>> link is brought up and ready almost immediately.
>>
>> For the record, I am using the following versions:
>> - L0 kernel: 4.16.5-200.fc27.x86_64 #1 SMP
>> - libvirt: 3.7.0-4.fc27
>> - guest kernel: 4.4.0-28-generic #47-Ubuntu
>>
>> Is there something specific about e1000 that makes it initialize the
>> link too slowly on libvirt or guest side?
>
> Try the e1000e device instead perhaps.

Thanks a lot for the suggestion, it works indeed. My understanding is that it's the default NIC for q35 machines starting with 2.12, so indeed that's a great choice.

>
> If all other NIC models work, then this is likely to be a QEMU problem
> and should be reported as a bug to them. I notice you're running Fedora 27
> though, so before reporting bugs please try with latest upstream QEMU
> releases (2.12) to see if that's better

Thanks for the suggestion. I reported a bug here: https://bugs.launchpad.net/qemu/+bug/1770724

I tried to reproduce it with 2.12 (built the kubevirt stack with Fedora 29 packages), but I hit some fundamental issues in the guest that block me from reproducing the slow link ready bug (with the new qemu / libvirt stack, I get kernel traces and an irq interrupt error, and no network link at all in the guest). I hope my report against 2.10 will still fit their bug report requirements.

Thanks again,
Ihar