Philip Prindeville
2017-Oct-27 18:50 UTC
[CentOS-virt] Fwd: Network interface regression on F26 VM after 4.13/4.12 kernel update
I did not hear back on this posting so I figured I was addressing the wrong audience. Maybe someone on the host-side better understands how the 4.12 kernel is interacting with KVM.

Thanks,

-Philip

> Begin forwarded message:
>
> From: Philip Prindeville <philipp_subx at redfish-solutions.com>
> Subject: Network interface regression on F26 VM after 4.13/4.12 kernel update
> Date: October 26, 2017 at 4:16:53 PM MDT
> To: devel at lists.fedoraproject.org
> Reply-To: Development discussions related to Fedora <devel at lists.fedoraproject.org>
>
> I was running F25 (4.10) on a VM inside KVM/Qemu/libvirt on CentOS 7.3 (updated).
>
> Then I upgraded it (via dnf system-upgrade) to F26 and 4.11 and it was still working well, as I recall.
>
> Then I upgraded it again to 4.13 and now I'm seeing flakiness in the network: the NIC will randomly come up and go down.
>
> Right now I'm seeing:
>
> $ ip link show
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 2: ens3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
>     link/ether 52:54:00:29:01:5b brd ff:ff:ff:ff:ff:ff
> $
>
> my messages file shows:
>
> Oct 26 14:25:51 son-of-builder kernel: igbvf 0000:00:03.0: Link is Down
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.0757] device (ens3): state change: activated -> unavailable (reason 'carrier-changed', internal state 'managed')
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=filter family=2 entries=86
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.0932] dhcp4 (ens3): canceled DHCP transaction, DHCP client pid 8008
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=nat family=2 entries=52
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=mangle family=2 entries=40
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=raw family=2 entries=29
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.0933] dhcp4 (ens3): state changed bound -> done
> Oct 26 14:25:56 son-of-builder avahi-daemon[756]: Withdrawing address record for 192.168.1.56 on ens3.
> Oct 26 14:25:56 son-of-builder avahi-daemon[756]: Leaving mDNS multicast group on interface ens3.IPv4 with address 192.168.1.56.
> Oct 26 14:25:56 son-of-builder avahi-daemon[756]: Interface ens3.IPv4 no longer relevant for mDNS.
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:4 'connectivity-change': new request (5 scripts)
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:4 'connectivity-change': start running ordered scripts...
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=filter family=10 entries=87
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=nat family=10 entries=52
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=mangle family=10 entries=40
> Oct 26 14:25:56 son-of-builder audit: NETFILTER_CFG table=raw family=10 entries=30
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.1140] manager: NetworkManager state is now CONNECTED_LOCAL
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.1145] manager: NetworkManager state is now DISCONNECTED
> Oct 26 14:25:56 son-of-builder NetworkManager[824]: <info> [1509049556.1256] policy: set-hostname: set hostname to 'localhost.localdomain' (no default device)
> Oct 26 14:25:56 son-of-builder systemd-hostnamed[8026]: Changed host name to 'localhost.localdomain'
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:5 'down' [ens3]: new request (5 scripts)
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:6 'hostname': new request (5 scripts)
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:5 'down' [ens3]: start running ordered scripts...
> Oct 26 14:25:56 son-of-builder nm-dispatcher[8018]: req:6 'hostname': start running ordered scripts...
> Oct 26 14:26:06 son-of-builder audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
> Oct 26 14:26:21 son-of-builder NetworkManager[824]: <info> [1509049581.0808] connectivity: (ens3) timed out
> Oct 26 14:26:21 son-of-builder dbus-daemon[761]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.10' (uid=0 pid=824 comm="/usr/sbin/NetworkManager --no-daemon " label="system_u:system_r:NetworkManager_t:s0")
> Oct 26 14:26:21 son-of-builder systemd[1]: Starting Network Manager Script Dispatcher Service...
> Oct 26 14:26:21 son-of-builder dbus-daemon[761]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
> Oct 26 14:26:21 son-of-builder systemd[1]: Started Network Manager Script Dispatcher Service.
> Oct 26 14:26:21 son-of-builder audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
> Oct 26 14:26:21 son-of-builder nm-dispatcher[8169]: req:1 'connectivity-change': new request (5 scripts)
> Oct 26 14:26:21 son-of-builder nm-dispatcher[8169]: req:1 'connectivity-change': start running ordered scripts...
> Oct 26 14:26:26 son-of-builder audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
> Oct 26 14:26:31 son-of-builder audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
>
> Before the network configuration used to be rock-solid. I'm running on a Xeon D-1548 SoC with an on-chip X552/557 (ixgbe.ko) and an off-chip i350 (igb.ko) quad-NIC. In this case, I'm using the first port of the i350.
>
> The VM's XML is unchanged, as it was previously (while things were working reliably):
>
> <interface type='network'>
>   <mac address='52:54:00:29:01:5b'/>
>   <source network='hostdev-net0'/>
>   <model type='virtio'/>
>   <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> </interface>
>
> I've also tried 'e1000', 'igb', and 'rtl8139' as the model type, with no appreciable difference since this problem started.
>
> Just did a quick check: reinstalling 4.11.10-300 seems to restore functionality.
>
> It was broken when I tried 4.12.11-300 as well.
>
> -Philip
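
To put a number on "randomly come up and go down" when comparing kernels, something like the following could tally link transitions from the messages file. This is only a minimal sketch: the helper names (`link_events`, `count_flaps`) are made up for illustration, and the regular expression assumes the syslog layout shown in the excerpt above. Only the "Link is Down" line appears in that excerpt; the matching "Link is Up" wording is an assumption.

```python
import re
from collections import Counter

# Matches syslog-style kernel lines such as:
#   Oct 26 14:25:51 son-of-builder kernel: igbvf 0000:00:03.0: Link is Down
# ASSUMPTION: link-up events use the same layout with "Link is Up ...";
# only the "Link is Down" form is shown in the original report.
LINK_EVENT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<driver>\S+)\s+(?P<pci>\S+):\s+Link is (?P<state>Up|Down)"
)


def link_events(lines):
    """Yield (timestamp, driver, pci_address, state) for each link event line."""
    for line in lines:
        match = LINK_EVENT.search(line)
        if match:
            yield (match["stamp"], match["driver"], match["pci"], match["state"])


def count_flaps(lines):
    """Count link events, keyed by (driver, state)."""
    return Counter((driver, state) for _, driver, _, state in link_events(lines))
```

For example, `count_flaps(open("/var/log/messages", errors="replace"))` could be run once under 4.11.10-300 and once under 4.12.11-300 over a similar time window, to compare how often the link drops on each kernel.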