Christopher S. Aker
2010-Oct-11  20:36 UTC
[Xen-devel] New CPUs, now get: NETDEV WATCHDOG: eth0: transmit timed out
At Linode we recently altered our server build spec from Intel L5520 to L5630. Nothing else has changed as far as we can tell, however with this new build we're experiencing a new problem -- permanent loss of networking after some time (measured in days) on these new machines, with:

  NETDEV WATCHDOG: eth0: transmit timed out   <-- kiss of network death

The link remains active on the switch, yet the NIC stops receiving any interrupts. No amount of prodding wakes it back up...

Some data points:

  2.6.18.8 @ 931 contains an older igb driver
  2.6.18.8 @ 1038 contains newest igb driver (as of last week)
  2.6.18.8 @ 931 works perfectly on all our equipment prior to L5630
  2.6.18.8 @ 1038 times out on everything

Motherboard BIOS version is the same. Upgrading BIOS on affected boxes has no effect.

A year or two back (after 931), I had to build a newer 2.6.18.8 for whatever reason and decided to include the newest igb drivers at that time. I eventually had to roll this back because the NICs started timing out. However, even our "good" build is timing out on the new spec machine.

These machines don't appear to present the problem when on bare metal.

dmesg: http://theshore.net/~caker/xen/BUGS/nic-timeout/

What we're trying:

  1) On an affected machine, we're swapping out the L5630 back to the L5520.
  2) Moving from Xen 3.4.1 to Xen 3.4.4-rc1-pre
  3) Xen 3.4.4-rc1-pre along with 2.6.32.23-g41a85de5 dom0

This certainly appears to be some strange incompatibility between Xen, dom0, and/or the NIC driver. No more interrupts being delivered is suspicious.

I'd be grateful for any insight!

Thanks,
-Chris
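
For anyone wanting to confirm the "NIC stops receiving any interrupts" symptom on their own box, here is a minimal sketch, assuming a Linux dom0 where /proc/interrupts lists the NIC's IRQ line(s) with the interface name in the last column. The function names (irq_counts, stalled, sample) and the default interface "eth0" are hypothetical and chosen for illustration; this is not part of the report above and not a definitive diagnostic.

```python
import time


def irq_counts(text, ifname="eth0"):
    """Parse /proc/interrupts text; return {irq_label: count summed over CPUs}
    for every line that mentions ifname."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row looks like: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        if ifname not in line:
            continue
        fields = line.split()
        label = fields[0].rstrip(":")
        # The per-CPU counters are the ncpu numeric fields after the label.
        per_cpu = [int(f) for f in fields[1:1 + ncpu] if f.isdigit()]
        counts[label] = sum(per_cpu)
    return counts


def stalled(before, after):
    """Return the IRQ labels whose counter did not increase between samples."""
    return sorted(l for l in before if after.get(l, 0) <= before[l])


def sample(path="/proc/interrupts", ifname="eth0", interval=10):
    """Take two samples 'interval' seconds apart and report stalled IRQs."""
    with open(path) as f:
        first = irq_counts(f.read(), ifname)
    time.sleep(interval)
    with open(path) as f:
        second = irq_counts(f.read(), ifname)
    return stalled(first, second)
```

Calling sample() while pinging the machine from another host should return an empty list on a healthy NIC. A non-empty list under traffic would be consistent with interrupts no longer being delivered, though an idle interface can also show no increase.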