Steve Clark
2014-May-21 17:55 UTC
[CentOS] kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Hi, does anybody know how to fix this?

May 20 12:16:15 wolfpac kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
May 20 12:16:15 wolfpac kernel: Modules linked in: pf_ring(U) af_key iptable_nat ipt_LOG iptable_filter ip_tables nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6t_LOG xt_limit ip6table_filter ip6_tables bridge stp llc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack ip6_tunnel tunnel6 ip_gre ipv6 ext3 jbd plcm_drv(U) sled_drv(U) wd_drv(U) ppdev parport_pc parport r8169 mii microcode serio_raw i2c_i801 sg iTCO_wdt iTCO_vendor_support shpchp igb ixgbe dca ptp(T) pps_core mdio ext4 jbd2 mbcache sd_mod crc_t10dif ahci i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
May 20 12:16:15 wolfpac kernel: Pid: 0, comm: swapper Tainted: G --------------- T 2.6.32-358.23.2.el6.centos.plus.x86_64 #1
May 20 12:16:15 wolfpac kernel: Call Trace:
May 20 12:16:15 wolfpac kernel: <IRQ> [<ffffffff8106e3e7>] ? warn_slowpath_common+0x87/0xc0
May 20 12:16:15 wolfpac kernel: [<ffffffff8106e4d6>] ? warn_slowpath_fmt+0x46/0x50
May 20 12:16:15 wolfpac kernel: [<ffffffff8146f35d>] ? dev_watchdog+0x26d/0x280
May 20 12:16:15 wolfpac kernel: [<ffffffff81012c09>] ? sched_clock+0x9/0x10
May 20 12:16:15 wolfpac kernel: [<ffffffff8146f0f0>] ? dev_watchdog+0x0/0x280
May 20 12:16:15 wolfpac kernel: [<ffffffff81081937>] ? run_timer_softirq+0x197/0x340
May 20 12:16:15 wolfpac kernel: [<ffffffff810a8060>] ? tick_sched_timer+0x0/0xc0
May 20 12:16:15 wolfpac kernel: [<ffffffff8102ea2d>] ? lapic_next_event+0x1d/0x30
May 20 12:16:15 wolfpac kernel: [<ffffffff810770b1>] ? __do_softirq+0xc1/0x1e0
May 20 12:16:15 wolfpac kernel: [<ffffffff8109b87b>] ? hrtimer_interrupt+0x14b/0x260
May 20 12:16:15 wolfpac kernel: [<ffffffff8100c1cc>] ? call_softirq+0x1c/0x30
May 20 12:16:15 wolfpac kernel: [<ffffffff8100de05>] ? do_softirq+0x65/0xa0
May 20 12:16:15 wolfpac kernel: [<ffffffff81076e95>] ? irq_exit+0x85/0x90
May 20 12:16:15 wolfpac kernel: [<ffffffff8151ec20>] ? smp_apic_timer_interrupt+0x70/0x9b
May 20 12:16:15 wolfpac kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
May 20 12:16:15 wolfpac kernel: <EOI> [<ffffffff812da8fe>] ? intel_idle+0xde/0x170
May 20 12:16:15 wolfpac kernel: [<ffffffff812da8e1>] ? intel_idle+0xc1/0x170
May 20 12:16:15 wolfpac kernel: [<ffffffff8141c2a7>] ? cpuidle_idle_call+0xa7/0x140
May 20 12:16:15 wolfpac kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
May 20 12:16:15 wolfpac kernel: [<ffffffff8150e9c0>] ? start_secondary+0x2ac/0x2ef
May 20 12:16:15 wolfpac kernel: ---[ end trace 2426f74a18da7744 ]---
May 20 12:16:15 wolfpac kernel: r8169 0000:05:00.0: eth0: link up
May 20 12:16:24 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:16:36 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:16:48 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:17:00 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:17:09 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:17:21 wolfpac flash_the_led.pl: Both ping sites failed flash red-green
May 20 12:17:33 wolfpac kernel: r8169 0000:05:00.0: eth0: link up

05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. Device 0123
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64 (8000ns min, 16000ns max), Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: I/O ports at b000 [size=256]
        Region 1: Memory at f7820000 (32-bit, non-prefetchable) [size=256]
        Expansion ROM at dff00000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: r8169
        Kernel modules: r8169

Even though it says the link is up, the link is dead. This is a remote unit and this NIC is the management port, so it is a real pain when this happens. So far it has happened twice, and we have had to call someone and have them power cycle the system.

Based on what I have found on the net, this seems to happen a lot with this NIC. We have upgraded to the latest stock CentOS kernel and added the following to the kernel command line in grub:

pcie_aspm=off

I've also taken the draconian measure of adding a ping to the default route in the watchdog.conf file to cause a reboot if it happens again.

I have looked at the driver version in the latest long term kernel (3.10.40-1.el6.elrepo) and it shows the same version as this kernel. From modinfo r8169:

version: 2.3LK-NAPI

Thanks,
Steve

-- 
Stephen Clark
*NetWolves Managed Services, LLC.*
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark at netwolves.com
http://www.netwolves.com