I am trying to transport a dd image between two hosts over a
cross-linked gigabit connection. Both hosts have an eth1 configured
with a non-routable IP address on a shared network. No other devices
exist on this link.

When transferring via sftp I received a stall warning. Checking the
logs I see this:

dmesg | grep eth

e1000e 0000:00:19.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:1c:c0:f2:1f:bb
e1000e 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
e1000e 0000:00:19.0: eth0: MAC: 7, PHY: 8, PBA No: FFFFFF-0FF
r8169 0000:01:00.0: eth1: RTL8168d/8111d at 0xffffc9000187c000, 00:0a:cd:1d:44:fe, XID 081000c0 IRQ 31
r8169 0000:01:00.0: eth1: jumbo features [frames: 9200 bytes, tx checksumming: ko]
ADDRCONF(NETDEV_UP): eth0: link is not ready
device eth0 entered promiscuous mode
r8169 0000:01:00.0: eth1: invalid firwmare
r8169 0000:01:00.0: eth1: unable to load firmware patch rtl_nic/rtl8168d-1.fw (-22)
r8169 0000:01:00.0: eth1: link down
r8169 0000:01:00.0: eth1: link down
ADDRCONF(NETDEV_UP): eth1: link is not ready
r8169 0000:01:00.0: eth1: link up
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
br0: port 1(eth0) entering learning state
eth1: no IPv6 routers present
eth0: no IPv6 routers present
br0: port 1(eth0) entering forwarding state
r8169 0000:01:00.0: eth1: link down
r8169 0000:01:00.0: eth1: link up
r8169 0000:01:00.0: eth1: link down
r8169 0000:01:00.0: eth1: link up

This may, or may not, be related to this bug:

https://bugzilla.kernel.org/show_bug.cgi?id=12411

Is there some way to confirm whether or not this is the case? Is
there a work-around for it if it is this bug? If it is not, then does
anyone have any idea what is happening and how to fix it?

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
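
One way to gather evidence for the firmware question above is to check which driver is bound to eth1 and which firmware files that driver may request. The following is only a sketch: `nic_driver_info` is a hypothetical helper name, and it merely filters the `driver` and `firmware-version` fields out of `ethtool -i` output. It cannot by itself confirm or rule out the bugzilla entry.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not from the thread).
# Reads `ethtool -i <iface>` output on stdin and prints only the
# driver and firmware-version fields as key=value pairs.
nic_driver_info() {
    awk '{
        pos = index($0, ":")
        if (pos == 0) next                  # skip lines without "key: value"
        key = substr($0, 1, pos - 1)
        val = substr($0, pos + 1)
        sub(/^[ \t]+/, "", val)             # trim leading blanks of the value
        if (key == "driver" || key == "firmware-version")
            print key "=" val
    }'
}

# Usage on the affected host (run as root; eth1 as in the post above):
#   ethtool -i eth1 | nic_driver_info
#   modinfo -F firmware r8169   # firmware files the r8169 module may request
```

An empty `firmware-version` next to the "unable to load firmware patch" message would at least be consistent with the patch not having been applied.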
On 11/08/12 22:17, James B. Byrne wrote:
> This may, or may not, be related to this bug:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=12411
>
> Is there some way to confirm whether or not this is the case. Is
> there a work-around for it if it is this bug? If it is not then has
> anyone any idea what is happening and how to fix it?

Elrepo.org has updated drivers for both e1000e and r8169 devices (I'm
guessing it's probably the kmod-r8168 you'd need). You could try these
to see if they resolve the issue.

If you want more help identifying the correct driver for your
hardware, see FAQ #4 here:

http://elrepo.org/tiki/FAQ

Hope that helps.
On Sun, August 12, 2012 12:00, Ned Slider wrote:
> Elrepo.org has updated drivers for both e1000e and r8169 devices (I'm
> guessing it's probably the kmod-r8168 you'd need). You could try these
> to see if they resolve the issue.

The network card for eth1 seems to have disappeared somehow. On the
problem host:

# /sbin/lspci -nn | grep -i net
00:19.0 Ethernet controller [0200]: Intel Corporation 82567V-2 Gigabit Network Connection [8086:10ce]
#

On another but nearly identically configured host (same MB and
additional NIC):

# /sbin/lspci -nn | grep -i net
00:19.0 Ethernet controller [0200]: Intel Corporation 82567V-2 Gigabit Network Connection [8086:10ce]
01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168] (rev 03)
04:04.0 Serial controller [0700]: NetMos Technology PCI 9835 Multi-I/O Controller [9710:9835] (rev 01)

Where did eth1 on the first host go? How do I get it back? A restart?

--
James B. Byrne
On Mon, August 13, 2012 10:37, Ned Slider wrote:
> Faulty hardware maybe? Try a reboot and see if it reappears. If it's
> located on a card try reseating the card (although I suspect this is
> an integrated NIC on the motherboard?).
>
> The chipset is not necessarily the same in the second example
> (different revision); RTL8111/8168B is not RTL8168d/8111d. They
> probably do use the same driver but I'd need to see the
> Vendor:Device ID pairing to know for sure.

Eth1 is an xpci card sold by StarTech. A system with an identical
card reports this:

for BUSID in $(/sbin/lspci | awk '{ IGNORECASE=1 } /net/ { print $1 }'); do
    /sbin/lspci -s $BUSID -m
    /sbin/lspci -s $BUSID -n
done

00:19.0 "Ethernet controller" "Intel Corporation" "82567V-2 Gigabit Network Connection" "Intel Corporation" "Device 0028"
00:19.0 0200: 8086:10ce
01:00.0 "Ethernet controller" "Realtek Semiconductor Co., Ltd." "RTL8111/8168B PCI Express Gigabit Ethernet controller" -r03 "Realtek Semiconductor Co., Ltd." "TEG-ECTX Gigabit PCI-E Adapter [Trendnet]"
01:00.0 0200: 10ec:8168 (rev 03)
04:04.0 "Serial controller" "NetMos Technology" "PCI 9835 Multi-I/O Controller" -r01 -p02 "LSI Logic / Symbios Logic" "2S (16C550 UART)"
04:04.0 0700: 9710:9835 (rev 01)
...

The 4 port serial controller on the second host has no equivalent
card installed on the host that no longer recognizes eth1. On both
hosts eth0 is the NIC integrated on the Intel motherboard. The
motherboards are the same model in both hosts. Both systems are
configured as KVM hosts. Both are running CentOS-6.3.

--
James B. Byrne
On Mon, August 13, 2012 18:48, Ned Slider wrote:
> On 13/08/12 19:50, James B. Byrne wrote:
>> Eth1 is an xpci card sold by StarTech. A system with an identical
>> card reports this:
>
> OK, I'd definitely try reseating the card and if you still get no joy
> I'd swap it out for a replacement.

I swapped the suspect card and rebooted the host. After some fussing
about with udev I managed to get the new card recognized as eth1 (vice
eth2 as udev kept insisting). I will do a transfer test later today
and see if it stays up. The original failed in the midst of an sftp
transfer.

--
James B. Byrne
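
The eth1 versus eth2 naming mentioned above is, on CentOS 6, normally governed by MAC-to-name rules in /etc/udev/rules.d/70-persistent-net.rules. That this file was the one involved here is an assumption, since the message does not say. A minimal sketch for listing the name each MAC address is pinned to follows; `list_net_names` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): print "NAME MAC" pairs from
# a persistent-net udev rules file given as the first argument.
# Assumes each rule carries ATTR{address}=="<mac>" followed by NAME="<name>",
# which is the layout the CentOS 6 rule generator writes.
list_net_names() {
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p' "$1"
}

# Usage:
#   list_net_names /etc/udev/rules.d/70-persistent-net.rules
```

If a stale rule still pins eth1 to the MAC address of the removed card, the replacement card gets the next free name, which would match the eth2 behaviour described. Editing or removing the stale rule and rebooting is the usual remedy.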
On Wed, August 15, 2012 09:15, Reindl Harald wrote:
> did you read the output you posted?
> http://lmgtfy.com/?q=unable+to+load+firmware+patch+rtl_nic%2Frtl8168d-1.fw
>
> r8169 0000:01:00.0: eth1: invalid firwmare
> r8169 0000:01:00.0: eth1: unable to load firmware patch
> rtl_nic/rtl8168d-1.fw (-22)

I cannot seem to find a fix for this even given the references
provided. However, I have discovered that eth1 has problems on the
second host as well; it just does not generate the same messages:

[root@vhost02 ~]# grep eth /var/log/messages
Aug 13 17:10:18 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 13 17:16:19 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 11:23:39 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 11:23:41 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 11:37:42 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 11:37:44 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 11:38:51 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 11:38:53 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 11:40:17 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 11:40:19 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 11:40:50 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 11:40:52 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 14 15:26:20 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 14 15:26:22 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 05:28:39 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 05:28:42 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 05:40:32 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 05:40:34 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 05:47:34 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 05:47:36 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 05:48:21 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 05:48:23 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:16:53 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:16:54 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:18:35 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:18:37 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:22:27 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:22:29 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:31:36 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:31:38 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:40:27 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:40:29 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:41:40 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:41:42 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 06:49:39 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 06:49:41 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:01:59 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:02:01 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:04:08 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:04:10 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:07:28 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:07:30 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:09:54 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:09:56 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:12:07 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:12:09 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:13:58 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:14:00 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:17:04 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:17:06 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:20:14 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:20:16 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 07:29:28 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 07:29:30 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 09:31:56 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 09:31:58 vhost02 kernel: r8169 0000:01:00.0: eth1: link up
Aug 15 09:33:18 vhost02 kernel: r8169 0000:01:00.0: eth1: link down
Aug 15 09:33:20 vhost02 kernel: r8169 0000:01:00.0: eth1: link up

So, my question is: what is the problem, and how do I fix it on both
hosts?

--
James B. Byrne
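
A log like the one above is easier to compare across hosts, or before and after a driver change, when it is reduced to a count of link drops per day. The following is a rough sketch under the assumption that the lines keep the syslog layout shown in the post (month, day, time, host, then the driver message ending in "link down"); `count_link_down` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): read syslog-style lines on
# stdin and print "<month> <day> <iface> <count>" for every day and
# interface that logged at least one "link down" event.
count_link_down() {
    awk '/link down$/ {
             iface = $(NF - 2)          # e.g. "eth1:" just before "link down"
             sub(/:$/, "", iface)
             n[$1 " " $2 " " iface]++
         }
         END { for (k in n) print k, n[k] }' | sort
}

# Usage:
#   grep eth /var/log/messages | count_link_down
```

Running the same count again after a change makes it easy to see whether the flapping has actually stopped or has only become less frequent.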
I seem to have resolved this issue by installing the alternate kernel
module for this chipset, available from elrepo.

# /sbin/lspci -nn | grep -i net
. . .
01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168] (rev 03)
. . .

# /sbin/lspci -n | grep '01:00.0'
01:00.0 0200: 10ec:8168 (rev 03)

Using the device ID (10ec:8168) to search:

http://elrepo.org/tiki/DeviceIDs

shows that this matches:

r8168.ko    pci    10EC:8168    kmod-r8168

# yum whatprovides kmod-r8168
. . .
kmod-r8168-8.031.00-1.el6.elrepo.x86_64 : r8168 kernel module(s)
Repo        : elrepo
Matched from:
. . .

# yum install kmod-r8168
. . .
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package kmod-r8168.x86_64 0:8.031.00-1.el6.elrepo will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===============================================================================
 Package        Arch        Version                    Repository        Size
===============================================================================
Installing:
 kmod-r8168     x86_64      8.031.00-1.el6.elrepo      elrepo            73 k

Transaction Summary
===============================================================================
Install       1 Package(s)

Total size: 73 k
Installed size: 533 k
Is this ok [y/N]: y
Downloading Packages:
. . .

This seems to have cleared up the problem on both hosts, but only time
will tell.

--
James B. Byrne
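
The lookup described above starts from the vendor:device pair in the `lspci -nn` output. As a small sketch, the pair can be pulled out for every Ethernet controller in one step; `ethernet_pci_ids` is a hypothetical helper name, and it assumes the `lspci -nn` line layout shown in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): read `lspci -nn` output on
# stdin and print "<bus address> <vendor:device>" for each Ethernet
# controller. The class code in brackets (e.g. [0200]) is skipped because
# only a bracketed xxxx:xxxx pair matches the pattern.
ethernet_pci_ids() {
    sed -n 's/^\([0-9a-f:.]*\) Ethernet controller.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1 \2/p'
}

# Usage:
#   /sbin/lspci -nn | ethernet_pci_ids
```

The printed ID is what gets searched for on the elrepo DeviceIDs page to find the matching kmod package.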
On Wed, August 15, 2012 09:26, m.roth at 5-cent.us wrote:
>
> My eyes uncrossed, and I saw, buried in there, the first link, above,
> and the last. You might want to see if a) the 8168d firmware patch
> will work on that card; b) vhost - it's a virtual host? perhaps it's
> trying to load the firmware patch to the real NIC, and as a guest
> VM, it doesn't have rights?

vhost is a KVM host system, not a virtualized guest. As I wrote
elsewhere, I installed the kmod package from elrepo that handles this
device, a solution I believe you may have originally suggested to
me. In any case, that seems to have solved the problem (fingers
crossed).

Thanks for the help. I appreciate it very much.

Sorry about your eyes. But when things are totally unfamiliar to me I
hesitate to trim error logs, lest out of ignorance I remove what is
really important.

--
James B. Byrne