Hi folks,

I have several routers here that are based on Jetway J7F4 ITX boards with two onboard re interfaces. I run 7-stable on them via nanobsd and update them about once every three or four months.

After the last update (11th December 2008) I noticed the following strange behaviour on at least two machines (identical hardware and software): after weeks of flawless operation, the network connection on both interfaces suddenly starts to mangle packets. Even a simple ping can show up to 50% or so packet loss, and the machine is mostly unreachable via the network. ifconfig up/down did not cure this, and turning off checksum offloading and similar features did not help. Even simply rebooting the machine did not make the problem go away! I had to power-cycle the machines by unplugging all cables to get back to normal operation.

I have seen this behaviour on two different machines, so I can most probably rule out a hardware issue. It does not appear to happen often, though. I did not see this with an earlier image of 7-stable from June 2008, and an image from early September was probably working fine as well (although I did not use that one for such a long time).

Visiting the webcvs I noticed that there are a lot of patches for if_re in December 2008 and January 2009. The revision I'm having problems with is tagged "1.95.2.37 2008/12/09 11:01:17". Does anyone have an idea what broke if_re for me, and how I can get back to stable operation? Is it possible to use if_re from head as a drop-in replacement to test the patches available after 12/09? I would prefer not to move the machines completely from -stable to -current.
Here is some further information about the NICs:

---pciconf---
re0@pci0:0:9:0: class=0x020000 card=0x10ec16f3 chip=0x816710ec rev=0x10 hdr=0x00
    vendor     = 'Realtek Semiconductor'
    device     = 'RTL8169/8110 Family Gigabit Ethernet NIC'
    class      = network
    subclass   = ethernet
re1@pci0:0:11:0: class=0x020000 card=0x10ec16f3 chip=0x816710ec rev=0x10 hdr=0x00
    vendor     = 'Realtek Semiconductor'
    device     = 'RTL8169/8110 Family Gigabit Ethernet NIC'
    class      = network
    subclass   = ethernet
---

---dmesg---
re0: <RealTek 8169SC/8110SC Single-chip Gigabit Ethernet> port 0xf000-0xf0ff mem 0xfdfff000-0xfdfff0ff irq 10 at device 9.0 on pci0
re0: Chip rev. 0x18000000
re0: MAC rev. 0x00000000
miibus0: <MII bus> on re0
rgephy0: <RTL8169S/8110S/8211B media interface> PHY 1 on miibus0
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re0: Ethernet address: 00:30:18:ab:d0:19
re0: [FILTER]
re1: <RealTek 8169SC/8110SC Single-chip Gigabit Ethernet> port 0xf200-0xf2ff mem 0xfdffe000-0xfdffe0ff irq 10 at device 11.0 on pci0
re1: Chip rev. 0x18000000
re1: MAC rev. 0x00000000
miibus1: <MII bus> on re1
rgephy1: <RTL8169S/8110S/8211B media interface> PHY 1 on miibus1
rgephy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re1: Ethernet address: 00:30:18:ab:d0:1a
re1: [FILTER]
---

cu
Gerrit
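[Editorial sketch, not part of the original message.] Because the failure described above only appears after weeks of normal operation, it may help to log packet loss periodically so the onset can be dated. The following is a minimal sketch, assuming the summary line that ping(8) prints ("... packets transmitted, ... packets received, ...% packet loss"). The function names parse_packet_loss and check_host are hypothetical and introduced only for illustration.

```python
import re
import subprocess

# Matches the loss figure in the summary line printed by ping(8), e.g.
# "10 packets transmitted, 5 packets received, 50.0% packet loss"
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_packet_loss(ping_output: str) -> float:
    """Return the packet-loss percentage found in ping's summary line."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


def check_host(host: str, count: int = 20) -> float:
    """Run ping(8) against host and return the measured loss in percent.

    Hypothetical helper: it assumes a ping binary that accepts "-c <count>".
    """
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero on loss; we still want its output
    )
    return parse_packet_loss(result.stdout)
```

A cron job on a neighbouring machine could call check_host() with the router's address and write the result to a log, which would show when the loss begins relative to the last reboot or image update.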
On Wed, Feb 04, 2009 at 10:05:07AM +0100, Gerrit Kühn wrote:
> Visiting the webcvs I noticed that there are a lot of patches for if_re in
> December 2008 and January 2009. The revision I'm having problems with is
> tagged "1.95.2.37 2008/12/09 11:01:17". Does anyone have an idea what
> broke if_re for me, and how I can get back to stable operation? Is it
> possible to use if_re from head as drop-in replacement to test the patches
> available after 12/09? I would prefer not to move the machines completely
> from -stable to -current.
>
> [...]
>
> re0: <RealTek 8169SC/8110SC Single-chip Gigabit Ethernet> port
> 0xf000-0xf0ff mem 0xfdfff000-0xfdfff0ff irq 10 at device 9.0 on pci0

Since you're using an RTL8169SC, it could be related to my commit r180519 (cvs rev 1.95.2.22). It seems that the RTL8169SC does not like memory-mapped register access, and I think jkim@ committed a patch for the issue. Would you try re(4) from HEAD? (Just copying if_re.c, if_rlreg.h and if_rl.c from HEAD to stable would be enough to build re(4) on stable.)
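[Editorial sketch, not part of the original message.] The file swap suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: the relative paths are the ones named elsewhere in this thread (sys/dev/re/if_re.c, sys/pci/if_rlreg.h, sys/pci/if_rl.c), the function name copy_re_driver is hypothetical, and keeping ".orig" backups is an added precaution rather than something the thread prescribes.

```python
import shutil
from pathlib import Path
from typing import List

# Paths relative to the top of a FreeBSD source tree, as named in this thread.
RE_DRIVER_FILES = (
    "sys/dev/re/if_re.c",
    "sys/pci/if_rlreg.h",
    "sys/pci/if_rl.c",
)


def copy_re_driver(head_src: Path, stable_src: Path) -> List[Path]:
    """Copy the re(4) sources from a HEAD tree into a stable tree.

    Existing files in the stable tree are saved with an ".orig" suffix
    first, so the previous driver can be restored.
    """
    # Check everything up front so a missing file cannot leave a half-updated tree.
    for rel in RE_DRIVER_FILES:
        if not (head_src / rel).is_file():
            raise FileNotFoundError(f"missing in HEAD tree: {head_src / rel}")

    copied = []
    for rel in RE_DRIVER_FILES:
        source = head_src / rel
        target = stable_src / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            shutil.copy2(target, target.with_name(target.name + ".orig"))
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

After the copy, the kernel or the re module still has to be rebuilt from the stable tree in the usual way; the sketch deliberately stops before that step.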
On 04.02.2009 at 10:05, Gerrit Kühn wrote:

> After the last update (11th December 2008) I have noticed the following
> strange behaviour on at least two machines (identical hard- and software):
> After weeks of flawless operation, the network connection on both
> interfaces suddenly starts to mangle packages. Even a simple ping can show
> up to 50% or so package loss. The machine is mostly unreachable via net.
> ifconfig up/down did not cure this, turning off checksum-offloading
> and stuff did not help. Even simply rebooting the machine did not make the
> problem go away! I had to power-cycle them by unplugging all cables to get
> back to normal operation.

I've seen similar, but fully reproducible, behavior on this Ethernet hardware. It turned out not to be a problem with the driver:

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=kern/130957

> Is it possible to use if_re from head as drop-in replacement to test the
> patches available after 12/09?

The other way around, using an older if_re, worked by replacing sys/dev/re/if_re.c, sys/pci/if_rlreg.h and sys/pci/if_rl.c, so the answer is likely "yes".

MarKus

- - - - - - - - - - - - - - - - - - -
Dipl. Ing. Markus Hitter
http://www.jump-ing.de/
Hey.

I have had similar symptoms on a dedicated server with the re driver. What I did was grab more recent drivers (which might be redundant now) and disable a set of features that weren't stable at the time.

Please have a look at this PR that I submitted back then:
http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/125805
On Fri, Feb 13, 2009 at 11:41:43AM +0100, Gerrit Kühn wrote:
> On Fri, 13 Feb 2009 19:24:00 +0900 Pyun YongHyeon <pyunyh@gmail.com> wrote
> about Re: fun with if_re:
>
> PY> > I had to reboot some of the machines meanwhile and could do some
> PY> > further testing. One strange thing I noticed is that the
> PY> > re-interfaces often do not come up in a working state after
> PY> > rebooting. Strangely, I see network traffic floating around via
> PY> > tcpdump, but not even ping works. This state often goes away when
> PY> > playing around with the interface (sometimes ifconfig down/up helps,
> PY> > sometimes disabling some of the additional features like txc/rxc),
> PY> > but I cannot make out a reproducible behaviour so far. When the
> PY> > interface leaves this strange state it seems to work fine
> PY> > afterwards. Any clues?
>
> PY> Does this happen on latest if_re.c/if_rlreg.h? I guess jkim fixed
> PY> this type of problem in r187483. If that have no effect please let
> PY> me know.
>
> It happens on both versions: the old one from 11th Dec 08 I still had, and
> the new one I built with the patches you recommended about a week ago.
> if_re is 1.151 2009/01/20 20:22:28 jkim, if_rlreg is 1.94 2009/01/20
> 20:22:28 jkim for the latter.

OK, try the attached patch.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: re.8169sc.diff
Type: text/x-diff
Size: 2034 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20090213/b0b9f19a/re.8169sc.bin