Hello,

I am trying to figure out a problem I'm having using CentOS on a machine as a router. The short story is: any traffic routed through the router seems to get disconnected at random occasionally.

The hardware setup is:
I have two switches, the router sits between them, the webserver on the LAN switch.
The machine I'm using for the router is a Dell 860 1U rackmount with two NICs, one NIC on the internet, one NIC on the LAN.

The routing setup is:
I'm using IPTABLES for routing, with the following command:
iptables -t nat -A PREROUTING -p tcp -m tcp -i eth1 --dport 6680 -j DNAT --to 192.168.1.10:80
Basically, I'm forwarding port 6680 on to the webserver (.10) on the LAN.

What I have tested so far:
If I'm at the router, I can download files from the webserver just fine, so the webserver setup and physical connection is OK.
If I'm at the router, I can download files from the internet just fine, so the physical connection to the outside is OK as well.
If I'm on the outside of the router (on the internet) I can download files directly from the router just fine.

The issue is when I try to download a file from the webserver via the router (port 6680). It will work sometimes, but other times it will randomly disconnect me, at random points during the download.

Watching the traffic on a packet-sniffer shows that right before the download fails, my client computer trying to download the file keeps resending "ACK" messages, the router keeps sending the next sequence of packets, and eventually the router sends a bunch of "RST" packets.

There aren't any strange messages in /var/log/messages or dmesg in either the router or the webserver.

I need some help diagnosing this problem. Here's some info about the router:
CentOS 5
latest kernel 2.6.18-8.1.8.el5
iptables v1.3.5

I've tried testing as much as I can before asking for help, but I'm at the end of what I know to try. Any leads as to where to look to diagnose, or what might cause this would help.

Thanks in advance,
-Jesse
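For anyone retracing this, the checks below are a minimal sketch of how the forwarding path behind a DNAT rule like the one above might be inspected on the router. The interface name eth1 and port 6680 come from the post. The `run` helper and its `DRY_RUN` switch are hypothetical conveniences, and the conntrack paths under /proc are assumed to be the ones a 2.6.18-era kernel exposes.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) each command is only
# printed; set DRY_RUN=0 and run as root on the router to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

check_forwarding() {
    wan_if="$1"   # NIC named in the DNAT rule (eth1 in the post)
    port="$2"     # forwarded port (6680 in the post)

    # DNAT only rewrites the destination; the kernel must also forward packets.
    run cat /proc/sys/net/ipv4/ip_forward
    # Packet and byte counters show whether the DNAT rule is being hit.
    run iptables -t nat -L PREROUTING -n -v
    # Rewritten packets still have to pass the filter table's FORWARD chain.
    run iptables -L FORWARD -n -v
    # Compare tracked connections with the table limit (assumed 2.6.18 paths).
    run cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count
    run cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
    # Capture a bounded sample to look for the retransmits and RSTs.
    run tcpdump -n -c 200 -i "$wan_if" tcp port "$port"
}

check_forwarding eth1 6680
```

Running the same capture on the LAN-side interface at the same time would show whether the resent ACKs ever reach the webserver, which helps separate a netfilter problem from a NIC or cabling problem.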
Jesse Cantara wrote:
> Hello,
>
> I am trying to figure out a problem I'm having using CentOS on a
> machine as a router. The short story is: any traffic routed through
> the router seems to get disconnected at random occasionally.
>
> The hardware setup is:
> I have two switches, the router sits between them, the webserver on
> the LAN switch.
> The machine I'm using for the router is a Dell 860 1U rackmount with
> two NICs, one NIC on the internet, one NIC on the LAN.
>
> The routing setup is:
> I'm using IPTABLES for routing, with the following command:
> iptables -t nat -A PREROUTING -p tcp -m tcp -i eth1 --dport 6680 -j DNAT --to 192.168.1.10:80
> Basically, I'm forwarding port 6680 on to the webserver (.10) on the LAN.
>
> What I have tested so far:
> If I'm at the router, I can download files from the webserver just
> fine, so the webserver setup and physical connection is OK.
> If I'm at the router, I can download files from the internet just
> fine, so the physical connection to the outside is OK as well.
> If I'm on the outside of the router (on the internet) I can download
> files directly from the router just fine.
>
> The issue is when I try to download a file from the webserver via the
> router (port 6680). It will work sometimes, but other times it will
> randomly disconnect me, at random points during the download.
>
> Watching the traffic on a packet-sniffer shows that right before the
> download fails, my client computer trying to download the file keeps
> resending "ACK" messages, the router keeps sending the next sequence
> of packets, and eventually the router sends a bunch of "RST" packets.
>
> There aren't any strange messages in /var/log/messages or dmesg in
> either the router or the webserver
>
> I need some help diagnosing this problem. Here's some info about the
> router:
> CentOS 5
> latest kernel 2.6.18-8.1.8.el5
> iptables v1.3.5
>
> I've tried testing as much as I can before asking for help, but I'm at
> the end of what I know to try. Any leads as to where to look to
> diagnose, or what might cause this would help.
>
> Thanks in advance,
> -Jesse

Jesse,

What IP address are you using when you try to access the webserver (via port 6680) from the router, the public or the private? If I read the iptables man page correctly, I would not expect the router to mangle locally generated packets in the PREROUTING chain, since those packets are not "really" arriving at the eth1 interface. Maybe the problem is that some packets are getting through at all.

What happens if you try to access the webserver from a machine on the LAN, but using the public IP address and port 6680?

Why not use port 80 and the private IP when accessing the webserver from the router and from anywhere else on the LAN, and address the webserver via 6680 when coming in from the internet? If I read your test scenarios correctly, both of those conditions work correctly and I assume that is your intent.

Bob...
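Bob's questions amount to a small test matrix: which address and port are used from which vantage point. The sketch below only prints one curl command per case so each can be pasted on the right machine. The public address and the file path never appear in the thread, so `PUBLIC_IP` and `FILE_PATH` are placeholders, and the helper names are hypothetical.

```shell
#!/bin/sh
# Placeholders: the thread gives neither the public address nor a file name.
PUBLIC_IP="${PUBLIC_IP:-203.0.113.1}"   # hypothetical (documentation address range)
FILE_PATH="${FILE_PATH:-/testfile}"     # hypothetical download path

# Print a curl command that discards the body and reports the HTTP status
# and the number of bytes received, so a cut-off transfer is easy to spot.
fetch_cmd() {
    printf '%s\n' "curl -sS -o /dev/null -w '%{http_code} %{size_download}\\n' $1"
}

test_matrix() {
    echo "# on the router, private address (reported to work):"
    fetch_cmd "http://192.168.1.10:80$FILE_PATH"
    echo "# on the router, public address (locally generated packets skip PREROUTING):"
    fetch_cmd "http://$PUBLIC_IP:6680$FILE_PATH"
    echo "# on a LAN host, public address (the rule only matches -i eth1):"
    fetch_cmd "http://$PUBLIC_IP:6680$FILE_PATH"
    echo "# on an internet host, public address (the failing case):"
    fetch_cmd "http://$PUBLIC_IP:6680$FILE_PATH"
}

test_matrix
```

Repeating the internet-side fetch several times with a large file is more telling than a single run, since the disconnects were described as intermittent.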
Jesse Cantara wrote:
> Actually, I spoke too soon. Setting the NIC to 100 Mbit did not fix
> the issue, I just happened to misdiagnose a fix, because it seemed to
> be working for quite some time, but it is back to the old problems.
> Basically, I'm at wits end right now. I'm going to go down to the
> colocation and see if they can test the network drop into our cabinet.
> If it's not that, then I'm convinced it's the tg3 driver.
> -Jesse
>
> Jesse Cantara wrote:
>> The problem ended up being the "tg3" Broadcom NIC kernel module driver.
>> It doesn't work properly at Gigabit speeds. Turning it down to 100
>> Megabit fixed the issue. Does anybody know where I should report this bug?
>>
>> Thanks for all your help,
>> -Jesse

Sorry about being late to the party but I was out of town for a while and I'm still trying to catch up.

I have seen this behavior with the tg3 module and a Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet NIC. This is a 64-bit PCI card in a Tyan Tiger MPX (dual Athlon) motherboard. I Googled for any similar problems and couldn't find anything, so I put a spare 3c2000t into a 32-bit slot and chalked the problem up to the old motherboard and chipset.

The box in question is *NOT* serving as a router but does have multiple NICs. The NIC in question has a CAT 6 cable to a 3Com 16-port unmanaged gigabit switch. I swapped cables, etc. and still saw the same behavior. I could restart the network and everything would be fine for a while, but then it would just stop with no errors, messages, etc. Since I had the spare 3c2000t, that problem went down in the priority stack.

Cheers,
Dave

-- 
Politics, n. Strife of interests masquerading as a contest of principles.
-- Ambrose Bierce
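For readers seeing similar symptoms, the commands below are a rough sketch of how the driver and link state of a suspect NIC could be inspected with ethtool, and how the 100 Mbit setting Jesse tried could be applied. The interface name eth0 is an assumption (the thread does not say which NIC is bound to tg3), and the `run` helper with its `DRY_RUN` switch is hypothetical. Note that in this thread the 100 Mbit workaround did not hold up.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) each command is only
# printed; set DRY_RUN=0 and run as root to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

check_nic() {
    nic="$1"                # assumed name of the interface under suspicion
    run ethtool -i "$nic"   # driver name, driver version, firmware version
    run ethtool "$nic"      # negotiated speed, duplex, link detected
    run ethtool -S "$nic"   # driver-specific statistics counters
    run ifconfig "$nic"     # error, dropped, and overrun counts
}

# Force 100 Mbit full duplex with autonegotiation off. If the switch port
# keeps autonegotiating, this can leave the two ends with mismatched duplex.
force_100_full() {
    run ethtool -s "$1" speed 100 duplex full autoneg off
}

check_nic eth0
```

Comparing the `ethtool -S` counters before and after a failed transfer is one way to see whether the NIC itself is recording errors while the logs stay quiet.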