Hello all,

I have a strange networking problem I've been fighting for the last week. I have a Fedora Core 4 box attached to a Netopia router on a T-1 via eth1. eth0 is attached to the LAN. IP addresses are static. The Fedora box is running Shorewall, masquerading for the LAN.

Here's some bad ASCII (the switch and public Win boxes are for testing only):

        T-1
         |
         |
   Netopia router
         |
         |
    ----------
    | switch |
    ----------
     |  |  |
     |  |  |
     |  |  WinXP Public IP
     |  |
     |  Win2K Public IP
     |
  Fedora 4 Public IP (shorewall / masq)
     |
     |
    ----------
    | switch |
    ----------
     |  |  |
     |  |  |
     <--LAN-->

All runs fine when it is up, but eth1 on the Fedora box drops link on a rather precise schedule. At 4 AM, 10 AM, 4 PM, and 10 PM, the internet is no longer available from the Fedora box or the LAN clients. The Fedora box, both public Windows boxes, and the local side of the Netopia router can all ping each other when this happens. The Netopia and Windows boxes can ping the internet, and be pinged from the internet.

Doing an "/etc/init.d/network restart" brings the Fedora box back online, and then a "shorewall restart" re-enables masquerading. I've assigned a different IP to eth1 with no change in behavior.

Our ISP is telling me the fault lies at the Fedora box. They believe that eth1 is losing its default route. However, when the link is down, "ip route" lists everything correctly (the same as when it's up).

I've tried everything that I can think of. Nothing is logged when it goes down. Does anyone have any ideas on what might be causing this?

Thanks,
Chris

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
Chris Carbaugh wrote:
> Hello all,
>
<snip>
> All runs fine when it is up, but eth1 on the Fedora box drops link on a
> rather precise schedule. At 4 AM, 10 AM, 4 PM, and 10 PM, the internet is no
> longer available from the Fedora box or the LAN clients. The Fedora box,
> both public Windows boxes, and the local side of the Netopia router can all
> ping each other when this happens. The Netopia and Windows boxes can ping
> the internet, and be pinged from the internet.
>
> Doing an "/etc/init.d/network restart" brings the Fedora box back online,
> and then a "shorewall restart" re-enables masquerading. I've assigned a
> different IP to eth1 with no change in behavior.
>
> Our ISP is telling me the fault lies at the Fedora box. They believe that
> eth1 is losing its default route. However, when the link is down, "ip route"
> lists everything correctly (the same as when it's up).
>
> I've tried everything that I can think of. Nothing is logged when it goes
> down. Does anyone have any ideas on what might be causing this?
>

Chris, I may or may not have a comment on this, but I need to know: when you speak of "link", are you using that word in an Ethernet data-link context?

If so, given that the outages follow a schedule, I'd check the Fedora box and your network electronics (if managed) for anything scheduled in their setup.

--
Michael Cozzi
cozzi@cozziconsulting.com
> -----Original Message-----
> From: Michael Cozzi [mailto:cozzi@cozziconsulting.com]
> Sent: Monday, August 14, 2006 3:51 PM
> To: CCarbaugh
> Subject: Re: [Shorewall-users] eth1 drops on a predictable schedule
>
> Chris Carbaugh wrote:
> > Hello all,
> >
> <snip>
> > All runs fine when it is up, but eth1 on the Fedora box drops link on
> > a rather precise schedule. At 4 AM, 10 AM, 4 PM, and 10 PM, the
> > internet is no longer available from the Fedora box or the LAN
> > clients. The Fedora box, both public Windows boxes, and the local
> > side of the Netopia router can all ping each other when this happens.
> > The Netopia and Windows boxes can ping the internet, and be pinged
> > from the internet.
> >
> > Doing an "/etc/init.d/network restart" brings the Fedora box back
> > online, and then a "shorewall restart" re-enables masquerading. I've
> > assigned a different IP to eth1 with no change in behavior.
> >
> > Our ISP is telling me the fault lies at the Fedora box. They believe
> > that eth1 is losing its default route. However, when the link is
> > down, "ip route" lists everything correctly (the same as when it's up).
> >
> > I've tried everything that I can think of. Nothing is logged when it
> > goes down. Does anyone have any ideas on what might be causing this?
> >
>
> Chris, I may or may not have a comment on this, but I need to know:
> when you speak of "link", are you using that word in an Ethernet
> data-link context?
>
> If so, given that the outages follow a schedule, I'd check the Fedora
> box and your network electronics (if managed) for anything scheduled
> in their setup.
>
> --
> Michael Cozzi
> cozzi@cozziconsulting.com

Yes, by link I mean Ethernet. It really seems like a routing problem to me. I can ping the router and the Win boxes, but not past the router to the 'net.

I have checked for anything scheduled. Cron only has a nightly backup script running. The switch is not managed. I do not have access to the Netopia router.

Thanks for your input,
Chris
Chris Carbaugh wrote:
>
> Yes, by link I mean Ethernet. It really seems like a routing problem to me.
> I can ping the router and the Win boxes, but not past the router to the
> 'net. I have checked for anything scheduled. Cron only has a nightly backup
> script running. The switch is not managed. The Netopia router I do not have
> access to.
>
> Thanks for your input,
> Chris
>

You are welcome, Chris.

Then if I'm reading your previous post correctly, and link integrity *is* failing, try replacing the NIC and retest.

If I'm not reading this correctly, then given intact routing tables, someone with greater skills than I needs to comment. The schedule thing seems interesting though... strange, but interesting.

--
Michael Cozzi
cozzi@cozziconsulting.com
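One way to separate the two readings above (a failing Ethernet link versus a routing or upstream problem) is to record the kernel's view of the carrier on eth1 when an outage hits. The following is a minimal sketch, assuming a Linux kernel that exposes `/sys/class/net/<iface>/carrier` and a machine with Python 3; the function name `link_state` and the `sysfs_root` parameter are made up for this illustration.

```python
from pathlib import Path


def link_state(iface, sysfs_root="/sys/class/net"):
    """Return 'up', 'down', or 'unknown' for the interface's Ethernet carrier.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    carrier = Path(sysfs_root) / iface / "carrier"
    try:
        value = carrier.read_text().strip()
    except OSError:
        # The interface does not exist, or it is administratively down
        # (in which case the kernel refuses the read).
        return "unknown"
    return {"1": "up", "0": "down"}.get(value, "unknown")


# Example: print("eth1 carrier:", link_state("eth1"))
```

If this reports "up" during an outage while the router still answers pings, the physical link is probably not what is failing, which would weaken the case for a bad NIC.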
> -----Original Message-----
> From: Michael Cozzi [mailto:cozzi@cozziconsulting.com]
> Sent: Monday, August 14, 2006 4:16 PM
> To: CCarbaugh
> Subject: Re: [Shorewall-users] eth1 drops on a predictable schedule
>
> Chris Carbaugh wrote:
> >
> > Yes, by link I mean Ethernet. It really seems like a routing
> > problem to me. I can ping the router and the Win boxes, but not
> > past the router to the 'net. I have checked for anything scheduled.
> > Cron only has a nightly backup script running. The switch is not
> > managed. The Netopia router I do not have access to.
> >
> > Thanks for your input,
> > Chris
> >
>
> You are welcome, Chris.
>
> Then if I'm reading your previous post correctly, and link integrity
> *is* failing, try replacing the NIC and retest.
>
> If I'm not reading this correctly, then given intact routing tables,
> someone with greater skills than I needs to comment. The schedule
> thing seems interesting though... strange, but interesting.
>
> --
> Michael Cozzi
> cozzi@cozziconsulting.com

Yep, replacing the NIC is next on my list. That schedule thing is just plain odd though. It's been on that schedule for a week now, within ten minutes of the times listed, no matter when the link was last restarted.

I've been pinging both eth1 on the server and the local side of the router all day. The router's response time stays right around 30 ms, whereas the server's response is in the 30s about half the time and as high as 3000 ms the other half.

I'll throw a new NIC in and go from there.

Thanks again,
Chris
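The latency comparison above is easier to track over a day if the ping output is saved and summarized rather than watched. Below is a rough sketch, assuming Python 3 and ping output containing the usual `time=... ms` fields; the helper names and the 1000 ms "slow" threshold are arbitrary choices for this illustration, not anything from the thread.

```python
import re
from statistics import median

# Matches "time=31.2 ms", "time=30ms" and "time<1ms" style fields.
TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def ping_times(output):
    """Extract round-trip times in milliseconds from saved ping output."""
    return [float(m.group(1)) for m in TIME_RE.finditer(output)]


def summarize(times, slow_ms=1000.0):
    """Return reply count, median, maximum and the share of slow replies."""
    if not times:
        return {"count": 0, "median": None, "max": None, "slow_share": None}
    slow = sum(1 for t in times if t >= slow_ms)
    return {
        "count": len(times),
        "median": median(times),
        "max": max(times),
        "slow_share": slow / len(times),
    }
```

Running `summarize(ping_times(text))` separately on the output for the router and for eth1 gives two sets of numbers to compare before and after the NIC swap.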
On Mon, 2006-08-14 at 15:56 -0400, Chris Carbaugh wrote:
>
> Yes, by link I mean Ethernet. It really seems like a routing problem to me.
> I can ping the router and the Win boxes, but not past the router to the
> 'net. I have checked for anything scheduled. Cron only has a nightly backup
> script running. The switch is not managed. The Netopia router I do not have
> access to.

When this happens, does 'ip route ls' show what you expect? Is the default route in place and is it gatewayed through the Netopia?

Given the regularity with which this occurs, it has to be triggered by a scheduled event on *some* system in your network -- could be something as simple as a DHCP lease expiration that is running a mis-coded or mis-configured shell script. We've seen that one many times on the list...

-Tom
--
Tom Eastep       \ Nothing is foolproof to a sufficiently talented fool
Shoreline,       \ http://shorewall.net
Washington USA   \ teastep@shorewall.net
PGP Public Key   \ https://lists.shorewall.net/teastep.pgp.key
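The regularity can be made concrete from the times given in the original post (4 AM, 10 AM, 4 PM, 10 PM). The short sketch below just computes the gaps between consecutive outages around the clock; the function name is invented for this example.

```python
def gaps_around_clock(hours):
    """Gaps in hours between consecutive daily event times, wrapping past midnight."""
    ordered = sorted(hours)
    gaps = []
    for i, hour in enumerate(ordered):
        following = ordered[(i + 1) % len(ordered)]
        # A single daily event is 24 hours away from its next occurrence.
        gaps.append((following - hour) % 24 or 24)
    return gaps


outages = [4, 10, 16, 22]          # outage hours reported in the thread
print(gaps_around_clock(outages))  # → [6, 6, 6, 6]
```

An even six-hour spacing that follows the wall clock, rather than the time since the last restart, fits the idea of a timer or scheduled job somewhere on the network, so anything configured with a six-hour period is worth a look.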
> -----Original Message-----
> From: Tom Eastep [mailto:teastep@shorewall.net]
> Sent: Monday, August 14, 2006 4:45 PM
> To: CCarbaugh
> Subject: Re: [Shorewall-users] eth1 drops on a predictable schedule
>
> On Mon, 2006-08-14 at 15:56 -0400, Chris Carbaugh wrote:
> >
> > Yes, by link I mean Ethernet. It really seems like a routing
> > problem to me. I can ping the router and the Win boxes, but not
> > past the router to the 'net. I have checked for anything scheduled.
> > Cron only has a nightly backup script running. The switch is not
> > managed. The Netopia router I do not have access to.
>
> When this happens, does 'ip route ls' show what you expect?
> Is the default route in place and is it gatewayed through the Netopia?
>
> Given the regularity with which this occurs, it has to be
> triggered by a scheduled event on *some* system in your
> network -- could be something as simple as a DHCP lease
> expiration that is running a mis-coded or mis-configured
> shell script. We've seen that one many times on the list...
>
> -Tom
> --
> Tom Eastep       \ Nothing is foolproof to a sufficiently talented fool
> Shoreline,       \ http://shorewall.net
> Washington USA   \ teastep@shorewall.net
> PGP Public Key   \ https://lists.shorewall.net/teastep.pgp.key
>

Yes, 'ip route ls' shows what I expect no matter what state the link is in.

The DHCP lease expiration was my first guess as well. The Netopia has the ability to act as a DHCP server, but my ISP assures me that it is disabled. I don't have access to verify that, though. I do have dhcpd configured on eth0 serving the LAN. All seems well there.

The two Windows boxes have static IPs as well, with no problems there. The long ping responses I mentioned in my last email are starting to lead me to believe the NIC is in fact bad.

Thanks,
Chris
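Since nothing is logged when the outage starts, the route check discussed above could be scripted and run from cron a few minutes either side of the known outage times, so there is a timestamped record to show the ISP. This is a minimal sketch, assuming Python 3 and the usual `default via <gateway> dev <iface>` line format from `ip route ls`; the helper names are invented here and the addresses in the usage note are documentation placeholders, not the real ones.

```python
import subprocess


def _value_after(fields, key):
    """Return the token following key, or None if key is absent or last."""
    if key in fields[:-1]:
        return fields[fields.index(key) + 1]
    return None


def default_routes(ip_route_output):
    """Return (gateway, device) pairs for each default route in `ip route` output."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "default":
            routes.append((_value_after(fields, "via"), _value_after(fields, "dev")))
    return routes


def has_default_via(ip_route_output, gateway, device):
    """True if a default route through the given gateway and device is present."""
    return (gateway, device) in default_routes(ip_route_output)


def current_routes():
    """Run `ip route ls` and return its text (call this on the firewall itself)."""
    result = subprocess.run(
        ["ip", "route", "ls"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `has_default_via(current_routes(), "192.0.2.1", "eth1")` with the Netopia's real address in place of the placeholder. If it keeps returning True through an outage, that backs up the observation that the routing table is intact.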
Chris Carbaugh wrote:
> Yes, 'ip route ls' shows what I expect no matter what state the link
> is in.
> The DHCP lease expiration was my first guess as well. The Netopia has the
> ability to act as a DHCP server, but my ISP assures me that it is disabled.
> I don't have access to verify that, though. I do have dhcpd configured on
> eth0 serving the LAN. All seems well there.
>
> The two Windows boxes have static IPs as well, with no problems there. The
> long ping responses I mentioned in my last email are starting to lead me to
> believe the NIC is in fact bad.
>
> Thanks,
> Chris
>

Chris,

Tom's comments about DHCP seem possible. If you have access to the network during off hours, bring the internal dhcpd down and make a DHCP request on the segment with the router on it. If you get a valid response from a server, then there's at least one issue.

Then, on an off-topic note, you get to laugh at your ISP on the phone, and they have to take it. I know it's not very nice. But we're IT people... we have to take what joy we can get. :)

--
Michael Cozzi
cozzi@cozziconsulting.com
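The DHCP test described above can be done with any DHCP client pointed at the router-side interface. For illustration only, here is a rough sketch of what such a probe sends: a bare DHCPDISCOVER broadcast, followed by a wait for any server to answer. It assumes Python 3 on Linux, root privileges (for port 68 and `SO_BINDTODEVICE`), and that no other DHCP client is using port 68; the function names, the fixed transaction ID and the MAC address format are choices made for this example.

```python
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"   # marks the start of the DHCP options


def build_discover(mac, xid):
    """Build a minimal DHCPDISCOVER packet for a MAC like '00:11:22:33:44:55'."""
    chaddr = bytes.fromhex(mac.replace(":", ""))
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,           # op: BOOTREQUEST
        1,           # htype: Ethernet
        6,           # hlen: length of a MAC address
        0,           # hops
        xid,         # transaction ID, echoed back by the server
        0,           # secs
        0x8000,      # flags: ask the server to broadcast its reply
        b"\0" * 4,   # ciaddr
        b"\0" * 4,   # yiaddr
        b"\0" * 4,   # siaddr
        b"\0" * 4,   # giaddr
        chaddr,      # client hardware address, zero-padded to 16 bytes
        b"",         # sname (unused)
        b"",         # file (unused)
    )
    options = b"\x35\x01\x01" + b"\xff"   # option 53 = DISCOVER, then end
    # Pad to the traditional 300-byte BOOTP minimum.
    return (header + MAGIC_COOKIE + options).ljust(300, b"\0")


def probe(iface, mac, timeout=5.0, xid=0x12345678):
    """Broadcast a DISCOVER on iface; return the first responder's IP or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, iface.encode())
        sock.bind(("", 68))
        sock.settimeout(timeout)
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        try:
            while True:
                data, (addr, _port) = sock.recvfrom(2048)
                # op 2 = BOOTREPLY; the xid must match our request.
                if len(data) >= 8 and data[0] == 2 and data[4:8] == struct.pack("!I", xid):
                    return addr
        except socket.timeout:
            return None
```

Calling `probe("eth1", "<eth1's MAC>")` with the internal dhcpd stopped should return None if nothing on the router segment is handing out leases. Any address it does return is a server (or relay) that answered, which would be worth raising with the ISP.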