Need some opinions on making a multihomed box more resilient to failure.

This server runs Asterisk and performs NAT and firewalling for an office. Its WAN NIC is plugged directly into a cable modem, and I am stuck with it being configured as a DHCP client. The LAN NIC serves a small office and has a static IP but, of course, no default gateway.

When an internet outage occurs, Asterisk's SIP stack tanks and the PBX dives. So I have set up FQDNs/IPs in the hosts file for all SIP peers it will attempt to resolve, and set up a local DNS with all RFC 1912 zones so every query Asterisk can possibly make will be answered locally.

Now it seems there is still one last hurdle: when the connection is yanked to simulate a complete outage, Asterisk still goes down. I can only assume this now happens as a result of having no default gateway?

Would setting up a silly route for 0.0.0.0/0 to, say, 127.0.0.1 for the internal NIC in /etc/sysconfig/network-scripts/route-eth1, with a metric higher than that of the default gateway handed out by the ISP's DHCP server, possibly cure this? My hope is that when the WAN NIC goes down, a route is still available.

I can't think of any other network shortcoming that is left when that external NIC goes down. I am hoping this is finally it, so Asterisk will stop core dumping and yelling "Serious Network Trouble".

Thanks!
jlc
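One way to test the "no default gateway" assumption before and after pulling the cable is to look at what the kernel's routing table actually contains. The following is a minimal Python sketch, not anything from the original setup: it assumes a Linux host that exposes /proc/net/route, and a little-endian machine (such as x86) for the hex address decoding. The function names `hex_to_ipv4` and `default_routes` are made up for illustration.

```python
import socket
import struct


def hex_to_ipv4(hex_addr):
    """Decode an address as printed in /proc/net/route.

    Assumption: little-endian host (e.g. x86), where 0101A8C0 means 192.168.1.1.
    """
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


def default_routes(route_table_text):
    """Return (iface, gateway, metric) for every 0.0.0.0/0 entry, lowest metric first."""
    routes = []
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed or truncated rows
        # Columns: Iface Destination Gateway Flags RefCnt Use Metric Mask ...
        iface, dest, gateway = fields[0], fields[1], fields[2]
        metric, mask = fields[6], fields[7]
        if dest == "00000000" and mask == "00000000":
            routes.append((iface, hex_to_ipv4(gateway), int(metric)))
    return sorted(routes, key=lambda r: r[2])
```

To use it, read /proc/net/route into a string and pass it to `default_routes` once with the WAN link up and once with it unplugged. An empty list in the second case would confirm that the default route disappears with the link; it would not by itself show that this is what makes Asterisk fall over.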
Joseph L. Casale wrote:
> Now it seems there is still one last hurdle, when the connection is yanked
> to simulate a complete outage, asterisk still goes down. I can only assume
> this happens now as a result of no default gateway?

Check the logs? Run strace on the process? Run tcpdump on the interface(s) to see what traffic it is trying to transmit?

> Would setting up a silly route for 0.0.0.0/0 to say 127.0.0.1 for the
> internal
> nic in /etc/sysconfig/network-scripts/route-eth1 with a metric higher than
> what the ISP's dhcp servers default gw would be possibly cure this? My hope
> is that when the wan nic goes down, a route is still available.

Last I checked, the 'metric' number in the Linux routing table is really only used when you're using a routing daemon. As far as default routes go, it should not have any impact.

I suspect the route is not the issue. I suspect that the app is trying to talk to something external and then fails; it will fail the same way if you point it at a router that goes nowhere.

tcpdump should be able to tell you who the host is trying to talk to. strace might reveal why.

nate
> Joseph L. Casale wrote:
>
>> Now it seems there is still one last hurdle, when the connection is yanked
>> to simulate a complete outage, asterisk still goes down. I can only assume
>> this happens now as a result of no default gateway?
>
> check the logs? run strace on the process? run tcpdump on
> the interface(s) to see what traffic it is trying to transmit?
>
>> Would setting up a silly route for 0.0.0.0/0 to say 127.0.0.1 for the
>> internal
>> nic in /etc/sysconfig/network-scripts/route-eth1 with a metric higher than
>> what the ISP's dhcp servers default gw would be possibly cure this? My hope
>> is that when the wan nic goes down, a route is still available.
>
> Last I checked the 'metric' number in the linux routing table is
> really only used when you're using a routing daemon. as far as
> default routes go, it should not have any impact.
>
> I suspect the route is not the issue, I suspect that the app is
> trying to talk to something external and then fails, it will fail
> the same if you try to point it to a router that goes nowhere.
>
> tcpdump should be able to tell you who the host is trying to talk
> to. strace might reveal why.
>
> nate

I have almost this same setup running with no problems. Make sure you have only one default gateway defined on your server, and that it is on your Internet-facing interface. It should be assigned by the DHCP lease from your ISP, so make sure you don't have a gateway on your internal interface.

As far as Asterisk crashing, that sounds like an application problem (like Nate said): it is trying to communicate over the connection that was pulled. However, make sure it's listening on all interfaces (0.0.0.0), or on just the internal static IP, so it's not specifically listening on the DHCP IP that could change or go away when the network cable is yanked.

A local tcpdump or wireshark capture should shed some light on this problem if the above doesn't change anything.

--
Dave Jones
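The "which address is it listening on" check can be done without a packet capture by reading the kernel's UDP socket table. This is a rough Python sketch under several assumptions that are not in the thread: SIP is on UDP port 5060 over IPv4, the host is Linux with /proc/net/udp, the machine is little-endian, and 192.168.10.1 in the usage note is a placeholder for the static LAN address. The names `udp_bindings` and `survives_wan_loss` are invented for illustration.

```python
import socket
import struct

SIP_PORT = 5060  # assumption: Asterisk is using the default SIP port


def parse_local(local_hex):
    """Split a /proc/net/udp local_address such as '00000000:13C4' into (ip, port)."""
    addr_hex, port_hex = local_hex.split(":")
    # Assumption: little-endian host, so 0100007F decodes to 127.0.0.1.
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)


def udp_bindings(proc_net_udp_text, port=SIP_PORT):
    """Return the local IPv4 addresses that have a UDP socket bound to `port`."""
    addrs = []
    for line in proc_net_udp_text.strip().splitlines()[1:]:  # skip header row
        fields = line.split()
        if len(fields) < 2:
            continue
        ip, bound_port = parse_local(fields[1])
        if bound_port == port:
            addrs.append(ip)
    return addrs


def survives_wan_loss(addrs, lan_ip):
    """True if every binding is the wildcard address or the static LAN address."""
    return bool(addrs) and all(a in ("0.0.0.0", lan_ip) for a in addrs)
```

Feeding the contents of /proc/net/udp to `udp_bindings` and then calling `survives_wan_loss(addrs, "192.168.10.1")` gives a quick yes/no on whether the socket is tied to the DHCP-assigned address. It only inspects the bind address, so a `True` result does not rule out the application-level failure Nate suspects.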