thr3ads.net - LARTC - Redundant internet connections. [Jun 2007]

Grant Taylor

2007-Jun-21 07:05 UTC

Redundant internet connections.

(I know that what I'm wanting to do can be done, but for some reason I
can not get it to work for the life of me.  I think I have been staring
at it too long and too closely.)

I have two different internet connections from two cooperating ISPs.  I
also have a small block of 8 globally routable IPs that both ISPs will
route to me via the world facing globally routable IPs that I have with
them.  I.e. ISP A has a route to 75.19.28.7/29 via 12.34.56.78
and ISP B has a route to 75.19.28.7/29 via 87.65.43.21.

I want to use one ISP as the primary default gateway and the other ISP 
as a backup default gateway.  That is to say I want to *NOT* use load 
balancing rather just redundancy in this situation.

I do *NOT* need to use NAT because I do have the globally routable IP 
address on *ALL* interfaces.

I.e.
eth0:  75.19.28.6 (DMZ)
eth1:  12.34.56.78 (ISP A)
eth2:  87.65.43.21 (ISP B)

I want this router to use the default gateway for ISP A of 12.34.56.254 
and only use the default gateway of ISP B 78.65.43.1 if the default 
gateway of ISP A can not be reached.

If I set up the interfaces with their IPs and subnets and set up 
multiple default routes with varying metrics (for priority) and test by 
taking an interface down, things work.  However, this is not a realistic 
test because the interface will never physically go down.

For the sake of discussion, let one link be a DSL modem and the other 
link be a cable modem.  Each of the links is an external modem that uses 
an ethernet cable to connect in to the router.  Thus no matter what the 
state of the link coming in to my facility is, the link on the Linux 
router will always be up b/c the ethernet between the router and the 
modems sitting on the next shelf down will always be up.

I need a way for the Linux kernel to try to use a default gateway and 
switch to another one if it does not see any traffic.

Any help that any one could offer will be greatly appreciated.



Thanks in advance,

Grant. . . .

Salim S I

2007-Jun-21 07:46 UTC

RE: Redundant internet connections.

Use a ping script, which pings some IP every minute or so. Ping can bind
to a specific interface.

ping -c 1 -w 1 -I eth1 $SOME_IP
ping -c 1 -w 1 -I eth2 $SOME_IP

Check the return values of those pings.
Change your default routes based on the ping results.

This is the basic idea. You can add many other things to this: more IPs,
more counts, a different time interval... (Better to use IPs than domain
names, so that DNS problems won't affect the check.)
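
A minimal sketch of the ping-script idea above, assuming a POSIX shell. The function names (link_alive, pick_gateway) and the variables GW_A / GW_B are hypothetical and stand for the two ISPs' gateways. The decision step only prints the gateway it would choose; the lines that actually probe and change the routing table are left commented out because they need root and real addresses.

```shell
#!/bin/sh
# Sketch only: function names and variables are illustrative assumptions.

# link_alive IFACE TARGET: succeed if a single ping sent out IFACE is answered.
link_alive() {
    ping -c 1 -w 1 -I "$1" "$2" >/dev/null 2>&1
}

# pick_gateway A_STATUS B_STATUS GW_A GW_B
# Statuses are ping exit codes (0 = reply received).
# Prefer A; use B only when A failed and B answered; print nothing if both failed.
pick_gateway() {
    if [ "$1" -eq 0 ]; then
        echo "$3"
    elif [ "$2" -eq 0 ]; then
        echo "$4"
    fi
}

# Example wiring (commented out: needs root, real interfaces and addresses):
# link_alive eth1 "$SOME_IP"; a=$?
# link_alive eth2 "$SOME_IP"; b=$?
# gw=$(pick_gateway "$a" "$b" "$GW_A" "$GW_B")
# [ -n "$gw" ] && ip route replace default via "$gw"
```

Keeping the decision in its own function makes it easy to check the primary/backup preference without touching the network.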

Grant Taylor

2007-Jun-21 14:46 UTC

Re: Redundant internet connections.

On 06/21/07 02:46, Russell Stuart wrote:
> Well, it may be that you are connected to the modem by Ethernet, but
> that doesn't mean you can't arrange to know if the link is up or
> down.
If you are familiar with Cisco, there is a physical link, and a protocol
link.  I'm ending up with a (physical link) Up / (protocol link) Down
scenario, which can not be detected by Linux's device state.
> For DSL, you can run PPPoE on your Linux box.  That way you will know 
> when your link is down because the PPPoE connection dies, taking all 
> routes with it. I do this.  It works.  In the case of a cable modem 
> you can request a short dhcp-lease-time (see the option of that name 
> in dhcp-options(5)) which achieves the same thing.  This is by far 
> the best solution because it reacts quickly, and altering of the 
> routing table happens automagically as the links go up and down.
Ugh!  Besides the fact that this is not possible (in my scenario) it is,
in my opinion, EXTREMELY sub-optimal.  Don't even get me started on
PPPoE.  There is also the fact that the DHCP leases would have to be
sub-minute in length to even remotely come close to working for this.
> Assuming this isn't possible for some reason the only other way to do
> this is manually.  Ie, you monitor the link somehow.  There are any
> number of ways you can do this.  One nice way is use Nagios to 
> monitor the link.  This is nice because Nagios can do things when the 
> link goes down and comes back up again - like altering your routing 
> table.  Nagios is also good because it allows for some hysteresis, ie 
> waiting for a few failed pings before taking action.  And it can 
> report what happened by SMS or whatever.  There are a lot of Nagios 
> type monitoring systems out there, maybe you use one.  Failing that, a
> home baked shell script will work just as well.  It would just use say:
> 	ping -n -q -c 1 -w 120 -i 20 -I a.d.d.r next.hop.addr
> in a continuous loop to verify the link is up.
Double Ugh!  Why do I need to implement a daemon to do this when just
about every other OS that I work with will purportedly do this itself?
Linux can purportedly do this too, supposedly with Dead Gateway
Detection and / or Equal Cost Multipath Routing or some combination
thereof.

No, I feel like there is a way to do this, I'm just overlooking it.
If I do need to go back to this method, I'll completely re-design what
needs to be done or switch to a different router OS (Free/Net BSD?) to
do this.
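
For reference, the hysteresis mentioned in the quoted suggestion (waiting for a few failed pings before taking action) can be sketched in a few lines of POSIX shell. The function names and the threshold of 3 are assumptions made for illustration, not something proposed in the thread.

```shell
#!/bin/sh
# Sketch only: function names and the threshold are illustrative assumptions.

# next_failures PREVIOUS_COUNT PING_STATUS
# Print the new count of consecutive failed probes:
# reset to 0 on a reply (status 0), otherwise add one.
next_failures() {
    if [ "$2" -eq 0 ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# link_is_down COUNT THRESHOLD: succeed once COUNT has reached THRESHOLD.
link_is_down() {
    [ "$1" -ge "$2" ]
}

# Replay a made-up sequence of probe results (0 = reply, 1 = no reply).
fails=0
for status in 0 1 1 0 1 1 1; do
    fails=$(next_failures "$fails" "$status")
done
echo "consecutive failures: $fails"
if link_is_down "$fails" 3; then
    echo "would switch to the backup route"
fi
```

A single lost ping resets nothing permanently and triggers nothing; only an unbroken run of failures reaches the threshold.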
> Finally, be careful in how you set up your routing.  You want to 
> avoid asymmetric routing, and that will happen by default when 
> someone connects to your backup link unless you take special steps to 
> avoid it.
Actually, asymmetric routes are what I want to use in the event traffic 
does go to the backup route while the primary is up and running.

Keep in mind that no one will be connecting to any of the IP addresses 
assigned to the router (save for router management) but rather the 
globally routable IP addresses in the DMZ behind said router.



Grant. . . .

Peter Rabbitson

2007-Jun-21 15:35 UTC

Re: Redundant internet connections.

Grant Taylor wrote:
> I need a way for the Linux kernel to try to use a default gateway and 
> switch to another one if it does not see any traffic.
I don't know about any working in-kernel solutions, but you can do it
trivially with netfilter and a cronjob:

* In netfilter do this:
	-t mangle -N ispA
	-t mangle -A ispA -j RETURN
	-t mangle -N ispB
	-t mangle -A ispB -j RETURN
	-t mangle -A PREROUTING -i $ifA -s ! a.a.a.a/aa -j ispA
	-t mangle -A PREROUTING -i $ifB -s ! b.b.b.b/bb -j ispB

where a.a.a.a and b.b.b.b are subnets describing your first 1 - 2 hops, 
so traffic from your upstream router will not count.

* Then make a cron job that runs this every minute:
	iptables -t mangle -vnxZL isp[AB]
and looks at the first number on the third line. If it is not 0,
the link is alive; otherwise change the routing tables accordingly.

Of course you can have up to 1 minute of downtime, but that does not look
so bad IMO.
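
A rough sketch of the counter check described above, assuming a POSIX shell with awk. The function names are hypothetical, and the sample text is made up to resemble the usual three-line layout of a verbose chain listing (chain header, column header, first rule); real output should be checked before relying on the line number.

```shell
#!/bin/sh
# Sketch only: function names are illustrative assumptions.

# chain_packets: read a verbose chain listing on stdin and print the
# packet counter of the first rule, i.e. the first field of line 3.
chain_packets() {
    awk 'NR == 3 { print $1 }'
}

# link_state PACKETS: print "alive" for a non-zero counter, else "dead".
link_state() {
    if [ "${1:-0}" -gt 0 ]; then
        echo alive
    else
        echo dead
    fi
}

# Made-up sample shaped like the listing of the ispA chain.
sample='Chain ispA (1 references)
    pkts      bytes target     prot opt in     out     source               destination
      42     3528 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0'

link_state "$(printf '%s\n' "$sample" | chain_packets)"

# Real use from cron (commented out: needs root):
# link_state "$(iptables -t mangle -vnxZL ispA | chain_packets)"
```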

HTH

Peter

Grant Taylor

2007-Jun-21 15:52 UTC

Re: Redundant internet connections.

On 06/21/07 10:35, Peter Rabbitson wrote:
> I don't know about any working in-kernel solutions, but you can do it
> trivially with netfilter and a cronjob:
<snip>

If I understand what you are proposing correctly, it looks like you are
jumping to a sub-chain used only for counting traffic.  If the
counters show traffic, you are saying that traffic is flowing across the
link and thus the link must be up and functional.  Right?

If the link is not up and functional, then take action to not use that link.

I'm also not clearly understanding how matching the source IP will work
on either link considering that both links will have the capability to
pass traffic for the same globally routable DMZ subnet.  Though I think
this could be mitigated by altering the rules to count packets going out
or coming in an interface rather than based on source / destination IP.
> Of course you can have up to 1 minute of downtime, but it does not look 
> so bad IMO.
One minute may or may not be bad.  I know that it is a long time (when 
you are trying to ssh) but automatic failover is better than manual. 
And the one minute will probably be much faster than manual failover.



Grant. . . .

Peter Rabbitson

2007-Jun-21 16:00 UTC

Re: Redundant internet connections.

Grant Taylor wrote:
> On 06/21/07 10:35, Peter Rabbitson wrote:
>> I don't know about any working in-kernel solutions, but you can do it
>> trivially with netfilter and a cronjob:
> 
> <snip>
> 
> If I understand what you are proposing correctly, it looks like you are 
> jumping to a sub-chain used used only for counting traffic.  If the 
> counters show traffic, you are saying that traffic is flowing across the 
> link and thus the link must be up and functional.  Right?
Almost correct
> If the link is not up and functional the take action to not use that link.
This is not something I do automatically in netfilter - it is a 
responsibility of the cron job.
> I'm also not clearly understanding how matching the source IP will work
> on either link considering that both links will have the capability to 
> pass traffic for the same globally routable DMZ subnet.  Though I think 
> this could be mitigated by altering the rules to count packets going out 
> or coming in an interface rather than based on source / destination IP.
I am counting only INcoming traffic (the -i flag). The source matching
is there only for the following reason: consider

You ->1-> Uplink router ->2-> Internet

If hop 2 is down, then the uplink router might send you back ICMP
messages that whatever destination you are trying to reach is
unreachable. This will count as traffic from the internet, whereas in
fact it isn't. This is why you need to exclude (thus the _!_ in -s) the
immediate uplink hops, and count incoming traffic (whatever it might
be) from the "far side" of the internet only.

Grant Taylor

2007-Jun-21 16:23 UTC

Re: Redundant internet connections.

On 06/21/07 11:00, Peter Rabbitson wrote:
> This is not something I do automatically in netfilter - it is a
> responsibility of the cron job.
*nod*
> I am counting only INcomming traffic (the -i flag). The source matching 
> is there only for the following reason: consider
> 
> You ->1-> Uplink router ->2-> Internet
> 
> If hop 2 is down, then the uplink router might send you back ICMP 
> messages that whatever destination you are trying to reach is 
> unreachable. This will count as traffic from the internet, whereas in 
> fact it isn't. This is why you need to exclude (thus the _!_ in -s) the
> immediate uplink hops, and count incoming traffic (whatever it might
> be) from the "far side" of the internet only.
Ah, here is part of the problem.

                     (    eth1    ) --- (DSL Modem) / DSL Gateway
Server --- (DMZ) --- (Linux Router)
                     (    eth2    ) --- (Cable Modem) / Cable Gateway

Note:  Globally routable DMZ is connected to eth0.

Traffic will be to / from servers in the DMZ and clients on the internet 
at large.

My "Linux Router" (above) *IS* the system that would send the ICMP ...
unreachable message.  So, there is not an upstream router to look for 
traffic from.

I suppose that I could match traffic coming in eth1 or eth2, but I would
have to be careful about the source / destination.  However the very
existence of inbound traffic means that the link is up for at least
inbound traffic.  However I also need to know that I can send traffic
too.  I've had situations where the traffic would come in but not go out
(Do NOT ask how or why!).

I suppose such monitoring will work, but I still feel like there is a 
better solution out there.

There is also the fact that I am wanting to use one route unless it is 
down and then use the backup.  If the primary route is up and traffic 
comes in the backup, it is to go back out the primary.

Grant. . . .

Peter Rabbitson

2007-Jun-21 16:47 UTC

Re: Redundant internet connections.

Grant Taylor wrote:
> On 06/21/07 11:00, Peter Rabbitson wrote:
> Ah, here is part of the problem.
> 
>                      (    eth1    ) --- (DSL Modem) / DSL Gateway
> Server --- (DMZ) --- (Linux Router)
>                      (    eth2    ) --- (Cable Modem) / Cable Gateway
> 
> Note:  Globally routable DMZ is connected to eth0.
> 
> Traffic will be to / from servers in the DMZ and clients on the internet 
> at large.
> 
> My "Linux Router" (above) *IS* the system that would send the ICMP ...
> unreachable message.  So, there is not an upstream router to look for 
> traffic from.
> 
> I suppose that I could match traffic coming in eth1 or eth2, but I would 
> have to be careful about he source / destination.  However the very 
> existence of inbound traffic means that the link is up for at least 
> inbound traffic.  However I also need to know that I can send traffic 
> too. 
You are misunderstanding how ICMP works. The modems themselves are hops, 
and the thing they connect to is another hop. Just look at the first 
several entries of a traceroute to any destination, and you will see 
what I mean. If you still do not believe me - pull the ISP side cable 
from the modem, while still having your router connected to it, and try 
to do a ping to somewhere. Look at the source of the dest. unreachable 
message - it will come from the modem, not from the linux box.

> I've had situations where the traffic would come in but not go out
> (Do NOT ask how why!).
This would be a problem with your router configuration. It is virtually 
impossible to have an upstream problem that would cause this. It either 
works both ways or does not at all.
> I suppose such monitoring will work, but I still feel like there is a 
> better solution out there.
I thought so too, but it seems that the only thing that comes close (and 
still does not cut it) are the DGD patches. And (this is my personal 
opinion) the fact they have not been included in the kernel for such a 
long time, indicates there is something fishy about them.

I myself am using a different approach as I am doing load balancing as 
well. A script sends icmp ping packets with large payloads to several 
destinations and computes the mean rtt. Then the ratio of both rtts is 
used to assign link weights. When no pings come back one of the weights 
will be 0, and effectively no routing will be performed through this link.
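
One possible reading of this weighting scheme, sketched in POSIX shell with awk. It assumes the weights are made inversely proportional to the mean RTT and scaled so that they sum to 100; the function name, the scaling, and the convention of passing 0 for a link with no replies are all assumptions, since the actual script is not shown in the thread.

```shell
#!/bin/sh
# Sketch only: the function name and the scaling to 100 are assumptions.

# link_weights RTT_A RTT_B
# Mean RTTs in milliseconds; pass 0 for a link that returned no replies.
# Prints two integer weights, inversely proportional to RTT, summing to 100.
# A link with no replies gets weight 0.
link_weights() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a <= 0 && b <= 0) { print 0, 0; exit }
        if (a <= 0)           { print 0, 100; exit }
        if (b <= 0)           { print 100, 0; exit }
        # 1/a : 1/b normalises to b/(a+b) : a/(a+b)
        wa = int(100 * b / (a + b) + 0.5)
        print wa, 100 - wa
    }'
}

link_weights 20 60   # faster link A gets the larger share
link_weights 0 30    # link A returned nothing
```

A weight of 0 here simply means the link would be left out when the multipath route is rebuilt.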
> There is also the fact that I am wanting to use one route unless it is 
> down and then use the backup.  If the primary route is up and traffic 
> comes in the backup, it is to go back out the primary.
> 
Nothing above prevents you from doing this, although it is a bad idea.
Of course if you know what you are doing and still want to do it - it's
your system :)

Grant Taylor

2007-Jun-21 17:02 UTC

Re: Redundant internet connections.

On 06/21/07 11:47, Peter Rabbitson wrote:
> You are misunderstanding how ICMP works. The modems themselves are hops,
> and the thing they connect to is another hop. Just look at the first 
> several entries of a traceroute to any destination, and you will see 
> what I mean. If you still do not believe me - pull the ISP side cable 
> from the modem, while still having your router connected to it, and try 
> to do a ping to somewhere. Look at the source of the dest. unreachable 
> message - it will come from the modem, not from the linux box.
Um, if you are using bridging modems (like I am) you are incorrect.  If 
you are using modem router combos, yes.  Every single install that I 
have used bridging modems on between the Linux router and the ISP acts 
the same way.  If I have a workstation behind a Linux router (that is 
doing basic NATing) connected to a bridging DSL / Cable modem and I 
unplug the phone line or the coax cable from the modem, it is the Linux 
box that sends the ICMP message, NOT the modem.  This is as expected 
too.  The bridging modems bridge the traffic from the ethernet to the 
DSL / cable modem which is in turn bridged from DSL / cable back to a 
network interface at the ISP.  Thus there is one broadcast domain
between the Linux router and the ISP's router.  Thus there is no IP
device between the Linux router and the ISP router to send an ICMP
message back.

No, again, if you are dealing with modem router combos, I'll grant you
what you say, but not on bridging modems.
> This would be a problem with your router configuration. It is virtually 
> impossible to have an upstream problem that would cause this. It either 
> works both ways or does not at all.
No, it was not a fault with my router.  It was a faulty radio in a
(W)ISP's core network.  Completely out of my control.  When the ISP
replaced the piece of equipment in their core (not even on the link to
me) things started working correctly again.
> I thought so too, but it seems that the only thing that comes close (and 
> still does not cut it) are the DGD patches. And (this is my personal 
> opinion) the fact they have not been included in the kernel for such a 
> long time, indicates there is something fishy about them.
I agree that something is not quite right about the DGD patches.
Though, I've applied them to 2.6.21.5 and did not have any more luck
with them, so I'm not sure that there is much use for them.  However I
think that the DGD tests and the failures there were related to my config
not being right.
> I myself am using a different approach as I am doing load balancing as 
> well. A script sends icmp ping packets with large payloads to several 
> destinations and computes the mean rtt. Then the ratio of both rtts is 
> used to assign link weights. When no pings come back one of the weights 
> will be 0, and effectively no routing will be performed through this link.
*nod*  I am presently using dual load balanced SDSL circuits with 
automated (OSPF) failover at my office.  This is working out VERY well.
However the questions I'm asking have to do with a project for a
different client.
> Nothing above prevents you from doing this, although it is a bad idea. 
> Of course if you know what you are doing and still want to do it - it's
> your system :)
The contracts for the connections dictate that one is only used as a
backup.  If the primary is up any and all traffic outbound is to go out
over it.  So, if traffic comes in over the backup, returning outbound
traffic is to go out the primary.  Seeing as how the DMZ behind this
router is globally routable, I'm not worried about issues with
asymmetric routes.  There are asymmetric routes in the core all the
time.  In my opinion, it is only at the edge where you NAT that you have
to maintain IP addresses and thus have to be very careful and avoid
asymmetric routes.  Also, seeing as how both circuits are an ethernet
connection that can carry a frame size / MTU of 1500 bytes, I don't see
the problems that would be introduced by encapsulated traffic like PPPoE
for one link versus the other link.  In short, I'm willing to listen to
problems with the asymmetric routes, but I have yet to hear anything
that concerns me or even chafes me a little.



Grant. . . .

Peter Rabbitson

2007-Jun-21 17:37 UTC

Re: Redundant internet connections.

Grant Taylor wrote:
> No, again, if you are dealing with modem router combos, I'll grant you
> what you say, but not on bridging modems.
*nod* I had several cases when my ISP had problems like the one you
describe below, so the first 2 hops were pingable but nothing outside.
This is why I suggested the entire ISP subnet exclusion, just to be on
the safe side.
>> This would be a problem with your router configuration. It is 
>> virtually impossible to have an upstream problem that would cause 
>> this. It either works both ways or does not at all.
> 
> No, it was not a fault with my router.  It was a fault radio in an 
> (W)ISPs core network.  Completely out of my control.  When the ISP 
> replaced the piece of equipment in their core (not even on the link to 
> me) things started working correctly again.
I got to give you this one. Murphy at work.
> *nod*  I am presently using dual load balanced SDSL circuits with 
> automated (OSPF) failover at my office.  This is working out VERY well. 
>  However the questions I'm asking have to do with a project for a
> different client.
No contest here either. It's just rather rare for a small scale end-user
to be able to get access to IGPs.
> asymmetric routes.  Also, seeing as how both circuits are an ethernet 
> connection that can carry a frame size / MTU of 1500 bytes, I don't see
> the problems that would be introduced by encapsulated traffic like PPPoE
> for one link versus the other link.  In short, I'm willing to listen to
> problems with the asymmetric routes, but I have yet to hear any thing 
> that concerns me or even chafes me a little.
> 
I misread the part about the stuff behind the router being routable. 
There is nothing wrong with asymmetric routing in this case. However you 
bring up an interesting point about MTU, only to dismiss it right there. 
I think you will have a problem with the default MTU of 1500 being 
combined with the effective MTU of PPPoE links being 1492. Too many 
systems in this day and age have PMTU discovery enabled, and you know 
what is the current state of ICMP messaging on the net.

Peter

Grant Taylor

2007-Jun-21 18:27 UTC

Re: Redundant internet connections.

On 06/21/07 12:37, Peter Rabbitson wrote:
> *nod* I had several cases when my ISP had problems like the one you
> describe below, so the first 2 hops were pingable but nothing outside. 
> This is why I suggested the entire ISP subnet exclusion, just to be on 
> the safe side.
*nod*
> I got to give you this one. Murphy at work.
Ya, Murphy and I go back a long way.  I can usually tell when I'm on the
right track to solving a problem.  If I'm about to beat something, I
start having other little problems, i.e. batteries in equipment going
out, not having the proper patch cord (straight through versus cross
over), not having proper user name and / or password for equipment, etc.
I've gotten to the point that I rather like seeing such speed bumps
because I have noticed that they are usually an indication that I'm at
least going the right direction.
> No contest here either. It's just rather rare for a small scale end-user
> to be able to get access to IGPs.
Well, just because OSPF is what is used does not mean that I have access
to the IGP.  To make things work, I'm having to have my ISP co-locate a
piece of their equipment at my facility so they are using the IGP
within their administrative domain.  I pick up from the single ethernet
interface out of said equipment.  It's just a political / administrative
paradigm shift, but it does allow the circuits to do what I want them to
do and rather nicely at that I might add.
> I misread the part about the stuff behind the router being routable. 
> There is nothing wrong with asymmetric routing in this case. However you 
> bring up an interesting point about MTU, only to dismiss it right there. 
> I think you will have a problem with the default MTU of 1500 being 
> combined with the effective MTU of PPPoE links being 1492. Too many 
> systems in this day and age have PMTU discovery enabled, and you know 
> what is the current state of ICMP messaging on the net.
*nod*  I figured that the globally routable DMZ IPs was not sinking in 
so I tried re-stating it differently to see if it would make it.  ;)

Both of my links use statically assigned IP addresses on the raw
ethernet interfaces.  Thus there is no encapsulation (MTU) overhead to
worry about, i.e. no PPPoE.  Seeing as how I'm running MTUs of 1500 out
my interfaces to the world and at least that or larger in to the ISP
(ATM links have 4470 (set for something else some time previous)), I
don't think MTU issues will be on my end.

Incidentally, this is one of the reasons that I try to avoid PPPoE if I
can.  Well, MTU and the fact that our local incumbent phone company, as an
ISP, likes to tear down the PPPoE connections after less than 60 seconds
of inactivity *WITH OUT* notifying the client end.  Thus our only
reliable recourse is to tear down the connection on the client end
before the ILEC does so that we know the state and can re-establish it
on demand when needed.



Grant. . . .

Alex Samad

2007-Jun-21 21:01 UTC

Re: Redundant internet connections.

On Thu, Jun 21, 2007 at 05:35:13PM +0200, Peter Rabbitson wrote:
> Grant Taylor wrote:
> 
> >I need a way for the Linux kernel to try to use a default gateway and
> >switch to another one if it does not see any traffic.
Should something like this work?

default  proto static  metric 5 nexthop via 58.173.108.1  dev vlan2 weight 10
		nexthop via 10.20.20.106  dev ppp0 weight 20

and then let the dgd detect dead gateways and drop the relevant route about.
_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

Grant Taylor

2007-Jun-21 21:24 UTC

Re: Redundant internet connections.

On 06/21/07 16:01, Alex Samad wrote:
> should something like this work
> 
> default  proto static  metric 5
> 	nexthop via 58.173.108.1  dev vlan2 weight 10
> 	nexthop via 10.20.20.106  dev ppp0 weight 20
> 
> and then let the dgd detect dead gateways and drop the relevant route
> about.
Doesn't this use "Equal Cost Multi Path" (ECMP) routing?

If so, how does this take in to account that I do not want any of the 
traffic to run over the backup connection unless the primary is down?

It is my understanding that the weights of an ECMP route are for a 
fraction of the traffic.  I.e. 10/30 and 20/30 of the traffic will use 
each of the routes.

(Note:  I state 10/30 and 20/30 because the man page indicates that 
10/30 does not equal 1/3.  Namely because the kernel creates an in 
memory route for each weight for each route.  Thus if you use a weight 
of 10, there will be 10 routes in memory.)
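
The 10/30 and 20/30 arithmetic can be spelled out with a small POSIX shell helper. The function name is hypothetical, and the percentages are nominal shares of routing decisions only; how much traffic actually follows each path depends on how the kernel caches routes per destination.

```shell
#!/bin/sh
# Sketch only: the function name is an illustrative assumption.

# weight_shares W1 W2 ...: print each weight as a fraction of the total
# and as a whole-number percentage (integer division, so rounded down).
weight_shares() {
    total=0
    for w in "$@"; do
        total=$(( total + w ))
    done
    [ "$total" -gt 0 ] || return 1
    for w in "$@"; do
        echo "$w/$total = $(( 100 * w / total ))%"
    done
}

weight_shares 10 20
```

With the weights from the route above this prints 10/30 = 33% and 20/30 = 66%, so the backup link would still carry a nominal share of the traffic, which is exactly the objection being raised.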

Grant. . . .

Alex Samad

2007-Jun-21 22:18 UTC

Re: Redundant internet connections.

On Thu, Jun 21, 2007 at 04:24:19PM -0500, Grant Taylor wrote:
> On 06/21/07 16:01, Alex Samad wrote:
> >should something like this work
> >
> >default  proto static  metric 5
> >	nexthop via 58.173.108.1  dev vlan2 weight 10
> >	nexthop via 10.20.20.106  dev ppp0 weight 20
> >
> >and then let the dgd detect dead gateways and drop the relevant route
> >about.
> 
> Doesn't this use "Equal Cost Multi Path" (ECMP) routing?
Sorry, yep.  I had just woken up and was reading and answering whilst
eating breakfast.

Okay, then why not

default via preferred path
default via backup path metric 100

Grant Taylor

2007-Jun-21 22:23 UTC

Re: Redundant internet connections.

On 06/21/07 17:18, Alex Samad wrote:
> sorry yep, just woken up, reading and answering whilst eating breakfast
*nod*
> okay then why not
> 
> default via preferred path
> default via backup path metric 100
I've done that with a metric of 0/1, and 1/2.  The problem that I'm
seeing is that the system will never try to use the second metric.
It's as if the system will never go to a next higher metric if it does
not receive an error while trying to use a lower metric.



Grant. . . .

Alex Samad

2007-Jun-21 22:30 UTC

Re: Redundant internet connections.

On Thu, Jun 21, 2007 at 05:23:23PM -0500, Grant Taylor wrote:
> On 06/21/07 17:18, Alex Samad wrote:
> >sorry yep, just woken up, reading and answering whilst eating breakfast
> 
> *nod*
> 
> >okay then why not
> >
> >default via preferred path
> >default via backup path metric 100
> 
> I've done that with a metric of 0/1, and 1/2.  The problem that I'm
> seeing is that the system will never try to use the second metric.
> It's as if the system will never go to a next higher metric if it does
> not receive an error while trying to use a lower metric.
Strange.  I am running openwrt on a linksys wr54gs with 1 cable and 1 adsl.
I load balance (I also have Julian's patches applied; it's 2.4.30).  When
the routing notices the link is dead (so if I do an ip li.), it marks the
routes as dead and stops using them.  Once the interface is brought down
the routes disappear.

I haven't followed the dgd threads, but I seem to remember it having some
problem with upstream detection.

You talked about getting OSPF routing for this.  Is this from the ISP's
inbound as well as outbound?  Wouldn't OSPF handle link state as well?
(It's been a while since I looked at OSPF.)
> 
> 
> 
> Grant. . . .

Grant Taylor

2007-Jun-21 22:35 UTC

Re: Redundant internet connections.

Ok, after more testing and trying things that others have suggested, 
I've made some headway.  Or at least what I think is some headway.

This is not an answer, just data that I have gathered along the way to 
help others that are trying to help me.

I have determined that either I can not get the DGD patches 
(routes-2.6.21-15.diff) off of Julian's site to work the way that I 
think it should, or I'm using the wrong patch there from, or said patch 
does not work.  I don't know which, and I can't really say one way or 
the other.

If I compile a stock 2.6.21.5 kernel (plus patch to see my VMWare LSI 
SCSI card (should make no difference in routing)) with out ECMP or any 
advanced routing, I can get the system to fail to the next route after a 
period of time if the first is down.  I do this by adding the two 
alternate routes with the same metric in reverse order that I want to 
use.  I.e. if I have the following routes:  a.b.c.d (preferred) and 
z.y.x.w (backup) I add the backup route and then the preferred route it 
will fail over after time.  If I set /proc/sys/net/ipv4/route/gc_timeout 
to 10 seconds the system will fall back to the backup route in about 120 
seconds.  I'm still playing with numbers in the /proc tree.  The problem 
with this method is that I have yet to get it to start re-using the 
primary route when it becomes available again.
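
The /proc knob mentioned above can be inspected before it is changed. The following is a minimal sketch: it reads the current value if the file is readable and only prints the command that would set a new one. The helper names are made up for this example, and the 10-second value is simply the one tried above.

```shell
#!/bin/sh
# Sketch: show the route GC timeout and print the command that would change it.
# GC_FILE is the /proc path named in the message above.
GC_FILE=/proc/sys/net/ipv4/route/gc_timeout

show_gc_timeout() {
    # $1: file to read (defaults to the real /proc entry)
    f=${1:-$GC_FILE}
    if [ -r "$f" ]; then
        echo "gc_timeout is $(cat "$f") seconds"
    else
        echo "gc_timeout not available at $f"
    fi
}

set_gc_timeout_cmd() {
    # $1: new timeout in seconds; the command is printed, not executed
    echo "echo $1 > $GC_FILE"
}

show_gc_timeout
set_gc_timeout_cmd 10
```

Writing to the file requires root, and the change does not survive a reboot unless it is also placed in the system's sysctl configuration.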

If I use the previously mentioned DGD patch, the system will just try to 
cache the route for something like 245 days.  I'm still wondering if I 
am applying the correct patch.  This happens with or with out ECMP 
compiled in to the kernel.



Grant. . . .

Grant Taylor

2007-Jun-21 22:39 UTC

Re: Redundant internet connections.

On 06/21/07 17:30, Alex Samad wrote:
> Strange I am running openwrt on a linksys wr54gs with 1 cable and 1 
> adsl. I  load balance, (also have julian patches applied - its 
> 2.4.30), when the routing notices the link is dead, so if i do a ip 
> li. then it marks the routes as dead and stops using them, once the 
> interface is brought down the routes disappear

I am not wanting load balancing.  Rather I want to use one link and only 
use the second if the first is down.

> I haven't followed the dgd threads, but I seem to remember it having 
> some problem with upstream detection.

*nod*  I'm getting that consensus.

> You talked about getting OSPF routing for this, is this from the 
> ISP's inbound as well as outbound. Wouldn't OSPF handle link state as
> well ? (it been a while since I looked at OSPF)

The OSPF was for a different project / different installation.



Grant. . . .

Gustavo Homem

2007-Jun-22 11:54 UTC

Re: Redundant internet connections.

On Thursday 21 June 2007 18:02, Grant Taylor wrote:
> On 06/21/07 11:47, Peter Rabbitson wrote:
> > You are misunderstanding how ICMP works. The modems themselves are hops,
> > and the thing they connect to is another hop. Just look at the first
> > several entries of a traceroute to any destination, and you will see
> > what I mean. If you still do not believe me - pull the ISP side cable
> > from the modem, while still having your router connected to it, and try
> > to do a ping to somewhere. Look at the source of the dest. unreachable
> > message - it will come from the modem, not from the linux box.
>
> Um, if you are using bridging modems (like I am) you are incorrect. 

This is absolutely the way to do it with ADSL.

Using a modem in bridged mode minimizes the responsibility of the modem/router, 
which is a potentially unstable device. Let the stable Linux box do the work 
(routing+NAT) and get the public IP. And firewall the Linux box itself with 
iptables. This is the most flexible and stable way to go.

Cheers
Gustavo



-- 
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt

Grant Taylor

2007-Jun-22 14:22 UTC

Re: Redundant internet connections.

(Off thread topic.)

On 06/22/07 06:54, Gustavo Homem wrote:
> This is absolutely the way to do it with ADSL.

I could not agree more.

> Using a modem in bridged mode minimizes the responsibility of the 
> modem/router which is a potentially unstable device. Let the stable 
> Linux box do the work (routing+nat)  and get the public IP. And 
> firewall the Linux box itself with iptables. This is the most 
> flexible and stable way to go.

*nod*  About the only thing that I'm looking at doing differently at my 
house is to use the Thomson SpeedTouch 330 USB ADSL modem to put 
the ATM stack on the Linux box itself.  This way the Linux kernel will 
handle the bridging and buffering versus an external device that has 
arbitrary pauses waiting for buffers to fill prior to transmitting data.

My preliminary tests with the ATM stack on Linux show a speed increase 
over the external bridging modem too.  :)  My tests show that Linux / 
Windows think the raw ATM with bridging circuit will get close to 1.6 
Mbps while the bridged devices get closer to 1.5 Mbps.  I also see the 
latency between the device connected to the DSL and the upstream 
gateway drop by 3 - 5 ms.

Grant. . . .

Gustavo Homem

2007-Jun-22 14:57 UTC

Re: Redundant internet connections.

On Friday 22 June 2007 15:22, Grant Taylor wrote:
> (Off thread topic.)
>
> On 06/22/07 06:54, Gustavo Homem wrote:
> > This is absolutely the way to do it with ADSL.
>
> I could not agree more.
>
> > Using a modem in bridged mode minimizes the responsibility of the
> > modem/router which is a potentially unstable device. Let the stable
> > Linux box do the work (routing+nat)  and get the public IP. And
> > firewall the Linux box itself with iptables. This is the most
> > flexible and stable way to go.
>
> *nod*  About the only thing that I'm looking at doing differently at my
> house is to use the Thomson SpeedTouch 330 USB ADSL modem to put
> the ATM stack on the Linux box itself. 

I've done this, but I think it's unreliable for professional use. The USB 
modems are non-standard, so if one burns out you can't exchange it for a 
different one without feasible but time-consuming tweaking (I tried more 
than one USB device...).

Even for Ethernet bridging devices I only use models which are delivered by 
ISPs (rather than retail shop devices), to guarantee they were tested for 
stability:

POTS:
http://www.huawei.com/products/terminal/products/view.do?id=87

ISDN:
http://www.acbs-dsl-store.com/contenu/Articles/Article.asp?PdtNum=DSLGP628LP

These models run forever in bridged mode. The second one accepts multiple 
PPPoE clients on different ports.

> This way the Linux kernel will 
> handle the bridging and buffering versus an external device that has
> arbitrary pauses waiting for buffers to fill prior to transmitting data.
>
> My preliminary tests with the ATM stack on Linux show a speed increase
> over the external bridging modem too.  :)  My tests show that Linux /

That's to be expected, since using PPPoA instead of PPPoEoA reduces the 
overhead. But I don't know a standard PPPoA setup.

But if we want QoS working, we can't use the full line capability anyway.

> Windows think the raw ATM with bridging circuit will get close to 1.6
> Mbps while the bridged devices get closer to 1.5 Mbps.  I also see a
> lower latency between the device connected to the DSL and the upstream
> gateway by a factor of 3 - 5 ms.

Even if that happens, it would hardly compensate for the risk of lower 
reliability.

Cheers
Gustavo

-- 
Angulo Sólido - Tecnologias de Informação
http://angulosolido.pt

Grant Taylor

2007-Jun-22 15:59 UTC

Re: Redundant internet connections.

On 06/22/07 09:57, Gustavo Homem wrote:
> I've done this, but I think it's unreliable for professional use. The
> USB modems are non-standard so if one burns you can't exchange it for
> a different one without feasible but time consuming tweaking (tried 
> more than one USB devices...).
> 
> Even for Ethernet bridging devices I only use models which are 
> delivered by ISPs (rather than retail shop devices), to guarantee they 
> were tested for stability:
> 
> POTS: http://www.huawei.com/products/terminal/products/view.do?id=87
> 
> ISDN: http://www.acbs-dsl-store.com/contenu/Articles/Article.asp?PdtNum=DSLGP628LP
> 
> 
> These models run forever in bridged mode. The second one accepts 
> multiple PPPoE clients on different ports.
> 
> 
> That's expectable since using PPPoA instead of PPPoEoA, reduces the
> overhead. But I don't know a standard PPPoA setup.
> 
> But if we want QoS working, we can't use the full line capability 
> anyway.
> 
> Even if that happens, it would hardly compensate the risk of lower 
> reliability.

All very valid points and things to consider.  However for a home 
environment / non critical environment, it provides a lot of potential.



Grant. . . .

Grant Taylor

2007-Jun-22 18:57 UTC

Re: Redundant internet connections.

On 06/21/07 17:35, Grant Taylor wrote:
> The problem with this method is that I have yet to get it to start
> re-using the primary route when it becomes available again.

After doing some more testing and investigation, I think I know why the
system appears to not be using the primary route. My test / lab setup
consists of a Linux router with two subnets bound to one interface (eth0
and eth0:1) and my (VMWare) test Linux system with two ethernet
interfaces bridged to the local LAN with one subnet on each interface.
I have two (as far as Linux is concerned) physical interfaces so that
I can have TX / RX counters for each interface to see which way the
traffic is going out. This worked fine to have the system fall from the
primary down to the secondary route when the primary route went away.
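
One way to read such per-interface counters is from sysfs. The sketch below is an assumption about how this could be done rather than what was actually used in the test; the interface names are placeholders, and `SYSFS_NET` and `iface_counters` are names invented for the example.

```shell
#!/bin/sh
# Sketch: print TX/RX packet counters for an interface, read from sysfs.
# SYSFS_NET can be overridden so the function can be pointed elsewhere.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

iface_counters() {
    # $1: interface name, e.g. eth0
    dir="$SYSFS_NET/$1/statistics"
    if [ ! -r "$dir/tx_packets" ] || [ ! -r "$dir/rx_packets" ]; then
        echo "$1: no counters found" >&2
        return 1
    fi
    echo "$1 tx=$(cat "$dir/tx_packets") rx=$(cat "$dir/rx_packets")"
}

# Compare the two interfaces before and after a failover test.
for ifc in eth0 eth1; do
    iface_counters "$ifc" 2>/dev/null || echo "$ifc: not present on this machine"
done
```

Running it twice, a few seconds apart, shows which interface's TX counter is growing and therefore which route is in use.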

However I never saw the traffic from the test Linux system back to the
interface for the primary route. After doing some investigation I think
this is because the same MAC address is used for both the primary and
secondary routes, seeing as how both addresses are on the same physical
interface on my Linux router.

So, to test this, I took down the primary route, let the test Linux box
fall back to the backup route, which it did. Then I brought the primary
route back on line and waited. As expected the traffic did not start
using the primary route, presumably because of MAC addresses for routes
being cached with an association to a device. So, while the system was
pinging out to the world with the primary route brought back up, I
cleared entries from the local test Linux box's ARP cache and all of a
sudden, traffic started going out the correct interface.
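
Clearing the ARP cache as described can be done with iproute2. This is a dry-run sketch that prints the commands instead of running them; the interface name is a placeholder and the exact commands used in the test are not stated in the message.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that list and then clear the cached
# neighbour (ARP) entries on one interface.
arp_flush_cmds() {
    # $1: interface whose neighbour entries should be cleared
    echo "ip neigh show dev $1"
    echo "ip neigh flush dev $1"
}

arp_flush_cmds eth0
```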

So, now I think that the method of having two equal cost (metric) routes
on the box will work. I'm now going to test where the two routes are
different MAC addresses to see if the traffic does indeed start using
the proper route again (seeing as how there should not be any confusion
with MAC addresses).

Grant. . . .

Grant Taylor

2007-Jun-22 21:08 UTC

Re: Redundant internet connections. !!!SOLVED!!!

On 06/22/07 13:57, Grant Taylor wrote:
> I'm now going to test where the two routes are different MAC
> addresses to see if the traffic does indeed start using the proper
> route again.

Ok, I have done it and it is working.

The short answer is all you need to have backup routes is to enter them
in reverse order. You do not need to do any special kernel options,
patch the kernel or any thing else, or any special ip rules. All you
need to do is to enter the routes in the reverse of the order that you
want them to be used.

For example, suppose I have several internet connections, each with
its own default gateway, listed here in order of preference. Obviously
the default gateways must not be on the same subnet.

GW1: A.B.C.D
GW2: Z.Y.X.W
GW3: K.L.M.N

route add default gw K.L.M.N
route add default gw Z.Y.X.W
route add default gw A.B.C.D

Note: All the above routes are the same metric (default of 0).

I do not know why you have to add the routes in reverse. I have just
noticed that route adds the routes as the highest priority to the
routing table. Filled from the top, not the bottom type thing. So,
conversely add them in the reverse order.
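
The reverse-order rule can be wrapped in a small helper so the gateways are written down in the order they should be used. The sketch below only prints the `route add` commands; the function name and the gateway addresses (documentation ranges) are placeholders for this example.

```shell
#!/bin/sh
# Sketch: take gateways in order of preference (most preferred first) and
# print the "route add" commands in reverse, so the preferred one is added last.
reversed_route_cmds() {
    cmds=""
    for gw in "$@"; do
        # Prepend each command, which reverses the order of the arguments.
        cmds="route add default gw $gw
$cmds"
    done
    printf '%s' "$cmds"
}

# Preferred, second choice, last resort (all hypothetical addresses).
reversed_route_cmds 192.0.2.1 198.51.100.1 203.0.113.1
```

The printed commands can be reviewed and then piped to `sh` as root; the last-resort gateway is added first and the preferred gateway last.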

In my current test environment I have two identical VMWare virtual
machines (literal copy from one to the other) that I have modified the
configuration and tested. I'll try to depict it below:

                     ( ISP 1 ) --- ... --- ( ISP 1 ) --- ( Internet )
                    /                          |
(DMZ) --- ( Router )                   ( Peering Link )
                    \                          |
                     ( ISP 2 ) --- ... --- ( ISP 2 ) --- ( Internet )

In this scenario, the DMZ IP address space is from ISP 1. ISP 1 has a
route to the DMZ via the ISP 1 IP address on my local Linux router. ISP
1 has a secondary route to the DMZ via the IP address on ISP 2's router
over the peering link. ISP 2 has a route to the DMZ via the ISP 2 IP
address on my local Linux router.

The link between my local Linux router and ISP 1 is a high speed
wireless link. The link between my local Linux router and ISP 2 is a
lower speed ADSL link.

The ADSL link from ISP 2 is *ONLY* used for backup access in case my
local Linux router is unable to communicate with ISP 1's router. Thus if
for some reason traffic does come in to my ISP 2 IP address it is to go
back out the ISP 1 link, thus asymmetric routing.

I appreciate all the suggestions that everyone submitted while trying to
help resolve this issue. In the end it turned out that everything that
was needed is already in the stock / vanilla kernel.org kernel. All I
had to do was be smart enough to use it.

Some points to help others with this issue if they ever need it:
- Equal Cost Multi Path (a.k.a. E.C.M.P.) routing is NOT needed.
- NO ip rule(s) were needed to pull this off.
- NO additional routing tables were needed to pull this off.
- NO patches (i.e. Julian's Dead Gateway Detection patch) were needed
to pull this off.
- NO special scripts were needed to monitor and / or modify the
routing table(s). (Note: This is applicable to my scenario, see below.)

With regards to the monitoring of routing tables, I did not need to do
any thing special, i.e. no ping or arping was needed. I think this was
because when my primary route went down I would start using the
secondary route and the returning traffic would always try to use the
primary and fail back to the secondary route. When the primary route
did come back up the inbound traffic would come in the primary interface
/ route thus incrementing the counters in my kernel thus making the
kernel aware that the primary route was indeed back up so it could
switch back to it.

Note: In my test, I was manually taking the interface down on one VM
and subsequently bringing it back up and restoring the route(s) across it.
In my opinion, this interface fiddling on the upstream end is not
automatic, but is out side of the scope of the client end failing back
to a backup route. If I were trying to do this between two systems
where the link in the middle (between intermediary switches) went down,
I believe I would have to do some sort of heart beat across the link.
In this case, I would probably use (read: try) arping first and then
switch to something else if that did not work.
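
A heartbeat of the kind described could look something like the following sketch. It was not part of the tested setup. It assumes the iputils version of `arping` (`-q` quiet, `-c` count, `-w` deadline in seconds, `-I` interface); the function names, interface names and addresses are placeholders.

```shell
#!/bin/sh
# Sketch of a heartbeat check: probe the primary gateway and report which
# gateway should carry the default route.
arping_probe() {
    # $1: interface, $2: gateway IP. Assumes iputils arping; exits 0 on a reply.
    arping -q -c 3 -w 5 -I "$1" "$2"
}

# PROBE is called as: $PROBE <interface> <gateway-ip> and must exit 0 on success.
PROBE=${PROBE:-arping_probe}

choose_gateway() {
    # $1 $2: primary interface and gateway; $3 $4: backup interface and gateway
    if $PROBE "$1" "$2"; then
        echo "$2"
    else
        echo "$4"
    fi
}

# Example call (needs root for arping):
#   choose_gateway eth1 192.0.2.1 eth2 198.51.100.1
```

A cron job or loop could compare the result with the current default route and change it only when the two differ. `PROBE` can be pointed at a different check, such as a ping to a host beyond the gateway, if arping proves unsuitable.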

Grant. . . .
