thr3ads.net - LARTC - Redundant Internet connections [Oct 2003]

Seth J. Blank

2003-Oct-13 15:45 UTC

Redundant Internet connections

I have a firewall with two redundant internet connections coming in 
(eth0 and eth1) and an intranet behind eth2.

What I am trying to do is have data off of eth2 split evenly between 
eth0 and eth1, and if one interface goes down, to fully utilize the other.

What I'm trying to do is have all data from eth0 be passed on to eth2
(unless it's stopped by the firewall), same with eth1, and all data
from eth2 be split evenly between eth0 and eth1.

currently I have the following routes and rules to accomplish this:

ip route add 10.0.0.0/8 via GATEWAY0 table 1 proto static
ip route add 10.0.0.0/8 via GATEWAY1 table 2 proto static

ip route add default table default scope global \
    nexthop via GATEWAY0 dev eth0 weight 1 \
    nexthop via GATEWAY1 dev eth1 weight 1

ip rule add pref 1500 iif eth0 table 1
ip rule add pref 1501 iif eth1 table 2
ip rule add pref 100 iif eth2 table default

This does NOT work properly.
From localhost, everything works perfectly. I can bring up and down
interfaces and everything works properly and transparently.
But, from the intranet, everything stops. With a different default route:
ip route add default via GATEWAY0 dev eth0 table default
everything is fine from both localhost and the intranet. Same with 
GATEWAY1 eth1.

Can anyone offer advice on how to resolve this problem?
The only way I can think of so far is a remarkably simple but stupid 
hack, where I just ping -I eth0 GATEWAY0 and ping -I eth1 GATEWAY1 every 
thirty seconds or so and switch default routes if an interface is down. 
This obviously does not solve the problem, nor allow bandwidth to be 
shared across both lines.

Any help would be greatly appreciated.

Seth J. Blank
Systems Operations
Capital Market Services, LLC



_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/

Seth J. Blank

2003-Oct-13 16:24 UTC

Redundant Internet connections [Updated]

Sorry, I really wasn't paying attention when I wrote this (i.e. I've had
no sleep).

I have the routing tables working properly for the internal network.

What I need to do is have the routing tables update the gateways when a 
line is down.

i.e.    intranet ----- firewall ----- router1 ----- internet
                               \----- router2 ----- internet

Currently, I have the gateway from the firewall being nexthops between
router1 and router2. This works fine. But what I need to do is have the
firewall check the links between router1/2 and the internet and switch
gateways if a line is down.

What I want to do, but can't figure out how to, is send out a packet
through router1 and see if it gets an arbitrary number of hops (probably
3) out. If not, switch the default route to use the other gateway. This
needs to be done for both gateways, and there also needs to be a route
to restore the gateways when the line goes back up.

Any help would be greatly appreciated.

Thanks so much,
Seth J. Blank
Systems Operations
Capital Market Services, LLC

Robert Kurjata

2003-Oct-13 17:58 UTC

Re: Redundant Internet connections [Updated]

Hello Seth,

In your message dated 13 October 2003 (18:24:08) you wrote:

SJB> Sorry, I really wasn't paying attention when I wrote this (i.e. I've had
SJB> no sleep).

SJB> I have the routing tables working properly for the internal network.

SJB> What I need to do is have the routing tables update the gateways when a
SJB> line is down.

SJB> i.e.    intranet ----- firewall ----- router1 ----- internet
SJB>                                \----- router2 ----- internet
SJB> Currently, I have the gateway from the firewall being nexthops between
SJB> router1 and router2. This works fine. But what I need to do is have the
SJB> firewall check the links between router1/2 and the internet and switch
SJB> gateways if a line is down.
SJB> What I want to do, but can't figure out how to, is send out a packet
SJB> through router1 and see if it gets an arbitrary number of hops (probably
SJB> 3) out. If not, switch the default route to use the other gateway. This
SJB> needs to be done for both gateways, and there also needs to be a route
SJB> to restore the gateways when the line goes back up.

I have a load-balancing setup for 3 uplinks (3 different providers and
technologies) with failover, built from the Nano-HOWTO at
http://www.ssi.bg/~ja/ (carefully done By-The-Book - take any shortcut
and it stops working).

To check whether the net is reachable over either of the links, just
ping some machines outside (a set would be nice), forcing the source
address to be one or the other, and then decide whether you need to
change the normal multipath gateway to a single-hop one via link 1 or
2. This should work with the Nano-HOWTO setup, because it preserves the
source address and thus the routes. Works for me (after some sleepless
nights and tons of coffee :). I can pull the plug out and nothing bad
happens (only the traffic shaping needs some correction).
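
A minimal sketch of the check described above, assuming two uplinks. The gateway, interface and source addresses and the PROBE_HOSTS list are placeholders taken from the documentation address ranges, not values from this thread, and the function names are made up for illustration. It pings a set of outside hosts from each source address and prints the "ip route replace" command that would switch between the multipath default and a single-gateway default; it does not change any routes itself.

```shell
#!/bin/bash
# Sketch only: every address and name below is a placeholder (assumption).
GW1=192.0.2.1;    IF1=eth0; SRC1=192.0.2.10
GW2=198.51.100.1; IF2=eth1; SRC2=198.51.100.10
PROBE_HOSTS="203.0.113.5 203.0.113.9"   # hypothetical outside machines

# link_alive SRC: succeed if any probe host answers a ping sent from SRC.
link_alive() {
    local src=$1 host
    for host in $PROBE_HOSTS; do
        ping -c 1 -W 2 -I "$src" "$host" >/dev/null 2>&1 && return 0
    done
    return 1
}

# default_route_args UP1 UP2: print the route arguments matching the link
# states (1 = up, 0 = down); print nothing when both links are down.
default_route_args() {
    case "$1$2" in
        11) echo "default proto static nexthop via $GW1 dev $IF1 weight 1 nexthop via $GW2 dev $IF2 weight 1" ;;
        10) echo "default via $GW1 dev $IF1 proto static" ;;
        01) echo "default via $GW2 dev $IF2 proto static" ;;
    esac
}

# check_once: probe both links and print the command that would be run.
check_once() {
    local up1=0 up2=0 args
    link_alive "$SRC1" && up1=1
    link_alive "$SRC2" && up2=1
    args=$(default_route_args "$up1" "$up2")
    [ -n "$args" ] && echo "ip route replace $args"
}
```

Calling check_once from cron or a sleep loop, and running the printed command only when it differs from the previous one, would be one way to use it. If the multipath default lives in its own table, the table name would have to be appended to the printed arguments.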


[cut the rest]

-- 
Regards,
 Robert

Seth J. Blank

2003-Oct-14 16:56 UTC

Re: Redundant Internet connections [Updated]

Robert Kurjata wrote:
>I have a load balancing setup for 3 uplinks (3 different providers and
>technologies) w/failover set with http://www.ssi.bg/~ja/ Nano-HOWTO
>(carefully done By-The-Book - any shortcut and it's gone).

I have finished implementing this step by step, and things still do not
appear to be working.

During the testing phase, I have two problems (output which differs from 
what the howto says I should get).
1) When I run "ip route list table main", only the proper entries for 
NWE1/NME1 and NWE2/NME2 come up, not the one for NWI/NMI.
2) "ip route get from (IPE1|IPE2) to 204.152.189.113" both return 
"network unreachable"
All the other output matches exactly.

My only thoughts are that I've swapped an IP or two somewhere, but I've
been over the script a ton of times already, and nothing presents itself
to me.

Any help or troubleshooting hints would be greatly appreciated.

Seth J. Blank
Systems Operations
Capital Market Services, LLC

gypsy

2003-Oct-15 01:27 UTC

Re: Redundant Internet connections [Updated]

"Seth J. Blank" wrote:
> I have finished implementing this step by step, and things still do not
> appear to be working.
> 
> During the testing phase, I have two problems (output which differs from
> what the howto says I should get).
> 1) When I run "ip route list table main", only the proper entries for
> NWE1/NME1 and NWE2/NME2 come up, not the one for NWI/NMI.
The placement within the script of the line:
	ip route del default table main
is probably what is killing this.  When the interface is brought up, an
entry is made into the main table.  You're purging that entry.  So
arrange things so that IFI comes up AFTER the del.

The lo device should be in main also.

Another possibility is that the original main table did not have the IFI
entry when you ran
	ip rule add prio ### table main

The final thing that comes to mind is that you did not even execute
	ip link set $IFI up
	ip addr flush dev $IFI
	ip addr add $IPI/$NMI brd + dev $IFI
    (or ip addr add $IPI/$NMI brd $BRDI dev $IFI) # this is the line that populates main
> 2) "ip route get from (IPE1|IPE2) to 204.152.189.113" both return
> "network unreachable"
IS the network reachable?!!  My $0.25 is on IFI1 being dead.  Try
	ping -c1 -I eth# 204.152.189.113
where "#" is set to the first interface.  
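
The two checks suggested above can be scripted roughly as follows. This is only a sketch: has_prefix and check_main are hypothetical helper names, and the prefix you pass in is whatever NWI/NMI works out to on your system.

```shell
#!/bin/bash
# Sketch only: helper names are made up; pass in your own internal prefix.

# has_prefix PREFIX: read "ip route list" output on stdin and succeed if the
# first field of some line is exactly PREFIX.
has_prefix() {
    awk -v p="$1" '$1 == p { found = 1 } END { exit !found }'
}

# check_main PREFIX: report whether table main holds a route for PREFIX.
check_main() {
    if ip route list table main | has_prefix "$1"; then
        echo "main has a route for $1"
    else
        echo "main is missing a route for $1"
    fi
}
```

For example, check_main 192.168.0.0/24 (an assumed internal network) followed by the ping -c1 -I test above would cover both symptoms.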
> Any help or troubleshooting hints would be greatly appreciated.
> 
> Seth J. Blank

buck

Seth J. Blank

2003-Oct-15 16:12 UTC

Re: Redundant Internet connections [Updated]

Yeah, I figured out the problem (stupid mistake on my end) and 
everything is working now.

With one exception. If I pull the cat5 out of eth0 (external interface 
1) then everything just hangs. No connections can be made, etc.  Pulling 
the cat5 out of eth1 (external interface 2) has no effect. The 
connection stays like this until eth0 is plugged back in (it picks back 
up immediately)

What this suggests to me is that, even though I'm using the two
nexthops, all the data is trying to go over eth0, and nothing is being
sent over eth1.  ... And I just confirmed this with iptraf.

So the question now is, why aren't the nexthops working? I patched the
kernel, followed the nano howto precisely, and can use both interfaces
just fine (ping -I eth0/1,  etc.). If I set the default route to either
eth0 or eth1, everything works fine. But with the nexthops, it does not
appear as if the load is being balanced.

Here is my table:

default  proto static
        nexthop via GW1  dev eth0 weight 1
        nexthop via GW2  dev eth1 weight 1

Any thoughts?

Thanks a ton for all your help so far,
Seth

Seth J. Blank

2003-Oct-15 16:47 UTC

Re: Redundant Internet connections [Updated]

Another weird piece of information to add.

If I ifdown eth0, everything starts being routed over eth1. But if I 
just yank the cord out of eth0, the system sits there trying to route 
over eth0. This persists for much longer than the 60 seconds it should 
take, max, for the kernel to update the routing tables.

And it's still confusing me why the traffic isn't being split evenly
between eth0 and eth1 (iptraf shows everything going over eth0, no
traffic at all on eth1).

Thank you all so much for your help,
Seth

Robert Kurjata

2003-Oct-15 21:00 UTC

Re[2]: Redundant Internet connections [Updated]

Hi Seth,

I can't find anything more to offer than posting my working script for
load balancing over two links (it was written for three links and I hope
I didn't remove too much). It was done strictly by the rules in the
Nano-HOWTO and works. The main part is the PING section at the end.
This ensures that the kernel sees dead gateways and recovers.
But of course it WILL NOT work without some kernel patching (dead
gateway detection, static routes - just use a Jumbo Patch from
http://www.ssi.bg/~ja/ ).

A final word: the routers don't even have to respond to pings.
They need to respond to ARP requests. This setup doesn't work properly
for PPP or PPPoE connections, as they are usually NoARP.
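
Since the point above is that ARP resolution of the gateway is what matters, one way to look at it from the firewall is the kernel's neighbour table. The following is a sketch under the assumption that "ip neigh show" prints the gateway address as the first field and the neighbour state as the last field of each line; neigh_state and gateway_resolvable are made-up helper names, not part of the Nano-HOWTO or of the script below.

```shell
#!/bin/bash
# Sketch only: assumes "ip neigh show" lines of the form
#   ADDRESS dev IFACE [lladdr MAC] STATE

# neigh_state GW: read "ip neigh show" output on stdin and print the state
# recorded for GW (prints nothing if GW has no entry).
neigh_state() {
    awk -v gw="$1" '$1 == gw { print $NF; exit }'
}

# gateway_resolvable GW: succeed if the kernel holds a usable ARP entry for GW.
gateway_resolvable() {
    case "$(ip neigh show | neigh_state "$1")" in
        REACHABLE|STALE|DELAY|PROBE|PERMANENT) return 0 ;;
        *) return 1 ;;
    esac
}
```

An entry only exists after something has tried to reach the gateway, so a check like this is meaningful only after a ping to the gateway has been sent, which is what the PING section of the script does.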

I also have some shaping done with TC/CBQ on both links.

VERY IMPORTANT: all the testing is USELESS if you have fewer than 40-50
users making lots of requests to different sites, because routes are
cached in the kernel. On my system, balancing is usually poor even with
10-20 users and improves greatly with the number of users - the
difference between links drops to 10%.

Hopefully I will get some free time to write a step-by-step howto
because it took me some time to understand the thing.

Hope this helps someone. Greetings to the list
---------------------------cut here------------------------------------------

#!/bin/bash
# This script was written by Robert Kurjata, Sep 2003.
# Feel free to use it in any useful way.

# CONFIGURATION
IP=/sbin/ip
PING=/bin/ping

#--------------- LINK PART -----------------
# EXTIFn - interface name
# EXTIPn - outgoing IP
# EXTMn  - netmask length (bits)
# EXTGWn - outgoing gateway
#-------------------------------------------

# LINK 1
EXTIF1=eth2
EXTIP1=
EXTM1=
EXTGW1=

# LINK 2
EXTIF2=eth1
EXTIP2=
EXTM2=
EXTGW2=

#ROUTING PART
# removing old rules and routes

echo "removing old rules"
${IP} rule del prio 50 table main
${IP} rule del prio 201 from ${EXTIP1}/${EXTM1} table 201
${IP} rule del prio 202 from ${EXTIP2}/${EXTM2} table 202
${IP} rule del prio 221 table 221
echo "flushing tables"
${IP} route flush table 201
${IP} route flush table 202
${IP} route flush table 221
echo "removing tables"
${IP} route del table 201
${IP} route del table 202
${IP} route del table 221

# setting new rules
echo "Setting new routing rules"

# main table w/o default gateway here
${IP} rule add prio 50 table main
${IP} route del default table main

# identified routes here
${IP} rule add prio 201 from ${EXTIP1}/${EXTM1} table 201
${IP} rule add prio 202 from ${EXTIP2}/${EXTM2} table 202

${IP} route add default via ${EXTGW1} dev ${EXTIF1} src ${EXTIP1} proto static table 201
${IP} route append prohibit default table 201 metric 1 proto static

${IP} route add default via ${EXTGW2} dev ${EXTIF2} src ${EXTIP2} proto static table 202
${IP} route append prohibit default table 202 metric 1 proto static

# multipath
${IP} rule add prio 221 table 221

${IP} route add default table 221 proto static \
            nexthop via ${EXTGW1} dev ${EXTIF1} weight 2 \
            nexthop via ${EXTGW2} dev ${EXTIF2} weight 3

${IP} route flush cache

while : ; do
  ${PING} -c 1 ${EXTGW1}
  ${PING} -c 1 ${EXTGW2}
  sleep 60
done

---------------------------cut here------------------------------------------

-- 
Regards,
 Robert Kurjata

Seth J. Blank

2003-Oct-15 22:01 UTC

Re: Redundant Internet connections [Updated]

Thanks Robert, that's almost exactly what I had (I didn't have ip route
flush cache).

The problem is, everything is routing fine, and the data is being split
evenly over eth0 and eth1, but as soon as I pull the cable out of eth0
(pulling it out of eth1 doesn't seem to matter) the connection goes out
and the routes never recover until I plug the cable back in (at which
point things start flowing perfectly again without any prompting from
me). On the other hand, if I ifdown eth0, the routes switch over
silently. As soon as I bring eth0 back up, data's going over both eth0
and eth1 again.

In other words, things are working almost exactly as they should be, but
when the cat5 comes out, things just die. Someone suggested that I use
mii-tool and just ifdown eth0 if it's out, and that might work, but I'd
really rather have a solution done solely within routing tables if possible.
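
For reference, the link-state workaround mentioned above could look something like this. It is a sketch that assumes a kernel exposing /sys/class/net/IFACE/carrier (which postdates the kernels discussed in this thread) rather than mii-tool; carrier_up and link_state are hypothetical names, and SYS_NET is overridable only so the functions can be tried without real interfaces.

```shell
#!/bin/bash
# Sketch only: relies on the sysfs "carrier" attribute, an assumption about
# the running kernel; it reads 1 when the cable is plugged in.
SYS_NET=${SYS_NET:-/sys/class/net}

# carrier_up IFACE: succeed if the kernel reports link carrier on IFACE.
# An unreadable file (missing or administratively down interface) counts as
# no carrier.
carrier_up() {
    local f="$SYS_NET/$1/carrier"
    [ -r "$f" ] && [ "$(cat "$f" 2>/dev/null)" = "1" ]
}

# link_state IFACE: print "up" or "down" for use in a monitoring loop.
link_state() {
    if carrier_up "$1"; then echo up; else echo down; fi
}
```

A loop that runs ifdown eth0 when link_state eth0 prints "down" would reproduce the silent switchover described above, but it only sees the local cable and says nothing about failures further down the line.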

The other reason I want to do this from the routing tables is because I 
expect any problems to be further down the line than the cable into the 
firewall.

The network will be set up like this:

intranet eth2 --- firewall --- eth0 --- router1 --- internet
                           \-- eth1 --- router2 --- internet

When the connection from router1 to the internet goes down, I need the 
firewall to stop sending data over eth0 and commit fully to eth1. When 
that link comes back up, I need the routes restored. Same for the other 
way around.

The way I was thinking of doing this was by sending out an ICMP packet
(say, to google.com) over each interface with a TTL of 3, and if it
didn't come back, change the route.
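
A rough sketch of that TTL-limited probe, assuming the Linux iputils ping, where -t sets the TTL and -I binds to an interface. With a TTL of 3 the echo request normally expires at the third hop, so what comes back is an ICMP "Time to live exceeded" from that router rather than an echo reply; either answer is treated here as proof that the path is alive that far out. The matched strings are assumptions about ping's output in the C locale, and probe_reached and path_alive are made-up names.

```shell
#!/bin/bash
# Sketch only: output strings and option meanings assume iputils ping.

# probe_reached: read ping output on stdin; succeed if the probe was answered
# by the destination or by the router where the TTL expired.
probe_reached() {
    grep -Eq 'bytes from|Time to live exceeded'
}

# path_alive IFACE HOST [TTL]: send one TTL-limited probe out of IFACE.
path_alive() {
    LC_ALL=C ping -n -c 1 -W 2 -t "${3:-3}" -I "$1" "$2" 2>&1 | probe_reached
}
```

Something like "path_alive eth0 HOST || switch the default route to GW2" would then be the decision step, with HOST being any address known to lie beyond the third hop.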

But both the nano howto and the dead gateway detection howto seem to say
that the routes as I have them (and you put them) should be able to
handle this problem already. My problem is that it obviously doesn't. If
it did, pulling the cable out of eth0 wouldn't cause such an issue.

So I guess what I'm asking is, does anyone have any suggestions about
how to troubleshoot this problem?

Thanks so much everyone,
Seth
