I am developing a load-balancing router, but I have a question about failover.
The following diagram shows my test environment and scripts.

-------------------------------------------------------------------
Environment Setting

               PC1(192.168.10.2)
                      |
                    (LAN)
                      |
              PC2-eth2(192.168.10.1)
            +                       +
PC2-eth0(111.111.111.2)    PC2-eth1(222.222.222.2)
            |                       |
         (WAN1)                  (WAN2)
            |                       |
PC3-eth0(111.111.111.1)    PC3-eth1(222.222.222.1)
            +                       +
              PC3-eth2(172.16.0.1)

PC2: Linux kernel 2.6.21
PC2: iptables 1.3.7
-------------------------------------------------------------------
Iptables rules:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 111.111.111.2
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to 222.222.222.2

# table 101
ip route flush table 101
ip route add 192.168.10.0/24 dev eth2 table 101
ip route add default via 111.111.111.1 dev eth0 table 101

# table 102
ip route flush table 102
ip route add 192.168.10.0/24 dev eth2 table 102
ip route add default via 222.222.222.1 dev eth1 table 102

ip rule del fwmark 1 table 101
ip rule del fwmark 2 table 102
ip rule add fwmark 1 table 101
ip rule add fwmark 2 table 102

iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark
-----------------------------------------------------------------------------
Test Sequence:
1. Run "ping 172.16.0.1 -t" on PC1.
2. I capture packets on WAN1 and WAN2; it works fine. The ICMP request/response alternates between WAN1 and WAN2.
3. I unplug WAN1. Only the packets on WAN1 should be lost, but WAN2 should still work, right? I should see "ping Time Out" and "ping OK" alternating on PC1.
4. But both connections break. PC1 always shows "ping Time Out".
5. After capturing the packets on WAN1 and WAN2,
I saw weird behavior: the source IP of the packets on WAN2 is 111.111.111.2, but it should be 222.222.222.2. That is why WAN2 breaks.
-----------------------------------------------------------------------------
Could you give me a suggestion? Thanks.

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
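[Editorial note on step 5: one possible explanation of the stale source IP is that the nat table only chooses a mapping for the first packet of a tracked connection; after that, conntrack replays the same SNAT mapping for the connection's whole lifetime. If the long-running ping is tracked as a single ICMP "connection" (same ICMP id), it will keep the 111.111.111.2 source even after the mark steers it out eth1. A hedged way to test this hypothesis is to clear the tracked connections after unplugging WAN1; the `conntrack` tool from conntrack-tools is assumed here and may not be installed:]

```shell
# Flush the connection-tracking table so the next packets are re-evaluated
# by the mangle marking rules and get a fresh SNAT mapping.
# (Requires conntrack-tools; alternatively, reload the ip_conntrack module.)
conntrack -F
```

If the ping recovers on WAN2 after the flush, the problem is the remembered NAT binding rather than the routing rules.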
On 06/24/07 22:07, John Chang wrote:
> iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic
> --mode nth --every 2 --packet 1 -j MARK --set-mark 1
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic
> --mode nth --every 2 --packet 2 -j MARK --set-mark 2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

I don't think these rules are going to do what you anticipate. They alternate which route is used based on the sequential entry of packets into the router. Consider any transaction that takes more than one packet: the connection will be sent out both routes, each with a different source IP address, so the two packets are no longer associated with each other, which breaks your connection.

> 2. I capture packets on WAN1 and WAN2, it works fine.
> The ICMP request/response would come out on WAN1 and WAN2 sequentially.

(See the above comment.)

> 3. I unplug WAN1. Only the packets on WAN1 will be lost, but WAN2 should
> work, right?
> I should see "ping Time Out" and "ping OK" on PC1 alternately.

*IF* the rules do work, yes, this is what you should see.

> 4. But both connections break. It is always "ping Time Out" on PC1.

*nod*

> 5. After capturing the packets on WAN1 and WAN2, I saw weird behavior.
> The source IP of the packets on WAN2 is 111.111.111.2,
> but it should be 222.222.222.2.
> That is why WAN2 breaks.

I don't know what to say here, other than something is not working right.

> Could you give me a suggestion?
> Thanks.

Do not use this method to load balance. Look into Equal Cost Multi-Path (a.k.a. ECMP) routing and specify multiple default gateways in one route command. The kernel should load balance across the multiple default gateways for you while maintaining connections.

Grant. . . .
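[Editorial note: the multipath syntax Grant refers to looks roughly like this, using the poster's gateways; a sketch, assuming CONFIG_IP_ROUTE_MULTIPATH is enabled in the kernel:]

```shell
# One default route with two next hops. The kernel picks a nexthop per
# route lookup (so, in practice, per cached destination), and `weight`
# sets the relative share each link receives.
ip route replace default scope global \
    nexthop via 111.111.111.1 dev eth0 weight 1 \
    nexthop via 222.222.222.1 dev eth1 weight 1
```

With this in place, the per-interface SNAT rules from the original post still rewrite the source address to match whichever link a connection is routed out of.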
John Chang wrote:
> [snip: environment diagram and routing tables]
>
> iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 111.111.111.2
> iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to 222.222.222.2
>
> iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic
> --mode nth --every 2 --packet 1 -j MARK --set-mark 1
> iptables -t mangle -A PREROUTING -m state --state NEW -m statistic
> --mode nth --every 2 --packet 2 -j MARK --set-mark 2
> iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

Well ...
I am not sure about it, but you may try to do it this way:

iptables -t nat -A POSTROUTING -o ! eth2 -m mark --mark 1 -j SNAT --to 111.111.111.2
iptables -t nat -A POSTROUTING -o ! eth2 -m mark --mark 2 -j SNAT --to 222.222.222.2

iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 1 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode nth --every 2 --packet 2 -j MARK --set-mark 2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark

This is done without using iproute. There is another solution, but it works only with kernels up to 2.6.10:

iptables -t nat -A POSTROUTING -o ! eth2 -j SNAT --to 111.111.111.2,222.222.222.2

".... For those kernels, if you specify more than one source address, either via an address range or multiple --to-source options, a simple round-robin (one after another in cycle) takes place between these addresses. Later kernels (>= 2.6.11-rc1) don't have the ability to NAT to multiple ranges anymore. ..."
Grant Taylor wrote:
>> Could you give me a suggestion?
>> Thanks.
>
> Do not use this method to load balance. Look into Equal Cost Multi-Path
> (a.k.a. ECMP) routing and specify multiple default gateways in one
> route command. The kernel should load balance across the multiple
> default gateways for you while maintaining connections.

This is bad, bad advice in this day and age. If there are not enough users, route caching will kill him. Here is a recent discussion of this:

http://marc.info/?l=lartc&m=117912699505681&w=2

HTH
Peter

P.S. I am not insisting that netfilter is superior in this regard; I am simply expressing common requirements and looking into ways of achieving them. If someone can point me to how to do this with kernel routes, I am all ears, since I recognize that the netfilter solution is not very elegant, although it works.
Thanks for your advice. Currently my test scripts make both WAN connections break when I unplug one WAN connection, so I cannot implement the failover mechanism. My original idea was to mark all packets as 1 when connection WAN2 breaks, or to mark all packets as 2 when connection WAN1 breaks. But now one connection breaking makes both connections break, and I cannot identify which connection broke. It is weird. ><"
On 06/26/07 01:46, Peter Rabbitson wrote:
> This is bad, bad advice in this day and age.

I think that is a bit of a bold statement. You are free to have your opinion on what is better for you, as am I.

> If there are not enough users route caching will kill him. Here is a
> recent discussion of this:
> http://marc.info/?l=lartc&m=117912699505681&w=2

Um, I just read this discussion and I have a few issues with it.

First and foremost: it did not cover the reason "... route caching will kill ..." to my satisfaction, as you indicated it would.

Second: it relies on user-space processes to alter and maintain things. If for some reason these processes do not run, or do not run in a timely manner, things may not function correctly.

Third: you are altering the way a running kernel operates from user space, rather than letting the kernel maintain itself.

Fourth: Occam's razor dictates the use of the simpler and equally effective (the equality is debatable) method to achieve the same result.

Though the method you cite has potential, I think there is just as much room for improvement in it as there is in the method I suggested. Each method has its pros and cons.

> P.S. I am not insisting that netfilter is superior in this regard, I
> am simply expressing common requirements and looking into ways of
> achieving them. If someone can point me to how to do this with
> kernel routes - I am all ears, since I recognize that the netfilter
> solution is not very elegant, although it works.

By your own statement, you are indicating that both methods leave something to be desired.

Grant. . . .
Try this algorithm:

MANGLE:
1 - restore mark
2 - accept mark 1, accept mark 2
3 - random mark 1 or 2
4 - save mark

NAT:
5 - SNAT per interface

Att,
Patrick Brandão

----- Original Message -----
From: "Grant Taylor" <gtaylor@riverviewtech.net>
To: "Mail List - Linux Advanced Routing and Traffic Control" <lartc@mailman.ds9a.nl>
Sent: Tuesday, June 26, 2007 11:37 AM
Subject: Re: [LARTC] Load Balance and SNAT problem.
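[Editorial note: Patrick's five steps might be rendered in iptables roughly as follows. This is a sketch: `-m statistic --mode random` stands in for his "random" step (it may not be available on iptables 1.3.7), and RETURN implements "accept" of an already-marked packet.]

```shell
# MANGLE: restore any existing connmark, keep it if present,
# otherwise pick a link at random and remember the choice.
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark        # 1 - restore mark
iptables -t mangle -A PREROUTING -m mark --mark 1 -j RETURN        # 2 - accept mark 1
iptables -t mangle -A PREROUTING -m mark --mark 2 -j RETURN        #     accept mark 2
iptables -t mangle -A PREROUTING -m statistic --mode random \
    --probability 0.5 -j MARK --set-mark 1                         # 3 - random mark 1 or 2
iptables -t mangle -A PREROUTING -m mark ! --mark 1 -j MARK --set-mark 2
iptables -t mangle -A POSTROUTING -j CONNMARK --save-mark          # 4 - save mark

# NAT: SNAT per outgoing interface, as in the original setup.
iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 111.111.111.2  # 5 - SNAT per interface
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to 222.222.222.2
```

The RETURN rules mean only fresh, unmarked packets reach the random step, so an established connection keeps whichever mark (and link) it was first given.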
Grant Taylor wrote:
> First and foremost: it did not cover the reason "... route caching will
> kill ..." to my satisfaction, as you indicated it would.

Can you elaborate on this? My only issue with the kernel route balancing is that route caching cannot be disabled entirely, so traffic to the same site will leave via the same channel, regardless of whether the other channel is empty or not. I know that it is technically possible (the kernel option CONFIG_IP_ROUTE_MULTIPATH_RANDOM), but it will work only for globally routable addresses, while breaking NAT badly.

The reason I made my bold, as you call it, statement is that 90% of the time when someone is doing NAT, it is for a tightly joined group with similar interests - hence a lot of traffic duplication. For instance, if every user listens to the same online radio station - how would you work around it?

Let me know your thoughts

Peter
Hi,

I have load balancing working on a Linux server, balancing between two providers with, obviously, two different IPs (the customer is not an Autonomous System). It works very well, except with some sites that establish a session and then redirect the session to another server. These sessions are usually based on information like cookies and the client IP address, and therefore you must reach the destination with the same IP address (that's why the routing cache is there). But when the "session" is redirected to another destination server - another destination IP - sometimes the connection goes through the other link, and so arrives at the destination with another source IP, and then the session becomes invalid. I can't see anything Linux (or anything else) could do to deal with this, since it's a new destination IP.

Does anyone know something that could solve this kind of problem?

:: Sorry for the bad English.

--
André Guimarães
Databras Informática
Matriz RJ - 55 (21) 2518-2363
Filial ES - 55 (27) 3233-0098
http://www.databras.com.br
On 6/26/2007 12:44 PM, Peter Rabbitson wrote:
> Can you elaborate on this? My only issue with the kernel route
> balancing is that route caching can not be disabled entirely, so
> traffic to the same site will leave via the same channel, regardless
> if the other channel is empty or not. I know that it is technically
> possible (kernel option CONFIG_IP_ROUTE_MULTIPATH_RANDOM), but it
> will work only for globally routable addresses, while breaking NAT
> badly.

This is a very good point that was not made in the referenced message, and I have no rebuttal to it. It is the type of point I was hoping to see before but did not. My response is that you have a good point - something that, in my opinion, should be addressed by the kernel at some point.

> The reason I made my bold, as you call it, statement, is because 90%
> of the time when someone is doing NAT, it is for a tightly joined
> group, with similar interests - hence a lot of traffic duplication.
> For instance if every user listens to the same online radiostation -
> how would you work around it?

I don't know whether the 90% you cite is accurate or not, but if you are even remotely in the ballpark, you have a good point. I have been around environments with nearly 1000 computers and very little similarity between the people using them. I think this really depends on where NAT is used and how. If you are talking about many-to-one NAT, I would agree with you; if you are talking about many-to-many NAT, I'll disagree. The scenarios you are thinking of would best be described as small office / home office (a.k.a. SOHO), which definitely qualifies under what you are saying. However, there are a LOT of uses of NAT outside of SOHOs. That said, given the prevalence of SOHOs doing NAT, I am willing to bet that you are correct.
But this is why there are different types of solutions to this problem for them.

> Let me know your thoughts

With regard to streaming radio, I personally believe it should be multicast, so that it can be streamed in once and have multiple recipients hear it. Or there should be some sort of proxy that downloads it and passes it back to multiple clients. Of course, this is beyond the scope of this discussion and applies to larger environments, outside of the SOHOs that I think you are referring to.

Grant. . . .
On 6/26/2007 3:01 PM, Andre Guimarães wrote:
> Does anyone know something that could solve this kind of problem?

I would like to see some control over how the cache matches, i.e. a netmask for the destination IP - something like caching for matches on a /24 or the like.

Grant. . . .
(Sorry, I'm not sure, but the answer does impact this discussion.)

On 6/26/2007 12:44 PM, Peter Rabbitson wrote:
> so traffic to the same site will leave via the same channel,
> regardless if the other channel is empty or not.

Is the caching per route or per source IP? I'm guessing that it is per route decision, such that any and all clients will use the same cached route, thus not using the additional interfaces. Or is this a clear and concise reason why load balancing via netfilter would be a better approach?

Grant. . . .
On 6/26/2007 9:03 PM, Mohan Sundaram wrote:
> The caching would be per destination IP - so it is likely all clients
> will use the same route and thus interface.

This could be a problem. I was taking the caching to be remembering which route was chosen, and believing it to be associated with a specific source IP address. I can see this being a very large issue when trying to do load balancing.

In light of this information, I think better could be done in netfilter. However, if there ever were a way to have route selection per source IP in the kernel, I would be more interested in that.

I wonder if route selection caching would differ between routing tables - in other words, use a different routing table for a different (set of) clients, giving one cached routing decision per routing table, which could differ per table.

Grant. . . .
The caching is per destination and source IP - and per TOS, fwmark, and input interface too, if present. Routing with netfilter does not solve the cache problem anyway: the cache will still be present, and it will be consulted before the routing tables are hit. In my opinion, routing in netfilter gives more flexibility in dynamically choosing weights and such, while multipath routing gives a bit more IP persistence. Both solutions work pretty well; there are die-hard fans of both approaches, and recent LARTC archives have a lot of discussion on it.
On 6/26/2007 9:14 PM, Mohan Sundaram wrote:
> I remember that route balancing has an option to perform per packet
> balancing and not per connection. If that were to work, then route
> cache would not be used IMHO.

Interesting. Do you have any idea where I can get more information about this?

> Per packet balancing is normally not done as it would break
> connections, especially in a NAT'ted scenario.

Keep in mind that NATing is not the only place load balancing is used. I call to mind my recent thread "Redundant internet connections" (http://mailman.ds9a.nl/pipermail/lartc/2007q2/021015.html), where I had globally routable IP addresses inside the DMZ. I could have used per-packet load balancing without a problem, except for the fact that I specifically wanted not to use the backup connection unless the primary was down.

Grant. . . .
On 6/26/2007 9:22 PM, Salim S I wrote:
> The caching is per destination and source ip. TOS, fwmark and input
> interface too, if present.

Is the caching done on the combination of source and destination, or on source alone, or on destination alone? If it is the former, then as long as the source IP is different, you could potentially have different cached route choices for different workstations within a company.

Grant. . . .
Well, this is the relevant code in my kernel (2.4.27):

	for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) {
		if (rth->key.dst == key->dst &&
		    rth->key.src == key->src &&
		    rth->key.iif == 0 &&
		    rth->key.oif == key->oif &&
#ifdef CONFIG_IP_ROUTE_FWMARK
		    rth->key.fwmark == key->fwmark &&
#endif
		    !((rth->key.tos ^ key->tos) &
		      (IPTOS_RT_MASK | RTO_ONLINK)))
On 6/26/2007 10:07 PM, Salim S I wrote:
> for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) {
> 	if (rth->key.dst == key->dst &&
> 	    rth->key.src == key->src &&
> 	    rth->key.iif == 0 &&
> 	    rth->key.oif == key->oif &&
> #ifdef CONFIG_IP_ROUTE_FWMARK
> 	    rth->key.fwmark == key->fwmark &&
> #endif
> 	    !((rth->key.tos ^ key->tos) &
> 	      (IPTOS_RT_MASK | RTO_ONLINK)))

I'm no C programmer, but it looks like the source, destination, in interface, and out interface are all part of the conditional, leading us to believe that caching might be per combination of all of the above.

Grant. . . .
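[Editorial note: the comparison above can be modeled as a lookup keyed on the whole tuple. The small Python sketch below (field names mirror the kernel's rt_key; the round-robin nexthop chooser is purely illustrative) shows the consequence Grant is asking about: two clients with different source IPs get independent cache entries, so they can be balanced to different links, while a repeat lookup for the same (dst, src) pair reuses its cached choice.]

```python
# Model of the 2.4 route-cache lookup: an entry matches only if every
# field of the key tuple (dst, src, iif, oif, fwmark, tos) matches.
import itertools
from collections import namedtuple

RtKey = namedtuple("RtKey", "dst src iif oif fwmark tos")

cache = {}  # RtKey -> chosen next hop

def route_lookup(key, choose_nexthop):
    """Return the cached next hop for this exact key, computing and caching one on a miss."""
    if key not in cache:
        cache[key] = choose_nexthop(key)
    return cache[key]

# Illustrative balancer: alternate links on each cache miss.
links = itertools.cycle(["eth0", "eth1"])
pick = lambda key: next(links)

# Two workstations talk to the same destination: different src fields
# mean different cache entries, so each lands on a different link.
a = route_lookup(RtKey("172.16.0.1", "192.168.10.2", 0, 0, 0, 0), pick)
b = route_lookup(RtKey("172.16.0.1", "192.168.10.3", 0, 0, 0, 0), pick)
c = route_lookup(RtKey("172.16.0.1", "192.168.10.2", 0, 0, 0, 0), pick)

print(a, b, c)  # the repeated (dst, src) pair reuses its cached entry
```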
Salim S I wrote:
> The caching is per destination and source ip. TOS, fwmark and input
> interface too, if present.

Interesting... It definitely did not work in my scenario, though. I am going to test this again in the near future, and if you are right I will rest my case.

> Routing with netfilter does not solve cache problems anyway, cache will
> still be present, and it will be consulted before routing tables are
> hit.

This is true for locally generated traffic only. Any incoming/forwarded traffic can be controlled in PREROUTING, so the cache is never consulted.

> Both solutions work pretty well; there are die-hard fans for both of the
> above approaches. Recent archives of lartc have lot of discussions on
> it.

I am actually simply jealous that some people apparently get it to work in-kernel, and I can't seem to. My requirements are pretty simple:

o As transparent as possible DGD (dead gateway detection), able to detect 2nd- and 3rd-hop failures
o Robust load balancing - connections are distributed over all available links, regardless of source and destination, with the possibility of assigning relative channel priorities
o NAT compatible - link hopping is not an option; traffic with a specific SRC/DST must stay where it started
> This is true for locally generated traffic only. Any incoming/forwarded
> traffic can be controlled in PREROUTING, so the cache is never
> consulted.

The cache will still be consulted, in ip_route_input - that path handles input and forwarded traffic. Only if there is no matching entry will the routing tables be employed. If you look in the cache, you can see routes cached for the same destination through both WAN interfaces (well, in my case I can see them...), but their fwmarks are different, as is evident from ip_conntrack.
On 6/27/2007 12:54 AM, Peter Rabbitson wrote:
> I am actually simply jealous that some people apparently get it to
> work in-kernel, and I can't seem to.

Ah, so the truth comes out. ;)

> My requirements are pretty simple:
> o As transparent as possible DGD, able to detect 2nd and 3rd hop
> failures

Think about what you just asked for. "Dead Gateway Detection" is meant to detect dead (upstream) (default) gateway(s); it is not meant to detect dead routes beyond your gateway(s). To do the latter you will need some sort of utility to monitor things for you - i.e. you will not be able to get the kernel to detect that a gateway is good for some things but not for others. Actually, if you stop to think about it, this is beyond the scope of what the kernel should do; it is more the scope of a routing protocol and / or a route management daemon.

In short, use something to test reachability to destinations, and use ip rules to choose routing tables accordingly - i.e. have a default routing table that tries to use any / all interfaces and routes, and alternative routing tables that try fewer interfaces / routes.

> o Robust load balancing - connections are distributed over all
> available links, regardless of source and destination, with the
> possibility of assigning relative channel priorities

I think this is close to being possible, depending on your scenario (NAT or not) and a few other things. It was my understanding that equal cost multi path routing was supposed to accomplish this very thing - i.e. if you had globally routable IP addresses behind the router, you could send traffic out either link, hopefully in such a fashion as to fully utilize all links. ECMP does include weight options to assign ratios to routes.
However, after the discussion in this thread, I question whether ECMP will actually do this or not.

> o NAT compatible - link hopping is not an option, traffic with a
> specific SRC/DST must stay where it started.

I think this is the simpler requirement - simpler than the "robust load balancing" above. In my opinion, this should be achieved first, and then extended toward the above. What you have proposed with load balancing via netfilter should be able to achieve it without any problems - or at least I would think so.

Grant. . . .
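[Editorial note: Grant's "test reachability and switch ip rules" suggestion could be scripted roughly as below. This is a hedged sketch reusing the poster's tables 101/102 and fwmarks 1/2; the probe hosts are placeholders, and a real setup would probe addresses beyond the first hop to catch 2nd/3rd-hop failures.]

```shell
#!/bin/sh
# Probe a host through each WAN interface. If the probe fails, delete
# the ip rule that steers marked traffic into that link's routing
# table, so only the surviving rule (and table) is consulted.
check_link() {
    mark=$1; table=$2; iface=$3; probe=$4
    if ping -c 2 -W 2 -I "$iface" "$probe" >/dev/null 2>&1; then
        # Link looks alive: make sure its rule exists.
        ip rule list | grep -q "fwmark .*lookup $table" || \
            ip rule add fwmark "$mark" table "$table"
    else
        # Link looks dead: stop steering connections into it.
        ip rule del fwmark "$mark" table "$table" 2>/dev/null
    fi
}

# Run from cron or a loop; probe hosts here are the test gateways.
check_link 1 101 eth0 111.111.111.1
check_link 2 102 eth1 222.222.222.1
```

This is exactly the user-space maintenance Grant's second and third objections describe: if the script stops running, the rules go stale.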
Grant Taylor wrote:
> On 6/27/2007 12:54 AM, Peter Rabbitson wrote:
>> I am actually simply jealous that some people apparently get it to
>> work in-kernel, and I can't seem to.
>
> Ah, so the truth comes out. ;)

Hehe

> In short, use something to test reachability to destinations, and use
> ip rules to choose routing tables accordingly.

This is the most fragile part of my current setup. And DGD based on packet counts is, IMO, an extremely simple thing to do - I discussed it with you recently. If something like this were present in-kernel, the world would be a better place.

> It was my understanding that equal cost multi path routing was supposed
> to accomplish this very thing.
> if you had globally routable IP addresses behind the router, you could
> send traffic out either link, hopefully in such a fashion as to fully
> utilize all links. ECMP does include weight options to assign ratios
> to routes.

For globally routable addresses it doesn't really matter, because you usually cannot detect the problem (things still work).

> What you have proposed with load balancing via netfilter should be able
> to achieve this without any problems. Or at least I would think so.

It has actually worked in production for quite some time now. But as I said before, it is ugly and fragile.

I understand that we are coming from different environments, but I still think my figure of 90% is rather accurate. If you can afford not to do NAT, most likely you also have access to the ISP's dynamic routing protocols, and the entire discussion becomes pointless. On the contrary, if you run NAT, most likely you are a poor-man's ISP or smaller, running off two consumer DSL links, and all of the above applies. Either way, I rest my case here, as we are comparing apples to dinosaurs and have gone too far OT :)

Peter
On 6/27/2007 1:58 AM, Peter Rabbitson wrote:
> And DGD based on packet counts is, IMO, an extremely simple thing to
> do - I discussed it with you recently.

(If I recall correctly, and / or have re-read the appropriate thread correctly:) what you were talking about was pinging (of sorts - be it ICMP, testing connections, sending layer-7 traffic, etc.) destinations beyond your upstream gateway. Correct?

> If something like this were present in-kernel, the world would be a
> better place.

I agree that if the kernel could handle this, the world would be a better place. However, I think it silly to expect the kernel to do it.

Well, let me take a moment to be sure we are thinking the same thing. You want the kernel to be able to realize that one route through a given default gateway is no good for a given destination, and to use a different default gateway, even though the kernel can reach other destinations through the first one? In other words, if the kernel cannot reach microsoft.com through ISP1, it should use ISP2, despite the fact that it can reach google.com through ISP1?

Grant. . . .
On 6/26/2007 9:14 PM, Mohan Sundaram wrote:
> I remember that route balancing has an option to perform per packet
> balancing and not per connection. If that were to work, then route
> cache would not be used IMHO. Per packet balancing is normally not
> done as it would break connections, especially in a NAT'ted scenario.

To quote the man page for ip, it looks like the balancing is not per packet as you indicate, but rather per flow:

"""equalize - allow packet by packet randomization on multipath routes. Without this modifier, the route will be frozen to one selected nexthop, so that load splitting will only occur on per-flow base. equalize only works if the kernel is patched."""

Grant. . . .
On 6/27/2007 2:50 AM, Mohan Sundaram wrote:
> """equalize - allow packet by packet randomization on multipath
> routes. Without this modifier, the route will be frozen to one
> selected nexthop, so that load splitting will only occur on per-flow
> base. equalize only works if the kernel is patched."""

I think we both pasted the same quote. If you do use the "equalize" keyword, you get a packet by packet / per-packet effect. Whereas if you do not use the "equalize" keyword, you get a per-flow effect, which is what I was trying to state is the apparent default.

Grant. . . .
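To make the two behaviors concrete, a multipath default route with and without the keyword might look like this (a sketch only; gateways are borrowed from the thread's example, and per the man page quoted above, `equalize` only works on a patched kernel):

```shell
# Per-flow splitting (stock kernel): each new connection is frozen to
# one selected nexthop, so individual flows never see reordering.
ip route replace default \
    nexthop via 111.111.111.1 dev eth0 \
    nexthop via 222.222.222.1 dev eth1

# Per-packet randomization (requires the equalize kernel patch):
# successive packets of the same flow may leave via different links.
ip route replace default equalize \
    nexthop via 111.111.111.1 dev eth0 \
    nexthop via 222.222.222.1 dev eth1
```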
On 6/27/2007 2:53 AM, Mohan Sundaram wrote:
> Pardon my earlier mail.

*nod* Pardon my reply. ;)

> This says if equalize patch/keyword is used, packet randomisation
> happens. Exactly what we want, is it not?

(Referring back to your earlier message...) Yes, I think this is what we want in this scenario.

Grant. . . .
Grant Taylor wrote:
> Well let me take a moment to be sure we are thinking the same thing.
> You want the kernel to be able to realize that one route through a
> given default gateway is no good for a given destination and use a
> different default gateway even though the kernel can reach other
> destinations through the first default gateway? In other words, if
> the kernel can not reach microsoft.com through ISP1 it should use
> ISP2 despite the fact that it can reach google.com through ISP1?

No, nothing like this. Basically my idea is that a no-packet-seen timer is maintained for every gateway, excluding any packets with a source within the ISP's netblock. This will reliably detect that no traffic is seen beyond the ISP, and therefore pronounce the gateway dead. The only configuration required from the administrator would be an address/netmask pair for every gateway, to use as an exclusion for the counters, and a no-packets-seen timeout before a gateway is marked as dead. Any incoming activity on the gateway will immediately change its status back to active.

So to answer your exact question - I want the kernel to be able to realize that a gateway is no good for any destinations other than the specified netblock.

Peter
On 6/27/2007 2:59 AM, Mohan Sundaram wrote:
> I think that default makes sense. If we want pkt based balancing, we
> enable it explicitly.

Agreed. We / people just have to be aware that that is what it does, so that they don't have false expectations. Of course, this is a fairly common problem in unix.

Grant. . . .
On 6/27/2007 3:03 AM, Peter Rabbitson wrote:
> I want the kernel to be able to realize that a gateway is no good for
> any destinations other than the specified netblock.

Would it be fair to say that you are wanting an administratively configurable "ignore addresses that fall within this <network>" while deciding if a gateway is dead? Obviously <network> would need to be a bit more than just an ip / netmask combination to make this realistic.

If this is what you are wanting, it may be possible to augment the kernel code that is used to detect dead gateways and have it check to see if the networks match a list (from somewhere in proc / sysfs / sysctl?) and not increment traffic counters. I am presuming that it is the traffic counters that have to be incremented for the kernel to think that a route is still alive. So, if you purposefully did not increment the counters, you could probably detect that a given gateway is no good.

I think you would have to add an additional route to the given network(s) that did not use such a feature, to provide a way for the routing code to route to those network(s) that it would no longer get to via a default gateway.

What do you think?

Grant. . . .
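Pending any such kernel support, the counter idea can be approximated today from userspace with a plain iptables counting chain. A sketch under stated assumptions: the chain name GW1_SEEN is made up, and 111.111.111.0/24 stands in for ISP1's netblock from the thread's example.

```shell
# Count only inbound packets on eth0 whose source lies OUTSIDE the
# ISP's own netblock; such packets prove the link works beyond the ISP.
iptables -N GW1_SEEN
iptables -A GW1_SEEN -j RETURN
iptables -I INPUT -i eth0 ! -s 111.111.111.0/24 -j GW1_SEEN

# A userspace watchdog would poll the chain's packet counter, e.g.:
#   iptables -L GW1_SEEN -v -x -n
# and declare the gateway dead once the counter stops advancing for the
# configured timeout.
```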
On 6/27/2007 3:22 AM, Mohan Sundaram wrote:
> *A word of caution*. My connections went awry more due to out of
> order delivery of packets, and I had a hell of a time troubleshooting
> it as the problem did not appear consistently. :-( Did not know where
> in the whole chain I had the problem. It is like the MTU problem in
> PPTP.

*nod* This is a warning that you see a LOT of places when you start talking about per packet versus per flow load balancing. Cisco is VERY big on giving this warning. Despite being aware of this problem, I have yet (knock on wood) to run into this problem myself.

Grant. . . .
On 6/27/2007 3:24 AM, Grant Taylor wrote:
> This is a warning that you see a LOT of places when you start talking
> about per packet versus per flow load balancing. Cisco is VERY big on
> giving this warning.

I wonder how much of the packet out-of-order problem would happen with two parallel links, versus two asymmetric routes through the internet core where one packet takes a 27 hop route while the other takes a 37 hop route.

Grant. . . .
Grant Taylor wrote:
> On 6/27/2007 3:03 AM, Peter Rabbitson wrote:
>> I want the kernel to be able to realize that a gateway is no good for
>> any destinations other than the specified netblock.
>
> Would it be fair to say that you are wanting an administratively
> configurable "ignore addresses that fall within this <network>" while
> deciding if a gateway is dead?
>
> Obviously <network> would need to be a bit more than just an ip /
> netmask combination to make this realistic.
>
> If this is what you are wanting, it may be possible to augment the
> kernel code that is used to detect dead gateways and have it check to
> see if the networks match a list (from somewhere in proc / sysfs /
> sysctl?) and not increment traffic counters. I am presuming that it
> is the traffic counters that have to be incremented for the kernel to
> think that a route is still alive. So, if you purposefully did not
> increment the counters, you could probably detect that a given
> gateway is no good.

Something along these lines, yes. Except that instead of a packet counter there is a resettable timer, which gets reset any time a matching packet comes in. When the timer goes over a specified limit - the gateway is dead.

> I think you would have to add an additional route that was to the
> given network(s) that did not use such a feature to provide a way for
> the routing code to route to those network(s) that it no longer would
> get to via a default gateway.

This would be a manual task for the administrator; there is no place for this in-kernel.
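The resettable-timer logic being described is simple enough to sketch in a few lines of shell. This is a pure simulation, not a real tool: the loop feeds made-up cumulative packet counters (one sample per INTERVAL seconds) to the timer, marking the gateway dead after TIMEOUT seconds of silence and reviving it on the next counted packet.

```shell
#!/bin/sh
# Simulation of a resettable "no-packet-seen" timer. Each value in the
# list stands for a cumulative packet counter sampled once per INTERVAL;
# an unchanged counter means no qualifying packets arrived that interval.
TIMEOUT=15
INTERVAL=5
state=alive
idle=0
last=0
for count in 10 25 25 25 25 40; do
    if [ "$count" -eq "$last" ]; then
        # No new packets seen: advance the idle timer.
        idle=$((idle + INTERVAL))
        if [ "$idle" -ge "$TIMEOUT" ] && [ "$state" = alive ]; then
            state=dead
            echo "gateway DEAD after ${idle}s of silence"
        fi
    else
        # Any counted packet resets the timer and revives the gateway.
        idle=0
        if [ "$state" = dead ]; then
            state=alive
            echo "gateway ALIVE again"
        fi
    fi
    last=$count
done
```

Run as-is it prints the DEAD transition after the third silent interval and the ALIVE transition as soon as the counter advances again.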
On 6/27/2007 4:09 AM, Peter Rabbitson wrote:
> Something along these lines, yes. Except that instead of a
> packet-counter there is a resettable timer, that gets reset anytime a
> matching packet comes in. When the timer goes over a specified limit
> - gateway is dead.

I think this is usually called / treated as a "dead (counter) timer", as in the timer counts down and as soon as it hits zero, the item is considered dead. Any time something passes through and refreshes it, the full time to live is placed back in the dead timer.

> This would be a manual task for the administrator, there is no place
> for this in-kernel.

Agreed. I will state that I think you are asking for a bit much, but you are free to ask for whatever you want, or whatever you are willing to code yourself. ;)

Grant. . . .