Hi, I have searched the archives on the topic, and it seems that the list gurus favor load balancing to be done in the kernel as opposed to other means. I have been using a home-grown approach, which splits traffic based on `-m statistic --mode random --probability X`, then CONNMARKs the individual connections and the kernel happily routes them. I understand that for > 2 links it will become impractical to calculate a correct X. But if we only have 2 gateways to the internet - are there any advantages in letting the kernel multipath scheduler do the balancing (with all the downsides of route caching), as opposed to the pure random approach described above? Thanks Peter
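For two links the split boils down to three mangle rules. A minimal sketch, assuming made-up ISP1/ISP2 chains (which would set the CONNMARK) and X=0.5; the rules are collected as strings here rather than applied:

```shell
#!/bin/bash
# Sketch only: the chain names ISP1/ISP2, the value X=0.5 and the rule
# ordering are assumptions, not Peter's actual script. The rules are
# gathered into an array and printed, not applied.
X=0.5
rules=(
  # with probability X a NEW connection is handed to the ISP1 chain...
  "iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $X -j ISP1"
  # ...otherwise it falls through to ISP2
  "iptables -t mangle -A PREROUTING -m state --state NEW -j ISP2"
  # later packets of an established connection get their mark restored,
  # so the kernel keeps routing them over the link chosen at setup time
  "iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark"
)
printf '%s\n' "${rules[@]}"
```

The CONNMARK restore at the end is what makes the random choice sticky per connection rather than per packet.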
I have thought about this approach, but, I think, this approach does not handle failover/dead-gateway-detection well, because you need to alter all your netfilter routing rules if you find a link down, and then reconfigure again when the link comes up. I am interested to know how you handle that.

-----Original Message-----
From: lartc-bounces@mailman.ds9a.nl [mailto:lartc-bounces@mailman.ds9a.nl] On Behalf Of Peter Rabbitson
Sent: Monday, May 14, 2007 1:57 PM
To: lartc@mailman.ds9a.nl
Subject: [LARTC] Multihome load balancing - kernel vs netfilter

> But if we only have 2 gateways to the internet - are there
> any advantages in letting the kernel multipath scheduler do the
> balancing (with all the downsides of route caching), as opposed to the
> pure random approach described above?

_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
Salim S I wrote:
> I have thought about this approach, but, I think, this approach does not
> handle failover/dead-gateway-detection well. Because you need to alter
> all your netfilter routing rules if you find a link down. And then
> reconfigure again when the link comes up. I am interested to know how
> you handle that.

Certainly. What I am doing is NATing a large company network, which gets load balanced and receives failover protection. I also have a number of services running on the router which must not be balanced nor failed over, as they are expected to respond on a specific IP only. All remaining traffic on the server itself is not balanced but fails over when the designated primary link goes down.

I start with a simple pinger app that pings several well-known remote sites once a minute using a large ICMP packet (1k of payload). The rtt times are averaged out and are used to calculate the current "quality" of the link (the large packet makes congestion a visible factor).
If one of the interface responses is 0 (meaning not a single one of the pinged hosts has responded), the link is dead.

In iproute I have two separate tables, each using one of the links as default gw, matching a certain mark. The default route is set to a single gateway (not a multipath), either by hardcoding, or by using the first input of the pinger (it can run without a default gw set, explanation follows).

In iptables I have two user-defined chains:

iptables -t mangle -A ISP1 -j CONNMARK --set-mark 11
iptables -t mangle -A ISP1 -j MARK --set-mark 11
iptables -t mangle -A ISP1 -j ACCEPT

iptables -t mangle -A ISP2 -j CONNMARK --set-mark 12
iptables -t mangle -A ISP2 -j MARK --set-mark 12
iptables -t mangle -A ISP2 -j ACCEPT

The rules that reference those chains are:

For all locally originating traffic:
iptables -t mangle -A OUTPUT -o $I1 -j ISP1
iptables -t mangle -A OUTPUT -o $I2 -j ISP2

For all incoming traffic from the internet:
iptables -t mangle -A PREROUTING -i $I1 -m state --state NEW -j ISP1
iptables -t mangle -A PREROUTING -i $I2 -m state --state NEW -j ISP2

For all other traffic (nat):
iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $X -j ISP1
iptables -t mangle -A PREROUTING -j ISP2

At the end of the PREROUTING chain I have:
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

The NATing is trivially solved by:
iptables -t nat -A POSTROUTING -s 10.0.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.58.0/24 -j SOURCE_NAT
iptables -t nat -A POSTROUTING -s 192.168.8.0/24 -j SOURCE_NAT

iptables -t nat -A SOURCE_NAT -o $I1 -j SNAT --to $I1_IP
iptables -t nat -A SOURCE_NAT -o $I2 -j SNAT --to $I2_IP

What does this achieve:

* Local applications that have explicitly requested a specific IP to bind to will be routed over the corresponding interface and will stay that way. Only applications binding to 0.0.0.0 will be routed by consulting the default route.
* Responses to connections from the internet are guaranteed to leave from the same interface they came in on.
* All new connections not coming from the external interfaces are load balanced by the weight of $X, and are again guaranteed to stay there for the life of the connection, but another connection to the same host is not guaranteed to go over the same link. This is important in a company environment, since most employees use the same online resources.

On every run of the pinger I do the following:
* If both gateways are alive I replace the -m statistic rule, adjusting the value of $X.
* If one is detected dead, I adjust the probability accordingly (or alternatively remove the statistic match altogether), and change the default gateway if it is the one that failed.

So really the whole exercise revolves around changing a single rule (or two rules, if you want to control the probability in a more fine-grained way).

Last but not least, this setup allowed me to program exception tables for certain IP blocks. For instance Yahoo has a braindead two-tier authentication system for commercial solutions. It remembers the IP which you used to log in with first, and it must match the IP used to log in to a more secure area (using another password). Or users from within the LAN might want to use one of the ISPs' SMTP servers, which keeps a close eye on who is talking to it. So I have a $PREFERRED which is adjusted to either ISP1 or ISP2, depending on the current state of affairs, and rules like:

iptables -t mangle -A PREROUTING -d 66.218.64.0/19 -m state --state NEW -j $PREFERRED
iptables -t mangle -A PREROUTING -d 68.142.192.0/18 -m state --state NEW -j $PREFERRED

This pretty much sums it up. The only downside I can think of is that loss of service can be observed between two runs of the pinger. Let me know if I missed something, be it critical or minor.

Thanks

Peter
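The iproute side described above only in prose (one table per link, selected by fwmark) and the per-run rule swap might look roughly like this. Table numbers, fwmarks, interface names and gateway addresses are all illustrative, not taken from the real configuration, and the commands are echoed rather than executed:

```shell
#!/bin/bash
# Dry-run sketch of the setup described in prose above. Table numbers
# (101/102), fwmarks (11/12), interfaces and gateways are placeholders.
run() { echo "$@"; }    # swap the body for: "$@"  to actually apply

I1=eth1 GW1=198.51.100.1
I2=eth2 GW2=203.0.113.1

# One routing table per link, selected by the fwmark that the ISP1/ISP2
# mangle chains set on the connection.
setup_tables() {
  run ip route add default via "$GW1" dev "$I1" table 101
  run ip rule add fwmark 11 table 101
  run ip route add default via "$GW2" dev "$I2" table 102
  run ip rule add fwmark 12 table 102
}

# Called once per pinger run: X is ISP1's share of NEW connections,
# 0 meaning ISP1 is dead. Only one rule (position 1 of mangle/PREROUTING
# here, an assumption) ever has to change.
update_balance() {
  local X=$1
  if [ "$X" != "0" ]; then
    run iptables -t mangle -R PREROUTING 1 -m state --state NEW \
        -m statistic --mode random --probability "$X" -j ISP1
  else
    run iptables -t mangle -R PREROUTING 1 -m state --state NEW -j ISP2
    run ip route replace default via "$GW2" dev "$I2"
  fi
}

setup_tables
update_balance 0.65   # both links up, 65% of NEW connections to ISP1
update_balance 0      # ISP1 went dead: everything to ISP2
```

Using `iptables -R` to replace the rule in place keeps the window during which no statistic rule exists to a single atomic swap.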
> iptables -t mangle -A PREROUTING -j ISP2

Doesn't it need to check for state NEW? Or packets will not reach the restore-mark rule.

You may have to manually populate the routing tables when an interface comes up, after being down for some time. (The kernel would have removed the routing entries for this interface after it found the interface down. This happens only if its nexthop is down.)

I tend to favor this approach, because it is more flexible in selecting the interface. You can use different weights/probabilities depending on different factors. I have seen a variation of this method used with the 'recent' (-m recent) match instead of CONNMARK.

The only downside in using this method, as far as I can see, is the need to reconfigure rules and routing tables in case of a failure/coming-up. But lately, I have found that even with the multipath method, there IS a need for reconfiguration.
Answers inlined:

Salim S I wrote:
> iptables -t mangle -A PREROUTING -j ISP2
>
> Doesn't it need to check for state NEW? Or packets will not reach the
> restore-mark rule.

Of course, and the real script does check. I typed this line manually because the copy cut it, and missed the obvious check.

> You may have to manually populate the routing tables when an interface
> comes up, after being down for some time. (Kernel would have removed the
> routing entries for this interface after it found the interface down.
> This happens only if its nexthop is down)

This is what I can't really understand (and it applies to DGD as well) - how often in real life does someone yank a cable out, so an interface will go down? In over 7 years of dealing with various ISPs I have never seen the link go so dead that the kernel will down the interface and remove all associated routing information. What I have seen, on the other hand, is the link dying at the 2nd or 3rd hop, which (if I understand correctly) DGD simply can not detect. Correct me if my assumption is wrong.

> I tend to favor this approach, because it is more flexible in selecting
> the interface. You can use different weights/probability depending on
> different factors. I have seen a variation of this method, used with
> 'recent' (-m recent) match, instead of CONNMARK.

I see. But recent would have a "caching effect", and from what I understand is heavier on the kernel, unlike CONNMARK, which hooks into conntrack, which in turn has to track connections either way.

> The only downside in using this method, as far as I can see, is the need
> to reconfigure rules and routing tables, in case of a failure/coming-up.
> But lately, I have found that even with multipath method, there IS a
> need for reconfiguration.

Got you. This pretty much answers my original question.
Thank you for your time.
On Monday 14 May 2007 02:57, Peter Rabbitson wrote:
> Hi,
> I have searched the archives on the topic, and it seems that the list
> gurus favor load balancing to be done in the kernel as opposed to other
> means.

AFAIR there aren't conflicting opinions, there are just two different approaches, and I believe the routing solution is more used because it was the first and because it sounds logical to implement multipath with your routing tool. But iptables has become a routing tool by now (and much more). Personally I'm using multipath, but I do not dislike the iptables approach.

> I have been using a home-grown approach, which splits traffic
> based on `-m statistic --mode random --probability X`, then CONNMARKs
> the individual connections and the kernel happily routes them. I
> understand that for > 2 links it will become impractical to calculate a
> correct X.

Well, it is not impractical with a little scripting in your firewall...

#!/bin/bash
# your uplinks weight as in kernel multipath
# ie:    link1 link2 link3 link4 link5
weight="   1     2     1     3     5  "

weight_total=0
for n in $weight ; do
    let weight_total=weight_total+n
done

for n in $weight ; do
    probability=$((n*100/weight_total))
    echo iptables .. -m statistic --mode random --probability $probability
done

But the problem arises when you have, let's say, 101 links, because mode random takes a 2-digit number, right? But this can be changed in the code (use the source...)

> But if we only have 2 gateways to the internet - are there
> any advantages in letting the kernel multipath scheduler do the
> balancing (with all the downsides of route caching), as opposed to the
> pure random approach described above?

Well, the disadvantage I see is that you have to move all your routing rules to iptables space, but in the end you always need the routing table, so it is a matter of changing old habits...

-- 
Luciano
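One wrinkle in the script above is worth noting: the -m statistic rules are evaluated in sequence, so each rule only sees the connections that earlier rules did not claim. To honour the weights, rule i should use the conditional probability weight_i / remaining_weight rather than weight_i / total (and --probability takes a fraction between 0 and 1, not a percentage). A possible corrected sketch, with made-up ISP chain names; rules are collected and printed, not applied:

```shell
#!/bin/bash
# Cascading probabilities for N weighted links. The ISP1..ISPn chain
# names and the example weights are illustrative.
weights=(1 2 1 3 5)

total=0
for w in "${weights[@]}"; do total=$((total + w)); done

rules=()
remaining=$total
i=1
for w in "${weights[@]}"; do
  if [ "$remaining" -gt "$w" ]; then
    # conditional probability: this link's weight over the weight of all
    # links that have not had a chance to match yet
    p=$(awk -v w="$w" -v r="$remaining" 'BEGIN { printf "%.4f", w / r }')
    rules+=("iptables -t mangle -A PREROUTING -m state --state NEW -m statistic --mode random --probability $p -j ISP$i")
  else
    # last link needs no statistic match: it takes whatever is left
    rules+=("iptables -t mangle -A PREROUTING -m state --state NEW -j ISP$i")
  fi
  remaining=$((remaining - w))
  i=$((i + 1))
done

printf '%s\n' "${rules[@]}"
```

With weights 1 2 1 3 5 this yields probabilities 1/12, 2/11, 1/9, 3/8 and an unconditional final rule, so each link ends up with exactly its weighted share overall.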
None of the load balancing techniques I have come across seems to cover 'IP-Persistence'. For example, a session with several connections (for which no conntrack-helper modules exist) will have problems, as its connections will be routed through different WAN interfaces. Some servers are very particular about the source IP of the packets they receive. I suspect online gaming and instant messengers will have problems with load balancing. What is the experience of other people in here?

A rewrite of the 'recent' match to include both source and destination may turn out to be a solution, albeit with low performance. Any other ideas?
Sorry, but it doesn't work that way. CONNMARK needs helper modules like the ones for FTP or H.323 to really know if connections belong to the same session. To cover all gaming and IM apps with own helper modules is practically impossible. I remember even MSN has had problems (timeout every 5 mins), but it seems to have been fixed at the server level. Could you please point out if I had missed any open discussion in the list which covers these things?

-----Original Message-----
From: Luciano Ruete [mailto:luciano@lugmen.org.ar]
Sent: Wednesday, May 30, 2007 11:46 AM
To: Salim S I
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter

On Tuesday 29 May 2007 03:16:47 you wrote:
> None of the load balancing techniques I have come across seems to cover
> 'IP-Persistence'. [...]
> A rewrite of 'recent' match to include both source and destination may
> turn out to be a solution, albeit with low performance. Any other ideas?

In this same thread a CONNMARK solution was exposed, and this same CONNMARK solution was openly discussed several times in this list.

All the cases that you mention (online gaming, instant messenger) and all others that you do not mention are solved by having a connection-aware firewall, which is capable of routing packets that belong to the same logical connection over the same link; this is achieved perfectly using netfilter CONNMARK.

Regards!

-- 
Luciano
Salim S I wrote:
> Sorry, but it doesn't work that way.
> CONNMARK needs helper modules like the ones for FTP or H.323 to really
> know if connections belong to the same session. To cover all gaming and
> IM apps with own helper modules is practically impossible. I remember
> even MSN have had problems (timeout every 5 mins), but it seems to have
> been fixed at the server level.
> Could you please point out if I had missed any open discussion in the
> list which covers these things?

Salim is correct, non-trackable protocols can be a major PITA.
Actually I discussed this earlier in the thread. Yes, kernel balancing due to caching will alleviate this to a certain extent, but there will still be surprises down the road, when a cache entry finally expires. Besides, caching blows the entire balancing idea to bits if most users access primarily the same resource over and over again (think of a popular internet radio station).

Furthermore, neither route balancing nor the netfilter approach will be effective for resources hosted over _multiple_ distinct IPs (AIM is a very good example, with separate authentication and data servers). This is where the exception lists come into play, which I also discussed. If one still wants to achieve pseudo balancing on the exempted destinations, it is still possible with the excellent SAME patch, which makes a NAT decision based solely on an index derived from the size of the source pool to be NATed divided by the number of NAT targets provided.

Also note that as long as a service uses a static range of ports, you do not even have to know all the destination IP ranges in order to exempt it - simple port matching will do.

HTH

Peter
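As a concrete instance of the port-matching shortcut, a single rule can pin a whole service to the preferred link; the port (5190, AIM's classic port) and the ISP1 value of $PREFERRED are illustrative:

```shell
#!/bin/bash
# Illustrative only: exempt a service from balancing by destination port
# instead of enumerating its IP ranges. 5190 is AIM's classic port and
# PREFERRED=ISP1 a placeholder; the rule is printed, not applied.
PREFERRED=ISP1
rule="iptables -t mangle -A PREROUTING -p tcp --dport 5190 -m state --state NEW -j $PREFERRED"
echo "$rule"
```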
Before we get into the "top-posting" stuff, it would be nice if you followed the normal way of replying to (or at least copying) the list. I think that is the basic idea behind a mailing list. If you had done that, I wouldn't have had to do the "top-posting". Take a look at the archives, please.

-----Original Message-----
From: Luciano Ruete [mailto:luciano@lugmen.org.ar]
Sent: Thursday, May 31, 2007 12:26 PM
To: Salim S I
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter

> On Wednesday 30 May 2007 00:58:18 you wrote:
>
> First of all, learn about basic[1] mailing list rules; mainly, your
> top-posting[2] is breaking the whole sense of the thread.
>
> > Sorry, but it doesn't work that way
>
> yes it does.

Up to you if you refuse to accept it; it doesn't matter to me if you choose to live in your little world.

> > CONNMARK needs helper modules like the ones for FTP or H.323 to
> > really know if connections belong to the same session. To cover all
> > gaming and IM apps with own helper modules is practically impossible.
>
> these helpers are needed because some special protocols have special
> needs; all other protocols are covered in a simpler manner by following
> the TCP flow between two ports. You need at least a little low-level
> knowledge about layer 3-4 protocols to understand this.

Yessir, three bags full. If you had read my post c l e a r l y before you felt obliged to show off your knowledge, you might have understood that I was talking about those so-called 'special protocols'. Btw, thanks for that bit about the "TCP flow between two ports" - it was quite new to me.

> > I remember even MSN had problems (a timeout every 5 minutes), but it
> > seems to have been fixed at the server level.
>
> With the CONNMARK solution 99.99% of things work. I am the
> sys/net-admin of an ISP that proves it, with load balancing over
> multiple links.

Sorry again! That figure of '99.99' is in YOUR case, but are you aware there are others in this world too, with different scenarios/setups?
You did not think Peter and I were dreaming up a scenario, did you? Btw, being a netadmin doesn't automatically make your statements correct.

> For each protocol that is not covered by a simple TCP flow, a helper
> module was written.

It must be a well-kept secret then! I am sorry to say this, but if your knowledge were half the size of your ego, it would have been good for us all.

> > Could you please point out if I had missed any open discussion in
> > the list which covers these things?
>
> just google(ie): "connmark site:lartc...archive"

Thanks for introducing google. But my question still stands.
On Thursday 31 May 2007 02:02:16 Salim S I wrote:
> Before we get into the "top-posting" stuff, it would be nice if you
> followed the normal way of replying to (or at least copying) the
> list. I think that is the basic idea behind a mailing list.

Sure! :-) My fault for not looking at the headers; my wish was always to write to the list.

> If you had done that, I wouldn't have had to do the "top-posting".
> Take a look at the archives please.

There is no reason to do top-posting: if I missed the cc to the list, you can still do a normal inline reply. But all this is getting OT for this list.

> On Wednesday 30 May 2007 00:58:18 you wrote:

[snip]

> Yessir, three bags full.
> If you had read my post c l e a r l y before you felt obliged to show
> off your knowledge, you might have understood that I was talking about
> the so-called 'special protocols'.

You mention online gaming and IM protocols, and there is nothing special about those. What I am trying to say is that CONNTRACK+CONNMARK solves that problem for you. You can have IM (msn, jabber, yahoo, aol) connected all day long without problems, you can do online gaming, or you can keep an ssh session up for weeks.

CONNTRACK has the ability to track tcp flows (among others) and to remember an ESTABLISHED connection. You can then use CONNMARK to mark an ESTABLISHED connection with a unique tag based on the provider it uses. From then on, every time you see the same mark on that ESTABLISHED connection, you can be sure it will be routed over the same original provider.

Full example here:
http://mailman.ds9a.nl/pipermail/lartc/2006q2/018964.html

> Btw, thanks for that bit about the "TCP flow between two ports" - it
> was quite new to me.
>
> > > I remember even MSN had problems (a timeout every 5 minutes), but
> > > it seems to have been fixed at the server level.
> >
> > With the CONNMARK solution 99.99% of things work. I am the
> > sys/net-admin of an ISP that proves it, with load balancing over
> > multiple links.
>
> Sorry again!
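The restore/mark/save cycle Luciano describes, combined with mark-based policy routing, can be sketched roughly as follows. This is only an illustration under assumed values - the gateways, interfaces, marks and table numbers are hypothetical, and the linked post has the authoritative recipe:

```shell
# 1. Restore any mark previously saved on the conntrack entry, so an
#    ESTABLISHED connection keeps the provider chosen for its first packet.
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark

# 2. New (still unmarked) connections get a provider assigned, here 50/50.
iptables -t mangle -A PREROUTING -m mark --mark 0 \
    -m statistic --mode random --probability 0.5 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -m mark --mark 0 -j MARK --set-mark 2

# 3. Remember the decision on the conntrack entry for later packets.
iptables -t mangle -A PREROUTING -j CONNMARK --save-mark

# 4. Route by mark: one table per provider, each with its own default gw.
ip route add default via 203.0.113.1 dev eth1 table 101
ip route add default via 198.51.100.1 dev eth2 table 102
ip rule add fwmark 1 table 101
ip rule add fwmark 2 table 102
```

Step 1 running before step 2 is what gives the per-connection stickiness: only connections that conntrack has never seen reach the random split.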
> That figure of '99.99' is in YOUR case, but are you aware there are
> others in this world too, with different scenarios/setups? You did not
> think Peter and I were dreaming up a scenario, did you?

The scenario you mention is a bad/incomplete setup, so do not expect it to work right.

> Btw, being a netadmin doesn't automatically make your statements
> correct.

What makes my statements correct is the fact that my networks do not show any of the problems you mention in your post.

> > For each protocol that is not covered by a simple TCP flow, a
> > helper module was written.
>
> It must be a well-kept secret then!
> I am sorry to say this, but if your knowledge were half the size of
> your ego, it would have been good for us all.

It is not about ego; sorry if you take this personally, that is not my intention. I speak bluntly because this list gets heavily indexed by google and is taken as good advice by many answer seekers. You affirm that Linux cannot handle load balancing properly, and that is completely WRONG - it is bad advertising and a lie. Since the 2.4 series the great julian's patches[1] have been available, then in 2.6.12 CONNMARK got into mainline, and with a little setup all connection problems related to load balancing are perfectly solved.

> > > Could you please point out if I had missed any open discussion in
> > > the list which covers these things?
> >
> > just google(ie): "connmark site:lartc...archive"
>
> Thanks for introducing google. But my question still stands.

I hope it is answered now.

[1] http://www.ssi.bg/~ja/

--
Luciano
-----Original Message-----
From: Luciano Ruete [mailto:luciano@lugmen.org.ar]
Sent: Saturday, June 02, 2007 11:28 AM
To: Salim S I
Cc: lartc@mailman.ds9a.nl
Subject: Re: [LARTC] Multihome load balancing - kernel vs netfilter

> It is not about ego; sorry if you take this personally, that is not my
> intention. I speak bluntly because this list gets heavily indexed by
> google and is taken as good advice by many answer seekers.
>
> You affirm that Linux cannot handle load balancing properly, and that
> is completely WRONG - it is bad advertising and a lie.
>
> Since the 2.4 series the great julian's patches[1] have been
> available, then in 2.6.12 CONNMARK got into mainline, and with a
> little setup all connection problems related to load balancing are
> perfectly solved.

I did not say Linux can't do load balancing (btw, my setup has Julian's DGD patch as well as CONNMARK). But there are some limitations to the popular methods currently used.

1. As Peter Rabbitson [rabbit@rabbit.us] mentioned, one issue is the separate control and data servers. He mentions the AIM servers as an example. This can probably only be solved by having an exception IP list.

2. The other situation, and the one I am more concerned about, is different connections that belong to the same session.

Consider client X and server Y.

Client X initiates a connection from port a to port b of server Y:

  Xa <---> Yb    This connection goes through WAN1.

After some time, X opens another connection to Y, from port c to port d:

  Xc <---> Yd    This is a perfectly new TCP connection, so it may go
                 through WAN2.

(Note that the client is NATed, and that no conntrack helper exists for this app.)

The server may reject the second and subsequent connections, as they come in with a different source IP than the first. This situation happens often in IM and gaming scenarios. Some sort of IP persistence is required to handle this. And I was wondering if the recent match would solve this to an extent, without affecting performance. Or if there is some other method available.
(Note that I can't depend much on the route cache.)
On Tue, Jun 05, 2007 at 02:48:01PM +0800, Salim S I wrote:

[snip]

> Some sort of IP persistence is required to handle this. And I was
> wondering if the recent match would solve this to an extent, without
> affecting performance. Or if there is some other method available.
> (Note that I can't depend much on the route cache.)

Are all of these idioms of each method documented in the wiki? And what is the preferred method going forward?
On Tuesday 05 June 2007 03:48:01 Salim S I wrote:

[snip]

> I did not say Linux can't do load balancing (btw, my setup has
> Julian's DGD patch as well as CONNMARK). But there are some
> limitations to the popular methods currently used.
>
> 1. As Peter Rabbitson [rabbit@rabbit.us] mentioned, one issue is the
> separate control and data servers. He mentions the AIM servers as an
> example. This can probably only be solved by having an exception IP
> list.

Ok, this is one clear example where the NAT concept fails in load balancing, and it is not really Linux-related: there is no magic to be done here except writing special code for that protocol (as a helper, right). But a helper is not needed in a normal setup. This AIM "geniality" could be made transparent to the end user, even in the client code; there is no need to use the IP as part of the auth mechanism, which is for sure insecure and useless anyway. So I will not blame Linux for this one.

> 2. The other situation, and the one I am more concerned about, is
> different connections that belong to the same session.
>
> Consider client X and server Y.
>
> Client X initiates a connection from port a to port b of server Y:
>
>   Xa <---> Yb    This connection goes through WAN1.
>
> After some time, X opens another connection to Y, from port c to
> port d:
>
>   Xc <---> Yd    This is a perfectly new TCP connection, so it may go
>                  through WAN2.
>
> (Note that the client is NATed, and that no conntrack helper exists
> for this app.)
>
> The server may reject the second and subsequent connections, as they
> come in with a different source IP than the first.

Well, it is perfectly clear now.

> This situation happens often in IM and gaming scenarios.

I really don't know which IM protocols do this; it would be better if you could name them. I have no complaints about any IM issue related to load balancing: I personally use the MSN and Jabber protocols and have no problems at all. On games I am not an expert, but I would like to know what percentage of games are this special.

The route cache will help in cases like this, as mentioned, but it is not fail-safe. Then again, these things affect not only a Linux box doing load balancing; they would affect any other solution, and as far as I can see you would need to start writing per-protocol helpers _only_ for load balancing purposes. These are application exceptions, not Linux failures: those applications were simply not designed with special setups like NATed load balancing in mind, and that's it.

> Some sort of IP persistence is required to handle this. And I was
> wondering if the recent match would solve this to an extent, without
> affecting performance. Or if there is some other method available.
> (Note that I can't depend much on the route cache.)

I think 'recent' could work. Matching only the special ports (and optionally each client address), the impact on other clients using the same ports but with different applications would be mostly null.

--
Luciano
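A rough sketch of how the 'recent'-based destination pinning discussed above might look. This assumes a recent match that supports the --rdest option (recording/matching the destination address instead of the source - availability depends on kernel/iptables version), and it assumes mark-based routing is already in place as elsewhere in this thread; the port, list name, timeout and mark value are all hypothetical:

```shell
# Before the random split: if any client has already reached this
# destination via WAN1 (mark 1), force follow-up connections to the
# same destination back onto WAN1.
iptables -t mangle -A PREROUTING -p tcp --dport 5190 \
    -m recent --name wan1-dests --rdest --rcheck --seconds 3600 \
    -j MARK --set-mark 1

# ... the usual restore-mark / random split / save-mark rules for
#     still-unmarked traffic go here ...

# After the split: whenever this service ends up on WAN1, remember the
# destination address so the next connection to it stays on WAN1.
iptables -t mangle -A PREROUTING -p tcp --dport 5190 -m mark --mark 1 \
    -m recent --name wan1-dests --rdest --set
```

Limiting the match to the special ports, as suggested above, keeps the recent list small, which is what bounds the performance impact.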