Francois
2005-Feb-10 02:10 UTC
One "200Mbps" virtual link between 2 ethernet adaptators of 2 linux boxes.
Hi,

 -------                   -------
|   B   |eth0---------eth0|   C   |
|       |eth1---------eth1|       |
 -------                   -------

I am trying to set up two links between two Linux machines in order to increase the speed and/or reliability of the connection (for example, in the case of a wireless connection). I have read that there is more than one solution: the old eql driver, the bonding driver and teql, which all seem to do almost the same thing (round robin on packets), or maybe multipath routing using "nexthop".

I would like to know whether anyone has experience with this type of setup (no doubt someone does, this being an advanced routing mailing list) and could explain to me how these solutions differ and which one would be best.

I have also started testing a configuration in order to try the bonding driver.

              -------
             |   A   |
             |       |
              -------
              ___|__
             |switch|
             |______|
 -------       |   |       -------
|   B   |eth0---   ---eth0|   C   |
|       |eth1---------eth1|       |
 -------                   -------

Machine A (192.168.1.10): PC used to configure B and C (the only one that has a screen)
Machines B and C: very simple bonding configuration:

modprobe bonding mode=1
ip addr add dev bond0 192.168.1.1/24 brd +   # for B, and .2 for C
ip link set bond0 up
ip link set eth0 up
ip link set eth1 up
ifenslave bond0 eth0 eth1

The problem is that pinging C from B shows 50% packet loss. Assuming the module's round robin works, that would mean the path through one of the interfaces does not reach C (pinging 192.168.1.1 from A also gives 50% loss). Does anyone have an idea about this?

Thank you very much!
François.

_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
Jay Vosburgh
2005-Feb-10 05:58 UTC
Re: One "200Mbps" virtual link between 2 ethernet adaptators of 2 linux boxes.
Francois <fdelawarde@wirelessmundi.com> wrote:

>              -------
>             |   A   |
>             |       |
>              -------
>              ___|__
>             |switch|
>             |______|
> -------       |   |       -------
>|   B   |eth0---   ---eth0|   C   |
>|       |eth1---------eth1|       |
> -------                   -------
>
>Machine A: (192.168.1.10) PC used to configure B&C (the only one that has a
>screen)
>Machine B&C: Very simple bonding configuration:
>
>
>modprobe bonding mode=1
>ip addr add dev bond0 192.168.1.1/24 brd + #for B and .2 for C
>ip link set bond0 up
>ip link set eth0 up
>ip link set eth1 up
>ifenslave bond0 eth0 eth1
>
>The bad thing is: B pinging C has 50% packet lost which would mean assuming
>that the round robin of the module works that a route from one of the
>interfaces doesn't reach C (pinging from A to 192.168.1.1 gives also 50%).
>Anyone has an idea on this matter?

First, if you set up bonding this way, check to see if the slaves have routes that supersede the route for the bonding master device. The slaves should not have any routes at all; all routing decisions are made against the master device. When bonding is set up by hand, the slaves can end up with routes if they are up and active prior to being enslaved. It's not generally a problem when bonding is set up at boot time.

Assuming for the moment that the routing is ok, I'm also curious as to which link loses packets (the "eth0s with switch" or the "eth1s no switch"). Looking at /var/log/messages for information from the bonding driver would also be useful; you might also look into enabling some link monitoring (just in case).

Lastly, trying to get a single TCP connection to, essentially, see N interfaces' worth of throughput is a surprisingly difficult problem. This is a topic that comes up fairly regularly on the bonding-devel list; below is an article I posted last fall. It references a discussion from a few years ago about round robin performance as it scales up to 4 adapters; that was done with 100 Mb/sec hardware, but the same would apply to gigabit links.
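A minimal sketch of the slave-route check described earlier in this message, assuming the slaves are named eth0 and eth1 as in the original post. The slave_routes function is a hypothetical helper, not part of iproute2 or the bonding driver; it only filters the text that "ip route show" prints and changes nothing on the system.

```shell
#!/bin/sh
# slave_routes: hypothetical helper (not part of any existing tool).
# Reads "ip route show" output on stdin and prints only the routes whose
# "dev" field names one of the interfaces given as arguments.
slave_routes() {
    awk -v slaves="$*" '
        BEGIN {
            # Build a lookup table from the space-separated slave names.
            n = split(slaves, s, " ")
            for (i = 1; i <= n; i++) want[s[i]] = 1
        }
        {
            # Print the route if the word after "dev" is a slave name.
            for (i = 1; i < NF; i++)
                if ($i == "dev" && ($(i + 1) in want)) { print; next }
        }
    '
}

# Typical use (interface names are assumptions taken from the post):
#   ip route show | slave_routes eth0 eth1
```

Any line it prints is a route bound to a slave rather than to bond0. Running "ip route flush dev eth0" is one way to clear such routes, though that should be checked against the local setup first.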
As somebody else pointed out, when round robin was originally implemented in bonding, state of the art was 10 Mb/sec, one packet per interrupt, and reordering wasn't a problem. Today, with adapters that coalesce packets or drivers that implement NAPI (which does the same thing), it's very difficult to arrange for packets to all arrive in the proper order.

My comments below about balance-alb not allowing a single TCP connection to see more than one interface's worth of throughput also apply to the other balance modes in bonding (other than round robin).

-J

---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

To: "Shlomi Yaakobovich" <Shlomi@exanet.com>
cc: "Tim Mattox" <tmattox@gmail.com>, bonding-devel@lists.sourceforge.net
Subject: Re: [Bonding-devel] bonding and appletalk
In-Reply-To: Message from "Shlomi Yaakobovich" <Shlomi@exanet.com> of "Tue, 05 Oct 2004 14:07:39 +0200." <F8B4823728281C429F53D71695A3AA1E012729BD@hawk.exanet-il.co.il>
X-Mailer: MH-E 7.4.3; nmh 1.0.4; GNU Emacs 21.3.1
Date: Tue, 05 Oct 2004 09:59:42 -0700
From: Jay Vosburgh <fubar@us.ibm.com>

Shlomi Yaakobovich <Shlomi@exanet.com> wrote:

>Thanks for the reply, the problem was indeed that the switch's 2 ports
>were not configure to load-sharing (it's an Extreme Networks 7i
>switch). I am giving up on using mode=0 for this type of connection fow
>now, since it requires too much external support, mode=6 is easier to
>implement on a "normal" network.
>
>I suppose that mode=0 works a bit faster than mode=6, is there any
>benchmark on the difference ?
>Do you guys have any idea what is the
>performance effects ?

The summary: round-robin (mode 0) can provide a single TCP connection with more than one interface's worth of throughput, but will generally never let you reach the maximum throughput of the bond as a whole, whereas balance-alb (mode 6) will never let a single TCP connection (peer host, really) use more than one interface's worth of throughput, but it can allow you to use the overall max throughput of the bond (to multiple destinations).

And, it depends on what you mean by "faster."

The round-robin mode (mode 0) simply stripes all traffic across the interfaces, regardless of where it's going to. For the case of a unidirectional TCP transfer, this will generally result in many, many packets received out of order. This in turn triggers TCP's congestion control algorithms (out of order packets are interpreted as lost packets, or late packets). This can be mitigated somewhat by adjusting tcp_reordering, but you're not likely to see the full bandwidth utilized by one TCP connection. This was discussed in depth on the list some time ago, see the archives at:

http://sourceforge.net/mailarchive/forum.php?thread_id=1669977&forum_id=2094

and look for messages titled "trunking performance."

The tcp_reordering value, btw, can be changed via /proc/sys/net/ipv4/tcp_reordering, or sysctl net.ipv4.tcp_reordering. The maximum useful value is 127; the default is 3.

The balance-alb mode, on the other hand, will stripe traffic to different hosts across different interfaces. Traffic to the same host will always use the same interface (generally; the sorting may be shuffled from time to time). A single connection will never see more than one interface's worth of throughput, but no segments will ever be delivered out of order, and multiple connections can utilize pretty much the full bandwidth of the aggregation. The 802.3ad mode operates the same way in this regard.
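The tcp_reordering adjustment mentioned above could be scripted roughly as follows. This is a sketch, not a tested tuning recipe: set_tcp_reordering is a hypothetical helper name, the optional second argument exists only so the function can be tried against an ordinary file, and the 127 cap simply reflects the maximum useful value quoted in this message.

```shell
#!/bin/sh
# set_tcp_reordering: hypothetical helper, shown only as an illustration.
#   $1  new value (a non-negative integer, at most 127)
#   $2  optional target file; defaults to the real /proc entry
set_tcp_reordering() {
    value=$1
    target=${2:-/proc/sys/net/ipv4/tcp_reordering}

    # Reject anything that is empty or contains a non-digit.
    case $value in
        ''|*[!0-9]*)
            echo "not a non-negative integer: $value" >&2
            return 1
            ;;
    esac

    # 127 is the maximum useful value according to the message above.
    if [ "$value" -gt 127 ]; then
        echo "refusing value above 127: $value" >&2
        return 1
    fi

    # Writing to the real /proc entry requires root.
    printf '%s\n' "$value" > "$target"
}

# Equivalent one-off commands on a live system (as root):
#   sysctl net.ipv4.tcp_reordering          # show the current value
#   sysctl -w net.ipv4.tcp_reordering=127   # set it
```

A setting made this way does not survive a reboot, so it is best suited to experiments with round-robin bonding rather than permanent configuration.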
There is no mode that will allow a single connection to use more than one interface's worth of bandwidth and guarantee ordered delivery of packets. An obvious means to accomplish that would be a round robin mode with an added reassembly layer inside of bonding. I've not done any experiments on something like that, so I'm not sure if the added overhead of the reassembly would offset the gains from guaranteed delivery order, or whether the burstiness of such a system would still interfere with TCP's congestion control.

That said, if you're doing high-volume UDP traffic, with no ordering requirements, round-robin will let you slam the full aggregate's throughput to one host, but balance-alb won't.

Read the archives; there's a lot of analysis there.

-J