Hi,

I'm trying to increase the bandwidth between two hosts (backup). Both
hosts are in the same /24 subnet and each of them is connected to a
Cisco switch by 2 GbE interfaces (Intel e1000). The switches/hosts are
located in different buildings, which are connected by 3 x GbE.

  building A                                              building B
  --------                    -----   -----                    --------
 |        |eth2,10.60.1.241  |     | |     |  10.60.1.244,eth2|        |
 | host   |------------------|  S  |-|  S  |------------------|  host  |
 |  A     |eth3,10.60.1.240  |  W  |-|  W  |  10.60.1.243,eth3|   B    |
 |        |------------------|     |-|     |------------------|        |
  --------                    -----   -----                    --------

My goal is to increase the bandwidth for a single tcp session between
the two hosts for a backup job (per packet round robin?), not for
multiple connections between many hosts. I know that I won't get
2 x 115 MB/s because of packet reordering, but 20-30% more than a
single connection would be ok.

I followed different HowTos

http://www.lartc.org/howto/lartc.rpdb.multiple-links.html#AEN298
http://lartc.org/howto/lartc.loadshare.html

or something like: ip route...equalize via...

but I never got a higher transfer rate between the two hosts than max
115 MB/s with benchmarks like netpipe or netio. I guess the route cache
might be a problem here, or maybe I'm missing some other important
part.

I'm running Debian Etch with kernel 2.6.21 from backports.org.

Any ideas what I'm missing, or if it's possible at all?

Thanks, Ralf
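For readers who have not seen the LARTC multipath setup, the "ip route
... nexthop" command referred to above usually looks something like the
sketch below. The interface names and host B's address are taken from
the diagram; everything else, including the exact form that was tried,
is an assumption rather than the poster's actual configuration.

    # Sketch of a LARTC-style multipath host route on host A towards
    # host B's eth2 address. Both interfaces sit in the same /24, so
    # the nexthops are given as devices rather than gateways.
    ip route add 10.60.1.244/32 \
            nexthop dev eth2 weight 1 \
            nexthop dev eth3 weight 1

Note that with the 2.6 routing cache the chosen nexthop tends to be
cached per destination, so a single TCP flow usually stays on one link,
which is consistent with the ~115 MB/s ceiling reported above.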
Grant Taylor
2007-Jul-30 18:44 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On 07/30/07 09:10, Ralf Gross wrote:
> I'm trying to increase the bandwidth between two hosts (backup). Both
> hosts are in the same /24 subnet and each of them is connected to a
> Cisco switch by 2 GbE interfaces (Intel e1000). The switches/hosts are
> located in different buildings, which are connected by 3 x GbE.

Ok, this is simple enough.

> My goal is to increase the bandwidth for a single tcp session between
> the two hosts for a backup job (per packet round robin?), not for
> multiple connections between many hosts. I know that I won't get
> 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> single connection would be ok.

*nod*

> Any ideas what I'm missing, or if it's possible at all?

You are barking up the wrong tree, or at least the wrong layer.

If you have any control of the switches in each building, or can have
someone make changes to them for you, bond the two connections together
to make one logical, larger connection. Cisco calls this "EtherChannel"
and Linux calls this "Bonding".

In the long run you will end up with two raw ethernet devices enslaved
into one bond0 interface. These two bonded / etherchannel interfaces
will have very close to 2 Gbps worth of speed.

Do this on the lower OSI layer 2 rather than trying (and failing) to do
it on the higher OSI layer 3, where you are doing it presently.

Grant. . . .
Paul Zirnik
2007-Jul-30 18:46 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On Monday 30 July 2007 16:10, Ralf Gross wrote:
> My goal is to increase the bandwidth for a single tcp session between
> the two hosts for a backup job (per packet round robin?), not for
> multiple connections between many hosts. I know that I won't get
> 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> single connection would be ok.
>
> I followed different HowTos
>
> http://www.lartc.org/howto/lartc.rpdb.multiple-links.html#AEN298
> http://lartc.org/howto/lartc.loadshare.html
> or something like: ip route...equalize via...
>
> but I never got a higher transfer rate between the two hosts than
> max 115 MB/s with benchmarks like netpipe or netio.

If you have different switches for each line I suggest the use of
"bonding" in balance-round-robin mode.

+-------+  eth0  +--------+  eth0  +------+
| Host  |--------|switch 1|--------| Host |
|       |        +--------+        |      |
|  A    |  eth1  +--------+  eth1  |  B   |
|       |--------|switch 2|--------|      |
+-------+        +--------+        +------+

See /usr/src/linux/Documentation/networking/bonding.txt

regards,
Paul
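On the Linux side, a balance-rr bond like the one Paul describes is set
up roughly as in the sketch below on a Debian Etch / 2.6.x system. The
interface names and address match the thread; the file location and the
exact options are assumptions, not Paul's configuration.

    # /etc/modprobe.d/bonding (assumed location) -- load the bonding
    # driver in round-robin mode with link monitoring
    alias bond0 bonding
    options bonding mode=0 miimon=100      # mode 0 = balance-rr

    # enslave the two GbE ports into bond0 (ifenslave comes from the
    # ifenslave-2.6 package on Etch); run as root
    modprobe bonding
    ifconfig bond0 10.60.1.241 netmask 255.255.255.0 up
    ifenslave bond0 eth2 eth3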
Ralf Gross
2007-Jul-30 20:12 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Grant Taylor schrieb:
> On 07/30/07 09:10, Ralf Gross wrote:
> > I'm trying to increase the bandwidth between two hosts (backup). Both
> > hosts are in the same /24 subnet and each of them is connected to a
> > Cisco switch by 2 GbE interfaces (Intel e1000). The switches/hosts are
> > located in different buildings, which are connected by 3 x GbE.
>
> Ok, this is simple enough.
>
> > My goal is to increase the bandwidth for a single tcp session between
> > the two hosts for a backup job (per packet round robin?), not for
> > multiple connections between many hosts. I know that I won't get
> > 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> > single connection would be ok.
>
> *nod*
>
> > Any ideas what I'm missing, or if it's possible at all?
>
> You are barking up the wrong tree, or at least the wrong layer. If you
> have any control of the switches in each building, or can have someone
> make changes to them for you, bond the two connections together to make
> one logical, larger connection. Cisco calls this "EtherChannel" and
> Linux calls this "Bonding".

I've tried bonding before. But this didn't work either, because the
Cisco switch decides on a src/dst mac/ip hash which port of the port
channel will be used. In my case the hash is always the same, because
the traffic is always between host A and host B. Thus always the same
interface was used.

> In the long run you will end up with two raw ethernet devices enslaved
> into one bond0 interface. These two bonded / etherchannel interfaces
> will have very close to 2 Gbps worth of speed.

But not between host A and host B. I've gone through this a while ago,
everyone told me then that I have to solve the problem on L3 ;)

> Do this on the lower OSI layer 2 rather than trying (and failing) to do
> it on the higher OSI layer 3, where you are doing it presently.

I think it's not possible with the Cisco switches we use here to
increase the bandwidth between 2 hosts on L2.

Ralf
Ralf Gross
2007-Jul-30 20:48 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Paul Zirnik schrieb:
> On Monday 30 July 2007 16:10, Ralf Gross wrote:
> > My goal is to increase the bandwidth for a single tcp session between
> > the two hosts for a backup job (per packet round robin?), not for
> > multiple connections between many hosts. I know that I won't get
> > 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> > single connection would be ok.
> >
> > I followed different HowTos
> >
> > http://www.lartc.org/howto/lartc.rpdb.multiple-links.html#AEN298
> > http://lartc.org/howto/lartc.loadshare.html
> > or something like: ip route...equalize via...
> >
> > but I never got a higher transfer rate between the two hosts than
> > max 115 MB/s with benchmarks like netpipe or netio.
>
> If you have different switches for each line I suggest the use of
> "bonding" in balance-round-robin mode.
>
> +-------+  eth0  +--------+  eth0  +------+
> | Host  |--------|switch 1|--------| Host |
> |       |        +--------+        |      |
> |  A    |  eth1  +--------+  eth1  |  B   |
> |       |--------|switch 2|--------|      |
> +-------+        +--------+        +------+

I tried this setup a while ago. Both hosts were connected to a Cisco
switch. On the linux hosts I created bond0 interfaces (round robin)
and the switch ports on both switches were configured as Port Channels.

+-------+ eth2  +----+     +----+  eth2 +------+
| Host  |-------|PC1 |     |PC2 |-------| Host |
|       | bond0 |    |_____|    | bond0 |      |
|  A    |       |SW1 |     |SW2 |       |  B   |
|       |-------|    |     |    |-------|      |
+-------+ eth3  +----+     +----+  eth3 +------+

This didn't increase the transfer rate for a tcp session between the
two hosts, because the mac and ip addresses are the same for the whole
tcp session (backup).

http://www.cisco.com/warp/public/473/4.html

<quote>
For example, if the traffic on a channel only goes to a single MAC
address, use of the destination MAC address results in the choice of
the same link in the channel each time.
</quote>

Maybe I'm missing something, but the Cisco people here told me the same.

Ralf
Grant Taylor
2007-Jul-30 21:19 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On 07/30/07 15:12, Ralf Gross wrote:
> I've tried bonding before. But this didn't work either, because the
> Cisco switch decides on a src/dst mac/ip hash which port of the port
> channel will be used. In my case the hash is always the same, because
> the traffic is always between host A and host B. Thus always the same
> interface was used.

Dough! So the switch is failing you.

> But not between host A and host B. I've gone through this a while
> ago, everyone told me then that I have to solve the problem on L3 ;)

*SIGH*

> I think it's not possible with the Cisco switches we use here to
> increase the bandwidth between 2 hosts on L2.

It sounds like a "per packet" or "per flow" decision that is defaulting
to "per flow" for deciding which port on an EtherChannel to use.

I'm not that much of a Cisco person, so I can't say for sure, but I'd
think there would be a setting in the switch that could be changed so
that you could get an aggregate bandwidth increase.

A quick Google search
(http://www.google.com/search?hl=en&q=Cisco+CEF&btnG=Google+Search)
reminded me that we had to turn on Cisco Express Forwarding (a.k.a. CEF)
and set the CEF to be "per packet" rather than "per flow". You might
want to do some research on your switches to see if they support CEF.
If your switches do support it, you may want to talk to your switch
support staff (if that is not yourself) to see if they would consider
setting it up.

I am presently using CEF between a 3640 and an upstream 7204-ubr to
load balance ethernet connections (bridged to SDSL) and am getting an
aggregate bandwidth increase in "per packet" fashion. So this will work
between routers, and I believe switches that support it.

Good luck. Let me know if there is anything else that I can do to help.

Grant. . . .
Grant Taylor
2007-Jul-30 21:22 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On 07/30/07 15:48, Ralf Gross wrote:
> I tried this setup a while ago. Both hosts were connected to a Cisco
> switch. On the linux hosts I created bond0 interfaces (round robin)
> and the switch ports on both switches were configured as Port
> Channels.

Seeing as how this is a shortcoming of the switch, I don't think that
having the Linux box solve this on layer 3 will do any good. Mainly
this is because the switch that the Linux box is connected to will
still only use one of the ports in the EtherChannel group to send the
traffic out, thus yielding a drop in throughput no matter what.

In short, I think you are going to have to solve your switches' layer 2
problem before you do anything else.

Sorry. :(

Grant. . . .
Paul Zirnik
2007-Jul-31 07:52 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On Monday 30 July 2007 22:48, Ralf Gross wrote:
> Paul Zirnik schrieb:
> > On Monday 30 July 2007 16:10, Ralf Gross wrote:
> > > My goal is to increase the bandwidth for a single tcp session between
> > > the two hosts for a backup job (per packet round robin?), not for
> > > multiple connections between many hosts. I know that I won't get
> > > 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> > > single connection would be ok.
> > >
> > > I followed different HowTos
> > >
> > > http://www.lartc.org/howto/lartc.rpdb.multiple-links.html#AEN298
> > > http://lartc.org/howto/lartc.loadshare.html
> > > or something like: ip route...equalize via...
> > >
> > > but I never got a higher transfer rate between the two hosts than
> > > max 115 MB/s with benchmarks like netpipe or netio.
> >
> > If you have different switches for each line I suggest the use of
> > "bonding" in balance-round-robin mode.
> >
> > +-------+  eth0  +--------+  eth0  +------+
> > | Host  |--------|switch 1|--------| Host |
> > |       |        +--------+        |      |
> > |  A    |  eth1  +--------+  eth1  |  B   |
> > |       |--------|switch 2|--------|      |
> > +-------+        +--------+        +------+
>
> I tried this setup a while ago. Both hosts were connected to a Cisco
> switch. On the linux hosts I created bond0 interfaces (round robin)
> and the switch ports on both switches were configured as Port
> Channels.
>
> +-------+ eth2  +----+     +----+  eth2 +------+
> | Host  |-------|PC1 |     |PC2 |-------| Host |
> |       | bond0 |    |_____|    | bond0 |      |
> |  A    |       |SW1 |     |SW2 |       |  B   |
> |       |-------|    |     |    |-------|      |
> +-------+ eth3  +----+     +----+  eth3 +------+
>
> This didn't increase the transfer rate for a tcp session between the
> two hosts, because the mac and ip addresses are the same for the whole
> tcp session (backup).

This is why I said you need two different switches. With only one, the
switch will always send only to one port, because it knows the MAC
address and will not balance traffic on two or more ports with the same
MAC address as destination. Etherchannel has no balancing algorithm for
this; it is designed for one-to-many connections, not for 1 to 1. With
two switches this is not true, and the traffic will utilize both lines
even for a 1 to 1 connection.

regards,
Paul
Ralf Gross
2007-Jul-31 08:05 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Paul Zirnik schrieb:
> This is why I said you need two different switches. With only one, the
> switch will always send only to one port, because it knows the MAC
> address and will not balance traffic on two or more ports with the same
> MAC address as destination. Etherchannel has no balancing algorithm for
> this; it is designed for one-to-many connections, not for 1 to 1. With
> two switches this is not true, and the traffic will utilize both lines
> even for a 1 to 1 connection.

Ah, you mean no Port Channels at all. This would then be more like a
direct crossover link?

Ralf
Ralf Gross
2007-Jul-31 11:01 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Paul Zirnik schrieb:
> > > On Monday 30 July 2007 16:10, Ralf Gross wrote:
> > > > My goal is to increase the bandwidth for a single tcp session between
> > > > the two hosts for a backup job (per packet round robin?), not for
> > > > multiple connections between many hosts. I know that I won't get
> > > > 2 x 115 MB/s because of packet reordering, but 20-30% more than a
> > > > single connection would be ok.
> > > >
> > > > I followed different HowTos
> > > >
> > > > http://www.lartc.org/howto/lartc.rpdb.multiple-links.html#AEN298
> > > > http://lartc.org/howto/lartc.loadshare.html
> > > > or something like: ip route...equalize via...
> > > >
> > > > but I never got a higher transfer rate between the two hosts than
> > > > max 115 MB/s with benchmarks like netpipe or netio.
> > >
> > > If you have different switches for each line I suggest the use of
> > > "bonding" in balance-round-robin mode.
> > >
> > > +-------+  eth0  +--------+  eth0  +------+
> > > | Host  |--------|switch 1|--------| Host |
> > > |       |        +--------+        |      |
> > > |  A    |  eth1  +--------+  eth1  |  B   |
> > > |       |--------|switch 2|--------|      |
> > > +-------+        +--------+        +------+
> >
> > I tried this setup a while ago. Both hosts were connected to a Cisco
> > switch. On the linux hosts I created bond0 interfaces (round robin)
> > and the switch ports on both switches were configured as Port
> > Channels.
[...]
> > This didn't increase the transfer rate for a tcp session between the
> > two hosts, because the mac and ip addresses are the same for the whole
> > tcp session (backup).
>
> This is why I said you need two different switches. With only one, the
> switch will always send only to one port, because it knows the MAC
> address and will not balance traffic on two or more ports with the same
> MAC address as destination. Etherchannel has no balancing algorithm for
> this; it is designed for one-to-many connections, not for 1 to 1. With
> two switches this is not true, and the traffic will utilize both lines
> even for a 1 to 1 connection.

I'm still a bit confused. If I use balance-rr without Etherchannels, the
bond0 MAC address will show up on 2 different switches. AFAIK, and from
what the networking staff told me, that will result in problems.

In your graph both hosts are connected by two switches, and both hosts
are directly connected to each of the switches. In my case there are
more switches involved, because the hosts are not in the same building.
That's the setup at the moment:

 building A                                                building B

+--------+        +----------+           +----------+        +--------+
|        |eth2  p1|cisco 6509|  3 x GbE  |cisco 6509|p1  eth2|        |
| Host A +--------+ switch/  +---------->| switch/  +--------+ Host B |
|  data  +--------+ router   |maybe more | router   +--------+ backup |
|        |eth3  p2|          |switches   |          |p2  eth3|        |
+--------+        +----------+between the+----------+        +--------+
                              buildings

I think you refer to this part of the bonding documentation:

http://belnet.dl.sourceforge.net/sourceforge/bonding/bonding.txt

|12.2 Maximum Throughput in a Multiple Switch Topology
|-----------------------------------------------------
|
|        Multiple switches may be utilized to optimize for throughput
|when they are configured in parallel as part of an isolated network
|between two or more systems, for example:
|
|                       +-----------+
|                       |  Host A   |
|                       +-+---+---+-+
|                         |   |   |
|                +--------+   |   +---------+
|                |            |             |
|         +------+---+  +-----+----+  +-----+----+
|         | Switch A |  | Switch B |  | Switch C |
|         +------+---+  +-----+----+  +-----+----+
|                |            |             |
|                +--------+   |   +---------+
|                         |   |   |
|                       +-+---+---+-+
|                       |  Host B   |
|                       +-----------+
|
[...]
|When employed in this fashion, the balance-rr mode allows individual
|connections between two hosts to effectively utilize greater than one
|interface's bandwidth.

But I don't have an isolated network. Maybe I'm still too blind to
see a simple solution.

Thanks, Ralf
Grant Taylor
2007-Jul-31 15:25 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On 07/31/07 06:01, Ralf Gross wrote:
> But I don't have an isolated network. Maybe I'm still too blind to
> see a simple solution.

This is why Paul's solution, though accurate, will not work in your
scenario. The fact that you are trying to go across an aggregated link
in the middle between the two buildings, where you have no control, is
going to hinder you severely.

The only other nasty thing that comes to mind is to assign additional
MAC / IP sets to each system on their second interfaces. Establish
IP-IP (?) tunnels between the two systems via each pair of MAC / IP
sets, i.e. machine A primary MAC / IP set to machine B primary MAC / IP
set, and machine A secondary MAC / IP set to machine B secondary MAC /
IP set, thus yielding two tunnels between the two machines.

Then, if you were trying to get to an IP address that could be routed
by the IP address at the end of either tunnel, you could use something
like Equal Cost Multi Path (a.k.a. ECMP) routing to send packets down
both routes. Seeing as how the traffic you are sending will be
encapsulated in IP-IP tunnel packets, each of which should be between
its own MAC / IP sets, the switches should not cause a problem by doing
what they are doing.

Grant. . . .
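A rough sketch of the two-tunnel / ECMP idea, as seen from host A,
might look like the following. Only the 10.60.1.x addresses come from
the thread; the tunnel names, the 192.168.x transfer addresses and the
target address are made up for illustration, and host B would need a
mirrored configuration.

    # Two IP-IP tunnels, one per physical MAC/IP pair.
    ip tunnel add tun0 mode ipip local 10.60.1.241 remote 10.60.1.244
    ip tunnel add tun1 mode ipip local 10.60.1.240 remote 10.60.1.243
    ip addr add 192.168.100.1/30 dev tun0
    ip addr add 192.168.101.1/30 dev tun1
    ip link set tun0 up
    ip link set tun1 up

    # Pin each tunnel endpoint to its own physical interface so the
    # switches see two distinct MAC/IP conversations.
    ip route add 10.60.1.244/32 dev eth2
    ip route add 10.60.1.243/32 dev eth3

    # ECMP route towards an address host B exposes behind both tunnels
    # (192.168.200.2 is invented for this sketch).
    ip route add 192.168.200.2/32 \
            nexthop via 192.168.100.2 dev tun0 weight 1 \
            nexthop via 192.168.101.2 dev tun1 weight 1

As with the multipath route mentioned at the start of the thread, the
2.6 route cache still tends to pin a single flow to one nexthop unless
per-packet multipath is available, so this is a sketch of the idea
rather than a guaranteed win.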
Ralf Gross
2007-Jul-31 15:31 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Grant Taylor schrieb:
> > I think it's not possible with the Cisco switches we use here to
> > increase the bandwidth between 2 hosts on L2.
>
> It sounds like a "per packet" or "per flow" decision that is defaulting
> to "per flow" for deciding which port on an EtherChannel to use.
>
> I'm not that much of a Cisco person, so I can't say for sure, but I'd
> think there would be a setting in the switch that could be changed so
> that you could get an aggregate bandwidth increase.
>
> A quick Google search
> (http://www.google.com/search?hl=en&q=Cisco+CEF&btnG=Google+Search)
> reminded me that we had to turn on Cisco Express Forwarding (a.k.a. CEF)
> and set the CEF to be "per packet" rather than "per flow". You might
> want to do some research on your switches to see if they support CEF.
> If your switches do support it, you may want to talk to your switch
> support staff (if that is not yourself) to see if they would consider
> setting it up.

I've talked to one of the people on the network staff. He said they
have never used CEF in this type of scenario. I'm also not very
familiar with Cisco products.

> I am presently using CEF between a 3640 and an upstream 7204-ubr to
> load balance ethernet connections (bridged to SDSL) and am getting an
> aggregate bandwidth increase in "per packet" fashion. So this will work
> between routers, and I believe switches that support it.
>
> Good luck. Let me know if there is anything else that I can do to help.

If you could give me more details on your CEF setup, that would maybe
help me show them what a CEF config should look like. But I'm still not
sure if CEF is something that is designed to work with client-to-client
connections.

Ralf
Grant Taylor
2007-Jul-31 16:31 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
On 07/31/07 10:31, Ralf Gross wrote:
> I've talked to one of the people on the network staff. He said they
> have never used CEF in this type of scenario. I'm also not very
> familiar with Cisco products.

My physical scenario is a Cisco 3640 router with two (10BaseT) ethernet
connections connected to external ethernet-to-SDSL bridging modems. The
SDSL modems bridge the ethernet to an ATM circuit. The ATM circuit is
terminated in a Cisco 7206 (I think it's a 6) UBR router at my ISP.

Cisco Express Forwarding is being run on the local 3640 and the remote
7206 to control which connection the packets are being routed down. The
local 3640 has two routes upstream, each being the remote IP for each
of the ATM links. Correspondingly, the 7206 has two routes to a
(globally) routable subnet behind the local 3640.

As I understand it, CEF (ultimately) builds a forwarding information
base (a.k.a. FIB) from the routing tables. So if you have multiple
routes, CEF will know about them. CEF will then divide the traffic
either "per flow" or "per packet" across all available routes so that
more aggregate bandwidth is achieved.

In my scenario, I am using CEF via OSPF to combine two 1.1 Mbps SDSL
connections to get close to 2 Mbps worth of aggregate bandwidth to the
net. I can and do routinely receive 1.5 - 1.8 Mbps throughput via FTP /
HTTP / BitTorrent. (Though BitTorrent by nature is not the best
example.)

> If you could give me more details on your CEF setup, that would maybe
> help me show them what a CEF config should look like.

I think I have done so above. If you want config examples, contact me
off list, as I don't want to publish it to the world.

> But I'm still not sure if CEF is something that is designed to work
> with client-to-client connections.

I can't say for sure one way or the other, but I think that CEF will
achieve what you are wanting to do, as long as the device you are
connecting to will support it. I know that more and more layer 3
devices support it. So, that being said, if your switches are recent
layer 3 switches, I'd say that they will support CEF. I don't know for
sure, though.

Grant. . . .
Jay Vosburgh
2007-Jul-31 19:58 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Grant Taylor <gtaylor@riverviewtech.net> wrote:
>On 07/31/07 06:01, Ralf Gross wrote:
>> But I don't have an isolated network. Maybe I'm still too blind to
>> see a simple solution.

There really isn't a simple solution, since you're not doing something
simple. It sounds simple to say you want to aggregate bandwidth from
multiple interfaces for use by one TCP connection, but it's actually a
pretty complicated problem to solve.

The diagram and description in the bonding documentation describing the
isolated network is really meant for use in clusters, and is more
historical than anything else these days. In the days of yore, it was
fairly cost effective to connect several switches to several systems
such that each system had one port into each switch (as opposed to
buying a single, much larger, switch). With no packet coalescing or the
like, balance-rr would tend to deliver packets in order to the end
systems (one packet per interrupt), and a given connection could get
pretty close to full striped throughput.

This type of arrangement breaks down with modern network hardware,
since there is no longer a one-to-one relationship between interrupts
and packet arrival.

>The fact that you are trying to go across an aggregated link in the
>middle between the two buildings, where you have no control, is going
>to hinder you severely.

Yes. You're also running up against the fact that, traditionally,
Etherchannel (and equivalents) is generally meant to aggregate trunks,
optimizing for overall maximum throughput across multiple connections.
It's not really optimized to permit a single connection to effectively
utilize the combined bandwidth of multiple links.

>The only other nasty thing that comes to mind is to assign additional
>MAC / IP sets to each system on their second interfaces.

Another similar Rube Goldberg sort of scheme I've set up in the past
(in the lab, for bonding testing, not in a production environment, your
mileage may vary, etc, etc) is to dedicate particular switch ports to
particular vlans. So, e.g.,

linux box eth0 ---- port 1:vlan 99 SWITCH(ES) port 2:vlan 99 ---- eth0 linux box
bond0     eth1 ---- port 3:vlan 88 SWITCH(ES) port 4:vlan 88 ---- eth1 bond0

This sort of arrangement requires setting the Cisco switch ports to be
native to a particular vlan, e.g., "switchport mode access",
"switchport access vlan 88". Theoretically, the intervening switches
will simply pass the vlan traffic through and not decapsulate it until
it reaches its end destination port. You might also have to fool with
the inter-switch links to make sure they're trunking properly (to pass
the vlan traffic).

The downside of this sort of scheme is that the bond0 instances can
only communicate with each other, unless you have the ability for one
of the intermediate switches to route between the vlan and the regular
network, or you have some other host also attached to the vlans to act
as a gateway to the rest of the network. My switches won't route, since
they're switch-only models (2960/2970/3550), with no layer 3
capability, and I've never tried setting up a separate gateway host in
such a configuration.

This also won't work if the intervening switches either (a) don't have
higher capacity inter-switch links or (b) don't spread the traffic
across the ISLs any better than they do on a regular etherchannel.

Basically, you want to take the switches out of the equation (so the
load balance algorithm used by etherchannel doesn't disturb the even
balance of the round robin transmission). There might be other ways to
essentially tunnel from port 1 to 2 and 3 to 4 (in my diagram above),
but that's really what you're looking to do.

Lastly, as long as I'm here, I can give my usual commentary about TCP
packet reordering. The bonding balance-rr mode will generally deliver
packets out of order (to an aggregated destination; if you feed a
balance-rr of N links at speed X into a single link with enough
capacity to handle N * X bandwidth, you don't see this problem). This
is ignoring any port assignment a switch might do.

TCP's action upon receiving packets out of order is typically to issue
an ACK indicating a lost segment (fast retransmit; by default, after 3
segments arrive out of order). On linux, this threshold can be adjusted
via the net.ipv4.tcp_reordering sysctl. Crank it up to 127 or so and
the reordering effect is minimized, although there are other congestion
control effects.

The bottom line is that you won't ever see N * X bandwidth on a single
TCP connection, and the improvement factor falls off as the number of
links in the aggregate increases. With four links, you're doing pretty
good to get about 2.3 links worth of throughput. If memory serves, with
two links you top out around 1.5.

So, the real question is: since you've got two links, how important is
that 0.5 improvement in transfer speed? Can you instead figure out a
way to split your backup problem into pieces, and run them
concurrently?

That can be a much easier problem to tackle, given that it's trivial to
add extra IP addresses to the hosts on each end, and presumably your
higher end Cisco gear will permit a load-balance algorithm other than
straight MAC address XOR. E.g., the 2960 I've got handy permits:

slime(config)#port-channel load-balance ?
  dst-ip       Dst IP Addr
  dst-mac      Dst Mac Addr
  src-dst-ip   Src XOR Dst IP Addr
  src-dst-mac  Src XOR Dst Mac Addr
  src-ip       Src IP Addr
  src-mac      Src Mac Addr

so it's possible to get the IP address into the port selection math,
and adding IP addresses is pretty straightforward.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
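The reordering threshold Jay mentions is a single sysctl; a minimal
sketch follows (the value 127 comes from the mail above, the
persistence location is an assumption):

    # raise the fast-retransmit reordering threshold from its default of 3
    sysctl -w net.ipv4.tcp_reordering=127

    # to make it persistent across reboots, e.g. in /etc/sysctl.conf:
    #   net.ipv4.tcp_reordering = 127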
Ralf Gross
2007-Jul-31 21:00 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Jay Vosburgh schrieb:
> Grant Taylor <gtaylor@riverviewtech.net> wrote:
> >On 07/31/07 06:01, Ralf Gross wrote:
> >> But I don't have an isolated network. Maybe I'm still too blind to
> >> see a simple solution.

First, thanks for your very detailed reply.

[...]

> >The only other nasty thing that comes to mind is to assign additional
> >MAC / IP sets to each system on their second interfaces.
>
> Another similar Rube Goldberg sort of scheme I've set up in the past
> (in the lab, for bonding testing, not in a production environment, your
> mileage may vary, etc, etc) is to dedicate particular switch ports to
> particular vlans. So, e.g.,
>
> linux box eth0 ---- port 1:vlan 99 SWITCH(ES) port 2:vlan 99 ---- eth0 linux box
> bond0     eth1 ---- port 3:vlan 88 SWITCH(ES) port 4:vlan 88 ---- eth1 bond0

This is something that I was thinking about too. It would be like a
direct crossover connection, which I tested with bonding and which
worked very well in round robin mode.

> This sort of arrangement requires setting the Cisco switch ports to be
> native to a particular vlan, e.g., "switchport mode access",
> "switchport access vlan 88". Theoretically, the intervening switches
> will simply pass the vlan traffic through and not decapsulate it until
> it reaches its end destination port. You might also have to fool with
> the inter-switch links to make sure they're trunking properly (to pass
> the vlan traffic).
>
> The downside of this sort of scheme is that the bond0 instances can
> only communicate with each other, unless you have the ability for one
> of the intermediate switches to route between the vlan and the regular
> network, or you have some other host also attached to the vlans to act
> as a gateway to the rest of the network. My switches won't route, since
> they're switch-only models (2960/2970/3550), with no layer 3
> capability, and I've never tried setting up a separate gateway host in
> such a configuration.

That wouldn't be a big problem. I can still take one interface of the
backup server out of the client vlan and add it to the regular backup
vlan (/24). Both hosts are equipped with 4 x GbE interfaces (2 x client
vlan + 2 x backup vlan).

> This also won't work if the intervening switches either (a) don't have
> higher capacity inter-switch links or (b) don't spread the traffic
> across the ISLs any better than they do on a regular etherchannel.
>
> Basically, you want to take the switches out of the equation (so the
> load balance algorithm used by etherchannel doesn't disturb the even
> balance of the round robin transmission). There might be other ways to
> essentially tunnel from port 1 to 2 and 3 to 4 (in my diagram above),
> but that's really what you're looking to do.

Ok.

> [TCP packet reordering]
> The bottom line is that you won't ever see N * X bandwidth on a single
> TCP connection, and the improvement factor falls off as the number of
> links in the aggregate increases. With four links, you're doing pretty
> good to get about 2.3 links worth of throughput. If memory serves, with
> two links you top out around 1.5.

This is a factor I hope to achieve.

> So, the real question is: since you've got two links, how important is
> that 0.5 improvement in transfer speed? Can you instead figure out a
> way to split your backup problem into pieces, and run them
> concurrently?

I use bacula for backup; I can add an alias with a different ip/port
for the host with the data. But I think this will get unhandy over
time.

OT: This should not only be a classical backup, it's a bit like an HSM
solution. We have large amounts of video data that will be moved from
the online storage to tapes. If the data is needed again (only a little
will be), it's possible that 5-10 TB of data needs to be restored to
the RAID again. So a 30-50% higher transfer rate could save some hours.

> That can be a much easier problem to tackle, given that it's trivial to
> add extra IP addresses to the hosts on each end, and presumably your
> higher end Cisco gear will permit a load-balance algorithm other than
> straight MAC address XOR. E.g., the 2960 I've got handy permits:
>
> slime(config)#port-channel load-balance ?
>   dst-ip       Dst IP Addr
>   dst-mac      Dst Mac Addr
>   src-dst-ip   Src XOR Dst IP Addr
>   src-dst-mac  Src XOR Dst Mac Addr
>   src-ip       Src IP Addr
>   src-mac      Src Mac Addr
>
> so it's possible to get the IP address into the port selection math,
> and adding IP addresses is pretty straightforward.

Yes, this is something I thought about first. But I fear that the
backup jobs and database records will get confusing. Backups should be
as simple as possible, therefore I'd like to solve this at a lower
level. But it's still an option.

Ralf
Ralf Gross
2007-Aug-21 16:31 UTC
Re: bandwidth aggregation between 2 hosts in the same subnet
Jay Vosburgh schrieb:
> Another similar Rube Goldberg sort of scheme I've set up in the past
> (in the lab, for bonding testing, not in a production environment, your
> mileage may vary, etc, etc) is to dedicate particular switch ports to
> particular vlans. So, e.g.,
>
> linux box eth0 ---- port 1:vlan 99 SWITCH(ES) port 2:vlan 99 ---- eth0 linux box
> bond0     eth1 ---- port 3:vlan 88 SWITCH(ES) port 4:vlan 88 ---- eth1 bond0
>
> This sort of arrangement requires setting the Cisco switch ports to be
> native to a particular vlan, e.g., "switchport mode access",
> "switchport access vlan 88". Theoretically, the intervening switches
> will simply pass the vlan traffic through and not decapsulate it until
> it reaches its end destination port. You might also have to fool with
> the inter-switch links to make sure they're trunking properly (to pass
> the vlan traffic).

I was able to test the above setup now. Both eth2 interfaces are in
vlan 801, both eth3 interfaces are in a second vlan. Bonding is
configured in round robin mode, net.ipv4.tcp_reordering = 127.

As a basic test I disabled bonding and ran a parallel benchmark over
both vlans (192.168.1.0/24 + 10.10.0.0/24).

# ./linux-i386 -t 10.10.0.1

NETIO - Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  73444 KByte/s Tx,  72705 KByte/s Rx.
Packet size  2k bytes:  73733 KByte/s Tx,  71534 KByte/s Rx.
Packet size  4k bytes:  73418 KByte/s Tx,  72074 KByte/s Rx.
Packet size  8k bytes:  73458 KByte/s Tx,  71962 KByte/s Rx.
Packet size 16k bytes:  73113 KByte/s Tx,  72132 KByte/s Rx.
Packet size 32k bytes:  72719 KByte/s Tx,  73442 KByte/s Rx.
Done.

# ./linux-i386 -t 192.168.1.1

NETIO - Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  74130 KByte/s Tx,  71282 KByte/s Rx.
Packet size  2k bytes:  73188 KByte/s Tx,  71663 KByte/s Rx.
Packet size  4k bytes:  73321 KByte/s Tx,  72349 KByte/s Rx.
Packet size  8k bytes:  73080 KByte/s Tx,  72272 KByte/s Rx.
Packet size 16k bytes:  73032 KByte/s Tx,  72307 KByte/s Rx.
Packet size 32k bytes:  72995 KByte/s Tx,  72132 KByte/s Rx.
Done.

This is not 2 x GbE, but more than just one interface. Next I enabled
bonding and repeated the test over the bond0 interfaces.

# ./linux-i386 -t 10.60.1.244

NETIO - Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  113469 KByte/s Tx,  113990 KByte/s Rx.
Packet size  2k bytes:  112990 KByte/s Tx,  114107 KByte/s Rx.
Packet size  4k bytes:  110997 KByte/s Tx,  114269 KByte/s Rx.
Packet size  8k bytes:  113337 KByte/s Tx,  114338 KByte/s Rx.
Packet size 16k bytes:  113587 KByte/s Tx,  113920 KByte/s Rx.
Packet size 32k bytes:  113249 KByte/s Tx,  114354 KByte/s Rx.
Done.

Now I get only the speed of one GbE interface again.

ifstat on server a (netio server):

       eth2                  eth3
 KB/s in  KB/s out    KB/s in  KB/s out
120257.6   6419.24   120143.6   6416.79
    0.00      0.00       0.00      0.00
58908.95  67127.21   56951.31  69093.78
    0.00      0.00       0.00      0.00
 6277.72  119635.0    6277.95  119910.9
    0.00      0.00       0.00      0.00
 6306.51  120092.4    6309.26  119892.6
    0.00      0.00       0.00      0.00
 2945.82  55833.14    2832.18  54014.88
    0.00      0.00       0.00      0.00

ifstat on server b (netio "client"):

       eth2                  eth3
 KB/s in  KB/s out    KB/s in  KB/s out
 6339.45  119813.5    6361.06  119714.8
    0.00      0.00       0.00      0.00
 8852.77  117313.6   14954.50  111191.7
    0.00      0.00       0.00      0.00
119485.3   6268.16   119901.3   6270.50
    0.00      0.00       0.00      0.00
120151.5   6305.75   119914.7   6309.08
    0.00      0.00       0.00      0.00
117493.9   6179.55   111202.9   5838.42
    0.00      0.00       0.00      0.00

It seems that the traffic is equally shared over both interfaces. Only
two switches with the vlans are involved (two buildings).

Any ideas? Is this the performance I should get from (nearly) 2 x GbE
with packet reordering in mind?

Ralf
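One way to check whether retransmits caused by reordering are what caps
the bonded run is to compare the kernel's TCP counters before and after
a benchmark; a minimal sketch follows (the temp file names and the grep
pattern are assumptions, and the exact counter names vary by kernel
version):

    # snapshot reordering/retransmission counters, run the benchmark,
    # then compare
    netstat -s | egrep -i 'reorder|retrans' > /tmp/tcp-before
    ./linux-i386 -t 10.60.1.244
    netstat -s | egrep -i 'reorder|retrans' > /tmp/tcp-after
    diff /tmp/tcp-before /tmp/tcp-after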