Michael Ruepp
2009-May-07 12:50 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Hi there,

I have configured a simple TCP Lustre 1.8 setup with one MDS (one NIC) and two OSSes (four NICs per OSS). As in the 1.6 documentation, the multihomed section is a little bit unclear to me.

I give every NID an IP in the same subnet, eg: 10.111.20.35-38 - oss0 and 10.111.20.39-42 - oss1.

Do I have to make modprobe.conf.local look like this to force Lustre to use all four interfaces in parallel:

options lnet networks=tcp0(eth0,eth1,eth2,eth3)

Because on page 138 the 1.8 manual says:

"Note - In the case of TCP-only clients, the first available non-loopback IP interface is used for tcp0 since the interfaces are not specified."

Or do I have to specify it like this:

options lnet networks=tcp

Because on page 112 the Lustre 1.6 manual says:

"Note - In the case of TCP-only clients, all available IP interfaces are used for tcp0 since the interfaces are not specified. If there is more than one, the IP of the first one found is used to construct the tcp0 ID."

Which is the opposite of the 1.8 manual.

My goal is to let Lustre utilize all four Gb links in parallel. And my Lustre clients are equipped with two Gb links, which should be utilized by the clients as well (eth0, eth1).

Or is bonding the better solution in terms of performance?

Thanks very much for input,

Michael Ruepp
Schwarzfilm AG
Isaac Huang
2009-May-07 19:57 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On Thu, May 07, 2009 at 02:50:13PM +0200, Michael Ruepp wrote:
> Hi there,
> ......
> I give every NID an IP in the same subnet, eg: 10.111.20.35-38 - oss0
> and 10.111.20.39-42 - oss1
>
> Do I have to make modprobe.conf.local look like this to force lustre
> to use all four interfaces in parallel:
>
> options lnet networks=tcp0(eth0,eth1,eth2,eth3)
>
> Because on page 138 the 1.8 manual says:
> "Note - In the case of TCP-only clients, the first available
> non-loopback IP interface is used for tcp0 since the interfaces
> are not specified."

Correct.

> or do I have to specify it like this:
>
> options lnet networks=tcp
>
> Because on page 112 the lustre 1.6 manual says:
> "Note - In the case of TCP-only clients, all available IP interfaces
> are used for tcp0 ......"

Wrong. It needs to be updated as well, Sheila?

> ......
> My goal is to let lustre utilize all four Gb links in parallel. And my
> Lustre clients are equipped with two Gb links which should be utilized
> by the lustre clients as well (eth0, eth1)
>
> Or is bonding the better solution in terms of performance?

I don't have any performance comparisons between the two approaches, but I'd suggest going with Linux bonding instead (let's call the tcp0(eth0,...,ethN) approach Lustre bonding), because:

1. With Lustre bonding it's rather tricky to get routing right, especially when all NICs reside in the same IP subnet. The Lustre tcp network driver, as its name suggests, works at the TCP layer, and the decision as to which outgoing interface to use depends on Linux IP-layer routing. When all NICs live in the same IP subnet, it's very possible that all outgoing packets would go through the interface of the first route in the Linux routing table, unless some tweaking has been done to also take source IPs into account. Incoming packets could also come in via unexpected NICs, depending on your settings in /proc/sys/net/ipv4/conf/*/arp_ignore and your Ethernet topology.

2. Linux bonding does a good job of detecting link status via either the ARP monitor or the MII monitor, but no such mechanism exists in Lustre bonding.

In fact, Lustre bonding is an officially obsoleted feature, if I remember correctly.

Thanks,
Isaac
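As an illustration of the source-routing tweak in point 1 above, per-NIC policy routing on Linux might look like the following sketch (the interface names, table numbers, and addresses are assumptions for this example, not taken from the thread):

# Give traffic sourced from each NIC's address its own routing table,
# so replies leave via the interface that owns that address:
ip route add 10.111.20.0/24 dev eth1 src 10.111.20.36 table 101
ip rule add from 10.111.20.36 table 101
# ...repeat with a separate table for eth2 and eth3.
# The ARP behaviour Isaac mentions is controlled by these sysctls:
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2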
Klaus Steden
2009-May-07 22:02 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Hi Michael,

Just want to throw my two cents in with Isaac's posting, as I spent a great deal of time working with these kinds of features over the course of the last two years.

In my experience with Lustre 1.6, in the case where multiple NICs were available, Lustre will default to using the first one exclusively until it detects a failure, and then switches over to the next available. It will also not distinguish between different NIC types; i.e. IB, GigE, etc. will be picked based on discovery order, not speed or some other metric.

I didn't even touch Lustre bonding, because as you both remark, it's a little convoluted. I spent a lot of time experimenting with Lustre over 802.3ad (LACP) aggregated links using the Linux bonding driver, and my OSS nodes produced very respectable to very good numbers. Across a pair of OSS nodes, each with 2 x GigE NICs, I was able to sustain ~350 MB/s write speed when running sandbox tests. So it appears that although the LACP driver doesn't balance a single connection across multiple links (i.e. a 2 x GigE LACP bond doesn't give you 2 Gbit throughput for a single network I/O), the Lustre implementation somehow manages to squeeze more data through the pipe.

To get it set up, simply configure NIC bonding of whatever flavour suits your needs on the OSS nodes, and then assign 'bond0' to your tcp networks, something like this:

options lnet networks=tcp0(bond0)

and you should be off to the races.

hth,
Klaus
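A minimal sketch of the 802.3ad setup Klaus describes, for a RHEL-style system (the bonding-driver options shown are standard Linux bonding, but the addresses and file contents are assumptions, not Klaus's actual configuration):

# /etc/modprobe.conf - load the bonding driver in LACP mode
alias bond0 bonding
options bond0 mode=802.3ad miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0 - example address
DEVICE=bond0
IPADDR=10.111.20.35
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# then point LNET at the bond:
options lnet networks=tcp0(bond0)

Note that for mode 802.3ad the switch ports must also be configured as an LACP aggregate.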
Isaac Huang
2009-May-08 00:20 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On Thu, May 07, 2009 at 03:02:49PM -0700, Klaus Steden wrote:
> ......
> it appears that although the LACP driver doesn't balance a
> connection across multiple links (i.e. a 2 GigE LACP bond doesn't
> give you 2 Gbit throughput for a single network I/O), the Lustre
> implementation somehow manages to squeeze more data through the pipe.

Probably because the Lustre TCP driver creates multiple connections between two endpoints, for different types of data.

Thanks,
Isaac
Mag Gam
2009-May-09 15:07 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
I second the responses. Go with native OS bonding, Linux in this case. Makes life so much easier...

Good luck
Arden Wiebe
2009-May-09 16:18 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Michael,

This might help answer some questions: http://ioio.ca/Lustre-tcp-bonding/OST2.png shows my mostly untuned OSS and OSTs pulling 400+ MiB/s over TCP bonding provided by the kernel, complete with a cat of the modprobe.conf file. You have the other links I've sent you, but the picture above is relevant to your questions.

Arden
Andreas Dilger
2009-May-09 18:31 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On May 09, 2009 09:18 -0700, Arden Wiebe wrote:
> This might help answer some questions:
> http://ioio.ca/Lustre-tcp-bonding/OST2.png shows my mostly untuned
> OSS and OSTs pulling 400+ MiB/s over TCP bonding provided by the
> kernel, complete with a cat of the modprobe.conf file.

Arden, thanks for sharing this info. Any chance you could post it to wiki.lustre.org? It would seem there is one bit of info missing somewhere: how does bond0 know which interfaces to use?

Also, another oddity: the network monitor is showing 450 MiB/s received, yet the disk is showing only about 170 MiB/s going to the disk. Either something is wacky with the monitoring (e.g. it is counting Received for both the eth* networks AND bond0), or Lustre is doing something very weird and retransmitting the bulk data like crazy (seems unlikely).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Arden Wiebe
2009-May-10 04:15 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Bond0 knows which interfaces to utilize because all the others, eth0-5, are designated as slaves in their configuration files (see the sketch below). The manual is fairly clear on that.

In the screenshot, the memory used in GNOME System Monitor is at 452.4 MiB of 7.8 GiB, and the sustained bandwidth to the OSS and OST is 404.2 MiB/s, which corresponds roughly to what collectl is showing for KBWrite for Disks. Collectl shows a few different results for Disks, Network, and Lustre OST, and I believe it is measuring the other OST on the network at around 170 MiB/s, if you view the other screenshot for OST1 or lustrethree.

In the screenshots: Lustreone=MGS, Lustretwo=MDT, Lustrethree=OSS+raid10 target, Lustrefour=OSS+raid10 target.

To help clarify the entire network and the stress testing I did with all the clients I could give it, see www.ioio.ca/Lustre-tcp-bonding/images/html and www.ioio.ca/Lustre-tcp-bonding/Lustre-notes/images.html

Proper benchmarking would be nice, though, as I just hit it with everything I could and it lived, so I was happy. I found the manual to be lacking in benchmarking, and I really wanted to make nice graphs of it all but failed to do so with iozone for some reason.

I'll be taking a run at upgrading everything to 1.8 in the coming week or so, and when I do I'll grab some new screenshots and post the relevant items to the wiki. Otherwise, if someone else wants to post the existing screenshots, you're welcome to use them, as they do detail a ground-up build. Apparently 1.8 is great with small files now, so it should work even better with www.oil-gas.ca/phpsysinfo and www.linuxguru.ca/phpsysinfo
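For reference, a minimal sketch of one such slave configuration file on a RHEL-style system (the MASTER/SLAVE layout is standard for the Linux bonding driver; the exact contents of Arden's files are not shown in the thread):

# /etc/sysconfig/network-scripts/ifcfg-eth0 (one per slave, eth0-eth5)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none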
Mag Gam
2009-May-10 12:48 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Thanks for the screenshot, Arden.

What is the maximum number of slaves you can have on a bonded interface?
Arden Wiebe
2009-May-10 13:12 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Mag, you're welcome. From the first page returned by a search for Linux bonding, it states:

How many bonding devices can I have?

    There is no limit.

How many slaves can a bonding device have?

    This is limited only by the number of network interfaces Linux supports and/or the number of network cards you can place in your system.
Kevin Van Maren
2009-May-10 14:04 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On May 10, 2009, at 7:12 AM, Arden Wiebe <albert682 at yahoo.com> wrote:
> How many slaves can a bonding device have?
>
> This is limited only by the number of network interfaces Linux
> supports and/or the number of network cards you can place in your
> system.

In practice, most configurations are limited to the (typical) maximum of 4 or 8 supported by the switch you are using.
Christopher J. Walker
2009-May-10 14:07 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
Arden Wiebe wrote:
> Proper benchmarking would be nice, though, as I just hit it with
> everything I could and it lived, so I was happy. I found the manual
> to be lacking in benchmarking, and I really wanted to make nice
> graphs of it all but failed to do so with iozone for some reason.

I too have been trying to benchmark a Lustre filesystem with iozone 3.321. Sometimes it works, and sometimes it hangs. I turned on debugging, and ran a test with 2 clients on each of 40 machines. In the output, I get lines like:

loop: R_STAT_DATA for client 9

For 79 clients, there are two of these messages in the output, and for one of them only 1.

I've had a brief skim of the source code, and I think that the problem is that iozone uses UDP packets to communicate. On a heavily loaded network, one of these is bound to get lost. Presumably iozone doesn't have the right retry strategy. The iozone author has suggested using a different network for the timing packets, but I don't think I can justify the time or expense involved in building one purely to do some benchmarking.

Chris

PS On a machine with 2 bonded Gigabit Ethernet cards, I found I needed two iozone threads to get the available bandwidth. One iozone thread seemed to get the bandwidth from one card only.
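For context, a local throughput-mode iozone run of the sort Chris describes in the PS might look like this sketch (sizes and paths are placeholders, not his actual command; his distributed runs would additionally pass a client list via -+m):

# two local threads, 1 GB per file, 1 MB records, write (0) and read (1) tests
iozone -t 2 -s 1g -r 1m -i 0 -i 1 -F /mnt/lustre/f1 /mnt/lustre/f2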
Brian J. Murrell
2009-May-10 15:00 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On Sun, 2009-05-10 at 15:07 +0100, Christopher J. Walker wrote:
> I've had a brief skim of the source code, and I think that the problem
> is that iozone uses UDP packets to communicate. On a heavily loaded
> network, one of these is bound to get lost. Presumably iozone doesn't
> have the right retry strategy.

Why not use a benchmark that uses an established MPI library (such as MPICH or LAM, which can run their message-passing infrastructure over a TCP transport such as rsh or ssh)? IOR is one such benchmark.

Of course, if your network is really so loaded as to be dropping UDP packets, then that will probably impact the latency of the MPI messages. I'm not sure whether that will have a meaningful impact on IOR or not; I tend to think the messaging is quite low volume, so perhaps not. In any case, it can add another data point to your debugging efforts to help prove or disprove your hypothesis.

b.
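A sketch of the kind of IOR run Brian suggests (the task count, machine file, sizes, and mount point are placeholders; the option names are standard IOR, but check your build):

# 8 MPI tasks, file-per-process, 1 MiB transfers into 1 GiB files
mpirun -np 8 -machinefile clients.txt \
    ./IOR -w -r -F -t 1m -b 1g -o /mnt/lustre/ior.testfile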
Klaus Steden
2009-May-12 22:50 UTC
[Lustre-discuss] tcp network load balancing understanding lustre 1.8
On 5/10/09 6:12 AM, "Arden Wiebe" <albert682 at yahoo.com> etched on stone tablets:
> How many slaves can a bonding device have?
>
> This is limited only by the number of network interfaces Linux
> supports and/or the number of network cards you can place in your
> system.

If memory serves, the LACP spec allows for a maximum of 8 devices within an aggregate group. I don't know if the ALB and TLB modes of the Linux bonding implementation enforce any limit, though.

Klaus