Peter J. Braam
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
Hello,

For the case you describe we recommend the following. Introduce channel
bonding (a Linux networking feature) to build an aggregate interface on the
OSS, and name that interface with

    networks="tcp(bond0)"

on the OSS nodes (assuming the bonded interface is called bond0).

The MDS and clients require no configuration; they use their interface by
default. So load balancing across the interfaces is left to Linux.

Note that in the past we initiated work for Lustre to manage such load
balancing itself, but we do not do that anymore.

- Peter -

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of
> Peter Kjellström
> Sent: Tuesday, April 11, 2006 9:19 AM
> To: lustre-discuss@clusterfs.com
> Subject: [Lustre-discuss] how do you configure multiple eth interfaces
>
> Hello,
>
> I've been reading the Lustre manual (1.4.6.1-manv17), and in
> there is an example of multihomed servers (chapter 3). Servers
> megan and oscar are described as having two ethernet
> interfaces for LNET (eth0, eth1), and they are configured as
> tcp0 with networks="tcp0(eth0,eth1)" in modprobe.conf.
>
> Now my question: how should eth0 and eth1 be configured in
> the OS? Would two IPs on the same subnet work? If so, how
> is the NID given to lmc, and which IP should the hostname
> resolve to?
>
> What I'm trying to do is use two gigE interfaces for each OSS,
> with clients and MDS on the same switch and subnet (initially
> at least) using only one interface each.
>
> Any information on these kinds of configs (general too) would
> be much appreciated, Peter
>
> --
> ------------------------------------------------------------
> Peter Kjellström |
> National Supercomputer Centre |
> Sweden | http://www.nsc.liu.se
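For concreteness, a minimal sketch of what that recommendation could look like on a Red Hat-style OSS of that era. Everything below is illustrative: the bonding mode, addresses, and file paths are assumptions, not taken from this thread.

```shell
# /etc/modprobe.conf (illustrative; the bonding mode is site-specific)
alias bond0 bonding
options bonding mode=balance-rr miimon=100

# Point LNET at the aggregate interface, per the recommendation above
options lnet networks="tcp(bond0)"

# /etc/sysconfig/network-scripts/ifcfg-bond0 (example address)
#   DEVICE=bond0
#   IPADDR=172.16.0.10
#   NETMASK=255.255.0.0
#   ONBOOT=yes
#
# eth0 and eth1 are then enslaved via MASTER=bond0, SLAVE=yes in
# their respective ifcfg files.
```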
Peter Kjellström
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
On Tuesday 11 April 2006 17:25, Peter J. Braam wrote:
> Hello
>
> For the case you describe we recommend the following.
>
> Introduce channel bonding (a Linux networking feature) to build an
> aggregate interface on the OSS. Give that interface with
>
> networks="tcp(bond0)"
>
> on the OSS nodes (assuming the name of the bonded interface is bond0).
>
> The MDS and clients would require no configuration, using their interface
> by default. So load balancing the interfaces is left to Linux.

Thanks for your quick reply,

I know of this way, but I'd prefer to have more independent interfaces. I
fear that my performance will suffer with the bonding approach (maybe this
won't happen, but I don't have any fond memories of previous bonding
projects...).

I was thinking more along the lines of round robin per client or request. I
assume that is what I get if I set up tcp0 and tcp1 on all nodes? Something
as static and dumb as "transfer number N uses eth(N mod number of
interfaces)" is what I'm fishing for, I guess...

There seem to be quite a few storage servers on the market with two or four
gigE interfaces; will bonding be the recommended way to configure Lustre on
these kinds of servers?

A brief overview of the alternatives (I'm using 1.4.6.1) and of what will
happen in the future would be nice to have. Maybe this discussion should be
deferred to the usergroup meeting :-)

/Peter

> Note that in the past we initiated work for Lustre to manage such load
> balancing, but we do not do that anymore.
>
>
> - Peter -
>
> > Hello,
> > ...
> > What I'm trying to do is use two gigE interfaces for each OSS
> > with clients and mds on the same switch and subnet (initially
> > at least) using only one interface each.
> > Any information on these kinds of configs (general too) would
> > be much appreciated, Peter

--
------------------------------------------------------------
Peter Kjellström |
National Supercomputer Centre |
Sweden | http://www.nsc.liu.se
Daire Byrne
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
Adam/Peter,

I used mode=4, but to be honest I can't recall if I'd patched
"xmit_hash_policy=layer3+4" into the kernel for my tests, so the transmit
side was probably just matching clients based on MAC address only. Perhaps
the bonding driver has become even faster in recent months? I think I used
mode=4 as it avoids out-of-order packets.

The thing that convinced us to stick with Lustre's inbuilt load-balancing
was its ability to have a dual-NIC server communicate with a dual-NIC
client. This is useful for providing faster access to selected clients at
almost no extra cost (most of our desktops have dual GigE onboard). Of
course there is a lot more interrupt thrashing, which can drive up CPU
usage, so it's not really great if you're running a CPU-intensive app on
the client (jumbo frames help). But if your environment simply consists of
dual-NIC servers and single-NIC clients, then bonding is far easier to set
up and maintain.

Regards,

Daire

> We use the linux bonding driver here and can nearly saturate the
> connection. What bonding mode are you using?
>
> On Wed, 2006-04-12 at 15:25 +0100, Daire Byrne wrote:
>> Peter,
>>
>> Well I didn't realise that CFS now recommend bonding! I always thought
>> that Lustre's inbuilt load-balancing was better due to the two ksocknald
>> threads and cpu affinity support. Maybe these two performance features
>> are now working correctly with bonded devices?
>> [...]
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
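Daire's mode=4 observation suggests a concrete sketch. The option names below are those of the Linux bonding driver; the file layout is illustrative, and, as noted in the thread, a 2006-era kernel may need a patch before xmit_hash_policy has any effect.

```shell
# /etc/modprobe.conf -- illustrative sketch, not from the thread
alias bond0 bonding
# mode=4 is 802.3ad (LACP) and needs a cooperating switch.
# xmit_hash_policy=layer3+4 hashes on IP+port rather than the default
# layer2 (MAC) hash, so traffic to one peer can spread across slaves;
# without it, transmit matching falls back to MAC, as Daire observed.
options bonding mode=4 miimon=100 xmit_hash_policy=layer3+4
options lnet networks="tcp(bond0)"
```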
Peter J. Braam
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
Hi Daire,

I wanted to clarify that we have indeed changed strategy on this. Last year
we heard that the bonding bugs had been found and were on the way out,
followed by success at one of our customers. It is quite hard for us to
maintain these kinds of features, so this was followed by a decision to
leave bonding to Linux.

If clients have 2 interfaces and servers have multiple interfaces, Linux
bonding can help equally well again.

- Peter -

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of Daire Byrne
> Sent: Thursday, April 13, 2006 4:55 AM
> To: Adam Cassar
> Cc: lustre-discuss@clusterfs.com; Peter Kjellström
> Subject: Re: [Lustre-discuss] how do you configure multiple
> eth interfaces
>
> Adam/Peter,
>
> I used mode=4 but to be honest I can't recall if I'd patched
> in the "xmit_hash_policy=layer3+4" into the kernel for my tests.
> So the transmit was probably just matching clients based on MAC
> address only. Perhaps the bonding driver has become even faster in
> recent months? I think I used mode=4 as it avoids out of order packets.
> [...]
Daire Byrne
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
Peter,

Well I didn't realise that CFS now recommend bonding! I always thought that
Lustre's inbuilt load-balancing was better due to the two ksocknald threads
and cpu affinity support. Maybe these two performance features are now
working correctly with bonded devices?

The more recent bonding drivers now support a module configuration option
called "xmit_hash_policy" (=layer3+4) which can be used to set up a LACP
(dynamic link aggregation) connection to most GigE switches (check out
https://bugzilla.lustre.org/show_bug.cgi?id=10287). This should perform
pretty well as it splits clients so that they connect to only one server
NIC at a time, which ensures packet ordering.

If you want to use inbuilt Lustre networking you can use something like
this in your modprobe.conf(s):

# Server modprobe.conf
options lnet networks="tcp0(eth0,eth1),tcp1(eth0),tcp2(eth1)"

# Client modprobe.conf
options lnet\
config_on_load=1;\
ip2nets="tcp1(eth0) 172.16.*.[1-253/2];\
tcp2(eth0) 172.16.*.[2-254/2];"

I've been looking into this recently and found a couple of minor bugs
detailed in https://bugzilla.lustre.org/show_bug.cgi?id=10279

Hope that helps somewhat. I'd be interested to hear about any performance
results you get from bonding vs. lustre load-balancing. When I last checked
I clocked around 150Mb/s r/w for bonding and 185Mb/s for lustre
load-balancing. Not a huge difference, but it all adds up....

Regards,

Daire

> On Tuesday 11 April 2006 17:25, Peter J. Braam wrote:
>> Hello
>>
>> For the case you describe we recommend the following.
>>
>> Introduce channel bonding (a Linux networking feature) to build an
>> aggregate interface on the OSS. Give that interface with
>>
>> networks="tcp(bond0)"
>>
>> on the OSS nodes (assuming the name of the bonded interface is bond0).
>>
>> The MDS and clients would require no configuration, using their interface
>> by default. So load balancing the interfaces is left to Linux.
>
> Thanks for your quick reply,
> [...]
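The ip2nets patterns in Daire's example split clients by the parity of the last octet of their IP address. As a hedged illustration of what `[1-253/2]` (odd octets) and `[2-254/2]` (even octets) select, here is a hypothetical helper — purely demonstrative, not part of Lustre or LNET:

```shell
# Illustrates the odd/even split expressed by the ip2nets rules above:
# hosts with an odd last octet land on tcp1(eth0), even hosts on
# tcp2(eth0), spreading clients across the two server NICs.
last_octet() { echo "$1" | awk -F. '{print $4}'; }

lnet_net_for_ip() {
    ip="$1"
    if [ $(( $(last_octet "$ip") % 2 )) -eq 1 ]; then
        echo "tcp1(eth0)"   # matches 172.16.*.[1-253/2]
    else
        echo "tcp2(eth0)"   # matches 172.16.*.[2-254/2]
    fi
}

lnet_net_for_ip 172.16.0.15   # odd octet  -> prints tcp1(eth0)
lnet_net_for_ip 172.16.1.20   # even octet -> prints tcp2(eth0)
```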
Adam Cassar
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
We use the linux bonding driver here and can nearly saturate the
connection. What bonding mode are you using?

On Wed, 2006-04-12 at 15:25 +0100, Daire Byrne wrote:
> Peter,
>
> Well I didn't realise that CFS now recommend bonding! I always thought
> that Lustre's inbuilt load-balancing was better due to the two ksocknald
> threads and cpu affinity support. Maybe these two performance features
> are now working correctly with bonded devices?
> [...]

--
Adam Cassar
ICT Manager
NetRegistry Pty Ltd
______________________________________________
http://www.netregistry.com.au
Tel: 02 9699 6099 Fax: 02 9699 6088
PO Box 270 Broadway NSW 2007
Domains | Business Email | Web Hosting | E-Commerce
Trusted by 10,000s of businesses since 1997
______________________________________________
Peter Kjellström
2006-May-19 07:36 UTC
[Lustre-discuss] how do you configure multiple eth interfaces
Hello,

I've been reading the Lustre manual (1.4.6.1-manv17), and in there is an
example of multihomed servers (chapter 3). Servers megan and oscar are
described as having two ethernet interfaces for LNET (eth0, eth1), and they
are configured as tcp0 with networks="tcp0(eth0,eth1)" in modprobe.conf.

Now my question: how should eth0 and eth1 be configured in the OS? Would
two IPs on the same subnet work? If so, how is the NID given to lmc, and
which IP should the hostname resolve to?

What I'm trying to do is use two gigE interfaces for each OSS, with clients
and MDS on the same switch and subnet (initially at least) using only one
interface each.

Any information on these kinds of configs (general too) would be much
appreciated, Peter

--
------------------------------------------------------------
Peter Kjellström |
National Supercomputer Centre |
Sweden | http://www.nsc.liu.se