zhengfeng
2012-Mar-06 05:01 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
Dear all,

Since we have no spare nodes in our Lustre project, I want to confirm the following:

1) Can the "Client" and "MGS" run together on one node? Or can the "Client" and "OSS" run together on one node?

2) If I deploy them on one node, what are the potential shortcomings or harm?

I am a newbie with the Lustre file system, so please forgive my simple questions. Feel free to share your opinions, many thanks.

B.R.
Zheng Feng
Colin Faber
2012-Mar-06 05:51 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
In all cases, yes. However, this is not the recommended or optimal configuration.
Carlos Thomaz
2012-Mar-06 21:15 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
Zheng,

It's VERY unusual. Usually the MGS and MDS are deployed on the same server or the same HA pair. Leave the OSS servers alone, as well as the clients.

Honestly, I've never seen this before, but remember that you have to have the MGS up and running before you even start configuring or bringing up the MDS and OSSs. If for some reason you have to reboot, let's say, a client where the MGS is, you may get into all sorts of problems.

Again, this is very odd and won't buy you anything. Why don't you just install the MGS and MDS on the same server/HA pair? It's a straightforward configuration and will help you manage your environment better.

Rgds,
Carlos.

--
Carlos Thomaz | HPC Systems Architect
DataDirect Networks, Inc.
Peter Grandi
2012-Mar-07 12:54 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
Running the MGS and MDS on the same node is customary, see:

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122

Running the MGS, MDS and OSS services on the same node is possible and fairly common in very small setups, usually those in which there are only 1-2 nodes.

It is possible to use the client code on all types of Lustre servers, but at least when the client code runs on an OSS there is a non-negligible possibility of a resource deadlock if the client uses the OSS on the same node, as the client and OSS code compete for memory. For that reason this has been discouraged in the past.

This is documented here:

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_84876

"Caution - Do not do this when the client and OSS are on the same node, as memory pressure between the client and OSS can lead to deadlocks."
zhengfeng
2012-Mar-08 02:27 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
Dear all, thanks a lot for your answers ;)

Now I have another problem, concerning the network between nodes. Since we have no InfiniBand or 10G NICs but I still want to increase bandwidth by adding more 1G NICs, I plan to use Linux bonding.

We bonded 4 NICs together on one node, but there was no performance improvement no matter which of the bonding modes described in the kernel documentation we used. Instead, the performance of the 4-NIC bond is lower than that of a single NIC. When we bond 2 NICs, the performance is better than a single NIC's. The result is:

bonding 2 NICs: 1 + 1 > 1
bonding 4 NICs: 1 + 1 + 1 + 1 < 1

So confused... The benchmark we used is "netperf". I also used "tcpdump" to capture the packets and found a great many out-of-order TCP segments.

My questions are:

a) The TCP segments are out of order, which degrades the performance of the 4-NIC bond. Is this the root cause?

b) We doubt the feasibility of using a 4-NIC bond to increase bandwidth. Any suggestions? Maybe I should use some other method instead.

Thanks again, all

Best Regards
Zheng
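As an illustration of question (a), here is a toy Python model of a round-robin style bond, in which successive packets of one stream leave on successive links. The per-link delays are made-up numbers rather than measurements, and the function name is hypothetical. The only point of the sketch is that once the links differ in delay, packets of a single TCP stream can arrive behind packets that were sent later.

```python
def reordered_fraction(n_packets, link_delays, send_interval=1.0):
    """Round-robin packets over links and return the fraction that
    arrive after a packet that was sent later (toy model)."""
    arrivals = []
    for seq in range(n_packets):
        link = seq % len(link_delays)          # round-robin link choice
        arrival_time = seq * send_interval + link_delays[link]
        arrivals.append((arrival_time, seq))
    arrivals.sort()                            # order in which the receiver sees them

    highest_seen = -1
    late = 0
    for _, seq in arrivals:
        if seq < highest_seen:
            late += 1                          # arrived behind a later-sent packet
        else:
            highest_seen = seq
    return late / n_packets


# Hypothetical delays (arbitrary time units), chosen only for illustration.
single = reordered_fraction(1000, [5.0])
bonded = reordered_fraction(1000, [5.0, 7.5, 5.2, 9.1])
```

In this model a single link never reorders, while the four-link case does. Whether that reordering is what hurts a real bond depends on the bonding mode, the switch and the TCP stack, so treat this as a way to reason about the tcpdump observation and not as a diagnosis.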
Michael Shuey
2012-Mar-08 03:21 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
I've had success with LACP bonding, with an LACP-aware switch. You may also want to check your xmit_hash_policy (it's a kernel option to the Linux bonding driver). I've had the best luck with layer3+4 bonding, using several OSSes with sequential IPs, and striping files at a multiple of the number of links in the bond.

With most bonding modes, packets can get sent across different links in the bond. This results in out-of-order packets, and can slow down a TCP stream (like your Lustre connection) unnecessarily. LACP will route packets to a given destination across exactly one link in the bond, but that will limit each TCP stream to the link speed of a single bond member.

You can improve upon single-link speeds with Lustre, because Lustre will let you stripe large files across multiple OSSes. A client will build a separate TCP connection to each OSS, so as long as traffic passes over different links you can use all the available bandwidth. The way traffic is scattered across the bond members is controlled by the xmit_hash_policy option; layer3+4 uses an XOR of the source and destination addresses, combined with the source and destination TCP ports (modulo the number of links in the bond) to pick the specific link for that stream. If you're using sequential IPs for your OSSes, you should be able to get a good scattering effect (since your source address and port won't change, but your destination address will vary across the OSSes).

A few years ago, I was using Lustre 1.4 and 1.6 and saw 200+ MB/sec across two gigE links bonded together on the client (using striped files). Your mileage may vary, of course.

Caveats include:

Small files may have poorer performance than usual, due to transaction overhead to multiple OSSes (if your bond is on the client). Similarly, non-striped files will only see the speed of a single link.

If your OSTs start to fill, Lustre's load balancing may not give you an ideal distribution of stripes across OSSes, causing multiple TCP streams to land on the same bond member on the client. Unfortunately, this will present as slowdowns for certain files on certain clients (because the number of bond members that can be used is a function of both which OSSes are used in the file and the client's IP in the hash policy).

All metadata accesses are limited to the speed of a single bond member on the client.

If your bonds are on the server, then (as long as you have a number of clients) you should see a nice increase in overall IO throughput. It won't be as marked a boost as 10gigE or InfiniBand, but bonds are inexpensive and generally better than a single link (to multiple clients).

Hope this helps - good luck!

--
Mike Shuey
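To make the layer3+4 scattering concrete, here is a minimal Python sketch. It assumes the formula given in the Linux kernel bonding documentation of that period for unfragmented TCP/UDP packets, ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count; newer kernels compute the hash differently. The IP addresses and the client source port are hypothetical, and the function name is made up for this example. Port 988 is Lustre's default acceptor port.

```python
import ipaddress


def layer3_4_link(src_ip, src_port, dst_ip, dst_port, n_links):
    """Pick a bond member using the older documented layer3+4 formula
    (an assumption; check the documentation for your kernel)."""
    ip_xor = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return ((src_port ^ dst_port) ^ (ip_xor & 0xFFFF)) % n_links


# Hypothetical client and four OSSes with sequential IPs.
client_ip = "10.0.0.50"
oss_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# One TCP connection per OSS; the client source port (1023) is illustrative.
links = [layer3_4_link(client_ip, 1023, oss, 988, 4) for oss in oss_ips]
print(links)  # one bond member index per OSS connection
```

With these particular values each OSS connection lands on a different bond member, which is the scattering effect described above. A different client address or port shifts the mapping, and two streams can collide on one link when the OSS addresses are not sequential.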
Brian J. Murrell
2012-Mar-12 21:35 UTC
[Lustre-discuss] Could "Client & MGS" or "Client & OSS" running at one node together?
Theoretically, the client and OST cannot be on the same node due to a potential memory deadlock.

When a node that is a Lustre client grows short of memory, it flushes its cache to free some memory up. However, if the OST that it needs to flush pages to is also on the same node, the OST will need to allocate memory to receive the pages from the client, which it will not be able to do, since the node is already short of memory, which is what caused the client to flush in the first place.

Cheers,
b.

--
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
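The circular dependency described above can be sketched as a toy Python model. This is not Lustre code: the function name, the page accounting and the one-buffer-per-page rule are all simplifying assumptions made for illustration. It only shows that a flush cannot start when the receiving OST has to take its buffer from the same exhausted pool.

```python
def try_flush(free_pages, dirty_pages, ost_on_same_node):
    """Toy model: flush dirty client pages to an OST and return
    (free_pages, dirty_pages) once no further progress is possible."""
    while dirty_pages > 0:
        if ost_on_same_node:
            if free_pages == 0:
                break              # OST cannot allocate a receive buffer: stuck
            free_pages -= 1        # OST takes one buffer from the shared pool
            free_pages += 2        # after the write, buffer and client page are freed
        else:
            free_pages += 1        # remote OST uses its own memory; client page is freed
        dirty_pages -= 1
    return free_pages, dirty_pages


stuck = try_flush(free_pages=0, dirty_pages=100, ost_on_same_node=True)
remote = try_flush(free_pages=0, dirty_pages=100, ost_on_same_node=False)
```

In this model the same-node case with no free pages makes no progress at all, while the remote-OST case drains every dirty page. Real kernels keep memory reserves and behave in more complicated ways, so this explains the warning but does not predict when a deadlock will occur.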