Buday, Tomas
2011-Jan-22 20:29 UTC
[Lustre-discuss] Lustre-discuss Digest, Vol 60, Issue 25
"lustre-discuss-request at lists.lustre.org" <lustre-discuss-request at lists.lustre.org> wrote: Send Lustre-discuss mailing list submissions to lustre-discuss at lists.lustre.org To subscribe or unsubscribe via the World Wide Web, visit http://lists.lustre.org/mailman/listinfo/lustre-discuss or, via email, send a message with subject or body ''help'' to lustre-discuss-request at lists.lustre.org You can reach the person managing the list at lustre-discuss-owner at lists.lustre.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Lustre-discuss digest..." Today''s Topics: 1. Re: MDT raid parameters, multiple MGSes (Thomas Roth) 2. Re: split OSTs from single OSS in 2 networks (Aur?lien Degr?mont) ---------------------------------------------------------------------- Message: 1 Date: Sat, 22 Jan 2011 11:23:56 +0100 From: Thomas Roth <t.roth at gsi.de> Subject: Re: [Lustre-discuss] MDT raid parameters, multiple MGSes To: Cliff White <cliffw at whamcloud.com> Cc: lustre-discuss at lists.lustre.org Message-ID: <4D3AB03C.5090005 at gsi.de> Content-Type: text/plain; charset=ISO-8859-1 O.k., so the point is that the MDS writes are so small, one could never stripe such a write over multiple disks anyhow. Very good, one point less to worry about. Btw, files on the MDT - why does the apparent file size there sometimes reflect the size of the real file, and sometimes not? For example, on a ldiskfs-mounted copy of our MDT, I have a directory under ROOT/ with -rw-rw-r-- 1 935M 15. Jul 2009 09000075278027.140 -rw-rw-r-- 1 0 15. Jul 2009 09000075278027.150 As they should, both entries are 0-sized, as seen by e.g. "du". On Lustre, both files exist and both have size 935M. So for some reason, one has a metatdata entry that appears as a huge sparse file, the other does not. Is there a reason, or is this just an illness of our installation? Cheers, Thomas On 01/21/2011 09:31 PM, Cliff White wrote:> > > On Fri, Jan 21, 2011 at 3:43 AM, Thomas Roth <t.roth at gsi.de <mailto:t.roth at gsi.de>> wrote: > > Hi all, > > we have gotten new MDS hardware, and I''ve got two questions: > > What are the recommendations for the RAID configuration and formatting > options? > I was following the recent discussion about these aspects on an OST: > chunk size, strip size, stride-size, stripe-width etc. in the light of > the 1MB chunks of Lustre ... So what about the MDT? I will have a RAID > 10 that consists of 11 RAID-1 pairs striped over. giving me roughly 3TB > of space. What would be the correct value for <insert your favorite > term>, the amount of data written to one disk before proceeding to the > next disk? > > > The MDS does very small random IO - inodes and directories. Afaik, the largest chunk > of data read/written would be 4.5K -and you would see that only with large OST stripe > counts. RAID 10 is fine. You will not > be doing IO that spans more than one spindle, so I''m not sure if there''s a real need to tune here. > Also, the size of the data on the MDS is determined by the number of files in the > filesystem (~4k per file is good) > unless you are buried in petabytes 3TB is likely way oversize for an MDT. > cliffw > > > > Secondly, it is not yet decided whether we wouldn''t use this hardware to > set up a second Lustre cluster. The manual recommends to have only one > MGS per site, but doesn''t elaborate: what would be the drawback of > having two MGSes, two different network addresses the clients have to > connect to to mount the Lustres? 
>     I know that it didn't work in Lustre 1.6.3 ;-) and there are no
>     apparent issues when connecting a Lustre client to a test cluster
>     now (version 1.8.4), but what about production?
>
>     Cheers,
>     Thomas

--
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 1.262
Phone: +49-6159-71 1453
Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

------------------------------

Message: 2
Date: Sat, 22 Jan 2011 16:58:04 +0100
From: Aurélien Degrémont <aurelien.degremont at cea.fr>
Subject: Re: [Lustre-discuss] split OSTs from single OSS in 2 networks
To: Haisong Cai <cai at sdsc.edu>
Cc: lustre-discuss at lists.lustre.org
Message-ID: <4D3AFE8C.2090309 at cea.fr>
Content-Type: text/plain; charset=UTF-8; format=flowed

From Bugzilla, the patch was introduced in Lustre 2.0 and bug-fixed in
2.1. There is a backport for 1.8, but it was never landed in the
official source code, so it is not available in any official 1.8
release. You either have to use Lustre 2.0 or patch 1.8.5.

Regards,
Aurélien

On 21/01/2011 18:32, Haisong Cai wrote:
> It does look like exactly what I need. Thanks, Aurélien.
>
> From Bugzilla, the patch has been checked in to 1.8. Could someone
> please point me to the source location? Is it in 1.8.5? I don't
> believe so, but I thought I would check. If not, would it be in a
> later release of 1.8?
>
> Thanks all,
> Haisong
>
> On Thu, 20 Jan 2011, DEGREMONT Aurelien wrote:
>
>> Hello,
>>
>> If you want to register different interfaces for different OSTs on
>> the same OSS, you should use the --network option, introduced in the
>> patch at https://bugzilla.lustre.org/show_bug.cgi?id=22078
>>
>> Regards,
>>
>> Aurélien
>>
>> Haisong Cai wrote:
>>> I have a storage server that has two QPI controllers, each
>>> controlling half of the I/O slots. In the first half are a RAID
>>> controller and a 10GbE card; in the second half are another RAID
>>> controller and a 10GbE card.
>>>
>>> There are 4 RAID arrays for the 4 OSTs of a single Lustre
>>> filesystem.
>>>
>>> The 2 10GbE NICs are each configured with their own IP and switch,
>>> IPSUBNET1 and IPSUBNET2.
>>>
>>> I want to register the OSTs on this OSS with different addresses,
>>> that is, OST1 & OST2 in IPSUBNET1 and OST3 & OST4 in IPSUBNET2,
>>> using policy-based routing to direct traffic.
>>>
>>> The idea is to avoid bonding and get as much bandwidth out of the
>>> 10GbE NICs as possible.
>>>
>>> Is it possible?
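Regarding the split-network setup in Message 2: on the OSS, LNET would
first be configured with one network per NIC, and each OST then
restricted to one of them with the --network option from bug 22078. A
rough sketch only - interface names, device paths and network names are
invented, and the exact option syntax should be checked against
whichever release actually carries the patch:

    # /etc/modprobe.d/lustre.conf on the OSS: one LNET network per NIC
    options lnet networks="tcp0(eth2),tcp1(eth4)"

    # restrict each OST to one LNET network
    tunefs.lustre --network=tcp0 /dev/mapper/ost1
    tunefs.lustre --network=tcp0 /dev/mapper/ost2
    tunefs.lustre --network=tcp1 /dev/mapper/ost3
    tunefs.lustre --network=tcp1 /dev/mapper/ost4

Clients then need both tcp0 and tcp1 configured, or LNET routes to
them, in order to reach all four OSTs.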
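The "policy-based" routing mentioned above is ordinary Linux source
routing, independent of Lustre: replies must leave through the NIC
that owns the source address instead of the default route. A minimal
sketch with invented addresses and interfaces:

    # table 101: traffic sourced from the first NIC's address
    ip route add 10.1.1.0/24 dev eth2 src 10.1.1.10 table 101
    ip route add default via 10.1.1.1 table 101
    ip rule add from 10.1.1.10 table 101

    # table 102: traffic sourced from the second NIC's address
    ip route add 10.1.2.0/24 dev eth4 src 10.1.2.10 table 102
    ip route add default via 10.1.2.1 table 102
    ip rule add from 10.1.2.10 table 102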