Dear all,

Which parameters can I set to improve Lustre performance? I am seeing very poor performance with Lustre 1.6.4.3. An 'ls' on a directory containing 1000 files takes 30 s (on an NFS filesystem the same 'ls' takes 0.3 s). The same is true for file creation and copying.

On Lustre:
# time dd if=/dev/zero of=test.txt bs=1k count=1024000
1024000+0 records in
1024000+0 records out

real    1m39.423s
user    0m0.510s
sys     0m10.940s

# time cp test.txt test.out

real    1m2.526s
user    0m0.000s
sys     0m3.550s

On NFS:
# time dd if=/dev/zero of=test.txt bs=1k count=1024000
1024000+0 records in
1024000+0 records out

real    0m11.920s
user    0m0.438s
sys     0m9.864s

# time cp test.txt test.out

real    0m15.012s
user    0m0.012s
sys     0m2.444s

--
Enrico Morelli
CERM - University of Florence
On Fri, 2008-06-06 at 16:45 +0200, Enrico Morelli wrote:
> I am seeing very poor performance with Lustre 1.6.4.3. An 'ls' on a
> directory containing 1000 files takes 30 s (on an NFS filesystem the
> same 'ls' takes 0.3 s). The same is true for file creation and copying.
>
> On Lustre:
> # time dd if=/dev/zero of=test.txt bs=1k count=1024000
                                      ^^^^^
Try increasing the block size to 1M.

What interconnect are you using between your client and servers?

b.
The very first thing I do before indicting Lustre is to verify my interconnect speed as well as my storage performance. Only once those look good do I start looking at Lustre itself. Even then, I never measure performance by just looking at the elapsed time of a test. Rather, I always run collectl in a separate window and watch my network/interconnect, CPU, etc. every second during the test to see what is really happening between the end points.

-mark

Enrico Morelli wrote:
> I am seeing very poor performance with Lustre 1.6.4.3. An 'ls' on a
> directory containing 1000 files takes 30 s (on an NFS filesystem the
> same 'ls' takes 0.3 s). The same is true for file creation and copying.
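A minimal sketch of that kind of side-by-side monitoring, assuming collectl is installed and the Lustre client mount is /mnt/lustre (both are assumptions, not details from the original post):

In one window, sample CPU, disk and network once per second:
  # collectl -scdn -i 1

In another window, run the write test against the Lustre mount:
  # cd /mnt/lustre
  # time dd if=/dev/zero of=test.txt bs=1M count=1024

Watching the network and disk columns while the dd runs shows whether the transfer is limited by the interconnect, the storage, or something in between.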
On Fri, 06 Jun 2008 10:52:17 -0400
"Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> > # time dd if=/dev/zero of=test.txt bs=1k count=1024000
>                                      ^^^^^
> Try increasing the block size to 1M.

On Lustre:
time dd if=/dev/zero of=test.txt bs=1M count=1024
1024+0 records in
1024+0 records out

real    0m21.260s
user    0m0.000s
sys     0m1.600s

On NFS:
time dd if=/dev/zero of=test.txt bs=1M count=1024
1024+0 records in
1024+0 records out

real    0m2.561s
user    0m0.006s
sys     0m1.698s

> What interconnect are you using between your client and servers?

The machines have gigabit NICs (they are new HP DL380/360 G5 servers) and are connected through an HP ProCurve switch using category 7 network cable (1 m long). I can transfer files over the network at 30 MB/s.
Is this writing to a single OST or multiple ones? If multiple, I'd at least test each OST separately, just to make sure one of them isn't slowing them all down.

Have you verified that your interconnect is functioning correctly?

-mark

Enrico Morelli wrote:
> On Lustre:
> time dd if=/dev/zero of=test.txt bs=1M count=1024
>
> real    0m21.260s
>
> On NFS:
> time dd if=/dev/zero of=test.txt bs=1M count=1024
>
> real    0m2.561s
>
> I can transfer files over the network at 30 MB/s.
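One way to test the OSTs one at a time, as a sketch: use 'lfs setstripe' on the client to pin a test file to a single OST before writing to it. The mount point, file names and OST indices below are examples only, and the exact setstripe option syntax can differ between Lustre releases:

  # lfs setstripe -c 1 -i 0 /mnt/lustre/ost0_test
  # time dd if=/dev/zero of=/mnt/lustre/ost0_test bs=1M count=1024
  # lfs setstripe -c 1 -i 1 /mnt/lustre/ost1_test
  # time dd if=/dev/zero of=/mnt/lustre/ost1_test bs=1M count=1024

Repeating this for each OST index makes it easy to spot a single device that is much slower than the others.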
On Fri, 06 Jun 2008 11:14:26 -0400 Mark Seger <Mark.Seger at hp.com> wrote:

> Is this writing to a single OST or multiple ones? If multiple, I'd at
> least test each OST separately, just to make sure one of them isn't
> slowing them all down.

There are four OSTs, but they are all on the same server. I used four partitions, on the same server, to create a single Lustre filesystem. So I don't know how to run a test on each OST.

> Have you verified that your interconnect is functioning correctly?

Yes, everything works fine.
On Fri, 2008-06-06 at 17:06 +0200, Enrico Morelli wrote:
> On Lustre:
> time dd if=/dev/zero of=test.txt bs=1M count=1024
>
> real    0m21.260s

So, about 48 MiB/s (1024 MiB / 21.26 s), yes? What does your storage backend look like?

> On NFS:
> time dd if=/dev/zero of=test.txt bs=1M count=1024
>
> real    0m2.561s

About 399 MiB/s. Do you believe this? Does it correlate with the storage backend on the NFS server and the network interconnect between your NFS client and NFS server?

> The machines have gigabit NICs (they are new HP DL380/360 G5 servers)
> and are connected through an HP ProCurve switch using category 7
> network cable (1 m long). I can transfer files over the network at
> 30 MB/s.

Is this the same interconnect hardware as on your NFS installation?

b.
On Fri, 06 Jun 2008 11:24:28 -0400
"Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> So, about 48 MiB/s, yes? What does your storage backend look like?

The server is connected to an MSA60 with 7x500 GB SATA HDDs in RAID 6.

> About 399 MiB/s. Do you believe this? Does it correlate with the
> storage backend on the NFS server and the network interconnect between
> your NFS client and NFS server?

Sorry, sorry, sorry, I'm very tired. I had run that test directly on the NFS server :-( From a client machine the time is more than 1 minute.

The real problem is that many people have told me that 'ls' takes a long time, 'cp' takes a long time, and in general all operations on the Lustre filesystem are very slow. The people who work on Lustre have a lot of files to manage, and when I tested their directories I found that 'ls' takes a long time, and the same goes for 'cp' and 'rm'. dd was not the best test to run. Sorry again.
On Fri, 2008-06-06 at 17:47 +0200, Enrico Morelli wrote:
> The server is connected to an MSA60 with 7x500 GB SATA HDDs in RAID 6.

Have you measured the throughput of the device(s) you are using as OST(s) on the OSS(es)? You also reported that your network is capable of 30 MB/s. That already doesn't seem right, since you are getting 48 MiB/s of throughput.

> Sorry, sorry, sorry, I'm very tired. I had run that test directly on
> the NFS server :-( From a client machine the time is more than
> 1 minute.

So Lustre is 3x faster than NFS? Are you happy with the 48 MiB/s from your Lustre filesystem?

> The real problem is that many people have told me

Which people?

> that 'ls' takes a long time, 'cp' takes a long time, and in general
> all operations on the Lustre filesystem are very slow.

So you want faster than 48 MiB/s? Then you need hardware capable of delivering that. If you want better than 48 MiB/s (and certainly you can go way, way higher with the right hardware) you first need to determine what your hardware is capable of. You need to benchmark your OSTs and then benchmark your network. Only when you know what the components are capable of can you determine what their aggregation should be able to deliver.

> The people who work on Lustre have a lot of files to manage, and when
> I tested their directories I found that 'ls' takes a long time, and
> the same goes for 'cp' and 'rm'. dd was not the best test to run.
> Sorry again.

Yeah. For throughput, use a real benchmark like IOR, iozone, etc. While that will be a good indication of what cp will do, it is not a good comparison for 'ls', because 'ls' is a metadata-heavy operation, not a file-I/O-heavy one.

b.
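As a rough sketch of the kind of component benchmarks meant here (the device name, sizes and the use of iperf are assumptions, not recommendations from the original post; only read tests are shown because writing to a raw OST device would destroy the filesystem):

On the OSS, read directly from one backing device, bypassing Lustre:
  # dd if=/dev/sdb of=/dev/null bs=1M count=4096 iflag=direct

Between a client and the OSS, measure raw TCP bandwidth with iperf:
  server# iperf -s
  client# iperf -c <oss-hostname> -t 30

If either of these numbers is already close to the 48 MiB/s seen through Lustre, the bottleneck is the hardware rather than the filesystem.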
On Fri, 06 Jun 2008 12:09:13 -0400
"Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> Yeah. For throughput, use a real benchmark like IOR, iozone, etc.
> While that will be a good indication of what cp will do, it is not a
> good comparison for 'ls', because 'ls' is a metadata-heavy operation,
> not a file-I/O-heavy one.

OK. I'm sorry, but our real problem is that the system is very slow when users run 'ls' or other shell commands that need access to metadata. We have about 30 people accessing the system simultaneously.

What can we do to speed up the system and solve this problem? Would adding another MDT help?

Thanks a lot
Enrico Morelli wrote:
> The machines have gigabit NICs (they are new HP DL380/360 G5 servers)
> and are connected through an HP ProCurve switch using category 7
> network cable (1 m long). I can transfer files over the network at
> 30 MB/s.

Transfer via FTP? NFS?

> The server is connected to an MSA60 with 7x500 GB SATA HDDs in RAID 6.

If I understand correctly, you have only one server with one RAID 6 volume, and you partitioned it for the MDT AND four OSTs?

--
Philippe Weill - Administrateur Systeme et Reseaux
CNRS Service Aeronomie - Universite Pierre et Marie Curie
Email: philippe.weill at aero.jussieu.fr
On Mon, 2008-06-09 at 17:59 +0200, Enrico Morelli wrote:
> OK. I'm sorry, but our real problem is that the system is very slow
> when users run 'ls' or other shell commands that need access to
> metadata. We have about 30 people accessing the system simultaneously.

What storage system(s) do you have behind your MDS(es) and OSS(es), and how are they connected? What kind of interconnect do you have between your clients and your MDS(es) and OSS(es)?

> What can we do to speed up the system and solve this problem?

You could try increasing the lock LRU cache (/proc/fs/lustre/ldlm/namespaces/<osc>/lru_size) on your clients.

> Would adding another MDT help?

You cannot have more than one MDT at this time.

b.
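A minimal sketch of raising that value on a client; the glob pattern used to match the OSC namespaces and the value of 2000 locks are examples only and should be sized to the client's RAM:

  # for ns in /proc/fs/lustre/ldlm/namespaces/*osc*; do echo 2000 > $ns/lru_size; done

The setting is not persistent across remounts, so it would have to be reapplied, for example from a boot script.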
Hi Brian,

I know I can't have more than one MDT online for the same file system at the same time, but I know it's possible to have one configured as a hot standby. Are there any limits on the number of backup MDTs a given cluster can have? I'm not sure if this question has been asked before, but I'm curious.

cheers,
Klaus

On 6/10/08 10:10 AM, "Brian J. Murrell" <Brian.Murrell at Sun.COM> did etch on stone tablets:

> You cannot have more than one MDT at this time.
On Tue, 2008-06-10 at 11:56 -0700, Klaus Steden wrote:
> I know I can't have more than one MDT online for the same file system
> at the same time, but I know it's possible to have one configured as a
> hot standby. Are there any limits on the number of backup MDTs a given
> cluster can have?

I recall discussing this with another engineer here not that long ago, and while we could not think of any particular practical limitation on the number of standby MDSes one could have, we also acknowledged that we don't do any testing with more than one.

b.
Hello everybody,

We experience the same issue with our file system. Metadata operations and operations on a large number of files are quite slow compared to a local or an NFS file system. For example, packing the kernel sources takes about one and a half minutes on our 1 Gbps NFS server, but nearly 5 minutes on our Lustre file system.

Our setup consists of 2 load-balanced Lustre servers exporting 7 OSTs and 1 MGS/MDT. The servers run the Lustre-patched 2.6.18 kernel with self-compiled Lustre 1.6.4.3. On the client side we have already tried the 2.6.22 patchless setup and also the patched 2.6.18 kernel, without seeing a major performance difference. There are 192 client machines mounting the Lustre file system from the two servers.

The MDT storage device is a Transtec 4 Gbps FibreChannel SAS RAID, configured as RAID 1+0. The OSTs are Transtec 4 Gbps FC SATA RAIDs, configured as RAID 6. Each back-end device is connected to both servers for failover via a QLogic SANbox 5602 FC switch.

Bandwidth is definitely not the bottleneck with this setup. With a parallel dd test, we were able to write more than 1 GB/s to the OSTs.

Any hints are always welcome.

Kind regards,
Reto Gantenbein

On Jun 9, 2008, at 5:59 PM, Enrico Morelli wrote:

> Our real problem is that the system is very slow when users run 'ls'
> or other shell commands that need access to metadata.
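A sketch of that kernel-source packing test, for comparing the same small-file workload on Lustre and NFS (the paths and kernel version are placeholders, not details from the original post):

  # cd /mnt/lustre
  # time tar czf linux-src.tar.gz linux-2.6.18/

Running the same command on the NFS mount gives the comparison figure; the gap largely reflects per-file open/stat overhead rather than raw bandwidth.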
On Tue, 10 Jun 2008 13:10:03 -0400
"Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> What storage system(s) do you have behind your MDS(es) and OSS(es),
> and how are they connected? What kind of interconnect do you have
> between your clients and your MDS(es) and OSS(es)?

The Lustre server (MDS and OSS) is an HP DL380 G5 (dual Xeon 5130 2 GHz, 2 GB RAM) connected through an HP P800 SATA/SAS controller to an HP MSA60 storage enclosure equipped with 7x500 GB SATA disks in RAID 6. The server is connected to an HP ProCurve gigabit switch via a fibre link, and the clients are connected to the same switch with 1 m category 7 network cables.
On Tue, 2008-06-10 at 23:49 +0200, Reto Gantenbein wrote:
> We experience the same issue with our file system. Metadata operations
> and operations on a large number of files are quite slow compared to a
> local or an NFS file system.
>
> Our setup consists of 2 load-balanced Lustre servers exporting 7 OSTs
> and 1 MGS/MDT. The servers run the Lustre-patched 2.6.18 kernel with
> self-compiled Lustre 1.6.4.3.

Please install 1.6.5 - it has a fix that improves speed for some read patterns, and it also adds an optimization for short reads (reads of less than or equal to the page size).

--
Alex Lyashkov <Alexey.lyashkov at sun.com>
Lustre Group, Sun Microsystems
On Wed, 2008-06-11 at 13:13 +0300, Alex Lyashkov wrote:
> Please install 1.6.5 - it has a fix that improves speed for some read
> patterns, and it also adds an optimization for short reads (reads of
> less than or equal to the page size).

Has it been released yet? The Sun download page only has 1.6.4.3.

Attempting to access http://downloads.lustre.org/public/lustre/v1.6 I get redirected to the Sun download page.

/Jakob
On Wed, 2008-06-11 at 09:49 +0200, Enrico Morelli wrote:
> The Lustre server (MDS and OSS) is an HP DL380 G5 (dual Xeon 5130
> 2 GHz, 2 GB RAM) connected through an HP P800 SATA/SAS controller to
> an HP MSA60 storage enclosure equipped with 7x500 GB SATA disks in
> RAID 6.

So you have a single RAID 6 volume that you have sliced up for the MDT and the OSTs?

That configuration is going to produce a lot of contention between the MDT and the OSTs, because you have created a single "device" for everything and are losing out on the possibility of parallelism. Lustre's ability to shine depends on its components having dedicated access to discrete devices so that it can exploit their parallelism.

Lustre also shines in very scalable, high-throughput situations. It is not optimized for the "lots of small files" case. Lots of memory for caching is your best bet for "lots of small files", and 2 GB of RAM for the MDS and OSS is not very much.

Can I ask why you chose Lustre as a solution to provide file service from a single RAID 6 volume of only 7 disks (about 3.5 TB) on a single server? Lustre really is not going to shine in that configuration.

b.
On Wed, 2008-06-11 at 12:26 +0200, Jakob Goldbach wrote:
> Has it been released yet? The Sun download page only has 1.6.4.3.

It should be, yes, but I don't know about updates to the Sun download page. If it has not been released, you can apply the patch from bug 14010 or get the sources from the b_release_1_6_5 CVS branch.
On Wed, 11 Jun 2008 08:54:41 -0400
"Brian J. Murrell" <Brian.Murrell at Sun.COM> wrote:

> Can I ask why you chose Lustre as a solution to provide file service
> from a single RAID 6 volume of only 7 disks (about 3.5 TB) on a single
> server? Lustre really is not going to shine in that configuration.

Because we thought that Lustre was better than NFS and more scalable, even in a simple configuration like ours.

Now, if I reduce the OSTs to one, do you think I can improve the performance?
From my experience (a production service), NFS is better than Lustre in a single-server configuration.

On 11 Jun 2008, at 15:16, Enrico Morelli wrote:

> Because we thought that Lustre was better than NFS and more scalable,
> even in a simple configuration like ours.
>
> Now, if I reduce the OSTs to one, do you think I can improve the
> performance?
On Wed, 2008-06-11 at 16:16 +0200, Enrico Morelli wrote:
> Because we thought that Lustre was better than NFS

For certain workloads, yes, it is much better, because...

> and more scalable

...it is very much more scalable. But with such tremendous scalability there is a cost at the very low end of the scale (and a single server for the MDT and OSTs is about as low on that scale as you can get).

> even in a simple configuration like ours.

For a single server serving both the MDT and the OSTs on a general file-sharing workload, Lustre will not typically perform any better than NFS. There are exceptions to this generalization for certain corner cases, but in general it's pretty accurate.

> Now, if I reduce the OSTs to one, do you think I can improve the
> performance?

Not really. You are still running into the same basic problem: you are trying to serve up two completely different data sets (metadata and file data) from the same device. If you could isolate your MDT onto its own device (e.g. a couple of small SATA disks mirrored on their own SATA buses is pretty cheap) you might see some improvement, but you might also run into other bottlenecks that a single server imposes, such as bus bandwidth, memory bandwidth, network bandwidth, etc.

b.
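A sketch of what putting the MDT on its own small mirror might look like; the device names, filesystem name and mount point are placeholders, and mkfs.lustre with --reformat destroys existing data, so this illustrates a fresh setup rather than a migration of the current filesystem:

  # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
  # mkfs.lustre --fsname=testfs --mgs --mdt --reformat /dev/md0
  # mount -t lustre /dev/md0 /mnt/mdt

The OSTs would then be formatted on separate devices with mkfs.lustre --ost and pointed at the MGS, keeping metadata I/O and file I/O on different spindles.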