-------- Original Message --------
Subject: Re: [Lustre-discuss] lustre 1.6.0.1
Date: Thu, 21 Jun 2007 11:02:16 -0300
From: Balagopal Pillai <pillai@mathstat.dal.ca>
Reply-To: pillai@mathstat.dal.ca
Organization: Department of Mathematics and Statistics
To: Aaron Knister <aaron@iges.org>
References: <467A7B26.8090909@mathstat.dal.ca> <D77D30F2-FB3F-4864-8309-4F479EB4A063@iges.org>

Hi Aaron,

           On second thought I should have tried OCFS2. I looked at it a
month ago and saw the same quorum issue as GFS. But it doesn't seem to have
GFS's strict fencing requirements, which need hardware support (FC switch
fencing, IPMI fencing, etc.) for anything other than GNBD fencing. I ran
out of time in this case, and at least Lustre seems stable. The old Lustre
installation is also quite stable. Maybe the next time I look at a cluster
filesystem I will give OCFS2 a torture test and see if it is stable enough.
I did crash GFS many times by running bonnie++ on 8 GB files from 4 nodes
simultaneously. Those 4 nodes were the ideal case for the bonding, where
they picked up 4 different MAC addresses of the storage server. Hopefully
in the next maintenance window for the cluster I can evaluate more options
for the cluster filesystem. I have a new one coming in a month that is
almost tailor-made for Lustre, with many Dell MD1000s for parallel I/O. I
will give OCFS2 a try then as well. Thanks very much for the response.

Regards
Balagopal

Aaron Knister wrote:
> If you weren't happy with GFS, try OCFS2. It's Oracle's cluster
> filesystem and it's SOO easy to set up. Sadly I don't have answers to
> any of your other questions, other than that Lustre's performance with
> small files is abysmal for me too. I'm very much interested in any
> tunables.
>
> -Aaron
>
> On Jun 21, 2007, at 9:20 AM, Balagopal Pillai wrote:
>
>> Hi,
>>
>> I am using Lustre 1.6.0.1 with one OST and 20 clients in an HPC
>> cluster. The OST/MDT/MGS node has a 16-channel 3ware 9650 using RAID 6.
>> I currently have another Lustre installation (version 1.4.5) and it has
>> been working trouble-free for over a year. The OS is CentOS 4. There
>> are 4 network ports in the storage server in adaptive load-balancing
>> mode, and the aggregate network throughput is great (with 4 x
>> netperf/iperf from clients) in the ideal situation where the clients
>> pick up different MAC addresses of the different interfaces in their
>> ARP tables.
>>
>> I have a few questions about Lustre and hope someone can help me.
>>
>> * I had to re-export the Lustre volume via NFS on the new 1.6.0.1
>> setup to other infrastructure boxes. After the export, I get the
>> following error messages on the OSS:
>>
>> Jun 21 09:31:11 lustre-3ware kernel: Lustre:
>> 4946:0:(lustre_fsfilt.h:205:fsfilt_start_log()) scratch-OST0000: slow
>> journal start 33s
>> Jun 21 09:31:11 lustre-3ware kernel: Lustre:
>> 4946:0:(lustre_fsfilt.h:205:fsfilt_start_log()) Skipped 22 previous
>> similar messages
>> Jun 21 09:31:11 lustre-3ware kernel: Lustre:
>> 4874:0:(filter.c:1139:filter_parent_lock()) scratch-OST0000: slow
>> parent lock 33s
>> Jun 21 09:31:11 lustre-3ware kernel: Lustre:
>> 4874:0:(filter.c:1139:filter_parent_lock()) Skipped 6 previous
>> similar messages
>>
>> Also, is the NFS re-export option stable in version 1.6? I read some
>> earlier posts on the list reporting kernel panics on Lustre 1.4.
>>
>> * I was evaluating GFS with GNBD for the past few weeks and the
>> performance was amazing (at least for my purpose with one storage
>> server).
>> It was very fast, especially for small files. But I had to drop it for
>> stability reasons. The problems were these: it has 6 daemons that need
>> to come up in a particular order; if some of the kernel modules crash
>> under heavy load on a node, the whole cluster freezes; and it has the
>> quorum requirement, which is beneficial in an HA setup but maybe not
>> for HPC. In some cases I have to keep just one server running that
>> re-exports the volume via NFS even when the HPC nodes are down, during
>> a power failure for example. Quorum is a problem in that case. But it
>> was mostly stability that made me decide not to go with GFS + GNBD.
>>
>> * Now the problem - Lustre performance dips a lot when it comes to
>> small files. Please see the following fileop -f 5 test comparing NFS
>> and Lustre:
>>
>> Lustre -
>> Fileop: File size is 1, Output is in Ops/sec. (A=Avg, B=Best, W=Worst)
>> .    mkdir rmdir create  read  write close stat access  chmod readdir link unlink delete Total_files
>> A 5   1654   691    132 14228    719  4874 1987  32737   1718    2506 1262   1340   1608         125
>>
>> NFS -
>> Fileop: File size is 1, Output is in Ops/sec. (A=Avg, B=Best, W=Worst)
>> .    mkdir rmdir create   read  write close stat access  chmod readdir link unlink delete Total_files
>> A 5    177   594    459 380747 137392  2282 1219 444312    502    1274  306    513    464         125
>>
>> Could you please recommend any tunables to get a bit more performance
>> out of Lustre with lots of small files? Lots of small files were a weak
>> point on GFS too, but it was still better than NFS.
>>
>> * Also, the read performance of Lustre seems to be a little behind NFS.
>> In the new setup I moved /opt, which holds all the software for users,
>> to Lustre. But software like Matlab, Splus etc. takes almost a minute
>> to start. The second time it is very fast, though, maybe due to
>> caching. So I am thinking of putting /opt back on NFS. Is it possible
>> to boost the read performance of Lustre a bit?
>>
>> * Is there a way to make disk quotas activate automatically at startup
>> on a Lustre client? lfs quotaon <mount point> works sometimes, but
>> sometimes it gives a "resource busy" error message.
>>
>> * One last question. In the older Lustre setup (version 1.4.5), I have
>> 5 SCSI drives, each one an OST of a single volume. The volume became
>> full, but df still reported 27 GB free. There doesn't seem to be an
>> lfs df option in that version of Lustre, so I couldn't see the
>> individual utilization of each of the 5 OSTs. Is this a striping
>> problem?
>>
>> I know it's a lot of questions. Hope some of them are solvable. Thanks
>> very much.
>>
>> Best Regards
>>
>> Balagopal Pillai
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss@clusterfs.com
>> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Systems Administrator/Web Master
> Center for Research on Environment and Water
>
> (301) 595-7001
> aaron@iges.org
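
A few sketches follow for the points raised in the quoted message; anything
in them that is not quoted above (mount points, device names, values) is an
assumption rather than something taken from the original posts.

The 4-port "adaptive load balanced" setup described on the storage server
corresponds to the Linux bonding driver's balance-alb mode. On a CentOS 4
box it would typically look along these lines (bond0 and the option values
are assumptions):

    # /etc/modprobe.conf
    alias bond0 bonding
    options bond0 mode=balance-alb miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (one such file per slave NIC)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes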
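
For the NFS re-export of the Lustre client mount, the usual approach is to
export the client's mount point through the kernel NFS server. A minimal
sketch, where the mount point, network range and fsid value are assumptions:

    # /etc/exports on the Lustre client acting as the NFS gateway
    /mnt/scratch  10.0.0.0/24(rw,sync,no_subtree_check,fsid=101)

    exportfs -ra              # re-read the export table
    showmount -e localhost    # confirm the export is visible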
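
The fileop output quoted above (file size 1, 125 total files) matches the
fileop utility bundled with iozone; a plausible invocation, stated only as
an assumption about how the test was run, would be:

    fileop -f 5 -s 1    # 5^3 = 125 files of size 1; results reported in ops/sec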
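
On the small-file tunables question: most of the client-side knobs on a
1.6-era client live under /proc/fs/lustre. The sketch below only shows
where they are; the values are assumptions to experiment with, and a
metadata-heavy small-file workload stays limited mostly by per-file MDS
round trips, so the gain is bounded:

    # inspect the current client settings
    cat /proc/fs/lustre/llite/*/max_read_ahead_mb
    cat /proc/fs/lustre/osc/*/max_rpcs_in_flight
    cat /proc/fs/lustre/osc/*/max_dirty_mb

    # try larger read-ahead and more concurrent/dirty I/O per OSC (values are guesses)
    for f in /proc/fs/lustre/llite/*/max_read_ahead_mb; do echo 64 > "$f"; done
    for f in /proc/fs/lustre/osc/*/max_rpcs_in_flight;  do echo 32 > "$f"; done
    for f in /proc/fs/lustre/osc/*/max_dirty_mb;        do echo 64 > "$f"; done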
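
For activating quotas automatically at client startup, one workable approach
is simply to run lfs quotaon after the Lustre mount has come up, for example
from rc.local. The mount point is an assumption, and the retry loop is only
there because of the intermittent "resource busy" error mentioned above:

    # appended to /etc/rc.d/rc.local, after the Lustre client mount
    for i in 1 2 3; do
        lfs quotaon -ug /mnt/scratch && break
        sleep 10
    done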
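
On the full 1.4.5 volume that still shows 27 GB free in df: that pattern
usually means one OST filled up while the others still had space, so it is
worth checking per-OST usage. Without lfs df, the per-OST counters a client
exposes under /proc/fs/lustre/osc can be read directly; a sketch, assuming
those entries are present in 1.4.5:

    for o in /proc/fs/lustre/osc/*; do
        [ -f "$o/kbytesfree" ] || continue
        echo "$(basename $o): $(cat $o/kbytesfree) kB free of $(cat $o/kbytestotal) kB total"
    done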