Solofo.Ramangalahy@bull.net
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
Hi,

> Due to lack of quota support in Lustre, [...]
> [...] We are in the middle of upgrading to lustre-1.4.6.1. [...]
> Does anyone have suggestions of alternative ways to implement quotas?

I did not see this mentioned during the discussion (or maybe this was mentioned outside of the list): quotas are supported as of Lustre version 1.4.6.1.

Regards,
--
Solofo.Ramangalahy@bull.net, Bull S.A.  | Tel: +33 (0)4 76 29 72 48
Linux R&D, HPC/CI/Lustre                | Fax: +33 (0)4 76 61 52 52
1, Rue de Provence. BP208               | Office B1/386
38432 Echirolles Cedex, France          | Mail Stop B1/167
Felix, Evan J
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
Here at PNNL we have two fairly large file systems, and we have created a multi-threaded directory scanner that gives us an accounting of the number of files, directories, and total file size that each userid owns. It still takes a few hours to run on our 300TB system, but on the smaller 54TB system (5-10% full) it takes less than 20 minutes. Both programs have specific uses for their file system, such as deleting old files (scratch file system), modifying a database, or checking stripe information. The parallel nature of the system makes it fairly quick. Essentially each thread does this:

    while dirs on stack:
        pop directory off stack
        scan directory:
            if directory: collect statistics, push it on stack
            if file: collect statistics, file-specific stuff
            other entries: collect statistics
        save thread-private statistics to global stats

It works well. We have tried thread counts from 1 to 32, but as you pass 32 it actually slows things down, and one of them does not do too well beyond 16.

Evan Felix
Pacific Northwest National Laboratory
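For reference, the scan loop above translates fairly directly into Python. This is only a minimal sketch under stated assumptions (the starting path, thread count, and per-uid statistics fields are illustrative; it is not the PNNL program attached later in the thread):

import collections
import os
import stat
import threading

ROOT = "/el1/projects"                 # assumed starting point
NUM_THREADS = 16                       # assumed; see the 16/32 observations above

work = [ROOT]                          # shared stack of directories to scan
pending = 1                            # directories pushed but not yet fully scanned
cond = threading.Condition()
usage = collections.defaultdict(lambda: [0, 0, 0])   # uid -> [files, dirs, bytes]

def worker():
    global pending
    local = collections.defaultdict(lambda: [0, 0, 0])
    while True:
        with cond:
            while not work and pending:
                cond.wait()            # another thread may still push work
            if not work:
                break                  # nothing left anywhere: exit
            path = work.pop()          # pop a directory off the shared stack
        try:
            names = os.listdir(path)
        except OSError:
            names = []
        for name in names:
            full = os.path.join(path, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue
            entry = local[st.st_uid]
            if stat.S_ISDIR(st.st_mode):
                entry[1] += 1
                with cond:
                    work.append(full)  # push subdirectory for later scanning
                    pending += 1
                    cond.notify()
            else:
                entry[0] += 1
                entry[2] += st.st_size
        with cond:
            pending -= 1               # this directory is fully scanned
            if not pending:
                cond.notify_all()      # wake idle threads so they can exit
    with cond:                         # merge thread-private stats into global stats
        for uid, s in local.items():
            for i in range(3):
                usage[uid][i] += s[i]

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
for uid, (nfiles, ndirs, nbytes) in sorted(usage.items()):
    print("uid %-8d %10d files %8d dirs %8.1f GB" % (uid, nfiles, ndirs, nbytes / 2.0**30))

The condition variable only keeps idle threads from exiting while another thread may still push subdirectories; the thread-private tallies are merged once at the end to keep lock traffic low, as in the pseudocode above.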
Mc Carthy, Fergal
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
Nathan,

I believe that if you look at the mailing list archives you will find previous discussions about improving responsiveness by tuning the Lustre Distributed Lock Manager (LDLM) LRU settings. Doing so may help a little in your case, as it tends to help interactive-style access to lots of different files/directories, whereas the default Lustre settings assume that client nodes will only access a small number of files at any given time.

The other thing to consider is adding some parallelism, as Evan suggests. Part of this is because Lustre limits the number of simultaneous metadata operations that can be active at the same time per client node. If you were to configure things such that multiple nodes scan portions of the project hierarchy, it may help accelerate things, e.g. have one node scan all project dirs with names starting [a-g], another the ones starting [h-m], and so on... though I would recommend a better method for distributing workload than simple alphabetical ranges.

Fergal.

--
Fergal.McCarthy@HP.com
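To illustrate the distribution suggestion: a hash of the directory name spreads projects across scan nodes more evenly than alphabetical ranges. A minimal sketch (the node count, the SCAN_NODE environment variable, and the /el1/projects path are assumptions for illustration):

import os
import zlib

NUM_NODES = 4                                          # assumed number of scan nodes
NODE_INDEX = int(os.environ.get("SCAN_NODE", "0"))     # 0 .. NUM_NODES-1, set per node

ROOT = "/el1/projects"
mine = []
for name in sorted(os.listdir(ROOT)):
    path = os.path.join(ROOT, name)
    if not os.path.isdir(path):
        continue
    # crc32 of the name gives a stable, roughly even split across nodes
    if zlib.crc32(name.encode()) % NUM_NODES == NODE_INDEX:
        mine.append(path)

for path in mine:
    print(path)

Each node would then run its scan (du, or a scanner like the one sketched earlier) over only its own share of the project directories.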
Andreas Dilger
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
On Apr 06, 2006 08:21 -0700, Felix, Evan J wrote:
> It works well, we have tried thread counts from 1 to 32, but as you pass
> 32, it actually slows things down. And one of them does not do too
> well beyond 16.

Note that the 32-thread limit is almost certainly because of the maximum of 32 service threads on the MDS. On smaller MDSes (less RAM/CPUs) the number of threads may be smaller. It will soon be possible to specify the number of MDS threads via a module parameter in /etc/modutils.conf:

    options mds mds_num_threads={n}

The actual performance with higher numbers of threads is still under investigation. CFS is also working on allowing more parallelism from a single client under read-only loads like this.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
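Filled in with a concrete value, the setting would look like the following in the file Andreas names; the value 64 is purely hypothetical, and whether any particular thread count is appropriate for a given MDS is not something the thread establishes:

    # /etc/modutils.conf
    options mds mds_num_threads=64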
Nathan Dauchy
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
Greetings,

Due to lack of quota support in Lustre, we are having to periodically monitor filesystem usage by user. Fortunately, we can do this by project directory, and don't have to run "find" on the whole thing. Unfortunately, even running "du -s /el1/projects/*" takes a LONG time.

Does anyone have suggestions of alternative ways to implement quotas? Is there a more efficient way in Lustre to determine filesystem utilization of a directory?

We are currently running lustre-1.4.4, linux-2.6.5-7.191, on SuSE 9.1 for x86_64, with 4 OSS nodes. We are in the middle of upgrading to lustre-1.4.6.1. The filesystem is reasonably full, but not huge by my understanding of what Lustre is capable of:

# df -Ph /el*
Filesystem             Size  Used  Avail  Use%  Mounted on
l0009-m:/mds0/client0   12T  8.4T  2.6T    77%  /misc/el0
l0009-m:/mds1/client1   12T  9.6T  1.4T    88%  /misc/el1

Thanks for any suggestions!

-Nathan
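As the replies above suggest, one interim workaround is to parallelize the per-project "du -s" runs from a client. A minimal sketch (the pool size and path pattern are assumptions, not taken from the thread):

import glob
import subprocess
from multiprocessing import Pool

def du(path):
    # run "du -s" for one project directory and return its summary line
    out = subprocess.check_output(["du", "-s", path])
    return out.decode().rstrip()

if __name__ == "__main__":
    dirs = sorted(glob.glob("/el1/projects/*"))
    with Pool(8) as pool:      # a handful of workers; Lustre caps per-client metadata ops
        for line in pool.map(du, dirs):
            print(line)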
Felix, Evan J
2006-May-19 07:36 UTC
[Lustre-discuss] usage by user, "du -s", and lack of quotas
Ok, sorry for the delay; it took a while to get this past the IP system. Here are the two files needed for my parallel-python disk usage program. It is also designed to expire files that are old, but only if you tell it to. Send any comments or improvements to the list.

Evan

-----Original Message-----
From: Terry Heidelberg [mailto:th@llnl.gov]
Sent: Monday, April 10, 2006 2:18 PM
To: Felix, Evan J
Subject: Re: [Lustre-discuss] usage by user, "du -s", and lack of quotas

Hi Evan,

This program sounds interesting. Is there any chance you could make it available to LLNL? It takes us many hours to scan our filesystems, using current locally-written non-threaded tools.

How many inodes were in use on your 300TB/few-hours filesystem? Maybe inodes/hour would be a useful metric for this situation?

Thanks,
Terry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: evanlib.py
Type: application/octet-stream
Size: 2262 bytes
Desc: evanlib.py
Url : http://mail.clusterfs.com/pipermail/lustre-discuss/attachments/20060510/8d9df55a/evanlib.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dude
Type: application/octet-stream
Size: 14615 bytes
Desc: dude
Url : http://mail.clusterfs.com/pipermail/lustre-discuss/attachments/20060510/8d9df55a/dude.obj
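On the expiry side, such a tool usually just compares each file's mtime/atime against a cutoff and only deletes when explicitly asked. A minimal, hypothetical sketch (the 90-day cutoff, the --delete flag, and the expire.py name are invented for illustration; this is not the attached program):

import os
import sys
import time

CUTOFF_DAYS = 90                       # assumed scratch retention policy
DELETE = "--delete" in sys.argv        # only remove files when explicitly asked
cutoff = time.time() - CUTOFF_DAYS * 86400

for dirpath, dirnames, filenames in os.walk(sys.argv[1]):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            st = os.lstat(path)
        except OSError:
            continue
        if max(st.st_mtime, st.st_atime) < cutoff:
            if DELETE:
                os.unlink(path)
            else:
                print("would expire: %s" % path)

Invoked as, e.g., "python expire.py /el1/scratch" it only reports; adding --delete actually removes the old files.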