e2scan will show me all the files that have changed since a date, but I want to know all the files that have *not* changed since some date.

The goal is to make a system for purging scratch spaces that is fast and puts minimal wear on the filesystem.

How are groups doing this now? Are you using e2scan? Is there a way to have e2scan not only list the file but also the mtime/ctime in the log file, so that we can sort oldest to newest?

Thank you!

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734) 936-1985
Brock,

----- "Brock Palen" <brockp at umich.edu> wrote:

> e2scan will show me all the files that have changed since a date, but
> I want to know all the files that have not changed since some date.
> The goal is to make a system for purging scratch spaces that is fast,
> and minimum wear on the filesystem.
> How are groups doing this now? Are you using e2scan?
> Is there a way to have e2scan not only list the file but also the
> mtime/ctime in the log file, so that we can sort oldest to newest?

e2scan can dump its findings to a sqlite DB which has the ctime/mtime info in it. But you'll need to write some logic to construct the filepaths, because everything is stored with the inode number as the index. There is code in e2scan that can probably be recycled for that purpose, though. So I suppose you would get e2scan to create the DB, and then a custom app would search by ctime/mtime and spit out the full file path.

Daire
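The query-and-reconstruct step Daire describes might look something like the sketch below. Note that the table and column names (`files`, `ino`, `parent_ino`, `name`, `mtime`) are assumptions for illustration, not e2scan's actual schema; adapt the queries to whatever the DB dump really contains.

```python
import sqlite3

def old_file_paths(db_path, cutoff):
    """Return paths of entries whose mtime is older than `cutoff`.

    Assumes a hypothetical schema: files(ino, parent_ino, name, mtime).
    e2scan indexes everything by inode number, so paths have to be
    rebuilt by walking parent links up to the root inode.
    """
    con = sqlite3.connect(db_path)
    # Map every inode to (parent inode, name) so we can walk upward.
    parents = {ino: (parent, name)
               for ino, parent, name in
               con.execute("SELECT ino, parent_ino, name FROM files")}

    def path_of(ino):
        parts = []
        while ino in parents:
            parent, name = parents[ino]
            if parent == ino:  # the root inode is its own parent
                break
            parts.append(name)
            ino = parent
        return "/" + "/".join(reversed(parts))

    rows = con.execute(
        "SELECT ino FROM files WHERE mtime < ? ORDER BY mtime", (cutoff,))
    result = [path_of(ino) for (ino,) in rows]
    con.close()
    return result
```

A purge tool would then feed the returned (oldest-first) list to whatever actually deletes the files on a client.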
The e2scan shipped in Sun's RPMs does not support sqlite3 out of the box:

rpm -qf /usr/sbin/e2scan
e2fsprogs-1.40.7.sun3-0redhat

e2scan: sqlite3 was not detected on configure, database creation is not supported

Should I just rebuild e2scan?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734) 936-1985

On Mar 4, 2009, at 2:19 PM, Daire Byrne wrote:

> Brock,
>
> e2scan can dump its findings to a sqlite DB which has the ctime/mtime
> info in it. But you'll need to write some logic to construct the
> filepaths, because everything is stored with the inode number as the
> index. There is code in e2scan that can probably be recycled for that
> purpose, though. So I suppose you would get e2scan to create the DB,
> and then a custom app would search by ctime/mtime and spit out the
> full file path.
>
> Daire
Hi Brock,

We do this on our scratch file systems as well. Our policy is to remove files that have a time of last use of 60 days or older, where time of last use is defined as the greatest of [acm]time.

E2scan seems like an evil kludge to me, at least if you don't quiesce your servers first, which is impractical for us to do. It is especially painful if you have to correlate data taken separately from the OSTs and MDT, which I guess you need not do if a) you have a release with trustable MDS times (not 1.6.6), or b) you plan to stat(2) the MDS-generated list on a client before purging.

See bug 16942, which describes an MDS-resident "purge thread" that continually walks the file system implementing the policy. This is how we thought purging ought to be optimized, and we hope to have it in place by the time we put 1.8 in production.

Meanwhile, we are walking the file system from a client. Note that

lfs find --type f --atime +60 --mtime +60 --ctime +60 /mnt/lustre > list

beats

find /mnt/lustre -type f -atime +60 -mtime +60 -ctime +60 > list

by a wide margin, since most of the time it does not have to contact the OSTs, which stat(2) will always do for the foreseeable future (until size-on-MDS) to get st_size.

Jim

On Wed, Mar 04, 2009 at 01:48:10PM -0500, Brock Palen wrote:

> e2scan will show me all the files that have changed since a date, but
> I want to know all the files that have not changed since some date.
>
> The goal is to make a system for purging scratch spaces that is fast,
> and minimum wear on the filesystem.
> How are groups doing this now? Are you using e2scan?
> Is there a way to have e2scan not only list the file but also the
> mtime/ctime in the log file, so that we can sort oldest to newest?
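Jim's recipe above, which is to generate a candidate list cheaply with `lfs find` and then stat(2) each entry on a client before purging, could be sketched as a small client-side script. The 60-day threshold follows the policy he describes; everything else (the function names, the list-file interface) is illustrative, not LLNL's actual tool.

```python
import os
import sys
import time

PURGE_AGE = 60 * 86400  # 60 days, per the policy described above

def last_use(st):
    """Time of last use: the greatest of atime, mtime and ctime."""
    return max(st.st_atime, st.st_mtime, st.st_ctime)

def purge_candidates(paths, now=None, unlink=False):
    """Re-stat each candidate on a client and return those still stale.

    The re-stat guards against files touched between the scan and the
    purge (and against untrustworthy MDS times); pass unlink=True to
    actually remove the stale files.
    """
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        try:
            st = os.lstat(path)
        except FileNotFoundError:
            continue  # already gone; nothing to do
        if now - last_use(st) > PURGE_AGE:
            stale.append(path)
            if unlink:
                os.unlink(path)
    return stale

if __name__ == "__main__" and len(sys.argv) > 1:
    # Feed it the output of:
    #   lfs find --type f --atime +60 --mtime +60 --ctime +60 /mnt/lustre > list
    with open(sys.argv[1]) as listfile:
        for victim in purge_candidates(line.strip() for line in listfile):
            print(victim)
```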
Jim,

----- "Jim Garlick" <garlick at llnl.gov> wrote:

> E2scan seems like an evil kludge to me, at least if you don't quiesce
> your servers first, which is impractical for us to do. It is especially
> painful if you have to correlate data taken separately from the OSTs
> and MDT, which I guess you need not do if a) you have a release with
> trustable MDS times (not 1.6.6), or b) you plan to stat(2) the
> MDS-generated list on a client before purging.

For the purposes of clearing old data, are the m|ctimes of files on the MDT filesystem really going to be that far out? We came across something similar when we used to rsync between two MDTs and filed a "bug" here:

https://bugzilla.lustre.org/show_bug.cgi?id=14952

I thought that the MDT filesystem times would at least be close to the "real" Lustre filesystem times? Perhaps things are more complicated with striped files.

> See bug 16942, which describes an MDS-resident "purge thread" that
> continually walks the file system implementing the policy. This is how
> we thought purging ought to be optimized, and we hope to have it in
> place by the time we put 1.8 in production.

Interesting. I will keep an eye on that - cheers.

> Meanwhile, we are walking the file system from a client. Note that
>
> lfs find --type f --atime +60 --mtime +60 --ctime +60 /mnt/lustre > list
>
> beats
>
> find /mnt/lustre -type f -atime +60 -mtime +60 -ctime +60 > list
>
> by a wide margin, since most of the time it does not have to contact
> the OSTs, which stat(2) will always do for the foreseeable future
> (until size-on-MDS) to get st_size.

I did not think about this before - thanks. So the "accurate" times are held in the EAs on the MDT? And the stat(2) is so slow because it wants the file size too, which then needs to talk to the OSTs. Is there still going to be an overhead on the MDS reading the EAs from disk compared to just stat'ing the files on the MDT device?
Some comparison benchmarks on one of our filesystems (36 million files) with 10GigE:

# e2scan -l -D -N 0 /dev/lustre/mdt1
~49 minutes
# e2scan -l -D -N `date --date="60 days ago" +%s` /dev/lustre/mdt1
~18 minutes

# lfs find /mnt/lustre
~403 minutes
# lfs find -atime +60 -mtime +60 -ctime +60 /mnt/lustre
~2520 minutes

# find /mnt/lustre
~100 minutes
# find /mnt/lustre -atime +60 -mtime +60 -ctime +60
~6574 minutes (4.5 days!)

The results are not 100% accurate because this was run on a production system whose load varied throughout the day. I appreciate that "lfs find" is more convenient (and accurate) than using e2scan for particular ctimes and mtimes, but there is still enough of a performance difference in our environment to make e2scan preferable. Running multiple (lfs) find commands across a compute cluster does help speed things up, but we found that the MDS gets hammered.

The other big application for scanning the filesystem is "indexing" (which we are always trying to improve). We also use e2scan for this, by dumping a sqlite DB and then only stat'ing the new/modified files. Finally we update a MySQL DB which users can quickly query through a GUI. It is always an incremental scan update, to avoid stat'ing unchanged files. We all eagerly await changelogs...

Thanks for the insight,

Daire
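The incremental update step, where only new or modified files are stat'ed between scans, could be sketched like this. The inode-to-(ctime, mtime) mapping is an assumed intermediate form for illustration, not e2scan's or the indexer's real data model.

```python
def changed_inodes(prev, current):
    """Return the inodes that need a fresh stat(2).

    `prev` maps inode -> (ctime, mtime) as recorded at the last indexing
    run; `current` is the same mapping built from a new e2scan dump.
    Only new or modified entries are returned, so unchanged files are
    never re-stat'ed -- the incremental behaviour described above.
    """
    return [ino for ino, times in current.items()
            if prev.get(ino) != times]

def deleted_inodes(prev, current):
    """Inodes present in the old index but gone from the new scan."""
    return [ino for ino in prev if ino not in current]
```

After stat'ing the changed inodes, the results (plus the deletions) would be pushed into the user-facing MySQL DB.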
On Mar 11, 2009 09:30 +0000, Daire Byrne wrote:

> For the purposes of clearing old data, are the m|ctimes of files on the
> MDT filesystem really going to be that far out? We came across something
> similar when we used to rsync between two MDTs and filed a "bug" here:
>
> https://bugzilla.lustre.org/show_bug.cgi?id=14952
>
> I thought that the MDT filesystem times would at least be close to the
> "real" Lustre filesystem times?

In recent versions of Lustre (1.6.6 and later, I believe) the mtime on the MDS is updated on close, so that it is usable by e2scan. It isn't 100% accurate (e.g. not updated in case of crash, or until close), but in most cases it will be reasonably accurate. This unfortunately doesn't mean that older files will get their mtimes updated, only newer files.

> > Meanwhile, we are walking the file system from a client. Note that
> >
> > lfs find --type f --atime +60 --mtime +60 --ctime +60 /mnt/lustre > list
> >
> > beats
> >
> > find /mnt/lustre -type f -atime +60 -mtime +60 -ctime +60 > list
> >
> > by a wide margin, since most of the time it does not have to contact
> > the OSTs, which stat(2) will always do for the foreseeable future
> > (until size-on-MDS) to get st_size.
>
> I did not think about this before - thanks. So the "accurate" times are
> held in the EAs on the MDT?

No, the accurate ctime is kept on both the MDT and OSTs, and whichever one is later wins. The mtime is kept on whichever node has the later ctime, which is generally the OSTs because they do the writes. The atime is kept in memory on the OSTs, and written to the MDT at file close.

> And the stat(2) is so slow because it wants the file size too, which
> then needs to talk to the OSTs. Is there still going to be an overhead
> on the MDS reading the EAs from disk compared to just stat'ing the
> files on the MDT device?

Yes.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Hi,

> Some comparison benchmarks on one of our filesystems (36 million files)
> with 10GigE:
>
> # e2scan -l -D -N 0 /dev/lustre/mdt1
> ~49 minutes
> # e2scan -l -D -N `date --date="60 days ago" +%s` /dev/lustre/mdt1
> ~18 minutes
>
> # lfs find /mnt/lustre
> ~403 minutes
> # lfs find -atime +60 -mtime +60 -ctime +60 /mnt/lustre
> ~2520 minutes
>
> # find /mnt/lustre
> ~100 minutes
> # find /mnt/lustre -atime +60 -mtime +60 -ctime +60
> ~6574 minutes (4.5 days!)

I'm very interested in this discussion, for Lustre-HSM purposes. The Lustre-HSM Policy Engine will mostly process ChangeLogs, but an initial scan may be needed when upgrading a non-empty Lustre file system to a Lustre-HSM system.

Looking at those results, e2scan seems a very efficient way to retrieve metadata for all entries, so it could be used to provide an initial list to the PolicyEngine, as a flat file or DB. Does it provide common POSIX attributes and striping information? I also guess it does not provide the file size until the 'Size on MDS' feature lands.

> The other big application for scanning the filesystem is "indexing"
> (which we are always trying to improve). We also use e2scan for this,
> by dumping a sqlite DB and then only stat'ing the new/modified files.
> Finally we update a MySQL DB which users can quickly query through a
> GUI. It is always an incremental scan update, to avoid stat'ing
> unchanged files. We all eagerly await changelogs...

Don't you have performance issues with SQLite? It seemed to me that it was not very efficient for managing huge data sets with millions of entries.

Kind regards,
Thomas LEIBOVICI
CEA/DAM
Thomas,

----- "LEIBOVICI Thomas" <thomas.leibovici at cea.fr> wrote:

> Looking at those results, e2scan seems a very efficient way to retrieve
> metadata for all entries, so it could be used to provide an initial
> list to the PolicyEngine, as a flat file or DB.
> Does it provide common POSIX attributes and striping information?
> I also guess it does not provide the file size until the 'Size on MDS'
> feature lands.

e2scan provides owner, group, ctime and mtime, but no atime (I'm sure that is a trivial addition). It does not do any reading of the EAs, so there is no striping information. I would think the overhead of reading the EAs would make e2scan far slower. At the moment it is quick and simple.

> Don't you have performance issues with SQLite? It seemed to me that it
> was not very efficient for managing huge data sets with millions of
> entries.

Admittedly it is not great, but it is good enough for our purposes. I'm sure the code could be altered to write directly to MySQL over the network.

Regards,
Daire