Mikko Lammi
2010-Jan-05 10:34 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Hello,

As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. The good thing about ZFS is that it allows this without any issues. Unfortunately, now that we need to get rid of them (because they eat 80% of disk space), it seems to be quite challenging.

Traditional approaches like "find ./ -exec rm {} \;" seem to take forever - after running several days, the directory size still stays the same. The only way I've been able to remove anything has been by running "rm -rf" on the problematic directory from the parent level. Running this command shows the directory size decreasing by 10,000 files/hour, but this would still mean close to ten months (over 250 days) to delete everything!

I also tried the "unlink" command on the directory as root, as the user who created the directory, after changing the directory's owner to root, and so forth, but all attempts gave a "Not owner" error.

Any commands like "ls -f" or "find" will run for hours (or days) without actually listing anything from the directory, so I'm beginning to suspect that maybe the directory's data structure is somehow damaged. Is there some diagnostic that I can run with e.g. "zdb" to investigate and hopefully fix a single directory within a zfs dataset?

To make things even more difficult, this directory is located in rootfs, so dropping the zfs filesystem would basically mean reinstalling the entire system, which is something we really wouldn't wish to do.

OS is Solaris 10, zpool version is 10 (rather old, I know, but is there an easy upgrade path that might solve this problem?) and the zpool consists of two 146 GB SAS drives in a mirror setup.

Any help would be appreciated.

Thanks,
Mikko

-- 
Mikko Lammi | lmmz at lmmz.net | http://www.lmmz.net
Joerg Schilling
2010-Jan-05 10:47 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
"Mikko Lammi" <mikko.lammi at lmmz.net> wrote:> Hello, > > As a result of one badly designed application running loose for some time, > we now seem to have over 60 million files in one directory. Good thing > about ZFS is that it allows it without any issues. Unfortunatelly now that > we need to get rid of them (because they eat 80% of disk space) it seems > to be quite challenging. > > Traditional approaches like "find ./ -exec rm {} \;" seem to take forever > - after running several days, the directory size still says the same. The > only way how I''ve been able to remove something has been by giving "rm > -rf" to problematic directory from parent level. Running this command > shows directory size decreasing by 10,000 files/hour, but this would still > mean close to ten months (over 250 days) to delete everything!Do you know the number of files where it really starts to become unusable slow? I had firectories with 3 million files on UFS and this was just a bit slower than with small directories. BTW: "find ./ -exec rm {} \;" is definitely the wrong command as it is known since a long time to take forever. This is why "find ./ -exec rm {} +" was introduced 20 years ago. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Markus Kovero
2010-Jan-05 10:51 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Hi, while not providing a complete solution, I'd suggest turning atime off so find/rm does not change access times, and possibly destroying unnecessary snapshots before removing the files; that should be quicker.

Yours
Markus Kovero

-----Original Message-----
From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Mikko Lammi
Sent: 5. tammikuuta 2010 12:35
To: zfs-discuss at opensolaris.org
Subject: [zfs-discuss] Clearing a directory with more than 60 million files

[original message quoted in full]
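A minimal sketch of the atime change Markus suggests; the dataset name below is only a placeholder for whichever dataset actually holds the directory.

  # disable access-time updates on the affected dataset (dataset name is an example)
  zfs set atime=off rpool/ROOT/s10_root
  zfs get atime rpool/ROOT/s10_root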
Mike Gerdts
2010-Jan-05 13:34 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, Jan 5, 2010 at 4:34 AM, Mikko Lammi <mikko.lammi at lmmz.net> wrote:
> Hello,
>
> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. Good thing about ZFS is that it allows it without any issues. Unfortunately now that we need to get rid of them (because they eat 80% of disk space) it seems to be quite challenging.
>
> [...]
>
> Any commands like "ls -f" or "find" will run for hours (or days) without actually listing anything from the directory, so I'm beginning to suspect that maybe the directory's data structure is somehow damaged. Is there some diagnostic that I can run with e.g. "zdb" to investigate and hopefully fix a single directory within a zfs dataset?

In situations like this, ls will be exceptionally slow, partially because it will sort the output. Find is slow because it needs to call lstat() on every entry. In similar situations I have found the following to work.

  perl -e 'opendir(D, "."); while ( $d = readdir(D) ) { print "$d\n" }'

Replace print with unlink if you wish...

> [...]

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
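For reference, a rough sketch of the unlink variant Mike alludes to, run from inside the problem directory (an untested sketch; the path is an example, and it skips "." and ".."):

  cd /opt/MYapp/data    # example path
  perl -e 'opendir(D, ".") or die "opendir: $!";
           while (defined($d = readdir(D))) {
               next if $d eq "." || $d eq "..";
               unlink($d) or warn "unlink $d: $!\n";
           }'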
Michael Schuster
2010-Jan-05 13:46 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Mike Gerdts wrote:
> On Tue, Jan 5, 2010 at 4:34 AM, Mikko Lammi <mikko.lammi at lmmz.net> wrote:
>> [...]
>
> In situations like this, ls will be exceptionally slow partially
> because it will sort the output.

that's what '-f' was supposed to avoid, I'd guess.

Michael
-- 
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
David Magda
2010-Jan-05 15:08 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 05:34, Mikko Lammi wrote:

> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. Good thing about ZFS is that it allows it without any issues. Unfortunately now that we need to get rid of them (because they eat 80% of disk space) it seems to be quite challenging.

How about creating a new data set, moving the directory into it, and then destroying it?

Assuming the directory in question is /opt/MYapp/data:

  1. zfs create rpool/junk
  2. mv /opt/MYapp/data /rpool/junk/
  3. zfs destroy rpool/junk
Casper.Dik at Sun.COM
2010-Jan-05 15:12 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
> On Tue, January 5, 2010 05:34, Mikko Lammi wrote:
>
>> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. [...]
>
> How about creating a new data set, moving the directory into it, and then destroying it?
>
> Assuming the directory in question is /opt/MYapp/data:
>  1. zfs create rpool/junk
>  2. mv /opt/MYapp/data /rpool/junk/
>  3. zfs destroy rpool/junk

The "move" will create and remove the files; the "remove" by mv will be just as inefficient, removing them one by one.

"rm -rf" would be at least as quick.

Casper
Mikko Lammi
2010-Jan-05 15:22 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 17:08, David Magda wrote:
> On Tue, January 5, 2010 05:34, Mikko Lammi wrote:
>
>> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. [...]
>
> How about creating a new data set, moving the directory into it, and then destroying it?
>
> Assuming the directory in question is /opt/MYapp/data:
>  1. zfs create rpool/junk
>  2. mv /opt/MYapp/data /rpool/junk/
>  3. zfs destroy rpool/junk

Tried that as well. It's moving individual files to the new directory at roughly 3,000 files/minute, so it's not any faster than anything that I can apply directly to the original directory.

I also tried the perl script that does readdir() earlier (it's as slow as any other approach), and switched the zfs dataset parameter "atime" to off, but that didn't have much effect either.

However, when we deleted some other files from the volume and managed to raise free disk space from 4 GB to 10 GB, the "rm -rf directory" method started to perform significantly faster. Now it's deleting around 4,000 files/minute (240,000/h - quite an improvement from 10,000/h). I remember seeing some discussion about ZFS performance when a filesystem becomes very full, so I wonder if that was the case here.

Next I'm going to try whether "find ./ -exec rm {} +" yields any better results than "rm -rf" from the parent directory. But I guess at some point the bottleneck will be just CPU (this is a 1-GHz T1000 system) and disk I/O, not the ZFS filesystem. I'm just wondering what kind of figures to expect.

regards,
Mikko

-- 
Mikko Lammi | lmmz at lmmz.net | http://www.lmmz.net
David Magda
2010-Jan-05 15:43 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 10:12, Casper.Dik at Sun.COM wrote:

>> How about creating a new data set, moving the directory into it, and then destroying it?
>>
>> Assuming the directory in question is /opt/MYapp/data:
>>  1. zfs create rpool/junk
>>  2. mv /opt/MYapp/data /rpool/junk/
>>  3. zfs destroy rpool/junk
>
> The "move" will create and remove the files; the "remove" by mv will be just as inefficient, removing them one by one.
>
> "rm -rf" would be at least as quick.

Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?
Casper.Dik at Sun.COM
2010-Jan-05 15:48 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
> Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?

Only within a single zfs can you "rename" files; within the same zpool but across different zfs filesystems, you will need to copy and remove.

Casper
Michael Schuster
2010-Jan-05 15:50 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
David Magda wrote:
> On Tue, January 5, 2010 10:12, Casper.Dik at Sun.COM wrote:
>> [...]
>> "rm -rf" would be at least as quick.
>
> Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?

no - mv doesn't know about zpools, only about posix filesystems.

-- 
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Dennis Clarke
2010-Jan-05 15:53 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
> On Tue, January 5, 2010 10:12, Casper.Dik at Sun.COM wrote:
>> [...]
>> "rm -rf" would be at least as quick.
>
> Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?

You can also use star, which may speed things up, safely.

  star -copy -p -acl -sparse -dump -xdir -xdot -fs=96m -fifostats -time \
      -C source_dir . destination_dir

That will buffer the transport of the data from source to destination via memory and work to keep that buffer full as data is written on the output side. It's probably at least as fast as mv, and probably safer, because you never delete the original until after the copy is complete.

-- 
Dennis Clarke
dclarke at opensolaris.ca  <- Email related to the open source Solaris
dclarke at blastwave.org   <- Email related to open source for Solaris
David Magda
2010-Jan-05 16:00 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 10:50, Michael Schuster wrote:
> David Magda wrote:
>> Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?
>
> no - mv doesn't know about zpools, only about posix filesystems.

So the delineation of POSIX file systems is done at the data set "layer", and not at the zpool layer. (Which makes sense, since the output of 'df' tends to closely mimic the output of 'zfs list'.)
Richard Elling
2010-Jan-05 16:01 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Jan 5, 2010, at 2:34 AM, Mikko Lammi wrote:
> Hello,
>
> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. [...]
>
> Traditional approaches like "find ./ -exec rm {} \;" seem to take forever - after running several days, the directory size still stays the same. The only way I've been able to remove something has been by giving "rm -rf" to the problematic directory from the parent level. Running this command shows directory size decreasing by 10,000 files/hour, but this would still mean close to ten months (over 250 days) to delete everything!

This is, in part, due to stat() slowness. Fixed in later OpenSolaris builds. I have no idea if or when the fix will be backported to Solaris 10.

> I also tried the "unlink" command on the directory as root, [...]
>
> To make things even more difficult, this directory is located in rootfs, so dropping the zfs filesystem would basically mean reinstalling the entire system, which is something we really wouldn't wish to do.

How are the files named? If you know something about the filename pattern, then you could create subdirs and mv large numbers of files to reduce the overall size of a single directory. Something like:

  mkdir .A
  mv A* .A
  mkdir .B
  mv B* .B
  ...

Also, as previously noted, atime=off. If you can handle a reboot, you can bump the size of the DNLC, which might help also. OTOH, if you can reboot you can also run the latest b130 livecd, which has faster stat().
 -- richard
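A hedged sketch of the DNLC bump Richard mentions: ncsize is the Solaris tunable that sizes the directory name lookup cache, the value below is only an example, and a reboot is needed for it to take effect.

  # /etc/system - enlarge the directory name lookup cache (value is an example)
  set ncsize = 1000000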
Joerg Schilling
2010-Jan-05 16:02 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Michael Schuster <Michael.Schuster at Sun.COM> wrote:

>>> "rm -rf" would be at least as quick.
>>
>> Normally when you do a move within a 'regular' file system, all that's usually done is that the directory pointer is shuffled around. This is not the case with ZFS data sets, even though they're on the same pool?
>
> no - mv doesn't know about zpools, only about posix filesystems.

mv first tries to rename(2) the file. If this does not succeed but results in EXDEV, it copies the file.

Jörg

-- 
EMail: joerg at schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
       js at cs.tu-berlin.de                  (uni)
       joerg.schilling at fokus.fraunhofer.de (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Casper.Dik at Sun.COM
2010-Jan-05 16:03 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
> no - mv doesn't know about zpools, only about posix filesystems.

"mv" doesn't care about filesystems, only about the interface provided by POSIX. There is no zfs-specific interface which allows you to move a file from one zfs to the next.

Casper
David Dyer-Bennet
2010-Jan-05 16:13 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 10:01, Richard Elling wrote:
> OTOH, if you can reboot you can also run the latest
> b130 livecd which has faster stat().

How much faster is it? He estimated 250 days to rm -rf them; so 10x faster would get that down to 25 days, 100x would get it down to 2.5 days (assuming the entire time is in the stat calls, which is probably not totally true)....

It's interesting how our ability to build larger disks, and our software's ability to do things like create really large numbers of files, comes back to bite us on the ass every now and then.

I hope he has a background process running chipping away at it; I don't THINK 250 days in the background is going to turn out to be the best answer, but one might as well start the clock running just in case.

The best answer might turn out to be to copy off the less-than-20% of good data and just scrag the pool. Inelegant, but might result in less downtime, or in getting the space back much faster.

-- 
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Richard Elling
2010-Jan-05 16:25 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Jan 5, 2010, at 8:13 AM, David Dyer-Bennet wrote:
> On Tue, January 5, 2010 10:01, Richard Elling wrote:
>> OTOH, if you can reboot you can also run the latest
>> b130 livecd which has faster stat().
>
> How much faster is it? He estimated 250 days to rm -rf them; so 10x faster would get that down to 25 days, 100x would get it down to 2.5 days (assuming the entire time is in the stat calls, which is probably not totally true)....

dunno, nothing useful in the public bug report :-(
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6775100

> It's interesting how our ability to build larger disks, and our software's ability to do things like create really large numbers of files, comes back to bite us on the ass every now and then.

Wait until you try it with dedup... not only will you need to update a lot of metadata, but also a lot of DDT entries.
 -- richard
Tim Cook
2010-Jan-05 16:47 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, Jan 5, 2010 at 11:25 AM, Richard Elling <richard.elling at gmail.com> wrote:

> On Jan 5, 2010, at 8:13 AM, David Dyer-Bennet wrote:
>> [...]
>> It's interesting how our ability to build larger disks, and our software's ability to do things like create really large numbers of files, comes back to bite us on the ass every now and then.
>
> Wait until you try it with dedup... not only will you need to update a lot of metadata, but also a lot of DDT entries.
> -- richard

I recall pointing this out over a year ago when I said claiming unlimited snapshots and filesystems was disingenuous at best, and that likely we'd need to see artificial limitations to make many of these features usable. But I digress :)

-- 
--Tim
Daniel Rock
2010-Jan-05 16:52 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On 05.01.2010 16:22, Mikko Lammi wrote:
> However when we deleted some other files from the volume and managed to raise free disk space from 4 GB to 10 GB, the "rm -rf directory" method started to perform significantly faster. Now it's deleting around 4,000 files/minute (240,000/h - quite an improvement from 10,000/h). I remember that I saw some discussion related to ZFS performance when a filesystem becomes very full, so I wonder if that was the case here.

I did some tests. They were done on an Ultra 20 (2.2 GHz Dual-Core Opteron) with crappy SATA disks. On this machine creation and deletion of files were I/O bound. I was able to create about 1 million files per hour. I stopped after 5 hours, so I had approx. 5 million files in one directory.

Deletion (via the Perl script) also ran at a rate of ~1 million files per hour. During deletion the disks (mirrored zpool) were both 95% busy; CPU time was 5% total.

If the T1000 has SCSI disks you can turn on the write cache on both disks (though in my tests most of the I/O during delete was read operations). For the rpool it will probably not be enabled by default because you are "just" using partitions:

  # format -e
  [select disk]
  format> scsi
  scsi> p8 b2 |= 4

  Mode select on page 8 ok.

  scsi> quit

Disable write cache:

  scsi> p8 b2 &= ~4

(Yes, I know there is a "cache" command in format, but I was used to the above commands long before the "cache" command was introduced.)

Daniel
Joe Blount
2010-Jan-05 17:18 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On 01/ 5/10 10:01 AM, Richard Elling wrote:
> How are the files named? If you know something about the filename
> pattern, then you could create subdirs and mv large numbers of files
> to reduce the overall size of a single directory. Something like:
>
>  mkdir .A
>  mv A* .A
>  mkdir .B
>  mv B* .B
>  ...

I doubt that would be a faster option, unless you can be certain the file naming coincides with the unsorted order of the files in the directory. If A* does not occur at the beginning of the directory's contents, finding those files will be painful. The above process would add many cycles of scanning through all 60 million directory entries, and each move request will churn the vnode cache.

A while back I did some experimenting with millions of files per directory. Note that the time estimates are overstating how long it will take: the more files you remove, the faster it will go.

I would be trying to get an unsorted read of the directory, and delete the files in that order. This is not just to save the time it takes to sort the output; it will also minimize vnode cache churn and the time to remove each object. Each remove request must iterate the directory looking for the object to remove.

Newer ON builds support the -U option to ls, for unsorted output. I don't know what may exist on S10. FWIW, I copied the 'ls' binary from an ON b128 machine to /tmp/myls on a S10 machine, and it appeared to work - I don't know if there are any issues/risks with doing that.

Since it's a Niagara system, it might go faster if you can get multiple removes going in parallel - but only if all the parallel remove requests can be on files near the beginning of the directory's contents. If you can't get an unsorted list of files, then multiple threads will just add to the vnode cache thrashing.

It might be worth trying something like this (see the sketch below): ls -U > remove.sh, then make it a bash script - prepend "rm -f" to each line and append an "&" to each line. Maybe every few hundred lines put in a wait (in case the rm's can be kicked off significantly faster than they can be completed; you don't want millions of rm's to get started). You'll have to wait on ls to do one unsorted read of the directory, but then you will get parallel remove requests going, and always on files at the beginning of the directory. There should be minimal vnode churn during the removes.

Starting the new processes for the removes may counteract the benefit of parallelizing and make this slower, but since it's a Niagara system, you may have the spare CPU cycles to waste anyway. It's just another idea to try...
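A rough sketch of the remove.sh generation Joe outlines, assuming an unsorted-listing ls (here the copied /tmp/myls) and an arbitrary batch size of 200; the path is an example, and filenames containing quote characters would need more care.

  cd /opt/MYapp/data                      # example path
  /tmp/myls -U | awk '{ print "rm -f \"" $0 "\" &" }
                      NR % 200 == 0 { print "wait" }
                      END { print "wait" }' > /var/tmp/remove.sh
  bash /var/tmp/remove.sh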
Paul Gress
2010-Jan-05 17:38 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On 01/ 5/10 05:34 AM, Mikko Lammi wrote:
> Hello,
>
> As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. Good thing about ZFS is that it allows it without any issues. Unfortunately now that we need to get rid of them (because they eat 80% of disk space) it seems to be quite challenging.

I've been following this thread. Would it be faster to do the reverse: copy the 20% of the disk you want to keep, then format, then move the 20% back?

Paul
Michael Schuster
2010-Jan-05 17:44 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Paul Gress wrote:
> On 01/ 5/10 05:34 AM, Mikko Lammi wrote:
>> [...]
>
> I've been following this thread. Would it be faster to do the reverse: copy the 20% of the disk you want to keep, then format, then move the 20% back?

I'm not sure the OS installation would survive that.

Michael
-- 
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Fajar A. Nugraha
2010-Jan-05 17:57 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Wed, Jan 6, 2010 at 12:44 AM, Michael Schuster <Michael.Schuster at sun.com> wrote:
>>> we need to get rid of them (because they eat 80% of disk space) it seems to be quite challenging.
>>
>> I've been following this thread. Would it be faster to do the reverse: copy the 20% of the disk you want to keep, then format, then move the 20% back?
>
> I'm not sure the OS installation would survive that.

... even when done from a live/rescue CD session?

-- 
Fajar
Richard Elling
2010-Jan-05 18:49 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Jan 5, 2010, at 8:52 AM, Daniel Rock wrote:
> On 05.01.2010 16:22, Mikko Lammi wrote:
>> However when we deleted some other files from the volume and managed to raise free disk space from 4 GB to 10 GB, the "rm -rf directory" method started to perform significantly faster. [...]
>
> I did some tests. They were done on an Ultra 20 (2.2 GHz Dual-Core Opteron) with crappy SATA disks. On this machine creation and deletion of files were I/O bound. I was able to create about 1 million files per hour. I stopped after 5 hours, so I had approx. 5 million files in one directory.
>
> Deletion (via the Perl script) also ran at a rate of ~1 million files per hour. During deletion the disks (mirrored zpool) were both 95% busy; CPU time was 5% total.
>
> If the T1000 has SCSI disks you can turn on the write cache on both disks (though in my tests most of the I/O during delete was read operations). For the rpool it will probably not be enabled by default because you are "just" using partitions:
> [...]

Good observation! By default, rpool will not have the write cache enabled. It might make a difference to enable the write cache for this operation.
 -- richard
David Dyer-Bennet
2010-Jan-05 19:38 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, January 5, 2010 10:25, Richard Elling wrote:
> On Jan 5, 2010, at 8:13 AM, David Dyer-Bennet wrote:
>> It's interesting how our ability to build larger disks, and our software's ability to do things like create really large numbers of files, comes back to bite us on the ass every now and then.
>
> Wait until you try it with dedup... not only will you need to update a lot of metadata, but also a lot of DDT entries.

My data consists (by volume) almost entirely of bitmap photo images; I don't think dedup is going to buy me much, so I'm not leaping into experimenting with it. Probably just as well; I don't think I have enough memory for it, either.

-- 
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Matthew Ahrens
2010-Jan-20 03:25 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
Michael Schuster wrote:
> Mike Gerdts wrote:
>> On Tue, Jan 5, 2010 at 4:34 AM, Mikko Lammi <mikko.lammi at lmmz.net> wrote:
>>> [...]
>>> Any commands like "ls -f" or "find" will run for hours (or days) without actually listing anything from the directory, so I'm beginning to suspect that maybe the directory's data structure is somehow damaged. [...]
>>
>> In situations like this, ls will be exceptionally slow partially
>> because it will sort the output.
>
> that's what '-f' was supposed to avoid, I'd guess.

Yes, but unfortunately, the typical reason ls is slow with huge directories is that it requires a huge amount of memory. Even when not sorting (with -f), it still allocates a huge amount of memory for each entry listed, and buffers the output until the directory is entirely read. So typically -f doesn't help performance much. Improving this would be a great small project for an OpenSolaris contributor! I filed a couple of bugs for this several years ago; I can dig them up if anyone is interested.

--matt
Mike Gerdts
2010-Jan-20 04:11 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, Jan 19, 2010 at 9:25 PM, Matthew Ahrens <Matthew.Ahrens at sun.com> wrote:
> Michael Schuster wrote:
>> that's what '-f' was supposed to avoid, I'd guess.
>
> Yes, but unfortunately, the typical reason ls is slow with huge directories is that it requires a huge amount of memory. Even when not sorting (with -f), it still allocates a huge amount of memory for each entry listed, and buffers the output until the directory is entirely read. So typically -f doesn't help performance much. Improving this would be a great small project for an OpenSolaris contributor! I filed a couple of bugs for this several years ago; I can dig them up if anyone is interested.

Yeah, there is more going on than that, though. I had started to dig into this but wanted to do a cleaner experiment before showing my results. Well, a couple of weeks have passed and I still haven't gotten to it. As such, I'll share my initial findings.

I had initially created a zpool on a file in /tmp (2 GB, I think). I then began creating 8 million files in a directory. A while later I realized I was going to need more space, and gave it several more gigabytes (6, I think) allocated from another file in /var/tmp (UFS). In any case, the total size of the pool was way less than 50% of RAM.

The test system is a 2 x SPARC64-VII M4000 with 32 GB of memory. Aside from my ZFS test, there was one CPU-bound (single core) process running. It was running S10u8.

The attached graphs show the system behavior during the "rm -rf" of the directory. The lull at 16:30 was due to stopping the rm -rf as I tried a "zpool export" followed by a "zpool import". What is confusing to me is why there is so much physical read activity. I would have expected all of this to be in the ARC and have no need for the heavy read volume.
Again, a cleaner experiment is needed, presumably using zpools on block devices using OpenSolaris.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/

[attachment: delete-8-million-files.png]
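For anyone wanting to reproduce a setup like the one Mike describes, a minimal sketch of a file-backed scratch pool (file names, sizes, and the pool name are examples only):

  # create a 2 GB backing file and a throwaway pool on it
  mkfile 2g /tmp/tank.img
  zpool create testpool /tmp/tank.img

  # later, grow the pool with a second backing file
  mkfile 6g /var/tmp/tank2.img
  zpool add testpool /var/tmp/tank2.img

  # destroy it when finished
  zpool destroy testpool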
Miles Nordin
2010-Jan-20 17:49 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
>>>>> "ml" == Mikko Lammi <mikko.lammi at lmmz.net> writes:ml> "rm -rf" to problematic directory from parent level. Running ml> this command shows directory size decreasing by 10,000 ml> files/hour, but this would still mean close to ten months ml> (over 250 days) to delete everything! interesting. does ''zpool scrub'' take unusually long, too? or is it pretty close to normal speed? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100120/e7f4d83c/attachment.bin>
Jason King
2010-Jan-26 05:00 UTC
[zfs-discuss] Clearing a directory with more than 60 million files
On Tue, Jan 19, 2010 at 9:25 PM, Matthew Ahrens <Matthew.Ahrens at sun.com> wrote:
> Michael Schuster wrote:
>> that's what '-f' was supposed to avoid, I'd guess.
>
> Yes, but unfortunately, the typical reason ls is slow with huge directories is that it requires a huge amount of memory. Even when not sorting (with -f), it still allocates a huge amount of memory for each entry listed, and buffers the output until the directory is entirely read. So typically -f doesn't help performance much. Improving this would be a great small project for an OpenSolaris contributor! I filed a couple of bugs for this several years ago; I can dig them up if anyone is interested.
>
> --matt

After a few days of deliberation, I've decided to start working on this (in addition to adding the 256-color ls support Danek was interested in, as well as addressing a number of other bugs). I suspect it's going to require a significant overhaul of the existing code to get it to a point where it can behave better with large directories (though the current code could probably use the cleanup anyway).

If anyone's interested in testing once I've got it to a point worth testing (probably a few weeks, depending on how much time I can commit to it)... let me know...