I was doing some tests with creating and removing subdirectories and
watching the time that takes. The directory retains the size and
performance issues after the files are removed.

/rootz/test> ls -la .
total 42372
drwxr-xr-x   2 add      root           2 May  7 23:20 .
drwxr-xr-x   3 root     sys            3 May  7 00:34 ..
/rootz/test> time du -ak .
21184   .

real    0m28.497s
user    0m0.002s
sys     0m1.174s

/rootz/test> zfs list
NAME                   USED  AVAIL  REFER  MOUNTPOINT
rootz                 21.9M   978M  20.7M  /rootz

Is there any way to ask ZFS to reclaim or shrink the allocation on the
directory without actually deleting the directory, like I would have to
do under UFS?

-- 
Darren Dunham                                          ddunham at taos.com
Senior Technical Consultant         TAOS           http://www.taos.com/
Got some Dr Pepper?                          San Francisco, CA bay area
        < This line left intentionally blank to confuse you. >
On Sun, May 07, 2006 at 11:38:52PM -0700, Darren Dunham wrote:
> I was doing some tests with creating and removing subdirectories and
> watching the time that takes. The directory retains the size and
> performance issues after the files are removed.
>
> /rootz/test> ls -la .
> total 42372
> drwxr-xr-x   2 add      root           2 May  7 23:20 .
> drwxr-xr-x   3 root     sys            3 May  7 00:34 ..
> /rootz/test> time du -ak .
> 21184   .

ZFS doesn't recover *quite* all the space that large directories use,
even after deleting all the entries. From my own tests, I thought we
would be recovering about 5x more space than we are. We rely on
compression to help recover the space, but it appears that the "empty"
directory blocks are not as compressible as I thought they would be.
Here's my experience:

# ls -ls
154075 drwxr-xr-x   2 root     root        1.0M May  9 11:14 dir

So I have a directory with 1 million entries (all links to the same
file), which takes up about 76MB (~75 bytes per entry).

# ptime du -h dir
 77M   dir

Yep, 77MB.

real       41.780
user        1.971
sys        39.793

Took about 0.04 milliseconds per directory entry to readdir() and
stat() it.

After removing all the dirents:

# ptime du -h dir
 41M   dir

real        0.383
user        0.000
sys         0.379

And the 'du' was quick. Now we're down to about 40 bytes per (now
deleted) entry. If the compression was working optimally, this would be
about 8 bytes per (deleted) entry.

I've filed the following bug to track this issue:
6423695 empty ZAP leaf blocks are not compressed down to minimum size

> real    0m28.497s
> user    0m0.002s
> sys     0m1.174s

Probably your directory is not cached, so it takes some time to read in
all the (now empty) blocks of the directory.

In the future, we may improve this by "joining" less-than-full ZAP
blocks, or by simply special-casing "empty again" or "small again"
directories and essentially automatically re-writing them in the most
compact form. So far we haven't found this to be a pressing performance
issue, so it isn't high on our priority list.

--matt
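[For anyone who wants to reproduce Matt's experiment, here is a
scaled-down sketch. The `TESTDIR` path and the entry count `N` are
illustrative, not his exact commands; run it on a ZFS filesystem to see
the retained ZAP space (other filesystems will show different numbers).]

```shell
#!/bin/sh
# Scaled-down sketch of the experiment above: fill a directory with
# hard links to one file, remove them all, and compare the directory's
# space usage before and after. TESTDIR and N are illustrative.
TESTDIR=${TESTDIR:-/tmp/zaptest}
N=${N:-1000}

mkdir -p "$TESTDIR/dir"
touch "$TESTDIR/target"

i=0
while [ "$i" -lt "$N" ]; do          # create N hard links to one file
    ln "$TESTDIR/target" "$TESTDIR/dir/e$i"
    i=$((i + 1))
done
echo "full:";    du -k "$TESTDIR/dir"

rm "$TESTDIR/dir"/e*                 # remove every entry...
echo "emptied:"; du -k "$TESTDIR/dir"   # ...the directory can stay large
```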
On Tue, 9 May 2006, Matthew Ahrens wrote:
> In the future, we may improve this by "joining" less-than-full ZAP
> blocks, or by simply special-casing "empty again" or "small again"
> directories and essentially automatically re-writing them in the most
> compact form. So far we haven't found this to be a pressing performance
> issue, so it isn't high on our priority list.

Does this mean that if I have a zfs filesystem that is
creating/writing/reading/deleting millions of short-lived files in a
day, the directory area would keep growing? Or am I missing something?
Surely the ideal case, when a file is deleted, is that the
corresponding directory space/time cost would be zero.

In the case of busy production systems, it would be great to have some
sort of zfs directory 'purge' function that, *if necessary*, could be
run when the system is known to be idle (in the wee hours of the A.M.).

PS: I'm also thinking of a zfs filesystem layered on top of a RAM disk,
whether that RAM disk is a chunk of the server memory (ramdiskadm) or
something similar to a Gigabyte i-RAM (aka Gigabyte GC-RAMDISK), which
looks like an SATA drive to the OS.

PPS: I see a trend towards lower-cost RAM disks that will improve
rapidly, in terms of capacity and cost/performance, over the next few
years.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
On Tue, May 09, 2006 at 04:01:53PM -0500, Al Hopper wrote:
> Does this mean, that if I have a zfs filesystem that is
> creating/writing/reading/deleting millions of short-lived files in a day,
> that the directory area would keep growing? Or am I missing something?

The space used by a directory is not completely reclaimed until the
directory is removed with rmdir(2). However, its space usage is bounded
by its maximum number of entries. For example, if I have a single
directory with a million files in it, and I repeatedly remove all the
files and then fill it up with another million files, the space used by
the directory will fluctuate between 77MB and 41MB (the 41MB should be
about 8MB; see CR 6423695). The space used by files (and directories)
is completely reclaimed when they are deleted.

> Surely the ideal case, when a file is deleted, is that the corresponding
> directory space/time would be zero.

Agreed. As I mentioned, we haven't found this to be a problem in
practice. Almost all directories are trivially small, and the big ones
tend to stay big. However, if you find it to be a problem, let us know
the details of your situation, and we'll adjust the priority of this
accordingly.

> In the case of busy production systems, it would be great to have some
> sort of zfs directory 'purge' function, that, *if necessary*, could be
> run when the system is known to be idle (in the wee hours of the
> A.M.)?

If this turns out to be an issue, we will solve the problem
transparently (e.g. ZAP leaf block "joining" or automatic ZAP
rewriting) rather than introduce a knob for administrators to worry
about.

--matt
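[The bytes-per-entry figures in this thread are easy to sanity-check.
The sketch below just redoes the arithmetic from Matt's numbers (77MB
full, 41MB emptied, 1 million entries); since du rounds to whole
megabytes, the results land a little above his ~75 and ~40 bytes.]

```shell
#!/bin/sh
# Back-of-envelope bytes-per-entry from the numbers quoted above.
# du rounds to whole MB, so these come out slightly high.
ENTRIES=1000000
FULL_BYTES=$((77 * 1024 * 1024))    # directory with all entries present
EMPTY_BYTES=$((41 * 1024 * 1024))   # same directory after removing them
echo "full:  $((FULL_BYTES / ENTRIES)) bytes/entry"    # -> 80
echo "empty: $((EMPTY_BYTES / ENTRIES)) bytes/entry"   # -> 42
```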
On Tue, May 09, 2006 at 04:01:53PM -0500, Al Hopper wrote:
> > In the future, we may improve this by "joining" less-than-full ZAP
> > blocks, or by simply special-casing "empty again" or "small again"
> > directories and essentially automatically re-writing them in the most
> > compact form. So far we haven't found this to be a pressing performance
> > issue, so it isn't high on our priority list.
>
> Does this mean, that if I have a zfs filesystem that is
> creating/writing/reading/deleting millions of short-lived files in a day,
> that the directory area would keep growing? Or am I missing something?

No; it would just get as large as is needed to hold the maximum number
of files the directory has ever held. UFS does the same thing.

What he's proposing above is automatically shrinking the directory when
it gets empty or "much smaller", which would fix this as well.

Cheers,
- jonathan

-- 
Jonathan Adams, Solaris Kernel Development