Hi there,

I Googled around and checked the PRs and wasn't successful in finding any reports of what I'm seeing. I'm hoping someone here can help me debug what's going on.

On my FreeBSD 8.2-S machine (built circa 12th June), I created a directory and populated it over the course of 3 weeks with about 2 million individual files. As you might imagine, an 'ls' of this directory took quite some time.

The files were conveniently named with a timestamp in the filename (still images from a security camera, one per second), so I've since moved them all to timestamped directories (yyyy/MM/dd/hh/mm). What I found, though, is that the original directory the images were in is still very slow to ls -- and it only has 1 file in it, another directory.

To clarify:

% ls second
[lots of time and many many files enumerated]
% # rename files using rename script
% ls second
[wait ages]
2011    dead
% mkdir second2 && mv second/2011 second2
% ls second2
[fast!]
2011
% ls second
[still very slow]
dead
% time ls second
dead/
gls -F --color  0.00s user 1.56s system 0% cpu 3:09.61 total

(timings are similar for /bin/ls)

This data is stored on a striped ZFS pool, 2T in size (pool version 15, though the kernel reports version 28 is available -- zpool upgrade seems to disagree). I've run zpool scrub with no effect. ZFS is busily driving the disks; my iostat monitoring shows all three drives in the zpool running at 40-60% busy for the duration of the ls (they were quiet before).

I've attached truss to the ls process. It spends a lot of time here:

fstatfs(0x5,0x7fffffffe0d0,0x800ad5548,0x7fffffffdfd8,0x0,0x0) = 0 (0x0)

I'm thinking there's some old ZFS metadata that it's looking into, but I'm not sure how best to dig into this to understand what's going on under the hood.

Can anyone perhaps point me in the right direction on this?

Thanks,

Sean
Not an in-depth solution for ZFS, but maybe a solution for you:

mkdir images2
mv images/* images2
rmdir images

Ronald.

On Tue, 02 Aug 2011 09:39:03 +0200, seanrees@gmail.com <seanrees@gmail.com> wrote:
> Hi there,
>
> [...]
>
> The files were conveniently named with a timestamp in the filename
> (still images from a security camera, once per second) so I've since
> moved them all to timestamped directories (yyyy/MM/dd/hh/mm). What I
> found though was the original directory the images were in is still
> very slow to ls -- and it only has 1 file in it, another directory.
>
> [...]
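Ronald's recreate-the-directory workaround can be sketched end to end; the paths and file names below are illustrative (a throwaway tree under mktemp), not the ones from the thread:

```shell
# Sketch of the recreate-the-directory workaround. Instead of emptying the
# bloated directory in place, move its contents to a fresh directory and
# remove the old directory object entirely, then rename the new one back.
root=$(mktemp -d)

mkdir "$root/images"
touch "$root/images/a.jpg" "$root/images/b.jpg"

mkdir "$root/images2"
mv "$root/images"/* "$root/images2"/
rmdir "$root/images"             # the old directory (and its metadata) goes away
mv "$root/images2" "$root/images"

count=$(ls "$root/images" | wc -l | tr -d ' ')
echo "files after swap: $count"
rm -rf "$root"
```

The point of the swap is that a brand-new directory object carries none of the history of the old one, whatever the filesystem left behind.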
On Tue, Aug 02, 2011 at 08:39:03AM +0100, seanrees@gmail.com wrote:
> On my FreeBSD 8.2-S machine (built circa 12th June), I created a
> directory and populated it over the course of 3 weeks with about 2
> million individual files.

I'll keep this real simple: Why did you do this? I hope this was a stress test of some kind.

If not: This is the 2nd or 3rd mail in recent months from people saying "I decided to do something utterly stupid with my filesystem[1] and now I'm asking why performance sucks". Why can people not create proper directory tree layouts to avoid this problem regardless of what filesystem is used? I just don't get it.

[1]: Applies to any filesystem, not just ZFS. There was a UFS one a month or two ago too...

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US  |
| Making life hard for others since 1977.               PGP 4BD6C0CB  |
On Tue, Aug 2, 2011 at 9:39 AM, seanrees@gmail.com <seanrees@gmail.com> wrote:
> On my FreeBSD 8.2-S machine (built circa 12th June), I created a
> directory and populated it over the course of 3 weeks with about 2
> million individual files. As you might imagine, a 'ls' of this
> directory took quite some time.

What actually takes time here isn't ZFS, but the sorting done by ls(1). Usually, running ls(1) with -f (output is not sorted) speeds things up enormously.

> The files were conveniently named with a timestamp in the filename
> (still images from a security camera, once per second) so I've since
> moved them all to timestamped directories (yyyy/MM/dd/hh/mm). What I
> found though was the original directory the images were in is still
> very slow to ls -- and it only has 1 file in it, another directory.

That is strange... and shouldn't happen. According to the ZFS Performance Wiki [1], directory operations in ZFS are supposed to be pretty efficient:

  Concurrent, constant-time directory operations. Large directories need
  constant-time operations (lookup, create, delete, etc.); hot directories
  need concurrent operations. ZFS uses extensible hashing to solve this:
  block-based, with amortized growth cost, short chains for constant-time
  operations, and per-block locking for high concurrency. A caveat is that
  readdir returns entries in hash order.

  Directories are implemented via the ZFS Attribute Processor (ZAP), which
  can store arbitrary name-value pairs. ZAP uses two algorithms, one
  optimized for large lists (large directories) and one for small lists
  (attribute lists); the implementation is in zap.c and zap_leaf.c. Each
  directory is maintained as a table of pointers to constant-sized buckets
  holding a variable number of entries. Each directory record is 16k in
  size; when a block gets full, a new block of the next power-of-two size
  is allocated.

  A directory starts off as a microzap and is upgraded to a fat zap (via
  mzap_upgrade) if the size of a name exceeds MZAP_NAME_LEN
  (MZAP_ENT_LEN - 8 - 4 - 2, i.e. 50) or if the size of the microzap
  exceeds MZAP_MAX_BLKSZ (128k).

[1]: http://www.solarisinternals.com/wiki/index.php/ZFS_Performance

I don't know what's going on there, but someone with ZFS internals expertise may want to have a closer look.

> To clarify:
> % ls second
> [lots of time and many many files enumerated]
> [...]
> % time ls second
> dead/
> gls -F --color  0.00s user 1.56s system 0% cpu 3:09.61 total

> I've attached truss to the ls process. It spends a lot of time here:
> fstatfs(0x5,0x7fffffffe0d0,0x800ad5548,0x7fffffffdfd8,0x0,0x0) = 0 (0x0)

That's a very good hint indeed!

> I'm thinking there's some old ZFS metadata that it's looking into, but
> I'm not sure how to best dig into this to understand what's going on
> under the hood.

Regards,
-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/
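The -f suggestion above is easy to check for yourself. A minimal sketch (the directory size here is arbitrary and deliberately small; with the original ~2 million entries the sorting cost is what you would actually feel):

```shell
# ls(1) sorts its output by default; -f lists entries in readdir order with
# no sorting at all (and, like -a, it also shows '.' and '..').
# Same entries either way -- only the ordering work differs.
d=$(mktemp -d)
i=0
while [ $i -lt 2000 ]; do
    : > "$d/f$i"
    i=$((i+1))
done

default_list=$(ls "$d")                              # sorted (the default)
raw_list=$(ls -f "$d" | grep -Ev '^\.{1,2}$' | sort) # unsorted, re-sorted here
                                                     # just to compare contents

if [ "$default_list" = "$raw_list" ]; then
    echo "same entries; -f only skips the sort"
fi
rm -rf "$d"
```

For a quick timing comparison on a real large directory, prefix each listing with time, e.g. `time ls -f bigdir > /dev/null` versus `time ls bigdir > /dev/null`.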
On 2011-Aug-02 08:39:03 +0100, "seanrees@gmail.com" <seanrees@gmail.com> wrote:
> The files were conveniently named with a timestamp in the filename
> (still images from a security camera, once per second) so I've since
> moved them all to timestamped directories (yyyy/MM/dd/hh/mm). What I
> found though was the original directory the images were in is still
> very slow to ls -- and it only has 1 file in it, another directory.

I've also seen this behaviour on Solaris 10 after cleaning out a directory with a large number of files (though not as pathological as your case). I tried creating and deleting entries in an unsuccessful effort to trigger directory compaction, and wound up moving the remaining contents into a new directory, deleting the original one, and renaming the new directory into its place. It would appear to be a garbage-collection bug in ZFS.

On 2011-Aug-02 13:10:27 +0300, Daniel Kalchev <daniel@digsys.bg> wrote:
> On 02.08.11 12:46, Daniel O'Connor wrote:
>> I am pretty sure UFS does not have this problem. i.e. once you
>> delete/move the files out of the directory its performance would be
>> good again.
>
> UFS would be the classic example of poor performance if you do this.

Traditional UFS (including Solaris) behaves badly in this scenario, but 4.4BSD derivatives will release unused space at the end of a directory and have smarts to skip unused entries at the start of a directory more efficiently.

-- 
Peter Jeremy
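Whether a given filesystem compacts a directory after a mass delete is something you can probe empirically. A hedged sketch (results are filesystem-dependent: many filesystems never shrink the directory's own size in place, which is the behaviour discussed above; the sizes printed are whatever stat reports for the directory object itself):

```shell
# Populate a directory, delete everything in it, and compare the directory's
# own reported size before and after. A directory that stays large after
# being emptied is showing exactly the no-compaction behaviour from the thread.
d=$(mktemp -d)
mkdir "$d/big"
i=0
while [ $i -lt 5000 ]; do
    : > "$d/big/f$i"
    i=$((i+1))
done

# stat -c %s is the GNU spelling; stat -f %z is the BSD one.
populated=$(stat -c %s "$d/big" 2>/dev/null || stat -f %z "$d/big")
rm -f "$d/big"/f*
emptied=$(stat -c %s "$d/big" 2>/dev/null || stat -f %z "$d/big")

echo "populated=$populated emptied=$emptied"
rm -rf "$d"
```

If emptied stays close to populated, the directory kept its grown footprint, which is when the recreate-and-rename workaround earlier in the thread pays off.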
On 8/2/11 9:39 AM, seanrees@gmail.com wrote:
> [...]
>
> The files were conveniently named with a timestamp in the filename
> (still images from a security camera, once per second) so I've since
> moved them all to timestamped directories (yyyy/MM/dd/hh/mm).
>
> [...]

While not addressing your original question, which many people already have, I'll toss in the following: I do hope you've disabled access times on your ZFS dataset?

zfs set atime=off YOUR_DATASET/supercamera/captures