Due to legacy constraints, I have a rather complicated system that is currently using Sun QFS (actually the SAM portion of it). For a number of reasons, I'd like to look at moving to ZFS, but would like a "sanity check" to make sure ZFS is suitable for this application.

First of all, we are NOT using the cluster capabilities of SAMFS. Instead, we're using it as a way of dealing with one directory that contains approximately 100,000 entries.

The question is this: I know from the specs that ZFS can handle a directory with this many entries, but what I'm actually wondering is how directory lookups are handled. That is, if I do a "cd foo050000" in a directory containing foo000001 through foo100000, will the filesystem have to scan through all the directory contents to find foo050000, or does it use a btree or something to handle this?

This directory is, in turn, shared out over NFS. Are there any issues I should be aware of with this sort of installation?

Thanks for any advice or input!

Patrick Narkinsky
Sr. System Engineer
EDS

This message posted from opensolaris.org
ZFS actually uses the ZAP (ZFS Attribute Processor) to handle directory lookups. The ZAP is not a btree but a specialized hash table: a hash for each directory entry is generated from that entry's name. Hence you won't be doing any sort of linear search through the entire directory for a file; a hash is computed from the file name and that hash is looked up in the ZAP. This is nice and speedy, even with 100,000 files in a directory.

Noel

On Aug 24, 2006, at 8:02 AM, Patrick Narkinsky wrote:

> The question is this: I know from the specs that ZFS can handle a
> directory with this many entries, but what I'm actually wondering
> is how directory lookups are handled? That is, if I do a "cd
> foo050000" in a directory with foo000001 through foo100000, will
> the filesystem have to scroll through all the directory contents to
> find foo050000, or does it use a btree or something to handle this?
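The access pattern Noel describes can be seen from the shell. This is a hypothetical, scaled-down sketch (1,000 entries rather than 100,000, generic temp paths) and is not ZFS-specific code; it just demonstrates that opening an entry by name is a single lookup and never requires reading the whole directory:

```shell
#!/bin/sh
# Sketch only: create a directory with many entries, then look one up
# directly by name. On ZFS the name is hashed and found via the ZAP, so
# the lookup cost does not scale with directory size. The 1,000-entry
# count here is scaled down from the 100,000 in the question.
d=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    : > "$d/foo$(printf '%06d' "$i")"
    i=$((i + 1))
done
# Direct lookup by name -- one lookup call, no directory scan:
ls -d "$d/foo000500"
rm -rf "$d"
```

The same pattern is what "cd foo050000" triggers: the shell asks the kernel to look up one name, not to enumerate the directory.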
On Thu, Aug 24, 2006 at 10:46:27AM -0700, Noel Dellofano wrote:

> ZFS actually uses the ZAP to handle directory lookups. [...]
> This is nice and speedy, even with 100,000 files in a directory.

I just tried creating 150,000 directories in a ZFS root directory. It was speedy. Listing individual directories (lookup) is fast. Listing the large directory isn't, but that turns out to be either terminal I/O or collation in a UTF-8 locale (which is what I use; a simple DTrace script showed that to be my problem):

% ptime ls
...
real  9.850
user  6.263   <- ouch, UTF-8 hurts
sys   0.245
%
% LC_ALL=C ptime ls
...
real  4.112   <- terminal I/O hurts -- I should redirect to /dev/null
user  0.682
sys   0.252
%
% LC_ALL=C ptime ls > /dev/null
real  0.793
user  0.608
sys   0.184   <- not bad
%
% LC_ALL=C ptime sh -c 'echo *'
...
real  1.357   <- more compact output than ls(1)
user  0.970
sys   0.090
%
% ptime ls -a x12345
.  ..
real  0.022
user  0.000
sys   0.002
%

Awesome.

Nico
--
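Nico's experiment is easy to reproduce at smaller scale. The sketch below assumes a POSIX shell and scales down to 1,000 subdirectories; ptime(1) is Solaris-specific, so the timing wrappers are noted in comments rather than run, and the timings in the post are from Nico's machine, not this script:

```shell
#!/bin/sh
# Scaled-down sketch of the experiment above: 1,000 subdirectories instead
# of 150,000. On Solaris, prefix each listing command with ptime(1) to
# reproduce the measurements.
base=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    mkdir "$base/x$i"
    i=$((i + 1))
done
# The cheap variant Nico ended up with: C locale (no UTF-8 collation)
# and output redirected away from the terminal.
LC_ALL=C ls "$base" > /dev/null
# The still-cheaper 'echo *' variant:
( cd "$base" && LC_ALL=C sh -c 'echo *' > /dev/null )
# A single lookup stays fast regardless of directory size:
ls -d "$base/x500"
rm -rf "$base"
```

The interesting point survives the scaling: the cost Nico saw was in sorting and printing the listing, not in the filesystem's lookups.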
On Thu, Aug 24, 2006 at 01:15:51PM -0500, Nicolas Williams wrote:

> I just tried creating 150,000 directories in a ZFS root directory. It
> was speedy. Listing individual directories (lookup) is fast.

Glad to hear that it's working well for you!

> Listing the large directory isn't, but that turns out to be either
> terminal I/O or collation in a UTF-8 locale (which is what I use; a
> simple DTrace script showed that to be my problem):
>
> % ptime ls
> ...
>
> real  9.850
> user  6.263   <- ouch, UTF-8 hurts
> sys   0.245

Yep, beware of using 'ls' on large directories!

See also:
6299769 'ls' memory usage is excessive
6299767 'ls -f' should not buffer output

--matt
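A hedged sketch of the usual workarounds when ls struggles with a huge directory. The `-f` flag (skip sorting) is standard in current ls(1) implementations, though whether it streams or buffers output is exactly what the second bug Matt cites was about; the entry counts and paths below are illustrative:

```shell
#!/bin/sh
# Illustrative workarounds, scaled to 500 entries. '-f' disables sorting;
# C-locale globbing avoids UTF-8 collation entirely.
d=$(mktemp -d)
i=1
while [ "$i" -le 500 ]; do
    : > "$d/f$i"
    i=$((i + 1))
done
# Unsorted listing. Note that -f typically implies -a, so '.' and '..'
# may be included in the count on many systems:
ls -f "$d" | wc -l
# Counting entries without spawning ls at all, in the C locale:
( cd "$d" && LC_ALL=C sh -c 'set -- *; echo $#' )
rm -rf "$d"
```

For scripts that only need the names, the glob-based variant avoids both the sort and the per-entry stat that some ls options trigger.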