Hi,

When testing NFSv4 on ZFS, I hit a 'random' FAIL.

The test case does the following:
Client via nfsh
1. Create dir on the server
2. Create file1 with a short name under dir
3. Create file2 with a long name under dir
4. Call Readdir on dir with maxcount=96 and dircount=1024
5. Check the result, expecting "OK"

But sometimes it returns TOOSMALL. If you are not familiar with NFS, that does not matter. With DTrace, I got the following:

-> rfs4_op_readdir
   ...
  -> fop_readdir
    -> crgetmapped
    <- crgetmapped
    -> zfs_readdir
    <- zfs_readdir
  <- fop_readdir
  -> fop_rwunlock
    -> fs_rwunlock
    <- fs_rwunlock
  <- fop_rwunlock
  -> nfs4_readdir_getvp
   | nfs4_readdir_getvp:entry

Tracing arg1 of nfs4_readdir_getvp:entry shows that it is sometimes file1 and sometimes file2. I expected it to always be file1, just as with UFS. I checked the inodes of dir, file1 and file2; they are always in ascending order.

Does anyone have any ideas about this?

Thanks,
--
This message posted from opensolaris.org
Eric Schrock
2007-Jul-25 16:01 UTC
[zfs-code] the entry sequence returned by zfs_readdir()
The standards require that the ordering of a directory is consistent, not that entries appear in the order they are added. Under ZFS, the order depends on the hashing algorithm (entry names + salt). Even with UFS, you are relying on implementation behavior with an empty directory, since it uses a linear array of entries. I imagine that if you created a few entries, deleted old ones, and created new ones, you'd find the ordering dependent on which files you deleted.

- Eric

On Wed, Jul 25, 2007 at 02:41:05AM -0700, Zhigang Cui wrote:
> [...]
> Trace arg1 of nfs4_readdir_getvp:entry, it shows sometimes it is file1,
> sometimes it is file2. I expect it is always file1 just like in UFS.
> [...]
> _______________________________________________
> zfs-code mailing list
> zfs-code at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-code

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
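The point about hash-based ordering can be illustrated with a toy model. The sketch below is hypothetical: it is not the ZFS on-disk algorithm, and the function name toy_readdir, the use of SHA-256, and the idea of passing a per-directory salt as a byte string are all assumptions made for illustration. It only shows that an order derived from hash(salt + name) is stable for a given salt, yet unrelated to creation order and free to differ when the salt differs.

```python
import hashlib


def toy_readdir(names, salt):
    """Return names ordered by a salted hash of each name.

    Hypothetical stand-in for a hashed directory; NOT the real ZFS code.
    """
    def key(name):
        digest = hashlib.sha256(salt + name.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(names, key=key)


# Illustrative names only; the real test's file names are not given.
names = ["file1", "file2_with_a_much_longer_name"]

# Same salt: the enumeration order does not depend on creation order.
salt_a = b"directory-A"
assert toy_readdir(names, salt_a) == toy_readdir(list(reversed(names)), salt_a)

# Different salts: collect every distinct order that shows up.
orders = {tuple(toy_readdir(names, bytes([s]))) for s in range(64)}
print(len(orders))
```

If the salt changes each time the directory is recreated, a test that repeats "the same steps" can see either entry first, which matches the intermittent behavior reported in the thread.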
Hi Eric,

Thanks for your reply.

In fact, what I expect is behavior that is consistent for the user. In other words, a user who repeats the same steps wants the same result. In the given case, the names of 'dir', 'file1' and 'file2' are fixed, the name of the corresponding exported dir is fixed, and the order of the operations is fixed, too.

Is it determined by the 'salt'? If such a result is desired for ZFS, maybe I need to redesign the test case. :-(

Thanks,
Eric Schrock
2007-Jul-26 16:30 UTC
[zfs-code] the entry sequence returned by zfs_readdir()
On Wed, Jul 25, 2007 at 08:21:47PM -0700, Zhigang Cui wrote:
> Hi Eric,
> Thanks for you reply.
>
> In fact, what I expect is the behavior is consistent for the user. In
> other words, if a user repeats the same steps, he wants the same
> result. In the given case, The names of 'dir', 'file1' and 'file2' are
> fixed, the name of corresponding exported dir is fixed, the order of
> the operations is fixed, too.

The order of enumeration is fixed for any given directory, but not necessarily consistent across disparate directories.

> Is it determined by the 'salt'? If such result is desired for ZFS,
> maybe I need to redesign the test case. :-(

Yes, that is correct. Your test case should be verifying that the correct directory entries are there, but it should not be verifying the order in any way.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
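The advice to verify the entries but not their order could look roughly like the following in a test harness. This is a minimal sketch: check_entries and the file names are hypothetical and do not come from the actual NFSv4 test suite.

```python
def check_entries(returned, expected):
    """Pass when exactly the expected names are present, in any order."""
    returned = list(returned)
    # Reject duplicates explicitly; a plain set comparison would hide them.
    if len(returned) != len(set(returned)):
        return False
    return set(returned) == set(expected)


# Illustrative names only.
expected = ["file1", "file2_long_name"]

assert check_entries(["file1", "file2_long_name"], expected)
assert check_entries(["file2_long_name", "file1"], expected)  # order ignored
assert not check_entries(["file1"], expected)                  # missing entry
```

Comparing as sets keeps the check valid on any filesystem whose enumeration order is consistent but otherwise unspecified.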
> The order of enumeration is fixed for any given directory, but not
> necessarily consistent across disparate directories.
>
> > Is it determined by the 'salt'? If such result is desired for ZFS,
> > maybe I need to redesign the test case. :-(
>
> Yes, that is correct. Your test case should be
> verifying that the correct directory entries are there, but it should
> not be verifying the order in any way.

The current test strategy deliberately chooses the lengths of the names of file1 and file2, and deliberately sets maxcount to a value that is large enough to contain the name of file1 but not the name of file2.

Eric, thanks for your answer.
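A toy model shows why this strategy is sensitive to enumeration order. In NFSv4, READDIR fails with NFS4ERR_TOOSMALL when maxcount cannot hold even one directory entry, so the outcome depends on which entry the filesystem returns first. The sketch below is hypothetical: toy_readdir_status, the 16-byte overhead, and the size formula are invented for illustration and are not the real XDR encoding; the file names are placeholders.

```python
def toy_readdir_status(ordered_names, maxcount, overhead=16):
    """Toy model: the reply must have room for at least the first entry.

    `overhead` and the size formula are made-up numbers, not real NFSv4 sizes.
    """
    if not ordered_names:
        return "OK"
    first_size = overhead + len(ordered_names[0])
    return "OK" if first_size <= maxcount else "TOOSMALL"


short_name = "f1"        # placeholder for file1 (short name)
long_name = "f" * 200    # placeholder for file2 (long name)
maxcount = 96            # value used by the test case

print(toy_readdir_status([short_name, long_name], maxcount))  # prints "OK"
print(toy_readdir_status([long_name, short_name], maxcount))  # prints "TOOSMALL"
```

Under this model the test passes only when the short name is enumerated first, which is an ordering assumption the filesystem does not guarantee.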