thr3ads.net - zfs code - [zfs-code] the entry sequence returned by zfs

Zhigang Cui

2007-Jul-25 09:41 UTC

[zfs-code] the entry sequence returned by zfs_readdir()

Hi,
While testing NFSv4 on ZFS, I hit a "random" FAIL.

The test case does the following:
Client via nfsh
1. create dir on server
2. create file1 with short name under dir
3. create file2 with long name under dir
4. Call Readdir on dir with maxcount=96 and dircount=1024
5. Check the result, expect "OK"

But sometimes it returns TOOSMALL. If you are not familiar with NFS, that does
not matter. With DTrace, I got the following:
-> rfs4_op_readdir
...
  -> fop_readdir
    -> crgetmapped
    <- crgetmapped
    -> zfs_readdir
    <- zfs_readdir
  <- fop_readdir
  -> fop_rwunlock
    -> fs_rwunlock
    <- fs_rwunlock
  <- fop_rwunlock
  -> nfs4_readdir_getvp
   | nfs4_readdir_getvp:entry

Tracing arg1 of nfs4_readdir_getvp:entry shows that it is sometimes file1 and
sometimes file2. I expected it to always be file1, just like on UFS. I checked
the inode numbers of dir, file1 and file2, and they are always in ascending order.

Does anyone have any ideas about this?

Thanks,
--
This message posted from opensolaris.org

Eric Schrock

2007-Jul-25 16:01 UTC

The standards require that the ordering of a directory is consistent,
not that entries appear in the order they are added.  Under ZFS, the
order depends on the hashing algorithm (entry names + salt).  Even with
UFS, you are relying on implementation behavior with an empty directory,
since it uses a linear array of entries.  I imagine that if you created
a few entries, deleted old ones, and created new ones, you'd find the
ordering dependent on which files you deleted.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
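
The point about hash-dependent ordering can be illustrated with a toy model. This is only a sketch: the thread says the order depends on a hash of the entry names plus a salt, but it does not say which hash ZFS uses, so SHA-256 below is a stand-in, and `entry_order` and the salt values are hypothetical names chosen for illustration.

```python
import hashlib

def entry_order(names, salt):
    """Return names sorted by a salted hash of each name.

    Toy model only: SHA-256 is a stand-in, not the hash ZFS actually uses.
    """
    def key(name):
        digest = hashlib.sha256(salt + name.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(names, key=key)

names = ["file1", "file2_with_a_much_longer_name"]

# Same directory (same salt): the enumeration order is stable, and it does
# not depend on the order in which the entries were created.
salt_a = b"directory-A"
assert entry_order(names, salt_a) == entry_order(list(reversed(names)), salt_a)

# Different directories (different salts) may or may not agree on the order.
for salt in (b"directory-A", b"directory-B", b"directory-C", b"directory-D"):
    print(salt.decode(), entry_order(names, salt))
```

In this model, repeating the same creation steps in a fresh directory corresponds to picking a new salt, which is why the first entry returned can differ from run to run.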

Zhigang Cui

2007-Jul-26 03:21 UTC

Hi Eric,
Thanks for your reply.

What I actually expect is that the behavior is consistent for the user. In other
words, if a user repeats the same steps, he wants the same result. In this
case, the names of 'dir', 'file1' and 'file2' are fixed, the name of the
corresponding exported dir is fixed, and the order of the operations is fixed, too.

Is it determined by the 'salt'? If this behavior is intended for ZFS, maybe I
need to redesign the test case. :-(

Thanks,

Eric Schrock

2007-Jul-26 16:30 UTC

On Wed, Jul 25, 2007 at 08:21:47PM -0700, Zhigang Cui wrote:
> Hi Eric,
> Thanks for your reply.
>
> What I actually expect is that the behavior is consistent for the user. In
> other words, if a user repeats the same steps, he wants the same
> result. In this case, the names of 'dir', 'file1' and 'file2' are
> fixed, the name of the corresponding exported dir is fixed, and the order
> of the operations is fixed, too.

The order of enumeration is fixed for any given directory, but not
necessarily consistent across disparate directories.

> Is it determined by the 'salt'? If this behavior is intended for ZFS,
> maybe I need to redesign the test case. :-(

Yes, that is correct.  Your test case should be verifying that the
correct directory entries are there, but it should not be verifying the
order in any way.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
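
An order-independent check along the lines suggested above might look like the following. It is a minimal local sketch using `os.listdir` on a temporary directory rather than an NFSv4 Readdir call through nfsh; `check_entries` and the file names are hypothetical.

```python
import os
import tempfile

def check_entries(dir_path, expected_names):
    """Return True if dir_path contains exactly expected_names, in any order."""
    actual = os.listdir(dir_path)  # order of the returned names is unspecified
    return sorted(actual) == sorted(expected_names)

with tempfile.TemporaryDirectory() as d:
    for name in ("file1", "file2_with_a_much_longer_name"):
        open(os.path.join(d, name), "w").close()
    # The expected list is given in a different order on purpose.
    ok = check_entries(d, ["file2_with_a_much_longer_name", "file1"])
    print("OK" if ok else "FAIL")  # prints "OK"
```

Sorting both sides (or comparing sets) keeps the test valid on any file system, whatever order its readdir implementation happens to produce.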

Zhigang Cui

2007-Jul-27 01:27 UTC

> The order of enumeration is fixed for any given directory, but not
> necessarily consistent across disparate directories.
>
> > Is it determined by the 'salt'? If such result is desired for ZFS,
> > maybe I need to redesign the test case. :-(
>
> Yes, that is correct.  Your test case should be
> verifying that the correct directory entries are there, but it should
> not be verifying the order in any way.

The current test strategy is to choose the lengths of the names of file1 and
file2 deliberately, and to set maxcount deliberately to a value that is large
enough to hold the file1 name but not large enough for the file2 name.

Eric, thanks for your answer.
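
Why this strategy makes the result depend on entry order can be sketched with a toy model. It assumes that a Readdir reply packs entries in enumeration order and that TOOSMALL is returned when not even the first entry fits in maxcount. `ENTRY_OVERHEAD`, the 80-character name, and `readdir_reply` are hypothetical; real reply sizes depend on the XDR encoding and the requested attributes.

```python
ENTRY_OVERHEAD = 32  # hypothetical fixed per-entry cost in bytes, not the real XDR size

def readdir_reply(ordered_names, maxcount):
    """Toy model: pack entries in order until maxcount is exhausted.

    Returns ("TOOSMALL", []) when not even the first entry fits.
    """
    packed, used = [], 0
    for name in ordered_names:
        size = ENTRY_OVERHEAD + len(name)
        if used + size > maxcount:
            break
        packed.append(name)
        used += size
    if not packed and ordered_names:
        return "TOOSMALL", []
    return "OK", packed

short_name, long_name = "file1", "f" * 80

# Short entry enumerated first: it fits, so the reply is OK.
print(readdir_reply([short_name, long_name], maxcount=96))
# Long entry enumerated first: nothing fits, so the reply is TOOSMALL.
print(readdir_reply([long_name, short_name], maxcount=96))
```

Under these assumptions the same two files and the same maxcount give either result, depending only on which entry the file system returns first.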
