Why does ZFS report such small directory sizes? For example, take a maildir directory with ten entries:

total 2385
drwx------   8 17121 vmail       10 Jun  8 23:50 .
drwx--x--x  14 root  root        14 May 12  2006 ..
drwx------   5 17121 vmail        5 May 25 18:16 .Trash
drwx------   5 17121 staff        6 Jun  9 00:01 .testing
-rw-------   1 17121 staff        0 Jun  8 18:30 .uidvalidity
drwx------   2 17121 root         2 Jun  6 19:33 courierimapkeywords
-rw-r--r--   1 17121 root    219951 Jun  8 18:29 courierimapuiddb
drwx------   2 17121 vmail     6144 Jun  9 09:59 cur
drwx------   2 17121 vmail        3 Jun  9 12:09 new
drwx------   2 17121 vmail        2 Jun  9 12:09 tmp

Note how ".", this directory, is only 10 bytes long. Only one byte per directory entry? This confuses programs that assume that the st_size reported for a directory is a multiple of sizeof(struct dirent) bytes.

This message posted from opensolaris.org
Joerg Schilling
2007-Jun-09 17:18 UTC
[zfs-discuss] zfs reports small st_size for directories?
Ed Ravin <eravin at panix.com> wrote:

> Why does ZFS report such small directory sizes? For example, take a maildir directory with ten entries:
>
> total 2385
> drwx------   8 17121 vmail       10 Jun  8 23:50 .
> drwx--x--x  14 root  root        14 May 12  2006 ..
> drwx------   5 17121 vmail        5 May 25 18:16 .Trash
> drwx------   5 17121 staff        6 Jun  9 00:01 .testing
> -rw-------   1 17121 staff        0 Jun  8 18:30 .uidvalidity
> drwx------   2 17121 root         2 Jun  6 19:33 courierimapkeywords
> -rw-r--r--   1 17121 root    219951 Jun  8 18:29 courierimapuiddb
> drwx------   2 17121 vmail     6144 Jun  9 09:59 cur
> drwx------   2 17121 vmail        3 Jun  9 12:09 new
> drwx------   2 17121 vmail        2 Jun  9 12:09 tmp
>
> Note how ".", this directory, is only 10 bytes long.

It is 10 entries long. The POSIX standard would even allow returning 0 for all directories.

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       schilling at fokus.fraunhofer.de (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Ed Ravin
2007-Jun-09 18:20 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in ZFS. Any word on when the fix will be out?
Casper.Dik at Sun.COM
2007-Jun-09 20:16 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
>ZFS. Any word on when the fix will be out?

It's a bug in scandir (obviously) and it is filed as such.

Does scandir fail on zfs because of this or does scandir need to reallocate and does it use the size as first order estimate?

Casper
Joerg Schilling
2007-Jun-09 20:19 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Casper.Dik at Sun.COM wrote:

> >Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
> >ZFS. Any word on when the fix will be out?
>
> It's a bug in scandir (obviously) and it is filed as such.

A very old bug. I fixed it for a Berthold AG customer in 1992, when Novell Netware started to support NFS and reported a 512-byte directory size for all dirs ;-)

Jörg
Eric Schrock
2007-Jun-09 20:20 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Sat, Jun 09, 2007 at 10:16:34PM +0200, Casper.Dik at sun.com wrote:
>
> >Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
> >ZFS. Any word on when the fix will be out?
>
> It's a bug in scandir (obviously) and it is filed as such.
>
> Does scandir fail on zfs because of this or does scandir need to
> reallocate and does it use the size as first order estimate?

The bug is actually wrong - scandir() does work. It only uses the st_size as an estimate of directory size, but then scales the array as necessary. Apart from an odd malloc(0) and the poor linear scaling algorithm, the function works fine with ZFS.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
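
A minimal sketch of the approach Eric describes: treat st_size (or any other guess) purely as a starting capacity and grow the array as entries arrive. This is not the Solaris libc source. The names collect_names and free_names are hypothetical, and doubling the capacity is an assumed growth policy, used here in place of the linear scaling he criticises.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Free a name list produced by collect_names() (hypothetical helper). */
void free_names(char **names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(names[i]);
    free(names);
}

/*
 * Read every entry name of `path` into a malloc'ed array stored in *out.
 * `hint` is only a starting capacity (for example, a value derived from
 * st_size); it is never trusted as an upper bound.
 * Returns the number of entries, or -1 on error.
 */
long collect_names(const char *path, size_t hint, char ***out)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    size_t cap = hint > 0 ? hint : 16;   /* avoid malloc(0) for a zero hint */
    size_t count = 0;
    char **names = malloc(cap * sizeof *names);
    if (names == NULL) {
        closedir(dir);
        return -1;
    }

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (count == cap) {
            /* The hint was too small: double the array and carry on. */
            char **bigger = realloc(names, cap * 2 * sizeof *names);
            if (bigger == NULL)
                goto fail;
            names = bigger;
            cap *= 2;
        }
        size_t len = strlen(ent->d_name) + 1;
        char *copy = malloc(len);
        if (copy == NULL)
            goto fail;
        memcpy(copy, ent->d_name, len);
        names[count++] = copy;
    }

    closedir(dir);
    *out = names;
    return (long)count;

fail:
    free_names(names, count);
    closedir(dir);
    return -1;
}
```

With this shape, a hint of 0, 1 or 1000 should yield the same entry count for the same directory; the hint only changes how often realloc() runs.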
Ed Ravin
2007-Jun-09 20:56 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Sat, Jun 09, 2007 at 10:16:34PM +0200, Casper.Dik at Sun.COM wrote:
>
> >Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
> >ZFS. Any word on when the fix will be out?
>
> It's a bug in scandir (obviously) and it is filed as such.
>
> Does scandir fail on zfs because of this or does scandir need to
> reallocate and does it use the size as first order estimate?

I encountered the problem in NetBSD's scandir(), when reading off a Solaris NFS fileserver with ZFS filesystems. I've already filed a bug report with NetBSD. They were using the st_size, divided by 24, to determine how much memory to allocate with malloc() before reading in the directory entries. All without any sanity checking.

I've found other programs that make similar assumptions, including the hard-coding of "24" instead of "sizeof dirent".

What was the reason to make ZFS use directory sizes as the number of entries rather than the way other Unix filesystems use it?
Eric Schrock
2007-Jun-09 21:01 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Sat, Jun 09, 2007 at 01:56:35PM -0700, Ed Ravin wrote:
>
> I encountered the problem in NetBSD's scandir(), when reading off
> a Solaris NFS fileserver with ZFS filesystems. I've already filed a
> bug report with NetBSD. They were using the st_size, divided by 24, to
> determine how much memory to allocate with malloc() before reading in
> the directory entries. All without any sanity checking.

Ah, so the original bug should never have been filed against our scandir(3c), which is resilient to this type of failure.

> I've found other programs that make similar assumptions, including
> the hard-coding of "24" instead of "sizeof dirent".

Yikes. Even on a 'normal' filesystem, what happens if entries are added to a directory in the middle of such an operation?

> What was the reason to make ZFS use directory sizes as the number of
> entries rather than the way other Unix filesystems use it?

I seem to recall some discussion about it, but maybe someone else on the team has a better memory than me ;-)

- Eric
Ed Ravin
2007-Jun-09 21:44 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Sat, Jun 09, 2007 at 02:01:35PM -0700, Eric Schrock wrote:
> On Sat, Jun 09, 2007 at 01:56:35PM -0700, Ed Ravin wrote:
> >
> > I encountered the problem in NetBSD's scandir(), when reading off
> > a Solaris NFS fileserver with ZFS filesystems. I've already filed a
> > bug report with NetBSD. They were using the st_size, divided by 24, to
> > determine how much memory to allocate with malloc() before reading in
> > the directory entries. All without any sanity checking.
...
> Yikes. Even on a 'normal' filesystem, what happens if entries are added
> to a directory in the middle of such an operation?

Actually, there is some sanity checking in the NetBSD scandir() after the first block is allocated, but because of their base assumption that st_size is always (number_of_dir_entries * 24) it never allocates enough memory when talking to a ZFS filesystem.

-- Ed
Jeff Bonwick
2007-Jun-09 22:54 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
> What was the reason to make ZFS use directory sizes as the number of
> entries rather than the way other Unix filesystems use it?

In UFS, the st_size is the size of the directory inode as though it were a file. The only reason it's like that is that UFS is sloppy and lets you cat directories -- a fine way to screw up your terminal settings, but otherwise not terribly useful. For reads (rather than readdirs) of a directory to work, st_size has to be this way.

With ZFS, we decided to enforce file vs. directory semantics -- no read(2) of directories, no directory hard links (even as root), etc. What, then, should we return for st_size? We figured the number of entries would be the most useful piece of information for a sysadmin.

Jeff
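
Since the meaning of st_size for a directory differs between filesystems, a program that needs the number of entries can count them itself. A minimal sketch, assuming only the POSIX opendir()/readdir() interface; the name count_entries is hypothetical, and whether the count includes "." and ".." depends on what readdir() returns on the system in question.

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries readdir() returns for `path`, or return -1 on error. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long n = 0;
    while (readdir(dir) != NULL)
        n++;                      /* one increment per directory entry */

    closedir(dir);
    return n;
}
```

This costs a full pass over the directory, but it does not depend on how any particular filesystem chooses to fill in st_size.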
Joerg Schilling
2007-Jun-10 08:46 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Jeff Bonwick <Jeff.Bonwick at sun.com> wrote:

> > What was the reason to make ZFS use directory sizes as the number of
> > entries rather than the way other Unix filesystems use it?
>
> In UFS, the st_size is the size of the directory inode as though it
> were a file. The only reason it's like that is that UFS is sloppy

There are many more design flaws with directories in traditional UNIX filesystems, e.g. the explicit existence of "." and ".." and the fact that these are hardlinked directories.

> and lets you cat directories -- a fine way to screw up your terminal
> settings, but otherwise not terribly useful. For reads (rather than

If you believe this is the main problem, you should try to find a solution for the problems you get with "cat /kernel/fs/zfs" ;-)

Jörg
Casper.Dik at Sun.COM
2007-Jun-10 09:26 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>What was the reason to make ZFS use directory sizes as the number of
>entries rather than the way other Unix filesystems use it? I fear that
>several more of the 700 open source packages we've ported to our hosts
>are going to exhibit this problem.

It's a choice as good as any. The scandir implementation is wrong for a variety of reasons; the foremost being that it assumes that it is operating on a directory that is part of a *local* *ufs* filesystem. Both those assumptions are false. (Other filesystems may store different dirents (larger, smaller) or store them in a manner which is not directly related to size.)

But while it's clear we can easily fix Solaris' Nevada/10 scandir, this is not the case for all NFS clients as we don't control them.

Casper
Casper.Dik at sun.com
2007-Jun-10 10:22 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>On Sat, Jun 09, 2007 at 10:16:34PM +0200, Casper.Dik at sun.com wrote:
>>
>> >Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
>> >ZFS. Any word on when the fix will be out?
>>
>> It's a bug in scandir (obviously) and it is filed as such.
>>
>> Does scandir fail on zfs because of this or does scandir need to
>> reallocate and does it use the size as first order estimate?
>
>The bug is actually wrong - scandir() does work. It only uses the
>st_size as an estimate of directory size, but then scales the array as
>necessary. Apart from an odd malloc(0) and the poor linear scaling
>algorithm, the function works fine with ZFS.

Ah, so it's not really a bug. Is this common for all scandir()s?

Casper
Joerg Schilling
2007-Jun-10 10:40 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Casper.Dik at sun.com wrote:
>
> >On Sat, Jun 09, 2007 at 10:16:34PM +0200, Casper.Dik at sun.com wrote:
> >>
> >> >Oh, I see, this is bug 6479267: st_size (struct stat) is unreliable in
> >> >ZFS. Any word on when the fix will be out?
> >>
> >> It's a bug in scandir (obviously) and it is filed as such.
> >>
> >> Does scandir fail on zfs because of this or does scandir need to
> >> reallocate and does it use the size as first order estimate?
> >
> >The bug is actually wrong - scandir() does work. It only uses the
> >st_size as an estimate of directory size, but then scales the array as
> >necessary. Apart from an odd malloc(0) and the poor linear scaling
> >algorithm, the function works fine with ZFS.
>
> Ah, so it's not really a bug. Is this common for all scandir()s?

The original BSD scandir() allocates st_size/24 and never realloc()s. This is how it has been on SunOS-4.x and this is why it did fail with e.g. a Novell Netware NFS server.

If there are still implementations around that have not been fixed, the problem is still around.

Jörg
Chris Ridd
2007-Jun-10 11:56 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On 9/6/07 10:01, "Eric Schrock" <eric.schrock at sun.com> wrote:

> On Sat, Jun 09, 2007 at 01:56:35PM -0700, Ed Ravin wrote:
>>
>> I encountered the problem in NetBSD's scandir(), when reading off
>> a Solaris NFS fileserver with ZFS filesystems. I've already filed a
>> bug report with NetBSD. They were using the st_size, divided by 24, to
>> determine how much memory to allocate with malloc() before reading in
>> the directory entries. All without any sanity checking.
>
> Ah, so the original bug should never have been filed against our scandir(3c),
> which is resilient to this type of failure.

I think when I originally filed this bug I was looking at the wrong scandir implementation, i.e. the one in /onnv/onnv-gate/usr/src/lib/libbc/libc/gen/common/scandir.c instead of the one in /onnv/onnv-gate/usr/src/lib/libc/port/gen/scandir.c

Is there any way to mark the bug as resolved? Or maybe to change the category etc?

Cheers,

Chris
Casper.Dik at Sun.COM
2007-Jun-10 16:23 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>On 9/6/07 10:01, "Eric Schrock" <eric.schrock at sun.com> wrote:
>
>> On Sat, Jun 09, 2007 at 01:56:35PM -0700, Ed Ravin wrote:
>>>
>>> I encountered the problem in NetBSD's scandir(), when reading off
>>> a Solaris NFS fileserver with ZFS filesystems. I've already filed a
>>> bug report with NetBSD. They were using the st_size, divided by 24, to
>>> determine how much memory to allocate with malloc() before reading in
>>> the directory entries. All without any sanity checking.
>>
>> Ah, so the original bug should never have been filed against our scandir(3c),
>> which is resilient to this type of failure.
>
>I think when I originally filed this bug I was looking at the wrong scandir
>implementation, ie the one in
>/onnv/onnv-gate/usr/src/lib/libbc/libc/gen/common/scandir.c instead of the
>one in /onnv/onnv-gate/usr/src/lib/libc/port/gen/scandir.c
>
>Is there any way to mark the bug as resolved? Or maybe to change the
>category etc?

Possibly; it would still need to be fixed if folks encountered this on their old SunOS 4.x app running on Solaris.

Casper
Chris Ridd
2007-Jun-10 18:23 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On 10/6/07 5:23, "Casper.Dik at Sun.COM" <Casper.Dik at Sun.COM> wrote:
>
>> On 9/6/07 10:01, "Eric Schrock" <eric.schrock at sun.com> wrote:
>>
>>> On Sat, Jun 09, 2007 at 01:56:35PM -0700, Ed Ravin wrote:
>>>>
>>>> I encountered the problem in NetBSD's scandir(), when reading off
>>>> a Solaris NFS fileserver with ZFS filesystems. I've already filed a
>>>> bug report with NetBSD. They were using the st_size, divided by 24, to
>>>> determine how much memory to allocate with malloc() before reading in
>>>> the directory entries. All without any sanity checking.
>>>
>>> Ah, so the original bug should never have been filed against our scandir(3c),
>>> which is resilient to this type of failure.
>>
>> I think when I originally filed this bug I was looking at the wrong scandir
>> implementation, ie the one in
>> /onnv/onnv-gate/usr/src/lib/libbc/libc/gen/common/scandir.c instead of the
>> one in /onnv/onnv-gate/usr/src/lib/libc/port/gen/scandir.c
>>
>> Is there any way to mark the bug as resolved? Or maybe to change the
>> category etc?
>
> Possibly; it would still need to be fixed if folks encountered this
> on their old SunOS 4.x app running on Solaris.

I can't see that being too high up on Sun's agenda :-)

Cheers,

Chris
Frank Batschulat
2007-Jun-11 07:57 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
> Only one byte per directory entry? This confuses
> programs that assume that the st_size reported for a
> directory is a multiple of sizeof(struct dirent) bytes.

Sorry, but a program making this assumption is just flawed and should be fixed.

The POSIX standard is crystal-clear here and explicitly mentions:

http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html

  off_t st_size   For regular files, the file size in bytes.
                  For symbolic links, the length in bytes of the pathname contained in the symbolic link.
                  [SHM] For a shared memory object, the length in bytes.
                  [TYM] For a typed memory object, the length in bytes.
                  For other file types, the use of this field is unspecified.
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A directory is, strictly speaking, not a regular file, and this is in a way enforced by ZFS. The standard's wording further defines later on:

  File type:
  S_IFREG   Regular.
  S_IFDIR   Directory.

hth
frankB
Joerg Schilling
2007-Jun-11 11:30 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Frank Batschulat <Frank.Batschulat at Sun.COM> wrote:

> > Only one byte per directory entry? This confuses
> > programs that assume that the st_size reported for a
> > directory is a multiple of sizeof(struct dirent) bytes.
>
> Sorry, but a program making this assumption is just flawed and should be fixed.

This is exactly what I did write when I mentioned that st_size for a directory even may be 0 in any case. In addition, the field st_nlink may be always 1 in case of non-hardlinked directories, and "." & ".." do not exist as physical entries nor do they need to be hard links.

POSIX is intentionally not very specific in relation to filesystems in order to allow different fs implementations, e.g. symlinks do not need to have inodes and permissions/user/group.

Jörg
Bill Sommerfeld
2007-Jun-11 20:11 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Mon, 2007-06-11 at 00:57 -0700, Frank Batschulat wrote:
> a directory is strictly speaking not a regular file and this is in a way enforced by ZFS,
> the standards wording further defines later on..

So, yes, the standards allow this behavior -- but it's important to distinguish between delivering the very minimum required by the spec, and delivering behavior which makes it likely for existing applications to work well.

Maybe some additional pragmatism is called for here. If we want NFS over ZFS to work well for a variety of clients, maybe we should set st_size to larger values..

- Bill
Casper.Dik at sun.com
2007-Jun-11 21:03 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>Maybe some additional pragmatism is called for here. If we want NFS
>over ZFS to work well for a variety of clients, maybe we should set
>st_size to larger values..

+1; let's teach the admins to do "st_size /= 24" mentally :-)

Casper
Bill Sommerfeld
2007-Jun-11 22:13 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Mon, 2007-06-11 at 23:03 +0200, Casper.Dik at sun.com wrote:
> >Maybe some additional pragmatism is called for here. If we want NFS
> >over ZFS to work well for a variety of clients, maybe we should set
> >st_size to larger values..
>
> +1; let's teach the admins to do "st_size /= 24" mentally :-)

Mental long division is annoying. Memory is cheap. And ls displays file lengths in decimal.

If we want this to be a geek-friendly interface, perhaps we should multiply by 100 rather than 24.

- Bill
Joerg Schilling
2007-Jun-11 23:52 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Bill Sommerfeld <sommerfeld at sun.com> wrote:

> On Mon, 2007-06-11 at 23:03 +0200, Casper.Dik at sun.com wrote:
> > >Maybe some additional pragmatism is called for here. If we want NFS
> > >over ZFS to work well for a variety of clients, maybe we should set
> > >st_size to larger values..
> >
> > +1; let's teach the admins to do "st_size /= 24" mentally :-)
>
> Mental long division is annoying. Memory is cheap. And ls displays
> file lengths in decimal.
>
> if we want this to be a geek-friendly interface perhaps we should
> multiply by 100 rather than 24.

I believe we should rather educate other people that st_size/24 is a bad "solution".

Jörg
Casper.Dik at Sun.COM
2007-Jun-12 08:02 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>I believe we should rather educate other people that st_size/24 is a bad
>"solution".

That's all well and good but fixing all clients, including potentially really old ones, might not be feasible. Being correct doesn't help our customers.

Casper
Matthew Ahrens
2007-Jun-14 00:27 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Casper.Dik at sun.com wrote:
>
>> I believe we should rather educate other people that st_size/24 is a bad
>> "solution".
>
> That's all well and good but fixing all clients, including potentially
> really old ones, might not be feasible. Being correct doesn't help
> our customers.

To summarize my understanding of this issue: st_size on directories is undefined; apps/libs which do anything other than display it are broken. However, we should avoid exercising this bug in these broken apps if possible.

My question: What apps are these? I heard mention of some SunOS 4.x library. I don't think that's anywhere near important enough to warrant changing the current ZFS behavior.

--matt
Eric Schrock
2007-Jun-14 00:29 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Wed, Jun 13, 2007 at 05:27:18PM -0700, Matthew Ahrens wrote:
> Casper.Dik at sun.com wrote:
> >
> >>I believe we should rather educate other people that st_size/24 is a bad
> >>"solution".
> >
> >That's all well and good but fixing all clients, including potentially
> >really old ones, might not be feasible. Being correct doesn't help
> >our customers.
>
> To summarize my understanding of this issue: st_size on directories is
> undefined; apps/libs which do anything other than display it are broken.
> However, we should avoid exercising this bug in these broken apps if
> possible.
>
> My question: What apps are these? I heard mention of some SunOS 4.x
> library. I don't think that's anywhere near important enough to warrant
> changing the current ZFS behavior.

NetBSD's scandir(3c) implementation was also identified.

- Eric
Ed Ravin
2007-Jun-14 01:42 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Wed, Jun 13, 2007 at 05:27:18PM -0700, Matthew Ahrens wrote:
> To summarize my understanding of this issue: st_size on directories is
> undefined; apps/libs which do anything other than display it are broken.
> However, we should avoid exercising this bug in these broken apps if
> possible.
>
> My question: What apps are these? I heard mention of some SunOS 4.x
> library. I don't think that's anywhere near important enough to warrant
> changing the current ZFS behavior.

As mentioned before, NetBSD's scandir(3) implementation was one. The NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's scandir() looks like another, I'll have to drop them a line.

I went hunting for more apps in the hundreds of ports installed at my shop to see what our exposure was to the scandir() problem - much to my surprise, out of 700 or so ports, only a dozen or so used the libc scandir(). A handful of mail programs had a vulnerable local implementation of scandir() - looks like they copied UW's imap code, which was based on the 4.2 BSD code. It's not clear to me that those get used if there's an OS implementation of scandir, but I'll write to them too.

Just did a quick check on the 700 or so open source programs that we've ported to our hosts - grepping for "st_size / 24" got me things like this:

  ircii/ircii-4.4/source/scandir.c: arraysz = (stb.st_size / 24);
  imap-uw/imap-2004/src/osdep/unix/scandir.c: nlmax = stb.st_size / 24; /* guesstimate at number of files */

For my shop, it looks like the problem is limited to apps that have local implementations of scandir(). One of our long-term goals is to convert the rest of our fileservers to ZFS, so if there are any more, I or my customers are bound to find them.

-- Ed
Matthew Ahrens
2007-Jun-14 02:09 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Ed Ravin wrote:
> As mentioned before, NetBSD's scandir(3) implementation was one. The
> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's scandir()
> looks like another, I'll have to drop them a line.
...

Thanks much for investigating this and pushing for fixes!

--matt
Ed Ravin
2007-Jun-14 03:26 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Wed, Jun 13, 2007 at 09:42:26PM -0400, Ed Ravin wrote:
> As mentioned before, NetBSD's scandir(3) implementation was one. The
> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's scandir()
> looks like another, I'll have to drop them a line.

I heard from an OpenBSD developer who is working on a patch, but he hasn't got a ZFS filesystem to test against. Is anyone willing to provide a test environment, perhaps a shell account that has access to a ZFS filesystem, or perhaps an NFS export of a ZFS filesystem that could be reached via the Internet? If so, write me privately.
Casper.Dik at sun.com
2007-Jun-14 06:58 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>My question: What apps are these? I heard mention of some SunOS 4.x
>library. I don't think that's anywhere near important enough to warrant
>changing the current ZFS behavior.

Not apps; NFS clients such as *BSD.

On Solaris the issue is next to non-existent (SunOS 4.x binaries using scandir(); and we can patch that).

But we don't control all the NFS clients and their runtimes; it is unclear those are even fixable in some cases.

Casper
Casper.Dik at sun.com
2007-Jun-14 07:02 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>I went hunting for more apps in the hundreds of ports installed at my
>shop to see what our exposure was to the scandir() problem - much to
>my surprise out of 700 or so ports, only a dozen or so used the libc
>scandir(). A handful of mail programs had a vulnerable local
>implementation of scandir() - looks like they copied UW's imap code which
>was based on the 4.2 BSD code. It's not clear to me that those get used if
>there's an OS implementation of scandir, but I'll write to them too.

We only recently added scandir to Solaris libc.

Casper
Casper.Dik at sun.com
2007-Jun-14 07:09 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>
>>I went hunting for more apps in the hundreds of ports installed at my
>>shop to see what our exposure was to the scandir() problem - much to
>>my surprise out of 700 or so ports, only a dozen or so used the libc
>>scandir(). A handful of mail programs had a vulnerable local
>>implementation of scandir() - looks like they copied UW's imap code which
>>was based on the 4.2 BSD code. It's not clear to me that those get used if
>>there's an OS implementation of scandir, but I'll write to them too.
>
>We only recently added scandir to Solaris libc.

The implication of which, of course, is that any app built for Solaris 9 or before which uses scandir may have picked up a broken one.

Casper
Frank Cusack
2007-Jun-14 08:59 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On June 13, 2007 11:26:07 PM -0400 Ed Ravin <eravin at panix.com> wrote:
> On Wed, Jun 13, 2007 at 09:42:26PM -0400, Ed Ravin wrote:
>> As mentioned before, NetBSD's scandir(3) implementation was one. The
>> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's
>> scandir() looks like another, I'll have to drop them a line.
>
> I heard from an OpenBSD developer who is working on a patch, but he hasn't
> got a ZFS filesystem to test against.

He doesn't need one. He can just change OpenBSD's ufs (or whatever fs he wishes to use) to plug in random small numbers for st_size instead of what it does now. He could even do this on a 2nd system which exports via NFS, to isolate the code under test.

-frank
Casper.Dik at Sun.COM
2007-Jun-14 09:09 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
>On June 13, 2007 11:26:07 PM -0400 Ed Ravin <eravin at panix.com> wrote:
>> On Wed, Jun 13, 2007 at 09:42:26PM -0400, Ed Ravin wrote:
>>> As mentioned before, NetBSD's scandir(3) implementation was one. The
>>> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's
>>> scandir() looks like another, I'll have to drop them a line.
>>
>> I heard from an OpenBSD developer who is working on a patch, but he hasn't
>> got a ZFS filesystem to test against.
>
>He doesn't need one. He can just change OpenBSD's ufs (or whatever fs he
>wishes to use) to plug in random small numbers for st_size instead of
>what it does now. He could even do this on a 2nd system which exports
>via NFS, to isolate the code under test.

Or just start the code off with st_size /= 24

Casper
Joerg Schilling
2007-Jun-14 11:09 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Frank Cusack <fcusack at fcusack.com> wrote:

> On June 13, 2007 11:26:07 PM -0400 Ed Ravin <eravin at panix.com> wrote:
> > On Wed, Jun 13, 2007 at 09:42:26PM -0400, Ed Ravin wrote:
> >> As mentioned before, NetBSD's scandir(3) implementation was one. The
> >> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's
> >> scandir() looks like another, I'll have to drop them a line.
> >
> > I heard from an OpenBSD developer who is working on a patch, but he hasn't
> > got a ZFS filesystem to test against.
>
> He doesn't need one. He can just change OpenBSD's ufs (or whatever fs he
> wishes to use) to plug in random small numbers for st_size instead of
> what it does now. He could even do this on a 2nd system which exports
> via NFS, to isolate the code under test.

He only needs to assign statb.st_size = 0; directly after calling stat() in the scandir() code and then make it work again :-)

Jörg
Bill Sommerfeld
2007-Jun-14 22:10 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Thu, 2007-06-14 at 09:09 +0200, Casper.Dik at sun.com wrote:
> The implication of which, of course, is that any app built for Solaris 9
> or before which uses scandir may have picked up a broken one.

or any app which includes its own copy of the BSD scandir code, possibly under a different name, because not all systems support scandir..

it can be impossible to fix all copies of a bug which has been cut & pasted too many times...

- Bill
Joerg Schilling
2007-Jun-14 22:27 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Bill Sommerfeld <sommerfeld at sun.com> wrote:

> On Thu, 2007-06-14 at 09:09 +0200, Casper.Dik at sun.com wrote:
> > The implication of which, of course, is that any app built for Solaris 9
> > or before which uses scandir may have picked up a broken one.
>
> or any app which includes its own copy of the BSD scandir code, possibly
> under a different name, because not all systems support scandir..
>
> it can be impossible to fix all copies of a bug which has been cut &
> pasted too many times...

15 years ago, Novell Netware started to return a fixed size of 512 for all directories via NFS.

If there is still unfixed code, there is no help.

Jörg
Ed Ravin
2007-Jun-15 02:06 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On Fri, Jun 15, 2007 at 12:27:15AM +0200, Joerg Schilling wrote:
> Bill Sommerfeld <sommerfeld at sun.com> wrote:
>
> > On Thu, 2007-06-14 at 09:09 +0200, Casper.Dik at sun.com wrote:
> > > The implication of which, of course, is that any app build for Solaris 9
> > > or before which uses scandir may have picked up a broken one.
> >
> > or any app which includes its own copy of the BSD scandir code, possibly
> > under a different name, because not all systems support scandir..

So far, all the local implementations of the BSD scandir() that I've found are called scandir(). Some of them are smart enough not to use st_size, the rest of them are straight copies of the BSD code.

> 15 years ago, Novell Netware started to return a fixed size of 512 for all
> directories via NFS.
>
> If there is still unfixed code, there is no help.

The Novell behavior, commendable as it is, did not break the BSD scandir() code, because BSD scandir() fails in the other direction, when st_size is a low number, like less than 24.
Joerg Schilling
2007-Jun-15 10:09 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
Ed Ravin <eravin at panix.com> wrote:
> > 15 years ago, Novell Netware started to return a fixed size of 512 for all
> > directories via NFS.
> >
> > If there is still unfixed code, there is no help.
>
> The Novell behavior, commendable as it is, did not break the BSD scandir()
> code, because BSD scandir() fails in the other direction, when st_size is
> a low number, like less than 24.

This is wrong: If you use such a Novell server, you only see the first 21 entries of a directory.

Jörg

-- 
EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
      js at cs.tu-berlin.de (uni)
      schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
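The figures in this exchange line up if one assumes that the affected scandir() code sizes its result array as st_size divided by 24, the threshold mentioned above. That divisor is an assumption taken from the numbers quoted in the thread, and the function name estimated_slots() is invented for this sketch; it is not the actual BSD source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustration only: estimate how many directory entries fit, ASSUMING
 * the estimate is st_size / 24 as the numbers in this thread suggest.
 * st_size is passed as long long here to keep the sketch free of
 * platform-specific off_t details.
 */
size_t estimated_slots(long long st_size)
{
    assert(st_size >= 0);
    return (size_t)(st_size / 24);
}
```

Under that assumption, a fixed size of 512 gives 512 / 24 = 21 slots, which matches the 21 entries reported for the Novell server, while the ZFS directory from the original post with st_size 10 gives 0 slots.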
Tomas Ögren
2007-Jun-15 11:19 UTC
[zfs-discuss] Re: zfs reports small st_size for directories?
On 14 June, 2007 - Bill Sommerfeld sent me these 0,6K bytes:
> On Thu, 2007-06-14 at 09:09 +0200, Casper.Dik at sun.com wrote:
> > The implication of which, of course, is that any app build for Solaris 9
> > or before which uses scandir may have picked up a broken one.
>
> or any app which includes its own copy of the BSD scandir code, possibly
> under a different name, because not all systems support scandir..
>
> it can be impossible to fix all copies of a bug which has been cut &
> pasted too many times...

Such stuff does exist out in the world..

http://www.google.com/codesearch?hl=en&lr=&q=scandir.c+st_size+24&btnG=Search

/Tomas

-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Robert Milkowski
2007-Aug-06 14:11 UTC
[zfs-discuss] zfs reports small st_size for directories?
Hello Ed,

Thursday, June 14, 2007, 2:42:26 AM, you wrote:

ER> On Wed, Jun 13, 2007 at 05:27:18PM -0700, Matthew Ahrens wrote:
>> To summarize my understanding of this issue: st_size on directories is
>> undefined; apps/libs which do anything other than display it are broken.
>> However, we should avoid exercising this bug in these broken apps if
>> possible.
>>
>> My question: What apps are these? I heard mention of some SunOS 4.x
>> library. I don't think that's anywhere near important enough to warrant
>> changing the current ZFS behavior.

ER> As mentioned before, NetBSD's scandir(3) implementation was one. The
ER> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's scandir()
ER> looks like another, I'll have to drop them a line.

Does it mean that FreeBSD won't be able to properly mount via nfsv4 from Solaris 10/ZFS box? Or just that some applications using scandir() won't work properly...?

What I'm asking is if FreeBSD nfs client is depending somehow on this.

-- 
Best regards,
Robert
mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
On Mon, Aug 06, 2007 at 03:11:26PM +0100, Robert Milkowski wrote:
> Hello Ed,
>
> Thursday, June 14, 2007, 2:42:26 AM, you wrote:
>
> ER> On Wed, Jun 13, 2007 at 05:27:18PM -0700, Matthew Ahrens wrote:
> >> To summarize my understanding of this issue: st_size on directories is
> >> undefined; apps/libs which do anything other than display it are broken.
> >> However, we should avoid exercising this bug in these broken apps if
> >> possible.
> >>
> >> My question: What apps are these? I heard mention of some SunOS 4.x
> >> library. I don't think that's anywhere near important enough to warrant
> >> changing the current ZFS behavior.
>
> ER> As mentioned before, NetBSD's scandir(3) implementation was one. The
> ER> NetBSD project has fixed this in their CVS. OpenBSD and FreeBSD's scandir()
> ER> looks like another, I'll have to drop them a line.
>
>
> Does it mean that FreeBSD won't be able to properly mount via nfsv4 from
> Solaris 10/ZFS box? Or just that some applications using scandir()
> won't work properly...?

Only the latter. All the *BSD distributions have fixed their scandir() libraries in CVS, so if you encounter that, a fix is available.

The problem is not really NFS related. It would happen locally if you could somehow mount a local ZFS filesystem on your FreeBSD box.

> What I'm asking is if FreeBSD nfs client is
> depending somehow on this.

No.