Richard L. Hamilton
2009-Jan-14 14:12 UTC
[zfs-discuss] Why is st_size of a zfs directory equal to the number of entries?
Cute idea, maybe. But very inconsistent with the size in blocks (reported by ls -dls dir). Is there a particular reason for this, or is it one of those just for the heck of it things?

Granted that it isn't necessarily _wrong_. I just checked SUSv3 for stat() and sys/stat.h, and it appears that st_size is only well-defined for regular files and symlinks. So I suppose it could be (a) undefined, or (b) whatever is deemed to be useful, for directories, device files, etc.

This is of course inconsistent with the behavior on other filesystems. On UFS (a bit of a special case perhaps in that it still allows read(2) on a directory, for compatibility), the st_size seems to reflect the actual number of bytes used by the implementation to hold the directory's current contents. That may well also be the case for tmpfs, but from user-land, one can't tell since it (reasonably enough) disallows read(2) on directories. Haven't checked any other filesystems. Don't have anything else (pcfs, hsfs, udfs, ...) mounted at the moment to check.

(other stuff: ISTR that devices on Solaris will give a "size" if applicable, but for non LF-aware 32-bit, that may be capped at MAXOFF32_T rather than returning an error; I think maybe for pipes, one sees the number of bytes available to be read. None of which is portable or should necessarily be depended on...)

Cool ideas are fine, but IMO, if one does wish to make something nominally undefined have some particular behavior, I wonder why one wouldn't at least try for consistency...

-- This message posted from opensolaris.org
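For anyone wanting to repeat the comparison on their own filesystems, here is a minimal sketch in Python (not part of the original post; the helper name directory_sizes is made up for illustration). It only puts the three numbers discussed above side by side: st_size, the block count, and the number of entries actually present. What st_size comes out as depends entirely on the filesystem, so no particular output should be expected.

```python
import os
import tempfile


def directory_sizes(path):
    """Return what stat() reports for a directory, next to its real entry count."""
    st = os.stat(path)
    return {
        # Only well-defined for regular files and symlinks; for a directory
        # the value is whatever the filesystem chooses to report.
        "st_size": st.st_size,
        # Allocated blocks (what `ls -s` shows); not available on every platform.
        "st_blocks": getattr(st, "st_blocks", None),
        # Entries actually present; os.listdir() leaves out "." and "..".
        "entries": len(os.listdir(path)),
    }


if __name__ == "__main__":
    # Demo on a scratch directory holding three empty files.
    with tempfile.TemporaryDirectory() as d:
        for name in ("a", "b", "c"):
            open(os.path.join(d, name), "w").close()
        print(directory_sizes(d))
```

Running it against directories on UFS, tmpfs and ZFS mounts should make any inconsistency between st_size and st_blocks visible without needing read(2) on the directory.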
Joerg Schilling
2009-Jan-14 14:58 UTC
[zfs-discuss] Why is st_size of a zfs directory equal to the number of entries?
"Richard L. Hamilton" <rlhamil at smart.net> wrote:

> Cute idea, maybe. But very inconsistent with the size in blocks (reported by ls -dls dir).
> Is there a particular reason for this, or is it one of those just for the heck of it things?
>
> Granted that it isn't necessarily _wrong_. I just checked SUSv3 for stat() and sys/stat.h,
> and it appears that st_size is only well-defined for regular files and symlinks. So I suppose
> it could be (a) undefined, or (b) whatever is deemed to be useful, for directories,
> device files, etc.

You could also return 0 for st_size for all directories and would still be POSIX compliant.

Jörg

--
EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
      js at cs.tu-berlin.de (uni)
      joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Richard L. Hamilton
2009-Jan-14 17:08 UTC
[zfs-discuss] Why is st_size of a zfs directory equal to the
> "Richard L. Hamilton" <rlhamil at smart.net> wrote:
>
> > Cute idea, maybe. But very inconsistent with the size in blocks (reported by ls -dls dir).
> > Is there a particular reason for this, or is it one of those just for the heck of it things?
> >
> > Granted that it isn't necessarily _wrong_. I just checked SUSv3 for stat() and sys/stat.h,
> > and it appears that st_size is only well-defined for regular files and symlinks. So I suppose
> > it could be (a) undefined, or (b) whatever is deemed to be useful, for directories,
> > device files, etc.
>
> You could also return 0 for st_size for all directories and would still be
> POSIX compliant.
>
> Jörg

Yes, some do IIRC (certainly for empty directories, maybe always; I forget what OS I saw that on).

Heck, "undefined" means it wouldn't be _wrong_ to return a random number. Even a _negative_ number wouldn't necessarily be wrong (although it would be a new low in rudeness, perhaps).

I did find the earlier discussion on the subject (someone e-mailed me that there had been such). It seemed to conclude that some apps are statically linked with old scandir() code that (incorrectly) assumed that the number of directory entries could be estimated as st_size/24; and worse, that some such apps might be seeing the small st_size that zfs offers via NFS, so they might not even be something that could be fixed on Solaris at all.

But I didn't see anything in the discussion that suggested that this was going to be changed. Nor did I see a compelling argument for leaving it the way it is, either. In the face of "undefined", all arguments end up as pragmatism rather than principle, IMO.

Maybe it's not a bad thing to go and break incorrect code. But if that code has worked for a long time (maybe long enough for the source to have been lost), I don't know that it's helpful to just remind everyone that st_size is only defined for certain types of objects, and directories aren't one of them.

(Now if one wanted to write something to break code depending on 32-bit time_t _now_ rather than waiting for 2038, that might be a good deed in terms of breaking things. But I'll be 80 then (if I'm still alive), and I probably won't care.)
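To make the scandir() assumption concrete, here is a rough sketch in Python (an illustration only, not the historical C code; the names LEGACY_ENTRY_BYTES, estimate_entries and count_entries are invented here). The first function mimics the st_size/24 estimate described above; the second simply walks the directory and counts, which does not depend on st_size at all.

```python
import os
import tempfile

# Divisor quoted in the thread for the old scandir() estimate.
LEGACY_ENTRY_BYTES = 24


def estimate_entries(path):
    """Legacy-style guess: directory st_size divided by an assumed entry size."""
    return os.stat(path).st_size // LEGACY_ENTRY_BYTES


def count_entries(path):
    """Portable approach: iterate the directory and count what is really there."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)


if __name__ == "__main__":
    # Demo on a scratch directory holding five empty files.
    with tempfile.TemporaryDirectory() as d:
        for i in range(5):
            open(os.path.join(d, "file%d" % i), "w").close()
        print("estimated:", estimate_entries(d))
        print("counted:  ", count_entries(d))
```

On a filesystem whose directory st_size is a small number such as the entry count, the estimate would come out far too low, so code that sizes a buffer from it and never grows the buffer would be at risk. Counting (or growing the buffer as entries arrive) sidesteps the question of what st_size means for a directory.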
Joerg Schilling
2009-Jan-14 17:13 UTC
[zfs-discuss] Why is st_size of a zfs directory equal to the
"Richard L. Hamilton" <rlhamil at smart.net> wrote:

> I did find the earlier discussion on the subject (someone e-mailed me that there had been
> such). It seemed to conclude that some apps are statically linked with old scandir() code
> that (incorrectly) assumed that the number of directory entries could be estimated as
> st_size/24; and worse, that some such apps might be seeing the small st_size that zfs
> offers via NFS, so they might not even be something that could be fixed on Solaris at all.
> But I didn't see anything in the discussion that suggested that this was going to be changed.
> Nor did I see a compelling argument for leaving it the way it is, either. In the face of
> "undefined", all arguments end up as pragmatism rather than principle, IMO.

This is a problem I had to fix for some customers in 1992 when people started to use NFS servers based on the Novell OS.

Jörg
David Collier-Brown
2009-Jan-14 18:43 UTC
[zfs-discuss] Why is st_size of a zfs directory equal to the
"Richard L. Hamilton" <rlhamil at smart.net> wrote:

>> I did find the earlier discussion on the subject (someone e-mailed me that there had been
>> such). It seemed to conclude that some apps are statically linked with old scandir() code
>> that (incorrectly) assumed that the number of directory entries could be estimated as
>> st_size/24; and worse, that some such apps might be seeing the small st_size that zfs
>> offers via NFS, so they might not even be something that could be fixed on Solaris at all.
>> But I didn't see anything in the discussion that suggested that this was going to be changed.
>> Nor did I see a compelling argument for leaving it the way it is, either. In the face of
>> "undefined", all arguments end up as pragmatism rather than principle, IMO.

Joerg Schilling wrote:

> This is a problem I had to fix for some customers in 1992 when people started to use NFS
> servers based on the Novell OS.
> Jörg

Oh bother, I should have noticed this back in 1999/2001 (;-))

Joking aside, we were looking at the Solaris ABI (Application Binary Interface) and working on ensuring binary stability. The size of a directory entry was supposed to be undefined and in principle *variable*, but Novell et al. seem to have assumed that the size they used was guaranteed to be the same for all time.

And no machine needs more than 640 KB of memory, either...

Ah well, at least the ZFS folks found it for us, so I can add it to my database of porting problems. What OSs did you folks find it on?

--dave (an external consultant, these days) c-b

--
David Collier-Brown            | Always do right. This will gratify
Sun Microsystems, Toronto      | some people and astonish the rest
davecb at sun.com              |                      -- Mark Twain
cell: (647) 833-9377, bridge: (877) 385-4099 code: 506 9191#