I'm primarily a Mac user, coming to the OS X world via OpenStep. One of the things that really bites under the MacOS is that in order to offer some reasonable compatibility with older systems, Apple uses the HFS+ filesystem by default. HFS+ is possibly the worst filesystem in the world. In addition to having all sorts of number, size and scaling limitations, it offers no "advanced" features. The only reason they use it is in order to support the Mac's metadata, notably the type/creator pairs.

I have looked over all the (extremely limited) introductory stuff here (a slide show? come on!) and I am curious about zfs's handling of metadata. Can I add my own metadata to the directory entries? If so, is the resulting structure still "usable" on machines that don't understand that data?

Maury

This message posted from opensolaris.org
On Thu, 2005-11-17 at 15:33, Maury Markowitz wrote:
> I have looked over all the (extremely limited) introductory stuff here
> (a slide show? come on!) and I am curious about zfs's handling of
> metadata. Can I add my own metadata to the directory entries? If so,
> is the resulting structure still "usable" on machines that don't
> understand that data?

You can do that even on UFS using extended attributes. See openat(2).

Note also that the HFS+ in MacOS X 10.4.x has a very similar but API incompatible extended attribute system.

--
Darren J Moffat
Maury,

Maury Markowitz wrote:
> I have looked over all the (extremely limited) introductory stuff
> here (a slide show? come on!) and I am curious about zfs's handling
> of metadata. Can I add my own metadata to the directory entries?

you should take a look at the blog carnival:
<http://blogs.sun.com/roller/page/bmc?entry=welcome_to_zfs>
there you'll find plenty of stuff to read.

cheers,
/Martin
--
Martin Englund, Senior Security Engineer, Sun IT Security Office
Email: martin.englund at sun.com
Time Zone: GMT+2
PGP: 1024D/4F4ABC26
"The question is not if you are paranoid, it is if you are paranoid enough."
> you should take a look at the blog carnival:
> <http://blogs.sun.com/roller/page/bmc?entry=welcome_to_zfs>
> there you'll find plenty of stuff to read.

Well there's a little stuff there, but generally it's pretty bloggish and detail lite to be honest.

Maury
Sorry, I had e-mail warnings turned on, which for this software means it should send the entire message to you. Bad, very bad. That's why I e-mailed back: there was no indication your post was, well, a post. It looked to me like you simply e-mailed me what appears quoted below. Anyhoo...

> You can do that even on UFS using extended attributes.
>
> See openat(2).

I'm a newbie to all of this, with what could only be described as a "passing knowledge" of the internals. I am familiar with the basic layout of a UFS-like system, journaling etc., and the basic reasons why HFS sucks so terribly.

But if I'm reading openat's man page right, it seems like the extended metadata is being stored in separate inodes. Is this correct? If so, should I expect a performance hit when using this solution? It also seems that it would require two calls into the API (open then openat) to get the complete information. Would this also be a performance issue?

> Note also that the HFS+ in MacOS X 10.4.x has a very
> similar but API incompatible extended attribute system.

Now is this an issue of simply translating the calls properly? Or is there some fundamental problem that makes such translation impossible?

It seems this would be a particularly good time to make a switch anyway -- Intel based Macs won't run Classic apps, so the need for type/creator basically disappears. OS X already puts this information into the bundle and will work fine on UFS, although there is definitely a performance hit there.

Maury
Maury Markowitz wrote:
>> you should take a look at the blog carnival:
>> http://blogs.sun.com/roller/page/bmc?entry=welcome_to_zfs
>> there you'll find plenty of stuff to read.
>
> Well there's a little stuff there, but generally it's pretty bloggish
> and detail lite to be honest.

So follow the _links_ which bmc very deliberately included.

James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
On Thu, Nov 17, 2005 at 02:25:48PM -0800, Maury Markowitz wrote:
> > you should take a look at the blog carnival:
> > <http://blogs.sun.com/roller/page/bmc?entry=welcome_to_zfs>
> > there you'll find plenty of stuff to read.
>
> Well there's a little stuff there, but generally it's pretty bloggish
> and detail lite to be honest.

Well, I think that's the first time that I've been accused of being "detail lite [sic]"... Indeed, it's amazing to me that anyone could read all of the content that I pointed to and not be _overwhelmed_ with detail. Still, if it's additional detail you're looking for, there is but one place to go:

http://opensolaris.org/os/community/zfs/source/

Enjoy,
Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development. http://blogs.sun.com/bmc
On Nov 17, 2005, at 4:33 PM, Maury Markowitz wrote:
>> You can do that even on UFS using extended attributes.
>>
>> See openat(2).
>
> I'm a newbie to all of this, with what could only be described as a
> "passing knowledge" of the internals. I am familiar with the basic
> layout of a UFS-like system, journaling etc., and the basic reasons
> why HFS sucks so terribly.
>
> But if I'm reading openat's man page right, it seems like the
> extended metadata is being stored in separate inodes. Is this correct?
> If so, it would seem I would expect a performance hit when using this
> solution? It also seems that it would require two calls into the API
> (open then openat) to get the complete information. Would this also be
> a performance issue?
>
>> Note also that the HFS+ in MacOS X 10.4.x has a very
>> similar but API incompatible extended attribute system.
>
> Now is this an issue of simply translating the calls properly? Or is
> there some fundamental problem that makes such translation impossible?
>
> It seems this would be a particularly good time to make a switch
> anyway -- Intel based Macs won't run Classic apps, so the need for
> type/creator basically disappears. OS X already puts this information
> into the bundle and will work fine on UFS, although there is
> definitely a performance hit there.

We have been having lots of interesting discussions about this sort of thing over in the IETF NFSv4 working group. What Solaris people call "extended attributes" are what many others prefer to call "subfiles", more similar to NTFS streams than Linux xattrs. Ignoring the internal details of how UFS implements them (honestly, I haven't looked at ZFS), the API is defined in a way that requires a two-step process to access an attribute/subfile. You have to first successfully open() the file before attempting to openat() an attribute/subfile. A convenience function, attropen(), exists that does just that (see man fsattr).

The semantics of the Linux/BSD getxattr/setxattr are mostly a subset of the Solaris attribute model, although not a perfect match. The most notable issues are around "system" attributes that are interpreted by the OS (like ACLs), whereas all Solaris attributes are opaque, more like the getxattr "user" attributes. Some implementations also add slightly different access permission models that map poorly. There are some that desire to move Linux towards a more Solaris-like model to provide more generality. But issues like Linux using a general getxattr for ACLs instead of an ACL-specific call like getfacl() need to be resolved. There are not many pushing for system interpreted Solaris attributes, and fewer pushing for getxattr.

The performance issue of needing to open before openat is baked into the API. The internal decision to store them all in one inode or many inodes, though, is just an implementation detail. The performance impact is more likely to depend on whether the second syscall causes an additional disk read or the metadata is already in core.

Bottom line is that this is less of a ZFS issue and more of an OS/API model issue.

-David
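
The two access models discussed above can be sketched in C. This is only a minimal sketch under stated assumptions, not a definitive implementation: read_attr() is a hypothetical helper name, and the attribute names used with it are made up for illustration. Where the headers define O_XATTR (as Solaris does, see fsattr(5)), it follows the two-step open() then openat() path; on Linux it falls back to the single getxattr(2) call; anywhere else it simply reports ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#if !defined(O_XATTR) && defined(__linux__)
#include <sys/xattr.h>
#endif

/*
 * read_attr() is a hypothetical helper, not part of any system API.
 * It reads up to len bytes of the attribute `name` attached to the
 * file `path` into buf. Returns the byte count, or -1 with errno set.
 */
ssize_t read_attr(const char *path, const char *name, void *buf, size_t len)
{
    assert(path != NULL && name != NULL && buf != NULL);
#if defined(O_XATTR)
    /* Solaris model, step 1: the file itself must be opened first. */
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* Step 2: open the attribute ("subfile") relative to that fd. */
    int afd = openat(fd, name, O_RDONLY | O_XATTR);
    int saved = errno;
    close(fd);
    if (afd == -1) {
        errno = saved;          /* report the openat() failure */
        return -1;
    }

    /* From here on the attribute behaves like an ordinary file. */
    ssize_t n = read(afd, buf, len);
    saved = errno;
    close(afd);
    if (n == -1)
        errno = saved;          /* report the read() failure */
    return n;
#elif defined(__linux__)
    /* Linux model: one call, the value is returned as an opaque blob. */
    return getxattr(path, name, buf, len);
#else
    /* Assumption: no attribute API known to this sketch. */
    (void)path; (void)name; (void)buf; (void)len;
    errno = ENOTSUP;
    return -1;
#endif
}
```

On the Solaris branch a call might look like read_attr("report.txt", "comment", buf, sizeof buf); on Linux the name needs a namespace prefix, for example "user.comment". The extra open() in the first branch is the second syscall the thread is debating.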
On Thu, 2005-11-17 at 23:12, David Robinson wrote:
> There are not many pushing for system interpreted Solaris
> attributes, and fewer pushing for getxattr.

We would love to have a system interpreted set of Solaris attributes for MAC labels and forced privileges - like we had in the modified versions of UFS for Trusted Solaris.

What's more, we in the Solaris security community have been saying this ever since the earliest design days of the extended attribute support in UFS.

> The performance issue of needing to open before openat
> is baked into the API. The internal decision to store
> them all in one inode or many inodes, though, is just an
> implementation detail. The performance impact is more
> likely to depend on whether the second syscall causes an
> additional disk read or the metadata is already in core.

I'd suspect that the theoretical hit you get from having to make two syscalls to get to the metadata in the extended attributes is totally irrelevant in the vast majority of cases. This looks like a severe case of premature optimisation.

--
Darren J Moffat
[Let's take this off ZFS after this]

On Nov 18, 2005, at 4:32 AM, Darren J Moffat wrote:
> On Thu, 2005-11-17 at 23:12, David Robinson wrote:
>
>> There are not many pushing for system interpreted Solaris
>> attributes, and fewer pushing for getxattr.
>
> We would love to have a system interpreted set of Solaris attributes
> for MAC labels and forced privileges - like we had in the modified
> versions of UFS for Trusted Solaris.
>
> What's more, we in the Solaris security community have been saying
> this ever since the earliest design days of the extended attribute
> support in UFS.

If anyone is interested in working on what the Solaris design for system interpreted attributes should be, let me know. It is an active topic I am driving over in the NFSv4 community (whose attribute model is almost identical to Solaris attrs). The two big APIs are getxattr and openat; the question is, what is the middle ground?

-David
> Well, I think that's the first time that I've been
> accused of being "detail lite [sic]"... Indeed, it's
> amazing to me that anyone could read
> all of the content that I pointed to and not be
> _overwhelmed_ with detail.

Really? Perhaps we are looking for different things.

I certainly didn't follow _all_ of the links, because I have little interest in managers posting congratulations, users wishing they always had it, or personal stories about rescuing data. But even the "technical" entries are typically nothing more than "here's how to type in this command". Consider, for instance, these typical entries:

http://blogs.sun.com/roller/page/lling#i_love_zfs
http://blogs.sun.com/roller/page/martin#zfs_from_a_sysadmin_point
http://blogs.sun.com/roller/page/talley#manage_zfs_from_your_browser

Would you really consider these to be overwhelmingly detailed? Even when taken as a whole?

What I am looking for is a single, organized, detailed "medium technical" description of the system in context by comparing to UFS (for instance). A wonderful work at this sort of level is:

http://www.nobius.org/~dbg/practical-file-system-design.pdf

Is there something like this I have missed? If there isn't, do you think there is a need? If so, I'd be happy to edit/copyedit/proof/etc. I write perhaps 100 pages a month on the wiki in my spare time, so I have fairly considerable experience making single "wholes" out of lots of "parts".

Maury
> > Well, I think that's the first time that I've been
> > accused of being "detail lite [sic]"... Indeed, it's
> > amazing to me that anyone could read
> > all of the content that I pointed to and not be
> > _overwhelmed_ with detail.
>
> Really? Perhaps we are looking for different things.
>
> I certainly didn't follow _all_ of the links, because
> I have little interest in managers posting
> congratulations, users wishing they always had it, or
> personal stories about rescuing data. But even the
> "technical" entries are typically nothing more than
> "here's how to type in this command". Consider, for
> instance, these typical entries:
>
> http://blogs.sun.com/roller/page/lling#i_love_zfs
> http://blogs.sun.com/roller/page/martin#zfs_from_a_sysadmin_point
> http://blogs.sun.com/roller/page/talley#manage_zfs_from_your_browser
>
> Would you really consider these to be overwhelmingly
> detailed? Even when taken as a whole?
>
> What I am looking for is a single, organized,
> detailed "medium technical" description of the system
> in context by comparing to UFS (for instance). A
> wonderful work at this sort of level is:
>
> http://www.nobius.org/~dbg/practical-file-system-design.pdf
>
> Is there something like this I have missed? If there
> isn't, do you think there is a need? If so, I'd be
> happy to edit/copyedit/proof/etc. I write perhaps 100
> pages a month on the wiki in my spare time, so I have
> fairly considerable experience making single "wholes"
> out of lots of "parts".
>
> Maury

He did have lots of detail that was posted. However, something closer to (or perhaps exactly) what you're looking for can probably be found here:

http://www.sun.com/software/solaris/zfs_learning_center.jsp
http://www.sun.com/emrkt/campaign_docs/expertexchange/knowledge/solaris_zfs.html

I don't know that anyone has done a detailed specific comparison of ZFS to UFS; however, the FAQs linked from the pages above do make some comparisons.

In short, ZFS is far more reliable than almost any other file system, and ZFS is much faster than UFS in almost every case so far. I don't think an extremely detailed technical comparison of ZFS and UFS has been done yet, partly because ZFS isn't yet done! Whenever ZFS is integrated into a Solaris GA release, that's when I think we'll see that start to happen.

-Shawn