Hi all,

I'll first off add my hearty congrats on releasing this beast into the wild. :) I haven't had enough willpower yet to try and load OpenSolaris on my box (I'm comfortable with FreeBSD, and my experiences with Solaris at school have been painful). But just reading about ZFS's internals and capabilities blows me away.

Anyway, I'm starting on an OS project next semester, and one of the big areas of it is to have a filesystem that's essentially non-hierarchical and attribute-based, rather than the traditional directory/filename-based system. I was originally going to use Reiser4 and write a new filesystem personality translation plugin for it, since Reiser4 can take care of all the nasty details of on-disk format, etc. But ZFS's lower levels look more than capable.

So I'm trying to figure out where in the ZFS design does the "object pool" / on-disk format end, and the "traditional filesystem" (e.g. files, directories, etc) begins? By what I can tell in the source code tour, this happens at the interface layer. Thus, I would write my code to interface to the transactional object layer. Is this the right line of thinking?

Along the same lines, where in the source tree is the code for NFS sharing, auto-mounting, etc? Is that still in the interface layer? I can't tell. I'm just trying to figure out where everything is that is essentially incompatible with my design.

Thanks!

P.S. If I'm interpreting the CDDL right, I'd just redistribute the parts of ZFS that my FS is based off of. Everything else (including my FS code on top of ZFS) can be closed? Not that I'm really thinking of that, but it's a consideration.

--Scott

This message posted from opensolaris.org
Scott Balmos wrote:
...
> So I'm trying to figure out where in the ZFS design does the "object pool" /
> on-disk format end, and the "traditional filesystem" (e.g. files,
> directories, etc) begins? By what I can tell in the source code tour, this
> happens at the interface layer. Thus, I would write my code to interface to
> the transactional object layer. Is this the right line of thinking?

That's my understanding. I've got a nice slide courtesy of Jeff Bonwick in my SOSUG presentation:

http://blogs.sun.com/roller/page/jmcp?entry=brief_report_on_last_night

And yes, it's the interface (vfs +/ posix) layer where you want to target your project.

> Along the same lines, where in the source code pool is the code for NFS
> sharing, auto-mounting, etc? Is that still in the interface layer, I can't
> tell. Just trying to figure out where everything that is essentially
> incompatible with my design.

usr/src/cmd/dfs.cmds for the userland stuff, usr/src/uts/common/fs/nfs otherwise.

best regards,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
On Thu, Nov 17, 2005 at 02:02:48PM -0800, Scott Balmos wrote:
> So I'm trying to figure out where in the ZFS design does the "object
> pool" / on-disk format end, and the "traditional filesystem" (e.g.
> files, directories, etc) begins? By what I can tell in the source code
> tour, this happens at the interface layer. Thus, I would write my code
> to interface to the transactional object layer. Is this the right line
> of thinking?

That's right. Going by Eric Schrock's source code tour, you pretty much want to replace the ZPL (ZFS POSIX Layer). That means that you'd interface with the ZAP (ZFS Attribute Processor, an extensible, scalable on-disk hash table) and the DMU (Data Management Unit - provides transactions and an object interface).

> P.S. If I'm interpreting the CDDL right, I'd just redistribute the
> parts of ZFS that my FS is based off of. Everything else (including my
> FS code on top of ZFS) can be closed? Not that I'm really thinking of
> that, but it's a consideration.

That's right. Since the CDDL is a file-based license, only files that are licensed under the CDDL and that you modified have to be released. Everything else is at your (or your licence's) discretion.

--Bill
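
To make the layering Bill describes a little more concrete, here is a minimal sketch in C of what an attribute-based layer sitting where the ZPL sits might look like. It does not use the real DMU or ZAP interfaces. Every name in it (`toy_index_t`, `toy_tx_t`, `toy_tx_tag`, `toy_query`, and so on) is hypothetical, the "ZAP-like" table is just a flat in-memory array searched linearly, and the "DMU-like" transaction only stages changes and applies them together on commit. It is meant purely to illustrate the idea of tagging numbered objects with attributes inside a transaction and then finding objects by attribute instead of by path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ENTRIES 32
#define TOY_STR 16

/* One key=value tag attached to an object number (all names hypothetical). */
typedef struct toy_entry {
    char key[TOY_STR];
    char value[TOY_STR];
    uint64_t object;
} toy_entry_t;

/* Stand-in for a ZAP-like table: a flat array, searched linearly. */
typedef struct toy_index {
    toy_entry_t ent[TOY_MAX_ENTRIES];
    size_t count;
} toy_index_t;

/* Stand-in for a DMU-like transaction: tags are staged until commit. */
typedef struct toy_tx {
    toy_index_t *index;
    toy_entry_t staged[TOY_MAX_ENTRIES];
    size_t nstaged;
} toy_tx_t;

void toy_tx_begin(toy_tx_t *tx, toy_index_t *index)
{
    assert(tx != NULL && index != NULL);
    tx->index = index;
    tx->nstaged = 0;
}

/* Stage one tag. Returns 0, or -1 if a string is too long or the tx is full. */
int toy_tx_tag(toy_tx_t *tx, uint64_t object, const char *key, const char *value)
{
    if (strlen(key) >= TOY_STR || strlen(value) >= TOY_STR)
        return -1;
    if (tx->nstaged == TOY_MAX_ENTRIES)
        return -1;
    toy_entry_t *e = &tx->staged[tx->nstaged++];
    strcpy(e->key, key);
    strcpy(e->value, value);
    e->object = object;
    return 0;
}

/* Apply every staged tag, or none of them if the index cannot hold them all. */
int toy_tx_commit(toy_tx_t *tx)
{
    toy_index_t *ix = tx->index;
    if (tx->nstaged > TOY_MAX_ENTRIES - ix->count)
        return -1;
    for (size_t i = 0; i < tx->nstaged; i++)
        ix->ent[ix->count++] = tx->staged[i];
    tx->nstaged = 0;
    return 0;
}

/* Collect up to max object numbers tagged key=value; returns how many were written. */
size_t toy_query(const toy_index_t *ix, const char *key, const char *value,
                 uint64_t *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i < ix->count && n < max; i++) {
        if (strcmp(ix->ent[i].key, key) == 0 &&
            strcmp(ix->ent[i].value, value) == 0)
            out[n++] = ix->ent[i].object;
    }
    return n;
}
```

A caller would begin a transaction, stage a few tags such as `type=photo` on objects 7 and 9, commit, and then ask `toy_query` for every object tagged `type=photo`. Nothing staged is visible to queries until the commit succeeds, which is the property the real transactional object layer would be relied on to provide.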
Hi Bill & James,

Thanks for the confirmation. What about the ZIL? Is that specific to ZFS, or should I extend/replace that also? As I understand it, from Neil's blog description, the transaction logging in the DMU is per-pool, while the ZIL is per-filesystem? If I'm reading that correctly, the ZIL needs to be replaced in order to handle whatever fsync-type semantics my own filesystem code would provide.

Now I just have to read up on the OpenSolaris driver model and VFS, to figure out what part of the VDEV is doing the actual block writing to disk/file, and how the ZFS POSIX shim receives file ops from the VFS tree above it. :) And then there's figuring out how to rewrite this in C# or Managed C++ (don't ask...)

Thanks!

--Scott
Scott Balmos wrote on 11/17/05 18:51:
> Thanks for the confirmation. What about the ZIL? Is that specific to ZFS, or
> should I extend/replace that also? As I understand it, from Neil's blog
> description, the transaction logging in the DMU is per-pool, while ZIL is
> per-filesystem? If I'm reading that correctly, the ZIL needs to be replaced
> in order to handle whatever fsync-type semantics my own filesystem code
> would provide.

First of all, yes, you are correct: the transaction groups are pool-wide and the ZIL log block chains are per object set (which maps to a ZPL filesystem).

You are also right in that you need to replace the ZIL, but only a bit of it. The ZIL used to be fairly ZPL-centric, but after a recent change it is now largely contained below the ZPL. This stroke of genius was Jeff Bonwick's doing, as he foresaw other uses of the ZIL that my short-sightedness did not see.

Anyway, it should be possible to craft a new consumer of this code. You would have to define your own log records after the generic log record header (lr_t), and define your own replay vector table for use when replaying the log after a panic. Then it should be a case of saving the log records in memory until an fsync-type request arrives...

Actually, this is beginning to get a bit heavy. I probably shouldn't design this right now. Suffice it to say that you shouldn't need to throw away all of the ZIL!

Good luck: Neil
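
As a rough illustration of the shape Neil outlines (consumer-defined records that start with a generic header, plus a replay vector table indexed by record type), here is a small self-contained C sketch. It is not the ZIL interface. `demo_lr_t` is only a stand-in for the generic header and does not reproduce the real `lr_t` layout, and every other name (`demo_lr_setattr_t`, `demo_log_fsync`, `demo_replay`, ...) is hypothetical. Treating only the records appended before the last fsync-style call as replayable is also an assumption made for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic header; a stand-in for lr_t, not its real layout. */
typedef struct demo_lr {
    uint64_t txtype;   /* selects the replay function */
    uint64_t reclen;   /* record length in bytes, header included */
} demo_lr_t;

/* Consumer-defined record: generic header first, then our own fields. */
typedef struct demo_lr_setattr {
    demo_lr_t hdr;
    uint64_t object;
    uint64_t value;
} demo_lr_setattr_t;

enum { DEMO_TX_SETATTR = 1, DEMO_TX_NTYPES = 2 };

#define DEMO_NOBJ 8
#define DEMO_LOGSZ 256

/* Toy filesystem state that replay rebuilds. */
typedef struct demo_fs {
    uint64_t attr[DEMO_NOBJ];
    int has_attr[DEMO_NOBJ];
} demo_fs_t;

/* In-memory log; bytes below 'synced' are assumed to be safely on disk. */
typedef struct demo_log {
    unsigned char buf[DEMO_LOGSZ];
    size_t used;
    size_t synced;
} demo_log_t;

typedef int (*demo_replay_fn)(demo_fs_t *fs, const unsigned char *rec, size_t reclen);

/* Replay one set-attribute record against the toy state. */
int demo_replay_setattr(demo_fs_t *fs, const unsigned char *rec, size_t reclen)
{
    demo_lr_setattr_t lr;
    if (reclen != sizeof(lr))
        return -1;
    memcpy(&lr, rec, sizeof(lr));   /* copy out; the buffer may be unaligned */
    if (lr.object >= DEMO_NOBJ)
        return -1;
    fs->attr[lr.object] = lr.value;
    fs->has_attr[lr.object] = 1;
    return 0;
}

/* Replay vector table, indexed by txtype; slot 0 is unused. */
static const demo_replay_fn demo_replay_vector[DEMO_TX_NTYPES] = {
    NULL,
    demo_replay_setattr,
};

/* Append a set-attribute record to the in-memory log. */
int demo_log_setattr(demo_log_t *log, uint64_t object, uint64_t value)
{
    demo_lr_setattr_t lr = { { DEMO_TX_SETATTR, sizeof(lr) }, object, value };
    if (sizeof(lr) > DEMO_LOGSZ - log->used)
        return -1;
    memcpy(log->buf + log->used, &lr, sizeof(lr));
    log->used += sizeof(lr);
    return 0;
}

/* fsync-style request: everything appended so far is now considered durable. */
void demo_log_fsync(demo_log_t *log)
{
    assert(log->synced <= log->used);
    log->synced = log->used;
}

/* Replay the synced records in order; returns the count, or -1 on a bad record. */
int demo_replay(demo_fs_t *fs, const demo_log_t *log)
{
    size_t off = 0;
    int n = 0;
    while (off < log->synced) {
        demo_lr_t hdr;
        if (log->synced - off < sizeof(hdr))
            return -1;
        memcpy(&hdr, log->buf + off, sizeof(hdr));
        if (hdr.txtype == 0 || hdr.txtype >= DEMO_TX_NTYPES)
            return -1;
        if (hdr.reclen < sizeof(hdr) || hdr.reclen > log->synced - off)
            return -1;
        if (demo_replay_vector[hdr.txtype](fs, log->buf + off, (size_t)hdr.reclen) != 0)
            return -1;
        off += (size_t)hdr.reclen;
        n++;
    }
    return n;
}
```

In this toy model, a record appended after the last `demo_log_fsync` call is simply lost on "replay", which mirrors the idea that a consumer only promises durability for what was logged before an fsync-type request completed. Adding a new operation would mean defining another record struct that begins with the header, picking a new type number, and adding its function to the vector table.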