thr3ads.net - zfs discuss - [zfs-discuss] XATTRs, ZAP and the Mac [May 2006]

Maury Markowitz

2006-May-02 20:02 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

Now that the documentation section is expanding, I can make a
not-so-ill-informed comment about zfs and the Mac OS. If I'm reading
the documentation correctly, the news is unfortunately bad.

The "problem" with Mac support is that the Mac stored all of its
display-related information in the directory itself. By this I refer to things
like the icon's position within its directory window, the "type
and creator" flags that are an analog to a file extension, and even things
like window sizing, position and scroll location for folders. The idea here was
that on a floppy, which the FS was targeted for way back when, a single pass
over the flat-file directory would give you EVERYTHING you needed to draw the
display. Given the extremely limited bandwidth, this was an extremely practical
decision, if potentially limiting.

For the interested, here's the info needed:

http://developer.apple.com/technotes/tn/tn1150.html#FinderInfo

It's a pair of four-byte strings, three 2x16-bit Points and potentially
a 32-bit Rect, and a couple of flags, the vast majority of which are no longer
used or are duplicated in zfs's directory ZAP anyway.
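
To make the size of that record concrete, here is a minimal Python sketch of the per-file half of it. The field order is assumed to follow the FileInfo structure in TN1150 (type, creator, Finder flags, icon location, one reserved word, all big-endian); the helper names are made up for illustration.

```python
import struct

# Assumed layout of the 16-byte FileInfo record from TN1150 (big-endian):
# file type, creator, Finder flags, icon location (v, h), reserved word.
FILE_INFO = struct.Struct(">4s4sHhhH")

def pack_file_info(file_type, creator, flags, v, h):
    """Pack the Finder fields for one file into 16 bytes."""
    return FILE_INFO.pack(file_type, creator, flags, v, h, 0)

def unpack_file_info(raw):
    """Return (type, creator, flags, (v, h)) from a 16-byte record."""
    file_type, creator, flags, v, h, _reserved = FILE_INFO.unpack(raw)
    return file_type, creator, flags, (v, h)

record = pack_file_info(b"TEXT", b"ttxt", 0, 120, 64)
```

The point of the sketch is only that the whole record is a couple of machine words, which is why storing it next to the directory entry looks attractive.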

Now generally the Unix nerds would suggest hanging the Mac's extra
information in the xattrs. The problem here is that zfs stores all xattrs
separately from the ZAP directory entry. So while it is physically possible to
make another ACE for the Mac info and hang it off the xattr "inode"
(ZAP), this means that drawing a window in the Finder will require a file system
walk! This just isn't going to work; Mac users won't accept
PC-slow directory displays -- I know I wouldn't.

What I don't understand is why this disturbingly inflexible design was
chosen. Note that the ACL ACE is built in order to store up to six entries
in-line, which likely serves 95% of all cases. Why an identical solution was not
used for xattrs absolutely baffles me. An identical six entry with overflow ACE
dir would work wonderfully for xattrs, and in this particular case, would store
all the needed Mac-related goodness.

So can anyone tell me why xattrs weren't handled in the same way as
ACLs? It smells of inside-the-box-thinking, but I'm no FS expert and
there may very well be a good reason.

And if there isn't a good reason, is it simply too late to fix this? If
xattr were moved below gid and used up the pad[4], that would at least give us
something useful.

Maury
 
 
This message posted from opensolaris.org

Gregory Shaw

2006-May-02 20:13 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

I don't disagree with the below; however, you can run your Mac on UFS
instead of HFS+.  Since UFS hasn't been mac-ified, I'm wondering if
the below is actually true for all filesystem types.

On May 2, 2006, at 2:02 PM, Maury Markowitz wrote:
> [full quote of the original message snipped]
-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382           greg.shaw at sun.com (work)
Louisville, CO 80028-4382                 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds

Matthew Ahrens

2006-May-02 20:45 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

On Tue, May 02, 2006 at 01:02:59PM -0700, Maury Markowitz wrote:
> The "problem" with Mac support is that the Mac stored all of its
> display-related information in the directory itself.

As you've mentioned, performance considerations mean that extended
attributes may not be the best solution for storing this information.
Unfortunately, embedding the extended attributes in the znode_phys_t (as
you suggest) is not really practical.  The extended attributes are
full-fledged files, with all the associated metadata (permissions,
dates, etc).  There simply isn't enough space in the znode_phys_t to
store even a single extended attribute.

Even if you invented a new interface for name-value extended properties
on files, it would be tough to fit many properties in the unused space
in the znode_phys_t.  But such an interface might be worth considering,
if there are important uses for it (the Mac windowing information may be
one).

One way to store this information would be to simply add it to the
znode_phys_t (by using a couple words of the padding).  This would
result in very good performance, since you wouldn't even have to find
the attribute with the matching name.

Keep in mind that the whole idea of storing this windowing information
with the file presumes that there are not multiple (hard) links to a
file.  If there are multiple names for the file (ie. it exists in
multiple directories), it probably makes more sense to store the
windowing information with the directory entry.  That change would be
fairly straightforward:  use the ZAP to store a larger value in the
directory entry, adding words that describe the window information.

--matt

Matthew Ahrens

2006-May-02 23:37 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

On Tue, May 02, 2006 at 05:10:09PM -0400, Maury Markowitz wrote:
> >Unfortunately, embedding the extended attributes in the znode_phys_t (as
> >you suggest) is not really practical.  The extended attributes are
> >full-fledged files
>
> I think that's the disconnect. WHY are they "full-fledged files"?

Because that's what the specification calls for.  If they weren't
full-fledged files, they wouldn't be compatible with existing
interfaces.  That wouldn't necessarily be a bad thing, just a different
thing.  As I mentioned, a new, lighter-weight interface could be
designed in addition to extended attributes.

> >windowing information with the directory entry.  That change would be
> >fairly straightforward:  use the ZAP to store a larger value in the
> >directory entry, adding words that describe the window information.
>
> I think I need a little hand holding here. Where is the ZAP in relation to
> the directory or file? I read the documentation to imply that the ZAP was
> the directory (page 46, bottom) effectively. However, looking at it now, I
> see there is no file name for instance.
>
> Reading over section 5, am I correct that a znode_phys_t is "built up
> inside" a ZAP (potentially micro), and the ZAP itself holds the
> znode_phys_t and other additional information? If this is correct, is there
> a diagram somewhere that illustrates the resulting overall structure?

Ah, I see that you've been reading the on-disk format document.  Section
6.2 describes the relation between the ZAP, directory entries, and the
znode_phys_t:

	Filesystem directories are implemented as ZAP objects.  Each
	directory holds a set of name-value pairs which contain the
	names and object numbers for each directory entry.  Traversing
	through a directory tree is as simple as looking up the value
	for an entry and reading that object number.

	All filesystem objects contain a znode_phys_t structure in the
	bonus buffer of its dnode.  This structure stores the
	attributes for the filesystem object.

To recap, the ZAP implements a directory, mapping from file name to
object number.  That object number identifies (ie. refers to) a
particular dnode, which contains a znode_phys_t in the bonus buffer.

As I mentioned, the directory could potentially map from file name to
object number + some additional information, since the values stored in
the ZAP can be variable-length.
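
The lookup path in the recap above can be sketched in a few lines of Python. This is a toy model, not ZFS code: the dictionaries stand in for the directory ZAP and the object set, `Dnode`, `lookup` and the attribute names are invented for illustration, and the extra tuple element plays the role of the hypothetical "additional information" stored alongside the object number.

```python
from dataclasses import dataclass, field

@dataclass
class Dnode:
    """Stand-in for a dnode; 'bonus' plays the role of the znode_phys_t."""
    bonus: dict = field(default_factory=dict)

# Object set: object number -> dnode.
objects = {
    4: Dnode(bonus={"size": 1024, "mode": 0o644}),
}

# Directory ZAP: file name -> value. The first element of the value is the
# object number; anything after it is hypothetical extra per-entry data
# (here an icon position).
directory = {
    "notes.txt": (4, (120, 64)),
}

def lookup(directory, objects, name):
    """Resolve a file name to its attributes plus any per-entry extras."""
    obj_num, *extra = directory[name]
    return objects[obj_num].bonus, extra

attrs, extra = lookup(directory, objects, "notes.txt")
```

Note that the extras are available after the first dictionary lookup, while the file attributes need the second one; that is the performance argument for putting display data in the directory entry.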

Could you elaborate on what makes it seem like there "is no file name" and
that the "ZAP itself holds the znode_phys_t"?  Then we can change that
documentation to make it clear that that is not the case.

--matt

Matthew Ahrens

2006-May-03 21:20 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

On Wed, May 03, 2006 at 03:22:53PM -0400, Maury Markowitz wrote:
> >> I think that's the disconnect. WHY are they "full-fledged files"?
> >
> >Because that's what the specification calls for.
>
> Right, but that's my concern. To me this sounds like "historically
> circular" reasoning...
>
> 20xx) we need a new file system that supports xattrs
>         well xattrs are this second file, so...
>
> To me it appears that there is some confusion between the purpose and
> implementation.
> Certainly if xattrs were originally introduced to store, well, "x"
> attrs, then the implementation is a poor one. Years later the
> _implementation_ was copied, even though it was never a good one.

I think you are confusing the interface with the implementation.  ZFS
has "copied" (aka. adhered to) a pre-existing interface[*].  Our
implementation of that interface is in some ways similar to other
implementations.  I believe that our implementation is a very good one,
but if you have specific suggestions for how it could be improved, we'd
love to hear them.

[*] The Solaris extended attributes interface is actually more
accurately called "named streams", and has been used as the back-end for
CIFS (Windows) and NFSv4 named-streams protocols.  See the fsattr(5)
manpage.

We appreciate your suggestion that we implement a higher-performance
method for storing additional metadata associated with files.  This will
most likely not be possible within the extended attribute interface, and
will require that we design (and applications use) a new interface.
Having specific examples of how that interface would be used will help
us to design a useful feature.

> The real problem is that there is nothing like a "general overview" of
> the zfs system as a whole

I agree that a higher-level overview would be useful.

> COMPARING the system with the widely understood UFS would be
> invaluable, IMHO.

Agreed, thanks for the suggestion.  Unfortunately, ZFS and UFS are
sufficiently different that I think the comparison would only be useful
for a very limited part of ZFS, say from the file/directory down.

> But to the specifics. You asked why I thought it was that the file
> name did not appear. Well, that's because the term "file name" (or
> "filename") does not appear anywhere in the document.

Thanks, maybe we should use that keyword in section 6.2 to help when
doing a search.

> So then, at a first glance it seems that one would expect to find the
> directory description in Chapter 6, which has a subsection called
> "Directories and Directory Traversal".

I believe that that section does in fact describe directories.  Perhaps
the description could be made more explicit (eg. "The ZAP object which
stores the directory maps from filename to object number.  Each entry in
the ZAP is a single directory entry.  The entry's name is the filename,
and its value is the object number which identifies that file.").

> That section describes the znode_phys_t structure.

You're right, it also describes the znode_phys_t.  There should be a
section break after the first paragraph, before we start talking about
the znode_phys_t.

> Maybe I'm going down a dark alley here, but is there any reason this
> split still exists under zfs? IE, I assumed that the znode_phys_t would
> be located in the directory ZAP, because to my mind, that's where
> metadata belongs.

ZFS must support POSIX semantics, part of which is hard links.  Hard
links allow you to create multiple names (directory entries) for the
same file.  Therefore, all UNIX filesystems have chosen to store the
file information separately from the directory entries (otherwise, you'd
have multiple copies, and need pointers between all of them so you could
update them all -- yuck).
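
The hard-link behaviour described above is easy to observe from user space. The following Python sketch assumes a POSIX system whose temporary directory lives on a filesystem that supports hard links; the file names and the `link_demo` helper are arbitrary.

```python
import os
import tempfile

def link_demo():
    """Create one file with two names and return stat results for both."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("hello")
        os.link(a, b)        # second directory entry for the same file
        os.chmod(b, 0o600)   # change metadata through one of the names
        return os.stat(a), os.stat(b)

st_a, st_b = link_demo()
```

Both names report the same inode number and the same mode, because the metadata lives in one place and each directory entry only points at it.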

Hard links suck for FS designers because they constrain our
implementation in this way.  We'd love to have the flexibility to easily
store metadata with the directory entry.  We've actually contemplated
caching the metadata needed to do a stat(2) in the directory entry, to
improve performance of directory traversals like find(1).  Perhaps we'll
be able to add this performance improvement in a future release.

--matt

Bill Sommerfeld

2006-May-04 02:05 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

On Wed, 2006-05-03 at 17:20, Matthew Ahrens wrote:
> We appreciate your suggestion that we implement a higher-performance
> method for storing additional metadata associated with files.  This will
> most likely not be possible within the extended attribute interface, and
> will require that we design (and applications use) a new interface.
> Having specific examples of how that interface would be used will help
> us to design a useful feature.

Another potential consumer for an extra-metadata extension is trusted
extensions, for per-file security labels and similar obscurity.

Anton B. Rang

2006-May-04 15:20 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

> ZFS must support POSIX semantics, part of which is hard links. Hard
> links allow you to create multiple names (directory entries) for the
> same file. Therefore, all UNIX filesystems have chosen to store the
> file information separately from the directory entries (otherwise, you'd
> have multiple copies, and need pointers between all of them so you could
> update them all -- yuck).

For what it's worth, some file systems have chosen to special-case hard
links because they are rare and the directory/inode split hurts
performance.  Apple's HFS is a case in point.  The file metadata
("inode") is part of the directory entry, so that no additional disk
access is required to retrieve it.  If the file is a hard link, this
metadata is a pointer to the shared metadata for the file.
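
As a rough illustration of that special-casing (a toy model in Python, not the actual HFS on-disk format), an ordinary entry carries its metadata inline, and only a hard-linked entry pays for a second lookup. The `shared` table, the link IDs and the function names are all hypothetical.

```python
# Shared metadata for hard-linked files, keyed by a hypothetical link ID.
shared = {}

def make_entry(meta):
    """Ordinary file: metadata is stored inline in the directory entry."""
    return {"inline": meta}

def make_hard_link(entry, link_id):
    """Turn an entry into a pointer at shared metadata; return a second entry."""
    shared[link_id] = entry.pop("inline")
    entry["link"] = link_id
    return {"link": link_id}

def stat_entry(entry):
    """The common case needs one lookup; hard links need a second one."""
    if "inline" in entry:
        return entry["inline"]
    return shared[entry["link"]]

e1 = make_entry({"size": 5})
e2 = make_hard_link(e1, 1)
```

After the conversion both entries resolve to the same shared record, so an update through either name is seen through the other.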

Anton

Frank Hofmann

2006-May-04 15:29 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

>> ZFS must support POSIX semantics, part of which is hard links. Hard
>> links allow you to create multiple names (directory entries) for the
>> same file. Therefore, all UNIX filesystems have chosen to store the
>> file information separately from the directory entries (otherwise, you'd
>> have multiple copies, and need pointers between all of them so you could
>> update them all -- yuck).
>
> For what it's worth, some file systems have chosen to special-case hard
> links because they are rare and the directory/inode split hurts
> performance.  Apple's HFS is a case in point.  The file metadata
> ("inode") is part of the directory entry, so that no additional disk
> access is required to retrieve it.  If the file is a hard link, this
> metadata is a pointer to the shared metadata for the file.

Yes, Microsoft's FAT does it the same way - the dirent is the inode.

This creates locking nightmares in its own right - directory scans/updates 
may be blocking file access; at the very least, the two race. It might 
have advantages in some situations, and simplifies the metadata 
implementation - but at least to me, it also causes headaches ... and an 
upset stomach every now and then ...


FrankH.

Matthew Ahrens

2006-May-04 17:42 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

On Thu, May 04, 2006 at 10:05:31AM -0400, Maury Markowitz wrote:
> Hmmm, where in 6.2 is the filename? I see the description of the
> znode_phys_t, which doesn't have it, and "Each directory holds a set
> of name-value pairs which contain the names and object numbers for
> each directory entry."  Is that "names" the filenames?

Yes, the names that are stored in directory entries are filenames.

> And is it accurate to go further, that the "object number" is a
> pointer to a dnode? If so, what is the conceptual difference here with
> UFS, where the directory stores a filename and pointer to an inode?

Yes.  The concept is the same between ZFS and UFS (and every other UNIX
filesystem I'm aware of) -- the directory stores a mapping from filename
to some number, which points to the structure that describes the file
(dnode/znode, inode, etc).

> >We've actually contemplated caching the metadata needed to do a
> >stat(2) in the directory entry, to improve performance of directory
> >traversals like find(1).
>
> So how would this work? I assume the extra data is added as additional
> key/values into the directory, but how do they keep in sync with changes to
> the znode_phys_t?

We'd probably make the existing value longer, and store the additional
info in the existing name/value entry.  Keeping the changes in sync is
the challenge.
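
A small Python model shows why the synchronisation is the hard part. It is only a sketch of the problem, not of any planned ZFS design: the dictionaries, the `cached_size` field and both helper functions are invented for illustration.

```python
znodes = {7: {"size": 10}}   # object number -> attributes

# Two directory entries (hard links) for object 7, each caching the size.
dir_a = {"x": {"obj": 7, "cached_size": 10}}
dir_b = {"y": {"obj": 7, "cached_size": 10}}

def write_via(directory, name, new_size):
    """Naive update: refreshes the znode and only the entry used for access."""
    entry = directory[name]
    znodes[entry["obj"]]["size"] = new_size
    entry["cached_size"] = new_size

def is_stale(directory, name):
    """True if the cached copy no longer matches the znode."""
    entry = directory[name]
    return entry["cached_size"] != znodes[entry["obj"]]["size"]

write_via(dir_a, "x", 99)
```

After the write, the entry in `dir_a` is current but the one in `dir_b` is stale; a real implementation would have to find and refresh every entry that names the object, or be able to detect and discard outdated copies.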

--matt

Joerg Schilling

2006-May-07 10:43 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

Maury Markowitz <maury_markowitz at hotmail.com> wrote:
> So can anyone tell me why xattrs weren't handled in the same way as
> ACLs? It smells of inside-the-box-thinking, but I'm no FS expert and
> there may very well be a good reason.

They are:

ACLs (at least in UFS) are inside a shadow inode that is
referenced in the main inode.

XATTRs are inside a "shadow" XATTR directory that is
referenced in the inode.

The difference between what Apple does and what Sun does is that
Sun's implementation is not limited and also serves wishes
that come from Microsoft.

Jörg

-- 
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de                (uni)  
       schilling at fokus.fraunhofer.de     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

Joerg Schilling

2006-May-07 10:59 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

Gregory Shaw <greg.shaw at sun.com> wrote:
> I don't disagree with the below, however, you can run your mac on UFS
> instead of HFS+.   Since UFS hasn't been mac-ified, I'm wondering if
> the below is actually true for all filesystem types.

It seems that UFS has been "mac-ified" on MacOS X.

IIRC, you may append "/rsrc" to any file name.

Jörg


Joerg Schilling

2006-May-07 12:01 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

Matthew Ahrens <ahrens at eng.sun.com> wrote:
> > I think that's the disconnect. WHY are they "full-fledged files"?
>
> Because that's what the specification calls for.  If they weren't
> full-fledged files, they wouldn't be compatible with existing
> interfaces.  That wouldn't necessarily be a bad thing, just a different
> thing.  As I mentioned, a new, lighter-weight interface could be
> designed in addition to extended attributes.

Do you believe this should be done using a new, different basic implementation?

Note that I would need to know this in order to decide on how to support
XATTRs in an OS-independent way inside star.


Jörg


Joerg Schilling

2006-May-07 15:03 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

Frank Hofmann <Frank.Hofmann at Sun.COM> wrote:
> Yes, Microsoft's FAT does it the same way - the dirent is the inode.
>
> This creates locking nightmares in its own right - directory scans/updates
> may be blocking file access; at the very least, the two race. It might
> have advantages in some situations, and simplifies the metadata
> implementation - but at least to me, it also causes headaches ... and an
> upset stomach every now and then ...

With FAT, you are right, but there are other ways to implement hard links.

Look at my WOFS from 1990... It uses 'gnodes' that include the filename
in one single meta data chunk for a file. Hard links are implemented as
inode number related soft links (while symlinks are name related soft links).

If ZFS used my concept, you wouldn't have the problems you have with
FAT.

Jörg


Joerg Schilling

2006-May-09 12:26 UTC

[zfs-discuss] XATTRs, ZAP and the Mac

"Maury Markowitz" <maury_markowitz at hotmail.com> wrote:
> I believe xattrs were added to store things just like what we're talking
> about here. Specifically, if I'm not mistaken, many originally used them for
> ACL storage. Now that zfs promotes ACLs to first-class citizens, it seems
> that a reevaluation of what people ACTUALLY DO with xattrs and whether or
> not the current mechanism is "correct" for this role certainly seems in
> order. If some large percentage of use-cases turns out to be "storing
> ACLs", and some smaller percentage is "other OS metadata", then certainly
> it seems that a dedicated "expanding" (as in the zfs ACL system) key/value
> pair storage system seems to make sense. But that implies more API, which I
> don't think anyone would want.

XATTRs have been implemented recently, while ACLs have existed for a long time.
Solaris 2.4 already has them inside UFS....

The fact that Linux and FreeBSD did put them into XATTRs is a hack, and the
way they implement ACLs makes them slow.


Jörg


Wout Mertens

2006-May-09 15:04 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

On 07 May 2006, at 17:03, Joerg Schilling wrote:
> Look at my WOFS from 1990... It uses 'gnodes' that include the filename
> in one single meta data chunk for a file. Hard links are implemented as
> inode number related soft links (while symlinks are name related
> soft links).
>
> If ZFS used my concept, you wouldn't have the problems you have
> with FAT.

Yes, but WOFS is a write-once filesystem. ZFS is read-write. What
happens if you delete the file referenced by the inode-softlinks?

Wout.

Joerg Schilling

2006-May-09 15:37 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

Wout Mertens <wmertens at cisco.com> wrote:
> On 07 May 2006, at 17:03, Joerg Schilling wrote:
>
> > Look at my WOFS from 1990... It uses 'gnodes' that include the filename
> > in one single meta data chunk for a file. Hard links are implemented as
> > inode number related soft links (while symlinks are name related
> > soft links).
> >
> > If ZFS used my concept, you wouldn't have the problems you have
> > with FAT.
>
> Yes, but WOFS is a write-once filesystem. ZFS is read-write. What
> happens if you delete the file referenced by the inode-softlinks?

WOFS lives on a write-once medium; WOFS itself is not write-once.

I would need to check my papers.... there is a solution.

Jörg


Nicolas Williams

2006-May-09 16:09 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

On Tue, May 09, 2006 at 05:37:07PM +0200, Joerg Schilling wrote:
> Wout Mertens <wmertens at cisco.com> wrote:
> > On 07 May 2006, at 17:03, Joerg Schilling wrote:
> > > If ZFS used my concept, you wouldn't have the problems you have
> > > with FAT.
> >
> > Yes, but WOFS is a write-once filesystem. ZFS is read-write. What
> > happens if you delete the file referenced by the inode-softlinks?
>
> WOFS lives on a write-once medium; WOFS itself is not write-once.
>
> I would need to check my papers.... there is a solution.

If you unlink the original name/inode entry you can mark it as deleted
without actually deleting it, thus leaving extant links to it live and
fresh; you only have to chase down the other links when the original's
directory is removed, though you can do the same to directories,
deferring actual removal to later (at the cost of wasting disk space and
complicating accounting).  And/or you can always lazily update back
references, if you throw in a log, so that you can find those when they
are needed.

Note that this asymmetric hard-linking approach makes original links
fast and others slow; this will surely bother someone :)
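
The "mark it as deleted without actually deleting it" idea can be sketched as a toy Python model. It illustrates the approach only and is not taken from WOFS or ZFS; the entry layout and the three helper functions are hypothetical.

```python
# name -> entry. The "original" entry holds the metadata inline; every other
# link holds only a reference to the original's name.
entries = {
    "orig": {"meta": {"size": 5}, "deleted": False},
    "alias": {"ref": "orig"},
}

def unlink(name):
    """Tombstone an original that is still referenced; otherwise remove it."""
    entry = entries[name]
    if "meta" in entry and any(e.get("ref") == name for e in entries.values()):
        entry["deleted"] = True   # hidden from listings, but kept alive
    else:
        del entries[name]

def visible_names():
    """Names a directory listing would show."""
    return sorted(n for n, e in entries.items() if not e.get("deleted"))

def read_meta(name):
    """Originals answer directly; other links take one extra hop."""
    entry = entries[name]
    return entries[entry["ref"]]["meta"] if "ref" in entry else entry["meta"]

unlink("orig")
```

The extra hop in `read_meta` is the asymmetry mentioned above: the original name is fast, and every other link is slower.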

Nico
--

Wout Mertens

2006-May-09 21:37 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

On 09 May 2006, at 18:09, Nicolas Williams wrote:
> On Tue, May 09, 2006 at 05:37:07PM +0200, Joerg Schilling wrote:
>> Wout Mertens <wmertens at cisco.com> wrote:
>>>
>>> Yes, but WOFS is a write-once filesystem. ZFS is read-write. What
>>> happens if you delete the file referenced by the inode-softlinks?
>>
>> WOFS lives on a write-once medium; WOFS itself is not write-once.
>>
>> I would need to check my papers.... there is a solution.
>
> If you unlink the original name/inode entry you can mark it as deleted
> without actually deleting it, thus leaving extant links to it live and
> fresh; you only have to chase down the other links when the original's
> directory is removed, though you can do the same to directories,
> deferring actual removal to later (at the cost of wasting disk space and
> complicating accounting).  And/or you can always lazily update back
> references, if you throw in a log, so that you can find those when they
> are needed.
>
> Note that this asymmetric hard-linking approach makes original links
> fast and others slow; this will surely bother someone :)

How about, as soon as a file is hardlinked, you move it to a special
invisible directory, and make both the target and the source be your
special inode-symlink? That solves the directory deletion and makes
the access symmetrical...

Wout.

Joerg Schilling

2006-May-09 21:48 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

Wout Mertens <wmertens at cisco.com> wrote:
> >> WOFS lives on a write-once medium; WOFS itself is not write-once.
> >>
> >> I would need to check my papers.... there is a solution.
> >
> > If you unlink the original name/inode entry you can mark it as deleted
> > without actually deleting it, thus leaving extant links to it live and
> > fresh; you only have to chase down the other links when the original's
> > directory is removed, though you can do the same to directories,
> > deferring actual removal to later (at the cost of wasting disk space and
> > complicating accounting).  And/or you can always lazily update back
> > references, if you throw in a log, so that you can find those when they
> > are needed.
> >
> > Note that this asymmetric hard-linking approach makes original links
> > fast and others slow; this will surely bother someone :)
>
> How about, as soon as a file is hardlinked, you move it to a special
> invisible directory, and make both the target and the source be your
> special inode-symlink? That solves the directory deletion and makes
> the access symmetrical...

Possible implementations are discussed in my master's thesis:

e.g. page 1042 of "OpenSolaris für Anwender, Administratoren und
Rechenzentren"
ISBN: 3-540-29236-5

Or page 16 of the master's thesis:
http://cdrecord.berlios.de/old/private/wofs.ps.gz


Jörg


Wout Mertens

2006-May-10 06:48 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

On 09 May 2006, at 23:48, Joerg Schilling wrote:
> Wout Mertens <wmertens at cisco.com> wrote:
>
>>>> WOFS lives on a write-once medium; WOFS itself is not write-once.

Oops, now that I read your thesis, I see. So you can treat a WORM
like a normal disk. Cool :)

How come it never got traction? There was a time in the 90s when it  
would have been great to have a stable writeable filesystem on cheap  
CD-ROMs. In fact, with dual layer DVDs right now, it still would have  
its uses. BTW, finding WOFS on Google is not the easiest thing :-(

While reading your thesis, I kept thinking that it should be possible  
for ZFS to be implemented on top of a WORM as well.

Most of the hard bits are already done, checksums and copy-on-write.
The only issue would be the root blocks that can't be overwritten,
but that can be fixed like in your thesis, by growing an array from
one side of the disk and putting all the data on the other side.

BTW, I would have put the generation nodes at the beginning of the  
disk and the data at the end, because then you can read the whole  
array in one big sequential gulp.
>>>> I would need to check my papers.... there is a solution.
>>>
>>> If you unlink the original name/inode entry you can mark it as deleted
>>> without actually deleting it, thus leaving extant links to it live and
>>> fresh; you only have to chase down the other links when the original's
>>> directory is removed, though you can do the same to directories,
>>> deferring actual removal to later (at the cost of wasting disk space and
>>> complicating accounting).  And/or you can always lazily update back
>>> references, if you throw in a log, so that you can find those when they
>>> are needed.
>>>
>>> Note that this asymmetric hard-linking approach makes original links
>>> fast and others slow; this will surely bother someone :)
>>
>> How about, as soon as a file is hardlinked, you move it to a special
>> invisible directory, and make both the target and the source be your
>> special inode-symlink? That solves the directory deletion and makes
>> the access symmetrical...
>
> Possible implementations are discussed in my master thesis:
But it doesn't mention this solution ;-)
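
For what it's worth, the "invisible directory" suggestion quoted above might look something like this as a toy in-memory model. Everything here is an assumption made for illustration: `ToyFS`, the `("file", ...)` / `("ref", ...)` entry tuples and the `hidden` table are invented names, and nothing below reflects how WOFS or ZFS actually store links.

```python
class ToyFS:
    """Toy model: file data lives in its directory entry until it is hardlinked."""

    def __init__(self):
        self.dirs = {}     # directory name -> {entry name: (kind, value)}
        self.hidden = {}   # hidden id -> [link count, data]
        self.next_id = 0

    def mkdir(self, d):
        self.dirs[d] = {}

    def create(self, d, name, data):
        # An unlinked file is stored inline in its (only) directory entry.
        self.dirs[d][name] = ("file", data)

    def link(self, src_dir, src_name, dst_dir, dst_name):
        kind, value = self.dirs[src_dir][src_name]
        if kind == "file":
            # First hardlink: move the file into the hidden directory and
            # turn the original name into a reference as well.
            hid = self.next_id
            self.next_id += 1
            self.hidden[hid] = [1, value]
            self.dirs[src_dir][src_name] = ("ref", hid)
        else:
            hid = value
        self.hidden[hid][0] += 1
        self.dirs[dst_dir][dst_name] = ("ref", hid)

    def read(self, d, name):
        kind, value = self.dirs[d][name]
        return value if kind == "file" else self.hidden[value][1]

    def unlink(self, d, name):
        kind, value = self.dirs[d].pop(name)
        if kind == "ref":
            self.hidden[value][0] -= 1
            if self.hidden[value][0] == 0:
                del self.hidden[value]

    def rmdir(self, d):
        # No other links need chasing: each name just drops its own reference.
        for name in list(self.dirs[d]):
            self.unlink(d, name)
        del self.dirs[d]


fs = ToyFS()
fs.mkdir("a")
fs.mkdir("b")
fs.create("a", "f", "hello")
fs.link("a", "f", "b", "g")
fs.rmdir("a")            # removing the original's directory
print(fs.read("b", "g"))  # the other name still resolves
```

Both names go through the same one-step indirection, so access is symmetrical, and removing the original's directory never requires finding the other links.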

Anyway, we might be straying a bit too far from the topic.

Wout.

Joerg Schilling

2006-May-16 11:59 UTC

[zfs-discuss] Re: XATTRs, ZAP and the Mac

Wout Mertens <wmertens at cisco.com> wrote:
> >>>> WOFS lives on a Write once medium, WOFS itself is not write once.
>
> Oops, now that I read your thesis, I see. So you can treat a WORM  
> like a normal disk. Cool :)
Thank you ;-)
> How come it never got traction? There was a time in the 90s when it  
> would have been great to have a stable writeable filesystem on cheap  
> CD-ROMs. In fact, with dual layer DVDs right now, it still would have  
> its uses. BTW, finding WOFS on Google is not the easiest thing :-(
I have no idea why nobody was interested. I talked about it many 
times in the 90s. Maybe the absence of web servers at that time
was the reason the right people missed the concept.

> While reading your thesis, I kept thinking that it should be possible  
> for ZFS to be implemented on top of a WORM as well.
As it seems that ZFS is similar to WOFS in several places you may be right.
> Most of the hard bits are already done, checksums and copy-on-write.  
> The only issue would be the root blocks that can't be overwritten,
> but that can be fixed like in your thesis, by growing an array from  
> one side of the disk and putting all the data on the other side.
Correct.
> BTW, I would have put the generation nodes at the beginning of the  
> disk and the data at the end, because then you can read the whole  
> array in one big sequential gulp.
The gnodes grow backwards, but they are read in forwards.

> > Possible implementations are discussed in my master thesis:
>
> But it doesn't mention this solution ;-)
>
> Anyway, we might be straying a bit too far from the topic.
Well, I did not say it mentions _all_ possible solutions ;-)

Jörg

-- 
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de                (uni)  
       schilling at fokus.fraunhofer.de     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

Bill Sommerfeld

2006-May-16 13:19 UTC

[zfs-discuss] Re: filesystems for WORM

On Tue, 2006-05-16 at 07:59, Joerg Schilling wrote:
> I have no idea why nobody was interested. I did talk about it many 
> times in the 90s. Maybe the absence of web servers at that time
> was the reason that the right people did miss the concept.
Someone I worked with in the 1985-1987 timeframe also wrote a read-write
filesystem for WORM drives, and, as best as I can tell, it rolled over
and sank without leaving a trace...

					- Bill
