thr3ads.net - zfs code - [zfs-code] System call to create a clone of a file on a ZFS filesystem? [Oct 2007]

Peter Eriksson

2007-Oct-10 12:59 UTC

[zfs-code] System call to create a clone of a file on a ZFS filesystem?

At lunch today we started talking about a feature that would have been nice
to have - a system call sort of similar to link(2) where you would get a cloned
copy of a file that would (initially) share the same data blocks on the disk but
would use copy-on-write to create private copies as soon as something is
modified.

It could be nice to use, for example, for a mail server using maildirs, so that
the mail delivery program sending a mail to multiple users could use that
syscall instead of writing N copies of the same mail...
(It would save space until the users start to modify the files).

Anyway, just a brainstorm idea that came up... :-)
--
This message posted from opensolaris.org
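
The proposal above can be illustrated with a toy model. This is only a sketch under stated assumptions: `BlockStore`, `File`, `clone` and `write_block` are hypothetical names invented for the illustration, not a ZFS or Solaris interface, and the "disk" is an in-memory dictionary. It shows a clone sharing every block with the original and getting a private block only when it is modified.

```python
class BlockStore:
    """Toy in-memory block pool with per-block reference counts (hypothetical)."""

    def __init__(self):
        self.data = {}      # block id -> bytes
        self.refs = {}      # block id -> number of files pointing at it
        self.next_id = 0

    def alloc(self, payload):
        bid = self.next_id
        self.next_id += 1
        self.data[bid] = payload
        self.refs[bid] = 1
        return bid

    def hold(self, bid):
        self.refs[bid] += 1

    def release(self, bid):
        self.refs[bid] -= 1
        if self.refs[bid] == 0:      # last reference gone: block can be freed
            del self.refs[bid]
            del self.data[bid]


class File:
    def __init__(self, store, blocks=()):
        self.store = store
        self.blocks = list(blocks)   # ordered list of block ids

    @classmethod
    def create(cls, store, payloads):
        return cls(store, [store.alloc(p) for p in payloads])

    def clone(self):
        """Hypothetical clone operation: share every block, copy nothing."""
        for bid in self.blocks:
            self.store.hold(bid)
        return File(self.store, self.blocks)

    def write_block(self, index, payload):
        """Copy-on-write: new data goes to a fresh block, the old one is released."""
        old = self.blocks[index]
        self.blocks[index] = self.store.alloc(payload)
        self.store.release(old)

    def read(self):
        return b"".join(self.store.data[b] for b in self.blocks)

    def unlink(self):
        for bid in self.blocks:
            self.store.release(bid)
        self.blocks = []


store = BlockStore()
mail = File.create(store, [b"Subject: hi\n", b"body\n"])
copy = mail.clone()
blocks_after_clone = len(store.data)   # the clone allocated nothing new
copy.write_block(0, b"Subject: re\n")  # only this block becomes private
```

In this model the N-recipient mail delivery case costs one set of blocks plus whatever each recipient's copy later changes.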

Robert Milkowski

2007-Oct-11 07:52 UTC

Hello Peter,

Wednesday, October 10, 2007, 1:59:20 PM, you wrote:

PE> At lunch today we started talking about a feature that would
PE> have been nice to have - a system call sort of similar to link(2)
PE> where you would get a cloned copy of a file that would (initially)
PE> share the same data blocks on the disk but would use copy-on-write
PE> to create private copies as soon as something is modified.

PE> It could be nice to use, for example, for a mail server using
PE> maildirs, so that the mail delivery program sending a mail to
PE> multiple users could use that syscall instead of writing N copies
PE> of the same mail...
PE> (It would save space until the users start to modify the files).

While I totally second the idea (it was discussed on zfs-discuss
some time ago), in the case of an email platform it wouldn't be that easy, as
every file is different (different headers at least). But there are
definitely other uses where it would be useful and improve the user
experience. I haven't looked into details but in theory one should be
able to copy/move a file within the same datapool between datasets
without having to actually copy data blocks... or maybe there's some
detail which actually makes it hard to implement...

-- 
Best regards,
 Robert Milkowski                      mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com

Matthew Ahrens

2007-Oct-11 08:10 UTC

Robert Milkowski wrote:
> I haven't looked into details but in theory one should be
> able to copy/move a file within the same datapool between datasets
> without having to actually copy data blocks... or maybe there's some
> detail which actually makes it hard to implement...

Once a block is referenced by multiple filesystems, it is nontrivial to
determine when it can be freed.

--matt

Robert Milkowski

2007-Oct-11 09:47 UTC

Hello Matthew,

Thursday, October 11, 2007, 9:10:13 AM, you wrote:

MA> Robert Milkowski wrote:
>> I haven't looked into details but in theory one should be
>> able to copy/move a file within the same datapool between datasets
>> without having to actually copy data blocks... or maybe there's some
>> detail which actually makes it hard to implement...

MA> Once a block is referenced by multiple filesystems, it is nontrivial to
MA> determine when it can be freed.

In a way multiple snapshots are separate file systems, or clones...
What's the difference? However, I'm sure you're right...

-- 
Best regards,
 Robert Milkowski                      mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com

Pawel Jakub Dawidek

2007-Oct-11 10:27 UTC

On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
> Hello Matthew,
>
> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
>
> MA> Robert Milkowski wrote:
> >> I haven't looked into details but in theory one should be
> >> able to copy/move a file within the same datapool between datasets
> >> without having to actually copy data blocks... or maybe there's some
> >> detail which actually makes it hard to implement...
>
> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
> MA> determine when it can be freed.
>
> In a way multiple snapshots are separate file systems, or clones...
> What's the difference? However, I'm sure you're right...

Snapshots and clones are not autonomous datasets. A clone always has a
parent; you can use 'zfs promote' to switch the relationship, but you
cannot make them independent, AFAIK.

To Matthew: As I understand it, Robert was talking more about moving the
blocks to another dataset, not creating a hardlink-like situation - only
one dataset will reference the blocks after the move.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Robert Milkowski

2007-Oct-11 13:39 UTC

Hello Pawel,

Thursday, October 11, 2007, 11:27:07 AM, you wrote:

PJD> On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
>> Hello Matthew,
>>
>> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
>>
>> MA> Robert Milkowski wrote:
>> >> I haven't looked into details but in theory one should be
>> >> able to copy/move a file within the same datapool between datasets
>> >> without having to actually copy data blocks... or maybe there's some
>> >> detail which actually makes it hard to implement...
>>
>> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
>> MA> determine when it can be freed.
>>
>> In a way multiple snapshots are separate file systems, or clones...
>> What's the difference? However, I'm sure you're right...

PJD> Snapshots and clones are not autonomous datasets. A clone always has a
PJD> parent; you can use 'zfs promote' to switch the relationship, but you
PJD> cannot make them independent, AFAIK.

PJD> To Matthew: As I understand it, Robert was talking more about moving the
PJD> blocks to another dataset, not creating a hardlink-like situation - only
PJD> one dataset will reference the blocks after the move.

Yep, with move that's what I had in mind.
I was also talking about zfscopy...


-- 
Best regards,
 Robert                            mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com

Matthew Ahrens

2007-Oct-11 16:49 UTC

Pawel Jakub Dawidek wrote:
> On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
>> Hello Matthew,
>>
>> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
>>
>> MA> Robert Milkowski wrote:
>>>> I haven't looked into details but in theory one should be
>>>> able to copy/move a file within the same datapool between datasets
>>>> without having to actually copy data blocks... or maybe there's some
>>>> detail which actually makes it hard to implement...
>> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
>> MA> determine when it can be freed.
>>
>> In a way multiple snapshots are separate file systems, or clones...
>> What's the difference? However, I'm sure you're right...

Well, snapshots are nontrivial too.  See

http://blogs.sun.com/ahrens/entry/is_it_magic

> Snapshots and clones are not autonomous datasets. A clone always has a
> parent; you can use 'zfs promote' to switch the relationship, but you
> cannot make them independent, AFAIK.
>
> To Matthew: As I understand it, Robert was talking more about moving the
> blocks to another dataset, not creating a hardlink-like situation - only
> one dataset will reference the blocks after the move.

Well, he said "copy/move".  "copy" implied to me that both filesystems would
reference the same blocks.  And even if it is just "move", you still have the
issue of snapshots from the original filesystem referencing it.  Changing the
snapshots so they no longer reference the file?  Also nontrivial.

--matt

Pawel Jakub Dawidek

2007-Oct-11 18:46 UTC

On Thu, Oct 11, 2007 at 09:49:51AM -0700, Matthew Ahrens wrote:
> Pawel Jakub Dawidek wrote:
> > On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
> >> Hello Matthew,
> >>
> >> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
> >>
> >> MA> Robert Milkowski wrote:
> >>>> I haven't looked into details but in theory one should be
> >>>> able to copy/move a file within the same datapool between datasets
> >>>> without having to actually copy data blocks... or maybe there's some
> >>>> detail which actually makes it hard to implement...
> >> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
> >> MA> determine when it can be freed.
> >>
> >> In a way multiple snapshots are separate file systems, or clones...
> >> What's the difference? However, I'm sure you're right...
>
> Well, snapshots are nontrivial too.  See
>
> http://blogs.sun.com/ahrens/entry/is_it_magic
>
> > Snapshots and clones are not autonomous datasets. A clone always has a
> > parent; you can use 'zfs promote' to switch the relationship, but you
> > cannot make them independent, AFAIK.
> >
> > To Matthew: As I understand it, Robert was talking more about moving the
> > blocks to another dataset, not creating a hardlink-like situation - only
> > one dataset will reference the blocks after the move.
>
> Well, he said "copy/move".  "copy" implied to me that both filesystems would
> reference the same blocks.  And even if it is just "move", you still have the
> issue of snapshots from the original filesystem referencing it.  Changing the
> snapshots so they no longer reference the file?  Also nontrivial.

I'm sorry for trying to be too helpful:)

I understand it's not trivial, but being able to reference the same
block from different datasets would be a really nice feature to have.
The functionality discussed above is only one example. Another example
would be block aggregation (which has its own name I can't recall right
now), so we can run a thread once a day that frees duplicated blocks and
makes datasets point at one copy only.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Darren J Moffat

2007-Oct-12 10:10 UTC

Pawel Jakub Dawidek wrote:
> I understand it's not trivial, but being able to reference the same
> block from different datasets would be a really nice feature to have.
> The functionality discussed above is only one example. Another example
> would be block aggregation (which has its own name I can't recall right
> now), so we can run a thread once a day that frees duplicated blocks and
> makes datasets point at one copy only.

NTFS can do this within a single filesystem.

It feels to me almost like the opposite of ditto blocks :-) Though I 
would still want ditto blocks to work with this.

-- 
Darren J Moffat

Pawel Jakub Dawidek

2007-Oct-12 11:09 UTC

On Fri, Oct 12, 2007 at 11:10:31AM +0100, Darren J Moffat wrote:
> Pawel Jakub Dawidek wrote:
> > I understand it's not trivial, but being able to reference the same
> > block from different datasets would be a really nice feature to have.
> > The functionality discussed above is only one example. Another example
> > would be block aggregation (which has its own name I can't recall right
> > now), so we can run a thread once a day that frees duplicated blocks and
> > makes datasets point at one copy only.
>
> NTFS can do this within a single filesystem.
>
> It feels to me almost like the opposite of ditto blocks :-) Though I
> would still want ditto blocks to work with this.

My main use will be for things like Solaris zones or FreeBSD jails. It
is nice to have one base file system, which you can just clone, but over
time the clones get bigger and bigger. If you sell virtual
web servers, for example, and all your customers upgrade apache to a new
version, you end up with X copies of the same blocks and you lose
everything you saved by using clones initially. Being able to run a
process in the background every night, which will aggregate the blocks
back, would be really nice.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
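
The nightly aggregation pass described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `aggregate` function, the dictionary-based `pool` and `jails` structures, and the choice of SHA-256 are not taken from ZFS. Candidate duplicates are found by hash and then compared byte for byte, so a hash collision can never merge two different blocks.

```python
import hashlib


def aggregate(datasets, pool):
    """Re-point identical blocks at one shared copy (hypothetical helper).

    datasets: dict mapping dataset name -> list of block ids
    pool:     dict mapping block id -> block contents (bytes)
    Returns the number of duplicate blocks freed.
    """
    canonical = {}   # sha256 digest -> first block id seen with that content
    remap = {}       # duplicate block id -> canonical block id
    for bid in sorted(pool):
        digest = hashlib.sha256(pool[bid]).digest()
        keep = canonical.setdefault(digest, bid)
        # Verify the bytes too; the digest only nominates candidates.
        if keep != bid and pool[keep] == pool[bid]:
            remap[bid] = keep

    # Make every dataset point at the surviving copy, then free the rest.
    for name, blocks in datasets.items():
        datasets[name] = [remap.get(b, b) for b in blocks]
    for bid in remap:
        del pool[bid]
    return len(remap)


# Two jails cloned from one base image, each upgraded to the same apache.
pool = {0: b"base-os", 1: b"apache-2.2", 2: b"apache-2.2"}
jails = {"jail_a": [0, 1], "jail_b": [0, 2]}
freed = aggregate(jails, pool)
```

After the pass both jails reference block 1 and block 2 is gone, which is the saving the clones had lost after the upgrade. A real implementation would also have to deal with snapshots still referencing the freed copies, as discussed earlier in the thread.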

Matthew Ahrens

2007-Oct-12 18:07 UTC

Pawel Jakub Dawidek wrote:
> On Fri, Oct 12, 2007 at 11:10:31AM +0100, Darren J Moffat wrote:
>> Pawel Jakub Dawidek wrote:
>>> I understand it's not trivial, but being able to reference the same
>>> block from different datasets would be a really nice feature to have.
>>> The functionality discussed above is only one example. Another example
>>> would be block aggregation (which has its own name I can't recall right
>>> now), so we can run a thread once a day that frees duplicated blocks and
>>> makes datasets point at one copy only.
>> NTFS can do this within a single filesystem.
>>
>> It feels to me almost like the opposite of ditto blocks :-) Though I
>> would still want ditto blocks to work with this.
>
> My main use will be for things like Solaris zones or FreeBSD jails. It
> is nice to have one base file system, which you can just clone, but over
> time the clones get bigger and bigger. If you sell virtual
> web servers, for example, and all your customers upgrade apache to a new
> version, you end up with X copies of the same blocks and you lose
> everything you saved by using clones initially. Being able to run a
> process in the background every night, which will aggregate the blocks
> back, would be really nice.

Yeah, that would be nice.  De-duplication is on the list of problems we'd
like to attack sooner rather than later.

--matt

Torsten "Paul" Eichstädt

2007-Oct-25 16:37 UTC

Quick shot: What's wrong with maintaining a reference count?
Free the block only when ref# is zero.
Count the bytes in each fs it's in, but only once in (each of,
recursively) the parent(s) -- maybe this is costly, because now the
parent's size is not the sum of its children.
I don't know about the mathematical characteristics of the ZFS
checksum, but I assume it's good at detecting bit errors and might not be
good enough to find data suitable for aggregation.

Paul
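
The accounting point above can be made concrete with a small sketch. The dataset names, the block ids and the fixed 128 KiB block size are all assumptions for the illustration, and `used_bytes` is a hypothetical helper, not how ZFS computes space usage. Each filesystem is charged for every distinct block it references, while the parent charges a shared block only once.

```python
BLOCK_SIZE = 128 * 1024   # assumed fixed block size, for illustration only


def used_bytes(block_ids):
    """Charge each distinct block once, however often it is referenced."""
    return len(set(block_ids)) * BLOCK_SIZE


# Two hypothetical child filesystems that share blocks 1 and 2.
children = {
    "pool/home/a": [1, 2, 3],
    "pool/home/b": [1, 2, 4],
}

per_child = {name: used_bytes(ids) for name, ids in children.items()}
sum_of_children = sum(per_child.values())

# The parent sees four distinct blocks, not six.
parent = used_bytes([b for ids in children.values() for b in ids])
```

Here each child reports three blocks, but the parent holds only four distinct blocks, so the parent's size is smaller than the sum of its children. Computing the parent figure needs the set of distinct blocks across all children, which is where the cost mentioned above comes from.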
