While I understand that the filesystems are really separate,
is there any reason why a "mv" from one fs to another in the same
pool requires all data to be copied?

Is there any good reason to deny in-pool, cross fs hardlinks?
(I can understand how move cannot see that this is possible)

Casper
On Thu, Dec 01, 2005 at 06:41:36PM +0100, Casper.Dik at sun.com wrote:
> While I understand that the filesystems are really separate,
> is there any reason why a "mv" from one fs to another in the same
> pool requires all data to be copied?
>
> Is there any good reason to deny in-pool, cross fs hardlinks?
> (I can understand how move cannot see that this is possible)

The biggest issue with these is accounting for the storage.

I.e., with filesystems /foo1 and /foo2:

    mkfile 100m /foo1/tmp
    ln /foo1/tmp /foo2/tmp

How much space is used in /foo1? /foo2? Now do:

    rm /foo1/tmp

Now how much space is used?

The other issue is standards compliance:

    The rename() function will fail if:
    ...
    EXDEV   The links named by old and new are on different
            file systems.

and:

    The link() function will fail if:
    ...
    EXDEV   The link named by new and the file named by
            existing are on different logical devices
            (file systems).

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
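Both behaviors are easy to see on a scratch pool. A minimal sketch, assuming a hypothetical pool named tank with default mountpoints (the exact error text varies by release):

    zfs create tank/foo1
    zfs create tank/foo2
    mkfile 100m /tank/foo1/tmp

    # a cross-filesystem hard link is refused with EXDEV
    ln /tank/foo1/tmp /tank/foo2/tmp

    # mv only falls back to copy-and-unlink because rename(2)
    # fails with EXDEV across the filesystem boundary
    mv /tank/foo1/tmp /tank/foo2/tmp

    # per-filesystem accounting before and after tells the story
    zfs list tank/foo1 tank/foo2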
On Thu, Dec 01, 2005 at 06:41:36PM +0100, Casper.Dik at sun.com wrote:
> While I understand that the filesystems are really separate,
> is there any reason why a "mv" from one fs to another in the same
> pool requires all data to be copied?

This is something we plan on attempting. As Jonathan points out, you'd have to state (by setting a property or something) that you don't want strict POSIX compliance. But as you've guessed, this would be a really useful feature.

> Is there any good reason to deny in-pool, cross fs hardlinks?
> (I can understand how move cannot see that this is possible)

Cross-filesystem hardlinks are a different story. If you can come up with an answer to how snapshots and accounting should behave, we'd love to be able to implement this. So far, though, we haven't been able to figure out a good set of semantics.

This is Bill's second law of ZFS: Snapshots ruin almost every good idea I've ever had. :) But they're so damned useful.

--Bill
> While I understand that the filesystems are really separate,
> is there any reason why a "mv" from one fs to another in the same
> pool requires all data to be copied?

Yes. Each filesystem has its own object number space. Suppose that the file I want to link to is object 37 in filesystem A. If I try to create a hard link to it from filesystem B, what I'll actually end up with is a link to object 37 *in filesystem B*. This is pretty fundamental to the way directories work, both locally and over NFS.

We could make cross-filesystem hard links work if all filesystems in a pool shared the same object number space, but that would create a whole different set of problems.

Jeff
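You can see the per-filesystem number space from userland: ls -i prints the object (inode) number, and the same number can legitimately show up in two filesystems of one pool. An illustration with invented names and numbers:

    ls -i /tank/A/passwd /tank/B/vacation.jpg
    #    37 /tank/A/passwd
    #    37 /tank/B/vacation.jpg
    # A directory entry records only the object number (37), not the
    # filesystem it lives in, so a hard link in B pointing at "37"
    # would resolve to B's object 37 -- the wrong file.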
> > While I understand that the filesystems are really separate,
> > is there any reason why a "mv" from one fs to another in the same
> > pool requires all data to be copied?

In addition to what Jeff said, there is the more obvious (to me anyway) issue that the new filesystem may have different compression and checksum algorithms set, and in the (hopefully not too distant) future, different crypto algorithms and keys.

--
Darren J Moffat
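A quick way to see how per-filesystem these settings are, using the same hypothetical pool as above:

    # each filesystem carries its own on-disk encoding policy, so a
    # block moved between them would have to be re-encoded anyway
    zfs set compression=on tank/foo1
    zfs set checksum=sha256 tank/foo2
    zfs get compression,checksum tank/foo1 tank/foo2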
On 12/1/05, Jonathan Adams <jonathan.adams at sun.com> wrote:
> On Thu, Dec 01, 2005 at 06:41:36PM +0100, Casper.Dik at sun.com wrote:
> > While I understand that the filesystems are really separate,
> > is there any reason why a "mv" from one fs to another in the same
> > pool requires all data to be copied?

The one reason I can come up with for this is: if I forget to enable compression on a filesystem, how else am I going to compress all my data? Though I could imagine a daemon that could be enabled to search for compressible data and make recommendations that compression be enabled, or perhaps just compress data on the filesystem if doing so improves performance, or at least doesn't hurt it.

James Dickens
uadmin.blogspot.com
On Fri, Dec 02, 2005 at 03:23:53PM -0600, James Dickens wrote:
> The one reason I can come up with for this is: if I forget to enable
> compression on a filesystem, how else am I going to compress all my
> data? Though I could imagine a daemon that could be enabled to search
> for compressible data and make recommendations that compression be
> enabled, or perhaps just compress data on the filesystem if doing so
> improves performance, or at least doesn't hurt it.

You can just re-copy all of the data after enabling compression. It's fairly easy to write a script, or just do something like:

    find . -xdev -type f | cpio -ocB | cpio -idmuv

to re-write all of the data.

The ZFS folks have talked about having additional "scrub-like" tasks you could run on a pool or filesystem basis; a "recompress" one would be a possibility. The main problem is snapshots; if any exist, then you'll be paying for both the uncompressed and compressed versions.

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
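As later messages in this thread point out, that pipeline also rewrites hard-linked files as independent copies. A slightly more careful sketch, assuming a hypothetical filesystem tank/foo mounted at /tank/foo, restricts itself to singly-linked files and then checks the result:

    cd /tank/foo
    zfs set compression=on tank/foo

    # rewrite only regular files with a single link, so existing
    # hard links elsewhere in the tree are left untouched
    find . -xdev -type f -links 1 | cpio -ocB | cpio -idmuv

    # the rewritten blocks are compressed; compressratio shows
    # the achieved compression for the dataset
    zfs get compressratio tank/foo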
On Fri, 2005-12-02 at 21:37, Jonathan Adams wrote:
> The ZFS folks have talked about having additional "scrub-like" tasks you
> could run on a pool or filesystem basis; a "recompress" one would be a
> possibility. The main problem is snapshots; if any exist, then you'll
> be paying for both the uncompressed and compressed versions.

The other area where something like this may be needed is for cryptographic key-change rollover, though that is much more complex, and we will certainly have issues with snapshots and clones to deal with in this area.

--
Darren J Moffat
Jonathan Adams <jonathan.adams at sun.com> wrote:
> You can just re-copy all of the data after enabling compression. It's
> fairly easy to write a script, or just do something like:
>
>     find . -xdev -type f | cpio -ocB | cpio -idmuv
>
> to re-write all of the data.

.... and to destroy the content of all files > 5k.

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
On Mon, Dec 05, 2005 at 01:06:37PM +0100, Joerg Schilling wrote:
> Jonathan Adams <jonathan.adams at sun.com> wrote:
>
> > You can just re-copy all of the data after enabling compression. It's
> > fairly easy to write a script, or just do something like:
> >
> >     find . -xdev -type f | cpio -ocB | cpio -idmuv
> >
> > to re-write all of the data.
>
> .... and to destroy the content of all files > 5k.

Did you try the command? CPIO writes to a temporary file and then renames.

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
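Assuming the extract-to-temp-and-rename behavior Jonathan describes, you can observe it from the shell, because the rename gives the file a new object number (run from the file's directory; the file name is hypothetical):

    ls -i file     # note the object (inode) number
    echo file | cpio -ocB | cpio -idmuv
    ls -i file     # a different number: cpio extracted to a temporary
                   # file and renamed it over the original instead of
                   # overwriting the original's contents in place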
Jonathan Adams <jonathan.adams at sun.com> wrote:
> On Mon, Dec 05, 2005 at 01:06:37PM +0100, Joerg Schilling wrote:
> > .... and to destroy the content of all files > 5k.
>
> Did you try the command? CPIO writes to a temporary file and then renames.

OK, in theory I know this, but why is this missing from the documentation?

And BTW: this is the only reason why using an outdated command like cpio for BFU makes sense at all.

Also, the documentation for -u is incorrect. It does not mention that cpio always extracts dirs, even when they are not newer in the archive.

Jörg
>>>>> "JS" == Joerg Schilling <schilling at fokus.fraunhofer.de> writes:

JS> OK, in theory I know this, but why is this missing from the
JS> documentation?

Because it's an implementation detail? I'm assuming that the cpio | cpio thing Jonathan mentioned isn't explicitly mentioned in the ZFS documentation either.

Matt

--
Matt Simmons - simmonmt at eng.sun.com | Solaris Kernel - New York
Matthew Simmons <simmonmt at eng.sun.com> wrote:
> Because it's an implementation detail? I'm assuming that the cpio | cpio
> thing Jonathan mentioned isn't explicitly mentioned in the ZFS
> documentation either.

Do you believe that an "implementation detail" that may break hard links should be omitted from the man page?

Jörg
>>>>> "JS" == Joerg Schilling <schilling at fokus.fraunhofer.de> writes:

JS> Do you believe that an "implementation detail" that may break hard
JS> links should be omitted from the man page?

If it breaks hard links, then perhaps the detail should be changed/fixed.

Something that, by the way, we couldn't do had we described the detail in the manpages, thus setting it in stone for all eternity.

Matt

--
Matt Simmons - simmonmt at eng.sun.com | Solaris Kernel - New York
On Tue 06 Dec 2005 at 01:29PM, Joerg Schilling wrote:
> Also, the documentation for -u is incorrect. It does not mention that
> cpio always extracts dirs, even when they are not newer in the archive.

Please file a documentation bug at http://bugs.opensolaris.org.

-dp

--
Daniel Price - Solaris Kernel Engineering - dp at eng.sun.com - blogs.sun.com/dp
Matthew Simmons <simmonmt at eng.sun.com> wrote:
> If it breaks hard links, then perhaps the detail should be changed/fixed.
>
> Something that, by the way, we couldn't do had we described the detail in
> the manpages, thus setting it in stone for all eternity.

Does this mean that the behavior was changed to the current one to allow BFU?

Jörg
>>>>> "JS" == Joerg Schilling <schilling at fokus.fraunhofer.de> writes:

JS> Does this mean that the behavior was changed to the current one to
JS> allow BFU?

What?

I was saying that, if the current behavior is broken, we should fix it. I was also pointing out that, by not documenting the implementation detail in question, we actually have the ability to fix it. Had we documented the detail, life would be more difficult.

Matt

--
Matt Simmons - simmonmt at eng.sun.com | Solaris Kernel - New York
Matthew Simmons <simmonmt at eng.sun.com> wrote:
> I was saying that, if the current behavior is broken, we should fix it. I
> was also pointing out that, by not documenting the implementation detail
> in question, we actually have the ability to fix it. Had we documented
> the detail, life would be more difficult.

OK, so if cpio does not write _into_ existing files, and if the archive does not contain _all_ hard links for a file, the extraction will break hard links.

This is why star will never do something like this by default.

Jörg
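The failure mode is easy to reproduce under the same assumption, with invented file names:

    echo data > a
    ln a b            # a and b share one object
    ls -li a b        # same inode number, link count 2

    # archive and re-extract only one of the two links
    echo a | cpio -ocB | cpio -idmuv

    ls -li a b        # a now has a new inode with link count 1, while
                      # b still names the old object: the link is broken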
> > You can just re-copy all of the data after enabling compression. It's
> > fairly easy to write a script, or just do something like:
> >
> >     find . -xdev -type f | cpio -ocB | cpio -idmuv
> >
> > to re-write all of the data.
>
> .... and to destroy the content of all files > 5k.

I tried the above for fun on /, got tons of "File .... was modified while being copied", and ended up reinstalling my system from scratch :D (not a problem, it was just a Nexenta test installation)

Is there a reliable method of re-compressing a whole zfs volume after turning on compression or changing the compression scheme?

roland
roland wrote:
> Is there a reliable method of re-compressing a whole zfs volume after
> turning on compression or changing the compression scheme?

It would be slow, and the file system would need to be idle to avoid race conditions, but you _could_ do the following (POSIX shell syntax). I haven't tested this, so it could have typos or other problems:

    find . -type f -print | while IFS= read -r n; do
        TF="$(mktemp "${n%/*}/.tmpXXXXXX")"
        if cp -p "$n" "$TF"; then
            if ! mv "$TF" "$n"; then
                echo "failed to re-write $n in mv"
                rm "$TF"
            fi
        else
            echo "failed to re-write $n in cp"
            rm "$TF"
        fi
    done

--
Carson
I have had some success using zfs send/recv into a child of a compressed filesystem to do this, although you have the disadvantage of losing your settings.

Basically:

    zfs create tank/foo
    (mv a bunch of files into foo)
    zfs create tank/bar
    zfs set compression=on tank/bar
    zfs snapshot tank/foo@now
    zfs send tank/foo@now | zfs recv tank/bar/foosmall
    zfs destroy -r tank/foo
    zfs set compression=on tank/bar/foosmall
    zfs rename tank/bar/foosmall tank/foo

Kinda clunky, and you have to have twice as much space available, and there are probably other issues with it, as I am not a pro zfs user here, but it worked for me =)

Asa
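To confirm the rewrite actually bought you something, still assuming the hypothetical tank/foo layout:

    zfs get compression,compressratio tank/foo
    zfs list -o name,used,refer tank/foo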