Hi All,

Just curious about how incremental send works. Is it changed blocks or files, and how are the changed blocks or files identified?

Regards,
Vic
I'm curious as well. I'm trying to set up a near-line backup using two ZFS-based machines, and am a bit confused about how to set it up properly.

First time around, create a snapshot and send it to the remote:

  zfs snapshot master/fs@mirror
  zfs send master/fs@mirror | ssh mirror zfs recv backup/mirrorfs

Once that's done, fs@mirror == mirrorfs, correct? So if I wanted to start an incremental backup, I'd have to start creating mirror1, mirror2, etc.?

  zfs snapshot master/fs@mirror1
  zfs send -i mirror master/fs@mirror1 | ssh mirror zfs recv backup/mirrorfs
  (make changes)
  zfs snapshot master/fs@mirror2
  zfs send -i mirror1 master/fs@mirror2 | ssh mirror zfs recv backup/mirrorfs
  etc.

So now fs@mirror2 == mirrorfs. I could probably create a script that does a rename, snapshot, and send. But let's say it was running from cron and some run failed (for whatever reason). So now I'm running the script manually and it's complaining that the incremental source doesn't match, and apparently there is no way to tell which fs@mirrorX is the source short of trial-and-error, even if I kept all my incremental snapshots. Is there a clean way around this?

Perhaps a better example scenario: I accidentally destroyed the last fs@mirrorX (so there is no source snapshot to give to -i). Would my only recourse be to bite the bullet and do a zfs send of fs@mirrorX+1 (aka a full send)?

Another question: if I wanted to send a snapshot as a snapshot, I can do:

  zfs snapshot master/fs@savesnap
  zfs send master/fs@savesnap | ssh mirror zfs recv backup/mirrorfs@savesnap

And now fs@savesnap == mirrorfs@savesnap, right? But that would involve sending the whole stream, which is however much data the filesystem was consuming at that point in time? How would mirrorfs handle the difference between it and mirrorfs@savesnap? Would it take as much space as the difference between fs@mirrorX and fs@savesnap, or more?

Sorry for all these questions (and for somewhat derailing the thread), but I'd really like to set this up the right way the first time around. I'd love to be able to say

  zfs mirror master/fs[@snapshot] backup@mirror/mirrorfs

and have it sync a filesystem, either the current fs or a snapshot, but right now it seems as though the only way to do it is with snapshots, send/recv, and fiddling with incremental scripting...

Thanks,
-- Starfox
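For what it's worth, a minimal sketch of the kind of wrapper script described above (untested; the dataset names master/fs and backup/mirrorfs, the host name "mirror", and the state-file path are all taken from the example or made up) might look like:

  #!/bin/sh
  # take a new date-stamped snapshot of master/fs and send it incrementally
  # to backup/mirrorfs on the host "mirror", remembering the previous
  # snapshot name in a state file so a failed cron run can simply be retried
  FS=master/fs
  REMOTE=mirror
  DEST=backup/mirrorfs
  STATE=/var/tmp/last-mirror-snap

  new=mirror-`date +%Y%m%d%H%M%S`
  last=`cat $STATE 2>/dev/null`

  zfs snapshot $FS@$new || exit 1

  if [ -n "$last" ]; then
      # incremental from the last snapshot that was sent successfully
      # (the recv may need -F if the destination has been modified, eg by atime)
      zfs send -i $last $FS@$new | ssh $REMOTE zfs recv $DEST
  else
      # no state yet: send a full stream the first time around
      zfs send $FS@$new | ssh $REMOTE zfs recv $DEST
  fi

  # only advance the state file if the send/recv pipeline succeeded,
  # so the next run retries from the same source snapshot after a failure
  if [ $? -eq 0 ]; then
      echo $new > $STATE
  fi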
Oddly enough, I posted a script that does what you want (albeit without sending to a remote system) on Friday to my blog (http://blogs.sun.com/chrisg/entry/rolling_incremental_backups), which I use to back up my system to an external USB drive.

--chris
Yes, I've read through tons of blogs.sun.com/* entries, gone through the mailing list looking for the proper way to do it, etc. Unfortunately, zfs send/recv remains a hack that requires an elaborate script wrapper, precisely because zfs send/recv is by nature a send-only/recv-only operation (it was renamed from backup/restore a while ago). It's fine if you are backing the incrementals up to tape and restoring them at a later date, but it is lacking if you have two ZFS machines that are able to communicate. So on the high end, you have something like:

ZFS-exported-shareiscsi/nfs-put-in-fault-tolerant-ZFS-pool: This basically puts one big file on the exporting ZFS, which is required to be at least as large as the smallest pool device of the mirror/raidx pool. So if your local devices are x GB in size, you probably need an x*1.5 GB remote device in order for the exported flat file to allow the remote ZFS to do what it needs to do (checksums, etc.).

AVS-over-network-to-fault-tolerant-ZFS-pool: From what I can tell (and I'm having problems loading the Flash movie for the demos), AVS basically sits between ZFS and the pool device, monitors any block commands, saves them, and sends them to the remote. If you have identical setups it works fine, but it will not work if you don't have an identical setup of x devices doing y mirror/raidx on z pool, because it just sends block commands. So I can't have a near-line backup of a pool unless I mirror the pool setup, even with II; nothing like raidz for pool-a and mirror for pool-b, even though both pool-a and pool-b might be an identical size from the perspective of ZFS.

And on the "low" end, you have:

Ghetto-lofiadm-NFS-mirror-and-let-ZFS-bitch-at-you: This was off one of the blogs, where the author (you) basically exported a file via NFS, used lofiadm to create a "device", and added that to a pool as a mirror (see the sketch after this list). When you needed a "consistent" state, you just connected the device and let it resilver, and when it was done you disconnected it. After this you tried using iSCSI at a later date. The issues with this setup are that it a) requires slicing off a portion of the fs tree where you actually want a mirrored pool, which basically means ZFS won't be able to use the write cache (goes against the let-ZFS-manage-whole-devices philosophy), b) still needs a lot more disk space than the pool "device" size (same problem as the iscsi-device pool), and c) cannot be used where raidx is involved, because the "remote" device is liable to be disconnected at any time.

Nonexistent-export-a-whole-device-over-network: This I could not find. Basically, let the drive sit on one machine and let it be used as a pool device on another machine. This solves issue (b) of the lofiadm setup because it doesn't have the overhead of the underlying ZFS, but it still runs into issues (a) and (c).

Script-a-mirror: I've seen a couple of different ways of doing this. One is yours, another is ZetaBack, and another is zfs-auto-snapshot. This is why I asked all these weird questions: it seems that at this point in time this is the only way of doing it on a per-filesystem basis. But since (as I said earlier) send is send-only and recv is recv-only, ZFS will just happily complain if you recv the wrong source incremental; even though both the send and recv sides might have a common snapshot they could work from, they will never know, because they don't communicate with each other.
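For reference, the ghetto-lofiadm approach above boils down to something like the following (a rough, untested sketch; the host name, paths, pool name, and device names are all made up):

  # on the backup machine: share a directory over NFS
  share -F nfs -o rw /export/backing

  # on the master: create a backing file on the NFS share and turn it into
  # a block device with lofiadm (lofiadm prints the device it creates,
  # eg /dev/lofi/1)
  mkfile 40g /net/backuphost/export/backing/mirror.img
  lofiadm -a /net/backuphost/export/backing/mirror.img

  # attach the lofi device as an extra mirror of an existing pool device,
  # let it resilver (watch zpool status), then offline it again before the
  # NFS server goes away
  zpool attach tank c1t1d0 /dev/lofi/1
  zpool offline tank /dev/lofi/1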
Ideal setup: Ideally, this is what I want to have.

Put a couple of large HDs into a server and let ZFS administer the entire devices (so it can manage the write cache). Create a Mirror/RAIDx/hot-spare pool with as much as your budget allows (in my case, very little - probably a 2x160G or so mirror), and create file systems as needed. Put another large HD into another machine, and connect it via a dedicated network segment (which is a wise thing to do with any SAN stuff). Create a Mirror/RAIDx/hot-spare pool with whatever is left in your budget (in my case, I'll just take a 60G hard drive from another machine).

Now, from my perspective, I have no need or desire to let my 60G near-line the entire content of the 160G. And if I ended up lofiadm/iscsi'ing the 60GB and creating a pool with the 160G, I'd end up wasting 120GB of the 160s unless I slice off that 40GB "near-line mirror" and lose the write cache by doing so (correct me if I'm wrong here).

I don't need fail-over or anything. All I want is for what I consider important (ie, Documents, settings, etc.) off a portion of a pool to be replicated onto another machine, so that if a catastrophic failure happens where I lose both mirrors due to a PS frying, or ZFS bit-rots an important document during a save, I have access to a "recent" copy of the file on another machine.

There was a discussion on this forum recently (when I searched) that said that doing a ghetto-mirror is a lot "easier" than setting up a script-a-mirror, and it is. Since ZFS can see the content of all the devices in the pool, it can take whatever steps are necessary to get the mirror consistent and up to date. Now, the point was raised that doing a ghetto-mirror was not recommended because ZFS has no way to ensure that NFS/NIC/the target machine didn't corrupt the stream midway, which is why they recommended a ZFS-backed iSCSI/NFS share used as a ZFS device on another machine. For me, I just don't see the advantage in this. If NFS bit-flips something, then the backing ZFS store just wrote a bit-flipped stream of ZFS raw data, which won't help it one bit when it comes time to read it back; the ZFS mirror will just throw a checksum error and ignore that block. Doing the export-a-device seems no different from the results point of view (it'll still throw a checksum error), albeit with a lot more "potential" paths of error than having the device local, _if_ export-a-device can even actually be done.

So what's preventing ZFS from saying "the user wants to mirror xyz filesystem on another ZFS with a matching ZFS version"? Is that truly so different from a device mirror that it can't track changes made to that file system (not pool!) - be it a modification, snapshot creation, etc. - and send those changes over the wire, be it to another ZFS pool on the same machine or on another machine, every so often?

-- Starfox
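A concrete sketch of that layout might look like the following (the pool and device names here are hypothetical):

  # on the master: give ZFS whole disks so it can enable the write cache
  zpool create tank mirror c1t0d0 c1t1d0
  zfs create tank/documents
  zfs create tank/settings

  # on the backup machine: a small single-disk pool is enough to hold a
  # near-line copy of just the important filesystems
  zpool create backup c2t0d0
  zfs create backup/mirrorfs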
Starfox wrote:
> I don't need fail-over or anything. All I want is for what I consider
> important (ie, Documents, settings, etc.) off a portion of a pool to be
> replicated onto another machine, so that if a catastrophic failure happens
> where I lose both mirrors due to a PS frying, or ZFS bit-rots an important
> document during a save, I have access to a "recent" copy of the file on
> another machine.

filesync(1) was designed for this task.
 -- richard
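For anyone unfamiliar with it, a minimal filesync(1) invocation along those lines (the paths are made up, and the flags are from memory of the man page, so double-check them before relying on this) might be:

  # one-way sync of a documents subtree to an NFS-mounted backup area:
  # -s and -d name the source and destination base directories, -o src
  # forces one-way reconciliation from source to destination, and
  # "documents" is the subtree to keep in sync
  filesync -o src -s /tank -d /net/backuphost/backup documents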
Vic Engle wrote:
> Hi All,
>
> Just curious about how incremental send works. Is it changed blocks or
> files, and how are the changed blocks or files identified?

It's done at the DMU layer, based on blocks of objects. We use the block-pointer relationships (ie, the on-disk structure of files) to quickly find only the changed blocks.

--matt
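A rough way to see this behaviour from the command line (on a scratch dataset; the pool and file names here are made up) is to compare the size of a full stream with that of an incremental stream after a small change:

  # snapshot, change a small amount of data, snapshot again
  zfs snapshot tank/scratch@a
  dd if=/dev/urandom of=/tank/scratch/smallfile bs=128k count=8
  zfs snapshot tank/scratch@b

  # the full stream is roughly the size of the whole filesystem...
  zfs send tank/scratch@b | wc -c

  # ...while the incremental stream is roughly the size of the blocks that
  # changed between @a and @b
  zfs send -i tank/scratch@a tank/scratch@b | wc -c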
Starfox wrote:
> First time around, create a snapshot and send it to the remote:
>
>   zfs snapshot master/fs@mirror
>   zfs send master/fs@mirror | ssh mirror zfs recv backup/mirrorfs
>
> Once that's done, fs@mirror == mirrorfs, correct?

More accurately, master/fs@mirror == backup/mirrorfs@mirror

> So now I'm running the script manually and it's complaining that the
> incremental source doesn't match, and apparently there is no way to tell
> which fs@mirrorX is the source short of trial-and-error, even if I kept
> all my incremental snapshots. Is there a clean way around this?

I'd say either (a) have your script check to see if the send|recv was successful, or (b) have it check what snapshots are available at the other side, and start sending incrementals from there. Eg:

  # adjust the dataset name for whatever recv -d created on the far side
  lastsnap=`ssh remote zfs list -H -r -t snapshot -o name $fs | tail -1 | cut -d@ -f2`
  while :; do
      newsnap=...        # pick a new snapshot name (eg, a date stamp)
      zfs snapshot $fs@$newsnap
      zfs send -i $lastsnap $fs@$newsnap | ssh remote zfs recv -d pool/recvd || break
      lastsnap=$newsnap
  done

> Perhaps a better example scenario: I accidentally destroyed the last
> fs@mirrorX (so there is no source snapshot to give to -i). Would my only
> recourse be to bite the bullet and do a zfs send of fs@mirrorX+1 (aka a
> full send)?

No, if you have the old snaps on both sides, you can simply send an incremental from the last common snap (eg, zfs send -i mirrorX-1 fs@mirrorX+1).

> Another question: if I wanted to send a snapshot as a snapshot, I can do:
>
>   zfs snapshot master/fs@savesnap
>   zfs send master/fs@savesnap | ssh mirror zfs recv backup/mirrorfs@savesnap
>
> And now fs@savesnap == mirrorfs@savesnap, right? But that would involve
> sending the whole stream, which is however much data the filesystem was
> consuming at that point in time?

Yes. But this will create a new filesystem backup/mirrorfs on the receiving side. I'm not sure I understand your goal -- zfs send always sends a snapshot, whether incremental or full.

--matt
> Starfox wrote:
> > zfs snapshot master/fs@mirror
> > zfs send master/fs@mirror | ssh mirror zfs recv backup/mirrorfs
> >
> > Once that's done, fs@mirror == mirrorfs, correct?
>
> More accurately, master/fs@mirror == backup/mirrorfs@mirror

Okay, that makes a lot more sense now.

> I'd say either (a) have your script check to see if the send|recv was
> successful, or (b) have it check what snapshots are available at the
> other side, and start sending incrementals from there. Eg:
>
>   lastsnap=`ssh remote zfs list -H -r -t snapshot -o name $fs | tail -1 | cut -d@ -f2`
>   while :; do
>       newsnap=...        # pick a new snapshot name (eg, a date stamp)
>       zfs snapshot $fs@$newsnap
>       zfs send -i $lastsnap $fs@$newsnap | ssh remote zfs recv -d pool/recvd || break
>       lastsnap=$newsnap
>   done

None of the scripts that I looked at seemed to offer any sort of error recovery. I think I'll be able to use this as a starting point (and maybe the man pages could be updated to note that you can use any common snapshot as the send -i source; that fact is not obvious to those who are unfamiliar with the capabilities of ZFS).

> > Another question: if I wanted to send a snapshot as a snapshot, I can do:
> >
> >   zfs snapshot master/fs@savesnap
> >   zfs send master/fs@savesnap | ssh mirror zfs recv backup/mirrorfs@savesnap
>
> Yes. But this will create a new filesystem backup/mirrorfs on the
> receiving side. I'm not sure I understand your goal -- zfs send always
> sends a snapshot, whether incremental or full.

I guess I'll have to play around with it once I get the systems up and running. But my request to allow two ZFS boxes to sync "a" filesystem still stands.

-- Starfox
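Since the "any common snapshot works as the -i source" point isn't spelled out in the man page, a rough way for a script to discover the last common snapshot (untested; assumes date-stamped snapshot names so that zfs list order matches creation order, and reuses the dataset and host names from earlier in the thread) could be:

  common=""
  for s in `zfs list -H -r -t snapshot -o name master/fs | sed 's/.*@//'`; do
      # keep the latest local snapshot that also exists on the backup side
      if ssh mirror zfs list backup/mirrorfs@$s >/dev/null 2>&1; then
          common=$s
      fi
  done
  echo "last common snapshot: $common"
  # then: zfs send -i $common master/fs@<newsnap> | ssh mirror zfs recv backup/mirrorfs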
Starfox wrote:
> None of the scripts that I looked at seemed to offer any sort of error
> recovery. I think I'll be able to use this as a starting point (and maybe
> the man pages could be updated to note that you can use any common
> snapshot as the send -i source; that fact is not obvious to those who are
> unfamiliar with the capabilities of ZFS).

My experience of handling errors in zfs_backup is that, since the script duplicates the snapshots on the receiving file system, if it fails I simply run it again and it picks up where it left off. That said, it has never actually failed so far; I've interrupted it, or the system has crashed due to CR 6566921, but in both cases running the script again does just what it should, no more, no less.

--chris
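One way a wrapper can be made safe to re-run in that fashion (this is not Chris's actual script, just a sketch with made-up variable names, shown for a local destination pool as in his USB-drive setup) is to skip any snapshot the destination already has:

  # skip the send if the destination already has this snapshot
  if zfs list $destfs@$snap >/dev/null 2>&1; then
      echo "$snap already received, skipping"
  else
      zfs send -i $prev $srcfs@$snap | zfs recv -d $destpool
  fi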