Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that I've
handled this.  Previously near-line (and offline) backup was focused on
being able to handle disasters (e.g. a RAID adapter going nuts and
scribbling on the entire contents of the array), a double-disk (or worse)
failure, or the obvious (e.g. fire), along with the "aw crap, I just
rm -rf'd something I'd rather not!"

ZFS makes snapshots very cheap, which means you can resolve the "aw crap"
situation without resorting to backups at all.  This turns the backup
problem into a disaster-recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and the incremental, I get back to where I was
when the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

    zfs snapshot pool/some-filesystem@unique-label
    zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
    zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.

Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot intervals and retention are easily
adjusted and cover the "oops" situations without having to resort to the
backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second is merely taken to ensure that there is
always at least one that has been written to completion without error to
apply on top of the base.  That in turn makes the backup storage
requirement depend only on entropy in the filesystem and not on time
(where the "tower of Hanoi" style dump hierarchy imposed both a time AND
an entropy cost on backup media.)

Am I missing something here?

(Yes, I know, I've been a ZFS resister.... ;-))

-- 
Karl Denninger
/The Market Ticker/ <http://market-ticker.org>
Cuda Systems LLC
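A minimal sketch of the cycle described above, assuming the stream files
are written to a mounted backup volume at /offline (an illustrative path)
and reusing the dataset names from the example:

    # one-time: take the base snapshot and save the full stream offline
    zfs snapshot pool/some-filesystem@base
    zfs send pool/some-filesystem@base > /offline/some-filesystem.base.zfs

    # each interval: incremental from @base, then drop the scratch snapshot
    LABEL=$(date +%Y%m%d-%H%M)
    zfs snapshot pool/some-filesystem@"$LABEL"
    zfs send -i pool/some-filesystem@base pool/some-filesystem@"$LABEL" \
        > /offline/some-filesystem."$LABEL".zfs
    zfs destroy pool/some-filesystem@"$LABEL"

Restoring would then be a zfs recv of the base stream followed by the most
recent incremental.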
On Fri, 01 Mar 2013 15:24:53 +0100, Karl Denninger <karl at denninger.net> wrote:

> Dabbling with ZFS now, and giving some thought to how to handle backup
> strategies.
[...]
> Take a base snapshot immediately and zfs send it to offline storage.
> Take an incremental at some interval (appropriate for disaster recovery)
> and zfs send THAT to stable storage.
[...]
>     zfs snapshot pool/some-filesystem@unique-label
>     zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
>     zfs destroy pool/some-filesystem@unique-label
>
> and that seems to work (and restore) just fine.
[...]
> (Yes, I know, I've been a ZFS resister.... ;-))

I do the same.  I only use zfs send -I (capital i) so I have all the
snapshots on the backup also.  That way the data survives an oops (rm -r)
and a fire at the same time. :-)

Ronald.
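For what it's worth, a rough sketch of the -I variant Ronald describes,
with illustrative snapshot labels and output path:

    # -I sends every intermediate snapshot between @base and the new one,
    # so the routine "oops" snapshots end up on the backup as well
    zfs snapshot pool/some-filesystem@2013-03-01
    zfs send -I pool/some-filesystem@base pool/some-filesystem@2013-03-01 \
        > /offline/some-filesystem.base-to-2013-03-01.zfs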
On 03/01/2013 8:24 am, Karl Denninger wrote:

> Dabbling with ZFS now, and giving some thought to how to handle backup
> strategies.
[...]
> Am I missing something here?
>
> (Yes, I know, I've been a ZFS resister.... ;-))

I briefly did something like this between two FreeNAS boxes and it seemed
to work well, but my secondary box wasn't quite up to par hardware-wise.
Combine that with the lack of the internet bandwidth needed to reach a
second physical location in case of something really disastrous, like a
tornado or fire destroying my house, and I ended up just using an eSATA
drive dock and Bacula, with a few external drives rotated regularly into
my office at work, rather than upgrading the secondary box.

If you have a secondary box that is adequate, and either offsite backups
aren't a concern or you have a big enough pipe to a secondary location
that houses the backup, this should work.

I would recommend testing your incremental snapshot rotation; I never
tested a restore from anything but the most recent set of data when I was
running my setup.  I did, however, save a week's worth of hourly snapshots
on a couple of the more rapidly changing data sets.

-- 
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/
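A restore test of the kind Dean recommends might look roughly like this,
assuming the streams from the earlier sketch were saved to files and a
scratch pool (here called "scratch", with default mountpoints) is
available to receive into:

    # replay the base stream into a throwaway dataset
    zfs recv scratch/restore-test < /offline/some-filesystem.base.zfs
    # apply the most recent incremental on top of it (-F rolls back any
    # incidental changes, e.g. atime updates, since the base snapshot)
    zfs recv -F scratch/restore-test < /offline/some-filesystem.20130301-0400.zfs
    # spot-check the result against the live filesystem
    diff -r /scratch/restore-test /pool/some-filesystem | head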
Quoth Karl Denninger <karl at denninger.net>:

> Dabbling with ZFS now, and giving some thought to how to handle backup
> strategies.
[...]
> Take a base snapshot immediately and zfs send it to offline storage.
> Take an incremental at some interval (appropriate for disaster recovery)
> and zfs send THAT to stable storage.
>
> If I then restore the base and the incremental, I get back to where I was
> when the latest snapshot was taken.  I don't need to keep the incremental
> snapshot for longer than it takes to zfs send it, so I can do:
>
>     zfs snapshot pool/some-filesystem@unique-label
>     zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
>     zfs destroy pool/some-filesystem@unique-label
>
> and that seems to work (and restore) just fine.

For backup purposes it's worth using the -R and -I options to zfs send
rather than -i.  This will preserve the other snapshots, which can be
important.

> Am I looking at this the right way here?  Provided that the base backup
> and incremental are both readable, it appears that I have the disaster
> case covered, and the online snapshot intervals and retention are easily
> adjusted and cover the "oops" situations without having to resort to the
> backups at all.
>
> This in turn means that keeping more than two incremental dumps offline
> has little or no value; the second is merely taken to ensure that there is
> always at least one that has been written to completion without error to
> apply on top of the base.  That in turn makes the backup storage
> requirement depend only on entropy in the filesystem and not on time
> (where the "tower of Hanoi" style dump hierarchy imposed both a time AND
> an entropy cost on backup media.)

No, that's not true.  Since you keep taking successive increments from a
fixed base, the size of those increments will increase over time (each
increment will include all net filesystem activity since the base
snapshot).  In UFS terms, it's equivalent to always taking level 1 dumps.
Unlike with UFS, the @base snapshot will also start using increasing
amounts of space in the source zpool.

I don't know what medium you're backing up to (does anyone use tape any
more?) but when backing up to disk I much prefer to keep the backup in the
form of a filesystem rather than as 'zfs send' streams.  One reason for
this is that I believe that new versions of the ZFS code are more likely
to be able to correctly read old versions of the filesystem than old
versions of the stream format; this may not be correct any more, though.
Another reason is that it means I can do 'rolling snapshot' backups.

I do an initial dump like this

    # zpool is my working pool
    # bakpool is a second pool I am backing up to
    zfs snapshot -r zpool/fs@dump
    zfs send -R zpool/fs@dump | zfs recv -vFd bakpool

That pipe can obviously go through ssh or whatever to put the backup on a
different machine.  Then to make an increment I roll forward the snapshot
like this

    zfs rename -r zpool/fs@dump dump-old
    zfs snapshot -r zpool/fs@dump
    zfs send -R -I @dump-old zpool/fs@dump | zfs recv -vFd bakpool
    zfs destroy -r zpool/fs@dump-old
    zfs destroy -r bakpool/fs@dump-old

(Notice that the increment starts at a snapshot called @dump-old on the
send side but at a snapshot called @dump on the recv side.  ZFS can handle
this perfectly well, since it identifies snapshots by UUID, and will
rename the bakpool snapshot as part of the recv.)
This brings the filesystem on bakpool up to date with the filesystem on
zpool, including all snapshots, but never creates an increment with more
than one backup interval's worth of data in it.

If you want to keep more history on the backup pool than the source pool,
you can hold off on destroying the old snapshots, and instead rename them
to something unique.  (Of course, you could always give them unique names
to start with, but I find it more convenient not to.)

Ben
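A small sketch of that "keep more history" variation, reusing Ben's
zpool/fs and bakpool names (the dated label is made up):

    # after the incremental send, keep the old snapshot on the backup pool
    # under a dated name instead of destroying it
    DATE=$(date +%Y-%m-%d)
    zfs rename -r bakpool/fs@dump-old bakpool/fs@kept-"$DATE"
    # the source-side copy can still be destroyed as before
    zfs destroy -r zpool/fs@dump-old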
01.03.2013 16:24, Karl Denninger:

> Dabbling with ZFS now, and giving some thought to how to handle backup
> strategies.
[...]
>     zfs snapshot pool/some-filesystem@unique-label
>     zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
>     zfs destroy pool/some-filesystem@unique-label
>
> and that seems to work (and restore) just fine.

Yes, I'm working with backups the same way; I wrote a simple script that
synchronizes two filesystems between distant servers.  I also use the same
script to synchronize bushy filesystems (with hundreds of thousands of
files) where rsync produces too big a load for synchronizing.

https://github.com/kworr/zfSnap/commit/08d8b499dbc2527a652cddbc601c7ee8c0c23301

I left it where it was, but I was also planning to write a purger for
snapshots that would automatically destroy snapshots when the pool gets
low on space.  Never hit that yet.

> This in turn means that keeping more than two incremental dumps offline
> has little or no value [...]

Well, snapshots can have value over a longer timeframe depending on the
data.  Being able to restore a file accidentally deleted two months ago
already saved $2k for one of our customers.

-- 
Sphinx of black quartz, judge my vow.
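The purger he mentions (but never wrote) could look something like this
rough sketch; the pool name and the 80% threshold are made up, and zfSnap
itself may well approach this differently:

    #!/bin/sh
    # destroy the oldest snapshots on $POOL while it is more than $LIMIT% full
    POOL=pool
    LIMIT=80
    while [ "$(zpool list -H -o capacity "$POOL" | tr -d '%')" -gt "$LIMIT" ]; do
        OLDEST=$(zfs list -H -t snapshot -o name -s creation -r "$POOL" | head -n 1)
        [ -z "$OLDEST" ] && break
        echo "destroying $OLDEST"
        zfs destroy "$OLDEST"
    done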
On 03/01/2013 1:25 pm, kpneal at pobox.com wrote:

> On Fri, Mar 01, 2013 at 09:45:32AM -0600, Karl Denninger wrote:
>> I rotate the disaster disks out to a safe-deposit box at the bank, and
>> they're geli-encrypted, so if stolen they're worthless to the thief
>> (other than their cash value as a drive) and if the building goes "poof"
>> I have the ones in the vault to recover from.  There's the potential for
>> loss up to the rotation time of course but that is the same risk I had
>> with all UFS filesystems.
>
> What do you do about geli keys?  Encrypted backups aren't much use if
> you can't unencrypt them.

In my case I set them up with a pass-phrase only; I can mount them on any
FreeBSD system using geli attach ... and then enter the pass-phrase when
prompted.  It is less secure than the key method (just because the
pass-phrase is far shorter than a key would be), but it ensures that as
long as I can remember the pass-phrase I can access the data.

However, my backups in this method are personal data; the worst-case
scenario is someone steals my identity, personal photos, and iTunes
library.  My bank accounts don't have enough money in them to make it
worth someone going through the time and effort to get the data off the
disks.  The pass-phrase I picked uses all the good practices of mixed
case, special characters, and it's not something easy to guess even by
people who know me well.  It would be far easier to break into my house
and get the data that way than to break the encryption on the external
backup media.

If I were backing up corporate data with this method and my company did
defense research, well, I would probably use both a pass-phrase and key
combination and store an offsite copy of the key in a separate secure
location from the media.

-- 
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/
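For reference, a pass-phrase-only setup along the lines Dean describes
looks roughly like this (device name and sector size are illustrative):

    # initialize the provider; geli prompts for the pass-phrase
    geli init -s 4096 /dev/da0
    # on any FreeBSD box with GELI, attach by entering the pass-phrase
    geli attach /dev/da0
    # the decrypted provider then appears as /dev/da0.eli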
On Fri, 1 Mar 2013, kpneal at pobox.com wrote:

> On Fri, Mar 01, 2013 at 12:23:31PM -0500, Daniel Eischen wrote:
>> Yes, we still use a couple of DLT autoloaders and have nightly
>> incrementals and weekly fulls.  This is the problem I have with
>> converting to ZFS.  Our typical recovery is when a user says
>> they need a directory or set of files from a week or two ago.
>> Using dump from tape, I can easily extract *just* the necessary
>> files.  I don't need a second system to restore to, so that
>> I can then extract the file.
>>
>> dump (and ufsdump for our Solaris boxes) _just work_, and we
>> can go back many many years and they will still work.  If we
>> convert to ZFS, I'm guessing we'll have to do nightly
>> incrementals with 'tar' instead of 'dump' as well as doing
>> ZFS snapshots for fulls.
>
> What about extended attributes?  ACLs?  Are those saved by tar?

I think tar (as root or -p) will attempt to preserve those.

-- 
DE
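A minimal illustration of the -p behaviour Daniel mentions; whether ACLs
and extended attributes actually round-trip depends on the tar
implementation and archive format in use, so this is worth testing:

    # create the archive as root
    tar -cf /backup/home.tar -C / home
    # extract with -p (or as root) so tar tries to restore permissions,
    # ACLs and extended attributes where the format carries them
    tar -xpf /backup/home.tar -C /restore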
On 3/1/2013 1:25 PM, kpneal at pobox.com wrote:

> On Fri, Mar 01, 2013 at 09:45:32AM -0600, Karl Denninger wrote:
>> I rotate the disaster disks out to a safe-deposit box at the bank, and
>> they're geli-encrypted, so if stolen they're worthless to the thief
>> (other than their cash value as a drive) and if the building goes "poof"
>> I have the ones in the vault to recover from.  There's the potential for
>> loss up to the rotation time of course but that is the same risk I had
>> with all UFS filesystems.
>
> What do you do about geli keys?  Encrypted backups aren't much use if
> you can't unencrypt them.

I keep them in my head.  Even my immediate family could not guess it; one
of the things I mastered many years ago was "algorithmic" and very long
passwords that are easy to remember but impossible for someone to guess
other than by brute force, and if long enough that becomes prohibitive for
the guesser.

If I needed even better I'd keep the (random part of the) composite key on
an external thing (e.g. a thumb drive) that is only stuffed in the box to
boot and attach the drives, then removed and stored separately under
separate and high security.  There is no point to using a composite key IF
THE RANDOM PART CAN BE STOLEN; you are then back to the security of the
typed password (if any), so if you want the better level of security you
need to deal with the physical security of the random portion and make
sure it is NEVER on an unencrypted part of the disk itself.  If you're not
going to do that then a strong and long password is just as good.

I can mount my backup volumes on any FreeBSD machine that has the geli
framework.

-- 
Karl Denninger
/The Market Ticker/ <http://market-ticker.org>
Cuda Systems LLC
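A sketch of the composite-key arrangement Karl outlines, with made-up
device and keyfile paths (the keyfile living only on the removable thumb
drive):

    # generate the random component on the thumb drive
    dd if=/dev/random of=/mnt/thumb/backup0.key bs=64 count=1
    # initialize with both the keyfile (-K) and a pass-phrase (prompted)
    geli init -s 4096 -K /mnt/thumb/backup0.key /dev/da1
    # attaching needs the keyfile (-k) plus the pass-phrase; afterwards the
    # thumb drive can be removed and stored elsewhere
    geli attach -k /mnt/thumb/backup0.key /dev/da1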