Hello all,

I have quite a bit of data transferring between two machines via snapshot send and receive, and this has been working flawlessly. I now want to back up the data from the failover machine to tape. I was planning on using Bacula, as I have a bit of experience with it, and I am trying to figure out the quickest way of backing up around 100 GB of data to tape. All of these are iSCSI zvols and snapshots.

One option I have seen is zfs send zfs_snap@1 > /some_dir/some_file_name, and then backing that file up to tape. This seems easy, as I have already created a script that does just this, but I am worried that it is not the best or most secure way to do it. Does anyone have a better solution? I was also thinking about gzip'ing the file, but that would take an enormous amount of time...

Thanks all!
Greg
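For reference, a minimal sketch of the send-to-a-file approach being described; the pool, snapshot, and path names are placeholders rather than anything from this particular setup:

    # Placeholder names: tank/vol1 is an iSCSI zvol, @snap1 its snapshot,
    # /staging a filesystem with enough free space for the stream.
    # Dump the snapshot to a plain file, which Bacula can then write to tape:
    zfs send tank/vol1@snap1 > /staging/vol1_snap1.zfs

    # Restore is the reverse: feed the saved stream back into zfs receive.
    zfs receive tank/vol1_restored < /staging/vol1_snap1.zfs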
On Jan 25, 2010, at 18:28, Gregory Durham wrote:

> One option I have seen is zfs send zfs_snap@1 > /some_dir/some_file_name. Then I can back this up to tape. This seems easy as I have already created a script that does just this, but I am worried that this is not the best or most secure way to do this. Does anyone have a better solution?

We've been talking about this for the last week and a half. :)

http://mail.opensolaris.org/pipermail/zfs-discuss/2010-January/thread.html#35929
http://opensolaris.org/jive/thread.jspa?threadID=121797

(They're the same thread, just different interfaces.)

> I was thinking about then gzip'ing this but that would take an enormous amount of time...

If you have a decent amount of CPU, you can parallelize compression:

http://www.zlib.net/pigz/
http://blogs.sun.com/timc/entry/tamp_a_lightweight_multi_threaded

The LZO algorithm is supposed to be faster than gzip in many benchmarks, and it parallelizes well.
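As a rough sketch, compression can also be moved into the send pipeline so it overlaps with the transfer; the dataset and path names below are the same placeholders as before, and -p just picks a thread count:

    # Compress the stream with pigz on the way to the staging file (placeholder names):
    zfs send tank/vol1@snap1 | pigz -p 8 > /staging/vol1_snap1.zfs.gz

    # Restore path:
    pigz -d -c /staging/vol1_snap1.zfs.gz | zfs receive tank/vol1_restored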
Well I guess I am glad I am not the only one. Thanks for the heads up!
Hello All,

I read through the attached threads, found a solution posted there, and decided to try it. The solution was to use 3 files (in my case I made them sparse); I then created a raidz2 pool across these 3 files and started a zfs send | recv. The performance is horrible: about 5.62 MB/s. When I back up the other system to this failover system over a network connection, I get around 40 MB/s. Is it because I am backing up onto files rather than physical disks? Am I doing this all wrong? This pool is temporary, as it will be sent to tape, deleted, and recreated. Is it possible to zfs send to two destinations simultaneously? Or am I stuck? Any pointers would be great! I am using OpenSolaris snv_129, and the disks are SATA WD 1 TB 7200 RPM disks.

Thanks All!
Greg
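For anyone following along, the setup described above amounts to roughly the following; the sizes, paths, and dataset names are made up:

    # Three sparse backing files, a raidz2 pool on top, then a local send/recv
    # (placeholder names and sizes):
    mkfile -n 200g /staging/tape1 /staging/tape2 /staging/tape3
    zpool create tapepool raidz2 /staging/tape1 /staging/tape2 /staging/tape3
    zfs send tank/vol1@snap1 | zfs receive tapepool/vol1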
On Wed, Jan 27, 2010 at 12:01:36PM -0800, Gregory Durham wrote:

> Hello All,
> I read through the attached threads and found a solution by a poster and decided to try it.

That may have been mine - good to know it helped, or at least started to.

> The solution was to use 3 files (in my case I made them sparse)

Yep - writes to allocate space for them up front are pointless with CoW.

> I then created a raidz2 pool across these 3 files

Really? If you want one tape's worth of space, written to 3 tapes, you might as well just write the same file to three tapes, I think. (I'm assuming here the files are the size you expect to write to a single tape - otherwise I'm even more confused about this bit.)

Perhaps it's easier to let zfs cope with repairing small media errors here and there, but the main idea of using a redundant pool of files was to cope with loss or damage to whole tapes, for a backup that already needed to span multiple tapes. If you want this three-way copy of a single tape, plus easy recovery from bad spots by reading back multiple tapes, then use a 3-way mirror. But consider the error-recovery mode of whatever you're using to write to tape - some skip to the next file on a read error.

I expect similar ratios of data to parity files/tapes as would be used in typical disk setups, at least for "wide stripes" - say raidz2 in sets of 10, 8+2, or so. (As an aside, I like this for disks, too, since striping 128k blocks to a power-of-two-wide data stripe has to be more efficient.)

> and started a zfs send | recv. The performance is horrible

There can be several reasons for this, and we'd need to know more about your setup.

The first critical thing is going to be the setup of the staging filesystem that holds your pool files. If this is itself a raidz, perhaps you're iops-limited - you're expecting 3 disk-files' worth of concurrency from a pool that may not have it, though it should be a write-mostly workload and so less sensitive. You'll be seeking a lot either way, though.

If this is purely staging to tape, consider making the staging pool out of non-redundant single-disk vdevs. Alternately, if the staging pool is safe, there's another trick you might consider: create the pool, then offline 2 files while you recv, leaving the pool-of-files degraded. Then when you're done, you can let the pool resilver and fill in the redundancy. This might change the IO pattern enough to take less time overall, or at least allow you some flexibility with windows to schedule backups and tapes.

Next is dedup - make sure you have the memory and l2arc capacity to dedup the incoming write stream. Dedup within the pool of files if you want and can (because this will dedup your tapes), but don't dedup under it as well. I've found this to produce completely pathological disk thrashing in a related configuration (pool on lofi crypto file). Stacking dedup like this doubles the performance cliff under memory pressure we've been talking about recently. (If you really do want 3-way-mirror files, then by all means dedup them in the staging pool.)

Related to this is arc usage - I haven't investigated this carefully myself, but you may well be double-caching: the backup pool's data, as well as the staging pool's view of the files. Again, since it's a write-mostly workload zfs should hopefully figure out that few blocks are being re-read, but you might experiment with primarycache=metadata for the staging pool holding the files. Perhaps zpool-on-files is smart enough to use direct I/O, bypassing the cache anyway; I'm not sure.
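A sketch of the degraded-receive trick and the primarycache suggestion above, reusing the placeholder names from earlier; "staging" here stands in for whatever dataset actually holds the backing files:

    # Offline two of the three backing files, receive, then let the pool
    # resilver to fill the parity back in (placeholder names):
    zpool offline tapepool /staging/tape2
    zpool offline tapepool /staging/tape3
    zfs send tank/vol1@snap1 | zfs receive tapepool/vol1
    zpool online tapepool /staging/tape2
    zpool online tapepool /staging/tape3

    # Avoid double-caching the backing files' data in the ARC:
    zfs set primarycache=metadata staging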
How's your cpu usage? Check that you're not trying to double-compress the files (again, compress within the backup pool but not outside it), and consider using a lightweight checksum rather than sha256 outside.

Then there's streaming and concurrency - try piping through buffer and using bigger socket and tcp buffers. TCP stalls and slow-start will amplify latency many-fold.

A good zil device on the staging pool might also help; the backup pool will be doing sync writes to close its txgs, though probably not too many others. I haven't experimented here, either.

> This pool is temporary as it will be sent to tape, deleted and recreated.

I tend not to do that, since I can incrementally update the pool contents before rewriting tapes. This helps hide the performance issues dramatically, since much less data is transferred and written to the files after the first time.

> Is it possible to zfs send to two destinations simultaneously?

Yes, though it's less convenient than using -R on the top of the pool, since you have to solve any dependencies (including clone renames) yourself. Whether this helps or hurts depends on your bottleneck: it will help with network and buffering issues, but hurt (badly) if you're limited by thrashing seeks (at the writer, since you already know the reader can sustain higher rates).

> Or am I stuck. Any pointers would be great!

Never. Always! :-)

-- 
Dan.
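Roughly what the incremental-refresh and buffering ideas above look like, keeping the earlier placeholder names; the snapshot and host names are made up, and mbuffer is just one example of a userland buffer (the buffer utility mentioned above works similarly):

    # Keep tapepool around and only send the changes since the last run
    # (placeholder snapshot names):
    zfs send -i tank/vol1@snap1 tank/vol1@snap2 | zfs receive tapepool/vol1

    # If the stream crosses the network, a userland buffer smooths out TCP
    # stalls; mbuffer's -s sets the block size and -m the buffer memory:
    zfs send -i tank/vol1@snap1 tank/vol1@snap2 | \
        mbuffer -s 128k -m 512M | ssh backuphost "zfs receive tapepool/vol1"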
Yep Dan,

Thank you very much for the idea, and for helping me with my implementation issues, haha. I can see that raidz2 is not needed in this case. My question now is about full system recovery. Say all hell breaks loose and all is lost except the tapes. If I use what you said and just add snapshots to an already-standing zfs filesystem, I guess I can do full backups to tape as well as partial backups - what is the best way to accomplish this if the data is all sitting in a file? Note I will be using bacula (hopefully) unless something better is recommended. And finally, should I tar this file prior to sending it to tape, or is that not needed in this case? Just a note: all of this data fits on the tapes currently, but what if it doesn't in the future?

Thanks, and sorry for all of the questions...
Greg