Hello all,

I have quite a bit of data transferring between two machines via snapshot send and receive, and this has been working flawlessly. I now want to back up the data from the failover machine to tape. I was planning on using Bacula, as I have a bit of experience with it, and I am trying to figure out the quickest way of backing up around 100 GB of data to tape. All of these are iSCSI zvols and snapshots.

One option I have seen is zfs send zfs_snap@1 > /some_dir/some_file_name, and then backing that file up to tape. This seems easy, as I have already created a script that does just this, but I am worried that it is not the best or most secure way to do it. Does anyone have a better solution? I was also thinking about gzip'ing the file, but that would take an enormous amount of time...

Thanks all!
Greg
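For reference, a minimal sketch of the send-to-a-file approach being described; the pool, snapshot, and path names are placeholders rather than anything from this particular setup:

    # Placeholder names: tank/vol1 is an iSCSI zvol, @snap1 its snapshot,
    # /staging a filesystem with enough free space for the stream.
    # Dump the snapshot to a plain file, which Bacula can then write to tape:
    zfs send tank/vol1@snap1 > /staging/vol1_snap1.zfs

    # Restore is the reverse: feed the saved stream back into zfs receive.
    zfs receive tank/vol1_restored < /staging/vol1_snap1.zfs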
On Jan 25, 2010, at 18:28, Gregory Durham wrote:

> One option I have seen is zfs send zfs_snap@1 > /some_dir/some_file_name. Then I can back this up to tape. This seems easy as I have already created a script that does just this, but I am worried that this is not the best or most secure way to do this. Does anyone have a better solution?

We've been talking about this for the last week and a half. :)

http://mail.opensolaris.org/pipermail/zfs-discuss/2010-January/thread.html#35929
http://opensolaris.org/jive/thread.jspa?threadID=121797

(They're the same thread, just different interfaces.)

> I was thinking about then gzip'ing this but that would take an enormous amount of time...

If you have a decent amount of CPU, you can parallelize compression:

http://www.zlib.net/pigz/
http://blogs.sun.com/timc/entry/tamp_a_lightweight_multi_threaded

The LZO algorithm is supposed to be faster than gzip in many benchmarks, and it parallelizes well.
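As a rough sketch, compression can also be moved into the send pipeline so it overlaps with the transfer; the dataset and path names below are the same placeholders as before, and -p just picks a thread count:

    # Compress the stream with pigz on the way to the staging file (placeholder names):
    zfs send tank/vol1@snap1 | pigz -p 8 > /staging/vol1_snap1.zfs.gz

    # Restore path:
    pigz -d -c /staging/vol1_snap1.zfs.gz | zfs receive tank/vol1_restored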
Well I guess I am glad I am not the only one. Thanks for the heads up!
Hello All,

I read through the attached threads, found a solution posted there, and decided to try it. The solution was to use 3 files (in my case I made them sparse); I then created a raidz2 pool across these 3 files and started a zfs send | recv. The performance is horrible: about 5.62 MB/s. When I back up the other system to this failover system over a network connection, I get around 40 MB/s. Is it because I am backing up onto files rather than physical disks? Am I doing this all wrong? This pool is temporary, as it will be sent to tape, deleted, and recreated. Is it possible to zfs send to two destinations simultaneously? Or am I stuck? Any pointers would be great! I am using OpenSolaris snv_129, and the disks are SATA WD 1 TB 7200 RPM disks.

Thanks All!
Greg
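For anyone following along, the setup described above amounts to roughly the following; the sizes, paths, and dataset names are made up:

    # Three sparse backing files, a raidz2 pool on top, then a local send/recv
    # (placeholder names and sizes):
    mkfile -n 200g /staging/tape1 /staging/tape2 /staging/tape3
    zpool create tapepool raidz2 /staging/tape1 /staging/tape2 /staging/tape3
    zfs send tank/vol1@snap1 | zfs receive tapepool/vol1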
On Wed, Jan 27, 2010 at 12:01:36PM -0800, Gregory Durham wrote:

> Hello All,
> I read through the attached threads and found a solution by a poster and decided to try it.

That may have been mine - good to know it helped, or at least started to.

> The solution was to use 3 files (in my case I made them sparse)

Yep - writes to allocate space for them up front are pointless with CoW.

> I then created a raidz2 pool across these 3 files

Really? If you want one tape's worth of space, written to 3 tapes, you might as well just write the same file to three tapes, I think. (I'm assuming here the files are the size you expect to write to a single tape - otherwise I'm even more confused about this bit.)

Perhaps it's easier to let zfs cope with repairing small media errors here and there, but the main idea of using a redundant pool of files was to cope with loss or damage to whole tapes, for a backup that already needed to span multiple tapes. If you want this three-way copy of a single tape, plus easy recovery from bad spots by reading back multiple tapes, then use a 3-way mirror. But consider the error-recovery mode of whatever you're using to write to tape - some skip to the next file on a read error.

I expect similar ratios of data to parity files/tapes as would be used in typical disk setups, at least for "wide stripes" - say raidz2 in sets of 10, 8+2, or so. (As an aside, I like this for disks, too, since striping 128k blocks to a power-of-two-wide data stripe has to be more efficient.)

> and started a zfs send | recv. The performance is horrible

There can be several reasons for this, and we'd need to know more about your setup.

The first critical thing is going to be the setup of the staging filesystem that holds your pool files. If this is itself a raidz, perhaps you're iops-limited - you're expecting 3 disk-files' worth of concurrency from a pool that may not have it, though it should be a write-mostly workload and so less sensitive. You'll be seeking a lot either way, though.

If this is purely staging to tape, consider making the staging pool out of non-redundant single-disk vdevs. Alternately, if the staging pool is safe, there's another trick you might consider: create the pool, then offline 2 files while you recv, leaving the pool-of-files degraded. Then when you're done, you can let the pool resilver and fill in the redundancy. This might change the IO pattern enough to take less time overall, or at least allow you some flexibility with windows to schedule backups and tapes.

Next is dedup - make sure you have the memory and l2arc capacity to dedup the incoming write stream. Dedup within the pool of files if you want and can (because this will dedup your tapes), but don't dedup under it as well. I've found this to produce completely pathological disk thrashing in a related configuration (pool on lofi crypto file). Stacking dedup like this doubles the performance cliff under memory pressure we've been talking about recently. (If you really do want 3-way-mirror files, then by all means dedup them in the staging pool.)

Related to this is arc usage - I haven't investigated this carefully myself, but you may well be double-caching: the backup pool's data, as well as the staging pool's view of the files. Again, since it's a write-mostly workload zfs should hopefully figure out that few blocks are being re-read, but you might experiment with primarycache=metadata for the staging pool holding the files. Perhaps zpool-on-files is smart enough to use direct I/O, bypassing the cache anyway; I'm not sure.
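A sketch of the degraded-receive trick and the primarycache suggestion above, reusing the placeholder names from earlier; "staging" here stands in for whatever dataset actually holds the backing files:

    # Offline two of the three backing files, receive, then let the pool
    # resilver to fill the parity back in (placeholder names):
    zpool offline tapepool /staging/tape2
    zpool offline tapepool /staging/tape3
    zfs send tank/vol1@snap1 | zfs receive tapepool/vol1
    zpool online tapepool /staging/tape2
    zpool online tapepool /staging/tape3

    # Avoid double-caching the backing files' data in the ARC:
    zfs set primarycache=metadata staging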
How's your cpu usage? Check that you're not trying to double-compress the files (again, compress within the backup pool but not outside it), and consider using a lightweight checksum rather than sha256 outside.

Then there's streaming and concurrency - try piping through buffer and using bigger socket and tcp buffers. TCP stalls and slow-start will amplify latency many-fold.

A good zil device on the staging pool might also help; the backup pool will be doing sync writes to close its txgs, though probably not too many others. I haven't experimented here, either.

> This pool is temporary as it will be sent to tape, deleted and recreated.

I tend not to do that, since I can incrementally update the pool contents before rewriting tapes. This helps hide the performance issues dramatically, since much less data is transferred and written to the files after the first time.

> Is it possible to zfs send to two destinations simultaneously?

Yes, though it's less convenient than using -R on the top of the pool, since you have to solve any dependencies (including clone renames) yourself. Whether this helps or hurts depends on your bottleneck: it will help with network and buffering issues, but hurt (badly) if you're limited by thrashing seeks (at the writer, since you already know the reader can sustain higher rates).

> Or am I stuck. Any pointers would be great!

Never. Always! :-)

-- 
Dan.
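Roughly what the incremental-refresh and buffering ideas above look like, keeping the earlier placeholder names; the snapshot and host names are made up, and mbuffer is just one example of a userland buffer (the buffer utility mentioned above works similarly):

    # Keep tapepool around and only send the changes since the last run
    # (placeholder snapshot names):
    zfs send -i tank/vol1@snap1 tank/vol1@snap2 | zfs receive tapepool/vol1

    # If the stream crosses the network, a userland buffer smooths out TCP
    # stalls; mbuffer's -s sets the block size and -m the buffer memory:
    zfs send -i tank/vol1@snap1 tank/vol1@snap2 | \
        mbuffer -s 128k -m 512M | ssh backuphost "zfs receive tapepool/vol1"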
Yep Dan,

Thank you very much for the idea, and for helping me with my implementation issues, haha. I can see that raidz2 is not needed in this case. My question now is about full system recovery. Say all hell breaks loose and all is lost except the tapes. If I use what you said and just add snapshots to an already-standing zfs filesystem, I guess I can do full backups to tape as well as partial backups - what is the best way to accomplish this if the data is all sitting in a file? Note I will be using bacula (hopefully) unless something better is recommended. And finally, should I tar this file prior to sending it to tape, or is that not needed in this case? Just a note: all of this data fits on the tapes currently, but what if it doesn't in the future?

Thanks, and sorry for all of the questions...
Greg