Hello everyone,

I am trying to take ZFS snapshots (i.e. zfs send) and burn them to DVDs for offsite storage. In many cases, the snapshots greatly exceed the 8GB I can fit onto a single DVD-DL.

To make this work, I have used the "split" utility to break the images into smaller, fixed-size chunks that will fit onto a DVD. For example:

#split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.

This gives me a set of files like this:

7.9G mypictures.zfssnap.split.aa
7.9G mypictures.zfssnap.split.ab
7.9G mypictures.zfssnap.split.ac
7.9G mypictures.zfssnap.split.ad
7.9G mypictures.zfssnap.split.ae
7.9G mypictures.zfssnap.split.af
6.1G mypictures.zfssnap.split.ag

I use the following command to convert them back into a single file:

#cat mypictures.zfssnap.split.a[a-g] > testjoin

But when I compare the checksum of the original snapshot to that of the rejoined snapshot, I get a different result:

#cksum mypictures.zfssnap
308335278 57499302592 mypictures.zfssnap
#cksum testjoin
278036498 57499302592 testjoin

And when I try to restore the filesystem, I get the following failure:

#zfs recv pool_01/test < ./testjoin
cannot receive new filesystem stream: invalid stream (checksum mismatch)

which makes sense given the different checksums reported by the cksum command above.

The question I have is, what can I do? My guess is that the "split" and "cat" commands are introducing some kind of ascii/binary conversion issue into the restored file, but I'm at a loss as to exactly what is happening and how to get around it.

If anyone out there has a solution to my problem, or a better suggestion on how to accomplish the original goal, please let me know. Thanks to all in advance for any help you may be able to offer.

-Michael
On Wed, Feb 4, 2009 at 6:19 PM, Michael McKnight <michael_mcknight01 at yahoo.com> wrote:
> #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
> But when I compare the checksum of the original snapshot to that of the rejoined snapshot, I get a different result:
>
> #cksum mypictures.zfssnap
> 308335278 57499302592 mypictures.zfssnap
> #cksum testjoin
> 278036498 57499302592 testjoin
>
> The question I have is, what can I do? My guess is that the "split" and "cat" commands are introducing some kind of ascii/binary conversion issue into the restored file, but I'm at a loss as to exactly what is happening and how to get around it.

Here's a workaround: try using something that can split and merge reliably. 7z should work, and it's already included on OpenSolaris. Something like

7z a -v8100m -ttar mypictures.zfssnap.7z mypictures.zfssnap

should be very fast (using no compression), while

7z a -v8100m mypictures.zfssnap.7z mypictures.zfssnap

should save a lot of space using the default 7z compression algorithm.

Regards,
Fajar
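For the restore side, a minimal, untested sketch -- it assumes the -ttar invocation above and 7z's usual .001/.002 volume naming:

# copy all the mypictures.zfssnap.7z.0* volumes back from the DVDs into one directory
7z x mypictures.zfssnap.7z.001
# this should recreate mypictures.zfssnap, which can then be checked and received
cksum mypictures.zfssnap
zfs recv pool_01/test < ./mypictures.zfssnap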
On 4-Feb-09, at 6:19 AM, Michael McKnight wrote:
> Hello everyone,
>
> I am trying to take ZFS snapshots (ie. zfs send) and burn them to
> DVDs for offsite storage. In many cases, the snapshots greatly
> exceed the 8GB I can stuff onto a single DVD-DL.
>
> In order to make this work, I have used the "split" utility ...
> I use the following command to convert them back into a single file:
> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>
> But when I compare the checksum of the original snapshot to that of
> the rejoined snapshot, I get a different result:

Tested your RAM lately?

--Toby
>>>>> "mm" == Michael McKnight <michael_mcknight01 at yahoo.com> writes:mm> #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split. mm> #cat mypictures.zfssnap.split.a[a-g] > testjoin mm> But when I compare the checksum of the original snapshot to mm> that of the rejoined snapshot, I get a different result: sounds fine. I''m not sure why it''s failing. mm> And when I try to restore the filesystem, I get the following mm> failure: #zfs recv pool_01/test < ./testjoin cannot receive mm> new filesystem stream: invalid stream (checksum mismatch) however, aside from this problem you''re immediately having, I think you should never archive the output of ''zfs send''. I think the current warning on the wiki is not sufficiently drastic, but when I asked for an account to update the wiki I got no answer. Here are the problems, again, with archiving ''zfs send'' output: * no way to test the stream''s integrity without receiving it. (meaning, to test a stream, you need enough space to store the stream being tested, plus that much space again. not practical.) A test could possibly be hacked up, but because the whole ZFS software stack is involved in receiving, and is full of assertions itself, any test short of actual extraction wouldn''t be a thorough test, so this is unlikely to change soon. * stream format is not guaranteed to be forward compatible with new kernels. and versioning may be pickier than zfs/zpool versions. * stream is expanded _by the kernel_, so even if tar had a forward-compatibility problem, which it won''t, you could hypothetically work around it by getting an old ''tar''. For ''zfs send'' streams you have to get an entire old kernel, and boot it on modern hardware, to get at your old stream. * supposed to be endian-independent, but isn''t. * stream is ``protected'''' from corruption in the following way: if a single bit is flipped anywhere in the stream, the entire stream and all incrementals descended from it become worthless. It is EXTREMELY corruption-sensitive. ''tar'' and zpool images both detect, report, work around, flipped bits. The ''zfs send'' idea is different: if there''s corruption, the designers assume you can just restart the ''zfs send | zfs recv'' until you get a clean go---what you most need is ability to atomically roll back the failed recv, which you do get. You are not supposed to be archiving it! * unresolved bugs. ``poisonous streams'''' causing kernel panics when you receive them, http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0 The following things do not have these problems: * ZFS filesystems inside file vdev''s (except maybe the endian problem. and also the needs-whole-kernel problem, but mitigated by better forward-compatibility guarantees.) * tar files In both alternatives you probably shouldn''t use gzip on the resulting file. If you must gzip, it would be better to make a bunch of tar.gz files, ex., one per user, and tar the result. Maybe I''m missing some magic flag, but I''ve not gotten gzip to be too bitflip-resilient. The wiki cop-out is a nebulous ``enterprise backup ``Solution'' ''''. Short of that you might make a zpool in a file with zfs compression turned on and rsync or cpio or zfs send | zfs recv the data into it. Or just use gtar like in the old days. With some care you may even be able to convince tar to write directly to the medium. And when you''re done you can do a ''tar t'' directly from medium also, to check it. I''m not sure what to do about incrementals. 
There is a sort of halfass incremental feature in gtar, but not like what ZFS gives.
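A rough, untested sketch of the zpool-in-files idea, for illustration only -- the sizes, paths, pool and dataset names below are invented:

# create DVD-sized backing files and build a compressed pool across them
mkfile 8100m /backup/piece0 /backup/piece1 /backup/piece2
zpool create dvdpool /backup/piece0 /backup/piece1 /backup/piece2
zfs set compression=on dvdpool
# copy the data in, then take the pool offline and burn one backing file per disc
zfs send pool_01/pictures@2008.12.31-2358 | zfs recv dvdpool/pictures
zpool export dvdpool
# to restore, copy every piece back from the DVDs and run
zpool import -d /backup dvdpool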
On Wed, February 4, 2009 12:01, Miles Nordin wrote:
> * stream format is not guaranteed to be forward compatible with new
>   kernels. and versioning may be pickier than zfs/zpool versions.

Useful points, all of them. This particular one also points out something I hadn't previously thought about -- using zfs send piped through ssh (or in some other way going from one system to another) is also sensitive to this versioning issue.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
On Wed, 4 Feb 2009, Toby Thain wrote:
>> In order to make this work, I have used the "split" utility ...
>> I use the following command to convert them back into a single file:
>> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>>
>> But when I compare the checksum of the original snapshot to that of
>> the rejoined snapshot, I get a different result:
>
> Tested your RAM lately?

Split was originally designed to handle text files. It may have problems with binary files. Due to these issues, long ago (1993) I wrote a 'bustup' utility which works on binary files. I have not looked at it since then.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On 4-Feb-09, at 1:01 PM, Miles Nordin wrote:
...
> Here are the problems, again, with archiving 'zfs send' output:
> ...
> EXTREMELY corruption-sensitive. 'tar' and zpool images both
> detect, report, work around, flipped bits.

I know this was discussed a while back, but in what sense does tar do any of those things? I understand that it is unlikely to barf completely on bitflips, but won't tar simply silently de-archive bad data? Correct me if I'm wrong, but each tar'd object isn't stored with its checksum?

Of course your points re: send are well taken, thanks for the synopsis.

--Toby
On 4-Feb-09, at 2:29 PM, Bob Friesenhahn wrote:
> On Wed, 4 Feb 2009, Toby Thain wrote:
>>> In order to make this work, I have used the "split" utility ...
>>> I use the following command to convert them back into a single file:
>>> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>>>
>>> But when I compare the checksum of the original snapshot to that of
>>> the rejoined snapshot, I get a different result:
>>
>> Tested your RAM lately?
>
> Split is originally designed to handle text files. It may have
> problems with binary files.

Ouch, OK.

--Toby

> Due to these issues, long ago (1993) I wrote a 'bustup' utility
> which works on binary files. I have not looked at it since then.
>>>>> "tt" == Toby Thain <toby at telegraphics.com.au> writes:tt> I know this was discussed a while back, but in what sense does tt> tar do any of those things? I understand that it is unlikely tt> to barf completely on bitflips, but won''t tar simply silently tt> de-archive bad data? yeah, I just tested it, and you''re right. I guess the checksums are only for headers. However, cpio does store checksums for files'' contents, so maybe it''s better to use cpio than tar. Just be careful how you invoke it, because there are different cpio formats just like there are different tar formats, and some might have no or weaker checksum. NetBSD ''pax'' invoked as tar: -----8<----- castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1 1+0 records in 1+0 records out 1048576 bytes transferred in 0.022 secs (47662545 bytes/sec) castrovalva:~$ tar cf t0.tar t0 castrovalva:~$ md5 t0.tar MD5 (t0.tar) = 591a39a984f70fe3e44a5e13f0ac74b6 castrovalva:~$ tar tf t0.tar t0 castrovalva:~$ dd of=t0.tar seek=$(( 512 * 1024 )) bs=1 conv=notrunc asdfasdfasfs 13+0 records in 13+0 records out 13 bytes transferred in 2.187 secs (5 bytes/sec) castrovalva:~$ md5 t0.tar MD5 (t0.tar) = 14b3a9d851579d8331a0466a5ef62693 castrovalva:~$ tar tf t0.tar t0 castrovalva:~$ tar xvf t0.tar tar: Removing leading / from absolute path names in the archive t0 tar: ustar vol 1, 1 files, 1054720 bytes read, 0 bytes written in 1 secs (1054720 bytes/sec) castrovalva:~$ hexdump -C t0 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 0007fe00 61 73 64 66 61 73 64 66 61 73 66 73 0a 00 00 00 |asdfasdfasfs....| 0007fe10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00100000 castrovalva:~$ -----8<----- GNU tar does the same thing. NetBSD ''pax'' invoked as cpio: -----8<----- castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1 1+0 records in 1+0 records out 1048576 bytes transferred in 0.018 secs (58254222 bytes/sec) castrovalva:~$ cpio -H sv4cpio -o > t0.cpio t0 castrovalva:~$ md5 t0.cpio MD5 (t0.cpio) = d5128381e72ee514ced8ad10a5a33f16 castrovalva:~$ dd of=t0.cpio seek=$(( 512 * 1024 )) bs=1 conv=notrunc asdfasdfasdf 13+0 records in 13+0 records out 13 bytes transferred in 1.461 secs (8 bytes/sec) castrovalva:~$ md5 t0.cpio MD5 (t0.cpio) = b22458669256da5bcb6c94948d22a155 castrovalva:~$ rm t0 castrovalva:~$ cpio -i < t0.cpio cpio: Removing leading / from absolute path names in the archive cpio: Actual crc does not match expected crc t0 -----8<----- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090204/439dc7c8/attachment.bin>
Miles Nordin wrote:
>>>>>> "mm" == Michael McKnight <michael_mcknight01 at yahoo.com> writes:
>
> mm> #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
> mm> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>
> mm> But when I compare the checksum of the original snapshot to
> mm> that of the rejoined snapshot, I get a different result:
>
> sounds fine. I'm not sure why it's failing.
>
> mm> And when I try to restore the filesystem, I get the following
> mm> failure: #zfs recv pool_01/test < ./testjoin cannot receive
> mm> new filesystem stream: invalid stream (checksum mismatch)
>
> however, aside from this problem you're immediately having, I think
> you should never archive the output of 'zfs send'. I think the
> current warning on the wiki is not sufficiently drastic, but when I
> asked for an account to update the wiki I got no answer. Here are the
> problems, again, with archiving 'zfs send' output:
>
> * no way to test the stream's integrity without receiving it.
>   (meaning, to test a stream, you need enough space to store the
>   stream being tested, plus that much space again. not practical.)
>   A test could possibly be hacked up, but because the whole ZFS
>   software stack is involved in receiving, and is full of assertions
>   itself, any test short of actual extraction wouldn't be a thorough
>   test, so this is unlikely to change soon.
>
> * stream format is not guaranteed to be forward compatible with new
>   kernels. and versioning may be pickier than zfs/zpool versions.

Backward compatibility is achieved.

> * stream is expanded _by the kernel_, so even if tar had a
>   forward-compatibility problem, which it won't, you could
>   hypothetically work around it by getting an old 'tar'. For 'zfs
>   send' streams you have to get an entire old kernel, and boot it on
>   modern hardware, to get at your old stream.

An enterprising community member could easily put together a utility to do a verification. All of the necessary code is readily available.

> * supposed to be endian-independent, but isn't.

CR 6764193 was fixed in b105
http://bugs.opensolaris.org/view_bug.do?bug_id=6764193
Is there another?

> * stream is ``protected'' from corruption in the following way: if a
>   single bit is flipped anywhere in the stream, the entire stream and
>   all incrementals descended from it become worthless. It is
>   EXTREMELY corruption-sensitive. 'tar' and zpool images both
>   detect, report, work around, flipped bits. The 'zfs send' idea is
>   different: if there's corruption, the designers assume you can just
>   restart the 'zfs send | zfs recv' until you get a clean go---what
>   you most need is ability to atomically roll back the failed recv,
>   which you do get. You are not supposed to be archiving it!

This is not completely accurate. Snapshots which are completed are completed.

> * unresolved bugs. ``poisonous streams'' causing kernel panics when
>   you receive them, http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0
>
> The following things do not have these problems:
>
> * ZFS filesystems inside file vdevs (except maybe the endian
>   problem. and also the needs-whole-kernel problem, but mitigated by
>   better forward-compatibility guarantees.)

Indeed, but perhaps you'll find the grace to file an appropriate RFE?

> * tar files
>
> In both alternatives you probably shouldn't use gzip on the resulting
> file. If you must gzip, it would be better to make a bunch of tar.gz
> files, ex., one per user, and tar the result.
> Maybe I'm missing some
> magic flag, but I've not gotten gzip to be too bitflip-resilient.
>
> The wiki cop-out is a nebulous ``enterprise backup `Solution' ''.

Perhaps it would satisfy you to enumerate the market's Enterprise Backup Solutions? This might be helpful since Solaris does not include such software, at least by my definition of Solaris. So, the wiki section "Using ZFS With Enterprise Backup Solutions" does in fact enumerate them, and I don't see any benefit to repeating the enumeration.
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Using_ZFS_With_Enterprise_Backup_Solutions

> Short of that you might make a zpool in a file with zfs compression
> turned on and rsync or cpio or zfs send | zfs recv the data into it.
>
> Or just use gtar like in the old days. With some care you may even be
> able to convince tar to write directly to the medium. And when you're
> done you can do a 'tar t' directly from medium also, to check it. I'm
> not sure what to do about incrementals. There is a sort of halfass
> incremental feature in gtar, but not like what ZFS gives.

I suggest you consider an Enterprise Backup Solution.
 -- richard
Miles Nordin <carton at Ivy.NET> wrote:
>>>>>> "tt" == Toby Thain <toby at telegraphics.com.au> writes:
>
> tt> I know this was discussed a while back, but in what sense does
> tt> tar do any of those things? I understand that it is unlikely
> tt> to barf completely on bitflips, but won't tar simply silently
> tt> de-archive bad data?
>
> yeah, I just tested it, and you're right. I guess the checksums are
> only for headers. However, cpio does store checksums for files'
> contents, so maybe it's better to use cpio than tar. Just be careful
> how you invoke it, because there are different cpio formats just like
> there are different tar formats, and some might have no or weaker
> checksum.

cpio is a deprecated archive format. As it is hard to enhance the features of cpio without breaking archive compatibility, POSIX defines a standard archive format that is based on tar and made very extensible.

BTW: if you are on ZFS, ZFS should prevent flipping bits in archives ;-)

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       joerg.schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
> I use the following command to convert them back into a single file:
> #cat mypictures.zfssnap.split.a[a-g] > testjoin

Maybe I'm missing the point, but this command won't give you what you're after - in bash you want:

# cat mypictures.zfssnap.split.a{a..g} > testjoin

Chris
Casper.Dik at Sun.COM wrote (2009-Feb-05 12:39 UTC), re: [zfs-discuss] ZFS snapshot splitting & joining:
>> I use the following command to convert them back into a single file:
>> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>
> Maybe I'm missing the point, but this command won't give you what you're
> after - in bash you want:
>
> # cat mypictures.zfssnap.split.a{a..g} > testjoin

The first should work (unless they really broke the shell).
(Yes, I tested it, and yes, it works.)

Casper
On Thu, February 5, 2009 06:39, Casper.Dik at Sun.COM wrote:
>>> I use the following command to convert them back into a single file:
>>> #cat mypictures.zfssnap.split.a[a-g] > testjoin
>>
>> Maybe I'm missing the point, but this command won't give you what you're
>> after - in bash you want:
>>
>> # cat mypictures.zfssnap.split.a{a..g} > testjoin
>
> The first should work (unless they really broke the shell).
> (Yes, I tested it, and yes, it works.)

Good, because that's a syntax I still remember and use. And it has indeed worked for me recently as well.

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:re> Indeed, but perhaps you''ll find the grace to file an re> appropriate RFE? for what? The main problem I saw was with the wiki not warning people away from archiving ''zfs send'' emphatically enough, for example by comparing its archival characteristics to tar (or checksummed cpio) files and explaining that ''zfs send''s output needs to be ephemeral. This is RFE-worthy: >> * unresolved bugs. ``poisonous streams'''' causing kernel panics >> when you receive them, >> http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0 but I''m not having the problem, so I won''t file it when I can''t provide information. >> * stream format is not guaranteed to be forward compatible re> Backward compatibility is achieved. I''ve read complaints where the zfs filesystem version has to match. People _have_ reported compatibility problems. Maybe it is true that a newer system can always receive an older stream, but not vice-versa. I''d not wish for more, and that removes this (but not other) objections to archiving ''zfs send''. not entirely though---When you archive it you care about whether you''ll be able to read it years from now. Suppose there IS some problem receiving an old stream on a new system. Even if there''s not supposed to be, and even if there isn''t right now, a bug may appear later. I think it''s less likely to get fixed than a bug importing an old zpool. so, archive the zpool, not ''zfs send'' output. re> An enterprising community member could easily put together a re> utility to do a verification. All of the necessary code is re> readily available. fine, but (a) what CAN be written doesn''t change the fact that the tool DOES NOT EXIST NOW, and the possibility of writing one isn''t enough to make archiving ''zfs send'' streams a better idea which is what I''m discussing, and (b) it''s my opinion a thorough tool is not possible, because as I said, a bunch of kernel code is implicated in the zfs recv which is full of assertions itself. ''zfs recv'' is actually panicing boxes. so I''d not have faith in some userspace tool''s claim that a stream is good, since it''s necessarily using different code than the actual extraction. ''tar t'', ''cpio -it'', and I think ''zpool scrub'' don''t use separate code paths for verification. >> * supposed to be endian-independent, but isn''t. >> re> CR 6764193 was fixed in b105 re> http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is re> there another? no, no other, that is what I remember reading. I read someone ran into it when importing a pool, too, not just when using ''zfs send''. so hopefully that fix came for free at the same time. re> I suggest you consider an Enterprise Backup Solution. I prefer Free Software, especially for archival. But I will consider the advice I gave: backup to another zpool, or to a tar/cpio file. I do not have a problem with the way ''zfs send'' works. For replication-like incremental backups, rolling back the entire recv for one flipped bit is quite defendable. the lazy panics aren''t, but the architectural decision to trash a whole stream and all its descendent incrementals for one flipped bit DOES make sense to me. but ''zfs send'' shouldn''t be archived! That is what I''m saying, not ``zfs send | zfs recv sucks'''', just that it shouldn''t be archived. -------------- next part -------------- A non-text attachment was scrubbed... 
Hi everyone,

I appreciate the discussion on the practicality of archiving ZFS sends, but right now I don't know of any other options. I'm a home user, so Enterprise-level solutions aren't available, and as far as I know, tar, cpio, etc. don't capture ACLs and other low-level filesystem attributes. Plus, they are all susceptible to corruption while in storage, making recovery no more likely than with a zfs send.

The checksumming capability is a key factor to me. I would rather not be able to restore the data than to unknowingly restore bad data. This is the biggest reason I started using ZFS to start with. Too many cases of "invisible" file corruption. Admittedly, it would be nicer if "zfs recv" would flag individual files with checksum problems rather than completely failing the restore.

What I need is a complete snapshot of the filesystem (i.e. ufsdump) and, correct me if I'm wrong, but zfs send/recv is the closest (only) thing we have. And I need to be able to break up this complete snapshot into pieces small enough to fit onto a DVD-DL.

So far, using ZFS send/recv works great as long as the files aren't split. I have seen suggestions on using something like 7z instead of "split" as an option. Does anyone else have any other ideas on how to successfully break up a send file and join it back together?

Thanks again,
Michael
On Thu, February 5, 2009 14:15, Michael McKnight wrote:
> I appreciate the discussion on the practicality of archiving ZFS sends,
> but right now I don't know of any other options. I'm a home user, so
> Enterprise-level solutions aren't available and as far as I know, tar,
> cpio, etc. don't capture ACLs and other low-level filesystem attributes.
> Plus, they are all susceptible to corruption while in storage, making
> recovery no more likely than with a zfs send.

Your big constraint is using optical disks. Certainly there are arguments for single-use media for a backup, but a series of optical disks containing a data stream gives rise to a nasty probability that *one* disk in the set won't be readable, which will render everything after that unrecoverable too. .99 ^ 56 = .57, which is not a probability *I* want to see of fully recovering my data. (.99 is probably pessimistic, though. I hope.) (56 disks is how many my backup would take on DVD-DL disks, and is why I don't do it that way.)

External hard drives give you a lot more options. I'm formatting external USB drives as a ZFS pool, and then rsyncing data to them. I can scrub them for verification, and I can easily access individual files. I create snapshots on them so that I can have generations of backup accessible without duplicating data that hasn't changed. I'm currently updating them via rsync, which doesn't propagate ACLs, but I could and should be using send/receive instead, which would. I believe I've figured out the logic, but haven't updated the script.

If you do it with send/receive, you get a snapshot on the backup drive that's identical (modulo ZFS bugs) with the original, and which you can scrub to verify when you want, etc.

Furthermore, I don't have to be physically present to change and label and file 56 DVD-DL disks. Looks like DL disks are of similar price (per GB) to external USB drives -- and external drives can be used for more than one backup. (Rather similar meaning within a factor of two either way; I only checked prices one place.)

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
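A minimal sketch of that send/receive-to-a-USB-pool approach -- the device, pool, and dataset names here are invented, so treat it as untested:

# one-time: build a pool on the external drive
zpool create usbbackup c5t0d0
# full copy of the newest snapshot, then incrementals on later runs
zfs send tank/home@2009-02-05 | zfs recv usbbackup/home
zfs send -i tank/home@2009-02-05 tank/home@2009-02-12 | zfs recv -F usbbackup/home
# verify and detach before unplugging
zpool scrub usbbackup
zpool export usbbackup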
>>>>> "mm" == Michael McKnight <michael_mcknight01 at yahoo.com> writes:mm> as far as I know, tar, cpio, etc. don''t capture ACL''s and mm> other low-level filesystem attributes. Take another look with whatever specific ACL''s you''re using. Some of the cpio formats will probably work because I think there was a thread in here about ACL copy working in cpio but not pax? You have to try it. mm> Plus, they are all susceptible to corruption while in storage, yes, of course there are no magic beans. mm> making recovery no more likely than with a zfs send. nonsense. With ''zfs send'' recovery is impossible with any corruption. With tar/cpio, partial recovery is the rule, not the exception. This is a difference. a big one. And I am repeating myself, over and over. I am baffled as to why this is so disputable. mm> The checksumming capability is a key factor to me. Follow the thread. cpio does checksumming, at least with some of the stream formats, and I showed an example of how to check that the checksums are working, and prove they are missing from tar. mm> I would rather not be able to restore the data than to mm> unknowingly restore bad data. I suppose that makes sense, but only for certain really specific kinds of data that most peopple don''t have. Of course being warned would be nice, but I''ve rarely wanted to be warned by losing everything, even files far away from the bit flip. I''d rather not be warned than get that kind of warning, most of the time. especially for a backup. OTOH if you''re hauling the data from one place to another and throwing away the DVDR when you get it there, then maybe zfs send is appropriate. In that case you are not archiving the zfs send stream, but rather the expanded zpool in the remote location, which is how it''s meant to be used. mm> it would be nicer if "zfs recv" would flag individual files mm> with checksum problems rather than completely failing the mm> restore. It would be nice, but I suspect it''s hard to do this and preserve the incremental dump feature. There are too many lazy panics as is without wishing for incrementals to roll forward from a corrupt base. Also, I think, architecturally, replication and storage should not be mixed because the goals when errors occur are so different. Fixing this problem at the cost of making replication jobs less reliable would be a bad thing, so I like separate tools, and unstorable zfs send. mm> What I need is a complete snapshot of the filesystem mm> (ie. ufsdump) and, correct me if I''m wrong, but zfs send/recv mm> is the closest (only) thing we have. Using ''zfs send | zfs recv'' to replicate one zpool into another zpool is a second option---store the destination pool on DVDR, not the stream. If you have enough space to store disk images of the second zpool, which it sounds like you do, then once you get ''split'' working you can split it up and write it to DVDR, too. Or you can let ZFS do the splitting, and make DVD-size vdev''s, export the pool, and burn them. It''s not as robust as a split cpio when faced with a lost DVD, but it''s worlds better than a split ''zfs send''. for your ''split'' problem, I know I have used ''split'' in the way you want, but I would have been using GNU split. Bob suggested beware of split''s line-orientedness (be sure to use -b). A couple other people suggested using bash''s {a..z} syntax rather than plain globbing to make sure you''re combining the pieces in the right order. 
There is /usr/gnu/bin/split and /usr/5bin/split on my system in addition to /usr/bin/split, so you've a couple of others to try. You're checking that it's working the right way, with md5sum, so at least you already have enough tools to narrow the problem away from ZFS. If you get really desperate, you can use dd's skip= and count= options to emulate split, and still use cat to combine.

Also check the file sizes. If you have a 2GB file size ulimit set, that could mess up the stdout redirection, but on my Solaris system it seems to default to unlimited.
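For the record, a rough sketch of the dd-emulated split -- the chunk size and file names are made up, and this is untested:

# cut the stream into 8100 MB chunks; dd's skip= counts in units of bs
dd if=mypictures.zfssnap of=chunk.00 bs=1024k count=8100 skip=0
dd if=mypictures.zfssnap of=chunk.01 bs=1024k count=8100 skip=8100
dd if=mypictures.zfssnap of=chunk.02 bs=1024k count=8100 skip=16200
# ...repeat, bumping skip by 8100, until dd copies fewer than 8100 records
# record a checksum for each piece so a bad DVD can be identified later
cksum chunk.* > chunk.cksums
# rejoin and compare against the checksum of the original
cat chunk.00 chunk.01 chunk.02 > rejoined
cksum rejoined mypictures.zfssnap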
Miles Nordin wrote:
>>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>
> re> Indeed, but perhaps you'll find the grace to file an
> re> appropriate RFE?
>
> for what? The main problem I saw was with the wiki not warning people
> away from archiving 'zfs send' emphatically enough, for example by
> comparing its archival characteristics to tar (or checksummed cpio)
> files and explaining that 'zfs send's output needs to be ephemeral.

The reason is that zfs send/recv has very good application, even in the backup space. There are, in fact, many people using it.

> This is RFE-worthy:
>
> >> * unresolved bugs. ``poisonous streams'' causing kernel panics
> >> when you receive them,
> >> http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

Absolutely not an RFE! This is a bug!
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6783818

> but I'm not having the problem, so I won't file it when I can't
> provide information.
>
> >> * stream format is not guaranteed to be forward compatible
>
> re> Backward compatibility is achieved.
>
> I've read complaints where the zfs filesystem version has to match.
> People _have_ reported compatibility problems. Maybe it is true that
> a newer system can always receive an older stream, but not vice-versa.
> I'd not wish for more, and that removes this (but not other)
> objections to archiving 'zfs send'.

That would be the definition of backwards compatibility.

> not entirely though---When you archive it you care about whether
> you'll be able to read it years from now. Suppose there IS some
> problem receiving an old stream on a new system. Even if there's not
> supposed to be, and even if there isn't right now, a bug may appear
> later. I think it's less likely to get fixed than a bug importing an
> old zpool. so, archive the zpool, not 'zfs send' output.

ZFS send is not an archival solution. You should use an archival method which is appropriate for your business requirements. Note: "method" above, not product or command.

> re> An enterprising community member could easily put together a
> re> utility to do a verification. All of the necessary code is
> re> readily available.
>
> fine, but (a) what CAN be written doesn't change the fact that the
> tool DOES NOT EXIST NOW, and the possibility of writing one isn't
> enough to make archiving 'zfs send' streams a better idea which is
> what I'm discussing, and (b) it's my opinion a thorough tool is not
> possible, because as I said, a bunch of kernel code is implicated in
> the zfs recv which is full of assertions itself. 'zfs recv' is
> actually panicing boxes. so I'd not have faith in some userspace
> tool's claim that a stream is good, since it's necessarily using
> different code than the actual extraction. 'tar t', 'cpio -it', and I
> think 'zpool scrub' don't use separate code paths for verification.
>
> >> * supposed to be endian-independent, but isn't.
>
> re> CR 6764193 was fixed in b105
> re> http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is
> re> there another?
>
> no, no other, that is what I remember reading. I read someone ran
> into it when importing a pool, too, not just when using 'zfs send'.
> so hopefully that fix came for free at the same time.

Perhaps your memory needs to be using checksum=sha256 :-)
I do not recall such a conversation or bug.

> re> I suggest you consider an Enterprise Backup Solution.
>
> I prefer Free Software, especially for archival.
> But I will consider the advice I gave: backup to another zpool, or to
> a tar/cpio file.

OK, one of the recommended solutions is Amanda, which is FOSS. The ZFS Best Practices Guide refers to this in the following bullet:

    Open source backup solutions are available. Joe Little blogs about how he backs up ZFS file systems
    <http://jmlittle.blogspot.com/2008/08/amanda-simple-zfs-backup-or-s3.html>
    to Amazon's S3 <http://aws.amazon.com/s3/> using Amanda <http://www.zmanda.com>.
    Integration of ZFS snapshots with MySQL and Amanda Enterprise 2.6 Software
    <http://www.sun.com/bigadmin/features/articles/zmanda_sfx4540.pdf>
    can also take advantage of ZFS snapshot capabilities.

> I do not have a problem with the way 'zfs send' works. For
> replication-like incremental backups, rolling back the entire recv for
> one flipped bit is quite defendable. the lazy panics aren't, but the
> architectural decision to trash a whole stream and all its descendent
> incrementals for one flipped bit DOES make sense to me. but 'zfs
> send' shouldn't be archived! That is what I'm saying, not
> ``zfs send | zfs recv sucks'', just that it shouldn't be archived.

I tend to agree, which is why it is called send/receive instead of backup/restore or archive -- historians will note that it was originally called backup, and changed accordingly.
 -- richard
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:re> The reason is that zfs send/recv has very good application, re> even in the backup space. There are, in fact, many people re> using it. [...] re> ZFS send is not an archival solution. You should use an re> archival method which is appropriate for your business re> requirements. Note: "method" above, not product or command. well, I think most backups are archival. If we start arguing about words, I think everyone''s lost interest long ago. But I do think to protect oneself from bad surprises it would be good to never archive the output of ''zfs send'', only use it to move data from one place to another. yes, backup ``method'''', moving data from one place to another is often part of backup and can be done safely with ''zfs send | zfs recv'', but without a specific warning people will imagine something''s safe which isn''t, when you say the phrase ``use zfs send for backup''''. re> CR 6764193 was fixed in b105 >> I read someone ran into it when importing a pool, too, not just >> when using ''zfs send''. so hopefully that fix came for free at >> the same time. re> Perhaps your memory needs to be using checksum=sha256 :-) I do re> not recall such a conversation or bug. fine, here you go: http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090206/e427a6d6/attachment.bin>
my last contribution to this thread (and there was much rejoicing!)

Miles Nordin wrote:
>>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>
> re> The reason is that zfs send/recv has very good application,
> re> even in the backup space. There are, in fact, many people
> re> using it.
>
> [...]
>
> re> ZFS send is not an archival solution. You should use an
> re> archival method which is appropriate for your business
> re> requirements. Note: "method" above, not product or command.
>
> well, I think most backups are archival.

Disagree. Archives tend to not be overwritten, ever. Backups have all sorts of management schemes to allow the backup media to be reused.

> If we start arguing about
> words, I think everyone's lost interest long ago. But I do think to
> protect oneself from bad surprises it would be good to never archive
> the output of 'zfs send', only use it to move data from one place to
> another.
>
> yes, backup ``method'', moving data from one place to another is often
> part of backup and can be done safely with 'zfs send | zfs recv', but
> without a specific warning people will imagine something's safe which
> isn't, when you say the phrase ``use zfs send for backup''.
>
> re> CR 6764193 was fixed in b105
>
> >> I read someone ran into it when importing a pool, too, not just
> >> when using 'zfs send'. so hopefully that fix came for free at
> >> the same time.
>
> re> Perhaps your memory needs to be using checksum=sha256 :-) I do
> re> not recall such a conversation or bug.
>
> fine, here you go:
>
> http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html

Bzzt. Thanks for playing. That is:

    re> CR 6764193 was fixed in b105
    re> http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is
    re> there another?

 -- richard
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:>> http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html re> Bzzt. Thanks for playing. That is: CR 6764193 was fixed in re> b105 http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is re> there another? I don''t understand. What are you saying? I was responding to the part of your message that I quoted, where you said maybe I needed SHA256 on my memory because you didn''t remember any endiness bugs with straight pools, you only remembered problems with ''zfs send'' streams. That December 19th post is probably what I was remembering. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090206/f7bbd2bd/attachment.bin>
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:>> well, I think most backups are archival. re> Disagree. Archives tend to not be overwritten, ever. Backups re> have all sorts of management schemes to allow the backup media re> to be reused. The problem with storing ''zfs send'' arises when you try to ''zfs recv'' the stored stream after the opportunity to resend another stream in case of problems is passed. This problem applies to both what you call ``backups'''' and to what you call ``archives''''. I do agree that ''zfs send'' can be used as part of a backup strategy, as long as it''s used to move data to a ''zfs recv'', not to store it. I don''t agree with what you seem to imply, that ''zfs send'' streams can be stored as long as the backup isn''t what you call ``archival''''. I am not sure what was the point of your bringing up the distinction---I think it''s bad either way. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090206/00a3c5dc/attachment.bin>
I believe Tim Foster's zfs backup service (very beta atm) has support for splitting zfs send backups. Might want to check that out and see about modifying it for your needs.

On Thu, Feb 5, 2009 at 3:15 PM, Michael McKnight <michael_mcknight01 at yahoo.com> wrote:
> Hi everyone,
>
> I appreciate the discussion on the practicality of archiving ZFS sends, but right now I don't know of any other options. I'm a home user, so Enterprise-level solutions aren't available and as far as I know, tar, cpio, etc. don't capture ACLs and other low-level filesystem attributes. Plus, they are all susceptible to corruption while in storage, making recovery no more likely than with a zfs send.
>
> The checksumming capability is a key factor to me. I would rather not be able to restore the data than to unknowingly restore bad data. This is the biggest reason I started using ZFS to start with. Too many cases of "invisible" file corruption. Admittedly, it would be nicer if "zfs recv" would flag individual files with checksum problems rather than completely failing the restore.
>
> What I need is a complete snapshot of the filesystem (i.e. ufsdump) and, correct me if I'm wrong, but zfs send/recv is the closest (only) thing we have. And I need to be able to break up this complete snapshot into pieces small enough to fit onto a DVD-DL.
>
> So far, using ZFS send/recv works great as long as the files aren't split. I have seen suggestions on using something like 7z instead of "split" as an option. Does anyone else have any other ideas on how to successfully break up a send file and join it back together?
>
> Thanks again,
> Michael
Hi again everyone,

OK... I'm even more confused at what is happening here when I try to rejoin the split zfs send file...

When I cat the split files and pipe through cksum, I get the same cksum as the original (unsplit) zfs send snapshot:

#cat mypictures.zfssnap.split.a[a-d] | cksum
2375397256 27601696744
#cksum mypictures.zfssnap
2375397256 27601696744

But when I cat them into a file and then run cksum on the file, it results in a different cksum:

#cat mypictures.zfssnap.split.a[a-d] > testjoin3
#cksum testjoin3
3408767053 27601696744 testjoin3

I am at a loss as to what on Earth is happening here! The resulting file size is the same as the original, but why does cat produce a different cksum when piped vs. directed to a file? In each case where I have run 'cmp -l' on the resulting file, there is a single byte that has the wrong value. What could cause this?

Any ideas would be greatly appreciated.

Thanks (again) to all in advance,
-Michael
Thanks to John K. and Richard E. for an answer that would have never, ever occurred to me...

The problem was with the shell. For whatever reason, /usr/bin/ksh can't rejoin the files correctly. When I switched to /sbin/sh, the rejoin worked fine, the cksums matched, and the zfs recv worked without a hitch.

The ksh I was using is:

# what /usr/bin/ksh
/usr/bin/ksh:
        Version M-11/16/88i
        SunOS 5.10 Generic 118873-04 Aug 2006

So, is this a bug in the ksh included with Solaris 10? Should I file a bug report with Sun? If so, how? I don't have a support contract or anything.

Anyway, I'd like to thank you all for your valuable input and assistance in helping me work through this issue.

-Michael
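If anyone wants to narrow it down, a quick way to compare the shells on a smaller file might look like the following sketch (untested, and whether the problem even reproduces at this size is an open question):

# make a modest binary test file, then join a split copy under each shell
dd if=/dev/urandom of=testdata bs=1024k count=512
split -b 100m testdata testdata.split.
cksum testdata
for sh in /usr/bin/ksh /sbin/sh /usr/bin/bash; do
    $sh -c 'cat testdata.split.a[a-z] > testdata.join'
    echo $sh; cksum testdata.join
done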
> The problem was with the shell. For whatever reason,
> /usr/bin/ksh can't rejoin the files correctly. When
> I switched to /sbin/sh, the rejoin worked fine, the
> cksums matched, ...
>
> The ksh I was using is:
>
> # what /usr/bin/ksh
> /usr/bin/ksh:
>         Version M-11/16/88i
>         SunOS 5.10 Generic 118873-04 Aug 2006
>
> So, is this a bug in the ksh included with Solaris 10?

Are you able to reproduce the issue with a script like this (needs ~ 200 gigabytes of free disk space)? I can't...

=============================
% cat split.sh
#!/bin/ksh
bs=1k
count=`expr 57 \* 1024 \* 1024`
split_bs=8100m

set -x
dd if=/dev/urandom of=data.orig bs=${bs} count=${count}
split -b ${split_bs} data.orig data.split.
ls -l data.split.*
cat data.split.a[a-z] > data.join
cmp -l data.orig data.join
=============================

On SX:CE / OpenSolaris the same version of /bin/ksh = /usr/bin/ksh is present:

% what /usr/bin/ksh
/usr/bin/ksh:
        Version M-11/16/88i
        SunOS 5.11 snv_104 November 2008

I did run the script in a directory in an uncompressed zfs filesystem:

% ./split.sh
+ dd if=/dev/urandom of=data.orig bs=1k count=59768832
59768832+0 records in
59768832+0 records out
+ split -b 8100m data.orig data.split.
+ ls -l data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae data.split.af data.split.ag data.split.ah
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:31 data.split.aa
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:35 data.split.ab
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:39 data.split.ac
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:43 data.split.ad
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:48 data.split.ae
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:53 data.split.af
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:58 data.split.ag
-rw-r--r--   1 jk   usr  1749024768 Feb 12 18:58 data.split.ah
+ cat data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae data.split.af data.split.ag data.split.ah
+ 1> data.join
+ cmp -l data.orig data.join
2002.33u 2302.05s 1:51:06.85 64.5%

As expected, it works without problem. The files are bit for bit identical after splitting and joining.

For me this looks more as if your hardware is broken:
http://opensolaris.org/jive/thread.jspa?messageID=338148

A single bad bit (!) in the middle of the joined file is very suspicious...