Jonathan Wheeler
2008-Aug-10 11:18 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
Hi Folks,

I'm in the very unsettling position of fearing that I've lost all of my data via a zfs send/receive operation, despite ZFS's legendary integrity.

The error that I'm getting on restore is:

receiving full stream of faith/home@09-08-08 into Z/faith/home@09-08-08
cannot receive: invalid stream (checksum mismatch)

Background:
I was running snv_91, and decided to upgrade to snv_95, converting to the much awaited zfs-root in the process. On snv_91, I was using zfs for /opt, /export/home, and a couple of other file systems under /export.

I expected that converting to zfs root would require completely formatting my disk, so I needed to back up all of my critical data to a remote host beforehand. My main file server is running snv_71, using an 8-disk raid-z, with plenty of space available via nfs, so I directed a zfs send across nfs to it. So it was zfs -> nfs -> zfs (raid-z).

I don't remember the exact commands used, but I started off with a zfs snapshot -r, and then did a zfs send zfs@snapshot > /my/nfs/server/backup.zfs. This sent each of the filesystems across and redirected them into the one, single "backup" file. I wasn't all that confident that this was a wise move, as I didn't know how I was going to get just one fs (rather than all) extracted again at a later time using zfs receive (I'm open to answers on that one still!). So, I decided to *also* send just the snapshot of my home directory, which contains all of my vital information. A bit of extra peace of mind, eh? 2 backups are better than one....

I then installed snv_95 from dvd, using zfs-root, destroying my previous zpool on the disk in the process.

Here I am now, trying to restore my vital data that I backed up onto the nfs server, but it's not working!

# cat justhome.zfs | zfs receive -v Z/faith/home
receiving full stream of faith/home@09-08-08 into Z/faith/home@09-08-08
cannot receive: invalid stream (checksum mismatch)

I just don't understand what's going on here. I started off restoring across nfs to my desktop with the standard options. I've tried disabling checksumming on the parent zfs fs, to ensure that when it was restoring it wouldn't be using checksumming. I still got the checksum mismatch error.

Next I tried restoring the zfs backup internally within the nfs server, making it all local disk traffic, on the off chance that it was the network on my new build that was somehow broken. No dice, same error, with or without checksumming on the parent fs.

I've also tried my other backup file, but that's also having the same problem. In all I've tried about 8 combinations, and I'm breaking out in a sweat at the possibility of having lost all of my data.

The zfs backup that included all file systems bombs out fairly early, on a small fs that was only a few GB. The zfs backup that included just my home fs gets around 20GB of the way through before failing with the same error (and deleting the partial zfs fs). I don't recall how big the original home fs was, perhaps 30-40GB, so it's a fair way through.

What's causing this error, and if this situation is as dire as I'm fearing (please tell me it's not so!), why can't I at least have the 20GB of data that it can restore before it bombs out with that checksum error?

Thanks for any help with this!

Jonathan
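For reference, the backup sequence described above would have looked roughly like the following. This is a sketch only; the poster does not remember the exact commands, so the use of -R, the snapshot name, and the NFS paths are assumptions reconstructed from later posts in the thread:

    # take a recursive snapshot of the whole pool
    zfs snapshot -r faith@09-08-08

    # send everything into a single file on the NFS-mounted file server
    # (-R builds one stream covering all descendant filesystems)
    zfs send -R faith@09-08-08 > /net/supernova/Z/backup/angelous/pre-zfsroot.zfs

    # separately, send just the home filesystem to a second file
    zfs send faith/home@09-08-08 > /net/supernova/Z/backup/angelous/justhome.zfs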
Jonathan Wheeler
2008-Aug-12 10:39 UTC
[zfs-discuss] corrupt zfs stream? "checksum mismatch"
Hi folks,

Perhaps I was a little verbose in my first post, putting a few people off.

Does anyone else have any ideas on this one? I can't be the first person to have had a problem with a zfs backup stream.

Is there nothing that can be done to recover at least some of the stream? As another helpful chap pointed out, if tar encounters an error in the bitstream it just moves on until it finds usable data again. Can zfs not do something similar? I'll take whatever I can get!

Jonathan
Mattias Pantzare
2008-Aug-12 12:21 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
2008/8/10 Jonathan Wheeler <griffous at griffous.net>:
> Hi Folks,
>
> I'm in the very unsettling position of fearing that I've lost all of my data via a zfs send/receive operation, despite ZFS's legendary integrity.
>
> The error that I'm getting on restore is:
> receiving full stream of faith/home@09-08-08 into Z/faith/home@09-08-08
> cannot receive: invalid stream (checksum mismatch)
>
> Background:
> I was running snv_91, and decided to upgrade to snv_95, converting to the much awaited zfs-root in the process.

You could try to restore on a snv_91 system.

zfs send streams are not for backups. This is from the zfs man page:

     The format of the stream is evolving. No backwards compatibility
     is guaranteed. You may not be able to receive your streams on
     future versions of ZFS.

Or the file was corrupted when you transferred it.
>>>>> "mp" == Mattias Pantzare <pantzer at ludd.ltu.se> writes:mp> Or the file was corrupted when you transfered it. he stored the backup streams on ZFS, so obviously they couldn''t possibly be corrupt. :p Jonathan, does ''zfs receive -nv'' also detect the checksum error, or is it only detected when you actually receive onto a pool without -n? in addition to skipping to the next header of corrupted tarballs, tar can validate a tarball''s checksums without extracting it, so it''s possible to write a tape, then read it to see if it''s ok. The ''tar t'' read test checks for medium errors, driver bugs, and bugs inside tar itself. so it sounds like: brrk, brrk, danger, do not use zfs send/receive for backups---use only for moving filesystems from one pool to another. This brings back the question ``how is it possible to back up and restore a heavily-cloned/snapshotted system?'''' because upon restore the clone inheritance tree is lost, and you''ll never have enough space in the pool to fit what was there before. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080812/f349ef3e/attachment.bin>
Jonathan Wheeler
2008-Aug-13 13:06 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
Hi Mattias & Miles.

To test the version mismatch theory, I set up a snv_91 VM (using virtualbox) on my snv_95 desktop, and tried the zfs receive again. Unfortunately the symptoms are exactly the same: around the ~20GB mark, the justhome.zfs stream still bombs out with the checksum error.

I didn't realise that the zfs stream format wasn't backward compatible at the time that I made the backup, but having performed the above test, this doesn't actually appear to be my problem. I wish it were - that I could have dealt with! :(

So far we've established that in this case:
* Version mismatches aren't causing the problem.
* Receiving across the network isn't the issue (because I have the exact same issue restoring the stream directly on my file server).
* All that's left is the initial send, and since zfs guarantees end-to-end data integrity, it should have been able to deal with any possible network randomness in the middle (zfs on both ends) - or at absolute worst, the zfs send command should have failed if it encountered errors. Seems fair, no?

So, is there a major bug here, or at least an oversight in the zfs send part of the code? Does zfs send not do checksumming, or verification after sending? I'm not sure how else to interpret this data.

Today, to add some more datapoints, I repeated a zfs send to the same nfs server from the same desktop, though this time I'm using zfs root with snv_95. Same hardware, same network, same commands, but this time I didn't have any issues with the zfs receive. ?!?!?!?!

Miles: zfs receive -nv works ok:

# zfs receive -vn rpool/test < /net/supernova/Z/backup/angelous/justhome.zfs
would receive full stream of faith/home@09-08-08 into rpool/test@09-08-08

Where it gets interesting is with my recursive zfs dump:

bash-3.2# zfs receive -nvF -d rpool/test < /net/supernova/Z/backup/angelous/pre-zfsroot.zfs
would receive full stream of faith@09-08-08 into rpool/test@09-08-08
would receive full stream of faith/virtualmachines@09-08-08 into rpool/test/virtualmachines@09-08-08
would receive full stream of faith/opt@09-08-08 into rpool/test/opt@09-08-08
would receive full stream of faith/home@09-08-08 into rpool/test/home@09-08-08

faith@09-08-08 is actually empty.
faith/virtualmachines@09-08-08 bombs out around 2GB in, but I'm not really too worried about that fs.
faith/opt@09-08-08 is also another fs that I can live without.
faith/home@09-08-08 is the one that we're after.

It would seem that my justhome.zfs dump (containing only faith/home@09-08-08) isn't going to work, but is there some way to recover the /home fs from the pre-zfsroot.zfs dump? Since there seems to be a problem with the first fs (faith/virtualmachines), I need to find a way to skip restoring that zfs, so it can focus on the faith/home fs. How can this be achieved with zfs receive?

Jonathan
Mattias Pantzare
2008-Aug-13 13:37 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
2008/8/13 Jonathan Wheeler <griffous at griffous.net>:
> So far we've established that in this case:
> * Version mismatches aren't causing the problem.
> * Receiving across the network isn't the issue (because I have the exact same issue restoring the stream directly on my file server).
> * All that's left is the initial send, and since zfs guarantees end-to-end data integrity, it should have been able to deal with any possible network randomness in the middle (zfs on both ends) - or at absolute worst, the zfs send command should have failed if it encountered errors. Seems fair, no?
>
> So, is there a major bug here, or at least an oversight in the zfs send part of the code?
> Does zfs send not do checksumming, or verification after sending? I'm not sure how else to interpret this data.

zfs send can't do any verification after sending. It is sending to a pipe; it does not know that it is writing to a file. zfs receive can verify the data, as you know.

ZFS is not involved in moving the data over the network when you are using NFS.

There are many places where data can get corrupt even when you are using ZFS. Non-ECC memory is one example.

There might be a bug in zfs, but that is hard to check as you can't reproduce the problem.
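To make the point concrete: because zfs send only writes an opaque stream to stdout, the one real end-to-end check after redirecting it to a file is a trial receive. A sketch of that pattern follows; the dataset and path names are assumptions, not commands from the thread:

    # the send side has no way to re-read or verify the file it was piped into
    zfs send faith/home@09-08-08 > /net/backup/justhome.zfs

    # the checksum embedded in the stream is only verified on receive, so a
    # trial receive into a scratch dataset is how you prove the file is good
    zfs receive scratchpool/verify < /net/backup/justhome.zfs \
        && zfs destroy -r scratchpool/verify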
Mattias Pantzare wrote:
> 2008/8/13 Jonathan Wheeler <griffous at griffous.net>:
>> So far we've established that in this case:
>> * Version mismatches aren't causing the problem.
>> * Receiving across the network isn't the issue (because I have the exact same issue restoring the stream directly on my file server).
>> * All that's left is the initial send, and since zfs guarantees end-to-end data integrity, it should have been able to deal with any possible network randomness in the middle (zfs on both ends) - or at absolute worst, the zfs send command should have failed if it encountered errors. Seems fair, no?
>>
>> So, is there a major bug here, or at least an oversight in the zfs send part of the code?
>> Does zfs send not do checksumming, or verification after sending? I'm not sure how else to interpret this data.
>
> zfs send can't do any verification after sending. It is sending to a pipe; it does not know that it is writing to a file. zfs receive can verify the data, as you know.
>
> ZFS is not involved in moving the data over the network when you are using NFS.

ZFS is never involved in moving data over the network. It doesn't know anything about networking. Even if you are using iSCSI or FCoE, ZFS still doesn't know about networking; the "disk" layers do. For the ZFS send/recv cases, as you said, it just writes to stdout and reads from stdin.

--
Darren J Moffat
>>>>> "jw" == Jonathan Wheeler <griffous at griffous.net> writes: >>>>> "mp" == Mattias Pantzare <pantzer at ludd.ltu.se> writes:jw> Miles: zfs receive -nv works ok one might argue ''zfs receive'' should validate checksums with the -n option, so you can check if a just-written dump is clean before counting on it. Without this, even with hindsight bias it''s really hard to blame the sysadmin instead of ZFS this time. jw> Since there seems to be a problem with the first fs jw> (faith/virtualmachines), I need to find a way to skip jw> restoring that zfs, so it can focus on the faith/home fs. right. you do not even need a fix for the supposed corruption, just for the pedantry. mp> There might be a bug in zfs but that is hard to check as you mp> can''t reproduce the problem. the ''zfs receive'' problem happens every time one tries to restore that file, and he still has the file, so it''s reproduceable in that sense. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080813/6d50a05d/attachment.bin>
Jonathan Wheeler
2008-Aug-13 14:10 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
Thanks for the information, I'm learning quite a lot from all this.

It seems to me that zfs send *should* be doing some kind of verification, since some work has clearly been put into zfs so that filesystems can be dumped into files/pipes. It's a great feature to have, and I can't believe that this was purely for zfs send | zfs receive scenarios.

A common example used all over the place is zfs send | ssh $host. In these examples, is ssh guaranteeing the data delivery somehow? If not, there need to be some serious asterisks in these guides! Looking at this at a level that I do understand, it's going via TCP, which checksums packets..... then again, I was using nfs over TCP, and look where I am today. So much for that!

As I google these subjects more and more, I fear that I'm hitting the conceptual mental block that many before me have done also. zfs send is not zfsdump, even though it sure looks the same, and it's not clearly stated that you may end up in a situation like the one I'm in today if you don't somehow test your backups.

As you've rightly pointed out, it's done now, and even if I did manage to reproduce this again, that won't help my data locked away in these 2 .zfs files, so focusing on the hopeful: is there anything I can do to recover my data from these zfs dumps? Anything at all :)

If the problem is "just" that "zfs receive" is checksumming the data on the way in, can I disable this somehow within zfs? Can I globally disable checksumming in the kernel module? mdb something or other?

I read this thread where someone did successfully manage to recover data from a damaged zfs, which fills me with some hope:
http://www.opensolaris.org/jive/thread.jspa?messageID=220125

It's way over my head, but if anyone can tell me the mdb commands I'm happy to try them, even if they do kill my cat. I don't really have anything to lose with a copy of the data, and I'll do it all in a VM anyway.

Thanks,
Jonathan
>>>>> "jw" == Jonathan Wheeler <griffous at griffous.net> writes:jw> A common example used all over the place is zfs send | ssh jw> $host. In these examples is ssh guaranteeing the data delivery jw> somehow? it is really all just appologetics. It sounds like a zfs bug to me. The only alternative is bad hardware (not disks), so you could try memory testers, continuous big ''make -n <big number, like 4 - 10>'' builds, scripted continuous zpool send/recv, to look for this. jw> you may end up in a situation like the one I''m in today if you jw> don''t somehow test your backups. which is why I asked you to check -n spots it. It doesn''t---the tool gives you no way to test the backups! I''ve lost before because I backed things up onto tape, wiped the original, and then had the tape go bad. The idea of backups is to always have two copies, so I should have written two tapes. but I don''t see any reason to believe you wouldn''t get two bad copies in your case since it sounds like a bug. I also made the mistake of using FancyTape---I used some DAT bullshit with a ``table of contents'''' that can become ``corrupt'''' if you power off the drive at the wrong moment, which simpler tape formats don''t have. DAT also has these block checksums, where some drives if they can''t read part of the tape, they just hang forever and can''t seek past it. (weirdly analagous to zfs receive). I had already learned not to gzip a tarball before writing it to tape if the tarball contained mostly uncompressable things, because the gzip format is less robust than the tar format. but, I got bitten anyway because of the stupid tape TOC and the poor exception handling in the DAT drive''s firmware. What''s required, *given hindsight*, is to realize that the purpose of backups for ZFS users is partly to protect ourselves from ZFS bugs, so the backups need to be stored in a format that has nothing to do with ZFS, like tar or UDF or a non-ZFS filesystem. however if you have lots of snapshots or clones, I''m not sure this is possible because the data expands too much. In that case I might store backups in an zpool rather than in a file, because I expect zpool corruption bugs will get more attention sooner than ''zfs send'' corruption bugs. but, that''s still sketchy, and had it not been for your experience, I might have trusted the zfs send format. ``learn'''', fine, but I don''t think you''ve done anything unreasonable. jw> is there anything I can do to recover my data from these zfs jw> dumps? Anything at all :) fix ''zfs receive'' to ignore the error? :) burry the dumps in the sand for two years, and hope someone else fixes ZFS in the mean time? :) That''s what I did to my tape with the bad TOC. no good news yet. jw> If the problem is "just" that "zfs receive" is checksumming jw> the data on the way in, can I disable this somehow within zfs? jw> Can I globally disable checksumming in the kernel module? mdb jw> something or rather? sounds plausible but I don''t know how, so please let me know if you find a way. I found also some magic /etc/system incantations, but it doesn''t seem to apply to ''zfs receive''. It''s more of what you found, more ``simon sez, import!'''' stuff: http://opensolaris.org/jive/message.jspa?messageID=192572#194209 http://sunsolve.sun.com/search/document.do?assetkey=1-66-233602-1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080813/d37b7804/attachment.bin>
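A scripted continuous send/receive stress test of the kind suggested above might look roughly like this. It is only a sketch; the pool and dataset names are placeholders, not anything from the thread:

    #!/bin/ksh
    # hammer the send/receive path repeatedly; any iteration that fails to
    # receive cleanly points at flaky hardware or a send/recv problem
    i=0
    while [ $i -lt 100 ]; do
        zfs snapshot testpool/data@stress$i
        zfs send testpool/data@stress$i | zfs receive scratchpool/copy$i
        if [ $? -ne 0 ]; then
            echo "send/receive failed on iteration $i" >&2
            exit 1
        fi
        zfs destroy -r scratchpool/copy$i
        zfs destroy testpool/data@stress$i
        i=$((i + 1))
    done
    echo "all iterations received cleanly"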
Jonathan Wheeler wrote:
> Thanks for the information, I'm learning quite a lot from all this.
>
> It seems to me that zfs send *should* be doing some kind of verification, since some work has clearly been put into zfs so that filesystems can be dumped into files/pipes. It's a great feature to have, and I can't believe that this was purely for zfs send | zfs receive scenarios.

zfs send/receive is not a backup solution because it does not have the features generally expected in a backup solution. It is a very low-level method of replicating dataset structure. If you find documentation to the contrary which was created after CR 6399918 was integrated, then please file a new bug.
http://bugs.opensolaris.org/view_bug.do?bug_id=6399918

> A common example used all over the place is zfs send | ssh $host. In these examples, is ssh guaranteeing the data delivery somehow? If not, there need to be some serious asterisks in these guides!

In this case, the receive does checks and will fail when the checks do not pass. In such cases, the send can be restarted. ssh performs encryption, and encryption codes tend to be more robust because a corruption will tend to fail upon decryption (including the surrounding checksum checks). If you save the contents of the pipe somewhere, then you are at the mercy of the robustness of the saved location.

However, there is more that can be done here, both inside and outside of ZFS. For inside ZFS, I have filed an RFE: CR 6736837, improve send/receive fault tolerance. However, to be effective, we really need a better understanding of the failures we expect to encounter.

As an interim step, know that a send will create the same stream because it is sending a stable set of data. You can send to files twice, on diverse storage, and then compare the resulting files. In other words, the flexibility of UNIX pipes is exposed by zfs send/receive.

> Looking at this at a level that I do understand, it's going via TCP, which checksums packets..... then again, I was using nfs over TCP, and look where I am today. So much for that!

I do not think you will be able to identify the root cause of your corruption -- there are far too many dependents and you do not have a known-good reference :-(.

> As I google these subjects more and more, I fear that I'm hitting the conceptual mental block that many before me have done also. zfs send is not zfsdump, even though it sure looks the same, and it's not clearly stated that you may end up in a situation like the one I'm in today if you don't somehow test your backups.

Correct, though this applies to everything, in general. One backup method I use (I use several ;-) is to use send/receive to a removable disk, usually a USB disk. I can then set up compression and redundancy policies for the USB disk and also periodically scrub to test the retention. This also offers the ability to go back to any snapshot in a matter of minutes, even though I store the USB disk in a fire safe.

Another benefit of this method is that I can easily verify the media -- I was once a user of 8mm tape drives, so I've got several scars related to the inability to recover data from tapes (they had a nasty habit of writing tapes that couldn't be read from other 8mm drives, so if you had to repair your drive (likely), then you might not be able to read your tapes).

> As you've rightly pointed out, it's done now, and even if I did manage to reproduce this again, that won't help my data locked away in these 2 .zfs files, so focusing on the hopeful: is there anything I can do to recover my data from these zfs dumps? Anything at all :)

I filed RFE CR 6736794, option for partial zfs receives. But I'm not confident that it can be implemented easily or quickly.

> If the problem is "just" that "zfs receive" is checksumming the data on the way in, can I disable this somehow within zfs?
> Can I globally disable checksumming in the kernel module? mdb something or other?
>
> I read this thread where someone did successfully manage to recover data from a damaged zfs, which fills me with some hope:
> http://www.opensolaris.org/jive/thread.jspa?messageID=220125
>
> It's way over my head, but if anyone can tell me the mdb commands I'm happy to try them, even if they do kill my cat. I don't really have anything to lose with a copy of the data, and I'll do it all in a VM anyway.

With mdb and the source, all things are possible. But I'll have to defer to someone who uses mdb more frequently than I.
-- richard
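The removable-disk approach described above keeps the backup inside a pool (so it can be scrubbed and browsed) rather than as a raw stream file. A rough sketch of that workflow, assuming a hypothetical device name and pool layout not taken from the thread:

    # create a pool on the USB disk and give it its own protection policy
    zpool create usbbackup c5t0d0
    zfs set compression=on usbbackup
    zfs set copies=2 usbbackup

    # replicate the live datasets onto it with send | receive, instead of
    # leaving an opaque stream file around
    zfs snapshot -r faith@backup-2008-08-09
    zfs send -R faith@backup-2008-08-09 | zfs receive -dF usbbackup

    # periodically verify the media end to end
    zpool scrub usbbackup
    zpool status usbbackup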
There is an explicit check in ZFS for the checksum, as you deduced. I suspect that by disabling this check you could recover much, if not all, of your data. You could probably do this with mdb by 'simply' writing a NOP over the branch in dmu_recv_stream.

It appears that 'zfs send' was designed to generate a stream which would immediately be consumed by 'zfs recv'. A simple checksum suffices, then, to detect problems in transmission (or certain classes of bugs on the sending side), since the operation can be retried on error. If the stream will be stored in any way, however, redundancy should be included in the stream (a la the VMS Backup utility).
Jonathan Wheeler
2008-Aug-15 13:32 UTC
[zfs-discuss] corrupt zfs stream? checksum mismatch
Hi Richard,

Thanks for the detailed reply, and the work behind the scenes filing the CRs. I've bookmarked both, and will keep a keen eye on them for status changes.

As Miles put it, I'll have to put these dumps into storage for possible future use. I do dearly hope that I'll be able to recover most of that data in the future, but for the most important bits (documents/spreadsheets), I'll have to rebuild them by way of some rather intensive data entry based on hard copies, now. Not fun. I do have a working [zfs send dump!] backup from October, so it's not a total loss of my livelihood, but it'll be a life lesson alright.

With CR 6736794, I wonder if some extra notes could be added around the checksumming side of the code? The wording that has been used doesn't quite match my scenario, but I certainly agree with the functionality that has been requested there.

I have a 50GB zfs send dump, and zfs receive is failing (and rolling back) around the 20GB mark. While the exact cause and nature of my issue remains unknown, I very much expect that the vast majority of my zfs send dump is in fact intact, including data beyond that 20GB checksum error point. I.e. there is a problem around the 20GB mark, but I expect that the remaining 30GB contains "good" data, or at the very least, *mostly* good data.

The CR appears to be only requesting that zfs receive stop at the 20GB mark, but {new feature} allows the failed restore attempt to be mountable, in an unknown/known bad state. I'd much prefer that zfs receive continue on error too, thus giving it the full 50GB to process and attempt to repair, rather than only the data up until the point that it encountered its first problem.

Without knowing much about the actual on-disk format, metadata and structures I can't be sure, but the fs is going to have a much better chance at recovering when there is more data available across the entire length of the fs, right? I know from my linux days that the ext2/3 superblocks were distributed across the full disk, so the more of the disk that it can attempt to read, the better the chance that it'll find more correct metadata to use in an attempt to repair the FS.

And of course the second benefit of reading more of the data stream, past an error, is that more user data will at least have a chance of being recovered. If it stops half way, it has _no_ chance of recovering that data, so I favor my odds of letting it go on to at least try :)

Or is that an entirely new CR in itself?

Jonathan
> Hi Folks,
>
> The error that I'm getting on restore is:
> receiving full stream of faith/home@09-08-08 into Z/faith/home@09-08-08
> cannot receive: invalid stream (checksum mismatch)

Did you find a workaround? I have the same problem, except with a replication set. This is with b95, which performed the send to a file on an nfs mount. I'm also using b95 and/or b96 to receive the file. I get 14G or so read and then this error happens.

Is it possible to turn off the checksum? Just so that I can recover what data is there?
Hey Richard,

I've just seen that somebody else has been caught out by this. Do you think it would be worth adding an RFE to add 'send to file' support to zfs send?

I'll be using data piped to a file myself, and while I'm not worried about corruption myself, if zfs send knows it's sending to a file, it could check the integrity of the file once the operation completes, which would probably have helped these guys. It might also be useful to output a text file containing the checksum so the integrity of the file can be verified at a later date.

Ross
Ross wrote:
> Hey Richard,
>
> I've just seen that somebody else has been caught out by this. Do you think it would be worth adding an RFE to add 'send to file' support to zfs send?

No. Pipes are a foundation of UNIX and are much more flexible than a fixed file interface (as shown below... :-)

> I'll be using data piped to a file myself, and while I'm not worried about corruption myself, if zfs send knows it's sending to a file, it could check the integrity of the file once the operation completes, which would probably have helped these guys.

I filed CR 6736837, improve send/receive fault tolerance, which keeps the pipe structure intact. Feel free to pile on.
http://bugs.opensolaris.org/view_bug.do?bug_id=6736837

> It might also be useful to output a text file containing the checksum so the integrity of the file can be verified at a later date.

If you redirect to a file, you can use existing checksum commands. You could even check that against what flows out of the send.

zfs send | tee filename | digest -a md5 > filename.md5

-- richard
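Filling in that pattern a little: the digest captured at send time can be compared against the stored stream file whenever you want to confirm it is still intact. A sketch, with hypothetical dataset and file names:

    # capture the stream and its md5 in one pass
    zfs send faith/home@09-08-08 | tee /backup/justhome.zfs | \
        digest -a md5 > /backup/justhome.zfs.md5

    # later: recompute the digest of the stored file and compare
    digest -a md5 /backup/justhome.zfs > /tmp/check.md5
    cmp -s /backup/justhome.zfs.md5 /tmp/check.md5 \
        && echo "stream file matches the digest recorded at send time" \
        || echo "stream file has changed or is corrupt"

Note that this only proves the file still matches what zfs send emitted; it cannot detect corruption that happened inside the send itself, which is what the earlier posts in this thread suspect.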