On Sat, January 30, 2010 10:58, matthew patton wrote:
> please forgive the 'stupid' question.
Perfectly fair thing to wonder about IMHO. And if you're wondering,
trying to find out is good :-).
> Aside from having a convenient hash table of checksums to consult and upon
> detection of a collision knowing we are dealing with a duplicate, why
> checksum data when the memory bus, PCI-e/x bus, sata/sas bus, and the hard
> disk itself use Reed-Solomon (or similar) encoding to store/transmit ECC
> along with the data?
>
> Where is this "silent data corruption" supposed to occur? And is the
> probability of preventing/catching an occurrence a realistically relevant
> value?
I've encountered or "been near" (such as a co-worker encountering it) a
number of situations where a series of cascaded stages, each of which is
"supposed to" be reliable due to checksums and the like, actually let an
error through to the end. This makes me a big fan of real end-to-end
checksumming, so I immediately jumped on the ZFS capability as a great
thing.
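
To make the cascaded-stages point concrete, here is a minimal Python sketch,
not anything ZFS actually runs. Every name in it (send_over_link,
flaky_stage, pipeline) is made up for illustration, and SHA-256 is just a
convenient stand-in for whatever end-to-end checksum the filesystem keeps.
Each hop checks its own CRC and passes, yet a bit flipped while the data
sits in a buffer between hops is only noticed by the check made at the very
end.

```python
import hashlib
import zlib


def send_over_link(payload: bytes) -> bytes:
    """One hop protected by its own CRC-32, verified on arrival (hypothetical)."""
    frame_crc = zlib.crc32(payload)      # sender computes the link CRC
    # ... the transfer over the bus or cable would happen here ...
    if zlib.crc32(payload) != frame_crc:  # receiver verifies the link CRC
        raise IOError("link CRC mismatch")
    return payload


def flaky_stage(payload: bytes) -> bytes:
    """Hypothetical buggy stage: flips one bit while the data sits in its
    buffer, after one link check has passed and before the next is computed."""
    buf = bytearray(payload)
    buf[0] ^= 0x01
    return bytes(buf)


def pipeline(payload: bytes) -> bytes:
    """Two individually 'reliable' hops with an unprotected buffer between."""
    data = send_over_link(payload)   # hop 1: CRC passes
    data = flaky_stage(data)         # corruption happens between the hops
    data = send_over_link(data)      # hop 2: CRC passes on already-bad data
    return data


original = b"important block of file data"
stored_digest = hashlib.sha256(original).hexdigest()  # taken at the source

received = pipeline(original)        # no exception: every hop was satisfied

# The end-to-end comparison is the only check that sees the damage.
end_to_end_ok = hashlib.sha256(received).hexdigest() == stored_digest
print("per-hop checks passed; end-to-end check ok?", end_to_end_ok)
```

The point of the sketch is only that each per-hop check covers its own hop
and nothing in between; the digest taken at the source and compared at the
destination covers the whole path.
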
Some of those checksums you cite aren't actually adequate for the volumes
of data we move through systems IMHO, and augmenting them with the ZFS
checksum can help a lot.
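
A rough back-of-envelope version of the "volumes of data" argument, as a
Python sketch. It rests on an idealized assumption that is mine, not a
measurement: a randomly corrupted block gets past a k-bit check with
probability about 2^-k. Real CRCs do better than that on short burst errors
and the block counts below are invented for illustration, so treat the
output as showing a trend only.

```python
import math


def expected_undetected(corrupted_blocks: int, check_bits: int) -> float:
    """Expected number of corrupted blocks that pass a check_bits-wide check,
    under the idealized 2**-check_bits pass probability (an assumption)."""
    return corrupted_blocks * 2.0 ** -check_bits


def chance_of_any_miss(corrupted_blocks: int, check_bits: int) -> float:
    """Probability that at least one corrupted block slips through.

    Computes 1 - (1 - p)**n via log1p/expm1 so that very small p
    (wide checksums) does not simply round to zero.
    """
    p_pass = 2.0 ** -check_bits
    return -math.expm1(corrupted_blocks * math.log1p(-p_pass))


# Hypothetical counts of corrupted blocks seen over a system's lifetime.
for n in (1_000, 65_536, 10_000_000):
    for bits in (16, 32, 256):
        print(f"{n:>10} corrupted blocks, {bits:>3}-bit check: "
              f"expected misses {expected_undetected(n, bits):.3g}, "
              f"P(any miss) {chance_of_any_miss(n, bits):.3g}")
```

Under that assumption the miss rate for a narrow check grows in direct
proportion to how much corrupted data hits it, which is why a check that was
fine for small transfers can stop being fine at today's volumes, and why a
wider checksum layered on top helps.
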
When I first heard about ZFS, late 2005 I think, there were a lot of war
stories of people who put Solaris with ZFS up on an old "scratch" system
to play with, and were annoyed to find it reporting errors now and then --
and then remembered how there had been various weird data problems on that
system before, which they'd written off to the usual gremlins. The
conclusion people reached was that ZFS checksumming was doing a much
better job of actually detecting data corruption than the previous systems
which depended only on those things you listed.
Also, especially at the cheap end, people still run systems without ECC
memory, which blows a big hole in your argument on any such system. One
could argue that one simply never should; and indeed I've paid the money
to get ECC memory in my home ZFS fileserver. But my Windows desktops both
at home and at work, and my laptop, use regular memory.
--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info