How can I verify the checksums for a specific file? - Marcus
Marcus Sundman wrote:
> How can I verify the checksums for a specific file?

ZFS doesn't checksum files. So a file does not have a checksum to verify. Perhaps you want to keep a digest(1) of the files?

-- richard
On Sat, Oct 25, 2008 at 4:00 AM, Marcus Sundman <sundman at iki.fi> wrote:
> How can I verify the checksums for a specific file?

I have a feeling you are not asking the question about ZFS-hosted files specifically.

If you downloaded a file, enter

cksum filename

to get the "CRC check-sum". For more types of checksum, you can use

digest -a md5 filename

"digest -l" will list the types of checksum that the "digest" command knows about.

Cheers,
_hartz

--
Any sufficiently advanced technology is indistinguishable from magic. Arthur C. Clarke
My blog: http://initialprogramload.blogspot.com
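On a system without digest(1), roughly the same result can be had from a few lines of Python. This is only a sketch: the function name file_digest and the 1 MiB chunk size are my own choices, and it covers the hash-style digests (md5, sha256 and so on), not the CRC that cksum prints.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use.

    file_digest is a hypothetical helper name; algorithm is any name that
    hashlib.new() accepts on this system (for example "md5" or "sha256").
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Calling file_digest("filename") should give the same hex string as "digest -a md5 filename", since both are plain MD5 over the file contents.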
On Sat, Oct 25, 2008 at 6:59 AM, Johan Hartzenberg <jhartzen at gmail.com> wrote:
> For more types of checksum, you can use
>
> digest -a md5 filename
>
> digest -l will list types of checksum that the "digest" command knows about.

Oh, one other thing: to check the checksums of files you've downloaded to an MS Windows system, you need to download and install a "checksum checking" utility. Try twocows.com.

_hartz
For MD5 checksums I personally favour MD5Summer on Windows boxes. Just watch the formatting of the checksum file if you're checking downloads; sometimes it can be a bit picky about line breaks. I think the file where I hit this might have been the latest preview release of NetBeans 6.5 (RC1). Opening the checksum file in Notepad and putting the digest and filename on one line worked a treat.

Sorry this went a little OT / top-posted.

Mark.

On 25 Oct 2008, at 06:01, "Johan Hartzenberg" <jhartzen at gmail.com> wrote:
> Oh, one other thing: to check the checksums of files you've downloaded to an MS Windows system, you need to download and install a "checksum checking" utility. Try twocows.com.
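To make the line-break pickiness concrete, here is a small sketch of a tolerant parser for one line of a checksum file. It assumes the common md5sum-style layout (hex digest, whitespace, optional "*" marker, filename), which is an assumption on my part; the post only says the digest and filename had to be on one line. The function names are hypothetical.

```python
import hashlib


def parse_checksum_line(line):
    """Split one 'digest filename' line into (digest, filename).

    Tolerates CRLF or LF line endings, one or more separating spaces, and
    the '*' that some tools put before the filename to mean "binary mode".
    """
    line = line.rstrip("\r\n")
    digest, _, name = line.partition(" ")
    name = name.lstrip(" ")
    if name.startswith("*"):
        name = name[1:]
    if not digest or not name:
        raise ValueError(f"malformed checksum line: {line!r}")
    return digest.lower(), name


def verify(line, data):
    """Return (filename, True/False) for whether data matches the listed MD5."""
    digest, name = parse_checksum_line(line)
    return name, hashlib.md5(data).hexdigest() == digest
```

A line where the digest and the filename ended up on separate lines fails with ValueError here, which is the same situation that needed the manual fix in Notepad.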
Richard Elling <Richard.Elling at Sun.COM> wrote:
> Marcus Sundman wrote:
> > How can I verify the checksums for a specific file?
>
> ZFS doesn't checksum files.

AFAIK ZFS checksums all data, including the contents of files.

> So a file does not have a checksum to verify.

I wrote "checksums" (plural) for a "file" (singular).

- Marcus
On Sat, Oct 25, 2008 at 6:49 PM, Marcus Sundman <sundman at iki.fi> wrote:
> AFAIK ZFS checksums all data, including the contents of files.
>
> > So a file does not have a checksum to verify.
>
> I wrote "checksums" (plural) for a "file" (singular).

Ah, then you DO mean the ZFS built-in data checksumming. My mistake.

ZFS checksums allocations (blocks), not files. The checksum for each block is stored in the parent of that block. These checksums are not shown to you, but you can "scrub" the pool, which makes ZFS run through all the allocations and check whether the checksums are valid.

This PDF document is quite old but explains it fairly well:
http://www.google.co.za/url?sa=t&source=web&ct=res&cd=1&url=http%3A%2F%2Fru.sun.com%2Ftechdays%2Fpresents%2FSolaris%2Fhow_zfs_works.pdf&ei=f3EDSbnjB5iQQbG2wIIC&usg=AFQjCNG8qtO3bFgmD11izooR7SVbiSOI2A&sig2=-EHfv5Puqz8dxkANISionQ

What is not expressly stated there is that the ZFS allocation structure stores the POSIX layer and file data in the leaf nodes of the tree.

Cheers,
_hartz
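As a toy illustration of "the checksum for each block is stored in the parent of that block", the sketch below keeps leaf blocks in a dict and lets the parent hold one pointer (address plus checksum) per leaf. It is not ZFS's on-disk format: the names, the dict used as a "disk" and the choice of SHA-256 are all made up for the example, and a real tree would also checksum the parent block in its own parent, up to the root.

```python
import hashlib


def checksum(data):
    """Checksum of one block (SHA-256 here; the algorithm is an arbitrary choice)."""
    return hashlib.sha256(data).digest()


class BlockPointer:
    """What a parent stores about a child block: its address and its checksum."""

    def __init__(self, address, child_checksum):
        self.address = address
        self.checksum = child_checksum


def write_file(store, chunks):
    """Write each chunk as a leaf block; return the parent's list of pointers."""
    pointers = []
    for chunk in chunks:
        address = len(store)  # next free slot in the toy "disk"
        store[address] = chunk
        pointers.append(BlockPointer(address, checksum(chunk)))
    return pointers


def read_file(store, pointers):
    """Read leaf blocks through their pointers, verifying each one on the way."""
    out = []
    for p in pointers:
        data = store[p.address]
        if checksum(data) != p.checksum:
            raise OSError(f"checksum mismatch in block {p.address}")
        out.append(data)
    return b"".join(out)
```

Because the checksum lives in the parent rather than next to the data, a block that is silently overwritten or misdirected cannot also "fix up" its own checksum; the mismatch shows up the next time the block is read through its pointer.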
"Johan Hartzenberg" <jhartzen at gmail.com> wrote:
> ZFS checksums allocations (blocks), not files. The checksum for each block is stored in the parent of that block. These are not shown to you but you can "scrub" the pool, which will see zfs run through all the allocations, checking whether the checksums are valid.

I don't want to scrub several TiB of data just to verify a 2 MiB file. I want to verify just the data of that file. (Well, I don't mind also verifying whatever other data happens to be in the same blocks.)

> This PDF document is quite old but explains it fairly well:

I couldn't see anything there describing either how to verify the checksums of individual files or why that would be impossible.

OK, since there seems to be some confusion about what I mean, maybe I should describe the actual problems I'm trying to solve:

1) When I notice an error in a file that I've copied from a ZFS disk, I want to know whether that error is also in the original file on my ZFS disk or only in the copy.

2) Before I destroy an old backup copy of a file, I want to know that the other copy, which is on a ZFS disk, is still OK (at least at that very moment).

Naturally I could calculate new checksums for all files in question and compare the checksums, but for reasons I won't go into now this is not as feasible as it might seem, and it is obviously less efficient.
Up to now I've been storing md5sums for all files, but keeping the files and their md5sums synchronized is a burden I could do without.

Cheers,
Marcus
On Sat, Oct 25, 2008 at 1:57 PM, Marcus Sundman <sundman at iki.fi> wrote:
> I don't want to scrub several TiB of data just to verify a 2 MiB file. I want to verify just the data of that file. (Well, I don't mind also verifying whatever other data happens to be in the same blocks.)

Just read the file. If the checksum is valid, then it'll read without problems. If it's invalid, then it'll be rebuilt (if you have redundancy in your pool) or you'll get I/O errors (if you don't).

Scott
Marcus Sundman wrote:
> I couldn't see anything there describing either how to verify the checksums of individual files or why that would be impossible.

If you can read the file, the checksum is OK. If it were not, you would get an I/O error attempting to read it.

-- Ian.
"Scott Laird" <scott at sigkill.org> wrote:
> Just read the file. If the checksum is valid, then it'll read without problems. If it's invalid, then it'll be rebuilt (if you have redundancy in your pool) or you'll get I/O errors (if you don't).

So what you're trying to say is "cat the file to /dev/null and check for I/O errors", right? And how do I check for I/O errors? Should I run "zpool status -v" and see if the file in question is listed there?

Cheers,
Marcus
Ian Collins <ian at ianshome.com> wrote:
> If you can read the file, the checksum is OK. If it were not, you would get an I/O error attempting to read it.

Are these I/O errors written to stdout or stderr or where?

Regards,
Marcus
Marcus Sundman wrote:
> Ian Collins <ian at ianshome.com> wrote:
> > If you can read the file, the checksum is OK. If it were not, you would get an I/O error attempting to read it.
>
> Are these I/O errors written to stdout or stderr or where?

Yes, stderr. You will not be able to open the file.

One of the great benefits of ZFS is that you don't have to manually verify checksums of files on disk. Unless you want to make sure they haven't been maliciously altered, that is.

-- Ian.
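For a script, it may be easier to check for the error programmatically than to watch stderr. The sketch below reads a file end to end with plain unbuffered read() calls and lets any failure surface as an OSError; the assumption (not verified here) is that a block which fails its checksum and cannot be repaired shows up as errno EIO on the read. The names read_check and main are hypothetical. Note that a clean read only says the data returned was good, and that data may come from cache rather than from disk.

```python
import errno
import sys


def read_check(path, chunk_size=1 << 20):
    """Read every byte of path with plain read() calls; return the byte count.

    Any read failure reported by the kernel propagates as OSError.
    """
    total = 0
    # buffering=0 gives a raw file object, so each f.read() is one read(2).
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)


def main(paths):
    """Check each path; report failures on stderr and return an exit status."""
    status = 0
    for path in paths:
        try:
            print(f"{path}: read {read_check(path)} bytes without error")
        except OSError as e:
            kind = "I/O error" if e.errno == errno.EIO else "error"
            print(f"{path}: {kind}: {e}", file=sys.stderr)
            status = 1
    return status
```

Calling main(sys.argv[1:]) from a small wrapper script gives a non-zero status when any file could not be read in full, which is easy to test from a shell.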
Ian Collins <ian at ianshome.com> wrote:
> Marcus Sundman wrote:
> > Are these I/O errors written to stdout or stderr or where?
>
> Yes, stderr.

OK, good, thanks.

> You will not be able to open the file.

What?! Even if there are errors, I still want to be able to read the file to salvage what can be salvaged. E.g., if one byte in a picture file is wrong, then it's quite likely I can still use the picture. If ZFS denies access to the whole file, or even to the whole block with the error, then the whole file is ruined. That's very bad. Are you sure there is no way to read the file anyway?

> One of the great benefits of ZFS is that you don't have to manually verify checksums of files on disk. Unless you want to make sure they haven't been maliciously altered, that is.

Malicious alteration is not the only way for unwanted changes to get onto a disk.

Cheers,
Marcus
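The usual way to salvage a file with unreadable regions is to copy it block by block and fill in whatever cannot be read, much as "dd conv=noerror,sync" does. The sketch below does that in Python. It assumes the file can be opened and that only reads of the damaged blocks fail, which this thread has not confirmed for ZFS; the function name salvage and the 128 KiB default (the default ZFS recordsize) are my choices.

```python
import os


def salvage(src, dst, block_size=128 * 1024):
    """Copy src to dst block by block, zero-filling blocks that fail to read.

    Returns the list of byte offsets whose blocks could not be read.
    Short reads are padded with zeros as well, so dst keeps src's length.
    """
    bad = []
    fd = os.open(src, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        with open(dst, "wb") as out:
            for offset in range(0, size, block_size):
                want = min(block_size, size - offset)
                try:
                    data = os.pread(fd, want, offset)
                except OSError:
                    data = b""
                    bad.append(offset)
                out.write(data.ljust(want, b"\0"))
    finally:
        os.close(fd)
    return bad
```

The granularity matters: if the filesystem refuses a whole record when any part of it is bad, nothing smaller than that record can be recovered this way, so a smaller block_size would not help.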
> 1) When I notice an error in a file that I've copied from a ZFS disk I want to know whether that error is also in the original file on my ZFS disk or if it's only in the copy.

This was already addressed, but let me do so slightly differently:

One of the major points of ZFS checksumming is that, in the absence of software bugs or hardware memory corruption issues when the file is read on the host, successfully reading a file is supposed to mean that you got the correct version of the file (either from physical disk or from cache, having previously been read from physical disk).

A scrub is still required if you want to make sure the file is okay *ON DISK*, unless you can satisfy yourself that no relevant data is cached somewhere (or unless someone can inform me of a way to nuke a particular file and related resources from cache).

> Up to now I've been storing md5sums for all files, but keeping the files and their md5sums synchronized is a burden I could do without.

FWIW, I wanted to mention here that if you care a lot about this, I'd recommend something like par2[1] instead. It uses forward error correction[2], allowing you not only to detect corruption but also to correct it. You can choose your desired level of redundancy, expressed as a percentage of the file size.

[1] http://en.wikipedia.org/wiki/Parchive
[2] http://en.wikipedia.org/wiki/Forward_error_correction

--
/ Peter Schuller
A slight nit.

Using cat(1) to read the file to /dev/null will not actually cause the data to be read, thanks to the magic that is mmap(). If you use dd(1) to read the file, then yes: you will either get the data, and thus know its blocks match their checksums, or dd will give you an error if you have no redundancy.

--chris
--
This message posted from opensolaris.org
A nit on the nit...

cat does not use mmap for files <= 32K in size. For those files it's a simple read() into a buffer and a write() to send it out.

Jim
---
Chris Gerhard wrote:
> A slight nit.
>
> Using cat(1) to read the file to /dev/null will not actually cause the data to be read thanks to the magic that is mmap(). If you use dd(1) to read the file then yes you will either get the data and thus know its blocks match their checksums or dd will give you an error if you have no redundancy.
>
> --chris