Hi guys!

I'm doing a series of tests on ZFS before putting it into production on
several machines, and I've come to a dead end. I have two disks in a
mirror (rpool). Intentionally, I corrupt data on the second disk:

# dd if=/dev/urandom of=/dev/rdsk/c0d1s0 bs=512 count=20480 seek=10240

So I've written 10 MB of random data after the first 5 MB of the hard
drive. After a sync and reboot, ZFS noticed the corruption, and then I
ran zpool scrub rpool. After that, I've got this state:

unknown# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub in progress for 0h0m, 5.64% done, 0h5m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c0d1s0  DEGRADED     0     0    26  too many errors
            c0d0s0  ONLINE       0     0     0

errors: No known data errors

So I wonder now, how do I fix this up? Why doesn't scrub overwrite the
bad data with good data from the first disk? If I run zpool clear, it
will only clear the error reports; it won't fix them -- I presume,
because I don't understand the man page for that section clearly. So,
how can I fix this disk without the detach/attach procedure?
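For reference, the arithmetic: seek=10240 and count=20480 512-byte
blocks make this a 10 MiB write starting 5 MiB into the device. Before
the scrub, ZFS had already noticed the corruption; a quick way to see
the details, assuming the same pool name, is:

# zpool status -v rpool    # -v additionally lists any files with known errors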
Jakov Sosic wrote:
> I'm doing a series of tests on ZFS before putting it into production
> on several machines, and I've come to a dead end. I have two disks in
> a mirror (rpool). Intentionally, I corrupt data on the second disk:
> [...]
> So I wonder now, how do I fix this up? Why doesn't scrub overwrite
> the bad data with good data from the first disk?

ZFS doesn't know why the errors occurred; the most likely scenario
would be a bad disk -- in which case you'd need to replace it.

> If I run zpool clear, it will only clear the error reports; it won't
> fix them -- I presume, because I don't understand the man page for
> that section clearly.

The admin guide is great to follow for these tests:
http://docs.sun.com/app/docs/doc/819-5461

> So, how can I fix this disk without the detach/attach procedure?

You shouldn't need to attach/detach anything. I think you're looking
for 'zpool replace'.

zpool replace tank c0d1s0

-Bryant
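For the record, zpool replace with a single device argument rebuilds
onto that same device; with two arguments it moves the data onto a new
disk. Something like the following -- the second disk name below is
just a placeholder:

# zpool replace rpool c0d1s0           # rebuild in place on the same device
# zpool replace rpool c0d1s0 c1d0s0    # or replace with a new disk (name hypothetical)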
>> So I wonder now, how do I fix this up? Why doesn't scrub overwrite
>> the bad data with good data from the first disk?
>
> ZFS doesn't know why the errors occurred; the most likely scenario
> would be a bad disk -- in which case you'd need to replace it.

I know and understand that... But what, then, is the limit for
self-healing? 2 errors per vdev? 3 errors? 10 errors? before ZFS
decides that the vdev is irreparable...

> You shouldn't need to attach/detach anything.
> I think you're looking for 'zpool replace'.
> zpool replace tank c0d1s0

Yes, but that will do a complete resilver, and I just want to fix the
corrupted blocks... :)
Looks like your scrub was not finished yet. Did you check it later? You
should not have had to replace the disk. You might have to reinstall
the bootblock.
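On x86 that would be something like the following, assuming the stock
GRUB stage file locations (SPARC uses installboot instead):

# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0d1s0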
Jakov Sosic wrote:
> [...]
> unknown# zpool status
>   pool: rpool
>  state: DEGRADED
> [...]
> errors: No known data errors
>
> So I wonder now, how do I fix this up? Why doesn't scrub overwrite
> the bad data with good data from the first disk?

The data is already fixed, which is why it says "errors: No known data
errors".

> If I run zpool clear, it will only clear the error reports; it won't
> fix them -- I presume, because I don't understand the man page for
> that section clearly.
>
> So, how can I fix this disk without the detach/attach procedure?

Be happy, the data is already fixed. The "DEGRADED" state is used when
too many errors were found in a short period of time, which one would
use as an indicator of a failing device. However, since the device is
not actually failed, it is of no practical use in your test case.
 -- richard
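Once the scrub completes, clearing the counters is all that is left.
Using the pool and device names from your post, something like:

# zpool clear rpool c0d1s0    # reset the error counters on the affected device
# zpool status rpool          # the mirror should report ONLINE again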
On 26-Jan-09, at 6:21 PM, Jakov Sosic wrote:
>>> So I wonder now, how do I fix this up? Why doesn't scrub overwrite
>>> the bad data with good data from the first disk?
>>
>> ZFS doesn't know why the errors occurred; the most likely scenario
>> would be a bad disk -- in which case you'd need to replace it.
>
> I know and understand that... But what, then, is the limit for
> self-healing? 2 errors per vdev? 3 errors? 10 errors? before ZFS
> decides that the vdev is irreparable...
>
>> You shouldn't need to attach/detach anything.
>> I think you're looking for 'zpool replace'.
>> zpool replace tank c0d1s0
>
> Yes, but that will do a complete resilver, and I just want to fix the
> corrupted blocks... :)

What you are asking for is impossible, since ZFS cannot know which
blocks are corrupted without actually checking them all (like a scrub).
A resilver involves knowing that some set of blocks is out of date, but
ZFS need not verify the rest.

--Toby
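Put differently: a scrub is requested explicitly and verifies
everything, while a resilver starts on its own when a device is
attached or replaced, and copies only what the new side needs. For
example (the second device name here is hypothetical):

# zpool scrub rpool                   # reads and checksums every allocated block
# zpool attach rpool c0d0s0 c0d2s0    # adds a mirror side; resilver copies only live data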
>>>>> "js" == Jakov Sosic <jsosic at gmail.com> writes:
>>>>> "tt" == Toby Thain <toby at telegraphics.com.au> writes:

    js> Yes, but that will do a complete resilver, and I just want to
    js> fix the corrupted blocks... :)

    tt> What you are asking for is impossible, since ZFS cannot know
    tt> which blocks are corrupted without actually checking them

Yeah, of course you have to read every (occupied) block, but he's still
not asking for something completely nonsensical. What if the good drive
has a latent sector error in one of the blocks that hasn't been
scribbled over on the bad drive? scrub could heal the error if not for
the "too many errors" fault, while 'zpool replace' could not heal it.
On 26-Jan-09, at 8:15 PM, Miles Nordin wrote:
> Yeah, of course you have to read every (occupied) block, but he's
> still not asking for something completely nonsensical. What if the
> good drive has a latent sector error in one of the blocks that hasn't
> been scribbled over on the bad drive? scrub could heal the error if
> not for the "too many errors" fault, while 'zpool replace' could not
> heal it.

Yes, he's asking for a "scrub". I was just pointing out that resilver
is something else entirely :)

--Toby
Richard Elling wrote:
> Be happy, the data is already fixed. The "DEGRADED" state is used
> when too many errors were found in a short period of time, which one
> would use as an indicator of a failing device. However, since the
> device is not actually failed, it is of no practical use in your test
> case.

Well yes, now I do get it! The thing is, after scrubbing, I rebooted
the machine and dd-ed /dev/urandom onto the other device in the mirror
pool. After that, I rebooted again and checked the md5sum of a file
that occupies the entire drive. The md5 is the same as at the beginning
of the test, so that's it. Problem solved: it seems that scrub really
does fix the problem, but after it does I have to "zpool clear" so that
the errors go away :) The pool stays in a DEGRADED state although it's
fully corrected...
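For anyone repeating this test, the whole cycle condenses to something
like this -- /testfile stands in for whatever checksummed file you use:

# md5sum /testfile    # baseline checksum (path hypothetical)
# dd if=/dev/urandom of=/dev/rdsk/c0d1s0 bs=512 count=20480 seek=10240
# zpool scrub rpool   # repairs damaged copies from the intact mirror side
# zpool status rpool  # wait until the scrub reports it has completed
# zpool clear rpool   # reset the error counters; pool returns to ONLINE
# md5sum /testfile    # should match the baseline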