bash-3.00# zpool status -v nfs-s5-p1
  pool: nfs-s5-p1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-p1                                ONLINE       0     0     2
          c4t600C0FF00000000009258F7A4F1BC601d0  ONLINE       0     0     2

errors: No known data errors
bash-3.00#

As you can see, there is no ZFS-level redundancy in this pool.
Does that mean those two checksum errors were in metadata and were
corrected thanks to ditto blocks?  (I assume the application received
correct data and the filesystem is fine.)

btw: I'm really surprised at how unreliable SATA disks are.  I put a
dozen TBs of data on ZFS recently, and after just a few days I got a few
hundred checksum errors (raid-z was used there).  And these are 500GB
disks in a 3511 array.  Well, that would explain some of the fscks, etc.
we saw before.


This message posted from opensolaris.org
On Fri, Jun 09, 2006 at 06:16:53AM -0700, Robert Milkowski wrote:
> bash-3.00# zpool status -v nfs-s5-p1
>   pool: nfs-s5-p1
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME                                     STATE     READ WRITE CKSUM
>         nfs-s5-p1                                ONLINE       0     0     2
>           c4t600C0FF00000000009258F7A4F1BC601d0  ONLINE       0     0     2
>
> errors: No known data errors
> bash-3.00#
>
> As you can see, there is no ZFS-level redundancy in this pool.
> Does that mean those two checksum errors were in metadata and were
> corrected thanks to ditto blocks?  (I assume the application received
> correct data and the filesystem is fine.)

Hmm, I'm not sure.  There are no persistent data errors (as shown by the
'errors:' line), so you should be fine.  If you want to send your
/var/fm/fmd/errlog, or 'fmdump -eV' output, we can take a look at the
details of the error.  If this is the case, then it's a bug that the
checksum error is reported for the pool for a recovered ditto block.

You may want to try 'zpool clear nfs-s5-p1; zpool scrub nfs-s5-p1' and
see if it turns up anything.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
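Spelled out, the suggested sequence is (a minimal sketch, using the pool
name from this thread; re-run the status check once the scrub completes):

bash-3.00# zpool clear nfs-s5-p1        # reset the per-vdev error counters
bash-3.00# zpool scrub nfs-s5-p1        # re-read and verify every block in the pool
bash-3.00# zpool status -v nfs-s5-p1    # see whether the CKSUM counts come back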
Robert Milkowski wrote:
> btw: I'm really surprised at how unreliable SATA disks are.  I put a
> dozen TBs of data on ZFS recently, and after just a few days I got a few
> hundred checksum errors (raid-z was used there).  And these are 500GB
> disks in a 3511 array.  Well, that would explain some of the fscks, etc.
> we saw before.

It is more likely due to the density than the interface.  In general,
high-density disks suffer from superparamagnetic effects more than
lower-density disks.  There are several ways to combat this, but the
consumer market values space over reliability.  And since there is no
checksumming to detect problems, consumers don't think they have
problems -- the insidious effects of cancer.
 -- richard
Richard Elling wrote:
> Robert Milkowski wrote:
>> btw: I'm really surprised at how unreliable SATA disks are.  I put a
>> dozen TBs of data on ZFS recently, and after just a few days I got a
>> few hundred checksum errors (raid-z was used there).  And these are
>> 500GB disks in a 3511 array.  Well, that would explain some of the
>> fscks, etc. we saw before.
>
> It is more likely due to the density than the interface.  In general,
> high-density disks suffer from superparamagnetic effects more than
> lower-density disks.  There are several ways to combat this, but the
> consumer market values space over reliability.

I'm not actually convinced the consumer market wants the space; it's
more that we don't have a choice, because bigger and bigger drives are
all we can buy.  Personally I have very little need for a 500G disk at
home (mainly because I don't do video, and my photos are jpg, not raw ;-)).

> And since there is no checksumming to detect problems, consumers don't
> think they have problems -- the insidious effects of cancer.

Or most of the data is stored in file formats that don't get impacted
too much by the odd bit flip here and there (eg MPEG streams).

--
Darren J Moffat
> btw: I'm really surprised at how unreliable SATA disks are.  I put a
> dozen TBs of data on ZFS recently, and after just a few days I got a few
> hundred checksum errors (raid-z was used there).  And these are 500GB
> disks in a 3511 array.  Well, that would explain some of the fscks, etc.
> we saw before.

I suspect you've got a bad disk or controller.  A normal SATA drive
just won't behave this badly.  Cool that RAID-Z survives it, though.

Jeff
Jeff Bonwick wrote:
>> btw: I'm really surprised at how unreliable SATA disks are.  I put a
>> dozen TBs of data on ZFS recently, and after just a few days I got a
>> few hundred checksum errors (raid-z was used there).  And these are
>> 500GB disks in a 3511 array.  Well, that would explain some of the
>> fscks, etc. we saw before.
>
> I suspect you've got a bad disk or controller.  A normal SATA drive
> just won't behave this badly.  Cool that RAID-Z survives it, though.

I had a power supply go bad a few months ago (cheap PC-junk power supply)
and it trashed a bunch of my SATA and IDE disks [*] (though, happily, not
the IDE disk I scavenged from a Sun V100 :-).  The symptoms were thousands
of non-recoverable reads, which were remapped until the disks ran out of
spare blocks.  Since I didn't believe this, I got a new, more expensive,
and presumably more reliable power supply.  The IDE disks fared better,
but I had to do a low-level format on the SATA drive.  All is well now,
and zfs hasn't shown any errors since.  But thunderstorm season is
approaching next month...

I am also trying to collect field data which shows such failure modes,
specifically looking for clusters of errors.  However, I can't promise
anything, and may not get much time to do an in-depth study anytime soon.

[*] my theory is that disks are about the only devices still using 12VDC
power.  Some disk vendors specify the quality of the 12VDC supply (eg.
ripple) for specific drives.  In my case, the 12VDC was the only
common-mode failure in the system which could have trashed most of the
drives in this manner.
 -- richard
Hello Eric,

Friday, June 9, 2006, 5:16:29 PM, you wrote:

ES> On Fri, Jun 09, 2006 at 06:16:53AM -0700, Robert Milkowski wrote:
>> bash-3.00# zpool status -v nfs-s5-p1
>>   pool: nfs-s5-p1
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>         attempt was made to correct the error.  Applications are
>>         unaffected.
>> action: Determine if the device needs to be replaced, and clear the
>>         errors using 'zpool clear' or replace the device with
>>         'zpool replace'.
>>    see: http://www.sun.com/msg/ZFS-8000-9P
>>  scrub: none requested
>> config:
>>
>>         NAME                                     STATE     READ WRITE CKSUM
>>         nfs-s5-p1                                ONLINE       0     0     2
>>           c4t600C0FF00000000009258F7A4F1BC601d0  ONLINE       0     0     2
>>
>> errors: No known data errors
>> bash-3.00#
>>
>> As you can see, there is no ZFS-level redundancy in this pool.
>> Does that mean those two checksum errors were in metadata and were
>> corrected thanks to ditto blocks?  (I assume the application received
>> correct data and the filesystem is fine.)

ES> Hmm, I'm not sure.  There are no persistent data errors (as shown by
ES> the 'errors:' line), so you should be fine.  If you want to send your
ES> /var/fm/fmd/errlog, or 'fmdump -eV' output, we can take a look at the
ES> details of the error.  If this is the case, then it's a bug that the
ES> checksum error is reported for the pool for a recovered ditto block.

ES> You may want to try 'zpool clear nfs-s5-p1; zpool scrub nfs-s5-p1'
ES> and see if it turns up anything.

Well, I just ran 'fmdump -eV' and the last entry is from May 31st and is
related to pools which have already been destroyed.

I can see another checksum error in that pool (I did zpool clear last
time) and it's NOT reported by fmdump.  This one occurred after May 31st.

I hope these are ditto blocks and nothing else (read: bad).

System is b39 SPARC.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
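Incidentally, fmdump can be bounded by time, which makes it easy to check
just the recent telemetry (a sketch; see fmdump(1M) for the exact date
formats it accepts):

bash-3.00# fmdump -e -t 31May06     # one-line summary of error events since May 31
bash-3.00# fmdump -eV -t 31May06    # full ereport detail for the same window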
Hello Jeff,

Saturday, June 10, 2006, 2:32:49 AM, you wrote:

>> btw: I'm really surprised at how unreliable SATA disks are.  I put a
>> dozen TBs of data on ZFS recently, and after just a few days I got a
>> few hundred checksum errors (raid-z was used there).  And these are
>> 500GB disks in a 3511 array.  Well, that would explain some of the
>> fscks, etc. we saw before.

JB> I suspect you've got a bad disk or controller.  A normal SATA drive
JB> just won't behave this badly.  Cool that RAID-Z survives it, though.

It's not that bad right now.  It was back then, when the array (3511)
reported 'Drive NOTIFY: Media Error Encountered - 163A981 (311)' several
times, and then I got all of those CKSUM errors.  Once things stabilized
(the drive finally failed and was replaced by a hot spare), I've seen no
CKSUM errors for a few days.

Looks like the drive was failing, etc.  But I'm still surprised that the
array returned bad data (raid-5 on the array).  We see such messages once
in a while on several 3511s.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
On Mon, Jun 12, 2006 at 10:49:49AM +0200, Robert Milkowski wrote:
>
> Well, I just ran 'fmdump -eV' and the last entry is from May 31st and
> is related to pools which have already been destroyed.
>
> I can see another checksum error in that pool (I did zpool clear
> last time) and it's NOT reported by fmdump.  This one occurred after
> May 31st.
>
> I hope these are ditto blocks and nothing else (read: bad).
>
> System is b39 SPARC.

Yes, that does sound like ditto blocks.  I'll poke around with Bill and
figure out why the checksum errors would be percolating up to the pool
level.  They should be reported only for the leaf device.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
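If the recovered reads had produced FMA telemetry, they would show up as
ZFS checksum ereports.  A quick way to look for those specifically (the
class name below is from the ZFS/FMA integration; a sketch, not a full
diagnosis):

bash-3.00# fmdump -e | grep ereport.fs.zfs.checksum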
I reproduced this pretty easily on a lab machine.  I've filed:

6437568 ditto block repair is incorrectly propagated to root vdev

to track this issue.  Keep in mind that you do have a flaky
controller/lun/something.  If this had been a user data block, your data
would be gone.

- Eric

On Mon, Jun 12, 2006 at 08:05:03AM -0700, Eric Schrock wrote:
> On Mon, Jun 12, 2006 at 10:49:49AM +0200, Robert Milkowski wrote:
> >
> > Well, I just ran 'fmdump -eV' and the last entry is from May 31st and
> > is related to pools which have already been destroyed.
> >
> > I can see another checksum error in that pool (I did zpool clear
> > last time) and it's NOT reported by fmdump.  This one occurred after
> > May 31st.
> >
> > I hope these are ditto blocks and nothing else (read: bad).
> >
> > System is b39 SPARC.
>
> Yes, that does sound like ditto blocks.  I'll poke around with Bill and
> figure out why the checksum errors would be percolating up to the pool
> level.  They should be reported only for the leaf device.
>
> - Eric
>
> --
> Eric Schrock, Solaris Kernel Development     http://blogs.sun.com/eschrock
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
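For anyone who wants to reproduce this class of recovery safely, a
file-backed scratch pool works well (everything here -- path, sizes,
offsets -- is made up for illustration; never do this to a pool you care
about):

bash-3.00# mkfile 256m /var/tmp/ditto-test
bash-3.00# zpool create dittopool /var/tmp/ditto-test
bash-3.00# cp -r /usr/share/man /dittopool       # seed some data and metadata
bash-3.00# dd if=/dev/urandom of=/var/tmp/ditto-test bs=512 \
           seek=100000 count=200 conv=notrunc    # clobber a region past the front labels
bash-3.00# zpool scrub dittopool
bash-3.00# zpool status -v dittopool             # CKSUM counters show what was hit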
Hello Eric,

Monday, June 12, 2006, 11:21:24 PM, you wrote:

ES> I reproduced this pretty easily on a lab machine.  I've filed:

ES> 6437568 ditto block repair is incorrectly propagated to root vdev

Good, thank you.

ES> to track this issue.  Keep in mind that you do have a flaky
ES> controller/lun/something.  If this had been a user data block, your
ES> data would be gone.

Well, probably something is wrong.  But it surprises me that every time
I get a CKSUM error in this config, it relates to metadata... well,
quite unlikely, isn't it?

btw: if it were a data block, then the app reading that block would get
a proper error and that's it -- right?

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
Hello Eric,

Monday, June 12, 2006, 11:21:24 PM, you wrote:

ES> I reproduced this pretty easily on a lab machine.  I've filed:

ES> 6437568 ditto block repair is incorrectly propagated to root vdev

ES> to track this issue.  Keep in mind that you do have a flaky
ES> controller/lun/something.  If this had been a user data block, your
ES> data would be gone.

I believe that something else is also happening here.

I can see CKSUM errors on two different servers (v240 and T2000), all on
non-redundant zpools, and every time it looks like a ditto block helped --
hey, that's just improbable.

And while on the T2000 'fmdump -eV' gives me:

Jul 05 19:59:43.8786 ereport.io.fire.pec.btp       0x14e4b8015f612002
Jul 05 20:05:28.9165 ereport.io.fire.pec.re        0x14e5f951ce12b002
Jul 05 20:05:58.5381 ereport.io.fire.pec.re        0x14e614e78f4c9002
Jul 05 20:05:58.5389 ereport.io.fire.pec.btp       0x14e614e7b6ddf002
Jul 05 23:34:11.1960 ereport.io.fire.pec.re        0x1513869a6f7a6002
Jul 05 23:34:11.1967 ereport.io.fire.pec.btp       0x1513869a95196002
Jul 06 00:09:17.1845 ereport.io.fire.pec.re        0x151b2fca4c988002
Jul 06 00:09:17.1852 ereport.io.fire.pec.btp       0x151b2fca72e6b002

on the v240, fmdump shows nothing for over a month, and I'm sure I did
zpool clear on that server later.

v240:

bash-3.00# zpool status nfs-s5-s7
  pool: nfs-s5-s7
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s7                                ONLINE       0     0   167
          c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0   167

errors: No known data errors
bash-3.00#
bash-3.00# zpool clear nfs-s5-s7
bash-3.00# zpool status nfs-s5-s7
  pool: nfs-s5-s7
 state: ONLINE
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s7                                ONLINE       0     0     0
          c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0     0

errors: No known data errors
bash-3.00#
bash-3.00# zpool scrub nfs-s5-s7
bash-3.00# zpool status nfs-s5-s7
  pool: nfs-s5-s7
 state: ONLINE
 scrub: scrub in progress, 0.01% done, 269h24m to go
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s7                                ONLINE       0     0     0
          c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0     0

errors: No known data errors
bash-3.00#

We'll see the result -- I hope I won't have to stop it in the morning.
Anyway, I have a feeling that nothing will be reported.

ps. I've got several similar pools on those two servers, and I see CKSUM
errors on all of them with the same result -- it's almost impossible.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
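To put a rough number on "almost impossible": if metadata made up, say,
1% of the blocks being read (a made-up fraction, purely for illustration),
the chance of 167 independent corruptions all landing on ditto-protected
metadata would be 0.01^167, i.e. about 1 in 10^334:

bash-3.00# echo '167 * l(0.01) / l(10)' | bc -l    # log10 of 0.01^167; prints ~ -334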
Hello Robert,

Thursday, July 6, 2006, 1:49:34 AM, you wrote:

RM> Hello Eric,

RM> Monday, June 12, 2006, 11:21:24 PM, you wrote:

ES>> I reproduced this pretty easily on a lab machine.  I've filed:

ES>> 6437568 ditto block repair is incorrectly propagated to root vdev

ES>> to track this issue.  Keep in mind that you do have a flaky
ES>> controller/lun/something.  If this had been a user data block, your
ES>> data would be gone.

RM> I believe that something else is also happening here.

RM> I can see CKSUM errors on two different servers (v240 and T2000), all
RM> on non-redundant zpools, and every time it looks like a ditto block
RM> helped -- hey, that's just improbable.

RM> And while on the T2000 'fmdump -eV' gives me:

RM> Jul 05 19:59:43.8786 ereport.io.fire.pec.btp   0x14e4b8015f612002
RM> Jul 05 20:05:28.9165 ereport.io.fire.pec.re    0x14e5f951ce12b002
RM> Jul 05 20:05:58.5381 ereport.io.fire.pec.re    0x14e614e78f4c9002
RM> Jul 05 20:05:58.5389 ereport.io.fire.pec.btp   0x14e614e7b6ddf002
RM> Jul 05 23:34:11.1960 ereport.io.fire.pec.re    0x1513869a6f7a6002
RM> Jul 05 23:34:11.1967 ereport.io.fire.pec.btp   0x1513869a95196002
RM> Jul 06 00:09:17.1845 ereport.io.fire.pec.re    0x151b2fca4c988002
RM> Jul 06 00:09:17.1852 ereport.io.fire.pec.btp   0x151b2fca72e6b002

RM> on the v240, fmdump shows nothing for over a month, and I'm sure I
RM> did zpool clear on that server later.

RM> v240:

RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM> status: One or more devices has experienced an unrecoverable error.
RM>         An attempt was made to correct the error.  Applications are
RM>         unaffected.
RM> action: Determine if the device needs to be replaced, and clear the
RM>         errors using 'zpool clear' or replace the device with
RM>         'zpool replace'.
RM>    see: http://www.sun.com/msg/ZFS-8000-9P
RM>  scrub: none requested
RM> config:

RM>         NAME                                     STATE     READ WRITE CKSUM
RM>         nfs-s5-s7                                ONLINE       0     0   167
RM>           c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0   167

RM> errors: No known data errors
RM> bash-3.00#
RM> bash-3.00# zpool clear nfs-s5-s7
RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM>  scrub: none requested
RM> config:

RM>         NAME                                     STATE     READ WRITE CKSUM
RM>         nfs-s5-s7                                ONLINE       0     0     0
RM>           c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0     0

RM> errors: No known data errors
RM> bash-3.00#
RM> bash-3.00# zpool scrub nfs-s5-s7
RM> bash-3.00# zpool status nfs-s5-s7
RM>   pool: nfs-s5-s7
RM>  state: ONLINE
RM>  scrub: scrub in progress, 0.01% done, 269h24m to go
RM> config:

RM>         NAME                                     STATE     READ WRITE CKSUM
RM>         nfs-s5-s7                                ONLINE       0     0     0
RM>           c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0     0

RM> errors: No known data errors
RM> bash-3.00#

RM> We'll see the result -- I hope I won't have to stop it in the morning.
RM> Anyway, I have a feeling that nothing will be reported.

RM> ps. I've got several similar pools on those two servers, and I see
RM> CKSUM errors on all of them with the same result -- it's almost
RM> impossible.

ok, it actually took several days for the scrub to complete.

During the scrub I saw some CKSUM errors already, and now again there
are many of them; however, the scrub itself reported no errors at all.

bash-3.00# zpool status nfs-s5-s7
  pool: nfs-s5-s7
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Sun Jul  9 02:56:19 2006
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s7                                ONLINE       0     0    18
          c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0    18

errors: No known data errors
bash-3.00#

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
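One crude way to catch when those counters tick up is to poll the status
output while the pool is under load (a throwaway sketch; the interval and
grep pattern are arbitrary):

bash-3.00# while true; do
>   date
>   zpool status nfs-s5-s7 | grep c4t600C0FF00000000009258F28706F5201d0
>   sleep 300
> done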
Robert Milkowski, 2006-Jul-17 20:44 UTC
Subject: Fwd: Re[3]: [zfs-discuss] zpool status and CKSUM errors
Hi.

Sorry for the forward, but maybe this will be more visible this way.
I really think something strange is going on here; it's virtually
impossible that I have a hardware problem and yet get CKSUM errors
(many of them) only on ditto blocks.

This is a forwarded message
From: Robert Milkowski <rmilkowski at task.gda.pl>
To: Robert Milkowski <rmilkowski at task.gda.pl>
Date: Sunday, July 9, 2006, 8:44:16 PM
Subject: [zfs-discuss] zpool status and CKSUM errors

===8<==============Original message text==============
[the July 9 message reproduced verbatim; see above]
===8<===========End of original message text===========

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com