Hi all,

A quick question about the checksum error detection routines in ZFS. Obviously ZFS can detect and handle checksum errors in a redundant configuration, but what about a non-redundant one? We connected a single RAID5 array to a V440 acting as an NFS server, and while doing backups and the like we see the "zpool status -v" checksum error counters increment once in a while. Nevertheless, the command keeps telling us that applications are not affected. How can ZFS detect and correct those? I assume it doesn't verify after write, as that would kill performance. Is it caused by read errors that vanish after retrying the read? Would someone please explain how the mechanism works in that case?

Of course, in the meantime we have attached another box in a mirror configuration ;)

Thanks in advance,
Thomas

-----------------------------------------------------------------
GPG fingerprint: B1 EE D2 39 2C 82 26 DA A5 4D E0 50 35 75 9E ED
It's possible (if unlikely) that you are only getting checksum errors on metadata. Since ZFS always internally mirrors its metadata, even on non-redundant pools, it can recover from metadata corruption which does not affect all copies. (If there is only one LUN, the mirroring happens at different locations on the same LUN.) In the event of a data checksum error on a non-redundant pool, the application would see an I/O error. If the reported recovered errors are common, I'd suspect some sort of software-induced metadata corruption; you should be moving much more data than metadata in a typical system.
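[The data-vs-metadata distinction described above is easy to see on a throwaway, file-backed pool. The following is only a rough sketch, not a definitive procedure: the pool name "testpool", the backing file path, the file sizes, and the dd offset are all made up for illustration. Damage that lands on file data shows up as a permanent error on that file and an I/O error to the reader; damage that lands on metadata is repaired from the second (ditto) copy and only bumps the CKSUM counter.]

    mkfile 256m /var/tmp/vdev0
    zpool create testpool /var/tmp/vdev0
    cp /usr/dict/words /testpool/words
    zpool export testpool

    # Scribble over a region well past the front labels of the backing file.
    dd if=/dev/urandom of=/var/tmp/vdev0 bs=512 count=64 seek=131072 conv=notrunc

    zpool import -d /var/tmp testpool
    zpool scrub testpool
    zpool status -v testpool   # CKSUM counts rise; damaged file data is listed
                               # as a permanent error, damaged metadata is
                               # healed from the ditto copy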
On Fri, 16 Mar 2007, Anton B. Rang wrote:

> It's possible (if unlikely) that you are only getting checksum errors on
> metadata. Since ZFS always internally mirrors its metadata, even on
> non-redundant pools, it can recover from metadata corruption which does not
> affect all copies. (If there is only one LUN, the mirroring happens at
> different locations on the same LUN.)

I thought about that, but looking at the NFS server the real data should be much, much larger than the metadata, so I would consider it unlikely. Also, in the now-redundant setup we see checksum errors on both attached RAIDs.

Any hints on how to track the problem down to the HBA, cables, RAID and so on? We see similar things on all our machines, with few exceptions. Talking to local Sun folks, we have been "warned" before that checksum errors will show up and that this is considered normal. Nevertheless, I really want to know what they are about.

Thomas

-----------------------------------------------------------------
GPG fingerprint: B1 EE D2 39 2C 82 26 DA A5 4D E0 50 35 75 9E ED
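[For tracking this down below ZFS, the usual Solaris telemetry is a reasonable starting point. A rough checklist only, using standard Solaris 10 tools; the exact ereport classes you will see depend on the HBA and driver in use.]

    # ZFS posts an FMA ereport for every checksum failure; the verbose dump
    # shows which pool and vdev each one was seen on.
    fmdump -e        # one-line summary of the error telemetry
    fmdump -eV       # full reports; look for ereport.fs.zfs.checksum entries

    # Transport-level trouble (cables, HBA, array firmware) usually also shows
    # up as soft/hard/transport errors in the per-device counters.
    iostat -En

    # Any faults already diagnosed by FMA?
    fmadm faulty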
Hello Thomas,

Saturday, March 17, 2007, 11:46:14 AM, you wrote:

TN> Any hints on how to track the problem down to the HBA, cables, RAID and so
TN> on? We see similar things on all our machines, with few exceptions. Talking
TN> to local Sun folks, we have been "warned" before that checksum errors will
TN> show up and that this is considered normal. Nevertheless, I really want to
TN> know what they are about.

I have had a CR open for months now about the same problem - lots of CKSUM errors that all seem to be metadata-related only, which is highly unlikely.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Hello Robert,

Saturday, March 17, 2007, 6:49:05 PM, you wrote:

RM> I have had a CR open for months now about the same problem - lots of
RM> CKSUM errors that all seem to be metadata-related only, which is highly
RM> unlikely.

We've reinstalled the servers with U3 and SC 3.2, and for the last few days there has not been a single CKSUM error (the same pools were imported) - so maybe something was wrong with U2.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Hello Robert,

Wednesday, March 21, 2007, 10:36:15 AM, you wrote:

RM> We've reinstalled the servers with U3 and SC 3.2, and for the last few days
RM> there has not been a single CKSUM error (the same pools were imported) -
RM> so maybe something was wrong with U2.

One of those servers has again reported some CKSUM errors, and in the same way, so it looks like only metadata was involved. The problem is still there, then, but to a much lesser extent on U3.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Hello Robert,

Thursday, March 29, 2007, 12:37:28 AM, you wrote:

RM> One of those servers has again reported some CKSUM errors, and in the same
RM> way, so it looks like only metadata was involved. The problem is still
RM> there, then, but to a much lesser extent on U3.

bash-3.00# uname -a
SunOS XXXXX 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V240
bash-3.00#

[...]

  pool: nfs-s5-s6
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s6                                ONLINE       0     0     7
          c4t600C0FF00000000009258F4855B59001d0  ONLINE       0     0     7

errors: No known data errors

  pool: nfs-s5-s7
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s7                                ONLINE       0     0     6
          c4t600C0FF00000000009258F28706F5201d0  ONLINE       0     0     6

errors: No known data errors

  pool: nfs-s5-s8
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        nfs-s5-s8                                ONLINE       0     0    10
          c4t600C0FF00000000009258F3E4C4C5601d0  ONLINE       0     0    10

errors: No known data errors
bash-3.00#

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Isn't it more likely that these are errors on data as well? I think ZFS retries read operations when there's a checksum failure, so maybe these are transient hardware problems (faulty cables, high temperature...)? This would explain the non-existence of unrecoverable errors.

Robert Milkowski wrote:
> One of those servers has again reported some CKSUM errors, and in the same
> way, so it looks like only metadata was involved. The problem is still
> there, then, but to a much lesser extent on U3.
>
> [...]
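[One way to probe the transient-hardware hypothesis is to clear the counters and force a full re-read of everything on the pool: if a scrub comes back clean, the earlier CKSUM hits were most likely introduced somewhere in the I/O path rather than sitting persistently on the media. A rough sketch only, reusing "nfs-s5-s6" from the earlier output as the example pool name.]

    zpool clear nfs-s5-s6        # reset the READ/WRITE/CKSUM counters
    zpool scrub nfs-s5-s6        # re-read and verify every allocated block
    zpool status -v nfs-s5-s6    # after the scrub completes, check whether the
                                 # CKSUM column stayed at zero
    fmdump -e                    # and whether any new checksum ereports arrived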
Hello Ricardo,

Friday, April 6, 2007, 5:33:14 AM, you wrote:

RC> Isn't it more likely that these are errors on data as well? I think ZFS
RC> retries read operations when there's a checksum failure, so maybe these
RC> are transient hardware problems (faulty cables, high temperature...)?
RC> This would explain the non-existence of unrecoverable errors.

Wouldn't ZFS then retry for metadata as well?

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com