Donald Murray, P.Eng.
2010-Jun-27 21:13 UTC
[zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors
Hi,

I awoke this morning to a panic'd opensolaris zfs box. I rebooted it and
confirmed it would panic each time it tried to import the 'tank' pool. Once I
disconnected half of one of the mirrored disks, the box booted cleanly and the
pool imported without a panic.

Because this box has a hot spare, it began resilvering automatically. This is
the first time I've resilvered to a hot spare, so I'm not sure whether the
output below [1] is normal.

In particular, I think it's odd that the spare has an equal number of read and
cksum errors. Is this normal? Is my spare a piece of junk, just like the disk
it replaced?

[1]
root@weyl:~# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress for 3h42m, 97.34% done, 0h6m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       DEGRADED     0     0     0
          mirror                   DEGRADED     0     0     0
            spare                  DEGRADED 1.36M     0     0
              9828443264686839751  UNAVAIL      0     0     0  was /dev/dsk/c6t1d0s0
              c7t1d0               DEGRADED     0     0 1.36M  too many errors
            c9t0d0                 ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c7t0d0                 ONLINE       0     0     0
            c5t1d0                 ONLINE       0     0     0
        spares
          c7t1d0                   INUSE     currently in use

errors: No known data errors
root@weyl:~#
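A quick way to tell whether the spare drive itself is reporting hardware
trouble, independently of the ZFS error counters, is to ask the drive and the
fault manager directly. A minimal sketch, reusing the spare's device name from
the output above (the exact output varies by Solaris release):

# iostat -En c7t1d0      # per-device soft/hard/transport error counters
# fmdump -eV             # raw error telemetry logged by the fault manager
# fmadm faulty           # resources the fault manager has diagnosed as faulty

If iostat and fmdump show no hard or transport errors for c7t1d0, the counts
in zpool status are more likely an accounting artifact than a failing spare.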
Cindy Swearingen
2010-Jun-28 20:55 UTC
[zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors
Hi Donald,

I think this is just a reporting error in the zpool status output, depending
on which Solaris release you're running.

Thanks,

Cindy

On 06/27/10 15:13, Donald Murray, P.Eng. wrote:
> In particular, I think it's odd that the spare has an equal number of
> read and cksum errors. Is this normal? Is my spare a piece of junk,
> just like the disk it replaced?
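If the counts really are just stale accounting left over from the resilver,
one way to check (a sketch, not something suggested in the thread) is to clear
the spare's error counters and scrub again; counters that stay at zero
afterwards would point to a reporting artifact rather than a bad spare:

# zpool clear tank c7t1d0    # reset the error counters on the spare
# zpool scrub tank           # re-read and verify every block in the pool
# zpool status -v tank       # see whether the READ/CKSUM counts come back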
Donald Murray, P.Eng.
2010-Jun-29 03:51 UTC
[zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors
Thanks Cindy. I'm running 111b at the moment. I ran a scrub last night, and it
still reports the same status.

root@weyl:~# uname -a
SunOS weyl 5.11 snv_111b i86pc i386 i86pc Solaris
root@weyl:~# zpool status -x
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 2h40m with 0 errors on Mon Jun 28 01:23:12 2010
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       DEGRADED     0     0     0
          mirror                   DEGRADED     0     0     0
            spare                  DEGRADED 1.37M     0     0
              9828443264686839751  UNAVAIL      0     0     0  was /dev/dsk/c6t1d0s0
              c7t1d0               DEGRADED     0     0 1.37M  too many errors
            c9t0d0                 ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c7t0d0                 ONLINE       0     0     0
            c5t1d0                 ONLINE       0     0     0
        spares
          c7t1d0                   INUSE     currently in use

errors: No known data errors
root@weyl:~#

On Mon, Jun 28, 2010 at 14:55, Cindy Swearingen
<cindy.swearingen@oracle.com> wrote:
> I think this is just a reporting error in the zpool status output,
> depending on which Solaris release you're running.
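Since the reporting behaviour may depend on the software version, it can also
be worth noting the pool version alongside the OS build, because the two can
differ on an upgraded system. A small sketch, not from the thread:

# cat /etc/release        # exact OS build string
# zpool get version tank  # on-disk pool version
# zpool upgrade           # lists any pools running an older pool version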
Cindy Swearingen
2010-Jun-29 14:45 UTC
[zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors
Okay, at this point I would suspect that the spare is having a problem too, or
that there is some other hardware problem.

Do you have another disk that you can try as a replacement for c6t1d0? If so,
you can detach the spare like this:

# zpool detach tank c7t1d0

Then, physically replace c6t1d0. You can review the full instructions here:

http://docs.sun.com/app/docs/doc/817-2271/gazgd?l=en&a=view

Another option, if you have another unused disk already connected to the
system, is to use it to replace c6t1d0 after detaching the spare. For example,
if c4t1d0 were available:

# zpool replace tank c6t1d0 c4t1d0

Thanks,

Cindy

On 06/28/10 21:51, Donald Murray, P.Eng. wrote:
> Thanks Cindy. I'm running 111b at the moment. I ran a scrub last
> night, and it still reports the same status.
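Putting Cindy's steps together, the whole replacement would look roughly like
this (a sketch following the order described above; the mirror is not fully
redundant again until the resilver onto the new disk completes):

# zpool detach tank c7t1d0     # return the in-use spare to the spares list
  (physically swap the disk that was c6t1d0)
# zpool replace tank c6t1d0    # resilver onto the new disk in the same slot
# zpool status tank            # watch until the resilver finishes

If zpool will not accept the old c6t1d0 name, the numeric GUID shown in the
status output identifies the same missing device. With a second unused disk
such as the c4t1d0 example above, the physical swap can be skipped and
"zpool replace tank c6t1d0 c4t1d0" run instead.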