Richard Broberg
2006-May-05 22:28 UTC
[zfs-discuss] disk failure on raidz pool defies fixing
I have a raidz pool which looks like this after a disk failure:

# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed with 0 errors on Fri May 5 18:14:29 2006
config:

        NAME         STATE     READ WRITE CKSUM
        tank         DEGRADED     0     0     0
          raidz      DEGRADED     0     0     0
            c1t0d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c1t5d0   UNAVAIL      0     0     0  corrupted data
            c2t8d0   ONLINE       0     0     0
            c2t9d0   ONLINE       0     0     0
            c2t10d0  ONLINE       0     0     0
            c2t11d0  ONLINE       0     0     0
            c2t12d0  ONLINE       0     0     0
            c2t13d0  ONLINE       0     0     0

errors: No known data errors
#

-----

I have physically replaced the failed disk with a new one, but I'm having
problems using 'zpool replace':

# zpool replace tank c1t5d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c1t5d0s0 is part of active ZFS pool tank. Please see zpool(1M).
/dev/dsk/c1t5d0s2 is part of active ZFS pool tank. Please see zpool(1M).
#

so I follow the advice, and use '-f':

# zpool replace -f tank c1t5d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c1t5d0s0 is part of active ZFS pool tank. Please see zpool(1M).
/dev/dsk/c1t5d0s2 is part of active ZFS pool tank. Please see zpool(1M).
#

---

What now?

This message posted from opensolaris.org
Hmmm, this looks like a bug to me. The single-argument form of
'zpool replace' should do the trick. What has happened is that there is
enough information on the disk to identify it as belonging to 'tank', yet
not enough good data for it to be opened.

Incidentally, could you send me the contents of /var/fm/fmd/errlog and
/var/fm/fmd/fltlog, as well as /var/adm/messages? I'm always trying to
collect details of this failure mode.

The 'zpool replace' code should probably allow you to replace a disk with
itself provided the original isn't still online. As a workaround, you
should be able to dd(1) over the first and last megabyte of the disk. This
will prevent zpool(1M) from recognizing it as the same disk in the pool,
and should allow you to replace it.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
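A minimal sketch of the dd workaround described above, written for a POSIX-style sh. The helper name wipe_ends, the 512-byte sector size, and passing the target size as a sector count are illustrative assumptions, not something given in the thread. The sector count of the real slice has to be looked up separately (for example from the partition table); the sketch does not detect it. Since this overwrites data, it is worth trying on a scratch file first.

```shell
#!/bin/sh
# wipe_ends TARGET SECTORS  (hypothetical helper, not part of ZFS or Solaris)
# Zero the first and last megabyte of TARGET so that stale pool labels at
# either end are overwritten. SECTORS is the total size of TARGET in
# 512-byte sectors (an assumption) and must be supplied by the caller.
wipe_ends() {
    target=$1
    sectors=$2
    mb_sectors=2048   # 2048 * 512 bytes = 1 MiB

    # Refuse sizes too small to have distinct first and last megabytes.
    [ "$sectors" -ge 4096 ] 2>/dev/null || return 1

    # First megabyte. conv=notrunc keeps a regular file from being truncated.
    dd if=/dev/zero of="$target" bs=512 count="$mb_sectors" \
        conv=notrunc 2>/dev/null || return 1

    # Last megabyte: seek counts in units of bs, so start 2048 sectors
    # before the end of the target.
    start=`expr "$sectors" - "$mb_sectors"`
    dd if=/dev/zero of="$target" bs=512 seek="$start" count="$mb_sectors" \
        conv=notrunc 2>/dev/null || return 1
}

# Example call (assumed raw device path; destructive, so left commented out):
# wipe_ends /dev/rdsk/c1t5d0s0 SECTORS
```

Once both ends of the new disk have been overwritten, the single-argument 'zpool replace tank c1t5d0' from the original message is the command to retry.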
Richard Broberg
2006-May-10 16:28 UTC
[zfs-discuss] Re: disk failure on raidz pool defies fixing
Your suggestion worked just fine -- I did the dd on the target disk, then
was able to do the 'zpool replace'.

Requested files coming shortly (I'm attaching them after I've already
successfully run 'zpool replace', btw).

Thanks
Richard Broberg
2006-May-10 16:31 UTC
[zfs-discuss] Re: disk failure on raidz pool defies fixing
Log files attached.

Attachments (non-text, scrubbed by the list archive):

  Name: errlog    Type: application/octet-stream  Size: 54746 bytes
  URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20060510/9c4ac51b/attachment.obj>

  Name: fltlog    Type: application/octet-stream  Size: 1839 bytes
  URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20060510/9c4ac51b/attachment-0001.obj>

  Name: messages  Type: application/octet-stream  Size: 75358 bytes
  URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20060510/9c4ac51b/attachment-0002.obj>
storage-disk
2006-Dec-21 00:12 UTC
[zfs-discuss] Re: disk failure on raidz pool defies fixing
Hi Eric,

I'm experiencing the same problem. However, I don't know how to dd the
first and the last sector of the disk. May I have the command?

How do you decode the files /var/fm/fmd/errlog and /var/fm/fmd/fltlog?

Thanks,
Giang
Tomas Ögren
2006-Dec-21 01:02 UTC
[zfs-discuss] Re: disk failure on raidz pool defies fixing
On 20 December, 2006 - storage-disk sent me these 0,4K bytes:

> Hi Eric,
>
> How do you decode file /var/fm/fmd/errlog and /var/fm/fmd/fltlog?

fmdump -e, fmdump

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
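The two commands in the reply above can be spelled out as follows. This is a sketch that assumes a Solaris host with fmdump(1M) installed and the logs in their default location under /var/fm/fmd. The wrapper name show_fm_logs is hypothetical and only there so the snippet fails cleanly on a machine without fmdump.

```shell
#!/bin/sh
# show_fm_logs (hypothetical wrapper): print the two binary fault
# management logs in readable form using fmdump.
show_fm_logs() {
    if type fmdump >/dev/null 2>&1; then
        fmdump -e   # error reports, read from /var/fm/fmd/errlog by default
        fmdump      # diagnosed faults, read from /var/fm/fmd/fltlog by default
    else
        echo "fmdump not found; run this on the Solaris host" >&2
        return 1
    fi
}
```

fmdump also accepts a log file as an operand, so a copy of a log taken from another machine (such as the errlog and fltlog attachments earlier in this thread) can be read with 'fmdump FILE'. Adding -V prints each event in full detail.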