Hi,

I have a SunFire X4540 with 19TB in a RAID-Z configuration; here's my zpool status:

      pool: raid
     state: UNAVAIL
    status: One or more devices are faulted in response to IO failures.
    action: Make sure the affected devices are connected, then run 'zpool clear'.
       see: http://www.sun.com/msg/ZFS-8000-HC
     scrub: resilver in progress for 84h11m, 99.47% done, 0h27m to go
    config:

        NAME              STATE     READ WRITE CKSUM
        raid              UNAVAIL      0     0   451  insufficient replicas
          raidz1          UNAVAIL      0     0   902  insufficient replicas
            c0t3d0        ONLINE       0     0     0
            c1t3d0        ONLINE       0     0     0
            c2t3d0        ONLINE       0     0     0
            c3t3d0        ONLINE       0     0     0
            c4t3d0        UNAVAIL    472    94     0  cannot open
            c5t3d0        ONLINE       0     0     0
            c0t7d0        ONLINE       0     0     0
            c1t7d0        ONLINE       0     0     0
            c2t7d0        ONLINE       0     0     0
            c3t7d0        ONLINE       0     0     0
            c0t2d0        ONLINE       0     0     0
            c1t2d0        ONLINE       0     0     0
            c2t2d0        ONLINE       0     0     0
            c3t2d0        ONLINE       0     0     0
            c4t2d0        ONLINE       0     0     0
            c4t6d0        ONLINE       0     0     0
            spare         DEGRADED     7     0 66.8M
              c5t2d0      FAULTED     11     2     0  too many errors
              replacing   DEGRADED     0     0     0
                c5t7d0    FAULTED     13     0     0  too many errors
                c5t6d0    ONLINE       0     0     0  202G resilvered
            c0t6d0        ONLINE       0     0     0
            c1t6d0        ONLINE       0     0     0
            c2t6d0        ONLINE       0     0     0
            c3t6d0        ONLINE       0     0     0
            spare         DEGRADED     0     0     0
              c0t1d0      FAULTED      0     0     0  too many errors
              c4t7d0      ONLINE       0     0     0
            c1t1d0        ONLINE       0     0     0
            c2t1d0        ONLINE       0     0     0
            c3t1d0        ONLINE       0     0     0
            c4t1d0        ONLINE       0     0     0
            c4t5d0        ONLINE       0     0     0
            c0t5d0        ONLINE       0     0     0
            c1t5d0        ONLINE       0     0     0
            c2t5d0        ONLINE       0     0     0
            c3t5d0        ONLINE       0     0     0
            c5t1d0        ONLINE       0     0     0
            c5t5d0        ONLINE       0     0     0
            c0t4d0        ONLINE       0     0     0
            c2t0d0        ONLINE       0     0     0
            c3t0d0        ONLINE       0     0     0
            c4t0d0        ONLINE       0     0     0
            c5t0d0        ONLINE       0     0     0
            c1t4d0        ONLINE       0     0     0
            c2t4d0        ONLINE       0     0     0
            c3t4d0        ONLINE       0     0     0
            c4t4d0        ONLINE       0     0     0
        spares
          c4t7d0          INUSE     currently in use
          c5t7d0          INUSE     currently in use
          c5t6d0          INUSE     currently in use
          c5t4d0          AVAIL

    errors: 911 data errors, use '-v' for a list

It looks like the resilver has got stuck; Oracle have sent out a replacement disk today and are asking me to replace c5t7d0.
If I am understanding the documentation correctly, I believe I need to do the following:

    zpool offline raid c5t7d0
    cfgadm -c unconfigure c5::dsk/c5t7d0

...before physically replacing the disk. However, I get the following messages when trying to do this:

    # zpool offline raid c5t7d0
    cannot offline c5t7d0: device is reserved as a hot spare
    # cfgadm -c unconfigure c5::dsk/c5t7d0
    cfgadm: Hardware specific failure: failed to unconfigure SCSI device: Device busy

I also tried a detach:

    # zpool detach raid c5t7d0
    cannot detach c5t7d0: pool I/O is currently suspended

And I also tried using the last available spare to free up the disk I need to replace:

    # zpool replace raid c5t2d0 c5t4d0
    Cannot replace c5t2d0 with c5t4d0: device has already been replaced with a spare

I am new to ZFS; how would I go about safely removing the affected drive in software before physically replacing it? I'm also not sure at exactly which point to run 'zpool clear' and 'zpool scrub'.

I'd appreciate any guidance - thanks in advance,

Mark

----
Mark Mahabir
Systems Manager, X-Ray and Observational Astronomy
Dept. of Physics & Astronomy, University of Leicester, LE1 7RH
Tel: +44(0)116 252 5652
email: mark.mahabir at leicester.ac.uk
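With this many devices in one listing, it can help to filter the status output down to the rows that are not ONLINE. The following is only a minimal sketch: the function name unhealthy_devices is made up for illustration, and it assumes device rows have the layout shown in the status output above (name, state, then numeric READ/WRITE/CKSUM counters).

```shell
#!/bin/sh
# Hypothetical helper: print "name state" for every device row whose
# state is FAULTED, UNAVAIL or DEGRADED. It reads `zpool status` text on
# stdin, so it can also be tried on saved output.
unhealthy_devices() {
    # Requiring a numeric third field skips the "state: UNAVAIL" summary
    # line and the spares section (INUSE / AVAIL rows).
    awk '$2 ~ /^(FAULTED|UNAVAIL|DEGRADED)$/ && $3 ~ /^[0-9]+$/ { print $1, $2 }'
}

# Example on a few rows copied from the status output above.
unhealthy_devices <<'EOF'
 state: UNAVAIL
        c3t3d0      ONLINE       0     0     0
        c4t3d0      UNAVAIL    472    94     0  cannot open
        c5t2d0      FAULTED     11     2     0  too many errors
EOF
```

On a live system the same filter would be fed from the real command, for example `zpool status raid | unhealthy_devices`.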
Sorry, but what exactly were you thinking of when putting 40+ drives in a single RAIDz1 VDEV?

roy

--
Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all educators to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Hi

I was a little short earlier today, but then...

First of all, using 40+ drives in a single RAIDz1 VDEV is a little like BASE jumping with an old, round parachute, without a reserve, in nasty weather conditions; not what I'd recommend.

What your zpool status shows is two spares in use, plus a spare that was flagged in use and then failed. With RAIDz1 you can lose a single drive, and as far as I can see you are now down two dead ones, meaning you've probably lost the pool. If you find a way to recover it, make a good backup. If you manage to back it up, or already have a backup, recreate the pool with smaller VDEVs, preferably RAIDz2.

The ZFS Best Practices Guide at http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide contains good reading on this and other subjects.

roy
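To make the "smaller VDEVs, preferably RAIDz2" suggestion concrete, here is a rough sketch that only prints a candidate `zpool create` command and never runs it. The function name plan_raidz2, the pool name tank and the vdev width of 4 are illustrative assumptions, not values from this thread; the right width for this machine is a judgment call that the Best Practices Guide discusses.

```shell
#!/bin/sh
# Hypothetical planner: print (do not execute) a `zpool create` command
# that groups the given disks into raidz2 vdevs of a fixed width.
# Usage: plan_raidz2 POOL WIDTH DISK...
plan_raidz2() {
    pool=$1
    width=$2
    shift 2
    cmd="zpool create $pool"
    i=0
    for disk in "$@"; do
        # Start a new raidz2 vdev every $width disks.
        if [ $((i % width)) -eq 0 ]; then
            cmd="$cmd raidz2"
        fi
        cmd="$cmd $disk"
        i=$((i + 1))
    done
    echo "$cmd"
}

# Eight disks, width 4 (an arbitrary example) gives two raidz2 vdevs.
plan_raidz2 tank 4 c0t1d0 c1t1d0 c2t1d0 c3t1d0 c0t2d0 c1t2d0 c2t2d0 c3t2d0
```

Each raidz2 vdev survives two failed disks, so a double failure like the one in this pool would degrade one vdev instead of taking the whole pool offline. If the disk count is not a multiple of the width, this sketch leaves the last vdev short, so the printed command should be reviewed before use.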