Dear managers,

one of our servers (X4240) shows a faulty disk:

------------------------------------------------------------------------
-bash-3.00# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
          mirror    DEGRADED     0     0     0
            c1t6d0  FAULTED      0    19     0  too many errors
            c1t7d0  ONLINE       0     0     0

errors: No known data errors
------------------------------------------------------------------------

I derived the following possible approaches to solve the problem:

1) One way to re-establish redundancy would be to use the command

        zpool attach tank c1t7d0 c1t15d0

to add c1t15d0 to the virtual device "c1t6d0 + c1t7d0". We would still
have the faulty disk in the virtual device. We could then detach the
faulty disk with the command

        zpool detach tank c1t6d0

2) Another approach would be to add a spare disk to tank

        zpool add tank spare c1t15d0

and then use replace to replace the faulty disk:

        zpool replace tank c1t6d0 c1t15d0

In theory that is easy, but since I have never done this and since this
is a production server, I would appreciate it if someone with more
experience would look over my plan before I issue these commands.

What is the difference between the two approaches? Which one do you
recommend? And is that really all that has to be done, or am I missing
a step? I mean, can c1t6d0 be physically replaced after issuing "zpool
detach tank c1t6d0" or "zpool replace tank c1t6d0 c1t15d0"? I also
found the command

        zpool offline tank ...

but am not sure whether it should be used in my case. Hints are greatly
appreciated!

Thanks a lot,

  Andreas
Cindy.Swearingen at Sun.COM
2009-Aug-06 20:04 UTC
[zfs-discuss] Replacing faulty disk in ZFS pool
Hi Andreas,

Good job for using a mirrored configuration. :-)

Your various approaches would work.

My only comment about #2 is that it might take some time for the spare
to kick in for the faulted disk.

Both 1 and 2 would take a bit more time than just replacing the faulted
disk with a spare disk, like this:

# zpool replace tank c1t6d0 c1t15d0

Then you could physically replace c1t6d0 and add it back to the pool as
a spare, like this:

# zpool add tank spare c1t6d0

For a production system, the steps above might be the most efficient.
Get the faulted disk replaced with a known good disk so the pool is no
longer degraded, then physically replace the bad disk when you have the
time and add it back to the pool as a spare.

It is also good practice to run a zpool scrub to ensure the replacement
is operational and use zpool clear to clear the previous errors on the
pool. If the system is used heavily, then you might want to run the
zpool scrub when system use is reduced.

If you were going to physically replace c1t6d0 while it was still
attached to the pool, then you might offline it first.

Cindy
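Pulled together, the sequence Cindy describes might look roughly like
this (a sketch only; the device names are the ones from Andreas's pool,
and the scrub/clear steps are the optional good-practice checks she
mentions above):

        # zpool replace tank c1t6d0 c1t15d0     (resilvers the mirror onto c1t15d0)
        # zpool status tank                     (wait for the resilver to complete)
        # zpool scrub tank                      (optional: verify the replacement disk)
        # zpool clear tank                      (optional: reset the old error counts)
        <physically swap out c1t6d0 when convenient>
        # zpool add tank spare c1t6d0           (hand the fresh disk back as a spare)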
I believe there are a couple of ways that work. The commands I've
always used are to attach the new disk as a spare (if not already) and
then replace the failed disk with the spare. I don't know if there are
advantages or disadvantages, but I have also never had a problem doing
it this way.
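In zpool terms, that is presumably just the two commands from Andreas's
approach #2 (device names taken from his mail):

        # zpool add tank spare c1t15d0
        # zpool replace tank c1t6d0 c1t15d0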
Hi Cindy,

> Good job for using a mirrored configuration. :-)

Thanks!

> Your various approaches would work.
>
> My only comment about #2 is that it might take some time for the spare
> to kick in for the faulted disk.
>
> Both 1 and 2 would take a bit more time than just replacing the faulted
> disk with a spare disk, like this:
>
> # zpool replace tank c1t6d0 c1t15d0

You mean I can execute

        zpool replace tank c1t6d0 c1t15d0

without having made c1t15d0 a spare disk first with

        zpool add tank spare c1t15d0

? After doing that, is c1t6d0 offline and ready to be physically replaced?

> Then you could physically replace c1t6d0 and add it back to the pool as
> a spare, like this:
>
> # zpool add tank spare c1t6d0
>
> For a production system, the steps above might be the most efficient.
> Get the faulted disk replaced with a known good disk so the pool is
> no longer degraded, then physically replace the bad disk when you have
> the time and add it back to the pool as a spare.
>
> It is also good practice to run a zpool scrub to ensure the
> replacement is operational

That would be

        zpool scrub tank

in my case!?

> and use zpool clear to clear the previous
> errors on the pool.

I assume the complete command for my case is

        zpool clear tank

Why do we have to do that? Couldn't zfs realize that everything is fine
again after executing "zpool replace tank c1t6d0 c1t15d0"?

> If the system is used heavily, then you might want to run the zpool
> scrub when system use is reduced.

That would be now! :-)

> If you were going to physically replace c1t6d0 while it was still
> attached to the pool, then you might offline it first.

Ok, this sounds like approach 3)

        zpool offline tank c1t6d0
        <physically replace c1t6d0 with a new one>
        zpool online tank c1t6d0

Would that be it?

Thanks a lot!

Regards,

  Andreas
Cindy.Swearingen at Sun.COM
2009-Aug-06 20:38 UTC
[zfs-discuss] Replacing faulty disk in ZFS pool
Andreas,

More comments below.

Cindy

On 08/06/09 14:18, Andreas Höschler wrote:

> You mean I can execute
>
>         zpool replace tank c1t6d0 c1t15d0
>
> without having made c1t15d0 a spare disk first with

Yes, that is correct.

>         zpool add tank spare c1t15d0
>
> ? After doing that, is c1t6d0 offline and ready to be physically replaced?

Yes, that is correct.

>> It is also good practice to run a zpool scrub to ensure the
>> replacement is operational
>
> That would be
>
>         zpool scrub tank
>
> in my case!?

Yes.

>> and use zpool clear to clear the previous
>> errors on the pool.
>
> I assume the complete command for my case is
>
>         zpool clear tank
>
> Why do we have to do that? Couldn't zfs realize that everything is fine
> again after executing "zpool replace tank c1t6d0 c1t15d0"?

Yes, sometimes the clear is not necessary but it will also clear the
error counts if need be.

> Ok, this sounds like approach 3)
>
>         zpool offline tank c1t6d0
>         <physically replace c1t6d0 with a new one>
>         zpool online tank c1t6d0
>
> Would that be it?

Those steps would be like this:

        zpool offline tank c1t6d0
        <physically replace c1t6d0 with a new one>
        zpool replace tank c1t6d0
        zpool online tank c1t6d0

On some hardware, you must unconfigure the disk before replacing it,
such as after taking it offline. I'm not sure if the X4240 is in that
category. If you do the replacement with another known good disk
(c1t15d0), then you do not have to unconfigure the failed disk first.

See Example 11-1 for more information:

http://docs.sun.com/app/docs/doc/819-5461/gbbvf?a=view
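For the in-place variant (new disk into the same slot), the unconfigure
step Cindy mentions is normally done with cfgadm. The attachment point
name below is a guess for this controller, so check the output of
cfgadm -al for the real one before relying on this sketch:

        # zpool offline tank c1t6d0
        # cfgadm -c unconfigure c1::dsk/c1t6d0    (only if the hardware requires it)
        <physically replace c1t6d0 in the same slot>
        # cfgadm -c configure c1::dsk/c1t6d0
        # zpool replace tank c1t6d0
        # zpool online tank c1t6d0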
If he adds the spare and then manually forces a replace, it will take
no more time than any other way. I do this quite frequently and without
needing the scrub, which does take quite a lot of time.
Hi all,

I have done

        zpool add tank spare c1t15d0
        zpool replace tank c1t6d0 c1t15d0

now and waited for the completion of the resilvering process. "zpool
status" now gives me

 scrub: resilver completed after 0h22m with 0 errors on Thu Aug  6 22:55:37 2009
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          mirror       ONLINE       0     0     0
            c1t2d0     ONLINE       0     0     0
            c1t3d0     ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t5d0     ONLINE       0     0     0
            c1t4d0     ONLINE       0     0     0
          mirror       DEGRADED     0     0     0
            spare      DEGRADED     0     0     0
              c1t6d0   FAULTED      0    19     0  too many errors
              c1t15d0  ONLINE       0     0     0
            c1t7d0     ONLINE       0     0     0
        spares
          c1t15d0      INUSE     currently in use

errors: No known data errors

This does look like a final step is missing. Can I simply physically
replace c1t6d0 now, or do I have to do

        zpool offline tank c1t6d0

first? Moreover, it seems I have to run a

        zpool clear

in my case to get rid of the DEGRADED message!? What is the missing bit
here?

> zpool offline tank c1t6d0
> <physically replace c1t6d0 with a new one>
> zpool replace tank c1t6d0
> zpool online tank c1t6d0

Just out of curiosity (since I took the other road this time), how does
the replace command know what exactly to do here? In my case I told the
system specifically to replace c1t6d0 with c1t15d0 by doing "zpool
replace tank c1t6d0 c1t15d0", but if I simply issue

        zpool replace tank c1t6d0

it ...!??

Thanks a lot,

  Andreas
Cindy.Swearingen at Sun.COM
2009-Aug-06 21:13 UTC
[zfs-discuss] Replacing faulty disk in ZFS pool
Andreas,

I think you can still offline the faulted disk, c1t6d0.

The difference between these two replacements:

        zpool replace tank c1t6d0 c1t15d0
        zpool replace tank c1t6d0

is that in the second case, you are telling ZFS that c1t6d0 has been
physically replaced in the same location. This would be equivalent but
unnecessary syntax:

        zpool replace tank c1t6d0 c1t6d0

Another option is to set the autoreplace pool property to on, which
will do the replacement steps (zpool replace) after you physically
replace the disk in the same physical location as the faulted disk.
This is also described in Example 11-1, here:

http://docs.sun.com/app/docs/doc/819-5461/gbbvf?a=view

After you physically replace c1t6d0, then you might have to detach the
spare, c1t15d0, back to the spare pool, like this:

# zpool detach tank c1t15d0

I'm not sure this step is always necessary...

cs
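If the autoreplace route sounds preferable, the property Cindy refers
to is set on the pool itself. A sketch (the final detach is only needed
if the spare does not return to the spare list on its own):

        # zpool set autoreplace=on tank
        <physically replace the faulted disk in the same slot>
        # zpool detach tank c1t15d0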
Hi Cindy,

> I think you can still offline the faulted disk, c1t6d0.

OK, here it gets tricky. I have

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          mirror       ONLINE       0     0     0
            c1t2d0     ONLINE       0     0     0
            c1t3d0     ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t5d0     ONLINE       0     0     0
            c1t4d0     ONLINE       0     0     0
          mirror       DEGRADED     0     0     0
            spare      DEGRADED     0     0     0
              c1t6d0   FAULTED      0    19     0  too many errors
              c1t15d0  ONLINE       0     0     0
            c1t7d0     ONLINE       0     0     0
        spares
          c1t15d0      INUSE     currently in use

now. When I issue the command

        zpool offline tank c1t6d0

I get

        cannot offline c1t6d0: no valid replicas

?? However,

        zpool detach tank c1t6d0

seems to work!

  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h22m with 0 errors on Thu Aug  6 22:55:37 2009
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t15d0  ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0

errors: No known data errors

This looks like I can remove and physically replace c1t6d0 now! :-)

Thanks,

  Andreas
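Following Cindy's earlier advice, the one remaining step once the bad
disk has been physically swapped would presumably be to hand the fresh
disk back to the pool as a hot spare:

        # zpool add tank spare c1t6d0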
Cindy.Swearingen at Sun.COM
2009-Aug-06 21:42 UTC
[zfs-discuss] Replacing faulty disk in ZFS pool
Dang. This is a bug we talked about recently that is fixed in Nevada
and an upcoming Solaris 10 release.

Okay, so you can't offline the faulted disk, but you were able to
replace it and detach the spare.

Cool beans...

Cindy