I made a really stupid mistake... Having had trouble removing a hot spare marked as failed, I was trying several ways to put it back in a good state. One of the things I tried was 'zpool add pool c5t3d0'... but I forgot to use the proper syntax, "zpool add pool spare c5t3d0". Now I'm in a bind. I've got 4 large raidz2's and now this puny 500GB drive in the config:

...
        raidz2    ONLINE       0     0     0
          c5t7d0  ONLINE       0     0     0
          c5t2d0  ONLINE       0     0     0
          c7t7d0  ONLINE       0     0     0
          c6t7d0  ONLINE       0     0     0
          c1t7d0  ONLINE       0     0     0
          c0t7d0  ONLINE       0     0     0
          c4t3d0  ONLINE       0     0     0
          c7t3d0  ONLINE       0     0     0
          c6t3d0  ONLINE       0     0     0
          c1t3d0  ONLINE       0     0     0
          c0t3d0  ONLINE       0     0     0
        c5t3d0    ONLINE       0     0     0
      spares
        c5t3d0    FAULTED   corrupted data
        c4t7d0    AVAIL
...

Detach and remove won't work. Does anyone know of a way to get that c5t3d0 out of the data configuration and back to hot-spare duty where it belongs?

If I understand the layout properly, though, this shouldn't have an adverse impact on my existing configuration... I think. If I can't dump it, what happens when that disk fills up?

I can't believe I made such a boneheaded mistake. This is one of those times when an "Are you sure you...?" would be helpful. :(

benr.
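P.S. For the record, the two commands differ only by the missing "spare" keyword, and a dry run would have shown me the resulting layout before committing anything. A minimal sketch, assuming the pool really is just named "pool":

   # what I typed: adds c5t3d0 as a new single-disk top-level vdev
   zpool add pool c5t3d0

   # what I meant: adds c5t3d0 as a hot spare
   zpool add pool spare c5t3d0

   # -n prints the would-be configuration without changing the pool
   zpool add -n pool spare c5t3d0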
There's really no way to recover from this, since we don't have device removal. However, I'm surprised that no warning was given. There are at least two things that should have happened:

1. zpool(1M) should have warned you that the redundancy level you were attempting did not match that of your existing pool. (This doesn't apply if you already have a mixed level of redundancy.)

2. zpool(1M) should have warned you that the device was in use as an active spare, and not let you continue.

What bits were you running?

- Eric

On Tue, Jan 15, 2008 at 06:25:50PM -0800, Ben Rockwood wrote:
> I made a really stupid mistake... Having had trouble removing a hot spare
> marked as failed, I was trying several ways to put it back in a good
> state. [...]

--
Eric Schrock, FishWorks                        http://blogs.sun.com/eschrock
Eric Schrock wrote:
> There's really no way to recover from this, since we don't have device
> removal. However, I'm surprised that no warning was given.
> [...]
> What bits were you running?

snv_78, however the pool was created on snv_43 and hasn't yet been upgraded. Though, programmatically, I can't see why there would be a difference in the way 'zpool' would handle the check.

The big question is: if I'm stuck like this permanently, what's the potential risk? Could I just fail that drive and leave it in a failed state?

benr.
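P.S. For what it's worth, I'd expect both of Eric's checks to surface as a refusal that has to be explicitly overridden with -f, and the pool's on-disk version is easy to confirm. A rough sketch below; the command names are real, but I'm paraphrasing the refusal behaviour from memory, so it may well differ on these bits:

   # without -f, zpool add should refuse a vdev whose redundancy doesn't
   # match the rest of the pool (or a disk that's an active spare);
   # -f is the explicit "yes, I really mean it" override
   zpool add -f pool c5t3d0

   # confirm what version the pool is actually at
   zpool upgrade        # lists pools not running the latest on-disk version
   zpool upgrade -v     # lists the versions this build supports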
Hello Ben,

Wednesday, January 16, 2008, 5:29:57 AM, you wrote:

BR> The big question is: if I'm stuck like this permanently, what's the
BR> potential risk?
BR> Could I just fail that drive and leave it in a failed state?

If any data has been written since you did it, there's a "chance" it was striped across your raidz2 vdevs and this drive - so if you fail the drive you won't have access to some data. Metadata should be fine, but after a reboot or export you won't be able to import the pool.

If you can't re-create the pool (plus backup & restore your data), I would recommend waiting for device removal in ZFS, and in the meantime attaching another drive to it so you've got a mirrored configuration; you can remove them once device removal exists. Since you're already working on Nevada you could probably adopt new bits quickly. The only question is when device removal is going to be integrated - last time someone mentioned it here, it was supposed to be by the end of last year...

--
Best regards,
Robert Milkowski                       mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com
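PS. A rough sketch of what I mean, assuming a fresh disk of at least the same size is available - c8t0d0 below is a made-up device name, substitute your own:

   # turn the accidentally-added single-disk vdev into a mirror;
   # ZFS resilvers onto the new disk automatically
   zpool attach pool c5t3d0 c8t0d0

   # watch the resilver and the new mirror appear in the config
   zpool status pool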
Robert Milkowski wrote:
> If you can't re-create the pool (plus backup & restore your data), I would
> recommend waiting for device removal in ZFS, and in the meantime attaching
> another drive to it so you've got a mirrored configuration [...]

Ya, I'm afraid you're right.

benr.