Hello list,

I pre-created the pools we would use for when the SSDs eventually came in. Not my finest moment, perhaps. Since I knew the SSDs would be 32GB in size, I created 32GB slices on the HDDs in slots 36 and 44.

* For future reference to others thinking of doing the same: do not bother setting up the log until you have the SSDs, or make the slices half the planned SSD size.

So the SSDs arrived, and I have a spare X4540 to attempt the replacement on, before we have to do it on all the production X4540s. Hopefully with no downtime.

SunOS x4500-15.unix 5.10 Generic_141445-09 i86pc i386 i86pc

        logs
          c5t4d0s0  ONLINE       0     0     0
          c6t4d0s0  ONLINE       0     0     0

# zpool detach zpool1 c5t4d0s0
# hdadm offline disk c5t4

This was very exciting: this is the first time EVER that the blue LED has turned on. Much rejoicing! ;)

Took slot 36 out, and inserted the first SSD. Lights came on green again, but just in case:

# hdadm online disk c5t4

I used format to fdisk it and change to an EFI label.

# zpool attach zpool1 c6t4d0s0 c5t4d0
cannot attach c5t4d0 to c6t4d0s0; the device is too small

Uh oh. Of course: I created a slice of 32GB, literally, and an SSD's "32GB" is the old HDD "human" size. This has been fixed in OpenSolaris already (attaching smaller mirrors), but apparently not for Solaris 10 u8. I appear screwed. Are there patches to fix this, perhaps? Hopefully? ;)

However, what I COULD do is add a new device:

# zpool add zpool1 log c5t4d0
# zpool status

        logs
          c6t4d0s0  ONLINE       0     0     0
          c5t4d0    ONLINE       0     0     0

Interesting. Unfortunately, I can not "zpool offline", nor "zpool detach", nor "zpool remove" the existing c6t4d0s0 device. At this point we are essentially stuck. I would have to re-create the whole pool to fix this. With servers live and full of customer data, this will be awkward.

So I switched to a more .. direct approach. I also knew that if the log device fails, the pool will go back to using the "default" log device.

# hdadm offline disk c6t4

Even though this says "OK", it does not actually work, since the device is in use. In the end, I simply pulled out the HDD. Since we had already added a second log device, there were no hiccups at all. It barely noticed it was gone.

        logs
          c6t4d0s0  UNAVAIL      0     0     0  corrupted data
          c5t4d0    ONLINE       0     0     0

At this point we inserted the second SSD, did the format for the EFI label, and were a little surprised that this worked:

# zpool attach zpool1 c5t4d0 c6t4d0

So now we have the situation of:

        logs
          c6t4d0s0  UNAVAIL      0     0     0  corrupted data
          mirror    ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0

It would be nice to get rid of c6t4d0s0, though. Any thoughts? What would you experts do in this situation? We have to run Solaris 10 (loooong battle there, no support for OpenSolaris from anyone in Japan).

Can I delete the sucker using zdb?

Thanks for any reply,

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
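For anyone wanting to catch this kind of size mismatch up front, one option (assuming the standard Solaris tools, and re-using the device names from the box above) is to compare what each device actually reports before attempting the attach:

# prtvtoc /dev/rdsk/c6t4d0s0
# prtvtoc /dev/rdsk/c5t4d0

prtvtoc prints the sector counts, and zpool attach will refuse the new device if its usable size is smaller than that of the device it is meant to mirror, which is exactly what happened here.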
> Interesting. Unfortunately, I can not "zpool offline", nor "zpool
> detach", nor "zpool remove" the existing c6t4d0s0 device.

I thought perhaps we could boot something newer than b125 [*1] and I would be able to remove the slog device that is too big.

The dev-127.iso does not boot [*2] due to splashimage, so I had to edit the ISO to remove that before it would boot.

After booting with "-B console=ttya", I find that "it" can not add the /dev/dsk entries for the 24 HDDs, since "/" is on a too-small ramdisk. Disk-full messages ensue. Yay!

After I have finally imported the pools, without upgrading (since I have to boot back to Sol 10 u8 for production), I attempt to remove the "slog" that is no longer needed:

# zpool remove zpool1 c6t4d0s0
cannot remove c6t4d0s0: pool must be upgrade to support log removal

Sigh.

Lund

[*1] http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6574286
[*2] http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6739497

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
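For anyone hitting the same wall: log device removal appears to need a newer pool version (19, if I am reading "zpool upgrade -v" on the newer build correctly), and a pool upgrade is one-way, so it is not an option when the pool has to go back to Solaris 10 u8. A rough check would be:

# zpool upgrade -v
(lists the versions this build supports; "Log device removal" is one of the later entries)
# zpool upgrade zpool1
(would enable it, but the pool would then no longer import under Sol 10 u8)

Hence leaving the pool at its old version and looking for another way out.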
Ok, the "logfix" program compiled for snv_111 does run, and lets me swap the 32GB HDD slog for the new SSD (~29GB) slog. It comes up as faulted, but I can replace it with itself, and everything is OK. I can then attach the second SSD without issues.

Assuming it doesn't ever try to write the full 32GB, it should be ok. I don't know whether zpool stores the physical size in the label, or works it out when importing.

# zpool export zpool1
# ./logfix /dev/rdsk/c5t1d0s0 /dev/rdsk/c10t4d0s0 13049515403703921770
# zpool import zpool1
# zpool status

        logs
          13049515403703921770  FAULTED      0     0     0  was /dev/dsk/c10t4d0s0

# zpool replace -f zpool1 13049515403703921770 c10t4d0
# zpool status

        logs
          c10t4d0   ONLINE       0     0     0

# zpool attach zpool1 c10t4d0 c9t4d0

        logs
          mirror-1  ONLINE       0     0     0
            c10t4d0 ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0

And back in Solaris 10 u8:

# zpool import zpool1
# zpool status

        logs
          mirror    ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0

It does at least have a solution, even if it is a rather unattractive one. 12 servers, and it has to be done at 2am, which means I will be testy for a while.

Lund

Jorgen Lundman wrote:
> I thought perhaps we could boot something newer than b125 [*1] and I
> would be able to remove the slog device that is too big.
> [...]
> # zpool remove zpool1 c6t4d0s0
> cannot remove c6t4d0s0: pool must be upgrade to support log removal

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
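If anyone wants a read-only sanity check on what logfix ends up writing (assuming I understand it correctly, it re-labels the new device with the GUID of the old slog), the on-disk labels can be dumped with zdb, using the device path from the box above:

# zdb -l /dev/rdsk/c10t4d0s0 | grep guid

Right after running logfix and importing (before the zpool replace step), the guid field in the labels should show the 13049515403703921770 that logfix printed. zdb -l only reads the four vdev labels, so it does not modify anything.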