Hello list,

Before we started changing to ZFS bootfs, we used DiskSuite mirrored ufs
boot.

Very often, if we needed to grow a cluster by another machine or two, we
would simply clone a running live server. Generally the procedure for this
would be:

1 detach the "2nd" HDD, metaclear, and delete the metadb on the 2nd disk.
2 mount the 2nd HDD under /mnt, and change system/vfstab to be a single
  boot HDD, no longer "mirrored", as well as the host name and IP
  addresses.
3 bootadm update-archive -R /mnt
4 unmount, cfgadm unconfigure, and pull out the HDD.

Generally, in about 4 minutes, we have a new live server in the cluster.

We tried to do the same thing today, but with a ZFS bootfs. We did:

1 zpool detach on the "2nd" HDD.
2 cfgadm unconfigure the HDD, and pull out the disk.

The source server was fine: we could insert a new disk, attach it, and it
resilvered.

However, the new destination server had lots of issues. At first, grub
would give no menu at all, just the bare grub> command prompt. The
command findroot(pool_zboot,0,a) would return "Error 15: No such file".

After booting a Solaris Live CD, I could "zpool import" the pool, but of
course it was in Degraded mode etc. Now it would show the menu, but if you
booted it, it would flash the message that the pool was last accessed by
Solaris $sysid, and panic.

After a lot of reboots and fiddling, I managed to get miniroot to at least
boot; then, only after inserting a new HDD and letting the pool become
completely "good" would it let me boot into multi-user.

Is there something we should do, perhaps, that will let the cloning
procedure go smoothly? Should I "export" the 'now separated disk' somehow?
In fact, can I mount that disk to make changes to it before pulling out
the disk?

Most documentation on cloning uses "zfs send", which would be possible,
but 4 minutes is hard to beat when your cluster is under heavy load.

Lund

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3-3375-1767 (home)
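As a rough command-level sketch, that DiskSuite procedure would look
something like the following. The metadevice, slice, and cfgadm
attachment-point names are illustrative examples only, not taken from the
original setup:

# metadetach d10 d12        # split the 2nd-disk submirror off the root mirror
# metaclear d12             # remove the now-unused submirror metadevice
# metadb -d c0t1d0s7        # delete the state database replicas on the 2nd disk
# mount /dev/dsk/c0t1d0s0 /mnt
# vi /mnt/etc/system /mnt/etc/vfstab    # boot from the plain slice instead of /dev/md/dsk/d10
# vi /mnt/etc/nodename /mnt/etc/hosts   # new host name and IP addresses
# bootadm update-archive -R /mnt
# umount /mnt
# cfgadm -c unconfigure sata0/1::dsk/c0t1d0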
Ok, so it seems that with DiskSuite, detaching a mirror does nothing to
the disk you detached. However, "zpool detach" appears to mark the disk as
blank, so nothing will find any pools (import, import -D etc). zdb -l will
show labels, but no amount of work that we have found will bring the HDD
back online in the new server. Grub is blank, and findroot can not see any
pool.

zpool will not let you offline the 2nd disk in a mirror. This is incorrect
behaviour.

You can not "cfgadm unconfigure" the sata device while zpool has the disk.
We can just yank the disk, but we had issues getting a new blank disk
recognised after that: cfgadm would not release the old disk.

However, we found we can do this:

# cfgadm -x sata_port_deactivate sata0/1::dsk/c0t1d0

This will make zpool mark it:

        c0t1d0s0  REMOVED   0  0  0

and eventually:

        c0t1d0s0  FAULTED   0  0  0  too many errors

After that, we pull out the disk, and issue:

# zpool detach zboot c0t1d0s0
# cfgadm -x sata_port_activate sata0/1::dsk/c0t1d0
# cfgadm -c configure sata0/1::dsk/c0t1d0
# format      (fdisk, partition as required to be the same)
# zpool attach zboot c0t0d0s0 c0t1d0s0

There is one final thing to address: when the disk is used in a new
machine, it will generally panic with "pool was used previously with
system-id xxxxxx", which requires more miniroot work. It would be nice to
be able to avoid this as well. But you can't export the "/" pool before
pulling out the disk, either.

Jorgen Lundman wrote:
> Is there something we should do, perhaps, that will let the cloning
> procedure go smoothly? Should I "export" the 'now separated disk'
> somehow? In fact, can I mount that disk to make changes to it before
> pulling out the disk?

Lund

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3-3375-1767 (home)
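Presumably the "more miniroot work" amounts to a forced import from the
failsafe/miniroot shell, something like this (pool name as in the example
above; "zpool import -f" overrides the last-accessed-by-another-system
check):

# zpool import -f zboot     # force import despite the foreign hostid
# reboot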
Jorgen Lundman wrote:
> However, "zpool detach" appears to mark the disk as blank, so nothing
> will find any pools (import, import -D etc). zdb -l will show labels,

For kicks, I tried to demonstrate this does indeed happen, so I dd'ed the
first 1024 1k blocks from the disk, zpool detached it, then dd'ed the
image back out to the HDD.

Pulled out the disk, and it boots directly without any interventions. If
only zpool detach had a flag to tell it not to scribble over the detached
disk.

Guess I could diff the before and after disk images and work out what it
is that it does, and write a tool to undo it, or figure out if I can undo
it using "zdb".

Lund

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3-3375-1767 (home)
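Spelled out, that experiment is roughly the following, using the same
example disk and pool names as earlier in the thread. The first 1024 1k
blocks cover the front ZFS labels and boot area; this is a hack, not a
supported procedure:

# dd if=/dev/rdsk/c0t1d0s0 of=/var/tmp/front.img bs=1k count=1024   # save the pre-detach front of the disk
# zpool detach zboot c0t1d0s0
# dd if=/var/tmp/front.img of=/dev/rdsk/c0t1d0s0 bs=1k count=1024   # write the old labels back over the scribble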
On Fri, Jul 24, 2009 at 9:24 AM, Jorgen Lundman <lundman at gmo.jp> wrote:
> However, "zpool detach" appears to mark the disk as blank, so nothing
> will find any pools (import, import -D etc). zdb -l will show labels,

If both disks are bootable (with installboot or installgrub), removing the
mirror disk and putting it in the new server should create an exact clone
(including IP address and hostname). I don't think this is recommended,
though.

This page provides root pool recovery methods, which should also be usable
for cloning purposes:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery

-- 
Fajar
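For reference, making the second half of a root mirror bootable is
normally a one-off step after attaching it; the device name here is an
example, with the first form for x86/grub and the second for SPARC:

# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0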
Jorgen Lundman wrote:
> For kicks, I tried to demonstrate this does indeed happen, so I dd'ed
> the first 1024 1k blocks from the disk, zpool detached it, then dd'ed
> the image back out to the HDD.
>
> Pulled out the disk, and it boots directly without any interventions.
> If only zpool detach had a flag to tell it not to scribble over the
> detached disk.

This is a known issue. The simple case is actually quite easy (and I had
nearly working code for it pretty quickly). What I mean by the simple case
is 2 disks in a mirror. What if it was actually a 4 disk or 48 disk pool
set up as mirrored pairs - that is much harder to deal with. Likewise, the
cases where there are cache and log devices in the pool are harder to deal
with (not really an issue for root pools, though).

Maybe the 2 disk mirror is a special enough case that this could be worth
allowing without having to deal with all the other cases as well. The only
reason I think it is a special enough case is because it is the config we
use for the root/boot pool.

See 6849185 and 5097228.

-- 
Darren J Moffat
Darren J Moffat wrote:
> Maybe the 2 disk mirror is a special enough case that this could be
> worth allowing without having to deal with all the other cases as
> well. The only reason I think it is a special enough case is because
> it is the config we use for the root/boot pool.
>
> See 6849185 and 5097228.

Ah, of course, you have a valid point: mirrors can be used in much more
complicated situations.

Been reading your blog all day, while impatiently waiting for zfs-crypto.

Lund

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3-3375-1767 (home)
Darren J Moffat wrote:
> Maybe the 2 disk mirror is a special enough case that this could be
> worth allowing without having to deal with all the other cases as
> well. The only reason I think it is a special enough case is because
> it is the config we use for the root/boot pool.

Not so much a 2 disk mirror, but a single-vdev mirror (e.g. also a 3-way
mirror, having one disk split off). This is asked for by multiple
customers at every ZFS Discovery Day we do.

> See 6849185 and 5097228.

-- 
Andrew
On Jul 14, 2009, at 10:45 PM, Jorgen Lundman wrote:
> Before we started changing to ZFS bootfs, we used DiskSuite mirrored
> ufs boot.
>
> Very often, if we needed to grow a cluster by another machine or two,
> we would simply clone a running live server. Generally the procedure
> for this would be:
>
> 1 detach the "2nd" HDD, metaclear, and delete the metadb on the 2nd disk.
> 2 mount the 2nd HDD under /mnt, and change system/vfstab to be a single
>   boot HDD, no longer "mirrored", as well as the host name and IP
>   addresses.
> 3 bootadm update-archive -R /mnt
> 4 unmount, cfgadm unconfigure, and pull out the HDD.

That is because you had only one other choice: filesystem level copy.
With ZFS I believe you will find that snapshots will allow you to have
better control over this. The send/receive process is very, very similar
to a mirror resilver, so you are only carrying your previous process
forward into a brave new world. You'll find that send/receive is much
more flexible than broken mirrors can be.

-- richard
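A minimal sketch of that snapshot-based route, assuming the example pool
name from earlier in the thread and a second machine already reachable as
"newhost" (a cloned root pool would still need boot blocks installed and
its per-host config edited afterwards):

# zfs snapshot -r zboot@clone                                   # recursive snapshot of the whole pool
# zfs send -R zboot@clone | ssh newhost zfs receive -Fd zboot   # replicate datasets and snapshots over the wire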
Richard Elling wrote:
> That is because you had only one other choice: filesystem level copy.
> With ZFS I believe you will find that snapshots will allow you to have
> better control over this. You'll find that send/receive is much more
> flexible than broken mirrors can be.

Perhaps, but when the crunch is on, it is hard to beat the 3-minute
cloning. "zfs send" will not be done in 3 minutes, especially if the
version used is from before the "zfs send speed fixes", like official
Sol 10 10/08. (I am not sure, but zfs send sounds like you already need
the 2nd server set up and running with IPs etc.)

Anyway, we have found a procedure now, so it is all possible. But it
would have been nicer to be able to detach the disk "politely" ;)

Lund

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3-3375-1767 (home)
Richard Elling wrote:
> With ZFS I believe you will find that snapshots will allow you to have
> better control over this. The send/receive process is very, very
> similar to a mirror resilver, so you are only carrying your previous
> process forward into a brave new world. You'll find that send/receive
> is much more flexible than broken mirrors can be.

I always do my zpool backups by split mirrors, but I did just try using
zfs send/receive to see if it's viable. Unfortunately, it craps out about
10% in with "cannot receive incremental stream: invalid backup stream",
so it doesn't get me very far.

I would guess, from the time taken to get as far as it did, that if it
had worked it would probably be about 3 times slower than a resilver and
split mirror. That's for a 500GB zpool with 8 filesystems and 3,500
snapshots.

-- 
Andrew