Hello all,

It has often been asked and discussed on this list how to change rpool HDDs from AHCI to IDE mode and back. The current routine involves reconfiguring the BIOS, booting from separate live media, a simple import and export of the rpool, and then booting from the rpool. The documented way is to reinstall the OS upon such hardware changes. Both are inconvenient, to say the least. Linux and recent Windows are much more forgiving about wholesale changes of hardware underneath the OS image between boots: they just boot up and work. Why do we shoot ourselves in the foot with this boot-up problem?

Now that I'm trying to dual-boot my OI-based system, I hit the problem hard: I have either a hardware SATA controller (AMD Hudson, often not recognized at boot, but that's another story) and a VirtualBox SATA controller on different PCI dev/vendor IDs, or physical and virtual IDE which resolve to the same device path through cmdk and pci-ide - so I'm stuck with IDE mode, at least for these compatibility reasons.

So the basic question is: WHY does the OS insist on the device path (the /pci... string) encoded in the rpool's vdev labels mid-way through bootup, during the VFS root-import routine, and panic if the device naming has changed, when the loader (GRUB, for example) had no problem reading the same rpool? Is there a rationale or historic baggage behind this? Is it a design error or an oversight?

Isn't it possible to use the same routine as for other pool imports, including the import of this same rpool from a live-media boot - just find the component devices (starting with the one passed by the loader, and/or matching by pool name and/or GUID) and import the resulting pool? Perhaps this could be attempted if the current method fails, before resorting to a kernel panic - try another method first.

Would this be a sane thing to change, or are there known beasts lurking in the dark?

Thanks,
//Jim Klimov
On 10/03/2012 05:54 AM, Jim Klimov wrote:
> Hello all,
>
> It has often been asked and discussed on this list how to change rpool
> HDDs from AHCI to IDE mode and back. The current routine involves
> reconfiguring the BIOS, booting from separate live media, a simple
> import and export of the rpool, and then booting from the rpool. The
> documented way is to reinstall the OS upon such hardware changes. Both
> are inconvenient, to say the least.

Any chance to touch /reconfigure, power off, then change the BIOS
settings and reboot, like in the old days? Or maybe pass -r, and
optionally -s and -v, from GRUB, the old way we used to reconfigure
Solaris?
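For reference, that classic reconfigure-boot sequence looks roughly like the sketch below (a hedged example assuming a GRUB-based x86 install; exact menu entries and kernel paths vary per release):

    # flag the next boot as a reconfiguration boot
    touch /reconfigure
    # then power off, change the BIOS SATA/IDE setting, and boot; or instead
    # append boot flags to the kernel$ line of the GRUB menu entry, e.g.:
    #   kernel$ /platform/i86pc/kernel/$ISADIR/unix -r -s -v
    # (-r forces device reconfiguration, -s boots single-user, -v is verbose)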
2012-10-03 14:40, Ray Arachelian wrote:
> On 10/03/2012 05:54 AM, Jim Klimov wrote:
>> It has often been asked and discussed on this list how to change rpool
>> HDDs from AHCI to IDE mode and back. The current routine involves
>> reconfiguring the BIOS, booting from separate live media, a simple
>> import and export of the rpool, and then booting from the rpool. The
>> documented way is to reinstall the OS upon such hardware changes. Both
>> are inconvenient, to say the least.
>
> Any chance to touch /reconfigure, power off, then change the BIOS
> settings and reboot, like in the old days? Or maybe pass -r, and
> optionally -s and -v, from GRUB, the old way we used to reconfigure
> Solaris?

Tried that, does not help. Adding forceloads to /etc/system
and remaking the boot archive - also no.

//Jim
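The forceload attempt mentioned above would look something like this (a sketch; the exact driver names depend on the controllers involved and are assumptions here):

    # /etc/system - ask the kernel to load both disk stacks up front,
    # regardless of which controller mode the BIOS presents
    forceload: drv/sd
    forceload: drv/ahci
    forceload: drv/cmdk
    forceload: drv/ata

    # /etc/system is part of the boot archive, so rebuild it afterwards
    bootadm update-archive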
On Wed, Oct 3, 2012 at 5:43 PM, Jim Klimov <jimklimov at cos.ru> wrote:
> 2012-10-03 14:40, Ray Arachelian wrote:
>> On 10/03/2012 05:54 AM, Jim Klimov wrote:
>>> It has often been asked and discussed on this list how to change
>>> rpool HDDs from AHCI to IDE mode and back. The current routine
>>> involves reconfiguring the BIOS, booting from separate live media,
>>> a simple import and export of the rpool, and then booting from the
>>> rpool.

IIRC when working with Xen I had to boot from a live CD, import the
pool, then power off (without exporting the pool). Then it could boot.
Somewhat in line with what you described.

>>> The documented way is to reinstall the OS upon such hardware
>>> changes. Both are inconvenient, to say the least.
>>
>> Any chance to touch /reconfigure, power off, then change the BIOS
>> settings and reboot, like in the old days? Or maybe pass -r, and
>> optionally -s and -v, from GRUB, the old way we used to reconfigure
>> Solaris?
>
> Tried that, does not help. Adding forceloads to /etc/system
> and remaking the boot archive - also no.

On Ubuntu + zfsonlinux + root/boot on ZFS, the boot script helper is
"smart" enough to try all available device nodes, so it wouldn't matter
if the dev path/id/name changed. But ONLY if there's no zpool.cache in
the initramfs.

Not sure how easy it would be to port that functionality to Solaris.

-- 
Fajar
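The Linux-side behaviour can be forced by keeping the cache file out of the initramfs (a sketch for Ubuntu + zfsonlinux of that era; script and package details are assumptions):

    # drop the cached pool config so the initramfs helper falls back to
    # scanning all device nodes instead of trusting stale device paths
    rm /etc/zfs/zpool.cache
    # rebuild the initramfs so it no longer carries the cache file
    update-initramfs -u
    # after a path change, a pool can also be re-imported by scanning:
    #   zpool import -d /dev -f rpool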
2012-10-03 16:04, Fajar A. Nugraha wrote:
> On Ubuntu + zfsonlinux + root/boot on ZFS, the boot script helper is
> "smart" enough to try all available device nodes, so it wouldn't matter
> if the dev path/id/name changed. But ONLY if there's no zpool.cache in
> the initramfs.
>
> Not sure how easy it would be to port that functionality to Solaris.

Thanks, I thought of zpool.cache too, but it is only listed in
/boot/solaris/filelist.safe, which ironically still exists - though
proper failsafe archives are not generated anymore. Even bringing
those back would be a huge step forward: a locally hosted,
self-sufficient, interactive mini OS image in an archive, unpacked
and booted by GRUB independently of Solaris's view of the hardware,
is much simpler than external live media...

Unfortunately, so far I have not seen a way to fix the boot procedure
short of hacking the binaries by compiling new ones, i.e. I did not
find any easily changeable scripted logic. I digress; I have not yet
looked much further than unpacking the boot archive file itself and
inspecting the files inside. There are not even any binaries in it,
which I'm afraid means the logic lives in the kernel monofile... :(

//Jim
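For anyone repeating that inspection, unpacking the boot archive goes roughly like this (a sketch assuming the x86 archive is a UFS ramdisk image, possibly gzip-compressed; paths and formats differ per platform and release):

    # work on a copy; uncompress it if the archive is gzipped
    gzcat /platform/i86pc/amd64/boot_archive > /tmp/ba 2>/dev/null || \
        cp /platform/i86pc/amd64/boot_archive /tmp/ba
    # attach the image to a lofi device and mount it read-only
    LOFI=$(lofiadm -a /tmp/ba)
    mount -F ufs -o ro "$LOFI" /mnt
    ls /mnt                    # etc/system, etc/zfs/zpool.cache, ...
    umount /mnt
    lofiadm -d "$LOFI"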
It's been a while, but it seems like in the past you would power the
system down, boot from removable media, import your pool, then destroy
or archive the /etc/zfs/zpool.cache and possibly your /etc/path_to_inst
file, power down again and re-arrange your hardware, then come up one
final time with a reconfigure boot. Or something like that.

I remember a similar video that was up on YouTube, done by some of the
Sun guys employed in Germany. They built a big array from USB drives,
then exported the pool. Once the system was down, they re-arranged all
the drives in random order and ZFS was able to figure out how to put
the raid back together. I need to go find that video.

Jerry

On 10/ 3/12 07:04 AM, Fajar A. Nugraha wrote:
> On Wed, Oct 3, 2012 at 5:43 PM, Jim Klimov <jimklimov at cos.ru> wrote:
>> 2012-10-03 14:40, Ray Arachelian wrote:
>>> On 10/03/2012 05:54 AM, Jim Klimov wrote:
>>>> It has often been asked and discussed on this list how to change
>>>> rpool HDDs from AHCI to IDE mode and back. The current routine
>>>> involves reconfiguring the BIOS, booting from separate live media,
>>>> a simple import and export of the rpool, and then booting from the
>>>> rpool.
>
> IIRC when working with Xen I had to boot from a live CD, import the
> pool, then power off (without exporting the pool). Then it could boot.
> Somewhat in line with what you described.
>
>>>> The documented way is to reinstall the OS upon such hardware
>>>> changes. Both are inconvenient, to say the least.
>>>
>>> Any chance to touch /reconfigure, power off, then change the BIOS
>>> settings and reboot, like in the old days? Or maybe pass -r, and
>>> optionally -s and -v, from GRUB, the old way we used to reconfigure
>>> Solaris?
>>
>> Tried that, does not help. Adding forceloads to /etc/system
>> and remaking the boot archive - also no.
>
> On Ubuntu + zfsonlinux + root/boot on ZFS, the boot script helper is
> "smart" enough to try all available device nodes, so it wouldn't matter
> if the dev path/id/name changed. But ONLY if there's no zpool.cache in
> the initramfs.
>
> Not sure how easy it would be to port that functionality to Solaris.
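Jerry's older procedure, roughly sketched (run from a live-media shell; the pool name, BE name and alternate root are assumptions):

    # import the root pool under an alternate root
    zpool import -f -R /a rpool
    # the root BE is usually canmount=noauto, so mount it by hand
    zfs mount rpool/ROOT/openindiana
    # move the cached config (and, if desired, path_to_inst) out of the way
    mv /a/etc/zfs/zpool.cache /a/etc/zfs/zpool.cache.bak
    mv /a/etc/path_to_inst /a/etc/path_to_inst.bak
    # request a reconfiguration boot, then release the pool
    touch /a/reconfigure
    zpool export rpool
    # power down, re-arrange the hardware, and boot from the rpool again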
On Thu, Oct 04, 2012 at 07:57:34PM -0500, Jerry Kemp wrote:
> I remember a similar video that was up on YouTube, done by some of the
> Sun guys employed in Germany. They built a big array from USB drives,
> then exported the pool. Once the system was down, they re-arranged all
> the drives in random order and ZFS was able to figure out how to put
> the raid back together. I need to go find that video.

http://constantin.glez.de/blog/2011/01/how-save-world-zfs-and-12-usb-sticks-4th-anniversary-video-re-release-edition ?

Have fun,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
Thanks for the link. This was the YouTube link that I had:

http://www.youtube.com/watch?v=1zw8V8g5eT0

Jerry

On 10/ 4/12 08:07 PM, Jens Elkner wrote:
> On Thu, Oct 04, 2012 at 07:57:34PM -0500, Jerry Kemp wrote:
>> I remember a similar video that was up on YouTube, done by some of the
>> Sun guys employed in Germany. They built a big array from USB drives,
>> then exported the pool. Once the system was down, they re-arranged all
>> the drives in random order and ZFS was able to figure out how to put
>> the raid back together. I need to go find that video.
>
> http://constantin.glez.de/blog/2011/01/how-save-world-zfs-and-12-usb-sticks-4th-anniversary-video-re-release-edition
> ?
>
> Have fun,
> jel.
2012-10-05 4:57, Jerry Kemp wrote:
> It's been a while, but it seems like in the past you would power the
> system down, boot from removable media, import your pool, then destroy
> or archive the /etc/zfs/zpool.cache and possibly your /etc/path_to_inst
> file, power down again and re-arrange your hardware, then come up one
> final time with a reconfigure boot. Or something like that.

Well, as I wrote in the OP, the procedure is simpler now; in your words:

* power the system down and re-arrange your hardware (BIOS settings,
  in the case of a SATA/Legacy=IDE switch)
* boot from removable media
* import the rpool
* export the rpool
* reboot from the rpool (a command sketch of this follows after this
  message)

Your procedure CAN be useful, e.g. if a secondary userdata pool fails
to import and causes kernel panics, or relies on external hardware
(NAS/SAN) which is no longer available; by deleting the zpool.cache
you can avoid its automatic import at OS boot. The cache does not seem
to be consulted by the rpool import routine itself.

The current procedure has a few "bad" steps I want to avoid:

1) Use of extra media. I'd settle for at most a self-sufficient
failsafe boot image like the one we had in SXCE/Sol10 (basically just
a bigger boot_archive that you can log into); preferably the OS should
not need even that, and should switch to the new rpool connection
technology on the fly during boot.

2) Reliance on remapped PCI path numbers (i.e. it is often not
vendorid:devid but a pci@number kind of address), which might change
between boots if the enumeration is not set in stone for some reason.
For example, I do worry whether LiveUSB boots can make the HDD appear
at a different path than plain HDD boots - due to the insertion or
removal of a whole storage device tree and the change of BIOS boot
order. (This laptop has no CD/DVD, and I don't think buying and
adding/removing a USB CD/DVD drive would be substantially different
from adding/removing a USB HDD as I do now.)

Whatever device paths the live-media bootup sees, it writes into the
rpool headers upon a successful import/export, and those strings
(only?) are probed by the boot from the rpool. That is, the newly
booted kernel does see enough of the pool to find these headers, then
follows them and perhaps finds no storage hardware at that address.
Well then, search for and import the pool from the device WHERE you
found those headers? Duh?

///Jim
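The simpler routine above, as a hedged command sketch (run from a live-media shell after changing the BIOS setting; the pool name and altroot are assumptions):

    # re-import the root pool so its labels are rewritten with the device
    # paths seen under the new controller mode
    zpool import -f -R /a rpool
    # a clean export finishes updating the labels and releases the pool
    zpool export rpool
    # reboot from the rpool; the kernel should now find the recorded paths
    init 6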
Hello all,

I have one more thought - or a question - about the current strangeness
of rpool import: is it supported, and does it work, to have rpools on
multipathed devices?

If yes (which I hope it is, but I don't have a means to check), what
sort of string is saved into the pool's labels as its device path?
Some metadevice on a layer above MPxIO, or one of the physical storage
device paths? If the latter, what happens during system boot if the
multipathing happens to choose another path, not the one saved in the
labels?

Thanks for insights,
//Jim

2012-10-03 13:54, Jim Klimov wrote:
> So the basic question is: WHY does the OS insist on the device path
> (the /pci... string) encoded in the rpool's vdev labels mid-way
> through bootup, during the VFS root-import routine, and panic if the
> device naming has changed, when the loader (GRUB, for example) had no
> problem reading the same rpool? Is there a rationale or historic
> baggage behind this? Is it a design error or an oversight?
>
> Isn't it possible to use the same routine as for other pool imports,
> including the import of this same rpool from a live-media boot - just
> find the component devices (starting with the one passed by the
> loader, and/or matching by pool name and/or GUID) and import the
> resulting pool? Perhaps this could be attempted if the current method
> fails, before resorting to a kernel panic - try another method first.
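For what it's worth, the strings recorded in a pool's labels can be inspected directly (a sketch; the slice name below is just an example - point it at an actual rpool vdev):

    # dump the on-disk vdev labels; the output includes the recorded
    # path, devid, phys_path and the pool/vdev GUIDs
    zdb -l /dev/dsk/c3t0d0s0 | egrep 'path|devid|guid'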
James C. McPherson - 2012-Oct-19 07:16 UTC - [zfs-discuss] Changing rpool device paths/drivers
On 19/10/12 04:50 PM, Jim Klimov wrote:
> Hello all,
>
> I have one more thought - or a question - about the current strangeness
> of rpool import: is it supported, and does it work, to have rpools on
> multipathed devices?
>
> If yes (which I hope it is, but I don't have a means to check), what
> sort of string is saved into the pool's labels as its device path?
> Some metadevice on a layer above MPxIO, or one of the physical storage
> device paths? If the latter, what happens during system boot if the
> multipathing happens to choose another path, not the one saved in the
> labels?

If you run /usr/bin/strings over /etc/zfs/zpool.cache, you'll see that
not only is the device path stored, but (more importantly) so is the
devid.

As far as I'm aware, having an rpool on multipathed devices is fine.
Multiple paths to the device should still allow ZFS to obtain the same
devid info... and we use devids in preference to physical paths.

James C. McPherson
--
Oracle Systems / Solaris / Core
http://www.jmcpdotcom.com/blog
Thanks, more Qs below ;)

2012-10-19 11:16, James C. McPherson wrote:
> If you run /usr/bin/strings over /etc/zfs/zpool.cache, you'll see that
> not only is the device path stored, but (more importantly) so is the
> devid.

As an excerpt from my adventurous notebook, which only has an rpool on
SAS, I see these lines in IDE mode:

# /usr/bin/strings /etc/zfs/zpool.cache
...
path        /dev/dsk/c3d0s0
devid       2id1,cmdk@ATOSHIBA_MK5061GSY=___________72LBP0S2T/a
phys_path   $/pci@0,0/pci-ide@11/ide@0/cmdk@0,0:a
...

(I was wrong to say earlier that with VirtualBox I can dual-boot the VM
in IDE mode flawlessly; on my last test there were also discrepancies -
'pci-ide@1,1' vs. 'pci-ide@11' - so the rpool did not import either;
I am not sure what the devid would be in that case.)

With the same notebook reconfigured into SATA mode I see:

...
path        /dev/dsk/c3t0d0s0
devid       6id1,sd@SATA_____TOSHIBA_MK5061GS___________72LBP0S2T/a
phys_path   #/pci@0,0/pci17aa,5104@11/disk@0,0:a
...

Returning to my original problem and question: are any of these values
expected NOT to change when the driver (HBA device) is changed, i.e.
when switching between SATA and IDE modes? As seen above, the devid
apparently includes the driver/technology name (sd or cmdk), and the
identifiers (@tech_vendor_model_sernum) also differ, although some
components do match at least partially.

There are no problems with this when importing a "guest pool", such as
getting at an existing rpool while booted from a LiveCD; panics only
happen due to the extra checks during rpool import.

> As far as I'm aware, having an rpool on multipathed devices is fine.
> Multiple paths to the device should still allow ZFS to obtain the same
> devid info... and we use devids in preference to physical paths.

I do hope that in the multipathing case the multiple paths use the same
technology, such as SAS or iSCSI, leading to the same devids which can
be used reliably. Are there any real-life scenarios where multipathing
is implemented over several different transports, or do people avoid
that - just in case?

In particular, can't the pool GUID be used for spa_import_rootpool?

Thanks,
//Jim
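On the GUID question: the pool GUID is one identifier that does stay constant across the IDE/SATA switch, and it can be read without relying on device paths (a sketch; the device name is an example, and the 'guid' pool property is assumed to be available in this build):

    # from a running system with the pool imported
    zpool get guid rpool
    # or straight from the vdev labels, e.g. while booted from live media
    zdb -l /dev/dsk/c3t0d0s0 | grep pool_guid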
Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) - 2012-Oct-19 11:27 UTC - [zfs-discuss] Changing rpool device paths/drivers
> From: zfs-discuss-bounces at opensolaris.org
> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of James C. McPherson
>
> As far as I'm aware, having an rpool on multipathed devices is fine.

Even a year ago, a new system I bought from Oracle came with multipath
device names for all devices by default. Granted, there weren't any
multiple paths on that system... but it was using the multipath device
names. I expect this is the new default for everything moving forward.
James C. McPherson - 2012-Oct-19 11:56 UTC - [zfs-discuss] Changing rpool device paths/drivers
On 19/10/12 09:27 PM, Edward Ned Harvey
(opensolarisisdeadlongliveopensolaris) wrote:
>> From: zfs-discuss-bounces at opensolaris.org
>> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of James C. McPherson
>>
>> As far as I'm aware, having an rpool on multipathed devices is fine.
>
> Even a year ago, a new system I bought from Oracle came with multipath
> device names for all devices by default. Granted, there weren't any
> multiple paths on that system... but it was using the multipath device
> names. I expect this is the new default for everything moving forward.

All-MPxIO, All The Time is actually something we've been wanting to do
for quite some time now. When we introduced fibre channel support on
x86/x64, it was decided that MPxIO would be on by default. Likewise,
with the mpt_sas and pmcs 2nd-generation SAS drivers, MPxIO is on by
default. For legacy mpt and FC on SPARC it's off by default, but it's
very easy to turn on.

James C. McPherson
--
Oracle Systems / Solaris / Core
http://www.jmcpdotcom.com/blog
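For reference, turning MPxIO on where it is off by default goes roughly like this (a sketch; stmsboot rewrites vfstab and the device paths, and a reboot is required):

    # enable MPxIO for the supported HBA ports and update system config
    stmsboot -e
    # or per driver class, e.g. for fibre channel, by setting
    #   mpxio-disable="no";
    # in /kernel/drv/fp.conf and rebooting
    # afterwards, list the multipathed logical units to verify
    mpathadm list lu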
On Oct 19, 2012, at 12:16 AM, James C. McPherson <jmcp at opensolaris.org> wrote:
> On 19/10/12 04:50 PM, Jim Klimov wrote:
>> I have one more thought - or a question - about the current strangeness
>> of rpool import: is it supported, and does it work, to have rpools on
>> multipathed devices?
>>
>> If yes (which I hope it is, but I don't have a means to check), what
>> sort of string is saved into the pool's labels as its device path?
>> Some metadevice on a layer above MPxIO, or one of the physical storage
>> device paths? If the latter, what happens during system boot if the
>> multipathing happens to choose another path, not the one saved in the
>> labels?
>
> If you run /usr/bin/strings over /etc/zfs/zpool.cache, you'll see that
> not only is the device path stored, but (more importantly) so is the
> devid.

yuk. "zdb -C" is what you want.

> As far as I'm aware, having an rpool on multipathed devices is fine.
> Multiple paths to the device should still allow ZFS to obtain the same
> devid info... and we use devids in preference to physical paths.

It is fine. The boot process is slightly different in that zpool.cache
is not consulted at first. However, it is consulted later, so there are
edge cases where this can cause problems when there are significant
changes in the device tree. The archives are full of workarounds for
this rare case.
 -- richard

--
Richard.Elling at RichardElling.com
+1-760-896-4422
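A short example of that suggestion - zdb prints the cached configuration in readable form, without resorting to strings(1):

    # dump the cached configuration of the root pool, showing the recorded
    # path, devid and phys_path for each vdev
    zdb -C rpool
    # with no pool name, every pool in /etc/zfs/zpool.cache is shown
    zdb -C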