Evan Layton
2010-Jun-23 16:40 UTC
[zfs-discuss] c5->c9 device name change prevents beadm activate
On 6/23/10 4:29 AM, Brian Nitz wrote:
> I saw a problem while upgrading from build 140 to 141 where beadm
> activate {build141BE} failed because installgrub failed:
>
> # BE_PRINT_ERR=true beadm activate opensolarismigi-4
> be_do_installgrub: installgrub failed for device c5t0d0s0.
> Unable to activate opensolarismigi-4.
> Unknown external error.
>
> The reason installgrub failed is that it is attempting to install grub
> on c5t0d0s0 which is where my root pool is:
> # zpool status
>   pool: rpool
>  state: ONLINE
> status: The pool is formatted using an older on-disk format. The pool can
>         still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
>         pool will no longer be accessible on older software versions.
>   scan: scrub repaired 0 in 5h3m with 0 errors on Tue Jun 22 22:31:08 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         rpool       ONLINE       0     0     0
>           c5t0d0s0  ONLINE       0     0     0
>
> errors: No known data errors
>
> But the raw device doesn't exist:
> # ls -ls /dev/rdsk/c5*
> /dev/rdsk/c5*: No such file or directory
>
> Even though zfs pool still sees it as c5, the actual device seen by
> format is c9t0d0s0
>
> Is there any workaround for this problem? Is it a bug in install, zfs or
> somewhere else in ON?

In this instance beadm is a victim of the zpool configuration reporting the wrong device. This does appear to be a ZFS issue since the device actually being used is not what zpool status is reporting. I'm forwarding this on to the ZFS alias to see if anyone has any thoughts there.

-evan
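The mismatch described here (zpool status naming a device that has no node under /dev/rdsk) can be checked with a small script. What follows is only a sketch: the function names pool_devices and check_devices are made up for illustration, and it assumes device names of the cNtNdNsN form shown in the quoted output.

```shell
# Print the leaf disk devices (cNdN, cNtNdN or cNtNdNsN style names) that
# appear in `zpool status` output read from stdin. Hypothetical helper.
pool_devices() {
    awk '$1 ~ /^c[0-9]+(t[0-9]+)?d[0-9]+(s[0-9]+)?$/ { print $1 }'
}

# For each device name read from stdin, report whether a node of that name
# exists in the given directory (defaults to /dev/rdsk). Hypothetical helper.
check_devices() {
    dir=${1:-/dev/rdsk}
    while read -r dev; do
        if [ -e "$dir/$dev" ]; then
            echo "$dev ok"
        else
            echo "$dev missing"
        fi
    done
}
```

Run as `zpool status rpool | pool_devices | check_devices`. Any line ending in "missing" points at a device name that the pool configuration still records but that the system no longer provides.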
Cindy Swearingen
2010-Jun-23 17:08 UTC
[zfs-discuss] c5->c9 device name change prevents beadm activate
On 06/23/10 10:40, Evan Layton wrote:
> In this instance beadm is a victim of the zpool configuration reporting
> the wrong device. This does appear to be a ZFS issue since the device
> actually being used is not what zpool status is reporting. I'm forwarding
> this on to the ZFS alias to see if anyone has any thoughts there.
>
> -evan

Hi Evan,

I suspect that some kind of system, hardware, or firmware event changed this device name. We could identify the original root pool device with the zpool history output from this pool.

Brian, you could boot this system from the OpenSolaris LiveCD and attempt to import this pool to see if that will update the device info correctly.

If that doesn't help, then create /dev/rdsk/c5* symlinks to point to the correct device.

Thanks,

Cindy
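The symlink fallback could be scripted roughly as below. This is a sketch under assumptions, not a tested procedure: alias_slices is a made-up helper name, the c5t0d0/c9t0d0 prefixes are taken from the report in this thread, and the commands in the leading comments are only one plausible way to carry out the first two suggestions.

```shell
# Earlier steps, shown as comments because they depend on the system state:
#   zpool history rpool          # look for the device named when the pool was created
#   zpool import -f -R /a rpool  # from the LiveCD; may refresh the recorded device path

# Create OLD-named symlinks for every NEW-named node in DIR.
# Hypothetical helper; example: alias_slices c5t0d0 c9t0d0 /dev/rdsk
alias_slices() {
    old=$1
    new=$2
    dir=$3
    for node in "$dir/$new"*; do
        # Skip the literal pattern if the glob matched nothing.
        if [ ! -e "$node" ]; then
            continue
        fi
        suffix=${node#"$dir/$new"}      # e.g. s0, s2, p0
        # Never overwrite a name that already exists.
        if [ -e "$dir/$old$suffix" ] || [ -L "$dir/$old$suffix" ]; then
            continue
        fi
        # Relative link, so it resolves within the same directory.
        ln -s "$new$suffix" "$dir/$old$suffix"
    done
}
```

Trying it against a scratch directory first (for example one created with mktemp -d and populated with empty c9t0d0s0 files) shows which links it would create before anything under /dev is touched.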
Lori Alt
2010-Jun-23 17:15 UTC
[zfs-discuss] c5->c9 device name change prevents beadm activate
Cindy Swearingen wrote:
> Hi Evan,
>
> I suspect that some kind of system, hardware, or firmware event changed
> this device name. We could identify the original root pool device with
> the zpool history output from this pool.
>
> Brian, you could boot this system from the OpenSolaris LiveCD and
> attempt to import this pool to see if that will update the device info
> correctly.
>
> If that doesn't help, then create /dev/rdsk/c5* symlinks to point to
> the correct device.

I've seen this kind of device name change in a couple contexts now related to installs, image-updates, etc.

I think we need to understand why this is happening. Prior to OpenSolaris and the new installer, we used to go to a fair amount of trouble to make sure that device names, once assigned, never changed. Various parts of the system depended on device names remaining the same across upgrades and other system events.

Does anyone know why these device names are changing? Because that seems like the root of the problem. Creating symlinks with the old names seems like a band-aid, which could cause problems down the road--what if some other device on the system gets assigned that name on a future update?

Lori
Brian Nitz
2010-Jun-24 09:27 UTC
[zfs-discuss] c5->c9 device name change prevents beadm activate
Lori,

In my case what may have caused the problem is that after a previous upgrade failed, I used this zfs send/recv procedure to give me (what I thought was) a sane rpool:

http://blogs.sun.com/migi/entry/broken_opensolaris_never

Is it possible that a zfs recv of a root pool contains the device names from the sending hardware?
Lori Alt
2010-Jun-24 20:02 UTC
[zfs-discuss] c5->c9 device name change prevents beadm activate
On 06/24/10 03:27 AM, Brian Nitz wrote:
> Lori,
>
> In my case what may have caused the problem is that after a previous
> upgrade failed, I used this zfs send/recv procedure to give me (what I
> thought was) a sane rpool:
>
> http://blogs.sun.com/migi/entry/broken_opensolaris_never
>
> Is it possible that a zfs recv of a root pool contains the device
> names from the sending hardware?

Yes, the data installed by the zfs recv will contain the device names from the sending hardware.

I looked at the instructions in the blog you reference above and while the procedure *might* work in some circumstances, it would mostly be by accident. Maybe if there is an exact match of hardware, it might work, but there's also metadata that describes the BEs on a system and I doubt whether the send/recv would restore all the information necessary to do that.

You might want to bring this subject up on the caiman-discuss at opensolaris.org alias, where needs like this can be addressed for real, in the supported installation tools.

Lori