Jens Elkner
2008-Nov-12 23:30 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Hi,

in preparation for trying zfs boot on sparc, I installed all recent patches, incl. feature patches coming from s10s_u3wos_10, and after a reboot finally 137137-09 (still having everything on UFS). Now it doesn't boot anymore:

###############################
Sun Fire V240, No Keyboard
Copyright 2006 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.22.23, 2048 MB memory installed, Serial #63729301.
Ethernet address 0:3:ba:cc:6e:95, Host ID: 83cc6e95.

Rebooting with command: boot
Boot device: /pci@1c,600000/scsi@2/disk@0,0:a File and args:
|
seek failed

Warning: Fcode sequence resulted in a net stack depth change of 1
Evaluating:

Evaluating:
The file just loaded does not appear to be executable.
|1} ok |
###############################

fsck /dev/rdsk/c0t0d0s0 doesn't find any problems. So I mounted this slice on /tmp/a, and:

# find /tmp/a/boot
/tmp/a/boot
/tmp/a/boot/solaris
/tmp/a/boot/solaris/bin
/tmp/a/boot/solaris/bin/extract_boot_filelist
/tmp/a/boot/solaris/bin/create_ramdisk
/tmp/a/boot/solaris/bin/root_archive
/tmp/a/boot/solaris/filelist.ramdisk
/tmp/a/boot/solaris/filelist.safe
/tmp/a/boot/solaris/filestat.ramdisk

# cat /tmp/a/boot/solaris/filelist.ramdisk
etc/cluster/nodeid
etc/dacf.conf
etc/mach
kernel
platform

It looks different than on x86 (no kernels), so is it possible that the patch didn't install all required files, or is it simply broken? Or did somebody forget to mention that an OBP update is required before installing this patch?

Any hints?

Regards,
jel.
--
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
What hardware are you on, and what firmware version are you at? The issue is coming from the firmware.

Enda
--
This message posted from opensolaris.org
Jens Elkner
2008-Nov-13 21:36 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 10:50:02AM -0800, Enda wrote:
> What hardware are you on, and what firmware are you at.
> Issue is coming from firmware.

Sun Fire V240 with OpenBoot 4.22.23

Tried to find out whether there is an OBP patch available, but haven't found anything wrt. V240, V440 and V490 :(

Regards,
jel.
Gerry Haskins
2008-Nov-14 00:54 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Jens,

http://www.sun.com/bigadmin/patches/firmware/release_history.jsp on the Big Admin Patching center, http://www.sun.com/bigadmin/patches/, lists firmware revisions.

If it's the same as a V490, then I think the current firmware version is 121689-04, http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&queryKey5=121689&toDocument=yes

Best Wishes,
Gerry.
Jens Elkner
2008-Nov-14 09:25 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 04:54:57PM -0800, Gerry Haskins wrote:
> Jens, http://www.sun.com/bigadmin/patches/firmware/release_history.jsp on the Big Admin Patching center, http://www.sun.com/bigadmin/patches/ list firmware revisions.

Thanks a lot. Dug around there and found that 121683-06 aka OBP 4.22.33 seems to be the most recent one for V240. So in theory it should be ok in my case.

> If it's the same as a V490, then I think the current firmware version is 121689-04, http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&queryKey5=121689&toDocument=yes

OK - so the OBPs are all the latest ones on my machines. Unfortunately I don't have a 2nd V490 to test whether the problem occurs there as well - so I'd better postpone its upgrade :(

Anyway, thanks a lot Gerry,
jel.
hi,

is the system still in the same state initially reported ? ie. you have not manually run any commands (ie. installboot) that would have altered the slice containing the root fs where 137137-09 was applied

could you please provide the following

1. a copy of the 137137-09 patchadd log if you have one available

2. an indication of anything particular about the system configuration, ie. mirrored root

3. output from the following commands run against root fs where 137137-09 was applied

   ls -l usr/platform/sun4u/lib/fs/*/bootblk
   ls -l platform/sun4u/lib/fs/*/bootblk
   sum usr/platform/sun4u/lib/fs/*/bootblk
   sum platform/sun4u/lib/fs/*/bootblk
   dd if=/dev/rdsk/<rootdsk> of=/tmp/bb bs=1b iseek=1 count=15
   cmp /tmp/bb usr/platform/sun4u/lib/fs/ufs/bootblk
   cmp /tmp/bb platform/sun4u/lib/fs/ufs/bootblk
   prtvtoc /dev/rdsk/<rootdsk>

where <rootdsk> is the slice containing the root fs where 137137-09 was applied

thanks,
Ed
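The dd/cmp step in the list above can be wrapped in a small helper. This is only a sketch: `bootblk_matches` is a hypothetical name, not a Solaris command, and it assumes the boot block starts at sector 1 of the slice (512-byte sectors), as the quoted dd command implies. It uses `$(...)` and `$((...))`, so on Solaris 10 it would likely need ksh or bash rather than the legacy /bin/sh.

```shell
#!/bin/sh
# Sketch only: compare the boot block stored on a slice with a bootblk file.
# bootblk_matches is a hypothetical helper, not part of Solaris.
# Assumption: the boot block starts at sector 1 (512-byte sectors).
bootblk_matches() {
    dev=$1      # raw slice, e.g. /dev/rdsk/<rootdsk>
    file=$2     # e.g. platform/sun4u/lib/fs/ufs/bootblk
    tmp=/tmp/bb.$$

    # Size of the bootblk file, and the number of sectors that cover it.
    size=$(( $(wc -c < "$file") ))
    sectors=$(( (size + 511) / 512 ))

    # Skip sector 0 and read whole sectors (raw devices need aligned reads).
    dd if="$dev" of="$tmp" bs=512 skip=1 count="$sectors" 2>/dev/null || return 2

    # Compare only the first $size bytes, so sector padding is ignored.
    dd if="$tmp" bs=1 count="$size" 2>/dev/null | cmp -s - "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Something like `bootblk_matches /dev/rdsk/<rootdsk> platform/sun4u/lib/fs/ufs/bootblk && echo same` would then report whether the installed boot block is the one delivered by the patch. Unlike the plain cmp above, it does not complain when the file is shorter than the 15 blocks read from disk.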
Jens Elkner
2008-Nov-14 23:06 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Fri, Nov 14, 2008 at 01:07:29PM -0800, Ed Clark wrote:

hi,

> is the system still in the same state initially reported ?

Yes.

> ie. you have not manually run any commands (ie. installboot) that would have altered the slice containing the root fs where 137137-09 was applied
>
> could you please provide the following
>
> 1. a copy of the 137137-09 patchadd log if you have one available

cp it to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/
Can't spot anything unusual.

> 2. an indication of anything particular about the system configuration, ie. mirrored root

No mirrors/raid:

# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <FUJITSU-MAN3367MC-0109 cyl 24343 alt 2 hd 4 sec 737>
          /pci@1c,600000/scsi@2/sd@0,0
       1. c0t1d0 <SEAGATE-ST336737LC-0102 cyl 29773 alt 2 hd 4 sec 606>
          /pci@1c,600000/scsi@2/sd@1,0
       2. c0t2d0 <FUJITSU-MAT3073N SUN72G-0602-68.37GB>
          /pci@1c,600000/scsi@2/sd@2,0
       3. c0t3d0 <FUJITSU-MAT3073N SUN72G-0602-68.37GB>
          /pci@1c,600000/scsi@2/sd@3,0

> 3. output from the following commands run against root fs where 137137-09 was applied
>
> ls -l usr/platform/sun4u/lib/fs/*/bootblk
> ls -l platform/sun4u/lib/fs/*/bootblk
> sum usr/platform/sun4u/lib/fs/*/bootblk
> sum platform/sun4u/lib/fs/*/bootblk
> dd if=/dev/rdsk/<rootdsk> of=/tmp/bb bs=1b iseek=1 count=15
> cmp /tmp/bb usr/platform/sun4u/lib/fs/ufs/bootblk
> cmp /tmp/bb platform/sun4u/lib/fs/ufs/bootblk
> prtvtoc /dev/rdsk/<rootdsk>

also cp to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/
Seems to be ok, too.

Regards,
jel.
Hi,

> > 1. a copy of the 137137-09 patchadd log if you have one available
>
> cp it to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/
> Can't spot anything unusual.

thanks for info - what you provided here is the patch pkg installation log, what i was actually after was the patchadd log (ie. the messages output to terminal) -- both the patchadd log and the console log on reboot should have shown errors which would have provided hints as to what the problem was

> > 2. an indication of anything particular about the system configuration, ie. mirrored root
>
> No mirrors/raid:
>
> # format
> Searching for disks...done
>
> AVAILABLE DISK SELECTIONS:
>        0. c0t0d0 <FUJITSU-MAN3367MC-0109 cyl 24343 alt 2 hd 4 sec 737>
>           /pci@1c,600000/scsi@2/sd@0,0
>        1. c0t1d0 <SEAGATE-ST336737LC-0102 cyl 29773 alt 2 hd 4 sec 606>
>           /pci@1c,600000/scsi@2/sd@1,0
>        2. c0t2d0 <FUJITSU-MAT3073N SUN72G-0602-68.37GB>
>           /pci@1c,600000/scsi@2/sd@2,0
>        3. c0t3d0 <FUJITSU-MAT3073N SUN72G-0602-68.37GB>
>           /pci@1c,600000/scsi@2/sd@3,0
>
> > 3. output from the following commands run against root fs where 137137-09 was applied
> >
> > ls -l usr/platform/sun4u/lib/fs/*/bootblk
> > ls -l platform/sun4u/lib/fs/*/bootblk
> > sum usr/platform/sun4u/lib/fs/*/bootblk
> > sum platform/sun4u/lib/fs/*/bootblk
> > dd if=/dev/rdsk/<rootdsk> of=/tmp/bb bs=1b iseek=1 count=15
> > cmp /tmp/bb usr/platform/sun4u/lib/fs/ufs/bootblk
> > cmp /tmp/bb platform/sun4u/lib/fs/ufs/bootblk
> > prtvtoc /dev/rdsk/<rootdsk>
>
> also cp to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/
> Seems to be ok, too.

now the df/prtvtoc output was most useful :

137137-09 delivers sparc newboot, and the problem here appears to be that a root fs slice of 256M falls well below the minimum size required for sparc newboot to operate nominally -- due to the lack of space in /, i suspect that 137137-09 postpatch failed to copy the ~180MB failsafe archive (/platform/sun4u/failsafe) to your system, and that the ~80M boot archive (/platform/sun4u/boot_archive) was not created correctly on the reboot after applying 137137-09

the 'seek failed' error message you see on boot is coming from the ufs bootblk fcode, which i suspect is due to not being able to load the corrupt boot_archive

you should be able to get your system to boot by doing the following

1. net/CD/DVD boot the system using a recent update release, u5/u6 should work, not sure about u4 or earlier
2. mount the root fs slice, cd to <root-fs-mount-point>
3. ls -l platform/sun4u
4. rm -f platform/sun4u/boot_archive
5. sbin/bootadm -a update_all
6. ls -l platform/sun4u

the boot_archive file should build successfully, and you should see something like the following

# ls -la platform/sun4u
total 168008
drwxr-xr-x   4 root     sys          512 Nov 16 07:36 .
drwxr-xr-x  40 root     sys         1536 Nov 16 05:36 ..
-rw-r--r--   1 root     root    84787200 Nov 16 07:36 boot_archive
-rw-r--r--   1 root     sys        71808 Oct  3 14:28 bootlst
drwxr-xr-x   9 root     sys          512 Nov 16 05:10 kernel
drwxr-xr-x   4 root     bin          512 Nov 16 05:36 lib
-rw-r--r--   1 root     sys      1084048 Oct  3 14:28 wanboot
#

boot_archive corruption will be a recurrent problem on your configuration, every time the system determines that boot_archive needs to be rebuilt on reboot -- a very inelegant workaround would be to 'rm -f /platform/sun4u/boot_archive' every time before rebooting the system

a better option would be to reinstall the system, choosing a disk layout adequate for newboot

hth,
Ed
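The space problem described above could be checked up front, before patching. What follows is a rough sketch, not anything the patch itself does: the two sizes are just the approximate figures quoted in this post, the function names are made up for illustration, and `df -Pk` is the POSIX form (on Solaris 10 that may mean /usr/xpg4/bin/df). It also only looks at free space, so it is meant for a system where the archives do not exist yet.

```shell
#!/bin/sh
# Sketch only: is there room on the root fs for the newboot archives?
# Sizes are the approximate figures quoted in the post above.
FAILSAFE_MB=180
BOOT_ARCHIVE_MB=80

# enough_space AVAILABLE_KB REQUIRED_MB -> exit 0 if the requirement fits.
enough_space() {
    [ "$1" -ge $(( $2 * 1024 )) ]
}

# avail_kb MOUNTPOINT -> free kilobytes, taken from POSIX-format df output.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# check_root [MOUNTPOINT] -> prints a verdict, exit 1 if space looks too tight.
check_root() {
    root=${1:-/}
    need=$(( FAILSAFE_MB + BOOT_ARCHIVE_MB ))
    if enough_space "$(avail_kb "$root")" "$need"; then
        echo "$root: enough free space for about ${need} MB of archives"
    else
        echo "$root: less than ${need} MB free; boot archive creation may fail"
        return 1
    fi
}
```

Running `check_root /a` against a mounted root slice would give a quick yes/no. A rebuild on reboot may need additional temporary space beyond these two figures, so a pass here is no guarantee.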
Dominique Frise
2008-Nov-17 10:31 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
We noticed the following postpatch error while installing patch 137137-09 on systems with STMS enabled (MPxIO) and Fibre Channel system disks:

....
Patch 137137-09 has been successfully installed.
See /var/run/.patchSafeMode/root/var/sadm/patch/137137-09/log for details
Executing postpatch script...
Detected SVM root.
Installing bootblk on /dev/rdsk//dev/dsk/c6t20000014C35012F5d0s0
/dev/rdsk//dev/dsk/c6t20000014C35012F5d0s0: Not a character device
Installing bootblk on /dev/rdsk//dev/dsk/c6t20000004CF6F4E64d0s0
/dev/rdsk//dev/dsk/c6t20000004CF6F4E64d0s0: Not a character device
....

Due to a bug in the postpatch script, the UFS boot blocks could not be installed.

The solution is, while in single-user mode, to create the boot blocks manually before rebooting. In this case:

# /usr/sbin/installboot /platform/`uname -m`/lib/fs/ufs/bootblk /dev/rdsk/c6t20000014C35012F5d0s0
# /usr/sbin/installboot /platform/`uname -m`/lib/fs/ufs/bootblk /dev/rdsk/c6t20000004CF6F4E64d0s0

In your case, you will have to follow a boot recovery procedure like the one at http://sunsolve.sun.com/search/document.do?assetkey=1-62-206110-1

Hope this helps.

Dominique
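Judging only from the error text above, the script appears to have prefixed /dev/rdsk/ to a path that already began with /dev/dsk/. A small sketch of how the raw device name could be derived instead, and the matching installboot commands printed for review; `to_raw_device` and `print_installboot` are hypothetical helper names, and nothing here is taken from the actual postpatch script.

```shell
#!/bin/sh
# Sketch only: normalise a slice name to its /dev/rdsk form.
# to_raw_device and print_installboot are hypothetical helpers.
to_raw_device() {
    case $1 in
        /dev/rdsk/*) printf '%s\n' "$1" ;;                       # already raw
        /dev/dsk/*)  printf '/dev/rdsk/%s\n' "${1#/dev/dsk/}" ;; # block -> raw
        *)           printf '/dev/rdsk/%s\n' "$1" ;;             # bare cXtYdZsN
    esac
}

# Print (do not run) one installboot command per slice, for manual review.
print_installboot() {
    for slice in "$@"; do
        printf '/usr/sbin/installboot /platform/`uname -m`/lib/fs/ufs/bootblk %s\n' \
            "$(to_raw_device "$slice")"
    done
}
```

For example, `print_installboot /dev/dsk/c6t20000014C35012F5d0s0 /dev/dsk/c6t20000004CF6F4E64d0s0` prints the two commands shown in the post, which can then be checked and pasted by hand.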
hi,

> We noticed following postpatch error while installing
> patch 137137-09 on systems with STMS enabled (MPxIO)
> and fiber channels systems disks:
>
> ....
> Patch 137137-09 has been successfully installed.
> See /var/run/.patchSafeMode/root/var/sadm/patch/137137-09/log for details
> Executing postpatch script...
> Detected SVM root.
> Installing bootblk on /dev/rdsk//dev/dsk/c6t20000014C35012F5d0s0
> /dev/rdsk//dev/dsk/c6t20000014C35012F5d0s0: Not a character device
> Installing bootblk on /dev/rdsk//dev/dsk/c6t20000004CF6F4E64d0s0
> /dev/rdsk//dev/dsk/c6t20000004CF6F4E64d0s0: Not a character device
> ....
>
> Due to a bug in the postpatch script, the UFS boot
> blocks could not be installed.

thanks for info on this problem, we are looking into the options of creating a patch specific solution

> In your case, you will have to follow a boot recovery
> procedure like the one at
> http://sunsolve.sun.com/search/document.do?assetkey=1-62-206110-1

this procedure is not relevant to the original problem on this thread -- 137137-09 did install the bootblk successfully, the problem here is that the root fs later runs out of space for the boot archives associated with sparc newboot

best,
Ed
Jens Elkner
2008-Nov-18 03:18 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Sun, Nov 16, 2008 at 09:27:32AM -0800, Ed Clark wrote:

Hi Ed,

> > > 1. a copy of the 137137-09 patchadd log if you have
> > http://iws.cs.uni-magdeburg.de/~elkner/137137-09/
>
> thanks for info - what you provided here is the patch pkg installation log,

Yes, actually the only one I have/could find.

> what i was actually after was patchadd log (ie. the messages output to terminal)

Up to now I thought that stderr and stdout are redirected from patchadd to the patch log, but never checked in detail, since the log always had the info I needed ...

> -- both the patchadd log and the console log on reboot should have shown errors which would have provided hints as to what the problem was

Haven't seen anything unusual. But maybe I've overlooked it :(

> now the df/prtvtoc output was most useful :
>
> 137137-09 delivers sparc newboot, and the problem here appears to be that a root fs slice of 256M falls well below the minimum required size required for sparc newboot to operate nominally -- due to the lack of space in /, i suspect that 137137-09 postpatch failed to copy the ~180MB failsafe archive (/platform/sun4u/failsafe) to your system, and that the ~80M boot archive (/platform/sun4u/boot_archive) was not created correctly on the reboot after applying 137137-09
>
> the 'seek failed' error message you see on boot is coming from the ufs bootblk fcode, which i suspect is due to not being able load the corrupt boot_archive

Yes - that makes sense.

> you should be able to get your system to boot by doing the following
>
> 1. net/CD/DVD boot the system using a recent update release, u5/u6 should work, not sure about u4 or earlier
> 2. mount the root fs slice, cd to <root-fs-mount-point>
> 3. ls -l platform/sun4u
> 4. rm -f platform/sun4u/boot_archive
> 5. sbin/bootadm -a update_all

Yepp - now I can see the problem:

Creating boot_archive for /a
updating /a/platform/sun4u/boot_archive
15+0 records in
15+0 records out
cat: write error: No space left on device
bootadm: write to file failed: /a/boot/solaris/filestat.ramdisk.tmp: No space left on device

So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (incl. creating the appropriate links) was sufficient to get it to work.

> 6. ls -l platform/sun4u

total 136770
-rw-r--r--   1 root     root    68716544 Nov 18 03:22 boot_archive
-rw-r--r--   1 root     sys        71808 Oct  3 23:28 bootlst
-rw-r--r--   1 root     sys        79976 Oct  3 23:34 cprboot
drwxr-xr-x  11 root     sys          512 Mar 19  2007 kernel
drwxr-xr-x   4 root     bin          512 Nov 12 22:17 lib
drwxr-xr-x   2 root     bin          512 Mar 19  2007 sbin
-rw-r--r--   1 root     sys      1084048 Oct  3 23:28 wanboot

Filesystem         1024-blocks   Used  Available  Capacity  Mounted on
/dev/dsk/c0t0d0s0       245947 205343      16010       93%  /

> boot_archive corruption will be a recurrent problem on your configuration, every time the system determines that boot_archive needs to be rebuilt on reboot -- a very inelegant workaround would be to 'rm -f /platform/sun4u/boot_archive' every time before rebooting the system

Hmmm - maybe I'm wrong, but IMHO if there is not enough space for a new boot_archive, "bootadm" should not corrupt anything but leave the old one in place - I would guess that in 95% of all cases one gets away with it, since very often updates are not really required ...

> better option would be to reinstall the system, choosing a disk layout adequate for newboot

Well, the 2nd exercise is to test zfs boot (all systems have at least a 2nd HDD). If this works, just converting to zfs is probably the better option ...

Anyway, thanks a lot for your help!

Regards,
jel.
Vincent Fox
2008-Nov-18 22:37 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
I noticed this while patching to 137137-09 on a UFS Sparc today:

Patch 137137-09 has been successfully installed.
See /var/run/.patchSafeMode/root/var/sadm/patch/137137-09/log for details
Executing postpatch script...
Detected SVM root.
Installing bootblk on /dev/rdsk/c1t0d0s0
Installing bootblk on /dev/rdsk/c1t1d0s0
Creating boot_archive for /var/run/.patchSafeMode/root
updating /var/run/.patchSafeMode/root/platform/sun4u/boot_archive

I thought to look at df output before rebooting, and there are PAGES & PAGES like this:

/var/run/.patchSafeModeOrigFiles/usr/platform/FJSV,GPUZC-M/lib/libcpc.so.1
        7597264 85240 7512024 2% /usr/platform/FJSV,GPUZC-M/lib/libcpc.so.1
/var/run/.patchSafeModeOrigFiles/usr/platform/FJSV,GPUZC-M/lib/sparcv9/libcpc.so.1
        7597264 85240 7512024 2% /usr/platform/FJSV,GPUZC-M/lib/sparcv9/libcpc.so.1
/var/run/.patchSafeModeOrigFiles/usr/platform/sun4us/lib/libcpc.so.1
        7597264 85240 7512024 2% /usr/platform/sun4us/lib/libcpc.so.1
/var/run/.patchSafeModeOrigFiles/usr/platform/sun4us/lib/sparcv9/libcpc.so.1
        7597264 85240 7512024 2% /usr/platform/sun4us/lib/sparcv9/libcpc.so.1
/var/run/.patchSafeModeOrigFiles/usr/lib/fm/dict/SUN4US.dict
        7597264 85240 7512024 2% /usr/lib/fm/dict/SUN4US.dict
/var/run/.patchSafeModeOrigFiles/usr/lib/locale/C/LC_MESSAGES/SUN4US.mo
        7597264 85240 7512024 2% /usr/lib/locale/C/LC_MESSAGES/SUN4US.mo
/var/run/.patchSafeModeOrigFiles/usr/platform/sun4us/lib/fm/eft/pci.eft
        7597264 85240 7512024 2% /usr/platform/sun4us/lib/fm/eft/pci.eft

Hundreds of mountpoints, what's it doing in there?
Marion Hakanson
2008-Nov-19 00:20 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
vincent_b_fox at yahoo.com said:
> I thought to look at df output before rebooting, and there are PAGES & PAGES
> like this:
>
> /var/run/.patchSafeModeOrigFiles/usr/platform/FJSV,GPUZC-M/lib/libcpc.so.1
>         7597264 85240 7512024 2% /usr/platform/FJSV,GPUZC-M/lib/libcpc.so.1
> . . .
> Hundreds of mountpoints, what's it doing in there?

That's normal, for deferred-activation patches (like this jumbo kernel patch). They are loopback mounts which are supposed to keep any kernel-specific things from being affected by something that would otherwise change the running kernel.

Using liveupgrade for patches is quite a bit cleaner, in my opinion, if you have that option. It seems to do a good job of updating grub on all bootable drives as well (as of S10U6, anyway).

Regards,
Marion
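To get a number instead of pages of output, the loopback mounts could simply be counted. A tiny sketch, assuming only that the first column of each such df entry starts with /var/run/.patchSafeModeOrigFiles/ as in the listing quoted above; `count_safemode_mounts` is a made-up name.

```shell
#!/bin/sh
# Sketch only: count deferred-activation loopback mounts in df output.
# Reads df output on stdin; count_safemode_mounts is a hypothetical helper.
count_safemode_mounts() {
    grep -c '^/var/run/\.patchSafeModeOrigFiles/'
}
```

Usage would be `df -k | count_safemode_mounts`. On a system with no deferred-activation patch pending it should print 0 (grep then exits non-zero, which is harmless here).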
Hi Jens,

> > what i was actually after was patchadd log (ie. the messages output to terminal)
>
> Up to now I thought, that stderr and stdout are redirected from patchadd
> to the patchlog, but never checked in detail, since the log always had
> the info I needed ...

messages from the underlying pkging commands are captured in the /var/sadm/patch/<PID>/log file

messages from patchadd itself and patch level scripts (prepatch, postpatch, etc) go to stdout/stderr

these are two distinct sets of messages -- not really optimal, just the way patchadd has always been

> Yepp - now I can see the problem:
>
> Creating boot_archive for /a
> updating /a/platform/sun4u/boot_archive
> 15+0 records in
> 15+0 records out
> cat: write error: No space left on device
> bootadm: write to file failed: /a/boot/solaris/filestat.ramdisk.tmp: No space left on device
>
> So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (inkl.
> creating the appropriate links) was sufficient to get it work.

nice trick, but unfortunately it won't do -- officially you must _never_ make such changes that alter the type of system files ; if you do, the changes are at your own risk and completely unsupported

the basic reason for this is that patching cannot be guaranteed to behave in a deterministic manner when it encounters such changes -- this does cause real problems too, ie. a case where changing sendmail.cf to a symlink caused a kernel patch to only half apply, leading to long term outages

removing the old corrupt boot archive is a simple and safe way to free up some space, best part is on reboot the system will automatically rebuild it

> Hmmm - may be I'm wrong, but IMHO if there is not enough space for a new
> boot_archive the "bootadm" should not corrupt anything but leave the old
> one in place - I would guess, in 95% of al cases one comes away with it,
> since very often updates are not really required ...

hmm ... something of a double edged sword, at least the way it works currently we know with certainty when there was a problem building the archive and can go about rectifying it ; the problem with keeping the old boot archive is that the system may have the appearance of booting and possibly even running ok, but there is absolutely no guarantee of nominal operation, could be very confusing

> > better option would be to reinstall the system, choosing a disk layout adequate for newboot
>
> Well, the 2nd exercise is to test zfs boot (all systems have at least a
> 2nd HDD). If this works, just converting to zfs is probably the better
> option ...

yep - even better !

best,
Ed
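Since the patchadd and prepatch/postpatch messages only go to the terminal, one way to keep them is to capture the terminal output at the time. A minimal sketch using tee; `run_logged` and the file names in the usage line are made up for illustration, and this is not something patchadd does by itself.

```shell
#!/bin/sh
# Sketch only: run a command, show its stdout and stderr on the terminal,
# and append both to a log file. run_logged is a hypothetical helper.
# Note: the exit status returned is that of tee, not of the command.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee -a "$log"
}
```

For example, `run_logged /var/tmp/137137-09.patchadd.out patchadd /path/to/137137-09` (both paths are placeholders) would leave a copy of the postpatch messages next to the pkg-level log under /var/sadm/patch.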
Jens Elkner
2008-Nov-19 23:52 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Nov 18, 2008 at 07:44:56PM -0800, Ed Clark wrote:

Hi Ed,

> messages from the underlying pkging commands are captured in the /var/sadm/patch/<PID>/log file
>
> messages from patchadd itself and patch level scripts (prepatch, postpatch, etc) go to stdout/stderr
>
> these are two distinct sets of messages -- not really optimal, just the way patchadd has always been

Yepp - thanks for making that clear.

> > So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (inkl.
> > creating the appropriate links) was sufficient to get it work.
>
> nice trick, but unfortunately it won't do -- officially you must _never_ make such changes that alter the type of system files ; if you do, the changes are at your own risk and completely unsupported

Yes, considering it a temp. change only. However, pkgtools are usually robust enough (as long as the dir is not empty) to handle such relocations properly - that's why I really like pkgtools (giving me the freedom I need ;-)).

> the basic reason for this patching can not be guaranteed behave in a deterministic manner when it encounters such changes -- this does cause real problems too, ie. a case where changing sendmail.cf to a symlink caused a kernel patch only half apply, leading to long term outages

Oops - that's a big bummer [OT: and probably the result of the bad packaging strategy of solaris (i.e. not software oriented aka merging different sw into one sol package ...)]. Good to know that this has already happened...

> removing the old corrupt boot archive is a simple and safe way to free up some space, best part is on reboot the system will automatically rebuild it
>
> > Hmmm - may be I'm wrong, but IMHO if there is not enough space for a new
> > boot_archive the "bootadm" should not corrupt anything but leave the old
> > one in place - I would guess, in 95% of al cases one comes away with it,
> > since very often updates are not really required ...
>
> hmm ... something of double edged sword, at least the way it works currently we know with certainty when there was a problem building the archive and can go about rectifying it ; the problem with keeping the old boot archive is that the system may have the appearance of booting and possibly even running ok, but there is absolutely no guarantee of nominal operation, could be very confusing

Yes, I understand your point of view. However, I didn't mean silently ignoring the "unable to update boot archive" condition, but giving the user a simple way to fix the problem. So I would prefer "keep the old archive as long as it cannot be updated, but issue big warnings on reboot/activation so that one is informed that a fix is needed". In that case my system would have been "offline" for at most 30 min, but because of the bug it was offline for several days, and without your help probably for several weeks/months (i.e. my experience wrt. german sun support) ...

Anyway, thanks a lot again,
jel.
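The "keep the old archive unless the new one could be built" behaviour argued for above is the usual build-to-temp-then-rename pattern. A sketch of that pattern in general terms; it is not how bootadm works, and `replace_if_built` is a hypothetical helper whose builder command is expected to write the new content to stdout.

```shell
#!/bin/sh
# Sketch only: replace TARGET with freshly built content, but keep the old
# file (and warn loudly) if the build fails, e.g. because the disk is full.
# replace_if_built TARGET COMMAND [ARGS...]; COMMAND writes to stdout.
replace_if_built() {
    target=$1
    shift
    tmp=$target.new.$$          # same directory, so mv is a plain rename
    if "$@" > "$tmp"; then
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        echo "WARNING: could not rebuild $target; keeping the old copy" >&2
        return 1
    fi
}
```

The catch, and part of Ed's point, is that the old file may be stale. The pattern also needs room for the old and the new copy at the same time, which is exactly what a 256M root slice does not have.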
Vincent Fox
2008-Dec-02 20:22 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Reviving this thread.

We have a Solaris 10u4 system recently patched with 137137-09. Unfortunately the patch was applied from multi-user mode; I wonder if this may have been the original poster's problem as well? Anyhow, we are now stuck with an unbootable system as well.

I have submitted a case to Sun about it, will add details as that proceeds.
Enda O''Connor
2008-Dec-02 20:46 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Vincent Fox wrote:
> Reviving this thread.
>
> We have a Solaris 10u4 system recently patched with 137137-09.
> Unfortunately the patch was applied from multi-user mode, I wonder if this
> may have been original posters problem as well? Anyhow we are now stuck
> with an unbootable system as well.
>
> I have submitted a case to Sun about it, will add details as that proceeds.

Hi,

There are basically two possible issues that we are aware of:

6772822, where the root fs has insufficient space to hold the failsafe archive (181M), the boot archive (80M approx.), and a rebuild of same when rebooting, leading to several possible outcomes. If you see "seek failed", it indicates that the new bootblk installed ok, but the boot archive couldn't be rebuilt on reboot.

There are also issues where, if running SVM on MPxIO, the bootblk won't get installed: 6772083 or 6775167.

Let us know the exact error seen and, if possible, the exact output from patchadd 137137-09.

Enda
Vincent Fox
2008-Dec-02 21:42 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
The SupportTech responding to case #66153822 so far has only suggested "boot from cdrom and patchrm 137137-09", which tells me I'm dealing with a level-1 binder monkey. It's the idle node of a cluster holding 10K email accounts, so I'm proceeding cautiously. It is unfortunate the admin doing the original patching did them from multi-user, but here we are.

I am attempting to boot net:dhcp -s just to collect more info:

My patchadd output shows 138866-01 & 137137-09 being applied OK:

bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/138866-01
Validating patches...

Loading patches installed on the system...

Done!

Loading patches requested to install.

Done!

Checking patches that you specified for installation.

Done!

Approved patches will be installed in this order:

138866-01

Checking installed patches...
Verifying sufficient filesystem capacity (dry run method)...
Installing patch packages...

Patch 138866-01 has been successfully installed.
See /var/sadm/patch/138866-01/log for details

Patch packages installed:
  SUNWcsr

bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/137137-09
Validating patches...

Loading patches installed on the system...

Done!

Loading patches requested to install.

Version of package SUNWcakr from directory SUNWcakr.u in patch 137137-09 differs from the package installed on the system.
Version of package SUNWcar from directory SUNWcar.u in patch 137137-09 differs from the package installed on the system.
Version of package SUNWkvm from directory SUNWkvm.c in patch 137137-09 differs from the package installed on the system.
Version of package SUNWkvm from directory SUNWkvm.d in patch 137137-09 differs from the package installed on the system.
Version of package SUNWkvm from directory SUNWkvm.m in patch 137137-09 differs from the package installed on the system.
Version of package SUNWkvm from directory SUNWkvm.u in patch 137137-09 differs from the package installed on the system.
Architecture for package SUNWnxge from directory SUNWnxge.u in patch 137137-09 differs from the package installed on the system.
Version of package SUNWcakr from directory SUNWcakr.us in patch 137137-09 differs from the package installed on the system.
Version of package SUNWcar from directory SUNWcar.us in patch 137137-09 differs from the package installed on the system.
Version of package SUNWkvm from directory SUNWkvm.us in patch 137137-09 differs from the package installed on the system.
Done!

The following requested patches have packages not installed on the system
Package SUNWcpr from directory SUNWcpr.u in patch 137137-09 is not installed on the system. Changes for package SUNWcpr will not be applied to the system.
Package SUNWefc from directory SUNWefc.u in patch 137137-09 is not installed on the system. Changes for package SUNWefc will not be applied to the system.
Package SUNWfruip from directory SUNWfruip.u in patch 137137-09 is not installed on the system. Changes for package SUNWfruip will not be applied to the system.
Package SUNWluxd from directory SUNWluxd.u in patch 137137-09 is not installed on the system. Changes for package SUNWluxd will not be applied to the system.
Package SUNWs8brandr from directory SUNWs8brandr in patch 137137-09 is not installed on the system. Changes for package SUNWs8brandr will not be applied to the system.
Package SUNWs8brandu from directory SUNWs8brandu in patch 137137-09 is not installed on the system. Changes for package SUNWs8brandu will not be applied to the system.
Package SUNWs9brandr from directory SUNWs9brandr in patch 137137-09 is not installed on the system. Changes for package SUNWs9brandr will not be applied to the system.
Package SUNWs9brandu from directory SUNWs9brandu in patch 137137-09 is not installed on the system. Changes for package SUNWs9brandu will not be applied to the system.
Package SUNWus from directory SUNWus.u in patch 137137-09 is not installed on the system. Changes for package SUNWus will not be applied to the system.
Package SUNWefc from directory SUNWefc.us in patch 137137-09 is not installed on the system. Changes for package SUNWefc will not be applied to the system.
Package SUNWluxd from directory SUNWluxd.us in patch 137137-09 is not installed on the system. Changes for package SUNWluxd will not be applied to the system.
Package FJSVvplr from directory FJSVvplr.u in patch 137137-09 is not installed on the system. Changes for package FJSVvplr will not be applied to the system.
Package FJSVvplr from directory FJSVvplr.us in patch 137137-09 is not installed on the system. Changes for package FJSVvplr will not be applied to the system.

Checking patches that you specified for installation.

Done!

Approved patches will be installed in this order:

137137-09

Checking installed patches...
Executing prepatch script...
Verifying sufficient filesystem capacity (dry run method)...
Dec 2 10:05:58 cyrus2-2 cfenvd[706]: LDT(3) in loadavg chi = 19.18 thresh 11.58
Dec 2 10:05:58 cyrus2-2 cfenvd[706]: LDT_BUF (loadavg): Rot 2.00 2.00 2.00 2.00 36.00 21.00 11.00 21.00 39.00 *92.00*
Installing patch packages...
Dec 2 10:08:28 cyrus2-2 cfenvd[706]: LDT(4) in rootprocs chi = 9.04 thresh 9.02
Dec 2 10:08:28 cyrus2-2 cfenvd[706]: LDT_BUF (rootprocs): Rot 76.00 76.00 76.00 *81.00* 77.00 77.00 80.00 80.00 81.00 *84.00*
Dec 2 10:08:28 cyrus2-2 cfenvd[706]: LDT(4) in loadavg chi = 16.72 thresh 11.58
Dec 2 10:08:28 cyrus2-2 cfenvd[706]: LDT_BUF (loadavg): Rot 2.00 2.00 2.00 36.00 21.00 11.00 21.00 39.00 *92.00* *95.00*
Dec 2 10:10:58 cyrus2-2 cfenvd[706]: LDT(5) in loadavg chi = 20.24 thresh 11.58
Dec 2 10:10:58 cyrus2-2 cfenvd[706]: LDT_BUF (loadavg): Rot 2.00 2.00 36.00 21.00 11.00 21.00 39.00 *92.00* *95.00* *137.00*
Dec 2 10:13:28 cyrus2-2 cfenvd[706]: LDT(6) in loadavg chi = 12.67 thresh 11.58
Dec 2 10:13:28 cyrus2-2 cfenvd[706]: LDT_BUF (loadavg): Rot 2.00 36.00 21.00 11.00 21.00 39.00 *92.00* *95.00* *137.00* *94.00*

Patch 137137-09 has been successfully installed.
See /var/run/.patchSafeMode/root/var/sadm/patch/137137-09/log for details
Tue Dec 2 10:15:01 PST 2008
Executing postpatch script...
Detected SVM root.
Installing bootblk on /dev/rdsk/c0t0d0s0
Installing bootblk on /dev/rdsk/c0t1d0s0
Creating boot_archive for /var/run/.patchSafeMode/root
updating /var/run/.patchSafeMode/root/platform/sun4v/boot_archive

Patch packages installed:
  FJSVcpcu FJSVfmd FJSVhea FJSVmdb FJSVmdbr FJSVpiclu SUNWarc SUNWarcr SUNWcakr SUNWcar SUNWcart200 SUNWckr SUNWcpcu SUNWcry SUNWcsd SUNWcsl SUNWcslr SUNWcsr SUNWcsu SUNWdmgtu SUNWdtrc SUNWefcl SUNWesu SUNWfmd SUNWhea SUNWib SUNWiopc SUNWipfh SUNWipfu SUNWiscsitgtr SUNWiscsitgtu SUNWkvm SUNWkvmt200 SUNWldomr SUNWldomu SUNWmdb SUNWmdbr SUNWmdr SUNWmdu SUNWnfsckr SUNWnfscu SUNWnfsskr SUNWnfssr SUNWnfssu SUNWnisu SUNWniumx SUNWnxge SUNWopenssl-libraries SUNWpcu SUNWpd SUNWpdu SUNWpiclu SUNWpmr SUNWpmu SUNWpppd SUNWrds SUNWrsg SUNWrsgk SUNWsmapi SUNWssad SUNWsshcu SUNWsshdu SUNWsshu SUNWtavor SUNWtoo SUNWudapltu SUNWusb SUNWust1 SUNWust2 SUNWwbsup SUNWxcu4 SUNWypu SUNWzfskr SUNWzfsr SUNWzfsu SUNWzoneu
I don't want to steer you wrong under the circumstances, so I think we need more information.

First, is the failure the same as in the earlier part of this thread? I.e., when you boot, do you get a failure like this?

Warning: Fcode sequence resulted in a net stack depth change of 1
Evaluating:
Evaluating:
The file just loaded does not appear to be executable

Second, at least at first glance, this looks like more of a generic patch problem than a problem specifically related to zfs boot. Since this is S10, not OpenSolaris, perhaps you should be escalating this through the standard support channels. This alias probably won't get you any really useful answers on general problems with patching.

Lori

On 12/02/08 14:42, Vincent Fox wrote:
> The SupportTech responding to case #66153822 so far
> has only suggested "boot from cdrom and patchrm 137137-09"
> which tells me I'm dealing with a level-1 binder monkey.
> It's the idle node of a cluster holding 10K email accounts
> so I'm proceeding cautiously. It is unfortunate the admin doing
> the original patching did them from multi-user but here we are.
>
> I am attempting to boot net:dhcp -s just to collect more info:
>
> My patchadd output shows 138866-01 & 137137-09 being applied OK:
>
> bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/138866-01
> Validating patches...
>
> Loading patches installed on the system...
>
> Done!
>
> Loading patches requested to install.
>
> Done!
>
> Checking patches that you specified for installation.
>
> Done!
>
>
> Approved patches will be installed in this order:
>
> 138866-01
>
>
> Checking installed patches...
> Verifying sufficient filesystem capacity (dry run method)...
> Installing patch packages...
>
> Patch 138866-01 has been successfully installed.
> See /var/sadm/patch/138866-01/log for details
>
> Patch packages installed:
> SUNWcsr
>
> bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/137137-09
> Validating patches...
Vincent Fox
2008-Dec-02 22:10 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
> I don't want to steer you wrong under the circumstances,
> so I think we need more information.
>
> First, is the failure the same as in the earlier part of this
> thread. I.e., when you boot, do you get a failure like this?
>
> Warning: Fcode sequence resulted in a net stack depth change of 1
> Evaluating:
>
> Evaluating:
>
> The file just loaded does not appear to be executable

Nope:

===================================================
Sun Fire T200, No Keyboard
Copyright 2007 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.27.4, 16256 MB memory available, Serial #75621394.
Ethernet address 0:14:4f:81:e4:12, Host ID: 8481e412.

Boot device: /pci@780/pci@0/pci@9/scsi@0/disk@0 File and args:
ufs-file-system
Loading: /platform/SUNW,Sun-Fire-T200/boot_archive
Loading: /platform/sun4v/boot_archive
Can't open boot_archive

Evaluating:
The file just loaded does not appear to be executable.
======================================================

> Second, at least at first glance, this looks like more of
> a generic patch problem than a problem specifically
> related to zfs boot. Since this is S10, not OpenSolaris,
> perhaps you should be escalating this through the
> standard support channels. This alias probably
> won't get you any really useful answers on general
> problems with patching.

Yeah, I just thought that since I'd followed this thread before, it might be useful to add to it, as there might be crossover issues.

I'll keep pushing on the string. I hate being the annoying customer who says "I won't follow your suggestion because (blah) please escalate this ticket."

I hope to move these systems over to 10u6 in a few months and streamline our patching so problems like this won't exist.

-- This message posted from opensolaris.org
Vincent Fox
2008-Dec-03 19:38 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Followup to my own post.

Looks like my SVM setup was having problems before the patch was applied. If I boot net:dhcp -s and poke around on the disks, it looks like disk0 is in its pre-patch state and disk1 is post-patch. I can get a shell if I boot disk1 -s.

So I think I am in SVM hell here, not specifically a case of the ZFS patch breaking my box. Never mind!

-- This message posted from opensolaris.org
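A rough sketch of the kind of poking around described above, for anyone in a similar spot. It assumes each root slice has been mounted read-only somewhere (for example /tmp/a and /tmp/b from the net boot), that /var lives on the root slice, and that an applied patch leaves a directory under var/sadm/patch, as the patchadd log paths earlier in the thread suggest. The function name patch_state is made up for illustration.

```shell
# Report whether a patch's record directory exists under a mounted root.
#   $1 = mount point of a root slice (mounted read-only, e.g. from a net boot)
#   $2 = patch ID, e.g. 137137-09
# patch_state is a hypothetical helper name; it only tests for a directory.
patch_state() {
    if [ -d "$1/var/sadm/patch/$2" ]; then
        echo "$1: $2 present"
    else
        echo "$1: $2 absent"
    fi
}
```

Running `patch_state /tmp/a 137137-09` and `patch_state /tmp/b 137137-09` against the two halves of the mirror would show one "absent" and one "present" if the submirrors have drifted apart like this. It says nothing about why they diverged; metastat is the usual place to look for the state of the mirror itself.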
Jens Elkner
2008-Dec-04 03:18 UTC
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Dec 02, 2008 at 12:22:49PM -0800, Vincent Fox wrote:
> Reviving this thread.
>
> We have a Solaris 10u4 system recently patched with 137137-09.
> Unfortunately the patch was applied from multi-user mode, I wonder if this
> may have been original poster's problem as well? Anyhow we are now stuck

No - in my case it was a 'not enough space' on / problem, not the multi-user mode ;-).

Regards,
jel.
--
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768
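Since the cause here was a full root filesystem, a pre-flight check along these lines might catch it before patching. This is a sketch only: has_free_kb is a made-up name, the threshold is a placeholder (the thread does not say how much space 137137-09 actually needs), and it assumes a df that accepts the POSIX -P option, which on Solaris 10 may mean using /usr/xpg4/bin/df.

```shell
# Succeed if the filesystem holding path $1 has at least $2 KB available.
# Assumes a POSIX df: -P gives one line per filesystem, -k reports in KB,
# so the "Available" figure is field 4 of the second output line.
# has_free_kb is a hypothetical helper name.
has_free_kb() {
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
}
```

For example, `has_free_kb / 500000 || echo "not enough space on /"` would warn when less than roughly 500 MB is free; the number is arbitrary and should be replaced with whatever margin the patch README or local experience suggests.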