After an error I had to press the reset button on my ZFS-root-based
sxce.b99 system.

The system did not come up again!

I tried a failsafe reboot; it works, but I cannot mount rpool/ROOT on /a.
I did a zpool scrub rpool and it has no known data errors.

How can I access the data on this ZFS disk?
I wouldn't mind a reinstall, but I really want to save the data in my home
directories. I can't understand why the grub boot menu does not find the
rpool/ROOT anymore. It should be mounted on (legacy) .alt.tmp.b-yh.mnt/

Is this problem solvable? I -do- hope so!

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
dick hoogendijk wrote:
> After an error I had to press the reset button on my ZFS root filesystem
> based sxce.b99
>
> The system did not come up again!

please elaborate - what does the system do precisely?

Michael

> I tried a failsafe reboot; it works but I cannot mount rpool/ROOT on /a
> I did a zpool scrub rpool and it has no known data errors.
>
> How can I access the data on this ZFS disk?
> I wouldn't mind a reinstall, but I really want to save the data in my home
> directories. I can't understand why the grub boot menu does not find the
> rpool/ROOT anymore. It should be mounted on (legacy) .alt.tmp.b-yh.mnt/
>
> Is this problem solvable? I -do- hope so!

-- 
Michael Schuster  http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
michael schuster wrote:
> dick hoogendijk wrote:
>> After an error I had to press the reset button on my ZFS root filesystem
>> based sxce.b99
>>
>> The system did not come up again!
>
> please elaborate - what does the system do precisely?

The system hangs (forever) on the first screen with the SunOS copyright
message. (Just those three lines of text.) Nothing happens after that.
It stopped booting ;-(

Normally I have a few BEs on that disk and some snapshots.
Right now, I only have -one- BE: rpool/ROOT/snv99

So I also have just three options in the grub menu (normal boot; xVM; and
failsafe). The latter works, but I cannot rw-mount rpool/ROOT/snv99 as
'root' on /a (/a is a read-only filesystem, it says; rpool/ROOT/snv99 is
supposed to be mounted on .alt.tmp.b-yh.mnt/ on a normal boot).

Can this system be made bootable again (repaired)?
If not, how can I get access to the data again?

What could have happened? I only pressed the reset button because the
system froze while working in Gnome. It never came up again ;-(

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
dick hoogendijk wrote:
> michael schuster wrote:
>> please elaborate - what does the system do precisely?
>
> The system hangs (forever) on the first screen with the SunOS copyright
> message. (Just those three lines of text). Nothing happens after that.
> It stopped booting ;-(

Please add -kv to the end of your kernel$ line in grub (or the first
module$ line for an xVM kernel) and boot again. Then you'll be able to
tell us where the system has appeared to hang, and we in turn will be
able to give you something approaching assistance.

> Normally I have a few BEs on that disk and some snapshots.
> Right now, I only have -one- BE: rpool/ROOT/snv99
>
> So I also have just three options in the grub menu (normal boot; xVM; and
> failsafe). The latter works, but I cannot rw-mount rpool/ROOT/snv99 as
> 'root' on /a (/a is a read-only filesystem it says; rpool/ROOT/snv99 is
> supposed to be mounted on .alt.tmp.b-yh.mnt/ on normal booting)

have you tried

  mount -F zfs rpool/ROOT/snv_99 /a

What do

  zpool status -v rpool

and

  zpool get all rpool

report?

> Can this system be made bootable again (repaired)?

We don't know, there's still not quite enough information.

> If not, how can I get access to the data again?
> What could have happened?

Lots of things. We can't channel ....

James C. McPherson
-- 
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp  http://www.jmcp.homeunix.com/blog
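For the archives: the kernel$ edit James suggests is done either in GRUB's
edit mode (press 'e' on the menu entry) or in the pool's menu.lst. The
entry below is a rough sketch of what a ZFS-root sxce entry looks like,
not a copy of Dick's actual menu; only the trailing -kv is the change:

```
title Solaris Express Community Edition snv_99 X86
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/snv99
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -kv
module$ /platform/i86pc/$ISADIR/boot_archive
```

The -kv flags make the kernel boot verbosely and drop into kmdb on a
panic, so the last message printed narrows down where the hang is.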
James C. McPherson wrote:
> Lots of things. We can't channel ....

A nice way for a smiley.. Thanks for the mail so far. I'm in Holland and
at work right now (13.45h), but I'll check out your suggestions as soon
as I get home.

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
James C. McPherson wrote:
> Please add -kv to the end of your kernel$ line in
> grub,

#GRUB kernel$ add -kv
cmdk0 at ata0 target 0 lun 0
cmdk0 is /pci@0,0/pci_ide@4/ide@0/cmdk0,0
### end; the machine hangs here

> have you tried
> mount -F zfs rpool/ROOT/snv_99 /a

# mount -F zfs rpool/ROOT/snv99 /a   (*)
filesystem 'rpool/ROOT/snv99' cannot be mounted using 'mount -F zfs'
Use 'zfs set mountpoint=/a' instead.
If you must use 'mount -F zfs' or /etc/vfstab, use 'zfs set
mountpoint=legacy'.
See zfs(1M) for more information.

> zpool status -v rpool

# zpool status -v rpool
  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME      STATE   READ WRITE CKSUM
        rpool     ONLINE     0     0     0
          c0d0s0  ONLINE     0     0     0

> zpool get all rpool

# zpool get all rpool
NAME   PROPERTY       VALUE             SOURCE
rpool  size           148G              -
rpool  used           61.1G             -
rpool  available      89.9G             -
rpool  capacity       41%               -
rpool  altroot        /a                local
rpool  health         ONLINE            -
rpool  guid           936248.......     -
rpool  version        13                -
rpool  bootfs         rpool/ROOT/snv99  local
rpool  delegation     on                default
rpool  autoreplace    off               default
rpool  cachefile      none              local
rpool  failmode       continue          local
rpool  listsnapshots  off               default

(*) If I choose not to mount the boot env rpool/ROOT/snv99 rw on /a, I
cannot do a thing in my failsafe login. The dataset does not exist then.
I have to say 'y' to the question; I get some error msgs that it fails,
but then at least I can see the pool.

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
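For readers hitting the same mount error: it offers two routes, sketched
below. These commands assume a failsafe shell with rpool already imported
(altroot /a) and are not verified here; note that Option 1 changes the
persistent mountpoint property, which you would want to repair before a
normal reboot.

```shell
# Option 1: let ZFS manage the mount (as the error message suggests)
zfs set mountpoint=/a rpool/ROOT/snv99
zfs mount rpool/ROOT/snv99
# ...remember to set the property back afterwards, e.g.:
#   zfs set mountpoint=/ rpool/ROOT/snv99

# Option 2: make it a legacy filesystem so mount -F zfs works
zfs set mountpoint=legacy rpool/ROOT/snv99
mount -F zfs rpool/ROOT/snv99 /a
```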
It would also be useful to see the output of `zfs list` and
`zfs get all rpool/ROOT/snv_99` while booted from the failsafe archive.

- lori

dick hoogendijk wrote:
> James C. McPherson wrote:
>> Please add -kv to the end of your kernel$ line in
>> grub,
>
> #GRUB kernel$ add -kv
> cmdk0 at ata0 target 0 lun 0
> cmdk0 is /pci@0,0/pci_ide@4/ide@0/cmdk0,0
> ### end the machine hangs here
>
> [rest of the quoted command output snipped]
Lori Alt wrote:
> It would also be useful to see the output of `zfs list`
> and `zfs get all rpool/ROOT/snv_99` while
> booted from the failsafe archive.

# zfs list
rpool              69.0G  76.6G  40K    /a/rpool
rpool/ROOT         22.7G         18K    legacy
rpool/ROOT/snv99   22.7G         10.8G  /a/.alt.tmp.b-yh.mnt/
rpool/dump         1.50G         1.50G  -
rpool/export       36.6G         796M   /a/export
rpool/export/home  36.8G         35.8G  /a/export/home
rpool/local        234M          234M   /a/usr/local
rpool/swap         8G     84.6G  6.74G  -

# zfs get all rpool/ROOT/snv99
type                  filesystem
creation              wed ...
used                  22.7G
available             76.6G
referenced            10.8G
compressratio         1.00x
mounted               no
quota                 none
reservation           none
recordsize            128K
mountpoint            /a/.alt.tmp.b-yh.mnt/
sharefs               off
checksum              on
compression           off
atime                 on
devices               on
exec                  on
setuid                on
readonly              off
zoned                 off
snapdir               hidden
aclmode               groupmask
aclinherit            restricted
canmount              noauto
shareiscsi            off
xattr                 on
copies                1
version               3
utf8only              off
normalization         none
casesensitive         sensitive
vscan                 off
nbmand                off
sharesmb              off
refquota              none
refreservation        none
primarycache          all
secondarycache        all
usedbysnapshots       11.9G
usedbydataset         10.8G
usedbychildren        0
usedbyrefreservation  0

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
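The stale Live Upgrade mountpoint stands out in that listing. As a side
note, a one-liner like the following can flag any dataset still pointing
at an LU temporary mountpoint (.alt.tmp.*); the sample lines are inlined
from the output above so the check can be tried without a ZFS system, but
on a live box you would feed it `zfs list -H -o name,mountpoint` instead:

```shell
# Print dataset names whose mountpoint (last field) matches .alt.tmp
printf '%s\n' \
  'rpool 69.0G 76.6G 40K /a/rpool' \
  'rpool/ROOT 22.7G 18K legacy' \
  'rpool/ROOT/snv99 22.7G 10.8G /a/.alt.tmp.b-yh.mnt/' \
  'rpool/export 36.6G 796M /a/export' |
awk '$NF ~ /\.alt\.tmp/ { print $1 }'
# prints: rpool/ROOT/snv99
```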
dick hoogendijk wrote:
> Lori Alt wrote:
>> It would also be useful to see the output of `zfs list`
>> and `zfs get all rpool/ROOT/snv_99` while
>> booted from the failsafe archive
>
> # zfs list
> rpool             69.0G  76.6G  40K    /a/rpool
> rpool/ROOT        22.7G         18K    legacy
> rpool/ROOT/snv99  22.7G         10.8G  /a/.alt.tmp.b-yh.mnt/

Well, this is clearly a problem. I've seen it before (there are some bugs
in LU that can leave the root dataset of a BE in this state), but I've
always seen the system boot anyway, because the zfs_mountroot code in the
kernel appeared to ignore the value of the mountpoint property and just
mounted the dataset at "/" anyway.

Since I don't fully understand the problem, I can't be sure this will
work, but I'm pretty sure it won't hurt: try setting the mountpoint of
the dataset to "/":

  zfs set mountpoint=/ rpool/ROOT/snv99

I do mean "/", not "/a". After executing the command, the mountpoint
shown with `zfs list` should be "/a" (in the zfs list output, the current
alternate root for the pool is prepended to the persistent mount point of
the datasets to show the effective mount point).

Then reboot and see if your problem is solved. If not, we'll dig deeper
with kmdb into what's happening.

Lori

> [rest of the quoted zfs output snipped]
Lori Alt wrote:
> Since I don't fully understand the problem, I can't
> be sure this will work, but I'm pretty sure it won't
> hurt: try setting the mountpoint of the dataset to "/":
>
>   zfs set mountpoint=/ rpool/ROOT/snv99
>
> Then reboot and see if your problem is solved. If not,
> we'll dig deeper with kmdb into what's happening.

The reboot is still the same. The machine still hangs.

After that I did a failsafe reboot to see if ROOT/snv99 was mounted now.
This is the message:

  ROOT/snv99 was found on rpool.
  Do you wish to have it mounted read-write on /a? [y,n,?] y
  mounting rpool on /a
  [...]

That was ten minutes ago. Nothing happens after this message. There seems
to be some kind of resemblance with 'normal' booting. ROOT/snv99 seems OK
(zpool scrub, status, etc. say so). Still, nothing happens with this BE.

We have to dig deeper with kmdb. But before we do that, tell me please:
what is an easy way to transfer the messages from the failsafe login on
the problematic machine to e.g. this S10u5 server? All former screen
output had to be typed in by hand. I didn't know of another way.

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE
++ http://nagual.nl/ | SunOS 10u5 05/08 ++
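One hedged way to avoid retyping output from a failsafe shell is to
redirect it to a file on the mounted pool and then copy it over the
network. This sketch assumes /a is mounted writable and that the failsafe
miniroot has a usable NIC driver; the interface name and addresses below
are purely illustrative:

```shell
# Capture the output to a file on the pool instead of the console
zpool status -v rpool > /a/var/tmp/zpool-status.txt 2>&1

# Bring up an interface in the miniroot (names/addresses are examples)
ifconfig e1000g0 plumb
ifconfig e1000g0 192.168.1.50 netmask 255.255.255.0 up

# Copy the file to another machine (use ftp if the miniroot lacks scp)
scp /a/var/tmp/zpool-status.txt user@192.168.1.10:/tmp/
```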
On Mon, Oct 13, 2008 at 10:25 PM, dick hoogendijk <dick at nagual.nl> wrote:
> We have to dig deeper with kmdb. But before we do that, tell me please
> what is an easy way to transfer the messages from the failsafe login on
> the problematic machine to i.e. this S10u5 server. All former screen
> output had to be typed in by hand. I didn''t know of another way.

If you say "no" to mounting the pool on /a, does it still hang? Just to
ask the obvious question: did you try to press ENTER or anything else
where it was hanging?

What build are you booting into failsafe mode? Something older, or b99?

Do you have a build-99 DVD to boot from, from which you can get a proper
running system with networking, etc?

-- 
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com
On Mon, 13 Oct 2008 12:15:50 -0600
Lori Alt <Lori.Alt at Sun.COM> wrote:

> Then reboot and see if your problem is solved. If not,
> we'll dig deeper with kmdb into what's happening.

Just for the record: we were not able to solve the problem. The ROOT
remained missing from my snv_99 boot disk. I was able to access all
datasets -except- the root file system. I saved the data of /export and
/export/home, but I lost /opt/sfw and have to recompile all those
programs I had in there; if only it had been a separate dataset ;-)

mountable:
  rpool
  rpool/export
  rpool/export/home
  rpool/ROOT/snv_99 (contains just two dirs: /etc and /boot)

UNmountable:
  rpool/ROOT (changing "legacy" did not help)

So I gave up on this one. I have no idea why or how the system became
inaccessible. All datasets turned out healthy. Alas, all but one were
accessible, and the one that was not was the one needed to boot ;-)

I did a reinstall from scratch off the snv_99 DVD. At first it did not
work: it hung after checking both network cards. I reformatted the hard
drive to UFS with an (older) SXDE DVD and started the installation from
the snv_99 DVD again. This time all went well, and I'm now sitting
behind a brand new ZFS nevada.b99 system with all other datasets
restored.

A little more worried about ZFS than I was before, but I still use it.
It is -SO- easy once you're used to it ;-)

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
++ http://nagual.nl/ + SunOS sxce snv99 ++
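A closing lesson from this thread: data that is expensive to rebuild (like
/opt/sfw here) is safer on its own dataset, outside the boot environment,
so losing a BE does not take it along. A sketch, with dataset and host
names that are illustrative rather than taken from the thread:

```shell
# Keep /opt/sfw outside the boot environment on its own dataset
zfs create -o mountpoint=/opt/sfw rpool/sfw

# Snapshot it periodically so it can be replicated off the pool
zfs snapshot rpool/sfw@weekly
zfs send rpool/sfw@weekly | ssh backuphost zfs receive -F tank/sfw
```

With a layout like this, even a full reinstall only has to recreate the
BE; the separate datasets can simply be imported or received back.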