Hi,

I am having a problem where I cannot boot my OSOL 2009.06 laptop; it is stuck in a reboot loop. I tried booting from a live CD, but I am unable to import the rpool: the laptop reboots every time I issue the force import command. This happens even though the status of the rpool is ONLINE:

jack@opensolaris:~# zpool import
  pool: rpool
    id: 2811861170192059477
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        rpool       ONLINE
          c6d0s0    ONLINE

Any assistance would be great; I need to get some files from the laptop.

TIA
--
This message posted from opensolaris.org

I redirected the console to the serial port and managed to capture the panic information below:

SunOS Release 5.11 Version snv_111b 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

panic[cpu0]/thread=ffffff0007c39c60: mutex_enter: bad mutex, lp=e8 owner=f000ec62f000ec60 thread=ffffff0007c39c60

ffffff0007c38ca0 unix:mutex_panic+73 ()
ffffff0007c38d00 unix:mutex_vector_enter+446 ()
ffffff0007c38d50 genunix:kmem_slab_alloc+31 ()
ffffff0007c38db0 genunix:kmem_cache_alloc+130 ()
ffffff0007c38dd0 zfs:zio_buf_alloc+2c ()
ffffff0007c38e10 zfs:arc_get_data_buf+173 ()
ffffff0007c38e60 zfs:arc_buf_alloc+a2 ()
ffffff0007c38f00 zfs:arc_read_nolock+137 ()
ffffff0007c38fa0 zfs:arc_read+75 ()
ffffff0007c390d0 zfs:scrub_visitbp+161 ()
ffffff0007c391e0 zfs:scrub_visitbp+27c ()
ffffff0007c392f0 zfs:scrub_visitbp+21d ()
ffffff0007c39400 zfs:scrub_visitbp+21d ()
ffffff0007c39510 zfs:scrub_visitbp+21d ()
ffffff0007c39620 zfs:scrub_visitbp+21d ()
ffffff0007c39730 zfs:scrub_visitbp+21d ()
ffffff0007c39840 zfs:scrub_visitbp+432 ()
ffffff0007c39890 zfs:scrub_visit_rootbp+4f ()
ffffff0007c398f0 zfs:scrub_visitds+7e ()
ffffff0007c39aa0 zfs:dsl_pool_scrub_sync+126 ()
ffffff0007c39b10 zfs:dsl_pool_sync+192 ()
ffffff0007c39ba0 zfs:spa_sync+32a ()
ffffff0007c39c40 zfs:txg_sync_thread+265 ()
ffffff0007c39c50 unix:thread_start+8 ()

skipping system dump - no dump device configured
rebooting...

Can anyone tell me what is going wrong?

When I boot from a snv_133 live CD and attempt to import the rpool, it panics with this output:

Sun Microsystems Inc. SunOS 5.11 snv_133 February 2010

jack@opensolaris:~$ pfexec su
Mar 9 03:11:37 opensolaris su: 'su root' succeeded for jack on /dev/console
jack@opensolaris:~# zpool import -f -o ro -o failmode=continue -R /mnt rpool

panic[cpu1]/thread=ffffff00086e0c60: BAD TRAP: type=e (#pf Page fault) rp=ffffff00086dfe60 addr=278 occurred in module "unix" due to a NULL pointer dereference

sched: #pf Page fault
Bad kernel fault at addr=0x278
pid=0, pc=0xfffffffffb862b6b, sp=0xffffff00086dff58, eflags=0x10246
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 278 cr3: c800000 cr8: c

rdi: 278 rsi: 4 rdx: ffffff00086e0c60
rcx: 0 r8: 40 r9: 21d9a
rax: 0 rbx: 0 rbp: ffffff00086dffb0
r10: 7f6fc8 r11: 6e r12: 0
r13: 278 r14: 4 r15: ffffff01cfe27e08
fsb: 0 gsb: ffffff01ccfa5080 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: e err: 2 rip: fffffffffb862b6b
cs: 30 rfl: 10246 rsp: ffffff00086dff58
ss: 38

ffffff00086dfd40 unix:die+dd ()
ffffff00086dfe50 unix:trap+177e ()
ffffff00086dfe60 unix:cmntrap+e6 ()
ffffff00086dffb0 unix:mutex_enter+b ()
ffffff00086dffd0 zfs:zio_buf_alloc+2c ()
ffffff00086e0010 zfs:arc_get_data_buf+173 ()
ffffff00086e0060 zfs:arc_buf_alloc+a2 ()
ffffff00086e0100 zfs:arc_read_nolock+12f ()
ffffff00086e01a0 zfs:arc_read+75 ()
ffffff00086e0230 zfs:scrub_prefetch+b9 ()
ffffff00086e02f0 zfs:scrub_visitbp+5f1 ()
ffffff00086e03b0 zfs:scrub_visitbp+6e3 ()
ffffff00086e0470 zfs:scrub_visitbp+6e3 ()
ffffff00086e0530 zfs:scrub_visitbp+6e3 ()
ffffff00086e05f0 zfs:scrub_visitbp+6e3 ()
ffffff00086e06b0 zfs:scrub_visitbp+6e3 ()
ffffff00086e0750 zfs:scrub_visitdnode+84 ()
ffffff00086e0810 zfs:scrub_visitbp+1a6 ()
ffffff00086e0860 zfs:scrub_visit_rootbp+4f ()
ffffff00086e08c0 zfs:scrub_visitds+7e ()
ffffff00086e0a80 zfs:dsl_pool_scrub_sync+163 ()
ffffff00086e0af0 zfs:dsl_pool_sync+25b ()
ffffff00086e0ba0 zfs:spa_sync+36f ()
ffffff00086e0c40 zfs:txg_sync_thread+24a ()
ffffff00086e0c50 unix:thread_start+8 ()

Found a site that recommended setting the following system file entries:

set zfs:zfs_recover=1
set aok=1

and running this command:

zdb -e -bcsvL rpool

but I get the following error:

Traversing all blocks to verify checksums ...
out of memory -- generating core dump
Abort

The laptop has 4GB of memory, and I did not see memory utilization pass 400MB.

Hi D,

Is this a 32-bit system?

We were looking at your panic messages and they seem to indicate a problem with memory and not necessarily a problem with the pool or the disk. Your previous zpool status output also indicates that the disk is okay.

Maybe someone with similar recent memory problems can advise.

Thanks,

Cindy

On 03/09/10 09:15, D. Pinnock wrote:
> When I boot from a snv133 live cd and attempt to import the rpool it panics with this output:
>
> Sun Microsystems Inc. SunOS 5.11 snv_133 February 2010
>
> jack@opensolaris:~$ pfexec su
> Mar 9 03:11:37 opensolaris su: 'su root' succeeded for jack on /dev/console
> jack@opensolaris:~# zpool import -f -o ro -o failmode=continue -R /mnt rpool
>
> panic[cpu1]/thread=ffffff00086e0c60: BAD TRAP: type=e (#pf Page fault) rp=ffffff00086dfe60 addr=278 occurred in module "unix" due to a NULL pointer dereference
>
> sched: #pf Page fault
> Bad kernel fault at addr=0x278
> pid=0, pc=0xfffffffffb862b6b, sp=0xffffff00086dff58, eflags=0x10246
> cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
> cr2: 278 cr3: c800000 cr8: c
>
> rdi: 278 rsi: 4 rdx: ffffff00086e0c60
> rcx: 0 r8: 40 r9: 21d9a
> rax: 0 rbx: 0 rbp: ffffff00086dffb0
> r10: 7f6fc8 r11: 6e r12: 0
> r13: 278 r14: 4 r15: ffffff01cfe27e08
> fsb: 0 gsb: ffffff01ccfa5080 ds: 4b
> es: 4b fs: 0 gs: 1c3
> trp: e err: 2 rip: fffffffffb862b6b
> cs: 30 rfl: 10246 rsp: ffffff00086dff58
> ss: 38
>
> ffffff00086dfd40 unix:die+dd ()
> ffffff00086dfe50 unix:trap+177e ()
> ffffff00086dfe60 unix:cmntrap+e6 ()
> ffffff00086dffb0 unix:mutex_enter+b ()
> ffffff00086dffd0 zfs:zio_buf_alloc+2c ()
> ffffff00086e0010 zfs:arc_get_data_buf+173 ()
> ffffff00086e0060 zfs:arc_buf_alloc+a2 ()
> ffffff00086e0100 zfs:arc_read_nolock+12f ()
> ffffff00086e01a0 zfs:arc_read+75 ()
> ffffff00086e0230 zfs:scrub_prefetch+b9 ()
> ffffff00086e02f0 zfs:scrub_visitbp+5f1 ()
> ffffff00086e03b0 zfs:scrub_visitbp+6e3 ()
> ffffff00086e0470 zfs:scrub_visitbp+6e3 ()
> ffffff00086e0530 zfs:scrub_visitbp+6e3 ()
> ffffff00086e05f0 zfs:scrub_visitbp+6e3 ()
> ffffff00086e06b0 zfs:scrub_visitbp+6e3 ()
> ffffff00086e0750 zfs:scrub_visitdnode+84 ()
> ffffff00086e0810 zfs:scrub_visitbp+1a6 ()
> ffffff00086e0860 zfs:scrub_visit_rootbp+4f ()
> ffffff00086e08c0 zfs:scrub_visitds+7e ()
> ffffff00086e0a80 zfs:dsl_pool_scrub_sync+163 ()
> ffffff00086e0af0 zfs:dsl_pool_sync+25b ()
> ffffff00086e0ba0 zfs:spa_sync+36f ()
> ffffff00086e0c40 zfs:txg_sync_thread+24a ()
> ffffff00086e0c50 unix:thread_start+8 ()

On 03/ 9/10 10:53 AM, Cindy Swearingen wrote:
> Hi D,
>
> Is this a 32-bit system?
>
> We were looking at your panic messages and they seem to indicate a
> problem with memory and not necessarily a problem with the pool or
> the disk. Your previous zpool status output also indicates that the
> disk is okay.
>

To perhaps clarify, you're panicking trying to grab a mutex, which hints that something has stomped on the memory containing that mutex.

The reason for the 32-bit question is that sometimes a deep stack can overrun on a 32-bit box. That's probably not what happened here, but we ask anyway.

-tim

My laptop is a 64-bit system:

Dell Latitude D630
Intel Core2 Duo Processor T7100
4GB RAM

So I was back on it again today, following this thread:

http://opensolaris.org/jive/thread.jspa?threadID=70205&tstart=15

and got the following error when I ran this command:

zdb -e -bb rpool

Traversing all blocks to verify nothing leaked ...
Assertion failed: c < SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, file ../../../uts/common/fs/zfs/zio.c, line 203, function zio_buf_alloc
Abort (core dumped)
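
A note on that assertion, since zio_buf_alloc is also the frame directly above the failing mutex_enter in both panics: the check reads as a bounds test on an index c into a table of buffer caches. The sketch below is only a model of that test, not the actual zio.c code. It assumes c is derived from the requested size as (size - 1) >> SPA_MINBLOCKSHIFT, and it assumes the constants of that era (512-byte minimum block, 128K maximum block); the function names zio_buf_index and zio_buf_index_ok are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, not taken from this thread: 512-byte min, 128K max block. */
#define SPA_MINBLOCKSHIFT 9
#define SPA_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSIZE  ((size_t)1 << SPA_MAXBLOCKSHIFT)

/* Number of per-size caches implied by the assertion text (256 here). */
#define ZIO_BUF_CACHES    (SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT)

/* Hypothetical helper: map a requested buffer size to a cache index. */
size_t zio_buf_index(size_t size)
{
    /* size == 0 wraps around to a huge value because size_t is unsigned. */
    return (size - 1) >> SPA_MINBLOCKSHIFT;
}

/* Hypothetical helper: the condition the failed assertion was checking. */
bool zio_buf_index_ok(size_t size)
{
    return zio_buf_index(size) < ZIO_BUF_CACHES;
}
```

Under these assumptions a 512-byte request maps to index 0 and a 131072-byte request to index 255, while a size of 0 or anything above 131072 falls outside the table. If the kernel build does not enforce the same check, an out-of-range index could pick up a NULL or garbage cache pointer, which would be consistent with the bad mutex and the fault at addr=0x278. That in turn would point at a block pointer on disk carrying a bogus size rather than at bad RAM, but this is a guess from the traces, not a confirmed diagnosis.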