I encountered the following ZFS panic today and am looking for suggestions
on how to resolve it.
First panic:
panic[cpu0]/thread=ffffff000fa9cc80: assertion failed: 0 ==
dmu_buf_hold_array(os, object, offset, size, FALSE, FTAG, &numbufs,
&dbp), file: ../../common/fs/zfs/dmu.c, line: 435
ffffff000fa9c8d0 genunix:assfail+7e ()
ffffff000fa9c980 zfs:dmu_write+15c ()
ffffff000fa9ca40 zfs:space_map_sync+27a ()
ffffff000fa9cae0 zfs:metaslab_sync+24f ()
ffffff000fa9cb40 zfs:vdev_sync+b5 ()
ffffff000fa9cbd0 zfs:spa_sync+1e2 ()
ffffff000fa9cc60 zfs:txg_sync_thread+19a ()
ffffff000fa9cc70 unix:thread_start+8 ()
Recursive panic at boot:
panic[cpu3]/thread=ffffff000f818c80: BAD TRAP: type=e (#pf Page fault)
rp=ffffff000f818610 addr=0 occurred in module "unix" due to a NULL
pointer dereference
sched: #pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0xfffffffffb83d24b, sp=0xffffff000f818708, eflags=0x10246
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4:
6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 8c00000 cr8: c
rdi: 0 rsi: 1 rdx: ffffff000f818c80
rcx: 23 r8: 0 r9: fffffffec2a65e80
rax: 0 rbx: 0 rbp: ffffff000f818780
r10: 6ea8f08d984d6b r11: fffffffec2a38000 r12: 2bd40a
r13: 400 r14: 1b0005c8800 r15: fffffffecc992580
fsb: fffffd7fff382000 gsb: fffffffec28aaa80 ds: 0
es: 0 fs: 0 gs: 0
trp: e err: 2 rip: fffffffffb83d24b
cs: 30 rfl: 10246 rsp: ffffff000f818708
ss: 38
ffffff000f8184f0 unix:die+c8 ()
ffffff000f818600 unix:trap+135b ()
ffffff000f818610 unix:_cmntrap+e9 ()
ffffff000f818780 unix:mutex_enter+b ()
ffffff000f8187f0 zfs:metaslab_free+97 ()
ffffff000f818820 zfs:zio_dva_free+29 ()
ffffff000f818840 zfs:zio_next_stage+b3 ()
ffffff000f818860 zfs:zio_gang_pipeline+31 ()
ffffff000f818880 zfs:zio_next_stage+b3 ()
ffffff000f8188d0 zfs:zio_wait_for_children+5d ()
ffffff000f8188f0 zfs:zio_wait_children_ready+20 ()
ffffff000f818910 zfs:zio_next_stage_async+bb ()
ffffff000f818930 zfs:zio_nowait+11 ()
ffffff000f8189c0 zfs:arc_free+174 ()
ffffff000f818a50 zfs:dsl_dataset_block_kill+25f ()
ffffff000f818ad0 zfs:dmu_objset_sync+90 ()
ffffff000f818b40 zfs:dsl_pool_sync+199 ()
ffffff000f818bd0 zfs:spa_sync+1c5 ()
ffffff000f818c60 zfs:txg_sync_thread+19a ()
ffffff000f818c70 unix:thread_start+8 ()
The events that led up to this:
- "data" zpool consists of two 7-disk raidz1 vdevs with 2 hot spares;
  one disk in each vdev was faulted, with hot spares active.
- the faulted disks were cleared because the errors looked suspicious; the
  hot spares were detached and marked as available. zpool status showed
  the pool as normal.
- "data_new" zpool was created, consisting of two 8-disk raidz2 vdevs.
- a recursive zfs send/recv was issued to copy the filesystems and
  snapshots from "data" to "data_new", which immediately triggered the
  first panic.
- the system now refuses to boot, hitting the recursive panic while
  trying to mount the zpools.
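For reference, the migration step was roughly the following (pool names
are from the description above; the snapshot name and the -R recursive
send syntax are assumptions, since the exact command isn't shown and
older builds may require iterating over datasets individually):

```shell
# Hypothetical reconstruction -- "migrate" is an assumed snapshot name.

# Take a recursive snapshot of every dataset in "data".
zfs snapshot -r data@migrate

# Recursively send the snapshot stream and receive it into "data_new".
# This is the step that immediately triggered the first panic.
zfs send -R data@migrate | zfs receive -d data_new
```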
Bill Sommerfeld on #opensolaris suggested that this may be related
to bug ID 6393634, but that bug doesn't offer any suggestion on how to
recover. I'm guessing I can boot from DVD and remove zpool.cache
in order to get the machine to boot, but I may not be able to import
the affected pool afterward. I'd like to save it if possible.
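In case it helps frame the question, the recovery path I have in mind
looks like the sketch below. The cache-file path is the standard Solaris
location; the root device name is only an example, and whether the import
succeeds at all is exactly what I'm unsure about:

```shell
# Boot from the install DVD (or failsafe), then mount the root slice and
# move the cache file aside so ZFS doesn't try to auto-import the pools
# at boot time. The device name here is an example, not my actual disk.
mount /dev/dsk/c0t0d0s0 /a
mv /a/etc/zfs/zpool.cache /a/etc/zfs/zpool.cache.bad
reboot

# After the machine comes up, attempt a manual import of the damaged pool.
zpool import data
```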
Any help would be appreciated.
-phillip