My pool panicked while updating to Lucid Lynx hosted inside an iSCSI
LUN, and now it won't come back up. I have dedup and compression on.
These are my current findings:
* iostat -En won't list 8 of my disks
* zdb lists all my disks except my cache device
* The following commands panic the box in single-user mode: format, zfs, zpool
and zdb -l. A multi-user boot panics before reading the ZFS config.
* Unplugging all devices belonging to the pool brings the host up to multi-user
mode, where my pool is listed as UNAVAIL.
I've searched the net for ways to extract information that might be
of use; the output is included below.
I suspect the problem has something to do with the DDT (dedup table).
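One thing that seems to support this: the trap in the dump below reports addr=30 with rdi=0 inside ddt_phys_decref. Here is a minimal sketch of the arithmetic, assuming ddt_phys_t is laid out as I remember it from the public ZFS headers (three 16-byte DVAs, then ddp_refcnt, then ddp_phys_birth). The ctypes classes and the fault_address helper are my own illustration, not anything taken from the snv_134 source, so treat the layout as an assumption.

```python
import ctypes

# Assumption: number of DVAs per block pointer, as in the public spa.h.
SPA_DVAS_PER_BP = 3


class Dva(ctypes.Structure):
    # Assumed mirror of dva_t: two 64-bit words, 16 bytes in total.
    _fields_ = [("dva_word", ctypes.c_uint64 * 2)]


class DdtPhys(ctypes.Structure):
    # Assumed mirror of ddt_phys_t from the public ddt.h.
    _fields_ = [
        ("ddp_dva", Dva * SPA_DVAS_PER_BP),
        ("ddp_refcnt", ctypes.c_uint64),
        ("ddp_phys_birth", ctypes.c_uint64),
    ]


def fault_address(base, field):
    """Return the address touched when `field` is accessed through `base`."""
    return base + getattr(DdtPhys, field).offset


if __name__ == "__main__":
    # A NULL ddt_phys_t pointer (rdi=0) plus the offset of ddp_refcnt.
    print(hex(fault_address(0, "ddp_refcnt")))
```

If that layout is right, decrementing ddp_refcnt through a NULL pointer would touch address 0x30, which matches cr2 in the panic. That would point at a missing DDT entry rather than at the disks themselves, but I can't confirm it from the dump alone.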
Best Regards
Michael
zdb output:
rpool:
    version: 22
    name: 'rpool'
    state: 0
    txg: 10643295
    pool_guid: 16751367988873007995
    hostid: 13336047
    hostname: ''
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16751367988873007995
        children[0]:
            type: 'mirror'
            id: 0
            guid: 6639969804249231424
            whole_disk: 0
            metaslab_array: 23
            metaslab_shift: 31
            ashift: 9
            asize: 250956742656
            is_log: 0
            children[0]:
                type: 'disk'
                id: 0
                guid: 14476065696483338328
                path: '/dev/dsk/c14d0s0'
                devid: 'id1,cmdk@AWDC_WD2500YD-01NVB1=_____WD-WCANK4006148/a'
                phys_path: '/pci@0,0/pci10de,75a@8/pci-ide@9/ide@0/cmdk@0,0:a'
                whole_disk: 0
                DTL: 78
            children[1]:
                type: 'disk'
                id: 1
                guid: 10422182008705867883
                path: '/dev/dsk/c16d0s0'
                devid: 'id1,cmdk@AWDC_WD2500YD-01NVB1=_____WD-WCANK5135915/a'
                phys_path: '/pci@0,0/pci10de,75a@8/pci-ide@9/ide@1/cmdk@0,0:a'
                whole_disk: 0
                DTL: 173
tank:
    version: 22
    name: 'tank'
    state: 0
    txg: 36636297
    pool_guid: 10904371515657913150
    hostid: 13336047
    hostname: 'zen'
    vdev_children: 3
    vdev_tree:
        type: 'root'
        id: 0
        guid: 10904371515657913150
        children[0]:
            type: 'raidz'
            id: 0
            guid: 4940983256616168565
            nparity: 1
            metaslab_array: 23
            metaslab_shift: 32
            ashift: 9
            asize: 2560443285504
            is_log: 0
            children[0]:
                type: 'disk'
                id: 0
                guid: 7633768960477747795
                path: '/dev/dsk/c13t4d0s0'
                devid: 'id1,sd@SATA_____WDC_WD6400AACS-0_____WD-WCAUF0933938/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@4,0:a'
                whole_disk: 1
                DTL: 4268
            children[1]:
                type: 'disk'
                id: 1
                guid: 12141479741527311128
                path: '/dev/dsk/c13t5d0s0'
                devid: 'id1,sd@SATA_____WDC_WD6400AACS-0_____WD-WCAUF0934597/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@5,0:a'
                whole_disk: 1
                DTL: 4267
            children[2]:
                type: 'disk'
                id: 2
                guid: 7952488001712683172
                path: '/dev/dsk/c13t6d0s0'
                devid: 'id1,sd@SATA_____WDC_WD6400AACS-0_____WD-WCAUF0934679/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@6,0:a'
                whole_disk: 1
                DTL: 4266
            children[3]:
                type: 'disk'
                id: 3
                guid: 535039729687145914
                path: '/dev/dsk/c13t7d0s0'
                devid: 'id1,sd@SATA_____WDC_WD6400AACS-0_____WD-WCAUF0931654/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@7,0:a'
                whole_disk: 1
                DTL: 4265
        children[1]:
            type: 'raidz'
            id: 1
            guid: 6936009139020911476
            nparity: 1
            metaslab_array: 4097
            metaslab_shift: 34
            ashift: 9
            asize: 2000373678080
            is_log: 0
            children[0]:
                type: 'disk'
                id: 0
                guid: 4043674464412192471
                path: '/dev/dsk/c13t3d0s0'
                devid: 'id1,sd@SATA_____SAMSUNG_HD103SI_______S1VSJ90SC22045/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@3,0:a'
                whole_disk: 1
                DTL: 8198
            children[1]:
                type: 'disk'
                id: 1
                guid: 7230587084054299877
                path: '/dev/dsk/c13t1d0s0'
                devid: 'id1,sd@SATA_____WDC_WD5001AALS-0_____WD-WMASY3260051/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@1,0:a'
                whole_disk: 1
                DTL: 4263
            children[2]:
                type: 'disk'
                id: 2
                guid: 10560603583403897619
                path: '/dev/dsk/c13t2d0s0'
                devid: 'id1,sd@SATA_____SAMSUNG_HD103SI_______S1VSJ90SC22634/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@2,0:a'
                whole_disk: 1
                DTL: 12327
            children[3]:
                type: 'disk'
                id: 3
                guid: 1310727864203033402
                path: '/dev/dsk/c13t0d0s0'
                devid: 'id1,sd@SATA_____WDC_WD5001AALS-0_____WD-WMASY3508706/a'
                phys_path: '/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@0,0:a'
                whole_disk: 1
                DTL: 4261
        children[2]:
            type: 'disk'
            id: 2
            guid: 14323860655899304907
            path: '/dev/dsk/c8t0d0s0'
            devid: 'id1,sd@SATA_____INTEL_SSDSA2M080__CVPO003401VT080BGN/a'
            phys_path: '/pci@0,0/pci1043,82e2@9/disk@0,0:a'
            whole_disk: 1
            metaslab_array: 933
            metaslab_shift: 29
            ashift: 9
            asize: 80012902400
            is_log: 1
            DTL: 12330
            create_txg: 36514714
Kernel debug output: (Raw typescript, sorry)
Script started on May  4, 2010 06:22:58 PM CEST
root@zen:~/coredir/foo# mdb -k unix.0 vmcore.0
Loading modules: [ unix genunix specfs mac cpu.generic uppc pcplusmp
scsi_vhci zfs sata sd sockfs ip hook neti sctp arp usba uhci s1394 qlc fctl stmf
md lofs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from zen
operating system: 5.11 snv_134 (i86pc)
panic message: 
BAD TRAP: type=e (#pf Page fault) rp=ffffff000fd16950 addr=30 occurred in module
 "zfs" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
ddt_phys_decref+0xc(0)
zio_ddt_free+0x55(ffffff02d9d1d660)
zio_execute+0x8d(ffffff02d9d1d660)
taskq_thread+0x248(ffffff02c97eb368)
thread_start+8()
> ::msgbuf
MESSAGE                                                               
         48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
        SATA Gen2 signaling speed (3.0Gbps)
        Supported queue depth 32
        capacity = 1250263728 sectors
sd17 at marvell88sx0: target 4 lun 0
sd17 is /pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@4,0
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@4,0 (sd17) online
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1 :
        SATA disk device at port 5
        model WDC WD6400AACS-00G8B0                   
        firmware 05.04C05
        serial number      WD-WCAUF0934597
        supported features:
         48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
        SATA Gen2 signaling speed (3.0Gbps)
        Supported queue depth 32
        capacity = 1250263728 sectors
sd18 at marvell88sx0: target 5 lun 0
sd18 is /pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@5,0
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@5,0 (sd18) online
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1 :
        SATA disk device at port 6
        model WDC WD6400AACS-00G8B0
        firmware 05.04C05
        serial number      WD-WCAUF0934679
        supported features:
         48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
        SATA Gen2 signaling speed (3.0Gbps)
        Supported queue depth 32
        capacity = 1250263728 sectors
sd19 at marvell88sx0: target 6 lun 0
sd19 is /pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@6,0
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@6,0 (sd19) online
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1 :
        SATA disk device at port 7
        model WDC WD6400AACS-00G8B0                   
        firmware 05.04C05
        serial number      WD-WCAUF0931654
        supported features:
         48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
        SATA Gen2 signaling speed (3.0Gbps)
        Supported queue depth 32
        capacity = 1250263728 sectors
sd20 at marvell88sx0: target 7 lun 0
sd20 is /pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@7,0
/pci@0,0/pci10de,77a@13/pci1033,125@0/pci11ab,11ab@1/disk@7,0 (sd20) online
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/keyboard@0 (hid6) offline
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/input@1 (hid7) offline
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/keyboard@0 (hid6) offline
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/input@1 (hid7) offline
/pci@0,0/pci1043,82e2@4,1/hub@2/mouse@1 (hid5) removed
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2 (usb_mid2) removed
/pci@0,0/pci1043,82e2@4,1/hub@2 (hubd1) removed
USB 1.10 device (usb557,2404) operating at low speed (USB 1.x) on USB 1.10 root hub: device@2, usb_mid1 at bus address 3
        ATEN USB 2.0 Switch (4-port)
usb_mid1 is /pci@0,0/pci1043,82e2@4/device@2
/pci@0,0/pci1043,82e2@4/device@2 (usb_mid1) online
USB 1.10 interface (usbif557,2404.config1.0) operating at low speed (USB 1.x) on USB 1.10 root hub: input@0, hid3 at bus address 3
        ATEN USB 2.0 Switch (4-port)
hid3 is /pci@0,0/pci1043,82e2@4/device@2/input@0
/pci@0,0/pci1043,82e2@4/device@2/input@0 (hid3) online
USB 1.10 interface (usbif557,2404.config1.1) operating at low speed (USB 1.x) on USB 1.10 root hub: input@1, hid4 at bus address 3
        ATEN USB 2.0 Switch (4-port)
hid4 is /pci@0,0/pci1043,82e2@4/device@2/input@1
/pci@0,0/pci1043,82e2@4/device@2/input@1 (hid4) online
/pci@0,0/pci1043,82e2@4/device@2/input@0 (hid3) offline
/pci@0,0/pci1043,82e2@4/device@2/input@1 (hid4) offline
/pci@0,0/pci1043,82e2@4/device@2/input@0 (hid3) offline
/pci@0,0/pci1043,82e2@4/device@2/input@1 (hid4) offline
/pci@0,0/pci1043,82e2@4/device@2 (usb_mid1) removed
USB 2.0 device (usb424,2514) operating at hi speed (USB 2.x) on USB 2.0 root hub: hub@2, hubd1 at bus address 2
hubd1 is /pci@0,0/pci1043,82e2@4,1/hub@2
/pci@0,0/pci1043,82e2@4,1/hub@2 (hubd1) online
USB 2.0 device (usb46d,c025) operating at low speed (USB 1.x) on USB 2.0 external hub: mouse@1, hid5 at bus address 3
        B16_b_02 USB-PS/2 Optical Mouse
hid5 is /pci@0,0/pci1043,82e2@4,1/hub@2/mouse@1
/pci@0,0/pci1043,82e2@4,1/hub@2/mouse@1 (hid5) online
USB 1.10 device (usb46d,c30e) operating at low speed (USB 1.x) on USB 2.0 external hub: device@2, usb_mid2 at bus address 4
        Logitech HID compliant keyboard
usb_mid2 is /pci@0,0/pci1043,82e2@4,1/hub@2/device@2
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2 (usb_mid2) online
USB 1.10 interface (usbif46d,c30e.config1.0) operating at low speed (USB 1.x) on USB 2.0 external hub: keyboard@0, hid6 at bus address 4
        Logitech HID compliant keyboard
hid6 is /pci@0,0/pci1043,82e2@4,1/hub@2/device@2/keyboard@0
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/keyboard@0 (hid6) online
USB 1.10 interface (usbif46d,c30e.config1.1) operating at low speed (USB 1.x) on USB 2.0 external hub: input@1, hid7 at bus address 4
        Logitech HID compliant keyboard
hid7 is /pci@0,0/pci1043,82e2@4,1/hub@2/device@2/input@1
/pci@0,0/pci1043,82e2@4,1/hub@2/device@2/input@1 (hid7) online
panic[cpu1]/thread=ffffff000fd16c60: 
BAD TRAP: type=e (#pf Page fault) rp=ffffff000fd16950 addr=30 occurred in module
 "zfs" due to a NULL pointer dereference
zpool-tank: 
#pf Page fault
Bad kernel fault at addr=0x30
pid=225, pc=0xfffffffff795abe4, sp=0xffffff000fd16a48, eflags=0x10296
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 30
cr3: 4000000
cr8: c
        rdi:                0 rsi: ffffff02d9d1d6c0 rdx: ffffffffffffffff
        rcx:              144  r8:       70fb497da6  r9:         3ba11c96
        rax:                0 rbx:              200 rbp: ffffff000fd16a50
        r10: ffffff02dd30a0d0 r11: ffffff02dd30a098 r12: ffffff02d9d1d6c0
        r13: ffffff02dd308000 r14: ffffff02c97eb388 r15: ffffff02c97eb390
        fsb:                0 gsb: ffffff02c874c080  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                2 rip: fffffffff795abe4
         cs:               30 rfl:            10296 rsp: ffffff000fd16a48
         ss:               38
ffffff000fd16830 unix:die+dd ()
ffffff000fd16940 unix:trap+177b ()
ffffff000fd16950 unix:cmntrap+e6 ()
ffffff000fd16a50 zfs:ddt_phys_decref+c ()
ffffff000fd16a80 zfs:zio_ddt_free+55 ()
ffffff000fd16ab0 zfs:zio_execute+8d ()
ffffff000fd16b50 genunix:taskq_thread+248 ()
ffffff000fd16b60 unix:thread_start+8 ()
syncing file systems...
 done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> ::panicinfo
             cpu                1
          thread ffffff000fd16c60
         message 
BAD TRAP: type=e (#pf Page fault) rp=ffffff000fd16950 addr=30 occurred in module
 "zfs" due to a NULL pointer dereference
             rdi                0
             rsi ffffff02d9d1d6c0
             rdx ffffffffffffffff
             rcx              144
              r8       70fb497da6
              r9         3ba11c96
             rax                0
             rbx              200
             rbp ffffff000fd16a50
             r10 ffffff02dd30a0d0
             r11 ffffff02dd30a098
             r12 ffffff02d9d1d6c0
             r13 ffffff02dd308000
             r14 ffffff02c97eb388
             r15 ffffff02c97eb390
          fsbase                0
          gsbase ffffff02c874c080
              ds               4b
              es               4b
              fs                0
              gs              1c3
          trapno                e
             err                2
             rip fffffffff795abe4
              cs               30
          rflags            10296
             rsp ffffff000fd16a48
              ss               38
          gdt_hi                0
          gdt_lo              1ef
          idt_hi                0
          idt_lo         d0000fff
             ldt                0
            task               70
             cr0         8005003b
             cr2               30
             cr3          4000000
             cr4              6f8
> ::ps -z
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R      0      0      0      0      0 0x00000001 fffffffffbc2dbb0 sched
R    225      0      0      0      0 0x00020001 ffffff02c6e4ac70 zpool-tank
R      3      0      0      0      0 0x00020001 ffffff02c6e4de10 fsflush
R      2      0      0      0      0 0x00020001 ffffff02c6e4ea78 pageout
R      1      0      0      0      0 0x4a004000 ffffff02c6e4f6e0 init
R    224      1    224    224      0 0x42000000 ffffff02d4b116f0 syseventconfd
R    233    224    224    224      0 0x4a004000 ffffff02d897fe28 zfsdle
R    232    224    224    224      0 0x4a004000 ffffff02d8980a90 zfsdle
R    231    224    224    224      0 0x4a004000 ffffff02d89816f8 zfsdle
R    230    224    224    224      0 0x4a004000 ffffff02c9d0c010 zfsdle
R    229    224    224    224      0 0x4a004000 ffffff02c9d10a80 zfsdle
R    228    224    224    224      0 0x4a004000 ffffff02c9d0fe18 zfsdle
R    227    224    224    224      0 0x4a004000 ffffff02d4b0cc80 zfsdle
R    226    224    224    224      0 0x4a004000 ffffff02d4b0c018 zfsdle
R    136      1    136    136      0 0x42000000 ffffff02c9d0e548 rcm_daemon
R    134      1    134    134      0 0x42000000 ffffff02c9d12350 devfsadm
R    111      1    111    111      0 0x42010000 ffffff02d4b12358 syseventd
R     76      1     76     76      1 0x42000000 ffffff02c9d0d8e0 kcfd
R     16      1     16     16     15 0x52000000 ffffff02c9d116e8 dlmgmtd
R     11      1     11     11      0 0x42000000 ffffff02c6e4d1a8 svc.configd
R      9      1      9      9      0 0x42000000 ffffff02c6e4c540 svc.startd
R    197      9    197    197      0 0x4a014000 ffffff02d4b0d8e8 bash
R      5      0      0      0      0 0x00020001 ffffff02c6e50348 zpool-rpool
> ::quit
root@zen:~/coredir/foo# ls
debug.txt  unix.0  vmcore.0
script done on May  4, 2010 06:26:46 PM CEST
-- 
This message posted from opensolaris.org