Hi,
I have a problem with a ZFS filesystem on an array. The pool was
created by Solaris 10 U2. Some glitches with the array made Solaris
panic on boot. I've installed snv_63 (it contains some important
fixes that snv_60 lacks); the system boots, but the kernel panics
when I try to import the pool. This is with zfs_recover=1.
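For reference, it was enabled with the standard /etc/system tunable
line:

set zfs:zfs_recover=1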
The configuration is as follows (on snv_63):
# zpool import
  pool: macierz
    id: 15960555323673164597
 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        macierz     ONLINE
          c2t0d0    ONLINE
          c2t1d0    ONLINE
          c2t2d0    ONLINE
          c2t3d0    ONLINE
Those are 1.8 TB logical volumes exported by the array.
And the panic:
Loading modules: [ unix genunix specfs dtrace cpu.AuthenticAMD.15 uppc
pcplusmp scsi_vhci ufs ip hook neti sctp arp usba fctl lofs zfs random
md cpc crypto fcip fcp logindmux ptm ipc ]
> ::status
debugging crash dump vmcore.0 (64-bit) from boraks
operating system: 5.11 snv_63 (i86pc)
panic message:
ZFS: bad checksum (read on <unknown> off 0: zio fffffffed258b880 [L0
SPA space map] 1000L/600P DVA[0]=<1:fe78108600:600>
DVA[1]=<2:166f85c200:600> fletcher4 lzjb LE contiguous birth=2484644
fill=1 ck
dump content: kernel pages only
> *panic_thread::findstack -v
stack pointer for thread ffffff00101d5c80: ffffff00101d58f0
ffffff00101d59e0 panic+0x9c()
ffffff00101d5a40 zio_done+0x17c(fffffffed258b880)
ffffff00101d5a60 zio_next_stage+0xb3(fffffffed258b880)
ffffff00101d5ab0 zio_wait_for_children+0x5d(fffffffed258b880, 11, fffffffed258bad8)
ffffff00101d5ad0 zio_wait_children_done+0x20(fffffffed258b880)
ffffff00101d5af0 zio_next_stage+0xb3(fffffffed258b880)
ffffff00101d5b40 zio_vdev_io_assess+0x129(fffffffed258b880)
ffffff00101d5b60 zio_next_stage+0xb3(fffffffed258b880)
ffffff00101d5bb0 vdev_mirror_io_done+0x29d(fffffffed258b880)
ffffff00101d5bd0 zio_vdev_io_done+0x26(fffffffed258b880)
ffffff00101d5c60 taskq_thread+0x1a7(fffffffec27490f0)
ffffff00101d5c70 thread_start+8()
I've uploaded the crash dump here:
http://www.crocom.com.pl/~tomek/boraks-zpool-import-crash.584MB.tar.bz2
The archive is 55 MB; it unpacks to almost 600 MB.
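If it saves anyone a step: the dump should open with mdb roughly as
below, assuming the tarball unpacks to the usual unix.0/vmcore.0
pair. ::status repeats the panic summary and ::msgbuf dumps the
kernel message buffer:

# mdb unix.0 vmcore.0
> ::status
> ::msgbuf
> ::quit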
I'd be happy to provide additional details; this is my first serious
issue with ZFS.
And yes, I know I should've had backups.
--
Tomasz Torcz
zdzichu at gmail.com

Nigel Smith
2007-May-15 23:49 UTC

Re: snv63: kernel panic on import

I seem to have got the same core dump, in a different way.
I had a zpool set up on an iSCSI 'disk'. For details see:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
But after a reboot the iSCSI target was no longer available, so the
iSCSI initiator could not provide the disk that the zpool was based on.
I did a 'zpool status', but the PC just rebooted rather than handling
it in a graceful way.
After the reboot I discovered a core dump had been created - details below:
# cat /etc/release
Solaris Nevada snv_60 X86
Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 12 March 2007
#
# cd /var/crash/solaris
# mdb -k 1
Loading modules: [ unix genunix specfs dtrace uppc pcplusmp scsi_vhci
ufs ip hook neti sctp arp usba uhci qlc fctl nca lofs zfs random md
cpc crypto fcip fcp logindmux ptm sppp emlxs ipc ]
> ::status
debugging crash dump vmcore.1 (64-bit) from solaris
operating system: 5.11 snv_60 (i86pc)
panic message:
ZFS: I/O failure (write on <unknown> off 0: zio fffffffec38cf340 [L0
packed nvlist] 4000L/600P DVA[0]=<0:160225800:600>
DVA[1]=<0:9800:600> fletcher4 lzjb LE contiguous birth=192896 fill=1
cksum=6b28
dump content: kernel pages only
> *panic_thread::findstack -v
stack pointer for thread ffffff00025b2c80: ffffff00025b28f0
ffffff00025b29e0 panic+0x9c()
ffffff00025b2a40 zio_done+0x17c(fffffffec38cf340)
ffffff00025b2a60 zio_next_stage+0xb3(fffffffec38cf340)
ffffff00025b2ab0 zio_wait_for_children+0x5d(fffffffec38cf340, 11, fffffffec38cf598)
ffffff00025b2ad0 zio_wait_children_done+0x20(fffffffec38cf340)
ffffff00025b2af0 zio_next_stage+0xb3(fffffffec38cf340)
ffffff00025b2b40 zio_vdev_io_assess+0x129(fffffffec38cf340)
ffffff00025b2b60 zio_next_stage+0xb3(fffffffec38cf340)
ffffff00025b2bb0 vdev_mirror_io_done+0x2af(fffffffec38cf340)
ffffff00025b2bd0 zio_vdev_io_done+0x26(fffffffec38cf340)
ffffff00025b2c60 taskq_thread+0x1a7(fffffffec154f018)
ffffff00025b2c70 thread_start+8()
> ::cpuinfo -v
ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
 0 fffffffffbc31f80  1b    2    0  99   no    no t-0    ffffff00025b2c80 sched
                  |    |
       RUNNING <--+    +-->  PRI THREAD           PROC
         READY                60 ffffff00022c9c80 sched
        EXISTS                60 ffffff00020e9c80 sched
        ENABLE

ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
 1 fffffffec11ad000  1f    3    0  59  yes    no t-0    fffffffec3dcbbc0 syslogd
                  |    |
       RUNNING <--+    +-->  PRI THREAD           PROC
         READY                60 ffffff000212bc80 sched
      QUIESCED                59 fffffffec1e51360 syslogd
        EXISTS                59 fffffffec1ec2180 syslogd
        ENABLE
> ::quit
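Before touching the pool again I'll check whether the initiator can
actually see the target. Assuming the stock Solaris initiator, this
should show whether the session ever came back up:

# iscsiadm list target -v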

eric kustarz
2007-May-16 15:56 UTC

Re: [zfs-discuss] Re: snv63: kernel panic on import

On May 15, 2007, at 4:49 PM, Nigel Smith wrote:
> I seem to have got the same core dump, in a different way.
> I had a zpool set up on an iSCSI 'disk'. [...]

ZFS panicking on a failed write in a non-redundant pool is known and
is being worked on. Why the iSCSI device didn't come up is also a
bug. I'll ask the iSCSI people to take a look...

eric

tim szeto
2007-May-16 19:28 UTC
iSCSI target not coming back up (was Fwd: [zfs-discuss] Re: snv63: kernel panic on import)
Nigel,
Was the iSCSI target daemon running but the targets gone, or did
the daemon dump core repeatedly?
How did you create the targets?
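For reference, a zvol-backed target would usually have been created
one of two ways (the pool/volume and target names below are just
placeholders):

# iscsitadm create target -b /dev/zvol/rdsk/tank/vol mytarget
# zfs set shareiscsi=on tank/vol

Which of these did you use?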
-tim
eric kustarz wrote:
> Hi Tim,
>
> Is the iSCSI target not coming back up after a reboot a known problem?
>
> Can you take a look?
>
> eric
>
> Begin forwarded message:
>
>> From: eric kustarz <Eric.Kustarz at sun.com>
>> Date: May 16, 2007 8:56:44 AM PDT
>> To: Nigel Smith <nwsmith at wilusa.freeserve.co.uk>
>> Cc: zfs-discuss at opensolaris.org
>> Subject: Re: [zfs-discuss] Re: snv63: kernel panic on import
>>
>>
>> On May 15, 2007, at 4:49 PM, Nigel Smith wrote:
>>
>>> I seem to have got the same core dump, in a different way.
>>> I had a zpool set up on an iSCSI 'disk'. For details see:
>>> http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
>>>
>>> But after a reboot the iSCSI target was no longer available, so the
>>> iSCSI initiator could not provide the disk that the zpool was based on.
>>> I did a 'zpool status', but the PC just rebooted rather than
>>> handling it in a graceful way.
>>> After the reboot I discovered a core dump had been created - details
>>> below:
>>
>> ZFS panicking on a failed write in a non-redundant pool is known and
>> is being worked on. Why the iSCSI device didn't come up is also a
>> bug. I'll ask the iSCSI people to take a look...
>>
>> eric
>>
>>>
>>> # cat /etc/release
>>> Solaris Nevada snv_60 X86
>>> Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
>>> Use is subject to license terms.
>>> Assembled 12 March 2007
>>> #
>>> # cd /var/crash/solaris
>>> # mdb -k 1
>>> Loading modules: [ unix genunix specfs dtrace uppc pcplusmp scsi_vhci
>>> ufs ip hook neti sctp arp usba uhci qlc fctl nca lofs zfs random md
>>> cpc crypto fcip fcp logindmux ptm sppp emlxs ipc ]
>>>> ::status
>>> debugging crash dump vmcore.1 (64-bit) from solaris
>>> operating system: 5.11 snv_60 (i86pc)
>>> panic message:
>>> ZFS: I/O failure (write on <unknown> off 0: zio fffffffec38cf340 [L0
>>> packed nvlist] 4000L/600P DVA[0]=<0:160225800:600>
>>> DVA[1]=<0:9800:600> fletcher4 lzjb LE contiguous birth=192896
>>> fill=1 cksum=6b28
>>> dump content: kernel pages only
>>>> *panic_thread::findstack -v
>>> stack pointer for thread ffffff00025b2c80: ffffff00025b28f0
>>> ffffff00025b29e0 panic+0x9c()
>>> ffffff00025b2a40 zio_done+0x17c(fffffffec38cf340)
>>> ffffff00025b2a60 zio_next_stage+0xb3(fffffffec38cf340)
>>> ffffff00025b2ab0 zio_wait_for_children+0x5d(fffffffec38cf340, 11, fffffffec38cf598)
>>> ffffff00025b2ad0 zio_wait_children_done+0x20(fffffffec38cf340)
>>> ffffff00025b2af0 zio_next_stage+0xb3(fffffffec38cf340)
>>> ffffff00025b2b40 zio_vdev_io_assess+0x129(fffffffec38cf340)
>>> ffffff00025b2b60 zio_next_stage+0xb3(fffffffec38cf340)
>>> ffffff00025b2bb0 vdev_mirror_io_done+0x2af(fffffffec38cf340)
>>> ffffff00025b2bd0 zio_vdev_io_done+0x26(fffffffec38cf340)
>>> ffffff00025b2c60 taskq_thread+0x1a7(fffffffec154f018)
>>> ffffff00025b2c70 thread_start+8()
>>>> ::cpuinfo -v
>>> ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
>>>  0 fffffffffbc31f80  1b    2    0  99   no    no t-0    ffffff00025b2c80 sched
>>>                   |    |
>>>        RUNNING <--+    +-->  PRI THREAD           PROC
>>>          READY                60 ffffff00022c9c80 sched
>>>         EXISTS                60 ffffff00020e9c80 sched
>>>         ENABLE
>>>
>>> ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
>>>  1 fffffffec11ad000  1f    3    0  59  yes    no t-0    fffffffec3dcbbc0 syslogd
>>>                   |    |
>>>        RUNNING <--+    +-->  PRI THREAD           PROC
>>>          READY                60 ffffff000212bc80 sched
>>>       QUIESCED                59 fffffffec1e51360 syslogd
>>>         EXISTS                59 fffffffec1ec2180 syslogd
>>>         ENABLE
>>>
>>>> ::quit