Hi, I have a problem with a ZFS filesystem on an array. The pool was
created by Solaris 10 U2. Some glitches with the array made Solaris
panic on boot. I've installed snv_63 (snv_60 and later contain some
important fixes); the system boots, but the kernel panics when I try
to import the pool. This is with zfs_recover=1.

Configuration is as follows (on snv_63):

# zpool import
  pool: macierz
    id: 15960555323673164597
 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        macierz     ONLINE
          c2t0d0    ONLINE
          c2t1d0    ONLINE
          c2t2d0    ONLINE
          c2t3d0    ONLINE

Those are 1.8 TB logical volumes exported by the array. And the panic:

Loading modules: [ unix genunix specfs dtrace cpu.AuthenticAMD.15 uppc
pcplusmp scsi_vhci ufs ip hook neti sctp arp usba fctl lofs zfs random
md cpc crypto fcip fcp logindmux ptm ipc ]
> ::status
debugging crash dump vmcore.0 (64-bit) from boraks
operating system: 5.11 snv_63 (i86pc)
panic message:
ZFS: bad checksum (read on <unknown> off 0: zio fffffffed258b880
[L0 SPA space map] 1000L/600P DVA[0]=<1:fe78108600:600>
DVA[1]=<2:166f85c200:600> fletcher4 lzjb LE contiguous birth=2484644
fill=1 ck
dump content: kernel pages only
> *panic_thread::findstack -v
stack pointer for thread ffffff00101d5c80: ffffff00101d58f0
  ffffff00101d59e0 panic+0x9c()
  ffffff00101d5a40 zio_done+0x17c(fffffffed258b880)
  ffffff00101d5a60 zio_next_stage+0xb3(fffffffed258b880)
  ffffff00101d5ab0 zio_wait_for_children+0x5d(fffffffed258b880, 11, fffffffed258bad8)
  ffffff00101d5ad0 zio_wait_children_done+0x20(fffffffed258b880)
  ffffff00101d5af0 zio_next_stage+0xb3(fffffffed258b880)
  ffffff00101d5b40 zio_vdev_io_assess+0x129(fffffffed258b880)
  ffffff00101d5b60 zio_next_stage+0xb3(fffffffed258b880)
  ffffff00101d5bb0 vdev_mirror_io_done+0x29d(fffffffed258b880)
  ffffff00101d5bd0 zio_vdev_io_done+0x26(fffffffed258b880)
  ffffff00101d5c60 taskq_thread+0x1a7(fffffffec27490f0)
  ffffff00101d5c70 thread_start+8()

I've uploaded the crash dump here:
http://www.crocom.com.pl/~tomek/boraks-zpool-import-crash.584MB.tar.bz2
The archive is 55 MB; it unpacks to almost 600 MB.

I'd be happy to provide additional details - this is my first serious
issue with ZFS. And yes, I know I should've had backups.

-- 
Tomasz Torcz
zdzichu at gmail.com
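For anyone attempting the same kind of recovery: here is roughly how
zfs_recover gets enabled - a sketch using the commonly cited tunables
(not verified against snv_63 specifically, and it is not certain both
are required):

  # Persistently, in /etc/system (takes effect on the next boot):
  set zfs:zfs_recover = 1
  set aok = 1

  # Or on the live kernel, before attempting the import:
  echo "zfs_recover/W 1" | mdb -kw
  echo "aok/W 1" | mdb -kw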
I seem to have got the same core dump, in a different way. I had a
zpool set up on an iSCSI 'disk'. For details see:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
But after a reboot the iSCSI target was no longer available, so the
iSCSI initiator could not provide the disk that the zpool was based on.
I ran 'zpool status', but the PC just rebooted rather than handling it
gracefully. After the reboot I discovered a core dump had been created
- details below:

# cat /etc/release
                        Solaris Nevada snv_60 X86
           Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 12 March 2007
#
# cd /var/crash/solaris
# mdb -k 1
Loading modules: [ unix genunix specfs dtrace uppc pcplusmp scsi_vhci
ufs ip hook neti sctp arp usba uhci qlc fctl nca lofs zfs random md cpc
crypto fcip fcp logindmux ptm sppp emlxs ipc ]
> ::status
debugging crash dump vmcore.1 (64-bit) from solaris
operating system: 5.11 snv_60 (i86pc)
panic message:
ZFS: I/O failure (write on <unknown> off 0: zio fffffffec38cf340
[L0 packed nvlist] 4000L/600P DVA[0]=<0:160225800:600>
DVA[1]=<0:9800:600> fletcher4 lzjb LE contiguous birth=192896 fill=1
cksum=6b28
dump content: kernel pages only
> *panic_thread::findstack -v
stack pointer for thread ffffff00025b2c80: ffffff00025b28f0
  ffffff00025b29e0 panic+0x9c()
  ffffff00025b2a40 zio_done+0x17c(fffffffec38cf340)
  ffffff00025b2a60 zio_next_stage+0xb3(fffffffec38cf340)
  ffffff00025b2ab0 zio_wait_for_children+0x5d(fffffffec38cf340, 11, fffffffec38cf598)
  ffffff00025b2ad0 zio_wait_children_done+0x20(fffffffec38cf340)
  ffffff00025b2af0 zio_next_stage+0xb3(fffffffec38cf340)
  ffffff00025b2b40 zio_vdev_io_assess+0x129(fffffffec38cf340)
  ffffff00025b2b60 zio_next_stage+0xb3(fffffffec38cf340)
  ffffff00025b2bb0 vdev_mirror_io_done+0x2af(fffffffec38cf340)
  ffffff00025b2bd0 zio_vdev_io_done+0x26(fffffffec38cf340)
  ffffff00025b2c60 taskq_thread+0x1a7(fffffffec154f018)
  ffffff00025b2c70 thread_start+8()
> ::cpuinfo -v
ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
 0 fffffffffbc31f80  1b    2    0  99   no    no t-0    ffffff00025b2c80 sched
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY               60  ffffff00022c9c80 sched
            EXISTS               60  ffffff00020e9c80 sched
            ENABLE

ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
 1 fffffffec11ad000  1f    3    0  59  yes    no t-0    fffffffec3dcbbc0 syslogd
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY               60  ffffff000212bc80 sched
          QUIESCED               59  fffffffec1e51360 syslogd
            EXISTS               59  fffffffec1ec2180 syslogd
            ENABLE

> ::quit

This message posted from opensolaris.org
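A quick way to confirm, after the reboot, whether the initiator can
still reach the target - a sketch using the standard Solaris initiator
CLI; nothing here is specific to this particular setup:

  # Which discovery methods are enabled, and which targets are visible?
  iscsiadm list discovery
  iscsiadm list target -v

  # Is the backing disk still present, before touching the pool?
  format </dev/null
  zpool status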
On May 15, 2007, at 4:49 PM, Nigel Smith wrote:

> I seem to have got the same core dump, in a different way.
> I had a zpool set up on an iSCSI 'disk'. For details see:
> http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
> But after a reboot the iSCSI target was no longer available, so the
> iSCSI initiator could not provide the disk that the zpool was based on.
> I did a 'zpool status', but the PC just rebooted, rather than handling
> it in a graceful way.
> After the reboot I discovered a core dump had been created - details
> below:

ZFS panicking on a failed write in a non-redundant pool is known and is
being worked on. Why the iSCSI device didn't come up is also a bug.
I'll ask the iSCSI people to take a look...

eric

> [quoted /etc/release and mdb output trimmed - identical to Nigel's
> message above]

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
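Until the write-failure panic is fixed, the usual mitigation is
pool-level redundancy, so a vanished device degrades the pool instead
of taking down the machine. A minimal sketch - the device names below
are placeholders, not from Nigel's configuration:

  # Mirror the iSCSI LUN with a second device; if one side disappears,
  # writes should continue on the survivor and the pool goes DEGRADED:
  zpool create tank mirror c3t0d0 c3t1d0
  zpool status tank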
tim szeto
2007-May-16 19:28 UTC
iSCSI target not coming back up (was Fwd: [zfs-discuss] Re: snv63: kernel panic on import)
Nigel,

Was the iSCSI target daemon running with the targets gone, or did the
daemon dump core repeatedly? How did you create the targets? (The two
usual ways are sketched at the end of this message.)

-tim

eric kustarz wrote:

> Hi Tim,
>
> Is the iSCSI target not coming back up after a reboot a known problem?
>
> Can you take a look?
>
> eric
>
> Begin forwarded message:
>
>> From: eric kustarz <Eric.Kustarz at sun.com>
>> Date: May 16, 2007 8:56:44 AM PDT
>> To: Nigel Smith <nwsmith at wilusa.freeserve.co.uk>
>> Cc: zfs-discuss at opensolaris.org
>> Subject: Re: [zfs-discuss] Re: snv63: kernel panic on import
>>
>> ZFS panicking on a failed write in a non-redundant pool is known and
>> is being worked on. Why the iSCSI device didn't come up is also a
>> bug. I'll ask the iSCSI people to take a look...
>>
>> eric
>>
>> [Nigel's original report and mdb output trimmed - quoted in full
>> earlier in the thread]
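For reference, the two usual ways a target got created on builds of
that era - rough sketches with placeholder names and sizes, since the
thread doesn't yet say which method was used:

  # Option 1: have the target daemon manage its own backing store:
  iscsitadm create target --size 10g mytarget
  iscsitadm list target -v

  # Option 2: export a ZFS volume via the shareiscsi property:
  zfs create -V 10g tank/iscsivol
  zfs set shareiscsi=on tank/iscsivol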