I've got an Intel DP35DP motherboard, a Q6600 processor (Intel 2.4 GHz, 4 cores), 4 GB of RAM and a couple of SATA disks on ICH9. It runs S10U5, patched about a week ago or so.

I have a zpool on a single slice (haven't added a mirror yet, was getting to that), and the system has started to suffer regular hard resets and a few panics. The system is an NFS server for a couple of systems (not much write) and one writer (I do my svn updates over NFS because my ath0 board refuses to work in 64-bit on S10U5). I also do local builds on the same server.

Ideas?

The first panic looks like this:

panic[cpu0]/thread=ffffffff9bcf0460:
BAD TRAP: type=e (#pf Page fault) rp=fffffe80001739a0 addr=ffffffffc064dba0

cmake:
#pf Page fault
Bad kernel fault at addr=0xffffffffc064dba0
pid=6797, pc=0xfffffffff0a6350a, sp=0xfffffe8000173a90, eflags=0x10207
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: ffffffffc064dba0 cr3: 12bf9b000 cr8: c
rdi: 6c60 rsi: 0 rdx: 0
rcx: 0 r8: 8b0000200001017f r9: ffffffffae3a79c0
rax: 0 rbx: ffffffffc0611f40 rbp: fffffe8000173ac0
r10: 0 r11: 0 r12: ffffffffae4687d0
r13: d8c200 r14: 2 r15: ffffffff826c0480
fsb: ffffffff80000000 gsb: fffffffffbc24ec0 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 0 rip: fffffffff0a6350a
cs: 28 rfl: 10207 rsp: fffffe8000173a90
ss: 30

fffffe80001738b0 unix:real_mode_end+71e1 ()
fffffe8000173990 unix:trap+5e6 ()
fffffe80001739a0 unix:_cmntrap+140 ()
fffffe8000173ac0 zfs:zio_buf_alloc+a ()
fffffe8000173af0 zfs:arc_buf_alloc+9f ()
fffffe8000173b70 zfs:arc_read+ee ()
fffffe8000173bf0 zfs:dbuf_read_impl+1a0 ()
fffffe8000173c30 zfs:zfsctl_ops_root+304172dd ()
fffffe8000173c60 zfs:dmu_tx_check_ioerr+6e ()
fffffe8000173cc0 zfs:dmu_tx_count_write+73 ()
fffffe8000173cf0 zfs:dmu_tx_hold_write+4a ()
fffffe8000173db0 zfs:zfs_write+1bb ()
fffffe8000173e00 genunix:fop_write+31 ()
fffffe8000173eb0 genunix:write+287 ()
fffffe8000173ec0 genunix:write32+e ()
fffffe8000173f10 unix:brand_sys_sysenter+1f2 ()

syncing file systems...
 3130
 15
 done
dumping to /dev/dsk/c0t0d0s1, offset 860356608, content: kernel
NOTICE: ahci_tran_reset_dport: port 0 reset port

The second looks like this:

panic[cpu2]/thread=ffffffff9b425f20:
BAD TRAP: type=e (#pf Page fault) rp=fffffe80018cdf40 addr=ffffffffc064dba0

nfsd:
#pf Page fault
Bad kernel fault at addr=0xffffffffc064dba0
pid=665, pc=0xfffffffff0a6350a, sp=0xfffffe80018ce030, eflags=0x10207
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: ffffffffc064dba0 cr3: 12a9df000 cr8: c
rdi: 6c60 rsi: 0 rdx: 0
rcx: 0 r8: 8b0000200001017f r9: f
rax: 0 rbx: ffffffffc0611f40 rbp: fffffe80018ce060
r10: 0 r11: 0 r12: fffffe81c20ecf00
r13: d8c200 r14: 2 r15: ffffffff826c2240
fsb: ffffffff80000000 gsb: ffffffff81a6c800 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 0 rip: fffffffff0a6350a
cs: 28 rfl: 10207 rsp: fffffe80018ce030
ss: 30

fffffe80018cde50 unix:real_mode_end+71e1 ()
fffffe80018cdf30 unix:trap+5e6 ()
fffffe80018cdf40 unix:_cmntrap+140 ()
fffffe80018ce060 zfs:zio_buf_alloc+a ()
fffffe80018ce090 zfs:arc_buf_alloc+9f ()
fffffe80018ce110 zfs:arc_read+ee ()
fffffe80018ce190 zfs:dbuf_read_impl+1a0 ()
fffffe80018ce1d0 zfs:zfsctl_ops_root+304172dd ()
fffffe80018ce200 zfs:dmu_tx_check_ioerr+6e ()
fffffe80018ce260 zfs:dmu_tx_count_write+73 ()
fffffe80018ce290 zfs:dmu_tx_hold_write+4a ()
fffffe80018ce350 zfs:zfs_write+1bb ()
fffffe80018ce3a0 genunix:fop_write+31 ()
fffffe80018ce410 nfssrv:do_io+b5 ()
fffffe80018ce610 nfssrv:rfs4_op_write+40e ()
fffffe80018ce770 nfssrv:rfs4_compound+1b3 ()
fffffe80018ce800 nfssrv:rfs4_dispatch+234 ()
fffffe80018ceb10 nfssrv:common_dispatch+88a ()
fffffe80018ceb20 nfssrv:nfs4_drc+3051ccc1 ()
fffffe80018cebf0 rpcmod:svc_getreq+209 ()
fffffe80018cec40 rpcmod:svc_run+124 ()
fffffe80018cec70 rpcmod:svc_do_run+88 ()
fffffe80018ceec0 nfs:nfssys+208 ()
fffffe80018cef10 unix:brand_sys_sysenter+1f2 ()

syncing file systems...
 done
dumping to /dev/dsk/c0t0d0s1, offset 860356608, content: kernel
NOTICE: ahci_tran_reset_dport: port 0 reset port

This message posted from opensolaris.org
Ben Taylor wrote:
> I have a zpool on a single slice (haven't added a mirror yet, was getting to that) and have
> started to suffer regular hard resets and have gotten a few panics.
> [...]
> Ideas?

Try enabling kmem_flags. zio_buf_alloc() is a very simple function that ends up calling kmem_cache_alloc(), so I suspect some kind of kernel memory corruption.

wbr,
victor

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
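
For anyone following along, here is a minimal sketch of the kmem_flags suggestion above, assuming the usual route of setting it in /etc/system and rebooting. The value 0xf is an assumption, since nothing in this thread names one. It is the commonly documented setting that turns on the allocator's audit, deadbeef, redzone and contents-logging checks, and it costs extra memory and CPU while it is enabled.

```
* /etc/system fragment (lines beginning with '*' are comments)
* Assumption: 0xf enables the kernel memory allocator debugging checks;
* remove this line and reboot once the problem has been tracked down.
set kmem_flags=0xf
```

With this in place, a corrupted buffer should be more likely to trigger a panic close to the code that damaged it, and the resulting crash dump can then be examined in mdb (for example with ::kmem_verify).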
Hi Ben,

I've been seeing exactly the same error for months. In my case the problem also started soon after the update to 10U5. I have a SATA mirror pool on ICH6 and also share it over NFS.

Do you see checksum errors in zpool status -xv?

Unfortunately, I haven't found a solution yet.

Regards,
Rustam.