The system was an x4540 running Solaris 10 Update 6 acting as a production Samba server.

The only unusual activity was me sending and receiving incremental dumps to and from another system.

When the system came back, the filesystem/local service was in maintenance and had to be manually restarted.

Unfortunately I don't have the crash dump (it failed), but the panic message was:

Nov 12 17:36:43 ares ^Mpanic[cpu1]/thread=fffffe8d2c087460:
Nov 12 17:36:43 ares genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=fffffe800429a6e0 addr=fffffe893f924000
Nov 12 17:36:43 ares unix: [ID 100000 kern.notice]
Nov 12 17:36:43 ares unix: [ID 839527 kern.notice] zfs:
Nov 12 17:36:43 ares unix: [ID 753105 kern.notice] #pf Page fault
Nov 12 17:36:43 ares unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xfffffe893f924000
Nov 12 17:36:43 ares unix: [ID 243837 kern.notice] pid=20825, pc=0xfffffffffb8303ba, sp=0xfffffe800429a7d8, eflags=0x10216
Nov 12 17:36:43 ares unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
Nov 12 17:36:43 ares unix: [ID 354241 kern.notice] cr2: fffffe893f924000 cr3: 420275000 cr8: c
Nov 12 17:36:43 ares unix: [ID 592667 kern.notice] rdi: fffffe8b461d2640 rsi: fffffe893f924000 rdx: 108
Nov 12 17:36:43 ares unix: [ID 592667 kern.notice] rcx: 19 r8: 0 r9: 0
Nov 12 17:36:43 ares unix: [ID 592667 kern.notice] rax: 100 rbx: fffffe89658356c0 rbp: fffffe800429a850
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] r10: 1 r11: 0 r12: 108
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] r13: fffffe800429a864 r14: fffffe8d014b1c70 r15: 0
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] fsb: ffffffff80000000 gsb: ffffffff8b708000 ds: 43
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] es: 43 fs: 0 gs: 1c3
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffffb8303ba
Nov 12 17:36:44 ares unix: [ID 592667 kern.notice] cs: 28 rfl: 10216 rsp: fffffe800429a7d8
Nov 12 17:36:44 ares unix: [ID 266532 kern.notice] ss: 30
Nov 12 17:36:44 ares unix: [ID 100000 kern.notice]
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a5f0 unix:real_mode_end+7201 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a6d0 unix:trap+5e6 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a6e0 unix:_cmntrap+140 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a850 unix:bcopy+a ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a890 zfs:dbuf_read+95 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a8c0 zfs:dmu_bonus_hold+8f ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a910 zfs:restore_object+191 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429aa60 zfs:dmu_recv_stream+5e6 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429ad30 zfs:zfs_ioc_recv+247 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429ad80 zfs:zfsdev_ioctl+14c ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429ad90 genunix:cdev_ioctl+1d ()
Nov 12 17:36:45 ares genunix: [ID 655072 kern.notice] fffffe800429adb0 specfs:spec_ioctl+50 ()
Nov 12 17:36:45 ares genunix: [ID 655072 kern.notice] fffffe800429ade0 genunix:fop_ioctl+25 ()
Nov 12 17:36:45 ares genunix: [ID 655072 kern.notice] fffffe800429aec0 genunix:ioctl+ac ()
Nov 12 17:36:45 ares genunix: [ID 655072 kern.notice] fffffe800429af10 unix:brand_sys_syscall32+1a3 ()
Nov 12 17:36:45 ares unix: [ID 100000 kern.notice]
Nov 12 17:36:45 ares genunix: [ID 672855 kern.notice] syncing file systems...
Nov 12 17:36:45 ares genunix: [ID 733762 kern.notice] 43
Nov 12 17:36:47 ares genunix: [ID 904073 kern.notice] done
Nov 12 17:36:48 ares genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Nov 12 17:57:03 ares genunix: [ID 409368 kern.notice] ^M 81% done: 3324566 pages dumped, compression ratio 3.28,
Nov 12 17:57:03 ares genunix: [ID 495082 kern.notice] dump failed: error 28

--
Ian.
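When searching a bug database for a matching stack, it can help to reduce a panic message like the one above to its call chain. The following is a minimal Python sketch, assuming the frame lines keep the syslog layout shown in this post (a 16-digit hex frame pointer followed by module:function+offset). The names FRAME_RE and extract_frames are made up for illustration and are not part of any Solaris tool.

```python
import re

# Matches stack-frame lines such as:
#   "... [ID 655072 kern.notice] fffffe800429a890 zfs:dbuf_read+95 ()"
# Assumption: frame pointer is 16 hex digits, offset is hex.
FRAME_RE = re.compile(r"\b([0-9a-f]{16}) (\w+):(\w+)\+([0-9a-f]+) \(\)")


def extract_frames(log_text):
    """Return (module, function, offset) tuples in the order they appear."""
    return [
        (m.group(2), m.group(3), int(m.group(4), 16))
        for m in FRAME_RE.finditer(log_text)
    ]


# Four frames copied from the panic message in this post.
SAMPLE = """\
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a850 unix:bcopy+a ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a890 zfs:dbuf_read+95 ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a8c0 zfs:dmu_bonus_hold+8f ()
Nov 12 17:36:44 ares genunix: [ID 655072 kern.notice] fffffe800429a910 zfs:restore_object+191 ()
"""

frames = extract_frames(SAMPLE)
# Keep only the zfs module functions, which are the useful search terms.
zfs_functions = [fn for mod, fn, _ in frames if mod == "zfs"]
print(" <- ".join(zfs_functions))
```

Run against the full message, the same function would also pick up the trap and ioctl frames; filtering on the module name keeps the search terms short.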
Andreas Koppenhoefer
2008-Nov-12 08:17 UTC
[zfs-discuss] Possible ZFS panic on Solaris 10 Update 6
Maybe this is the same as I've described in article
<http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0>

I've written a quick & dirty shell script to reproduce a race condition which forces Updates 3 and 4 to panic and causes zfs commands to hang on Update 5.

- Andreas
--
This message posted from opensolaris.org
Andreas Koppenhoefer wrote:
> Maybe this is the same as I've described in article
> <http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0>
>
> I've written a quick&dirty shell script to reproduce a race condition which forces Update 3&4 to panic and leads Update 5 to hanging zfs commands.

It could be; I was sending 1000 or so incremental streams (sequentially), most of which were tiny updates.

--
Ian.
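For anyone trying to picture that workload, a rough Python sketch of the command sequence follows. It only builds the pipelines as strings and runs nothing. The dataset and snapshot names are hypothetical, the function name incremental_pipelines is invented for this example, and the hop to the other system (for example through ssh) is left out.

```python
import shlex


def incremental_pipelines(source_fs, target_fs, snapshots):
    """Build one 'zfs send -i | zfs receive' pipeline per consecutive snapshot pair.

    All arguments are hypothetical names chosen by the caller.
    """
    pipelines = []
    for older, newer in zip(snapshots, snapshots[1:]):
        # Incremental stream from the older snapshot to the newer one.
        send = ["zfs", "send", "-i", f"{source_fs}@{older}", f"{source_fs}@{newer}"]
        # The receiving filesystem must already hold the older snapshot.
        recv = ["zfs", "receive", target_fs]
        pipelines.append(f"{shlex.join(send)} | {shlex.join(recv)}")
    return pipelines


# Hypothetical example: three snapshots give two sequential incremental streams.
for line in incremental_pipelines("tank/data", "backup/data", ["s1", "s2", "s3"]):
    print(line)
```

Each pipeline finishes before the next starts, which matches the sequential sending described above; whether that pattern alone is enough to trigger the panic is not established in this thread.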
Matthew Ahrens
2008-Nov-13 18:25 UTC
[zfs-discuss] Possible ZFS panic on Solaris 10 Update 6
Ian,

I couldn't find any bugs with a similar stack trace. Can you file a bug?

--matt

Ian Collins wrote:
> The system was an x4540 running Solaris 10 Update 6 acting as a
> production Samba server.
>
> The only unusual activity was me sending and receiving incremental dumps
> to and from another system.
>
> When the system came back, the filesystem/local service was in
> maintenance and had to be manually restarted.
>
> Unfortunately I don't have the crash dump (it failed), but the panic
> message was:
>
> [...]
Matthew Ahrens wrote:
> Ian,
>
> I couldn't find any bugs with a similar stack trace. Can you file a bug?

I've asked my client to file a critical bug through their support channel. This is a real show stopper for us!

--
Ian.