Folks, before I start delving too deeply into this crashdump, has anyone seen anything like it?

The background is that I'm running a non-debug open build of b49 and was in the process of running "zoneadm -z redlx install ...".

After a bit, the machine panics. Initially looking at the crashdump, I'm down to 88 MB free (out of a gig) and see the following stack.

fffffe8000de7800 page_unlock+0x3b(180218720)
fffffe8000de78d0 zfs_getpage+0x236(ffffffff89b84d80, 12000, 2000,
    fffffe8000de7a1c, fffffe8000de79b8, 2000, fffffffffbc29b20,
    fffffe808180a000, 1, ffffffff80826dc8)
fffffe8000de7950 fop_getpage+0x52(ffffffff89b84d80, 12000, 2000,
    fffffe8000de7a1c, fffffe8000de79b8, 2000, fffffffffbc29b20,
    fffffe8081818000, 1, ffffffff80826dc8)
fffffe8000de7a50 segmap_fault+0x1d6(ffffffff801a6f38, fffffffffbc29b20,
    fffffe8081818000, 2000, 0, 1)
fffffe8000de7b30 segmap_getmapflt+0x67a(fffffffffbc29b20,
    ffffffff89b84d80, 12000, 2000, 1, 1)
fffffe8000de7bd0 lofi_strategy_task+0x14b(ffffffff959d2400)
fffffe8000de7c60 taskq_thread+0x1a7(ffffffff84453da8)
fffffe8000de7c70 thread_start+8()

%rax = 0x0000000000000000                 %r9  = 0x000000000300430e
%rbx = 0x000000000000000e                 %r10 = 0x0000000000001000
%rcx = 0xfffffe8081819000                 %r11 = 0x1ffffffff13709b0
%rdx = 0xfffffe8000de7c80                 %r12 = 0x0000000180218720
%rsi = 0x0000000000013000                 %r13 = 0xfffffffffbc52160 pse_mutex+0x200
%rdi = 0xfffffffffbc52160 pse_mutex+0x200 %r14 = 0x0000000000004000
%r8  = 0x0000000000000200                 %r15 = 0xfffffe8000de79d8

%rip = 0xfffffffffb8474fb page_unlock+0x3b
%rbp = 0xfffffe8000de7800
%rsp = 0xfffffe8000de77e0
%rflags = 0x00010246
  id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
  status=<of,df,IF,tf,sf,ZF,af,PF,cf>

%cs = 0x0028    %ds = 0x0043    %es = 0x0043
%trapno = 0xe   %fs = 0x0000    fsbase = 0xffffffff80000000
%err = 0x0      %gs = 0x01c3    gsbase = 0xfffffffffbc27b70

While the panic string says NULL pointer dereference, it appears that 0x180218720 is not mapped. The faulting access looks like the first dereference in page_unlock(), which reads pp->p_selock.

I can spend a little time looking at it, but was wondering if anyone had seen this kind of panic previously?

I have two identical crashdumps created in exactly the same way.

alan.

-- 
Alan Hargreaves - http://blogs.sun.com/tpenta
Staff Engineer (Kernel/VOSJEC/Performance)
Systems Technical Service Center
Sun Microsystems
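As a side note on the register dump above, the "not mapped" reading can be cross-checked from %trapno and %err alone. The following is a minimal sketch, not taken from the crashdump tooling: the helper names decode_pf_error and is_upper_half are made up for illustration, and the address check assumes 48-bit virtual addressing with the kernel living in the canonical upper half. It decodes the x86 page-fault error code and tests whether the pointer passed to page_unlock() looks like a kernel address at all.

```python
# Sketch only: helper names are hypothetical, values are copied from the dump.

PF_TRAPNO = 0xE  # x86 trap number for a page fault


def decode_pf_error(err: int) -> dict:
    """Split an x86 page-fault error code into its architectural flag bits."""
    return {
        "page_present": bool(err & 0x01),       # 0 = fault on a non-present page
        "write": bool(err & 0x02),              # 0 = the access was a read
        "user_mode": bool(err & 0x04),          # 0 = fault taken in supervisor mode
        "reserved_bit": bool(err & 0x08),       # 1 = reserved bit set in a page-table entry
        "instruction_fetch": bool(err & 0x10),  # 1 = fault on an instruction fetch
    }


def is_upper_half(addr: int) -> bool:
    """True if bits 63..47 are all set (canonical upper half, 48-bit VA assumed)."""
    return (addr >> 47) == 0x1FFFF


trapno, err = 0xE, 0x0          # %trapno and %err from the dump
pp = 0x180218720                # argument to page_unlock(), also in %r12
frame = 0xFFFFFE8000DE7800      # a stack frame address from the same trace

if trapno == PF_TRAPNO:
    print(decode_pf_error(err))
print(hex(pp), "upper half:", is_upper_half(pp))
print(hex(frame), "upper half:", is_upper_half(frame))
```

With %err = 0x0 every flag comes out false, which corresponds to a supervisor-mode read of a non-present page. Under the stated addressing assumption the pointer value also falls outside the upper half, unlike the other kernel addresses in the trace.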
Alan Hargreaves
2006-Sep-15 02:44 UTC
[zfs-discuss] Re: zfs panic installing a brandz zone
I know, bad form replying to myself, but I am wondering if it might be related to

    6438702 error handling in zfs_getpage() can trigger "page not locked"

which is marked "fix in progress" with a target of the current build.

alan.

-- 
Alan Hargreaves
I went in the World's Greatest shave for Leukaemia. See
http://blogs.sun.com/roller/page/tpenta?entry=hair_yesterday_gone_today
Yup, it's almost certain that this is the bug you are hitting.

-Mark