thr3ads.net - zfs discuss - [zfs-discuss] zfs panic installing a brandz zone [Sep 2006]

Alan Hargreaves

2006-Sep-15 01:35 UTC

[zfs-discuss] zfs panic installing a brandz zone

Folks, before I start delving too deeply into this crashdump, has anyone 
seen anything like it?

The background is that I'm running a non-debug open build of b49 and was
in the process of running "zoneadm -z redlx install ...".

After a bit, the machine panics. On an initial look at the crashdump, I'm
down to 88 MB free (out of a gig) and see the following stack.

fffffe8000de7800 page_unlock+0x3b(180218720)
fffffe8000de78d0 zfs_getpage+0x236(ffffffff89b84d80, 12000, 2000, fffffe8000de7a1c, fffffe8000de79b8, 2000, fffffffffbc29b20, fffffe808180a000, 1, ffffffff80826dc8)
fffffe8000de7950 fop_getpage+0x52(ffffffff89b84d80, 12000, 2000, fffffe8000de7a1c, fffffe8000de79b8, 2000, fffffffffbc29b20, fffffe8081818000, 1, ffffffff80826dc8)
fffffe8000de7a50 segmap_fault+0x1d6(ffffffff801a6f38, fffffffffbc29b20, fffffe8081818000, 2000, 0, 1)
fffffe8000de7b30 segmap_getmapflt+0x67a(fffffffffbc29b20, ffffffff89b84d80, 12000, 2000, 1, 1)
fffffe8000de7bd0 lofi_strategy_task+0x14b(ffffffff959d2400)
fffffe8000de7c60 taskq_thread+0x1a7(ffffffff84453da8)
fffffe8000de7c70 thread_start+8()

%rax = 0x0000000000000000                 %r9  = 0x000000000300430e
%rbx = 0x000000000000000e                 %r10 = 0x0000000000001000
%rcx = 0xfffffe8081819000                 %r11 = 0x1ffffffff13709b0
%rdx = 0xfffffe8000de7c80                 %r12 = 0x0000000180218720
%rsi = 0x0000000000013000                 %r13 = 0xfffffffffbc52160 pse_mutex+0x200
%rdi = 0xfffffffffbc52160 pse_mutex+0x200 %r14 = 0x0000000000004000
%r8  = 0x0000000000000200                 %r15 = 0xfffffe8000de79d8

%rip = 0xfffffffffb8474fb page_unlock+0x3b
%rbp = 0xfffffe8000de7800
%rsp = 0xfffffe8000de77e0
%rflags = 0x00010246
   id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
   status=<of,df,IF,tf,sf,ZF,af,PF,cf>

                         %cs = 0x0028    %ds = 0x0043    %es = 0x0043
%trapno = 0xe           %fs = 0x0000    fsbase = 0xffffffff80000000
    %err = 0x0           %gs = 0x01c3    gsbase = 0xfffffffffbc27b70
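
As a cross-check on the flag line in the dump above, here is a minimal Python sketch that expands an %rflags value into the names of the bits that are set. The bit positions are the standard x86 ones; the helper names (set_flags, iopl) are made up for illustration and are not part of mdb or any Solaris tool.

```python
# Sketch: expand an x86 RFLAGS value into the names of the flags that are set.
# Bit positions follow the x86 architecture definition of (R)FLAGS.
RFLAGS_BITS = {
    0: "cf", 2: "pf", 4: "af", 6: "zf", 7: "sf", 8: "tf",
    9: "if", 10: "df", 11: "of", 14: "nt", 16: "rf", 17: "vm",
    18: "ac", 19: "vif", 20: "vip", 21: "id",
}


def set_flags(rflags):
    """Return the names of all flag bits set in rflags, lowest bit first."""
    return [name for bit, name in sorted(RFLAGS_BITS.items())
            if (rflags >> bit) & 1]


def iopl(rflags):
    """Return the two-bit I/O privilege level field (bits 12-13)."""
    return (rflags >> 12) & 0x3


rflags = 0x00010246  # %rflags value from the dump
print(set_flags(rflags))
print("iopl =", hex(iopl(rflags)))
```

For the value in this dump the set bits come out as pf, zf, if and rf with iopl 0, which agrees with the upper-cased IF, ZF, PF entries and the rf=1 field printed above.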

While the panic string says NULL pointer dereference, it appears that 
0x180218720 is not mapped. The dereference looks like the first 
dereference in page_unlock(), which looks at pp->p_selock.
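
The "not mapped" reading can be sanity-checked against the trap registers. The following is a rough Python sketch, not anything taken from the crashdump tooling: it decodes the x86 page-fault error code in %err and classifies an address by which half of the address space it falls in. It assumes 48-bit canonical addressing, and the function names are hypothetical.

```python
# Sketch: decode an x86 page-fault error code and classify a 64-bit address.
# Assumption: 48-bit canonical addressing (bits 63..47 all equal).

PF_TRAPNO = 0xE  # vector 14 is the page-fault exception on x86


def decode_pf_err(err):
    """Describe the low bits of a page-fault error code."""
    return {
        "cause": "protection violation" if err & 0x1 else "page not present",
        "access": "write" if err & 0x2 else "read",
        "mode": "user" if err & 0x4 else "supervisor",
        "reserved_bit_set": bool(err & 0x8),
        "instruction_fetch": bool(err & 0x10),
    }


def address_half(addr):
    """Classify a 64-bit address as lower half, upper half or non-canonical."""
    top = addr >> 47  # the 17 bits 63..47
    if top == 0:
        return "lower half"
    if top == 0x1FFFF:
        return "upper half"
    return "non-canonical"


trapno, err = 0xE, 0x0      # %trapno and %err from the dump
pp = 0x180218720            # argument passed to page_unlock()
frame = 0xfffffe8000de7800  # a frame pointer from the same stack

print("page fault:", trapno == PF_TRAPNO, decode_pf_err(err))
print(hex(pp), "->", address_half(pp))
print(hex(frame), "->", address_half(frame))
```

With %trapno = 0xe and %err = 0x0 this decodes as a supervisor-mode read of a page that is not present. The pointer handed to page_unlock() lands in the lower half of the address space, whereas every other kernel address in the stack above is in the upper half, so it looks more like a garbage pointer than a true NULL.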

I can spend a little time looking at it, but was wondering if anyone had 
seen this kind of panic previously?

I have two identical crashdumps created in exactly the same way.

alan.
-- 
Alan Hargreaves - http://blogs.sun.com/tpenta
Staff Engineer (Kernel/VOSJEC/Performance)
Systems Technical Service Center
Sun Microsystems

Alan Hargreaves

2006-Sep-15 02:44 UTC

[zfs-discuss] Re: zfs panic installing a brandz zone

I know, bad form replying to myself, but I am wondering if it might be 
related to

         6438702 error handling in zfs_getpage() can trigger "page not locked"

Which is marked "fix in progress" with a target of the current build.

alan.

-- 
Alan Hargreaves - http://blogs.sun.com/tpenta
Staff Engineer (Kernel/VOSJEC/Performance)
Systems Technical Service Center
Sun Microsystems

I went in the World's Greatest Shave for Leukaemia. See
http://blogs.sun.com/roller/page/tpenta?entry=hair_yesterday_gone_today

Mark Maybee

2006-Sep-15 13:56 UTC

[zfs-discuss] Re: zfs panic installing a brandz zone

Yup, it's almost certain that this is the bug you are hitting.

-Mark
