thr3ads.net - freebsd stable - 10.2-STABLE amd64 panic: page fault while in kernel mode [Oct 2015]

Frank Razenberg

2015-Oct-14 13:52 UTC

10.2-STABLE amd64 panic: page fault while in kernel mode

After upgrading from 9.2 to 10.1 I first started noticing panics. They
occurred roughly weekly, and since this storage machine isn't frequently
used I didn't look into it much further. After updating to 10.2-STABLE
the panics have gone from weekly to daily.
The machine has 32GB of non-registered ECC DDR3-1066 RAM. There's also a
10-disk raidz2 pool. I've run memtest86+ for 72 hours straight with no
errors.

Crash dumps all feature the following:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 12
fault virtual address   = 0x1d1c0bec0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff804fda65
stack pointer           = 0x28:0xfffffe0698f21870
frame pointer           = 0x28:0xfffffe0698f218d0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6106 (pickup)
trap number             = 12
panic: page fault
cpuid = 2


(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff8053ce32 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:455
#2  0xffffffff8053d215 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:762
#3  0xffffffff8053d0a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:691
#4  0xffffffff807755db in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
#5  0xffffffff807758dd in trap_pfault (frame=0xfffffe0698dbc7c0, usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
#6  0xffffffff80774f7a in trap (frame=0xfffffe0698dbc7c0) at /usr/src/sys/amd64/amd64/trap.c:440
#7  0xffffffff8075b0f2 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff804fda65 in kqueue_close (fp=0xfffff803e4967190, td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_event.c:1750
#9  0xffffffff804f25f9 in _fdrop (fp=0xfffff803e4967190, td=0xfffff802b5d2a000) at file.h:343
#10 0xffffffff804f4e9e in closef (fp=<value optimized out>, td=<value optimized out>) at /usr/src/sys/kern/kern_descrip.c:2338
#11 0xffffffff804f4ab9 in fdescfree (td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_descrip.c:2106
#12 0xffffffff805013a9 in exit1 (td=0xfffff80014b094a0, rv=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:369
#13 0xffffffff80500e3e in sys_sys_exit (td=0xfffffe000782e060, uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:179
#14 0xffffffff80775efd in amd64_syscall (td=0xfffff80014b094a0, traced=0) at subr_syscall.c:134
#15 0xffffffff8075b3db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#16 0x000000080120335a in ?? ()

Most of the dumps list 'pickup' as current process. All of them have 
'kqueue_close' in the backtrace.
I'm not sure what the next step in diagnosing the issue is. Any pointers 
would be greatly appreciated.

-Frank
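
The observation that every dump shares the `kqueue_close` frame can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it assumes each `bt` output has been saved as text, and the helper names `frame_functions` and `common_functions` are hypothetical. It pulls the function name out of each kgdb frame line and reports which functions appear in every backtrace.

```python
import re

# Matches kgdb frame lines such as
#   "#8  0xffffffff804fda65 in kqueue_close (fp=..."
#   "#0  doadump (textdump=..."
# Frames without a symbol ("in ?? ()") are skipped.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+\s+in\s+)?(\w+)\s*\(", re.MULTILINE)


def frame_functions(backtrace):
    """Return the function names found in one backtrace, in frame order."""
    return FRAME_RE.findall(backtrace)


def common_functions(backtraces):
    """Return the set of function names present in every backtrace."""
    sets = [set(frame_functions(bt)) for bt in backtraces]
    return set.intersection(*sets) if sets else set()


# Excerpt of the backtrace posted above (frames 7 to 9).
SAMPLE_BT = """\
#7  0xffffffff8075b0f2 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff804fda65 in kqueue_close (fp=0xfffff803e4967190, td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_event.c:1750
#9  0xffffffff804f25f9 in _fdrop (fp=0xfffff803e4967190, td=0xfffff802b5d2a000) at file.h:343
"""

print(frame_functions(SAMPLE_BT))  # ['calltrap', 'kqueue_close', '_fdrop']
```

Passing the saved text of several dumps to `common_functions` would show whether `kqueue_close` is the only frame they share apart from the generic panic and trap path.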

Konstantin Belousov

2015-Oct-14 14:42 UTC

10.2-STABLE amd64 panic: page fault while in kernel mode

On Wed, Oct 14, 2015 at 03:52:47PM +0200, Frank Razenberg wrote:
> After upgrading from 9.2 to 10.1 I first started noticing panics. They
> occurred roughly weekly, and since this storage machine isn't frequently
> used I didn't look into it much further. After updating to 10.2-STABLE
> the panics have gone from weekly to daily.
> The machine has 32GB of non-registered ECC DDR3-1066 RAM. There's also a
> 10-disk raidz2 pool. I've run memtest86+ for 72 hours straight with no
> errors.
> 
> Crash dumps all feature the following:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 12
> fault virtual address   = 0x1d1c0bec0
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff804fda65
> stack pointer           = 0x28:0xfffffe0698f21870
> frame pointer           = 0x28:0xfffffe0698f218d0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                          = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6106 (pickup)
> trap number             = 12
> panic: page fault
> cpuid = 2
> 
> 
> (kgdb) bt
> #0  doadump (textdump=<value optimized out>) at pcpu.h:219
> #1  0xffffffff8053ce32 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:455
> #2  0xffffffff8053d215 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:762
> #3  0xffffffff8053d0a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:691
> #4  0xffffffff807755db in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
> #5  0xffffffff807758dd in trap_pfault (frame=0xfffffe0698dbc7c0, usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
> #6  0xffffffff80774f7a in trap (frame=0xfffffe0698dbc7c0) at /usr/src/sys/amd64/amd64/trap.c:440
> #7  0xffffffff8075b0f2 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
> #8  0xffffffff804fda65 in kqueue_close (fp=0xfffff803e4967190, td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_event.c:1750
> #9  0xffffffff804f25f9 in _fdrop (fp=0xfffff803e4967190, td=0xfffff802b5d2a000) at file.h:343
> #10 0xffffffff804f4e9e in closef (fp=<value optimized out>, td=<value optimized out>) at /usr/src/sys/kern/kern_descrip.c:2338
> #11 0xffffffff804f4ab9 in fdescfree (td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_descrip.c:2106
> #12 0xffffffff805013a9 in exit1 (td=0xfffff80014b094a0, rv=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:369
> #13 0xffffffff80500e3e in sys_sys_exit (td=0xfffffe000782e060, uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:179
> #14 0xffffffff80775efd in amd64_syscall (td=0xfffff80014b094a0, traced=0) at subr_syscall.c:134
> #15 0xffffffff8075b3db in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
> #16 0x000000080120335a in ?? ()
> 
> Most of the dumps list 'pickup' as current process. All of them have
> 'kqueue_close' in the backtrace.
> I'm not sure what the next step in diagnosing the issue is. Any pointers
> would be greatly appreciated.

What is the exact revision of the checkout you are running, where the panic
above occurs?

Please load the kernel.debug + vmcore into kgdb, go to frame 8, and do
p *kq
p *kn
p i
p kq->kq_knlist[i].slh_first
p *(kq->kq_knlist[i].slh_first)
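
When reading the output of these commands, it can help to check whether each printed pointer looks like a plausible kernel address. The sketch below is an editorial illustration in Python, not something from the thread. It relies only on the general amd64 rule that kernel virtual addresses lie in the upper canonical half (at or above 0xffff800000000000) and does not reproduce FreeBSD's exact kernel address map. The helper name `looks_like_kernel_pointer` is hypothetical.

```python
# Start of the upper canonical half on amd64 (general architecture rule,
# assumed here as a rough filter; not FreeBSD's exact kernel map).
UPPER_HALF_START = 0xFFFF_8000_0000_0000
MAX_ADDRESS = 0xFFFF_FFFF_FFFF_FFFF


def looks_like_kernel_pointer(addr):
    """Return True if addr falls in the upper canonical half."""
    return UPPER_HALF_START <= addr <= MAX_ADDRESS


fault_address = 0x1d1c0bec0        # "fault virtual address" from the trap output
file_pointer = 0xfffff803e4967190  # fp argument of kqueue_close in frame 8

for name, value in (("fault address", fault_address), ("fp", file_pointer)):
    print(f"{name}: {value:#x} kernel-range={looks_like_kernel_pointer(value)}")
```

The fault address falls outside the kernel range while `fp` does not. That is consistent with `kqueue_close` having followed a bad pointer, but only the kgdb output requested above can show which one.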
