thr3ads.net - freebsd stable - Kernel panic on 11.2-RELEASE-p7 [Jan 2019]

Jurij Kovačič

2018-Dec-25 06:57 UTC

Kernel panic on 11.2-RELEASE-p7

Dear list,

I hope I am posting this to the correct list - if not, I apologize (and
please advise where to post this instead).

Today I experienced a kernel panic on a (physical) server running FreeBSD
11.2-RELEASE-p7 with the GENERIC kernel and a ZFS root:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer    = 0x20:0xffffffff82299013
stack pointer            = 0x28:0xfffffe0352893ad0
frame pointer            = 0x28:0xfffffe0352893b10
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process        = 9 (dbuf_evict_thread)
trap number        = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80b3d577 at kdb_backtrace+0x67
#1 0xffffffff80af6b17 at vpanic+0x177
#2 0xffffffff80af6993 at panic+0x43
#3 0xffffffff80f77fdf at trap_fatal+0x35f
#4 0xffffffff80f7759e at trap+0x5e
#5 0xffffffff80f5808c at calltrap+0x8
#6 0xffffffff8229c049 at dbuf_evict_one+0xe9
#7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
#8 0xffffffff80aba093 at fork_exit+0x83
#9 0xffffffff80f58fae at fork_trampoline+0xe
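
One detail worth checking in the trace above is where the faulting instruction pointer sits relative to frame #6. The sketch below does that arithmetic on the low 32 bits of the two addresses (the upper 32 bits, 0xffffffff, are the same in both). The variable names are illustrative only, and the interpretation is an assumption: if the instruction pointer lies before the start of dbuf_evict_one, the fault most likely happened in a function called from dbuf_evict_one that the backtrace does not list, rather than in dbuf_evict_one itself.

```sh
#!/bin/sh
# Low 32 bits of the addresses quoted in the panic message. The upper
# 32 bits are identical in both, so they do not affect the comparison.
fault_ip=$((0x82299013))      # "instruction pointer" line
frame6_ret=$((0x8229c049))    # frame #6: dbuf_evict_one+0xe9
frame6_off=$((0xe9))          # offset part of "dbuf_evict_one+0xe9"

# Start of dbuf_evict_one, derived from "symbol+offset".
func_start=$((frame6_ret - frame6_off))
printf 'dbuf_evict_one starts at ...%x\n' "$func_start"

if [ "$fault_ip" -lt "$func_start" ]; then
    # The faulting instruction is outside dbuf_evict_one.
    gap=$((func_start - fault_ip))
    printf 'fault is 0x%x bytes before dbuf_evict_one\n' "$gap"
fi
```

With a vmcore and matching debug symbols, kgdb should be able to name the function that contains the instruction pointer. The arithmetic only shows that it is not dbuf_evict_one.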

I used the "crashinfo" utility to generate a text file, which is
available at this URL: http://www.ocpea.com/dump/core.txt

At the time of the crash, the server was probably under heavier I/O load
than usual (a scheduled backup with rsync).

This is a production server, so naturally, all advice is deeply
appreciated. :)

Kind regards,
Jurij

Jurij Kovačič

2018-Dec-28 10:07 UTC

Kernel panic on 11.2-RELEASE-p7

Dear list,

This morning the server mentioned in my previous e-mail (FreeBSD
11.2-RELEASE-p7 with the GENERIC kernel and a ZFS root) experienced another
kernel panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer    = 0x20:0xffffffff82299013
stack pointer            = 0x28:0xfffffe0352893ad0
frame pointer            = 0x28:0xfffffe0352893b10
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process        = 9 (dbuf_evict_thread)
trap number        = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80b3d577 at kdb_backtrace+0x67
#1 0xffffffff80af6b17 at vpanic+0x177
#2 0xffffffff80af6993 at panic+0x43
#3 0xffffffff80f77fdf at trap_fatal+0x35f
#4 0xffffffff80f7759e at trap+0x5e
#5 0xffffffff80f5808c at calltrap+0x8
#6 0xffffffff8229c049 at dbuf_evict_one+0xe9
#7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
#8 0xffffffff80aba093 at fork_exit+0x83
#9 0xffffffff80f58fae at fork_trampoline+0xe

I have used the "crashinfo" utility to (again) generate the text file,
which is available at this URL: http://www.ocpea.com/dump/core-2.txt
<http://www.ocpea.com/dump/core.txt>

Does anyone have any idea how we can go about discovering the cause of
this? We would appreciate any suggestions.

Kind regards,
Jurij Kovacic


Jurij Kovačič

2019-Jan-05 10:01 UTC

Kernel panic on 11.2-RELEASE-p7

Dear list,

About a week ago, we had a kernel panic on FreeBSD 11.2-RELEASE-p7 with the
GENERIC kernel and a ZFS root. As the kernel was not compiled with debug
support enabled, the resulting "vmcore" files were of little use.
Consequently, I recompiled the kernel with debug support:

--- GENERIC     2018-12-29 08:03:04.786846000 +0100
+++ DEBUG       2018-12-29 08:23:36.522966000 +0100
@@ -19,11 +19,16 @@
 # $FreeBSD: releng/11.2/sys/amd64/conf/GENERIC 333417 2018-05-09 16:14:12Z sbruno $

 cpu            HAMMER
-ident          GENERIC
+ident          DEBUG

 makeoptions    DEBUG=-g                # Build kernel with gdb(1) debug symbols
 makeoptions    WITH_CTF=1              # Run ctfconvert(1) for DTrace support

+# kernel debugging
+options                KDB
+options                KDB_UNATTENDED
+options                KDB_TRACE
+
 options        SCHED_ULE               # ULE scheduler
 options        PREEMPTION              # Enable kernel thread preemption
 options        INET                    # InterNETworking

and installed it.

After running for about a week, the server crashed again last night.
Unfortunately, there are no "vmcore" files in "/var/crash" this time.

The server has 12GB of RAM installed:
 # sysctl hw.physmem
hw.physmem: 12843053056

and uses 2 swap partitions (2G each):
# swapinfo -h
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p2       2097152     642M     1.4G    31%
/dev/ada1p2       2097152     638M     1.4G    31%
Total             4194304     1.3G     2.7G    31%

Dump device is set in /etc/rc.conf:
# grep dump /etc/rc.conf
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"

There seems to be enough space left in "/var/crash":
 # zfs list | grep crash
zroot/var/crash      857M  17.2G   857M  /var/crash

and, as I said earlier, the system DID create "vmcore" files when crashing
with the GENERIC kernel. Is it possible that the swap partition(s) are too
small for the memory dump, now that the kernel is compiled with debug
support? Or is some additional configuration needed to make the system save
vmcore files?
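
To put numbers on the size question, here is a rough check using only the figures quoted in this message. It rests on two assumptions that are not confirmed anywhere in this thread: that the dump is written to a single dump device (so the two 2G swap partitions do not add up), and that the kernel writes minidumps, which contain only kernel memory rather than all of RAM. The helper names are made up for illustration.

```sh
#!/bin/sh
# Rough size check with the values quoted above.
# bytes_to_mib and kblocks_to_mib are illustrative helpers only.

bytes_to_mib()   { echo $(( $1 / 1024 / 1024 )); }
kblocks_to_mib() { echo $(( $1 / 1024 )); }

physmem_mib=$(bytes_to_mib 12843053056)   # from "sysctl hw.physmem"
swapdev_mib=$(kblocks_to_mib 2097152)     # one swap partition, from swapinfo

echo "physical memory: ${physmem_mib} MiB"
echo "one swap device: ${swapdev_mib} MiB"

if [ "$swapdev_mib" -lt "$physmem_mib" ]; then
    # Assumption: one dump device holds the whole dump.
    echo "a full dump would not fit on one swap partition"
    echo "a minidump fits only if it stays under ${swapdev_mib} MiB"
fi
```

On the running system, "sysctl debug.minidump" shows whether minidumps are enabled and "dumpon -l" shows which device is currently configured for dumps. Both would be worth checking before resizing anything.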

Please advise.

Kind regards,
Jurij
