Dear list,

I hope I am posting this to the correct list. If not, I apologize (and please advise where to post this instead).

Today I experienced a kernel panic on a (physical) server running FreeBSD 11.2-RELEASE-p7 with the GENERIC kernel and ZFS root:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff82299013
stack pointer           = 0x28:0xfffffe0352893ad0
frame pointer           = 0x28:0xfffffe0352893b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 9 (dbuf_evict_thread)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80b3d577 at kdb_backtrace+0x67
#1 0xffffffff80af6b17 at vpanic+0x177
#2 0xffffffff80af6993 at panic+0x43
#3 0xffffffff80f77fdf at trap_fatal+0x35f
#4 0xffffffff80f7759e at trap+0x5e
#5 0xffffffff80f5808c at calltrap+0x8
#6 0xffffffff8229c049 at dbuf_evict_one+0xe9
#7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
#8 0xffffffff80aba093 at fork_exit+0x83
#9 0xffffffff80f58fae at fork_trampoline+0xe

I have used the "crashinfo" utility to generate a text file, which is available at this URL: http://www.ocpea.com/dump/core.txt

At the time of the crash, the server was probably under heavier I/O load than usual (a scheduled backup with rsync).

This is a production server, so naturally, all advice is deeply appreciated. :)

Kind regards,
Jurij
Dear list,

This morning the server mentioned in my previous e-mail (FreeBSD 11.2-RELEASE-p7 with the GENERIC kernel and ZFS root) experienced another kernel panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff82299013
stack pointer           = 0x28:0xfffffe0352893ad0
frame pointer           = 0x28:0xfffffe0352893b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 9 (dbuf_evict_thread)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80b3d577 at kdb_backtrace+0x67
#1 0xffffffff80af6b17 at vpanic+0x177
#2 0xffffffff80af6993 at panic+0x43
#3 0xffffffff80f77fdf at trap_fatal+0x35f
#4 0xffffffff80f7759e at trap+0x5e
#5 0xffffffff80f5808c at calltrap+0x8
#6 0xffffffff8229c049 at dbuf_evict_one+0xe9
#7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
#8 0xffffffff80aba093 at fork_exit+0x83
#9 0xffffffff80f58fae at fork_trampoline+0xe

I have (again) used the "crashinfo" utility to generate a text file, which is available at this URL: http://www.ocpea.com/dump/core-2.txt <http://www.ocpea.com/dump/core.txt>

Does anyone have any idea how we can go about discovering the cause of this? We would appreciate any suggestions.

Kind regards,
Jurij Kovacic

On Tue, Dec 25, 2018 at 7:57 AM Jurij Kovačič <jurij.kovacic at ocpea.com> wrote:

> Dear list,
>
> I hope I am posting this to the correct list - if not, I apologize (and
> please advise where to post this instead).
>
> Today I experienced a kernel panic on a (physical) server, running Freebsd
> 11.2-RELEASE-p7 with GENERIC kernel, ZFS root:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x20:0xffffffff82299013
> stack pointer           = 0x28:0xfffffe0352893ad0
> frame pointer           = 0x28:0xfffffe0352893b10
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 9 (dbuf_evict_thread)
> trap number             = 9
> panic: general protection fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80b3d577 at kdb_backtrace+0x67
> #1 0xffffffff80af6b17 at vpanic+0x177
> #2 0xffffffff80af6993 at panic+0x43
> #3 0xffffffff80f77fdf at trap_fatal+0x35f
> #4 0xffffffff80f7759e at trap+0x5e
> #5 0xffffffff80f5808c at calltrap+0x8
> #6 0xffffffff8229c049 at dbuf_evict_one+0xe9
> #7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
> #8 0xffffffff80aba093 at fork_exit+0x83
> #9 0xffffffff80f58fae at fork_trampoline+0xe
>
> I have used "crashinfo" utility to generate the text file which is
> available at this URL: http://www.ocpea.com/dump/core.txt
>
> At the time of the crash, the server was probably under more intensive I/O
> load (scheduled backup with rsync).
>
> This is a production server, so naturally, all advice is deeply
> appreciated. :)
>
> Kind regards,
> Jurij
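One thing that can be read off the panic message without any extra tooling is where the faulting instruction sits relative to the functions named in the backtrace. The sketch below subtracts the printed offsets from the frame addresses to get each function's start, then compares the instruction pointer against them. It assumes that "symbol+0xNN" means "start of symbol plus NN bytes", and it works on the low 32 bits only, since every address in the report shares the upper half 0xffffffff. The variable names are illustrative.

```shell
#!/bin/sh
# Low 32 bits of addresses copied from the panic message. The shared upper
# half (0xffffffff) is dropped so the values fit signed 64-bit shell arithmetic.
fault_ip=$((0x82299013))          # "instruction pointer" at the trap
evict_one_ret=$((0x8229c049))     # frame #6: dbuf_evict_one+0xe9
evict_thread_ret=$((0x82297a15))  # frame #7: dbuf_evict_thread+0x1a5

# Subtract the printed offsets to get each function's start address.
evict_one_start=$((evict_one_ret - 0xe9))
evict_thread_start=$((evict_thread_ret - 0x1a5))

printf 'dbuf_evict_one starts at    0xffffffff%x\n' "$evict_one_start"
printf 'dbuf_evict_thread starts at 0xffffffff%x\n' "$evict_thread_start"

# Signed distance from the start of dbuf_evict_one to the faulting instruction.
# A negative value means the fault address lies before that function.
delta=$((fault_ip - evict_one_start))
echo "fault IP relative to dbuf_evict_one: ${delta} bytes"
```

The distance comes out negative, so the instruction pointer falls outside dbuf_evict_one. This suggests the fault happened in a function called from dbuf_evict_one that the backtrace does not list; naming that function would need a debugger with symbols for the module that contains these addresses.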
Dear list,

About a week ago, we had a kernel panic on FreeBSD 11.2-RELEASE-p7 with the GENERIC kernel and ZFS root. As the kernel was not compiled with debug support enabled, the resulting "vmcore" files were of little use. Consequently, I recompiled the kernel with debug support:

--- GENERIC     2018-12-29 08:03:04.786846000 +0100
+++ DEBUG       2018-12-29 08:23:36.522966000 +0100
@@ -19,11 +19,16 @@
 # $FreeBSD: releng/11.2/sys/amd64/conf/GENERIC 333417 2018-05-09 16:14:12Z sbruno $
 
 cpu            HAMMER
-ident          GENERIC
+ident          DEBUG
 
 makeoptions    DEBUG=-g                # Build kernel with gdb(1) debug symbols
 makeoptions    WITH_CTF=1              # Run ctfconvert(1) for DTrace support
 
+# kernel debugging
+options        KDB
+options        KDB_UNATTENDED
+options        KDB_TRACE
+
 options        SCHED_ULE               # ULE scheduler
 options        PREEMPTION              # Enable kernel thread preemption
 options        INET                    # InterNETworking

and installed it. After running for about a week, the server crashed again last night. Unfortunately, there are no "vmcore" files in "/var/crash" this time.

The server has 12GB of RAM installed:

# sysctl hw.physmem
hw.physmem: 12843053056

and uses 2 swap partitions (2G each):

# swapinfo -h
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p2       2097152     642M     1.4G    31%
/dev/ada1p2       2097152     638M     1.4G    31%
Total             4194304     1.3G     2.7G    31%

The dump device is set in /etc/rc.conf:

# grep dump /etc/rc.conf
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"

There seems to be enough space left in "/var/crash":

# zfs list | grep crash
zroot/var/crash   857M  17.2G   857M  /var/crash

and, as I said earlier, the system DID create "vmcore" files when crashing with the GENERIC kernel.

Is it possible that the swap partition(s) are too small for the memory dump, now that the kernel is compiled with debug support? Or is some additional configuration needed to make the system save vmcore files?

Please advise.

Kind regards,
Jurij
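On the swap-size question, the figures quoted in the message can be compared directly. This is a minimal sketch, assuming that a full (non-mini) memory dump needs about as much room as physical memory and that the dump is written to a single swap partition rather than spread across both. Both points are assumptions for illustration, not something the thread confirms, and the variable names are made up.

```shell
#!/bin/sh
# Values copied from the sysctl and swapinfo output in the message.
physmem_bytes=12843053056   # hw.physmem
swap_kib=2097152            # 1K-blocks of one swap partition (ada0p2 or ada1p2)

# Convert the swap partition size to bytes (1 KiB = 1024 bytes).
swap_bytes=$((swap_kib * 1024))

# Whole MiB for readability (integer division rounds down).
physmem_mib=$((physmem_bytes / 1024 / 1024))
swap_mib=$((swap_bytes / 1024 / 1024))

echo "physical memory:    ${physmem_mib} MiB"
echo "one swap partition: ${swap_mib} MiB"

# Assumption: a full dump needs roughly as much space as physical memory.
if [ "$swap_bytes" -ge "$physmem_bytes" ]; then
    full_dump_fits=yes
else
    full_dump_fits=no
fi
echo "full dump fits on one swap partition: ${full_dump_fits}"
```

FreeBSD on amd64 normally writes minidumps (see the debug.minidump sysctl), which contain only kernel memory and can therefore be much smaller than RAM. Their size still varies with how much kernel memory is in use at the time of the panic, so a 2G partition may or may not be large enough on any given crash. Running "dumpon -l" shows which device is currently configured for dumps.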