We're currently experiencing a large number of kernel panics on
FreeBSD 8.2-RELEASE across a large number of machines here.

The base stack reported is a double fault with no additional details,
and CTRL+ALT+ESC fails to break to the debugger, as does an NMI, even
though it at least tries, printing the following many times, some
quite jumbled:-

NMI ... going to debugger

We've configured the dump device, but that also seems to fail to
capture any details, just sitting there after the panic with:

Dumping 4465MB:

The machines are single-disk ZFS root installs and the dump device is
configured using the gptid. Could this be what's preventing the dump
from happening?

The kernel is compiled with:-

options KDB                     # Kernel debugger related code
options KDB_TRACE               # Print a stack trace for a panic

We have remote KVM but not remote serial on most of the machines.

Any advice on how to debug this issue?

    Regards
    Steve
On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote:
> The base stack reported is a double fault with no additional
> details, and CTRL+ALT+ESC fails to break to the debugger, as does
> an NMI, even though it at least tries, printing the following many
> times, some quite jumbled:-
> NMI ... going to debugger

You may be interested in these system tunables (not sysctls). These
come from sys/amd64/amd64/trap.c (i386 has the same):

machdep.kdb_on_nmi (defaults to 1)
machdep.panic_on_nmi (defaults to 1)

If what you're seeing is a hardware NMI that fires, followed by the
machine panicking, the above tunables are probably what's doing that.

A hardware NMI could indicate an actual hardware issue of sorts,
depending on how the motherboard vendor implemented what they did.
For example, on a series of mainboards we have at my workplace, the
BIOS can be configured to generate either an NMI or SMI# when
different kinds of ECC RAM errors happen (either single-bit or
multi-bit parity errors). I don't know if that's what you're seeing.

If you're generating the NMI yourself (possibly via the KVM, etc.)
then okay, that's different. I'm trying to discern whether *you're*
generating the NMI, or whether the NMI just happens and causes a
panic for you, and that's what you're worried about.

Now to discuss the "jumbled console output": the interspersing of
kernel text output has plagued FreeBSD for a very long time (since
approximately 6.x). There have been statements from kernel coders
that you can decrease the likelihood of it happening by increasing
the PRINTF_BUFR_SIZE (not a typo) option in your kernel
configuration. The issue is exacerbated by use of SMP (either
multi-core or multi-CPU). The default (assuming your kernel configs
are based off of GENERIC within the past 4-5 years) is 128. However,
the same developers stated that they have great reservations over
increasing this number dramatically (meaning, something like 256
will probably work, but larger "may have repercussions which are
unknown at this time").

I have stated publicly then, and will do so again now, that this
option does not solve the problem. I acknowledge it may make it
"less likely to happen" or may decrease the amount of interspersed
output, but in my experience neither of those prove true; and more
importantly, said option does not solve the problem. I've talked
on-list with John Baldwin about this problem in the past, who had
some pretty good ideas of how to solve it.

I should point out that Solaris 10 and OpenSolaris (not sure about
present-day releases) both have this problem as well, especially
during kernel panics or MCEs. Linux addressed this issue by
implementing a ring-based cyclic buffer for its kernel messages
(syslog/klogd), and the model is extremely well-documented (quite
clever too):

http://www.mjmwired.net/kernel/Documentation/trace/ring-buffer-design.txt

I'm still surprised not a single GSoC project has attempted to solve
this for FreeBSD. It really is a serious matter, as it makes getting
kernel backtraces and crash data a serious pain in the butt. It can
also impact real-time debugging. These are the *worst* times to have
to tolerate something like this. I can point you to old threads about
this, and my old FreeBSD wiki page ("Commonly reported issues")
touches on this as well.
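[Editor's note: for concreteness, here is roughly what setting these
knobs looks like. The machdep values shown are just the defaults made
explicit, and 256 is only the modest PRINTF_BUFR_SIZE increase
mentioned above, not a recommendation.]

# /boot/loader.conf -- these are loader tunables, not runtime sysctls
machdep.kdb_on_nmi="1"      # try to enter the debugger when an NMI arrives
machdep.panic_on_nmi="1"    # otherwise panic on NMI

# kernel configuration file (e.g. sys/amd64/conf/MULTIPLAY),
# followed by a kernel rebuild:
options         PRINTF_BUFR_SIZE=256    # buffer printf output (default 128)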
The point I want to get across is that PRINTF_BUFR_SIZE does not
solve the problem.

> We've configured the dump device, but that also seems to fail to
> capture any details, just sitting there after the panic with:
> Dumping 4465MB:
>
> The machines are single-disk ZFS root installs and the dump device
> is configured using the gptid. Could this be what's preventing the
> dump from happening?

I can tell you that others have reported this problem, where the
kernel panic/dump begins but locks up either after showing the first
progress metre/amount or during the dumping itself.

I give everyone the same advice: please make sure that you have a
swap partition that's large enough to fit your entire memory contents
(preferably a swap that's 2x or 1.5x the amount of physical RAM), and
please make sure it's on a dedicated slice (e.g. ada0s1b). I do not
advise any sort of "abstraction" layer between swap and the rest of
the system. It might seem like a great/fun/awesome idea followed by
"whatever jdc, it works!", but when a crash happens -- which is when
you need it most -- and it doesn't work, I won't sympathise. :-)

As for the GPT aspects of things: I'm familiar with GPT as a
technology, but not with its actual usability in practice.

> The kernel is compiled with:-
> options KDB                     # Kernel debugger related code
> options KDB_TRACE               # Print a stack trace for a panic
>
> We have remote KVM but not remote serial on most of the machines.

As long as the remote KVM provides actual VGA-level redirection, that
is sufficient (though it makes copy-pasting output basically
impossible). We use serial console and tend to use these options; the
DDB and GDB options may be helpful for you, but not if the system is
behaving the way you describe.

# Debugging options
options         BREAK_TO_DEBUGGER       # Sending a serial BREAK drops to DDB
options         ALT_BREAK_TO_DEBUGGER   # Permit <CR>~<Ctrl-b> to drop to DDB
options         KDB                     # Enable kernel debugger support
options         KDB_TRACE               # Print stack trace automatically on panic
options         DDB                     # Support DDB
options         GDB                     # Support remote GDB

In combination with this, we use the following in /etc/rc.conf (the
dumpdev line is important, else savecore won't pick up anything):

dumpdev="auto"
ddb_enable="yes"

But we do not use any ddb scripts; I keep it in there Just In Case.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
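[Editor's note: acting on the swap/dump advice above might look
roughly like this. This is a sketch; ada0s1b is illustrative, and on
a gptid-labelled ZFS-root box the partition names will differ.]

# inspect the partitioning and any gptid labels in play
gpart show ada0
glabel status

# confirm swap is at least as large as physical RAM
swapinfo

# point the dump device directly at the raw swap partition
dumpon /dev/ada0s1b

# persist it across reboots in /etc/rc.conf:
dumpdev="/dev/ada0s1b"   # or dumpdev="auto" to use the first swap device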
on 10/08/2011 17:22 Steven Hartland said the following:
> The kernel is compiled with:-
> options KDB                     # Kernel debugger related code
> options KDB_TRACE               # Print a stack trace for a panic

You also have to provide an actual debugger backend, such as the
built-in DDB or a stub for remote GDB, to get online debugging. No
guarantees that this will help you get the debugging information, but
without it the chances are even slimmer.

You may also try this patch and see if it provides any improvements
for the post-panic environment (dumping etc.):

http://people.freebsd.org/~avg/stop_scheduler_on_panic.8.x.diff

It might also be a good idea to at least capture a screenshot of
whatever information you get on the console when the panic happens.

-- 
Andriy Gapon
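[Editor's note: fetching, applying and installing such a patch on an
8.x source tree would go roughly as follows. This is a sketch; the
patch -p level depends on how the diff was generated, and MULTIPLAY
is the kernel config name visible in the dmesg later in the thread.]

cd /usr/src
fetch http://people.freebsd.org/~avg/stop_scheduler_on_panic.8.x.diff
patch -p0 < stop_scheduler_on_panic.8.x.diff   # adjust -p if paths don't match
make buildkernel KERNCONF=MULTIPLAY
make installkernel KERNCONF=MULTIPLAY
shutdown -r now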
----- Original Message ----- 
From: "Steven Hartland" <killing@multiplay.co.uk>
To: <freebsd-stable@freebsd.org>
Sent: Wednesday, August 10, 2011 3:22 PM
Subject: debugging frequent kernel panics on 8.2-RELEASE

> We're currently experiencing a large number of kernel panics
> on FreeBSD 8.2-RELEASE across a large number of machines here.
>
> The base stack reported is a double fault with no additional
> details, and CTRL+ALT+ESC fails to break to the debugger, as does
> an NMI, even though it at least tries, printing the following many
> times, some quite jumbled:-
> NMI ... going to debugger
>
> We've configured the dump device, but that also seems to fail to
> capture any details, just sitting there after the panic with:
> Dumping 4465MB:
>
> The machines are single-disk ZFS root installs and the dump device
> is configured using the gptid. Could this be what's preventing the
> dump from happening?
>
> The kernel is compiled with:-
> options KDB                     # Kernel debugger related code
> options KDB_TRACE               # Print a stack trace for a panic
>
> We have remote KVM but not remote serial on most of the machines.
>
> Any advice on how to debug this issue?

ldn32.multiplay.co.uk dumped core - see /var/crash/vmcore.0

Wed Aug 10 14:02:07 UTC 2011

FreeBSD crash 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Jul 21 11:05:52 BST 2011
    root@crash:/usr/obj/usr/src/sys/MULTIPLAY  amd64

panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0xffffffff8052f6f1
rsp = 0xffffff86ce600fb0
rbp = 0xffffff86ce601210
cpuid = 0; apic id = 00
panic: double fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff803af91e at kdb_backtrace+0x5e
#1 0xffffffff8037d817 at panic+0x187
#2 0xffffffff80574316 at dblfault_handler+0x96
#3 0xffffffff8055d06d at Xdblfault+0xad
Uptime: 13d20h53m31s
Physical memory: 24555 MB
Dumping 3283 MB: 3268 3252 3236 3220 3204 3188 3172 3156 3140 3124 3108 3092
3076 3060 3044 3028 3012 2996 2980 2964 2948 2932 2916 2900 2884 2868 2852
2836 2820 2804 2788 2772 2756 2740 2724 2708 2692 2676 2660 2644 2628 2612
2596 2580 2564 2548 2532 2516 2500 2484 2468 2452 2436 2420 2404 2388 2372
2356 2340 2324 2308 2292 2276 2260 2244 2228 2212 2196 2180 2164 2148 2132
2116 2100 2084 2068 2052 2036 2020 2004 1988 1972 1956 1940 1924 1908 1892
1876 1860 1844 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652
1636 1620 1604 1588 1572 1556 1540 1524 1508 1492 1476 1460 1444 1428 1412
1396 1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204 1188 1172
1156 1140 1124 1108 1092 1076 1060 1044 1028 1012 996 980 964 948 932 916
900 884 868 852 836 820 804 788 772 756 740 724 708 692 676 660 644 628 612
596 580 564 548 532 516 500 484 468 452 436 420 404 388 372 356 340 324 308
292 276 260 244 228 212 196 180 164 148 132 116 100 84 68 52 36 20 4

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
/boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
/boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from
/boot/kernel/linprocfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from
/boot/kernel/nullfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/nullfs.ko

One of the machines has managed to dump where all the others have
failed to do so. Here's the stack from core.txt.0:

#0  sched_switch (td=0xffffffff80830bc0, newtd=0xffffff000a73f8c0,
    flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1858
1858            cpuid = PCPU_GET(cpuid);
(kgdb) #0  sched_switch (td=0xffffffff80830bc0, newtd=0xffffff000a73f8c0,
    flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1858
#1  0xffffffff80385c86 in mi_switch (flags=260, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:449
#2  0xffffffff803b92d2 in sleepq_timedwait (wchan=0xffffffff80830760, pri=68)
    at /usr/src/sys/kern/subr_sleepqueue.c:644
#3  0xffffffff803861e1 in _sleep (ident=0xffffffff80830760, lock=0x0,
    priority=Variable "priority" is not available.)
    at /usr/src/sys/kern/kern_synch.c:230
#4  0xffffffff80532c29 in scheduler (dummy=Variable "dummy" is not available.)
    at /usr/src/sys/vm/vm_glue.c:807
#5  0xffffffff80335d67 in mi_startup () at /usr/src/sys/kern/init_main.c:254
#6  0xffffffff8016efac in btext () at /usr/src/sys/amd64/amd64/locore.S:81
#7  0xffffffff808556e0 in sleepq_chains ()
#8  0xffffffff8083b1e0 in cpu_top ()
#9  0x0000000000000000 in ?? ()
#10 0xffffffff80830bc0 in proc0 ()
#11 0xffffffff80ba4b90 in ?? ()
#12 0xffffffff80ba4b38 in ?? ()
#13 0xffffff000a73f8c0 in ?? ()
#14 0xffffffff803a2cc9 in sched_switch (td=0x0, newtd=0x0,
    flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1852
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Not sure this really points to the cause, but we have the crash dump,
so we can do more digging if someone would point me in the correct
direction.

    Regards
    Steve
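[Editor's note: digging further into a dump like this is typically
done with a kgdb session along these lines. This is a sketch; it
assumes the kernel was built with debug symbols, and the thread and
frame numbers are purely illustrative.]

kgdb /usr/obj/usr/src/sys/MULTIPLAY/kernel.debug /var/crash/vmcore.0

(kgdb) bt                 # backtrace of the thread that panicked
(kgdb) info threads       # list all kernel threads in the dump
(kgdb) thread 12          # switch to another thread (illustrative number)
(kgdb) frame 3            # select a frame from the backtrace
(kgdb) list               # show source around the selected frame
(kgdb) info locals        # inspect local variables in that frame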
on 18/08/2011 02:15 Steven Hartland said the following:
> ----- Original Message ----- From: "Andriy Gapon" <avg@FreeBSD.org>
>
>> Thanks to the debug that Steven provided and to the help that I
>> received from Kostik, I think that now I understand the basic
>> mechanics of this panic, but, unfortunately, not the details of
>> its root cause.
>>
>> It seems like everything starts with some kind of a race between
>> terminating processes in a jail and termination of the jail
>> itself. This is where the details are very thin so far. What we
>> see is that a process (http) is in the exit(2) syscall, in the
>> exit1() function actually, and past the place where the P_WEXIT
>> flag is set, and even past the place where p_limit is freed and
>> reset to NULL. At that place the thread calls prison_proc_free(),
>> which calls prison_deref(). Then, we see that in prison_deref()
>> the thread gets a page fault because of what seems like a NULL
>> pointer dereference. That's just the start of the problem and its
>> root cause.
>
> That's interesting: are you using http as an example, or is that
> something that's been gleaned from the debugging of our output? I
> ask as there's only one process running in each of our jails, and
> that's a single java process.

It's from the debug data: p_comm = "httpd"

I also would like to ask you to revert the last patch that I sent you
(with the tf_rip comparisons) and try the patch from Kostik instead.
Given what we suspect about the problem, could you please also try to
provoke the problem, e.g. by doing frequent jail restarts or
something else that supposedly should hit the bug.

-- 
Andriy Gapon
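[Editor's note: a crude way to provoke the suspected race, as
suggested above, might be a tight restart loop. A sketch in sh, where
"jail_restart svc1" is a hypothetical stand-in for whatever command
the site's custom jail manager actually uses.]

#!/bin/sh
# Hammer one jail with restarts to try to trigger the shutdown/attach race.
# jail_restart and svc1 are hypothetical; substitute your jail manager's
# restart command and service name.
i=0
while [ "$i" -lt 1000 ]; do
    jail_restart svc1
    i=$((i + 1))
done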
on 18/08/2011 02:15 Steven Hartland said the following:
> In a nutshell, the jail manager we're using will attempt to
> resurrect the jail from a dying state in a few specific scenarios.
>
> Here's an example:
> 1. A jail restart is requested.
> 2. The jail is stopped, so the java process is killed off, but
>    active tcp sessions may prevent the timely full shutdown of the
>    jail.
> 3. If an existing jail is detected, i.e. a dying jail from #2,
>    instead of starting a new jail we attach to the old one and exec
>    the new java process.
> 4. If an existing jail isn't detected, i.e. where there were no
>    hanging tcp sessions and #2 cleanly shut down the jail, a new
>    jail is created and attached to, and the java process is
>    exec'ed.
>
> The system uses static jailids, so it's possible to determine
> whether an existing jail for this "service" exists or not. This
> prevents duplicate services as well as making services easy to
> identify by their jailid.
>
> So what we could be seeing is a race between the jail shutdown and
> the attach of the new process?

Not a jail expert at all, but a few suggestions...

First, wouldn't the 'persist' jail option simplify your life a little
bit?

Second, you may want to try to monitor the value of the
prison0.pr_uref variable (e.g. via kgdb) while executing the various
scenarios of what you do now. If after finishing a certain scenario
you end up with a value lower than at the start of the scenario, then
that scenario is the troublesome one.

Please note that prison0.pr_uref is composed of the number of
non-jailed processes plus the number of top-level jails, so take this
into account when comparing prison0.pr_uref values - it's better to
record the initial value when no jails are started, and it's
important to keep the number of non-jailed processes the same (or to
account for its changes).

-- 
Andriy Gapon
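[Editor's note: putting both suggestions into concrete form, a
minimal sketch. The jail name, jid and path are illustrative, and
kgdb against /dev/mem requires debug symbols matching the running
kernel.]

# 1. Create the jail with the 'persist' parameter so it survives its
#    last process exiting instead of lingering in the dying state:
jail -c jid=101 name=svc1 path=/jails/svc1 persist

# 2. Watch prison0.pr_uref on the live kernel:
kgdb /boot/kernel/kernel /dev/mem

(kgdb) print prison0.pr_uref    # baseline, with no jails running
# ... run one jail stop/start scenario, then re-check:
(kgdb) print prison0.pr_uref    # a net drop marks the leaking scenario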