On Wed, May 10, 2017 at 03:29:39PM -0700, Olivier Cinquin
wrote:
> Hi,
> I'm getting the following kernel panics on recent 11-STABLE (r317883):
>
> spin lock 0xffffffff81df43d0 (smp rendezvous) held by 0xfffff8019c7a7000 (tid 100845) too long
> timeout stopping cpus
> panic: spin lock held too long
> cpuid = 12
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe3e64caf2a0
> vpanic() at vpanic+0x186/frame 0xfffffe3e64caf320
> panic() at panic+0x43/frame 0xfffffe3e64caf380
> _mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0x311/frame 0xfffffe3e64caf3f0
> smp_targeted_tlb_shootdown() at smp_targeted_tlb_shootdown+0x101/frame 0xfffffe3e64caf470
> smp_masked_invlpg_range() at smp_masked_invlpg_range+0x50/frame 0xfffffe3e64caf4a0
> pmap_invalidate_range() at pmap_invalidate_range+0x196/frame 0xfffffe3e64caf4e0
> vm_thread_stack_dispose() at vm_thread_stack_dispose+0x2f/frame 0xfffffe3e64caf530
> thread_free() at thread_free+0x39/frame 0xfffffe3e64caf550
> thread_reap() at thread_reap+0x10e/frame 0xfffffe3e64caf570
> proc_reap() at proc_reap+0x6bd/frame 0xfffffe3e64caf5b0
> proc_to_reap() at proc_to_reap+0x48c/frame 0xfffffe3e64caf600
> kern_wait6() at kern_wait6+0x49e/frame 0xfffffe3e64caf6b0
> sys_wait4() at sys_wait4+0x78/frame 0xfffffe3e64caf8a0
> amd64_syscall() at amd64_syscall+0x6c4/frame 0xfffffe3e64cafa30
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe3e64cafa30
>
>
> The panics occur sporadically and seemingly randomly. They occur on
> multiple machines with the same configuration, and on older revisions of
> 11-STABLE as well as r317883 (I cannot easily bisect this down to a
> specific commit given that it can take days or sometimes weeks for the
> panic to occur). Note that my kernel is compiled with "options IPSEC", but
> that does not seem to be relevant.
>
> My understanding of the overall problem is that an IPI is sent to all cores
> by smp_rendezvous_cpus, but that some of the cores do not respond to it (or
> at least do not respond to it correctly and in time). More specifically, I
> can find 61 threads that seem to be blocked in cpustop_handler on an atomic
> operation to indicate that they have stopped in response to the IPI. This
> operation is CPU_SET_ATOMIC(cpu, &stopped_cpus); CPU_SET_ATOMIC boils down
> to a call to "atomic_set_long" (not sure where that is itself defined for
> amd64). Either there is a line numbering problem, or this suggests that
> perhaps there is a deadlock at this step (unless there's some sort of
> livelock and CPU_SET_ATOMIC is the place where the threads spend most of
> their time).
>
> Looking at the thread backtraces as a whole, I can see the one that
> triggered the panic, the 61 that are in cpustop_handler, and a lot of
> sleeping threads. I guess each core should have an active thread, and this
> is an architecture with 64 cores, so that leaves 64 - (1 + 61) = 2 cores
> that are unaccounted for. Interestingly, for 2 different panics for which I
> have a vmcore file these numbers are the same. For these two panics, the
> IDs of the cores that are neither in cpustop_handler nor calling the smp
> rendez vous are not the same (#18 and #19 in one instance, #44 and #45 in
> the other instance) but in both instances the IDs are consecutive; perhaps
> that's relevant.
>
> I've uploaded a full set of backtraces at
> https://gist.github.com/cinquin/d63cdf9de01b0b8033c47128868f2d38 and a
> dmesg at https://gist.github.com/cinquin/17c0cf6ac7832fd1b683488d08d3e69b
> (in short, the machine is a 4 processor Opteron 6274 system). I'm pasting
> below an example backtrace of the threads in cpustop_handler.
>
> Any suggestions as to what the problem might be and how to solve it? I've
> saved the core and can provide further information.
>
> Thanks
> Olivier Cinquin
>
> A total of 61 threads are along the lines of
>
> #0 cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1275
> #1 0xffffffff8105a6f5 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1231
> #2 0xffffffff80ef740a in trap (frame=0xfffffe3e687f7f30) at /usr/src/sys/amd64/amd64/trap.c:185
> #3 0xffffffff80edbd03 in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:510
> ...
> (kgdb) frame 0
> #0 cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1275
> 1275 CPU_SET_ATOMIC(cpu, &stopped_cpus);
> (kgdb) list
> 1270 cpu = PCPU_GET(cpuid);
> 1271
> 1272 savectx(&stoppcbs[cpu]);
> 1273
> 1274 /* Indicate that we are stopped */
> 1275 CPU_SET_ATOMIC(cpu, &stopped_cpus);
> 1276
> 1277 /* Wait for restart */
> 1278 while (!CPU_ISSET(cpu, &started_cpus))
> 1279 ia32_pause();
> (kgdb) p stopped_cpus
> $1 = {__bits = 0xffffffff81df4368}
> (kgdb) p started_cpus
> $2 = {__bits = 0xffffffff81df4418}
>
It would be interesting to see only the backtraces for the threads which
are on CPUs, i.e. all the threads in cpustop_handler() and the others.
From your description, it looks like there are two problems, both of
which might have the same cause.  The initial thread broadcasts the TLB
shootdown IPI by calling smp_targeted_tlb_shootdown().  The function
locks the smp_ipi_mtx spinlock.  After the IPI is sent, the function
loops waiting for the other cores to acknowledge the IPI.  This is the

	while (*p_cpudone != generation)
		ia32_pause();

loop, which reads from the other CPUs' PCPU areas.
Then another thread decides that it should execute a shootdown as well,
and tries to lock the same smp_ipi_mtx spinlock.  It times out because
the initial thread still holds the lock, and panics.  panic(9) sends a
broadcast NMI to all cores except itself to stop the machine.  This is
what you see in the backtraces of all but one thread.

If you do not see the NMI handler entry for the initial thread, that
indicates the second problem; either way, showing just the backtraces
for each CPU would clear up the possible confusion.
The question of why the shootdown times out is open either way.  You
should examine pc_smp_tlb_done in each PCPU area to see whether the CPUs
properly reacted to the IPI and it is the initiator which does not see
the update, or whether it is some CPU which did not receive the IPI.
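In kgdb that comparison could look like the following; I am assuming the
variable names used by smp_targeted_tlb_shootdown() in your sources, and
the generation counter is static, so kgdb may need the right file scope:

	(kgdb) p smp_tlb_generation
	(kgdb) p cpuid_to_pcpu[18]->pc_smp_tlb_done
	(kgdb) p cpuid_to_pcpu[19]->pc_smp_tlb_done

If every pc_smp_tlb_done already equals the generation, the initiator is
failing to observe the stores; if the slots of the unaccounted-for CPUs
lag behind, those CPUs never handled the IPI.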
This might be a CPU erratum.  I remember much earlier AMD CPUs having an
issue with tight read loops; it could be that this is it.  Did you try
updating the BIOS?  Also install sysutils/devcpu-data and update the
microcode on the processors online.  I skimmed over the AMD document
48063 Rev. 3.24, "Revision Guide for AMD Family 15h Models 00h-0Fh
Processors"; nothing outstanding came out except possibly Erratum 619:
	619 Non-Posted Reads May Block Write Dependent on Probe Responses

	The northbridge may stall indefinitely on non-posted reads when a
	posted write becomes dependent on probe responses.
You might also try to change the loop I cited above to some variant like

	while (*p_cpudone != generation) {
		ia32_pause();
		mfence();
	}

and report whether any of the magic helped.