thr3ads.net - Xen devel - [Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3 [Jul 2011]

Konrad Rzeszutek Wilk

2011-Jul-11 16:24 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Sun, Jul 10, 2011 at 04:14:49PM -0700, Paul E. McKenney wrote:
> On Sun, Jul 10, 2011 at 10:50:48PM +0100, julie Sullivan wrote:
> > > Very cool!  Thank you very much for the testing --
.. snip..
> And here is what I am proposing sending upstream.  I have your Tested-by,

Hey Paul,

I am hitting a similar bug.
Starting udev Kernel Device Manager...
Starting Configure read-only root support...
[   79.942067] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 2, t=60002 jiffies)
[   79.942089] sending NMI to all CPUs:

when running 3.0-rc6 under Xen as a 32-bit guest (I don't see this issue
when running a 64-bit guest) and when I have more than two CPUs in the
guest.

I've tried the patch below against 3.0-rc6 and it did not fix the issue.

I've also tried 3.0-rc3, since somewhere in the thread one of the reporters
mentioned that it worked for them, but that did not help me.

The config is Fedora Core based. The stack traces of the four CPUs look
as follows:

CPU0:
Call Trace:
  [<c04023a7>] hypercall_page+0x3a7  <--
  [<c0405ed5>] xen_safe_halt+0x12 
  [<c040ea08>] default_idle+0x5a 
  [<c04081a6>] cpu_idle+0x8e 
  [<c07da9a9>] rest_init+0x5d 
  [<c0a86788>] start_kernel+0x34d 
  [<c0a861c4>] unknown_bootoption 
  [<c0a860ba>] i386_start_kernel+0xa9 
  [<c0a895ce>] xen_start_kernel+0x55d 
  [<c04090b1>] sys_rt_sigreturn+0xb 

CPU1 and CPU2:
Call Trace:
  [<c04023a7>] hypercall_page+0x3a7  <--
  [<c0405ed5>] xen_safe_halt+0x12 
  [<c040ea08>] default_idle+0x5a 
  [<c04081a6>] cpu_idle+0x8e 
  [<c07e5419>] cpu_bringup_and_idle+0xd 

CPU3:
Call Trace:
  [<c042d0f2>] task_waking_fair+0x11  <--
  [<c0439a45>] try_to_wake_up+0xb2 
  [<c0439b0c>] default_wake_function+0x10 
  [<c042d4db>] __wake_up_common+0x3b 
  [<c042ea69>] complete+0x3e 
  [<c0455e14>] wakeme_after_rcu+0x10 
  [<c048fd58>] __rcu_process_callbacks+0x172 
  [<c049080f>] rcu_process_callbacks+0x20 
  [<c044567d>] __do_softirq+0xa2 
  [<c04455db>] __do_softirq 
  [<c040a52d>] do_softirq+0x5a 

The full config is http://darnok.org/xen/config-rcu-stall
The full bootup log is http://darnok.org/xen/log-rcu-stall

Any thoughts on what I ought to try? I don't know if there is some
functionality missing from the RCU patches that is needed to work under Xen.
Is there an older version of the Linux kernel you would like me to try?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Paul E. McKenney

2011-Jul-11 17:13 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Mon, Jul 11, 2011 at 12:24:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Sun, Jul 10, 2011 at 04:14:49PM -0700, Paul E. McKenney wrote:
> > On Sun, Jul 10, 2011 at 10:50:48PM +0100, julie Sullivan wrote:
> > > > Very cool!  Thank you very much for the testing --
> .. snip..
> > And here is what I am proposing sending upstream.  I have your Tested-by,
> 
> Hey Paul,
> 
> I am hitting a similar bug.
> Starting udev Kernel Device Manager...
> Starting Configure read-only root support...
> [   79.942067] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 2, t=60002 jiffies)
> [   79.942089] sending NMI to all CPUs:
> 
> when running 3.0-rc6 under Xen as a 32-bit guest (I don't see this issue
> when running a 64-bit guest) and when I have more than two CPUs in the
> guest.
> 
> I've tried the patch below against 3.0-rc6 and it did not fix the issue.
> 
> I've also tried 3.0-rc3, since somewhere in the thread one of the reporters
> mentioned that it worked for them, but that did not help me.
> 
> The config is Fedora Core based. The stack traces of the four CPUs look
> as follows:
> 
> CPU0:
> Call Trace:
>   [<c04023a7>] hypercall_page+0x3a7  <--
>   [<c0405ed5>] xen_safe_halt+0x12 
>   [<c040ea08>] default_idle+0x5a 
>   [<c04081a6>] cpu_idle+0x8e 
>   [<c07da9a9>] rest_init+0x5d 
>   [<c0a86788>] start_kernel+0x34d 
>   [<c0a861c4>] unknown_bootoption 
>   [<c0a860ba>] i386_start_kernel+0xa9 
>   [<c0a895ce>] xen_start_kernel+0x55d 
>   [<c04090b1>] sys_rt_sigreturn+0xb 
> 
> CPU1 and CPU2:
> Call Trace:
>   [<c04023a7>] hypercall_page+0x3a7  <--
>   [<c0405ed5>] xen_safe_halt+0x12 
>   [<c040ea08>] default_idle+0x5a 
>   [<c04081a6>] cpu_idle+0x8e 
>   [<c07e5419>] cpu_bringup_and_idle+0xd 
> 
> CPU3:
> Call Trace:
>   [<c042d0f2>] task_waking_fair+0x11  <--
>   [<c0439a45>] try_to_wake_up+0xb2 
>   [<c0439b0c>] default_wake_function+0x10 
>   [<c042d4db>] __wake_up_common+0x3b 
>   [<c042ea69>] complete+0x3e 
>   [<c0455e14>] wakeme_after_rcu+0x10 
>   [<c048fd58>] __rcu_process_callbacks+0x172 
>   [<c049080f>] rcu_process_callbacks+0x20 
>   [<c044567d>] __do_softirq+0xa2 
>   [<c04455db>] __do_softirq 
>   [<c040a52d>] do_softirq+0x5a 
> 
> The full config is http://darnok.org/xen/config-rcu-stall
> The full bootup log is http://darnok.org/xen/log-rcu-stall
> 
> Any thoughts on what I ought to try? I don't know if there is some
> functionality missing from the RCU patches that is needed to work under Xen.
> Is there an older version of the Linux kernel you would like me to try?

Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?

One thing to try would be to disable CONFIG_RCU_FAST_NO_HZ.  I wouldn't
expect this to have any effect, but might be worth a try.  It is really
intended for small battery-powered systems.

							Thanx, Paul

Konrad Rzeszutek Wilk

2011-Jul-11 19:30 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

> Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?

Starting Configure read-only root support...
[   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
[   81.335091] sending NMI to all CPUs:
[  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
[  261.367092] sending NMI to all CPUs:
[  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
[  441.399089] sending NMI to all CPUs:
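
The timing in the log above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that the "t=" field counts jiffies since the stalled grace period began, which is my reading of the message and is not stated in this thread.

```c
#include <stddef.h>

/* Values copied from the three stall reports in the log above. */
static const double stall_ts[]      = { 81.335070, 261.367071, 441.399066 };
static const long   stall_jiffies[] = { 60002, 240034, 420066 };

/* Seconds between report a and report b (printk timestamps). */
static double report_interval_s(size_t a, size_t b)
{
    return stall_ts[b] - stall_ts[a];
}

/* Estimate HZ as (change in jiffies) / (change in seconds), rounded. */
static long estimate_hz(size_t a, size_t b)
{
    long dj = stall_jiffies[b] - stall_jiffies[a];

    return (long)((double)dj / report_interval_s(a, b) + 0.5);
}

/*
 * Boot time, in seconds, at which the stalled grace period would have
 * started, ASSUMING t= is jiffies elapsed since that start.
 */
static double grace_period_start_s(size_t i, long hz)
{
    return stall_ts[i] - (double)stall_jiffies[i] / (double)hz;
}
```

Under those assumptions the reports are about 180 seconds apart (3 minutes rather than 3.5), the jiffies deltas imply HZ=1000, and all three reports point back to the same starting time of roughly 21.3 seconds after boot, which would mean a single grace period that never completes.
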
> 
> One thing to try would be to disable CONFIG_RCU_FAST_NO_HZ.  I wouldn't
> expect this to have any effect, but might be worth a try.  It is really

Did not help.

> intended for small battery-powered systems.

Paul E. McKenney

2011-Jul-11 20:15 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Mon, Jul 11, 2011 at 03:30:22PM -0400, Konrad Rzeszutek Wilk wrote:
> > 
> > Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?
> 
> Starting Configure read-only root support...
> [   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> [   81.335091] sending NMI to all CPUs:
> [  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> [  261.367092] sending NMI to all CPUs:
> [  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> [  441.399089] sending NMI to all CPUs:

OK, then the likely cause is something hanging onto the CPU.  Do the later
stalls also show stack traces?  If so, what shows up?

(See Documentation/RCU/stallwarn.txt for more info on this.)

> > One thing to try would be to disable CONFIG_RCU_FAST_NO_HZ.  I wouldn't
> > expect this to have any effect, but might be worth a try.  It is really
> 
> Did not help.

Well, I guess it met my expectations.  Thank you for trying it!

							Thanx, Paul

> > intended for small battery-powered systems.

Konrad Rzeszutek Wilk

2011-Jul-11 21:09 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Mon, Jul 11, 2011 at 01:15:08PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 11, 2011 at 03:30:22PM -0400, Konrad Rzeszutek Wilk wrote:
> > > 
> > > Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?
> > 
> > Starting Configure read-only root support...
> > [   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> > [   81.335091] sending NMI to all CPUs:
> > [  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> > [  261.367092] sending NMI to all CPUs:
> > [  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> > [  441.399089] sending NMI to all CPUs:
> 
> OK, then the likely cause is something hanging onto the CPU.  Do the later
> stalls also show stack traces?  If so, what shows up?

I don't really get any stack traces from the guest. Not sure why it does
not print them out (probably because the NMI functionality is not accessible
somehow?). I get the stack traces using the 'xenctx' tool, and this is what
I get from the guest before the stall and after the stall:

20:45:56 # 12 :/mnt/tmp/FC15-32/ 
/usr/lib64/xen/bin/xenctx 29 -s System.map-3.0.0-rc6-disabled-options+ -a 2
cs:eip: 0061:c042d0f5 task_waking_fair+0x14 
flags: 00001286 i s nz p
ss:esp: 0069:e94cff0c
eax: c18dbed0   ebx: ffffffff   ecx: fff00000   edx: c14a10c0
esi: 00000000   edi: 00000000   ebp: e94cff18
 ds:     007b    es:     007b    fs:     00d8    gs:     00e0

cr0: 8005003b
cr2: b7743000
cr3: 97348001
cr4: 00000660

dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: ffff0ff0
dr7: 00000400
Code (instr addr c042d0f5)
c3 55 89 e5 57 56 53 3e 8d 74 26 00 8b 90 58 01 00 00 8b 7a 1c <8b> 72 20 8b 5a 18 8b 4a 14 39 f3


Stack:
 c18dbed0 00000003 00000002 e94cff38 c0439a45 c18d00c0 c18dc2c0 00000000
 e8bd1ec4 e8bd1ef8 00000003 e94cff40 c0439b0c e94cff64 c042d4db 00000000
 e8bd1f04 00000001 00000001 e8bd1f00 e8bd0200 e8bd1efc e94cff80 c042ea69
 00000000 00000000 e8bd1ef4 ea9c4918 c0a43a80 e94cff88 c0455e14 e94cffb4

Call Trace:
  [<c042d0f5>] task_waking_fair+0x14  <--
  [<c0439a45>] try_to_wake_up+0xb2 
  [<c0439b0c>] default_wake_function+0x10 
  [<c042d4db>] __wake_up_common+0x3b 
  [<c042ea69>] complete+0x3e 
  [<c0455e14>] wakeme_after_rcu+0x10 
  [<c048fd26>] __rcu_process_callbacks+0x172 
  [<c048fe14>] rcu_process_callbacks+0x1e 
  [<c044567d>] __do_softirq+0xa2 

Sander Eikelenboom

2011-Jul-12 06:33 UTC

Re: [Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

Monday, July 11, 2011, 9:30:22 PM, you wrote:

>> 
>> Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?

> Starting Configure read-only root support...
> [   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> [   81.335091] sending NMI to all CPUs:
> [  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> [  261.367092] sending NMI to all CPUs:
> [  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> [  441.399089] sending NMI to all CPUs:

>> 
>> One thing to try would be to disable CONFIG_RCU_FAST_NO_HZ.  I wouldn't
>> expect this to have any effect, but might be worth a try.  It is really

> Did not help.
>> intended for small battery-powered systems.
>> 

Just as a note, I'm also seeing some stalls from domUs running a 3.0-rc
kernel from your master tree (about 2 weeks old). But it seems the first
occurrences are not as quick after booting.
Unfortunately I haven't got time to investigate more this week.

--
Sander



Paul E. McKenney

2011-Jul-12 10:55 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Mon, Jul 11, 2011 at 05:09:54PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 11, 2011 at 01:15:08PM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 11, 2011 at 03:30:22PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > 
> > > > Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?
> > > 
> > > Starting Configure read-only root support...
> > > [   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> > > [   81.335091] sending NMI to all CPUs:
> > > [  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> > > [  261.367092] sending NMI to all CPUs:
> > > [  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> > > [  441.399089] sending NMI to all CPUs:
> > 
> > OK, then the likely cause is something hanging onto the CPU.  Do the later
> > stalls also show stack traces?  If so, what shows up?
> 
> I don't really get any stack traces from the guest. Not sure why it does
> not print them out (probably because the NMI functionality is not accessible
> somehow?). I get the stack traces using the 'xenctx' tool, and this is what
> I get from the guest before the stall and after the stall:
> 
> 20:45:56 # 12 :/mnt/tmp/FC15-32/ 
> /usr/lib64/xen/bin/xenctx 29 -s System.map-3.0.0-rc6-disabled-options+ -a 2
> cs:eip: 0061:c042d0f5 task_waking_fair+0x14 
> flags: 00001286 i s nz p
> ss:esp: 0069:e94cff0c
> eax: c18dbed0   ebx: ffffffff   ecx: fff00000   edx: c14a10c0
> esi: 00000000   edi: 00000000   ebp: e94cff18
>  ds:     007b    es:     007b    fs:     00d8    gs:     00e0
> 
> cr0: 8005003b
> cr2: b7743000
> cr3: 97348001
> cr4: 00000660
> 
> dr0: 00000000
> dr1: 00000000
> dr2: 00000000
> dr3: 00000000
> dr6: ffff0ff0
> dr7: 00000400
> Code (instr addr c042d0f5)
> c3 55 89 e5 57 56 53 3e 8d 74 26 00 8b 90 58 01 00 00 8b 7a 1c <8b> 72 20 8b 5a 18 8b 4a 14 39 f3
> 
> 
> Stack:
>  c18dbed0 00000003 00000002 e94cff38 c0439a45 c18d00c0 c18dc2c0 00000000
>  e8bd1ec4 e8bd1ef8 00000003 e94cff40 c0439b0c e94cff64 c042d4db 00000000
>  e8bd1f04 00000001 00000001 e8bd1f00 e8bd0200 e8bd1efc e94cff80 c042ea69
>  00000000 00000000 e8bd1ef4 ea9c4918 c0a43a80 e94cff88 c0455e14 e94cffb4
> 
> Call Trace:
>   [<c042d0f5>] task_waking_fair+0x14  <--

Hmmm...  This is a 32-bit system, isn't it?

Could you please add a check to the loop in task_waking_fair() and
do a printk() if the loop does (say) more than 1000 passes without
exiting?

							Thanx, Paul

>   [<c0439a45>] try_to_wake_up+0xb2 
>   [<c0439b0c>] default_wake_function+0x10 
>   [<c042d4db>] __wake_up_common+0x3b 
>   [<c042ea69>] complete+0x3e 
>   [<c0455e14>] wakeme_after_rcu+0x10 
>   [<c048fd26>] __rcu_process_callbacks+0x172 
>   [<c048fe14>] rcu_process_callbacks+0x1e 
>   [<c044567d>] __do_softirq+0xa2 

Paul E. McKenney

2011-Jul-12 14:05 UTC

Re: [Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 08:33:17AM +0200, Sander Eikelenboom wrote:
> Monday, July 11, 2011, 9:30:22 PM, you wrote:
> 
> >> 
> >> Hmmm...  Does the stall repeat about every 3.5 minutes after the first stall?
> 
> > Starting Configure read-only root support...
> > [   81.335070] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=60002 jiffies)
> > [   81.335091] sending NMI to all CPUs:
> > [  261.367071] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=240034 jiffies)
> > [  261.367092] sending NMI to all CPUs:
> > [  441.399066] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 3, t=420066 jiffies)
> > [  441.399089] sending NMI to all CPUs:
> 
> >> 
> >> One thing to try would be to disable CONFIG_RCU_FAST_NO_HZ.  I wouldn't
> >> expect this to have any effect, but might be worth a try.  It is really
> 
> > Did not help.
> >> intended for small battery-powered systems.
> >> 
> 
> Just as a note, I'm also seeing some stalls from domUs running a 3.0-rc
> kernel from your master tree (about 2 weeks old). But it seems the first
> occurrences are not as quick after booting.
> Unfortunately I haven't got time to investigate more this week.

Should you get some time, providing the stack traces from the CPU stall
warnings would give me something to go on.  ;-)

							Thanx, Paul

Konrad Rzeszutek Wilk

2011-Jul-12 14:12 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

> >   [<c042d0f5>] task_waking_fair+0x14  <--
> 
> Hmmm...  This is a 32-bit system, isn't it?

Yes. I ran this little loop:

#!/bin/bash

ID=`xl list | grep Fedora | awk '  { print $2}'`

rm -f cpu*.log
while (true) do
	xl pause $ID
	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
	xl unpause $ID
done

This was to get an idea of what the CPU is doing before it hits
task_waking_fair, and there isn't anything damning. Here are the logs:

http://darnok.org/xen/cpu1.log

> Could you please add a check to the loop in task_waking_fair() and
> do a printk() if the loop does (say) more than 1000 passes without
> exiting?

Of course. Let me queue that up.

Paul E. McKenney

2011-Jul-12 14:49 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 10:12:28AM -0400, Konrad Rzeszutek Wilk wrote:
> > >   [<c042d0f5>] task_waking_fair+0x14  <--
> > 
> > Hmmm...  This is a 32-bit system, isn't it?
> 
> Yes. I ran this little loop:
> 
> #!/bin/bash
> 
> ID=`xl list | grep Fedora | awk '  { print $2}'`
> 
> rm -f cpu*.log
> while (true) do
> 	xl pause $ID
> 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
> 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
> 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
> 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
> 	xl unpause $ID
> done
> 
> This was to get an idea of what the CPU is doing before it hits
> task_waking_fair, and there isn't anything damning. Here are the logs:
> 
> http://darnok.org/xen/cpu1.log

OK, a fair amount of variety, then lots and lots of task_waking_fair(),
so I still feel good about asking you for the following.

> > Could you please add a check to the loop in task_waking_fair() and
> > do a printk() if the loop does (say) more than 1000 passes without
> > exiting?
> 
> Of course. Let me queue that up.

Hmmm...  Given that this is persisting for many many seconds, it might
be better to check for at least 10,000,000 passes.  In contrast, 1000
passes might elapse just waiting for a cache miss to complete.

Other possible causes include:

o	A mismatch between Xen's and RCU's ideas of how CONFIG_NO_HZ
	works.  If Xen thinks that the CPU is in CONFIG_NO_HZ's
	dyntick-idle mode, but RCU thinks otherwise, the grace period
	might stall.

o	Problems due to portions of the code attempting to use
	RCU read-side critical sections while in dyntick-idle mode.
	Frederic Weisbecker has located some of these (though not yet
	in Xen), and he has some diagnostics which may be found at:

	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git

	on branch eqscheck.2011.07.08a.

	You need to enable CONFIG_PROVE_RCU for these diagnostics to
	be executed.

o	As always, there might be bugs in RCU.  ;-)

But the loop in task_waking_fair() looks like the most prominent smoking
gun at the moment.

							Thanx, Paul

Paul E. McKenney

2011-Jul-12 15:07 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 07:49:36AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 12, 2011 at 10:12:28AM -0400, Konrad Rzeszutek Wilk wrote:
> > > >   [<c042d0f5>] task_waking_fair+0x14  <--
> > > 
> > > Hmmm...  This is a 32-bit system, isn't it?
> > 
> > Yes. I ran this little loop:
> > 
> > #!/bin/bash
> > 
> > ID=`xl list | grep Fedora | awk '  { print $2}'`
> > 
> > rm -f cpu*.log
> > while (true) do
> > 	xl pause $ID
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
> > 	xl unpause $ID
> > done

Having just noted the "julie" above and the "julie Sullivan
<kernelmail.jms@gmail.com>" in the email address list, I suddenly suspect
that the idea here might be to use Xen to allow Julie's problem to be
more easily debugged.  If so, kudos and a big thank you!!!

And if so, could you please try out the patch below?  My earlier attempt
delayed RCU callbacks until just before the scheduler initialized itself
(which seems to have fixed the bug that Ravi Kulkarni (CCed) found), but
didn't help Julie.  This patch instead delays RCU callbacks until after
the scheduler has actually completely spawned at least one task.

This patch should apply to any recent v3.0-rc release, but of course please
let me know if it causes trouble.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7e59ffb..ba06207 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -84,9 +84,32 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
 
+/*
+ * The rcu_scheduler_active variable transitions from zero to one just
+ * before the first task is spawned.  So when this variable is zero, RCU
+ * can assume that there is but one task, allowing RCU to (for example)
+ * optimized synchronize_sched() to a simple barrier().  When this variable
+ * is one, RCU must actually do all the hard work required to detect real
+ * grace periods.  This variable is also used to suppress boot-time false
+ * positives from lockdep-RCU error checking.
+ */
 int rcu_scheduler_active __read_mostly;
 EXPORT_SYMBOL_GPL(rcu_scheduler_active);
 
+/*
+ * The rcu_scheduler_fully_active variable transitions from zero to one
+ * during the early_initcall() processing, which is after the scheduler
+ * is capable of creating new tasks.  So RCU processing (for example,
+ * creating tasks for RCU priority boosting) must be delayed until after
+ * rcu_scheduler_fully_active transitions from zero to one.  We also
+ * currently delay invocation of any RCU callbacks until after this point.
+ *
+ * It might later prove better for people registering RCU callbacks during
+ * early boot to take responsibility for these callbacks, but one step at
+ * a time.
+ */
+static int rcu_scheduler_fully_active __read_mostly;
+
 #ifdef CONFIG_RCU_BOOST
 
 /*
@@ -98,7 +121,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DEFINE_PER_CPU(int, rcu_cpu_kthread_cpu);
 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DEFINE_PER_CPU(char, rcu_cpu_has_work);
-static char rcu_kthreads_spawnable;
 
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
@@ -1467,6 +1489,8 @@ static void rcu_process_callbacks(struct softirq_action *unused)
  */
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 {
+	if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
+		return;
 	if (likely(!rsp->boost)) {
 		rcu_do_batch(rsp, rdp);
 		return;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 14dc7dd..75113cb 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1532,7 +1532,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
 	struct sched_param sp;
 	struct task_struct *t;
 
-	if (!rcu_kthreads_spawnable ||
+	if (!rcu_scheduler_fully_active ||
 	    per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
 		return 0;
 	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
@@ -1639,7 +1639,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
 	struct sched_param sp;
 	struct task_struct *t;
 
-	if (!rcu_kthreads_spawnable ||
+	if (!rcu_scheduler_fully_active ||
 	    rnp->qsmaskinit == 0)
 		return 0;
 	if (rnp->node_kthread_task == NULL) {
@@ -1665,7 +1665,7 @@ static int __init rcu_spawn_kthreads(void)
 	int cpu;
 	struct rcu_node *rnp;
 
-	rcu_kthreads_spawnable = 1;
+	rcu_scheduler_fully_active = 1;
 	for_each_possible_cpu(cpu) {
 		per_cpu(rcu_cpu_has_work, cpu) = 0;
 		if (cpu_online(cpu))
@@ -1687,7 +1687,7 @@ static void __cpuinit rcu_prepare_kthreads(int cpu)
 	struct rcu_node *rnp = rdp->mynode;
 
 	/* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
-	if (rcu_kthreads_spawnable) {
+	if (rcu_scheduler_fully_active) {
 		(void)rcu_spawn_one_cpu_kthread(cpu);
 		if (rnp->node_kthread_task == NULL)
 			(void)rcu_spawn_one_node_kthread(rcu_state, rnp);
@@ -1726,6 +1726,13 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
 {
 }
 
+static int __init rcu_scheduler_really_started(void)
+{
+	rcu_scheduler_fully_active = 1;
+	return 0;
+}
+early_initcall(rcu_scheduler_really_started);
+
 static void __cpuinit rcu_prepare_kthreads(int cpu)
 {
 }
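
A toy user-space model of the gating that this patch adds to invoke_rcu_callbacks() may make the intent easier to see. Everything in it is illustrative rather than kernel code: the fixed-size queue and the names `queue_callback()`, `invoke_callbacks()`, `scheduler_really_started()`, and `count_cb()` are hypothetical, and real RCU keeps per-CPU callback lists rather than one array.

```c
#include <stddef.h>

typedef void (*cb_fn)(void *arg);

struct pending_cb {
    cb_fn fn;
    void *arg;
};

#define MAX_PENDING 16

static struct pending_cb pending[MAX_PENDING];
static size_t npending;

/* Models rcu_scheduler_fully_active: zero until the scheduler can run tasks. */
static int scheduler_fully_active;

/* Register a callback; returns 0 on success, -1 if the toy queue is full. */
static int queue_callback(cb_fn fn, void *arg)
{
    if (npending == MAX_PENDING)
        return -1;
    pending[npending].fn = fn;
    pending[npending].arg = arg;
    npending++;
    return 0;
}

/*
 * Run all queued callbacks, but only once the flag is set.  Before that,
 * callbacks simply stay queued.  Returns the number of callbacks invoked.
 */
static size_t invoke_callbacks(void)
{
    size_t i, ran;

    if (!scheduler_fully_active)
        return 0;
    ran = npending;
    for (i = 0; i < ran; i++)
        pending[i].fn(pending[i].arg);
    npending = 0;
    return ran;
}

/* Models the early_initcall() that flips the flag. */
static void scheduler_really_started(void)
{
    scheduler_fully_active = 1;
}

/* Example callback: increments the int that arg points to. */
static void count_cb(void *arg)
{
    ++*(int *)arg;
}
```

A callback queued before `scheduler_really_started()` is left untouched by `invoke_callbacks()` and runs on the first call made after the flag is set, which is the deferral the patch description asks for.
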

Paul E. McKenney

2011-Jul-12 15:15 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 07:49:36AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 12, 2011 at 10:12:28AM -0400, Konrad Rzeszutek Wilk wrote:
> > > >   [<c042d0f5>] task_waking_fair+0x14  <--
> > > 
> > > Hmmm...  This is a 32-bit system, isn't it?
> > 
> > Yes. I ran this little loop:
> > 
> > #!/bin/bash
> > 
> > ID=`xl list | grep Fedora | awk '  { print $2}'`
> > 
> > rm -f cpu*.log
> > while (true) do
> > 	xl pause $ID
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
> > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
> > 	xl unpause $ID
> > done
> > 
> > This was to get an idea of what the CPU is doing before it hits
> > task_waking_fair, and there isn't anything damning. Here are the logs:
> > 
> > http://darnok.org/xen/cpu1.log
> 
> OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> so I still feel good about asking you for the following.

But...  But...  But...

Just how accurate are these stack traces?  For example, do you have
frame pointers enabled?  If not, could you please enable them?

The reason that I ask is that the wakeme_after_rcu() looks like it is
being invoked from softirq, which would be grossly illegal and could
cause any manner of misbehavior.  Did someone put a synchronize_rcu()
into an RCU callback or something?  Or did I do something really really
braindead inside the RCU implementation?

(I am looking into this last question, but would appreciate any and all
help with the other questions!)

							Thanx, Paul

Paul E. McKenney

2011-Jul-12 15:22 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 08:15:50AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 12, 2011 at 07:49:36AM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 12, 2011 at 10:12:28AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > >   [<c042d0f5>] task_waking_fair+0x14  <--
> > > > 
> > > > Hmmm...  This is a 32-bit system, isn't it?
> > > 
> > > Yes. I ran this little loop:
> > > 
> > > #!/bin/bash
> > > 
> > > ID=`xl list | grep Fedora | awk '  { print $2}'`
> > > 
> > > rm -f cpu*.log
> > > while (true) do
> > > 	xl pause $ID
> > > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 0 >> cpu0.log
> > > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 1 >> cpu1.log
> > > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 2 >> cpu2.log
> > > 	/usr/lib64/xen/bin/xenctx -s /mnt/tmp/FC15-32/System.map-3.0.0-rc6-julie-tested-dirty -a $ID 3 >> cpu3.log
> > > 	xl unpause $ID
> > > done
> > > 
> > > This was to get an idea of what the CPU is doing before it hits
> > > task_waking_fair, and there isn't anything damning. Here are the logs:
> > > 
> > > http://darnok.org/xen/cpu1.log
> > 
> > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > so I still feel good about asking you for the following.
> 
> But...  But...  But...
> 
> Just how accurate are these stack traces?  For example, do you have
> frame pointers enabled?  If not, could you please enable them?
> 
> The reason that I ask is that the wakeme_after_rcu() looks like it is
> being invoked from softirq, which would be grossly illegal and could
> cause any manner of misbehavior.  Did someone put a synchronize_rcu()
> into an RCU callback or something?  Or did I do something really really
> braindead inside the RCU implementation?
> 
> (I am looking into this last question, but would appreciate any and all
> help with the other questions!)

OK, I was confusing Julie's, Ravi's, and Konrad's situations.
The wakeme_after_rcu() is in fact OK to call from softirq -- if and
only if the scheduler is actually running.  This is what happens if
you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
callback is posted that, when invoked, awakens the task that invoked
synchronize_rcu().
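
The mechanism described in the paragraph above can be sketched in user space as follows. This is a simplified, single-threaded model with hypothetical names (every `_model` type and function is a stand-in), not the kernel implementation: the "completion" is just a flag, and nothing actually sleeps or wakes.

```c
#include <stddef.h>

/* Stand-in for the kernel's callback header: just a function pointer. */
struct rcu_head_model {
    void (*func)(struct rcu_head_model *head);
};

/* Stand-in for a completion: set to 1 when the waiter may proceed. */
struct completion_model {
    int done;
};

/* What the waiting task would keep on its stack while it waits. */
struct rcu_synchronize_model {
    struct rcu_head_model head;
    struct completion_model completion;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_model(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback: marks the waiter's completion as done. */
static void wakeme_after_rcu_model(struct rcu_head_model *head)
{
    struct rcu_synchronize_model *rcu =
        container_of_model(head, struct rcu_synchronize_model, head);

    rcu->completion.done = 1;   /* stands in for complete() */
}

/* First half of a synchronize_rcu()-style wait: post the callback. */
static void post_wait_callback(struct rcu_synchronize_model *rcu)
{
    rcu->completion.done = 0;
    rcu->head.func = wakeme_after_rcu_model;
}

/* What callback processing does once the grace period has ended. */
static void end_grace_period(struct rcu_head_model *head)
{
    head->func(head);
}
```

In the real kernel the completion step wakes the sleeping task, which is how the call chain in Konrad's CPU3 trace runs from rcu_process_callbacks() through wakeme_after_rcu() and complete() down to try_to_wake_up() and task_waking_fair().
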

And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
appears to be well past the point where the scheduler is initialized.

So I am coming back around to the loop in task_waking_fair().

Though the patch I sent out earlier might help, for example, if early
invocation of RCU callbacks is somehow messing up the scheduler's
initialization.

							Thanx, Paul

Konrad Rzeszutek Wilk

2011-Jul-12 16:03 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

> > http://darnok.org/xen/cpu1.log
> 
> OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> so I still feel good about asking you for the following.
.. snip..
> Hmmm...  Given that this is persisting for many many seconds, it might
> be better to check for at least 10,000,000 passes.  In contrast, 1000
> passes might elapse just waiting for a cache miss to complete.

Changed it to that large number. This is the diff I used:

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 433491c..e185c04 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1392,14 +1392,19 @@ static void task_waking_fair(struct task_struct *p)
 	struct sched_entity *se = &p->se;
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 	u64 min_vruntime;
+	u64 loop_cnt = 0UL;
 
 #ifndef CONFIG_64BIT
 	u64 min_vruntime_copy;
-
+	loop_cnt = 0UL;
 	do {
 		min_vruntime_copy = cfs_rq->min_vruntime_copy;
 		smp_rmb();
 		min_vruntime = cfs_rq->min_vruntime;
+		if (loop_cnt++ > 10000000) {
+			printk(KERN_INFO "POKE!\n");
+			loop_cnt = 0UL;
+		}
 	} while (min_vruntime != min_vruntime_copy);
 #else
 	min_vruntime = cfs_rq->min_vruntime;

And the log is:
http://darnok.org/xen/loop_cnt.log

which seems to imply that we are indeed stuck in that loop
forever.
> 
> Other possible causes include:

What is really strange is that I can only reproduce this on 32-bit
builds.

> o	A mismatch between Xen's and RCU's ideas of how CONFIG_NO_HZ
> 	works.  If Xen thinks that the CPU is in CONFIG_NO_HZ's
> 	dyntick-idle mode, but RCU thinks otherwise, the grace period
> 	might stall.

One sure way to figure this out is to disable CONFIG_NO_HZ, right?
Or will that take away the task_waking_fair case as well?

> o	Problems due to portions of the code attempting to use
> 	RCU read-side critical sections while in dyntick-idle mode.
> 	Frederic Weisbecker has located some of these, (though not yet
> 	in Xen) and he has some diagnostics which may be found at:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
> 
> 	on branch eqscheck.2011.07.08a.
> 
> 	You need to enable CONFIG_PROVE_RCU for these diagnostics to
> 	be executed.

Ok, let me try those too.

> o	As always, there might be bugs in RCU.  ;-)
> 
> But the loop in task_waking_fair() looks like the most prominent smoking
> gun at the moment.

Konrad Rzeszutek Wilk

2011-Jul-12 16:32 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3 - under Xen, 32-bit guest only.

> > > > http://darnok.org/xen/cpu1.log
> > > 
> > > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > > so I still feel good about asking you for the following.
> > 
> > But...  But...  But...
> > 
> > Just how accurate are these stack traces?  For example, do you have
> > frame pointers enabled?  If not, could you please enable them?

Frame pointers are enabled.

> > The reason that I ask is that the wakeme_after_rcu() looks like it is
> > being invoked from softirq, which would be grossly illegal and could
> > cause any manner of misbehavior.  Did someone put a synchronize_rcu()
> > into an RCU callback or something?  Or did I do something really really

This is a 3.0-rc6 based kernel with the debug patch, the initial
RCU inhibit patch (where you disable the RCU checking during bootup) and
that is it.

What is bizarre is that the softirq shows up but there is no corresponding
Xen event channel stack trace - there should also have been xen_evtchn_upcall
(which is the general code that calls the main IRQ handler, which would make
the softirq call). This is assuming that the IRQ (timer one) is regularly
dispatching (which it looks to be doing). Somehow getting just the softirq
by itself is bizarre.

Perhaps an IPI has been sent that does this. Let me see what a stack
trace for an IPI looks like.

> > braindead inside the RCU implementation?
> > 
> > (I am looking into this last question, but would appreciate any and all
> > help with the other questions!)
> 
> OK, I was confusing Julie's, Ravi's, and Konrad's situations.

Do you want me to create a new email thread to keep this one separate?

> The wakeme_after_rcu() is in fact OK to call from softirq -- if and
> only if the scheduler is actually running.  This is what happens if
> you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
> callback is posted that, when invoked, awakens the task that invoked
> synchronize_rcu().
> 
> And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
> appears to be well past the point where the scheduler is initialized.
> 
> So I am coming back around to the loop in task_waking_fair().
> 
> Though the patch I sent out earlier might help, for example, if early
> invocation of RCU callbacks is somehow messing up the scheduler's
> initialization.

Ok, let me try it out.

Paul E. McKenney

2011-Jul-12 16:39 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 12:03:24PM -0400, Konrad Rzeszutek Wilk wrote:
> > > http://darnok.org/xen/cpu1.log
> > 
> > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > so I still feel good about asking you for the following.
> .. snip..
> > Hmmm...  Given that this is persisting for many many seconds, it might
> > be better to check for at least 10,000,000 passes.  In contrast, 1000
> > passes might elapse just waiting for a cache miss to complete.
> 
> Changed it to that large number. This is the diff I used:
> 
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 433491c..e185c04 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1392,14 +1392,19 @@ static void task_waking_fair(struct task_struct *p)
>  	struct sched_entity *se = &p->se;
>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
>  	u64 min_vruntime;
> +	u64 loop_cnt = 0UL;
> 
>  #ifndef CONFIG_64BIT
>  	u64 min_vruntime_copy;
> -
> +	loop_cnt = 0UL;
>  	do {
>  		min_vruntime_copy = cfs_rq->min_vruntime_copy;
>  		smp_rmb();
>  		min_vruntime = cfs_rq->min_vruntime;
> +		if (loop_cnt++ > 10000000) {
> +			printk(KERN_INFO "POKE!\n");
> +			loop_cnt = 0UL;
> +		}
>  	} while (min_vruntime != min_vruntime_copy);
>  #else
>  	min_vruntime = cfs_rq->min_vruntime;
> 
> And the log is:
> http://darnok.org/xen/loop_cnt.log
> 
> which seems to imply that we are indeed stuck in that loop
> forever.

It does indeed, thank you!  Also it looks like interrupts are
disabled, and that timekeeping is similarly out of action.

> > Other possible causes include:
> 
> What is really strange is that I can only reproduce this on 32-bit builds.

Not strange at all.  If you have a 64-bit build, the function doesn't
have a loop.  ;-)

> > o	A mismatch between Xen's and RCU's ideas of how CONFIG_NO_HZ
> > 	works.  If Xen thinks that the CPU is in CONFIG_NO_HZ's
> > 	dyntick-idle mode, but RCU thinks otherwise, the grace period
> > 	might stall.
> 
> One sure way to figure this out is to disable CONFIG_NO_HZ, right?
> Or will that take away the task_waking_fair case as well?

Disabling CONFIG_NO_HZ would be an interesting test case.

> > o	Problems due to portions of the code attempting to use
> > 	RCU read-side critical sections while in dyntick-idle mode.
> > 	Frederic Weisbecker has located some of these, (though not yet
> > 	in Xen) and he has some diagnostics which may be found at:
> > 
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
> > 
> > 	on branch eqscheck.2011.07.08a.
> > 
> > 	You need to enable CONFIG_PROVE_RCU for these diagnostics to
> > 	be executed.
> 
> Ok, let me try those too.

Thank you!

> > o	As always, there might be bugs in RCU.  ;-)
> > 
> > But the loop in task_waking_fair() looks like the most prominent smoking
> > gun at the moment.

And could you also please try out the patch that I posted earlier?

							Thanx, Paul
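
Editorial sketch (not from the original message): why only 32-bit builds can
hang here. On a 32-bit CPU a 64-bit load may be performed as two halves, so
the loop quoted above re-reads min_vruntime until it matches
min_vruntime_copy. If the two fields never agree, the reader never leaves the
loop. The user-space C below is a hedged model rather than kernel code: the
demo_* names are hypothetical, C11 fences stand in for smp_wmb() and
smp_rmb(), the writer shown is an assumption about the update side (which
this thread does not quote), and a retry limit replaces the endless spin so
that the failure can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two cfs_rq fields the loop touches. */
struct demo_cfs_rq {
	uint64_t min_vruntime;
	uint64_t min_vruntime_copy;
};

/* Assumed writer side: publish the value first, then the copy. */
static void demo_update_min_vruntime(struct demo_cfs_rq *rq, uint64_t v)
{
	rq->min_vruntime = v;
	atomic_thread_fence(memory_order_release);	/* stands in for smp_wmb() */
	rq->min_vruntime_copy = v;
}

/*
 * Reader with the same shape as the 32-bit loop in task_waking_fair(),
 * except that it gives up after max_tries passes instead of spinning
 * forever.  Returns true and stores the value once both fields agree.
 */
static bool demo_read_min_vruntime(const struct demo_cfs_rq *rq,
				   uint64_t *out, unsigned long max_tries)
{
	uint64_t val, copy;

	assert(out != NULL);
	while (max_tries--) {
		copy = rq->min_vruntime_copy;
		atomic_thread_fence(memory_order_acquire);	/* stands in for smp_rmb() */
		val = rq->min_vruntime;
		if (val == copy) {
			*out = val;
			return true;
		}
	}
	return false;	/* the two fields never agreed */
}
```

If min_vruntime holds some nonzero value while the copy is still zero, the
reader exhausts its retries no matter how large the limit is, which matches
the endless "POKE!" output. Once the copy is set to the same value (the
one-line change under #ifndef CONFIG_64BIT that Julie quotes later in this
thread does exactly that assignment), the first pass succeeds.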

Paul E. McKenney

2011-Jul-12 16:46 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3 - under Xen, 32-bit guest only.

On Tue, Jul 12, 2011 at 12:32:10PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > http://darnok.org/xen/cpu1.log
> > > > 
> > > > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > > > so I still feel good about asking you for the following.
> > > 
> > > But...  But...  But...
> > > 
> > > Just how accurate are these stack traces?  For example, do you have
> > > frame pointers enabled?  If not, could you please enable them?
> 
> Frame pointers are enabled.
> > > 
> > > The reason that I ask is that the wakeme_after_rcu() looks like it is
> > > being invoked from softirq, which would be grossly illegal and could
> > > cause any manner of misbehavior.  Did someone put a synchronize_rcu()
> > > into an RCU callback or something?  Or did I do something really really
> 
> This is a 3.0-rc6 based kernel with the debug patch, the initial
> RCU inhibit patch (where you disable the RCU checking during bootup) and
> that is it.
> 
> What is bizarre is that the softirq shows up but there is no corresponding
> Xen event channel stack trace - there should also have been xen_evtchn_upcall
> (which is the general code that calls the main IRQ handler, which would make
> the softirq call). This is assuming that the IRQ (timer one) is regularly
> dispatching (which it looks to be doing). Somehow getting just the softirq
> by itself is bizarre.
> 
> Perhaps an IPI has been sent that does this. Let me see what a stack
> trace for an IPI looks like.

Thank you for the info!

> > > braindead inside the RCU implementation?
> > > 
> > > (I am looking into this last question, but would appreciate any and all
> > > help with the other questions!)
> > 
> > OK, I was confusing Julie's, Ravi's, and Konrad's situations.
> 
> Do you want me to create a new email thread to keep this one separate?

Let's please keep everyone on copy.  I bet that these problems are
related.  Plus once we get something that works, it would be good if
everyone could test it.

> > The wakeme_after_rcu() is in fact OK to call from softirq -- if and
> > only if the scheduler is actually running.  This is what happens if
> > you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
> > callback is posted that, when invoked, awakens the task that invoked
> > synchronize_rcu().
> > 
> > And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
> > appears to be well past the point where the scheduler is initialized.
> > 
> > So I am coming back around to the loop in task_waking_fair().
> > 
> > Though the patch I sent out earlier might help, for example, if early
> > invocation of RCU callbacks is somehow messing up the scheduler's
> > initialization.
> 
> Ok, let me try it out.

Thank you again!

							Thanx, Paul

Konrad Rzeszutek Wilk

2011-Jul-12 18:01 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

> > http://darnok.org/xen/loop_cnt.log
> > 
> > which seems to imply that we are indeed stuck in that loop
> > forever.
> 
> It does indeed, thank you!  Also it looks like interrupts are
> disabled, and that timekeeping is similarly out of action.

.. With the latest patch the time looks to be advancing.

> Disabling CONFIG_NO_HZ would be an interesting test case.

Hadn't done that yet. Compiling a kernel with "# CONFIG_NO_HZ is not set"
right now.

> > > o	Problems due to portions of the code attempting to use
> > > 	RCU read-side critical sections while in dyntick-idle mode.
> > > 	Frederic Weisbecker has located some of these, (though not yet
> > > 	in Xen) and he has some diagnostics which may be found at:
> > > 
> > > 	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
> > > 
> > > 	on branch eqscheck.2011.07.08a.
> > > 
> > > 	You need to enable CONFIG_PROVE_RCU for these diagnostics to
> > > 	be executed.
> > 
> > Ok, let me try those too.
> 
> Thank you!

Will shortly do this.

> > > o	As always, there might be bugs in RCU.  ;-)
> > > 
> > > But the loop in task_waking_fair() looks like the most prominent smoking
> > > gun at the moment.
> 
> And could you also please try out the patch that I posted earlier?

With the previous patch and the .. this is getting confusing. With this patch:
http://darnok.org/xen/loop_cnt-extra.patch

I get this output: http://darnok.org/xen/log.loop_cnt-extra-patch (one guest
with 4 VCPUs) and http://darnok.org/xen/loop_cnt-extra-patch.log (the guest with
16 VCPUs)

Paul E. McKenney

2011-Jul-12 18:59 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 02:01:51PM -0400, Konrad Rzeszutek Wilk wrote:
> > > http://darnok.org/xen/loop_cnt.log
> > > 
> > > which seems to imply that we are indeed stuck in that loop
> > > forever.
> > 
> > It does indeed, thank you!  Also it looks like interrupts are
> > disabled, and that timekeeping is similarly out of action.
> 
> .. With the latest patch the time looks to be advancing.

Sounds like an improvement.  ;-)

> > Disabling CONFIG_NO_HZ would be an interesting test case.
> 
> Hadn't done that yet. Compiling a kernel with "# CONFIG_NO_HZ is not set"
> right now.
> > 
> > > > o	Problems due to portions of the code attempting to use
> > > > 	RCU read-side critical sections while in dyntick-idle mode.
> > > > 	Frederic Weisbecker has located some of these, (though not yet
> > > > 	in Xen) and he has some diagnostics which may be found at:
> > > > 
> > > > 	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
> > > > 
> > > > 	on branch eqscheck.2011.07.08a.
> > > > 
> > > > 	You need to enable CONFIG_PROVE_RCU for these diagnostics to
> > > > 	be executed.
> > > 
> > > Ok, let me try those too.
> > 
> > Thank you!
> 
> Will shortly do this.
> > 
> > > > o	As always, there might be bugs in RCU.  ;-)
> > > > 
> > > > But the loop in task_waking_fair() looks like the most prominent smoking
> > > > gun at the moment.
> > 
> > And could you also please try out the patch that I posted earlier?
> 
> With the previous patch and the .. this is getting confusing. With this patch:
> http://darnok.org/xen/loop_cnt-extra.patch

That is indeed the patch I intended.

> I get this output: http://darnok.org/xen/log.loop_cnt-extra-patch (one guest
> with 4 VCPUs) and http://darnok.org/xen/loop_cnt-extra-patch.log (the guest
> with 16 VCPUs)

OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
are deferred until after the scheduler is fully initialized.  Sounds like
one for the scheduler guys.  ;-)

							Thanx, Paul

Konrad Rzeszutek Wilk

2011-Jul-12 19:07 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

> > > Disabling CONFIG_NO_HZ would be an interesting test case.
> > 
> > Hadn't done that yet. Compiling a kernel with "# CONFIG_NO_HZ is not set"
> > right now.

Log: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled.log
config: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled+.config
Patch: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled.patch

> > > > > But the loop in task_waking_fair() looks like the most prominent smoking
> > > > > gun at the moment.
> > > 
> > > And could you also please try out the patch that I posted earlier?
> > 
> > With the previous patch and the .. this is getting confusing. With this patch:
> > http://darnok.org/xen/loop_cnt-extra.patch
> 
> That is indeed the patch I intended.

<nods>

> > I get this output: http://darnok.org/xen/log.loop_cnt-extra-patch (one guest
> > with 4 VCPUs) and http://darnok.org/xen/loop_cnt-extra-patch.log (the guest
> > with 16 VCPUs)
> 
> OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> are deferred until after the scheduler is fully initialized.  Sounds like
> one for the scheduler guys.  ;-)

Yikes. Well, in the meantime let me check the IPI part and see if there is
something busted that could trigger softirq to be invoked directly.

And also compile the kernel with CONFIG_PROVE_RCU and the extra
git tree you pointed me to.

> 							Thanx, Paul

Peter Zijlstra

2011-Jul-12 19:10 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
> OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> are deferred until after the scheduler is fully initialized.  Sounds like
> one for the scheduler guys.  ;-) 

https://lkml.org/lkml/2011/7/12/150

?

Konrad Rzeszutek Wilk

2011-Jul-12 19:57 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 09:10:36PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
> > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> > are deferred until after the scheduler is fully initialized.  Sounds like
> > one for the scheduler guys.  ;-) 
> 
> https://lkml.org/lkml/2011/7/12/150

Such a simple patch. And yes, it fixes the issue. You can add
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> if it hasn't yet
shown up in Ingo's tree.

Paul, thanks for help on this and providing ideas to test!

Paul E. McKenney

2011-Jul-12 20:05 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 09:10:36PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
> > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> > are deferred until after the scheduler is fully initialized.  Sounds like
> > one for the scheduler guys.  ;-) 
> 
> https://lkml.org/lkml/2011/7/12/150
> 
> ?

Looks like that would do the trick, thank you Peter!

Konrad, Julie, Ravi, could you please try out Peter's patch?

							Thanx, Paul

Paul E. McKenney

2011-Jul-12 20:46 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 03:57:32PM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 12, 2011 at 09:10:36PM +0200, Peter Zijlstra wrote:
> > On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
> > > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> > > are deferred until after the scheduler is fully initialized.  Sounds like
> > > one for the scheduler guys.  ;-) 
> > 
> > https://lkml.org/lkml/2011/7/12/150
> 
> Such a simple patch. And yes, it fixes the issue. You can add
> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> if it hasn't yet
> shown up in Ingo's tree.
> 
> Paul, thanks for help on this and providing ideas to test!

Konrad, thank you for all the testing!

Julie, if you apply Peter's patch, do you also need the patch shown
below?

Ravi, could you please retest with the patch below as well?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7e59ffb..ba06207 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -84,9 +84,32 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
 
 static struct rcu_state *rcu_state;
 
+/*
+ * The rcu_scheduler_active variable transitions from zero to one just
+ * before the first task is spawned.  So when this variable is zero, RCU
+ * can assume that there is but one task, allowing RCU to (for example)
+ * optimized synchronize_sched() to a simple barrier().  When this variable
+ * is one, RCU must actually do all the hard work required to detect real
+ * grace periods.  This variable is also used to suppress boot-time false
+ * positives from lockdep-RCU error checking.
+ */
 int rcu_scheduler_active __read_mostly;
 EXPORT_SYMBOL_GPL(rcu_scheduler_active);
 
+/*
+ * The rcu_scheduler_fully_active variable transitions from zero to one
+ * during the early_initcall() processing, which is after the scheduler
+ * is capable of creating new tasks.  So RCU processing (for example,
+ * creating tasks for RCU priority boosting) must be delayed until after
+ * rcu_scheduler_fully_active transitions from zero to one.  We also
+ * currently delay invocation of any RCU callbacks until after this point.
+ *
+ * It might later prove better for people registering RCU callbacks during
+ * early boot to take responsibility for these callbacks, but one step at
+ * a time.
+ */
+static int rcu_scheduler_fully_active __read_mostly;
+
 #ifdef CONFIG_RCU_BOOST
 
 /*
@@ -98,7 +121,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DEFINE_PER_CPU(int, rcu_cpu_kthread_cpu);
 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DEFINE_PER_CPU(char, rcu_cpu_has_work);
-static char rcu_kthreads_spawnable;
 
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
@@ -1467,6 +1489,8 @@ static void rcu_process_callbacks(struct softirq_action *unused)
  */
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 {
+	if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
+		return;
 	if (likely(!rsp->boost)) {
 		rcu_do_batch(rsp, rdp);
 		return;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 14dc7dd..75113cb 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1532,7 +1532,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
 	struct sched_param sp;
 	struct task_struct *t;
 
-	if (!rcu_kthreads_spawnable ||
+	if (!rcu_scheduler_fully_active ||
 	    per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
 		return 0;
 	t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
@@ -1639,7 +1639,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
 	struct sched_param sp;
 	struct task_struct *t;
 
-	if (!rcu_kthreads_spawnable ||
+	if (!rcu_scheduler_fully_active ||
 	    rnp->qsmaskinit == 0)
 		return 0;
 	if (rnp->node_kthread_task == NULL) {
@@ -1665,7 +1665,7 @@ static int __init rcu_spawn_kthreads(void)
 	int cpu;
 	struct rcu_node *rnp;
 
-	rcu_kthreads_spawnable = 1;
+	rcu_scheduler_fully_active = 1;
 	for_each_possible_cpu(cpu) {
 		per_cpu(rcu_cpu_has_work, cpu) = 0;
 		if (cpu_online(cpu))
@@ -1687,7 +1687,7 @@ static void __cpuinit rcu_prepare_kthreads(int cpu)
 	struct rcu_node *rnp = rdp->mynode;
 
 	/* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
-	if (rcu_kthreads_spawnable) {
+	if (rcu_scheduler_fully_active) {
 		(void)rcu_spawn_one_cpu_kthread(cpu);
 		if (rnp->node_kthread_task == NULL)
 			(void)rcu_spawn_one_node_kthread(rcu_state, rnp);
@@ -1726,6 +1726,13 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
 {
 }
 
+static int __init rcu_scheduler_really_started(void)
+{
+	rcu_scheduler_fully_active = 1;
+	return 0;
+}
+early_initcall(rcu_scheduler_really_started);
+
 static void __cpuinit rcu_prepare_kthreads(int cpu)
 {
 }
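
Editorial sketch (not part of the patch): the core idea of the diff above is
a gate. Callbacks may be queued at any time, but invoke_rcu_callbacks()
returns early until an early_initcall() flips rcu_scheduler_fully_active.
The following is a user-space model only, with hypothetical demo_* names, a
plain int in place of the __read_mostly flag, and no concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback record; the kernel uses struct rcu_head. */
struct demo_cb {
	struct demo_cb *next;
	void (*func)(struct demo_cb *cb);
};

static int demo_scheduler_fully_active;	/* mirrors rcu_scheduler_fully_active */
static struct demo_cb *demo_head;		/* FIFO of pending callbacks */
static struct demo_cb **demo_tail = &demo_head;
static int demo_ran;				/* how many callbacks have run */

/* Example callback: just count invocations. */
static void demo_count(struct demo_cb *cb)
{
	(void)cb;
	demo_ran++;
}

/* Queueing is always allowed, even during early boot. */
static void demo_queue(struct demo_cb *cb, void (*func)(struct demo_cb *cb))
{
	assert(func != NULL);
	cb->func = func;
	cb->next = NULL;
	*demo_tail = cb;
	demo_tail = &cb->next;
}

/* Invocation is gated: before the flag is set, nothing is run or dropped. */
static int demo_invoke_callbacks(void)
{
	int n = 0;

	if (!demo_scheduler_fully_active)
		return 0;	/* too early: leave everything queued */
	while (demo_head) {
		struct demo_cb *cb = demo_head;

		demo_head = cb->next;
		cb->func(cb);
		n++;
	}
	demo_tail = &demo_head;
	return n;
}

/* Stands in for the early_initcall() added by the patch. */
static void demo_scheduler_really_started(void)
{
	demo_scheduler_fully_active = 1;
}
```

Callbacks queued before the flag flips are not lost; they simply wait for
the first invocation pass after demo_scheduler_really_started() has run.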

Paul E. McKenney

2011-Jul-12 20:52 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 03:07:56PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > Disabling CONFIG_NO_HZ would be an interesting test case.
> > > 
> > > Hadn't done that yet. Compiling a kernel with "# CONFIG_NO_HZ is not set"
> > > right now.
> 
> Log: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled.log
> config: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled+.config
> Patch: http://darnok.org/xen/loop_cnt-extra-patch-no-hz-disabled.patch

OK, thank you for trying this out.  No joy, but to be expected given
Peter's later email.

							Thanx, Paul

> > > > > > But the loop in task_waking_fair() looks like the most prominent smoking
> > > > > > gun at the moment.
> > > > 
> > > > And could you also please try out the patch that I posted earlier?
> > > 
> > > With the previous patch and the .. this is getting confusing. With this patch:
> > > http://darnok.org/xen/loop_cnt-extra.patch
> > 
> > That is indeed the patch I intended.
> 
> <nods>
> > 
> > > I get this output: http://darnok.org/xen/log.loop_cnt-extra-patch (one guest
> > > with 4 VCPUs) and http://darnok.org/xen/loop_cnt-extra-patch.log (the guest
> > > with 16 VCPUs)
> > 
> > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> > are deferred until after the scheduler is fully initialized.  Sounds like
> > one for the scheduler guys.  ;-)
> 
> Yikes. Well, in the meantime let me check the IPI part and see if there is
> something busted that could trigger softirq to be invoked directly.
> 
> And also compile the kernel with CONFIG_PROVE_RCU and the extra
> git tree you pointed me to.
> > 
> > 							Thanx, Paul

Julie Sullivan

2011-Jul-12 21:04 UTC

[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 9:46 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jul 12, 2011 at 03:57:32PM -0400, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jul 12, 2011 at 09:10:36PM +0200, Peter Zijlstra wrote:
>> > On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
>> > > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
>> > > are deferred until after the scheduler is fully initialized.  Sounds like
>> > > one for the scheduler guys.  ;-)
>> >
>> > https://lkml.org/lkml/2011/7/12/150
>>
>> Such a simple patch. And yes, it fixes the issue. You can add
>> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> if it hasn't yet
>> shown up in Ingo's tree.
>>
>> Paul, thanks for help on this and providing ideas to test!
>
> Konrad, thank you for all the testing!
>
> Julie, if you apply Peter's patch,

But this is for 32-bit, right?

> +#ifndef CONFIG_64BIT
> +	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
> +#endif
 }

I'm using 64-bit...

Would you still like me to try the below patch?

Cheers
Julie

> do you also need the patch shown
> below?
>
> Ravi, could you please retest with the patch below as well?
>
>                                                        Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 7e59ffb..ba06207 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -84,9 +84,32 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>
>  static struct rcu_state *rcu_state;
>
> +/*
> + * The rcu_scheduler_active variable transitions from zero to one just
> + * before the first task is spawned.  So when this variable is zero, RCU
> + * can assume that there is but one task, allowing RCU to (for example)
> + * optimized synchronize_sched() to a simple barrier().  When this variable
> + * is one, RCU must actually do all the hard work required to detect real
> + * grace periods.  This variable is also used to suppress boot-time false
> + * positives from lockdep-RCU error checking.
> + */
>  int rcu_scheduler_active __read_mostly;
>  EXPORT_SYMBOL_GPL(rcu_scheduler_active);
>
> +/*
> + * The rcu_scheduler_fully_active variable transitions from zero to one
> + * during the early_initcall() processing, which is after the scheduler
> + * is capable of creating new tasks.  So RCU processing (for example,
> + * creating tasks for RCU priority boosting) must be delayed until after
> + * rcu_scheduler_fully_active transitions from zero to one.  We also
> + * currently delay invocation of any RCU callbacks until after this point.
> + *
> + * It might later prove better for people registering RCU callbacks during
> + * early boot to take responsibility for these callbacks, but one step at
> + * a time.
> + */
> +static int rcu_scheduler_fully_active __read_mostly;
> +
>  #ifdef CONFIG_RCU_BOOST
>
>  /*
> @@ -98,7 +121,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
>  DEFINE_PER_CPU(int, rcu_cpu_kthread_cpu);
>  DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
>  DEFINE_PER_CPU(char, rcu_cpu_has_work);
> -static char rcu_kthreads_spawnable;
>
>  #endif /* #ifdef CONFIG_RCU_BOOST */
>
> @@ -1467,6 +1489,8 @@ static void rcu_process_callbacks(struct softirq_action *unused)
>  */
>  static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
>  {
> +       if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
> +               return;
>        if (likely(!rsp->boost)) {
>                rcu_do_batch(rsp, rdp);
>                return;
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 14dc7dd..75113cb 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1532,7 +1532,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
>        struct sched_param sp;
>        struct task_struct *t;
>
> -       if (!rcu_kthreads_spawnable ||
> +       if (!rcu_scheduler_fully_active ||
>            per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
>                return 0;
>        t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
> @@ -1639,7 +1639,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
>        struct sched_param sp;
>        struct task_struct *t;
>
> -       if (!rcu_kthreads_spawnable ||
> +       if (!rcu_scheduler_fully_active ||
>            rnp->qsmaskinit == 0)
>                return 0;
>        if (rnp->node_kthread_task == NULL) {
> @@ -1665,7 +1665,7 @@ static int __init rcu_spawn_kthreads(void)
>        int cpu;
>        struct rcu_node *rnp;
>
> -       rcu_kthreads_spawnable = 1;
> +       rcu_scheduler_fully_active = 1;
>        for_each_possible_cpu(cpu) {
>                per_cpu(rcu_cpu_has_work, cpu) = 0;
>                if (cpu_online(cpu))
> @@ -1687,7 +1687,7 @@ static void __cpuinit rcu_prepare_kthreads(int cpu)
>        struct rcu_node *rnp = rdp->mynode;
>
>        /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
> -       if (rcu_kthreads_spawnable) {
> +       if (rcu_scheduler_fully_active) {
>                (void)rcu_spawn_one_cpu_kthread(cpu);
>                if (rnp->node_kthread_task == NULL)
>                        (void)rcu_spawn_one_node_kthread(rcu_state, rnp);
> @@ -1726,6 +1726,13 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
>  {
>  }
>
> +static int __init rcu_scheduler_really_started(void)
> +{
> +       rcu_scheduler_fully_active = 1;
> +       return 0;
> +}
> +early_initcall(rcu_scheduler_really_started);
> +
>  static void __cpuinit rcu_prepare_kthreads(int cpu)
>  {
>  }
>

Paul E. McKenney

2011-Jul-12 21:07 UTC


[Xen-devel] Re: PROBLEM: 3.0-rc kernels unbootable since -rc3

On Tue, Jul 12, 2011 at 10:04:48PM +0100, Julie Sullivan wrote:
> On Tue, Jul 12, 2011 at 9:46 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Jul 12, 2011 at 03:57:32PM -0400, Konrad Rzeszutek Wilk wrote:
> >> On Tue, Jul 12, 2011 at 09:10:36PM +0200, Peter Zijlstra wrote:
> >> > On Tue, 2011-07-12 at 11:59 -0700, Paul E. McKenney wrote:
> >> > > OK, so the infinite loop in task_waking_fair() happens even if RCU callbacks
> >> > > are deferred until after the scheduler is fully initialized.  Sounds like
> >> > > one for the scheduler guys.  ;-)
> >> >
> >> > https://lkml.org/lkml/2011/7/12/150
> >>
> >> Such a simple patch. And yes, it fixes the issue. You can add
> >> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> if it hasn't yet
> >> showed up in Ingo's tree.
> >>
> >> Paul, thanks for help on this and providing ideas to test!
> >
> > Konrad, thank you for all the testing!
> >
> > > Julie, if you apply Peter's patch,
> 
> But this is for 32-bit, right?

Indeed it is, please accept my apologies for my confusion.

> > +#ifndef CONFIG_64BIT
> > +	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
> > +#endif
>  }
> 
> I'm using 64-bit...
> 
> Would you still like me to try the below patch?

Could you please?  It restores the exact behavior of the patch that
worked for you, but in a form that can go upstream.

So I am very much hoping that it works for you.

							Thanx, Paul
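
A note on the `#ifndef CONFIG_64BIT` fragment quoted above: a 32-bit CPU cannot load a 64-bit value in a single access, so one common pattern is for the writer to store the value and then a copy, and for the reader to retry until the two agree. The loop in task_waking_fair() appears to be of this kind, which would explain why a `min_vruntime_copy` that was never initialised makes it spin forever; treat that as an assumption rather than a quotation from the kernel source. The sketch below is a single-threaded user-space model only. The struct, the function names, the start value, and the bounded retry count are all hypothetical, and the memory barriers a real kernel would need are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two cfs_rq fields named in the fragment. */
struct cfs_rq_model {
	uint64_t min_vruntime;
	uint64_t min_vruntime_copy;
};

/* Arbitrary start value for this sketch; it needs more than 32 bits. */
#define MODEL_START (UINT64_C(1) << 40)

/*
 * Writer side: store the value, then its copy.  A real 32-bit kernel would
 * need a write barrier between the two stores; this model omits it.
 */
static void model_update(struct cfs_rq_model *rq, uint64_t v)
{
	rq->min_vruntime = v;
	rq->min_vruntime_copy = v;
}

/*
 * Reader side: retry until value and copy agree.  The retry count is
 * bounded (an assumption of this sketch) so that a pair that never matches
 * makes the function return 0 instead of spinning forever.
 */
static int model_read(const struct cfs_rq_model *rq, unsigned int max_tries,
		      uint64_t *out)
{
	unsigned int i;

	assert(out != NULL);
	for (i = 0; i < max_tries; i++) {
		uint64_t copy = rq->min_vruntime_copy;
		uint64_t val = rq->min_vruntime;

		if (val == copy) {
			*out = val;
			return 1;
		}
	}
	return 0;
}

/* Initialisation without the fix: only min_vruntime is set. */
static void model_init_unfixed(struct cfs_rq_model *rq)
{
	rq->min_vruntime = MODEL_START;
	rq->min_vruntime_copy = 0;
}

/* Initialisation with the fix: the copy is set to match. */
static void model_init_fixed(struct cfs_rq_model *rq)
{
	model_init_unfixed(rq);
	rq->min_vruntime_copy = rq->min_vruntime;
}
```

In this model, model_read() after model_init_unfixed() gives up on every attempt until something calls model_update(), while after model_init_fixed() it succeeds on the first try. On 64-bit builds the copy is not needed at all, which is consistent with the fragment being guarded by `#ifndef CONFIG_64BIT`.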
> Cheers
> Julie
> 
> 
> > do you also need the patch shown
> > below?
> >
> > Ravi, could you please retest with the patch below as well?
> >
> >                                                        Thanx, Paul
> >
> >
------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 7e59ffb..ba06207 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -84,9 +84,32 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
> >
> >  static struct rcu_state *rcu_state;
> >
> > +/*
> > > + * The rcu_scheduler_active variable transitions from zero to one just
> > > + * before the first task is spawned.  So when this variable is zero, RCU
> > > + * can assume that there is but one task, allowing RCU to (for example)
> > > + * optimized synchronize_sched() to a simple barrier().  When this variable
> > > + * is one, RCU must actually do all the hard work required to detect real
> > > + * grace periods.  This variable is also used to suppress boot-time false
> > > + * positives from lockdep-RCU error checking.
> > + */
> >  int rcu_scheduler_active __read_mostly;
> >  EXPORT_SYMBOL_GPL(rcu_scheduler_active);
> >
> > +/*
> > > + * The rcu_scheduler_fully_active variable transitions from zero to one
> > > + * during the early_initcall() processing, which is after the scheduler
> > > + * is capable of creating new tasks.  So RCU processing (for example,
> > > + * creating tasks for RCU priority boosting) must be delayed until after
> > > + * rcu_scheduler_fully_active transitions from zero to one.  We also
> > > + * currently delay invocation of any RCU callbacks until after this point.
> > > + *
> > > + * It might later prove better for people registering RCU callbacks during
> > > + * early boot to take responsibility for these callbacks, but one step at
> > > + * a time.
> > + */
> > +static int rcu_scheduler_fully_active __read_mostly;
> > +
> >  #ifdef CONFIG_RCU_BOOST
> >
> >  /*
> > @@ -98,7 +121,6 @@ DEFINE_PER_CPU(unsigned int,
rcu_cpu_kthread_status);
> >  DEFINE_PER_CPU(int, rcu_cpu_kthread_cpu);
> >  DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
> >  DEFINE_PER_CPU(char, rcu_cpu_has_work);
> > -static char rcu_kthreads_spawnable;
> >
> >  #endif /* #ifdef CONFIG_RCU_BOOST */
> >
> > @@ -1467,6 +1489,8 @@ static void rcu_process_callbacks(struct
softirq_action *unused)
> >  */
> >  static void invoke_rcu_callbacks(struct rcu_state *rsp, struct
rcu_data *rdp)
> >  {
> > +       if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
> > +               return;
> >        if (likely(!rsp->boost)) {
> >                rcu_do_batch(rsp, rdp);
> >                return;
> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > index 14dc7dd..75113cb 100644
> > --- a/kernel/rcutree_plugin.h
> > +++ b/kernel/rcutree_plugin.h
> > @@ -1532,7 +1532,7 @@ static int __cpuinit
rcu_spawn_one_cpu_kthread(int cpu)
> >        struct sched_param sp;
> >        struct task_struct *t;
> >
> > -       if (!rcu_kthreads_spawnable ||
> > +       if (!rcu_scheduler_fully_active ||
> >            per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
> >                return 0;
> > >        t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
> > > @@ -1639,7 +1639,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
> >        struct sched_param sp;
> >        struct task_struct *t;
> >
> > -       if (!rcu_kthreads_spawnable ||
> > +       if (!rcu_scheduler_fully_active ||
> >            rnp->qsmaskinit == 0)
> >                return 0;
> >        if (rnp->node_kthread_task == NULL) {
> > @@ -1665,7 +1665,7 @@ static int __init rcu_spawn_kthreads(void)
> >        int cpu;
> >        struct rcu_node *rnp;
> >
> > -       rcu_kthreads_spawnable = 1;
> > +       rcu_scheduler_fully_active = 1;
> >        for_each_possible_cpu(cpu) {
> >                per_cpu(rcu_cpu_has_work, cpu) = 0;
> >                if (cpu_online(cpu))
> > @@ -1687,7 +1687,7 @@ static void __cpuinit rcu_prepare_kthreads(int
cpu)
> >        struct rcu_node *rnp = rdp->mynode;
> >
> > >        /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
> > -       if (rcu_kthreads_spawnable) {
> > +       if (rcu_scheduler_fully_active) {
> >                (void)rcu_spawn_one_cpu_kthread(cpu);
> >                if (rnp->node_kthread_task == NULL)
> > >                        (void)rcu_spawn_one_node_kthread(rcu_state, rnp);
> > > @@ -1726,6 +1726,13 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
> >  {
> >  }
> >
> > +static int __init rcu_scheduler_really_started(void)
> > +{
> > +       rcu_scheduler_fully_active = 1;
> > +       return 0;
> > +}
> > +early_initcall(rcu_scheduler_really_started);
> > +
> >  static void __cpuinit rcu_prepare_kthreads(int cpu)
> >  {
> >  }
> >
