Hello,

The following bizarre behaviour was observed on xen4.1+suse dom0 2.6.38, on an old Core Duo laptop; maybe someone can hint what is wrong.

Dom0 boot stalls after an init.d script prints "Starting udev". Then nothing seems to happen. I need to press any key to observe progress - I need to do it tens of times for the boot to finish. Once X has started, there is no need for keypresses anymore.

A particularly disturbing fact is that the qrexec_daemon parent, which basically does

    for (;;) { sleep(1); fprintf(stderr, "."); }

does not print dots until a keypress arrives. So something is very wrong with timers.

Somewhat similarly, pm-suspend sometimes hangs at some stage; after detaching the power cord, the machine enters S3 immediately.

This is vaguely similar to the issue described in
https://lkml.org/lkml/2008/9/14/122
but this time, "nohz=off" does not help.

"cpufreq=dom0-kernel" cures the symptoms; but it is not a sideeffectless solution. Any idea what is going on or how to debug it ?

Regards,
RW

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
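For anyone trying to reproduce the timer symptom described above, here is a minimal sketch of a probe built around the same `sleep(1)` loop. It is not the qrexec_daemon code; the function names (`elapsed_seconds`, `is_stalled`, `probe_sleep_stalls`) and the slack threshold are made up for illustration. It assumes that `CLOCK_MONOTONIC` keeps advancing while the wakeup is delayed; if the clock itself stops during the stall, the probe would under-report.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Seconds between two timestamps taken from the same clock. */
double elapsed_seconds(const struct timespec *start,
                       const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) +
           (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* A sleep counts as stalled if it overran the requested time
 * by more than `slack` seconds (threshold is an assumption). */
int is_stalled(double requested, double actual, double slack)
{
    return (actual - requested) > slack;
}

/* Run `samples` one-second sleeps. Print a dot for each sleep that
 * returned on time and a message for each one that overran.
 * Returns the number of stalled sleeps. */
int probe_sleep_stalls(int samples, double slack, FILE *out)
{
    int stalls = 0;

    for (int i = 0; i < samples; i++) {
        struct timespec before, after;

        clock_gettime(CLOCK_MONOTONIC, &before);
        sleep(1);
        clock_gettime(CLOCK_MONOTONIC, &after);

        double actual = elapsed_seconds(&before, &after);
        if (is_stalled(1.0, actual, slack)) {
            stalls++;
            fprintf(out, "\nstall: sleep(1) took %.2f s\n", actual);
        } else {
            fputc('.', out);
        }
        fflush(out);
    }
    return stalls;
}
```

Calling something like `probe_sleep_stalls(60, 0.5, stderr)` from `main` during the stalled part of boot, without touching the keyboard, should show whether the sleeps overrun and by how much, which is more informative than dots that simply fail to appear.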

> From: Rafal Wojtczuk [mailto:rafal@invisiblethingslab.com]
> Sent: Monday, September 05, 2011 3:20 AM
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] dom0 is stalled until a keypress
>
> [...]
> "cpufreq=dom0-kernel" cures the symptoms; but it is not a sideeffectless
> solution. Any idea what is going on or how to debug it ?

ISTR seeing this on a Core(2?)Duo laptop and I think the workaround was setting max_cstate=0 (as Xen boot parameter).


On 09/06/11 17:49, Dan Magenheimer wrote:
>> From: Rafal Wojtczuk [mailto:rafal@invisiblethingslab.com]
>> Subject: [Xen-devel] dom0 is stalled until a keypress
>>
>> [...]
>> "cpufreq=dom0-kernel" cures the symptoms; but it is not a sideeffectless
>> solution. Any idea what is going on or how to debug it ?
>
> ISTR seeing this on a Core(2?)Duo laptop and I think the
> workaround was setting max_cstate=0 (as Xen boot parameter).

But what was the actual problem? Setting max_cstate is probably even worse for power management than setting cpufreq=dom0-kernel, isn't it?

Thanks,
joanna.


> From: Joanna Rutkowska [mailto:joanna@invisiblethingslab.com]
> Subject: Re: [Xen-devel] dom0 is stalled until a keypress
>
> On 09/06/11 17:49, Dan Magenheimer wrote:
> > ISTR seeing this on a Core(2?)Duo laptop and I think the
> > workaround was setting max_cstate=0 (as Xen boot parameter).
>
> But what was the actual problem? Setting max_cstate is probably even
> worse for power management than setting cpufreq=dom0-kernel, isn't it?

Sorry, dunno. I recall looking into it a bit and finding that the Core processor (and possibly specifically Merom, the laptop version) had some special C-state (C3, C1E maybe?) and giving up at that point. Sorry I can't be more helpful.

Jeremy Fitzhardinge
2011-Sep-06 19:04 UTC
Re: [Xen-devel] dom0 is stalled until a keypress
On 09/05/2011 02:19 AM, Rafal Wojtczuk wrote:
> [...]
> "cpufreq=dom0-kernel" cures the symptoms; but it is not a sideeffectless
> solution. Any idea what is going on or how to debug it ?

Try booting with "idle=halt".

J

On 09/06/11 19:17, Dan Magenheimer wrote:
> [...]
> Sorry, dunno. I recall looking into it a bit and finding that
> the Core processor (and possibly specifically Merom, the laptop
> version) had some special C-state (C3, C1E maybe?) and giving
> up at that point. Sorry I can't be more helpful.

But the same system worked fine without any tweaks (cpufreq, max_cstate) on Xen 3.4 and only started exhibiting this behavior after we switched to Xen 4.1...

j.

>>> On 07.09.11 at 11:03, Joanna Rutkowska <joanna@invisiblethingslab.com> wrote:
> [...]
> But the same system worked fine without any tweaks (cpufreq, max_cstate)
> on Xen 3.4 and only started exhibiting this behavior after we switched
> to Xen 4.1...

4.1.0 or 4.1.1?

Jan

On Wed, Sep 07, 2011 at 10:35:45AM +0100, Jan Beulich wrote:
> >>> On 07.09.11 at 11:03, Joanna Rutkowska <joanna@invisiblethingslab.com> wrote:
> > [...]
> > But the same system worked fine without any tweaks (cpufreq, max_cstate)
> > on Xen 3.4 and only started exhibiting this behavior after we switched
> > to Xen 4.1...
>
> 4.1.0 or 4.1.1?

Originally tested on 4.1.0; same problem with 4.1.1.

Jeremy> Try booting with "idle=halt".

It does not help, either.

Regards,
RW