Shriram Rajagopalan
2011-Feb-16 06:51 UTC
[Xen-devel] [PATCH] xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
Use PM_FREEZE, PM_THAW and PM_RESTORE power events for
suspend/resume/checkpoint functionality, instead of PM_SUSPEND
and PM_RESUME. Use of these pm events fixes the Xen Guest hangup
when taking checkpoints. When a suspend event is cancelled
(while taking checkpoints once/continuously), we use PM_THAW
instead of PM_RESUME. PM_RESTORE is used when suspend is not
cancelled. See Documentation/power/devices.txt and linux/pm.h
for more info about freeze, thaw and restore. The sequence of
pm events in a suspend-resume scenario is shown below.

    dpm_suspend_start(PMSG_FREEZE);

        dpm_suspend_noirq(PMSG_FREEZE);

            sysdev_suspend(PMSG_FREEZE);
                cancelled = suspend_hypercall()
            sysdev_resume();

        dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);

    dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
---
 drivers/xen/manage.c                       |   12 ++++++------
 drivers/xen/xenbus/xenbus_probe.c          |    2 +-
 drivers/xen/xenbus/xenbus_probe_frontend.c |    8 +++++---
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index db8c4c4..3f76dcf 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -63,7 +63,7 @@ static int xen_suspend(void *data)
 
 	BUG_ON(!irqs_disabled());
 
-	err = sysdev_suspend(PMSG_SUSPEND);
+	err = sysdev_suspend(PMSG_FREEZE);
 	if (err) {
 		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
 			err);
@@ -114,16 +114,16 @@ static void do_suspend(void)
 	}
 #endif
 
-	err = dpm_suspend_start(PMSG_SUSPEND);
+	err = dpm_suspend_start(PMSG_FREEZE);
 	if (err) {
 		printk(KERN_ERR "xen suspend: dpm_suspend_start %d\n", err);
 		goto out_thaw;
 	}
 
-	printk(KERN_DEBUG "suspending xenstore...\n");
+	/* printk(KERN_DEBUG "suspending xenstore...\n"); */
 	xs_suspend();
 
-	err = dpm_suspend_noirq(PMSG_SUSPEND);
+	err = dpm_suspend_noirq(PMSG_FREEZE);
 	if (err) {
 		printk(KERN_ERR "dpm_suspend_noirq failed: %d\n", err);
 		goto out_resume;
@@ -134,7 +134,7 @@ static void do_suspend(void)
 	else
 		err = stop_machine(xen_suspend, &cancelled, cpumask_of(0));
 
-	dpm_resume_noirq(PMSG_RESUME);
+	dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);
 
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -148,7 +148,7 @@ out_resume:
 	} else
 		xs_suspend_cancel();
 
-	dpm_resume_end(PMSG_RESUME);
+	dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);
 
 	/* Make sure timer events get retriggered on all CPUs */
 	clock_was_set();
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 7397695..471ca48 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -645,7 +645,7 @@ EXPORT_SYMBOL_GPL(xenbus_dev_resume);
 int xenbus_dev_cancel(struct device *dev)
 {
 	/* Do nothing */
-	DPRINTK("cancel");
+	/* DPRINTK("cancel"); */
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_cancel);
diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c
index 9ad8868..d6e5f0d 100644
--- a/drivers/xen/xenbus/xenbus_probe_frontend.c
+++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
@@ -86,9 +86,11 @@ static struct device_attribute xenbus_frontend_dev_attrs[] = {
 };
 
 static struct dev_pm_ops xenbus_pm_ops = {
-	.suspend = xenbus_dev_suspend,
-	.resume  = xenbus_dev_resume,
-	.thaw    = xenbus_dev_cancel,
+	.suspend = xenbus_dev_suspend,
+	.resume  = xenbus_dev_resume,
+	.freeze  = xenbus_dev_suspend,
+	.thaw    = xenbus_dev_cancel,
+	.restore = xenbus_dev_resume,
 };
 
 static struct xen_bus_type xenbus_frontend = {
-- 
1.7.0.4

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
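For readers following the thread, the ordering in the patch description can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the enumerators, `resume_event()` and `trace_suspend_cycle()` are hypothetical names invented for illustration, and nothing here is kernel code. It records which PM event each step of the sequence would receive for a cancelled and a non-cancelled suspend.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins; the names echo linux/pm.h but these are
 * plain enumerators, not the kernel's pm_message_t values. */
enum pm_event { EV_NONE, EV_FREEZE, EV_THAW, EV_RESTORE };

/* Hypothetical labels for the calls listed in the patch description. */
enum step {
	STEP_DPM_SUSPEND_START,
	STEP_DPM_SUSPEND_NOIRQ,
	STEP_SYSDEV_SUSPEND,
	STEP_SUSPEND_HYPERCALL,
	STEP_SYSDEV_RESUME,
	STEP_DPM_RESUME_NOIRQ,
	STEP_DPM_RESUME_END,
	STEP_COUNT
};

/* Event used on the resume side: thaw after a cancelled suspend
 * (a checkpoint), restore otherwise. */
static enum pm_event resume_event(int cancelled)
{
	return cancelled ? EV_THAW : EV_RESTORE;
}

/* Fill events[step] with the event passed at each step, in call order.
 * Steps that take no PM message are recorded as EV_NONE. */
static void trace_suspend_cycle(int cancelled, enum pm_event events[STEP_COUNT])
{
	assert(events != NULL);
	events[STEP_DPM_SUSPEND_START] = EV_FREEZE;
	events[STEP_DPM_SUSPEND_NOIRQ] = EV_FREEZE;
	events[STEP_SYSDEV_SUSPEND]    = EV_FREEZE;
	events[STEP_SUSPEND_HYPERCALL] = EV_NONE;
	events[STEP_SYSDEV_RESUME]     = EV_NONE;
	events[STEP_DPM_RESUME_NOIRQ]  = resume_event(cancelled);
	events[STEP_DPM_RESUME_END]    = resume_event(cancelled);
}
```

Calling `trace_suspend_cycle(1, ev)` models a checkpoint (suspend cancelled), and `trace_suspend_cycle(0, ev)` models a save/restore or migration.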
Ian Campbell
2011-Feb-16 09:57 UTC
Re: [Xen-devel] [PATCH] xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
On Wed, 2011-02-16 at 06:51 +0000, Shriram Rajagopalan wrote:
> Use PM_FREEZE, PM_THAW and PM_RESTORE power events for
> suspend/resume/checkpoint functionality, instead of PM_SUSPEND
> and PM_RESUME. Use of these pm events fixes the Xen Guest hangup
> when taking checkpoints. When a suspend event is cancelled
> (while taking checkpoints once/continuously), we use PM_THAW
> instead of PM_RESUME. PM_RESTORE is used when suspend is not
> cancelled. See Documentation/power/devices.txt and linux/pm.h
> for more info about freeze, thaw and restore. The sequence of
> pm events in a suspend-resume scenario is shown below.
>
>     dpm_suspend_start(PMSG_FREEZE);
>
>         dpm_suspend_noirq(PMSG_FREEZE);
>
>             sysdev_suspend(PMSG_FREEZE);
>                 cancelled = suspend_hypercall()
>             sysdev_resume();
>
>         dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);
>
>     dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);

Thank you.

This applies without fuzz on top of my pvhvm suspend cleanup series from
yesterday but with the side effect that this change now impacts PVHVM
suspend too (since xen_hvm_suspend, which you did not patch here, was
merged into xen_suspend by that series) -- did you test that
configuration or did you deliberately avoid changing xen_hvm_suspend?

Tiny fixlet I needed to compile on top of that series

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 1c855e3..8f62fec 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -147,7 +147,7 @@ static void do_suspend(void)
 
 	err = stop_machine(xen_suspend, &si, cpumask_of(0));
 
-	dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);
+	dpm_resume_noirq(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
 
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -161,7 +161,7 @@ out_resume:
 	} else
 		xs_suspend_cancel();
 
-	dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);
+	dpm_resume_end(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
 
 	/* Make sure timer events get retriggered on all CPUs */
 	clock_was_set();

>
> -	printk(KERN_DEBUG "suspending xenstore...\n");
> +	/* printk(KERN_DEBUG "suspending xenstore...\n"); */
>  	xs_suspend();
> [...]
>  int xenbus_dev_cancel(struct device *dev)
>  {
>  	/* Do nothing */
> -	DPRINTK("cancel");
> +	/* DPRINTK("cancel"); */
>  	return 0;

Please don't make unrelated changes in patches. Also please don't just
comment out code, either leave these lines alone or remove them
entirely.

Otherwise the patch appears to be fine to me.

Ian.
Ian Campbell
2011-Feb-16 11:43 UTC
Re: [Xen-devel] [PATCH] xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
On Wed, 2011-02-16 at 06:51 +0000, Shriram Rajagopalan wrote:
> Use PM_FREEZE, PM_THAW and PM_RESTORE power events for
> suspend/resume/checkpoint functionality, instead of PM_SUSPEND
> and PM_RESUME. Use of these pm events fixes the Xen Guest hangup
> when taking checkpoints. When a suspend event is cancelled
> (while taking checkpoints once/continuously), we use PM_THAW
> instead of PM_RESUME. PM_RESTORE is used when suspend is not
> cancelled. See Documentation/power/devices.txt and linux/pm.h
> for more info about freeze, thaw and restore. The sequence of
> pm events in a suspend-resume scenario is shown below.
>
>     dpm_suspend_start(PMSG_FREEZE);
>
>         dpm_suspend_noirq(PMSG_FREEZE);
>
>             sysdev_suspend(PMSG_FREEZE);
>                 cancelled = suspend_hypercall()
>             sysdev_resume();
>
>         dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);
>
>     dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);

With this patch I get

[ 18.902808] PM: Device pcspkr failed to freeze: error -22
[ 18.902835] xen suspend: dpm_suspend_start -22

apparently due to a lack of CONFIG_HIBERNATE which is a prerequisite for
using the freeze methods (see pm_ops function).

As I mentioned earlier I think some of the CONFIG_PM_SLEEP tests in
drivers/xen/manage.c need to be adjusted for the new suspend scheme (and
I suspect they are a little wrong for the old one too).

Since CONFIG_HIBERNATE is a "suspend to disk" option I think this needs
running past the core pm guys to determine the correct approach, it
might be to make PMSG_FREEZE support enabled by some less specific
configuration option.

Enabling CONFIG_HIBERNATE does seem to be sufficient to make this work
though.

Ian.
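The -22 in the report above is -EINVAL. A rough user-space model of the failure mode Ian describes might look like the sketch below. It is an illustration only, not the kernel's dispatch code: `model_pm_ops`, `model_pm_op()`, `ok_cb()` and the `hibernation_enabled` flag are hypothetical names, and the flag stands in for the build-time option the message calls CONFIG_HIBERNATE (the mainline Kconfig symbol is, as far as I know, spelled CONFIG_HIBERNATION).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space stand-ins for the PM events discussed in the thread. */
enum pm_event { EV_SUSPEND, EV_RESUME, EV_FREEZE, EV_THAW, EV_RESTORE };

/* Hypothetical, simplified callback table (callbacks take no device). */
struct model_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
	int (*freeze)(void);
	int (*thaw)(void);
	int (*restore)(void);
};

/* Trivial callback that always succeeds, for demonstration. */
static int ok_cb(void)
{
	return 0;
}

/* Dispatch one event. When hibernation support is "compiled out"
 * (hibernation_enabled == 0), freeze/thaw/restore are not recognised
 * and the call fails with -EINVAL, mirroring the reported error. */
static int model_pm_op(const struct model_pm_ops *ops, enum pm_event ev,
		       int hibernation_enabled)
{
	int (*cb)(void) = NULL;

	assert(ops != NULL);
	switch (ev) {
	case EV_SUSPEND:
		cb = ops->suspend;
		break;
	case EV_RESUME:
		cb = ops->resume;
		break;
	case EV_FREEZE:
	case EV_THAW:
	case EV_RESTORE:
		if (!hibernation_enabled)
			return -EINVAL;
		cb = (ev == EV_FREEZE) ? ops->freeze :
		     (ev == EV_THAW) ? ops->thaw : ops->restore;
		break;
	default:
		return -EINVAL;
	}
	/* A missing callback is treated as "nothing to do". */
	return cb ? cb() : 0;
}
```

In this model a suspend event succeeds regardless of the flag, while a freeze event succeeds only when the flag is set, which is the behaviour that enabling the hibernation option restored in Ian's test.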
Shriram Rajagopalan
2011-Feb-16 18:15 UTC
Re: [Xen-devel] [PATCH] xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
I didn't test the patch against the latest xen_suspend patch series you
sent out. I couldn't find it in any of the trees. And since you said
earlier that the xen_hvm_suspend fix would be (re)fixed to PM_FREEZE
after my patch, I refrained from touching it. But I did test with a
2.6.38-rc1 32-bit kernel, PVHVM mode. It "seemed" to work fine for
save/restore/checkpoint. I could see the PM event messages in dmesg
(freeze, thaw, restore related timing stats).

On Wed, Feb 16, 2011 at 3:43 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2011-02-16 at 06:51 +0000, Shriram Rajagopalan wrote:
>> Use PM_FREEZE, PM_THAW and PM_RESTORE power events for
>> suspend/resume/checkpoint functionality, instead of PM_SUSPEND
>> and PM_RESUME. Use of these pm events fixes the Xen Guest hangup
>> when taking checkpoints. When a suspend event is cancelled
>> (while taking checkpoints once/continuously), we use PM_THAW
>> instead of PM_RESUME. PM_RESTORE is used when suspend is not
>> cancelled. See Documentation/power/devices.txt and linux/pm.h
>> for more info about freeze, thaw and restore. The sequence of
>> pm events in a suspend-resume scenario is shown below.
>>
>>     dpm_suspend_start(PMSG_FREEZE);
>>
>>         dpm_suspend_noirq(PMSG_FREEZE);
>>
>>             sysdev_suspend(PMSG_FREEZE);
>>                 cancelled = suspend_hypercall()
>>             sysdev_resume();
>>
>>         dpm_resume_noirq(cancelled ? PMSG_THAW : PMSG_RESTORE);
>>
>>     dpm_resume_end(cancelled ? PMSG_THAW : PMSG_RESTORE);
>
> With this patch I get
>
> [ 18.902808] PM: Device pcspkr failed to freeze: error -22
> [ 18.902835] xen suspend: dpm_suspend_start -22
>
> apparently due to a lack of CONFIG_HIBERNATE which is a prerequisite for
> using the freeze methods (see pm_ops function).
>
> As I mentioned earlier I think some of the CONFIG_PM_SLEEP tests in
> drivers/xen/manage.c need to be adjusted for the new suspend scheme (and
> I suspect they are a little wrong for the old one too).
>
> Since CONFIG_HIBERNATE is a "suspend to disk" option I think this needs
> running past the core pm guys to determine the correct approach, it
> might be to make PMSG_FREEZE support enabled by some some less specific
> configuration option.
>
> Enabling CONFIG_HIBERNATE does seem to be sufficient to make this work
> though.
>
> Ian.
>

On a related note, my initial kernel config had somehow enabled
CONFIG_MICROCODE. So, with a PV kernel (2.6.38-rc1), I got the following
WARNING stack trace for checkpoint & restore (i.e. freeze/thaw or
freeze/restore):

Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.255561] PM: freeze of devices complete after 0.123 msecs
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.255603] PM: late freeze of devices complete after 0.035 msecs
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] ------------[ cut here ]------------
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] WARNING: at ...arch/x86/kernel/microcode_core.c:454 mc_sysdev_resume+0x30/0x5c()
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] Modules linked in:
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] Pid: 6, comm: migration/0 Not tainted 2.6.38-rc1-xenu #12
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] Call Trace:
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff810417db>] ? warn_slowpath_common+0x80/0x98
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8107c601>] ? cpu_stopper_thread+0x10d/0x172
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff81041808>] ? warn_slowpath_null+0x15/0x17
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff810276c5>] ? mc_sysdev_resume+0x30/0x5c
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff812294f9>] ? __sysdev_resume+0x74/0xc4
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff812295ae>] ? sysdev_resume+0x65/0xa6
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff81204736>] ? xen_suspend+0xc4/0xcb
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8107c6f1>] ? stop_machine_cpu_stop+0x7d/0xb6
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8107c674>] ? stop_machine_cpu_stop+0x0/0xb6
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8107c5d7>] ? cpu_stopper_thread+0xe3/0x172
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff813ab106>] ? schedule+0x4e7/0x516
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff81006cf2>] ? check_events+0x12/0x20
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff81006cdf>] ? xen_restore_fl_direct_end+0x0/0x1
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8107c4f4>] ? cpu_stopper_thread+0x0/0x172
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff81057438>] ? kthread+0x7d/0x85
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8100b724>] ? kernel_thread_helper+0x4/0x10
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8100ab36>] ? int_ret_from_sys_call+0x7/0x1b
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff813ac6a1>] ? retint_restore_args+0x5/0x6
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] [<ffffffff8100b720>] ? kernel_thread_helper+0x0/0x10
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256614] ---[ end trace 24fdc8979bd6c62e ]---
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.256346] PM: early restore of devices complete after 0.047 msecs
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.270496] PM: restore of devices complete after 13.106 msecs
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.279878] Setting capacity to 41943040
Feb 16 06:02:35 rshriram-vm1 kernel: [ 147.293516] Setting capacity to 41943040
Feb 16 06:04:29 rshriram-vm1 init: hvc0 main process ended, respawning

Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.776082] PM: freeze of devices complete after 0.161 msecs
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.776127] PM: late freeze of devices complete after 0.037 msecs
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] ------------[ cut here ]------------
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] WARNING: at ...arch/x86/kernel/microcode_core.c:454 mc_sysdev_resume+0x30/0x5c()
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] Modules linked in:
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] Pid: 6, comm: migration/0 Tainted: G W 2.6.38-rc1-xenu #12
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] Call Trace:
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff810417db>] ? warn_slowpath_common+0x80/0x98
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81006cdf>] ? xen_restore_fl_direct_end+0x0/0x1
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8107c601>] ? cpu_stopper_thread+0x10d/0x172
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81041808>] ? warn_slowpath_null+0x15/0x17
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff810276c5>] ? mc_sysdev_resume+0x30/0x5c
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff812294f9>] ? __sysdev_resume+0x74/0xc4
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81006cdf>] ? xen_restore_fl_direct_end+0x0/0x1
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff812295ae>] ? sysdev_resume+0x65/0xa6
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81204736>] ? xen_suspend+0xc4/0xcb
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8107c6f1>] ? stop_machine_cpu_stop+0x7d/0xb6
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8107c674>] ? stop_machine_cpu_stop+0x0/0xb6
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8107c5d7>] ? cpu_stopper_thread+0xe3/0x172
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff813ab106>] ? schedule+0x4e7/0x516
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81006cf2>] ? check_events+0x12/0x20
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81006cdf>] ? xen_restore_fl_direct_end+0x0/0x1
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8107c4f4>] ? cpu_stopper_thread+0x0/0x172
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff81057438>] ? kthread+0x7d/0x85
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8100b724>] ? kernel_thread_helper+0x4/0x10
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8100ab36>] ? int_ret_from_sys_call+0x7/0x1b
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff813ac6a1>] ? retint_restore_args+0x5/0x6
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] [<ffffffff8100b720>] ? kernel_thread_helper+0x0/0x10
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777141] ---[ end trace 24fdc8979bd6c62f ]---
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777060] PM: early thaw of devices complete after 0.045 msecs
Feb 16 06:15:30 rshriram-vm1 kernel: [ 906.777060] PM: thaw of devices complete after 0.067 msecs

The sysdev_resume() call we make in drivers/xen/manage.c results in
calling [sysdev_drivers]->(resume)(). Looking at the microcode_core.c
driver, the mc_sysdev resume function raises this warning if more than
1 CPU is online during system resume.

If sysdev_resume took an arg like sysdev_suspend and called the
appropriate [sysdev_drivers]->(thaw)() or (restore)(), we could supply
(PM_THAW/PM_RESTORE) and avoid this sort of warning. I am not sure if
this would fit in with the intended functionality of the sysdev_resume()
function in drivers/base/sys.c.

Of course, disabling CONFIG_MICROCODE makes the warning go away but I
was thinking along the lines of a stock kernel config that has lots of
things enabled. Correct me if I am wrong about this.

shriram
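The idea floated above, a sysdev_resume() that takes an event and picks a matching callback, could look roughly like the following user-space sketch. Everything here is hypothetical: the kernel's sysdev drivers have no thaw or restore callbacks, and `model_sysdev_driver`, `model_sysdev_resume()` and the counting callbacks are invented purely to illustrate the dispatch. Falling back to `resume` when no event-specific callback exists is one possible design choice, assumed here so that existing drivers would keep working unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the events a resume path might receive. */
enum pm_event { EV_RESUME, EV_THAW, EV_RESTORE };

/* Hypothetical driver table: thaw and restore do not exist in the
 * real sysdev driver structure; they model the proposal only. */
struct model_sysdev_driver {
	int (*resume)(void);
	int (*thaw)(void);
	int (*restore)(void);
};

/* Demonstration callbacks that count how often they are invoked. */
static int resume_calls;
static int thaw_calls;

static int count_resume(void)
{
	resume_calls++;
	return 0;
}

static int count_thaw(void)
{
	thaw_calls++;
	return 0;
}

/* Pick the callback matching the event; fall back to resume when the
 * driver does not provide an event-specific one. */
static int model_sysdev_resume(const struct model_sysdev_driver *drv,
			       enum pm_event ev)
{
	int (*cb)(void);

	assert(drv != NULL);
	switch (ev) {
	case EV_THAW:
		cb = drv->thaw;
		break;
	case EV_RESTORE:
		cb = drv->restore;
		break;
	default:
		cb = NULL;
		break;
	}
	if (cb == NULL)
		cb = drv->resume;
	return cb ? cb() : 0;
}
```

Under this model a driver whose resume handler is only valid for native resume could supply a separate (possibly no-op) thaw callback, and the warning path would not be reached on a cancelled suspend.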
Ian Campbell
2011-Feb-16 18:28 UTC
Re: [Xen-devel] [PATCH] xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
On Wed, 2011-02-16 at 18:15 +0000, Shriram Rajagopalan wrote:
> I didn't test the patch against the latest xen_suspend patch series you
> sent out. I couldn't find it in any of the trees. And since you said
> earlier that the xen_hvm_suspend fix would be (re)fixed to PM_FREEZE
> after my patch, I refrained from touching it. But I did test with
> 2.6.38-rc1 32 bit kernel, PVHVM mode. It "seemed" to work fine for
> save/restore/checkpoint. I could see the PM event messages in dmesg
> (freeze, thaw, restore related timing stats)

Great, thanks.

> On a related note, my initial kernel config had somehow enabled
> CONFIG_MICROCODE.
> So, with a PV kernel (2.6.38-rc1), I got the following WARNING stack
> trace for
> checkpoint & restore (ie freeze/thaw or freeze/restore)
> [...]
> sysdev_resume() call we make in drivers/xen/manage.c results in
> calling [sysdev_drivers]->(resume)()
> Looking at the microcode_core.c driver, the mc_sysdev resume function
> raises this warning if more than 1 CPU is online during system resume.

This is a known issue, the 1 CPU constraint is a native thing and isn't
applicable to Xen. There has been a patch floating around for ages. I
saw some traffic about it recently, let me dig...

... aha. It went into -mm last week, see
<201102082204.p18M4bqc028043@imap1.linux-foundation.org>. Not bad for a
patch originally posted in 2009!
http://www.gossamer-threads.com/lists/linux/kernel/1335664

[...]
> Of course, disabling CONFIG_MICROCODE makes the warning go away but I
> was thinking along the lines of a stock kernel config that has lots of
> things enabled. Correct me if I am wrong about this.

Consideration for stock kernel configurations (particularly distro
configs) is absolutely correct, thanks!

Ian.