Gonglei (Arei)
2013-Aug-08 14:23 UTC
pvops: Does PVOPS guest os support online "suspend/resume"
Hi all,

While suspending and resuming a PVOPS guest OS while it is running, we found that its block/net I/O gets stuck. A non-PVOPS guest OS has no such problem.

How reproducible:
-------------------
1/1

Steps to reproduce:
------------------
1) suspend the guest OS
   Note: do not migrate or shut down the guest OS.
2) resume the guest OS

(Think about rolling back (resume) while core-dumping (suspend) a guest; this problem would leave the guest OS inoperable.)

===================================================================
We found warning messages in the guest OS:
--------------------------------------------------------------------
Aug 2 10:17:34 localhost kernel: [38592.985159] platform pcspkr: resume
Aug 2 10:17:34 localhost kernel: [38592.989890] platform vesafb.0: resume
Aug 2 10:17:34 localhost kernel: [38592.996075] input input0: type resume
Aug 2 10:17:34 localhost kernel: [38593.001330] input input1: type resume
Aug 2 10:17:34 localhost kernel: [38593.005496] vbd vbd-51712: legacy resume
Aug 2 10:17:34 localhost kernel: [38593.011506] WARNING: g.e. still in use!
Aug 2 10:17:34 localhost kernel: [38593.016909] WARNING: leaking g.e. and page still in use!
Aug 2 10:17:34 localhost kernel: [38593.026204] xen vbd-51760: legacy resume
Aug 2 10:17:34 localhost kernel: [38593.033070] vif vif-0: legacy resume
Aug 2 10:17:34 localhost kernel: [38593.039327] WARNING: g.e. still in use!
Aug 2 10:17:34 localhost kernel: [38593.045304] WARNING: leaking g.e. and page still in use!
Aug 2 10:17:34 localhost kernel: [38593.052101] WARNING: g.e. still in use!
Aug 2 10:17:34 localhost kernel: [38593.057965] WARNING: leaking g.e. and page still in use!
Aug 2 10:17:34 localhost kernel: [38593.066795] serial8250 serial8250: resume
Aug 2 10:17:34 localhost kernel: [38593.073556] input input2: type resume
Aug 2 10:17:34 localhost kernel: [38593.079385] platform Fixed MDIO bus.0: resume
Aug 2 10:17:34 localhost kernel: [38593.086285] usb usb1: type resume
------------------------------------------------------

This means we revoke grant-table entries while they are still in use.

The reason lies in the suspend/resume code:
--------------------------------------------------------
//drivers/xen/manage.c
static void do_suspend(void)
{
	int err;
	struct suspend_info si;

	shutting_down = SHUTDOWN_SUSPEND;
	...
	err = dpm_suspend_start(PMSG_FREEZE);
	...
	dpm_resume_start(si.cancelled ? PMSG_THAW : PMSG_RESTORE);

	if (err) {
		pr_err("failed to start xen_suspend: %d\n", err);
		si.cancelled = 1;
	}
	//NOTE: si.cancelled = 1

out_resume:
	if (!si.cancelled) {
		xen_arch_resume();
		xs_resume();
	} else
		xs_suspend_cancel();

	dpm_resume_end(si.cancelled ? PMSG_THAW : PMSG_RESTORE); //blkfront device gets resumed here.

out_thaw:
#ifdef CONFIG_PREEMPT
	thaw_processes();
out:
#endif
	shutting_down = SHUTDOWN_INVALID;
}
------------------------------------

The function dpm_suspend_start() suspends devices, and dpm_resume_end() resumes them. However, we found that the blkfront device has a RESUME method but no SUSPEND method:

-------------------------------------
//drivers/block/xen-blkfront.c
static DEFINE_XENBUS_DRIVER(blkfront, ,
	.probe = blkfront_probe,
	.remove = blkfront_remove,
	.resume = blkfront_resume,		// only a RESUME method found here
	.otherend_changed = blkback_changed,
	.is_ready = blkfront_is_ready,
);
--------------------------------------

So blkfront gets resumed even though it was never suspended, which causes the problem above.

========================================
To check whether the problem lies in PVOPS or in the hypervisor (Xen)/dom0, we suspended/resumed other, non-PVOPS guest OSes; no such problem occurred.

Non-PVOPS guests use their own Xen drivers, as shown in https://github.com/jpaton/xen-4.1-LJX1/blob/master/unmodified_drivers/linux-2.6/platform-pci/machine_reboot.c :

int __xen_suspend(int fast_suspend, void (*resume_notifier)(int))
{
	int err, suspend_cancelled, nr_cpus;
	struct ap_suspend_info info;

	xenbus_suspend();
	...
	preempt_enable();

	if (!suspend_cancelled)
		xenbus_resume();	// when the guest OS is resumed after a cancelled suspend, suspend_cancelled == 1, so this branch is not entered
	else
		xenbus_suspend_cancel();	// it gets here, so blkfront is not resumed

	return 0;
}

In a non-PVOPS guest OS, although there is no blkfront SUSPEND method either, the Xen driver does not resume the blkfront device; thus there is no problem after suspend/resume.

I am wondering why the two kinds of drivers (PVOPS and non-PVOPS) differ here. Is it because:
1) the PVOPS kernel does not take this situation into account and has a bug here? or
2) PVOPS avoids the problem some other way?

Thank you in advance.

-Gonglei
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Konrad Rzeszutek Wilk
2013-Aug-08 19:16 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
On Thu, Aug 08, 2013 at 02:23:06PM +0000, Gonglei (Arei) wrote:
> Hi all,
>
> While suspending and resuming a PVOPS guest OS while it is running, we found that its block/net I/O gets stuck. A non-PVOPS guest OS has no such problem.

With what version of Linux is this? Have you tried with v3.10?

Thanks.

> [rest of the original report snipped]
Gonglei (Arei)
2013-Aug-10 08:29 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Friday, August 09, 2013 3:17 AM
> To: Gonglei (Arei)
> Cc: xen-devel@lists.xen.org; Zhangbo (Oscar); Luonengjun; Hanweidong
> Subject: Re: [Xen-devel] pvops: Does PVOPS guest os support online "suspend/resume"
>
> On Thu, Aug 08, 2013 at 02:23:06PM +0000, Gonglei (Arei) wrote:
> > Hi all,
> >
> > While suspending and resuming a PVOPS guest OS while it is running, we found that its block/net I/O gets stuck. A non-PVOPS guest OS has no such problem.
>
> With what version of Linux is this? Have you tried with v3.10?

Thanks for responding. We have tried kernel 3.5.0-17-generic (Ubuntu 12.10); the problem still exists. We have not yet checked kernel 3.10, but we suspect it has the same problem.

Xen version: 4.3.0

Another way to reproduce:
1) xl create dom1.cfg
2) xl save -c dom1 /path/to/save/file
   (-c: leave the domain running after creating the snapshot.)

As I mentioned before, the problem occurs because a PVOPS guest OS RESUMEs blkfront when the guest resumes. The blkfront_resume method seems unnecessary here. A non-PVOPS guest OS does not RESUME blkfront, so it works fine.

So, here come the two questions again. Is the problem caused because:
1) the PVOPS kernel does not take this situation into account and has a bug here? or
2) PVOPS avoids the problem some other way?

-Gonglei

> [rest of the quoted report snipped]
Konrad Rzeszutek Wilk
2013-Aug-12 12:49 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
On Sat, Aug 10, 2013 at 08:29:43AM +0000, Gonglei (Arei) wrote:
> Thanks for responding. We have tried kernel 3.5.0-17-generic (Ubuntu 12.10); the problem still exists.

So you have not tried v3.10. v3.5 is ancient from the upstream perspective.

> We have not yet checked kernel 3.10, but we suspect it has the same problem.

Potentially. There were fixes added in 3.5:

commit 569ca5b3f94cd0b3295ec5943aa457cf4a4f6a3a
Author: Jan Beulich <JBeulich@suse.com>
Date:   Thu Apr 5 16:10:07 2012 +0100

    xen/gnttab: add deferred freeing logic

    Rather than just leaking pages that can't be freed at the point where
    access permission for the backend domain gets revoked, put them on a
    list and run a timer to (infrequently) retry freeing them. (This can
    particularly happen when unloading a frontend driver when devices are
    still present, and the backend still has them in non-closed state or
    hasn't finished closing them yet.)

and that seems to be triggered.

> Xen version: 4.3.0
>
> Another way to reproduce:
> 1) xl create dom1.cfg
> 2) xl save -c dom1 /path/to/save/file
>    (-c: leave the domain running after creating the snapshot.)
>
> As I mentioned before, the problem occurs because a PVOPS guest OS RESUMEs blkfront when the guest resumes.
> The blkfront_resume method seems unnecessary here.

It has to do that, otherwise it can't replay the I/Os that might not have hit the platter when it migrated from the original host.

But you are exercising the case where it does a checkpoint, not a full save/restore cycle. In that case you might indeed be hitting a bug.

> A non-PVOPS guest OS does not RESUME blkfront, so it works fine.

Potentially. The non-PVOPS guests are based on ancient kernels, and the upstream logic in the generic suspend/resume machinery has also changed.

> So, here come the two questions again. Is the problem caused because:
> 1) the PVOPS kernel does not take this situation into account and has a bug here? or
> 2) PVOPS avoids the problem some other way?

Just to make sure I am not confused here: the problem does not appear if you do NOT use -c, correct?

> [rest of the quoted report snipped]
Gonglei (Arei)
2013-Aug-12 14:19 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
Hi,

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Monday, August 12, 2013 8:50 PM
> To: Gonglei (Arei)
> Cc: xen-devel@lists.xen.org; Zhangbo (Oscar); Luonengjun; ian.campbell@citrix.com; stefano.stabellini@eu.citrix.com; rjw@sisk.pl; rshriram@cs.ubc.ca; Yanqiangjun; Jinjian (Ken)
> Subject: Re: [Xen-devel] pvops: Does PVOPS guest os support online "suspend/resume"
>
> So you have not tried v3.10. v3.5 is ancient from the upstream perspective.

Thank you, I didn't notice that; I will try 3.10 later.

> Potentially. There were fixes added in 3.5:
>
> commit 569ca5b3f94cd0b3295ec5943aa457cf4a4f6a3a
> Author: Jan Beulich <JBeulich@suse.com>
> Date:   Thu Apr 5 16:10:07 2012 +0100
>
>     xen/gnttab: add deferred freeing logic
>
> and that seems to be triggered.

I have tried applying this patch, but it did not fix the problem: it retries endlessly to free the leaking pages, and there seems to be no end. Messages keep coming out every second: "WARNING: leaking g.e. and page still in use!"

> It has to do that, otherwise it can't replay the I/Os that might not have hit the platter when it migrated from the original host.
>
> But you are exercising the case where it does a checkpoint, not a full save/restore cycle.
>
> In which case you might be indeed hitting a bug.

If we add a suspend method to blkfront that, when the guest OS is suspended, moves the frontend/backend block device states from {XenbusStateConnected, XenbusStateConnected} to {XenbusStateInitialising, XenbusStateInitWait}, would that cause any problem? We found that the Windows Xen PV driver does exactly this. We are hoping that such an attempt would solve the problem.

> Just to make sure I am not confused here. The problem does not appear if you do NOT use -c, correct?

Yes. The purpose of using "-c" here is to do an ONLINE suspend/resume. The problem occurs only with ONLINE suspend/resume, not with OFFLINE suspend/resume. To be precise, here are two examples:

<1>
1) xl create dom1.cfg
2) xl save -c dom1 /opt/dom1.save
   After this, guest dom1 has its I/O stuck, which means ONLINE suspend/resume has something wrong.
3) xl destroy dom1
4) xl restore /opt/dom1.save
   The restored dom1 works fine, which means OFFLINE suspend/resume is OK.

<2>
1) xl create dom1.cfg
2) xl save dom1 /opt/dom1.save
   No "-c" here; this destroys guest dom1 automatically.
3) xl restore /opt/dom1.save
   The restored dom1 works fine, which means OFFLINE suspend/resume is OK.

-Gonglei
Shriram Rajagopalan
2013-Aug-12 18:04 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
On Mon, Aug 12, 2013 at 10:19 AM, Gonglei (Arei) <arei.gonglei@huawei.com> wrote:

> > So you have not tried v3.10. v3.5 is ancient from the upstream perspective.
>
> Thank you, I didn't notice that; I will try 3.10 later.

3.5 may be ancient compared to 3.10, but from the suspend/resume support perspective, I think things were fixed way back in the 3.0 series.

> Yes. The purpose of using "-c" here is to do an ONLINE suspend/resume. The problem occurs only with ONLINE suspend/resume, not with OFFLINE suspend/resume. To be precise, here are two examples:
> <1>
> 1) xl create dom1.cfg
> 2) xl save -c dom1 /opt/dom1.save
>    After this, guest dom1 has its I/O stuck, which means ONLINE suspend/resume has something wrong.
> 3) xl destroy dom1
> 4) xl restore /opt/dom1.save
>    The restored dom1 works fine, which means OFFLINE suspend/resume is OK.

I am a bit lost here. Didn't we fix the suspend/resume issues in the 3.0 release window? I tested it with both xm and xl save (with/without the -c option). That was also when I fixed some bugs in the "xl save -c" code and introduced a minimal xl remus implementation (which is a continuous "xl save -c"). And we had blkfront et al. at that time too.

Did the distros miss some kernel config (IIRC it was HIBERNATE_CALLBACKS)?

So, did something fundamental change between 3.0 and 3.5, causing the "regression" that Gonglei is seeing?

> <2>
> 1) xl create dom1.cfg
> 2) xl save dom1 /opt/dom1.save
>    No "-c" here; this destroys guest dom1 automatically.
> 3) xl restore /opt/dom1.save
>    The restored dom1 works fine, which means OFFLINE suspend/resume is OK.

This one always worked, even with stock 2.6 kernels.

shriram
Gonglei (Arei)
2013-Aug-13 14:38 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
Hi,

I rechecked the different kernels today, and found that I made a mistake before. Sorry for misleading you all :)

All in all, the problems can be summarized in the 2 items below:

1. Kernel 2.6.32 PVOPS guest oses (I tested RHEL6.1 and RHEL6.3) do have bugs in ONLINE suspend/resume (checkpoint), which was, as Shriram mentioned, fixed in:
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/xen/manage.c?id=b3e96c0c756211e805c6941d4a6e5f6e1995cb6b

2. Kernels above 3.0 (I tested Ubuntu 12.10 with kernel 3.5 and Ubuntu 13.04 with kernel 3.8) seem to have another "bug":

   1) If we set MULTIPLE VCPUS for the guest os, it has problems in resuming (to be precise, it's thaw). In detail:
      <1> set the guest os with 4 vcpus
          in dom1.cfg: vcpus=4
      <2> xl create dom1.cfg
          execute the command "top -d 1" in guest dom1's vnc window
      <3> xl save -c dom1 /opt/dom1.save
      <4> after step <3>, we check the guest dom1's vnc window, and find that:
          - kernel threads migration/1, migration/2, migration/3 get their cpu usage up to 100%
          - the guest os can't respond to any request such as mouse movement or keyboard input
          - no "thaw" messages are printed in dom1's serial output

   2) If we set only 1 vcpu for the guest os, it thaws back and works fine.

   3) Another odd thing: if we use the saved file generated in 2-1) to restore the guest, and then do an online suspend/resume (xl save -c, checkpoint), it is fine; no problems occur.

Such problems occur on guest oses with kernel 3.5/3.8 (maybe other kernels as well, not tested). I hope that the steps I did were correct.

Have you ever encountered such a "suspend/resume checkpoint on multi-vcpu guest os" problem?

-------
PS: BTW, I'm wondering why using freeze/thaw instead of suspend/resume would solve the problem with kernels below 3.0? It seems that blkfront_resume is still called if we use the thaw method here, because blkfront has no available pm_op.
static int device_resume(struct device *dev, pm_message_t state, bool async)
{
	…………
	if (dev->bus) {
		if (dev->bus->pm) {
			info = "bus ";
			callback = pm_op(dev->bus->pm, state);
		} else if (dev->bus->resume) {
			info = "legacy bus ";
			callback = dev->bus->resume;	/* blkfront_resume is called here? */
			goto End;
		}
	}
	…………
}

Best Regards!
-Gonglei

From: Shriram Rajagopalan [mailto:rshriram@cs.ubc.ca]
Sent: Tuesday, August 13, 2013 2:05 AM
To: Gonglei (Arei)
Cc: Konrad Rzeszutek Wilk; xen-devel@lists.xen.org; Zhangbo (Oscar); Luonengjun; ian.campbell@citrix.com; stefano.stabellini@eu.citrix.com; rjw@sisk.pl; Yanqiangjun; Jinjian (Ken)
Subject: Re: [Xen-devel] pvops: Does PVOPS guest os support online "suspend/resume"

On Mon, Aug 12, 2013 at 10:19 AM, Gonglei (Arei) <arei.gonglei@huawei.com> wrote:
> > Thanks for responding. We've tried kernel "3.5.0-17 generic" (ubuntu 12.10),
> > the problem still exists.
>
> So you have not tried v3.10. v3.5 is ancient from the upstream perspective.
>
> thank you, I didn't notice that, I would try 3.10 later.

3.5 may be ancient compared to 3.10, but from the suspend/resume support perspective, I think things were fixed way back in the 3.0 series.

> yes, the purpose of using "-c" here is to do an ONLINE suspend/resume.
> such a problem just occurs with ONLINE suspend/resume, rather than OFFLINE
> suspend/resume. To be precise, 2 examples are listed here below:
> <1>
> 1) xl create dom1.cfg
> 2) xl save -c dom1 /opt/dom1.save
>    after this, the dom1 guest os has its I/O stuck, which means ONLINE
>    suspend/resume has something wrong.
> 3) xl destroy dom1
> 4) xl restore /opt/dom1.save
>    the restored dom1 works fine, which means OFFLINE suspend/resume is OK.

I am a bit lost here. Didn't we fix suspend/resume issues in the 3.0 release window? I tested it with both xm and xl save (with/without the -c option). That was also when I fixed some bugs in the "xl save -c" code and introduced a minimal xl remus implementation (which is a continuous "xl save -c").
And we had blkfront et al. at that time too. Did the distros miss some kernel config (IIRC it was HIBERNATE_CALLBACKS)? So, did something fundamental change between 3.0 and 3.5, causing the "regression" that Gonglei is seeing?

> <2>
> 1) xl create dom1.cfg
> 2) xl save dom1 /opt/dom1.save
>    no "-c" here, so it destroys the guest dom1 automatically.
> 3) xl restore /opt/dom1.save
>    the restored dom1 works fine, which means OFFLINE suspend/resume is OK.

This one always worked, even with stock 2.6 kernels.

shriram
Konrad Rzeszutek Wilk
2013-Aug-13 16:34 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
On Tue, Aug 13, 2013 at 02:38:18PM +0000, Gonglei (Arei) wrote:
> Hi,
> I rechecked the different kernels today, and found that I made a mistake before. Sorry for misleading you all :)
>
> All in all, the problems can be summarized in the 2 items below:
> 1 kernel 2.6.32 PVOPS guest oses (I tested RHEL6.1 and RHEL6.3) do have bugs in ONLINE suspend/resume (checkpoint), which was,
> as Shriram mentioned, fixed in:
> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/xen/manage.c?id=b3e96c0c756211e805c6941d4a6e5f6e1995cb6b
> 2 kernels above 3.0 (I tested Ubuntu 12.10 with kernel 3.5 and Ubuntu 13.04 with kernel 3.8) seem to have another "bug":
> 1) if we set MULTIPLE VCPUS for the guest os, it has problems in resuming (to be precise, it's thaw).
> In detail:
> <1> set the guest os with 4 vcpus
>     in dom1.cfg: vcpus=4
> <2> xl create dom1.cfg
>     execute the command "top -d 1" in guest dom1's vnc window
> <3> xl save -c dom1 /opt/dom1.save
> <4> after step <3>, we check the guest dom1's vnc window, and find that:
>     kernel threads migration/1, migration/2, migration/3 get their cpu usage up to 100%
>     the guest os can't respond to any request such as mouse movement or keyboard input
>     no "thaw" messages are printed in dom1's serial output
>
> 2) if we set only 1 vcpu for the guest os, it thaws back and works fine.
> 3) another odd thing: if we use the saved file generated in 2-1) to restore the guest, and then do online suspend/resume (xl save -c, checkpoint),
> it is fine; no problems occur.
>
> Such problems occur on guest oses with kernel 3.5/3.8 (maybe other kernels as well, not tested). I hope that the steps I did were correct.

Please do check with the upstream kernel. There were some CPU hotplug issues in older kernels, and just to make sure that this is not one of them it would be good to eliminate that possibility.
Please do test with v3.11-rc5.

> Have you ever encountered such a "suspend/resume checkpoint on multi-vcpu guest os" problem?
>
> -------
> PS: BTW, I'm wondering why using freeze/thaw instead of suspend/resume would solve the problem with kernels below 3.0?
> It seems that blkfront_resume is still called if we use the thaw method here, because blkfront has no available pm_op.
>
> static int device_resume(struct device *dev, pm_message_t state, bool async)
> {
> 	…………
> 	if (dev->bus) {
> 		if (dev->bus->pm) {
> 			info = "bus ";
> 			callback = pm_op(dev->bus->pm, state);
> 		} else if (dev->bus->resume) {
> 			info = "legacy bus ";
> 			callback = dev->bus->resume;	/* blkfront_resume is called here? */
> 			goto End;

One easy way to figure this out is to stick printks in here to see if that blkfront code is indeed called. You can also use 'dump_stack()' to get a nice stack-trace.
Gonglei (Arei)
2013-Aug-14 10:52 UTC
Re: pvops: Does PVOPS guest os support online "suspend/resume"
> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Wednesday, August 14, 2013 12:35 AM
> To: Gonglei (Arei)
> Cc: rshriram@cs.ubc.ca; xen-devel@lists.xen.org; Zhangbo (Oscar); Luonengjun; ian.campbell@citrix.com; stefano.stabellini@eu.citrix.com; rjw@sisk.pl; Yanqiangjun; Jinjian (Ken)
> Subject: Re: [Xen-devel] pvops: Does PVOPS guest os support online "suspend/resume"
>
> On Tue, Aug 13, 2013 at 02:38:18PM +0000, Gonglei (Arei) wrote:
> > Hi,
> > I rechecked the different kernels today, and found that I made a mistake before. Sorry for misleading you all :)
> >
> > All in all, the problems can be summarized in the 2 items below:
> > 1 kernel 2.6.32 PVOPS guest oses (I tested RHEL6.1 and RHEL6.3) do have bugs in ONLINE suspend/resume (checkpoint), which was,
> > as Shriram mentioned, fixed in:
> > http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/xen/manage.c?id=b3e96c0c756211e805c6941d4a6e5f6e1995cb6b
> > 2 kernels above 3.0 (I tested Ubuntu 12.10 with kernel 3.5 and Ubuntu 13.04 with kernel 3.8) seem to have another "bug":
> > 1) if we set MULTIPLE VCPUS for the guest os, it has problems in resuming (to be precise, it's thaw).
> > In detail:
> > <1> set the guest os with 4 vcpus
> >     in dom1.cfg: vcpus=4
> > <2> xl create dom1.cfg
> >     execute the command "top -d 1" in guest dom1's vnc window
> > <3> xl save -c dom1 /opt/dom1.save
> > <4> after step <3>, we check the guest dom1's vnc window, and find that:
> >     kernel threads migration/1, migration/2, migration/3 get their cpu usage up to 100%
> >     the guest os can't respond to any request such as mouse movement or keyboard input
> >     no "thaw" messages are printed in dom1's serial output
> >
> > 2) if we set only 1 vcpu for the guest os, it thaws back and works fine.
> > 3) another odd thing: if we use the saved file generated in 2-1) to restore the guest, and then do online suspend/resume (xl save -c, checkpoint),
> > it is fine; no problems occur.
> >
> > Such problems occur on guest oses with kernel 3.5/3.8 (maybe other kernels as well, not tested). I hope that the steps I did were correct.
>
> Please do check with the upstream kernel. There were some CPU hotplug issues in older kernels,
> and just to make sure that this is not one of them it would be good to eliminate that possibility.
>
> Please do test with v3.11-rc5.
>
> > Have you ever encountered such a "suspend/resume checkpoint on multi-vcpu guest os" problem?
> >
> > -------
> > PS: BTW, I'm wondering why using freeze/thaw instead of suspend/resume would solve the problem with kernels below 3.0?
> > It seems that blkfront_resume is still called if we use the thaw method here, because blkfront has no available pm_op.
> >
> > static int device_resume(struct device *dev, pm_message_t state, bool async)
> > {
> > 	…………
> > 	if (dev->bus) {
> > 		if (dev->bus->pm) {
> > 			info = "bus ";
> > 			callback = pm_op(dev->bus->pm, state);
> > 		} else if (dev->bus->resume) {
> > 			info = "legacy bus ";
> > 			callback = dev->bus->resume;	/* blkfront_resume is called here? */
> > 			goto End;
>
> One easy way to figure this out is to stick printks in here to see if that blkfront code is indeed called.
> You can also use 'dump_stack()' to get a nice stack-trace.

Hi,

1. I tried kernel 3.11-rc6; it has the same problem: after doing the checkpoint, a multi-vcpu guest os can't respond to anything, because its kernel threads migration/1, migration/2, etc., get their cpu usage up to 100%.
2. Kernel 3.0 doesn't have this problem.

So, it seems that some bugs were introduced between v3.0 and v3.5, something concerning vcpu freeze/thaw?

Thanks!
-Gonglei
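For anyone trying to reproduce the multi-vcpu thaw hang discussed in this thread, the only setting the reports actually depend on is vcpus=4 in the guest config. A minimal dom1.cfg sketch follows; everything except vcpus is a placeholder assumption, so adjust kernel, disk, and network lines to the local setup:

```
# dom1.cfg sketch: only vcpus=4 is taken from the reports above;
# all paths and sizes below are placeholder assumptions.
name    = "dom1"
kernel  = "/boot/vmlinuz-guest"            # PV guest kernel (placeholder)
ramdisk = "/boot/initrd-guest.img"         # placeholder
memory  = 1024
vcpus   = 4                                # multiple vcpus trigger the thaw hang
disk    = [ 'phy:/dev/vg0/dom1,xvda,w' ]   # placeholder
vif     = [ 'bridge=xenbr0' ]              # placeholder
```

With this config the reproduction is: xl create dom1.cfg, run "top -d 1" in the guest, then xl save -c dom1 /opt/dom1.save and watch whether the guest thaws; changing vcpus to 1 should make the hang disappear, per Gonglei's tests.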