Changing the ACPI state (e.g. power off) while not all cpus are in cpupool 0 will currently crash the hypervisor during disabling the other cpus. This patch avoids the crash by adding the reason for disabling a cpu (either permanent e.g. in case of cpu hotplug or temporary in case of ACPI state change). This requires an additional parameter for cpu callbacks. All callbacks are changed to take a structure as parameter instead of only the cpu number. Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com> 25 files changed, 185 insertions(+), 139 deletions(-) xen/arch/x86/acpi/cpu_idle.c | 13 ++++--- xen/arch/x86/cpu/mcheck/mce.c | 8 ++-- xen/arch/x86/cpu/mcheck/mce_intel.c | 8 ++-- xen/arch/x86/hvm/hvm.c | 10 ++--- xen/arch/x86/microcode.c | 6 +-- xen/arch/x86/nmi.c | 17 +++++---- xen/arch/x86/percpu.c | 8 ++-- xen/arch/x86/setup.c | 2 - xen/arch/x86/smpboot.c | 8 ++-- xen/arch/x86/sysctl.c | 8 ++-- xen/arch/x86/x86_32/traps.c | 8 ++-- xen/common/cpu.c | 46 ++++++++++++++++---------- xen/common/cpupool.c | 17 ++++++--- xen/common/kexec.c | 13 ++++--- xen/common/rcupdate.c | 15 +++++--- xen/common/sched_credit2.c | 6 +-- xen/common/schedule.c | 8 ++-- xen/common/stop_machine.c | 17 +++++---- xen/common/tasklet.c | 19 ++++++---- xen/common/timer.c | 15 +++++--- xen/common/tmem_xen.c | 32 +++++++++--------- xen/common/trace.c | 6 +-- xen/drivers/cpufreq/cpufreq.c | 15 +++++--- xen/drivers/cpufreq/cpufreq_misc_governors.c | 6 +-- xen/include/xen/cpu.h | 13 ++++++- _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Keir Fraser
2012-Mar-20 12:52 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
That''s quite a lot of bother. Firstly, this could probably be supported by a global system_state type of variable indicating whether we are booting, suspending, normal. Etc. Secondly I wonder whether you really need to care about this detail from within the cpupool code? When you offline a CPU in cpupool!=0, remember it. Put it back in the pool when it onlines, if the pool still exists. Don''t prevent a pool from being destroyed just because it has offline cpus as members. Something like that? Or even always have the cpupool code put onlined cpus in pool 0, and have the acpi suspend code remember and restore pool memberships. -- Keir On 20/03/2012 12:15, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:> Changing the ACPI state (e.g. power off) while not all cpus are in cpupool 0 > will currently crash the hypervisor during disabling the other cpus. > This patch avoids the crash by adding the reason for disabling a cpu (either > permanent e.g. in case of cpu hotplug or temporary in case of ACPI state > change). > This requires an additional parameter for cpu callbacks. All callbacks are > changed to take a structure as parameter instead of only the cpu number. > > Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com> > > > 25 files changed, 185 insertions(+), 139 deletions(-) > xen/arch/x86/acpi/cpu_idle.c | 13 ++++--- > xen/arch/x86/cpu/mcheck/mce.c | 8 ++-- > xen/arch/x86/cpu/mcheck/mce_intel.c | 8 ++-- > xen/arch/x86/hvm/hvm.c | 10 ++--- > xen/arch/x86/microcode.c | 6 +-- > xen/arch/x86/nmi.c | 17 +++++---- > xen/arch/x86/percpu.c | 8 ++-- > xen/arch/x86/setup.c | 2 - > xen/arch/x86/smpboot.c | 8 ++-- > xen/arch/x86/sysctl.c | 8 ++-- > xen/arch/x86/x86_32/traps.c | 8 ++-- > xen/common/cpu.c | 46 ++++++++++++++++---------- > xen/common/cpupool.c | 17 ++++++--- > xen/common/kexec.c | 13 ++++--- > xen/common/rcupdate.c | 15 +++++--- > xen/common/sched_credit2.c | 6 +-- > xen/common/schedule.c | 8 ++-- > xen/common/stop_machine.c | 17 +++++---- > xen/common/tasklet.c | 19 ++++++---- > xen/common/timer.c | 15 +++++--- > xen/common/tmem_xen.c | 32 +++++++++--------- > xen/common/trace.c | 6 +-- > xen/drivers/cpufreq/cpufreq.c | 15 +++++--- > xen/drivers/cpufreq/cpufreq_misc_governors.c | 6 +-- > xen/include/xen/cpu.h | 13 ++++++- > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Juergen Gross
2012-Mar-20 13:35 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
On 03/20/2012 01:52 PM, Keir Fraser wrote:> That''s quite a lot of bother. Firstly, this could probably be supported by a > global system_state type of variable indicating whether we are booting, > suspending, normal. Etc. Secondly I wonder whether you really need to care > about this detail from within the cpupool code? When you offline a CPU in > cpupool!=0, remember it. Put it back in the pool when it onlines, if the > pool still exists. Don''t prevent a pool from being destroyed just because it > has offline cpus as members. Something like that? Or even always have the > cpupool code put onlined cpus in pool 0, and have the acpi suspend code > remember and restore pool memberships.Hmm. Using a global variable seems to be hacky. I tried to find a clean solution for the problem. If the changes are too big in your opinion, I''ll try to handle cpu offlining local to cpupools. Thanks for your opinion, Juergen> -- Keir > > On 20/03/2012 12:15, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote: > >> Changing the ACPI state (e.g. power off) while not all cpus are in cpupool 0 >> will currently crash the hypervisor during disabling the other cpus. >> This patch avoids the crash by adding the reason for disabling a cpu (either >> permanent e.g. in case of cpu hotplug or temporary in case of ACPI state >> change). >> This requires an additional parameter for cpu callbacks. All callbacks are >> changed to take a structure as parameter instead of only the cpu number. >> >> Signed-off-by: Juergen Gross<juergen.gross@ts.fujitsu.com> >> >> >> 25 files changed, 185 insertions(+), 139 deletions(-) >> xen/arch/x86/acpi/cpu_idle.c | 13 ++++--- >> xen/arch/x86/cpu/mcheck/mce.c | 8 ++-- >> xen/arch/x86/cpu/mcheck/mce_intel.c | 8 ++-- >> xen/arch/x86/hvm/hvm.c | 10 ++--- >> xen/arch/x86/microcode.c | 6 +-- >> xen/arch/x86/nmi.c | 17 +++++---- >> xen/arch/x86/percpu.c | 8 ++-- >> xen/arch/x86/setup.c | 2 - >> xen/arch/x86/smpboot.c | 8 ++-- >> xen/arch/x86/sysctl.c | 8 ++-- >> xen/arch/x86/x86_32/traps.c | 8 ++-- >> xen/common/cpu.c | 46 ++++++++++++++++---------- >> xen/common/cpupool.c | 17 ++++++--- >> xen/common/kexec.c | 13 ++++--- >> xen/common/rcupdate.c | 15 +++++--- >> xen/common/sched_credit2.c | 6 +-- >> xen/common/schedule.c | 8 ++-- >> xen/common/stop_machine.c | 17 +++++---- >> xen/common/tasklet.c | 19 ++++++---- >> xen/common/timer.c | 15 +++++--- >> xen/common/tmem_xen.c | 32 +++++++++--------- >> xen/common/trace.c | 6 +-- >> xen/drivers/cpufreq/cpufreq.c | 15 +++++--- >> xen/drivers/cpufreq/cpufreq_misc_governors.c | 6 +-- >> xen/include/xen/cpu.h | 13 ++++++- >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > >-- Juergen Gross Principal Developer Operating Systems PDG ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html
Keir Fraser
2012-Mar-20 14:30 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
On 20/03/2012 13:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:> On 03/20/2012 01:52 PM, Keir Fraser wrote: >> That''s quite a lot of bother. Firstly, this could probably be supported by a >> global system_state type of variable indicating whether we are booting, >> suspending, normal. Etc. Secondly I wonder whether you really need to care >> about this detail from within the cpupool code? When you offline a CPU in >> cpupool!=0, remember it. Put it back in the pool when it onlines, if the >> pool still exists. Don''t prevent a pool from being destroyed just because it >> has offline cpus as members. Something like that? Or even always have the >> cpupool code put onlined cpus in pool 0, and have the acpi suspend code >> remember and restore pool memberships. > > Hmm. Using a global variable seems to be hacky.I''m not so sure. Suspend/resume is a significant out-of-the-ordinary global system state. Representing that in a state variable doesn''t seem so bad. There are other states we could fold into this, for example we have an early_boot variable in arch/x86 which could become part of the state enumeration.> I tried to find a clean > solution > for the problem. If the changes are too big in your opinion, I''ll try to > handle > cpu offlining local to cpupools.Is it better to have cpupools know about offlining/suspend, or have offlining/suspend know about cpupools? I would have thought the latter makes more sense since it is offlining/suspend which calls into the cpupool subsystem. -- Keir> Thanks for your opinion,
Juergen Gross
2012-Mar-20 14:46 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
On 03/20/2012 03:30 PM, Keir Fraser wrote:> On 20/03/2012 13:35, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote: > >> On 03/20/2012 01:52 PM, Keir Fraser wrote: >>> That''s quite a lot of bother. Firstly, this could probably be supported by a >>> global system_state type of variable indicating whether we are booting, >>> suspending, normal. Etc. Secondly I wonder whether you really need to care >>> about this detail from within the cpupool code? When you offline a CPU in >>> cpupool!=0, remember it. Put it back in the pool when it onlines, if the >>> pool still exists. Don''t prevent a pool from being destroyed just because it >>> has offline cpus as members. Something like that? Or even always have the >>> cpupool code put onlined cpus in pool 0, and have the acpi suspend code >>> remember and restore pool memberships. >> Hmm. Using a global variable seems to be hacky. > I''m not so sure. Suspend/resume is a significant out-of-the-ordinary global > system state. Representing that in a state variable doesn''t seem so bad. > There are other states we could fold into this, for example we have an > early_boot variable in arch/x86 which could become part of the state > enumeration. > >> I tried to find a clean >> solution >> for the problem. If the changes are too big in your opinion, I''ll try to >> handle >> cpu offlining local to cpupools. > Is it better to have cpupools know about offlining/suspend, or have > offlining/suspend know about cpupools? I would have thought the latter makes > more sense since it is offlining/suspend which calls into the cpupool > subsystem.I thought of a more relaxed solution in the cpupool coding: Instead of allowing to offline a cpu only if it is in Pool-0, I would allow it if: - the cpu is not the last one in the cpupool - or no domain is active in the cpupool (this would include the suspend case, where all domains are paused) Together with your proposal to remember the cpupool for an offlined cpu to add it again when it is onlined the handling should be rather simple and local. Juergen -- Juergen Gross Principal Developer Operating Systems PDG ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html
Keir Fraser
2012-Mar-20 14:55 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
On 20/03/2012 14:46, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> Is it better to have cpupools know about offlining/suspend, or have >> offlining/suspend know about cpupools? I would have thought the latter makes >> more sense since it is offlining/suspend which calls into the cpupool >> subsystem. > > I thought of a more relaxed solution in the cpupool coding: > > Instead of allowing to offline a cpu only if it is in Pool-0, I would allow it > if: > - the cpu is not the last one in the cpupool > - or no domain is active in the cpupool (this would include the suspend case, > where all domains are paused) > > Together with your proposal to remember the cpupool for an offlined cpu to add > it again when it is onlined the handling should be rather simple and local.Ah, I see. Yes, a more flexible policy like this in the cpupool subsystem is best solution of all, imo. -- Keir
Juergen Gross
2012-Mar-21 13:46 UTC
Re: [PATCH] Allow ACPI state change with active cpupools
On 03/20/2012 03:55 PM, Keir Fraser wrote:> On 20/03/2012 14:46, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote: > >>> Is it better to have cpupools know about offlining/suspend, or have >>> offlining/suspend know about cpupools? I would have thought the latter makes >>> more sense since it is offlining/suspend which calls into the cpupool >>> subsystem. >> I thought of a more relaxed solution in the cpupool coding: >> >> Instead of allowing to offline a cpu only if it is in Pool-0, I would allow it >> if: >> - the cpu is not the last one in the cpupool >> - or no domain is active in the cpupool (this would include the suspend case, >> where all domains are paused) >> >> Together with your proposal to remember the cpupool for an offlined cpu to add >> it again when it is onlined the handling should be rather simple and local. > Ah, I see. Yes, a more flexible policy like this in the cpupool subsystem is > best solution of all, imo.I have a problem with that solution. I had to change cpu_disable_scheduler() to do a vcpu_migrate() only if the vcpu is runnable. When the ACPI state change isn''t permanent like in poweroff case I don''t want to change vcpu affinities. This action might be necessary in vcpu_wake() if v->processor isn''t usable for the vcpu any more. Unfortunately vcpu_migrate() is calling evtchn_move_pirqs(), which taking the domain event channel lock. And vcpu_wake() is called at least by send_evtchn() with that lock already hold... Is it possible to use spin_lock_recursive() for the domain event channel lock? Another solution would be to omit the vcpu_migrate() in cpu_disable_scheduler() for paused vcpus only and to do it when unpausing the vcpu, but this would leak scheduler internals... Juergen -- Juergen Gross Principal Developer Operating Systems PDG ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html