Performing an ACPI state change while not all cpus are in cpupool 0 will currently crash the hypervisor.

Patch 1 relaxes the cpupool cpu offlining handling to allow offlining a cpu in any cpupool as long as no lack of resources is detected.

Patch 2 avoids migrating a paused vcpu to another cpu.

 6 files changed, 144 insertions(+), 33 deletions(-)
 xen/arch/x86/smpboot.c     |    2 -
 xen/common/cpupool.c       |   85 +++++++++++++++++++++++++++++++++++++++-----
 xen/common/domain.c        |    4 +-
 xen/common/schedule.c      |   83 +++++++++++++++++++++++++++++++-----------
 xen/include/xen/sched-if.h |    1
 xen/include/xen/sched.h    |    2 +
Juergen Gross
2012-Mar-22 08:22 UTC
[PATCH 1 of 2] Allow ACPI state change with active cpupools
Changing the ACPI state (e.g. power off) while not all cpus are in cpupool 0 will currently crash the hypervisor when the other cpus are being disabled. Up to now only cpus in Pool-0 were allowed to go offline.

There is no reason why a cpu in another cpupool can't be taken offline if there are still other cpus available in that cpupool. As disabling the cpus for an ACPI state change is only a temporary measure (if not poweroff, of course) and all domains are paused in this case, there is no reason to reject removing the last cpu from a cpupool if only paused domains are in that cpupool.

The cpus taken offline from a cpupool will still be associated with that cpupool, so they can be added back automatically when they come online again.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>

 4 files changed, 80 insertions(+), 9 deletions(-)
 xen/arch/x86/smpboot.c     |    2 -
 xen/common/cpupool.c       |   85 +++++++++++++++++++++++++++++++++++++++-----
 xen/include/xen/sched-if.h |    1
 xen/include/xen/sched.h    |    1
Currently offlining a cpu will migrate all vcpus which were active on that cpu to another one, possibly breaking existing cpu affinities.

In case of an ACPI state change the cpus are taken offline and online again later (if not poweroff) while all domains are paused. There is no reason to migrate the vcpus while offlining the cpus, as the cpus will be available again by the time the domains are unpaused.

This patch defers the migration check in case of paused vcpus or domains by adding vcpu_arouse(), which wakes up a vcpu and checks whether it must be migrated to another cpu.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>

 3 files changed, 64 insertions(+), 24 deletions(-)
 xen/common/domain.c     |    4 +-
 xen/common/schedule.c   |   83 ++++++++++++++++++++++++++++++++++-------------
 xen/include/xen/sched.h |    1
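[Editorial note: the xen/common/domain.c hunk is not quoted anywhere in this thread. Presumably it switches the unpause path from vcpu_wake() to the new vcpu_arouse(); a minimal sketch under that assumption, not the actual hunk:]

    /* Sketch only: assumes the patch replaces vcpu_wake() with vcpu_arouse()
     * in the unpause path, so the deferred migration check runs when the
     * vcpu becomes runnable again. */
    void vcpu_unpause(struct vcpu *v)
    {
        if ( atomic_dec_and_test(&v->pause_count) )
            vcpu_arouse(v);   /* was: vcpu_wake(v) */
    }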
Jan Beulich
2012-Mar-22 10:09 UTC
Re: [PATCH 1 of 2] Allow ACPI state change with active cpupools
>>> On 22.03.12 at 09:22, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> static void cpupool_cpu_add(unsigned int cpu)
> {
>+    struct cpupool **c;
>+
>     spin_lock(&cpupool_lock);
>     cpumask_clear_cpu(cpu, &cpupool_locked_cpus);
>     cpumask_set_cpu(cpu, &cpupool_free_cpus);
>+    if ( cpumask_test_cpu(cpu, &cpupool_free_offline_cpus) )
>+    {
>+        cpumask_clear_cpu(cpu, &cpupool_free_offline_cpus);

cpumask_test_and_clear_cpu()

>+        goto out;
>+    }
>+    for_each_cpupool(c)
>+    {
>+        if ( cpumask_test_cpu(cpu, (*c)->cpu_offline) )
>+        {
>+            cpumask_clear_cpu(cpu, (*c)->cpu_offline);

Here too.

>+            cpupool_assign_cpu_locked(*c, cpu);
>+            goto out;
>+        }
>+    }
>     cpupool_assign_cpu_locked(cpupool0, cpu);
>...
>+void cpupool_disable_cpu(unsigned int cpu)
>+{
>+    struct cpupool *c;
>+
>+    c = per_cpu(cpupool, cpu);
>+    if ( c == NULL )
>+    {
>+        cpumask_clear_cpu(cpu, &cpupool_free_cpus);
>+        cpumask_set_cpu(cpu, &cpupool_free_offline_cpus);
>+    }
>+    else if ( cpumask_test_cpu(cpu, c->cpu_valid) )
>+    {
>+        cpumask_clear_cpu(cpu, c->cpu_valid);

And once more.

>+        cpumask_set_cpu(cpu, c->cpu_offline);
>+    }
>+}

Jan
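[Editorial note: for illustration, this is how the first of those hunks could look with Jan's suggestion applied -- cpumask_test_and_clear_cpu() folds the test and the clear into a single call. A sketch built from the quoted patch, not the actual follow-up version:]

    /* Cpu was a free (unassigned) cpu when it went offline? */
    if ( cpumask_test_and_clear_cpu(cpu, &cpupool_free_offline_cpus) )
        goto out;
    /* Otherwise put it back into the cpupool it was taken from. */
    for_each_cpupool(c)
    {
        if ( cpumask_test_and_clear_cpu(cpu, (*c)->cpu_offline) )
        {
            cpupool_assign_cpu_locked(*c, cpu);
            goto out;
        }
    }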
Your original patch didn't touch this code. Was that an omission in the original version? On reflection I prefer your original patch to this new approach. I'll apply it if you still believe your original patch is complete and correct as it stands.

 -- Keir

On 22/03/2012 08:22, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> Currently offlining a cpu will migrate all vcpus which were active on that
> cpu to another one, possibly breaking existing cpu affinities.
>
> In case of an ACPI state change the cpus are taken offline and online later
> (if not poweroff) while all domains are paused. There is no reason to
> migrate the vcpus during offlining the cpus, as the cpus will be available
> again when the domains are being unpaused.
>
> This patch defers the migration check in case of paused vcpus or domains
> by adding vcpu_arouse() to wake up a vcpu and check whether it must be
> migrated to another cpu.
>
> Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>
> 3 files changed, 64 insertions(+), 24 deletions(-)
> xen/common/domain.c     |    4 +-
> xen/common/schedule.c   |   83 ++++++++++++++++++++++++++++++++++-------------
> xen/include/xen/sched.h |    1
Juergen Gross
2012-Mar-22 10:22 UTC
Re: [PATCH 2 of 2] Avoid vcpu migration of paused vcpus
On 03/22/2012 11:12 AM, Keir Fraser wrote:
> Your original patch didn't touch this code. Was that an omission in the
> original version? On reflection I prefer your original patch to this new
> approach. I'll apply it if you still believe your original patch is complete
> and correct as it stands.

I like my second patch more :-)

It covers more cases, not just poweroff. In the hibernate case no vcpu pinnings will be lost. Today all vcpus pinned to a cpu other than 0 will lose their pinnings at cpu offlining, and at reactivation those pinnings will not be restored automatically. My patch covers that by checking the availability of the cpus after reactivation.

Poweroff (which was my primary concern) works with both versions. I did not test other ACPI state changes with either version, but would expect better results in the hibernate case with my second approach.


Juergen
>>> On 22.03.12 at 09:22, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>--- a/xen/common/schedule.c  Thu Mar 22 09:21:44 2012 +0100
>+++ b/xen/common/schedule.c  Thu Mar 22 09:21:47 2012 +0100
>@@ -525,6 +525,47 @@ void vcpu_force_reschedule(struct vcpu *
>     }
> }
>
>+static int vcpu_chk_affinity(struct vcpu *v, unsigned int cpu)
>+{
>+    cpumask_t online_affinity;
>+    struct cpupool *c;
>+
>+    c = v->domain->cpupool;
>+    cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>+    if ( cpumask_empty(&online_affinity) &&

There's no need for a local cpumask_t variable here - please use
!cpumask_intersects(v->cpu_affinity, c->cpu_valid) instead.

>+         cpumask_test_cpu(cpu, v->cpu_affinity) )
>+    {
>+        printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>+               v->domain->domain_id, v->vcpu_id);
>+        cpumask_setall(v->cpu_affinity);
>+    }
>+
>+    return ( !cpumask_test_cpu(cpu, c->cpu_valid) && (v->processor == cpu) );

Please drop the extra outermost parentheses.

>+}
>+
>+void vcpu_arouse(struct vcpu *v)
>+{
>+    unsigned long flags;
>+
>+    if ( atomic_read(&v->pause_count) ||
>+         atomic_read(&v->domain->pause_count) )

Is it not possible (or even more correct) to use vcpu_runnable() here?

>+        return;
>+
>+    vcpu_schedule_lock_irqsave(v, flags);
>+
>+    if ( unlikely(vcpu_chk_affinity(v, v->processor) && (v != current)) )

unlikely() is generally useful only around individual conditions (i.e.
not around && or || expressions).

>+    {
>+        set_bit(_VPF_migrating, &v->pause_flags);
>+        vcpu_schedule_unlock_irqrestore(v, flags);
>+        vcpu_migrate(v);
>+        return;
>+    }
>+
>+    vcpu_schedule_unlock_irqrestore(v, flags);
>+
>+    vcpu_wake(v);
>+}
>+
> /*
>  * This function is used by cpu_hotplug code from stop_machine context
>  * and from cpupools to switch schedulers on a cpu.
>@@ -534,7 +575,6 @@ int cpu_disable_scheduler(unsigned int c
>     struct domain *d;
>     struct vcpu *v;
>     struct cpupool *c;
>-    cpumask_t online_affinity;
>     int ret = 0;
>
>     c = per_cpu(cpupool, cpu);
>@@ -547,34 +587,33 @@ int cpu_disable_scheduler(unsigned int c
>         {
>             vcpu_schedule_lock_irq(v);
>
>-            cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>-            if ( cpumask_empty(&online_affinity) &&
>-                 cpumask_test_cpu(cpu, v->cpu_affinity) )
>+            if ( likely(!atomic_read(&v->pause_count) &&
>+                        !atomic_read(&d->pause_count)) )

Same question as above regarding vcpu_runnable().

>             {
>-                printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>-                       v->domain->domain_id, v->vcpu_id);
>-                cpumask_setall(v->cpu_affinity);
>-            }
>+                if ( vcpu_chk_affinity(v, cpu) )
>+                {
>+                    set_bit(_VPF_migrating, &v->pause_flags);
>+                    vcpu_schedule_unlock_irq(v);
>+                    vcpu_sleep_nosync(v);
>+                    vcpu_migrate(v);
>+                }
>+                else
>+                {
>+                    vcpu_schedule_unlock_irq(v);
>+                }

Please drop the unnecessary braces here, as per the recently posted
coding style draft.

Jan

>
>-            if ( v->processor == cpu )
>-            {
>-                set_bit(_VPF_migrating, &v->pause_flags);
>-                vcpu_schedule_unlock_irq(v);
>-                vcpu_sleep_nosync(v);
>-                vcpu_migrate(v);
>+                /*
>+                 * A vcpu active in the hypervisor will not be migratable.
>+                 * The caller should try again after releasing and reaquiring
>+                 * all locks.
>+                 */
>+                if ( v->processor == cpu )
>+                    ret = -EAGAIN;
>             }
>             else
>             {
>                 vcpu_schedule_unlock_irq(v);
>             }
>-
>-            /*
>-             * A vcpu active in the hypervisor will not be migratable.
>-             * The caller should try again after releasing and reaquiring
>-             * all locks.
>-             */
>-            if ( v->processor == cpu )
>-                ret = -EAGAIN;
>         }
>
>         domain_update_node_affinity(d);
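[Editorial note: to make Jan's first two remarks concrete, vcpu_chk_affinity() could be rewritten along these lines -- cpumask_intersects() replaces the local mask plus the cpumask_and()/cpumask_empty() pair, and the extra outermost parentheses around the return expression are gone. A sketch of the suggested cleanup, not an actual v2 hunk:]

    static int vcpu_chk_affinity(struct vcpu *v, unsigned int cpu)
    {
        struct cpupool *c = v->domain->cpupool;

        /* No online cpu left in the affinity mask, but the offlined cpu is in
         * it: break the affinity so the vcpu can run anywhere in the pool. */
        if ( !cpumask_intersects(v->cpu_affinity, c->cpu_valid) &&
             cpumask_test_cpu(cpu, v->cpu_affinity) )
        {
            printk("Breaking vcpu affinity for domain %d vcpu %d\n",
                   v->domain->domain_id, v->vcpu_id);
            cpumask_setall(v->cpu_affinity);
        }

        return !cpumask_test_cpu(cpu, c->cpu_valid) && (v->processor == cpu);
    }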
Juergen Gross
2012-Mar-22 10:38 UTC
Re: [PATCH 2 of 2] Avoid vcpu migration of paused vcpus
On 03/22/2012 11:26 AM, Jan Beulich wrote:
>>>> On 22.03.12 at 09:22, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>> --- a/xen/common/schedule.c  Thu Mar 22 09:21:44 2012 +0100
>> +++ b/xen/common/schedule.c  Thu Mar 22 09:21:47 2012 +0100
>> @@ -525,6 +525,47 @@ void vcpu_force_reschedule(struct vcpu *
>>      }
>>  }
>>
>> +static int vcpu_chk_affinity(struct vcpu *v, unsigned int cpu)
>> +{
>> +    cpumask_t online_affinity;
>> +    struct cpupool *c;
>> +
>> +    c = v->domain->cpupool;
>> +    cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>> +    if ( cpumask_empty(&online_affinity) &&
> There's no need for a local cpumask_t variable here - please use
> !cpumask_intersects(v->cpu_affinity, c->cpu_valid) instead.

Okay.

>> +         cpumask_test_cpu(cpu, v->cpu_affinity) )
>> +    {
>> +        printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>> +               v->domain->domain_id, v->vcpu_id);
>> +        cpumask_setall(v->cpu_affinity);
>> +    }
>> +
>> +    return ( !cpumask_test_cpu(cpu, c->cpu_valid) && (v->processor == cpu) );
> Please drop the extra outermost parentheses.

Okay.

>> +}
>> +
>> +void vcpu_arouse(struct vcpu *v)
>> +{
>> +    unsigned long flags;
>> +
>> +    if ( atomic_read(&v->pause_count) ||
>> +         atomic_read(&v->domain->pause_count) )
> Is it not possible (or even more correct) to use vcpu_runnable() here?

No. There might be flags set in v->pause_flags which I don't want to test
here. I'm only interested in the "real" paused state. E.g. an old "migrating"
case should be reconsidered here.

>> +        return;
>> +
>> +    vcpu_schedule_lock_irqsave(v, flags);
>> +
>> +    if ( unlikely(vcpu_chk_affinity(v, v->processor) && (v != current)) )
> unlikely() is generally useful only around individual conditions (i.e.
> not around && or || expressions).

Ah, I didn't know that. Will change it.

>> +    {
>> +        set_bit(_VPF_migrating, &v->pause_flags);
>> +        vcpu_schedule_unlock_irqrestore(v, flags);
>> +        vcpu_migrate(v);
>> +        return;
>> +    }
>> +
>> +    vcpu_schedule_unlock_irqrestore(v, flags);
>> +
>> +    vcpu_wake(v);
>> +}
>> +
>>  /*
>>   * This function is used by cpu_hotplug code from stop_machine context
>>   * and from cpupools to switch schedulers on a cpu.
>> @@ -534,7 +575,6 @@ int cpu_disable_scheduler(unsigned int c
>>      struct domain *d;
>>      struct vcpu *v;
>>      struct cpupool *c;
>> -    cpumask_t online_affinity;
>>      int ret = 0;
>>
>>      c = per_cpu(cpupool, cpu);
>> @@ -547,34 +587,33 @@ int cpu_disable_scheduler(unsigned int c
>>          {
>>              vcpu_schedule_lock_irq(v);
>>
>> -            cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>> -            if ( cpumask_empty(&online_affinity) &&
>> -                 cpumask_test_cpu(cpu, v->cpu_affinity) )
>> +            if ( likely(!atomic_read(&v->pause_count) &&
>> +                        !atomic_read(&d->pause_count)) )
> Same question as above regarding vcpu_runnable().

Same answer :-)

>>              {
>> -                printk("Breaking vcpu affinity for domain %d vcpu %d\n",
>> -                       v->domain->domain_id, v->vcpu_id);
>> -                cpumask_setall(v->cpu_affinity);
>> -            }
>> +                if ( vcpu_chk_affinity(v, cpu) )
>> +                {
>> +                    set_bit(_VPF_migrating, &v->pause_flags);
>> +                    vcpu_schedule_unlock_irq(v);
>> +                    vcpu_sleep_nosync(v);
>> +                    vcpu_migrate(v);
>> +                }
>> +                else
>> +                {
>> +                    vcpu_schedule_unlock_irq(v);
>> +                }
> Please drop the unnecessary braces here, as per the recently posted
> coding style draft.

Okay.


Juergen
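[Editorial note: pulling the agreed review changes together, vcpu_arouse() would then look roughly like this -- a sketch of the reworked version implied by the discussion, not a posted v2. The pause_count checks stay as Juergen argued, and unlikely() is dropped from the compound condition:]

    void vcpu_arouse(struct vcpu *v)
    {
        unsigned long flags;

        /* Only a "really" paused vcpu/domain defers the check; vcpu_runnable()
         * would also look at pause_flags, which is not wanted here. */
        if ( atomic_read(&v->pause_count) ||
             atomic_read(&v->domain->pause_count) )
            return;

        vcpu_schedule_lock_irqsave(v, flags);

        if ( vcpu_chk_affinity(v, v->processor) && (v != current) )
        {
            set_bit(_VPF_migrating, &v->pause_flags);
            vcpu_schedule_unlock_irqrestore(v, flags);
            vcpu_migrate(v);
            return;
        }

        vcpu_schedule_unlock_irqrestore(v, flags);

        vcpu_wake(v);
    }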
On 22/03/2012 10:22, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> On 03/22/2012 11:12 AM, Keir Fraser wrote:
>> Your original patch didn't touch this code. Was that an omission in the
>> original version? On reflection I prefer your original patch to this new
>> approach. I'll apply it if you still believe your original patch is complete
>> and correct as it stands.
>
> I like my second patch more :-)
>
> It covers more cases, not just poweroff. In hibernate case no vcpu pinnings
> will be lost. Today all vcpus pinned to a cpu other than 0 will lose their
> pinnings at cpu offlining. At reactivation those pinnings will not be
> restored automatically. My patch will cover that by checking availability
> of the cpus after reactivation.
>
> Poweroff (which was my primary concern) works with both versions. I did not
> test other ACPI state changes with either version, but would expect better
> results in hibernate case with my second approach.

How about the attached patch? It is similar to your original patch, except I added the global state variable and a check for it to cpu_disable_scheduler(). It's nice and small. :-)

 -- Keir
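[Editorial note: the attached patch itself is not reproduced in the archive, so the following is only a hypothetical sketch of the shape such a check could take. The flag name, where it is set, and the early-return placement are assumptions, not the actual patch:]

    /* Hypothetical sketch only: assumed global flag set by the ACPI
     * suspend/shutdown path before cpus are taken offline. */
    static bool_t system_suspending;

    int cpu_disable_scheduler(unsigned int cpu)
    {
        /* All domains are paused across an ACPI state change and the cpu
         * will be online again before they run, so skip vcpu migration
         * and affinity breaking entirely. */
        if ( system_suspending )
            return 0;

        /* ... existing per-vcpu migration / affinity-breaking logic ... */
        return 0;
    }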
Juergen Gross
2012-Mar-22 11:59 UTC
Re: [PATCH 2 of 2] Avoid vcpu migration of paused vcpus
On 03/22/2012 12:20 PM, Keir Fraser wrote:
> On 22/03/2012 10:22, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
>
>> Poweroff (which was my primary concern) works with both versions. I did not
>> test other ACPI state changes with either version, but would expect better
>> results in hibernate case with my second approach.
>
> How about the attached patch? It is similar to your original patch, except I
> added the global state variable and a check for it to cpu_disable_scheduler().
> It's nice and small. :-)

And it works in all of my test cases for poweroff. :-)

Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>


Thanks,

Juergen
Juergen Gross
2012-Mar-23 08:17 UTC
Re: [PATCH 2 of 2] Avoid vcpu migration of paused vcpus
On 03/22/2012 12:20 PM, Keir Fraser wrote:
> How about the attached patch? It is similar to your original patch, except I
> added the global state variable and a check for it to cpu_disable_scheduler().
> It's nice and small. :-)

Would you mind putting it in 4.1, too?


Juergen