In order to give guests a hint at whether their vCPU-s are currently scheduled (so they can e.g. adapt their behavior in spin loops), update the run state area (if registered) also when de-scheduling a vCPU. Also fix an oversight in the compat mode implementation of VCPUOP_register_runstate_memory_area. Please also consider for the 3.4 and 3.3 branches. Signed-off-by: Jan Beulich <jbeulich@novell.com> --- 2009-08-18.orig/xen/arch/x86/domain.c 2009-08-17 11:37:44.000000000 +0200 +++ 2009-08-18/xen/arch/x86/domain.c 2009-08-18 14:18:08.000000000 +0200 @@ -1265,6 +1265,26 @@ static void paravirt_ctxt_switch_to(stru } } +/* Update per-VCPU guest runstate shared memory area (if registered). */ +static void update_runstate_area(struct vcpu *v) +{ + if ( guest_handle_is_null(runstate_guest(v)) ) + return; + +#ifdef CONFIG_COMPAT + if ( is_pv_32on64_domain(v->domain) ) + { + struct compat_vcpu_runstate_info info; + + XLAT_vcpu_runstate_info(&info, &v->runstate); + __copy_to_guest(v->runstate_guest.compat, &info, 1); + return; + } +#endif + + __copy_to_guest(runstate_guest(v), &v->runstate, 1); +} + static inline int need_full_gdt(struct vcpu *v) { return (!is_hvm_vcpu(v) && !is_idle_vcpu(v)); @@ -1356,6 +1376,9 @@ void context_switch(struct vcpu *prev, s flush_tlb_mask(&dirty_mask); } + if (prev != next) + update_runstate_area(prev); + if ( is_hvm_vcpu(prev) && !list_empty(&prev->arch.hvm_vcpu.tm_list) ) pt_save_timer(prev); @@ -1395,21 +1418,8 @@ void context_switch(struct vcpu *prev, s context_saved(prev); - /* Update per-VCPU guest runstate shared memory area (if registered). */ - if ( !guest_handle_is_null(runstate_guest(next)) ) - { - if ( !is_pv_32on64_domain(next->domain) ) - __copy_to_guest(runstate_guest(next), &next->runstate, 1); -#ifdef CONFIG_COMPAT - else - { - struct compat_vcpu_runstate_info info; - - XLAT_vcpu_runstate_info(&info, &next->runstate); - __copy_to_guest(next->runstate_guest.compat, &info, 1); - } -#endif - } + if (prev != next) + update_runstate_area(next); schedule_tail(next); BUG(); --- 2009-08-18.orig/xen/arch/x86/x86_64/domain.c 2008-05-13 11:02:22.000000000 +0200 +++ 2009-08-18/xen/arch/x86/x86_64/domain.c 2009-08-18 14:18:08.000000000 +0200 @@ -56,7 +56,7 @@ arch_compat_vcpu_op( struct vcpu_runstate_info runstate; vcpu_runstate_get(v, &runstate); - XLAT_vcpu_runstate_info(&info, &v->runstate); + XLAT_vcpu_runstate_info(&info, &runstate); } __copy_to_guest(v->runstate_guest.compat, &info, 1); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Aug-18 20:28 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On 08/18/09 05:48, Jan Beulich wrote:> In order to give guests a hint at whether their vCPU-s are currently > scheduled (so they can e.g. adapt their behavior in spin loops), update > the run state area (if registered) also when de-scheduling a vCPU. >Hm, the pvops kernels are already assuming that this is happening, so definite ack from me. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Oct-02 23:53 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On 08/18/09 05:48, Jan Beulich wrote:> In order to give guests a hint at whether their vCPU-s are currently > scheduled (so they can e.g. adapt their behavior in spin loops), update > the run state area (if registered) also when de-scheduling a vCPU. > > Also fix an oversight in the compat mode implementation of > VCPUOP_register_runstate_memory_area. > > Please also consider for the 3.4 and 3.3 branches. >BTW, this actually changes the documented behaviour of register_runstate_memory_area: * 2. Only one shared area may be registered per VCPU. The shared area is * updated by the hypervisor each time the VCPU is scheduled. Thus * runstate.state will always be RUNSTATE_running and * runstate.state_entry_time will indicate the system time at which the * VCPU was last scheduled to run. not that I think anything was relying on the old behaviour (indeed, it''s pretty unexpected behaviour). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Nov-20 18:09 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
This patch seems to cause suspend/resume to fail, at least for a pvops kernel (I tried 2.6.29/30/31, currently using .30). The stack trace is lost among a raft of "BUG: recent printk recursion!" but I think the actual issue is: [ 32.904966] WARNING: at /local/scratch/ianc/devel/kernels/linux-2.6/arch/x86/xen/time.c:180 xen_sched_clock+0x6d/0x70() This is "WARN_ON(state.state != RUNSTATE_running);". With some debugging I''ve found that the kernel currently thinks the runstate is 3 (i.e. RUNSTATE_offline). I think the issue is that when the guest is suspended the last deschedule now updates the guest runstate -> RUNSTATE_offline but when the new VCPU in the new domain is first scheduled the guest_runstate pointer hasn''t yet been configured so the guests copy does not get updated. I can''t see where the guest runstate pointer is supposed to be either restored or re-setup on resume. I tried adding a setup_runstate_info to xen_timer_resume (to match the call in xen_timer_setup) but that seems like it is already too late -- I still see the warnings trigger. I''m not sure how this is possible since I thought we were in a stop_machine section at this point. Ian. On Tue, 2009-08-18 at 13:48 +0100, Jan Beulich wrote:> In order to give guests a hint at whether their vCPU-s are currently > scheduled (so they can e.g. adapt their behavior in spin loops), update > the run state area (if registered) also when de-scheduling a vCPU. > > Also fix an oversight in the compat mode implementation of > VCPUOP_register_runstate_memory_area. > > Please also consider for the 3.4 and 3.3 branches. > > Signed-off-by: Jan Beulich <jbeulich@novell.com> > > --- 2009-08-18.orig/xen/arch/x86/domain.c 2009-08-17 11:37:44.000000000 +0200 > +++ 2009-08-18/xen/arch/x86/domain.c 2009-08-18 14:18:08.000000000 +0200 > @@ -1265,6 +1265,26 @@ static void paravirt_ctxt_switch_to(stru > } > } > > +/* Update per-VCPU guest runstate shared memory area (if registered). */ > +static void update_runstate_area(struct vcpu *v) > +{ > + if ( guest_handle_is_null(runstate_guest(v)) ) > + return; > + > +#ifdef CONFIG_COMPAT > + if ( is_pv_32on64_domain(v->domain) ) > + { > + struct compat_vcpu_runstate_info info; > + > + XLAT_vcpu_runstate_info(&info, &v->runstate); > + __copy_to_guest(v->runstate_guest.compat, &info, 1); > + return; > + } > +#endif > + > + __copy_to_guest(runstate_guest(v), &v->runstate, 1); > +} > + > static inline int need_full_gdt(struct vcpu *v) > { > return (!is_hvm_vcpu(v) && !is_idle_vcpu(v)); > @@ -1356,6 +1376,9 @@ void context_switch(struct vcpu *prev, s > flush_tlb_mask(&dirty_mask); > } > > + if (prev != next) > + update_runstate_area(prev); > + > if ( is_hvm_vcpu(prev) && !list_empty(&prev->arch.hvm_vcpu.tm_list) ) > pt_save_timer(prev); > > @@ -1395,21 +1418,8 @@ void context_switch(struct vcpu *prev, s > > context_saved(prev); > > - /* Update per-VCPU guest runstate shared memory area (if registered). */ > - if ( !guest_handle_is_null(runstate_guest(next)) ) > - { > - if ( !is_pv_32on64_domain(next->domain) ) > - __copy_to_guest(runstate_guest(next), &next->runstate, 1); > -#ifdef CONFIG_COMPAT > - else > - { > - struct compat_vcpu_runstate_info info; > - > - XLAT_vcpu_runstate_info(&info, &next->runstate); > - __copy_to_guest(next->runstate_guest.compat, &info, 1); > - } > -#endif > - } > + if (prev != next) > + update_runstate_area(next); > > schedule_tail(next); > BUG(); > --- 2009-08-18.orig/xen/arch/x86/x86_64/domain.c 2008-05-13 11:02:22.000000000 +0200 > +++ 2009-08-18/xen/arch/x86/x86_64/domain.c 2009-08-18 14:18:08.000000000 +0200 > @@ -56,7 +56,7 @@ arch_compat_vcpu_op( > struct vcpu_runstate_info runstate; > > vcpu_runstate_get(v, &runstate); > - XLAT_vcpu_runstate_info(&info, &v->runstate); > + XLAT_vcpu_runstate_info(&info, &runstate); > } > __copy_to_guest(v->runstate_guest.compat, &info, 1); > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Nov-20 18:57 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On Fri, 2009-11-20 at 18:09 +0000, Ian Campbell wrote:> > > I can''t see where the guest runstate pointer is supposed to be either > restored or re-setup on resume. I tried adding a setup_runstate_info > to xen_timer_resume (to match the call in xen_timer_setup) but that > seems like it is already too late -- I still see the warnings trigger. > I''m not sure how this is possible since I thought we were in a > stop_machine section at this point.The xen_sched_clock calls are as a result of the various printks, e.g. in xen_vcpu_setup, in order to add the timestamp to the output. Therefore we need to ensure we reset the runstate info before any printks. --- Subject: re-register runstate area earlier on resume. This is necessary to ensure the runstate area is available to xen_sched_clock before any calls to printk which will require it in order to provide a timestamp. I chose to pull the xen_setup_runstate_info out of xen_time_init into the caller in order to maintain parity with calling xen_setup_runstate_info separately from calling xen_time_resume. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 0a5aa44..fed538a 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -100,7 +102,7 @@ bool xen_vcpu_stolen(int vcpu) return per_cpu(runstate, vcpu).state == RUNSTATE_runnable; } -static void setup_runstate_info(int cpu) +void xen_setup_runstate_info(int cpu) { struct vcpu_register_runstate_memory_area area; @@ -442,8 +456,6 @@ void xen_setup_timer(int cpu) evt->cpumask = cpumask_of(cpu); evt->irq = irq; - - setup_runstate_info(cpu); } void xen_teardown_timer(int cpu) @@ -494,6 +507,7 @@ __init void xen_time_init(void) setup_force_cpu_cap(X86_FEATURE_TSC); + xen_setup_runstate_info(cpu); xen_setup_timer(cpu); xen_setup_cpu_clockevents(); } diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index ca6596b..07ee25c 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -44,6 +44,7 @@ void __init xen_build_dynamic_phys_to_machine(void); void xen_init_irq_ops(void); void xen_setup_timer(int cpu); +void xen_setup_runstate_info(int cpu); void xen_teardown_timer(int cpu); cycle_t xen_clocksource_read(void); void xen_setup_cpu_clockevents(void); diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index f09e8c3..219fb3f 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -145,6 +158,8 @@ void xen_vcpu_restore(void) HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL)) BUG(); + xen_setup_runstate_info(cpu); + xen_vcpu_setup(cpu); if (other_cpu && _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Nov-21 00:48 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On 11/21/09 02:57, Ian Campbell wrote:> On Fri, 2009-11-20 at 18:09 +0000, Ian Campbell wrote: > >> >> I can''t see where the guest runstate pointer is supposed to be either >> restored or re-setup on resume. I tried adding a setup_runstate_info >> to xen_timer_resume (to match the call in xen_timer_setup) but that >> seems like it is already too late -- I still see the warnings trigger. >> I''m not sure how this is possible since I thought we were in a >> stop_machine section at this point. >> > The xen_sched_clock calls are as a result of the various printks, e.g. > in xen_vcpu_setup, in order to add the timestamp to the output. > Therefore we need to ensure we reset the runstate info before any > printks. >Thanks for investigating this. Does this mean there was a longer-standing bug where the runstate was just completely invalid after a save/restore? One comment:> --- > > Subject: re-register runstate area earlier on resume. > > This is necessary to ensure the runstate area is available to > xen_sched_clock before any calls to printk which will require it in > order to provide a timestamp. > > I chose to pull the xen_setup_runstate_info out of xen_time_init into > the caller in order to maintain parity with calling > xen_setup_runstate_info separately from calling xen_time_resume. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > index 0a5aa44..fed538a 100644 > --- a/arch/x86/xen/time.c > +++ b/arch/x86/xen/time.c > @@ -100,7 +102,7 @@ bool xen_vcpu_stolen(int vcpu) > return per_cpu(runstate, vcpu).state == RUNSTATE_runnable; > } > > -static void setup_runstate_info(int cpu) > +void xen_setup_runstate_info(int cpu) > { > struct vcpu_register_runstate_memory_area area; > > @@ -442,8 +456,6 @@ void xen_setup_timer(int cpu) > > evt->cpumask = cpumask_of(cpu); > evt->irq = irq; > - > - setup_runstate_info(cpu); > } > > void xen_teardown_timer(int cpu) > @@ -494,6 +507,7 @@ __init void xen_time_init(void) > > setup_force_cpu_cap(X86_FEATURE_TSC); > > + xen_setup_runstate_info(cpu); > xen_setup_timer(cpu); > xen_setup_cpu_clockevents(); > } > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h > index ca6596b..07ee25c 100644 > --- a/arch/x86/xen/xen-ops.h > +++ b/arch/x86/xen/xen-ops.h > @@ -44,6 +44,7 @@ void __init xen_build_dynamic_phys_to_machine(void); > > void xen_init_irq_ops(void); > void xen_setup_timer(int cpu); > +void xen_setup_runstate_info(int cpu); > void xen_teardown_timer(int cpu); > cycle_t xen_clocksource_read(void); > void xen_setup_cpu_clockevents(void); > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index f09e8c3..219fb3f 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -145,6 +158,8 @@ void xen_vcpu_restore(void) > HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL)) > BUG(); > > + xen_setup_runstate_info(cpu); > + > xen_vcpu_setup(cpu); >This is only run when have_vcpu_info_placement is set. I checked in a followup patch to rearrange the loop so that it runs unconditionally, but then only does xen_vcpu_setup() when have_vcpu_info_placement. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Nov-21 09:19 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On Sat, 2009-11-21 at 00:48 +0000, Jeremy Fitzhardinge wrote:> On 11/21/09 02:57, Ian Campbell wrote: > > On Fri, 2009-11-20 at 18:09 +0000, Ian Campbell wrote: > > > >> > >> I can''t see where the guest runstate pointer is supposed to be either > >> restored or re-setup on resume. I tried adding a setup_runstate_info > >> to xen_timer_resume (to match the call in xen_timer_setup) but that > >> seems like it is already too late -- I still see the warnings trigger. > >> I''m not sure how this is possible since I thought we were in a > >> stop_machine section at this point. > >> > > The xen_sched_clock calls are as a result of the various printks, e.g. > > in xen_vcpu_setup, in order to add the timestamp to the output. > > Therefore we need to ensure we reset the runstate info before any > > printks. > > > > Thanks for investigating this. Does this mean there was a > longer-standing bug where the runstate was just completely invalid after > a save/restore?I think so, yes.> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > > index f09e8c3..219fb3f 100644 > > --- a/arch/x86/xen/enlighten.c > > +++ b/arch/x86/xen/enlighten.c > > @@ -145,6 +158,8 @@ void xen_vcpu_restore(void) > > HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL)) > > BUG(); > > > > + xen_setup_runstate_info(cpu); > > + > > xen_vcpu_setup(cpu); > > > > This is only run when have_vcpu_info_placement is set. I checked in a > followup patch to rearrange the loop so that it runs unconditionally, > but then only does xen_vcpu_setup() when have_vcpu_info_placement.Make sense, thanks. I think this is stable material back as far as they are still maintaining them. The patch which exposed the issue was backported to 3.4.2. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Nov-23 08:19 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
Acked-by: Jan Beulich <jbeulich@novell.com>>>> Ian Campbell <Ian.Campbell@citrix.com> 20.11.09 19:57 >>>On Fri, 2009-11-20 at 18:09 +0000, Ian Campbell wrote:> > > I can''t see where the guest runstate pointer is supposed to be either > restored or re-setup on resume. I tried adding a setup_runstate_info > to xen_timer_resume (to match the call in xen_timer_setup) but that > seems like it is already too late -- I still see the warnings trigger. > I''m not sure how this is possible since I thought we were in a > stop_machine section at this point.The xen_sched_clock calls are as a result of the various printks, e.g. in xen_vcpu_setup, in order to add the timestamp to the output. Therefore we need to ensure we reset the runstate info before any printks. --- Subject: re-register runstate area earlier on resume. This is necessary to ensure the runstate area is available to xen_sched_clock before any calls to printk which will require it in order to provide a timestamp. I chose to pull the xen_setup_runstate_info out of xen_time_init into the caller in order to maintain parity with calling xen_setup_runstate_info separately from calling xen_time_resume. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 0a5aa44..fed538a 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -100,7 +102,7 @@ bool xen_vcpu_stolen(int vcpu) return per_cpu(runstate, vcpu).state == RUNSTATE_runnable; } -static void setup_runstate_info(int cpu) +void xen_setup_runstate_info(int cpu) { struct vcpu_register_runstate_memory_area area; @@ -442,8 +456,6 @@ void xen_setup_timer(int cpu) evt->cpumask = cpumask_of(cpu); evt->irq = irq; - - setup_runstate_info(cpu); } void xen_teardown_timer(int cpu) @@ -494,6 +507,7 @@ __init void xen_time_init(void) setup_force_cpu_cap(X86_FEATURE_TSC); + xen_setup_runstate_info(cpu); xen_setup_timer(cpu); xen_setup_cpu_clockevents(); } diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index ca6596b..07ee25c 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -44,6 +44,7 @@ void __init xen_build_dynamic_phys_to_machine(void); void xen_init_irq_ops(void); void xen_setup_timer(int cpu); +void xen_setup_runstate_info(int cpu); void xen_teardown_timer(int cpu); cycle_t xen_clocksource_read(void); void xen_setup_cpu_clockevents(void); diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index f09e8c3..219fb3f 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -145,6 +158,8 @@ void xen_vcpu_restore(void) HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL)) BUG(); + xen_setup_runstate_info(cpu); + xen_vcpu_setup(cpu); if (other_cpu && _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Nov-24 15:11 UTC
Re: [Xen-devel] [PATCH] x86: extend runstate area updates
On Sat, 2009-11-21 at 00:48 +0000, Jeremy Fitzhardinge wrote:> On 11/21/09 02:57, Ian Campbell wrote: > > On Fri, 2009-11-20 at 18:09 +0000, Ian Campbell wrote: > > > >> > >> I can''t see where the guest runstate pointer is supposed to be either > >> restored or re-setup on resume. I tried adding a setup_runstate_info > >> to xen_timer_resume (to match the call in xen_timer_setup) but that > >> seems like it is already too late -- I still see the warnings trigger. > >> I''m not sure how this is possible since I thought we were in a > >> stop_machine section at this point. > >> > > The xen_sched_clock calls are as a result of the various printks, e.g. > > in xen_vcpu_setup, in order to add the timestamp to the output. > > Therefore we need to ensure we reset the runstate info before any > > printks. > > > > Thanks for investigating this.Oopps... Subject: register runstate on secondary CPUs The commit "xen: re-register runstate area earlier on resume" caused us to never try and setup the runstate area for secondary CPUs. Ensure that we do this... Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 7f96358..ea8b5e6 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -295,6 +295,7 @@ static int __cpuinit xen_cpu_up(unsigned int cpu) (unsigned long)task_stack_page(idle) - KERNEL_STACK_OFFSET + THREAD_SIZE; #endif + xen_setup_runstate_info(cpu); xen_setup_timer(cpu); xen_init_lock_cpu(cpu); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel