Jiang, Yunhong
2010-Mar-29 12:00 UTC
[Xen-devel] [PATCH] When flushing the TLB, we need to consider the cpu_online_map
When flushing a TLB mask, we need to consider the cpu_online_map; the same applies to the EPT flush. We noticed that the system sometimes hangs during cpu online/offline stress tests. The cause is that flush_tlb_mask(), called from __get_page_type(), loops forever. This is triggered by a small window during cpu offline: the cpu_online_map is updated and interrupts are disabled in take_cpu_down() for the to-be-offlined CPU, but __sync_lazy_execstate() is only called later, from idle_task_exit() in the idle_loop() of that CPU. By then stop_machine_run() has already finished, so __get_page_type() may run on another CPU before __sync_lazy_execstate() does.

BTW, I noticed that cpu_clear(cpu, cpu_online_map) is called twice in __cpu_disable(); I will ask the owner which one should be removed.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>

diff -r f3db0ae08304 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Mon Mar 29 17:49:51 2010 +0800
@@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
      * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
      */
     d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
+    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
+             d->arch.hvm_domain.vmx.ept_synced,
+             cpu_online_map);
     on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
                      __ept_sync_domain, d, 1);
 }

diff -r f3db0ae08304 xen/arch/x86/smp.c
--- a/xen/arch/x86/smp.c	Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/smp.c	Mon Mar 29 17:47:25 2010 +0800
@@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
 {
     spin_lock(&flush_lock);
     cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
+    cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);
     flush_va = va;
     flush_flags = flags;
     send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jan Beulich
2010-Mar-29 12:55 UTC
Re: [Xen-devel] [PATCH] When flushing the TLB, we need to consider the cpu_online_map
>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 29.03.10 14:00 >>>
>When flush tlb mask, we need consider the cpu_online_map. The same happens to
>ept flush also.

While the idea is certainly correct, doing this more efficiently seems
quite desirable to me, especially when NR_CPUS is large:

>--- a/xen/arch/x86/hvm/vmx/vmx.c	Sat Mar 27 16:01:35 2010 +0000
>+++ b/xen/arch/x86/hvm/vmx/vmx.c	Mon Mar 29 17:49:51 2010 +0800
>@@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
>      * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
>      */
>     d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
>+    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>+             d->arch.hvm_domain.vmx.ept_synced,
>+             cpu_online_map);

The added code can be combined with the pre-existing line:

    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
             d->domain_dirty_cpumask, cpu_online_map);

>     on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
>                      __ept_sync_domain, d, 1);
> }
>--- a/xen/arch/x86/smp.c	Sat Mar 27 16:01:35 2010 +0000
>+++ b/xen/arch/x86/smp.c	Mon Mar 29 17:47:25 2010 +0800
>@@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
> {
>     spin_lock(&flush_lock);
>     cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
>+    cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);

Here, first doing the full-mask operation and then clearing the one
extra bit is less overhead:

    cpus_and(flush_cpumask, *mask, cpu_online_map);
    cpu_clear(smp_processor_id(), flush_cpumask);

>     flush_va = va;
>     flush_flags = flags;
>     send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);

Jan
Keir Fraser
2010-Mar-29 14:33 UTC
Re: [Xen-devel] [PATCH] When flushing the TLB, we need to consider the cpu_online_map
Sounds good. Can you please re-spin the patch, Yunhong? I will drop your
original patch for now.

 -- Keir

On 29/03/2010 13:55, "Jan Beulich" <JBeulich@novell.com> wrote:

>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 29.03.10 14:00 >>>
>> When flush tlb mask, we need consider the cpu_online_map. The same happens to
>> ept flush also.
>
> While the idea is certainly correct, doing this more efficiently seems
> quite desirable to me, especially when NR_CPUS is large:
>
>> --- a/xen/arch/x86/hvm/vmx/vmx.c Sat Mar 27 16:01:35 2010 +0000
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c Mon Mar 29 17:49:51 2010 +0800
>> @@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
>>  * unnecessary extra flushes, to avoid allocating a cpumask_t on the
>> stack.
>>  */
>>  d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
>> + cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>> +          d->arch.hvm_domain.vmx.ept_synced,
>> +          cpu_online_map);
>
> The added code can be combined with the pre-existing line:
>
>     cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>              d->domain_dirty_cpumask, cpu_online_map);
>
>>  on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
>>                   __ept_sync_domain, d, 1);
>> }
>> --- a/xen/arch/x86/smp.c Sat Mar 27 16:01:35 2010 +0000
>> +++ b/xen/arch/x86/smp.c Mon Mar 29 17:47:25 2010 +0800
>> @@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
>> {
>>     spin_lock(&flush_lock);
>>     cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
>> +   cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);
>
> Here, first doing the full-mask operation and then clearing the one
> extra bit is less overhead:
>
>     cpus_and(flush_cpumask, *mask, cpu_online_map);
>     cpu_clear(smp_processor_id(), flush_cpumask);
>
>>     flush_va = va;
>>     flush_flags = flags;
>>     send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
>
> Jan
Jiang, Yunhong
2010-Mar-30 01:39 UTC
RE: [Xen-devel] [PATCH] When flushing the TLB, we need to consider the cpu_online_map
>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@novell.com]
>Sent: Monday, March 29, 2010 8:55 PM
>To: Jiang, Yunhong
>Cc: Keir Fraser; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] [PATCH] When flush tlb , we need consider the
>cpu_online_map
>
>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 29.03.10 14:00 >>>
>>When flush tlb mask, we need consider the cpu_online_map. The same happens to
>>ept flush also.
>
>While the idea is certainly correct, doing this more efficiently seems
>quite desirable to me, especially when NR_CPUS is large:
>
>>--- a/xen/arch/x86/hvm/vmx/vmx.c	Sat Mar 27 16:01:35 2010 +0000
>>+++ b/xen/arch/x86/hvm/vmx/vmx.c	Mon Mar 29 17:49:51 2010 +0800
>>@@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
>>     * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
>>     */
>>    d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
>>+   cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>>+            d->arch.hvm_domain.vmx.ept_synced,
>>+            cpu_online_map);
>
>The added code can be combined with the pre-existing line:
>
>    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>             d->domain_dirty_cpumask, cpu_online_map);
>

Oops, really stupid of me; thanks for pointing this out.

>>    on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
>>                     __ept_sync_domain, d, 1);
>> }
>>--- a/xen/arch/x86/smp.c	Sat Mar 27 16:01:35 2010 +0000
>>+++ b/xen/arch/x86/smp.c	Mon Mar 29 17:47:25 2010 +0800
>>@@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
>> {
>>     spin_lock(&flush_lock);
>>     cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
>>+    cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);
>
>Here, first doing the full-mask operation and then clearing the one
>extra bit is less overhead:
>
>    cpus_and(flush_cpumask, *mask, cpu_online_map);
>    cpu_clear(smp_processor_id(), flush_cpumask);

Good point.

--jyh

>
>>     flush_va = va;
>>     flush_flags = flags;
>>     send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
>
>Jan
Jiang, Yunhong
2010-Mar-30 03:29 UTC
RE: [Xen-devel] [PATCH] When flushing the TLB, we need to consider the cpu_online_map
Sure. Attached is the new patch.

BTW, Xiantao will take a look at the two cpu_clear(cpu, cpu_online_map) calls in __cpu_disable(), so I removed that comment from the patch.

Thanks
--jyh

When flushing a TLB mask, we need to consider the cpu_online_map; the same applies to the EPT flush. We noticed that the system sometimes hangs during cpu online/offline stress tests. The cause is that flush_tlb_mask(), called from __get_page_type(), loops forever. This is triggered by a small window during cpu offline: the cpu_online_map is updated and interrupts are disabled in take_cpu_down() for the to-be-offlined CPU, but __sync_lazy_execstate() is only called later, from idle_task_exit() in the idle_loop() of that CPU. By then stop_machine_run() has already finished, so __get_page_type() may run on another CPU before __sync_lazy_execstate() does.

Thanks to Jan for pointing out the issue in my original patch and suggesting the fix; I'm not sure whether he would like to be in the Signed-off-by.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>

diff -r f3db0ae08304 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c	Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/hvm/vmx/vmx.c	Tue Mar 30 11:07:33 2010 +0800
@@ -1234,7 +1234,9 @@ void ept_sync_domain(struct domain *d)
      * the ept_synced mask before on_selected_cpus() reads it, resulting in
      * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
      */
-    d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
+    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
+             d->domain_dirty_cpumask, cpu_online_map);
+
     on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
                      __ept_sync_domain, d, 1);
 }

diff -r f3db0ae08304 xen/arch/x86/smp.c
--- a/xen/arch/x86/smp.c	Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/smp.c	Tue Mar 30 11:06:59 2010 +0800
@@ -228,7 +228,8 @@ void flush_area_mask(const cpumask_t *ma
     if ( !cpus_subset(*mask, *cpumask_of(smp_processor_id())) )
     {
         spin_lock(&flush_lock);
-        cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
+        cpus_and(flush_cpumask, *mask, cpu_online_map);
+        cpu_clear(smp_processor_id(), flush_cpumask);
         flush_va = va;
         flush_flags = flags;
         send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);

>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Monday, March 29, 2010 10:33 PM
>To: Jan Beulich; Jiang, Yunhong
>Cc: xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] [PATCH] When flush tlb , we need consider the
>cpu_online_map
>
>Sounds good. Can you please re-spin the patch, Yunhong? I will drop your
>original patch for now.
>
> -- Keir
>
>On 29/03/2010 13:55, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 29.03.10 14:00 >>>
>>> When flush tlb mask, we need consider the cpu_online_map. The same happens
>to
>>> ept flush also.
>>
>> While the idea is certainly correct, doing this more efficiently seems
>> quite desirable to me, especially when NR_CPUS is large:
>>
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c Sat Mar 27 16:01:35 2010 +0000
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c Mon Mar 29 17:49:51 2010 +0800
>>> @@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
>>>  * unnecessary extra flushes, to avoid allocating a cpumask_t on the
>>> stack.
>>>  */
>>>  d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
>>> + cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>>> +          d->arch.hvm_domain.vmx.ept_synced,
>>> +          cpu_online_map);
>>
>> The added code can be combined with the pre-existing line:
>>
>>     cpus_and(d->arch.hvm_domain.vmx.ept_synced,
>>              d->domain_dirty_cpumask, cpu_online_map);
>>
>>>  on_selected_cpus(&d->arch.hvm_domain.vmx.ept_synced,
>>>                   __ept_sync_domain, d, 1);
>>> }
>>> --- a/xen/arch/x86/smp.c Sat Mar 27 16:01:35 2010 +0000
>>> +++ b/xen/arch/x86/smp.c Mon Mar 29 17:47:25 2010 +0800
>>> @@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
>>> {
>>>     spin_lock(&flush_lock);
>>>     cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
>>> +   cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);
>>
>> Here, first doing the full-mask operation and then clearing the one
>> extra bit is less overhead:
>>
>>     cpus_and(flush_cpumask, *mask, cpu_online_map);
>>     cpu_clear(smp_processor_id(), flush_cpumask);
>>
>>>     flush_va = va;
>>>     flush_flags = flags;
>>>     send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
>>
>> Jan