Tim Deegan
2010-Nov-17 14:47 UTC
Re: [Xen-devel] [PATCH] x86/hvm/pmtimer: improving scalability of virtual time update on Xen 4.0.0
At 21:39 +0000 on 17 Nov (1290029945), Song Xiang wrote:
> diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
> index 48fe26a..587944f 100644
> --- a/xen/arch/x86/hvm/pmtimer.c
> +++ b/xen/arch/x86/hvm/pmtimer.c
> @@ -90,7 +90,7 @@ static void pmt_update_time(PMTState *s)
>
>      /* Update the timer */
>      curr_gtime = hvm_get_guest_time(s->vcpu);
> -    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
> +    *(volatile uint32_t *)&s->pm.tmr_val = s->pm.tmr_val + (((curr_gtime - s->last_gtime) * s->scale) >> 32);
>      s->pm.tmr_val &= TMR_VAL_MASK;
>      s->last_gtime = curr_gtime;

That doesn't make it an atomic update!  The variable is still written
to twice. :)  You need to calculate the new tmr_val in a scratch
variable, and then write it back once at the end of the function.
(And no 'volatile' will be necessary.)

Cheers,

Tim.

> @@ -206,10 +206,19 @@ static int handle_pmt_io(
>
>      if ( dir == IOREQ_READ )
>      {
> -        spin_lock(&s->lock);
> -        pmt_update_time(s);
> -        *val = s->pm.tmr_val;
> -        spin_unlock(&s->lock);
> +        /*
> +         * If we acquire the PMTState lock, update the time; otherwise
> +         * another VCPU is updating it, so it should be up to date.
> +         */
> +        if (spin_trylock(&s->lock)) {
> +            pmt_update_time(s);
> +            *val = s->pm.tmr_val;
> +            spin_unlock(&s->lock);
> +        }
> +        else {
> +            spin_barrier(&s->lock);
> +            *val = s->pm.tmr_val;
> +        }
>          return X86EMUL_OKAY;
>      }

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
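For reference, the single-write shape Tim describes might look like the
sketch below. It is only a sketch: the field and helper names are taken
from the quoted code, and the status-bit handling in the middle of the
function is elided.

    static void pmt_update_time(PMTState *s)
    {
        uint64_t curr_gtime;
        uint32_t tmr_val;

        /* Compute the new counter value in a scratch variable. */
        curr_gtime = hvm_get_guest_time(s->vcpu);
        tmr_val = s->pm.tmr_val;
        tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
        tmr_val &= TMR_VAL_MASK;
        s->last_gtime = curr_gtime;

        /* ... MSB-change / status-bit logic elided ... */

        /* Exactly one store at the end: a concurrent lock-free reader
         * sees either the old or the new value, never an intermediate
         * unmasked one. */
        s->pm.tmr_val = tmr_val;
    }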
Keir Fraser
2010-Nov-17 15:43 UTC
Re: [Xen-devel] [PATCH] x86/hvm/pmtimer: improving scalability of virtual time update on Xen 4.0.0
On 17/11/2010 14:47, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:

>>      /* Update the timer */
>>      curr_gtime = hvm_get_guest_time(s->vcpu);
>> -    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
>> +    *(volatile uint32_t *)&s->pm.tmr_val = s->pm.tmr_val + (((curr_gtime -
>> s->last_gtime) * s->scale) >> 32);
>>      s->pm.tmr_val &= TMR_VAL_MASK;
>>      s->last_gtime = curr_gtime;
>
> That doesn't make it an atomic update!  The variable is still written
> to twice. :)  You need to calculate the new tmr_val in a scratch
> variable, and then write it back once at the end of the function.
> (And no 'volatile' will be necessary.)

Can we be sure the compiler will emit only one write to the shared field
unless we use volatile? It seems safer and more explicit to keep it, to
me (like atomic_set()). Otherwise the compiler could elide the local
variable and continue to do the whole composite operation on the shared
field.

Of course I agree about calculating the value to store in a local
variable and then writing the result in one operation.

 -- Keir
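The explicit-volatile variant Keir is arguing for would amount to
something like this sketch (same scratch-variable computation as above;
the cast is what pins down a single store to the shared field):

    uint32_t tmr_val = s->pm.tmr_val;

    tmr_val = (tmr_val + (((curr_gtime - s->last_gtime) * s->scale) >> 32))
              & TMR_VAL_MASK;

    /* The volatile access tells the compiler it may neither split nor
     * re-issue this store, much as atomic_set() does. */
    *(volatile uint32_t *)&s->pm.tmr_val = tmr_val;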
Song Xiang
2010-Nov-17 21:39 UTC
[Xen-devel] [PATCH] x86/hvm/pmtimer: improving scalability of virtual time update on Xen 4.0.0
From: Xiang Song <xiangsong@fudan.edu.cn>

This provides a patch to arch/x86/hvm/pmtimer.c for Xen 4.0.0 to
mitigate the heavy contention on handle_pmt_io when running an HVM guest
configured with many cores (e.g., 32 cores).

As the virtual time of a domain must be fresh, someone must update it
periodically. But it is not necessary for a VCPU to update the virtual
time when another one is already updating it; the update can be skipped
in that case. So every time a VCPU invokes handle_pmt_io to update the
current domain's virtual time, it first tries to acquire the pmtimer
lock. If it succeeds, it updates the virtual time. Otherwise, it skips
the update, waits for the pmtimer lock holder to finish updating the
virtual time, and returns the updated time.

Signed-off-by: Xiang Song <xiangsong@fudan.edu.cn>
---
 xen/arch/x86/hvm/pmtimer.c |   19 ++++++++++++++-----
 1 files changed, 14 insertions(+), 5 deletions(-)

We evaluate this patch using a guest HVM configured with 32 cores on a
48-core HP machine (8 * 6 AMD 2.4GHz Opteron chips). We use Linux
(Debian GNU/Linux 5.0, kernel version 2.6.35-rc5) as the guest HVM OS.
Each VCPU of the guest HVM is pinned to a unique physical core, and the
guest is allocated 16 GBytes of memory with HAP enabled. We use 7
application benchmarks to evaluate the patch, and run all of them in
memory using a ramfs.

The percentage of CPU time spent on spin_lock inside handle_pmt_io under
the 32-core configuration on the original Xen is as follows (we use
Xenoprof to collect the execution info):

                                    original Xen
  gmake                                 32.56%
  phoenix -- histogram                  55.04%
  phoenix -- wordcount                  78.94%
  phoenix -- linear_regression          15.89%
  specjvm -- compress                   21.79%
  specjvm -- crypto AES                 21.35%
  specjvm -- XML validation             42.73%

All applications clearly suffer heavy contention in handle_pmt_io. After
applying the patch, the percentage of CPU time spent on _spin_barrier
inside handle_pmt_io under the 32-core configuration is much lower than
on the original Xen:

                                     patched Xen
  gmake                                  2.20%
  phoenix -- histogram                  12.25%
  phoenix -- wordcount                  17.04%
  phoenix -- linear_regression           4.45%
  specjvm -- compress                    2.45%
  specjvm -- crypto AES                  2.46%
  specjvm -- XML validation              4.67%

After applying the patch, we get speedups for all 7 applications. The
performance of the 7 applications under the 1-core and 32-core
configurations on original and patched Xen, and the resulting speedups,
are as follows:

                                       |         1-core               |         32-core
                                       | original  patched  speedup   | original  patched  speedup
  gmake (sec)                          |   91.2      89.2    2.24%    |  1824.3   1778      2.60%
  phoenix -- histogram (sec)           |    9.5       9.15   3.83%    |    32.79    28.67  14.37%
  phoenix -- wordcount (sec)           |  206.23    201.07   2.57%    |   279.22   249.53  11.90%
  phoenix -- linear_regression (sec)   |    2.57      2.5    2.80%    |     6.06     5.89   2.89%
  specjvm -- compress (ops/min)        |   27.43     28.6    4.27%    |   774.37   906.04  17.00%
  specjvm -- crypto AES (ops/min)      |    8.92      9.32   4.48%    |   209.55   245.52  17.16%
  specjvm -- XML validation (ops/min)  |   62.55     65.65   4.96%    |   624.46   780.89  25.05%

diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 48fe26a..587944f 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -90,7 +90,7 @@ static void pmt_update_time(PMTState *s)
 
     /* Update the timer */
     curr_gtime = hvm_get_guest_time(s->vcpu);
-    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
+    *(volatile uint32_t *)&s->pm.tmr_val = s->pm.tmr_val + (((curr_gtime - s->last_gtime) * s->scale) >> 32);
     s->pm.tmr_val &= TMR_VAL_MASK;
     s->last_gtime = curr_gtime;
 
@@ -206,10 +206,19 @@ static int handle_pmt_io(
 
     if ( dir == IOREQ_READ )
     {
-        spin_lock(&s->lock);
-        pmt_update_time(s);
-        *val = s->pm.tmr_val;
-        spin_unlock(&s->lock);
+        /*
+         * If we acquire the PMTState lock, update the time; otherwise
+         * another VCPU is updating it, so it should be up to date.
+         */
+        if (spin_trylock(&s->lock)) {
+            pmt_update_time(s);
+            *val = s->pm.tmr_val;
+            spin_unlock(&s->lock);
+        }
+        else {
+            spin_barrier(&s->lock);
+            *val = s->pm.tmr_val;
+        }
         return X86EMUL_OKAY;
     }
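This is the version Tim's reply above picks apart. The subtlety is that
the else branch reads s->pm.tmr_val without holding the lock, so the
read can overlap an update started by yet another VCPU after
spin_barrier() returns. With two stores to tmr_val in pmt_update_time(),
such a reader can observe the intermediate, unmasked value, which is why
the review insists on computing in a scratch variable and writing once.
A hypothetical interleaving (not from the thread) illustrating the race:

    VCPU B (lock was busy)              VCPU C (next updater)
    ----------------------              ---------------------
    spin_barrier(&s->lock);
      (returns: previous updater
       has unlocked)
                                        spin_trylock() succeeds;
                                        pmt_update_time() begins:
                                        s->pm.tmr_val = old + delta;
    *val = s->pm.tmr_val;
      (can see the intermediate,
       unmasked value)
                                        s->pm.tmr_val &= TMR_VAL_MASK;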
Song Xiang
2010-Nov-17 23:13 UTC
Re: [Xen-devel] [PATCH] x86/hvm/pmtimer: improving scalability of virtual time update on Xen 4.0.0
It can be as follows:

@@ -90,8 +90,8 @@ static void pmt_update_time(PMTState *s)
 
     /* Update the timer */
     curr_gtime = hvm_get_guest_time(s->vcpu);
-    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
-    s->pm.tmr_val &= TMR_VAL_MASK;
+    *(volatile uint32_t *)&s->pm.tmr_val = (s->pm.tmr_val +
+        (((curr_gtime - s->last_gtime) * s->scale) >> 32)) & TMR_VAL_MASK;
     s->last_gtime = curr_gtime;
 
     /* If the counter's MSB has changed, set the status bit */

But it seems gcc (gcc 4.4.4) has omitted the statement
"s->pm.tmr_val &= TMR_VAL_MASK;" in the object file.

On 17 Nov 2010, at 2:47 PM, Tim Deegan wrote:

> At 21:39 +0000 on 17 Nov (1290029945), Song Xiang wrote:
>> diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
>> index 48fe26a..587944f 100644
>> --- a/xen/arch/x86/hvm/pmtimer.c
>> +++ b/xen/arch/x86/hvm/pmtimer.c
>> @@ -90,7 +90,7 @@ static void pmt_update_time(PMTState *s)
>>
>>      /* Update the timer */
>>      curr_gtime = hvm_get_guest_time(s->vcpu);
>> -    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
>> +    *(volatile uint32_t *)&s->pm.tmr_val = s->pm.tmr_val +
>> (((curr_gtime - s->last_gtime) * s->scale) >> 32);
>>      s->pm.tmr_val &= TMR_VAL_MASK;
>>      s->last_gtime = curr_gtime;
>
> That doesn't make it an atomic update!  The variable is still written
> to twice. :)  You need to calculate the new tmr_val in a scratch
> variable, and then write it back once at the end of the function.
> (And no 'volatile' will be necessary.)
>
> Cheers,
>
> Tim.
>
>> @@ -206,10 +206,19 @@ static int handle_pmt_io(
>>
>>      if ( dir == IOREQ_READ )
>>      {
>> -        spin_lock(&s->lock);
>> -        pmt_update_time(s);
>> -        *val = s->pm.tmr_val;
>> -        spin_unlock(&s->lock);
>> +        /*
>> +         * If we acquire the PMTState lock, update the time; otherwise
>> +         * another VCPU is updating it, so it should be up to date.
>> +         */
>> +        if (spin_trylock(&s->lock)) {
>> +            pmt_update_time(s);
>> +            *val = s->pm.tmr_val;
>> +            spin_unlock(&s->lock);
>> +        }
>> +        else {
>> +            spin_barrier(&s->lock);
>> +            *val = s->pm.tmr_val;
>> +        }
>>          return X86EMUL_OKAY;
>>      }
>
> --
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
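The disappearing store is plausible compiler behaviour: the masking
assignment is a plain (non-volatile) access, so gcc is free to merge it
with the neighbouring store, and if TMR_VAL_MASK covers all 32 bits of
the uint32_t field the masking is a no-op the compiler may drop
outright. A standalone sketch of the merging effect (hypothetical test
code, not Xen code; the mask value is a placeholder):

    #include <stdint.h>

    #define TMR_VAL_MASK 0x00ffffffu   /* placeholder 24-bit mask */

    struct pm { uint32_t tmr_val; };

    void update(struct pm *p, uint32_t delta)
    {
        /* With no volatile qualifier, gcc -O2 typically emits a single
         * store of ((tmr_val + delta) & TMR_VAL_MASK): the two
         * source-level assignments are merged into one write. */
        p->tmr_val += delta;
        p->tmr_val &= TMR_VAL_MASK;
    }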
Song Xiang
2010-Nov-18 00:14 UTC
[Xen-devel] [PATCH] x86/hvm/pmtimer: improving scalability of virtual time update on Xen 4.0.0
From: Xiang Song <xiangsong@fudan.edu.cn>

This provides a patch to arch/x86/hvm/pmtimer.c for Xen 4.0.0 to
mitigate the heavy contention on handle_pmt_io when running an HVM guest
configured with many cores (e.g., 32 cores).

As the virtual time of a domain must be fresh, someone must update it
periodically. But it is not necessary for a VCPU to update the virtual
time when another one is already updating it; the update can be skipped
in that case. So every time a VCPU invokes handle_pmt_io to update the
current domain's virtual time, it first tries to acquire the pmtimer
lock. If it succeeds, it updates the virtual time. Otherwise, it skips
the update, waits for the pmtimer lock holder to finish updating the
virtual time, and returns the updated time.

Signed-off-by: Xiang Song <xiangsong@fudan.edu.cn>
---
 xen/arch/x86/hvm/pmtimer.c |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

We evaluate this patch using a guest HVM configured with 32 cores on a
48-core HP machine (8 * 6 AMD 2.4GHz Opteron chips). We use Linux
(Debian GNU/Linux 5.0, kernel version 2.6.35-rc5) as the guest HVM OS.
Each VCPU of the guest HVM is pinned to a unique physical core, and the
guest is allocated 16 GBytes of memory with HAP enabled. We use 7
application benchmarks to evaluate the patch, and run all of them in
memory using a ramfs.

The percentage of CPU time spent on spin_lock inside handle_pmt_io under
the 32-core configuration on the original Xen is as follows (we use
Xenoprof to collect the execution info):

                                    original Xen
  gmake                                 32.56%
  phoenix -- histogram                  55.04%
  phoenix -- wordcount                  78.94%
  phoenix -- linear_regression          15.89%
  specjvm -- compress                   21.79%
  specjvm -- crypto AES                 21.35%
  specjvm -- XML validation             42.73%

All applications clearly suffer heavy contention in handle_pmt_io. After
applying the patch, the percentage of CPU time spent on _spin_barrier
inside handle_pmt_io under the 32-core configuration is much lower than
on the original Xen:

                                     patched Xen
  gmake                                  2.20%
  phoenix -- histogram                  12.25%
  phoenix -- wordcount                  17.04%
  phoenix -- linear_regression           4.45%
  specjvm -- compress                    2.45%
  specjvm -- crypto AES                  2.46%
  specjvm -- XML validation              4.67%

After applying the patch, we get speedups for all 7 applications. The
performance of the 7 applications under the 1-core and 32-core
configurations on original and patched Xen, and the resulting speedups,
are as follows:

                                       |         1-core               |         32-core
                                       | original  patched  speedup   | original  patched  speedup
  gmake (sec)                          |   91.2      89.2    2.24%    |  1824.3   1778      2.60%
  phoenix -- histogram (sec)           |    9.5       9.15   3.83%    |    32.79    28.67  14.37%
  phoenix -- wordcount (sec)           |  206.23    201.07   2.57%    |   279.22   249.53  11.90%
  phoenix -- linear_regression (sec)   |    2.57      2.5    2.80%    |     6.06     5.89   2.89%
  specjvm -- compress (ops/min)        |   27.43     28.6    4.27%    |   774.37   906.04  17.00%
  specjvm -- crypto AES (ops/min)      |    8.92      9.32   4.48%    |   209.55   245.52  17.16%
  specjvm -- XML validation (ops/min)  |   62.55     65.65   4.96%    |   624.46   780.89  25.05%

diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 48fe26a..5bec90c 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -90,8 +90,8 @@ static void pmt_update_time(PMTState *s)
 
     /* Update the timer */
     curr_gtime = hvm_get_guest_time(s->vcpu);
-    s->pm.tmr_val += ((curr_gtime - s->last_gtime) * s->scale) >> 32;
-    s->pm.tmr_val &= TMR_VAL_MASK;
+    *(volatile uint32_t *)&s->pm.tmr_val = (s->pm.tmr_val +
+        (((curr_gtime - s->last_gtime) * s->scale) >> 32)) & TMR_VAL_MASK;
     s->last_gtime = curr_gtime;
 
     /* If the counter's MSB has changed, set the status bit */
@@ -206,10 +206,19 @@ static int handle_pmt_io(
 
     if ( dir == IOREQ_READ )
     {
-        spin_lock(&s->lock);
-        pmt_update_time(s);
-        *val = s->pm.tmr_val;
-        spin_unlock(&s->lock);
+        /*
+         * If we acquire the PMTState lock, update the time; otherwise
+         * another VCPU is updating it, so it should be up to date.
+         */
+        if (spin_trylock(&s->lock)) {
+            pmt_update_time(s);
+            *val = s->pm.tmr_val;
+            spin_unlock(&s->lock);
+        }
+        else {
+            spin_barrier(&s->lock);
+            *val = s->pm.tmr_val;
+        }
         return X86EMUL_OKAY;
     }