Wei Wang
2012-Apr-12 13:11 UTC
[PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
Hi Jan, This is a revised fix from my last post[1]. We found that the performance issue[2] is actually caused by an unnecessary cache flush when fall-through happens. According to acpi spec, c2 has cache consistency, we don''t need cache flush if cx->type != ACPI_STATE_C3. This fix improves performance of some OEM systems significantly. I would suggest to back port it to 4.1, 4.0 and even 3.4. Thanks, We [1] http://lists.xen.org/archives/html/xen-devel/2012-04/msg00395.html [2] http://forums.citrix.com/thread.jspa?threadID=297461&tstart=0&start=0 # HG changeset patch # Parent 6efb9f934bfb4c46af375612fb01e65d15518380 # User Wei Wang <wei.wang2@amd.com> acpi: do not flush cache if cx->type != ACPI_STATE_C3 Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 6efb9f934bfb -r 85775fd04cf4 xen/arch/x86/acpi/cpu_idle.c --- a/xen/arch/x86/acpi/cpu_idle.c Thu Apr 05 11:24:26 2012 +0100 +++ b/xen/arch/x86/acpi/cpu_idle.c Thu Apr 12 14:15:09 2012 +0200 @@ -509,7 +509,8 @@ static void acpi_processor_idle(void) else if ( !power->flags.bm_check ) { /* SMP with no shared cache... Invalidate cache */ - ACPI_FLUSH_CPU_CACHE(); + if ( cx->type == ACPI_STATE_C3 ) + ACPI_FLUSH_CPU_CACHE(); } /* Invoke C3 */
Steven
2012-Apr-12 17:05 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
Hi, Wei, I have read you slides in xen summit 2007 about NPT in AMD, "AMD Barcelona and Nested Paging Support in Xen". However, I have some question regarding the nested page walk in your slide 8. In your slide 8, the first step is to get gPA from get_PML4(gCR3,gVA). I assume that it use the [47:39] bit of gVA. However, in another AMD white paper, "AMD-VTM Nested Paging". http://developer.amd.com/assets/NPT-WP-1%201-final-TM.pdf In its figure 4, I saw that the first step is to translate gCR3 using nested page walk and then combine with the gVA[47:39] to read the table entry. These two documents look having different order of reading the guest page table. In the slides, it first get_PML4. But in the white paper, it first does nested page walk. I am wondering which one is true. Thanks. - ha
Zhang, Yang Z
2012-Apr-13 02:14 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
This should not be enough. No need to check bm when going to C2. How about the following patch: diff -r d35a117afa2f xen/arch/x86/acpi/cpu_idle.c --- a/xen/arch/x86/acpi/cpu_idle.c Tue Mar 27 15:25:07 2012 +0200 +++ b/xen/arch/x86/acpi/cpu_idle.c Fri Apr 13 10:10:31 2012 +0800 @@ -493,7 +493,8 @@ static void acpi_processor_idle(void) * not set. In that case we cannot do much, we enter C3 * without doing anything. */ - if ( power->flags.bm_check && power->flags.bm_control ) + if ( (cx->type == ACPI_STATE_C3) && power->flags.bm_check + && power->flags.bm_control ) { spin_lock(&c3_cpu_status.lock); if ( ++c3_cpu_status.count == num_online_cpus() ) @@ -509,13 +510,14 @@ static void acpi_processor_idle(void) else if ( !power->flags.bm_check ) { /* SMP with no shared cache... Invalidate cache */ - ACPI_FLUSH_CPU_CACHE(); + if ( cx->type == ACPI_STATE_C3 ) + ACPI_FLUSH_CPU_CACHE(); } - /* Invoke C3 */ acpi_idle_do_entry(cx); - if ( power->flags.bm_check && power->flags.bm_control ) + if ( (cx->type == ACPI_STATE_C3) && power->flags.bm_check + && power->flags.bm_control ) { /* Enable bus master arbitration */ spin_lock(&c3_cpu_status.lock); best regards yang> -----Original Message----- > From: Wei Wang [mailto:wei.wang2@amd.com] > Sent: Thursday, April 12, 2012 9:11 PM > To: Jan Beulich; Keir Fraser > Cc: Zhang, Yang Z; xen-devel@lists.xensource.com; Andre Przywara; Ostrovsky, > Boris; Wei Huang > Subject: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3 > > Hi Jan, > This is a revised fix from my last post[1]. We found that the performance > issue[2] is actually caused by an unnecessary cache flush when fall-through > happens. According to acpi spec, c2 has cache consistency, we don''t need > cache flush if cx->type != ACPI_STATE_C3. > This fix improves performance of some OEM systems significantly. I would > suggest to back port it to 4.1, 4.0 and even 3.4. > > Thanks, > We > > [1] > http://lists.xen.org/archives/html/xen-devel/2012-04/msg00395.html > [2] > http://forums.citrix.com/thread.jspa?threadID=297461&tstart=0&start=0 > > # HG changeset patch > # Parent 6efb9f934bfb4c46af375612fb01e65d15518380 > # User Wei Wang <wei.wang2@amd.com> > acpi: do not flush cache if cx->type != ACPI_STATE_C3 > > Signed-off-by: Wei Wang <wei.wang2@amd.com> > > diff -r 6efb9f934bfb -r 85775fd04cf4 xen/arch/x86/acpi/cpu_idle.c > --- a/xen/arch/x86/acpi/cpu_idle.c Thu Apr 05 11:24:26 2012 +0100 > +++ b/xen/arch/x86/acpi/cpu_idle.c Thu Apr 12 14:15:09 2012 +0200 > @@ -509,7 +509,8 @@ static void acpi_processor_idle(void) > else if ( !power->flags.bm_check ) > { > /* SMP with no shared cache... Invalidate cache */ > - ACPI_FLUSH_CPU_CACHE(); > + if ( cx->type == ACPI_STATE_C3 ) > + ACPI_FLUSH_CPU_CACHE(); > } > > /* Invoke C3 */
Jan Beulich
2012-Apr-13 07:23 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
>>> On 12.04.12 at 19:05, Steven <wangwangkang@gmail.com> wrote: > Hi, Wei, > I have read you slides in xen summit 2007 about NPT in AMD, "AMD > Barcelona and Nested Paging Support in Xen". > However, I have some question regarding the nested page walk in your slide > 8.Please do not hijack threads for unrelated topics; create new ones instead. Jan> In your slide 8, the first step is to get gPA from get_PML4(gCR3,gVA). > I assume that it use the [47:39] bit of gVA. > However, in another AMD white paper, "AMD-VTM Nested Paging". > http://developer.amd.com/assets/NPT-WP-1%201-final-TM.pdf > In its figure 4, I saw that the first step is to translate gCR3 using > nested page walk and then combine with the gVA[47:39] to read the > table entry. > > These two documents look having different order of reading the guest > page table. In the slides, it first get_PML4. But in the white paper, > it first does nested page walk. > I am wondering which one is true. Thanks. > > - ha > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Jan Beulich
2012-Apr-13 08:30 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
>>> On 13.04.12 at 04:14, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote: > This should not be enough. No need to check bm when going to C2. > How about the following patch:That looks right, but I''d prefer to simplify it a little: --- a/xen/arch/x86/acpi/cpu_idle.c +++ b/xen/arch/x86/acpi/cpu_idle.c @@ -493,7 +493,9 @@ static void acpi_processor_idle(void) * not set. In that case we cannot do much, we enter C3 * without doing anything. */ - if ( power->flags.bm_check && power->flags.bm_control ) + if ( cx->type != ACPI_STATE_C3 ) + /* nothing to be done here */; + else if ( power->flags.bm_check && power->flags.bm_control ) { spin_lock(&c3_cpu_status.lock); if ( ++c3_cpu_status.count == num_online_cpus() ) @@ -515,7 +517,8 @@ static void acpi_processor_idle(void) /* Invoke C3 */ acpi_idle_do_entry(cx); - if ( power->flags.bm_check && power->flags.bm_control ) + if ( (cx->type == ACPI_STATE_C3) && + power->flags.bm_check && power->flags.bm_control ) { /* Enable bus master arbitration */ spin_lock(&c3_cpu_status.lock); Also, Yang, you didn''t provide a S-o-b - are you okay with me adding it? If you''re both okay with above patch, I''ll see that I get it committed. Jan
Wei Wang
2012-Apr-13 09:06 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
On 04/13/2012 10:30 AM, Jan Beulich wrote:>>>> On 13.04.12 at 04:14, "Zhang, Yang Z"<yang.z.zhang@intel.com> wrote: >> This should not be enough. No need to check bm when going to C2. >> How about the following patch: > > That looks right, but I''d prefer to simplify it a little: > > --- a/xen/arch/x86/acpi/cpu_idle.c > +++ b/xen/arch/x86/acpi/cpu_idle.c > @@ -493,7 +493,9 @@ static void acpi_processor_idle(void) > * not set. In that case we cannot do much, we enter C3 > * without doing anything. > */ > - if ( power->flags.bm_check&& power->flags.bm_control ) > + if ( cx->type != ACPI_STATE_C3 ) > + /* nothing to be done here */; > + else if ( power->flags.bm_check&& power->flags.bm_control ) > { > spin_lock(&c3_cpu_status.lock); > if ( ++c3_cpu_status.count == num_online_cpus() ) > @@ -515,7 +517,8 @@ static void acpi_processor_idle(void) > /* Invoke C3 */ > acpi_idle_do_entry(cx); > > - if ( power->flags.bm_check&& power->flags.bm_control ) > + if ( (cx->type == ACPI_STATE_C3)&& > + power->flags.bm_check&& power->flags.bm_control ) > { > /* Enable bus master arbitration */ > spin_lock(&c3_cpu_status.lock); > > Also, Yang, you didn''t provide a S-o-b - are you okay with me adding > it? > > If you''re both okay with above patch, I''ll see that I get it committed. > > Jan > >This looks good to me. Thanks Wei
Zhang, Yang Z
2012-Apr-13 13:20 UTC
Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3
Me too. best regards yang> -----Original Message----- > From: Wei Wang [mailto:wei.wang2@amd.com] > Sent: Friday, April 13, 2012 5:06 PM > To: Jan Beulich > Cc: Zhang, Yang Z; AndrePrzywara; Boris Ostrovsky; Wei Huang; > xen-devel@lists.xensource.com; Keir Fraser > Subject: Re: [PATCH] acpi: do not flush cache if cx->type != ACPI_STATE_C3 > > On 04/13/2012 10:30 AM, Jan Beulich wrote: > >>>> On 13.04.12 at 04:14, "Zhang, Yang Z"<yang.z.zhang@intel.com> > wrote: > >> This should not be enough. No need to check bm when going to C2. > >> How about the following patch: > > > > That looks right, but I''d prefer to simplify it a little: > > > > --- a/xen/arch/x86/acpi/cpu_idle.c > > +++ b/xen/arch/x86/acpi/cpu_idle.c > > @@ -493,7 +493,9 @@ static void acpi_processor_idle(void) > > * not set. In that case we cannot do much, we enter C3 > > * without doing anything. > > */ > > - if ( power->flags.bm_check&& power->flags.bm_control ) > > + if ( cx->type != ACPI_STATE_C3 ) > > + /* nothing to be done here */; > > + else if ( power->flags.bm_check&& power->flags.bm_control ) > > { > > spin_lock(&c3_cpu_status.lock); > > if ( ++c3_cpu_status.count == num_online_cpus() ) @@ > > -515,7 +517,8 @@ static void acpi_processor_idle(void) > > /* Invoke C3 */ > > acpi_idle_do_entry(cx); > > > > - if ( power->flags.bm_check&& power->flags.bm_control ) > > + if ( (cx->type == ACPI_STATE_C3)&& > > + power->flags.bm_check&& power->flags.bm_control ) > > { > > /* Enable bus master arbitration */ > > spin_lock(&c3_cpu_status.lock); > > > > Also, Yang, you didn''t provide a S-o-b - are you okay with me adding > > it? > > > > If you''re both okay with above patch, I''ll see that I get it committed. > > > > Jan > > > > > > This looks good to me. Thanks > Wei