Future AMD SVM supports a new feature called flush by ASID. The idea is to allow the CPU to flush the TLB entries associated with the ASID assigned to a guest VM, so the hypervisor doesn't have to assign a new ASID in order to flush a guest's VCPU. Please review it.

Thanks,
Wei

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Wang <wei.wang2@amd.com>
--
Advanced Micro Devices GmbH
Sitz: Dornach, Gemeinde Aschheim,
Landkreis München Registergericht München,
HRB Nr. 43632
WEEE-Reg-Nr: DE 12919551
Geschäftsführer:
Alberto Bozzo, Andrew Bowd
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
At 17:55 +0000 on 11 Jan (1294768552), Wei Wang2 wrote:
> Future AMD SVM supports a new feature called flush by ASID. The idea is to
> allow CPU to flush TLBs associated with the ASID assigned to guest VM. So
> hypervisor doesn't have to reassign a new ASID in order to flush guest's
> VCPU. Please review it.

What advantage does the new system have? Intuitively it seems like it might be a tiny bit fairer and a tiny bit faster (by explicitly flushing instead of relying on LRU) but I'm not convinced that it will be visible in macro-benchmarks. Have you measured it?

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
Hi Tim,

Flush by ASID provides more flexible control of TLB flushing. The main advantage is that it allows the hypervisor to flush the tagged TLB selectively. Using this feature, the hypervisor can flush the TLB entries associated with a guest VM directly instead of allocating a new ASID. Whole-TLB flushes will also become less frequent, because fewer ASIDs are allocated and the ASID space runs out less often.

So far, we have not measured a drastic performance improvement in testing with kernbench and X11perf. In fact, we found that reducing the TLB flushes that accompany vmrun does not improve performance very much. Last week we sent out a patch to optimize hvm_flush_guest_tlbs, which eliminates over 90% of the TLB flushes for vmrun, and even with it we cannot see a significant speedup. Maybe the latency of vmrun is so large that the overhead of the TLB flush is negligible?

Thanks,
Wei

On Wednesday 12 January 2011 11:17:00 Tim Deegan wrote:
> What advantage does the new system have? Intuitively it seems like it
> might be a tiny bit fairer and a tiny bit faster (by explicitly flushing
> instead of relying on LRU) but I'm not convinced that it will be visible
> in macro-benchmarks. Have you measured it?
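The point about whole-TLB flushes can be put into rough numbers with a small counting model. This is only a sketch under stated assumptions: one CPU, one guest VCPU that is flushed repeatedly, usable ASIDs numbered 1 to `max_asid`, and a full flush each time the allocator wraps. The function name `toy_count_full_flushes` is hypothetical and nothing here is taken from the Xen sources.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Count whole-TLB flushes caused by `requests` guest flush requests.
 * Assumption: without flush by ASID, every request consumes one new
 * ASID and a wrap of the ASID space costs one whole-TLB flush.
 */
static unsigned toy_count_full_flushes(unsigned requests,
                                       unsigned max_asid,
                                       bool flush_by_asid)
{
    unsigned next_asid = 2;  /* the guest starts out on ASID 1 */
    unsigned full_flushes = 0;

    assert(max_asid >= 1);

    for (unsigned i = 0; i < requests; i++) {
        if (flush_by_asid)
            continue;        /* ASID is kept; only its entries are flushed */

        if (next_asid > max_asid) {
            next_asid = 1;   /* ASID space exhausted: recycle ... */
            full_flushes++;  /* ... and flush everything */
        }
        next_asid++;         /* hand the guest a fresh ASID */
    }
    return full_flushes;
}
```

With 63 usable ASIDs, 1000 flush requests wrap the allocator 15 times in this model, against zero wraps when the ASID is retained. Whether those avoided flushes show up in a benchmark is exactly what the rest of the thread questions.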
It begs the question whether it's worth complicating code for an optimisation with no measurable benefit, doesn't it?

 -- Keir

On 12/01/2011 12:41, "Wei Wang2" <wei.wang2@amd.com> wrote:
> So far, we did not measure drastic performance improvement in testing with
> kernbench and X11perf. Actually, we found out that, reducing tlb flushes
> accompanying with vmrun does not improve performance very much.
Keir,

Sure, that is a good question :).

Actually, finding a benchmark that scales well with ASIDs is not easy. A benchmark like kernbench, which has a large working set, will fill all the TLB entries under its own ASID. In that case, even disabling ASIDs does no harm. We only tested a single guest with multiple VCPUs. Maybe using multiple guests or other benchmarks will show a better result?

Thanks,
Wei

On Wednesday 12 January 2011 13:48:49 Keir Fraser wrote:
> It begs the question whether it's worth complicating code for an
> optimisation with no measurable benefit, doesn't it?
Our gut feeling has always been that the major benefit is having two ASIDs, allowing one for the host and one for the current guest and thus avoiding a TLB flush on every VM entry/exit. Unless your TLB is very large, or guest VCPUs run only for very short periods, it's likely that a heavy guest workload displaces all other ASIDs (guest VCPUs) from the TLB anyway.

We're interested in benchmark numbers that can disprove the gut feeling, of course!

 -- Keir

On 12/01/2011 13:23, "Wei Wang2" <wei.wang2@amd.com> wrote:
> Actually finding a benchmark that scales with asid well is not quite easy.
> Benckmark like Kernbench which has large working set will occupy all tls
> entries by its own asid. In this case, even disabling asid is not harmful.
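The displacement argument can be illustrated with a toy ASID-tagged TLB. This is a sketch, not a model of any real CPU: it assumes a tiny, fully associative TLB with strict LRU replacement, and every name in it (`toy_tlb`, `toy_tlb_touch`, `toy_tlb_count`, `TOY_TLB_ENTRIES`) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TLB_ENTRIES 8  /* deliberately tiny, for illustration */

struct toy_tlb_entry {
    unsigned      asid;
    unsigned long vpn;       /* virtual page number */
    unsigned long last_use;  /* timestamp used for LRU */
    int           valid;
};

struct toy_tlb {
    struct toy_tlb_entry e[TOY_TLB_ENTRIES];
    unsigned long clock;
};

/* Simulate one translation by `asid` for page `vpn`. */
static void toy_tlb_touch(struct toy_tlb *t, unsigned asid, unsigned long vpn)
{
    size_t victim = 0;

    t->clock++;

    /* Hit: entries only match when both ASID tag and page agree. */
    for (size_t i = 0; i < TOY_TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].asid == asid && t->e[i].vpn == vpn) {
            t->e[i].last_use = t->clock;
            return;
        }
    }

    /* Miss: take a free slot if there is one, else evict the LRU
     * entry, regardless of which ASID owns it. */
    for (size_t i = 0; i < TOY_TLB_ENTRIES; i++) {
        if (!t->e[i].valid) {
            victim = i;
            break;
        }
        if (t->e[i].last_use < t->e[victim].last_use)
            victim = i;
    }
    t->e[victim] = (struct toy_tlb_entry){
        .asid = asid, .vpn = vpn, .last_use = t->clock, .valid = 1,
    };
}

/* Number of valid entries currently tagged with `asid`. */
static unsigned toy_tlb_count(const struct toy_tlb *t, unsigned asid)
{
    unsigned n = 0;

    for (size_t i = 0; i < TOY_TLB_ENTRIES; i++)
        if (t->e[i].valid && t->e[i].asid == asid)
            n++;
    return n;
}
```

If ASID 1 touches four pages and ASID 2 then touches sixteen distinct pages, `toy_tlb_count` reports no surviving entries for ASID 1: the second guest's working set has pushed them all out without any explicit flush. That is the situation in which retaining ASIDs across flushes cannot help much.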
This feature isn't ground-breaking, so we don't expect a significant performance improvement on many benchmarks, but it ought to have a niche for certain workloads. We will collect more performance results for the next submission. The bottom line is not to slow down the existing ASID implementation.

Thanks,
-WeiH

On 01/12/2011 07:38 AM, Keir Fraser wrote:
> We're interested in benchmark numbers that can disprove the gut feeling, of
> course!