I see code like the following in arch/x86/time.c, and I am wondering how the IA32_TSC_AUX MSR is saved/restored at domain switch time:

    if ( (d->arch.tsc_mode == TSC_MODE_PVRDTSCP) &&
         boot_cpu_has(X86_FEATURE_RDTSCP) )
        write_rdtscp_aux(d->arch.incarnation);

BTW, include/asm-x86/msr.h has:

    #define write_rdtscp_aux(val) wrmsr(0xc0000103, (val), 0)

We should write this as wrmsr(MSR_TSC_AUX, (val), 0) by adding

    +#define MSR_TSC_AUX 0xc0000103 /* Auxiliary TSC */

to include/asm-x86/msr-index.h.

Thanks,
Jun
---
Intel Open Source Technology Center

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
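For illustration, the cleanup proposed above could look roughly like the following. This is a minimal user-space sketch, not the actual Xen patch: `wrmsr_stub()`, `wrmsr_calls`, `last_lo`, `last_hi` and `update_tsc_aux()` are hypothetical stand-ins so the macros can be exercised without executing a real WRMSR, and the two flag parameters replace the `d->arch.tsc_mode` and `boot_cpu_has()` checks from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Proposed named constant for include/asm-x86/msr-index.h. */
#define MSR_TSC_AUX 0xc0000103 /* Auxiliary TSC */

/* Hypothetical test double: records the write instead of issuing WRMSR. */
static uint32_t last_lo, last_hi;
static unsigned int wrmsr_calls;

static void wrmsr_stub(uint32_t msr, uint32_t lo, uint32_t hi)
{
    assert(msr == MSR_TSC_AUX); /* this sketch only ever touches TSC_AUX */
    last_lo = lo;
    last_hi = hi;
    wrmsr_calls++;
}
#define wrmsr(msr, lo, hi) wrmsr_stub((msr), (lo), (hi))

/* The macro from include/asm-x86/msr.h, with the magic number replaced. */
#define write_rdtscp_aux(val) wrmsr(MSR_TSC_AUX, (val), 0)

/* Simplified form of the guarded write quoted from arch/x86/time.c. */
static void update_tsc_aux(int pvrdtscp_mode, int cpu_has_rdtscp,
                           uint32_t incarnation)
{
    if ( pvrdtscp_mode && cpu_has_rdtscp )
        write_rdtscp_aux(incarnation);
}
```

Calling `update_tsc_aux(1, 1, 7)` records one write of 7 to the low half of the MSR; with either flag clear, nothing is written.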
Hi Jun --

Xen doesn't expose the TSC rdtscp bit, so it assumes that no guests depend on it. So no save/restore of TSC_AUX is necessary. Xen could provide support for the TSC rdtscp bit and allow a guest OS to manage TSC_AUX, but the existing use of TSC_AUX by Linux would fail to provide the desired result across migration, so there's little point. Also, the pvrdtscp algorithm now assumes that Xen itself is responsible for updating TSC_AUX whenever a migration (across physical machines) occurs.

The #define for write_rdtscp_aux is from the Linux source, so I left the code as it was and did not define the constant.

Dan

> -----Original Message-----
> From: Nakajima, Jun [mailto:jun.nakajima@intel.com]
> Sent: Wednesday, December 09, 2009 9:42 AM
> To: xen-devel@lists.xensource.com
> Cc: Dan Magenheimer
> Subject: Saving/Restoring IA32_TSC_AUX MSR
Dan Magenheimer wrote on Wed, 9 Dec 2009 at 08:59:59:

> Hi Jun --

Dan,

> Xen doesn't expose the TSC rdtscp bit so assumes that
> no guests depend on it. So no save/restore of TSC_AUX
> is necessary.

But it's possible that multiple domains use the pvrdtscp algorithm, and the incarnation number is domain-specific. We also have the same issue when adding RDTSCP support for HVM guests.

Jun
___
Intel Open Source Technology Center
Hi Jun --

> But it's possible that multiple domains use the pvrdtscp
> algorithm, and the incarnation number is domain specific.

OK, I see. The code for writing TSC_AUX is in __update_vcpu_system_time(), not in the context switch.

> We also have the issue when adding RDTSCP support for
> HVM guests.

Only if you expose the rdtscp bit via cpuid. This could certainly be done but, as I said, is probably pointless. (The pvrdtscp algorithm uses the instruction whether or not the rdtscp bit is set in cpuid, since Xen emulates it -- for PV domains only now -- if the physical machine doesn't support the instruction.)

Dan

> -----Original Message-----
> From: Nakajima, Jun [mailto:jun.nakajima@intel.com]
> Sent: Wednesday, December 09, 2009 10:08 AM
> To: Dan Magenheimer; xen-devel@lists.xensource.com
> Subject: RE: Saving/Restoring IA32_TSC_AUX MSR
Hi, Dan,

I am now trying to add rdtscp support for Xen HVM guests, and I have some questions about your pvrdtscp patch. See below.

Dan Magenheimer wrote:
> Hi Jun --
>
>> But it's possible that multiple domains use the pvrdtscp
>> algorithm, and the incarnation number is domain specific.
>
> OK, I see. The code for writing TSC_AUX is in
> __update_vcpu_system_time() not in context switch.

Will you change the place where the hypervisor writes the TSC_AUX MSR? With the current pvrdtscp logic, I think this MSR should be written at vcpu context switch. That would also make HVM support much easier, because the MSR would then not be modified by the hypervisor from time to time.

>> We also have the issue when adding RDTSCP support for
>> HVM guests.
>
> Only if you expose the rdtscp bit via cpuid. This could
> certainly be done but, as I said, is probably pointless.
> (The pvrdtscp algorithm uses the instruction whether or
> not the rdtscp bit is set in cpuid, since Xen emulates
> it -- for PV domains only now -- if the physical machine
> doesn't support the instruction.

We are planning to add HVM support for RDTSCP, and the behavior of this instruction will follow the native behavior. This causes a problem: the RDTSCP instruction in an application will behave differently in PV and HVM domains. Do you have any comments on this?

Thanks!
Dongxiao
Dan Magenheimer
2009-Dec-10 15:49 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
Hi Dongxiao --

There are two approaches to adding rdtscp support:

1) Faithful full implementation of the rdtscp instruction
2) Support for the pvrdtscp algorithm

For (1), you would enable the rdtscp bit in cpuid. Then on hardware that supports rdtscp, you would do context switching of TSC_AUX. On hardware that doesn't support rdtscp, you would intercept the illegal instruction trap and emulate the instruction. (TSC_AUX emulation could be handled "lazily"; there is no need to do a context switch for that.)

BUT if you look at how TSC_AUX is used by a native OS**, the OS sets TSC_AUX on each processor to that processor's physical CPU number, so an application can easily determine whether successive rdtscp instructions were executed on the same processor. (This was important on older processors that did not have an invariant TSC.) Unfortunately, on Xen, this mechanism is worthless and misleading because the OS believes it is setting TSC_AUX to a physical CPU number but it is actually setting it to a virtual CPU number, and the physical CPU number may change at any time due to scheduling or migration. So an app using rdtscp will get a wrong answer.

As a result, I do NOT recommend (1) and do recommend that Xen continue to return zero for the rdtscp bit in cpuid.

For (2), setting TSC_AUX in __update_vcpu_system_time() is fine (I think). On hardware that supports rdtscp, for HVM you would need to ensure that the rdtscp instruction works natively (even though the rdtscp bit in cpuid is not turned on for the guest). On hardware that does not support rdtscp, you would intercept the illegal instruction trap and call the existing code in pv_soft_rdtsc().

Does that make sense?

Thanks,
Dan

** I've looked at RHEL5. Windows actually always returns 0 for TSC_AUX.

> -----Original Message-----
> From: Xu, Dongxiao [mailto:dongxiao.xu@intel.com]
> Sent: Thursday, December 10, 2009 4:22 AM
> To: Dan Magenheimer; Nakajima, Jun; xen-devel@lists.xensource.com; Keir Fraser
> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
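To make the concern about approach (1) concrete, here is a minimal sketch of the application-side check described above: two RDTSCP samples are compared only if they carry the same TSC_AUX value. The `struct rdtscp_sample` type and `elapsed_if_same_cpu()` are hypothetical names used for illustration; the samples are passed in as plain data so the sketch does not depend on the instruction being available.

```c
#include <assert.h>
#include <stdint.h>

/* One RDTSCP result: the 64-bit TSC plus the 32-bit TSC_AUX value. */
struct rdtscp_sample {
    uint64_t tsc;
    uint32_t aux;
};

/*
 * Returns 1 and stores the tick delta if both samples report the same
 * TSC_AUX value; returns 0 if the values differ, in which case the
 * application should discard the measurement.
 */
static int elapsed_if_same_cpu(const struct rdtscp_sample *a,
                               const struct rdtscp_sample *b,
                               uint64_t *ticks)
{
    assert(a && b && ticks);
    if ( a->aux != b->aux )
        return 0;
    *ticks = b->tsc - a->tsc;
    return 1;
}
```

On bare metal an equal `aux` means both reads ran on the same physical CPU. In a guest, `aux` holds whatever the guest OS wrote (a virtual CPU number), so the check can pass even though the vcpu was rescheduled onto another physical CPU between the two reads, which is the wrong answer described in the message.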
Dan,

Thanks for the reply; some comments below.

Best Regards,
-- Dongxiao

Dan Magenheimer wrote:
> Unfortunately,
> on Xen, this mechanism is worthless and misleading
> because the OS believes it is setting TSC_AUX to
> a physical CPU number but it is actually setting
> it to a virtual CPU number, and the physical CPU
> number may change at any time due to scheduling
> or migration. So an app using rdtscp will get a
> wrong answer.

However, for HVM we should keep the behavior the same as on a native machine. So if the hardware supports rdtscp, we will also support it in HVM; if not, we will not expose that bit in cpuid to the guest.

> For (2), setting TSC_AUX in __update_vcpu_system_time()
> is fine (I think). On hardware that supports, for HVM
> you would need to ensure that the rdtscp instruction
> works natively (even though the rdtscp bit in cpuid
> is not turned on for the guest). On hardware that
> does not support rdtscp, you would intercept the illegal
> instruction trap and call the existing code in
> pv_soft_rdtsc().

Putting the write of the TSC_AUX MSR in __update_vcpu_system_time() has a problem: the hypervisor will overwrite the value from time to time (for example, at do_softirq()->local_time_calibration()), even if the value didn't change (currently the domain incarnation value only increases at save/restore/migration). This makes HVM support a bit tricky, because we need to save/restore the guest/host TSC_AUX at every VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in context_switch(), then things would become easier for HVM support. Do you have any thoughts on this? Thanks! :-)
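One way to picture the "overwritten even if the value didn't change" concern is a per-CPU cache that skips the MSR write when the requested value is already loaded. The sketch below is an assumption for illustration only, not something proposed in the thread or present in Xen: `set_tsc_aux_lazy()`, `wrmsr_tsc_aux_stub()`, `NR_CPUS_SKETCH` and the arrays are hypothetical, and the MSR itself is simulated with an array.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4 /* hypothetical CPU count for the simulation */

static uint32_t hw_tsc_aux[NR_CPUS_SKETCH];   /* simulated MSR, per CPU */
static uint32_t cached_aux[NR_CPUS_SKETCH];   /* last value written     */
static int      cached_valid[NR_CPUS_SKETCH]; /* cache entry is usable  */
static unsigned int wrmsr_count;              /* number of real writes  */

/* Stand-in for the expensive WRMSR to TSC_AUX on the given CPU. */
static void wrmsr_tsc_aux_stub(unsigned int cpu, uint32_t val)
{
    hw_tsc_aux[cpu] = val;
    wrmsr_count++;
}

/* Write TSC_AUX only if this CPU does not already hold the value. */
static void set_tsc_aux_lazy(unsigned int cpu, uint32_t val)
{
    assert(cpu < NR_CPUS_SKETCH);
    if ( cached_valid[cpu] && cached_aux[cpu] == val )
        return;
    wrmsr_tsc_aux_stub(cpu, val);
    cached_aux[cpu] = val;
    cached_valid[cpu] = 1;
}
```

Note that such a cache only helps while the hypervisor is the sole writer. If an HVM guest were allowed to write TSC_AUX directly, the cached entry would have to be invalidated whenever the guest's value is loaded, so this does not by itself remove the VMEXIT/VMENTRY save/restore problem raised above.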
Dan Magenheimer
2009-Dec-11 02:00 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> However for HVM, we should keep its behavior the same as
> on native machine. So if hardware support rdtscp, we will also
> support it in HVM; if not, we will not expose that bit in cpuid
> to guest.

As I said, I think this is a very bad idea because there is no way to ensure that the behavior of an app/OS in a VM gives the same results as on a physical machine. So I think the cpuid rdtscp bit should always be off.

> increase at save/restore/migration). This makes HVM support a bit
> Tricky because we need to save/restore guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV/HVM could put TSC_AUX writing in
> context_switch(), then things will become easier for HVM support.

If you are doing a full faithful implementation of rdtscp (as if the cpuid rdtscp bit is on), I agree this is a problem. If not, and the only use of TSC_AUX is for the pvrdtscp algorithm, I think setting TSC_AUX in __update_vcpu_system_time() is fine because TSC_AUX is not part of a VM's context; it is a communication of information from system software (Xen) to applications.

I expect that Keir will not support putting TSC_AUX in the context switch code unless it is absolutely necessary, as it is certainly expensive to read and write TSC_AUX, and this cost would be added to every context switch of every VM even though very few will actually use rdtscp/TSC_AUX.

So I think we need to decide first about approach (1), the full faithful implementation of rdtscp.

> -----Original Message-----
> From: Xu, Dongxiao [mailto:dongxiao.xu@intel.com]
> Sent: Thursday, December 10, 2009 6:23 PM
> To: Dan Magenheimer; Nakajima, Jun; xen-devel@lists.xensource.com; Keir Fraser
> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
On 11/12/2009 02:00, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> I expect that Keir will not support putting TSC_AUX
> in the context switch code unless it is absolutely
> necessary, as it is certainly expensive to read and
> write to TSC_AUX and this cost will add to every
> context switch of every VM even though very few will
> actually use rdtscp/TSC_AUX.

Well, you'd make it dependent on the guest using TSC_AUX, I suppose. I think that's going to be pretty rare.

> So I think we need to decide first about approach (1),
> the full faithful implementation of rdtscp.

The question has to be: what win do we get for faithful virtualisation of RDTSCP in a virtualised environment? Supporting CPU instructions just because they're there is not a useful effort.

 -- Keir
Zhang, Xiantao
2009-Dec-11 08:43 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
Keir Fraser wrote:
> On 11/12/2009 02:00, "Dan Magenheimer" <dan.magenheimer@oracle.com>
> wrote:
>
>> I expect that Keir will not support putting TSC_AUX
>> in the context switch code unless it is absolutely
>> necessary, as it is certainly expensive to read and
>> write to TSC_AUX and this cost will add to every
>> context switch of every VM even though very few will
>> actually use rdtscp/TSC_AUX.
>
> Well, you'd make it dependent on the guest using TSC_AUX, I suppose.
> I think that's going to be pretty rare.
>
>> So I think we need to decide first about approach (1),
>> the full faithful implementation of rdtscp.
>
> The question has to be: what win do we get for faithful
> virtualisation of RDTSCP in a virtualised environment? Supporting CPU
> instructions just because they're there is not a useful effort.

As far as I know, RDTSCP can be used to implement fast vgetcpu in newer Linux kernels. The current node and cpu info is saved in the MSR, and applications or libraries can get this info at ring 3 through this instruction. If this instruction is enabled for VMX non-root mode, it should benefit these kernels, I think.

Thanks!
Xiantao
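To make the vgetcpu use mentioned above concrete, here is a minimal sketch in C of packing a node/cpu pair into the 32-bit TSC_AUX value and unpacking it again. The 12-bit split (cpu in the low bits, node above) is believed to match what Linux did at the time, but it should be read as an assumption here; the helper names are hypothetical and come from neither Xen nor Linux.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 12 bits hold the cpu number, the bits above hold the node. */
#define TSC_AUX_CPU_BITS 12u
#define TSC_AUX_CPU_MASK ((1u << TSC_AUX_CPU_BITS) - 1u)

/* Hypothetical helper: build the value a kernel would write to TSC_AUX. */
static inline uint32_t tsc_aux_pack(uint32_t node, uint32_t cpu)
{
    assert(cpu <= TSC_AUX_CPU_MASK);
    assert(node <= (UINT32_MAX >> TSC_AUX_CPU_BITS));
    return (node << TSC_AUX_CPU_BITS) | cpu;
}

/* Hypothetical helpers: recover the fields from the value RDTSCP hands back. */
static inline uint32_t tsc_aux_cpu(uint32_t aux)
{
    return aux & TSC_AUX_CPU_MASK;
}

static inline uint32_t tsc_aux_node(uint32_t aux)
{
    return aux >> TSC_AUX_CPU_BITS;
}
```

With this layout, node 1 / cpu 5 packs to 4101 (0x1005). A kernel would write such a value once per cpu, and userland recovers both fields from the ECX result of RDTSCP without a system call.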
On 11/12/2009 08:43, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

>> The question has to be: what win do we get for faithful
>> virtualisation of RDTSCP in a virtualised environment? Supporting CPU
>> instructions just because they're there is not a useful effort.
>
> As far as I know, RDTSCP can be used to implement fast vgetcpu in newer
> Linux kernels. The current node and cpu info is saved in the MSR, and
> applications or libraries can get this info at ring 3 through this
> instruction. If this instruction is enabled for VMX non-root mode, it
> should benefit these kernels, I think.

Sounds reasonable. Obviously this will be incompatible with pvrdtscp, but the latter is off by default so this isn't too serious a problem, I think. Pvrdtscp will simply trump ordinary RDTSCP emulation when it is enabled.

You can put your meddling with TSC_AUX MSR in the context-switch path, regardless of whether pvrdtscp's stays in __update_vcpu_system_time().

In short: have at it.

 -- Keir
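A minimal sketch of what "TSC_AUX in the context-switch path, only for guests that use it" could look like. This is not Xen code: the MSR is simulated by a plain variable so the logic can be compiled and run anywhere, and `struct sketch_vcpu`, `sketch_ctxt_switch_to()` and `sim_write_tsc_aux()` are hypothetical names. It assumes guest writes to TSC_AUX are intercepted, so the hypervisor always knows the guest's value and never has to read the MSR back on switch-out. What a guest that does not use the MSR might observe is ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the physical TSC_AUX MSR of one pcpu; real code would use wrmsr. */
static uint32_t phys_tsc_aux;
static unsigned long msr_writes;   /* counts simulated MSR writes */

static void sim_write_tsc_aux(uint32_t val)
{
    phys_tsc_aux = val;
    msr_writes++;
}

/* Hypothetical per-vcpu state; not Xen's struct vcpu. */
struct sketch_vcpu {
    bool     uses_tsc_aux;   /* guest was offered RDTSCP and wrote TSC_AUX */
    uint32_t guest_tsc_aux;  /* value the guest last wrote (intercepted) */
};

/* Called when 'next' is about to run on this pcpu. */
static void sketch_ctxt_switch_to(const struct sketch_vcpu *next)
{
    /* Only touch the MSR for guests that use it, and only if the value changes. */
    if (next->uses_tsc_aux && phys_tsc_aux != next->guest_tsc_aux)
        sim_write_tsc_aux(next->guest_tsc_aux);
}
```

The point of the two conditions is the cost concern raised earlier in the thread: vcpus that never use TSC_AUX add no MSR write to their context switch, and back-to-back switches between vcpus holding the same value add none either.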
Dan Magenheimer
2009-Dec-11 15:09 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> As far as I know, RDTSCP can be used to implement fast vgetcpu in
> newer Linux kernels.

Yes, but code which uses fast vgetcpu is expecting to get physical cpu and physical node number. Since an HVM guest OS only has access to virtual cpu and virtual node number, the information written to TSC_AUX by a guest OS is misleading and may silently break any userland code that assumes it is getting physical information.

I continue to think this is a bad idea and, to use Keir's words, is "Supporting CPU instructions just because they're there".

But, if I am overruled, I'd like to see some measurement of the cycle cost for writing to TSC_AUX. Since Linux only writes it once at __cpuinit time, I wouldn't be surprised to find out that it is horribly slow and adding it to every context switch would be slowing down all users of Xen for a handful of applications -- that are getting incorrect information (vcpu vs pcpu) anyway.
Dan Magenheimer wrote:
>> As far as I know, RDTSCP can be used to implement fast vgetcpu in
>> newer Linux kernels.
>
> Yes, but code which uses fast vgetcpu is expecting
> to get physical cpu and physical node number. Since
> an HVM guest OS only has access to virtual cpu and
> virtual node number, the information written to TSC_AUX
> by a guest OS is misleading and may silently break any
> userland code that assumes it is getting physical
> information.

This depends on how the node info is virtualized. If the virtual node could reflect the physical node info, what rdtscp returns is valuable to applications.

> I continue to think this is a bad idea and, to use Keir's
> words, is "Supporting CPU instructions just because
> they're there".
>
> But, if I am overruled, I'd like to see some measurement
> of the cycle cost for writing to TSC_AUX. Since
> Linux only writes it once at __cpuinit time, I wouldn't
> be surprised to find out that it is horribly slow
> and adding it to every context switch would be slowing
> down all users of Xen for a handful of applications --
> that are getting incorrect information (vcpu vs pcpu)
> anyway.

According to the current PVRDTSCP logic, write_rdtscp_aux() is called on every scheduling pass ( schedule() -> update_vcpu_system_time() -> __update_vcpu_system_time() -> write_rdtscp_aux() ), which is more frequent than __context_switch().

Best Regards,
-- Dongxiao
Dan Magenheimer
2009-Dec-11 16:12 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> > Yes, but code which uses fast vgetcpu is expecting
> > to get physical cpu and physical node number. Since
> > an HVM guest OS only has access to virtual cpu and
> > virtual node number, the information written to TSC_AUX
> > by a guest OS is misleading and may silently break any
> > userland code that assumes it is getting physical
> > information.
>
> This depends on how the node info is virtualized.
> If the virtual node could reflect the physical
> node info, what rdtscp returns is valuable to applications.

If it is possible to ensure that the cpu/node info is virtualized so that TSC_AUX always correctly provides the information needed by apps, I agree this would be valuable. I don't see how this is possible, but maybe you have some creative ideas?

> According to the current PVRDTSCP logic, write_rdtscp_aux()
> is called on every scheduling pass ( schedule() ->
> update_vcpu_system_time() -> __update_vcpu_system_time() ->
> write_rdtscp_aux() ), which is more frequent than
> __context_switch().

OK, I see. Then I am OK with moving the call to write_rdtscp_aux().

Dan
Jeremy Fitzhardinge
2009-Dec-11 18:20 UTC
Re: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
On 12/11/09 07:09, Dan Magenheimer wrote:
>> As far as I know, RDTSCP can be used to implement fast vgetcpu in
>> newer Linux kernels.
>>
> Yes, but code which uses fast vgetcpu is expecting
> to get physical cpu and physical node number. Since
> an HVM guest OS only has access to virtual cpu and
> virtual node number, the information written to TSC_AUX
> by a guest OS is misleading and may silently break any
> userland code that assumes it is getting physical
> information.
>

It will fall back to using the segment limit trick to get vcpu+vnode info if rdtscp isn't available, so they'll get the info either way.

It's not clear how many apps make good use of the numa node info, but presumably some do. So long as the virtual numa info bears some vague resemblance to the real topology then they could still make use of it in a Xen domain. Whether or not Xen currently implements that is a separate question.

However, the vcpu number is definitely useful to usermode apps, so they can get some idea how they're moved between (v)cpus. I don't think it will matter to them that it isn't pcpu.

    J
Dan Magenheimer
2009-Dec-11 18:35 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> However, the vcpu number is definitely useful to usermode
> apps, so they can get some idea how they're moved between
> (v)cpus. I don't think it will matter to them that it
> isn't pcpu.

My point is that an app running on native Linux can safely assume that, if TSC_AUX==3 at time T1 and TSC_AUX is still 3 at time T2, it is running on the same processor and the same node at both T1 and T2. In a virtual environment it cannot even assume it is running on the same machine. Further, if the app sees that TSC_AUX==2 at time T3 and TSC_AUX==3 at time T4, on native Linux it can safely assume that it is running on a different processor. While rarer, in a virtual environment, this may also be a false assumption.

That's why I say the information is misleading.
Dan Magenheimer wrote on Fri, 11 Dec 2009 at 08:12:00:

>>> Yes, but code which uses fast vgetcpu is expecting
>>> to get physical cpu and physical node number. Since
>>> an HVM guest OS only has access to virtual cpu and
>>> virtual node number, the information written to TSC_AUX
>>> by a guest OS is misleading and may silently break any
>>> userland code that assumes it is getting physical
>>> information.
>>
>> This depends on how the node info is virtualized.
>> If the virtual node could reflect the physical
>> node info, what rdtscp returns is valuable to applications.
>
> If it is possible to ensure that the cpu/node info
> is virtualized so that TSC_AUX always correctly provides the
> information needed by apps, I agree this would be
> valuable. I don't see how this is possible, but maybe
> you have some creative ideas?

It's possible, and that's the way guest NUMA is supposed to be. We are working on that.

Jun
___
Intel Open Source Technology Center
Jeremy Fitzhardinge
2009-Dec-11 18:50 UTC
Re: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
On 12/11/09 10:35, Dan Magenheimer wrote:
>> However, the vcpu number is definitely useful to usermode
>> apps, so they can get some idea how they're moved between
>> (v)cpus. I don't think it will matter to them that it
>> isn't pcpu.
>>
> My point is that an app running on native Linux can
> safely assume that, if TSC_AUX==3 at time T1 and
> TSC_AUX is still 3 at time T2, it is running
> on the same processor and the same node at both T1
> and T2. In a virtual environment it cannot even
> assume it is running on the same machine.
> Further if the app sees that TSC_AUX==2 at time T3
> and TSC_AUX==3 at time T4, on native Linux it
> can safely assume that it is running on a different
> processor. While rarer, in a virtual environment,
> this may also be a false assumption.
>
> That's why I say the information is misleading.
>

Sure, but that info is, at best, of heuristic value, and won't cause any correctness problems if it is wrong. The performance may suck, but that's part of the larger problem of running NUMA-aware code in a virtual environment.

    J
Jeremy Fitzhardinge wrote on Fri, 11 Dec 2009 at 10:50:29:

> On 12/11/09 10:35, Dan Magenheimer wrote:
>>> However, the vcpu number is definitely useful to usermode
>>> apps, so they can get some idea how they're moved between
>>> (v)cpus. I don't think it will matter to them that it
>>> isn't pcpu.
>>>
>> My point is that an app running on native Linux can
>> safely assume that, if TSC_AUX==3 at time T1 and
>> TSC_AUX is still 3 at time T2, it is running
>> on the same processor and the same node at both T1
>> and T2. In a virtual environment it cannot even
>> assume it is running on the same machine.
>> Further if the app sees that TSC_AUX==2 at time T3
>> and TSC_AUX==3 at time T4, on native Linux it
>> can safely assume that it is running on a different
>> processor. While rarer, in a virtual environment,
>> this may also be a false assumption.
>>
>> That's why I say the information is misleading.
>>
> Sure, but that info is, at best, of heuristic value, and won't cause
> any correctness problems if it is wrong. The performance may suck, but
> that's part of the larger problem of running NUMA-aware code in a
> virtual environment.
>

And to utilize various NUMA optimizations in the kernel/apps in the guest, we need "the virtual numa info bears some vague resemblance to the real topology" (from Jeremy's email) with the vcpus bound to the CPU/node.

I understand that enabling RDTSCP in HVM will disable the pvrdtscp algorithm if used by the kernel. One way is to mask off the feature in CPUID (by default). Then the kernel won't use it.

Jun
___
Intel Open Source Technology Center
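As a rough illustration of "mask off the feature in CPUID (by default)": RDTSCP is advertised in CPUID leaf 0x80000001, EDX bit 27, so hiding it from a guest amounts to clearing that bit in the value the guest sees. The sketch below is plain C with a hypothetical function name and a hypothetical per-domain `expose_rdtscp` switch; it is not taken from Xen's CPUID handling.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDTSCP is reported in CPUID leaf 0x80000001, EDX bit 27. */
#define CPUID_EXT_EDX_RDTSCP (1u << 27)

/*
 * Hypothetical filter applied to the EDX value returned to a guest for
 * leaf 0x80000001. With expose_rdtscp == false (the suggested default),
 * the guest kernel sees no RDTSCP support and will not rely on it.
 */
static inline uint32_t guest_ext_edx(uint32_t host_edx, bool expose_rdtscp)
{
    if (!expose_rdtscp)
        return host_edx & ~CPUID_EXT_EDX_RDTSCP;
    return host_edx;
}
```

Only the RDTSCP bit is affected; every other feature bit in EDX passes through unchanged.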
Dan Magenheimer
2009-Dec-11 19:46 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> It's possible, and that's the way guest NUMA is supposed to be.
> We are working on that.

I'd be very interested in learning more about your plans.
Dan Magenheimer
2009-Dec-11 22:23 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
Well, although it might be nice to be able to use rdtscp and TSC_AUX to determine pcpu/vcpu/pnode/vnode information, I think Jeremy and Jan convinced me in another thread a couple of months ago that in userland:

x = vgetcpu()
do_other_stuff();
y = vgetcpu()

if x==1 and y==2, there's no way to determine that do_other_stuff() was executed on cpu 1 vs cpu 2, or (though unlikely) even on cpu 3. And if x==y==4, there's no guarantee that do_other_stuff() is executed on cpu 4.

If this is true the only safe use of TSC_AUX is for its originally designed intent: To determine if two successive rdtscp instructions were or were not executed on the same processor. Since this cannot be guaranteed in a VM, that's a reasonable argument that TSC_AUX shouldn't be exposed at all (meaning the rdtscp bit in cpuid should be turned off by Xen).

True, as long as the information is ONLY used heuristically to obtain pcpu/vcpu/pnode/vnode info, and no guarantee of correctness is implied or expected, it might be useful some of the time.

But frankly, if "performance sucks" when the heuristic fails due to the fact that the app is running on a VM instead of native OS, I'd see that as a problem and suggest the proper way to fix that is to define more App-to-Xen ABIs so that the app can get the real information, not a heuristic. Which also argues for Xen leaving the rdtscp bit in cpuid turned off.

Dan
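The x/y argument above can be shown with a small simulation. The sketch below, in plain C with hypothetical names, takes a recorded trace of which cpu a thread was on at each step. `endpoints_match()` is all the application can learn from two vgetcpu() samples, while `stayed_on_one_cpu()` is the ground truth it cannot observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the app can observe: the cpu id at the first and last sample only. */
static inline bool endpoints_match(const int *cpu_trace, size_t n)
{
    assert(n > 0);
    return cpu_trace[0] == cpu_trace[n - 1];
}

/* Ground truth the app cannot see: did the cpu ever change in between? */
static inline bool stayed_on_one_cpu(const int *cpu_trace, size_t n)
{
    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (cpu_trace[i] != cpu_trace[0])
            return false;
    return true;
}
```

For a trace such as {4, 4, 3, 4}, the endpoints match (x==y==4) even though part of do_other_stuff() ran on cpu 3, which is exactly the case the message describes.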
Dan Magenheimer wrote on Fri, 11 Dec 2009 at 14:23:02:

> Well, although it might be nice to be able to use
> rdtscp and TSC_AUX to determine pcpu/vcpu/pnode/vnode
> information, I think Jeremy and Jan convinced me in
> another thread a couple of months ago that in userland:
>
> x = vgetcpu()
> do_other_stuff();
> y = vgetcpu()
>
> if x==1 and y==2, there's no way to determine that
> do_other_stuff() was executed on cpu 1 vs cpu 2,
> or (though unlikely) even on cpu 3. And if
> x==y==4, there's no guarantee that do_other_stuff()
> is executed on cpu 4.
>
> If this is true the only safe use of TSC_AUX is for
> its originally designed intent: To determine if two
> successive rdtscp instructions were or were not
> executed on the same processor. Since this cannot
> be guaranteed in a VM, that's a reasonable argument
> that TSC_AUX shouldn't be exposed at all (meaning the
> rdtscp bit in cpuid should be turned off by Xen).

This should work if you bind (i.e. pin) each vcpu to each CPU, as I suggested.

Jun
___
Intel Open Source Technology Center
Dan Magenheimer
2009-Dec-11 23:30 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> > If this is true the only safe use of TSC_AUX is for
> > its originally designed intent: To determine if two
> > successive rdtscp instructions were or were not
> > executed on the same processor. Since this cannot
> > be guaranteed in a VM, that's a reasonable argument
> > that TSC_AUX shouldn't be exposed at all (meaning the
> > rdtscp bit in cpuid should be turned off by Xen).
>
> This should work if you bind (i.e. pin) each vcpu to each
> CPU, as I suggested.

Yes, it does. If there were a reasonable way for an application to check "am I running on a VM for which each vcpu has been pinned?" this might be a reasonable constraint as, if the app isn't, it could fail or at least log a message. But if the app will randomly fail (or perform horribly) depending on whether the underlying VM is pinned or not (which might even change across a migration or if a sysadmin is "tuning" his data center), I don't think enterprise customers would appreciate that.
Dan Magenheimer wrote:
>>> If this is true the only safe use of TSC_AUX is for
>>> its originally designed intent: To determine if two
>>> successive rdtscp instructions were or were not
>>> executed on the same processor. Since this cannot
>>> be guaranteed in a VM, that's a reasonable argument
>>> that TSC_AUX shouldn't be exposed at all (meaning the
>>> rdtscp bit in cpuid should be turned off by Xen).
>>
>> This should work if you bind (i.e. pin) each vcpu to each
>> CPU, as I suggested.
>
> Yes, it does. If there were a reasonable way for an
> application to check "am I running on a VM for which
> each vcpu has been pinned?" this might be a reasonable
> constraint as, if the app isn't, it could fail or at least
> log a message. But if the app will randomly fail
> (or perform horribly) depending on whether the
> underlying VM is pinned or not (which might even
> change across a migration or if a sysadmin is
> "tuning" his data center), I don't think
> enterprise customers would appreciate that.

Dan,
If guest NUMA is implemented later, both the app and the hypervisor/guest will be NUMA aware. The app could benefit from the node/processor information obtained from RDTSCP. But how to implement guest NUMA is another story: we can either use pinning or some other creative idea.

Best Regards,
-- Dongxiao
Dan Magenheimer
2009-Dec-12 00:09 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> > Yes, it does. If there were a reasonable way for an
> > application to check "am I running on a VM for which
> > each vcpu has been pinned?" this might be a reasonable
> > constraint as, if the app isn't, it could fail or at least
> > log a message. But if the app will randomly fail
> > (or perform horribly) depending on whether the
> > underlying VM is pinned or not (which might even
> > change across a migration or if a sysadmin is
> > "tuning" his data center), I don't think
> > enterprise customers would appreciate that.
>
> Dan,
> If guest NUMA is implemented later, both the app and the
> hypervisor/guest will be NUMA aware. The app could benefit
> from the node/processor information obtained from
> RDTSCP. But how to implement guest NUMA is another story:
> we can either use pinning or some other creative idea.

Right. A guest NUMA implementation could use:

1) rdtscp+tsc_aux, which is very fast but unreliable (unless the app can be certain the guest is permanently pinned), or
2) some other yet-to-be-designed mechanism, likely involving system calls and/or hypercalls, which is slower but can be designed to be always reliable.

In my experience in the enterprise world, "slow but reliable" is always better than "fast but unreliable", except possibly in well-understood constrained situations. So I am suggesting we do not implement (1) by NOT enabling rdtscp-bit-in-cpuid and instead concentrate on (2). I guess for the special cases where unreliable is acceptable, (1) could be an option, but I don't think it should be turned on by default.

Dan
Dan Magenheimer wrote:
>>> Yes, it does. If there were a reasonable way for an
>>> application to check "am I running on a VM for which
>>> each vcpu has been pinned?" this might be a reasonable
>>> constraint as, if the app isn't, it could fail or at least
>>> log a message. But if the app will randomly fail
>>> (or perform horribly) depending on whether the
>>> underlying VM is pinned or not (which might even
>>> change across a migration or if a sysadmin is
>>> "tuning" his data center), I don't think
>>> enterprise customers would appreciate that.
>>
>> Dan,
>> If guest NUMA is implemented later, both the app and the
>> hypervisor/guest will be NUMA aware. The app could benefit
>> from the node/processor information obtained from
>> RDTSCP. But how to implement guest NUMA is another story:
>> we can either use pinning or some other creative idea.
>
> Right. A guest NUMA implementation could use:
>
> 1) rdtscp+tsc_aux, which is very fast but unreliable
> (unless the app can be certain the guest is permanently
> pinned), or
> 2) some other yet-to-be-designed mechanism, likely involving
> system calls and/or hypercalls, which is slower but can be
> designed to be always reliable

Here is my simple understanding of guest NUMA: the hypervisor presents the correct NUMA information to the guest kernel/app. So once guest NUMA is implemented, the information obtained from RDTSCP is both reliable and fast.

Thanks!
Dongxiao
Zhang, Xiantao
2009-Dec-13 09:17 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
Dan Magenheimer wrote:
> Well, although it might be nice to be able to use
> rdtscp and TSC_AUX to determine pcpu/vcpu/pnode/vnode
> information, I think Jeremy and Jan convinced me in
> another thread a couple of months ago that in userland:
>
> x = vgetcpu()
> do_other_stuff();
> y = vgetcpu()
>
> if x==1 and y==2, there's no way to determine that
> do_other_stuff() was executed on cpu 1 vs cpu 2,
> or (though unlikely) even on cpu 3. And if
> x==y==4, there's no guarantee that do_other_stuff()
> is executed on cpu 4.
>
> If this is true the only safe use of TSC_AUX is for
> its originally designed intent: To determine if two
> successive rdtscp instructions were or were not
> executed on the same processor. Since this cannot
> be guaranteed in a VM, that's a reasonable argument
> that TSC_AUX shouldn't be exposed at all (meaning the
> rdtscp bit in cpuid should be turned off by Xen).

Why do you think this is the design intent of this instruction?

For guest NUMA support, I think it is a must to pin each vcpu of a VM to the logical processors that belong to one specific node (that is, to disable vcpu migration between nodes); otherwise, virtual NUMA may suffer a performance loss.

For example, take a NUMA system with two nodes, each with 4 GB of memory and 8 logical processors. If we create a VM with 2 GB of memory and 4 vcpus on this system, Xen may allocate 1 GB of memory from physical node 0 and another 1 GB from physical node 1. If we virtualize NUMA for this VM, vcpu0 and vcpu1 can be assigned to virtual node 0, and vcpu2 and vcpu3 to virtual node 1. We can then safely pin vcpu0 and vcpu1 to physical node 0's 8 logical processors, and likewise pin vcpu2 and vcpu3 to physical node 1's 8 logical processors.

Since TSC_AUX is virtualized for each vcpu, and its value is saved and restored with the vcpu when the vcpu migrates, an application that always runs on one virtual processor should get a fixed value when it calls vgetcpu, even if that vcpu often migrates among the logical processors of one node.

Back to this topic: in short, we can't mix the guest's virtual TSC_AUX with the host's TSC_AUX. When switching to an HVM vcpu's context, load that vcpu's virtual TSC_AUX value into the physical TSC_AUX MSR; when the vcpu is scheduled out, load the host's TSC_AUX value (which may be used for PV guests) back into the MSR.

> True, as long as the information is ONLY used
> heuristically to obtain pcpu/vcpu/pnode/vnode info,
> and no guarantee of correctness is implied or expected,
> it might be useful some of the time.
>
> But frankly, if "performance sucks" when the heuristic
> fails due to the fact that the app is running on
> a VM instead of native OS, I'd see that as a problem
> and suggest the proper way to fix that is to define
> more App-to-Xen ABIs so that the app can get the
> real information, not a heuristic. Which also argues
> for Xen leaving the rdtscp bit in cpuid turned off
>
> Dan
>
>> -----Original Message-----
>> From: Nakajima, Jun [mailto:jun.nakajima@intel.com]
>> Sent: Friday, December 11, 2009 12:30 PM
>> To: Jeremy Fitzhardinge; Dan Magenheimer
>> Cc: Keir Fraser; Zhang, Xiantao; Xu, Dongxiao;
>> xen-devel@lists.xensource.com; Dugger, Donald D
>> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
>>
>>
>> Jeremy Fitzhardinge wrote on Fri, 11 Dec 2009 at 10:50:29:
>>
>>> On 12/11/09 10:35, Dan Magenheimer wrote:
>>>>> However, the vcpu number is definitely useful to usermode apps,
>>>>> so they can get some idea how they're moved between (v)cpus. I
>>>>> don't think it will matter to them that it isn't pcpu.
>>>>>
>>>> My point is that an app running on native Linux can
>>>> safely assume that, if TSC_AUX==3 at time T1 and
>>>> TSC_AUX is still 3 at time T2, it is running
>>>> on the same processor and the same node at both T1
>>>> and T2. In a virtual environment it cannot even
>>>> assume it is running on the same machine.
>>>> Further if the app sees that TSC_AUX==2 at time T3
>>>> and TSC_AUX==3 at time T4, on native Linux it
>>>> can safely assume that it is running on a different
>>>> processor. While rarer, in a virtual environment,
>>>> this may also be a false assumption.
>>>>
>>>> That's why I say the information is misleading.
>>>>
>>> Sure, but that info is, at best, of heuristic value, and won't
>>> cause any correctness problems if it is wrong. The performance may
>>> suck, but that's part of the larger problem of running NUMA-aware
>>> code in a virtual environment.
>>>
>>
>> And to utilize various NUMA optimizations in the kernel/apps
>> in the guest, we need "the virtual numa info bears some vague
>> resemblance to the real topology" (from Jeremy's email) with
>> the vcpus bound to the CPU/node.
>>
>> I understand that enabling RDTSCP in HVM will disable the
>> pvrdtscp algorithm if used by the kernel. One way is to mask
>> off the feature in CPUID (by default). Then kernel won't use it.
>>
>> Jun
>> ___
>> Intel Open Source Technology Center
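[Editor's sketch] The save/restore scheme Xiantao describes above (load the vcpu's virtual TSC_AUX on switch-in, put the host's value back on switch-out) could look roughly like the following. This is not Xen code: `struct vcpu_sketch`, `hvm_switch_in`, `hvm_switch_out` and the `phys_tsc_aux` variable are hypothetical names, and the physical MSR is modelled as a plain variable so the sketch runs in user space. A real hypervisor would use rdmsr/wrmsr on MSR 0xc0000103 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the physical TSC_AUX MSR (0xc0000103). Assumption: a real
 * hypervisor would access it with rdmsr/wrmsr rather than a variable. */
static uint32_t phys_tsc_aux;

/* Value the host wants in TSC_AUX whenever no HVM vcpu is running,
 * for example the value Xen maintains on behalf of PV guests. */
static uint32_t host_tsc_aux;

/* Hypothetical per-vcpu state: the TSC_AUX value this guest vcpu last wrote. */
struct vcpu_sketch {
    uint32_t guest_tsc_aux;
};

/* Models the guest writing TSC_AUX while it is running on the pcpu. */
static void guest_write_tsc_aux(uint32_t val)
{
    phys_tsc_aux = val;
}

/* Switch-in: the guest must see its own virtual TSC_AUX, not the host's. */
static void hvm_switch_in(const struct vcpu_sketch *v)
{
    assert(v != NULL);
    phys_tsc_aux = v->guest_tsc_aux;
}

/* Switch-out: remember what the guest left in the register, then restore
 * the host's value so the two never mix. */
static void hvm_switch_out(struct vcpu_sketch *v)
{
    assert(v != NULL);
    v->guest_tsc_aux = phys_tsc_aux;
    phys_tsc_aux = host_tsc_aux;
}
```

Because the value travels with the vcpu rather than with the pcpu, a guest application bound to one vcpu keeps reading the same TSC_AUX however that vcpu is moved between physical processors, which is the property claimed above.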
Dan Magenheimer
2009-Dec-13 18:06 UTC
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> > If this is true the only safe use of TSC_AUX is for
> > its originally designed intent: To determine if two
> > successive rdtscp instructions were or were not
> > executed on the same processor. Since this cannot
> > be guaranteed in a VM, that's a reasonable argument
> > that TSC_AUX shouldn't be exposed at all (meaning the
> > rdtscp bit in cpuid should be turned off by Xen).
>
> Why do you think this is the design intent of this instruction ?

The instruction was designed by AMD for this purpose a few years ago in order to allow applications to detect (and correct) possible TSC skew between processors.

> For guest NUMA support, it should be a must to pin each vcpu
> of one VM to some logical processors which belong to one
> specific node(disable vcpu migration between nodes), I think,
> otherwise, virtual numa may suffer from performance loss.

I agree that, for guest NUMA support, restricting all vcpus to the same physical node is important. However, PINNING each vcpu to a fixed pcpu (and never allowing migration) greatly reduces the value of virtualization.

> For example, in a numa system which has two nodes and each
> node has 4G memory and 8 logical processors. And in this
> Xen-configured system, if we create a VM with 2 G memory
> with 4 vcpu support, Xen system may allocate 1 G memory from
> physical node 0 and another 1 G memory from physical node 1.
> And in this case, if we virtualize numa for this VM, vcpu0
> and vcpu1 can be assigned to virtual node0 , vcpu2 and vcpu3
> can be configured for virtual node1, certainly, we also can
> safely pin vcpu0 and vcpu1 to the physical node0's 8 logical
> processors and accordingly pin vcpu2 and vcpu3 to the
> physical node1's 8 physical processors. Since virtual
> TSC_AUX is virtualized for each vcpu, and the value is
> saved/restored for the vcpu when its migration occurs, so if
> one application always runs on a virtual processors, it
> should get a fixed value when it calls vgetcpu, even if this
> vcpu often migrates among logical processors of one node.

I agree there are some cases where the TSC_AUX value set by a guest OS may be useful. But ensuring that it is always useful (NEVER incorrect) requires too many restrictions, such as pinning.

> Back to this topic, in all, we can't mix the virtual
> TSC_AUX of guest with the host's TSC_AUX. If switch to HVM's
> vcpu context, load this vcpu's virtual TSC_AUX_MSR to
> physical TSC_AUX_MSR, and when it is scheduled out, host's
> TSC_AUX_MSR(which maybe used for pv guests) is loaded.

I agree they can't be mixed. My position is that a guest does not have sufficient information to always correctly set TSC_AUX, so the best way to avoid the issue is to tell the guest OS that TSC_AUX doesn't exist (i.e. the cpuid rdtscp bit is off). Xen can still set TSC_AUX (and even emulate it on processors that don't support it), and this information can still be used (correctly) by virtualization-and-NUMA-aware OSes and applications.
Jeremy Fitzhardinge
2009-Dec-13 18:59 UTC
Re: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
On 12/13/09 10:06, Dan Magenheimer wrote:
> I agree there are some cases where the TSC_AUX value
> set by a guest OS may be useful. But ensuring that it
> is always useful (NEVER incorrect) requires too many restrictions,
> such as pinning.
>

At least with respect to Linux guests [*], this objection to rdtscp is moot, because if it isn't present then Linux will fall back to another mechanism which is always present. Guest usermode will get the same info, good/bad/misleading/whatever, either way; rdtscp can't make it worse. The only question is whether specifically adding rdtscp/TSC_AUX support adds any overall improvement.

(* I don't know if any other rdtscp-users attempt to put NUMA or other physical topology info into TSC_AUX. If they just stick to setting/using the cpu number, then they will get a net win from rdtscp.)

    J
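[Editor's sketch] For readers wondering what "NUMA info in TSC_AUX" looks like in practice: the sketch below assumes the packing that x86-64 Linux of this period is generally understood to use for vgetcpu, with the cpu number in the low 12 bits and the node number in the bits above. That layout is an assumption here, not something stated in this thread, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0..11 hold the cpu number, bits 12..31 the node. */
#define AUX_CPU_BITS 12u
#define AUX_CPU_MASK ((1u << AUX_CPU_BITS) - 1u)

/* What the OS would write into TSC_AUX for a given cpu and node. */
static uint32_t aux_encode(uint32_t cpu, uint32_t node)
{
    assert(cpu <= AUX_CPU_MASK);
    assert(node <= (UINT32_MAX >> AUX_CPU_BITS));
    return (node << AUX_CPU_BITS) | cpu;
}

/* What user space would recover from the value rdtscp returns. */
static uint32_t aux_cpu(uint32_t aux)
{
    return aux & AUX_CPU_MASK;
}

static uint32_t aux_node(uint32_t aux)
{
    return aux >> AUX_CPU_BITS;
}
```

Under this layout a guest that only cares about the cpu field behaves as Jeremy describes, while the node field is exactly the part that stops being trustworthy once vcpus can move between physical nodes.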
Jeremy Fitzhardinge wrote:
> On 12/13/09 10:06, Dan Magenheimer wrote:
>> I agree there are some cases where the TSC_AUX value
>> set by a guest OS may be useful. But ensuring that it
>> is always useful (NEVER incorrect) requires too many restrictions,
>> such as pinning.
>>
>
> At least with respect to Linux guests [*], this objection to rdtscp is
> moot, because if it isn't present then Linux will fall back to another
> mechanism which is always present. Guest usermode will get the same
> info, good/bad/misleading/whatever, either way; rdtscp can't make it
> worse. The only question is whether specifically adding
> rdtscp/TSC_AUX support adds any overall improvement.
>
> (* I don't know if any other rdtscp-users attempt to put NUMA or other
> physical topology info into TSC_AUX. If they just stick to
> setting/using the cpu number, then they will get a net win from
> rdtscp.)

I just had a glance at the OpenSolaris code: in its mp_startup() function, it writes the cpu_id value into the TSC_AUX MSR. Therefore I think OpenSolaris also uses this feature.

Thanks,
Dongxiao

>
> J