Here is a set of VPMU changes that I thought might be useful.

The first two patches are to avoid VMEXITs on certain MSR accesses. This is
already part of VMX so I added similar SVM code.

The third patch should address the problem that Suravee mentioned on the
list a few weeks ago
(http://lists.xen.org/archives/html/xen-devel/2013-03/msg00087.html).
It's a slightly different solution than what he suggested.

4th patch stops counters on AMD when a VCPU is de-scheduled.

5th is trivial Haswell support (new model number).

6th patch factors out common code from VMX and SVM.

7th is lazy VPMU save/restore. It is optimized for the case when CPUs are
not over-subscribed and a VCPU stays on the same processor most of the time.
It is more beneficial on Intel processors because HW will keep track of the
guest/host global control register in the VMCS and therefore we don't need
to explicitly stop the counters. On AMD we do need to do this, so while
there is an improvement, it is not as pronounced.

Here are some numbers that I managed to collect while running guests with
oprofile. This is the number of executed instructions in vpmu_save/vpmu_load:

             Eager VPMU           Lazy VPMU
           Save   Restore       Save   Restore
    Intel   181     225           46      50
    AMD     132     104           80     102

When processors are oversubscribed, lazy restore may take about 2.5 times
as many instructions as in the dedicated case if a new VCPU jumps onto the
processor (which doesn't happen on every context switch).

-boris

Boris Ostrovsky (8):
  x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map
  x86/AMD: Do not intercept access to performance counters MSRs
  x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
  x86/AMD: Stop counters on VPMU save
  x86/VPMU: Add Haswell support
  x86/VPMU: Factor out VPMU common code
  x86/VPMU: Save/restore VPMU only when necessary
  x86/AMD: Clean up context_update() in AMD VPMU code

 xen/arch/x86/domain.c                    |  14 ++-
 xen/arch/x86/hvm/svm/svm.c               |  19 ++--
 xen/arch/x86/hvm/svm/vpmu.c              | 188 +++++++++++++++++++++++--------
 xen/arch/x86/hvm/vmx/vmx.c               |   2 -
 xen/arch/x86/hvm/vmx/vpmu_core2.c        |  48 ++++----
 xen/arch/x86/hvm/vpmu.c                  | 114 +++++++++++++++++--
 xen/include/asm-x86/hvm/svm/vmcb.h       |   8 +-
 xen/include/asm-x86/hvm/vmx/vpmu_core2.h |   1 -
 xen/include/asm-x86/hvm/vpmu.h           |   6 +-
 9 files changed, 296 insertions(+), 104 deletions(-)

--
1.8.1.2
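In pseudo-C, the lazy scheme that patches 6 and 7 implement boils down to the
following. This is only a condensed sketch: the helper names
lazy_ctxt_switch_from/to and force_save() are made up for this summary, and
IRQ disabling and the remote-CPU IPI details are omitted; the real logic is
vpmu_save()/vpmu_load() in xen/arch/x86/hvm/vpmu.c as changed by patch 7.

    /* Switching away from 'prev': remember where its context is live and
     * stop the counters, but do not copy them out of the HW yet. */
    static void lazy_ctxt_switch_from(struct vcpu *prev)
    {
        struct vpmu_struct *vpmu = vcpu_vpmu(prev);

        vpmu->last_pcpu = smp_processor_id();
        per_cpu(last_vcpu, smp_processor_id()) = prev;
        vpmu->arch_vpmu_ops->arch_vpmu_save(prev); /* AMD: zero ctrls;
                                                      Intel: mostly a no-op */
    }

    /* Switching to 'next': only pull the context in if it is not already
     * sitting in this CPU's MSRs. */
    static void lazy_ctxt_switch_to(struct vcpu *next)
    {
        struct vpmu_struct *vpmu = vcpu_vpmu(next);
        struct vcpu *prev = per_cpu(last_vcpu, smp_processor_id());

        if ( vpmu->last_pcpu != smp_processor_id() &&
             vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
            force_save(next);   /* context still live on another pcpu --
                                   patch 7 IPIs it via vpmu_save_force() */

        if ( prev && prev != next )
            force_save(prev);   /* someone else's counters are loaded here */

        if ( vpmu_is_set(vpmu, VPMU_RUNNING) )
            vpmu->arch_vpmu_ops->arch_vpmu_load(next);
    }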
Boris Ostrovsky
2013-Apr-09 17:26 UTC
[PATCH 1/8] x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map
Currently the VMCB's MSRPM can be updated to either intercept both reads and
writes to an MSR or intercept neither. In some cases we may want to be more
selective and intercept one but not the other.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/svm.c         | 15 +++++++--------
 xen/include/asm-x86/hvm/svm/vmcb.h |  8 ++++++--
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index f170ffb..8ce37c9 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -137,7 +137,7 @@ svm_msrbit(unsigned long *msr_bitmap, uint32_t msr)
     return msr_bit;
 }
 
-void svm_intercept_msr(struct vcpu *v, uint32_t msr, int enable)
+void svm_intercept_msr(struct vcpu *v, uint32_t msr, int flags)
 {
     unsigned long *msr_bit;
 
@@ -145,16 +145,15 @@ void svm_intercept_msr(struct vcpu *v, uint32_t msr, int enable)
     BUG_ON(msr_bit == NULL);
     msr &= 0x1fff;
 
-    if ( enable )
-    {
-        __set_bit(msr * 2, msr_bit);
+    if ( flags & MSR_INTERCEPT_READ )
+        __set_bit(msr * 2, msr_bit);
+    else
+        __clear_bit(msr * 2, msr_bit);
+
+    if ( flags & MSR_INTERCEPT_WRITE )
         __set_bit(msr * 2 + 1, msr_bit);
-    }
     else
-    {
-        __clear_bit(msr * 2, msr_bit);
         __clear_bit(msr * 2 + 1, msr_bit);
-    }
 }
 
 static void svm_save_dr(struct vcpu *v)
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index b7c0404..ade4bb2 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -531,9 +531,13 @@ void svm_destroy_vmcb(struct vcpu *v);
 
 void setup_vmcb_dump(void);
 
+#define MSR_INTERCEPT_NONE    0
+#define MSR_INTERCEPT_READ    1
+#define MSR_INTERCEPT_WRITE   2
+#define MSR_INTERCEPT_RW      (MSR_INTERCEPT_WRITE | MSR_INTERCEPT_READ)
 void svm_intercept_msr(struct vcpu *v, uint32_t msr, int enable);
-#define svm_disable_intercept_for_msr(v, msr) svm_intercept_msr((v), (msr), 0)
-#define svm_enable_intercept_for_msr(v, msr) svm_intercept_msr((v), (msr), 1)
+#define svm_disable_intercept_for_msr(v, msr) svm_intercept_msr((v), (msr), MSR_INTERCEPT_NONE)
+#define svm_enable_intercept_for_msr(v, msr) svm_intercept_msr((v), (msr), MSR_INTERCEPT_RW)
 
 /*
  * VMCB accessor functions.
-- 
1.8.1.2
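As a usage example, the next patch in the series uses the new flags to let
the guest access the counters directly while still trapping writes to the
event select (control) registers (excerpted from patch 2):

    /* Guest reads/writes of the counters and reads of the event selects go
     * straight to HW; only control-register writes are still intercepted. */
    for ( i = 0; i < num_counters; i++ )
    {
        svm_intercept_msr(v, counters[i], MSR_INTERCEPT_NONE);
        svm_intercept_msr(v, ctrls[i], MSR_INTERCEPT_WRITE);
    }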
Boris Ostrovsky
2013-Apr-09 17:26 UTC
[PATCH 2/8] x86/AMD: Do not intercept access to performance counters MSRs
Access to performance counters and reads of event selects don''t need to always be intercepted. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- xen/arch/x86/hvm/svm/svm.c | 2 +- xen/arch/x86/hvm/svm/vpmu.c | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index 8ce37c9..89e47b3 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -1052,8 +1052,8 @@ static int svm_vcpu_initialise(struct vcpu *v) static void svm_vcpu_destroy(struct vcpu *v) { - svm_destroy_vmcb(v); vpmu_destroy(v); + svm_destroy_vmcb(v); passive_domain_destroy(v); } diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c index 16170da..f194975 100644 --- a/xen/arch/x86/hvm/svm/vpmu.c +++ b/xen/arch/x86/hvm/svm/vpmu.c @@ -88,6 +88,7 @@ struct amd_vpmu_context { u64 counters[MAX_NUM_COUNTERS]; u64 ctrls[MAX_NUM_COUNTERS]; u32 hw_lapic_lvtpc; + bool_t msr_bitmap_set; }; static inline int get_pmu_reg_type(u32 addr) @@ -138,6 +139,36 @@ static inline u32 get_fam15h_addr(u32 addr) return addr; } +static void amd_vpmu_set_msr_bitmap(struct vcpu *v) +{ + int i; + struct vpmu_struct *vpmu = vcpu_vpmu(v); + struct amd_vpmu_context *ctxt = vpmu->context; + + for ( i = 0; i < num_counters; i++ ) + { + svm_intercept_msr(v, counters[i], MSR_INTERCEPT_NONE); + svm_intercept_msr(v, ctrls[i], MSR_INTERCEPT_WRITE); + } + + ctxt->msr_bitmap_set = 1; +} + +static void amd_vpmu_unset_msr_bitmap(struct vcpu *v) +{ + int i; + struct vpmu_struct *vpmu = vcpu_vpmu(v); + struct amd_vpmu_context *ctxt = vpmu->context; + + for ( i = 0; i < num_counters; i++ ) + { + svm_intercept_msr(v, counters[i], MSR_INTERCEPT_RW); + svm_intercept_msr(v, ctrls[i], MSR_INTERCEPT_RW); + } + + ctxt->msr_bitmap_set = 0; +} + static int amd_vpmu_do_interrupt(struct cpu_user_regs *regs) { struct vcpu *v = current; @@ -219,6 +250,10 @@ static void amd_vpmu_save(struct vcpu *v) return; context_save(v); + + if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && ctx->msr_bitmap_set ) + amd_vpmu_unset_msr_bitmap(v); + ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); } @@ -267,6 +302,9 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) return 1; vpmu_set(vpmu, VPMU_RUNNING); apic_write(APIC_LVTPC, PMU_APIC_VECTOR); + + if ( !((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) + amd_vpmu_set_msr_bitmap(v); } /* stop saving & restore if guest stops first counter */ @@ -275,6 +313,8 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) { apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); vpmu_reset(vpmu, VPMU_RUNNING); + if ( ((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) + amd_vpmu_unset_msr_bitmap(v); release_pmu_ownship(PMU_OWNER_HVM); } @@ -345,6 +385,9 @@ static void amd_vpmu_destroy(struct vcpu *v) if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) return; + if ( ((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) + amd_vpmu_unset_msr_bitmap(v); + xfree(vpmu->context); vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED); -- 1.8.1.2
Boris Ostrovsky
2013-Apr-09 17:26 UTC
[PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
Mark the context as LOADED on any MSR write (and on vpmu_restore()). This
will allow us to know whether to read an MSR from HW or from the context:
a guest may write into an MSR without enabling it (thus not marking the
context as RUNNING) and then be migrated. Reading this MSR again would not
match the previous write since the registers will not be loaded into HW.

In addition, we should be saving the context when it is LOADED, not RUNNING;
otherwise we would need to save it any time it becomes non-RUNNING, which
may be a frequent occurrence.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/vpmu.c | 48 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c
index f194975..3115923 100644
--- a/xen/arch/x86/hvm/svm/vpmu.c
+++ b/xen/arch/x86/hvm/svm/vpmu.c
@@ -225,6 +225,8 @@ static void amd_vpmu_restore(struct vcpu *v)
 
     context_restore(v);
     apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc);
+
+    vpmu_set(vpmu, VPMU_CONTEXT_LOADED);
 }
 
 static inline void context_save(struct vcpu *v)
@@ -246,7 +248,7 @@ static void amd_vpmu_save(struct vcpu *v)
     struct amd_vpmu_context *ctx = vpmu->context;
 
     if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) &&
-           vpmu_is_set(vpmu, VPMU_RUNNING)) )
+           vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) )
         return;
 
     context_save(v);
@@ -256,6 +258,35 @@ static void amd_vpmu_save(struct vcpu *v)
 
     ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC);
     apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED);
+    vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
+}
+
+static void context_read(unsigned int msr, u64 *msr_content)
+{
+    unsigned int i;
+    struct vcpu *v = current;
+    struct vpmu_struct *vpmu = vcpu_vpmu(v);
+    struct amd_vpmu_context *ctxt = vpmu->context;
+
+    if ( k7_counters_mirrored &&
+        ((msr >= MSR_K7_EVNTSEL0) && (msr <= MSR_K7_PERFCTR3)) )
+    {
+        msr = get_fam15h_addr(msr);
+    }
+
+    for ( i = 0; i < num_counters; i++ )
+    {
+        if ( msr == ctrls[i] )
+        {
+            *msr_content = ctxt->ctrls[i];
+            return;
+        }
+        else if (msr == counters[i] )
+        {
+            *msr_content = ctxt->counters[i];
+            return;
+        }
+    }
 }
 
 static void context_update(unsigned int msr, u64 msr_content)
@@ -318,6 +349,12 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content)
         release_pmu_ownship(PMU_OWNER_HVM);
     }
 
+    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
+    {
+        context_restore(v);
+        vpmu_set(vpmu, VPMU_CONTEXT_LOADED);
+    }
+
     /* Update vpmu context immediately */
     context_update(msr, msr_content);
 
@@ -328,7 +365,14 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content)
 
 static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content)
 {
-    rdmsrl(msr, *msr_content);
+    struct vcpu *v = current;
+    struct vpmu_struct *vpmu = vcpu_vpmu(v);
+
+    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
+        context_read(msr, msr_content);
+    else
+        rdmsrl(msr, *msr_content);
+
     return 1;
 }
 
-- 
1.8.1.2
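For reference, a short summary of the vpmu->flags bits that the rest of the
series keys off of (summarized from the patches; the definitions live in
xen/include/asm-x86/hvm/vpmu.h, and the last two bits are introduced by
patch 7):

    /*
     * VPMU_CONTEXT_ALLOCATED - the per-VCPU shadow context exists
     * VPMU_RUNNING           - the guest has at least one counter enabled
     * VPMU_CONTEXT_LOADED    - the context currently lives in the HW MSRs;
     *                          if clear, MSR accesses are serviced from the
     *                          shadow context (this patch)
     * VPMU_CONTEXT_SAVE      - force the counters to actually be copied out
     *                          of HW on the next save (patch 7)
     * VPMU_STOPPED           - counters have been stopped but the context
     *                          has not been saved (patch 7)
     */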
[PATCH 4/8] x86/AMD: Stop counters on VPMU save

Stop the counters during the VPMU save operation since they shouldn't be
running when the VCPU that controls them is not. This also makes it
unnecessary to check for overflow in context_restore().

Set the LVTPC vector before loading the context during vpmu_restore().
Otherwise it is possible to trigger an interrupt without a proper vector.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/vpmu.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c
index 3115923..2f4b6e4 100644
--- a/xen/arch/x86/hvm/svm/vpmu.c
+++ b/xen/arch/x86/hvm/svm/vpmu.c
@@ -197,20 +197,9 @@ static inline void context_restore(struct vcpu *v)
     struct amd_vpmu_context *ctxt = vpmu->context;
 
     for ( i = 0; i < num_counters; i++ )
-        wrmsrl(ctrls[i], ctxt->ctrls[i]);
-
-    for ( i = 0; i < num_counters; i++ )
     {
         wrmsrl(counters[i], ctxt->counters[i]);
-
-        /* Force an interrupt to allow guest reset the counter,
-           if the value is positive */
-        if ( is_overflowed(ctxt->counters[i]) && (ctxt->counters[i] > 0) )
-        {
-            gdprintk(XENLOG_WARNING, "VPMU: Force a performance counter "
-                                     "overflow interrupt!\n");
-            amd_vpmu_do_interrupt(0);
-        }
+        wrmsrl(ctrls[i], ctxt->ctrls[i]);
     }
 }
 
@@ -223,8 +212,8 @@ static void amd_vpmu_restore(struct vcpu *v)
            vpmu_is_set(vpmu, VPMU_RUNNING)) )
         return;
 
-    context_restore(v);
     apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc);
+    context_restore(v);
 
     vpmu_set(vpmu, VPMU_CONTEXT_LOADED);
 }
@@ -236,10 +225,11 @@ static inline void context_save(struct vcpu *v)
     struct amd_vpmu_context *ctxt = vpmu->context;
 
     for ( i = 0; i < num_counters; i++ )
-        rdmsrl(counters[i], ctxt->counters[i]);
-
-    for ( i = 0; i < num_counters; i++ )
+    {
         rdmsrl(ctrls[i], ctxt->ctrls[i]);
+        wrmsrl(ctrls[i], 0);
+        rdmsrl(counters[i], ctxt->counters[i]);
+    }
 }
 
 static void amd_vpmu_save(struct vcpu *v)
-- 
1.8.1.2
[PATCH 5/8] x86/VPMU: Add Haswell support

Initialize VPMU on Haswell CPUs.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/vmx/vpmu_core2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c b/xen/arch/x86/hvm/vmx/vpmu_core2.c
index 2313e39..7c86a0b 100644
--- a/xen/arch/x86/hvm/vmx/vpmu_core2.c
+++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c
@@ -893,6 +893,7 @@ int vmx_vpmu_initialise(struct vcpu *v, unsigned int vpmu_flags)
 
         case 0x3a: /* IvyBridge */
         case 0x3e: /* IvyBridge EP */
+        case 0x3c: /* Haswell */
             ret = core2_vpmu_initialise(v, vpmu_flags);
             if ( !ret )
                 vpmu->arch_vpmu_ops = &core2_vpmu_ops;
-- 
1.8.1.2
Factor out common code from SVM amd VMX into VPMU. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- xen/arch/x86/hvm/svm/vpmu.c | 37 ----------------------- xen/arch/x86/hvm/vmx/vpmu_core2.c | 30 +------------------ xen/arch/x86/hvm/vpmu.c | 50 ++++++++++++++++++++++++++++---- xen/include/asm-x86/hvm/vmx/vpmu_core2.h | 1 - xen/include/asm-x86/hvm/vpmu.h | 1 + 5 files changed, 47 insertions(+), 72 deletions(-) diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c index 2f4b6e4..6610806 100644 --- a/xen/arch/x86/hvm/svm/vpmu.c +++ b/xen/arch/x86/hvm/svm/vpmu.c @@ -87,7 +87,6 @@ static const u32 AMD_F15H_CTRLS[] = { struct amd_vpmu_context { u64 counters[MAX_NUM_COUNTERS]; u64 ctrls[MAX_NUM_COUNTERS]; - u32 hw_lapic_lvtpc; bool_t msr_bitmap_set; }; @@ -171,22 +170,6 @@ static void amd_vpmu_unset_msr_bitmap(struct vcpu *v) static int amd_vpmu_do_interrupt(struct cpu_user_regs *regs) { - struct vcpu *v = current; - struct vlapic *vlapic = vcpu_vlapic(v); - u32 vlapic_lvtpc; - unsigned char int_vec; - - if ( !is_vlapic_lvtpc_enabled(vlapic) ) - return 0; - - vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); - int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; - - if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) - vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); - else - v->nmi_pending = 1; - return 1; } @@ -205,17 +188,7 @@ static inline void context_restore(struct vcpu *v) static void amd_vpmu_restore(struct vcpu *v) { - struct vpmu_struct *vpmu = vcpu_vpmu(v); - struct amd_vpmu_context *ctxt = vpmu->context; - - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && - vpmu_is_set(vpmu, VPMU_RUNNING)) ) - return; - - apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc); context_restore(v); - - vpmu_set(vpmu, VPMU_CONTEXT_LOADED); } static inline void context_save(struct vcpu *v) @@ -237,18 +210,10 @@ static void amd_vpmu_save(struct vcpu *v) struct vpmu_struct *vpmu = vcpu_vpmu(v); struct amd_vpmu_context *ctx = vpmu->context; - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && - vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) - return; - context_save(v); if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && ctx->msr_bitmap_set ) amd_vpmu_unset_msr_bitmap(v); - - ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); - apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); } static void context_read(unsigned int msr, u64 *msr_content) @@ -299,8 +264,6 @@ static void context_update(unsigned int msr, u64 msr_content) for ( i = 0; i < num_counters; i++ ) if ( msr == ctrls[i] ) ctxt->ctrls[i] = msr_content; - - ctxt->hw_lapic_lvtpc = apic_read(APIC_LVTPC); } static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c b/xen/arch/x86/hvm/vmx/vpmu_core2.c index 7c86a0b..6195bfc 100644 --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c @@ -305,25 +305,18 @@ static inline void __core2_vpmu_save(struct vcpu *v) rdmsrl(core2_fix_counters.msr[i], core2_vpmu_cxt->fix_counters[i]); for ( i = 0; i < core2_get_pmc_count(); i++ ) rdmsrl(MSR_IA32_PERFCTR0+i, core2_vpmu_cxt->arch_msr_pair[i].counter); - core2_vpmu_cxt->hw_lapic_lvtpc = apic_read(APIC_LVTPC); - apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); } static void core2_vpmu_save(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && - vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) - return; - __core2_vpmu_save(v); /* Unset PMU MSR bitmap to trap lazy load. 
*/ if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && cpu_has_vmx_msr_bitmap ) core2_vpmu_unset_msr_bitmap(v->arch.hvm_vmx.msr_bitmap); - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); return; } @@ -341,20 +334,11 @@ static inline void __core2_vpmu_load(struct vcpu *v) wrmsrl(core2_ctrls.msr[i], core2_vpmu_cxt->ctrls[i]); for ( i = 0; i < core2_get_pmc_count(); i++ ) wrmsrl(MSR_P6_EVNTSEL0+i, core2_vpmu_cxt->arch_msr_pair[i].control); - - apic_write_around(APIC_LVTPC, core2_vpmu_cxt->hw_lapic_lvtpc); } static void core2_vpmu_load(struct vcpu *v) { - struct vpmu_struct *vpmu = vcpu_vpmu(v); - - /* Only when PMU is counting, we load PMU context immediately. */ - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && - vpmu_is_set(vpmu, VPMU_RUNNING)) ) - return; __core2_vpmu_load(v); - vpmu_set(vpmu, VPMU_CONTEXT_LOADED); } static int core2_vpmu_alloc_resource(struct vcpu *v) @@ -705,11 +689,8 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs) { struct vcpu *v = current; u64 msr_content; - u32 vlapic_lvtpc; - unsigned char int_vec; struct vpmu_struct *vpmu = vcpu_vpmu(v); struct core2_vpmu_context *core2_vpmu_cxt = vpmu->context; - struct vlapic *vlapic = vcpu_vlapic(v); rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content); if ( msr_content ) @@ -728,18 +709,9 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs) return 0; } + /* HW sets the MASK bit when performance counter interrupt occurs*/ apic_write_around(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED); - if ( !is_vlapic_lvtpc_enabled(vlapic) ) - return 1; - - vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); - int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; - vlapic_set_reg(vlapic, APIC_LVTPC, vlapic_lvtpc | APIC_LVT_MASKED); - if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) - vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); - else - v->nmi_pending = 1; return 1; } diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c index 3b69580..6b7af68 100644 --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -31,7 +31,7 @@ #include <asm/hvm/vpmu.h> #include <asm/hvm/svm/svm.h> #include <asm/hvm/svm/vmcb.h> - +#include <asm/apic.h> /* * "vpmu" : vpmu generally enabled @@ -83,10 +83,31 @@ int vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) int vpmu_do_interrupt(struct cpu_user_regs *regs) { - struct vpmu_struct *vpmu = vcpu_vpmu(current); + struct vcpu *v = current; + struct vpmu_struct *vpmu = vcpu_vpmu(v); + + if ( vpmu->arch_vpmu_ops ) + { + struct vlapic *vlapic = vcpu_vlapic(v); + u32 vlapic_lvtpc; + unsigned char int_vec; + + if ( !vpmu->arch_vpmu_ops->do_interrupt(regs) ) + return 0; + + if ( !is_vlapic_lvtpc_enabled(vlapic) ) + return 1; + + vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); + int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; + + if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) + vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); + else + v->nmi_pending = 1; + return 1; + } - if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_interrupt ) - return vpmu->arch_vpmu_ops->do_interrupt(regs); return 0; } @@ -103,17 +124,36 @@ void vpmu_do_cpuid(unsigned int input, void vpmu_save(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); + + if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && + vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) + return; - if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_save ) + if ( vpmu->arch_vpmu_ops ) vpmu->arch_vpmu_ops->arch_vpmu_save(v); + + vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC); + apic_write(APIC_LVTPC, PMU_APIC_VECTOR | 
APIC_LVT_MASKED); + + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); } void vpmu_load(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); + /* Only when PMU is counting, we load PMU context immediately. */ + if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && + vpmu_is_set(vpmu, VPMU_RUNNING)) ) + return; + if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load ) + { + apic_write_around(APIC_LVTPC, vpmu->hw_lapic_lvtpc); vpmu->arch_vpmu_ops->arch_vpmu_load(v); + } + + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); } void vpmu_initialise(struct vcpu *v) diff --git a/xen/include/asm-x86/hvm/vmx/vpmu_core2.h b/xen/include/asm-x86/hvm/vmx/vpmu_core2.h index 4128f2a..60b05fd 100644 --- a/xen/include/asm-x86/hvm/vmx/vpmu_core2.h +++ b/xen/include/asm-x86/hvm/vmx/vpmu_core2.h @@ -44,7 +44,6 @@ struct core2_vpmu_context { u64 fix_counters[VPMU_CORE2_NUM_FIXED]; u64 ctrls[VPMU_CORE2_NUM_CTRLS]; u64 global_ovf_status; - u32 hw_lapic_lvtpc; struct arch_msr_pair arch_msr_pair[1]; }; diff --git a/xen/include/asm-x86/hvm/vpmu.h b/xen/include/asm-x86/hvm/vpmu.h index cd31f5e..01be976 100644 --- a/xen/include/asm-x86/hvm/vpmu.h +++ b/xen/include/asm-x86/hvm/vpmu.h @@ -62,6 +62,7 @@ int svm_vpmu_initialise(struct vcpu *, unsigned int flags); struct vpmu_struct { u32 flags; + u32 hw_lapic_lvtpc; void *context; struct arch_vpmu_ops *arch_vpmu_ops; }; -- 1.8.1.2
Boris Ostrovsky
2013-Apr-09 17:26 UTC
[PATCH 7/8] x86/VPMU: Save/restore VPMU only when necessary
VPMU doesn''t need to always be saved during context switch. If we are comming back to the same processor and no other VPCU has run here we can simply continue running. This is especailly useful on Intel processors where Global Control MSR is stored in VMCS, thus not requiring us to stop the counters during save operation. On AMD we need to explicitly stop the counters but we don''t need to save them. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- xen/arch/x86/domain.c | 14 ++++++-- xen/arch/x86/hvm/svm/svm.c | 2 -- xen/arch/x86/hvm/svm/vpmu.c | 58 +++++++++++++++++++++++++----- xen/arch/x86/hvm/vmx/vmx.c | 2 -- xen/arch/x86/hvm/vmx/vpmu_core2.c | 25 +++++++++++-- xen/arch/x86/hvm/vpmu.c | 74 +++++++++++++++++++++++++++++++++++---- xen/include/asm-x86/hvm/vpmu.h | 5 ++- 7 files changed, 155 insertions(+), 25 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 0d7a40c..6f9cbed 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1535,8 +1535,14 @@ void context_switch(struct vcpu *prev, struct vcpu *next) if (prev != next) update_runstate_area(prev); - if ( is_hvm_vcpu(prev) && !list_empty(&prev->arch.hvm_vcpu.tm_list) ) - pt_save_timer(prev); + if ( is_hvm_vcpu(prev) ) + { + if (prev != next) + vpmu_save(prev); + + if ( !list_empty(&prev->arch.hvm_vcpu.tm_list) ) + pt_save_timer(prev); + } local_irq_disable(); @@ -1574,6 +1580,10 @@ void context_switch(struct vcpu *prev, struct vcpu *next) (next->domain->domain_id != 0)); } + if (is_hvm_vcpu(next) && (prev != next) ) + /* Must be done with interrupts enabled */ + vpmu_load(next); + context_saved(prev); if (prev != next) diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index 89e47b3..8123f37 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -860,7 +860,6 @@ static void svm_ctxt_switch_from(struct vcpu *v) svm_fpu_leave(v); svm_save_dr(v); - vpmu_save(v); svm_lwp_save(v); svm_tsc_ratio_save(v); @@ -901,7 +900,6 @@ static void svm_ctxt_switch_to(struct vcpu *v) svm_vmsave(per_cpu(root_vmcb, cpu)); svm_vmload(vmcb); vmcb->cleanbits.bytes = 0; - vpmu_load(v); svm_lwp_load(v); svm_tsc_ratio_load(v); diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c index 6610806..a11a650 100644 --- a/xen/arch/x86/hvm/svm/vpmu.c +++ b/xen/arch/x86/hvm/svm/vpmu.c @@ -188,6 +188,21 @@ static inline void context_restore(struct vcpu *v) static void amd_vpmu_restore(struct vcpu *v) { + struct vpmu_struct *vpmu = vcpu_vpmu(v); + struct amd_vpmu_context *ctxt = vpmu->context; + + vpmu_reset(vpmu, VPMU_STOPPED); + + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + { + int i; + + for ( i = 0; i < num_counters; i++ ) + wrmsrl(ctrls[i], ctxt->ctrls[i]); + + return; + } + context_restore(v); } @@ -197,23 +212,40 @@ static inline void context_save(struct vcpu *v) struct vpmu_struct *vpmu = vcpu_vpmu(v); struct amd_vpmu_context *ctxt = vpmu->context; + /* No need to save controls -- they are saved in amd_vpmu_do_wrmsr */ for ( i = 0; i < num_counters; i++ ) - { - rdmsrl(ctrls[i], ctxt->ctrls[i]); - wrmsrl(ctrls[i], 0); rdmsrl(counters[i], ctxt->counters[i]); - } } -static void amd_vpmu_save(struct vcpu *v) +static int amd_vpmu_save(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); struct amd_vpmu_context *ctx = vpmu->context; + int i; - context_save(v); + /* + * Stop the counters. If we came here via vpmu_save_force (i.e. + * when VPMU_CONTEXT_SAVE is set) counters are already stopped. 
+ */ + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_SAVE) ) + { + vpmu_set(vpmu, VPMU_STOPPED); + for ( i = 0; i < num_counters; i++ ) + wrmsrl(ctrls[i], 0); + + return 0; + } + + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + return 0; + + context_save(v); + if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && ctx->msr_bitmap_set ) amd_vpmu_unset_msr_bitmap(v); + + return 1; } static void context_read(unsigned int msr, u64 *msr_content) @@ -286,6 +318,7 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) return 1; vpmu_set(vpmu, VPMU_RUNNING); apic_write(APIC_LVTPC, PMU_APIC_VECTOR); + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR; if ( !((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) amd_vpmu_set_msr_bitmap(v); @@ -296,16 +329,19 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) (is_pmu_enabled(msr_content) == 0) && vpmu_is_set(vpmu, VPMU_RUNNING) ) { apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | APIC_LVT_MASKED; vpmu_reset(vpmu, VPMU_RUNNING); if ( ((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) amd_vpmu_unset_msr_bitmap(v); release_pmu_ownship(PMU_OWNER_HVM); } - if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) + || vpmu_is_set(vpmu, VPMU_STOPPED) ) { context_restore(v); vpmu_set(vpmu, VPMU_CONTEXT_LOADED); + vpmu_reset(vpmu, VPMU_STOPPED); } /* Update vpmu context immediately */ @@ -321,7 +357,13 @@ static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) struct vcpu *v = current; struct vpmu_struct *vpmu = vcpu_vpmu(v); - if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + if ( vpmu_is_set(vpmu, VPMU_STOPPED) ) + { + context_restore(v); + vpmu_reset(vpmu, VPMU_STOPPED); + } + + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) context_read(msr, msr_content); else rdmsrl(msr, *msr_content); diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index a869ed4..e36dbcb 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -615,7 +615,6 @@ static void vmx_ctxt_switch_from(struct vcpu *v) vmx_save_guest_msrs(v); vmx_restore_host_msrs(); vmx_save_dr(v); - vpmu_save(v); } static void vmx_ctxt_switch_to(struct vcpu *v) @@ -640,7 +639,6 @@ static void vmx_ctxt_switch_to(struct vcpu *v) vmx_restore_guest_msrs(v); vmx_restore_dr(v); - vpmu_load(v); } diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c b/xen/arch/x86/hvm/vmx/vpmu_core2.c index 6195bfc..9f152b4 100644 --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c @@ -307,17 +307,23 @@ static inline void __core2_vpmu_save(struct vcpu *v) rdmsrl(MSR_IA32_PERFCTR0+i, core2_vpmu_cxt->arch_msr_pair[i].counter); } -static void core2_vpmu_save(struct vcpu *v) +static int core2_vpmu_save(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_SAVE) ) + return 0; + + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + return 0; + __core2_vpmu_save(v); /* Unset PMU MSR bitmap to trap lazy load. 
*/ if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && cpu_has_vmx_msr_bitmap ) core2_vpmu_unset_msr_bitmap(v->arch.hvm_vmx.msr_bitmap); - return; + return 1; } static inline void __core2_vpmu_load(struct vcpu *v) @@ -338,6 +344,11 @@ static inline void __core2_vpmu_load(struct vcpu *v) static void core2_vpmu_load(struct vcpu *v) { + struct vpmu_struct *vpmu = vcpu_vpmu(v); + + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + return; + __core2_vpmu_load(v); } @@ -539,9 +550,15 @@ static int core2_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) /* Setup LVTPC in local apic */ if ( vpmu_is_set(vpmu, VPMU_RUNNING) && is_vlapic_lvtpc_enabled(vcpu_vlapic(v)) ) + { apic_write_around(APIC_LVTPC, PMU_APIC_VECTOR); + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR; + } else + { apic_write_around(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | APIC_LVT_MASKED; + } core2_vpmu_save_msr_context(v, type, index, msr_content); if ( type != MSR_TYPE_GLOBAL ) @@ -616,6 +633,7 @@ static int core2_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) else return 0; } + return 1; } @@ -710,7 +728,8 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs) } /* HW sets the MASK bit when performance counter interrupt occurs*/ - apic_write_around(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED); + vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED; + apic_write_around(APIC_LVTPC, vpmu->hw_lapic_lvtpc); return 1; } diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c index 6b7af68..3f269d1 100644 --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -18,7 +18,6 @@ * * Author: Haitao Shan <haitao.shan@intel.com> */ - #include <xen/config.h> #include <xen/sched.h> #include <xen/xenoprof.h> @@ -42,6 +41,8 @@ static unsigned int __read_mostly opt_vpmu_enabled; static void parse_vpmu_param(char *s); custom_param("vpmu", parse_vpmu_param); +static DEFINE_PER_CPU(struct vcpu *, last_vcpu); + static void __init parse_vpmu_param(char *s) { switch ( parse_bool(s) ) @@ -121,30 +122,89 @@ void vpmu_do_cpuid(unsigned int input, vpmu->arch_vpmu_ops->do_cpuid(input, eax, ebx, ecx, edx); } +static void vpmu_save_force(void *arg) +{ + struct vcpu *v = (struct vcpu *)arg; + struct vpmu_struct *vpmu = vcpu_vpmu(v); + + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + return; + + if ( vpmu->arch_vpmu_ops ) + (void)vpmu->arch_vpmu_ops->arch_vpmu_save(v); + + vpmu_reset(vpmu, VPMU_CONTEXT_SAVE); + + per_cpu(last_vcpu, smp_processor_id()) = NULL; +} + void vpmu_save(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); + int pcpu = smp_processor_id(); if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) return; + vpmu->last_pcpu = pcpu; + per_cpu(last_vcpu, pcpu) = v; + if ( vpmu->arch_vpmu_ops ) - vpmu->arch_vpmu_ops->arch_vpmu_save(v); + if ( vpmu->arch_vpmu_ops->arch_vpmu_save(v) ) + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); - vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC); apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); - - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); } void vpmu_load(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); + int pcpu = smp_processor_id(); + struct vcpu *prev = NULL; + + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) + return; + + /* First time this VCPU is running here */ + if ( vpmu->last_pcpu != pcpu ) + { + /* + * Get the context from last pcpu that we ran on. 
Note that if another + * VCPU is running there it must have saved this VPCU''s context before + * startig to run (see below). + * There should be no race since remote pcpu will disable interrupts + * before saving the context. + */ + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) + { + vpmu_set(vpmu, VPMU_CONTEXT_SAVE); + on_selected_cpus(cpumask_of(vpmu->last_pcpu), + vpmu_save_force, (void *)v, 1); + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); + } + } + + /* Prevent forced context save from remote CPU */ + local_irq_disable(); + + prev = per_cpu(last_vcpu, pcpu); + + if (prev != v && prev) { + vpmu = vcpu_vpmu(prev); + + /* Someone ran here before us */ + vpmu_set(vpmu, VPMU_CONTEXT_SAVE); + vpmu_save_force(prev); + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); + + vpmu = vcpu_vpmu(v); + } + + local_irq_enable(); /* Only when PMU is counting, we load PMU context immediately. */ - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && - vpmu_is_set(vpmu, VPMU_RUNNING)) ) + if ( !vpmu_is_set(vpmu, VPMU_RUNNING) ) return; if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load ) diff --git a/xen/include/asm-x86/hvm/vpmu.h b/xen/include/asm-x86/hvm/vpmu.h index 01be976..5040a49 100644 --- a/xen/include/asm-x86/hvm/vpmu.h +++ b/xen/include/asm-x86/hvm/vpmu.h @@ -52,7 +52,7 @@ struct arch_vpmu_ops { unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx); void (*arch_vpmu_destroy)(struct vcpu *v); - void (*arch_vpmu_save)(struct vcpu *v); + int (*arch_vpmu_save)(struct vcpu *v); void (*arch_vpmu_load)(struct vcpu *v); void (*arch_vpmu_dump)(struct vcpu *v); }; @@ -62,6 +62,7 @@ int svm_vpmu_initialise(struct vcpu *, unsigned int flags); struct vpmu_struct { u32 flags; + u32 last_pcpu; u32 hw_lapic_lvtpc; void *context; struct arch_vpmu_ops *arch_vpmu_ops; @@ -73,6 +74,8 @@ struct vpmu_struct { #define VPMU_PASSIVE_DOMAIN_ALLOCATED 0x8 #define VPMU_CPU_HAS_DS 0x10 /* Has Debug Store */ #define VPMU_CPU_HAS_BTS 0x20 /* Has Branch Trace Store */ +#define VPMU_CONTEXT_SAVE 0x40 /* Force context save */ +#define VPMU_STOPPED 0x80 /* Stop counters while VCPU is not running */ #define vpmu_set(_vpmu, _x) ((_vpmu)->flags |= (_x)) -- 1.8.1.2
Boris Ostrovsky
2013-Apr-09 17:26 UTC
[PATCH 8/8] x86/AMD: Clean up context_update() in AMD VPMU code
Clean up context_update() in AMD VPMU code. Rename restore routine to "load" to be consistent with Intel code and with arch_vpmu_ops names Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- xen/arch/x86/hvm/svm/vpmu.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c index a11a650..74c9eec 100644 --- a/xen/arch/x86/hvm/svm/vpmu.c +++ b/xen/arch/x86/hvm/svm/vpmu.c @@ -173,7 +173,7 @@ static int amd_vpmu_do_interrupt(struct cpu_user_regs *regs) return 1; } -static inline void context_restore(struct vcpu *v) +static inline void context_load(struct vcpu *v) { unsigned int i; struct vpmu_struct *vpmu = vcpu_vpmu(v); @@ -186,7 +186,7 @@ static inline void context_restore(struct vcpu *v) } } -static void amd_vpmu_restore(struct vcpu *v) +static void amd_vpmu_load(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); struct amd_vpmu_context *ctxt = vpmu->context; @@ -203,7 +203,7 @@ static void amd_vpmu_restore(struct vcpu *v) return; } - context_restore(v); + context_load(v); } static inline void context_save(struct vcpu *v) @@ -290,12 +290,18 @@ static void context_update(unsigned int msr, u64 msr_content) } for ( i = 0; i < num_counters; i++ ) - if ( msr == counters[i] ) + { + if ( msr == ctrls[i] ) + { + ctxt->ctrls[i] = msr_content; + return; + } + else if (msr == counters[i] ) + { ctxt->counters[i] = msr_content; - - for ( i = 0; i < num_counters; i++ ) - if ( msr == ctrls[i] ) - ctxt->ctrls[i] = msr_content; + return; + } + } } static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) @@ -339,7 +345,7 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) || vpmu_is_set(vpmu, VPMU_STOPPED) ) { - context_restore(v); + context_load(v); vpmu_set(vpmu, VPMU_CONTEXT_LOADED); vpmu_reset(vpmu, VPMU_STOPPED); } @@ -359,7 +365,7 @@ static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) if ( vpmu_is_set(vpmu, VPMU_STOPPED) ) { - context_restore(v); + context_load(v); vpmu_reset(vpmu, VPMU_STOPPED); } @@ -443,7 +449,7 @@ struct arch_vpmu_ops amd_vpmu_ops = { .do_interrupt = amd_vpmu_do_interrupt, .arch_vpmu_destroy = amd_vpmu_destroy, .arch_vpmu_save = amd_vpmu_save, - .arch_vpmu_load = amd_vpmu_restore + .arch_vpmu_load = amd_vpmu_load }; int svm_vpmu_initialise(struct vcpu *v, unsigned int vpmu_flags) -- 1.8.1.2
Dietmar Hahn
2013-Apr-10 08:57 UTC
Re: [PATCH 7/8] x86/VPMU: Save/restore VPMU only when necessary
Am Dienstag 09 April 2013, 13:26:18 schrieb Boris Ostrovsky:> VPMU doesn''t need to always be saved during context switch. If we are > comming back to the same processor and no other VPCU has run here we can > simply continue running. This is especailly useful on Intel processors > where Global Control MSR is stored in VMCS, thus not requiring us to stop > the counters during save operation. On AMD we need to explicitly stop the > counters but we don''t need to save them. > > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>Only minor comments below. Dietmar.> --- > xen/arch/x86/domain.c | 14 ++++++-- > xen/arch/x86/hvm/svm/svm.c | 2 -- > xen/arch/x86/hvm/svm/vpmu.c | 58 +++++++++++++++++++++++++----- > xen/arch/x86/hvm/vmx/vmx.c | 2 -- > xen/arch/x86/hvm/vmx/vpmu_core2.c | 25 +++++++++++-- > xen/arch/x86/hvm/vpmu.c | 74 +++++++++++++++++++++++++++++++++++---- > xen/include/asm-x86/hvm/vpmu.h | 5 ++- > 7 files changed, 155 insertions(+), 25 deletions(-) > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 0d7a40c..6f9cbed 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -1535,8 +1535,14 @@ void context_switch(struct vcpu *prev, struct vcpu *next) > if (prev != next) > update_runstate_area(prev); > > - if ( is_hvm_vcpu(prev) && !list_empty(&prev->arch.hvm_vcpu.tm_list) ) > - pt_save_timer(prev); > + if ( is_hvm_vcpu(prev) ) > + { > + if (prev != next) > + vpmu_save(prev); > + > + if ( !list_empty(&prev->arch.hvm_vcpu.tm_list) ) > + pt_save_timer(prev); > + } > > local_irq_disable(); > > @@ -1574,6 +1580,10 @@ void context_switch(struct vcpu *prev, struct vcpu *next) > (next->domain->domain_id != 0)); > } > > + if (is_hvm_vcpu(next) && (prev != next) ) > + /* Must be done with interrupts enabled */ > + vpmu_load(next); > + > context_saved(prev); > > if (prev != next) > diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c > index 89e47b3..8123f37 100644 > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -860,7 +860,6 @@ static void svm_ctxt_switch_from(struct vcpu *v) > svm_fpu_leave(v); > > svm_save_dr(v); > - vpmu_save(v); > svm_lwp_save(v); > svm_tsc_ratio_save(v); > > @@ -901,7 +900,6 @@ static void svm_ctxt_switch_to(struct vcpu *v) > svm_vmsave(per_cpu(root_vmcb, cpu)); > svm_vmload(vmcb); > vmcb->cleanbits.bytes = 0; > - vpmu_load(v); > svm_lwp_load(v); > svm_tsc_ratio_load(v); > > diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c > index 6610806..a11a650 100644 > --- a/xen/arch/x86/hvm/svm/vpmu.c > +++ b/xen/arch/x86/hvm/svm/vpmu.c > @@ -188,6 +188,21 @@ static inline void context_restore(struct vcpu *v) > > static void amd_vpmu_restore(struct vcpu *v) > { > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + struct amd_vpmu_context *ctxt = vpmu->context; > + > + vpmu_reset(vpmu, VPMU_STOPPED); > + > + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )Trailing space.> + { > + int i; > + > + for ( i = 0; i < num_counters; i++ ) > + wrmsrl(ctrls[i], ctxt->ctrls[i]); > + > + return; > + } > + > context_restore(v); > } > > @@ -197,23 +212,40 @@ static inline void context_save(struct vcpu *v) > struct vpmu_struct *vpmu = vcpu_vpmu(v); > struct amd_vpmu_context *ctxt = vpmu->context; > > + /* No need to save controls -- they are saved in amd_vpmu_do_wrmsr */ > for ( i = 0; i < num_counters; i++ ) > - { > - rdmsrl(ctrls[i], ctxt->ctrls[i]); > - wrmsrl(ctrls[i], 0); > rdmsrl(counters[i], ctxt->counters[i]); > - } > } > > -static void amd_vpmu_save(struct vcpu *v) > +static int 
amd_vpmu_save(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > struct amd_vpmu_context *ctx = vpmu->context; > + int i; > > - context_save(v); > + /*Same here.> + * Stop the counters. If we came here via vpmu_save_force (i.e. > + * when VPMU_CONTEXT_SAVE is set) counters are already stopped. > + */ > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_SAVE) )Same here.> + { > + vpmu_set(vpmu, VPMU_STOPPED); > > + for ( i = 0; i < num_counters; i++ ) > + wrmsrl(ctrls[i], 0); > + > + return 0; > + } > + > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + return 0; > +Same here.> + context_save(v); > +Same here.> if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && ctx->msr_bitmap_set ) > amd_vpmu_unset_msr_bitmap(v); > + > + return 1; > } > > static void context_read(unsigned int msr, u64 *msr_content) > @@ -286,6 +318,7 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > return 1; > vpmu_set(vpmu, VPMU_RUNNING); > apic_write(APIC_LVTPC, PMU_APIC_VECTOR); > + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR; > > if ( !((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) > amd_vpmu_set_msr_bitmap(v); > @@ -296,16 +329,19 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > (is_pmu_enabled(msr_content) == 0) && vpmu_is_set(vpmu, VPMU_RUNNING) ) > { > apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); > + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | APIC_LVT_MASKED; > vpmu_reset(vpmu, VPMU_RUNNING); > if ( ((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set ) > amd_vpmu_unset_msr_bitmap(v); > release_pmu_ownship(PMU_OWNER_HVM); > } > > - if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) > + || vpmu_is_set(vpmu, VPMU_STOPPED) ) > { > context_restore(v); > vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > + vpmu_reset(vpmu, VPMU_STOPPED); > } > > /* Update vpmu context immediately */ > @@ -321,7 +357,13 @@ static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) > struct vcpu *v = current; > struct vpmu_struct *vpmu = vcpu_vpmu(v); > > - if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + if ( vpmu_is_set(vpmu, VPMU_STOPPED) ) > + { > + context_restore(v); > + vpmu_reset(vpmu, VPMU_STOPPED); > + } > + > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )Same here.> context_read(msr, msr_content); > else > rdmsrl(msr, *msr_content); > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > index a869ed4..e36dbcb 100644 > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -615,7 +615,6 @@ static void vmx_ctxt_switch_from(struct vcpu *v) > vmx_save_guest_msrs(v); > vmx_restore_host_msrs(); > vmx_save_dr(v); > - vpmu_save(v); > } > > static void vmx_ctxt_switch_to(struct vcpu *v) > @@ -640,7 +639,6 @@ static void vmx_ctxt_switch_to(struct vcpu *v) > > vmx_restore_guest_msrs(v); > vmx_restore_dr(v); > - vpmu_load(v); > } > > > diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c b/xen/arch/x86/hvm/vmx/vpmu_core2.c > index 6195bfc..9f152b4 100644 > --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c > +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c > @@ -307,17 +307,23 @@ static inline void __core2_vpmu_save(struct vcpu *v) > rdmsrl(MSR_IA32_PERFCTR0+i, core2_vpmu_cxt->arch_msr_pair[i].counter); > } > > -static void core2_vpmu_save(struct vcpu *v) > +static int core2_vpmu_save(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_SAVE) ) > + return 0; > + > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + return 0; > + > __core2_vpmu_save(v); > > 
/* Unset PMU MSR bitmap to trap lazy load. */ > if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && cpu_has_vmx_msr_bitmap ) > core2_vpmu_unset_msr_bitmap(v->arch.hvm_vmx.msr_bitmap); > > - return; > + return 1; > } > > static inline void __core2_vpmu_load(struct vcpu *v) > @@ -338,6 +344,11 @@ static inline void __core2_vpmu_load(struct vcpu *v) > > static void core2_vpmu_load(struct vcpu *v) > { > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + > + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + return; > + > __core2_vpmu_load(v); > } > > @@ -539,9 +550,15 @@ static int core2_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > /* Setup LVTPC in local apic */ > if ( vpmu_is_set(vpmu, VPMU_RUNNING) && > is_vlapic_lvtpc_enabled(vcpu_vlapic(v)) ) > + { > apic_write_around(APIC_LVTPC, PMU_APIC_VECTOR); > + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR; > + } > else > + { > apic_write_around(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); > + vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | APIC_LVT_MASKED; > + } > > core2_vpmu_save_msr_context(v, type, index, msr_content); > if ( type != MSR_TYPE_GLOBAL ) > @@ -616,6 +633,7 @@ static int core2_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) > else > return 0; > } > + > return 1; > } > > @@ -710,7 +728,8 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs) > } > > /* HW sets the MASK bit when performance counter interrupt occurs*/ > - apic_write_around(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED); > + vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED; > + apic_write_around(APIC_LVTPC, vpmu->hw_lapic_lvtpc); > > return 1; > } > diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c > index 6b7af68..3f269d1 100644 > --- a/xen/arch/x86/hvm/vpmu.c > +++ b/xen/arch/x86/hvm/vpmu.c > @@ -18,7 +18,6 @@ > * > * Author: Haitao Shan <haitao.shan@intel.com> > */ > - > #include <xen/config.h> > #include <xen/sched.h> > #include <xen/xenoprof.h> > @@ -42,6 +41,8 @@ static unsigned int __read_mostly opt_vpmu_enabled; > static void parse_vpmu_param(char *s); > custom_param("vpmu", parse_vpmu_param); > > +static DEFINE_PER_CPU(struct vcpu *, last_vcpu); > + > static void __init parse_vpmu_param(char *s) > { > switch ( parse_bool(s) ) > @@ -121,30 +122,89 @@ void vpmu_do_cpuid(unsigned int input, > vpmu->arch_vpmu_ops->do_cpuid(input, eax, ebx, ecx, edx); > } > > +static void vpmu_save_force(void *arg) > +{ > + struct vcpu *v = (struct vcpu *)arg; > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + return; > + > + if ( vpmu->arch_vpmu_ops ) > + (void)vpmu->arch_vpmu_ops->arch_vpmu_save(v); > + > + vpmu_reset(vpmu, VPMU_CONTEXT_SAVE); > + > + per_cpu(last_vcpu, smp_processor_id()) = NULL; > +} > + > void vpmu_save(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > + int pcpu = smp_processor_id(); > > if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) > return; > > + vpmu->last_pcpu = pcpu; > + per_cpu(last_vcpu, pcpu) = v; > + > if ( vpmu->arch_vpmu_ops ) > - vpmu->arch_vpmu_ops->arch_vpmu_save(v); > + if ( vpmu->arch_vpmu_ops->arch_vpmu_save(v) ) > + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > > - vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC); > apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); > - > - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > } > > void vpmu_load(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > + int pcpu = smp_processor_id(); > + struct vcpu *prev = NULL; > + > + if ( 
!vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) > + return; > + > + /* First time this VCPU is running here */ > + if ( vpmu->last_pcpu != pcpu ) > + { > + /* > + * Get the context from last pcpu that we ran on. Note that if another > + * VCPU is running there it must have saved this VPCU''s context before > + * startig to run (see below). > + * There should be no race since remote pcpu will disable interrupts > + * before saving the context. > + */ > + if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + { > + vpmu_set(vpmu, VPMU_CONTEXT_SAVE); > + on_selected_cpus(cpumask_of(vpmu->last_pcpu), > + vpmu_save_force, (void *)v, 1); > + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > + } > + } > + > + /* Prevent forced context save from remote CPU */ > + local_irq_disable(); > + > + prev = per_cpu(last_vcpu, pcpu); > + > + if (prev != v && prev) { > + vpmu = vcpu_vpmu(prev); > + > + /* Someone ran here before us */ > + vpmu_set(vpmu, VPMU_CONTEXT_SAVE); > + vpmu_save_force(prev); > + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > + > + vpmu = vcpu_vpmu(v); > + } > + > + local_irq_enable(); > > /* Only when PMU is counting, we load PMU context immediately. */ > - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_RUNNING)) ) > + if ( !vpmu_is_set(vpmu, VPMU_RUNNING) ) > return; > > if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load ) > diff --git a/xen/include/asm-x86/hvm/vpmu.h b/xen/include/asm-x86/hvm/vpmu.h > index 01be976..5040a49 100644 > --- a/xen/include/asm-x86/hvm/vpmu.h > +++ b/xen/include/asm-x86/hvm/vpmu.h > @@ -52,7 +52,7 @@ struct arch_vpmu_ops { > unsigned int *eax, unsigned int *ebx, > unsigned int *ecx, unsigned int *edx); > void (*arch_vpmu_destroy)(struct vcpu *v); > - void (*arch_vpmu_save)(struct vcpu *v); > + int (*arch_vpmu_save)(struct vcpu *v); > void (*arch_vpmu_load)(struct vcpu *v); > void (*arch_vpmu_dump)(struct vcpu *v); > }; > @@ -62,6 +62,7 @@ int svm_vpmu_initialise(struct vcpu *, unsigned int flags); > > struct vpmu_struct { > u32 flags; > + u32 last_pcpu; > u32 hw_lapic_lvtpc; > void *context; > struct arch_vpmu_ops *arch_vpmu_ops; > @@ -73,6 +74,8 @@ struct vpmu_struct { > #define VPMU_PASSIVE_DOMAIN_ALLOCATED 0x8 > #define VPMU_CPU_HAS_DS 0x10 /* Has Debug Store */ > #define VPMU_CPU_HAS_BTS 0x20 /* Has Branch Trace Store */ > +#define VPMU_CONTEXT_SAVE 0x40 /* Force context save */ > +#define VPMU_STOPPED 0x80 /* Stop counters while VCPU is not running */Would it be better to group the VPMU_ state and context flags together and move special cpu flags behind?> > > #define vpmu_set(_vpmu, _x) ((_vpmu)->flags |= (_x)) >-- Company details: http://ts.fujitsu.com/imprint.html
Am Dienstag 09 April 2013, 13:26:11 schrieb Boris Ostrovsky:> Here is a set of VPMU changes that I thought might be useful. > > The first two patches are to avoid VMEXITs on certain MSR accesses. This > is already part of VMX so I added similar SVM code > > The third patch should address the problem that Suravee mentioned on the > list a few weeks ago (http://lists.xen.org/archives/html/xen-devel/2013-03/msg00087.html). > It''s a slightly different solution then what he suggested. > > 4th patch stops counters on AMD when VCPU is de-scheduled > > 5th is trivial Haswell support (new model number) > > 6th patch is trying to factor out common code from VMX and SVM. > > 7th is lazy VPMU save/restore. It is optimized for the case when CPUs are > not over-subscribed and VCPU stays on the same processor most of the time. > It is more beneficial on Intel processors because HW will keep track of > guest/host global control register in VMCS and tehrefore we don''t need to > explicitly stop the counters. On AMD we do need to do this and so while > there is improvement, it is not as pronounced. > > Here are some numbers that I managed to collect while running guests with > oprofile. This is number of executed instructions in vpmu_save/vpmu_load. > > Eager VPMU Lazy VPMU > Save Restore Save Restore > Intel 181 225 46 50 > AMD 132 104 80 102 > > When processors are oversubscribed, lazy restore may take about 2.5 times > as many instructions as in the dedicated case if new VCPU jumps onto the > processor (which doesn''t happen on every context switch). > > > -borisGood work! Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> For Intel: Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Dietmar.> > > Boris Ostrovsky (8): > x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map > x86/AMD: Do not intercept access to performance counters MSRs > x86/AMD: Read VPMU MSRs from context when it is not loaded into HW > x86/AMD: Stop counters on VPMU save > x86/VPMU: Add Haswell support > x86/VPMU: Factor out VPMU common code > x86/VPMU: Save/restore VPMU only when necessary > x86/AMD: Clean up context_update() in AMD VPMU code > > xen/arch/x86/domain.c | 14 ++- > xen/arch/x86/hvm/svm/svm.c | 19 ++-- > xen/arch/x86/hvm/svm/vpmu.c | 188 +++++++++++++++++++++++-------- > xen/arch/x86/hvm/vmx/vmx.c | 2 - > xen/arch/x86/hvm/vmx/vpmu_core2.c | 48 ++++---- > xen/arch/x86/hvm/vpmu.c | 114 +++++++++++++++++-- > xen/include/asm-x86/hvm/svm/vmcb.h | 8 +- > xen/include/asm-x86/hvm/vmx/vpmu_core2.h | 1 - > xen/include/asm-x86/hvm/vpmu.h | 6 +- > 9 files changed, 296 insertions(+), 104 deletions(-) > >-- Company details: http://ts.fujitsu.com/imprint.html
Boris Ostrovsky
2013-Apr-10 12:53 UTC
Re: [PATCH 7/8] x86/VPMU: Save/restore VPMU only when necessary
On 04/10/2013 04:57 AM, Dietmar Hahn wrote:
>
>  struct vpmu_struct {
>      u32 flags;
> +    u32 last_pcpu;
>      u32 hw_lapic_lvtpc;
>      void *context;
>      struct arch_vpmu_ops *arch_vpmu_ops;
> @@ -73,6 +74,8 @@ struct vpmu_struct {
>  #define VPMU_PASSIVE_DOMAIN_ALLOCATED 0x8
>  #define VPMU_CPU_HAS_DS               0x10 /* Has Debug Store */
>  #define VPMU_CPU_HAS_BTS              0x20 /* Has Branch Trace Store */
> +#define VPMU_CONTEXT_SAVE             0x40 /* Force context save */
> +#define VPMU_STOPPED                  0x80 /* Stop counters while VCPU is not running */
> Would it be better to group the VPMU_ state and context flags together and move
> special cpu flags behind?

Yes, I should do this. I'll wait for other comments and will resend the
patches. (And will add VPMU dump code for AMD, similar to what you did
for Intel.)

Thanks.
-boris
Jan Beulich
2013-Apr-10 13:25 UTC
Re: [PATCH 2/8] x86/AMD: Do not intercept access to performance counters MSRs
>>> On 09.04.13 at 19:26, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> @@ -138,6 +139,36 @@ static inline u32 get_fam15h_addr(u32 addr)
>      return addr;
>  }
>  
> +static void amd_vpmu_set_msr_bitmap(struct vcpu *v)
> +{
> +    int i;

Please, here and elsewhere, use unsigned int for variables used as array
index, unless some specific use requires them to be signed.

Jan

> +    struct vpmu_struct *vpmu = vcpu_vpmu(v);
> +    struct amd_vpmu_context *ctxt = vpmu->context;
> +
> +    for ( i = 0; i < num_counters; i++ )
> +    {
> +        svm_intercept_msr(v, counters[i], MSR_INTERCEPT_NONE);
> +        svm_intercept_msr(v, ctrls[i], MSR_INTERCEPT_WRITE);
> +    }
> +
> +    ctxt->msr_bitmap_set = 1;
> +}
Acked-by: Jun Nakajima <jun.nakajima@intel.com> On Tue, Apr 9, 2013 at 10:26 AM, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:> Factor out common code from SVM amd VMX into VPMU. > > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > xen/arch/x86/hvm/svm/vpmu.c | 37 ----------------------- > xen/arch/x86/hvm/vmx/vpmu_core2.c | 30 +------------------ > xen/arch/x86/hvm/vpmu.c | 50 > ++++++++++++++++++++++++++++---- > xen/include/asm-x86/hvm/vmx/vpmu_core2.h | 1 - > xen/include/asm-x86/hvm/vpmu.h | 1 + > 5 files changed, 47 insertions(+), 72 deletions(-) > > diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c > index 2f4b6e4..6610806 100644 > --- a/xen/arch/x86/hvm/svm/vpmu.c > +++ b/xen/arch/x86/hvm/svm/vpmu.c > @@ -87,7 +87,6 @@ static const u32 AMD_F15H_CTRLS[] = { > struct amd_vpmu_context { > u64 counters[MAX_NUM_COUNTERS]; > u64 ctrls[MAX_NUM_COUNTERS]; > - u32 hw_lapic_lvtpc; > bool_t msr_bitmap_set; > }; > > @@ -171,22 +170,6 @@ static void amd_vpmu_unset_msr_bitmap(struct vcpu *v) > > static int amd_vpmu_do_interrupt(struct cpu_user_regs *regs) > { > - struct vcpu *v = current; > - struct vlapic *vlapic = vcpu_vlapic(v); > - u32 vlapic_lvtpc; > - unsigned char int_vec; > - > - if ( !is_vlapic_lvtpc_enabled(vlapic) ) > - return 0; > - > - vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); > - int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; > - > - if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) > - vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); > - else > - v->nmi_pending = 1; > - > return 1; > } > > @@ -205,17 +188,7 @@ static inline void context_restore(struct vcpu *v) > > static void amd_vpmu_restore(struct vcpu *v) > { > - struct vpmu_struct *vpmu = vcpu_vpmu(v); > - struct amd_vpmu_context *ctxt = vpmu->context; > - > - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_RUNNING)) ) > - return; > - > - apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc); > context_restore(v); > - > - vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > } > > static inline void context_save(struct vcpu *v) > @@ -237,18 +210,10 @@ static void amd_vpmu_save(struct vcpu *v) > struct vpmu_struct *vpmu = vcpu_vpmu(v); > struct amd_vpmu_context *ctx = vpmu->context; > > - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) > - return; > - > context_save(v); > > if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && ctx->msr_bitmap_set ) > amd_vpmu_unset_msr_bitmap(v); > - > - ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); > - apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); > - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > } > > static void context_read(unsigned int msr, u64 *msr_content) > @@ -299,8 +264,6 @@ static void context_update(unsigned int msr, u64 > msr_content) > for ( i = 0; i < num_counters; i++ ) > if ( msr == ctrls[i] ) > ctxt->ctrls[i] = msr_content; > - > - ctxt->hw_lapic_lvtpc = apic_read(APIC_LVTPC); > } > > static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c > b/xen/arch/x86/hvm/vmx/vpmu_core2.c > index 7c86a0b..6195bfc 100644 > --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c > +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c > @@ -305,25 +305,18 @@ static inline void __core2_vpmu_save(struct vcpu *v) > rdmsrl(core2_fix_counters.msr[i], > core2_vpmu_cxt->fix_counters[i]); > for ( i = 0; i < core2_get_pmc_count(); i++ ) > rdmsrl(MSR_IA32_PERFCTR0+i, > core2_vpmu_cxt->arch_msr_pair[i].counter); > - core2_vpmu_cxt->hw_lapic_lvtpc = 
apic_read(APIC_LVTPC); > - apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); > } > > static void core2_vpmu_save(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > > - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) > - return; > - > __core2_vpmu_save(v); > > /* Unset PMU MSR bitmap to trap lazy load. */ > if ( !vpmu_is_set(vpmu, VPMU_RUNNING) && cpu_has_vmx_msr_bitmap ) > core2_vpmu_unset_msr_bitmap(v->arch.hvm_vmx.msr_bitmap); > > - vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > return; > } > > @@ -341,20 +334,11 @@ static inline void __core2_vpmu_load(struct vcpu *v) > wrmsrl(core2_ctrls.msr[i], core2_vpmu_cxt->ctrls[i]); > for ( i = 0; i < core2_get_pmc_count(); i++ ) > wrmsrl(MSR_P6_EVNTSEL0+i, > core2_vpmu_cxt->arch_msr_pair[i].control); > - > - apic_write_around(APIC_LVTPC, core2_vpmu_cxt->hw_lapic_lvtpc); > } > > static void core2_vpmu_load(struct vcpu *v) > { > - struct vpmu_struct *vpmu = vcpu_vpmu(v); > - > - /* Only when PMU is counting, we load PMU context immediately. */ > - if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_RUNNING)) ) > - return; > __core2_vpmu_load(v); > - vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > } > > static int core2_vpmu_alloc_resource(struct vcpu *v) > @@ -705,11 +689,8 @@ static int core2_vpmu_do_interrupt(struct > cpu_user_regs *regs) > { > struct vcpu *v = current; > u64 msr_content; > - u32 vlapic_lvtpc; > - unsigned char int_vec; > struct vpmu_struct *vpmu = vcpu_vpmu(v); > struct core2_vpmu_context *core2_vpmu_cxt = vpmu->context; > - struct vlapic *vlapic = vcpu_vlapic(v); > > rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content); > if ( msr_content ) > @@ -728,18 +709,9 @@ static int core2_vpmu_do_interrupt(struct > cpu_user_regs *regs) > return 0; > } > > + /* HW sets the MASK bit when performance counter interrupt occurs*/ > apic_write_around(APIC_LVTPC, apic_read(APIC_LVTPC) & > ~APIC_LVT_MASKED); > > - if ( !is_vlapic_lvtpc_enabled(vlapic) ) > - return 1; > - > - vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); > - int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; > - vlapic_set_reg(vlapic, APIC_LVTPC, vlapic_lvtpc | APIC_LVT_MASKED); > - if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) > - vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); > - else > - v->nmi_pending = 1; > return 1; > } > > diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c > index 3b69580..6b7af68 100644 > --- a/xen/arch/x86/hvm/vpmu.c > +++ b/xen/arch/x86/hvm/vpmu.c > @@ -31,7 +31,7 @@ > #include <asm/hvm/vpmu.h> > #include <asm/hvm/svm/svm.h> > #include <asm/hvm/svm/vmcb.h> > - > +#include <asm/apic.h> > > /* > * "vpmu" : vpmu generally enabled > @@ -83,10 +83,31 @@ int vpmu_do_rdmsr(unsigned int msr, uint64_t > *msr_content) > > int vpmu_do_interrupt(struct cpu_user_regs *regs) > { > - struct vpmu_struct *vpmu = vcpu_vpmu(current); > + struct vcpu *v = current; > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + > + if ( vpmu->arch_vpmu_ops ) > + { > + struct vlapic *vlapic = vcpu_vlapic(v); > + u32 vlapic_lvtpc; > + unsigned char int_vec; > + > + if ( !vpmu->arch_vpmu_ops->do_interrupt(regs) ) > + return 0; > + > + if ( !is_vlapic_lvtpc_enabled(vlapic) ) > + return 1; > + > + vlapic_lvtpc = vlapic_get_reg(vlapic, APIC_LVTPC); > + int_vec = vlapic_lvtpc & APIC_VECTOR_MASK; > + > + if ( GET_APIC_DELIVERY_MODE(vlapic_lvtpc) == APIC_MODE_FIXED ) > + vlapic_set_irq(vcpu_vlapic(v), int_vec, 0); > + else > + v->nmi_pending = 1; > + return 1; > + } > > - if ( 
vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_interrupt ) > - return vpmu->arch_vpmu_ops->do_interrupt(regs); > return 0; > } > > @@ -103,17 +124,36 @@ void vpmu_do_cpuid(unsigned int input, > void vpmu_save(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > + > + if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > + vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) > + return; > > - if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_save ) > + if ( vpmu->arch_vpmu_ops ) > vpmu->arch_vpmu_ops->arch_vpmu_save(v); > + > + vpmu->hw_lapic_lvtpc = apic_read(APIC_LVTPC); > + apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED); > + > + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > } > > void vpmu_load(struct vcpu *v) > { > struct vpmu_struct *vpmu = vcpu_vpmu(v); > > + /* Only when PMU is counting, we load PMU context immediately. */ > + if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > + vpmu_is_set(vpmu, VPMU_RUNNING)) ) > + return; > + > if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load ) > + { > + apic_write_around(APIC_LVTPC, vpmu->hw_lapic_lvtpc); > vpmu->arch_vpmu_ops->arch_vpmu_load(v); > + } > + > + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > } > > void vpmu_initialise(struct vcpu *v) > diff --git a/xen/include/asm-x86/hvm/vmx/vpmu_core2.h > b/xen/include/asm-x86/hvm/vmx/vpmu_core2.h > index 4128f2a..60b05fd 100644 > --- a/xen/include/asm-x86/hvm/vmx/vpmu_core2.h > +++ b/xen/include/asm-x86/hvm/vmx/vpmu_core2.h > @@ -44,7 +44,6 @@ struct core2_vpmu_context { > u64 fix_counters[VPMU_CORE2_NUM_FIXED]; > u64 ctrls[VPMU_CORE2_NUM_CTRLS]; > u64 global_ovf_status; > - u32 hw_lapic_lvtpc; > struct arch_msr_pair arch_msr_pair[1]; > }; > > diff --git a/xen/include/asm-x86/hvm/vpmu.h > b/xen/include/asm-x86/hvm/vpmu.h > index cd31f5e..01be976 100644 > --- a/xen/include/asm-x86/hvm/vpmu.h > +++ b/xen/include/asm-x86/hvm/vpmu.h > @@ -62,6 +62,7 @@ int svm_vpmu_initialise(struct vcpu *, unsigned int > flags); > > struct vpmu_struct { > u32 flags; > + u32 hw_lapic_lvtpc; > void *context; > struct arch_vpmu_ops *arch_vpmu_ops; > }; > -- > 1.8.1.2 > >-- Jun Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Boris, I am trying the patch set on the trinity and ran into the following issue: root@sos-dev02:/sandbox/xen/xen-git-boris# xl create /sandbox/vm-images/xen-configs/ubuntu-12.10_dom.hvm Parsing config from /sandbox/vm-images/xen-configs/ubuntu-12.10_dom.hvm libxl: error: libxl_create.c:416:libxl__domain_make: domain creation fail libxl: error: libxl_create.c:644:initiate_domain_create: cannot make domain: -3 libxl: error: libxl.c:1377:libxl__destroy_domid: non-existant domain -1 libxl: error: libxl.c:1341:domain_destroy_callback: unable to destroy guest with domid 4294967295 libxl: error: libxl_create.c:1208:domcreate_destruction_cb: unable to destroy domain 4294967295 following failed creation root@sos-dev02:/sandbox/xen/xen-git-boris# The same hvm config file starts up fine on older hypervisor. Any clues? Suravee On 4/9/2013 12:26 PM, Boris Ostrovsky wrote:> Here is a set of VPMU changes that I thought might be useful. > > The first two patches are to avoid VMEXITs on certain MSR accesses. This > is already part of VMX so I added similar SVM code > > The third patch should address the problem that Suravee mentioned on the > list a few weeks ago (http://lists.xen.org/archives/html/xen-devel/2013-03/msg00087.html). > It''s a slightly different solution then what he suggested. > > 4th patch stops counters on AMD when VCPU is de-scheduled > > 5th is trivial Haswell support (new model number) > > 6th patch is trying to factor out common code from VMX and SVM. > > 7th is lazy VPMU save/restore. It is optimized for the case when CPUs are > not over-subscribed and VCPU stays on the same processor most of the time. > It is more beneficial on Intel processors because HW will keep track of > guest/host global control register in VMCS and tehrefore we don''t need to > explicitly stop the counters. On AMD we do need to do this and so while > there is improvement, it is not as pronounced. > > Here are some numbers that I managed to collect while running guests with > oprofile. This is number of executed instructions in vpmu_save/vpmu_load. > > Eager VPMU Lazy VPMU > Save Restore Save Restore > Intel 181 225 46 50 > AMD 132 104 80 102 > > When processors are oversubscribed, lazy restore may take about 2.5 times > as many instructions as in the dedicated case if new VCPU jumps onto the > processor (which doesn''t happen on every context switch). > > > -boris > > > Boris Ostrovsky (8): > x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map > x86/AMD: Do not intercept access to performance counters MSRs > x86/AMD: Read VPMU MSRs from context when it is not loaded into HW > x86/AMD: Stop counters on VPMU save > x86/VPMU: Add Haswell support > x86/VPMU: Factor out VPMU common code > x86/VPMU: Save/restore VPMU only when necessary > x86/AMD: Clean up context_update() in AMD VPMU code > > xen/arch/x86/domain.c | 14 ++- > xen/arch/x86/hvm/svm/svm.c | 19 ++-- > xen/arch/x86/hvm/svm/vpmu.c | 188 +++++++++++++++++++++++-------- > xen/arch/x86/hvm/vmx/vmx.c | 2 - > xen/arch/x86/hvm/vmx/vpmu_core2.c | 48 ++++---- > xen/arch/x86/hvm/vpmu.c | 114 +++++++++++++++++-- > xen/include/asm-x86/hvm/svm/vmcb.h | 8 +- > xen/include/asm-x86/hvm/vmx/vpmu_core2.h | 1 - > xen/include/asm-x86/hvm/vpmu.h | 6 +- > 9 files changed, 296 insertions(+), 104 deletions(-) >
On 04/10/2013 02:49 PM, Suravee Suthikulanit wrote:> Boris, > > I am trying the patch set on the trinity and ran into the following > issue: > > root@sos-dev02:/sandbox/xen/xen-git-boris# xl create > /sandbox/vm-images/xen-configs/ubuntu-12.10_dom.hvm > Parsing config from /sandbox/vm-images/xen-configs/ubuntu-12.10_dom.hvm > libxl: error: libxl_create.c:416:libxl__domain_make: domain creation fail > libxl: error: libxl_create.c:644:initiate_domain_create: cannot make > domain: -3 > libxl: error: libxl.c:1377:libxl__destroy_domid: non-existant domain -1 > libxl: error: libxl.c:1341:domain_destroy_callback: unable to destroy > guest with domid 4294967295 > libxl: error: libxl_create.c:1208:domcreate_destruction_cb: unable to > destroy domain 4294967295 following failed creation > root@sos-dev02:/sandbox/xen/xen-git-boris# > > The same hvm config file starts up fine on older hypervisor. Any clues?This is failure during domain creation and the patchset is not touching any of that code. Is there anything in xl dmesg or /var/log/xen? -boris> > Suravee > > On 4/9/2013 12:26 PM, Boris Ostrovsky wrote: >> Here is a set of VPMU changes that I thought might be useful. >> >> The first two patches are to avoid VMEXITs on certain MSR accesses. This >> is already part of VMX so I added similar SVM code >> >> The third patch should address the problem that Suravee mentioned on the >> list a few weeks ago >> (http://lists.xen.org/archives/html/xen-devel/2013-03/msg00087.html). >> It''s a slightly different solution then what he suggested. >> >> 4th patch stops counters on AMD when VCPU is de-scheduled >> >> 5th is trivial Haswell support (new model number) >> >> 6th patch is trying to factor out common code from VMX and SVM. >> >> 7th is lazy VPMU save/restore. It is optimized for the case when CPUs >> are >> not over-subscribed and VCPU stays on the same processor most of the >> time. >> It is more beneficial on Intel processors because HW will keep track of >> guest/host global control register in VMCS and tehrefore we don''t >> need to >> explicitly stop the counters. On AMD we do need to do this and so while >> there is improvement, it is not as pronounced. >> >> Here are some numbers that I managed to collect while running guests >> with >> oprofile. This is number of executed instructions in >> vpmu_save/vpmu_load. >> >> Eager VPMU Lazy VPMU >> Save Restore Save Restore >> Intel 181 225 46 50 >> AMD 132 104 80 102 >> >> When processors are oversubscribed, lazy restore may take about 2.5 >> times >> as many instructions as in the dedicated case if new VCPU jumps onto the >> processor (which doesn''t happen on every context switch). 
>> >> >> -boris >> >> >> Boris Ostrovsky (8): >> x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map >> x86/AMD: Do not intercept access to performance counters MSRs >> x86/AMD: Read VPMU MSRs from context when it is not loaded into HW >> x86/AMD: Stop counters on VPMU save >> x86/VPMU: Add Haswell support >> x86/VPMU: Factor out VPMU common code >> x86/VPMU: Save/restore VPMU only when necessary >> x86/AMD: Clean up context_update() in AMD VPMU code >> >> xen/arch/x86/domain.c | 14 ++- >> xen/arch/x86/hvm/svm/svm.c | 19 ++-- >> xen/arch/x86/hvm/svm/vpmu.c | 188 >> +++++++++++++++++++++++-------- >> xen/arch/x86/hvm/vmx/vmx.c | 2 - >> xen/arch/x86/hvm/vmx/vpmu_core2.c | 48 ++++---- >> xen/arch/x86/hvm/vpmu.c | 114 +++++++++++++++++-- >> xen/include/asm-x86/hvm/svm/vmcb.h | 8 +- >> xen/include/asm-x86/hvm/vmx/vpmu_core2.h | 1 - >> xen/include/asm-x86/hvm/vpmu.h | 6 +- >> 9 files changed, 296 insertions(+), 104 deletions(-) >> > >
Suravee Suthikulpanit
2013-Apr-11 18:26 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
Boris, I tried booting the guest HVM after the patch, I still see PERF only working in Software mode only. I''ll look more into this. Suravee On 4/9/2013 12:26 PM, Boris Ostrovsky wrote:> Mark context as LOADED on any MSR write (and on vpmu_restore()). This > will allow us to know whether to read an MSR from HW or from context: > guest may write into an MSR without enabling it (thus not marking the > context as RUNNING) and then be migrated. Reading this MSR again will > not match the pervious write since registers will not be loaded into HW. > > In addition, we should be saving the context when it is LOADED, not > RUNNING --- otherwise we need to save it any time it becomes non-RUNNING, > which may be a frequent occurrence. > > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > xen/arch/x86/hvm/svm/vpmu.c | 48 +++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 46 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c > index f194975..3115923 100644 > --- a/xen/arch/x86/hvm/svm/vpmu.c > +++ b/xen/arch/x86/hvm/svm/vpmu.c > @@ -225,6 +225,8 @@ static void amd_vpmu_restore(struct vcpu *v) > > context_restore(v); > apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc); > + > + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > } > > static inline void context_save(struct vcpu *v) > @@ -246,7 +248,7 @@ static void amd_vpmu_save(struct vcpu *v) > struct amd_vpmu_context *ctx = vpmu->context; > > if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && > - vpmu_is_set(vpmu, VPMU_RUNNING)) ) > + vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) > return; > > context_save(v); > @@ -256,6 +258,35 @@ static void amd_vpmu_save(struct vcpu *v) > > ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); > apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); > + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); > +} > + > +static void context_read(unsigned int msr, u64 *msr_content) > +{ > + unsigned int i; > + struct vcpu *v = current; > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + struct amd_vpmu_context *ctxt = vpmu->context; > + > + if ( k7_counters_mirrored && > + ((msr >= MSR_K7_EVNTSEL0) && (msr <= MSR_K7_PERFCTR3)) ) > + { > + msr = get_fam15h_addr(msr); > + } > + > + for ( i = 0; i < num_counters; i++ ) > + { > + if ( msr == ctrls[i] ) > + { > + *msr_content = ctxt->ctrls[i]; > + return; > + } > + else if (msr == counters[i] ) > + { > + *msr_content = ctxt->counters[i]; > + return; > + } > + } > } > > static void context_update(unsigned int msr, u64 msr_content) > @@ -318,6 +349,12 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > release_pmu_ownship(PMU_OWNER_HVM); > } > > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + { > + context_restore(v); > + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); > + } > + > /* Update vpmu context immediately */ > context_update(msr, msr_content); > > @@ -328,7 +365,14 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content) > > static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content) > { > - rdmsrl(msr, *msr_content); > + struct vcpu *v = current; > + struct vpmu_struct *vpmu = vcpu_vpmu(v); > + > + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) > + context_read(msr, msr_content); > + else > + rdmsrl(msr, *msr_content); > + > return 1; > } >
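A note on the "Software mode" observation above: assuming a Linux guest, a quick way to see what its perf subsystem detected at boot (the exact wording varies by kernel version) is

    dmesg | grep -i 'Performance Events'

which reports "no PMU driver, software events only." when no usable hardware PMU was found.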
Boris Ostrovsky
2013-Apr-11 18:34 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote:> Boris, > > I tried booting the guest HVM after the patch, I still see PERF only > working in Software mode only. I''ll look more into this.You may need to declare proper CPUID bits in the config file. On fam15h I have cpuid=[''0x80000001:ecx=00000001101000011000101111110011''] -boris> > Suravee > On 4/9/2013 12:26 PM, Boris Ostrovsky wrote: >> Mark context as LOADED on any MSR write (and on vpmu_restore()). This >> will allow us to know whether to read an MSR from HW or from context: >> guest may write into an MSR without enabling it (thus not marking the >> context as RUNNING) and then be migrated. Reading this MSR again will >> not match the pervious write since registers will not be loaded into HW. >> >> In addition, we should be saving the context when it is LOADED, not >> RUNNING --- otherwise we need to save it any time it becomes >> non-RUNNING, >> which may be a frequent occurrence. >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> --- >> xen/arch/x86/hvm/svm/vpmu.c | 48 >> +++++++++++++++++++++++++++++++++++++++++++-- >> 1 file changed, 46 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c >> index f194975..3115923 100644 >> --- a/xen/arch/x86/hvm/svm/vpmu.c >> +++ b/xen/arch/x86/hvm/svm/vpmu.c >> @@ -225,6 +225,8 @@ static void amd_vpmu_restore(struct vcpu *v) >> context_restore(v); >> apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc); >> + >> + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); >> } >> static inline void context_save(struct vcpu *v) >> @@ -246,7 +248,7 @@ static void amd_vpmu_save(struct vcpu *v) >> struct amd_vpmu_context *ctx = vpmu->context; >> if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && >> - vpmu_is_set(vpmu, VPMU_RUNNING)) ) >> + vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) >> return; >> context_save(v); >> @@ -256,6 +258,35 @@ static void amd_vpmu_save(struct vcpu *v) >> ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); >> apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); >> + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); >> +} >> + >> +static void context_read(unsigned int msr, u64 *msr_content) >> +{ >> + unsigned int i; >> + struct vcpu *v = current; >> + struct vpmu_struct *vpmu = vcpu_vpmu(v); >> + struct amd_vpmu_context *ctxt = vpmu->context; >> + >> + if ( k7_counters_mirrored && >> + ((msr >= MSR_K7_EVNTSEL0) && (msr <= MSR_K7_PERFCTR3)) ) >> + { >> + msr = get_fam15h_addr(msr); >> + } >> + >> + for ( i = 0; i < num_counters; i++ ) >> + { >> + if ( msr == ctrls[i] ) >> + { >> + *msr_content = ctxt->ctrls[i]; >> + return; >> + } >> + else if (msr == counters[i] ) >> + { >> + *msr_content = ctxt->counters[i]; >> + return; >> + } >> + } >> } >> static void context_update(unsigned int msr, u64 msr_content) >> @@ -318,6 +349,12 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, >> uint64_t msr_content) >> release_pmu_ownship(PMU_OWNER_HVM); >> } >> + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >> + { >> + context_restore(v); >> + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); >> + } >> + >> /* Update vpmu context immediately */ >> context_update(msr, msr_content); >> @@ -328,7 +365,14 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, >> uint64_t msr_content) >> static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t >> *msr_content) >> { >> - rdmsrl(msr, *msr_content); >> + struct vcpu *v = current; >> + struct vpmu_struct *vpmu = vcpu_vpmu(v); >> + >> + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >> + context_read(msr, msr_content); >> + else 
>> + rdmsrl(msr, *msr_content); >> + >> return 1; >> } > >
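As a usage note on the cpuid= line mentioned at the top of this message: it goes into the guest's xl configuration file alongside the usual options, for example (the ecx mask shown is the fam15h value from above and is family specific, so treat it purely as an illustration):

    # ... usual HVM guest options (builder, memory, disk, vif, ...) ...
    # Expose the extended performance counter CPUID bits to the guest.
    cpuid = [ '0x80000001:ecx=00000001101000011000101111110011' ]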
Suravee Suthikulpanit
2013-Apr-11 19:30 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On 4/11/2013 1:34 PM, Boris Ostrovsky wrote:> On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote: >> Boris, >> >> I tried booting the guest HVM after the patch, I still see PERF only >> working in Software mode only. I''ll look more into this. > > You may need to declare proper CPUID bits in the config file. On > fam15h I have > > cpuid=[''0x80000001:ecx=00000001101000011000101111110011''] > > -borisYes, that''s working now. Thanks. Suravee> >> >> Suravee >> On 4/9/2013 12:26 PM, Boris Ostrovsky wrote: >>> Mark context as LOADED on any MSR write (and on vpmu_restore()). This >>> will allow us to know whether to read an MSR from HW or from context: >>> guest may write into an MSR without enabling it (thus not marking the >>> context as RUNNING) and then be migrated. Reading this MSR again will >>> not match the pervious write since registers will not be loaded into >>> HW. >>> >>> In addition, we should be saving the context when it is LOADED, not >>> RUNNING --- otherwise we need to save it any time it becomes >>> non-RUNNING, >>> which may be a frequent occurrence. >>> >>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >>> --- >>> xen/arch/x86/hvm/svm/vpmu.c | 48 >>> +++++++++++++++++++++++++++++++++++++++++++-- >>> 1 file changed, 46 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c >>> index f194975..3115923 100644 >>> --- a/xen/arch/x86/hvm/svm/vpmu.c >>> +++ b/xen/arch/x86/hvm/svm/vpmu.c >>> @@ -225,6 +225,8 @@ static void amd_vpmu_restore(struct vcpu *v) >>> context_restore(v); >>> apic_write(APIC_LVTPC, ctxt->hw_lapic_lvtpc); >>> + >>> + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); >>> } >>> static inline void context_save(struct vcpu *v) >>> @@ -246,7 +248,7 @@ static void amd_vpmu_save(struct vcpu *v) >>> struct amd_vpmu_context *ctx = vpmu->context; >>> if ( !(vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) && >>> - vpmu_is_set(vpmu, VPMU_RUNNING)) ) >>> + vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED)) ) >>> return; >>> context_save(v); >>> @@ -256,6 +258,35 @@ static void amd_vpmu_save(struct vcpu *v) >>> ctx->hw_lapic_lvtpc = apic_read(APIC_LVTPC); >>> apic_write(APIC_LVTPC, ctx->hw_lapic_lvtpc | APIC_LVT_MASKED); >>> + vpmu_reset(vpmu, VPMU_CONTEXT_LOADED); >>> +} >>> + >>> +static void context_read(unsigned int msr, u64 *msr_content) >>> +{ >>> + unsigned int i; >>> + struct vcpu *v = current; >>> + struct vpmu_struct *vpmu = vcpu_vpmu(v); >>> + struct amd_vpmu_context *ctxt = vpmu->context; >>> + >>> + if ( k7_counters_mirrored && >>> + ((msr >= MSR_K7_EVNTSEL0) && (msr <= MSR_K7_PERFCTR3)) ) >>> + { >>> + msr = get_fam15h_addr(msr); >>> + } >>> + >>> + for ( i = 0; i < num_counters; i++ ) >>> + { >>> + if ( msr == ctrls[i] ) >>> + { >>> + *msr_content = ctxt->ctrls[i]; >>> + return; >>> + } >>> + else if (msr == counters[i] ) >>> + { >>> + *msr_content = ctxt->counters[i]; >>> + return; >>> + } >>> + } >>> } >>> static void context_update(unsigned int msr, u64 msr_content) >>> @@ -318,6 +349,12 @@ static int amd_vpmu_do_wrmsr(unsigned int msr, >>> uint64_t msr_content) >>> release_pmu_ownship(PMU_OWNER_HVM); >>> } >>> + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >>> + { >>> + context_restore(v); >>> + vpmu_set(vpmu, VPMU_CONTEXT_LOADED); >>> + } >>> + >>> /* Update vpmu context immediately */ >>> context_update(msr, msr_content); >>> @@ -328,7 +365,14 @@ static int amd_vpmu_do_wrmsr(unsigned int >>> msr, uint64_t msr_content) >>> static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t >>> *msr_content) >>> { >>> - 
rdmsrl(msr, *msr_content); >>> + struct vcpu *v = current; >>> + struct vpmu_struct *vpmu = vcpu_vpmu(v); >>> + >>> + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >>> + context_read(msr, msr_content); >>> + else >>> + rdmsrl(msr, *msr_content); >>> + >>> return 1; >>> } >> >> > >
Suravee Suthikulpanit
2013-Apr-11 19:48 UTC
Re: [PATCH 8/8] x86/AMD: Clean up context_update() in AMD VPMU code
Boris,

I have a couple of questions. With the series applied, here is the final code:

static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content)
{
    struct vcpu *v = current;
    struct vpmu_struct *vpmu = vcpu_vpmu(v);

    if ( vpmu_is_set(vpmu, VPMU_STOPPED) )
    {
        context_load(v);
        vpmu_reset(vpmu, VPMU_STOPPED);
    }

    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
        context_read(msr, msr_content);
    else
        rdmsrl(msr, *msr_content);

    return 1;
}

1. Why do you need to call context_load() here, given that you already check
for VPMU_CONTEXT_LOADED below and read the appropriate copy before returning
the msr_content?

2. If you have already called context_load() above, shouldn't you also call
"vpmu_set(vpmu, VPMU_CONTEXT_LOADED)"? But then you wouldn't need the latter
logic at all, would you?

Suravee
Boris Ostrovsky
2013-Apr-11 20:42 UTC
Re: [PATCH 8/8] x86/AMD: Clean up context_update() in AMD VPMU code
On 04/11/2013 03:48 PM, Suravee Suthikulpanit wrote:
> Boris,
>
> I have a couple of questions. With the series applied, here is the final code:
>
> static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content)
> {
>     struct vcpu *v = current;
>     struct vpmu_struct *vpmu = vcpu_vpmu(v);
>
>     if ( vpmu_is_set(vpmu, VPMU_STOPPED) )
>     {
>         context_load(v);
>         vpmu_reset(vpmu, VPMU_STOPPED);
>     }
>
>     if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
>         context_read(msr, msr_content);
>     else
>         rdmsrl(msr, *msr_content);
>
>     return 1;
> }
>
> 1. Why do you need to call context_load() here, given that you already check
> for VPMU_CONTEXT_LOADED below and read the appropriate copy before returning
> the msr_content?

It is possible to be both LOADED and STOPPED. Example: the guest writes a
control register (but does not enable the counters, so the VPMU is not
RUNNING). In this case amd_vpmu_load() will not get called from vpmu_load(),
so a subsequent rdmsr of this control register in the guest will need to read
from the context and not from HW (since the HW register is zero). In this case
context_load() really only needs to load the control registers; the counters
themselves are the same in both the context and HW.

> 2. If you have already called context_load() above, shouldn't you also call
> "vpmu_set(vpmu, VPMU_CONTEXT_LOADED)"? But then you wouldn't need the latter
> logic at all, would you?

Yes, you are right. I think we can get rid of context_read() completely and
simply load the context if it is not LOADED or is STOPPED, just like it is
done in amd_vpmu_do_wrmsr().

-boris
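In other words, something along these lines (a rough sketch of the simplification being discussed, assuming context_load() and the LOADED/STOPPED flags behave as described above; not a tested patch):

static int amd_vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content)
{
    struct vcpu *v = current;
    struct vpmu_struct *vpmu = vcpu_vpmu(v);

    /* Bring the HW registers up to date before reading them */
    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ||
         vpmu_is_set(vpmu, VPMU_STOPPED) )
    {
        context_load(v);
        vpmu_set(vpmu, VPMU_CONTEXT_LOADED);
        vpmu_reset(vpmu, VPMU_STOPPED);
    }

    rdmsrl(msr, *msr_content);

    return 1;
}

With that, context_read() has no remaining callers and can be dropped, and rdmsr and wrmsr handle an unloaded context the same way.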
Konrad Rzeszutek Wilk
2013-Apr-16 15:41 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On Thu, Apr 11, 2013 at 02:34:47PM -0400, Boris Ostrovsky wrote:
> On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote:
> >Boris,
> >
> >I tried booting the guest HVM after the patch, I still see PERF
> >only working in Software mode only. I'll look more into this.
>
> You may need to declare proper CPUID bits in the config file. On
> fam15h I have
>
> cpuid=['0x80000001:ecx=00000001101000011000101111110011']

Would it be possible to write this magic incantation down somewhere?

Perhaps in xl.cfg.pod.5?

(This of course being a different patch.)
Jacob Shin
2013-Apr-16 17:12 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On Tue, Apr 16, 2013 at 11:41:51AM -0400, Konrad Rzeszutek Wilk wrote:> On Thu, Apr 11, 2013 at 02:34:47PM -0400, Boris Ostrovsky wrote: > > On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote: > > >Boris, > > > > > >I tried booting the guest HVM after the patch, I still see PERF > > >only working in Software mode only. I''ll look more into this. > > > > You may need to declare proper CPUID bits in the config file. On > > fam15h I have > > > > cpuid=[''0x80000001:ecx=00000001101000011000101111110011''] > > Would it be possible to write somewhere this magic incantention? > > Perhaps in the xl.cfg.pod.5 ? > > (This of course being a different patch). >Well, maybe we should turn it on by default? http://lists.xen.org/archives/html/xen-devel/2013-04/msg01028.html: diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c index 17efc0f..c269468 100644 --- a/tools/libxc/xc_cpuid_x86.c +++ b/tools/libxc/xc_cpuid_x86.c @@ -112,6 +112,7 @@ static void amd_xc_cpuid_policy( bitmaskof(X86_FEATURE_XOP) | bitmaskof(X86_FEATURE_FMA4) | bitmaskof(X86_FEATURE_TBM) | + bitmaskof(X86_FEATURE_PERFCTR_CORE) | bitmaskof(X86_FEATURE_LWP)); regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */ (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) | Or maybe not since vpmu is not on by default .. ?
Konrad Rzeszutek Wilk
2013-Apr-16 18:36 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On Tue, Apr 16, 2013 at 12:12:16PM -0500, Jacob Shin wrote:> On Tue, Apr 16, 2013 at 11:41:51AM -0400, Konrad Rzeszutek Wilk wrote: > > On Thu, Apr 11, 2013 at 02:34:47PM -0400, Boris Ostrovsky wrote: > > > On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote: > > > >Boris, > > > > > > > >I tried booting the guest HVM after the patch, I still see PERF > > > >only working in Software mode only. I''ll look more into this. > > > > > > You may need to declare proper CPUID bits in the config file. On > > > fam15h I have > > > > > > cpuid=[''0x80000001:ecx=00000001101000011000101111110011''] > > > > Would it be possible to write somewhere this magic incantention? > > > > Perhaps in the xl.cfg.pod.5 ? > > > > (This of course being a different patch). > > > > Well, maybe we should turn it on by default? > > http://lists.xen.org/archives/html/xen-devel/2013-04/msg01028.html: > > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c > index 17efc0f..c269468 100644 > --- a/tools/libxc/xc_cpuid_x86.c > +++ b/tools/libxc/xc_cpuid_x86.c > @@ -112,6 +112,7 @@ static void amd_xc_cpuid_policy( > bitmaskof(X86_FEATURE_XOP) | > bitmaskof(X86_FEATURE_FMA4) | > bitmaskof(X86_FEATURE_TBM) | > + bitmaskof(X86_FEATURE_PERFCTR_CORE) | > bitmaskof(X86_FEATURE_LWP)); > regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */ > (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) | > > Or maybe not since vpmu is not on by default .. ?I would say not yet. As the vpmu=1 (at least on Intel) has some issues. Until that is fixed and vpmu=1 is by default lets leave it as so.>
Suravee Suthikulanit
2013-Jun-19 22:56 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On 4/16/2013 1:36 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 16, 2013 at 12:12:16PM -0500, Jacob Shin wrote:
> > On Tue, Apr 16, 2013 at 11:41:51AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Apr 11, 2013 at 02:34:47PM -0400, Boris Ostrovsky wrote:
> > > > On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote:
> > > > >Boris,
> > > > >
> > > > >I tried booting the guest HVM after the patch, I still see PERF
> > > > >only working in Software mode only. I'll look more into this.
> > > >
> > > > You may need to declare proper CPUID bits in the config file. On
> > > > fam15h I have
> > > >
> > > > cpuid=['0x80000001:ecx=00000001101000011000101111110011']
> > >
> > > Would it be possible to write somewhere this magic incantention?
> > >
> > > Perhaps in the xl.cfg.pod.5 ?
> > >
> > > (This of course being a different patch).
> > >
> >
> > Well, maybe we should turn it on by default?
> >
> > http://lists.xen.org/archives/html/xen-devel/2013-04/msg01028.html:
> >
> > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> > index 17efc0f..c269468 100644
> > --- a/tools/libxc/xc_cpuid_x86.c
> > +++ b/tools/libxc/xc_cpuid_x86.c
> > @@ -112,6 +112,7 @@ static void amd_xc_cpuid_policy(
> >             bitmaskof(X86_FEATURE_XOP) |
> >             bitmaskof(X86_FEATURE_FMA4) |
> >             bitmaskof(X86_FEATURE_TBM) |
> > +           bitmaskof(X86_FEATURE_PERFCTR_CORE) |
> >             bitmaskof(X86_FEATURE_LWP));
> >         regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */
> >                     (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) |
> >
> > Or maybe not since vpmu is not on by default .. ?
>
> I would say not yet. As the vpmu=1 (at least on Intel) has some issues.
> Until that is fixed and vpmu=1 is by default lets leave it as so.
>
> >
>

Konrad, Boris:

I would like to ask you to reconsider accepting this patch for 4.3.

This bit and vpmu=1 are independent of each other. Without the vpmu=1 option,
PERF in an HVM guest will not work regardless of this bit, so it should be
safe to always set it. However, if a user sets vpmu=1 and does not _manually_
set this bit, the PERF logic will break and users will get incorrect results.

The bit is currently used in the Linux PERF logic on all family 15h parts to
tell that there are 6 counters instead of 4 (when the bit is not set). It also
selects a different set of event constraints. The current Linux PERF core PMU
logic assumes that this bit will always be available.

Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
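To make the dependency concrete, this is roughly what the guest's perf driver does when it probes a family 15h CPU (a paraphrase from memory of the Linux fam15h setup, not a verbatim copy of the kernel source):

/* CPUID 0x80000001, ECX bit 23: core performance counter extensions */
static int setup_perfctr_core(void)
{
    if (!cpu_has_perfctr_core)
        return -ENODEV;    /* fall back to the 4 legacy K7-style counters */

    /*
     * Six core counters, addressed via the MSR_F15H_PERF_CTL/CTR MSR
     * ranges, using the family 15h event constraint tables.
     */
    x86_pmu.eventsel     = MSR_F15H_PERF_CTL;
    x86_pmu.perfctr      = MSR_F15H_PERF_CTR;
    x86_pmu.num_counters = 6;

    return 0;
}

Without the bit, the driver falls back to the 4 K7-style counters and the older constraint tables.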
Boris Ostrovsky
2013-Jun-19 23:32 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On 06/19/2013 06:56 PM, Suravee Suthikulanit wrote:> On 4/16/2013 1:36 PM, Konrad Rzeszutek Wilk wrote: >> On Tue, Apr 16, 2013 at 12:12:16PM -0500, Jacob Shin wrote: >> > On Tue, Apr 16, 2013 at 11:41:51AM -0400, Konrad Rzeszutek Wilk wrote: >> > > On Thu, Apr 11, 2013 at 02:34:47PM -0400, Boris Ostrovsky wrote: >> > > > On 04/11/2013 02:26 PM, Suravee Suthikulpanit wrote: >> > > > >Boris, >> > > > > >> > > > >I tried booting the guest HVM after the patch, I still see PERF >> > > > >only working in Software mode only. I''ll look more into this. >> > > > >> > > > You may need to declare proper CPUID bits in the config file. On >> > > > fam15h I have >> > > > >> > > > cpuid=[''0x80000001:ecx=00000001101000011000101111110011''] >> > > >> > > Would it be possible to write somewhere this magic incantention? >> > > >> > > Perhaps in the xl.cfg.pod.5 ? >> > > >> > > (This of course being a different patch). >> > > >> > >> > Well, maybe we should turn it on by default? >> > >> > http://lists.xen.org/archives/html/xen-devel/2013-04/msg01028.html: >> > >> > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c >> > index 17efc0f..c269468 100644 >> > --- a/tools/libxc/xc_cpuid_x86.c >> > +++ b/tools/libxc/xc_cpuid_x86.c >> > @@ -112,6 +112,7 @@ static void amd_xc_cpuid_policy( >> > bitmaskof(X86_FEATURE_XOP) | >> > bitmaskof(X86_FEATURE_FMA4) | >> > bitmaskof(X86_FEATURE_TBM) | >> > + bitmaskof(X86_FEATURE_PERFCTR_CORE) | >> > bitmaskof(X86_FEATURE_LWP)); >> > regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */ >> > (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) | >> > >> > Or maybe not since vpmu is not on by default .. ? >> >> I would say not yet. As the vpmu=1 (at least on Intel) has some issues. >> Until that is fixed and vpmu=1 is by default lets leave it as so. >> >> > >> > Konrad, Boris: > I would like to ask you to reconsider accepting this patch for 4.3.I think it missed 4.3 anyway.> > This bit and vpmu=1 are independent of each other. Without vpmu=1 option, PERF in HVM guest > will not work regardless of this bit. So, it should be safe to always setting this bit. > However, if user set vpmu=1 and not _manually_ setting this bit, the PERF logic will > break and users will be getting incorrect result.Why would you need to set this bit by hand in addition to having the patch? With this patch I see ecx=0x00a18bf3. I.e. bit 23 is set, as expected. -boris> > The bit is currently used in the Linux PERF logic for all family15h to > tell that there are 6 counters instead of 4 counters (when bit the is not set). > Also, it will be using a different set of event constrain. The current Linux PERF > core PMU logic assume that this bit will always be available. > > Suravee > > > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Boris Ostrovsky
2013-Jun-19 23:53 UTC
Re: [PATCH 3/8] x86/AMD: Read VPMU MSRs from context when it is not loaded into HW
On 06/19/2013 07:32 PM, Boris Ostrovsky wrote:> On 06/19/2013 06:56 PM, Suravee Suthikulanit wrote: >>> > >>> > http://lists.xen.org/archives/html/xen-devel/2013-04/msg01028.html: >>> > >>> > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c >>> > index 17efc0f..c269468 100644 >>> > --- a/tools/libxc/xc_cpuid_x86.c >>> > +++ b/tools/libxc/xc_cpuid_x86.c >>> > @@ -112,6 +112,7 @@ static void amd_xc_cpuid_policy( >>> > bitmaskof(X86_FEATURE_XOP) | >>> > bitmaskof(X86_FEATURE_FMA4) | >>> > bitmaskof(X86_FEATURE_TBM) | >>> > + bitmaskof(X86_FEATURE_PERFCTR_CORE) | >>> > bitmaskof(X86_FEATURE_LWP)); >>> > regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */ >>> > (is_pae ? bitmaskof(X86_FEATURE_NX) : 0) | >>> > >>> > Or maybe not since vpmu is not on by default .. ? >>> >>> I would say not yet. As the vpmu=1 (at least on Intel) has some issues. >>> Until that is fixed and vpmu=1 is by default lets leave it as so. >>> >>> > >>> >> Konrad, Boris: >> I would like to ask you to reconsider accepting this patch for 4.3. > > I think it missed 4.3 anyway.OK, I think I misunderstood your question --- I thought you were asking to back out this patch because you found that something is broken with it. Instead you do want to have it in 4.3. Sorry. This is probably request for George then (but I suspect it''s too late now). -boris> >> This bit and vpmu=1 are independent of each other. Without vpmu=1 option, PERF in HVM guest >> will not work regardless of this bit. So, it should be safe to always setting this bit. >> However, if user set vpmu=1 and not _manually_ setting this bit, the PERF logic will >> break and users will be getting incorrect result. > > > Why would you need to set this bit by hand in addition to having the > patch? With this patch I see ecx=0x00a18bf3. I.e. bit 23 is set, as > expected. > > -boris > > >> The bit is currently used in the Linux PERF logic for all family15h to >> tell that there are 6 counters instead of 4 counters (when bit the is not set). >> Also, it will be using a different set of event constrain. The current Linux PERF >> core PMU logic assume that this bit will always be available._______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel