search for: lbr

Displaying 20 results from an estimated 115 matches for "lbr".

2017 Sep 26
1
[PATCH v1 4/4] KVM/vmx: enable lbr for the guest
...vCPU is scheduled out, the guest task on the vCPU may not

guest task lifetime has nothing to do with this. It's completely independent of what you do here on the VCPU level.

> run out its time slice yet, so the task will continue to run when the vCPU is
> scheduled in by the host (lbr wasn't save by the guest task when the vCPU is
> scheduled out in this case).
>
> It is possible to have the vCPU which runs the guest task (in use of lbr) scheduled
> out, followed by a new host task being scheduled in on the pCPU to run.
> It is not guaranteed that the ne...
2017 Sep 25
2
[PATCH v1 4/4] KVM/vmx: enable lbr for the guest
> +static void auto_switch_lbr_msrs(struct vcpu_vmx *vmx)
> +{
> +	int i;
> +	struct perf_lbr_stack lbr_stack;
> +
> +	perf_get_lbr_stack(&lbr_stack);
> +
> +	add_atomic_switch_msr(vmx, MSR_LBR_SELECT, 0, 0);
> +	add_atomic_switch_msr(vmx, lbr_stack.lbr_tos, 0, 0);
> +
> +	for (i = 0; i < lbr...
2017 Sep 26
1
[PATCH v1 0/4] Enable LBR for the guest
On 09/25/2017 10:59 PM, Andi Kleen wrote:
> On Mon, Sep 25, 2017 at 12:44:52PM +0800, Wei Wang wrote:
>> This patch series enables the Last Branch Recording feature for the
>> guest. Instead of trapping each LBR stack MSR access, the MSRs are
>> passthroughed to the guest. Those MSRs are switched (i.e. load and
>> saved) on VMExit and VMEntry.
>>
>> Test:
>> Try "perf record -b ./test_program" on guest.
> I don't see where you expose the PERF capabilities MSR?...
2017 Sep 25
10
[PATCH v1 0/4] Enable LBR for the guest
This patch series enables the Last Branch Recording feature for the guest. Instead of trapping each LBR stack MSR access, the MSRs are passthroughed to the guest. Those MSRs are switched (i.e. load and saved) on VMExit and VMEntry.

Test:
Try "perf record -b ./test_program" on guest.

Wei Wang (4):
  KVM/vmx: re-write the msr auto switch feature
  KVM/vmx: auto switch MSR_IA32_DEBUGCTLMSR...
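The switching scheme described in this cover letter can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the patch's code: `struct lbr_regs`, `struct vcpu_model`, `model_vmentry`, `model_vmexit`, `model_guest_write_tos` and the depth of 4 are all hypothetical names and values chosen for illustration, and real hardware performs the switch through the VMCS MSR load/store areas rather than plain struct copies.

```c
#include <assert.h>
#include <stdint.h>

#define LBR_DEPTH 4 /* illustrative only; the real depth depends on the CPU model */

/* Hypothetical model of the LBR-related MSRs that live on one physical CPU. */
struct lbr_regs {
    uint64_t select;          /* stands in for MSR_LBR_SELECT */
    uint64_t tos;             /* top-of-stack index */
    uint64_t from[LBR_DEPTH]; /* branch source addresses */
    uint64_t to[LBR_DEPTH];   /* branch target addresses */
};

/* Per-vCPU storage for the two register images that get swapped. */
struct vcpu_model {
    struct lbr_regs guest; /* guest image while the vCPU is not running */
    struct lbr_regs host;  /* host image while the guest is running */
};

/* VM entry: save the host image, then load the guest image into "hardware". */
static void model_vmentry(struct vcpu_model *v, struct lbr_regs *hw)
{
    v->host = *hw;
    *hw = v->guest;
}

/* VM exit: save the guest image, then restore the host image. */
static void model_vmexit(struct vcpu_model *v, struct lbr_regs *hw)
{
    v->guest = *hw;
    *hw = v->host;
}

/* Passthrough: the guest writes the register directly, with no trap to the host. */
static void model_guest_write_tos(struct lbr_regs *hw, uint64_t tos)
{
    assert(tos < LBR_DEPTH);
    hw->tos = tos;
}
```

In this model a guest write between `model_vmentry` and `model_vmexit` touches only the hardware image, and the host's values reappear unchanged after the exit, which is the property the series relies on.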
2017 Sep 26
0
[PATCH v1 4/4] KVM/vmx: enable lbr for the guest
On 09/25/2017 10:57 PM, Andi Kleen wrote:
>> +static void auto_switch_lbr_msrs(struct vcpu_vmx *vmx)
>> +{
>> +	int i;
>> +	struct perf_lbr_stack lbr_stack;
>> +
>> +	perf_get_lbr_stack(&lbr_stack);
>> +
>> +	add_atomic_switch_msr(vmx, MSR_LBR_SELECT, 0, 0);
>> +	add_atomic_switch_msr(vmx, lbr_stack.lbr_tos, 0, 0);
>...
2017 Sep 25
1
[PATCH v1 4/4] KVM/vmx: enable lbr for the guest
On 25/09/2017 06:44, Wei Wang wrote:
> Passthrough the LBR stack to the guest, and auto switch the stack MSRs
> upon VMEntry and VMExit.
>
> Signed-off-by: Wei Wang <wei.w.wang at intel.com>

This has to be enabled separately for each guest, because it may prevent live migration to hosts with a different family/model.

Paolo

> ---
>...
2017 Sep 25
0
[PATCH v1 0/4] Enable LBR for the guest
On Mon, Sep 25, 2017 at 12:44:52PM +0800, Wei Wang wrote:
> This patch series enables the Last Branch Recording feature for the
> guest. Instead of trapping each LBR stack MSR access, the MSRs are
> passthroughed to the guest. Those MSRs are switched (i.e. load and
> saved) on VMExit and VMEntry.
>
> Test:
> Try "perf record -b ./test_program" on guest.

I don't see where you expose the PERF capabilities MSR? That's normally...
2017 Sep 25
0
[PATCH v1 4/4] KVM/vmx: enable lbr for the guest
Passthrough the LBR stack to the guest, and auto switch the stack MSRs upon VMEntry and VMExit.

Signed-off-by: Wei Wang <wei.w.wang at intel.com>
---
 arch/x86/kvm/vmx.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx....
2020 Aug 07
4
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...acebook is building a new context-sensitive Sample PGO as an alternative to the existing AutoFDO. We’d like to share our motivation, propose a new design, and reveal preliminary results on benchmarks. We will refer to the proposed design as CSSPGO in this RFC.

The new CSSPGO leverages simultaneous LBR and stack sampling to construct a full context-sensitive profile. It doesn’t rely on previous inlining like today’s AutoFDO to get context-sensitive profile, and it also doesn’t need a separate post-inline context-sensitive profile like CSPGO. In addition, we introduced pseudo-instrumentation for m...
2020 Aug 07
2
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...acebook is building a new context-sensitive Sample PGO as an alternative to the existing AutoFDO. We’d like to share our motivation, propose a new design, and reveal preliminary results on benchmarks. We will refer to the proposed design as CSSPGO in this RFC.

The new CSSPGO leverages simultaneous LBR and stack sampling to construct a full context-sensitive profile.

Can you share more details on this? LBR only has 32 entries, so it won't give you full call context, so stack unwinding is needed. What is the overhead you see in production environment?

[wenlei] We are not worried about overh...
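The 32-entry limit raised in this reply is why LBR alone cannot recover a deep call context: once the buffer is full, each new branch overwrites the oldest one. The following is a minimal sketch of that behaviour as a fixed-size ring buffer. The names `struct lbr_ring`, `lbr_record` and `lbr_get` are hypothetical and model only the overwrite behaviour, not the actual hardware interface or anything in the CSSPGO tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LBR_ENTRIES 32 /* depth quoted in the thread; other CPUs may differ */

struct branch {
    uint64_t from; /* address of the branch instruction */
    uint64_t to;   /* address of the branch target */
};

/* Hypothetical ring-buffer model of a branch record stack. */
struct lbr_ring {
    struct branch entry[LBR_ENTRIES];
    size_t tos;   /* index of the most recent entry */
    size_t count; /* number of valid entries, saturates at LBR_ENTRIES */
};

/* Record one taken branch; once the ring is full the oldest entry is overwritten. */
static void lbr_record(struct lbr_ring *r, uint64_t from, uint64_t to)
{
    r->tos = (r->count == 0) ? 0 : (r->tos + 1) % LBR_ENTRIES;
    r->entry[r->tos].from = from;
    r->entry[r->tos].to = to;
    if (r->count < LBR_ENTRIES)
        r->count++;
}

/* Fetch an entry by age: 0 is the most recent branch, count - 1 the oldest kept. */
static const struct branch *lbr_get(const struct lbr_ring *r, size_t age)
{
    assert(age < r->count);
    return &r->entry[(r->tos + LBR_ENTRIES - age) % LBR_ENTRIES];
}
```

Recording 40 branches into a zero-initialised `struct lbr_ring` leaves only the last 32 retrievable, so any caller frames established by the first 8 branches would have to come from another source such as stack unwinding.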
2020 Aug 08
5
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...acebook is building a new context-sensitive Sample PGO as an alternative to the existing AutoFDO. We’d like to share our motivation, propose a new design, and reveal preliminary results on benchmarks. We will refer to the proposed design as CSSPGO in this RFC.

The new CSSPGO leverages simultaneous LBR and stack sampling to construct a full context-sensitive profile.

Can you share more details on this? LBR only has 32 entries, so it won't give you full call context, so stack unwinding is needed. What is the overhead you see in production environment?

[wenlei] We are not worried about overh...
2020 Aug 08
2
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...sensitive Sample PGO as an
> alternative to the existing AutoFDO. We’d like to share our motivation,
> propose a new design, and reveal preliminary results on benchmarks. We will
> refer to the proposed design as CSSPGO in this RFC.
>
> The new CSSPGO leverages simultaneous LBR and stack sampling to construct
> a full context-sensitive profile.
>
> Can you share more details on this? LBR only has 32 entries, so it won't
> give you full call context, so stack unwinding is needed. What is the
> overhead you see in production environmen...
2020 Aug 08
3
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...acebook is building a new context-sensitive Sample PGO as an alternative to the existing AutoFDO. We’d like to share our motivation, propose a new design, and reveal preliminary results on benchmarks. We will refer to the proposed design as CSSPGO in this RFC.

The new CSSPGO leverages simultaneous LBR and stack sampling to construct a full context-sensitive profile.

Can you share more details on this? LBR only has 32 entries, so it won't give you full call context, so stack unwinding is needed. What is the overhead you see in production environment?

[wenlei] We are not worried about overh...
2020 Aug 08
2
[RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation
...acebook is building a new context-sensitive Sample PGO as an alternative to the existing AutoFDO. We’d like to share our motivation, propose a new design, and reveal preliminary results on benchmarks. We will refer to the proposed design as CSSPGO in this RFC.

The new CSSPGO leverages simultaneous LBR and stack sampling to construct a full context-sensitive profile.

Can you share more details on this? LBR only has 32 entries, so it won't give you full call context, so stack unwinding is needed. What is the overhead you see in production environment?

[wenlei] We are not worried about overh...