> +static void auto_switch_lbr_msrs(struct vcpu_vmx *vmx)
> +{
> +	int i;
> +	struct perf_lbr_stack lbr_stack;
> +
> +	perf_get_lbr_stack(&lbr_stack);
> +
> +	add_atomic_switch_msr(vmx, MSR_LBR_SELECT, 0, 0);
> +	add_atomic_switch_msr(vmx, lbr_stack.lbr_tos, 0, 0);
> +
> +	for (i = 0; i < lbr_stack.lbr_nr; i++) {
> +		add_atomic_switch_msr(vmx, lbr_stack.lbr_from + i, 0, 0);
> +		add_atomic_switch_msr(vmx, lbr_stack.lbr_to + i, 0, 0);
> +		if (lbr_stack.lbr_info)
> +			add_atomic_switch_msr(vmx, lbr_stack.lbr_info + i, 0,
> +					      0);
> +	}

That will be really expensive and add a lot of overhead to every entry/exit.
perf can already context switch the LBRs on task context switch. With that
you can just switch LBR_SELECT, which is *much* cheaper because there are
far fewer context switches than exits/entries.

It implies that when KVM is running it needs to prevent perf from enabling
LBRs in the context of KVM, but that should be straightforward.

-Andi
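
For comparison, a minimal sketch of what the suggestion above could look
like, reusing add_atomic_switch_msr() from the quoted patch; the function
name and the guest/host LBR_SELECT parameters are only illustrative, not
part of the posted series:

static void auto_switch_lbr_select(struct vcpu_vmx *vmx,
				   u64 guest_lbr_select, u64 host_lbr_select)
{
	/*
	 * Switch only LBR_SELECT on entry/exit; the from/to/info/TOS stack
	 * is left to perf's per-task LBR context switch, so the per-exit
	 * cost is one MSR pair instead of the whole LBR stack.
	 */
	add_atomic_switch_msr(vmx, MSR_LBR_SELECT, guest_lbr_select,
			      host_lbr_select);
}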
On 09/25/2017 10:57 PM, Andi Kleen wrote:
>> +static void auto_switch_lbr_msrs(struct vcpu_vmx *vmx)
>> +{
>> +	int i;
>> +	struct perf_lbr_stack lbr_stack;
>> +
>> +	perf_get_lbr_stack(&lbr_stack);
>> +
>> +	add_atomic_switch_msr(vmx, MSR_LBR_SELECT, 0, 0);
>> +	add_atomic_switch_msr(vmx, lbr_stack.lbr_tos, 0, 0);
>> +
>> +	for (i = 0; i < lbr_stack.lbr_nr; i++) {
>> +		add_atomic_switch_msr(vmx, lbr_stack.lbr_from + i, 0, 0);
>> +		add_atomic_switch_msr(vmx, lbr_stack.lbr_to + i, 0, 0);
>> +		if (lbr_stack.lbr_info)
>> +			add_atomic_switch_msr(vmx, lbr_stack.lbr_info + i, 0,
>> +					      0);
>> +	}
> That will be really expensive and add a lot of overhead to every entry/exit.
> perf can already context switch the LBRs on task context switch. With that
> you can just switch LBR_SELECT, which is *much* cheaper because there are
> far fewer context switches than exits/entries.
>
> It implies that when KVM is running it needs to prevent perf from enabling
> LBRs in the context of KVM, but that should be straightforward.

I kind of have a different thought here:

1) vCPU context switching and guest side task switching are not identical.
That is, when the vCPU is scheduled out, the guest task on the vCPU may not
have run out its time slice yet, so the task will continue to run when the
vCPU is scheduled back in by the host (the LBR state wasn't saved by the
guest task when the vCPU was scheduled out in this case).

It is possible for the vCPU which runs the guest task (which uses the LBRs)
to be scheduled out, followed by a new host task being scheduled in to run
on the pCPU. It is not guaranteed that the new host task does not use the
LBR feature on the pCPU.

2) Sometimes, people may want this usage: "perf record -b
./qemu-system-x86_64 ...", which needs the LBRs to be used in KVM as well.

I think one possible optimization would be to add the LBR MSRs to the
auto-switch list when the guest requests to enable the feature, and remove
them when it is disabled. This will need to trap guest accesses to
MSR_DEBUGCTL.

Best,
Wei
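
A rough sketch of that idea, with a few assumptions: auto_switch_lbr_msrs()
is the function from the quoted patch, clear_atomic_switch_lbr_msrs() is a
hypothetical counterpart that removes the same entries again, and
DEBUGCTLMSR_LBR / GUEST_IA32_DEBUGCTL are the existing kernel and VMCS
definitions. This is not the posted implementation, only an illustration of
the trap-and-toggle approach:

static void vmx_guest_debugctl_write(struct vcpu_vmx *vmx, u64 data)
{
	/* Only pay for the LBR auto switch while the guest has LBRs enabled. */
	if (data & DEBUGCTLMSR_LBR)
		auto_switch_lbr_msrs(vmx);		/* add LBR MSRs to the switch list */
	else
		clear_atomic_switch_lbr_msrs(vmx);	/* hypothetical removal helper */

	vmcs_write64(GUEST_IA32_DEBUGCTL, data);
}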
> 1) vCPU context switching and guest side task switching are not identical.
> That is, when the vCPU is scheduled out, the guest task on the vCPU may not
> have run out its time slice yet,

Guest task lifetime has nothing to do with this. It's completely independent
of what you do here on the VCPU level.

> so the task will continue to run when the vCPU is scheduled back in by the
> host (the LBR state wasn't saved by the guest task when the vCPU was
> scheduled out in this case).
>
> It is possible for the vCPU which runs the guest task (which uses the LBRs)
> to be scheduled out, followed by a new host task being scheduled in to run
> on the pCPU. It is not guaranteed that the new host task does not use the
> LBR feature on the pCPU.

Sure, it may use the LBRs, and the normal perf context switch will switch
them and everything works fine. It's like any other per-task LBR user.

> 2) Sometimes, people may want this usage: "perf record -b
> ./qemu-system-x86_64 ...", which needs the LBRs to be used in KVM as well.

In this obscure case you can disable LBR support for the guest. The common
case is far more important.

It sounds like you didn't do any performance measurements. I expect the
performance of your current solution to be terrible.

e.g. a normal perf PMI does at least 1 MSR read and 4+ MSR writes for a
single counter. With multiple counters it gets worse. For each of those
you'll need to exit.

Adding something to the entry/exit list is similar in cost to doing explicit
RD/WRMSRs.

On Skylake we have 32*3=96 MSRs for the LBRs. So with the 5 exits and
entries, you're essentially doing 5*2*96=960 extra MSR accesses for each
PMI. MSR access is 100+ cycles at least; for writes it is far more
expensive.

-Andi
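
To make the back-of-envelope numbers above concrete, here is a small
stand-alone calculation; the LBR depth, exits per PMI, and per-access cycle
cost are the assumed figures from the mail, not measurements:

#include <stdio.h>

int main(void)
{
	int lbr_depth = 32;			/* Skylake LBR stack depth */
	int msrs_per_entry = 3;			/* FROM, TO, INFO */
	int stack_msrs = lbr_depth * msrs_per_entry;	/* 96 MSRs */
	int exits_per_pmi = 5;			/* assumed exits (and matching entries) per PMI */
	int accesses = exits_per_pmi * 2 * stack_msrs;	/* 960 extra MSR accesses */
	int cycles_per_access = 100;		/* stated lower bound */

	printf("extra MSR accesses per PMI: %d (>= %d cycles)\n",
	       accesses, accesses * cycles_per_access);
	return 0;
}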