thr3ads.net - Virtualization - [Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently [Jul 2019]

If this information is useful, please help other people find it:
Share via:

Juergen Gross

2019-Jul-03 14:04 UTC

[PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

On 03.07.19 01:51, Nadav Amit wrote:> To improve TLB shootdown performance, flush the remote and local TLBs
> concurrently. Introduce flush_tlb_multi() that does so. Introduce
> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
> and hyper-v are only compile-tested).
> 
> While the updated smp infrastructure is capable of running a function on
> a single local core, it is not optimized for this case. The multiple
> function calls and the indirect branch introduce some overhead, and
> might make local TLB flushes slower than they were before the recent
> changes.
> 
> Before calling the SMP infrastructure, check if only a local TLB flush
> is needed to restore the lost performance in this common case. This
> requires to check mm_cpumask() one more time, but unless this mask is
> updated very frequently, this should impact performance negatively.
> 
> Cc: "K. Y. Srinivasan" <kys at microsoft.com>
> Cc: Haiyang Zhang <haiyangz at microsoft.com>
> Cc: Stephen Hemminger <sthemmin at microsoft.com>
> Cc: Sasha Levin <sashal at kernel.org>
> Cc: Thomas Gleixner <tglx at linutronix.de>
> Cc: Ingo Molnar <mingo at redhat.com>
> Cc: Borislav Petkov <bp at alien8.de>
> Cc: x86 at kernel.org
> Cc: Juergen Gross <jgross at suse.com>
> Cc: Paolo Bonzini <pbonzini at redhat.com>
> Cc: Dave Hansen <dave.hansen at linux.intel.com>
> Cc: Andy Lutomirski <luto at kernel.org>
> Cc: Peter Zijlstra <peterz at infradead.org>
> Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
> Cc: linux-hyperv at vger.kernel.org
> Cc: linux-kernel at vger.kernel.org
> Cc: virtualization at lists.linux-foundation.org
> Cc: kvm at vger.kernel.org
> Cc: xen-devel at lists.xenproject.org
> Signed-off-by: Nadav Amit <namit at vmware.com>
> ---
>   arch/x86/hyperv/mmu.c                 | 13 +++---
>   arch/x86/include/asm/paravirt.h       |  6 +--
>   arch/x86/include/asm/paravirt_types.h |  4 +-
>   arch/x86/include/asm/tlbflush.h       |  9 ++--
>   arch/x86/include/asm/trace/hyperv.h   |  2 +-
>   arch/x86/kernel/kvm.c                 | 11 +++--
>   arch/x86/kernel/paravirt.c            |  2 +-
>   arch/x86/mm/tlb.c                     | 65 ++++++++++++++++++++-------
>   arch/x86/xen/mmu_pv.c                 | 20 ++++++---
>   include/trace/events/xen.h            |  2 +-
>   10 files changed, 91 insertions(+), 43 deletions(-)
...
> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
> index beb44e22afdf..19e481e6e904 100644
> --- a/arch/x86/xen/mmu_pv.c
> +++ b/arch/x86/xen/mmu_pv.c
> @@ -1355,8 +1355,8 @@ static void xen_flush_tlb_one_user(unsigned long
addr)
>   	preempt_enable();
>   }
>   
> -static void xen_flush_tlb_others(const struct cpumask *cpus,
> -				 const struct flush_tlb_info *info)
> +static void xen_flush_tlb_multi(const struct cpumask *cpus,
> +				const struct flush_tlb_info *info)
>   {
>   	struct {
>   		struct mmuext_op op;
> @@ -1366,7 +1366,7 @@ static void xen_flush_tlb_others(const struct cpumask
*cpus,
>   	const size_t mc_entry_size = sizeof(args->op) +
>   		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
>   
> -	trace_xen_mmu_flush_tlb_others(cpus, info->mm, info->start,
info->end);
> +	trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start,
info->end);
>   
>   	if (cpumask_empty(cpus))
>   		return;		/* nothing to do */
> @@ -1375,9 +1375,17 @@ static void xen_flush_tlb_others(const struct
cpumask *cpus,
>   	args = mcs.args;
>   	args->op.arg2.vcpumask = to_cpumask(args->mask);
>   
> -	/* Remove us, and any offline CPUS. */
> +	/* Flush locally if needed and remove us */
> +	if (cpumask_test_cpu(smp_processor_id(), to_cpumask(args->mask))) {
> +		local_irq_disable();
> +		flush_tlb_func_local(info);
I think this isn't the correct function for PV guests.

In fact it should be much easier: just don't clear the own cpu from the
mask, that's all what's needed. The hypervisor is just fine having the
current cpu in the mask and it will do the right thing.


Juergen

Nadav Amit

2019-Jul-03 17:02 UTC

head link

[PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

> On Jul 3, 2019, at 7:04 AM, Juergen Gross <jgross at suse.com> wrote:
> 
> On 03.07.19 01:51, Nadav Amit wrote:
>> To improve TLB shootdown performance, flush the remote and local TLBs
>> concurrently. Introduce flush_tlb_multi() that does so. Introduce
>> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
>> and hyper-v are only compile-tested).
>> While the updated smp infrastructure is capable of running a function
on
>> a single local core, it is not optimized for this case. The multiple
>> function calls and the indirect branch introduce some overhead, and
>> might make local TLB flushes slower than they were before the recent
>> changes.
>> Before calling the SMP infrastructure, check if only a local TLB flush
>> is needed to restore the lost performance in this common case. This
>> requires to check mm_cpumask() one more time, but unless this mask is
>> updated very frequently, this should impact performance negatively.
>> Cc: "K. Y. Srinivasan" <kys at microsoft.com>
>> Cc: Haiyang Zhang <haiyangz at microsoft.com>
>> Cc: Stephen Hemminger <sthemmin at microsoft.com>
>> Cc: Sasha Levin <sashal at kernel.org>
>> Cc: Thomas Gleixner <tglx at linutronix.de>
>> Cc: Ingo Molnar <mingo at redhat.com>
>> Cc: Borislav Petkov <bp at alien8.de>
>> Cc: x86 at kernel.org
>> Cc: Juergen Gross <jgross at suse.com>
>> Cc: Paolo Bonzini <pbonzini at redhat.com>
>> Cc: Dave Hansen <dave.hansen at linux.intel.com>
>> Cc: Andy Lutomirski <luto at kernel.org>
>> Cc: Peter Zijlstra <peterz at infradead.org>
>> Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
>> Cc: linux-hyperv at vger.kernel.org
>> Cc: linux-kernel at vger.kernel.org
>> Cc: virtualization at lists.linux-foundation.org
>> Cc: kvm at vger.kernel.org
>> Cc: xen-devel at lists.xenproject.org
>> Signed-off-by: Nadav Amit <namit at vmware.com>
>> ---
>>  arch/x86/hyperv/mmu.c                 | 13 +++---
>>  arch/x86/include/asm/paravirt.h       |  6 +--
>>  arch/x86/include/asm/paravirt_types.h |  4 +-
>>  arch/x86/include/asm/tlbflush.h       |  9 ++--
>>  arch/x86/include/asm/trace/hyperv.h   |  2 +-
>>  arch/x86/kernel/kvm.c                 | 11 +++--
>>  arch/x86/kernel/paravirt.c            |  2 +-
>>  arch/x86/mm/tlb.c                     | 65 ++++++++++++++++++++-------
>>  arch/x86/xen/mmu_pv.c                 | 20 ++++++---
>>  include/trace/events/xen.h            |  2 +-
>>  10 files changed, 91 insertions(+), 43 deletions(-)
> 
> ...
> 
>> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
>> index beb44e22afdf..19e481e6e904 100644
>> --- a/arch/x86/xen/mmu_pv.c
>> +++ b/arch/x86/xen/mmu_pv.c
>> @@ -1355,8 +1355,8 @@ static void xen_flush_tlb_one_user(unsigned long
addr)
>>  	preempt_enable();
>>  }
>>  -static void xen_flush_tlb_others(const struct cpumask *cpus,
>> -				 const struct flush_tlb_info *info)
>> +static void xen_flush_tlb_multi(const struct cpumask *cpus,
>> +				const struct flush_tlb_info *info)
>>  {
>>  	struct {
>>  		struct mmuext_op op;
>> @@ -1366,7 +1366,7 @@ static void xen_flush_tlb_others(const struct
cpumask *cpus,
>>  	const size_t mc_entry_size = sizeof(args->op) +
>>  		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
>>  -	trace_xen_mmu_flush_tlb_others(cpus, info->mm, info->start,
info->end);
>> +	trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start,
info->end);
>>    	if (cpumask_empty(cpus))
>>  		return;		/* nothing to do */
>> @@ -1375,9 +1375,17 @@ static void xen_flush_tlb_others(const struct
cpumask *cpus,
>>  	args = mcs.args;
>>  	args->op.arg2.vcpumask = to_cpumask(args->mask);
>>  -	/* Remove us, and any offline CPUS. */
>> +	/* Flush locally if needed and remove us */
>> +	if (cpumask_test_cpu(smp_processor_id(), to_cpumask(args->mask)))
{
>> +		local_irq_disable();
>> +		flush_tlb_func_local(info);
> 
> I think this isn't the correct function for PV guests.
> 
> In fact it should be much easier: just don't clear the own cpu from the
> mask, that's all what's needed. The hypervisor is just fine having
the
> current cpu in the mask and it will do the right thing.
Thanks. I will do so in v3. I don?t think Hyper-V people would want to do
the same, unfortunately, since it would induce VM-exit on TLB flushes. But
if they do - I?ll be able not to expose flush_tlb_func_local().

Andrew Cooper

2019-Jul-03 17:43 UTC

head link

[Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

On 03/07/2019 18:02, Nadav Amit wrote:>> On Jul 3, 2019, at 7:04 AM, Juergen Gross <jgross at suse.com>
wrote:
>>
>> On 03.07.19 01:51, Nadav Amit wrote:
>>> To improve TLB shootdown performance, flush the remote and local
TLBs
>>> concurrently. Introduce flush_tlb_multi() that does so. Introduce
>>> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v
(Xen
>>> and hyper-v are only compile-tested).
>>> While the updated smp infrastructure is capable of running a
function on
>>> a single local core, it is not optimized for this case. The
multiple
>>> function calls and the indirect branch introduce some overhead, and
>>> might make local TLB flushes slower than they were before the
recent
>>> changes.
>>> Before calling the SMP infrastructure, check if only a local TLB
flush
>>> is needed to restore the lost performance in this common case. This
>>> requires to check mm_cpumask() one more time, but unless this mask
is
>>> updated very frequently, this should impact performance negatively.
>>> Cc: "K. Y. Srinivasan" <kys at microsoft.com>
>>> Cc: Haiyang Zhang <haiyangz at microsoft.com>
>>> Cc: Stephen Hemminger <sthemmin at microsoft.com>
>>> Cc: Sasha Levin <sashal at kernel.org>
>>> Cc: Thomas Gleixner <tglx at linutronix.de>
>>> Cc: Ingo Molnar <mingo at redhat.com>
>>> Cc: Borislav Petkov <bp at alien8.de>
>>> Cc: x86 at kernel.org
>>> Cc: Juergen Gross <jgross at suse.com>
>>> Cc: Paolo Bonzini <pbonzini at redhat.com>
>>> Cc: Dave Hansen <dave.hansen at linux.intel.com>
>>> Cc: Andy Lutomirski <luto at kernel.org>
>>> Cc: Peter Zijlstra <peterz at infradead.org>
>>> Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
>>> Cc: linux-hyperv at vger.kernel.org
>>> Cc: linux-kernel at vger.kernel.org
>>> Cc: virtualization at lists.linux-foundation.org
>>> Cc: kvm at vger.kernel.org
>>> Cc: xen-devel at lists.xenproject.org
>>> Signed-off-by: Nadav Amit <namit at vmware.com>
>>> ---
>>>  arch/x86/hyperv/mmu.c                 | 13 +++---
>>>  arch/x86/include/asm/paravirt.h       |  6 +--
>>>  arch/x86/include/asm/paravirt_types.h |  4 +-
>>>  arch/x86/include/asm/tlbflush.h       |  9 ++--
>>>  arch/x86/include/asm/trace/hyperv.h   |  2 +-
>>>  arch/x86/kernel/kvm.c                 | 11 +++--
>>>  arch/x86/kernel/paravirt.c            |  2 +-
>>>  arch/x86/mm/tlb.c                     | 65
++++++++++++++++++++-------
>>>  arch/x86/xen/mmu_pv.c                 | 20 ++++++---
>>>  include/trace/events/xen.h            |  2 +-
>>>  10 files changed, 91 insertions(+), 43 deletions(-)
>> ...
>>
>>> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
>>> index beb44e22afdf..19e481e6e904 100644
>>> --- a/arch/x86/xen/mmu_pv.c
>>> +++ b/arch/x86/xen/mmu_pv.c
>>> @@ -1355,8 +1355,8 @@ static void xen_flush_tlb_one_user(unsigned
long addr)
>>>  	preempt_enable();
>>>  }
>>>  -static void xen_flush_tlb_others(const struct cpumask *cpus,
>>> -				 const struct flush_tlb_info *info)
>>> +static void xen_flush_tlb_multi(const struct cpumask *cpus,
>>> +				const struct flush_tlb_info *info)
>>>  {
>>>  	struct {
>>>  		struct mmuext_op op;
>>> @@ -1366,7 +1366,7 @@ static void xen_flush_tlb_others(const struct
cpumask *cpus,
>>>  	const size_t mc_entry_size = sizeof(args->op) +
>>>  		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
>>>  -	trace_xen_mmu_flush_tlb_others(cpus, info->mm,
info->start, info->end);
>>> +	trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start,
info->end);
>>>    	if (cpumask_empty(cpus))
>>>  		return;		/* nothing to do */
>>> @@ -1375,9 +1375,17 @@ static void xen_flush_tlb_others(const
struct cpumask *cpus,
>>>  	args = mcs.args;
>>>  	args->op.arg2.vcpumask = to_cpumask(args->mask);
>>>  -	/* Remove us, and any offline CPUS. */
>>> +	/* Flush locally if needed and remove us */
>>> +	if (cpumask_test_cpu(smp_processor_id(),
to_cpumask(args->mask))) {
>>> +		local_irq_disable();
>>> +		flush_tlb_func_local(info);
>> I think this isn't the correct function for PV guests.
>>
>> In fact it should be much easier: just don't clear the own cpu from
the
>> mask, that's all what's needed. The hypervisor is just fine
having the
>> current cpu in the mask and it will do the right thing.
> Thanks. I will do so in v3. I don?t think Hyper-V people would want to do
> the same, unfortunately, since it would induce VM-exit on TLB flushes.
Why do you believe the vmexit matters?? You're talking one anyway for
the IPI.

Intel only have virtualised self-IPI, and while AMD do have working
non-self IPIs, you still take a vmexit anyway if any destination vcpu
isn't currently running in non-root mode (IIRC).

At that point, you might as well have the hypervisor do all the hard
work via a multi-cpu shootdown/flush hypercall, rather than trying to
arrange it locally.

~Andrew

Virtualization - Jul 2019 - [Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

[PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

[PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

[Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently