Waiman Long
2017-Feb-15 18:31 UTC
[PATCH v3 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead
v2->v3: - Provide an optimized __raw_callee_save___kvm_vcpu_is_preempted() in assembly as suggested by PeterZ. - Add a new patch to change vcpu_is_preempted() argument type to long to ease the writing of the assembly code. v1->v2: - Rerun the fio test on a different system on both bare-metal and a KVM guest. Both sockets were utilized in this test. - The commit log was updated with new performance numbers, but the patch wasn't changed. - Drop patch 2. As it was found that the overhead of callee-save vcpu_is_preempted() can have some impact on system performance on a VM guest, especially of x86-64 guest, this patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Waiman Long (2): x86/paravirt: Change vcp_is_preempted() arg type to long x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c | 30 +++++++++++++++++++++++++++++- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 32 insertions(+), 4 deletions(-) -- 1.8.3.1
Waiman Long
2017-Feb-15 18:31 UTC
[PATCH v3 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long
The cpu argument in the function prototype of vcpu_is_preempted() is changed from int to long. That makes it easier to provide a better optimized assembly version of that function. For Xen, vcpu_is_preempted(long) calls xen_vcpu_stolen(int), the downcast from long to int is not a problem as vCPU number won't exceed 32 bits. Signed-off-by: Waiman Long <longman at redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c | 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 864f57b..2a46f73 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -674,7 +674,7 @@ static __always_inline void pv_kick(int cpu) PVOP_VCALL1(pv_lock_ops.kick, cpu); } -static __always_inline bool pv_vcpu_is_preempted(int cpu) +static __always_inline bool pv_vcpu_is_preempted(long cpu) { return PVOP_CALLEE1(bool, pv_lock_ops.vcpu_is_preempted, cpu); } diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h index c343ab5..48a706f 100644 --- a/arch/x86/include/asm/qspinlock.h +++ b/arch/x86/include/asm/qspinlock.h @@ -34,7 +34,7 @@ static inline void queued_spin_unlock(struct qspinlock *lock) } #define vcpu_is_preempted vcpu_is_preempted -static inline bool vcpu_is_preempted(int cpu) +static inline bool vcpu_is_preempted(long cpu) { return pv_vcpu_is_preempted(cpu); } diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 099fcba..85ed343 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -589,7 +589,7 @@ static void kvm_wait(u8 *ptr, u8 val) local_irq_restore(flags); } -__visible bool __kvm_vcpu_is_preempted(int cpu) +__visible bool __kvm_vcpu_is_preempted(long cpu) { struct kvm_steal_time *src = &per_cpu(steal_time, cpu); diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c index 6259327..8f2d1c9 100644 --- a/arch/x86/kernel/paravirt-spinlocks.c +++ b/arch/x86/kernel/paravirt-spinlocks.c @@ -20,7 +20,7 @@ bool pv_is_native_spin_unlock(void) __raw_callee_save___native_queued_spin_unlock; } -__visible bool __native_vcpu_is_preempted(int cpu) +__visible bool __native_vcpu_is_preempted(long cpu) { return false; } -- 1.8.3.1
Waiman Long
2017-Feb-15 18:31 UTC
[PATCH v3 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
It was found when running fio sequential write test with a XFS ramdisk on a KVM guest running on a 2-socket x86-64 system, the %CPU times as reported by perf were as follows: 69.75% 0.59% fio [k] down_write 69.15% 0.01% fio [k] call_rwsem_down_write_failed 67.12% 1.12% fio [k] rwsem_down_write_failed 63.48% 52.77% fio [k] osq_lock 9.46% 7.88% fio [k] __raw_callee_save___kvm_vcpu_is_preempt 3.93% 3.93% fio [k] __kvm_vcpu_is_preempted Making vcpu_is_preempted() a callee-save function has a relatively high cost on x86-64 primarily due to at least one more cacheline of data access from the saving and restoring of registers (8 of them) to and from stack as well as one more level of function call. To reduce this performance overhead, an optimized assembly version of the the __raw_callee_save___kvm_vcpu_is_preempt() function is provided for x86-64. With this patch applied on a KVM guest on a 2-socekt 16-core 32-thread system with 16 parallel jobs (8 on each socket), the aggregrate bandwidth of the fio test on an XFS ramdisk were as follows: I/O Type w/o patch with patch -------- --------- ---------- random read 8141.2 MB/s 8497.1 MB/s seq read 8229.4 MB/s 8304.2 MB/s random write 1675.5 MB/s 1701.5 MB/s seq write 1681.3 MB/s 1699.9 MB/s There are some increases in the aggregated bandwidth because of the patch. The perf data now became: 70.78% 0.58% fio [k] down_write 70.20% 0.01% fio [k] call_rwsem_down_write_failed 69.70% 1.17% fio [k] rwsem_down_write_failed 59.91% 55.42% fio [k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted The assembly code was verified by using a test kernel module to compare the output of C __kvm_vcpu_is_preempted() and that of assembly __raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <peterz at infradead.org> Signed-off-by: Waiman Long <longman at redhat.com> --- arch/x86/kernel/kvm.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 85ed343..83e22c1 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -589,6 +589,7 @@ static void kvm_wait(u8 *ptr, u8 val) local_irq_restore(flags); } +#ifdef CONFIG_X86_32 __visible bool __kvm_vcpu_is_preempted(long cpu) { struct kvm_steal_time *src = &per_cpu(steal_time, cpu); @@ -597,11 +598,38 @@ __visible bool __kvm_vcpu_is_preempted(long cpu) } PV_CALLEE_SAVE_REGS_THUNK(__kvm_vcpu_is_preempted); +#else + +extern bool __raw_callee_save___kvm_vcpu_is_preempted(long); + +/* + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and + * restoring to/from the stack. It is assumed that the preempted value + * is at an offset of 16 from the beginning of the kvm_steal_time structure + * which is verified by the BUILD_BUG_ON() macro below. + */ +#define PREEMPTED_OFFSET 16 +asm( +".pushsection .text;" +".global __raw_callee_save___kvm_vcpu_is_preempted;" +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" +"__raw_callee_save___kvm_vcpu_is_preempted:" +"movq __per_cpu_offset(,%rdi,8), %rax;" +"cmpb $0, " __stringify(PREEMPTED_OFFSET) "+steal_time(%rax);" +"setne %al;" +"ret;" +".popsection"); + +#endif + /* * Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present. */ void __init kvm_spinlock_init(void) { + BUILD_BUG_ON((offsetof(struct kvm_steal_time, preempted) + != PREEMPTED_OFFSET) || (sizeof(steal_time.preempted) != 1)); + if (!kvm_para_available()) return; /* Does host kernel support KVM_FEATURE_PV_UNHALT? */ -- 1.8.3.1
kbuild test robot
2017-Feb-15 20:28 UTC
[PATCH v3 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
Hi Waiman, [auto build test ERROR on kvm/linux-next] [also build test ERROR on v4.10-rc8 next-20170215] [cannot apply to tip/x86/core] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Waiman-Long/x86-kvm-Reduce-vcpu_is_preempted-overhead/20170216-033517 base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next config: i386-randconfig-h1-02160052 (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=i386 All error/warnings (new ones prefixed by >>): In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/uapi/linux/capability.h:16, from include/linux/capability.h:15, from include/linux/sched.h:15, from include/linux/context_tracking.h:4, from arch/x86/kernel/kvm.c:23: arch/x86/kernel/kvm.c: In function 'kvm_spinlock_init':>> arch/x86/kernel/kvm.c:631:6: error: 'PREEMPTED_OFFSET' undeclared (first use in this function)!= PREEMPTED_OFFSET) || (sizeof(steal_time.preempted) != 1)); ^ include/linux/compiler.h:498:19: note: in definition of macro '__compiletime_assert' bool __cond = !(condition); \ ^~~~~~~~~ include/linux/compiler.h:518:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^~~~~~~~~~~~~~~~~~~ include/linux/bug.h:54:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ include/linux/bug.h:78:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^~~~~~~~~~~~~~~~>> arch/x86/kernel/kvm.c:630:2: note: in expansion of macro 'BUILD_BUG_ON'BUILD_BUG_ON((offsetof(struct kvm_steal_time, preempted) ^~~~~~~~~~~~ arch/x86/kernel/kvm.c:631:6: note: each undeclared identifier is reported only once for each function it appears in != PREEMPTED_OFFSET) || (sizeof(steal_time.preempted) != 1)); ^ include/linux/compiler.h:498:19: note: in definition of macro '__compiletime_assert' bool __cond = !(condition); \ ^~~~~~~~~ include/linux/compiler.h:518:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^~~~~~~~~~~~~~~~~~~ include/linux/bug.h:54:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ include/linux/bug.h:78:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^~~~~~~~~~~~~~~~>> arch/x86/kernel/kvm.c:630:2: note: in expansion of macro 'BUILD_BUG_ON'BUILD_BUG_ON((offsetof(struct kvm_steal_time, preempted) ^~~~~~~~~~~~ vim +/PREEMPTED_OFFSET +631 arch/x86/kernel/kvm.c 624 625 /* 626 * Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present. 627 */ 628 void __init kvm_spinlock_init(void) 629 { > 630 BUILD_BUG_ON((offsetof(struct kvm_steal_time, preempted) > 631 != PREEMPTED_OFFSET) || (sizeof(steal_time.preempted) != 1)); 632 633 if (!kvm_para_available()) 634 return; --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/gzip Size: 27710 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20170216/cefc02e0/attachment-0001.bin>
Reasonably Related Threads
- [PATCH v3 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead
- [PATCH v4 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead
- [PATCH v4 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
- [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function