Changes from v1:
	a simpler definition of the default vcpu_is_preempted
	skip the machine type check on ppc, and add a config check. remove the dedicated macro.
	add one patch to drop the overload of rwsem_spin_on_owner and mutex_spin_on_owner.
	add more comments
	thanks to Boqun and Peter for their suggestions.
This patch set aims to fix lock holder preemption issues.
test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
We introduce the interface bool vcpu_is_preempted(int cpu) and use it in the
spin loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
Before this patch set is applied, these spin_on_owner variants also cause RCU
stalls.
Pan Xinhui (4):
  kernel/sched: introduce vcpu preempted check interface
  powerpc/spinlock: support vcpu preempted check
  locking/osq: Drop the overload of osq_lock()
  kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
 arch/powerpc/include/asm/spinlock.h | 18 ++++++++++++++++++
 include/linux/sched.h               | 12 ++++++++++++
 kernel/locking/mutex.c              | 15 +++++++++++++--
 kernel/locking/osq_lock.c           | 10 +++++++++-
 kernel/locking/rwsem-xadd.c         | 16 +++++++++++++---
 5 files changed, 65 insertions(+), 6 deletions(-)
-- 
2.4.11
Pan Xinhui
2016-Jun-28  14:43 UTC
[PATCH v2 1/4] kernel/sched: introduce vcpu preempted check interface
This patch helps to fix the lock holder preemption issue.
Kernel users can call bool vcpu_is_preempted(int cpu) to detect whether a
vCPU is preempted or not.
The default implementation is a macro defined as false, so the compiler can
optimize it away if the arch does not support such a vCPU preempted check.
Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>
---
 include/linux/sched.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e42ada..cbe0574 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3293,6 +3293,18 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
 
 #endif /* CONFIG_SMP */
 
+/*
+ * In order to deal with a various lock holder preemption issues provide an
+ * interface to see if a vCPU is currently running or not.
+ *
+ * This allows us to terminate optimistic spin loops and block, analogous to
+ * the native optimistic spin heuristic of testing if the lock owner task is
+ * running or not.
+ */
+#ifndef vcpu_is_preempted
+#define vcpu_is_preempted(cpu)	false
+#endif
+
 extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask);
 extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
 
-- 
2.4.11
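Editorial sketch (not part of the posted patch): the override pattern above can be tried in user space. An arch opts in by defining a function and a macro of the same name, so the generic #ifndef fallback is skipped. DEMO_ARCH_HAS_PREEMPT_CHECK, DEMO_NR_CPUS, demo_preempted[] and should_stop_spinning() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustration-only switch: set to 0 to fall back to the generic macro. */
#define DEMO_ARCH_HAS_PREEMPT_CHECK 1
#define DEMO_NR_CPUS 4

/* Stand-in for whatever per-vCPU state a hypervisor would expose. */
static bool demo_preempted[DEMO_NR_CPUS];

#if DEMO_ARCH_HAS_PREEMPT_CHECK
/* The arch defines a macro with the function's own name to announce it. */
#define vcpu_is_preempted vcpu_is_preempted
static inline bool vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_preempted[cpu];
}
#endif

/* Generic fallback, as in the patch: constant false if the arch is silent. */
#ifndef vcpu_is_preempted
#define vcpu_is_preempted(cpu)	false
#endif

/* A caller does not need to know which definition is in effect. */
static bool should_stop_spinning(int owner_cpu, bool need_resched)
{
	return need_resched || vcpu_is_preempted(owner_cpu);
}
```

With the fallback in effect the second operand is the constant false, so a compiler can drop the check entirely, which is the point made in the commit message.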
Pan Xinhui
2016-Jun-28  14:43 UTC
[PATCH v2 2/4] powerpc/spinlock: support vcpu preempted check
This is to fix some lock holder preemption issues. Some lock
implementations do a spin loop before acquiring the lock itself. The
kernel now has an interface, bool vcpu_is_preempted(int cpu). It takes the cpu
as a parameter and returns true if that cpu is preempted. The kernel can then
break out of the spin loops based on the return value of vcpu_is_preempted.
As the kernel already uses this interface, let's support it.
Only pSeries needs to support it. powerNV is built into the same
kernel image as pSeries, so in principle we need to return false if we are
running as powerNV. However, lppaca->yield_count stays zero on
powerNV, so we can just skip the machine type check.
Suggested-by: Boqun Feng <boqun.feng at gmail.com>
Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/spinlock.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
diff --git a/arch/powerpc/include/asm/spinlock.h
b/arch/powerpc/include/asm/spinlock.h
index 523673d..3ac9fcb 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,24 @@
 #define SYNC_IO
 #endif
 
+/*
+ * This lets the kernel check whether a cpu is preempted or not,
+ * so that we can fix some lock holder preemption issues.
+ */
+#ifdef CONFIG_PPC_PSERIES
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	/*
+	 * pSeries and powerNV can be built into same kernel image. In
+	 * principle we need return false directly if we are running as
+	 * powerNV. However the yield_count is always zero on powerNV, So
+	 * skip such machine type check
+	 */
+	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
+}
+#endif
+
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
 	return lock.slock == 0;
-- 
2.4.11
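Editorial sketch (not part of the posted patch): the test in vcpu_is_preempted() above treats an odd yield_count as "preempted". The user-space model below reproduces only that parity check on a big-endian field. struct demo_lppaca, demo_set_yield_count() and the other demo_ names are hypothetical, and the claim that the counter is odd while the vCPU is not dispatched is taken from what the patch's test implies, not from a specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4

/* Illustration-only stand-in for the per-CPU lppaca; field is big-endian. */
struct demo_lppaca {
	uint8_t yield_count_be[4];
};

static struct demo_lppaca demo_lppaca[DEMO_NR_CPUS];

/* Portable big-endian load, playing the role of be32_to_cpu(). */
static uint32_t demo_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Test helper: store a host value in big-endian byte order. */
static void demo_set_yield_count(int cpu, uint32_t count)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_lppaca[cpu].yield_count_be[0] = (uint8_t)(count >> 24);
	demo_lppaca[cpu].yield_count_be[1] = (uint8_t)(count >> 16);
	demo_lppaca[cpu].yield_count_be[2] = (uint8_t)(count >> 8);
	demo_lppaca[cpu].yield_count_be[3] = (uint8_t)count;
}

/* Same shape as the patch: odd counter means the vCPU is preempted. */
static bool demo_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return !!(demo_be32_to_cpu(demo_lppaca[cpu].yield_count_be) & 1);
}
```

A counter that stays at zero, as the commit message says happens on powerNV, is even, so the function returns false there without any machine type check.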
Pan Xinhui
2016-Jun-28  14:43 UTC
[PATCH v2 3/4] locking/osq: Drop the overload of osq_lock()
An over-committed guest with more vCPUs than pCPUs has a heavy overhead in
osq_lock().
This is because vCPU A holds the osq lock and yields out, while vCPU B waits
for its per_cpu node->locked to be set. IOW, vCPU B waits for vCPU A to run
and unlock the osq lock.
The kernel has an interface, bool vcpu_is_preempted(int cpu), to see if a vCPU
is currently running or not. So break out of the spin loop when it returns true.
test case:
perf record -a perf bench sched messaging -g 400 -p && perf report
before patch:
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
after patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
Suggested-by: Boqun Feng <boqun.feng at gmail.com>
Signed-off-by: Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>
---
 kernel/locking/osq_lock.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..39d1385 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -21,6 +21,11 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
+static inline int node_cpu(struct optimistic_spin_node *node)
+{
+	return node->cpu - 1;
+}
+
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -118,8 +123,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	while (!READ_ONCE(node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
+		 * Use vcpu_is_preempted to detect lock holder preemption issue
+		 * and break. vcpu_is_preempted is a macro defined as false if
+		 * arch does not support vcpu preempted check.
 		 */
-		if (need_resched())
+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
 			goto unqueue;
 
 		cpu_relax_lowlatency();
-- 
2.4.11
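Editorial sketch (not part of the posted patch): the osq node stores its CPU as "number + 1", so node_cpu() undoes that before asking about the previous waiter's vCPU. The model below shows only that decoding and the bail-out decision of one loop iteration. All demo_ names are hypothetical and there is no real queueing or atomics here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Simplified stand-in for struct optimistic_spin_node. */
struct demo_osq_node {
	struct demo_osq_node *prev;
	int locked;
	int cpu;	/* encoded: CPU number + 1, so 0 can mean "no CPU" */
};

static bool demo_preempted[DEMO_NR_CPUS];

static bool demo_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_preempted[cpu];
}

static inline int demo_encode_cpu(int cpu_nr)
{
	return cpu_nr + 1;
}

/* Mirrors node_cpu() from the patch: decode back to a CPU number. */
static inline int demo_node_cpu(const struct demo_osq_node *node)
{
	return node->cpu - 1;
}

/* Decision for one spin iteration: true corresponds to "goto unqueue". */
static bool demo_osq_should_unqueue(const struct demo_osq_node *node,
				    bool need_resched)
{
	return need_resched ||
	       demo_vcpu_is_preempted(demo_node_cpu(node->prev));
}
```

The waiter checks the CPU of node->prev because that is the vCPU it depends on to hand over the queue position.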
Pan Xinhui
2016-Jun-28  14:43 UTC
[PATCH v2 4/4] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner
An over-committed guest with more vCPUs than pCPUs has a heavy overhead in
the two spin_on_owner functions. This is caused by the lock holder preemption
issue.
The kernel has an interface, bool vcpu_is_preempted(int cpu), to see if a vCPU
is currently running or not. So break out of the spin loops when it returns true.
test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report
before patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
after patch:
 9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
 4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
 3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
 2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
Signed-off-by: Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>
---
 kernel/locking/mutex.c      | 15 +++++++++++++--
 kernel/locking/rwsem-xadd.c | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 79d2d76..ef0451b2 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -236,7 +236,13 @@ bool mutex_spin_on_owner(struct mutex *lock, struct
task_struct *owner)
 		 */
 		barrier();
 
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Use vcpu_is_preempted to detect lock holder preemption issue
+		 * and break. vcpu_is_preempted is a macro defined as false if
+		 * arch does not support vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			ret = false;
 			break;
 		}
@@ -261,8 +267,13 @@ static inline int mutex_can_spin_on_owner(struct mutex
*lock)
 
 	rcu_read_lock();
 	owner = READ_ONCE(lock->owner);
+
+	/*
+	 * Because of the lock holder preemption issue, skip spinning if the
+	 * task is not on cpu or its cpu is preempted
+	 */
 	if (owner)
-		retval = owner->on_cpu;
+		retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 09e30c6..828ca7c 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -319,7 +319,11 @@ static inline bool rwsem_can_spin_on_owner(struct
rw_semaphore *sem)
 		goto done;
 	}
 
-	ret = owner->on_cpu;
+	/*
+	 * Because of the lock holder preemption issue, skip spinning if the
+	 * task is not on cpu or its cpu is preempted
+	 */
+	ret = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 done:
 	rcu_read_unlock();
 	return ret;
@@ -340,8 +344,14 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct
task_struct *owner)
 		 */
 		barrier();
 
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * abort spinning when need_resched or owner is not running or
+		 * owner's cpu is preempted. vcpu_is_preempted is a macro
+		 * defined by false if arch does not support vcpu preempted
+		 * check
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
2.4.11
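Editorial sketch (not part of the posted patch): the two decisions changed in this patch are "may I start spinning?" and "may I keep spinning?". The user-space model below follows the mutex variants shown in the diff. struct demo_task and the demo_ functions are hypothetical simplifications, and the handling of a NULL owner and of need_resched in demo_can_spin_on_owner() is an assumption about the surrounding mutex code that the hunk does not show in full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Minimal stand-in for the task_struct fields the patch looks at. */
struct demo_task {
	int on_cpu;	/* non-zero while the task is running on a CPU */
	int cpu;	/* what task_cpu(owner) would return */
};

static bool demo_preempted[DEMO_NR_CPUS];

static bool demo_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_preempted[cpu];
}

/* Entry check, modelled on mutex_can_spin_on_owner(). */
static bool demo_can_spin_on_owner(const struct demo_task *owner,
				   bool need_resched)
{
	if (need_resched)
		return false;
	if (!owner)
		return true;	/* assumed: owner may not be set yet */
	return owner->on_cpu && !demo_vcpu_is_preempted(owner->cpu);
}

/* Per-iteration check, modelled on the loop in mutex_spin_on_owner(). */
static bool demo_keep_spinning(const struct demo_task *owner,
			       bool need_resched)
{
	if (!owner->on_cpu || need_resched ||
	    demo_vcpu_is_preempted(owner->cpu))
		return false;
	return true;
}
```

In both places an owner that is "on a CPU" from the guest's point of view but whose vCPU is preempted is treated the same as an owner that is not running.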
Wanpeng Li
2016-Jul-05  09:57 UTC
[PATCH v2 2/4] powerpc/spinlock: support vcpu preempted check
Hi Xinhui,
2016-06-28 22:43 GMT+08:00 Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>:
> This is to fix some lock holder preemption issues. Some other locks
> implementation do a spin loop before acquiring the lock itself. Currently
> kernel has an interface of bool vcpu_is_preempted(int cpu). It take the cpu
> as parameter and return true if the cpu is preempted. Then kernel can break
> the spin loops upon on the retval of vcpu_is_preempted.
>
> As kernel has used this interface, So lets support it.
>
> Only pSeries need supoort it. And the fact is powerNV are built into same
> kernel image with pSeries. So we need return false if we are runnig as
> powerNV. The another fact is that lppaca->yiled_count keeps zero on
> powerNV. So we can just skip the machine type.
Is lock holder vCPU preemption detected by the pSeries hardware or by a
paravirt method?
Regards,
Wanpeng Li
On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
> change fomr v1:
> 	a simplier definition of default vcpu_is_preempted
> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
> 	add more comments
> 	thanks boqun and Peter's suggestion.
>
> This patch set aims to fix lock holder preemption issues.
>
> test-case:
> perf record -a perf bench sched messaging -g 400 -p && perf report
>
> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>
> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
> These spin_on_onwer variant also cause rcu stall before we apply this patch set
>
Paolo, could you help out with an (x86) KVM interface for this?
Waiman, could you see if you can utilize this to get rid of the
SPIN_THRESHOLD in qspinlock_paravirt?
On 06/07/16 08:52, Peter Zijlstra wrote:
> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>> change fomr v1:
>> 	a simplier definition of default vcpu_is_preempted
>> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
>> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>> 	add more comments
>> 	thanks boqun and Peter's suggestion.
>>
>> This patch set aims to fix lock holder preemption issues.
>>
>> test-case:
>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>
>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>
>> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>> These spin_on_onwer variant also cause rcu stall before we apply this patch set
>>
>
> Paolo, could you help out with an (x86) KVM interface for this?
Xen support of this interface should be rather easy. Could you please Cc:
xen-devel-request at lists.xenproject.org in the next version?
Juergen
On 2016-07-06 14:52, Peter Zijlstra wrote:
> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>> change fomr v1:
>> 	a simplier definition of default vcpu_is_preempted
>> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
>> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>> 	add more comments
>> 	thanks boqun and Peter's suggestion.
>>
>> This patch set aims to fix lock holder preemption issues.
>>
>> test-case:
>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>
>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>
>> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>> These spin_on_onwer variant also cause rcu stall before we apply this patch set
>>
>
> Paolo, could you help out with an (x86) KVM interface for this?
>
> Waiman, could you see if you can utilize this to get rid of the
> SPIN_THRESHOLD in qspinlock_paravirt?
>
Hmm, maybe something like below. wait_node can go into pv_wait() earlier, as
soon as the prev cpu is preempted. But for the wait_head, as qspinlock does
not record the lock_holder correctly (thanks to lock stealing), the vcpu
preemption check might get wrong results.
Waiman, I have used a hash table to keep the lock holder in my ppc
implementation patch. I think we could do something similar in generic code?
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 74c4a86..40560e8 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -312,7 +312,8 @@ pv_wait_early(struct pv_node *prev, int loop)
 	if ((loop & PV_PREV_CHECK_MASK) != 0)
 		return false;
 
-	return READ_ONCE(prev->state) != vcpu_running;
+	return READ_ONCE(prev->state) != vcpu_running ||
+			vcpu_is_preempted(prev->cpu);
 }
 
 /*
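Editorial sketch (not part of the thread): the proposed pv_wait_early() change can be modelled in user space as below. The previous node is only inspected on iterations where the masked loop counter is zero, and the waiter gives up early if the previous vCPU is either not marked running or reported preempted. The demo_ names are hypothetical and the mask value 0xff is an assumption, not taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 4
#define DEMO_PREV_CHECK_MASK 0xff	/* assumed value, illustration only */

enum demo_vcpu_state {
	demo_vcpu_running = 0,
	demo_vcpu_halted,
};

/* Simplified stand-in for struct pv_node. */
struct demo_pv_node {
	int cpu;
	enum demo_vcpu_state state;
};

static bool demo_preempted[DEMO_NR_CPUS];

static bool demo_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_preempted[cpu];
}

/* true means: stop spinning now and go to pv_wait() early. */
static bool demo_pv_wait_early(const struct demo_pv_node *prev, int loop)
{
	/* Only look at the previous node on every 256th iteration. */
	if ((loop & DEMO_PREV_CHECK_MASK) != 0)
		return false;

	return prev->state != demo_vcpu_running ||
	       demo_vcpu_is_preempted(prev->cpu);
}
```

The extra condition covers the case where the previous waiter still looks "running" in its own state field but its vCPU has been descheduled by the hypervisor.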
On 06/07/2016 08:52, Peter Zijlstra wrote:
> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>> change fomr v1:
>> 	a simplier definition of default vcpu_is_preempted
>> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
>> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>> 	add more comments
>> 	thanks boqun and Peter's suggestion.
>>
>> This patch set aims to fix lock holder preemption issues.
>>
>> test-case:
>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>
>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>
>> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>> These spin_on_onwer variant also cause rcu stall before we apply this patch set
>>
>
> Paolo, could you help out with an (x86) KVM interface for this?
If it's just for spin loops, you can check if the version field in the steal
time structure has changed.
Paolo
> Waiman, could you see if you can utilize this to get rid of the
> SPIN_THRESHOLD in qspinlock_paravirt?
>
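Editorial sketch (not part of the thread): one possible reading of the suggestion above is that a spinner snapshots the lock holder's steal time version when it starts spinning and stops as soon as the value differs. The model below shows only that comparison. struct demo_steal_time and the demo_ functions are hypothetical, and the assumption that the hypervisor changes the version whenever the vCPU has been scheduled out and back in is not confirmed by the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustration-only stand-in for a per-vCPU steal time record. */
struct demo_steal_time {
	uint64_t steal;
	uint32_t version;	/* assumed to change on each update */
};

/* Taken once, before the spin loop starts. */
static uint32_t demo_snapshot_version(const struct demo_steal_time *st)
{
	return st->version;
}

/* Checked inside the spin loop: true means "stop spinning". */
static bool demo_version_changed(const struct demo_steal_time *st,
				 uint32_t snapshot)
{
	return st->version != snapshot;
}
```

Unlike vcpu_is_preempted(), this check cannot say whether the holder is preempted right now. It only reports that something happened to the holder's vCPU since the snapshot, which may be enough to end a spin loop.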
Balbir Singh
2016-Jul-06  10:54 UTC
[PATCH v2 2/4] powerpc/spinlock: support vcpu preempted check
On Tue, 2016-06-28 at 10:43 -0400, Pan Xinhui wrote:
> This is to fix some lock holder preemption issues. Some other locks
> implementation do a spin loop before acquiring the lock itself. Currently
> kernel has an interface of bool vcpu_is_preempted(int cpu). It take the cpu
^^ takes
> as parameter and return true if the cpu is preempted. Then kernel can break
> the spin loops upon on the retval of vcpu_is_preempted.
>
> As kernel has used this interface, So lets support it.
>
> Only pSeries need supoort it. And the fact is powerNV are built into same
^^ support
> kernel image with pSeries. So we need return false if we are runnig as
> powerNV. The another fact is that lppaca->yiled_count keeps zero on
^^ yield
> powerNV. So we can just skip the machine type.
>
> Suggested-by: Boqun Feng <boqun.feng at gmail.com>
> Suggested-by: Peter Zijlstra (Intel) <peterz at infradead.org>
> Signed-off-by: Pan Xinhui <xinhui.pan at linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/spinlock.h | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
> index 523673d..3ac9fcb 100644
> --- a/arch/powerpc/include/asm/spinlock.h
> +++ b/arch/powerpc/include/asm/spinlock.h
> @@ -52,6 +52,24 @@
>  #define SYNC_IO
>  #endif
>
> +/*
> + * This support kernel to check if one cpu is preempted or not.
> + * Then we can fix some lock holder preemption issue.
> + */
> +#ifdef CONFIG_PPC_PSERIES
> +#define vcpu_is_preempted vcpu_is_preempted
> +static inline bool vcpu_is_preempted(int cpu)
> +{
> +	/*
> +	 * pSeries and powerNV can be built into same kernel image. In
> +	 * principle we need return false directly if we are running as
> +	 * powerNV. However the yield_count is always zero on powerNV, So
> +	 * skip such machine type check
Or you could use the ppc_md interface callbacks if required, but your
solution works as well
> +	 */
> +	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
> +}
> +#endif
> +
>  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>  {
>  	return lock.slock == 0;
Balbir Singh.
On 07/06/2016 02:52 AM, Peter Zijlstra wrote:
> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>> change fomr v1:
>> 	a simplier definition of default vcpu_is_preempted
>> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
>> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>> 	add more comments
>> 	thanks boqun and Peter's suggestion.
>>
>> This patch set aims to fix lock holder preemption issues.
>>
>> test-case:
>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>
>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>
>> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>> These spin_on_onwer variant also cause rcu stall before we apply this patch set
>>
> Paolo, could you help out with an (x86) KVM interface for this?
>
> Waiman, could you see if you can utilize this to get rid of the
> SPIN_THRESHOLD in qspinlock_paravirt?
That API is certainly useful to make the paravirt spinlock perform better.
However, I am not sure if we can completely get rid of the SPIN_THRESHOLD at
this point. It is not just the kvm, the xen code need to be modified as well.
Cheers,
Longman