thr3ads.net - similar to: "[PATCH v7 00/11] qspinlock: a 4-byte queue spinlock with PV support"

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014 Apr 01

10

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

v7->v8:
- Remove one unneeded atomic operation from the slowpath, thus improving performance.
- Simplify some of the code and add more comments.
- Test for the X86_FEATURE_HYPERVISOR CPU feature bit to enable/disable the unfair lock.
- Reduce the unfair lock slowpath lock stealing frequency depending on its distance from the queue head.
- Add performance data for IvyBridge-EX CPU.
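The X86_FEATURE_HYPERVISOR test mentioned in the changelog corresponds to CPUID leaf 1, ECX bit 31. A minimal userspace sketch of the same check follows, assuming GCC or Clang on x86. The helper names `running_under_hypervisor` and `pick_lock_mode` and the `enum lock_mode` type are hypothetical and do not come from the patch series, which performs this test inside the kernel.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical illustration type: which lock flavour to use. */
enum lock_mode { LOCK_FAIR, LOCK_UNFAIR };

/*
 * Hypothetical helper: returns true when CPUID.1:ECX bit 31 (the
 * "hypervisor present" bit) is set. On non-x86 builds it simply
 * returns false.
 */
static bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;            /* leaf 1 not supported */
    return ((ecx >> 31) & 1u) != 0;
#else
    return false;
#endif
}

/*
 * Assumed policy for illustration only: use the unfair lock when a
 * hypervisor is detected, the fair queue lock otherwise.
 */
static enum lock_mode pick_lock_mode(bool hypervisor)
{
    return hypervisor ? LOCK_UNFAIR : LOCK_FAIR;
}
```

Calling `pick_lock_mode(running_under_hypervisor())` at startup would select the mode once. The actual series may gate the unfair lock on additional configuration options that this sketch does not model.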

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014 Apr 02

17

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

N.B. Sorry for the duplicate. This patch series was resent as the original one was rejected by the vger.kernel.org list server due to a long header. There is no change in content.

v7->v8:
- Remove one unneeded atomic operation from the slowpath, thus improving performance.
- Simplify some of the code and add more comments.
- Test for X86_FEATURE_HYPERVISOR CPU feature bit

[PATCH v6 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014 Mar 12

17

[PATCH v6 00/11] qspinlock: a 4-byte queue spinlock with PV support

v5->v6:
- Change the optimized 2-task contending code to make it fairer at the expense of a bit of performance.
- Add a patch to support unfair queue spinlock for Xen.
- Modify the PV qspinlock code to follow what was done in the PV ticketlock.
- Add performance data for the unfair lock as well as the PV support code.

v4->v5:
- Move the optimized 2-task contending code to the

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014 Feb 27

14

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

v4->v5:
- Move the optimized 2-task contending code to the generic file to enable more architectures to use it without code duplication.
- Address some of the style-related comments by PeterZ.
- Allow the use of unfair queue spinlock in a real para-virtualized execution environment.
- Add para-virtualization support to the qspinlock code by ensuring that the lock holder and queue

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014 Feb 26

22

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

v4->v5:
- Move the optimized 2-task contending code to the generic file to enable more architectures to use it without code duplication.
- Address some of the style-related comments by PeterZ.
- Allow the use of unfair queue spinlock in a real para-virtualized execution environment.
- Add para-virtualization support to the qspinlock code by ensuring that the lock holder and queue

[PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014 Mar 02

1

[PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

On 02/26, Waiman Long wrote:
>
> +void queue_spin_lock_slowpath(struct qspinlock *lock, int qsval)
> +{
> +	unsigned int cpu_nr, qn_idx;
> +	struct qnode *node, *next;
> +	u32 prev_qcode, my_qcode;
> +
> +	/*
> +	 * Get the queue node
> +	 */
> +	cpu_nr = smp_processor_id();
> +	node = get_qnode(&qn_idx);
> +
> +	/*
> +	 * It should never happen

[PATCH v8 01/10] qspinlock: A generic 4-byte queue spinlock implementation

2014 Apr 02

0

[PATCH v8 01/10] qspinlock: A generic 4-byte queue spinlock implementation

This patch introduces a new generic queue spinlock implementation that can serve as an alternative to the default ticket spinlock. This queue spinlock should be almost as fair as the ticket spinlock. It has about the same speed in the single-thread case, and it can be much faster in high-contention situations, especially when the spinlock is embedded within the data
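To illustrate why a queue-based lock can behave better under contention, here is a minimal userspace sketch of a classic MCS-style queue lock using C11 atomics: each waiter spins on a flag in its own node, and the lock is handed over in FIFO order. This is not the 4-byte kernel qspinlock from the patch. All names (`mcs_lock`, `mcs_node`, `mcs_acquire`, `mcs_release`, `run_demo`) are hypothetical, and the `sched_yield()` calls are a userspace concession for oversubscribed machines.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiter; each waiter spins only on its own node. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
};

/* The lock itself is just a pointer to the tail of the waiter queue. */
struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

static void mcs_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&node->locked, true, memory_order_relaxed);

    /* Atomically append ourselves to the queue. */
    struct mcs_node *prev =
        atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (prev == NULL)
        return;                       /* queue was empty: lock acquired */

    /* Link behind the previous waiter, then spin on our own flag. */
    atomic_store_explicit(&prev->next, node, memory_order_release);
    while (atomic_load_explicit(&node->locked, memory_order_acquire))
        sched_yield();
}

static void mcs_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_acquire);

    if (next == NULL) {
        /* No visible successor: try to reset the queue to empty. */
        struct mcs_node *expected = node;
        if (atomic_compare_exchange_strong_explicit(
                &lock->tail, &expected, NULL,
                memory_order_acq_rel, memory_order_acquire))
            return;
        /* A successor is enqueueing; wait until it links itself. */
        while ((next = atomic_load_explicit(&node->next,
                                            memory_order_acquire)) == NULL)
            sched_yield();
    }
    /* Hand the lock to the next waiter in FIFO order. */
    atomic_store_explicit(&next->locked, false, memory_order_release);
}

struct demo_args {
    struct mcs_lock *lock;
    long *counter;
    long iters;
};

static void *demo_worker(void *p)
{
    struct demo_args *a = p;
    struct mcs_node node;             /* per-thread queue node */

    for (long i = 0; i < a->iters; i++) {
        mcs_acquire(a->lock, &node);
        ++*a->counter;                /* plain increment, protected by the lock */
        mcs_release(a->lock, &node);
    }
    return NULL;
}

/* Runs nthreads workers (1..16) and returns the final counter value. */
static long run_demo(int nthreads, long iters)
{
    struct mcs_lock lock;
    long counter = 0;
    struct demo_args args = { &lock, &counter, iters };
    pthread_t tids[16];

    assert(nthreads >= 1 && nthreads <= 16);
    atomic_init(&lock.tail, NULL);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, demo_worker, &args);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

If mutual exclusion holds, `run_demo(4, 1000)` returns 4000. Because each waiter spins on its own node rather than on the shared lock word, a release touches only the successor's cache line. That is the general property that makes queue locks attractive when the lock shares a cache line with the data it protects.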

[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support

2014 Apr 17

33

[PATCH v9 00/19] qspinlock: a 4-byte queue spinlock with PV support

v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some modification: http://lkml.kernel.org/r/20140310154236.038181843@infradead.org
- Break the more complex patches into smaller ones to ease review effort.
- Fix a race condition in the PV qspinlock code.

v7->v8:
- Remove one unneeded atomic operation from the slowpath, thus improving

[PATCH v10 00/19] qspinlock: a 4-byte queue spinlock with PV support

2014 May 07

32

[PATCH v10 00/19] qspinlock: a 4-byte queue spinlock with PV support

v9->v10:
- Make some minor changes to qspinlock.c to accommodate review feedback.
- Change author to PeterZ for 2 of the patches.
- Include Raghavendra KT's test results in patch 18.

v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some modification: http://lkml.kernel.org/r/20140310154236.038181843@infradead.org
- Break the more complex

[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014 Mar 03

5

[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

Hi,

Here are some numbers for my version -- also attached is the test code. I found that booting big machines is tediously slow so I lifted the whole lot to userspace. I measure the cycles spent in arch_spin_lock() + arch_spin_unlock(). The machines used are a 4 node (2 socket) AMD Interlagos, and a 2 node (2 socket) Intel Westmere-EP.

AMD (ticket) AMD (qspinlock + pending + opt) Local:

[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014 Feb 26

2

[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

You don't happen to have a proper state diagram for this thing do you? I suppose I'm going to have to make one; this is all getting a bit unwieldy, and those xchg() + fixup things are hard to read.

On Wed, Feb 26, 2014 at 10:14:23AM -0500, Waiman Long wrote:
> +static inline int queue_spin_trylock_quick(struct qspinlock *lock, int qsval)
> +{
> +	union arch_qspinlock *qlock =
