Displaying 20 results from an estimated 149 matches for "uncontended".
2014 May 07
0
[PATCH v10 03/19] qspinlock: Add pending bit
...structure
* @val: Current value of the queue spinlock 32-bit word
*
- * (queue tail, lock bit)
+ * (queue tail, pending bit, lock bit)
+ *
+ * fast : slow : unlock
+ * : :
+ * uncontended (0,0,0) -:--> (0,0,1) ------------------------------:--> (*,*,0)
+ * : | ^--------.------. / :
+ * : v \ \ | :
+ * pending : (0,1,1) +--> (0,1,0) \ | :
+ *...
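The uncontended row of the diagram above, (0,0,0) -> (0,0,1) on lock and (*,*,1) -> (*,*,0) on unlock, can be sketched in portable C11. This is a minimal illustration and not the patch's code: the names `qlock`, `qlock_trylock_fast`, `qlock_unlock` and the bit positions chosen for the lock bit, pending bit and queue tail are assumptions made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 32-bit word: (queue tail, pending bit, lock bit). */
#define Q_LOCKED      (1u << 0)   /* lock bit */
#define Q_PENDING     (1u << 8)   /* pending bit (unused on the fast path) */
#define Q_TAIL_SHIFT  16u         /* queue tail lives in the upper bits */

struct qlock {
    _Atomic uint32_t val;
};

/*
 * Fast path: (0,0,0) -> (0,0,1).
 * A single compare-and-swap that succeeds only when nobody holds the lock,
 * nobody is pending and the queue is empty.
 */
static bool qlock_trylock_fast(struct qlock *l)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong_explicit(
        &l->val, &expected, Q_LOCKED,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Unlock: (*,*,1) -> (*,*,0).
 * Only the lock bit is cleared; tail and pending bits are left untouched.
 */
static void qlock_unlock(struct qlock *l)
{
    atomic_fetch_and_explicit(&l->val, ~Q_LOCKED, memory_order_release);
}
```

A caller would fall back to a slow path (pending bit, then queueing) whenever `qlock_trylock_fast()` returns false; that part is omitted here.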
2014 Jun 15
0
[PATCH 03/11] qspinlock: Add pending bit
...k
* @lock: Pointer to queue spinlock structure
* @val: Current value of the queue spinlock 32-bit word
*
- * (queue tail, lock bit)
- *
- * fast : slow : unlock
- * : :
- * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
- * : | ^--------. / :
- * : v \ | :
- * uncontended : (n,x) --+--> (n,0) | :
- * que...
2014 Apr 17
0
[PATCH v9 03/19] qspinlock: Add pending bit
...tructure
* @val: Current value of the queue spinlock 32-bit word
*
- * (queue tail, lock bit)
+ * (queue tail, pending bit, lock bit)
*
- * fast : slow : unlock
- * : :
- * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
- * : | ^--------. / :
- * : v \ | :
- * uncontended : (n,x) --+--> (n,0) | :
- * que...
2015 Mar 16
0
[PATCH 3/9] qspinlock: Add pending bit
...* @lock: Pointer to queue spinlock structure
* @val: Current value of the queue spinlock 32-bit word
*
- * (queue tail, lock value)
- *
- * fast : slow : unlock
- * : :
- * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
- * : | ^--------. / :
- * : v \ | :
- * uncontended : (n,x) --+--> (n,0) | :
- * que...
2014 Mar 03
5
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
...86360.083940
2 - nodes: 2 - nodes:
2: 1509.193824 2: 1209.090219
4: 48154.495998 4: 48547.242379
8: 137946.787244 8: 141381.498125
---
There are a few curious facts I found (assuming my test code is sane).
- Intel seems to be an order of magnitude faster on uncontended LOCKed
ops compared to AMD
- On Intel the uncontended qspinlock fast path (cmpxchg) seems slower
than the uncontended ticket xadd -- although both are plenty fast
when compared to AMD.
- In general, replacing cmpxchg loops with unconditional atomic ops
doesn't seem to matter a w...
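The two uncontended acquire paths compared above differ in the atomic primitive they use: a ticket lock takes a ticket with an unconditional fetch-and-add (xadd on x86), while the queue spinlock fast path uses a compare-and-swap (cmpxchg). The sketch below shows both shapes in C11. It is not the benchmark code from the thread, and the names `tkt_lock`, `cas_lock` and their helpers are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Ticket lock: acquiring is one unconditional fetch-and-add. */
struct tkt_lock {
    _Atomic uint16_t next;   /* next ticket to hand out */
    _Atomic uint16_t owner;  /* ticket currently being served */
};

static void tkt_acquire(struct tkt_lock *l)
{
    uint16_t me = atomic_fetch_add_explicit(&l->next, 1,
                                            memory_order_relaxed);

    /* Uncontended case: owner already equals our ticket, no spinning. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;
}

static void tkt_release(struct tkt_lock *l)
{
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

/* Compare-and-swap lock word: acquiring is a conditional 0 -> 1 update. */
struct cas_lock {
    _Atomic uint32_t val;
};

static bool cas_tryacquire(struct cas_lock *l)
{
    uint32_t expected = 0;

    return atomic_compare_exchange_strong_explicit(
        &l->val, &expected, 1u,
        memory_order_acquire, memory_order_relaxed);
}

static void cas_release(struct cas_lock *l)
{
    atomic_store_explicit(&l->val, 0u, memory_order_release);
}
```

Timing a tight acquire/release loop over each variant on a single thread would be one way to reproduce the kind of uncontended comparison described; the actual numbers depend on the CPU and are not implied by this sketch.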
2014 Jun 16
4
[PATCH 01/11] qspinlock: A simple generic 4-byte queue spinlock
...ept it is not a lock bit. It is a lock uint8_t.
Is the queue tail at this point the composite of 'cpu|idx'?
> + *
> + * fast : slow : unlock
> + * : :
> + * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
> + * : | ^--------. / :
> + * : v \ | :
> + * uncontended : (n,x) --+--> (n,0)...
2014 May 21
0
[RFC 08/07] qspinlock: integrate pending bit into queue
...nlock
* @lock: Pointer to queue spinlock structure
@@ -324,21 +381,21 @@ static inline int trylock_pending(struct qspinlock *lock, u32 *pval)
* fast : slow : unlock
* : :
* uncontended (0,0,0) -:--> (0,0,1) ------------------------------:--> (*,*,0)
- * : | ^--------.------. / :
- * : v \ \ | :
- * pending : (0,1,1) +--> (0,1,0) \ | :
- *...
2014 Jun 11
3
[PATCH v11 09/16] qspinlock, x86: Allow unfair spinlock in a virtual guest
On Fri, May 30, 2014 at 11:43:55AM -0400, Waiman Long wrote:
> Enabling this configuration feature causes a slight decrease in the
> performance of an uncontended lock-unlock operation by about 1-2%
> mainly due to the use of a static key. However, uncontended lock-unlock
> operations are really just a tiny percentage of a real workload. So
> there should be no noticeable change in application performance.
No, entirely unacceptable.
> +#ifdef CONFI...
2014 Jun 17
5
[PATCH 03/11] qspinlock: Add pending bit
...ock structure
> * @val: Current value of the queue spinlock 32-bit word
> *
> - * (queue tail, lock bit)
> - *
> - * fast : slow : unlock
> - * : :
> - * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
> - * : | ^--------. / :
> - * : v \ | :
> - * uncontended : (n,x) --+--> (n,0)...
2014 Jun 23
0
[PATCH 01/11] qspinlock: A simple generic 4-byte queue spinlock
...ue tail at this point the composite of 'cpu|idx'?
Yes, as per {en,de}code_tail() above.
> > + *
> > + * fast : slow : unlock
> > + * : :
> > + * uncontended (0,0) --:--> (0,1) --------------------------------:--> (*,0)
> > + * : | ^--------. / :
> > + * : v \ | :
> > + * uncontended : (n,x) --+--> (n,0)...
2014 Jun 12
2
[PATCH v11 09/16] qspinlock, x86: Allow unfair spinlock in a virtual guest
On Wed, Jun 11, 2014 at 09:37:55PM -0400, Long, Wai Man wrote:
>
> On 6/11/2014 6:54 AM, Peter Zijlstra wrote:
> >On Fri, May 30, 2014 at 11:43:55AM -0400, Waiman Long wrote:
> >>Enabling this configuration feature causes a slight decrease in the
> >>performance of an uncontended lock-unlock operation by about 1-2%
> >>mainly due to the use of a static key. However, uncontended lock-unlock
> >>operations are really just a tiny percentage of a real workload. So
> >>there should be no noticeable change in application performance.
> >No, entirely u...
2014 Jun 18
3
[PATCH 04/11] qspinlock: Extract out the exchange of tail code word
...ion with a
> single cmpxchg:
>
> - * 0,0,0 -> 0,0,1 ; trylock
> - * p,y,x -> n,y,x ; prev = xchg(lock, node)
>
> to first doing the trylock, then the xchg. If the trylock passes and the
> xchg returns prev=0,0,0, the next step of the algorithm goes to the
> locked/uncontended state
>
> + /*
> + * claim the lock:
> + *
> + * n,0 -> 0,1 : lock, uncontended
>
> Similar to your suggestion of patch 3, it's expected that the xchg will
> *not* return prev=0,0,0 after a failed trylock.
I do like your explanation. I hope that Peter will put i...
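The transition quoted above, p,y,x -> n,y,x, replaces only the queue tail and leaves the other two fields alone. One generic way to express that is a compare-and-swap loop, sketched below in C11. The function name `xchg_tail_sketch` and the 16-bit tail in the upper half of the word are assumptions for illustration; the patch under discussion may implement the exchange differently.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: tail in the upper 16 bits, pending/lock bits below it. */
#define TAIL_SHIFT 16u
#define TAIL_MASK  (0xffffu << TAIL_SHIFT)

/*
 * p,y,x -> n,y,x
 * Install new_tail as the queue tail, preserve the low (pending/lock) bits,
 * and return the previous tail value p.
 */
static uint32_t xchg_tail_sketch(_Atomic uint32_t *word, uint32_t new_tail)
{
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t next;

    do {
        /* Keep y,x from the current value; swap in the new tail n. */
        next = (old & ~TAIL_MASK) | ((new_tail << TAIL_SHIFT) & TAIL_MASK);
        /* On failure, old is refreshed with the current word and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(
                 word, &old, next,
                 memory_order_acq_rel, memory_order_relaxed));

    return (old & TAIL_MASK) >> TAIL_SHIFT;
}
```

A returned previous tail of 0 would mean the queue was empty when the new tail was published.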
2014 Feb 28
5
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
On Thu, Feb 27, 2014 at 03:42:19PM -0500, Waiman Long wrote:
> >>+ old = xchg(&qlock->lock_wait, _QSPINLOCK_WAITING|_QSPINLOCK_LOCKED);
> >>+
> >>+ if (old == 0) {
> >>+ /*
> >>+ * Got the lock, can clear the waiting bit now
> >>+ */
> >>+ smp_u8_store_release(&qlock->wait, 0);
> >
> >So we just did an