On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
> +/**
> + * trylock_pending - try to acquire queue spinlock using the pending bit
> + * @lock : Pointer to queue spinlock structure
> + * @pval : Pointer to value of the queue spinlock 32-bit word
> + * Return: 1 if lock acquired, 0 otherwise
> + */
> +static inline int trylock_pending(struct qspinlock *lock, u32 *pval)
> +{
> +        u32 old, new, val = *pval;

I'm not thrilled about you breaking this into a separate function; the
compiler will put it right back and now you get to have that ugly
pointer stuff.

It also makes the function control flow not match the state diagram
anymore.

> +
> +        /*
> +         * trylock || pending
> +         *
> +         * 0,0,0 -> 0,0,1 ; trylock
> +         * 0,0,1 -> 0,1,1 ; pending
> +         */
> +        for (;;) {
> +                /*
> +                 * If we observe any contention; queue.
> +                 */
> +                if (val & ~_Q_LOCKED_MASK)
> +                        return 0;
> +
> +                new = _Q_LOCKED_VAL;
> +                if (val == new)
> +                        new |= _Q_PENDING_VAL;
> +
> +                old = atomic_cmpxchg(&lock->val, val, new);
> +                if (old == val)
> +                        break;
> +
> +                *pval = val = old;
> +        }
> +
> +        /*
> +         * we won the trylock
> +         */
> +        if (new == _Q_LOCKED_VAL)
> +                return 1;
> +
> +        /*
> +         * we're pending, wait for the owner to go away.
> +         *
> +         * *,1,1 -> *,1,0
> +         */
> +        while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
> +                arch_mutex_cpu_relax();

That was a cpu_relax().

> +
> +        /*
> +         * take ownership and clear the pending bit.
> +         *
> +         * *,1,0 -> *,0,1
> +         */
> +        for (;;) {
> +                new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
> +
> +                old = atomic_cmpxchg(&lock->val, val, new);
> +                if (old == val)
> +                        break;
> +
> +                val = old;
> +        }
> +        return 1;
> +}
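(For illustration only: a minimal sketch, not taken from any posted patch, of
how the quoted pending-bit transitions read when kept inline in the slowpath,
which is roughly the shape the comment above is asking for. The *pval pointer
disappears and the control flow follows the 0,0,0 -> 0,0,1 -> 0,1,1 state
transitions directly. The function name, the stubbed queueing tail, and the
surrounding kernel context (struct qspinlock, the atomic_* helpers, the _Q_*
masks) are assumed from the patch series, not defined here.)

        /* Hypothetical inline form of the quoted pending-bit logic. */
        void queue_spin_lock_slowpath(struct qspinlock *lock, u32 val)
        {
                u32 old, new;

                /*
                 * trylock || pending
                 *
                 * 0,0,0 -> 0,0,1 ; trylock
                 * 0,0,1 -> 0,1,1 ; pending
                 */
                for (;;) {
                        if (val & ~_Q_LOCKED_MASK)      /* contention: queue */
                                goto queue;

                        new = _Q_LOCKED_VAL;
                        if (val == new)                 /* held, pending free */
                                new |= _Q_PENDING_VAL;

                        old = atomic_cmpxchg(&lock->val, val, new);
                        if (old == val)
                                break;
                        val = old;                      /* lost the race; retry */
                }

                if (new == _Q_LOCKED_VAL)               /* we won the trylock */
                        return;

                /* *,1,1 -> *,1,0 : wait for the owner to go away */
                while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
                        cpu_relax();

                /* *,1,0 -> *,0,1 : take ownership, clear the pending bit */
                for (;;) {
                        new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
                        old = atomic_cmpxchg(&lock->val, val, new);
                        if (old == val)
                                break;
                        val = old;
                }
                return;

        queue:
                /* MCS node queueing would continue here with the last seen val. */
                return;
        }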
On 04/17/2014 11:42 AM, Peter Zijlstra wrote:
> On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
>> +/**
>> + * trylock_pending - try to acquire queue spinlock using the pending bit
>> + * @lock : Pointer to queue spinlock structure
>> + * @pval : Pointer to value of the queue spinlock 32-bit word
>> + * Return: 1 if lock acquired, 0 otherwise
>> + */
>> +static inline int trylock_pending(struct qspinlock *lock, u32 *pval)
>> +{
>> +        u32 old, new, val = *pval;
> I'm not thrilled about you breaking this into a separate function; the
> compiler will put it right back and now you get to have that ugly
> pointer stuff.
>
> It also makes the function control flow not match the state diagram
> anymore.

I separated it out primarily to keep the pending-bit logic apart from the
core MCS queuing logic, so that each of them is easier to understand on its
own; they are largely independent. I fully understand that the compiler will
put them back together. As I pile on more code, the slowpath function will
grow bigger, making it harder to comprehend and to see where the boundary
between the two lies.

I will take a look at the state diagram to see what adjustment will be
needed.

>> +
>> +        /*
>> +         * trylock || pending
>> +         *
>> +         * 0,0,0 -> 0,0,1 ; trylock
>> +         * 0,0,1 -> 0,1,1 ; pending
>> +         */
>> +        for (;;) {
>> +                /*
>> +                 * If we observe any contention; queue.
>> +                 */
>> +                if (val & ~_Q_LOCKED_MASK)
>> +                        return 0;
>> +
>> +                new = _Q_LOCKED_VAL;
>> +                if (val == new)
>> +                        new |= _Q_PENDING_VAL;
>> +
>> +                old = atomic_cmpxchg(&lock->val, val, new);
>> +                if (old == val)
>> +                        break;
>> +
>> +                *pval = val = old;
>> +        }
>> +
>> +        /*
>> +         * we won the trylock
>> +         */
>> +        if (new == _Q_LOCKED_VAL)
>> +                return 1;
>> +
>> +        /*
>> +         * we're pending, wait for the owner to go away.
>> +         *
>> +         * *,1,1 -> *,1,0
>> +         */
>> +        while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
>> +                arch_mutex_cpu_relax();
> That was a cpu_relax().

Yes, but arch_mutex_cpu_relax() is the same as cpu_relax() for x86.

-Longman
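(For reference, and quoted from memory so the exact file and the s390 detail
should be treated as assumptions: the generic definition in
include/linux/mutex.h at the time simply fell back to cpu_relax() unless the
architecture overrode it, which only s390 did to avoid its
hypervisor-yielding cpu_relax() in tight spin loops.)

        /* Generic fallback, roughly as in include/linux/mutex.h circa early 2014: */
        #ifndef arch_mutex_cpu_relax
        # define arch_mutex_cpu_relax() cpu_relax()
        #endif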
On Thu, Apr 17, 2014 at 05:20:31PM -0400, Waiman Long wrote:
>>> +        while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
>>> +                arch_mutex_cpu_relax();
>>
>> That was a cpu_relax().
>
> Yes, but arch_mutex_cpu_relax() is the same as cpu_relax() for x86.

Yeah, so why bother typing more? Let the s390 people sort that if/when
they try and make this work for them.