Peter Zijlstra
2014-Apr-18 08:27 UTC
[PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
On Thu, Apr 17, 2014 at 05:46:27PM -0400, Waiman Long wrote:> On 04/17/2014 11:56 AM, Peter Zijlstra wrote: > >On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote: > >>+struct __qspinlock { > >>+ union { > >>+ atomic_t val;char bytes[4];> >>+ struct { > >>+#ifdef __LITTLE_ENDIAN > >>+ u16 locked_pending; > >>+ u16 tail; > >>+#else > >>+ u16 tail; > >>+ u16 locked_pending; > >>+#endif > >>+ };struct { #ifdef __LITTLE_ENDIAN u8 locked; #else u8 res[3]; u8 locked; #endif };> >>+ }; > >>+}; > >>+ > >>+/** > >>+ * clear_pending_set_locked - take ownership and clear the pending bit. > >>+ * @lock: Pointer to queue spinlock structure > >>+ * @val : Current value of the queue spinlock 32-bit word > >>+ * > >>+ * *,1,0 -> *,0,1 > >>+ */ > >>+static __always_inline void > >>+clear_pending_set_locked(struct qspinlock *lock, u32 val) > >>+{ > >>+ struct __qspinlock *l = (void *)lock; > >>+ > >>+ ACCESS_ONCE(l->locked_pending) = 1; > >You lost the __constant_le16_to_cpu(_Q_LOCKED_VAL) there. The > >unconditional 1 is wrong. You also have to flip the bytes in > >locked_pending. > > I don't think that is wrong. The lock byte is in the least significant 8 > bits and the pending byte is the next higher significant 8 bits irrespective > of the endian-ness. So a value of 1 in a 16-bit context means the lock byte > is set, but the pending byte is cleared. The name "locked_pending" doesn't > mean that locked variable is in a lower address than pending.val is LE bytes[0,1,2,3] BE [3,2,1,0] locked_pending is LE bytes[0,1] BE [1,0] locked LE bytes[0] BE [0] That does mean that the LSB of BE locked_pending is bytes[1]. So if you do BE: locked_pending = 1, you set bytes[1], not bytes[0].
Waiman Long
2014-Apr-18 17:52 UTC
[PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
On 04/18/2014 04:27 AM, Peter Zijlstra wrote:> On Thu, Apr 17, 2014 at 05:46:27PM -0400, Waiman Long wrote: >> On 04/17/2014 11:56 AM, Peter Zijlstra wrote: >>> On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote: >>>> +struct __qspinlock { >>>> + union { >>>> + atomic_t val; > char bytes[4]; > >>>> + struct { >>>> +#ifdef __LITTLE_ENDIAN >>>> + u16 locked_pending; >>>> + u16 tail; >>>> +#else >>>> + u16 tail; >>>> + u16 locked_pending; >>>> +#endif >>>> + }; > struct { > #ifdef __LITTLE_ENDIAN > u8 locked; > #else > u8 res[3]; > u8 locked; > #endif > }; > >>>> + }; >>>> +}; >>>> + >>>> +/** >>>> + * clear_pending_set_locked - take ownership and clear the pending bit. >>>> + * @lock: Pointer to queue spinlock structure >>>> + * @val : Current value of the queue spinlock 32-bit word >>>> + * >>>> + * *,1,0 -> *,0,1 >>>> + */ >>>> +static __always_inline void >>>> +clear_pending_set_locked(struct qspinlock *lock, u32 val) >>>> +{ >>>> + struct __qspinlock *l = (void *)lock; >>>> + >>>> + ACCESS_ONCE(l->locked_pending) = 1; >>> You lost the __constant_le16_to_cpu(_Q_LOCKED_VAL) there. The >>> unconditional 1 is wrong. You also have to flip the bytes in >>> locked_pending. >> I don't think that is wrong. The lock byte is in the least significant 8 >> bits and the pending byte is the next higher significant 8 bits irrespective >> of the endian-ness. So a value of 1 in a 16-bit context means the lock byte >> is set, but the pending byte is cleared. The name "locked_pending" doesn't >> mean that locked variable is in a lower address than pending. > val is LE bytes[0,1,2,3] BE [3,2,1,0] > locked_pending is LE bytes[0,1] BE [1,0] > locked LE bytes[0] BE [0] > > That does mean that the LSB of BE locked_pending is bytes[1]. > So if you do BE: locked_pending = 1, you set bytes[1], not bytes[0].I am confused by your notation. Anyway, my version of the byte location chart is: val is LE bytes[0,1,2,3] BE [0,1,2,3] locked_pending is LE bytes[0,1] BE [2,3] locked is LE bytes[0] BE [3] If we assign 1 to BE locked_pending, bytes[2] = 0 and bytes[3] = 1. Note that the LSB of the BE locked_pending is bytes[3]. Similarly, if we assign 1 to BE val, bytes[3] = 1 and all the other bytes will be 0. -Longman
Peter Zijlstra
2014-Apr-18 19:05 UTC
[PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
On Fri, Apr 18, 2014 at 01:52:50PM -0400, Waiman Long wrote:> I am confused by your notation.Nah, I think I was confused :-) Make the 1 _Q_LOCKED_VAL though, as that's the proper constant to use.
Maybe Matching Threads
- [PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
- [PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
- [PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
- [PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS
- [PATCH v9 05/19] qspinlock: Optimize for smaller NR_CPUS