David Vrabel
2014-Mar-13 11:21 UTC
[PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
On 12/03/14 18:54, Waiman Long wrote:
> This patch adds para-virtualization support to the queue spinlock in
> the same way as was done in the PV ticket lock code. In essence, the
> lock waiters will spin for a specified number of times (QSPIN_THRESHOLD
> = 2^14) and then halt themselves. The queue head waiter will spin
> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
> QSPIN_THRESHOLD times, the queue head will assume that the lock
> holder may be scheduled out and attempt to kick the lock holder CPU
> if it has the CPU number on hand.

I don't really understand the reasoning for kicking the lock holder. It
will either be: running, runnable, or halted because it's in a slow-path
wait for another lock. In any of these states I do not see how a kick
is useful.

> Enabling the PV code does have a performance impact on spinlock
> acquisitions and releases. The following table shows the execution
> time (in ms) of a spinlock micro-benchmark that does lock/unlock
> operations 5M times for each task versus the number of contending
> tasks on a Westmere-EX system.
>
>   # of      Ticket lock            Queue lock
>   tasks  PV off/PV on/%Change   PV off/PV on/%Change
>   ------ --------------------   ---------------------
>     1      135/  179/+33%         137/  169/+23%
>     2     1045/ 1103/ +6%        1120/ 1536/+37%
>     3     1827/ 2683/+47%        2313/ 2425/ +5%
>     4     2689/ 4191/+56%        2914/ 3128/ +7%
>     5     3736/ 5830/+56%        3715/ 3762/ +1%
>     6     4942/ 7609/+54%        4504/ 4558/ +2%
>     7     6304/ 9570/+52%        5292/ 5351/ +1%
>     8     7736/11323/+46%        6037/ 6097/ +1%

Do you have measurements from tests when VCPUs are overcommitted?

> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
> +/**
> + * queue_spin_unlock_slowpath - kick up the CPU of the queue head
> + * @lock : Pointer to queue spinlock structure
> + *
> + * The lock is released after finding the queue head to avoid racing
> + * condition between the queue head and the lock holder.
> + */
> +void queue_spin_unlock_slowpath(struct qspinlock *lock)
> +{
> +	struct qnode *node, *prev;
> +	u32 qcode = (u32)queue_get_qcode(lock);
> +
> +	/*
> +	 * Get the queue tail node
> +	 */
> +	node = xlate_qcode(qcode);
> +
> +	/*
> +	 * Locate the queue head node by following the prev pointer from
> +	 * tail to head.
> +	 * It is assumed that the PV guests won't have that many CPUs so
> +	 * that it won't take a long time to follow the pointers.

This isn't a valid assumption, but this isn't that different from the
search done in the ticket slow unlock path, so I guess it's OK.

David
Paolo Bonzini
2014-Mar-13 13:57 UTC
[PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
On 13/03/2014 12:21, David Vrabel wrote:
> On 12/03/14 18:54, Waiman Long wrote:
>> This patch adds para-virtualization support to the queue spinlock in
>> the same way as was done in the PV ticket lock code. In essence, the
>> lock waiters will spin for a specified number of times (QSPIN_THRESHOLD
>> = 2^14) and then halt themselves. The queue head waiter will spin
>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>> holder may be scheduled out and attempt to kick the lock holder CPU
>> if it has the CPU number on hand.
>
> I don't really understand the reasoning for kicking the lock holder.

I agree. If the lock holder isn't running, there's probably a good
reason for that, and going to sleep will not necessarily convince the
scheduler to give more CPU time to the lock holder. I think there are
two choices:

1) Use yield_to to donate part of the waiter's quantum to the lock
   holder. For this we probably need a new, separate hypercall
   interface. For KVM it would be the same as hlt in the guest but
   with an additional yield_to in the host.

2) Do nothing; just go to sleep.

Could you get (or do you have) numbers for (2)?
More important, I think a barrier is missing:

	Lock holder
	---------------------------------------

	// queue_spin_unlock
	barrier();
	ACCESS_ONCE(qlock->lock) = 0;
	barrier();

	// pv_kick_node:
	if (pv->cpustate != PV_CPU_HALTED)
		return;
	ACCESS_ONCE(pv->cpustate) = PV_CPU_KICKED;
	__queue_kick_cpu(pv->mycpu, PV_KICK_QUEUE_HEAD);

	Waiter
	-------------------------------------------

	// pv_head_spin_check
	ACCESS_ONCE(pv->cpustate) = PV_CPU_HALTED;
	lockval = cmpxchg(&qlock->lock,
			  _QSPINLOCK_LOCKED,
			  _QSPINLOCK_LOCKED_SLOWPATH);
	if (lockval == 0) {
		/*
		 * Can exit now as the lock is free
		 */
		ACCESS_ONCE(pv->cpustate) = PV_CPU_ACTIVE;
		*count = 0;
		return;
	}
	__queue_hibernate();

Nothing protects against writing qlock->lock before pv->cpustate is
read, leading to this:

	Lock holder			Waiter
	---------------------------------------------------------------
	read pv->cpustate
	(it is PV_CPU_ACTIVE)
					pv->cpustate = PV_CPU_HALTED
					lockval = cmpxchg(...)
					hibernate()
	qlock->lock = 0
	if (pv->cpustate != PV_CPU_HALTED)
		return;

I think you need this:

-	if (pv->cpustate != PV_CPU_HALTED)
-		return;
-	ACCESS_ONCE(pv->cpustate) = PV_CPU_KICKED;
+	if (cmpxchg(&pv->cpustate, PV_CPU_HALTED, PV_CPU_KICKED)
+	    != PV_CPU_HALTED)
+		return;

Paolo
Waiman Long
2014-Mar-13 19:05 UTC
[PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
On 03/13/2014 07:21 AM, David Vrabel wrote:
> On 12/03/14 18:54, Waiman Long wrote:
>> This patch adds para-virtualization support to the queue spinlock in
>> the same way as was done in the PV ticket lock code. In essence, the
>> lock waiters will spin for a specified number of times (QSPIN_THRESHOLD
>> = 2^14) and then halt themselves. The queue head waiter will spin
>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>> holder may be scheduled out and attempt to kick the lock holder CPU
>> if it has the CPU number on hand.
>
> I don't really understand the reasoning for kicking the lock holder. It
> will either be: running, runnable, or halted because it's in a slow-path
> wait for another lock. In any of these states I do not see how a kick
> is useful.

You may be right. I can certainly take this part out of the patch if
people don't think it is useful.

>> Enabling the PV code does have a performance impact on spinlock
>> acquisitions and releases. The following table shows the execution
>> time (in ms) of a spinlock micro-benchmark that does lock/unlock
>> operations 5M times for each task versus the number of contending
>> tasks on a Westmere-EX system.
>>
>>   # of      Ticket lock            Queue lock
>>   tasks  PV off/PV on/%Change   PV off/PV on/%Change
>>   ------ --------------------   ---------------------
>>     1      135/  179/+33%         137/  169/+23%
>>     2     1045/ 1103/ +6%        1120/ 1536/+37%
>>     3     1827/ 2683/+47%        2313/ 2425/ +5%
>>     4     2689/ 4191/+56%        2914/ 3128/ +7%
>>     5     3736/ 5830/+56%        3715/ 3762/ +1%
>>     6     4942/ 7609/+54%        4504/ 4558/ +2%
>>     7     6304/ 9570/+52%        5292/ 5351/ +1%
>>     8     7736/11323/+46%        6037/ 6097/ +1%
>
> Do you have measurements from tests when VCPUs are overcommitted?

I don't have a measurement with overcommitted guests yet. I will set up
such an environment and do some tests on it.

>> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
>> +/**
>> + * queue_spin_unlock_slowpath - kick up the CPU of the queue head
>> + * @lock : Pointer to queue spinlock structure
>> + *
>> + * The lock is released after finding the queue head to avoid racing
>> + * condition between the queue head and the lock holder.
>> + */
>> +void queue_spin_unlock_slowpath(struct qspinlock *lock)
>> +{
>> +	struct qnode *node, *prev;
>> +	u32 qcode = (u32)queue_get_qcode(lock);
>> +
>> +	/*
>> +	 * Get the queue tail node
>> +	 */
>> +	node = xlate_qcode(qcode);
>> +
>> +	/*
>> +	 * Locate the queue head node by following the prev pointer from
>> +	 * tail to head.
>> +	 * It is assumed that the PV guests won't have that many CPUs so
>> +	 * that it won't take a long time to follow the pointers.
>
> This isn't a valid assumption, but this isn't that different from the
> search done in the ticket slow unlock path, so I guess it's OK.
>
> David

I will change that to say that in most cases the queue length will be
short.

-Longman
Waiman Long
2014-Mar-13 19:49 UTC
[PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support
On 03/13/2014 09:57 AM, Paolo Bonzini wrote:
> On 13/03/2014 12:21, David Vrabel wrote:
>> On 12/03/14 18:54, Waiman Long wrote:
>>> This patch adds para-virtualization support to the queue spinlock in
>>> the same way as was done in the PV ticket lock code. In essence, the
>>> lock waiters will spin for a specified number of times (QSPIN_THRESHOLD
>>> = 2^14) and then halt themselves. The queue head waiter will spin
>>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>>> holder may be scheduled out and attempt to kick the lock holder CPU
>>> if it has the CPU number on hand.
>>
>> I don't really understand the reasoning for kicking the lock holder.
>
> I agree. If the lock holder isn't running, there's probably a good
> reason for that, and going to sleep will not necessarily convince the
> scheduler to give more CPU time to the lock holder. I think there are
> two choices:
>
> 1) Use yield_to to donate part of the waiter's quantum to the lock
>    holder. For this we probably need a new, separate hypercall
>    interface. For KVM it would be the same as hlt in the guest but
>    with an additional yield_to in the host.
>
> 2) Do nothing; just go to sleep.
>
> Could you get (or do you have) numbers for (2)?

I will take out the lock holder kick portion from the patch. I will
also try to collect more test data.

> More important, I think a barrier is missing:
>
> 	Lock holder
> 	---------------------------------------
>
> 	// queue_spin_unlock
> 	barrier();
> 	ACCESS_ONCE(qlock->lock) = 0;
> 	barrier();

This is not the unlock code that is used when PV spinlock is enabled.
The actual unlock code is:

	if (static_key_false(&paravirt_spinlocks_enabled)) {
		/*
		 * Need to atomically clear the lock byte to avoid racing
		 * with the queue head waiter trying to set
		 * _QSPINLOCK_LOCKED_SLOWPATH.
		 */
		if (likely(cmpxchg(&qlock->lock, _QSPINLOCK_LOCKED, 0)
				== _QSPINLOCK_LOCKED))
			return;
		else
			queue_spin_unlock_slowpath(lock);
	} else {
		__queue_spin_unlock(lock);
	}

> 	// pv_kick_node:
> 	if (pv->cpustate != PV_CPU_HALTED)
> 		return;
> 	ACCESS_ONCE(pv->cpustate) = PV_CPU_KICKED;
> 	__queue_kick_cpu(pv->mycpu, PV_KICK_QUEUE_HEAD);
>
> 	Waiter
> 	-------------------------------------------
>
> 	// pv_head_spin_check
> 	ACCESS_ONCE(pv->cpustate) = PV_CPU_HALTED;
> 	lockval = cmpxchg(&qlock->lock,
> 			  _QSPINLOCK_LOCKED,
> 			  _QSPINLOCK_LOCKED_SLOWPATH);
> 	if (lockval == 0) {
> 		/*
> 		 * Can exit now as the lock is free
> 		 */
> 		ACCESS_ONCE(pv->cpustate) = PV_CPU_ACTIVE;
> 		*count = 0;
> 		return;
> 	}
> 	__queue_hibernate();
>
> Nothing protects against writing qlock->lock before pv->cpustate is
> read, leading to this:
>
> 	Lock holder			Waiter
> 	---------------------------------------------------------------
> 	read pv->cpustate
> 	(it is PV_CPU_ACTIVE)
> 					pv->cpustate = PV_CPU_HALTED
> 					lockval = cmpxchg(...)
> 					hibernate()
> 	qlock->lock = 0
> 	if (pv->cpustate != PV_CPU_HALTED)
> 		return;

The lock holder will read cpustate only if the lock byte has been
changed to _QSPINLOCK_LOCKED_SLOWPATH, so the setting of the lock byte
synchronizes the two threads. The only thing I am not certain about is
the case where the waiter is trying to go to sleep while, at the same
time, the lock holder is trying to kick it. Will there be a missed
wakeup because of this timing issue?

-Longman