Paolo Bonzini
2014-Mar-03 10:55 UTC
[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment
On 28/02/2014 18:06, Waiman Long wrote:
> On 02/26/2014 12:07 PM, Konrad Rzeszutek Wilk wrote:
>> On Wed, Feb 26, 2014 at 10:14:24AM -0500, Waiman Long wrote:
>>> Locking is always an issue in a virtualized environment as the virtual
>>> CPU that is waiting on a lock may get scheduled out and hence block
>>> any progress in lock acquisition even when the lock has been freed.
>>>
>>> One solution to this problem is to allow unfair lock in a
>>> para-virtualized environment. In this case, a new lock acquirer can
>>> come and steal the lock if the next-in-line CPU to get the lock is
>>> scheduled out. Unfair lock in a native environment is generally not a
>> Hmm, how do you know if the 'next-in-line CPU' is scheduled out? As
>> in the hypervisor knows - but you as a guest might have no idea
>> of it.
>
> I use a heart-beat counter to see if the other side responds within a
> certain time limit. If not, I assume it has been scheduled out, probably
> due to PLE.

PLE is unnecessary if you have "true" pv spinlocks where the
next-in-line schedules itself out with a hypercall (Xen) or hlt
instruction (KVM). Set a bit in the qspinlock before going to sleep,
and the lock owner will know that it needs to kick the next-in-line.

I think there is no need for the unfair lock bits. 1-2% is a pretty
large hit.

Paolo
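A minimal sketch of the park/kick handshake Paolo describes, assuming
hypothetical stand-ins for the real Xen/KVM primitives and flag names
(this is illustrative only, not the v5 patch code):

    /* Sketch only: _Q_WAITER_BIT, halt_until_kicked() and kick_cpu()
     * are hypothetical stand-ins for the real qspinlock flag, the KVM
     * hlt / Xen wait hypercall, and the "kick vCPU" hypercall. */
    #include <stdatomic.h>

    #define _Q_WAITER_BIT  (1U << 8)   /* "next-in-line has parked itself" */

    struct qspinlock { atomic_uint val; };

    static void halt_until_kicked(void) { /* hlt (KVM) or wait hypercall (Xen) */ }
    static void kick_cpu(int cpu)       { (void)cpu; /* e.g. a kick hypercall */ }

    /* Next-in-line: advertise that we are going to sleep, then halt. */
    static void pv_queue_wait(struct qspinlock *lock)
    {
        atomic_fetch_or(&lock->val, _Q_WAITER_BIT);
        /* Re-check after setting the bit so a release that slipped in
         * between does not leave us halted with nobody left to kick us. */
        if (atomic_load(&lock->val) & ~_Q_WAITER_BIT)
            halt_until_kicked();
    }

    /* Owner: on release, kick the parked waiter instead of relying on PLE. */
    static void pv_queue_unlock(struct qspinlock *lock, int next_cpu)
    {
        unsigned int old = atomic_exchange(&lock->val, 0);
        if (old & _Q_WAITER_BIT)
            kick_cpu(next_cpu);
    }

The re-check after setting the waiter bit is what makes the scheme work
without PLE: either the waiter sees the lock already free, or the owner
sees the bit and issues the kick.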
Waiman Long
2014-Mar-04 15:15 UTC
[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment
On 03/03/2014 05:55 AM, Paolo Bonzini wrote:
> On 28/02/2014 18:06, Waiman Long wrote:
>> On 02/26/2014 12:07 PM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Feb 26, 2014 at 10:14:24AM -0500, Waiman Long wrote:
>>>> Locking is always an issue in a virtualized environment as the virtual
>>>> CPU that is waiting on a lock may get scheduled out and hence block
>>>> any progress in lock acquisition even when the lock has been freed.
>>>>
>>>> One solution to this problem is to allow unfair lock in a
>>>> para-virtualized environment. In this case, a new lock acquirer can
>>>> come and steal the lock if the next-in-line CPU to get the lock is
>>>> scheduled out. Unfair lock in a native environment is generally not a
>>> Hmm, how do you know if the 'next-in-line CPU' is scheduled out? As
>>> in the hypervisor knows - but you as a guest might have no idea
>>> of it.
>>
>> I use a heart-beat counter to see if the other side responds within a
>> certain time limit. If not, I assume it has been scheduled out, probably
>> due to PLE.
>
> PLE is unnecessary if you have "true" pv spinlocks where the
> next-in-line schedules itself out with a hypercall (Xen) or hlt
> instruction (KVM). Set a bit in the qspinlock before going to sleep,
> and the lock owner will know that it needs to kick the next-in-line.
>
> I think there is no need for the unfair lock bits. 1-2% is a pretty
> large hit.
>
> Paolo

I don't think that PLE is something that can be controlled by software.
It is done in hardware. I may be wrong. Anyway, I plan to add code to
schedule out the CPUs waiting in the queue, except the first two, in
the next version of the patch.

The PV code in the v5 patch did seem to improve benchmark performance
with moderate to heavy spinlock contention. However, I didn't see much
CPU kicking going on. My theory is that the additional PV code
complicates the pause-loop timing so that hardware PLE didn't kick in,
whereas the original pause loop is pretty simple, causing PLE to
happen fairly frequently.

-Longman
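For reference, a rough sketch of the heart-beat idea mentioned above,
under the assumption that the spinning queue head bumps a per-node
counter and the next waiter treats a stalled counter as a sign of
preemption; the names and the timeout are illustrative and not taken
from the v5 patch:

    #include <stdatomic.h>
    #include <stdbool.h>

    #define HEARTBEAT_TIMEOUT 4096UL        /* arbitrary threshold for the sketch */

    struct qnode {
        atomic_ulong heartbeat;             /* bumped by the spinning queue head */
    };

    /* Queue head: spin on the lock word, advertising liveness each iteration. */
    static void head_spin(struct qnode *me, atomic_int *locked)
    {
        while (atomic_load(locked))
            atomic_fetch_add(&me->heartbeat, 1);   /* real code would also relax */
    }

    /* Next waiter: if the head's heartbeat stops advancing for too long,
     * assume its vCPU has been scheduled out and take the unfair path. */
    static bool head_looks_preempted(struct qnode *head)
    {
        unsigned long start = atomic_load(&head->heartbeat);

        for (unsigned long i = 0; i < HEARTBEAT_TIMEOUT; i++) {
            if (atomic_load(&head->heartbeat) != start)
                return false;               /* still making progress */
        }
        return true;                        /* no heartbeat: likely preempted */
    }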
Paolo Bonzini
2014-Mar-04 15:23 UTC
[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment
On 04/03/2014 16:15, Waiman Long wrote:
>>
>> PLE is unnecessary if you have "true" pv spinlocks where the
>> next-in-line schedules itself out with a hypercall (Xen) or hlt
>> instruction (KVM). Set a bit in the qspinlock before going to sleep,
>> and the lock owner will know that it needs to kick the next-in-line.
>>
>> I think there is no need for the unfair lock bits. 1-2% is a pretty
>> large hit.
>
> I don't think that PLE is something that can be controlled by software.
> It is done in hardware.

Yes, but the hypervisor decides *what* to do when the processor detects
a pause loop. My point, though, is that if you have pv spinlocks, the
processor will never or almost never do a pause-loop exit in the end.
PLE is mostly for legacy guests that don't have pv spinlocks.

Paolo

> I may be wrong. Anyway, I plan to add code to schedule out the CPUs
> waiting in the queue, except the first two, in the next version of
> the patch.
>
> The PV code in the v5 patch did seem to improve benchmark performance
> with moderate to heavy spinlock contention. However, I didn't see much
> CPU kicking going on. My theory is that the additional PV code
> complicates the pause-loop timing so that hardware PLE didn't kick in,
> whereas the original pause loop is pretty simple, causing PLE to
> happen fairly frequently.
David Vrabel
2014-Mar-04 15:39 UTC
[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment
On 04/03/14 15:15, Waiman Long wrote:
> On 03/03/2014 05:55 AM, Paolo Bonzini wrote:
>> On 28/02/2014 18:06, Waiman Long wrote:
>>> On 02/26/2014 12:07 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Feb 26, 2014 at 10:14:24AM -0500, Waiman Long wrote:
>>>>> Locking is always an issue in a virtualized environment as the virtual
>>>>> CPU that is waiting on a lock may get scheduled out and hence block
>>>>> any progress in lock acquisition even when the lock has been freed.
>>>>>
>>>>> One solution to this problem is to allow unfair lock in a
>>>>> para-virtualized environment. In this case, a new lock acquirer can
>>>>> come and steal the lock if the next-in-line CPU to get the lock is
>>>>> scheduled out. Unfair lock in a native environment is generally not a
>>>> Hmm, how do you know if the 'next-in-line CPU' is scheduled out? As
>>>> in the hypervisor knows - but you as a guest might have no idea
>>>> of it.
>>>
>>> I use a heart-beat counter to see if the other side responds within a
>>> certain time limit. If not, I assume it has been scheduled out, probably
>>> due to PLE.
>>
>> PLE is unnecessary if you have "true" pv spinlocks where the
>> next-in-line schedules itself out with a hypercall (Xen) or hlt
>> instruction (KVM). Set a bit in the qspinlock before going to sleep,
>> and the lock owner will know that it needs to kick the next-in-line.
>>
>> I think there is no need for the unfair lock bits. 1-2% is a pretty
>> large hit.
>>
>> Paolo
>
> I don't think that PLE is something that can be controlled by software.

You can avoid PLE by not issuing PAUSE instructions when spinning. You
may want to consider this if you have a lock that explicitly deschedules
the VCPU while waiting (or just deschedule before PLE would trigger).

> It is done in hardware. I may be wrong. Anyway, I plan to add code to
> schedule out the CPUs waiting in the queue, except the first two, in
> the next version of the patch.

I think you should deschedule all waiters.

> The PV code in the v5 patch did seem to improve benchmark performance
> with moderate to heavy spinlock contention.

The goal of PV-aware locks is to improve performance when locks are
contended /and/ VCPUs are over-committed. Is this something you're
actually measuring? It's not clear to me.

David
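One way to read David's suggestion, sketched with a hypothetical
pv_halt() stand-in and an arbitrary spin budget (not code from the
series): poll only briefly, then deschedule before the hardware would
ever see a pause loop.

    #include <stdatomic.h>

    #define SPIN_BUDGET 256                 /* keep well below the PLE window */

    static void pv_halt(void) { /* stand-in for an hlt/wait hypercall */ }

    /* Waiter: poll briefly, then deschedule instead of PAUSE-spinning
     * long enough for the hardware to detect a pause loop. */
    static void queue_wait(atomic_int *node_locked)
    {
        for (;;) {
            for (int i = 0; i < SPIN_BUDGET; i++) {
                if (!atomic_load(node_locked))
                    return;                 /* our turn: the node was released */
                /* a cpu_relax()/PAUSE could go here, but only a handful */
            }
            pv_halt();                      /* park before PLE would trigger */
        }
    }

Whether to park every waiter or keep the first couple spinning, as
Waiman plans, is exactly the trade-off under discussion here.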
Raghavendra K T
2014-Mar-04 17:50 UTC
[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment
> The PV code in the v5 patch did seem to improve benchmark performance
> with moderate to heavy spinlock contention. However, I didn't see much
> CPU kicking going on. My theory is that the additional PV code
> complicates the pause-loop timing so that hardware PLE didn't kick in,
> whereas the original pause loop is pretty simple, causing PLE to
> happen fairly frequently.

You could play with the ple_gap parameter to make it work for bigger
spin loops in such cases.