Konrad Rzeszutek Wilk
2014-Mar-17 19:10 UTC
[PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote:> On 03/14/2014 04:30 AM, Peter Zijlstra wrote: > >On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote: > >>On 03/13/2014 11:15 AM, Peter Zijlstra wrote: > >>>On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote: > >>>>+static inline void arch_spin_lock(struct qspinlock *lock) > >>>>+{ > >>>>+ if (static_key_false(¶virt_unfairlocks_enabled)) > >>>>+ queue_spin_lock_unfair(lock); > >>>>+ else > >>>>+ queue_spin_lock(lock); > >>>>+} > >>>So I would have expected something like: > >>> > >>> if (static_key_false(¶virt_spinlock)) { > >>> while (!queue_spin_trylock(lock)) > >>> cpu_relax(); > >>> return; > >>> } > >>> > >>>At the top of queue_spin_lock_slowpath(). > >>I don't like the idea of constantly spinning on the lock. That can cause all > >>sort of performance issues. > >Its bloody virt; _that_ is a performance issue to begin with. > > > >Anybody half sane stops using virt (esp. if they care about > >performance). > > > >>My version of the unfair lock tries to grab the > >>lock ignoring if there are others waiting in the queue or not. So instead of > >>the doing a cmpxchg of the whole 32-bit word, I just do a cmpxchg of the > >>lock byte in the unfair version. A CPU has only one chance to steal the > >>lock. If it can't, it will be lined up in the queue just like the fair > >>version. It is not as unfair as the other unfair locking schemes that spins > >>on the lock repetitively. So lock starvation should be less a problem. > >> > >>On the other hand, it may not perform as well as the other unfair locking > >>schemes. It is a compromise to provide some lock unfairness without > >>sacrificing the good cacheline behavior of the queue spinlock. > >But but but,.. any kind of queueing gets you into a world of hurt with > >virt. > > > >The simple test-and-set lock (as per the above) still sucks due to lock > >holder preemption, but at least the suckage doesn't queue. Because with > >queueing you not only have to worry about the lock holder getting > >preemption, but also the waiter(s). > > > >Take the situation of 3 (v)CPUs where cpu0 holds the lock but is > >preempted. cpu1 queues, cpu2 queues. Then cpu1 gets preempted, after > >which cpu0 gets back online. > > > >The simple test-and-set lock will now let cpu2 acquire. Your queue > >however will just sit there spinning, waiting for cpu1 to come back from > >holiday. > > > >I think you're way over engineering this. Just do the simple > >test-and-set lock for virt&& !paravirt (as I think Paolo Bonzini > >suggested RHEL6 already does). > > The PV ticketlock code was designed to handle lock holder preemption > by redirecting CPU resources in a preempted guest to another guest > that can better use it and then return the preempted CPU back > sooner. > > Using a simple test-and-set lock will not allow us to enable this PV > spinlock functionality as there is no structure to decide who does > what. I can extend the current unfair lock code to allow thoseAnd what would be needed to do 'decide who does what'?> waiting in the queue to also attempt to steal the lock, though at a > lesser frequency so that the queue head has a higher chance of > getting the lock. This will solve the lock waiter preemption problem > that you worry about. This does make the code a bit more complex, > but it allow us to enable both the unfair lock and the PV spinlock > code together to solve the lock waiter and lock holder preemption > problems. > > -Longman >
Waiman Long
2014-Mar-19 03:11 UTC
[PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
On 03/17/2014 03:10 PM, Konrad Rzeszutek Wilk wrote:> On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote: >> On 03/14/2014 04:30 AM, Peter Zijlstra wrote: >>> On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote: >>>> On 03/13/2014 11:15 AM, Peter Zijlstra wrote: >>>>> On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote: >>>>>> +static inline void arch_spin_lock(struct qspinlock *lock) >>>>>> +{ >>>>>> + if (static_key_false(¶virt_unfairlocks_enabled)) >>>>>> + queue_spin_lock_unfair(lock); >>>>>> + else >>>>>> + queue_spin_lock(lock); >>>>>> +} >>>>> So I would have expected something like: >>>>> >>>>> if (static_key_false(¶virt_spinlock)) { >>>>> while (!queue_spin_trylock(lock)) >>>>> cpu_relax(); >>>>> return; >>>>> } >>>>> >>>>> At the top of queue_spin_lock_slowpath(). >>>> I don't like the idea of constantly spinning on the lock. That can cause all >>>> sort of performance issues. >>> Its bloody virt; _that_ is a performance issue to begin with. >>> >>> Anybody half sane stops using virt (esp. if they care about >>> performance). >>> >>>> My version of the unfair lock tries to grab the >>>> lock ignoring if there are others waiting in the queue or not. So instead of >>>> the doing a cmpxchg of the whole 32-bit word, I just do a cmpxchg of the >>>> lock byte in the unfair version. A CPU has only one chance to steal the >>>> lock. If it can't, it will be lined up in the queue just like the fair >>>> version. It is not as unfair as the other unfair locking schemes that spins >>>> on the lock repetitively. So lock starvation should be less a problem. >>>> >>>> On the other hand, it may not perform as well as the other unfair locking >>>> schemes. It is a compromise to provide some lock unfairness without >>>> sacrificing the good cacheline behavior of the queue spinlock. >>> But but but,.. any kind of queueing gets you into a world of hurt with >>> virt. >>> >>> The simple test-and-set lock (as per the above) still sucks due to lock >>> holder preemption, but at least the suckage doesn't queue. Because with >>> queueing you not only have to worry about the lock holder getting >>> preemption, but also the waiter(s). >>> >>> Take the situation of 3 (v)CPUs where cpu0 holds the lock but is >>> preempted. cpu1 queues, cpu2 queues. Then cpu1 gets preempted, after >>> which cpu0 gets back online. >>> >>> The simple test-and-set lock will now let cpu2 acquire. Your queue >>> however will just sit there spinning, waiting for cpu1 to come back from >>> holiday. >>> >>> I think you're way over engineering this. Just do the simple >>> test-and-set lock for virt&& !paravirt (as I think Paolo Bonzini >>> suggested RHEL6 already does). >> The PV ticketlock code was designed to handle lock holder preemption >> by redirecting CPU resources in a preempted guest to another guest >> that can better use it and then return the preempted CPU back >> sooner. >> >> Using a simple test-and-set lock will not allow us to enable this PV >> spinlock functionality as there is no structure to decide who does >> what. I can extend the current unfair lock code to allow those > And what would be needed to do 'decide who does what'? > >Sorry for not very clear in my previous mail. The current PV code will halt the CPUs if the lock isn't acquired in a certain time. When the locker holder come back, it will wake up the next CPU in the ticket lock queue. With a simple set and test lock, there is no structure to decide which one to wake up. So you still need to have some sort of queue to do that, you just can't wake up all of them and let them fight for the lock. -Longman
Konrad Rzeszutek Wilk
2014-Mar-19 15:25 UTC
[PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
On Tue, Mar 18, 2014 at 11:11:43PM -0400, Waiman Long wrote:> On 03/17/2014 03:10 PM, Konrad Rzeszutek Wilk wrote: > >On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote: > >>On 03/14/2014 04:30 AM, Peter Zijlstra wrote: > >>>On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote: > >>>>On 03/13/2014 11:15 AM, Peter Zijlstra wrote: > >>>>>On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote: > >>>>>>+static inline void arch_spin_lock(struct qspinlock *lock) > >>>>>>+{ > >>>>>>+ if (static_key_false(¶virt_unfairlocks_enabled)) > >>>>>>+ queue_spin_lock_unfair(lock); > >>>>>>+ else > >>>>>>+ queue_spin_lock(lock); > >>>>>>+} > >>>>>So I would have expected something like: > >>>>> > >>>>> if (static_key_false(¶virt_spinlock)) { > >>>>> while (!queue_spin_trylock(lock)) > >>>>> cpu_relax(); > >>>>> return; > >>>>> } > >>>>> > >>>>>At the top of queue_spin_lock_slowpath(). > >>>>I don't like the idea of constantly spinning on the lock. That can cause all > >>>>sort of performance issues. > >>>Its bloody virt; _that_ is a performance issue to begin with. > >>> > >>>Anybody half sane stops using virt (esp. if they care about > >>>performance). > >>> > >>>>My version of the unfair lock tries to grab the > >>>>lock ignoring if there are others waiting in the queue or not. So instead of > >>>>the doing a cmpxchg of the whole 32-bit word, I just do a cmpxchg of the > >>>>lock byte in the unfair version. A CPU has only one chance to steal the > >>>>lock. If it can't, it will be lined up in the queue just like the fair > >>>>version. It is not as unfair as the other unfair locking schemes that spins > >>>>on the lock repetitively. So lock starvation should be less a problem. > >>>> > >>>>On the other hand, it may not perform as well as the other unfair locking > >>>>schemes. It is a compromise to provide some lock unfairness without > >>>>sacrificing the good cacheline behavior of the queue spinlock. > >>>But but but,.. any kind of queueing gets you into a world of hurt with > >>>virt. > >>> > >>>The simple test-and-set lock (as per the above) still sucks due to lock > >>>holder preemption, but at least the suckage doesn't queue. Because with > >>>queueing you not only have to worry about the lock holder getting > >>>preemption, but also the waiter(s). > >>> > >>>Take the situation of 3 (v)CPUs where cpu0 holds the lock but is > >>>preempted. cpu1 queues, cpu2 queues. Then cpu1 gets preempted, after > >>>which cpu0 gets back online. > >>> > >>>The simple test-and-set lock will now let cpu2 acquire. Your queue > >>>however will just sit there spinning, waiting for cpu1 to come back from > >>>holiday. > >>> > >>>I think you're way over engineering this. Just do the simple > >>>test-and-set lock for virt&& !paravirt (as I think Paolo Bonzini > >>>suggested RHEL6 already does). > >>The PV ticketlock code was designed to handle lock holder preemption > >>by redirecting CPU resources in a preempted guest to another guest > >>that can better use it and then return the preempted CPU back > >>sooner. > >> > >>Using a simple test-and-set lock will not allow us to enable this PV > >>spinlock functionality as there is no structure to decide who does > >>what. I can extend the current unfair lock code to allow those > >And what would be needed to do 'decide who does what'? > > > > > > Sorry for not very clear in my previous mail. The current PV code > will halt the CPUs if the lock isn't acquired in a certain time. > When the locker holder come back, it will wake up the next CPU in > the ticket lock queue. With a simple set and test lock, there is no > structure to decide which one to wake up. So you still need to have > some sort of queue to do that, you just can't wake up all of them > and let them fight for the lock.Why not? That is what bytelocks did and it worked OK in the virtualization environment. The reason it was OK was because the hypervisor decided which VCPU to schedule and that one would get the lock. Granted it did have a per-cpu-what-lock-am-I-waiting-for structure to aid in this. In some form, the hypervisor serializes who is going to get the lock as it ultimately decides which of the VCPUs that are kicked to wake up - and if the lock is not FIFO - it doesn't matter which ones gets it. With the 'per-cpu-what-lock-I-am-waiting-for' you can also have a round-robin of which vCPU you had kick to keep it fair. You might want to take a look at the PV bytelock that existed in the Xen code prior to PV ticketlock (so 3.10) to see how it did that.
Maybe Matching Threads
- [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
- [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
- [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
- [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest
- [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest