On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
> Ah nice. That could be spun out as a separate patch to optimize the
> existing ticket locks I presume.

Yes, I suppose we can do something similar for the ticket lock and patch
in the right increment. We'd need to restructure the code a bit, but it's
not fundamentally impossible.

We could equally apply the head hashing to the current ticket
implementation and avoid the current bitmap iteration.

> Now with the old pv ticketlock code a vCPU would only go to sleep once
> and be woken up when it was its turn. With this new code it is woken up
> twice (and twice it goes to sleep). With an overcommit scenario this
> would imply that we will have at least twice as many VMEXITs as with
> the previous code.

An astute observation, I had not considered that.

> I presume when you did benchmarking this did not even register? Though
> I wonder if it would if you ran the benchmark for a week or so.

You presume I benchmarked :-) I managed to boot something virt and run
hackbench in it. I wouldn't know a representative virt setup if I ran
into it.

The thing is, we want this qspinlock for real hardware because it's
faster, and I really want to avoid having to carry two spinlock
implementations -- although I suppose that if we really really have to,
we could.
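[For illustration, a minimal sketch of the "head hashing" idea mentioned
above: hash the lock address into a small table recording which vCPU is
at the head of the wait queue, so the unlock slow path can kick that one
vCPU instead of iterating a bitmap of halted vCPUs. This is not the
posted patch; PV_HASH_BITS, pv_hash_head(), pv_kick_head() and pv_kick()
are hypothetical names, and hash collisions plus the usual
kick-before-halt race are deliberately elided.]

#include <linux/hash.h>
#include <linux/spinlock.h>

void pv_kick(int cpu);			/* assumed: hypervisor wakes that vCPU */

#define PV_HASH_BITS	8

struct pv_hash_entry {
	arch_spinlock_t	*lock;		/* lock the head waiter went to sleep on */
	int		cpu;		/* vCPU to kick on handover */
};

static struct pv_hash_entry pv_hash[1 << PV_HASH_BITS];

/* Head waiter: publish (lock, cpu) before halting. */
static void pv_hash_head(arch_spinlock_t *lock, int cpu)
{
	struct pv_hash_entry *he = &pv_hash[hash_ptr(lock, PV_HASH_BITS)];

	WRITE_ONCE(he->cpu, cpu);
	smp_store_release(&he->lock, lock);
}

/* Unlock slow path: kick only the head waiter of this lock. */
static void pv_kick_head(arch_spinlock_t *lock)
{
	struct pv_hash_entry *he = &pv_hash[hash_ptr(lock, PV_HASH_BITS)];

	if (smp_load_acquire(&he->lock) == lock) {
		WRITE_ONCE(he->lock, NULL);
		pv_kick(he->cpu);	/* one targeted wakeup, no bitmap walk */
	}
}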
On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
> > Ah nice. That could be spun out as a separate patch to optimize the
> > existing ticket locks I presume.
>
> Yes, I suppose we can do something similar for the ticket lock and
> patch in the right increment. We'd need to restructure the code a bit,
> but it's not fundamentally impossible.
>
> We could equally apply the head hashing to the current ticket
> implementation and avoid the current bitmap iteration.
>
> > Now with the old pv ticketlock code a vCPU would only go to sleep
> > once and be woken up when it was its turn. With this new code it is
> > woken up twice (and twice it goes to sleep). With an overcommit
> > scenario this would imply that we will have at least twice as many
> > VMEXITs as with the previous code.
>
> An astute observation, I had not considered that.

Thank you.

> > I presume when you did benchmarking this did not even register?
> > Though I wonder if it would if you ran the benchmark for a week or so.
>
> You presume I benchmarked :-) I managed to boot something virt and run
> hackbench in it. I wouldn't know a representative virt setup if I ran
> into it.
>
> The thing is, we want this qspinlock for real hardware because it's
> faster, and I really want to avoid having to carry two spinlock
> implementations -- although I suppose that if we really really have to,
> we could.

In some way you already have that - for virtualized environments where
you don't have a PV mechanism you just use the byte spinlock - which is
good.

And switching to a PV ticketlock implementation after boot... ugh. I feel
your pain.

What if you used a PV bytelock implementation? The code you posted
already 'sprays' all the vCPUs to wake up. And that is exactly what you
need for PV bytelocks - well, you only need to wake up the vCPUs that
have gone to sleep waiting on a specific 'struct spinlock', and just
stash those in a per-cpu area. The old Xen spinlock code (before 3.11?)
had this.

Just an idea though.
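[A hedged sketch of the per-cpu stash described above, in the spirit of
the old Xen "which lock is this vCPU spinning on" pointer rather than a
copy of that code. pv_wait()/pv_kick() stand in for the hypervisor block
and wake primitives, waiting_on and the function names are illustrative,
and the kick-before-halt race handling is omitted.]

#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <linux/cpumask.h>

void pv_wait(arch_spinlock_t *lock);	/* assumed: hypervisor blocks this vCPU */
void pv_kick(int cpu);			/* assumed: hypervisor wakes that vCPU */

static DEFINE_PER_CPU(arch_spinlock_t *, waiting_on);

/* Lock slow path: record which lock we are blocking on, then halt. */
static void pv_byte_lock_slow(arch_spinlock_t *lock)
{
	this_cpu_write(waiting_on, lock);
	smp_mb();			/* pairs with the unlock-side scan */

	while (!arch_spin_trylock(lock))
		pv_wait(lock);		/* block until kicked, then retry */

	this_cpu_write(waiting_on, NULL);
}

/* Unlock slow path: kick only the vCPUs parked on *this* lock,
 * found by scanning the per-cpu stash, not every halted vCPU.
 */
static void pv_byte_unlock_slow(arch_spinlock_t *lock)
{
	int cpu;

	for_each_online_cpu(cpu)
		if (READ_ONCE(per_cpu(waiting_on, cpu)) == lock)
			pv_kick(cpu);
}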
On 03/27/2015 10:07 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote:
>> On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
>>> Ah nice. That could be spun out as a separate patch to optimize the
>>> existing ticket locks I presume.
>> Yes, I suppose we can do something similar for the ticket lock and
>> patch in the right increment. We'd need to restructure the code a bit,
>> but it's not fundamentally impossible.
>>
>> We could equally apply the head hashing to the current ticket
>> implementation and avoid the current bitmap iteration.
>>
>>> Now with the old pv ticketlock code a vCPU would only go to sleep
>>> once and be woken up when it was its turn. With this new code it is
>>> woken up twice (and twice it goes to sleep). With an overcommit
>>> scenario this would imply that we will have at least twice as many
>>> VMEXITs as with the previous code.
>> An astute observation, I had not considered that.
> Thank you.
>>> I presume when you did benchmarking this did not even register?
>>> Though I wonder if it would if you ran the benchmark for a week or so.
>> You presume I benchmarked :-) I managed to boot something virt and run
>> hackbench in it. I wouldn't know a representative virt setup if I ran
>> into it.
>>
>> The thing is, we want this qspinlock for real hardware because it's
>> faster, and I really want to avoid having to carry two spinlock
>> implementations -- although I suppose that if we really really have
>> to, we could.
> In some way you already have that - for virtualized environments where
> you don't have a PV mechanism you just use the byte spinlock - which is
> good.
>
> And switching to a PV ticketlock implementation after boot... ugh. I
> feel your pain.
>
> What if you used a PV bytelock implementation? The code you posted
> already 'sprays' all the vCPUs to wake up. And that is exactly what you
> need for PV bytelocks - well, you only need to wake up the vCPUs that
> have gone to sleep waiting on a specific 'struct spinlock', and just
> stash those in a per-cpu area. The old Xen spinlock code (before 3.11?)
> had this.
>
> Just an idea though.

The current code should have woken up just one sleeping vCPU. We
shouldn't want to wake up all of them and have almost all except one go
back to sleep.

I think the PV bytelock you suggest is workable. It should also simplify
the implementation. It is just a matter of how much we value the fairness
attribute of the PV ticket or queue spinlock implementation that we have.

-Longman
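[To make the fairness trade-off concrete, a hedged sketch of the two
handover styles, not the posted code. The queued variant assumes a
per-waiter MCS-style node; the byte variant reuses the hypothetical
waiting_on/pv_kick names from the sketch above.]

enum pv_node_state { PV_RUNNING, PV_HALTED };

struct pv_node {
	struct pv_node		*next;	/* FIFO successor in the wait queue */
	int			cpu;
	enum pv_node_state	state;
};

/* Queued handover: wake exactly the known successor, FIFO order
 * preserved, at most one wakeup (and one VMEXIT) per handover.
 */
static void pv_handoff_queued(struct pv_node *node)
{
	struct pv_node *next = READ_ONCE(node->next);

	if (next && READ_ONCE(next->state) == PV_HALTED)
		pv_kick(next->cpu);
}

/* Byte-lock handover: wake everyone parked on the lock and let them
 * race for the byte; simpler, but unfair and potentially many wakeups.
 */
static void pv_handoff_byte(arch_spinlock_t *lock)
{
	int cpu;

	for_each_online_cpu(cpu)
		if (READ_ONCE(per_cpu(waiting_on, cpu)) == lock)
			pv_kick(cpu);
}

The first shape is what keeps the fairness of the ticket/queue lock; the
second is the simplification a PV bytelock buys, at the cost of that
fairness.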