On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote:
> After more careful reading, I think the assumption that the presence of an
> unused bucket means there is no match is not true. Consider the scenario:
>
> 1. cpu 0 puts lock1 into hb[0]
> 2. cpu 1 puts lock2 into hb[1]
> 3. cpu 2 clears hb[0]
> 4. cpu 3 looks for lock2 and doesn't find it

Hmm, yes. The only way I can see that being true is if we assume entries
are never taken out again.

The wikipedia page could use some clarification here, this is not clear.

> At this point, I am thinking of going back to your previous idea of
> passing the queue head information down the queue.

Having to scan the entire array for a lookup sure sucks, but the wait
loops involved in the other idea can get us in the exact predicament we
were trying to get out of, because their forward progress depends on
other CPUs.

Hohumm.. time to think more I think ;-)
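
To make the failure mode concrete, here is a stand-alone toy (linear
probing; hb[], put() and get() are made-up names, not the kernel code):

	#define NB 4
	static void *hb[NB];		/* NULL == empty bucket */

	static int hash(void *key)
	{
		return ((unsigned long)key / 64) % NB;
	}

	static void put(void *key)	/* toy: assumes a free slot */
	{
		int i = hash(key);

		while (hb[i])
			i = (i + 1) % NB;
		hb[i] = key;
	}

	static void *get(void *key)
	{
		int i = hash(key);

		while (hb[i]) {		/* stops at first empty bucket */
			if (hb[i] == key)
				return hb[i];
			i = (i + 1) % NB;
		}
		return NULL;
	}

With hash(lock1) == hash(lock2) == 0: put(lock1) lands in hb[0],
put(lock2) probes to hb[1], and clearing hb[0] then makes get(lock2)
stop at the now-empty hb[0] and return NULL -- a false negative.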
On Wed, Apr 01, 2015 at 07:12:23PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote:
> > After more careful reading, I think the assumption that the presence of an
> > unused bucket means there is no match is not true. Consider the scenario:
> >
> > 1. cpu 0 puts lock1 into hb[0]
> > 2. cpu 1 puts lock2 into hb[1]
> > 3. cpu 2 clears hb[0]
> > 4. cpu 3 looks for lock2 and doesn't find it
>
> Hmm, yes. The only way I can see that being true is if we assume entries
> are never taken out again.
>
> The wikipedia page could use some clarification here, this is not clear.
>
> > At this point, I am thinking of going back to your previous idea of
> > passing the queue head information down the queue.
>
> Having to scan the entire array for a lookup sure sucks, but the wait
> loops involved in the other idea can get us in the exact predicament we
> were trying to get out of, because their forward progress depends on
> other CPUs.
>
> Hohumm.. time to think more I think ;-)

So bear with me, I've not really pondered this well so it could be full
of holes (again).

After the cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL) succeeds, the
spin_unlock() must do the hash lookup, right? We can make the lookup
unhash.

If the cmpxchg() fails, the unlock will not do the lookup and we must
unhash ourselves.
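
In untested pseudo-code, with made-up pv_*() helper names:

	/* waiter, once it is at the head of the queue
	 * (node == this cpu's queue node): */
	pv_hash_insert(l, node);	/* publish before flagging SLOW */
	if (cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL)
			!= _Q_LOCKED_VAL)
		pv_hash_remove(l);	/* unlock won't look us up */
	else
		pv_wait(&l->locked, _Q_SLOW_VAL);

	/* unlock: */
	if (cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) != _Q_LOCKED_VAL) {
		/* must be _Q_SLOW_VAL; the lookup also unhashes */
		node = pv_hash_remove(l);
		smp_store_release(&l->locked, 0);
		pv_kick(node->cpu);
	}

Either the unlock's lookup removes the entry or the waiter's failed
cmpxchg() does, so no stale entries linger in the table.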
On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote:
> > Hohumm.. time to think more I think ;-)
>
> So bear with me, I've not really pondered this well so it could be full
> of holes (again).
>
> After the cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL) succeeds, the
> spin_unlock() must do the hash lookup, right? We can make the lookup
> unhash.
>
> If the cmpxchg() fails, the unlock will not do the lookup and we must
> unhash ourselves.

The idea being that any lookup is then guaranteed to find an entry,
which bounds our worst case lookup cost by whatever the worst case
insertion cost was.
On 04/01/2015 01:12 PM, Peter Zijlstra wrote:
> On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote:
>> After more careful reading, I think the assumption that the presence of an
>> unused bucket means there is no match is not true. Consider the scenario:
>>
>> 1. cpu 0 puts lock1 into hb[0]
>> 2. cpu 1 puts lock2 into hb[1]
>> 3. cpu 2 clears hb[0]
>> 4. cpu 3 looks for lock2 and doesn't find it
> Hmm, yes. The only way I can see that being true is if we assume entries
> are never taken out again.
>
> The wikipedia page could use some clarification here, this is not clear.
>
>> At this point, I am thinking of going back to your previous idea of
>> passing the queue head information down the queue.
> Having to scan the entire array for a lookup sure sucks, but the wait
> loops involved in the other idea can get us in the exact predicament we
> were trying to get out of, because their forward progress depends on
> other CPUs.

For the waiting loop, the worst case is when a new CPU gets queued right
before we write the head value to the previous tail node. In that case,
the maximum number of retries is equal to the total number of CPUs minus
2, but that should rarely happen.

I did find a way to guarantee forward progress in a few steps:

1. Try the normal way once.
2. If that fails, save the next pointer and insert the head node at the
   tail once again.
3. After modifying the previous tail node, use cmpxchg to restore the
   previous tail. If that fails, wait until the next pointer is updated
   and write it out to the previous tail node.
4. Restore the next pointer and move forward.
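
Roughly, in untested pseudo-code (mcs_node, encode()/decode() and the
->head field are stand-in names):

	static void pass_head(struct qspinlock *lock, struct mcs_node *me)
	{
		struct mcs_node *saved_next, *next;
		u32 tail, old;

		/* 1) try the normal way once */
		tail = READ_ONCE(lock->tail);
		WRITE_ONCE(decode(tail)->head, me);
		if (READ_ONCE(lock->tail) == tail)
			return;		/* the tail did not move */

		/* 2) save the next pointer, queue myself as tail again */
		saved_next = me->next;
		old = xchg(&lock->tail, encode(me));

		/*
		 * 3) the old tail node is now stable; pass the head
		 *    value, then try to restore the previous tail
		 */
		WRITE_ONCE(decode(old)->head, me);
		if (cmpxchg(&lock->tail, encode(me), old) != encode(me)) {
			/* a new CPU queued behind me; link it past me */
			while (!(next = READ_ONCE(me->next)))
				cpu_relax();
			WRITE_ONCE(decode(old)->next, next);
		}

		/* 4) restore the next pointer and move forward */
		me->next = saved_next;
	}

Let me know if that looks reasonable to you.

-Longman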