Oleg Nesterov
2015-Feb-10 13:26 UTC
[PATCH] x86 spinlock: Fix memory corruption on completing completions
On 02/10, Raghavendra K T wrote:> > On 02/10/2015 06:23 AM, Linus Torvalds wrote: > >> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >> if (READ_ONCE(lock->tickets.tail) & TICKET_SLOWPATH_FLAG) .. >> >> into something like >> >> val = xadd((&lock->ticket.head_tail, TICKET_LOCK_INC << TICKET_SHIFT); >> if (unlikely(val & TICKET_SLOWPATH_FLAG)) ... >> >> would be the right thing to do. Somebody should just check that I got >> that shift right, and that the tail is in the high bytes (head really >> needs to be high to work, if it's in the low byte(s) the xadd would >> overflow from head into tail which would be wrong). > > Unfortunately xadd could result in head overflow as tail is high. > > The other option was repeated cmpxchg which is bad I believe. > Any suggestions?Stupid question... what if we simply move SLOWPATH from .tail to .head? In this case arch_spin_unlock() could do xadd(tickets.head) and check the result In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg the whole .head_tail. Plus obviously more boring changes. This needs a separate patch even _if_ this can work. BTW. If we move "clear slowpath" into "lock" path, then probably trylock should be changed too? Something like below, we just need to clear SLOWPATH before cmpxchg. Oleg. --- x/arch/x86/include/asm/spinlock.h +++ x/arch/x86/include/asm/spinlock.h @@ -109,7 +109,8 @@ static __always_inline int arch_spin_try if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG)) return 0; - new.head_tail = old.head_tail + (TICKET_LOCK_INC << TICKET_SHIFT); + new.tickets.head = old.tickets.head; + new.tickets.tail = (old.tickets.tail & ~TICKET_SLOWPATH_FLAG) + TICKET_LOCK_INC; /* cmpxchg is a full barrier, so nothing can move before it */ return cmpxchg(&lock->head_tail, old.head_tail, new.head_tail) == old.head_tail;
Jeremy Fitzhardinge
2015-Feb-11 01:18 UTC
[PATCH] x86 spinlock: Fix memory corruption on completing completions
On 02/10/2015 05:26 AM, Oleg Nesterov wrote:> On 02/10, Raghavendra K T wrote: >> On 02/10/2015 06:23 AM, Linus Torvalds wrote: >> >>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >>> if (READ_ONCE(lock->tickets.tail) & TICKET_SLOWPATH_FLAG) .. >>> >>> into something like >>> >>> val = xadd((&lock->ticket.head_tail, TICKET_LOCK_INC << TICKET_SHIFT); >>> if (unlikely(val & TICKET_SLOWPATH_FLAG)) ... >>> >>> would be the right thing to do. Somebody should just check that I got >>> that shift right, and that the tail is in the high bytes (head really >>> needs to be high to work, if it's in the low byte(s) the xadd would >>> overflow from head into tail which would be wrong). >> Unfortunately xadd could result in head overflow as tail is high. >> >> The other option was repeated cmpxchg which is bad I believe. >> Any suggestions? > Stupid question... what if we simply move SLOWPATH from .tail to .head? > In this case arch_spin_unlock() could do xadd(tickets.head) and check > the resultWell, right now, "tail" is manipulated by locked instructions by CPUs who are contending for the ticketlock, but head can be manipulated unlocked by the CPU which currently owns the ticketlock. If SLOWPATH moved into head, then non-owner CPUs would be touching head, requiring everyone to use locked instructions on it. That's the theory, but I don't see much (any?) code which depends on that. Ideally we could find a way so that pv ticketlocks could use a plain unlocked add for the unlock like the non-pv case, but I just don't see a way to do it.> In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg > the whole .head_tail. Plus obviously more boring changes. This needs a > separate patch even _if_ this can work.Definitely.> BTW. If we move "clear slowpath" into "lock" path, then probably trylock > should be changed too? Something like below, we just need to clear SLOWPATH > before cmpxchg.How important / widely used is trylock these days? J> > Oleg. > > --- x/arch/x86/include/asm/spinlock.h > +++ x/arch/x86/include/asm/spinlock.h > @@ -109,7 +109,8 @@ static __always_inline int arch_spin_try > if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG)) > return 0; > > - new.head_tail = old.head_tail + (TICKET_LOCK_INC << TICKET_SHIFT); > + new.tickets.head = old.tickets.head; > + new.tickets.tail = (old.tickets.tail & ~TICKET_SLOWPATH_FLAG) + TICKET_LOCK_INC; > > /* cmpxchg is a full barrier, so nothing can move before it */ > return cmpxchg(&lock->head_tail, old.head_tail, new.head_tail) == old.head_tail; >
Raghavendra K T
2015-Feb-11 11:08 UTC
[PATCH] x86 spinlock: Fix memory corruption on completing completions
On 02/10/2015 06:56 PM, Oleg Nesterov wrote:> On 02/10, Raghavendra K T wrote: >> >> On 02/10/2015 06:23 AM, Linus Torvalds wrote: >> >>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >>> if (READ_ONCE(lock->tickets.tail) & TICKET_SLOWPATH_FLAG) .. >>> >>> into something like >>> >>> val = xadd((&lock->ticket.head_tail, TICKET_LOCK_INC << TICKET_SHIFT); >>> if (unlikely(val & TICKET_SLOWPATH_FLAG)) ... >>> >>> would be the right thing to do. Somebody should just check that I got >>> that shift right, and that the tail is in the high bytes (head really >>> needs to be high to work, if it's in the low byte(s) the xadd would >>> overflow from head into tail which would be wrong). >> >> Unfortunately xadd could result in head overflow as tail is high. >> >> The other option was repeated cmpxchg which is bad I believe. >> Any suggestions? > > Stupid question... what if we simply move SLOWPATH from .tail to .head? > In this case arch_spin_unlock() could do xadd(tickets.head) and check > the resultIt is a good idea. Trying this now.> In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg > the whole .head_tail. Plus obviously more boring changes. This needs a > separate patch even _if_ this can work.Correct, but apart from this, before doing xadd in unlock, we would have to make sure lsb bit is cleared so that we can live with 1 bit overflow to tail which is unused. now either or both of head,tail lsb bit may be set after unlock.
Oleg Nesterov
2015-Feb-11 17:24 UTC
[PATCH] x86 spinlock: Fix memory corruption on completing completions
On 02/10, Jeremy Fitzhardinge wrote:> > On 02/10/2015 05:26 AM, Oleg Nesterov wrote: > > On 02/10, Raghavendra K T wrote: > >> Unfortunately xadd could result in head overflow as tail is high. > >> > >> The other option was repeated cmpxchg which is bad I believe. > >> Any suggestions? > > Stupid question... what if we simply move SLOWPATH from .tail to .head? > > In this case arch_spin_unlock() could do xadd(tickets.head) and check > > the result > > Well, right now, "tail" is manipulated by locked instructions by CPUs > who are contending for the ticketlock, but head can be manipulated > unlocked by the CPU which currently owns the ticketlock. If SLOWPATH > moved into head, then non-owner CPUs would be touching head, requiring > everyone to use locked instructions on it. > > That's the theory, but I don't see much (any?) code which depends on that. > > Ideally we could find a way so that pv ticketlocks could use a plain > unlocked add for the unlock like the non-pv case, but I just don't see a > way to do it.I agree, and I have to admit I am not sure I fully understand why unlock uses the locked add. Except we need a barrier to avoid the race with the enter_slowpath() users, of course. Perhaps this is the only reason? Anyway, I suggested this to avoid the overflow if we use xadd(), and I guess we need the locked insn anyway if we want to eliminate the unsafe read-after-unlock...> > BTW. If we move "clear slowpath" into "lock" path, then probably trylock > > should be changed too? Something like below, we just need to clear SLOWPATH > > before cmpxchg. > > How important / widely used is trylock these days?I am not saying this is that important. Just this looks more consistent imo and we can do this for free. Oleg.
Oleg Nesterov
2015-Feb-11 17:38 UTC
[PATCH] x86 spinlock: Fix memory corruption on completing completions
On 02/11, Raghavendra K T wrote:> > On 02/10/2015 06:56 PM, Oleg Nesterov wrote: > >> In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg >> the whole .head_tail. Plus obviously more boring changes. This needs a >> separate patch even _if_ this can work. > > Correct, but apart from this, before doing xadd in unlock, > we would have to make sure lsb bit is cleared so that we can live with 1 > bit overflow to tail which is unused. now either or both of head,tail > lsb bit may be set after unlock.Sorry, can't understand... could you spell? If TICKET_SLOWPATH_FLAG lives in .head arch_spin_unlock() could simply do head = xadd(&lock->tickets.head, TICKET_LOCK_INC); if (head & TICKET_SLOWPATH_FLAG) __ticket_unlock_kick(head); so it can't overflow to .tail? But probably I missed your concern. And we we do this, probably it makes sense to add something like bool tickets_equal(__ticket_t one, __ticket_t two) { return (one ^ two) & ~TICKET_SLOWPATH_FLAG; } and change kvm_lock_spinning() to use tickets_equal(tickets.head, want), plus it can have more users in asm/spinlock.h. Oleg.
Reasonably Related Threads
- [PATCH] x86 spinlock: Fix memory corruption on completing completions
- [PATCH] x86 spinlock: Fix memory corruption on completing completions
- [PATCH] x86 spinlock: Fix memory corruption on completing completions
- [PATCH V5] x86 spinlock: Fix memory corruption on completing completions
- [PATCH V5] x86 spinlock: Fix memory corruption on completing completions