Displaying 20 results from an estimated 90 matches for "lock_waiting".
2014 Mar 12
2
[PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks
...k->lock_wait) = _QSPINLOCK_LOCKED;
> + *
> + * It is not currently clear why this happens. A workaround
> + * is to use atomic instruction to store the new value.
> + */
> + {
> + u16 lw = xchg(&qlock->lock_wait, _QSPINLOCK_LOCKED);
> + BUG_ON(lw != _QSPINLOCK_WAITING);
> + }
> + return 1;
>
It was found that when I used a direct memory store instead of an atomic
op, the following kernel crash might happen at filesystem dismount time:
Red Hat Enterprise Linux Server 7.0 (Maipo)
Kernel 3.14.0-rc6-qlock on an x86_64
h11-kvm20 login: [ 1529.934047] B...
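The workaround quoted above swaps the waiting state for the locked state with an atomic exchange and checks the previous value. A minimal user-space sketch of that hand-over follows, assuming C11 atomics in place of the kernel's xchg() and assert() in place of BUG_ON(). The names sketch_claim and sketch_lock_wait_t are hypothetical, and the value used for the locked bit is an assumption, since the snippet does not show it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative values only: the waiting bit matches the patch (1 << 8);
 * the locked value of 1 is an assumption, not taken from the snippet. */
#define SKETCH_LOCKED  0x001u
#define SKETCH_WAITING 0x100u

/* Hypothetical stand-in for the 16-bit lock_wait field. */
typedef _Atomic(uint16_t) sketch_lock_wait_t;

/*
 * Hand the lock over to the waiter: atomically replace the waiting
 * state with the locked state and verify that the previous state was
 * exactly "waiting", as the BUG_ON() in the patch does.
 */
static int sketch_claim(sketch_lock_wait_t *lw)
{
    uint16_t old = atomic_exchange(lw, (uint16_t)SKETCH_LOCKED);

    assert(old == SKETCH_WAITING);
    return 1;
}
```

The sketch only mirrors the shape of the workaround. It says nothing about why the plain store misbehaved, which the patch comment itself leaves open.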
2014 Mar 12
0
[PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks
...tes.
+ * 2) The 2nd byte of the 32-bit lock word can be used as a pending bit
+ * for waiting lock acquirer so that it won't need to go through the
+ * MCS style locking queuing which has a higher overhead.
*/
+#define _QSPINLOCK_WAIT_SHIFT 8 /* Waiting bit position */
+#define _QSPINLOCK_WAITING (1 << _QSPINLOCK_WAIT_SHIFT)
+/* Masks for lock & wait bits */
+#define _QSPINLOCK_LWMASK (_QSPINLOCK_WAITING | _QSPINLOCK_LOCKED)
+
#define queue_encode_qcode(cpu, idx) (((cpu) + 1) << 2 | (idx))
+#define queue_get_qcode(lock) (atomic_read(&(lock)->qlcode) >> _QCODE...
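The macros above place a waiting bit in the second byte of the lock word and build a queue code from a CPU number and an index. The sketch below restates them as small C helpers so the bit arithmetic can be checked. All sketch_ names are hypothetical, the locked value of 1 is an assumption, and the position of the queue code inside the 32-bit word is deliberately left out because the snippet is truncated before it.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LOCKED     1u                       /* assumed value */
#define SKETCH_WAIT_SHIFT 8                        /* waiting bit position */
#define SKETCH_WAITING    (1u << SKETCH_WAIT_SHIFT)
#define SKETCH_LWMASK     (SKETCH_WAITING | SKETCH_LOCKED)

/* Same arithmetic as queue_encode_qcode(): (cpu + 1) << 2 | idx.
 * The +1 means the encoded value is nonzero for every CPU. */
static inline uint32_t sketch_encode_qcode(uint32_t cpu, uint32_t idx)
{
    return ((cpu + 1) << 2) | idx;
}

/* True when the waiting bit in the second byte is set. */
static inline int sketch_is_waiting(uint32_t word)
{
    return (word & SKETCH_WAITING) != 0;
}

/* Keep only the lock and wait bits of a lock word. */
static inline uint32_t sketch_lock_wait_bits(uint32_t word)
{
    return word & SKETCH_LWMASK;
}
```

For example, sketch_encode_qcode(0, 0) evaluates to 4 and sketch_encode_qcode(3, 1) to 17, so CPU 0 never encodes to zero.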
2015 Feb 16
1
[Xen-devel] [PATCH V5] x86 spinlock: Fix memory corruption on completing completions
...ts);
> + u8 old = READ_ONCE(zero_stats);
> if (unlikely(old)) {
> ret = cmpxchg(&zero_stats, old, 0);
> /* This ensures only one fellow resets the stat */
> @@ -112,6 +112,7 @@ __visible void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
> struct xen_lock_waiting *w = this_cpu_ptr(&lock_waiting);
> int cpu = smp_processor_id();
> u64 start;
> + __ticket_t head;
> unsigned long flags;
>
> /* If kicker interrupts not initialized yet, just spin */
> @@ -159,11 +160,15 @@ __visible void xen_lock_spinning(struct arch_spinlock *...
2014 Feb 26
2
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
..._qspinlock *qlock = (union arch_qspinlock *)lock;
> + u16 old;
> +
> + /*
> + * Fall into the quick spinning code path only if no one is waiting
> + * or the lock is available.
> + */
> + if (unlikely((qsval != _QSPINLOCK_LOCKED) &&
> + (qsval != _QSPINLOCK_WAITING)))
> + return 0;
> +
> + old = xchg(&qlock->lock_wait, _QSPINLOCK_WAITING|_QSPINLOCK_LOCKED);
> +
> + if (old == 0) {
> + /*
> + * Got the lock, can clear the waiting bit now
> + */
> + smp_u8_store_release(&qlock->wait, 0);
So we just did an atomic...
2014 Mar 13
0
[PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks
...K_LOCKED;
> >+ *
> >+ * It is not currently clear why this happens. A workaround
> >+ * is to use atomic instruction to store the new value.
> >+ */
> >+ {
> >+ u16 lw = xchg(&qlock->lock_wait, _QSPINLOCK_LOCKED);
> >+ BUG_ON(lw != _QSPINLOCK_WAITING);
> >+ }
> It was found that when I used a direct memory store instead of an atomic op,
> the following kernel crash might happen at filesystem dismount time:
>
> [ 1529.936714] Call Trace:
> [ 1529.936714] [<ffffffff811c2d03>] d_walk+0xc3/0x260
> [ 1529.936714] [...
2014 Feb 26
0
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
...t
+ * for waiting lock acquirer so that it won't need to go through the
+ * MCS style locking queuing which has a higher overhead.
+ * 2) The 16-bit queue code can be accessed or modified directly as a
+ * 16-bit short value without disturbing the first 2 bytes.
+ */
+#define _QSPINLOCK_WAITING 0x100U /* Waiting bit in 2nd byte */
+#define _QSPINLOCK_LWMASK 0xffff /* Mask for lock & wait bits */
+
+#define queue_encode_qcode(cpu, idx) (((cpu) + 1) << 2 | (idx))
+
+#define queue_spin_trylock_quick queue_spin_trylock_quick
+/**
+ * queue_spin_trylock_quick - fast spinning on the...
2014 Feb 27
0
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
...t
+ * for waiting lock acquirer so that it won't need to go through the
+ * MCS style locking queuing which has a higher overhead.
+ * 2) The 16-bit queue code can be accessed or modified directly as a
+ * 16-bit short value without disturbing the first 2 bytes.
+ */
+#define _QSPINLOCK_WAITING 0x100U /* Waiting bit in 2nd byte */
+#define _QSPINLOCK_LWMASK 0xffff /* Mask for lock & wait bits */
+
+#define queue_encode_qcode(cpu, idx) (((cpu) + 1) << 2 | (idx))
+
+#define queue_spin_trylock_quick queue_spin_trylock_quick
+/**
+ * queue_spin_trylock_quick - fast spinning on the...
2014 Feb 27
0
[PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support
...ut this may result in some ping ponging?
Actually, I think the qspinlock can work roughly the same as the
pvticketlock, using the same lock_spinning and unlock_lock hooks.
The x86-specific codepath can use bit 1 in the ->wait byte as "I have
halted, please kick me".
value = _QSPINLOCK_WAITING;
i = 0;
do
cpu_relax();
while (ACCESS_ONCE(slock->lock) && i++ < BUSY_WAIT);
if (ACCESS_ONCE(slock->lock)) {
value |= _QSPINLOCK_HALTED;
xchg(&slock->wait, value >> 8);
if (ACCESS_ONCE(slock->lock)) {
... call lock_spinning hook ...
}
}
/*
* Se...
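The pseudocode in this message spins for a bounded number of iterations, then publishes a "halted" flag in the wait byte and calls the lock_spinning hook only if the lock is still held. A compilable user-space sketch of that sequence follows, assuming C11 atomics in place of ACCESS_ONCE() and xchg(). The struct layout, the value of the halted bit (bit 1 of the wait byte, as the message suggests), the spin limit and the hook are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_WAITING   0x100u  /* waiting bit, second byte of the word */
#define SKETCH_HALTED    0x200u  /* assumed: bit 1 of the wait byte */
#define SKETCH_BUSY_WAIT 1000    /* assumed spin limit */

/* Hypothetical two-byte view of the lock: lock byte and wait byte. */
struct sketch_lock {
    _Atomic(uint8_t) lock;
    _Atomic(uint8_t) wait;
};

static int sketch_hook_calls;

/* Stand-in for the lock_spinning hook; it only counts invocations. */
static void sketch_lock_spinning(struct sketch_lock *l)
{
    (void)l;
    sketch_hook_calls++;
}

static void sketch_wait_for_lock(struct sketch_lock *l)
{
    unsigned int value = SKETCH_WAITING;
    int i = 0;

    /* Bounded busy-wait; the kernel would call cpu_relax() in the body. */
    do {
    } while (atomic_load(&l->lock) && i++ < SKETCH_BUSY_WAIT);

    if (atomic_load(&l->lock)) {
        /* Still held: advertise that this waiter is about to halt. */
        value |= SKETCH_HALTED;
        atomic_exchange(&l->wait, (uint8_t)(value >> 8));

        /* Re-check after publishing the flag, then call the hook. */
        if (atomic_load(&l->lock))
            sketch_lock_spinning(l);
    }
}
```

Re-reading the lock byte after the exchange is what keeps the waiter from halting when the holder released the lock just before the flag became visible.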
2014 Feb 27
0
[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks
..._qspinlock *)lock;
>> + u16 old;
>> +
>> + /*
>> + * Fall into the quick spinning code path only if no one is waiting
>> + * or the lock is available.
>> + */
>> + if (unlikely((qsval != _QSPINLOCK_LOCKED)&&
>> + (qsval != _QSPINLOCK_WAITING)))
>> + return 0;
>> +
>> + old = xchg(&qlock->lock_wait, _QSPINLOCK_WAITING|_QSPINLOCK_LOCKED);
>> +
>> + if (old == 0) {
>> + /*
>> + * Got the lock, can clear the waiting bit now
>> + */
>> + smp_u8_store_release(&qlock->...
2014 Feb 27
3
[PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support
On 02/27/2014 08:15 PM, Paolo Bonzini wrote:
[...]
>> But neither of the VCPUs being kicked here are halted -- they're either
>> running or runnable (descheduled by the hypervisor).
>
> /me actually looks at Waiman's code...
>
> Right, this is really different from pvticketlocks, where the *unlock*
> primitive wakes up a sleeping VCPU. It is more similar to PLE
2015 Feb 15
7
[PATCH V5] x86 spinlock: Fix memory corruption on completing completions
...ock->tickets.head);
+ if (__tickets_equal(head, want)) {
add_stats(TAKEN_SLOW_PICKUP, 1);
goto out;
}
@@ -803,8 +805,8 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
add_stats(RELEASED_SLOW, 1);
for_each_cpu(cpu, &waiting_cpus) {
const struct kvm_lock_waiting *w = &per_cpu(klock_waiting, cpu);
- if (ACCESS_ONCE(w->lock) == lock &&
- ACCESS_ONCE(w->want) == ticket) {
+ if (READ_ONCE(w->lock) == lock &&
+ READ_ONCE(w->want) == ticket) {
add_stats(RELEASED_SLOW_KICKED, 1);
kvm_kick_cpu(cpu);
break;
di...
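The second hunk above scans per-CPU waiting records and kicks the first CPU that is waiting on the released lock for the released ticket, reading each field exactly once. A minimal user-space sketch of that scan follows. The record type, the fixed CPU count and sketch_find_waiter are hypothetical, and a volatile-qualified pointer is used here as a rough stand-in for READ_ONCE(), not as its actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4  /* assumed, for illustration only */

/* Hypothetical per-CPU record of what a blocked CPU is waiting for. */
struct sketch_lock_waiting {
    const void *lock;  /* lock the CPU is blocked on, NULL if none */
    uint16_t want;     /* ticket the CPU is waiting for */
};

static struct sketch_lock_waiting sketch_waiting[SKETCH_NR_CPUS];

/*
 * Return the first CPU waiting on `lock` for `ticket`, or -1 if none.
 * The volatile pointer forces one real load per field access.
 */
static int sketch_find_waiter(const void *lock, uint16_t ticket)
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        const volatile struct sketch_lock_waiting *w = &sketch_waiting[cpu];

        if (w->lock == lock && w->want == ticket)
            return cpu;  /* the kernel code would kick this CPU */
    }
    return -1;
}
```

The loop stops at the first match, mirroring the break after kvm_kick_cpu() in the quoted code.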
2014 Mar 12
17
[PATCH v6 00/11] qspinlock: a 4-byte queue spinlock with PV support
v5->v6:
- Change the optimized 2-task contending code to make it fairer at the
expense of a bit of performance.
- Add a patch to support unfair queue spinlock for Xen.
- Modify the PV qspinlock code to follow what was done in the PV
ticketlock.
- Add performance data for the unfair lock as well as the PV
support code.
v4->v5:
- Move the optimized 2-task contending code to the
2013 Aug 06
16
[PATCH V12 0/14] Paravirtualized ticket spinlocks
This series replaces the existing paravirtualized spinlock mechanism
with a paravirtualized ticketlock mechanism. The series provides
implementation for both Xen and KVM.
The current set of patches is for Xen/x86 spinlock/KVM guest side, to be included
against -tip.
I'll be sending a separate patchset for KVM host based on kvm tree.
Please note I have added the below performance result