thr3ads.net - search: "vcpu

[PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015 Apr 07

0

...state that is set by the new lock holder on
+ * the new queue head to indicate that _Q_SLOW_VAL is set and hash entry
+ * filled. With this state, the queue head CPU will always be kicked even
+ * if it is not halted to avoid potential racing condition.
+ */
 enum vcpu_state {
 	vcpu_running = 0,
 	vcpu_halted,
+	vcpu_hashed
 };

 struct pv_node {
@@ -97,7 +104,13 @@ static inline u32 hash_align(u32 hash)
 	return hash & ~(PV_HB_PER_LINE - 1);
 }

-static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node)
+/*
+ * Set up an entry in the lock hash table
+ * This is not inlined t...
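The hash_align() context line in this hunk rounds a hash value down to a multiple of PV_HB_PER_LINE by masking off the low bits. A minimal user-space sketch of that rounding follows. It assumes PV_HB_PER_LINE is a power of two; the value 4 and the demo_ names are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>

/* Hypothetical value; the excerpt does not show the real PV_HB_PER_LINE. */
#define DEMO_HB_PER_LINE 4u

/* The mask trick is only valid when the group size is a power of two. */
static_assert((DEMO_HB_PER_LINE & (DEMO_HB_PER_LINE - 1)) == 0,
	      "DEMO_HB_PER_LINE must be a power of two");

/* Round hash down to the index of the first bucket in its group. */
static unsigned int demo_hash_align(unsigned int hash)
{
	return hash & ~(DEMO_HB_PER_LINE - 1);
}
```

With a group size of 4, indices 12 through 15 all map to 12, so a probe that starts there touches one group of adjacent buckets first.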

[PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015 Apr 24

0

...{
 		if (READ_ONCE(node->locked))
 			return;
+		if (loop == MAYHALT_THRESHOLD)
+			xchg(&pn->mayhalt, true);
 		cpu_relax();
 	}

 	/*
-	 * Order pn->state vs pn->locked thusly:
+	 * Order pn->state/pn->mayhalt vs pn->locked thusly:
 	 *
-	 * [S] pn->state = vcpu_halted     [S] next->locked = 1
+	 * [S] pn->mayhalt = 1             [S] next->locked = 1
+	 *     MB, delay                       barrier()
+	 * [S] pn->state = vcpu_halted     [L] pn->mayhalt
 	 *     MB                              MB
 	 * [L] pn->locked                [RmW] pn->state = vcpu_hashed
 	 *
 	 * Match...

[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015 Apr 07

0

...s a suspended vcpu
+ *
+ * Using these we implement __pv_queue_spin_lock_slowpath() and
+ * __pv_queue_spin_unlock() to replace native_queue_spin_lock_slowpath() and
+ * native_queue_spin_unlock().
+ */
+
+#define _Q_SLOW_VAL (3U << _Q_LOCKED_OFFSET)
+
+enum vcpu_state {
+	vcpu_running = 0,
+	vcpu_halted,
+};
+
+struct pv_node {
+	struct mcs_spinlock mcs;
+	struct mcs_spinlock __res[3];
+
+	int cpu;
+	u8 state;
+};
+
+/*
+ * Hash table using open addressing with an LFSR probe sequence.
+ *
+ * Since we should not be holding locks from NMI context (very rare indeed) the
+ * max load factor is 0....

[PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015 Apr 09

2

...too sure about that name change..

> {
> 	struct pv_node *pn = (struct pv_node *)node;
> +	struct __qspinlock *l = (void *)lock;
>
> 	/*
> +	 * Transition CPU state: halted => hashed
> +	 * Quit if the transition failed.
> 	 */
> +	if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
> +		return;
> +
> +	/*
> +	 * Put the lock into the hash table & set the _Q_SLOW_VAL in the lock.
> +	 * As this is the same CPU that will check the _Q_SLOW_VAL value and
> +	 * the hash table later on at unlock time, no atomic instruction is
>...

[PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015 Apr 24

0

...s a suspended vcpu
+ *
+ * Using these we implement __pv_queue_spin_lock_slowpath() and
+ * __pv_queue_spin_unlock() to replace native_queue_spin_lock_slowpath() and
+ * native_queue_spin_unlock().
+ */
+
+#define _Q_SLOW_VAL (3U << _Q_LOCKED_OFFSET)
+
+enum vcpu_state {
+	vcpu_running = 0,
+	vcpu_halted,
+};
+
+struct pv_node {
+	struct mcs_spinlock mcs;
+	struct mcs_spinlock __res[3];
+
+	int cpu;
+	u8 state;
+};
+
+/*
+ * Lock and MCS node addresses hash table for fast lookup
+ *
+ * Hashing is done on a per-cacheline basis to minimize the need to access
+ * more than one cacheline.
+ *
+ *...

[PATCH 8/9] qspinlock: Generic paravirt support

2015 Mar 16

0

...s a suspended vcpu
+ *
+ * Using these we implement __pv_queue_spin_lock_slowpath() and
+ * __pv_queue_spin_unlock() to replace native_queue_spin_lock_slowpath() and
+ * native_queue_spin_unlock().
+ */
+
+#define _Q_SLOW_VAL (2U << _Q_LOCKED_OFFSET)
+
+enum vcpu_state {
+	vcpu_running = 0,
+	vcpu_halted,
+};
+
+struct pv_node {
+	struct mcs_spinlock mcs;
+	struct mcs_spinlock __res[3];
+
+	int cpu;
+	u8 state;
+};
+
+/*
+ * Initialize the PV part of the mcs_spinlock node.
+ */
+static void pv_init_node(struct mcs_spinlock *node)
+{
+	struct pv_node *pn = (struct pv_node *)node;
+
+	BUILD_BUG_O...

[PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015 May 04

1

...s a suspended vcpu
+ *
+ * Using these we implement __pv_queue_spin_lock_slowpath() and
+ * __pv_queue_spin_unlock() to replace native_queue_spin_lock_slowpath() and
+ * native_queue_spin_unlock().
+ */
+
+#define _Q_SLOW_VAL (3U << _Q_LOCKED_OFFSET)
+
+enum vcpu_state {
+	vcpu_running = 0,
+	vcpu_halted,
+};
+
+struct pv_node {
+	struct mcs_spinlock mcs;
+	struct mcs_spinlock __res[3];
+
+	int cpu;
+	u8 state;
+};
+
+/*
+ * Lock and MCS node addresses hash table for fast lookup
+ *
+ * Hashing is done on a per-cacheline basis to minimize the need to access
+ * more than one cacheline.
+ *
+ *...

[PATCH v16 00/14] qspinlock: a 4-byte queue spinlock with PV support

2015 Apr 24

16

v15->v16:
 - Remove the lfsr patch and use linear probing as lfsr is not really
   necessary in most cases.
 - Move the paravirt PV_CALLEE_SAVE_REGS_THUNK code to an asm header.
 - Add a patch to collect PV qspinlock statistics which also
   supersedes the PV lock hash debug patch.
 - Add PV qspinlock performance numbers.

v14->v15:
 - Incorporate PeterZ's v15 qspinlock patch and improve
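The v15->v16 changelog switches the PV lock hash table from an LFSR probe sequence to linear probing. As a rough illustration of what linear probing means for an open-addressing table keyed by lock address, here is a small single-threaded sketch. The table size, hash function, and all demo_ names are assumptions made for this example; the real patch also has to deal with concurrency, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 8u	/* hypothetical table size; must be a power of two */

struct demo_entry {
	const void *lock;	/* NULL marks an empty slot */
	int cpu;		/* value stored for that lock */
};

static struct demo_entry demo_table[DEMO_SLOTS];

/* Hypothetical hash: mix some pointer bits and mask to the table size. */
static unsigned int demo_hash(const void *lock)
{
	uintptr_t v = (uintptr_t)lock;

	return (unsigned int)((v >> 4) ^ (v >> 9)) & (DEMO_SLOTS - 1);
}

/* Insert with linear probing: step to the next slot until one is free. */
static int demo_insert(const void *lock, int cpu)
{
	unsigned int i, slot = demo_hash(lock);

	assert(lock != NULL);
	for (i = 0; i < DEMO_SLOTS; i++) {
		struct demo_entry *e = &demo_table[(slot + i) & (DEMO_SLOTS - 1)];

		if (e->lock == NULL) {
			e->lock = lock;
			e->cpu = cpu;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Lookup walks the same probe sequence; returns -1 if the lock is absent. */
static int demo_lookup(const void *lock)
{
	unsigned int i, slot = demo_hash(lock);

	for (i = 0; i < DEMO_SLOTS; i++) {
		const struct demo_entry *e = &demo_table[(slot + i) & (DEMO_SLOTS - 1)];

		if (e->lock == lock)
			return e->cpu;
		if (e->lock == NULL)
			break;	/* an empty slot ends the probe chain */
	}
	return -1;
}
```

Because successive probes visit adjacent slots, a lookup that collides tends to stay within nearby memory, which is one common argument for preferring linear probing over a scattered probe sequence.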

[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015 Apr 09

6

...lement __pv_queue_spin_lock_slowpath() and
> + * __pv_queue_spin_unlock() to replace native_queue_spin_lock_slowpath() and
> + * native_queue_spin_unlock().
> + */
> +
> +#define _Q_SLOW_VAL (3U << _Q_LOCKED_OFFSET)
> +
> +enum vcpu_state {
> +	vcpu_running = 0,
> +	vcpu_halted,
> +};
> +
> +struct pv_node {
> +	struct mcs_spinlock mcs;
> +	struct mcs_spinlock __res[3];
> +
> +	int cpu;
> +	u8 state;
> +};
> +
> +/*
> + * Hash table using open addressing with an LFSR probe sequence.
> + *
> + * Since we should not be holding l...

[PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015 Apr 29

4

On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote:
> In the pv_scan_next() function, the slow cmpxchg atomic operation is
> performed even if the other CPU is not even close to being halted. This
> extra cmpxchg can harm slowpath performance.
>
> This patch introduces the new mayhalt flag to indicate if the other
> spinning CPU is close to being halted or not. The

[PATCH v15 00/15] qspinlock: a 4-byte queue spinlock with PV support

2015 Apr 07

18

v14->v15:
 - Incorporate PeterZ's v15 qspinlock patch and improve upon the PV
   qspinlock code by dynamically allocating the hash table as well
   as some other performance optimization.
 - Simplified the Xen PV qspinlock code as suggested by David Vrabel
   <david.vrabel at citrix.com>.
 - Add benchmarking data for 3.19 kernel to compare the performance
   of a spinlock heavy test

[PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015 Apr 29

0

...f it goes wrong the borkage is subtle and painful.

I have to agree with Peter. But it goes beyond this particular patch.
Patterns like this:

       xchg(&pn->mayhalt, true);

are just evil and disgusting. Even before this patch, that code had

       (void)xchg(&pn->state, vcpu_halted);

which is *wrong* and should never be done.

If you want it to be "set_mb()" (which sets a value and has a memory barrier), then use set_mb(). Yes, it happens to use a "xchg()" to do so, but dammit, it documents that whole "this is a memory barrier" in the name. Als...

[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015 Apr 13

1

...&l->locked, 0);
> >Ah yes, clever that.
> >
> >>+	/*
> >>+	 * At this point the memory pointed at by lock can be freed/reused,
> >>+	 * however we can still use the PV node to kick the CPU.
> >>+	 */
> >>+	if (READ_ONCE(node->state) == vcpu_halted)
> >>+		pv_kick(node->cpu);
> >>+}
> >>+PV_CALLEE_SAVE_REGS_THUNK(__pv_queue_spin_unlock);
> >However I feel the PV_CALLEE_SAVE_REGS_THUNK thing belongs in the x86
> >code.
>
> That is why I originally put my version of the qspinlock_paravirt.h heade...

search for: vcpu_halted