thr3ads.net - Xen devel - Recursive locking in Xen (in reference to NMI/MCE path audit) [Dec 2012]

If this information is useful, please help other people find it:
Share via:

Andrew Cooper

2012-Dec-05 18:26 UTC

Recursive locking in Xen (in reference to NMI/MCE path audit)

Hello,

While auditing the NMI/MCE paths, I have encountered some issues with
recursive locking in Xen, discovered by the misuse of the console_lock
intermittently as a regular lock and as a recursive lock.

The comment in spinlock.h is unclear as to whether mixing recursive and
non recursive calls on the same spinlock is valid.  If the calls are
genuinely not valid, then surely regular spinlocks and recursive
spinlocks should be separate types to let the compiler work for us.

If mixing calls are valid, then there appear to be problems with nesting
recursive and regular calls, as either ordering of spin_lock and
spin_lock_recursive will deadlock.

As a result, I am wondering which of the above to fix?

There are very few users of recursive locks (domain lock, domain
page_alloc lock, mm (pod and paging) locks and console lock).  The
console and page_alloc locks appear to have mixed callers, while the
domain and mm locks appear to have strictly recursive callers.

It seems to me that either we need to make the two locks different
types, or use ASSERT()s to ensure we dont next spin_lock() and
spin_lock_recursive() calls.

In addition to the above problems, I find myself needing to implement
spin_lock_recursive_irq{,save,restore}() variants.  The implementations
themselves are not too hard to do, but I did wonder whether we might
want to have extra ASSERT()s to remove potential deadlock scenarios from
the NMI/MCE paths.   The ASSERT()s would have to be along the lines of
"assert this is exclusively a recursive lock" or "assert this is
a
per-cpu regular spinlock which is never referenced outside of this
specific NMI/MCE path".  The possible implementation of these
pseudo-asserts would differ depending on the outcome of the query.

Any other comment or queries?

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

Tim Deegan

2012-Dec-06 10:39 UTC

head link

Re: Recursive locking in Xen (in reference to NMI/MCE path audit)

At 18:26 +0000 on 05 Dec (1354731981), Andrew Cooper
wrote:> Hello,
> 
> While auditing the NMI/MCE paths, I have encountered some issues with
> recursive locking in Xen, discovered by the misuse of the console_lock
> intermittently as a regular lock and as a recursive lock.
> 
> The comment in spinlock.h is unclear as to whether mixing recursive and
> non recursive calls on the same spinlock is valid.  If the calls are
> genuinely not valid, then surely regular spinlocks and recursive
> spinlocks should be separate types to let the compiler work for us.
Seems like a good idea.  
> If mixing calls are valid, then there appear to be problems with nesting
> recursive and regular calls, as either ordering of spin_lock and
> spin_lock_recursive will deadlock.
Yes.  But paths that know they will not need to recurse can safely use
the non-recursive lock op.  The shadow code used to do this sort of
thing (with a better failure mode) to explicitly catch recursive paths
that weren''t intended. 
> As a result, I am wondering which of the above to fix?
> 
> There are very few users of recursive locks (domain lock, domain
> page_alloc lock, mm (pod and paging) locks and console lock).  The
> console and page_alloc locks appear to have mixed callers, while the
> domain and mm locks appear to have strictly recursive callers.
Please don''t touch the mm locks!  Any code that uses them in NMI or MCE
handlers is in a bad way already. :)
> It seems to me that either we need to make the two locks different
> types, or use ASSERT()s to ensure we dont next spin_lock() and
> spin_lock_recursive() calls.
I''d be happy with different types, unless there are cases where we care
about the speed of extra operations on the page_alloc or domain locks.
> In addition to the above problems, I find myself needing to implement
> spin_lock_recursive_irq{,save,restore}() variants.  The implementations
> themselves are not too hard to do, but I did wonder whether we might
> want to have extra ASSERT()s to remove potential deadlock scenarios from
> the NMI/MCE paths.   The ASSERT()s would have to be along the lines of
> "assert this is exclusively a recursive lock" or "assert
this is a
> per-cpu regular spinlock which is never referenced outside of this
> specific NMI/MCE path".  The possible implementation of these
> pseudo-asserts would differ depending on the outcome of the query.
Have you looked at the existing lock debugging that Keir put in to do
exactly this for normal vs IRQ?  Can it be trivially extended?

Tim.

Apparently Analagous Threads

Search for more possibly parallel threads

Xen devel - Dec 2012 - Recursive locking in Xen (in reference to NMI/MCE path audit)

Recursive locking in Xen (in reference to NMI/MCE path audit)

Re: Recursive locking in Xen (in reference to NMI/MCE path audit)

Apparently Analagous Threads