Chuck Anderson
2010-Dec-08 00:54 UTC
[Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
I'm posting this because I am writing a patch to fix a 2.6.32 based PV
Xen domU panic due to a nested call to arch/x86/include/asm/paravirt.h
arch_enter_lazy_mmu_mode() (see details below). The following BUG_ON()
was triggered:

arch/x86/kernel/paravirt.c

static inline void enter_lazy(enum paravirt_lazy_mode mode)
{
        BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);

        percpu_write(paravirt_lazy_mode, mode);
}

because enter_lazy() was called twice, once through mm/memory.c
copy_pte_range() and a second time through an interrupt path.

The easy fix is to disable interrupts in copy_pte_range() before calling
arch_enter_lazy_mmu_mode() and re-enable them after the call to
arch_leave_lazy_mmu_mode() but I'm asking if there is a better way to
handle this. If disabling interrupts is best, there are other calls to
arch_enter_lazy_mmu_mode() that appear to have the same interruption
issue. It may be best then to disable interrupts in
arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().

Here is how the nested call to arch_enter_lazy_mmu_mode() was made. The
first call path is:

    do_fork()
    copy_process()
    dup_mm()
    dup_mmap()
    copy_page_range()
    copy_pud_range()
    copy_pmd_range()
    copy_pte_range()
    arch_enter_lazy_mmu_mode()
    paravirt_enter_lazy_mmu()
    enter_lazy()

We bubble back up to mm/memory.c copy_pte_range(). The guest is
interrupted in that function. Here is the edited interrupt call stack
that gets us to arch_enter_lazy_mmu_mode() for the second time without
an intervening arch_leave_lazy_mmu_mode(), triggering the BUG_ON() in
enter_lazy():

    xen_evtchn_do_upcall()
    handle_irq()
    blkif_interrupt()
    do_blkif_request()
    blkif_queue_request()
    gnttab_alloc_grant_references()
    get_free_entries()
    gnttab_expand()
    gnttab_map()
    arch_gnttab_map_shared()
    apply_to_page_range(... map_pte_fn ...)

We get to enter_lazy() downstream from apply_to_page_range():

    apply_to_page_range(... map_pte_fn ...)
    apply_to_pud_range(... map_pte_fn ...)
    apply_to_pmd_range(... map_pte_fn ...)
    apply_to_pte_range(... map_pte_fn ...)
    arch_enter_lazy_mmu_mode()
    paravirt_enter_lazy_mmu()
    enter_lazy()

The spin locks acquired indirectly through mm/memory.c copy_pte_range()
are obtained with spin_lock() and spin_acquire() which I believe do not
disable interrupts.

Thanks,
Chuck

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
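The failure described in this message can be reproduced in a small userspace model. The sketch below is not kernel code: it assumes a single CPU, uses a plain global in place of the per-CPU variable, and replaces BUG_ON() with a hypothetical model_bug_on() helper that counts violations instead of panicking. The names model_irq_path() and model_fork_path_interrupted() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one "CPU", a plain global instead of per-CPU data. */
enum paravirt_lazy_mode { PARAVIRT_LAZY_NONE, PARAVIRT_LAZY_MMU };

static enum paravirt_lazy_mode lazy_mode = PARAVIRT_LAZY_NONE;
static int bug_hits; /* counts would-be BUG_ON() firings */

/* Hypothetical stand-in for BUG_ON(): record the violation, keep running. */
static void model_bug_on(bool cond)
{
    if (cond)
        bug_hits++;
}

static void enter_lazy(enum paravirt_lazy_mode mode)
{
    model_bug_on(lazy_mode != PARAVIRT_LAZY_NONE);
    lazy_mode = mode;
}

static void leave_lazy(enum paravirt_lazy_mode mode)
{
    model_bug_on(lazy_mode != mode);
    lazy_mode = PARAVIRT_LAZY_NONE;
}

/* Models the interrupt path: apply_to_pte_range() enters and leaves lazy mode. */
static void model_irq_path(void)
{
    enter_lazy(PARAVIRT_LAZY_MMU);
    leave_lazy(PARAVIRT_LAZY_MMU);
}

/* Models copy_pte_range() being interrupted between enter and leave.
 * Returns how many BUG_ON() checks would have fired. */
static int model_fork_path_interrupted(void)
{
    bug_hits = 0;
    lazy_mode = PARAVIRT_LAZY_NONE;

    enter_lazy(PARAVIRT_LAZY_MMU);  /* process context */
    model_irq_path();               /* interrupt arrives here */

    /* The nested leave has already reset the mode to NONE. */
    assert(lazy_mode == PARAVIRT_LAZY_NONE);

    leave_lazy(PARAVIRT_LAZY_MMU);  /* back in process context */
    return bug_hits;
}
```

In this model two checks fire: the nested enter_lazy() sees the mode already set, and the outer leave_lazy() later sees a mode it did not expect. In the real kernel the first BUG_ON() panics the guest, so the second check is never reached.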
Jan Beulich
2010-Dec-08 08:48 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
>>> On 08.12.10 at 01:54, Chuck Anderson <chuck.anderson@oracle.com> wrote:
> I'm posting this because I am writing a patch to fix a 2.6.32 based PV
> Xen domU panic due to a nested call to arch/x86/include/asm/paravirt.h
> arch_enter_lazy_mmu_mode() (see details below). The following BUG_ON()
> was triggered:
>
> arch/x86/kernel/paravirt.c
>
> static inline void enter_lazy(enum paravirt_lazy_mode mode)
> {
>         BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
>
>         percpu_write(paravirt_lazy_mode, mode);
> }
>
> because enter_lazy() was called twice, once through mm/memory.c
> copy_pte_range() and a second time through an interrupt path.
>
> The easy fix is to disable interrupts in copy_pte_range() before calling
> arch_enter_lazy_mmu_mode() and re-enable them after the call to
> arch_leave_lazy_mmu_mode() but I'm asking if there is a better way to
> handle this. If disabling interrupts is best, there are other calls to
> arch_enter_lazy_mmu_mode() that appear to have the same interruption
> issue. It may be best then to disable interrupts in
> arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().

I don't think this is an option, as the period of time for which you
would disable interrupts could be pretty much unbounded.

Instead (being a performance optimization only anyway)
the BUG_ON() could be removed (accepting that the
interrupted sequence would not batch any further
hypercalls, and provided all of this stuff can actually be
used in a nested way), the flag could be converted to a
counter (again provided nesting is okay here in the first
place), or a filter could be applied when actually checking
whether to batch (which is what we do in our non-pvops
kernels: in IRQ context, no batching happens).

Jan
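The difference between the first two options in this reply (drop the BUG_ON() but keep the flag, versus convert the flag to a counter) can be seen in a small userspace model. This is only a sketch under assumptions: every name below (flag_enter(), count_enter(), and so on) is hypothetical, a single global stands in for the per-CPU state, and nothing here is taken from a real kernel patch.

```c
#include <assert.h>

/* Option 1 (model): keep the flag, drop the BUG_ON(). */
static int flag_lazy;

static void flag_enter(void) { flag_lazy = 1; }
static void flag_leave(void) { flag_lazy = 0; }

/* Option 2 (model): replace the flag with a nesting counter. */
static int lazy_depth;

static void count_enter(void) { lazy_depth++; }

static void count_leave(void)
{
    assert(lazy_depth > 0); /* a leave without a matching enter is a bug */
    lazy_depth--;
}

/* Is the outer (interrupted) sequence still in lazy mode after a nested
 * enter/leave pair has run? Flag variant. */
static int flag_outer_still_lazy(void)
{
    int still;

    flag_lazy = 0;
    flag_enter();   /* outer sequence starts batching */
    flag_enter();   /* nested enter from the interrupt path */
    flag_leave();   /* nested leave clears the one shared flag */
    still = flag_lazy;
    flag_leave();   /* outer leave */
    return still;
}

/* Same question, counter variant. */
static int counter_outer_still_lazy(void)
{
    int still;

    lazy_depth = 0;
    count_enter();  /* outer sequence starts batching */
    count_enter();  /* nested enter */
    count_leave();  /* nested leave only drops one level */
    still = lazy_depth > 0;
    count_leave();  /* outer leave */
    return still;
}
```

With the flag, the nested leave switches lazy mode off for the interrupted sequence as well, which matches the caveat that it "would not batch any further hypercalls". With the counter, the outer sequence stays in lazy mode until its own leave.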
Jeremy Fitzhardinge
2010-Dec-08 21:21 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
On 12/08/2010 12:48 AM, Jan Beulich wrote:
>>>> On 08.12.10 at 01:54, Chuck Anderson <chuck.anderson@oracle.com> wrote:
>> I'm posting this because I am writing a patch to fix a 2.6.32 based PV
>> Xen domU panic due to a nested call to arch/x86/include/asm/paravirt.h
>> arch_enter_lazy_mmu_mode() (see details below). The following BUG_ON()
>> was triggered:
>>
>> arch/x86/kernel/paravirt.c
>>
>> static inline void enter_lazy(enum paravirt_lazy_mode mode)
>> {
>>         BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
>>
>>         percpu_write(paravirt_lazy_mode, mode);
>> }
>>
>> because enter_lazy() was called twice, once through mm/memory.c
>> copy_pte_range() and a second time through an interrupt path.
>>
>> The easy fix is to disable interrupts in copy_pte_range() before calling
>> arch_enter_lazy_mmu_mode() and re-enable them after the call to
>> arch_leave_lazy_mmu_mode() but I'm asking if there is a better way to
>> handle this. If disabling interrupts is best, there are other calls to
>> arch_enter_lazy_mmu_mode() that appear to have the same interruption
>> issue. It may be best then to disable interrupts in
>> arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().
> I don't think this is an option, as the period of time for which you
> would disable interrupts could be pretty much unbounded.
>
> Instead (being a performance optimization only anyway)
> the BUG_ON() could be removed (accepting that the
> interrupted sequence would not batch any further
> hypercalls, and provided all of this stuff can actually be
> used in a nested way), the flag could be converted to a
> counter (again provided nesting is okay here in the first
> place), or a filter could be applied when actually checking
> whether to batch (which is what we do in our non-pvops
> kernels: in IRQ context, no batching happens).

That's what happens in pvops kernels too - batching is disabled in
interrupt context so that (for example) vmalloc pagefault pte updates
aren't deferred.

Looks like enter/leave lazy should just be no-op in interrupt context
too. Though I'm surprised it has taken so long for this to appear.

    J
Jeremy Fitzhardinge
2010-Dec-08 22:28 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
On 12/07/2010 04:54 PM, Chuck Anderson wrote:
> The easy fix is to disable interrupts in copy_pte_range() before
> calling arch_enter_lazy_mmu_mode() and re-enable them after the call
> to arch_leave_lazy_mmu_mode() but I'm asking if there is a better way
> to handle this. If disabling interrupts is best, there are other
> calls to arch_enter_lazy_mmu_mode() that appear to have the same
> interruption issue. It may be best then to disable interrupts in
> arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().

Disabling interrupts would cause too much latency. I think we may have
done this at one point, but it is very antisocial.

Since lazy mode is effectively disabled in interrupt handlers anyway, it
should just be enough to ignore enter/leave requests. Does this work
for you?

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date: Wed, 8 Dec 2010 14:21:16 -0800
Subject: [PATCH] x86/paravirt: don't enter/leave lazy mode in interrupts.

We already ignore the current state of lazy mode in interrupts, but we
should also ignore any attempt to enter/leave lazy mode within
an interrupt context.

enter_lazy() will BUG if it sees an attempt at a nested entry to lazy
mode, which is generally an error. However, it's possible that an
interrupt handler may do something that would trigger a batched MMU
update, for example, and that could interrupt an existing batched update.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reported-by: Chuck Anderson <chuck.anderson@oracle.com>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Stable Kernel <stable@kernel.org>

diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c5b2500..a2ad10d 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -231,6 +231,9 @@ static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LA

 static inline void enter_lazy(enum paravirt_lazy_mode mode)
 {
+	if (in_interrupt())
+		return;
+
 	BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);

 	percpu_write(paravirt_lazy_mode, mode);
@@ -238,6 +241,9 @@ static inline void enter_lazy(enum paravirt_lazy_mode mode)

 static void leave_lazy(enum paravirt_lazy_mode mode)
 {
+	if (in_interrupt())
+		return;
+
 	BUG_ON(percpu_read(paravirt_lazy_mode) != mode);

 	percpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);

Thanks,
    J
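The effect of the guard in this patch can be checked with a userspace model of the same interrupted sequence. This is a sketch, not the kernel implementation: it assumes one CPU, a hypothetical irq_depth variable and model_in_interrupt() helper in place of the kernel's in_interrupt(), and a counting model_bug_on() in place of BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; the names below are stand-ins, not kernel symbols. */
enum lazy_mode { LAZY_NONE, LAZY_MMU };

static enum lazy_mode lazy_mode = LAZY_NONE;
static int irq_depth;  /* hypothetical: nonzero while "handling an interrupt" */
static int bug_hits;   /* counts would-be BUG_ON() firings */

static bool model_in_interrupt(void) { return irq_depth > 0; }

static void model_bug_on(bool cond)
{
    if (cond)
        bug_hits++;
}

static void enter_lazy(enum lazy_mode mode)
{
    if (model_in_interrupt())
        return;                 /* the guard added by the patch */

    model_bug_on(lazy_mode != LAZY_NONE);
    lazy_mode = mode;
}

static void leave_lazy(enum lazy_mode mode)
{
    if (model_in_interrupt())
        return;                 /* the guard added by the patch */

    model_bug_on(lazy_mode != mode);
    lazy_mode = LAZY_NONE;
}

/* Models an interrupt handler whose work enters and leaves lazy MMU mode. */
static void model_irq_path(void)
{
    irq_depth++;
    enter_lazy(LAZY_MMU);
    leave_lazy(LAZY_MMU);
    irq_depth--;
}

/* Process context enters lazy mode, is interrupted, then leaves.
 * Returns how many BUG_ON() checks would have fired. */
static int model_patched_sequence(void)
{
    bug_hits = 0;
    irq_depth = 0;
    lazy_mode = LAZY_NONE;

    enter_lazy(LAZY_MMU);
    model_irq_path();
    assert(lazy_mode == LAZY_MMU); /* interrupted state is left untouched */
    leave_lazy(LAZY_MMU);
    return bug_hits;
}
```

In this model no check fires, and the interrupted sequence is still in lazy mode when the handler returns, because both calls made from the handler return early.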
Chuck Anderson
2010-Dec-09 01:21 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
Jeremy,

Is it possible for an ongoing lazy mode update to have batched some
MMU updates; an interrupt occurs; an interrupt routine does a non-lazy
MMU update for a PTE that is also in the lazy update queue; that
update is overwritten on return from the interrupt when the update
queue is flushed? Or are the PTE updates protected by a lock? If
they are, wouldn't we deadlock in the interrupt routine when it tries
to obtain that (I assume) spinlock?

Chuck

Jeremy Fitzhardinge wrote:
> Disabling interrupts would cause too much latency. I think we may have
> done this at one point, but it is very antisocial.
>
> Since lazy mode is effectively disabled in interrupt handlers anyway, it
> should just be enough to ignore enter/leave requests. Does this work
> for you?
>
> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> Date: Wed, 8 Dec 2010 14:21:16 -0800
> Subject: [PATCH] x86/paravirt: don't enter/leave lazy mode in interrupts.
>
> We already ignore the current state of lazy mode in interrupts, but we
> should also ignore any attempt to enter/leave lazy mode within
> an interrupt context.
>
> enter_lazy() will BUG if it sees an attempt at a nested entry to lazy
> mode, which is generally an error. However, it's possible that an
> interrupt handler may do something that would trigger a batched MMU
> update, for example, and that could interrupt an existing batched update.
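The ordering being asked about here can be written down as a toy model. Everything below is hypothetical and purely illustrative: a small array stands in for page table entries, a fixed-size queue stands in for the batched updates, and the function names are invented. It shows only what the sequence would do if it were allowed to happen, not whether the kernel permits it.

```c
#include <assert.h>
#include <stddef.h>

#define PTE_COUNT 4
#define QUEUE_MAX 8

struct pending_update {
    size_t idx;
    unsigned long val;
};

static unsigned long pte[PTE_COUNT];            /* toy "page table" */
static struct pending_update queue[QUEUE_MAX];  /* toy batch of deferred writes */
static size_t queued;

/* Lazy update: remember the write, apply it later. */
static void set_pte_lazy(size_t idx, unsigned long val)
{
    assert(idx < PTE_COUNT && queued < QUEUE_MAX);
    queue[queued].idx = idx;
    queue[queued].val = val;
    queued++;
}

/* Non-lazy update: apply the write immediately. */
static void set_pte_direct(size_t idx, unsigned long val)
{
    assert(idx < PTE_COUNT);
    pte[idx] = val;
}

/* Leaving lazy mode: apply every deferred write in order. */
static void flush_queue(void)
{
    for (size_t i = 0; i < queued; i++)
        pte[queue[i].idx] = queue[i].val;
    queued = 0;
}

/* The sequence from the question; returns the final value of entry 0. */
static unsigned long model_stale_update(void)
{
    pte[0] = 0;
    queued = 0;

    set_pte_lazy(0, 0x1000);    /* process context: batched */
    set_pte_direct(0, 0x2000);  /* interrupt handler: immediate */
    flush_queue();              /* batch is applied after the interrupt */
    return pte[0];
}
```

In the model the deferred write lands last, so the value written by the handler is lost, which is the overwrite the question describes.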
Chuck Anderson
2010-Dec-09 06:50 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
Jeremy,

Looking at copy_pte_range(), the stale update scenario I described
below can't happen. I believe the deadlock could happen but that is
not a lazy/not lazy MMU update issue.

Here is an extract from your proposed patch:

 static inline void enter_lazy(enum paravirt_lazy_mode mode)
 {
+	if (in_interrupt())
+		return;
+
 	BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);

My vote is for something like:

 static inline void enter_lazy(enum paravirt_lazy_mode mode)
 {
-	BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
+	/*
+	 * Switch modes only if we are not in an interrupt context.
+	 * The mode is ignored while handling an interrupt.
+	 */
+	if (!in_interrupt()) {
+		BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);

-	percpu_write(paravirt_lazy_mode, mode);
+		percpu_write(paravirt_lazy_mode, mode);
+	}
 }

 static void leave_lazy(enum paravirt_lazy_mode mode)
 {
-	BUG_ON(percpu_read(paravirt_lazy_mode) != mode);
+	/*
+	 * Switch modes only if we are not in an interrupt context.
+	 * The mode is ignored while handling an interrupt.
+	 */
+	if (!in_interrupt()) {
+		BUG_ON(percpu_read(paravirt_lazy_mode) != mode);

-	percpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);
+		percpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);
+	}
 }

Thanks,
Chuck

Chuck Anderson wrote:
> Jeremy,
> Is it possible for an ongoing lazy mode update to have batched some
> MMU updates; an interrupt occurs; an interrupt routine does a non-lazy
> MMU update for a PTE that is also in the lazy update queue; that
> update is overwritten on return from the interrupt when the update
> queue is flushed? Or are the PTE updates protected by a lock? If
> they are, wouldn't we deadlock in the interrupt routine when it tries
> to obtain that (I assume) spinlock?
> Chuck
Jeremy Fitzhardinge
2010-Dec-09 17:43 UTC
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
On 12/08/2010 05:21 PM, Chuck Anderson wrote:
> Jeremy,
> Is it possible for an ongoing lazy mode update to have batched some
> MMU updates; an interrupt occurs; an interrupt routine does a non-lazy
> MMU update for a PTE that is also in the lazy update queue; that
> update is overwritten on return from the interrupt when the update
> queue is flushed? Or are the PTE updates protected by a lock? If
> they are, wouldn't we deadlock in the interrupt routine when it tries
> to obtain that (I assume) spinlock?

The kernel-wide rule is that to update a usermode pte, you must be
holding the appropriate pte lock. The pte lock is not interrupt safe,
so it is never correct to do a usermode pte update from interrupt
context.

Kernel pte updates don't have any particular lock associated with them;
each subsystem generally has its own locking scheme to serialize the
updates if necessary. Overall the kernel's mappings aren't changed very
often, except for specific things like kmap, vmalloc, page attributes,
etc.

So the circumstances you point out would be bugs regardless of whether
Xen or lazy mmu updates are in effect. Lazy updates rely on those rules
being correctly enforced (in particular, it is never correct to be in
lazy mmu update mode for usermode ptes without holding the pte lock).

    J