Simon Horman
2009-May-29 04:43 UTC
[Xen-devel] Regression introduced by changeset "evtchn: Free pirq_to_evtchn/pirq_mask arrays on domain destruction."
Changeset "evtchn: Free pirq_to_evtchn/pirq_mask arrays on domain destruction." (19661:326b24bfa9f9) appears to introduce a regression.

The problem that I am observing is a panic when destroying an HVM domain that has PCI devices passed through.

(XEN) ----[ Xen-3.4.0-rc4-pre x86_64 debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e008:[<ffff828c8014e4a0>] __pirq_guest_unbind+0x1c0/0x2e0
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff828c802a8000 rcx: 0000000000000000
(XEN) rdx: 0000000000000006 rsi: ffff8300b7e0a16e rdi: ffff8300b7e0a166
(XEN) rbp: ffff8300b7e0a150 rsp: ffff83011bc57db0 r8: 000000000000000c
(XEN) r9: 000000000000000c r10: ffff828400000000 r11: 00000000000001a4
(XEN) r12: ffff8301154fe000 r13: ffff828c802a8000 r14: 0000000000000011
(XEN) r15: 0000000000000031 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 00000000b91ea000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83011bc57db0:
(XEN) 000000f1dec36000 0000000000000000 0000000000020000 ffff828c802a8000
(XEN) 0000000000000011 ffff8301154fe000 ffff8301154ffa28 0000000000000011
(XEN) ffff8300defa8438 ffff828c8014ea2d 0000000000000001 ffff8301154fe000
(XEN) ffff8300defa8010 ffff8301154fe000 ffff8301154ffa28 ffff828c801332db
(XEN) ffff8301154fe178 ffff8301154fe000 ffffffffffffff00 ffff8301154fe000
(XEN) ffff8301154ffa28 ffff828c80296900 ffff828c80233100 ffff828c80144999
(XEN) 0000000000000000 ffff828c80105a8e 0000000000000282 0000000000000000
(XEN) ffff828c80233240 0000000000000001 ffff828c80296900 ffff828c8011f8b9
(XEN) 0000000000000001 ffff83011bc57f28 ffff828c80297900 ffff828c80119cd1
(XEN) ffff83011bc57f28 ffff83011bc57f28 ffff828c80296900 ffff828c802315d0
(XEN) ffff83011bc57f28 ffff828c80145708 0000000000002000 ffff8300defac000
(XEN) ffff8300dec36000 0000002b4c001fce 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000246 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 ffffffff802073aa 0000000000000001
(XEN) 0000000000000000 0000000000000001 0000010000000000 ffffffff802073aa
(XEN) 000000000000e033 0000000000000246 ffffffff805cbf50 000000000000e02b
(XEN) 0000000000000000 000007ea00000001 00114feeffffffff 0000000000000000
(XEN) 0000000000000001 ffff8300defac000
(XEN) Xen call trace:
(XEN) [<ffff828c8014e4a0>] __pirq_guest_unbind+0x1c0/0x2e0
(XEN) [<ffff828c8014ea2d>] pirq_guest_unbind+0x5d/0x1b0
(XEN) [<ffff828c801332db>] pci_release_devices+0xab/0x230
(XEN) [<ffff828c80144999>] arch_domain_destroy+0x19/0x130
(XEN) [<ffff828c80105a8e>] complete_domain_destroy+0x6e/0x100
(XEN) [<ffff828c8011f8b9>] rcu_process_callbacks+0xc9/0x280
(XEN) [<ffff828c80119cd1>] do_softirq+0x51/0x90
(XEN) [<ffff828c80145708>] idle_loop+0x58/0xb0
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 0000000115636067 000000000003ee24
(XEN) L3[0x000] = 00000000b7e25067 000000000003ee25
(XEN) L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
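For readers decoding the panic above: the `error_code=0002` value follows the architectural x86 page-fault error-code bit layout. The sketch below is only an illustrative decoder (the helper name `decode_pf_error_code` is made up for this note and is not part of Xen). Together with the faulting linear address of 0, the decoded value would be consistent with a hypervisor-mode write through a NULL pointer in `__pirq_guest_unbind`, although the log alone does not prove which pointer was involved.

```python
# Architectural x86 page-fault error-code bits (low five bits only).
# Each entry: (mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set in a paging entry", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code):
    """Return human-readable descriptions for a page-fault error code.

    Hypothetical helper for illustration; not a Xen function.
    """
    descriptions = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text is not None:
            descriptions.append(text)
    return descriptions


# The value reported in the panic above.
print(decode_pf_error_code(0x0002))
# prints ['page not present', 'write access', 'supervisor mode']
```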
Jan Beulich
2009-May-29 07:19 UTC
Re: [Xen-devel] Regression introduced by changeset "evtchn: Free pirq_to_evtchn/pirq_mask arrays on domain destruction."
Yeah, evtchn_destroy() runs much earlier than arch_domain_destroy().

Keir, is there any reason evtchn_destroy() cannot be deferred accordingly? If there is, then un-tying the freeing of pirq_mask and/or pirq_to_evtchn from evtchn_destroy() would be needed. I have to admit that it seems not logical even in the original code to have IRQ (and hence indirectly evtchn) related activities going on for a domain past evtchn_destroy().

Jan
Keir Fraser
2009-May-29 08:11 UTC
Re: [Xen-devel] Regression introduced by changeset "evtchn: Free pirq_to_evtchn/pirq_mask arrays on domain destruction."
On 29/05/2009 08:19, "Jan Beulich" <JBeulich@novell.com> wrote:

> Yeah, evtchn_destroy() runs much earlier than arch_domain_destroy().
>
> Keir, is there any reason evtchn_destroy() cannot be deferred accordingly?
> If there is, then un-tying the freeing of pirq_mask and/or pirq_to_evtchn
> from evtchn_destroy() would be needed. I have to admit that it seems not
> logical even in the original code to have IRQ (and hence indirectly evtchn)
> related activities going on for a domain past evtchn_destroy().

There are subtleties which make it a bad idea to defer evtchn_destroy() to complete_domain_destroy(). Changeset 15465 actually deliberately moved evtchn_destroy() earlier, and that cleaned up some issues. Moving the xfree()s to complete_domain_destroy() was my plan B in this case.

 -- Keir
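The ordering problem discussed in this thread can be sketched with a toy model. This is not Xen code: the class, the function bodies, and the pirq number 17 are all assumptions made for illustration, and setting the array to `None` merely stands in for the xfree() of pirq_to_evtchn. The model only shows why freeing in evtchn_destroy() breaks the later unbind reached from complete_domain_destroy(), and why moving the free later (the "plan B" mentioned above) would avoid it.

```python
class Domain:
    """Toy stand-in for a domain with one passed-through PCI interrupt."""

    def __init__(self, nr_pirqs):
        self.pirq_to_evtchn = [0] * nr_pirqs
        self.passthrough_pirqs = [17]  # arbitrary example pirq


def free_pirq_arrays(d):
    # Stands in for xfree(); the array is unusable afterwards.
    d.pirq_to_evtchn = None


def pirq_guest_unbind(d, pirq):
    # Touches the per-domain array; raises TypeError if it is already gone.
    d.pirq_to_evtchn[pirq] = 0


def evtchn_destroy(d, free_arrays):
    # Runs early in teardown.
    if free_arrays:
        free_pirq_arrays(d)


def complete_domain_destroy(d, free_arrays):
    # Models arch_domain_destroy() -> pci_release_devices() -> unbind.
    for pirq in d.passthrough_pirqs:
        pirq_guest_unbind(d, pirq)
    if free_arrays:
        free_pirq_arrays(d)


def destroy(free_in_evtchn_destroy):
    """Run the modelled teardown; return "fault" or "ok"."""
    d = Domain(32)
    try:
        evtchn_destroy(d, free_arrays=free_in_evtchn_destroy)
        complete_domain_destroy(d, free_arrays=not free_in_evtchn_destroy)
    except TypeError:
        return "fault"
    return "ok"


print(destroy(True))   # free early, as in the regressing changeset
print(destroy(False))  # free late, as in plan B
```

In this model the early free yields "fault" and the late free yields "ok"; whether the real fix needs additional changes is not something the sketch can show.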