Andrew Cooper
2013-Aug-05 20:38 UTC
[RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
Automated testing on Xen-4.3 testing tip found an interesting issue:

(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
(XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
(XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
(XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
(XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
(XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
(XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
(XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
(XEN) ffff83043f2c6008: [<ffff82c4c01aa73d>] cpuidle_wakeup_mwait+0x2d/0xba
(XEN) ffff83043f2c6058: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c60c8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c60d8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c61a8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6230: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6268: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c62d8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c62e8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c63b8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6440: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6478: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6498: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c64e8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c64f8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6550: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) ffff83043f2c6568: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c65c8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6650: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6688: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c66f8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6708: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c67d8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6898: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6908: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6918: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6938: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6988: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c69e8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6a70: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6aa8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6b18: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
(XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
(XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
(XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
(XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
(XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
(XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
(XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
(XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
(XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
(XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
(XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
(XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
(XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
(XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
(XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
(XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
(XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
(XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
(XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
(XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
(XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
(XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
(XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
(XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
(XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
(XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
(XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
(XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
(XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
(XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
(XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
(XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
(XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
(XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) DOUBLE FAULT -- system shutdown
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

The hpet interrupt handler runs with interrupts enabled, due to the
spin_unlock_irq() in:

    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock_irq(&desc->lock);
        tsc_in = tb_init_done ? get_cycles() : 0;
        action->handler(irq, action->dev_id, regs);
        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
        spin_lock_irq(&desc->lock);
    }

in do_IRQ().

Clearly there are cases where the frequency of the HPET interrupt is faster
than the time it takes to process handle_hpet_broadcast(), I presume in part
because of the large amount of cpumask manipulation.

One solution, presented in this patch, is to disable interrupts while running
the hpet event handler, but the patch is RFC because I don't really like it;
it feels a little bit like a hack.

handle_hpet_broadcast() is clearly too long as an interrupt handler, and
perhaps effort should be made to reduce it if possible. Then again, interrupt
handlers for this (and other Xen-consumed interrupts?) should probably be run
with interrupts disabled, to prevent exactly these kinds of problems.

Thoughts/comments?

CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hpet.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 946d133..46e14a9 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -229,7 +229,9 @@ static void hpet_interrupt_handler(int irq, void *data,
         return;
     }
 
+    local_irq_disable();
     ch->event_handler(ch);
+    local_irq_enable();
 }
 
 static void hpet_msi_unmask(struct irq_desc *desc)
-- 
1.7.10.4
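
As a back-of-envelope check of how quickly this runs the stack out (the
numbers below are read off the dump above; the ~0x210-byte per-nesting figure
is an estimate taken from two adjacent handle_hpet_broadcast entries, not
something measured):

#include <stdio.h>

int main(void)
{
    /* Spacing between two successive handle_hpet_broadcast entries in the
     * dump (0xffff83043f2c6268 - 0xffff83043f2c6058), i.e. roughly one
     * nested common_interrupt/do_IRQ/handler frame. */
    unsigned long frame = 0xffff83043f2c6268UL - 0xffff83043f2c6058UL;
    /* Size of the valid stack range reported in the crash. */
    unsigned long stack = 0xffff83043f2c8000UL - 0xffff83043f2c6000UL;

    printf("bytes per nested interrupt: ~%lu\n", frame);         /* ~528 */
    printf("nestings to exhaust stack:  ~%lu\n", stack / frame);  /* ~15  */
    return 0;
}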
Keir Fraser
2013-Aug-06 04:49 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
On 05/08/2013 21:38, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:

> The hpet interrupt handler runs with interrupts enabled, due to the
> spin_unlock_irq() in:
>
> while ( desc->status & IRQ_PENDING )
> {
>     desc->status &= ~IRQ_PENDING;
>     spin_unlock_irq(&desc->lock);
>     tsc_in = tb_init_done ? get_cycles() : 0;
>     action->handler(irq, action->dev_id, regs);
>     TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>     spin_lock_irq(&desc->lock);
> }
>
> in do_IRQ().

But the handler sets IRQ_INPROGRESS before entering this loop, which should
prevent re-entry of the IRQ handler (hpet_interrupt_handler). So I don't see
how the multiple re-entries of hpet_interrupt_handler in the call trace are
possible.

 -- Keir
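
For reference, a condensed sketch of the guard Keir is referring to. This is
not the literal do_IRQ() (which among other things also acks the interrupt
via desc->handler and special-cases guest-bound IRQs); it only shows how
IRQ_INPROGRESS and IRQ_PENDING are intended to interact with the loop quoted
above:

/* Condensed sketch, not the real do_IRQ(): a second interrupt arriving while
 * the action handler runs (with desc->lock dropped and IRQs enabled) should
 * only set IRQ_PENDING and return, leaving the first invocation's loop to
 * pick up the extra work. */
void do_IRQ_sketch(struct irq_desc *desc, int irq, struct cpu_user_regs *regs)
{
    spin_lock(&desc->lock);

    if ( desc->status & IRQ_INPROGRESS )
    {
        /* Re-entered: note the pending interrupt and get out. */
        desc->status |= IRQ_PENDING;
        spin_unlock(&desc->lock);
        return;
    }

    desc->status |= IRQ_INPROGRESS | IRQ_PENDING;

    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock_irq(&desc->lock);
        desc->action->handler(irq, desc->action->dev_id, regs);
        spin_lock_irq(&desc->lock);
    }

    desc->status &= ~IRQ_INPROGRESS;
    spin_unlock(&desc->lock);
}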
Jan Beulich
2013-Aug-06 08:01 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Automated testing on Xen-4.3 testing tip found an interesting issue
>
> (XEN) *** DOUBLE FAULT ***
> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----

The call trace is suspicious in ways beyond what Keir already pointed out -
with debug=y, there shouldn't be bogus entries listed, yet ...

> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
> (XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
> (XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
> (XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
> (XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
> (XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
> (XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):

[... removed redundant stuff]

> (XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
> (XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
> (XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
> (XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
> (XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
> (XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
> (XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
> (XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
> (XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
> (XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
> (XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
> (XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
> (XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
> (XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
> (XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
> (XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
> (XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
> (XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
> (XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
> (XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
> (XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
> (XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309

... how would this end up getting called from do_IRQ()?

> (XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
> (XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
> (XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
> (XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
> (XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
> (XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb

And this one looks bogus too. Question therefore is whether the problem you
describe isn't a consequence of an earlier issue.

> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) DOUBLE FAULT -- system shutdown
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> The hpet interrupt handler runs with interrupts enabled, due to the
> spin_unlock_irq() in:
>
> while ( desc->status & IRQ_PENDING )
> {
>     desc->status &= ~IRQ_PENDING;
>     spin_unlock_irq(&desc->lock);
>     tsc_in = tb_init_done ? get_cycles() : 0;
>     action->handler(irq, action->dev_id, regs);
>     TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>     spin_lock_irq(&desc->lock);
> }
>
> in do_IRQ().
>
> Clearly there are cases where the frequency of the HPET interrupt is faster
> than the time it takes to process handle_hpet_broadcast(), I presume in part
> because of the large amount of cpumask manipulation.

How many CPUs (and how many usable HPET channels) does the system have that
this crash was observed on?

Jan
Andrew Cooper
2013-Aug-06 10:32 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
On 06/08/13 09:01, Jan Beulich wrote:
>>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Automated testing on Xen-4.3 testing tip found an interesting issue
>>
>> (XEN) *** DOUBLE FAULT ***
>> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
> The call trace is suspicious in ways beyond what Keir already
> pointed out - with debug=y, there shouldn't be bogus entries listed,
> yet ...

show_stack_overflow() doesn't have a debug case which follows frame pointers.
I shall submit a patch for this presently, and put it into XenServer in the
hope of getting a better stack trace in the future.

<snip>

> And this one looks bogus too. Question therefore is whether the
> problem you describe isn't a consequence of an earlier issue.

There is nothing apparently interesting preceding the crash. Just some spew
from an HVM domain using the 0x39 debug port.

>
>> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
>> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
>> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
>> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
>> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
>> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
>> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
>> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
>> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
>> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
>> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 3:
>> (XEN) DOUBLE FAULT -- system shutdown
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> The hpet interrupt handler runs with interrupts enabled, due to the
>> spin_unlock_irq() in:
>>
>> while ( desc->status & IRQ_PENDING )
>> {
>>     desc->status &= ~IRQ_PENDING;
>>     spin_unlock_irq(&desc->lock);
>>     tsc_in = tb_init_done ? get_cycles() : 0;
>>     action->handler(irq, action->dev_id, regs);
>>     TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>>     spin_lock_irq(&desc->lock);
>> }
>>
>> in do_IRQ().
>>
>> Clearly there are cases where the frequency of the HPET interrupt is faster
>> than the time it takes to process handle_hpet_broadcast(), I presume in part
>> because of the large amount of cpumask manipulation.
> How many CPUs (and how many usable HPET channels) does the
> system have that this crash was observed on?
>
> Jan

The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.

The full boot xl dmesg is attached, but it appears that there are 8 broadcast
HPET channels. This is further backed up by the 'i' debug key (also attached).

Keir: (merging your thread back here)
I see your point regarding IRQ_INPROGRESS, but even with 8 HPET interrupts,
there are rather more than 8 occurrences of handle_hpet_broadcast() in the
stack. If the occurrences were just function pointers on the stack, I would
expect to see several handle_hpet_broadcast()+0x0/0x268.

~Andrew
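
For readers following along, a rough idea of what a frame-pointer-following
debug case could look like. This is illustrative only, not the patch being
referred to, and it ignores the interrupt/exception frames a real walker has
to cross:

/* Illustrative only: walk the %rbp chain on x86-64 when frame pointers are
 * available (debug builds). Each frame stores the caller's %rbp at [rbp]
 * and the return address at [rbp + 8]. */
static void dump_frame_pointer_chain(unsigned long rbp,
                                     unsigned long stack_lo,
                                     unsigned long stack_hi)
{
    while ( rbp >= stack_lo && rbp + 2 * sizeof(unsigned long) <= stack_hi )
    {
        const unsigned long *frame = (const unsigned long *)rbp;

        printk("   [<%016lx>]\n", frame[1]);  /* return address */

        if ( frame[0] <= rbp )                /* chain must move up the stack */
            break;
        rbp = frame[0];                       /* saved caller %rbp */
    }
}

A real implementation also has to re-seed rbp from the saved register block
each time it crosses an interrupt or exception frame, which is the awkward
part in traces like the one above.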
Jan Beulich
2013-Aug-06 11:44 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.

Not all that big.

> The full boot xl dmesg is attached, but it appears that there are 8
> broadcast HPET channels. This is further backed up by the 'i' debug key
> (also attached).

Right. And with fewer CPUs than HPET channels, you could get the system into
a mode where each CPU uses a dedicated channel ("maxcpus=4", suppressing
registration of all the disabled ones).

> Keir: (merging your thread back here)
> I see your point regarding IRQ_INPROGRESS, but even with 8 HPET interrupts,
> there are rather more than 8 occurrences of handle_hpet_broadcast() in the
> stack. If the occurrences were just function pointers on the stack, I would
> expect to see several handle_hpet_broadcast()+0x0/0x268.

Which further hints at some earlier problem. I suppose you don't happen to
have a dump of that crash, or else you could inspect the IRQ descriptors as
well as the stack for whether all instances came from the same IRQ/vector.

Jan
Andrew Cooper
2013-Aug-06 13:23 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
On 06/08/13 12:44, Jan Beulich wrote:
>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
> Not all that big.
>
>> The full boot xl dmesg is attached, but it appears that there are 8
>> broadcast HPET channels. This is further backed up by the 'i' debug key
>> (also attached).
> Right. And with fewer CPUs than HPET channels, you could get
> the system into a mode where each CPU uses a dedicated channel
> ("maxcpus=4", suppressing registration of all the disabled ones).

Does this setup actually mean that there are 8 HPET channels which are all
broadcasting to every pcpu? The affinities listed in debug-keys 'i' seem to
be towards single pcpus, but the order looks peculiar to say the least.

>
>> Keir: (merging your thread back here)
>> I see your point regarding IRQ_INPROGRESS, but even with 8 HPET interrupts,
>> there are rather more than 8 occurrences of handle_hpet_broadcast() in the
>> stack. If the occurrences were just function pointers on the stack, I would
>> expect to see several handle_hpet_broadcast()+0x0/0x268.
> Which further hints at some earlier problem. I suppose you don't
> happen to have a dump of that crash, or else you could inspect
> the IRQ descriptors as well as the stack for whether all instances
> came from the same IRQ/vector.
>
> Jan
>

Sadly no - the crashdump analyser grabbed the double fault IST, rather than
the entire contents of the main stack. I shall extend the analyser to pick up
the main stack as well; it does cross IST boundaries for call traces.

I shall see how easy it is to make it parse the irq_desc's & friends as well
on crash, although for this case it might be easier just to tweak the double
fault handler.

~Andrew
Jan Beulich
2013-Aug-06 13:57 UTC
Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
>>> On 06.08.13 at 15:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 06/08/13 12:44, Jan Beulich wrote:
>>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
>> Not all that big.
>>
>>> The full boot xl dmesg is attached, but it appears that there are 8
>>> broadcast HPET channels. This is further backed up by the 'i' debug key
>>> (also attached).
>> Right. And with fewer CPUs than HPET channels, you could get
>> the system into a mode where each CPU uses a dedicated channel
>> ("maxcpus=4", suppressing registration of all the disabled ones).
>
> Does this setup actually mean that there are 8 HPET channels which are all
> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
> seem to be towards single pcpus, but the order looks peculiar to say the
> least.

No, each channel will be used for just one CPU when there are at
least as many channels as CPUs. The difference between not using
said command line option and using it is that in the former case a
new channel would get assigned to a CPU each time it needs one,
while in the latter case a static (pre-)assignment is used, i.e. each
CPU will always use the same single channel.

Jan
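
A toy illustration of the distinction being drawn here, using the channel and
CPU counts of the box in question; the real selection logic lives in
xen/arch/x86/hpet.c and differs in detail:

/* Toy model only: two ways a CPU can end up with an HPET broadcast channel. */
#define NR_HPET_CHANNELS 8
#define NR_CPUS          4

/* Static (pre-)assignment, possible once the channel count is >= the CPU
 * count (e.g. with "maxcpus=4"): a CPU always comes back to the same
 * channel. */
static unsigned int static_channel_for(unsigned int cpu)
{
    return cpu;                        /* cpu < NR_CPUS <= NR_HPET_CHANNELS */
}

/* Dynamic assignment: hand out whichever channel is free when a CPU asks
 * for one, so the same CPU may use different channels over time. */
static unsigned int next_free;
static unsigned int dynamic_channel_for(unsigned int cpu)
{
    (void)cpu;
    return next_free++ % NR_HPET_CHANNELS;
}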
On 06/08/13 14:57, Jan Beulich wrote:
>>> Right. And with fewer CPUs than HPET channels, you could get
>>> the system into a mode where each CPU uses a dedicated channel
>>> ("maxcpus=4", suppressing registration of all the disabled ones).
>> Does this setup actually mean that there are 8 HPET channels which are all
>> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
>> seem to be towards single pcpus, but the order looks peculiar to say the
>> least.
> No, each channel will be used for just one CPU when there are at
> least as many channels as CPUs. The difference between not using
> said command line option and using it is that in the former case a
> new channel would get assigned to a CPU each time it needs one,
> while in the latter case a static (pre-)assignment is used, i.e. each
> CPU will always use the same single channel.
>
> Jan
>

We had another crash, this time with a proper stack trace. (This was using an
early version of the stack trace improvements series.)

From the stack trace (now correctly with frame pointers), we see 9 calls to
handle_hpet_broadcast().

This indicates that the current logic does not correctly prevent repeated
delivery of interrupts.

~Andrew
Hi,

At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
> We had another crash, this time with a proper stack trace. (This was using
> an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls to
> handle_hpet_broadcast().

Hmmm. I don't think this can be following frame pointers -- or if it is,
something very odd is happening here:

  ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
  ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
  ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70

and here:

  ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
  ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
  ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
  ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde

Tim.
On 13/08/13 10:22, Tim Deegan wrote:
> Hi,
>
> At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
>> We had another crash, this time with a proper stack trace. (This was using
>> an early version of the stack trace improvements series.)
>>
>> From the stack trace (now correctly with frame pointers), we see 9 calls to
>> handle_hpet_broadcast().
> Hmmm. I don't think this can be following frame pointers -- or if it is,
> something very odd is happening here:
>
>   ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
>   ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
>   ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
>
> and here:
>
>   ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
>   ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
>   ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
>   ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
>
> Tim.

Hmm yes. I will double check the frame-pointer-through-exception-frame logic.

~Andrew
>>> On 13.08.13 at 11:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> We had another crash, this time with a proper stack trace. (This was using
> an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls to
> handle_hpet_broadcast().
>
> This indicates that the current logic does not correctly prevent repeated
> delivery of interrupts.

And this was with a 1:1 CPU <-> HPET channel mapping (not visible from just
the stack trace)?

In any case, could you try moving the call to ack_APIC_irq() from
hpet_msi_ack() to hpet_msi_end() (the latter may need to be re-created,
depending on the Xen version you do this with)? Or, as another alternative,
call hpet_msi_{,un}mask() from the two functions respectively (albeit I think
this might result in lost interrupts). Potentially hpet_msi_end() would then
also need to disable interrupts before doing either of these actions.

Jan
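
For illustration, a rough sketch of the first suggestion. The hook
prototypes, the existing body of hpet_msi_ack(), and the member list of the
controller structure all vary between Xen versions, so the details below are
assumptions rather than a drop-in patch:

/* Sketch of moving the LAPIC EOI from the ->ack hook to a (re-created)
 * ->end hook, so a re-raised HPET MSI cannot be delivered again until
 * do_IRQ() has finished running the action handler. */
static void hpet_msi_ack(struct irq_desc *desc)
{
    irq_complete_move(desc);
    move_native_irq(desc);
    /* ack_APIC_irq() deliberately no longer called here ... */
}

static void hpet_msi_end(struct irq_desc *desc, u8 vector)
{
    /* ... it now happens only once the handler has fully completed. */
    ack_APIC_irq();
}

static hw_irq_controller hpet_msi_type = {
    .typename     = "HPET-MSI",
    .startup      = hpet_msi_startup,
    .shutdown     = hpet_msi_shutdown,
    .enable       = hpet_msi_unmask,
    .disable      = hpet_msi_mask,
    .ack          = hpet_msi_ack,
    .end          = hpet_msi_end,
    .set_affinity = hpet_msi_set_affinity,
};

The alternative mentioned above would instead have the ->ack hook call
hpet_msi_mask() and the ->end hook call hpet_msi_unmask(), trading re-delivery
during the handler for the risk of lost interrupts.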