flight 11946 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-credit2            7 debian-install    fail REGR. vs. 11944

Regressions which are regarded as allowable (not blocking):
 test-i386-i386-win                   14 guest-start.2     fail like 11944

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-winxpsp3    7 windows-install   fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  7 windows-install   fail never pass
 test-amd64-i386-qemuu-rhel6hvm-intel  7 redhat-install    fail never pass
 test-i386-i386-xl-qemuu-winxpsp3      7 windows-install   fail never pass
 test-amd64-amd64-xl-pcipt-intel       9 guest-start       fail never pass
 test-amd64-i386-rhel6hvm-amd         11 leak-check/check  fail never pass
 test-amd64-i386-rhel6hvm-intel       11 leak-check/check  fail never pass
 test-amd64-i386-qemuu-rhel6hvm-amd    7 redhat-install    fail never pass
 test-amd64-amd64-win                 16 leak-check/check  fail never pass
 test-amd64-i386-win-vcpus1           16 leak-check/check  fail never pass
 test-amd64-amd64-xl-winxpsp3         13 guest-stop        fail never pass
 test-i386-i386-xl-winxpsp3           13 guest-stop        fail never pass
 test-amd64-i386-win                  16 leak-check/check  fail never pass
 test-amd64-amd64-xl-win7-amd64       13 guest-stop        fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1   13 guest-stop        fail never pass
 test-amd64-amd64-xl-win              13 guest-stop        fail never pass
 test-amd64-i386-xend-winxpsp3        16 leak-check/check  fail never pass
 test-i386-i386-xl-win                13 guest-stop        fail never pass
 test-amd64-i386-xl-win7-amd64        13 guest-stop        fail never pass
 test-amd64-i386-xl-win-vcpus1        13 guest-stop        fail never pass

version targeted for testing:
 xen                  9207cc3a0862
baseline version:
 xen                  9ad1e42c341b

------------------------------------------------------------
People who touched revisions under test:
  David Vrabel <david.vrabel@citrix.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Julian Pidancet <julian.pidancet@gmail.com>
  Keir Fraser <keir@xen.org>
  Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Tim Deegan <tim@xen.org>
  Yongjie Ren <yongjie.ren@intel.com>
------------------------------------------------------------

jobs:
 build-amd64                              pass
 build-i386                               pass
 build-amd64-oldkern                      pass
 build-i386-oldkern                       pass
 build-amd64-pvops                        pass
 build-i386-pvops                         pass
 test-amd64-amd64-xl                      pass
 test-amd64-i386-xl                       pass
 test-i386-i386-xl                        pass
 test-amd64-i386-rhel6hvm-amd             fail
 test-amd64-i386-qemuu-rhel6hvm-amd       fail
 test-amd64-amd64-xl-qemuu-win7-amd64     fail
 test-amd64-amd64-xl-win7-amd64           fail
 test-amd64-i386-xl-win7-amd64            fail
 test-amd64-i386-xl-credit2               fail
 test-amd64-amd64-xl-pcipt-intel          fail
 test-amd64-i386-rhel6hvm-intel           fail
 test-amd64-i386-qemuu-rhel6hvm-intel     fail
 test-amd64-i386-xl-multivcpu             pass
 test-amd64-amd64-pair                    pass
 test-amd64-i386-pair                     pass
 test-i386-i386-pair                      pass
 test-amd64-amd64-xl-sedf-pin             pass
 test-amd64-amd64-pv                      pass
 test-amd64-i386-pv                       pass
 test-i386-i386-pv                        pass
 test-amd64-amd64-xl-sedf                 pass
 test-amd64-i386-win-vcpus1               fail
 test-amd64-i386-xl-win-vcpus1            fail
 test-amd64-i386-xl-winxpsp3-vcpus1       fail
 test-amd64-amd64-win                     fail
 test-amd64-i386-win                      fail
 test-i386-i386-win                       fail
 test-amd64-amd64-xl-win                  fail
 test-i386-i386-xl-win                    fail
 test-amd64-amd64-xl-qemuu-winxpsp3       fail
 test-i386-i386-xl-qemuu-winxpsp3         fail
 test-amd64-i386-xend-winxpsp3            fail
 test-amd64-amd64-xl-winxpsp3             fail
 test-i386-i386-xl-winxpsp3               fail

------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.
------------------------------------------------------------
changeset:   24790:9207cc3a0862
tag:         tip
user:        David Vrabel <david.vrabel@citrix.com>
date:        Mon Feb 13 13:34:47 2012 +0000

    libfdt: add to build

    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    Acked-by: Tim Deegan <tim@xen.org>
    Committed-by: Keir Fraser <keir@xen.org>

changeset:   24789:e060d1bd7b60
user:        David Vrabel <david.vrabel@citrix.com>
date:        Mon Feb 13 13:34:08 2012 +0000

    libfdt: fixup libfdt_env.h for xen

    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    Acked-by: Tim Deegan <tim@xen.org>
    Committed-by: Keir Fraser <keir@xen.org>

changeset:   24788:fcc188f21e47
user:        David Vrabel <david.vrabel@citrix.com>
date:        Mon Feb 13 13:33:26 2012 +0000

    libfdt: add version 1.3.0

    Add libfdt 1.3.0 from http://git.jdl.com/gitweb/?p=dtc.git

    This will be used by Xen to parse the DTBs provided by bootloaders
    on ARM platforms.

    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    Acked-by: Tim Deegan <tim@xen.org>
    Committed-by: Keir Fraser <keir@xen.org>

changeset:   24787:bd0a11ed1a67
user:        Ian Campbell <ian.campbell@citrix.com>
date:        Mon Feb 13 12:53:28 2012 +0000

    MAINTAINERS: Add entry for ARM w/ virt extensions port

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Committed-by: Keir Fraser <keir@xen.org>

changeset:   24786:79fe73117c12
user:        Julian Pidancet <julian.pidancet@gmail.com>
date:        Mon Feb 13 12:50:46 2012 +0000

    firmware: Introduce CONFIG_ROMBIOS and CONFIG_SEABIOS options

    This patch introduces configuration options allowing to built either a
    rombios only or a seabios only hvmloader.

    Building option ROMs like vgabios or etherboot is only enabled for
    a rombios hvmloader, since SeaBIOS takes care or extracting option
    ROMs itself from the PCI devices (these option ROMs are provided by
    the device model and do not need to be built in hvmloader).

    The Makefile in tools/firmware/ now only checks for bcc if rombios
    is enabled.

    These two configuration options are left on by default to remain
    compatible.

    Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   24785:e4d8d2524407
user:        Julian Pidancet <julian.pidancet@gmail.com>
date:        Mon Feb 13 12:50:04 2012 +0000

    hvmloader: Move option ROM loading into a separate optionnal file

    Make load_rom field in struct bios_config an optionnal callback rather
    than a boolean value. It allow BIOS specific code to implement it's
    own option ROM loading methods.

    Facilities to scan PCI devices, extract an deploy ROMs are moved into
    a separate file that can be compiled optionnaly.

    Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   24784:ab47cfef2b0a
user:        Julian Pidancet <julian.pidancet@gmail.com>
date:        Mon Feb 13 12:49:06 2012 +0000

    firmware: Use mkhex from hvmloader directory for etherboot ROMs

    To remain consistent with how other ROMs are built into hvmloader,
    call mkhex on etherboot ROMs from the hvmloader directory, instead of
    the etherboot directory. In other words, eb-roms.h is not used any
    more.

    Introduce ETHERBOOT_NICS config option to choose which ROMs should be
    built (kept rtl8139 and 8086100e per default as before).

    Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   24783:0fe9e2556e20
user:        Julian Pidancet <julian.pidancet@gmail.com>
date:        Mon Feb 13 12:48:20 2012 +0000

    hvmloader: Allow the mkhex command to take several file arguments

    Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   24782:e1f10d12b9fe
user:        Julian Pidancet <julian.pidancet@gmail.com>
date:        Mon Feb 13 12:47:46 2012 +0000

    hvmloader: Only compile 32bitbios_support.c when rombios is enabled

    32bitbios_support.c only contains code specific to rombios, and should
    not be built-in when building hvmloader for SeaBIOS only (as for
    rombios.c).

    Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   24781:6ae5506e49ab
user:        Jan Beulich <jbeulich@suse.com>
date:        Mon Feb 13 13:12:30 2012 +0100

    x86/vMCE: MC{G,i}_CTL handling adjustments

    - g_mcg_cap was read to determine whether MCG_CTL exists before it got
      initialized
    - h_mci_ctrl[] and dom_vmce()->mci_ctl[] both got initialized via
      memset() with an inappropriate size (hence causing a [minor?]
      information leak)

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>

changeset:   24780:e953d536d3c6
user:        Jan Beulich <jbeulich@suse.com>
date:        Mon Feb 13 13:09:02 2012 +0100

    x86/paging: use clear_guest() for zero-filling guest buffers

    While static arrays of all zeros may be tolerable (but are simply
    inefficient now that we have the necessary infrastructure), using
    on-stack arrays for this purpose (particularly when their size doesn't
    have an upper limit enforced) is calling for eventual problems (even
    if the code can be reached via administrative interfaces only).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Tim Deegan <tim@xen.org>

changeset:   24779:9ad1e42c341b
user:        Ian Campbell <ian.campbell@citrix.com>
date:        Fri Feb 10 17:24:50 2012 +0000

    xend: populate HVM guest grant table on boot

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

=======================================
commit 8cc8a3651c9c5bc2d0086d12f4b870fc525b9387
Author: Jan Beulich <JBeulich@suse.com>
Date:   Tue Feb 7 18:42:56 2012 +0000

    qemu-dm: fix unregister_iomem()

    This function (introduced quite a long time ago in
    e7911109f4321e9ba0cc56a253b653600aa46bea - "disable qemu PCI
    devices in HVM domains") appears to be completely broken, causing
    the regression reported in
    http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1805 (due to
    the newly added caller of it in
    56d7747a3cf811910c4cf865e1ebcb8b82502005 - "qemu: clean up
    MSI-X table handling"). It's unclear how the function can ever have
    fulfilled its purpose: the value returned by iomem_index() is *not* an
    index into mmio[].

    Additionally, fix two problems:
    - unregister_iomem() must not clear mmio[].start, otherwise
      cpu_register_physical_memory() won't be able to re-use the previous
      slot, thus causing a leak
    - cpu_unregister_io_memory() must not check mmio[].size, otherwise it
      won't properly clean up entries (temporarily) squashed through
      unregister_iomem()

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Tested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Tested-by: Yongjie Ren <yongjie.ren@intel.com>
On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
> flight 11946 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-xl-credit2    7 debian-install    fail REGR. vs. 11944

Host crash:
http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log

This is the debug Andrew Cooper added recently to track down the IRQ
assertion we've been seeing, sadly it looks like the debug code tries to
call xfree from interrupt context and therefore doesn't produce full
output :-(

Or is 24675:d82a1e3d3c65 ("xsm: Add security label to IRQ debug output")
at fault for adding the xfree in what may be an IRQ context? (are
keyhandlers run in IRQ context?)

A skanky quick "fix" follows.

Feb 13 17:17:29.777522 (XEN) *** IRQ BUG found ***
Feb 13 17:19:32.594539 (XEN) CPU0 -Testing vector 229 from bitmap 34,48,57,64,72,75,80,83,88,97,104-105,113,120-121,129,136,144,152,160,168,176,184,192,202
Feb 13 17:19:32.617515 (XEN) Guest interrupt information:
Feb 13 17:19:32.617536 (XEN) IRQ: 0 affinity:001 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound
Feb 13 17:19:32.617567 (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
Feb 13 17:19:32.626489 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]----
Feb 13 17:19:32.626512 (XEN) CPU: 0
Feb 13 17:19:32.626525 (XEN) RIP: e008:[<ffff82c48012c842>] xfree+0x33/0x121
Feb 13 17:19:32.641496 (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
Feb 13 17:19:32.641519 (XEN) rax: ffff82c4802d0800 rbx: ffff8301a7e00080 rcx: 0000000000000000
Feb 13 17:19:32.650560 (XEN) rdx: 0000000000000000 rsi: 0000000000000083 rdi: 0000000000000000
Feb 13 17:19:32.665510 (XEN) rbp: ffff82c4802afd18 rsp: ffff82c4802afcf8 r8: 0000000000000004
Feb 13 17:19:32.665550 (XEN) r9: 0000000000000000 r10: 0000000000000006 r11: ffff82c480224aa0
Feb 13 17:19:32.673509 (XEN) r12: ffff8301a7e00580 r13: 0000000000000005 r14: ffff82c4802aff18
Feb 13 17:19:32.685503 (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000006f0
Feb 13 17:19:32.685537 (XEN) cr3: 00000001a7f54000 cr2: 00000000c4b4ee84
Feb 13 17:19:32.697505 (XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008
Feb 13 17:19:32.697540 (XEN) Xen stack trace from rsp=ffff82c4802afcf8:
Feb 13 17:19:32.706513 (XEN) ffff8301a7e00080 ffff8301a7e00580 0000000000000005 ffff82c4802aff18
Feb 13 17:19:32.721495 (XEN) ffff82c4802afd88 ffff82c4801658ee ffff82c4802afd38 ffff82c48010098a
Feb 13 17:19:32.721531 (XEN) 00000400802afd68 0000000000000083 ffff8301a7e000a8 0000000000000000
Feb 13 17:19:32.729495 (XEN) 00000000fffffffa 00000000000000e5 ffff8301a7e00580 0000000000000005
Feb 13 17:19:32.738490 (XEN) ffff82c4802aff18 ffff8301a7e005a8 ffff82c4802afe28 ffff82c480167781
Feb 13 17:19:32.738515 (XEN) ffff8301a7ece000 ffff82c4802afde8 0000000000000000 ffff82c4802aff18
Feb 13 17:19:32.750497 (XEN) ffff82c4802aff18 0000000000000002 ffff82c4802aff18 ffff82c4802fa060
Feb 13 17:19:32.762568 (XEN) 000000e500000000 ffff82c4802fa060 ffff82c4802afe08 ffff82c48017bd51
Feb 13 17:19:32.762596 (XEN) ffff82c4802aff18 ffff82c4802aff18 ffff82c48025e380 ffff82c4802aff18
Feb 13 17:19:32.773513 (XEN) 00000000ffffffff 0000000000000002 00007d3b7fd501a7 ffff82c4801525d0
Feb 13 17:19:32.785503 (XEN) 0000000000000002 00000000ffffffff ffff82c4802aff18 ffff82c48025e380
Feb 13 17:19:32.785539 (XEN) ffff82c4802afee0 ffff82c4802aff18 0000001863058413 00000000000c0000
Feb 13 17:19:32.794514 (XEN) 000000000e1ff99c 000000000000c701 ffff82c4802f9a90 0000000000000000
Feb 13 17:19:32.809503 (XEN) 0000000000000000 ffff8301a7f5dc80 0000000000000000 0000002000000000
Feb 13 17:19:32.809529 (XEN) ffff82c4801581a9 000000000000e008 0000000000000246 ffff82c4802afee0
Feb 13 17:19:32.814513 (XEN) 0000000000000000 ffff82c4802aff10 ffff82c48015a647 0000000000000000
Feb 13 17:19:32.829506 (XEN) ffff8300d7cfb000 ffff8300d7af9000 0000000000000000 ffff82c4802afd88
Feb 13 17:19:32.829549 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Feb 13 17:19:32.841510 (XEN) 00000000dfc91f90 00000000deadbeef 0000000000000000 0000000000000000
Feb 13 17:19:32.853508 (XEN) 0000000000000000 0000000000000000 0000000000000000 00000000deadbeef
Feb 13 17:19:32.858496 (XEN) Xen call trace:
Feb 13 17:19:32.858518 (XEN) [<ffff82c48012c842>] xfree+0x33/0x121
Feb 13 17:19:32.858547 (XEN) [<ffff82c4801658ee>] dump_irqs+0x2a3/0x2ca
Feb 13 17:19:32.870500 (XEN) [<ffff82c480167781>] smp_irq_move_cleanup_interrupt+0x303/0x37b
Feb 13 17:19:32.870554 (XEN) [<ffff82c4801525d0>] irq_move_cleanup_interrupt+0x30/0x40
Feb 13 17:19:32.885510 (XEN) [<ffff82c4801581a9>] default_idle+0x99/0x9e
Feb 13 17:19:32.885541 (XEN) [<ffff82c48015a647>] idle_loop+0x6c/0x7c
Feb 13 17:19:32.897496 (XEN)
Feb 13 17:19:32.897510 (XEN)
Feb 13 17:19:32.897520 (XEN) ****************************************
Feb 13 17:19:32.897537 (XEN) Panic on CPU 0:
Feb 13 17:19:32.905499 (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
Feb 13 17:19:32.905522 (XEN) ****************************************
Feb 13 17:19:32.913488 (XEN)
Feb 13 17:19:32.913506 (XEN) Reboot in five seconds...

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1329216241 0
# Node ID 738424a5e5a5053c75cfbe64f6675b5d756daf1b
# Parent  0ba87b95e80bae059fe70b4b117dcc409f2471ef
xen: don't try to print IRQ SSID in IRQ debug from irq context.

It is not possible to call xfree() in that context.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 0ba87b95e80b -r 738424a5e5a5 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Mon Feb 13 17:26:08 2012 +0000
+++ b/xen/arch/x86/irq.c	Tue Feb 14 10:44:01 2012 +0000
@@ -2026,7 +2026,7 @@ static void dump_irqs(unsigned char key)
         if ( !irq_desc_initialized(desc) || desc->handler == &no_irq_type )
             continue;
 
-        ssid = xsm_show_irq_sid(irq);
+        ssid = in_irq() ? NULL : xsm_show_irq_sid(irq);
 
         spin_lock_irqsave(&desc->lock, flags);
 
@@ -2073,7 +2073,8 @@ static void dump_irqs(unsigned char key)
 
         spin_unlock_irqrestore(&desc->lock, flags);
 
-        xfree(ssid);
+        if ( ssid )
+            xfree(ssid);
     }
 
     dump_ioapic_irq_info();
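
The effect of the patch above can be tried outside the hypervisor with a small user-space model. This is only a sketch under stated assumptions: `model_in_irq`, `model_xfree()`, `model_show_irq_sid()` and `model_dump_one_irq()` are hypothetical stand-ins invented for illustration, not Xen functions, and the only behaviour borrowed from the thread is that the free routine asserts it is not called from IRQ context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xen's in_irq(); a plain flag in this model. */
static bool model_in_irq = false;

/* Stand-in for xfree(): like the routine in the trace, it asserts that
 * it is not being called from IRQ context. */
static void model_xfree(void *p)
{
    assert(!model_in_irq);
    free(p);
}

/* Stand-in for xsm_show_irq_sid(): returns a heap-allocated label. */
static char *model_show_irq_sid(int irq)
{
    char *label = malloc(16);

    if (label != NULL)
        snprintf(label, 16, "sid-%d", irq);
    return label;
}

/* Mirrors the patched loop body: skip the label lookup in IRQ context
 * and only free what was actually allocated.  Returns true when a label
 * was available for printing. */
static bool model_dump_one_irq(int irq)
{
    char *ssid = model_in_irq ? NULL : model_show_irq_sid(irq);
    bool had_label = (ssid != NULL);

    /* ... take the descriptor lock, print the IRQ state, drop the lock ... */

    if (ssid)
        model_xfree(ssid);
    return had_label;
}
```

With `model_in_irq` set, `model_dump_one_irq()` returns without touching the allocator at all, which is the property the patch relies on; the cost is that the security label is missing from the debug output in that case.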
On 02/14/2012 05:44 AM, Ian Campbell wrote:
> On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
>> flight 11946 xen-unstable real [real]
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-amd64-i386-xl-credit2    7 debian-install    fail REGR. vs. 11944
>
> Host crash:
> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log
>
> This is the debug Andrew Cooper added recently to track down the IRQ
> assertion we've been seeing, sadly it looks like the debug code tries to
> call xfree from interrupt context and therefore doesn't produce full
> output :-(
>
> Or is 24675:d82a1e3d3c65 ("xsm: Add security label to IRQ debug output")
> at fault for adding the xfree in what may be an IRQ context? (are
> keyhandlers run in IRQ context?)

Keyhandlers are not run in IRQ context (or at least, the primary methods
of invoking them don't run there - serial keypress, xl debug-key). The
placement of the xsm call and xfree was to avoid a similar backtrace from
attempting allocation while holding the irq's spinlock.

> A skanky quick "fix" follows.
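
The ordering described in the reply above (look up the label before taking the IRQ descriptor's lock, free it after dropping the lock) can be sketched in plain C. Everything here is an assumption made for illustration: `lock_held`, `model_lock()`, `model_unlock()`, `model_alloc()` and `model_free()` are hypothetical helpers that merely count allocator calls made while the pretend lock is held; they are not Xen APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Models the per-IRQ spinlock taken with spin_lock_irqsave(). */
static bool lock_held = false;

/* Counts allocator calls made while the lock is held; the ordering
 * below is meant to keep this at zero. */
static int allocator_calls_under_lock = 0;

static void model_lock(void)
{
    assert(!lock_held);
    lock_held = true;
}

static void model_unlock(void)
{
    assert(lock_held);
    lock_held = false;
}

static void *model_alloc(size_t n)
{
    if (lock_held)
        allocator_calls_under_lock++;
    return malloc(n);
}

static void model_free(void *p)
{
    if (lock_held)
        allocator_calls_under_lock++;
    free(p);
}

/* Allocate before locking, free after unlocking.  Returns the number of
 * allocator calls that happened under the lock. */
static int dump_with_label(void)
{
    char *label = model_alloc(32);   /* before the lock is taken */

    model_lock();
    /* ... print the IRQ state, including the label if present ... */
    model_unlock();

    model_free(label);               /* after the lock is dropped */
    return allocator_calls_under_lock;
}
```

Moving either allocator call between `model_lock()` and `model_unlock()` makes the counter non-zero, which is the situation the original placement was chosen to avoid.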
On Tue, 2012-02-14 at 10:44 +0000, Ian Campbell wrote:
> On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
> > flight 11946 xen-unstable real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-i386-xl-credit2    7 debian-install    fail REGR. vs. 11944
>
> Host crash:
> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log
>
> This is the debug Andrew Cooper added recently to track down the IRQ
> assertion we've been seeing, sadly it looks like the debug code tries to
> call xfree from interrupt context and therefore doesn't produce full
> output :-(

Are we still seeing the issue this debugging was intended to address? We
don't seem to be seeing the host crashes any more. Should the debug code
be patched up as in the following patch, otherwise when we do see it it
doesn't end up printing any useful info.

Someone recently reported bugs.debian.org/665433 to Debian, is this the
same underlying issue? That report is with Xen 4.0 FWIW.

> Or is 24675:d82a1e3d3c65 ("xsm: Add security label to IRQ debug output")
> at fault for adding the xfree in what may be an IRQ context? (are
> keyhandlers run in IRQ context?)
>
> A skanky quick "fix" follows.
>
> # HG changeset patch
> # User Ian Campbell <ian.campbell@citrix.com>
> # Date 1329216241 0
> # Node ID 738424a5e5a5053c75cfbe64f6675b5d756daf1b
> # Parent  0ba87b95e80bae059fe70b4b117dcc409f2471ef
> xen: don't try to print IRQ SSID in IRQ debug from irq context.
>
> It is not possible to call xfree() in that context.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
>
> diff -r 0ba87b95e80b -r 738424a5e5a5 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Mon Feb 13 17:26:08 2012 +0000
> +++ b/xen/arch/x86/irq.c	Tue Feb 14 10:44:01 2012 +0000
> @@ -2026,7 +2026,7 @@ static void dump_irqs(unsigned char key)
>          if ( !irq_desc_initialized(desc) || desc->handler == &no_irq_type )
>              continue;
>
> -        ssid = xsm_show_irq_sid(irq);
> +        ssid = in_irq() ? NULL : xsm_show_irq_sid(irq);
>
>          spin_lock_irqsave(&desc->lock, flags);
>
> @@ -2073,7 +2073,8 @@ static void dump_irqs(unsigned char key)
>
>          spin_unlock_irqrestore(&desc->lock, flags);
>
> -        xfree(ssid);
> +        if ( ssid )
> +            xfree(ssid);
>      }
>
>      dump_ioapic_irq_info();
>>> On 27.03.12 at 12:36, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> # HG changeset patch
>> # User Ian Campbell <ian.campbell@citrix.com>
>> # Date 1329216241 0
>> # Node ID 738424a5e5a5053c75cfbe64f6675b5d756daf1b
>> # Parent  0ba87b95e80bae059fe70b4b117dcc409f2471ef
>> xen: don't try to print IRQ SSID in IRQ debug from irq context.
>>
>> It is not possible to call xfree() in that context.
>>
>> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
>>
>> diff -r 0ba87b95e80b -r 738424a5e5a5 xen/arch/x86/irq.c
>> --- a/xen/arch/x86/irq.c	Mon Feb 13 17:26:08 2012 +0000
>> +++ b/xen/arch/x86/irq.c	Tue Feb 14 10:44:01 2012 +0000
>> @@ -2026,7 +2026,7 @@ static void dump_irqs(unsigned char key)
>>          if ( !irq_desc_initialized(desc) || desc->handler == &no_irq_type )
>>              continue;
>>
>> -        ssid = xsm_show_irq_sid(irq);
>> +        ssid = in_irq() ? NULL : xsm_show_irq_sid(irq);
>>
>>          spin_lock_irqsave(&desc->lock, flags);
>>
>> @@ -2073,7 +2073,8 @@ static void dump_irqs(unsigned char key)
>>
>>          spin_unlock_irqrestore(&desc->lock, flags);
>>
>> -        xfree(ssid);
>> +        if ( ssid )
>> +            xfree(ssid);

But perhaps xfree(NULL) should be made usable in any context (i.e. the
assertion in there moved down)? Otherwise the construct above is likely
to get collapsed again at some point with "xfree(NULL) is perfectly
valid" in mind.

Jan

>>      }
>>
>>      dump_ioapic_irq_info();
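
Jan's alternative, handling NULL before the context assertion so that the caller needs no guard, might look roughly like the sketch below. It is a user-space model only: `sketch_in_irq`, `sketch_xfree()` and `frees_done` are names made up for this illustration, and the real xfree() in xmalloc_tlsf.c is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xen's in_irq(). */
static bool sketch_in_irq = false;

/* Counts the frees that actually reached the allocator. */
static int frees_done = 0;

/* The NULL check comes first, so freeing NULL is a no-op in any
 * context; the IRQ-context assertion only guards real frees. */
static void sketch_xfree(void *p)
{
    if (p == NULL)
        return;

    assert(!sketch_in_irq);
    free(p);
    frees_done++;
}
```

With this ordering a caller can write `sketch_xfree(ssid)` unconditionally, so a later cleanup that drops an `if ( ssid )` guard would not reintroduce the assertion failure.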
On Tue, Mar 27, 2012 at 3:36 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2012-02-14 at 10:44 +0000, Ian Campbell wrote:
>> On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
>> > flight 11946 xen-unstable real [real]
>> > http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
>> >
>> > Regressions :-(
>> >
>> > Tests which did not succeed and are blocking,
>> > including tests which could not be run:
>> >  test-amd64-i386-xl-credit2    7 debian-install    fail REGR. vs. 11944
>>
>> Host crash:
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log
>>
>> This is the debug Andrew Cooper added recently to track down the IRQ
>> assertion we've been seeing, sadly it looks like the debug code tries to
>> call xfree from interrupt context and therefore doesn't produce full
>> output :-(
>
> Are we still seeing the issue this debugging was intended to address? We
> don't seem to be seeing the host crashes any more. Should the debug code
> be patched up as in the following patch, otherwise when we do see it it
> doesn't end up printing any useful info.
>
> Someone recently reported bugs.debian.org/665433 to Debian, is this the
> same underlying issue? That report is with Xen 4.0 FWIW.

I saw the issue (xen-unstable 25256:9dda0efd8ce1) that the debugging code
added. Can the fix to the debugging code be checked in until the original
issue has been fixed?

Thanks, AP (XEN) *** IRQ BUG found *** (XEN) CPU0 -Testing vector 236 from bitmap 41,47,49,57,64,72,80,88,96,100,104,120,136,152,160-161,168,171,192,200-201,208 (XEN) Guest interrupt information: (XEN) IRQ: 0 affinity:01 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound (XEN) Assertion ''!in_irq()'' failed at xmalloc_tlsf.c:607 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82c48012cefb>] xfree+0x33/0x118 (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor (XEN) rax: 0000000000000000 rbx: ffff830214ac0080 rcx: 0000000000000000 (XEN) rdx: ffff82c4802d8880 rsi: 0000000000000083 rdi: 0000000000000000 (XEN) rbp: ffff82c4802b7c78 rsp: ffff82c4802b7c58 r8: 0000000000000004 (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000010 (XEN) r12: ffff830214ac0c80 r13: 000000000000000c r14: ffff830214ac0ca8 (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000426f0 (XEN) cr3: 0000000168971000 cr2: 0000000001095e00 (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen stack trace from rsp=ffff82c4802b7c58: (XEN) ffff830214ac0080 ffff830214ac0c80 000000000000000c ffff830214ac0ca8 (XEN) ffff82c4802b7ce8 ffff82c4801664d4 ffff82c4802e214a ffff82c400000020 (XEN) ffff82c4802b7cf8 0000000000000083 ffff830214ac00a8 0000000000000000 (XEN) 00000000000000ec 00000000000000ec ffff830214ac0c80 000000000000000c (XEN) ffff830214ac0ca8 ffff82c480302760 ffff82c4802b7d58 ffff82c480168000 (XEN) ffff82c4802b7f18 ffff82c4802b7f18 000000ec00000000 ffff82c4802b7f18 (XEN) 0000000000000000 0000000000000000 ffff82c480302324 0000000000000020 (XEN) ffff82c4802b7dd8 0000000000000003 0000000000000000 0000000000000000 (XEN) ffff82c4802b7dc8 ffff82c4801683d3 ffff8300da991000 ffff8300da996000 (XEN) 0000000000000000 ffffffff802b7d90 ffff82c480159160 ffff82c4802b7e20 (XEN) ffff82c48015d7db ffff82c4802b7f18 ffff8300da991000 0000000000000003 (XEN) 0000000000000000 0000000000000000 00007d3b7fd48207 ffff82c480160426 (XEN) 
0000000000000000 0000000000000000 0000000000000003 ffff8300da991000 (XEN) ffff82c4802b7ef8 ffff82c4802b7f18 0000000000000282 ffff82c4802319a0 (XEN) 00000000deadbeef 0000000000000000 ffff83021c0b8081 0000000000000000 (XEN) 0000000000000048 ffff8801d7227ec0 ffff8300da991000 0000002000000000 (XEN) ffff82c4801865c1 000000000000e008 0000000000000202 ffff82c4802b7e88 (XEN) 000000000000e010 0000000000000003 ffff82c4802b7ef8 ffff82c4802230d8 (XEN) ffff82c4802b7f18 0000000000000000 0000000000000246 ffffffff810013aa (XEN) 0000000000000000 ffffffff810013aa 000000000000e030 0000000000000246 (XEN) Xen call trace: (XEN) [<ffff82c48012cefb>] xfree+0x33/0x118 (XEN) [<ffff82c4801664d4>] dump_irqs+0x2a4/0x2e8 (XEN) [<ffff82c480168000>] irq_move_cleanup_interrupt+0x29f/0x2db (XEN) [<ffff82c4801683d3>] do_IRQ+0x9e/0x5a4 (XEN) [<ffff82c480160426>] common_interrupt+0x26/0x30 (XEN) [<ffff82c4801865c1>] async_exception_cleanup+0x1/0x35a (XEN) [<ffff82c480228438>] syscall_enter+0xc8/0x122 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Assertion ''!in_irq()'' failed at xmalloc_tlsf.c:607 (XEN) **************************************** (XEN) (XEN) Reboot in five seconds...
On 04/05/12 20:48, AP wrote:
> (XEN) Xen call trace:
> (XEN)    [<ffff82c48012cefb>] xfree+0x33/0x118
> (XEN)    [<ffff82c4801664d4>] dump_irqs+0x2a4/0x2e8
> (XEN)    [<ffff82c480168000>] irq_move_cleanup_interrupt+0x29f/0x2db
> (XEN)    [<ffff82c4801683d3>] do_IRQ+0x9e/0x5a4
> (XEN)    [<ffff82c480160426>] common_interrupt+0x26/0x30
> (XEN)    [<ffff82c4801865c1>] async_exception_cleanup+0x1/0x35a
> (XEN)    [<ffff82c480228438>] syscall_enter+0xc8/0x122
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...

The attached patch should prevent this panic, allowing for all the debug
information to be printed to the console.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On Fri, May 4, 2012 at 8:11 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The attached patch should prevent this panic, allowing for all the debug
> information to be printed to the console.

Thanks, that fixed it. Here is what I see now:

(XEN) *** IRQ BUG found ***
(XEN) CPU0 -Testing vector 236 from bitmap 37,41,49,51,64,72,80,88,96,104,120,136,145,152,158,160,168,175,182,192,200,211
(XEN) Guest interrupt information:
(XEN)    IRQ:   0 affinity:01 vec:f0 type=IO-APIC-edge    status=00000000 mapped, unbound
(XEN)    IRQ:   1 affinity:01 vec:d3 type=IO-APIC-edge    status=00000030 in-flight=0 domain-list=0:  1(-S--),
(XEN)    IRQ:   2 affinity:ff vec:e2 type=XT-PIC          status=00000000 mapped, unbound
(XEN)    IRQ:   3 affinity:01 vec:40 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   4 affinity:01 vec:48 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   5 affinity:01 vec:50 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   6 affinity:01 vec:58 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   7 affinity:01 vec:60 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:   8 affinity:08 vec:29 type=IO-APIC-edge    status=00000030 in-flight=0 domain-list=0:  8(-S--),
(XEN)    IRQ:   9 affinity:02 vec:25 type=IO-APIC-level   status=00000030 in-flight=0 domain-list=0:  9(-S--),
(XEN)    IRQ:  10 affinity:01 vec:78 type=IO-APIC-edge    status=00000002 mapped, unbound
(XEN)    IRQ:  11 affinity:01 vec:88 type=IO-APIC-edge    status=00000002 mapped, unbound
[ 5129.737147] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 1800652, at 1800652], missed IRQ?

Let me know if you need any more info.

Thanks,
AP
On Fri, 2012-05-04 at 21:11 +0100, Andrew Cooper wrote:
> The attached patch should prevent this panic

This is effectively the same as my patch from
<1332844592.25560.9.camel@zakaz.uk.xensource.com>. I think "if (ssid)
xfree(...)" is preferable to "if (in_irq()) xfree(...)" but not enough
to prevent me:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

If the debug code is going to stay for 4.2 then IMHO we should also take
this patch to make it actually useful. Otherwise we should just revert
the original debug patch before the release.
> Thanks, that fixed it. Here is what I see now:
>
> (XEN) *** IRQ BUG found ***
> (XEN) CPU0 -Testing vector 236 from bitmap
> 37,41,49,51,64,72,80,88,96,104,120,136,145,152,158,160,168,175,182,192,200,211
> (XEN) Guest interrupt information:
> (XEN)    IRQ:   0 affinity:01 vec:f0 type=IO-APIC-edge
> status=00000000 mapped, unbound
> (XEN)    IRQ:   1 affinity:01 vec:d3 type=IO-APIC-edge
> status=00000030 in-flight=0 domain-list=0:  1(-S--),
> (XEN)    IRQ:   2 affinity:ff vec:e2 type=XT-PIC
> status=00000000 mapped, unbound
> (XEN)    IRQ:   3 affinity:01 vec:40 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:   4 affinity:01 vec:48 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:   5 affinity:01 vec:50 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:   6 affinity:01 vec:58 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:   7 affinity:01 vec:60 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:   8 affinity:08 vec:29 type=IO-APIC-edge
> status=00000030 in-flight=0 domain-list=0:  8(-S--),
> (XEN)    IRQ:   9 affinity:02 vec:25 type=IO-APIC-level
> status=00000030 in-flight=0 domain-list=0:  9(-S--),
> (XEN)    IRQ:  10 affinity:01 vec:78 type=IO-APIC-edge
> status=00000002 mapped, unbound
> (XEN)    IRQ:  11 affinity:01 vec:88 type=IO-APIC-edge
> status=00000002 mapped, unbound
> [ 5129.737147] [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer
> elapsed... blt ring idle [waiting on 1800652, at 1800652], missed IRQ?
>
> Let me know if you need any more info.
> Thanks,
> AP
>

There should be quite a lot more irq information dumped than just that.
Was there any more on the console or had it given up by that point? It
might be worth trying to set synchronous console to get all of that
debug information?

How easy is this error to reproduce for you? I never managed to
reproduce it reliably enough to be able to debug?

If you could provide your Xen boot console log, that would be very useful

~Andrew
>> The attached patch should prevent this panic
> This is effectively the same as my patch from
> <1332844592.25560.9.camel@zakaz.uk.xensource.com>. I think "if (ssid)
> xfree(...)" is preferable to "if (in_irq()) xfree(...)" but not enough
> to prevent me:
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> If the debug code is going to stay for 4.2 then IMHO we should also take
> this patch to make it actually useful. Otherwise we should just revert
> the original debug patch before the release.
>
>

Yes - I was thinking the same. I suggest that when xen-4.2-testing.hg
gets branched off unstable, this debugging gets put back to just being
an assert as before. However, I am quite unsure as to what would happen
with interrupts following that failed assert.

I shall re-do the patch. I think it is a fairly sensible patch to have
in even after the main debugging has been removed, especially if similar
debugging needs to be done in the future.

~Andrew
On Sat, May 5, 2012 at 4:04 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> There should be quite a lot more irq information dumped than just that.
> Was there any more on the console or had it given up by that point? It

There was nothing more on the console. The system was hung.

> might be worth trying to set synchronous console to get all of that
> debug information?

I was running with sync_console and console_to_ring options.

> How easy is this error to reproduce for you? I never managed to
> reproduce it reliably enough to be able to debug?

I cannot reproduce it easily either.

> If you could provide your Xen boot console log, that would be very useful

I will send full logs the next time I see the problem.

Thanks,
AP
On Sat, May 5, 2012 at 11:41 AM, AP <apxeng@gmail.com> wrote:
> > If you could provide your Xen boot console log, that would be very useful
>
> I will send full logs the next time I see the problem.

I have attached the full logs. I had a CentOS 5.6 and a Windows 7 HVM
domain running.

Thanks,
AP
>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
> (XEN) *** IRQ BUG found ***
> (XEN) CPU0 -Testing vector 236 from bitmap

236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
in through the 8259A. Something fundamentally fishy must be going
on here, and I would suppose the code in question shouldn't even be
reached for legacy vectors.

Furthermore, calling dump_irqs() from the debugging code with
desc->lock still held makes it impossible to get full output, as that
function wants to lock all initialized IRQ descriptors.

Jan
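The vector arithmetic in Jan's reply can be checked with a short sketch. The base value 0xe0 is inferred from the message itself (0xec - 0x0c); it is an assumption here, not copied from Xen's headers, and the `SKETCH_`/`sketch_` names are hypothetical. The count of 16 legacy IRQs is the usual master/slave 8259A pair.

```c
#include <stdbool.h>

/* Assumed from "236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c"; not taken from Xen source. */
#define SKETCH_FIRST_LEGACY_VECTOR 0xe0u
/* A cascaded pair of 8259A controllers provides 16 IRQ lines. */
#define SKETCH_NR_LEGACY_IRQS      16u

/* True if the vector falls inside the assumed legacy (8259A) vector window. */
static bool sketch_is_legacy_vector(unsigned int vector)
{
    return vector >= SKETCH_FIRST_LEGACY_VECTOR &&
           vector < SKETCH_FIRST_LEGACY_VECTOR + SKETCH_NR_LEGACY_IRQS;
}

/* Returns the legacy IRQ number for a vector, or -1 if it is not a legacy vector. */
static int sketch_legacy_irq(unsigned int vector)
{
    if (!sketch_is_legacy_vector(vector))
        return -1;
    return (int)(vector - SKETCH_FIRST_LEGACY_VECTOR);
}
```

Under that assumption `sketch_legacy_irq(236)` yields 12, matching the "IRQ12" reading, while vec:f0 from the dump (IRQ 0) lies just outside the window.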
On 07/05/2012 09:10, Jan Beulich wrote:
>>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
>> (XEN) *** IRQ BUG found ***
>> (XEN) CPU0 -Testing vector 236 from bitmap
> 236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
> in through the 8259A. Something fundamentally fishy must be going
> on here, and I would suppose the code in question shouldn't even be
> reached for legacy vectors.
>
> Furthermore, calling dump_irqs() from the debugging code with
> desc->lock still held makes it impossible to get full output, as that
> function wants to lock all initialized IRQ descriptors.
>
> Jan

Yes - it has been vector 236 on each of the 3 reported failures from AP,
and I believe it was also vector 236 in the one case I managed to
reproduce the issue.

However, once we have set up the IO-APIC, the 8259A should not be used
any more. The boot dmesg shows that io_ack_method is indeed "old" (which
was going to be my first suggestion), and that EOI Broadcast Suppression
is enabled, which I have already identified as a source of problems for
some customers. As a 'fix', I provided the ability for
"io_ack_method=new" to prevent EOI Broadcast Suppression being enabled.
This was upstreamed in c/s 24870:9bf3ec036bef, but apparently has not
completely fixed the customer problems - just made it substantially more
rare.

AP: Can you manually invoke the 'i' debug key and provide that - it will
help to see how Xen is setting up the IO-APIC(s) on your system.

~Andrew
>>> On 07.05.12 at 13:50, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 07/05/2012 09:10, Jan Beulich wrote:
>>>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
>>> (XEN) *** IRQ BUG found ***
>>> (XEN) CPU0 -Testing vector 236 from bitmap
>> 236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
>> in through the 8259A. Something fundamentally fishy must be going
>> on here, and I would suppose the code in question shouldn't even be
>> reached for legacy vectors.
>>
>> Furthermore, calling dump_irqs() from the debugging code with
>> desc->lock still held makes it impossible to get full output, as that
>> function wants to lock all initialized IRQ descriptors.
>
> Yes - it has been vector 236 on each of the 3 reported failures from AP,
> and I believe it was also vector 236 in the one case I managed to
> reproduce the issue.
>
> However, once we have set up the IO-APIC, the 8259A should not be used
> any more. The boot dmeg shows that io_ack_method is indeed "old" (which
> was going to be my first suggestion), and that EOI Broadcast Suppression
> is enabled, which I have already identified as a source of problems for
> some customers. As a 'fix', I provided the ability for
> "io_ack_method=new" to prevent EOI Broadcast Suppression being enabled.
> This was upstreamed in c/s 24870:9bf3ec036bef, but apparently has not
> completely fixed the customer problems - just made it substantially more
> rare.
>
> AP: Can you manually invoke the 'i' debug key and provide that - it will
> help to see how Xen is setting up the IO-APIC(s) on your system.

Seeing the 'z' output might also be helpful, especially to see whether any of the IO-APICs' RTEs is an ExtINT one.

Further, checking that no 8259A IRQ got (or was left) enabled for some reason might be useful as well (cached_irq_mask plus the raw port 0x21 and 0xA1 values).

In any case the debugging code's locking should be fixed.

Jan
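As an illustration of the 8259A check suggested above, the following sketch shows how the two raw mask bytes could be interpreted once they have been read. It assumes the conventional layout: port 0x21 holds the master PIC's interrupt mask register (IRQ0-7), port 0xA1 holds the slave's (IRQ8-15), a set bit means the line is masked, and the 16-bit cached value keeps the master byte in the low half. The function names and the sample readings are hypothetical, not taken from Xen or from AP's system.

```python
def combine_pic_masks(master_imr, slave_imr):
    """Build a 16-bit mask from the byte read at port 0x21 (master)
    and the byte read at port 0xA1 (slave)."""
    return ((slave_imr & 0xFF) << 8) | (master_imr & 0xFF)


def unmasked_irqs(mask16):
    """List the legacy IRQ lines whose mask bit is clear (i.e. enabled)."""
    return [irq for irq in range(16) if not (mask16 >> irq) & 1]


# Hypothetical readings: everything masked except IRQ2 (the cascade line)
# and IRQ12, which is the line implicated by vector 236.
mask = combine_pic_masks(0xFB, 0xEF)
print(hex(mask), unmasked_irqs(mask))
```

If the value computed from the raw ports disagreed with the cached mask, or if IRQ12 showed up as unmasked, that would support the theory that an 8259A line was left enabled.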
On 07/05/2012 14:34, Jan Beulich wrote:
>>>> On 07.05.12 at 13:50, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 07/05/2012 09:10, Jan Beulich wrote:
>>>>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
>>>> (XEN) *** IRQ BUG found ***
>>>> (XEN) CPU0 -Testing vector 236 from bitmap
>>> 236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
>>> in through the 8259A. Something fundamentally fishy must be going
>>> on here, and I would suppose the code in question shouldn't even be
>>> reached for legacy vectors.
>>>
>>> Furthermore, calling dump_irqs() from the debugging code with
>>> desc->lock still held makes it impossible to get full output, as that
>>> function wants to lock all initialized IRQ descriptors.
>> Yes - it has been vector 236 on each of the 3 reported failures from AP,
>> and I believe it was also vector 236 in the one case I managed to
>> reproduce the issue.
>>
>> However, once we have set up the IO-APIC, the 8259A should not be used
>> any more. The boot dmeg shows that io_ack_method is indeed "old" (which
>> was going to be my first suggestion), and that EOI Broadcast Suppression
>> is enabled, which I have already identified as a source of problems for
>> some customers. As a 'fix', I provided the ability for
>> "io_ack_method=new" to prevent EOI Broadcast Suppression being enabled.
>> This was upstreamed in c/s 24870:9bf3ec036bef, but apparently has not
>> completely fixed the customer problems - just made it substantially more
>> rare.
>>
>> AP: Can you manually invoke the 'i' debug key and provide that - it will
>> help to see how Xen is setting up the IO-APIC(s) on your system.
> Seeing the 'z' output might also be helpful, especially to see whether
> any of the IO-APICs' RTEs is an ExtINT one.
>
> Further, checking that no 8259A IRQ got (or was left) enabled for
> some reason might be useful as well (cached_irq_mask plus the raw
> port 0x21 and 0xA1 values).
>
> In any case the debugging code's locking should be fixed.
>
> Jan
>

It appears we have two functions to dump the IO-APIC state: __print_IO_APIC(), which gets called on boot and from 'z', and dump_ioapic_irq_info(), which gets called from the end of 'i'. These should probably be consolidated somehow.

As for the debugging, perhaps replace the call to dump_irqs() with a call to dump_ioapic_irq_info() instead.

Given that the legacy vectors can't migrate, is it wise to include them in the loop in irq_move_cleanup_interrupt()? In fact, is it wise to include any vector above LAST_DYNAMIC_VECTOR?

~Andrew
>>> On 07.05.12 at 16:41, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Given that the legacy vectors cant migrate, is it wise including them in
> the loop in irq_move_cleanup_interrupt()? In fact, is it wise including
> any vector above LAST_DYNAMIC_VECTOR?

Likely not, but then again this is the final piece of moving an interrupt, so there must have been something earlier that incorrectly initiated a move. In other words, rather than fixing the loop here, we should make sure execution can't even make it there for legacy vectors.

And of course this is irrespective of the fact that no legacy interrupt should occur in the first place, unless this is a very strange system.

Jan
>>> On 07.05.12 at 16:41, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> It appears we have two functions to dump the IO-APIC state:
> __print_IO_APIC() which gets called on boot and from 'z', and
> dump_ioapic_irq_info() which gets called from the end of 'i'. These
> should probably be consolidated somehow.

Rather not - 'z' provides information on the IO-APIC that isn't directly related to specific interrupts, while 'i' (when it comes to the IO-APIC) is exclusively interested in the RTEs. Unless dump_ioapic_irq_info() is _fully_ redundant with 'z' (didn't check in detail yet), in which case I'd vote for removing this function.

Jan
On 07/05/2012 15:50, Jan Beulich wrote:
>>>> On 07.05.12 at 16:41, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Given that the legacy vectors cant migrate, is it wise including them in
>> the loop in irq_move_cleanup_interrupt()? In fact, is it wise including
>> any vector above LAST_DYNAMIC_VECTOR?
> Likely not, but then again this is the final piece of moving an interrupt,
> so there must have been something earlier that incorrectly initiated a
> move. In other words, rather than fixing the loop here, we should
> make sure execution can't even make it there for legacy vectors.
>
> And of course this is irrespective of the fact that no legacy interrupt
> should occur in the first place, unless this is a very strange system.
>
> Jan
>

The only way to get to this point is if desc->arch.move_cleanup_count is non-zero, in which case one of these functions:

    hpet_msi_ack (hpet.c)
    ack_edge_ioapic_irq (io_apic.c)
    mask_and_ack_level_ioapic_irq (io_apic.c)
    ack_nonmaskable_msi_irq (msi.c)
    iommu_msi_mask (iommu_init.c)
    dma_msi_mask (iommu.c)

has called irq_complete_move, after something has called __assign_irq_vector() to move the irq to another CPU.

I would say something very fishy is going on - no desc used by any of those functions should have a vector from the legacy region.

As for the loop, it is probably quite sensible to reduce that down to LAST_DYNAMIC_VECTOR. Leaving it at NR_VECTORS is just 32 wasted iterations of the loop in interrupt context.
>>> On 07.05.12 at 17:40, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> As for the loop, it is probably quite sensible to reduce that down to
> LAST_DYNAMIC_VECTOR. Leaving it at NR_VECTORS is just 32 wasted
> iterations of the loop in interrupt context.

No, you can't stop there. You'd have to skip the legacy vectors, and continue with the ones Xen itself may have in use.

Jan
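The iteration Jan describes can be sketched as follows. This is only an illustration of the range logic, not the body of irq_move_cleanup_interrupt(). The constant values are assumptions: FIRST_LEGACY_VECTOR = 0xe0 follows from "0xec = FIRST_LEGACY_VECTOR + 0x0c" earlier in the thread, LAST_DYNAMIC_VECTOR = 0xdf follows from the "32 wasted iterations" remark with 256 vectors, and FIRST_DYNAMIC_VECTOR = 0x20 and a 16-entry legacy range are guesses. The generator name is hypothetical.

```python
# Assumed vector layout; see the lead-in for how each value was derived.
FIRST_DYNAMIC_VECTOR = 0x20   # assumption
LAST_DYNAMIC_VECTOR = 0xDF    # implied by "32 wasted iterations" of 256
FIRST_LEGACY_VECTOR = 0xE0    # implied by 0xec = FIRST_LEGACY_VECTOR + 0x0c
LAST_LEGACY_VECTOR = 0xEF     # assumption: 16 legacy IRQ lines
NR_VECTORS = 256


def cleanup_candidate_vectors():
    """Yield every vector the cleanup loop would examine: the dynamic
    range, then (skipping the legacy range) the vectors above it."""
    for vector in range(FIRST_DYNAMIC_VECTOR, NR_VECTORS):
        if FIRST_LEGACY_VECTOR <= vector <= LAST_LEGACY_VECTOR:
            continue  # legacy vectors cannot migrate, so never clean them up
        yield vector


candidates = list(cleanup_candidate_vectors())
print(len(candidates), 236 in candidates, 0xF0 in candidates)
```

With these assumed values the loop skips only the 16 legacy vectors rather than the 32 that stopping at LAST_DYNAMIC_VECTOR would drop, so vectors such as 0xf0 that Xen itself may use are still visited.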
On 07/05/2012 15:54, Jan Beulich wrote:
>>>> On 07.05.12 at 16:41, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> It appears we have two functions to dump the IO-APIC state:
>> __print_IO_APIC() which gets called on boot and from 'z', and
>> dump_ioapic_irq_info() which gets called from the end of 'i'. These
>> should probably be consolidated somehow.
> Rather not - 'z' provides information on the IO-APIC that isn't
> directly related to specific interrupts, while 'i' (when it comes to
> the IO-APIC) is exclusively interested in the RTEs. Unless
> dump_ioapic_irq_info() is _fully_ redundant with 'z' (didn't check
> in detail yet), in which case I'd vote for removing this function.
>
> Jan
>

dump_ioapic_irq_info() loops through nr_irqs_gsi and uses irq_2_pin to work out which IO-APIC RTE to read and decode.

__print_IO_APIC() loops through nr_ioapics, then through each RTE, and decodes it. At the end, it loops through nr_irqs_gsi and matches irqs to ioapic:pin pairs.

So they are probably different enough to be worth keeping.

~Andrew
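To make the difference in traversal order concrete, here is a toy sketch of the two approaches described above. It does not reproduce Xen's data structures: the dictionaries and function names are hypothetical stand-ins, and the three sample entries are borrowed from the 'i' and 'z' dumps AP posted in this thread (IRQ0 on pin 2, IRQ1 on pin 1, IRQ12 on pin 12).

```python
# Hypothetical miniature data: (apic, pin) -> vector, and irq -> (apic, pin).
rte_vector = {(0, 1): 0x85, (0, 2): 0xF0, (0, 12): 0xD4}
irq_to_pin = {0: (0, 2), 1: (0, 1), 12: (0, 12)}


def dump_by_irq():
    """IRQ-major walk, in the spirit of the dump_ioapic_irq_info()
    description: for each IRQ, look up its pin, then read that RTE."""
    return [(irq, pin, rte_vector[pin]) for irq, pin in sorted(irq_to_pin.items())]


def dump_by_pin():
    """Pin-major walk, in the spirit of the __print_IO_APIC()
    description: visit every RTE of every IO-APIC in pin order."""
    return [(pin, vector) for pin, vector in sorted(rte_vector.items())]


print(dump_by_irq())
print(dump_by_pin())
```

The IRQ-major walk only ever sees RTEs that some IRQ maps to, while the pin-major walk would also show entries with no IRQ attached, which is one reason the two dumps are not interchangeable.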
On Mon, May 7, 2012 at 1:34 PM, Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 07.05.12 at 13:50, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 07/05/2012 09:10, Jan Beulich wrote:
> >>>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
> >>> (XEN) *** IRQ BUG found ***
> >>> (XEN) CPU0 -Testing vector 236 from bitmap
> >> 236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
> >> in through the 8259A. Something fundamentally fishy must be going
> >> on here, and I would suppose the code in question shouldn't even be
> >> reached for legacy vectors.
> >>
> >> Furthermore, calling dump_irqs() from the debugging code with
> >> desc->lock still held makes it impossible to get full output, as that
> >> function wants to lock all initialized IRQ descriptors.
> >
> > Yes - it has been vector 236 on each of the 3 reported failures from AP,
> > and I believe it was also vector 236 in the one case I managed to
> > reproduce the issue.
> >
> > However, once we have set up the IO-APIC, the 8259A should not be used
> > any more. The boot dmeg shows that io_ack_method is indeed "old" (which
> > was going to be my first suggestion), and that EOI Broadcast Suppression
> > is enabled, which I have already identified as a source of problems for
> > some customers. As a 'fix', I provided the ability for
> > "io_ack_method=new" to prevent EOI Broadcast Suppression being enabled.
> > This was upstreamed in c/s 24870:9bf3ec036bef, but apparently has not
> > completely fixed the customer problems - just made it substantially more
> > rare.
> >
> > AP: Can you manually invoke the 'i' debug key and provide that - it will
> > help to see how Xen is setting up the IO-APIC(s) on your system.

(XEN) Guest interrupt information:
(XEN) IRQ: 0 affinity:01 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound
(XEN) IRQ: 1 affinity:02 vec:85 type=IO-APIC-edge status=00000030 in-flight=0 domain-list=0: 1(----),
(XEN) IRQ: 2 affinity:ff vec:e2 type=XT-PIC status=00000000 mapped, unbound
(XEN) IRQ: 3 affinity:01 vec:40 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 4 affinity:01 vec:48 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 5 affinity:01 vec:50 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 6 affinity:01 vec:58 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 7 affinity:01 vec:60 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 8 affinity:08 vec:29 type=IO-APIC-edge status=00000030 in-flight=0 domain-list=0: 8(----),
(XEN) IRQ: 9 affinity:02 vec:7f type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 9(----),
(XEN) IRQ: 10 affinity:01 vec:78 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 11 affinity:01 vec:88 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 12 affinity:08 vec:d4 type=IO-APIC-edge status=00000030 in-flight=0 domain-list=0: 12(----),
(XEN) IRQ: 13 affinity:0f vec:98 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 14 affinity:01 vec:a0 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 15 affinity:01 vec:a8 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) IRQ: 16 affinity:02 vec:a6 type=IO-APIC-level status=00000030 in-flight=0 domain-list=0: 16(----),
(XEN) IRQ: 17 affinity:0f vec:c0 type=IO-APIC-level status=00000002 mapped, unbound
(XEN) IRQ: 18 affinity:0f vec:c8 type=IO-APIC-level status=00000002 mapped, unbound
(XEN) IRQ: 19 affinity:0f vec:f1 type=IO-APIC-level status=00000000 mapped, unbound
(XEN) IRQ: 20 affinity:0f vec:61 type=IO-APIC-level status=00000002 mapped, unbound
(XEN) IRQ: 22 affinity:0f vec:32 type=IO-APIC-level status=00000002 mapped, unbound
(XEN) IRQ: 23 affinity:01 vec:ac type=IO-APIC-level status=00000030 in-flight=0 domain-list=0: 23(----),
(XEN) IRQ: 24 affinity:01 vec:28 type=DMA_MSI status=00000000 mapped, unbound
(XEN) IRQ: 25 affinity:01 vec:30 type=DMA_MSI status=00000000 mapped, unbound
(XEN) IRQ: 26 affinity:01 vec:31 type=PCI-MSI/-X status=00000030 in-flight=0 domain-list=0:279(----),
(XEN) IRQ: 27 affinity:01 vec:39 type=PCI-MSI/-X status=00000030 in-flight=0 domain-list=0:278(----),
(XEN) IRQ: 28 affinity:01 vec:41 type=PCI-MSI/-X status=00000030 in-flight=0 domain-list=0:277(----),
(XEN) IRQ: 29 affinity:01 vec:49 type=PCI-MSI/-X status=00000030 in-flight=0 domain-list=0:276(----),
(XEN) IRQ: 30 affinity:01 vec:51 type=PCI-MSI/-X status=00000030 in-flight=0 domain-list=0:275(----),
(XEN) IRQ: 31 affinity:04 vec:d7 type=PCI-MSI status=00000030 in-flight=0 domain-list=0:274(----),
(XEN) IRQ: 32 affinity:04 vec:df type=PCI-MSI status=00000030 in-flight=0 domain-list=0:273(----),
(XEN) IRQ: 33 affinity:02 vec:b0 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:272(----),
(XEN) IRQ: 34 affinity:02 vec:a8 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:271(----),
(XEN) IRQ: 35 affinity:04 vec:ad type=PCI-MSI status=00000030 in-flight=0 domain-list=0:270(----),
(XEN) IO-APIC interrupt information:
(XEN) IRQ 0 Vec240:
(XEN) Apic 0x00, Pin 2: vec=f0 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 1 Vec133:
(XEN) Apic 0x00, Pin 1: vec=85 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 3 Vec 64:
(XEN) Apic 0x00, Pin 3: vec=40 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 4 Vec 72:
(XEN) Apic 0x00, Pin 4: vec=48 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 5 Vec 80:
(XEN) Apic 0x00, Pin 5: vec=50 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 6 Vec 88:
(XEN) Apic 0x00, Pin 6: vec=58 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 7 Vec 96:
(XEN) Apic 0x00, Pin 7: vec=60 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 8 Vec 41:
(XEN) Apic 0x00, Pin 8: vec=29 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 9 Vec127:
(XEN) Apic 0x00, Pin 9: vec=7f delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=L mask=0 dest_id:0
(XEN) IRQ 10 Vec120:
(XEN) Apic 0x00, Pin 10: vec=78 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 11 Vec136:
(XEN) Apic 0x00, Pin 11: vec=88 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 12 Vec212:
(XEN) Apic 0x00, Pin 12: vec=d4 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 13 Vec152:
(XEN) Apic 0x00, Pin 13: vec=98 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=1 dest_id:0
(XEN) IRQ 14 Vec160:
(XEN) Apic 0x00, Pin 14: vec=a0 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 15 Vec168:
(XEN) Apic 0x00, Pin 15: vec=a8 delivery=LoPri dest=L status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) IRQ 16 Vec166:
(XEN) Apic 0x00, Pin 16: vec=a6 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0
(XEN) IRQ 17 Vec192:
(XEN) Apic 0x00, Pin 17: vec=c0 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0
(XEN) IRQ 18 Vec200:
(XEN) Apic 0x00, Pin 18: vec=c8 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0
(XEN) IRQ 19 Vec241:
(XEN) Apic 0x00, Pin 19: vec=f1 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0
(XEN) IRQ 20 Vec 97:
(XEN) Apic 0x00, Pin 20: vec=61 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0
(XEN) IRQ 22 Vec 50:
(XEN) Apic 0x00, Pin 22: vec=32 delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0
(XEN) IRQ 23 Vec172:
(XEN) Apic 0x00, Pin 23: vec=ac delivery=LoPri dest=L status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0

> Seeing the 'z' output might also be helpful, especially to see whether
> any of the IO-APICs' RTEs is an ExtINT one.

(XEN) number of MP IRQ sources: 15.
(XEN) number of IO-APIC #2 registers: 24.
(XEN) testing the IO APIC.......................
(XEN) IO APIC #2......
(XEN) .... register #00: 02000000
(XEN) ....... : physical APIC id: 02
(XEN) ....... : Delivery Type: 0
(XEN) ....... : LTS : 0
(XEN) .... register #01: 00170020
(XEN) ....... : max redirection entries: 0017
(XEN) ....... : PRQ implemented: 0
(XEN) ....... : IO APIC version: 0020
(XEN) .... IRQ redirection table:
(XEN) NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
(XEN) 00 000 00 1 0 0 0 0 0 0 00
(XEN) 01 000 00 0 0 0 0 0 1 1 85
(XEN) 02 000 00 0 0 0 0 0 1 1 F0
(XEN) 03 000 00 0 0 0 0 0 1 1 40
(XEN) 04 000 00 0 0 0 0 0 1 1 48
(XEN) 05 000 00 0 0 0 0 0 1 1 50
(XEN) 06 000 00 0 0 0 0 0 1 1 58
(XEN) 07 000 00 0 0 0 0 0 1 1 60
(XEN) 08 000 00 0 0 0 0 0 1 1 29
(XEN) 09 000 00 0 1 0 0 0 1 1 A7
(XEN) 0a 000 00 0 0 0 0 0 1 1 78
(XEN) 0b 000 00 0 0 0 0 0 1 1 88
(XEN) 0c 000 00 0 0 0 0 0 1 1 D4
(XEN) 0d 000 00 1 0 0 0 0 1 1 98
(XEN) 0e 000 00 0 0 0 0 0 1 1 A0
(XEN) 0f 000 00 0 0 0 0 0 1 1 A8
(XEN) 10 000 00 0 1 0 1 0 1 1 AE
(XEN) 11 000 00 1 1 0 1 0 1 1 C0
(XEN) 12 000 00 1 1 0 1 0 1 1 C8
(XEN) 13 000 00 0 1 0 1 0 1 1 F1
(XEN) 14 000 00 1 1 0 1 0 1 1 61
(XEN) 15 0CA 0A 1 0 0 0 0 1 2 71
(XEN) 16 000 00 1 1 0 1 0 1 1 32
(XEN) 17 000 00 0 1 0 1 0 1 1 AC
(XEN) Using vector-based indexing
(XEN) IRQ to pin mappings:
(XEN) IRQ240 -> 0:2
(XEN) IRQ133 -> 0:1
(XEN) IRQ64 -> 0:3
(XEN) IRQ72 -> 0:4
(XEN) IRQ80 -> 0:5
(XEN) IRQ88 -> 0:6
(XEN) IRQ96 -> 0:7
(XEN) IRQ41 -> 0:8
(XEN) IRQ167 -> 0:9
(XEN) IRQ120 -> 0:10
(XEN) IRQ136 -> 0:11
(XEN) IRQ212 -> 0:12
(XEN) IRQ152 -> 0:13
(XEN) IRQ160 -> 0:14
(XEN) IRQ168 -> 0:15
(XEN) IRQ174 -> 0:16
(XEN) IRQ192 -> 0:17
(XEN) IRQ200 -> 0:18
(XEN) IRQ241 -> 0:19
(XEN) IRQ97 -> 0:20
(XEN) IRQ50 -> 0:22
(XEN) IRQ172 -> 0:23
(XEN) .................................... done.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
>>> On 07.05.12 at 20:29, AP <apxeng@gmail.com> wrote:
> On Mon, May 7, 2012 at 1:34 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> Seeing the 'z' output might also be helpful, especially to see whether
>> any of the IO-APICs' RTEs is an ExtINT one.
>
> (XEN) number of MP IRQ sources: 15.
> (XEN) number of IO-APIC #2 registers: 24.
> (XEN) testing the IO APIC.......................
> (XEN) IO APIC #2......
> (XEN) .... register #00: 02000000
> (XEN) ....... : physical APIC id: 02
> (XEN) ....... : Delivery Type: 0
> (XEN) ....... : LTS : 0
> (XEN) .... register #01: 00170020
> (XEN) ....... : max redirection entries: 0017
> (XEN) ....... : PRQ implemented: 0
> (XEN) ....... : IO APIC version: 0020
> (XEN) .... IRQ redirection table:
> (XEN) NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
> (XEN) 00 000 00 1 0 0 0 0 0 0 00
> (XEN) 01 000 00 0 0 0 0 0 1 1 85
> (XEN) 02 000 00 0 0 0 0 0 1 1 F0
> (XEN) 03 000 00 0 0 0 0 0 1 1 40
> (XEN) 04 000 00 0 0 0 0 0 1 1 48
> (XEN) 05 000 00 0 0 0 0 0 1 1 50
> (XEN) 06 000 00 0 0 0 0 0 1 1 58
> (XEN) 07 000 00 0 0 0 0 0 1 1 60
> (XEN) 08 000 00 0 0 0 0 0 1 1 29
> (XEN) 09 000 00 0 1 0 0 0 1 1 A7
> (XEN) 0a 000 00 0 0 0 0 0 1 1 78
> (XEN) 0b 000 00 0 0 0 0 0 1 1 88
> (XEN) 0c 000 00 0 0 0 0 0 1 1 D4
> (XEN) 0d 000 00 1 0 0 0 0 1 1 98
> (XEN) 0e 000 00 0 0 0 0 0 1 1 A0
> (XEN) 0f 000 00 0 0 0 0 0 1 1 A8
> (XEN) 10 000 00 0 1 0 1 0 1 1 AE
> (XEN) 11 000 00 1 1 0 1 0 1 1 C0
> (XEN) 12 000 00 1 1 0 1 0 1 1 C8
> (XEN) 13 000 00 0 1 0 1 0 1 1 F1
> (XEN) 14 000 00 1 1 0 1 0 1 1 61
> (XEN) 15 0CA 0A 1 0 0 0 0 1 2 71

This entry is definitely bogus (delivery mode is SMI, which is not allowed in an IO-APIC RTE), but as it is masked it _shouldn't_ cause any harm.

> (XEN) 16 000 00 1 1 0 1 0 1 1 32
> (XEN) 17 000 00 0 1 0 1 0 1 1 AC

So we'll need to see the PIC (8259A) masks too. IRQ12 definitely appears to get touched a lot (judging by the vector it uses), so while this shouldn't be the case I would nevertheless consider the possibility of a window where the 8259A interrupt gets temporarily unmasked.

Jan
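For anyone wanting to check the redirection table rows mechanically, the sketch below parses one row of the 'z' output using the column header printed in the dump (NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect). The delivery-mode names follow the standard IO-APIC encoding (0 Fixed, 1 Lowest Priority, 2 SMI, 4 NMI, 5 INIT, 7 ExtINT). Treating NR and Vect as hexadecimal is an assumption based on rows such as "0a" and "F0"; the function name is hypothetical.

```python
# Standard IO-APIC delivery mode encoding; 3 and 6 are reserved.
DELIVERY_MODES = {0: "Fixed", 1: "LowestPriority", 2: "SMI",
                  4: "NMI", 5: "INIT", 7: "ExtINT"}
# Column names exactly as printed in the 'z' dump header.
COLUMNS = "NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect".split()


def parse_rte_row(line):
    """Parse one '(XEN) NR Log Phy ...' row into a small dict.
    Raises ValueError if the row does not have the expected 11 fields."""
    fields = line.replace("(XEN)", "").split()
    if len(fields) != len(COLUMNS):
        raise ValueError("unexpected number of fields: %r" % line)
    row = dict(zip(COLUMNS, fields))
    return {
        "pin": int(row["NR"], 16),        # assumed hexadecimal
        "masked": row["Mask"] == "1",
        "delivery": DELIVERY_MODES.get(int(row["Deli"], 16), "Reserved"),
        "vector": int(row["Vect"], 16),   # assumed hexadecimal
    }


# The row Jan calls bogus, and the IRQ12 row for comparison.
print(parse_rte_row("(XEN) 15 0CA 0A 1 0 0 0 0 1 2 71"))
print(parse_rte_row("(XEN) 0c 000 00 0 0 0 0 0 1 1 D4"))
```

Run over the whole table, a check like this would flag pin 0x15 as the only SMI entry and would confirm that none of the listed RTEs uses ExtINT delivery, which was the original reason for asking for the 'z' output.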