Running xen-unstable changeset 12734 on an ES7000 with 4 dual-core sockets (8 CPUs) and 16GB memory, with kernel parameters dom0_mem=512M and xenheap_megabytes=64. DOMUs are paravirtualized sles10, 96MB, 4 vcpus, each on a separate physical lun.

We successfully start 118 DOMUs, but when we try to start the 119th, the system panics with the following messages:

Kernel panic - not syncing: No available IRQ to bind to: increase NR_IRQS!
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

The documentation in include/asm-x86_64/irq.h suggests that the value of NR_IRQS under x86_64 is limited to 256. In fact, when we rebuilt xen-unstable with NR_IRQS set to 768, the kernel panics on boot (see below).

On the hunch that networking in a VM uses up an IRQ, we eliminated the 'vif' statement in each DOMU config file, and we were able to start 164 VMs in the 16GB before we exhausted memory.

Has anyone run into this IRQ issue? Is there any work-around?

brian carb
unisys corporation - malvern, pa
brian.carb@unisys.com

--- trace output when trying to boot xen where NR_IRQS was set to 768 ---
(XEN) Initializing CPU#0
(XEN) Detected 3400.113 MHz processor.
(XEN) extable.c:77: Pre-exception: ffff8300001713fc -> 0000000000000000
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e010:[<ffff8300001713fc>] get_cpu_vendor+0x2c/0x90
(XEN) RFLAGS: 0000000000010006 CONTEXT: hypervisor
(XEN) rax: 00008e00e0100000 rbx: ffff830000206840 rcx: 000000006c65746e
(XEN) rdx: 0000000049656e69 rsi: 0000000000000000 rdi: ffff830000187980
(XEN) rbp: 0000000000000000 rsp: ffff8300001bfe18 r8: 0000000000410000
(XEN) r9: 000000000000003a r10: 00000000000000ff r11: 0000000000000000
(XEN) r12: ffff830000206840 r13: ffff8300001879a8 r14: ffff830000187980
(XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000000b0
(XEN) cr3: 0000000000102000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e010
(XEN) Xen stack trace from rsp=ffff8300001bfe18:
(XEN) 0000000000001180 00000000756e6547 0000000000000000 ffff830000187980
(XEN) ffff83000019b000 00000000003f7f0d ffff830000020980 ffff83000017168a
(XEN) 0000000000000000 ffff830000165a4c ffff830003ffc080 ffff830000187980
(XEN) 0000000000003ce5 ffff830000171948 ffff83000019b000 ffff830003ffc080
(XEN) 000000000000000a ffff83000016de41 0000000000000000 ffff83000002bbc0
(XEN) 0000000800000000 000000010000006e 0000000000000003 00000000000002f8
(XEN) 0000000000000000 000000000000000a 0000000000000000 0000000000000000
(XEN) 0000000000067f0c 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffff8300001001c1 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) ffff83000028c080
(XEN) Xen call trace:
(XEN) [<ffff8300001713fc>] get_cpu_vendor+0x2c/0x90
(XEN) [<ffff83000017168a>] generic_identify+0x4a/0x160
(XEN) [<ffff830000165a4c>] subarch_init_memory+0xac/0xe0
(XEN) [<ffff830000171948>] identify_cpu+0x78/0x1f0
(XEN) [<ffff83000016de41>] __start_xen+0x861/0xc90
(XEN) [<ffff8300001001c1>] __high_start+0x94/0x96
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 13 (general protection fault)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) machine_crash_shutdown: 0
(XEN) extable.c:77: Pre-exception: ffff830000139e9c -> 0000000000000000
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e010:[<ffff830000139e9c>] machine_crash_shutdown+0x7c/0xf0
(XEN) RFLAGS: 0000000000010047 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 00000000000003e8 rcx: 0000ffff0000ffff
(XEN) rdx: 0000000000000001 rsi: 0000000000000400 rdi: 0000000000000000
(XEN) rbp: 0000000000000046 rsp: ffff8300001bfbc8 r8: ffff8300000b8000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: ffff83000011c7a0
(XEN) r12: ffff830000206840 r13: ffff8300001879a8 r14: ffff830000187980
(XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000000b0
(XEN) cr3: 0000000000102000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e010
(XEN) Xen stack trace from rsp=ffff8300001bfbc8:
(XEN) 0000000000000000 0000000000000046 ffff830000206840 ffff83000010e18f
(XEN) ffff830000206840 ffff83000011c24c 5254204c41544146 74636576203a5041
(XEN) 203331203d20726f 6c6172656e656728 746365746f727020 6c756166206e6f69
(XEN) 6f7272655b0a2974 303d65646f635f72 49202c205d303030 525245544e49204e
(XEN) 544e4f4320545055 000000000a545845 0000003000000008 ffff8300001bfd48
(XEN) ffff8300001bfc78 ffff8300001d7321 0000003000000028 ffff8300001bfd68
(XEN) ffff8300001bfc98 ffff8300000b8000 0000000000000002 000000000000000d
(XEN) ffff83000017c2aa 0000000000000000 ffff83000017c162 0000000000000000
(XEN) 0000000000000000 ffff8300001d7356 000000008005003b ffff83000011cf7b
(XEN) 0000000000000000 0000000000102000 0000000000000096 0000000000000094
(XEN) ffff8300001001c1 ffff8300001001c1 ffff8300001bff20 ffff8300001330a4
(XEN) ffff830000206840 000000000000000d ffff8300001bfd68 ffff8300001334d7
(XEN) ffff830000206840 ffff830000206840 0000000000000000 ffff8300001656c2
(XEN) 0000000000000000 ffff830000187980 ffff8300001879a8 ffff830000206840
(XEN) 0000000000000000 ffff830000206840 0000000000000000 00000000000000ff
(XEN) 000000000000003a 0000000000410000 00008e00e0100000 000000006c65746e
(XEN) 0000000049656e69 0000000000000000 ffff830000187980 0000000d00000000
(XEN) ffff8300001713fc 000000000000e010 0000000000010006 ffff8300001bfe18
(XEN) 0000000000000000 0000000000000001 0000000000001180 00000000756e6547
(XEN) 0000000000000000 ffff830000187980 ffff83000019b000 00000000003f7f0d
(XEN) Xen call trace:
(XEN) [<ffff830000139e9c>] machine_crash_shutdown+0x7c/0xf0
(XEN) [<ffff83000010e18f>] machine_crash_kexec+0x2f/0x90
(XEN) [<ffff83000011c24c>] panic+0x15c/0x1b0
(XEN) [<ffff83000011cf7b>] __serial_putc+0xdb/0x100
(XEN) [<ffff8300001001c1>] __high_start+0x94/0x96
(XEN) [<ffff8300001001c1>] __high_start+0x94/0x96
(XEN) [<ffff8300001330a4>] show_trace+0x54/0xa0
(XEN) [<ffff8300001334d7>] fatal_trap+0x77/0xb0
(XEN) [<ffff8300001656c2>] FATAL_exception_with_ints_disabled+0xc/0x1a
(XEN) [<ffff8300001713fc>] get_cpu_vendor+0x2c/0x90
(XEN) [<ffff83000017168a>] generic_identify+0x4a/0x160
(XEN) [<ffff830000165a4c>] subarch_init_memory+0xac/0xe0
(XEN) [<ffff830000171948>] identify_cpu+0x78/0x1f0
(XEN) [<ffff83000016de41>] __start_xen+0x861/0xc90
(XEN) [<ffff8300001001c1>] __high_start+0x94/0x96
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 13 (general protection fault)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 7/12/06 15:37, "Carb, Brian A" <Brian.Carb@unisys.com> wrote:

> We successfully start 118 DOMUs, but when we try to start the 119th, the
> system panics with the following messages:
> Kernel panic - not syncing: No available IRQ to bind to: increase NR_IRQS!
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>
> The documentation in include/asm-x86_64/irq.h suggests that the value of
> NR_IRQS under x86_64 is limited to 256. In fact, when we rebuilt xen-unstable
> with NR_IRQS set to 768, the kernel panics on boot (see below).

It's not Xen's NR_IRQS you should increase; only Linux's.

This out-of-IRQs condition shouldn't crash the dom0 of course. I'll look into that.

 -- Keir
Keir,

We changed NR_IRQS in xen-unstable.hg/linux-2.6.16.33-xen/include/asm-x86_64/irq.h - it's set to 224 by default.

brian carb
unisys corporation - malvern, pa
brian.carb@unisys.com

________________________________
From: Keir Fraser [mailto:keir@xensource.com]
Sent: Thursday, December 07, 2006 10:51 AM
To: Carb, Brian A; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Maximum number of domains and NR_IRQS

On 7/12/06 15:37, "Carb, Brian A" <Brian.Carb@unisys.com> wrote:

> We successfully start 118 DOMUs, but when we try to start the 119th, the
> system panics with the following messages:
> Kernel panic - not syncing: No available IRQ to bind to: increase NR_IRQS!
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>
> The documentation in include/asm-x86_64/irq.h suggests that the value of
> NR_IRQS under x86_64 is limited to 256. In fact, when we rebuilt xen-unstable
> with NR_IRQS set to 768, the kernel panics on boot (see below).

It's not Xen's NR_IRQS you should increase; only Linux's.

This out-of-IRQs condition shouldn't crash the dom0 of course. I'll look into that.

 -- Keir
On Thu, 2006-12-07 at 11:08 -0500, Carb, Brian A wrote:

> Keir,
>
> We changed NR_IRQS in
> xen-unstable.hg/linux-2.6.16.33-xen/include/asm-x86_64/irq.h - it's
> set to 224 by default.

That file is shadowed by include/asm-x86_64/mach-xen/asm/irq.h when building for Xen, so you need to change that instead.

Ian.
On 7/12/06 16:14, "Ian Campbell" <Ian.Campbell@XenSource.com> wrote:

>> We changed NR_IRQS in
>> xen-unstable.hg/linux-2.6.16.33-xen/include/asm-x86_64/irq.h - it's
>> set to 224 by default.
>
> That file is shadowed by include/asm-x86_64/mach-xen/asm/irq.h when
> building for Xen so you need to change that instead.

Apart from that, your crash appeared to be happening during boot of Xen itself. So something must have changed there.

 -- Keir
Puthiyaparambil, Aravindh
2006-Dec-07 16:42 UTC
RE: [Xen-devel] Maximum number of domains and NR_IRQS
> That file is shadowed by include/asm-x86_64/mach-xen/asm/irq.h when
> building for Xen so you need to change that instead.

Ian,

You are right. We have been changing the wrong file. I think the file in question is linux-2.6.16.33-xen/include/asm-x86_64/mach-xen/irq_vectors.h. I guess we can bump up NR_DYNIRQS to 512 or 768?

Aravindh
On 7/12/06 16:42, "Puthiyaparambil, Aravindh" <aravindh.puthiyaparambil@unisys.com> wrote:

>> That file is shadowed by include/asm-x86_64/mach-xen/asm/irq.h when
>> building for Xen so you need to change that instead.
>
> Ian,
>
> You are right. We have been changing the wrong file. I think the file in
> question is
> linux-2.6.16.33-xen/include/asm-x86_64/mach-xen/irq_vectors.h. I guess
> we can bump up NR_DYNIRQS to 512 or 768?

Exactly.

 -- Keir
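To make the exchange above concrete, here is a minimal sketch of how the size of the dynamic IRQ pool might relate to the number of guests dom0 can serve. NR_DYNIRQS is the constant named in the thread; NR_PIRQS, the default values of 256, the cost of two dynamic IRQs per guest (one block backend, one network backend), and the dom0 overhead are all assumptions for illustration, so check irq_vectors.h in your own tree.

```c
/* Assumed defaults, not confirmed in this thread. */
#define NR_PIRQS   256                      /* IRQs reserved for physical devices (assumption) */
#define NR_DYNIRQS 256                      /* IRQs available for event channels (assumption) */
#define NR_IRQS    (NR_PIRQS + NR_DYNIRQS)  /* total IRQ space seen by the Linux kernel */

/*
 * Estimate how many guests dom0 can serve before the dynamic pool is empty.
 *   dyn_irqs:     size of the dynamic IRQ pool (NR_DYNIRQS)
 *   dom0_own:     dynamic IRQs dom0 uses for itself (hypothetical figure)
 *   irqs_per_dom: backend event channels bound per guest (hypothetical figure)
 * Returns 0 when the inputs leave no room for any guest.
 */
int max_guests(int dyn_irqs, int dom0_own, int irqs_per_dom)
{
    if (irqs_per_dom <= 0 || dyn_irqs <= dom0_own)
        return 0;
    return (dyn_irqs - dom0_own) / irqs_per_dom;
}
```

With a dom0 overhead of 20, max_guests(NR_DYNIRQS, 20, 2) evaluates to 118 and max_guests(768, 20, 2) to 374. The overhead of 20 is picked to fit the reported 118 guests, not measured, so treat the model as a plausibility check only. It is at least consistent with the report that dropping the vif (one IRQ per guest) allowed 164 guests before memory ran out.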
Thanks Keir - We changed NR_DYNIRQS to 768 and we were successfully able to bring up 200 VMs.

Can we submit a patch to increase NR_DYNIRQS to 1024?

brian carb - unisys corporation - malvern, pa

-----Original Message-----
From: Keir Fraser [mailto:keir@xensource.com]
Sent: Thursday, December 07, 2006 11:48 AM
To: Puthiyaparambil, Aravindh; Ian Campbell; Carb, Brian A
Cc: xen-devel@lists.xensource.com; Keir Fraser
Subject: Re: [Xen-devel] Maximum number of domains and NR_IRQS

On 7/12/06 16:42, "Puthiyaparambil, Aravindh" <aravindh.puthiyaparambil@unisys.com> wrote:

>> That file is shadowed by include/asm-x86_64/mach-xen/asm/irq.h when
>> building for Xen so you need to change that instead.
>
> Ian,
>
> You are right. We have been changing the wrong file. I think the file
> in question is
> linux-2.6.16.33-xen/include/asm-x86_64/mach-xen/irq_vectors.h. I guess
> we can bump up NR_DYNIRQS to 512 or 768?

Exactly.

 -- Keir
On 7/12/06 8:55 pm, "Carb, Brian A" <Brian.Carb@unisys.com> wrote:

> Thanks Keir - We changed NR_DYNIRQS to 768 and we were successfully able
> to bring up 200 VMs.
> Can we submit a patch to increase NR_DYNIRQS to 1024?
>
> brian carb - unisys corporation - malvern, pa

We'd take a CONFIG_NR_DYNIRQS patch (i.e., integrate with Kconfig). We'd also love a patch that would make the IRQ space dynamically growable (although I'm not sure how much impact this might have on non-xen-specific irq code). Also potentially useful and easier to implement would be to make NR_DYNIRQS selectable at boot time (i.e., via a boot parameter, or bump it when the guest realises it is domain 0).

But we don't want to change the default for everyone all the time -- the current value is good for just about everyone, and growing it just wastes space for most people.

 -- Keir
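As a rough illustration of the boot-time option suggested above, the following user-space sketch parses a limit, keeps the existing default when the value is missing or malformed, and clamps it to an upper bound. The parameter name "nr_dynirqs", the default of 256, and the bound of 1024 are hypothetical; none of them is an existing kernel option. Inside the kernel such a parser would presumably be registered with `__setup()`, and any per-IRQ tables sized at compile time would have to be allocated at boot instead.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_NR_DYNIRQS 256   /* assumed current default */
#define MAX_NR_DYNIRQS     1024  /* hypothetical upper bound */

/*
 * Parse the value of a hypothetical "nr_dynirqs=<n>" boot parameter.
 * Missing, non-numeric, or too-small values fall back to the default,
 * so systems that never pass the parameter behave exactly as before.
 */
unsigned int parse_nr_dynirqs(const char *arg)
{
    char *end;
    unsigned long val;

    /* Reject NULL, empty strings, signs, and leading whitespace. */
    if (arg == NULL || !isdigit((unsigned char)arg[0]))
        return DEFAULT_NR_DYNIRQS;

    errno = 0;
    val = strtoul(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return DEFAULT_NR_DYNIRQS;

    if (val < DEFAULT_NR_DYNIRQS)
        return DEFAULT_NR_DYNIRQS;   /* never shrink below the default */
    if (val > MAX_NR_DYNIRQS)
        return MAX_NR_DYNIRQS;       /* cap the space spent on IRQ tables */
    return (unsigned int)val;
}
```

Falling back to the default keeps the cost at zero for the common case, which is the concern raised in the message above; only hosts that ask for a larger pool pay for it.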
Keir,

We pulled and rebuilt a new xen-unstable (changeset 12895) which includes your patch 12790 for the out-of-IRQs condition. Now we are able to start 121 DOMUs. When we start the 122nd, dom0 does not crash; instead, the 'xm create' fails with the error:

Error: Device 0 (vif) could not be connected. Hotplug scripts not working.

and in the serial console we see the following error:

Unable to handle kernel paging request at 0000000380435c50 RIP:
<ffffffff8023f634>{unbind_from_irq+35}
PGD 15c3c067 PUD 0
Oops: 0000 [2] SMP
CPU 7
Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge dm_round_robin dm_emc ipv6 nfs lockd nfs_acl sunrpc dm_multipath button battery ac dm_mod e1000 ext3 jbd reiserfs fan thermal processor sg lpfc scsi_transport_fc mptsas mptscsih mptbase scsi_transport_sas piix sd_mod scsi_mod
Pid: 36, comm: xenwatch Tainted: GF 2.6.16.33-xen #1
RIP: e030:[<ffffffff8023f634>] <ffffffff8023f634>{unbind_from_irq+35}
RSP: e02b:ffff880000b09d48 EFLAGS: 00010246
RAX: 00000000ffffffe4 RBX: 000000001f4958c8 RCX: ffffffff80310d06
RDX: 0000000000000000 RSI: ffffffff80248b4b RDI: ffffffff8036a0c0
RBP: 00000000ffffffe4 R08: ffff880016e6a368 R09: ffff88001cf2ebc0
R10: 0000000000000007 R11: 0000000000000020 R12: ffffffffffffffe4
R13: ffff880016e6a368 R14: ffffffff80310d06 R15: 0000000000000000
FS: 00002b149f6efc90(0000) GS:ffffffff803ae380(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process xenwatch (pid: 36, threadinfo ffff880000b08000, task ffff880000b40820)
Stack: 0000000000000008 ffff880016e6a368 0000000000000000 00000000fffffff4
 ffffffff80310d06 00000000ffffffe4 00000000000001fc 00000000000001fc
 00000000ffffffe4 00000000ffffffea
Call Trace:
 <ffffffff8023fa80>{bind_evtchn_to_irqhandler+144}
 <ffffffff80248b4b>{blkif_be_int+0}
 <ffffffff801431ee>{keventd_create_kthread+0}
 <ffffffff8024a3ac>{blkif_map+425}
 <ffffffff80249b4d>{frontend_changed+207}
 <ffffffff80245a66>{xenwatch_thread+0}
 <ffffffff8024507a>{xenwatch_handle_callback+21}
 <ffffffff80245ba7>{xenwatch_thread+321}
 <ffffffff801431ee>{keventd_create_kthread+0}
 <ffffffff801435f5>{autoremove_wake_function+0}
 <ffffffff801431ee>{keventd_create_kthread+0}
 <ffffffff80245a66>{xenwatch_thread+0}
 <ffffffff801434bb>{kthread+212}
 <ffffffff8010bdee>{child_rip+8}
 <ffffffff801431ee>{keventd_create_kthread+0}
 <ffffffff801433e7>{kthread+0}
 <ffffffff8010bde6>{child_rip+0}

Code: 8b 14 85 c0 5c 43 80 ff ca 85 d2 89 14 85 c0 5c 43 80 0f 85
RIP <ffffffff8023f634>{unbind_from_irq+35} RSP <ffff880000b09d48>
CR2: 0000000380435c50

brian carb
unisys corporation - malvern, pa
brian.carb@unisys.com
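One possible reading of the oops above, not confirmed anywhere in this thread: RAX and RBP hold 0xffffffe4 and R12 holds 0xffffffffffffffe4, which as a signed value is -28 (ENOSPC on Linux), and the fault is in unbind_from_irq reached from bind_evtchn_to_irqhandler. That would fit a negative error code from IRQ allocation being passed on and used as an array index. The sketch below shows the defensive pattern under that assumption; every function name ending in _sketch and the table size are hypothetical stand-ins, not the actual Xen Linux code.

```c
#include <errno.h>

#define NR_IRQS 512                 /* assumed size of the IRQ space */

static int irq_bindcount[NR_IRQS];  /* stand-in for per-IRQ binding counters */

/* Hypothetical allocator: returns a free IRQ number, or -ENOSPC when full. */
int find_unbound_irq_sketch(void)
{
    for (int irq = 0; irq < NR_IRQS; irq++)
        if (irq_bindcount[irq] == 0)
            return irq;
    return -ENOSPC;
}

/* Bind step that propagates the error instead of treating it as an IRQ. */
int bind_irq_sketch(void)
{
    int irq = find_unbound_irq_sketch();

    if (irq < 0)
        return irq;   /* out of IRQs: report to the caller, touch nothing */

    irq_bindcount[irq]++;
    return irq;
}

/* Unbind step that refuses any value that does not index the table. */
int unbind_irq_sketch(int irq)
{
    if (irq < 0 || irq >= NR_IRQS || irq_bindcount[irq] == 0)
        return -EINVAL;

    irq_bindcount[irq]--;
    return 0;
}
```

The point of the sketch is only that both ends check the range: the binder returns early on a negative value, and the unbinder rejects it even if a caller forgets to.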