Christopher S. Aker
2008-Aug-27 02:37 UTC
[Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
This past weekend we rebooted two machines into 3.3.0. Just moments ago, within minutes of each other, they both crashed with the following:

(XEN) Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
(XEN) ----[ Xen-3.3.0 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff828c80120839>] _xmalloc+0x35/0x16d
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: 0000000000000180 rbx: ffff8300cfde8300 rcx: ffff828c8022b180
(XEN) rdx: ffff828c8028e900 rsi: 0000000000000008 rdi: 0000000000000800
(XEN) rbp: ffff8300ceef7d98 rsp: ffff8300ceef7d78 r8: ffff828c8015a891
(XEN) r9: 0000000000000000 r10: 00000000deadbeef r11: 0000000000000000
(XEN) r12: ffff828c80231598 r13: 000000000000000f r14: ffff828c80231188
(XEN) r15: 000000000000000f cr0: 000000008005003b cr4: 00000000000026b0
(XEN) cr3: 000000056f8b0000 cr2: 00000000c975c7f0
(XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008
(XEN) r15: 000000000000000f cr0: 000000008005003b cr4: 00000000000026b0
(XEN) cr3: 000000056f8b0000 cr2: 00000000c975c7f0
(XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff8300ceef7d78:
(XEN) ffff828c8014993d ffff8300cfde8300 ffff828c80231598 000000000000000f
(XEN) ffff8300ceef7dd8 ffff828c8011d219 000000ff00000010 ffff828c80231598
(XEN) 0000000000000003 0000018a2f73e9e4 0000000000000286 0000018a2f71127c
(XEN) ffff8300ceef7e08 ffff828c8011d39a ffff828c80277e60 ffff828c80231580
(XEN) ffff828c8028e900 ffff828c80231100 ffff8300ceef7e28 ffff828c8015a936
(XEN) ffff8300ceef7f28 ffff828c802072e0 ffff8300ceef7e38 ffff828c801577c3
(XEN) 00007cff31108197 ffff828c801420f0 0000018a2f71127c ffff828c80231100
(XEN) ffff828c8028e900 ffff828c802072e0 ffff8300ceef7ef0 ffff8300ceef7f28
(XEN) 0000000000000000 00000000deadbeef 0000000000000000 0000000000000007
(XEN) 0000000000000180 0000ffff0000ffff ffff828c8028e900 0000000000000020
(XEN) ffff8300cee0e100 000000fb00000000 ffff828c8013a1eb 000000000000e008
(XEN) 0000000000000246 ffff8300ceef7ef0 0000000000000000 ffff8300ceef7f20
(XEN) ffff828c8013a2a2 ffff8300cee72100 ffff8300cee0e100 0000000000000003
(XEN) ffff8300cee72100 ffff8300ceef7df8 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 00000000deadbeef
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 00000000deadbeef 0000000000c3f000 0000000000000003
(XEN) 00000000c06bc380 0000010000000000 00000000c01010c7 0000000000000061
(XEN) 0000000000000246 00000000d6445f9c 0000000000000069 5555555555555555
(XEN) 5555555555555555 5555555555555555 5555555555555555 5555555500000003
(XEN) Xen call trace:
(XEN) [<ffff828c80120839>] _xmalloc+0x35/0x16d
(XEN) [<ffff828c8011d219>] add_entry+0x6c/0xfb
(XEN) [<ffff828c8011d39a>] set_timer+0xf2/0x170
(XEN) [<ffff828c8015a936>] time_calibration_rendezvous+0xa5/0xb3
(XEN) [<ffff828c801577c3>] smp_call_function_interrupt+0x85/0xd2
(XEN) [<ffff828c801420f0>] call_function_interrupt+0x30/0x40
(XEN) [<ffff828c8013a1eb>] default_idle+0x2f/0x34
(XEN) [<ffff828c8013a2a2>] idle_loop+0x68/0x6f
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Both of the machines required a hard reboot, however the Xen console still responded. Issuing an "R"eboot hung the box.

-Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
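Read literally, the assertion text requires the current CPU's __local_irq_count to be zero when _xmalloc runs. A fully expanded expression like this is what one would expect from an assertion macro that lets the preprocessor expand its argument before stringifying it. The C sketch below illustrates that mechanism only; every identifier in it is hypothetical and none of it is copied from the Xen tree.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for per-CPU state; these are not Xen's definitions. */
struct cpu_info_sim { unsigned int processor_id; };
struct irq_stat_sim { unsigned int local_irq_count; };

static struct cpu_info_sim this_cpu;
static struct irq_stat_sim irq_stats[4];

/* A chain of helper macros, similar in spirit to the nesting seen in the log. */
#define cpu_info_ptr()  (&this_cpu)
#define this_cpu_id()   ((cpu_info_ptr()->processor_id))
#define irq_count(cpu)  (irq_stats[(cpu)].local_irq_count)
#define IN_INTERRUPT()  (irq_count(this_cpu_id()) != 0)

/* '#' stringifies its argument as written, without expanding macros in it. */
#define STRINGIFY_RAW(x)      #x
/* Passing through one more macro level expands the argument first. */
#define STRINGIFY_EXPANDED(x) STRINGIFY_RAW(x)

/* Evaluates the predicate for real, so the macros are not only stringified. */
int in_interrupt_now(void)
{
    assert(this_cpu.processor_id < 4); /* bounds check for the toy array */
    return IN_INTERRUPT();
}

/* Short form: the text exactly as the programmer wrote it. */
const char *assertion_text_raw(void)
{
    return STRINGIFY_RAW(!IN_INTERRUPT());
}

/* Long form: every helper macro replaced by its definition. */
const char *assertion_text_expanded(void)
{
    return STRINGIFY_EXPANDED(!IN_INTERRUPT());
}

/* Small helper for inspecting the generated text. */
int text_mentions(const char *text, const char *needle)
{
    return strstr(text, needle) != NULL;
}
```

The raw form keeps the short predicate name, while the expanded form names the underlying counter field, which is the shape of the message in the panic output above.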
Cui, Dexuan
2008-Aug-27 02:50 UTC
RE: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
I didn't hit the ASSERT failure. This also implies some issues around the time_rendezvous functionality. Maybe you can try reverting 18359: 95f1dc27e182 and trying again. It looks like this changeset makes the issues easier to trigger.

-- Dexuan
Keir Fraser
2008-Aug-27 07:33 UTC
Re: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
3.3.0 does not contain changeset 18359. Anyway, now we can see where an issue arises: set_timer() is called from interrupt contexts. In some cases it does dynamic allocation (xmalloc()) which is not allowed in such contexts. So it's a latent bug in set_timer() exposed by the time rendezvous changes.

Dexuan: did you test with a debug build of Xen? It'd be odd if you saw problems but didn't hit the assertion if so.

-- Keir
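Keir's diagnosis (a timer insertion that may allocate, reached from an interrupt handler) can be illustrated outside the hypervisor. The following is a minimal user-space sketch in C under stated assumptions: the interrupt counter, the allocator wrapper, and the toy timer heap are hypothetical stand-ins, not Xen's timer.c or xmalloc.c, and pre-sizing the heap is shown only as one generic way to keep allocation out of interrupt context, not as the fix that went into Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-CPU "nested interrupt depth" counter. */
static unsigned int local_irq_count;

void irq_enter_sim(void) { local_irq_count++; }
void irq_exit_sim(void)  { assert(local_irq_count > 0); local_irq_count--; }
bool in_irq_sim(void)    { return local_irq_count != 0; }

/*
 * Allocator wrapper that refuses to run in (simulated) interrupt context.
 * A debug build might assert here; this sketch returns NULL instead so the
 * caller can observe the refusal.
 */
void *checked_alloc(size_t size)
{
    if (in_irq_sim())
        return NULL;
    return malloc(size);
}

/* Toy timer "heap": a growable array of deadlines (ordering omitted). */
struct timer_heap {
    unsigned long *deadlines;
    size_t used;
    size_t capacity;
};

/* Grow the array to at least `capacity` slots; allocates, so not IRQ-safe. */
bool heap_reserve(struct timer_heap *h, size_t capacity)
{
    unsigned long *grown;

    if (capacity <= h->capacity)
        return true;
    grown = checked_alloc(capacity * sizeof(*grown));
    if (grown == NULL)
        return false;
    if (h->used > 0)
        memcpy(grown, h->deadlines, h->used * sizeof(*grown));
    free(h->deadlines);
    h->deadlines = grown;
    h->capacity = capacity;
    return true;
}

/* Add a deadline; growing on demand is the hazardous path in IRQ context. */
bool heap_add(struct timer_heap *h, unsigned long deadline)
{
    if (h->used == h->capacity &&
        !heap_reserve(h, h->capacity ? h->capacity * 2 : 4))
        return false;
    h->deadlines[h->used++] = deadline;
    return true;
}
```

In this sketch, calling heap_add() between irq_enter_sim() and irq_exit_sim() fails whenever the array is full, and succeeds once heap_reserve() has been called beforehand outside the simulated interrupt. That mirrors why the problem is latent: it only shows up when the insertion happens to need more memory.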
Cui, Dexuan
2008-Aug-27 07:42 UTC
RE: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
Yes. I always use a 'debug=y' build. I never noticed the bug in set_timer(). Actually, I only noticed the hang from xen-stable 18359 onwards.

-- Dexuan
Keir Fraser
2008-Aug-27 08:01 UTC
Re: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
You could try reverting the two parts of 18359 separately (local_irq_disable/enable and the change of the 'wait' parameter) and see which causes the hang to occur for you.

-- Keir
Cui, Dexuan
2008-Aug-27 08:05 UTC
RE: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
OK, I''ll try to do that. Looks the time needed to reproduce the hang is not very definite. maybe 30 minutes, maybe 1~2 hours. -- Dexuan -----Original Message----- From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] Sent: 2008年8月27日 16:02 To: Cui, Dexuan; Christopher S. Aker; xen devel Subject: Re: [Xen-devel] Xen 3.3.0 - Assertion ''!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'' failed at xmalloc.c:180 You could try reverting the two parts of 18359 separately (local_irq_disable/enable and the change of the ''wait'' parameter) and see which causes the hang to occur for you. -- Keir On 27/8/08 08:42, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:> Yes. I always use ''debug=y'' build. > I never noticed the bug in set_timer(). > Actually I only noticed the hang from xen-stable 18359. > > -- Dexuan > > > -----Original Message----- > From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] > Sent: 2008年8月27日 15:33 > To: Cui, Dexuan; Christopher S. Aker; xen devel > Subject: Re: [Xen-devel] Xen 3.3.0 - Assertion > ''!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'' > failed at xmalloc.c:180 > > 3.3.0 does not contain changeset 18359. Anyway, now we can see where an > issue arises: set_timer() is called from interrupt contexts. In some cases > it does dynamic allocation (xmalloc()) which is not allowed in such > contexts. So it''s a latent bug in set_timer() exposed by the time rendezvous > changes. > > Dexuan: did you test with a debug build of Xen? It''d be odd if you saw > problems but didn''t hit the assertion if so. > > -- Keir > > On 27/8/08 03:50, "Cui, Dexuan" <dexuan.cui@intel.com> wrote: > >> I didn''t meet with the ASSERT failure. >> This also implies some issues around that time_rendezvous functionality. >> Maybe you can try to revert 18359: 95f1dc27e182 and try again. Looks this >> changeset makes the issues easy to appear. 
>> >> -- Dexuan >> >> -----Original Message----- >> From: xen-devel-bounces@lists.xensource.com >> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Christopher S. >> Aker >> Sent: 2008年8月27日 10:37 >> To: xen devel >> Subject: [Xen-devel] Xen 3.3.0 - Assertion >> ''!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'' >> failed at xmalloc.c:180 >> >> This past weekend we rebooted two machines into 3.3.0. Just moments ago, >> within minutes of each other, they both crashed with the following: >> >> (XEN) Assertion >> ''!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) !>> 0)'' failed at xmalloc.c:180 >> (XEN) ----[ Xen-3.3.0 x86_64 debug=y Not tainted ]---- >> (XEN) CPU: 3 >> (XEN) RIP: e008:[<ffff828c80120839>] _xmalloc+0x35/0x16d >> (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor >> (XEN) rax: 0000000000000180 rbx: ffff8300cfde8300 rcx: ffff828c8022b180 >> (XEN) rdx: ffff828c8028e900 rsi: 0000000000000008 rdi: 0000000000000800 >> (XEN) rbp: ffff8300ceef7d98 rsp: ffff8300ceef7d78 r8: ffff828c8015a891 >> (XEN) r9: 0000000000000000 r10: 00000000deadbeef r11: 0000000000000000 >> (XEN) r12: ffff828c80231598 r13: 000000000000000f r14: ffff828c80231188 >> (XEN) r15: 000000000000000f cr0: 000000008005003b cr4: 00000000000026b0 >> (XEN) cr3: 000000056f8b0000 cr2: 00000000c975c7f0 >> (XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008 >> (XEN) r15: 000000000000000f cr0: 000000008005003b cr4: 00000000000026b0 >> (XEN) cr3: 000000056f8b0000 cr2: 00000000c975c7f0 >> (XEN) ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0000 cs: e008 >> (XEN) Xen stack trace from rsp=ffff8300ceef7d78: >> (XEN) ffff828c8014993d ffff8300cfde8300 ffff828c80231598 >> 000000000000000f >> (XEN) ffff8300ceef7dd8 ffff828c8011d219 000000ff00000010 >> ffff828c80231598 >> (XEN) 0000000000000003 0000018a2f73e9e4 0000000000000286 >> 0000018a2f71127c >> (XEN) ffff8300ceef7e08 ffff828c8011d39a ffff828c80277e60 >> ffff828c80231580 >> (XEN) ffff828c8028e900 
ffff828c80231100 ffff8300ceef7e28 >> ffff828c8015a936 >> (XEN) ffff8300ceef7f28 ffff828c802072e0 ffff8300ceef7e38 >> ffff828c801577c3 >> (XEN) 00007cff31108197 ffff828c801420f0 0000018a2f71127c >> ffff828c80231100 >> (XEN) ffff828c8028e900 ffff828c802072e0 ffff8300ceef7ef0 >> ffff8300ceef7f28 >> (XEN) 0000000000000000 00000000deadbeef 0000000000000000 >> 0000000000000007 >> (XEN) 0000000000000180 0000ffff0000ffff ffff828c8028e900 >> 0000000000000020 >> (XEN) ffff8300cee0e100 000000fb00000000 ffff828c8013a1eb >> 000000000000e008 >> (XEN) 0000000000000246 ffff8300ceef7ef0 0000000000000000 >> ffff8300ceef7f20 >> (XEN) ffff828c8013a2a2 ffff8300cee72100 ffff8300cee0e100 >> 0000000000000003 >> (XEN) ffff8300cee72100 ffff8300ceef7df8 0000000000000000 >> 0000000000000000 >> (XEN) 0000000000000000 0000000000000000 0000000000000000 >> 00000000deadbeef >> (XEN) 0000000000000000 0000000000000000 0000000000000000 >> 0000000000000000 >> (XEN) 0000000000000000 00000000deadbeef 0000000000c3f000 >> 0000000000000003 >> (XEN) 00000000c06bc380 0000010000000000 00000000c01010c7 >> 0000000000000061 >> (XEN) 0000000000000246 00000000d6445f9c 0000000000000069 >> 5555555555555555 >> (XEN) 5555555555555555 5555555555555555 5555555555555555 >> 5555555500000003 >> (XEN) Xen call trace: >> (XEN) [<ffff828c80120839>] _xmalloc+0x35/0x16d >> (XEN) [<ffff828c8011d219>] add_entry+0x6c/0xfb >> (XEN) [<ffff828c8011d39a>] set_timer+0xf2/0x170 >> (XEN) [<ffff828c8015a936>] time_calibration_rendezvous+0xa5/0xb3 >> (XEN) [<ffff828c801577c3>] smp_call_function_interrupt+0x85/0xd2 >> (XEN) [<ffff828c801420f0>] call_function_interrupt+0x30/0x40 >> (XEN) [<ffff828c8013a1eb>] default_idle+0x2f/0x34 >> (XEN) [<ffff828c8013a2a2>] idle_loop+0x68/0x6f >> (XEN) >> (XEN) >> (XEN) **************************************** >> (XEN) Panic on CPU 3: >> (XEN) Assertion >> ''!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) !>> 0)'' failed at xmalloc.c:180 >> (XEN) **************************************** 
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> Both of the machines required a hard reboot, however the Xen console
>> still responded. Issuing an "R"eboot hung the box.
>>
>> -Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Aug-28 12:36 UTC
Re: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
Hi Dexuan,

It was incorrect of me to change the @wait parameter. Changeset 18399 should fix this issue for you. My hope now is that no one will see any more issues with time_rendezvous at the tip of either xen-unstable (18399) or xen-3.3-testing (18375).

 -- Keir

On 27/8/08 09:05, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:

> OK, I'll try to do that.
> The time needed to reproduce the hang is not very definite: maybe 30
> minutes, maybe 1~2 hours.
>
> -- Dexuan
>
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: 27 August 2008 16:02
> To: Cui, Dexuan; Christopher S. Aker; xen devel
> Subject: Re: [Xen-devel] Xen 3.3.0 - Assertion
> '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'
> failed at xmalloc.c:180
>
> You could try reverting the two parts of 18359 separately
> (local_irq_disable/enable and the change of the 'wait' parameter) and see
> which causes the hang to occur for you.
>
> -- Keir
>
> On 27/8/08 08:42, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:
>
>> Yes. I always use a 'debug=y' build.
>> I never noticed the bug in set_timer().
>> Actually I only noticed the hang from xen-stable 18359.
>>
>> -- Dexuan
>>
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: 27 August 2008 15:33
>> To: Cui, Dexuan; Christopher S. Aker; xen devel
>> Subject: Re: [Xen-devel] Xen 3.3.0 - Assertion
>> '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'
>> failed at xmalloc.c:180
>>
>> 3.3.0 does not contain changeset 18359. Anyway, now we can see where an
>> issue arises: set_timer() is called from interrupt contexts. In some cases
>> it does dynamic allocation (xmalloc()) which is not allowed in such
>> contexts. So it's a latent bug in set_timer() exposed by the time rendezvous
>> changes.
>>
>> Dexuan: did you test with a debug build of Xen? It'd be odd if you saw
>> problems but didn't hit the assertion if so.
>>
>> -- Keir
>>
>> On 27/8/08 03:50, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:
>>
>>> I didn't hit the ASSERT failure.
>>> This also implies some issues around the time_rendezvous functionality.
>>> Maybe you can try to revert 18359: 95f1dc27e182 and try again. It looks
>>> like this changeset makes the issues appear more easily.
>>>
>>> -- Dexuan
>>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xensource.com
>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Christopher S.
>>> Aker
>>> Sent: 27 August 2008 10:37
>>> To: xen devel
>>> Subject: [Xen-devel] Xen 3.3.0 - Assertion
>>> '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'
>>> failed at xmalloc.c:180
>>>
>>> This past weekend we rebooted two machines into 3.3.0. Just moments ago,
>>> within minutes of each other, they both crashed with the following:
>>>
>>> (XEN) Assertion
>>> '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'
>>> failed at xmalloc.c:180
>>> (XEN) ----[ Xen-3.3.0 x86_64 debug=y Not tainted ]----
>>> (XEN) CPU: 3
>>> (XEN) RIP: e008:[<ffff828c80120839>] _xmalloc+0x35/0x16d
>>> [register and stack dump snipped]
>>> (XEN) Xen call trace:
>>> (XEN) [<ffff828c80120839>] _xmalloc+0x35/0x16d
>>> (XEN) [<ffff828c8011d219>] add_entry+0x6c/0xfb
>>> (XEN) [<ffff828c8011d39a>] set_timer+0xf2/0x170
>>> (XEN) [<ffff828c8015a936>] time_calibration_rendezvous+0xa5/0xb3
>>> (XEN) [<ffff828c801577c3>] smp_call_function_interrupt+0x85/0xd2
>>> (XEN) [<ffff828c801420f0>] call_function_interrupt+0x30/0x40
>>> (XEN) [<ffff828c8013a1eb>] default_idle+0x2f/0x34
>>> (XEN) [<ffff828c8013a2a2>] idle_loop+0x68/0x6f
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 3:
>>> (XEN) Assertion
>>> '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)'
>>> failed at xmalloc.c:180
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds...
>>>
>>> Both of the machines required a hard reboot, however the Xen console
>>> still responded. Issuing an "R"eboot hung the box.
>>>
>>> -Chris
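The failure mode described in this thread (an allocator that refuses to run in interrupt context, reached through set_timer() from an IPI handler) can be illustrated with a small standalone sketch. This is not Xen code. The names `local_irq_count`, `checked_alloc`, `timer_queue` and `queue_add` are hypothetical stand-ins, and the guess that add_entry() allocates only when the timer storage has to grow is an assumption drawn from the call trace, not something confirmed in the thread. Where a debug build of Xen would hit the ASSERT and panic, the sketch returns failure so the behaviour can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU irq_stat[cpu].__local_irq_count. */
static unsigned int local_irq_count;

static bool in_irq(void)    { return local_irq_count != 0; }
static void irq_enter(void) { local_irq_count++; }
static void irq_exit(void)  { assert(local_irq_count > 0); local_irq_count--; }

/* Allocator that is off limits in interrupt context. A debug hypervisor
 * would assert here; this sketch reports failure with NULL instead. */
static void *checked_alloc(size_t size)
{
    if (in_irq())
        return NULL;
    return malloc(size);
}

/* Toy timer storage that grows on demand (assumed analogue of add_entry). */
struct timer_queue {
    unsigned long long *deadlines;
    size_t used;
    size_t capacity;
};

/* Returns false when growth would need an allocation in interrupt context. */
static bool queue_add(struct timer_queue *q, unsigned long long deadline)
{
    if (q->used == q->capacity) {
        size_t new_cap = q->capacity ? q->capacity * 2 : 4;
        unsigned long long *bigger = checked_alloc(new_cap * sizeof(*bigger));
        if (bigger == NULL)
            return false;
        for (size_t i = 0; i < q->used; i++)
            bigger[i] = q->deadlines[i];
        free(q->deadlines);
        q->deadlines = bigger;
        q->capacity = new_cap;
    }
    q->deadlines[q->used++] = deadline;
    return true;
}
```

The point of the sketch is that the bug is latent: adding a timer from interrupt context works as long as spare capacity exists, and fails only on the occasional call that needs to grow the storage. That would be consistent with machines running for days before both hit the assertion.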
Cui, Dexuan
2008-Aug-29 04:31 UTC
RE: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180
Yes, the bug should be fixed by 18399. I can no longer reproduce the hang.

-- Dexuan

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: 28 August 2008 20:36
To: Cui, Dexuan; Christopher S. Aker; xen devel
Subject: Re: [Xen-devel] Xen 3.3.0 - Assertion '!((irq_stat[(((get_cpu_info()->processor_id)))].__local_irq_count) != 0)' failed at xmalloc.c:180

Hi Dexuan,

It was incorrect of me to change the @wait parameter. Changeset 18399 should fix this issue for you. My hope now is that no one will see any more issues with time_rendezvous at the tip of either xen-unstable (18399) or xen-3.3-testing (18375).

 -- Keir