Andreas Kinzler
2010-Sep-09 09:20 UTC
[Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
I have been talking with Jan for a while (via email) to track down the following problem, and he suggested that I report it on xen-devel:

Jul  9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI hang ?
Jul  9 01:49:05 virt kernel: aacraid: SCSI bus appears hung
Jul  9 01:49:10 virt kernel: Calling adapter init
Jul  9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not guaranteed on shared IRQs
Jul  9 01:49:49 virt kernel: Acquiring adapter information
Jul  9 01:49:49 virt kernel: update_interval=30:00 check_interval=86400s
Jul  9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous command timed out.
Jul  9 01:53:13 virt kernel: Usually a result of a PCI interrupt routing problem;
Jul  9 01:53:13 virt kernel: update mother board BIOS or consider utilizing one of
Jul  9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, apic etc)

After the VMs have been running for a while, the aacraid driver reports a
non-responding RAID controller. Most of the time the NIC also stops working.

I have tried nearly every combination of dom0 kernel (pvops, xenified SUSE
2.6.31.x, 2.6.32.x, 2.6.34.x) with Xen hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1
and unstable. No success in two months: sooner or later every combination hit
the problem shown above. I did extensive tests to make sure that the hardware
is OK, and it is - I am sure this is a Xen/dom0 problem.

Jan suggested trying the fix in c/s 22051, but it did not help. My answer to him:

> In the meantime I tried xen-unstable c/s 22068 (which contains staging c/s
> 22051) and it did not fix the problem at all. I was able to fix a problem
> with the serial console, and so I got some debug info that is attached to
> this email. The following lines look suspicious to me (irr=1,
> delivery_status=1):
>
>   (XEN)  IRQ 16 Vec216:
>   (XEN)    Apic 0x00, Pin 16: vector=216, delivery_mode=1, dest_mode=logical,
>            delivery_status=1, polarity=1, irr=1, trigger=level, mask=0, dest_id:1
>
> IRQ 16 is the aacraid controller, which after some time seems to be unable
> to receive interrupts. Can you see from the debug info what is going on?

I also applied a small patch which disables HPET broadcast. The machine has
now been running for 110 hours without a crash, while normally it crashes
within a few minutes. Is there something wrong (a race or deadlock) with HPET
broadcasts in relation to the blocked interrupt reception (see above)?

Andreas
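(For reference: the flags quoted above are fields of the IO-APIC redirection
table entry for pin 16. The following minimal C sketch is illustrative only -
the decode helper and the sample value are constructed for this note, they are
not Xen code. For a level-triggered pin, remote IRR stays set until the
servicing CPU's EOI reaches the IO-APIC, and while it is set no further
interrupts are delivered on that pin, so seeing irr=1 together with
delivery_status=1 on an unmasked pin is consistent with the aacraid IRQ being
wedged.)

    /*
     * Illustrative decode of the low 32 bits of an IO-APIC redirection
     * table entry (RTE), matching the fields Xen prints above.
     * Bit layout: vector[7:0], delivery_mode[10:8], dest_mode[11],
     * delivery_status[12], polarity[13], remote_irr[14], trigger[15], mask[16].
     */
    #include <stdio.h>

    static void decode_rte_low(unsigned int lo)
    {
        printf("vector=%u delivery_mode=%u dest_mode=%s delivery_status=%u "
               "polarity=%u irr=%u trigger=%s mask=%u\n",
               lo & 0xffu,                                /* vector          */
               (lo >> 8) & 0x7u,                          /* delivery mode   */
               (lo >> 11) & 1u ? "logical" : "physical",  /* dest mode       */
               (lo >> 12) & 1u,                           /* delivery status */
               (lo >> 13) & 1u,                           /* polarity        */
               (lo >> 14) & 1u,                           /* remote IRR      */
               (lo >> 15) & 1u ? "level" : "edge",        /* trigger mode    */
               (lo >> 16) & 1u);                          /* mask            */
    }

    int main(void)
    {
        /* Hypothetical RTE value reproducing the IRQ 16 dump above. */
        decode_rte_low(0x0000f9d8u);
        return 0;
    }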
Pasi Kärkkäinen
2010-Sep-21 11:56 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On Thu, Sep 09, 2010 at 11:20:51AM +0200, Andreas Kinzler wrote:
> I have been talking with Jan for a while (via email) to track down the
> following problem, and he suggested that I report it on xen-devel:
>
> [...]
>
> After the VMs have been running for a while, the aacraid driver reports a
> non-responding RAID controller. Most of the time the NIC also stops working.
>
> [...]
>
> I also applied a small patch which disables HPET broadcast. The machine has
> now been running for 110 hours without a crash, while normally it crashes
> within a few minutes. Is there something wrong (a race or deadlock) with
> HPET broadcasts in relation to the blocked interrupt reception (see above)?

Hello,

What kind of hardware does this happen on?

Should this patch be merged?
-- 
Pasi

> Andreas

> diff -urN xx/xen/arch/x86/hpet.c xen-4.0.1/xen/arch/x86/hpet.c
> --- xx/xen/arch/x86/hpet.c	2010-08-25 12:22:11.000000000 +0200
> +++ xen-4.0.1/xen/arch/x86/hpet.c	2010-08-30 18:13:34.000000000 +0200
> @@ -405,7 +405,7 @@
>          /* Only consider HPET timer with MSI support */
>          if ( !(cfg & HPET_TN_FSB_CAP) )
>              continue;
> -
> +if (1) continue;
>          ch->flags = 0;
>          ch->idx = i;
>  
> @@ -703,8 +703,9 @@
>  
>  int hpet_broadcast_is_available(void)
>  {
> -    return (legacy_hpet_event.event_handler == handle_hpet_broadcast
> -            || num_hpets_used > 0);
> +    /*return (legacy_hpet_event.event_handler == handle_hpet_broadcast
> +            || num_hpets_used > 0);*/
> +    return 0;
>  }
>  
>  int hpet_legacy_irq_tick(void)

> (XEN) '*' pressed -> firing all diagnostic keyhandlers
>
> [... full serial-console dump snipped for readability: host and HVM-guest
>  register/stack dumps for CPUs 0-3, Dom0 vcpu state, heap info, MSI state,
>  PCI devices, timer queues, ACPI Cx data (all four CPUs idling in C3),
>  event channels, grant tables, interrupt bindings (IRQs 24-31 are the
>  HPET-MSI channels), memory and NMI statistics, domain info, scheduler run
>  queues, TSC and NUMA info; near the end the output is interleaved with
>  dom0 kernel log lines. The lines relevant to the stuck aacraid interrupt
>  are: ...]
>
> (XEN)    IRQ:  16 affinity:00000000,00000000,00000000,00000001 vec:d8 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 16(----),
> (XEN)  IRQ 16 Vec216:
> (XEN)    Apic 0x00, Pin 16: vector=216, delivery_mode=1, dest_mode=logical, delivery_status=1, polarity=1, irr=1, trigger=level, mask=0, dest_id:1
Andreas Kinzler
2010-Sep-29 18:08 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 21.09.2010 13:56, Pasi Kärkkäinen wrote:>> I am talking a while (via email) with Jan now to track the following >> problem and he suggested that I report the problem on xen-devel: >> >> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI >> hang ? >> Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears hung >> Jul 9 01:49:10 virt kernel: Calling adapter init >> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not >> guaranteed on shared IRQs >> Jul 9 01:49:49 virt kernel: Acquiring adapter information >> Jul 9 01:49:49 virt kernel: update_interval=30:00 check_interval=86400s >> Jul 9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous >> command timed out. >> Jul 9 01:53:13 virt kernel: Usually a result of a PCI interrupt routing >> problem; >> Jul 9 01:53:13 virt kernel: update mother board BIOS or consider >> utilizing one of >> Jul 9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, apic etc) >> >> After the VMs have been running a while the aacraid driver reports a >> non-responding RAID controller. Most of the time the NIC is also no >> longer working. >> I nearly tried every combination of dom0 kernel (pvops0, xenfied suse >> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen >> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. >> No success in two month. Every combination earlier or later had the >> problem shown above. I did extensive tests to make sure that the >> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. >> >> Jan suggested to try the fix in c/s 22051 but it did not help. My answer >> to him: >> >>> In the meantime I did try xen-unstable c/s 22068 (contains staging c/s >> 22051) and >>> it did not fix the problem at all. I was able to fix a problem with >> the serial console >>> and so I got some debug info that is attached to this email. The >> following line looks >>> suspicious to me (irr=1, delivery_status=1): >> >>> (XEN) IRQ 16 Vec216: >>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, >> dest_mode=logical, >>> delivery_status=1, polarity=1, irr=1, trigger=level, >> mask=0, dest_id:1 >> >>> IRQ 16 is the aacraid controller which after some while seems to be >> enable to receive >>> interrupts. Can you see from the debug info what is going on? >> >> I also applied a small patch which disables HPET broadcast. The machine >> is now running >> for 110 hours without a crash while normally it crashes within a few >> minutes. Is there >> something wrong (race, deadlock) with HPET broadcasts in relation to >> blocked interrupt >> reception (see above)? > What kind of hardware does this happen on?It is a Supermicro X8SIL-F, Intel Xeon 3450 system.> Should this patch be merged?Not easy to answer. I spend more than 10 weeks searching nearly full time for the reason of the stability issues. Finally I was able to track it down to the HPET broadcast code. We need to find the developer of the HPET broadcast code. Then, he should try to fix the code. I consider it a quite severe bug as it renders Xen nearly useless on affected systems. That is why I (and my boss who pays me) spend so much time (developing/fixing Xen is not really my core job) and money (buying a E5620 machine just for testing Xen). I think many people on affected systems are having problems. See http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html Regards Andreas _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Lyon
2010-Sep-29 19:34 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On Wed, Sep 29, 2010 at 7:08 PM, Andreas Kinzler <ml-xen-devel@hfp.de> wrote:> On 21.09.2010 13:56, Pasi Kärkkäinen wrote: >>> >>> I am talking a while (via email) with Jan now to track the following >>> problem and he suggested that I report the problem on xen-devel: >>> >>> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI >>> hang ? >>> Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears hung >>> Jul 9 01:49:10 virt kernel: Calling adapter init >>> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not >>> guaranteed on shared IRQs >>> Jul 9 01:49:49 virt kernel: Acquiring adapter information >>> Jul 9 01:49:49 virt kernel: update_interval=30:00 check_interval=86400s >>> Jul 9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous >>> command timed out. >>> Jul 9 01:53:13 virt kernel: Usually a result of a PCI interrupt routing >>> problem; >>> Jul 9 01:53:13 virt kernel: update mother board BIOS or consider >>> utilizing one of >>> Jul 9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, apic >>> etc) >>> >>> After the VMs have been running a while the aacraid driver reports a >>> non-responding RAID controller. Most of the time the NIC is also no >>> longer working. >>> I nearly tried every combination of dom0 kernel (pvops0, xenfied suse >>> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen >>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. >>> No success in two month. Every combination earlier or later had the >>> problem shown above. I did extensive tests to make sure that the >>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. >>> >>> Jan suggested to try the fix in c/s 22051 but it did not help. My answer >>> to him: >>> >>>> In the meantime I did try xen-unstable c/s 22068 (contains staging c/s >>> >>> 22051) and >>>> >>>> it did not fix the problem at all. I was able to fix a problem with >>> >>> the serial console >>>> >>>> and so I got some debug info that is attached to this email. The >>> >>> following line looks >>>> >>>> suspicious to me (irr=1, delivery_status=1): >>> >>>> (XEN) IRQ 16 Vec216: >>>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, >>> >>> dest_mode=logical, >>>> >>>> delivery_status=1, polarity=1, irr=1, trigger=level, >>> >>> mask=0, dest_id:1 >>> >>>> IRQ 16 is the aacraid controller which after some while seems to be >>> >>> enable to receive >>>> >>>> interrupts. Can you see from the debug info what is going on? >>> >>> I also applied a small patch which disables HPET broadcast. The machine >>> is now running >>> for 110 hours without a crash while normally it crashes within a few >>> minutes. Is there >>> something wrong (race, deadlock) with HPET broadcasts in relation to >>> blocked interrupt >>> reception (see above)? >> >> What kind of hardware does this happen on? > > It is a Supermicro X8SIL-F, Intel Xeon 3450 system. > >> Should this patch be merged? > > Not easy to answer. I spend more than 10 weeks searching nearly full time > for the reason of the stability issues. Finally I was able to track it down > to the HPET broadcast code. > > We need to find the developer of the HPET broadcast code. Then, he should > try to fix the code. I consider it a quite severe bug as it renders Xen > nearly useless on affected systems. That is why I (and my boss who pays me) > spend so much time (developing/fixing Xen is not really my core job) and > money (buying a E5620 machine just for testing Xen). > > I think many people on affected systems are having problems. 
See > http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html > > Regards Andreas > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >

I will test that patch on my Supermicro X7DWA-N based dual Xeon workstation. I always use a Xenified kernel rather than pv_ops as it supports some features that I need and is compatible with the nvidia binary drivers, but I've always had problems with very occasional hard/soft lockups :(.

I've ruled out the nvidia drivers: before going on holiday a few weeks ago I upgraded Xen to 4.0.1 (from 3.4.2) and the kernel to 2.6.34 patched with the latest suse xen patches, but I did not compile the nvidia module or run X using any other drivers, and the system still locked up after 11 days of moderate load. Unfortunately my serial-to-tcp/ip device was not working, so I could not check the serial console remotely and had to reboot the system.

This problem has happened with 2.6.29, 30, 31, 32 and 34 + Xen 3.4.1, 3.4.2 and 4.0.1. I've also tried using the full suse patch set rather than the minimal set of Xen patches that I usually use, with no change, so I think this is a Xen problem.

Usually a soft lockup is reported by the linux kernel, but it is impossible to diagnose further as no i/o is possible, so commands like xm do not work; more rarely the system locks hard with no response at all on the serial console and no errors logged. In perhaps 1 in 20 cases the lockup is temporary and the system returns to normal performance, but usually it is terminal. The machine is my main workstation and the problem is rare enough that I've tolerated it. We recently got another dual Xeon workstation with a Supermicro X8DAL-i, so it will be interesting to see if that has the same issue.

Some example soft lockup errors, this one the system recovered from: BUG: soft lockup - CPU#3 stuck for 2796s!
[swapper:0] Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last unloaded: nvidia] CPU 3 Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last unloaded: nvidia] Pid: 0, comm: swapper Tainted: P 2.6.34-xen-r4 #1 X7DWA/X7DWA RIP: e030:[<ffffffff802013aa>] [<ffffffff802013aa>] 0xffffffff802013aa RSP: e02b:ffff8803ec4cdf10 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffffffff8088a158 RCX: ffffffff802013aa RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff8803ec4cdfd8 R08: 0000000000000000 R09: ffffffff8088a158 R10: ffff880071aeecc0 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1a34310710(0000) GS:ffff880001049000(0000) knlGS:0000000000000000 CS: e033 DS: 002b ES: 002b CR0: 000000008005003b CR2: 00007f5b6b5fa000 CR3: 00000000363be000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff8803ec4cc000, task ffff8803ec4be000) Stack: 000000000000a280 0000000000000000 ffffffff802062e1 ffffffff80209761 <0> ffffffff8088a158 ffffffff8020361f ffffffff804b2e73 0000000000000000 <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Call Trace: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd [<ffffffff80209761>] ? xen_idle+0x4f/0x85 [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80 [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc Call Trace: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd [<ffffffff80209761>] ? xen_idle+0x4f/0x85 [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80 [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa Some older examples which were terminal: Sep 25 05:10:12 ubermicro kernel: BUG: soft lockup - CPU#6 stuck for 61s! 
[xenvnc.sh:12180] Sep 25 05:10:12 ubermicro kernel: Modules linked in: cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer iTCO_wdt snd iTCO_vendor_support i2c_i801 snd_page_alloc igb sym53c8xx i5k_amb [last unloaded: microcode] Sep 25 05:10:12 ubermicro kernel: CPU 6 Sep 25 05:10:12 ubermicro kernel: Modules linked in: cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer iTCO_wdt snd iTCO_vendor_support i2c_i801 snd_page_alloc igb sym53c8xx i5k_amb [last unloaded: microcode] Sep 25 05:10:12 ubermicro kernel: Sep 25 05:10:12 ubermicro kernel: Pid: 12180, comm: xenvnc.sh Tainted: P 2.6.34-xen-r4 #1 X7DWA/X7DWA Sep 25 05:10:12 ubermicro kernel: RIP: e030:[<ffffffff8025328d>] [<ffffffff8025328d>] smp_call_function_many+0x187/0x19c Sep 25 05:10:12 ubermicro kernel: RSP: e02b:ffff88004e0c7dd8 EFLAGS: 00000202 Sep 25 05:10:12 ubermicro kernel: RAX: ffff880001086ac0 RBX: ffff880001089b30 RCX: 00007f3124b39000 Sep 25 05:10:12 ubermicro kernel: RDX: ffff88000107f000 RSI: 0000000000000020 RDI: 0000000000000020 Sep 25 05:10:12 ubermicro kernel: RBP: ffff880001089b00 R08: 0000000000000000 R09: ffff880001089b30 Sep 25 05:10:12 ubermicro kernel: R10: 0000000000007ff0 R11: ffff8803a73c21c0 R12: ffff8803a73c21c0 Sep 25 05:10:12 ubermicro kernel: R13: ffffffff80216f7f R14: 0000000000000006 R15: ffffffff8088a158 Sep 25 05:10:12 ubermicro kernel: FS: 00007f3125469700(0000) GS:ffff88000107f000(0000) knlGS:0000000000000000 Sep 25 05:10:12 ubermicro kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 25 05:10:12 ubermicro kernel: CR2: 00007f3124b39a90 CR3: 00000000008fd000 CR4: 0000000000002660 Sep 25 05:10:12 ubermicro kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 25 05:10:12 ubermicro kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 25 05:10:12 ubermicro kernel: Process xenvnc.sh (pid: 12180, threadinfo ffff88004e0c6000, task ffff88004cdc3980) Sep 25 05:10:12 ubermicro kernel: Stack: Sep 25 05:10:12 ubermicro kernel: 0000000000000000 0100000000000010 ffff880005ab4588 ffff8803a73c21c0 Sep 25 05:10:12 ubermicro kernel: <0> ffff88004cdc3980 ffff88004cdc3e4c ffff8803a73c2220 0000000000000232 Sep 25 05:10:12 ubermicro kernel: <0> 0000000000000001 ffffffff80216f40 00007f3124b39a90 ffff8803a73c21c0 Sep 25 05:10:12 ubermicro kernel: Call Trace: Sep 25 05:10:12 ubermicro kernel: [<ffffffff80216f40>] ? arch_exit_mmap+0x44/0x83 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80298322>] ? exit_mmap+0x49/0x16c Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5 Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022e03d>] ? exit_mm+0x108/0x113 Sep 25 05:10:12 ubermicro kernel: [<ffffffff802484d5>] ? hrtimer_try_to_cancel+0x92/0x9d Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022fc58>] ? do_exit+0x1f2/0x6e0 Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022f8dd>] ? sys_wait4+0xa5/0xb5 Sep 25 05:10:12 ubermicro kernel: [<ffffffff802301f4>] ? do_group_exit+0xae/0xd8 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80230230>] ? sys_exit_group+0x12/0x17 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80204248>] ? 
system_call_fastpath+0x16/0x1b Sep 25 05:10:12 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52 Sep 25 05:10:12 ubermicro kernel: Code: 7e 80 48 89 2d 55 8f 59 00 48 89 c6 48 89 6a 08 e8 82 0c 3c 00 0f ae f0 48 89 df e8 91 c1 fb ff 80 7c 24 0f 00 75 04 eb 08 f3 90 <f6> 45 20 01 75 f8 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 Sep 25 05:10:12 ubermicro kernel: Call Trace: Sep 25 05:10:12 ubermicro kernel: [<ffffffff80216f40>] ? arch_exit_mmap+0x44/0x83 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80298322>] ? exit_mmap+0x49/0x16c Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5 Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022e03d>] ? exit_mm+0x108/0x113 Sep 25 05:10:12 ubermicro kernel: [<ffffffff802484d5>] ? hrtimer_try_to_cancel+0x92/0x9d Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022fc58>] ? do_exit+0x1f2/0x6e0 Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022f8dd>] ? sys_wait4+0xa5/0xb5 Sep 25 05:10:12 ubermicro kernel: [<ffffffff802301f4>] ? do_group_exit+0xae/0xd8 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80230230>] ? sys_exit_group+0x12/0x17 Sep 25 05:10:12 ubermicro kernel: [<ffffffff80204248>] ? system_call_fastpath+0x16/0x1b Sep 25 05:10:12 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52 Sep 29 02:54:47 ubermicro kernel: BUG: soft lockup - CPU#3 stuck for 2796s! [swapper:0] Sep 29 02:54:47 ubermicro kernel: Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last unloaded: nvidia] Sep 29 02:54:47 ubermicro kernel: CPU 3 Sep 29 02:54:47 ubermicro kernel: Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last unloaded: nvidia] Sep 29 02:54:47 ubermicro kernel: Sep 29 02:54:47 ubermicro kernel: Pid: 0, comm: swapper Tainted: P 2.6.34-xen-r4 #1 X7DWA/X7DWA Sep 29 02:54:47 ubermicro kernel: RIP: e030:[<ffffffff802013aa>] [<ffffffff802013aa>] 0xffffffff802013aa Sep 29 02:54:47 ubermicro kernel: RSP: e02b:ffff8803ec4cdf10 EFLAGS: 00000246 Sep 29 02:54:47 ubermicro kernel: RAX: 0000000000000000 RBX: ffffffff8088a158 RCX: ffffffff802013aa Sep 29 02:54:47 ubermicro kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 Sep 29 02:54:47 ubermicro kernel: RBP: ffff8803ec4cdfd8 R08: 0000000000000000 R09: ffffffff8088a158 Sep 29 02:54:47 ubermicro kernel: R10: ffff880071aeecc0 R11: 0000000000000246 R12: 0000000000000000 Sep 29 02:54:47 ubermicro kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Sep 29 02:54:47 ubermicro kernel: FS: 00007f1a34310710(0000) GS:ffff880001049000(0000) knlGS:0000000000000000 Sep 29 02:54:47 ubermicro kernel: CS: e033 DS: 002b ES: 002b CR0: 000000008005003b Sep 29 02:54:47 ubermicro kernel: CR2: 00007f5b6b5fa000 CR3: 00000000363be000 CR4: 0000000000002660 Sep 29 02:54:47 ubermicro kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 29 02:54:47 ubermicro kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 29 02:54:47 ubermicro 
kernel: Process swapper (pid: 0, threadinfo ffff8803ec4cc000, task ffff8803ec4be000) Sep 29 02:54:47 ubermicro kernel: Stack: Sep 29 02:54:47 ubermicro kernel: 000000000000a280 0000000000000000 ffffffff802062e1 ffffffff80209761 Sep 29 02:54:47 ubermicro kernel: <0> ffffffff8088a158 ffffffff8020361f ffffffff804b2e73 0000000000000000 Sep 29 02:54:47 ubermicro kernel: <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Sep 29 02:54:47 ubermicro kernel: Call Trace: Sep 29 02:54:47 ubermicro kernel: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd Sep 29 02:54:47 ubermicro kernel: [<ffffffff80209761>] ? xen_idle+0x4f/0x85 Sep 29 02:54:47 ubermicro kernel: [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80 Sep 29 02:54:47 ubermicro kernel: [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa Sep 29 02:54:47 ubermicro kernel: Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc Sep 29 02:54:47 ubermicro kernel: Call Trace: Sep 29 02:54:47 ubermicro kernel: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd Sep 29 02:54:47 ubermicro kernel: [<ffffffff80209761>] ? xen_idle+0x4f/0x85 Sep 29 02:54:47 ubermicro kernel: [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80 Sep 29 02:54:47 ubermicro kernel: [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa Sep 8 18:16:30 ubermicro kernel: BUG: soft lockup - CPU#2 stuck for 61s! [xenvnc.sh:29385] Sep 8 18:16:30 ubermicro kernel: Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd i2c_i801 iTCO_wdt iTCO_vendor_support snd_page_alloc igb i5k_amb sym53c8xx [last unloaded: microcode] Sep 8 18:16:30 ubermicro kernel: CPU 2 Sep 8 18:16:30 ubermicro kernel: Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm snd_timer snd i2c_i801 iTCO_wdt iTCO_vendor_support snd_page_alloc igb i5k_amb sym53c8xx [last unloaded: microcode] Sep 8 18:16:30 ubermicro kernel: Sep 8 18:16:30 ubermicro kernel: Pid: 29385, comm: xenvnc.sh Tainted: P 2.6.34-xen-r3 #1 X7DWA/X7DWA Sep 8 18:16:30 ubermicro kernel: RIP: e030:[<ffffffff802531ef>] [<ffffffff802531ef>] smp_call_function_many+0x185/0x19c Sep 8 18:16:30 ubermicro kernel: RSP: e02b:ffff88009b18fdd8 EFLAGS: 00000202 Sep 8 18:16:30 ubermicro kernel: RAX: ffff88000103eac0 RBX: ffff880001041b30 RCX: 00007f89e1853000 Sep 8 18:16:30 ubermicro kernel: RDX: ffff880001037000 RSI: 0000000000000020 RDI: 0000000000000020 Sep 8 18:16:30 ubermicro kernel: RBP: ffff880001041b00 R08: 0000000000000000 R09: ffff880001041b30 Sep 8 18:16:30 ubermicro kernel: R10: 0000000000007ff0 R11: ffff8803d7d12800 R12: ffff8803d7d12800 Sep 8 18:16:30 ubermicro kernel: R13: ffffffff80216f7f R14: 0000000000000002 R15: ffffffff8088a158 Sep 8 18:16:30 ubermicro kernel: FS: 00007f89e2183700(0000) GS:ffff880001037000(0000) knlGS:0000000000000000 Sep 8 18:16:30 ubermicro kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 8 18:16:30 ubermicro kernel: CR2: 00007f89e1853a90 CR3: 00000000008fd000 CR4: 0000000000002660 Sep 8 18:16:30 ubermicro kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000 Sep 8 18:16:30 ubermicro kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 8 18:16:30 ubermicro kernel: Process xenvnc.sh (pid: 29385, threadinfo ffff88009b18e000, task ffff8801004ad320) Sep 8 18:16:30 ubermicro kernel: Stack: Sep 8 18:16:30 ubermicro kernel: 0000000000000000 0100000000000010 ffff880006b3c398 ffff8803d7d12800 Sep 8 18:16:30 ubermicro kernel: <0> ffff8801004ad320 ffff8801004ad7ec ffff8803d7d12860 0000000000000403 Sep 8 18:16:30 ubermicro kernel: <0> 0000000000000001 ffffffff80216f40 00007f89e1853a90 ffff8803d7d12800 Sep 8 18:16:30 ubermicro kernel: Call Trace: Sep 8 18:16:30 ubermicro kernel: [<ffffffff80216f40>] ? arch_exit_mmap+0x44/0x83 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80298202>] ? exit_mmap+0x49/0x16c Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5 Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022e025>] ? exit_mm+0x108/0x113 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80248439>] ? hrtimer_try_to_cancel+0x92/0x9d Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022fc40>] ? do_exit+0x1f2/0x6e0 Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022f8c5>] ? sys_wait4+0xa5/0xb5 Sep 8 18:16:30 ubermicro kernel: [<ffffffff802301dc>] ? do_group_exit+0xae/0xd8 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80230218>] ? sys_exit_group+0x12/0x17 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80204248>] ? system_call_fastpath+0x16/0x1b Sep 8 18:16:30 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52 Sep 8 18:16:30 ubermicro kernel: Code: d0 c1 7e 80 48 89 2d f1 8f 59 00 48 89 c6 48 89 6a 08 e8 0e 08 3c 00 0f ae f0 48 89 df e8 2d c2 fb ff 80 7c 24 0f 00 75 04 eb 08 <f3> 90 f6 45 20 01 75 f8 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 Sep 8 18:16:30 ubermicro kernel: Call Trace: Sep 8 18:16:30 ubermicro kernel: [<ffffffff80216f40>] ? arch_exit_mmap+0x44/0x83 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80298202>] ? exit_mmap+0x49/0x16c Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5 Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022e025>] ? exit_mm+0x108/0x113 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80248439>] ? hrtimer_try_to_cancel+0x92/0x9d Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022fc40>] ? do_exit+0x1f2/0x6e0 Sep 8 18:16:30 ubermicro kernel: [<ffffffff8022f8c5>] ? sys_wait4+0xa5/0xb5 Sep 8 18:16:30 ubermicro kernel: [<ffffffff802301dc>] ? do_group_exit+0xae/0xd8 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80230218>] ? sys_exit_group+0x12/0x17 Sep 8 18:16:30 ubermicro kernel: [<ffffffff80204248>] ? system_call_fastpath+0x16/0x1b Sep 8 18:16:30 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52 I should be able to apply the patch tomorrow and will report back as soon as I have some results. Andy _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Sep-29 19:50 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 09/29/2010 11:08 AM, Andreas Kinzler wrote:> On 21.09.2010 13:56, Pasi Kärkkäinen wrote: >>> I am talking a while (via email) with Jan now to track the following >>> problem and he suggested that I report the problem on xen-devel: >>> >>> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI >>> hang ? >>> Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears hung >>> Jul 9 01:49:10 virt kernel: Calling adapter init >>> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not >>> guaranteed on shared IRQs >>> Jul 9 01:49:49 virt kernel: Acquiring adapter information >>> Jul 9 01:49:49 virt kernel: update_interval=30:00 >>> check_interval=86400s >>> Jul 9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous >>> command timed out. >>> Jul 9 01:53:13 virt kernel: Usually a result of a PCI interrupt >>> routing >>> problem; >>> Jul 9 01:53:13 virt kernel: update mother board BIOS or consider >>> utilizing one of >>> Jul 9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, >>> apic etc) >>> >>> After the VMs have been running a while the aacraid driver reports a >>> non-responding RAID controller. Most of the time the NIC is also no >>> longer working. >>> I nearly tried every combination of dom0 kernel (pvops0, xenfied suse >>> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen >>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. >>> No success in two month. Every combination earlier or later had the >>> problem shown above. I did extensive tests to make sure that the >>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. >>> >>> Jan suggested to try the fix in c/s 22051 but it did not help. My >>> answer >>> to him: >>> >>>> In the meantime I did try xen-unstable c/s 22068 (contains staging c/s >>> 22051) and >>>> it did not fix the problem at all. I was able to fix a problem with >>> the serial console >>>> and so I got some debug info that is attached to this email. The >>> following line looks >>>> suspicious to me (irr=1, delivery_status=1): >>> >>>> (XEN) IRQ 16 Vec216: >>>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, >>> dest_mode=logical, >>>> delivery_status=1, polarity=1, irr=1, trigger=level, >>> mask=0, dest_id:1 >>> >>>> IRQ 16 is the aacraid controller which after some while seems to be >>> enable to receive >>>> interrupts. Can you see from the debug info what is going on? >>> >>> I also applied a small patch which disables HPET broadcast. The machine >>> is now running >>> for 110 hours without a crash while normally it crashes within a few >>> minutes. Is there >>> something wrong (race, deadlock) with HPET broadcasts in relation to >>> blocked interrupt >>> reception (see above)? >> What kind of hardware does this happen on? > > It is a Supermicro X8SIL-F, Intel Xeon 3450 system.That''s exactly what my main test/devel machine is. It has been very stable for me with xen-unstable. Is 4.0.1 different from xen-unstable with respect to HPET? The big problem I had initially was instability with the integrated ethernet until I disabled PCIe ASPM. The symptom was that the ethernet devices would disappear (ie, their PCI config space would start to read all 0xff...)>> Should this patch be merged? > > Not easy to answer. I spend more than 10 weeks searching nearly full > time for the reason of the stability issues. Finally I was able to > track it down to the HPET broadcast code. > > We need to find the developer of the HPET broadcast code. Then, he > should try to fix the code. 
I consider it a quite severe bug as it > renders Xen nearly useless on affected systems. That is why I (and my > boss who pays me) spend so much time (developing/fixing Xen is not > really my core job) and money (buying a E5620 machine just for testing > Xen). > > I think many people on affected systems are having problems. See > http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html

Just out of interest, does disabling ASPM help? I had to disable it in the BIOS, and set pcie_aspm=off on the kernel command line. This is a total shot in the dark, but given that we're using identical systems it seems worth a try.

Thanks,
J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
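For reference, "pcie_aspm=off" is a Linux (dom0) kernel parameter, so it belongs on the dom0 kernel line of the GRUB entry, not on the xen.gz line. A minimal grub.conf sketch follows; the paths, kernel version and the remaining options are placeholders, not values taken from this thread:

    title Xen with ASPM disabled in dom0
    root (hd0,0)
    kernel /boot/xen.gz dom0_mem=2048M
    module /boot/vmlinuz-2.6.34-xen root=/dev/sda2 ro console=tty0 pcie_aspm=off
    module /boot/initrd-2.6.34-xen.img

Per Jeremy's report, the BIOS setting has to be changed as well; the kernel parameter alone was not sufficient on his board.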
Konrad Rzeszutek Wilk
2010-Sep-29 21:18 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On Wed, Sep 29, 2010 at 08:34:28PM +0100, Andrew Lyon wrote:> On Wed, Sep 29, 2010 at 7:08 PM, Andreas Kinzler <ml-xen-devel@hfp.de> wrote: > > On 21.09.2010 13:56, Pasi Kärkkäinen wrote: > >>> > >>> I am talking a while (via email) with Jan now to track the following > >>> problem and he suggested that I report the problem on xen-devel: > >>> > >>> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI > >>> hang ? > >>> Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears hung > >>> Jul 9 01:49:10 virt kernel: Calling adapter init > >>> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not > >>> guaranteed on shared IRQs > >>> Jul 9 01:49:49 virt kernel: Acquiring adapter information > >>> Jul 9 01:49:49 virt kernel: update_interval=30:00 check_interval=86400s > >>> Jul 9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous > >>> command timed out. > >>> Jul 9 01:53:13 virt kernel: Usually a result of a PCI interrupt routing > >>> problem; > >>> Jul 9 01:53:13 virt kernel: update mother board BIOS or consider > >>> utilizing one of > >>> Jul 9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, apic > >>> etc) > >>> > >>> After the VMs have been running a while the aacraid driver reports a > >>> non-responding RAID controller. Most of the time the NIC is also no > >>> longer working. > >>> I nearly tried every combination of dom0 kernel (pvops0, xenfied suse > >>> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen > >>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. > >>> No success in two month. Every combination earlier or later had the > >>> problem shown above. I did extensive tests to make sure that the > >>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. > >>> > >>> Jan suggested to try the fix in c/s 22051 but it did not help. My answer > >>> to him: > >>> > >>>> In the meantime I did try xen-unstable c/s 22068 (contains staging c/s > >>> > >>> 22051) and > >>>> > >>>> it did not fix the problem at all. I was able to fix a problem with > >>> > >>> the serial console > >>>> > >>>> and so I got some debug info that is attached to this email. The > >>> > >>> following line looks > >>>> > >>>> suspicious to me (irr=1, delivery_status=1): > >>> > >>>> (XEN) IRQ 16 Vec216: > >>>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, > >>> > >>> dest_mode=logical, > >>>> > >>>> delivery_status=1, polarity=1, irr=1, trigger=level, > >>> > >>> mask=0, dest_id:1 > >>> > >>>> IRQ 16 is the aacraid controller which after some while seems to be > >>> > >>> enable to receive > >>>> > >>>> interrupts. Can you see from the debug info what is going on? > >>> > >>> I also applied a small patch which disables HPET broadcast. The machine > >>> is now running > >>> for 110 hours without a crash while normally it crashes within a few > >>> minutes. Is there > >>> something wrong (race, deadlock) with HPET broadcasts in relation to > >>> blocked interrupt > >>> reception (see above)? > >> > >> What kind of hardware does this happen on? > > > > It is a Supermicro X8SIL-F, Intel Xeon 3450 system. > > > >> Should this patch be merged? > > > > Not easy to answer. I spend more than 10 weeks searching nearly full time > > for the reason of the stability issues. Finally I was able to track it down > > to the HPET broadcast code. > > > > We need to find the developer of the HPET broadcast code. Then, he should > > try to fix the code. I consider it a quite severe bug as it renders Xen > > nearly useless on affected systems. 
That is why I (and my boss who pays me) > > spend so much time (developing/fixing Xen is not really my core job) and > > money (buying a E5620 machine just for testing Xen). > > > > I think many people on affected systems are having problems. See > > http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html > > > > Regards Andreas > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > > I will test that patch on my Supermicro X7DWA-N based dual Xeon > workstation, I always use a Xenified kernel rather than pv_ops as it > supports some features that I need and is compatible with nvidia > binary drivers, but I''ve always had problems with very occasional<hint> The PVOPS kernel works with the nouveau driver</hint> Look at http://wiki.xensource.com/xenwiki/XenPVOPSDRM for details. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhang, Xiantao
2010-Sep-30 05:00 UTC
RE: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
Maybe you can disable pirq_set_affinity to have a try with the following patch. It may trigger IRQ migration in hypervisor, and the IRQ migration logic about(especailly shared)level-triggered ioapic IRQ is not well tested because of no users before. After intoducing the pirq_set_affinity in #Cset21625, the logic is used frequently when vcpu migration occurs, so I doubt it maybe expose the issue you met. Besides, there is a bug in event driver which is fixed in latest pv_ops dom0, seems the dom0 you are using doesn''t include the fix. This bug may result in lost event in dom0 and invoke dom0 hang eventually. To workaround this bug, you can disable irqbalance in dom0. Good luck! Xiantao diff -r fc29e13f669d xen/arch/x86/irq.c --- a/xen/arch/x86/irq.c Mon Aug 09 16:36:07 2010 +0100 +++ b/xen/arch/x86/irq.c Thu Sep 30 20:33:11 2010 +0800 @@ -516,6 +516,7 @@ void irq_set_affinity(struct irq_desc *d void pirq_set_affinity(struct domain *d, int pirq, const cpumask_t *mask) { +#if 0 unsigned long flags; struct irq_desc *desc = domain_spin_lock_irq_desc(d, pirq, &flags); @@ -523,6 +524,7 @@ void pirq_set_affinity(struct domain *d, return; irq_set_affinity(desc, mask); spin_unlock_irqrestore(&desc->lock, flags); +#endif } DEFINE_PER_CPU(unsigned int, irq_count); Andreas Kinzler wrote:> On 21.09.2010 13:56, Pasi Kärkkäinen wrote: >>> I am talking a while (via email) with Jan now to track the >>> following problem and he suggested that I report the problem on >>> xen-devel: >>> >>> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. >>> SCSI hang ? Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears >>> hung >>> Jul 9 01:49:10 virt kernel: Calling adapter init >>> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not >>> guaranteed on shared IRQs Jul 9 01:49:49 virt kernel: Acquiring >>> adapter information >>> Jul 9 01:49:49 virt kernel: update_interval=30:00 >>> check_interval=86400s Jul 9 01:53:13 virt kernel: aacraid: >>> aac_fib_send: first asynchronous command timed out. Jul 9 01:53:13 >>> virt kernel: Usually a result of a PCI interrupt routing problem; >>> Jul 9 01:53:13 virt kernel: update mother board BIOS or consider >>> utilizing one of Jul 9 01:53:13 virt kernel: the SAFE mode kernel >>> options (acpi, apic etc) >>> >>> After the VMs have been running a while the aacraid driver reports a >>> non-responding RAID controller. Most of the time the NIC is also no >>> longer working. I nearly tried every combination of dom0 kernel >>> (pvops0, xenfied suse >>> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen >>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. >>> No success in two month. Every combination earlier or later had the >>> problem shown above. I did extensive tests to make sure that the >>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. >>> >>> Jan suggested to try the fix in c/s 22051 but it did not help. My >>> answer to him: >>> >>>> In the meantime I did try xen-unstable c/s 22068 (contains staging >>>> c/s 22051) and it did not fix the problem at all. I was able to >>>> fix a problem with the serial console and so I got some debug info >>>> that is attached to this email. 
The following line looks >>>> suspicious to me (irr=1, delivery_status=1): >>> >>>> (XEN) IRQ 16 Vec216: >>>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, >>>> dest_mode=logical, delivery_status=1, polarity=1, >>>> irr=1, trigger=level, mask=0, dest_id:1 >>> >>>> IRQ 16 is the aacraid controller which after some while seems to >>>> be enable to receive interrupts. Can you see from the debug info >>>> what is going on? >>> >>> I also applied a small patch which disables HPET broadcast. The >>> machine is now running for 110 hours without a crash while normally >>> it crashes within a few minutes. Is there something wrong (race, >>> deadlock) with HPET broadcasts in relation to blocked interrupt >>> reception (see above)? >> What kind of hardware does this happen on? > > It is a Supermicro X8SIL-F, Intel Xeon 3450 system. > >> Should this patch be merged? > > Not easy to answer. I spend more than 10 weeks searching nearly full > time for the reason of the stability issues. Finally I was able to > track > it down to the HPET broadcast code. > > We need to find the developer of the HPET broadcast code. Then, he > should try to fix the code. I consider it a quite severe bug as it > renders Xen nearly useless on affected systems. That is why I (and my > boss who pays me) spend so much time (developing/fixing Xen is not > really my core job) and money (buying a E5620 machine just for > testing Xen). > > I think many people on affected systems are having problems. See > http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html > > Regards Andreas > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
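A footnote on the irqbalance workaround mentioned above: irqbalance normally runs as a system service in dom0, so disabling it means stopping the daemon and removing it from the boot sequence. The exact commands depend on the distribution; the sysvinit-style sequence below is only an assumed example, not taken from this thread:

    # stop the running daemon
    /etc/init.d/irqbalance stop
    # keep it from starting again at boot (RHEL/CentOS-style)
    chkconfig irqbalance off

On distributions without chkconfig, the equivalent update-rc.d or rc-update invocation applies.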
Wei, Gang
2010-Sep-30 06:02 UTC
RE: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
I am the original developer of HPET broadcast code. First of all, to disable HPET broadcast, no additional patch is required. Please simply add option "cpuidle=off" or "max_cstate=1" at xen cmdline in /boot/grub/grub.conf. Second, I noticed that the issue just occur on pre-nehalem server processors. I will check whether I can reproduce it. Meanwhile, I am looking forward to see whether Jeremy & Xiantao''s suggestions have effects. So Andreas, could you help to have a try on their suggestions? Jimmy On , xen-devel-bounces@lists.xensource.com wrote:> Maybe you can disable pirq_set_affinity to have a try with the > following patch. It may trigger IRQ migration in hypervisor, > and the IRQ migration logic about(especailly > shared)level-triggered ioapic IRQ is not well tested because > of no users before. After intoducing the pirq_set_affinity in > #Cset21625, the logic is used frequently when vcpu migration > occurs, so I doubt it maybe expose the issue you met. > Besides, there is a bug in event driver which is fixed in > latest pv_ops dom0, seems the dom0 you are using doesn''t > include the fix. This bug may result in lost event in dom0 > and invoke dom0 hang eventually. To workaround this bug, you > can disable irqbalance in dom0. Good luck! > Xiantao > > diff -r fc29e13f669d xen/arch/x86/irq.c > --- a/xen/arch/x86/irq.c Mon Aug 09 16:36:07 2010 +0100 > +++ b/xen/arch/x86/irq.c Thu Sep 30 20:33:11 2010 +0800 > @@ -516,6 +516,7 @@ void irq_set_affinity(struct irq_desc *d > > void pirq_set_affinity(struct domain *d, int pirq, const cpumask_t > *mask) { > +#if 0 > unsigned long flags; > struct irq_desc *desc = domain_spin_lock_irq_desc(d, pirq, > &flags); > > @@ -523,6 +524,7 @@ void pirq_set_affinity(struct domain *d, > return; irq_set_affinity(desc, mask); > spin_unlock_irqrestore(&desc->lock, flags); > +#endif > } > > DEFINE_PER_CPU(unsigned int, irq_count); > > > Andreas Kinzler wrote: >> On 21.09.2010 13:56, Pasi Kärkkäinen wrote: >>>> I am talking a while (via email) with Jan now to track the >>>> following problem and he suggested that I report the problem on >>>> xen-devel: >>>> >>>> Jul 9 01:48:04 virt kernel: aacraid: Host adapter reset request. >>>> SCSI hang ? Jul 9 01:49:05 virt kernel: aacraid: SCSI bus appears >>>> hung Jul 9 01:49:10 virt kernel: Calling adapter init >>>> Jul 9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not >>>> guaranteed on shared IRQs Jul 9 01:49:49 virt kernel: Acquiring >>>> adapter information Jul 9 01:49:49 virt kernel: >>>> update_interval=30:00 check_interval=86400s Jul 9 01:53:13 virt >>>> kernel: aacraid: aac_fib_send: first asynchronous command timed >>>> out. Jul 9 01:53:13 virt kernel: Usually a result of a PCI >>>> interrupt routing problem; Jul 9 01:53:13 virt kernel: update >>>> mother board BIOS or consider utilizing one of Jul 9 01:53:13 >>>> virt kernel: the SAFE mode kernel options (acpi, apic etc) >>>> >>>> After the VMs have been running a while the aacraid driver reports >>>> a non-responding RAID controller. Most of the time the NIC is also >>>> no longer working. I nearly tried every combination of dom0 kernel >>>> (pvops0, xenfied suse >>>> 2.6.31.x, xenfied suse 2.6.32.x, xenfied suse 2.6.34.x) with Xen >>>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable. >>>> No success in two month. Every combination earlier or later had the >>>> problem shown above. I did extensive tests to make sure that the >>>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem. 
>>>> >>>> Jan suggested to try the fix in c/s 22051 but it did not help. My >>>> answer to him: >>>> >>>>> In the meantime I did try xen-unstable c/s 22068 (contains staging >>>>> c/s 22051) and it did not fix the problem at all. I was able to >>>>> fix a problem with the serial console and so I got some debug info >>>>> that is attached to this email. The following line looks >>>>> suspicious to me (irr=1, delivery_status=1): >>>> >>>>> (XEN) IRQ 16 Vec216: >>>>> (XEN) Apic 0x00, Pin 16: vector=216, delivery_mode=1, >>>>> dest_mode=logical, delivery_status=1, polarity=1, >>>>> irr=1, trigger=level, mask=0, dest_id:1 >>>> >>>>> IRQ 16 is the aacraid controller which after some while seems to >>>>> be enable to receive interrupts. Can you see from the debug info >>>>> what is going on? >>>> >>>> I also applied a small patch which disables HPET broadcast. The >>>> machine is now running for 110 hours without a crash while normally >>>> it crashes within a few minutes. Is there something wrong (race, >>>> deadlock) with HPET broadcasts in relation to blocked interrupt >>>> reception (see above)? >>> What kind of hardware does this happen on? >> >> It is a Supermicro X8SIL-F, Intel Xeon 3450 system. >> >>> Should this patch be merged? >> >> Not easy to answer. I spend more than 10 weeks searching nearly full >> time for the reason of the stability issues. Finally I was able to >> track it down to the HPET broadcast code. >> >> We need to find the developer of the HPET broadcast code. Then, he >> should try to fix the code. I consider it a quite severe bug as it >> renders Xen nearly useless on affected systems. That is why I (and my >> boss who pays me) spend so much time (developing/fixing Xen is not >> really my core job) and money (buying a E5620 machine just for >> testing Xen). >> >> I think many people on affected systems are having problems. See >> > http://lists.xensource.com/archives/html/xen-users/2010-09/msg0 > 0370.html >> >> Regards Andreas >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
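To make the cmdline suggestion above concrete: "cpuidle=off" and "max_cstate=1" are Xen hypervisor options, so they go on the xen.gz line of /boot/grub/grub.conf, not on the dom0 kernel (module) line. A minimal GRUB entry sketch follows; the paths, dom0 kernel version and other options are placeholders, not taken from this thread:

    title Xen (deep C-states disabled)
    root (hd0,0)
    kernel /boot/xen.gz max_cstate=1
    module /boot/vmlinuz-2.6.32-xen root=/dev/sda2 ro
    module /boot/initrd-2.6.32-xen.img

Either option should have the same effect for this purpose: with cpuidle off or C-states capped at C1, the local APIC timers keep running, so the HPET broadcast path is never exercised.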
Andreas Kinzler
2010-Sep-30 09:42 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 30.09.2010 07:00, Zhang, Xiantao wrote:
> Maybe you can disable pirq_set_affinity to have a try with the following patch.
> It may trigger IRQ migration in hypervisor, and the IRQ migration logic about (especailly shared)
> level-triggered ioapic IRQ is not well tested because of no users before. After intoducing the
> pirq_set_affinity in #Cset21625, the logic is used frequently when vcpu migration occurs

I am using Xen 4.0.1, which is c/s 21324, so I should not be affected?

> Besides, there is a bug in event driver which is fixed in latest pv_ops dom0, seems the dom0
> you are using doesn't include the fix. This bug may result in lost event in dom0 and invoke
> dom0 hang eventually.

Hmm, this really does not explain why everything is rock solid after disabling HPET broadcast? And the problem occurred with every kernel (xenfied, pvops, all versions). Please correct me if I am wrong.

> To workaround this bug, you can disable irqbalance in dom0. Good luck!

As far as I know I am not using irq balancing (certainly not using the irqbalance daemon).

Regards Andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Andreas Kinzler
2010-Sep-30 10:16 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 29.09.2010 21:50, Jeremy Fitzhardinge wrote:
>> It is a Supermicro X8SIL-F, Intel Xeon 3450 system.
> The big problem I had initially was instability with the integrated
> ethernet until I disabled PCIe ASPM. The symptom was that the ethernet
> devices would disappear (ie, their PCI config space would start to read
> all 0xff...)

I know that this is a known problem of Intel 82574L chips (on the X8SIL) - it is discussed on "Intel Wired Ethernet" (http://sourceforge.net/projects/e1000/). That is why I tested different NICs (Intel ET Server Adapter (82576 [igb]) and Realtek 8168) and the problem remained. So I can say with certainty that the NIC and/or its power management is not the problem.

I also spent extensive time changing hardware components. I used a different mainboard (ASUS P7F-M), a different power supply, changed the CPU, changed NICs (see above) - the problems remained.

> That's exactly what my main test/devel machine is. It has been very
> stable for me with xen-unstable.

We have a second Supermicro X8SIL-F, Intel Xeon 3450 system which only runs Linux PVM domains and it is totally stable (without my HPET patch). So I think, as with all timing/race/deadlock/... issues, it depends on what you do on your system. Let me give you my crash "recipe" [quite reliable ;-)]:

Have two HVMs (called win1, win2) with Windows 7 x64 installed (do install everything twice, never clone; VM config attached). Install GPLPV 0.11.0.213, iometer 2006.07.27 and prime95 25.11 x64. On both systems start the prime95 torture test (in-place large FFT) and, using the Windows task manager, set the CPU affinity of the prime95 process on win1 to use only CPU1. On win2 do the same thing but use only CPU0. Then start iometer on both VMs with the following parameters: have a second virtual disk in both VMs (so each Windows has two virtual disks, one for Windows and one for iometer), use "# of outstanding I/Os" = 4, access spec = "All in one". Wait some minutes. Crash!

Regards Andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Sep-30 17:12 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 09/30/2010 03:16 AM, Andreas Kinzler wrote:> On 29.09.2010 21:50, Jeremy Fitzhardinge wrote: >>> It is a Supermicro X8SIL-F, Intel Xeon 3450 system. >> The big problem I had initially was instability with the integrated >> ethernet until I disabled PCIe ASPM. The symptom was that the ethernet >> devices would disappear (ie, their PCI config space would start to read >> all 0xff...) > > I know that this is a known problem of Intel 82574L chips (on X8SIL) - > it is discussed on "Intel Wired Ethernet" > (http://sourceforge.net/projects/e1000/).Aha, specifically http://sourceforge.net/tracker/index.php?func=detail&aid=2908463&group_id=42302&atid=447449, in which several people invoke me, but nobody bothered to tell me that this bug existed on sf :/> That is why I tested different NICs (Intel ET Server Adapter (82576 > [igb]) and Realtek 8168) and the problem remained. So I can say with > certainty that the NIC and/or its power management is not the problem.OK.> > I also spend extensive time changing hardware components. I used a > different mainboard (ASUS P7F-M), a different power supply, changed > CPU, changed NICs (see above) - problems remained. > > > That''s exactly what my main test/devel machine is. It has been very > > stable for me with xen-unstable. > > We have a second Supermicro X8SIL-F, Intel Xeon 3450 system which only > runs Linux PVM domains and it is totally stable (without my HPET > patch). So I think as with all timing/race/deadlock/... issues it > depends on what you do on your system. Let me give you my crash > "recipe" [quite reliable ;-)]OK. My machine is mostly running PV domains, with some low-intensity hvm ones.> > Have two HVMs (called win1, win2) with Windows 7 x64 installed (do > install everything twice, never clone, VM config attached). Install > GPLPV 0.11.0.213, iometer 2006.07.27, prime95 25.11 x64. On both > systems: start prime95 torture test (in-place large FFT) and using > Windows task manager set CPU affinity on win1 of process prime95 to > use only CPU1. On win2 do the same thing but to use only CPU0. Then > start iometer on both VMs using the following parameters: have a > second virtual disk in both VMs (so every windows has 2 virtual disks, > one for Windows and one for iometer), use "# of outstanding I/Os" = 4, > access spec = "All in one". Wait some minutes. Crash!Yes, that''s a very different workload from mine. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhang, Xiantao
2010-Oct-01 04:14 UTC
RE: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
Andreas Kinzler wrote:
> On 30.09.2010 07:00, Zhang, Xiantao wrote:
>> Maybe you can disable pirq_set_affinity to have a try with the following patch.
>> It may trigger IRQ migration in hypervisor, and the IRQ migration logic about (especailly shared)
>> level-triggered ioapic IRQ is not well tested because of no users before. After intoducing the
>> pirq_set_affinity in #Cset21625, the logic is used frequently when vcpu migration occurs
>
> I am using Xen 4.0.1 which is c/s 21324 so I should not be affected?

Which Cset was in use when you collected the suspicious 'irr=1' log? Xen-4.0.1 or 22068? In addition, did you always see the above strange log for every hang? You know, IRQ 16 is assigned a relatively big vector (216); if it is not correctly acked, the other interrupt sources will be masked automatically, so dom0 may hang. Another thing to try is to hack assign_irq_vector to allocate a small vector for IRQ 16, so that when the aacraid controller has something wrong you still have a chance to log on to dom0 and get more information. Besides, could you enable MSI for the aacraid controller to have a try?

>> Besides, there is a bug in event driver which is fixed in latest pv_ops dom0, seems the dom0
>> you are using doesn't include the fix. This bug may result in lost event in dom0 and invoke
>> dom0 hang eventually.
>
> Hmm, this really does not explain why everything is rock solid after disabling HPET broadcast?
> And the problem occurred with every kernel (xenfied, pvops, all versions). Please correct me
> if I am wrong

Just a guess: HPET broadcast may not be the real killer, and it just exposes the bug accidentally, according to the log you attached.

>> To workaround this bug, you can disable irqbalance in dom0. Good luck!
>
> As far as I know I am not using irq balancing (certainly not using the irqbalance daemon).

Okay.
Xiantao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Pasi Kärkkäinen
2010-Dec-31 14:31 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On Thu, Sep 09, 2010 at 11:20:51AM +0200, Andreas Kinzler wrote:> I am talking a while (via email) with Jan now to track the following > problem and he suggested that I report the problem on xen-devel: >...> > I also applied a small patch which disables HPET broadcast. The machine > is now running > for 110 hours without a crash while normally it crashes within a few > minutes. Is there > something wrong (race, deadlock) with HPET broadcasts in relation to > blocked interrupt > reception (see above)? >Hello, Was this issue resolved? Just wondering since many people have reported it on xen-users list.. -- Pasi _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andreas Kinzler
2011-Jan-09 19:10 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 31.12.2010 15:31, Pasi Kärkkäinen wrote:
> On Thu, Sep 09, 2010 at 11:20:51AM +0200, Andreas Kinzler wrote:
>> I am talking a while (via email) with Jan now to track the following
>> problem and he suggested that I report the problem on xen-devel:
>> I also applied a small patch which disables HPET broadcast. The machine
>> is now running for 110 hours without a crash while normally it crashes
>> within a few minutes. Is there something wrong (race, deadlock) with
>> HPET broadcasts in relation to blocked interrupt reception (see above)?
>
> Hello,
> Was this issue resolved? Just wondering since many people
> have reported it on xen-users list..
> -- Pasi

To my knowledge: not at all. Somehow none of the developers took care
of it.

Regards Andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Pasi Kärkkäinen
2011-Jan-09 19:21 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On Sun, Jan 09, 2011 at 08:10:52PM +0100, Andreas Kinzler wrote:
> On 31.12.2010 15:31, Pasi Kärkkäinen wrote:
>> On Thu, Sep 09, 2010 at 11:20:51AM +0200, Andreas Kinzler wrote:
>>> I am talking a while (via email) with Jan now to track the following
>>> problem and he suggested that I report the problem on xen-devel:
>>> I also applied a small patch which disables HPET broadcast. The machine
>>> is now running for 110 hours without a crash while normally it crashes
>>> within a few minutes. Is there something wrong (race, deadlock) with
>>> HPET broadcasts in relation to blocked interrupt reception (see above)?
>
>> Hello,
>> Was this issue resolved? Just wondering since many people
>> have reported it on xen-users list..
>> -- Pasi
>
> To my knowledge: not at all. Somehow none of the developers took care
> of it.
>

So you can still reproduce it with the latest xen-4.0-testing.hg (Xen
4.0.2-rc1-pre) and the latest xen/stable-2.6.32.x pvops dom0 kernel
(2.6.32.27)?

-- Pasi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jan-09 20:04 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 09/01/2011 19:21, "Pasi Kärkkäinen" <pasik@iki.fi> wrote:
>>> Was this issue resolved? Just wondering since many people
>>> have reported it on xen-users list..
>>> -- Pasi
>>
>> To my knowledge: not at all. Somehow none of the developers took care
>> of it.
>>
>
> So you can still reproduce it with the latest xen-4.0-testing.hg (Xen
> 4.0.2-rc1-pre) and the latest xen/stable-2.6.32.x pvops dom0 kernel
> (2.6.32.27)?

Interested in the latest xen-unstable.hg too. With both trees frozen for
the 4.0/4.1 releases very soon, we should disable HPET broadcast if the
bug is still reproducible and no 'proper' fix is forthcoming from the
original authors.

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
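Until a proper fix or a code-level disable lands, an administrator can also
avoid the HPET broadcast path by keeping the hypervisor out of the deep
C-states that require it (HPET broadcast is only engaged as a wakeup timer
when the local APIC timer stops in deep sleep states). The sketch below uses
the existing Xen boot options cpuidle and max_cstate; the specific values
are illustrative and not a recommendation from this thread:

    # GRUB legacy entry - keep Xen out of deep C-states so the local APIC
    # timer keeps running and HPET broadcast is never needed
    kernel /boot/xen.gz cpuidle=off
    # or, less drastically, cap the C-state depth:
    # kernel /boot/xen.gz max_cstate=1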
Andreas Kinzler
2011-Jan-19 10:19 UTC
Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
On 09.01.2011 21:04, Keir Fraser wrote:
>>>> Was this issue resolved? Just wondering since many people
>>>> have reported it on xen-users list..
>>> To my knowledge: not at all. Somehow none of the developers took care
>>> of it.
>> So you can still reproduce it with the latest xen-4.0-testing.hg (Xen
>> 4.0.2-rc1-pre) and the latest xen/stable-2.6.32.x pvops dom0 kernel
>> (2.6.32.27)?
> Interested in the latest xen-unstable.hg too. With both trees frozen for
> the 4.0/4.1 releases very soon, we should disable HPET broadcast if the
> bug is still reproducible and no 'proper' fix is forthcoming from the
> original authors.
> -- Keir

I spent some hours testing, but for some reason I was unable to reproduce
the crash even with the old configuration and my own crash recipe
(http://lists.xensource.com/archives/html/xen-devel/2010-09/msg01755.html).
Quite odd. However, pvops0 2.6.32.18 is still the latest version working at
all on my systems [see the large thread "Xen dom0 crash: "d0:v0: unhandled
page fault (ec=0000)""].

Regards Andreas

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel