Anthony PERARD
2011-Feb-04 16:34 UTC
[Xen-devel] Xen panic on guest shutdown with PCI Passthrough
Hi,

Each time I shut down a guest with a passed-through PCI device, Xen panics on an NMI - MEMORY ERROR.

guest os: debian lenny with default kernel
dom0: linux 2.6.32

Xen serial log:

(XEN) NMI - MEMORY ERROR
(XEN) ----[ Xen-4.1.0-rc3-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
(XEN) RFLAGS: 0000000000000082 CONTEXT: hypervisor
(XEN) rax: 00000000ffffffff rbx: ffff83023f324190 rcx: 0000000000000001
(XEN) rdx: 0000000000000082 rsi: 0000000000000001 rdi: ffff830232481838
(XEN) rbp: ffff82c480297d38 rsp: ffff82c480297d08 r8: 0000000000000002
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: ffff83010c1e9c70 r13: ffff83023f324000 r14: ffff83010c1e9c20
(XEN) r15: 0000000000000001 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000022c9aa000 cr2: 00000000cb74379c
(XEN) ds: 007b es: 007b fs: 00d8 gs: 00e0 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff82c480297d08:
(XEN) ffff82c480297d48 ffff83023f324190 ffff83010c1e9c70 ffff83023f324000
(XEN) ffff830232481800 0000000000000036 ffff82c480297d48 ffff82c48015f087
(XEN) ffff82c480297da8 ffff82c480162fb1 000000000000002f ffff830232481834
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffff83023f324190
(XEN) 0000000000000036 ffff83023f324000 ffff830232481800 ffffffffffffffff
(XEN) ffff82c480297e08 ffff82c48016315c 0000000000000282 0000000000000036
(XEN) 00000000000000d8 ffff8302324808b4 010000000000002f 0000000000000036
(XEN) ffff83023f324000 ffff83023f324190 ffff82c480297f18 ffffffffffffffff
(XEN) ffff82c480297e38 ffff82c480163321 ffff83023f324000 ffff8300bf4f0000
(XEN) ffff83023f324000 ffff83023f324000 ffff82c480297e58 ffff82c48015642a
(XEN) ffff8300bf4f0000 00000000ffffffff ffff82c480297e88 ffff82c480104c59
(XEN) 0000000000000000 0000000000000000 ffff82c4802d4020 0000000000000000
(XEN) ffff82c480297eb8 ffff82c48012adfd ffff82c480123327 0000000000000000
(XEN) 0000000000000000 ffff82c4802b0880 ffff82c480297ef8 ffff82c480123327
(XEN) ffff82c4802d3ec0 ffff8300bf2f2000 0000000000000000 ffff8300bf4d6000
(XEN) 0000029b7b0912f0 ffff82c4802d3ec0 ffff82c480297f08 ffff82c4801233a2
(XEN) ffff82c480297d20 ffff82c480211516 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 00000000deadbeef
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 00000000deadbeef 0000000000000001 0000000000000002
(XEN) Xen call trace:
(XEN) [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
(XEN) [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
(XEN) [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
(XEN) [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
(XEN) [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
(XEN) [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
(XEN) [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
(XEN) [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
(XEN) [<ffff82c480123327>] __do_softirq+0x88/0x99
(XEN) [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 2 (nmi)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************

qemu log:

Using file /dev/xen/blktap-2/tapdev0 in read-write mode
Watching /local/domain/0/device-model/1/logdirty/cmd
Watching /local/domain/0/device-model/1/command
Watching /local/domain/1/cpu
char device redirected to /dev/pts/4
qemu_map_cache_init nr_buckets = 4000 size 327680
shared page at pfn feffd
buffered io page at pfn feffb
Guest uuid = 3e68da1b-bec3-44a0-84cf-d3f7e341f5a6
populating video RAM at ff000000
mapping video RAM from ff000000
Register xen platform.
Done register platform.
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
xs_read(/local/domain/0/device-model/1/xen_extended_power_mgmt): read error
xs_read(): vncpasswd get error. /vm/3e68da1b-bec3-44a0-84cf-d3f7e341f5a6/vncpasswd.
Log-dirty: no command yet.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
dm-command: hot insert pass-through pci dev
register_real_device: Assigning real physical device 02:00.0 ...
pt_iomul_init: Error: pt_iomul_init can't open file /dev/xen/pci_iomul: No such file or directory: 0x2:0x0.0x0
pt_register_regions: IO region registered (size=0x02000000 base_addr=0xda000004)
pt_msix_init: get MSI-X table bar base da000000
pt_msix_init: table_off = c000, total_entries = 9
pt_msix_init: errno = 2
pt_msix_init: mapping physical MSI-X table to b7880000
pci_intx: intx=1
register_real_device: Real physical device 02:00.0 registered successfuly!
IRQ type = INTx
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
vcpu-set: watch node error.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read(/local/domain/1/log-throttling): read error
qemu: ignoring not-understood drive `/local/domain/1/log-throttling'
medium change watch on `/local/domain/1/log-throttling' - unknown device, ignored
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
cirrus vga map change while on lfb mode
pt_iomem_map: e_phys=f2000000 maddr=da000000 type=0 len=33554432 index=0 first_map=1
mapping vram to f0000000 - f0400000
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is ro state.
pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:04.0][Offset:30h][Length:4]
pt_msix_update_one: Update msix entry 0 with pirq 37 gvec b1
pt_msix_update_one: Update msix entry 1 with pirq 36 gvec b9
shutdown requested in cpu_handle_ioreq
Issued domain 1 poweroff
dm-command: hot remove pass-through pci dev

--
Anthony PERARD
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jan Beulich
2011-Feb-07 09:27 UTC
Re: [Xen-devel] Xen panic on guest shutdown with PCI Passthrough
>>> On 04.02.11 at 17:34, Anthony PERARD <anthony.perard@citrix.com> wrote:
> Hi,
>
> Each time I shutdown a guest with a passthrough pci device, Xen panic
> on a NMI - MEMORY ERROR.

And this happening in msi_set_mask_bit() is consistent? Always at the same RIP, or (slightly) varying? If always at the same address, could you disassemble the respective instruction so we know what address is being read/written?

And quite certainly only on one (type of) machine?

Jan
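To act on the disassembly request, it helps to know where the faulting function starts and ends. A minimal sketch, assuming the usual `symbol+offset/size` notation of the Xen call trace: subtracting the offset from the reported address gives the function start, and adding the size gives its end. The helper name `symbol_range` is hypothetical and only serves this illustration.

```python
import re

# Matches one frame of a Xen call trace, for example:
#   [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>\w+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def symbol_range(frame):
    """Return (symbol, start, end) of the function containing the frame address.

    Hypothetical helper: start = address - offset, end = start + size.
    """
    m = FRAME_RE.search(frame)
    if m is None:
        raise ValueError(f"not a call-trace frame: {frame!r}")
    addr = int(m["addr"], 16)
    start = addr - int(m["off"], 16)
    end = start + int(m["size"], 16)
    return m["sym"], start, end


# Frame copied from the serial log in the first report.
sym, start, end = symbol_range(
    "(XEN) [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121"
)
print(f"{sym}: {start:#x}..{end:#x}")
```

The resulting range could then be passed to a disassembler, for instance through objdump's `--start-address` and `--stop-address` options, run against the hypervisor image with symbols. The thread does not say which image file was used, so that part is left open.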
Anthony PERARD
2011-Feb-07 15:06 UTC
Re: [Xen-devel] Xen panic on guest shutdown with PCI Passthrough
On Mon, Feb 7, 2011 at 09:27, Jan Beulich <JBeulich@novell.com> wrote:
>>>> On 04.02.11 at 17:34, Anthony PERARD <anthony.perard@citrix.com> wrote:
>> Hi,
>>
>> Each time I shutdown a guest with a passthrough pci device, Xen panic
>> on a NMI - MEMORY ERROR.
>
> And this happening in msi_set_mask_bit() is consistent? Always at
> the same RIP, or (slightly) varying? If always at the same address,
> could you disassemble the respective instruction so we know what
> address is being read/written?

Unfortunately, this error has happened in many places, even in default_idle.

Maybe this can help:
(XEN) irq.c:1585: dom1: forcing unbind of pirq 16
(XEN) irq.c:1585: dom1: forcing unbind of pirq 54
The error happened just after these debug prints.

> And quite certainly only on one (type of) machine?

Yes, only one machine; I didn't try on any other machine.

--
Anthony
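One way to relate the "forcing unbind of pirq 54" message to the first report is to note that 54 is 0x36 in hexadecimal and to look for that value in the stack dump. The sketch below does this for two rows copied from that dump; the helper `find_value` is hypothetical. A match only suggests that the same pirq was being handled when the NMI arrived. The qemu log also mentions "pirq 36", but whether qemu prints that number in hex is an assumption.

```python
# Two rows of the stack dump from the first report, copied verbatim.
STACK_ROWS = [
    "ffff830232481800 0000000000000036 ffff82c480297d48 ffff82c48015f087",
    "ffff82c480297da8 ffff82c480162fb1 000000000000002f ffff830232481834",
]


def find_value(rows, value):
    """Return (row, column) positions of the stack words equal to value."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, word in enumerate(row.split())
        if int(word, 16) == value
    ]


pirq = 54  # from "(XEN) irq.c:1585: dom1: forcing unbind of pirq 54"
print(hex(pirq), find_value(STACK_ROWS, pirq))
```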
Anthony PERARD
2011-Feb-15 15:35 UTC
Re: [Xen-devel] Xen panic on guest shutdown with PCI Passthrough
On Mon, 7 Feb 2011, Anthony PERARD wrote:
> On Mon, Feb 7, 2011 at 09:27, Jan Beulich <JBeulich@novell.com> wrote:
> >>>> On 04.02.11 at 17:34, Anthony PERARD <anthony.perard@citrix.com> wrote:
> >> Hi,
> >>
> >> Each time I shutdown a guest with a passthrough pci device, Xen panic
> >> on a NMI - MEMORY ERROR.
> >
> > And this happening in msi_set_mask_bit() is consistent? Always at
> > the same RIP, or (slightly) varying? If always at the same address,
> > could you disassemble the respective instruction so we know what
> > address is being read/written?
>
> Unfortunately, this error happened in many place, even in default_idle.
>
> Maybe this can help:
> (XEN) irq.c:1585: dom1: forcing unbind of pirq 16
> (XEN) irq.c:1585: dom1: forcing unbind of pirq 54
> the error happened just after these debug print.
>
> > And quite certainly only on one (type of) machine?
>
> Yes, only one machine, I didn't try on other machine.

I tried 'xl pci-attach' and 'xl pci-detach' with the acpiphp module loaded in the guest, and this works fine. But Xen says:

(XEN) domctl.c:992:d0 memory_map:remove: gfn=20000 mfn=da000 nr_mfns=2000
(XEN) p2m.c:2723:d0 clear_mmio_p2m_entry: gfn_to_mfn failed! gfn=0002000c

The error happens only when the PCI device is still attached to the guest when I type 'halt' in the guest.

--
Anthony PERARD
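The frame numbers in these two messages can be turned into byte addresses and compared with the qemu log from the first report. A minimal sketch, assuming 4 KiB pages and that the values are printed in hex; all names are hypothetical. Whether the page it singles out is the cause of the NMI is not established in this thread.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

# Values copied from the two (XEN) messages above, read as hex.
base_gfn = 0x20000      # memory_map:remove: gfn=20000
base_mfn = 0xDA000      # memory_map:remove: mfn=da000
nr_mfns = 0x2000        # memory_map:remove: nr_mfns=2000
failing_gfn = 0x2000C   # clear_mmio_p2m_entry: gfn=0002000c


def frame_to_addr(frame):
    """Convert a page frame number to a byte address."""
    return frame << PAGE_SHIFT


region_bytes = frame_to_addr(nr_mfns)
machine_base = frame_to_addr(base_mfn)
offset_in_region = frame_to_addr(failing_gfn - base_gfn)

print(f"region size      : {region_bytes:#x} ({region_bytes} bytes)")
print(f"machine base     : {machine_base:#x}")
print(f"offset of failure: {offset_in_region:#x}")
```

Under these assumptions the region is 0x2000000 bytes (33554432) at machine address 0xda000000, which agrees with `size=0x02000000` and `maddr=da000000` in the qemu log, and the failing gfn lies 0xc000 bytes into the region, the same value qemu reports as `table_off = c000` for the MSI-X table.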