Hans Rakers
2007-Dec-17 09:54 UTC
[Xen-users] DomU with PCI passthrough NIC crashes with fatal DMA error
Hi list,

One of my CentOS 5.1 DomUs regularly crashes on network activity. It uses PCI passthrough for direct access to an Intel EtherExpress 100 card (e100 driver).

I managed to grab a kernel crash backtrace through the Xen console:

-----
Fatal DMA error! Please use 'swiotlb=force'
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c:365
invalid opcode: 0000 [1] SMP
last sysfs file: /block/dm-1/range
CPU 0
Modules linked in: autofs4 hidp l2cap bluetooth sunrpc ipv6 dm_multipath parport_pc lp parport e100 mii pcspkr dm_snapshot dm_zero dm_mirror dm_mod xenblk ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.18-53.1.4.el5xen #1
RIP: e030:[<ffffffff8026e9b3>]  [<ffffffff8026e9b3>] dma_map_single+0x16b/0x180
RSP: e02b:ffffffff80616e00  EFLAGS: 00010282
RAX: 000000000000002f RBX: ffff880003c72012 RCX: ffffffff804c9328
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 000000011607e012 R08: ffffffff804c9328 R09: 0000000000001bed
R10: 0000000000004474 R11: ffff8800054ed940 R12: 00000000000005fe
R13: ffff88001ff16070 R14: ffff880006f97012 R15: ffff88001dbf1980
FS:  00002aaaaee6e4e0(0000) GS:ffffffff80599000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff805d8000, task ffffffff804c4a00)
Stack:  ffff88001f53a000 ffff88001dbf1960 ffff88001f53a500 0000000000000000
 ffff88001f53a000 ffffffff880d82f6 ffff88001dbf1960 ffff88001f53a500
 0000000000000000 ffffffff880dac60
Call Trace:
 <IRQ> [<ffffffff880d82f6>] :e100:e100_rx_alloc_skb+0x93/0x116
 [<ffffffff880dac60>] :e100:e100_poll+0x244/0x33b
 [<ffffffff80395fcf>] unmask_evtchn+0x2d/0xd7
 [<ffffffff8020c603>] net_rx_action+0xa8/0x1b4
 [<ffffffff80211f50>] __do_softirq+0x62/0xdd
 [<ffffffff8025dd9c>] call_softirq+0x1c/0x280
 [<ffffffff8026aa98>] do_softirq+0x31/0x98
 [<ffffffff8026a913>] do_IRQ+0xec/0xf5
 [<ffffffff803965ae>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025d8ce>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff80292086>] rcu_pending+0x26/0x50
 [<ffffffff8026be87>] raw_safe_halt+0x84/0xa8
 [<ffffffff80269453>] xen_idle+0x38/0x4a
 [<ffffffff80247ba9>] cpu_idle+0x97/0xba
 [<ffffffff805e2b10>] start_kernel+0x21f/0x224
 [<ffffffff805e21ed>] _sinittext+0x1ed/0x1f3

Code: 0f 0b 68 73 f9 46 80 c2 6d 01 59 5b 48 89 e8 5d 41 5c 41 5d
RIP  [<ffffffff8026e9b3>] dma_map_single+0x16b/0x180
 RSP <ffffffff80616e00>
 <0>Kernel panic - not syncing: Fatal exception
-----

I've tried booting the DomU with the 'swiotlb=force' kernel option, but that crashes the DomU immediately at boot.

I also checked for IRQ conflicts, but there are none afaics:

Dom0:

[root@xen ~]# cat /proc/interrupts
          CPU0      CPU1      CPU2      CPU3
  1:       309         0         0         0  Phys-irq     i8042
  4:    122094         0         0         0  Phys-irq     serial
  6:         5         0         0         0  Phys-irq     floppy
  8:         0         0         0         0  Phys-irq     rtc
  9:         0         0         0         0  Phys-irq     acpi
 14:   4462155         0      1890         0  Phys-irq     ide0
 16:   1075573         0         0        49  Phys-irq     uhci_hcd:usb4, peth0
 18:         0         0         0         0  Phys-irq     uhci_hcd:usb3
 19:         0         0         0         0  Phys-irq     uhci_hcd:usb1, ehci_hcd:usb5
 20:   2306521     24678       376         0  Phys-irq     uhci_hcd:usb2, ahci
256:  50650289         0         0         0  Dynamic-irq  timer0
257:   1256496         0         0         0  Dynamic-irq  resched0
258:        47         0         0         0  Dynamic-irq  callfunc0
259:         0    407003         0         0  Dynamic-irq  resched1
260:         0       115         0         0  Dynamic-irq  callfunc1
261:         0  10477080         0         0  Dynamic-irq  timer1
262:         0         0    241603         0  Dynamic-irq  resched2
263:         0         0       117         0  Dynamic-irq  callfunc2
264:         0         0   1051159         0  Dynamic-irq  timer2
265:         0         0         0     87279  Dynamic-irq  resched3
266:         0         0         0        90  Dynamic-irq  callfunc3
267:         0         0         0    785090  Dynamic-irq  timer3
268:     26156      1368         0         0  Dynamic-irq  xenbus
NMI:         0         0         0         0
LOC:         0         0         0         0
ERR:         0
MIS:         0

DomU:

[root@svn ~]# cat /proc/interrupts
          CPU0
 21:       719  Phys-irq     eth0
256:      4140  Dynamic-irq  timer0
257:         0  Dynamic-irq  resched0
258:         0  Dynamic-irq  callfunc0
259:       369  Dynamic-irq  xenbus
260:       309  Dynamic-irq  xencons
261:       188  Dynamic-irq  xenfb
262:         0  Dynamic-irq  xenkbd
263:      5810  Dynamic-irq  blkif
NMI:         0
LOC:         0
ERR:         0
MIS:         0

Dom0 also runs CentOS 5.1 btw.

Anyone have any ideas how to deal with this? Thanks for your time.

With kind regards,

Hans Rakers

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
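
The IRQ check described in the message above can also be done mechanically. What follows is only a minimal Python sketch, assuming the /proc/interrupts layout shown in that message (IRQ number, per-CPU counters, an IRQ type such as Phys-irq, then a comma-separated device list). The names shared_irqs and SAMPLE are illustrative and not part of any Xen or kernel tool, and SAMPLE is just a handful of rows copied from the Dom0 listing.

```python
import re
from typing import Dict, List

# A few rows copied from the Dom0 /proc/interrupts listing in the post.
SAMPLE = """\
          CPU0      CPU1      CPU2      CPU3
 14:   4462155         0      1890         0  Phys-irq     ide0
 16:   1075573         0         0        49  Phys-irq     uhci_hcd:usb4, peth0
 18:         0         0         0         0  Phys-irq     uhci_hcd:usb3
 19:         0         0         0         0  Phys-irq     uhci_hcd:usb1, ehci_hcd:usb5
 20:   2306521     24678       376         0  Phys-irq     uhci_hcd:usb2, ahci
256:  50650289         0         0         0  Dynamic-irq  timer0
NMI:         0         0         0         0
"""


def shared_irqs(text: str) -> Dict[str, List[str]]:
    """Return {irq: [devices]} for numbered IRQ lines listing more than one device."""
    shared: Dict[str, List[str]] = {}
    for line in text.splitlines():
        # Only numbered IRQ rows match; the CPU header and NMI/LOC/ERR/MIS rows are skipped.
        match = re.match(r"\s*(\d+):\s+(.*)$", line)
        if not match:
            continue
        irq, rest = match.groups()
        fields = rest.split()
        # Skip the leading per-CPU counters (purely numeric fields).
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        # fields[i] is the IRQ type; everything after it is the device list.
        device_text = " ".join(fields[i + 1:])
        devices = [d.strip() for d in device_text.split(",") if d.strip()]
        if len(devices) > 1:
            shared[irq] = devices
    return shared


result = shared_irqs(SAMPLE)
for irq, devices in result.items():
    print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

To check a live system, the same function could be fed the contents of /proc/interrupts read from inside Dom0 and the DomU. A shared line is not in itself an error; the sketch only lists candidates worth a closer look.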
Hans Rakers
2007-Dec-17 11:23 UTC
Re: [Xen-users] DomU with PCI passthrough NIC crashes with fatal DMA error
In addition, I just tried the CentOS Plus Xen kernel, which has the eepro100 driver. Using eepro100 it craps out during boot while configuring the interface. Seems there are some major DMA issues when using PCI passthrough NICs :{

Backtrace using eepro100 follows:

Bringing up loopback interface:  [ OK ]
Bringing up interface eth0:  Fatal DMA error! Please use 'swiotlb=force'
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c:365
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/eth0/address
CPU 0
Modules linked in: ipv6 dm_multipath parport_pc lp parport eepro100 mii pcspkr dm_snapshot dm_zero dm_mirror dm_mod xenblk ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 1035, comm: arping Not tainted 2.6.18-53.1.4.el5.centos.plusxen #1
RIP: e030:[<ffffffff8026e9b3>]  [<ffffffff8026e9b3>] dma_map_single+0x16b/0x180
RSP: e02b:ffff88001d55bb98  EFLAGS: 00010086
RAX: 000000000000002f RBX: ffff880000df7c02 RCX: ffff88001ff1f070
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: 00000001141dec02 R08: ffff88001f1a79f8 R09: ffff88001d55bb08
R10: 0000000000001b3c R11: ffff88001f4c7078 R12: 000000000000002a
R13: ffff88001ff1f070 R14: ffff88001e47b4c0 R15: ffffc200001ea000
FS:  00002aaaab22baf0(0000) GS:ffffffff8059b000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process arping (pid: 1035, threadinfo ffff88001d55a000, task ffff88001f4e0040)
Stack:  00000000000004d0 ffff88001ed4e000 ffff88001f53d500 0000000000000100
 ffff88001f53d000 ffffffff880da78c 0000000000000000 ffff88001f53d000
 ffff88001e47b4c0 0000000000000000
Call Trace:
 [<ffffffff880da78c>] :eepro100:speedo_start_xmit+0x136/0x274
 [<ffffffff8040df2a>] __qdisc_run+0xf6/0x1bb
 [<ffffffff8022fdf5>] dev_queue_xmit+0x1ee/0x313
 [<ffffffff8044b5f1>] packet_sendmsg+0x216/0x26c
 [<ffffffff802538c4>] sock_sendmsg+0xf3/0x110
 [<ffffffff80294356>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8026190f>] _read_lock_irq+0x9/0x19
 [<ffffffff802071cf>] find_get_page+0x44/0x4b
 [<ffffffff8021330d>] filemap_nopage+0x188/0x322
 [<ffffffff80208e30>] __handle_mm_fault+0x668/0xf4d
 [<ffffffff803f76a5>] sys_sendto+0x11c/0x14f
 [<ffffffff803f7894>] move_addr_to_user+0x5d/0x78
 [<ffffffff8026187d>] _spin_lock_irq+0x9/0x14
 [<ffffffff80228ace>] do_sigaction+0x189/0x19d
 [<ffffffff802409af>] do_ioctl+0x21/0x6b
 [<ffffffff8025d102>] system_call+0x86/0x8b
 [<ffffffff8025d07c>] system_call+0x0/0x8b

Code: 0f 0b 68 38 1b 47 80 c2 6d 01 59 5b 48 89 e8 5d 41 5c 41 5d
RIP  [<ffffffff8026e9b3>] dma_map_single+0x16b/0x180
 RSP <ffff88001d55bb98>
 <0>Kernel panic - not syncing: Fatal exception

Hans Rakers wrote:
> Hi list,
>
> One of my CentOS 5.1 DomU's is regularly crashing on network activity.
> It has a pci passthrough for direct communication to a Intel
> Etherexpress 100 card (e100 driver).
>
> I managed to grab a kernel crash backtrace through the xen console:
>