Jeremy Fitzhardinge
2010-May-27 19:51 UTC
[Xen-devel] xen crash related to pci passthrough
I just got this Xen crash (below) by:

1. unbind device from dom0 driver:
   dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/e1000e/unbind
2. bind to pciback
   dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/pciback/new_slot
   dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/pciback/bind
3. attach device to PV domU
   dom0# xl pci-attach f13pv64 0000:03:00.0
4. unbind from pciback
   dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/pciback/unbind
5. rmmod driver in domU
   domU# rmmod e1000e
6. Crash!

The device in question is an Intel 82574L ethernet controller, using
msi-x interrupts.

    J

(XEN) ----[ Xen-4.1-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c48011efe3>] check_lock+0x1b/0x45
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: 0000000000000001 rbx: ffff83813ff7ed34 rcx: ffff830101260190
(XEN) rdx: 0000000000000001 rsi: 0000000000000037 rdi: ffff83813ff7ed38
(XEN) rbp: ffff83013feffd68 rsp: ffff83013feffd68 r8: 00000000deadbeef
(XEN) r9: 00000000deadbeef r10: ffff82c480204020 r11: 0000000000000202
(XEN) r12: 0000000000000286 r13: ffff83813ff7ed34 r14: 00000000000000dc
(XEN) r15: ffff83813ff7ed00 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000011c405000 cr2: ffff83813ff7ed38
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83013feffd68:
(XEN) ffff83013feffd88 ffff82c48011f3b2 ffff830101260000 00000000ffffffd9
(XEN) ffff83013feffdd8 ffff82c480157cfe ffff83013feffdb8 0000000000000000
(XEN) ffff830101260000 ffff830101260000 000000000000000e 0000000000000037
(XEN) ffff83013c624160 0000000000000000 ffff83013feffe08 ffff82c48015a1d2
(XEN) ffff830101260000 000000000000000e ffff82c4801fe3a0 ffff83013c624160
(XEN) ffff83013feffe68 ffff82c48010513a ffff83013feffe38 0000000000000150
(XEN) ffff830101260190 0000000000000012 ffff83013feffe88 ffffffffffffffda
(XEN) ffff880011d91c38 0000000000000037 ffff88001eecda70 ffff88001e2e0000
(XEN) ffff83013fefff08 ffff82c4801068b2 ffff83013fefff28 ffff83000fbe8000
(XEN) 00000000000005f2 0000000000000002 ffff83013fefff08 ffff82c4801f7dff
(XEN) ffff88000774b000 ffffffff811e3e83 000000000000000e 0000000000000000
(XEN) ffffffff811e3db2 000000000000e030 0000000000010202 ffff83000fbe8000
(XEN) 000000000000000e 0000000000000037 ffff88001eecda70 ffff88001e2e0000
(XEN) 00007cfec01000b7 ffff82c4801f2fbf ffffffff8100940a 0000000000000020
(XEN) ffff88001e2e0000 ffff88001eecda70 0000000000000037 000000000000000e
(XEN) ffff880011d91c58 ffff88001fc0c370 0000000000000202 ffffffff8108f6bb
(XEN) 0000000000000001 ffff880011eea698 0000000000000020 ffffffff8100940a
(XEN) ffffffff81236d51 ffff880011d91c38 0000000000000003 0000010000000000
(XEN) ffffffff8100940a 000000000000e033 0000000000000202 ffff880011d91c00
(XEN) 000000000000e02b 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c48011efe3>] check_lock+0x1b/0x45
(XEN) [<ffff82c48011f3b2>] _spin_lock_irqsave+0x21/0x67
(XEN) [<ffff82c480157cfe>] domain_spin_lock_irq_desc+0x49/0x9b
(XEN) [<ffff82c48015a1d2>] pirq_guest_unbind+0x5c/0x10a
(XEN) [<ffff82c48010513a>] __evtchn_close+0xd5/0x304
(XEN) [<ffff82c4801068b2>] do_event_channel_op+0xaa8/0xeb6
(XEN) [<ffff82c4801f2fbf>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from ffff83813ff7ed38:
(XEN) L4[0x107] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: ffff83813ff7ed38
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
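As a reading aid for the "Pagetable walk" lines in the log above, here is a minimal sketch of how the top-level index is derived from a faulting address. It assumes standard x86-64 4-level paging, where bits 47..39 of the virtual address select the L4 entry. The function name l4_index is made up for illustration and is not a Xen function.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Xen): index into the top-level (L4)
 * page table under x86-64 4-level paging. The index is the 9 bits at
 * positions 47..39 of the virtual address. */
unsigned int l4_index(uint64_t va)
{
    return (unsigned int)((va >> 39) & 0x1ff);
}
```

Calling l4_index(0xffff83813ff7ed38ULL) gives 0x107, which matches the "L4[0x107]" line in the log. That entry is reported as zero, so the walk stops at the top level: nothing is mapped at the faulting address.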
On 27/05/2010 20:51, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:

> 3. attach device to PV domU
>    dom0# xl pci-attach f13pv64 0000:03:00.0
> 4. unbind from pciback
>    dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/pciback/unbind
> 5. rmmod driver in domU
>    domU# rmmod e1000e
> 6. Crash!
>
> The device in question is an Intel 82574L ethernet controller, using
> msi-x interrupts.

What happens if you pci-detach before unbinding from pciback? Not that Xen
should crash of course, but a crash resulting from a mistake in detach
ordering in dom0 would be less worrying than some alternatives.

 -- Keir
Jeremy Fitzhardinge
2010-May-27 21:02 UTC
Re: [Xen-devel] xen crash related to pci passthrough
On 05/27/2010 01:02 PM, Keir Fraser wrote:
> On 27/05/2010 20:51, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>
>> 3. attach device to PV domU
>>    dom0# xl pci-attach f13pv64 0000:03:00.0
>> 4. unbind from pciback
>>    dom0# echo 0000:03:00.0 > /sys/bus/pci/drivers/pciback/unbind
>> 5. rmmod driver in domU
>>    domU# rmmod e1000e
>> 6. Crash!
>>
>> The device in question is an Intel 82574L ethernet controller, using
>> msi-x interrupts.
>>
> What happens if you pci-detach before unbinding from pciback? Not that Xen
> should crash of course, but a crash resulting from a mistake in detach
> ordering in dom0 would be less worrying than some alternatives.
>

Doing things in a more sensible order (rmmod in domU, detach, unbind)
seems to work OK. I was deliberately seeing what would happen if I
tried pulling the device out from under a domain.

    J
This is only tangentially related, but...

In xen/arch/x86/irq.c:__pirq_guest_unbind() this code looks wrong to me:

    memmove(&action->guest[i], &action->guest[i+1], IRQ_MAX_GUESTS-i-1);

Should it be:

    memmove(&action->guest[i], &action->guest[i+1],
            sizeof(action->guest[0])*(IRQ_MAX_GUESTS-i-1));

?

Regards,

Alex
On 28/05/2010 10:28, "Alex Zeffertt" <Alex.Zeffertt@eu.citrix.com> wrote:

> This is only tangentially related, but...
>
> In xen/arch/x86/irq.c:__pirq_guest_unbind() this code looks wrong to me:
>
>     memmove(&action->guest[i], &action->guest[i+1], IRQ_MAX_GUESTS-i-1);
>
> Should it be:
>
>     memmove(&action->guest[i], &action->guest[i+1],
>             sizeof(action->guest[0])*(IRQ_MAX_GUESTS-i-1));

Yes, that must be one of the oldest bugs ever!

 Thanks,
 Keir
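For readers following the memmove() point above: the third argument of memmove() is a byte count, not an element count, so shifting array entries needs the sizeof factor. The following is a minimal standalone sketch of the corrected pattern, not the Xen code itself. MAX_GUESTS, struct guest and remove_entry are hypothetical stand-ins for IRQ_MAX_GUESTS, the entries of action->guest[] and the surrounding function, and clearing the last slot is an addition of this sketch.

```c
#include <stddef.h>
#include <string.h>

#define MAX_GUESTS 7          /* hypothetical stand-in for IRQ_MAX_GUESTS */

struct guest { int id; };     /* hypothetical stand-in for an array entry's target */

/* Remove entry i from a fixed-size array of pointers by shifting the
 * tail down one slot. memmove() counts bytes, so the number of elements
 * to move is multiplied by the size of one element. */
void remove_entry(struct guest *guest[MAX_GUESTS], size_t i)
{
    memmove(&guest[i], &guest[i + 1],
            sizeof(guest[0]) * (MAX_GUESTS - i - 1));
    guest[MAX_GUESTS - 1] = NULL;   /* sketch-only: clear the vacated last slot */
}
```

Without the sizeof factor, the call would move only MAX_GUESTS-i-1 bytes. With pointer-sized entries that is less than the tail of the array whenever more than one entry follows index i, so the remaining entries would end up partially copied.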