Riccardo Ravaioli
2020-Apr-07 16:05 UTC
"failed to setup INTx fd: Operation not permitted" error when using PCI passthrough
Hi,

I'm on a Dell VEP 1405 running Debian 9.11, and I'm running a few tests in which various interfaces are handed via PCI passthrough to a qemu/KVM virtual machine, also running Debian 9.11. I noticed that only one of the four I350 ports can be used in PCI passthrough. The available interfaces are:

# dpdk-devbind.py --status

Network devices using kernel driver
===================================
0000:02:00.0 'I350 Gigabit Network Connection 1521' if=eth2 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.1 'I350 Gigabit Network Connection 1521' if=eth3 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.2 'I350 Gigabit Network Connection 1521' if=eth0 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:02:00.3 'I350 Gigabit Network Connection 1521' if=eth1 drv=igb unused=igb_uio,vfio-pci,uio_pci_generic
0000:04:00.0 'QCA986x/988x 802.11ac Wireless Network Adapter 003c' if= drv=ath10k_pci unused=igb_uio,vfio-pci,uio_pci_generic
0000:05:00.0 'Device 15c4' if=eth7 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:05:00.1 'Device 15c4' if=eth6 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:07:00.0 'Device 15e5' if=eth5 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic
0000:07:00.1 'Device 15e5' if=eth4 drv=ixgbe unused=igb_uio,vfio-pci,uio_pci_generic

If I try PCI passthrough on 02:00.2 (eth0), it works fine. With any of the remaining three interfaces, libvirt fails with this error:

# virsh create vnf.xml
error: Failed to create domain from vnf.xml
error: internal error: process exited while connecting to monitor: 2020-04-06T16:08:47.048266Z qemu-system-x86_64: -device vfio-pci,host=02:00.1,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:02:00.1: failed to setup INTx fd: Operation not permitted

The contents of vnf.xml are available here: https://pastebin.com/rT3RmAi5
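For readers who don't want to follow the pastebin link: the passthrough part of the domain definition is a standard libvirt <hostdev> stanza. The following is a generic sketch rather than the literal contents of vnf.xml; the guest slot 0x05 corresponds to the addr=0x5 in the qemu error above, and managed='yes' makes libvirt rebind the device from igb to vfio-pci automatically when the domain starts.

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <!-- host device: 0000:02:00.1 -->
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
  </source>
  <!-- guest PCI address, matching addr=0x5 on the qemu command line -->
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</hostdev>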
This is what happened in dmesg when I tried to start the VM:

[ 7305.371730] igb 0000:02:00.1: removed PHC on eth3
[ 7307.085618] ACPI Warning: \_SB.PCI0.PEX2._PRT: Return Package has no elements (empty) (20160831/nsprepkg-130)
[ 7307.085717] pcieport 0000:00:0b.0: can't derive routing for PCI INT B
[ 7307.085719] vfio-pci 0000:02:00.1: PCI INT B: no GSI
[ 7307.369611] igb 0000:02:00.1: enabling device (0400 -> 0402)
[ 7307.369668] ACPI Warning: \_SB.PCI0.PEX2._PRT: Return Package has no elements (empty) (20160831/nsprepkg-130)
[ 7307.369764] pcieport 0000:00:0b.0: can't derive routing for PCI INT B
[ 7307.369766] igb 0000:02:00.1: PCI INT B: no GSI
[ 7307.426266] igb 0000:02:00.1: added PHC on eth3
[ 7307.426269] igb 0000:02:00.1: Intel(R) Gigabit Ethernet Network Connection
[ 7307.426271] igb 0000:02:00.1: eth3: (PCIe:5.0Gb/s:Width x2) 50:9a:4c:ee:9f:b1
[ 7307.426350] igb 0000:02:00.1: eth3: PBA No: 106300-000
[ 7307.426352] igb 0000:02:00.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)

These are all the messages related to that device in dmesg before I tried to start the VM:

# dmesg | grep 02:00.1
[    0.185301] pci 0000:02:00.1: [8086:1521] type 00 class 0x020000
[    0.185317] pci 0000:02:00.1: reg 0x10: [mem 0xdfd40000-0xdfd5ffff]
[    0.185334] pci 0000:02:00.1: reg 0x18: [io  0xd040-0xd05f]
[    0.185343] pci 0000:02:00.1: reg 0x1c: [mem 0xdfd88000-0xdfd8bfff]
[    0.185434] pci 0000:02:00.1: PME# supported from D0 D3hot D3cold
[    0.185464] pci 0000:02:00.1: reg 0x184: [mem 0xdeea0000-0xdeea3fff 64bit pref]
[    0.185467] pci 0000:02:00.1: VF(n) BAR0 space: [mem 0xdeea0000-0xdeebffff 64bit pref] (contains BAR0 for 8 VFs)
[    0.185486] pci 0000:02:00.1: reg 0x190: [mem 0xdee80000-0xdee83fff 64bit pref]
[    0.185488] pci 0000:02:00.1: VF(n) BAR3 space: [mem 0xdee80000-0xdee9ffff 64bit pref] (contains BAR3 for 8 VFs)
[    0.334021] DMAR: Hardware identity mapping for device 0000:02:00.1
[    0.334463] iommu: Adding device 0000:02:00.1 to group 16
[    0.398809] pci 0000:02:00.1: Signaling PME through PCIe PME interrupt
[    2.588049] igb 0000:02:00.1: PCI INT B: not connected
[    2.643900] igb 0000:02:00.1: added PHC on eth1
[    2.643903] igb 0000:02:00.1: Intel(R) Gigabit Ethernet Network Connection
[    2.643905] igb 0000:02:00.1: eth1: (PCIe:5.0Gb/s:Width x2) 50:9a:4c:ee:9f:b1
[    2.643984] igb 0000:02:00.1: eth1: PBA No: 106300-000
[    2.643986] igb 0000:02:00.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[    2.873544] igb 0000:02:00.1 rename3: renamed from eth1
[    2.939352] igb 0000:02:00.1 eth3: renamed from rename3

In particular, this line looks suspicious:

igb 0000:02:00.1: PCI INT B: not connected

The full dmesg is available here: https://pastebin.com/kPbUAKCi

This is the PCI bus structure:

# lspci -tv
-[0000:00]-+-00.0  Intel Corporation Device 1980
           +-04.0  Intel Corporation Device 19a1
           +-05.0  Intel Corporation Device 19a2
           +-06.0-[01]----00.0  Intel Corporation Device 19e2
           +-0b.0-[02-03]--+-00.0  Intel Corporation I350 Gigabit Network Connection
           |               +-00.1  Intel Corporation I350 Gigabit Network Connection
           |               +-00.2  Intel Corporation I350 Gigabit Network Connection
           |               \-00.3  Intel Corporation I350 Gigabit Network Connection
           +-0f.0-[04]----00.0  Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter
           +-12.0  Intel Corporation DNV SMBus Contoller - Host
           +-13.0  Intel Corporation DNV SATA Controller 0
           +-15.0  Intel Corporation Device 19d0
           +-16.0-[05-06]--+-00.0  Intel Corporation Device 15c4
           |               \-00.1  Intel Corporation Device 15c4
           +-17.0-[07-08]--+-00.0  Intel Corporation Device 15e5
           |               \-00.1  Intel Corporation Device 15e5
           +-18.0  Intel Corporation Device 19d3
           +-1c.0  Intel Corporation Device 19db
           +-1f.0  Intel Corporation DNV LPC or eSPI
           +-1f.2  Intel Corporation Device 19de
           +-1f.4  Intel Corporation DNV SMBus controller
           \-1f.5  Intel Corporation DNV SPI Controller

Looking at lspci -v, there's something going on with the IRQ field of exactly the three devices I can't use in PCI passthrough ("IRQ -2147483648"):

# lspci -v | grep -A1 I350
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
        Flags: bus master, fast devsel, latency 0, IRQ -2147483648
--
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
        Flags: bus master, fast devsel, latency 0, IRQ -2147483648
--
02:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
        Flags: bus master, fast devsel, latency 0, IRQ 18
--
02:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
        Flags: bus master, fast devsel, latency 0, IRQ -2147483648

(-2147483648 is 0x80000000 read as a signed 32-bit integer, which, if I'm not mistaken, is the kernel's IRQ_NOTCONNECTED marker, defined as (1U << 31) in include/linux/irq.h. That would line up with the "PCI INT B: not connected" message above.)

Finally, every I350 interface has its own IOMMU group in /sys/kernel/iommu_groups/.
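To double-check the grouping, a small loop over sysfs is enough; a minimal sketch, assuming lspci is available:

# List every IOMMU group and the devices it contains
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=${dev#/sys/kernel/iommu_groups/}  # e.g. 16/devices/0000:02:00.1
    group=${group%%/*}                      # keep only the group number
    printf 'IOMMU group %s: %s\n' "$group" "$(lspci -nns "${dev##*/}")"
done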
The kernel on the host machine is 4.9.189 and my libvirt version is 4.3.0.

Any thoughts on this? Is there something I should enable in the BIOS or in the kernel to make this work?

Thanks!

Regards,
Riccardo Ravaioli
Riccardo Ravaioli
2020-Apr-30 08:00 UTC
Re: "failed to setup INTx fd: Operation not permitted" error when using PCI passthrough
So ultimately the problem was somewhere in the BIOS. A BIOS update fixed the issue.

Riccardo
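For anyone who lands on this thread later: a simple way to confirm which BIOS revision is actually running before and after such an update is to query the SMBIOS tables with dmidecode (as root):

# dmidecode -s bios-version
# dmidecode -s bios-release-date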