Thomas Kuther
2013-Nov-15 13:35 UTC
[libvirt-users] Using hostdev to plug a PCI-E host device into Q35 pcie-root port
Hello,

I'm trying to migrate a working qemu command line configuration to libvirt. The part I'm currently failing on is:

  $ qemu-system-x86_64 -M Q35 ... -device vfio-pci,host=05:00.0,bus=pcie.0

The right way to translate this into libvirt XML seems to be <hostdev>, but I am unable to plug it into the pcie-root port.

This is what the interesting part looks like when I let "virsh edit" generate an <address>:

  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
  </controller>
  [...]
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
  </hostdev>
  [...]

To my understanding, this will plug the host device into the pci-bridge controller. The guest OS doesn't boot with this and resets right after the BIOS.

Manually setting

  <address type='pci' domain='0x0000' bus='0x00' slot='0x1E' function='0x0'/>

causes an XML validation failure.

Is there any way in libvirt XML to plug a host's PCI-E device directly into the pcie-root port, like it works on the qemu command line?

I'm aware I could use something like

  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=05:00.0,bus=pcie.0'/>
  </qemu:commandline>

but I insist on running the VM as non-root, and if I got that right I need to configure at least one vfio device (or memory locking) in order for libvirt to set a proper RLIMIT_MEMLOCK value.

Any help would be appreciated.

Regards,
Thomas
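For readers following along: in libvirt domain XML, the `bus` attribute of a device's guest-side `<address type='pci'>` names the `index` of the PCI controller the device is attached to. The following is a minimal Python sketch, not part of the original message, that walks the XML quoted above from the hostdev up to the root bus. The `<devices>` wrapper, the variable names, and the helper `controller_chain` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Fragment assembled from the XML quoted in the message above. The <devices>
# wrapper is added here only so the snippet parses on its own.
DOMAIN_FRAGMENT = """
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
  </controller>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
  </hostdev>
</devices>
"""


def controller_chain(devices, bus):
    """Return controller models from the given guest bus up to the root bus.

    Hypothetical helper: it relies on the guest address 'bus' value being the
    index of the PCI controller that provides that bus.
    """
    by_index = {
        int(c.get("index")): c
        for c in devices.findall("controller[@type='pci']")
    }
    chain, seen = [], set()
    while bus in by_index and bus not in seen:
        seen.add(bus)  # guard against a malformed, cyclic config
        ctrl = by_index[bus]
        chain.append(ctrl.get("model"))
        parent = ctrl.find("address")
        if parent is None:  # the root bus has no parent address
            break
        bus = int(parent.get("bus"), 16)
    return chain


devices = ET.fromstring(DOMAIN_FRAGMENT)
# "hostdev/address" selects the guest-side address, not <source>/<address>.
guest_addr = devices.find("hostdev/address")
chain = controller_chain(devices, int(guest_addr.get("bus"), 16))
print(" -> ".join(chain))
```

With the configuration from the message, the chain starts at the pci-bridge (index 2) and reaches pcie-root only through the dmi-to-pci-bridge, which matches the poster's reading that the device is not attached to the root complex directly.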
Laine Stump
2013-Nov-19 10:36 UTC
Re: [libvirt-users] Using hostdev to plug a PCI-E host device into Q35 pcie-root port
On 11/15/2013 03:35 PM, Thomas Kuther wrote:
> Hello,
>
> I'm trying to migrate a working qemu command line configuration to
> libvirt.
> The part I'm currently failing on is:
>
> $ qemu-system-x86_64 -M Q35 ... -device vfio-pci,host=05:00.0,bus=pcie.0
>
> The right way to translate this into libvirt XML seems to be using
> <hostdev>, but I seem to be unable to plug it into the pcie-root port
>
> This is how the interesting part looks like when I let "virsh edit"
> generate an <address>
>
> <controller type='pci' index='0' model='pcie-root'/>
> <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
> </controller>
> <controller type='pci' index='2' model='pci-bridge'>
>   <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
> </controller>
> [...]
> <hostdev mode='subsystem' type='pci' managed='yes'>
>   <driver name='vfio'/>
>   <source>
>     <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
>   </source>
>   <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
> </hostdev>
> [...]
>
> To my understanding, this will plug the host device into the
> pci-bridge controller.
> The guest OS doesn't boot with this and resets right after bios.

Ugh. That's very unfortunate. This is the first report I've heard of something failing this badly because it was plugged into a pci-bridge slot; up until now I'd only heard that some extra PCIe functionality would be missing if a device was plugged into a PCI slot rather than a PCIe slot.

Can I ask what type of device this is?

> Manually setting
> <address type='pci' domain='0x0000' bus='0x00' slot='0x1E' function='0x0'/>
> cause XML validation failure.
>
> Is there any way in libvirt XML to plug a host's PCI-E device directly
> into the pcie-root port, like it works on qemu command line?

I'm sorry to say, no. With very few (and specific) exceptions, libvirt insists that all guest devices be plugged into a hot-pluggable PCI slot. This rules out both the PCIe "root complex" (a.k.a. pcie.0) and the dmi-to-pci-bridge that is plugged into pcie.0 (because the dmi-to-pci-bridge's slots don't support hot-plug).

This is done because, for now, almost all devices that qemu knows about are PCI (not PCIe) devices. If we allowed plugging them into pcie.0 now, then on the day in the future when qemu begins enforcing the difference between PCI and PCIe (currently it doesn't), the world would be full of libvirt configs that would no longer work.

There was some discussion about this a month or two ago, either on libvir-list or maybe on the qemu-devel list. We decided that qemu needs to provide some sort of introspection of the devices' connection types so that libvirt can determine which device can plug into which slots; at that time we'll be able to allow exactly what's proper in each case. In the meantime we're stuck with being overly cautious in order to prevent future catastrophe.

> I'm aware I could use something like
>
> <qemu:commandline>
>   <qemu:arg value='-device'/>
>   <qemu:arg value='vfio-pci,host=05:00.0,bus=pcie.0'/>
> </qemu:commandline>
>
> but I insist on running the VM as non-root, and if I got that right I
> need to configure at least one vfio device (or memory locking) in
> order for libvirt to set a proper RLIMIT_MEMLOCK value.
>
> Any help would be be appreciated.

For now at least, you'll need to let it plug into the pci-bridge device pci.2 (which, as you've found, libvirt will automatically find when you don't specify any address). Unfortunately that doesn't do you much good, since the particular device you're assigning apparently requires that it be plugged into the PCIe bus.

I'm wondering as I type if possibly we could relax the enforcement of the "PCI only" rule such that we allow explicitly placing any device on any type of bus, but only auto-assign to a plain PCI slot. That may be a reasonable compromise until qemu has the required new device/controller introspection info available.
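The compromise floated in the last paragraph can be stated as a small predicate. The Python sketch below is only an illustration of that idea, not libvirt code: the function name `placement_allowed`, the two sets, and the decision to treat only `pci-bridge` as a "plain PCI" bus are all assumptions drawn from the three controller models named in this thread.

```python
# Controller models that appear in this thread. Treating only "pci-bridge"
# as a plain PCI bus is this sketch's assumption, not a list from libvirt.
KNOWN_MODELS = {"pcie-root", "dmi-to-pci-bridge", "pci-bridge"}
PLAIN_PCI_MODELS = {"pci-bridge"}


def placement_allowed(controller_model: str, explicit: bool) -> bool:
    """Hypothetical check for the proposed relaxed rule.

    explicit=True  : the user wrote the <address> by hand; any bus type is
                     accepted.
    explicit=False : the address is being auto-assigned; only plain PCI
                     buses are eligible.
    """
    if controller_model not in KNOWN_MODELS:
        raise ValueError(f"unknown controller model: {controller_model!r}")
    if explicit:
        return True
    return controller_model in PLAIN_PCI_MODELS


if __name__ == "__main__":
    for model in sorted(KNOWN_MODELS):
        print(model,
              "explicit:", placement_allowed(model, explicit=True),
              "auto:", placement_allowed(model, explicit=False))
```

Under such a rule, automatically generated configs would stay on plain PCI slots, while a user who knowingly writes an address on pcie.0 would take responsibility for that choice.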
Thomas Kuther
2013-Nov-19 13:51 UTC
Re: [libvirt-users] Using hostdev to plug a PCI-E host device into Q35 pcie-root port
On 19.11.2013 11:36, Laine Stump wrote:
> On 11/15/2013 03:35 PM, Thomas Kuther wrote:
>> Hello,
>>
>> I'm trying to migrate a working qemu command line configuration to
>> libvirt.
>> The part I'm currently failing on is:
>>
>> $ qemu-system-x86_64 -M Q35 ... -device vfio-pci,host=05:00.0,bus=pcie.0
>>
>> The right way to translate this into libvirt XML seems to be using
>> <hostdev>, but I seem to be unable to plug it into the pcie-root port
>>
>> This is how the interesting part looks like when I let "virsh edit"
>> generate an <address>
>>
>> <controller type='pci' index='0' model='pcie-root'/>
>> <controller type='pci' index='1' model='dmi-to-pci-bridge'>
>>   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>> </controller>
>> <controller type='pci' index='2' model='pci-bridge'>
>>   <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
>> </controller>
>> [...]
>> <hostdev mode='subsystem' type='pci' managed='yes'>
>>   <driver name='vfio'/>
>>   <source>
>>     <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
>>   </source>
>>   <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
>> </hostdev>
>> [...]
>>
>> To my understanding, this will plug the host device into the
>> pci-bridge controller.
>> The guest OS doesn't boot with this and resets right after bios.
>
> Ugh. That's very unfortunate. This is the first report I've heard of
> something failing in such a bad way due to being plugged into a
> pci-bridge slot; up until now I'd only heard that there is some extra
> PCIe functionality that would be missing if a device was plugged into a
> PCI slot vs. PCIe.
>
> Can I ask what type of device this is?
>

It's a Marvell 88SE9172 SATA controller; here is the lspci -vvv output:

03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11) (prog-if 01 [AHCI 1.0])
    Subsystem: Gigabyte Technology Co., Ltd Device b000
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 47
    Region 0: I/O ports at d040 [disabled] [size=8]
    Region 1: I/O ports at d030 [disabled] [size=4]
    Region 2: I/O ports at d020 [disabled] [size=8]
    Region 3: I/O ports at d010 [disabled] [size=4]
    Region 4: I/O ports at d000 [disabled] [size=16]
    Region 5: Memory at f7610000 (32-bit, non-prefetchable) [disabled] [size=512]
    Expansion ROM at f7600000 [disabled by cmd] [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
        Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
        Address: 00000000 Data: 0000
    Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    Kernel driver in use: vfio-pci

The second device I'm trying to pass through is a Renesas uPD720201 USB 3.0 Host Controller, but first I wanted to get the SATA controller working in libvirt. I will try leaving out the SATA controller and see what happens with only the USB3 controller.

>> Manually setting
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x1E' function='0x0'/>
>> cause XML validation failure.
>>
>> Is there any way in libvirt XML to plug a host's PCI-E device directly
>> into the pcie-root port, like it works on qemu command line?
>
> I'm sorry to say, no. With very few (and specific) exceptions, libvirt
> insists that all guest devices be plugged into a hot-pluggable PCI slot
> - this eliminates both the PCIe "root complex" (a.k.a. pcie.0) as well
> as the dmi-to-pci-controller that is plugged into pcie.0 (because
> pci-to-dmi controllers' slots don't support hot-plug).
>
> This is done because, for now, almost all devices that qemu knows about
> are PCI (no PCI-e) devices, and if we allowed plugging them into pcie.0
> now, then on the day in the future when qemu begins enforcing the
> difference between PCI and PCIe (currently it doesn't), the world would
> be full of libvirt configs that would no longer work.
>
> There was some discussion about this a month or two ago either on
> libvir-list or maybe it was the qemu-devel list. We decided that qemu
> needs to provide some sort of introspection of the devices' connection
> types so that libvirt can determine what device can plug into which
> slots; at that time we'll be able to allow exactly what's proper in each
> case. In the meantime we're stuck with being overly cautious in order to
> prevent future catastrophe.
>

Understood, thanks for the explanation.

>> I'm aware I could use something like
>>
>> <qemu:commandline>
>>   <qemu:arg value='-device'/>
>>   <qemu:arg value='vfio-pci,host=05:00.0,bus=pcie.0'/>
>> </qemu:commandline>
>>
>> but I insist on running the VM as non-root, and if I got that right I
>> need to configure at least one vfio device (or memory locking) in
>> order for libvirt to set a proper RLIMIT_MEMLOCK value.
>>
>> Any help would be be appreciated.
>
> For now at least, you'll need to let it plug into the pci-bridge device
> pci.2 (which, as you've found, libvirt will automatically find when you
> don't specify any address). Unfortunately that doesn't do you much good,
> since that particular device you're assigning actually requires that it
> be plugged into the PCIe bus.
>
> I'm wondering as I type if possibly we could relax the enforcement of
> the "PCI only" rule such that we allow explicitly placing any device on
> any type of bus, but only auto-assign to a plain PCI slot. That may be a
> reasonable compromise until qemu has the required new device/controller
> introspection info available.
>

I like the idea.

Regards,
Thomas
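A note for anyone translating a similar command line: the `host=` value on the qemu side and the first column of lspci use the `bus:slot.function` form, while the hostdev `<source>` address in libvirt XML spells the same location as separate hexadecimal attributes. A minimal Python sketch of that conversion follows. The helper name `bdf_to_source_address` and the regular expression are assumptions made for illustration and are not part of any libvirt tool.

```python
import re

# Accepts "bb:ss.f" or "dddd:bb:ss.f" as printed by lspci (hex digits).
_BDF = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[0-9a-fA-F]{2})\."
    r"(?P<function>[0-7])$"
)


def bdf_to_source_address(bdf: str) -> str:
    """Hypothetical helper: format a host PCI address as a <source> address."""
    m = _BDF.match(bdf.strip())
    if m is None:
        raise ValueError(f"not a PCI address: {bdf!r}")
    domain = int(m.group("domain") or "0", 16)
    bus = int(m.group("bus"), 16)
    slot = int(m.group("slot"), 16)
    function = int(m.group("function"), 16)
    if slot > 0x1F:  # PCI allows 32 device slots per bus
        raise ValueError(f"slot out of range: {bdf!r}")
    return (
        f"<address domain='0x{domain:04x}' bus='0x{bus:02x}' "
        f"slot='0x{slot:02x}' function='0x{function:x}'/>"
    )


if __name__ == "__main__":
    # The address shown in the lspci output of this thread.
    print(bdf_to_source_address("03:00.0"))
```

Run against `03:00.0`, the address in the lspci output above, this yields the same `<source>` address element that appears in the hostdev XML quoted in the thread.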