One fixes that I thought I had fixed but not so. This was discovered when trying to passthrough an PCIe network card to an PVHVM guest and finding that it can''t use MSIs. I thought I had it fixed with git commit 80ba77dfbce85f2d1be54847de3c866de1b18a9a "xen/pciback: Fix proper FLR steps." but that fixed only one use case (bind the device to xen-pciback, then unbind it). The underlaying reason was that after we do an FLR (if the card supports it), we also do a D3 (so turn off the PCIe card), then followed by a D0 (power is back). However we did not the follow the rest of the process that pci_reset_function does - restore the device''s PCI configuration state! (Note: We cannot use pci_reset_function as it holds a mutex that we hold as well - so we use the low-level reset functions that we can invoke and hold a mutex - and we forgot to do the right calls that pci_reset_function does). With this patch: [PATCH 1/2] xen/pciback: Restore the PCI config space after an FLR. I can pass through an PCIe e1000e card succesfully to my Win7 and Linux guest. This patch: [PATCH 2/2] xen/pciback: When resetting the device don''t disable is just a cleanup.
Konrad Rzeszutek Wilk
2012-Sep-25 21:27 UTC
[PATCH 1/2] xen/pciback: Restore the PCI config space after an FLR.
When we do an FLR, or D0->D3_hot we may lose the BARs as the device has turned itself off (and on). This means the device cannot function unless the pci_restore_state is called - which it is when the PCI device is unbound from the Xen PCI backend driver. For PV guests it ends up calling pci_enable_device / pci_enable_msi[x] which does the proper steps That however is not happening if a HVM guest is run as QEMU deals with PCI configuration space. QEMU also requires that the device be "parked" under the ownership of a pci-stub driver to guarantee that the PCI device is not being used. Hence we follow the same incantation as pci_reset_function does - by doing an FLR, then restoring the PCI configuration space. The result of this patch is that when you run lspci, you get now this: - Region 0: [virtual] Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K] - Region 1: [virtual] Memory at fe800000 (32-bit, non-prefetchable) [size=512K] + Region 0: Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K] + Region 1: Memory at fe800000 (32-bit, non-prefetchable) [size=512K] Region 2: I/O ports at c000 [size=32] - Region 3: [virtual] Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K] + Region 3: Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K] The [virtual] means that lspci read those entries from SysFS but when it read them from the device it got a different value (0xfffffff). CC: stable@vger.kernel.org # only for v3.4 and v3.5 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/xen-pciback/pci_stub.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c index acec6fa..e5a0c13 100644 --- a/drivers/xen/xen-pciback/pci_stub.c +++ b/drivers/xen/xen-pciback/pci_stub.c @@ -362,6 +362,7 @@ static int __devinit pcistub_init_device(struct pci_dev *dev) else { dev_dbg(&dev->dev, "reseting (FLR, D3, etc) the device\n"); __pci_reset_function_locked(dev); + pci_restore_state(dev); } /* Now disable the device (this also ensures some private device * data is setup before we export) -- 1.7.7.6
Konrad Rzeszutek Wilk
2012-Sep-25 21:27 UTC
[PATCH 2/2] xen/pciback: When resetting the device don''t disable twice.
We call ''pci_disable_device'' which sets the bus_master to zero and it also disables the PCI_COMMAND. There is no need to do it outside the PCI library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/xen-pciback/pciback_ops.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c index 97f5d26..2e62279 100644 --- a/drivers/xen/xen-pciback/pciback_ops.c +++ b/drivers/xen/xen-pciback/pciback_ops.c @@ -114,10 +114,6 @@ void xen_pcibk_reset_device(struct pci_dev *dev) pci_disable_msi(dev); #endif pci_disable_device(dev); - - pci_write_config_word(dev, PCI_COMMAND, 0); - - dev->is_busmaster = 0; } else { pci_read_config_word(dev, PCI_COMMAND, &cmd); if (cmd & (PCI_COMMAND_INVALIDATE)) { -- 1.7.7.6
Jan Beulich
2012-Sep-26 08:58 UTC
Re: [PATCH 2/2] xen/pciback: When resetting the device don''t disable twice.
>>> On 25.09.12 at 23:27, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > We call ''pci_disable_device'' which sets the bus_master to zero > and it also disables the PCI_COMMAND. There is no need to > do it outside the PCI library.Not really - pci_disable_device() only does anything if enable_cnt drops to zero, and only clears the bus master flag in the command word. The code you delete fully zeros the command word unconditionally. (I also noticed that the old [forward ported] code here forced enable_cnt to zero.) Hence the question is whether this really isn''t a functional change rather than mere cleanup. Jan> --- a/drivers/xen/xen-pciback/pciback_ops.c > +++ b/drivers/xen/xen-pciback/pciback_ops.c > @@ -114,10 +114,6 @@ void xen_pcibk_reset_device(struct pci_dev *dev) > pci_disable_msi(dev); > #endif > pci_disable_device(dev); > - > - pci_write_config_word(dev, PCI_COMMAND, 0); > - > - dev->is_busmaster = 0; > } else { > pci_read_config_word(dev, PCI_COMMAND, &cmd); > if (cmd & (PCI_COMMAND_INVALIDATE)) {