thr3ads.net - Virtualization - [PATCH RFC 3/3] vfio: add virtio pci quirk [Apr 2016]

Michael S. Tsirkin

2016-Apr-18 09:58 UTC

[PATCH RFC 0/3] virtio-pci: iommu support

This is an attempt to allow enabling IOMMU for DMA.
Design:
	- new feature bit IOMMU_PLATFORM, which means the
	  host won't bypass the IOMMU
	- virtio core uses the DMA API if it sees IOMMU_PLATFORM
	- add a quirk for vfio to disable the device unless IOMMU_PLATFORM
	  is set or the no-iommu mode is enabled
	- while I'm not sure how it will be used, it seems like a good idea to
	  also have the ability to distinguish between a legacy device and one
	  where the IOMMU is bypassed intentionally.  To this end, add another
	  feature bit, IOMMU_PASSTHROUGH. We don't acknowledge it if
	  IOMMU_PLATFORM is set.

TODO:
	- I'm not sure whether there are setups that mix IOMMU
	  and no-IOMMU configs. If so, failing on probe might not
	  be the right thing to do; we should fail binding to the IOMMU
	  group instead.


Michael S. Tsirkin (3):
  virtio: add features for IOMMU control
  vfio: report group noiommu status
  vfio: add virtio pci quirk

 drivers/vfio/pci/vfio_pci_private.h          |   1 +
 include/uapi/linux/virtio_config.h           |  10 +-
 drivers/vfio/pci/vfio_pci.c                  |  13 ++-
 drivers/vfio/pci/vfio_pci_virtio.c           | 142 +++++++++++++++++++++++++++
 drivers/vfio/platform/vfio_platform_common.c |   2 +-
 drivers/vfio/vfio.c                          |   5 +-
 drivers/virtio/virtio_ring.c                 |  18 +++-
 Documentation/vfio.txt                       |   4 +-
 drivers/vfio/pci/Makefile                    |   1 +
 9 files changed, 190 insertions(+), 6 deletions(-)
 create mode 100644 drivers/vfio/pci/vfio_pci_virtio.c

-- 
MST

Michael S. Tsirkin

2016-Apr-18 09:58 UTC

[PATCH RFC 1/3] virtio: add features for IOMMU control

The interaction between virtio and DMA API is messy.

On most systems with virtio, physical addresses match bus addresses,
and it doesn't particularly matter whether we use the DMA API.

On some systems, including Xen and any system with a physical device
that speaks virtio behind a physical IOMMU, we must use the DMA API
for virtio DMA to work at all.

Add a feature bit to detect that: VIRTIO_F_IOMMU_PLATFORM.

On other systems, including SPARC and PPC64, virtio-pci devices are
enumerated as though they are behind an IOMMU, but the virtio host
ignores the IOMMU, so we must either pretend that the IOMMU isn't
there or somehow map everything as the identity.

Add a feature bit for that as well: VIRTIO_F_IOMMU_PASSTHROUGH: without
VIRTIO_F_IOMMU_PLATFORM, it means that virtio will bypass the IOMMU.
With VIRTIO_F_IOMMU_PLATFORM, it suggests that guest maps everything as
the identity (this last bit isn't trivial to implement so ignore this
hint for now).

If neither bit is there, we preserve the historic behavior and bypass
the DMA API unless running within a Xen guest.

Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 include/uapi/linux/virtio_config.h | 10 +++++++++-
 drivers/virtio/virtio_ring.c       | 18 +++++++++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index 4cb65bb..952775c 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -49,7 +49,7 @@
  * transport being used (eg. virtio_ring), the rest are per-device feature
  * bits. */
 #define VIRTIO_TRANSPORT_F_START	28
-#define VIRTIO_TRANSPORT_F_END		33
+#define VIRTIO_TRANSPORT_F_END		35
 
 #ifndef VIRTIO_CONFIG_NO_LEGACY
 /* Do we get callbacks when the ring is completely used, even if we've
@@ -63,4 +63,12 @@
 /* v1.0 compliant. */
 #define VIRTIO_F_VERSION_1		32
 
+/* Request IOMMU passthrough (if available)
+ * Without VIRTIO_F_IOMMU_PLATFORM: bypass the IOMMU even if enabled.
+ * With VIRTIO_F_IOMMU_PLATFORM: suggest disabling IOMMU.
+ */
+#define VIRTIO_F_IOMMU_PASSTHROUGH	33
+
+/* Do not bypass the IOMMU (if configured) */
+#define VIRTIO_F_IOMMU_PLATFORM		34
 #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5c802d4..0436bd2 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -117,7 +117,10 @@ struct vring_virtqueue {
 #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
 
 /*
- * The interaction between virtio and a possible IOMMU is a mess.
+ * Modern virtio devices might set feature bits to specify whether
+ * they use or unconditionally bypass the platform IOMMU.
+ *
+ * If not there, the interaction between virtio and DMA API is messy.
  *
  * On most systems with virtio, physical addresses match bus addresses,
  * and it doesn't particularly matter whether we use the DMA API.
@@ -137,6 +140,13 @@ struct vring_virtqueue {
 
 static bool vring_use_dma_api(struct virtio_device *vdev)
 {
+	if (virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
+		return true;
+
+	if (virtio_has_feature(vdev, VIRTIO_F_IOMMU_PASSTHROUGH))
+		return false;
+
+	/* Otherwise, we are left to guess. */
 	/*
 	 * In theory, it's possible to have a buggy QEMU-supposed
 	 * emulated Q35 IOMMU and Xen enabled at the same time.  On
@@ -1099,6 +1109,12 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_F_VERSION_1:
 			break;
+		case VIRTIO_F_IOMMU_PASSTHROUGH:
+			break;
+		case VIRTIO_F_IOMMU_PLATFORM:
+			/* Ignore passthrough hint for now, obey kernel config. */
+			__virtio_clear_bit(vdev, VIRTIO_F_IOMMU_PASSTHROUGH);
+			break;
 		default:
 			/* We don't understand this bit. */
 			__virtio_clear_bit(vdev, i);
-- 
MST
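
The order of checks that this patch adds to vring_use_dma_api(), together with the negotiation rule in vring_transport_features(), can be modelled outside the kernel. What follows is a minimal sketch, assuming the features are held in a plain 64-bit mask. The helpers has_feature(), use_dma_api() and negotiate(), and the legacy_guess parameter standing in for the pre-existing fallback heuristic, are illustrative names and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as defined by the patch above. */
#define VIRTIO_F_IOMMU_PASSTHROUGH 33
#define VIRTIO_F_IOMMU_PLATFORM    34

/* Hypothetical helper: test one bit in a 64-bit feature mask. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64); /* shifting by 64 or more would be undefined */
	return (features >> bit) & 1;
}

/*
 * Same order of checks as the patched vring_use_dma_api().
 * legacy_guess stands in for the historic fallback and is an
 * assumption of this sketch.
 */
static bool use_dma_api(uint64_t features, bool legacy_guess)
{
	if (has_feature(features, VIRTIO_F_IOMMU_PLATFORM))
		return true;
	if (has_feature(features, VIRTIO_F_IOMMU_PASSTHROUGH))
		return false;
	return legacy_guess;
}

/* PLATFORM wins: drop the PASSTHROUGH hint when both are offered. */
static uint64_t negotiate(uint64_t features)
{
	if (has_feature(features, VIRTIO_F_IOMMU_PLATFORM))
		features &= ~(UINT64_C(1) << VIRTIO_F_IOMMU_PASSTHROUGH);
	return features;
}
```

With both bits offered, negotiate() leaves only IOMMU_PLATFORM set, so use_dma_api() returns true regardless of the fallback guess.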

Michael S. Tsirkin

2016-Apr-18 09:58 UTC

[PATCH RFC 2/3] vfio: report group noiommu status

When using vfio, callers might want to know whether a device is added to a
regular group or a non-iommu group.

Report this status from vfio_add_group_dev.

Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/vfio/pci/vfio_pci.c                  | 2 +-
 drivers/vfio/platform/vfio_platform_common.c | 2 +-
 drivers/vfio/vfio.c                          | 5 ++++-
 Documentation/vfio.txt                       | 4 +++-
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 712a849..d622a41 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1119,7 +1119,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	spin_lock_init(&vdev->irqlock);
 
 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
-	if (ret) {
+	if (ret < 0) {
 		vfio_iommu_group_put(group, &pdev->dev);
 		kfree(vdev);
 		return ret;
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index e65b142..bf74e21 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -568,7 +568,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
 	}
 
 	ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
-	if (ret) {
+	if (ret < 0) {
 		iommu_group_put(group);
 		return ret;
 	}
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 6fd6fa5..67db231 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -756,6 +756,7 @@ int vfio_add_group_dev(struct device *dev,
 	struct iommu_group *iommu_group;
 	struct vfio_group *group;
 	struct vfio_device *device;
+	int noiommu;
 
 	iommu_group = iommu_group_get(dev);
 	if (!iommu_group)
@@ -791,6 +792,8 @@ int vfio_add_group_dev(struct device *dev,
 		return PTR_ERR(device);
 	}
 
+	noiommu = group->noiommu;
+
 	/*
 	 * Drop all but the vfio_device reference.  The vfio_device holds
 	 * a reference to the vfio_group, which holds a reference to the
@@ -798,7 +801,7 @@ int vfio_add_group_dev(struct device *dev,
 	 */
 	vfio_group_put(group);
 
-	return 0;
+	return noiommu;
 }
 EXPORT_SYMBOL_GPL(vfio_add_group_dev);
 
diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index 1dd3fdd..d76be0f 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -259,7 +259,9 @@ extern void *vfio_del_group_dev(struct device *dev);
 
 vfio_add_group_dev() indicates to the core to begin tracking the
 specified iommu_group and register the specified dev as owned by
-a VFIO bus driver.  The driver provides an ops structure for callbacks
+a VFIO bus driver.  A negative return value indicates failure.
+A positive return value indicates that an unsafe noiommu mode
+is in use.  The driver provides an ops structure for callbacks
 similar to a file operations structure:
 
 struct vfio_device_ops {
-- 
MST
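
The return convention this patch introduces (negative for failure, zero for a regular group, positive for a noiommu group) is why both callers change from `if (ret)` to `if (ret < 0)`. Here is a small stand-alone sketch of that convention. add_group_dev_model(), classify() and the probe_result enum are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for vfio_add_group_dev() after this patch:
 * negative errno on failure, 0 for a regular IOMMU-backed group,
 * positive for a noiommu group.
 */
static int add_group_dev_model(bool fail, bool noiommu)
{
	if (fail)
		return -ENOMEM;
	return noiommu ? 1 : 0;
}

enum probe_result { PROBE_FAILED, PROBE_IOMMU, PROBE_NOIOMMU };

/* Caller side: only ret < 0 is an error; ret > 0 is a status, not a failure. */
static enum probe_result classify(int ret)
{
	if (ret < 0)
		return PROBE_FAILED;
	return ret > 0 ? PROBE_NOIOMMU : PROBE_IOMMU;
}

/* Usage example covering the three cases. */
static void example(void)
{
	assert(classify(add_group_dev_model(true, false)) == PROBE_FAILED);
	assert(classify(add_group_dev_model(false, false)) == PROBE_IOMMU);
	assert(classify(add_group_dev_model(false, true)) == PROBE_NOIOMMU);
}
```

A caller that kept the old `if (ret)` test would treat every noiommu group as a failed registration, which is the case the two one-line changes above avoid.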

Michael S. Tsirkin

2016-Apr-18 09:58 UTC

[PATCH RFC 3/3] vfio: add virtio pci quirk

Modern virtio pci devices can set VIRTIO_F_IOMMU_PLATFORM
to signal they are safe to use with an IOMMU.

Without this bit, exposing the device to userspace is unsafe, so probe
and fail VFIO initialization unless noiommu is enabled.

Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
---
 drivers/vfio/pci/vfio_pci_private.h |   1 +
 drivers/vfio/pci/vfio_pci.c         |  11 +++
 drivers/vfio/pci/vfio_pci_virtio.c  | 135 ++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/Makefile           |   1 +
 4 files changed, 148 insertions(+)
 create mode 100644 drivers/vfio/pci/vfio_pci_virtio.c

diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 8a7d546..604d445 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -130,4 +130,5 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
 	return -ENODEV;
 }
 #endif
+extern int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, int noiommu);
 #endif /* VFIO_PCI_PRIVATE_H */
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d622a41..2bb8c76 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1125,6 +1125,17 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return ret;
 	}
 
+	if (pdev->vendor == PCI_VENDOR_ID_REDHAT_QUMRANET &&
+	    ((ret = vfio_pci_virtio_quirk(vdev, ret)))) {
+		dev_warn(&vdev->pdev->dev,
+			 "Failed to setup Virtio for VFIO\n");
+		vfio_del_group_dev(&pdev->dev);
+		vfio_iommu_group_put(group, &pdev->dev);
+		kfree(vdev);
+		return ret;
+	}
+
+
 	if (vfio_pci_is_vga(pdev)) {
 		vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
 		vga_set_legacy_decoding(pdev,
diff --git a/drivers/vfio/pci/vfio_pci_virtio.c b/drivers/vfio/pci/vfio_pci_virtio.c
new file mode 100644
index 0000000..1a32064
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_virtio.c
@@ -0,0 +1,135 @@
+/*
+ * VFIO PCI Intel Graphics support
+ *
+ * Copyright (C) 2016 Red Hat, Inc.  All rights reserved.
+ *	Author: Alex Williamson <alex.williamson at redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Register a device specific region through which to provide read-only
+ * access to the Intel IGD opregion.  The register defining the opregion
+ * address is also virtualized to prevent user modification.
+ */
+
+#include <linux/io.h>
+#include <linux/pci.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_config.h>
+
+#include "vfio_pci_private.h"
+
+/**
+ * virtio_pci_find_capability - walk capabilities to find device info.
+ * @dev: the pci device
+ * @cfg_type: the VIRTIO_PCI_CAP_* value we seek
+ *
+ * Returns offset of the capability, or 0.
+ */
+static inline int virtio_pci_find_capability(struct pci_dev *dev, u8 cfg_type)
+{
+	int pos;
+
+	for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
+	     pos > 0;
+	     pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
+		u8 type;
+		pci_read_config_byte(dev, pos + offsetof(struct virtio_pci_cap,
+							 cfg_type),
+				     &type);
+
+		if (type != cfg_type)
+			continue;
+
+		/* Ignore structures with reserved BAR values */
+		if (type != VIRTIO_PCI_CAP_PCI_CFG) {
+			u8 bar;
+
+			pci_read_config_byte(dev, pos +
+					     offsetof(struct virtio_pci_cap,
+						      bar),
+					     &bar);
+			if (bar > 0x5)
+				continue;
+		}
+
+		return pos;
+	}
+	return 0;
+}
+
+
+int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, int noiommu)
+{
+	struct pci_dev *dev = vdev->pdev;
+	int common, cfg;
+	u32 features;
+	u32 offset;
+	u8 bar;
+
+	/* Without an IOMMU, we don't care */
+	if (noiommu)
+		return 0;
+	/* Check whether device enforces the IOMMU correctly */
+
+	/*
+	 * All modern devices must have common and cfg capabilities. We use cfg
+	 * capability for access so that we don't need to worry about resource
+	 * availability. Slow but sure.
+	 * Note that all vendor-specific fields we access are little-endian
+	 * which matches what pci config accessors expect, so they do byteswap
+	 * for us if appropriate.
+	 */
+	common = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_COMMON_CFG);
+	cfg = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_PCI_CFG);
+	if (!cfg || !common) {
+                dev_warn(&dev->dev,
+                         "Virtio device lacks common or pci cfg.\n");
+		return -ENODEV;
+	}
+
+	pci_read_config_byte(dev, common + offsetof(struct virtio_pci_cap,
+						    bar),
+			     &bar);
+	pci_read_config_dword(dev, common + offsetof(struct virtio_pci_cap,
+						    offset),
+			     &offset);
+
+	/* Program cfg capability for dword access into common cfg. */
+	pci_write_config_byte(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						  cap.bar),
+			      bar);
+	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						   cap.length),
+			       0x4);
+
+	/* Select features dword that has VIRTIO_F_IOMMU_PLATFORM. */
+	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						  cap.offset),
+			       offset + offsetof(struct virtio_pci_common_cfg,
+						 device_feature_select));
+	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						  pci_cfg_data),
+			       VIRTIO_F_IOMMU_PLATFORM / 32);
+
+	/* Get the features dword. */
+	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						  cap.offset),
+			       offset + offsetof(struct virtio_pci_common_cfg,
+						 device_feature));
+	pci_read_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
+						  pci_cfg_data),
+			      &features);
+
+	/* Does this device obey the platform's IOMMU? If not it's an error. */
+	if (!(features & (0x1 << (VIRTIO_F_IOMMU_PLATFORM % 32)))) {
+                dev_warn(&dev->dev,
+                         "Virtio device lacks VIRTIO_F_IOMMU_PLATFORM.\n");
+		return -ENODEV;
+	}
+
+	return 0;
+}
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 76d8ec0..e9b20e7 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,5 +1,6 @@
 
 vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y += vfio_pci_virtio.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
-- 
MST
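
The quirk reads the device features 32 bits at a time: it writes `VIRTIO_F_IOMMU_PLATFORM / 32` to device_feature_select and then tests bit `VIRTIO_F_IOMMU_PLATFORM % 32` of device_feature. With the bit number 34 from patch 1/3, that is selector 1 and mask 0x4. The sketch below models only this arithmetic. read_device_feature() is a hypothetical device model, not the PCI config window access used by the patch, and returning 0 for selectors other than 0 and 1 is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 34 /* bit number from patch 1/3 */

/* Which 32-bit feature word holds the bit (value for device_feature_select). */
static uint32_t feature_select(unsigned int bit)
{
	assert(bit < 64);
	return bit / 32;
}

/* Mask for the bit inside the selected 32-bit word. */
static uint32_t feature_mask(unsigned int bit)
{
	assert(bit < 64);
	return UINT32_C(1) << (bit % 32);
}

/*
 * Hypothetical device model: the 32-bit word of a 64-bit feature set
 * that a read of device_feature would yield for a given selector.
 */
static uint32_t read_device_feature(uint64_t device_features, uint32_t select)
{
	return select < 2 ? (uint32_t)(device_features >> (32 * select)) : 0;
}

/* Select the right word, read it, and test the bit, as the quirk does. */
static bool device_offers(uint64_t device_features, unsigned int bit)
{
	uint32_t word = read_device_feature(device_features,
					    feature_select(bit));

	return (word & feature_mask(bit)) != 0;
}
```

A device that offers only VIRTIO_F_VERSION_1 (bit 32) has a non-zero upper feature word, yet device_offers() still reports that VIRTIO_F_IOMMU_PLATFORM is missing, which is the case the quirk rejects with -ENODEV.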

Alex Williamson

2016-Apr-18 18:56 UTC

[PATCH RFC 2/3] vfio: report group noiommu status

On Mon, 18 Apr 2016 12:58:20 +0300
"Michael S. Tsirkin" <mst at redhat.com> wrote:
> When using vfio, callers might want to know whether a device is added to a
> regular group or a non-iommu group.
> 
> Report this status from vfio_add_group_dev.
> 
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---
What about making an interface to query this rather than playing games
with magic return values?

bool vfio_iommu_group_is_noiommu(struct iommu_group *group)
{
    return iommu_group_get_iommudata(group) == &noiommu;
}
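
The suggested helper identifies a noiommu group by comparing the group's private data pointer against the address of a known object. A stand-alone sketch of that sentinel-pointer pattern follows. struct group_model, noiommu_sentinel, mark_noiommu() and group_is_noiommu() are hypothetical names that only model the idea and are not the kernel's types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_group: only the private data pointer. */
struct group_model {
	void *iommudata;
};

/* Sentinel object: its address, not its value, marks a noiommu group. */
static bool noiommu_sentinel;

/* Tag a group as noiommu by pointing its private data at the sentinel. */
static void mark_noiommu(struct group_model *group)
{
	assert(group != NULL);
	group->iommudata = &noiommu_sentinel;
}

/* Query: a group is noiommu exactly when it carries the sentinel address. */
static bool group_is_noiommu(const struct group_model *group)
{
	assert(group != NULL);
	return group->iommudata == &noiommu_sentinel;
}
```

A query of this shape keeps the registration function's return value a plain success or error code, and any caller that needs the noiommu status asks for it explicitly.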
>  drivers/vfio/pci/vfio_pci.c                  | 2 +-
>  drivers/vfio/platform/vfio_platform_common.c | 2 +-
>  drivers/vfio/vfio.c                          | 5 ++++-
>  Documentation/vfio.txt                       | 4 +++-
>  4 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 712a849..d622a41 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1119,7 +1119,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	spin_lock_init(&vdev->irqlock);
>  
>  	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> -	if (ret) {
> +	if (ret < 0) {
>  		vfio_iommu_group_put(group, &pdev->dev);
>  		kfree(vdev);
>  		return ret;
> diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
> index e65b142..bf74e21 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -568,7 +568,7 @@ int vfio_platform_probe_common(struct vfio_platform_device *vdev,
>  	}
>  
>  	ret = vfio_add_group_dev(dev, &vfio_platform_ops, vdev);
> -	if (ret) {
> +	if (ret < 0) {
>  		iommu_group_put(group);
>  		return ret;
>  	}
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 6fd6fa5..67db231 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -756,6 +756,7 @@ int vfio_add_group_dev(struct device *dev,
>  	struct iommu_group *iommu_group;
>  	struct vfio_group *group;
>  	struct vfio_device *device;
> +	int noiommu;
>  
>  	iommu_group = iommu_group_get(dev);
>  	if (!iommu_group)
> @@ -791,6 +792,8 @@ int vfio_add_group_dev(struct device *dev,
>  		return PTR_ERR(device);
>  	}
>  
> +	noiommu = group->noiommu;
> +
>  	/*
>  	 * Drop all but the vfio_device reference.  The vfio_device holds
>  	 * a reference to the vfio_group, which holds a reference to the
> @@ -798,7 +801,7 @@ int vfio_add_group_dev(struct device *dev,
>  	 */
>  	vfio_group_put(group);
>  
> -	return 0;
> +	return noiommu;
>  }
>  EXPORT_SYMBOL_GPL(vfio_add_group_dev);
>  
> diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
> index 1dd3fdd..d76be0f 100644
> --- a/Documentation/vfio.txt
> +++ b/Documentation/vfio.txt
> @@ -259,7 +259,9 @@ extern void *vfio_del_group_dev(struct device *dev);
>  
>  vfio_add_group_dev() indicates to the core to begin tracking the
>  specified iommu_group and register the specified dev as owned by
> -a VFIO bus driver.  The driver provides an ops structure for callbacks
> +a VFIO bus driver.  A negative return value indicates failure.
> +A positive return value indicates that an unsafe noiommu mode
> +is in use.  The driver provides an ops structure for callbacks
>  similar to a file operations structure:
>  
>  struct vfio_device_ops {

Alex Williamson

2016-Apr-18 20:00 UTC

[PATCH RFC 3/3] vfio: add virtio pci quirk

On Mon, 18 Apr 2016 12:58:28 +0300
"Michael S. Tsirkin" <mst at redhat.com> wrote:
> Modern virtio pci devices can set VIRTIO_F_IOMMU_PLATFORM
> to signal they are safe to use with an IOMMU.
> 
> Without this bit, exposing the device to userspace is unsafe, so probe
> and fail VFIO initialization unless noiommu is enabled.
> 
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---
>  drivers/vfio/pci/vfio_pci_private.h |   1 +
>  drivers/vfio/pci/vfio_pci.c         |  11 +++
>  drivers/vfio/pci/vfio_pci_virtio.c  | 135 ++++++++++++++++++++++++++++++++++++
>  drivers/vfio/pci/Makefile           |   1 +
>  4 files changed, 148 insertions(+)
>  create mode 100644 drivers/vfio/pci/vfio_pci_virtio.c
> 
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 8a7d546..604d445 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -130,4 +130,5 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
>  	return -ENODEV;
>  }
>  #endif
> +extern int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, int noiommu);
>  #endif /* VFIO_PCI_PRIVATE_H */
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index d622a41..2bb8c76 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1125,6 +1125,17 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		return ret;
>  	}
>  
> +	if (pdev->vendor == PCI_VENDOR_ID_REDHAT_QUMRANET &&
Virtio really owns this entire vendor ID block?  Apparently nobody told
ivshmem: http://pci-ids.ucw.cz/read/PC/1af4/1110  Even the comment by
virtio_pci_id_table[] suggests virtio is only a subset even if the code
doesn't appear to honor that comment.  I don't know the history there,
but that seems like really inefficient use of an entire, coveted vendor
block.
> +	    ((ret = vfio_pci_virtio_quirk(vdev, ret)))) {
Please don't set variables like this unless necessary.

if (vendor...) {
   ret = vfio_pci_virtio_quir...
   if (ret) {
       ...
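
The outline above leaves its details elided. The sketch below shows only the general shape it asks for, with the assignment moved out of the condition. quirk_stub() and probe_tail() are hypothetical stand-ins and do not fill in the elided code. The vendor ID value 0x1af4 is the one that appears in the pci-ids URL quoted above.

```c
#include <assert.h>
#include <errno.h>

#define PCI_VENDOR_ID_REDHAT_QUMRANET 0x1af4

/* Hypothetical stub for the quirk: succeeds only in noiommu mode. */
static int quirk_stub(int noiommu)
{
	return noiommu ? 0 : -ENODEV;
}

/*
 * Tail of a probe routine with the assignment separated from the test.
 * ret arrives holding the non-negative noiommu status.
 */
static int probe_tail(unsigned int vendor, int ret)
{
	assert(ret >= 0);

	if (vendor == PCI_VENDOR_ID_REDHAT_QUMRANET) {
		ret = quirk_stub(ret);
		if (ret)
			return ret; /* real code would also undo registration */
	}
	return 0;
}
```

Behaviour is the same as the combined `&& ((ret = ...))` form; only the assignment is no longer hidden inside the condition.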
> +		dev_warn(&vdev->pdev->dev,
> +			 "Failed to setup Virtio for VFIO\n");
> +		vfio_del_group_dev(&pdev->dev);
> +		vfio_iommu_group_put(group, &pdev->dev);
> +		kfree(vdev);
> +		return ret;
> +	}
> +
> +
>  	if (vfio_pci_is_vga(pdev)) {
>  		vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
>  		vga_set_legacy_decoding(pdev,
> diff --git a/drivers/vfio/pci/vfio_pci_virtio.c b/drivers/vfio/pci/vfio_pci_virtio.c
> new file mode 100644
> index 0000000..1a32064
> --- /dev/null
> +++ b/drivers/vfio/pci/vfio_pci_virtio.c
> @@ -0,0 +1,135 @@
> +/*
> + * VFIO PCI Intel Graphics support
> + *
> + * Copyright (C) 2016 Red Hat, Inc.  All rights reserved.
> + *	Author: Alex Williamson <alex.williamson at redhat.com>
> + *
Update
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Register a device specific region through which to provide read-only
> + * access to the Intel IGD opregion.  The register defining the opregion
> + * address is also virtualized to prevent user modification.
Update
> + */
> +
> +#include <linux/io.h>
> +#include <linux/pci.h>
> +#include <linux/uaccess.h>
> +#include <linux/vfio.h>
> +#include <linux/virtio_pci.h>
> +#include <linux/virtio_config.h>
I don't see where io or uaccess are needed here.
> +
> +#include "vfio_pci_private.h"
> +
> +/**
> + * virtio_pci_find_capability - walk capabilities to find device info.
> + * @dev: the pci device
> + * @cfg_type: the VIRTIO_PCI_CAP_* value we seek
> + *
> + * Returns offset of the capability, or 0.
> + */
> +static inline int virtio_pci_find_capability(struct pci_dev *dev, u8 cfg_type)
This is called from probe code, why inline?  There's already a function
with this exact same name in virtio code, can we come up with something
unique to avoid confusion?
> +{
> +	int pos;
> +
> +	for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> +	     pos > 0;
> +	     pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> +		u8 type;
> +		pci_read_config_byte(dev, pos + offsetof(struct virtio_pci_cap,
> +							 cfg_type),
> +				     &type);
> +
> +		if (type != cfg_type)
> +			continue;
> +
> +		/* Ignore structures with reserved BAR values */
> +		if (type != VIRTIO_PCI_CAP_PCI_CFG) {
> +			u8 bar;
> +
> +			pci_read_config_byte(dev, pos +
> +					     offsetof(struct virtio_pci_cap,
> +						      bar),
> +					     &bar);
> +			if (bar > 0x5)
> +				continue;
> +		}
> +
> +		return pos;
> +	}
> +	return 0;
> +}
> +
> +
> +int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, int noiommu)
> +{
> +	struct pci_dev *dev = vdev->pdev;
> +	int common, cfg;
> +	u32 features;
> +	u32 offset;
> +	u8 bar;
> +
> +	/* Without an IOMMU, we don't care */
> +	if (noiommu)
> +		return 0;
> +	/* Check whether device enforces the IOMMU correctly */
> +
> +	/*
> +	 * All modern devices must have common and cfg capabilities. We use cfg
> +	 * capability for access so that we don't need to worry about resource
> +	 * availability. Slow but sure.
> +	 * Note that all vendor-specific fields we access are little-endian
> +	 * which matches what pci config accessors expect, so they do byteswap
> +	 * for us if appropriate.
> +	 */
> +	common = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_COMMON_CFG);
> +	cfg = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_PCI_CFG);
> +	if (!cfg || !common) {
> +                dev_warn(&dev->dev,
> +                         "Virtio device lacks common or pci cfg.\n");
White space
> +		return -ENODEV;
> +	}
> +
> +	pci_read_config_byte(dev, common + offsetof(struct virtio_pci_cap,
> +						    bar),
> +			     &bar);
> +	pci_read_config_dword(dev, common + offsetof(struct virtio_pci_cap,
> +						    offset),
> +			     &offset);
> +
> +	/* Program cfg capability for dword access into common cfg. */
> +	pci_write_config_byte(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						  cap.bar),
> +			      bar);
> +	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						   cap.length),
> +			       0x4);
> +
> +	/* Select features dword that has VIRTIO_F_IOMMU_PLATFORM. */
> +	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						  cap.offset),
> +			       offset + offsetof(struct virtio_pci_common_cfg,
> +						 device_feature_select));
> +	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						  pci_cfg_data),
> +			       VIRTIO_F_IOMMU_PLATFORM / 32);
> +
> +	/* Get the features dword. */
> +	pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						  cap.offset),
> +			       offset + offsetof(struct virtio_pci_common_cfg,
> +						 device_feature));
> +	pci_read_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap,
> +						  pci_cfg_data),
> +			      &features);
> +
> +	/* Does this device obey the platform's IOMMU? If not it's an error. */
> +	if (!(features & (0x1 << (VIRTIO_F_IOMMU_PLATFORM % 32)))) {
> +                dev_warn(&dev->dev,
> +                         "Virtio device lacks VIRTIO_F_IOMMU_PLATFORM.\n");
White space
> +		return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 76d8ec0..e9b20e7 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -1,5 +1,6 @@
>  
>  vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
> +vfio-pci-y += vfio_pci_virtio.o
>  vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
>  
>  obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
