Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 00/11] Introduce a vfio driver over virtio devices
This series introduces a vfio driver over virtio devices to support the
legacy interface functionality for VFs.

Background, from the virtio spec [1].
--------------------------------------------------------------------
In some systems, there is a need to support a virtio legacy driver with
a device that does not directly support the legacy interface. In such
scenarios, a group owner device can provide the legacy interface
functionality for the group member devices. The driver of the owner
device can then access the legacy interface of a member device on behalf
of the legacy member device driver.

For example, with the SR-IOV group type, group members (VFs) cannot
present the legacy interface in an I/O BAR in BAR0 as expected by the
legacy PCI driver. If the legacy driver is running inside a virtual
machine, the hypervisor executing the virtual machine can present a
virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
legacy driver accesses to this I/O BAR and forwards them to the group
owner device (PF) using group administration commands.
--------------------------------------------------------------------

The first 7 patches are in the virtio area and cover the following:
- Introduce the admin virtqueue infrastructure.
- Expose APIs that let upper layers such as vfio and net execute admin
  commands.
- Expose the layout of the commands used to support legacy access.

The above follows the virtio spec material that was recently accepted in
that area [1].

The last 4 patches are in the vfio area and cover the following:
- Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
- Expose admin commands over a virtio device.
- Introduce a vfio driver over virtio devices to support the legacy
  interface functionality for VFs.

The series was tested successfully over virtio-net VFs in the host,
while running both modern and legacy drivers in the guest.
[1] https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c

Yishai

Feng Liu (7):
  virtio-pci: Use virtio pci device layer vq info instead of generic one
  virtio: Define feature bit for administration virtqueue
  virtio-pci: Introduce admin virtqueue
  virtio: Expose the synchronous command helper function
  virtio-pci: Introduce admin command sending function
  virtio-pci: Introduce API to get PF virtio device from VF PCI device
  virtio-pci: Introduce admin commands

Yishai Hadas (4):
  vfio/pci: Expose vfio_pci_core_setup_barmap()
  vfio/pci: Expose vfio_pci_iowrite/read##size()
  vfio/virtio: Expose admin commands over virtio device
  vfio/virtio: Introduce a vfio driver over virtio devices

 MAINTAINERS                            |   6 +
 drivers/net/virtio_net.c               |  21 +-
 drivers/vfio/pci/Kconfig               |   2 +
 drivers/vfio/pci/Makefile              |   2 +
 drivers/vfio/pci/vfio_pci_core.c       |  25 ++
 drivers/vfio/pci/vfio_pci_rdwr.c       |  38 +-
 drivers/vfio/pci/virtio/Kconfig        |  15 +
 drivers/vfio/pci/virtio/Makefile       |   4 +
 drivers/vfio/pci/virtio/cmd.c          | 146 +++++++
 drivers/vfio/pci/virtio/cmd.h          |  35 ++
 drivers/vfio/pci/virtio/main.c         | 546 +++++++++++++++++++++++++
 drivers/virtio/Makefile                |   2 +-
 drivers/virtio/virtio.c                |  44 +-
 drivers/virtio/virtio_pci_common.c     |  24 +-
 drivers/virtio/virtio_pci_common.h     |  17 +-
 drivers/virtio/virtio_pci_modern.c     |  12 +-
 drivers/virtio/virtio_pci_modern_avq.c | 138 +++++++
 drivers/virtio/virtio_ring.c           |  27 ++
 include/linux/vfio_pci_core.h          |  20 +
 include/linux/virtio.h                 |  19 +
 include/linux/virtio_config.h          |   7 +
 include/linux/virtio_pci_modern.h      |   3 +
 include/uapi/linux/virtio_config.h     |   8 +-
 include/uapi/linux/virtio_pci.h        |  66 +++
 24 files changed, 1171 insertions(+), 56 deletions(-)
 create mode 100644 drivers/vfio/pci/virtio/Kconfig
 create mode 100644 drivers/vfio/pci/virtio/Makefile
 create mode 100644 drivers/vfio/pci/virtio/cmd.c
 create mode 100644 drivers/vfio/pci/virtio/cmd.h
 create mode 100644 drivers/vfio/pci/virtio/main.c
 create mode 100644 drivers/virtio/virtio_pci_modern_avq.c

--
2.27.0
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 01/11] virtio-pci: Use virtio pci device layer vq info instead of generic one
From: Feng Liu <feliu at nvidia.com> Currently VQ deletion callback vp_del_vqs() processes generic virtio_device level VQ list instead of VQ information available at PCI layer. To adhere to the layering, use the pci device level VQ information stored in the virtqueues or vqs. This also prepares the code to handle PCI layer admin vq life cycle to be managed within the pci layer and thereby avoid undesired deletion of admin vq by upper layer drivers (net, console, vfio), in the del_vqs() callback. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/virtio/virtio_pci_common.c | 12 +++++++++--- drivers/virtio/virtio_pci_common.h | 1 + 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index c2524a7207cf..7a3e6edc4dd6 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -232,12 +232,16 @@ static void vp_del_vq(struct virtqueue *vq) void vp_del_vqs(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); - struct virtqueue *vq, *n; + struct virtqueue *vq; int i; - list_for_each_entry_safe(vq, n, &vdev->vqs, list) { + for (i = 0; i < vp_dev->nvqs; i++) { + if (!vp_dev->vqs[i]) + continue; + + vq = vp_dev->vqs[i]->vq; if (vp_dev->per_vq_vectors) { - int v = vp_dev->vqs[vq->index]->msix_vector; + int v = vp_dev->vqs[i]->msix_vector; if (v != VIRTIO_MSI_NO_VECTOR) { int irq = pci_irq_vector(vp_dev->pci_dev, v); @@ -294,6 +298,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs, vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL); if (!vp_dev->vqs) return -ENOMEM; + vp_dev->nvqs = nvqs; if (per_vq_vectors) { /* Best option: one for change interrupt, one per vq. */ @@ -365,6 +370,7 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs, vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL); if (!vp_dev->vqs) return -ENOMEM; + vp_dev->nvqs = nvqs; err = request_irq(vp_dev->pci_dev->irq, vp_interrupt, IRQF_SHARED, dev_name(&vdev->dev), vp_dev); diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 4b773bd7c58c..602021967aaa 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -60,6 +60,7 @@ struct virtio_pci_device { /* array of all queues for house-keeping */ struct virtio_pci_vq_info **vqs; + u32 nvqs; /* MSI-X support */ int msix_enabled; -- 2.27.0
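[Editorial sketch, not part of the patch.] For context, a minimal illustration of the teardown walk this change enables, reusing the names from the diff above. Because the PCI layer records the driver-requested queues in vp_dev->vqs at find_vqs() time, walking that array instead of the generic vdev->vqs list keeps a transport-owned admin virtqueue, added later in this series, off the deletion path:

/*
 * Illustrative sketch: only queues the upper-layer driver asked for via
 * find_vqs() end up in vp_dev->vqs[], so this loop never touches a
 * transport-owned admin vq even though that vq is still linked on the
 * generic vdev->vqs list.
 */
static void example_del_driver_vqs(struct virtio_pci_device *vp_dev)
{
	int i;

	for (i = 0; i < vp_dev->nvqs; i++) {
		if (!vp_dev->vqs[i])
			continue;
		/* per-queue IRQ teardown would go here, as in vp_del_vqs() */
		vp_del_vq(vp_dev->vqs[i]->vq);
	}
}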
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 02/11] virtio: Define feature bit for administration virtqueue
From: Feng Liu <feliu at nvidia.com> Introduce VIRTIO_F_ADMIN_VQ which is used for administration virtqueue support. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- include/uapi/linux/virtio_config.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h index 2c712c654165..09d694968b14 100644 --- a/include/uapi/linux/virtio_config.h +++ b/include/uapi/linux/virtio_config.h @@ -52,7 +52,7 @@ * rest are per-device feature bits. */ #define VIRTIO_TRANSPORT_F_START 28 -#define VIRTIO_TRANSPORT_F_END 41 +#define VIRTIO_TRANSPORT_F_END 42 #ifndef VIRTIO_CONFIG_NO_LEGACY /* Do we get callbacks when the ring is completely used, even if we've @@ -109,4 +109,10 @@ * This feature indicates that the driver can reset a queue individually. */ #define VIRTIO_F_RING_RESET 40 + +/* + * This feature indicates that the device support administration virtqueues. + */ +#define VIRTIO_F_ADMIN_VQ 41 + #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */ -- 2.27.0
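[Editorial sketch, not part of the patch.] For illustration, a short example of how code can test for the new bit once feature negotiation is done; virtio_has_feature() is the existing helper for this. Because bit 41 sits in the transport feature range, it is the PCI transport (patch 03), not the device-type drivers, that acts on it:

#include <linux/virtio.h>
#include <linux/virtio_config.h>

/* Illustrative only: true when the device offered VIRTIO_F_ADMIN_VQ
 * and the transport accepted it during feature negotiation. */
static bool example_has_admin_vq(struct virtio_device *vdev)
{
	return virtio_has_feature(vdev, VIRTIO_F_ADMIN_VQ);
}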
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 03/11] virtio-pci: Introduce admin virtqueue
From: Feng Liu <feliu at nvidia.com> Introduce support for the admin virtqueue. By negotiating VIRTIO_F_ADMIN_VQ feature, driver detects capability and creates one administration virtqueue. Administration virtqueue implementation in virtio pci generic layer, enables multiple types of upper layer drivers such as vfio, net, blk to utilize it. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/virtio/Makefile | 2 +- drivers/virtio/virtio.c | 37 +++++++++++++-- drivers/virtio/virtio_pci_common.h | 15 +++++- drivers/virtio/virtio_pci_modern.c | 10 +++- drivers/virtio/virtio_pci_modern_avq.c | 65 ++++++++++++++++++++++++++ include/linux/virtio_config.h | 4 ++ include/linux/virtio_pci_modern.h | 3 ++ 7 files changed, 129 insertions(+), 7 deletions(-) create mode 100644 drivers/virtio/virtio_pci_modern_avq.c diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile index 8e98d24917cc..dcc535b5b4d9 100644 --- a/drivers/virtio/Makefile +++ b/drivers/virtio/Makefile @@ -5,7 +5,7 @@ obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o -virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o +virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o virtio_pci_modern_avq.o virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 3893dc29eb26..f4080692b351 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -302,9 +302,15 @@ static int virtio_dev_probe(struct device *_d) if (err) goto err; + if (dev->config->create_avq) { + err = dev->config->create_avq(dev); + if (err) + goto err; + } + err = drv->probe(dev); if (err) - goto err; + goto err_probe; /* If probe didn't do it, mark device DRIVER_OK ourselves. */ if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK)) @@ -316,6 +322,10 @@ static int virtio_dev_probe(struct device *_d) virtio_config_enable(dev); return 0; + +err_probe: + if (dev->config->destroy_avq) + dev->config->destroy_avq(dev); err: virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); return err; @@ -331,6 +341,9 @@ static void virtio_dev_remove(struct device *_d) drv->remove(dev); + if (dev->config->destroy_avq) + dev->config->destroy_avq(dev); + /* Driver should have reset device. */ WARN_ON_ONCE(dev->config->get_status(dev)); @@ -489,13 +502,20 @@ EXPORT_SYMBOL_GPL(unregister_virtio_device); int virtio_device_freeze(struct virtio_device *dev) { struct virtio_driver *drv = drv_to_virtio(dev->dev.driver); + int ret; virtio_config_disable(dev); dev->failed = dev->config->get_status(dev) & VIRTIO_CONFIG_S_FAILED; - if (drv && drv->freeze) - return drv->freeze(dev); + if (drv && drv->freeze) { + ret = drv->freeze(dev); + if (ret) + return ret; + } + + if (dev->config->destroy_avq) + dev->config->destroy_avq(dev); return 0; } @@ -532,10 +552,16 @@ int virtio_device_restore(struct virtio_device *dev) if (ret) goto err; + if (dev->config->create_avq) { + ret = dev->config->create_avq(dev); + if (ret) + goto err; + } + if (drv->restore) { ret = drv->restore(dev); if (ret) - goto err; + goto err_restore; } /* If restore didn't do it, mark device DRIVER_OK ourselves. 
*/ @@ -546,6 +572,9 @@ int virtio_device_restore(struct virtio_device *dev) return 0; +err_restore: + if (dev->config->destroy_avq) + dev->config->destroy_avq(dev); err: virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); return ret; diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 602021967aaa..9bffa95274b6 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -41,6 +41,14 @@ struct virtio_pci_vq_info { unsigned int msix_vector; }; +struct virtio_avq { + /* Virtqueue info associated with this admin queue. */ + struct virtio_pci_vq_info info; + /* Name of the admin queue: avq.$index. */ + char name[10]; + u16 vq_index; +}; + /* Our device structure */ struct virtio_pci_device { struct virtio_device vdev; @@ -58,10 +66,13 @@ struct virtio_pci_device { spinlock_t lock; struct list_head virtqueues; - /* array of all queues for house-keeping */ + /* Array of all virtqueues reported in the + * PCI common config num_queues field + */ struct virtio_pci_vq_info **vqs; u32 nvqs; + struct virtio_avq *admin; /* MSI-X support */ int msix_enabled; int intx_enabled; @@ -115,6 +126,8 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int nvqs, const char * const names[], const bool *ctx, struct irq_affinity *desc); const char *vp_bus_name(struct virtio_device *vdev); +void vp_destroy_avq(struct virtio_device *vdev); +int vp_create_avq(struct virtio_device *vdev); /* Setup the affinity for a virtqueue: * - force the affinity for per vq vector diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index d6bb68ba84e5..a72c87687196 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -37,6 +37,9 @@ static void vp_transport_features(struct virtio_device *vdev, u64 features) if (features & BIT_ULL(VIRTIO_F_RING_RESET)) __virtio_set_bit(vdev, VIRTIO_F_RING_RESET); + + if (features & BIT_ULL(VIRTIO_F_ADMIN_VQ)) + __virtio_set_bit(vdev, VIRTIO_F_ADMIN_VQ); } /* virtio config->finalize_features() implementation */ @@ -317,7 +320,8 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev, else notify = vp_notify; - if (index >= vp_modern_get_num_queues(mdev)) + if (!((index < vp_modern_get_num_queues(mdev) || + (vp_dev->admin && vp_dev->admin->vq_index == index)))) return ERR_PTR(-EINVAL); /* Check if queue is either not available or already active. 
*/ @@ -509,6 +513,8 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = { .get_shm_region = vp_get_shm_region, .disable_vq_and_reset = vp_modern_disable_vq_and_reset, .enable_vq_after_reset = vp_modern_enable_vq_after_reset, + .create_avq = vp_create_avq, + .destroy_avq = vp_destroy_avq, }; static const struct virtio_config_ops virtio_pci_config_ops = { @@ -529,6 +535,8 @@ static const struct virtio_config_ops virtio_pci_config_ops = { .get_shm_region = vp_get_shm_region, .disable_vq_and_reset = vp_modern_disable_vq_and_reset, .enable_vq_after_reset = vp_modern_enable_vq_after_reset, + .create_avq = vp_create_avq, + .destroy_avq = vp_destroy_avq, }; /* the PCI probing function */ diff --git a/drivers/virtio/virtio_pci_modern_avq.c b/drivers/virtio/virtio_pci_modern_avq.c new file mode 100644 index 000000000000..114579ad788f --- /dev/null +++ b/drivers/virtio/virtio_pci_modern_avq.c @@ -0,0 +1,65 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include <linux/virtio.h> +#include "virtio_pci_common.h" + +static u16 vp_modern_avq_num(struct virtio_pci_modern_device *mdev) +{ + struct virtio_pci_modern_common_cfg __iomem *cfg; + + cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common; + return vp_ioread16(&cfg->admin_queue_num); +} + +static u16 vp_modern_avq_index(struct virtio_pci_modern_device *mdev) +{ + struct virtio_pci_modern_common_cfg __iomem *cfg; + + cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common; + return vp_ioread16(&cfg->admin_queue_index); +} + +int vp_create_avq(struct virtio_device *vdev) +{ + struct virtio_pci_device *vp_dev = to_vp_device(vdev); + struct virtio_avq *avq; + struct virtqueue *vq; + u16 admin_q_num; + + if (!virtio_has_feature(vdev, VIRTIO_F_ADMIN_VQ)) + return 0; + + admin_q_num = vp_modern_avq_num(&vp_dev->mdev); + if (!admin_q_num) + return -EINVAL; + + vp_dev->admin = kzalloc(sizeof(*vp_dev->admin), GFP_KERNEL); + if (!vp_dev->admin) + return -ENOMEM; + + avq = vp_dev->admin; + avq->vq_index = vp_modern_avq_index(&vp_dev->mdev); + sprintf(avq->name, "avq.%u", avq->vq_index); + vq = vp_dev->setup_vq(vp_dev, &vp_dev->admin->info, avq->vq_index, NULL, + avq->name, NULL, VIRTIO_MSI_NO_VECTOR); + if (IS_ERR(vq)) { + dev_err(&vdev->dev, "failed to setup admin virtqueue"); + kfree(vp_dev->admin); + return PTR_ERR(vq); + } + + vp_dev->admin->info.vq = vq; + vp_modern_set_queue_enable(&vp_dev->mdev, avq->info.vq->index, true); + return 0; +} + +void vp_destroy_avq(struct virtio_device *vdev) +{ + struct virtio_pci_device *vp_dev = to_vp_device(vdev); + + if (!vp_dev->admin) + return; + + vp_dev->del_vq(&vp_dev->admin->info); + kfree(vp_dev->admin); +} diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 2b3438de2c4d..028c51ea90ee 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -93,6 +93,8 @@ typedef void vq_callback_t(struct virtqueue *); * Returns 0 on success or error status * If disable_vq_and_reset is set, then enable_vq_after_reset must also be * set. + * @create_avq: initialize admin virtqueue resource. + * @destroy_avq: destroy admin virtqueue resource. 
*/ struct virtio_config_ops { void (*get)(struct virtio_device *vdev, unsigned offset, @@ -120,6 +122,8 @@ struct virtio_config_ops { struct virtio_shm_region *region, u8 id); int (*disable_vq_and_reset)(struct virtqueue *vq); int (*enable_vq_after_reset)(struct virtqueue *vq); + int (*create_avq)(struct virtio_device *vdev); + void (*destroy_avq)(struct virtio_device *vdev); }; /* If driver didn't advertise the feature, it will never appear. */ diff --git a/include/linux/virtio_pci_modern.h b/include/linux/virtio_pci_modern.h index 067ac1d789bc..f6cb13d858fd 100644 --- a/include/linux/virtio_pci_modern.h +++ b/include/linux/virtio_pci_modern.h @@ -10,6 +10,9 @@ struct virtio_pci_modern_common_cfg { __le16 queue_notify_data; /* read-write */ __le16 queue_reset; /* read-write */ + + __le16 admin_queue_index; /* read-only */ + __le16 admin_queue_num; /* read-only */ }; struct virtio_pci_modern_device { -- 2.27.0
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 04/11] virtio: Expose the synchronous command helper function
From: Feng Liu <feliu at nvidia.com> Synchronous command helper function is exposed at virtio layer, so that ctrl virtqueue and admin virtqueues can reuse this helper function to send synchronous commands. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/net/virtio_net.c | 21 ++++++--------------- drivers/virtio/virtio_ring.c | 27 +++++++++++++++++++++++++++ include/linux/virtio.h | 7 +++++++ 3 files changed, 40 insertions(+), 15 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index fe7f314d65c9..65c210b0fb9e 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2451,7 +2451,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, struct scatterlist *out) { struct scatterlist *sgs[4], hdr, stat; - unsigned out_num = 0, tmp; + unsigned int out_num = 0; int ret; /* Caller should know better */ @@ -2472,23 +2472,14 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, sgs[out_num] = &stat; BUG_ON(out_num + 1 > ARRAY_SIZE(sgs)); - ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC); - if (ret < 0) { - dev_warn(&vi->vdev->dev, - "Failed to add sgs for command vq: %d\n.", ret); + ret = virtqueue_exec_cmd(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC); + if (ret) { + dev_err(&vi->vdev->dev, + "Failed to exec command vq(%s,%d): %d\n", + vi->cvq->name, vi->cvq->index, ret); return false; } - if (unlikely(!virtqueue_kick(vi->cvq))) - return vi->ctrl->status == VIRTIO_NET_OK; - - /* Spin for a response, the kick causes an ioport write, trapping - * into the hypervisor, so the request should be handled immediately. - */ - while (!virtqueue_get_buf(vi->cvq, &tmp) && - !virtqueue_is_broken(vi->cvq)) - cpu_relax(); - return vi->ctrl->status == VIRTIO_NET_OK; } diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 51d8f3299c10..253905c0b008 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -3251,4 +3251,31 @@ void virtqueue_dma_sync_single_range_for_device(struct virtqueue *_vq, } EXPORT_SYMBOL_GPL(virtqueue_dma_sync_single_range_for_device); +int virtqueue_exec_cmd(struct virtqueue *vq, + struct scatterlist **sgs, + unsigned int out_num, + unsigned int in_num, + void *data, + gfp_t gfp) +{ + int ret, len; + + ret = virtqueue_add_sgs(vq, sgs, out_num, in_num, data, gfp); + if (ret < 0) + return ret; + + if (unlikely(!virtqueue_kick(vq))) + return -EIO; + + /* Spin for a response, the kick causes an ioport write, trapping + * into the hypervisor, so the request should be handled immediately. + */ + while (!virtqueue_get_buf(vq, &len) && + !virtqueue_is_broken(vq)) + cpu_relax(); + + return 0; +} +EXPORT_SYMBOL_GPL(virtqueue_exec_cmd); + MODULE_LICENSE("GPL"); diff --git a/include/linux/virtio.h b/include/linux/virtio.h index 4cc614a38376..9d39706bed10 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -103,6 +103,13 @@ int virtqueue_resize(struct virtqueue *vq, u32 num, int virtqueue_reset(struct virtqueue *vq, void (*recycle)(struct virtqueue *vq, void *buf)); +int virtqueue_exec_cmd(struct virtqueue *vq, + struct scatterlist **sgs, + unsigned int out_num, + unsigned int in_num, + void *data, + gfp_t gfp); + /** * struct virtio_device - representation of a device using virtio * @index: unique position on the virtio bus -- 2.27.0
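[Editorial sketch, hypothetical caller.] Any command-style virtqueue that follows the "kick and spin for a response" model can now go through the exported helper instead of open-coding the loop, as virtio-net's control queue does in the hunk above:

#include <linux/scatterlist.h>
#include <linux/virtio.h>

/* Hypothetical example: send one request buffer and receive one status
 * byte over a command virtqueue.  req and status must be kmalloc'd
 * (DMA-able) buffers, not on-stack memory. */
static int example_send_cmd(struct virtqueue *vq, void *req, size_t req_len,
			    u8 *status)
{
	struct scatterlist out, in, *sgs[2];

	sg_init_one(&out, req, req_len);
	sg_init_one(&in, status, sizeof(*status));
	sgs[0] = &out;
	sgs[1] = &in;

	/* One out sg and one in sg; the helper kicks the queue and spins
	 * until the device returns the buffer or the queue is broken. */
	return virtqueue_exec_cmd(vq, sgs, 1, 1, req, GFP_KERNEL);
}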
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 05/11] virtio-pci: Introduce admin command sending function
From: Feng Liu <feliu at nvidia.com> Add support for sending admin command through admin virtqueue interface, and expose generic API to execute virtio admin command. Reuse the send synchronous command helper function at virtio transport layer. In addition, add new result state of admin command and admin commands range definitions. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/virtio/virtio.c | 7 +++ drivers/virtio/virtio_pci_common.h | 1 + drivers/virtio/virtio_pci_modern.c | 2 + drivers/virtio/virtio_pci_modern_avq.c | 73 ++++++++++++++++++++++++++ include/linux/virtio.h | 11 ++++ include/linux/virtio_config.h | 3 ++ include/uapi/linux/virtio_pci.h | 22 ++++++++ 7 files changed, 119 insertions(+) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index f4080692b351..dd71f584a1bd 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -582,6 +582,13 @@ int virtio_device_restore(struct virtio_device *dev) EXPORT_SYMBOL_GPL(virtio_device_restore); #endif +int virtio_admin_cmd_exec(struct virtio_device *vdev, + struct virtio_admin_cmd *cmd) +{ + return vdev->config->exec_admin_cmd(vdev, cmd); +} +EXPORT_SYMBOL_GPL(virtio_admin_cmd_exec); + static int virtio_init(void) { if (bus_register(&virtio_bus) != 0) diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 9bffa95274b6..a579f1338263 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -128,6 +128,7 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned int nvqs, const char *vp_bus_name(struct virtio_device *vdev); void vp_destroy_avq(struct virtio_device *vdev); int vp_create_avq(struct virtio_device *vdev); +int vp_avq_cmd_exec(struct virtio_device *vdev, struct virtio_admin_cmd *cmd); /* Setup the affinity for a virtqueue: * - force the affinity for per vq vector diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index a72c87687196..cac18872b088 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -515,6 +515,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = { .enable_vq_after_reset = vp_modern_enable_vq_after_reset, .create_avq = vp_create_avq, .destroy_avq = vp_destroy_avq, + .exec_admin_cmd = vp_avq_cmd_exec, }; static const struct virtio_config_ops virtio_pci_config_ops = { @@ -537,6 +538,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = { .enable_vq_after_reset = vp_modern_enable_vq_after_reset, .create_avq = vp_create_avq, .destroy_avq = vp_destroy_avq, + .exec_admin_cmd = vp_avq_cmd_exec, }; /* the PCI probing function */ diff --git a/drivers/virtio/virtio_pci_modern_avq.c b/drivers/virtio/virtio_pci_modern_avq.c index 114579ad788f..ca3fe10f616d 100644 --- a/drivers/virtio/virtio_pci_modern_avq.c +++ b/drivers/virtio/virtio_pci_modern_avq.c @@ -19,6 +19,79 @@ static u16 vp_modern_avq_index(struct virtio_pci_modern_device *mdev) return vp_ioread16(&cfg->admin_queue_index); } +#define VIRTIO_AVQ_SGS_MAX 4 + +int vp_avq_cmd_exec(struct virtio_device *vdev, struct virtio_admin_cmd *cmd) +{ + struct scatterlist *sgs[VIRTIO_AVQ_SGS_MAX], hdr, stat; + struct virtio_pci_device *vp_dev = to_vp_device(vdev); + struct virtio_admin_cmd_status *va_status; + unsigned int out_num = 0, in_num = 0; + struct virtio_admin_cmd_hdr *va_hdr; + struct virtqueue *avq; + u16 status; + int 
ret; + + avq = vp_dev->admin ? vp_dev->admin->info.vq : NULL; + if (!avq) + return -EOPNOTSUPP; + + va_status = kzalloc(sizeof(*va_status), GFP_KERNEL); + if (!va_status) + return -ENOMEM; + + va_hdr = kzalloc(sizeof(*va_hdr), GFP_KERNEL); + if (!va_hdr) { + ret = -ENOMEM; + goto err_alloc; + } + + va_hdr->opcode = cmd->opcode; + va_hdr->group_type = cmd->group_type; + va_hdr->group_member_id = cmd->group_member_id; + + /* Add header */ + sg_init_one(&hdr, va_hdr, sizeof(*va_hdr)); + sgs[out_num] = &hdr; + out_num++; + + if (cmd->data_sg) { + sgs[out_num] = cmd->data_sg; + out_num++; + } + + /* Add return status */ + sg_init_one(&stat, va_status, sizeof(*va_status)); + sgs[out_num + in_num] = &stat; + in_num++; + + if (cmd->result_sg) { + sgs[out_num + in_num] = cmd->result_sg; + in_num++; + } + + ret = virtqueue_exec_cmd(avq, sgs, out_num, in_num, sgs, GFP_KERNEL); + if (ret) { + dev_err(&vdev->dev, + "Failed to execute command on admin vq: %d\n.", ret); + goto err_cmd_exec; + } + + status = le16_to_cpu(va_status->status); + if (status != VIRTIO_ADMIN_STATUS_OK) { + dev_err(&vdev->dev, + "admin command error: status(%#x) qualifier(%#x)\n", + status, le16_to_cpu(va_status->status_qualifier)); + ret = -status; + } + +err_cmd_exec: + kfree(va_hdr); +err_alloc: + kfree(va_status); + return ret; +} + int vp_create_avq(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); diff --git a/include/linux/virtio.h b/include/linux/virtio.h index 9d39706bed10..094a2ef1c8b8 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -110,6 +110,14 @@ int virtqueue_exec_cmd(struct virtqueue *vq, void *data, gfp_t gfp); +struct virtio_admin_cmd { + __le16 opcode; + __le16 group_type; + __le64 group_member_id; + struct scatterlist *data_sg; + struct scatterlist *result_sg; +}; + /** * struct virtio_device - representation of a device using virtio * @index: unique position on the virtio bus @@ -207,6 +215,9 @@ static inline struct virtio_driver *drv_to_virtio(struct device_driver *drv) return container_of(drv, struct virtio_driver, driver); } +int virtio_admin_cmd_exec(struct virtio_device *vdev, + struct virtio_admin_cmd *cmd); + int register_virtio_driver(struct virtio_driver *drv); void unregister_virtio_driver(struct virtio_driver *drv); diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 028c51ea90ee..e213173e1291 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -95,6 +95,7 @@ typedef void vq_callback_t(struct virtqueue *); * set. * @create_avq: initialize admin virtqueue resource. * @destroy_avq: destroy admin virtqueue resource. + * @exec_admin_cmd: Send admin command and get result. */ struct virtio_config_ops { void (*get)(struct virtio_device *vdev, unsigned offset, @@ -124,6 +125,8 @@ struct virtio_config_ops { int (*enable_vq_after_reset)(struct virtqueue *vq); int (*create_avq)(struct virtio_device *vdev); void (*destroy_avq)(struct virtio_device *vdev); + int (*exec_admin_cmd)(struct virtio_device *vdev, + struct virtio_admin_cmd *cmd); }; /* If driver didn't advertise the feature, it will never appear. */ diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h index f703afc7ad31..1f1ac6ac07df 100644 --- a/include/uapi/linux/virtio_pci.h +++ b/include/uapi/linux/virtio_pci.h @@ -207,4 +207,26 @@ struct virtio_pci_cfg_cap { #endif /* VIRTIO_PCI_NO_MODERN */ +/* Admin command status. 
*/ +#define VIRTIO_ADMIN_STATUS_OK 0 + +struct virtio_admin_cmd_hdr { + __le16 opcode; + /* + * 1 - SR-IOV + * 2-65535 - reserved + */ + __le16 group_type; + /* Unused, reserved for future extensions. */ + __u8 reserved1[12]; + __le64 group_member_id; +} __packed; + +struct virtio_admin_cmd_status { + __le16 status; + __le16 status_qualifier; + /* Unused, reserved for future extensions. */ + __u8 reserved2[4]; +} __packed; + #endif -- 2.27.0
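[Editorial sketch, hypothetical caller; patch 10 follows the same pattern.] To make the calling convention concrete: an admin command is described by a struct virtio_admin_cmd and handed to virtio_admin_cmd_exec(), which routes it to the transport's exec_admin_cmd op. The SR-IOV group type value 1 only gets a named constant in patch 07, so a literal is used here:

#include <linux/scatterlist.h>
#include <linux/virtio.h>

/* Hypothetical: run an admin command that only returns data, on behalf
 * of SR-IOV group member 'member_id'.  'res' must be a DMA-able
 * (kmalloc'd) buffer. */
static int example_admin_cmd(struct virtio_device *pf_vdev, u16 opcode,
			     u64 member_id, void *res, size_t res_len)
{
	struct virtio_admin_cmd cmd = {};
	struct scatterlist result_sg;

	sg_init_one(&result_sg, res, res_len);
	cmd.opcode = cpu_to_le16(opcode);
	cmd.group_type = cpu_to_le16(1); /* 1 = SR-IOV, per the header comment */
	cmd.group_member_id = cpu_to_le64(member_id);
	cmd.result_sg = &result_sg;

	/* 0 on success, -EOPNOTSUPP when there is no admin queue, or the
	 * negated device status when the command itself fails. */
	return virtio_admin_cmd_exec(pf_vdev, &cmd);
}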
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 06/11] virtio-pci: Introduce API to get PF virtio device from VF PCI device
From: Feng Liu <feliu at nvidia.com> Introduce API to get PF virtio device from the given VF PCI device so that other modules such as vfio in subsequent patch can use it. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/virtio/virtio_pci_common.c | 12 ++++++++++++ include/linux/virtio.h | 1 + 2 files changed, 13 insertions(+) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 7a3e6edc4dd6..c64484cd5b13 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -648,6 +648,18 @@ static struct pci_driver virtio_pci_driver = { .sriov_configure = virtio_pci_sriov_configure, }; +struct virtio_device *virtio_pci_vf_get_pf_dev(struct pci_dev *pdev) +{ + struct virtio_pci_device *pf_vp_dev; + + pf_vp_dev = pci_iov_get_pf_drvdata(pdev, &virtio_pci_driver); + if (IS_ERR(pf_vp_dev)) + return NULL; + + return &pf_vp_dev->vdev; +} +EXPORT_SYMBOL_GPL(virtio_pci_vf_get_pf_dev); + module_pci_driver(virtio_pci_driver); MODULE_AUTHOR("Anthony Liguori <aliguori at us.ibm.com>"); diff --git a/include/linux/virtio.h b/include/linux/virtio.h index 094a2ef1c8b8..4ae088ea9299 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -217,6 +217,7 @@ static inline struct virtio_driver *drv_to_virtio(struct device_driver *drv) int virtio_admin_cmd_exec(struct virtio_device *vdev, struct virtio_admin_cmd *cmd); +struct virtio_device *virtio_pci_vf_get_pf_dev(struct pci_dev *pdev); int register_virtio_driver(struct virtio_driver *drv); void unregister_virtio_driver(struct virtio_driver *drv); -- 2.27.0
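[Editorial sketch, hypothetical code; patch 10 uses the same entry point.] A driver bound to the VF's PCI function can look up the PF's virtio device and decide whether an admin channel is usable at all. Checking VIRTIO_F_ADMIN_VQ here is an extra sanity check added for the example, not something the patch mandates:

#include <linux/pci.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>

/* Hypothetical: probe-time check in a VF driver. */
static int example_check_admin_channel(struct pci_dev *vf_pdev)
{
	struct virtio_device *pf_vdev = virtio_pci_vf_get_pf_dev(vf_pdev);

	/* NULL means there is no PF bound to virtio-pci to talk to. */
	if (!pf_vdev)
		return -ENOTCONN;

	if (!virtio_has_feature(pf_vdev, VIRTIO_F_ADMIN_VQ))
		return -EOPNOTSUPP;

	return 0;
}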
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 07/11] virtio-pci: Introduce admin commands
From: Feng Liu <feliu at nvidia.com> Introduces admin commands, as follow: The "list query" command can be used by the driver to query the set of admin commands supported by the virtio device. The "list use" command is used to inform the virtio device which admin commands the driver will use. The "legacy common cfg rd/wr" commands are used to read from/write into the legacy common configuration structure. The "legacy dev cfg rd/wr" commands are used to read from/write into the legacy device configuration structure. The "notify info" command is used to query the notification region information. Signed-off-by: Feng Liu <feliu at nvidia.com> Reviewed-by: Parav Pandit <parav at nvidia.com> Reviewed-by: Jiri Pirko <jiri at nvidia.com> Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- include/uapi/linux/virtio_pci.h | 44 +++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h index 1f1ac6ac07df..2bf275ad0f20 100644 --- a/include/uapi/linux/virtio_pci.h +++ b/include/uapi/linux/virtio_pci.h @@ -210,6 +210,23 @@ struct virtio_pci_cfg_cap { /* Admin command status. */ #define VIRTIO_ADMIN_STATUS_OK 0 +/* Admin command opcode. */ +#define VIRTIO_ADMIN_CMD_LIST_QUERY 0x0 +#define VIRTIO_ADMIN_CMD_LIST_USE 0x1 + +/* Admin command group type. */ +#define VIRTIO_ADMIN_GROUP_TYPE_SRIOV 0x1 + +/* Transitional device admin command. */ +#define VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE 0x2 +#define VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ 0x3 +#define VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE 0x4 +#define VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ 0x5 +#define VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO 0x6 + +/* Increment MAX_OPCODE to next value when new opcode is added */ +#define VIRTIO_ADMIN_MAX_CMD_OPCODE 0x6 + struct virtio_admin_cmd_hdr { __le16 opcode; /* @@ -229,4 +246,31 @@ struct virtio_admin_cmd_status { __u8 reserved2[4]; } __packed; +struct virtio_admin_cmd_legacy_wr_data { + u8 offset; /* Starting offset of the register(s) to write. */ + u8 reserved[7]; + u8 registers[]; +} __packed; + +struct virtio_admin_cmd_legacy_rd_data { + u8 offset; /* Starting offset of the register(s) to read. */ +} __packed; + +#define VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_END 0 +#define VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_OWNER_DEV 0x1 +#define VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_OWNER_MEM 0x2 + +#define VIRTIO_ADMIN_CMD_MAX_NOTIFY_INFO 4 + +struct virtio_admin_cmd_notify_info_data { + u8 flags; /* 0 = end of list, 1 = owner device, 2 = member device */ + u8 bar; /* BAR of the member or the owner device */ + u8 padding[6]; + __le64 offset; /* Offset within bar. */ +}; __packed + +struct virtio_admin_cmd_notify_info_result { + struct virtio_admin_cmd_notify_info_data entries[VIRTIO_ADMIN_CMD_MAX_NOTIFY_INFO]; +}; + #endif -- 2.27.0
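[Editorial sketch, hypothetical helper; patch 10 builds the same payload.] To make the wire layout concrete, this fills a legacy common-config write command, where the flexible registers[] array carries the bytes to be stored starting at 'offset' in the member VF's legacy configuration:

#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/virtio.h>
#include <linux/virtio_pci.h>

/* Hypothetical: write 'size' bytes at 'offset' of the member's legacy
 * common configuration through the PF's admin queue. */
static int example_legacy_common_cfg_write(struct virtio_device *pf_vdev,
					   u64 member_id, u8 offset,
					   const u8 *buf, u8 size)
{
	struct virtio_admin_cmd_legacy_wr_data *data;
	struct virtio_admin_cmd cmd = {};
	struct scatterlist data_sg;
	int ret;

	data = kzalloc(sizeof(*data) + size, GFP_KERNEL);
	if (!data)
		return -ENOMEM;

	data->offset = offset;
	memcpy(data->registers, buf, size);
	sg_init_one(&data_sg, data, sizeof(*data) + size);

	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE);
	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SRIOV);
	cmd.group_member_id = cpu_to_le64(member_id);
	cmd.data_sg = &data_sg;

	ret = virtio_admin_cmd_exec(pf_vdev, &cmd);
	kfree(data);
	return ret;
}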
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 08/11] vfio/pci: Expose vfio_pci_core_setup_barmap()
Expose vfio_pci_core_setup_barmap() to be used by drivers. This will let drivers to mmap a BAR and re-use it from both vfio and the driver when it's applicable. This API will be used in the next patches by the vfio/virtio coming driver. Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/vfio/pci/vfio_pci_core.c | 25 +++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci_rdwr.c | 28 ++-------------------------- include/linux/vfio_pci_core.h | 1 + 3 files changed, 28 insertions(+), 26 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 1929103ee59a..b56111ed8a8c 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -684,6 +684,31 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev) } EXPORT_SYMBOL_GPL(vfio_pci_core_disable); +int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar) +{ + struct pci_dev *pdev = vdev->pdev; + void __iomem *io; + int ret; + + if (vdev->barmap[bar]) + return 0; + + ret = pci_request_selected_regions(pdev, 1 << bar, "vfio"); + if (ret) + return ret; + + io = pci_iomap(pdev, bar, 0); + if (!io) { + pci_release_selected_regions(pdev, 1 << bar); + return -ENOMEM; + } + + vdev->barmap[bar] = io; + + return 0; +} +EXPORT_SYMBOL(vfio_pci_core_setup_barmap); + void vfio_pci_core_close_device(struct vfio_device *core_vdev) { struct vfio_pci_core_device *vdev diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index e27de61ac9fe..6f08b3ecbb89 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -200,30 +200,6 @@ static ssize_t do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, return done; } -static int vfio_pci_setup_barmap(struct vfio_pci_core_device *vdev, int bar) -{ - struct pci_dev *pdev = vdev->pdev; - int ret; - void __iomem *io; - - if (vdev->barmap[bar]) - return 0; - - ret = pci_request_selected_regions(pdev, 1 << bar, "vfio"); - if (ret) - return ret; - - io = pci_iomap(pdev, bar, 0); - if (!io) { - pci_release_selected_regions(pdev, 1 << bar); - return -ENOMEM; - } - - vdev->barmap[bar] = io; - - return 0; -} - ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, size_t count, loff_t *ppos, bool iswrite) { @@ -262,7 +238,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf, } x_end = end; } else { - int ret = vfio_pci_setup_barmap(vdev, bar); + int ret = vfio_pci_core_setup_barmap(vdev, bar); if (ret) { done = ret; goto out; @@ -438,7 +414,7 @@ int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset, return -EINVAL; #endif - ret = vfio_pci_setup_barmap(vdev, bar); + ret = vfio_pci_core_setup_barmap(vdev, bar); if (ret) return ret; diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 562e8754869d..67ac58e20e1d 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -127,6 +127,7 @@ int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf); int vfio_pci_core_enable(struct vfio_pci_core_device *vdev); void vfio_pci_core_disable(struct vfio_pci_core_device *vdev); void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev); +int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar); pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev, pci_channel_state_t state); -- 2.27.0
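[Editorial sketch, hypothetical helper; patch 11 does essentially this for the queue-notify area.] A variant driver can share a BAR mapping with vfio-pci core as follows:

#include <linux/vfio_pci_core.h>

/* Hypothetical: map (or reuse) a BAR and return a pointer to a region
 * at 'offset' inside it.  The mapping lands in core->barmap[bar], is
 * shared with vfio's own read/write paths, and is torn down by
 * vfio_pci_core_disable(). */
static void __iomem *example_map_bar_region(struct vfio_pci_core_device *core,
					    int bar, u32 offset)
{
	if (vfio_pci_core_setup_barmap(core, bar))
		return NULL;

	return core->barmap[bar] + offset;
}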
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 09/11] vfio/pci: Expose vfio_pci_iowrite/read##size()
Expose vfio_pci_iowrite/read##size() to let it be used by drivers. This functionality is needed to enable direct access to some physical BAR of the device with the proper locks/checks in place. The next patches from this series will use this functionality on a data path flow when a direct access to the BAR is needed. Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/vfio/pci/vfio_pci_rdwr.c | 10 ++++++---- include/linux/vfio_pci_core.h | 19 +++++++++++++++++++ 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 6f08b3ecbb89..5d84bad7d30c 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -38,7 +38,7 @@ #define vfio_iowrite8 iowrite8 #define VFIO_IOWRITE(size) \ -static int vfio_pci_iowrite##size(struct vfio_pci_core_device *vdev, \ +int vfio_pci_iowrite##size(struct vfio_pci_core_device *vdev, \ bool test_mem, u##size val, void __iomem *io) \ { \ if (test_mem) { \ @@ -55,7 +55,8 @@ static int vfio_pci_iowrite##size(struct vfio_pci_core_device *vdev, \ up_read(&vdev->memory_lock); \ \ return 0; \ -} +} \ +EXPORT_SYMBOL(vfio_pci_iowrite##size); VFIO_IOWRITE(8) VFIO_IOWRITE(16) @@ -65,7 +66,7 @@ VFIO_IOWRITE(64) #endif #define VFIO_IOREAD(size) \ -static int vfio_pci_ioread##size(struct vfio_pci_core_device *vdev, \ +int vfio_pci_ioread##size(struct vfio_pci_core_device *vdev, \ bool test_mem, u##size *val, void __iomem *io) \ { \ if (test_mem) { \ @@ -82,7 +83,8 @@ static int vfio_pci_ioread##size(struct vfio_pci_core_device *vdev, \ up_read(&vdev->memory_lock); \ \ return 0; \ -} +} \ +EXPORT_SYMBOL(vfio_pci_ioread##size); VFIO_IOREAD(8) VFIO_IOREAD(16) diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 67ac58e20e1d..22c915317788 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -131,4 +131,23 @@ int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar); pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev, pci_channel_state_t state); +#define VFIO_IOWRITE_DECLATION(size) \ +int vfio_pci_iowrite##size(struct vfio_pci_core_device *vdev, \ + bool test_mem, u##size val, void __iomem *io); + +VFIO_IOWRITE_DECLATION(8) +VFIO_IOWRITE_DECLATION(16) +VFIO_IOWRITE_DECLATION(32) +#ifdef iowrite64 +VFIO_IOWRITE_DECLATION(64) +#endif + +#define VFIO_IOREAD_DECLATION(size) \ +int vfio_pci_ioread##size(struct vfio_pci_core_device *vdev, \ + bool test_mem, u##size *val, void __iomem *io); + +VFIO_IOREAD_DECLATION(8) +VFIO_IOREAD_DECLATION(16) +VFIO_IOREAD_DECLATION(32) + #endif /* VFIO_PCI_CORE_H */ -- 2.27.0
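[Editorial sketch, hypothetical usage; patch 11 uses the 16-bit variants on its notify path.] The exported accessors bundle the "is PCI memory decoding enabled" check, taken under memory_lock, together with the MMIO access itself:

#include <linux/vfio_pci_core.h>

/* Hypothetical: ring a doorbell through vfio's checked MMIO helper.
 * With test_mem set, the helper returns -EIO instead of touching the
 * BAR while memory space is disabled. */
static int example_ring_doorbell(struct vfio_pci_core_device *core,
				 void __iomem *doorbell, u16 val)
{
	return vfio_pci_iowrite16(core, true, val, doorbell);
}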
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 10/11] vfio/virtio: Expose admin commands over virtio device
Expose admin commands over the virtio device, to be used by the vfio-virtio driver in the next patches. It includes: list query/use, legacy write/read, read notify_info. Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- drivers/vfio/pci/virtio/cmd.c | 146 ++++++++++++++++++++++++++++++++++ drivers/vfio/pci/virtio/cmd.h | 27 +++++++ 2 files changed, 173 insertions(+) create mode 100644 drivers/vfio/pci/virtio/cmd.c create mode 100644 drivers/vfio/pci/virtio/cmd.h diff --git a/drivers/vfio/pci/virtio/cmd.c b/drivers/vfio/pci/virtio/cmd.c new file mode 100644 index 000000000000..f068239cdbb0 --- /dev/null +++ b/drivers/vfio/pci/virtio/cmd.c @@ -0,0 +1,146 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#include "cmd.h" + +int virtiovf_cmd_list_query(struct pci_dev *pdev, u8 *buf, int buf_size) +{ + struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev); + struct scatterlist out_sg; + struct virtio_admin_cmd cmd = {}; + + if (!virtio_dev) + return -ENOTCONN; + + sg_init_one(&out_sg, buf, buf_size); + cmd.opcode = VIRTIO_ADMIN_CMD_LIST_QUERY; + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; + cmd.result_sg = &out_sg; + + return virtio_admin_cmd_exec(virtio_dev, &cmd); +} + +int virtiovf_cmd_list_use(struct pci_dev *pdev, u8 *buf, int buf_size) +{ + struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev); + struct scatterlist in_sg; + struct virtio_admin_cmd cmd = {}; + + if (!virtio_dev) + return -ENOTCONN; + + sg_init_one(&in_sg, buf, buf_size); + cmd.opcode = VIRTIO_ADMIN_CMD_LIST_USE; + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; + cmd.data_sg = &in_sg; + + return virtio_admin_cmd_exec(virtio_dev, &cmd); +} + +int virtiovf_cmd_lr_write(struct virtiovf_pci_core_device *virtvdev, u16 opcode, + u8 offset, u8 size, u8 *buf) +{ + struct virtio_device *virtio_dev + virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); + struct virtio_admin_cmd_data_lr_write *in; + struct scatterlist in_sg; + struct virtio_admin_cmd cmd = {}; + int ret; + + if (!virtio_dev) + return -ENOTCONN; + + in = kzalloc(sizeof(*in) + size, GFP_KERNEL); + if (!in) + return -ENOMEM; + + in->offset = offset; + memcpy(in->registers, buf, size); + sg_init_one(&in_sg, in, sizeof(*in) + size); + cmd.opcode = opcode; + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; + cmd.group_member_id = virtvdev->vf_id + 1; + cmd.data_sg = &in_sg; + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); + + kfree(in); + return ret; +} + +int virtiovf_cmd_lr_read(struct virtiovf_pci_core_device *virtvdev, u16 opcode, + u8 offset, u8 size, u8 *buf) +{ + struct virtio_device *virtio_dev + virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); + struct virtio_admin_cmd_data_lr_read *in; + struct scatterlist in_sg, out_sg; + struct virtio_admin_cmd cmd = {}; + int ret; + + if (!virtio_dev) + return -ENOTCONN; + + in = kzalloc(sizeof(*in), GFP_KERNEL); + if (!in) + return -ENOMEM; + + in->offset = offset; + sg_init_one(&in_sg, in, sizeof(*in)); + sg_init_one(&out_sg, buf, size); + cmd.opcode = opcode; + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; + cmd.data_sg = &in_sg; + cmd.result_sg = &out_sg; + cmd.group_member_id = virtvdev->vf_id + 1; + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); + + kfree(in); + return ret; +} + +int virtiovf_cmd_lq_read_notify(struct virtiovf_pci_core_device *virtvdev, + u8 req_bar_flags, u8 *bar, u64 *bar_offset) +{ + struct virtio_device *virtio_dev + 
virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); + struct virtio_admin_cmd_notify_info_result *out; + struct scatterlist out_sg; + struct virtio_admin_cmd cmd = {}; + int ret; + + if (!virtio_dev) + return -ENOTCONN; + + out = kzalloc(sizeof(*out), GFP_KERNEL); + if (!out) + return -ENOMEM; + + sg_init_one(&out_sg, out, sizeof(*out)); + cmd.opcode = VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO; + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; + cmd.result_sg = &out_sg; + cmd.group_member_id = virtvdev->vf_id + 1; + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); + if (!ret) { + struct virtio_admin_cmd_notify_info_data *entry; + int i; + + ret = -ENOENT; + for (i = 0; i < VIRTIO_ADMIN_CMD_MAX_NOTIFY_INFO; i++) { + entry = &out->entries[i]; + if (entry->flags == VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_END) + break; + if (entry->flags != req_bar_flags) + continue; + *bar = entry->bar; + *bar_offset = le64_to_cpu(entry->offset); + ret = 0; + break; + } + } + + kfree(out); + return ret; +} diff --git a/drivers/vfio/pci/virtio/cmd.h b/drivers/vfio/pci/virtio/cmd.h new file mode 100644 index 000000000000..c2a3645f4b90 --- /dev/null +++ b/drivers/vfio/pci/virtio/cmd.h @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + */ + +#ifndef VIRTIO_VFIO_CMD_H +#define VIRTIO_VFIO_CMD_H + +#include <linux/kernel.h> +#include <linux/virtio.h> +#include <linux/vfio_pci_core.h> +#include <linux/virtio_pci.h> + +struct virtiovf_pci_core_device { + struct vfio_pci_core_device core_device; + int vf_id; +}; + +int virtiovf_cmd_list_query(struct pci_dev *pdev, u8 *buf, int buf_size); +int virtiovf_cmd_list_use(struct pci_dev *pdev, u8 *buf, int buf_size); +int virtiovf_cmd_lr_write(struct virtiovf_pci_core_device *virtvdev, u16 opcode, + u8 offset, u8 size, u8 *buf); +int virtiovf_cmd_lr_read(struct virtiovf_pci_core_device *virtvdev, u16 opcode, + u8 offset, u8 size, u8 *buf); +int virtiovf_cmd_lq_read_notify(struct virtiovf_pci_core_device *virtvdev, + u8 req_bar_flags, u8 *bar, u64 *bar_offset); +#endif /* VIRTIO_VFIO_CMD_H */ -- 2.27.0
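[Editorial sketch, hypothetical code; the series' real probe logic is in patch 11.] To tie the helpers together, here is how a probe path might drive the list query/use handshake before relying on the legacy opcodes. Interpreting the LIST_QUERY result as a little-endian per-opcode bitmap is an assumption made for this example:

#include <linux/slab.h>
#include "cmd.h"

#define EXAMPLE_OPCODE_BITMAP_SIZE 8	/* covers opcodes 0..63 */

/* Assumption for illustration: bit N of the returned buffer means
 * admin command opcode N is supported. */
static bool example_opcode_supported(const u8 *bitmap, u16 opcode)
{
	return bitmap[opcode / 8] & (1 << (opcode % 8));
}

static int example_enable_legacy_access(struct virtiovf_pci_core_device *virtvdev)
{
	struct pci_dev *pdev = virtvdev->core_device.pdev;
	u8 *bitmap;
	int ret;

	/* DMA-able buffer: the helpers map it through a scatterlist. */
	bitmap = kzalloc(EXAMPLE_OPCODE_BITMAP_SIZE, GFP_KERNEL);
	if (!bitmap)
		return -ENOMEM;

	ret = virtiovf_cmd_list_query(pdev, bitmap, EXAMPLE_OPCODE_BITMAP_SIZE);
	if (ret)
		goto out;

	if (!example_opcode_supported(bitmap, VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE) ||
	    !example_opcode_supported(bitmap, VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ) ||
	    !example_opcode_supported(bitmap, VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO)) {
		ret = -EOPNOTSUPP;
		goto out;
	}

	/* Tell the device which of the opcodes this driver will use. */
	ret = virtiovf_cmd_list_use(pdev, bitmap, EXAMPLE_OPCODE_BITMAP_SIZE);
out:
	kfree(bitmap);
	return ret;
}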
Yishai Hadas
2023-Sep-21 12:40 UTC
[PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
Introduce a vfio driver over virtio devices to support the legacy interface functionality for VFs. Background, from the virtio spec [1]. -------------------------------------------------------------------- In some systems, there is a need to support a virtio legacy driver with a device that does not directly support the legacy interface. In such scenarios, a group owner device can provide the legacy interface functionality for the group member devices. The driver of the owner device can then access the legacy interface of a member device on behalf of the legacy member device driver. For example, with the SR-IOV group type, group members (VFs) can not present the legacy interface in an I/O BAR in BAR0 as expected by the legacy pci driver. If the legacy driver is running inside a virtual machine, the hypervisor executing the virtual machine can present a virtual device with an I/O BAR in BAR0. The hypervisor intercepts the legacy driver accesses to this I/O BAR and forwards them to the group owner device (PF) using group administration commands. -------------------------------------------------------------------- Specifically, this driver adds support for a virtio-net VF to be exposed as a transitional device to a guest driver and allows the legacy IO BAR functionality on top. This allows a VM which uses a legacy virtio-net driver in the guest to work transparently over a VF which its driver in the host is that new driver. The driver can be extended easily to support some other types of virtio devices (e.g virtio-blk), by adding in a few places the specific type properties as was done for virtio-net. For now, only the virtio-net use case was tested and as such we introduce the support only for such a device. Practically, Upon probing a VF for a virtio-net device, in case its PF supports legacy access over the virtio admin commands and the VF doesn't have BAR 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a transitional device with I/O BAR in BAR 0. The existence of the simulated I/O bar is reported later on by overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device exposes itself as a transitional device by overwriting some properties upon reading its config space. Once we report the existence of I/O BAR as BAR 0 a legacy driver in the guest may use it via read/write calls according to the virtio specification. Any read/write towards the control parts of the BAR will be captured by the new driver and will be translated into admin commands towards the device. Any data path read/write access (i.e. virtio driver notifications) will be forwarded to the physical BAR which its properties were supplied by the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow. With that code in place a legacy driver in the guest has the look and feel as if having a transitional device with legacy support for both its control and data path flows. 
[1] https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c Signed-off-by: Yishai Hadas <yishaih at nvidia.com> --- MAINTAINERS | 6 + drivers/vfio/pci/Kconfig | 2 + drivers/vfio/pci/Makefile | 2 + drivers/vfio/pci/virtio/Kconfig | 15 + drivers/vfio/pci/virtio/Makefile | 4 + drivers/vfio/pci/virtio/cmd.c | 4 +- drivers/vfio/pci/virtio/cmd.h | 8 + drivers/vfio/pci/virtio/main.c | 546 +++++++++++++++++++++++++++++++ 8 files changed, 585 insertions(+), 2 deletions(-) create mode 100644 drivers/vfio/pci/virtio/Kconfig create mode 100644 drivers/vfio/pci/virtio/Makefile create mode 100644 drivers/vfio/pci/virtio/main.c diff --git a/MAINTAINERS b/MAINTAINERS index bf0f54c24f81..5098418c8389 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -22624,6 +22624,12 @@ L: kvm at vger.kernel.org S: Maintained F: drivers/vfio/pci/mlx5/ +VFIO VIRTIO PCI DRIVER +M: Yishai Hadas <yishaih at nvidia.com> +L: kvm at vger.kernel.org +S: Maintained +F: drivers/vfio/pci/virtio + VFIO PCI DEVICE SPECIFIC DRIVERS R: Jason Gunthorpe <jgg at nvidia.com> R: Yishai Hadas <yishaih at nvidia.com> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig index 8125e5f37832..18c397df566d 100644 --- a/drivers/vfio/pci/Kconfig +++ b/drivers/vfio/pci/Kconfig @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig" source "drivers/vfio/pci/pds/Kconfig" +source "drivers/vfio/pci/virtio/Kconfig" + endmenu diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile index 45167be462d8..046139a4eca5 100644 --- a/drivers/vfio/pci/Makefile +++ b/drivers/vfio/pci/Makefile @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/ obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/ obj-$(CONFIG_PDS_VFIO_PCI) += pds/ + +obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/ diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig new file mode 100644 index 000000000000..89eddce8b1bd --- /dev/null +++ b/drivers/vfio/pci/virtio/Kconfig @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: GPL-2.0-only +config VIRTIO_VFIO_PCI + tristate "VFIO support for VIRTIO PCI devices" + depends on VIRTIO_PCI + select VFIO_PCI_CORE + help + This provides support for exposing VIRTIO VF devices using the VFIO + framework that can work with a legacy virtio driver in the guest. + Based on PCIe spec, VFs do not support I/O Space; thus, VF BARs shall + not indicate I/O Space. + As of that this driver emulated I/O BAR in software to let a VF be + seen as a transitional device in the guest and let it work with + a legacy driver. + + If you don't know what to do here, say N. 
diff --git a/drivers/vfio/pci/virtio/Makefile b/drivers/vfio/pci/virtio/Makefile new file mode 100644 index 000000000000..584372648a03 --- /dev/null +++ b/drivers/vfio/pci/virtio/Makefile @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio-vfio-pci.o +virtio-vfio-pci-y := main.o cmd.o + diff --git a/drivers/vfio/pci/virtio/cmd.c b/drivers/vfio/pci/virtio/cmd.c index f068239cdbb0..aea9d25fbf1d 100644 --- a/drivers/vfio/pci/virtio/cmd.c +++ b/drivers/vfio/pci/virtio/cmd.c @@ -44,7 +44,7 @@ int virtiovf_cmd_lr_write(struct virtiovf_pci_core_device *virtvdev, u16 opcode, { struct virtio_device *virtio_dev virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); - struct virtio_admin_cmd_data_lr_write *in; + struct virtio_admin_cmd_legacy_wr_data *in; struct scatterlist in_sg; struct virtio_admin_cmd cmd = {}; int ret; @@ -74,7 +74,7 @@ int virtiovf_cmd_lr_read(struct virtiovf_pci_core_device *virtvdev, u16 opcode, { struct virtio_device *virtio_dev virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); - struct virtio_admin_cmd_data_lr_read *in; + struct virtio_admin_cmd_legacy_rd_data *in; struct scatterlist in_sg, out_sg; struct virtio_admin_cmd cmd = {}; int ret; diff --git a/drivers/vfio/pci/virtio/cmd.h b/drivers/vfio/pci/virtio/cmd.h index c2a3645f4b90..347b1dc85570 100644 --- a/drivers/vfio/pci/virtio/cmd.h +++ b/drivers/vfio/pci/virtio/cmd.h @@ -13,7 +13,15 @@ struct virtiovf_pci_core_device { struct vfio_pci_core_device core_device; + u8 bar0_virtual_buf_size; + u8 *bar0_virtual_buf; + /* synchronize access to the virtual buf */ + struct mutex bar_mutex; int vf_id; + void __iomem *notify_addr; + u32 notify_offset; + u8 notify_bar; + u8 pci_cmd_io :1; }; int virtiovf_cmd_list_query(struct pci_dev *pdev, u8 *buf, int buf_size); diff --git a/drivers/vfio/pci/virtio/main.c b/drivers/vfio/pci/virtio/main.c new file mode 100644 index 000000000000..2486991c49f3 --- /dev/null +++ b/drivers/vfio/pci/virtio/main.c @@ -0,0 +1,546 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#include <linux/device.h> +#include <linux/module.h> +#include <linux/mutex.h> +#include <linux/pci.h> +#include <linux/pm_runtime.h> +#include <linux/types.h> +#include <linux/uaccess.h> +#include <linux/vfio.h> +#include <linux/vfio_pci_core.h> +#include <linux/virtio_pci.h> +#include <linux/virtio_net.h> +#include <linux/virtio_pci_modern.h> + +#include "cmd.h" + +#define VIRTIO_LEGACY_IO_BAR_HEADER_LEN 20 +#define VIRTIO_LEGACY_IO_BAR_MSIX_HEADER_LEN 4 + +static int virtiovf_issue_lr_cmd(struct virtiovf_pci_core_device *virtvdev, + loff_t pos, char __user *buf, + size_t count, bool read) +{ + u8 *bar0_buf = virtvdev->bar0_virtual_buf; + u16 opcode; + int ret; + + mutex_lock(&virtvdev->bar_mutex); + if (read) { + opcode = (pos < VIRTIO_PCI_CONFIG_OFF(true)) ? + VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ : + VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ; + ret = virtiovf_cmd_lr_read(virtvdev, opcode, pos, + count, bar0_buf + pos); + if (ret) + goto out; + if (copy_to_user(buf, bar0_buf + pos, count)) + ret = -EFAULT; + goto out; + } + + if (copy_from_user(bar0_buf + pos, buf, count)) { + ret = -EFAULT; + goto out; + } + + opcode = (pos < VIRTIO_PCI_CONFIG_OFF(true)) ? 
+ VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE : + VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE; + ret = virtiovf_cmd_lr_write(virtvdev, opcode, pos, count, + bar0_buf + pos); +out: + mutex_unlock(&virtvdev->bar_mutex); + return ret; +} + +static int +translate_io_bar_to_mem_bar(struct virtiovf_pci_core_device *virtvdev, + loff_t pos, char __user *buf, + size_t count, bool read) +{ + struct vfio_pci_core_device *core_device = &virtvdev->core_device; + u16 queue_notify; + int ret; + + if (pos + count > virtvdev->bar0_virtual_buf_size) + return -EINVAL; + + switch (pos) { + case VIRTIO_PCI_QUEUE_NOTIFY: + if (count != sizeof(queue_notify)) + return -EINVAL; + if (read) { + ret = vfio_pci_ioread16(core_device, true, &queue_notify, + virtvdev->notify_addr); + if (ret) + return ret; + if (copy_to_user(buf, &queue_notify, + sizeof(queue_notify))) + return -EFAULT; + break; + } + + if (copy_from_user(&queue_notify, buf, count)) + return -EFAULT; + + ret = vfio_pci_iowrite16(core_device, true, queue_notify, + virtvdev->notify_addr); + break; + default: + ret = virtiovf_issue_lr_cmd(virtvdev, pos, buf, count, read); + } + + return ret ? ret : count; +} + +static bool range_contains_range(loff_t range1_start, size_t count1, + loff_t range2_start, size_t count2, + loff_t *start_offset) +{ + if (range1_start <= range2_start && + range1_start + count1 >= range2_start + count2) { + *start_offset = range2_start - range1_start; + return true; + } + return false; +} + +static ssize_t virtiovf_pci_read_config(struct vfio_device *core_vdev, + char __user *buf, size_t count, + loff_t *ppos) +{ + struct virtiovf_pci_core_device *virtvdev = container_of( + core_vdev, struct virtiovf_pci_core_device, core_device.vdev); + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK; + loff_t copy_offset; + __le32 val32; + __le16 val16; + u8 val8; + int ret; + + ret = vfio_pci_core_read(core_vdev, buf, count, ppos); + if (ret < 0) + return ret; + + if (range_contains_range(pos, count, PCI_DEVICE_ID, sizeof(val16), + ©_offset)) { + val16 = cpu_to_le16(0x1000); + if (copy_to_user(buf + copy_offset, &val16, sizeof(val16))) + return -EFAULT; + } + + if (virtvdev->pci_cmd_io && + range_contains_range(pos, count, PCI_COMMAND, sizeof(val16), + ©_offset)) { + if (copy_from_user(&val16, buf, sizeof(val16))) + return -EFAULT; + val16 |= cpu_to_le16(PCI_COMMAND_IO); + if (copy_to_user(buf + copy_offset, &val16, sizeof(val16))) + return -EFAULT; + } + + if (range_contains_range(pos, count, PCI_REVISION_ID, sizeof(val8), + ©_offset)) { + /* Transional needs to have revision 0 */ + val8 = 0; + if (copy_to_user(buf + copy_offset, &val8, sizeof(val8))) + return -EFAULT; + } + + if (range_contains_range(pos, count, PCI_BASE_ADDRESS_0, sizeof(val32), + ©_offset)) { + val32 = cpu_to_le32(PCI_BASE_ADDRESS_SPACE_IO); + if (copy_to_user(buf + copy_offset, &val32, sizeof(val32))) + return -EFAULT; + } + + if (range_contains_range(pos, count, PCI_SUBSYSTEM_ID, sizeof(val16), + ©_offset)) { + /* Transitional devices use the PCI subsystem device id as + * virtio device id, same as legacy driver always did. 
+		 */
+		val16 = cpu_to_le16(VIRTIO_ID_NET);
+		if (copy_to_user(buf + copy_offset, &val16, sizeof(val16)))
+			return -EFAULT;
+	}
+
+	return count;
+}
+
+static ssize_t
+virtiovf_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
+		       size_t count, loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	struct pci_dev *pdev = virtvdev->core_device.pdev;
+	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+	int ret;
+
+	if (!count)
+		return 0;
+
+	if (index == VFIO_PCI_CONFIG_REGION_INDEX)
+		return virtiovf_pci_read_config(core_vdev, buf, count, ppos);
+
+	if (index != VFIO_PCI_BAR0_REGION_INDEX)
+		return vfio_pci_core_read(core_vdev, buf, count, ppos);
+
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret) {
+		pci_info_ratelimited(pdev, "runtime resume failed %d\n",
+				     ret);
+		return -EIO;
+	}
+
+	ret = translate_io_bar_to_mem_bar(virtvdev, pos, buf, count, true);
+	pm_runtime_put(&pdev->dev);
+	return ret;
+}
+
+static ssize_t
+virtiovf_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
+			size_t count, loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	struct pci_dev *pdev = virtvdev->core_device.pdev;
+	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+	int ret;
+
+	if (!count)
+		return 0;
+
+	if (index == VFIO_PCI_CONFIG_REGION_INDEX) {
+		loff_t copy_offset;
+		u16 cmd;
+
+		if (range_contains_range(pos, count, PCI_COMMAND, sizeof(cmd),
+					 &copy_offset)) {
+			if (copy_from_user(&cmd, buf + copy_offset, sizeof(cmd)))
+				return -EFAULT;
+			virtvdev->pci_cmd_io = (cmd & PCI_COMMAND_IO);
+		}
+	}
+
+	if (index != VFIO_PCI_BAR0_REGION_INDEX)
+		return vfio_pci_core_write(core_vdev, buf, count, ppos);
+
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret) {
+		pci_info_ratelimited(pdev, "runtime resume failed %d\n", ret);
+		return -EIO;
+	}
+
+	ret = translate_io_bar_to_mem_bar(virtvdev, pos, (char __user *)buf, count, false);
+	pm_runtime_put(&pdev->dev);
+	return ret;
+}
+
+static int
+virtiovf_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
+				   unsigned int cmd, unsigned long arg)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	unsigned long minsz = offsetofend(struct vfio_region_info, offset);
+	void __user *uarg = (void __user *)arg;
+	struct vfio_region_info info = {};
+
+	if (copy_from_user(&info, uarg, minsz))
+		return -EFAULT;
+
+	if (info.argsz < minsz)
+		return -EINVAL;
+
+	switch (info.index) {
+	case VFIO_PCI_BAR0_REGION_INDEX:
+		info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+		info.size = virtvdev->bar0_virtual_buf_size;
+		info.flags = VFIO_REGION_INFO_FLAG_READ |
+			     VFIO_REGION_INFO_FLAG_WRITE;
+		return copy_to_user(uarg, &info, minsz) ? -EFAULT : 0;
+	default:
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+	}
+}
+
+static long
+virtiovf_vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
+			     unsigned long arg)
+{
+	switch (cmd) {
+	case VFIO_DEVICE_GET_REGION_INFO:
+		return virtiovf_pci_ioctl_get_region_info(core_vdev, cmd, arg);
+	default:
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+	}
+}
+
+static int
+virtiovf_set_notify_addr(struct virtiovf_pci_core_device *virtvdev)
+{
+	struct vfio_pci_core_device *core_device = &virtvdev->core_device;
+	int ret;
+
+	/* Set up the BAR where the 'notify' exists to be used by vfio as well.
+	 * This will let us mmap it only once and use it when needed.
+	 */
+	ret = vfio_pci_core_setup_barmap(core_device,
+					 virtvdev->notify_bar);
+	if (ret)
+		return ret;
+
+	virtvdev->notify_addr = core_device->barmap[virtvdev->notify_bar] +
+			virtvdev->notify_offset;
+	return 0;
+}
+
+static int virtiovf_pci_open_device(struct vfio_device *core_vdev)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	struct vfio_pci_core_device *vdev = &virtvdev->core_device;
+	int ret;
+
+	ret = vfio_pci_core_enable(vdev);
+	if (ret)
+		return ret;
+
+	if (virtvdev->bar0_virtual_buf) {
+		/* Upon close_device(), vfio_pci_core_disable() is called and
+		 * closes all the previous mmaps, so the valid life cycle for
+		 * the 'notify' addr is per open/close.
+		 */
+		ret = virtiovf_set_notify_addr(virtvdev);
+		if (ret) {
+			vfio_pci_core_disable(vdev);
+			return ret;
+		}
+	}
+
+	vfio_pci_core_finish_enable(vdev);
+	return 0;
+}
+
+static void virtiovf_pci_close_device(struct vfio_device *core_vdev)
+{
+	vfio_pci_core_close_device(core_vdev);
+}
+
+static int virtiovf_get_device_config_size(unsigned short device)
+{
+	switch (device) {
+	case 0x1041:
+		/* network card */
+		return offsetofend(struct virtio_net_config, status);
+	default:
+		return 0;
+	}
+}
+
+static int virtiovf_read_notify_info(struct virtiovf_pci_core_device *virtvdev)
+{
+	u64 offset;
+	int ret;
+	u8 bar;
+
+	ret = virtiovf_cmd_lq_read_notify(virtvdev,
+				VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_OWNER_MEM,
+				&bar, &offset);
+	if (ret)
+		return ret;
+
+	virtvdev->notify_bar = bar;
+	virtvdev->notify_offset = offset;
+	return 0;
+}
+
+static int virtiovf_pci_init_device(struct vfio_device *core_vdev)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	struct pci_dev *pdev;
+	int ret;
+
+	ret = vfio_pci_core_init_dev(core_vdev);
+	if (ret)
+		return ret;
+
+	pdev = virtvdev->core_device.pdev;
+	virtvdev->vf_id = pci_iov_vf_id(pdev);
+	if (virtvdev->vf_id < 0)
+		return -EINVAL;
+
+	ret = virtiovf_read_notify_info(virtvdev);
+	if (ret)
+		return ret;
+
+	virtvdev->bar0_virtual_buf_size = VIRTIO_LEGACY_IO_BAR_HEADER_LEN +
+		VIRTIO_LEGACY_IO_BAR_MSIX_HEADER_LEN +
+		virtiovf_get_device_config_size(pdev->device);
+	virtvdev->bar0_virtual_buf = kzalloc(virtvdev->bar0_virtual_buf_size,
+					     GFP_KERNEL);
+	if (!virtvdev->bar0_virtual_buf)
+		return -ENOMEM;
+	mutex_init(&virtvdev->bar_mutex);
+	return 0;
+}
+
+static void virtiovf_pci_core_release_dev(struct vfio_device *core_vdev)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+
+	kfree(virtvdev->bar0_virtual_buf);
+	vfio_pci_core_release_dev(core_vdev);
+}
+
+static const struct vfio_device_ops virtiovf_acc_vfio_pci_tran_ops = {
+	.name = "virtio-transitional-vfio-pci",
+	.init = virtiovf_pci_init_device,
+	.release = virtiovf_pci_core_release_dev,
+	.open_device = virtiovf_pci_open_device,
+	.close_device = virtiovf_pci_close_device,
+	.ioctl = virtiovf_vfio_pci_core_ioctl,
+	.read = virtiovf_pci_core_read,
+	.write = virtiovf_pci_core_write,
+	.mmap = vfio_pci_core_mmap,
+	.request = vfio_pci_core_request,
+	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+};
+
+static const struct vfio_device_ops virtiovf_acc_vfio_pci_ops = {
+	.name = "virtio-acc-vfio-pci",
+	.init = vfio_pci_core_init_dev,
+	.release = vfio_pci_core_release_dev,
+	.open_device = virtiovf_pci_open_device,
+	.close_device = virtiovf_pci_close_device,
+	.ioctl = vfio_pci_core_ioctl,
+	.device_feature = vfio_pci_core_ioctl_feature,
+	.read = vfio_pci_core_read,
+	.write = vfio_pci_core_write,
+	.mmap = vfio_pci_core_mmap,
+	.request = vfio_pci_core_request,
+	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+};
+
+static bool virtiovf_bar0_exists(struct pci_dev *pdev)
+{
+	struct resource *res = pdev->resource;
+
+	return res->flags ? true : false;
+}
+
+#define VIRTIOVF_USE_ADMIN_CMD_BITMAP \
+	(BIT_ULL(VIRTIO_ADMIN_CMD_LIST_QUERY) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LIST_USE) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ) | \
+	 BIT_ULL(VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO))
+
+static bool virtiovf_support_legacy_access(struct pci_dev *pdev)
+{
+	int buf_size = DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CMD_OPCODE, 64) * 8;
+	u8 *buf;
+	int ret;
+
+	/* Only virtio-net is supported/tested so far */
+	if (pdev->device != 0x1041)
+		return false;
+
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf)
+		return false;
+
+	ret = virtiovf_cmd_list_query(pdev, buf, buf_size);
+	if (ret)
+		goto end;
+
+	if ((le64_to_cpup((__le64 *)buf) & VIRTIOVF_USE_ADMIN_CMD_BITMAP) !=
+	    VIRTIOVF_USE_ADMIN_CMD_BITMAP) {
+		ret = -EOPNOTSUPP;
+		goto end;
+	}
+
+	/* confirm the used commands */
+	memset(buf, 0, buf_size);
+	*(__le64 *)buf = cpu_to_le64(VIRTIOVF_USE_ADMIN_CMD_BITMAP);
+	ret = virtiovf_cmd_list_use(pdev, buf, buf_size);
+
+end:
+	kfree(buf);
+	return ret ? false : true;
+}
+
+static int virtiovf_pci_probe(struct pci_dev *pdev,
+			      const struct pci_device_id *id)
+{
+	const struct vfio_device_ops *ops = &virtiovf_acc_vfio_pci_ops;
+	struct virtiovf_pci_core_device *virtvdev;
+	int ret;
+
+	if (pdev->is_virtfn && virtiovf_support_legacy_access(pdev) &&
+	    !virtiovf_bar0_exists(pdev) && pdev->msix_cap)
+		ops = &virtiovf_acc_vfio_pci_tran_ops;
+
+	virtvdev = vfio_alloc_device(virtiovf_pci_core_device, core_device.vdev,
+				     &pdev->dev, ops);
+	if (IS_ERR(virtvdev))
+		return PTR_ERR(virtvdev);
+
+	dev_set_drvdata(&pdev->dev, &virtvdev->core_device);
+	ret = vfio_pci_core_register_device(&virtvdev->core_device);
+	if (ret)
+		goto out;
+	return 0;
+out:
+	vfio_put_device(&virtvdev->core_device.vdev);
+	return ret;
+}
+
+static void virtiovf_pci_remove(struct pci_dev *pdev)
+{
+	struct virtiovf_pci_core_device *virtvdev = dev_get_drvdata(&pdev->dev);
+
+	vfio_pci_core_unregister_device(&virtvdev->core_device);
+	vfio_put_device(&virtvdev->core_device.vdev);
+}
+
+static const struct pci_device_id virtiovf_pci_table[] = {
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
+	{}
+};
+
+MODULE_DEVICE_TABLE(pci, virtiovf_pci_table);
+
+static struct pci_driver virtiovf_pci_driver = {
+	.name = KBUILD_MODNAME,
+	.id_table = virtiovf_pci_table,
+	.probe = virtiovf_pci_probe,
+	.remove = virtiovf_pci_remove,
+	.err_handler = &vfio_pci_core_err_handlers,
+	.driver_managed_dma = true,
+};
+
+module_pci_driver(virtiovf_pci_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Yishai Hadas <yishaih@nvidia.com>");
+MODULE_DESCRIPTION(
+	"VIRTIO VFIO PCI - User Level meta-driver for VIRTIO device family");
-- 
2.27.0