Hi All:
This is a rework on the IRQ hardening for virtio which is done
previously by the following commits are reverted:
9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
with the assumption of the affinity managed IRQ that is used by some
virtio drivers. And what's more, it is only done for virtio-pci but
not other transports.
In this rework, I try to implement a general virtio solution which
borrows the idea of the INTX hardening by re-using per virtqueue
boolean vq->broken and toggle it in virtio_device_ready() and
virtio_reset_device(). Then we can simply reuse the existing checks in
the vring_interrupt() and return early if the driver is not ready.
Note that, I only did compile test on ccw and MMIO transport.
Please review.
Changes since V5:
- Various tweaks on the comments
Changes since V4:
- use spin_lock_irq()/spin_unlock_irq() to synchronize with
  vring_interrupt() for ccw
- use spin_lock()/spin_unlock() to protect vring_interrupt() for non
  airq
- add comment to explain the ordering implications of set_status() for
  PCI, ccw and MMIO
- various tweaks on the comments and changelogs
Changes since V3:
- Rename synchornize_vqs() to synchronize_cbs()
- tweak the comment for synchronize_cbs()
- switch to use a dedicated helper __virtio_unbreak_device() and
  document it should be only used for probing
- switch to use rwlock to synchornize the non airq for ccw
Changes since V2:
- add ccw and MMIO support
- rename synchronize_vqs() to synchronize_cbs()
- switch to re-use vq->broken instead of introducing new device
  attributes for the future virtqueue reset support
  - remove unnecssary READ_ONCE()/WRITE_ONCE()
  - a new patch to remove device triggerable BUG_ON()
  - more tweaks on the comments
Changes since v1:
- Use transport specific irq synchronization method when possible
- Drop the module parameter and enable the hardening unconditonally
- Tweak the barrier/ordering facilities used in the code
- Reanme irq_soft_enabled to driver_ready
- Avoid unnecssary IRQ synchornization (e.g during boot)
Jason Wang (8):
  virtio: use virtio_reset_device() when possible
  virtio: introduce config op to synchronize vring callbacks
  virtio-pci: implement synchronize_cbs()
  virtio-mmio: implement synchronize_cbs()
  virtio-ccw: implement synchronize_cbs()
  virtio: allow to unbreak virtqueue
  virtio: harden vring IRQ
  virtio: use WARN_ON() to warning illegal status value
Stefano Garzarella (1):
  virtio: use virtio_device_ready() in virtio_device_restore()
 drivers/s390/virtio/virtio_ccw.c       | 34 +++++++++++++++++++
 drivers/virtio/virtio.c                | 24 +++++++++----
 drivers/virtio/virtio_mmio.c           | 13 +++++++
 drivers/virtio/virtio_pci_legacy.c     |  1 +
 drivers/virtio/virtio_pci_modern.c     |  2 ++
 drivers/virtio/virtio_pci_modern_dev.c |  5 +++
 drivers/virtio/virtio_ring.c           | 33 +++++++++++++++---
 include/linux/virtio.h                 |  1 +
 include/linux/virtio_config.h          | 47 +++++++++++++++++++++++++-
 9 files changed, 148 insertions(+), 12 deletions(-)
-- 
2.25.1
Jason Wang
2022-May-27  06:01 UTC
[PATCH V6 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
From: Stefano Garzarella <sgarzare at redhat.com> It will allow us to do extension on virtio_device_ready() without duplicating code. Cc: Thomas Gleixner <tglx at linutronix.de> Cc: Peter Zijlstra <peterz at infradead.org> Cc: "Paul E. McKenney" <paulmck at kernel.org> Cc: Marc Zyngier <maz at kernel.org> Cc: Halil Pasic <pasic at linux.ibm.com> Cc: Cornelia Huck <cohuck at redhat.com> Cc: Vineeth Vijayan <vneethv at linux.ibm.com> Cc: Peter Oberparleiter <oberpar at linux.ibm.com> Cc: linux-s390 at vger.kernel.org Reviewed-by: Cornelia Huck <cohuck at redhat.com> Signed-off-by: Stefano Garzarella <sgarzare at redhat.com> Signed-off-by: Jason Wang <jasowang at redhat.com> --- drivers/virtio/virtio.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index ce424c16997d..938e975029d4 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -526,8 +526,9 @@ int virtio_device_restore(struct virtio_device *dev) goto err; } - /* Finally, tell the device we're all set */ - virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); + /* If restore didn't do it, mark device DRIVER_OK ourselves. */ + if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK)) + virtio_device_ready(dev); virtio_config_enable(dev); -- 2.25.1
Jason Wang
2022-May-27  06:01 UTC
[PATCH V6 2/9] virtio: use virtio_reset_device() when possible
This allows us to do common extension without duplicating code. Cc: Thomas Gleixner <tglx at linutronix.de> Cc: Peter Zijlstra <peterz at infradead.org> Cc: "Paul E. McKenney" <paulmck at kernel.org> Cc: Marc Zyngier <maz at kernel.org> Cc: Halil Pasic <pasic at linux.ibm.com> Cc: Cornelia Huck <cohuck at redhat.com> Cc: Vineeth Vijayan <vneethv at linux.ibm.com> Cc: Peter Oberparleiter <oberpar at linux.ibm.com> Cc: linux-s390 at vger.kernel.org Reviewed-by: Cornelia Huck <cohuck at redhat.com> Signed-off-by: Jason Wang <jasowang at redhat.com> --- drivers/virtio/virtio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 938e975029d4..aa1eb5132767 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -430,7 +430,7 @@ int register_virtio_device(struct virtio_device *dev) /* We always start by resetting the device, in case a previous * driver messed it up. This also tests that code path a little. */ - dev->config->reset(dev); + virtio_reset_device(dev); /* Acknowledge that we've seen the device. */ virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); @@ -496,7 +496,7 @@ int virtio_device_restore(struct virtio_device *dev) /* We always start by resetting the device, in case a previous * driver messed it up. */ - dev->config->reset(dev); + virtio_reset_device(dev); /* Acknowledge that we've seen the device. */ virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); -- 2.25.1
Jason Wang
2022-May-27  06:01 UTC
[PATCH V6 3/9] virtio: introduce config op to synchronize vring callbacks
This patch introduces new virtio config op to vring
callbacks. Transport specific method is required to make sure the
write before this function is visible to the vring_interrupt() that is
called after the return of this function. For the transport that
doesn't provide synchronize_vqs(), use synchornize_rcu() which
synchronize with IRQ implicitly as a fallback.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck at redhat.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index b341dd62aa4d..25be018810a7 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -57,6 +57,11 @@ struct virtio_shm_region {
  *		include a NULL entry for vqs unused by driver
  *	Returns 0 on success or error status
  * @del_vqs: free virtqueues found by find_vqs().
+ * @synchronize_cbs: synchronize with the virtqueue callbacks (optional)
+ *      The function guarantees that all memory operations on the
+ *      queue before it are visible to the vring_interrupt() that is
+ *      called after it.
+ *      vdev: the virtio_device
  * @get_features: get the array of feature bits for this device.
  *	vdev: the virtio_device
  *	Returns the first 64 feature bits (all we currently need).
@@ -89,6 +94,7 @@ struct virtio_config_ops {
 			const char * const names[], const bool *ctx,
 			struct irq_affinity *desc);
 	void (*del_vqs)(struct virtio_device *);
+	void (*synchronize_cbs)(struct virtio_device *);
 	u64 (*get_features)(struct virtio_device *vdev);
 	int (*finalize_features)(struct virtio_device *vdev);
 	const char *(*bus_name)(struct virtio_device *vdev);
@@ -217,6 +223,25 @@ int virtio_find_vqs_ctx(struct virtio_device *vdev,
unsigned nvqs,
 				      desc);
 }
 
+/**
+ * virtio_synchronize_cbs - synchronize with virtqueue callbacks
+ * @vdev: the device
+ */
+static inline
+void virtio_synchronize_cbs(struct virtio_device *dev)
+{
+	if (dev->config->synchronize_cbs) {
+		dev->config->synchronize_cbs(dev);
+	} else {
+		/*
+		 * A best effort fallback to synchronize with
+		 * interrupts, preemption and softirq disabled
+		 * regions. See comment above synchronize_rcu().
+		 */
+		synchronize_rcu();
+	}
+}
+
 /**
  * virtio_device_ready - enable vq use in probe function
  * @vdev: the device
-- 
2.25.1
We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck at redhat.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/virtio/virtio_pci_legacy.c | 1 +
 drivers/virtio/virtio_pci_modern.c | 2 ++
 2 files changed, 3 insertions(+)
diff --git a/drivers/virtio/virtio_pci_legacy.c
b/drivers/virtio/virtio_pci_legacy.c
index 7fe4caa4b519..a5e5721145c7 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -192,6 +192,7 @@ static const struct virtio_config_ops virtio_pci_config_ops
= {
 	.reset		= vp_reset,
 	.find_vqs	= vp_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
diff --git a/drivers/virtio/virtio_pci_modern.c
b/drivers/virtio/virtio_pci_modern.c
index 4acb34409f0b..623906b4996c 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -394,6 +394,7 @@ static const struct virtio_config_ops
virtio_pci_config_nodev_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
@@ -411,6 +412,7 @@ static const struct virtio_config_ops virtio_pci_config_ops
= {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
-- 
2.25.1
Simply synchronize the platform irq that is used by us.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck at redhat.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/virtio/virtio_mmio.c | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 839684d672af..c9699a59f93c 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -345,6 +345,13 @@ static void vm_del_vqs(struct virtio_device *vdev)
 	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
 }
 
+static void vm_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	synchronize_irq(platform_get_irq(vm_dev->pdev, 0));
+}
+
 static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned int
index,
 				  void (*callback)(struct virtqueue *vq),
 				  const char *name, bool ctx)
@@ -541,6 +548,7 @@ static const struct virtio_config_ops virtio_mmio_config_ops
= {
 	.finalize_features = vm_finalize_features,
 	.bus_name	= vm_bus_name,
 	.get_shm_region = vm_get_shm_region,
+	.synchronize_cbs = vm_synchronize_cbs,
 };
 
 
-- 
2.25.1
This patch tries to implement the synchronize_cbs() for ccw. For the
vring_interrupt() that is called via virtio_airq_handler(), the
synchronization is simply done via the airq_info's lock. For the
vring_interrupt() that is called via virtio_ccw_int_handler(), a per
device rwlock is introduced and used in the synchronization method.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Reviewed-by: Halil Pasic <pasic at linux.ibm.com>
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/s390/virtio/virtio_ccw.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..c188e4f20ca3 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -62,6 +62,7 @@ struct virtio_ccw_device {
 	unsigned int revision; /* Transport revision */
 	wait_queue_head_t wait_q;
 	spinlock_t lock;
+	rwlock_t irq_lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
 	bool is_thinint;
@@ -984,6 +985,30 @@ static const char *virtio_ccw_bus_name(struct virtio_device
*vdev)
 	return dev_name(&vcdev->cdev->dev);
 }
 
+static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
+	struct airq_info *info = vcdev->airq_info;
+
+	if (info) {
+		/*
+		 * This device uses adapter interrupts: synchronize with
+		 * vring_interrupt() called by virtio_airq_handler()
+		 * via the indicator area lock.
+		 */
+		write_lock_irq(&info->lock);
+		write_unlock_irq(&info->lock);
+	} else {
+		/* This device uses classic interrupts: synchronize
+		 * with vring_interrupt() called by
+		 * virtio_ccw_int_handler() via the per-device
+		 * irq_lock
+		 */
+		write_lock_irq(&vcdev->irq_lock);
+		write_unlock_irq(&vcdev->irq_lock);
+	}
+}
+
 static const struct virtio_config_ops virtio_ccw_config_ops = {
 	.get_features = virtio_ccw_get_features,
 	.finalize_features = virtio_ccw_finalize_features,
@@ -995,6 +1020,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops
= {
 	.find_vqs = virtio_ccw_find_vqs,
 	.del_vqs = virtio_ccw_del_vqs,
 	.bus_name = virtio_ccw_bus_name,
+	.synchronize_cbs = virtio_ccw_synchronize_cbs,
 };
 
 
@@ -1106,6 +1132,8 @@ static void virtio_ccw_int_handler(struct ccw_device
*cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
+	/* Interrupts are disabled here */
+	read_lock(&vcdev->irq_lock);
 	for_each_set_bit(i, indicators(vcdev),
 			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
@@ -1114,6 +1142,7 @@ static void virtio_ccw_int_handler(struct ccw_device
*cdev,
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
+	read_unlock(&vcdev->irq_lock);
 	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
 		clear_bit(0, indicators2(vcdev));
@@ -1284,6 +1313,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	init_waitqueue_head(&vcdev->wait_q);
 	INIT_LIST_HEAD(&vcdev->virtqueues);
 	spin_lock_init(&vcdev->lock);
+	rwlock_init(&vcdev->irq_lock);
 	mutex_init(&vcdev->io_lock);
 
 	spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
-- 
2.25.1
This patch allows the new introduced __virtio_break_device() to
unbreak the virtqueue.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/virtio/virtio_ring.c | 22 ++++++++++++++++++++++
 include/linux/virtio.h       |  1 +
 2 files changed, 23 insertions(+)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 9d0bae4293be..9c231e1fded7 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2395,6 +2395,28 @@ void virtio_break_device(struct virtio_device *dev)
 }
 EXPORT_SYMBOL_GPL(virtio_break_device);
 
+/*
+ * This should allow the device to be used by the driver. You may
+ * need to grab appropriate locks to flush the write to
+ * vq->broken. This should only be used in some specific case e.g
+ * (probing and restoring). This function should only be called by the
+ * core, not directly by the driver.
+ */
+void __virtio_unbreak_device(struct virtio_device *dev)
+{
+	struct virtqueue *_vq;
+
+	spin_lock(&dev->vqs_list_lock);
+	list_for_each_entry(_vq, &dev->vqs, list) {
+		struct vring_virtqueue *vq = to_vvq(_vq);
+
+		/* Pairs with READ_ONCE() in virtqueue_is_broken(). */
+		WRITE_ONCE(vq->broken, false);
+	}
+	spin_unlock(&dev->vqs_list_lock);
+}
+EXPORT_SYMBOL_GPL(__virtio_unbreak_device);
+
 dma_addr_t virtqueue_get_desc_addr(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5464f398912a..d8fdf170637c 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -131,6 +131,7 @@ void unregister_virtio_device(struct virtio_device *dev);
 bool is_virtio_device(struct device *dev);
 
 void virtio_break_device(struct virtio_device *dev);
+void __virtio_unbreak_device(struct virtio_device *dev);
 
 void virtio_config_changed(struct virtio_device *dev);
 #ifdef CONFIG_PM_SLEEP
-- 
2.25.1
This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:
1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
   that is used by some device such as virtio-blk
2) done only for PCI transport
The vq->broken is re-used in this patch for implementing the IRQ
hardening. The vq->broken is set to true during both initialization
and reset. And the vq->broken is set to false in
virtio_device_ready(). Then vring_interrupt() can check and return
when vq->broken is true. And in this case, switch to return IRQ_NONE
to let the interrupt core aware of such invalid interrupt to prevent
IRQ storm.
The reason of using a per queue variable instead of a per device one
is that we may need it for per queue reset hardening in the future.
Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a024d7
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 drivers/s390/virtio/virtio_ccw.c       |  4 ++++
 drivers/virtio/virtio.c                | 15 ++++++++++++---
 drivers/virtio/virtio_mmio.c           |  5 +++++
 drivers/virtio/virtio_pci_modern_dev.c |  5 +++++
 drivers/virtio/virtio_ring.c           | 11 +++++++----
 include/linux/virtio_config.h          | 20 ++++++++++++++++++++
 6 files changed, 53 insertions(+), 7 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index c188e4f20ca3..97e51c34e6cf 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -971,6 +971,10 @@ static void virtio_ccw_set_status(struct virtio_device
*vdev, u8 status)
 	ccw->flags = 0;
 	ccw->count = sizeof(status);
 	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
+	/* We use ssch for setting the status which is a serializing
+	 * instruction that guarantees the memory writes have
+	 * completed before ssch.
+	 */
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
 	/* Write failed? We assume status is unchanged. */
 	if (ret)
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index aa1eb5132767..95fac4c97c8b 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
  * */
 void virtio_reset_device(struct virtio_device *dev)
 {
+	/*
+	 * The below virtio_synchronize_cbs() guarantees that any
+	 * interrupt for this line arriving after
+	 * virtio_synchronize_vqs() has completed is guaranteed to see
+	 * vq->broken as true.
+	 */
+	virtio_break_device(dev);
+	virtio_synchronize_cbs(dev);
+
 	dev->config->reset(dev);
 }
 EXPORT_SYMBOL_GPL(virtio_reset_device);
@@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
 	dev->config_enabled = false;
 	dev->config_change_pending = false;
 
+	INIT_LIST_HEAD(&dev->vqs);
+	spin_lock_init(&dev->vqs_list_lock);
+
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
 	virtio_reset_device(dev);
@@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
 
-	INIT_LIST_HEAD(&dev->vqs);
-	spin_lock_init(&dev->vqs_list_lock);
-
 	/*
 	 * device_add() causes the bus infrastructure to look for a matching
 	 * driver.
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index c9699a59f93c..f9a36bc7ac27 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -253,6 +253,11 @@ static void vm_set_status(struct virtio_device *vdev, u8
status)
 	/* We should never be setting status to 0. */
 	BUG_ON(status == 0);
 
+	/*
+	 * Per memory-barriers.txt, wmb() is not needed to guarantee
+	 * that the the cache coherent memory writes have completed
+	 * before writing to the MMIO region.
+	 */
 	writel(status, vm_dev->base + VIRTIO_MMIO_STATUS);
 }
 
diff --git a/drivers/virtio/virtio_pci_modern_dev.c
b/drivers/virtio/virtio_pci_modern_dev.c
index 4093f9cca7a6..a0fa14f28a7f 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -467,6 +467,11 @@ void vp_modern_set_status(struct virtio_pci_modern_device
*mdev,
 {
 	struct virtio_pci_common_cfg __iomem *cfg = mdev->common;
 
+	/*
+	 * Per memory-barriers.txt, wmb() is not needed to guarantee
+	 * that the the cache coherent memory writes have completed
+	 * before writing to the MMIO region.
+	 */
 	vp_iowrite8(status, &cfg->device_status);
 }
 EXPORT_SYMBOL_GPL(vp_modern_set_status);
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 9c231e1fded7..13a7348cedff 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1688,7 +1688,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->we_own_ring = true;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
@@ -2134,8 +2134,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 		return IRQ_NONE;
 	}
 
-	if (unlikely(vq->broken))
-		return IRQ_HANDLED;
+	if (unlikely(vq->broken)) {
+		dev_warn_once(&vq->vq.vdev->dev,
+			      "virtio vring IRQ raised before DRIVER_OK");
+		return IRQ_NONE;
+	}
 
 	/* Just a hint for performance: so it's ok that this can be racy! */
 	if (vq->event)
@@ -2177,7 +2180,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int
index,
 	vq->we_own_ring = false;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 25be018810a7..d4edfd7d91bb 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -256,6 +256,26 @@ void virtio_device_ready(struct virtio_device *dev)
 	unsigned status = dev->config->get_status(dev);
 
 	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+
+	/*
+	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
+	 * will see the driver specific setup if it sees vq->broken
+	 * as false (even if the notifications come before DRIVER_OK).
+	 */
+	virtio_synchronize_cbs(dev);
+	__virtio_unbreak_device(dev);
+	/*
+	 * The transport should ensure the visibility of vq->broken
+	 * before setting DRIVER_OK. See the comments for the transport
+	 * specific set_status() method.
+	 *
+	 * A well behaved device will only notify a virtqueue after
+	 * DRIVER_OK, this means the device should "see" the coherenct
+	 * memory write that set vq->broken as false which is done by
+	 * the driver when it sees DRIVER_OK, then the following
+	 * driver's vring_interrupt() will see vq->broken as false so
+	 * we won't lose any notification.
+	 */
 	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
 }
 
-- 
2.25.1
Jason Wang
2022-May-27  06:01 UTC
[PATCH V6 9/9] virtio: use WARN_ON() to warning illegal status value
We used to use BUG_ON() in virtio_device_ready() to detect illegal
status value, this seems sub-optimal since the value is under the
control of the device. Switch to use WARN_ON() instead.
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: "Paul E. McKenney" <paulmck at kernel.org>
Cc: Marc Zyngier <maz at kernel.org>
Cc: Halil Pasic <pasic at linux.ibm.com>
Cc: Cornelia Huck <cohuck at redhat.com>
Cc: Vineeth Vijayan <vneethv at linux.ibm.com>
Cc: Peter Oberparleiter <oberpar at linux.ibm.com>
Cc: linux-s390 at vger.kernel.org
Signed-off-by: Jason Wang <jasowang at redhat.com>
---
 include/linux/virtio_config.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index d4edfd7d91bb..9a36051ceb76 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -255,7 +255,7 @@ void virtio_device_ready(struct virtio_device *dev)
 {
 	unsigned status = dev->config->get_status(dev);
 
-	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+	WARN_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
 
 	/*
 	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
-- 
2.25.1
On Fri, May 27, 2022 at 02:01:19PM +0800, Jason Wang wrote:> This is a rework on the previous IRQ hardening that is done for > virtio-pci where several drawbacks were found and were reverted: > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ > that is used by some device such as virtio-blk > 2) done only for PCI transport > > The vq->broken is re-used in this patch for implementing the IRQ > hardening. The vq->broken is set to true during both initialization > and reset. And the vq->broken is set to false in > virtio_device_ready(). Then vring_interrupt() can check and return > when vq->broken is true. And in this case, switch to return IRQ_NONE > to let the interrupt core aware of such invalid interrupt to prevent > IRQ storm. > > The reason of using a per queue variable instead of a per device one > is that we may need it for per queue reset hardening in the future. > > Note that the hardening is only done for vring interrupt since the > config interrupt hardening is already done in commit 22b7050a024d7 > ("virtio: defer config changed notifications"). But the method that is > used by config interrupt can't be reused by the vring interrupt > handler because it uses spinlock to do the synchronization which is > expensive. > > Cc: Thomas Gleixner <tglx at linutronix.de> > Cc: Peter Zijlstra <peterz at infradead.org> > Cc: "Paul E. McKenney" <paulmck at kernel.org> > Cc: Marc Zyngier <maz at kernel.org> > Cc: Halil Pasic <pasic at linux.ibm.com> > Cc: Cornelia Huck <cohuck at redhat.com> > Cc: Vineeth Vijayan <vneethv at linux.ibm.com> > Cc: Peter Oberparleiter <oberpar at linux.ibm.com> > Cc: linux-s390 at vger.kernel.org > Signed-off-by: Jason Wang <jasowang at redhat.com>Jason, I am really concerned by all the fallout. I propose adding a flag to suppress the hardening - this will be a debugging aid and a work around for users if we find more buggy drivers. suppress_interrupt_hardening ?> --- > drivers/s390/virtio/virtio_ccw.c | 4 ++++ > drivers/virtio/virtio.c | 15 ++++++++++++--- > drivers/virtio/virtio_mmio.c | 5 +++++ > drivers/virtio/virtio_pci_modern_dev.c | 5 +++++ > drivers/virtio/virtio_ring.c | 11 +++++++---- > include/linux/virtio_config.h | 20 ++++++++++++++++++++ > 6 files changed, 53 insertions(+), 7 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index c188e4f20ca3..97e51c34e6cf 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -971,6 +971,10 @@ static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status) > ccw->flags = 0; > ccw->count = sizeof(status); > ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status; > + /* We use ssch for setting the status which is a serializing > + * instruction that guarantees the memory writes have > + * completed before ssch. > + */ > ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS); > /* Write failed? We assume status is unchanged. */ > if (ret) > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c > index aa1eb5132767..95fac4c97c8b 100644 > --- a/drivers/virtio/virtio.c > +++ b/drivers/virtio/virtio.c > @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev) > * */ > void virtio_reset_device(struct virtio_device *dev) > { > + /* > + * The below virtio_synchronize_cbs() guarantees that any > + * interrupt for this line arriving after > + * virtio_synchronize_vqs() has completed is guaranteed to see > + * vq->broken as true. > + */ > + virtio_break_device(dev);So make this conditional> + virtio_synchronize_cbs(dev); > + > dev->config->reset(dev); > } > EXPORT_SYMBOL_GPL(virtio_reset_device); > @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev) > dev->config_enabled = false; > dev->config_change_pending = false; > > + INIT_LIST_HEAD(&dev->vqs); > + spin_lock_init(&dev->vqs_list_lock); > + > /* We always start by resetting the device, in case a previous > * driver messed it up. This also tests that code path a little. */ > virtio_reset_device(dev); > @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev) > /* Acknowledge that we've seen the device. */ > virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); > > - INIT_LIST_HEAD(&dev->vqs); > - spin_lock_init(&dev->vqs_list_lock); > - > /* > * device_add() causes the bus infrastructure to look for a matching > * driver. > diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c > index c9699a59f93c..f9a36bc7ac27 100644 > --- a/drivers/virtio/virtio_mmio.c > +++ b/drivers/virtio/virtio_mmio.c > @@ -253,6 +253,11 @@ static void vm_set_status(struct virtio_device *vdev, u8 status) > /* We should never be setting status to 0. */ > BUG_ON(status == 0); > > + /* > + * Per memory-barriers.txt, wmb() is not needed to guarantee > + * that the the cache coherent memory writes have completed > + * before writing to the MMIO region. > + */ > writel(status, vm_dev->base + VIRTIO_MMIO_STATUS); > } > > diff --git a/drivers/virtio/virtio_pci_modern_dev.c b/drivers/virtio/virtio_pci_modern_dev.c > index 4093f9cca7a6..a0fa14f28a7f 100644 > --- a/drivers/virtio/virtio_pci_modern_dev.c > +++ b/drivers/virtio/virtio_pci_modern_dev.c > @@ -467,6 +467,11 @@ void vp_modern_set_status(struct virtio_pci_modern_device *mdev, > { > struct virtio_pci_common_cfg __iomem *cfg = mdev->common; > > + /* > + * Per memory-barriers.txt, wmb() is not needed to guarantee > + * that the the cache coherent memory writes have completed > + * before writing to the MMIO region. > + */ > vp_iowrite8(status, &cfg->device_status); > } > EXPORT_SYMBOL_GPL(vp_modern_set_status); > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 9c231e1fded7..13a7348cedff 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -1688,7 +1688,7 @@ static struct virtqueue *vring_create_virtqueue_packed( > vq->we_own_ring = true; > vq->notify = notify; > vq->weak_barriers = weak_barriers; > - vq->broken = false; > + vq->broken = true; > vq->last_used_idx = 0; > vq->event_triggered = false; > vq->num_added = 0;and make this conditional> @@ -2134,8 +2134,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq) > return IRQ_NONE; > } > > - if (unlikely(vq->broken)) > - return IRQ_HANDLED; > + if (unlikely(vq->broken)) { > + dev_warn_once(&vq->vq.vdev->dev, > + "virtio vring IRQ raised before DRIVER_OK"); > + return IRQ_NONE; > + } > > /* Just a hint for performance: so it's ok that this can be racy! */ > if (vq->event) > @@ -2177,7 +2180,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index, > vq->we_own_ring = false; > vq->notify = notify; > vq->weak_barriers = weak_barriers; > - vq->broken = false; > + vq->broken = true; > vq->last_used_idx = 0; > vq->event_triggered = false; > vq->num_added = 0;and make this conditional> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h > index 25be018810a7..d4edfd7d91bb 100644 > --- a/include/linux/virtio_config.h > +++ b/include/linux/virtio_config.h > @@ -256,6 +256,26 @@ void virtio_device_ready(struct virtio_device *dev) > unsigned status = dev->config->get_status(dev); > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK); > + > + /* > + * The virtio_synchronize_cbs() makes sure vring_interrupt() > + * will see the driver specific setup if it sees vq->broken > + * as false (even if the notifications come before DRIVER_OK). > + */ > + virtio_synchronize_cbs(dev); > + __virtio_unbreak_device(dev); > + /* > + * The transport should ensure the visibility of vq->broken > + * before setting DRIVER_OK. See the comments for the transport > + * specific set_status() method. > + * > + * A well behaved device will only notify a virtqueue after > + * DRIVER_OK, this means the device should "see" the coherenct > + * memory write that set vq->broken as false which is done by > + * the driver when it sees DRIVER_OK, then the following > + * driver's vring_interrupt() will see vq->broken as false so > + * we won't lose any notification. > + */ > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK); > } > > -- > 2.25.1