Asias He
2012-May-04  08:39 UTC
[PATCH v3] virtio-blk: Fix hot-unplug race in remove method
If we reset the virtio-blk device before the requests already dispatched
to the virtio-blk driver from the block layer are finished, we will get
stuck in blk_cleanup_queue() and the remove will fail.
blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
before DEAD marking. However, it will never succeed if the device is
already stopped. We'll have q->in_flight[] > 0, so the drain will not
finish.
How to reproduce the race:
1. hot-plug a virtio-blk device
2. keep reading/writing the device in guest
3. hot-unplug while the device is busy serving I/O
Test:
~1000 rounds of hot-plug/hot-unplug test passed with this patch.
Changes in v3:
- Drop blk_abort_queue and blk_abort_request
- Use __blk_end_request_all to complete request dispatched to driver
Changes in v2:
- Drop req_in_flight
- Use virtqueue_detach_unused_buf to get request dispatched to driver
Signed-off-by: Asias He <asias at redhat.com>
---
 drivers/block/virtio_blk.c |   11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 72fe55d..693187d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -576,6 +576,8 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
 {
 	struct virtio_blk *vblk = vdev->priv;
 	int index = vblk->index;
+	struct virtblk_req *vbr;
+	unsigned long flags;
 
 	/* Prevent config work handler from accessing the device. */
 	mutex_lock(&vblk->config_lock);
@@ -588,6 +590,15 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
 	flush_work(&vblk->config_work);
 
 	del_gendisk(vblk->disk);
+
+	/* Abort requests dispatched to driver. */
+	spin_lock_irqsave(&vblk->lock, flags);
+	while ((vbr = virtqueue_detach_unused_buf(vblk->vq))) {
+		__blk_end_request_all(vbr->req, -EIO);
+		mempool_free(vbr, vblk->pool);
+	}
+	spin_unlock_irqrestore(&vblk->lock, flags);
+
 	blk_cleanup_queue(vblk->disk->queue);
 	put_disk(vblk->disk);
 	mempool_destroy(vblk->pool);
-- 
1.7.10.1
Rusty Russell
2012-May-07  05:01 UTC
[PATCH v3] virtio-blk: Fix hot-unplug race in remove method
On Fri, 4 May 2012 16:39:49 +0800, Asias He <asias at redhat.com> wrote:
> If we reset the virtio-blk device before the requests already dispatched
> to the virtio-blk driver from the block layer are finished, we will get
> stuck in blk_cleanup_queue() and the remove will fail.
>
> blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
> before DEAD marking. However, it will never succeed if the device is
> already stopped. We'll have q->in_flight[] > 0, so the drain will not
> finish.
>
> How to reproduce the race:
> 1. hot-plug a virtio-blk device
> 2. keep reading/writing the device in guest
> 3. hot-unplug while the device is busy serving I/O
>
> Test:
> ~1000 rounds of hot-plug/hot-unplug test passed with this patch.
>
> Changes in v3:
> - Drop blk_abort_queue and blk_abort_request
> - Use __blk_end_request_all to complete request dispatched to driver
>
> Changes in v2:
> - Drop req_in_flight
> - Use virtqueue_detach_unused_buf to get request dispatched to driver
>
> Signed-off-by: Asias He <asias at redhat.com>

Thanks, applied.

Cheers,
Rusty.