Replace virtqueue_kick by virtqueue_kick_prepare, which requires serialization, and virtqueue_notify, which does not. Repurpose the return values to indicate whether the vq should be notified. This fixes a lock contention with qemu host. When the guest calls vibad rtqueue_notify, the qemu vcpu thread exits the guest and waits for the qemu iothread to perform the MMIO. If the qemu iothread is still processing the prior buffer, and if the prior buffer is cheap to GPU, the iothread will go ahead and generate an IRQ for the guest. A worker thread in the guest will call virtio_gpu_dequeue_ctrl_func. If virtqueue_notify was called with the vq lock held, the worker thread would busy wait inside virtio_gpu_dequeue_ctrl_func. Signed-off-by: Chia-I Wu <olvaffe at gmail.com> --- drivers/gpu/drm/virtio/virtgpu_vq.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 6c1a90717535..e96f88fe5c83 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -291,11 +291,9 @@ static int virtio_gpu_queue_ctrl_buffer_locked(struct virtio_gpu_device *vgdev, trace_virtio_gpu_cmd_queue(vq, (struct virtio_gpu_ctrl_hdr *)vbuf->buf); - virtqueue_kick(vq); + ret = virtqueue_kick_prepare(vq); } - if (!ret) - ret = vq->num_free; return ret; } @@ -307,6 +305,10 @@ static int virtio_gpu_queue_ctrl_buffer(struct virtio_gpu_device *vgdev, spin_lock(&vgdev->ctrlq.qlock); rc = virtio_gpu_queue_ctrl_buffer_locked(vgdev, vbuf); spin_unlock(&vgdev->ctrlq.qlock); + + if (rc > 0) + virtqueue_notify(vgdev->ctrlq.vq); + return rc; } @@ -339,6 +341,10 @@ static int virtio_gpu_queue_fenced_ctrl_buffer(struct virtio_gpu_device *vgdev, virtio_gpu_fence_emit(vgdev, hdr, fence); rc = virtio_gpu_queue_ctrl_buffer_locked(vgdev, vbuf); spin_unlock(&vgdev->ctrlq.qlock); + + if (rc > 0) + virtqueue_notify(vgdev->ctrlq.vq); + return rc; } @@ -369,13 +375,14 @@ static int virtio_gpu_queue_cursor(struct virtio_gpu_device *vgdev, trace_virtio_gpu_cmd_queue(vq, (struct virtio_gpu_ctrl_hdr *)vbuf->buf); - virtqueue_kick(vq); + ret = virtqueue_kick_prepare(vq); } spin_unlock(&vgdev->cursorq.qlock); - if (!ret) - ret = vq->num_free; + if (ret > 0) + virtqueue_notify(vq); + return ret; } -- 2.22.0.410.gd8fdbe21b5-goog
Replace virtqueue_kick by virtqueue_kick_prepare, which requires serialization, and virtqueue_notify, which does not. Repurpose the return values to indicate whether the vq should be notified. This fixes a bad spinlock contention when the host is qemu. When the guest calls virtqueue_notify, the qemu vcpu thread exits the guest and waits for the qemu iothread to perform the MMIO. If the qemu iothread is still processing the prior buffer, and if the prior buffer is cheap to GPU, the iothread will go ahead and generate an IRQ. A worker thread in the guest might start running virtio_gpu_dequeue_ctrl_func. If virtqueue_notify was called with the vq lock held, the worker thread would have to busy wait inside virtio_gpu_dequeue_ctrl_func. v2: fix scrambled commit message Signed-off-by: Chia-I Wu <olvaffe at gmail.com> --- drivers/gpu/drm/virtio/virtgpu_vq.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 6c1a90717535..e96f88fe5c83 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -291,11 +291,9 @@ static int virtio_gpu_queue_ctrl_buffer_locked(struct virtio_gpu_device *vgdev, trace_virtio_gpu_cmd_queue(vq, (struct virtio_gpu_ctrl_hdr *)vbuf->buf); - virtqueue_kick(vq); + ret = virtqueue_kick_prepare(vq); } - if (!ret) - ret = vq->num_free; return ret; } @@ -307,6 +305,10 @@ static int virtio_gpu_queue_ctrl_buffer(struct virtio_gpu_device *vgdev, spin_lock(&vgdev->ctrlq.qlock); rc = virtio_gpu_queue_ctrl_buffer_locked(vgdev, vbuf); spin_unlock(&vgdev->ctrlq.qlock); + + if (rc > 0) + virtqueue_notify(vgdev->ctrlq.vq); + return rc; } @@ -339,6 +341,10 @@ static int virtio_gpu_queue_fenced_ctrl_buffer(struct virtio_gpu_device *vgdev, virtio_gpu_fence_emit(vgdev, hdr, fence); rc = virtio_gpu_queue_ctrl_buffer_locked(vgdev, vbuf); spin_unlock(&vgdev->ctrlq.qlock); + + if (rc > 0) + virtqueue_notify(vgdev->ctrlq.vq); + return rc; } @@ -369,13 +375,14 @@ static int virtio_gpu_queue_cursor(struct virtio_gpu_device *vgdev, trace_virtio_gpu_cmd_queue(vq, (struct virtio_gpu_ctrl_hdr *)vbuf->buf); - virtqueue_kick(vq); + ret = virtqueue_kick_prepare(vq); } spin_unlock(&vgdev->cursorq.qlock); - if (!ret) - ret = vq->num_free; + if (ret > 0) + virtqueue_notify(vq); + return ret; } -- 2.22.0.410.gd8fdbe21b5-goog
> @@ -291,11 +291,9 @@ static int virtio_gpu_queue_ctrl_buffer_locked(struct virtio_gpu_device *vgdev, > trace_virtio_gpu_cmd_queue(vq, > (struct virtio_gpu_ctrl_hdr *)vbuf->buf); > > - virtqueue_kick(vq); > + ret = virtqueue_kick_prepare(vq); > } > > - if (!ret) > - ret = vq->num_free;Hmm. Change looks unrelated. On a closer look it seems this is basically dead code. virtio_gpu_queue_ctrl_buffer_locked is called by virtio_gpu_queue_ctrl_buffer and virtio_gpu_queue_fenced_ctrl_buffer. The call sites for these two functions all ignore the return value. So it is a valid change, but it should go to a separate patch. And while being at it virtio_gpu_queue_ctrl_buffer and virtio_gpu_queue_fenced_ctrl_buffer can be changed to return void. Otherwise the patch looks fine. Nice analysis btw. cheers, Gerd