thr3ads.net - Linux Virtualization - [PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full [May 2023]

If this information is useful, please help other people find it:
Share via:

Peter-Jan Gootzen

2023-May-31 20:34 UTC

[PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

On 31/05/2023 21:18, Vivek Goyal wrote:> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
>> When the Virtio queue is full, a work item is scheduled
>> to execute in 1ms that retries adding the request to the queue.
>> This is a large amount of time on the scale on which a
>> virtio-fs device can operate. When using a DPU this is around
>> 40us baseline without going to a remote server (4k, QD=1).
>> This patch queues requests when the Virtio queue is full,
>> and when a completed request is taken off, immediately fills
>> it back up with queued requests.
>>
>> This reduces the 99.9th percentile latencies in our tests by
>> 60x and slightly increases the overall throughput, when using a
>> queue depth 2x the size of the Virtio queue size, with a
>> DPU-powered virtio-fs device.
>>
>> Signed-off-by: Peter-Jan Gootzen <peter-jan at gootzen.net>
>> ---
>> V1 -> V2: Not scheduling dispatch work anymore when not needed
>> and changed delayed_work structs to work_struct structs
>>
>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>> index 4d8d4f16c727..a676297db09b 100644
>> --- a/fs/fuse/virtio_fs.c
>> +++ b/fs/fuse/virtio_fs.c
>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
>>  	struct work_struct done_work;
>>  	struct list_head queued_reqs;
>>  	struct list_head end_reqs;	/* End these requests */
>> -	struct delayed_work dispatch_work;
>> +	struct work_struct dispatch_work;
>>  	struct fuse_dev *fud;
>>  	bool connected;
>>  	long in_flight;
>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct
virtio_fs_vq *fsvq)
>>  	}
>>  
>>  	flush_work(&fsvq->done_work);
>> -	flush_delayed_work(&fsvq->dispatch_work);
>> +	flush_work(&fsvq->dispatch_work);
>>  }
>>  
>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct
work_struct *work)
>>  			dec_in_flight_req(fsvq);
>>  		}
>>  	} while (!virtqueue_enable_cb(vq) &&
likely(!virtqueue_is_broken(vq)));
>> +
>> +	if (!list_empty(&fsvq->queued_reqs))
>> +		schedule_work(&fsvq->dispatch_work);
>>  	spin_unlock(&fsvq->lock);
>>  }
>>  
>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct
work_struct *work)
>>  {
>>  	struct fuse_req *req;
>>  	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
>> -						 dispatch_work.work);
>> +						 dispatch_work);
>>  	int ret;
>>  
>>  	pr_debug("virtio-fs: worker %s called.\n", __func__);
>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct
work_struct *work)
>>  			if (ret == -ENOMEM || ret == -ENOSPC) {
>>  				spin_lock(&fsvq->lock);
>>  				list_add_tail(&req->list, &fsvq->queued_reqs);
>> -				schedule_delayed_work(&fsvq->dispatch_work,
>> -						      msecs_to_jiffies(1));
> 
> Virtqueue being full is only one of the reasons for failure to queue
> the request. What if virtqueue is empty but we could not queue the
> request because lack of memory (-ENOMEM). In that case we will queue
> the request and it might not be dispatched because there is no completion.
> (Assume there is no further new request coming). That means deadlock?
> 
> Thanks
> Vivek
> 
Good catch that will deadlock.

Is default kernel behavior to indefinitely retry a file system
request until memory is available?

Vivek Goyal

2023-May-31 20:49 UTC

head link

[PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen
wrote:> On 31/05/2023 21:18, Vivek Goyal wrote:
> > On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
> >> When the Virtio queue is full, a work item is scheduled
> >> to execute in 1ms that retries adding the request to the queue.
> >> This is a large amount of time on the scale on which a
> >> virtio-fs device can operate. When using a DPU this is around
> >> 40us baseline without going to a remote server (4k, QD=1).
> >> This patch queues requests when the Virtio queue is full,
> >> and when a completed request is taken off, immediately fills
> >> it back up with queued requests.
> >>
> >> This reduces the 99.9th percentile latencies in our tests by
> >> 60x and slightly increases the overall throughput, when using a
> >> queue depth 2x the size of the Virtio queue size, with a
> >> DPU-powered virtio-fs device.
> >>
> >> Signed-off-by: Peter-Jan Gootzen <peter-jan at gootzen.net>
> >> ---
> >> V1 -> V2: Not scheduling dispatch work anymore when not needed
> >> and changed delayed_work structs to work_struct structs
> >>
> >>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
> >>  1 file changed, 17 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> >> index 4d8d4f16c727..a676297db09b 100644
> >> --- a/fs/fuse/virtio_fs.c
> >> +++ b/fs/fuse/virtio_fs.c
> >> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
> >>  	struct work_struct done_work;
> >>  	struct list_head queued_reqs;
> >>  	struct list_head end_reqs;	/* End these requests */
> >> -	struct delayed_work dispatch_work;
> >> +	struct work_struct dispatch_work;
> >>  	struct fuse_dev *fud;
> >>  	bool connected;
> >>  	long in_flight;
> >> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct
virtio_fs_vq *fsvq)
> >>  	}
> >>  
> >>  	flush_work(&fsvq->done_work);
> >> -	flush_delayed_work(&fsvq->dispatch_work);
> >> +	flush_work(&fsvq->dispatch_work);
> >>  }
> >>  
> >>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs
*fs)
> >> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct
work_struct *work)
> >>  			dec_in_flight_req(fsvq);
> >>  		}
> >>  	} while (!virtqueue_enable_cb(vq) &&
likely(!virtqueue_is_broken(vq)));
> >> +
> >> +	if (!list_empty(&fsvq->queued_reqs))
> >> +		schedule_work(&fsvq->dispatch_work);
> >>  	spin_unlock(&fsvq->lock);
> >>  }
> >>  
> >> @@ -353,7 +356,7 @@ static void
virtio_fs_request_dispatch_work(struct work_struct *work)
> >>  {
> >>  	struct fuse_req *req;
> >>  	struct virtio_fs_vq *fsvq = container_of(work, struct
virtio_fs_vq,
> >> -						 dispatch_work.work);
> >> +						 dispatch_work);
> >>  	int ret;
> >>  
> >>  	pr_debug("virtio-fs: worker %s called.\n", __func__);
> >> @@ -388,8 +391,6 @@ static void
virtio_fs_request_dispatch_work(struct work_struct *work)
> >>  			if (ret == -ENOMEM || ret == -ENOSPC) {
> >>  				spin_lock(&fsvq->lock);
> >>  				list_add_tail(&req->list, &fsvq->queued_reqs);
> >> -				schedule_delayed_work(&fsvq->dispatch_work,
> >> -						      msecs_to_jiffies(1));
> > 
> > Virtqueue being full is only one of the reasons for failure to queue
> > the request. What if virtqueue is empty but we could not queue the
> > request because lack of memory (-ENOMEM). In that case we will queue
> > the request and it might not be dispatched because there is no
completion.
> > (Assume there is no further new request coming). That means deadlock?
> > 
> > Thanks
> > Vivek
> > 
> 
> Good catch that will deadlock.
> 
> Is default kernel behavior to indefinitely retry a file system
> request until memory is available?
As of now that seems to be the behavior. I think I had copied this
code from another driver. 

But I guess one can argue that if memory is not available, then
return -ENOMEM to user space instead of retrying in kernel.

Stefan, Miklos, WDYT?

Thanks
Vivek

Linux Virtualization - May 2023 - [PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

[PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full

[PATCH V2] virtio-fs: Improved request latencies when Virtio queue is full