thr3ads.net - Linux Virtualization - [virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F

If this information is useful, please help other people find it:
Share via:

Michael S. Tsirkin

2017-Sep-08 03:36 UTC

[PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

On Tue, Aug 29, 2017 at 11:09:18AM +0800, Wei Wang
wrote:> On 08/29/2017 02:03 AM, Michael S. Tsirkin wrote:
> > On Mon, Aug 28, 2017 at 06:08:31PM +0800, Wei Wang wrote:
> > > Add a new feature, VIRTIO_BALLOON_F_SG, which enables the
transfer
> > > of balloon (i.e. inflated/deflated) pages using scatter-gather
lists
> > > to the host.
> > > 
> > > The implementation of the previous virtio-balloon is not very
> > > efficient, because the balloon pages are transferred to the
> > > host one by one. Here is the breakdown of the time in percentage
> > > spent on each step of the balloon inflating process (inflating
> > > 7GB of an 8GB idle guest).
> > > 
> > > 1) allocating pages (6.5%)
> > > 2) sending PFNs to host (68.3%)
> > > 3) address translation (6.1%)
> > > 4) madvise (19%)
> > > 
> > > It takes about 4126ms for the inflating process to complete.
> > > The above profiling shows that the bottlenecks are stage 2)
> > > and stage 4).
> > > 
> > > This patch optimizes step 2) by transferring pages to the host in
> > > sgs. An sg describes a chunk of guest physically continuous
pages.
> > > With this mechanism, step 4) can also be optimized by doing
address
> > > translation and madvise() in chunks rather than page by page.
> > > 
> > > With this new feature, the above ballooning process takes ~597ms
> > > resulting in an improvement of ~86%.
> > > 
> > > TODO: optimize stage 1) by allocating/freeing a chunk of pages
> > > instead of a single page each time.
> > > 
> > > Signed-off-by: Wei Wang <wei.w.wang at intel.com>
> > > Signed-off-by: Liang Li <liang.z.li at intel.com>
> > > Suggested-by: Michael S. Tsirkin <mst at redhat.com>
> > > ---
> > >   drivers/virtio/virtio_balloon.c     | 171
++++++++++++++++++++++++++++++++----
> > >   include/uapi/linux/virtio_balloon.h |   1 +
> > >   2 files changed, 155 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/drivers/virtio/virtio_balloon.c
b/drivers/virtio/virtio_balloon.c
> > > index f0b3a0b..8ecc1d4 100644
> > > --- a/drivers/virtio/virtio_balloon.c
> > > +++ b/drivers/virtio/virtio_balloon.c
> > > @@ -32,6 +32,8 @@
> > >   #include <linux/mm.h>
> > >   #include <linux/mount.h>
> > >   #include <linux/magic.h>
> > > +#include <linux/xbitmap.h>
> > > +#include <asm/page.h>
> > >   /*
> > >    * Balloon device works in 4K page units.  So each page is
pointed to by
> > > @@ -79,6 +81,9 @@ struct virtio_balloon {
> > >   	/* Synchronize access/update to this struct virtio_balloon
elements */
> > >   	struct mutex balloon_lock;
> > > +	/* The xbitmap used to record balloon pages */
> > > +	struct xb page_xb;
> > > +
> > >   	/* The array of pfns we tell the Host about. */
> > >   	unsigned int num_pfns;
> > >   	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
> > > @@ -141,13 +146,111 @@ static void set_page_pfns(struct
virtio_balloon *vb,
> > >   					  page_to_balloon_pfn(page) + i);
> > >   }
> > > +static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t
size)
> > > +{
> > > +	struct scatterlist sg;
> > > +
> > > +	sg_init_one(&sg, addr, size);
> > > +	return virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
> > > +}
> > > +
> > > +static void send_balloon_page_sg(struct virtio_balloon *vb,
> > > +				 struct virtqueue *vq,
> > > +				 void *addr,
> > > +				 uint32_t size,
> > > +				 bool batch)
> > > +{
> > > +	unsigned int len;
> > > +	int err;
> > > +
> > > +	err = add_one_sg(vq, addr, size);
> > > +	/* Sanity check: this can't really happen */
> > > +	WARN_ON(err);
> > It might be cleaner to detect that add failed due to
> > ring full and kick then. Just an idea, up to you
> > whether to do it.
> > 
> > > +
> > > +	/* If batching is in use, we batch the sgs till the vq is full.
*/
> > > +	if (!batch || !vq->num_free) {
> > > +		virtqueue_kick(vq);
> > > +		wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > > +		/* Release all the entries if there are */
> > Meaning
> > 	Account for all used entries if any
> > ?
> > 
> > > +		while (virtqueue_get_buf(vq, &len))
> > > +			;
> > 
> > Above code is reused below. Add a function?
> > 
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * Send balloon pages in sgs to host. The balloon pages are
recorded in the
> > > + * page xbitmap. Each bit in the bitmap corresponds to a page of
PAGE_SIZE.
> > > + * The page xbitmap is searched for continuous "1"
bits, which correspond
> > > + * to continuous pages, to chunk into sgs.
> > > + *
> > > + * @page_xb_start and @page_xb_end form the range of bits in the
xbitmap that
> > > + * need to be searched.
> > > + */
> > > +static void tell_host_sgs(struct virtio_balloon *vb,
> > > +			  struct virtqueue *vq,
> > > +			  unsigned long page_xb_start,
> > > +			  unsigned long page_xb_end)
> > > +{
> > > +	unsigned long sg_pfn_start, sg_pfn_end;
> > > +	void *sg_addr;
> > > +	uint32_t sg_len, sg_max_len = round_down(UINT_MAX, PAGE_SIZE);
> > > +
> > > +	sg_pfn_start = page_xb_start;
> > > +	while (sg_pfn_start < page_xb_end) {
> > > +		sg_pfn_start = xb_find_next_bit(&vb->page_xb,
sg_pfn_start,
> > > +						page_xb_end, 1);
> > > +		if (sg_pfn_start == page_xb_end + 1)
> > > +			break;
> > > +		sg_pfn_end = xb_find_next_bit(&vb->page_xb,
sg_pfn_start + 1,
> > > +					      page_xb_end, 0);
> > > +		sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
> > > +		sg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;
> > > +		while (sg_len > sg_max_len) {
> > > +			send_balloon_page_sg(vb, vq, sg_addr, sg_max_len, 1);
> > Last argument should be true, not 1.
> > 
> > > +			sg_addr += sg_max_len;
> > > +			sg_len -= sg_max_len;
> > > +		}
> > > +		send_balloon_page_sg(vb, vq, sg_addr, sg_len, 1);
> > > +		xb_zero(&vb->page_xb, sg_pfn_start, sg_pfn_end);
> > > +		sg_pfn_start = sg_pfn_end + 1;
> > > +	}
> > > +
> > > +	/*
> > > +	 * The last few sgs may not reach the batch size, but need a
kick to
> > > +	 * notify the device to handle them.
> > > +	 */
> > > +	if (vq->num_free != virtqueue_get_vring_size(vq)) {
> > > +		virtqueue_kick(vq);
> > > +		wait_event(vb->acked, virtqueue_get_buf(vq, &sg_len));
> > > +		while (virtqueue_get_buf(vq, &sg_len))
> > > +			;
> > Some entries can get used after a pause. Looks like they will leak
then?
> > One fix would be to convert above if to a while loop.
> > I don't know whether to do it like this in send_balloon_page_sg
too.
> > 
> 
> Thanks for the above comments. I've re-written this part of code.
> Please have a check below if there is anything more we could improve:
> 
> static void kick_and_wait(struct virtqueue *vq, wait_queue_head_t wq_head)
> {
>         unsigned int len;
> 
>         virtqueue_kick(vq);
>         wait_event(wq_head, virtqueue_get_buf(vq, &len));
>         /* Detach all the used buffers from the vq */
>         while (virtqueue_get_buf(vq, &len))
>                 ;
I would move this last part to before add_buf. Increases chances
it succeeds even in case of a bug.
> }
> 
> static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)
> {
>         struct scatterlist sg;
>         int ret;
> 
>         sg_init_one(&sg, addr, size);
>         ret = virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
>         if (unlikely(ret == -ENOSPC))
>                 dev_warn(&vq->vdev->dev, "%s: failed due to
ring full\n",
>                                  __func__);
So if this ever triggers then kick and wait might fail, right?
I think you should not special-case this one then.
> 
>         return ret;
> }
> 
> static void send_balloon_page_sg(struct virtio_balloon *vb,
>                                                       struct virtqueue *vq,
>                                                       void *addr,
>                                                       uint32_t size,
>                                                       bool batch)
> {
>         int err;
> 
>         do {
>                 err = add_one_sg(vq, addr, size);
>                 if (err == -ENOSPC || !batch || !vq->num_free)
>                         kick_and_wait(vq, vb->acked);
>         } while (err == -ENOSPC);
> }
> 
> 
> Best,
> Wei
I think this fixes the bug, yes. I would skip trying to handle ENOSPC,
it's more a less a bug if it triggers. Just bail out on any error.

-- 
MST

Wei Wang

2017-Sep-08 11:09 UTC

head link

[virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

On 09/08/2017 11:36 AM, Michael S. Tsirkin wrote:> On Tue, Aug 29, 2017 at 11:09:18AM +0800, Wei Wang wrote:
>> On 08/29/2017 02:03 AM, Michael S. Tsirkin wrote:
>>> On Mon, Aug 28, 2017 at 06:08:31PM +0800, Wei Wang wrote:
>>>> Add a new feature, VIRTIO_BALLOON_F_SG, which enables the
transfer
>>>> of balloon (i.e. inflated/deflated) pages using scatter-gather
lists
>>>> to the host.
>>>>
>>>> The implementation of the previous virtio-balloon is not very
>>>> efficient, because the balloon pages are transferred to the
>>>> host one by one. Here is the breakdown of the time in
percentage
>>>> spent on each step of the balloon inflating process (inflating
>>>> 7GB of an 8GB idle guest).
>>>>
>>>> 1) allocating pages (6.5%)
>>>> 2) sending PFNs to host (68.3%)
>>>> 3) address translation (6.1%)
>>>> 4) madvise (19%)
>>>>
>>>> It takes about 4126ms for the inflating process to complete.
>>>> The above profiling shows that the bottlenecks are stage 2)
>>>> and stage 4).
>>>>
>>>> This patch optimizes step 2) by transferring pages to the host
in
>>>> sgs. An sg describes a chunk of guest physically continuous
pages.
>>>> With this mechanism, step 4) can also be optimized by doing
address
>>>> translation and madvise() in chunks rather than page by page.
>>>>
>>>> With this new feature, the above ballooning process takes
~597ms
>>>> resulting in an improvement of ~86%.
>>>>
>>>> TODO: optimize stage 1) by allocating/freeing a chunk of pages
>>>> instead of a single page each time.
>>>>
>>>> Signed-off-by: Wei Wang <wei.w.wang at intel.com>
>>>> Signed-off-by: Liang Li <liang.z.li at intel.com>
>>>> Suggested-by: Michael S. Tsirkin <mst at redhat.com>
>>>> ---
>>>>    drivers/virtio/virtio_balloon.c     | 171
++++++++++++++++++++++++++++++++----
>>>>    include/uapi/linux/virtio_balloon.h |   1 +
>>>>    2 files changed, 155 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/drivers/virtio/virtio_balloon.c
b/drivers/virtio/virtio_balloon.c
>>>> index f0b3a0b..8ecc1d4 100644
>>>> --- a/drivers/virtio/virtio_balloon.c
>>>> +++ b/drivers/virtio/virtio_balloon.c
>>>> @@ -32,6 +32,8 @@
>>>>    #include <linux/mm.h>
>>>>    #include <linux/mount.h>
>>>>    #include <linux/magic.h>
>>>> +#include <linux/xbitmap.h>
>>>> +#include <asm/page.h>
>>>>    /*
>>>>     * Balloon device works in 4K page units.  So each page is
pointed to by
>>>> @@ -79,6 +81,9 @@ struct virtio_balloon {
>>>>    	/* Synchronize access/update to this struct virtio_balloon
elements */
>>>>    	struct mutex balloon_lock;
>>>> +	/* The xbitmap used to record balloon pages */
>>>> +	struct xb page_xb;
>>>> +
>>>>    	/* The array of pfns we tell the Host about. */
>>>>    	unsigned int num_pfns;
>>>>    	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
>>>> @@ -141,13 +146,111 @@ static void set_page_pfns(struct
virtio_balloon *vb,
>>>>    					  page_to_balloon_pfn(page) + i);
>>>>    }
>>>> +static int add_one_sg(struct virtqueue *vq, void *addr,
uint32_t size)
>>>> +{
>>>> +	struct scatterlist sg;
>>>> +
>>>> +	sg_init_one(&sg, addr, size);
>>>> +	return virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
>>>> +}
>>>> +
>>>> +static void send_balloon_page_sg(struct virtio_balloon *vb,
>>>> +				 struct virtqueue *vq,
>>>> +				 void *addr,
>>>> +				 uint32_t size,
>>>> +				 bool batch)
>>>> +{
>>>> +	unsigned int len;
>>>> +	int err;
>>>> +
>>>> +	err = add_one_sg(vq, addr, size);
>>>> +	/* Sanity check: this can't really happen */
>>>> +	WARN_ON(err);
>>> It might be cleaner to detect that add failed due to
>>> ring full and kick then. Just an idea, up to you
>>> whether to do it.
>>>
>>>> +
>>>> +	/* If batching is in use, we batch the sgs till the vq is
full. */
>>>> +	if (!batch || !vq->num_free) {
>>>> +		virtqueue_kick(vq);
>>>> +		wait_event(vb->acked, virtqueue_get_buf(vq, &len));
>>>> +		/* Release all the entries if there are */
>>> Meaning
>>> 	Account for all used entries if any
>>> ?
>>>
>>>> +		while (virtqueue_get_buf(vq, &len))
>>>> +			;
>>> Above code is reused below. Add a function?
>>>
>>>> +	}
>>>> +}
>>>> +
>>>> +/*
>>>> + * Send balloon pages in sgs to host. The balloon pages are
recorded in the
>>>> + * page xbitmap. Each bit in the bitmap corresponds to a page
of PAGE_SIZE.
>>>> + * The page xbitmap is searched for continuous "1"
bits, which correspond
>>>> + * to continuous pages, to chunk into sgs.
>>>> + *
>>>> + * @page_xb_start and @page_xb_end form the range of bits in
the xbitmap that
>>>> + * need to be searched.
>>>> + */
>>>> +static void tell_host_sgs(struct virtio_balloon *vb,
>>>> +			  struct virtqueue *vq,
>>>> +			  unsigned long page_xb_start,
>>>> +			  unsigned long page_xb_end)
>>>> +{
>>>> +	unsigned long sg_pfn_start, sg_pfn_end;
>>>> +	void *sg_addr;
>>>> +	uint32_t sg_len, sg_max_len = round_down(UINT_MAX,
PAGE_SIZE);
>>>> +
>>>> +	sg_pfn_start = page_xb_start;
>>>> +	while (sg_pfn_start < page_xb_end) {
>>>> +		sg_pfn_start = xb_find_next_bit(&vb->page_xb,
sg_pfn_start,
>>>> +						page_xb_end, 1);
>>>> +		if (sg_pfn_start == page_xb_end + 1)
>>>> +			break;
>>>> +		sg_pfn_end = xb_find_next_bit(&vb->page_xb,
sg_pfn_start + 1,
>>>> +					      page_xb_end, 0);
>>>> +		sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
>>>> +		sg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;
>>>> +		while (sg_len > sg_max_len) {
>>>> +			send_balloon_page_sg(vb, vq, sg_addr, sg_max_len, 1);
>>> Last argument should be true, not 1.
>>>
>>>> +			sg_addr += sg_max_len;
>>>> +			sg_len -= sg_max_len;
>>>> +		}
>>>> +		send_balloon_page_sg(vb, vq, sg_addr, sg_len, 1);
>>>> +		xb_zero(&vb->page_xb, sg_pfn_start, sg_pfn_end);
>>>> +		sg_pfn_start = sg_pfn_end + 1;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * The last few sgs may not reach the batch size, but need a
kick to
>>>> +	 * notify the device to handle them.
>>>> +	 */
>>>> +	if (vq->num_free != virtqueue_get_vring_size(vq)) {
>>>> +		virtqueue_kick(vq);
>>>> +		wait_event(vb->acked, virtqueue_get_buf(vq,
&sg_len));
>>>> +		while (virtqueue_get_buf(vq, &sg_len))
>>>> +			;
>>> Some entries can get used after a pause. Looks like they will leak
then?
>>> One fix would be to convert above if to a while loop.
>>> I don't know whether to do it like this in send_balloon_page_sg
too.
>>>
>> Thanks for the above comments. I've re-written this part of code.
>> Please have a check below if there is anything more we could improve:
>>
>> static void kick_and_wait(struct virtqueue *vq, wait_queue_head_t
wq_head)
>> {
>>          unsigned int len;
>>
>>          virtqueue_kick(vq);
>>          wait_event(wq_head, virtqueue_get_buf(vq, &len));
>>          /* Detach all the used buffers from the vq */
>>          while (virtqueue_get_buf(vq, &len))
>>                  ;
> I would move this last part to before add_buf. Increases chances
> it succeeds even in case of a bug.
>
>> }
>>
>> static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)
>> {
>>          struct scatterlist sg;
>>          int ret;
>>
>>          sg_init_one(&sg, addr, size);
>>          ret = virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
>>          if (unlikely(ret == -ENOSPC))
>>                  dev_warn(&vq->vdev->dev, "%s: failed
due to ring full\n",
>>                                   __func__);
> So if this ever triggers then kick and wait might fail, right?
> I think you should not special-case this one then.
OK, I will remove the check above, and take other suggestions as well. 
Thanks.

Best,
Wei

Michael S. Tsirkin

2017-Sep-29 04:01 UTC

head link

[virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

On Fri, Sep 08, 2017 at 07:09:24PM +0800, Wei Wang
wrote:> On 09/08/2017 11:36 AM, Michael S. Tsirkin wrote:
> > On Tue, Aug 29, 2017 at 11:09:18AM +0800, Wei Wang wrote:
> > > On 08/29/2017 02:03 AM, Michael S. Tsirkin wrote:
> > > > On Mon, Aug 28, 2017 at 06:08:31PM +0800, Wei Wang wrote:
> > > > > Add a new feature, VIRTIO_BALLOON_F_SG, which enables
the transfer
> > > > > of balloon (i.e. inflated/deflated) pages using
scatter-gather lists
> > > > > to the host.
> > > > > 
> > > > > The implementation of the previous virtio-balloon is
not very
> > > > > efficient, because the balloon pages are transferred to
the
> > > > > host one by one. Here is the breakdown of the time in
percentage
> > > > > spent on each step of the balloon inflating process
(inflating
> > > > > 7GB of an 8GB idle guest).
> > > > > 
> > > > > 1) allocating pages (6.5%)
> > > > > 2) sending PFNs to host (68.3%)
> > > > > 3) address translation (6.1%)
> > > > > 4) madvise (19%)
> > > > > 
> > > > > It takes about 4126ms for the inflating process to
complete.
> > > > > The above profiling shows that the bottlenecks are
stage 2)
> > > > > and stage 4).
> > > > > 
> > > > > This patch optimizes step 2) by transferring pages to
the host in
> > > > > sgs. An sg describes a chunk of guest physically
continuous pages.
> > > > > With this mechanism, step 4) can also be optimized by
doing address
> > > > > translation and madvise() in chunks rather than page by
page.
> > > > > 
> > > > > With this new feature, the above ballooning process
takes ~597ms
> > > > > resulting in an improvement of ~86%.
> > > > > 
> > > > > TODO: optimize stage 1) by allocating/freeing a chunk
of pages
> > > > > instead of a single page each time.
> > > > > 
> > > > > Signed-off-by: Wei Wang <wei.w.wang at intel.com>
> > > > > Signed-off-by: Liang Li <liang.z.li at intel.com>
> > > > > Suggested-by: Michael S. Tsirkin <mst at
redhat.com>
> > > > > ---
> > > > >    drivers/virtio/virtio_balloon.c     | 171
++++++++++++++++++++++++++++++++----
> > > > >    include/uapi/linux/virtio_balloon.h |   1 +
> > > > >    2 files changed, 155 insertions(+), 17 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/virtio/virtio_balloon.c
b/drivers/virtio/virtio_balloon.c
> > > > > index f0b3a0b..8ecc1d4 100644
> > > > > --- a/drivers/virtio/virtio_balloon.c
> > > > > +++ b/drivers/virtio/virtio_balloon.c
> > > > > @@ -32,6 +32,8 @@
> > > > >    #include <linux/mm.h>
> > > > >    #include <linux/mount.h>
> > > > >    #include <linux/magic.h>
> > > > > +#include <linux/xbitmap.h>
> > > > > +#include <asm/page.h>
> > > > >    /*
> > > > >     * Balloon device works in 4K page units.  So each
page is pointed to by
> > > > > @@ -79,6 +81,9 @@ struct virtio_balloon {
> > > > >    	/* Synchronize access/update to this struct
virtio_balloon elements */
> > > > >    	struct mutex balloon_lock;
> > > > > +	/* The xbitmap used to record balloon pages */
> > > > > +	struct xb page_xb;
> > > > > +
> > > > >    	/* The array of pfns we tell the Host about. */
> > > > >    	unsigned int num_pfns;
> > > > >    	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
> > > > > @@ -141,13 +146,111 @@ static void set_page_pfns(struct
virtio_balloon *vb,
> > > > >    					  page_to_balloon_pfn(page) + i);
> > > > >    }
> > > > > +static int add_one_sg(struct virtqueue *vq, void
*addr, uint32_t size)
> > > > > +{
> > > > > +	struct scatterlist sg;
> > > > > +
> > > > > +	sg_init_one(&sg, addr, size);
> > > > > +	return virtqueue_add_inbuf(vq, &sg, 1, vq,
GFP_KERNEL);
> > > > > +}
> > > > > +
> > > > > +static void send_balloon_page_sg(struct virtio_balloon
*vb,
> > > > > +				 struct virtqueue *vq,
> > > > > +				 void *addr,
> > > > > +				 uint32_t size,
> > > > > +				 bool batch)
> > > > > +{
> > > > > +	unsigned int len;
> > > > > +	int err;
> > > > > +
> > > > > +	err = add_one_sg(vq, addr, size);
> > > > > +	/* Sanity check: this can't really happen */
> > > > > +	WARN_ON(err);
> > > > It might be cleaner to detect that add failed due to
> > > > ring full and kick then. Just an idea, up to you
> > > > whether to do it.
> > > > 
> > > > > +
> > > > > +	/* If batching is in use, we batch the sgs till the
vq is full. */
> > > > > +	if (!batch || !vq->num_free) {
> > > > > +		virtqueue_kick(vq);
> > > > > +		wait_event(vb->acked, virtqueue_get_buf(vq,
&len));
> > > > > +		/* Release all the entries if there are */
> > > > Meaning
> > > > 	Account for all used entries if any
> > > > ?
> > > > 
> > > > > +		while (virtqueue_get_buf(vq, &len))
> > > > > +			;
> > > > Above code is reused below. Add a function?
> > > > 
> > > > > +	}
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Send balloon pages in sgs to host. The balloon
pages are recorded in the
> > > > > + * page xbitmap. Each bit in the bitmap corresponds to
a page of PAGE_SIZE.
> > > > > + * The page xbitmap is searched for continuous
"1" bits, which correspond
> > > > > + * to continuous pages, to chunk into sgs.
> > > > > + *
> > > > > + * @page_xb_start and @page_xb_end form the range of
bits in the xbitmap that
> > > > > + * need to be searched.
> > > > > + */
> > > > > +static void tell_host_sgs(struct virtio_balloon *vb,
> > > > > +			  struct virtqueue *vq,
> > > > > +			  unsigned long page_xb_start,
> > > > > +			  unsigned long page_xb_end)
> > > > > +{
> > > > > +	unsigned long sg_pfn_start, sg_pfn_end;
> > > > > +	void *sg_addr;
> > > > > +	uint32_t sg_len, sg_max_len = round_down(UINT_MAX,
PAGE_SIZE);
> > > > > +
> > > > > +	sg_pfn_start = page_xb_start;
> > > > > +	while (sg_pfn_start < page_xb_end) {
> > > > > +		sg_pfn_start = xb_find_next_bit(&vb->page_xb,
sg_pfn_start,
> > > > > +						page_xb_end, 1);
> > > > > +		if (sg_pfn_start == page_xb_end + 1)
> > > > > +			break;
> > > > > +		sg_pfn_end = xb_find_next_bit(&vb->page_xb,
sg_pfn_start + 1,
> > > > > +					      page_xb_end, 0);
> > > > > +		sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
> > > > > +		sg_len = (sg_pfn_end - sg_pfn_start) <<
PAGE_SHIFT;
> > > > > +		while (sg_len > sg_max_len) {
> > > > > +			send_balloon_page_sg(vb, vq, sg_addr, sg_max_len,
1);
> > > > Last argument should be true, not 1.
> > > > 
> > > > > +			sg_addr += sg_max_len;
> > > > > +			sg_len -= sg_max_len;
> > > > > +		}
> > > > > +		send_balloon_page_sg(vb, vq, sg_addr, sg_len, 1);
> > > > > +		xb_zero(&vb->page_xb, sg_pfn_start,
sg_pfn_end);
> > > > > +		sg_pfn_start = sg_pfn_end + 1;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * The last few sgs may not reach the batch size, but
need a kick to
> > > > > +	 * notify the device to handle them.
> > > > > +	 */
> > > > > +	if (vq->num_free != virtqueue_get_vring_size(vq))
{
> > > > > +		virtqueue_kick(vq);
> > > > > +		wait_event(vb->acked, virtqueue_get_buf(vq,
&sg_len));
> > > > > +		while (virtqueue_get_buf(vq, &sg_len))
> > > > > +			;
> > > > Some entries can get used after a pause. Looks like they
will leak then?
> > > > One fix would be to convert above if to a while loop.
> > > > I don't know whether to do it like this in
send_balloon_page_sg too.
> > > > 
> > > Thanks for the above comments. I've re-written this part of
code.
> > > Please have a check below if there is anything more we could
improve:
> > > 
> > > static void kick_and_wait(struct virtqueue *vq, wait_queue_head_t
wq_head)
> > > {
> > >          unsigned int len;
> > > 
> > >          virtqueue_kick(vq);
> > >          wait_event(wq_head, virtqueue_get_buf(vq, &len));
> > >          /* Detach all the used buffers from the vq */
> > >          while (virtqueue_get_buf(vq, &len))
> > >                  ;
> > I would move this last part to before add_buf. Increases chances
> > it succeeds even in case of a bug.
> 
> > 
> > > }
> > > 
> > > static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t
size)
> > > {
> > >          struct scatterlist sg;
> > >          int ret;
> > > 
> > >          sg_init_one(&sg, addr, size);
> > >          ret = virtqueue_add_inbuf(vq, &sg, 1, vq,
GFP_KERNEL);
> > >          if (unlikely(ret == -ENOSPC))
> > >                  dev_warn(&vq->vdev->dev, "%s:
failed due to ring full\n",
> > >                                   __func__);
> > So if this ever triggers then kick and wait might fail, right?
> > I think you should not special-case this one then.
> 
> OK, I will remove the check above, and take other suggestions as well.
> Thanks.
> 
> Best,
> Wei
Any updates here? It's been a while.

Reasonably Related Threads

Search for more possibly parallel threads

Linux Virtualization - Sep 2017 - [virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

[PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

[virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

[virtio-dev] Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

Reasonably Related Threads