Michael S. Tsirkin
2018-Jan-26 02:42 UTC
[virtio-dev] Re: [PATCH v25 2/2] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On Fri, Jan 26, 2018 at 09:40:44AM +0800, Wei Wang wrote:> On 01/25/2018 09:49 PM, Michael S. Tsirkin wrote: > > On Thu, Jan 25, 2018 at 05:14:06PM +0800, Wei Wang wrote: > > > + > > > +static void report_free_page_func(struct work_struct *work) > > > +{ > > > + struct virtio_balloon *vb; > > > + int ret; > > > + > > > + vb = container_of(work, struct virtio_balloon, report_free_page_work); > > > + > > > + /* Start by sending the received cmd id to host with an outbuf */ > > > + ret = send_cmd_id(vb, vb->cmd_id_received); > > > + if (unlikely(ret)) > > > + goto err; > > > + > > > + ret = walk_free_mem_block(vb, 0, &virtio_balloon_send_free_pages); > > > + if (unlikely(ret < 0)) > > > + goto err; > > > + > > > + /* End by sending a stop id to host with an outbuf */ > > > + ret = send_cmd_id(vb, VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); > > > + if (likely(!ret)) > > > + return; > > > +err: > > > + dev_err(&vb->vdev->dev, "%s failure: free page vq is broken\n", > > > + __func__); > > > +} > > > + > > So that's very simple, but it only works well if the whole > > free list fits in the queue or host processes the queue faster > > than the guest. What if it doesn't? > > This is the case that the virtqueue gets full, and I think we've agreed that > this is an optimization feature and losing some hints to report isn't > important, right? > > Actually, in the tests, there is no chance to see the ring is full. If we > check the host patches that were shared before, the device side operation is > quite simple, it just clears the related bits from the bitmap, and then > continues to take entries from the virtqueue till the virtqueue gets empty. > > > > If we had restartability you could just drop the lock > > and wait for a vq interrupt to make more progress, which > > would be better I think. > > > > Restartability means that caller needs to record the state where it was when > it stopped last time.See my comment on the mm patch: if you rotate the previously reported pages towards the end, then you mostly get restartability for free, if only per zone. The only thing remaining will be stopping at a page you already reported. There aren't many zones so restartability wrt zones is kind of trivial.> The controversy is that the free list is not static > once the lock is dropped, so everything is dynamically changing, including > the state that was recorded. The method we are using is more prudent, IMHO. > How about taking the fundamental solution, and seek to improve incrementally > in the future? > > > Best, > WeiI'd like to see kicks happen outside the spinlock. kick with a spinlock taken looks like a scalability issue that won't be easy to reproduce but hurt workloads at random unexpected times. -- MST
Wei Wang
2018-Jan-26 03:31 UTC
[virtio-dev] Re: [PATCH v25 2/2] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On 01/26/2018 10:42 AM, Michael S. Tsirkin wrote:> On Fri, Jan 26, 2018 at 09:40:44AM +0800, Wei Wang wrote: >> On 01/25/2018 09:49 PM, Michael S. Tsirkin wrote: >>> On Thu, Jan 25, 2018 at 05:14:06PM +0800, Wei Wang wrote: >>>>> The controversy is that the free list is not static >> once the lock is dropped, so everything is dynamically changing, including >> the state that was recorded. The method we are using is more prudent, IMHO. >> How about taking the fundamental solution, and seek to improve incrementally >> in the future? >> >> >> Best, >> Wei > I'd like to see kicks happen outside the spinlock. kick with a spinlock > taken looks like a scalability issue that won't be easy to > reproduce but hurt workloads at random unexpected times. >Is that "kick inside the spinlock" the only concern you have? I think we can remove the kick actually. If we check how the host side works, it is worthwhile to let the host poll the virtqueue after it receives the cmd id from the guest (kick for cmd id isn't within the lock). Best, Wei
Tetsuo Handa
2018-Jan-26 13:35 UTC
[virtio-dev] Re: [PATCH v25 2/2] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On 2018/01/26 12:31, Wei Wang wrote:> On 01/26/2018 10:42 AM, Michael S. Tsirkin wrote: >> On Fri, Jan 26, 2018 at 09:40:44AM +0800, Wei Wang wrote: >>> On 01/25/2018 09:49 PM, Michael S. Tsirkin wrote: >>>> On Thu, Jan 25, 2018 at 05:14:06PM +0800, Wei Wang wrote: >>>> > >>> The controversy is that the free list is not static >>> once the lock is dropped, so everything is dynamically changing, including >>> the state that was recorded. The method we are using is more prudent, IMHO. >>> How about taking the fundamental solution, and seek to improve incrementally >>> in the future? >>> >>> >>> Best, >>> Wei >> I'd like to see kicks happen outside the spinlock. kick with a spinlock >> taken looks like a scalability issue that won't be easy to >> reproduce but hurt workloads at random unexpected times. >> > > Is that "kick inside the spinlock" the only concern you have? I think we can remove the kick actually. If we check how the host side works, it is worthwhile to let the host poll the virtqueue after it receives the cmd id from the guest (kick for cmd id isn't within the lock).We should start from the worst case. + * The callback itself must not sleep or perform any operations which would + * require any memory allocations directly (not even GFP_NOWAIT/GFP_ATOMIC) + * or via any lock dependency. It is generally advisable to implement + * the callback as simple as possible and defer any heavy lifting to a + * different context. Making decision based on performance numbers of idle guests is dangerous. There might be busy CPUs waiting for zone->lock.
Michael S. Tsirkin
2018-Jan-30 23:44 UTC
[virtio-dev] Re: [PATCH v25 2/2] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On Fri, Jan 26, 2018 at 11:31:19AM +0800, Wei Wang wrote:> On 01/26/2018 10:42 AM, Michael S. Tsirkin wrote: > > On Fri, Jan 26, 2018 at 09:40:44AM +0800, Wei Wang wrote: > > > On 01/25/2018 09:49 PM, Michael S. Tsirkin wrote: > > > > On Thu, Jan 25, 2018 at 05:14:06PM +0800, Wei Wang wrote: > > > > > > > > The controversy is that the free list is not static > > > once the lock is dropped, so everything is dynamically changing, including > > > the state that was recorded. The method we are using is more prudent, IMHO. > > > How about taking the fundamental solution, and seek to improve incrementally > > > in the future? > > > > > > > > > Best, > > > Wei > > I'd like to see kicks happen outside the spinlock. kick with a spinlock > > taken looks like a scalability issue that won't be easy to > > reproduce but hurt workloads at random unexpected times. > > > > Is that "kick inside the spinlock" the only concern you have? I think we can > remove the kick actually. If we check how the host side works, it is > worthwhile to let the host poll the virtqueue after it receives the cmd id > from the guest (kick for cmd id isn't within the lock). > > > Best, > WeiSo really there are different ways to put free page hints to use. The current interface requires host to do dirty tracking for all memory, and it's more or less useless for things like freeing host memory. So while your project's needs seem to be addressed, I'm still a bit disappointed that so little collaboration happened with e.g. Nitesh's project, to the point where you don't even CC him on patches. So I'm kind of trying to bridge this a bit - I would like the interfaces that we build to at least superficially look like they might be reusable for other uses of hinting. Imagine that you don't have dirty tracking on the host. What would it take to still use hinting information, e.g. to call MADV_FREE on the pages guest gives us? I think you need to kick and you need to wait for host to consume the hint before page is reused. And we know madvise takes a lot of time sometimes, so locking out the free list does not sound like a good idea. That's why I was talking about kick out of lock, so that eventually we can reuse that for hinting and actually wait for an interrupt. So how about we take a bunch of pages out of the free list, move them to the balloon, kick (and optionally wait for host to consume), them move them back? Preferably to end of the list? This will also make things like sorting them much easier as you can just put them in a binary tree or something. For when we need to be careful to make sure we don't create an OOM situation with this out of thin air, and for when you can't give everything to host in one go, you might want some kind of notifier that tells you that you need to return pages to the free list ASAP. How'd this sound? -- MST