David Hildenbrand
2020-Feb-05 16:33 UTC
[PATCH v1 0/3] virtio-balloon: Fixes + switch back to OOM handler
Two fixes for issues I stumbled over while working on patch #3.

Switch back to the good ol' OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
as the switch to the shrinker introduced some undesired side effects. Keep
the shrinker in place to handle VIRTIO_BALLOON_F_FREE_PAGE_HINT.
Lengthy discussion under [1].

I tested with QEMU and "deflate-on-oom=on". Works as expected. Did not
test the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, as it is hard to
trigger (only when migrating a VM, and even then, it might not trigger).

[1] https://www.spinics.net/lists/linux-virtualization/msg40863.html

David Hildenbrand (3):
  virtio-balloon: Fix memory leak when unloading while hinting is in
    progress
  virtio_balloon: Fix memory leaks on errors in virtballoon_probe()
  virtio-balloon: Switch back to OOM handler for
    VIRTIO_BALLOON_F_DEFLATE_ON_OOM

 drivers/virtio/virtio_balloon.c | 124 +++++++++++++++-----------------
 1 file changed, 57 insertions(+), 67 deletions(-)

--
2.24.1
David Hildenbrand
2020-Feb-05 16:34 UTC
[PATCH v1 1/3] virtio-balloon: Fix memory leak when unloading while hinting is in progress
When unloading the driver while hinting is in progress, we will not
release the free page blocks back to MM, resulting in a memory leak.

Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: "Michael S. Tsirkin" <mst at redhat.com>
Cc: Jason Wang <jasowang at redhat.com>
Cc: Wei Wang <wei.w.wang at intel.com>
Cc: Liang Li <liang.z.li at intel.com>
Signed-off-by: David Hildenbrand <david at redhat.com>
---
 drivers/virtio/virtio_balloon.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8e400ece9273..abef2306c899 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -968,6 +968,10 @@ static void remove_common(struct virtio_balloon *vb)
 		leak_balloon(vb, vb->num_pages);
 	update_balloon_size(vb);
 
+	/* There might be free pages that are being reported: release them. */
+	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+		return_free_pages_to_mm(vb, ULONG_MAX);
+
 	/* Now we reset the device so we can clean up the queues. */
 	vb->vdev->config->reset(vb->vdev);
 
--
2.24.1
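Editorial sketch, not part of the patch: the leak fixed above can be pictured with a small userspace model. The names `hint_state`, `hint_one_block`, `return_blocks` and `remove_sketch` are hypothetical stand-ins; only the idea of passing ULONG_MAX to release everything that is still queued is taken from the diff. It assumes blocks are kept on a simple singly linked list.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for one block of free pages handed out for hinting. */
struct block {
	struct block *next;
};

struct hint_state {
	struct block *free_page_list;       /* blocks not yet given back */
	unsigned long num_free_page_blocks; /* length of the list above */
};

/* Queue one block, as a hinting round would. Returns 0 on success. */
static int hint_one_block(struct hint_state *hs)
{
	struct block *b = malloc(sizeof(*b));

	if (!b)
		return -1;
	b->next = hs->free_page_list;
	hs->free_page_list = b;
	hs->num_free_page_blocks++;
	return 0;
}

/* Give back at most max_blocks blocks; returns how many were released. */
static unsigned long return_blocks(struct hint_state *hs,
				   unsigned long max_blocks)
{
	unsigned long freed = 0;

	while (hs->free_page_list && freed < max_blocks) {
		struct block *b = hs->free_page_list;

		hs->free_page_list = b->next;
		free(b);
		freed++;
	}
	assert(freed <= hs->num_free_page_blocks);
	hs->num_free_page_blocks -= freed;
	return freed;
}

/* Teardown path: without this call, queued blocks would simply be lost. */
static void remove_sketch(struct hint_state *hs)
{
	return_blocks(hs, ULONG_MAX);
}
```

Calling `remove_sketch()` after a few `hint_one_block()` calls should leave the list empty and the counter at zero, which is the property the one-line fix restores for the driver.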
David Hildenbrand
2020-Feb-05 16:34 UTC
[PATCH v1 2/3] virtio_balloon: Fix memory leaks on errors in virtballoon_probe()
We forget to put the inode and unmount the kernfs used for compaction.

Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
Cc: "Michael S. Tsirkin" <mst at redhat.com>
Cc: Jason Wang <jasowang at redhat.com>
Cc: Wei Wang <wei.w.wang at intel.com>
Cc: Liang Li <liang.z.li at intel.com>
Signed-off-by: David Hildenbrand <david at redhat.com>
---
 drivers/virtio/virtio_balloon.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index abef2306c899..7e5d84caeb94 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -901,8 +901,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
 	if (IS_ERR(vb->vb_dev_info.inode)) {
 		err = PTR_ERR(vb->vb_dev_info.inode);
-		kern_unmount(balloon_mnt);
-		goto out_del_vqs;
+		goto out_kern_unmount;
 	}
 	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
 #endif
@@ -913,13 +912,13 @@ static int virtballoon_probe(struct virtio_device *vdev)
 		 */
 		if (virtqueue_get_vring_size(vb->free_page_vq) < 2) {
 			err = -ENOSPC;
-			goto out_del_vqs;
+			goto out_iput;
 		}
 		vb->balloon_wq = alloc_workqueue("balloon-wq",
 					WQ_FREEZABLE | WQ_CPU_INTENSIVE, 0);
 		if (!vb->balloon_wq) {
 			err = -ENOMEM;
-			goto out_del_vqs;
+			goto out_iput;
 		}
 		INIT_WORK(&vb->report_free_page_work, report_free_page_func);
 		vb->cmd_id_received_cache = VIRTIO_BALLOON_CMD_ID_STOP;
@@ -953,6 +952,12 @@ static int virtballoon_probe(struct virtio_device *vdev)
 out_del_balloon_wq:
 	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
 		destroy_workqueue(vb->balloon_wq);
+out_iput:
+#ifdef CONFIG_BALLOON_COMPACTION
+	iput(vb->vb_dev_info.inode);
+out_kern_unmount:
+	kern_unmount(balloon_mnt);
+#endif
 out_del_vqs:
 	vdev->config->del_vqs(vdev);
 out_free_vb:
--
2.24.1
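Editorial sketch, not part of the patch: the fix relies on the usual kernel pattern of error labels that undo acquisitions in reverse order, so that a failure at any step jumps to the label releasing exactly what is already held. The userspace model below is a minimal illustration under that assumption. The resource bits, `acquire()`, `release()` and `probe_sketch()` are hypothetical; only the label names mirror the diff.

```c
#include <assert.h>

/* Hypothetical resources, tracked as bits so leaks are easy to detect. */
enum { RES_VQS = 1, RES_MOUNT = 2, RES_INODE = 4, RES_WQ = 8 };

static unsigned int held;

static int acquire(unsigned int res, int fail)
{
	if (fail)
		return -1;
	held |= res;
	return 0;
}

static void release(unsigned int res)
{
	assert(held & res); /* releasing something we never got is a bug */
	held &= ~res;
}

/* fail_at selects the failing step (1..4); 0 lets every step succeed. */
static int probe_sketch(int fail_at)
{
	int err;

	err = acquire(RES_VQS, fail_at == 1);
	if (err)
		goto out;
	err = acquire(RES_MOUNT, fail_at == 2);
	if (err)
		goto out_del_vqs;
	err = acquire(RES_INODE, fail_at == 3);
	if (err)
		goto out_kern_unmount;
	err = acquire(RES_WQ, fail_at == 4);
	if (err)
		goto out_iput;
	return 0;

	/* Labels appear in reverse order of acquisition and fall through. */
out_iput:
	release(RES_INODE);
out_kern_unmount:
	release(RES_MOUNT);
out_del_vqs:
	release(RES_VQS);
out:
	return err;
}
```

Jumping to `out_del_vqs` from the later steps, as the code did before this patch, corresponds to skipping the first two labels here: the inode and the mount would stay held.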
David Hildenbrand
2020-Feb-05 16:34 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
changed the behavior when deflation happens automatically. Instead of
deflating when called by the OOM handler, the shrinker is used.

However, the balloon is not simply some slab cache that should be
shrunk when under memory pressure. The shrinker does not have a concept of
priorities, so this behavior cannot be configured.

There was a report that this results in undesired side effects when
inflating the balloon to shrink the page cache. [1]
	"When inflating the balloon against page cache (i.e. no free memory
	 remains) vmscan.c will both shrink page cache, but also invoke the
	 shrinkers -- including the balloon's shrinker. So the balloon
	 driver allocates memory which requires reclaim, vmscan gets this
	 memory by shrinking the balloon, and then the driver adds the
	 memory back to the balloon. Basically a busy no-op."

The name "deflate on OOM" makes it pretty clear when deflation should
happen - after other approaches to reclaim memory failed, not while
reclaiming. This allows minimizing the footprint of a guest - memory
will only be taken out of the balloon when really needed.

In particular, a drop_slab() will result in the whole balloon getting
deflated - undesired. While handling it via the OOM handler might not be
perfect, it keeps existing behavior. If we want a different behavior, then
we need a new feature bit and document it properly (although, there should
be a clear use case and the intended effects should be well described).

Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
this has no such side effects. Always register the shrinker with
VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
pages that are still to be processed by the guest. The hypervisor takes
care of identifying and resolving possible races between processing a
hinting request and the guest reusing a page.

In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
notifier with shrinker"), don't add a module parameter to configure the
number of pages to deflate on OOM. Can be re-added if really needed.
Also, pay attention that leak_balloon() returns the number of 4k pages -
convert it properly in virtio_balloon_oom_notify().

Note1: using the OOM handler is frowned upon, but it really is what we
       need for this feature.

Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
       could actually skip sending deflation requests to our hypervisor,
       making the OOM path *very* simple: basically just freeing pages and
       updating the balloon. That is an option if the communication with
       the host ever becomes a problem on this call path.

[1] https://www.spinics.net/lists/linux-virtualization/msg40863.html

Reported-by: Tyler Sanderson <tysand at google.com>
Cc: Michael S. Tsirkin <mst at redhat.com>
Cc: Wei Wang <wei.w.wang at intel.com>
Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
Cc: David Rientjes <rientjes at google.com>
Cc: Nadav Amit <namit at vmware.com>
Cc: Michal Hocko <mhocko at kernel.org>
Signed-off-by: David Hildenbrand <david at redhat.com>
---
 drivers/virtio/virtio_balloon.c | 107 +++++++++++++-------------------
 1 file changed, 44 insertions(+), 63 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7e5d84caeb94..e7b18f556c5e 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -14,6 +14,7 @@
 #include <linux/slab.h>
 #include <linux/module.h>
 #include <linux/balloon_compaction.h>
+#include <linux/oom.h>
 #include <linux/wait.h>
 #include <linux/mm.h>
 #include <linux/mount.h>
@@ -27,7 +28,9 @@
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
-#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
+/* Maximum number of (4k) pages to deflate on OOM notifications. */
+#define VIRTIO_BALLOON_OOM_NR_PAGES 256
+#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
 
 #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
 					     __GFP_NOMEMALLOC)
@@ -112,8 +115,11 @@ struct virtio_balloon {
 	/* Memory statistics */
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
 
-	/* To register a shrinker to shrink memory upon memory pressure */
+	/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
 	struct shrinker shrinker;
+
+	/* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
+	struct notifier_block oom_nb;
 };
 
 static struct virtio_device_id id_table[] = {
@@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb,
 	return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
 }
 
-static unsigned long leak_balloon_pages(struct virtio_balloon *vb,
-					  unsigned long pages_to_free)
-{
-	return leak_balloon(vb, pages_to_free * VIRTIO_BALLOON_PAGES_PER_PAGE) /
-		VIRTIO_BALLOON_PAGES_PER_PAGE;
-}
-
-static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
-					  unsigned long pages_to_free)
-{
-	unsigned long pages_freed = 0;
-
-	/*
-	 * One invocation of leak_balloon can deflate at most
-	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
-	 * multiple times to deflate pages till reaching pages_to_free.
-	 */
-	while (vb->num_pages && pages_freed < pages_to_free)
-		pages_freed += leak_balloon_pages(vb,
-						  pages_to_free - pages_freed);
-
-	update_balloon_size(vb);
-
-	return pages_freed;
-}
-
 static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
 						  struct shrink_control *sc)
 {
-	unsigned long pages_to_free, pages_freed = 0;
 	struct virtio_balloon *vb = container_of(shrinker,
 					struct virtio_balloon, shrinker);
 
-	pages_to_free = sc->nr_to_scan;
-
-	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
-		pages_freed = shrink_free_pages(vb, pages_to_free);
-
-	if (pages_freed >= pages_to_free)
-		return pages_freed;
-
-	pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
-
-	return pages_freed;
+	return shrink_free_pages(vb, sc->nr_to_scan);
 }
 
 static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
@@ -837,26 +806,22 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
 {
 	struct virtio_balloon *vb = container_of(shrinker,
 					struct virtio_balloon, shrinker);
-	unsigned long count;
-
-	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
-	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
 
-	return count;
+	return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
 }
 
-static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
+static int virtio_balloon_oom_notify(struct notifier_block *nb,
+				     unsigned long dummy, void *parm)
 {
-	unregister_shrinker(&vb->shrinker);
-}
+	struct virtio_balloon *vb = container_of(nb,
+						 struct virtio_balloon, oom_nb);
+	unsigned long *freed = parm;
 
-static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
-{
-	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
-	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
-	vb->shrinker.seeks = DEFAULT_SEEKS;
+	*freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
+		  VIRTIO_BALLOON_PAGES_PER_PAGE;
+	update_balloon_size(vb);
 
-	return register_shrinker(&vb->shrinker);
+	return NOTIFY_OK;
 }
 
 static int virtballoon_probe(struct virtio_device *vdev)
@@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
 			virtio_cwrite(vb->vdev, struct virtio_balloon_config,
 				      poison_val, &poison_val);
 		}
-	}
-	/*
-	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
-	 * shrinker needs to be registered to relieve memory pressure.
-	 */
-	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
-		err = virtio_balloon_register_shrinker(vb);
+
+		/*
+		 * We're allowed to reuse any free pages, even if they are
+		 * still to be processed by the host.
+		 */
+		vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
+		vb->shrinker.count_objects = virtio_balloon_shrinker_count;
+		vb->shrinker.seeks = DEFAULT_SEEKS;
+		err = register_shrinker(&vb->shrinker);
 		if (err)
 			goto out_del_balloon_wq;
 	}
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
+		vb->oom_nb.notifier_call = virtio_balloon_oom_notify;
+		vb->oom_nb.priority = VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY;
+		err = register_oom_notifier(&vb->oom_nb);
+		if (err < 0)
+			goto out_unregister_shrinker;
+	}
+
 	virtio_device_ready(vdev);
 
 	if (towards_target(vb))
 		virtballoon_changed(vdev);
 	return 0;
 
+out_unregister_shrinker:
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+		unregister_shrinker(&vb->shrinker);
 out_del_balloon_wq:
 	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
 		destroy_workqueue(vb->balloon_wq);
@@ -987,8 +965,11 @@ static void virtballoon_remove(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb = vdev->priv;
 
-	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
-		virtio_balloon_unregister_shrinker(vb);
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
+		unregister_oom_notifier(&vb->oom_nb);
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
+		unregister_shrinker(&vb->shrinker);
+
 	spin_lock_irq(&vb->stop_update_lock);
 	vb->stop_update = true;
 	spin_unlock_irq(&vb->stop_update_lock);
--
2.24.1
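Editorial sketch, not part of the patch: the unit conversion in virtio_balloon_oom_notify() is easy to get wrong, so here is a small userspace model of it. The balloon counts in 4 KiB units regardless of the guest page size, while the OOM notifier reports progress in guest pages. With the limit of 256 units from the patch, one notification releases at most 256 * 4 KiB = 1 MiB. The struct and function names below are hypothetical, and `leak_sketch()` only models the accounting, not the real leak_balloon().

```c
#include <assert.h>

#define BALLOON_PFN_SHIFT 12   /* balloon units are always 4 KiB */
#define OOM_NR_PAGES      256UL /* 4 KiB units released per notification */

struct balloon_sketch {
	unsigned long num_pages; /* inflated size, in 4 KiB units */
	unsigned int page_shift; /* guest PAGE_SHIFT, e.g. 12 or 16 */
};

/* Number of 4 KiB balloon units that make up one guest page. */
static unsigned long pages_per_page(const struct balloon_sketch *b)
{
	assert(b->page_shift >= BALLOON_PFN_SHIFT);
	return 1UL << (b->page_shift - BALLOON_PFN_SHIFT);
}

/* Deflate up to num 4 KiB units; returns the amount in 4 KiB units. */
static unsigned long leak_sketch(struct balloon_sketch *b, unsigned long num)
{
	unsigned long ppp = pages_per_page(b);

	if (num > b->num_pages)
		num = b->num_pages;
	num -= num % ppp; /* only whole guest pages can be released */
	b->num_pages -= num;
	return num;
}

/* OOM callback model: progress is reported in guest pages. */
static void oom_notify_sketch(struct balloon_sketch *b, unsigned long *freed)
{
	*freed += leak_sketch(b, OOM_NR_PAGES) / pages_per_page(b);
}
```

On a guest with 4 KiB pages the division is a no-op. On a guest with 64 KiB pages, 256 balloon units correspond to 16 guest pages, which is the value the notifier should add to `*freed`.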
Tyler Sanderson
2020-Feb-05 22:37 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Wed, Feb 5, 2020 at 8:34 AM David Hildenbrand <david at redhat.com> wrote:

> [...]
> @@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
> [...]
> +
> +		/*
> +		 * We're allowed to reuse any free pages, even if they are
> +		 * still to be processed by the host.

It is important to clarify that pages that are on the inflate queue but not
ACKed by the host (the queue entry has not been returned) are _not_ okay to
reuse. If the host is going to do something destructive to the page (like
deback it) then that needs to happen before the entry is returned.

> +		 */
> [...]
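Editorial sketch, not part of the thread: the rule stated in the reply above can be written down as a tiny state model. A page that has been put on the inflate queue must not be reused until the host has returned the queue entry. The states and helper names are hypothetical, and the model deliberately ignores free page hinting, where the patch says reuse is always allowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lifecycle of a page on the inflate path. */
enum page_state {
	PAGE_IN_GUEST, /* owned by the guest, never queued */
	PAGE_QUEUED,   /* on the inflate queue, entry not yet returned */
	PAGE_ACKED,    /* host returned the entry; host-side work is done */
};

struct tracked_page {
	enum page_state state;
};

static void queue_for_inflate(struct tracked_page *p)
{
	assert(p->state == PAGE_IN_GUEST);
	p->state = PAGE_QUEUED;
}

static void host_ack(struct tracked_page *p)
{
	assert(p->state == PAGE_QUEUED);
	p->state = PAGE_ACKED;
}

/*
 * While the entry is outstanding the host may still do something
 * destructive to the page, so the guest must keep its hands off.
 */
static bool safe_to_reuse(const struct tracked_page *p)
{
	return p->state != PAGE_QUEUED;
}
```

The only state in which `safe_to_reuse()` returns false is the window between queueing and the host's acknowledgement, which is exactly the window the reply asks to be called out.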
Michael S. Tsirkin
2020-Feb-06 07:40 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
> 	"When inflating the balloon against page cache (i.e. no free memory
> 	 remains) vmscan.c will both shrink page cache, but also invoke the
> 	 shrinkers -- including the balloon's shrinker. So the balloon
> 	 driver allocates memory which requires reclaim, vmscan gets this
> 	 memory by shrinking the balloon, and then the driver adds the
> 	 memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
>
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
>
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.

I agree. And to make this case even stronger:

The oom_pages module parameter was known to be broken: whatever its value,
we return at most VIRTIO_BALLOON_ARRAY_PFNS_MAX. So module parameter values
> 256 never worked, and it seems highly unlikely that freeing 1Mbyte on OOM
is too aggressive.
There was a patch
	virtio-balloon: deflate up to oom_pages on OOM
by Wei Wang to try to fix it:
https://lore.kernel.org/r/1508500466-21165-3-git-send-email-wei.w.wang at intel.com
but this was dropped.

> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().

Oh. So it was returning a wrong value originally (before 71994620bb25).
However what really matters for notifiers is whether the value is 0 -
whether we made progress. So it's cosmetic.

> Note1: using the OOM handler is frowned upon, but it really is what we
>        need for this feature.

Quite. However, I went back researching why we dropped the OOM notifier,
and found this:

https://lore.kernel.org/r/1508500466-21165-2-git-send-email-wei.w.wang at intel.com

To quote from there:

	The balloon_lock was used to synchronize the access demand to elements
	of struct virtio_balloon and its queue operations (please see commit
	e22504296d). This prevents the concurrent run of the leak_balloon and
	fill_balloon functions, thereby resulting in a deadlock issue on OOM:

	fill_balloon: take balloon_lock and wait for OOM to get some memory;
	oom_notify: release some inflated memory via leak_balloon();
	leak_balloon: wait for balloon_lock to be released by fill_balloon.

> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>        could actually skip sending deflation requests to our hypervisor,
>        making the OOM path *very* simple. Basically freeing pages and
>        updating the balloon.

Well not exactly. !VIRTIO_BALLOON_F_MUST_TELL_HOST does not actually mean
"never tell host". It means "host will not discard pages in the balloon,
you can defer host notification until after use".

This was the original implementation:

+	if (vb->tell_host_first) {
+		tell_host(vb, vb->deflate_vq);
+		release_pages_by_pfn(vb->pfns, vb->num_pfns);
+	} else {
+		release_pages_by_pfn(vb->pfns, vb->num_pfns);
+		tell_host(vb, vb->deflate_vq);
+	}
+}

I don't know whether completely skipping host notifications
when !VIRTIO_BALLOON_F_MUST_TELL_HOST will break any hosts.

>        If the communication with the host ever
>        becomes a problem on this call path.
>
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
>
> Reported-by: Tyler Sanderson <tysand at google.com>
> Cc: Michael S. Tsirkin <mst at redhat.com>
> Cc: Wei Wang <wei.w.wang at intel.com>
> Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
> Cc: David Rientjes <rientjes at google.com>
> Cc: Nadav Amit <namit at vmware.com>
> Cc: Michal Hocko <mhocko at kernel.org>
> Signed-off-by: David Hildenbrand <david at redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 107 +++++++++++++-------------------
>  1 file changed, 44 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> +#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>  					     __GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>  	/* Memory statistics */
>  	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>
> -	/* To register a shrinker to shrink memory upon memory pressure */
> +	/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>  	struct shrinker shrinker;
> +
> +	/* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> +	struct notifier_block oom_nb;
>  };
>
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb,
>  	return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static unsigned long leak_balloon_pages(struct virtio_balloon *vb,
> -					  unsigned long pages_to_free)
> -{
> -	return leak_balloon(vb, pages_to_free * VIRTIO_BALLOON_PAGES_PER_PAGE) /
> -		VIRTIO_BALLOON_PAGES_PER_PAGE;
> -}
> -
> -static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> -					  unsigned long pages_to_free)
> -{
> -	unsigned long pages_freed = 0;
> -
> -	/*
> -	 * One invocation of leak_balloon can deflate at most
> -	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> -	 * multiple times to deflate pages till reaching pages_to_free.
> -	 */
> -	while (vb->num_pages && pages_freed < pages_to_free)
> -		pages_freed += leak_balloon_pages(vb,
> -						  pages_to_free - pages_freed);
> -
> -	update_balloon_size(vb);
> -
> -	return pages_freed;
> -}
> -
>  static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
>  						  struct shrink_control *sc)
>  {
> -	unsigned long pages_to_free, pages_freed = 0;
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
>
> -	pages_to_free = sc->nr_to_scan;
> -
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> -		pages_freed = shrink_free_pages(vb, pages_to_free);
> -
> -	if (pages_freed >= pages_to_free)
> -		return pages_freed;
> -
> -	pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
> -
> -	return pages_freed;
> +	return shrink_free_pages(vb, sc->nr_to_scan);
>  }
>
>  static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> @@ -837,26 +806,22 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  {
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
> -	unsigned long count;
> -
> -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> -	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> -	return count;
> +	return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +static int virtio_balloon_oom_notify(struct notifier_block *nb,
> +				     unsigned long dummy, void *parm)
>  {
> -	unregister_shrinker(&vb->shrinker);
> -}
> +	struct virtio_balloon *vb = container_of(nb,
> +						 struct virtio_balloon, oom_nb);
> +	unsigned long *freed = parm;
>
> -static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> -{
> -	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> -	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> -	vb->shrinker.seeks = DEFAULT_SEEKS;
> +	*freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
> +		  VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	update_balloon_size(vb);
>
> -	return register_shrinker(&vb->shrinker);
> +	return NOTIFY_OK;
>  }
>
>  static int virtballoon_probe(struct virtio_device *vdev)
> @@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  			virtio_cwrite(vb->vdev, struct virtio_balloon_config,
>  				      poison_val, &poison_val);
>  		}
> -	}
> -	/*
> -	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> -	 * shrinker needs to be registered to relieve memory pressure.
> -	 */
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> -		err = virtio_balloon_register_shrinker(vb);
> +
> +		/*
> +		 * We're allowed to reuse any free pages, even if they are
> +		 * still to be processed by the host.
> +		 */
> +		vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +		vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +		vb->shrinker.seeks = DEFAULT_SEEKS;
> +		err = register_shrinker(&vb->shrinker);
>  		if (err)
>  			goto out_del_balloon_wq;
>  	}
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +		vb->oom_nb.notifier_call = virtio_balloon_oom_notify;
> +		vb->oom_nb.priority = VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY;
> +		err = register_oom_notifier(&vb->oom_nb);
> +		if (err < 0)
> +			goto out_unregister_shrinker;
> +	}
> +
>  	virtio_device_ready(vdev);
>
>  	if (towards_target(vb))
>  		virtballoon_changed(vdev);
>  	return 0;
>
> +out_unregister_shrinker:
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		unregister_shrinker(&vb->shrinker);
>  out_del_balloon_wq:
>  	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>  		destroy_workqueue(vb->balloon_wq);
> @@ -987,8 +965,11 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -		virtio_balloon_unregister_shrinker(vb);
> +	if (virtio_has_feature(vdev,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) > + unregister_oom_notifier(&vb->oom_nb); > + if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) > + unregister_shrinker(&vb->shrinker); > + > spin_lock_irq(&vb->stop_update_lock); > vb->stop_update = true; > spin_unlock_irq(&vb->stop_update_lock); > -- > 2.24.1
Michael S. Tsirkin
2020-Feb-06 08:36 UTC
[PATCH v1 1/3] virtio-balloon: Fix memory leak when unloading while hinting is in progress
On Wed, Feb 05, 2020 at 05:34:00PM +0100, David Hildenbrand wrote:
> When unloading the driver while hinting is in progress, we will not
> release the free page blocks back to MM, resulting in a memory leak.
>
> Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> Cc: "Michael S. Tsirkin" <mst at redhat.com>
> Cc: Jason Wang <jasowang at redhat.com>
> Cc: Wei Wang <wei.w.wang at intel.com>
> Cc: Liang Li <liang.z.li at intel.com>
> Signed-off-by: David Hildenbrand <david at redhat.com>

Applied, thanks!

> ---
>  drivers/virtio/virtio_balloon.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8e400ece9273..abef2306c899 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -968,6 +968,10 @@ static void remove_common(struct virtio_balloon *vb)
>  		leak_balloon(vb, vb->num_pages);
>  	update_balloon_size(vb);
>
> +	/* There might be free pages that are being reported: release them. */
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		return_free_pages_to_mm(vb, ULONG_MAX);
> +
>  	/* Now we reset the device so we can clean up the queues. */
>  	vb->vdev->config->reset(vb->vdev);
>
> --
> 2.24.1
Michael S. Tsirkin
2020-Feb-06 08:36 UTC
[PATCH v1 2/3] virtio_balloon: Fix memory leaks on errors in virtballoon_probe()
On Wed, Feb 05, 2020 at 05:34:01PM +0100, David Hildenbrand wrote:
> We forget to put the inode and unmount the kernfs used for compaction.
>
> Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> Cc: "Michael S. Tsirkin" <mst at redhat.com>
> Cc: Jason Wang <jasowang at redhat.com>
> Cc: Wei Wang <wei.w.wang at intel.com>
> Cc: Liang Li <liang.z.li at intel.com>
> Signed-off-by: David Hildenbrand <david at redhat.com>

Applied, thanks!

> ---
>  drivers/virtio/virtio_balloon.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index abef2306c899..7e5d84caeb94 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -901,8 +901,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
>  	if (IS_ERR(vb->vb_dev_info.inode)) {
>  		err = PTR_ERR(vb->vb_dev_info.inode);
> -		kern_unmount(balloon_mnt);
> -		goto out_del_vqs;
> +		goto out_kern_unmount;
>  	}
>  	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
>  #endif
> @@ -913,13 +912,13 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  		 */
>  		if (virtqueue_get_vring_size(vb->free_page_vq) < 2) {
>  			err = -ENOSPC;
> -			goto out_del_vqs;
> +			goto out_iput;
>  		}
>  		vb->balloon_wq = alloc_workqueue("balloon-wq",
>  					WQ_FREEZABLE | WQ_CPU_INTENSIVE, 0);
>  		if (!vb->balloon_wq) {
>  			err = -ENOMEM;
> -			goto out_del_vqs;
> +			goto out_iput;
>  		}
>  		INIT_WORK(&vb->report_free_page_work, report_free_page_func);
>  		vb->cmd_id_received_cache = VIRTIO_BALLOON_CMD_ID_STOP;
> @@ -953,6 +952,12 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  out_del_balloon_wq:
>  	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>  		destroy_workqueue(vb->balloon_wq);
> +out_iput:
> +#ifdef CONFIG_BALLOON_COMPACTION
> +	iput(vb->vb_dev_info.inode);
> +out_kern_unmount:
> +	kern_unmount(balloon_mnt);
> +#endif
>  out_del_vqs:
>  	vdev->config->del_vqs(vdev);
>  out_free_vb:
> --
> 2.24.1
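The error handling in the patch above relies on cleanup labels that fall through in reverse order of acquisition. As a rough userspace sketch of that pattern (not the driver code; probe_sketch(), struct probe_state and the fail_at parameter are hypothetical names made up for illustration), each failure jumps to the label that undoes exactly what has been set up so far:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the kernfs mount, the inode and the workqueue. */
struct probe_state {
	bool mounted;
	bool inode;
	bool wq;
};

/*
 * fail_at: 0 = everything succeeds, 1 = inode allocation fails,
 *          2 = workqueue allocation fails.
 */
static int probe_sketch(struct probe_state *s, int fail_at)
{
	int err;

	assert(s);
	s->mounted = true;		/* stands in for a successful kern_mount() */

	if (fail_at == 1) {
		err = -ENOMEM;
		goto out_kern_unmount;	/* only the mount needs undoing */
	}
	s->inode = true;		/* stands in for alloc_anon_inode() */

	if (fail_at == 2) {
		err = -ENOMEM;
		goto out_iput;		/* undo the inode, then the mount */
	}
	s->wq = true;			/* stands in for alloc_workqueue() */
	return 0;

out_iput:
	s->inode = false;		/* stands in for iput() */
out_kern_unmount:
	s->mounted = false;		/* stands in for kern_unmount() */
	return err;
}
```

With this layout a later failure can never skip an earlier cleanup step, which is the leak the patch fixes for the inode and the mount.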
Wang, Wei W
2020-Feb-06 08:57 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Thursday, February 6, 2020 12:34 AM, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be shrunk
> when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when inflating
> the balloon to shrink the page cache. [1]
>     "When inflating the balloon against page cache (i.e. no free memory
>      remains) vmscan.c will both shrink page cache, but also invoke the
>      shrinkers -- including the balloon's shrinker. So the balloon
>      driver allocates memory which requires reclaim, vmscan gets this
>      memory by shrinking the balloon, and then the driver adds the
>      memory back to the balloon. Basically a busy no-op."

Not sure if we need to go back to OOM, which has many drawbacks as we discussed.
Just posted another approach, which is simple.

Best,
Wei
Michael S. Tsirkin
2020-Feb-06 09:11 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>     "When inflating the balloon against page cache (i.e. no free memory
>      remains) vmscan.c will both shrink page cache, but also invoke the
>      shrinkers -- including the balloon's shrinker. So the balloon
>      driver allocates memory which requires reclaim, vmscan gets this
>      memory by shrinking the balloon, and then the driver adds the
>      memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
>
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
>
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
>
> Note1: using the OOM handler is frowned upon, but it really is what we
>        need for this feature.
>
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>        could actually skip sending deflation requests to our hypervisor,
>        making the OOM path *very* simple. Basically freeing pages and
>        updating the balloon. If the communication with the host ever
>        becomes a problem on this call path.
>
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
>
> Reported-by: Tyler Sanderson <tysand at google.com>
> Cc: Michael S. Tsirkin <mst at redhat.com>
> Cc: Wei Wang <wei.w.wang at intel.com>
> Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
> Cc: David Rientjes <rientjes at google.com>
> Cc: Nadav Amit <namit at vmware.com>
> Cc: Michal Hocko <mhocko at kernel.org>
> Signed-off-by: David Hildenbrand <david at redhat.com>

So the revert looks ok, from that POV and with commit log changes

Acked-by: Michael S. Tsirkin <mst at redhat.com>

however, let's see what others think, and whether Wei can come up with
a fixup for the shrinker.

> ---
>  drivers/virtio/virtio_balloon.c | 107 +++++++++++++-------------------
>  1 file changed, 44 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> +#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>  					     __GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>  	/* Memory statistics */
>  	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>
> -	/* To register a shrinker to shrink memory upon memory pressure */
> +	/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>  	struct shrinker shrinker;
> +
> +	/* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> +	struct notifier_block oom_nb;
>  };
>
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb,
>  	return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static unsigned long leak_balloon_pages(struct virtio_balloon *vb,
> -					unsigned long pages_to_free)
> -{
> -	return leak_balloon(vb, pages_to_free * VIRTIO_BALLOON_PAGES_PER_PAGE) /
> -		VIRTIO_BALLOON_PAGES_PER_PAGE;
> -}
> -
> -static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> -					  unsigned long pages_to_free)
> -{
> -	unsigned long pages_freed = 0;
> -
> -	/*
> -	 * One invocation of leak_balloon can deflate at most
> -	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> -	 * multiple times to deflate pages till reaching pages_to_free.
> -	 */
> -	while (vb->num_pages && pages_freed < pages_to_free)
> -		pages_freed += leak_balloon_pages(vb,
> -						  pages_to_free - pages_freed);
> -
> -	update_balloon_size(vb);
> -
> -	return pages_freed;
> -}
> -
>  static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
>  						  struct shrink_control *sc)
>  {
> -	unsigned long pages_to_free, pages_freed = 0;
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
>
> -	pages_to_free = sc->nr_to_scan;
> -
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> -		pages_freed = shrink_free_pages(vb, pages_to_free);
> -
> -	if (pages_freed >= pages_to_free)
> -		return pages_freed;
> -
> -	pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
> -
> -	return pages_freed;
> +	return shrink_free_pages(vb, sc->nr_to_scan);
>  }
>
>  static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> @@ -837,26 +806,22 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  {
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
> -	unsigned long count;
> -
> -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> -	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> -	return count;
> +	return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +static int virtio_balloon_oom_notify(struct notifier_block *nb,
> +				     unsigned long dummy, void *parm)
>  {
> -	unregister_shrinker(&vb->shrinker);
> -}
> +	struct virtio_balloon *vb = container_of(nb,
> +						 struct virtio_balloon, oom_nb);
> +	unsigned long *freed = parm;
>
> -static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> -{
> -	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> -	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> -	vb->shrinker.seeks = DEFAULT_SEEKS;
> +	*freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
> +		  VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	update_balloon_size(vb);
>
> -	return register_shrinker(&vb->shrinker);
> +	return NOTIFY_OK;
>  }
>
>  static int virtballoon_probe(struct virtio_device *vdev)
> @@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  			virtio_cwrite(vb->vdev, struct virtio_balloon_config,
>  				      poison_val, &poison_val);
>  		}
> -	}
> -	/*
> -	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> -	 * shrinker needs to be registered to relieve memory pressure.
> -	 */
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> -		err = virtio_balloon_register_shrinker(vb);
> +
> +		/*
> +		 * We're allowed to reuse any free pages, even if they are
> +		 * still to be processed by the host.
> +		 */
> +		vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +		vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +		vb->shrinker.seeks = DEFAULT_SEEKS;
> +		err = register_shrinker(&vb->shrinker);
>  		if (err)
>  			goto out_del_balloon_wq;
>  	}
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +		vb->oom_nb.notifier_call = virtio_balloon_oom_notify;
> +		vb->oom_nb.priority = VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY;
> +		err = register_oom_notifier(&vb->oom_nb);
> +		if (err < 0)
> +			goto out_unregister_shrinker;
> +	}
> +
>  	virtio_device_ready(vdev);
>
>  	if (towards_target(vb))
>  		virtballoon_changed(vdev);
>  	return 0;
>
> +out_unregister_shrinker:
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		unregister_shrinker(&vb->shrinker);
>  out_del_balloon_wq:
>  	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>  		destroy_workqueue(vb->balloon_wq);
> @@ -987,8 +965,11 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -		virtio_balloon_unregister_shrinker(vb);
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> +		unregister_oom_notifier(&vb->oom_nb);
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		unregister_shrinker(&vb->shrinker);
> +
>  	spin_lock_irq(&vb->stop_update_lock);
>  	vb->stop_update = true;
>  	spin_unlock_irq(&vb->stop_update_lock);
> --
> 2.24.1
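The commit message notes that leak_balloon() counts 4k balloon pages while the OOM notifier has to report native pages. A small userspace sketch of that conversion follows. It assumes the balloon PFN shift is 12 (4 KiB balloon pages); the helper names and the page_size parameter are hypothetical, since in the driver the ratio is the compile-time constant VIRTIO_BALLOON_PAGES_PER_PAGE.

```c
#include <assert.h>

/* Assumption for this sketch: balloon pages are always 4 KiB (shift of 12). */
#define BALLOON_PFN_SHIFT_SKETCH 12

/* Number of 4 KiB balloon pages that make up one native page (hypothetical helper). */
static unsigned long balloon_pages_per_page(unsigned long page_size)
{
	assert(page_size >= (1UL << BALLOON_PFN_SHIFT_SKETCH));
	return page_size >> BALLOON_PFN_SHIFT_SKETCH;
}

/*
 * Convert a count of 4 KiB balloon pages, the unit leak_balloon() is said
 * to return, into native pages, the unit the OOM notifier adds to *freed.
 */
static unsigned long balloon_to_native_pages(unsigned long nr_4k_pages,
					     unsigned long page_size)
{
	return nr_4k_pages / balloon_pages_per_page(page_size);
}
```

On a 4 KiB-page guest the two units coincide, so deflating 256 balloon pages frees 256 native pages. On a 64 KiB-page guest the same 256 balloon pages amount to only 16 native pages, which is why the division in virtio_balloon_oom_notify() matters.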
Michael S. Tsirkin
2020-Feb-06 09:12 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>     "When inflating the balloon against page cache (i.e. no free memory
>      remains) vmscan.c will both shrink page cache, but also invoke the
>      shrinkers -- including the balloon's shrinker. So the balloon
>      driver allocates memory which requires reclaim, vmscan gets this
>      memory by shrinking the balloon, and then the driver adds the
>      memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
>
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
>
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
>
> Note1: using the OOM handler is frowned upon, but it really is what we
>        need for this feature.
>
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>        could actually skip sending deflation requests to our hypervisor,
>        making the OOM path *very* simple. Basically freeing pages and
>        updating the balloon. If the communication with the host ever
>        becomes a problem on this call path.
>
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
>
> Reported-by: Tyler Sanderson <tysand at google.com>
> Cc: Michael S. Tsirkin <mst at redhat.com>
> Cc: Wei Wang <wei.w.wang at intel.com>
> Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
> Cc: David Rientjes <rientjes at google.com>
> Cc: Nadav Amit <namit at vmware.com>
> Cc: Michal Hocko <mhocko at kernel.org>
> Signed-off-by: David Hildenbrand <david at redhat.com>

I guess we should add a Fixes tag to the patch it's reverting,
this way it's backported and hypervisors will be able to rely on OOM
behaviour.

> ---
>  drivers/virtio/virtio_balloon.c | 107 +++++++++++++-------------------
>  1 file changed, 44 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> +#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>  					     __GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>  	/* Memory statistics */
>  	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>
> -	/* To register a shrinker to shrink memory upon memory pressure */
> +	/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>  	struct shrinker shrinker;
> +
> +	/* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> +	struct notifier_block oom_nb;
>  };
>
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb,
>  	return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static unsigned long leak_balloon_pages(struct virtio_balloon *vb,
> -					unsigned long pages_to_free)
> -{
> -	return leak_balloon(vb, pages_to_free * VIRTIO_BALLOON_PAGES_PER_PAGE) /
> -		VIRTIO_BALLOON_PAGES_PER_PAGE;
> -}
> -
> -static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> -					  unsigned long pages_to_free)
> -{
> -	unsigned long pages_freed = 0;
> -
> -	/*
> -	 * One invocation of leak_balloon can deflate at most
> -	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> -	 * multiple times to deflate pages till reaching pages_to_free.
> -	 */
> -	while (vb->num_pages && pages_freed < pages_to_free)
> -		pages_freed += leak_balloon_pages(vb,
> -						  pages_to_free - pages_freed);
> -
> -	update_balloon_size(vb);
> -
> -	return pages_freed;
> -}
> -
>  static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
>  						  struct shrink_control *sc)
>  {
> -	unsigned long pages_to_free, pages_freed = 0;
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
>
> -	pages_to_free = sc->nr_to_scan;
> -
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> -		pages_freed = shrink_free_pages(vb, pages_to_free);
> -
> -	if (pages_freed >= pages_to_free)
> -		return pages_freed;
> -
> -	pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
> -
> -	return pages_freed;
> +	return shrink_free_pages(vb, sc->nr_to_scan);
>  }
>
>  static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> @@ -837,26 +806,22 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  {
>  	struct virtio_balloon *vb = container_of(shrinker,
>  					struct virtio_balloon, shrinker);
> -	unsigned long count;
> -
> -	count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> -	count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> -	return count;
> +	return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +static int virtio_balloon_oom_notify(struct notifier_block *nb,
> +				     unsigned long dummy, void *parm)
>  {
> -	unregister_shrinker(&vb->shrinker);
> -}
> +	struct virtio_balloon *vb = container_of(nb,
> +						 struct virtio_balloon, oom_nb);
> +	unsigned long *freed = parm;
>
> -static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> -{
> -	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> -	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> -	vb->shrinker.seeks = DEFAULT_SEEKS;
> +	*freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
> +		  VIRTIO_BALLOON_PAGES_PER_PAGE;
> +	update_balloon_size(vb);
>
> -	return register_shrinker(&vb->shrinker);
> +	return NOTIFY_OK;
>  }
>
>  static int virtballoon_probe(struct virtio_device *vdev)
> @@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  			virtio_cwrite(vb->vdev, struct virtio_balloon_config,
>  				      poison_val, &poison_val);
>  		}
> -	}
> -	/*
> -	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> -	 * shrinker needs to be registered to relieve memory pressure.
> -	 */
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> -		err = virtio_balloon_register_shrinker(vb);
> +
> +		/*
> +		 * We're allowed to reuse any free pages, even if they are
> +		 * still to be processed by the host.
> +		 */
> +		vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +		vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +		vb->shrinker.seeks = DEFAULT_SEEKS;
> +		err = register_shrinker(&vb->shrinker);
>  		if (err)
>  			goto out_del_balloon_wq;
>  	}
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +		vb->oom_nb.notifier_call = virtio_balloon_oom_notify;
> +		vb->oom_nb.priority = VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY;
> +		err = register_oom_notifier(&vb->oom_nb);
> +		if (err < 0)
> +			goto out_unregister_shrinker;
> +	}
> +
>  	virtio_device_ready(vdev);
>
>  	if (towards_target(vb))
>  		virtballoon_changed(vdev);
>  	return 0;
>
> +out_unregister_shrinker:
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		unregister_shrinker(&vb->shrinker);
>  out_del_balloon_wq:
>  	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>  		destroy_workqueue(vb->balloon_wq);
> @@ -987,8 +965,11 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>
> -	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -		virtio_balloon_unregister_shrinker(vb);
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> +		unregister_oom_notifier(&vb->oom_nb);
> +	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +		unregister_shrinker(&vb->shrinker);
> +
>  	spin_lock_irq(&vb->stop_update_lock);
>  	vb->stop_update = true;
>  	spin_unlock_irq(&vb->stop_update_lock);
> --
> 2.24.1
David Hildenbrand
2020-Feb-14 09:51 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On 05.02.20 17:34, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>     "When inflating the balloon against page cache (i.e. no free memory
>      remains) vmscan.c will both shrink page cache, but also invoke the
>      shrinkers -- including the balloon's shrinker. So the balloon
>      driver allocates memory which requires reclaim, vmscan gets this
>      memory by shrinking the balloon, and then the driver adds the
>      memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
>
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
>
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
>
> Note1: using the OOM handler is frowned upon, but it really is what we
>        need for this feature.
>
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>        could actually skip sending deflation requests to our hypervisor,
>        making the OOM path *very* simple. Basically freeing pages and
>        updating the balloon. If the communication with the host ever
>        becomes a problem on this call path.
>

@Michael, how to proceed with this?

--
Thanks,

David / dhildenb
Michal Hocko
2020-Feb-14 14:06 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Wed 05-02-20 17:34:02, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.

Adding a priority to the shrinker doesn't sound like a big problem to
me. Shrinkers already get the shrink_control data structure and
priority could be added there.

> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>     "When inflating the balloon against page cache (i.e. no free memory
>      remains) vmscan.c will both shrink page cache, but also invoke the
>      shrinkers -- including the balloon's shrinker. So the balloon
>      driver allocates memory which requires reclaim, vmscan gets this
>      memory by shrinking the balloon, and then the driver adds the
>      memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired.

Could you explain why some more? drop_caches shouldn't be really used in
any production workloads and if somebody really wants all the cache to
be dropped then why is balloon any different?

--
Michal Hocko
SUSE Labs
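To make the suggestion above concrete, here is a rough userspace sketch of what a priority-aware balloon shrinker could look like. Everything in it is hypothetical: at the time of this thread struct shrink_control carries no priority field, and the struct, field and function names below are invented for illustration. The sketch assumes a convention like vmscan's reclaim priority, where larger values mean light pressure and 0 means reclaim is about to give up.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the kernel's struct shrink_control. */
struct shrink_control_sketch {
	unsigned long nr_to_scan;
	int priority;	/* assumed new field: 0 = last resort before OOM */
};

/* Same idea as the kernel's SHRINK_STOP: tell the caller to stop scanning. */
#define SHRINK_STOP_SKETCH (~0UL)

/* Hypothetical stand-in for the balloon; only the inflated page count matters here. */
struct balloon_sketch {
	unsigned long num_pages;
};

/*
 * Deflate only when reclaim has reached its last-resort priority, which
 * would approximate "deflate on OOM" without an OOM notifier.
 */
static unsigned long balloon_scan_sketch(struct balloon_sketch *b,
					 const struct shrink_control_sketch *sc)
{
	unsigned long freed;

	assert(b && sc);
	if (sc->priority > 0)
		return SHRINK_STOP_SKETCH;	/* ordinary pressure: keep the balloon inflated */

	freed = sc->nr_to_scan < b->num_pages ? sc->nr_to_scan : b->num_pages;
	b->num_pages -= freed;
	return freed;
}
```

Under this assumption, reclaim triggered by inflating the balloon against page cache would run at a non-zero priority and leave the balloon alone, avoiding the busy no-op described in the report. Whether a forced drop of slab caches should count as last-resort pressure is exactly the open question raised in the reply.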