On Thu, Jan 30, 2020 at 11:59:46AM -0800, Tyler Sanderson wrote:
>
> On Thu, Jan 30, 2020 at 7:31 AM Wang, Wei W <wei.w.wang at intel.com> wrote:
>
> On Thursday, January 30, 2020 11:03 PM, David Hildenbrand wrote:
> > On 29.01.20 20:11, Tyler Sanderson wrote:
> > >
> > >
> > > On Wed, Jan 29, 2020 at 2:31 AM David Hildenbrand <david at redhat.com> wrote:
> > >
> > >     On 29.01.20 01:22, Tyler Sanderson via Virtualization wrote:
> > >     > A primary advantage of virtio balloon over other memory reclaim
> > >     > mechanisms is that it can pressure the guest's page cache into
> > >     > shrinking.
> > >     >
> > >     > However, since the balloon driver changed to using the shrinker API
> > >     > <https://github.com/torvalds/linux/commit/71994620bb25a8b109388fefa9e99a28e355255a#diff-fd202acf694d9eba19c8c64da3e480c9>
> > >     > this use case has become a bit more tricky. I'm wondering what the
> > >     > intended device implementation is.
> > >     >
> > >     > When inflating the balloon against page cache (i.e. no free memory
> > >     > remains) vmscan.c will both shrink page cache, but also invoke the
> > >     > shrinkers -- including the balloon's shrinker. So the balloon driver
> > >     > allocates memory which requires reclaim, vmscan gets this memory by
> > >     > shrinking the balloon, and then the driver adds the memory back to
> > >     > the balloon. Basically a busy no-op.
>
> Per my understanding, the balloon allocation won't invoke shrinker as
> __GFP_DIRECT_RECLAIM isn't set, no?
>
> I could be wrong about the mechanism, but the device sees lots of activity on
> the deflate queue. The balloon is being shrunk. And this only starts once all
> free memory is depleted and we're inflating into page cache.

So given this looks like a regression, maybe we should revert the
patch in question, 71994620bb25 ("virtio_balloon: replace oom notifier
with shrinker")?

Besides, with VIRTIO_BALLOON_F_FREE_PAGE_HINT the shrinker also ignores
VIRTIO_BALLOON_F_MUST_TELL_HOST, which isn't nice at all.

So it looks like all this rework introduced more issues than it
addressed ...

I also CC Alex Duyck for an opinion on this.
Alex, what do you use to put pressure on page cache?

>
>
> > >     >
> > >     > If file IO is ongoing during this balloon inflation then the page
> > >     > cache could be growing which further puts "back pressure" on the
> > >     > balloon trying to inflate. In testing I've seen periods of > 45
> > >     > seconds where balloon inflation makes no net forward progress.
>
> I think this is intentional (but could be improved). Inflation does not
> stop when the allocation fails (it simply sleeps for a while and resumes,
> repeating until there is memory to inflate).
> That's why you see no inflation progress for a long time under memory
> pressure.
>
> As noted above the deflate queue is active, so it's not just memory
> allocation failures.
>
>
>
>
> Best,
> Wei
>
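
To make the loop described in this thread concrete, here is a toy model of it. This is only a sketch and not kernel code: the function name, the one-page-per-step accounting, and the rule that file IO only takes pages that are already free are all assumptions made for illustration. It models reclaim as taking one page from the page cache and, when a balloon shrinker is registered, one page from the balloon as well.

```python
def simulate_inflation(target, free, cache, shrinker,
                       io_pages_per_step=0, max_steps=10_000):
    """Toy model of balloon inflation; all names and rules are hypothetical.

    target   -- balloon size (in pages) the host asks for
    free     -- free guest pages at the start
    cache    -- guest page cache pages at the start
    shrinker -- True if reclaim may also deflate the balloon
    """
    balloon = 0    # pages currently held by the balloon
    deflated = 0   # pages handed back via the (modelled) deflate queue
    steps = 0
    while balloon < target and steps < max_steps:
        steps += 1
        if free == 0:
            # Assumed reclaim pass: drop one page cache page ...
            if cache > 0:
                cache -= 1
                free += 1
            # ... and, if the shrinker is registered, shrink the balloon too.
            if shrinker and balloon > 0:
                balloon -= 1
                deflated += 1
                free += 1
        if free > 0:
            # The driver inflates by one page.
            free -= 1
            balloon += 1
        # Ongoing file IO turns leftover free pages into page cache.
        grab = min(io_pages_per_step, free)
        free -= grab
        cache += grab
    return {"balloon": balloon, "deflated": deflated, "steps": steps}


if __name__ == "__main__":
    for shrinker in (False, True):
        for io in (0, 1):
            result = simulate_inflation(8, 4, 16, shrinker,
                                        io_pages_per_step=io, max_steps=100)
            print(f"shrinker={shrinker} io={io}: {result}")
```

In this model the shrinker case needs extra steps and produces deflate traffic even without IO, and with one page of IO per step the balloon stops growing entirely while the deflate counter keeps climbing. That is consistent with the reported symptoms, but it says nothing about whether the real allocation path actually enters direct reclaim.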