thr3ads.net - Linux Virtualization - [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization [Mar 2016]

Li, Liang Z

2016-Mar-04 15:49 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

> > > > > > Only detect the unmapped/zero mapped pages is not enough. Consider
> > > > > > the situation like case 2, it can't achieve the same result.
> > > > >
> > > > > Your case 2 doesn't exist in the real world.  If people could
> > > > > stop their main memory consumer in the guest prior to migration
> > > > > they wouldn't need live migration at all.
> > > >
> > > > The case 2 is just a simplified scenario, not a real case.
> > > > As long as the guest's memory usage does not keep increasing, or
> > > > not always run out, it can be covered by the case 2.
> > >
> > > The memory usage will keep increasing due to ever growing caches,
> > > etc, so you'll be left with very little free memory fairly soon.
> > >
> >
> > I don't think so.
> 
> Here's my laptop:
> KiB Mem : 16048560 total,  8574956 free,  3360532 used,  4113072 buff/cache
> 
> But here's a server:
> KiB Mem:  32892768 total, 20092812 used, 12799956 free,   368704 buffers
> 
> What is the difference? A ton of tiny daemons not doing anything, staying
> resident in memory.
> 
> > > > > I tend to think you can safely assume there's no free memory in
> > > > > the guest, so there's little point optimizing for it.
> > > >
> > > > If this is true, we should not inflate the balloon either.
> > >
> > > We certainly should if there's "available" memory, i.e. not free but
> > > cheap to reclaim.
> > >
> >
> > What do you mean by "available" memory? If the pages are not free, I don't
> > think it's cheap.
> 
> clean pages are cheap to drop as they don't have to be written.
> whether they will be ever be used is another matter.
> 
> > > > > OTOH it makes perfect sense optimizing for the unmapped memory
> > > > > that's made up, in particular, by the balloon, and consider
> > > > > inflating the balloon right before migration unless you already
> > > > > maintain it at the optimal size for other reasons (like e.g. a
> > > > > global resource manager optimizing the VM density).
> > > > > >
> > > > >
> > > > Yes, I believe the current balloon works and it's simple. Do you
> > > > take the performance impact into consideration?
> > > > For an 8G guest, it takes about 5s to inflate the balloon. But
> > > > it only takes 20ms to traverse the free_list and construct the
> > > > free pages bitmap.
> > >
> > > I don't have any feeling of how important the difference is.  And if
> > > the limiting factor for balloon inflation speed is the granularity
> > > of communication it may be worth optimizing that, because quick
> > > balloon reaction may be important in certain resource management
> > > scenarios.
> > >
> > > > By inflating the balloon, all the guest's pages are still
> > > > processed (zero page checking).
> > >
> > > Not sure what you mean.  If you describe the current state of
> > > affairs that's exactly the suggested optimization point: skip unmapped
> > > pages.
> > >
> >
> > You'd better check the live migration code.
> 
> What's there to check in migration code?
> Here's the extent of what balloon does on output:
> 
> 
>         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
>             ram_addr_t pa;
>             ram_addr_t addr;
>             int p = virtio_ldl_p(vdev, &pfn);
> 
>             pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
>             offset += 4;
> 
>             /* FIXME: remove get_system_memory(), but how? */
>             section = memory_region_find(get_system_memory(), pa, 1);
>             if (!int128_nz(section.size) || !memory_region_is_ram(section.mr))
>                 continue;
> 
>             trace_virtio_balloon_handle_output(memory_region_name(section.mr),
>                                                pa);
>             /* Using memory_region_get_ram_ptr is bending the rules a bit, but
>                should be OK because we only want a single page.  */
>             addr = section.offset_within_region;
>             balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
>                          !!(vq == s->dvq));
>             memory_region_unref(section.mr);
>         }
> 
> so all that happens when we get a page is balloon_page.
> and
> 
> static void balloon_page(void *addr, int deflate)
> {
> #if defined(__linux__)
>     if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
>                                          kvm_has_sync_mmu())) {
>         qemu_madvise(addr, TARGET_PAGE_SIZE,
>                 deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
>     }
> #endif
> }
> 
> 
> Do you see anything that tracks pages to help migration skip the ballooned
> memory? I don't.
> 
No, and that is exactly what I mean. The ballooned memory is still processed
during live migration; it is not skipped. The live migration code is in
migration/ram.c.
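
To illustrate what "still processed" costs, here is a minimal sketch of a per-page zero check. It is not the code from migration/ram.c; the names `page_is_zero`, `scan_ram`, `scan_stats` and the 4 KiB page size are assumptions made for the example. The point it shows is that a page must be read in full before it can be classified as zero, so a ballooned page that reads back as zeros (as anonymous private memory typically does after `MADV_DONTNEED`) still costs a full scan unless something tells the migration code to skip it.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096  /* assumed page size, not taken from the thread */

/* Return 1 if every byte of the page is zero. The whole page is read
 * unless a nonzero byte is found early. */
static int page_is_zero(const unsigned char *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical result of one pass over guest RAM. */
struct scan_stats {
    size_t scanned;  /* pages that had to be read */
    size_t zero;     /* pages found to be all zeros */
};

/* Walk every page of a RAM area, as a pass without any skip information
 * would, and count how many pages were read and how many were zero. */
static struct scan_stats scan_ram(const unsigned char *ram, size_t npages)
{
    struct scan_stats st = { 0, 0 };

    assert(ram != NULL);
    for (size_t i = 0; i < npages; i++) {
        st.scanned++;
        st.zero += (size_t)page_is_zero(ram + i * PAGE_SIZE_SKETCH,
                                        PAGE_SIZE_SKETCH);
    }
    return st;
}
```

In this sketch `scanned` always equals the total page count, which is the cost being pointed out: zero pages are cheap to send but not cheap to find.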
> 
> > > > The only advantage of 'inflating the balloon before live
> > > > migration' is simple, nothing more.
> > >
> > > That's a big advantage.  Another one is that it does something
> > > useful in real-world scenarios.
> > >
> >
> > I don't think the heavy performance impact is something useful in real
> > world scenarios.
> >
> > Liang
> > > Roman.
> 
> So fix the performance then. You will have to try harder if you want to
> convince people that the performance is due to bad host/guest interface,
> and so we have to change *that*.
> 
Actually, the PV solution is independent of the balloon mechanism; I just use
the balloon to transfer information between host and guest.
I am not sure whether I should implement a new virtio device, and I would like
to get the answer from the community.
In this RFC patch, to keep things simple, I chose to extend virtio-balloon and
use the extended interface to transfer the request and the free_page_bitmap
content.

I do not intend to change the current virtio-balloon implementation.
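
As a rough illustration of the guest-side step mentioned in this thread (traverse the free list and construct a free page bitmap), here is a minimal sketch. It is not the RFC patch code: `struct free_block` and `build_free_page_bitmap` are hypothetical names, and the free list is modelled as a plain array of buddy-style blocks, each covering 2^order contiguous pages.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical description of one free block: 2^order contiguous pages
 * starting at page frame number pfn. */
struct free_block {
    unsigned long pfn;
    unsigned int order;
};

/* Clear the bitmap, then set one bit per free page. Pages at or beyond
 * npages are ignored so the bitmap is never written out of bounds. */
static void build_free_page_bitmap(const struct free_block *blocks,
                                   size_t nblocks,
                                   unsigned long *bitmap, size_t npages)
{
    size_t nwords = (npages + BITS_PER_LONG - 1) / BITS_PER_LONG;

    assert(bitmap != NULL);
    memset(bitmap, 0, nwords * sizeof(unsigned long));

    for (size_t i = 0; i < nblocks; i++) {
        unsigned long count = 1UL << blocks[i].order;

        for (unsigned long j = 0; j < count; j++) {
            unsigned long pfn = blocks[i].pfn + j;

            if (pfn >= npages) {
                break;
            }
            bitmap[pfn / BITS_PER_LONG] |= 1UL << (pfn % BITS_PER_LONG);
        }
    }
}
```

The work is proportional to the number of free pages and touches only the bitmap, not the pages themselves, which is consistent with the claim that this is much cheaper than inflating the balloon.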

Liang
> --
> MST

Michael S. Tsirkin

2016-Mar-05 19:55 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

On Fri, Mar 04, 2016 at 03:49:37PM +0000, Li, Liang Z wrote:
> > > > > > > Only detect the unmapped/zero mapped pages is not enough. Consider
> > > > > > > the situation like case 2, it can't achieve the same result.
> > > > > >
> > > > > > Your case 2 doesn't exist in the real world.  If people could
> > > > > > stop their main memory consumer in the guest prior to migration
> > > > > > they wouldn't need live migration at all.
> > > > >
> > > > > The case 2 is just a simplified scenario, not a real case.
> > > > > As long as the guest's memory usage does not keep increasing, or
> > > > > not always run out, it can be covered by the case 2.
> > > >
> > > > The memory usage will keep increasing due to ever growing caches,
> > > > etc, so you'll be left with very little free memory fairly soon.
> > > >
> > >
> > > I don't think so.
> > 
> > Here's my laptop:
> > KiB Mem : 16048560 total,  8574956 free,  3360532 used,  4113072 buff/cache
> > 
> > But here's a server:
> > KiB Mem:  32892768 total, 20092812 used, 12799956 free,   368704 buffers
> > 
> > What is the difference? A ton of tiny daemons not doing anything, staying
> > resident in memory.
> > 
> > > > > > I tend to think you can safely assume there's no free memory in
> > > > > > the guest, so there's little point optimizing for it.
> > > > >
> > > > > If this is true, we should not inflate the balloon either.
> > > >
> > > > We certainly should if there's "available" memory, i.e. not free but
> > > > cheap to reclaim.
> > > >
> > >
> > > What do you mean by "available" memory? If the pages are not free, I don't
> > > think it's cheap.
> > 
> > clean pages are cheap to drop as they don't have to be written.
> > whether they will be ever be used is another matter.
> > 
> > > > > > OTOH it makes perfect sense optimizing for the unmapped memory
> > > > > > that's made up, in particular, by the balloon, and consider
> > > > > > inflating the balloon right before migration unless you already
> > > > > > maintain it at the optimal size for other reasons (like e.g. a
> > > > > > global resource manager optimizing the VM density).
> > > > > > >
> > > > > >
> > > > > Yes, I believe the current balloon works and it's simple. Do you
> > > > > take the performance impact into consideration?
> > > > > For an 8G guest, it takes about 5s to inflate the balloon. But
> > > > > it only takes 20ms to traverse the free_list and construct the
> > > > > free pages bitmap.
> > > >
> > > > I don't have any feeling of how important the difference is.  And if
> > > > the limiting factor for balloon inflation speed is the granularity
> > > > of communication it may be worth optimizing that, because quick
> > > > balloon reaction may be important in certain resource management
> > > > scenarios.
> > > >
> > > > > By inflating the balloon, all the guest's pages are still
> > > > > processed (zero page checking).
> > > >
> > > > Not sure what you mean.  If you describe the current state of
> > > > affairs that's exactly the suggested optimization point: skip unmapped
> > > > pages.
> > > >
> > >
> > > You'd better check the live migration code.
> > 
> > What's there to check in migration code?
> > Here's the extent of what balloon does on output:
> > 
> > 
> >         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> >             ram_addr_t pa;
> >             ram_addr_t addr;
> >             int p = virtio_ldl_p(vdev, &pfn);
> > 
> >             pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
> >             offset += 4;
> > 
> >             /* FIXME: remove get_system_memory(), but how? */
> >             section = memory_region_find(get_system_memory(), pa, 1);
> >             if (!int128_nz(section.size) || !memory_region_is_ram(section.mr))
> >                 continue;
> > 
> >             trace_virtio_balloon_handle_output(memory_region_name(section.mr),
> >                                                pa);
> >             /* Using memory_region_get_ram_ptr is bending the rules a bit, but
> >                should be OK because we only want a single page.  */
> >             addr = section.offset_within_region;
> >             balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
> >                          !!(vq == s->dvq));
> >             memory_region_unref(section.mr);
> >         }
> > 
> > so all that happens when we get a page is balloon_page.
> > and
> > 
> > static void balloon_page(void *addr, int deflate)
> > {
> > #if defined(__linux__)
> >     if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
> >                                          kvm_has_sync_mmu())) {
> >         qemu_madvise(addr, TARGET_PAGE_SIZE,
> >                 deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> >     }
> > #endif
> > }
> > 
> > 
> > Do you see anything that tracks pages to help migration skip the ballooned
> > memory? I don't.
> > 
> 
> No. And it's exactly what I mean. The ballooned memory is still processed
> during live migration without skipping. The live migration code is in
> migration/ram.c.

So if the guest acknowledged VIRTIO_BALLOON_F_MUST_TELL_HOST,
we can teach QEMU to skip these pages.
Want to write a patch to do this?
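
A minimal sketch of what such tracking could look like follows. It is not QEMU code: `struct balloon_tracker`, `tracker_update` and `migration_may_skip` are hypothetical names. The only value taken from the virtio balloon specification is the feature bit number of VIRTIO_BALLOON_F_MUST_TELL_HOST (bit 0). The feature check matters because only when the guest has acknowledged that feature is it required to tell the host before reusing ballooned pages, so only then can the host-side bitmap be trusted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0  /* feature bit number per the virtio spec */

/* Hypothetical host-side record of which guest pages are in the balloon. */
struct balloon_tracker {
    uint64_t guest_features;   /* feature bits acknowledged by the guest */
    unsigned long *ballooned;  /* one bit per guest page */
    size_t npages;
};

/* The bitmap is only trustworthy if the guest promised to tell the host
 * before taking pages back out of the balloon. */
static int tracker_usable(const struct balloon_tracker *t)
{
    return (int)((t->guest_features >> VIRTIO_BALLOON_F_MUST_TELL_HOST) & 1);
}

/* Would be called next to balloon_page(): set the bit on inflate,
 * clear it on deflate. */
static void tracker_update(struct balloon_tracker *t, size_t pfn, int deflate)
{
    assert(t != NULL && t->ballooned != NULL);
    if (pfn >= t->npages) {
        return;
    }
    unsigned long mask = 1UL << (pfn % BITS_PER_LONG);
    if (deflate) {
        t->ballooned[pfn / BITS_PER_LONG] &= ~mask;
    } else {
        t->ballooned[pfn / BITS_PER_LONG] |= mask;
    }
}

/* Migration would consult this before reading or sending a page. */
static int migration_may_skip(const struct balloon_tracker *t, size_t pfn)
{
    if (!tracker_usable(t) || pfn >= t->npages) {
        return 0;
    }
    return (int)((t->ballooned[pfn / BITS_PER_LONG] >> (pfn % BITS_PER_LONG)) & 1UL);
}
```

Falling back to "never skip" when the feature was not acknowledged keeps the sketch safe: the worst case is the current behaviour, where every page is processed.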
> > 
> > > > > The only advantage of 'inflating the balloon before live
> > > > > migration' is simple, nothing more.
> > > >
> > > > That's a big advantage.  Another one is that it does something
> > > > useful in real-world scenarios.
> > > >
> > >
> > > I don't think the heavy performance impact is something useful in real
> > > world scenarios.
> > >
> > > Liang
> > > > Roman.
> > 
> > So fix the performance then. You will have to try harder if you want to
> > convince people that the performance is due to bad host/guest interface,
> > and so we have to change *that*.
> > 
> 
> Actually, the PV solution is independent of the balloon mechanism; I just use
> the balloon to transfer information between host and guest.
> I am not sure whether I should implement a new virtio device, and I would like
> to get the answer from the community.
> In this RFC patch, to keep things simple, I chose to extend virtio-balloon and
> use the extended interface to transfer the request and the free_page_bitmap
> content.
> 
> I do not intend to change the current virtio-balloon implementation.
> 
> Liang

And the answer would depend on the answer to my question above.
Does the balloon need an interface for passing page bitmaps around?
Does this speed up any operations?
OTOH what if you use the regular balloon interface with your patches?

> > --
> > MST

Li, Liang Z

2016-Mar-07 06:49 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

> > No. And it's exactly what I mean. The ballooned memory is still
> > processed during live migration without skipping. The live migration
> > code is in migration/ram.c.
> 
> So if guest acknowledged VIRTIO_BALLOON_F_MUST_TELL_HOST, we can
> teach qemu to skip these pages.
> Want to write a patch to do this?
> 
Yes, we can certainly teach QEMU to skip these pages, and it's not hard.
The problem is the poor performance: this PV solution aims to make the process
more efficient and to reduce the performance impact on the guest.
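
For a sense of scale, the figures quoted earlier in the thread (about 5 s to inflate the balloon versus about 20 ms to build the bitmap for an 8G guest) can be set next to the amount of data each interface has to carry. The sketch below is back-of-the-envelope arithmetic only. It assumes 4 KiB pages and assumes every guest page is reported, neither of which is stated in the thread; the 4-byte PFN entry size comes from the quoted balloon loop, which reads 4 bytes per page.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096ULL  /* assumption: 4 KiB guest pages */
#define PFN_ENTRY_BYTES   4ULL     /* the quoted balloon loop reads 4 bytes per PFN */

/* Number of pages in a guest with ram_bytes of memory. */
static uint64_t guest_pages(uint64_t ram_bytes)
{
    assert(ram_bytes % ASSUMED_PAGE_SIZE == 0);
    return ram_bytes / ASSUMED_PAGE_SIZE;
}

/* Bytes needed to report npages as a list of 4-byte PFNs. */
static uint64_t pfn_list_bytes(uint64_t npages)
{
    return npages * PFN_ENTRY_BYTES;
}

/* Bytes needed to describe npages with one bit per page. */
static uint64_t bitmap_bytes(uint64_t npages)
{
    return (npages + 7) / 8;
}
```

For an 8 GiB guest this gives 2,097,152 pages, 8 MiB of PFN entries if every page were reported one by one, and 256 KiB for a bitmap covering the same range, a factor of 32. The quoted timings (5 s versus 20 ms) differ by a factor of 250, so data volume alone would not account for the whole gap under these assumptions.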
> > >
> > > > > > The only advantage of 'inflating the balloon before live
> > > > > > migration' is simple, nothing more.
> > > > >
> > > > > That's a big advantage.  Another one is that it does something
> > > > > useful in real-world scenarios.
> > > > >
> > > >
> > > > I don't think the heavy performance impact is something useful
> > > > in real world scenarios.
> > > >
> > > > Liang
> > > > > Roman.
> > >
> > > So fix the performance then. You will have to try harder if you want
> > > to convince people that the performance is due to bad host/guest
> > > interface, and so we have to change *that*.
> > >
> >
> > Actually, the PV solution is independent of the balloon mechanism; I
> > just use the balloon to transfer information between host and guest.
> > I am not sure whether I should implement a new virtio device, and I would
> > like to get the answer from the community.
> > In this RFC patch, to keep things simple, I chose to extend
> > virtio-balloon and use the extended interface to transfer the request
> > and the free_page_bitmap content.
> >
> > I do not intend to change the current virtio-balloon implementation.
> >
> > Liang
> 
> And the answer would depend on the answer to my question above.
> Does balloon need an interface passing page bitmaps around?

Yes, I need a new interface.

> Does this speed up any operations?

No, a new interface will not speed anything up, but it is the easiest way to
solve the compatibility issue.

> OTOH what if you use the regular balloon interface with your patches?
>

The regular balloon interfaces each have a specific function, and I can't use
them in my patches.
If I used the regular interfaces, I would have to make a lot of changes to keep
compatibility.
> 
> > > --
> > > MST
