Li, Liang Z
2016-Mar-04 09:12 UTC
[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
> * Roman Kagan (rkagan at virtuozzo.com) wrote:
> > On Fri, Mar 04, 2016 at 08:23:09AM +0000, Li, Liang Z wrote:
> > > > On Thu, Mar 03, 2016 at 05:46:15PM +0000, Dr. David Alan Gilbert wrote:
> > > > > * Liang Li (liang.z.li at intel.com) wrote:
> > > > > > The current QEMU live migration implementation marks all of
> > > > > > the guest's RAM pages as dirtied in the ram bulk stage; all
> > > > > > these pages will be processed, and that takes quite a lot of
> > > > > > CPU cycles.
> > > > > >
> > > > > > From the guest's point of view, it doesn't care about the
> > > > > > content of free pages. We can make use of this fact and skip
> > > > > > processing the free pages in the ram bulk stage. This can save
> > > > > > a lot of CPU cycles and reduce the network traffic
> > > > > > significantly while speeding up the live migration process.
> > > > > >
> > > > > > This patch set is the QEMU side implementation.
> > > > > >
> > > > > > The virtio-balloon is extended so that QEMU can get the free
> > > > > > pages information from the guest through virtio.
> > > > > >
> > > > > > After getting the free pages information (a bitmap), QEMU can
> > > > > > use it to filter out the guest's free pages in the ram bulk
> > > > > > stage. This makes the live migration process much more
> > > > > > efficient.
> > > > >
> > > > > Hi,
> > > > >   An interesting solution; I know a few different people have
> > > > > been looking at how to speed up ballooned VM migration.
> > > > >
> > > > > I wonder if it would be possible to avoid the kernel changes
> > > > > by parsing /proc/self/pagemap - if that can be used to detect
> > > > > unmapped/zero mapped pages in the guest ram, would it achieve
> > > > > the same result?
> > > >
> > > > Yes I was about to suggest the same thing: it's simple and makes
> > > > use of the existing infrastructure. And you wouldn't need to care
> > > > if the pages were unmapped by ballooning or anything else
> > > > (alternative balloon implementations, not yet touched by the
> > > > guest, etc.). Besides, you wouldn't need to synchronize with the
> > > > guest.
> > > >
> > > > Roman.
> > >
> > > The unmapped/zero mapped pages can be detected by parsing
> > > /proc/self/pagemap, but the free pages can't be detected this way.
> > > Imagine an application allocates a large amount of memory and, after
> > > using it, frees the memory; then live migration happens. All these
> > > free pages will be processed and sent to the destination, which is
> > > not optimal.
> >
> > First, the likelihood of such a situation is marginal, there's no
> > point optimizing for it specifically.
> >
> > And second, even if that happens, you inflate the balloon right before
> > the migration and the free memory will get unmapped very quickly, so
> > this case is covered nicely by the same technique that works for more
> > realistic cases, too.
>
> Although I wonder which is cheaper; that would be fairly expensive for the
> guest wouldn't it? And you'd somehow have to kick the guest before
> migration to do the ballooning - and how long would you wait for it to
> finish?

About 5 seconds for an 8G guest, ballooned down to 1G. Getting the free
pages bitmap takes about 20ms for an 8G idle guest.

Liang

> Dave
>
> > Roman.
> --
> Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
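The disagreement in this message can be illustrated with a small sketch. It is not QEMU code: the function names and the integer-as-bitmap layout are assumptions made for the example. The only real interface it relies on is the documented /proc/<pid>/pagemap entry format, where bit 63 means "page present" and bit 62 means "page swapped". The sketch shows that a page the guest has freed, but which is still mapped on the host, looks identical to an in-use page when only pagemap is consulted, and is skipped only when a guest-provided free-page bitmap is also applied.

```python
# Illustrative sketch only; names and data layout are assumptions, not QEMU code.

PM_PRESENT = 1 << 63  # pagemap entry bit 63: page is present in RAM
PM_SWAPPED = 1 << 62  # pagemap entry bit 62: page is swapped out


def is_backed(pagemap_entry: int) -> bool:
    """Return True if the host has backing (RAM or swap) for this page."""
    return bool(pagemap_entry & (PM_PRESENT | PM_SWAPPED))


def pages_to_send(pagemap_entries, free_bitmap: int = 0):
    """Return the page indexes the bulk stage would still have to process.

    pagemap_entries: list of 64-bit pagemap entries, one per guest page.
    free_bitmap: int used as a bitmap; bit i set means the guest reports
                 page i as free (hypothetical layout for this example).
    """
    return [
        i
        for i, entry in enumerate(pagemap_entries)
        if is_backed(entry) and not (free_bitmap >> i) & 1
    ]


# Page 0: in use. Page 1: never touched. Page 2: freed by the guest but
# still mapped on the host. Page 3: unmapped (for example, ballooned out).
entries = [PM_PRESENT | 0x1234, 0, PM_PRESENT | 0x5678, 0]

pagemap_only = pages_to_send(entries)
with_free_bitmap = pages_to_send(entries, free_bitmap=0b0110)
```

With pagemap alone, pages 0 and 2 are both selected; with the free-page bitmap, only page 0 remains. Inflating the balloon first, as Roman suggests, would instead turn page 2 into an unmapped page so that pagemap alone could skip it.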
Michael S. Tsirkin
2016-Mar-04 09:47 UTC
[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
On Fri, Mar 04, 2016 at 09:12:12AM +0000, Li, Liang Z wrote:
> > Although I wonder which is cheaper; that would be fairly expensive for the
> > guest wouldn't it? And you'd somehow have to kick the guest before
> > migration to do the ballooning - and how long would you wait for it to
> > finish?
>
> About 5 seconds for an 8G guest, ballooned down to 1G. Getting the free
> pages bitmap takes about 20ms for an 8G idle guest.
>
> Liang

Where is the time spent though? Allocating within the guest?
Or passing the info to the host?
If the former, we can use the existing inflate/deflate vqs:
have the guest put each free page on the inflate vq, then on the deflate vq.

--
MST
Li, Liang Z
2016-Mar-04 10:11 UTC
[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
> On Fri, Mar 04, 2016 at 09:12:12AM +0000, Li, Liang Z wrote:
> > > Although I wonder which is cheaper; that would be fairly expensive
> > > for the guest wouldn't it? And you'd somehow have to kick the guest
> > > before migration to do the ballooning - and how long would you wait
> > > for it to finish?
> >
> > About 5 seconds for an 8G guest, ballooned down to 1G. Getting the free
> > pages bitmap takes about 20ms for an 8G idle guest.
> >
> > Liang
>
> Where is the time spent though? Allocating within the guest?
> Or passing the info to the host?
> If the former, we can use the existing inflate/deflate vqs:
> have the guest put each free page on the inflate vq, then on the deflate vq.
>

Maybe I was not clear enough.

I mean that if we inflate the balloon before live migration, for an 8GB
guest, it takes about 5 seconds for the inflating operation to finish.

For the PV solution, there is no need to inflate the balloon before live
migration. The only cost is traversing the free_list to construct the
free pages bitmap, which takes about 20ms for an 8GB idle guest (less if
there are fewer free pages). Passing the free pages info to the host
takes about an extra 3ms.

Liang

> --
> MST
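To make the cost comparison above concrete, here is a minimal sketch of turning a free list into a bitmap, together with the arithmetic behind the quoted timings. It is not the patch's implementation. The 4 KiB page size, the (start_pfn, order) block representation, and the function name are all assumptions made for illustration; only the 8GB guest size and the 5 s, 20 ms and 3 ms figures come from the message.

```python
# Illustrative sketch only; not the guest kernel or QEMU implementation.

PAGE_SIZE = 4096  # assumption: 4 KiB guest pages


def build_free_bitmap(free_blocks, total_pages):
    """Build a bitmap with one bit per guest page, set when the page is free.

    free_blocks: iterable of (start_pfn, order); each block is assumed to
                 cover 2**order contiguous pages (hypothetical representation).
    """
    bitmap = bytearray((total_pages + 7) // 8)
    for start_pfn, order in free_blocks:
        for pfn in range(start_pfn, start_pfn + (1 << order)):
            bitmap[pfn // 8] |= 1 << (pfn % 8)
    return bitmap


# An 8 GiB guest with 4 KiB pages has 2,097,152 pages, so the bitmap
# itself is 256 KiB.
total_pages = (8 * 1024**3) // PAGE_SIZE
bitmap = build_free_bitmap([(0, 3), (1024, 10)], total_pages)
free_page_count = sum(bin(byte).count("1") for byte in bitmap)

# Timings quoted in the thread, in milliseconds.
balloon_inflate_ms = 5000      # inflate the balloon on an 8GB guest
pv_total_ms = 20 + 3           # build the bitmap, then pass it to the host
speedup = balloon_inflate_ms / pv_total_ms
```

Under these assumptions the bitmap for the whole guest is 262,144 bytes, and the quoted numbers put the PV path at 23 ms against 5000 ms for inflating the balloon, roughly a 217-fold difference in preparation time.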