thr3ads.net - Linux Virtualization - [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization [Mar 2016]


Li, Liang Z

2016-Mar-04 15:13 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

> > Maybe I am not clear enough.
> >
> > I mean if we inflate the balloon before live migration, for an 8GB guest
> > it takes about 5 seconds for the inflating operation to finish.
> 
> And these 5 seconds are spent where?
> 
The time is spent on allocating the pages and sending the PFNs of the
allocated pages to QEMU through virtio.

> > For the PV solution, there is no need to inflate the balloon before live
> > migration. The only cost is traversing the free_list to construct
> > the free page bitmap, which takes about 20ms for an 8GB idle guest
> > (less if there are fewer free pages); passing the free page info to
> > the host takes about 3ms extra.
> >
> >
> > Liang
> 
> So now let's please stop talking about solutions at a high level and
> discuss the interface changes you make in detail.
> What makes it faster? Better host/guest interface? No need to go through
> buddy allocator within guest? Less interrupts? Something else?
> 
I assume you are familiar with the current virtio-balloon and how it works.
The new interface is very simple: a request is sent to the virtio-balloon
driver, the driver traverses '&zone->free_area[order].free_list[t]' to
construct a 'free_page_bitmap', and then the driver sends the content
of 'free_page_bitmap' back to QEMU. That is all the new interface does, and
no 'alloc_page' calls are involved, so it's faster.


Some code snippet:
----------------------------------------------
+static void mark_free_pages_bitmap(struct zone *zone,
+		 unsigned long *free_page_bitmap, unsigned long pfn_gap)
+{
+	unsigned long pfn, flags, i;
+	unsigned int order, t;
+	struct list_head *curr;
+
+	if (zone_is_empty(zone))
+		return;
+
+	spin_lock_irqsave(&zone->lock, flags);
+
+	for_each_migratetype_order(order, t) {
+		list_for_each(curr, &zone->free_area[order].free_list[t]) {
+
+			pfn = page_to_pfn(list_entry(curr, struct page, lru));
+			for (i = 0; i < (1UL << order); i++) {
+				if ((pfn + i) >= PFN_4G)
+					set_bit_le(pfn + i - pfn_gap,
+						   free_page_bitmap);
+				else
+					set_bit_le(pfn + i, free_page_bitmap);
+			}
+		}
+	}
+
+	spin_unlock_irqrestore(&zone->lock, flags);
+}
----------------------------------------------------
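
The bitmap construction can also be tried outside the kernel. What follows
is only a minimal user-space sketch under stated assumptions, not the code
from the patch: 'struct free_block', 'bitmap_set', 'bitmap_test' and
'mark_free_blocks' are hypothetical names, 'hole_end' plays the role of
PFN_4G in the snippet above, and the reading that 'pfn_gap' exists to skip
a hole in the guest physical address space is an interpretation of the
snippet. Bits are stored in native word order here, whereas the snippet
uses set_bit_le.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for one entry on a buddy free list. */
struct free_block {
	unsigned long pfn;   /* first page frame number of the block */
	unsigned int order;  /* the block spans 2^order contiguous pages */
};

/* Set bit 'nr' in a bitmap stored as an array of unsigned long. */
void bitmap_set(unsigned long *bitmap, unsigned long nr)
{
	bitmap[nr / BITS_PER_ULONG] |= 1UL << (nr % BITS_PER_ULONG);
}

/* Return 1 if bit 'nr' is set, 0 otherwise. */
int bitmap_test(const unsigned long *bitmap, unsigned long nr)
{
	return (int)((bitmap[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL);
}

/*
 * Mark every page of every free block in the bitmap and return the
 * number of pages marked. PFNs at or above 'hole_end' are shifted down
 * by 'pfn_gap' (assumption: this keeps the bitmap free of bits for a
 * hole in the physical address space).
 */
unsigned long mark_free_blocks(const struct free_block *blocks, size_t n,
			       unsigned long *bitmap, unsigned long nbits,
			       unsigned long hole_end, unsigned long pfn_gap)
{
	unsigned long marked = 0;

	for (size_t b = 0; b < n; b++) {
		for (unsigned long i = 0; i < (1UL << blocks[b].order); i++) {
			unsigned long bit = blocks[b].pfn + i;

			if (bit >= hole_end)
				bit -= pfn_gap;
			assert(bit < nbits);  /* caller sized the bitmap */
			bitmap_set(bitmap, bit);
			marked++;
		}
	}
	return marked;
}
```

No page is allocated or touched in this loop; the work is proportional to
the number of free pages, which matches the "less if there are fewer free
pages" remark earlier in the thread.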
Sorry for my poor English and expression. If this is still unclear,
you could glance at the patch, which is about 400 lines in total.
> 
> > > --
> > > MST

Michael S. Tsirkin

2016-Mar-08 14:03 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

On Fri, Mar 04, 2016 at 03:13:03PM +0000, Li, Liang Z wrote:
> > > Maybe I am not clear enough.
> > >
> > > I mean if we inflate the balloon before live migration, for an 8GB guest
> > > it takes about 5 seconds for the inflating operation to finish.
> > 
> > And these 5 seconds are spent where?
> > 
> 
> The time is spent on allocating the pages and sending the PFNs of the
> allocated pages to QEMU through virtio.

What if we skip allocating pages but use the existing interface to send PFNs
to QEMU?

> > > For the PV solution, there is no need to inflate the balloon before
> > > live migration. The only cost is traversing the free_list to construct
> > > the free page bitmap, which takes about 20ms for an 8GB idle guest
> > > (less if there are fewer free pages); passing the free page info to
> > > the host takes about 3ms extra.
> > >
> > >
> > > Liang
> > 
> > So now let's please stop talking about solutions at a high level and
> > discuss the interface changes you make in detail.
> > What makes it faster? Better host/guest interface? No need to go through
> > buddy allocator within guest? Less interrupts? Something else?
> > 
> 
> I assume you are familiar with the current virtio-balloon and how it works.
> The new interface is very simple: a request is sent to the virtio-balloon
> driver, the driver traverses '&zone->free_area[order].free_list[t]' to
> construct a 'free_page_bitmap', and then the driver sends the content
> of 'free_page_bitmap' back to QEMU. That is all the new interface does, and
> no 'alloc_page' calls are involved, so it's faster.
> 
> 
> Some code snippet:
> ----------------------------------------------
> +static void mark_free_pages_bitmap(struct zone *zone,
> +		 unsigned long *free_page_bitmap, unsigned long pfn_gap)
> +{
> +	unsigned long pfn, flags, i;
> +	unsigned int order, t;
> +	struct list_head *curr;
> +
> +	if (zone_is_empty(zone))
> +		return;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +
> +	for_each_migratetype_order(order, t) {
> +		list_for_each(curr, &zone->free_area[order].free_list[t]) {
> +
> +			pfn = page_to_pfn(list_entry(curr, struct page, lru));
> +			for (i = 0; i < (1UL << order); i++) {
> +				if ((pfn + i) >= PFN_4G)
> +					set_bit_le(pfn + i - pfn_gap,
> +						   free_page_bitmap);
> +				else
> +					set_bit_le(pfn + i, free_page_bitmap);
> +			}
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +}
> ----------------------------------------------------
> Sorry for my poor English and expression. If this is still unclear,
> you could glance at the patch, which is about 400 lines in total.
> > 
> > > > --
> > > > MST

Li, Liang Z

2016-Mar-08 14:17 UTC

[Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization

> On Fri, Mar 04, 2016 at 03:13:03PM +0000, Li, Liang Z wrote:
> > > > Maybe I am not clear enough.
> > > >
> > > > I mean if we inflate the balloon before live migration, for an 8GB
> > > > guest it takes about 5 seconds for the inflating operation to finish.
> > >
> > > And these 5 seconds are spent where?
> > >
> >
> > The time is spent on allocating the pages and sending the PFNs of the
> > allocated pages to QEMU through virtio.
> 
> What if we skip allocating pages but use the existing interface to send
> PFNs to QEMU?
>
I think it will be much faster; allocating pages is the main reason the
operation takes so long.
An experiment is needed to get the exact time spent on sending the PFNs.

Liang
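
Until such an experiment is done, the amount of data each approach has to
move can at least be estimated. The sketch below is plain arithmetic, not a
measurement, and rests on assumptions that do not come from this thread:
4 KiB guest pages, one 32-bit PFN per page on the existing interface, and
a caller-supplied batch size per virtqueue buffer (256 is the value the
test of this sketch uses). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GUEST_PAGE_SHIFT 12  /* assumption: 4 KiB guest pages */

/* Bytes needed to send one 32-bit PFN for each free page. */
uint64_t pfn_list_bytes(uint64_t free_pages)
{
	return free_pages * sizeof(uint32_t);
}

/* Number of buffers needed when PFNs are sent in fixed-size batches. */
uint64_t pfn_list_buffers(uint64_t free_pages, uint64_t pfns_per_buffer)
{
	assert(pfns_per_buffer > 0);
	return (free_pages + pfns_per_buffer - 1) / pfns_per_buffer;
}

/* Bytes of a one-bit-per-page bitmap covering all of guest RAM. */
uint64_t free_bitmap_bytes(uint64_t ram_bytes)
{
	uint64_t pages = ram_bytes >> GUEST_PAGE_SHIFT;

	return (pages + 7) / 8;
}
```

Under these assumptions an 8GB guest has 2097152 pages, so the bitmap is
256 KiB regardless of how many pages are free, while a PFN list for a
fully idle guest would be 8 MiB split over 8192 buffers of 256 PFNs. This
says nothing about elapsed time, which still has to be measured.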
