Zhai, Edwin
2009-Aug-19 07:08 UTC
[Xen-devel] revisit the super page support in HVM restore
Keir,

We previously discussed superpage support in HVM restore, and the conclusion was: "If the pseudo-phys page is not yet populated in the target domain, AND it is the first page of a 2MB extent, AND no other pages in that extent are yet populated, AND the next pages in the save-image stream populate that extent in order, THEN allocate a superpage. If the next 511 pages (to complete the 2MB extent) are split across a batch boundary, we have to optimistically allocate a superpage in the first batch, and then break it into 4K pages in the second batch."

I once had a patch for this (it has been sleeping on my machine for a long time), but the logic is somewhat complicated. We need a "required_pfn" to hold the next pfn expected in the pfn list transferred from the source machine (set to invalid when not tracking a 2M page). The pseudo-code is as follows:

    for ( i = 0; i < nr_mfns; i++ )
    {
        if ( pfn_list[i] is START of a 2M page )
        {
            /* case 1: this pfn starts a new 2M page */
            populate the previously collected pfn buffer;
            /* start tracking this 2M page */
            required_pfn = pfn_list[i] + 1;
            start collecting a new pfn buffer;
        }
        else if ( pfn_list[i] == required_pfn )
        {
            /* case 2: this pfn comes in order inside the 2M page */
            required_pfn++;
            add this pfn to the collected pfn buffer;
        }
        else if ( required_pfn is VALID )
        {
            /*
             * case 3: this pfn comes out of order inside the 2M page
             * (not start && not required && in tracking)
             */
            populate the previously collected pfn buffer;
            /* start a new run of 4K pages */
            required_pfn = INVALID;
            start collecting a new pfn buffer;
        }
        else
        {
            /*
             * case 4: a series of 4K pages
             * (not start && not required && not in tracking)
             */
            add this pfn to the collected pfn buffer;
        }
    }

This is not the end of the story. For the populate action in cases 1 and 3, we need to tell whether the collected buffer forms a superpage or not. We also need to know whether a page has already been populated, and if so, whether as a normal page or as part of a superpage. Furthermore, in cases 1 and 3 we need to decide whether to break a previously allocated 2M page, so we need to set some flags and keep some bookkeeping when allocating a 2M page. There are other actions and considerations as well.

I have spent some time on this patch, but it still has some minor bugs :( Do you have any idea for optimizing this logic? We have two concerns with this method:

1. The code is complicated and bug-prone.
2. The target machine ends up with at most as many 2M pages as the source machine, even if it has more large contiguous memory available.

So how about this new method:

* Do not track each pfn inside a 2M page; instead, try our best to allocate a 2M page whenever the 2M page covering a pfn is not yet allocated.
* There may be holes inside newly allocated 2M pages that are not synced in this batch, but we don't care and assume these missing pfns will come later.

This new method is simple, as the superpage support for PV guests is already there.

Thanks for any comments,
Edwin
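For concreteness, below is a minimal standalone C sketch of the state machine above, assuming a batch is just an array of pfns. populate_buffer() and the sample batch in main() are hypothetical stand-ins, not the real xc_domain_restore.c helpers, and the batch-boundary optimism (allocate a superpage in the first batch, break it up in the second) is deliberately omitted to keep the sketch short:

    /*
     * Minimal standalone sketch of the 2M-tracking state machine above.
     * populate_buffer() is a hypothetical stand-in for the real
     * populate-physmap call; here it only prints what would happen.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define SUPERPAGE_PFN_SHIFT  9                        /* 512 x 4K = 2M */
    #define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)
    #define INVALID_PFN          (~0UL)

    /* Populate 'count' pfns starting at buf[0], either as one 2M superpage
     * or as individual 4K pages. */
    static void populate_buffer(const unsigned long *buf, int count, int superpage)
    {
        if ( count == 0 )
            return;
        printf("populate %d pfn(s) from 0x%lx as %s\n",
               count, buf[0], superpage ? "one 2M superpage" : "4K pages");
    }

    /* Flush the collected run; only a full in-order run of 512 pfns
     * becomes a superpage. */
    static void flush(const unsigned long *buf, int *count, unsigned long required_pfn)
    {
        populate_buffer(buf, *count,
                        required_pfn != INVALID_PFN &&
                        *count == (int)SUPERPAGE_NR_PFNS);
        *count = 0;
    }

    static void process_batch(const unsigned long *pfn_list, int nr_mfns)
    {
        unsigned long required_pfn = INVALID_PFN;  /* next pfn while tracking a 2M page */
        unsigned long *buf = malloc(nr_mfns * sizeof(*buf));
        int count = 0, i;

        if ( buf == NULL )
            return;

        for ( i = 0; i < nr_mfns; i++ )
        {
            unsigned long pfn = pfn_list[i];

            if ( (pfn & (SUPERPAGE_NR_PFNS - 1)) == 0 )
            {
                /* case 1: start of a 2M extent -- flush, then track it */
                flush(buf, &count, required_pfn);
                required_pfn = pfn + 1;
            }
            else if ( pfn == required_pfn )
            {
                /* case 2: in-order pfn inside the tracked 2M extent */
                required_pfn++;
            }
            else if ( required_pfn != INVALID_PFN )
            {
                /* case 3: out of order while tracking -- give up on this extent */
                flush(buf, &count, required_pfn);
                required_pfn = INVALID_PFN;
            }
            /* case 4: plain run of 4K pages -- just keep collecting */

            buf[count++] = pfn;
        }

        flush(buf, &count, required_pfn);          /* tail of the batch */
        free(buf);
    }

    int main(void)
    {
        unsigned long pfns[SUPERPAGE_NR_PFNS + 3];
        unsigned long p;
        int n = 0;

        for ( p = 0; p < SUPERPAGE_NR_PFNS; p++ )  /* one complete in-order extent */
            pfns[n++] = p;
        pfns[n++] = 2 * SUPERPAGE_NR_PFNS;         /* sparse run: no superpage here */
        pfns[n++] = 2 * SUPERPAGE_NR_PFNS + 5;
        pfns[n++] = 2 * SUPERPAGE_NR_PFNS + 9;

        process_batch(pfns, n);
        return 0;
    }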
Keir Fraser
2009-Aug-19 07:29 UTC
[Xen-devel] Re: revisit the super page support in HVM restore
On 19/08/2009 08:08, "Zhai, Edwin" <edwin.zhai@intel.com> wrote:

> So how about this new method:
> * Do not track each pfn inside a 2M page; instead, try our best to allocate
>   a 2M page whenever the 2M page covering a pfn is not yet allocated.
> * There may be holes inside newly allocated 2M pages that are not synced in
>   this batch, but we don't care and assume these missing pfns will come later.
>
> This new method is simple, as the superpage support for PV guests is
> already there.

You will fail to restore a guest which has ballooned down its memory, as there will be 4K holes in its memory map. You will allocate 2MB superpages despite these holes, which do not get fixed up until the end of the restore process, and will run out of memory in the host or hit the guest's maxmem limit.

-- Keir
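To put rough numbers on the ballooning concern (all figures below are made up for illustration): suppose a guest's memory map spans 2GB of pfn space but, after ballooning, only half of each 2MB extent is still populated, so the guest really needs about 1GB. The "always allocate a 2MB superpage for any touched extent" policy would commit the full 2GB before any hole is fixed up:

    /* Illustration only: made-up numbers for a ballooned-down guest. */
    #include <stdio.h>

    #define PAGE_SIZE         4096UL
    #define SUPERPAGE_PAGES   512UL                   /* 4K pages per 2MB extent */
    #define MAP_SPAN_BYTES    (2UL << 30)             /* pfn space spans 2GB */
    #define NR_EXTENTS        (MAP_SPAN_BYTES / (SUPERPAGE_PAGES * PAGE_SIZE))

    int main(void)
    {
        /* After ballooning, assume only half of each extent is populated. */
        unsigned long populated = NR_EXTENTS * (SUPERPAGE_PAGES / 2) * PAGE_SIZE;
        /* Naive policy: one full 2MB superpage per touched extent. */
        unsigned long committed = NR_EXTENTS * SUPERPAGE_PAGES * PAGE_SIZE;

        printf("guest actually needs : %4lu MB\n", populated >> 20);
        printf("naive policy commits : %4lu MB\n", committed >> 20);
        printf("excess held until end of restore: %lu MB\n",
               (committed - populated) >> 20);
        return 0;
    }

If the host only has about 1GB free, or the domain's maxmem is set near its ballooned-down size, the restore fails long before the unwanted pages could be given back.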
Zhai, Edwin
2009-Aug-19 07:55 UTC
[Xen-devel] Re: revisit the super page support in HVM restore
Keir Fraser wrote:
> On 19/08/2009 08:08, "Zhai, Edwin" <edwin.zhai@intel.com> wrote:
>> So how about this new method:
>> * Do not track each pfn inside a 2M page; instead, try our best to allocate
>>   a 2M page whenever the 2M page covering a pfn is not yet allocated.
>> * There may be holes inside newly allocated 2M pages that are not synced in
>>   this batch, but we don't care and assume these missing pfns will come later.
>>
>> This new method is simple, as the superpage support for PV guests is
>> already there.
>
> You will fail to restore a guest which has ballooned down its memory, as
> there will be 4K holes in its memory map.

I see. But the current PV guest code has the same issue: if superpages are enabled for a PV guest, allocate_mfn in xc_domain_restore.c tries to allocate a 2M page for each pfn regardless of the holes. To my understanding this is an even more serious issue for PV guests, as they use the balloon driver more frequently.

If we have to use the original algorithm, that brings us back to my complicated code -- do you have any suggestion for simplifying the logic?

Thanks,

> You will allocate 2MB superpages despite these holes, which do not get
> fixed up until the end of the restore process, and will run out of memory
> in the host or hit the guest's maxmem limit.
>
> -- Keir

--
best rgds,
edwin
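For reference, the allocation pattern under discussion looks roughly like the sketch below; populate_extent() is a hypothetical stand-in for the populate-physmap hypercall wrapper in libxc, assumed to return 0 on success. It tries one 2MB extent first and falls back to 4K pages only when the host has no contiguous chunk free, so the fallback says nothing about guest-side holes, which is exactly the ballooning problem Keir raises:

    /*
     * Sketch of the "try a superpage, fall back to 4K" pattern.
     * populate_extent() is a hypothetical stand-in for the real
     * populate-physmap call; this is not the actual allocate_mfn code.
     */
    #define SUPERPAGE_PFN_SHIFT  9
    #define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)

    /* Hypothetical: allocate 2^order contiguous pages backing guest pfn 'base'. */
    extern int populate_extent(unsigned int domid, unsigned long base,
                               unsigned int order);

    static int populate_2m_or_fall_back(unsigned int domid, unsigned long base_pfn)
    {
        unsigned long i;

        /* First try one 2MB extent for the whole aligned range. */
        if ( populate_extent(domid, base_pfn, SUPERPAGE_PFN_SHIFT) == 0 )
            return 0;

        /* Host has no contiguous 2MB chunk free: fall back to 512 4K pages.
         * Note this does nothing about guest-side holes -- every pfn in the
         * extent still gets populated. */
        for ( i = 0; i < SUPERPAGE_NR_PFNS; i++ )
            if ( populate_extent(domid, base_pfn + i, 0) )
                return -1;

        return 0;
    }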
Keir Fraser
2009-Aug-19 09:04 UTC
[Xen-devel] Re: revisit the super page support in HVM restore
On 19/08/2009 08:55, "Zhai, Edwin" <edwin.zhai@intel.com> wrote:

>> You will fail to restore a guest which has ballooned down its memory, as
>> there will be 4K holes in its memory map.
>
> I see. But the current PV guest code has the same issue: if superpages are
> enabled for a PV guest, allocate_mfn in xc_domain_restore.c tries to
> allocate a 2M page for each pfn regardless of the holes. To my understanding
> this is an even more serious issue for PV guests, as they use the balloon
> driver more frequently.

I don't think this has been addressed yet for PV guests. But then again, no one much is using the PV superpage support, whereas this HVM superpage logic will always be on. So it needs to work reliably!

> If we have to use the original algorithm, that brings us back to my
> complicated code -- do you have any suggestion for simplifying the logic?

I wasn't clear where your pseudocode fits into xc_domain_restore. My view is that we would probably put the logic inside allocate_physmem(), or near the call to allocate_physmem(). The added logic would look for the start of a superpage, then look for a straight run of pages to the end of the superpage (or until we hit the end of the batch, which would need special treatment).

As for your other points:

* "Need to tell whether it's a superpage or not" -- superpages in the guest physmap are only an optimisation. We can introduce them where possible, regardless of which regions were or weren't superpage-backed in the original source domain.

* "Need to know whether the page has already been populated, and if so as a normal page or a superpage" -- the p2m[] array tells us what is already populated. And we do not need to care, after the allocation has happened, whether it was a superpage or not: a superpage simply fills 512 entries in the p2m[]. Our try-to-allocate-superpage logic will simply bail if it detects that any entry in the p2m[] range of interest is already populated.

Basically all we need is a "good-enough" heuristic for allocating superpages, as they are an optimisation only. If measurement tells us our heuristic is failing too often, then we can get more sophisticated/complicated.

-- Keir
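As a concrete reading of that heuristic, here is a hedged sketch, not the real allocate_physmem(): p2m[], INVALID_P2M_ENTRY and alloc_superpage() are assumed names. It scans the batch for a 2MB-aligned pfn, requires a straight in-order run of 512 pfns that fits entirely within the batch and whose p2m[] entries are all unpopulated, and only then allocates the superpage; everything else is left to ordinary 4K allocation.

    /*
     * Hedged sketch of the heuristic described above. p2m[] maps guest pfn
     * to mfn with INVALID_P2M_ENTRY marking unpopulated pfns; alloc_superpage()
     * stands in for the actual populate-physmap call with a 2MB extent order.
     */
    #define SUPERPAGE_PFN_SHIFT  9
    #define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)
    #define INVALID_P2M_ENTRY    (~0UL)

    extern unsigned long *p2m;                          /* assumed restore-side p2m table */
    extern int alloc_superpage(unsigned long base_pfn); /* hypothetical, 0 on success */

    /*
     * Try to satisfy batch[i..] with one 2MB superpage. Returns the number of
     * batch entries consumed: SUPERPAGE_NR_PFNS on success, 0 if we bail and
     * let the caller fall back to ordinary 4K allocation.
     */
    static unsigned long try_superpage(const unsigned long *batch,
                                       unsigned long i, unsigned long batch_size)
    {
        unsigned long base_pfn = batch[i], j;

        /* Must start on a 2MB boundary, and the whole run must fit in this
         * batch; runs split across a batch boundary would need the special
         * treatment mentioned above and are simply skipped here. */
        if ( (base_pfn & (SUPERPAGE_NR_PFNS - 1)) != 0 ||
             i + SUPERPAGE_NR_PFNS > batch_size )
            return 0;

        for ( j = 0; j < SUPERPAGE_NR_PFNS; j++ )
        {
            /* Bail unless the run is a straight, in-order sequence of pfns... */
            if ( batch[i + j] != base_pfn + j )
                return 0;
            /* ...and nothing in the extent is already populated. */
            if ( p2m[base_pfn + j] != INVALID_P2M_ENTRY )
                return 0;
        }

        if ( alloc_superpage(base_pfn) != 0 )
            return 0;           /* no contiguous 2MB chunk available in the host */

        /* The real code would now record the 512 mfns returned by the
         * hypercall in p2m[base_pfn .. base_pfn + 511]. */
        return SUPERPAGE_NR_PFNS;
    }

A caller in the batch-processing loop would advance by the return value when it is non-zero, and otherwise populate batch[i] as a single 4K page and advance by one; that keeps the superpage path a pure optimisation on top of the existing 4K path.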