George Dunlap
2011-May-16 10:51 UTC
[Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate
This patch series restores the use of superpages when restoring or
migrating a VM, while retaining efficient batching of 4k pages when
superpages are not appropriate or available.

This version does not include the p2m statistics patch, as it's not
needed, and it caused some disagreement.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
George Dunlap
2011-May-16 10:51 UTC
[Xen-devel] [PATCH 1 of 4] tools: Detect superpages on domain restore
When receiving pages, look for contiguous 2-meg aligned regions and
attempt to allocate a superpage for that region, falling back to 4k
pages if the allocation fails.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r f9bb0bbea7c2 -r 57558d87761f tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c	Thu May 12 16:42:54 2011 +0100
+++ b/tools/libxc/xc_domain_restore.c	Mon May 16 11:50:46 2011 +0100
@@ -48,6 +48,11 @@ struct restore_ctx {
 
 #define HEARTBEAT_MS 1000
 
+#define SUPERPAGE_PFN_SHIFT  9
+#define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)
+
+#define SUPER_PAGE_START(pfn)    (((pfn) & (SUPERPAGE_NR_PFNS-1)) == 0 )
+
 #ifndef __MINIOS__
 static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
                        int fd, void* buf, size_t size)
@@ -882,9 +887,11 @@ static int pagebuf_get(xc_interface *xch
 static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx,
                        xen_pfn_t* region_mfn, unsigned long* pfn_type,
                        int pae_extended_cr3, unsigned int hvm, struct xc_mmu* mmu,
-                       pagebuf_t* pagebuf, int curbatch)
+                       pagebuf_t* pagebuf, int curbatch, int superpages)
 {
     int i, j, curpage, nr_mfns;
+    int k, scount;
+    unsigned long superpage_start=INVALID_P2M_ENTRY;
     /* used by debug verify code */
     unsigned long buf[PAGE_SIZE/sizeof(unsigned long)];
 
     /* Our mapping of the current region (batch) */
@@ -902,8 +909,8 @@ static int apply_batch(xc_interface *xch
     if (j > MAX_BATCH_SIZE)
         j = MAX_BATCH_SIZE;
 
-    /* First pass for this batch: work out how much memory to alloc */
-    nr_mfns = 0;
+    /* First pass for this batch: work out how much memory to alloc, and detect superpages */
+    nr_mfns = scount = 0;
     for ( i = 0; i < j; i++ )
     {
         unsigned long pfn, pagetype;
@@ -914,19 +921,103 @@ static int apply_batch(xc_interface *xch
              (ctx->p2m[pfn] == INVALID_P2M_ENTRY) )
         {
             /* Have a live PFN which hasn't had an MFN allocated */
+
+            /* Logic if we're in the middle of detecting a candidate superpage */
+            if ( superpage_start != INVALID_P2M_ENTRY )
+            {
+                /* Is this the next expected continuation? */
+                if ( pfn == superpage_start + scount )
+                {
+                    if ( !superpages )
+                    {
+                        ERROR("Unexpected codepath with no superpages");
+                        return -1;
+                    }
+
+                    scount++;
+
+                    /* If we've found a whole superpage, allocate it and update p2m */
+                    if ( scount == SUPERPAGE_NR_PFNS )
+                    {
+                        unsigned long supermfn;
+
+                        supermfn=superpage_start;
+                        if ( xc_domain_populate_physmap_exact(xch, dom, 1,
+                                          SUPERPAGE_PFN_SHIFT, 0, &supermfn) != 0 )
+                        {
+                            DPRINTF("No 2M page available for pfn 0x%lx, fall back to 4K page.\n",
+                                    superpage_start);
+                            /* If we're falling back from a failed allocation, subtract one
+                             * from count, since the last page == pfn, which will be handled
+                             * anyway. */
+                            scount--;
+                            goto fallback;
+                        }
+
+                        DPRINTF("Mapping superpage (%d) pfn %lx, mfn %lx\n", scount, superpage_start, supermfn);
+                        for (k=0; k<scount; k++)
+                        {
+                            /* We just allocated a new mfn above; update p2m */
+                            ctx->p2m[superpage_start+k] = supermfn+k;
+                            ctx->nr_pfns++;
+                            /* region_mfn[] will be set below */
+                        }
+                        superpage_start=INVALID_P2M_ENTRY;
+                        scount=0;
+                    }
+                    continue;
+                }
+
+            fallback:
+                DPRINTF("Falling back %d pages pfn %lx\n", scount, superpage_start);
+                for (k=0; k<scount; k++)
+                {
+                    ctx->p2m_batch[nr_mfns++] = superpage_start+k;
+                    ctx->p2m[superpage_start+k]--;
+                }
+                superpage_start = INVALID_P2M_ENTRY;
+                scount=0;
+            }
+
+            /* Are we ready to start a new superpage candidate? */
+            if ( superpages && SUPER_PAGE_START(pfn) )
+            {
+                superpage_start=pfn;
+                scount++;
+                continue;
+            }
+
+            /* Add the current pfn to pfn_batch */
             ctx->p2m_batch[nr_mfns++] = pfn;
             ctx->p2m[pfn]--;
         }
-    }
+    }
+
+    /* Clean up any partial superpage candidates */
+    if ( superpage_start != INVALID_P2M_ENTRY )
+    {
+        DPRINTF("Falling back %d pages pfn %lx\n", scount, superpage_start);
+        for (k=0; k<scount; k++)
+        {
+            ctx->p2m_batch[nr_mfns++] = superpage_start+k;
+            ctx->p2m[superpage_start+k]--;
+        }
+        superpage_start = INVALID_P2M_ENTRY;
+    }
 
     /* Now allocate a bunch of mfns for this batch */
-    if ( nr_mfns &&
-         (xc_domain_populate_physmap_exact(xch, dom, nr_mfns, 0,
-                                           0, ctx->p2m_batch) != 0) )
-    {
-        ERROR("Failed to allocate memory for batch.!\n");
-        errno = ENOMEM;
-        return -1;
+    if ( nr_mfns )
+    {
+        DPRINTF("Mapping order 0, %d; first pfn %lx\n", nr_mfns, ctx->p2m_batch[0]);
+
+        if(xc_domain_populate_physmap_exact(xch, dom, nr_mfns, 0,
+                                            0, ctx->p2m_batch) != 0)
+        {
+            ERROR("Failed to allocate memory for batch.!\n");
+            errno = ENOMEM;
+            return -1;
+        }
     }
 
     /* Second pass for this batch: update p2m[] and region_mfn[] */
@@ -977,7 +1068,8 @@ static int apply_batch(xc_interface *xch
 
         if (pfn_err[i])
         {
-            ERROR("unexpected PFN mapping failure");
+            ERROR("unexpected PFN mapping failure pfn %lx map_mfn %lx p2m_mfn %lx",
+                  pfn, region_mfn[i], ctx->p2m[pfn]);
             goto err_mapped;
         }
 
@@ -1148,9 +1240,6 @@ int xc_domain_restore(xc_interface *xch,
     /* For info only */
     ctx->nr_pfns = 0;
 
-    if ( superpages )
-        return 1;
-
     ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
 
     if ( ctxt == NULL )
@@ -1298,7 +1387,8 @@ int xc_domain_restore(xc_interface *xch,
             int brc;
 
             brc = apply_batch(xch, dom, ctx, region_mfn, pfn_type,
-                              pae_extended_cr3, hvm, mmu, &pagebuf, curbatch);
+                              pae_extended_cr3, hvm, mmu, &pagebuf, curbatch,
+                              superpages);
 
             if ( brc < 0 )
                 goto out;
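The core of the change is the pfn-run detection in apply_batch()'s first pass. Below is a
minimal standalone sketch (plain C, not the libxc code itself) of that idea: scan a sorted
batch of pfns and, wherever a 2M-aligned pfn is followed by 511 consecutive successors,
place a superpage, otherwise fall back to 4k pages. The helpers try_alloc_superpage() and
alloc_4k_page() are hypothetical stand-ins for xc_domain_populate_physmap_exact() called
with and without SUPERPAGE_PFN_SHIFT.

    #include <stdbool.h>

    #define SUPERPAGE_PFN_SHIFT   9
    #define SUPERPAGE_NR_PFNS     (1UL << SUPERPAGE_PFN_SHIFT)
    #define SUPER_PAGE_START(pfn) (((pfn) & (SUPERPAGE_NR_PFNS - 1)) == 0)

    /* Hypothetical allocators standing in for the populate_physmap call. */
    bool try_alloc_superpage(unsigned long first_pfn);
    void alloc_4k_page(unsigned long pfn);

    /* Scan a sorted batch of pfns; allocate a superpage wherever a full,
     * 2M-aligned run of 512 consecutive pfns is present, 4k pages otherwise. */
    void place_batch(const unsigned long *pfns, int count)
    {
        int i = 0;

        while ( i < count )
        {
            /* A candidate starts on a 2M boundary and needs 512 slots left. */
            if ( SUPER_PAGE_START(pfns[i]) && i + SUPERPAGE_NR_PFNS <= count )
            {
                int run = 1;
                while ( run < SUPERPAGE_NR_PFNS && pfns[i + run] == pfns[i] + run )
                    run++;

                if ( run == SUPERPAGE_NR_PFNS && try_alloc_superpage(pfns[i]) )
                {
                    i += SUPERPAGE_NR_PFNS;   /* whole superpage placed */
                    continue;
                }
            }

            alloc_4k_page(pfns[i]);           /* fall back to a single 4k page */
            i++;
        }
    }

The real code does the same scan incrementally as pfns arrive, which is why it keeps the
superpage_start/scount state and needs the explicit fallback paths shown in the patch.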
George Dunlap
2011-May-16 10:51 UTC
[Xen-devel] [PATCH 2 of 4] tools: Save superpages in the same batch, to make detection easier
On the first time through (when pfns are mostly allocated on the
receiving side), try to keep superpages together in the same batch by
ending a batch early if we see the first page of a potential superpage
and there isn't enough room in the batch for a full superpage.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 57558d87761f -r a629b41a8d1f tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c	Mon May 16 11:50:46 2011 +0100
+++ b/tools/libxc/xc_domain_save.c	Mon May 16 11:50:46 2011 +0100
@@ -82,6 +82,11 @@ struct outbuf {
     ((mfn_to_pfn(_mfn) < (dinfo->p2m_size)) &&   \
      (pfn_to_mfn(mfn_to_pfn(_mfn)) == (_mfn))))
 
+#define SUPERPAGE_PFN_SHIFT  9
+#define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)
+
+#define SUPER_PAGE_START(pfn)    (((pfn) & (SUPERPAGE_NR_PFNS-1)) == 0 )
+
 /*
 ** During (live) save/migrate, we maintain a number of bitmaps to track
 ** which pages we have to send, to fixup, and to skip.
@@ -906,6 +911,7 @@ int xc_domain_save(xc_interface *xch, in
     int rc = 1, frc, i, j, last_iter = 0, iter = 0;
     int live  = (flags & XCFLAGS_LIVE);
     int debug = (flags & XCFLAGS_DEBUG);
+    int superpages = !!hvm;
     int race = 0, sent_last_iter, skip_this_iter = 0;
     unsigned int sent_this_iter = 0;
     int tmem_saved = 0;
@@ -1262,6 +1268,12 @@ int xc_domain_save(xc_interface *xch, in
                      (test_bit(n, to_fix)  && last_iter)) )
                     continue;
 
+                /* First time through, try to keep superpages in the same batch */
+                if ( superpages && iter == 1
+                     && SUPER_PAGE_START(n)
+                     && batch + SUPERPAGE_NR_PFNS > MAX_BATCH_SIZE )
+                    break;
+
                 /*
                 ** we get here if:
                 ** 1. page is marked to_send & hasn't already been re-dirtied
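The batching rule amounts to a single early-exit test in the scan loop. A rough standalone
illustration, assuming the saver's MAX_BATCH_SIZE (1024 in xg_save_restore.h at the time;
treat the value as illustrative) and the superpage macros added above:

    #define MAX_BATCH_SIZE        1024    /* assumed batch size, see xg_save_restore.h */
    #define SUPERPAGE_PFN_SHIFT   9
    #define SUPERPAGE_NR_PFNS     (1UL << SUPERPAGE_PFN_SHIFT)
    #define SUPER_PAGE_START(pfn) (((pfn) & (SUPERPAGE_NR_PFNS - 1)) == 0)

    /* Returns non-zero if the current batch should be closed before adding pfn n,
     * so that a potential superpage starting at n lands whole in the next batch. */
    static int end_batch_early(unsigned long n, int batch, int iter, int superpages)
    {
        return superpages && iter == 1 &&
               SUPER_PAGE_START(n) &&
               batch + SUPERPAGE_NR_PFNS > MAX_BATCH_SIZE;
    }

Ending the batch early costs at most one short batch per superpage boundary, and only on
the first iteration, so the overhead on the save side is negligible.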
George Dunlap
2011-May-16 10:51 UTC
[Xen-devel] [PATCH 3 of 4] tools: Introduce "allocate-only" page type for migration
To detect the presence of superpages on the receiver side, we need to
have strings of sequential pfns sent across on the first iteration
through the memory. However, as we go through the memory, more and
more of it will be marked dirty, making it wasteful to send those
pages.

This patch introduces a new PFINFO type, "XALLOC". Like PFINFO_XTAB,
it indicates that there is no corresponding page present in the
subsequent page buffer. However, unlike PFINFO_XTAB, it contains a pfn
which should be allocated.

This new type is only used for migration; but it's placed in
xen/public/domctl.h so that the value isn't reused.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r a629b41a8d1f -r 2ee1b330f2d3 tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c	Mon May 16 11:50:46 2011 +0100
+++ b/tools/libxc/xc_domain_restore.c	Mon May 16 11:50:46 2011 +0100
@@ -839,7 +839,8 @@ static int pagebuf_get_one(xc_interface
     countpages = count;
     for (i = oldcount; i < buf->nr_pages; ++i)
-        if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB)
+        if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB
+            ||(buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XALLOC)
             --countpages;
 
     if (!countpages)
@@ -917,6 +918,7 @@ static int apply_batch(xc_interface *xch
         pfn      = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK;
         pagetype = pagebuf->pfn_types[i + curbatch] &  XEN_DOMCTL_PFINFO_LTAB_MASK;
 
+        /* For allocation purposes, treat XEN_DOMCTL_PFINFO_XALLOC as a normal page */
         if ( (pagetype != XEN_DOMCTL_PFINFO_XTAB) &&
             (ctx->p2m[pfn] == INVALID_P2M_ENTRY) )
         {
@@ -1028,21 +1030,21 @@ static int apply_batch(xc_interface *xch
         pfn      = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK;
         pagetype = pagebuf->pfn_types[i + curbatch] &  XEN_DOMCTL_PFINFO_LTAB_MASK;
 
-        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB )
+        if ( pagetype != XEN_DOMCTL_PFINFO_XTAB
+             && ctx->p2m[pfn] == (INVALID_P2M_ENTRY-1) )
+        {
+            /* We just allocated a new mfn above; update p2m */
+            ctx->p2m[pfn] = ctx->p2m_batch[nr_mfns++];
+            ctx->nr_pfns++;
+        }
+
+        /* setup region_mfn[] for batch map, if necessary.
+         * For HVM guests, this interface takes PFNs, not MFNs */
+        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
+             || pagetype == XEN_DOMCTL_PFINFO_XALLOC )
             region_mfn[i] = ~0UL; /* map will fail but we don't care */
-        else
-        {
-            if ( ctx->p2m[pfn] == (INVALID_P2M_ENTRY-1) )
-            {
-                /* We just allocated a new mfn above; update p2m */
-                ctx->p2m[pfn] = ctx->p2m_batch[nr_mfns++];
-                ctx->nr_pfns++;
-            }
-
-            /* setup region_mfn[] for batch map.
-             * For HVM guests, this interface takes PFNs, not MFNs */
+        else
            region_mfn[i] = hvm ? pfn : ctx->p2m[pfn];
-        }
     }
 
     /* Map relevant mfns */
@@ -1062,8 +1064,9 @@ static int apply_batch(xc_interface *xch
         pfn      = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK;
         pagetype = pagebuf->pfn_types[i + curbatch] &  XEN_DOMCTL_PFINFO_LTAB_MASK;
 
-        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB )
-            /* a bogus/unmapped page: skip it */
+        if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
+             || pagetype == XEN_DOMCTL_PFINFO_XALLOC)
+            /* a bogus/unmapped/allocate-only page: skip it */
             continue;
 
         if (pfn_err[i])
diff -r a629b41a8d1f -r 2ee1b330f2d3 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c	Mon May 16 11:50:46 2011 +0100
+++ b/tools/libxc/xc_domain_save.c	Mon May 16 11:50:46 2011 +0100
@@ -1258,13 +1258,15 @@ int xc_domain_save(xc_interface *xch, in
                 }
                 else
                 {
-                    if ( !last_iter &&
+                    int dont_skip = (last_iter || (superpages && iter==1));
+
+                    if ( !dont_skip &&
                         test_bit(n, to_send) &&
                         test_bit(n, to_skip) )
                         skip_this_iter++; /* stats keeping */
 
                     if ( !((test_bit(n, to_send) && !test_bit(n, to_skip)) ||
-                           (test_bit(n, to_send) && last_iter) ||
+                           (test_bit(n, to_send) && dont_skip) ||
                            (test_bit(n, to_fix)  && last_iter)) )
                         continue;
 
@@ -1277,7 +1279,7 @@ int xc_domain_save(xc_interface *xch, in
                     /*
                     ** we get here if:
                     ** 1. page is marked to_send & hasn't already been re-dirtied
-                    ** 2. (ignore to_skip in last iteration)
+                    ** 2. (ignore to_skip in first and last iterations)
                     ** 3. add in pages that still need fixup (net bufs)
                     */
 
@@ -1301,7 +1303,7 @@ int xc_domain_save(xc_interface *xch, in
                         set_bit(n, to_fix);
                         continue;
                     }
-                    
+
                     if ( last_iter &&
                         test_bit(n, to_fix) &&
                         !test_bit(n, to_send) )
@@ -1346,6 +1348,7 @@ int xc_domain_save(xc_interface *xch, in
                 {
                     if ( pfn_type[j] == XEN_DOMCTL_PFINFO_XTAB )
                         continue;
+
                     DPRINTF("map fail: page %i mfn %08lx err %d\n",
                             j, gmfn, pfn_err[j]);
                     pfn_type[j] = XEN_DOMCTL_PFINFO_XTAB;
@@ -1358,6 +1361,9 @@ int xc_domain_save(xc_interface *xch, in
                     continue;
                 }
 
+                if ( superpages && iter==1 && test_bit(gmfn, to_skip))
+                    pfn_type[j] = XEN_DOMCTL_PFINFO_XALLOC;
+
                 /* canonicalise mfn->pfn */
                 pfn_type[j] |= pfn_batch[j];
                 ++run;
@@ -1432,8 +1438,9 @@ int xc_domain_save(xc_interface *xch, in
                     }
                 }
 
-                /* skip pages that aren't present */
-                if ( pagetype == XEN_DOMCTL_PFINFO_XTAB )
+                /* skip pages that aren't present or are alloc-only */
+                if ( pagetype == XEN_DOMCTL_PFINFO_XTAB
+                     || pagetype == XEN_DOMCTL_PFINFO_XALLOC )
                     continue;
 
                 pagetype &= XEN_DOMCTL_PFINFO_LTABTYPE_MASK;
diff -r a629b41a8d1f -r 2ee1b330f2d3 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h	Mon May 16 11:50:46 2011 +0100
+++ b/xen/include/public/domctl.h	Mon May 16 11:50:46 2011 +0100
@@ -133,6 +133,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_getme
 #define XEN_DOMCTL_PFINFO_LTABTYPE_MASK (0x7U<<28)
 #define XEN_DOMCTL_PFINFO_LPINTAB  (0x1U<<31)
 #define XEN_DOMCTL_PFINFO_XTAB     (0xfU<<28) /* invalid page */
+#define XEN_DOMCTL_PFINFO_XALLOC   (0xeU<<28) /* allocate-only page */
 #define XEN_DOMCTL_PFINFO_PAGEDTAB (0x8U<<28)
 #define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)
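As a sketch of how the new type is consumed on the receive side, the classification below
uses the XEN_DOMCTL_PFINFO_* values from domctl.h (repeated here so the example is
self-contained); the enum and helper function are illustrative only, not part of libxc:

    #define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28)
    #define XEN_DOMCTL_PFINFO_XTAB      (0xfU<<28) /* invalid page */
    #define XEN_DOMCTL_PFINFO_XALLOC    (0xeU<<28) /* allocate-only page */

    typedef enum { PAGE_HAS_DATA, PAGE_ALLOC_ONLY, PAGE_INVALID } page_action_t;

    /* Decide what to do with one entry from the received pfn_types[] array. */
    static page_action_t classify(unsigned long pfn_type)
    {
        switch ( pfn_type & XEN_DOMCTL_PFINFO_LTAB_MASK )
        {
        case XEN_DOMCTL_PFINFO_XTAB:
            return PAGE_INVALID;     /* nothing to allocate, no data in the page buffer */
        case XEN_DOMCTL_PFINFO_XALLOC:
            return PAGE_ALLOC_ONLY;  /* allocate the pfn, but no page data was sent */
        default:
            return PAGE_HAS_DATA;    /* normal page: allocate and fill from the buffer */
        }
    }

The key property is that XALLOC entries keep the pfn sequence intact for superpage
detection (PAGE_ALLOC_ONLY still drives allocation) without paying the cost of sending
page data that will be re-sent once it is dirtied again.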
George Dunlap
2011-May-16 10:51 UTC
[Xen-devel] [PATCH 4 of 4] tools: Enable superpages for HVM domains by default
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 2ee1b330f2d3 -r 454501a50504 tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Mon May 16 11:50:46 2011 +0100
+++ b/tools/libxl/libxl_dom.c	Mon May 16 11:50:46 2011 +0100
@@ -316,7 +316,7 @@ int libxl__domain_restore_common(libxl__
     rc = xc_domain_restore(ctx->xch, fd, domid,
                            state->store_port, &state->store_mfn,
                            state->console_port, &state->console_mfn,
-                           info->hvm, info->u.hvm.pae, 0);
+                           info->hvm, info->u.hvm.pae, !!info->hvm);
     if ( rc ) {
         LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "restoring domain");
         return ERROR_FAIL;
diff -r 2ee1b330f2d3 -r 454501a50504 tools/xcutils/xc_restore.c
--- a/tools/xcutils/xc_restore.c	Mon May 16 11:50:46 2011 +0100
+++ b/tools/xcutils/xc_restore.c	Mon May 16 11:50:46 2011 +0100
@@ -43,7 +43,7 @@ main(int argc, char **argv)
     if ( argc == 9 )
         superpages = atoi(argv[8]);
     else
-        superpages = 0;
+        superpages = !!hvm;
 
     ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn,
                             console_evtchn, &console_mfn, hvm, pae, superpages);
Ian Jackson
2011-May-24 17:30 UTC
Re: [Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate
George Dunlap writes ("[Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate"):
> This patch series restores the use of superpages when restoring or
> migrating a VM, while retaining efficient batching of 4k pages when
> superpages are not appropriate or available.

Thanks for this.  It looks plausible but I haven't done a detailed
review.

Given how much other stuff I've thrown in today I'll at least wait to
see what the tests look like and consider applying this tomorrow.

Ian.
George Dunlap
2011-May-25 13:56 UTC
Re: [Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate
On Tue, 2011-05-24 at 18:30 +0100, Ian Jackson wrote:
> George Dunlap writes ("[Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate"):
> > This patch series restores the use of superpages when restoring or
> > migrating a VM, while retaining efficient batching of 4k pages when
> > superpages are not appropriate or available.
>
> Thanks for this.  It looks plausible but I haven't done a detailed
> review.
>
> Given how much other stuff I've thrown in today I'll at least wait to
> see what the tests look like and consider applying this tomorrow.

Ack.  FWIW it's been through a week of XenRT runs with Boston without
noticeable incident.

-George
Ian Jackson
2011-May-26 14:28 UTC
Re: [Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate
George Dunlap writes ("[Xen-devel] [PATCH 0 of 4] RESEND Use superpages on restore/migrate"):
> This patch series restores the use of superpages when restoring or
> migrating a VM, while retaining efficient batching of 4k pages when
> superpages are not appropriate or available.
>
> This version does not include the p2m statistics patch, as it's not
> needed, and it caused some disagreement.

Applied all four, thanks.

Ian.