George Dunlap
2011-May-06 14:01 UTC
[Xen-devel] [PATCH 0 of 4] Use superpages on restore/migrate
This patch series restores the use of superpages when restoring or migrating a VM, while retaining efficient batching of 4k pages when superpages are not appropriate or available. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-May-06 14:01 UTC
[Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
Count the number of 4kiB, 2MiB, and 1GiB p2m entries. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 4b0692880dfa -r be5d93d38f28 xen/arch/x86/mm/hap/p2m-ept.c --- a/xen/arch/x86/mm/hap/p2m-ept.c Thu May 05 17:40:34 2011 +0100 +++ b/xen/arch/x86/mm/hap/p2m-ept.c Fri May 06 15:01:08 2011 +0100 @@ -39,6 +39,8 @@ #define is_epte_present(ept_entry) ((ept_entry)->epte & 0x7) #define is_epte_superpage(ept_entry) ((ept_entry)->sp) +#define is_epte_countable(ept_entry) (is_epte_present(ept_entry) \ + || ((ept_entry)->sa_p2mt == p2m_populate_on_demand)) /* Non-ept "lock-and-check" wrapper */ static int ept_pod_check_and_populate(struct p2m_domain *p2m, unsigned long gfn, @@ -167,11 +169,14 @@ void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int level) { /* End if the entry is a leaf entry. */ - if ( level == 0 || !is_epte_present(ept_entry) || - is_epte_superpage(ept_entry) ) + if ( level == 0 || !is_epte_present(ept_entry) || is_epte_superpage(ept_entry) ) + { + if ( is_epte_countable(ept_entry) ) + p2m->stats.entries[level]--; return; + } - if ( level > 1 ) + if ( level > 0 ) { ept_entry_t *epte = map_domain_page(ept_entry->mfn); for ( int i = 0; i < EPT_PAGETABLE_ENTRIES; i++ ) @@ -217,7 +222,10 @@ ept_p2m_type_to_flags(epte, epte->sa_p2mt, epte->access); if ( (level - 1) == target ) + { + p2m->stats.entries[target]++; continue; + } ASSERT(is_epte_superpage(epte)); @@ -400,6 +408,10 @@ ept_p2m_type_to_flags(&new_entry, p2mt, p2ma); } + /* old_entry will be handled by ept_free_entry below */ + if ( is_epte_countable(&new_entry) ) + p2m->stats.entries[i]++; + atomic_write_ept_entry(ept_entry, new_entry); } else @@ -412,12 +424,16 @@ split_ept_entry = atomic_read_ept_entry(ept_entry); + /* Accounting should be OK here; split_ept_entry bumps the counts, + * free_entry will reduce them. */ if ( !ept_split_super_page(p2m, &split_ept_entry, i, target) ) { ept_free_entry(p2m, &split_ept_entry, i); goto out; } + /* We know this was countable or we wouldn't be here. */ + p2m->stats.entries[i]--; /* now install the newly split ept sub-tree */ /* NB: please make sure domian is paused and no in-fly VT-d DMA. */ atomic_write_ept_entry(ept_entry, split_ept_entry); @@ -449,9 +465,13 @@ ept_p2m_type_to_flags(&new_entry, p2mt, p2ma); + /* old_entry will be handled by ept_free_entry below */ + if ( is_epte_countable(&new_entry) ) + p2m->stats.entries[i]++; + atomic_write_ept_entry(ept_entry, new_entry); } - + /* Track the highest gfn for which we have ever had a valid mapping */ if ( mfn_valid(mfn_x(mfn)) && (gfn + (1UL << order) - 1 > p2m->max_mapped_pfn) ) diff -r 4b0692880dfa -r be5d93d38f28 xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c Thu May 05 17:40:34 2011 +0100 +++ b/xen/arch/x86/mm/p2m.c Fri May 06 15:01:08 2011 +0100 @@ -184,11 +184,15 @@ { /* End if the entry is a leaf entry. 
*/ if ( page_order == 0 - || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) + || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) || (l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) + { + if ( l1e_get_flags(*p2m_entry) ) + p2m->stats.entries[page_order/9]--; return; - - if ( page_order > 9 ) + } + + if ( page_order ) { l1_pgentry_t *l3_table = map_domain_page(l1e_get_pfn(*p2m_entry)); for ( int i = 0; i < L3_PAGETABLE_ENTRIES; i++ ) @@ -242,6 +246,7 @@ new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)), __PAGE_HYPERVISOR | _PAGE_USER); + /* Stats: Empty entry, no mods needed */ switch ( type ) { case PGT_l3_page_table: p2m_add_iommu_flags(&new_entry, 3, IOMMUF_readable|IOMMUF_writable); @@ -285,10 +290,12 @@ { new_entry = l1e_from_pfn(pfn + (i * L1_PAGETABLE_ENTRIES), flags); p2m_add_iommu_flags(&new_entry, 1, IOMMUF_readable|IOMMUF_writable); + p2m->stats.entries[1]++; p2m->write_p2m_entry(p2m, gfn, l1_entry+i, *table_mfn, new_entry, 2); } unmap_domain_page(l1_entry); + p2m->stats.entries[2]--; new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)), __PAGE_HYPERVISOR|_PAGE_USER); //disable PSE p2m_add_iommu_flags(&new_entry, 2, IOMMUF_readable|IOMMUF_writable); @@ -320,6 +327,7 @@ { new_entry = l1e_from_pfn(pfn + i, flags); p2m_add_iommu_flags(&new_entry, 0, 0); + p2m->stats.entries[0]++; p2m->write_p2m_entry(p2m, gfn, l1_entry+i, *table_mfn, new_entry, 1); } @@ -328,6 +336,7 @@ new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)), __PAGE_HYPERVISOR|_PAGE_USER); p2m_add_iommu_flags(&new_entry, 1, IOMMUF_readable|IOMMUF_writable); + p2m->stats.entries[1]--; p2m->write_p2m_entry(p2m, gfn, p2m_entry, *table_mfn, new_entry, 2); } @@ -908,6 +917,15 @@ void p2m_pod_dump_data(struct p2m_domain *p2m) { + int i; + long entries; + printk(" P2M entry stats:\n"); + for ( i=0; i<3; i++) + if ( (entries=p2m->stats.entries[i]) ) + printk(" L%d: %8ld entries, %ld bytes\n", + i+1, + entries, + entries<<(i*9+12)); printk(" PoD entries=%d cachesize=%d\n", p2m->pod.entry_count, p2m->pod.count); } @@ -1475,6 +1493,12 @@ old_mfn = l1e_get_pfn(*p2m_entry); } + /* Adjust count for present/not-present entries added */ + if ( l1e_get_flags(*p2m_entry) ) + p2m->stats.entries[page_order/9]--; + if ( l1e_get_flags(entry_content) ) + p2m->stats.entries[page_order/9]++; + p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 3); /* NB: paging_write_p2m_entry() handles tlb flushes properly */ @@ -1519,6 +1543,13 @@ p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags); old_mfn = l1e_get_pfn(*p2m_entry); } + + /* Adjust count for present/not-present entries added */ + if ( l1e_get_flags(*p2m_entry) ) + p2m->stats.entries[page_order/9]--; + if ( l1e_get_flags(entry_content) ) + p2m->stats.entries[page_order/9]++; + /* level 1 entry */ p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 1); /* NB: paging_write_p2m_entry() handles tlb flushes properly */ @@ -1556,6 +1587,12 @@ old_mfn = l1e_get_pfn(*p2m_entry); } + /* Adjust count for present/not-present entries added */ + if ( l1e_get_flags(*p2m_entry) ) + p2m->stats.entries[page_order/9]--; + if ( l1e_get_flags(entry_content) ) + p2m->stats.entries[page_order/9]++; + p2m->write_p2m_entry(p2m, gfn, p2m_entry, table_mfn, entry_content, 2); /* NB: paging_write_p2m_entry() handles tlb flushes properly */ @@ -2750,6 +2787,8 @@ continue; } + /* STATS: Should change only type; no stats should need adjustment */ + l2mfn = _mfn(l3e_get_pfn(l3e[i3])); l2e = map_domain_page(l3e_get_pfn(l3e[i3])); for ( i2 = 0; i2 < L2_PAGETABLE_ENTRIES; i2++ ) diff -r 4b0692880dfa -r 
be5d93d38f28 xen/include/asm-x86/p2m.h --- a/xen/include/asm-x86/p2m.h Thu May 05 17:40:34 2011 +0100 +++ b/xen/include/asm-x86/p2m.h Fri May 06 15:01:08 2011 +0100 @@ -278,6 +278,10 @@ unsigned reclaim_single; /* Last gpfn of a scan */ unsigned max_guest; /* gpfn of max guest demand-populate */ } pod; + + struct { + long entries[3]; + } stats; }; /* get host p2m table */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
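For readers checking the arithmetic in p2m_pod_dump_data() above: entries[i] counts leaf mappings of order i*9, so each entry covers 2^(i*9+12) bytes. Below is a minimal standalone model of just that bookkeeping — the struct, dump function, and main() are illustrative scaffolding, not Xen code; only the entries[] field, the level indexing, and the shift mirror the patch:

#include <stdio.h>

struct p2m_stats { long entries[3]; }; /* [0] = 4kiB, [1] = 2MiB, [2] = 1GiB */

static void dump_stats(const struct p2m_stats *s)
{
    printf(" P2M entry stats:\n");
    for (int i = 0; i < 3; i++)
        if (s->entries[i])
            printf(" L%d: %8ld entries, %ld bytes\n",
                   i + 1, s->entries[i],
                   s->entries[i] << (i * 9 + 12)); /* order i*9 => 2^(i*9+12) bytes each */
}

int main(void)
{
    /* e.g. a guest mapped mostly with 2MiB superpages plus some 4k pages */
    struct p2m_stats s = { .entries = { 300, 1024, 0 } };
    dump_stats(&s);
    return 0;
}

With those sample counts the L2 line reports 1024 entries and 2147483648 bytes, i.e. 2GiB mapped by superpages.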
George Dunlap
2011-May-06 14:01 UTC
[Xen-devel] [PATCH 2 of 4] tools: Detect superpages on domain restore
When receiving pages, look for contiguous 2-meg aligned regions and attempt to allocate a superpage for that region, falling back to 4k pages if the allocation fails. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r be5d93d38f28 -r 9b4c0f2f5b9e tools/libxc/xc_domain_restore.c --- a/tools/libxc/xc_domain_restore.c Fri May 06 15:01:08 2011 +0100 +++ b/tools/libxc/xc_domain_restore.c Fri May 06 15:01:08 2011 +0100 @@ -48,6 +48,11 @@ #define HEARTBEAT_MS 1000 +#define SUPERPAGE_PFN_SHIFT 9 +#define SUPERPAGE_NR_PFNS (1UL << SUPERPAGE_PFN_SHIFT) + +#define SUPER_PAGE_START(pfn) (((pfn) & (SUPERPAGE_NR_PFNS-1)) == 0 ) + #ifndef __MINIOS__ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx, int fd, void* buf, size_t size) @@ -882,9 +887,11 @@ static int apply_batch(xc_interface *xch, uint32_t dom, struct restore_ctx *ctx, xen_pfn_t* region_mfn, unsigned long* pfn_type, int pae_extended_cr3, unsigned int hvm, struct xc_mmu* mmu, - pagebuf_t* pagebuf, int curbatch) + pagebuf_t* pagebuf, int curbatch, int superpages) { int i, j, curpage, nr_mfns; + int k, scount; + unsigned long superpage_start=INVALID_P2M_ENTRY; /* used by debug verify code */ unsigned long buf[PAGE_SIZE/sizeof(unsigned long)]; /* Our mapping of the current region (batch) */ @@ -902,8 +909,8 @@ if (j > MAX_BATCH_SIZE) j = MAX_BATCH_SIZE; - /* First pass for this batch: work out how much memory to alloc */ - nr_mfns = 0; + /* First pass for this batch: work out how much memory to alloc, and detect superpages */ + nr_mfns = scount = 0; for ( i = 0; i < j; i++ ) { unsigned long pfn, pagetype; @@ -914,19 +921,103 @@ (ctx->p2m[pfn] == INVALID_P2M_ENTRY) ) { /* Have a live PFN which hasn't had an MFN allocated */ + + /* Logic if we're in the middle of detecting a candidate superpage */ + if ( superpage_start != INVALID_P2M_ENTRY ) + { + /* Is this the next expected continuation? */ + if ( pfn == superpage_start + scount ) + { + if ( !superpages ) + { + ERROR("Unexpected codepath with no superpages"); + return -1; + } + + scount++; + + /* If we've found a whole superpage, allocate it and update p2m */ + if ( scount == SUPERPAGE_NR_PFNS ) + { + unsigned long supermfn; + + + supermfn=superpage_start; + if ( xc_domain_populate_physmap_exact(xch, dom, 1, + SUPERPAGE_PFN_SHIFT, 0, &supermfn) != 0 ) + { + DPRINTF("No 2M page available for pfn 0x%lx, fall back to 4K page.\n", + superpage_start); + /* If we're falling back from a failed allocation, subtract one + * from count, since the last page == pfn, which will be handled + * anyway. */ + scount--; + goto fallback; + } + + DPRINTF("Mapping superpage (%d) pfn %lx, mfn %lx\n", scount, superpage_start, supermfn); + for (k=0; k<scount; k++) + { + /* We just allocated a new mfn above; update p2m */ + ctx->p2m[superpage_start+k] = supermfn+k; + ctx->nr_pfns++; + /* region_map[] will be set below */ + } + superpage_start=INVALID_P2M_ENTRY; + scount=0; + } + continue; + } + + fallback: + DPRINTF("Falling back %d pages pfn %lx\n", scount, superpage_start); + for (k=0; k<scount; k++) + { + ctx->p2m_batch[nr_mfns++] = superpage_start+k; + ctx->p2m[superpage_start+k]--; + } + superpage_start = INVALID_P2M_ENTRY; + scount=0; + } + + /* Are we ready to start a new superpage candidate? 
*/ + if ( superpages && SUPER_PAGE_START(pfn) ) + { + superpage_start=pfn; + scount++; + continue; + } + + /* Add the current pfn to pfn_batch */ ctx->p2m_batch[nr_mfns++] = pfn; ctx->p2m[pfn]--; } - } + } + + /* Clean up any partial superpage candidates */ + if ( superpage_start != INVALID_P2M_ENTRY ) + { + DPRINTF("Falling back %d pages pfn %lx\n", scount, superpage_start); + for (k=0; k<scount; k++) + { + ctx->p2m_batch[nr_mfns++] = superpage_start+k; + ctx->p2m[superpage_start+k]--; + } + superpage_start = INVALID_P2M_ENTRY; + } /* Now allocate a bunch of mfns for this batch */ - if ( nr_mfns && - (xc_domain_populate_physmap_exact(xch, dom, nr_mfns, 0, - 0, ctx->p2m_batch) != 0) ) - { - ERROR("Failed to allocate memory for batch.!\n"); - errno = ENOMEM; - return -1; + if ( nr_mfns ) + { + DPRINTF("Mapping order 0, %d; first pfn %lx\n", nr_mfns, ctx->p2m_batch[0]); + + if(xc_domain_populate_physmap_exact(xch, dom, nr_mfns, 0, + 0, ctx->p2m_batch) != 0) + { + ERROR("Failed to allocate memory for batch.!\n"); + errno = ENOMEM; + return -1; + } } /* Second pass for this batch: update p2m[] and region_mfn[] */ @@ -977,7 +1068,8 @@ if (pfn_err[i]) { - ERROR("unexpected PFN mapping failure"); + ERROR("unexpected PFN mapping failure pfn %lx map_mfn %lx p2m_mfn %lx", + pfn, region_mfn[i], ctx->p2m[pfn]); goto err_mapped; } @@ -1148,9 +1240,6 @@ /* For info only */ ctx->nr_pfns = 0; - if ( superpages ) - return 1; - ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt)); if ( ctxt == NULL ) @@ -1298,7 +1387,8 @@ int brc; brc = apply_batch(xch, dom, ctx, region_mfn, pfn_type, - pae_extended_cr3, hvm, mmu, &pagebuf, curbatch); + pae_extended_cr3, hvm, mmu, &pagebuf, curbatch, + superpages); if ( brc < 0 ) goto out; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
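The first-pass loop above is easier to see as a small state machine: accumulate a run of consecutive pfns starting at a 2M boundary; if the run completes, try a single order-9 allocation; if the run breaks or the allocation fails, fall back to queueing the pfns for the ordinary 4k batch. A reduced sketch follows, with alloc_superpage() and queue_4k() as hypothetical stand-ins for the xc_domain_populate_physmap_exact() calls in the real code:

#define SP_SHIFT 9
#define SP_NR    (1UL << SP_SHIFT)
#define SP_START(pfn) (((pfn) & (SP_NR - 1)) == 0)

static void scan_batch(const unsigned long *pfns, int n,
                       int (*alloc_superpage)(unsigned long start),
                       void (*queue_4k)(unsigned long pfn))
{
    unsigned long start = ~0UL; /* ~0UL: no candidate in progress */
    unsigned long count = 0;

    for (int i = 0; i < n; i++) {
        unsigned long pfn = pfns[i];
        if (start != ~0UL) {
            if (pfn == start + count) {               /* run continues */
                if (++count == SP_NR) {
                    if (alloc_superpage(start) != 0)  /* 2M alloc failed:  */
                        for (unsigned long k = 0; k < SP_NR; k++)
                            queue_4k(start + k);      /* fall back to 4k   */
                    start = ~0UL; count = 0;
                }
                continue;
            }
            for (unsigned long k = 0; k < count; k++) /* run broken: flush */
                queue_4k(start + k);
            start = ~0UL; count = 0;
        }
        if (SP_START(pfn)) { start = pfn; count = 1; continue; }
        queue_4k(pfn);
    }
    for (unsigned long k = 0; k < count; k++)         /* trailing partial  */
        queue_4k(start + k);
}

The real code additionally updates ctx->p2m[] and ctx->nr_pfns on a successful superpage allocation and marks pfns when queueing them, but the control flow is the same.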
George Dunlap
2011-May-06 14:01 UTC
[Xen-devel] [PATCH 3 of 4] tools: Save superpages in the same batch, to make detection easier
On the first time through (when pfns are mostly allocated on the receiving side), try to keep superpages together in the same batch by ending a batch early if we see the first page of a potential superpage and there isn't enough room in the batch for a full superpage. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 9b4c0f2f5b9e -r 3fa5cca65bac tools/libxc/xc_domain_save.c --- a/tools/libxc/xc_domain_save.c Fri May 06 15:01:08 2011 +0100 +++ b/tools/libxc/xc_domain_save.c Fri May 06 15:01:08 2011 +0100 @@ -82,6 +82,11 @@ ((mfn_to_pfn(_mfn) < (dinfo->p2m_size)) && \ (pfn_to_mfn(mfn_to_pfn(_mfn)) == (_mfn)))) +#define SUPERPAGE_PFN_SHIFT 9 +#define SUPERPAGE_NR_PFNS (1UL << SUPERPAGE_PFN_SHIFT) + +#define SUPER_PAGE_START(pfn) (((pfn) & (SUPERPAGE_NR_PFNS-1)) == 0 ) + /* ** During (live) save/migrate, we maintain a number of bitmaps to track ** which pages we have to send, to fixup, and to skip. @@ -906,6 +911,7 @@ int rc = 1, frc, i, j, last_iter = 0, iter = 0; int live = (flags & XCFLAGS_LIVE); int debug = (flags & XCFLAGS_DEBUG); + int superpages = !!hvm; int race = 0, sent_last_iter, skip_this_iter = 0; unsigned int sent_this_iter = 0; int tmem_saved = 0; @@ -1262,6 +1268,12 @@ (test_bit(n, to_fix) && last_iter)) ) continue; + /* First time through, try to keep superpages in the same batch */ + if ( superpages && iter == 1 + && SUPER_PAGE_START(n) + && batch + SUPERPAGE_NR_PFNS > MAX_BATCH_SIZE ) + break; + /* ** we get here if: ** 1. page is marked to_send & hasn't already been re-dirtied _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
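The new break condition is just a reservation check: stop filling the batch when the page that starts a 2M region would not fit in it whole. A sketch, assuming MAX_BATCH_SIZE is 1024 as in libxc at the time:

#define MAX_BATCH_SIZE    1024
#define SUPERPAGE_NR_PFNS 512

/* End the batch early rather than split a potential superpage across
 * two batches (only done on the first live iteration). */
static int must_end_batch(int batch, unsigned long pfn, int iter, int superpages)
{
    return superpages && iter == 1 &&
           (pfn & (SUPERPAGE_NR_PFNS - 1)) == 0 &&     /* pfn starts a 2M region */
           batch + SUPERPAGE_NR_PFNS > MAX_BATCH_SIZE; /* fewer than 512 slots left */
}

So a superpage-aligned pfn arriving with the batch already at, say, 600 entries ends the batch there; the run of 512 pfns then lands at the front of the next batch, where the restore side's detector can see it contiguously.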
George Dunlap
2011-May-06 14:01 UTC
[Xen-devel] [PATCH 4 of 4] tools: Introduce "allocate-only" page type for migration
To detect presence of superpages on the receiver side, we need to have strings of sequential pfns sent across on the first iteration through the memory. However, as we go through the memory, more and more of it will be marked dirty, making it wasteful to send those pages. This patch introduces a new PFINFO type, "XALLOC". Like PFINFO_XTAB, it indicates that there is no corresponding page present in the subsequent page buffer. However, unlike PFINFO_XTAB, it contains a pfn which should be allocated. This new type is only used for migration; but it's placed in xen/public/domctl.h so that the value isn't reused. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 3fa5cca65bac -r aeb70017bf60 tools/libxc/xc_domain_restore.c --- a/tools/libxc/xc_domain_restore.c Fri May 06 15:01:08 2011 +0100 +++ b/tools/libxc/xc_domain_restore.c Fri May 06 15:01:08 2011 +0100 @@ -839,7 +839,8 @@ countpages = count; for (i = oldcount; i < buf->nr_pages; ++i) - if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB) + if ((buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB + ||(buf->pfn_types[i] & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XALLOC) --countpages; if (!countpages) @@ -917,6 +918,7 @@ pfn = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK; pagetype = pagebuf->pfn_types[i + curbatch] & XEN_DOMCTL_PFINFO_LTAB_MASK; + /* For allocation purposes, treat XEN_DOMCTL_PFINFO_XALLOC as a normal page */ if ( (pagetype != XEN_DOMCTL_PFINFO_XTAB) && (ctx->p2m[pfn] == INVALID_P2M_ENTRY) ) { @@ -1028,21 +1030,21 @@ pfn = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK; pagetype = pagebuf->pfn_types[i + curbatch] & XEN_DOMCTL_PFINFO_LTAB_MASK; - if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ) + if ( pagetype != XEN_DOMCTL_PFINFO_XTAB + && ctx->p2m[pfn] == (INVALID_P2M_ENTRY-1) ) + { + /* We just allocated a new mfn above; update p2m */ + ctx->p2m[pfn] = ctx->p2m_batch[nr_mfns++]; + ctx->nr_pfns++; + } + + /* setup region_mfn[] for batch map, if necessary. + * For HVM guests, this interface takes PFNs, not MFNs */ + if ( pagetype == XEN_DOMCTL_PFINFO_XTAB + || pagetype == XEN_DOMCTL_PFINFO_XALLOC ) region_mfn[i] = ~0UL; /* map will fail but we don't care */ - else - { - if ( ctx->p2m[pfn] == (INVALID_P2M_ENTRY-1) ) - { - /* We just allocated a new mfn above; update p2m */ - ctx->p2m[pfn] = ctx->p2m_batch[nr_mfns++]; - ctx->nr_pfns++; - } - - /* setup region_mfn[] for batch map. - * For HVM guests, this interface takes PFNs, not MFNs */ + else region_mfn[i] = hvm ? 
pfn : ctx->p2m[pfn]; - } } /* Map relevant mfns */ @@ -1062,8 +1064,9 @@ pfn = pagebuf->pfn_types[i + curbatch] & ~XEN_DOMCTL_PFINFO_LTAB_MASK; pagetype = pagebuf->pfn_types[i + curbatch] & XEN_DOMCTL_PFINFO_LTAB_MASK; - if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ) - /* a bogus/unmapped page: skip it */ + if ( pagetype == XEN_DOMCTL_PFINFO_XTAB + || pagetype == XEN_DOMCTL_PFINFO_XALLOC) + /* a bogus/unmapped/allocate-only page: skip it */ continue; if (pfn_err[i]) diff -r 3fa5cca65bac -r aeb70017bf60 tools/libxc/xc_domain_save.c --- a/tools/libxc/xc_domain_save.c Fri May 06 15:01:08 2011 +0100 +++ b/tools/libxc/xc_domain_save.c Fri May 06 15:01:08 2011 +0100 @@ -1258,13 +1258,15 @@ } else { - if ( !last_iter && + int dont_skip = (last_iter || (superpages && iter==1)); + + if ( !dont_skip && test_bit(n, to_send) && test_bit(n, to_skip) ) skip_this_iter++; /* stats keeping */ if ( !((test_bit(n, to_send) && !test_bit(n, to_skip)) || - (test_bit(n, to_send) && last_iter) || + (test_bit(n, to_send) && dont_skip) || (test_bit(n, to_fix) && last_iter)) ) continue; @@ -1277,7 +1279,7 @@ /* ** we get here if: ** 1. page is marked to_send & hasn't already been re-dirtied - ** 2. (ignore to_skip in last iteration) + ** 2. (ignore to_skip in first and last iterations) ** 3. add in pages that still need fixup (net bufs) */ @@ -1301,7 +1303,7 @@ set_bit(n, to_fix); continue; } - + if ( last_iter && test_bit(n, to_fix) && !test_bit(n, to_send) ) @@ -1346,6 +1348,7 @@ { if ( pfn_type[j] == XEN_DOMCTL_PFINFO_XTAB ) continue; + DPRINTF("map fail: page %i mfn %08lx err %d\n", j, gmfn, pfn_err[j]); pfn_type[j] = XEN_DOMCTL_PFINFO_XTAB; @@ -1358,6 +1361,9 @@ continue; } + if ( superpages && iter==1 && test_bit(gmfn, to_skip)) + pfn_type[j] = XEN_DOMCTL_PFINFO_XALLOC; + /* canonicalise mfn->pfn */ pfn_type[j] |= pfn_batch[j]; ++run; @@ -1432,8 +1438,9 @@ } } - /* skip pages that aren't present */ - if ( pagetype == XEN_DOMCTL_PFINFO_XTAB ) + /* skip pages that aren't present or are alloc-only */ + if ( pagetype == XEN_DOMCTL_PFINFO_XTAB + || pagetype == XEN_DOMCTL_PFINFO_XALLOC ) continue; pagetype &= XEN_DOMCTL_PFINFO_LTABTYPE_MASK; diff -r 3fa5cca65bac -r aeb70017bf60 xen/include/public/domctl.h --- a/xen/include/public/domctl.h Fri May 06 15:01:08 2011 +0100 +++ b/xen/include/public/domctl.h Fri May 06 15:01:08 2011 +0100 @@ -133,6 +133,7 @@ #define XEN_DOMCTL_PFINFO_LTABTYPE_MASK (0x7U<<28) #define XEN_DOMCTL_PFINFO_LPINTAB (0x1U<<31) #define XEN_DOMCTL_PFINFO_XTAB (0xfU<<28) /* invalid page */ +#define XEN_DOMCTL_PFINFO_XALLOC (0xeU<<28) /* allocate-only page */ #define XEN_DOMCTL_PFINFO_PAGEDTAB (0x8U<<28) #define XEN_DOMCTL_PFINFO_LTAB_MASK (0xfU<<28) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
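The wire-format change is confined to the top nibble of each pfn_type word. A sketch of the three-way decision the receiver now makes (names shortened from XEN_DOMCTL_PFINFO_*; values as in the domctl.h hunk above):

#define PFINFO_LTAB_MASK (0xfUL << 28)
#define PFINFO_XTAB      (0xfUL << 28) /* invalid page: no data, no allocation */
#define PFINFO_XALLOC    (0xeUL << 28) /* allocate-only: pfn valid, no data    */

enum action { SKIP, ALLOC_ONLY, ALLOC_AND_FILL };

static enum action classify(unsigned long pfn_type)
{
    switch (pfn_type & PFINFO_LTAB_MASK) {
    case PFINFO_XTAB:   return SKIP;
    case PFINFO_XALLOC: return ALLOC_ONLY;     /* keeps the pfn run contiguous
                                                  for the superpage detector */
    default:            return ALLOC_AND_FILL; /* page data follows in the buffer */
    }
}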
Tim Deegan
2011-May-06 14:23 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
Hi, Can you please add this: [diff] showfunc = True to your .hgrc? It makes reviewing much easier. There are two places where this patch changes the control flow of p2m operations:> @@ -167,11 +169,14 @@ > void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int level) > { > /* End if the entry is a leaf entry. */ > - if ( level == 0 || !is_epte_present(ept_entry) || > - is_epte_superpage(ept_entry) ) > + if ( level == 0 || !is_epte_present(ept_entry) || is_epte_superpage(ept_entry) ) > + { > + if ( is_epte_countable(ept_entry) ) > + p2m->stats.entries[level]--; > return; > + } > > - if ( level > 1 ) > + if ( level > 0 ) > { and similarly: > @@ -184,11 +184,15 @@ > { > /* End if the entry is a leaf entry. */ > if ( page_order == 0 > - || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) > + || !(l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) > || (l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) > + { > + if ( l1e_get_flags(*p2m_entry) ) > + p2m->stats.entries[page_order/9]--; > return; > - > - if ( page_order > 9 ) > + } > + > + if ( page_order ) here. Can you explain those in a bit more detail? Cheers, Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Christoph Egger
2011-May-06 14:40 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
Hi, Can you please use defines for the indexes of p2m->stats.entries[], please? That makes it easier to read what you are counting. Christoph On 05/06/11 16:01, George Dunlap wrote:> Count the number of 4kiB, 2MiB, and 1GiB p2m entries. > > Signed-off-by: George Dunlap<george.dunlap@eu.citrix.com> > > [...] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Christoph Egger
2011-May-06 14:53 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
On 05/06/11 17:00, Tim Deegan wrote:> Hello, > > At 15:40 +0100 on 06 May (1304696401), Christoph Egger wrote: >> Can you please use defines for the indexes of p2m->stats.entries[], >> please? That makes it easier to read what you are counting. > > I'm not sure that's a great idea. The array is indexed in several > places by integer already, and the defines are unlikely to be much more > illuminating unless you can think of some better naming scheme than > "#define LEVEL_0 0" etc. What about this: #define PAGE_ORDER_4K 0 #define PAGE_ORDER_2M 9 #define PAGE_ORDER_1G 18 > > On the other hand, maybe the array itself could have a more descriptive > name than "stats.entries". > > Tim. > >> On 05/06/11 16:01, George Dunlap wrote: >>> Count the number of 4kiB, 2MiB, and 1GiB p2m entries. >>> [...] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-May-06 15:00 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
Hello, At 15:40 +0100 on 06 May (1304696401), Christoph Egger wrote:> Can you please use defines for the indexes of p2m->stats.entries[], > please? That makes it easier to read what you are counting. I'm not sure that's a great idea. The array is indexed in several places by integer already, and the defines are unlikely to be much more illuminating unless you can think of some better naming scheme than "#define LEVEL_0 0" etc. On the other hand, maybe the array itself could have a more descriptive name than "stats.entries". Tim. > On 05/06/11 16:01, George Dunlap wrote: > > Count the number of 4kiB, 2MiB, and 1GiB p2m entries. > > [...] -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-May-06 15:07 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
On Fri, 2011-05-06 at 15:23 +0100, Tim Deegan wrote:> Hi, > > Can you please add this: > > [diff] > showfunc = True Ah, very helpful. > There are two places where this patch changes the control flow of p2m > operations: Oops -- yeah, those definitely need some comments. The "*_free_entry()" functions are used when replacing smaller-order pages with a larger-order page. Most commonly this is replacing a bunch of 4k pages with a 2MiB page, but in theory you could be replacing a tree of mixed 4k and 2MiB pages with a 1G page as well. When that happens, we also need to adjust the p2m order stats by: * Decrementing stats on all the leaf entries being replaced * Incrementing the stat for the new leaf entry. Incrementing the stat for the new entry is done when the new entry is assigned. Decrementing stats requires walking the old p2m tables and finding valid leaf entries. Since *_free_entry() already do that, we just piggyback. p2m entries are already checked at the entrance to the function to see if we're at a leaf. The patch adds appropriate reference counting if it is. The other change is how far down the tree *_free_entry() functions go. Since their original purpose was only freeing intermediate tables, they don't bother to walk all the way down to l0. Since we want to do reference counting of the l0s as well, we just make it go all the way down. This does mean making a function call for all 512 entries of the L1 table, but it should be a pretty rare operation. I could refactor the function to do the check before making the call instead, if desired. Let me know if you want me to include this description in a new patch series. -George > [...] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
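A toy version of the walk being described — not the Xen code, just its shape: treat level 0, not-present, and superpage entries as leaves and decrement their counter; otherwise recurse, now all the way down to l0 rather than stopping at l1:

#define TABLE_ENTRIES 512

struct entry {
    int present, superpage, countable;
    struct entry *table; /* TABLE_ENTRIES children iff present && !superpage */
};

static void free_entry(long stats[3], struct entry *e, int level)
{
    /* Leaf: level 0, not present, or a superpage mapping */
    if (level == 0 || !e->present || e->superpage) {
        if (e->countable)
            stats[level]--; /* account for the leaf being replaced */
        return;
    }
    /* Intermediate table: the patch relaxes the old "level > 1" cut-off
     * to "level > 0" precisely so this recursion reaches the l1 entries */
    for (int i = 0; i < TABLE_ENTRIES; i++)
        free_entry(stats, &e->table[i], level - 1);
    /* ...the real code then frees the intermediate table page itself... */
}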
George Dunlap
2011-May-06 15:14 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
On Fri, May 6, 2011 at 4:00 PM, Tim Deegan <Tim.Deegan@citrix.com> wrote:> On the other hand, maybe the array itself could have a more descriptive > name than "stats.entries". It's statistics on the number of 4k, 2M, and 1G entries. I admit it's not that great, though. Suggestions? -G _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-May-06 15:34 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
On Fri, May 6, 2011 at 3:53 PM, Christoph Egger <Christoph.Egger@amd.com> wrote:> What about this: > > #define PAGE_ORDER_4K 0 > #define PAGE_ORDER_2M 9 > #define PAGE_ORDER_1G 18 That would be 0, 1, and 2, respectively. I had thought about something like this, but the common usage seems to be to use L1-3 rather than 4k, 2M, or 1G; and #define PAGE_ORDER_L1 0 #define PAGE_ORDER_L2 1 #define PAGE_ORDER_L3 2 seemed a bit redundant. This patch is actually not necessary for the series -- just for the verification that it worked. I could drop this patch so we can discuss it, and send the other three by themselves (since they seem pretty uncontroversial). -George > >> >> On the other hand, maybe the array itself could have a more descriptive >> name than "stats.entries". >> >> Tim. >> >>> On 05/06/11 16:01, George Dunlap wrote: >>>> Count the number of 4kiB, 2MiB, and 1GiB p2m entries. >>>> [...] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
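For what it's worth, the two spellings under discussion relate through the same order/9 division the patch already uses; a tiny illustration (the helper is hypothetical, not proposed code):

#define PAGE_ORDER_4K  0
#define PAGE_ORDER_2M  9
#define PAGE_ORDER_1G 18

/* stats.entries[] is indexed by order/9: 4k -> 0, 2M -> 1, 1G -> 2 */
static inline int order_to_index(unsigned int page_order)
{
    return page_order / 9;
}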
Tim Deegan
2011-May-09 08:27 UTC
Re: [Xen-devel] [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
At 16:34 +0100 on 06 May (1304699686), George Dunlap wrote:> This patch is actually not necessary for the series -- just for the > verification that it worked. I could drop this patch so we can > discuss it, and send the other three by themselves (since they seem > pretty uncontroversial). I think that's best. I'll be looking at reference-counting p2m entries soon, I hope, and I'll add some bookkeeping when I do. I suspect I can isolate it into a single place then. Cheers, Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2012-Jun-08 10:52 UTC
Re: [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
On Mon, May 9, 2011 at 9:27 AM, Tim Deegan <Tim.Deegan@citrix.com> wrote:> At 16:34 +0100 on 06 May (1304699686), George Dunlap wrote: >> This patch is actually not necessary for the series -- just for the >> verification that it worked. I could drop this patch so we can >> discuss it, and send the other three by themselves (since they seem >> pretty uncontroversial). > > I think that's best. I'll be looking at reference-counting p2m entries > soon, I hope, and I'll add some bookkeeping when I do. I suspect I can > isolate it into a single place then. Tim, I realize this is over a year ago now, but did you ever end up adding this bookkeeping? I'm trying to port this patch again... -George
Tim Deegan
2012-Jun-14 08:52 UTC
Re: [PATCH 1 of 4] p2m: Keep statistics on order of p2m entries
At 11:52 +0100 on 08 Jun (1339156322), George Dunlap wrote:> On Mon, May 9, 2011 at 9:27 AM, Tim Deegan <Tim.Deegan@citrix.com> wrote: > > At 16:34 +0100 on 06 May (1304699686), George Dunlap wrote: > >> This patch is actually not necessary for the series -- just for the > >> verification that it worked. I could drop this patch so we can > >> discuss it, and send the other three by themselves (since they seem > >> pretty uncontroversial). > > > > I think that's best. I'll be looking at reference-counting p2m entries > > soon, I hope, and I'll add some bookkeeping when I do. I suspect I can > > isolate it into a single place then. > > Tim, I realize this is over a year ago now, but did you ever end up > adding this bookkeeping? I'm trying to port this patch again... No, I never did. :( Tim.