George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 0 of 3 v3] xen, pod: Populate-on-demand reclaim improvements
xen,pod: Populate-on-demand reclaim improvements

Rework populate-on-demand sweeping

Last summer I did some work on testing whether our PoD sweeping code
was achieving its goals: namely, never crashing unnecessarily,
minimizing boot time, and maximizing the number of superpages in the
p2m table.

This is the resulting patch series.

v2:
 - Move cosmetic code-motion hunk into its own patch
 - Address various comments
 - Include a simplified version of the balloon reclamation patch

v3:
 - Remove code motion patch (already checked in)
 - Move balloon patch to front
 - Add SoB to balloon patch
 - More clean-ups to "checklast" and remove-supersweep patches
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 1 of 3 v3] xen, pod: Try to reclaim superpages when ballooning down
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893080 -3600
# Node ID fb0187ae8a20d0850dea0cd3e4167503411e5950
# Parent  52f1b8a4f9a4cb454b6fea1220cc6a09cf401a42
xen,pod: Try to reclaim superpages when ballooning down

Windows balloon drivers can typically only get 4k pages from the
kernel, and so hand them back at that level.  Try to regain superpages
by checking the superpage frame that the 4k page is in to see if we
can reclaim the whole thing for the PoD cache.

This also modifies p2m_pod_zero_check_superpage() to return
SUPERPAGE_PAGES on success.

v2:
 - Rewritten to simply do the check as in the demand-fault case,
   without needing to know that the p2m entry is a superpage.
 - Also, took out the re-writing of the reclaim loop, leaving it
   optimized for 4k pages (by far the most common case), and
   simplifying the patch.

v3:
 - Add SoB

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -488,6 +488,10 @@ p2m_pod_offline_or_broken_replace(struct
     return;
 }
 
+static int
+p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn);
+
+
 /* This function is needed for two reasons:
  * + To properly handle clearing of PoD entries
  * + To "steal back" memory being freed for the PoD cache, rather than
@@ -505,8 +509,8 @@ p2m_pod_decrease_reservation(struct doma
     int i;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
-    int steal_for_cache = 0;
-    int pod = 0, nonpod = 0, ram = 0;
+    int steal_for_cache;
+    int pod, nonpod, ram;
 
     gfn_lock(p2m, gpfn, order);
     pod_lock(p2m);
@@ -516,13 +520,15 @@ p2m_pod_decrease_reservation(struct doma
     if ( p2m->pod.entry_count == 0 )
         goto out_unlock;
 
+    if ( unlikely(d->is_dying) )
+        goto out_unlock;
+
+recount:
+    pod = nonpod = ram = 0;
+
     /* Figure out if we need to steal some freed memory for our cache */
     steal_for_cache =  ( p2m->pod.entry_count > p2m->pod.count );
 
-    if ( unlikely(d->is_dying) )
-        goto out_unlock;
-
-    /* See what's in here. */
     /* FIXME: Add contiguous; query for PSE entries? */
     for ( i=0; i<(1<<order); i++)
     {
@@ -556,7 +562,16 @@ p2m_pod_decrease_reservation(struct doma
         goto out_entry_check;
     }
 
-    /* FIXME: Steal contig 2-meg regions for cache */
+    /* Try to grab entire superpages if possible.  Since the common case
+     * is for drivers to pass back singleton pages, see if we can take
+     * the whole page back and mark the rest PoD. */
+    if ( steal_for_cache
+         && p2m_pod_zero_check_superpage(p2m, gpfn & ~(SUPERPAGE_PAGES-1)))
+    {
+        /* Since order may be arbitrary, we may have taken more or less
+         * than we were actually asked to; so just re-count from scratch */
+        goto recount;
+    }
 
     /* Process as long as:
      * + There are PoD entries to handle, or
@@ -758,6 +773,8 @@ p2m_pod_zero_check_superpage(struct p2m_
     p2m_pod_cache_add(p2m, mfn_to_page(mfn0), PAGE_ORDER_2M);
     p2m->pod.entry_count += SUPERPAGE_PAGES;
 
+    ret = SUPERPAGE_PAGES;
+
 out_reset:
     if ( reset )
         set_p2m_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 2 of 3 v3] xen, pod: Zero-check recently populated pages (checklast)
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893083 -3600
# Node ID 9de241075c7f622758f00223805b0279635ff4d9
# Parent  fb0187ae8a20d0850dea0cd3e4167503411e5950
xen,pod: Zero-check recently populated pages (checklast)

When demand-populating pages due to guest accesses, check recently
populated pages to see if we can reclaim them for the cache.  This
should keep the PoD cache filled when the start-of-day scrubber is
going through.

The number 128 was chosen by experiment.  Windows does its page
scrubbing in parallel; while a small number like 4 works well for
single VMs, it breaks down as multiple vcpus are scrubbing different
pages in parallel.  Increasing to 128 works well for higher numbers of
vcpus.

v2:
 - Wrapped some long lines
 - unsigned int for index, unsigned long for array

v3:
 - Use PAGE_ORDER_2M instead of 9
 - Removed inappropriate use of p2m_pod_zero_check_superpage() return value

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -926,6 +926,27 @@ p2m_pod_emergency_sweep_super(struct p2m
     p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
 }
 
+/* When populating a new superpage, look at recently populated superpages
+ * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
+ * the guest OS is done with them. */
+static void
+p2m_pod_check_last_super(struct p2m_domain *p2m, unsigned long gfn_aligned)
+{
+    unsigned long check_gfn;
+
+    ASSERT(p2m->pod.last_populated_index < POD_HISTORY_MAX);
+
+    check_gfn = p2m->pod.last_populated[p2m->pod.last_populated_index];
+
+    p2m->pod.last_populated[p2m->pod.last_populated_index] = gfn_aligned;
+
+    p2m->pod.last_populated_index =
+        ( p2m->pod.last_populated_index + 1 ) % POD_HISTORY_MAX;
+
+    p2m_pod_zero_check_superpage(p2m, check_gfn);
+}
+
+
 #define POD_SWEEP_STRIDE  16
 static void
 p2m_pod_emergency_sweep(struct p2m_domain *p2m)
@@ -1083,6 +1104,12 @@ p2m_pod_demand_populate(struct p2m_domai
         __trace_var(TRC_MEM_POD_POPULATE, 0, sizeof(t), &t);
     }
 
+    /* Check the last guest demand-populate */
+    if ( p2m->pod.entry_count > p2m->pod.count
+         && (order == PAGE_ORDER_2M)
+         && (q & P2M_ALLOC) )
+        p2m_pod_check_last_super(p2m, gfn_aligned);
+
     pod_unlock(p2m);
     return 0;
 out_of_memory:
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -287,6 +287,10 @@ struct p2m_domain {
         unsigned reclaim_super;  /* Last gpfn of a scan */
         unsigned reclaim_single; /* Last gpfn of a scan */
         unsigned max_guest;      /* gpfn of max guest demand-populate */
+#define POD_HISTORY_MAX 128
+        /* gpfn of last guest superpage demand-populated */
+        unsigned long last_populated[POD_HISTORY_MAX];
+        unsigned int last_populated_index;
         mm_lock_t lock;          /* Locking of private pod structs,   *
                                   * not relying on the p2m lock.      */
     } pod;
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 3 of 3 v3] xen, pod: Only sweep in an emergency, and only for 4k pages
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893085 -3600
# Node ID 90f2f1728f906b1fb2e2d70e5a88b54d0fc190d8
# Parent  9de241075c7f622758f00223805b0279635ff4d9
xen,pod: Only sweep in an emergency, and only for 4k pages

Testing has shown that doing sweeps for superpages slows down boot
significantly, but does not result in a significantly higher number of
superpages after boot.  Early sweeping for 4k pages causes superpages
to be broken up unnecessarily.

Only sweep if we're really out of memory.

v2:
 - Move unrelated code-motion hunk to another patch

v3:
 - Remove now-unused reclaim_super from pod struct

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -897,34 +897,6 @@ p2m_pod_zero_check(struct p2m_domain *p2
 }
 
 #define POD_SWEEP_LIMIT 1024
 
-static void
-p2m_pod_emergency_sweep_super(struct p2m_domain *p2m)
-{
-    unsigned long i, start, limit;
-
-    if ( p2m->pod.reclaim_super == 0 )
-    {
-        p2m->pod.reclaim_super = (p2m->pod.max_guest>>PAGE_ORDER_2M)<<PAGE_ORDER_2M;
-        p2m->pod.reclaim_super -= SUPERPAGE_PAGES;
-    }
-
-    start = p2m->pod.reclaim_super;
-    limit = (start > POD_SWEEP_LIMIT) ? (start - POD_SWEEP_LIMIT) : 0;
-
-    for ( i=p2m->pod.reclaim_super ; i > 0 ; i -= SUPERPAGE_PAGES )
-    {
-        p2m_pod_zero_check_superpage(p2m, i);
-        /* Stop if we're past our limit and we have found *something*.
-         *
-         * NB that this is a zero-sum game; we're increasing our cache size
-         * by increasing our 'debt'.  Since we hold the p2m lock,
-         * (entry_count - count) must remain the same. */
-        if ( !page_list_empty(&p2m->pod.super) && i < limit )
-            break;
-    }
-
-    p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
-}
 
 /* When populating a new superpage, look at recently populated superpages
  * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
@@ -1039,27 +1011,12 @@ p2m_pod_demand_populate(struct p2m_domai
         return 0;
     }
 
-    /* Once we've ballooned down enough that we can fill the remaining
-     * PoD entries from the cache, don't sweep even if the particular
-     * list we want to use is empty: that can lead to thrashing zero pages
-     * through the cache for no good reason. */
-    if ( p2m->pod.entry_count > p2m->pod.count )
-    {
+    /* Only sweep if we're actually out of memory.  Doing anything else
+     * causes unnecessary time and fragmentation of superpages in the p2m. */
+    if ( p2m->pod.count == 0 )
+        p2m_pod_emergency_sweep(p2m);
 
-        /* If we're low, start a sweep */
-        if ( order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) )
-            /* Note that sweeps scan other ranges in the p2m.  In an scenario
-             * in which p2m locks are fine-grained, this may result in deadlock.
-             * Using trylock on the gfn's as we sweep would avoid it. */
-            p2m_pod_emergency_sweep_super(p2m);
-
-        if ( page_list_empty(&p2m->pod.single) &&
-             ( ( order == PAGE_ORDER_4K )
-               || (order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) ) ) )
-            /* Same comment regarding deadlock applies */
-            p2m_pod_emergency_sweep(p2m);
-    }
-
+    /* If the sweep failed, give up. */
     if ( p2m->pod.count == 0 )
         goto out_of_memory;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -284,7 +284,6 @@ struct p2m_domain {
                  single;       /* Non-super lists */
         int      count,        /* # of pages in cache lists */
                  entry_count;  /* # of pages in p2m marked pod */
-        unsigned reclaim_super;  /* Last gpfn of a scan */
         unsigned reclaim_single; /* Last gpfn of a scan */
         unsigned max_guest;      /* gpfn of max guest demand-populate */
 #define POD_HISTORY_MAX 128
Tim Deegan
2012-Jun-28 15:16 UTC
Re: [PATCH 0 of 3 v3] xen, pod: Populate-on-demand reclaim improvements
At 15:19 +0100 on 28 Jun (1340896781), George Dunlap wrote:
> xen,pod: Populate-on-demand reclaim improvements
>
> Rework populate-on-demand sweeping
>
> Last summer I did some work on testing whether our PoD sweeping code
> was achieving its goals: namely, never crashing unnecessarily,
> minimizing boot time, and maximizing the number of superpages in the
> p2m table.
>
> This is the resulting patch series.

Applied, thanks.

Tim.