George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 0 of 3 v3] xen, pod: Populate-on-demand reclaim improvements
xen,pod: Populate-on-demand reclaim improvements

Rework populate-on-demand sweeping

Last summer I did some work on testing whether our PoD sweeping code
was achieving its goals: namely, never crashing unnecessarily,
minimizing boot time, and maximizing the number of superpages in the
p2m table.

This is the resulting patch series.

v2:
 - Move cosmetic code-motion hunk into its own patch
 - Address various comments
 - Include a simplified version of the balloon reclamation patch

v3:
 - Remove code motion patch (already checked in)
 - Move balloon patch to front
 - Add SoB to balloon patch
 - More clean-ups to "checklast" and remove-supersweep patches
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 1 of 3 v3] xen, pod: Try to reclaim superpages when ballooning down
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893080 -3600
# Node ID fb0187ae8a20d0850dea0cd3e4167503411e5950
# Parent  52f1b8a4f9a4cb454b6fea1220cc6a09cf401a42
xen,pod: Try to reclaim superpages when ballooning down

Windows balloon drivers can typically only get 4k pages from the
kernel, and so hand them back at that level.  Try to regain superpages
by checking the superpage frame that the 4k page is in to see if we
can reclaim the whole thing for the PoD cache.

This also modifies p2m_pod_zero_check_superpage() to return
SUPERPAGE_PAGES on success.

v2:
 - Rewritten to simply do the check as in the demand-fault case,
   without needing to know that the p2m entry is a superpage.
 - Also, took out the re-writing of the reclaim loop, leaving it
   optimized for 4k pages (by far the most common case), and
   simplifying the patch.

v3:
 - Add SoB

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -488,6 +488,10 @@ p2m_pod_offline_or_broken_replace(struct
     return;
 }
 
+static int
+p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn);
+
+
 /* This function is needed for two reasons:
  * + To properly handle clearing of PoD entries
  * + To "steal back" memory being freed for the PoD cache, rather than
@@ -505,8 +509,8 @@ p2m_pod_decrease_reservation(struct doma
     int i;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
-    int steal_for_cache = 0;
-    int pod = 0, nonpod = 0, ram = 0;
+    int steal_for_cache;
+    int pod, nonpod, ram;
 
     gfn_lock(p2m, gpfn, order);
     pod_lock(p2m);
@@ -516,13 +520,15 @@ p2m_pod_decrease_reservation(struct doma
     if ( p2m->pod.entry_count == 0 )
         goto out_unlock;
 
+    if ( unlikely(d->is_dying) )
+        goto out_unlock;
+
+recount:
+    pod = nonpod = ram = 0;
+
     /* Figure out if we need to steal some freed memory for our cache */
     steal_for_cache =  ( p2m->pod.entry_count > p2m->pod.count );
 
-    if ( unlikely(d->is_dying) )
-        goto out_unlock;
-
-    /* See what's in here. */
     /* FIXME: Add contiguous; query for PSE entries? */
     for ( i=0; i<(1<<order); i++)
     {
@@ -556,7 +562,16 @@ p2m_pod_decrease_reservation(struct doma
         goto out_entry_check;
     }
 
-    /* FIXME: Steal contig 2-meg regions for cache */
+    /* Try to grab entire superpages if possible.  Since the common case
+     * is for drivers to pass back singleton pages, see if we can take
+     * the whole page back and mark the rest PoD. */
+    if ( steal_for_cache
+         && p2m_pod_zero_check_superpage(p2m, gpfn & ~(SUPERPAGE_PAGES-1)))
+    {
+        /* Since order may be arbitrary, we may have taken more or less
+         * than we were actually asked to; so just re-count from scratch */
+        goto recount;
+    }
 
     /* Process as long as:
      * + There are PoD entries to handle, or
@@ -758,6 +773,8 @@ p2m_pod_zero_check_superpage(struct p2m_
     p2m_pod_cache_add(p2m, mfn_to_page(mfn0), PAGE_ORDER_2M);
     p2m->pod.entry_count += SUPERPAGE_PAGES;
 
+    ret = SUPERPAGE_PAGES;
+
 out_reset:
     if ( reset )
         set_p2m_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 2 of 3 v3] xen, pod: Zero-check recently populated pages (checklast)
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893083 -3600
# Node ID 9de241075c7f622758f00223805b0279635ff4d9
# Parent  fb0187ae8a20d0850dea0cd3e4167503411e5950
xen,pod: Zero-check recently populated pages (checklast)

When demand-populating pages due to guest accesses, check recently
populated pages to see if we can reclaim them for the cache.  This
should keep the PoD cache filled when the start-of-day scrubber is
going through.

The number 128 was chosen by experiment.  Windows does its page
scrubbing in parallel; while a small number like 4 works well for
single VMs, it breaks down as multiple vcpus are scrubbing different
pages in parallel.  Increasing to 128 works well for higher numbers of
vcpus.

v2:
 - Wrapped some long lines
 - unsigned int for index, unsigned long for array

v3:
 - Use PAGE_ORDER_2M instead of 9
 - Removed inappropriate use of p2m_pod_zero_check_superpage() return value

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -926,6 +926,27 @@ p2m_pod_emergency_sweep_super(struct p2m
     p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
 }
 
+/* When populating a new superpage, look at recently populated superpages
+ * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
+ * the guest OS is done with them. */
+static void
+p2m_pod_check_last_super(struct p2m_domain *p2m, unsigned long gfn_aligned)
+{
+    unsigned long check_gfn;
+
+    ASSERT(p2m->pod.last_populated_index < POD_HISTORY_MAX);
+
+    check_gfn = p2m->pod.last_populated[p2m->pod.last_populated_index];
+
+    p2m->pod.last_populated[p2m->pod.last_populated_index] = gfn_aligned;
+
+    p2m->pod.last_populated_index =
+        ( p2m->pod.last_populated_index + 1 ) % POD_HISTORY_MAX;
+
+    p2m_pod_zero_check_superpage(p2m, check_gfn);
+}
+
+
 #define POD_SWEEP_STRIDE  16
 static void
 p2m_pod_emergency_sweep(struct p2m_domain *p2m)
@@ -1083,6 +1104,12 @@ p2m_pod_demand_populate(struct p2m_domai
         __trace_var(TRC_MEM_POD_POPULATE, 0, sizeof(t), &t);
     }
 
+    /* Check the last guest demand-populate */
+    if ( p2m->pod.entry_count > p2m->pod.count
+         && (order == PAGE_ORDER_2M)
+         && (q & P2M_ALLOC) )
+        p2m_pod_check_last_super(p2m, gfn_aligned);
+
     pod_unlock(p2m);
     return 0;
 out_of_memory:
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -287,6 +287,10 @@ struct p2m_domain {
         unsigned reclaim_super;  /* Last gpfn of a scan */
         unsigned reclaim_single; /* Last gpfn of a scan */
         unsigned max_guest;      /* gpfn of max guest demand-populate */
+#define POD_HISTORY_MAX 128
+        /* gpfn of last guest superpage demand-populated */
+        unsigned long last_populated[POD_HISTORY_MAX];
+        unsigned int last_populated_index;
         mm_lock_t lock;          /* Locking of private pod structs,   *
                                   * not relying on the p2m lock.      */
     } pod;
George Dunlap
2012-Jun-28 14:19 UTC
[PATCH 3 of 3 v3] xen, pod: Only sweep in an emergency, and only for 4k pages
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340893085 -3600
# Node ID 90f2f1728f906b1fb2e2d70e5a88b54d0fc190d8
# Parent  9de241075c7f622758f00223805b0279635ff4d9
xen,pod: Only sweep in an emergency, and only for 4k pages

Testing has shown that doing sweeps for superpages slows down boot
significantly, but does not result in a significantly higher number of
superpages after boot.  Early sweeping for 4k pages causes superpages
to be broken up unnecessarily.

Only sweep if we're really out of memory.

v2:
 - Move unrelated code-motion hunk to another patch

v3:
 - Remove now-unused reclaim_super from pod struct

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -897,34 +897,6 @@ p2m_pod_zero_check(struct p2m_domain *p2
 }
 
 #define POD_SWEEP_LIMIT 1024
 
-static void
-p2m_pod_emergency_sweep_super(struct p2m_domain *p2m)
-{
-    unsigned long i, start, limit;
-
-    if ( p2m->pod.reclaim_super == 0 )
-    {
-        p2m->pod.reclaim_super = (p2m->pod.max_guest>>PAGE_ORDER_2M)<<PAGE_ORDER_2M;
-        p2m->pod.reclaim_super -= SUPERPAGE_PAGES;
-    }
-
-    start = p2m->pod.reclaim_super;
-    limit = (start > POD_SWEEP_LIMIT) ? (start - POD_SWEEP_LIMIT) : 0;
-
-    for ( i=p2m->pod.reclaim_super ; i > 0 ; i -= SUPERPAGE_PAGES )
-    {
-        p2m_pod_zero_check_superpage(p2m, i);
-        /* Stop if we're past our limit and we have found *something*.
-         *
-         * NB that this is a zero-sum game; we're increasing our cache size
-         * by increasing our 'debt'.  Since we hold the p2m lock,
-         * (entry_count - count) must remain the same. */
-        if ( !page_list_empty(&p2m->pod.super) && i < limit )
-            break;
-    }
-
-    p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
-}
 
 /* When populating a new superpage, look at recently populated superpages
  * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
@@ -1039,27 +1011,12 @@ p2m_pod_demand_populate(struct p2m_domai
         return 0;
     }
 
-    /* Once we've ballooned down enough that we can fill the remaining
-     * PoD entries from the cache, don't sweep even if the particular
-     * list we want to use is empty: that can lead to thrashing zero pages
-     * through the cache for no good reason. */
-    if ( p2m->pod.entry_count > p2m->pod.count )
-    {
+    /* Only sweep if we're actually out of memory.  Doing anything else
+     * causes unnecessary time and fragmentation of superpages in the p2m. */
+    if ( p2m->pod.count == 0 )
+        p2m_pod_emergency_sweep(p2m);
 
-        /* If we're low, start a sweep */
-        if ( order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) )
-            /* Note that sweeps scan other ranges in the p2m.  In an scenario
-             * in which p2m locks are fine-grained, this may result in deadlock.
-             * Using trylock on the gfn's as we sweep would avoid it. */
-            p2m_pod_emergency_sweep_super(p2m);
-
-        if ( page_list_empty(&p2m->pod.single) &&
-             ( ( order == PAGE_ORDER_4K )
-               || (order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) ) ) )
-            /* Same comment regarding deadlock applies */
-            p2m_pod_emergency_sweep(p2m);
-    }
-
+    /* If the sweep failed, give up. */
     if ( p2m->pod.count == 0 )
         goto out_of_memory;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -284,7 +284,6 @@ struct p2m_domain {
                  single;       /* Non-super lists */
         int      count,        /* # of pages in cache lists */
                  entry_count;  /* # of pages in p2m marked pod */
-        unsigned reclaim_super;  /* Last gpfn of a scan */
         unsigned reclaim_single; /* Last gpfn of a scan */
         unsigned max_guest;      /* gpfn of max guest demand-populate */
 #define POD_HISTORY_MAX 128
Tim Deegan
2012-Jun-28 15:16 UTC
Re: [PATCH 0 of 3 v3] xen, pod: Populate-on-demand reclaim improvements
At 15:19 +0100 on 28 Jun (1340896781), George Dunlap wrote:
> xen,pod: Populate-on-demand reclaim improvements
>
> Rework populate-on-demand sweeping
>
> Last summer I did some work on testing whether our PoD sweeping code
> was achieving its goals: namely, never crashing unnecessarily,
> minimizing boot time, and maximizing the number of superpages in the
> p2m table.
>
> This is the resulting patch series.

Applied, thanks.

Tim.