George Dunlap
2012-Jun-27 16:57 UTC
[PATCH 0 of 4] xen, pod: Populate-on-demand reclaim improvements
xen,pod: Populate-on-demand reclaim improvements

Rework populate-on-demand sweeping

Last summer I did some work on testing whether our PoD sweeping code
was achieving its goals: namely, never crashing unnecessarily,
minimizing boot time, and maximizing the number of superpages in the
p2m table.

This is the resulting patch series.

v2:
 - Move cosmetic code-motion hunk into its own patch
 - Address various comments
 - Include a simplified version of the balloon reclamation patch
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340815810 -3600
# Node ID b4e1fec1c98f6cbad666a972f473854518c25500
# Parent b91ef972029ddaa110af6463171715ab9070c9d8
xen,pod: Cosmetic code motion

No point in doing the assignment if we're just going to crash anyway.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -1022,13 +1022,13 @@ p2m_pod_demand_populate(struct p2m_domai
         p2m_pod_emergency_sweep(p2m);
     }
 
+    if ( p2m->pod.count == 0 )
+        goto out_of_memory;
+
     /* Keep track of the highest gfn demand-populated by a guest fault */
     if ( gfn > p2m->pod.max_guest )
         p2m->pod.max_guest = gfn;
 
-    if ( p2m->pod.count == 0 )
-        goto out_of_memory;
-
     /* Get a page f/ the cache.  A NULL return value indicates that the
      * 2-meg range should be marked singleton PoD, and retried */
     if ( (p = p2m_pod_cache_get(p2m, order)) == NULL )
George Dunlap
2012-Jun-27 16:57 UTC
[PATCH 2 of 4] xen, pod: Zero-check recently populated pages (checklast)
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340815812 -3600
# Node ID ea827c449088a1017b6e5a9564eb33df70f8a9c6
# Parent b4e1fec1c98f6cbad666a972f473854518c25500
xen,pod: Zero-check recently populated pages (checklast)

When demand-populating pages due to guest accesses, check recently populated
pages to see if we can reclaim them for the cache.  This should keep the PoD
cache filled when the start-of-day scrubber is going through.

The number 128 was chosen by experiment.  Windows does its page
scrubbing in parallel; while a small number like 4 works well for
single VMs, it breaks down as multiple vcpus are scrubbing different
pages in parallel.  Increasing to 128 works well for higher numbers of
vcpus.

v2:
 - Wrapped some long lines
 - unsigned int for index, unsigned long for array

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -909,6 +909,27 @@ p2m_pod_emergency_sweep_super(struct p2m
     p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
 }
 
+/* When populating a new superpage, look at recently populated superpages
+ * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
+ * the guest OS is done with them. */
+static void
+p2m_pod_check_last_super(struct p2m_domain *p2m, unsigned long gfn_aligned)
+{
+    unsigned long check_gfn;
+
+    ASSERT(p2m->pod.last_populated_index < POD_HISTORY_MAX);
+
+    check_gfn = p2m->pod.last_populated[p2m->pod.last_populated_index];
+
+    p2m->pod.last_populated[p2m->pod.last_populated_index] = gfn_aligned;
+
+    p2m->pod.last_populated_index =
+        ( p2m->pod.last_populated_index + 1 ) % POD_HISTORY_MAX;
+
+    p2m->pod.reclaim_super += p2m_pod_zero_check_superpage(p2m, check_gfn);
+}
+
+
 #define POD_SWEEP_STRIDE  16
 static void
 p2m_pod_emergency_sweep(struct p2m_domain *p2m)
@@ -1066,6 +1087,12 @@ p2m_pod_demand_populate(struct p2m_domai
         __trace_var(TRC_MEM_POD_POPULATE, 0, sizeof(t), &t);
     }
 
+    /* Check the last guest demand-populate */
+    if ( p2m->pod.entry_count > p2m->pod.count
+         && order == 9
+         && q & P2M_ALLOC )
+        p2m_pod_check_last_super(p2m, gfn_aligned);
+
     pod_unlock(p2m);
     return 0;
 out_of_memory:
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -287,6 +287,10 @@ struct p2m_domain {
         unsigned      reclaim_super;  /* Last gpfn of a scan */
         unsigned      reclaim_single; /* Last gpfn of a scan */
         unsigned      max_guest;      /* gpfn of max guest demand-populate */
+#define POD_HISTORY_MAX 128
+        /* gpfn of last guest superpage demand-populated */
+        unsigned long last_populated[POD_HISTORY_MAX];
+        unsigned int last_populated_index;
         mm_lock_t     lock;           /* Locking of private pod structs,   *
                                        * not relying on the p2m lock.      */
     } pod;
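[For readers skimming the diff: the bookkeeping this patch adds is a plain
fixed-size ring buffer of recently populated superpage gfns.  Below is a
minimal standalone sketch of that pattern; the pod_history struct and
history_swap() helper are hypothetical stand-ins for Xen's struct p2m_domain
and p2m_pod_check_last_super(), and only the index arithmetic follows the
patch.]

#include <stdio.h>

#define POD_HISTORY_MAX 128

/* Hypothetical stand-in for the pod fields added to struct p2m_domain. */
struct pod_history {
    unsigned long last_populated[POD_HISTORY_MAX];
    unsigned int  last_populated_index;
};

/* Record a newly populated superpage gfn in the current slot and return
 * the gfn it evicts; the caller would zero-check the evicted gfn, as
 * p2m_pod_check_last_super() does. */
static unsigned long history_swap(struct pod_history *h, unsigned long gfn)
{
    unsigned long evicted = h->last_populated[h->last_populated_index];

    h->last_populated[h->last_populated_index] = gfn;
    h->last_populated_index =
        ( h->last_populated_index + 1 ) % POD_HISTORY_MAX;

    return evicted;
}

int main(void)
{
    struct pod_history h = { { 0 }, 0 };
    unsigned long gfn;

    /* After POD_HISTORY_MAX populations the ring wraps, and each new
     * population hands back the gfn populated 128 faults earlier. */
    for ( gfn = 0x1000; gfn < 0x1000 + 130 * 512; gfn += 512 )
    {
        unsigned long check = history_swap(&h, gfn);
        if ( check )
            printf("would zero-check gfn %#lx\n", check);
    }
    return 0;
}

[The 128-entry delay means each gfn is zero-checked roughly 128 populations
after it was handed to the guest, by which time the guest's parallel
scrubbers have usually finished zeroing it.]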
George Dunlap
2012-Jun-27 16:57 UTC
[PATCH 3 of 4] xen, pod: Only sweep in an emergency, and only for 4k pages
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340815812 -3600
# Node ID c71f52608fd8867062cc40a1354305f2af17b2c3
# Parent ea827c449088a1017b6e5a9564eb33df70f8a9c6
xen,pod: Only sweep in an emergency, and only for 4k pages

Testing has shown that doing sweeps for superpages slows down boot
significantly, but does not result in a significantly higher number of
superpages after boot.  Early sweeping for 4k pages causes superpages
to be broken up unnecessarily.

Only sweep if we're really out of memory.

v2:
 - Move unrelated code-motion hunk to another patch

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -880,34 +880,6 @@ p2m_pod_zero_check(struct p2m_domain *p2
 }
 
 #define POD_SWEEP_LIMIT 1024
-static void
-p2m_pod_emergency_sweep_super(struct p2m_domain *p2m)
-{
-    unsigned long i, start, limit;
-
-    if ( p2m->pod.reclaim_super == 0 )
-    {
-        p2m->pod.reclaim_super = (p2m->pod.max_guest>>PAGE_ORDER_2M)<<PAGE_ORDER_2M;
-        p2m->pod.reclaim_super -= SUPERPAGE_PAGES;
-    }
-
-    start = p2m->pod.reclaim_super;
-    limit = (start > POD_SWEEP_LIMIT) ? (start - POD_SWEEP_LIMIT) : 0;
-
-    for ( i=p2m->pod.reclaim_super ; i > 0 ; i -= SUPERPAGE_PAGES )
-    {
-        p2m_pod_zero_check_superpage(p2m, i);
-        /* Stop if we're past our limit and we have found *something*.
-         *
-         * NB that this is a zero-sum game; we're increasing our cache size
-         * by increasing our 'debt'.  Since we hold the p2m lock,
-         * (entry_count - count) must remain the same. */
-        if ( !page_list_empty(&p2m->pod.super) && i < limit )
-            break;
-    }
-
-    p2m->pod.reclaim_super = i ? i - SUPERPAGE_PAGES : 0;
-}
 
 /* When populating a new superpage, look at recently populated superpages
  * hoping that they've been zeroed.  This will snap up zeroed pages as soon as
@@ -1022,27 +994,12 @@ p2m_pod_demand_populate(struct p2m_domai
         return 0;
     }
 
-    /* Once we've ballooned down enough that we can fill the remaining
-     * PoD entries from the cache, don't sweep even if the particular
-     * list we want to use is empty: that can lead to thrashing zero pages
-     * through the cache for no good reason.  */
-    if ( p2m->pod.entry_count > p2m->pod.count )
-    {
+    /* Only sweep if we're actually out of memory.  Doing anything else
+     * causes unnecessary time and fragmentation of superpages in the p2m. */
+    if ( p2m->pod.count == 0 )
+        p2m_pod_emergency_sweep(p2m);
 
-        /* If we're low, start a sweep */
-        if ( order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) )
-            /* Note that sweeps scan other ranges in the p2m.  In an scenario
-             * in which p2m locks are fine-grained, this may result in deadlock.
-             * Using trylock on the gfn's as we sweep would avoid it. */
-            p2m_pod_emergency_sweep_super(p2m);
-
-        if ( page_list_empty(&p2m->pod.single) &&
-             ( ( order == PAGE_ORDER_4K )
-               || (order == PAGE_ORDER_2M && page_list_empty(&p2m->pod.super) ) ) )
-            /* Same comment regarding deadlock applies */
-            p2m_pod_emergency_sweep(p2m);
-    }
-
+    /* If the sweep failed, give up. */
     if ( p2m->pod.count == 0 )
         goto out_of_memory;
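[The new policy in p2m_pod_demand_populate() boils down to two consecutive
checks.  Here is a toy sketch of just that gating logic, assuming a
hypothetical pod_cache struct and emergency_sweep() in place of Xen's
struct p2m_domain and p2m_pod_emergency_sweep().]

#include <stdio.h>

/* Toy stand-in for the pod fields of struct p2m_domain. */
struct pod_cache { unsigned long count; };

/* Pretend the sweep manages to reclaim two zeroed pages. */
static void emergency_sweep(struct pod_cache *c)
{
    c->count += 2;
}

static int demand_populate(struct pod_cache *c)
{
    /* Only sweep if we're actually out of memory; sweeping earlier
     * costs boot time and fragments superpages in the p2m. */
    if ( c->count == 0 )
        emergency_sweep(c);

    /* If the sweep failed, give up (Xen crashes the domain here). */
    if ( c->count == 0 )
        return -1;

    c->count--;    /* hand a cache page to the guest */
    return 0;
}

int main(void)
{
    struct pod_cache c = { 0 };

    printf("populate: %d (cache now %lu)\n", demand_populate(&c), c.count);
    printf("populate: %d (cache now %lu)\n", demand_populate(&c), c.count);
    return 0;
}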
George Dunlap
2012-Jun-27 16:57 UTC
[PATCH 4 of 4] xen, pod: Try to reclaim superpages when ballooning down
# HG changeset patch
# User George Dunlap <george.dunlap@eu.citrix.com>
# Date 1340815812 -3600
# Node ID 71a22d6d940f27d8dfbcfc12d1377e4622f981bd
# Parent c71f52608fd8867062cc40a1354305f2af17b2c3
xen,pod: Try to reclaim superpages when ballooning down

Windows balloon drivers can typically only get 4k pages from the kernel,
and so hand them back at that level.  Try to regain superpages by checking
the superpage frame that the 4k page is in to see if we can reclaim the whole
thing for the PoD cache.

This also modifies p2m_pod_zero_check_superpage() to return SUPERPAGE_PAGES on
success.

v2:
 - Rewritten to simply do the check as in the demand-fault case, without
   needing to know that the p2m entry is a superpage.
 - Also, took out the re-writing of the reclaim loop, leaving it optimized
   for 4k pages (by far the most common case), and simplifying the patch.

diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -488,6 +488,10 @@ p2m_pod_offline_or_broken_replace(struct
     return;
 }
 
+static int
+p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn);
+
+
 /* This function is needed for two reasons:
  * + To properly handle clearing of PoD entries
  * + To "steal back" memory being freed for the PoD cache, rather than
@@ -505,8 +509,8 @@ p2m_pod_decrease_reservation(struct doma
     int i;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
-    int steal_for_cache = 0;
-    int pod = 0, nonpod = 0, ram = 0;
+    int steal_for_cache;
+    int pod, nonpod, ram;
 
     gfn_lock(p2m, gpfn, order);
     pod_lock(p2m);
@@ -516,13 +520,15 @@ p2m_pod_decrease_reservation(struct doma
     if ( p2m->pod.entry_count == 0 )
         goto out_unlock;
 
+    if ( unlikely(d->is_dying) )
+        goto out_unlock;
+
+recount:
+    pod = nonpod = ram = 0;
+
     /* Figure out if we need to steal some freed memory for our cache */
     steal_for_cache = ( p2m->pod.entry_count > p2m->pod.count );
 
-    if ( unlikely(d->is_dying) )
-        goto out_unlock;
-
-    /* See what's in here. */
     /* FIXME: Add contiguous; query for PSE entries? */
     for ( i=0; i<(1<<order); i++)
     {
@@ -556,7 +562,16 @@ p2m_pod_decrease_reservation(struct doma
         goto out_entry_check;
     }
 
-    /* FIXME: Steal contig 2-meg regions for cache */
+    /* Try to grab entire superpages if possible.  Since the common case is for drivers
+     * to pass back singleton pages, see if we can take the whole page back and mark the
+     * rest PoD. */
+    if ( steal_for_cache
+         && p2m_pod_zero_check_superpage(p2m, gpfn & ~(SUPERPAGE_PAGES-1)))
+    {
+        /* Since order may be arbitrary, we may have taken more or less
+         * than we were actually asked to; so just re-count from scratch */
+        goto recount;
+    }
 
     /* Process as long as:
      * + There are PoD entries to handle, or
@@ -758,6 +773,8 @@ p2m_pod_zero_check_superpage(struct p2m_
     p2m_pod_cache_add(p2m, mfn_to_page(mfn0), PAGE_ORDER_2M);
     p2m->pod.entry_count += SUPERPAGE_PAGES;
 
+    ret = SUPERPAGE_PAGES;
+
 out_reset:
     if ( reset )
         set_p2m_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
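[The key step this patch adds is masking the returned 4k gfn down to the
base of its containing 2M superpage frame before running the zero-check.
A standalone sketch of just that alignment step: SUPERPAGE_PAGES is 512,
i.e. 2M/4k, as in Xen, and the example gfn is made up.]

#include <stdio.h>

#define SUPERPAGE_PAGES 512UL   /* 2M / 4k, as in Xen */

int main(void)
{
    /* Arbitrary 4k gfn handed back by a (Windows) balloon driver. */
    unsigned long gpfn = 0x12345;

    /* Round down to the base of the containing 2M superpage frame;
     * this is the gfn the patch passes to p2m_pod_zero_check_superpage(). */
    unsigned long base = gpfn & ~(SUPERPAGE_PAGES - 1);

    printf("gfn %#lx lies in superpage frame %#lx..%#lx\n",
           gpfn, base, base + SUPERPAGE_PAGES - 1);
    return 0;
}

[Because a successful check reclaims all 512 pages at once -- possibly more
than the caller asked to release -- the patch re-counts the pod/nonpod/ram
totals from scratch ("goto recount") rather than adjusting them
incrementally.]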
At 17:57 +0100 on 27 Jun (1340819848), George Dunlap wrote:
> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1340815810 -3600
> # Node ID b4e1fec1c98f6cbad666a972f473854518c25500
> # Parent b91ef972029ddaa110af6463171715ab9070c9d8
> xen,pod: Cosmetic code motion
>
> No point in doing the assignment if we're just going to crash anyway.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

Applied, thanks.

Tim.
Tim Deegan
2012-Jun-28 12:45 UTC
Re: [PATCH 4 of 4] xen, pod: Try to reclaim superpages when ballooning down
At 17:57 +0100 on 27 Jun (1340819851), George Dunlap wrote:
> xen,pod: Try to reclaim superpages when ballooning down
>
> Windows balloon drivers can typically only get 4k pages from the kernel,
> and so hand them back at that level.  Try to regain superpages by checking
> the superpage frame that the 4k page is in to see if we can reclaim the
> whole thing for the PoD cache.
>
> This also modifies p2m_pod_zero_check_superpage() to return
> SUPERPAGE_PAGES on success.
>
> v2:
>  - Rewritten to simply do the check as in the demand-fault case, without
>    needing to know that the p2m entry is a superpage.
>  - Also, took out the re-writing of the reclaim loop, leaving it optimized
>    for 4k pages (by far the most common case), and simplifying the patch.

Acked-by: Tim Deegan <tim@xen.org>
Tim Deegan
2012-Jun-28 12:48 UTC
Re: [PATCH 3 of 4] xen, pod: Only sweep in an emergency, and only for 4k pages
At 17:57 +0100 on 27 Jun (1340819850), George Dunlap wrote:
> xen,pod: Only sweep in an emergency, and only for 4k pages
>
> Testing has shown that doing sweeps for superpages slows down boot
> significantly, but does not result in a significantly higher number of
> superpages after boot.  Early sweeping for 4k pages causes superpages
> to be broken up unnecessarily.
>
> Only sweep if we're really out of memory.
>
> v2:
>  - Move unrelated code-motion hunk to another patch
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

Acked-by: Tim Deegan <tim@xen.org>
Tim Deegan
2012-Jun-28 12:48 UTC
Re: [PATCH 2 of 4] xen, pod: Zero-check recently populated pages (checklast)
At 17:57 +0100 on 27 Jun (1340819849), George Dunlap wrote:
> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1340815812 -3600
> # Node ID ea827c449088a1017b6e5a9564eb33df70f8a9c6
> # Parent b4e1fec1c98f6cbad666a972f473854518c25500
> xen,pod: Zero-check recently populated pages (checklast)
>
> When demand-populating pages due to guest accesses, check recently populated
> pages to see if we can reclaim them for the cache.  This should keep the PoD
> cache filled when the start-of-day scrubber is going through.
>
> The number 128 was chosen by experiment.  Windows does its page
> scrubbing in parallel; while a small number like 4 works well for
> single VMs, it breaks down as multiple vcpus are scrubbing different
> pages in parallel.  Increasing to 128 works well for higher numbers of
> vcpus.
>
> v2:
>  - Wrapped some long lines
>  - unsigned int for index, unsigned long for array
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

Acked-by: Tim Deegan <tim@xen.org>
Tim Deegan
2012-Jun-28 12:53 UTC
Re: [PATCH 0 of 4] xen, pod: Populate-on-demand reclaim improvements
At 17:57 +0100 on 27 Jun (1340819847), George Dunlap wrote:
> xen,pod: Populate-on-demand reclaim improvements
>
> Rework populate-on-demand sweeping
>
> Last summer I did some work on testing whether our PoD sweeping code
> was achieving its goals: namely, never crashing unnecessarily,
> minimizing boot time, and maximizing the number of superpages in the
> p2m table.
>
> This is the resulting patch series.

I've acked these, but only applied 1/4, because:

 - patch 4 needs an S-o-B line;
 - patch 2 can't go in before patch 4 because it depends on the new
   return value of p2m_pod_zero_check_superpage(); and
 - patch 3 can't go in before patch 2 or it removes the only caller of
   p2m_pod_zero_check_superpage() and breaks the build. :)

If you reshuffle them as 4, 2, 3, add an S-o-B and address Andres's
comment about the order constant, I'll push them straight in next time.

Cheers,

Tim.
George Dunlap
2012-Jun-28 13:04 UTC
Re: [PATCH 0 of 4] xen, pod: Populate-on-demand reclaim improvements
On Thu, Jun 28, 2012 at 1:53 PM, Tim Deegan <tim@xen.org> wrote:
> At 17:57 +0100 on 27 Jun (1340819847), George Dunlap wrote:
>> xen,pod: Populate-on-demand reclaim improvements
>>
>> Rework populate-on-demand sweeping
>>
>> Last summer I did some work on testing whether our PoD sweeping code
>> was achieving its goals: namely, never crashing unnecessarily,
>> minimizing boot time, and maximizing the number of superpages in the
>> p2m table.
>>
>> This is the resulting patch series.
>
> I've acked these, but only applied 1/4, because:
>
>  - patch 4 needs an S-o-B line;
>  - patch 2 can't go in before patch 4 because it depends on the new
>    return value of p2m_pod_zero_check_superpage(); and
>  - patch 3 can't go in before patch 2 or it removes the only caller of
>    p2m_pod_zero_check_superpage() and breaks the build. :)
>
> If you reshuffle them as 4, 2, 3, add an S-o-B and address Andres's
> comment about the order constant, I'll push them straight in next time.

Oops! Reordering sounds good. I should have a respin this afternoon.

 -George
George Dunlap
2012-Jun-28 13:51 UTC
Re: [PATCH 0 of 4] xen, pod: Populate-on-demand reclaim improvements
On Thu, Jun 28, 2012 at 1:53 PM, Tim Deegan <tim@xen.org> wrote:
> At 17:57 +0100 on 27 Jun (1340819847), George Dunlap wrote:
>> xen,pod: Populate-on-demand reclaim improvements
>>
>> Rework populate-on-demand sweeping
>>
>> Last summer I did some work on testing whether our PoD sweeping code
>> was achieving its goals: namely, never crashing unnecessarily,
>> minimizing boot time, and maximizing the number of superpages in the
>> p2m table.
>>
>> This is the resulting patch series.
>
> I've acked these, but only applied 1/4, because:
>
>  - patch 4 needs an S-o-B line;
>  - patch 2 can't go in before patch 4 because it depends on the new
>    return value of p2m_pod_zero_check_superpage(); and

Hmm -- that's actually not right, and the only reason it doesn't cause
problems is because of the code taken out in patch 3.  Man, I hate
having to clean up after sloppy coders! ;-)

But unfortunately that means 2 and 3 will probably lack Acks when they
come back.

 -George