George Dunlap
2011-Jan-17 14:36 UTC
[Xen-devel] [PATCH 0 of 3] Miscellaneous populate-on-demand bugs
This patch series includes a series of bugs related to p2m, ept, and PoD code which were found as part of our XenServer product testing. Each of these fixes actual bugs, and the 3.4-based version of the patch has been tested thoroughly. (There may be bugs in porting the patches, but most of them are simple enough as to make it unlikely.) Each patch is conceptually independent, so they can each be considered for inclusion individually. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-17 14:36 UTC
[Xen-devel] [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
# HG changeset patch # User George Dunlap <george.dunlap@eu.citrix.com> # Date 1295274250 0 # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944 # Parent 75b6287626ee0b852d725543568001e99b13be5b p2m: Allow l2 pages to be replaced by superpages Allow second-level p2m pages to be replaced with superpage entries, freeing the p2m table page properly. This allows, for example, a sequence of 512 singleton zero pages to be replaced with a superpage populate-on-demand entry. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c --- a/xen/arch/x86/mm/hap/hap.c Fri Jan 14 16:38:51 2011 +0000 +++ b/xen/arch/x86/mm/hap/hap.c Mon Jan 17 14:24:10 2011 +0000 @@ -333,9 +333,11 @@ ASSERT(page_get_owner(pg) == d); /* Should have just the one ref we gave it in alloc_p2m_page() */ - if ( (pg->count_info & PGC_count_mask) != 1 ) - HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n", - pg->count_info, pg->u.inuse.type_info); + if ( (pg->count_info & PGC_count_mask) != 1 ) { + HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n", + pg, pg->count_info, pg->u.inuse.type_info); + WARN(); + } pg->count_info &= ~PGC_count_mask; /* Free should not decrement domain''s total allocation, since * these pages were allocated without an owner. */ diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c --- a/xen/arch/x86/mm/hap/p2m-ept.c Fri Jan 14 16:38:51 2011 +0000 +++ b/xen/arch/x86/mm/hap/p2m-ept.c Mon Jan 17 14:24:10 2011 +0000 @@ -317,6 +317,7 @@ int vtd_pte_present = 0; int needs_sync = 1; struct domain *d = p2m->domain; + struct page_info *intermediate_table = NULL; /* * the caller must make sure: @@ -369,6 +370,12 @@ /* No need to flush if the old entry wasn''t valid */ if ( !is_epte_present(ept_entry) ) needs_sync = 0; + else if ( order == 9 && ! ept_entry->sp ) + { + /* We''re replacing a non-SP page with a superpage. Make sure to + * handle freeing the table properly. */ + intermediate_table = mfn_to_page(ept_entry->mfn); + } if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) || (p2mt == p2m_ram_paging_in_start) ) @@ -487,6 +494,17 @@ } } + if ( intermediate_table ) + { + /* Release the old intermediate table. This has to be the + last thing we do, after the ept_sync_domain() and removal + from the iommu tables, so as to avoid a potential + use-after-free. */ + + page_list_del(intermediate_table, &d->arch.p2m->pages); + d->arch.paging.free_page(d, intermediate_table); + } + return rv; } diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c Fri Jan 14 16:38:51 2011 +0000 +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:10 2011 +0000 @@ -1361,9 +1361,15 @@ if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) && !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) { - P2M_ERROR("configure P2M table 4KB L2 entry with large page\n"); - domain_crash(p2m->domain); - goto out; + struct page_info *pg; + + /* We''re replacing a non-SP page with a superpage. Make sure to + * handle freeing the table properly. */ + pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry))); + paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn, + l1e_empty(), 2); + page_list_del(pg, &p2m->pages); + p2m->domain->arch.paging.free_page(p2m->domain, pg); } ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-17 14:36 UTC
[Xen-devel] [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted
# HG changeset patch # User George Dunlap <george.dunlap@eu.citrix.com> # Date 1295274253 0 # Node ID 55e123a24da84f3b83caa7a7332699df73aaa90d # Parent 366d675630fd6ecbd6228426b3f7723d8a9dd944 PoD: Allow pod_set_cache_target hypercall to be preempted For very large VMs, setting the cache target can take long enough that dom0 complains of soft lockups. Allow the hypercall to be preempted. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/domain.c --- a/xen/arch/x86/domain.c Mon Jan 17 14:24:10 2011 +0000 +++ b/xen/arch/x86/domain.c Mon Jan 17 14:24:13 2011 +0000 @@ -1653,8 +1653,8 @@ unsigned long nval = 0; va_list args; - BUG_ON(*id > 5); - BUG_ON(mask & (1U << *id)); + BUG_ON(id && *id > 5); + BUG_ON(id && (mask & (1U << *id))); va_start(args, mask); diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c Mon Jan 17 14:24:10 2011 +0000 +++ b/xen/arch/x86/mm.c Mon Jan 17 14:24:13 2011 +0000 @@ -4799,15 +4799,23 @@ rc = p2m_pod_set_mem_target(d, target.target_pages); } - p2m = p2m_get_hostp2m(d); - target.tot_pages = d->tot_pages; - target.pod_cache_pages = p2m->pod.count; - target.pod_entries = p2m->pod.entry_count; - - if ( copy_to_guest(arg, &target, 1) ) + if ( rc == -EAGAIN ) { - rc= -EFAULT; - goto pod_target_out_unlock; + rc = hypercall_create_continuation( + __HYPERVISOR_memory_op, "lh", op, arg); + } + else if ( rc >= 0 ) + { + p2m = p2m_get_hostp2m(d); + target.tot_pages = d->tot_pages; + target.pod_cache_pages = p2m->pod.count; + target.pod_entries = p2m->pod.entry_count; + + if ( copy_to_guest(arg, &target, 1) ) + { + rc= -EFAULT; + goto pod_target_out_unlock; + } } pod_target_out_unlock: diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:10 2011 +0000 +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:13 2011 +0000 @@ -435,7 +435,7 @@ /* Set the size of the cache, allocating or freeing as necessary. */ static int -p2m_pod_set_cache_target(struct p2m_domain *p2m, unsigned long pod_target) +p2m_pod_set_cache_target(struct p2m_domain *p2m, unsigned long pod_target, int preemptible) { struct domain *d = p2m->domain; int ret = 0; @@ -468,6 +468,12 @@ } p2m_pod_cache_add(p2m, page, order); + + if ( hypercall_preempt_check() && preemptible ) + { + ret = -EAGAIN; + goto out; + } } /* Decreasing the target */ @@ -512,6 +518,12 @@ put_page(page+i); put_page(page+i); + + if ( hypercall_preempt_check() && preemptible ) + { + ret = -EAGAIN; + goto out; + } } } @@ -589,7 +601,7 @@ ASSERT( pod_target >= p2m->pod.count ); - ret = p2m_pod_set_cache_target(p2m, pod_target); + ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/); out: p2m_unlock(p2m); @@ -753,7 +765,7 @@ /* If we''ve reduced our "liabilities" beyond our "assets", free some */ if ( p2m->pod.entry_count < p2m->pod.count ) { - p2m_pod_set_cache_target(p2m, p2m->pod.entry_count); + p2m_pod_set_cache_target(p2m, p2m->pod.entry_count, 0/*can''t preempt*/); } out_unlock: diff -r 366d675630fd -r 55e123a24da8 xen/arch/x86/x86_64/compat/mm.c --- a/xen/arch/x86/x86_64/compat/mm.c Mon Jan 17 14:24:10 2011 +0000 +++ b/xen/arch/x86/x86_64/compat/mm.c Mon Jan 17 14:24:13 2011 +0000 @@ -127,6 +127,9 @@ if ( rc < 0 ) break; + if ( rc == __HYPERVISOR_memory_op ) + hypercall_xlat_continuation(NULL, 0x2, nat, arg); + XLAT_pod_target(&cmp, nat); if ( copy_to_guest(arg, &cmp, 1) ) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-17 14:36 UTC
[Xen-devel] [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging
# HG changeset patch # User George Dunlap <george.dunlap@eu.citrix.com> # Date 1295274541 0 # Node ID dd81c7acd9b323e9d2e4e3a41b81783fc7df5342 # Parent 55e123a24da84f3b83caa7a7332699df73aaa90d PoD,hap: Fix logdirty mode when using hardware assisted paging When writing a writable p2m entry for a pfn, we need to mark the pfn dirty to avoid corruption when doing live migration. Marking the page dirty exposes another issue, where there are excessive sweeps for zero pages if there''s a mismatch between PoD entries and cache entries. Only sweep for zero pages if we actually need more memory. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> diff -r 55e123a24da8 -r dd81c7acd9b3 xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:13 2011 +0000 +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:29:01 2011 +0000 @@ -1142,14 +1142,22 @@ return 0; } - /* If we''re low, start a sweep */ - if ( order == 9 && page_list_empty(&p2m->pod.super) ) - p2m_pod_emergency_sweep_super(p2m); - - if ( page_list_empty(&p2m->pod.single) && - ( ( order == 0 ) - || (order == 9 && page_list_empty(&p2m->pod.super) ) ) ) - p2m_pod_emergency_sweep(p2m); + /* Once we''ve ballooned down enough that we can fill the remaining + * PoD entries from the cache, don''t sweep even if the particular + * list we want to use is empty: that can lead to thrashing zero pages + * through the cache for no good reason. */ + if ( p2m->pod.entry_count > p2m->pod.count ) + { + + /* If we''re low, start a sweep */ + if ( order == 9 && page_list_empty(&p2m->pod.super) ) + p2m_pod_emergency_sweep_super(p2m); + + if ( page_list_empty(&p2m->pod.single) && + ( ( order == 0 ) + || (order == 9 && page_list_empty(&p2m->pod.super) ) ) ) + p2m_pod_emergency_sweep(p2m); + } /* Keep track of the highest gfn demand-populated by a guest fault */ if ( q == p2m_guest && gfn > p2m->pod.max_guest ) @@ -1176,7 +1184,10 @@ set_p2m_entry(p2m, gfn_aligned, mfn, order, p2m_ram_rw, p2m->default_access); for( i = 0; i < (1UL << order); i++ ) + { set_gpfn_from_mfn(mfn_x(mfn) + i, gfn_aligned + i); + paging_mark_dirty(d, mfn_x(mfn) + i); + } p2m->pod.entry_count -= (1 << order); /* Lock: p2m */ BUG_ON(p2m->pod.entry_count < 0); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jan-19 12:22 UTC
Re: [Xen-devel] [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
Hi, At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote:> # HG changeset patch > # User George Dunlap <george.dunlap@eu.citrix.com> > # Date 1295274250 0 > # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944 > # Parent 75b6287626ee0b852d725543568001e99b13be5b > p2m: Allow l2 pages to be replaced by superpages > > Allow second-level p2m pages to be replaced with superpage entries, > freeing the p2m table page properly. This allows, for example, a > sequence of 512 singleton zero pages to be replaced with a superpage > populate-on-demand entry.This problem became more general under your feet - xen 4.1 supports 1GiB superpages as well, so the same thing needs to be fixed for them. (I understand that PoD only uses 2MiB superpages but to half-fix it invites future bugs.) Also, although in the EPT code you''re careful to do all the flushing &c before freeing the old page, in the vanilla p2m code you free the page even before writing the new l2e! Cheers, Tim.> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> > > diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c > --- a/xen/arch/x86/mm/hap/hap.c Fri Jan 14 16:38:51 2011 +0000 > +++ b/xen/arch/x86/mm/hap/hap.c Mon Jan 17 14:24:10 2011 +0000 > @@ -333,9 +333,11 @@ > > ASSERT(page_get_owner(pg) == d); > /* Should have just the one ref we gave it in alloc_p2m_page() */ > - if ( (pg->count_info & PGC_count_mask) != 1 ) > - HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n", > - pg->count_info, pg->u.inuse.type_info); > + if ( (pg->count_info & PGC_count_mask) != 1 ) { > + HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n", > + pg, pg->count_info, pg->u.inuse.type_info); > + WARN(); > + } > pg->count_info &= ~PGC_count_mask; > /* Free should not decrement domain''s total allocation, since > * these pages were allocated without an owner. */ > diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c > --- a/xen/arch/x86/mm/hap/p2m-ept.c Fri Jan 14 16:38:51 2011 +0000 > +++ b/xen/arch/x86/mm/hap/p2m-ept.c Mon Jan 17 14:24:10 2011 +0000 > @@ -317,6 +317,7 @@ > int vtd_pte_present = 0; > int needs_sync = 1; > struct domain *d = p2m->domain; > + struct page_info *intermediate_table = NULL; > > /* > * the caller must make sure: > @@ -369,6 +370,12 @@ > /* No need to flush if the old entry wasn''t valid */ > if ( !is_epte_present(ept_entry) ) > needs_sync = 0; > + else if ( order == 9 && ! ept_entry->sp ) > + { > + /* We''re replacing a non-SP page with a superpage. Make sure to > + * handle freeing the table properly. */ > + intermediate_table = mfn_to_page(ept_entry->mfn); > + } > > if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) || > (p2mt == p2m_ram_paging_in_start) ) > @@ -487,6 +494,17 @@ > } > } > > + if ( intermediate_table ) > + { > + /* Release the old intermediate table. This has to be the > + last thing we do, after the ept_sync_domain() and removal > + from the iommu tables, so as to avoid a potential > + use-after-free. */ > + > + page_list_del(intermediate_table, &d->arch.p2m->pages); > + d->arch.paging.free_page(d, intermediate_table); > + } > + > return rv; > } > > diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c > --- a/xen/arch/x86/mm/p2m.c Fri Jan 14 16:38:51 2011 +0000 > +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:10 2011 +0000 > @@ -1361,9 +1361,15 @@ > if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) && > !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) > { > - P2M_ERROR("configure P2M table 4KB L2 entry with large page\n"); > - domain_crash(p2m->domain); > - goto out; > + struct page_info *pg; > + > + /* We''re replacing a non-SP page with a superpage. Make sure to > + * handle freeing the table properly. */ > + pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry))); > + paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn, > + l1e_empty(), 2); > + page_list_del(pg, &p2m->pages); > + p2m->domain->arch.paging.free_page(p2m->domain, pg); > } > > ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct); > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel-- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jan-19 12:41 UTC
Re: [Xen-devel] [PATCH 2 of 3] PoD: Allow pod_set_cache_target hypercall to be preempted
At 14:36 +0000 on 17 Jan (1295274994), George Dunlap wrote:> PoD: Allow pod_set_cache_target hypercall to be preempted > > For very large VMs, setting the cache target can take long enough that > dom0 complains of soft lockups. Allow the hypercall to be preempted.Applied, thanks. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jan-19 12:41 UTC
Re: [Xen-devel] [PATCH 3 of 3] PoD, hap: Fix logdirty mode when using hardware assisted paging
At 14:36 +0000 on 17 Jan (1295274995), George Dunlap wrote:> PoD,hap: Fix logdirty mode when using hardware assisted paging > > When writing a writable p2m entry for a pfn, we need to mark the pfn > dirty to avoid corruption when doing live migration. > > Marking the page dirty exposes another issue, where there are > excessive sweeps for zero pages if there''s a mismatch between PoD > entries and cache entries. Only sweep for zero pages if we actually > need more memory.Applied, thanks. Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-19 14:49 UTC
Re: [Xen-devel] [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
Just to be clear, should I read your response this as something like below? "Please rework this patch to do the following: * Generalize for 1GB pages * Make the p2m case as careful as the ept case" -George On Wed, Jan 19, 2011 at 12:22 PM, Tim Deegan <Tim.Deegan@citrix.com> wrote:> Hi, > > At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote: >> # HG changeset patch >> # User George Dunlap <george.dunlap@eu.citrix.com> >> # Date 1295274250 0 >> # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944 >> # Parent 75b6287626ee0b852d725543568001e99b13be5b >> p2m: Allow l2 pages to be replaced by superpages >> >> Allow second-level p2m pages to be replaced with superpage entries, >> freeing the p2m table page properly. This allows, for example, a >> sequence of 512 singleton zero pages to be replaced with a superpage >> populate-on-demand entry. > > This problem became more general under your feet - xen 4.1 supports 1GiB > superpages as well, so the same thing needs to be fixed for them. > (I understand that PoD only uses 2MiB superpages but to half-fix it > invites future bugs.) > > Also, although in the EPT code you''re careful to do all the flushing &c > before freeing the old page, in the vanilla p2m code you free the page > even before writing the new l2e! > > Cheers, > > Tim. > >> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> >> >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c >> --- a/xen/arch/x86/mm/hap/hap.c Fri Jan 14 16:38:51 2011 +0000 >> +++ b/xen/arch/x86/mm/hap/hap.c Mon Jan 17 14:24:10 2011 +0000 >> @@ -333,9 +333,11 @@ >> >> ASSERT(page_get_owner(pg) == d); >> /* Should have just the one ref we gave it in alloc_p2m_page() */ >> - if ( (pg->count_info & PGC_count_mask) != 1 ) >> - HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n", >> - pg->count_info, pg->u.inuse.type_info); >> + if ( (pg->count_info & PGC_count_mask) != 1 ) { >> + HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n", >> + pg, pg->count_info, pg->u.inuse.type_info); >> + WARN(); >> + } >> pg->count_info &= ~PGC_count_mask; >> /* Free should not decrement domain''s total allocation, since >> * these pages were allocated without an owner. */ >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c >> --- a/xen/arch/x86/mm/hap/p2m-ept.c Fri Jan 14 16:38:51 2011 +0000 >> +++ b/xen/arch/x86/mm/hap/p2m-ept.c Mon Jan 17 14:24:10 2011 +0000 >> @@ -317,6 +317,7 @@ >> int vtd_pte_present = 0; >> int needs_sync = 1; >> struct domain *d = p2m->domain; >> + struct page_info *intermediate_table = NULL; >> >> /* >> * the caller must make sure: >> @@ -369,6 +370,12 @@ >> /* No need to flush if the old entry wasn''t valid */ >> if ( !is_epte_present(ept_entry) ) >> needs_sync = 0; >> + else if ( order == 9 && ! ept_entry->sp ) >> + { >> + /* We''re replacing a non-SP page with a superpage. Make sure to >> + * handle freeing the table properly. */ >> + intermediate_table = mfn_to_page(ept_entry->mfn); >> + } >> >> if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) || >> (p2mt == p2m_ram_paging_in_start) ) >> @@ -487,6 +494,17 @@ >> } >> } >> >> + if ( intermediate_table ) >> + { >> + /* Release the old intermediate table. This has to be the >> + last thing we do, after the ept_sync_domain() and removal >> + from the iommu tables, so as to avoid a potential >> + use-after-free. */ >> + >> + page_list_del(intermediate_table, &d->arch.p2m->pages); >> + d->arch.paging.free_page(d, intermediate_table); >> + } >> + >> return rv; >> } >> >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c >> --- a/xen/arch/x86/mm/p2m.c Fri Jan 14 16:38:51 2011 +0000 >> +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:10 2011 +0000 >> @@ -1361,9 +1361,15 @@ >> if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) && >> !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) >> { >> - P2M_ERROR("configure P2M table 4KB L2 entry with large page\n"); >> - domain_crash(p2m->domain); >> - goto out; >> + struct page_info *pg; >> + >> + /* We''re replacing a non-SP page with a superpage. Make sure to >> + * handle freeing the table properly. */ >> + pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry))); >> + paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn, >> + l1e_empty(), 2); >> + page_list_del(pg, &p2m->pages); >> + p2m->domain->arch.paging.free_page(p2m->domain, pg); >> } >> >> ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct); >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > > -- > Tim Deegan <Tim.Deegan@citrix.com> > Principal Software Engineer, Xen Platform Team > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jan-19 14:53 UTC
Re: [Xen-devel] [PATCH 1 of 3] p2m: Allow l2 pages to be replaced by superpages
At 14:49 +0000 on 19 Jan (1295448547), George Dunlap wrote:> Just to be clear, should I read your response this as something like below? > > "Please rework this patch to do the following: > * Generalize for 1GB pages > * Make the p2m case as careful as the ept case"Yes, please. Tim.> On Wed, Jan 19, 2011 at 12:22 PM, Tim Deegan <Tim.Deegan@citrix.com> wrote: > > Hi, > > > > At 14:36 +0000 on 17 Jan (1295274993), George Dunlap wrote: > >> # HG changeset patch > >> # User George Dunlap <george.dunlap@eu.citrix.com> > >> # Date 1295274250 0 > >> # Node ID 366d675630fd6ecbd6228426b3f7723d8a9dd944 > >> # Parent 75b6287626ee0b852d725543568001e99b13be5b > >> p2m: Allow l2 pages to be replaced by superpages > >> > >> Allow second-level p2m pages to be replaced with superpage entries, > >> freeing the p2m table page properly. This allows, for example, a > >> sequence of 512 singleton zero pages to be replaced with a superpage > >> populate-on-demand entry. > > > > This problem became more general under your feet - xen 4.1 supports 1GiB > > superpages as well, so the same thing needs to be fixed for them. > > (I understand that PoD only uses 2MiB superpages but to half-fix it > > invites future bugs.) > > > > Also, although in the EPT code you''re careful to do all the flushing &c > > before freeing the old page, in the vanilla p2m code you free the page > > even before writing the new l2e! > > > > Cheers, > > > > Tim. > > > >> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> > >> > >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/hap.c > >> --- a/xen/arch/x86/mm/hap/hap.c Fri Jan 14 16:38:51 2011 +0000 > >> +++ b/xen/arch/x86/mm/hap/hap.c Mon Jan 17 14:24:10 2011 +0000 > >> @@ -333,9 +333,11 @@ > >> > >> ASSERT(page_get_owner(pg) == d); > >> /* Should have just the one ref we gave it in alloc_p2m_page() */ > >> - if ( (pg->count_info & PGC_count_mask) != 1 ) > >> - HAP_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n", > >> - pg->count_info, pg->u.inuse.type_info); > >> + if ( (pg->count_info & PGC_count_mask) != 1 ) { > >> + HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n", > >> + pg, pg->count_info, pg->u.inuse.type_info); > >> + WARN(); > >> + } > >> pg->count_info &= ~PGC_count_mask; > >> /* Free should not decrement domain''s total allocation, since > >> * these pages were allocated without an owner. */ > >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/hap/p2m-ept.c > >> --- a/xen/arch/x86/mm/hap/p2m-ept.c Fri Jan 14 16:38:51 2011 +0000 > >> +++ b/xen/arch/x86/mm/hap/p2m-ept.c Mon Jan 17 14:24:10 2011 +0000 > >> @@ -317,6 +317,7 @@ > >> int vtd_pte_present = 0; > >> int needs_sync = 1; > >> struct domain *d = p2m->domain; > >> + struct page_info *intermediate_table = NULL; > >> > >> /* > >> * the caller must make sure: > >> @@ -369,6 +370,12 @@ > >> /* No need to flush if the old entry wasn''t valid */ > >> if ( !is_epte_present(ept_entry) ) > >> needs_sync = 0; > >> + else if ( order == 9 && ! ept_entry->sp ) > >> + { > >> + /* We''re replacing a non-SP page with a superpage. Make sure to > >> + * handle freeing the table properly. */ > >> + intermediate_table = mfn_to_page(ept_entry->mfn); > >> + } > >> > >> if ( mfn_valid(mfn_x(mfn)) || direct_mmio || p2m_is_paged(p2mt) || > >> (p2mt == p2m_ram_paging_in_start) ) > >> @@ -487,6 +494,17 @@ > >> } > >> } > >> > >> + if ( intermediate_table ) > >> + { > >> + /* Release the old intermediate table. This has to be the > >> + last thing we do, after the ept_sync_domain() and removal > >> + from the iommu tables, so as to avoid a potential > >> + use-after-free. */ > >> + > >> + page_list_del(intermediate_table, &d->arch.p2m->pages); > >> + d->arch.paging.free_page(d, intermediate_table); > >> + } > >> + > >> return rv; > >> } > >> > >> diff -r 75b6287626ee -r 366d675630fd xen/arch/x86/mm/p2m.c > >> --- a/xen/arch/x86/mm/p2m.c Fri Jan 14 16:38:51 2011 +0000 > >> +++ b/xen/arch/x86/mm/p2m.c Mon Jan 17 14:24:10 2011 +0000 > >> @@ -1361,9 +1361,15 @@ > >> if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) && > >> !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) > >> { > >> - P2M_ERROR("configure P2M table 4KB L2 entry with large page\n"); > >> - domain_crash(p2m->domain); > >> - goto out; > >> + struct page_info *pg; > >> + > >> + /* We''re replacing a non-SP page with a superpage. Make sure to > >> + * handle freeing the table properly. */ > >> + pg = mfn_to_page(_mfn(l1e_get_pfn(*p2m_entry))); > >> + paging_write_p2m_entry(p2m->domain, gfn, p2m_entry, table_mfn, > >> + l1e_empty(), 2); > >> + page_list_del(pg, &p2m->pages); > >> + p2m->domain->arch.paging.free_page(p2m->domain, pg); > >> } > >> > >> ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct); > >> > >> _______________________________________________ > >> Xen-devel mailing list > >> Xen-devel@lists.xensource.com > >> http://lists.xensource.com/xen-devel > > > > -- > > Tim Deegan <Tim.Deegan@citrix.com> > > Principal Software Engineer, Xen Platform Team > > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > >-- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel