This patch series brings about a number of fixes to the p2m code in anticipation of synchonizing lookups. Specifically: - Four bug fixes on shadow domctls, setting shared p2m entries, and splitting 1GB PoD superpages - Rework the p2m audit code. Today it does neither compile nor apply to ept, so make it compile and move it out of the way for most callers, which obviously don''t care. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> xen/arch/x86/mm/p2m-pod.c | 1 - xen/arch/x86/mm/p2m-pt.c | 9 +- xen/arch/x86/mm/p2m.c | 5 +- xen/arch/x86/mm/hap/hap.c | 1 + tools/libxc/xc_domain.c | 6 +- xen/arch/x86/domctl.c | 24 +++++++ xen/arch/x86/mm/p2m-ept.c | 1 + xen/arch/x86/mm/p2m-pod.c | 5 - xen/arch/x86/mm/p2m-pt.c | 137 +++++++------------------------------------ xen/arch/x86/mm/p2m.c | 124 ++++++++++++++++++++++++++++++++++++--- xen/include/asm-x86/p2m.h | 11 ++- xen/include/public/domctl.h | 12 +++ tools/libxc/xc_domain.c | 22 +++++++ tools/libxc/xenctrl.h | 27 ++++++++ 14 files changed, 241 insertions(+), 144 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 1 of 6] The PoD code may split a 1GB superpage in a potentially unlocked way
xen/arch/x86/mm/p2m-pod.c | 1 - xen/arch/x86/mm/p2m-pt.c | 9 ++++++--- 2 files changed, 6 insertions(+), 4 deletions(-) The path p2m-lookup -> p2m-pt->get_entry -> 1GB PoD superpage -> pod_demand_populate ends in the pod code performing a p2m_set_entry with no locks held (in order to split the 1GB superpage into 512 2MB ones) Further, it calls p2m_unlock after that, which will break the spinlock. This patch attempts to fix that. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r 0844b17df7a9 -r d9a344a5c1e3 xen/arch/x86/mm/p2m-pod.c --- a/xen/arch/x86/mm/p2m-pod.c +++ b/xen/arch/x86/mm/p2m-pod.c @@ -987,7 +987,6 @@ p2m_pod_demand_populate(struct p2m_domai set_p2m_entry(p2m, gfn_aligned, _mfn(0), PAGE_ORDER_2M, p2m_populate_on_demand, p2m->default_access); audit_p2m(p2m, 1); - p2m_unlock(p2m); return 0; } diff -r 0844b17df7a9 -r d9a344a5c1e3 xen/arch/x86/mm/p2m-pt.c --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -542,10 +542,11 @@ pod_retry_l3: /* The read has succeeded, so we know that mapping exists */ if ( q != p2m_query ) { - if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) + if ( !p2m_pod_check_and_populate(p2m, gfn, + (l1_pgentry_t *) &l3e, PAGE_ORDER_1G, q) ) goto pod_retry_l3; p2mt = p2m_invalid; - printk("%s: Allocate 1GB failed!\n", __func__); + gdprintk(XENLOG_ERR, "%s: Allocate 1GB failed!\n", __func__); goto out; } else @@ -743,8 +744,10 @@ pod_retry_l3: { if ( q != p2m_query ) { - if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) + if ( !p2m_pod_check_and_populate(p2m, gfn, + (l1_pgentry_t *) l3e, PAGE_ORDER_1G, q) ) goto pod_retry_l3; + gdprintk(XENLOG_ERR, "%s: Allocate 1GB failed!\n", __func__); } else *t = p2m_populate_on_demand; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 2 of 6] Fix handling of m2p map in set_shared_p2m_entry
xen/arch/x86/mm/p2m.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) When updating a p2m mapping to shared, previous code unconditionally set the m2p entry for the old mfn to invalid. We now check that the old mfn does not remain shared. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r d9a344a5c1e3 -r 1ef55d87b459 xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -714,8 +714,9 @@ set_shared_p2m_entry(struct domain *d, u * sharable first */ ASSERT(p2m_is_shared(ot)); ASSERT(mfn_valid(omfn)); - /* XXX: M2P translations have to be handled properly for shared pages */ - set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY); + if ( ((mfn_to_page(omfn)->u.inuse.type_info & PGT_type_mask) + != PGT_shared_page) ) + set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY); P2M_DEBUG("set shared %lx %lx\n", gfn, mfn_x(mfn)); rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_shared, p2m->default_access); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 3 of 6] Make HAP log dirty disable return the correct rc
xen/arch/x86/mm/hap/hap.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) Disabling log dirty mode in HAP always returns -EINVAL. Make it return the correct rc on success. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r 1ef55d87b459 -r 43dd4fdbf539 xen/arch/x86/mm/hap/hap.c --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -709,6 +709,7 @@ int hap_domctl(struct domain *d, xen_dom return rc; case XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION: sc->mb = hap_get_allocation(d); + case XEN_DOMCTL_SHADOW_OP_OFF: return 0; default: HAP_ERROR("Bad hap domctl op %u\n", sc->op); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 4 of 6] When passing no bitmap for the shadow log dirty bitmap clean up, we should not get EFAULT
tools/libxc/xc_domain.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) This is due to a stale check for guest_handle_null in the hypervisor, which doesn''t necessarily work with the hypercall buffers. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r 43dd4fdbf539 -r f11528df1df3 tools/libxc/xc_domain.c --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -430,6 +430,8 @@ int xc_shadow_control(xc_interface *xch, DECLARE_DOMCTL; DECLARE_HYPERCALL_BUFFER_ARGUMENT(dirty_bitmap); + memset(&domctl, 0, sizeof(domctl)); + domctl.cmd = XEN_DOMCTL_shadow_op; domctl.domain = (domid_t)domid; domctl.u.shadow_op.op = sop; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 5 of 6] Rework stale p2m auditing
xen/arch/x86/domctl.c | 24 +++++++ xen/arch/x86/mm/p2m-ept.c | 1 + xen/arch/x86/mm/p2m-pod.c | 5 - xen/arch/x86/mm/p2m-pt.c | 137 +++++++------------------------------------ xen/arch/x86/mm/p2m.c | 124 ++++++++++++++++++++++++++++++++++++--- xen/include/asm-x86/p2m.h | 11 ++- xen/include/public/domctl.h | 12 +++ 7 files changed, 180 insertions(+), 134 deletions(-) The p2m audit code doesn''t even compile, let alone work. It also partially supports ept. Make it: - compile - lay groundwork for eventual ept support - move out of the way of all calls and turn it into a domctl. It''s obviously not being used by anybody presently. - enable it via said domctl Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/domctl.c --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -1449,6 +1449,30 @@ long arch_do_domctl( break; #endif /* __x86_64__ */ + case XEN_DOMCTL_audit_p2m: + { + struct domain *d; + + ret = -ESRCH; + d = rcu_lock_domain_by_id(domctl->domain); + if ( d != NULL ) + { +#if P2M_AUDIT + audit_p2m(d, + &domctl->u.audit_p2m.orphans_debug, + &domctl->u.audit_p2m.orphans_invalid, + &domctl->u.audit_p2m.m2p_bad, + &domctl->u.audit_p2m.p2m_bad); + ret = (copy_to_guest(u_domctl, domctl, 1)) ? + -EFAULT : 0; +#else + ret = -ENOSYS; +#endif /* P2M_AUDIT */ + rcu_unlock_domain(d); + } + } + break; + case XEN_DOMCTL_set_access_required: { struct domain *d; diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-ept.c --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -817,6 +817,7 @@ void ept_p2m_init(struct p2m_domain *p2m p2m->set_entry = ept_set_entry; p2m->get_entry = ept_get_entry; p2m->change_entry_type_global = ept_change_entry_type_global; + p2m->audit_p2m = NULL; } static void ept_dump_p2m_table(unsigned char key) diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pod.c --- a/xen/arch/x86/mm/p2m-pod.c +++ b/xen/arch/x86/mm/p2m-pod.c @@ -522,7 +522,6 @@ p2m_pod_decrease_reservation(struct doma steal_for_cache = ( p2m->pod.entry_count > p2m->pod.count ); p2m_lock(p2m); - audit_p2m(p2m, 1); if ( unlikely(d->is_dying) ) goto out_unlock; @@ -616,7 +615,6 @@ out_entry_check: } out_unlock: - audit_p2m(p2m, 1); p2m_unlock(p2m); out: @@ -986,7 +984,6 @@ p2m_pod_demand_populate(struct p2m_domai */ set_p2m_entry(p2m, gfn_aligned, _mfn(0), PAGE_ORDER_2M, p2m_populate_on_demand, p2m->default_access); - audit_p2m(p2m, 1); return 0; } @@ -1108,7 +1105,6 @@ guest_physmap_mark_populate_on_demand(st return rc; p2m_lock(p2m); - audit_p2m(p2m, 1); P2M_DEBUG("mark pod gfn=%#lx\n", gfn); @@ -1142,7 +1138,6 @@ guest_physmap_mark_populate_on_demand(st BUG_ON(p2m->pod.entry_count < 0); } - audit_p2m(p2m, 1); p2m_unlock(p2m); out: diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pt.c --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -483,7 +483,6 @@ static int p2m_pod_check_and_populate(st /* This is called from the p2m lookups, which can happen with or * without the lock hed. */ p2m_lock_recursive(p2m); - audit_p2m(p2m, 1); /* Check to make sure this is still PoD */ if ( p2m_flags_to_type(l1e_get_flags(*p2m_entry)) != p2m_populate_on_demand ) @@ -494,7 +493,6 @@ static int p2m_pod_check_and_populate(st r = p2m_pod_demand_populate(p2m, gfn, order, q); - audit_p2m(p2m, 1); p2m_unlock(p2m); return r; @@ -975,118 +973,23 @@ static void p2m_change_type_global(struc } -/* Set up the p2m function pointers for pagetable format */ -void p2m_pt_init(struct p2m_domain *p2m) +#if P2M_AUDIT +long p2m_pt_audit_p2m(struct p2m_domain *p2m) { - p2m->set_entry = p2m_set_entry; - p2m->get_entry = p2m_gfn_to_mfn; - p2m->change_entry_type_global = p2m_change_type_global; - p2m->write_p2m_entry = paging_write_p2m_entry; -} - - -#if P2M_AUDIT -/* strict_m2p == 0 allows m2p mappings that don''#t match the p2m. - * It''s intended for add_to_physmap, when the domain has just been allocated - * new mfns that might have stale m2p entries from previous owners */ -void audit_p2m(struct p2m_domain *p2m, int strict_m2p) -{ - struct page_info *page; - struct domain *od; - unsigned long mfn, gfn, m2pfn, lp2mfn = 0; int entry_count = 0; - mfn_t p2mfn; - unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; + unsigned long pmbad = 0; + unsigned long mfn, gfn, m2pfn; int test_linear; - p2m_type_t type; struct domain *d = p2m->domain; - if ( !paging_mode_translate(d) ) - return; - - //P2M_PRINTK("p2m audit starts\n"); + ASSERT(p2m_locked_by_me(p2m)); test_linear = ( (d == current->domain) && !pagetable_is_null(current->arch.monitor_table) ); if ( test_linear ) flush_tlb_local(); - spin_lock(&d->page_alloc_lock); - - /* Audit part one: walk the domain''s page allocation list, checking - * the m2p entries. */ - page_list_for_each ( page, &d->page_list ) - { - mfn = mfn_x(page_to_mfn(page)); - - // P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); - - od = page_get_owner(page); - - if ( od != d ) - { - P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", - mfn, od, (od?od->domain_id:-1), d, d->domain_id); - continue; - } - - gfn = get_gpfn_from_mfn(mfn); - if ( gfn == INVALID_M2P_ENTRY ) - { - orphans_i++; - //P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid gfn\n", - // mfn); - continue; - } - - if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) - { - orphans_d++; - //P2M_PRINTK("orphaned guest page: mfn=%#lx has debug gfn\n", - // mfn); - continue; - } - - if ( gfn == SHARED_M2P_ENTRY ) - { - P2M_PRINTK("shared mfn (%lx) on domain page list!\n", - mfn); - continue; - } - - p2mfn = gfn_to_mfn_type_p2m(p2m, gfn, &type, p2m_query); - if ( strict_m2p && mfn_x(p2mfn) != mfn ) - { - mpbad++; - P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" - " (-> gfn %#lx)\n", - mfn, gfn, mfn_x(p2mfn), - (mfn_valid(p2mfn) - ? get_gpfn_from_mfn(mfn_x(p2mfn)) - : -1u)); - /* This m2p entry is stale: the domain has another frame in - * this physical slot. No great disaster, but for neatness, - * blow away the m2p entry. */ - set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); - } - - if ( test_linear && (gfn <= p2m->max_mapped_pfn) ) - { - lp2mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &type, p2m_query)); - if ( lp2mfn != mfn_x(p2mfn) ) - { - P2M_PRINTK("linear mismatch gfn %#lx -> mfn %#lx " - "(!= mfn %#lx)\n", gfn, lp2mfn, mfn_x(p2mfn)); - } - } - - // P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, lp2mfn=%#lx\n", - // mfn, gfn, mfn_x(p2mfn), lp2mfn); - } - - spin_unlock(&d->page_alloc_lock); - - /* Audit part two: walk the domain''s p2m table, checking the entries. */ + /* Audit part one: walk the domain''s p2m table, checking the entries. */ if ( pagetable_get_pfn(p2m_get_pagetable(p2m)) != 0 ) { l2_pgentry_t *l2e; @@ -1239,17 +1142,23 @@ void audit_p2m(struct p2m_domain *p2m, i entry_count); BUG(); } - - //P2M_PRINTK("p2m audit complete\n"); - //if ( orphans_i | orphans_d | mpbad | pmbad ) - // P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu debug)\n", - // orphans_i + orphans_d, orphans_i, orphans_d); - if ( mpbad | pmbad ) - { - P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p entries\n", - pmbad, mpbad); - WARN(); - } + + return pmbad; } #endif /* P2M_AUDIT */ +/* Set up the p2m function pointers for pagetable format */ +void p2m_pt_init(struct p2m_domain *p2m) +{ + p2m->set_entry = p2m_set_entry; + p2m->get_entry = p2m_gfn_to_mfn; + p2m->change_entry_type_global = p2m_change_type_global; + p2m->write_p2m_entry = paging_write_p2m_entry; +#if P2M_AUDIT + p2m->audit_p2m = p2m_pt_audit_p2m; +#else + p2m->audit_p2m = NULL; +#endif +} + + diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m.c --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -439,9 +439,7 @@ guest_physmap_remove_page(struct domain { struct p2m_domain *p2m = p2m_get_hostp2m(d); p2m_lock(p2m); - audit_p2m(p2m, 1); p2m_remove_page(p2m, gfn, mfn, page_order); - audit_p2m(p2m, 1); p2m_unlock(p2m); } @@ -482,7 +480,6 @@ guest_physmap_add_entry(struct domain *d return rc; p2m_lock(p2m); - audit_p2m(p2m, 0); P2M_DEBUG("adding gfn=%#lx mfn=%#lx\n", gfn, mfn); @@ -566,7 +563,6 @@ guest_physmap_add_entry(struct domain *d } } - audit_p2m(p2m, 1); p2m_unlock(p2m); return rc; @@ -656,7 +652,6 @@ set_mmio_p2m_entry(struct domain *d, uns P2M_DEBUG("set mmio %lx %lx\n", gfn, mfn_x(mfn)); rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_mmio_direct, p2m->default_access); - audit_p2m(p2m, 1); p2m_unlock(p2m); if ( 0 == rc ) gdprintk(XENLOG_ERR, @@ -688,7 +683,6 @@ clear_mmio_p2m_entry(struct domain *d, u goto out; } rc = set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, p2m_invalid, p2m->default_access); - audit_p2m(p2m, 1); out: p2m_unlock(p2m); @@ -785,7 +779,6 @@ int p2m_mem_paging_nominate(struct domai /* Fix p2m entry */ set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_out, a); - audit_p2m(p2m, 1); ret = 0; out: @@ -852,7 +845,6 @@ int p2m_mem_paging_evict(struct domain * /* Remove mapping from p2m table */ set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, p2m_ram_paged, a); - audit_p2m(p2m, 1); /* Clear content before returning the page to Xen */ scrub_one_page(page); @@ -946,7 +938,6 @@ void p2m_mem_paging_populate(struct doma req.flags |= MEM_EVENT_FLAG_EVICT_FAIL; set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in_start, a); - audit_p2m(p2m, 1); } p2m_unlock(p2m); @@ -1014,7 +1005,6 @@ int p2m_mem_paging_prep(struct domain *d /* Fix p2m mapping */ set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a); - audit_p2m(p2m, 1); atomic_dec(&d->paged_pages); @@ -1063,7 +1053,6 @@ void p2m_mem_paging_resume(struct domain { set_p2m_entry(p2m, rsp.gfn, mfn, PAGE_ORDER_4K, p2m_ram_rw, a); set_gpfn_from_mfn(mfn_x(mfn), rsp.gfn); - audit_p2m(p2m, 1); } p2m_unlock(p2m); } @@ -1425,6 +1414,119 @@ unsigned long paging_gva_to_gfn(struct v return hostmode->gva_to_gfn(v, hostp2m, va, pfec); } +/*** Audit ***/ + +#if P2M_AUDIT +void audit_p2m(struct domain *d, + uint64_t *orphans_debug, + uint64_t *orphans_invalid, + uint64_t *m2p_bad, + uint64_t *p2m_bad) +{ + struct page_info *page; + struct domain *od; + unsigned long mfn, gfn; + mfn_t p2mfn; + unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; + p2m_access_t p2ma; + p2m_type_t type; + struct p2m_domain *p2m = p2m_get_hostp2m(d); + + if ( !paging_mode_translate(d) ) + goto out_p2m_audit; + + P2M_PRINTK("p2m audit starts\n"); + + p2m_lock(p2m); + + if (p2m->audit_p2m) + pmbad = p2m->audit_p2m(p2m); + + /* Audit part two: walk the domain''s page allocation list, checking + * the m2p entries. */ + spin_lock(&d->page_alloc_lock); + page_list_for_each ( page, &d->page_list ) + { + mfn = mfn_x(page_to_mfn(page)); + + P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); + + od = page_get_owner(page); + + if ( od != d ) + { + P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", + mfn, od, (od?od->domain_id:-1), d, d->domain_id); + continue; + } + + gfn = get_gpfn_from_mfn(mfn); + if ( gfn == INVALID_M2P_ENTRY ) + { + orphans_i++; + P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid gfn\n", + mfn); + continue; + } + + if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) + { + orphans_d++; + P2M_PRINTK("orphaned guest page: mfn=%#lx has debug gfn\n", + mfn); + continue; + } + + if ( gfn == SHARED_M2P_ENTRY ) + { + P2M_PRINTK("shared mfn (%lx) on domain page list!\n", + mfn); + continue; + } + + p2mfn = get_gfn_type_access(p2m, gfn, &type, &p2ma, p2m_query, NULL); + if ( mfn_x(p2mfn) != mfn ) + { + mpbad++; + P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" + " (-> gfn %#lx)\n", + mfn, gfn, mfn_x(p2mfn), + (mfn_valid(p2mfn) + ? get_gpfn_from_mfn(mfn_x(p2mfn)) + : -1u)); + /* This m2p entry is stale: the domain has another frame in + * this physical slot. No great disaster, but for neatness, + * blow away the m2p entry. */ + set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); + } + __put_gfn(p2m, gfn); + + P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, lp2mfn=%#lx\n", + mfn, gfn, mfn_x(p2mfn), lp2mfn); + } + spin_unlock(&d->page_alloc_lock); + + p2m_unlock(p2m); + + P2M_PRINTK("p2m audit complete\n"); + if ( orphans_i | orphans_d | mpbad | pmbad ) + P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu debug)\n", + orphans_i + orphans_d, orphans_i, orphans_d); + if ( mpbad | pmbad ) + { + P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p entries\n", + pmbad, mpbad); + WARN(); + } + +out_p2m_audit: + *orphans_debug = (uint64_t) orphans_d; + *orphans_invalid = (uint64_t) orphans_i; + *m2p_bad = (uint64_t) mpbad; + *p2m_bad = (uint64_t) pmbad; +} +#endif /* P2M_AUDIT */ + /* * Local variables: * mode: C diff -r f11528df1df3 -r 764e0872dd4f xen/include/asm-x86/p2m.h --- a/xen/include/asm-x86/p2m.h +++ b/xen/include/asm-x86/p2m.h @@ -244,6 +244,7 @@ struct p2m_domain { unsigned long gfn, l1_pgentry_t *p, mfn_t table_mfn, l1_pgentry_t new, unsigned int level); + long (*audit_p2m)(struct p2m_domain *p2m); /* Default P2M access type for each page in the the domain: new pages, * swapped in pages, cleared pages, and pages that are ambiquously @@ -558,13 +559,15 @@ int set_p2m_entry(struct p2m_domain *p2m extern void p2m_pt_init(struct p2m_domain *p2m); /* Debugging and auditing of the P2M code? */ -#define P2M_AUDIT 0 +#define P2M_AUDIT 1 #define P2M_DEBUGGING 0 #if P2M_AUDIT -extern void audit_p2m(struct p2m_domain *p2m, int strict_m2p); -#else -# define audit_p2m(_p2m, _m2p) do { (void)(_p2m),(_m2p); } while (0) +extern void audit_p2m(struct domain *d, + uint64_t *orphans_debug, + uint64_t *orphans_invalid, + uint64_t *m2p_bad, + uint64_t *p2m_bad); #endif /* P2M_AUDIT */ /* Printouts */ diff -r f11528df1df3 -r 764e0872dd4f xen/include/public/domctl.h --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -800,6 +800,16 @@ struct xen_domctl_mem_sharing_op { typedef struct xen_domctl_mem_sharing_op xen_domctl_mem_sharing_op_t; DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_sharing_op_t); +struct xen_domctl_audit_p2m { + /* OUT error counts */ + uint64_t orphans_debug; + uint64_t orphans_invalid; + uint64_t m2p_bad; + uint64_t p2m_bad; +}; +typedef struct xen_domctl_audit_p2m xen_domctl_audit_p2m_t; +DEFINE_XEN_GUEST_HANDLE(xen_domctl_audit_p2m_t); + #if defined(__i386__) || defined(__x86_64__) /* XEN_DOMCTL_setvcpuextstate */ /* XEN_DOMCTL_getvcpuextstate */ @@ -898,6 +908,7 @@ struct xen_domctl { #define XEN_DOMCTL_setvcpuextstate 62 #define XEN_DOMCTL_getvcpuextstate 63 #define XEN_DOMCTL_set_access_required 64 +#define XEN_DOMCTL_audit_p2m 65 #define XEN_DOMCTL_gdbsx_guestmemio 1000 #define XEN_DOMCTL_gdbsx_pausevcpu 1001 #define XEN_DOMCTL_gdbsx_unpausevcpu 1002 @@ -951,6 +962,7 @@ struct xen_domctl { struct xen_domctl_vcpuextstate vcpuextstate; #endif struct xen_domctl_set_access_required access_required; + struct xen_domctl_audit_p2m audit_p2m; struct xen_domctl_gdbsx_memio gdbsx_guest_memio; struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu; struct xen_domctl_gdbsx_domstatus gdbsx_domstatus; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-14 21:48 UTC
[Xen-devel] [PATCH 6 of 6] Add libxc wrapper for p2m audit domctl
tools/libxc/xc_domain.c | 22 ++++++++++++++++++++++ tools/libxc/xenctrl.h | 27 +++++++++++++++++++++++++++ 2 files changed, 49 insertions(+), 0 deletions(-) Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r 764e0872dd4f -r 1f9e4cde0093 tools/libxc/xc_domain.c --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -1472,6 +1472,28 @@ int xc_domain_debug_control(xc_interface return do_domctl(xc, &domctl); } +int xc_domain_p2m_audit(xc_interface *xch, + uint32_t domid, + uint64_t *orphans_debug, + uint64_t *orphans_invalid, + uint64_t *m2p_bad, + uint64_t *p2m_bad) +{ + DECLARE_DOMCTL; + int rc; + + domctl.cmd = XEN_DOMCTL_audit_p2m; + domctl.domain = domid; + rc = do_domctl(xch, &domctl); + + *orphans_debug = domctl.u.audit_p2m.orphans_debug; + *orphans_invalid = domctl.u.audit_p2m.orphans_invalid; + *m2p_bad = domctl.u.audit_p2m.m2p_bad; + *p2m_bad = domctl.u.audit_p2m.p2m_bad; + + return rc; +} + int xc_domain_set_access_required(xc_interface *xch, uint32_t domid, unsigned int required) diff -r 764e0872dd4f -r 1f9e4cde0093 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h +++ b/tools/libxc/xenctrl.h @@ -712,6 +712,33 @@ int xc_domain_setdebugging(xc_interface unsigned int enable); /** + * This function audits the (top level) p2m of a domain + * and returns the different error counts, if any. + * + * @parm xch a handle to an open hypervisor interface + * @parm domid the domain id whose top level p2m we + * want to audit + * @parm orphans_debug count of m2p entries for valid + * domain pages containing a debug value + * @parm orphans_invalid count of m2p entries for valid + * domain pages containing an invalid value + * @parm m2p_bad count of m2p entries mismatching the + * associated p2m entry for this domain + * @parm p2m_bad count of p2m entries for this domain + * mismatching the associated m2p entry + * return 0 on success, -1 on failure + * errno values on failure include: + * -ENOSYS: not implemented + * -EFAULT: could not copy results back to guest + */ +int xc_domain_p2m_audit(xc_interface *xch, + uint32_t domid, + uint64_t *orphans_debug, + uint64_t *orphans_invalid, + uint64_t *m2p_bad, + uint64_t *p2m_bad); + +/** * This function sets or clears the requirement that an access memory * event listener is required on the domain. * _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Nov-15 10:33 UTC
Re: [Xen-devel] [PATCH 3 of 6] Make HAP log dirty disable return the correct rc
On Mon, Nov 14, 2011 at 9:48 PM, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:> xen/arch/x86/mm/hap/hap.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > > Disabling log dirty mode in HAP always returns -EINVAL. Make it > return the correct rc on success. > > Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> > > diff -r 1ef55d87b459 -r 43dd4fdbf539 xen/arch/x86/mm/hap/hap.c > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -709,6 +709,7 @@ int hap_domctl(struct domain *d, xen_dom > return rc; > case XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION: > sc->mb = hap_get_allocation(d); > + case XEN_DOMCTL_SHADOW_OP_OFF: > return 0; > default: > HAP_ERROR("Bad hap domctl op %u\n", sc->op);I think you should always put "/* FALLTHRU */" when you''re falling through from one case statement to the next. Or just "return 0" for the case above and let the compiler optimize it. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Nov-15 12:14 UTC
Re: [Xen-devel] [PATCH 5 of 6] Rework stale p2m auditing
On Mon, Nov 14, 2011 at 9:48 PM, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:> xen/arch/x86/domctl.c | 24 +++++++ > xen/arch/x86/mm/p2m-ept.c | 1 + > xen/arch/x86/mm/p2m-pod.c | 5 - > xen/arch/x86/mm/p2m-pt.c | 137 +++++++------------------------------------ > xen/arch/x86/mm/p2m.c | 124 ++++++++++++++++++++++++++++++++++++--- > xen/include/asm-x86/p2m.h | 11 ++- > xen/include/public/domctl.h | 12 +++ > 7 files changed, 180 insertions(+), 134 deletions(-) > > > The p2m audit code doesn''t even compile, let alone work. It also > partially supports ept. Make it: > > - compile > - lay groundwork for eventual ept supportThis stuff looks good (haven''t reviewed it in detail), but...> - move out of the way of all calls and turn it into a domctl. It''s > obviously not being used by anybody presently. > - enable it via said domctl..the idea with having audit_p2m() sprinkled around the code was to help narrow down what bit of code was screwing up the p2m table. I''m not sure switching it to a domctl will really help debugging in the future. How were you envisioning this would be used?> > Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> > > diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/domctl.c > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -1449,6 +1449,30 @@ long arch_do_domctl( > break; > #endif /* __x86_64__ */ > > + case XEN_DOMCTL_audit_p2m: > + { > + struct domain *d; > + > + ret = -ESRCH; > + d = rcu_lock_domain_by_id(domctl->domain); > + if ( d != NULL ) > + { > +#if P2M_AUDIT > + audit_p2m(d, > + &domctl->u.audit_p2m.orphans_debug, > + &domctl->u.audit_p2m.orphans_invalid, > + &domctl->u.audit_p2m.m2p_bad, > + &domctl->u.audit_p2m.p2m_bad); > + ret = (copy_to_guest(u_domctl, domctl, 1)) ? > + -EFAULT : 0; > +#else > + ret = -ENOSYS; > +#endif /* P2M_AUDIT */ > + rcu_unlock_domain(d); > + } > + } > + break; > + > case XEN_DOMCTL_set_access_required: > { > struct domain *d; > diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-ept.c > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -817,6 +817,7 @@ void ept_p2m_init(struct p2m_domain *p2m > p2m->set_entry = ept_set_entry; > p2m->get_entry = ept_get_entry; > p2m->change_entry_type_global = ept_change_entry_type_global; > + p2m->audit_p2m = NULL; > } > > static void ept_dump_p2m_table(unsigned char key) > diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pod.c > --- a/xen/arch/x86/mm/p2m-pod.c > +++ b/xen/arch/x86/mm/p2m-pod.c > @@ -522,7 +522,6 @@ p2m_pod_decrease_reservation(struct doma > steal_for_cache = ( p2m->pod.entry_count > p2m->pod.count ); > > p2m_lock(p2m); > - audit_p2m(p2m, 1); > > if ( unlikely(d->is_dying) ) > goto out_unlock; > @@ -616,7 +615,6 @@ out_entry_check: > } > > out_unlock: > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > > out: > @@ -986,7 +984,6 @@ p2m_pod_demand_populate(struct p2m_domai > */ > set_p2m_entry(p2m, gfn_aligned, _mfn(0), PAGE_ORDER_2M, > p2m_populate_on_demand, p2m->default_access); > - audit_p2m(p2m, 1); > return 0; > } > > @@ -1108,7 +1105,6 @@ guest_physmap_mark_populate_on_demand(st > return rc; > > p2m_lock(p2m); > - audit_p2m(p2m, 1); > > P2M_DEBUG("mark pod gfn=%#lx\n", gfn); > > @@ -1142,7 +1138,6 @@ guest_physmap_mark_populate_on_demand(st > BUG_ON(p2m->pod.entry_count < 0); > } > > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > > out: > diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pt.c > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -483,7 +483,6 @@ static int p2m_pod_check_and_populate(st > /* This is called from the p2m lookups, which can happen with or > * without the lock hed. */ > p2m_lock_recursive(p2m); > - audit_p2m(p2m, 1); > > /* Check to make sure this is still PoD */ > if ( p2m_flags_to_type(l1e_get_flags(*p2m_entry)) != p2m_populate_on_demand ) > @@ -494,7 +493,6 @@ static int p2m_pod_check_and_populate(st > > r = p2m_pod_demand_populate(p2m, gfn, order, q); > > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > > return r; > @@ -975,118 +973,23 @@ static void p2m_change_type_global(struc > > } > > -/* Set up the p2m function pointers for pagetable format */ > -void p2m_pt_init(struct p2m_domain *p2m) > +#if P2M_AUDIT > +long p2m_pt_audit_p2m(struct p2m_domain *p2m) > { > - p2m->set_entry = p2m_set_entry; > - p2m->get_entry = p2m_gfn_to_mfn; > - p2m->change_entry_type_global = p2m_change_type_global; > - p2m->write_p2m_entry = paging_write_p2m_entry; > -} > - > - > -#if P2M_AUDIT > -/* strict_m2p == 0 allows m2p mappings that don''#t match the p2m. > - * It''s intended for add_to_physmap, when the domain has just been allocated > - * new mfns that might have stale m2p entries from previous owners */ > -void audit_p2m(struct p2m_domain *p2m, int strict_m2p) > -{ > - struct page_info *page; > - struct domain *od; > - unsigned long mfn, gfn, m2pfn, lp2mfn = 0; > int entry_count = 0; > - mfn_t p2mfn; > - unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; > + unsigned long pmbad = 0; > + unsigned long mfn, gfn, m2pfn; > int test_linear; > - p2m_type_t type; > struct domain *d = p2m->domain; > > - if ( !paging_mode_translate(d) ) > - return; > - > - //P2M_PRINTK("p2m audit starts\n"); > + ASSERT(p2m_locked_by_me(p2m)); > > test_linear = ( (d == current->domain) > && !pagetable_is_null(current->arch.monitor_table) ); > if ( test_linear ) > flush_tlb_local(); > > - spin_lock(&d->page_alloc_lock); > - > - /* Audit part one: walk the domain''s page allocation list, checking > - * the m2p entries. */ > - page_list_for_each ( page, &d->page_list ) > - { > - mfn = mfn_x(page_to_mfn(page)); > - > - // P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); > - > - od = page_get_owner(page); > - > - if ( od != d ) > - { > - P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", > - mfn, od, (od?od->domain_id:-1), d, d->domain_id); > - continue; > - } > - > - gfn = get_gpfn_from_mfn(mfn); > - if ( gfn == INVALID_M2P_ENTRY ) > - { > - orphans_i++; > - //P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid gfn\n", > - // mfn); > - continue; > - } > - > - if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) > - { > - orphans_d++; > - //P2M_PRINTK("orphaned guest page: mfn=%#lx has debug gfn\n", > - // mfn); > - continue; > - } > - > - if ( gfn == SHARED_M2P_ENTRY ) > - { > - P2M_PRINTK("shared mfn (%lx) on domain page list!\n", > - mfn); > - continue; > - } > - > - p2mfn = gfn_to_mfn_type_p2m(p2m, gfn, &type, p2m_query); > - if ( strict_m2p && mfn_x(p2mfn) != mfn ) > - { > - mpbad++; > - P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" > - " (-> gfn %#lx)\n", > - mfn, gfn, mfn_x(p2mfn), > - (mfn_valid(p2mfn) > - ? get_gpfn_from_mfn(mfn_x(p2mfn)) > - : -1u)); > - /* This m2p entry is stale: the domain has another frame in > - * this physical slot. No great disaster, but for neatness, > - * blow away the m2p entry. */ > - set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); > - } > - > - if ( test_linear && (gfn <= p2m->max_mapped_pfn) ) > - { > - lp2mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &type, p2m_query)); > - if ( lp2mfn != mfn_x(p2mfn) ) > - { > - P2M_PRINTK("linear mismatch gfn %#lx -> mfn %#lx " > - "(!= mfn %#lx)\n", gfn, lp2mfn, mfn_x(p2mfn)); > - } > - } > - > - // P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, lp2mfn=%#lx\n", > - // mfn, gfn, mfn_x(p2mfn), lp2mfn); > - } > - > - spin_unlock(&d->page_alloc_lock); > - > - /* Audit part two: walk the domain''s p2m table, checking the entries. */ > + /* Audit part one: walk the domain''s p2m table, checking the entries. */ > if ( pagetable_get_pfn(p2m_get_pagetable(p2m)) != 0 ) > { > l2_pgentry_t *l2e; > @@ -1239,17 +1142,23 @@ void audit_p2m(struct p2m_domain *p2m, i > entry_count); > BUG(); > } > - > - //P2M_PRINTK("p2m audit complete\n"); > - //if ( orphans_i | orphans_d | mpbad | pmbad ) > - // P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu debug)\n", > - // orphans_i + orphans_d, orphans_i, orphans_d); > - if ( mpbad | pmbad ) > - { > - P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p entries\n", > - pmbad, mpbad); > - WARN(); > - } > + > + return pmbad; > } > #endif /* P2M_AUDIT */ > > +/* Set up the p2m function pointers for pagetable format */ > +void p2m_pt_init(struct p2m_domain *p2m) > +{ > + p2m->set_entry = p2m_set_entry; > + p2m->get_entry = p2m_gfn_to_mfn; > + p2m->change_entry_type_global = p2m_change_type_global; > + p2m->write_p2m_entry = paging_write_p2m_entry; > +#if P2M_AUDIT > + p2m->audit_p2m = p2m_pt_audit_p2m; > +#else > + p2m->audit_p2m = NULL; > +#endif > +} > + > + > diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m.c > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -439,9 +439,7 @@ guest_physmap_remove_page(struct domain > { > struct p2m_domain *p2m = p2m_get_hostp2m(d); > p2m_lock(p2m); > - audit_p2m(p2m, 1); > p2m_remove_page(p2m, gfn, mfn, page_order); > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > } > > @@ -482,7 +480,6 @@ guest_physmap_add_entry(struct domain *d > return rc; > > p2m_lock(p2m); > - audit_p2m(p2m, 0); > > P2M_DEBUG("adding gfn=%#lx mfn=%#lx\n", gfn, mfn); > > @@ -566,7 +563,6 @@ guest_physmap_add_entry(struct domain *d > } > } > > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > > return rc; > @@ -656,7 +652,6 @@ set_mmio_p2m_entry(struct domain *d, uns > > P2M_DEBUG("set mmio %lx %lx\n", gfn, mfn_x(mfn)); > rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_mmio_direct, p2m->default_access); > - audit_p2m(p2m, 1); > p2m_unlock(p2m); > if ( 0 == rc ) > gdprintk(XENLOG_ERR, > @@ -688,7 +683,6 @@ clear_mmio_p2m_entry(struct domain *d, u > goto out; > } > rc = set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, p2m_invalid, p2m->default_access); > - audit_p2m(p2m, 1); > > out: > p2m_unlock(p2m); > @@ -785,7 +779,6 @@ int p2m_mem_paging_nominate(struct domai > > /* Fix p2m entry */ > set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_out, a); > - audit_p2m(p2m, 1); > ret = 0; > > out: > @@ -852,7 +845,6 @@ int p2m_mem_paging_evict(struct domain * > > /* Remove mapping from p2m table */ > set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, p2m_ram_paged, a); > - audit_p2m(p2m, 1); > > /* Clear content before returning the page to Xen */ > scrub_one_page(page); > @@ -946,7 +938,6 @@ void p2m_mem_paging_populate(struct doma > req.flags |= MEM_EVENT_FLAG_EVICT_FAIL; > > set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in_start, a); > - audit_p2m(p2m, 1); > } > p2m_unlock(p2m); > > @@ -1014,7 +1005,6 @@ int p2m_mem_paging_prep(struct domain *d > > /* Fix p2m mapping */ > set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a); > - audit_p2m(p2m, 1); > > atomic_dec(&d->paged_pages); > > @@ -1063,7 +1053,6 @@ void p2m_mem_paging_resume(struct domain > { > set_p2m_entry(p2m, rsp.gfn, mfn, PAGE_ORDER_4K, p2m_ram_rw, a); > set_gpfn_from_mfn(mfn_x(mfn), rsp.gfn); > - audit_p2m(p2m, 1); > } > p2m_unlock(p2m); > } > @@ -1425,6 +1414,119 @@ unsigned long paging_gva_to_gfn(struct v > return hostmode->gva_to_gfn(v, hostp2m, va, pfec); > } > > +/*** Audit ***/ > + > +#if P2M_AUDIT > +void audit_p2m(struct domain *d, > + uint64_t *orphans_debug, > + uint64_t *orphans_invalid, > + uint64_t *m2p_bad, > + uint64_t *p2m_bad) > +{ > + struct page_info *page; > + struct domain *od; > + unsigned long mfn, gfn; > + mfn_t p2mfn; > + unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; > + p2m_access_t p2ma; > + p2m_type_t type; > + struct p2m_domain *p2m = p2m_get_hostp2m(d); > + > + if ( !paging_mode_translate(d) ) > + goto out_p2m_audit; > + > + P2M_PRINTK("p2m audit starts\n"); > + > + p2m_lock(p2m); > + > + if (p2m->audit_p2m) > + pmbad = p2m->audit_p2m(p2m); > + > + /* Audit part two: walk the domain''s page allocation list, checking > + * the m2p entries. */ > + spin_lock(&d->page_alloc_lock); > + page_list_for_each ( page, &d->page_list ) > + { > + mfn = mfn_x(page_to_mfn(page)); > + > + P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); > + > + od = page_get_owner(page); > + > + if ( od != d ) > + { > + P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", > + mfn, od, (od?od->domain_id:-1), d, d->domain_id); > + continue; > + } > + > + gfn = get_gpfn_from_mfn(mfn); > + if ( gfn == INVALID_M2P_ENTRY ) > + { > + orphans_i++; > + P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid gfn\n", > + mfn); > + continue; > + } > + > + if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) > + { > + orphans_d++; > + P2M_PRINTK("orphaned guest page: mfn=%#lx has debug gfn\n", > + mfn); > + continue; > + } > + > + if ( gfn == SHARED_M2P_ENTRY ) > + { > + P2M_PRINTK("shared mfn (%lx) on domain page list!\n", > + mfn); > + continue; > + } > + > + p2mfn = get_gfn_type_access(p2m, gfn, &type, &p2ma, p2m_query, NULL); > + if ( mfn_x(p2mfn) != mfn ) > + { > + mpbad++; > + P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" > + " (-> gfn %#lx)\n", > + mfn, gfn, mfn_x(p2mfn), > + (mfn_valid(p2mfn) > + ? get_gpfn_from_mfn(mfn_x(p2mfn)) > + : -1u)); > + /* This m2p entry is stale: the domain has another frame in > + * this physical slot. No great disaster, but for neatness, > + * blow away the m2p entry. */ > + set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); > + } > + __put_gfn(p2m, gfn); > + > + P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, lp2mfn=%#lx\n", > + mfn, gfn, mfn_x(p2mfn), lp2mfn); > + } > + spin_unlock(&d->page_alloc_lock); > + > + p2m_unlock(p2m); > + > + P2M_PRINTK("p2m audit complete\n"); > + if ( orphans_i | orphans_d | mpbad | pmbad ) > + P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu debug)\n", > + orphans_i + orphans_d, orphans_i, orphans_d); > + if ( mpbad | pmbad ) > + { > + P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p entries\n", > + pmbad, mpbad); > + WARN(); > + } > + > +out_p2m_audit: > + *orphans_debug = (uint64_t) orphans_d; > + *orphans_invalid = (uint64_t) orphans_i; > + *m2p_bad = (uint64_t) mpbad; > + *p2m_bad = (uint64_t) pmbad; > +} > +#endif /* P2M_AUDIT */ > + > /* > * Local variables: > * mode: C > diff -r f11528df1df3 -r 764e0872dd4f xen/include/asm-x86/p2m.h > --- a/xen/include/asm-x86/p2m.h > +++ b/xen/include/asm-x86/p2m.h > @@ -244,6 +244,7 @@ struct p2m_domain { > unsigned long gfn, l1_pgentry_t *p, > mfn_t table_mfn, l1_pgentry_t new, > unsigned int level); > + long (*audit_p2m)(struct p2m_domain *p2m); > > /* Default P2M access type for each page in the the domain: new pages, > * swapped in pages, cleared pages, and pages that are ambiquously > @@ -558,13 +559,15 @@ int set_p2m_entry(struct p2m_domain *p2m > extern void p2m_pt_init(struct p2m_domain *p2m); > > /* Debugging and auditing of the P2M code? */ > -#define P2M_AUDIT 0 > +#define P2M_AUDIT 1 > #define P2M_DEBUGGING 0 > > #if P2M_AUDIT > -extern void audit_p2m(struct p2m_domain *p2m, int strict_m2p); > -#else > -# define audit_p2m(_p2m, _m2p) do { (void)(_p2m),(_m2p); } while (0) > +extern void audit_p2m(struct domain *d, > + uint64_t *orphans_debug, > + uint64_t *orphans_invalid, > + uint64_t *m2p_bad, > + uint64_t *p2m_bad); > #endif /* P2M_AUDIT */ > > /* Printouts */ > diff -r f11528df1df3 -r 764e0872dd4f xen/include/public/domctl.h > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -800,6 +800,16 @@ struct xen_domctl_mem_sharing_op { > typedef struct xen_domctl_mem_sharing_op xen_domctl_mem_sharing_op_t; > DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_sharing_op_t); > > +struct xen_domctl_audit_p2m { > + /* OUT error counts */ > + uint64_t orphans_debug; > + uint64_t orphans_invalid; > + uint64_t m2p_bad; > + uint64_t p2m_bad; > +}; > +typedef struct xen_domctl_audit_p2m xen_domctl_audit_p2m_t; > +DEFINE_XEN_GUEST_HANDLE(xen_domctl_audit_p2m_t); > + > #if defined(__i386__) || defined(__x86_64__) > /* XEN_DOMCTL_setvcpuextstate */ > /* XEN_DOMCTL_getvcpuextstate */ > @@ -898,6 +908,7 @@ struct xen_domctl { > #define XEN_DOMCTL_setvcpuextstate 62 > #define XEN_DOMCTL_getvcpuextstate 63 > #define XEN_DOMCTL_set_access_required 64 > +#define XEN_DOMCTL_audit_p2m 65 > #define XEN_DOMCTL_gdbsx_guestmemio 1000 > #define XEN_DOMCTL_gdbsx_pausevcpu 1001 > #define XEN_DOMCTL_gdbsx_unpausevcpu 1002 > @@ -951,6 +962,7 @@ struct xen_domctl { > struct xen_domctl_vcpuextstate vcpuextstate; > #endif > struct xen_domctl_set_access_required access_required; > + struct xen_domctl_audit_p2m audit_p2m; > struct xen_domctl_gdbsx_memio gdbsx_guest_memio; > struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu; > struct xen_domctl_gdbsx_domstatus gdbsx_domstatus; > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andres Lagar-Cavilla
2011-Nov-15 15:38 UTC
Re: [Xen-devel] [PATCH 5 of 6] Rework stale p2m auditing
Well, clearly that code has bit-rotted considerably and long ago. So whatever purpose it served, no one is in need of it presently. My intention is to call the p2m audit domctl during runtime. You could turn on sharing, do some sharing operations, and then audit the p2m. The pager could periodically audit via a domctl for debugging purposes. libxl debug builds could audit the p2m before or after xl mem-set. Etc... More subtly, if we transition to fine-grained p2m locks, auditing the p2m will negate most benefits, and introduce the non-trivial complexity of promoting fine-grained locks. Andres> On Mon, Nov 14, 2011 at 9:48 PM, Andres Lagar-Cavilla > <andres@lagarcavilla.org> wrote: >> xen/arch/x86/domctl.c | 24 +++++++ >> xen/arch/x86/mm/p2m-ept.c | 1 + >> xen/arch/x86/mm/p2m-pod.c | 5 - >> xen/arch/x86/mm/p2m-pt.c | 137 >> +++++++------------------------------------ >> xen/arch/x86/mm/p2m.c | 124 >> ++++++++++++++++++++++++++++++++++++--- >> xen/include/asm-x86/p2m.h | 11 ++- >> xen/include/public/domctl.h | 12 +++ >> 7 files changed, 180 insertions(+), 134 deletions(-) >> >> >> The p2m audit code doesn''t even compile, let alone work. It also >> partially supports ept. Make it: >> >> - compile >> - lay groundwork for eventual ept support > > This stuff looks good (haven''t reviewed it in detail), but... > >> - move out of the way of all calls and turn it into a domctl. It''s >> obviously not being used by anybody presently. >> - enable it via said domctl > > ..the idea with having audit_p2m() sprinkled around the code was to > help narrow down what bit of code was screwing up the p2m table. I''m > not sure switching it to a domctl will really help debugging in the > future. How were you envisioning this would be used? > >> >> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> >> >> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/domctl.c >> --- a/xen/arch/x86/domctl.c >> +++ b/xen/arch/x86/domctl.c >> @@ -1449,6 +1449,30 @@ long arch_do_domctl( >> break; >> #endif /* __x86_64__ */ >> >> + case XEN_DOMCTL_audit_p2m: >> + { >> + struct domain *d; >> + >> + ret = -ESRCH; >> + d = rcu_lock_domain_by_id(domctl->domain); >> + if ( d != NULL ) >> + { >> +#if P2M_AUDIT >> + audit_p2m(d, >> + &domctl->u.audit_p2m.orphans_debug, >> + &domctl->u.audit_p2m.orphans_invalid, >> + &domctl->u.audit_p2m.m2p_bad, >> + &domctl->u.audit_p2m.p2m_bad); >> + ret = (copy_to_guest(u_domctl, domctl, 1)) ? >> + -EFAULT : 0; >> +#else >> + ret = -ENOSYS; >> +#endif /* P2M_AUDIT */ >> + rcu_unlock_domain(d); >> + } >> + } >> + break; >> + >> case XEN_DOMCTL_set_access_required: >> { >> struct domain *d; >> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-ept.c >> --- a/xen/arch/x86/mm/p2m-ept.c >> +++ b/xen/arch/x86/mm/p2m-ept.c >> @@ -817,6 +817,7 @@ void ept_p2m_init(struct p2m_domain *p2m >> p2m->set_entry = ept_set_entry; >> p2m->get_entry = ept_get_entry; >> p2m->change_entry_type_global = ept_change_entry_type_global; >> + p2m->audit_p2m = NULL; >> } >> >> static void ept_dump_p2m_table(unsigned char key) >> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pod.c >> --- a/xen/arch/x86/mm/p2m-pod.c >> +++ b/xen/arch/x86/mm/p2m-pod.c >> @@ -522,7 +522,6 @@ p2m_pod_decrease_reservation(struct doma >> steal_for_cache = ( p2m->pod.entry_count > p2m->pod.count ); >> >> p2m_lock(p2m); >> - audit_p2m(p2m, 1); >> >> if ( unlikely(d->is_dying) ) >> goto out_unlock; >> @@ -616,7 +615,6 @@ out_entry_check: >> } >> >> out_unlock: >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> >> out: >> @@ -986,7 +984,6 @@ p2m_pod_demand_populate(struct p2m_domai >> */ >> set_p2m_entry(p2m, gfn_aligned, _mfn(0), PAGE_ORDER_2M, >> p2m_populate_on_demand, p2m->default_access); >> - audit_p2m(p2m, 1); >> return 0; >> } >> >> @@ -1108,7 +1105,6 @@ guest_physmap_mark_populate_on_demand(st >> return rc; >> >> p2m_lock(p2m); >> - audit_p2m(p2m, 1); >> >> P2M_DEBUG("mark pod gfn=%#lx\n", gfn); >> >> @@ -1142,7 +1138,6 @@ guest_physmap_mark_populate_on_demand(st >> BUG_ON(p2m->pod.entry_count < 0); >> } >> >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> >> out: >> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m-pt.c >> --- a/xen/arch/x86/mm/p2m-pt.c >> +++ b/xen/arch/x86/mm/p2m-pt.c >> @@ -483,7 +483,6 @@ static int p2m_pod_check_and_populate(st >> /* This is called from the p2m lookups, which can happen with or >> * without the lock hed. */ >> p2m_lock_recursive(p2m); >> - audit_p2m(p2m, 1); >> >> /* Check to make sure this is still PoD */ >> if ( p2m_flags_to_type(l1e_get_flags(*p2m_entry)) !>> p2m_populate_on_demand ) >> @@ -494,7 +493,6 @@ static int p2m_pod_check_and_populate(st >> >> r = p2m_pod_demand_populate(p2m, gfn, order, q); >> >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> >> return r; >> @@ -975,118 +973,23 @@ static void p2m_change_type_global(struc >> >> } >> >> -/* Set up the p2m function pointers for pagetable format */ >> -void p2m_pt_init(struct p2m_domain *p2m) >> +#if P2M_AUDIT >> +long p2m_pt_audit_p2m(struct p2m_domain *p2m) >> { >> - p2m->set_entry = p2m_set_entry; >> - p2m->get_entry = p2m_gfn_to_mfn; >> - p2m->change_entry_type_global = p2m_change_type_global; >> - p2m->write_p2m_entry = paging_write_p2m_entry; >> -} >> - >> - >> -#if P2M_AUDIT >> -/* strict_m2p == 0 allows m2p mappings that don''#t match the p2m. >> - * It''s intended for add_to_physmap, when the domain has just been >> allocated >> - * new mfns that might have stale m2p entries from previous owners */ >> -void audit_p2m(struct p2m_domain *p2m, int strict_m2p) >> -{ >> - struct page_info *page; >> - struct domain *od; >> - unsigned long mfn, gfn, m2pfn, lp2mfn = 0; >> int entry_count = 0; >> - mfn_t p2mfn; >> - unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; >> + unsigned long pmbad = 0; >> + unsigned long mfn, gfn, m2pfn; >> int test_linear; >> - p2m_type_t type; >> struct domain *d = p2m->domain; >> >> - if ( !paging_mode_translate(d) ) >> - return; >> - >> - //P2M_PRINTK("p2m audit starts\n"); >> + ASSERT(p2m_locked_by_me(p2m)); >> >> test_linear = ( (d == current->domain) >> && !pagetable_is_null(current->arch.monitor_table) >> ); >> if ( test_linear ) >> flush_tlb_local(); >> >> - spin_lock(&d->page_alloc_lock); >> - >> - /* Audit part one: walk the domain''s page allocation list, checking >> - * the m2p entries. */ >> - page_list_for_each ( page, &d->page_list ) >> - { >> - mfn = mfn_x(page_to_mfn(page)); >> - >> - // P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); >> - >> - od = page_get_owner(page); >> - >> - if ( od != d ) >> - { >> - P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", >> - mfn, od, (od?od->domain_id:-1), d, >> d->domain_id); >> - continue; >> - } >> - >> - gfn = get_gpfn_from_mfn(mfn); >> - if ( gfn == INVALID_M2P_ENTRY ) >> - { >> - orphans_i++; >> - //P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid >> gfn\n", >> - // mfn); >> - continue; >> - } >> - >> - if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) >> - { >> - orphans_d++; >> - //P2M_PRINTK("orphaned guest page: mfn=%#lx has debug >> gfn\n", >> - // mfn); >> - continue; >> - } >> - >> - if ( gfn == SHARED_M2P_ENTRY ) >> - { >> - P2M_PRINTK("shared mfn (%lx) on domain page list!\n", >> - mfn); >> - continue; >> - } >> - >> - p2mfn = gfn_to_mfn_type_p2m(p2m, gfn, &type, p2m_query); >> - if ( strict_m2p && mfn_x(p2mfn) != mfn ) >> - { >> - mpbad++; >> - P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" >> - " (-> gfn %#lx)\n", >> - mfn, gfn, mfn_x(p2mfn), >> - (mfn_valid(p2mfn) >> - ? get_gpfn_from_mfn(mfn_x(p2mfn)) >> - : -1u)); >> - /* This m2p entry is stale: the domain has another frame in >> - * this physical slot. No great disaster, but for >> neatness, >> - * blow away the m2p entry. */ >> - set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); >> - } >> - >> - if ( test_linear && (gfn <= p2m->max_mapped_pfn) ) >> - { >> - lp2mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &type, >> p2m_query)); >> - if ( lp2mfn != mfn_x(p2mfn) ) >> - { >> - P2M_PRINTK("linear mismatch gfn %#lx -> mfn %#lx " >> - "(!= mfn %#lx)\n", gfn, lp2mfn, >> mfn_x(p2mfn)); >> - } >> - } >> - >> - // P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, >> lp2mfn=%#lx\n", >> - // mfn, gfn, mfn_x(p2mfn), lp2mfn); >> - } >> - >> - spin_unlock(&d->page_alloc_lock); >> - >> - /* Audit part two: walk the domain''s p2m table, checking the >> entries. */ >> + /* Audit part one: walk the domain''s p2m table, checking the >> entries. */ >> if ( pagetable_get_pfn(p2m_get_pagetable(p2m)) != 0 ) >> { >> l2_pgentry_t *l2e; >> @@ -1239,17 +1142,23 @@ void audit_p2m(struct p2m_domain *p2m, i >> entry_count); >> BUG(); >> } >> - >> - //P2M_PRINTK("p2m audit complete\n"); >> - //if ( orphans_i | orphans_d | mpbad | pmbad ) >> - // P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu >> debug)\n", >> - // orphans_i + orphans_d, orphans_i, orphans_d); >> - if ( mpbad | pmbad ) >> - { >> - P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p >> entries\n", >> - pmbad, mpbad); >> - WARN(); >> - } >> + >> + return pmbad; >> } >> #endif /* P2M_AUDIT */ >> >> +/* Set up the p2m function pointers for pagetable format */ >> +void p2m_pt_init(struct p2m_domain *p2m) >> +{ >> + p2m->set_entry = p2m_set_entry; >> + p2m->get_entry = p2m_gfn_to_mfn; >> + p2m->change_entry_type_global = p2m_change_type_global; >> + p2m->write_p2m_entry = paging_write_p2m_entry; >> +#if P2M_AUDIT >> + p2m->audit_p2m = p2m_pt_audit_p2m; >> +#else >> + p2m->audit_p2m = NULL; >> +#endif >> +} >> + >> + >> diff -r f11528df1df3 -r 764e0872dd4f xen/arch/x86/mm/p2m.c >> --- a/xen/arch/x86/mm/p2m.c >> +++ b/xen/arch/x86/mm/p2m.c >> @@ -439,9 +439,7 @@ guest_physmap_remove_page(struct domain >> { >> struct p2m_domain *p2m = p2m_get_hostp2m(d); >> p2m_lock(p2m); >> - audit_p2m(p2m, 1); >> p2m_remove_page(p2m, gfn, mfn, page_order); >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> } >> >> @@ -482,7 +480,6 @@ guest_physmap_add_entry(struct domain *d >> return rc; >> >> p2m_lock(p2m); >> - audit_p2m(p2m, 0); >> >> P2M_DEBUG("adding gfn=%#lx mfn=%#lx\n", gfn, mfn); >> >> @@ -566,7 +563,6 @@ guest_physmap_add_entry(struct domain *d >> } >> } >> >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> >> return rc; >> @@ -656,7 +652,6 @@ set_mmio_p2m_entry(struct domain *d, uns >> >> P2M_DEBUG("set mmio %lx %lx\n", gfn, mfn_x(mfn)); >> rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_mmio_direct, >> p2m->default_access); >> - audit_p2m(p2m, 1); >> p2m_unlock(p2m); >> if ( 0 == rc ) >> gdprintk(XENLOG_ERR, >> @@ -688,7 +683,6 @@ clear_mmio_p2m_entry(struct domain *d, u >> goto out; >> } >> rc = set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, >> p2m_invalid, p2m->default_access); >> - audit_p2m(p2m, 1); >> >> out: >> p2m_unlock(p2m); >> @@ -785,7 +779,6 @@ int p2m_mem_paging_nominate(struct domai >> >> /* Fix p2m entry */ >> set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_out, a); >> - audit_p2m(p2m, 1); >> ret = 0; >> >> out: >> @@ -852,7 +845,6 @@ int p2m_mem_paging_evict(struct domain * >> >> /* Remove mapping from p2m table */ >> set_p2m_entry(p2m, gfn, _mfn(INVALID_MFN), PAGE_ORDER_4K, >> p2m_ram_paged, a); >> - audit_p2m(p2m, 1); >> >> /* Clear content before returning the page to Xen */ >> scrub_one_page(page); >> @@ -946,7 +938,6 @@ void p2m_mem_paging_populate(struct doma >> req.flags |= MEM_EVENT_FLAG_EVICT_FAIL; >> >> set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, >> p2m_ram_paging_in_start, a); >> - audit_p2m(p2m, 1); >> } >> p2m_unlock(p2m); >> >> @@ -1014,7 +1005,6 @@ int p2m_mem_paging_prep(struct domain *d >> >> /* Fix p2m mapping */ >> set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a); >> - audit_p2m(p2m, 1); >> >> atomic_dec(&d->paged_pages); >> >> @@ -1063,7 +1053,6 @@ void p2m_mem_paging_resume(struct domain >> { >> set_p2m_entry(p2m, rsp.gfn, mfn, PAGE_ORDER_4K, p2m_ram_rw, >> a); >> set_gpfn_from_mfn(mfn_x(mfn), rsp.gfn); >> - audit_p2m(p2m, 1); >> } >> p2m_unlock(p2m); >> } >> @@ -1425,6 +1414,119 @@ unsigned long paging_gva_to_gfn(struct v >> return hostmode->gva_to_gfn(v, hostp2m, va, pfec); >> } >> >> +/*** Audit ***/ >> + >> +#if P2M_AUDIT >> +void audit_p2m(struct domain *d, >> + uint64_t *orphans_debug, >> + uint64_t *orphans_invalid, >> + uint64_t *m2p_bad, >> + uint64_t *p2m_bad) >> +{ >> + struct page_info *page; >> + struct domain *od; >> + unsigned long mfn, gfn; >> + mfn_t p2mfn; >> + unsigned long orphans_d = 0, orphans_i = 0, mpbad = 0, pmbad = 0; >> + p2m_access_t p2ma; >> + p2m_type_t type; >> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >> + >> + if ( !paging_mode_translate(d) ) >> + goto out_p2m_audit; >> + >> + P2M_PRINTK("p2m audit starts\n"); >> + >> + p2m_lock(p2m); >> + >> + if (p2m->audit_p2m) >> + pmbad = p2m->audit_p2m(p2m); >> + >> + /* Audit part two: walk the domain''s page allocation list, checking >> + * the m2p entries. */ >> + spin_lock(&d->page_alloc_lock); >> + page_list_for_each ( page, &d->page_list ) >> + { >> + mfn = mfn_x(page_to_mfn(page)); >> + >> + P2M_PRINTK("auditing guest page, mfn=%#lx\n", mfn); >> + >> + od = page_get_owner(page); >> + >> + if ( od != d ) >> + { >> + P2M_PRINTK("wrong owner %#lx -> %p(%u) != %p(%u)\n", >> + mfn, od, (od?od->domain_id:-1), d, >> d->domain_id); >> + continue; >> + } >> + >> + gfn = get_gpfn_from_mfn(mfn); >> + if ( gfn == INVALID_M2P_ENTRY ) >> + { >> + orphans_i++; >> + P2M_PRINTK("orphaned guest page: mfn=%#lx has invalid >> gfn\n", >> + mfn); >> + continue; >> + } >> + >> + if ( gfn == 0x55555555 || gfn == 0x5555555555555555 ) >> + { >> + orphans_d++; >> + P2M_PRINTK("orphaned guest page: mfn=%#lx has debug gfn\n", >> + mfn); >> + continue; >> + } >> + >> + if ( gfn == SHARED_M2P_ENTRY ) >> + { >> + P2M_PRINTK("shared mfn (%lx) on domain page list!\n", >> + mfn); >> + continue; >> + } >> + >> + p2mfn = get_gfn_type_access(p2m, gfn, &type, &p2ma, p2m_query, >> NULL); >> + if ( mfn_x(p2mfn) != mfn ) >> + { >> + mpbad++; >> + P2M_PRINTK("map mismatch mfn %#lx -> gfn %#lx -> mfn %#lx" >> + " (-> gfn %#lx)\n", >> + mfn, gfn, mfn_x(p2mfn), >> + (mfn_valid(p2mfn) >> + ? get_gpfn_from_mfn(mfn_x(p2mfn)) >> + : -1u)); >> + /* This m2p entry is stale: the domain has another frame in >> + * this physical slot. No great disaster, but for >> neatness, >> + * blow away the m2p entry. */ >> + set_gpfn_from_mfn(mfn, INVALID_M2P_ENTRY); >> + } >> + __put_gfn(p2m, gfn); >> + >> + P2M_PRINTK("OK: mfn=%#lx, gfn=%#lx, p2mfn=%#lx, lp2mfn=%#lx\n", >> + mfn, gfn, mfn_x(p2mfn), lp2mfn); >> + } >> + spin_unlock(&d->page_alloc_lock); >> + >> + p2m_unlock(p2m); >> + >> + P2M_PRINTK("p2m audit complete\n"); >> + if ( orphans_i | orphans_d | mpbad | pmbad ) >> + P2M_PRINTK("p2m audit found %lu orphans (%lu inval %lu >> debug)\n", >> + orphans_i + orphans_d, orphans_i, orphans_d); >> + if ( mpbad | pmbad ) >> + { >> + P2M_PRINTK("p2m audit found %lu odd p2m, %lu bad m2p >> entries\n", >> + pmbad, mpbad); >> + WARN(); >> + } >> + >> +out_p2m_audit: >> + *orphans_debug = (uint64_t) orphans_d; >> + *orphans_invalid = (uint64_t) orphans_i; >> + *m2p_bad = (uint64_t) mpbad; >> + *p2m_bad = (uint64_t) pmbad; >> +} >> +#endif /* P2M_AUDIT */ >> + >> /* >> * Local variables: >> * mode: C >> diff -r f11528df1df3 -r 764e0872dd4f xen/include/asm-x86/p2m.h >> --- a/xen/include/asm-x86/p2m.h >> +++ b/xen/include/asm-x86/p2m.h >> @@ -244,6 +244,7 @@ struct p2m_domain { >> unsigned long gfn, >> l1_pgentry_t *p, >> mfn_t table_mfn, l1_pgentry_t >> new, >> unsigned int level); >> + long (*audit_p2m)(struct p2m_domain *p2m); >> >> /* Default P2M access type for each page in the the domain: new >> pages, >> * swapped in pages, cleared pages, and pages that are ambiquously >> @@ -558,13 +559,15 @@ int set_p2m_entry(struct p2m_domain *p2m >> extern void p2m_pt_init(struct p2m_domain *p2m); >> >> /* Debugging and auditing of the P2M code? */ >> -#define P2M_AUDIT 0 >> +#define P2M_AUDIT 1 >> #define P2M_DEBUGGING 0 >> >> #if P2M_AUDIT >> -extern void audit_p2m(struct p2m_domain *p2m, int strict_m2p); >> -#else >> -# define audit_p2m(_p2m, _m2p) do { (void)(_p2m),(_m2p); } while (0) >> +extern void audit_p2m(struct domain *d, >> + uint64_t *orphans_debug, >> + uint64_t *orphans_invalid, >> + uint64_t *m2p_bad, >> + uint64_t *p2m_bad); >> #endif /* P2M_AUDIT */ >> >> /* Printouts */ >> diff -r f11528df1df3 -r 764e0872dd4f xen/include/public/domctl.h >> --- a/xen/include/public/domctl.h >> +++ b/xen/include/public/domctl.h >> @@ -800,6 +800,16 @@ struct xen_domctl_mem_sharing_op { >> typedef struct xen_domctl_mem_sharing_op xen_domctl_mem_sharing_op_t; >> DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_sharing_op_t); >> >> +struct xen_domctl_audit_p2m { >> + /* OUT error counts */ >> + uint64_t orphans_debug; >> + uint64_t orphans_invalid; >> + uint64_t m2p_bad; >> + uint64_t p2m_bad; >> +}; >> +typedef struct xen_domctl_audit_p2m xen_domctl_audit_p2m_t; >> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_audit_p2m_t); >> + >> #if defined(__i386__) || defined(__x86_64__) >> /* XEN_DOMCTL_setvcpuextstate */ >> /* XEN_DOMCTL_getvcpuextstate */ >> @@ -898,6 +908,7 @@ struct xen_domctl { >> #define XEN_DOMCTL_setvcpuextstate 62 >> #define XEN_DOMCTL_getvcpuextstate 63 >> #define XEN_DOMCTL_set_access_required 64 >> +#define XEN_DOMCTL_audit_p2m 65 >> #define XEN_DOMCTL_gdbsx_guestmemio 1000 >> #define XEN_DOMCTL_gdbsx_pausevcpu 1001 >> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002 >> @@ -951,6 +962,7 @@ struct xen_domctl { >> struct xen_domctl_vcpuextstate vcpuextstate; >> #endif >> struct xen_domctl_set_access_required access_required; >> + struct xen_domctl_audit_p2m audit_p2m; >> struct xen_domctl_gdbsx_memio gdbsx_guest_memio; >> struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu; >> struct xen_domctl_gdbsx_domstatus gdbsx_domstatus; >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >> >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Nov-15 15:39 UTC
Re: [Xen-devel] [PATCH 1 of 6] The PoD code may split a 1GB superpage in a potentially unlocked way
(Including xen-devel in my reply...) On Mon, Nov 14, 2011 at 9:48 PM, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:> xen/arch/x86/mm/p2m-pod.c | 1 - > xen/arch/x86/mm/p2m-pt.c | 9 ++++++--- > 2 files changed, 6 insertions(+), 4 deletions(-) > > > The path p2m-lookup -> p2m-pt->get_entry -> 1GB PoD superpage -> > pod_demand_populate ends in the pod code performing a p2m_set_entry with > no locks held (in order to split the 1GB superpage into 512 2MB ones) > > Further, it calls p2m_unlock after that, which will break the spinlock.Yeah, not sure how this got to be the way it did... good catch. Acked-by: George Dunlap <george.dunlap@eu.citrix.com>> > This patch attempts to fix that. > > Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> > > diff -r 0844b17df7a9 -r d9a344a5c1e3 xen/arch/x86/mm/p2m-pod.c > --- a/xen/arch/x86/mm/p2m-pod.c > +++ b/xen/arch/x86/mm/p2m-pod.c > @@ -987,7 +987,6 @@ p2m_pod_demand_populate(struct p2m_domai > set_p2m_entry(p2m, gfn_aligned, _mfn(0), PAGE_ORDER_2M, > p2m_populate_on_demand, p2m->default_access); > audit_p2m(p2m, 1); > - p2m_unlock(p2m); > return 0; > } > > diff -r 0844b17df7a9 -r d9a344a5c1e3 xen/arch/x86/mm/p2m-pt.c > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -542,10 +542,11 @@ pod_retry_l3: > /* The read has succeeded, so we know that mapping exists */ > if ( q != p2m_query ) > { > - if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) > + if ( !p2m_pod_check_and_populate(p2m, gfn, > + (l1_pgentry_t *) &l3e, PAGE_ORDER_1G, q) ) > goto pod_retry_l3; > p2mt = p2m_invalid; > - printk("%s: Allocate 1GB failed!\n", __func__); > + gdprintk(XENLOG_ERR, "%s: Allocate 1GB failed!\n", __func__); > goto out; > } > else > @@ -743,8 +744,10 @@ pod_retry_l3: > { > if ( q != p2m_query ) > { > - if ( !p2m_pod_demand_populate(p2m, gfn, PAGE_ORDER_1G, q) ) > + if ( !p2m_pod_check_and_populate(p2m, gfn, > + (l1_pgentry_t *) l3e, PAGE_ORDER_1G, q) ) > goto pod_retry_l3; > + gdprintk(XENLOG_ERR, "%s: Allocate 1GB failed!\n", __func__); > } > else > *t = p2m_populate_on_demand; > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
At 16:48 -0500 on 14 Nov (1321289326), Andres Lagar-Cavilla wrote:> The p2m audit code doesn''t even compile, let alone work. It also > partially supports ept. Make it: > > - compile > - lay groundwork for eventual ept support > - move out of the way of all calls and turn it into a domctl. It''s > obviously not being used by anybody presently. > - enable it via said domctlThanks for looking at this code (which, as you say, had considerably rotted). I''m not sure I''m a big fan of provoking audits from user-space rather than having them run on every operation; in previous incarnations there have been serial debug-keys that triggered auditing code (which would then be run before and after every operation) - I found that much more helpful in the case of failure, as it pointed to which operation had caused the problem rather than saying ''something bad happened at somne point''. If you really want to keep the hypercall, I think it could probably be part of the existing paging/shadow control domctl rather than having its own. That would have the advantage of preventing an untrusted domain from calling it on itself (which has in the past turned slightly bitrotted audit code into a denial-of-service vector!). Cheers, Tim.
Hi, At 16:48 -0500 on 14 Nov (1321289321), Andres Lagar-Cavilla wrote:> This patch series brings about a number of fixes to the p2m code > in anticipation of synchonizing lookups. Specifically: > - Four bug fixes on shadow domctls, setting shared p2m entries, > and splitting 1GB PoD superpages > - Rework the p2m audit code. Today it does neither compile nor > apply to ept, so make it compile and move it out of the way > for most callers, which obviously don''t care. > > Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>Thanks. - I''ve applied 1-3; - 4 is a tools patch and needs a tools maintainer''s ack; - 5 I commented on separately; - 6 depends on 5 and is a tools patch. Cheers, Tim.
Note that patches 1-3 you just applied were resent yesterday (in identical form). So please ignore patches 1-3 from yesterday''s ''MM fixes'' series. Thanks a bunch. Andres> Hi, > > At 16:48 -0500 on 14 Nov (1321289321), Andres Lagar-Cavilla wrote: >> This patch series brings about a number of fixes to the p2m code >> in anticipation of synchonizing lookups. Specifically: >> - Four bug fixes on shadow domctls, setting shared p2m entries, >> and splitting 1GB PoD superpages >> - Rework the p2m audit code. Today it does neither compile nor >> apply to ept, so make it compile and move it out of the way >> for most callers, which obviously don''t care. >> >> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> > > Thanks. > - I''ve applied 1-3; > - 4 is a tools patch and needs a tools maintainer''s ack; > - 5 I commented on separately; > - 6 depends on 5 and is a tools patch. > > Cheers, > > Tim. >
Tim, I don''t think we''ll ever be able to move out of global p2m locks if we allow these inline audits. In a fine-grained p2m lock case, the code will acquire exclusive rights on a range of the p2m. But then, in order to audit the whole p2m, it will need to promote its exclusive access to the whole p2m, and that''s a deadlock trap. Ultimately, I can submit a patch that fixes the bit rotting and leaves the inlines in place. And if and when the p2m locking becomes fine-grained, we can ifdef into global locking if P2M_AUDIT is nonzero. It''s kinda yucky. And somewhat crippling of the intended debugging purposes of an audit (with fine-grained locks). But it will work and will not put the fine-grained cart before the locking horse. Thanks Andres> At 16:48 -0500 on 14 Nov (1321289326), Andres Lagar-Cavilla wrote: >> The p2m audit code doesn''t even compile, let alone work. It also >> partially supports ept. Make it: >> >> - compile >> - lay groundwork for eventual ept support >> - move out of the way of all calls and turn it into a domctl. It''s >> obviously not being used by anybody presently. >> - enable it via said domctl > > Thanks for looking at this code (which, as you say, had considerably > rotted). > > I''m not sure I''m a big fan of provoking audits from user-space rather > than having them run on every operation; in previous incarnations there > have been serial debug-keys that triggered auditing code (which would > then be run before and after every operation) - I found that much more > helpful in the case of failure, as it pointed to which operation had > caused the problem rather than saying ''something bad happened at somne > point''. > > If you really want to keep the hypercall, I think it could probably be > part of the existing paging/shadow control domctl rather than having > its own. That would have the advantage of preventing an untrusted > domain from calling it on itself (which has in the past turned slightly > bitrotted audit code into a denial-of-service vector!). > > Cheers, > > Tim. >
At 08:21 -0800 on 24 Nov (1322122868), Andres Lagar-Cavilla wrote:> I don''t think we''ll ever be able to move out of global p2m locks if we > allow these inline audits. In a fine-grained p2m lock case, the code will > acquire exclusive rights on a range of the p2m. But then, in order to > audit the whole p2m, it will need to promote its exclusive access to the > whole p2m, and that''s a deadlock trap.Hmmm. Fair enough, I guess. As you said, this audit code wan''t being any use so your patch is an improvement. :) I''d still like to see the interface changed, in particular so a domain can''t audit itself. Tim.> Ultimately, I can submit a patch that fixes the bit rotting and leaves the > inlines in place. And if and when the p2m locking becomes fine-grained, we > can ifdef into global locking if P2M_AUDIT is nonzero. > > It''s kinda yucky. And somewhat crippling of the intended debugging > purposes of an audit (with fine-grained locks). But it will work and will > not put the fine-grained cart before the locking horse. > > Thanks > Andres > > > At 16:48 -0500 on 14 Nov (1321289326), Andres Lagar-Cavilla wrote: > >> The p2m audit code doesn''t even compile, let alone work. It also > >> partially supports ept. Make it: > >> > >> - compile > >> - lay groundwork for eventual ept support > >> - move out of the way of all calls and turn it into a domctl. It''s > >> obviously not being used by anybody presently. > >> - enable it via said domctl > > > > Thanks for looking at this code (which, as you say, had considerably > > rotted). > > > > I''m not sure I''m a big fan of provoking audits from user-space rather > > than having them run on every operation; in previous incarnations there > > have been serial debug-keys that triggered auditing code (which would > > then be run before and after every operation) - I found that much more > > helpful in the case of failure, as it pointed to which operation had > > caused the problem rather than saying ''something bad happened at somne > > point''. > > > > If you really want to keep the hypercall, I think it could probably be > > part of the existing paging/shadow control domctl rather than having > > its own. That would have the advantage of preventing an untrusted > > domain from calling it on itself (which has in the past turned slightly > > bitrotted audit code into a denial-of-service vector!). > > > > Cheers, > > > > Tim. > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel