Changes from sharing overhaul v3 posted Jan 16th:

- Patches 1,2: Improved description in header
- Patch 1:
  + Changed mem_sharing audit from int to void
  + Fixed format of audit debugtrace printk's
- Patch 2: added comment to code explaining grant sharing raison-d'etre
- Patches 4 (mainly), but 6 and 8 as well:
  + Using a per-cpu static struct instead of stack var for lock order and
    deadlock detection book-keeping.
- Patches 7, 9, 11: Added Acked-by's (all the major tools patches acked-by
  tools maintainers now)
- Patch 8:
  + Fixed off-by-one audit type accounting caused by PGT_locked bumping the
    page type count
- New patch 13: rename shared_info to sharing so that grep/cscope/etc don't
  alias it with the unrelated shared info page.

Regular description follows...
--------------------
This patch series brings about a set of changes to the sharing support in the
hypervisor and tools. Specifically:

- Granular, scalable locking: lock mfn's individually when sharing/unsharing
  them. Do not rely on a global lock. Use RCU to maintain a list of all shared
  frames for audit purposes.
- Refreshed stats: they now work (tm), and are accessible via libxc and the
  console.
- libxl/xl support to get sharing stats.
- New sharing command "add to physmap", to directly populate a new domain with
  shared frames.
- Toolstack and memshr kept in sync.

A consequence of our modifications is a change to the sharing API. While we
refresh the only user of the sharing API in the tree, we want to make sure we
are not affecting any other users of the sharing API.

Patches 9, 10, 11 are tools patches. Patches 1 to 8 are x86/mm. Patch 12 is a
simple testing tool. Patch 13 is a style fix, no functional changes.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

 xen/arch/x86/mm/mem_sharing.c        |  539 +++++++++++++++++-----------------
 xen/include/asm-x86/mem_sharing.h    |   19 +-
 xen/include/asm-x86/mm.h             |   11 +-
 xen/include/public/domctl.h          |    3 +
 xen/arch/x86/mm/mem_sharing.c        |   57 +++-
 xen/include/public/domctl.h          |   13 +
 xen/arch/x86/mm.c                    |   74 +----
 xen/arch/x86/mm/mem_sharing.c        |  257 +++++++++++++++-
 xen/arch/x86/mm/mm-locks.h           |   15 +-
 xen/include/asm-x86/mm.h             |   27 +-
 xen/arch/x86/mm/mem_sharing.c        |   16 +
 xen/arch/x86/mm/mm-locks.h           |   18 +-
 xen/include/asm-x86/mm.h             |    3 +-
 xen/arch/x86/mm.c                    |    6 -
 xen/arch/x86/mm/mem_sharing.c        |   25 +
 xen/arch/x86/x86_64/compat/mm.c      |    6 +
 xen/arch/x86/x86_64/mm.c             |    7 +
 xen/include/asm-x86/mem_sharing.h    |    1 +
 xen/include/public/memory.h          |    1 +
 xen/arch/x86/mm/mem_sharing.c        |  103 ++++++
 xen/include/public/domctl.h          |    3 +-
 xen/arch/ia64/xen/mm.c               |    5 +
 xen/arch/x86/mm.c                    |   14 +
 xen/common/keyhandler.c              |    7 +-
 xen/include/xen/mm.h                 |    3 +
 xen/arch/x86/mm/mem_sharing.c        |   93 ++---
 xen/arch/x86/mm/mm-locks.h           |   18 -
 xen/include/asm-x86/mem_sharing.h    |    3 +-
 tools/blktap2/drivers/tapdisk-vbd.c  |    6 +-
 tools/blktap2/drivers/tapdisk.h      |    2 +-
 tools/libxc/xc_memshr.c              |   63 +++-
 tools/libxc/xenctrl.h                |   19 +-
 tools/memshr/bidir-daemon.c          |   34 +-
 tools/memshr/bidir-hash.h            |   12 +-
 tools/memshr/interface.c             |   30 +-
 tools/memshr/memshr.h                |   11 +-
 tools/libxc/xc_memshr.c              |   10 +
 tools/libxc/xenctrl.h                |   17 +
 docs/man/xl.pod.1                    |   13 +
 tools/libxl/libxl.c                  |    2 +
 tools/libxl/libxl_types.idl          |    2 +
 tools/libxl/xl.h                     |    1 +
 tools/libxl/xl_cmdimpl.c             |   66 ++++
 tools/libxl/xl_cmdtable.c            |    5 +
 tools/tests/mem-sharing/Makefile     |   26 +
 tools/tests/mem-sharing/memshrtool.c |  165 ++++++++++
 xen/arch/x86/mm/mem_sharing.c        |   64 ++--
 xen/include/asm-x86/mm.h             |    2 +-
 48 files changed, 1360 insertions(+), 537 deletions(-)
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 01 of 13] x86/mm: Eliminate hash table in sharing code as index of shared mfns
 xen/arch/x86/mm/mem_sharing.c     |  539 ++++++++++++++++++-------------------
 xen/include/asm-x86/mem_sharing.h |   19 +-
 xen/include/asm-x86/mm.h          |   11 +-
 xen/include/public/domctl.h       |    3 +
 4 files changed, 294 insertions(+), 278 deletions(-)

Eliminate the sharing hash-table mechanism by storing a list head directly in
the page info for the case when the page is shared. This does not add any
extra space to the page_info and serves to remove significant complexity from
sharing.

Signed-off-by: Adin Scannell <adin@scannell.ca>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 3f8338e68164 -r bcc4215663e8 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -3,6 +3,7 @@
  *
  * Memory sharing support.
  *
+ * Copyright (c) 2011 GridCentric, Inc. (Adin Scannell & Andres Lagar-Cavilla)
  * Copyright (c) 2009 Citrix Systems, Inc. (Grzegorz Milos)
  *
  * This program is free software; you can redistribute it and/or modify
@@ -34,15 +35,27 @@
 
 #include "mm-locks.h"
 
-/* Auditing of memory sharing code? */
-#define MEM_SHARING_AUDIT 0
-
 #if MEM_SHARING_AUDIT
 static void mem_sharing_audit(void);
 #define MEM_SHARING_DEBUG(_f, _a...) \
     debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a)
+static struct list_head shr_audit_list;
+
+static inline void audit_add_list(struct page_info *page)
+{
+    INIT_LIST_HEAD(&page->shared_info->entry);
+    list_add(&page->shared_info->entry, &shr_audit_list);
+}
+
+static inline void audit_del_list(struct page_info *page)
+{
+    list_del(&page->shared_info->entry);
+}
 #else
-#define mem_sharing_audit() do {} while(0)
+#define mem_sharing_audit() ((void)0)
+
+#define audit_add_list(p)  ((void)0)
+#define audit_del_list(p)  ((void)0)
 #endif /* MEM_SHARING_AUDIT */
 
 #define mem_sharing_enabled(d) \
@@ -58,17 +71,6 @@ static void mem_sharing_audit(void);
 static shr_handle_t next_handle = 1;
 static atomic_t nr_saved_mfns   = ATOMIC_INIT(0);
 
-typedef struct shr_hash_entry
-{
-    shr_handle_t handle;
-    mfn_t mfn;
-    struct shr_hash_entry *next;
-    struct list_head gfns;
-} shr_hash_entry_t;
-
-#define SHR_HASH_LENGTH 1000
-static shr_hash_entry_t *shr_hash[SHR_HASH_LENGTH];
-
 typedef struct gfn_info
 {
     unsigned long gfn;
@@ -89,164 +91,149 @@ static inline gfn_info_t *gfn_get_info(s
     return list_entry(list->next, gfn_info_t, list);
 }
 
-static void __init mem_sharing_hash_init(void)
+static inline gfn_info_t *mem_sharing_gfn_alloc(struct page_info *page,
+                                                struct domain *d,
+                                                unsigned long gfn)
 {
-    int i;
+    gfn_info_t *gfn_info = xmalloc(gfn_info_t);
 
-    mm_lock_init(&shr_lock);
-    for(i=0; i<SHR_HASH_LENGTH; i++)
-        shr_hash[i] = NULL;
+    if ( gfn_info == NULL )
+        return NULL;
+
+    gfn_info->gfn = gfn;
+    gfn_info->domain = d->domain_id;
+    INIT_LIST_HEAD(&gfn_info->list);
+    list_add(&gfn_info->list, &page->shared_info->gfns);
+
+    /* Increment our number of shared pages. */
+    atomic_inc(&d->shr_pages);
+
+    return gfn_info;
 }
 
-static shr_hash_entry_t *mem_sharing_hash_alloc(void)
+static inline void mem_sharing_gfn_destroy(struct domain *d,
+                                           gfn_info_t *gfn_info)
 {
-    return xmalloc(shr_hash_entry_t);
+    /* Decrement the number of pages.
*/ + atomic_dec(&d->shr_pages); + + /* Free the gfn_info structure. */ + list_del(&gfn_info->list); + xfree(gfn_info); } -static void mem_sharing_hash_destroy(shr_hash_entry_t *e) +static struct page_info* mem_sharing_lookup(unsigned long mfn) { - xfree(e); -} - -static gfn_info_t *mem_sharing_gfn_alloc(void) -{ - return xmalloc(gfn_info_t); -} - -static void mem_sharing_gfn_destroy(gfn_info_t *gfn, int was_shared) -{ - /* Decrement the number of pages, if the gfn was shared before */ - if ( was_shared ) + if ( mfn_valid(_mfn(mfn)) ) { - struct domain *d = get_domain_by_id(gfn->domain); - /* Domain may have been destroyed by now * - * (if we are called from p2m_teardown) */ - if ( d ) + struct page_info* page = mfn_to_page(_mfn(mfn)); + if ( page_get_owner(page) == dom_cow ) { - atomic_dec(&d->shr_pages); - put_domain(d); + ASSERT(page->u.inuse.type_info & PGT_type_mask); + ASSERT(get_gpfn_from_mfn(mfn) == SHARED_M2P_ENTRY); + return page; } } - xfree(gfn); -} - -static shr_hash_entry_t* mem_sharing_hash_lookup(shr_handle_t handle) -{ - shr_hash_entry_t *e; - - e = shr_hash[handle % SHR_HASH_LENGTH]; - while(e != NULL) - { - if(e->handle == handle) - return e; - e = e->next; - } return NULL; } -static shr_hash_entry_t* mem_sharing_hash_insert(shr_handle_t handle, mfn_t mfn) -{ - shr_hash_entry_t *e, **ee; - - e = mem_sharing_hash_alloc(); - if(e == NULL) return NULL; - e->handle = handle; - e->mfn = mfn; - ee = &shr_hash[handle % SHR_HASH_LENGTH]; - e->next = *ee; - *ee = e; - return e; -} - -static void mem_sharing_hash_delete(shr_handle_t handle) -{ - shr_hash_entry_t **pprev, *e; - - pprev = &shr_hash[handle % SHR_HASH_LENGTH]; - e = *pprev; - while(e != NULL) - { - if(e->handle == handle) - { - *pprev = e->next; - mem_sharing_hash_destroy(e); - return; - } - pprev = &e->next; - e = e->next; - } - printk("Could not find shr entry for handle %"PRIx64"\n", handle); - BUG(); -} - #if MEM_SHARING_AUDIT static void mem_sharing_audit(void) { - shr_hash_entry_t *e; - 
struct list_head *le; - gfn_info_t *g; - int bucket; - struct page_info *pg; + int errors = 0; + struct list_head *ae; ASSERT(shr_locked_by_me()); - for(bucket=0; bucket < SHR_HASH_LENGTH; bucket++) + list_for_each(ae, &shr_audit_list) { - e = shr_hash[bucket]; - /* Loop over all shr_hash_entries */ - while(e != NULL) + struct page_sharing_info *shared_info; + unsigned long nr_gfns = 0; + struct page_info *pg; + struct list_head *le; + mfn_t mfn; + + shared_info = list_entry(ae, struct page_sharing_info, entry); + pg = shared_info->pg; + mfn = page_to_mfn(pg); + + /* Check if the MFN has correct type, owner and handle. */ + if ( !(pg->u.inuse.type_info & PGT_shared_page) ) { - int nr_gfns=0; + MEM_SHARING_DEBUG("mfn %lx in audit list, but not PGT_shared_page (%lx)!\n", + mfn_x(mfn), pg->u.inuse.type_info & PGT_type_mask); + errors++; + continue; + } - /* Check if the MFN has correct type, owner and handle */ - pg = mfn_to_page(e->mfn); - if((pg->u.inuse.type_info & PGT_type_mask) != PGT_shared_page) - MEM_SHARING_DEBUG("mfn %lx not shared, but in the hash!\n", - mfn_x(e->mfn)); - if(page_get_owner(pg) != dom_cow) - MEM_SHARING_DEBUG("mfn %lx shared, but wrong owner (%d)!\n", - mfn_x(e->mfn), - page_get_owner(pg)->domain_id); - if(e->handle != pg->shr_handle) - MEM_SHARING_DEBUG("mfn %lx shared, but wrong handle " - "(%ld != %ld)!\n", - mfn_x(e->mfn), pg->shr_handle, e->handle); - /* Check if all GFNs map to the MFN, and the p2m types */ - list_for_each(le, &e->gfns) + /* Check the page owner. 
*/ + if ( page_get_owner(pg) != dom_cow ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but wrong owner (%hu)!\n", + mfn_x(mfn), page_get_owner(pg)->domain_id); + errors++; + } + + /* Check the m2p entry */ + if ( get_gpfn_from_mfn(mfn_x(mfn)) != SHARED_M2P_ENTRY ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but wrong m2p entry (%lx)!\n", + mfn_x(mfn), get_gpfn_from_mfn(mfn_x(mfn))); + errors++; + } + + /* Check we have a list */ + if ( (!pg->shared_info) || (list_empty(&pg->shared_info->gfns)) ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but empty gfn list!\n", + mfn_x(mfn)); + errors++; + continue; + } + + /* Check if all GFNs map to the MFN, and the p2m types */ + list_for_each(le, &pg->shared_info->gfns) + { + struct domain *d; + p2m_type_t t; + mfn_t o_mfn; + gfn_info_t *g; + + g = list_entry(le, gfn_info_t, list); + d = get_domain_by_id(g->domain); + if ( d == NULL ) { - struct domain *d; - p2m_type_t t; - mfn_t mfn; - - g = list_entry(le, struct gfn_info, list); - d = get_domain_by_id(g->domain); - if(d == NULL) - { - MEM_SHARING_DEBUG("Unknow dom: %d, for PFN=%lx, MFN=%lx\n", - g->domain, g->gfn, mfn_x(e->mfn)); - continue; - } - mfn = get_gfn_unlocked(d, g->gfn, &t); - if(mfn_x(mfn) != mfn_x(e->mfn)) - MEM_SHARING_DEBUG("Incorrect P2M for d=%d, PFN=%lx." - "Expecting MFN=%ld, got %ld\n", - g->domain, g->gfn, mfn_x(e->mfn), - mfn_x(mfn)); - if(t != p2m_ram_shared) - MEM_SHARING_DEBUG("Incorrect P2M type for d=%d, PFN=%lx." - "Expecting t=%d, got %d\n", - g->domain, g->gfn, mfn_x(e->mfn), - p2m_ram_shared, t); - put_domain(d); - nr_gfns++; - } - if(nr_gfns != (pg->u.inuse.type_info & PGT_count_mask)) - MEM_SHARING_DEBUG("Mismatched counts for MFN=%lx." 
- "nr_gfns in hash %d, in type_info %d\n", - mfn_x(e->mfn), nr_gfns, - (pg->u.inuse.type_info & PGT_count_mask)); - e = e->next; + MEM_SHARING_DEBUG("Unknown dom: %hu, for PFN=%lx, MFN=%lx\n", + g->domain, g->gfn, mfn_x(mfn)); + errors++; + continue; + } + o_mfn = get_gfn_query_unlocked(d, g->gfn, &t); + if ( mfn_x(o_mfn) != mfn_x(mfn) ) + { + MEM_SHARING_DEBUG("Incorrect P2M for d=%hu, PFN=%lx." + "Expecting MFN=%lx, got %lx\n", + g->domain, g->gfn, mfn_x(mfn), mfn_x(o_mfn)); + errors++; + } + if ( t != p2m_ram_shared ) + { + MEM_SHARING_DEBUG("Incorrect P2M type for d=%hu, PFN=%lx MFN=%lx." + "Expecting t=%d, got %d\n", + g->domain, g->gfn, mfn_x(mfn), p2m_ram_shared, t); + errors++; + } + put_domain(d); + nr_gfns++; + } + if ( nr_gfns != (pg->u.inuse.type_info & PGT_count_mask) ) + { + MEM_SHARING_DEBUG("Mismatched counts for MFN=%lx." + "nr_gfns in list %lu, in type_info %lx\n", + mfn_x(mfn), nr_gfns, + (pg->u.inuse.type_info & PGT_count_mask)); + errors++; } } } @@ -383,36 +370,6 @@ static int mem_sharing_gref_to_gfn(struc return 0; } -/* Account for a GFN being shared/unshared. - * When sharing this function needs to be called _before_ gfn lists are merged - * together, but _after_ gfn is removed from the list when unsharing. - */ -static int mem_sharing_gfn_account(struct gfn_info *gfn, int sharing) -{ - struct domain *d; - - /* A) When sharing: - * if the gfn being shared is in > 1 long list, its already been - * accounted for - * B) When unsharing: - * if the list is longer than > 1, we don''t have to account yet. 
- */ - if(list_has_one_entry(&gfn->list)) - { - d = get_domain_by_id(gfn->domain); - BUG_ON(!d); - if(sharing) - atomic_inc(&d->shr_pages); - else - atomic_dec(&d->shr_pages); - put_domain(d); - - return 1; - } - mem_sharing_audit(); - - return 0; -} int mem_sharing_debug_gref(struct domain *d, grant_ref_t ref) { @@ -450,8 +407,6 @@ int mem_sharing_nominate_page(struct dom mfn_t mfn; struct page_info *page; int ret; - shr_handle_t handle; - shr_hash_entry_t *hash_entry; struct gfn_info *gfn_info; *phandle = 0UL; @@ -467,7 +422,7 @@ int mem_sharing_nominate_page(struct dom /* Return the handle if the page is already shared */ page = mfn_to_page(mfn); if ( p2m_is_shared(p2mt) ) { - *phandle = page->shr_handle; + *phandle = page->shared_info->handle; ret = 0; goto out; } @@ -481,16 +436,26 @@ int mem_sharing_nominate_page(struct dom if ( ret ) goto out; - /* Create the handle */ + /* Initialize the shared state */ ret = -ENOMEM; - handle = next_handle++; - if((hash_entry = mem_sharing_hash_insert(handle, mfn)) == NULL) + if ( (page->shared_info = + xmalloc(struct page_sharing_info)) == NULL ) { + BUG_ON(page_make_private(d, page) != 0); goto out; } - if((gfn_info = mem_sharing_gfn_alloc()) == NULL) + page->shared_info->pg = page; + INIT_LIST_HEAD(&page->shared_info->gfns); + + /* Create the handle */ + page->shared_info->handle = next_handle++; + + /* Create the local gfn info */ + if ( (gfn_info = mem_sharing_gfn_alloc(page, d, gfn)) == NULL ) { - mem_sharing_hash_destroy(hash_entry); + xfree(page->shared_info); + page->shared_info = NULL; + BUG_ON(page_make_private(d, page) != 0); goto out; } @@ -501,23 +466,19 @@ int mem_sharing_nominate_page(struct dom * it a few lines above. * The mfn needs to revert back to rw type. This should never fail, * since no-one knew that the mfn was temporarily sharable */ + mem_sharing_gfn_destroy(d, gfn_info); + xfree(page->shared_info); + page->shared_info = NULL; + /* NOTE: We haven''t yet added this to the audit list. 
*/ BUG_ON(page_make_private(d, page) != 0); - mem_sharing_hash_destroy(hash_entry); - mem_sharing_gfn_destroy(gfn_info, 0); goto out; } /* Update m2p entry to SHARED_M2P_ENTRY */ set_gpfn_from_mfn(mfn_x(mfn), SHARED_M2P_ENTRY); - INIT_LIST_HEAD(&hash_entry->gfns); - INIT_LIST_HEAD(&gfn_info->list); - list_add(&gfn_info->list, &hash_entry->gfns); - gfn_info->gfn = gfn; - gfn_info->domain = d->domain_id; - page->shr_handle = handle; - *phandle = handle; - + *phandle = page->shared_info->handle; + audit_add_list(page); ret = 0; out: @@ -526,54 +487,82 @@ out: return ret; } -int mem_sharing_share_pages(shr_handle_t sh, shr_handle_t ch) +int mem_sharing_share_pages(struct domain *sd, unsigned long sgfn, shr_handle_t sh, + struct domain *cd, unsigned long cgfn, shr_handle_t ch) { - shr_hash_entry_t *se, *ce; struct page_info *spage, *cpage; struct list_head *le, *te; - struct gfn_info *gfn; + gfn_info_t *gfn; struct domain *d; - int ret; + int ret = -EINVAL; + mfn_t smfn, cmfn; + p2m_type_t smfn_type, cmfn_type; shr_lock(); + /* XXX if sd == cd handle potential deadlock by ordering + * the get_ and put_gfn''s */ + smfn = get_gfn(sd, sgfn, &smfn_type); + cmfn = get_gfn(cd, cgfn, &cmfn_type); + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; - se = mem_sharing_hash_lookup(sh); - if(se == NULL) goto err_out; + spage = mem_sharing_lookup(mfn_x(smfn)); + if ( spage == NULL ) + goto err_out; + ASSERT(smfn_type == p2m_ram_shared); ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; - ce = mem_sharing_hash_lookup(ch); - if(ce == NULL) goto err_out; - spage = mfn_to_page(se->mfn); - cpage = mfn_to_page(ce->mfn); - /* gfn lists always have at least one entry => save to call list_entry */ - mem_sharing_gfn_account(gfn_get_info(&ce->gfns), 1); - mem_sharing_gfn_account(gfn_get_info(&se->gfns), 1); - list_for_each_safe(le, te, &ce->gfns) + cpage = mem_sharing_lookup(mfn_x(cmfn)); + if ( cpage == NULL ) + goto err_out; + ASSERT(cmfn_type == p2m_ram_shared); + + /* Check that the handles 
match */ + if ( spage->shared_info->handle != sh ) { - gfn = list_entry(le, struct gfn_info, list); - /* Get the source page and type, this should never fail - * because we are under shr lock, and got non-null se */ + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + goto err_out; + } + if ( cpage->shared_info->handle != ch ) + { + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + goto err_out; + } + + /* Merge the lists together */ + list_for_each_safe(le, te, &cpage->shared_info->gfns) + { + gfn = list_entry(le, gfn_info_t, list); + /* Get the source page and type, this should never fail: + * we are under shr lock, and got a successful lookup */ BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page)); - /* Move the gfn_info from ce list to se list */ + /* Move the gfn_info from client list to source list */ list_del(&gfn->list); + list_add(&gfn->list, &spage->shared_info->gfns); + put_page_and_type(cpage); d = get_domain_by_id(gfn->domain); BUG_ON(!d); - BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0); + BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn) == 0); put_domain(d); - list_add(&gfn->list, &se->gfns); - put_page_and_type(cpage); - } - ASSERT(list_empty(&ce->gfns)); - mem_sharing_hash_delete(ch); - atomic_inc(&nr_saved_mfns); + } + ASSERT(list_empty(&cpage->shared_info->gfns)); + + /* Clear the rest of the shared state */ + audit_del_list(cpage); + xfree(cpage->shared_info); + cpage->shared_info = NULL; + /* Free the client page */ if(test_and_clear_bit(_PGC_allocated, &cpage->count_info)) put_page(cpage); + + /* We managed to free a domain page. 
*/ + atomic_inc(&nr_saved_mfns); ret = 0; err_out: + put_gfn(cd, cgfn); + put_gfn(sd, sgfn); shr_unlock(); - return ret; } @@ -585,13 +574,9 @@ int mem_sharing_unshare_page(struct doma mfn_t mfn; struct page_info *page, *old_page; void *s, *t; - int ret, last_gfn; - shr_hash_entry_t *hash_entry; - struct gfn_info *gfn_info = NULL; - shr_handle_t handle; + int last_gfn; + gfn_info_t *gfn_info = NULL; struct list_head *le; - - /* Remove the gfn_info from the list */ /* This is one of the reasons why we can''t enforce ordering * between shr_lock and p2m fine-grained locks in mm-lock. @@ -607,56 +592,62 @@ int mem_sharing_unshare_page(struct doma return 0; } - page = mfn_to_page(mfn); - handle = page->shr_handle; - - hash_entry = mem_sharing_hash_lookup(handle); - list_for_each(le, &hash_entry->gfns) + page = mem_sharing_lookup(mfn_x(mfn)); + if ( page == NULL ) { - gfn_info = list_entry(le, struct gfn_info, list); + gdprintk(XENLOG_ERR, "Domain p2m is shared, but page is not: " + "%lx\n", gfn); + BUG(); + } + + list_for_each(le, &page->shared_info->gfns) + { + gfn_info = list_entry(le, gfn_info_t, list); if ( (gfn_info->gfn == gfn) && (gfn_info->domain == d->domain_id) ) goto gfn_found; } gdprintk(XENLOG_ERR, "Could not find gfn_info for shared gfn: " "%lx\n", gfn); BUG(); -gfn_found: - /* Delete gfn_info from the list, but hold on to it, until we''ve allocated - * memory to make a copy */ - list_del(&gfn_info->list); - last_gfn = list_empty(&hash_entry->gfns); +gfn_found: + /* Do the accounting first. If anything fails below, we have bigger + * bigger fish to fry. First, remove the gfn from the list. 
*/ + last_gfn = list_has_one_entry(&page->shared_info->gfns); + mem_sharing_gfn_destroy(d, gfn_info); + if ( last_gfn ) + { + /* Clean up shared state */ + audit_del_list(page); + xfree(page->shared_info); + page->shared_info = NULL; + } + else + atomic_dec(&nr_saved_mfns); /* If the GFN is getting destroyed drop the references to MFN * (possibly freeing the page), and exit early */ if ( flags & MEM_SHARING_DESTROY_GFN ) { - mem_sharing_gfn_destroy(gfn_info, !last_gfn); - if(last_gfn) - mem_sharing_hash_delete(handle); - else - /* Even though we don''t allocate a private page, we have to account - * for the MFN that originally backed this PFN. */ - atomic_dec(&nr_saved_mfns); put_gfn(d, gfn); shr_unlock(); put_page_and_type(page); - if(last_gfn && - test_and_clear_bit(_PGC_allocated, &page->count_info)) + if ( last_gfn && + test_and_clear_bit(_PGC_allocated, &page->count_info) ) put_page(page); + return 0; } - ret = page_make_private(d, page); - BUG_ON(last_gfn & ret); - if(ret == 0) goto private_page_found; - + if ( last_gfn ) + { + BUG_ON(page_make_private(d, page) != 0); + goto private_page_found; + } + old_page = page; page = alloc_domheap_page(d, 0); - if(!page) + if ( !page ) { - /* We''ve failed to obtain memory for private page. Need to re-add the - * gfn_info to relevant list */ - list_add(&gfn_info->list, &hash_entry->gfns); put_gfn(d, gfn); mem_sharing_notify_helper(d, gfn); shr_unlock(); @@ -669,30 +660,18 @@ gfn_found: unmap_domain_page(s); unmap_domain_page(t); - /* NOTE: set_shared_p2m_entry will switch the underlying mfn. 
If - * we do get_page withing get_gfn, the correct sequence here - * should be - get_page(page); - put_page(old_page); - * so that the ref to the old page is dropped, and a ref to - * the new page is obtained to later be dropped in put_gfn */ BUG_ON(set_shared_p2m_entry(d, gfn, page_to_mfn(page)) == 0); put_page_and_type(old_page); private_page_found: - /* We''ve got a private page, we can commit the gfn destruction */ - mem_sharing_gfn_destroy(gfn_info, !last_gfn); - if(last_gfn) - mem_sharing_hash_delete(handle); - else - atomic_dec(&nr_saved_mfns); - if ( p2m_change_type(d, gfn, p2m_ram_shared, p2m_ram_rw) != p2m_ram_shared ) { - printk("Could not change p2m type.\n"); + gdprintk(XENLOG_ERR, "Could not change p2m type d %hu gfn %lx.\n", + d->domain_id, gfn); BUG(); } + /* Update m2p entry */ set_gpfn_from_mfn(mfn_x(page_to_mfn(page)), gfn); @@ -749,9 +728,18 @@ int mem_sharing_domctl(struct domain *d, case XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE: { - shr_handle_t sh = mec->u.share.source_handle; - shr_handle_t ch = mec->u.share.client_handle; - rc = mem_sharing_share_pages(sh, ch); + unsigned long sgfn = mec->u.share.source_gfn; + shr_handle_t sh = mec->u.share.source_handle; + struct domain *cd = get_domain_by_id(mec->u.share.client_domain); + if ( cd ) + { + unsigned long cgfn = mec->u.share.client_gfn; + shr_handle_t ch = mec->u.share.client_handle; + rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch); + put_domain(cd); + } + else + return -EEXIST; } break; @@ -799,6 +787,9 @@ int mem_sharing_domctl(struct domain *d, void __init mem_sharing_init(void) { printk("Initing memory sharing.\n"); - mem_sharing_hash_init(); + mm_lock_init(&shr_lock); +#if MEM_SHARING_AUDIT + INIT_LIST_HEAD(&shr_audit_list); +#endif } diff -r 3f8338e68164 -r bcc4215663e8 xen/include/asm-x86/mem_sharing.h --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -22,13 +22,28 @@ #ifndef __MEM_SHARING_H__ #define __MEM_SHARING_H__ +#include <public/domctl.h> 
+ +/* Auditing of memory sharing code? */ +#define MEM_SHARING_AUDIT 0 + +typedef uint64_t shr_handle_t; + +struct page_sharing_info +{ + struct page_info *pg; /* Back pointer to the page. */ + shr_handle_t handle; /* Globally unique version / handle. */ +#if MEM_SHARING_AUDIT + struct list_head entry; /* List of all shared pages (entry). */ +#endif + struct list_head gfns; /* List of domains and gfns for this page (head). */ +}; + #ifdef __x86_64__ #define sharing_supported(_d) \ (is_hvm_domain(_d) && paging_mode_hap(_d)) -typedef uint64_t shr_handle_t; - unsigned int mem_sharing_get_nr_saved_mfns(void); int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, diff -r 3f8338e68164 -r bcc4215663e8 xen/include/asm-x86/mm.h --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -31,6 +31,8 @@ struct page_list_entry __pdx_t next, prev; }; +struct page_sharing_info; + struct page_info { union { @@ -49,8 +51,13 @@ struct page_info /* For non-pinnable single-page shadows, a higher entry that points * at us. */ paddr_t up; - /* For shared/sharable pages the sharing handle */ - uint64_t shr_handle; + /* For shared/sharable pages, we use a doubly-linked list + * of all the {pfn,domain} pairs that map this page. We also include + * an opaque handle, which is effectively a version, so that clients + * of sharing share the version they expect to. + * This list is allocated and freed when a page is shared/unshared. + */ + struct page_sharing_info *shared_info; }; /* Reference count and various PGC_xxx flags and fields. 
*/ diff -r 3f8338e68164 -r bcc4215663e8 xen/include/public/domctl.h --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -789,7 +789,10 @@ struct xen_domctl_mem_sharing_op { uint64_aligned_t handle; /* OUT: the handle */ } nominate; struct mem_sharing_op_share { /* OP_SHARE */ + uint64_aligned_t source_gfn; /* IN: the gfn of the source page */ uint64_aligned_t source_handle; /* IN: handle to the source page */ + domid_t client_domain; /* IN: the client domain id */ + uint64_aligned_t client_gfn; /* IN: the client gfn */ uint64_aligned_t client_handle; /* IN: handle to the client page */ } share; struct mem_sharing_op_debug { /* OP_DEBUG_xxx */
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 02 of 13] x86/mm: Update mem sharing interface to (re)allow sharing of grants
 xen/arch/x86/mm/mem_sharing.c |   57 ++++++++++++++++++++++++++++++++++++------
 xen/include/public/domctl.h   |   13 +++++++++
 2 files changed, 61 insertions(+), 9 deletions(-)

Previously, the mem sharing code would return an opaque handle to index shared
pages (and nominees) in its global hash table. By removing the hash table, the
new interface requires a gfn and a version. However, when sharing grants, the
caller provides a grant ref and a version. Update the interface to handle this
case.

The use case for grant sharing is when sharing from within a backend (e.g.
memshr + blktap2), in which case the backend is only exposed to grant
references.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r bcc4215663e8 -r 61e164d6e956 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -728,18 +728,57 @@ int mem_sharing_domctl(struct domain *d,
 
         case XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE:
         {
-            unsigned long sgfn = mec->u.share.source_gfn;
-            shr_handle_t sh = mec->u.share.source_handle;
-            struct domain *cd = get_domain_by_id(mec->u.share.client_domain);
-            if ( cd )
+            unsigned long sgfn, cgfn;
+            struct domain *cd;
+            shr_handle_t sh, ch;
+
+            if ( !mem_sharing_enabled(d) )
+                return -EINVAL;
+
+            cd = get_domain_by_id(mec->u.share.client_domain);
+            if ( !cd )
+                return -ESRCH;
+
+            if ( !mem_sharing_enabled(cd) )
             {
-                unsigned long cgfn = mec->u.share.client_gfn;
-                shr_handle_t ch = mec->u.share.client_handle;
-                rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch);
                 put_domain(cd);
+                return -EINVAL;
             }
-            else
-                return -EEXIST;
+
+            if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.source_gfn) )
+            {
+                grant_ref_t gref = (grant_ref_t)
+                                    (XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(
+                                        mec->u.share.source_gfn));
+                if ( mem_sharing_gref_to_gfn(d, gref, &sgfn) < 0 )
+                {
+                    put_domain(cd);
+                    return -EINVAL;
+                }
+            } else {
+                sgfn = mec->u.share.source_gfn;
+            }
+
+            if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.client_gfn) )
+            {
+                grant_ref_t gref = (grant_ref_t)
+                                    (XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(
+                                        mec->u.share.client_gfn));
+                if ( mem_sharing_gref_to_gfn(cd, gref, &cgfn) < 0 )
+                {
+                    put_domain(cd);
+                    return -EINVAL;
+                }
+            } else {
+                cgfn = mec->u.share.client_gfn;
+            }
+
+            sh = mec->u.share.source_handle;
+            ch = mec->u.share.client_handle;
+
+            rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch);
+
+            put_domain(cd);
         }
         break;
 
diff -r bcc4215663e8 -r 61e164d6e956 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -775,6 +775,19 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_e
 #define XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID  (-10)
 #define XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID  (-9)
 
+/* The following allows sharing of grant refs. This is useful
+ * for sharing utilities sitting as "filters" in IO backends
+ * (e.g. memshr + blktap(2)). The IO backend is only exposed
+ * to grant references, and this allows sharing of the grefs */
+#define XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG   (1ULL << 62)
+
+#define XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(field, val)  \
+    (field) = (XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG | val)
+#define XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(field)         \
+    ((field) & XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG)
+#define XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(field)        \
+    ((field) & (~XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG))
+
 struct xen_domctl_mem_sharing_op {
     uint8_t op; /* XEN_DOMCTL_MEM_EVENT_OP_* */
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 03 of 13] x86/mm: Add per-page locking for memory sharing, when audits are disabled
 xen/arch/x86/mm.c             |   74 +-----------
 xen/arch/x86/mm/mem_sharing.c |  257 ++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/mm-locks.h    |   15 +-
 xen/include/asm-x86/mm.h      |   27 +++-
 4 files changed, 275 insertions(+), 98 deletions(-)

With the removal of the hash table, all that is needed now is locking of
individual shared pages, as new (gfn,domain) pairs are removed or added from
the list of mappings.

We recycle PGT_locked and use it to lock individual pages. We ensure deadlock
is averted by locking pages in increasing order.

The global lock remains for the benefit of the auditing code, and is thus
enabled only as a compile-time option.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r 61e164d6e956 -r f5e3c94b2066 xen/arch/x86/mm.c
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1719,7 +1719,7 @@ static int free_l4_table(struct page_inf
 #define free_l4_table(page, preemptible) (-EINVAL)
 #endif
 
-static int page_lock(struct page_info *page)
+int page_lock(struct page_info *page)
 {
     unsigned long x, nx;
 
@@ -1736,7 +1736,7 @@ static int page_lock(struct page_info *p
     return 1;
 }
 
-static void page_unlock(struct page_info *page)
+void page_unlock(struct page_info *page)
 {
     unsigned long x, nx, y = page->u.inuse.type_info;
 
@@ -4295,76 +4295,6 @@ int steal_page(
     return -1;
 }
 
-int page_make_sharable(struct domain *d,
-                       struct page_info *page,
-                       int expected_refcnt)
-{
-    spin_lock(&d->page_alloc_lock);
-
-    /* Change page type and count atomically */
-    if ( !get_page_and_type(page, d, PGT_shared_page) )
-    {
-        spin_unlock(&d->page_alloc_lock);
-        return -EINVAL;
-    }
-
-    /* Check it wasn't already sharable and undo if it was */
-    if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
-    {
-        put_page_and_type(page);
-        spin_unlock(&d->page_alloc_lock);
-        return -EEXIST;
-    }
-
-    /* Check if the ref count is 2. The first from PGC_allocated, and
-     * the second from get_page_and_type at the top of this function */
-    if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) )
-    {
-        /* Return type count back to zero */
-        put_page_and_type(page);
-        spin_unlock(&d->page_alloc_lock);
-        return -E2BIG;
-    }
-
-    page_set_owner(page, dom_cow);
-    d->tot_pages--;
-    page_list_del(page, &d->page_list);
-    spin_unlock(&d->page_alloc_lock);
-    return 0;
-}
-
-int page_make_private(struct domain *d, struct page_info *page)
-{
-    if ( !get_page(page, dom_cow) )
-        return -EINVAL;
-
-    spin_lock(&d->page_alloc_lock);
-
-    /* We can only change the type if count is one */
-    if ( (page->u.inuse.type_info & (PGT_type_mask | PGT_count_mask))
-         != (PGT_shared_page | 1) )
-    {
-        put_page(page);
-        spin_unlock(&d->page_alloc_lock);
-        return -EEXIST;
-    }
-
-    /* Drop the final typecount */
-    put_page_and_type(page);
-
-    /* Change the owner */
-    ASSERT(page_get_owner(page) == dom_cow);
-    page_set_owner(page, d);
-
-    d->tot_pages++;
-    page_list_add_tail(page, &d->page_list);
-    spin_unlock(&d->page_alloc_lock);
-
-    put_page(page);
-
-    return 0;
-}
-
 static int __do_update_va_mapping(
     unsigned long va, u64 val64, unsigned long flags, struct domain *pg_owner)
 {
diff -r 61e164d6e956 -r f5e3c94b2066 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -35,10 +35,21 @@
 
 #include "mm-locks.h"
 
+static shr_handle_t next_handle = 1;
+
 #if MEM_SHARING_AUDIT
+
+static mm_lock_t shr_lock;
+
+#define shr_lock()          _shr_lock()
+#define shr_unlock()        _shr_unlock()
+#define shr_locked_by_me()  _shr_locked_by_me()
+
 static void mem_sharing_audit(void);
+
 #define MEM_SHARING_DEBUG(_f, _a...) \
     debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a)
+
 static struct list_head shr_audit_list;
 
 static inline void audit_add_list(struct page_info *page)
@@ -51,11 +62,53 @@ static inline void audit_del_list(struct
 {
     list_del(&page->shared_info->entry);
 }
+
+static inline int mem_sharing_page_lock(struct page_info *p)
+{
+    return 1;
+}
+#define mem_sharing_page_unlock(p)  ((void)0)
+
+#define get_next_handle()   next_handle++;
 #else
+
+#define shr_lock()          ((void)0)
+#define shr_unlock()        ((void)0)
+/* Only used inside audit code */
+//#define shr_locked_by_me() ((void)0)
+
 #define mem_sharing_audit() ((void)0)
 
 #define audit_add_list(p)  ((void)0)
 #define audit_del_list(p)  ((void)0)
+
+static inline int mem_sharing_page_lock(struct page_info *pg)
+{
+    int rc;
+    rc = page_lock(pg);
+    if ( rc )
+    {
+        preempt_disable();
+    }
+    return rc;
+}
+
+static inline void mem_sharing_page_unlock(struct page_info *pg)
+{
+    preempt_enable();
+    page_unlock(pg);
+}
+
+static inline shr_handle_t get_next_handle(void)
+{
+    /* Get the next handle get_page style */
+    uint64_t x, y = next_handle;
+    do {
+        x = y;
+    }
+    while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
+    return x + 1;
+}
 #endif /* MEM_SHARING_AUDIT */
 
 #define mem_sharing_enabled(d) \
@@ -68,7 +121,6 @@ static inline void audit_del_list(struct
 #undef page_to_mfn
 #define page_to_mfn(_pg) _mfn(__page_to_mfn(_pg))
 
-static shr_handle_t next_handle = 1;
 static atomic_t nr_saved_mfns   = ATOMIC_INIT(0);
 
 typedef struct gfn_info
@@ -78,8 +130,6 @@ typedef struct gfn_info
     struct list_head list;
 } gfn_info_t;
 
-static mm_lock_t shr_lock;
-
 /* Returns true if list has only one entry. O(1) complexity.
*/ static inline int list_has_one_entry(struct list_head *head) { @@ -398,6 +448,113 @@ int mem_sharing_debug_gref(struct domain return mem_sharing_debug_gfn(d, gfn); } +/* Functions that change a page''s type and ownership */ +static int page_make_sharable(struct domain *d, + struct page_info *page, + int expected_refcnt) +{ + spin_lock(&d->page_alloc_lock); + + /* Change page type and count atomically */ + if ( !get_page_and_type(page, d, PGT_shared_page) ) + { + spin_unlock(&d->page_alloc_lock); + return -EINVAL; + } + + /* Check it wasn''t already sharable and undo if it was */ + if ( (page->u.inuse.type_info & PGT_count_mask) != 1 ) + { + put_page_and_type(page); + spin_unlock(&d->page_alloc_lock); + return -EEXIST; + } + + /* Check if the ref count is 2. The first from PGC_allocated, and + * the second from get_page_and_type at the top of this function */ + if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) ) + { + /* Return type count back to zero */ + put_page_and_type(page); + spin_unlock(&d->page_alloc_lock); + return -E2BIG; + } + + page_set_owner(page, dom_cow); + d->tot_pages--; + page_list_del(page, &d->page_list); + spin_unlock(&d->page_alloc_lock); + return 0; +} + +static int page_make_private(struct domain *d, struct page_info *page) +{ + unsigned long expected_type; + + if ( !get_page(page, dom_cow) ) + return -EINVAL; + + spin_lock(&d->page_alloc_lock); + + /* We can only change the type if count is one */ + /* If we are locking pages individually, then we need to drop + * the lock here, while the page is typed. We cannot risk the + * race of page_unlock and then put_page_type. 
*/ +#if MEM_SHARING_AUDIT + expected_type = (PGT_shared_page | PGT_validated | 1); +#else + expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2); +#endif + if ( page->u.inuse.type_info != expected_type ) + { + put_page(page); + spin_unlock(&d->page_alloc_lock); + return -EEXIST; + } + + /* Drop the final typecount */ + put_page_and_type(page); + +#ifndef MEM_SHARING_AUDIT + /* Now that we''ve dropped the type, we can unlock */ + mem_sharing_page_unlock(page); +#endif + + /* Change the owner */ + ASSERT(page_get_owner(page) == dom_cow); + page_set_owner(page, d); + + d->tot_pages++; + page_list_add_tail(page, &d->page_list); + spin_unlock(&d->page_alloc_lock); + + put_page(page); + + return 0; +} + +static inline struct page_info *__grab_shared_page(mfn_t mfn) +{ + struct page_info *pg = NULL; + + if ( !mfn_valid(mfn) ) + return NULL; + pg = mfn_to_page(mfn); + + /* If the page is not validated we can''t lock it, and if it''s + * not validated it''s obviously not shared. */ + if ( !mem_sharing_page_lock(pg) ) + return NULL; + + if ( mem_sharing_lookup(mfn_x(mfn)) == NULL ) + { + mem_sharing_page_unlock(pg); + return NULL; + } + + return pg; +} + int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, int expected_refcnt, @@ -405,7 +562,7 @@ int mem_sharing_nominate_page(struct dom { p2m_type_t p2mt; mfn_t mfn; - struct page_info *page; + struct page_info *page = NULL; /* gcc... 
*/ int ret; struct gfn_info *gfn_info; @@ -420,10 +577,17 @@ int mem_sharing_nominate_page(struct dom goto out; /* Return the handle if the page is already shared */ - page = mfn_to_page(mfn); if ( p2m_is_shared(p2mt) ) { - *phandle = page->shared_info->handle; + struct page_info *pg = __grab_shared_page(mfn); + if ( !pg ) + { + gdprintk(XENLOG_ERR, "Shared p2m entry gfn %lx, but could not " + "grab page %lx dom %d\n", gfn, mfn_x(mfn), d->domain_id); + BUG(); + } + *phandle = pg->shared_info->handle; ret = 0; + mem_sharing_page_unlock(pg); goto out; } @@ -432,15 +596,24 @@ int mem_sharing_nominate_page(struct dom goto out; /* Try to convert the mfn to the sharable type */ + page = mfn_to_page(mfn); ret = page_make_sharable(d, page, expected_refcnt); if ( ret ) goto out; + /* Now that the page is validated, we can lock it. There is no + * race because we''re holding the p2m entry, so no one else + * could be nominating this gfn */ + ret = -ENOENT; + if ( !mem_sharing_page_lock(page) ) + goto out; + /* Initialize the shared state */ ret = -ENOMEM; if ( (page->shared_info = xmalloc(struct page_sharing_info)) == NULL ) { + /* Making a page private atomically unlocks it */ BUG_ON(page_make_private(d, page) != 0); goto out; } @@ -448,7 +621,7 @@ int mem_sharing_nominate_page(struct dom INIT_LIST_HEAD(&page->shared_info->gfns); /* Create the handle */ - page->shared_info->handle = next_handle++; + page->shared_info->handle = get_next_handle(); /* Create the local gfn info */ if ( (gfn_info = mem_sharing_gfn_alloc(page, d, gfn)) == NULL ) @@ -479,6 +652,7 @@ int mem_sharing_nominate_page(struct dom *phandle = page->shared_info->handle; audit_add_list(page); + mem_sharing_page_unlock(page); ret = 0; out: @@ -490,7 +664,7 @@ out: int mem_sharing_share_pages(struct domain *sd, unsigned long sgfn, shr_handle_t sh, struct domain *cd, unsigned long cgfn, shr_handle_t ch) { - struct page_info *spage, *cpage; + struct page_info *spage, *cpage, *firstpg, *secondpg; struct list_head 
*le, *te; gfn_info_t *gfn; struct domain *d; @@ -505,26 +679,63 @@ int mem_sharing_share_pages(struct domai smfn = get_gfn(sd, sgfn, &smfn_type); cmfn = get_gfn(cd, cgfn, &cmfn_type); - ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; - spage = mem_sharing_lookup(mfn_x(smfn)); - if ( spage == NULL ) + /* This tricky business is to avoid two callers deadlocking if + * grabbing pages in opposite client/source order */ + if( mfn_x(smfn) == mfn_x(cmfn) ) + { + /* The pages are already the same. We could return some + * kind of error here, but no matter how you look at it, + * the pages are already ''shared''. It possibly represents + * a big problem somewhere else, but as far as sharing is + * concerned: great success! */ + ret = 0; goto err_out; + } + else if ( mfn_x(smfn) < mfn_x(cmfn) ) + { + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + spage = firstpg = __grab_shared_page(smfn); + if ( spage == NULL ) + goto err_out; + + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + cpage = secondpg = __grab_shared_page(cmfn); + if ( cpage == NULL ) + { + mem_sharing_page_unlock(spage); + goto err_out; + } + } else { + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + cpage = firstpg = __grab_shared_page(cmfn); + if ( cpage == NULL ) + goto err_out; + + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + spage = secondpg = __grab_shared_page(smfn); + if ( spage == NULL ) + { + mem_sharing_page_unlock(cpage); + goto err_out; + } + } + ASSERT(smfn_type == p2m_ram_shared); - ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; - cpage = mem_sharing_lookup(mfn_x(cmfn)); - if ( cpage == NULL ) - goto err_out; ASSERT(cmfn_type == p2m_ram_shared); /* Check that the handles match */ if ( spage->shared_info->handle != sh ) { ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + mem_sharing_page_unlock(secondpg); + mem_sharing_page_unlock(firstpg); goto err_out; } if ( cpage->shared_info->handle != ch ) { ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + mem_sharing_page_unlock(secondpg); + 
mem_sharing_page_unlock(firstpg); goto err_out; } @@ -551,6 +762,9 @@ int mem_sharing_share_pages(struct domai xfree(cpage->shared_info); cpage->shared_info = NULL; + mem_sharing_page_unlock(secondpg); + mem_sharing_page_unlock(firstpg); + /* Free the client page */ if(test_and_clear_bit(_PGC_allocated, &cpage->count_info)) put_page(cpage); @@ -592,7 +806,7 @@ int mem_sharing_unshare_page(struct doma return 0; } - page = mem_sharing_lookup(mfn_x(mfn)); + page = __grab_shared_page(mfn); if ( page == NULL ) { gdprintk(XENLOG_ERR, "Domain p2m is shared, but page is not: " @@ -628,18 +842,20 @@ gfn_found: * (possibly freeing the page), and exit early */ if ( flags & MEM_SHARING_DESTROY_GFN ) { - put_gfn(d, gfn); - shr_unlock(); put_page_and_type(page); + mem_sharing_page_unlock(page); if ( last_gfn && test_and_clear_bit(_PGC_allocated, &page->count_info) ) put_page(page); + put_gfn(d, gfn); + shr_unlock(); return 0; } if ( last_gfn ) { + /* Making a page private atomically unlocks it */ BUG_ON(page_make_private(d, page) != 0); goto private_page_found; } @@ -648,6 +864,7 @@ gfn_found: page = alloc_domheap_page(d, 0); if ( !page ) { + mem_sharing_page_unlock(old_page); put_gfn(d, gfn); mem_sharing_notify_helper(d, gfn); shr_unlock(); @@ -661,6 +878,7 @@ gfn_found: unmap_domain_page(t); BUG_ON(set_shared_p2m_entry(d, gfn, page_to_mfn(page)) == 0); + mem_sharing_page_unlock(old_page); put_page_and_type(old_page); private_page_found: @@ -678,6 +896,7 @@ private_page_found: /* Now that the gfn<->mfn map is properly established, * marking dirty is feasible */ paging_mark_dirty(d, mfn_x(page_to_mfn(page))); + /* We do not need to unlock a private page */ put_gfn(d, gfn); shr_unlock(); return 0; @@ -826,8 +1045,8 @@ int mem_sharing_domctl(struct domain *d, void __init mem_sharing_init(void) { printk("Initing memory sharing.\n"); +#if MEM_SHARING_AUDIT mm_lock_init(&shr_lock); -#if MEM_SHARING_AUDIT INIT_LIST_HEAD(&shr_audit_list); #endif } diff -r 61e164d6e956 -r f5e3c94b2066 
xen/arch/x86/mm/mm-locks.h --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -26,6 +26,8 @@ #ifndef _MM_LOCKS_H #define _MM_LOCKS_H +#include <asm/mem_sharing.h> + /* Per-CPU variable for enforcing the lock ordering */ DECLARE_PER_CPU(int, mm_lock_level); #define __get_lock_level() (this_cpu(mm_lock_level)) @@ -141,15 +143,22 @@ static inline void mm_enforce_order_unlo * * ************************************************************************/ +#if MEM_SHARING_AUDIT /* Page-sharing lock (global) * * A single global lock that protects the memory-sharing code''s * hash tables. */ declare_mm_lock(shr) -#define shr_lock() mm_lock(shr, &shr_lock) -#define shr_unlock() mm_unlock(&shr_lock) -#define shr_locked_by_me() mm_locked_by_me(&shr_lock) +#define _shr_lock() mm_lock(shr, &shr_lock) +#define _shr_unlock() mm_unlock(&shr_lock) +#define _shr_locked_by_me() mm_locked_by_me(&shr_lock) + +#else + +/* We use an efficient per-page lock when AUDIT is not enabled. */ + +#endif /* MEM_SHARING_AUDIT */ /* Nested P2M lock (per-domain) * diff -r 61e164d6e956 -r f5e3c94b2066 xen/include/asm-x86/mm.h --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -337,6 +337,29 @@ int is_iomem_page(unsigned long mfn); void clear_superpage_mark(struct page_info *page); +/* Per page locks: + * page_lock() is used for two purposes: pte serialization, and memory sharing. + * + * All users of page lock for pte serialization live in mm.c, use it + * to lock a page table page during pte updates, do not take other locks within + * the critical section delimited by page_lock/unlock, and perform no + * nesting. + * + * All users of page lock for memory sharing live in mm/mem_sharing.c. Page_lock + * is used in memory sharing to protect addition (share) and removal (unshare) + * of (gfn,domain) tupples to a list of gfn''s that the shared page is currently + * backing. 
Nesting may happen when sharing (and locking) two pages -- deadlock + * is avoided by locking pages in increasing order. + * Memory sharing may take the p2m_lock within a page_lock/unlock + * critical section. + * + * These two users (pte serialization and memory sharing) do not collide, since + * sharing is only supported for hvm guests, which do not perform pv pte updates. + * + */ +int page_lock(struct page_info *page); +void page_unlock(struct page_info *page); + struct domain *page_get_owner_and_reference(struct page_info *page); void put_page(struct page_info *page); int get_page(struct page_info *page, struct domain *domain); @@ -588,10 +611,6 @@ int steal_page( struct domain *d, struct page_info *page, unsigned int memflags); int donate_page( struct domain *d, struct page_info *page, unsigned int memflags); -int page_make_sharable(struct domain *d, - struct page_info *page, - int expected_refcnt); -int page_make_private(struct domain *d, struct page_info *page); int map_ldt_shadow_page(unsigned int);
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 04 of 13] x86/mm: Enforce lock ordering for sharing page locks
xen/arch/x86/mm/mem_sharing.c | 16 ++++++++++++++++ xen/arch/x86/mm/mm-locks.h | 18 +++++++++++++++++- xen/include/asm-x86/mm.h | 3 ++- 3 files changed, 35 insertions(+), 2 deletions(-) Use the ordering constructs in mm-locks.h to enforce an order for the p2m and page locks in the sharing code. Applies to either the global sharing lock (in audit mode) or the per page locks. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> diff -r f5e3c94b2066 -r e66fe1c0b893 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -37,6 +37,13 @@ static shr_handle_t next_handle = 1; +typedef struct pg_lock_data { + int mm_unlock_level; + unsigned short recurse_count; +} pg_lock_data_t; + +DEFINE_PER_CPU(pg_lock_data_t, __pld); + #if MEM_SHARING_AUDIT static mm_lock_t shr_lock; @@ -85,16 +92,25 @@ static inline int mem_sharing_page_lock( static inline int mem_sharing_page_lock(struct page_info *pg) { int rc; + pg_lock_data_t *pld = &(this_cpu(__pld)); + + page_sharing_mm_pre_lock(); rc = page_lock(pg); if ( rc ) { preempt_disable(); + page_sharing_mm_post_lock(&pld->mm_unlock_level, + &pld->recurse_count); } return rc; } static inline void mem_sharing_page_unlock(struct page_info *pg) { + pg_lock_data_t *pld = &(this_cpu(__pld)); + + page_sharing_mm_unlock(pld->mm_unlock_level, + &pld->recurse_count); preempt_enable(); page_unlock(pg); } diff -r f5e3c94b2066 -r e66fe1c0b893 xen/arch/x86/mm/mm-locks.h --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -156,7 +156,23 @@ declare_mm_lock(shr) #else -/* We use an efficient per-page lock when AUDIT is not enabled. */ +/* Sharing per page lock + * + * This is an external lock, not represented by an mm_lock_t. The memory + * sharing lock uses it to protect addition and removal of (gfn,domain) + * tuples to a shared page.
We enforce order here against the p2m lock, + * which is taken after the page_lock to change the gfn''s p2m entry. + * + * Note that in sharing audit mode, we use the global page lock above, + * instead. + * + * The lock is recursive because during share we lock two pages. */ + +declare_mm_order_constraint(per_page_sharing) +#define page_sharing_mm_pre_lock() mm_enforce_order_lock_pre_per_page_sharing() +#define page_sharing_mm_post_lock(l, r) \ + mm_enforce_order_lock_post_per_page_sharing((l), (r)) +#define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r)) #endif /* MEM_SHARING_AUDIT */ diff -r f5e3c94b2066 -r e66fe1c0b893 xen/include/asm-x86/mm.h --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -351,7 +351,8 @@ void clear_superpage_mark(struct page_in * backing. Nesting may happen when sharing (and locking) two pages -- deadlock * is avoided by locking pages in increasing order. * Memory sharing may take the p2m_lock within a page_lock/unlock - * critical section. + * critical section. We enforce ordering between page_lock and p2m_lock using an + * mm-locks.h construct. * * These two users (pte serialization and memory sharing) do not collide, since * sharing is only supported for hvm guests, which do not perform pv pte updates.
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 05 of 13] x86/mm: Check how many mfns are shared, in addition to how many are saved
xen/arch/x86/mm.c | 6 ------ xen/arch/x86/mm/mem_sharing.c | 25 +++++++++++++++++++++++++ xen/arch/x86/x86_64/compat/mm.c | 6 ++++++ xen/arch/x86/x86_64/mm.c | 7 +++++++ xen/include/asm-x86/mem_sharing.h | 1 + xen/include/public/memory.h | 1 + 6 files changed, 40 insertions(+), 6 deletions(-) This patch also moves the existing sharing-related memory op to the correct location, and adds logic to the audit() method that uses the new information. This patch only provides the Xen implementation of the domctls. Signed-off-by: Andres Lagar-Cavilla <andres@scannell.ca> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> diff -r e66fe1c0b893 -r 4a2a0b494960 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -119,7 +119,6 @@ #include <xen/trace.h> #include <asm/setup.h> #include <asm/fixmap.h> -#include <asm/mem_sharing.h> /* * Mapping of first 2 or 4 megabytes of memory. This is mapped with 4kB @@ -5024,11 +5023,6 @@ long arch_memory_op(int op, XEN_GUEST_HA return rc; } -#ifdef __x86_64__ - case XENMEM_get_sharing_freed_pages: - return mem_sharing_get_nr_saved_mfns(); -#endif - default: return subarch_memory_op(op, arg); } diff -r e66fe1c0b893 -r 4a2a0b494960 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -138,6 +138,7 @@ static inline shr_handle_t get_next_hand #define page_to_mfn(_pg) _mfn(__page_to_mfn(_pg)) static atomic_t nr_saved_mfns = ATOMIC_INIT(0); +static atomic_t nr_shared_mfns = ATOMIC_INIT(0); typedef struct gfn_info { @@ -208,9 +209,12 @@ static struct page_info* mem_sharing_loo static void mem_sharing_audit(void) { int errors = 0; + unsigned long count_expected; + unsigned long count_found = 0; struct list_head *ae; ASSERT(shr_locked_by_me()); + count_expected = atomic_read(&nr_shared_mfns); list_for_each(ae, &shr_audit_list) { @@ -258,6 +262,9 @@ static void mem_sharing_audit(void) continue; } + /* We''ve found a page that is shared */ + 
count_found++; + /* Check if all GFNs map to the MFN, and the p2m types */ list_for_each(le, &pg->shared_info->gfns) { @@ -302,6 +309,14 @@ static void mem_sharing_audit(void) errors++; } } + + if ( count_found != count_expected ) + { + MEM_SHARING_DEBUG("Expected %ld shared mfns, found %ld.", + count_expected, count_found); + errors++; + } + } #endif @@ -342,6 +357,11 @@ unsigned int mem_sharing_get_nr_saved_mf return ((unsigned int)atomic_read(&nr_saved_mfns)); } +unsigned int mem_sharing_get_nr_shared_mfns(void) +{ + return (unsigned int)atomic_read(&nr_shared_mfns); +} + int mem_sharing_sharing_resume(struct domain *d) { mem_event_response_t rsp; @@ -663,6 +683,9 @@ int mem_sharing_nominate_page(struct dom goto out; } + /* Account for this page. */ + atomic_inc(&nr_shared_mfns); + /* Update m2p entry to SHARED_M2P_ENTRY */ set_gpfn_from_mfn(mfn_x(mfn), SHARED_M2P_ENTRY); @@ -786,6 +809,7 @@ int mem_sharing_share_pages(struct domai put_page(cpage); /* We managed to free a domain page. */ + atomic_dec(&nr_shared_mfns); atomic_inc(&nr_saved_mfns); ret = 0; @@ -851,6 +875,7 @@ gfn_found: audit_del_list(page); xfree(page->shared_info); page->shared_info = NULL; + atomic_dec(&nr_shared_mfns); } else atomic_dec(&nr_saved_mfns); diff -r e66fe1c0b893 -r 4a2a0b494960 xen/arch/x86/x86_64/compat/mm.c --- a/xen/arch/x86/x86_64/compat/mm.c +++ b/xen/arch/x86/x86_64/compat/mm.c @@ -205,6 +205,12 @@ int compat_arch_memory_op(int op, XEN_GU break; } + case XENMEM_get_sharing_freed_pages: + return mem_sharing_get_nr_saved_mfns(); + + case XENMEM_get_sharing_shared_pages: + return mem_sharing_get_nr_shared_mfns(); + default: rc = -ENOSYS; break; diff -r e66fe1c0b893 -r 4a2a0b494960 xen/arch/x86/x86_64/mm.c --- a/xen/arch/x86/x86_64/mm.c +++ b/xen/arch/x86/x86_64/mm.c @@ -34,6 +34,7 @@ #include <asm/msr.h> #include <asm/setup.h> #include <asm/numa.h> +#include <asm/mem_sharing.h> #include <public/memory.h> /* Parameters for PFN/MADDR compression. 
*/ @@ -1093,6 +1094,12 @@ long subarch_memory_op(int op, XEN_GUEST break; + case XENMEM_get_sharing_freed_pages: + return mem_sharing_get_nr_saved_mfns(); + + case XENMEM_get_sharing_shared_pages: + return mem_sharing_get_nr_shared_mfns(); + default: rc = -ENOSYS; break; diff -r e66fe1c0b893 -r 4a2a0b494960 xen/include/asm-x86/mem_sharing.h --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -45,6 +45,7 @@ struct page_sharing_info (is_hvm_domain(_d) && paging_mode_hap(_d)) unsigned int mem_sharing_get_nr_saved_mfns(void); +unsigned int mem_sharing_get_nr_shared_mfns(void); int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, int expected_refcnt, diff -r e66fe1c0b893 -r 4a2a0b494960 xen/include/public/memory.h --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -294,6 +294,7 @@ typedef struct xen_pod_target xen_pod_ta * The call never fails. */ #define XENMEM_get_sharing_freed_pages 18 +#define XENMEM_get_sharing_shared_pages 19 #endif /* __XEN_PUBLIC_MEMORY_H__ */
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 06 of 13] x86/mm: New domctl: add a shared page to the physmap
xen/arch/x86/mm/mem_sharing.c | 103 ++++++++++++++++++++++++++++++++++++++++++ xen/include/public/domctl.h | 3 +- 2 files changed, 105 insertions(+), 1 deletions(-) This domctl is useful to, for example, populate parts of a domain''s physmap with shared frames, directly. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> diff -r 4a2a0b494960 -r c87eb9e63483 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -820,6 +820,73 @@ err_out: return ret; } +int mem_sharing_add_to_physmap(struct domain *sd, unsigned long sgfn, shr_handle_t sh, + struct domain *cd, unsigned long cgfn) +{ + struct page_info *spage; + int ret = -EINVAL; + mfn_t smfn, cmfn; + p2m_type_t smfn_type, cmfn_type; + struct gfn_info *gfn_info; + struct p2m_domain *p2m = p2m_get_hostp2m(cd); + p2m_access_t a; + + /* XXX if sd == cd handle potential deadlock by ordering + * the get_ and put_gfn''s */ + smfn = get_gfn_query(sd, sgfn, &smfn_type); + cmfn = get_gfn_type_access(p2m, cgfn, &cmfn_type, &a, p2m_query, NULL); + + /* Get the source shared page, check and lock */ + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + spage = __grab_shared_page(smfn); + if ( spage == NULL ) + goto err_out; + ASSERT(smfn_type == p2m_ram_shared); + + /* Check that the handles match */ + if ( spage->shared_info->handle != sh ) + goto err_unlock; + + /* Make sure the target page is a hole in the physmap */ + if ( mfn_valid(cmfn) || + (!(p2m_is_ram(cmfn_type))) ) + { + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + goto err_unlock; + } + + /* This is simpler than regular sharing */ + BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page)); + if ( (gfn_info = mem_sharing_gfn_alloc(spage, cd, cgfn)) == NULL ) + { + put_page_and_type(spage); + ret = -ENOMEM; + goto err_unlock; + } + + p2m_lock(p2m); + ret = set_p2m_entry(p2m, cgfn, smfn, PAGE_ORDER_4K, p2m_ram_shared, a); + p2m_unlock(p2m); + + /* Tempted to 
turn this into an assert */ + if ( !ret ) + { + ret = -ENOENT; + mem_sharing_gfn_destroy(cd, gfn_info); + put_page_and_type(spage); + } else + ret = 0; + + atomic_inc(&nr_saved_mfns); + +err_unlock: + mem_sharing_page_unlock(spage); +err_out: + put_gfn(cd, cgfn); + put_gfn(sd, sgfn); + return ret; +} + int mem_sharing_unshare_page(struct domain *d, unsigned long gfn, uint16_t flags) @@ -1042,6 +1109,42 @@ int mem_sharing_domctl(struct domain *d, } break; + case XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP: + { + unsigned long sgfn, cgfn; + struct domain *cd; + shr_handle_t sh; + + if ( !mem_sharing_enabled(d) ) + return -EINVAL; + + cd = get_domain_by_id(mec->u.share.client_domain); + if ( !cd ) + return -ESRCH; + + if ( !mem_sharing_enabled(cd) ) + { + put_domain(cd); + return -EINVAL; + } + + if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.source_gfn) ) + { + /* Cannot add a gref to the physmap */ + put_domain(cd); + return -EINVAL; + } + + sgfn = mec->u.share.source_gfn; + sh = mec->u.share.source_handle; + cgfn = mec->u.share.client_gfn; + + rc = mem_sharing_add_to_physmap(d, sgfn, sh, cd, cgfn); + + put_domain(cd); + } + break; + case XEN_DOMCTL_MEM_EVENT_OP_SHARING_RESUME: { if ( !mem_sharing_enabled(d) ) diff -r 4a2a0b494960 -r c87eb9e63483 xen/include/public/domctl.h --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -771,6 +771,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_e #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_GFN 5 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_MFN 6 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_GREF 7 +#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP 8 #define XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID (-10) #define XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID (-9) @@ -801,7 +802,7 @@ struct xen_domctl_mem_sharing_op { } u; uint64_aligned_t handle; /* OUT: the handle */ } nominate; - struct mem_sharing_op_share { /* OP_SHARE */ + struct mem_sharing_op_share { /* OP_SHARE/ADD_PHYSMAP */ uint64_aligned_t 
source_gfn; /* IN: the gfn of the source page */ uint64_aligned_t source_handle; /* IN: handle to the source page */ domid_t client_domain; /* IN: the client domain id */
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 07 of 13] Add the ability to poll stats about shared memory via the console
xen/arch/ia64/xen/mm.c | 5 +++++ xen/arch/x86/mm.c | 14 ++++++++++++++ xen/common/keyhandler.c | 7 +++++-- xen/include/xen/mm.h | 3 +++ 4 files changed, 27 insertions(+), 2 deletions(-) Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> diff -r c87eb9e63483 -r a45fb86e2419 xen/arch/ia64/xen/mm.c --- a/xen/arch/ia64/xen/mm.c +++ b/xen/arch/ia64/xen/mm.c @@ -3574,6 +3574,11 @@ p2m_pod_decrease_reservation(struct doma return 0; } +/* Simple no-op */ +void arch_dump_shared_mem_info(void) +{ +} + /* * Local variables: * mode: C diff -r c87eb9e63483 -r a45fb86e2419 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -113,6 +113,7 @@ #include <asm/e820.h> #include <asm/hypercall.h> #include <asm/shared.h> +#include <asm/mem_sharing.h> #include <public/memory.h> #include <public/sched.h> #include <xsm/xsm.h> @@ -5832,6 +5833,19 @@ void memguard_unguard_stack(void *p) memguard_unguard_range(p, PAGE_SIZE); } +#if defined(__x86_64__) +void arch_dump_shared_mem_info(void) +{ + printk("Shared frames %u -- Saved frames %u\n", + mem_sharing_get_nr_shared_mfns(), + mem_sharing_get_nr_saved_mfns()); +} +#else +void arch_dump_shared_mem_info(void) +{ +} +#endif + /* * Local variables: * mode: C diff -r c87eb9e63483 -r a45fb86e2419 xen/common/keyhandler.c --- a/xen/common/keyhandler.c +++ b/xen/common/keyhandler.c @@ -15,6 +15,7 @@ #include <xen/compat.h> #include <xen/ctype.h> #include <xen/perfc.h> +#include <xen/mm.h> #include <xen/init.h> #include <asm/debugger.h> #include <asm/div64.h> @@ -249,8 +250,8 @@ static void dump_domains(unsigned char k printk(" refcnt=%d dying=%d pause_count=%d\n", atomic_read(&d->refcnt), d->is_dying, atomic_read(&d->pause_count)); - printk(" nr_pages=%d xenheap_pages=%d dirty_cpus=%s max_pages=%u\n", - d->tot_pages, d->xenheap_pages, tmpstr, d->max_pages); + printk(" nr_pages=%d xenheap_pages=%d shared_pages=%u dirty_cpus=%s max_pages=%u\n", 
+ d->tot_pages, d->xenheap_pages, atomic_read(&d->shr_pages), tmpstr, d->max_pages); printk(" handle=%02x%02x%02x%02x-%02x%02x-%02x%02x-" "%02x%02x-%02x%02x%02x%02x%02x%02x vm_assist=%08lx\n", d->handle[ 0], d->handle[ 1], d->handle[ 2], d->handle[ 3], @@ -309,6 +310,8 @@ static void dump_domains(unsigned char k } } + arch_dump_shared_mem_info(); + rcu_read_unlock(&domlist_read_lock); #undef tmpstr } diff -r c87eb9e63483 -r a45fb86e2419 xen/include/xen/mm.h --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -72,6 +72,9 @@ int assign_pages( unsigned int order, unsigned int memflags); +/* Dump info to serial console */ +void arch_dump_shared_mem_info(void); + /* memflags: */ #define _MEMF_no_refcount 0 #define MEMF_no_refcount (1U<<_MEMF_no_refcount)
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 08 of 13] x86/mm: use RCU in mem sharing audit list, eliminate global lock completely
xen/arch/x86/mm/mem_sharing.c | 93 ++++++++++++++++++-------------------- xen/arch/x86/mm/mm-locks.h | 18 ------- xen/include/asm-x86/mem_sharing.h | 1 + 3 files changed, 44 insertions(+), 68 deletions(-) Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r a45fb86e2419 -r cf70bc85eb23 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -32,6 +32,7 @@ #include <asm/p2m.h> #include <asm/mem_event.h> #include <asm/atomic.h> +#include <xen/rcupdate.h> #include "mm-locks.h" @@ -46,48 +47,49 @@ DEFINE_PER_CPU(pg_lock_data_t, __pld); #if MEM_SHARING_AUDIT -static mm_lock_t shr_lock; - -#define shr_lock() _shr_lock() -#define shr_unlock() _shr_unlock() -#define shr_locked_by_me() _shr_locked_by_me() - static void mem_sharing_audit(void); #define MEM_SHARING_DEBUG(_f, _a...) \ debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a) static struct list_head shr_audit_list; +static spinlock_t shr_audit_lock; +DEFINE_RCU_READ_LOCK(shr_audit_read_lock); + +/* RCU delayed free of audit list entry */ +static void _free_pg_shared_info(struct rcu_head *head) +{ + xfree(container_of(head, struct page_sharing_info, rcu_head)); +} static inline void audit_add_list(struct page_info *page) { INIT_LIST_HEAD(&page->shared_info->entry); - list_add(&page->shared_info->entry, &shr_audit_list); + spin_lock(&shr_audit_lock); + list_add_rcu(&page->shared_info->entry, &shr_audit_list); + spin_unlock(&shr_audit_lock); } static inline void audit_del_list(struct page_info *page) { - list_del(&page->shared_info->entry); + spin_lock(&shr_audit_lock); + list_del_rcu(&page->shared_info->entry); + spin_unlock(&shr_audit_lock); + INIT_RCU_HEAD(&page->shared_info->rcu_head); + call_rcu(&page->shared_info->rcu_head, _free_pg_shared_info); } -static inline int mem_sharing_page_lock(struct page_info *p) -{ - return 1; -} -#define mem_sharing_page_unlock(p) ((void)0) - -#define get_next_handle() next_handle++; #else -#define 
shr_lock() ((void)0) -#define shr_unlock() ((void)0) -/* Only used inside audit code */ -//#define shr_locked_by_me() ((void)0) - #define mem_sharing_audit() ((void)0) #define audit_add_list(p) ((void)0) -#define audit_del_list(p) ((void)0) +static inline void audit_del_list(struct page_info *page) +{ + xfree(page->shared_info); +} + +#endif /* MEM_SHARING_AUDIT */ static inline int mem_sharing_page_lock(struct page_info *pg) { @@ -125,7 +127,6 @@ static inline shr_handle_t get_next_hand while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); return x + 1; } -#endif /* MEM_SHARING_AUDIT */ #define mem_sharing_enabled(d) \ (is_hvm_domain(d) && (d)->arch.hvm_domain.mem_sharing_enabled) @@ -213,10 +214,11 @@ static void mem_sharing_audit(void) unsigned long count_found = 0; struct list_head *ae; - ASSERT(shr_locked_by_me()); count_expected = atomic_read(&nr_shared_mfns); - list_for_each(ae, &shr_audit_list) + rcu_read_lock(&shr_audit_read_lock); + + list_for_each_rcu(ae, &shr_audit_list) { struct page_sharing_info *shared_info; unsigned long nr_gfns = 0; @@ -228,6 +230,15 @@ static void mem_sharing_audit(void) pg = shared_info->pg; mfn = page_to_mfn(pg); + /* If we can''t lock it, it''s definitely not a shared page */ + if ( !mem_sharing_page_lock(pg) ) + { + MEM_SHARING_DEBUG("mfn %lx in audit list, but cannot be locked (%lx)!\n", + mfn_x(mfn), pg->u.inuse.type_info); + errors++; + continue; + } + /* Check if the MFN has correct type, owner and handle. */ if ( !(pg->u.inuse.type_info & PGT_shared_page) ) { @@ -300,7 +311,8 @@ static void mem_sharing_audit(void) put_domain(d); nr_gfns++; } - if ( nr_gfns != (pg->u.inuse.type_info & PGT_count_mask) ) + /* The type count has an extra ref because we have locked the page */ + if ( (nr_gfns + 1) != (pg->u.inuse.type_info & PGT_count_mask) ) { MEM_SHARING_DEBUG("Mismatched counts for MFN=%lx." 
"nr_gfns in list %lu, in type_info %lx\n", @@ -308,8 +320,12 @@ static void mem_sharing_audit(void) (pg->u.inuse.type_info & PGT_count_mask)); errors++; } + + mem_sharing_page_unlock(pg); } + rcu_read_unlock(&shr_audit_read_lock); + if ( count_found != count_expected ) { MEM_SHARING_DEBUG("Expected %ld shared mfns, found %ld.", @@ -533,14 +549,10 @@ static int page_make_private(struct doma spin_lock(&d->page_alloc_lock); /* We can only change the type if count is one */ - /* If we are locking pages individually, then we need to drop + /* Because we are locking pages individually, we need to drop * the lock here, while the page is typed. We cannot risk the * race of page_unlock and then put_page_type. */ -#if MEM_SHARING_AUDIT - expected_type = (PGT_shared_page | PGT_validated | 1); -#else expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2); -#endif if ( page->u.inuse.type_info != expected_type ) { put_page(page); @@ -551,10 +563,8 @@ static int page_make_private(struct doma /* Drop the final typecount */ put_page_and_type(page); -#ifndef MEM_SHARING_AUDIT /* Now that we''ve dropped the type, we can unlock */ mem_sharing_page_unlock(page); -#endif /* Change the owner */ ASSERT(page_get_owner(page) == dom_cow); @@ -604,7 +614,6 @@ int mem_sharing_nominate_page(struct dom *phandle = 0UL; - shr_lock(); mfn = get_gfn(d, gfn, &p2mt); /* Check if mfn is valid */ @@ -696,7 +705,6 @@ int mem_sharing_nominate_page(struct dom out: put_gfn(d, gfn); - shr_unlock(); return ret; } @@ -711,8 +719,6 @@ int mem_sharing_share_pages(struct domai mfn_t smfn, cmfn; p2m_type_t smfn_type, cmfn_type; - shr_lock(); - /* XXX if sd == cd handle potential deadlock by ordering * the get_ and put_gfn''s */ smfn = get_gfn(sd, sgfn, &smfn_type); @@ -798,7 +804,6 @@ int mem_sharing_share_pages(struct domai /* Clear the rest of the shared state */ audit_del_list(cpage); - xfree(cpage->shared_info); cpage->shared_info = NULL; mem_sharing_page_unlock(secondpg); @@ -816,7 +821,6 @@ int 
mem_sharing_share_pages(struct domai err_out: put_gfn(cd, cgfn); put_gfn(sd, sgfn); - shr_unlock(); return ret; } @@ -899,17 +903,12 @@ int mem_sharing_unshare_page(struct doma gfn_info_t *gfn_info = NULL; struct list_head *le; - /* This is one of the reasons why we can''t enforce ordering - * between shr_lock and p2m fine-grained locks in mm-lock. - * Callers may walk in here already holding the lock for this gfn */ - shr_lock(); mem_sharing_audit(); mfn = get_gfn(d, gfn, &p2mt); /* Has someone already unshared it? */ if ( !p2m_is_shared(p2mt) ) { put_gfn(d, gfn); - shr_unlock(); return 0; } @@ -940,7 +939,6 @@ gfn_found: { /* Clean up shared state */ audit_del_list(page); - xfree(page->shared_info); page->shared_info = NULL; atomic_dec(&nr_shared_mfns); } @@ -956,7 +954,6 @@ gfn_found: test_and_clear_bit(_PGC_allocated, &page->count_info) ) put_page(page); put_gfn(d, gfn); - shr_unlock(); return 0; } @@ -975,7 +972,6 @@ gfn_found: mem_sharing_page_unlock(old_page); put_gfn(d, gfn); mem_sharing_notify_helper(d, gfn); - shr_unlock(); return -ENOMEM; } @@ -1006,7 +1002,6 @@ private_page_found: paging_mark_dirty(d, mfn_x(page_to_mfn(page))); /* We do not need to unlock a private page */ put_gfn(d, gfn); - shr_unlock(); return 0; } @@ -1179,9 +1174,7 @@ int mem_sharing_domctl(struct domain *d, break; } - shr_lock(); mem_sharing_audit(); - shr_unlock(); return rc; } @@ -1190,7 +1183,7 @@ void __init mem_sharing_init(void) { printk("Initing memory sharing.\n"); #if MEM_SHARING_AUDIT - mm_lock_init(&shr_lock); + spin_lock_init(&shr_audit_lock); INIT_LIST_HEAD(&shr_audit_list); #endif } diff -r a45fb86e2419 -r cf70bc85eb23 xen/arch/x86/mm/mm-locks.h --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -143,19 +143,6 @@ static inline void mm_enforce_order_unlo * * ************************************************************************/ -#if MEM_SHARING_AUDIT -/* Page-sharing lock (global) - * - * A single global lock that protects the memory-sharing 
code''s - * hash tables. */ - -declare_mm_lock(shr) -#define _shr_lock() mm_lock(shr, &shr_lock) -#define _shr_unlock() mm_unlock(&shr_lock) -#define _shr_locked_by_me() mm_locked_by_me(&shr_lock) - -#else - /* Sharing per page lock * * This is an external lock, not represented by an mm_lock_t. The memory @@ -163,9 +150,6 @@ declare_mm_lock(shr) * tuples to a shared page. We enforce order here against the p2m lock, * which is taken after the page_lock to change the gfn''s p2m entry. * - * Note that in sharing audit mode, we use the global page lock above, - * instead. - * * The lock is recursive because during share we lock two pages. */ declare_mm_order_constraint(per_page_sharing) @@ -174,8 +158,6 @@ declare_mm_order_constraint(per_page_sha mm_enforce_order_lock_post_per_page_sharing((l), (r)) #define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r)) -#endif /* MEM_SHARING_AUDIT */ - /* Nested P2M lock (per-domain) * * A per-domain lock that protects the mapping from nested-CR3 to diff -r a45fb86e2419 -r cf70bc85eb23 xen/include/asm-x86/mem_sharing.h --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -35,6 +35,7 @@ struct page_sharing_info shr_handle_t handle; /* Globally unique version / handle. */ #if MEM_SHARING_AUDIT struct list_head entry; /* List of all shared pages (entry). */ + struct rcu_head rcu_head; /* List of all shared pages (entry). */ #endif struct list_head gfns; /* List of domains and gfns for this page (head). */ };
tools/blktap2/drivers/tapdisk-vbd.c | 6 +- tools/blktap2/drivers/tapdisk.h | 2 +- tools/libxc/xc_memshr.c | 63 ++++++++++++++++++++++++++++++++++-- tools/libxc/xenctrl.h | 19 ++++++++++- tools/memshr/bidir-daemon.c | 34 ++++++++++++++----- tools/memshr/bidir-hash.h | 12 ++++-- tools/memshr/interface.c | 30 ++++++++++------- tools/memshr/memshr.h | 11 +++++- 8 files changed, 139 insertions(+), 38 deletions(-) This patch is the folded version of the API updates, along with the associated tool changes, to ensure that the build is always consistent. API updates: - The source domain in the sharing calls is no longer assumed to be dom0. - Previously, the mem sharing code would return an opaque handle to index shared pages (and nominees) in its global hash table. With the hash table removed, the handle becomes a version number, used to avoid sharing a stale version of a page. Thus, libxc wrappers and tools need to be updated to call the share functions again with the information needed to fetch the page (which they readily have). Tool updates: The only in-tree consumer of the mem sharing API (that we know of) is the memshr tool. It is updated to use the new API.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Ian Campbell <ian.campbell@citrix.com> diff -r cf70bc85eb23 -r f859c29d4011 tools/blktap2/drivers/tapdisk-vbd.c --- a/tools/blktap2/drivers/tapdisk-vbd.c +++ b/tools/blktap2/drivers/tapdisk-vbd.c @@ -1218,14 +1218,14 @@ __tapdisk_vbd_complete_td_request(td_vbd #ifdef MEMSHR if (treq.op == TD_OP_READ && td_flag_test(image->flags, TD_OPEN_RDONLY)) { - uint64_t hnd = treq.memshr_hnd; + share_tuple_t hnd = treq.memshr_hnd; uint16_t uid = image->memshr_id; blkif_request_t *breq = &vreq->req; uint64_t sec = tapdisk_vbd_breq_get_sector(breq, treq); int secs = breq->seg[treq.sidx].last_sect - breq->seg[treq.sidx].first_sect + 1; - if (hnd != 0) + if (hnd.handle != 0) memshr_vbd_complete_ro_request(hnd, uid, sec, secs); } @@ -1297,7 +1297,7 @@ __tapdisk_vbd_reissue_td_request(td_vbd_ /* Reset memshr handle. This''ll prevent * memshr_vbd_complete_ro_request being called */ - treq.memshr_hnd = 0; + treq.memshr_hnd.handle = 0; td_complete_request(treq, 0); } else td_queue_read(parent, treq); diff -r cf70bc85eb23 -r f859c29d4011 tools/blktap2/drivers/tapdisk.h --- a/tools/blktap2/drivers/tapdisk.h +++ b/tools/blktap2/drivers/tapdisk.h @@ -140,7 +140,7 @@ struct td_request { void *private; #ifdef MEMSHR - uint64_t memshr_hnd; + share_tuple_t memshr_hnd; #endif }; diff -r cf70bc85eb23 -r f859c29d4011 tools/libxc/xc_memshr.c --- a/tools/libxc/xc_memshr.c +++ b/tools/libxc/xc_memshr.c @@ -86,24 +86,79 @@ int xc_memshr_nominate_gref(xc_interface return ret; } -int xc_memshr_share(xc_interface *xch, - uint64_t source_handle, - uint64_t client_handle) +int xc_memshr_share_gfns(xc_interface *xch, + domid_t source_domain, + unsigned long source_gfn, + uint64_t source_handle, + domid_t client_domain, + unsigned long client_gfn, + uint64_t client_handle) { DECLARE_DOMCTL; struct xen_domctl_mem_sharing_op *op; domctl.cmd = XEN_DOMCTL_mem_sharing_op; 
domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION; - domctl.domain = 0; + domctl.domain = source_domain; op = &(domctl.u.mem_sharing_op); op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE; op->u.share.source_handle = source_handle; + op->u.share.source_gfn = source_gfn; + op->u.share.client_domain = client_domain; + op->u.share.client_gfn = client_gfn; op->u.share.client_handle = client_handle; return do_domctl(xch, &domctl); } +int xc_memshr_share_grefs(xc_interface *xch, + domid_t source_domain, + grant_ref_t source_gref, + uint64_t source_handle, + domid_t client_domain, + grant_ref_t client_gref, + uint64_t client_handle) +{ + DECLARE_DOMCTL; + struct xen_domctl_mem_sharing_op *op; + + domctl.cmd = XEN_DOMCTL_mem_sharing_op; + domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION; + domctl.domain = source_domain; + op = &(domctl.u.mem_sharing_op); + op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE; + op->u.share.source_handle = source_handle; + XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(op->u.share.source_gfn, source_gref); + op->u.share.client_domain = client_domain; + XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(op->u.share.client_gfn, client_gref); + op->u.share.client_handle = client_handle; + + return do_domctl(xch, &domctl); +} + +int xc_memshr_add_to_physmap(xc_interface *xch, + domid_t source_domain, + unsigned long source_gfn, + uint64_t source_handle, + domid_t client_domain, + unsigned long client_gfn) +{ + DECLARE_DOMCTL; + struct xen_domctl_mem_sharing_op *op; + + domctl.cmd = XEN_DOMCTL_mem_sharing_op; + domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION; + domctl.domain = source_domain; + op = &(domctl.u.mem_sharing_op); + op->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP; + op->u.share.source_gfn = source_gfn; + op->u.share.source_handle = source_handle; + op->u.share.client_gfn = client_gfn; + op->u.share.client_domain = client_domain; + + return do_domctl(xch, &domctl); +} + int xc_memshr_domain_resume(xc_interface *xch, domid_t domid) { diff 
-r cf70bc85eb23 -r f859c29d4011 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h +++ b/tools/libxc/xenctrl.h @@ -1897,9 +1897,26 @@ int xc_memshr_nominate_gref(xc_interface domid_t domid, grant_ref_t gref, uint64_t *handle); -int xc_memshr_share(xc_interface *xch, +int xc_memshr_share_gfns(xc_interface *xch, + domid_t source_domain, + unsigned long source_gfn, uint64_t source_handle, + domid_t client_domain, + unsigned long client_gfn, uint64_t client_handle); +int xc_memshr_share_grefs(xc_interface *xch, + domid_t source_domain, + grant_ref_t source_gref, + uint64_t source_handle, + domid_t client_domain, + grant_ref_t client_gref, + uint64_t client_handle); +int xc_memshr_add_to_physmap(xc_interface *xch, + domid_t source_domain, + unsigned long source_gfn, + uint64_t source_handle, + domid_t client_domain, + unsigned long client_gfn); int xc_memshr_domain_resume(xc_interface *xch, domid_t domid); int xc_memshr_debug_gfn(xc_interface *xch, diff -r cf70bc85eb23 -r f859c29d4011 tools/memshr/bidir-daemon.c --- a/tools/memshr/bidir-daemon.c +++ b/tools/memshr/bidir-daemon.c @@ -19,16 +19,25 @@ #include <pthread.h> #include <inttypes.h> #include <unistd.h> +#include <errno.h> #include "bidir-hash.h" #include "memshr-priv.h" static struct blockshr_hash *blks_hash; +/* Callback in the iterator, remember this value, and leave */ +int find_one(vbdblk_t k, share_tuple_t v, void *priv) +{ + share_tuple_t *rv = (share_tuple_t *) priv; + *rv = v; + /* Break out of iterator loop */ + return 1; +} + void* bidir_daemon(void *unused) { uint32_t nr_ent, max_nr_ent, tab_size, max_load, min_load; - static uint64_t shrhnd = 1; while(1) { @@ -41,20 +50,30 @@ void* bidir_daemon(void *unused) /* Remove some hints as soon as we get to 90% capacity */ if(10 * nr_ent > 9 * max_nr_ent) { - uint64_t next_remove = shrhnd; + share_tuple_t next_remove; int to_remove; int ret; to_remove = 0.1 * max_nr_ent; while(to_remove > 0) { - ret = blockshr_shrhnd_remove(blks_hash, next_remove, NULL); - 
if(ret < 0) + /* We use the iterator to get one entry */ + next_remove.handle = 0; + ret = blockshr_hash_iterator(blks_hash, find_one, &next_remove); + + if ( !ret ) + if ( next_remove.handle == 0 ) + ret = -ESRCH; + + if ( !ret ) + ret = blockshr_shrhnd_remove(blks_hash, next_remove, NULL); + + if(ret <= 0) { /* We failed to remove an entry, because of a serious hash * table error */ DPRINTF("Could not remove handle %"PRId64", error: %d\n", - next_remove, ret); + next_remove.handle, ret); /* Force to exit the loop early */ to_remove = 0; } else @@ -62,12 +81,7 @@ void* bidir_daemon(void *unused) { /* Managed to remove the entry. Note next_remove not * incremented, in case there are duplicates */ - shrhnd = next_remove; to_remove--; - } else - { - /* Failed to remove, because there is no such handle */ - next_remove++; } } } diff -r cf70bc85eb23 -r f859c29d4011 tools/memshr/bidir-hash.h --- a/tools/memshr/bidir-hash.h +++ b/tools/memshr/bidir-hash.h @@ -20,6 +20,7 @@ #define __BIDIR_HASH_H__ #include <stdint.h> +#include <string.h> #include "memshr-priv.h" typedef struct vbdblk { @@ -81,15 +82,16 @@ static int fgprtshr_mfn_cmp(uint32_t m1, #undef BIDIR_VALUE #undef BIDIR_KEY_T #undef BIDIR_VALUE_T + /* TODO better hashes! 
*/ static inline uint32_t blockshr_block_hash(vbdblk_t block) { return (uint32_t)(block.sec) ^ (uint32_t)(block.disk_id); } -static inline uint32_t blockshr_shrhnd_hash(uint64_t shrhnd) +static inline uint32_t blockshr_shrhnd_hash(share_tuple_t shrhnd) { - return (uint32_t)shrhnd; + return ((uint32_t) shrhnd.handle); } static inline int blockshr_block_cmp(vbdblk_t b1, vbdblk_t b2) @@ -97,15 +99,15 @@ static inline int blockshr_block_cmp(vbd return (b1.sec == b2.sec) && (b1.disk_id == b2.disk_id); } -static inline int blockshr_shrhnd_cmp(uint64_t h1, uint64_t h2) +static inline int blockshr_shrhnd_cmp(share_tuple_t h1, share_tuple_t h2) { - return (h1 == h2); + return ( !memcmp(&h1, &h2, sizeof(share_tuple_t)) ); } #define BIDIR_NAME_PREFIX blockshr #define BIDIR_KEY block #define BIDIR_VALUE shrhnd #define BIDIR_KEY_T vbdblk_t -#define BIDIR_VALUE_T uint64_t +#define BIDIR_VALUE_T share_tuple_t #include "bidir-namedefs.h" #endif /* BLOCK_MAP */ diff -r cf70bc85eb23 -r f859c29d4011 tools/memshr/interface.c --- a/tools/memshr/interface.c +++ b/tools/memshr/interface.c @@ -145,16 +145,17 @@ void memshr_vbd_image_put(uint16_t memsh int memshr_vbd_issue_ro_request(char *buf, grant_ref_t gref, - uint16_t file_id, + uint16_t file_id, uint64_t sec, int secs, - uint64_t *hnd) + share_tuple_t *hnd) { vbdblk_t blk; - uint64_t s_hnd, c_hnd; + share_tuple_t source_st, client_st; + uint64_t c_hnd; int ret; - *hnd = 0; + *hnd = (share_tuple_t){ 0, 0, 0 }; if(!vbd_info.enabled) return -1; @@ -169,26 +170,31 @@ int memshr_vbd_issue_ro_request(char *bu /* If page couldn''t be made sharable, we cannot do anything about it */ if(ret != 0) return -3; - *hnd = c_hnd; + + client_st = (share_tuple_t){ vbd_info.domid, gref, c_hnd }; + *hnd = client_st; /* Check if we''ve read matching disk block previously */ blk.sec = sec; blk.disk_id = file_id; - if(blockshr_block_lookup(memshr.blks, blk, &s_hnd) > 0) + if(blockshr_block_lookup(memshr.blks, blk, &source_st) > 0) { - ret = 
xc_memshr_share(vbd_info.xc_handle, s_hnd, c_hnd); + ret = xc_memshr_share_grefs(vbd_info.xc_handle, source_st.domain, source_st.frame, + source_st.handle, vbd_info.domid, gref, c_hnd); if(!ret) return 0; /* Handles failed to be shared => at least one of them must be invalid, remove the relevant ones from the map */ switch(ret) { case XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID: - ret = blockshr_shrhnd_remove(memshr.blks, s_hnd, NULL); - if(ret) DPRINTF("Could not rm invl s_hnd: %"PRId64"\n", s_hnd); + ret = blockshr_shrhnd_remove(memshr.blks, source_st, NULL); + if(ret) DPRINTF("Could not rm invl s_hnd: %u %"PRId64" %"PRId64"\n", + source_st.domain, source_st.frame, source_st.handle); break; case XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID: - ret = blockshr_shrhnd_remove(memshr.blks, c_hnd, NULL); - if(ret) DPRINTF("Could not rm invl c_hnd: %"PRId64"\n", c_hnd); + ret = blockshr_shrhnd_remove(memshr.blks, client_st, NULL); + if(ret) DPRINTF("Could not rm invl c_hnd: %u %"PRId64" %"PRId64"\n", + client_st.domain, client_st.frame, client_st.handle); break; default: break; @@ -199,7 +205,7 @@ int memshr_vbd_issue_ro_request(char *bu return -4; } -void memshr_vbd_complete_ro_request(uint64_t hnd, +void memshr_vbd_complete_ro_request(share_tuple_t hnd, uint16_t file_id, uint64_t sec, int secs) diff -r cf70bc85eb23 -r f859c29d4011 tools/memshr/memshr.h --- a/tools/memshr/memshr.h +++ b/tools/memshr/memshr.h @@ -25,6 +25,13 @@ typedef uint64_t xen_mfn_t; +typedef struct share_tuple +{ + uint32_t domain; + uint64_t frame; + uint64_t handle; +} share_tuple_t; + extern void memshr_set_domid(int domid); extern void memshr_daemon_initialize(void); extern void memshr_vbd_initialize(void); @@ -35,9 +42,9 @@ extern int memshr_vbd_issue_ro_request(c uint16_t file_id, uint64_t sec, int secs, - uint64_t *hnd); + share_tuple_t *hnd); extern void memshr_vbd_complete_ro_request( - uint64_t hnd, + share_tuple_t hnd, uint16_t file_id, uint64_t sec, int secs);
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 10 of 13] Tools: Expose to libxc the total number of shared frames and space saved
tools/libxc/xc_memshr.c | 10 ++++++++++ tools/libxc/xenctrl.h | 17 +++++++++++++++++ 2 files changed, 27 insertions(+), 0 deletions(-) Signed-off-by: Adin Scannell <adin@scannell.ca> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r f859c29d4011 -r 9116fc441ca4 tools/libxc/xc_memshr.c --- a/tools/libxc/xc_memshr.c +++ b/tools/libxc/xc_memshr.c @@ -225,3 +225,13 @@ int xc_memshr_debug_gref(xc_interface *x return do_domctl(xch, &domctl); } +long xc_sharing_freed_pages(xc_interface *xch) +{ + return do_memory_op(xch, XENMEM_get_sharing_freed_pages, NULL, 0); +} + +long xc_sharing_used_frames(xc_interface *xch) +{ + return do_memory_op(xch, XENMEM_get_sharing_shared_pages, NULL, 0); +} + diff -r f859c29d4011 -r 9116fc441ca4 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h +++ b/tools/libxc/xenctrl.h @@ -1233,6 +1233,23 @@ int xc_mmuext_op(xc_interface *xch, stru /* System wide memory properties */ long xc_maximum_ram_page(xc_interface *xch); +/** + * This function returns the total number of pages freed by using sharing + * on the system. For example, if two domains contain a single entry in + * their p2m map that points to the same shared page (and no other pages + * in the system are shared), then this function should return 1. + */ +long xc_sharing_freed_pages(xc_interface *xch); + +/** + * This function returns the total number of frames occupied by shared + * pages on the system. This is independent of the number of domains + * pointing at these frames. For example, in the above scenario this + * should return 1. The following should hold: + * memory usage without sharing = freed_pages + used_frames + */ +long xc_sharing_used_frames(xc_interface *xch); + /* Get current total pages allocated to a domain. */ long xc_get_tot_pages(xc_interface *xch, uint32_t domid);
[PATCH 11 of 13] Tools: Add a sharing command to xl for information about shared pages
docs/man/xl.pod.1 | 13 ++++++++ tools/libxl/libxl.c | 2 + tools/libxl/libxl_types.idl | 2 + tools/libxl/xl.h | 1 + tools/libxl/xl_cmdimpl.c | 66 +++++++++++++++++++++++++++++++++++++++++++++ tools/libxl/xl_cmdtable.c | 5 +++ 6 files changed, 89 insertions(+), 0 deletions(-) Also add the global sharing statistics to the libxl physinfo. This is a slight departure from libxc, but there's no reason libxl physinfo can't include extra bits of useful and relevant information. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Ian Campbell <ian.campbell@citrix.com> diff -r 9116fc441ca4 -r 4cefca5789b1 docs/man/xl.pod.1 --- a/docs/man/xl.pod.1 +++ b/docs/man/xl.pod.1 @@ -410,6 +410,19 @@ Leave domain running after creating the =back +=item B<sharing> [I<domain-id>] + +List count of shared pages. + +B<OPTIONS> + +=over 4 + +=item I<domain_id> + +List specifically for that domain. Otherwise, list for all domains. + +=back =item B<shutdown> [I<OPTIONS>] I<domain-id> diff -r 9116fc441ca4 -r 4cefca5789b1 tools/libxl/libxl.c --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2507,6 +2507,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, l physinfo->total_pages = xcphysinfo.total_pages; physinfo->free_pages = xcphysinfo.free_pages; physinfo->scrub_pages = xcphysinfo.scrub_pages; + physinfo->sharing_freed_pages = xc_sharing_freed_pages(ctx->xch); + physinfo->sharing_used_frames = xc_sharing_used_frames(ctx->xch); physinfo->nr_nodes = xcphysinfo.nr_nodes; memcpy(physinfo->hw_cap,xcphysinfo.hw_cap, sizeof(physinfo->hw_cap)); physinfo->phys_cap = xcphysinfo.capabilities; diff -r 9116fc441ca4 -r 4cefca5789b1 tools/libxl/libxl_types.idl --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -367,6 +367,8 @@ libxl_physinfo = Struct("physinfo", [ ("total_pages", uint64), ("free_pages", uint64), ("scrub_pages", uint64), + ("sharing_freed_pages", uint64), + ("sharing_used_frames", uint64), ("nr_nodes",
uint32), ("hw_cap", libxl_hwcap), diff -r 9116fc441ca4 -r 4cefca5789b1 tools/libxl/xl.h --- a/tools/libxl/xl.h +++ b/tools/libxl/xl.h @@ -28,6 +28,7 @@ struct cmd_spec { int main_vcpulist(int argc, char **argv); int main_info(int argc, char **argv); +int main_sharing(int argc, char **argv); int main_cd_eject(int argc, char **argv); int main_cd_insert(int argc, char **argv); int main_console(int argc, char **argv); diff -r 9116fc441ca4 -r 4cefca5789b1 tools/libxl/xl_cmdimpl.c --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -3701,6 +3701,8 @@ static void output_physinfo(void) i = (1 << 20) / vinfo->pagesize; printf("total_memory : %"PRIu64"\n", info.total_pages / i); printf("free_memory : %"PRIu64"\n", info.free_pages / i); + printf("sharing_freed_memory : %"PRIu64"\n", info.sharing_freed_pages / i); + printf("sharing_used_memory : %"PRIu64"\n", info.sharing_used_frames / i); } if (!libxl_get_freecpus(ctx, &cpumap)) { libxl_for_each_cpu(i, cpumap) @@ -3784,6 +3786,70 @@ int main_info(int argc, char **argv) return 0; } +static void sharing(const libxl_dominfo *info, int nb_domain) +{ + int i; + + printf("Name ID Mem Shared\n"); + + for (i = 0; i < nb_domain; i++) { + char *domname; + unsigned shutdown_reason; + domname = libxl_domid_to_name(ctx, info[i].domid); + shutdown_reason = info[i].shutdown ? 
info[i].shutdown_reason : 0; + printf("%-40s %5d %5lu %5lu\n", + domname, + info[i].domid, + (unsigned long) (info[i].current_memkb / 1024), + (unsigned long) (info[i].shared_memkb / 1024)); + free(domname); + } +} + +int main_sharing(int argc, char **argv) +{ + int opt = 0; + libxl_dominfo info_buf; + libxl_dominfo *info, *info_free = NULL; + int nb_domain, rc; + + if ((opt = def_getopt(argc, argv, "", "sharing", 0)) != -1) + return opt; + + if (optind >= argc) { + info = libxl_list_domain(ctx, &nb_domain); + if (!info) { + fprintf(stderr, "libxl_domain_infolist failed.\n"); + return 1; + } + info_free = info; + } else if (optind == argc-1) { + find_domain(argv[optind]); + rc = libxl_domain_info(ctx, &info_buf, domid); + if (rc == ERROR_INVAL) { + fprintf(stderr, "Error: Domain \''%s\'' does not exist.\n", + argv[optind]); + return -rc; + } + if (rc) { + fprintf(stderr, "libxl_domain_info failed (code %d).\n", rc); + return -rc; + } + info = &info_buf; + nb_domain = 1; + } else { + help("sharing"); + return 2; + } + + sharing(info, nb_domain); + + if (info_free) + free(info_free); + + return 0; +} + static int sched_credit_domain_get( int domid, libxl_sched_credit *scinfo) { diff -r 9116fc441ca4 -r 4cefca5789b1 tools/libxl/xl_cmdtable.c --- a/tools/libxl/xl_cmdtable.c +++ b/tools/libxl/xl_cmdtable.c @@ -189,6 +189,11 @@ struct cmd_spec cmd_table[] = { "Get information about Xen host", "-n, --numa List host NUMA topology information", }, + { "sharing", + &main_sharing, 0, + "Get information about page sharing", + "[Domain]", + }, { "sched-credit", &main_sched_credit, 0, "Get/set credit scheduler parameters",
[PATCH 12 of 13] Memshrtool: tool to test and exercise the sharing subsystem
tools/tests/mem-sharing/Makefile | 26 +++++ tools/tests/mem-sharing/memshrtool.c | 165 +++++++++++++++++++++++++++++++++++ 2 files changed, 191 insertions(+), 0 deletions(-) This is demo code meant to showcase how to perform sharing operations. It is useful for testing. We submit it so others on the list can have unlimited sharing fun, but without necessarily expecting it to be added to the tree. Signed-off-by: Adin Scannell <adin@scannell.ca> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r 4cefca5789b1 -r 88856b776fcf tools/tests/mem-sharing/Makefile --- /dev/null +++ b/tools/tests/mem-sharing/Makefile @@ -0,0 +1,26 @@ +XEN_ROOT=$(CURDIR)/../../.. +include $(XEN_ROOT)/tools/Rules.mk + +CFLAGS += -Werror + +CFLAGS += $(CFLAGS_libxenctrl) +CFLAGS += $(CFLAGS_xeninclude) + +TARGETS-y := +TARGETS-$(CONFIG_X86) += memshrtool +TARGETS := $(TARGETS-y) + +.PHONY: all +all: build + +.PHONY: build +build: $(TARGETS) + +.PHONY: clean +clean: + $(RM) *.o $(TARGETS) *~ $(DEPS) + +memshrtool: memshrtool.o + $(CC) -o $@ $< $(LDFLAGS) $(LDLIBS_libxenctrl) + +-include $(DEPS) diff -r 4cefca5789b1 -r 88856b776fcf tools/tests/mem-sharing/memshrtool.c --- /dev/null +++ b/tools/tests/mem-sharing/memshrtool.c @@ -0,0 +1,165 @@ +/* + * memshrtool.c + * + * Copyright 2011 GridCentric Inc.
(Adin Scannell, Andres Lagar-Cavilla) + */ + +#include <stdio.h> +#include <unistd.h> +#include <stdlib.h> +#include <errno.h> +#include <string.h> +#include <sys/mman.h> + +#include "xenctrl.h" + +static int usage(const char* prog) +{ + printf("usage: %s <command> [args...]\n", prog); + printf("where <command> may be:\n"); + printf(" info - Display total sharing info.\n"); + printf(" enable - Enable sharing on a domain.\n"); + printf(" disable - Disable sharing on a domain.\n"); + printf(" nominate <domid> <gfn> - Nominate a page for sharing.\n"); + printf(" share <domid> <gfn> <handle> <source> <source-gfn> <source-handle>\n"); + printf(" - Share two pages.\n"); + printf(" unshare <domid> <gfn> - Unshare a page by grabbing a writable map.\n"); + printf(" add-to-physmap <domid> <gfn> <source> <source-gfn> <source-handle>\n"); + printf(" - Populate a page in a domain with a shared page.\n"); + printf(" debug-gfn <domid> <gfn> - Debug a particular domain and gfn.\n"); + return 1; +} + +#define R(f) do { \ + int rc = f; \ + if ( rc < 0 ) { \ + printf("error executing %s: %s\n", #f, \ + ((errno * -1) == XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID) ? \ + "problem with client handle" :\ + ((errno * -1) == XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID) ? 
\ + "problem with source handle" : strerror(errno)); \ + return rc; \ + } \ +} while(0) + +int main(int argc, const char** argv) +{ + const char* cmd = NULL; + xc_interface *xch = xc_interface_open(0, 0, 0); + + if( argc < 2 ) + return usage(argv[0]); + + cmd = argv[1]; + + if( !strcasecmp(cmd, "info") ) + { + if( argc != 2 ) + return usage(argv[0]); + + printf("used = %ld\n", xc_sharing_used_frames(xch)); + printf("freed = %ld\n", xc_sharing_freed_pages(xch)); + } + else if( !strcasecmp(cmd, "enable") ) + { + domid_t domid; + + if( argc != 3 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + R(xc_memshr_control(xch, domid, 1)); + } + else if( !strcasecmp(cmd, "disable") ) + { + domid_t domid; + + if( argc != 3 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + R(xc_memshr_control(xch, domid, 0)); + } + else if( !strcasecmp(cmd, "nominate") ) + { + domid_t domid; + unsigned long gfn; + uint64_t handle; + + if( argc != 4 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + gfn = strtol(argv[3], NULL, 0); + R(xc_memshr_nominate_gfn(xch, domid, gfn, &handle)); + printf("handle = 0x%08llx\n", (unsigned long long) handle); + } + else if( !strcasecmp(cmd, "share") ) + { + domid_t domid; + unsigned long gfn; + uint64_t handle; + domid_t source_domid; + unsigned long source_gfn; + uint64_t source_handle; + + if( argc != 8 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + gfn = strtol(argv[3], NULL, 0); + handle = strtol(argv[4], NULL, 0); + source_domid = strtol(argv[5], NULL, 0); + source_gfn = strtol(argv[6], NULL, 0); + source_handle = strtol(argv[7], NULL, 0); + R(xc_memshr_share_gfns(xch, source_domid, source_gfn, source_handle, domid, gfn, handle)); + } + else if( !strcasecmp(cmd, "unshare") ) + { + domid_t domid; + unsigned long gfn; + void *map; + + if( argc != 4 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + gfn = strtol(argv[3], NULL, 0); + map = xc_map_foreign_range(xch, domid, 
4096, PROT_WRITE, gfn); + if( map ) + munmap(map, 4096); + R((int)!map); + } + else if( !strcasecmp(cmd, "add-to-physmap") ) + { + domid_t domid; + unsigned long gfn; + domid_t source_domid; + unsigned long source_gfn; + uint64_t source_handle; + + if( argc != 7 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + gfn = strtol(argv[3], NULL, 0); + source_domid = strtol(argv[4], NULL, 0); + source_gfn = strtol(argv[5], NULL, 0); + source_handle = strtol(argv[6], NULL, 0); + R(xc_memshr_add_to_physmap(xch, source_domid, source_gfn, source_handle, domid, gfn)); + } + else if( !strcasecmp(cmd, "debug-gfn") ) + { + domid_t domid; + unsigned long gfn; + + if( argc != 4 ) + return usage(argv[0]); + + domid = strtol(argv[2], NULL, 0); + gfn = strtol(argv[3], NULL, 0); + R(xc_memshr_debug_gfn(xch, domid, gfn)); + } + + return 0; +}
Andres Lagar-Cavilla
2012-Jan-26 03:32 UTC
[PATCH 13 of 13] x86/mm: Sharing overhaul style improvements
 xen/arch/x86/mm/mem_sharing.c     |  64 +++++++++++++++++++++---------------------
 xen/include/asm-x86/mm.h          |   2 +-
 2 files changed, 33 insertions(+), 33 deletions(-)

The name 'shared_info' for the list of shared pages backed by a shared
frame collided with the identifier also used for a domain's shared info
page. To avoid grep/cscope/etc aliasing, rename the shared memory token
to 'sharing'.

This patch only addresses style, and performs no functional changes. To
ease reviewing, the patch was left as a stand-alone last-slot addition to
the queue, to avoid propagating changes throughout the whole series.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 88856b776fcf -r f09f62ae92b7 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -64,19 +64,19 @@ static void _free_pg_shared_info(struct
 
 static inline void audit_add_list(struct page_info *page)
 {
-    INIT_LIST_HEAD(&page->shared_info->entry);
+    INIT_LIST_HEAD(&page->sharing->entry);
     spin_lock(&shr_audit_lock);
-    list_add_rcu(&page->shared_info->entry, &shr_audit_list);
+    list_add_rcu(&page->sharing->entry, &shr_audit_list);
     spin_unlock(&shr_audit_lock);
 }
 
 static inline void audit_del_list(struct page_info *page)
 {
     spin_lock(&shr_audit_lock);
-    list_del_rcu(&page->shared_info->entry);
+    list_del_rcu(&page->sharing->entry);
     spin_unlock(&shr_audit_lock);
-    INIT_RCU_HEAD(&page->shared_info->rcu_head);
-    call_rcu(&page->shared_info->rcu_head, _free_pg_shared_info);
+    INIT_RCU_HEAD(&page->sharing->rcu_head);
+    call_rcu(&page->sharing->rcu_head, _free_pg_shared_info);
 }
 
 #else
@@ -86,7 +86,7 @@ static inline void audit_del_list(struct
 #define audit_add_list(p)  ((void)0)
 static inline void audit_del_list(struct page_info *page)
 {
-    xfree(page->shared_info);
+    xfree(page->sharing);
 }
 #endif /* MEM_SHARING_AUDIT */
 
@@ -171,7 +171,7 @@ static inline gfn_info_t *mem_sharing_gf
     gfn_info->gfn = gfn;
     gfn_info->domain = d->domain_id;
     INIT_LIST_HEAD(&gfn_info->list);
-    list_add(&gfn_info->list, &page->shared_info->gfns);
+    list_add(&gfn_info->list, &page->sharing->gfns);
 
     /* Increment our number of shared pges. */
     atomic_inc(&d->shr_pages);
@@ -220,14 +220,14 @@ static void mem_sharing_audit(void)
 
     list_for_each_rcu(ae, &shr_audit_list)
     {
-        struct page_sharing_info *shared_info;
+        struct page_sharing_info *pg_shared_info;
         unsigned long nr_gfns = 0;
         struct page_info *pg;
         struct list_head *le;
         mfn_t mfn;
 
-        shared_info = list_entry(ae, struct page_sharing_info, entry);
-        pg = shared_info->pg;
+        pg_shared_info = list_entry(ae, struct page_sharing_info, entry);
+        pg = pg_shared_info->pg;
         mfn = page_to_mfn(pg);
 
         /* If we can't lock it, it's definitely not a shared page */
@@ -265,7 +265,7 @@ static void mem_sharing_audit(void)
         }
 
         /* Check we have a list */
-        if ( (!pg->shared_info) || (list_empty(&pg->shared_info->gfns)) )
+        if ( (!pg->sharing) || (list_empty(&pg->sharing->gfns)) )
         {
             MEM_SHARING_DEBUG("mfn %lx shared, but empty gfn list!\n",
                               mfn_x(mfn));
@@ -277,7 +277,7 @@ static void mem_sharing_audit(void)
         count_found++;
 
         /* Check if all GFNs map to the MFN, and the p2m types */
-        list_for_each(le, &pg->shared_info->gfns)
+        list_for_each(le, &pg->sharing->gfns)
         {
             struct domain *d;
             p2m_type_t t;
@@ -630,7 +630,7 @@ int mem_sharing_nominate_page(struct dom
                     "grab page %lx dom %d\n", gfn, mfn_x(mfn), d->domain_id);
             BUG();
         }
-        *phandle = pg->shared_info->handle;
+        *phandle = pg->sharing->handle;
         ret = 0;
         mem_sharing_page_unlock(pg);
         goto out;
@@ -655,24 +655,24 @@ int mem_sharing_nominate_page(struct dom
 
     /* Initialize the shared state */
     ret = -ENOMEM;
-    if ( (page->shared_info = 
+    if ( (page->sharing = 
           xmalloc(struct page_sharing_info)) == NULL )
     {
         /* Making a page private atomically unlocks it */
        BUG_ON(page_make_private(d, page) != 0);
         goto out;
     }
-    page->shared_info->pg = page;
-    INIT_LIST_HEAD(&page->shared_info->gfns);
+    page->sharing->pg = page;
+    INIT_LIST_HEAD(&page->sharing->gfns);
 
     /* Create the handle */
-    page->shared_info->handle = get_next_handle();
+    page->sharing->handle = get_next_handle();
 
     /* Create the local gfn info */
     if ( (gfn_info = mem_sharing_gfn_alloc(page, d, gfn)) == NULL )
     {
-        xfree(page->shared_info);
-        page->shared_info = NULL;
+        xfree(page->sharing);
+        page->sharing = NULL;
         BUG_ON(page_make_private(d, page) != 0);
         goto out;
     }
@@ -685,8 +685,8 @@ int mem_sharing_nominate_page(struct dom
          * The mfn needs to revert back to rw type. This should never fail,
          * since no-one knew that the mfn was temporarily sharable */
         mem_sharing_gfn_destroy(d, gfn_info);
-        xfree(page->shared_info);
-        page->shared_info = NULL;
+        xfree(page->sharing);
+        page->sharing = NULL;
         /* NOTE: We haven't yet added this to the audit list. */
         BUG_ON(page_make_private(d, page) != 0);
         goto out;
@@ -698,7 +698,7 @@ int mem_sharing_nominate_page(struct dom
     /* Update m2p entry to SHARED_M2P_ENTRY */
     set_gpfn_from_mfn(mfn_x(mfn), SHARED_M2P_ENTRY);
 
-    *phandle = page->shared_info->handle;
+    *phandle = page->sharing->handle;
     audit_add_list(page);
     mem_sharing_page_unlock(page);
     ret = 0;
@@ -769,14 +769,14 @@ int mem_sharing_share_pages(struct domai
     ASSERT(cmfn_type == p2m_ram_shared);
 
     /* Check that the handles match */
-    if ( spage->shared_info->handle != sh )
+    if ( spage->sharing->handle != sh )
     {
         ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID;
         mem_sharing_page_unlock(secondpg);
         mem_sharing_page_unlock(firstpg);
         goto err_out;
     }
-    if ( cpage->shared_info->handle != ch )
+    if ( cpage->sharing->handle != ch )
     {
         ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID;
         mem_sharing_page_unlock(secondpg);
@@ -785,7 +785,7 @@ int mem_sharing_share_pages(struct domai
     }
 
     /* Merge the lists together */
-    list_for_each_safe(le, te, &cpage->shared_info->gfns)
+    list_for_each_safe(le, te, &cpage->sharing->gfns)
     {
         gfn = list_entry(le, gfn_info_t, list);
         /* Get the source page and type, this should never fail:
@@ -793,18 +793,18 @@ int mem_sharing_share_pages(struct domai
         BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));
         /* Move the gfn_info from client list to source list */
         list_del(&gfn->list);
-        list_add(&gfn->list, &spage->shared_info->gfns);
+        list_add(&gfn->list, &spage->sharing->gfns);
         put_page_and_type(cpage);
         d = get_domain_by_id(gfn->domain);
         BUG_ON(!d);
         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn) == 0);
         put_domain(d);
     }
-    ASSERT(list_empty(&cpage->shared_info->gfns));
+    ASSERT(list_empty(&cpage->sharing->gfns));
 
     /* Clear the rest of the shared state */
     audit_del_list(cpage);
-    cpage->shared_info = NULL;
+    cpage->sharing = NULL;
 
     mem_sharing_page_unlock(secondpg);
     mem_sharing_page_unlock(firstpg);
@@ -848,7 +848,7 @@ int mem_sharing_add_to_physmap(struct do
     ASSERT(smfn_type == p2m_ram_shared);
 
     /* Check that the handles match */
-    if ( spage->shared_info->handle != sh )
+    if ( spage->sharing->handle != sh )
         goto err_unlock;
 
     /* Make sure the target page is a hole in the physmap */
@@ -920,7 +920,7 @@ int mem_sharing_unshare_page(struct doma
         BUG();
     }
 
-    list_for_each(le, &page->shared_info->gfns)
+    list_for_each(le, &page->sharing->gfns)
     {
         gfn_info = list_entry(le, gfn_info_t, list);
         if ( (gfn_info->gfn == gfn) && (gfn_info->domain == d->domain_id) )
@@ -933,13 +933,13 @@ int mem_sharing_unshare_page(struct doma
 gfn_found:
     /* Do the accounting first. If anything fails below, we have bigger
      * bigger fish to fry. First, remove the gfn from the list. */
-    last_gfn = list_has_one_entry(&page->shared_info->gfns);
+    last_gfn = list_has_one_entry(&page->sharing->gfns);
     mem_sharing_gfn_destroy(d, gfn_info);
     if ( last_gfn )
     {
         /* Clean up shared state */
         audit_del_list(page);
-        page->shared_info = NULL;
+        page->sharing = NULL;
         atomic_dec(&nr_shared_mfns);
     }
     else
diff -r 88856b776fcf -r f09f62ae92b7 xen/include/asm-x86/mm.h
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -57,7 +57,7 @@ struct page_info
              * of sharing share the version they expect to.
              * This list is allocated and freed when a page is shared/unshared.
              */
-            struct page_sharing_info *shared_info;
+            struct page_sharing_info *sharing;
         };
 
         /* Reference count and various PGC_xxx flags and fields. */
At 22:32 -0500 on 25 Jan (1327530744), Andres Lagar-Cavilla wrote:
> This patch series brings about a set of changes to the sharing support
> in the hypervisor and tools. Specifically:
>
> - Granular scalable locking: lock mfn's individually when sharing/unsharing
>   them. Do not rely on a global lock. Use RCU to maintain a list of all
>   shared frames for audit purposes.
>
> - Refreshed stats: they now work (tm), and are accessible via libxc
>   and console.
>
> - libxl/xl support to get sharing stats.
>
> - New sharing command "add to physmap", to directly populate a new domain
>   with shared frames.
>
> - Toolstack and memshr kept in sync.

Applied all except #12; thank you.

Tools maintainers, I think it would be a good idea to take #12 as well,
in spite of the commit message saying 'not necessarily expecting it be
added to the tree', in order to have an in-tree consumer of these
interfaces.

Cheers,

Tim.
> At 22:32 -0500 on 25 Jan (1327530744), Andres Lagar-Cavilla wrote:
>> This patch series brings about a set of changes to the sharing support
>> in the hypervisor and tools. Specifically:
>>
>> - Granular scalable locking: lock mfn's individually when
>>   sharing/unsharing them. Do not rely on a global lock. Use RCU to
>>   maintain a list of all shared frames for audit purposes.
>>
>> - Refreshed stats: they now work (tm), and are accessible via libxc
>>   and console.
>>
>> - libxl/xl support to get sharing stats.
>>
>> - New sharing command "add to physmap", to directly populate a new
>>   domain with shared frames.
>>
>> - Toolstack and memshr kept in sync.
>
> Applied all except #12; thank you.
>
> Tools maintainers, I think it would be a good idea to take #12 as well,
> in spite of the commit message saying 'not necessarily expecting it be
> added to the tree', in order to have an in-tree consumer of these
> interfaces.

Excellent, thanks. Patch 12 is demo code. Nice to have, but neither
critical nor particularly clean...

Cheers,
Andres

>
> Cheers,
>
> Tim.
>
On Thu, Jan 26, Tim Deegan wrote:
> Applied all except #12; thank you.
>
> Tools maintainers, I think it would be a good idea to take #12 as well,
> in spite of the commit message saying 'not necessarily expecting it be
> added to the tree', in order to have an in-tree consumer of these
> interfaces.

I haven't looked at the tool, but I think that having some sort of
(easy to run) test vehicle for the features we have is always good. So
that other tool Andres sent to test access is also good to have in the
tree.

Olaf
Ian Jackson
2012-Jan-27 18:21 UTC
Re: [PATCH 12 of 13] Memshrtool: tool to test and exercise the sharing subsystem
Andres Lagar-Cavilla writes ("[Xen-devel] [PATCH 12 of 13] Memshrtool: tool to test and exercise the sharing subsystem"):
> This is demo code meant to showcase how to perform sharing
> operations. It is useful for testing. We submit it so others on
> the list can have unlimited sharing fun, but are not necessarily
> expecting it to be added to the tree.
>
> Signed-off-by: Adin Scannell <adin@scannell.ca>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

I have applied this. I would encourage you to submit a patch to wire
it into the tools build system.

Ian.
Andres Lagar-Cavilla
2012-Jan-27 18:27 UTC
Re: [PATCH 12 of 13] Memshrtool: tool to test and exercise the sharing subsystem
> Andres Lagar-Cavilla writes ("[Xen-devel] [PATCH 12 of 13] Memshrtool:
> tool to test and exercise the sharing subsystem"):
>> This is demo code meant to showcase how to perform sharing
>> operations. It is useful for testing. We submit it so others on
>> the list can have unlimited sharing fun, but are not necessarily
>> expecting it to be added to the tree.
>>
>> Signed-off-by: Adin Scannell <adin@scannell.ca>
>> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>
> I have applied this. I would encourage you to submit a patch to wire
> it into the tools build system.

Great. Give me a sec.

Andres

>
> Ian.
>