Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 0 of 9] x86/mm: Memory Sharing Overhaul V2
This patch series proposes a comprehensive overhaul of the memory sharing code. We include:

- Clean-ups.
- A reworked locking discipline. We introduce per-page locks to protect additions
  and removals of gfns to shared pages. We use RCU to manage a global list of
  shared pages used for auditing. The end result is the removal of the sharing
  global lock.
- New features:
  + Polling stats via console.
  + More stats: frames that are shared, and pages that are saved due to sharing.
  + A new sharing domctl to add a shared page directly to a hole in the physmap.
  + A new sharing domctl to perform audits.

Relevant tools patches are sent in a separate series.

A consequence of our modifications is a change to the sharing API. While we
refresh the only user of the sharing API in the tree, we want to make sure we
are not affecting any other users of the sharing API.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>

 xen/arch/x86/mm.c                 |    6 +-
 xen/arch/x86/mm/mem_sharing.c     |   91 +++--
 xen/arch/x86/mm/mem_sharing.c     |  560 +++++++++++++++++++------------------
 xen/include/asm-x86/mem_sharing.h |   19 +-
 xen/include/asm-x86/mm.h          |   11 +-
 xen/include/public/domctl.h       |    3 +
 xen/arch/x86/mm/mem_sharing.c     |   57 +++-
 xen/include/public/domctl.h       |    9 +
 xen/arch/x86/mm.c                 |    6 -
 xen/arch/x86/mm/mem_sharing.c     |   29 +
 xen/arch/x86/x86_64/compat/mm.c   |    6 +
 xen/arch/x86/x86_64/mm.c          |    7 +
 xen/include/asm-x86/mem_sharing.h |    1 +
 xen/include/public/memory.h       |    1 +
 xen/arch/x86/mm.c                 |   74 +---
 xen/arch/x86/mm/mem_sharing.c     |  280 +++++++++++++++++-
 xen/arch/x86/mm/mm-locks.h        |   31 +-
 xen/include/asm-x86/mm.h          |   28 +-
 xen/arch/x86/mm/mem_sharing.c     |  104 +++++++
 xen/include/public/domctl.h       |    3 +-
 xen/arch/ia64/xen/mm.c            |    6 +
 xen/arch/x86/mm/mem_sharing.c     |    8 +
 xen/common/keyhandler.c           |    7 +-
 xen/include/xen/mm.h              |    3 +
 xen/arch/x86/mm/mem_sharing.c     |   15 +-
 xen/include/public/domctl.h       |    1 +
 xen/arch/x86/mm/mem_sharing.c     |   83 ++--
 xen/arch/x86/mm/mm-locks.h        |   18 -
 xen/include/asm-x86/mem_sharing.h |    3 +-
 29 files changed, 957 insertions(+), 513 deletions(-)
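For context, a minimal sketch of how a caller might drive the reworked SHARE
operation through the domctl interface this series introduces. The field names
(source_gfn, source_handle, client_domain, client_gfn, client_handle) come from
the patches below; the do_domctl() wrapper and the surrounding plumbing are
hypothetical stand-ins for whatever hypercall path a toolstack uses.

/* Sketch only: issue XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE with the new
 * gfn+handle pairs. do_domctl() is a hypothetical hypercall wrapper. */
#include <stdint.h>
#include <string.h>
#include <xen/domctl.h>   /* include path as in a toolstack build */

static int share_pages_sketch(domid_t sdom, uint64_t sgfn, uint64_t sh,
                              domid_t cdom, uint64_t cgfn, uint64_t ch)
{
    struct xen_domctl domctl;
    struct xen_domctl_mem_sharing_op *mec = &domctl.u.mem_sharing_op;

    memset(&domctl, 0, sizeof(domctl));
    domctl.cmd = XEN_DOMCTL_mem_sharing_op;
    domctl.interface_version = XEN_DOMCTL_INTERFACE_VERSION;
    domctl.domain = sdom;                  /* source domain */

    mec->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE;
    mec->u.share.source_gfn = sgfn;        /* IN: gfn of the source page */
    mec->u.share.source_handle = sh;       /* IN: handle from nominate */
    mec->u.share.client_domain = cdom;     /* IN: client domain id */
    mec->u.share.client_gfn = cgfn;        /* IN: client gfn */
    mec->u.share.client_handle = ch;       /* IN: handle from nominate */

    return do_domctl(&domctl);             /* hypothetical wrapper */
}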
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 1 of 9] x86/mm: Code style fixes in mem_sharing.c
xen/arch/x86/mm.c | 6 +- xen/arch/x86/mm/mem_sharing.c | 91 ++++++++++++++++++++++-------------------- 2 files changed, 50 insertions(+), 47 deletions(-) No functional changes, only code style issues. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r f232d335de98 -r d29cba447de1 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4321,9 +4321,9 @@ int page_make_sharable(struct domain *d, return -EEXIST; } - /* Check if the ref count is 2. The first from PGT_allocated, and + /* Check if the ref count is 2. The first from PGC_allocated, and * the second from get_page_and_type at the top of this function */ - if(page->count_info != (PGC_allocated | (2 + expected_refcnt))) + if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) ) { /* Return type count back to zero */ put_page_and_type(page); @@ -4340,7 +4340,7 @@ int page_make_sharable(struct domain *d, int page_make_private(struct domain *d, struct page_info *page) { - if(!get_page(page, dom_cow)) + if ( !get_page(page, dom_cow) ) return -EINVAL; spin_lock(&d->page_alloc_lock); diff -r f232d335de98 -r d29cba447de1 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -35,14 +35,14 @@ #include "mm-locks.h" /* Auditing of memory sharing code? */ -#define MEM_SHARING_AUDIT 0 +#define MEM_SHARING_AUDIT 0 #if MEM_SHARING_AUDIT static void mem_sharing_audit(void); #define MEM_SHARING_DEBUG(_f, _a...) \ debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a) #else -# define mem_sharing_audit() do {} while(0) +#define mem_sharing_audit() do {} while(0) #endif /* MEM_SHARING_AUDIT */ #define mem_sharing_enabled(d) \ @@ -56,7 +56,7 @@ static void mem_sharing_audit(void); #define page_to_mfn(_pg) _mfn(__page_to_mfn(_pg)) static shr_handle_t next_handle = 1; -static atomic_t nr_saved_mfns = ATOMIC_INIT(0); +static atomic_t nr_saved_mfns = ATOMIC_INIT(0); typedef struct shr_hash_entry { @@ -84,9 +84,9 @@ static inline int list_has_one_entry(str return (head->next != head) && (head->next->next == head); } -static inline struct gfn_info* gfn_get_info(struct list_head *list) +static inline gfn_info_t *gfn_get_info(struct list_head *list) { - return list_entry(list->next, struct gfn_info, list); + return list_entry(list->next, gfn_info_t, list); } static void __init mem_sharing_hash_init(void) @@ -116,12 +116,12 @@ static gfn_info_t *mem_sharing_gfn_alloc static void mem_sharing_gfn_destroy(gfn_info_t *gfn, int was_shared) { /* Decrement the number of pages, if the gfn was shared before */ - if(was_shared) + if ( was_shared ) { struct domain *d = get_domain_by_id(gfn->domain); - /* Domain may have been destroyed by now, if we are called from - * p2m_teardown */ - if(d) + /* Domain may have been destroyed by now * + * (if we are called from p2m_teardown) */ + if ( d ) { atomic_dec(&d->shr_pages); put_domain(d); @@ -261,7 +261,8 @@ static struct page_info* mem_sharing_all mem_event_request_t req; page = alloc_domheap_page(d, 0); - if(page != NULL) return page; + if ( page != NULL ) + return page; memset(&req, 0, sizeof(req)); req.type = MEM_EVENT_TYPE_SHARED; @@ -293,7 +294,7 @@ static struct page_info* mem_sharing_all unsigned int mem_sharing_get_nr_saved_mfns(void) { - return (unsigned int)atomic_read(&nr_saved_mfns); + return ((unsigned int)atomic_read(&nr_saved_mfns)); } int mem_sharing_sharing_resume(struct domain *d) @@ -317,14 +318,15 @@ int mem_sharing_debug_mfn(unsigned long { struct page_info *page; - if(!mfn_valid(_mfn(mfn))) + if ( 
!mfn_valid(_mfn(mfn)) ) { - printk("Invalid MFN=%lx\n", mfn); + gdprintk(XENLOG_ERR, "Invalid MFN=%lx\n", mfn); return -1; } page = mfn_to_page(_mfn(mfn)); - printk("Debug page: MFN=%lx is ci=%lx, ti=%lx, owner_id=%d\n", + gdprintk(XENLOG_DEBUG, + "Debug page: MFN=%lx is ci=%lx, ti=%lx, owner_id=%d\n", mfn_x(page_to_mfn(page)), page->count_info, page->u.inuse.type_info, @@ -340,9 +342,9 @@ int mem_sharing_debug_gfn(struct domain mfn = get_gfn_unlocked(d, gfn, &p2mt); - printk("Debug for domain=%d, gfn=%lx, ", - d->domain_id, - gfn); + gdprintk(XENLOG_DEBUG, "Debug for domain=%d, gfn=%lx, ", + d->domain_id, + gfn); return mem_sharing_debug_mfn(mfn_x(mfn)); } @@ -359,8 +361,8 @@ int mem_sharing_debug_gfn(struct domain static grant_entry_header_t * shared_entry_header(struct grant_table *t, grant_ref_t ref) { - ASSERT(t->gt_version != 0); - if (t->gt_version == 1) + ASSERT (t->gt_version != 0); + if ( t->gt_version == 1 ) return (grant_entry_header_t*)&shared_entry_v1(t, ref); else return &shared_entry_v2(t, ref).hdr; @@ -370,25 +372,23 @@ static int mem_sharing_gref_to_gfn(struc grant_ref_t ref, unsigned long *gfn) { - if(d->grant_table->gt_version < 1) + if ( d->grant_table->gt_version < 1 ) return -1; - if (d->grant_table->gt_version == 1) + if ( d->grant_table->gt_version == 1 ) { grant_entry_v1_t *sha1; sha1 = &shared_entry_v1(d->grant_table, ref); *gfn = sha1->frame; - return 0; } else { grant_entry_v2_t *sha2; sha2 = &shared_entry_v2(d->grant_table, ref); *gfn = sha2->full_page.frame; - return 0; } - return -2; + return 0; } /* Account for a GFN being shared/unshared. @@ -428,20 +428,22 @@ int mem_sharing_debug_gref(struct domain uint16_t status; unsigned long gfn; - if(d->grant_table->gt_version < 1) + if ( d->grant_table->gt_version < 1 ) { - printk("Asked to debug [dom=%d,gref=%d], but not yet inited.\n", + gdprintk(XENLOG_ERR, + "Asked to debug [dom=%d,gref=%d], but not yet inited.\n", d->domain_id, ref); return -1; } - mem_sharing_gref_to_gfn(d, ref, &gfn); + (void)mem_sharing_gref_to_gfn(d, ref, &gfn); shah = shared_entry_header(d->grant_table, ref); - if (d->grant_table->gt_version == 1) + if ( d->grant_table->gt_version == 1 ) status = shah->flags; else status = status_entry(d->grant_table, ref); - printk("==> Grant [dom=%d,ref=%d], status=%x. ", + gdprintk(XENLOG_DEBUG, + "==> Grant [dom=%d,ref=%d], status=%x. ", d->domain_id, ref, status); return mem_sharing_debug_gfn(d, gfn); @@ -467,24 +469,24 @@ int mem_sharing_nominate_page(struct dom /* Check if mfn is valid */ ret = -EINVAL; - if (!mfn_valid(mfn)) + if ( !mfn_valid(mfn) ) goto out; /* Return the handle if the page is already shared */ page = mfn_to_page(mfn); - if (p2m_is_shared(p2mt)) { + if ( p2m_is_shared(p2mt) ) { *phandle = page->shr_handle; ret = 0; goto out; } /* Check p2m type */ - if (!p2m_is_sharable(p2mt)) + if ( !p2m_is_sharable(p2mt) ) goto out; /* Try to convert the mfn to the sharable type */ ret = page_make_sharable(d, page, expected_refcnt); - if(ret) + if ( ret ) goto out; /* Create the handle */ @@ -501,7 +503,7 @@ int mem_sharing_nominate_page(struct dom } /* Change the p2m type */ - if(p2m_change_type(d, gfn, p2mt, p2m_ram_shared) != p2mt) + if ( p2m_change_type(d, gfn, p2mt, p2m_ram_shared) != p2mt ) { /* This is unlikely, as the type must have changed since we''ve checked * it a few lines above. @@ -607,7 +609,7 @@ int mem_sharing_unshare_page(struct doma mfn = get_gfn(d, gfn, &p2mt); /* Has someone already unshared it? 
*/ - if (!p2m_is_shared(p2mt)) { + if ( !p2m_is_shared(p2mt) ) { put_gfn(d, gfn); shr_unlock(); return 0; @@ -620,10 +622,11 @@ int mem_sharing_unshare_page(struct doma list_for_each(le, &hash_entry->gfns) { gfn_info = list_entry(le, struct gfn_info, list); - if((gfn_info->gfn == gfn) && (gfn_info->domain == d->domain_id)) + if ( (gfn_info->gfn == gfn) && (gfn_info->domain == d->domain_id) ) goto gfn_found; } - printk("Could not find gfn_info for shared gfn: %lx\n", gfn); + gdprintk(XENLOG_ERR, "Could not find gfn_info for shared gfn: " + "%lx\n", gfn); BUG(); gfn_found: /* Delete gfn_info from the list, but hold on to it, until we''ve allocated @@ -633,7 +636,7 @@ gfn_found: /* If the GFN is getting destroyed drop the references to MFN * (possibly freeing the page), and exit early */ - if(flags & MEM_SHARING_DESTROY_GFN) + if ( flags & MEM_SHARING_DESTROY_GFN ) { mem_sharing_gfn_destroy(gfn_info, !last_gfn); if(last_gfn) @@ -691,8 +694,8 @@ private_page_found: else atomic_dec(&nr_saved_mfns); - if(p2m_change_type(d, gfn, p2m_ram_shared, p2m_ram_rw) != - p2m_ram_shared) + if ( p2m_change_type(d, gfn, p2m_ram_shared, p2m_ram_rw) != + p2m_ram_shared ) { printk("Could not change p2m type.\n"); BUG(); @@ -729,7 +732,7 @@ int mem_sharing_domctl(struct domain *d, { unsigned long gfn = mec->u.nominate.u.gfn; shr_handle_t handle; - if(!mem_sharing_enabled(d)) + if ( !mem_sharing_enabled(d) ) return -EINVAL; rc = mem_sharing_nominate_page(d, gfn, 0, &handle); mec->u.nominate.handle = handle; @@ -742,9 +745,9 @@ int mem_sharing_domctl(struct domain *d, unsigned long gfn; shr_handle_t handle; - if(!mem_sharing_enabled(d)) + if ( !mem_sharing_enabled(d) ) return -EINVAL; - if(mem_sharing_gref_to_gfn(d, gref, &gfn) < 0) + if ( mem_sharing_gref_to_gfn(d, gref, &gfn) < 0 ) return -EINVAL; rc = mem_sharing_nominate_page(d, gfn, 3, &handle); mec->u.nominate.handle = handle; @@ -761,7 +764,7 @@ int mem_sharing_domctl(struct domain *d, case XEN_DOMCTL_MEM_EVENT_OP_SHARING_RESUME: { - if(!mem_sharing_enabled(d)) + if ( !mem_sharing_enabled(d) ) return -EINVAL; rc = mem_sharing_sharing_resume(d); }
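The pattern of the style fixes, distilled into one illustrative function (not
an excerpt from the patch): Xen's coding style puts spaces inside the condition
parentheses of if statements, and hypervisor diagnostics go through gdprintk()
with a log level rather than raw printk().

/* Style demo only, assuming Xen hypervisor headers. */
#include <xen/lib.h>
#include <asm/mm.h>

static int style_demo(unsigned long mfn)
{
    /* Before: if(!mfn_valid(_mfn(mfn))) { printk("Invalid MFN=%lx\n", mfn); } */
    if ( !mfn_valid(_mfn(mfn)) )
    {
        gdprintk(XENLOG_ERR, "Invalid MFN=%lx\n", mfn);
        return -1;
    }
    return 0;
}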
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 2 of 9] x86/mm: Eliminate hash table in sharing code as index of shared mfns
xen/arch/x86/mm/mem_sharing.c | 560 +++++++++++++++++++------------------ xen/include/asm-x86/mem_sharing.h | 19 +- xen/include/asm-x86/mm.h | 11 +- xen/include/public/domctl.h | 3 + 4 files changed, 322 insertions(+), 271 deletions(-) The hashtable mechanism used by the sharing code is a bit odd. It seems unnecessary and adds complexity. Instead, we replace this by storing a list head directly in the page info for the case when the page is shared. This does not add any extra space to the page_info and serves to remove significant complexity from sharing. Signed-off-by: Adin Scannell <adin@scannell.ca> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r d29cba447de1 -r 4862cca21d3c xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -3,6 +3,7 @@ * * Memory sharing support. * + * Copyright (c) 2011 GridCentric, Inc. (Adin Scannell & Andres Lagar-Cavilla) * Copyright (c) 2009 Citrix Systems, Inc. (Grzegorz Milos) * * This program is free software; you can redistribute it and/or modify @@ -34,15 +35,29 @@ #include "mm-locks.h" -/* Auditing of memory sharing code? */ -#define MEM_SHARING_AUDIT 0 - #if MEM_SHARING_AUDIT -static void mem_sharing_audit(void); +static int mem_sharing_audit(void); #define MEM_SHARING_DEBUG(_f, _a...) \ debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a) +static struct list_head shr_audit_list; + +static inline void audit_add_list(struct page_info *page) +{ + list_add(&page->shared_info->entry, &shr_audit_list); +} + +static inline void audit_del_list(struct page_info *page) +{ + list_del(&page->shared_info->entry); +} #else -#define mem_sharing_audit() do {} while(0) +static inline int mem_sharing_audit(void) +{ + return 0; +} + +#define audit_add_list(p) ((void)0) +#define audit_del_list(p) ((void)0) #endif /* MEM_SHARING_AUDIT */ #define mem_sharing_enabled(d) \ @@ -58,17 +73,6 @@ static void mem_sharing_audit(void); static shr_handle_t next_handle = 1; static atomic_t nr_saved_mfns = ATOMIC_INIT(0); -typedef struct shr_hash_entry -{ - shr_handle_t handle; - mfn_t mfn; - struct shr_hash_entry *next; - struct list_head gfns; -} shr_hash_entry_t; - -#define SHR_HASH_LENGTH 1000 -static shr_hash_entry_t *shr_hash[SHR_HASH_LENGTH]; - typedef struct gfn_info { unsigned long gfn; @@ -89,36 +93,30 @@ static inline gfn_info_t *gfn_get_info(s return list_entry(list->next, gfn_info_t, list); } -static void __init mem_sharing_hash_init(void) +static inline gfn_info_t *mem_sharing_gfn_alloc(struct page_info *page, + struct domain *d, + unsigned long gfn) { - int i; + gfn_info_t *gfn_info = xmalloc(gfn_info_t); - mm_lock_init(&shr_lock); - for(i=0; i<SHR_HASH_LENGTH; i++) - shr_hash[i] = NULL; + if ( gfn_info == NULL ) + return NULL; + + gfn_info->gfn = gfn; + gfn_info->domain = d->domain_id; + INIT_LIST_HEAD(&gfn_info->list); + list_add(&gfn_info->list, &page->shared_info->gfns); + + return gfn_info; } -static shr_hash_entry_t *mem_sharing_hash_alloc(void) -{ - return xmalloc(shr_hash_entry_t); -} - -static void mem_sharing_hash_destroy(shr_hash_entry_t *e) -{ - xfree(e); -} - -static gfn_info_t *mem_sharing_gfn_alloc(void) -{ - return xmalloc(gfn_info_t); -} - -static void mem_sharing_gfn_destroy(gfn_info_t *gfn, int was_shared) +static inline void mem_sharing_gfn_destroy(gfn_info_t *gfn_info, + int was_shared) { /* Decrement the number of pages, if the gfn was shared before */ if ( was_shared ) { - struct domain *d = get_domain_by_id(gfn->domain); + struct domain *d = 
get_domain_by_id(gfn_info->domain); /* Domain may have been destroyed by now * * (if we are called from p2m_teardown) */ if ( d ) @@ -127,128 +125,127 @@ static void mem_sharing_gfn_destroy(gfn_ put_domain(d); } } - xfree(gfn); + + list_del(&gfn_info->list); + xfree(gfn_info); } -static shr_hash_entry_t* mem_sharing_hash_lookup(shr_handle_t handle) +static struct page_info* mem_sharing_lookup(unsigned long mfn) { - shr_hash_entry_t *e; - - e = shr_hash[handle % SHR_HASH_LENGTH]; - while(e != NULL) + if ( mfn_valid(_mfn(mfn)) ) { - if(e->handle == handle) - return e; - e = e->next; + struct page_info* page = mfn_to_page(_mfn(mfn)); + if ( page_get_owner(page) == dom_cow ) + { + ASSERT(page->u.inuse.type_info & PGT_type_mask); + ASSERT(get_gpfn_from_mfn(mfn) == SHARED_M2P_ENTRY); + return page; + } } return NULL; } -static shr_hash_entry_t* mem_sharing_hash_insert(shr_handle_t handle, mfn_t mfn) +#if MEM_SHARING_AUDIT +static int mem_sharing_audit(void) { - shr_hash_entry_t *e, **ee; - - e = mem_sharing_hash_alloc(); - if(e == NULL) return NULL; - e->handle = handle; - e->mfn = mfn; - ee = &shr_hash[handle % SHR_HASH_LENGTH]; - e->next = *ee; - *ee = e; - return e; -} - -static void mem_sharing_hash_delete(shr_handle_t handle) -{ - shr_hash_entry_t **pprev, *e; - - pprev = &shr_hash[handle % SHR_HASH_LENGTH]; - e = *pprev; - while(e != NULL) - { - if(e->handle == handle) - { - *pprev = e->next; - mem_sharing_hash_destroy(e); - return; - } - pprev = &e->next; - e = e->next; - } - printk("Could not find shr entry for handle %"PRIx64"\n", handle); - BUG(); -} - -#if MEM_SHARING_AUDIT -static void mem_sharing_audit(void) -{ - shr_hash_entry_t *e; - struct list_head *le; - gfn_info_t *g; - int bucket; - struct page_info *pg; + int errors = 0; + struct list_head *ae; ASSERT(shr_locked_by_me()); - for(bucket=0; bucket < SHR_HASH_LENGTH; bucket++) + list_for_each(ae, &shr_audit_list) { - e = shr_hash[bucket]; - /* Loop over all shr_hash_entries */ - while(e != NULL) + struct page_sharing_info *shared_info; + unsigned long nr_gfns = 0; + struct page_info *pg; + struct list_head *le; + mfn_t mfn; + + shared_info = list_entry(ae, struct page_sharing_info, entry); + pg = shared_info->pg; + mfn = page_to_mfn(pg); + + /* Check if the MFN has correct type, owner and handle. */ + if ( !(pg->u.inuse.type_info & PGT_shared_page) ) { - int nr_gfns=0; + MEM_SHARING_DEBUG("mfn %lx in audit list, but not PGT_shared_page (%d)!\n", + mfn_x(mfn), pg->u.inuse.type_info & PGT_type_mask); + errors++; + continue; + } - /* Check if the MFN has correct type, owner and handle */ - pg = mfn_to_page(e->mfn); - if((pg->u.inuse.type_info & PGT_type_mask) != PGT_shared_page) - MEM_SHARING_DEBUG("mfn %lx not shared, but in the hash!\n", - mfn_x(e->mfn)); - if(page_get_owner(pg) != dom_cow) - MEM_SHARING_DEBUG("mfn %lx shared, but wrong owner (%d)!\n", - mfn_x(e->mfn), - page_get_owner(pg)->domain_id); - if(e->handle != pg->shr_handle) - MEM_SHARING_DEBUG("mfn %lx shared, but wrong handle " - "(%ld != %ld)!\n", - mfn_x(e->mfn), pg->shr_handle, e->handle); - /* Check if all GFNs map to the MFN, and the p2m types */ - list_for_each(le, &e->gfns) + /* Check the page owner. 
*/ + if ( page_get_owner(pg) != dom_cow ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but wrong owner (%d)!\n", + mfn_x(mfn), page_get_owner(pg)->domain_id); + errors++; + } + + /* Check the m2p entry */ + if ( get_gpfn_from_mfn(mfn_x(mfn)) != SHARED_M2P_ENTRY ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but wrong m2p entry (%lx)!\n", + mfn_x(mfn), get_gpfn_from_mfn(mfn_x(mfn))); + errors++; + } + + /* Check we have a list */ + if ( (!pg->shared_info) || (list_empty(&pg->shared_info->gfns)) ) + { + MEM_SHARING_DEBUG("mfn %lx shared, but empty gfn list!\n", + mfn_x(mfn)); + errors++; + continue; + } + + /* Check if all GFNs map to the MFN, and the p2m types */ + list_for_each(le, &pg->shared_info->gfns) + { + struct domain *d; + p2m_type_t t; + mfn_t o_mfn; + gfn_info_t *g; + + g = list_entry(le, gfn_info_t, list); + d = get_domain_by_id(g->domain); + if ( d == NULL ) { - struct domain *d; - p2m_type_t t; - mfn_t mfn; - - g = list_entry(le, struct gfn_info, list); - d = get_domain_by_id(g->domain); - if(d == NULL) - { - MEM_SHARING_DEBUG("Unknow dom: %d, for PFN=%lx, MFN=%lx\n", - g->domain, g->gfn, mfn_x(e->mfn)); - continue; - } - mfn = get_gfn_unlocked(d, g->gfn, &t); - if(mfn_x(mfn) != mfn_x(e->mfn)) - MEM_SHARING_DEBUG("Incorrect P2M for d=%d, PFN=%lx." - "Expecting MFN=%ld, got %ld\n", - g->domain, g->gfn, mfn_x(e->mfn), - mfn_x(mfn)); - if(t != p2m_ram_shared) - MEM_SHARING_DEBUG("Incorrect P2M type for d=%d, PFN=%lx." - "Expecting t=%d, got %d\n", - g->domain, g->gfn, mfn_x(e->mfn), - p2m_ram_shared, t); - put_domain(d); - nr_gfns++; - } - if(nr_gfns != (pg->u.inuse.type_info & PGT_count_mask)) - MEM_SHARING_DEBUG("Mismatched counts for MFN=%lx." - "nr_gfns in hash %d, in type_info %d\n", - mfn_x(e->mfn), nr_gfns, - (pg->u.inuse.type_info & PGT_count_mask)); - e = e->next; + MEM_SHARING_DEBUG("Unknown dom: %d, for PFN=%lx, MFN=%lx\n", + g->domain, g->gfn, mfn); + errors++; + continue; + } + o_mfn = get_gfn_query_unlocked(d, g->gfn, &t); + if ( mfn_x(o_mfn) != mfn_x(mfn) ) + { + MEM_SHARING_DEBUG("Incorrect P2M for d=%d, PFN=%lx." + "Expecting MFN=%ld, got %ld\n", + g->domain, g->gfn, mfn, mfn_x(o_mfn)); + errors++; + } + if ( t != p2m_ram_shared ) + { + MEM_SHARING_DEBUG("Incorrect P2M type for d=%d, PFN=%lx." + "Expecting t=%d, got %d\n", + g->domain, g->gfn, mfn, p2m_ram_shared, t); + errors++; + } + put_domain(d); + nr_gfns++; + } + if ( nr_gfns != (pg->u.inuse.type_info & PGT_count_mask) ) + { + MEM_SHARING_DEBUG("Mismatched counts for MFN=%lx." + "nr_gfns in hash %d, in type_info %d\n", + mfn_x(mfn), nr_gfns, + (pg->u.inuse.type_info & PGT_count_mask)); + errors++; } } + + return errors; } #endif @@ -391,36 +388,6 @@ static int mem_sharing_gref_to_gfn(struc return 0; } -/* Account for a GFN being shared/unshared. - * When sharing this function needs to be called _before_ gfn lists are merged - * together, but _after_ gfn is removed from the list when unsharing. - */ -static int mem_sharing_gfn_account(struct gfn_info *gfn, int sharing) -{ - struct domain *d; - - /* A) When sharing: - * if the gfn being shared is in > 1 long list, its already been - * accounted for - * B) When unsharing: - * if the list is longer than > 1, we don''t have to account yet. 
- */ - if(list_has_one_entry(&gfn->list)) - { - d = get_domain_by_id(gfn->domain); - BUG_ON(!d); - if(sharing) - atomic_inc(&d->shr_pages); - else - atomic_dec(&d->shr_pages); - put_domain(d); - - return 1; - } - mem_sharing_audit(); - - return 0; -} int mem_sharing_debug_gref(struct domain *d, grant_ref_t ref) { @@ -458,8 +425,6 @@ int mem_sharing_nominate_page(struct dom mfn_t mfn; struct page_info *page; int ret; - shr_handle_t handle; - shr_hash_entry_t *hash_entry; struct gfn_info *gfn_info; *phandle = 0UL; @@ -475,7 +440,7 @@ int mem_sharing_nominate_page(struct dom /* Return the handle if the page is already shared */ page = mfn_to_page(mfn); if ( p2m_is_shared(p2mt) ) { - *phandle = page->shr_handle; + *phandle = page->shared_info->handle; ret = 0; goto out; } @@ -489,16 +454,29 @@ int mem_sharing_nominate_page(struct dom if ( ret ) goto out; - /* Create the handle */ + /* Initialize the shared state */ ret = -ENOMEM; - handle = next_handle++; - if((hash_entry = mem_sharing_hash_insert(handle, mfn)) == NULL) + if ( (page->shared_info = + xmalloc(struct page_sharing_info)) == NULL ) { + BUG_ON(page_make_private(d, page) != 0); goto out; } - if((gfn_info = mem_sharing_gfn_alloc()) == NULL) + page->shared_info->pg = page; + INIT_LIST_HEAD(&page->shared_info->gfns); +#if MEM_SHARING_AUDIT + INIT_LIST_HEAD(&page->shared_info->entry); +#endif + + /* Create the handle */ + page->shared_info->handle = next_handle++; + + /* Create the local gfn info */ + if ( (gfn_info = mem_sharing_gfn_alloc(page, d, gfn)) == NULL ) { - mem_sharing_hash_destroy(hash_entry); + xfree(page->shared_info); + page->shared_info = NULL; + BUG_ON(page_make_private(d, page) != 0); goto out; } @@ -509,23 +487,19 @@ int mem_sharing_nominate_page(struct dom * it a few lines above. * The mfn needs to revert back to rw type. This should never fail, * since no-one knew that the mfn was temporarily sharable */ + mem_sharing_gfn_destroy(gfn_info, 0); + xfree(page->shared_info); + page->shared_info = NULL; + /* NOTE: We haven''t yet added this to the audit list. 
*/ BUG_ON(page_make_private(d, page) != 0); - mem_sharing_hash_destroy(hash_entry); - mem_sharing_gfn_destroy(gfn_info, 0); goto out; } /* Update m2p entry to SHARED_M2P_ENTRY */ set_gpfn_from_mfn(mfn_x(mfn), SHARED_M2P_ENTRY); - INIT_LIST_HEAD(&hash_entry->gfns); - INIT_LIST_HEAD(&gfn_info->list); - list_add(&gfn_info->list, &hash_entry->gfns); - gfn_info->gfn = gfn; - gfn_info->domain = d->domain_id; - page->shr_handle = handle; - *phandle = handle; - + *phandle = page->shared_info->handle; + audit_add_list(page); ret = 0; out: @@ -534,54 +508,92 @@ out: return ret; } -int mem_sharing_share_pages(shr_handle_t sh, shr_handle_t ch) +int mem_sharing_share_pages(struct domain *sd, unsigned long sgfn, shr_handle_t sh, + struct domain *cd, unsigned long cgfn, shr_handle_t ch) { - shr_hash_entry_t *se, *ce; struct page_info *spage, *cpage; struct list_head *le, *te; - struct gfn_info *gfn; + gfn_info_t *gfn; struct domain *d; - int ret; + int ret = -EINVAL, single_client_gfn, single_source_gfn; + mfn_t smfn, cmfn; + p2m_type_t smfn_type, cmfn_type; shr_lock(); + /* XXX if sd == cd handle potential deadlock by ordering + * the get_ and put_gfn''s */ + smfn = get_gfn(sd, sgfn, &smfn_type); + cmfn = get_gfn(cd, cgfn, &cmfn_type); + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; - se = mem_sharing_hash_lookup(sh); - if(se == NULL) goto err_out; + spage = mem_sharing_lookup(mfn_x(smfn)); + if ( spage == NULL ) + goto err_out; + ASSERT(smfn_type == p2m_ram_shared); ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; - ce = mem_sharing_hash_lookup(ch); - if(ce == NULL) goto err_out; - spage = mfn_to_page(se->mfn); - cpage = mfn_to_page(ce->mfn); - /* gfn lists always have at least one entry => save to call list_entry */ - mem_sharing_gfn_account(gfn_get_info(&ce->gfns), 1); - mem_sharing_gfn_account(gfn_get_info(&se->gfns), 1); - list_for_each_safe(le, te, &ce->gfns) + cpage = mem_sharing_lookup(mfn_x(cmfn)); + if ( cpage == NULL ) + goto err_out; + ASSERT(cmfn_type == p2m_ram_shared); + + /* Check that the handles match */ + if ( spage->shared_info->handle != sh ) { - gfn = list_entry(le, struct gfn_info, list); - /* Get the source page and type, this should never fail - * because we are under shr lock, and got non-null se */ + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + goto err_out; + } + if ( cpage->shared_info->handle != ch ) + { + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + goto err_out; + } + + /* Merge the lists together */ + single_source_gfn = list_has_one_entry(&spage->shared_info->gfns); + single_client_gfn = list_has_one_entry(&cpage->shared_info->gfns); + list_for_each_safe(le, te, &cpage->shared_info->gfns) + { + gfn = list_entry(le, gfn_info_t, list); + /* Get the source page and type, this should never fail: + * we are under shr lock, and got a successful lookup */ BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page)); - /* Move the gfn_info from ce list to se list */ + /* Move the gfn_info from client list to source list */ list_del(&gfn->list); + list_add(&gfn->list, &spage->shared_info->gfns); + put_page_and_type(cpage); d = get_domain_by_id(gfn->domain); BUG_ON(!d); - BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0); + BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn) == 0); + if ( single_client_gfn ) + { + /* Only increase the per-domain count when we are actually + * sharing. 
And don''t increase it should we ever re-share */ + atomic_inc(&d->shr_pages); + ASSERT( cd == d ); + } put_domain(d); - list_add(&gfn->list, &se->gfns); - put_page_and_type(cpage); - } - ASSERT(list_empty(&ce->gfns)); - mem_sharing_hash_delete(ch); - atomic_inc(&nr_saved_mfns); + } + ASSERT(list_empty(&cpage->shared_info->gfns)); + if ( single_source_gfn ) + atomic_inc(&sd->shr_pages); + + /* Clear the rest of the shared state */ + audit_del_list(cpage); + xfree(cpage->shared_info); + cpage->shared_info = NULL; + /* Free the client page */ if(test_and_clear_bit(_PGC_allocated, &cpage->count_info)) put_page(cpage); + + atomic_inc(&nr_saved_mfns); ret = 0; err_out: + put_gfn(cd, cgfn); + put_gfn(sd, sgfn); shr_unlock(); - return ret; } @@ -593,13 +605,9 @@ int mem_sharing_unshare_page(struct doma mfn_t mfn; struct page_info *page, *old_page; void *s, *t; - int ret, last_gfn; - shr_hash_entry_t *hash_entry; - struct gfn_info *gfn_info = NULL; - shr_handle_t handle; + int last_gfn; + gfn_info_t *gfn_info = NULL; struct list_head *le; - - /* Remove the gfn_info from the list */ /* This is one of the reasons why we can''t enforce ordering * between shr_lock and p2m fine-grained locks in mm-lock. @@ -615,56 +623,74 @@ int mem_sharing_unshare_page(struct doma return 0; } - page = mfn_to_page(mfn); - handle = page->shr_handle; - - hash_entry = mem_sharing_hash_lookup(handle); - list_for_each(le, &hash_entry->gfns) + page = mem_sharing_lookup(mfn_x(mfn)); + if ( page == NULL ) { - gfn_info = list_entry(le, struct gfn_info, list); + gdprintk(XENLOG_ERR, "Domain p2m is shared, but page is not: " + "%lx\n", gfn); + BUG(); + } + + list_for_each(le, &page->shared_info->gfns) + { + gfn_info = list_entry(le, gfn_info_t, list); if ( (gfn_info->gfn == gfn) && (gfn_info->domain == d->domain_id) ) goto gfn_found; } gdprintk(XENLOG_ERR, "Could not find gfn_info for shared gfn: " "%lx\n", gfn); BUG(); -gfn_found: - /* Delete gfn_info from the list, but hold on to it, until we''ve allocated - * memory to make a copy */ - list_del(&gfn_info->list); - last_gfn = list_empty(&hash_entry->gfns); + +gfn_found: + /* Do the accounting first. If anything fails below, we have bigger + * bigger fish to fry. First, remove the gfn from the list. */ + last_gfn = list_has_one_entry(&page->shared_info->gfns); + mem_sharing_gfn_destroy(gfn_info, !last_gfn); + if ( !last_gfn ) + { + atomic_dec(&nr_saved_mfns); + /* Update stats if only one gfn is left for this shared page. + * This really isn''t a shared page anymore. */ + if ( list_has_one_entry(&page->shared_info->gfns) ) + { + struct list_head *pos = page->shared_info->gfns.next; + gfn_info_t *gfn_info = list_entry(pos, gfn_info_t, list); + struct domain *d = get_domain_by_id(gfn_info->domain); + BUG_ON(!d); + atomic_dec(&d->shr_pages); + put_domain(d); + } + } else { + /* Clean up shared state */ + audit_del_list(page); + xfree(page->shared_info); + page->shared_info = NULL; + } /* If the GFN is getting destroyed drop the references to MFN * (possibly freeing the page), and exit early */ if ( flags & MEM_SHARING_DESTROY_GFN ) { - mem_sharing_gfn_destroy(gfn_info, !last_gfn); - if(last_gfn) - mem_sharing_hash_delete(handle); - else - /* Even though we don''t allocate a private page, we have to account - * for the MFN that originally backed this PFN. 
*/ - atomic_dec(&nr_saved_mfns); put_gfn(d, gfn); shr_unlock(); put_page_and_type(page); - if(last_gfn && - test_and_clear_bit(_PGC_allocated, &page->count_info)) + if( last_gfn && + test_and_clear_bit(_PGC_allocated, &page->count_info)) put_page(page); + return 0; } - ret = page_make_private(d, page); - BUG_ON(last_gfn & ret); - if(ret == 0) goto private_page_found; + if ( last_gfn ) + { + BUG_ON(page_make_private(d, page) != 0); + goto private_page_found; + } old_page = page; page = mem_sharing_alloc_page(d, gfn); - if(!page) + if ( !page ) { - /* We''ve failed to obtain memory for private page. Need to re-add the - * gfn_info to relevant list */ - list_add(&gfn_info->list, &hash_entry->gfns); put_gfn(d, gfn); shr_unlock(); return -ENOMEM; @@ -676,30 +702,18 @@ gfn_found: unmap_domain_page(s); unmap_domain_page(t); - /* NOTE: set_shared_p2m_entry will switch the underlying mfn. If - * we do get_page withing get_gfn, the correct sequence here - * should be - get_page(page); - put_page(old_page); - * so that the ref to the old page is dropped, and a ref to - * the new page is obtained to later be dropped in put_gfn */ BUG_ON(set_shared_p2m_entry(d, gfn, page_to_mfn(page)) == 0); put_page_and_type(old_page); private_page_found: - /* We''ve got a private page, we can commit the gfn destruction */ - mem_sharing_gfn_destroy(gfn_info, !last_gfn); - if(last_gfn) - mem_sharing_hash_delete(handle); - else - atomic_dec(&nr_saved_mfns); - if ( p2m_change_type(d, gfn, p2m_ram_shared, p2m_ram_rw) != p2m_ram_shared ) { - printk("Could not change p2m type.\n"); + gdprintk(XENLOG_ERR, "Could not change p2m type d %hu gfn %lx.\n", + d->domain_id, gfn); BUG(); } + /* Update m2p entry */ set_gpfn_from_mfn(mfn_x(page_to_mfn(page)), gfn); @@ -756,9 +770,18 @@ int mem_sharing_domctl(struct domain *d, case XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE: { - shr_handle_t sh = mec->u.share.source_handle; - shr_handle_t ch = mec->u.share.client_handle; - rc = mem_sharing_share_pages(sh, ch); + unsigned long sgfn = mec->u.share.source_gfn; + shr_handle_t sh = mec->u.share.source_handle; + struct domain *cd = get_domain_by_id(mec->u.share.client_domain); + if ( cd ) + { + unsigned long cgfn = mec->u.share.client_gfn; + shr_handle_t ch = mec->u.share.client_handle; + rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch); + put_domain(cd); + } + else + return -EEXIST; } break; @@ -806,6 +829,9 @@ int mem_sharing_domctl(struct domain *d, void __init mem_sharing_init(void) { printk("Initing memory sharing.\n"); - mem_sharing_hash_init(); + mm_lock_init(&shr_lock); +#if MEM_SHARING_AUDIT + INIT_LIST_HEAD(&shr_audit_list); +#endif } diff -r d29cba447de1 -r 4862cca21d3c xen/include/asm-x86/mem_sharing.h --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -22,13 +22,28 @@ #ifndef __MEM_SHARING_H__ #define __MEM_SHARING_H__ +#include <public/domctl.h> + +/* Auditing of memory sharing code? */ +#define MEM_SHARING_AUDIT 0 + +typedef uint64_t shr_handle_t; + +struct page_sharing_info +{ + struct page_info *pg; /* Back pointer to the page. */ + shr_handle_t handle; /* Globally unique version / handle. */ +#if MEM_SHARING_AUDIT + struct list_head entry; /* List of all shared pages (entry). */ +#endif + struct list_head gfns; /* List of domains and gfns for this page (head). 
*/ +}; + #ifdef __x86_64__ #define sharing_supported(_d) \ (is_hvm_domain(_d) && paging_mode_hap(_d)) -typedef uint64_t shr_handle_t; - unsigned int mem_sharing_get_nr_saved_mfns(void); int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, diff -r d29cba447de1 -r 4862cca21d3c xen/include/asm-x86/mm.h --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -31,6 +31,8 @@ struct page_list_entry __pdx_t next, prev; }; +struct page_sharing_info; + struct page_info { union { @@ -49,8 +51,13 @@ struct page_info /* For non-pinnable single-page shadows, a higher entry that points * at us. */ paddr_t up; - /* For shared/sharable pages the sharing handle */ - uint64_t shr_handle; + /* For shared/sharable pages, we use a doubly-linked list + * of all the {pfn,domain} pairs that map this page. We also include + * an opaque handle, which is effectively a version, so that clients + * of sharing share the version they expect to. + * This list is allocated and freed when a page is shared/unshared. + */ + struct page_sharing_info *shared_info; }; /* Reference count and various PGC_xxx flags and fields. */ diff -r d29cba447de1 -r 4862cca21d3c xen/include/public/domctl.h --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -789,7 +789,10 @@ struct xen_domctl_mem_sharing_op { uint64_aligned_t handle; /* OUT: the handle */ } nominate; struct mem_sharing_op_share { /* OP_SHARE */ + uint64_aligned_t source_gfn; /* IN: the gfn of the source page */ uint64_aligned_t source_handle; /* IN: handle to the source page */ + uint64_aligned_t client_domain; /* IN: the client domain id */ + uint64_aligned_t client_gfn; /* IN: the client gfn */ uint64_aligned_t client_handle; /* IN: handle to the client page */ } share; struct mem_sharing_op_debug { /* OP_DEBUG_xxx */
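To make the new model concrete: a shared frame is now recognized purely by its
owner and type, and its backing (gfn,domain) tuples are reached through
page->shared_info, so no global index is needed. A condensed helper showing the
lookup path (the types are the ones this patch adds; the helper itself is
illustrative, not part of the series):

/* Sketch: find the gfn_info backing (d, gfn) on a shared frame. */
static gfn_info_t *find_backing_gfn(unsigned long mfn, struct domain *d,
                                    unsigned long gfn)
{
    struct page_info *page;
    struct list_head *le;

    if ( !mfn_valid(_mfn(mfn)) )
        return NULL;
    page = mfn_to_page(_mfn(mfn));
    if ( page_get_owner(page) != dom_cow )   /* not a shared frame */
        return NULL;

    /* An O(number-of-backings) walk replaces the old hash probe. */
    list_for_each(le, &page->shared_info->gfns)
    {
        gfn_info_t *g = list_entry(le, gfn_info_t, list);
        if ( (g->gfn == gfn) && (g->domain == d->domain_id) )
            return g;
    }
    return NULL;
}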
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 3 of 9] x86/mm: Update mem sharing interface to (re)allow sharing of grants
 xen/arch/x86/mm/mem_sharing.c |  57 ++++++++++++++++++++++++++++++++++++------
 xen/include/public/domctl.h   |   9 ++++++
 2 files changed, 57 insertions(+), 9 deletions(-)

Previously, the mem sharing code would return an opaque handle to index shared
pages (and nominees) in its global hash table. With the hash table removed, the
new interface requires a gfn and a version. However, when sharing grants, the
caller only has a grant ref and a version. Update the interface to handle this
case.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 4862cca21d3c -r 4a189125d71e xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -770,18 +770,57 @@ int mem_sharing_domctl(struct domain *d,
 
         case XEN_DOMCTL_MEM_EVENT_OP_SHARING_SHARE:
         {
-            unsigned long sgfn = mec->u.share.source_gfn;
-            shr_handle_t sh = mec->u.share.source_handle;
-            struct domain *cd = get_domain_by_id(mec->u.share.client_domain);
-            if ( cd )
+            unsigned long sgfn, cgfn;
+            struct domain *cd;
+            shr_handle_t sh, ch;
+
+            if ( !mem_sharing_enabled(d) )
+                return -EINVAL;
+
+            cd = get_domain_by_id(mec->u.share.client_domain);
+            if ( !cd )
+                return -ESRCH;
+
+            if ( !mem_sharing_enabled(cd) )
             {
-                unsigned long cgfn = mec->u.share.client_gfn;
-                shr_handle_t ch = mec->u.share.client_handle;
-                rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch);
                 put_domain(cd);
+                return -EINVAL;
             }
-            else
-                return -EEXIST;
+
+            if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.source_gfn) )
+            {
+                grant_ref_t gref = (grant_ref_t)
+                                    (XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(
+                                        mec->u.share.source_gfn));
+                if ( mem_sharing_gref_to_gfn(d, gref, &sgfn) < 0 )
+                {
+                    put_domain(cd);
+                    return -EINVAL;
+                }
+            } else {
+                sgfn = mec->u.share.source_gfn;
+            }
+
+            if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.client_gfn) )
+            {
+                grant_ref_t gref = (grant_ref_t)
+                                    (XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(
+                                        mec->u.share.client_gfn));
+                if ( mem_sharing_gref_to_gfn(cd, gref, &cgfn) < 0 )
+                {
+                    put_domain(cd);
+                    return -EINVAL;
+                }
+            } else {
+                cgfn = mec->u.share.client_gfn;
+            }
+
+            sh = mec->u.share.source_handle;
+            ch = mec->u.share.client_handle;
+
+            rc = mem_sharing_share_pages(d, sgfn, sh, cd, cgfn, ch);
+
+            put_domain(cd);
         }
         break;

diff -r 4862cca21d3c -r 4a189125d71e xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -775,6 +775,15 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_e
 #define XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID  (-10)
 #define XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID  (-9)
 
+#define XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG   (1ULL << 62)
+
+#define XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(field, val)  \
+    (field) = (XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG | val)
+#define XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(field)         \
+    ((field) & XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG)
+#define XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(field)        \
+    ((field) & (~XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF_FLAG))
+
 struct xen_domctl_mem_sharing_op {
     uint8_t op; /* XEN_DOMCTL_MEM_EVENT_OP_* */
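To illustrate the encoding this patch adopts: bit 62 of the 64-bit gfn fields
marks the value as a grant reference rather than a gfn. A small sketch using
the macros added above (the wrapper function itself is hypothetical):

/* Sketch: store a grant ref in the source_gfn field and decode it back. */
#include <xen/lib.h>
#include <public/domctl.h>
#include <public/grant_table.h>

static void fill_share_source_from_gref(struct xen_domctl_mem_sharing_op *mec,
                                        grant_ref_t gref)
{
    /* Stores (1ULL << 62) | gref in source_gfn. */
    XEN_DOMCTL_MEM_SHARING_FIELD_MAKE_GREF(mec->u.share.source_gfn, gref);

    /* The receiving end distinguishes the two cases and decodes: */
    if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.source_gfn) )
    {
        grant_ref_t decoded = (grant_ref_t)
            XEN_DOMCTL_MEM_SHARING_FIELD_GET_GREF(mec->u.share.source_gfn);
        ASSERT(decoded == gref);   /* masking off bit 62 round-trips */
    }
}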
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 4 of 9] x86/mm: Check how many mfns are shared, in addition to how many are saved
xen/arch/x86/mm.c | 6 ------ xen/arch/x86/mm/mem_sharing.c | 29 +++++++++++++++++++++++++++++ xen/arch/x86/x86_64/compat/mm.c | 6 ++++++ xen/arch/x86/x86_64/mm.c | 7 +++++++ xen/include/asm-x86/mem_sharing.h | 1 + xen/include/public/memory.h | 1 + 6 files changed, 44 insertions(+), 6 deletions(-) This patch also moves the existing sharing-related memory op to the correct location, and adds logic to the audit() method that uses the new information. This patch only provides the Xen implementation of the domctls. Signed-off-by: Andres Lagar-Cavilla <andres@scannell.ca> Signed-off-by: Adin Scannell <adin@scannell.ca> diff -r 4a189125d71e -r dbfabd1bbfb1 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -119,7 +119,6 @@ #include <xen/trace.h> #include <asm/setup.h> #include <asm/fixmap.h> -#include <asm/mem_sharing.h> /* * Mapping of first 2 or 4 megabytes of memory. This is mapped with 4kB @@ -5093,11 +5092,6 @@ long arch_memory_op(int op, XEN_GUEST_HA return rc; } -#ifdef __x86_64__ - case XENMEM_get_sharing_freed_pages: - return mem_sharing_get_nr_saved_mfns(); -#endif - default: return subarch_memory_op(op, arg); } diff -r 4a189125d71e -r dbfabd1bbfb1 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -72,6 +72,7 @@ static inline int mem_sharing_audit(void static shr_handle_t next_handle = 1; static atomic_t nr_saved_mfns = ATOMIC_INIT(0); +static atomic_t nr_shared_mfns = ATOMIC_INIT(0); typedef struct gfn_info { @@ -150,9 +151,12 @@ static struct page_info* mem_sharing_loo static int mem_sharing_audit(void) { int errors = 0; + unsigned long count_expected; + unsigned long count_found = 0; struct list_head *ae; ASSERT(shr_locked_by_me()); + count_expected = atomic_read(&nr_shared_mfns); list_for_each(ae, &shr_audit_list) { @@ -200,6 +204,10 @@ static int mem_sharing_audit(void) continue; } + /* Only increase our expected count for pages that are actually shared */ + if ( !list_has_one_entry(&pg->shared_info->gfns) ) + count_found++; + /* Check if all GFNs map to the MFN, and the p2m types */ list_for_each(le, &pg->shared_info->gfns) { @@ -245,6 +253,13 @@ static int mem_sharing_audit(void) } } + if ( count_found != count_expected ) + { + MEM_SHARING_DEBUG("Expected %ld shared mfns, found %ld.", + count_expected, count_found); + errors++; + } + return errors; } #endif @@ -294,6 +309,11 @@ unsigned int mem_sharing_get_nr_saved_mf return ((unsigned int)atomic_read(&nr_saved_mfns)); } +unsigned int mem_sharing_get_nr_shared_mfns(void) +{ + return (unsigned int)atomic_read(&nr_shared_mfns); +} + int mem_sharing_sharing_resume(struct domain *d) { mem_event_response_t rsp; @@ -577,6 +597,14 @@ int mem_sharing_share_pages(struct domai ASSERT(list_empty(&cpage->shared_info->gfns)); if ( single_source_gfn ) atomic_inc(&sd->shr_pages); + /* We only increase the number of shared pages when + * sharing for the first time */ + if ( single_source_gfn && single_client_gfn ) + atomic_inc(&nr_shared_mfns); + /* And we decrease it when re-sharing already shared + * (because one of them becomes a "saved" page). 
*/ + if ( !single_source_gfn && !single_client_gfn ) + atomic_dec(&nr_shared_mfns); /* Clear the rest of the shared state */ audit_del_list(cpage); @@ -659,6 +687,7 @@ gfn_found: BUG_ON(!d); atomic_dec(&d->shr_pages); put_domain(d); + atomic_dec(&nr_shared_mfns); } } else { /* Clean up shared state */ diff -r 4a189125d71e -r dbfabd1bbfb1 xen/arch/x86/x86_64/compat/mm.c --- a/xen/arch/x86/x86_64/compat/mm.c +++ b/xen/arch/x86/x86_64/compat/mm.c @@ -205,6 +205,12 @@ int compat_arch_memory_op(int op, XEN_GU break; } + case XENMEM_get_sharing_freed_pages: + return mem_sharing_get_nr_saved_mfns(); + + case XENMEM_get_sharing_shared_pages: + return mem_sharing_get_nr_shared_mfns(); + default: rc = -ENOSYS; break; diff -r 4a189125d71e -r dbfabd1bbfb1 xen/arch/x86/x86_64/mm.c --- a/xen/arch/x86/x86_64/mm.c +++ b/xen/arch/x86/x86_64/mm.c @@ -34,6 +34,7 @@ #include <asm/msr.h> #include <asm/setup.h> #include <asm/numa.h> +#include <asm/mem_sharing.h> #include <public/memory.h> /* Parameters for PFN/MADDR compression. */ @@ -1090,6 +1091,12 @@ long subarch_memory_op(int op, XEN_GUEST break; + case XENMEM_get_sharing_freed_pages: + return mem_sharing_get_nr_saved_mfns(); + + case XENMEM_get_sharing_shared_pages: + return mem_sharing_get_nr_shared_mfns(); + default: rc = -ENOSYS; break; diff -r 4a189125d71e -r dbfabd1bbfb1 xen/include/asm-x86/mem_sharing.h --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -45,6 +45,7 @@ struct page_sharing_info (is_hvm_domain(_d) && paging_mode_hap(_d)) unsigned int mem_sharing_get_nr_saved_mfns(void); +unsigned int mem_sharing_get_nr_shared_mfns(void); int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, int expected_refcnt, diff -r 4a189125d71e -r dbfabd1bbfb1 xen/include/public/memory.h --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -294,6 +294,7 @@ typedef struct xen_pod_target xen_pod_ta * The call never fails. */ #define XENMEM_get_sharing_freed_pages 18 +#define XENMEM_get_sharing_shared_pages 19 #endif /* __XEN_PUBLIC_MEMORY_H__ */
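The relationship between the two counters, worked through on an example. This
is standalone arithmetic under the series' accounting rules, not hypervisor
code: a frame counts toward nr_shared_mfns once it backs more than one gfn, and
each gfn beyond the first on a frame counts toward nr_saved_mfns, i.e. one
frame of memory saved.

/* Hypothetical demo of the shared/saved accounting invariant. */
#include <stdio.h>

int main(void)
{
    /* Suppose 3 frames back 2, 5 and 1 gfns respectively. */
    unsigned int gfns_per_frame[] = { 2, 5, 1 };
    unsigned int shared = 0, saved = 0;

    for ( unsigned int i = 0; i < 3; i++ )
    {
        if ( gfns_per_frame[i] > 1 )
        {
            shared += 1;                      /* nr_shared_mfns */
            saved  += gfns_per_frame[i] - 1;  /* nr_saved_mfns */
        }
    }
    printf("shared=%u saved=%u\n", shared, saved);  /* shared=2 saved=5 */
    return 0;
}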
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 5 of 9] x86/mm: Add per-page locking for memory sharing, when audits are disabled
xen/arch/x86/mm.c | 74 +---------- xen/arch/x86/mm/mem_sharing.c | 280 ++++++++++++++++++++++++++++++++++++++--- xen/arch/x86/mm/mm-locks.h | 31 ++++- xen/include/asm-x86/mm.h | 28 +++- 4 files changed, 310 insertions(+), 103 deletions(-) With the removal of the hash table, all that is needed now is locking of individual shared pages, as new (gfn,domain) pairs are removed or added from the list of mappings. We recycle PGT_locked and use it to lock individual pages. We ensure deadlock is averted by locking pages in increasing order. We leverage an "external" order enforcing construct from mm-locks.h to ensure the p2m_lock (needed to modify gfn entries during sharing) is always taken after page_locks. The global lock remains for the benefit of the auditing code, and is thus enabled only as a compile-time option. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> diff -r dbfabd1bbfb1 -r bfebf49b3eb6 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -1718,7 +1718,7 @@ static int free_l4_table(struct page_inf #define free_l4_table(page, preemptible) (-EINVAL) #endif -static int page_lock(struct page_info *page) +int page_lock(struct page_info *page) { unsigned long x, nx; @@ -1735,7 +1735,7 @@ static int page_lock(struct page_info *p return 1; } -static void page_unlock(struct page_info *page) +void page_unlock(struct page_info *page) { unsigned long x, nx, y = page->u.inuse.type_info; @@ -4299,76 +4299,6 @@ int steal_page( return -1; } -int page_make_sharable(struct domain *d, - struct page_info *page, - int expected_refcnt) -{ - spin_lock(&d->page_alloc_lock); - - /* Change page type and count atomically */ - if ( !get_page_and_type(page, d, PGT_shared_page) ) - { - spin_unlock(&d->page_alloc_lock); - return -EINVAL; - } - - /* Check it wasn''t already sharable and undo if it was */ - if ( (page->u.inuse.type_info & PGT_count_mask) != 1 ) - { - put_page_and_type(page); - spin_unlock(&d->page_alloc_lock); - return -EEXIST; - } - - /* Check if the ref count is 2. 
The first from PGC_allocated, and - * the second from get_page_and_type at the top of this function */ - if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) ) - { - /* Return type count back to zero */ - put_page_and_type(page); - spin_unlock(&d->page_alloc_lock); - return -E2BIG; - } - - page_set_owner(page, dom_cow); - d->tot_pages--; - page_list_del(page, &d->page_list); - spin_unlock(&d->page_alloc_lock); - return 0; -} - -int page_make_private(struct domain *d, struct page_info *page) -{ - if ( !get_page(page, dom_cow) ) - return -EINVAL; - - spin_lock(&d->page_alloc_lock); - - /* We can only change the type if count is one */ - if ( (page->u.inuse.type_info & (PGT_type_mask | PGT_count_mask)) - != (PGT_shared_page | 1) ) - { - put_page(page); - spin_unlock(&d->page_alloc_lock); - return -EEXIST; - } - - /* Drop the final typecount */ - put_page_and_type(page); - - /* Change the owner */ - ASSERT(page_get_owner(page) == dom_cow); - page_set_owner(page, d); - - d->tot_pages++; - page_list_add_tail(page, &d->page_list); - spin_unlock(&d->page_alloc_lock); - - put_page(page); - - return 0; -} - static int __do_update_va_mapping( unsigned long va, u64 val64, unsigned long flags, struct domain *pg_owner) { diff -r dbfabd1bbfb1 -r bfebf49b3eb6 xen/arch/x86/mm/mem_sharing.c --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -35,10 +35,29 @@ #include "mm-locks.h" +static shr_handle_t next_handle = 1; + +typedef struct pg_lock_data { + int mm_unlock_level; + unsigned short recurse_count; +} pg_lock_data_t; + +#define DECLARE_PG_LOCK_DATA(name) \ + pg_lock_data_t name = { 0, 0 }; + #if MEM_SHARING_AUDIT + +static mm_lock_t shr_lock; + +#define shr_lock() _shr_lock() +#define shr_unlock() _shr_unlock() +#define shr_locked_by_me() _shr_locked_by_me() + static int mem_sharing_audit(void); + #define MEM_SHARING_DEBUG(_f, _a...) 
\ debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a) + static struct list_head shr_audit_list; static inline void audit_add_list(struct page_info *page) @@ -50,7 +69,22 @@ static inline void audit_del_list(struct { list_del(&page->shared_info->entry); } + +static inline int mem_sharing_page_lock(struct page_info *p, + pg_lock_data_t *l) +{ + return 1; +} +#define mem_sharing_page_unlock(p, l) ((void)0) + +#define get_next_handle() next_handle++; #else + +#define shr_lock() ((void)0) +#define shr_unlock() ((void)0) +/* Only used inside audit code */ +//#define shr_locked_by_me() ((void)0) + static inline int mem_sharing_audit(void) { return 0; @@ -58,6 +92,41 @@ static inline int mem_sharing_audit(void #define audit_add_list(p) ((void)0) #define audit_del_list(p) ((void)0) + +static inline int mem_sharing_page_lock(struct page_info *pg, + pg_lock_data_t *pld) +{ + int rc; + page_sharing_mm_pre_lock(); + rc = page_lock(pg); + if ( rc ) + { + preempt_disable(); + page_sharing_mm_post_lock(&pld->mm_unlock_level, + &pld->recurse_count); + } + return rc; +} + +static inline void mem_sharing_page_unlock(struct page_info *pg, + pg_lock_data_t *pld) +{ + page_sharing_mm_unlock(pld->mm_unlock_level, + &pld->recurse_count); + preempt_enable(); + page_unlock(pg); +} + +static inline shr_handle_t get_next_handle(void) +{ + /* Get the next handle get_page style */ + uint64_t x, y = next_handle; + do { + x = y; + } + while ( (y = cmpxchg(&next_handle, x, x + 1)) != x ); + return x + 1; +} #endif /* MEM_SHARING_AUDIT */ #define mem_sharing_enabled(d) \ @@ -70,7 +139,6 @@ static inline int mem_sharing_audit(void #undef page_to_mfn #define page_to_mfn(_pg) _mfn(__page_to_mfn(_pg)) -static shr_handle_t next_handle = 1; static atomic_t nr_saved_mfns = ATOMIC_INIT(0); static atomic_t nr_shared_mfns = ATOMIC_INIT(0); @@ -81,8 +149,6 @@ typedef struct gfn_info struct list_head list; } gfn_info_t; -static mm_lock_t shr_lock; - /* Returns true if list has only one entry. O(1) complexity. */ static inline int list_has_one_entry(struct list_head *head) { @@ -436,6 +502,117 @@ int mem_sharing_debug_gref(struct domain return mem_sharing_debug_gfn(d, gfn); } +/* Functions that change a page''s type and ownership */ +static int page_make_sharable(struct domain *d, + struct page_info *page, + int expected_refcnt) +{ + spin_lock(&d->page_alloc_lock); + + /* Change page type and count atomically */ + if ( !get_page_and_type(page, d, PGT_shared_page) ) + { + spin_unlock(&d->page_alloc_lock); + return -EINVAL; + } + + /* Check it wasn''t already sharable and undo if it was */ + if ( (page->u.inuse.type_info & PGT_count_mask) != 1 ) + { + put_page_and_type(page); + spin_unlock(&d->page_alloc_lock); + return -EEXIST; + } + + /* Check if the ref count is 2. 
The first from PGC_allocated, and + * the second from get_page_and_type at the top of this function */ + if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) ) + { + /* Return type count back to zero */ + put_page_and_type(page); + spin_unlock(&d->page_alloc_lock); + return -E2BIG; + } + + page_set_owner(page, dom_cow); + d->tot_pages--; + page_list_del(page, &d->page_list); + spin_unlock(&d->page_alloc_lock); + return 0; +} + +static int page_make_private(struct domain *d, struct page_info *page, + pg_lock_data_t *pld) +{ + unsigned long expected_type; + + if ( !get_page(page, dom_cow) ) + return -EINVAL; + + spin_lock(&d->page_alloc_lock); + + /* We can only change the type if count is one */ + /* If we are locking pages individually, then we need to drop + * the lock here, while the page is typed. We cannot risk the + * race of page_unlock and then put_page_type. */ +#if MEM_SHARING_AUDIT + expected_type = (PGT_shared_page | PGT_validated | 1); +#else + expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2); +#endif + if ( page->u.inuse.type_info != expected_type ) + { + put_page(page); + spin_unlock(&d->page_alloc_lock); + return -EEXIST; + } + + /* Drop the final typecount */ + put_page_and_type(page); + +#if MEM_SHARING_AUDIT + (void)pld; +#else + /* Now that we''ve dropped the type, we can unlock */ + mem_sharing_page_unlock(page, pld); +#endif + + /* Change the owner */ + ASSERT(page_get_owner(page) == dom_cow); + page_set_owner(page, d); + + d->tot_pages++; + page_list_add_tail(page, &d->page_list); + spin_unlock(&d->page_alloc_lock); + + put_page(page); + + return 0; +} + +static inline struct page_info *__grab_shared_page(mfn_t mfn, + pg_lock_data_t *pld) +{ + struct page_info *pg = NULL; + + if ( !mfn_valid(mfn) ) + return NULL; + pg = mfn_to_page(mfn); + + /* If the page is not validated we can''t lock it, and if it''s + * not validated it''s obviously not shared. */ + if ( !mem_sharing_page_lock(pg, pld) ) + return NULL; + + if ( mem_sharing_lookup(mfn_x(mfn)) == NULL ) + { + mem_sharing_page_unlock(pg, pld); + return NULL; + } + + return pg; +} + int mem_sharing_nominate_page(struct domain *d, unsigned long gfn, int expected_refcnt, @@ -443,9 +620,10 @@ int mem_sharing_nominate_page(struct dom { p2m_type_t p2mt; mfn_t mfn; - struct page_info *page; + struct page_info *page = NULL; /* gcc... */ int ret; struct gfn_info *gfn_info; + DECLARE_PG_LOCK_DATA(pld); *phandle = 0UL; @@ -458,10 +636,17 @@ int mem_sharing_nominate_page(struct dom goto out; /* Return the handle if the page is already shared */ - page = mfn_to_page(mfn); if ( p2m_is_shared(p2mt) ) { - *phandle = page->shared_info->handle; + struct page_info *pg = __grab_shared_page(mfn, &pld); + if ( !pg ) + { + gdprintk(XENLOG_ERR, "Shared p2m entry gfn %lx, but could not " + "grab page %lx dom %d\n", gfn, mfn_x(mfn), d->domain_id); + BUG(); + } + *phandle = pg->shared_info->handle; ret = 0; + mem_sharing_page_unlock(pg, &pld); goto out; } @@ -470,16 +655,25 @@ int mem_sharing_nominate_page(struct dom goto out; /* Try to convert the mfn to the sharable type */ + page = mfn_to_page(mfn); ret = page_make_sharable(d, page, expected_refcnt); if ( ret ) goto out; + /* Now that the page is validated, we can lock it. 
There is no + * race because we''re holding the p2m entry, so no one else + * could be nominating this gfn */ + ret = -ENOENT; + if ( !mem_sharing_page_lock(page, &pld) ) + goto out; + /* Initialize the shared state */ ret = -ENOMEM; if ( (page->shared_info = xmalloc(struct page_sharing_info)) == NULL ) { - BUG_ON(page_make_private(d, page) != 0); + /* Making a page private atomically unlocks it */ + BUG_ON(page_make_private(d, page, &pld) != 0); goto out; } page->shared_info->pg = page; @@ -489,14 +683,14 @@ int mem_sharing_nominate_page(struct dom #endif /* Create the handle */ - page->shared_info->handle = next_handle++; + page->shared_info->handle = get_next_handle(); /* Create the local gfn info */ if ( (gfn_info = mem_sharing_gfn_alloc(page, d, gfn)) == NULL ) { xfree(page->shared_info); page->shared_info = NULL; - BUG_ON(page_make_private(d, page) != 0); + BUG_ON(page_make_private(d, page, &pld) != 0); goto out; } @@ -511,7 +705,7 @@ int mem_sharing_nominate_page(struct dom xfree(page->shared_info); page->shared_info = NULL; /* NOTE: We haven''t yet added this to the audit list. */ - BUG_ON(page_make_private(d, page) != 0); + BUG_ON(page_make_private(d, page, &pld) != 0); goto out; } @@ -520,6 +714,7 @@ int mem_sharing_nominate_page(struct dom *phandle = page->shared_info->handle; audit_add_list(page); + mem_sharing_page_unlock(page, &pld); ret = 0; out: @@ -531,13 +726,14 @@ out: int mem_sharing_share_pages(struct domain *sd, unsigned long sgfn, shr_handle_t sh, struct domain *cd, unsigned long cgfn, shr_handle_t ch) { - struct page_info *spage, *cpage; + struct page_info *spage, *cpage, *firstpg, *secondpg; struct list_head *le, *te; gfn_info_t *gfn; struct domain *d; int ret = -EINVAL, single_client_gfn, single_source_gfn; mfn_t smfn, cmfn; p2m_type_t smfn_type, cmfn_type; + DECLARE_PG_LOCK_DATA(pld); shr_lock(); @@ -546,26 +742,53 @@ int mem_sharing_share_pages(struct domai smfn = get_gfn(sd, sgfn, &smfn_type); cmfn = get_gfn(cd, cgfn, &cmfn_type); - ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; - spage = mem_sharing_lookup(mfn_x(smfn)); - if ( spage == NULL ) - goto err_out; + /* This tricky business is to avoid two callers deadlocking if + * grabbing pages in opposite client/source order */ + if ( mfn_x(smfn) < mfn_x(cmfn) ) + { + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + spage = firstpg = __grab_shared_page(smfn, &pld); + if ( spage == NULL ) + goto err_out; + + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + cpage = secondpg = __grab_shared_page(cmfn, &pld); + if ( cpage == NULL ) + { + mem_sharing_page_unlock(spage, &pld); + goto err_out; + } + } else { + ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + cpage = firstpg = __grab_shared_page(cmfn, &pld); + if ( cpage == NULL ) + goto err_out; + + ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + spage = secondpg = __grab_shared_page(smfn, &pld); + if ( spage == NULL ) + { + mem_sharing_page_unlock(cpage, &pld); + goto err_out; + } + } + ASSERT(smfn_type == p2m_ram_shared); - ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; - cpage = mem_sharing_lookup(mfn_x(cmfn)); - if ( cpage == NULL ) - goto err_out; ASSERT(cmfn_type == p2m_ram_shared); /* Check that the handles match */ if ( spage->shared_info->handle != sh ) { ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID; + mem_sharing_page_unlock(secondpg, &pld); + mem_sharing_page_unlock(firstpg, &pld); goto err_out; } if ( cpage->shared_info->handle != ch ) { ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID; + mem_sharing_page_unlock(secondpg, &pld); + 
+        mem_sharing_page_unlock(firstpg, &pld);
         goto err_out;
     }
 
@@ -611,6 +834,9 @@ int mem_sharing_share_pages(struct domai
     xfree(cpage->shared_info);
     cpage->shared_info = NULL;
 
+    mem_sharing_page_unlock(secondpg, &pld);
+    mem_sharing_page_unlock(firstpg, &pld);
+
     /* Free the client page */
     if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
         put_page(cpage);
@@ -636,6 +862,7 @@ int mem_sharing_unshare_page(struct doma
     int last_gfn;
     gfn_info_t *gfn_info = NULL;
     struct list_head *le;
+    DECLARE_PG_LOCK_DATA(pld);
 
     /* This is one of the reasons why we can't enforce ordering
      * between shr_lock and p2m fine-grained locks in mm-lock.
@@ -651,7 +878,7 @@ int mem_sharing_unshare_page(struct doma
         return 0;
     }
 
-    page = mem_sharing_lookup(mfn_x(mfn));
+    page = __grab_shared_page(mfn, &pld);
     if ( page == NULL )
     {
         gdprintk(XENLOG_ERR, "Domain p2m is shared, but page is not: "
@@ -700,19 +927,21 @@ gfn_found:
      * (possibly freeing the page), and exit early */
     if ( flags & MEM_SHARING_DESTROY_GFN )
     {
-        put_gfn(d, gfn);
-        shr_unlock();
         put_page_and_type(page);
+        mem_sharing_page_unlock(page, &pld);
         if( last_gfn &&
             test_and_clear_bit(_PGC_allocated, &page->count_info))
             put_page(page);
+        put_gfn(d, gfn);
+        shr_unlock();
         return 0;
     }
 
     if ( last_gfn )
     {
-        BUG_ON(page_make_private(d, page) != 0);
+        /* Making a page private atomically unlocks it */
+        BUG_ON(page_make_private(d, page, &pld) != 0);
         goto private_page_found;
     }
 
@@ -720,6 +949,7 @@ gfn_found:
     page = mem_sharing_alloc_page(d, gfn);
     if ( !page )
     {
+        mem_sharing_page_unlock(old_page, &pld);
         put_gfn(d, gfn);
         shr_unlock();
         return -ENOMEM;
@@ -732,6 +962,7 @@ gfn_found:
     unmap_domain_page(t);
 
     BUG_ON(set_shared_p2m_entry(d, gfn, page_to_mfn(page)) == 0);
+    mem_sharing_page_unlock(old_page, &pld);
     put_page_and_type(old_page);
 
 private_page_found:
@@ -749,6 +980,7 @@ private_page_found:
     /* Now that the gfn<->mfn map is properly established,
      * marking dirty is feasible */
     paging_mark_dirty(d, mfn_x(page_to_mfn(page)));
+    /* We do not need to unlock a private page */
     put_gfn(d, gfn);
     shr_unlock();
     return 0;
@@ -897,8 +1129,8 @@ int mem_sharing_domctl(struct domain *d,
 void __init mem_sharing_init(void)
 {
     printk("Initing memory sharing.\n");
+#if MEM_SHARING_AUDIT
     mm_lock_init(&shr_lock);
-#if MEM_SHARING_AUDIT
     INIT_LIST_HEAD(&shr_audit_list);
 #endif
 }

diff -r dbfabd1bbfb1 -r bfebf49b3eb6 xen/arch/x86/mm/mm-locks.h
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -26,6 +26,8 @@
 #ifndef _MM_LOCKS_H
 #define _MM_LOCKS_H
 
+#include <asm/mem_sharing.h>
+
 /* Per-CPU variable for enforcing the lock ordering */
 DECLARE_PER_CPU(int, mm_lock_level);
 #define __get_lock_level()  (this_cpu(mm_lock_level))
@@ -141,15 +143,38 @@ static inline void mm_enforce_order_unlo
  *                                                                      *
  ************************************************************************/
 
+#if MEM_SHARING_AUDIT
 /* Page-sharing lock (global)
  *
  * A single global lock that protects the memory-sharing code's
  * hash tables. */
 
 declare_mm_lock(shr)
-#define shr_lock()         mm_lock(shr, &shr_lock)
-#define shr_unlock()       mm_unlock(&shr_lock)
-#define shr_locked_by_me() mm_locked_by_me(&shr_lock)
+#define _shr_lock()         mm_lock(shr, &shr_lock)
+#define _shr_unlock()       mm_unlock(&shr_lock)
+#define _shr_locked_by_me() mm_locked_by_me(&shr_lock)
+
+#else
+
+/* Sharing per page lock
+ *
+ * This is an external lock, not represented by an mm_lock_t. The memory
+ * sharing lock uses it to protect addition and removal of (gfn,domain)
+ * tuples to a shared page. We enforce order here against the p2m lock,
+ * which is taken after the page_lock to change the gfn's p2m entry.
+ *
+ * Note that in sharing audit mode, we use the global page lock above,
+ * instead.
+ *
+ * The lock is recursive because during share we lock two pages. */
+
+declare_mm_order_constraint(per_page_sharing)
+#define page_sharing_mm_pre_lock()   mm_enforce_order_lock_pre_per_page_sharing()
+#define page_sharing_mm_post_lock(l, r) \
+        mm_enforce_order_lock_post_per_page_sharing((l), (r))
+#define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r))
+
+#endif /* MEM_SHARING_AUDIT */
 
 /* Nested P2M lock (per-domain)
  *

diff -r dbfabd1bbfb1 -r bfebf49b3eb6 xen/include/asm-x86/mm.h
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -337,6 +337,30 @@ int is_iomem_page(unsigned long mfn);
 
 void clear_superpage_mark(struct page_info *page);
 
+/* Per page locks:
+ * page_lock() is used for two purposes: pte serialization, and memory sharing.
+ *
+ * All users of page lock for pte serialization live in mm.c, use it
+ * to lock a page table page during pte updates, do not take other locks within
+ * the critical section delimited by page_lock/unlock, and perform no
+ * nesting.
+ *
+ * All users of page lock for memory sharing live in mm/mem_sharing.c. Page_lock
+ * is used in memory sharing to protect addition (share) and removal (unshare)
+ * of (gfn,domain) tuples to a list of gfn's that the shared page is currently
+ * backing. Nesting may happen when sharing (and locking) two pages -- deadlock
+ * is avoided by locking pages in increasing order.
+ * Memory sharing may take the p2m_lock within a page_lock/unlock
+ * critical section. We enforce ordering between page_lock and p2m_lock using an
+ * mm-locks.h construct.
+ *
+ * These two users (pte serialization and memory sharing) do not collide, since
+ * sharing is only supported for hvm guests, which do not perform pv pte updates.
+ */
+int page_lock(struct page_info *page);
+void page_unlock(struct page_info *page);
+
 struct domain *page_get_owner_and_reference(struct page_info *page);
 void put_page(struct page_info *page);
 int  get_page(struct page_info *page, struct domain *domain);
@@ -588,10 +612,6 @@ int steal_page(
     struct domain *d, struct page_info *page, unsigned int memflags);
 int donate_page(
     struct domain *d, struct page_info *page, unsigned int memflags);
-int page_make_sharable(struct domain *d,
-                       struct page_info *page,
-                       int expected_refcnt);
-int page_make_private(struct domain *d, struct page_info *page);
 
 int map_ldt_shadow_page(unsigned int);
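The deadlock argument above is compact, so a standalone sketch may help. The
following toy program uses invented helper names -- it is not the patch's
mem_sharing_page_lock()/unlock(), and it performs no real locking -- but it
shows the discipline of always acquiring the lower-numbered frame first,
which is what makes two concurrent sharers of the same pair safe:

/* Toy illustration of "lock pages in increasing order". The struct and
 * helpers are made up for this sketch; only the ordering rule mirrors
 * the patch. */
#include <stdio.h>

struct fake_page { unsigned long mfn; int locked; };

static void lock_page(struct fake_page *p)   { p->locked = 1; }
static void unlock_page(struct fake_page *p) { p->locked = 0; }

/* Whatever roles the two pages play (source/client), always take the
 * one with the smaller mfn first. Two callers working on the same pair
 * in opposite order then contend on the same first lock instead of
 * deadlocking on each other's second lock. */
static void lock_pair(struct fake_page *a, struct fake_page *b,
                      struct fake_page **first, struct fake_page **second)
{
    if ( a->mfn < b->mfn ) { *first = a; *second = b; }
    else                   { *first = b; *second = a; }
    lock_page(*first);
    lock_page(*second);
}

int main(void)
{
    struct fake_page source = { .mfn = 42 }, client = { .mfn = 7 };
    struct fake_page *first, *second;

    lock_pair(&source, &client, &first, &second);
    /* ... critical section: move gfn_info entries, free client state ... */
    unlock_page(second);
    unlock_page(first);
    printf("locked mfn %lu, then mfn %lu\n", first->mfn, second->mfn);
    return 0;
}

This is the same shape as mem_sharing_share_pages() above, where
firstpg/secondpg record the acquisition order so the error paths can unlock
in the reverse of it.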
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 6 of 9] x86/mm: New domctl: add a shared page to the physmap
xen/arch/x86/mm/mem_sharing.c |  104 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h   |    3 +-
 2 files changed, 106 insertions(+), 1 deletions(-)

This domctl is useful, for example, to directly populate parts of a
domain's physmap with shared frames.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r bfebf49b3eb6 -r bba44de8394a xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -851,6 +851,74 @@ err_out:
     return ret;
 }
 
+int mem_sharing_add_to_physmap(struct domain *sd, unsigned long sgfn, shr_handle_t sh,
+                               struct domain *cd, unsigned long cgfn)
+{
+    struct page_info *spage;
+    int ret = -EINVAL;
+    mfn_t smfn, cmfn;
+    p2m_type_t smfn_type, cmfn_type;
+    struct gfn_info *gfn_info;
+    struct p2m_domain *p2m = p2m_get_hostp2m(cd);
+    p2m_access_t a;
+    DECLARE_PG_LOCK_DATA(pld);
+
+    /* XXX if sd == cd handle potential deadlock by ordering
+     * the get_ and put_gfn's */
+    smfn = get_gfn_query(sd, sgfn, &smfn_type);
+    cmfn = get_gfn_type_access(p2m, cgfn, &cmfn_type, &a, p2m_query, NULL);
+
+    /* Get the source shared page, check and lock */
+    ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID;
+    spage = __grab_shared_page(smfn, &pld);
+    if ( spage == NULL )
+        goto err_out;
+    ASSERT(smfn_type == p2m_ram_shared);
+
+    /* Check that the handles match */
+    if ( spage->shared_info->handle != sh )
+        goto err_unlock;
+
+    /* Make sure the target page is a hole in the physmap */
+    if ( mfn_valid(cmfn) ||
+         (!(p2m_is_ram(cmfn_type))) )
+    {
+        ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID;
+        goto err_unlock;
+    }
+
+    /* This is simpler than regular sharing */
+    BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));
+    if ( (gfn_info = mem_sharing_gfn_alloc(spage, cd, cgfn)) == NULL )
+    {
+        put_page_and_type(spage);
+        ret = -ENOMEM;
+        goto err_unlock;
+    }
+
+    p2m_lock(p2m);
+    ret = set_p2m_entry(p2m, cgfn, smfn, PAGE_ORDER_4K, p2m_ram_shared, a);
+    p2m_unlock(p2m);
+
+    /* Tempted to turn this into an assert */
+    if ( !ret )
+    {
+        ret = -ENOENT;
+        mem_sharing_gfn_destroy(gfn_info, 0);
+        put_page_and_type(spage);
+    } else {
+        atomic_inc(&cd->shr_pages);
+        atomic_inc(&nr_shared_mfns);
+        ret = 0;
+    }
+
+err_unlock:
+    mem_sharing_page_unlock(spage, &pld);
+err_out:
+    put_gfn(cd, cgfn);
+    put_gfn(sd, sgfn);
+    return ret;
+}
+
 int mem_sharing_unshare_page(struct domain *d,
                              unsigned long gfn,
                              uint16_t flags)
@@ -1085,6 +1153,42 @@ int mem_sharing_domctl(struct domain *d,
         }
         break;
 
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP:
+        {
+            unsigned long sgfn, cgfn;
+            struct domain *cd;
+            shr_handle_t sh;
+
+            if ( !mem_sharing_enabled(d) )
+                return -EINVAL;
+
+            cd = get_domain_by_id(mec->u.share.client_domain);
+            if ( !cd )
+                return -ESRCH;
+
+            if ( !mem_sharing_enabled(cd) )
+            {
+                put_domain(cd);
+                return -EINVAL;
+            }
+
+            if ( XEN_DOMCTL_MEM_SHARING_FIELD_IS_GREF(mec->u.share.source_gfn) )
+            {
+                /* Cannot add a gref to the physmap */
+                put_domain(cd);
+                return -EINVAL;
+            }
+
+            sgfn = mec->u.share.source_gfn;
+            sh   = mec->u.share.source_handle;
+            cgfn = mec->u.share.client_gfn;
+
+            rc = mem_sharing_add_to_physmap(d, sgfn, sh, cd, cgfn);
+
+            put_domain(cd);
+        }
+        break;
+
         case XEN_DOMCTL_MEM_EVENT_OP_SHARING_RESUME:
         {
             if ( !mem_sharing_enabled(d) )
diff -r bfebf49b3eb6 -r bba44de8394a xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -771,6 +771,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_e
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_GFN      5
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_MFN      6
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_GREF     7
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP    8
 
 #define XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID  (-10)
 #define XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID  (-9)
@@ -797,7 +798,7 @@ struct xen_domctl_mem_sharing_op {
             } u;
             uint64_aligned_t  handle;     /* OUT: the handle           */
         } nominate;
-        struct mem_sharing_op_share {     /* OP_SHARE */
+        struct mem_sharing_op_share {     /* OP_SHARE/ADD_PHYSMAP */
             uint64_aligned_t source_gfn;    /* IN: the gfn of the source page */
             uint64_aligned_t source_handle; /* IN: handle to the source page */
             uint64_aligned_t client_domain; /* IN: the client domain id */
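As a rough illustration of how a toolstack might drive the new subop, here
is a hedged caller-side sketch that relies only on the field names visible
in the header fragment above. The include path, the subop field name "op",
and the surrounding do_domctl() plumbing are assumptions, not something
this series defines:

/* Hypothetical caller-side sketch -- not part of the series. */
#include <stdint.h>
#include <string.h>
#include <xen/domctl.h>  /* assumed location of the public domctl header */

static void fill_add_physmap_op(struct xen_domctl_mem_sharing_op *mso,
                                uint64_t sgfn, uint64_t handle,
                                uint64_t client_domid, uint64_t cgfn)
{
    memset(mso, 0, sizeof(*mso));
    mso->op = XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP; /* field name assumed */
    mso->u.share.source_gfn    = sgfn;         /* gfn of the shared source page */
    mso->u.share.source_handle = handle;       /* handle from a prior nominate */
    mso->u.share.client_domain = client_domid; /* domain gaining the mapping */
    mso->u.share.client_gfn    = cgfn;         /* must be a hole in its physmap */
}

The hypervisor side then validates the handle against the (locked) source
page and fails with C_HANDLE_INVALID if the client gfn is not actually a
hole in the physmap.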
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 7 of 9] Add the ability to poll stats about shared memory via the console
xen/arch/ia64/xen/mm.c        |  6 ++++++
 xen/arch/x86/mm/mem_sharing.c |  8 ++++++++
 xen/common/keyhandler.c       |  7 +++++--
 xen/include/xen/mm.h          |  3 +++
 4 files changed, 22 insertions(+), 2 deletions(-)

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r bba44de8394a -r 82d9d136bad6 xen/arch/ia64/xen/mm.c
--- a/xen/arch/ia64/xen/mm.c
+++ b/xen/arch/ia64/xen/mm.c
@@ -3565,6 +3565,12 @@ p2m_pod_decrease_reservation(struct doma
     return 0;
 }
 
+/* Simple no-op */
+void arch_dump_shared_mem_info(void)
+{
+    printk("Shared memory not supported yet\n");
+}
+
 /*
  * Local variables:
  * mode: C
diff -r bba44de8394a -r 82d9d136bad6 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1230,6 +1230,14 @@ int mem_sharing_domctl(struct domain *d,
     return rc;
 }
 
+void arch_dump_shared_mem_info(void)
+{
+    printk("Shared pages %u -- Saved frames %u -- Dom CoW footprint %u\n",
+           mem_sharing_get_nr_shared_mfns(),
+           mem_sharing_get_nr_saved_mfns(),
+           dom_cow->tot_pages);
+}
+
 void __init mem_sharing_init(void)
 {
     printk("Initing memory sharing.\n");
diff -r bba44de8394a -r 82d9d136bad6 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -15,6 +15,7 @@
 #include <xen/compat.h>
 #include <xen/ctype.h>
 #include <xen/perfc.h>
+#include <xen/mm.h>
 #include <asm/debugger.h>
 #include <asm/div64.h>
 
@@ -248,8 +249,8 @@ static void dump_domains(unsigned char k
         printk("    refcnt=%d dying=%d pause_count=%d\n",
                atomic_read(&d->refcnt), d->is_dying,
                atomic_read(&d->pause_count));
-        printk("    nr_pages=%d xenheap_pages=%d dirty_cpus=%s max_pages=%u\n",
-               d->tot_pages, d->xenheap_pages, tmpstr, d->max_pages);
+        printk("    nr_pages=%d xenheap_pages=%d shared_pages=%u dirty_cpus=%s max_pages=%u\n",
+               d->tot_pages, d->xenheap_pages, atomic_read(&d->shr_pages), tmpstr, d->max_pages);
         printk("    handle=%02x%02x%02x%02x-%02x%02x-%02x%02x-"
                "%02x%02x-%02x%02x%02x%02x%02x%02x vm_assist=%08lx\n",
                d->handle[ 0], d->handle[ 1], d->handle[ 2], d->handle[ 3],
@@ -308,6 +309,8 @@ static void dump_domains(unsigned char k
         }
     }
 
+    arch_dump_shared_mem_info();
+
     rcu_read_unlock(&domlist_read_lock);
 #undef tmpstr
 }
diff -r bba44de8394a -r 82d9d136bad6 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -73,6 +73,9 @@ int assign_pages(
     unsigned int order,
     unsigned int memflags);
 
+/* Dump info to serial console */
+void arch_dump_shared_mem_info(void);
+
 /* memflags: */
 #define _MEMF_no_refcount 0
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 8 of 9] x86/mm: New domctl: Perform sharing audit
xen/arch/x86/mm/mem_sharing.c |  15 ++++++++++-----
 xen/include/public/domctl.h   |   1 +
 2 files changed, 11 insertions(+), 5 deletions(-)

Sharing audits are heavyweight, so instead of performing them inline,
we make them callable via a domctl.

Signed-off-by: Adin Scannell <adin@scannell.ca>

diff -r 82d9d136bad6 -r 5b9b36648e43 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -936,7 +936,6 @@ int mem_sharing_unshare_page(struct doma
      * between shr_lock and p2m fine-grained locks in mm-lock.
      * Callers may walk in here already holding the lock for this gfn */
     shr_lock();
-    mem_sharing_audit();
     mfn = get_gfn(d, gfn, &p2mt);
 
     /* Has someone already unshared it? */
@@ -1218,15 +1217,21 @@ int mem_sharing_domctl(struct domain *d,
         }
         break;
 
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_AUDIT:
+        {
+#if MEM_SHARING_AUDIT
+            rc = mem_sharing_audit();
+#else
+            rc = -ENOSYS;
+#endif
+            break;
+        }
+
         default:
             rc = -ENOSYS;
             break;
     }
 
-    shr_lock();
-    mem_sharing_audit();
-    shr_unlock();
-
     return rc;
 }
 
diff -r 82d9d136bad6 -r 5b9b36648e43 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -772,6 +772,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_e
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_MFN      6
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DEBUG_GREF     7
 #define XEN_DOMCTL_MEM_EVENT_OP_SHARING_ADD_PHYSMAP    8
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_AUDIT          9
 
 #define XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID  (-10)
 #define XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID  (-9)
Andres Lagar-Cavilla
2011-Dec-09 20:22 UTC
[PATCH 9 of 9] x86/mm: use RCU in mem sharing audit list, eliminate global lock completely
xen/arch/x86/mm/mem_sharing.c     |  83 +++++++++++++++-----------------------
 xen/arch/x86/mm/mm-locks.h        |  18 --------
 xen/include/asm-x86/mem_sharing.h |   3 +-
 3 files changed, 35 insertions(+), 69 deletions(-)

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 5b9b36648e43 -r 835be7c121b7 xen/arch/x86/mm/mem_sharing.c
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -32,6 +32,7 @@
 #include <asm/p2m.h>
 #include <asm/mem_event.h>
 #include <asm/atomic.h>
+#include <xen/rcupdate.h>
 
 #include "mm-locks.h"
 
@@ -47,44 +48,33 @@ typedef struct pg_lock_data {
 
 #if MEM_SHARING_AUDIT
 
-static mm_lock_t shr_lock;
-
-#define shr_lock()          _shr_lock()
-#define shr_unlock()        _shr_unlock()
-#define shr_locked_by_me()  _shr_locked_by_me()
-
 static int mem_sharing_audit(void);
 
 #define MEM_SHARING_DEBUG(_f, _a...)                                  \
     debugtrace_printk("mem_sharing_debug: %s(): " _f, __func__, ##_a)
 
 static struct list_head shr_audit_list;
+static spinlock_t shr_audit_lock;
+DEFINE_RCU_READ_LOCK(shr_audit_read_lock);
+
+/* RCU delayed free of audit list entry */
+static void _free_pg_shared_info(struct rcu_head *head)
+{
+    xfree(container_of(head, struct page_sharing_info, rcu_head));
+}
 
 static inline void audit_add_list(struct page_info *page)
 {
-    list_add(&page->shared_info->entry, &shr_audit_list);
+    list_add_rcu(&page->shared_info->entry, &shr_audit_list);
 }
 
 static inline void audit_del_list(struct page_info *page)
 {
-    list_del(&page->shared_info->entry);
+    list_del_rcu(&page->shared_info->entry);
+    call_rcu(&page->shared_info->rcu_head, _free_pg_shared_info);
 }
-
-static inline int mem_sharing_page_lock(struct page_info *p,
-                                        pg_lock_data_t *l)
-{
-    return 1;
-}
-#define mem_sharing_page_unlock(p, l)   ((void)0)
-
-#define get_next_handle()   next_handle++;
 #else
 
-#define shr_lock()          ((void)0)
-#define shr_unlock()        ((void)0)
-/* Only used inside audit code */
-//#define shr_locked_by_me()  ((void)0)
-
 static inline int mem_sharing_audit(void)
 {
     return 0;
@@ -93,6 +83,8 @@ static inline int mem_sharing_audit(void
 #define audit_add_list(p)  ((void)0)
 #define audit_del_list(p)  ((void)0)
 
+#endif /* MEM_SHARING_AUDIT */
+
 static inline int mem_sharing_page_lock(struct page_info *pg,
                                         pg_lock_data_t *pld)
 {
@@ -127,7 +119,6 @@ static inline shr_handle_t get_next_hand
     while ( (y = cmpxchg(&next_handle, x, x + 1)) != x );
     return x + 1;
 }
-#endif /* MEM_SHARING_AUDIT */
 
 #define mem_sharing_enabled(d) \
     (is_hvm_domain(d) && (d)->arch.hvm_domain.mem_sharing_enabled)
@@ -220,11 +211,13 @@ static int mem_sharing_audit(void)
     unsigned long count_expected;
     unsigned long count_found = 0;
     struct list_head *ae;
+    DECLARE_PG_LOCK_DATA(pld);
 
-    ASSERT(shr_locked_by_me());
     count_expected = atomic_read(&nr_shared_mfns);
 
-    list_for_each(ae, &shr_audit_list)
+    rcu_read_lock(&shr_audit_read_lock);
+
+    list_for_each_rcu(ae, &shr_audit_list)
     {
         struct page_sharing_info *shared_info;
         unsigned long nr_gfns = 0;
@@ -236,6 +229,15 @@ static int mem_sharing_audit(void)
         pg = shared_info->pg;
         mfn = page_to_mfn(pg);
 
+        /* If we can't lock it, it's definitely not a shared page */
+        if ( !mem_sharing_page_lock(pg, &pld) )
+        {
+            MEM_SHARING_DEBUG("mfn %lx in audit list, but cannot be locked (%lx)!\n",
+                              mfn_x(mfn), pg->u.inuse.type_info);
+            errors++;
+            continue;
+        }
+
         /* Check if the MFN has correct type, owner and handle. */
         if ( !(pg->u.inuse.type_info & PGT_shared_page) )
         {
@@ -317,8 +319,12 @@ static int mem_sharing_audit(void)
                               (pg->u.inuse.type_info & PGT_count_mask));
             errors++;
         }
+
+        mem_sharing_page_unlock(pg, &pld);
     }
 
+    rcu_read_unlock(&shr_audit_read_lock);
+
     if ( count_found != count_expected )
     {
         MEM_SHARING_DEBUG("Expected %ld shared mfns, found %ld.",
@@ -552,14 +558,10 @@ static int page_make_private(struct doma
     spin_lock(&d->page_alloc_lock);
 
     /* We can only change the type if count is one */
-    /* If we are locking pages individually, then we need to drop
+    /* Because we are locking pages individually, we need to drop
      * the lock here, while the page is typed. We cannot risk the
      * race of page_unlock and then put_page_type. */
-#if MEM_SHARING_AUDIT
-    expected_type = (PGT_shared_page | PGT_validated | 1);
-#else
     expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2);
-#endif
     if ( page->u.inuse.type_info != expected_type )
     {
         put_page(page);
@@ -570,12 +572,8 @@ static int page_make_private(struct doma
     /* Drop the final typecount */
     put_page_and_type(page);
 
-#if MEM_SHARING_AUDIT
-    (void)pld;
-#else
     /* Now that we've dropped the type, we can unlock */
     mem_sharing_page_unlock(page, pld);
-#endif
 
     /* Change the owner */
     ASSERT(page_get_owner(page) == dom_cow);
@@ -627,7 +625,6 @@ int mem_sharing_nominate_page(struct dom
 
     *phandle = 0UL;
 
-    shr_lock();
     mfn = get_gfn(d, gfn, &p2mt);
 
     /* Check if mfn is valid */
@@ -719,7 +716,6 @@ int mem_sharing_nominate_page(struct dom
 
 out:
     put_gfn(d, gfn);
-    shr_unlock();
     return ret;
 }
 
@@ -735,8 +731,6 @@ int mem_sharing_share_pages(struct domai
     p2m_type_t smfn_type, cmfn_type;
     DECLARE_PG_LOCK_DATA(pld);
 
-    shr_lock();
-
     /* XXX if sd == cd handle potential deadlock by ordering
      * the get_ and put_gfn's */
     smfn = get_gfn(sd, sgfn, &smfn_type);
@@ -831,7 +825,6 @@ int mem_sharing_share_pages(struct domai
 
     /* Clear the rest of the shared state */
     audit_del_list(cpage);
-    xfree(cpage->shared_info);
     cpage->shared_info = NULL;
 
     mem_sharing_page_unlock(secondpg, &pld);
@@ -847,7 +840,6 @@ int mem_sharing_share_pages(struct domai
 err_out:
     put_gfn(cd, cgfn);
     put_gfn(sd, sgfn);
-    shr_unlock();
     return ret;
 }
 
@@ -932,16 +924,11 @@ int mem_sharing_unshare_page(struct doma
     struct list_head *le;
     DECLARE_PG_LOCK_DATA(pld);
 
-    /* This is one of the reasons why we can't enforce ordering
-     * between shr_lock and p2m fine-grained locks in mm-lock.
-     * Callers may walk in here already holding the lock for this gfn */
-    shr_lock();
     mfn = get_gfn(d, gfn, &p2mt);
 
     /* Has someone already unshared it? */
     if ( !p2m_is_shared(p2mt) )
     {
         put_gfn(d, gfn);
-        shr_unlock();
         return 0;
     }
 
@@ -986,7 +973,6 @@ gfn_found:
     } else {
         /* Clean up shared state */
         audit_del_list(page);
-        xfree(page->shared_info);
         page->shared_info = NULL;
     }
 
@@ -1000,7 +986,6 @@ gfn_found:
         test_and_clear_bit(_PGC_allocated, &page->count_info))
         put_page(page);
     put_gfn(d, gfn);
-    shr_unlock();
     return 0;
 }
 
@@ -1018,7 +1003,6 @@ gfn_found:
     {
         mem_sharing_page_unlock(old_page, &pld);
         put_gfn(d, gfn);
-        shr_unlock();
         return -ENOMEM;
     }
 
@@ -1049,7 +1033,6 @@ private_page_found:
     paging_mark_dirty(d, mfn_x(page_to_mfn(page)));
     /* We do not need to unlock a private page */
     put_gfn(d, gfn);
-    shr_unlock();
     return 0;
 }
 
@@ -1247,7 +1230,7 @@ void __init mem_sharing_init(void)
 {
     printk("Initing memory sharing.\n");
 #if MEM_SHARING_AUDIT
-    mm_lock_init(&shr_lock);
+    spin_lock_init(&shr_audit_lock);
     INIT_LIST_HEAD(&shr_audit_list);
 #endif
 }
diff -r 5b9b36648e43 -r 835be7c121b7 xen/arch/x86/mm/mm-locks.h
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -143,19 +143,6 @@ static inline void mm_enforce_order_unlo
  *                                                                      *
  ************************************************************************/
 
-#if MEM_SHARING_AUDIT
-/* Page-sharing lock (global)
- *
- * A single global lock that protects the memory-sharing code's
- * hash tables. */
-
-declare_mm_lock(shr)
-#define _shr_lock()         mm_lock(shr, &shr_lock)
-#define _shr_unlock()       mm_unlock(&shr_lock)
-#define _shr_locked_by_me() mm_locked_by_me(&shr_lock)
-
-#else
-
 /* Sharing per page lock
  *
  * This is an external lock, not represented by an mm_lock_t. The memory
@@ -163,9 +150,6 @@ declare_mm_lock(shr)
  * tuples to a shared page. We enforce order here against the p2m lock,
  * which is taken after the page_lock to change the gfn's p2m entry.
  *
- * Note that in sharing audit mode, we use the global page lock above,
- * instead.
- *
  * The lock is recursive because during share we lock two pages. */
 
 declare_mm_order_constraint(per_page_sharing)
@@ -174,8 +158,6 @@ declare_mm_order_constraint(per_page_sha
         mm_enforce_order_lock_post_per_page_sharing((l), (r))
 #define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r))
 
-#endif /* MEM_SHARING_AUDIT */
-
 /* Nested P2M lock (per-domain)
  *
  * A per-domain lock that protects the mapping from nested-CR3 to
diff -r 5b9b36648e43 -r 835be7c121b7 xen/include/asm-x86/mem_sharing.h
--- a/xen/include/asm-x86/mem_sharing.h
+++ b/xen/include/asm-x86/mem_sharing.h
@@ -25,7 +25,7 @@
 #include <public/domctl.h>
 
 /* Auditing of memory sharing code? */
-#define MEM_SHARING_AUDIT 0
+#define MEM_SHARING_AUDIT 1
 
 typedef uint64_t shr_handle_t;
 
@@ -35,6 +35,7 @@ struct page_sharing_info
     shr_handle_t handle;    /* Globally unique version / handle. */
#if MEM_SHARING_AUDIT
     struct list_head entry; /* List of all shared pages (entry). */
+    struct rcu_head rcu_head; /* For RCU-delayed freeing. */
 #endif
     struct list_head gfns;  /* List of domains and gfns for this page (head). */
 };
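To see why the call_rcu() of page_sharing_info is safe against concurrent
audit walkers, a toy model of the pattern may help. This is emphatically
not Xen's rcupdate.h: the grace period is emulated with a crude reader
counter and the list is a simplified singly-linked one; only the
publish/retire/deferred-free shape mirrors the patch.

/* Toy model: readers traverse without the writer's lock; a removed
 * entry is freed only after any reader that could still see it is done. */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct shared_rec { unsigned long mfn; struct shared_rec *next; };

static _Atomic(struct shared_rec *) audit_list;
static atomic_int readers;

static void audit_walk(void)                  /* ~ the audit domctl */
{
    atomic_fetch_add(&readers, 1);            /* ~ rcu_read_lock() */
    for ( struct shared_rec *r = atomic_load(&audit_list); r; r = r->next )
        printf("auditing mfn %lu\n", r->mfn);
    atomic_fetch_sub(&readers, 1);            /* ~ rcu_read_unlock() */
}

static void audit_del_head(void)              /* ~ audit_del_list() */
{
    struct shared_rec *victim = atomic_load(&audit_list);
    if ( !victim )
        return;
    atomic_store(&audit_list, victim->next);  /* ~ list_del_rcu() */
    while ( atomic_load(&readers) )
        ;                                     /* crude stand-in for call_rcu() */
    free(victim);                             /* deferred free */
}

int main(void)
{
    for ( unsigned long i = 0; i < 3; i++ )
    {
        struct shared_rec *r = malloc(sizeof(*r));
        r->mfn = i;
        r->next = atomic_load(&audit_list);
        atomic_store(&audit_list, r);         /* ~ list_add_rcu() */
    }
    audit_walk();
    audit_del_head();
    audit_walk();
    return 0;
}

Note that the model's busy-wait would be unacceptable in the hypervisor;
Xen's call_rcu() instead defers the free to a callback after the grace
period, so the writer never blocks on readers.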
Jan Beulich
2011-Dec-12 09:29 UTC
Re: [PATCH 5 of 9] x86/mm: Add per-page locking for memory sharing, when audits are disabled
>>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:
> With the removal of the hash table, all that is needed now is locking
> of individual shared pages, as new (gfn,domain) pairs are removed or
> added from the list of mappings.
>
> We recycle PGT_locked and use it to lock individual pages. We ensure
> deadlock is averted by locking pages in increasing order. We leverage
> an "external" order enforcing construct from mm-locks.h to ensure the
> p2m_lock (needed to modify gfn entries during sharing) is always taken
> after page_locks.

As said before, I view taking a coarse lock inside a fine-grained one as
inappropriate, and hence recommend not applying this patch without
knowing if/when the promised follow-up patch will make it in.

Jan

> The global lock remains for the benefit of the auditing code, and is
> thus enabled only as a compile-time option.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Jan Beulich
2011-Dec-12 09:32 UTC
Re: [PATCH 6 of 9] x86/mm: New domctl: add a shared page to the physmap
>>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:
> This domctl is useful, for example, to directly populate parts of a
> domain's physmap with shared frames.

As said before - there's no consumer of this new code within the series,
and hence you're asking to add dead code. That's not appropriate imo.

Jan

> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>
> [...]
Jan Beulich
2011-Dec-12 09:33 UTC
Re: [PATCH 8 of 9] x86/mm: New domctl: Perform sharing audit
>>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org> wrote:
> Sharing audits are heavyweight, so instead of performing them inline,
> we make them callable via a domctl.

As said before - there's again no consumer of this new code within the
series, and hence you're asking to add dead code.

Jan

> Signed-off-by: Adin Scannell <adin@scannell.ca>
Andres Lagar-Cavilla
2011-Dec-13 04:22 UTC
Re: [PATCH 8 of 9] x86/mm: New domctl: Perform sharing audit
> >>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org>
> wrote:
>> Sharing audits are heavyweight, so instead of performing them inline,
>> we make them callable via a domctl.
>
> As said before - there's again no consumer of this new code within the
> series, and hence you're asking to add dead code.

Stands to reason that audit domctls are generally useful things. We
already have p2m audit domctls with no consumer. And that's not the end
of the list of things in the tree without a consumer.

I think your comments about dead code are a bit excessive. The only
consumer in this tree of the sharing API is memshr. Have a look at how
memshr is currently linked into blktap2. Maybe we should just submit one
patch removing all of memshr and the Xen sharing code? Instead, we're
trying to undo the bit rot here.

Thanks,
Andres

> Jan
>
>> Signed-off-by: Adin Scannell <adin@scannell.ca>
Andres Lagar-Cavilla
2011-Dec-13 04:29 UTC
Re: [PATCH 5 of 9] x86/mm: Add per-page locking for memory sharing, when audits are disabled
> >>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org>
> wrote:
>> With the removal of the hash table, all that is needed now is locking
>> of individual shared pages, as new (gfn,domain) pairs are removed or
>> added from the list of mappings.
>>
>> We recycle PGT_locked and use it to lock individual pages. We ensure
>> deadlock is averted by locking pages in increasing order. We leverage
>> an "external" order enforcing construct from mm-locks.h to ensure the
>> p2m_lock (needed to modify gfn entries during sharing) is always taken
>> after page_locks.
>
> As said before, I view taking a coarse lock inside a fine-grained one as
> inappropriate, and hence recommend not applying this patch without
> knowing if/when the promised follow-up patch will make it in.

Glad you brought that up. The get_gfn*/put_gfn API change was brought
forward as a prelude to fully-synchronized p2m access: both lookups and
modifications take a lock. That code is not in the tree yet, because I'm
fighting a few issues with lock ordering inversions in the sharing code.

But once that code makes it in, get_gfn/put_gfn pairs delimit critical
sections protected by the p2m lock. And the sharing page_lock()s will be
nested inside those sections. In other words, p2m lock first, page lock
second, no lock inside page lock regions. That will also require inverting
the order in which locks are declared in mm-locks.h.

I hope to break through in the p2m locking during January. Certainly
before 4.2-rc1.

Thanks
Andres

> Jan
>
>> The global lock remains for the benefit of the auditing code, and is
>> thus enabled only as a compile-time option.
>>
>> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Jan Beulich
2011-Dec-13 07:51 UTC
Re: [PATCH 8 of 9] x86/mm: New domctl: Perform sharing audit
>>> On 13.12.11 at 05:22, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:
>> >>> On 09.12.11 at 21:22, Andres Lagar-Cavilla <andres@lagarcavilla.org>
>> wrote:
>>> Sharing audits are heavyweight, so instead of performing them inline,
>>> we make them callable via a domctl.
>>
>> As said before - there's again no consumer of this new code within the
>> series, and hence you're asking to add dead code.
>
> Stands to reason that audit domctls are generally useful things. We
> already have p2m audit domctls with no consumer. And that's not the end
> of the list of things in the tree without a consumer.
>
> I think your comments about dead code are a bit excessive. The only
> consumer in this tree of the sharing API is memshr. Have a look at how
> memshr is currently linked into blktap2. Maybe we should just submit one
> patch removing all of memshr and the Xen sharing code?

Seems like I should have made it more clear that my request went more in
the direction of you submitting the (assumed to be existing) consumer
code that you must have used for testing these.

But from a more abstract perspective - yes, dead code should be removed
(and shouldn't have been allowed in) without a clear path towards a
consumer. It's not a law of nature that software components must only
ever grow in size (and usually slow down accordingly).

Jan