Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 0 of 8] Mem event ring interface setup update, V3
Changes from previous posting, dated Feb 29th:
- Added Acked-by Ian Campbell for tools bits
- Turned ((void)0) define for non-x86 architectures into "empty" static inlines
- Added patch #8 to remove the ring from the guest physmap once successfully
  mapped by the helper

Headers from previous posting follow below
------------------------------------------------------------------------------
Feb 29th: Changes from previous posting
- Added Acked-by Tim Deegan for hypervisor side
- Added Acked-by Olaf Hering, okaying the ABI/API change
- +ve errno value when sanity-checking the port pointer within libxc
- not clearing errno before calling the ring setup ioctl.
------------------------------------------------------------------------------

Update the interface for setting up mem event rings (for sharing, mem access
or paging).

Remove the "shared page", which was a waste of a whole page for a single
event channel port value.

More importantly, both the shared page and the ring page were dom0 user-space
process pages mapped by the hypervisor. If the dom0 process does not clean up,
the hypervisor keeps posting events (and holding a map) to a page now
belonging to another process.

Solutions proposed:
- Pass the event channel port explicitly as part of the domctl payload.
- Reserve a pfn in the guest physmap for each mem event ring. Set/retrieve
  these pfns via hvm params. Ensure they are set during build and restore,
  and retrieved during save. Ensure these pages don't leak and domains
  aren't left as zombies.

In all cases, mem event consumers in-tree (xenpaging and xen-access) have
been updated.

Updating the interface to deal with these problems requires
backwards-incompatible changes on both the helper<->libxc and
libxc<->hypervisor interfaces.

Take advantage of the interface update to plumb setting up of the sharing
ring, which was missing.

All patches touch x86/mm hypervisor bits. Patches 1, 3 and 5 are tools
patches as well.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

 tools/libxc/xc_mem_access.c         |  10 +++-
 tools/libxc/xc_mem_event.c          |  12 +++--
 tools/libxc/xc_mem_paging.c         |  10 +++-
 tools/libxc/xenctrl.h               |   6 +-
 tools/tests/xen-access/xen-access.c |  22 +--------
 tools/xenpaging/xenpaging.c         |  18 +------
 tools/xenpaging/xenpaging.h         |   2 +-
 xen/arch/x86/mm/mem_event.c         |  33 +-------------
 xen/include/public/domctl.h         |   4 +-
 xen/include/public/mem_event.h      |   4 -
 xen/include/xen/sched.h             |   2 -
 xen/arch/x86/hvm/hvm.c              |  48 ++++++++++++++++-----
 xen/include/asm-x86/hvm/hvm.h       |   7 +++
 tools/libxc/xc_domain_restore.c     |  42 ++++++++++++++++++
 tools/libxc/xc_domain_save.c        |  36 ++++++++++++++++
 tools/libxc/xc_hvm_build.c          |  21 ++++++--
 tools/libxc/xc_mem_access.c         |   6 +-
 tools/libxc/xc_mem_event.c          |   3 +-
 tools/libxc/xc_mem_paging.c         |   6 +-
 tools/libxc/xenctrl.h               |   8 +--
 tools/libxc/xg_save_restore.h       |   4 +
 tools/tests/xen-access/xen-access.c |  83 +++++++++++++++++------------------
 tools/xenpaging/xenpaging.c         |  52 ++++++++++++++++------
 xen/arch/x86/mm/mem_event.c         |  50 ++++++++++------------
 xen/include/public/domctl.h         |   1 -
 xen/include/public/hvm/params.h     |   7 ++-
 xen/include/xen/sched.h             |   1 +
 xen/arch/x86/mm/mem_event.c         |  41 ++++++++++++++++++
 xen/include/public/domctl.h         |  20 ++++++++-
 xen/include/xen/sched.h             |   3 +
 tools/libxc/xc_memshr.c             |  25 +++++++++++
 tools/libxc/xenctrl.h               |   5 ++
 xen/arch/x86/mm/mem_event.c         |  11 ++++
 xen/common/domain.c                 |   3 +
 xen/include/asm-arm/mm.h            |   3 +-
 xen/include/asm-ia64/mm.h           |   3 +-
 xen/include/asm-x86/mm.h            |   6 ++
 xen/arch/x86/mm/mem_event.c         |   6 +-
 tools/tests/xen-access/xen-access.c |   5 ++
 tools/xenpaging/xenpaging.c         |   5 ++
 40 files changed, 426 insertions(+), 208 deletions(-)
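[Editor's note: for orientation, the post-series setup flow for a ring
consumer is sketched below. This is a condensed illustration assembled from
the xen-access changes in patches 3 and 8, not code from the series itself;
the populate-on-failure fallback and most error handling are trimmed.]

    /* Condensed sketch of the new mem_event consumer setup (assembled from
     * patches 3 and 8; error paths trimmed). */
    #include <xenctrl.h>

    static int setup_access_ring(xc_interface *xch, domid_t domid,
                                 xc_evtchn *xce, void **ring_page)
    {
        unsigned long ring_pfn, mmap_pfn;
        uint32_t evtchn_port;
        int port;

        /* 1. The ring lives at a reserved guest pfn, published via an
         *    HVM param (patch 3). */
        xc_get_hvm_param(xch, domid, HVM_PARAM_ACCESS_RING_PFN, &ring_pfn);

        /* 2. Map it from dom0; patch 3 also populates the gfn first if
         *    the map fails with XEN_DOMCTL_PFINFO_XTAB. */
        mmap_pfn = ring_pfn;
        *ring_page = xc_map_foreign_batch(xch, domid, PROT_READ | PROT_WRITE,
                                          &mmap_pfn, 1);

        /* 3. Enable the interface; the event channel port now comes back
         *    through the domctl instead of a separate shared page. */
        if ( xc_mem_access_enable(xch, domid, &evtchn_port) )
            return -1;

        /* 4. Bind to the port returned by Xen. */
        port = xc_evtchn_bind_interdomain(xce, domid, evtchn_port);

        /* 5. Once mapped, remove the page from the guest physmap (patch 8)
         *    so the guest cannot scribble on its own ring. */
        xc_domain_decrease_reservation_exact(xch, domid, 1, 0, &ring_pfn);

        return port;
    }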
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 1 of 8] Tools: Remove shared page from mem_event/access/paging interfaces
 tools/libxc/xc_mem_access.c         |  10 ++++++++--
 tools/libxc/xc_mem_event.c          |  12 +++++++-----
 tools/libxc/xc_mem_paging.c         |  10 ++++++++--
 tools/libxc/xenctrl.h               |   6 +++---
 tools/tests/xen-access/xen-access.c |  22 ++++------------------
 tools/xenpaging/xenpaging.c         |  18 +++---------------
 tools/xenpaging/xenpaging.h         |   2 +-
 xen/arch/x86/mm/mem_event.c         |  33 +++------------------------------
 xen/include/public/domctl.h         |   4 ++--
 xen/include/public/mem_event.h      |   4 ----
 xen/include/xen/sched.h             |   2 --
 11 files changed, 39 insertions(+), 84 deletions(-)

Don't use the superfluous shared page, return the event channel directly as
part of the domctl struct, instead.

In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API change,
so please voice any concerns.

Known pending issues:
- pager could die and its ring page could be used by some other process, yet
  Xen retains the mapping to it.
- use a saner interface for the paging_load buffer.

This change also affects the x86/mm bits in the hypervisor that process the
mem_event setup domctl.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 151bd4ef506a -r e40fd5b939a8 tools/libxc/xc_mem_access.c
--- a/tools/libxc/xc_mem_access.c
+++ b/tools/libxc/xc_mem_access.c
@@ -25,12 +25,18 @@
 
 
 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page)
+                         uint32_t *port, void *ring_page)
 {
+    if ( !port )
+    {
+        errno = EINVAL;
+        return -1;
+    }
+
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                shared_page, ring_page);
+                                port, ring_page);
 }
 
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
diff -r 151bd4ef506a -r e40fd5b939a8 tools/libxc/xc_mem_event.c
--- a/tools/libxc/xc_mem_event.c
+++ b/tools/libxc/xc_mem_event.c
@@ -24,19 +24,21 @@
 #include "xc_private.h"
 
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, void *page, void *ring_page)
+                         unsigned int mode, uint32_t *port, void *ring_page)
 {
     DECLARE_DOMCTL;
+    int rc;
 
     domctl.cmd = XEN_DOMCTL_mem_event_op;
     domctl.domain = domain_id;
     domctl.u.mem_event_op.op = op;
     domctl.u.mem_event_op.mode = mode;
-
-    domctl.u.mem_event_op.shared_addr = (unsigned long)page;
-    domctl.u.mem_event_op.ring_addr = (unsigned long)ring_page;
+    domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;
 
-    return do_domctl(xch, &domctl);
+    rc = do_domctl(xch, &domctl);
+    if ( !rc && port )
+        *port = domctl.u.mem_event_op.port;
+    return rc;
 }
 
 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
diff -r 151bd4ef506a -r e40fd5b939a8 tools/libxc/xc_mem_paging.c
--- a/tools/libxc/xc_mem_paging.c
+++ b/tools/libxc/xc_mem_paging.c
@@ -25,12 +25,18 @@
 
 
 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page)
+                         uint32_t *port, void *ring_page)
 {
+    if ( !port )
+    {
+        errno = EINVAL;
+        return -1;
+    }
+
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                shared_page, ring_page);
+                                port, ring_page);
 }
 
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id)
diff -r 151bd4ef506a -r e40fd5b939a8 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1888,13 +1888,13 @@ int xc_tmem_restore_extra(xc_interface *
  * mem_event operations
  */
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, void *shared_page, void *ring_page);
+                         unsigned int mode, uint32_t *port, void *ring_page);
 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
                         unsigned int op, unsigned int mode,
                         uint64_t gfn, void *buffer);
 
 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page);
+                         uint32_t *port, void *ring_page);
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id,
                            unsigned long gfn);
@@ -1906,7 +1906,7 @@ int xc_mem_paging_resume(xc_interface *x
                          unsigned long gfn);
 
 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page);
+                         uint32_t *port, void *ring_page);
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);
diff -r 151bd4ef506a -r e40fd5b939a8 tools/tests/xen-access/xen-access.c
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -98,7 +98,7 @@ typedef struct mem_event {
     xc_evtchn *xce_handle;
     int port;
     mem_event_back_ring_t back_ring;
-    mem_event_shared_page_t *shared_page;
+    uint32_t evtchn_port;
     void *ring_page;
     spinlock_t ring_lock;
 } mem_event_t;
@@ -166,7 +166,7 @@ int xc_wait_for_event_or_timeout(xc_inte
  err:
     return -errno;
 }
-
+
 static void *init_page(void)
 {
     void *buffer;
@@ -214,14 +214,6 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Set domain id */
     xenaccess->mem_event.domain_id = domain_id;
 
-    /* Initialise shared page */
-    xenaccess->mem_event.shared_page = init_page();
-    if ( xenaccess->mem_event.shared_page == NULL )
-    {
-        ERROR("Error initialising shared page");
-        goto err;
-    }
-
     /* Initialise ring page */
     xenaccess->mem_event.ring_page = init_page();
     if ( xenaccess->mem_event.ring_page == NULL )
@@ -242,7 +234,7 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Initialise Xen */
     rc = xc_mem_access_enable(xenaccess->xc_handle, xenaccess->mem_event.domain_id,
-                             xenaccess->mem_event.shared_page,
+                             &xenaccess->mem_event.evtchn_port,
                              xenaccess->mem_event.ring_page);
     if ( rc != 0 )
     {
@@ -271,7 +263,7 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Bind event notification */
     rc = xc_evtchn_bind_interdomain(xenaccess->mem_event.xce_handle,
                                     xenaccess->mem_event.domain_id,
-                                    xenaccess->mem_event.shared_page->port);
+                                    xenaccess->mem_event.evtchn_port);
     if ( rc < 0 )
     {
         ERROR("Failed to bind event channel");
@@ -322,12 +314,6 @@ xenaccess_t *xenaccess_init(xc_interface
  err:
     if ( xenaccess )
     {
-        if ( xenaccess->mem_event.shared_page )
-        {
-            munlock(xenaccess->mem_event.shared_page, PAGE_SIZE);
-            free(xenaccess->mem_event.shared_page);
-        }
-
         if ( xenaccess->mem_event.ring_page )
         {
             munlock(xenaccess->mem_event.ring_page, PAGE_SIZE);
diff -r 151bd4ef506a -r e40fd5b939a8 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -341,14 +341,6 @@ static struct xenpaging *xenpaging_init(
         goto err;
     }
 
-    /* Initialise shared page */
-    paging->mem_event.shared_page = init_page();
-    if ( paging->mem_event.shared_page == NULL )
-    {
-        PERROR("Error initialising shared page");
-        goto err;
-    }
-
     /* Initialise ring page */
     paging->mem_event.ring_page = init_page();
     if ( paging->mem_event.ring_page == NULL )
@@ -365,7 +357,7 @@ static struct xenpaging *xenpaging_init(
 
     /* Initialise Xen */
     rc = xc_mem_paging_enable(xch, paging->mem_event.domain_id,
-                              paging->mem_event.shared_page,
+                              &paging->mem_event.evtchn_port,
                               paging->mem_event.ring_page);
     if ( rc != 0 )
     {
@@ -397,7 +389,7 @@ static struct xenpaging *xenpaging_init(
     /* Bind event notification */
     rc = xc_evtchn_bind_interdomain(paging->mem_event.xce_handle,
                                     paging->mem_event.domain_id,
-                                    paging->mem_event.shared_page->port);
+                                    paging->mem_event.evtchn_port);
     if ( rc < 0 )
     {
         PERROR("Failed to bind event channel");
@@ -452,13 +444,9 @@ static struct xenpaging *xenpaging_init(
     {
         if ( paging->xs_handle )
             xs_close(paging->xs_handle);
+
         if ( xch )
             xc_interface_close(xch);
-        if ( paging->mem_event.shared_page )
-        {
-            munlock(paging->mem_event.shared_page, PAGE_SIZE);
-            free(paging->mem_event.shared_page);
-        }
 
         if ( paging->mem_event.ring_page )
         {
diff -r 151bd4ef506a -r e40fd5b939a8 tools/xenpaging/xenpaging.h
--- a/tools/xenpaging/xenpaging.h
+++ b/tools/xenpaging/xenpaging.h
@@ -36,7 +36,7 @@ struct mem_event {
     xc_evtchn *xce_handle;
     int port;
     mem_event_back_ring_t back_ring;
-    mem_event_shared_page_t *shared_page;
+    uint32_t evtchn_port;
     void *ring_page;
 };
 
diff -r 151bd4ef506a -r e40fd5b939a8 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -50,12 +50,10 @@ static int mem_event_enable(
     struct domain *dom_mem_event = current->domain;
     struct vcpu *v = current;
     unsigned long ring_addr = mec->ring_addr;
-    unsigned long shared_addr = mec->shared_addr;
     l1_pgentry_t l1e;
-    unsigned long shared_gfn = 0, ring_gfn = 0; /* gcc ... */
+    unsigned long ring_gfn = 0; /* gcc ... */
     p2m_type_t p2mt;
     mfn_t ring_mfn;
-    mfn_t shared_mfn;
 
     /* Only one helper at a time. If the helper crashed,
      * the ring is in an undefined state and so is the guest.
@@ -66,10 +64,6 @@ static int mem_event_enable(
     /* Get MFN of ring page */
     guest_get_eff_l1e(v, ring_addr, &l1e);
     ring_gfn = l1e_get_pfn(l1e);
-    /* We're grabbing these two in an order that could deadlock
-     * dom0 if 1. it were an hvm 2. there were two concurrent
-     * enables 3. the two gfn's in each enable criss-crossed
-     * 2MB regions. Duly noted.... */
     ring_mfn = get_gfn(dom_mem_event, ring_gfn, &p2mt);
 
     if ( unlikely(!mfn_valid(mfn_x(ring_mfn))) )
@@ -80,23 +74,9 @@ static int mem_event_enable(
 
     mem_event_ring_lock_init(med);
 
-    /* Get MFN of shared page */
-    guest_get_eff_l1e(v, shared_addr, &l1e);
-    shared_gfn = l1e_get_pfn(l1e);
-    shared_mfn = get_gfn(dom_mem_event, shared_gfn, &p2mt);
-
-    if ( unlikely(!mfn_valid(mfn_x(shared_mfn))) )
-    {
-        put_gfn(dom_mem_event, ring_gfn);
-        put_gfn(dom_mem_event, shared_gfn);
-        return -EINVAL;
-    }
-
-    /* Map ring and shared pages */
+    /* Map ring page */
     med->ring_page = map_domain_page(mfn_x(ring_mfn));
-    med->shared_page = map_domain_page(mfn_x(shared_mfn));
     put_gfn(dom_mem_event, ring_gfn);
-    put_gfn(dom_mem_event, shared_gfn);
 
     /* Set the number of currently blocked vCPUs to 0. */
     med->blocked = 0;
@@ -108,8 +88,7 @@ static int mem_event_enable(
     if ( rc < 0 )
         goto err;
 
-    ((mem_event_shared_page_t *)med->shared_page)->port = rc;
-    med->xen_port = rc;
+    med->xen_port = mec->port = rc;
 
     /* Prepare ring buffer */
     FRONT_RING_INIT(&med->front_ring,
@@ -125,9 +104,6 @@ static int mem_event_enable(
     return 0;
 
  err:
-    unmap_domain_page(med->shared_page);
-    med->shared_page = NULL;
-
     unmap_domain_page(med->ring_page);
     med->ring_page = NULL;
 
@@ -249,9 +225,6 @@ static int mem_event_disable(struct doma
         unmap_domain_page(med->ring_page);
         med->ring_page = NULL;
 
-        unmap_domain_page(med->shared_page);
-        med->shared_page = NULL;
-
         /* Unblock all vCPUs */
         for_each_vcpu ( d, v )
         {
diff -r 151bd4ef506a -r e40fd5b939a8 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -746,8 +746,8 @@ struct xen_domctl_mem_event_op {
     uint32_t op;           /* XEN_DOMCTL_MEM_EVENT_OP_*_* */
     uint32_t mode;         /* XEN_DOMCTL_MEM_EVENT_OP_* */
 
-    uint64_aligned_t shared_addr;  /* IN: Virtual address of shared page */
-    uint64_aligned_t ring_addr;    /* IN: Virtual address of ring page */
+    uint32_t port;                 /* OUT: event channel for ring */
+    uint64_aligned_t ring_addr;    /* IN: Virtual address of ring page */
 };
 typedef struct xen_domctl_mem_event_op xen_domctl_mem_event_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_event_op_t);
diff -r 151bd4ef506a -r e40fd5b939a8 xen/include/public/mem_event.h
--- a/xen/include/public/mem_event.h
+++ b/xen/include/public/mem_event.h
@@ -51,10 +51,6 @@
 #define MEM_EVENT_REASON_INT3        5    /* int3 was hit: gla/gfn are RIP */
 #define MEM_EVENT_REASON_SINGLESTEP  6    /* single step was invoked: gla/gfn are RIP */
 
-typedef struct mem_event_shared_page {
-    uint32_t port;
-} mem_event_shared_page_t;
-
 typedef struct mem_event_st {
     uint16_t type;
     uint16_t flags;
diff -r 151bd4ef506a -r e40fd5b939a8 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -190,8 +190,6 @@ struct mem_event_domain
     /* The ring has 64 entries */
     unsigned char foreign_producers;
     unsigned char target_producers;
-    /* shared page */
-    mem_event_shared_page_t *shared_page;
    /* shared ring page */
     void *ring_page;
     /* front-end ring */
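[Editor's note: to make the ABI change concrete, here is how a caller's
enable sequence changes with this patch. This is an illustrative fragment
derived from the xenpaging hunks above, not code from the series; the locals
are hypothetical.]

    /* Before: the port came back through a dedicated shared page. */
    mem_event_shared_page_t *shared_page = init_page();
    xc_mem_paging_enable(xch, domid, shared_page, ring_page);
    port = xc_evtchn_bind_interdomain(xce, domid, shared_page->port);

    /* After: the port travels in the domctl itself, no extra page. */
    uint32_t evtchn_port;
    xc_mem_paging_enable(xch, domid, &evtchn_port, ring_page);
    port = xc_evtchn_bind_interdomain(xce, domid, evtchn_port);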
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 2 of 8] x86/hvm: refactor calls to prepare and tear down a helper ring
 xen/arch/x86/hvm/hvm.c        |  48 ++++++++++++++++++++++++++++----------
 xen/include/asm-x86/hvm/hvm.h |   7 ++++++
 2 files changed, 43 insertions(+), 12 deletions(-)

These are currently used for the rings connecting Xen with qemu. Refactor
them so the same code can be later used for mem event rings.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>

diff -r e40fd5b939a8 -r ede46d85792a xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -333,6 +333,19 @@ static void hvm_init_ioreq_page(
     domain_pause(d);
 }
 
+void destroy_ring_for_helper(
+    void **_va, struct page_info *page)
+{
+    void *va = *_va;
+
+    if ( va != NULL )
+    {
+        unmap_domain_page_global(va);
+        put_page_and_type(page);
+        *_va = NULL;
+    }
+}
+
 static void hvm_destroy_ioreq_page(
     struct domain *d, struct hvm_ioreq_page *iorp)
 {
@@ -340,18 +353,14 @@ static void hvm_destroy_ioreq_page(
 
     ASSERT(d->is_dying);
 
-    if ( iorp->va != NULL )
-    {
-        unmap_domain_page_global(iorp->va);
-        put_page_and_type(iorp->page);
-        iorp->va = NULL;
-    }
+    destroy_ring_for_helper(&iorp->va, iorp->page);
 
     spin_unlock(&iorp->lock);
 }
 
-static int hvm_set_ioreq_page(
-    struct domain *d, struct hvm_ioreq_page *iorp, unsigned long gmfn)
+int prepare_ring_for_helper(
+    struct domain *d, unsigned long gmfn, struct page_info **_page,
+    void **_va)
 {
     struct page_info *page;
     p2m_type_t p2mt;
@@ -392,14 +401,30 @@ static int hvm_set_ioreq_page(
         return -ENOMEM;
     }
 
+    *_va = va;
+    *_page = page;
+
+    put_gfn(d, gmfn);
+
+    return 0;
+}
+
+static int hvm_set_ioreq_page(
+    struct domain *d, struct hvm_ioreq_page *iorp, unsigned long gmfn)
+{
+    struct page_info *page;
+    void *va;
+    int rc;
+
+    if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
+        return rc;
+
     spin_lock(&iorp->lock);
 
     if ( (iorp->va != NULL) || d->is_dying )
     {
+        destroy_ring_for_helper(&iorp->va, iorp->page);
         spin_unlock(&iorp->lock);
-        unmap_domain_page_global(va);
-        put_page_and_type(mfn_to_page(mfn));
-        put_gfn(d, gmfn);
         return -EINVAL;
     }
 
@@ -407,7 +432,6 @@ static int hvm_set_ioreq_page(
     iorp->page = page;
 
     spin_unlock(&iorp->lock);
-    put_gfn(d, gmfn);
 
     domain_unpause(d);
 
diff -r e40fd5b939a8 -r ede46d85792a xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -26,6 +26,7 @@
 #include <asm/hvm/asid.h>
 #include <public/domctl.h>
 #include <public/hvm/save.h>
+#include <asm/mm.h>
 
 /* Interrupt acknowledgement sources. */
 enum hvm_intsrc {
@@ -194,6 +195,12 @@ int hvm_vcpu_cacheattr_init(struct vcpu
 void hvm_vcpu_cacheattr_destroy(struct vcpu *v);
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip);
 
+/* Prepare/destroy a ring for a dom0 helper. The helper will talk
+ * with Xen on behalf of this hvm domain. */
+int prepare_ring_for_helper(struct domain *d, unsigned long gmfn,
+                            struct page_info **_page, void **_va);
+void destroy_ring_for_helper(void **_va, struct page_info *page);
+
 bool_t hvm_send_assist_req(struct vcpu *v);
 
 void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc);
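[Editor's note: a minimal sketch of how a hypervisor-side user consumes the
refactored pair, mirroring the mem_event_enable() usage introduced in patch 3.
The wrapper function below is hypothetical; only the two helpers are real.]

    /* Sketch only: map and release a ring page, given a gfn reserved for
     * it in the target domain. */
    static int example_ring_setup(struct domain *d, unsigned long ring_gfn,
                                  struct page_info **pg, void **va)
    {
        int rc;

        /* Takes a writable type ref on the page and maps it globally. */
        rc = prepare_ring_for_helper(d, ring_gfn, pg, va);
        if ( rc < 0 )
            return rc;

        /* ... use the ring through *va ... */

        /* Unmaps, drops the refs, and NULLs out *va. */
        destroy_ring_for_helper(va, *pg);
        return 0;
    }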
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 3 of 8] Use a reserved pfn in the guest address space to store mem event rings
 tools/libxc/xc_domain_restore.c     |  42 ++++++++++++++++++
 tools/libxc/xc_domain_save.c        |  36 ++++++++++++++++
 tools/libxc/xc_hvm_build.c          |  21 ++++++--
 tools/libxc/xc_mem_access.c         |   6 +-
 tools/libxc/xc_mem_event.c          |   3 +-
 tools/libxc/xc_mem_paging.c         |   6 +-
 tools/libxc/xenctrl.h               |   8 +--
 tools/libxc/xg_save_restore.h       |   4 +
 tools/tests/xen-access/xen-access.c |  83 +++++++++++++++++-------------------
 tools/xenpaging/xenpaging.c         |  52 ++++++++++++++++------
 xen/arch/x86/mm/mem_event.c         |  50 ++++++++++------------
 xen/include/public/domctl.h         |   1 -
 xen/include/public/hvm/params.h     |   7 ++-
 xen/include/xen/sched.h             |   1 +
 14 files changed, 214 insertions(+), 106 deletions(-)

This solves a long-standing issue in which the pages backing these rings were
pages belonging to dom0 user-space processes. Thus, if the process would die
unexpectedly, Xen would keep posting events to a page now belonging to some
other process.

We update all API-consumers in tree (xenpaging and xen-access).

This is an API/ABI change, so please speak up if it breaks your assumptions.

The patch touches tools, hypervisor x86/hvm bits, and hypervisor x86/mm bits.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -677,6 +677,9 @@ typedef struct {
     int max_vcpu_id;
     uint64_t vcpumap;
     uint64_t identpt;
+    uint64_t paging_ring_pfn;
+    uint64_t access_ring_pfn;
+    uint64_t sharing_ring_pfn;
     uint64_t vm86_tss;
     uint64_t console_pfn;
     uint64_t acpi_ioport_location;
@@ -750,6 +753,39 @@ static int pagebuf_get_one(xc_interface
         // DPRINTF("EPT identity map address: %llx\n", buf->identpt);
         return pagebuf_get_one(xch, ctx, buf, fd, dom);
 
+    case XC_SAVE_ID_HVM_PAGING_RING_PFN:
+        /* Skip padding 4 bytes then read the paging ring location. */
+        if ( RDEXACT(fd, &buf->paging_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->paging_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the paging ring pfn");
+            return -1;
+        }
+        // DPRINTF("paging ring pfn address: %llx\n", buf->paging_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
+    case XC_SAVE_ID_HVM_ACCESS_RING_PFN:
+        /* Skip padding 4 bytes then read the mem access ring location. */
+        if ( RDEXACT(fd, &buf->access_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->access_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the access ring pfn");
+            return -1;
+        }
+        // DPRINTF("access ring pfn address: %llx\n", buf->access_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
+    case XC_SAVE_ID_HVM_SHARING_RING_PFN:
+        /* Skip padding 4 bytes then read the sharing ring location. */
+        if ( RDEXACT(fd, &buf->sharing_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->sharing_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the sharing ring pfn");
+            return -1;
+        }
+        // DPRINTF("sharing ring pfn address: %llx\n", buf->sharing_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
     case XC_SAVE_ID_HVM_VM86_TSS:
         /* Skip padding 4 bytes then read the vm86 TSS location. */
         if ( RDEXACT(fd, &buf->vm86_tss, sizeof(uint32_t)) ||
@@ -1460,6 +1496,12 @@ int xc_domain_restore(xc_interface *xch,
     /* should this be deferred? does it change? */
     if ( pagebuf.identpt )
         xc_set_hvm_param(xch, dom, HVM_PARAM_IDENT_PT, pagebuf.identpt);
+    if ( pagebuf.paging_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN, pagebuf.paging_ring_pfn);
+    if ( pagebuf.access_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN, pagebuf.access_ring_pfn);
+    if ( pagebuf.sharing_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN, pagebuf.sharing_ring_pfn);
     if ( pagebuf.vm86_tss )
         xc_set_hvm_param(xch, dom, HVM_PARAM_VM86_TSS, pagebuf.vm86_tss);
     if ( pagebuf.console_pfn )
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -1639,6 +1639,42 @@ int xc_domain_save(xc_interface *xch, in
             goto out;
         }
 
+        chunk.id = XC_SAVE_ID_HVM_PAGING_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the paging ring pfn for guest");
+            goto out;
+        }
+
+        chunk.id = XC_SAVE_ID_HVM_ACCESS_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the access ring pfn for guest");
+            goto out;
+        }
+
+        chunk.id = XC_SAVE_ID_HVM_SHARING_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the sharing ring pfn for guest");
+            goto out;
+        }
+
         chunk.id = XC_SAVE_ID_HVM_VM86_TSS;
         chunk.data = 0;
         xc_get_hvm_param(xch, dom, HVM_PARAM_VM86_TSS,
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_hvm_build.c
--- a/tools/libxc/xc_hvm_build.c
+++ b/tools/libxc/xc_hvm_build.c
@@ -38,12 +38,15 @@
 #define SUPERPAGE_1GB_SHIFT   18
 #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
 
-#define SPECIALPAGE_BUFIOREQ 0
-#define SPECIALPAGE_XENSTORE 1
-#define SPECIALPAGE_IOREQ    2
-#define SPECIALPAGE_IDENT_PT 3
-#define SPECIALPAGE_CONSOLE  4
-#define NR_SPECIAL_PAGES     5
+#define SPECIALPAGE_PAGING   0
+#define SPECIALPAGE_ACCESS   1
+#define SPECIALPAGE_SHARING  2
+#define SPECIALPAGE_BUFIOREQ 3
+#define SPECIALPAGE_XENSTORE 4
+#define SPECIALPAGE_IOREQ    5
+#define SPECIALPAGE_IDENT_PT 6
+#define SPECIALPAGE_CONSOLE  7
+#define NR_SPECIAL_PAGES     8
 #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
 
 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size)
@@ -356,6 +359,12 @@ static int setup_guest(xc_interface *xch
                      special_pfn(SPECIALPAGE_IOREQ));
     xc_set_hvm_param(xch, dom, HVM_PARAM_CONSOLE_PFN,
                      special_pfn(SPECIALPAGE_CONSOLE));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN,
+                     special_pfn(SPECIALPAGE_PAGING));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN,
+                     special_pfn(SPECIALPAGE_ACCESS));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN,
+                     special_pfn(SPECIALPAGE_SHARING));
 
     /*
      * Identity-map page table is required for running with CR0.PG=0 when
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_mem_access.c
--- a/tools/libxc/xc_mem_access.c
+++ b/tools/libxc/xc_mem_access.c
@@ -25,7 +25,7 @@
 
 
 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page)
+                         uint32_t *port)
 {
     if ( !port )
     {
@@ -36,7 +36,7 @@ int xc_mem_access_enable(xc_interface *x
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                port, ring_page);
+                                port);
 }
 
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
@@ -44,7 +44,7 @@ int xc_mem_access_disable(xc_interface *
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_DISABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                NULL, NULL);
+                                NULL);
 }
 
 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id, unsigned long gfn)
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_mem_event.c
--- a/tools/libxc/xc_mem_event.c
+++ b/tools/libxc/xc_mem_event.c
@@ -24,7 +24,7 @@
 #include "xc_private.h"
 
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, uint32_t *port, void *ring_page)
+                         unsigned int mode, uint32_t *port)
 {
     DECLARE_DOMCTL;
     int rc;
@@ -33,7 +33,6 @@ int xc_mem_event_control(xc_interface *x
     domctl.domain = domain_id;
     domctl.u.mem_event_op.op = op;
     domctl.u.mem_event_op.mode = mode;
-    domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;
 
     rc = do_domctl(xch, &domctl);
     if ( !rc && port )
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xc_mem_paging.c
--- a/tools/libxc/xc_mem_paging.c
+++ b/tools/libxc/xc_mem_paging.c
@@ -25,7 +25,7 @@
 
 
 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page)
+                         uint32_t *port)
 {
     if ( !port )
     {
@@ -36,7 +36,7 @@ int xc_mem_paging_enable(xc_interface *x
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                port, ring_page);
+                                port);
 }
 
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id)
@@ -44,7 +44,7 @@ int xc_mem_paging_disable(xc_interface *
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_DISABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                NULL, NULL);
+                                NULL);
 }
 
 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id, unsigned long gfn)
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1888,13 +1888,12 @@ int xc_tmem_restore_extra(xc_interface *
  * mem_event operations
  */
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, uint32_t *port, void *ring_page);
+                         unsigned int mode, uint32_t *port);
 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
                         unsigned int op, unsigned int mode,
                         uint64_t gfn, void *buffer);
 
-int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page);
+int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id,
                            unsigned long gfn);
@@ -1905,8 +1904,7 @@ int xc_mem_paging_load(xc_interface *xch
 int xc_mem_paging_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);
 
-int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page);
+int xc_mem_access_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);
diff -r ede46d85792a -r 39e3ee550391 tools/libxc/xg_save_restore.h
--- a/tools/libxc/xg_save_restore.h
+++ b/tools/libxc/xg_save_restore.h
@@ -254,6 +254,10 @@
 #define XC_SAVE_ID_COMPRESSED_DATA    -12 /* Marker to indicate arrival of compressed data */
 #define XC_SAVE_ID_ENABLE_COMPRESSION -13 /* Marker to enable compression logic at receiver side */
 #define XC_SAVE_ID_HVM_GENERATION_ID_ADDR -14
+/* Markers for the pfns hosting these mem event rings */
+#define XC_SAVE_ID_HVM_PAGING_RING_PFN  -15
+#define XC_SAVE_ID_HVM_ACCESS_RING_PFN  -16
+#define XC_SAVE_ID_HVM_SHARING_RING_PFN -17
 
 /*
 ** We process save/restore/migrate in batches of pages; the below
diff -r ede46d85792a -r 39e3ee550391 tools/tests/xen-access/xen-access.c
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -166,36 +166,13 @@ int xc_wait_for_event_or_timeout(xc_inte
  err:
     return -errno;
 }
-
-static void *init_page(void)
-{
-    void *buffer;
-    int ret;
-
-    /* Allocated page memory */
-    ret = posix_memalign(&buffer, PAGE_SIZE, PAGE_SIZE);
-    if ( ret != 0 )
-        goto out_alloc;
-
-    /* Lock buffer in memory so it can't be paged out */
-    ret = mlock(buffer, PAGE_SIZE);
-    if ( ret != 0 )
-        goto out_lock;
-
-    return buffer;
-
-    munlock(buffer, PAGE_SIZE);
- out_lock:
-    free(buffer);
- out_alloc:
-    return NULL;
-}
 
 xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
 {
     xenaccess_t *xenaccess;
     xc_interface *xch;
     int rc;
+    unsigned long ring_pfn, mmap_pfn;
 
     xch = xc_interface_open(NULL, NULL, 0);
     if ( !xch )
@@ -214,28 +191,42 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Set domain id */
     xenaccess->mem_event.domain_id = domain_id;
 
-    /* Initialise ring page */
-    xenaccess->mem_event.ring_page = init_page();
-    if ( xenaccess->mem_event.ring_page == NULL )
-    {
-        ERROR("Error initialising ring page");
-        goto err;
-    }
-
-    /* Initialise ring */
-    SHARED_RING_INIT((mem_event_sring_t *)xenaccess->mem_event.ring_page);
-    BACK_RING_INIT(&xenaccess->mem_event.back_ring,
-                   (mem_event_sring_t *)xenaccess->mem_event.ring_page,
-                   PAGE_SIZE);
-
     /* Initialise lock */
     mem_event_ring_lock_init(&xenaccess->mem_event);
 
+    /* Map the ring page */
+    xc_get_hvm_param(xch, xenaccess->mem_event.domain_id,
+                     HVM_PARAM_ACCESS_RING_PFN, &ring_pfn);
+    mmap_pfn = ring_pfn;
+    xenaccess->mem_event.ring_page =
+        xc_map_foreign_batch(xch, xenaccess->mem_event.domain_id,
+                             PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+    if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+    {
+        /* Map failed, populate ring page */
+        rc = xc_domain_populate_physmap_exact(xenaccess->xc_handle,
+                                              xenaccess->mem_event.domain_id,
+                                              1, 0, 0, &ring_pfn);
+        if ( rc != 0 )
+        {
+            PERROR("Failed to populate ring gfn\n");
+            goto err;
+        }
+
+        mmap_pfn = ring_pfn;
+        xenaccess->mem_event.ring_page =
+            xc_map_foreign_batch(xch, xenaccess->mem_event.domain_id,
+                                 PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+        if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+        {
+            PERROR("Could not map the ring page\n");
+            goto err;
+        }
+    }
+
     /* Initialise Xen */
     rc = xc_mem_access_enable(xenaccess->xc_handle, xenaccess->mem_event.domain_id,
-                             &xenaccess->mem_event.evtchn_port,
-                             xenaccess->mem_event.ring_page);
+                             &xenaccess->mem_event.evtchn_port);
     if ( rc != 0 )
     {
         switch ( errno ) {
@@ -272,6 +263,12 @@ xenaccess_t *xenaccess_init(xc_interface
 
     xenaccess->mem_event.port = rc;
 
+    /* Initialise ring */
+    SHARED_RING_INIT((mem_event_sring_t *)xenaccess->mem_event.ring_page);
+    BACK_RING_INIT(&xenaccess->mem_event.back_ring,
+                   (mem_event_sring_t *)xenaccess->mem_event.ring_page,
+                   PAGE_SIZE);
+
     /* Get platform info */
     xenaccess->platform_info = malloc(sizeof(xc_platform_info_t));
     if ( xenaccess->platform_info == NULL )
@@ -316,8 +313,7 @@ xenaccess_t *xenaccess_init(xc_interface
     {
         if ( xenaccess->mem_event.ring_page )
         {
-            munlock(xenaccess->mem_event.ring_page, PAGE_SIZE);
-            free(xenaccess->mem_event.ring_page);
+            munmap(xenaccess->mem_event.ring_page, PAGE_SIZE);
         }
 
         free(xenaccess->platform_info);
@@ -337,6 +333,7 @@ int xenaccess_teardown(xc_interface *xch
         return 0;
 
     /* Tear down domain xenaccess in Xen */
+    munmap(xenaccess->mem_event.ring_page, PAGE_SIZE);
    rc = xc_mem_access_disable(xenaccess->xc_handle, xenaccess->mem_event.domain_id);
     if ( rc != 0 )
     {
diff -r ede46d85792a -r 39e3ee550391 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -285,6 +285,7 @@ static struct xenpaging *xenpaging_init(
     xentoollog_logger *dbg = NULL;
     char *p;
     int rc;
+    unsigned long ring_pfn, mmap_pfn;
 
     /* Allocate memory */
     paging = calloc(1, sizeof(struct xenpaging));
@@ -341,24 +342,39 @@ static struct xenpaging *xenpaging_init(
         goto err;
     }
 
-    /* Initialise ring page */
-    paging->mem_event.ring_page = init_page();
-    if ( paging->mem_event.ring_page == NULL )
+    /* Map the ring page */
+    xc_get_hvm_param(xch, paging->mem_event.domain_id,
+                     HVM_PARAM_PAGING_RING_PFN, &ring_pfn);
+    mmap_pfn = ring_pfn;
+    paging->mem_event.ring_page =
+        xc_map_foreign_batch(xch, paging->mem_event.domain_id,
+                             PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+    if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
     {
-        PERROR("Error initialising ring page");
-        goto err;
+        /* Map failed, populate ring page */
+        rc = xc_domain_populate_physmap_exact(paging->xc_handle,
+                                              paging->mem_event.domain_id,
+                                              1, 0, 0, &ring_pfn);
+        if ( rc != 0 )
+        {
+            PERROR("Failed to populate ring gfn\n");
+            goto err;
+        }
+
+        mmap_pfn = ring_pfn;
+        paging->mem_event.ring_page =
+            xc_map_foreign_batch(xch, paging->mem_event.domain_id,
+                                 PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+        if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+        {
+            PERROR("Could not map the ring page\n");
+            goto err;
+        }
     }
-
-    /* Initialise ring */
-    SHARED_RING_INIT((mem_event_sring_t *)paging->mem_event.ring_page);
-    BACK_RING_INIT(&paging->mem_event.back_ring,
-                   (mem_event_sring_t *)paging->mem_event.ring_page,
-                   PAGE_SIZE);
 
     /* Initialise Xen */
     rc = xc_mem_paging_enable(xch, paging->mem_event.domain_id,
-                              &paging->mem_event.evtchn_port,
-                              paging->mem_event.ring_page);
+                              &paging->mem_event.evtchn_port);
     if ( rc != 0 )
     {
         switch ( errno ) {
@@ -398,6 +414,12 @@ static struct xenpaging *xenpaging_init(
 
     paging->mem_event.port = rc;
 
+    /* Initialise ring */
+    SHARED_RING_INIT((mem_event_sring_t *)paging->mem_event.ring_page);
+    BACK_RING_INIT(&paging->mem_event.back_ring,
+                   (mem_event_sring_t *)paging->mem_event.ring_page,
+                   PAGE_SIZE);
+
     /* Get max_pages from guest if not provided via cmdline */
     if ( !paging->max_pages )
     {
@@ -450,8 +472,7 @@ static struct xenpaging *xenpaging_init(
 
         if ( paging->mem_event.ring_page )
         {
-            munlock(paging->mem_event.ring_page, PAGE_SIZE);
-            free(paging->mem_event.ring_page);
+            munmap(paging->mem_event.ring_page, PAGE_SIZE);
         }
 
         free(dom_path);
@@ -477,6 +498,7 @@ static int xenpaging_teardown(struct xen
     xch = paging->xc_handle;
     paging->xc_handle = NULL;
     /* Tear down domain paging in Xen */
+    munmap(paging->mem_event.ring_page, PAGE_SIZE);
     rc = xc_mem_paging_disable(xch, paging->mem_event.domain_id);
     if ( rc != 0 )
     {
diff -r ede46d85792a -r 39e3ee550391 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -44,16 +44,11 @@ static int mem_event_enable(
     xen_domctl_mem_event_op_t *mec,
     struct mem_event_domain *med,
     int pause_flag,
+    int param,
     xen_event_channel_notification_t notification_fn)
 {
     int rc;
-    struct domain *dom_mem_event = current->domain;
-    struct vcpu *v = current;
-    unsigned long ring_addr = mec->ring_addr;
-    l1_pgentry_t l1e;
-    unsigned long ring_gfn = 0; /* gcc ... */
-    p2m_type_t p2mt;
-    mfn_t ring_mfn;
+    unsigned long ring_gfn = d->arch.hvm_domain.params[param];
 
     /* Only one helper at a time. If the helper crashed,
      * the ring is in an undefined state and so is the guest.
@@ -61,22 +56,18 @@ static int mem_event_enable(
     if ( med->ring_page )
         return -EBUSY;
 
-    /* Get MFN of ring page */
-    guest_get_eff_l1e(v, ring_addr, &l1e);
-    ring_gfn = l1e_get_pfn(l1e);
-    ring_mfn = get_gfn(dom_mem_event, ring_gfn, &p2mt);
-
-    if ( unlikely(!mfn_valid(mfn_x(ring_mfn))) )
-    {
-        put_gfn(dom_mem_event, ring_gfn);
-        return -EINVAL;
-    }
+    /* The parameter defaults to zero, and it should be
+     * set to something */
+    if ( ring_gfn == 0 )
+        return -ENOSYS;
 
     mem_event_ring_lock_init(med);
+    mem_event_ring_lock(med);
 
-    /* Map ring page */
-    med->ring_page = map_domain_page(mfn_x(ring_mfn));
-    put_gfn(dom_mem_event, ring_gfn);
+    rc = prepare_ring_for_helper(d, ring_gfn, &med->ring_pg_struct,
+                                    &med->ring_page);
+    if ( rc < 0 )
+        goto err;
 
     /* Set the number of currently blocked vCPUs to 0. */
     med->blocked = 0;
@@ -101,11 +92,13 @@ static int mem_event_enable(
     /* Initialize the last-chance wait queue. */
     init_waitqueue_head(&med->wq);
 
+    mem_event_ring_unlock(med);
     return 0;
 
  err:
-    unmap_domain_page(med->ring_page);
-    med->ring_page = NULL;
+    destroy_ring_for_helper(&med->ring_page,
+                            med->ring_pg_struct);
+    mem_event_ring_unlock(med);
 
     return rc;
 }
@@ -221,9 +214,6 @@ static int mem_event_disable(struct doma
 
         /* Free domU's event channel and leave the other one unbound */
         free_xen_event_channel(d->vcpu[0], med->xen_port);
-
-        unmap_domain_page(med->ring_page);
-        med->ring_page = NULL;
 
         /* Unblock all vCPUs */
         for_each_vcpu ( d, v )
@@ -235,6 +225,8 @@ static int mem_event_disable(struct doma
             }
         }
 
+        destroy_ring_for_helper(&med->ring_page,
+                                med->ring_pg_struct);
         mem_event_ring_unlock(med);
     }
 
@@ -549,7 +541,9 @@ int mem_event_domctl(struct domain *d, x
             if ( p2m->pod.entry_count )
                 break;
 
-            rc = mem_event_enable(d, mec, med, _VPF_mem_paging, mem_paging_notification);
+            rc = mem_event_enable(d, mec, med, _VPF_mem_paging,
+                                    HVM_PARAM_PAGING_RING_PFN,
+                                    mem_paging_notification);
         }
         break;
 
@@ -585,7 +579,9 @@ int mem_event_domctl(struct domain *d, x
             if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
                 break;
 
-            rc = mem_event_enable(d, mec, med, _VPF_mem_access, mem_access_notification);
+            rc = mem_event_enable(d, mec, med, _VPF_mem_access,
+                                    HVM_PARAM_ACCESS_RING_PFN,
+                                    mem_access_notification);
         }
         break;
 
diff -r ede46d85792a -r 39e3ee550391 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -747,7 +747,6 @@ struct xen_domctl_mem_event_op {
     uint32_t mode;         /* XEN_DOMCTL_MEM_EVENT_OP_* */
 
     uint32_t port;         /* OUT: event channel for ring */
-    uint64_aligned_t ring_addr;    /* IN: Virtual address of ring page */
 };
 typedef struct xen_domctl_mem_event_op xen_domctl_mem_event_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_event_op_t);
diff -r ede46d85792a -r 39e3ee550391 xen/include/public/hvm/params.h
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -149,6 +149,11 @@
 /* Boolean: Enable nestedhvm (hvm only) */
 #define HVM_PARAM_NESTEDHVM    24
 
-#define HVM_NR_PARAMS          27
+/* Params for the mem event rings */
+#define HVM_PARAM_PAGING_RING_PFN   27
+#define HVM_PARAM_ACCESS_RING_PFN   28
+#define HVM_PARAM_SHARING_RING_PFN  29
+
+#define HVM_NR_PARAMS          30
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
diff -r ede46d85792a -r 39e3ee550391 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -192,6 +192,7 @@ struct mem_event_domain
     unsigned char target_producers;
     /* shared ring page */
     void *ring_page;
+    struct page_info *ring_pg_struct;
     /* front-end ring */
     mem_event_front_ring_t front_ring;
     /* event channel port (vcpu0 only) */
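[Editor's note: the save-side writes and the double RDEXACT on restore only
line up if the on-wire chunk layout is kept in mind. The struct below is
purely illustrative, a hypothetical annotation of the layout implied by the
code above; it is not a type from the patch.]

    /* Effective wire layout of each ring-pfn chunk, as implied by the
     * save/restore code above. */
    struct ring_pfn_chunk {
        int32_t  id;    /* XC_SAVE_ID_HVM_*_RING_PFN (a negative marker) */
        uint32_t pad;   /* alignment; restore skips it with the first RDEXACT */
        uint64_t data;  /* the ring pfn; the chunk is not emitted when 0 */
    };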
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 4 of 8] x86/mm: wire up sharing ring
 xen/arch/x86/mm/mem_event.c |  41 +++++++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h |  20 +++++++++++++++++++-
 xen/include/xen/sched.h     |   3 +++
 3 files changed, 63 insertions(+), 1 deletions(-)

Now that we have an interface close to finalizing, do the necessary plumbing
to set up a ring for reporting failed allocations in the unshare path.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>

diff -r 39e3ee550391 -r d03aa37a0288 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -432,6 +432,13 @@ static void mem_access_notification(stru
         p2m_mem_access_resume(v->domain);
 }
 
+/* Registered with Xen-bound event channel for incoming notifications. */
+static void mem_sharing_notification(struct vcpu *v, unsigned int port)
+{
+    if ( likely(v->domain->mem_event->share.ring_page != NULL) )
+        mem_sharing_sharing_resume(v->domain);
+}
+
 struct domain *get_mem_event_op_target(uint32_t domain, int *rc)
 {
     struct domain *d;
@@ -599,6 +606,40 @@ int mem_event_domctl(struct domain *d, x
     }
     break;
 
+    case XEN_DOMCTL_MEM_EVENT_OP_SHARING:
+    {
+        struct mem_event_domain *med = &d->mem_event->share;
+        rc = -EINVAL;
+
+        switch( mec->op )
+        {
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE:
+        {
+            rc = -ENODEV;
+            /* Only HAP is supported */
+            if ( !hap_enabled(d) )
+                break;
+
+            rc = mem_event_enable(d, mec, med, _VPF_mem_sharing,
+                                    HVM_PARAM_SHARING_RING_PFN,
+                                    mem_sharing_notification);
+        }
+        break;
+
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE:
+        {
+            if ( med->ring_page )
+                rc = mem_event_disable(d, med);
+        }
+        break;
+
+        default:
+            rc = -ENOSYS;
+            break;
+        }
+    }
+    break;
+
     default:
         rc = -ENOSYS;
     }
diff -r 39e3ee550391 -r d03aa37a0288 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -710,7 +710,7 @@ struct xen_domctl_gdbsx_domstatus {
 /* XEN_DOMCTL_mem_event_op */
 
 /*
-* Domain memory paging
+ * Domain memory paging
  * Page memory in and out.
  * Domctl interface to set up and tear down the
  * pager<->hypervisor interface. Use XENMEM_paging_op*
@@ -740,6 +740,24 @@ struct xen_domctl_gdbsx_domstatus {
 #define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE     0
 #define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_DISABLE    1
 
+/*
+ * Sharing ENOMEM helper.
+ *
+ * As with paging, use the domctl for teardown/setup of the
+ * helper<->hypervisor interface.
+ *
+ * If setup, this ring is used to communicate failed allocations
+ * in the unshare path. XENMEM_sharing_op_resume is used to wake up
+ * vcpus that could not unshare.
+ *
+ * Note that sharing can be turned on (as per the domctl below)
+ * *without* this ring being setup.
+ */
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING           3
+
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE    0
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE   1
+
 /* Use for teardown/setup of helper<->hypervisor interface for paging,
  * access and sharing.*/
 struct xen_domctl_mem_event_op {
diff -r 39e3ee550391 -r d03aa37a0288 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -643,6 +643,9 @@ static inline struct domain *next_domain
  /* VCPU is blocked due to missing mem_access ring. */
 #define _VPF_mem_access      5
 #define VPF_mem_access       (1UL<<_VPF_mem_access)
+ /* VCPU is blocked due to missing mem_sharing ring. */
+#define _VPF_mem_sharing     6
+#define VPF_mem_sharing      (1UL<<_VPF_mem_sharing)
 
 static inline int vcpu_runnable(struct vcpu *v)
 {
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 5 of 8] Tools: libxc side for setting up the mem sharing ring
 tools/libxc/xc_memshr.c |  25 +++++++++++++++++++++++++
 tools/libxc/xenctrl.h   |   5 +++++
 2 files changed, 30 insertions(+), 0 deletions(-)

This ring is used to report failed allocations in the unshare path.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

diff -r d03aa37a0288 -r 0b28eaa6422f tools/libxc/xc_memshr.c
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -42,6 +42,31 @@ int xc_memshr_control(xc_interface *xch,
     return do_domctl(xch, &domctl);
 }
 
+int xc_memshr_ring_enable(xc_interface *xch,
+                          domid_t domid,
+                          uint32_t *port)
+{
+    if ( !port )
+    {
+        errno = EINVAL;
+        return -1;
+    }
+
+    return xc_mem_event_control(xch, domid,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING,
+                                port);
+}
+
+int xc_memshr_ring_disable(xc_interface *xch,
+                           domid_t domid)
+{
+    return xc_mem_event_control(xch, domid,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING,
+                                NULL);
+}
+
 static int xc_memshr_memop(xc_interface *xch, domid_t domid,
                             xen_mem_sharing_op_t *mso)
 {
diff -r d03aa37a0288 -r 0b28eaa6422f tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1915,6 +1915,11 @@ int xc_mem_access_resume(xc_interface *x
 int xc_memshr_control(xc_interface *xch,
                       domid_t domid,
                       int enable);
+int xc_memshr_ring_enable(xc_interface *xch,
+                          domid_t domid,
+                          uint32_t *port);
+int xc_memshr_ring_disable(xc_interface *xch,
+                           domid_t domid);
 int xc_memshr_nominate_gfn(xc_interface *xch,
                            domid_t domid,
                            unsigned long gfn,
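[Editor's note: a minimal usage sketch of the new calls, hypothetical helper
code rather than part of the patch. It mirrors how xen-access consumes the
access ring; error handling is omitted.]

    /* Enable the sharing ENOMEM ring for a domain and bind to its
     * event channel. */
    uint32_t port;
    xc_evtchn *xce = xc_evtchn_open(NULL, 0);

    if ( xc_memshr_ring_enable(xch, domid, &port) == 0 )
    {
        int local_port = xc_evtchn_bind_interdomain(xce, domid, port);
        /* ... poll local_port, consume mem_event requests from the ring,
         * then wake up vcpus that blocked on a failed unshare ... */
    }

    /* Teardown. */
    xc_memshr_ring_disable(xch, domid);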
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 6 of 8] x86/mm: Clean up mem event structures on domain destruction
 xen/arch/x86/mm/mem_event.c |  11 +++++++++++
 xen/common/domain.c         |   3 +++
 xen/include/asm-arm/mm.h    |   3 ++-
 xen/include/asm-ia64/mm.h   |   3 ++-
 xen/include/asm-x86/mm.h    |   6 ++++++
 5 files changed, 24 insertions(+), 2 deletions(-)

Otherwise we wind up with zombie domains, still holding onto refs to the mem
event ring pages.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>

diff -r 0b28eaa6422f -r a1ad4c2eccce xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -487,6 +487,17 @@ int do_mem_event_op(int op, uint32_t dom
     return ret;
 }
 
+/* Clean up on domain destruction */
+void mem_event_cleanup(struct domain *d)
+{
+    if ( d->mem_event->paging.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->paging);
+    if ( d->mem_event->access.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->access);
+    if ( d->mem_event->share.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->share);
+}
+
 int mem_event_domctl(struct domain *d, xen_domctl_mem_event_op_t *mec,
                      XEN_GUEST_HANDLE(void) u_domctl)
 {
diff -r 0b28eaa6422f -r a1ad4c2eccce xen/common/domain.c
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -480,6 +480,9 @@ int domain_kill(struct domain *d)
             break;
         }
         d->is_dying = DOMDYING_dead;
+        /* Mem event cleanup has to go here because the rings
+         * have to be put before we call put_domain. */
+        mem_event_cleanup(d);
         put_domain(d);
         send_global_virq(VIRQ_DOM_EXC);
         /* fallthrough */
diff -r 0b28eaa6422f -r a1ad4c2eccce xen/include/asm-arm/mm.h
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -266,7 +266,8 @@ int  get_page(struct page_info *page, st
     machine_to_phys_mapping[(mfn)] = (pfn);    \
 })
 
-#define put_gfn(d, g)   ((void)0)
+static inline void put_gfn(struct domain *d, unsigned long gfn) {}
+static inline void mem_event_cleanup(struct domain *d) {}
 
 #define INVALID_MFN             (~0UL)
 
diff -r 0b28eaa6422f -r a1ad4c2eccce xen/include/asm-ia64/mm.h
--- a/xen/include/asm-ia64/mm.h
+++ b/xen/include/asm-ia64/mm.h
@@ -551,7 +551,8 @@ extern u64 translate_domain_pte(u64 ptev
     gmfn_to_mfn_foreign((_d), (gpfn))
 #define get_gfn_untyped(d, gpfn) gmfn_to_mfn(d, gpfn)
 
-#define put_gfn(d, g)   ((void)0)
+static inline void put_gfn(struct domain *d, unsigned long gfn) {}
+static inline void mem_event_cleanup(struct domain *d) {}
 
 #define __gpfn_invalid(_d, gpfn) \
     (lookup_domain_mpa((_d), ((gpfn)<<PAGE_SHIFT), NULL) == INVALID_MFN)
diff -r 0b28eaa6422f -r a1ad4c2eccce xen/include/asm-x86/mm.h
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -634,6 +634,12 @@ unsigned int domain_clamp_alloc_bitsize(
 
 unsigned long domain_get_maximum_gpfn(struct domain *d);
 
+#ifdef CONFIG_X86_64
+void mem_event_cleanup(struct domain *d);
+#else
+static inline void mem_event_cleanup(struct domain *d) {}
+#endif
+
 extern struct domain *dom_xen, *dom_io, *dom_cow;	/* for vmcoreinfo */
 
 /* Definition of an mm lock: spinlock with extra fields for debugging */
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 7 of 8] x86/mm: Fix mem event error message typos
 xen/arch/x86/mm/mem_event.c |  6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>

diff -r a1ad4c2eccce -r b8c6f0af992a xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -505,13 +505,13 @@ int mem_event_domctl(struct domain *d, x
 
     if ( unlikely(d == current->domain) )
     {
-        gdprintk(XENLOG_INFO, "Tried to do a memory paging op on itself.\n");
+        gdprintk(XENLOG_INFO, "Tried to do a memory event op on itself.\n");
         return -EINVAL;
     }
 
     if ( unlikely(d->is_dying) )
     {
-        gdprintk(XENLOG_INFO, "Ignoring memory paging op on dying domain %u\n",
+        gdprintk(XENLOG_INFO, "Ignoring memory event op on dying domain %u\n",
                  d->domain_id);
         return 0;
     }
@@ -519,7 +519,7 @@ int mem_event_domctl(struct domain *d, x
     if ( unlikely(d->vcpu == NULL) || unlikely(d->vcpu[0] == NULL) )
     {
         gdprintk(XENLOG_INFO,
-                 "Memory paging op on a domain (%u) with no vcpus\n",
+                 "Memory event op on a domain (%u) with no vcpus\n",
                  d->domain_id);
         return -EINVAL;
     }
Andres Lagar-Cavilla
2012-Mar-06 23:50 UTC
[PATCH 8 of 8] Tools: After a helper maps a ring, yank it from the guest physmap
 tools/tests/xen-access/xen-access.c |  5 +++++
 tools/xenpaging/xenpaging.c         |  5 +++++
 2 files changed, 10 insertions(+), 0 deletions(-)

This limits the ability of the guest to play around with its own rings, and
DoS itself.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r b8c6f0af992a -r 13cddd7608bd tools/tests/xen-access/xen-access.c
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -269,6 +269,11 @@ xenaccess_t *xenaccess_init(xc_interface
                    (mem_event_sring_t *)xenaccess->mem_event.ring_page,
                    PAGE_SIZE);
 
+    /* Now that the ring is set, remove it from the guest's physmap */
+    if ( xc_domain_decrease_reservation_exact(xch,
+                    xenaccess->mem_event.domain_id, 1, 0, &ring_pfn) )
+        PERROR("Failed to remove ring from guest physmap");
+
     /* Get platform info */
     xenaccess->platform_info = malloc(sizeof(xc_platform_info_t));
     if ( xenaccess->platform_info == NULL )
diff -r b8c6f0af992a -r 13cddd7608bd tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -420,6 +420,11 @@ static struct xenpaging *xenpaging_init(
                    (mem_event_sring_t *)paging->mem_event.ring_page,
                    PAGE_SIZE);
 
+    /* Now that the ring is set, remove it from the guest's physmap */
+    if ( xc_domain_decrease_reservation_exact(xch,
+                    paging->mem_event.domain_id, 1, 0, &ring_pfn) )
+        PERROR("Failed to remove ring from guest physmap");
+
     /* Get max_pages from guest if not provided via cmdline */
     if ( !paging->max_pages )
     {
Tim Deegan
2012-Mar-08 13:23 UTC
Re: [PATCH 0 of 8] Mem event ring interface setup update, V3
At 18:50 -0500 on 06 Mar (1331059822), Andres Lagar-Cavilla wrote:
> Changes from previous posting, dated Feb 29th:
> - Added Acked-by Ian Campbell for tools bits
> - Turned ((void)0) define for non-x86 architectures into "empty" static inlines
> - Added patch #8 to remove the ring from the guest physmap once successfully
>   mapped by the helper
>

This series doesn't apply cleanly to xen-unstable. I suspect you'll need to
rebase on top of the recent changes to xenpaging.

And patch 8 needs an ack from a tools maintainer.

Apart from that it's good to go.

Tim.
Andres Lagar-Cavilla
2012-Mar-08 14:50 UTC
Re: [PATCH 0 of 8] Mem event ring interface setup update, V3
> At 18:50 -0500 on 06 Mar (1331059822), Andres Lagar-Cavilla wrote:
>> Changes from previous posting, dated Feb 29th:
>> - Added Acked-by Ian Campbell for tools bits
>> - Turned ((void)0) define for non-x86 architectures into "empty" static
>> inlines
>> - Added patch #8 to remove the ring from the guest physmap once
>> successfully mapped by the helper
>>
>
> This series doesn't apply cleanly to xen-unstable. I suspect you'll
> need to rebase on top of the recent changes to xenpaging.
> And patch 8 needs an ack from a tools maintainer.
> Apart from that it's good to go.

Working on the rebase, resending shortly.
Thanks!
Andres

>
> Tim.
>
Ian Campbell
2012-Mar-08 15:42 UTC
Re: [PATCH 8 of 8] Tools: After a helper maps a ring, yank it from the guest physmap
On Tue, 2012-03-06 at 18:50 -0500, Andres Lagar-Cavilla wrote:
>  tools/tests/xen-access/xen-access.c |  5 +++++
>  tools/xenpaging/xenpaging.c         |  5 +++++
>  2 files changed, 10 insertions(+), 0 deletions(-)
>
>
> This limits the ability of the guest to play around with its own rings, and DoS
> itself.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

>
> diff -r b8c6f0af992a -r 13cddd7608bd tools/tests/xen-access/xen-access.c
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -269,6 +269,11 @@ xenaccess_t *xenaccess_init(xc_interface
>                     (mem_event_sring_t *)xenaccess->mem_event.ring_page,
>                     PAGE_SIZE);
>
> +    /* Now that the ring is set, remove it from the guest's physmap */
> +    if ( xc_domain_decrease_reservation_exact(xch,
> +                    xenaccess->mem_event.domain_id, 1, 0, &ring_pfn) )
> +        PERROR("Failed to remove ring from guest physmap");
> +
>      /* Get platform info */
>      xenaccess->platform_info = malloc(sizeof(xc_platform_info_t));
>      if ( xenaccess->platform_info == NULL )
> diff -r b8c6f0af992a -r 13cddd7608bd tools/xenpaging/xenpaging.c
> --- a/tools/xenpaging/xenpaging.c
> +++ b/tools/xenpaging/xenpaging.c
> @@ -420,6 +420,11 @@ static struct xenpaging *xenpaging_init(
>                     (mem_event_sring_t *)paging->mem_event.ring_page,
>                     PAGE_SIZE);
>
> +    /* Now that the ring is set, remove it from the guest's physmap */
> +    if ( xc_domain_decrease_reservation_exact(xch,
> +                    paging->mem_event.domain_id, 1, 0, &ring_pfn) )
> +        PERROR("Failed to remove ring from guest physmap");
> +
>      /* Get max_pages from guest if not provided via cmdline */
>      if ( !paging->max_pages )
>      {
Ian Jackson
2012-Mar-12 11:23 UTC
Re: [PATCH 8 of 8] Tools: After a helper maps a ring, yank it from the guest physmap
Andres Lagar-Cavilla writes ("[PATCH 8 of 8] Tools: After a helper maps a ring, yank it from the guest physmap"):
> This limits the ability of the guest to play around with its own
> rings, and DoS itself.

This looks plausible to me.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>