Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 0 of 7] Mem event ring setup interface update
Update the interface for setting up mem event rings (for sharing, mem access or
paging).

Remove the "shared page", which was a waste of a whole page for a single event
channel port value.

More importantly, both the shared page and the ring page were dom0 user-space
process pages mapped by the hypervisor. If the dom0 process does not clean up,
the hypervisor keeps posting events (and holding a map) to a page now belonging
to another process.

Solutions proposed:
- Pass the event channel port explicitly as part of the domctl payload.
- Reserve a pfn in the guest physmap for each mem event ring. Set/retrieve
  these pfns via hvm params. Ensure they are set during build and restore, and
  retrieved during save. Ensure these pages don't leak and domains are not
  left zombie.

In all cases mem event consumers in-tree (xenpaging and xen-access) have been
updated.

Updating the interface to deal with these problems requires
backwards-incompatible changes on both the helper<->libxc and
libxc<->hypervisor interfaces.

Take advantage of the interface update to plumb setting up of the sharing ring,
which was missing.

All patches touch x86/mm hypervisor bits. Patches 1, 3 and 5 are tools patches
as well.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

 tools/libxc/xc_mem_access.c         | 10 +++-
 tools/libxc/xc_mem_event.c          | 13 +++--
 tools/libxc/xc_mem_paging.c         | 10 +++-
 tools/libxc/xenctrl.h               |  6 +-
 tools/tests/xen-access/xen-access.c | 22 +--------
 tools/xenpaging/xenpaging.c         | 18 +------
 tools/xenpaging/xenpaging.h         |  2 +-
 xen/arch/x86/mm/mem_event.c         | 33 +-------------
 xen/include/public/domctl.h         |  4 +-
 xen/include/public/mem_event.h      |  4 -
 xen/include/xen/sched.h             |  2 -
 xen/arch/x86/hvm/hvm.c              | 48 ++++++++++++++++-----
 xen/include/asm-x86/hvm/hvm.h       |  7 +++
 tools/libxc/xc_domain_restore.c     | 42 ++++++++++++++++++
 tools/libxc/xc_domain_save.c        | 36 ++++++++++++++++
 tools/libxc/xc_hvm_build.c          | 21 ++++++--
 tools/libxc/xc_mem_access.c         |  6 +-
 tools/libxc/xc_mem_event.c          |  3 +-
 tools/libxc/xc_mem_paging.c         |  6 +-
 tools/libxc/xenctrl.h               |  8 +--
 tools/libxc/xg_save_restore.h       |  4 +
 tools/tests/xen-access/xen-access.c | 83 +++++++++++++++++-------------------
 tools/xenpaging/xenpaging.c         | 52 ++++++++++++++++------
 xen/arch/x86/mm/mem_event.c         | 50 ++++++++++------------
 xen/include/public/domctl.h         |  1 -
 xen/include/public/hvm/params.h     |  7 ++-
 xen/include/xen/sched.h             |  1 +
 xen/arch/x86/mm/mem_event.c         | 41 ++++++++++++++++++
 xen/include/public/domctl.h         | 20 ++++++++-
 xen/include/xen/sched.h             |  3 +
 tools/libxc/xc_memshr.c             | 25 +++++++++++
 tools/libxc/xenctrl.h               |  5 ++
 xen/arch/x86/mm/mem_event.c         | 11 ++++
 xen/common/domain.c                 |  3 +
 xen/include/asm-arm/mm.h            |  3 +-
 xen/include/asm-ia64/mm.h           |  3 +-
 xen/include/asm-x86/mm.h            |  2 +
 xen/arch/x86/mm/mem_event.c         |  6 +-
 38 files changed, 413 insertions(+), 208 deletions(-)
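To make the end-to-end design concrete, here is a minimal sketch of what a
helper's setup looks like once the whole series is applied (consumer side,
using the paging ring as the example; error handling elided, and xch/xce/domid
are assumed to be already-open handles):

    unsigned long ring_pfn;
    uint32_t port;
    void *ring_page;

    /* The ring now lives at a guest pfn reserved at build time (patch 3). */
    xc_get_hvm_param(xch, domid, HVM_PARAM_PAGING_RING_PFN, &ring_pfn);
    ring_page = xc_map_foreign_batch(xch, domid, PROT_READ | PROT_WRITE,
                                     &ring_pfn, 1);

    /* The event channel port comes back in the domctl payload (patch 1). */
    xc_mem_paging_enable(xch, domid, &port);
    xc_evtchn_bind_interdomain(xce, domid, port);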
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 1 of 7] Tools: Sanitize mem_event/access/paging interfaces
 tools/libxc/xc_mem_access.c         | 10 ++++++++--
 tools/libxc/xc_mem_event.c          | 13 ++++++++-----
 tools/libxc/xc_mem_paging.c         | 10 ++++++++--
 tools/libxc/xenctrl.h               |  6 +++---
 tools/tests/xen-access/xen-access.c | 22 ++++------------------
 tools/xenpaging/xenpaging.c         | 18 +++---------------
 tools/xenpaging/xenpaging.h         |  2 +-
 xen/arch/x86/mm/mem_event.c         | 33 +++------------------------------
 xen/include/public/domctl.h         |  4 ++--
 xen/include/public/mem_event.h      |  4 ----
 xen/include/xen/sched.h             |  2 --
 11 files changed, 40 insertions(+), 84 deletions(-)

Don't use the superfluous shared page, return the event channel directly as
part of the domctl struct, instead.

In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API change,
so please voice any concerns.

Known pending issues:
- pager could die and its ring page could be used by some other process, yet
  Xen retains the mapping to it.
- use a saner interface for the paging_load buffer.

This change also affects the x86/mm bits in the hypervisor that process the
mem_event setup domctl.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_access.c
--- a/tools/libxc/xc_mem_access.c
+++ b/tools/libxc/xc_mem_access.c
@@ -25,12 +25,18 @@


 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page)
+                         uint32_t *port, void *ring_page)
 {
+    if ( !port )
+    {
+        errno = -EINVAL;
+        return -1;
+    }
+
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                shared_page, ring_page);
+                                port, ring_page);
 }

 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_event.c
--- a/tools/libxc/xc_mem_event.c
+++ b/tools/libxc/xc_mem_event.c
@@ -24,19 +24,22 @@
 #include "xc_private.h"

 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, void *page, void *ring_page)
+                         unsigned int mode, uint32_t *port, void *ring_page)
 {
     DECLARE_DOMCTL;
+    int rc;

     domctl.cmd = XEN_DOMCTL_mem_event_op;
     domctl.domain = domain_id;
     domctl.u.mem_event_op.op = op;
     domctl.u.mem_event_op.mode = mode;
-
-    domctl.u.mem_event_op.shared_addr = (unsigned long)page;
-    domctl.u.mem_event_op.ring_addr = (unsigned long)ring_page;
+    domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;

-    return do_domctl(xch, &domctl);
+    errno = 0;
+    rc = do_domctl(xch, &domctl);
+    if ( !rc && port )
+        *port = domctl.u.mem_event_op.port;
+    return rc;
 }

 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_paging.c
--- a/tools/libxc/xc_mem_paging.c
+++ b/tools/libxc/xc_mem_paging.c
@@ -25,12 +25,18 @@


 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page)
+                         uint32_t *port, void *ring_page)
 {
+    if ( !port )
+    {
+        errno = -EINVAL;
+        return -1;
+    }
+
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                shared_page, ring_page);
+                                port, ring_page);
 }

 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id)
diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1888,13 +1888,13 @@ int xc_tmem_restore_extra(xc_interface *
  * mem_event operations
  */
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, void *shared_page, void *ring_page);
+                         unsigned int mode, uint32_t *port, void *ring_page);
 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
                         unsigned int op, unsigned int mode,
                         uint64_t gfn, void *buffer);

 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page);
+                         uint32_t *port, void *ring_page);
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id,
                            unsigned long gfn);
@@ -1906,7 +1906,7 @@ int xc_mem_paging_resume(xc_interface *x
                          unsigned long gfn);

 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         void *shared_page, void *ring_page);
+                         uint32_t *port, void *ring_page);
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);
diff -r 8ddd13cc783e -r 3b24188d8815 tools/tests/xen-access/xen-access.c
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -98,7 +98,7 @@ typedef struct mem_event {
     xc_evtchn *xce_handle;
     int port;
     mem_event_back_ring_t back_ring;
-    mem_event_shared_page_t *shared_page;
+    uint32_t evtchn_port;
     void *ring_page;
     spinlock_t ring_lock;
 } mem_event_t;
@@ -166,7 +166,7 @@ int xc_wait_for_event_or_timeout(xc_inte
  err:
     return -errno;
 }
- 
+
 static void *init_page(void)
 {
     void *buffer;
@@ -214,14 +214,6 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Set domain id */
     xenaccess->mem_event.domain_id = domain_id;

-    /* Initialise shared page */
-    xenaccess->mem_event.shared_page = init_page();
-    if ( xenaccess->mem_event.shared_page == NULL )
-    {
-        ERROR("Error initialising shared page");
-        goto err;
-    }
-
     /* Initialise ring page */
     xenaccess->mem_event.ring_page = init_page();
     if ( xenaccess->mem_event.ring_page == NULL )
@@ -242,7 +234,7 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Initialise Xen */
     rc = xc_mem_access_enable(xenaccess->xc_handle,
                               xenaccess->mem_event.domain_id,
-                              xenaccess->mem_event.shared_page,
+                              &xenaccess->mem_event.evtchn_port,
                               xenaccess->mem_event.ring_page);
     if ( rc != 0 )
     {
@@ -271,7 +263,7 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Bind event notification */
     rc = xc_evtchn_bind_interdomain(xenaccess->mem_event.xce_handle,
                                     xenaccess->mem_event.domain_id,
-                                    xenaccess->mem_event.shared_page->port);
+                                    xenaccess->mem_event.evtchn_port);
     if ( rc < 0 )
     {
         ERROR("Failed to bind event channel");
@@ -322,12 +314,6 @@ xenaccess_t *xenaccess_init(xc_interface
  err:
     if ( xenaccess )
     {
-        if ( xenaccess->mem_event.shared_page )
-        {
-            munlock(xenaccess->mem_event.shared_page, PAGE_SIZE);
-            free(xenaccess->mem_event.shared_page);
-        }
-
         if ( xenaccess->mem_event.ring_page )
         {
             munlock(xenaccess->mem_event.ring_page, PAGE_SIZE);
diff -r 8ddd13cc783e -r 3b24188d8815 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -341,14 +341,6 @@ static struct xenpaging *xenpaging_init(
         goto err;
     }

-    /* Initialise shared page */
-    paging->mem_event.shared_page = init_page();
-    if ( paging->mem_event.shared_page == NULL )
-    {
-        PERROR("Error initialising shared page");
-        goto err;
-    }
-
     /* Initialise ring page */
     paging->mem_event.ring_page = init_page();
     if ( paging->mem_event.ring_page == NULL )
@@ -365,7 +357,7 @@ static struct xenpaging *xenpaging_init(
     /* Initialise Xen */
     rc = xc_mem_paging_enable(xch, paging->mem_event.domain_id,
-                              paging->mem_event.shared_page,
+                              &paging->mem_event.evtchn_port,
                               paging->mem_event.ring_page);
     if ( rc != 0 )
     {
@@ -397,7 +389,7 @@ static struct xenpaging *xenpaging_init(
     /* Bind event notification */
     rc = xc_evtchn_bind_interdomain(paging->mem_event.xce_handle,
                                     paging->mem_event.domain_id,
-                                    paging->mem_event.shared_page->port);
+                                    paging->mem_event.evtchn_port);
     if ( rc < 0 )
     {
         PERROR("Failed to bind event channel");
@@ -452,13 +444,9 @@ static struct xenpaging *xenpaging_init(
     {
         if ( paging->xs_handle )
             xs_close(paging->xs_handle);
+
         if ( xch )
             xc_interface_close(xch);
-        if ( paging->mem_event.shared_page )
-        {
-            munlock(paging->mem_event.shared_page, PAGE_SIZE);
-            free(paging->mem_event.shared_page);
-        }

         if ( paging->mem_event.ring_page )
         {
diff -r 8ddd13cc783e -r 3b24188d8815 tools/xenpaging/xenpaging.h
--- a/tools/xenpaging/xenpaging.h
+++ b/tools/xenpaging/xenpaging.h
@@ -36,7 +36,7 @@ struct mem_event {
     xc_evtchn *xce_handle;
     int port;
     mem_event_back_ring_t back_ring;
-    mem_event_shared_page_t *shared_page;
+    uint32_t evtchn_port;
     void *ring_page;
 };
diff -r 8ddd13cc783e -r 3b24188d8815 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -50,12 +50,10 @@ static int mem_event_enable(
     struct domain *dom_mem_event = current->domain;
     struct vcpu *v = current;
     unsigned long ring_addr = mec->ring_addr;
-    unsigned long shared_addr = mec->shared_addr;
     l1_pgentry_t l1e;
-    unsigned long shared_gfn = 0, ring_gfn = 0; /* gcc ... */
+    unsigned long ring_gfn = 0; /* gcc ... */
     p2m_type_t p2mt;
     mfn_t ring_mfn;
-    mfn_t shared_mfn;

     /* Only one helper at a time. If the helper crashed,
      * the ring is in an undefined state and so is the guest.
@@ -66,10 +64,6 @@ static int mem_event_enable(
     /* Get MFN of ring page */
     guest_get_eff_l1e(v, ring_addr, &l1e);
     ring_gfn = l1e_get_pfn(l1e);
-    /* We're grabbing these two in an order that could deadlock
-     * dom0 if 1. it were an hvm 2. there were two concurrent
-     * enables 3. the two gfn's in each enable criss-crossed
-     * 2MB regions. Duly noted.... */
     ring_mfn = get_gfn(dom_mem_event, ring_gfn, &p2mt);

     if ( unlikely(!mfn_valid(mfn_x(ring_mfn))) )
@@ -80,23 +74,9 @@ static int mem_event_enable(

     mem_event_ring_lock_init(med);

-    /* Get MFN of shared page */
-    guest_get_eff_l1e(v, shared_addr, &l1e);
-    shared_gfn = l1e_get_pfn(l1e);
-    shared_mfn = get_gfn(dom_mem_event, shared_gfn, &p2mt);
-
-    if ( unlikely(!mfn_valid(mfn_x(shared_mfn))) )
-    {
-        put_gfn(dom_mem_event, ring_gfn);
-        put_gfn(dom_mem_event, shared_gfn);
-        return -EINVAL;
-    }
-
-    /* Map ring and shared pages */
+    /* Map ring page */
     med->ring_page = map_domain_page(mfn_x(ring_mfn));
-    med->shared_page = map_domain_page(mfn_x(shared_mfn));
     put_gfn(dom_mem_event, ring_gfn);
-    put_gfn(dom_mem_event, shared_gfn);

     /* Set the number of currently blocked vCPUs to 0. */
     med->blocked = 0;
@@ -108,8 +88,7 @@ static int mem_event_enable(
     if ( rc < 0 )
         goto err;

-    ((mem_event_shared_page_t *)med->shared_page)->port = rc;
-    med->xen_port = rc;
+    med->xen_port = mec->port = rc;

     /* Prepare ring buffer */
     FRONT_RING_INIT(&med->front_ring,
@@ -125,9 +104,6 @@ static int mem_event_enable(
     return 0;

  err:
-    unmap_domain_page(med->shared_page);
-    med->shared_page = NULL;
-
     unmap_domain_page(med->ring_page);
     med->ring_page = NULL;

@@ -249,9 +225,6 @@ static int mem_event_disable(struct doma
     unmap_domain_page(med->ring_page);
     med->ring_page = NULL;

-    unmap_domain_page(med->shared_page);
-    med->shared_page = NULL;
-
     /* Unblock all vCPUs */
     for_each_vcpu ( d, v )
     {
diff -r 8ddd13cc783e -r 3b24188d8815 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -746,8 +746,8 @@ struct xen_domctl_mem_event_op {
     uint32_t       op;           /* XEN_DOMCTL_MEM_EVENT_OP_*_* */
     uint32_t       mode;         /* XEN_DOMCTL_MEM_EVENT_OP_* */

-    uint64_aligned_t shared_addr;  /* IN:  Virtual address of shared page */
-    uint64_aligned_t ring_addr;    /* IN:  Virtual address of ring page */
+    uint32_t port;              /* OUT: event channel for ring */
+    uint64_aligned_t ring_addr; /* IN:  Virtual address of ring page */
 };
 typedef struct xen_domctl_mem_event_op xen_domctl_mem_event_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_event_op_t);
diff -r 8ddd13cc783e -r 3b24188d8815 xen/include/public/mem_event.h
--- a/xen/include/public/mem_event.h
+++ b/xen/include/public/mem_event.h
@@ -51,10 +51,6 @@
 #define MEM_EVENT_REASON_INT3        5    /* int3 was hit: gla/gfn are RIP */
 #define MEM_EVENT_REASON_SINGLESTEP  6    /* single step was invoked: gla/gfn are RIP */

-typedef struct mem_event_shared_page {
-    uint32_t port;
-} mem_event_shared_page_t;
-
 typedef struct mem_event_st {
     uint16_t type;
     uint16_t flags;
diff -r 8ddd13cc783e -r 3b24188d8815 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -190,8 +190,6 @@ struct mem_event_domain
     /* The ring has 64 entries */
     unsigned char foreign_producers;
     unsigned char target_producers;
-    /* shared page */
-    mem_event_shared_page_t *shared_page;
     /* shared ring page */
     void *ring_page;
     /* front-end ring */
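As a usage sketch of the new contract (hedged — init_page() is the consumer's
own helper shown above, and error handling is elided): at this point in the
series the consumer still supplies its own locked ring page; only the port
moves into the domctl.

    uint32_t port;
    void *ring_page = init_page();   /* posix_memalign + mlock, as before */
    int local_port;

    if ( xc_mem_paging_enable(xch, domid, &port, ring_page) != 0 )
        exit(1);

    /* Bind to the port returned in the domctl, not shared_page->port. */
    local_port = xc_evtchn_bind_interdomain(xce, domid, port);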
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 2 of 7] x86/hvm: refactor calls to prepare and tear down a helper ring
 xen/arch/x86/hvm/hvm.c        | 48 ++++++++++++++++++++++++++++++++----------
 xen/include/asm-x86/hvm/hvm.h |  7 ++++++
 2 files changed, 43 insertions(+), 12 deletions(-)

These are currently used for the rings connecting Xen with qemu. Refactor them
so the same code can be later used for mem event rings.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 3b24188d8815 -r 99e6c9b9e971 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -331,6 +331,19 @@ static void hvm_init_ioreq_page(
     domain_pause(d);
 }

+void destroy_ring_for_helper(
+    void **_va, struct page_info *page)
+{
+    void *va = *_va;
+
+    if ( va != NULL )
+    {
+        unmap_domain_page_global(va);
+        put_page_and_type(page);
+        *_va = NULL;
+    }
+}
+
 static void hvm_destroy_ioreq_page(
     struct domain *d, struct hvm_ioreq_page *iorp)
 {
@@ -338,18 +351,14 @@ static void hvm_destroy_ioreq_page(

     ASSERT(d->is_dying);

-    if ( iorp->va != NULL )
-    {
-        unmap_domain_page_global(iorp->va);
-        put_page_and_type(iorp->page);
-        iorp->va = NULL;
-    }
+    destroy_ring_for_helper(&iorp->va, iorp->page);

     spin_unlock(&iorp->lock);
 }

-static int hvm_set_ioreq_page(
-    struct domain *d, struct hvm_ioreq_page *iorp, unsigned long gmfn)
+int prepare_ring_for_helper(
+    struct domain *d, unsigned long gmfn, struct page_info **_page,
+    void **_va)
 {
     struct page_info *page;
     p2m_type_t p2mt;
@@ -390,14 +399,30 @@ static int hvm_set_ioreq_page(
         return -ENOMEM;
     }

+    *_va = va;
+    *_page = page;
+
+    put_gfn(d, gmfn);
+
+    return 0;
+}
+
+static int hvm_set_ioreq_page(
+    struct domain *d, struct hvm_ioreq_page *iorp, unsigned long gmfn)
+{
+    struct page_info *page;
+    void *va;
+    int rc;
+
+    if ( (rc = prepare_ring_for_helper(d, gmfn, &page, &va)) )
+        return rc;
+
     spin_lock(&iorp->lock);

     if ( (iorp->va != NULL) || d->is_dying )
     {
+        destroy_ring_for_helper(&iorp->va, iorp->page);
         spin_unlock(&iorp->lock);
-        unmap_domain_page_global(va);
-        put_page_and_type(mfn_to_page(mfn));
-        put_gfn(d, gmfn);
         return -EINVAL;
     }

@@ -405,7 +430,6 @@ static int hvm_set_ioreq_page(
     iorp->page = page;

     spin_unlock(&iorp->lock);
-    put_gfn(d, gmfn);

     domain_unpause(d);
diff -r 3b24188d8815 -r 99e6c9b9e971 xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -26,6 +26,7 @@
 #include <asm/hvm/asid.h>
 #include <public/domctl.h>
 #include <public/hvm/save.h>
+#include <asm/mm.h>

 /* Interrupt acknowledgement sources. */
 enum hvm_intsrc {
@@ -191,6 +192,12 @@ int hvm_vcpu_cacheattr_init(struct vcpu
 void hvm_vcpu_cacheattr_destroy(struct vcpu *v);
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip);

+/* Prepare/destroy a ring for a dom0 helper. The helper will talk
+ * with Xen on behalf of this hvm domain. */
+int prepare_ring_for_helper(struct domain *d, unsigned long gmfn,
+                            struct page_info **_page, void **_va);
+void destroy_ring_for_helper(void **_va, struct page_info *page);
+
 bool_t hvm_send_assist_req(struct vcpu *v);

 void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc);
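A sketch of how a new in-hypervisor caller would pair these two helpers —
roughly the shape mem_event_enable takes in patch 3; struct example_ring and
both functions here are hypothetical, for illustration only:

    struct example_ring {
        void *va;
        struct page_info *pg_struct;
    };

    static int example_enable(struct domain *d, unsigned long gfn,
                              struct example_ring *r)
    {
        /* Takes a type/ref on the page and maps it globally. */
        int rc = prepare_ring_for_helper(d, gfn, &r->pg_struct, &r->va);

        if ( rc < 0 )
            return rc;

        /* ... initialize the ring at r->va ... */
        return 0;
    }

    static void example_disable(struct example_ring *r)
    {
        /* Unmaps the global mapping and drops the ref taken above. */
        destroy_ring_for_helper(&r->va, r->pg_struct);
    }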
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
 tools/libxc/xc_domain_restore.c     | 42 ++++++++++++++++++
 tools/libxc/xc_domain_save.c        | 36 ++++++++++++++++
 tools/libxc/xc_hvm_build.c          | 21 ++++++--
 tools/libxc/xc_mem_access.c         |  6 +-
 tools/libxc/xc_mem_event.c          |  3 +-
 tools/libxc/xc_mem_paging.c         |  6 +-
 tools/libxc/xenctrl.h               |  8 +--
 tools/libxc/xg_save_restore.h       |  4 +
 tools/tests/xen-access/xen-access.c | 83 +++++++++++++++++-------------------
 tools/xenpaging/xenpaging.c         | 52 ++++++++++++++++------
 xen/arch/x86/mm/mem_event.c         | 50 ++++++++++------------
 xen/include/public/domctl.h         |  1 -
 xen/include/public/hvm/params.h     |  7 ++-
 xen/include/xen/sched.h             |  1 +
 14 files changed, 214 insertions(+), 106 deletions(-)

This solves a long-standing issue in which the pages backing these rings were
pages belonging to dom0 user-space processes. Thus, if the process would die
unexpectedly, Xen would keep posting events to a page now belonging to some
other process.

We update all API-consumers in tree (xenpaging and xen-access).

This is an API/ABI change, so please speak up if it breaks your assumptions.

The patch touches tools, hypervisor x86/hvm bits, and hypervisor x86/mm bits.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -677,6 +677,9 @@ typedef struct {
     int max_vcpu_id;
     uint64_t vcpumap;
     uint64_t identpt;
+    uint64_t paging_ring_pfn;
+    uint64_t access_ring_pfn;
+    uint64_t sharing_ring_pfn;
     uint64_t vm86_tss;
     uint64_t console_pfn;
     uint64_t acpi_ioport_location;
@@ -750,6 +753,39 @@ static int pagebuf_get_one(xc_interface
         // DPRINTF("EPT identity map address: %llx\n", buf->identpt);
         return pagebuf_get_one(xch, ctx, buf, fd, dom);

+    case XC_SAVE_ID_HVM_PAGING_RING_PFN:
+        /* Skip padding 4 bytes then read the paging ring location. */
+        if ( RDEXACT(fd, &buf->paging_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->paging_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the paging ring pfn");
+            return -1;
+        }
+        // DPRINTF("paging ring pfn address: %llx\n", buf->paging_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
+    case XC_SAVE_ID_HVM_ACCESS_RING_PFN:
+        /* Skip padding 4 bytes then read the mem access ring location. */
+        if ( RDEXACT(fd, &buf->access_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->access_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the access ring pfn");
+            return -1;
+        }
+        // DPRINTF("access ring pfn address: %llx\n", buf->access_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
+    case XC_SAVE_ID_HVM_SHARING_RING_PFN:
+        /* Skip padding 4 bytes then read the sharing ring location. */
+        if ( RDEXACT(fd, &buf->sharing_ring_pfn, sizeof(uint32_t)) ||
+             RDEXACT(fd, &buf->sharing_ring_pfn, sizeof(uint64_t)) )
+        {
+            PERROR("error read the sharing ring pfn");
+            return -1;
+        }
+        // DPRINTF("sharing ring pfn address: %llx\n", buf->sharing_ring_pfn);
+        return pagebuf_get_one(xch, ctx, buf, fd, dom);
+
     case XC_SAVE_ID_HVM_VM86_TSS:
         /* Skip padding 4 bytes then read the vm86 TSS location. */
         if ( RDEXACT(fd, &buf->vm86_tss, sizeof(uint32_t)) ||
@@ -1460,6 +1496,12 @@ int xc_domain_restore(xc_interface *xch,
     /* should this be deferred? does it change? */
     if ( pagebuf.identpt )
         xc_set_hvm_param(xch, dom, HVM_PARAM_IDENT_PT, pagebuf.identpt);
+    if ( pagebuf.paging_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN, pagebuf.paging_ring_pfn);
+    if ( pagebuf.access_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN, pagebuf.access_ring_pfn);
+    if ( pagebuf.sharing_ring_pfn )
+        xc_set_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN, pagebuf.sharing_ring_pfn);
     if ( pagebuf.vm86_tss )
         xc_set_hvm_param(xch, dom, HVM_PARAM_VM86_TSS, pagebuf.vm86_tss);
     if ( pagebuf.console_pfn )
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -1639,6 +1639,42 @@ int xc_domain_save(xc_interface *xch, in
             goto out;
         }

+        chunk.id = XC_SAVE_ID_HVM_PAGING_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the paging ring pfn for guest");
+            goto out;
+        }
+
+        chunk.id = XC_SAVE_ID_HVM_ACCESS_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the access ring pfn for guest");
+            goto out;
+        }
+
+        chunk.id = XC_SAVE_ID_HVM_SHARING_RING_PFN;
+        chunk.data = 0;
+        xc_get_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN,
+                         (unsigned long *)&chunk.data);
+
+        if ( (chunk.data != 0) &&
+             wrexact(io_fd, &chunk, sizeof(chunk)) )
+        {
+            PERROR("Error when writing the sharing ring pfn for guest");
+            goto out;
+        }
+
         chunk.id = XC_SAVE_ID_HVM_VM86_TSS;
         chunk.data = 0;
         xc_get_hvm_param(xch, dom, HVM_PARAM_VM86_TSS,
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_hvm_build.c
--- a/tools/libxc/xc_hvm_build.c
+++ b/tools/libxc/xc_hvm_build.c
@@ -38,12 +38,15 @@
 #define SUPERPAGE_1GB_SHIFT   18
 #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)

-#define SPECIALPAGE_BUFIOREQ 0
-#define SPECIALPAGE_XENSTORE 1
-#define SPECIALPAGE_IOREQ    2
-#define SPECIALPAGE_IDENT_PT 3
-#define SPECIALPAGE_CONSOLE  4
-#define NR_SPECIAL_PAGES     5
+#define SPECIALPAGE_PAGING   0
+#define SPECIALPAGE_ACCESS   1
+#define SPECIALPAGE_SHARING  2
+#define SPECIALPAGE_BUFIOREQ 3
+#define SPECIALPAGE_XENSTORE 4
+#define SPECIALPAGE_IOREQ    5
+#define SPECIALPAGE_IDENT_PT 6
+#define SPECIALPAGE_CONSOLE  7
+#define NR_SPECIAL_PAGES     8
 #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))

 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size)
@@ -356,6 +359,12 @@ static int setup_guest(xc_interface *xch
                      special_pfn(SPECIALPAGE_IOREQ));
     xc_set_hvm_param(xch, dom, HVM_PARAM_CONSOLE_PFN,
                      special_pfn(SPECIALPAGE_CONSOLE));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_PAGING_RING_PFN,
+                     special_pfn(SPECIALPAGE_PAGING));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_ACCESS_RING_PFN,
+                     special_pfn(SPECIALPAGE_ACCESS));
+    xc_set_hvm_param(xch, dom, HVM_PARAM_SHARING_RING_PFN,
+                     special_pfn(SPECIALPAGE_SHARING));

     /*
      * Identity-map page table is required for running with CR0.PG=0 when
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_mem_access.c
--- a/tools/libxc/xc_mem_access.c
+++ b/tools/libxc/xc_mem_access.c
@@ -25,7 +25,7 @@


 int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page)
+                         uint32_t *port)
 {
     if ( !port )
     {
@@ -36,7 +36,7 @@ int xc_mem_access_enable(xc_interface *x
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                port, ring_page);
+                                port);
 }

 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
@@ -44,7 +44,7 @@ int xc_mem_access_disable(xc_interface *
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS_DISABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
-                                NULL, NULL);
+                                NULL);
 }

 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id, unsigned long gfn)
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_mem_event.c
--- a/tools/libxc/xc_mem_event.c
+++ b/tools/libxc/xc_mem_event.c
@@ -24,7 +24,7 @@
 #include "xc_private.h"

 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, uint32_t *port, void *ring_page)
+                         unsigned int mode, uint32_t *port)
 {
     DECLARE_DOMCTL;
     int rc;
@@ -33,7 +33,6 @@ int xc_mem_event_control(xc_interface *x
     domctl.domain = domain_id;
     domctl.u.mem_event_op.op = op;
     domctl.u.mem_event_op.mode = mode;
-    domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;

     errno = 0;
     rc = do_domctl(xch, &domctl);
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_mem_paging.c
--- a/tools/libxc/xc_mem_paging.c
+++ b/tools/libxc/xc_mem_paging.c
@@ -25,7 +25,7 @@


 int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page)
+                         uint32_t *port)
 {
     if ( !port )
     {
@@ -36,7 +36,7 @@ int xc_mem_paging_enable(xc_interface *x
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                port, ring_page);
+                                port);
 }

 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id)
@@ -44,7 +44,7 @@ int xc_mem_paging_disable(xc_interface *
     return xc_mem_event_control(xch, domain_id,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING_DISABLE,
                                 XEN_DOMCTL_MEM_EVENT_OP_PAGING,
-                                NULL, NULL);
+                                NULL);
 }

 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id, unsigned long gfn)
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1888,13 +1888,12 @@ int xc_tmem_restore_extra(xc_interface *
  * mem_event operations
  */
 int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
-                         unsigned int mode, uint32_t *port, void *ring_page);
+                         unsigned int mode, uint32_t *port);
 int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
                         unsigned int op, unsigned int mode,
                         uint64_t gfn, void *buffer);

-int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page);
+int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
 int xc_mem_paging_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_paging_nominate(xc_interface *xch, domid_t domain_id,
                            unsigned long gfn);
@@ -1905,8 +1904,7 @@ int xc_mem_paging_load(xc_interface *xch
 int xc_mem_paging_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);

-int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
-                         uint32_t *port, void *ring_page);
+int xc_mem_access_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
 int xc_mem_access_disable(xc_interface *xch, domid_t domain_id);
 int xc_mem_access_resume(xc_interface *xch, domid_t domain_id,
                          unsigned long gfn);
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xg_save_restore.h
--- a/tools/libxc/xg_save_restore.h
+++ b/tools/libxc/xg_save_restore.h
@@ -254,6 +254,10 @@
 #define XC_SAVE_ID_COMPRESSED_DATA    -12 /* Marker to indicate arrival of compressed data */
 #define XC_SAVE_ID_ENABLE_COMPRESSION -13 /* Marker to enable compression logic at receiver side */
 #define XC_SAVE_ID_HVM_GENERATION_ID_ADDR -14
+/* Markers for the pfn's hosting these mem event rings */
+#define XC_SAVE_ID_HVM_PAGING_RING_PFN  -15
+#define XC_SAVE_ID_HVM_ACCESS_RING_PFN  -16
+#define XC_SAVE_ID_HVM_SHARING_RING_PFN -17

 /*
 ** We process save/restore/migrate in batches of pages; the below
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/tests/xen-access/xen-access.c
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -166,36 +166,13 @@ int xc_wait_for_event_or_timeout(xc_inte
  err:
     return -errno;
 }
-
-static void *init_page(void)
-{
-    void *buffer;
-    int ret;
-
-    /* Allocated page memory */
-    ret = posix_memalign(&buffer, PAGE_SIZE, PAGE_SIZE);
-    if ( ret != 0 )
-        goto out_alloc;
-
-    /* Lock buffer in memory so it can't be paged out */
-    ret = mlock(buffer, PAGE_SIZE);
-    if ( ret != 0 )
-        goto out_lock;
-
-    return buffer;
-
-    munlock(buffer, PAGE_SIZE);
- out_lock:
-    free(buffer);
- out_alloc:
-    return NULL;
-}

 xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
 {
     xenaccess_t *xenaccess;
     xc_interface *xch;
     int rc;
+    unsigned long ring_pfn, mmap_pfn;

     xch = xc_interface_open(NULL, NULL, 0);
     if ( !xch )
@@ -214,28 +191,42 @@ xenaccess_t *xenaccess_init(xc_interface
     /* Set domain id */
     xenaccess->mem_event.domain_id = domain_id;

-    /* Initialise ring page */
-    xenaccess->mem_event.ring_page = init_page();
-    if ( xenaccess->mem_event.ring_page == NULL )
-    {
-        ERROR("Error initialising ring page");
-        goto err;
-    }
-
-
-    /* Initialise ring */
-    SHARED_RING_INIT((mem_event_sring_t *)xenaccess->mem_event.ring_page);
-    BACK_RING_INIT(&xenaccess->mem_event.back_ring,
-                   (mem_event_sring_t *)xenaccess->mem_event.ring_page,
-                   PAGE_SIZE);
-
     /* Initialise lock */
     mem_event_ring_lock_init(&xenaccess->mem_event);

+    /* Map the ring page */
+    xc_get_hvm_param(xch, xenaccess->mem_event.domain_id,
+                     HVM_PARAM_ACCESS_RING_PFN, &ring_pfn);
+    mmap_pfn = ring_pfn;
+    xenaccess->mem_event.ring_page =
+        xc_map_foreign_batch(xch, xenaccess->mem_event.domain_id,
+                             PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+    if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+    {
+        /* Map failed, populate ring page */
+        rc = xc_domain_populate_physmap_exact(xenaccess->xc_handle,
+                                              xenaccess->mem_event.domain_id,
+                                              1, 0, 0, &ring_pfn);
+        if ( rc != 0 )
+        {
+            PERROR("Failed to populate ring gfn\n");
+            goto err;
+        }
+
+        mmap_pfn = ring_pfn;
+        xenaccess->mem_event.ring_page =
+            xc_map_foreign_batch(xch, xenaccess->mem_event.domain_id,
+                                 PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+        if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+        {
+            PERROR("Could not map the ring page\n");
+            goto err;
+        }
+    }
+
     /* Initialise Xen */
     rc = xc_mem_access_enable(xenaccess->xc_handle,
                               xenaccess->mem_event.domain_id,
-                              &xenaccess->mem_event.evtchn_port,
-                              xenaccess->mem_event.ring_page);
+                              &xenaccess->mem_event.evtchn_port);
     if ( rc != 0 )
     {
         switch ( errno ) {
@@ -272,6 +263,12 @@ xenaccess_t *xenaccess_init(xc_interface

     xenaccess->mem_event.port = rc;

+    /* Initialise ring */
+    SHARED_RING_INIT((mem_event_sring_t *)xenaccess->mem_event.ring_page);
+    BACK_RING_INIT(&xenaccess->mem_event.back_ring,
+                   (mem_event_sring_t *)xenaccess->mem_event.ring_page,
+                   PAGE_SIZE);
+
     /* Get platform info */
     xenaccess->platform_info = malloc(sizeof(xc_platform_info_t));
     if ( xenaccess->platform_info == NULL )
@@ -316,8 +313,7 @@ xenaccess_t *xenaccess_init(xc_interface
     {
         if ( xenaccess->mem_event.ring_page )
         {
-            munlock(xenaccess->mem_event.ring_page, PAGE_SIZE);
-            free(xenaccess->mem_event.ring_page);
+            munmap(xenaccess->mem_event.ring_page, PAGE_SIZE);
         }

         free(xenaccess->platform_info);
@@ -337,6 +333,7 @@ int xenaccess_teardown(xc_interface *xch
         return 0;

     /* Tear down domain xenaccess in Xen */
+    munmap(xenaccess->mem_event.ring_page, PAGE_SIZE);
     rc = xc_mem_access_disable(xenaccess->xc_handle,
                                xenaccess->mem_event.domain_id);
     if ( rc != 0 )
     {
diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -285,6 +285,7 @@ static struct xenpaging *xenpaging_init(
     xentoollog_logger *dbg = NULL;
     char *p;
     int rc;
+    unsigned long ring_pfn, mmap_pfn;

     /* Allocate memory */
     paging = calloc(1, sizeof(struct xenpaging));
@@ -341,24 +342,39 @@ static struct xenpaging *xenpaging_init(
         goto err;
     }

-    /* Initialise ring page */
-    paging->mem_event.ring_page = init_page();
-    if ( paging->mem_event.ring_page == NULL )
+    /* Map the ring page */
+    xc_get_hvm_param(xch, paging->mem_event.domain_id,
+                     HVM_PARAM_PAGING_RING_PFN, &ring_pfn);
+    mmap_pfn = ring_pfn;
+    paging->mem_event.ring_page =
+        xc_map_foreign_batch(xch, paging->mem_event.domain_id,
+                             PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+    if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
     {
-        PERROR("Error initialising ring page");
-        goto err;
+        /* Map failed, populate ring page */
+        rc = xc_domain_populate_physmap_exact(paging->xc_handle,
+                                              paging->mem_event.domain_id,
+                                              1, 0, 0, &ring_pfn);
+        if ( rc != 0 )
+        {
+            PERROR("Failed to populate ring gfn\n");
+            goto err;
+        }
+
+        mmap_pfn = ring_pfn;
+        paging->mem_event.ring_page =
+            xc_map_foreign_batch(xch, paging->mem_event.domain_id,
+                                 PROT_READ | PROT_WRITE, &mmap_pfn, 1);
+        if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
+        {
+            PERROR("Could not map the ring page\n");
+            goto err;
+        }
     }
-
-    /* Initialise ring */
-    SHARED_RING_INIT((mem_event_sring_t *)paging->mem_event.ring_page);
-    BACK_RING_INIT(&paging->mem_event.back_ring,
-                   (mem_event_sring_t *)paging->mem_event.ring_page,
-                   PAGE_SIZE);

     /* Initialise Xen */
     rc = xc_mem_paging_enable(xch, paging->mem_event.domain_id,
-                              &paging->mem_event.evtchn_port,
-                              paging->mem_event.ring_page);
+                              &paging->mem_event.evtchn_port);
     if ( rc != 0 )
     {
         switch ( errno ) {
@@ -398,6 +414,12 @@ static struct xenpaging *xenpaging_init(

     paging->mem_event.port = rc;

+    /* Initialise ring */
+    SHARED_RING_INIT((mem_event_sring_t *)paging->mem_event.ring_page);
+    BACK_RING_INIT(&paging->mem_event.back_ring,
+                   (mem_event_sring_t *)paging->mem_event.ring_page,
+                   PAGE_SIZE);
+
     /* Get max_pages from guest if not provided via cmdline */
     if ( !paging->max_pages )
     {
@@ -450,8 +472,7 @@ static struct xenpaging *xenpaging_init(

     if ( paging->mem_event.ring_page )
     {
-        munlock(paging->mem_event.ring_page, PAGE_SIZE);
-        free(paging->mem_event.ring_page);
+        munmap(paging->mem_event.ring_page, PAGE_SIZE);
     }

     free(dom_path);
@@ -477,6 +498,7 @@ static int xenpaging_teardown(struct xen
     xch = paging->xc_handle;
     paging->xc_handle = NULL;
     /* Tear down domain paging in Xen */
+    munmap(paging->mem_event.ring_page, PAGE_SIZE);
     rc = xc_mem_paging_disable(xch, paging->mem_event.domain_id);
     if ( rc != 0 )
     {
diff -r 99e6c9b9e971 -r 0e79f8005b6b xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -44,16 +44,11 @@ static int mem_event_enable(
     xen_domctl_mem_event_op_t *mec,
     struct mem_event_domain *med,
     int pause_flag,
+    int param,
     xen_event_channel_notification_t notification_fn)
 {
     int rc;
-    struct domain *dom_mem_event = current->domain;
-    struct vcpu *v = current;
-    unsigned long ring_addr = mec->ring_addr;
-    l1_pgentry_t l1e;
-    unsigned long ring_gfn = 0; /* gcc ... */
-    p2m_type_t p2mt;
-    mfn_t ring_mfn;
+    unsigned long ring_gfn = d->arch.hvm_domain.params[param];

     /* Only one helper at a time. If the helper crashed,
      * the ring is in an undefined state and so is the guest.
@@ -61,22 +56,18 @@ static int mem_event_enable(
     if ( med->ring_page )
         return -EBUSY;

-    /* Get MFN of ring page */
-    guest_get_eff_l1e(v, ring_addr, &l1e);
-    ring_gfn = l1e_get_pfn(l1e);
-    ring_mfn = get_gfn(dom_mem_event, ring_gfn, &p2mt);
-
-    if ( unlikely(!mfn_valid(mfn_x(ring_mfn))) )
-    {
-        put_gfn(dom_mem_event, ring_gfn);
-        return -EINVAL;
-    }
+    /* The parameter defaults to zero, and it should be
+     * set to something */
+    if ( ring_gfn == 0 )
+        return -ENOSYS;

     mem_event_ring_lock_init(med);
+    mem_event_ring_lock(med);

-    /* Map ring page */
-    med->ring_page = map_domain_page(mfn_x(ring_mfn));
-    put_gfn(dom_mem_event, ring_gfn);
+    rc = prepare_ring_for_helper(d, ring_gfn, &med->ring_pg_struct,
+                                 &med->ring_page);
+    if ( rc < 0 )
+        goto err;

     /* Set the number of currently blocked vCPUs to 0. */
     med->blocked = 0;
@@ -101,11 +92,13 @@ static int mem_event_enable(
     /* Initialize the last-chance wait queue. */
     init_waitqueue_head(&med->wq);

+    mem_event_ring_unlock(med);
     return 0;

  err:
-    unmap_domain_page(med->ring_page);
-    med->ring_page = NULL;
+    destroy_ring_for_helper(&med->ring_page,
+                            med->ring_pg_struct);
+    mem_event_ring_unlock(med);

     return rc;
 }
@@ -221,9 +214,6 @@ static int mem_event_disable(struct doma

         /* Free domU's event channel and leave the other one unbound */
         free_xen_event_channel(d->vcpu[0], med->xen_port);
-
-        unmap_domain_page(med->ring_page);
-        med->ring_page = NULL;

         /* Unblock all vCPUs */
         for_each_vcpu ( d, v )
@@ -235,6 +225,8 @@ static int mem_event_disable(struct doma
             }
         }

+        destroy_ring_for_helper(&med->ring_page,
+                                med->ring_pg_struct);
         mem_event_ring_unlock(med);
     }

@@ -549,7 +541,9 @@ int mem_event_domctl(struct domain *d, x
             if ( p2m->pod.entry_count )
                 break;

-            rc = mem_event_enable(d, mec, med, _VPF_mem_paging, mem_paging_notification);
+            rc = mem_event_enable(d, mec, med, _VPF_mem_paging,
+                                  HVM_PARAM_PAGING_RING_PFN,
+                                  mem_paging_notification);
         }
         break;

@@ -585,7 +579,9 @@ int mem_event_domctl(struct domain *d, x
             if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
                 break;

-            rc = mem_event_enable(d, mec, med, _VPF_mem_access, mem_access_notification);
+            rc = mem_event_enable(d, mec, med, _VPF_mem_access,
+                                  HVM_PARAM_ACCESS_RING_PFN,
+                                  mem_access_notification);
         }
         break;
diff -r 99e6c9b9e971 -r 0e79f8005b6b xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -747,7 +747,6 @@ struct xen_domctl_mem_event_op {
     uint32_t       mode;         /* XEN_DOMCTL_MEM_EVENT_OP_* */

     uint32_t port;              /* OUT: event channel for ring */
-    uint64_aligned_t ring_addr; /* IN:  Virtual address of ring page */
 };
 typedef struct xen_domctl_mem_event_op xen_domctl_mem_event_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_mem_event_op_t);
diff -r 99e6c9b9e971 -r 0e79f8005b6b xen/include/public/hvm/params.h
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -142,6 +142,11 @@
 /* Boolean: Enable nestedhvm (hvm only) */
 #define HVM_PARAM_NESTEDHVM    24

-#define HVM_NR_PARAMS          27
+/* Params for the mem event rings */
+#define HVM_PARAM_PAGING_RING_PFN   27
+#define HVM_PARAM_ACCESS_RING_PFN   28
+#define HVM_PARAM_SHARING_RING_PFN  29
+
+#define HVM_NR_PARAMS          30

 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
diff -r 99e6c9b9e971 -r 0e79f8005b6b xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -192,6 +192,7 @@ struct mem_event_domain
     unsigned char target_producers;
     /* shared ring page */
     void *ring_page;
+    struct page_info *ring_pg_struct;
     /* front-end ring */
     mem_event_front_ring_t front_ring;
     /* event channel port (vcpu0 only) */
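The map-or-populate sequence above is duplicated verbatim in xen-access and
xenpaging; a shared helper along these lines would capture it
(xc_mem_event_map_ring is a hypothetical name, not part of the patch):

    /* Hypothetical helper capturing the duplicated map-or-populate
     * pattern above; not part of this series. Returns the mapped
     * ring page, or NULL on failure. */
    static void *xc_mem_event_map_ring(xc_interface *xch, domid_t domid,
                                       int hvm_param)
    {
        unsigned long ring_pfn, mmap_pfn;
        void *page;

        xc_get_hvm_param(xch, domid, hvm_param, &ring_pfn);
        mmap_pfn = ring_pfn;
        page = xc_map_foreign_batch(xch, domid, PROT_READ | PROT_WRITE,
                                    &mmap_pfn, 1);
        if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
        {
            /* Not yet populated (e.g. an older saved image without
             * these special pages): populate the gfn and retry. */
            if ( xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0,
                                                  &ring_pfn) )
                return NULL;

            mmap_pfn = ring_pfn;
            page = xc_map_foreign_batch(xch, domid, PROT_READ | PROT_WRITE,
                                        &mmap_pfn, 1);
            if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
                return NULL;
        }
        return page;
    }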
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 4 of 7] x86/mm: Wire up sharing ring

 xen/arch/x86/mm/mem_event.c | 41 +++++++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h | 20 +++++++++++++++++++-
 xen/include/xen/sched.h     |  3 +++
 3 files changed, 63 insertions(+), 1 deletions(-)

Now that we have an interface close to finalizing, do the necessary plumbing to
set up a ring for reporting failed allocations in the unshare path.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 0e79f8005b6b -r 85157ed138a9 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -432,6 +432,13 @@ static void mem_access_notification(stru
         p2m_mem_access_resume(v->domain);
 }

+/* Registered with Xen-bound event channel for incoming notifications. */
+static void mem_sharing_notification(struct vcpu *v, unsigned int port)
+{
+    if ( likely(v->domain->mem_event->share.ring_page != NULL) )
+        mem_sharing_sharing_resume(v->domain);
+}
+
 struct domain *get_mem_event_op_target(uint32_t domain, int *rc)
 {
     struct domain *d;
@@ -599,6 +606,40 @@ int mem_event_domctl(struct domain *d, x
     }
     break;

+    case XEN_DOMCTL_MEM_EVENT_OP_SHARING:
+    {
+        struct mem_event_domain *med = &d->mem_event->share;
+        rc = -EINVAL;
+
+        switch( mec->op )
+        {
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE:
+        {
+            rc = -ENODEV;
+            /* Only HAP is supported */
+            if ( !hap_enabled(d) )
+                break;
+
+            rc = mem_event_enable(d, mec, med, _VPF_mem_sharing,
+                                  HVM_PARAM_SHARING_RING_PFN,
+                                  mem_sharing_notification);
+        }
+        break;
+
+        case XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE:
+        {
+            if ( med->ring_page )
+                rc = mem_event_disable(d, med);
+        }
+        break;
+
+        default:
+            rc = -ENOSYS;
+            break;
+        }
+    }
+    break;
+
     default:
         rc = -ENOSYS;
     }
diff -r 0e79f8005b6b -r 85157ed138a9 xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -710,7 +710,7 @@ struct xen_domctl_gdbsx_domstatus {
 /* XEN_DOMCTL_mem_event_op */

 /*
-* Domain memory paging
+ * Domain memory paging
  * Page memory in and out.
  * Domctl interface to set up and tear down the
  * pager<->hypervisor interface. Use XENMEM_paging_op*
@@ -740,6 +740,24 @@ struct xen_domctl_gdbsx_domstatus {
 #define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE   0
 #define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_DISABLE  1

+/*
+ * Sharing ENOMEM helper.
+ *
+ * As with paging, use the domctl for teardown/setup of the
+ * helper<->hypervisor interface.
+ *
+ * If setup, this ring is used to communicate failed allocations
+ * in the unshare path. XENMEM_sharing_op_resume is used to wake up
+ * vcpus that could not unshare.
+ *
+ * Note that sharing can be turned on (as per the domctl below)
+ * *without* this ring being setup.
+ */
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING         3
+
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE  0
+#define XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE 1
+
 /* Use for teardown/setup of helper<->hypervisor interface for paging,
  * access and sharing.*/
 struct xen_domctl_mem_event_op {
diff -r 0e79f8005b6b -r 85157ed138a9 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -643,6 +643,9 @@ static inline struct domain *next_domain
  /* VCPU is blocked due to missing mem_access ring. */
 #define _VPF_mem_access      5
 #define VPF_mem_access       (1UL<<_VPF_mem_access)
+ /* VCPU is blocked due to missing mem_sharing ring. */
+#define _VPF_mem_sharing     6
+#define VPF_mem_sharing      (1UL<<_VPF_mem_sharing)

 static inline int vcpu_runnable(struct vcpu *v)
 {
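To see where the new ring would be fed from, here is a hedged sketch of the
producer side in the unshare path — the actual mem_sharing.c changes are not
part of this patch, and the field and function uses here are assumptions based
on the existing mem_event API:

    /* Hypothetical producer in the unshare path: on allocation failure,
     * pause the vcpu and post an event; the helper frees memory and then
     * calls XENMEM_sharing_op_resume to wake the vcpu back up. */
    if ( page == NULL /* allocation failed during unshare */ )
    {
        mem_event_request_t req = {
            .gfn     = gfn,
            .vcpu_id = v->vcpu_id,
            .flags   = MEM_EVENT_FLAG_VCPU_PAUSED,
        };

        vcpu_pause_nosync(v);
        mem_event_put_request(d, &d->mem_event->share, &req);
    }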
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 5 of 7] Tools: libxc side for setting up the mem sharing ring
 tools/libxc/xc_memshr.c | 25 +++++++++++++++++++++++++
 tools/libxc/xenctrl.h   |  5 +++++
 2 files changed, 30 insertions(+), 0 deletions(-)

This ring is used to report failed allocations in the unshare path.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 85157ed138a9 -r 9c6cbbe72dc4 tools/libxc/xc_memshr.c
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -42,6 +42,31 @@ int xc_memshr_control(xc_interface *xch,
     return do_domctl(xch, &domctl);
 }

+int xc_memshr_ring_enable(xc_interface *xch,
+                          domid_t domid,
+                          uint32_t *port)
+{
+    if ( !port )
+    {
+        errno = -EINVAL;
+        return -1;
+    }
+
+    return xc_mem_event_control(xch, domid,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING,
+                                port);
+}
+
+int xc_memshr_ring_disable(xc_interface *xch,
+                           domid_t domid)
+{
+    return xc_mem_event_control(xch, domid,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING_DISABLE,
+                                XEN_DOMCTL_MEM_EVENT_OP_SHARING,
+                                NULL);
+}
+
 static int xc_memshr_memop(xc_interface *xch, domid_t domid,
                             xen_mem_sharing_op_t *mso)
 {
diff -r 85157ed138a9 -r 9c6cbbe72dc4 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1915,6 +1915,11 @@ int xc_mem_access_resume(xc_interface *x
 int xc_memshr_control(xc_interface *xch,
                       domid_t domid,
                       int enable);
+int xc_memshr_ring_enable(xc_interface *xch,
+                          domid_t domid,
+                          uint32_t *port);
+int xc_memshr_ring_disable(xc_interface *xch,
+                           domid_t domid);
 int xc_memshr_nominate_gfn(xc_interface *xch,
                            domid_t domid,
                            unsigned long gfn,
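A sharing-aware helper would consume this the same way as the paging and
access rings; a minimal sketch, assuming already-open xch/xce handles and
eliding error handling:

    uint32_t port;
    unsigned long ring_pfn;
    void *ring_page;
    mem_event_back_ring_t back_ring;

    /* Map the reserved sharing-ring pfn, then enable and bind. */
    xc_get_hvm_param(xch, domid, HVM_PARAM_SHARING_RING_PFN, &ring_pfn);
    ring_page = xc_map_foreign_batch(xch, domid, PROT_READ | PROT_WRITE,
                                     &ring_pfn, 1);

    xc_memshr_ring_enable(xch, domid, &port);
    xc_evtchn_bind_interdomain(xce, domid, port);

    SHARED_RING_INIT((mem_event_sring_t *)ring_page);
    BACK_RING_INIT(&back_ring, (mem_event_sring_t *)ring_page, PAGE_SIZE);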
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 6 of 7] x86/mm: Clean up mem event structures on domain destruction
 xen/arch/x86/mm/mem_event.c | 11 +++++++++++
 xen/common/domain.c         |  3 +++
 xen/include/asm-arm/mm.h    |  3 ++-
 xen/include/asm-ia64/mm.h   |  3 ++-
 xen/include/asm-x86/mm.h    |  2 ++
 5 files changed, 20 insertions(+), 2 deletions(-)

Otherwise we wind up with zombie domains, still holding onto refs to the mem
event ring pages.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -487,6 +487,17 @@ int do_mem_event_op(int op, uint32_t dom
     return ret;
 }

+/* Clean up on domain destruction */
+void mem_event_cleanup(struct domain *d)
+{
+    if ( d->mem_event->paging.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->paging);
+    if ( d->mem_event->access.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->access);
+    if ( d->mem_event->share.ring_page )
+        (void)mem_event_disable(d, &d->mem_event->share);
+}
+
 int mem_event_domctl(struct domain *d, xen_domctl_mem_event_op_t *mec,
                      XEN_GUEST_HANDLE(void) u_domctl)
 {
diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/common/domain.c
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -480,6 +480,9 @@ int domain_kill(struct domain *d)
             break;
         }
         d->is_dying = DOMDYING_dead;
+        /* Mem event cleanup has to go here because the rings
+         * have to be put before we call put_domain. */
+        mem_event_cleanup(d);
         put_domain(d);
         send_global_virq(VIRQ_DOM_EXC);
         /* fallthrough */
diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/include/asm-arm/mm.h
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -266,7 +266,8 @@ int get_page(struct page_info *page, st
     machine_to_phys_mapping[(mfn)] = (pfn);    \
 })

-#define put_gfn(d, g) ((void)0)
+#define put_gfn(d, g) ((void)0)
+#define mem_event_cleanup(d) ((void)0)

 #define INVALID_MFN             (~0UL)

diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/include/asm-ia64/mm.h
--- a/xen/include/asm-ia64/mm.h
+++ b/xen/include/asm-ia64/mm.h
@@ -551,7 +551,8 @@ extern u64 translate_domain_pte(u64 ptev
     gmfn_to_mfn_foreign((_d), (gpfn))

 #define get_gfn_untyped(d, gpfn) gmfn_to_mfn(d, gpfn)
-#define put_gfn(d, g) ((void)0)
+#define put_gfn(d, g) ((void)0)
+#define mem_event_cleanup(d) ((void)0)

 #define __gpfn_invalid(_d, gpfn)                        \
     (lookup_domain_mpa((_d), ((gpfn)<<PAGE_SHIFT), NULL) == INVALID_MFN)
diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/include/asm-x86/mm.h
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -634,6 +634,8 @@ unsigned int domain_clamp_alloc_bitsize(

 unsigned long domain_get_maximum_gpfn(struct domain *d);

+void mem_event_cleanup(struct domain *d);
+
 extern struct domain *dom_xen, *dom_io, *dom_cow;      /* for vmcoreinfo */

 /* Definition of an mm lock: spinlock with extra fields for debugging */
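Condensed, the ordering constraint the comment describes is (paraphrasing the
domain_kill() hunk above):

    d->is_dying = DOMDYING_dead;
    /* The rings hold page refs; put them before the final put_domain(),
     * or the domain lingers as a zombie holding those pages. */
    mem_event_cleanup(d);
    put_domain(d);
    send_global_virq(VIRQ_DOM_EXC);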
Andres Lagar-Cavilla
2012-Feb-23 06:05 UTC
[PATCH 7 of 7] x86/mm: Fix mem event error message typos
 xen/arch/x86/mm/mem_event.c | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 1a76b2f7641b -r 642df88bf5a0 xen/arch/x86/mm/mem_event.c
--- a/xen/arch/x86/mm/mem_event.c
+++ b/xen/arch/x86/mm/mem_event.c
@@ -505,13 +505,13 @@ int mem_event_domctl(struct domain *d, x

     if ( unlikely(d == current->domain) )
     {
-        gdprintk(XENLOG_INFO, "Tried to do a memory paging op on itself.\n");
+        gdprintk(XENLOG_INFO, "Tried to do a memory event op on itself.\n");
         return -EINVAL;
     }

     if ( unlikely(d->is_dying) )
     {
-        gdprintk(XENLOG_INFO, "Ignoring memory paging op on dying domain %u\n",
+        gdprintk(XENLOG_INFO, "Ignoring memory event op on dying domain %u\n",
                  d->domain_id);
         return 0;
     }
@@ -519,7 +519,7 @@ int mem_event_domctl(struct domain *d, x
     if ( unlikely(d->vcpu == NULL) || unlikely(d->vcpu[0] == NULL) )
     {
         gdprintk(XENLOG_INFO,
-                 "Memory paging op on a domain (%u) with no vcpus\n",
+                 "Memory event op on a domain (%u) with no vcpus\n",
                  d->domain_id);
         return -EINVAL;
     }
Olaf Hering
2012-Feb-23 14:29 UTC
Re: [PATCH 1 of 7] Tools: Sanitize mem_event/access/paging interfaces
On Thu, Feb 23, Andres Lagar-Cavilla wrote:

> In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API change,
> so please voice any concerns.

Too many changes already since 4.1; out-of-tree binaries are already covered
by the bumped SONAME and bumped domctl version.

Acked-by: Olaf Hering <olaf@aepfle.de>
Olaf Hering
2012-Feb-23 14:32 UTC
Re: [PATCH 6 of 7] x86/mm: Clean up mem event structures on domain destruction
On Thu, Feb 23, Andres Lagar-Cavilla wrote:

> +++ b/xen/include/asm-arm/mm.h
> @@ -266,7 +266,8 @@ int get_page(struct page_info *page, st
>      machine_to_phys_mapping[(mfn)] = (pfn);    \
>  })
>
> -#define put_gfn(d, g) ((void)0)
> +#define put_gfn(d, g) ((void)0)
> +#define mem_event_cleanup(d) ((void)0)
>
>  #define INVALID_MFN             (~0UL)
>
> diff -r 9c6cbbe72dc4 -r 1a76b2f7641b xen/include/asm-ia64/mm.h
> --- a/xen/include/asm-ia64/mm.h
> +++ b/xen/include/asm-ia64/mm.h
> @@ -551,7 +551,8 @@ extern u64 translate_domain_pte(u64 ptev
>      gmfn_to_mfn_foreign((_d), (gpfn))
>
>  #define get_gfn_untyped(d, gpfn) gmfn_to_mfn(d, gpfn)
> -#define put_gfn(d, g) ((void)0)
> +#define put_gfn(d, g) ((void)0)
> +#define mem_event_cleanup(d) ((void)0)

This is C, not cpp, so these should have been like this in the first place:

static inline int foo( .... )
{
    return 0;
}

Olaf
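Concretely, Olaf's suggestion applied to the new stub would presumably read (a
sketch of the suggestion, not what the patch does):

    /* Type-checked no-op stub instead of a function-like macro. */
    static inline void mem_event_cleanup(struct domain *d)
    {
        /* Nothing to do on architectures without mem events. */
    }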
Tim Deegan
Re: [PATCH 0 of 7] Mem event ring setup interface update

At 01:05 -0500 on 23 Feb (1329959105), Andres Lagar-Cavilla wrote:

> Update the interface for setting up mem event rings (for sharing, mem access
> or paging).
>
> Remove the "shared page", which was a waste of a whole page for a single
> event channel port value.
>
> More importantly, both the shared page and the ring page were dom0
> user-space process pages mapped by the hypervisor. If the dom0 process does
> not clean up, the hypervisor keeps posting events (and holding a map) to a
> page now belonging to another process.
>
> Solutions proposed:
> - Pass the event channel port explicitly as part of the domctl payload.
> - Reserve a pfn in the guest physmap for each mem event ring. Set/retrieve
>   these pfns via hvm params. Ensure they are set during build and restore,
>   and retrieved during save. Ensure these pages don't leak and domains are
>   not left zombie.
>
> In all cases mem event consumers in-tree (xenpaging and xen-access) have
> been updated.
>
> Updating the interface to deal with these problems requires
> backwards-incompatible changes on both the helper<->libxc and
> libxc<->hypervisor interfaces.
>
> Take advantage of the interface update to plumb setting up of the sharing
> ring, which was missing.
>
> All patches touch x86/mm hypervisor bits. Patches 1, 3 and 5 are tools
> patches as well.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

For the Xen parts:

Acked-by: Tim Deegan <tim@xen.org>

Cheers,

Tim.
Andres Lagar-Cavilla
2012-Feb-27 22:43 UTC
Re: [PATCH 0 of 7] Mem event ring setup interface update
> At 01:05 -0500 on 23 Feb (1329959105), Andres Lagar-Cavilla wrote:
>> Update the interface for setting up mem event rings (for sharing, mem
>> access or paging).
>>
>> Remove the "shared page", which was a waste of a whole page for a single
>> event channel port value.
>>
>> More importantly, both the shared page and the ring page were dom0
>> user-space process pages mapped by the hypervisor. If the dom0 process
>> does not clean up, the hypervisor keeps posting events (and holding a
>> map) to a page now belonging to another process.
>>
>> Solutions proposed:
>> - Pass the event channel port explicitly as part of the domctl payload.
>> - Reserve a pfn in the guest physmap for each mem event ring. Set/retrieve
>>   these pfns via hvm params. Ensure they are set during build and restore,
>>   and retrieved during save. Ensure these pages don't leak and domains are
>>   not left zombie.
>>
>> In all cases mem event consumers in-tree (xenpaging and xen-access) have
>> been updated.
>>
>> Updating the interface to deal with these problems requires
>> backwards-incompatible changes on both the helper<->libxc and
>> libxc<->hypervisor interfaces.
>>
>> Take advantage of the interface update to plumb setting up of the sharing
>> ring, which was missing.
>>
>> All patches touch x86/mm hypervisor bits. Patches 1, 3 and 5 are tools
>> patches as well.
>>
>> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>
> For the Xen parts:
> Acked-by: Tim Deegan <tim@xen.org>

Ping, on the tools side. Also note the Acked-by from Olaf.

Thanks,
Andres

>
> Cheers,
>
> Tim.
>
Ian Campbell
2012-Feb-28 12:56 UTC
Re: [PATCH 1 of 7] Tools: Sanitize mem_event/access/paging interfaces
On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:

>  tools/libxc/xc_mem_access.c         | 10 ++++++++--
>  tools/libxc/xc_mem_event.c          | 13 ++++++++-----
>  tools/libxc/xc_mem_paging.c         | 10 ++++++++--
>  tools/libxc/xenctrl.h               |  6 +++---
>  tools/tests/xen-access/xen-access.c | 22 ++++------------------
>  tools/xenpaging/xenpaging.c         | 18 +++---------------
>  tools/xenpaging/xenpaging.h         |  2 +-
>  xen/arch/x86/mm/mem_event.c         | 33 +++------------------------------
>  xen/include/public/domctl.h         |  4 ++--
>  xen/include/public/mem_event.h      |  4 ----
>  xen/include/xen/sched.h             |  2 --
>  11 files changed, 40 insertions(+), 84 deletions(-)
>
>
> Don't use the superfluous shared page, return the event channel directly as
> part of the domctl struct, instead.
>
> In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API change,
> so please voice any concerns.
>
> Known pending issues:
> - pager could die and its ring page could be used by some other process, yet
>   Xen retains the mapping to it.
> - use a saner interface for the paging_load buffer.
>
> This change also affects the x86/mm bits in the hypervisor that process the
> mem_event setup domctl.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>
> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_access.c
> --- a/tools/libxc/xc_mem_access.c
> +++ b/tools/libxc/xc_mem_access.c
> @@ -25,12 +25,18 @@
>
>
>  int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
> -                         void *shared_page, void *ring_page)
> +                         uint32_t *port, void *ring_page)
>  {
> +    if ( !port )
> +    {
> +        errno = -EINVAL;

Aren't errno vals normally +ve?

Is there any situation where port==NULL would be valid, i.e. any chance that
this might happen, or are such callers just buggy?

> +        return -1;
> +    }

> +
>      return xc_mem_event_control(xch, domain_id,
>                                  XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
>                                  XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
> -                                shared_page, ring_page);
> +                                port, ring_page);
>  }
>
>  int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_event.c
> --- a/tools/libxc/xc_mem_event.c
> +++ b/tools/libxc/xc_mem_event.c
> @@ -24,19 +24,22 @@
>  #include "xc_private.h"
>
>  int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned int op,
> -                         unsigned int mode, void *page, void *ring_page)
> +                         unsigned int mode, uint32_t *port, void *ring_page)
>  {
>      DECLARE_DOMCTL;
> +    int rc;
>
>      domctl.cmd = XEN_DOMCTL_mem_event_op;
>      domctl.domain = domain_id;
>      domctl.u.mem_event_op.op = op;
>      domctl.u.mem_event_op.mode = mode;
> -
> -    domctl.u.mem_event_op.shared_addr = (unsigned long)page;
> -    domctl.u.mem_event_op.ring_addr = (unsigned long)ring_page;
> +    domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;
>
> -    return do_domctl(xch, &domctl);
> +    errno = 0;
> +    rc = do_domctl(xch, &domctl);
> +    if ( !rc && port )
> +        *port = domctl.u.mem_event_op.port;
> +    return rc;

Clearing errno is an unusual pattern; normally callers only expect errno to
contain a valid value on failure. Are you trying to provide some backwards
compatibility with something, or is there another reason for doing it this
way?

> }
>
>  int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_paging.c
> --- a/tools/libxc/xc_mem_paging.c
> +++ b/tools/libxc/xc_mem_paging.c
> @@ -25,12 +25,18 @@
>
>
>  int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
> -                         void *shared_page, void *ring_page)
> +                         uint32_t *port, void *ring_page)
>  {
> +    if ( !port )
> +    {
> +        errno = -EINVAL;

Same comments as before.

> +        return -1;
> +    }
> +
>      return xc_mem_event_control(xch, domain_id,
>                                  XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
>                                  XEN_DOMCTL_MEM_EVENT_OP_PAGING,
> -                                shared_page, ring_page);
> +                                port, ring_page);
>  }
>

Ian.
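For reference, the conventional libxc pattern Ian is pointing at sets a
positive errno with a -1 return; Andres agrees to this fix downthread:

    if ( !port )
    {
        errno = EINVAL;   /* positive errno value, per convention */
        return -1;
    }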
Ian Campbell
2012-Feb-28 13:01 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:

Does this mean that a guest can now potentially observe, or even modify,
its own mem event ring? Are we sure there's no potential for havoc here?

Is there no scope for making these pages owned by the domain but not
actually part of the P2M? We can cope with that for other types of magic
page, can't we?

I didn't actually dig into the implementation other than to see if it
answered my questions, although I did notice:

> diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_hvm_build.c
> --- a/tools/libxc/xc_hvm_build.c
> +++ b/tools/libxc/xc_hvm_build.c
> @@ -38,12 +38,15 @@
> #define SUPERPAGE_1GB_SHIFT 18
> #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
>
> -#define SPECIALPAGE_BUFIOREQ 0
> -#define SPECIALPAGE_XENSTORE 1
> -#define SPECIALPAGE_IOREQ 2
> -#define SPECIALPAGE_IDENT_PT 3
> -#define SPECIALPAGE_CONSOLE 4
> -#define NR_SPECIAL_PAGES 5
> +#define SPECIALPAGE_PAGING 0
> +#define SPECIALPAGE_ACCESS 1
> +#define SPECIALPAGE_SHARING 2
> +#define SPECIALPAGE_BUFIOREQ 3
> +#define SPECIALPAGE_XENSTORE 4
> +#define SPECIALPAGE_IOREQ 5
> +#define SPECIALPAGE_IDENT_PT 6
> +#define SPECIALPAGE_CONSOLE 7
> +#define NR_SPECIAL_PAGES 8

Any reason to not simply append them?

Ian.
Ian Campbell
2012-Feb-28 13:02 UTC
Re: [PATCH 5 of 7] Tools: libxc side for setting up the mem sharing ring
On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:
> tools/libxc/xc_memshr.c | 25 +++++++++++++++++++++++++
> tools/libxc/xenctrl.h | 5 +++++
> 2 files changed, 30 insertions(+), 0 deletions(-)
>
>
> This ring is used to report failed allocations in the unshare path.
>
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>
> diff -r 85157ed138a9 -r 9c6cbbe72dc4 tools/libxc/xc_memshr.c
> --- a/tools/libxc/xc_memshr.c
> +++ b/tools/libxc/xc_memshr.c
> @@ -42,6 +42,31 @@ int xc_memshr_control(xc_interface *xch,
> return do_domctl(xch, &domctl);
> }
>
> +int xc_memshr_ring_enable(xc_interface *xch,
> + domid_t domid,
> + uint32_t *port)
> +{
> + if ( !port )
> + {
> + errno = -EINVAL;

+ve.

> + return -1;
> + }
> +
> + return xc_mem_event_control(xch, domid,
> + XEN_DOMCTL_MEM_EVENT_OP_SHARING_ENABLE,
> + XEN_DOMCTL_MEM_EVENT_OP_SHARING,
> + port);
> +}

Ian.
Andres Lagar-Cavilla
2012-Feb-28 15:14 UTC
Re: [PATCH 1 of 7] Tools: Sanitize mem_event/access/paging interfaces
> On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:
>> tools/libxc/xc_mem_access.c | 10 ++++++++--
>> tools/libxc/xc_mem_event.c | 13 ++++++++-----
>> tools/libxc/xc_mem_paging.c | 10 ++++++++--
>> tools/libxc/xenctrl.h | 6 +++---
>> tools/tests/xen-access/xen-access.c | 22 ++++------------------
>> tools/xenpaging/xenpaging.c | 18 +++---------------
>> tools/xenpaging/xenpaging.h | 2 +-
>> xen/arch/x86/mm/mem_event.c | 33
>> +++------------------------------
>> xen/include/public/domctl.h | 4 ++--
>> xen/include/public/mem_event.h | 4 ----
>> xen/include/xen/sched.h | 2 --
>> 11 files changed, 40 insertions(+), 84 deletions(-)
>>
>>
>> Don't use the superfluous shared page, return the event channel directly
>> as
>> part of the domctl struct, instead.
>>
>> In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API
>> change,
>> so please voice any concerns.
>>
>> Known pending issues:
>> - pager could die and its ring page could be used by some other process,
>> yet
>> Xen retains the mapping to it.
>> - use a saner interface for the paging_load buffer.
>>
>> This change also affects the x86/mm bits in the hypervisor that process
>> the
>> mem_event setup domctl.
>>
>> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>>
>> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_access.c
>> --- a/tools/libxc/xc_mem_access.c
>> +++ b/tools/libxc/xc_mem_access.c
>> @@ -25,12 +25,18 @@
>>
>>
>> int xc_mem_access_enable(xc_interface *xch, domid_t domain_id,
>> - void *shared_page, void *ring_page)
>> + uint32_t *port, void *ring_page)
>> {
>> + if ( !port )
>> + {
>> + errno = -EINVAL;
>
> Aren't errno vals normally +ve?
>
> Is there any situation where port==NULL would be valid, i.e. any chance
> that this might happen, or are such callers just buggy?

Definitely, it should be errno = EINVAL. I'll fix this in this patch and
the sharing ring one.

>
>> + return -1;
>> + }
>
>> +
>> return xc_mem_event_control(xch, domain_id,
>> XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE,
>> XEN_DOMCTL_MEM_EVENT_OP_ACCESS,
>> - shared_page, ring_page);
>> + port, ring_page);
>> }
>>
>> int xc_mem_access_disable(xc_interface *xch, domid_t domain_id)
>> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_event.c
>> --- a/tools/libxc/xc_mem_event.c
>> +++ b/tools/libxc/xc_mem_event.c
>> @@ -24,19 +24,22 @@
>> #include "xc_private.h"
>>
>> int xc_mem_event_control(xc_interface *xch, domid_t domain_id, unsigned
>> int op,
>> - unsigned int mode, void *page, void
>> *ring_page)
>> + unsigned int mode, uint32_t *port, void
>> *ring_page)
>> {
>> DECLARE_DOMCTL;
>> + int rc;
>>
>> domctl.cmd = XEN_DOMCTL_mem_event_op;
>> domctl.domain = domain_id;
>> domctl.u.mem_event_op.op = op;
>> domctl.u.mem_event_op.mode = mode;
>> -
>> - domctl.u.mem_event_op.shared_addr = (unsigned long)page;
>> - domctl.u.mem_event_op.ring_addr = (unsigned long)ring_page;
>> + domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;
>>
>> - return do_domctl(xch, &domctl);
>> + errno = 0;
>> + rc = do_domctl(xch, &domctl);
>> + if ( !rc && port )
>> + *port = domctl.u.mem_event_op.port;
>> + return rc;
>
> Clearing errno is an unusual pattern, normally callers only expect errno
> to contain a valid value on failure. Are you trying to provide some
> backwards compatibility with something or is there another reason for
> doing it this way?

I can remove that as well.

Thanks,
Andres

>
>> }
>>
>> int xc_mem_event_memop(xc_interface *xch, domid_t domain_id,
>> diff -r 8ddd13cc783e -r 3b24188d8815 tools/libxc/xc_mem_paging.c
>> --- a/tools/libxc/xc_mem_paging.c
>> +++ b/tools/libxc/xc_mem_paging.c
>> @@ -25,12 +25,18 @@
>>
>>
>> int xc_mem_paging_enable(xc_interface *xch, domid_t domain_id,
>> - void *shared_page, void *ring_page)
>> + uint32_t *port, void *ring_page)
>> {
>> + if ( !port )
>> + {
>> + errno = -EINVAL;
>
> Same comments as before.
>
>> + return -1;
>> + }
>> +
>> return xc_mem_event_control(xch, domain_id,
>> XEN_DOMCTL_MEM_EVENT_OP_PAGING_ENABLE,
>> XEN_DOMCTL_MEM_EVENT_OP_PAGING,
>> - shared_page, ring_page);
>> + port, ring_page);
>> }
>>
>
> Ian.
>
>
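
To summarize the agreement above, a minimal sketch of xc_mem_event_control
with both points addressed — a positive EINVAL, and no pre-clearing of errno
so that errno is only meaningful on failure. This is an illustration of the
convention being discussed, not the committed patch; for brevity the
NULL-port check is folded into the helper here, whereas the series keeps it
in the per-ring enable wrappers:

    int xc_mem_event_control(xc_interface *xch, domid_t domain_id,
                             unsigned int op, unsigned int mode,
                             uint32_t *port, void *ring_page)
    {
        DECLARE_DOMCTL;
        int rc;

        if ( !port )
        {
            errno = EINVAL;   /* positive errno value, per convention */
            return -1;
        }

        domctl.cmd = XEN_DOMCTL_mem_event_op;
        domctl.domain = domain_id;
        domctl.u.mem_event_op.op = op;
        domctl.u.mem_event_op.mode = mode;
        domctl.u.mem_event_op.ring_addr = (unsigned long) ring_page;

        /* No errno = 0 here: callers only look at errno when rc != 0. */
        rc = do_domctl(xch, &domctl);
        if ( !rc )
            *port = domctl.u.mem_event_op.port;  /* event channel from Xen */
        return rc;
    }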
Andres Lagar-Cavilla
2012-Feb-28 15:19 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:
>
> Does this mean that a guest can now potentially observe, or even modify,
> its own mem event ring? Are we sure there's no potential for havoc
> here?

This is the same mechanism we use to map the qemu ioreq and bufioreq
rings. These pfns lie within a reserved region as advertised by the e820
to the guest, so that's the protection we rely upon.

>
> Is there no scope for making these pages owned by the domain but not
> actually part of the P2M? We can cope with that for other types of magic
> page, can't we?

If you're referring to the qemu rings, I don't think so. They're like any
other page in the p2m. They get allocated with xc_populate_physmap_exact.

>
> I didn't actually dig into the implementation other than to see if it
> answered my questions, although I did notice:
>
>> diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_hvm_build.c
>> --- a/tools/libxc/xc_hvm_build.c
>> +++ b/tools/libxc/xc_hvm_build.c
>> @@ -38,12 +38,15 @@
>> #define SUPERPAGE_1GB_SHIFT 18
>> #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
>>
>> -#define SPECIALPAGE_BUFIOREQ 0
>> -#define SPECIALPAGE_XENSTORE 1
>> -#define SPECIALPAGE_IOREQ 2
>> -#define SPECIALPAGE_IDENT_PT 3
>> -#define SPECIALPAGE_CONSOLE 4
>> -#define NR_SPECIAL_PAGES 5
>> +#define SPECIALPAGE_PAGING 0
>> +#define SPECIALPAGE_ACCESS 1
>> +#define SPECIALPAGE_SHARING 2
>> +#define SPECIALPAGE_BUFIOREQ 3
>> +#define SPECIALPAGE_XENSTORE 4
>> +#define SPECIALPAGE_IOREQ 5
>> +#define SPECIALPAGE_IDENT_PT 6
>> +#define SPECIALPAGE_CONSOLE 7
>> +#define NR_SPECIAL_PAGES 8
>
> Any reason to not simply append them?

Yes: I didn't want to change the existing magic pfns. If I just append,
all pfns shift, and now the venerable store or ioreq pfns are in
different locations.

Andres

>
> Ian.
>
>
Andres Lagar-Cavilla
2012-Mar-01 02:29 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:
>
> Does this mean that a guest can now potentially observe, or even modify,
> its own mem event ring? Are we sure there's no potential for havoc
> here?
>
> Is there no scope for making these pages owned by the domain but not
> actually part of the P2M? We can cope with that for other types of magic
> page, can't we?
>
> I didn't actually dig into the implementation other than to see if it
> answered my questions, although I did notice:

Ian,
I'll resend now including the recent Acked-by's and your two tools-side
comments. We can work from there, and hopefully remove this blocker in
the 4.2 todo list in short order.

Thanks!
Andres

>
>> diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_hvm_build.c
>> --- a/tools/libxc/xc_hvm_build.c
>> +++ b/tools/libxc/xc_hvm_build.c
>> @@ -38,12 +38,15 @@
>> #define SUPERPAGE_1GB_SHIFT 18
>> #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
>>
>> -#define SPECIALPAGE_BUFIOREQ 0
>> -#define SPECIALPAGE_XENSTORE 1
>> -#define SPECIALPAGE_IOREQ 2
>> -#define SPECIALPAGE_IDENT_PT 3
>> -#define SPECIALPAGE_CONSOLE 4
>> -#define NR_SPECIAL_PAGES 5
>> +#define SPECIALPAGE_PAGING 0
>> +#define SPECIALPAGE_ACCESS 1
>> +#define SPECIALPAGE_SHARING 2
>> +#define SPECIALPAGE_BUFIOREQ 3
>> +#define SPECIALPAGE_XENSTORE 4
>> +#define SPECIALPAGE_IOREQ 5
>> +#define SPECIALPAGE_IDENT_PT 6
>> +#define SPECIALPAGE_CONSOLE 7
>> +#define NR_SPECIAL_PAGES 8
>
> Any reason to not simply append them?
>
> Ian.
>
>
Ian Campbell
2012-Mar-01 07:47 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
On Tue, 2012-02-28 at 15:19 +0000, Andres Lagar-Cavilla wrote:
> > On Thu, 2012-02-23 at 06:05 +0000, Andres Lagar-Cavilla wrote:
> >
> > Does this mean that a guest can now potentially observe, or even modify,
> > its own mem event ring? Are we sure there's no potential for havoc
> > here?
>
> This is the same mechanism we use to map the qemu ioreq and bufioreq
> rings. These pfns lie within a reserved region as advertised by the e820
> to the guest, so that's the protection we rely upon.

That still sounds fishy to me ;-)

But if people who actually know what's going on are happy with it then so
am I...

> >
> > Is there no scope for making these pages owned by the domain but not
> > actually part of the P2M? We can cope with that for other types of magic
> > page, can't we?
>
> If you're referring to the qemu rings,

I meant all rings of this sort (i.e. those associated with a domain but
not really part of it, IYSWIM).

> I don't think so. They're like any
> other page in the p2m. They get allocated with xc_populate_physmap_exact.
>
> >
> > I didn't actually dig into the implementation other than to see if it
> > answered my questions, although I did notice:
> >
> >> diff -r 99e6c9b9e971 -r 0e79f8005b6b tools/libxc/xc_hvm_build.c
> >> --- a/tools/libxc/xc_hvm_build.c
> >> +++ b/tools/libxc/xc_hvm_build.c
> >> @@ -38,12 +38,15 @@
> >> #define SUPERPAGE_1GB_SHIFT 18
> >> #define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
> >>
> >> -#define SPECIALPAGE_BUFIOREQ 0
> >> -#define SPECIALPAGE_XENSTORE 1
> >> -#define SPECIALPAGE_IOREQ 2
> >> -#define SPECIALPAGE_IDENT_PT 3
> >> -#define SPECIALPAGE_CONSOLE 4
> >> -#define NR_SPECIAL_PAGES 5
> >> +#define SPECIALPAGE_PAGING 0
> >> +#define SPECIALPAGE_ACCESS 1
> >> +#define SPECIALPAGE_SHARING 2
> >> +#define SPECIALPAGE_BUFIOREQ 3
> >> +#define SPECIALPAGE_XENSTORE 4
> >> +#define SPECIALPAGE_IOREQ 5
> >> +#define SPECIALPAGE_IDENT_PT 6
> >> +#define SPECIALPAGE_CONSOLE 7
> >> +#define NR_SPECIAL_PAGES 8
> >
> > Any reason to not simply append them?
>
> Yes: I didn't want to change the existing magic pfns. If I just append,
> all pfns shift, and now the venerable store or ioreq pfns are in
> different locations.

Ah, because we count the indexes from TOP-NR_SPECIAL_PAGES so increasing
NR_SPECIAL_PAGES knocks everything down?

As a separate cleanup perhaps these should be -ve offsets from TOP? Or
maybe the specific offsets don't actually need preserving since they
aren't important outside this file?

Ian.
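
To make the arithmetic concrete, a sketch under the assumption Ian states —
that each special pfn is TOP-NR_SPECIAL_PAGES plus its index. The 0xff000
boundary below is illustrative, not necessarily the real constant in
xc_hvm_build.c:

    /* Hypothetical layout: the special pages occupy a block ending just
     * below a fixed guest pfn boundary. */
    #define SPECIAL_PFN_BOUNDARY 0xff000u   /* assumed value */
    #define special_pfn(x) (SPECIAL_PFN_BOUNDARY - NR_SPECIAL_PAGES + (x))

    /* Old layout: NR_SPECIAL_PAGES == 5, SPECIALPAGE_BUFIOREQ == 0
     *   special_pfn(0) == 0xff000 - 5 + 0 == 0xfeffb
     * Appending three pages (BUFIOREQ stays 0, NR becomes 8):
     *   special_pfn(0) == 0xff000 - 8 + 0 == 0xfeff8   -- moved
     * Prepending instead (BUFIOREQ becomes 3, NR becomes 8):
     *   special_pfn(3) == 0xff000 - 8 + 3 == 0xfeffb   -- unchanged */

Prepending the three new indexes therefore leaves every pre-existing magic
pfn where it was, which is exactly the property Andres is preserving.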
Tim Deegan
2012-Mar-01 15:17 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
> > > Is there no scope for making these pages owned by the domain but not
> > > actually part of the P2M? We can cope with that for other types of magic
> > > page, can't we?

It would need a new operation to map the ring into the tool that uses
it; normal map-foreign-page ops need a GFN.

But that sounds like a good idea all the same.

Tim.
Andres Lagar-Cavilla
2012-Mar-01 16:26 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
>> > > Is there no scope for making these pages owned by the domain but not
>> > > actually part of the P2M? We can cope with that for other types of
>> magic
>> > > page, can't we?
>
> It would need a new operation to map the ring into the tool that uses
> it; normal map-foreign-page ops need a GFN.

AFAICT, the qemu ring works this way, and sits right next to the store
shared page. So PV guest drivers can get to the qemu ring. I guess no one
is concerned because the worst that could happen is DoS for the naughty
guest. Same with these paging rings.

What I will try to do is call decrease_reservation on the ring pfn once
it's mapped. That way the guest won't have access to it, but the ring
will live due to references from both Xen and the dom0 helper. I would
send that as a follow-up patch though. I think the interface is ok as is.

And if decrease_reservation solves the naughty guest problem, then that
should probably go into qemu as well.

Andres

>
> But that sounds like a good idea all the same.
>
> Tim.
>
Andres Lagar-Cavilla
2012-Mar-01 16:30 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
>> > > Is there no scope for making these pages owned by the domain but not
>> > > actually part of the P2M? We can cope with that for other types of
>> magic
>> > > page, can't we?
>
> It would need a new operation to map the ring into the tool that uses
> it; normal map-foreign-page ops need a GFN.

Actually, confirmed: we can call xc_domain_decrease_reservation on the
ring after it's mapped by the helper. Guest won't get at it. Nothing
breaks.

I'll send it as a follow-up patch (for xenpaging and xen-access) once
this series is finalized.

Thanks,
Andres

>
> But that sounds like a good idea all the same.
>
> Tim.
>
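
As a sketch of the sequence being agreed here: map the ring by gfn, then
drop the gfn from the guest physmap, leaving the page alive through the
dom0 mapping and Xen's own reference. This assumes the
HVM_PARAM_PAGING_RING_PFN parameter this series introduces, and abbreviates
error handling; the helper name is hypothetical:

    #include <sys/mman.h>
    #include <xenctrl.h>
    #include <xen/hvm/params.h>

    static void *map_and_hide_ring(xc_interface *xch, domid_t domid)
    {
        unsigned long ring_pfn;
        xen_pfn_t pfn;
        void *ring;

        /* The ring gfn is advertised via an HVM param set at build time. */
        if ( xc_get_hvm_param(xch, domid, HVM_PARAM_PAGING_RING_PFN,
                              &ring_pfn) )
            return NULL;

        ring = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                    PROT_READ | PROT_WRITE, ring_pfn);
        if ( !ring )
            return NULL;

        /* Remove the gfn from the guest physmap; the mapping above and
         * Xen's own reference keep the underlying page alive. */
        pfn = ring_pfn;
        if ( xc_domain_decrease_reservation_exact(xch, domid, 1, 0, &pfn) )
        {
            munmap(ring, XC_PAGE_SIZE);
            return NULL;
        }

        return ring;
    }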
Tim Deegan
2012-Mar-01 17:32 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
At 08:30 -0800 on 01 Mar (1330590637), Andres Lagar-Cavilla wrote:
> > At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
> >> > > Is there no scope for making these pages owned by the domain but not
> >> > > actually part of the P2M? We can cope with that for other types of
> >> magic
> >> > > page, can't we?
> >
> > It would need a new operation to map the ring into the tool that uses
> > it; normal map-foreign-page ops need a GFN.
>
> Actually, confirmed: we can call xc_domain_decrease_reservation on the
> ring after it's mapped by the helper. Guest won't get at it. Nothing
> breaks.

But that would only work if:
 - the helper always attaches before the guest gets to run; and
 - you never need to restart the helper.

Tim.
Andres Lagar-Cavilla
2012-Mar-01 17:44 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> At 08:30 -0800 on 01 Mar (1330590637), Andres Lagar-Cavilla wrote:
>> > At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
>> >> > > Is there no scope for making these pages owned by the domain but
>> not
>> >> > > actually part of the P2M? We can cope with that for other types
>> of
>> >> magic
>> >> > > page, can't we?
>> >
>> > It would need a new operation to map the ring into the tool that uses
>> > it; normal map-foreign-page ops need a GFN.
>>
>> Actually, confirmed: we can call xc_domain_decrease_reservation on the
>> ring after it's mapped by the helper. Guest won't get at it. Nothing
>> breaks.
>
> But that would only work if:
> - the helper always attaches before the guest gets to run; and

The helper will ignore whatever contents there were on the page. And if
the guest is out there poking in e820 reserved ranges, then the guest has
it coming. Note that we've narrowed the window of "vulnerability".

> - you never need to restart the helper.

The helper can re-populate the pfn every time it starts. That's
contemplated in the current patch series. In fact it can do it
"atomically" by pausing the guest.

So, it's an improvement, but it's not watertight or fool-proof.

Andres

>
> Tim.
>
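
A sketch of that restart path, reusing the hypothetical map_and_hide_ring()
helper from the earlier sketch: pause the domain, put a fresh page back at
the ring gfn, remap and hide it, then unpause, so the swap is atomic with
respect to the guest. This is a simplification of what the series does, not
the actual xenpaging code:

    static void *reattach_ring(xc_interface *xch, domid_t domid,
                               unsigned long ring_pfn)
    {
        xen_pfn_t pfn = ring_pfn;
        void *ring = NULL;

        if ( xc_domain_pause(xch, domid) )
            return NULL;

        /* Re-populate the gfn in case a previous helper instance removed
         * it from the physmap. (A more careful helper would try the map
         * first and only populate on failure.) */
        if ( xc_domain_populate_physmap_exact(xch, domid, 1, 0, 0, &pfn) == 0 )
            ring = map_and_hide_ring(xch, domid);

        xc_domain_unpause(xch, domid);
        return ring;
    }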
Tim Deegan
2012-Mar-01 17:52 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
At 09:44 -0800 on 01 Mar (1330595065), Andres Lagar-Cavilla wrote:
> > At 08:30 -0800 on 01 Mar (1330590637), Andres Lagar-Cavilla wrote:
> >> > At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
> >> >> > > Is there no scope for making these pages owned by the domain but
> >> not
> >> >> > > actually part of the P2M? We can cope with that for other types
> >> of
> >> >> magic
> >> >> > > page, can't we?
> >> >
> >> > It would need a new operation to map the ring into the tool that uses
> >> > it; normal map-foreign-page ops need a GFN.
> >>
> >> Actually, confirmed: we can call xc_domain_decrease_reservation on the
> >> ring after it's mapped by the helper. Guest won't get at it. Nothing
> >> breaks.
> >
> > But that would only work if:
> > - the helper always attaches before the guest gets to run; and
>
> The helper will ignore whatever contents there were on the page. And if
> the guest is out there poking in e820 reserved ranges, then the guest has
> it coming. Note that we've narrowed the window of "vulnerability".
>
> > - you never need to restart the helper.
>
> The helper can re-populate the pfn every time it starts. That's
> contemplated in the current patch series. In fact it can do it
> "atomically" by pausing the guest.

OK, that seems good enough.

Tim.
Andres Lagar-Cavilla
2012-Mar-06 19:27 UTC
Re: [PATCH 3 of 7] Use a reserved pfn in the guest address space to store mem event rings
> At 09:44 -0800 on 01 Mar (1330595065), Andres Lagar-Cavilla wrote:
>> > At 08:30 -0800 on 01 Mar (1330590637), Andres Lagar-Cavilla wrote:
>> >> > At 07:47 +0000 on 01 Mar (1330588073), Ian Campbell wrote:
>> >> >> > > Is there no scope for making these pages owned by the domain
>> but
>> >> not
>> >> >> > > actually part of the P2M? We can cope with that for other
>> types
>> >> of
>> >> >> magic
>> >> >> > > page, can't we?
>> >> >
>> >> > It would need a new operation to map the ring into the tool that
>> uses
>> >> > it; normal map-foreign-page ops need a GFN.
>> >>
>> >> Actually, confirmed: we can call xc_domain_decrease_reservation on
>> the
>> >> ring after it's mapped by the helper. Guest won't get at it. Nothing
>> >> breaks.
>> >
>> > But that would only work if:
>> > - the helper always attaches before the guest gets to run; and
>>
>> The helper will ignore whatever contents there were on the page. And if
>> the guest is out there poking in e820 reserved ranges, then the guest
>> has
>> it coming. Note that we've narrowed the window of "vulnerability".
>>
>> > - you never need to restart the helper.
>>
>> The helper can re-populate the pfn every time it starts. That's
>> contemplated in the current patch series. In fact it can do it
>> "atomically" by pausing the guest.
>
> OK, that seems good enough.

Great, so I'll post this whole series again. I'll include the "remove
ring from physmap" bits. And I'll add an Acked-by from Ian C for the
tools bits.

Cheers!
Andres

>
> Tim.
>