Stefano Stabellini
2013-Jul-31 17:57 UTC
[PATCH RFC 0/3] introduce XENMEM_get_dma_buf and XENMEM_put_dma_buf
Hi all,
this patch series introduces two new hypercalls to allow autotranslate
guests to allocate a contiguous buffer in machine addresses.
XENMEM_get_dma_buf returns the mfns and makes sure to pin the pages so
that the hypervisor won't change their p2m mappings while in use.
XENMEM_put_dma_buf simply unpins the pages.

The implementation of XENMEM_put_dma_buf is missing, as it's actually
unused.

The page pinning is also missing from this series. I would appreciate
feedback on the best way to implement it, especially on x86.

Cheers,

Stefano


Stefano Stabellini (3):
      xen/arm: implement steal_page
      xen: provide empty stubs for guest_physmap_(un)pin_range on arm and x86
      xen: introduce XENMEM_get_dma_buf and XENMEM_put_dma_buf

 xen/arch/arm/mm.c           |   43 +++++++++++++++++++++++++++++
 xen/common/memory.c         |   20 +++++++++++--
 xen/include/asm-arm/mm.h    |   12 ++++++++
 xen/include/asm-x86/p2m.h   |   12 ++++++++
 xen/include/public/memory.h |   64 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 148 insertions(+), 3 deletions(-)
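To make the intended calling convention concrete, a guest-side caller could look
roughly like the sketch below. This is illustration only, not part of the series:
the helper name is made up, and it assumes the xen_get_dma_buf layout introduced in
patch 3/3, the usual HYPERVISOR_memory_op() wrapper available to Linux guests, and
XENMEMF_address_bits() to express the under-4G request.

/* Illustrative sketch only (not part of this series): exchange 2^order
 * scattered guest pages for one machine-contiguous, pinned extent. */
static int get_contiguous_dma_buf(xen_pfn_t *in_gpfns, /* 2^order GPFNs backing the buffer */
                                  xen_pfn_t *out_gpfn, /* GPFN base for the new extent */
                                  unsigned int order)
{
    struct xen_get_dma_buf args = {
        .in = {
            .nr_extents   = 1UL << order,  /* 2^order single pages in... */
            .extent_order = 0,
            .domid        = DOMID_SELF,
        },
        .out = {
            .nr_extents   = 1,             /* ...one contiguous extent out */
            .extent_order = order,
            .mem_flags    = XENMEMF_address_bits(32),  /* keep it under 4G */
            .domid        = DOMID_SELF,
        },
        .nr_exchanged = 0,                 /* must be zero on entry */
    };

    set_xen_guest_handle(args.in.extent_start, in_gpfns);
    set_xen_guest_handle(args.out.extent_start, out_gpfn);

    /* On success *out_gpfn is overwritten with the MFN of the pinned,
     * machine-contiguous allocation, usable as a DMA address. */
    return HYPERVISOR_memory_op(XENMEM_get_dma_buf, &args);
}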
Stefano Stabellini
2013-Jul-31 17:57
[PATCH RFC 1/3] xen/arm: implement steal_page
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/mm.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index f301e65..ea64c03 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -751,6 +751,49 @@ int donate_page(struct domain *d, struct page_info *page, unsigned int memflags)
 
 int steal_page(
     struct domain *d, struct page_info *page, unsigned int memflags)
 {
+    unsigned long x, y;
+    bool_t drop_dom_ref = 0;
+
+    spin_lock(&d->page_alloc_lock);
+
+    if ( is_xen_heap_page(page) || (page_get_owner(page) != d) )
+        goto fail;
+
+    /*
+     * We require there is just one reference (PGC_allocated). We temporarily
+     * drop this reference now so that we can safely swizzle the owner.
+     */
+    y = page->count_info;
+    do {
+        x = y;
+        if ( (x & (PGC_count_mask|PGC_allocated)) != (1 | PGC_allocated) )
+            goto fail;
+        y = cmpxchg(&page->count_info, x, x & ~PGC_count_mask);
+    } while ( y != x );
+
+    /* Swizzle the owner then reinstate the PGC_allocated reference. */
+    page_set_owner(page, NULL);
+    y = page->count_info;
+    do {
+        x = y;
+        BUG_ON((x & (PGC_count_mask|PGC_allocated)) != PGC_allocated);
+    } while ( (y = cmpxchg(&page->count_info, x, x | 1)) != x );
+
+    /* Unlink from original owner. */
+    if ( !(memflags & MEMF_no_refcount) && !domain_adjust_tot_pages(d, -1) )
+        drop_dom_ref = 1;
+    page_list_del(page, &d->page_list);
+
+    spin_unlock(&d->page_alloc_lock);
+    if ( unlikely(drop_dom_ref) )
+        put_domain(d);
+    return 0;
+
+ fail:
+    spin_unlock(&d->page_alloc_lock);
+    printk("Bad page %p: ed=%p(%u), sd=%p, caf=%08lx, taf=%lx\n",
+           (void *)page_to_mfn(page), d, d->domain_id,
+           page_get_owner(page), page->count_info, page->u.inuse.type_info);
     return -1;
 }
--
1.7.2.5
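For context, steal_page() is the hook the XENMEM_exchange path (and therefore the
new XENMEM_get_dma_buf path in patch 3/3) uses to take each input page away from
the guest before freeing it. Roughly, the caller in xen/common/memory.c looks like
the following; this is a from-memory sketch of existing common code, not part of
this patch:

/* Rough sketch of the call site in memory_exchange(); shown only to
 * illustrate where the new ARM steal_page() implementation is needed. */
page = mfn_to_page(mfn);
if ( unlikely(steal_page(d, page, MEMF_no_refcount)) )
{
    rc = -EINVAL;
    goto fail;
}
/* The stolen page is queued so it can be freed back to the allocator. */
page_list_add(page, &in_chunk_list);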
Stefano Stabellini
2013-Jul-31 17:57 UTC
[PATCH RFC 2/3] xen: provide empty stubs for guest_physmap_(un)pin_range on arm and x86
guest_physmap_pin_range pins a range of guest pages so that their p2m
mappings won't be changed.
guest_physmap_unpin_range unpins the previously pinned pages.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/include/asm-arm/mm.h  |   12 ++++++++++++
 xen/include/asm-x86/p2m.h |   12 ++++++++++++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 5e7c5a3..d88fa6c 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -319,6 +319,18 @@ void free_init_memory(void);
 int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
                                           unsigned int order);
 
+static inline int guest_physmap_pin_range(struct domain *d,
+                                          xen_pfn_t gpfn,
+                                          unsigned int order)
+{
+    return -ENOSYS;
+}
+static inline int guest_physmap_unpin_range(struct domain *d,
+                                            xen_pfn_t gpfn,
+                                            unsigned int order)
+{
+    return -ENOSYS;
+}
 extern void put_page_type(struct page_info *page);
 
 static inline void put_page_and_type(struct page_info *page)

diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 43583b2..afc7738 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -492,6 +492,18 @@ void guest_physmap_remove_page(struct domain *d,
 /* Set a p2m range as populate-on-demand */
 int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
                                           unsigned int order);
+static inline int guest_physmap_pin_range(struct domain *d,
+                                          xen_pfn_t gpfn,
+                                          unsigned int order)
+{
+    return -ENOSYS;
+}
+static inline int guest_physmap_unpin_range(struct domain *d,
+                                            xen_pfn_t gpfn,
+                                            unsigned int order)
+{
+    return -ENOSYS;
+}
 
 /* Change types across all p2m entries in a domain */
 void p2m_change_entry_type_global(struct domain *d,
--
1.7.2.5
Stefano Stabellini
2013-Jul-31 17:57 UTC
[PATCH RFC 3/3] xen: introduce XENMEM_get_dma_buf and XENMEM_put_dma_buf
Introduce two new hypercalls XENMEM_get_dma_buf and XENMEM_put_dma_buf.

XENMEM_get_dma_buf exchanges a set of pages for a new set, contiguous
and under 4G if so requested. The new pages are going to be "pinned" so
that their p2m mapping won't be changed, until XENMEM_put_dma_buf is
called. XENMEM_get_dma_buf returns the MFNs of the new pages to the
caller.

The only effect of XENMEM_put_dma_buf is to "unpin" the previously
pinned pages. Afterwards the p2m mappings can be transparently changed
by the hypervisor as normal. The memory remains accessible from the
guest.

XENMEM_put_dma_buf is unimplemented.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/common/memory.c         |   20 +++++++++++--
 xen/include/public/memory.h |   64 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index 50b740f..2c629c6 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -279,7 +279,7 @@ static void decrease_reservation(struct memop_args *a)
     a->nr_done = i;
 }
 
-static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
+static long memory_exchange(int op, XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
 {
     struct xen_memory_exchange exch;
     PAGE_LIST_HEAD(in_chunk_list);
@@ -496,7 +496,7 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
                 mfn = page_to_mfn(page);
                 guest_physmap_add_page(d, gpfn, mfn, exch.out.extent_order);
 
-                if ( !paging_mode_translate(d) )
+                if ( op == XENMEM_get_dma_buf || !paging_mode_translate(d) )
                 {
                     for ( k = 0; k < (1UL << exch.out.extent_order); k++ )
                         set_gpfn_from_mfn(mfn + k, gpfn + k);
@@ -505,6 +505,17 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
                                                 &mfn, 1) )
                         rc = -EFAULT;
                 }
+
+                if ( op == XENMEM_get_dma_buf )
+                {
+                    static int warning;
+                    rc = guest_physmap_pin_range(d, gpfn, exch.out.extent_order);
+                    if ( rc && !warning )
+                    {
+                        gdprintk(XENLOG_WARNING, "guest_physmap_pin_range not implemented\n");
+                        warning = 1;
+                    }
+                }
             }
             BUG_ON( !(d->is_dying) && (j != (1UL << out_chunk_order)) );
         }
@@ -630,8 +641,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         break;
 
+    case XENMEM_get_dma_buf:
+        /* xen_get_dma_buf_t is identical to xen_memory_exchange_t, so
+         * just cast it and reuse memory_exchange */
     case XENMEM_exchange:
-        rc = memory_exchange(guest_handle_cast(arg, xen_memory_exchange_t));
+        rc = memory_exchange(op, guest_handle_cast(arg, xen_memory_exchange_t));
         break;
 
     case XENMEM_maximum_ram_page:

diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 7a26dee..354b117 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -459,6 +459,70 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_sharing_op_t);
  * The zero value is appropiate.
  */
 
+#define XENMEM_get_dma_buf             26
+/*
+ * This hypercall is similar to XENMEM_exchange: it exchanges the pages
+ * passed in with a new set of pages, contiguous and under 4G if so
+ * requested. The new pages are going to be "pinned": it's guaranteed
+ * that their p2m mapping won't be changed until explicitly "unpinned".
+ * If return code is zero then @out.extent_list provides the MFNs of the
+ * newly-allocated memory. Returns zero on complete success, otherwise
+ * a negative error code.
+ * On complete success then always @nr_exchanged == @in.nr_extents. On
+ * partial success @nr_exchanged indicates how much work was done.
+ */
+struct xen_get_dma_buf {
+    /*
+     * [IN] Details of memory extents to be exchanged (GMFN bases).
+     * Note that @in.address_bits is ignored and unused.
+     */
+    struct xen_memory_reservation in;
+
+    /*
+     * [IN/OUT] Details of new memory extents.
+     * We require that:
+     *  1. @in.domid == @out.domid
+     *  2. @in.nr_extents  << @in.extent_order ==
+     *     @out.nr_extents << @out.extent_order
+     *  3. @in.extent_start and @out.extent_start lists must not overlap
+     *  4. @out.extent_start lists GPFN bases to be populated
+     *  5. @out.extent_start is overwritten with allocated GMFN bases
+     */
+    struct xen_memory_reservation out;
+
+    /*
+     * [OUT] Number of input extents that were successfully exchanged:
+     *  1. The first @nr_exchanged input extents were successfully
+     *     deallocated.
+     *  2. The corresponding first entries in the output extent list correctly
+     *     indicate the GMFNs that were successfully exchanged.
+     *  3. All other input and output extents are untouched.
+     *  4. If not all input extents are exchanged then the return code of this
+     *     command will be non-zero.
+     *  5. THIS FIELD MUST BE INITIALISED TO ZERO BY THE CALLER!
+     */
+    xen_ulong_t nr_exchanged;
+};
+typedef struct xen_get_dma_buf xen_get_dma_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_get_dma_buf_t);
+
+#define XENMEM_put_dma_buf             27
+/*
+ * XENMEM_put_dma_buf unpins a set of pages, previously pinned by
+ * XENMEM_get_dma_buf. After this call the p2m mapping of the pages can
+ * be transparently changed by the hypervisor, as usual. The pages are
+ * still accessible from the guest.
+ */
+struct xen_put_dma_buf {
+    /*
+     * [IN] Details of memory extents to be exchanged (GMFN bases).
+     * Note that @in.address_bits is ignored and unused.
+     */
+    struct xen_memory_reservation in;
+};
+typedef struct xen_put_dma_buf xen_put_dma_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_put_dma_buf_t);
+
 #endif /* defined(__XEN__) || defined(__XEN_TOOLS__) */
 
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
--
1.7.2.5
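For symmetry with the allocation sketch after the cover letter, the unpin side could
be driven from the guest roughly as follows. Again this is illustration only, not
part of the series: the helper name is made up, and it assumes the xen_put_dma_buf
layout above plus the standard HYPERVISOR_memory_op() wrapper.

/* Illustrative sketch only: unpin a buffer previously returned by
 * XENMEM_get_dma_buf. The guest frames stay mapped and accessible. */
static int put_dma_buf(xen_pfn_t *gpfn_base, unsigned int order)
{
    struct xen_put_dma_buf args = {
        .in = {
            .nr_extents   = 1,       /* one extent of 2^order pages */
            .extent_order = order,
            .domid        = DOMID_SELF,
        },
    };

    /* gpfn_base points at the guest frame number of the pinned buffer. */
    set_xen_guest_handle(args.in.extent_start, gpfn_base);

    /* After this call the hypervisor is again free to change the p2m
     * mappings of these pages behind the guest's back. */
    return HYPERVISOR_memory_op(XENMEM_put_dma_buf, &args);
}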
David Vrabel
2013-Aug-01 09:41 UTC
Re: [PATCH RFC 0/3] introduce XENMEM_get_dma_buf and XENMEM_put_dma_buf
On 31/07/13 18:57, Stefano Stabellini wrote:
> Hi all,
> this patch series introduces two new hypercalls to allow autotranslate
> guests to allocate a contiguous buffer in machine addresses.
> The XENMEM_get_dma_buf returns the mfns and makes sure to pin the pages
> so that the hypervisor won't change their p2m mappings while in use.
> XENMEM_put_dma_buf simply unpins the pages.

Can you expand on what circumstances the hypervisor would otherwise
adjust the p2m?  How has x86 avoided these problems?

> The implementation of XENMEM_put_dma_buf is missing, as it's actually
> unused.
>
> The page pinning is also missing from this series. I would appreciate
> feedback on the best way to implement it, especially on x86.

David
Ian Campbell
2013-Aug-01 14:55 UTC
Re: [PATCH RFC 0/3] introduce XENMEM_get_dma_buf and XENMEM_put_dma_buf
On Thu, 2013-08-01 at 10:41 +0100, David Vrabel wrote:
> On 31/07/13 18:57, Stefano Stabellini wrote:
> > Hi all,
> > this patch series introduces two new hypercalls to allow autotranslate
> > guests to allocate a contiguous buffer in machine addresses.
> > The XENMEM_get_dma_buf returns the mfns and makes sure to pin the pages
> > so that the hypervisor won't change their p2m mappings while in use.
> > XENMEM_put_dma_buf simply unpins the pages.
>
> Can you expand on what circumstances the hypervisor would otherwise
> adjust the p2m?  How has x86 avoided these problems?

Page sharing, CoW, Dario's forthcoming THP like stuff.

x86 avoids this by not supporting any of that for PV guests and by
using an IOMMU for HVM guests. On ARM IOMMUs are currently rare enough
that we need a software fallback, which is what this series is about.
We actually only need the MFN lookup behaviour currently but the
pinning behaviour is somewhat forward looking.

I don't know if PVH on x86 is going to trip over similar issues but I
expect that IOMMUs are prevalent enough that they can be a
prerequisite.

Ian.