Hi, I am posting the linux xencomm code for review. I''d plan to submit soon unless comments/remarks. This patch has 3 aims: * xencomm descriptor creation (drivers/xen/core/xencomm.c) * linux issued hypercalls translation (drivers/xen/core/xencomm_hcall.c) * privcmd hypercalls translation (drivers/xen/privcmd/xencomm.c) Most of this comes from powerpc people. Because they use a private linux tree, the patches have never been submitted. However the patches have been seriously modified by myself in order to be used on ia64. I tried to made them more generic so that they can be shared by ia64 and powerpc. Tristan. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:> I am posting the linux xencomm code for review. I''d plan to submit soon > unless comments/remarks.NAK. I''m still waiting to hear back about how you can use xencomm_inline() without worrying about page boundaries. -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Le Lundi 21 Août 2006 21:07, Hollis Blanchard a écrit :> On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote: > > I am posting the linux xencomm code for review. I''d plan to submit soon > > unless comments/remarks. > > NAK. I''m still waiting to hear back about how you can use > xencomm_inline() without worrying about page boundaries.More elaborated answer: On linux/ia64, kernel is linearly mapped into guest physical memory. The same is true for process kernel stacks. Therefore all kernels structure are linear in guest physical memory. Kernel data may of course cross page boundaries. But Xen can correctly handle this using only the guest physical address. Is something wrong ? Is something that don''t apply on powerpc ? Tristan. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 2006-08-22 at 09:42 +0200, Tristan Gingold wrote:> Le Lundi 21 Août 2006 21:07, Hollis Blanchard a écrit : > > On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote: > > > I am posting the linux xencomm code for review. I''d plan to submit soon > > > unless comments/remarks. > > > > NAK. I''m still waiting to hear back about how you can use > > xencomm_inline() without worrying about page boundaries. > More elaborated answer: > > On linux/ia64, kernel is linearly mapped into guest physical memory. The same > is true for process kernel stacks. Therefore all kernels structure are > linear in guest physical memory. > > Kernel data may of course cross page boundaries. But Xen can correctly handle > this using only the guest physical address.I see, and you handle this by breaking up copies at page granularity in xencomm_copy_from_user(). I have to say I don''t really like the code complexity, or the fact that there are now two very different ways to access guest handles. That being said, it sure would be nice to get rid of that "mini" stack-based stuff, so it''s OK with me. I would be surprised if there were any performance difference, however. I''ll send some comments to your original patch. -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
I apologize for my mailer line-wrapping the patch as I quote it below. On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig > --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 09:41:24 > 2006 +0200 > +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 15:04:32 > 2006 +0200 > @@ -257,4 +257,7 @@ config XEN_SMPBOOT > default y > depends on SMP > > +config XEN_XENCOMM > + bool > + default n > endifShouldn''t IA64 "select XEN_XENCOMM"? Or is your kernel in a separate tree?> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Makefile > --- a/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 09:41:24 > 2006 +0200 > +++ b/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 15:04:32 > 2006 +0200 > @@ -1,10 +1,10 @@ obj-y += core/ > obj-y += core/ > obj-y += console/ > obj-y += evtchn/ > -obj-y += privcmd/ > obj-y += xenbus/ > > obj-$(CONFIG_XEN_UTIL) += util.o > +obj-$(CONFIG_XEN_PRIVCMD) += privcmd/ > obj-$(CONFIG_XEN_BALLOON) += balloon/ > obj-$(CONFIG_XEN_DEVMEM) += char/ > obj-$(CONFIG_XEN_BLKDEV_BACKEND) += blkback/Not really part of this patch.> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/core/Makefile > --- a/linux-2.6-xen-sparse/drivers/xen/core/Makefile Mon Aug 21 > 09:41:24 2006 +0200 > +++ b/linux-2.6-xen-sparse/drivers/xen/core/Makefile Mon Aug 21 > 15:04:32 2006 +0200 > @@ -11,3 +11,4 @@ obj-$(CONFIG_XEN_SKBUFF) += skbuff.o > obj-$(CONFIG_XEN_SKBUFF) += skbuff.o > obj-$(CONFIG_XEN_REBOOT) += reboot.o > obj-$(CONFIG_XEN_SMPBOOT) += smpboot.o > +obj-$(CONFIG_XEN_XENCOMM) += xencomm.o xencomm_hcall.o > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile Mon Aug 21 > 09:41:24 2006 +0200 > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile Mon Aug 21 > 15:04:32 2006 +0200 > @@ -1,2 +1,3 @@ > +obj-y := privcmd.o > > -obj-$(CONFIG_XEN_PRIVCMD) := privcmd.o > +obj-$(CONFIG_XEN_XENCOMM) += xencomm.oI agree with the CONFIG_XEN_PRIVCMD stuff, but I think that should be a separate patch.> diff -r b7db009d622c > linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > Aug 21 09:41:24 2006 +0200 > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > Aug 21 15:04:32 2006 +0200 > @@ -34,6 +34,10 @@ > > static struct proc_dir_entry *privcmd_intf; > static struct proc_dir_entry *capabilities_intf; > + > +#ifdef CONFIG_XEN_XENCOMM > +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall); > +#endif > > #define NR_HYPERCALLS 64 > static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS); > @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i > "g" ((unsigned long)hypercall.arg[4]) > : "r8", "r10", "memory" ); > } > -#elif defined (__ia64__) > - __asm__ __volatile__ ( > - ";; mov r14=%2; mov r15=%3; " > - "mov r16=%4; mov r17=%5; mov r18=%6;" > - "mov r2=%1; break 0x1000;; mov %0=r8 ;;" > - : "=r" (ret) > - : "r" (hypercall.op), > - "r" (hypercall.arg[0]), > - "r" (hypercall.arg[1]), > - "r" (hypercall.arg[2]), > - "r" (hypercall.arg[3]), > - "r" (hypercall.arg[4]) > - : > "r14","r15","r16","r17","r18","r2","r8","memory"); > +#elif defined (CONFIG_XEN_XENCOMM) > + ret = xencomm_privcmd_hypercall (&hypercall); > #endif > } > break;Move all the #ifdef stuff into appropriate header files, then have every arch unconditionally call arch_privcmd_hypercall().> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/core/xencomm.c > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/linux-2.6-xen-sparse/drivers/xen/core/xencomm.c Mon Aug 21 > 15:04:32 2006 +0200 > @@ -0,0 +1,213 @@ > +/* > + * Copyright (C) 2006 Hollis Blanchard <hollisb@us.ibm.com>, IBM > Corporation > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#include <linux/gfp.h> > +#include <linux/mm.h> > +#include <asm/page.h> > +#include <xen/xencomm.h> > +#include <xen/interface/xen.h> > + > +int xencomm_debug; > + > +/* translate virtual address to physical address */ > +static unsigned long xen_vaddr_to_paddr(unsigned long vaddr) > +{ > + struct page *page; > + struct vm_area_struct *vma; > + > +#ifdef __ia64__ > + /* On ia64, TASK_SIZE refers to current. It is not > initialized > + during boot. > + Furthermore the kernel is relocatable and __pa() doesn''t > work on > + kernel addresses. */ > + if (vaddr >= KERNEL_START > + && vaddr < (KERNEL_START + KERNEL_TR_PAGE_SIZE)) { > + extern unsigned long kernel_start_pa; > + return vaddr - kernel_start_pa; > + } > +#endif > + if (vaddr > TASK_SIZE) { > + /* kernel address */ > + return __pa(vaddr); > + } > + > + /* XXX double-check (lack of) locking */ > + vma = find_extend_vma(current->mm, vaddr); > + if (!vma) > + return ~0UL; > + > + page = follow_page(vma, vaddr, 0); > + if (!page) > + return ~0UL; > + > + return (page_to_pfn(page) << PAGE_SHIFT) | (vaddr & > ~PAGE_MASK); > +}If there really is no way to implement xen_vaddr_to_paddr() in an arch-neutral way (and I''m willing to believe that''s true), just make the whole function arch-specific. It wouldn''t be too much duplicated code.> +static int xencomm_init(struct xencomm_desc *desc, > + void *buffer, unsigned long bytes) > +{ > + unsigned long recorded = 0; > + int i = 0; > + > + BUG_ON((buffer == NULL) && (bytes > 0)); > + > + /* record the physical pages used */ > + if (buffer == NULL) > + desc->nr_addrs = 0; > + > + while ((recorded < bytes) && (i < desc->nr_addrs)) { > + unsigned long vaddr = (unsigned long)buffer + > recorded; > + unsigned long paddr; > + int offset; > + int chunksz; > + > + offset = vaddr % PAGE_SIZE; /* handle partial pages */ > + chunksz = min(PAGE_SIZE - offset, bytes - recorded); > + > + paddr = xen_vaddr_to_paddr(vaddr); > + if (paddr == ~0UL) { > + printk("%s: couldn''t translate vaddr %lx\n", > + __func__, vaddr); > + return -EINVAL; > + } > + > + desc->address[i++] = paddr; > + recorded += chunksz; > + } > + > + if (recorded < bytes) { > + printk("%s: could only translate %ld of %ld bytes\n", > + __func__, recorded, bytes); > + return -ENOSPC; > + } > + > + /* mark remaining addresses invalid (just for safety) */ > + while (i < desc->nr_addrs) > + desc->address[i++] = XENCOMM_INVALID; > + > + desc->magic = XENCOMM_MAGIC; > + > + return 0; > +} > + > +/* XXX use slab allocator */ > +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask) > +{ > + struct xencomm_desc *desc; > + > + /* XXX could we call this from irq context? */You can remove this comment. It''s historical, and we''re passing in gfp_mask now.> + desc = (struct xencomm_desc *)__get_free_page(gfp_mask); > + if (desc == NULL) { > + panic("%s: page allocation failed\n", __func__); > + } > + desc->nr_addrs = (PAGE_SIZE - sizeof(struct xencomm_desc)) / > + sizeof(*desc->address); > + > + return desc; > +} > + > +void xencomm_free(struct xencomm_handle *desc) > +{ > + if (desc) > + free_page((unsigned long)__va(desc)); > +} > + > +int xencomm_create(void *buffer, unsigned long bytes, > + struct xencomm_handle **ret, gfp_t gfp_mask) > +{ > + struct xencomm_desc *desc; > + struct xencomm_handle *handle; > + int rc; > + > + if (xencomm_debug) { > + printk("%s: %p[%ld]\n", __func__, buffer, bytes); > + } > + > + if (buffer == NULL || bytes == 0) { > + *ret = (struct xencomm_handle *)NULL; > + return 0; > + } > + > + desc = xencomm_alloc(gfp_mask); > + if (!desc) { > + printk("%s failure\n", "xencomm_alloc"); > + return -ENOMEM; > + } > + handle = (struct xencomm_handle *)__pa(desc); > + > + rc = xencomm_init(desc, buffer, bytes); > + if (rc) { > + printk("%s failure: %d\n", "xencomm_init", rc); > + xencomm_free(handle); > + return rc; > + } > + > + *ret = handle; > + return 0; > +} > + > +/* "mini" routines, for stack-based communications: */ > + > +static void *xencomm_alloc_mini(void *area, int arealen) > +{ > + unsigned long base = (unsigned long)area; > + unsigned int pageoffset; > + > + pageoffset = base % PAGE_SIZE; > + > + /* we probably fit right at the front of area */ > + if ((PAGE_SIZE - pageoffset) >= sizeof(struct xencomm_mini)) { > + return area; > + } > + > + /* if not, see if area is big enough to advance to the next > page */ > + if ((arealen - pageoffset) >= sizeof(struct xencomm_mini)) > + return (void *)(base + pageoffset); > + > + /* area was too small */ > + return NULL; > +} > + > +int xencomm_create_mini(void *area, int arealen, void *buffer, > + unsigned long bytes, struct xencomm_handle > **ret) > +{ > + struct xencomm_desc *desc; > + int rc; > + > + desc = xencomm_alloc_mini(area, arealen); > + if (!desc) > + return -ENOMEM; > + desc->nr_addrs = XENCOMM_MINI_ADDRS; > + > + rc = xencomm_init(desc, buffer, bytes); > + if (rc) > + return rc; > + > + *ret = (struct xencomm_handle *)__pa(desc); > + return 0; > +}*_mini are unused and should be removed entirely.> +struct xencomm_handle *xencomm_create_inline (void *buffer, > + unsigned long bytes) > +{ > + unsigned long paddr; > + > + paddr = xen_vaddr_to_paddr((unsigned long)buffer); > + return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr); > +}XENCOMM_INLINE_CREATE in undefined in this patch. I liked your old patch just fine: +struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long bytes) +{ + return (struct xencomm_desc *) + (__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE); +}> diff -r b7db009d622c > linux-2.6-xen-sparse/drivers/xen/core/xencomm_hcall.c > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/linux-2.6-xen-sparse/drivers/xen/core/xencomm_hcall.c Mon > Aug 21 15:04:32 2006 +0200 > @@ -0,0 +1,311 @@ > +#include <linux/types.h> > +#include <linux/errno.h> > +#include <linux/kernel.h> > +#include <linux/gfp.h> > +#include <linux/module.h> > +#include <xen/interface/xen.h> > +#include <xen/interface/dom0_ops.h> > +#include <xen/interface/memory.h> > +#include <xen/interface/xencomm.h> > +#include <xen/interface/version.h> > +#include <xen/interface/sched.h> > +#include <xen/interface/event_channel.h> > +#include <xen/interface/physdev.h> > +#include <xen/interface/grant_table.h> > +#include <xen/interface/callback.h> > +#include <xen/interface/acm_ops.h> > +#include <xen/public/privcmd.h> > +#include <asm/hypercall.h> > +#include <asm/page.h> > +#include <asm/uaccess.h> > +#include <xen/xencomm.h> > + > +/* Xencomm notes: > + * > + * Some hypercalls are made before the memory subsystem is up, so > instead of > + * calling xencomm_create(), we allocate XENCOMM_MINI_AREA bytes from > the stack > + * to hold the xencomm descriptor.Remove above comment.> + * In general, we need a xencomm descriptor to cover the top-level > data > + * structure (e.g. the dom0 op), plus another for every embedded > pointer to > + * another data structure (i.e. for every GUEST_HANDLE). > + */ > + > +int xencomm_hypercall_console_io(int cmd, int count, char *str) > +{ > + struct xencomm_handle *desc; > + int rc; > + > + desc = xencomm_create_inline (str, count); > + > + rc = xencomm_arch_hypercall_console_io (cmd, count, desc); > + > + return rc; > +}I don''t understand the point of all these routines if they just call arch_foo anyways.> diff -r b7db009d622c > linux-2.6-xen-sparse/drivers/xen/privcmd/xencomm.c > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/xencomm.c Mon > Aug 21 15:04:32 2006 +0200 > @@ -0,0 +1,358 @@ > +#include <linux/types.h> > +#include <linux/errno.h> > +#include <linux/kernel.h> > +#include <linux/gfp.h> > +#include <linux/module.h> > +#include <xen/interface/xen.h> > +#include <xen/interface/dom0_ops.h> > +#include <xen/interface/memory.h> > +#include <xen/interface/version.h> > +#include <xen/interface/event_channel.h> > +#include <xen/interface/acm_ops.h> > +#include <xen/public/privcmd.h> > +#include <asm/hypercall.h> > +#include <asm/page.h> > +#include <asm/uaccess.h> > +#include <xen/xencomm.h> > + > +static int xencomm_privcmd_dom0_op(privcmd_hypercall_t *hypercall) > +{ > + dom0_op_t kern_op; > + dom0_op_t __user *user_op = (dom0_op_t __user > *)hypercall->arg[0]; > + struct xencomm_handle *op_desc; > + struct xencomm_handle *desc = NULL; > + int ret = 0; > + > + if (copy_from_user(&kern_op, user_op, sizeof(dom0_op_t))) > + return -EFAULT; > + > + if (kern_op.interface_version != DOM0_INTERFACE_VERSION) > + return -EACCES; > + > + op_desc = xencomm_create_inline (&kern_op, sizeof(dom0_op_t)); > + > + switch (kern_op.cmd) { > + case DOM0_GETMEMLIST: > + { > + unsigned long nr_pages > kern_op.u.getmemlist.max_pfns; > +#ifdef __ia64__ > + /* Xen/ia64 pass first_page and nr_pages in max_pfns! > */ > + nr_pages &= 0xffffffff; > +#endifI''m willing to put up with this only if you guys promise to fix this silly API incompatibility, at which point it will be removed.> + ret = xencomm_create( > + xen_guest_handle(kern_op.u.getmemlist.buffer), > + nr_pages * sizeof(unsigned long), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.getmemlist.buffer, > + (void *)desc); > + break; > + } > + case DOM0_SETVCPUCONTEXT: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.setvcpucontext.ctxt), > + sizeof(vcpu_guest_context_t), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.setvcpucontext.ctxt, > + (void *)desc); > + break; > + case DOM0_READCONSOLE: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.readconsole.buffer), > + kern_op.u.readconsole.count, > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.readconsole.buffer, > + (void *)desc); > + break; > + case DOM0_GETPAGEFRAMEINFO2: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.getpageframeinfo2.array), > + kern_op.u.getpageframeinfo2.num, > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.getpageframeinfo2.array, > + (void *)desc); > + break; > + case DOM0_PERFCCONTROL: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.perfccontrol.desc), > + kern_op.u.perfccontrol.nr_counters * > + sizeof(dom0_perfc_desc_t), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.perfccontrol.desc, > + (void *)desc); > + break; > + case DOM0_GETVCPUCONTEXT: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.getvcpucontext.ctxt), > + sizeof(vcpu_guest_context_t), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.getvcpucontext.ctxt, > + (void *)desc); > + break; > + case DOM0_GETDOMAININFOLIST: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.getdomaininfolist.buffer), > + kern_op.u.getdomaininfolist.num_domains * > + sizeof(dom0_getdomaininfo_t), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.getdomaininfolist.buffer, > + (void *)desc); > + break; > + case DOM0_PHYSICAL_MEMORY_MAP: > + ret = xencomm_create( > + xen_guest_handle(kern_op.u.physical_memory_map.memory_map), > + kern_op.u.physical_memory_map.nr_map_entries * > + sizeof(struct dom0_memory_map_entry), > + &desc, GFP_KERNEL); > + set_xen_guest_handle(kern_op.u.physical_memory_map.memory_map, > + (void *)desc); > + break; > + > + case DOM0_SCHEDCTL: > + case DOM0_ADJUSTDOM: > + case DOM0_CREATEDOMAIN: > + case DOM0_DESTROYDOMAIN: > + case DOM0_PAUSEDOMAIN: > + case DOM0_UNPAUSEDOMAIN: > + case DOM0_GETDOMAININFO: > + case DOM0_MSR: > + case DOM0_SETTIME: > + case DOM0_GETPAGEFRAMEINFO: > + case DOM0_SETVCPUAFFINITY: > + case DOM0_TBUFCONTROL: > + case DOM0_PHYSINFO: > + case DOM0_SCHED_ID: > + case DOM0_SETDOMAINMAXMEM: > + case DOM0_ADD_MEMTYPE: > + case DOM0_DEL_MEMTYPE: > + case DOM0_READ_MEMTYPE: > + case DOM0_IOPORT_PERMISSION: > + case DOM0_GETVCPUINFO: > + case DOM0_PLATFORM_QUIRK: > + case DOM0_MAX_VCPUS: > + case DOM0_SETDOMAINHANDLE: > + case DOM0_SETDEBUGGING: > + case DOM0_DOMAIN_SETUP: > + /* no munging needed */ > + break; > + > + default: > + printk("%s: unknown dom0 cmd %d\n", __func__, > kern_op.cmd); > + return -ENOSYS; > + } > + > + if (ret) > + goto out; /* error mapping the nested pointer */ > + > + ret = xencomm_arch_hypercall_dom0_op (op_desc); > + > + /* FIXME: should we restore the handle? */ > + if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t))) > + ret = -EFAULT; > + > + if (desc) > + xencomm_free(desc); > +out: > + return ret; > +}You misplaced the out label; it needs to go before xencomm_free(desc); That''s a good question about the copy_to_user(). I thought we never exposed the modified handles back to the user, but I guess I was wrong. Also please check whitespace throughout. In particular you seem to be doing this: function (args); and not even Keir''s shall-we-say-unique style does that. ;)> +static int xencomm_privcmd_acm_op(privcmd_hypercall_t *hypercall) > +{ > + int cmd = hypercall->arg[0]; > + void __user *arg = (void __user *)hypercall->arg[1]; > + struct xencomm_handle *op_desc; > + struct xencomm_handle *desc = NULL; > + int ret; > + > + switch (cmd) { > + case ACMOP_getssid: > + { > + struct acm_getssid kern_arg; > + > + if (copy_from_user (&kern_arg, arg, sizeof > (kern_arg))) > + return -EFAULT; > + > + op_desc = xencomm_create_inline (&kern_arg, > sizeof(kern_arg)); > + > + ret > xencomm_create(xen_guest_handle(kern_arg.ssidbuf), > + kern_arg.ssidbuf_size, > + &desc, GFP_KERNEL); > + if (ret) > + return ret; > + > + set_xen_guest_handle(kern_arg.ssidbuf, (void *)desc); > + > + ret = xencomm_arch_hypercall_acm_op (cmd, op_desc); > + > + xencomm_free (desc); > + > + if (copy_to_user (arg, &kern_arg, sizeof (kern_arg))) > + return -EFAULT; > + > + return ret; > + } > + default: > + printk("%s: unknown acm_op cmd %d\n", __func__, cmd); > + return -ENOSYS; > + } > + > + return ret; > +} > + > +static int xencomm_privcmd_memory_op(privcmd_hypercall_t *hypercall) > +{ > + const unsigned long cmd = hypercall->arg[0]; > + int ret = 0; > + > + switch (cmd) { > + case XENMEM_increase_reservation: > + case XENMEM_decrease_reservation: > + { > + xen_memory_reservation_t kern_op; > + xen_memory_reservation_t __user *user_op; > + struct xencomm_handle *desc = NULL; > + struct xencomm_handle *desc_op; > + > + user_op = (xen_memory_reservation_t __user > *)hypercall->arg[1]; > + if (copy_from_user(&kern_op, user_op, > + sizeof(xen_memory_reservation_t))) > + return -EFAULT; > + desc_op = xencomm_create_inline (&kern_op, sizeof > (kern_op)); > + > + if (xen_guest_handle(kern_op.extent_start)) { > + void * addr; > + > + addr = xen_guest_handle(kern_op.extent_start); > + ret = xencomm_create > + (addr, > + kern_op.nr_extents * > + sizeof(*xen_guest_handle > + (kern_op.extent_start)), > + &desc, GFP_KERNEL); > + if (ret) > + return ret; > + set_xen_guest_handle(kern_op.extent_start, > + (void *)desc); > + } > + > + ret = xencomm_arch_hypercall_memory_op (cmd, desc_op); > + > + if (desc) > + xencomm_free (desc); > + > + if (ret != 0) > + return ret; > + > + if (copy_to_user(user_op, &kern_op, > + sizeof(xen_memory_reservation_t))) > + return -EFAULT; > + > + return ret; > + } > + default: > + printk("%s: unknown memory op %lu\n", __func__, cmd); > + ret = -ENOSYS; > + } > + return ret; > +} > + > +static int xencomm_privcmd_xen_version(privcmd_hypercall_t > *hypercall) > +{ > + int cmd = hypercall->arg[0]; > + void __user *arg = (void __user *)hypercall->arg[1]; > + struct xencomm_handle *desc; > + size_t argsize; > + int rc; > + > + switch (cmd) { > + case XENVER_version: > + /* do not actually pass an argument */ > + return xencomm_arch_hypercall_xen_version (cmd, 0); > + case XENVER_extraversion: > + argsize = sizeof(xen_extraversion_t); > + break; > + case XENVER_compile_info: > + argsize = sizeof(xen_compile_info_t); > + break; > + case XENVER_capabilities: > + argsize = sizeof(xen_capabilities_info_t); > + break; > + case XENVER_changeset: > + argsize = sizeof(xen_changeset_info_t); > + break; > + case XENVER_platform_parameters: > + argsize = sizeof(xen_platform_parameters_t); > + break; > + case XENVER_pagesize: > + argsize = (arg == NULL) ? 0 : sizeof(void *); > + break; > + case XENVER_get_features: > + argsize = (arg == NULL) ? 0 : > sizeof(xen_feature_info_t); > + break; > + > + default: > + printk("%s: unknown version op %d\n", __func__, cmd); > + return -ENOSYS; > + } > + > + rc = xencomm_create(arg, argsize, &desc, GFP_KERNEL); > + if (rc) > + return rc; > + > + rc = xencomm_arch_hypercall_xen_version (cmd, desc); > + > + xencomm_free(desc); > + > + return rc; > +} > + > +static int xencomm_privcmd_event_channel_op(privcmd_hypercall_t > *hypercall) > +{ > + int cmd = hypercall->arg[0]; > + struct xencomm_handle *desc; > + unsigned int argsize; > + int ret; > + > + switch (cmd) { > + case EVTCHNOP_alloc_unbound: > + argsize = sizeof(evtchn_alloc_unbound_t); > + break; > + > + case EVTCHNOP_status: > + argsize = sizeof(evtchn_status_t); > + break; > + > + default: > + printk("%s: unknown EVTCHNOP %d\n", __func__, cmd); > + return -EINVAL; > + } > + > + ret = xencomm_create((void *)hypercall->arg[1], argsize, > + &desc, GFP_KERNEL); > + if (ret) > + return ret; > + > + ret = xencomm_arch_hypercall_event_channel_op (cmd, desc); > + > + xencomm_free(desc); > + return ret; > +} > + > +int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall) > +{ > + switch (hypercall->op) { > + case __HYPERVISOR_dom0_op: > + return xencomm_privcmd_dom0_op(hypercall); > + case __HYPERVISOR_acm_op: > + return xencomm_privcmd_acm_op(hypercall); > + case __HYPERVISOR_xen_version: > + return xencomm_privcmd_xen_version(hypercall); > + case __HYPERVISOR_memory_op: > + return xencomm_privcmd_memory_op(hypercall); > + case __HYPERVISOR_event_channel_op: > + return xencomm_privcmd_event_channel_op(hypercall); > + default: > + printk("%s: unknown hcall (%ld)\n", __func__, > hypercall->op); > + return -ENOSYS; > + } > +} > + > diff -r b7db009d622c linux-2.6-xen-sparse/include/xen/xencomm.h > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/linux-2.6-xen-sparse/include/xen/xencomm.h Mon Aug 21 > 15:04:32 2006 +0200 > @@ -0,0 +1,45 @@ > +/* > + * Copyright (C) 2006 Hollis Blanchard <hollisb@us.ibm.com>, IBM > Corporation > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#ifndef _LINUX_XENCOMM_H_ > +#define _LINUX_XENCOMM_H_ > + > +#include <xen/interface/xencomm.h> > + > +#define XENCOMM_MINI_ADDRS 3 > +struct xencomm_mini { > + struct xencomm_desc _desc; > + uint64_t address[XENCOMM_MINI_ADDRS]; > +}; > +#define XENCOMM_MINI_AREA (sizeof(struct xencomm_mini) * 2)Remove above.> +/* To avoid additionnal virt to phys convertion, the user only sees > handle > + which are opaque structures. */ > +struct xencomm_handle;Typos in the comment.> +extern int xencomm_create(void *buffer, unsigned long bytes, > + struct xencomm_handle **desc, gfp_t type); > +extern void xencomm_free(struct xencomm_handle *desc); > +extern int xencomm_create_mini(void *area, int arealen, void *buffer, > + unsigned long bytes, struct xencomm_handle **ret);Remove above.> +struct xencomm_handle *xencomm_create_inline (void *buffer, > + unsigned long bytes); > + > +#define xen_guest_handle(hnd) ((hnd).p) > + > +#endif /* _LINUX_XENCOMM_H_ */ > diff -r b7db009d622c linux-2.6-xen-sparse/include/xen/xencomm_hcall.h > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/linux-2.6-xen-sparse/include/xen/xencomm_hcall.h Mon Aug 21 > 15:04:32 2006 +0200 > @@ -0,0 +1,45 @@ > +/* > + * Copyright (C) 2006 Tristan Gingold <tristan.gingold@bull.net>, > Bull SAS > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#ifndef _LINUX_XENCOMM_HCALL_H_ > +#define _LINUX_XENCOMM_HCALL_H_ > + > +/* These function creates inline descriptor for the parameters and > + calls the correspondig xencomm_arch_hypercall_X. > + Architectures should defines HYPERVISOR_xxx as > xencomm_hypercall_xxx unless > + they want to use their own wrapper. */"corresponding" And I''m not clear on the reason for all the xencomm_arch_*, especially because I haven''t seen IA64''s. If you''re worried about the structure size conversion I mentioned earlier, I think PowerPC will need to fix that *before* the xencomm stuff is called anyways. So unless IA64 needs something funny in xencomm_arch_*, they should all be removed. -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Le Mardi 22 Août 2006 21:03, Hollis Blanchard a écrit :> I apologize for my mailer line-wrapping the patch as I quote it below. > > On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote: > > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig > > --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 09:41:24 > > 2006 +0200 > > +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 15:04:32 > > 2006 +0200 > > @@ -257,4 +257,7 @@ config XEN_SMPBOOT > > default y > > depends on SMP > > > > +config XEN_XENCOMM > > + bool > > + default n > > endif > > Shouldn''t IA64 "select XEN_XENCOMM"? Or is your kernel in a separate > tree?The arch Kconfig overrides this parameter.> > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Makefile > > --- a/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 09:41:24 > > 2006 +0200 > > +++ b/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 15:04:32 > > 2006 +0200 > > @@ -1,10 +1,10 @@ obj-y += core/ > > obj-y += core/ > > obj-y += console/ > > obj-y += evtchn/ > > -obj-y += privcmd/ > > obj-y += xenbus/ > > > > obj-$(CONFIG_XEN_UTIL) += util.o > > +obj-$(CONFIG_XEN_PRIVCMD) += privcmd/ > > obj-$(CONFIG_XEN_BALLOON) += balloon/ > > obj-$(CONFIG_XEN_DEVMEM) += char/ > > obj-$(CONFIG_XEN_BLKDEV_BACKEND) += blkback/ > > Not really part of this patch.Ok, I will send a separat patch. [...]> I agree with the CONFIG_XEN_PRIVCMD stuff, but I think that should be a > separate patch.> > diff -r b7db009d622c > > linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c > > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > > Aug 21 09:41:24 2006 +0200 > > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > > Aug 21 15:04:32 2006 +0200 > > @@ -34,6 +34,10 @@ > > > > static struct proc_dir_entry *privcmd_intf; > > static struct proc_dir_entry *capabilities_intf; > > + > > +#ifdef CONFIG_XEN_XENCOMM > > +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall); > > +#endif > > > > #define NR_HYPERCALLS 64 > > static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS); > > @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i > > "g" ((unsigned long)hypercall.arg[4]) > > > > : "r8", "r10", "memory" ); > > > > } > > -#elif defined (__ia64__) > > - __asm__ __volatile__ ( > > - ";; mov r14=%2; mov r15=%3; " > > - "mov r16=%4; mov r17=%5; mov r18=%6;" > > - "mov r2=%1; break 0x1000;; mov %0=r8 ;;" > > - : "=r" (ret) > > - : "r" (hypercall.op), > > - "r" (hypercall.arg[0]), > > - "r" (hypercall.arg[1]), > > - "r" (hypercall.arg[2]), > > - "r" (hypercall.arg[3]), > > - "r" (hypercall.arg[4]) > > - : > > "r14","r15","r16","r17","r18","r2","r8","memory"); > > +#elif defined (CONFIG_XEN_XENCOMM) > > + ret = xencomm_privcmd_hypercall (&hypercall); > > #endif > > } > > break; > > Move all the #ifdef stuff into appropriate header files, then have every > arch unconditionally call arch_privcmd_hypercall().I simply prefer not to touch other people code, as I can''t try xen/x86. [...]> > +/* translate virtual address to physical address */ > > +static unsigned long xen_vaddr_to_paddr(unsigned long vaddr) > > +{ > > + struct page *page; > > + struct vm_area_struct *vma; > > + > > +#ifdef __ia64__ > > + /* On ia64, TASK_SIZE refers to current. It is not > > initialized > > + during boot. > > + Furthermore the kernel is relocatable and __pa() doesn''t > > work on > > + kernel addresses. */ > > + if (vaddr >= KERNEL_START > > + && vaddr < (KERNEL_START + KERNEL_TR_PAGE_SIZE)) { > > + extern unsigned long kernel_start_pa; > > + return vaddr - kernel_start_pa; > > + } > > +#endif > > + if (vaddr > TASK_SIZE) { > > + /* kernel address */ > > + return __pa(vaddr); > > + } > > + > > + /* XXX double-check (lack of) locking */ > > + vma = find_extend_vma(current->mm, vaddr); > > + if (!vma) > > + return ~0UL; > > + > > + page = follow_page(vma, vaddr, 0); > > + if (!page) > > + return ~0UL; > > + > > + return (page_to_pfn(page) << PAGE_SHIFT) | (vaddr & > > ~PAGE_MASK); > > +} > > If there really is no way to implement xen_vaddr_to_paddr() in an > arch-neutral way (and I''m willing to believe that''s true), just make the > whole function arch-specific. It wouldn''t be too much duplicated code.I will try to improve this. [...]> > +/* XXX use slab allocator */ > > +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask) > > +{ > > + struct xencomm_desc *desc; > > + > > + /* XXX could we call this from irq context? */ > > You can remove this comment. It''s historical, and we''re passing in > gfp_mask now.Ok. [...]> > *_mini are unused and should be removed entirely.Ok.> > +struct xencomm_handle *xencomm_create_inline (void *buffer, > > + unsigned long bytes) > > +{ > > + unsigned long paddr; > > + > > + paddr = xen_vaddr_to_paddr((unsigned long)buffer); > > + return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr); > > +} > > XENCOMM_INLINE_CREATE in undefined in this patch. I liked your old patch > just fine: > +struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long > bytes) > +{ > + return (struct xencomm_desc *) > + (__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE); > +}It is defined in arch-xxx.h file.> > +#include <xen/xencomm.h> > > + > > +/* Xencomm notes: > > + * > > + * Some hypercalls are made before the memory subsystem is up, so > > instead of > > + * calling xencomm_create(), we allocate XENCOMM_MINI_AREA bytes from > > the stack > > + * to hold the xencomm descriptor. > > Remove above comment.Ok.> > + * In general, we need a xencomm descriptor to cover the top-level > > data > > + * structure (e.g. the dom0 op), plus another for every embedded > > pointer to > > + * another data structure (i.e. for every GUEST_HANDLE). > > + */ > > + > > +int xencomm_hypercall_console_io(int cmd, int count, char *str) > > +{ > > + struct xencomm_handle *desc; > > + int rc; > > + > > + desc = xencomm_create_inline (str, count); > > + > > + rc = xencomm_arch_hypercall_console_io (cmd, count, desc); > > + > > + return rc; > > +} > > I don''t understand the point of all these routines if they just call > arch_foo anyways.Sorry I have not explained the principle. xencomm_arch_hypercall_XXX are the raw hypercalls. They must be defined by architecture code. The xencomm_hypercall_XXX are the xencomm wrapper and are shared. [...]> > + case DOM0_GETMEMLIST: > > + { > > + unsigned long nr_pages > > kern_op.u.getmemlist.max_pfns; > > +#ifdef __ia64__ > > + /* Xen/ia64 pass first_page and nr_pages in max_pfns! > > */ > > + nr_pages &= 0xffffffff; > > +#endif > > I''m willing to put up with this only if you guys promise to fix this > silly API incompatibility, at which point it will be removed.I hope this could be fixed once xencomm is used by ia64. [...]> > + if (ret) > > + goto out; /* error mapping the nested pointer */ > > + > > + ret = xencomm_arch_hypercall_dom0_op (op_desc); > > + > > + /* FIXME: should we restore the handle? */ > > + if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t))) > > + ret = -EFAULT; > > + > > + if (desc) > > + xencomm_free(desc); > > +out: > > + return ret; > > +} > > You misplaced the out label; it needs to go before xencomm_free(desc);??? This was copied from your work. The code branches to out iff xencomm allocation failed. It is safe to call xencomm_free but useless.> That''s a good question about the copy_to_user(). I thought we never > exposed the modified handles back to the user, but I guess I was wrong. > > Also please check whitespace throughout. In particular you seem to be > doing this: > function (args); > and not even Keir''s shall-we-say-unique style does that. ;)Yes. [...]> > +#include <xen/interface/xencomm.h> > > + > > +#define XENCOMM_MINI_ADDRS 3 > > +struct xencomm_mini { > > + struct xencomm_desc _desc; > > + uint64_t address[XENCOMM_MINI_ADDRS]; > > +}; > > +#define XENCOMM_MINI_AREA (sizeof(struct xencomm_mini) * 2) > > Remove above.Ok.> > +/* To avoid additionnal virt to phys convertion, the user only sees > > handle > > + which are opaque structures. */ > > +struct xencomm_handle; > > Typos in the comment.Oops.> > +extern int xencomm_create(void *buffer, unsigned long bytes, > > + struct xencomm_handle **desc, gfp_t type); > > +extern void xencomm_free(struct xencomm_handle *desc); > > +extern int xencomm_create_mini(void *area, int arealen, void *buffer, > > + unsigned long bytes, struct xencomm_handle **ret); > > Remove above.Ok. [...]> > +/* These function creates inline descriptor for the parameters and > > + calls the correspondig xencomm_arch_hypercall_X. > > + Architectures should defines HYPERVISOR_xxx as > > xencomm_hypercall_xxx unless > > + they want to use their own wrapper. */ > > "corresponding"Oops.> And I''m not clear on the reason for all the xencomm_arch_*, especially > because I haven''t seen IA64''s. If you''re worried about the structure > size conversion I mentioned earlier, I think PowerPC will need to fix > that *before* the xencomm stuff is called anyways. So unless IA64 needs > something funny in xencomm_arch_*, they should all be removed.See explaination above, basically xencomm_arch_* are the raw hypercalls. Tristan. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Wed, 2006-08-23 at 09:59 +0200, Tristan Gingold wrote:> Le Mardi 22 Août 2006 21:03, Hollis Blanchard a écrit : > > I apologize for my mailer line-wrapping the patch as I quote it below. > > > > On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote: > > > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig > > > --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 09:41:24 > > > 2006 +0200 > > > +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig Mon Aug 21 15:04:32 > > > 2006 +0200 > > > @@ -257,4 +257,7 @@ config XEN_SMPBOOT > > > default y > > > depends on SMP > > > > > > +config XEN_XENCOMM > > > + bool > > > + default n > > > endif > > > > Shouldn''t IA64 "select XEN_XENCOMM"? Or is your kernel in a separate > > tree? > The arch Kconfig overrides this parameter.My point was that I didn''t see that in this patch.> > > diff -r b7db009d622c > > > linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c > > > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > > > Aug 21 09:41:24 2006 +0200 > > > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c Mon > > > Aug 21 15:04:32 2006 +0200 > > > @@ -34,6 +34,10 @@ > > > > > > static struct proc_dir_entry *privcmd_intf; > > > static struct proc_dir_entry *capabilities_intf; > > > + > > > +#ifdef CONFIG_XEN_XENCOMM > > > +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall); > > > +#endif > > > > > > #define NR_HYPERCALLS 64 > > > static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS); > > > @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i > > > "g" ((unsigned long)hypercall.arg[4]) > > > > > > : "r8", "r10", "memory" ); > > > > > > } > > > -#elif defined (__ia64__) > > > - __asm__ __volatile__ ( > > > - ";; mov r14=%2; mov r15=%3; " > > > - "mov r16=%4; mov r17=%5; mov r18=%6;" > > > - "mov r2=%1; break 0x1000;; mov %0=r8 ;;" > > > - : "=r" (ret) > > > - : "r" (hypercall.op), > > > - "r" (hypercall.arg[0]), > > > - "r" (hypercall.arg[1]), > > > - "r" (hypercall.arg[2]), > > > - "r" (hypercall.arg[3]), > > > - "r" (hypercall.arg[4]) > > > - : > > > "r14","r15","r16","r17","r18","r2","r8","memory"); > > > +#elif defined (CONFIG_XEN_XENCOMM) > > > + ret = xencomm_privcmd_hypercall (&hypercall); > > > #endif > > > } > > > break; > > > > Move all the #ifdef stuff into appropriate header files, then have every > > arch unconditionally call arch_privcmd_hypercall(). > I simply prefer not to touch other people code, as I can''t try xen/x86.That''s nice, but you''re just moving code, and it''s the Right Thing To Do, so please do it. You can point out that you''ve only compile-tested x86 when you submit.> > > +struct xencomm_handle *xencomm_create_inline (void *buffer, > > > + unsigned long bytes) > > > +{ > > > + unsigned long paddr; > > > + > > > + paddr = xen_vaddr_to_paddr((unsigned long)buffer); > > > + return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr); > > > +} > > > > XENCOMM_INLINE_CREATE in undefined in this patch. I liked your old patch > > just fine: > > +struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long > > bytes) > > +{ > > + return (struct xencomm_desc *) > > + (__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE); > > +} > It is defined in arch-xxx.h file.But why? Do you anticipate that architectures will mark "inline" descriptors differently? If so, how?> > > + * In general, we need a xencomm descriptor to cover the top-level > > > data > > > + * structure (e.g. the dom0 op), plus another for every embedded > > > pointer to > > > + * another data structure (i.e. for every GUEST_HANDLE). > > > + */ > > > + > > > +int xencomm_hypercall_console_io(int cmd, int count, char *str) > > > +{ > > > + struct xencomm_handle *desc; > > > + int rc; > > > + > > > + desc = xencomm_create_inline (str, count); > > > + > > > + rc = xencomm_arch_hypercall_console_io (cmd, count, desc); > > > + > > > + return rc; > > > +} > > > > I don''t understand the point of all these routines if they just call > > arch_foo anyways. > Sorry I have not explained the principle. > xencomm_arch_hypercall_XXX are the raw hypercalls. They must be defined by > architecture code. The xencomm_hypercall_XXX are the xencomm wrapper and are > shared.That much is clear. :) My question is what is being done in those "raw hypercalls" that can''t be done here? You didn''t include them in your patch, so I can''t tell. It seems there are a few missing pieces to your patch. Next time please include the whole thing, including arch-specific parts, so we can see what''s going on.> > > + if (ret) > > > + goto out; /* error mapping the nested pointer */ > > > + > > > + ret = xencomm_arch_hypercall_dom0_op (op_desc); > > > + > > > + /* FIXME: should we restore the handle? */ > > > + if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t))) > > > + ret = -EFAULT; > > > + > > > + if (desc) > > > + xencomm_free(desc); > > > +out: > > > + return ret; > > > +} > > > > You misplaced the out label; it needs to go before xencomm_free(desc); > ??? This was copied from your work.You''ve made changes here, and that''s what I''m pointing out.> The code branches to out iff xencomm allocation failed. It is safe to call > xencomm_free but useless.There are multiple descriptors being created: one for the dom0_op top-level structure, and possibly one for a sub-structure. In fact, in your patch you never free ''op_desc'' inside xencomm_privcmd_dom0_op(). OK, reading closer, I don''t like that at *all*. The trick is that xencomm_create_inline() doesn''t actually create anything, and therefore you don''t need to free it. That needs to change. My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever way is best for portability) and if it is, do the "inline" stuff. On the free side, if the descriptor was inline, free can just return. That would also make me happy because it removes the need to think about whether callers can/should call "create_inline" or not; the code just does the right thing. -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tristan Gingold
2006-Aug-24 07:51 UTC
[Xen-ia64-devel] Re: [XenPPC] RFC: xencomm - linux side
Le Mercredi 23 Août 2006 18:35, Hollis Blanchard a écrit :> On Wed, 2006-08-23 at 09:59 +0200, Tristan Gingold wrote: > > Le Mardi 22 Août 2006 21:03, Hollis Blanchard a écrit : > > > I don''t understand the point of all these routines if they just call > > > arch_foo anyways. > > > > Sorry I have not explained the principle. > > xencomm_arch_hypercall_XXX are the raw hypercalls. They must be defined > > by architecture code. The xencomm_hypercall_XXX are the xencomm wrapper > > and are shared. > > That much is clear. :) My question is what is being done in those "raw > hypercalls" that can''t be done here? You didn''t include them in your > patch, so I can''t tell. > > It seems there are a few missing pieces to your patch. Next time please > include the whole thing, including arch-specific parts, so we can see > what''s going on. > > > > > + if (ret) > > > > + goto out; /* error mapping the nested pointer */ > > > > + > > > > + ret = xencomm_arch_hypercall_dom0_op (op_desc); > > > > + > > > > + /* FIXME: should we restore the handle? */ > > > > + if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t))) > > > > + ret = -EFAULT; > > > > + > > > > + if (desc) > > > > + xencomm_free(desc); > > > > +out: > > > > + return ret; > > > > +} > > > > > > You misplaced the out label; it needs to go before xencomm_free(desc); > > > > ??? This was copied from your work. > > You''ve made changes here, and that''s what I''m pointing out. > > > The code branches to out iff xencomm allocation failed. It is safe to > > call xencomm_free but useless. > > There are multiple descriptors being created: one for the dom0_op > top-level structure, and possibly one for a sub-structure. In fact, in > your patch you never free ''op_desc'' inside xencomm_privcmd_dom0_op(). > OK, reading closer, I don''t like that at *all*. The trick is that > xencomm_create_inline() doesn''t actually create anything, and therefore > you don''t need to free it. That needs to change. > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever > way is best for portability) and if it is, do the "inline" stuff. On the > free side, if the descriptor was inline, free can just return. That > would also make me happy because it removes the need to think about > whether callers can/should call "create_inline" or not; the code just > does the right thing.We definitly disagree here. One whole point of xencomm_create_inline is it doesn''t allocate memory and can''t fail. Because of that we don''t need to worry about failure and freeing memory. This makes the code a lot easier to write and to read. Tristan. _______________________________________________ Xen-ia64-devel mailing list Xen-ia64-devel@lists.xensource.com http://lists.xensource.com/xen-ia64-devel
On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote:> > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever > > way is best for portability) and if it is, do the "inline" stuff. On the > > free side, if the descriptor was inline, free can just return. That > > would also make me happy because it removes the need to think about > > whether callers can/should call "create_inline" or not; the code just > > does the right thing. > We definitly disagree here. One whole point of xencomm_create_inline is it > doesn''t allocate memory and can''t fail. Because of that we don''t need to > worry about failure and freeing memory. This makes the code a lot easier to > write and to read.It would simplify the code even more to fold xencomm_create_inline() into xencomm_create(), as I suggest above. That way, the developer never needs to consider if the particular hypercall could ever be called before the page allocator works. Proving that assumption for some hypercall, and guaranteeing it will remain true in the future no matter what Linux changes occur, is a lot more difficult than remembering to call free() after create(). The goal of any API should be to make it impossible to use it incorrectly, and I think my (firm) suggestion makes that true here. -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-ppc-devel mailing list Xen-ppc-devel@lists.xensource.com http://lists.xensource.com/xen-ppc-devel
Le Jeudi 24 Août 2006 17:43, Hollis Blanchard a écrit :> On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote: > > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever > > > way is best for portability) and if it is, do the "inline" stuff. On > > > the free side, if the descriptor was inline, free can just return. That > > > would also make me happy because it removes the need to think about > > > whether callers can/should call "create_inline" or not; the code just > > > does the right thing. > > > > We definitly disagree here. One whole point of xencomm_create_inline is > > it doesn''t allocate memory and can''t fail. Because of that we don''t need > > to worry about failure and freeing memory. This makes the code a lot > > easier to write and to read. > > It would simplify the code even more to fold xencomm_create_inline() > into xencomm_create(), as I suggest above. That way, the developer never > needs to consider if the particular hypercall could ever be called > before the page allocator works. Proving that assumption for some > hypercall, and guaranteeing it will remain true in the future no matter > what Linux changes occur, is a lot more difficult than remembering to > call free() after create().Could you modify the ppc code, I will be happy to fetch directly the code for this new idea.> The goal of any API should be to make it impossible to use it > incorrectly, and I think my (firm) suggestion makes that true here.What about possible errors ? Tristan. _______________________________________________ Xen-ppc-devel mailing list Xen-ppc-devel@lists.xensource.com http://lists.xensource.com/xen-ppc-devel
On Fri, 2006-08-25 at 09:02 +0200, Tristan Gingold wrote:> Le Jeudi 24 Août 2006 17:43, Hollis Blanchard a écrit : > > On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote: > > > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever > > > > way is best for portability) and if it is, do the "inline" stuff. On > > > > the free side, if the descriptor was inline, free can just return. That > > > > would also make me happy because it removes the need to think about > > > > whether callers can/should call "create_inline" or not; the code just > > > > does the right thing. > > > > > > We definitly disagree here. One whole point of xencomm_create_inline is > > > it doesn''t allocate memory and can''t fail. Because of that we don''t need > > > to worry about failure and freeing memory. This makes the code a lot > > > easier to write and to read. > > > > It would simplify the code even more to fold xencomm_create_inline() > > into xencomm_create(), as I suggest above. That way, the developer never > > needs to consider if the particular hypercall could ever be called > > before the page allocator works. Proving that assumption for some > > hypercall, and guaranteeing it will remain true in the future no matter > > what Linux changes occur, is a lot more difficult than remembering to > > call free() after create(). > Could you modify the ppc code, I will be happy to fetch directly the code for > this new idea.I''m looking at this now, and I''m brought back to the fact that I don''t like the "inline" idea because practically speaking it requires that the kernel stack is physically contiguous. That is true for Linux, but is that really true for all OSs? Since we''re defining a Xen interface, I don''t want to hardcode Linux assumptions. Without that, an OS with a physically discontiguous stack would be forced to do the equivalen of get_free_page() to do all communication (including console output). Before the page allocator works that wouldn''t be possible, but I think we can assume a physically contiguous stack early in the boot process before the page allocator works. So then you''re requiring a test: hcall_console_write(char *str) { if (page_allocator_done()) { desc = (desc *)get_free_page(); xencomm_map(desc, str); hcall(desc); free_page(desc); } else { desc = xencomm_create_inline(str); hcall(desc); } } That seems lame. The "mini" xencomm stuff always works. Actually... I guess the "mini" stuff will always work, and any OS that needs it can use it. The "inline" stuff is an optimization that Linux can take advantage of. Summary: I''ve changed my mind, and I only send this email to illustrate my thought process. :) -- Hollis Blanchard IBM Linux Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel