thr3ads.net - Xen devel - [Xen-devel] RFC: xencomm

Tristan Gingold

2006-Aug-21 15:18 UTC

[Xen-devel] RFC: xencomm - linux side

Hi,

I am posting the Linux xencomm code for review.  I plan to submit it soon
unless there are comments or remarks.

This patch has 3 aims:
* xencomm descriptor creation (drivers/xen/core/xencomm.c)
* linux issued hypercalls translation (drivers/xen/core/xencomm_hcall.c)
* privcmd hypercalls translation (drivers/xen/privcmd/xencomm.c)

Most of this comes from the powerpc people.  Because they use a private Linux
tree, the patches have never been submitted.
However, I have modified the patches substantially so that they can be used
on ia64.  I tried to make them more generic so that they can be shared by
ia64 and powerpc.

Tristan.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Hollis Blanchard

2006-Aug-21 19:07 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> I am posting the Linux xencomm code for review.  I plan to submit it soon
> unless there are comments or remarks.

NAK. I'm still waiting to hear back about how you can use
xencomm_inline() without worrying about page boundaries.

-- 
Hollis Blanchard
IBM Linux Technology Center



Tristan Gingold

2006-Aug-22 07:42 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Monday 21 August 2006 21:07, Hollis Blanchard wrote:
> On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> > I am posting the Linux xencomm code for review.  I plan to submit it soon
> > unless there are comments or remarks.
>
> NAK. I'm still waiting to hear back about how you can use
> xencomm_inline() without worrying about page boundaries.

A more detailed answer:

On Linux/ia64, the kernel is linearly mapped into guest physical memory.  The
same is true for process kernel stacks.  Therefore all kernel structures are
contiguous in guest physical memory.

Kernel data may of course cross page boundaries, but Xen can handle this
correctly using only the guest physical address.

Is something wrong with this?  Is there something that doesn't apply on
powerpc?

Tristan.


Hollis Blanchard

2006-Aug-22 18:11 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Tue, 2006-08-22 at 09:42 +0200, Tristan Gingold wrote:
> On Monday 21 August 2006 21:07, Hollis Blanchard wrote:
> > On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> > > I am posting the Linux xencomm code for review.  I plan to submit it
> > > soon unless there are comments or remarks.
> >
> > NAK. I'm still waiting to hear back about how you can use
> > xencomm_inline() without worrying about page boundaries.
> A more detailed answer:
>
> On Linux/ia64, the kernel is linearly mapped into guest physical memory.
> The same is true for process kernel stacks.  Therefore all kernel
> structures are contiguous in guest physical memory.
>
> Kernel data may of course cross page boundaries, but Xen can handle this
> correctly using only the guest physical address.

I see, and you handle this by breaking up copies at page granularity in
xencomm_copy_from_user().
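
A minimal sketch of what "breaking up copies at page granularity" could look like for a buffer that is contiguous in guest physical memory. This is not the code from the patch or from Xen: guest memory is modeled as a flat array, and every name prefixed with sketch_ (along with the 4 KiB page size) is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size, for the sketch only */

/* Hypothetical stand-in for mapping one guest-physical page: here guest
 * memory is a flat array, so "mapping" is just pointer arithmetic. */
static unsigned char *sketch_map_page(unsigned char *guest_mem,
                                      unsigned long paddr)
{
    return guest_mem + (paddr & ~(SKETCH_PAGE_SIZE - 1));
}

/* Copy len bytes starting at guest-physical address paddr, never letting a
 * single memcpy() cross a page boundary.  Returns the number of chunks. */
static unsigned int sketch_copy_from_guest(void *dst,
                                           unsigned char *guest_mem,
                                           unsigned long paddr, size_t len)
{
    unsigned char *out = dst;
    unsigned int chunks = 0;

    while (len > 0) {
        unsigned long offset = paddr & (SKETCH_PAGE_SIZE - 1);
        size_t chunk = SKETCH_PAGE_SIZE - offset;  /* room left in page */

        if (chunk > len)
            chunk = len;

        memcpy(out, sketch_map_page(guest_mem, paddr) + offset, chunk);

        out += chunk;
        paddr += chunk;
        len -= chunk;
        chunks++;
    }
    return chunks;
}
```

Only the first and last chunks can be shorter than a page; every chunk in between is exactly one page long.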

I have to say I don't really like the code complexity, or the fact that
there are now two very different ways to access guest handles. That
being said, it sure would be nice to get rid of that "mini" stack-based
stuff, so it's OK with me. I would be surprised if there were any
performance difference, however.

I'll send some comments on your original patch.

-- 
Hollis Blanchard
IBM Linux Technology Center



Hollis Blanchard

2006-Aug-22 19:03 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

I apologize for my mailer line-wrapping the patch as I quote it below.

On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig
> --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 09:41:24
> 2006 +0200
> +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 15:04:32
> 2006 +0200
> @@ -257,4 +257,7 @@ config XEN_SMPBOOT
>         default y
>         depends on SMP
>  
> +config XEN_XENCOMM
> +       bool
> +       default n
>  endif
Shouldn't IA64 "select XEN_XENCOMM"? Or is your kernel in a separate
tree?
> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Makefile
> --- a/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 09:41:24
> 2006 +0200
> +++ b/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 15:04:32
> 2006 +0200
> @@ -1,10 +1,10 @@ obj-y += core/
>  obj-y  += core/
>  obj-y  += console/
>  obj-y  += evtchn/
> -obj-y  += privcmd/
>  obj-y  += xenbus/
>  
>  obj-$(CONFIG_XEN_UTIL)                 += util.o
> +obj-$(CONFIG_XEN_PRIVCMD)              += privcmd/
>  obj-$(CONFIG_XEN_BALLOON)              += balloon/
>  obj-$(CONFIG_XEN_DEVMEM)               += char/
>  obj-$(CONFIG_XEN_BLKDEV_BACKEND)       += blkback/
Not really part of this patch.
> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/core/Makefile
> --- a/linux-2.6-xen-sparse/drivers/xen/core/Makefile    Mon Aug 21
> 09:41:24 2006 +0200
> +++ b/linux-2.6-xen-sparse/drivers/xen/core/Makefile    Mon Aug 21
> 15:04:32 2006 +0200
> @@ -11,3 +11,4 @@ obj-$(CONFIG_XEN_SKBUFF)      += skbuff.o
>  obj-$(CONFIG_XEN_SKBUFF)       += skbuff.o
>  obj-$(CONFIG_XEN_REBOOT)       += reboot.o
>  obj-$(CONFIG_XEN_SMPBOOT)      += smpboot.o
> +obj-$(CONFIG_XEN_XENCOMM)      += xencomm.o xencomm_hcall.o
> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile
> --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile Mon Aug 21
> 09:41:24 2006 +0200
> +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/Makefile Mon Aug 21
> 15:04:32 2006 +0200
> @@ -1,2 +1,3 @@
> +obj-y                          := privcmd.o
>  
> -obj-$(CONFIG_XEN_PRIVCMD)      := privcmd.o
> +obj-$(CONFIG_XEN_XENCOMM)      += xencomm.o
I agree with the CONFIG_XEN_PRIVCMD stuff, but I think that should be a
separate patch.
> diff -r b7db009d622c
> linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c
> --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> Aug 21 09:41:24 2006 +0200
> +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> Aug 21 15:04:32 2006 +0200
> @@ -34,6 +34,10 @@
>  
>  static struct proc_dir_entry *privcmd_intf;
>  static struct proc_dir_entry *capabilities_intf;
> +
> +#ifdef CONFIG_XEN_XENCOMM
> +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall);
> +#endif
>  
>  #define NR_HYPERCALLS 64
>  static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS);
> @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i
>                                 "g" ((unsigned long)hypercall.arg[4])
>                                 : "r8", "r10", "memory" );
>                 }
> -#elif defined (__ia64__)
> -               __asm__ __volatile__ (
> -                       ";; mov r14=%2; mov r15=%3; "
> -                       "mov r16=%4; mov r17=%5; mov r18=%6;"
> -                       "mov r2=%1; break 0x1000;; mov %0=r8 ;;"
> -                       : "=r" (ret)
> -                       : "r" (hypercall.op),
> -                       "r" (hypercall.arg[0]),
> -                       "r" (hypercall.arg[1]),
> -                       "r" (hypercall.arg[2]),
> -                       "r" (hypercall.arg[3]),
> -                       "r" (hypercall.arg[4])
> -                       :
> "r14","r15","r16","r17","r18","r2","r8","memory");
> +#elif defined (CONFIG_XEN_XENCOMM)
> +               ret = xencomm_privcmd_hypercall (&hypercall);
>  #endif
>         }
>         break;
Move all the #ifdef stuff into appropriate header files, then have every
arch unconditionally call arch_privcmd_hypercall().
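
A rough sketch of the layout suggested here: the architecture choice is made once in a header, and the common code calls arch_privcmd_hypercall() with no #ifdef at the call site. The struct, the SKETCH_ARCH_USES_XENCOMM macro, and the placeholder function bodies are all hypothetical; only the name arch_privcmd_hypercall() comes from the comment above.

```c
/* Hypothetical stand-in for privcmd_hypercall_t. */
typedef struct sketch_hypercall {
    unsigned long op;
    unsigned long arg[5];
} sketch_hypercall_t;

/* ---- would live in a per-architecture header ---- */
#if defined(SKETCH_ARCH_USES_XENCOMM)

/* Placeholder for the xencomm translation path (not the patch's code). */
static long sketch_xencomm_privcmd_hypercall(sketch_hypercall_t *h)
{
    (void)h;
    return -1;
}

static inline long arch_privcmd_hypercall(sketch_hypercall_t *h)
{
    return sketch_xencomm_privcmd_hypercall(h);
}

#else

/* Placeholder for an architecture that issues the hypercall directly;
 * a real version would contain the inline assembly.  It echoes the
 * operation number so the sketch has observable behavior. */
static inline long arch_privcmd_hypercall(sketch_hypercall_t *h)
{
    return (long)h->op;
}

#endif

/* ---- common code: identical on every architecture ---- */
static long sketch_privcmd_ioctl_hypercall(sketch_hypercall_t *h)
{
    return arch_privcmd_hypercall(h);
}
```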
> diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/core/xencomm.c
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/linux-2.6-xen-sparse/drivers/xen/core/xencomm.c   Mon Aug 21
> 15:04:32 2006 +0200
> @@ -0,0 +1,213 @@
> +/*
> + * Copyright (C) 2006 Hollis Blanchard <hollisb@us.ibm.com>, IBM
> Corporation
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License as published
> by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> 02111-1307 USA
> + */
> +
> +#include <linux/gfp.h>
> +#include <linux/mm.h>
> +#include <asm/page.h>
> +#include <xen/xencomm.h>
> +#include <xen/interface/xen.h>
> +
> +int xencomm_debug;
> +
> +/* translate virtual address to physical address */
> +static unsigned long xen_vaddr_to_paddr(unsigned long vaddr)
> +{
> +       struct page *page;
> +       struct vm_area_struct *vma;
> +
> +#ifdef __ia64__
> +       /* On ia64, TASK_SIZE refers to current.  It is not
> initialized
> +          during boot.
> +          Furthermore the kernel is relocatable and __pa() doesn't
> work on
> +          kernel addresses.  */
> +       if (vaddr >= KERNEL_START
> +           && vaddr < (KERNEL_START + KERNEL_TR_PAGE_SIZE)) {
> +               extern unsigned long kernel_start_pa;
> +               return vaddr - kernel_start_pa;
> +       }
> +#endif
> +       if (vaddr > TASK_SIZE) {
> +               /* kernel address */
> +               return __pa(vaddr);
> +       }
> +
> +       /* XXX double-check (lack of) locking */
> +       vma = find_extend_vma(current->mm, vaddr);
> +       if (!vma)
> +               return ~0UL;
> +
> +       page = follow_page(vma, vaddr, 0);
> +       if (!page)
> +               return ~0UL;
> +
> +       return (page_to_pfn(page) << PAGE_SHIFT) | (vaddr &
> ~PAGE_MASK);
> +}
If there really is no way to implement xen_vaddr_to_paddr() in an
arch-neutral way (and I'm willing to believe that's true), just make the
whole function arch-specific. It wouldn't be too much duplicated code.
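
For reference, the part of the quoted function that any arch-specific version would still share is the final arithmetic: the page frame number supplies the high bits of the physical address and the virtual address supplies the offset within the page. A small sketch of that step alone, assuming 4 KiB pages (the SKETCH_ names are illustrative, not kernel macros):

```c
#define SKETCH_PAGE_SHIFT 12                          /* assumed: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))   /* clears offset bits */

/* Combine a page frame number with the in-page offset of vaddr. */
static unsigned long sketch_compose_paddr(unsigned long pfn,
                                          unsigned long vaddr)
{
    return (pfn << SKETCH_PAGE_SHIFT) | (vaddr & ~SKETCH_PAGE_MASK);
}
```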
> +static int xencomm_init(struct xencomm_desc *desc,
> +                       void *buffer, unsigned long bytes)
> +{
> +       unsigned long recorded = 0;
> +       int i = 0;
> +
> +       BUG_ON((buffer == NULL) && (bytes > 0));
> +
> +       /* record the physical pages used */
> +       if (buffer == NULL)
> +               desc->nr_addrs = 0;
> +
> +       while ((recorded < bytes) && (i < desc->nr_addrs)) {
> +               unsigned long vaddr = (unsigned long)buffer +
> recorded;
> +               unsigned long paddr;
> +               int offset;
> +               int chunksz;
> +
> +               offset = vaddr % PAGE_SIZE; /* handle partial pages */
> +               chunksz = min(PAGE_SIZE - offset, bytes - recorded);
> +
> +               paddr = xen_vaddr_to_paddr(vaddr);
> +               if (paddr == ~0UL) {
> +                       printk("%s: couldn't translate vaddr %lx\n",
> +                              __func__, vaddr);
> +                       return -EINVAL;
> +               }
> +
> +               desc->address[i++] = paddr;
> +               recorded += chunksz;
> +       }
> +
> +       if (recorded < bytes) {
> +               printk("%s: could only translate %ld of %ld bytes\n",
> +                      __func__, recorded, bytes);
> +               return -ENOSPC;
> +       }
> +
> +       /* mark remaining addresses invalid (just for safety) */
> +       while (i < desc->nr_addrs)
> +               desc->address[i++] = XENCOMM_INVALID;
> +
> +       desc->magic = XENCOMM_MAGIC;
> +
> +       return 0;
> +}
> +
> +/* XXX use slab allocator */
> +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask)
> +{
> +       struct xencomm_desc *desc;
> +
> +       /* XXX could we call this from irq context? */
You can remove this comment. It's historical, and we're passing in
gfp_mask now.
> +       desc = (struct xencomm_desc *)__get_free_page(gfp_mask);
> +       if (desc == NULL) {
> +               panic("%s: page allocation failed\n", __func__);
> +       }
> +       desc->nr_addrs = (PAGE_SIZE - sizeof(struct xencomm_desc)) /
> +                       sizeof(*desc->address);
> +
> +       return desc;
> +}
> +
> +void xencomm_free(struct xencomm_handle *desc)
> +{
> +       if (desc)
> +               free_page((unsigned long)__va(desc));
> +}
> +
> +int xencomm_create(void *buffer, unsigned long bytes,
> +                  struct xencomm_handle **ret, gfp_t gfp_mask)
> +{
> +       struct xencomm_desc *desc;
> +       struct xencomm_handle *handle;
> +       int rc;
> +
> +       if (xencomm_debug) {
> +               printk("%s: %p[%ld]\n", __func__, buffer, bytes);
> +       }
> +
> +       if (buffer == NULL || bytes == 0) {
> +               *ret = (struct xencomm_handle *)NULL;
> +               return 0;
> +       }
> +
> +       desc = xencomm_alloc(gfp_mask);
> +       if (!desc) {
> +               printk("%s failure\n", "xencomm_alloc");
> +               return -ENOMEM;
> +       }
> +       handle = (struct xencomm_handle *)__pa(desc);
> +
> +       rc = xencomm_init(desc, buffer, bytes);
> +       if (rc) {
> +               printk("%s failure: %d\n", "xencomm_init", rc);
> +               xencomm_free(handle);
> +               return rc;
> +       }
> +
> +       *ret = handle;
> +       return 0;
> +}
> +
> +/* "mini" routines, for stack-based communications: */
> +
> +static void *xencomm_alloc_mini(void *area, int arealen)
> +{
> +       unsigned long base = (unsigned long)area;
> +       unsigned int pageoffset;
> +
> +       pageoffset = base % PAGE_SIZE;
> +
> +       /* we probably fit right at the front of area */
> +       if ((PAGE_SIZE - pageoffset) >= sizeof(struct xencomm_mini)) {
> +               return area;
> +       }
> +
> +       /* if not, see if area is big enough to advance to the next
> page */
> +       if ((arealen - pageoffset) >= sizeof(struct xencomm_mini))
> +               return (void *)(base + pageoffset);
> +
> +       /* area was too small */
> +       return NULL;
> +}
> +
> +int xencomm_create_mini(void *area, int arealen, void *buffer,
> +                       unsigned long bytes, struct xencomm_handle
> **ret)
> +{
> +       struct xencomm_desc *desc;
> +       int rc;
> +
> +       desc = xencomm_alloc_mini(area, arealen);
> +       if (!desc)
> +               return -ENOMEM;
> +       desc->nr_addrs = XENCOMM_MINI_ADDRS;
> +
> +       rc = xencomm_init(desc, buffer, bytes);
> +       if (rc)
> +               return rc;
> +
> +       *ret = (struct xencomm_handle *)__pa(desc);
> +       return 0;
> +}
*_mini are unused and should be removed entirely.
> +struct xencomm_handle *xencomm_create_inline (void *buffer,
> +                                             unsigned long bytes)
> +{
> +       unsigned long paddr;
> +
> +       paddr = xen_vaddr_to_paddr((unsigned long)buffer);
> +       return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr);
> +}
XENCOMM_INLINE_CREATE is undefined in this patch. I liked your old patch
just fine:
+struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long bytes)
+{
+	return (struct xencomm_desc *)
+		(__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE);
+}
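
To make the inline-handle idea concrete: the handle is the guest physical address with a flag bit ORed in, so the receiver can tell it apart from the address of a full descriptor. The sketch below assumes the flag is the top bit of a 64-bit value; the thread does not show the real definition of XENCOMM_INLINE, so the value and the helper names are illustrative only.

```c
#include <stdint.h>

/* Assumed flag value (top bit); the real XENCOMM_INLINE is not shown in
 * this thread. */
#define SKETCH_XENCOMM_INLINE (UINT64_C(1) << 63)

/* Build an inline handle from a guest physical address. */
static uint64_t sketch_inline_create(uint64_t paddr)
{
    return paddr | SKETCH_XENCOMM_INLINE;
}

/* Nonzero if the handle carries the inline flag. */
static int sketch_is_inline(uint64_t handle)
{
    return (handle & SKETCH_XENCOMM_INLINE) != 0;
}

/* Recover the guest physical address from an inline handle. */
static uint64_t sketch_inline_addr(uint64_t handle)
{
    return handle & ~SKETCH_XENCOMM_INLINE;
}
```

This encoding only works when the flagged bit is never set in a valid guest physical address, and when the buffer is physically contiguous, which is the property discussed earlier in the thread.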
> diff -r b7db009d622c
> linux-2.6-xen-sparse/drivers/xen/core/xencomm_hcall.c
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/linux-2.6-xen-sparse/drivers/xen/core/xencomm_hcall.c     Mon
> Aug 21 15:04:32 2006 +0200
> @@ -0,0 +1,311 @@
> +#include <linux/types.h>
> +#include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/gfp.h>
> +#include <linux/module.h>
> +#include <xen/interface/xen.h>
> +#include <xen/interface/dom0_ops.h>
> +#include <xen/interface/memory.h>
> +#include <xen/interface/xencomm.h>
> +#include <xen/interface/version.h>
> +#include <xen/interface/sched.h>
> +#include <xen/interface/event_channel.h>
> +#include <xen/interface/physdev.h>
> +#include <xen/interface/grant_table.h>
> +#include <xen/interface/callback.h>
> +#include <xen/interface/acm_ops.h>
> +#include <xen/public/privcmd.h>
> +#include <asm/hypercall.h>
> +#include <asm/page.h>
> +#include <asm/uaccess.h>
> +#include <xen/xencomm.h>
> +
> +/* Xencomm notes:
> + *
> + * Some hypercalls are made before the memory subsystem is up, so
> instead of
> + * calling xencomm_create(), we allocate XENCOMM_MINI_AREA bytes from
> the stack
> + * to hold the xencomm descriptor.
Remove above comment.
> + * In general, we need a xencomm descriptor to cover the top-level
> data
> + * structure (e.g. the dom0 op), plus another for every embedded
> pointer to
> + * another data structure (i.e. for every GUEST_HANDLE).
> + */
> +
> +int xencomm_hypercall_console_io(int cmd, int count, char *str)
> +{
> +       struct xencomm_handle *desc;
> +       int rc;
> +
> +       desc = xencomm_create_inline (str, count);
> +
> +       rc = xencomm_arch_hypercall_console_io (cmd, count, desc);
> +
> +       return rc;
> +}
I don't understand the point of all these routines if they just call
arch_foo anyways.

> diff -r b7db009d622c
> linux-2.6-xen-sparse/drivers/xen/privcmd/xencomm.c
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/xencomm.c        Mon
> Aug 21 15:04:32 2006 +0200
> @@ -0,0 +1,358 @@
> +#include <linux/types.h>
> +#include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/gfp.h>
> +#include <linux/module.h>
> +#include <xen/interface/xen.h>
> +#include <xen/interface/dom0_ops.h>
> +#include <xen/interface/memory.h>
> +#include <xen/interface/version.h>
> +#include <xen/interface/event_channel.h>
> +#include <xen/interface/acm_ops.h>
> +#include <xen/public/privcmd.h>
> +#include <asm/hypercall.h>
> +#include <asm/page.h>
> +#include <asm/uaccess.h>
> +#include <xen/xencomm.h>
> +
> +static int xencomm_privcmd_dom0_op(privcmd_hypercall_t *hypercall)
> +{
> +       dom0_op_t kern_op;
> +       dom0_op_t __user *user_op = (dom0_op_t __user
> *)hypercall->arg[0];
> +       struct xencomm_handle *op_desc;
> +       struct xencomm_handle *desc = NULL;
> +       int ret = 0;
> +
> +       if (copy_from_user(&kern_op, user_op, sizeof(dom0_op_t)))
> +               return -EFAULT;
> +
> +       if (kern_op.interface_version != DOM0_INTERFACE_VERSION)
> +               return -EACCES;
> +
> +       op_desc = xencomm_create_inline (&kern_op, sizeof(dom0_op_t));
> +
> +       switch (kern_op.cmd) {
> +       case DOM0_GETMEMLIST:
> +       {
> +               unsigned long nr_pages =
> kern_op.u.getmemlist.max_pfns;
> +#ifdef __ia64__
> +               /* Xen/ia64 pass first_page and nr_pages in max_pfns!
> */
> +               nr_pages &= 0xffffffff;
> +#endif
I'm willing to put up with this only if you guys promise to fix this
silly API incompatibility, at which point it will be removed.
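
To illustrate the incompatibility being discussed: the quoted code keeps only the low 32 bits of max_pfns as the page count on ia64. The sketch below additionally assumes that first_page sits in the upper 32 bits; the patch comment only says both values are passed in max_pfns, so that layout is a guess, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Page count: the low 32 bits, matching "nr_pages &= 0xffffffff". */
static uint32_t sketch_getmemlist_nr_pages(uint64_t max_pfns)
{
    return (uint32_t)(max_pfns & UINT64_C(0xffffffff));
}

/* First page: ASSUMED to be the upper 32 bits (not confirmed by the
 * thread). */
static uint32_t sketch_getmemlist_first_page(uint64_t max_pfns)
{
    return (uint32_t)(max_pfns >> 32);
}
```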
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.getmemlist.buffer),
> +                       nr_pages * sizeof(unsigned long),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.getmemlist.buffer,
> +                                    (void *)desc);
> +               break;
> +       }
> +       case DOM0_SETVCPUCONTEXT:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.setvcpucontext.ctxt),
> +                       sizeof(vcpu_guest_context_t),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.setvcpucontext.ctxt,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_READCONSOLE:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.readconsole.buffer),
> +                       kern_op.u.readconsole.count,
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.readconsole.buffer,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_GETPAGEFRAMEINFO2:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.getpageframeinfo2.array),
> +                       kern_op.u.getpageframeinfo2.num,
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.getpageframeinfo2.array,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_PERFCCONTROL:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.perfccontrol.desc),
> +                       kern_op.u.perfccontrol.nr_counters *
> +                       sizeof(dom0_perfc_desc_t),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.perfccontrol.desc,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_GETVCPUCONTEXT:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.getvcpucontext.ctxt),
> +                       sizeof(vcpu_guest_context_t),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.getvcpucontext.ctxt,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_GETDOMAININFOLIST:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.getdomaininfolist.buffer),
> +                       kern_op.u.getdomaininfolist.num_domains *
> +                       sizeof(dom0_getdomaininfo_t),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.getdomaininfolist.buffer,
> +                                    (void *)desc);
> +               break;
> +       case DOM0_PHYSICAL_MEMORY_MAP:
> +               ret = xencomm_create(
> +                       xen_guest_handle(kern_op.u.physical_memory_map.memory_map),
> +                       kern_op.u.physical_memory_map.nr_map_entries *
> +                       sizeof(struct dom0_memory_map_entry),
> +                       &desc, GFP_KERNEL);
> +               set_xen_guest_handle(kern_op.u.physical_memory_map.memory_map,
> +                                    (void *)desc);
> +               break;
> +
> +       case DOM0_SCHEDCTL:
> +       case DOM0_ADJUSTDOM:
> +       case DOM0_CREATEDOMAIN:
> +       case DOM0_DESTROYDOMAIN:
> +       case DOM0_PAUSEDOMAIN:
> +       case DOM0_UNPAUSEDOMAIN:
> +       case DOM0_GETDOMAININFO:
> +       case DOM0_MSR:
> +       case DOM0_SETTIME:
> +       case DOM0_GETPAGEFRAMEINFO:
> +       case DOM0_SETVCPUAFFINITY:
> +       case DOM0_TBUFCONTROL:
> +       case DOM0_PHYSINFO:
> +       case DOM0_SCHED_ID:
> +       case DOM0_SETDOMAINMAXMEM:
> +       case DOM0_ADD_MEMTYPE:
> +       case DOM0_DEL_MEMTYPE:
> +       case DOM0_READ_MEMTYPE:
> +       case DOM0_IOPORT_PERMISSION:
> +       case DOM0_GETVCPUINFO:
> +       case DOM0_PLATFORM_QUIRK:
> +       case DOM0_MAX_VCPUS:
> +       case DOM0_SETDOMAINHANDLE:
> +       case DOM0_SETDEBUGGING:
> +       case DOM0_DOMAIN_SETUP:
> +               /* no munging needed */
> +               break;
> +
> +       default:
> +               printk("%s: unknown dom0 cmd %d\n", __func__,
> kern_op.cmd);
> +               return -ENOSYS;
> +       }
> +
> +       if (ret)
> +               goto out; /* error mapping the nested pointer */
> +
> +       ret = xencomm_arch_hypercall_dom0_op (op_desc);
> +
> +       /* FIXME: should we restore the handle?  */
> +       if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t)))
> +               ret = -EFAULT;
> +
> +       if (desc)
> +               xencomm_free(desc);
> +out:
> +       return ret;
> +}
You misplaced the out label; it needs to go before xencomm_free(desc).

That's a good question about the copy_to_user(). I thought we never
exposed the modified handles back to the user, but I guess I was wrong.

Also please check whitespace throughout. In particular you seem to be
doing this:
	function (args);
and not even Keir's shall-we-say-unique style does that. ;)
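
The point about the out label can be shown with a generic cleanup pattern: when the label sits before the free, every path that reaches it releases the descriptor. The sketch uses malloc() and free() as stand-ins for xencomm_create() and xencomm_free(), with a counter to make leaks visible; all names are hypothetical and the function-call spacing follows the style requested above.

```c
#include <errno.h>
#include <stdlib.h>

static int sketch_live_descs;  /* descriptors allocated but not yet freed */

static void *sketch_desc_create(void)
{
    void *desc = malloc(16);

    if (desc)
        sketch_live_descs++;
    return desc;
}

static void sketch_desc_free(void *desc)
{
    if (desc) {
        free(desc);
        sketch_live_descs--;
    }
}

/* fail_after_create simulates an error once the descriptor exists. */
static int sketch_do_op(int fail_after_create)
{
    void *desc;
    int ret = 0;

    desc = sketch_desc_create();
    if (!desc)
        return -ENOMEM;

    if (fail_after_create) {
        ret = -EINVAL;
        goto out;  /* error path still reaches the free below */
    }

    /* ... the hypercall and the copy back to the caller would go here ... */

out:
    sketch_desc_free(desc);
    return ret;
}
```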
> +static int xencomm_privcmd_acm_op(privcmd_hypercall_t *hypercall)
> +{
> +       int cmd = hypercall->arg[0];
> +       void __user *arg = (void __user *)hypercall->arg[1];
> +       struct xencomm_handle *op_desc;
> +       struct xencomm_handle *desc = NULL;
> +       int ret;
> +
> +       switch (cmd) {
> +       case ACMOP_getssid:
> +       {
> +               struct acm_getssid kern_arg;
> +
> +               if (copy_from_user (&kern_arg, arg, sizeof
> (kern_arg)))
> +                       return -EFAULT;
> +
> +               op_desc = xencomm_create_inline (&kern_arg,
> sizeof(kern_arg));
> +
> +               ret =
> xencomm_create(xen_guest_handle(kern_arg.ssidbuf),
> +                                    kern_arg.ssidbuf_size,
> +                                    &desc, GFP_KERNEL);
> +               if (ret)
> +                       return ret;
> +
> +               set_xen_guest_handle(kern_arg.ssidbuf, (void *)desc);
> +
> +               ret = xencomm_arch_hypercall_acm_op (cmd, op_desc);
> +
> +               xencomm_free (desc);
> +
> +               if (copy_to_user (arg, &kern_arg, sizeof (kern_arg)))
> +                       return -EFAULT;
> +
> +               return ret;
> +       }
> +       default:
> +               printk("%s: unknown acm_op cmd %d\n", __func__, cmd);
> +               return -ENOSYS;
> +       }
> +
> +       return ret;
> +}
> +
> +static int xencomm_privcmd_memory_op(privcmd_hypercall_t *hypercall)
> +{
> +       const unsigned long cmd = hypercall->arg[0];
> +       int ret = 0;
> +
> +       switch (cmd) {
> +       case XENMEM_increase_reservation:
> +       case XENMEM_decrease_reservation:
> +       {
> +               xen_memory_reservation_t kern_op;
> +               xen_memory_reservation_t __user *user_op;
> +               struct xencomm_handle *desc = NULL;
> +               struct xencomm_handle *desc_op;
> +
> +               user_op = (xen_memory_reservation_t __user
> *)hypercall->arg[1];
> +               if (copy_from_user(&kern_op, user_op,
> +                                  sizeof(xen_memory_reservation_t)))
> +                       return -EFAULT;
> +               desc_op = xencomm_create_inline (&kern_op, sizeof
> (kern_op));
> +
> +               if (xen_guest_handle(kern_op.extent_start)) {
> +                       void * addr;
> +
> +                       addr = xen_guest_handle(kern_op.extent_start);
> +                       ret = xencomm_create
> +                               (addr,
> +                                kern_op.nr_extents *
> +                                sizeof(*xen_guest_handle
> +                                       (kern_op.extent_start)),
> +                                &desc, GFP_KERNEL);
> +                       if (ret)
> +                               return ret;
> +                       set_xen_guest_handle(kern_op.extent_start,
> +                                            (void *)desc);
> +               }
> +
> +               ret = xencomm_arch_hypercall_memory_op (cmd, desc_op);
> +
> +               if (desc)
> +                       xencomm_free (desc);
> +
> +               if (ret != 0)
> +                       return ret;
> +
> +               if (copy_to_user(user_op, &kern_op,
> +                                sizeof(xen_memory_reservation_t)))
> +                       return -EFAULT;
> +
> +               return ret;
> +       }
> +       default:
> +               printk("%s: unknown memory op %lu\n", __func__, cmd);
> +               ret = -ENOSYS;
> +       }
> +       return ret;
> +}
> +
> +static int xencomm_privcmd_xen_version(privcmd_hypercall_t
> *hypercall)
> +{
> +       int cmd = hypercall->arg[0];
> +       void __user *arg = (void __user *)hypercall->arg[1];
> +       struct xencomm_handle *desc;
> +       size_t argsize;
> +       int rc;
> +
> +       switch (cmd) {
> +       case XENVER_version:
> +               /* do not actually pass an argument */
> +               return xencomm_arch_hypercall_xen_version (cmd, 0);
> +       case XENVER_extraversion:
> +               argsize = sizeof(xen_extraversion_t);
> +               break;
> +       case XENVER_compile_info:
> +               argsize = sizeof(xen_compile_info_t);
> +               break;
> +       case XENVER_capabilities:
> +               argsize = sizeof(xen_capabilities_info_t);
> +               break;
> +       case XENVER_changeset:
> +               argsize = sizeof(xen_changeset_info_t);
> +               break;
> +       case XENVER_platform_parameters:
> +               argsize = sizeof(xen_platform_parameters_t);
> +               break;
> +       case XENVER_pagesize:
> +               argsize = (arg == NULL) ? 0 : sizeof(void *);
> +               break;
> +       case XENVER_get_features:
> +               argsize = (arg == NULL) ? 0 :
> sizeof(xen_feature_info_t);
> +               break;
> +
> +       default:
> +               printk("%s: unknown version op %d\n", __func__, cmd);
> +               return -ENOSYS;
> +       }
> +
> +       rc = xencomm_create(arg, argsize, &desc, GFP_KERNEL);
> +       if (rc)
> +               return rc;
> +
> +       rc = xencomm_arch_hypercall_xen_version (cmd, desc);
> +
> +       xencomm_free(desc);
> +
> +       return rc;
> +}
> +
> +static int xencomm_privcmd_event_channel_op(privcmd_hypercall_t
> *hypercall)
> +{
> +       int cmd = hypercall->arg[0];
> +       struct xencomm_handle *desc;
> +       unsigned int argsize;
> +       int ret;
> +
> +       switch (cmd) {
> +       case EVTCHNOP_alloc_unbound:
> +               argsize = sizeof(evtchn_alloc_unbound_t);
> +               break;
> +
> +       case EVTCHNOP_status:
> +               argsize = sizeof(evtchn_status_t);
> +               break;
> +
> +       default:
> +               printk("%s: unknown EVTCHNOP %d\n", __func__, cmd);
> +               return -EINVAL;
> +       }
> +
> +       ret = xencomm_create((void *)hypercall->arg[1], argsize,
> +                            &desc, GFP_KERNEL);
> +       if (ret)
> +               return ret;
> +
> +       ret = xencomm_arch_hypercall_event_channel_op (cmd, desc);
> +
> +       xencomm_free(desc);
> +       return ret;
> +}
> +
> +int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall)
> +{
> +       switch (hypercall->op) {
> +       case __HYPERVISOR_dom0_op:
> +               return xencomm_privcmd_dom0_op(hypercall);
> +        case __HYPERVISOR_acm_op:
> +               return xencomm_privcmd_acm_op(hypercall);
> +       case __HYPERVISOR_xen_version:
> +               return xencomm_privcmd_xen_version(hypercall);
> +       case __HYPERVISOR_memory_op:
> +               return xencomm_privcmd_memory_op(hypercall);
> +       case __HYPERVISOR_event_channel_op:
> +               return xencomm_privcmd_event_channel_op(hypercall);
> +       default:
> +               printk("%s: unknown hcall (%ld)\n", __func__,
> hypercall->op);
> +               return -ENOSYS;
> +       }
> +}
> +
> diff -r b7db009d622c linux-2.6-xen-sparse/include/xen/xencomm.h
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/linux-2.6-xen-sparse/include/xen/xencomm.h        Mon Aug 21
> 15:04:32 2006 +0200
> @@ -0,0 +1,45 @@
> +/*
> + * Copyright (C) 2006 Hollis Blanchard <hollisb@us.ibm.com>, IBM
> Corporation
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License as published
> by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> 02111-1307 USA
> + */
> +
> +#ifndef _LINUX_XENCOMM_H_
> +#define _LINUX_XENCOMM_H_
> +
> +#include <xen/interface/xencomm.h>
> +
> +#define XENCOMM_MINI_ADDRS 3
> +struct xencomm_mini {
> +    struct xencomm_desc _desc;
> +    uint64_t address[XENCOMM_MINI_ADDRS];
> +};
> +#define XENCOMM_MINI_AREA (sizeof(struct xencomm_mini) * 2)
Remove above.
> +/* To avoid additionnal virt to phys convertion, the user only sees
> handle
> +   which are opaque structures.  */
> +struct xencomm_handle;
Typos in the comment.
> +extern int xencomm_create(void *buffer, unsigned long bytes,
> +                         struct xencomm_handle **desc, gfp_t type);
> +extern void xencomm_free(struct xencomm_handle *desc);
> +extern int xencomm_create_mini(void *area, int arealen, void *buffer,
> +            unsigned long bytes, struct xencomm_handle **ret);
Remove above.
> +struct xencomm_handle *xencomm_create_inline (void *buffer,
> +                                             unsigned long bytes);
> +
> +#define xen_guest_handle(hnd)  ((hnd).p)
> +
> +#endif /* _LINUX_XENCOMM_H_ */
> diff -r b7db009d622c linux-2.6-xen-sparse/include/xen/xencomm_hcall.h
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/linux-2.6-xen-sparse/include/xen/xencomm_hcall.h  Mon Aug 21
> 15:04:32 2006 +0200
> @@ -0,0 +1,45 @@
> +/*
> + * Copyright (C) 2006 Tristan Gingold <tristan.gingold@bull.net>,
> Bull SAS
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License as published
> by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> 02111-1307 USA
> + */
> +
> +#ifndef _LINUX_XENCOMM_HCALL_H_
> +#define _LINUX_XENCOMM_HCALL_H_
> +
> +/* These function creates inline descriptor for the parameters and
> +   calls the correspondig xencomm_arch_hypercall_X.
> +   Architectures should defines HYPERVISOR_xxx as
> xencomm_hypercall_xxx unless
> +   they want to use their own wrapper.  */
"corresponding"

And I'm not clear on the reason for all the xencomm_arch_*, especially
because I haven't seen IA64's. If you're worried about the structure
size conversion I mentioned earlier, I think PowerPC will need to fix
that *before* the xencomm stuff is called anyways. So unless IA64 needs
something funny in xencomm_arch_*, they should all be removed.

-- 
Hollis Blanchard
IBM Linux Technology Center



Tristan Gingold

2006-Aug-23 07:59 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Tuesday 22 August 2006 21:03, Hollis Blanchard wrote:
> I apologize for my mailer line-wrapping the patch as I quote it below.
>
> On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig
> > --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 09:41:24
> > 2006 +0200
> > +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 15:04:32
> > 2006 +0200
> > @@ -257,4 +257,7 @@ config XEN_SMPBOOT
> >         default y
> >         depends on SMP
> >
> > +config XEN_XENCOMM
> > +       bool
> > +       default n
> >  endif
>
> Shouldn't IA64 "select XEN_XENCOMM"? Or is your kernel in a separate
> tree?

The arch Kconfig overrides this parameter.

> > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Makefile
> > --- a/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 09:41:24
> > 2006 +0200
> > +++ b/linux-2.6-xen-sparse/drivers/xen/Makefile Mon Aug 21 15:04:32
> > 2006 +0200
> > @@ -1,10 +1,10 @@ obj-y += core/
> >  obj-y  += core/
> >  obj-y  += console/
> >  obj-y  += evtchn/
> > -obj-y  += privcmd/
> >  obj-y  += xenbus/
> >
> >  obj-$(CONFIG_XEN_UTIL)                 += util.o
> > +obj-$(CONFIG_XEN_PRIVCMD)              += privcmd/
> >  obj-$(CONFIG_XEN_BALLOON)              += balloon/
> >  obj-$(CONFIG_XEN_DEVMEM)               += char/
> >  obj-$(CONFIG_XEN_BLKDEV_BACKEND)       += blkback/
>
> Not really part of this patch.

Ok, I will send a separate patch.

[...]

> I agree with the CONFIG_XEN_PRIVCMD stuff, but I think that should be a
> separate patch.

> > diff -r b7db009d622c
> > linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c
> > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> > Aug 21 09:41:24 2006 +0200
> > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> > Aug 21 15:04:32 2006 +0200
> > @@ -34,6 +34,10 @@
> >
> >  static struct proc_dir_entry *privcmd_intf;
> >  static struct proc_dir_entry *capabilities_intf;
> > +
> > +#ifdef CONFIG_XEN_XENCOMM
> > +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall);
> > +#endif
> >
> >  #define NR_HYPERCALLS 64
> >  static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS);
> > @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i
> >                                 "g" ((unsigned long)hypercall.arg[4])
> >
> >                                 : "r8", "r10", "memory" );
> >
> >                 }
> > -#elif defined (__ia64__)
> > -               __asm__ __volatile__ (
> > -                       ";; mov r14=%2; mov r15=%3; "
> > -                       "mov r16=%4; mov r17=%5; mov r18=%6;"
> > -                       "mov r2=%1; break 0x1000;; mov %0=r8 ;;"
> > -                       : "=r" (ret)
> > -                       : "r" (hypercall.op),
> > -                       "r" (hypercall.arg[0]),
> > -                       "r" (hypercall.arg[1]),
> > -                       "r" (hypercall.arg[2]),
> > -                       "r" (hypercall.arg[3]),
> > -                       "r" (hypercall.arg[4])
> > -                       :
> > "r14","r15","r16","r17","r18","r2","r8","memory");
> > +#elif defined (CONFIG_XEN_XENCOMM)
> > +               ret = xencomm_privcmd_hypercall (&hypercall);
> >  #endif
> >         }
> >         break;
>
> Move all the #ifdef stuff into appropriate header files, then have every
> arch unconditionally call arch_privcmd_hypercall().

I simply prefer not to touch other people's code, as I can't try xen/x86.

[...]

> > +/* translate virtual address to physical address */
> > +static unsigned long xen_vaddr_to_paddr(unsigned long vaddr)
> > +{
> > +       struct page *page;
> > +       struct vm_area_struct *vma;
> > +
> > +#ifdef __ia64__
> > +       /* On ia64, TASK_SIZE refers to current.  It is not
> > initialized
> > +          during boot.
> > +          Furthermore the kernel is relocatable and __pa() doesn't
> > work on
> > +          kernel addresses.  */
> > +       if (vaddr >= KERNEL_START
> > +           && vaddr < (KERNEL_START + KERNEL_TR_PAGE_SIZE)) {
> > +               extern unsigned long kernel_start_pa;
> > +               return vaddr - kernel_start_pa;
> > +       }
> > +#endif
> > +       if (vaddr > TASK_SIZE) {
> > +               /* kernel address */
> > +               return __pa(vaddr);
> > +       }
> > +
> > +       /* XXX double-check (lack of) locking */
> > +       vma = find_extend_vma(current->mm, vaddr);
> > +       if (!vma)
> > +               return ~0UL;
> > +
> > +       page = follow_page(vma, vaddr, 0);
> > +       if (!page)
> > +               return ~0UL;
> > +
> > +       return (page_to_pfn(page) << PAGE_SHIFT) | (vaddr &
> > ~PAGE_MASK);
> > +}
>
> If there really is no way to implement xen_vaddr_to_paddr() in an
> arch-neutral way (and I'm willing to believe that's true), just make the
> whole function arch-specific. It wouldn't be too much duplicated code.

I will try to improve this.
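
The last line of the quoted xen_vaddr_to_paddr() combines a page frame number with the offset of the address inside its page. That arithmetic can be tried in isolation. The following is only a userspace sketch assuming 4 KiB pages; SKETCH_PAGE_SHIFT, SKETCH_PAGE_SIZE, SKETCH_PAGE_MASK and sketch_paddr() are hypothetical names, not part of the patch or of any kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only; a real kernel takes
 * PAGE_SHIFT and PAGE_MASK from its architecture headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Combine a page frame number with the in-page offset of vaddr,
 * mirroring (pfn << PAGE_SHIFT) | (vaddr & ~PAGE_MASK). */
static uint64_t sketch_paddr(uint64_t pfn, uint64_t vaddr)
{
	return (pfn << SKETCH_PAGE_SHIFT) | (vaddr & ~SKETCH_PAGE_MASK);
}
```

With these assumed values, frame 0x1234 and virtual address 0x7fff0abc give 0x1234000 | 0xabc.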

[...]

> > +/* XXX use slab allocator */
> > +static struct xencomm_desc *xencomm_alloc(gfp_t gfp_mask)
> > +{
> > +       struct xencomm_desc *desc;
> > +
> > +       /* XXX could we call this from irq context? */
>
> You can remove this comment. It's historical, and we're passing in
> gfp_mask now.

Ok.

[...]

> *_mini are unused and should be removed entirely.

Ok.

> > +struct xencomm_handle *xencomm_create_inline (void *buffer,
> > +                                             unsigned long bytes)
> > +{
> > +       unsigned long paddr;
> > +
> > +       paddr = xen_vaddr_to_paddr((unsigned long)buffer);
> > +       return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr);
> > +}
>
> XENCOMM_INLINE_CREATE is undefined in this patch. I liked your old patch
> just fine:
> +struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long
> bytes)
> +{
> +	return (struct xencomm_desc *)
> +		(__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE);
> +}

It is defined in arch-xxx.h file.

> > +#include <xen/xencomm.h>
> > +
> > +/* Xencomm notes:
> > + *
> > + * Some hypercalls are made before the memory subsystem is up, so
> > instead of
> > + * calling xencomm_create(), we allocate XENCOMM_MINI_AREA bytes from
> > the stack
> > + * to hold the xencomm descriptor.
>
> Remove above comment.

Ok.

> > + * In general, we need a xencomm descriptor to cover the top-level
> > data
> > + * structure (e.g. the dom0 op), plus another for every embedded
> > pointer to
> > + * another data structure (i.e. for every GUEST_HANDLE).
> > + */
> > +
> > +int xencomm_hypercall_console_io(int cmd, int count, char *str)
> > +{
> > +       struct xencomm_handle *desc;
> > +       int rc;
> > +
> > +       desc = xencomm_create_inline (str, count);
> > +
> > +       rc = xencomm_arch_hypercall_console_io (cmd, count, desc);
> > +
> > +       return rc;
> > +}
>
> I don't understand the point of all these routines if they just call
> arch_foo anyways.

Sorry, I have not explained the principle.
The xencomm_arch_hypercall_XXX functions are the raw hypercalls.  They must be
defined by architecture code.  The xencomm_hypercall_XXX functions are the
xencomm wrappers and are shared.
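
The split described above can be pictured with a small userspace sketch. It is not the patch's code: the arch function is a stub that only records its arguments, the virtual-to-physical translation is skipped, and SKETCH_INLINE_FLAG (top address bit marks an inline handle) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct xencomm_handle;	/* opaque, as in the quoted header */

/* Assumption: an inline handle is the buffer address with the top bit set. */
#define SKETCH_INLINE_FLAG ((uintptr_t)1 << (sizeof(uintptr_t) * 8 - 1))

static struct xencomm_handle *last_desc;	/* recorded by the stub below */
static int last_count;

/* Arch layer: stands in for the raw hypercall an architecture provides. */
static int xencomm_arch_hypercall_console_io(int cmd, int count,
					      struct xencomm_handle *desc)
{
	(void)cmd;
	last_desc = desc;
	last_count = count;
	return count;	/* pretend every byte was written */
}

/* Shared layer: build an inline handle without allocating anything. */
static struct xencomm_handle *xencomm_create_inline(void *buffer,
						    unsigned long bytes)
{
	(void)bytes;
	return (struct xencomm_handle *)((uintptr_t)buffer | SKETCH_INLINE_FLAG);
}

/* Shared layer: the wrapper that callers use in place of the raw hypercall. */
static int xencomm_hypercall_console_io(int cmd, int count, char *str)
{
	struct xencomm_handle *desc = xencomm_create_inline(str, count);

	return xencomm_arch_hypercall_console_io(cmd, count, desc);
}
```

In this arrangement only the stub would differ between ia64 and powerpc; the wrapper is the part meant to be shared.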

[...]

> > +       case DOM0_GETMEMLIST:
> > +       {
> > +               unsigned long nr_pages = kern_op.u.getmemlist.max_pfns;
> > +#ifdef __ia64__
> > +               /* Xen/ia64 pass first_page and nr_pages in max_pfns!
> > */
> > +               nr_pages &= 0xffffffff;
> > +#endif
>
> I'm willing to put up with this only if you guys promise to fix this
> silly API incompatibility, at which point it will be removed.

I hope this could be fixed once xencomm is used by ia64.

[...]

> > +       if (ret)
> > +               goto out; /* error mapping the nested pointer */
> > +
> > +       ret = xencomm_arch_hypercall_dom0_op (op_desc);
> > +
> > +       /* FIXME: should we restore the handle?  */
> > +       if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t)))
> > +               ret = -EFAULT;
> > +
> > +       if (desc)
> > +               xencomm_free(desc);
> > +out:
> > +       return ret;
> > +}
>
> You misplaced the out label; it needs to go before xencomm_free(desc);

??? This was copied from your work.
The code branches to out iff the xencomm allocation failed.  It would be safe
to call xencomm_free there, but useless.
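
For readers following the label discussion, here is a generic sketch of goto-based cleanup when two descriptors are both allocated. It assumes, unlike the quoted patch (where the top-level descriptor is created inline), that each descriptor needs a matching free. sketch_create(), sketch_free() and sketch_op() are hypothetical stand-ins, not xencomm functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_desc { int unused; };

static int sketch_live;	/* number of descriptors currently allocated */

/* Hypothetical stand-in for an allocating create; 'fail' forces an error. */
static int sketch_create(struct sketch_desc **out, int fail)
{
	if (fail)
		return -ENOMEM;
	*out = malloc(sizeof(**out));
	if (!*out)
		return -ENOMEM;
	sketch_live++;
	return 0;
}

/* NULL-safe free, so it may be called on error paths as well. */
static void sketch_free(struct sketch_desc *desc)
{
	if (!desc)
		return;
	free(desc);
	sketch_live--;
}

/* One descriptor for the top-level op, one for a nested buffer. */
static int sketch_op(int fail_nested)
{
	struct sketch_desc *op_desc = NULL;
	struct sketch_desc *desc = NULL;
	int ret;

	ret = sketch_create(&op_desc, 0);
	if (ret)
		return ret;

	ret = sketch_create(&desc, fail_nested);
	if (ret)
		goto out;	/* op_desc still has to be released */

	ret = 0;		/* the hypercall itself would go here */
out:
	sketch_free(desc);
	sketch_free(op_desc);
	return ret;
}
```

With the label placed before the frees, both the success path and the failure path release whatever was allocated.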
> That's a good question about the copy_to_user(). I thought we never
> exposed the modified handles back to the user, but I guess I was wrong.
>
> Also please check whitespace throughout. In particular you seem to be
> doing this:
> 	function (args);
> and not even Keir's shall-we-say-unique style does that. ;)

Yes.

[...]

> > +#include <xen/interface/xencomm.h>
> > +
> > +#define XENCOMM_MINI_ADDRS 3
> > +struct xencomm_mini {
> > +    struct xencomm_desc _desc;
> > +    uint64_t address[XENCOMM_MINI_ADDRS];
> > +};
> > +#define XENCOMM_MINI_AREA (sizeof(struct xencomm_mini) * 2)
>
> Remove above.

Ok.

> > +/* To avoid additionnal virt to phys convertion, the user only sees
> > handle
> > +   which are opaque structures.  */
> > +struct xencomm_handle;
>
> Typos in the comment.

Oops.

> > +extern int xencomm_create(void *buffer, unsigned long bytes,
> > +                         struct xencomm_handle **desc, gfp_t type);
> > +extern void xencomm_free(struct xencomm_handle *desc);
> > +extern int xencomm_create_mini(void *area, int arealen, void *buffer,
> > +            unsigned long bytes, struct xencomm_handle **ret);
>
> Remove above.

Ok.

[...]

> > +/* These function creates inline descriptor for the parameters and
> > +   calls the correspondig xencomm_arch_hypercall_X.
> > +   Architectures should defines HYPERVISOR_xxx as
> > xencomm_hypercall_xxx unless
> > +   they want to use their own wrapper.  */
>
> "corresponding"

Oops.

> And I'm not clear on the reason for all the xencomm_arch_*, especially
> because I haven't seen IA64's. If you're worried about the structure
> size conversion I mentioned earlier, I think PowerPC will need to fix
> that *before* the xencomm stuff is called anyways. So unless IA64 needs
> something funny in xencomm_arch_*, they should all be removed.

See the explanation above: basically, xencomm_arch_* are the raw hypercalls.

Tristan.


Hollis Blanchard

2006-Aug-23 16:35 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Wed, 2006-08-23 at 09:59 +0200, Tristan Gingold wrote:
> On Tuesday 22 August 2006 21:03, Hollis Blanchard wrote:
> > I apologize for my mailer line-wrapping the patch as I quote it below.
> >
> > On Mon, 2006-08-21 at 17:18 +0200, Tristan Gingold wrote:
> > > diff -r b7db009d622c linux-2.6-xen-sparse/drivers/xen/Kconfig
> > > --- a/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 09:41:24
> > > 2006 +0200
> > > +++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig  Mon Aug 21 15:04:32
> > > 2006 +0200
> > > @@ -257,4 +257,7 @@ config XEN_SMPBOOT
> > >         default y
> > >         depends on SMP
> > >
> > > +config XEN_XENCOMM
> > > +       bool
> > > +       default n
> > >  endif
> >
> > Shouldn't IA64 "select XEN_XENCOMM"? Or is your kernel in a separate
> > tree?
>
> The arch Kconfig overrides this parameter.

My point was that I didn't see that in this patch.

> > > diff -r b7db009d622c
> > > linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c
> > > --- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> > > Aug 21 09:41:24 2006 +0200
> > > +++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon
> > > Aug 21 15:04:32 2006 +0200
> > > @@ -34,6 +34,10 @@
> > >
> > >  static struct proc_dir_entry *privcmd_intf;
> > >  static struct proc_dir_entry *capabilities_intf;
> > > +
> > > +#ifdef CONFIG_XEN_XENCOMM
> > > +extern int xencomm_privcmd_hypercall(privcmd_hypercall_t *hypercall);
> > > +#endif
> > >
> > >  #define NR_HYPERCALLS 64
> > >  static DECLARE_BITMAP(hypercall_permission_map, NR_HYPERCALLS);
> > > @@ -91,19 +95,8 @@ static int privcmd_ioctl(struct inode *i
> > >                                 "g" ((unsigned long)hypercall.arg[4])
> > >
> > >                                 : "r8", "r10", "memory" );
> > >
> > >                 }
> > > -#elif defined (__ia64__)
> > > -               __asm__ __volatile__ (
> > > -                       ";; mov r14=%2; mov r15=%3; "
> > > -                       "mov r16=%4; mov r17=%5; mov r18=%6;"
> > > -                       "mov r2=%1; break 0x1000;; mov %0=r8 ;;"
> > > -                       : "=r" (ret)
> > > -                       : "r" (hypercall.op),
> > > -                       "r" (hypercall.arg[0]),
> > > -                       "r" (hypercall.arg[1]),
> > > -                       "r" (hypercall.arg[2]),
> > > -                       "r" (hypercall.arg[3]),
> > > -                       "r" (hypercall.arg[4])
> > > -                       :
> > > "r14","r15","r16","r17","r18","r2","r8","memory");
> > > +#elif defined (CONFIG_XEN_XENCOMM)
> > > +               ret = xencomm_privcmd_hypercall (&hypercall);
> > >  #endif
> > >         }
> > >         break;
> >
> > Move all the #ifdef stuff into appropriate header files, then have every
> > arch unconditionally call arch_privcmd_hypercall().
>
> I simply prefer not to touch other people code, as I can't try xen/x86.

That's nice, but you're just moving code, and it's the Right Thing To
Do, so please do it. You can point out that you've only compile-tested
x86 when you submit.
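
The header arrangement being asked for might look roughly like the sketch below. It is a userspace illustration only: sketch_hypercall_t stands in for privcmd_hypercall_t, SKETCH_ARCH_XENCOMM is a hypothetical configuration macro, and the return values are placeholders rather than real hypercall results. Only the name arch_privcmd_hypercall() comes from the thread.

```c
#include <assert.h>

/* Stand-in for the privcmd_hypercall_t used in the patch. */
typedef struct {
	unsigned long op;
	unsigned long arg[5];
} sketch_hypercall_t;

/* --- this part would live in a per-architecture header --------------- */
#if defined(SKETCH_ARCH_XENCOMM)
static inline long arch_privcmd_hypercall(sketch_hypercall_t *hc)
{
	/* a xencomm architecture would translate the arguments first */
	return (long)hc->op + 1000;	/* placeholder result */
}
#else
static inline long arch_privcmd_hypercall(sketch_hypercall_t *hc)
{
	/* other architectures would issue the hypercall directly */
	return (long)hc->op;		/* placeholder result */
}
#endif

/* --- shared driver code, with no #ifdef left in it -------------------- */
static long sketch_privcmd_ioctl(sketch_hypercall_t *hc)
{
	return arch_privcmd_hypercall(hc);
}
```

The conditional compilation is confined to the header, so the ioctl path reads the same on every architecture.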
> > > +struct xencomm_handle *xencomm_create_inline (void *buffer,
> > > +                                             unsigned long bytes)
> > > +{
> > > +       unsigned long paddr;
> > > +
> > > +       paddr = xen_vaddr_to_paddr((unsigned long)buffer);
> > > +       return (struct xencomm_handle *)XENCOMM_INLINE_CREATE(paddr);
> > > +}
> >
> > XENCOMM_INLINE_CREATE is undefined in this patch. I liked your old patch
> > just fine:
> > +struct xencomm_desc *xencomm_create_inline (void *buffer, unsigned long
> > bytes)
> > +{
> > +	return (struct xencomm_desc *)
> > +		(__kern_paddr((unsigned long)buffer) | XENCOMM_INLINE);
> > +}
>
> It is defined in arch-xxx.h file.

But why? Do you anticipate that architectures will mark "inline"
descriptors differently? If so, how?

> > > + * In general, we need a xencomm descriptor to cover the top-level
> > > data
> > > + * structure (e.g. the dom0 op), plus another for every embedded
> > > pointer to
> > > + * another data structure (i.e. for every GUEST_HANDLE).
> > > + */
> > > +
> > > +int xencomm_hypercall_console_io(int cmd, int count, char *str)
> > > +{
> > > +       struct xencomm_handle *desc;
> > > +       int rc;
> > > +
> > > +       desc = xencomm_create_inline (str, count);
> > > +
> > > +       rc = xencomm_arch_hypercall_console_io (cmd, count, desc);
> > > +
> > > +       return rc;
> > > +}
> >
> > I don't understand the point of all these routines if they just call
> > arch_foo anyways.
>
> Sorry I have not explained the principle.
> xencomm_arch_hypercall_XXX are the raw hypercalls.  They must be defined by
> architecture code. The xencomm_hypercall_XXX are the xencomm wrapper and are
> shared.

That much is clear. :) My question is what is being done in those "raw
hypercalls" that can't be done here? You didn't include them in your
patch, so I can't tell.

It seems there are a few missing pieces to your patch. Next time please
include the whole thing, including arch-specific parts, so we can see
what's going on.

> > > +       if (ret)
> > > +               goto out; /* error mapping the nested pointer */
> > > +
> > > +       ret = xencomm_arch_hypercall_dom0_op (op_desc);
> > > +
> > > +       /* FIXME: should we restore the handle?  */
> > > +       if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t)))
> > > +               ret = -EFAULT;
> > > +
> > > +       if (desc)
> > > +               xencomm_free(desc);
> > > +out:
> > > +       return ret;
> > > +}
> >
> > You misplaced the out label; it needs to go before xencomm_free(desc);
>
> ??? This was copied from your work.

You've made changes here, and that's what I'm pointing out.

> The code branches to out iff xencomm allocation failed.  It is safe to call
> xencomm_free but useless.

There are multiple descriptors being created: one for the dom0_op
top-level structure, and possibly one for a sub-structure. In fact, in
your patch you never free 'op_desc' inside xencomm_privcmd_dom0_op().

OK, reading closer, I don't like that at *all*. The trick is that
xencomm_create_inline() doesn't actually create anything, and therefore
you don't need to free it. That needs to change.

My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever
way is best for portability) and if it is, do the "inline" stuff. On the
free side, if the descriptor was inline, free can just return. That
would also make me happy because it removes the need to think about
whether callers can/should call "create_inline" or not; the code just
does the right thing.
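
One way to picture this suggestion is the userspace sketch below. It is an illustration under stated assumptions, not code from either tree: sketch_is_kernel_addr() stands in for the IS_KERNEL_ADDR() test by checking against a fake "kernel" region, SKETCH_INLINE_FLAG (top address bit) marks inline handles, malloc() stands in for descriptor allocation, and the sketch assumes the allocator never returns an address with the top bit set, as on typical 64-bit userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct xencomm_handle;	/* opaque */

/* Assumption: the top address bit marks an inline handle. */
#define SKETCH_INLINE_FLAG ((uintptr_t)1 << (sizeof(uintptr_t) * 8 - 1))

static char sketch_kernel_region[256];	/* fake "kernel" memory */
static int sketch_allocs;		/* live allocated descriptors */

/* Hypothetical stand-in for IS_KERNEL_ADDR(). */
static int sketch_is_kernel_addr(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)sketch_kernel_region;

	return a >= lo && a < lo + sizeof(sketch_kernel_region);
}

/* Single entry point: inline when possible, allocate otherwise. */
static int sketch_create(void *buffer, unsigned long bytes,
			 struct xencomm_handle **ret)
{
	void *desc;

	(void)bytes;
	if (sketch_is_kernel_addr(buffer)) {
		*ret = (struct xencomm_handle *)
			((uintptr_t)buffer | SKETCH_INLINE_FLAG);
		return 0;
	}

	desc = malloc(64);	/* stands in for a real descriptor page */
	if (!desc)
		return -ENOMEM;
	sketch_allocs++;
	*ret = desc;
	return 0;
}

/* Free is a no-op for inline handles, so callers always pair it. */
static void sketch_free(struct xencomm_handle *h)
{
	if (!h || ((uintptr_t)h & SKETCH_INLINE_FLAG))
		return;
	free(h);
	sketch_allocs--;
}
```

A caller then writes create followed by free in every case and does not need to know which path was taken.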

-- 
Hollis Blanchard
IBM Linux Technology Center



Tristan Gingold

2006-Aug-24 07:51 UTC

[Xen-ia64-devel] Re: [XenPPC] RFC: xencomm - linux side

On Wednesday 23 August 2006 18:35, Hollis Blanchard wrote:
> On Wed, 2006-08-23 at 09:59 +0200, Tristan Gingold wrote:
> > On Tuesday 22 August 2006 21:03, Hollis Blanchard wrote:
> > > I don't understand the point of all these routines if they just call
> > > arch_foo anyways.
> >
> > Sorry I have not explained the principle.
> > xencomm_arch_hypercall_XXX are the raw hypercalls.  They must be defined
> > by architecture code. The xencomm_hypercall_XXX are the xencomm wrapper
> > and are shared.
>
> That much is clear. :) My question is what is being done in those "raw
> hypercalls" that can't be done here? You didn't include them in your
> patch, so I can't tell.
>
> It seems there are a few missing pieces to your patch. Next time please
> include the whole thing, including arch-specific parts, so we can see
> what's going on.
>
> > > > +       if (ret)
> > > > +               goto out; /* error mapping the nested pointer */
> > > > +
> > > > +       ret = xencomm_arch_hypercall_dom0_op (op_desc);
> > > > +
> > > > +       /* FIXME: should we restore the handle?  */
> > > > +       if (copy_to_user(user_op, &kern_op, sizeof(dom0_op_t)))
> > > > +               ret = -EFAULT;
> > > > +
> > > > +       if (desc)
> > > > +               xencomm_free(desc);
> > > > +out:
> > > > +       return ret;
> > > > +}
> > >
> > > You misplaced the out label; it needs to go before xencomm_free(desc);
> >
> > ??? This was copied from your work.
>
> You've made changes here, and that's what I'm pointing out.
>
> > The code branches to out iff xencomm allocation failed.  It is safe to
> > call xencomm_free but useless.
>
> There are multiple descriptors being created: one for the dom0_op
> top-level structure, and possibly one for a sub-structure. In fact, in
> your patch you never free 'op_desc' inside xencomm_privcmd_dom0_op().
> OK, reading closer, I don't like that at *all*. The trick is that
> xencomm_create_inline() doesn't actually create anything, and therefore
> you don't need to free it. That needs to change.
>
> My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever
> way is best for portability) and if it is, do the "inline" stuff. On the
> free side, if the descriptor was inline, free can just return. That
> would also make me happy because it removes the need to think about
> whether callers can/should call "create_inline" or not; the code just
> does the right thing.

We definitely disagree here.  One whole point of xencomm_create_inline is that
it doesn't allocate memory and can't fail.  Because of that we don't need to
worry about failure and freeing memory.  This makes the code a lot easier to
write and to read.

Tristan.

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel

Hollis Blanchard

2006-Aug-24 15:43 UTC

Re: [XenPPC] RFC: xencomm - linux side

On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote:
> > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever
> > way is best for portability) and if it is, do the "inline" stuff. On the
> > free side, if the descriptor was inline, free can just return. That
> > would also make me happy because it removes the need to think about
> > whether callers can/should call "create_inline" or not; the code just
> > does the right thing.
>
> We definitly disagree here.  One whole point of xencomm_create_inline is it
> doesn't allocate memory and can't fail.  Because of that we don't need to
> worry about failure and freeing memory.  This makes the code a lot easier to
> write and to read.

It would simplify the code even more to fold xencomm_create_inline()
into xencomm_create(), as I suggest above. That way, the developer never
needs to consider if the particular hypercall could ever be called
before the page allocator works. Proving that assumption for some
hypercall, and guaranteeing it will remain true in the future no matter
what Linux changes occur, is a lot more difficult than remembering to
call free() after create().

The goal of any API should be to make it impossible to use it
incorrectly, and I think my (firm) suggestion makes that true here.

-- 
Hollis Blanchard
IBM Linux Technology Center


_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel

Tristan Gingold

2006-Aug-25 07:02 UTC

Re: [XenPPC] RFC: xencomm - linux side

On Thursday 24 August 2006 17:43, Hollis Blanchard wrote:
> On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote:
> > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever
> > > way is best for portability) and if it is, do the "inline" stuff. On
> > > the free side, if the descriptor was inline, free can just return. That
> > > would also make me happy because it removes the need to think about
> > > whether callers can/should call "create_inline" or not; the code just
> > > does the right thing.
> >
> > We definitly disagree here.  One whole point of xencomm_create_inline is
> > it doesn't allocate memory and can't fail.  Because of that we don't need
> > to worry about failure and freeing memory.  This makes the code a lot
> > easier to write and to read.
>
> It would simplify the code even more to fold xencomm_create_inline()
> into xencomm_create(), as I suggest above. That way, the developer never
> needs to consider if the particular hypercall could ever be called
> before the page allocator works. Proving that assumption for some
> hypercall, and guaranteeing it will remain true in the future no matter
> what Linux changes occur, is a lot more difficult than remembering to
> call free() after create().

Could you modify the ppc code?  I will be happy to fetch the code for this
new idea directly.

> The goal of any API should be to make it impossible to use it
> incorrectly, and I think my (firm) suggestion makes that true here.

What about possible errors?

Tristan.


Hollis Blanchard

2006-Sep-08 19:22 UTC

[Xen-devel] Re: [XenPPC] RFC: xencomm - linux side

On Fri, 2006-08-25 at 09:02 +0200, Tristan Gingold wrote:
> On Thursday 24 August 2006 17:43, Hollis Blanchard wrote:
> > On Thu, 2006-08-24 at 09:51 +0200, Tristan Gingold wrote:
> > > > My suggestion: have xencomm_create() test IS_KERNEL_ADDR() (in whatever
> > > > way is best for portability) and if it is, do the "inline" stuff. On
> > > > the free side, if the descriptor was inline, free can just return. That
> > > > would also make me happy because it removes the need to think about
> > > > whether callers can/should call "create_inline" or not; the code just
> > > > does the right thing.
> > >
> > > We definitly disagree here.  One whole point of xencomm_create_inline is
> > > it doesn't allocate memory and can't fail.  Because of that we don't need
> > > to worry about failure and freeing memory.  This makes the code a lot
> > > easier to write and to read.
> >
> > It would simplify the code even more to fold xencomm_create_inline()
> > into xencomm_create(), as I suggest above. That way, the developer never
> > needs to consider if the particular hypercall could ever be called
> > before the page allocator works. Proving that assumption for some
> > hypercall, and guaranteeing it will remain true in the future no matter
> > what Linux changes occur, is a lot more difficult than remembering to
> > call free() after create().
>
> Could you modify the ppc code, I will be happy to fetch directly the code for
> this new idea.

I'm looking at this now, and I'm brought back to the fact that I don't
like the "inline" idea because practically speaking it requires that the
kernel stack is physically contiguous. That is true for Linux, but is
that really true for all OSs? Since we're defining a Xen interface, I
don't want to hardcode Linux assumptions.
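
The contiguity concern can be made concrete with a small check: a buffer that stays inside one page is physically contiguous whatever the mapping, while a buffer that crosses a page boundary is contiguous only if the underlying frames happen to be adjacent. The following is a userspace sketch assuming 4 KiB pages; SKETCH_PAGE_SIZE, sketch_pages_spanned() and sketch_fits_one_page() are hypothetical names, not part of xencomm.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096)	/* assumed page size */

/* Number of pages touched by the byte range [addr, addr + len). */
static uint64_t sketch_pages_spanned(uint64_t addr, uint64_t len)
{
	if (len == 0)
		return 0;
	return (addr + len - 1) / SKETCH_PAGE_SIZE
		- addr / SKETCH_PAGE_SIZE + 1;
}

/* A range within a single page needs no assumption about the frames
 * behind neighbouring pages. */
static int sketch_fits_one_page(uint64_t addr, uint64_t len)
{
	return sketch_pages_spanned(addr, len) <= 1;
}
```

An inline descriptor for a range that fails this check relies on the OS guaranteeing adjacent physical frames, which is the assumption being questioned here.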

Without that, an OS with a physically discontiguous stack would be
forced to do the equivalent of get_free_page() to do all communication
(including console output). Before the page allocator works that
wouldn't be possible, but I think we can assume a physically contiguous
stack early in the boot process before the page allocator works. So then
you're requiring a test:

hcall_console_write(char *str) {
	if (page_allocator_done()) {
		desc = (desc *)get_free_page();
		xencomm_map(desc, str);
		hcall(desc);
		free_page(desc);
	} else {
		desc = xencomm_create_inline(str);
		hcall(desc);
	}
}

That seems lame. The "mini" xencomm stuff always works.

Actually... I guess the "mini" stuff will always work, and any OS that
needs it can use it. The "inline" stuff is an optimization that Linux
can take advantage of.

Summary: I've changed my mind, and I only send this email to illustrate
my thought process. :)

-- 
Hollis Blanchard
IBM Linux Technology Center


