Hollis Blanchard
2005-Aug-17 19:51 UTC
[Xen-devel] passing hypercall parameters by pointer
Many Xen hypercalls pass mlocked pointers as parameters for both input
and output. For example, xc_get_pfn_list() is a nice one with multiple
levels of structures/mlocking.

Considering just the tools for the moment, those pointers are userspace
addresses. Ultimately the hypervisor ends up with that userspace
address, from which it reads and writes data. This is OK for x86, since
userspace, kernel, and hypervisor all share the same virtual address
space (and userspace has carefully mlocked the relevant memory).

On PowerPC though, the hypervisor runs in real mode (no MMU
translation). Unlike x86, PowerPC exceptions arrive in real mode, and
also PowerPC does not force a TLB flush when switching between real and
virtual modes. So a virtual address is pretty much worthless as a
hypervisor parameter; performing the MMU translation in software is
infeasible.

Although it rarely passes parameters by pointer, the pSeries hypervisor
handles this by having the kernel always pass a "pseudo-physical"
address (to borrow Xen terminology), which is trivially translatable to
a "machine" address in the hypervisor. The processor has some notion of
a large (e.g. 64MB) chunk of contiguous machine memory, so the
hypervisor keeps a table of chunks which can be used to translate
pseudo-physical addresses.

Of course, userspace doesn't know pseudo-physical addresses; only the
kernel does. So one way or another, to pass parameters by pointer to the
PPC hypervisor, the kernel is going to need to translate them. That also
means userspace memory areas will be limited to one page (since
virtually consecutive pages may not be representable by a single
pseudo-physical address).

If we're stuck with structure addresses in hypercalls, one possible
solution is to modify libxc so that all parameter addresses are physical
pointers within the same page, then pass that page's physical address
into the hypercall. Something like this:

ulong magicpage_vaddr;
ulong magicpage_paddr;

libxc_init() {
#ifdef __powerpc__
    posix_memalign((void **)&magicpage_vaddr, PAGE_SIZE, PAGE_SIZE);
    mlock((void *)magicpage_vaddr, PAGE_SIZE);
    magicpage_paddr = new_translate_syscall(magicpage_vaddr);
#endif
    ...
}

xc_get_pfn_list() {
    dom0_op_t *op;
    ulong op_paddr;

    magicalloc((ulong *)&op, &op_paddr, sizeof(dom0_op_t));
    ...
}

#ifdef __powerpc__
static ulong offset;

magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes) {
    *usable_addr = magicpage_vaddr + offset;
    *hcall_addr = magicpage_paddr + offset;
    offset += bytes;
}

do_xen_hypercall(ptr) {
    ptr -= magicpage_vaddr - magicpage_paddr;
    do_privcmd(..., ptr);
}
#endif

(Note that this is for discussion only, not a proposed interface.)

Each architecture would provide its own magicalloc and do_xen_hypercall;
for x86, magicalloc would be malloc+mlock and both pointers are the
same, and x86 do_xen_hypercall would remain unchanged. Basically, any
current use of mlock in libxc would be replaced with calls to
magicalloc.

For example, if we're willing to change the embedded pointers in
dom0_ops to offsets, we do not need to invent a new "translate" system
call.

Other suggestions are welcome.

-- 
Hollis Blanchard
IBM Linux Technology Center
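(For illustration only: the x86 arm of the interface sketched above
could stay trivial. This is a guess at its shape, not existing libxc
code; magicalloc is the hypothetical allocator named in the mail, with
error handling added here for concreteness.)

    #ifndef __powerpc__
    #include <stdlib.h>
    #include <sys/mman.h>

    typedef unsigned long ulong;

    /* x86: a userspace virtual address is directly usable by the
     * hypervisor, so "allocate" is just malloc+mlock and both returned
     * addresses are the same pointer. */
    int magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes)
    {
        void *p = malloc(bytes);

        if (p == NULL || mlock(p, bytes) != 0)
            return -1;
        *usable_addr = (ulong)p;
        *hcall_addr  = (ulong)p;
        return 0;
    }
    #endif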
> Many Xen hypercalls pass mlocked pointers as parameters for
> both input and output. For example, xc_get_pfn_list() is a
> nice one with multiple levels of structures/mlocking.
>
> Considering just the tools for the moment, those pointers are
> userspace addresses. Ultimately the hypervisor ends up with
> that userspace address, from which it reads and writes data.
> This is OK for x86, since userspace, kernel, and hypervisor
> all share the same virtual address space (and userspace has
> carefully mlocked the relevant memory).
>
> On PowerPC though, the hypervisor runs in real mode (no MMU
> translation).
> Unlike x86, PowerPC exceptions arrive in real mode, and also
> PowerPC does not force a TLB flush when switching between
> real and virtual modes. So a virtual address is pretty much
> worthless as a hypervisor parameter; performing the MMU
> translation in software is infeasible.

I think I'd prefer to hide all of this by co-operation between the
kernel and the hypervisor's copy to/from user.

The kernel can easily translate a virtual address and length into a list
of pseudo-physical frame numbers and initial offset. Xen's copy from
user function can then use this list when doing its work.

Ian
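(For illustration only: roughly what the kernel-side translation Ian
describes could look like. build_param_desc() and va_to_pfn() are
invented names, not existing kernel interfaces, and the fixed-size array
is only there to keep the sketch short.)

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)
    #define PAGE_MASK  (~(PAGE_SIZE - 1))

    /* Initial offset plus the pseudo-physical frame numbers covering a
     * user buffer; this is what would be handed to Xen's copy routines. */
    struct param_desc {
        unsigned long offset;
        unsigned long nr_frames;
        unsigned long pfns[16];
    };

    /* Hypothetical: the kernel's virtual -> pseudo-physical page lookup. */
    extern unsigned long va_to_pfn(unsigned long va);

    static int build_param_desc(unsigned long va, unsigned long len,
                                struct param_desc *d)
    {
        unsigned long cur;

        d->offset = va & ~PAGE_MASK;
        d->nr_frames = 0;
        for (cur = va & PAGE_MASK; cur < va + len; cur += PAGE_SIZE) {
            if (d->nr_frames >= 16)
                return -1;          /* buffer too large for this sketch */
            d->pfns[d->nr_frames++] = va_to_pfn(cur);
        }
        return 0;
    }

Xen's copy_from_user would then walk d->offset and d->pfns frame by
frame instead of dereferencing the guest virtual address itself.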
Ian Pratt wrote:
>> Many Xen hypercalls pass mlocked pointers as parameters for
>> both input and output. For example, xc_get_pfn_list() is a
>> nice one with multiple levels of structures/mlocking.
>>
>> Considering just the tools for the moment, those pointers are
>> userspace addresses. Ultimately the hypervisor ends up with
>> that userspace address, from which it reads and writes data.
>> This is OK for x86, since userspace, kernel, and hypervisor
>> all share the same virtual address space (and userspace has
>> carefully mlocked the relevant memory).

This is a problem even on x86 for VMX domains which execute hypercalls
because of paravirtualized device drivers.

>> On PowerPC though, the hypervisor runs in real mode (no MMU
>> translation).
>> Unlike x86, PowerPC exceptions arrive in real mode, and also
>> PowerPC does not force a TLB flush when switching between
>> real and virtual modes. So a virtual address is pretty much
>> worthless as a hypervisor parameter; performing the MMU
>> translation in software is infeasible.
>
> I think I'd prefer to hide all of this by co-operation between the
> kernel and the hypervisor's copy to/from user.

This is basically what Xiaofeng attempted to do in this patch:

http://article.gmane.org/gmane.comp.emulators.xen.devel/11107

although the virtual -> pseudo-physical translation is also done in the
hypervisor. Please let us know if the patch is acceptable in light of
your email.

> The kernel can easily translate a virtual address and length into a list
> of pseudo-physical frame numbers and initial offset. Xen's copy from
> user function can then use this list when doing its work.

The other alternative (which we talked about at OLS) is to use a couple
of pinned pages for parameter passing - but it doesn't work very well for:

a) Multiple levels of structures/pointers
b) Arguments which may be bigger than a couple of pages
   (xc_get_pfn_list() for a bigmem domain for example).

-Arun
Hollis Blanchard
2005-Aug-17 22:04 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
On Wednesday 17 August 2005 15:44, Ian Pratt wrote:
> > On PowerPC though, the hypervisor runs in real mode (no MMU
> > translation).
> > Unlike x86, PowerPC exceptions arrive in real mode, and also
> > PowerPC does not force a TLB flush when switching between
> > real and virtual modes. So a virtual address is pretty much
> > worthless as a hypervisor parameter; performing the MMU
> > translation in software is infeasible.
>
> I think I'd prefer to hide all of this by co-operation between the
> kernel and the hypervisor's copy to/from user.
>
> The kernel can easily translate a virtual address and length into a list
> of pseudo-physical frame numbers and initial offset. Xen's copy from
> user function can then use this list when doing its work.

Could you elaborate a little? Consider this structure:

typedef struct {
    /* IN variables. */
    domid_t  domain;
    memory_t max_pfns;
    void    *buffer;
    /* OUT variables. */
    memory_t num_pfns;
} dom0_getmemlist_t;

libxc creates this struct and passes it to the kernel, and the kernel
doesn't know anything about the internals. Are you saying that
privcmd_ioctl() should look like this?

    switch ( cmd )
    {
    case IOCTL_PRIVCMD_HYPERCALL:
    {
        privcmd_hypercall_t hypercall;
        dom0_op_t *op = (dom0_op_t *)&hypercall;

        if ( copy_from_user(&hypercall, (void *)data, sizeof(hypercall)) )
            return -EFAULT;

        /* NEW switch statement: */
        switch (op->cmd) {
        case DOM0_GETMEMLIST:
            op->u.getmemlist.buffer =
                virt_to_phys(op->u.getmemlist.buffer);
            break;
        case DOM0_SETDOMAININFO:
            ...
        case DOM0_READCONSOLE:
            ...
        }
    }
    break;
    }

Right now the kernel doesn't peer inside the hypercall structures at all.

-- 
Hollis Blanchard
IBM Linux Technology Center
Hollis Blanchard
2005-Aug-17 22:11 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
On Wednesday 17 August 2005 16:07, Arun Sharma wrote:
> > I think I'd prefer to hide all of this by co-operation between the
> > kernel and the hypervisor's copy to/from user.
>
> This is basically what Xiaofeng attempted to do in this patch:
>
> http://article.gmane.org/gmane.comp.emulators.xen.devel/11107
>
> although the virtual -> pseudo-physical translation is also done in the
> hypervisor. Please let us know if the patch is acceptable in light of
> your email.

This patch performs MMU translation in software. Even if you like that
on x86, trying to do that on PowerPC is considerably more expensive.
Just the page table lookup could be 16 loads and compares, and that's
not counting segmentation.

> > The kernel can easily translate a virtual address and length into a list
> > of pseudo-physical frame numbers and initial offset. Xen's copy from
> > user function can then use this list when doing its work.
>
> The other alternative (which we talked about at OLS) is to use a couple
> of pinned pages for parameter passing - but it doesn't work very well for:
>
> a) Multiple levels of structures/pointers
> b) Arguments which may be bigger than a couple of pages
>    (xc_get_pfn_list() for a bigmem domain for example).

This is pretty much the proposal I sent earlier. The multiple levels of
pointers can be handled as I showed, by creating an allocator that
manages the couple of pages.

I have no answer for parameters that are very large, but I wonder how
many cases there are. For example, DOM0_READCONSOLE could just be
limited to 4KB reads, and if there's more data than that, call it again.
Perhaps there is some case-specific solution to xc_get_pfn_list() as
well.

-- 
Hollis Blanchard
IBM Linux Technology Center
Ling, Xiaofeng
2005-Aug-18 00:47 UTC
RE: [Xen-devel] passing hypercall parameters by pointer
Arun Sharma <mailto:arun.sharma@intel.com> wrote:
> Ian Pratt wrote:
> The other alternative (which we talked about at OLS) is to use a
> couple of pinned pages for parameter passing - but it doesn't work
> very well for:
>
> a) Multiple levels of structures/pointers

A good example is do_multicall. A complete implementation needs to
enumerate all the hypercalls and deal with each one that uses pointers.

> b) Arguments which may be bigger than a couple of pages
>    (xc_get_pfn_list() for a bigmem domain for example).
>
> -Arun
> From: Ian Pratt
> Sent: Thursday, August 18, 2005 4:44 AM
>
>> On PowerPC though, the hypervisor runs in real mode (no MMU
>> translation).
>> Unlike x86, PowerPC exceptions arrive in real mode, and also
>> PowerPC does not force a TLB flush when switching between
>> real and virtual modes. So a virtual address is pretty much
>> worthless as a hypervisor parameter; performing the MMU
>> translation in software is infeasible.
>
> I think I'd prefer to hide all of this by co-operation between the
> kernel and the hypervisor's copy to/from user.
>
> The kernel can easily translate a virtual address and length into a list
> of pseudo-physical frame numbers and initial offset. Xen's copy from
> user function can then use this list when doing its work.
>
> Ian

So this is a common concern for a hypervisor residing in a different
address space than the guest. For PowerPC, it's real mode (hypervisor)
vs. virtual mode (guest). For a VMX domain, the hypervisor has its own
monitor page table separate from the shadow page table. Expect the final
solution to be uniform too. ;-)

See if I understand your suggestion correctly here. Xiaofeng's previous
patch has the following flow when accessing guest address space:

---hypervisor---
- Search gva in guest page table to get pfn
- Get mfn by pfn
- Map mfn into hypervisor's space
- Then directly access the new va'

Your suggestion is to make the gva->pfn search happen in the guest. The
hypervisor still performs the remaining steps: manipulate the monitor
page table first and then access the new va'. (PowerPC would access the
mfn directly.) Finally, in either option, copy_from/to_user becomes a
memcpy to a new va' without an exception happening.

Now the question comes out. The pseudo-physical frame number list itself
is also a parameter to the hypervisor, and there's no promise that this
list will be confined to a single page. You also need extra info in this
list if multiple parameters are pointers. How to access this scalable
list efficiently seems to be the same puzzle as the subject. For x86,
people may set a maximum limitation, but how about 64-bit platforms? A
good example is get_pfn_list, which always breaks assumptions about
parameter size. ;-)

Thanks,
Kevin
> From: Hollis Blanchard
> Sent: Thursday, August 18, 2005 6:11 AM
>
> I have no answer for parameters that are very large, but I wonder how
> many cases there are. For example, DOM0_READCONSOLE could just be
> limited to 4KB reads, and if there's more data than that, call it
> again. Perhaps there is some case-specific solution to
> xc_get_pfn_list() as well.

If one hypercall wants to grab a specific context at a single point in
time atomically, "call it again" several times actually returns mixed
contexts belonging to different points in time. That's not desired. Even
if people want to add atomic protection for that kind of case,
performance will be affected a lot and there is more risk of deadlock.

Thanks,
Kevin
> From: Hollis Blanchard
> Sent: Thursday, August 18, 2005 6:05 AM
>
>         case DOM0_GETMEMLIST:
>             op->u.getmemlist.buffer =
>                 virt_to_phys(op->u.getmemlist.buffer);
>             break;

If following Ian's suggestion, you have to create a list of pfns here
instead of only converting the start address. There's no guarantee that
the buffer is limited to one page. ;-)

Thanks,
Kevin
Hollis Blanchard
2005-Aug-18 15:58 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
On Aug 18, 2005, at 1:56 AM, Tian, Kevin wrote:
>> From: Hollis Blanchard
>> Sent: Thursday, August 18, 2005 6:05 AM
>>
>>         case DOM0_GETMEMLIST:
>>             op->u.getmemlist.buffer =
>>                 virt_to_phys(op->u.getmemlist.buffer);
>>             break;
>
> If following Ian's suggestion, you have to create a list of pfns here
> instead of only converting the start address. There's no guarantee that
> the buffer is limited to one page. ;-)

Actually that was an explicitly stated limitation. But I think I like
this scatterlist idea.

So for every pointer ("buffer" in the above example), the
pseudo-physical address of a scatterlist would be passed to the
hypervisor instead, and then copy_to/from_user expects a scatterlist
address instead of a plain pointer.

I think the copy_to/from_user and get/put_user API would need to change
though: you'd need the value, the scatterlist pointer, and an offset
into the scatterlist. So x86 would need a slight API change, but could
continue without dealing with any scatterlists, i.e. no ABI change.

The PowerPC kernel would need knowledge of every hypercall structure to
create and translate the scatterlist. I know that's an idea Jimi isn't
fond of, but it really seems like the best solution here.

-- 
Hollis Blanchard
IBM Linux Technology Center
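(For illustration only: one possible shape for the scatterlist and the
changed copy interface sketched above. The names sg_ent, xen_sg, and
copy_from_guest_sg are invented here, not existing Xen APIs, and
map_pfn() stands in for whatever pseudo-physical-to-machine mapping the
hypervisor already does; x86 could keep passing plain pointers through
the same entry point.)

    #include <string.h>

    /* One physically contiguous piece of a guest buffer. */
    struct sg_ent {
        unsigned long pfn;      /* pseudo-physical frame number */
        unsigned int  offset;   /* offset into that frame */
        unsigned int  len;      /* bytes covered by this entry */
    };

    struct xen_sg {
        unsigned int  nr_ents;
        struct sg_ent ent[8];   /* fixed size only to keep the sketch simple */
    };

    /* Hypothetical: map a pseudo-physical frame into Xen's own space. */
    extern void *map_pfn(unsigned long pfn);

    /* copy_from_user variant taking (destination, scatterlist, offset
     * into the scatterlist data, length) instead of a guest VA. */
    static int copy_from_guest_sg(void *dst, const struct xen_sg *sg,
                                  unsigned long off, unsigned long len)
    {
        unsigned int i;
        unsigned long n;
        char *d = dst;

        for (i = 0; i < sg->nr_ents && len != 0; i++) {
            const struct sg_ent *e = &sg->ent[i];

            if (off >= e->len) {        /* requested data starts later */
                off -= e->len;
                continue;
            }
            n = e->len - off;
            if (n > len)
                n = len;
            memcpy(d, (char *)map_pfn(e->pfn) + e->offset + off, n);
            d += n;
            len -= n;
            off = 0;
        }
        return len ? -1 : 0;            /* -1: scatterlist too short */
    }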
Jimi Xenidis
2005-Aug-19 02:00 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
>>>>> "HB" == Hollis Blanchard <hollisb@us.ibm.com> writes:hmm let me bubble up my intro :) HB> I know that''s an idea Jimi isn''t fond of, but it really seems HB> like the best solution here. Why I dislike this solution. 1. Currently, the kernel has no intimate knowledge of the managment calls. This is goodness since this gives the freedom to "innovate" in the management area without impacting the kernel, we now would require kernel updates that grok management structures, creating more opportunity for versioning chaos and bloating of the kernel patch. 2. We are complicating the kernel and the hypervisor in order to keep a user app simple. Does anyone care that a user app suffer a little performace impact? Frankly, I''m much more worried about unecessarily impacting the hypervisor. I believe a negotiated managment area that the application serializes all arguements into to be a far better solution, the area can be of arbitrary size and it the added complexity to the application is trivial. Am I missing something? -JX -- "I got an idea, an idea so smart my head would explode if I even began to know what I was talking about." -- Peter Griffin (Family Guy) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 19 Aug 2005, at 03:00, Jimi Xenidis wrote:

> I believe a negotiated management area that the application serializes
> all arguments into would be a far better solution: the area can be of
> arbitrary size, and the added complexity to the application is trivial.
>
> Am I missing something?

This is the correct answer imo. get_pfn_list() needs to die anyway:
there are better ways to get the list of mfns belonging to a guest (you
can get the list back from increase_reservation, or you can map the
guest's pfn->mfn map).

The current mlock() scheme in libxc is screwed anyway -- we
mlock/munlock regions that may overlap at page granularity. Fixing this
would lead naturally to a preallocation scheme.

 -- Keir
> The current mlock() scheme in libxc is screwed anyway -- we
> mlock/munlock regions that may overlap at page granularity.
> Fixing this would lead naturally to a preallocation scheme.

That's a very good point. For the moment, we should remove all the
munlock() calls for safety. The amount of unnecessary memory we'll end
up pinning will be tiny, so we shouldn't worry about it.

Post 3.0 we can completely redo the dom0 op interface, but the rest of
the hypercall interface will have to remain backward compatible, at
least for x86_*. Since passing by VA is so convenient on the
architectures that support it, we may not want to do anything different
on these anyhow.

For VT paravirt drivers I think pre-registration will work fine. The set
of hypercalls we need to support is small anyhow.

Ian
Jimi Xenidis
2005-Aug-19 11:52 UTC
RE: [Xen-devel] passing hypercall parameters by pointer
>>>>> "IP" == Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> writes:IP> Post 3.0 we can completely redo the dom0 op interface, but the rest of IP> the hypercall interface will have to remain backward compatible, at IP> least for x86_*. Just to clarify, "the rest" refers to hypercalls made from the kernel, correct? Any hypercall using VAs made from user space are at issue here. IP> Since passing by VA is so convenient on the architectures that IP> support it we may not want to do anything different on these IP> anyhow. I agree, why create a new mapping when a usable one exists. At least for common kernel code, we will need to wrap such VAs in a macro so that the "psuedo-physical" is passed in for PPC. I assume this is reasonable? -JX -- "I got an idea, an idea so smart my head would explode if I even began to know what I was talking about." -- Peter Griffin (Family Guy) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 19 Aug 2005, at 12:52, Jimi Xenidis wrote:

> IP> Since passing by VA is so convenient on the architectures that
> IP> support it we may not want to do anything different on these
> IP> anyhow.
>
> I agree: why create a new mapping when a usable one exists?
>
> At least for common kernel code, we will need to wrap such VAs in a
> macro so that the "pseudo-physical" address is passed in for PPC. I
> assume this is reasonable?

This is all potentially fixable before 3.0 final. Paravirt x86 can
continue to use guest virtual addresses. The idea would be that the
registration scheme would essentially create a parameter-passing
'address space' into which you hook pages of memory. On x86 we would map
the address space onto regions of kernel va space. On other arches we
would map the address space onto physical addresses that get mapped into
Xen's va space. get_user/put_user/copy_from_user/copy_to_user will take
guest addresses that point into this parameter-passing address space.

At least we can scope it out by doing a few hypercalls to start with --
probably dom0_ops first and see how it pans out. I think it will work
quite well...

 -- Keir
On 19 Aug 2005, at 12:34, Ian Pratt wrote:

> That's a very good point. For the moment, we should remove all the
> munlock() calls for safety. The amount of unnecessary memory we'll end
> up pinning will be tiny, so we shouldn't worry about it.

The munlock()s indicate where we should deallocate bounce buffers back
to the pre-reservation pool. We should at least mark those places so we
don't have to search for them again later.

 -- Keir
> This is all potentially fixable before 3.0 final. Paravirt
> x86 can continue to use guest virtual addresses. The idea
> would be that the registration scheme would essentially
> create a parameter-passing 'address space' into which you
> hook pages of memory. On x86 we would map the address space
> onto regions of kernel va space. On other arches we would map
> the address space onto physical addresses that get mapped
> into Xen's va space.
> get_user/put_user/copy_from_user/copy_to_user will take guest
> addresses that point into this parameter-passing address space.
>
> At least we can scope it out by doing a few hypercalls to
> start with -- probably dom0_ops first and see how it pans
> out. I think it will work quite well...

I'd be inclined to first go after the ops that are needed for the
paravirtualized drivers (mem_op, grantab_op).

Perhaps people could post a few patch examples for discussion?

NB: This in no way represents a commitment to get this into 3.0-final.
Let's have a look at the patches and decide. [Right now, anything that
isn't fixing bugs or sorting out xenbus/tools is actually a distraction]

Ian
Hollis Blanchard
2005-Aug-19 13:57 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
On Aug 19, 2005, at 7:17 AM, Keir Fraser wrote:

> This is all potentially fixable before 3.0 final. Paravirt x86 can
> continue to use guest virtual addresses. The idea would be that the
> registration scheme would essentially create a parameter-passing
> 'address space' into which you hook pages of memory. On x86 we would
> map the address space onto regions of kernel va space. On other arches
> we would map the address space onto physical addresses that get mapped
> into Xen's va space. get_user/put_user/copy_from_user/copy_to_user
> will take guest addresses that point into this parameter-passing
> address space.

Could you flesh this out a little more? I *think* what you're saying is
this (on PowerPC):
- at boot, the kernel notifies Xen of a parameter page
- replace libxc calls to mlock() with register_this_address() (which
  could be a privcmd ioctl)
- register_this_address() stuffs the userspace pointer and corresponding
  pseudo-physical pointer into a table in the parameter page
- libxc ignorantly creates its structures with userspace addresses
- once the hypercall arrives in Xen, copy_from_user() is passed the
  userspace address
- copy_from_user() consults the table in the parameter page to translate
  userspace -> pseudo-physical, then translates pseudo-physical ->
  machine

Is that right?

-- 
Hollis Blanchard
IBM Linux Technology Center
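(For illustration only: roughly what the table lookup in the last step
above could look like inside Xen's copy_from_user. struct
param_map_entry, param_table, and pseudophys_to_xen_va() are invented
for this sketch, and it assumes each registered region is physically
contiguous, as in the one-page discussion earlier in the thread.)

    #include <string.h>

    #define PARAM_TABLE_ENTRIES 64

    /* One registered user buffer: the userspace address libxc uses and
     * the pseudo-physical address the kernel recorded for it. */
    struct param_map_entry {
        unsigned long user_addr;
        unsigned long pseudophys_addr;
        unsigned long len;
    };

    /* Table living in the registered parameter page. */
    extern struct param_map_entry param_table[PARAM_TABLE_ENTRIES];

    /* Hypothetical: pseudo-physical -> machine translation plus a
     * mapping into Xen's own address space. */
    extern void *pseudophys_to_xen_va(unsigned long paddr);

    /* Given the userspace address found in a hypercall argument, locate
     * the registered region containing it and copy through the machine
     * mapping. */
    static int copy_from_registered_user(void *dst, unsigned long uaddr,
                                         unsigned long len)
    {
        unsigned int i;

        for (i = 0; i < PARAM_TABLE_ENTRIES; i++) {
            struct param_map_entry *e = &param_table[i];

            if (uaddr >= e->user_addr &&
                uaddr + len <= e->user_addr + e->len) {
                unsigned long off = uaddr - e->user_addr;
                memcpy(dst,
                       (char *)pseudophys_to_xen_va(e->pseudophys_addr) + off,
                       len);
                return 0;
            }
        }
        return -1;      /* address was never registered */
    }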
On 19 Aug 2005, at 14:57, Hollis Blanchard wrote:

> Could you flesh this out a little more? I *think* what you're saying is
> this (on PowerPC):
> - at boot, the kernel notifies Xen of a parameter page

It can be multiple pages, and the mappings can change over time. Think
of something like

    set_parameter_page(parameter_address_space_frame,
                       physical_address_space_frame)

establishing a mapping from parameter address space to physical address
space.

> - replace libxc calls to mlock() with register_this_address() (which
>   could be a privcmd ioctl)

Yep. I think libxc would request via a privcmd ioctl. The kernel can
extend the parameter-passing region, or allocate a subsection of the
existing region, and mmap it into user space. It would also return to
libxc the range of parameter-passing addresses that have been allocated
to it.

> - libxc ignorantly creates its structures with userspace addresses

libxc would create structs with parameter-passing addresses.

 -- Keir
Hollis Blanchard
2005-Aug-19 15:18 UTC
Re: [Xen-devel] passing hypercall parameters by pointer
On Aug 19, 2005, at 9:35 AM, Keir Fraser wrote:
> On 19 Aug 2005, at 14:57, Hollis Blanchard wrote:
>
>> - replace libxc calls to mlock() with register_this_address() (which
>>   could be a privcmd ioctl)
>
> Yep. I think libxc would request via a privcmd ioctl. The kernel can
> extend the parameter-passing region, or allocate a subsection of the
> existing region, and mmap it into user space. It would also return to
> libxc the range of parameter-passing addresses that have been allocated
> to it.
>
>> - libxc ignorantly creates its structures with userspace addresses
>
> libxc would create structs with parameter-passing addresses.

Does "parameter-passing addresses" mean offsets inside the parameter
passing space? I think pseudocode is going to be more effective than
English here. Let's take DOM0_PERFCCONTROL as an example:

main() {
    xc_perfc_desc_t *desc = malloc(sizeof(*desc));
    mlock(desc, sizeof(*desc));      // <------------- [1]
    xc_perfc_control(desc);
}

xc_perfc_control(xc_perfc_desc_t *desc) {
    dom0_op_t dop;

    dop.cmd = DOM0_PERFCCONTROL;
    dop.u.perfccontrol.desc = desc;  // <------------ [2]
    do_dom0_op(&dop);
}

Even if you replace malloc/mlock at [1] with a call that maps "parameter
passing" space into this process, what address will you put in the
struct at [2]? That would have to be an offset within the parameter
passing space, right?

-- 
Hollis Blanchard
IBM Linux Technology Center
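(For illustration only: how the libxc side might look if [1] and [2]
switch over to the parameter-passing space. param_alloc() and its two
outputs are hypothetical, echoing the earlier magicalloc idea and
Hollis's pseudocode above rather than any existing interface.)

    /* Hypothetical allocator over the mmap'd parameter-passing region:
     * returns a pointer libxc can write through, plus the offset within
     * the parameter-passing space to put into the hypercall struct. */
    extern int param_alloc(int bytes, void **usable, unsigned long *pp_offset);

    int xc_perfc_control_sketch(void)
    {
        xc_perfc_desc_t *desc;
        unsigned long desc_off;
        dom0_op_t dop;

        if (param_alloc(sizeof(*desc), (void **)&desc, &desc_off))
            return -1;
        /* ... fill in *desc through the usable pointer ... */

        dop.cmd = DOM0_PERFCCONTROL;
        /* [2] now carries the parameter-space offset, not a user VA. */
        dop.u.perfccontrol.desc = (xc_perfc_desc_t *)desc_off;
        return do_dom0_op(&dop);
    }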
On 19 Aug 2005, at 16:18, Hollis Blanchard wrote:

> Even if you replace malloc/mlock at [1] with a call that maps "parameter
> passing" space into this process, what address will you put in the
> struct at [2]? That would have to be an offset within the parameter
> passing space, right?

Yes.

 -- Keir