Derek Murray
2007-Mar-19  10:40 UTC
[Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
I've completed a version of a Linux kernel driver for mapping granted pages into user-space. There are a few use cases for this, but the most important (from a XenSE/dom0 disaggregation point of view) is changing the way the console daemon and xenstore map their respective pages in the guests. A further possibility might be reimplementing blktap entirely in user-space.

Currently the driver supports:

* Mapping one or more pages at a time into a contiguous portion of memory.
* Mapping up to 128 pages per file descriptor (this is an arbitrary limit, but it would be trivial to make this configurable).
* Each page may only be mapped once. However, it is possible to map the same grant more than once: simply register the grant twice with the ioctl, and then map the two (different) offsets.

In addition, I've written some simple libxc functions (of the same format as the xc_evtchn_* functions) that provide access to the driver.

I've split up the patch as follows:

1. gntdev.patch: This is the main driver, and associated header file.
2. libxc-changes.patch: These are the libxc functions for accessing the driver.
3. linux-changes.patch: These contain the necessary changes to linux (in effect, adding a hook to the vm_operations_struct that is called by the unmap_page_range function) for unmapping grants before the page table is destroyed.

I'd welcome your comments.

Regards,

Derek Murray.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
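[Editor's sketch] The rules in the feature list above (a 128-page limit per file descriptor, each page mapped at most once, the same grant registered twice to obtain two offsets) can be modelled in plain user-space C. This is not the driver's code: the struct layout, the function names gntdev_register and gntdev_map, and the 4096-byte page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GNTDEV_MAX_SLOTS 128     /* the "arbitrary limit" from the post */
#define PAGE_SIZE_BYTES  4096L   /* assumed page size */

/* Hypothetical per-page record; the real driver's layout is not shown. */
struct slot {
    uint32_t domid;   /* granting domain */
    uint32_t gref;    /* grant reference */
    int      in_use;  /* registered through the (modelled) ioctl */
    int      mapped;  /* already covered by a (modelled) mmap */
};

struct gntdev_fd {
    struct slot slots[GNTDEV_MAX_SLOTS];
};

/* Register `count` grants in the first contiguous run of free slots.
 * Returns the byte offset to use for mapping, or -1 if no run fits. */
long gntdev_register(struct gntdev_fd *fd, uint32_t domid,
                     const uint32_t *grefs, size_t count)
{
    assert(fd != NULL && grefs != NULL);
    if (count == 0 || count > GNTDEV_MAX_SLOTS)
        return -1;
    for (size_t start = 0; start + count <= GNTDEV_MAX_SLOTS; start++) {
        size_t run = 0;
        while (run < count && !fd->slots[start + run].in_use)
            run++;
        if (run < count)
            continue;                      /* run interrupted, try next start */
        for (size_t i = 0; i < count; i++) {
            struct slot *s = &fd->slots[start + i];
            s->domid = domid;
            s->gref = grefs[i];
            s->in_use = 1;
            s->mapped = 0;
        }
        return (long)start * PAGE_SIZE_BYTES;
    }
    return -1;
}

/* Map `count` pages starting at `offset`. Returns 0 on success, or -1 if
 * any page in the range is unregistered or has already been mapped. */
int gntdev_map(struct gntdev_fd *fd, long offset, size_t count)
{
    assert(fd != NULL);
    if (offset < 0 || offset % PAGE_SIZE_BYTES != 0)
        return -1;
    size_t first = (size_t)(offset / PAGE_SIZE_BYTES);
    if (count == 0 || count > GNTDEV_MAX_SLOTS ||
        first + count > GNTDEV_MAX_SLOTS)
        return -1;
    for (size_t i = 0; i < count; i++)
        if (!fd->slots[first + i].in_use || fd->slots[first + i].mapped)
            return -1;
    for (size_t i = 0; i < count; i++)
        fd->slots[first + i].mapped = 1;
    return 0;
}
```

Registering the same grant reference twice yields two distinct offsets, each of which can be mapped once; a second attempt to map either offset is refused.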
Keir Fraser
2007-Mar-19  14:00 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On 19/3/07 10:40, "Derek Murray" <Derek.Murray@cl.cam.ac.uk> wrote:

> 1. gntdev.patch: This is the main driver, and associated header file.
> 2. libxc-changes.patch: These are the libxc functions for accessing
> the driver.
> 3. linux-changes.patch: These contain the necessary changes to linux
> (in effect, adding a hook to the vm_operations_struct that is called
> by the unmap_page_range function) for unmapping grants before the
> page table is destroyed.

1. The patches are malformed (don't apply). It might be best to send the patches as plain-text attachments to prevent mangling by your email client.
2. Dependencies should be listed in the right order. So your patch 3 should be patch 1, since it is required by the grant-table device.
3. Just looking at linux-changes.patch, I think the new interface function's semantics need to be clearer. Do you really mean for the hook function to be called *and* for unmap_page_range() to do the usual zap_pud_range() work? Should the hook function return anything? What about responsibility for synchronising TLBs?

 -- Keir
Derek Murray
2007-Mar-19  14:31 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
Keir Fraser wrote:

> 1. The patches are malformed (don't apply). It might be best to send the
> patches as plain-text attachments to prevent mangling by your email client.
> 2. Dependencies should be listed in the right order. So your patch 3 should
> be patch 1, since it is required by the grant-table device.

Sorry about those: I'll fix them in a subsequent repost.

> 3. Just looking at linux-changes.patch, I think the new interface function's
> semantics need to be clearer. Do you really mean for the hook function to be
> called *and* for unmap_page_range() to do the usual zap_pud_range() work?
> Should the hook function return anything? What about responsibility for
> synchronising TLBs?

Might it be better to replace this hook with something that operates on the PTE (rather than vm_area) level? Thus the hook could be called from zap_pte_range(), and this would harness all the TLB synchronisation that is normally employed.

I'm not sure that the hook function should return anything: as far as I understand, the unmap-grant hypercall should not fail, and, if it does, I don't see how we can recover from that.

Regards,

Derek.
Keir Fraser
2007-Mar-19  15:11 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On 19/3/07 14:31, "Derek Murray" <Derek.Murray@cl.cam.ac.uk> wrote:

>> 3. Just looking at linux-changes.patch, I think the new interface function's
>> semantics need to be clearer. Do you really mean for the hook function to be
>> called *and* for unmap_page_range() to do the usual zap_pud_range() work?
>> Should the hook function return anything? What about responsibility for
>> synchronising TLBs?
>
> Might it be better to replace this hook with something that operates on the
> PTE (rather than vm_area) level? Thus the hook could be called from
> zap_pte_range(), and this would harness all the TLB synchronisation that is
> normally employed.

I think that might be better, yes. It avoids needing to do pagetable walking in your driver to find the PTEs, and you can also use tlb_remove_tlb_entry() on each PTE you zap to cause an eventual TLB flush on tlb_finish_mmu() (which is called from the caller of unmap_page_range()).

> I'm not sure that the hook function should return anything: as far as I
> understand, the unmap-grant hypercall should not fail, and, if it does, I
> don't see how we can recover from that.

You're probably right, especially if you implement the hook as a zap_pte_range() alternative.

 -- Keir
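[Editor's sketch] The design agreed in this exchange (a per-PTE hook that returns nothing, acts as an alternative to the normal zap, and relies on batched TLB flushing) can be illustrated with a small user-space model. Nothing here is kernel code: the model_* names, the tlb_batch struct and the hook signature are hypothetical stand-ins for zap_pte_range(), tlb_remove_tlb_entry() and tlb_finish_mmu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;              /* modelled PTE: 0 means "not present" */

/* Stand-in for the kernel's TLB gather state. */
struct tlb_batch {
    size_t pending;                  /* entries zapped since the last flush */
    size_t flushes;                  /* batched flushes performed so far */
};

/* Hypothetical per-PTE hook; it returns nothing, as discussed above. */
typedef void (*zap_pte_hook)(pte_t *pte, unsigned long addr, void *ctx);

/* Model of tlb_remove_tlb_entry(): record the entry, flush later. */
static void model_tlb_remove_entry(struct tlb_batch *tlb)
{
    tlb->pending++;
}

/* Model of tlb_finish_mmu(): one flush covers every pending entry. */
void model_tlb_finish(struct tlb_batch *tlb)
{
    if (tlb->pending) {
        tlb->flushes++;
        tlb->pending = 0;
    }
}

/* Model of zap_pte_range(). When a hook is supplied it replaces the usual
 * clearing of each present PTE; either way the entry is queued for the
 * batched flush. Returns the number of PTEs torn down. */
size_t model_zap_pte_range(struct tlb_batch *tlb, pte_t *ptes, size_t n,
                           unsigned long start, zap_pte_hook hook, void *ctx)
{
    assert(tlb != NULL && ptes != NULL);
    size_t zapped = 0;
    for (size_t i = 0; i < n; i++) {
        if (ptes[i] == 0)
            continue;                        /* nothing mapped here */
        unsigned long addr = start + i * 4096UL;
        if (hook)
            hook(&ptes[i], addr, ctx);       /* driver-specific teardown */
        else
            ptes[i] = 0;                     /* ordinary zap */
        model_tlb_remove_entry(tlb);
        zapped++;
    }
    return zapped;
}

/* Example hook standing in for the unmap-grant hypercall. */
void model_unmap_grant(pte_t *pte, unsigned long addr, void *ctx)
{
    size_t *unmapped = (size_t *)ctx;
    (void)addr;
    *pte = 0;
    (*unmapped)++;
}
```

The point of the model is the ordering: the hook runs once per present PTE, and a single flush in model_tlb_finish() covers all of them, so the driver needs no page-table walk or TLB handling of its own.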
Isaku Yamahata
2007-Mar-21  04:32 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On Mon, Mar 19, 2007 at 10:40:18AM +0000, Derek Murray wrote:

> 1. gntdev.patch: This is the main driver, and associated header file.

This patch doesn't work on ia64 and ppc because they don't support GNTMAP_application_map, which is x86 specific. The flag doesn't make sense for ia64 and ppc. (When auto_translated_physmap_mode is enabled, GNTMAP_application_map isn't necessary.)

Fortunately there was a similar issue in blktap. It was solved by utilizing auto_translated_physmap_mode.

If it is difficult for you to work on it (an ia64 or ppc machine is necessary), I'd be happy to do it.

Your goal seems to be to rewrite the xen console daemon and xenstored (and blktap if possible). They are very fundamental, so I want to resolve it in advance instead of finding/fixing the breakage after the commit.

-- 
yamahata
Derek Murray
2007-Mar-21  11:29 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On 21 Mar 2007, at 04:32, Isaku Yamahata wrote:

> This patch doesn't work on ia64 and ppc because they don't support
> GNTMAP_application_map, which is x86 specific. The flag doesn't make
> sense for ia64 and ppc. (When auto_translated_physmap_mode is enabled,
> GNTMAP_application_map isn't necessary.)
> Fortunately there was a similar issue in blktap. It was solved
> by utilizing auto_translated_physmap_mode.

Please forgive my ignorance of ia64 and ppc, as I don't have ready access to either for development.

I based my work on the blktap driver, although there were some bits that I didn't understand. blktap appears to map each grant into kernel space and, when not using auto-translated mode, into user space as well. If it is using auto-translated mode, then the grant is only mapped into the kernel.

Am I correct in thinking that the use of the VM_FOREIGN flag, plus the use of vm_private_data to store an array of struct page pointers, enables the pages to be mapped into user space using get_user_pages() (which is called by make_pages_present(), which is called by do_mmap_pgoff())? Or is it the call to vm_insert_page()?

> If it is difficult for you to work on it (an ia64 or ppc machine is
> necessary), I'd be happy to do it.

I'll have a try at adding the code to work in these cases, though I would appreciate your feedback on whether the changes are correct.

> Your goal seems to be to rewrite the xen console daemon and xenstored
> (and blktap if possible).
> They are very fundamental, so I want to resolve it in advance
> instead of finding/fixing the breakage after the commit.

Definitely!

Regards,

Derek Murray.
Andrew Warfield
2007-Mar-21  14:16 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
> Am I correct in thinking that the use of the VM_FOREIGN flag, plus
> the use of vm_private_data to store an array of struct page pointers,
> enables the pages to be mapped into user space using get_user_pages()
> (which is called by make_pages_present(), which is called by
> do_mmap_pgoff())? Or is it the call to vm_insert_page()?

Yes, VM_FOREIGN was added to allow get_user_pages to sort out where the granted page was mapped in kernel and to identify the underlying machine mapping. As blktap uses zero-copy, this was necessary for aio to map down appropriately and to arrange DMA directly to the guest pages.

The Linux virtual memory code has churned a bunch since I wrote that, so there may now be a better approach than the dual-mapping and the special case in get_user_pages() to resolve them -- then again, maybe not. ;)

I really like the early cleanup hook in your patches to cleanly unmap any outstanding granted pages before zapping the range on vm area destruction. It's something that I've been wanting to do for a while -- very stability-adding. ;)

a.
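[Editor's sketch] The VM_FOREIGN special case described above amounts to an array lookup: for a flagged area, vm_private_data holds one struct page pointer per page of the mapping, indexed by the page offset from the start of the area. The user-space model below shows only that indexing. The struct definitions, the MODEL_VM_FOREIGN bit value and the function name are assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_VM_FOREIGN 0x1UL   /* placeholder bit, not the kernel's value */

/* Modelled page descriptor: it only carries a machine frame number. */
struct page {
    unsigned long mfn;
};

/* Modelled VM area with the fields the lookup needs. */
struct vma {
    unsigned long vm_start;      /* first address in the area */
    unsigned long vm_end;        /* first address past the area */
    unsigned long vm_flags;
    void *vm_private_data;       /* for foreign areas: struct page *[] */
};

/* Return the page backing `addr` in a foreign area, or NULL when the
 * caller should fall back to the ordinary page-table walk. */
struct page *model_foreign_lookup(const struct vma *vma, unsigned long addr)
{
    assert(vma != NULL);
    if (!(vma->vm_flags & MODEL_VM_FOREIGN))
        return NULL;
    if (addr < vma->vm_start || addr >= vma->vm_end)
        return NULL;
    struct page **map = (struct page **)vma->vm_private_data;
    if (map == NULL)
        return NULL;
    return map[(addr - vma->vm_start) >> MODEL_PAGE_SHIFT];
}
```

In this model a slot that has not been filled in is simply a NULL entry in the array, which also sends the caller down the fallback path.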
Isaku Yamahata
2007-Mar-22  02:28 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On Wed, Mar 21, 2007 at 11:29:29AM +0000, Derek Murray wrote:

> blktap appears to map each grant into kernel space and, when not
> using auto-translated mode, into user space as well. If it is using
> auto-translated mode, then the grant is only mapped into the kernel.

In auto-translated physmap mode, the grant table works on the pseudo-physical address space, not the virtual address space. Please see gnttab_set_map/unmap_op() using __pa(). It is up to the guest kernel to map the pseudo-physical address to kernel/user virtual address space using kernel functionality.

Given that Linux adopts a 1:1 mapping between pseudo-physical address space and kernel virtual address space, there is

> Am I correct in thinking that the use of the VM_FOREIGN flag, plus
> the use of vm_private_data to store an array of struct page pointers,
> enables the pages to be mapped into user space using get_user_pages()
> (which is called by make_pages_present(), which is called by
> do_mmap_pgoff())? Or is it the call to vm_insert_page()?

Both setting vm_private_data and calling vm_insert_page() are necessary. vm_insert_page() is the counterpart of GNTMAP_application_map for auto-translated physmap mode. I may be wrong, though, because I haven't taken a closer look at your code.

> I'll have a try at adding the code to work in these cases, though I
> would appreciate your feedback on whether the changes are correct.

I'm willing to review/test. Do you have a test program? Can you provide it? I may need it when testing.

-- 
yamahata
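[Editor's sketch] The split described in this message can be written as a small decision function: without auto-translation the user-space mapping is requested from the hypervisor with GNTMAP_application_map, while in auto-translated physmap mode the grant is mapped at a pseudo-physical address and vm_insert_page() does the user-space part. The flag values below are given as I recall them from Xen's public grant_table.h and should be checked against your tree. The plan struct, the enum and the function name are hypothetical, and combining GNTMAP_contains_pte with the application mapping follows my reading of the blktap approach rather than anything stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror Xen's public grant_table.h; verify before relying on them. */
#define GNTMAP_host_map        (1u << 1)
#define GNTMAP_readonly        (1u << 2)
#define GNTMAP_application_map (1u << 3)
#define GNTMAP_contains_pte    (1u << 4)

/* How the page reaches user space once the grant operation is built. */
enum user_map_step {
    STEP_GRANT_OP_ON_PTE,   /* hypervisor fills in the user PTE */
    STEP_VM_INSERT_PAGE     /* guest kernel inserts the page itself */
};

/* Hypothetical summary of what a driver would do for one granted page. */
struct user_map_plan {
    uint32_t flags;
    enum user_map_step step;
};

struct user_map_plan plan_user_mapping(bool auto_translated, bool readonly)
{
    struct user_map_plan plan;

    if (auto_translated) {
        /* The grant is mapped at a pseudo-physical address; the kernel
         * then maps that page into the process with vm_insert_page(). */
        plan.flags = GNTMAP_host_map;
        plan.step = STEP_VM_INSERT_PAGE;
    } else {
        /* The hypervisor writes the user-space PTE directly. */
        plan.flags = GNTMAP_host_map | GNTMAP_application_map |
                     GNTMAP_contains_pte;
        plan.step = STEP_GRANT_OP_ON_PTE;
    }
    if (readonly)
        plan.flags |= GNTMAP_readonly;
    return plan;
}
```

Keeping the decision in one place is a design choice of the sketch: the rest of a driver could then stay identical across x86, ia64 and ppc, which is the portability concern raised above.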
Derek Murray
2007-Mar-23  14:50 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
On 22 Mar 2007, at 02:28, Isaku Yamahata wrote:

> In auto-translated physmap mode, the grant table works on the
> pseudo-physical address space, not the virtual address space.
> Please see gnttab_set_map/unmap_op() using __pa().
> It is up to the guest kernel to map the pseudo-physical address
> to kernel/user virtual address space using kernel functionality.
>
> Given that Linux adopts a 1:1 mapping between pseudo-physical address
> space and kernel virtual address space, there is
>
> Both setting vm_private_data and calling vm_insert_page() are necessary.
> vm_insert_page() is the counterpart of GNTMAP_application_map for
> auto-translated physmap mode.
> I may be wrong, though, because I haven't taken a closer look at your
> code.

Okay, the reimplementation of the driver following the blktap model seems to be going well, and has also managed to tease out a couple of bugs. I'll post an updated patch probably early next week.

> I'm willing to review/test.
> Do you have a test program? Can you provide it?
> I may need it when testing.

Thanks for your offer! This would be very useful. My current plan is to test it on x86_32 with shadow paging turned off and then on, and once this works I will resubmit the patch. My testing program is in two parts: one based on Mini-OS, which I use to have a domain that grants a few pages, and the other which runs in dom0 and calls my modified libxc. I can supply this if necessary.

Regards,

Derek Murray.
Mark Williamson
2007-Mar-23  15:25 UTC
Re: [Xen-devel] [PATCH 0/3] [RFC] User-space grant table device
> I really like the early cleanup hook in your patches to cleanly unmap
> any outstanding granted pages before zapping the range on vm area
> destruction. It's something that I've been wanting to do for a while
> -- very stability-adding. ;)

Yep; as far as we could tell, this should clean up the mappings reliably whatever happens - both clean munmapping and process destruction.

It's nice that it only turns out as a small change: it would only be a small diff for us to maintain as a dom0 kernel patch, but maybe it's small enough to go upstream one day. I guess in this respect it would be nice if there were any other potential users of it (maybe non-Xen related)...

Once the patch does what Xen folks want (or even before?), maybe it would be a good idea to post it on the Linux-mm list and see if people like the approach and if there's anything that could be done to make it more generic. This might smooth the road if it ever gets submitted upstream at a later date.

Cheers,
Mark

-- 
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!