In order to be able to set some of the available bits in the pte resulting from gnttab_set_map_op() (in particular PAGE_SPECIAL and possibly PAGE_IO), it would seem necessary to extend the set of flags that can be passed to that function. While the addition by itself is a simple operation, the question is how to deal with backward compatibility here: Is there anything preventing the guest kernel from setting the flags it wants manually after Xen established the mapping, i.e. would Xen either disallow any modification to such pte-s, or get confused by the pte not being identical to what it set it to? As to PAGE_IO, wouldn''t it be appropriate to set this flag on pte-s created with GNTMAP_device_map? Or is the name misleading me? I''m perhaps confused by the fact that the use of this flag is not tied to granting iomem pages... Thanks, Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote:> In order to be able to set some of the available bits in the pte resulting > from > gnttab_set_map_op() (in particular PAGE_SPECIAL and possibly PAGE_IO), > it would seem necessary to extend the set of flags that can be passed to > that function. While the addition by itself is a simple operation, the > question > is how to deal with backward compatibility here: Is there anything > preventing the guest kernel from setting the flags it wants manually after > Xen established the mapping, i.e. would Xen either disallow any modification > to such pte-s, or get confused by the pte not being identical to what it set > it to?I think mod_l1_entry() would allow this since it does no validation unless the mapping or PRESENT/RW change. Direct page-table writing won''t work as it happens since it will want to get_page() which of course won''t work on a foreign page. It could be given the same fast path as mod_l1_entry(), of course.> As to PAGE_IO, wouldn''t it be appropriate to set this flag on pte-s created > with GNTMAP_device_map? Or is the name misleading me? I''m perhaps > confused by the fact that the use of this flag is not tied to granting iomem > pages...The idea was that this would grant-map pages into an iommu. Really it''s not actually been used, but GNTMAP_device_map doesn''t cause mappings to appear in host page tables. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 30.10.08 15:15 >>> >On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote: >> As to PAGE_IO, wouldn''t it be appropriate to set this flag on pte-s created >> with GNTMAP_device_map? Or is the name misleading me? I''m perhaps >> confused by the fact that the use of this flag is not tied to granting iomem >> pages... > >The idea was that this would grant-map pages into an iommu. Really it''s not >actually been used, but GNTMAP_device_map doesn''t cause mappings to appear >in host page tables.It actually is being used in Solarflare''s acceleration backend - what you say makes me think this could be wrong then. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 30/10/08 15:53, "Jan Beulich" <jbeulich@novell.com> wrote:>>>> Keir Fraser <keir.fraser@eu.citrix.com> 30.10.08 15:15 >>> >> On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote: >>> As to PAGE_IO, wouldn''t it be appropriate to set this flag on pte-s created >>> with GNTMAP_device_map? Or is the name misleading me? I''m perhaps >>> confused by the fact that the use of this flag is not tied to granting iomem >>> pages... >> >> The idea was that this would grant-map pages into an iommu. Really it''s not >> actually been used, but GNTMAP_device_map doesn''t cause mappings to appear >> in host page tables. > > It actually is being used in Solarflare''s acceleration backend - what you say > makes me think this could be wrong then.I suppose it''s useful if you only want the machine address of the granted page, and don''t care about getting a host mapping. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 30.10.08 15:15 >>> >On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote: > >> In order to be able to set some of the available bits in the pte resulting >> from >> gnttab_set_map_op() (in particular PAGE_SPECIAL and possibly PAGE_IO), >> it would seem necessary to extend the set of flags that can be passed to >> that function. While the addition by itself is a simple operation, the >> question >> is how to deal with backward compatibility here: Is there anything >> preventing the guest kernel from setting the flags it wants manually after >> Xen established the mapping, i.e. would Xen either disallow any modification >> to such pte-s, or get confused by the pte not being identical to what it set >> it to? > >I think mod_l1_entry() would allow this since it does no validation unless >the mapping or PRESENT/RW change. Direct page-table writing won''t work as it >happens since it will want to get_page() which of course won''t work on a >foreign page. It could be given the same fast path as mod_l1_entry(), of >course.That''s probably not needed. Since we know the (machine) address of the pte at that point anyway, the cheapest thing to do would be to use mmu_update here (and as honoring these new flags will need to be advertised through another new feature flag, batching the two steps would anyway be desirable. Looking at that code I see some other potential issue, though: GRANT_PTE_FLAGS include PAGE_USER for x86-64, but the generated pte is passed through adjust_guest_l1e(), so without GNTMAP_application_map we''d get global kernel mappings. Am I wrong here? Plus I would think that GRANT_PTE_FLAGS really should include PAGE_NX. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 31/10/08 15:52, "Jan Beulich" <jbeulich@novell.com> wrote:> Looking at that code I see some other potential issue, though: > GRANT_PTE_FLAGS include PAGE_USER for x86-64, but the generated > pte is passed through adjust_guest_l1e(), so without > GNTMAP_application_map we''d get global kernel mappings. Am I wrong > here?GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then?> Plus I would think that GRANT_PTE_FLAGS really should include PAGE_NX.Yeah, makes sense. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 31.10.08 17:03 >>> >On 31/10/08 15:52, "Jan Beulich" <jbeulich@novell.com> wrote: > >> Looking at that code I see some other potential issue, though: >> GRANT_PTE_FLAGS include PAGE_USER for x86-64, but the generated >> pte is passed through adjust_guest_l1e(), so without >> GNTMAP_application_map we''d get global kernel mappings. Am I wrong >> here? > >GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then?No, it should uniformly include neither _PAGE_USER nor _PAGE_GUEST_KERNEL. _PAGE_USER gets set when GNTMAP_application_map is specified, and based on this setting of _PAGE_USER, adjust_guest_l1e() will set either _PAGE_GUEST_KERNEL or _PAGE_GLOBAL. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 31/10/08 16:13, "Jan Beulich" <jbeulich@novell.com> wrote:>> GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then? > > No, it should uniformly include neither _PAGE_USER nor _PAGE_GUEST_KERNEL. > _PAGE_USER gets set when GNTMAP_application_map is specified, and > based on this setting of _PAGE_USER, adjust_guest_l1e() will set either > _PAGE_GUEST_KERNEL or _PAGE_GLOBAL.The grant code doesn''t use adjust_l1e() though. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 31.10.08 17:17 >>> >On 31/10/08 16:13, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then? >> >> No, it should uniformly include neither _PAGE_USER nor _PAGE_GUEST_KERNEL. >> _PAGE_USER gets set when GNTMAP_application_map is specified, and >> based on this setting of _PAGE_USER, adjust_guest_l1e() will set either >> _PAGE_GUEST_KERNEL or _PAGE_GLOBAL. > >The grant code doesn''t use adjust_l1e() though.Both create_grant_pte_mapping() and create_grant_va_mapping() do afaics. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 31/10/08 16:23, "Jan Beulich" <jbeulich@novell.com> wrote:>>>> Keir Fraser <keir.fraser@eu.citrix.com> 31.10.08 17:17 >>> >> On 31/10/08 16:13, "Jan Beulich" <jbeulich@novell.com> wrote: >> >>>> GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then? >>> >>> No, it should uniformly include neither _PAGE_USER nor _PAGE_GUEST_KERNEL. >>> _PAGE_USER gets set when GNTMAP_application_map is specified, and >>> based on this setting of _PAGE_USER, adjust_guest_l1e() will set either >>> _PAGE_GUEST_KERNEL or _PAGE_GLOBAL. >> >> The grant code doesn''t use adjust_l1e() though. > > Both create_grant_pte_mapping() and create_grant_va_mapping() do > afaics.Ah yes, indeed. Please send a patch to deal with this and _PAGE_NX. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 31.10.08 18:00 >>> >On 31/10/08 16:23, "Jan Beulich" <jbeulich@novell.com> wrote: > >>>>> Keir Fraser <keir.fraser@eu.citrix.com> 31.10.08 17:17 >>> >>> On 31/10/08 16:13, "Jan Beulich" <jbeulich@novell.com> wrote: >>> >>>>> GRANT_PTE_FLAGS should include _PAGE_GUEST_KERNEL then? >>>> >>>> No, it should uniformly include neither _PAGE_USER nor _PAGE_GUEST_KERNEL. >>>> _PAGE_USER gets set when GNTMAP_application_map is specified, and >>>> based on this setting of _PAGE_USER, adjust_guest_l1e() will set either >>>> _PAGE_GUEST_KERNEL or _PAGE_GLOBAL. >>> >>> The grant code doesn''t use adjust_l1e() though. >> >> Both create_grant_pte_mapping() and create_grant_va_mapping() do >> afaics. > >Ah yes, indeed. Please send a patch to deal with this and _PAGE_NX.Since page table entries created through e.g. GNTTABOP_map_grant_ref are being passed through adjust_guest_l1e(), they must not generally get _PAGE_USER set - this will be taken care of by adjust_guest_l1e(), and it will ensure that these don''t get _PAGE_GLOBAL set inadvertently. Due to the implied security aspect here (_PAGE_GLOBAL getting set on kernel pages for x86-64), I''d like to ask that this also be applied to older maintained branches. At the same time, set _PAGE_NX for pte-s created for grants (as long as hardware supports it), since it should be only data pages that remote domains are being given access to. Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: 2008-10-27/xen/include/asm-x86/page.h ==================================================================--- 2008-10-27.orig/xen/include/asm-x86/page.h 2008-09-29 09:45:32.000000000 +0200 +++ 2008-10-27/xen/include/asm-x86/page.h 2008-10-31 16:58:13.000000000 +0100 @@ -314,6 +314,9 @@ unsigned long clone_idle_pagetable(struc #define __PAGE_HYPERVISOR_NOCACHE \ (_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_PCD | _PAGE_ACCESSED) +#define GRANT_PTE_FLAGS \ + (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_NX | _PAGE_GNTTAB) + #ifndef __ASSEMBLY__ static inline int get_order_from_bytes(paddr_t size) Index: 2008-10-27/xen/include/asm-x86/x86_32/page.h ==================================================================--- 2008-10-27.orig/xen/include/asm-x86/x86_32/page.h 2008-09-24 10:50:47.000000000 +0200 +++ 2008-10-27/xen/include/asm-x86/x86_32/page.h 2008-10-31 16:56:19.000000000 +0100 @@ -105,9 +105,6 @@ extern unsigned int PAGE_HYPERVISOR_NOCA #define get_pte_flags(x) (((int)((x) >> 32) & ~0xFFF) | ((int)(x) & 0xFFF)) #define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 32) | ((x) & 0xFFF)) -#define GRANT_PTE_FLAGS \ - (_PAGE_PRESENT|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_GNTTAB) - /* * Disallow unused flag bits plus PAT/PSE, PCD, PWT and GLOBAL. * Permit the NX bit if the hardware supports it. Index: 2008-10-27/xen/include/asm-x86/x86_64/page.h ==================================================================--- 2008-10-27.orig/xen/include/asm-x86/x86_64/page.h 2008-09-29 09:45:32.000000000 +0200 +++ 2008-10-27/xen/include/asm-x86/x86_64/page.h 2008-10-31 16:56:35.000000000 +0100 @@ -124,9 +124,6 @@ typedef l4_pgentry_t root_pgentry_t; #define PAGE_HYPERVISOR (__PAGE_HYPERVISOR | _PAGE_GLOBAL) #define PAGE_HYPERVISOR_NOCACHE (__PAGE_HYPERVISOR_NOCACHE | _PAGE_GLOBAL) -#define GRANT_PTE_FLAGS \ - (_PAGE_PRESENT|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_GNTTAB|_PAGE_USER) - #define USER_MAPPINGS_ARE_GLOBAL #ifdef USER_MAPPINGS_ARE_GLOBAL /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Thu, 2008-10-30 at 15:58 +0000, Keir Fraser wrote:> On 30/10/08 15:53, "Jan Beulich" <jbeulich@novell.com> wrote: > > >>>> Keir Fraser <keir.fraser@eu.citrix.com> 30.10.08 15:15 >>> > >> On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> As to PAGE_IO, wouldn''t it be appropriate to set this flag on pte-s created > >>> with GNTMAP_device_map? Or is the name misleading me? I''m perhaps > >>> confused by the fact that the use of this flag is not tied to granting iomem > >>> pages... > >> > >> The idea was that this would grant-map pages into an iommu. Really it''s not > >> actually been used, but GNTMAP_device_map doesn''t cause mappings to appear > >> in host page tables. > > > > It actually is being used in Solarflare''s acceleration backend - what you say > > makes me think this could be wrong then. > > I suppose it''s useful if you only want the machine address of the granted > page, and don''t care about getting a host mapping.Yes, that''s exactly what we use it for. The guest passes pages to dom0 that it would like to have added to the buffer table (effectively the iommu) on the NIC. dom0 doesn''t need to map these pages, but it needs the MFN to pass to the NIC, and it uses the GNTMAP_device_map to do this. I can''t remember off the top of my head if performing this grant has other beneficial side effects, but it wouldn''t surprise me if it did. Kieran _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 30.10.08 15:15 >>> >On 30/10/08 14:06, "Jan Beulich" <jbeulich@novell.com> wrote: > >> In order to be able to set some of the available bits in the pte resulting >> from >> gnttab_set_map_op() (in particular PAGE_SPECIAL and possibly PAGE_IO), >> it would seem necessary to extend the set of flags that can be passed to >> that function. While the addition by itself is a simple operation, the >> question >> is how to deal with backward compatibility here: Is there anything >> preventing the guest kernel from setting the flags it wants manually after >> Xen established the mapping, i.e. would Xen either disallow any modification >> to such pte-s, or get confused by the pte not being identical to what it set >> it to? > >I think mod_l1_entry() would allow this since it does no validation unless >the mapping or PRESENT/RW change. Direct page-table writing won''t work as it >happens since it will want to get_page() which of course won''t work on a >foreign page. It could be given the same fast path as mod_l1_entry(), of >course.This turns out more difficult than I first thought: Since neither the virtual address for which a mapping is to be updated nor the machine address of the page table entry allow to simply update that entry (because I don''t have a virtual address for the pte at hand to read what Xen wrote), I''m pretty much lost. Due to Xen built with debug=y setting _PAGE_GNTTAB I also cannot easily reconstruct the pte from map->dev_bus_addr (leaving aside the fact that the flag is included in L1_DISALLOW_MASK). Otoh Xen apparently doesn''t keep a guest from clearing that flag... While looking in more detail into how all this fits together, I realized that there appears to be a race condition in blktap: Since fast_flush_area() and dispatch_rw_block_io() call create_lookup_pte_addr() without holding mm->mmap_sem, there''s nothing preventing the just verified (or created) page table structures to go away again before HYPERVISOR_grant_table_op() is able to finish (or even begin) its job. Am I missing something here? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 3/11/08 12:13, "Jan Beulich" <jbeulich@novell.com> wrote:> While looking in more detail into how all this fits together, I realized that > there appears to be a race condition in blktap: Since fast_flush_area() > and dispatch_rw_block_io() call create_lookup_pte_addr() without holding > mm->mmap_sem, there''s nothing preventing the just verified (or > created) page table structures to go away again before > HYPERVISOR_grant_table_op() is able to finish (or even begin) its job. > Am I missing something here?So, for example, it could be munmap()ed under those functions'' feet? Or the process could die? Looks like you could be right, although I''ve never looked closely at the blktap kernel driver. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 03.11.08 13:55 >>> >On 3/11/08 12:13, "Jan Beulich" <jbeulich@novell.com> wrote: > >> While looking in more detail into how all this fits together, I realized that >> there appears to be a race condition in blktap: Since fast_flush_area() >> and dispatch_rw_block_io() call create_lookup_pte_addr() without holding >> mm->mmap_sem, there''s nothing preventing the just verified (or >> created) page table structures to go away again before >> HYPERVISOR_grant_table_op() is able to finish (or even begin) its job. >> Am I missing something here? > >So, for example, it could be munmap()ed under those functions'' feet? Or the >process could die? Looks like you could be right, although I''ve never looked >closely at the blktap kernel driver.Process death is probably not a concern, but munmap() of the region in question or (potentially, maybe not in current kernels) even a nearby one is. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel