Hi,

These patches implement PVH dom0. Please note there is 1 fixme in this
entire series that is being addressed under a different patch series.
PVH dom0 creation is disabled until then.

Patches 1 thru 4 implement changes in and around construct_dom0.
Patches 5 thru 7 are to support the tool stack on PVH dom0, and have
been mostly looked at in the past. Finally, patch 8 adds an option to
boot a dom0 in PVH mode.

These patches are based on c/s: b18e2d9 with the addition of fixes in
epte_get_entry_emt and Roger's AP bringup/cleanup patches. They can
also be found in the public git tree at:

  git://oss.oracle.com/git/mrathor/xen.git  branch: dom0pvh-v2

Thanks for all the help,
Mukesh
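For anyone wanting to try the series, the tree and branch named above can
be fetched with standard git commands (only the URL and branch come from
the posting itself; the rest is ordinary git usage):

    git clone git://oss.oracle.com/git/mrathor/xen.git
    cd xen
    git checkout dom0pvh-v2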
- For now, iommu is required for PVH dom0. Check for that.
- For pvh, we need to do mfn_to_gmfn before calling mapping function
  intel_iommu_map_page/amd_iommu_map_page which expects a gfn.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/drivers/passthrough/iommu.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 93ad122..dbbd034 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -125,15 +125,27 @@ int iommu_domain_init(struct domain *d)
     return hd->platform_ops->init(d);
 }
 
+static __init void check_dom0_pvh_reqs(struct domain *d)
+{
+    if ( !iommu_enabled )
+        panic("Presently, iommu must be enabled for pvh dom0\n");
+
+    if ( iommu_passthrough )
+        panic("For pvh dom0, dom0-passthrough must not be enabled\n");
+}
+
 void __init iommu_dom0_init(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
+    if ( is_pvh_domain(d) )
+        check_dom0_pvh_reqs(d);
+
     if ( !iommu_enabled )
         return;
 
     register_keyhandler('o', &iommu_p2m_table);
-    d->need_iommu = !!iommu_dom0_strict;
+    d->need_iommu = is_pvh_domain(d) || !!iommu_dom0_strict;
     if ( need_iommu(d) )
     {
         struct page_info *page;
@@ -141,12 +153,13 @@ void __init iommu_dom0_init(struct domain *d)
         page_list_for_each ( page, &d->page_list )
         {
             unsigned long mfn = page_to_mfn(page);
+            unsigned long gfn = mfn_to_gmfn(d, mfn);
             unsigned int mapping = IOMMUF_readable;
             if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            hd->platform_ops->map_page(d, mfn, mfn, mapping);
+            hd->platform_ops->map_page(d, gfn, mfn, mapping);
             if ( !(i++ & 0xfffff) )
                 process_pending_softirqs();
         }
-- 
1.7.2.3
Mukesh Rathor
2013-Nov-23 00:03 UTC
[V2 PATCH 2/8] PVH dom0: create update_memory_mapping() function
In this preparatory patch, XEN_DOMCTL_memory_mapping code is put into a
function so it can be called later for PVH from construct_dom0. There is
no change in its functionality. The function is made non-static in the
construct_dom0 patch.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domctl.c |  111 +++++++++++++++++++++++++++----------------------
 1 files changed, 61 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index f7e4586..ff76598 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -46,6 +46,66 @@ static int gdbsx_guest_mem_io(
     return (iop->remain ? -EFAULT : 0);
 }
 
+static int update_memory_mapping(struct domain *d, unsigned long gfn,
+                                 unsigned long mfn, unsigned long nr_mfns,
+                                 bool_t add)
+{
+    unsigned long i;
+    int ret;
+
+    ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add);
+    if ( ret )
+        return ret;
+
+    if ( add )
+    {
+        printk(XENLOG_G_INFO
+               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+               d->domain_id, gfn, mfn, nr_mfns);
+
+        ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
+        if ( !ret && paging_mode_translate(d) )
+        {
+            for ( i = 0; !ret && i < nr_mfns; i++ )
+                if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
+                    ret = -EIO;
+            if ( ret )
+            {
+                printk(XENLOG_G_WARNING
+                       "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
+                       d->domain_id, gfn + i, mfn + i);
+                while ( i-- )
+                    clear_mmio_p2m_entry(d, gfn + i);
+                if ( iomem_deny_access(d, mfn, mfn + nr_mfns - 1) &&
+                     is_hardware_domain(current->domain) )
+                    printk(XENLOG_ERR
+                           "memory_map: failed to deny dom%d access to [%lx,%lx]\n",
+                           d->domain_id, mfn, mfn + nr_mfns - 1);
+            }
+        }
+    }
+    else
+    {
+        printk(XENLOG_G_INFO
+               "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+               d->domain_id, gfn, mfn, nr_mfns);
+
+        if ( paging_mode_translate(d) )
+            for ( i = 0; i < nr_mfns; i++ )
+                add |= !clear_mmio_p2m_entry(d, gfn + i);
+        ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
+        if ( !ret && add )
+            ret = -EIO;
+        if ( ret && is_hardware_domain(current->domain) )
+            printk(XENLOG_ERR
+                   "memory_map: error %d %s dom%d access to [%lx,%lx]\n",
+                   ret, add ? "removing" : "denying", d->domain_id,
+                   mfn, mfn + nr_mfns - 1);
+    }
+
+    return ret;
+}
+
 long arch_do_domctl(
     struct xen_domctl *domctl, struct domain *d,
     XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
@@ -625,7 +685,6 @@ long arch_do_domctl(
         unsigned long mfn = domctl->u.memory_mapping.first_mfn;
         unsigned long nr_mfns = domctl->u.memory_mapping.nr_mfns;
         int add = domctl->u.memory_mapping.add_mapping;
-        unsigned long i;
 
         ret = -EINVAL;
         if ( (mfn + nr_mfns - 1) < mfn || /* wrap? */
@@ -637,55 +696,7 @@ long arch_do_domctl(
         if ( !iomem_access_permitted(current->domain, mfn,
                                      mfn + nr_mfns - 1) )
             break;
 
-        ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add);
-        if ( ret )
-            break;
-
-        if ( add )
-        {
-            printk(XENLOG_G_INFO
-                   "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-                   d->domain_id, gfn, mfn, nr_mfns);
-
-            ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
-            if ( !ret && paging_mode_translate(d) )
-            {
-                for ( i = 0; !ret && i < nr_mfns; i++ )
-                    if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
-                        ret = -EIO;
-                if ( ret )
-                {
-                    printk(XENLOG_G_WARNING
-                           "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
-                           d->domain_id, gfn + i, mfn + i);
-                    while ( i-- )
-                        clear_mmio_p2m_entry(d, gfn + i);
-                    if ( iomem_deny_access(d, mfn, mfn + nr_mfns - 1) &&
-                         is_hardware_domain(current->domain) )
-                        printk(XENLOG_ERR
-                               "memory_map: failed to deny dom%d access to [%lx,%lx]\n",
-                               d->domain_id, mfn, mfn + nr_mfns - 1);
-                }
-            }
-        }
-        else
-        {
-            printk(XENLOG_G_INFO
-                   "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-                   d->domain_id, gfn, mfn, nr_mfns);
-
-            if ( paging_mode_translate(d) )
-                for ( i = 0; i < nr_mfns; i++ )
-                    add |= !clear_mmio_p2m_entry(d, gfn + i);
-            ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
-            if ( !ret && add )
-                ret = -EIO;
-            if ( ret && is_hardware_domain(current->domain) )
-                printk(XENLOG_ERR
-                       "memory_map: error %ld %s dom%d access to [%lx,%lx]\n",
-                       ret, add ? "removing" : "denying", d->domain_id,
-                       mfn, mfn + nr_mfns - 1);
-        }
+        ret = update_memory_mapping(d, gfn, mfn, nr_mfns, add);
     }
     break;
-- 
1.7.2.3
Mukesh Rathor
2013-Nov-23 00:03 UTC
[V2 PATCH 3/8] PVH dom0: move some pv specific code to static functions
In this preparatory patch, some PV-specific code is also carved out into
static functions. No functionality change.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain_build.c |  353 +++++++++++++++++++++++--------------------
 1 files changed, 192 insertions(+), 161 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 232adf8..c9ff680 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -307,6 +307,191 @@ static void __init process_dom0_ioports_disable(void)
     }
 }
 
+/* Pages that are part of page tables must be read only. */
+static __init void mark_pv_pt_pages_rdonly(struct domain *d,
+                                           l4_pgentry_t *l4start,
+                                           unsigned long vpt_start,
+                                           unsigned long nr_pt_pages)
+{
+    unsigned long count;
+    struct page_info *page;
+    l4_pgentry_t *pl4e;
+    l3_pgentry_t *pl3e;
+    l2_pgentry_t *pl2e;
+    l1_pgentry_t *pl1e;
+
+    pl4e = l4start + l4_table_offset(vpt_start);
+    pl3e = l4e_to_l3e(*pl4e);
+    pl3e += l3_table_offset(vpt_start);
+    pl2e = l3e_to_l2e(*pl3e);
+    pl2e += l2_table_offset(vpt_start);
+    pl1e = l2e_to_l1e(*pl2e);
+    pl1e += l1_table_offset(vpt_start);
+    for ( count = 0; count < nr_pt_pages; count++ )
+    {
+        l1e_remove_flags(*pl1e, _PAGE_RW);
+        page = mfn_to_page(l1e_get_pfn(*pl1e));
+
+        /* Read-only mapping + PGC_allocated + page-table page. */
+        page->count_info = PGC_allocated | 3;
+        page->u.inuse.type_info |= PGT_validated | 1;
+
+        /* Top-level p.t. is pinned. */
+        if ( (page->u.inuse.type_info & PGT_type_mask) ==
+             (!is_pv_32on64_domain(d) ?
+              PGT_l4_page_table : PGT_l3_page_table) )
+        {
+            page->count_info        += 1;
+            page->u.inuse.type_info += 1 | PGT_pinned;
+        }
+
+        /* Iterate. */
+        if ( !((unsigned long)++pl1e & (PAGE_SIZE - 1)) )
+        {
+            if ( !((unsigned long)++pl2e & (PAGE_SIZE - 1)) )
+            {
+                if ( !((unsigned long)++pl3e & (PAGE_SIZE - 1)) )
+                    pl3e = l4e_to_l3e(*++pl4e);
+                pl2e = l3e_to_l2e(*pl3e);
+            }
+            pl1e = l2e_to_l1e(*pl2e);
+        }
+    }
+}
+
+/* Set up the phys->machine table if not part of the initial mapping. */
+static __init void setup_pv_physmap(struct domain *d, unsigned long pgtbl_pfn,
+                                    unsigned long v_start, unsigned long v_end,
+                                    unsigned long vphysmap_start,
+                                    unsigned long vphysmap_end,
+                                    unsigned long nr_pages)
+{
+    struct page_info *page = NULL;
+    l4_pgentry_t *pl4e = NULL, *l4start;
+    l3_pgentry_t *pl3e = NULL;
+    l2_pgentry_t *pl2e = NULL;
+    l1_pgentry_t *pl1e = NULL;
+
+    l4start = map_domain_page(pgtbl_pfn);
+
+    if ( v_start <= vphysmap_end && vphysmap_start <= v_end )
+        panic("DOM0 P->M table overlaps initial mapping");
+
+    while ( vphysmap_start < vphysmap_end )
+    {
+        if ( d->tot_pages + ((round_pgup(vphysmap_end) - vphysmap_start)
+                             >> PAGE_SHIFT) + 3 > nr_pages )
+            panic("Dom0 allocation too small for initial P->M table.\n");
+
+        if ( pl1e )
+        {
+            unmap_domain_page(pl1e);
+            pl1e = NULL;
+        }
+        if ( pl2e )
+        {
+            unmap_domain_page(pl2e);
+            pl2e = NULL;
+        }
+        if ( pl3e )
+        {
+            unmap_domain_page(pl3e);
+            pl3e = NULL;
+        }
+        pl4e = l4start + l4_table_offset(vphysmap_start);
+        if ( !l4e_get_intpte(*pl4e) )
+        {
+            page = alloc_domheap_page(d, 0);
+            if ( !page )
+                break;
+
+            /* No mapping, PGC_allocated + page-table page. */
+            page->count_info = PGC_allocated | 2;
+            page->u.inuse.type_info = PGT_l3_page_table | PGT_validated | 1;
+            pl3e = __map_domain_page(page);
+            clear_page(pl3e);
+            *pl4e = l4e_from_page(page, L4_PROT);
+        } else
+            pl3e = map_domain_page(l4e_get_pfn(*pl4e));
+
+        pl3e += l3_table_offset(vphysmap_start);
+        if ( !l3e_get_intpte(*pl3e) )
+        {
+            if ( cpu_has_page1gb &&
+                 !(vphysmap_start & ((1UL << L3_PAGETABLE_SHIFT) - 1)) &&
+                 vphysmap_end >= vphysmap_start + (1UL << L3_PAGETABLE_SHIFT) &&
+                 (page = alloc_domheap_pages(d,
+                                             L3_PAGETABLE_SHIFT - PAGE_SHIFT,
+                                             0)) != NULL )
+            {
+                *pl3e = l3e_from_page(page, L1_PROT|_PAGE_DIRTY|_PAGE_PSE);
+                vphysmap_start += 1UL << L3_PAGETABLE_SHIFT;
+                continue;
+            }
+            if ( (page = alloc_domheap_page(d, 0)) == NULL )
+                break;
+
+            /* No mapping, PGC_allocated + page-table page. */
+            page->count_info = PGC_allocated | 2;
+            page->u.inuse.type_info = PGT_l2_page_table | PGT_validated | 1;
+            pl2e = __map_domain_page(page);
+            clear_page(pl2e);
+            *pl3e = l3e_from_page(page, L3_PROT);
+        }
+        else
+            pl2e = map_domain_page(l3e_get_pfn(*pl3e));
+
+        pl2e += l2_table_offset(vphysmap_start);
+        if ( !l2e_get_intpte(*pl2e) )
+        {
+            if ( !(vphysmap_start & ((1UL << L2_PAGETABLE_SHIFT) - 1)) &&
+                 vphysmap_end >= vphysmap_start + (1UL << L2_PAGETABLE_SHIFT) &&
+                 (page = alloc_domheap_pages(d,
+                                             L2_PAGETABLE_SHIFT - PAGE_SHIFT,
+                                             0)) != NULL )
+            {
+                *pl2e = l2e_from_page(page, L1_PROT|_PAGE_DIRTY|_PAGE_PSE);
+                if ( opt_allow_superpage )
+                    get_superpage(page_to_mfn(page), d);
+                vphysmap_start += 1UL << L2_PAGETABLE_SHIFT;
+                continue;
+            }
+            if ( (page = alloc_domheap_page(d, 0)) == NULL )
+                break;
+
+            /* No mapping, PGC_allocated + page-table page. */
+            page->count_info = PGC_allocated | 2;
+            page->u.inuse.type_info = PGT_l1_page_table | PGT_validated | 1;
+            pl1e = __map_domain_page(page);
+            clear_page(pl1e);
+            *pl2e = l2e_from_page(page, L2_PROT);
+        }
+        else
+            pl1e = map_domain_page(l2e_get_pfn(*pl2e));
+
+        pl1e += l1_table_offset(vphysmap_start);
+        BUG_ON(l1e_get_intpte(*pl1e));
+        page = alloc_domheap_page(d, 0);
+        if ( !page )
+            break;
+
+        *pl1e = l1e_from_page(page, L1_PROT|_PAGE_DIRTY);
+        vphysmap_start += PAGE_SIZE;
+        vphysmap_start &= PAGE_MASK;
+    }
+    if ( !page )
+        panic("Not enough RAM for DOM0 P->M table.\n");
+
+    if ( pl1e )
+        unmap_domain_page(pl1e);
+    if ( pl2e )
+        unmap_domain_page(pl2e);
+    if ( pl3e )
+        unmap_domain_page(pl3e);
+
+    unmap_domain_page(l4start);
+}
+
 int __init construct_dom0(
     struct domain *d,
     const module_t *image, unsigned long image_headroom,
@@ -705,44 +890,8 @@ int __init construct_dom0(
                COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2tab));
     }
 
-    /* Pages that are part of page tables must be read only. */
-    l4tab = l4start + l4_table_offset(vpt_start);
-    l3start = l3tab = l4e_to_l3e(*l4tab);
-    l3tab += l3_table_offset(vpt_start);
-    l2start = l2tab = l3e_to_l2e(*l3tab);
-    l2tab += l2_table_offset(vpt_start);
-    l1start = l1tab = l2e_to_l1e(*l2tab);
-    l1tab += l1_table_offset(vpt_start);
-    for ( count = 0; count < nr_pt_pages; count++ )
-    {
-        l1e_remove_flags(*l1tab, _PAGE_RW);
-        page = mfn_to_page(l1e_get_pfn(*l1tab));
-
-        /* Read-only mapping + PGC_allocated + page-table page. */
-        page->count_info = PGC_allocated | 3;
-        page->u.inuse.type_info |= PGT_validated | 1;
-
-        /* Top-level p.t. is pinned. */
-        if ( (page->u.inuse.type_info & PGT_type_mask) ==
-             (!is_pv_32on64_domain(d) ?
-              PGT_l4_page_table : PGT_l3_page_table) )
-        {
-            page->count_info        += 1;
-            page->u.inuse.type_info += 1 | PGT_pinned;
-        }
-
-        /* Iterate. */
-        if ( !((unsigned long)++l1tab & (PAGE_SIZE - 1)) )
-        {
-            if ( !((unsigned long)++l2tab & (PAGE_SIZE - 1)) )
-            {
-                if ( !((unsigned long)++l3tab & (PAGE_SIZE - 1)) )
-                    l3start = l3tab = l4e_to_l3e(*++l4tab);
-                l2start = l2tab = l3e_to_l2e(*l3tab);
-            }
-            l1start = l1tab = l2e_to_l1e(*l2tab);
-        }
-    }
+    if ( is_pv_domain(d) )
+        mark_pv_pt_pages_rdonly(d, l4start, vpt_start, nr_pt_pages);
 
     /* Mask all upcalls... */
     for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ )
@@ -814,132 +963,14 @@ int __init construct_dom0(
            elf_64bit(&elf) ? 64 : 32, parms.pae ? "p" : "");
 
     count = d->tot_pages;
-    l4start = map_domain_page(pagetable_get_pfn(v->arch.guest_table));
-    l3tab = NULL;
-    l2tab = NULL;
-    l1tab = NULL;
-    /* Set up the phys->machine table if not part of the initial mapping. */
-    if ( parms.p2m_base != UNSET_ADDR )
-    {
-        unsigned long va = vphysmap_start;
-
-        if ( v_start <= vphysmap_end && vphysmap_start <= v_end )
-            panic("DOM0 P->M table overlaps initial mapping");
-
-        while ( va < vphysmap_end )
-        {
-            if ( d->tot_pages + ((round_pgup(vphysmap_end) - va)
-                                 >> PAGE_SHIFT) + 3 > nr_pages )
-                panic("Dom0 allocation too small for initial P->M table.\n");
-
-            if ( l1tab )
-            {
-                unmap_domain_page(l1tab);
-                l1tab = NULL;
-            }
-            if ( l2tab )
-            {
-                unmap_domain_page(l2tab);
-                l2tab = NULL;
-            }
-            if ( l3tab )
-            {
-                unmap_domain_page(l3tab);
-                l3tab = NULL;
-            }
-            l4tab = l4start + l4_table_offset(va);
-            if ( !l4e_get_intpte(*l4tab) )
-            {
-                page = alloc_domheap_page(d, 0);
-                if ( !page )
-                    break;
-                /* No mapping, PGC_allocated + page-table page. */
-                page->count_info = PGC_allocated | 2;
-                page->u.inuse.type_info =
-                    PGT_l3_page_table | PGT_validated | 1;
-                l3tab = __map_domain_page(page);
-                clear_page(l3tab);
-                *l4tab = l4e_from_page(page, L4_PROT);
-            } else
-                l3tab = map_domain_page(l4e_get_pfn(*l4tab));
-            l3tab += l3_table_offset(va);
-            if ( !l3e_get_intpte(*l3tab) )
-            {
-                if ( cpu_has_page1gb &&
-                     !(va & ((1UL << L3_PAGETABLE_SHIFT) - 1)) &&
-                     vphysmap_end >= va + (1UL << L3_PAGETABLE_SHIFT) &&
-                     (page = alloc_domheap_pages(d,
-                                                 L3_PAGETABLE_SHIFT -
-                                                     PAGE_SHIFT,
-                                                 0)) != NULL )
-                {
-                    *l3tab = l3e_from_page(page,
-                                           L1_PROT|_PAGE_DIRTY|_PAGE_PSE);
-                    va += 1UL << L3_PAGETABLE_SHIFT;
-                    continue;
-                }
-                if ( (page = alloc_domheap_page(d, 0)) == NULL )
-                    break;
-                /* No mapping, PGC_allocated + page-table page. */
-                page->count_info = PGC_allocated | 2;
-                page->u.inuse.type_info =
-                    PGT_l2_page_table | PGT_validated | 1;
-                l2tab = __map_domain_page(page);
-                clear_page(l2tab);
-                *l3tab = l3e_from_page(page, L3_PROT);
-            }
-            else
-                l2tab = map_domain_page(l3e_get_pfn(*l3tab));
-            l2tab += l2_table_offset(va);
-            if ( !l2e_get_intpte(*l2tab) )
-            {
-                if ( !(va & ((1UL << L2_PAGETABLE_SHIFT) - 1)) &&
-                     vphysmap_end >= va + (1UL << L2_PAGETABLE_SHIFT) &&
-                     (page = alloc_domheap_pages(d,
-                                                 L2_PAGETABLE_SHIFT -
-                                                     PAGE_SHIFT,
-                                                 0)) != NULL )
-                {
-                    *l2tab = l2e_from_page(page,
-                                           L1_PROT|_PAGE_DIRTY|_PAGE_PSE);
-                    if ( opt_allow_superpage )
-                        get_superpage(page_to_mfn(page), d);
-                    va += 1UL << L2_PAGETABLE_SHIFT;
-                    continue;
-                }
-                if ( (page = alloc_domheap_page(d, 0)) == NULL )
-                    break;
-                /* No mapping, PGC_allocated + page-table page. */
-                page->count_info = PGC_allocated | 2;
-                page->u.inuse.type_info =
-                    PGT_l1_page_table | PGT_validated | 1;
-                l1tab = __map_domain_page(page);
-                clear_page(l1tab);
-                *l2tab = l2e_from_page(page, L2_PROT);
-            }
-            else
-                l1tab = map_domain_page(l2e_get_pfn(*l2tab));
-            l1tab += l1_table_offset(va);
-            BUG_ON(l1e_get_intpte(*l1tab));
-            page = alloc_domheap_page(d, 0);
-            if ( !page )
-                break;
-            *l1tab = l1e_from_page(page, L1_PROT|_PAGE_DIRTY);
-            va += PAGE_SIZE;
-            va &= PAGE_MASK;
-        }
-        if ( !page )
-            panic("Not enough RAM for DOM0 P->M table.\n");
+    /* Set up the phys->machine table if not part of the initial mapping. */
+    if ( is_pv_domain(d) && parms.p2m_base != UNSET_ADDR )
+    {
+        pfn = pagetable_get_pfn(v->arch.guest_table);
+        setup_pv_physmap(d, pfn, v_start, v_end, vphysmap_start, vphysmap_end,
+                         nr_pages);
     }
 
-    if ( l1tab )
-        unmap_domain_page(l1tab);
-    if ( l2tab )
-        unmap_domain_page(l2tab);
-    if ( l3tab )
-        unmap_domain_page(l3tab);
-    unmap_domain_page(l4start);
-
     /* Write the phys->machine and machine->phys table entries. */
     for ( pfn = 0; pfn < count; pfn++ )
     {
-- 
1.7.2.3
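A note on the superpage fast paths in setup_pv_physmap() above: both are
plain alignment-and-remaining-size tests on the virtual cursor. A minimal
sketch of the same checks, assuming the usual x86-64 values
L3_PAGETABLE_SHIFT = 30 (1 GiB) and L2_PAGETABLE_SHIFT = 21 (2 MiB):

    /* Sketch: a superpage can be used iff the cursor is aligned to the
     * superpage size and at least one full superpage remains to map. */
    int can_use_1g = !(va & ((1UL << 30) - 1)) && end >= va + (1UL << 30);
    int can_use_2m = !(va & ((1UL << 21) - 1)) && end >= va + (1UL << 21);

If either test fails (or the large allocation does), the code falls back to
allocating the next-level page table and mapping with smaller entries.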
This patch changes construct_dom0 to boot in PVH mode. Changes needed to
support it are also included here.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain_build.c |  229 +++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/domctl.c       |    2 +-
 xen/arch/x86/mm/hap/hap.c   |   15 +++
 xen/include/asm-x86/hap.h   |    1 +
 xen/include/xen/domain.h    |    3 +
 5 files changed, 232 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index c9ff680..e5c3310 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -35,6 +35,7 @@
 #include <asm/setup.h>
 #include <asm/bzimage.h> /* for bzimage_parse */
 #include <asm/io_apic.h>
+#include <asm/hap.h>
 
 #include <public/version.h>
 
@@ -307,6 +308,145 @@ static void __init process_dom0_ioports_disable(void)
     }
 }
 
+/*
+ * Set the 1:1 map for all non-RAM regions for dom 0. Thus, dom0 will have
+ * the entire io region mapped in the EPT/NPT.
+ *
+ * pvh fixme: The following doesn't map MMIO ranges when they sit above the
+ *            highest E820 covered address.
+ */
+static __init void pvh_map_all_iomem(struct domain *d)
+{
+    unsigned long start_pfn, end_pfn, end = 0, start = 0;
+    const struct e820entry *entry;
+    unsigned int i, nump;
+    int rc;
+
+    for ( i = 0, entry = e820.map; i < e820.nr_map; i++, entry++ )
+    {
+        end = entry->addr + entry->size;
+
+        if ( entry->type == E820_RAM || entry->type == E820_UNUSABLE ||
+             i == e820.nr_map - 1 )
+        {
+            start_pfn = PFN_DOWN(start);
+
+            /* Unused RAM areas are marked UNUSABLE, so skip it too */
+            if ( entry->type == E820_RAM || entry->type == E820_UNUSABLE )
+                end_pfn = PFN_UP(entry->addr);
+            else
+                end_pfn = PFN_UP(end);
+
+            if ( start_pfn < end_pfn )
+            {
+                nump = end_pfn - start_pfn;
+                /* Add pages to the mapping */
+                rc = update_memory_mapping(d, start_pfn, start_pfn, nump, 1);
+                BUG_ON(rc);
+            }
+            start = end;
+        }
+    }
+
+    /* If the e820 ended under 4GB, we must map the remaining space upto 4GB */
+    if ( end < GB(4) )
+    {
+        start_pfn = PFN_UP(end);
+        end_pfn = (GB(4)) >> PAGE_SHIFT;
+        nump = end_pfn - start_pfn;
+        rc = update_memory_mapping(d, start_pfn, start_pfn, nump, 1);
+        BUG_ON(rc);
+    }
+}
+
+static __init void dom0_update_physmap(struct domain *d, unsigned long pfn,
+                                       unsigned long mfn,
+                                       unsigned long vphysmap_s)
+{
+    if ( is_pvh_domain(d) )
+    {
+        int rc = guest_physmap_add_page(d, pfn, mfn, 0);
+        BUG_ON(rc);
+        return;
+    }
+    if ( !is_pv_32on64_domain(d) )
+        ((unsigned long *)vphysmap_s)[pfn] = mfn;
+    else
+        ((unsigned int *)vphysmap_s)[pfn] = mfn;
+
+    set_gpfn_from_mfn(mfn, pfn);
+}
+
+static __init void pvh_fixup_page_tables_for_hap(struct vcpu *v,
+                                                 unsigned long v_start,
+                                                 unsigned long v_end)
+{
+    int i, j, k;
+    l4_pgentry_t *pl4e, *l4start;
+    l3_pgentry_t *pl3e;
+    l2_pgentry_t *pl2e;
+    l1_pgentry_t *pl1e;
+    unsigned long cr3_pfn;
+
+    ASSERT(paging_mode_enabled(v->domain));
+
+    l4start = map_domain_page(pagetable_get_pfn(v->arch.guest_table));
+
+    /* Clear entries prior to guest L4 start */
+    pl4e = l4start + l4_table_offset(v_start);
+    memset(l4start, 0, (unsigned long)pl4e - (unsigned long)l4start);
+
+    for ( ; pl4e <= l4start + l4_table_offset(v_end - 1); pl4e++ )
+    {
+        pl3e = map_l3t_from_l4e(*pl4e);
+        for ( i = 0; i < PAGE_SIZE / sizeof(*pl3e); i++, pl3e++ )
+        {
+            if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
+                continue;
+
+            pl2e = map_l2t_from_l3e(*pl3e);
+            for ( j = 0; j < PAGE_SIZE / sizeof(*pl2e); j++, pl2e++ )
+            {
+                if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
+                    continue;
+
+                pl1e = map_l1t_from_l2e(*pl2e);
+                for ( k = 0; k < PAGE_SIZE / sizeof(*pl1e); k++, pl1e++ )
+                {
+                    if ( !(l1e_get_flags(*pl1e) & _PAGE_PRESENT) )
+                        continue;
+
+                    *pl1e = l1e_from_pfn(get_gpfn_from_mfn(l1e_get_pfn(*pl1e)),
+                                         l1e_get_flags(*pl1e));
+                }
+                unmap_domain_page(pl1e);
+                *pl2e = l2e_from_pfn(get_gpfn_from_mfn(l2e_get_pfn(*pl2e)),
+                                     l2e_get_flags(*pl2e));
+            }
+            unmap_domain_page(pl2e);
+            *pl3e = l3e_from_pfn(get_gpfn_from_mfn(l3e_get_pfn(*pl3e)),
+                                 l3e_get_flags(*pl3e));
+        }
+        unmap_domain_page(pl3e);
+        *pl4e = l4e_from_pfn(get_gpfn_from_mfn(l4e_get_pfn(*pl4e)),
+                             l4e_get_flags(*pl4e));
+    }
+
+    /* Clear entries post guest L4. */
+    if ( (unsigned long)pl4e & (PAGE_SIZE - 1) )
+        memset(pl4e, 0, PAGE_SIZE - ((unsigned long)pl4e & (PAGE_SIZE - 1)));
+
+    unmap_domain_page(l4start);
+
+    cr3_pfn = get_gpfn_from_mfn(paddr_to_pfn(v->arch.cr3));
+    v->arch.hvm_vcpu.guest_cr[3] = pfn_to_paddr(cr3_pfn);
+
+    /*
+     * Finally, we update the paging modes (hap_update_paging_modes). This will
+     * create monitor_table for us, update v->arch.cr3, and update vmcs.cr3.
+     */
+    paging_update_paging_modes(v);
+}
+
 /* Pages that are part of page tables must be read only. */
 static __init void mark_pv_pt_pages_rdonly(struct domain *d,
                                            l4_pgentry_t *l4start,
@@ -520,6 +660,8 @@ int __init construct_dom0(
     l3_pgentry_t *l3tab = NULL, *l3start = NULL;
     l2_pgentry_t *l2tab = NULL, *l2start = NULL;
     l1_pgentry_t *l1tab = NULL, *l1start = NULL;
+    paddr_t shared_info_paddr = 0;
+    u32 save_pvh_pg_mode = 0;
 
     /*
      * This fully describes the memory layout of the initial domain. All
@@ -597,12 +739,21 @@ int __init construct_dom0(
         goto out;
     }
 
-    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE &&
-         !test_bit(XENFEAT_dom0, parms.f_supported) )
+    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
     {
-        printk("Kernel does not support Dom0 operation\n");
-        rc = -EINVAL;
-        goto out;
+        if ( !test_bit(XENFEAT_dom0, parms.f_supported) )
+        {
+            printk("Kernel does not support Dom0 operation\n");
+            rc = -EINVAL;
+            goto out;
+        }
+        if ( is_pvh_domain(d) &&
+             !test_bit(XENFEAT_hvm_callback_vector, parms.f_supported) )
+        {
+            printk("Kernel does not support PVH mode\n");
+            rc = -EINVAL;
+            goto out;
+        }
     }
 
     if ( compat32 )
@@ -667,6 +818,13 @@ int __init construct_dom0(
     vstartinfo_end   = (vstartinfo_start +
                         sizeof(struct start_info) +
                         sizeof(struct dom0_vga_console_info));
+
+    if ( is_pvh_domain(d) )
+    {
+        shared_info_paddr = round_pgup(vstartinfo_end) - v_start;
+        vstartinfo_end   += PAGE_SIZE;
+    }
+
     vpt_start        = round_pgup(vstartinfo_end);
     for ( nr_pt_pages = 2; ; nr_pt_pages++ )
     {
@@ -906,6 +1064,13 @@ int __init construct_dom0(
         (void)alloc_vcpu(d, i, cpu);
     }
 
+    /*
+     * pvh: we temporarily disable paging mode so that we can build cr3 needed
+     * to run on dom0's page tables.
+     */
+    save_pvh_pg_mode = d->arch.paging.mode;
+    d->arch.paging.mode = 0;
+
     /* Set up CR3 value for write_ptbase */
     if ( paging_mode_enabled(d) )
         paging_update_paging_modes(v);
@@ -971,6 +1136,15 @@ int __init construct_dom0(
                          nr_pages);
     }
 
+    if ( is_pvh_domain(d) )
+        hap_set_pvh_alloc_for_dom0(d, nr_pages);
+
+    /*
+     * We enable paging mode again so guest_physmap_add_page will do the
+     * right thing for us.
+     */
+    d->arch.paging.mode = save_pvh_pg_mode;
+
     /* Write the phys->machine and machine->phys table entries. */
     for ( pfn = 0; pfn < count; pfn++ )
     {
@@ -987,11 +1161,7 @@ int __init construct_dom0(
         if ( pfn > REVERSE_START && (vinitrd_start || pfn < initrd_pfn) )
             mfn = alloc_epfn - (pfn - REVERSE_START);
 #endif
-        if ( !is_pv_32on64_domain(d) )
-            ((unsigned long *)vphysmap_start)[pfn] = mfn;
-        else
-            ((unsigned int *)vphysmap_start)[pfn] = mfn;
-        set_gpfn_from_mfn(mfn, pfn);
+        dom0_update_physmap(d, pfn, mfn, vphysmap_start);
         if (!(pfn & 0xfffff))
             process_pending_softirqs();
     }
@@ -1007,8 +1177,8 @@ int __init construct_dom0(
             if ( !page->u.inuse.type_info &&
                  !get_page_and_type(page, d, PGT_writable_page) )
                 BUG();
-            ((unsigned long *)vphysmap_start)[pfn] = mfn;
-            set_gpfn_from_mfn(mfn, pfn);
+
+            dom0_update_physmap(d, pfn, mfn, vphysmap_start);
             ++pfn;
             if (!(pfn & 0xfffff))
                 process_pending_softirqs();
@@ -1028,11 +1198,7 @@ int __init construct_dom0(
 #ifndef NDEBUG
 #define pfn (nr_pages - 1 - (pfn - (alloc_epfn - alloc_spfn)))
 #endif
-            if ( !is_pv_32on64_domain(d) )
-                ((unsigned long *)vphysmap_start)[pfn] = mfn;
-            else
-                ((unsigned int *)vphysmap_start)[pfn] = mfn;
-            set_gpfn_from_mfn(mfn, pfn);
+            dom0_update_physmap(d, pfn, mfn, vphysmap_start);
 #undef pfn
             page++; pfn++;
             if (!(pfn & 0xfffff))
@@ -1056,6 +1222,15 @@ int __init construct_dom0(
         si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
     }
 
+    /*
+     * PVH: We need to update si->shared_info while we are on dom0 page tables,
+     * but need to defer the p2m update until after we have fixed up the
+     * page tables for PVH so that the m2p for the si pte entry returns
+     * correct pfn.
+     */
+    if ( is_pvh_domain(d) )
+        si->shared_info = shared_info_paddr;
+
     if ( is_pv_32on64_domain(d) )
         xlat_start_info(si, XLAT_start_info_console_dom0);
 
@@ -1089,8 +1264,15 @@ int __init construct_dom0(
     regs->eflags = X86_EFLAGS_IF;
 
     if ( opt_dom0_shadow )
+    {
+        if ( is_pvh_domain(d) )
+        {
+            printk("Invalid option dom0_shadow for PVH\n");
+            return -EINVAL;
+        }
         if ( paging_enable(d, PG_SH_enable) == 0 )
             paging_update_paging_modes(v);
+    }
 
     if ( supervisor_mode_kernel )
     {
@@ -1180,6 +1362,19 @@ int __init construct_dom0(
         printk(" Xen warning: dom0 kernel broken ELF: %s\n",
                elf_check_broken(&elf));
 
+    if ( is_pvh_domain(d) )
+    {
+        /* finally, fixup the page table, replacing mfns with pfns */
+        pvh_fixup_page_tables_for_hap(v, v_start, v_end);
+
+        /* the pt has correct pfn for si, now update the mfn in the p2m */
+        mfn = virt_to_mfn(d->shared_info);
+        pfn = shared_info_paddr >> PAGE_SHIFT;
+        dom0_update_physmap(d, pfn, mfn, 0);
+
+        pvh_map_all_iomem(d);
+    }
+
     iommu_dom0_init(dom0);
     return 0;
 
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index ff76598..de2e24b 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -46,7 +46,7 @@ static int gdbsx_guest_mem_io(
     return (iop->remain ? -EFAULT : 0);
 }
 
-static int update_memory_mapping(struct domain *d, unsigned long gfn,
+int update_memory_mapping(struct domain *d, unsigned long gfn,
                           unsigned long mfn, unsigned long nr_mfns,
                           bool_t add)
 {
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d3f64bd..4accab6 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -579,6 +579,21 @@ int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
     }
 }
 
+void __init hap_set_pvh_alloc_for_dom0(struct domain *d,
+                                       unsigned long num_pages)
+{
+    int rc;
+    unsigned long memkb = num_pages * (PAGE_SIZE / 1024);
+
+    /* Copied from: libxl_get_required_shadow_memory() */
+    memkb = 4 * (256 * d->max_vcpus + 2 * (memkb / 1024));
+    num_pages = ((memkb+1023)/1024) << (20 - PAGE_SHIFT);
+    paging_lock(d);
+    rc = hap_set_allocation(d, num_pages, NULL);
+    paging_unlock(d);
+    BUG_ON(rc);
+}
+
 static const struct paging_mode hap_paging_real_mode;
 static const struct paging_mode hap_paging_protected_mode;
 static const struct paging_mode hap_paging_pae_mode;
diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h
index e03f983..aab8558 100644
--- a/xen/include/asm-x86/hap.h
+++ b/xen/include/asm-x86/hap.h
@@ -63,6 +63,7 @@ int   hap_track_dirty_vram(struct domain *d,
                            XEN_GUEST_HANDLE_64(uint8) dirty_bitmap);
 
 extern const struct paging_mode *hap_paging_get_mode(struct vcpu *);
+void hap_set_pvh_alloc_for_dom0(struct domain *d, unsigned long num_pages);
 
 #endif /* XEN_HAP_H */
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index a057069..1b7286d 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -89,4 +89,7 @@ extern unsigned int xen_processor_pmbits;
 
 extern bool_t opt_dom0_vcpus_pin;
 
+extern int update_memory_mapping(struct domain *d, unsigned long gfn,
+                                 unsigned long mfn, unsigned long nr_mfns,
+                                 bool_t add_map);
+
 #endif /* __XEN_DOMAIN_H__ */
-- 
1.7.2.3
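The HAP pool sizing in hap_set_pvh_alloc_for_dom0() is easy to sanity-check
by hand. Below is a small standalone sketch of the same arithmetic; the
numbers are hypothetical, it assumes 4 KiB pages (PAGE_SHIFT = 12), and it is
not part of the patch:

    #include <stdio.h>

    /* Mirrors the formula copied from libxl_get_required_shadow_memory(). */
    static unsigned long hap_pool_pages(unsigned long dom0_pages,
                                        unsigned int max_vcpus)
    {
        unsigned long memkb = dom0_pages * (4096 / 1024);

        memkb = 4 * (256 * max_vcpus + 2 * (memkb / 1024));
        return ((memkb + 1023) / 1024) << (20 - 12);
    }

    int main(void)
    {
        /* A 4-vCPU dom0 with 4 GiB (1048576 pages): memkb = 4194304, then
         * 4 * (1024 + 8192) = 36864 KiB => 9472 pages, roughly 37 MiB. */
        printf("%lu\n", hap_pool_pages(1UL << 20, 4));
        return 0;
    }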
Mukesh Rathor
2013-Nov-23 00:03 UTC
[V2 PATCH 5/8] PVH dom0: implement XENMEM_add_to_physmap_range for x86
This preparatory patch adds support for XENMEM_add_to_physmap_range on x86
so it can be used to create a guest on PVH dom0. To this end, we add a new
function xenmem_add_to_physmap_range(), and change xenmem_add_to_physmap_once
parameters so it can be called from xenmem_add_to_physmap_range. Please note,
compat will continue to return -ENOSYS.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm.c |   73 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 6c26026..fc8dded 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4675,6 +4675,42 @@ static int xenmem_add_to_physmap(struct domain *d,
     return xenmem_add_to_physmap_once(d, xatp);
 }
 
+static int xenmem_add_to_physmap_range(struct domain *d,
+                                       struct xen_add_to_physmap_range *xatpr)
+{
+    int rc;
+
+    /* Process entries in reverse order to allow continuations */
+    while ( xatpr->size > 0 )
+    {
+        xen_ulong_t idx;
+        xen_pfn_t gpfn;
+        struct xen_add_to_physmap xatp;
+
+        if ( copy_from_guest_offset(&idx, xatpr->idxs, xatpr->size-1, 1) ||
+             copy_from_guest_offset(&gpfn, xatpr->gpfns, xatpr->size-1, 1) )
+        {
+            return -EFAULT;
+        }
+
+        xatp.space = xatpr->space;
+        xatp.idx = idx;
+        xatp.gpfn = gpfn;
+        rc = xenmem_add_to_physmap_once(d, &xatp);
+
+        if ( copy_to_guest_offset(xatpr->errs, xatpr->size-1, &rc, 1) )
+            return -EFAULT;
+
+        xatpr->size--;
+
+        /* Check for continuation if it's not the last iteration */
+        if ( xatpr->size > 0 && hypercall_preempt_check() )
+            return -EAGAIN;
+    }
+
+    return 0;
+}
+
 long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     int rc;
@@ -4689,6 +4725,10 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( copy_from_guest(&xatp, arg, 1) )
             return -EFAULT;
 
+        /* This one is only supported for add_to_physmap_range */
+        if ( xatp.space == XENMAPSPACE_gmfn_foreign )
+            return -EINVAL;
+
         d = rcu_lock_domain_by_any_id(xatp.domid);
         if ( d == NULL )
             return -ESRCH;
@@ -4716,6 +4756,39 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         return rc;
     }
 
+    case XENMEM_add_to_physmap_range:
+    {
+        struct xen_add_to_physmap_range xatpr;
+        struct domain *d;
+
+        if ( copy_from_guest(&xatpr, arg, 1) )
+            return -EFAULT;
+
+        /* This mapspace is redundant for this hypercall */
+        if ( xatpr.space == XENMAPSPACE_gmfn_range )
+            return -EINVAL;
+
+        d = rcu_lock_domain_by_any_id(xatpr.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        if ( (rc = xsm_add_to_physmap(XSM_TARGET, current->domain, d)) )
+        {
+            rcu_unlock_domain(d);
+            return rc;
+        }
+
+        rc = xenmem_add_to_physmap_range(d, &xatpr);
+
+        rcu_unlock_domain(d);
+
+        if ( rc == -EAGAIN )
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_memory_op, "ih", op, arg);
+
+        return rc;
+    }
+
     case XENMEM_set_memory_map:
     {
         struct xen_foreign_memory_map fmap;
-- 
1.7.2.3
In this patch, a new type, p2m_map_foreign, is introduced for pages that
the toolstack on PVH dom0 maps from foreign domains that it is creating
or supporting during its run time.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm/p2m-ept.c |    1 +
 xen/arch/x86/mm/p2m-pt.c  |    1 +
 xen/arch/x86/mm/p2m.c     |   28 ++++++++++++++++++++--------
 xen/include/asm-x86/p2m.h |    4 ++++
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 92d9e2d..08d1d72 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -75,6 +75,7 @@ static void ept_p2m_type_to_flags(ept_entry_t *entry, p2m_type_t type, p2m_acces
             entry->w = 0;
             break;
         case p2m_grant_map_rw:
+        case p2m_map_foreign:
             entry->r = entry->w = 1;
             entry->x = 0;
             break;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a1d5650..09b60ce 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -89,6 +89,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn)
     case p2m_ram_rw:
         return flags | P2M_BASE_FLAGS | _PAGE_RW;
     case p2m_grant_map_rw:
+    case p2m_map_foreign:
         return flags | P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
     case p2m_mmio_direct:
         if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) )
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 8f380ed..0659ef1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -525,7 +525,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
         for ( i = 0; i < (1UL << page_order); i++ )
         {
             mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
-            if ( !p2m_is_grant(t) && !p2m_is_shared(t) )
+            if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
         }
@@ -756,10 +756,9 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
-
-
-int
-set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+static int
+set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
+                    p2m_type_t gfn_p2mt)
 {
     int rc = 0;
     p2m_access_t a;
@@ -784,16 +783,29 @@ set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
     }
 
-    P2M_DEBUG("set mmio %lx %lx\n", gfn, mfn_x(mfn));
-    rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_mmio_direct, p2m->default_access);
+    P2M_DEBUG("set %d %lx %lx\n", gfn_p2mt, gfn, mfn_x(mfn));
+    rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, gfn_p2mt,
+                       p2m->default_access);
     gfn_unlock(p2m, gfn, 0);
     if ( 0 == rc )
         gdprintk(XENLOG_ERR,
-            "set_mmio_p2m_entry: set_p2m_entry failed! mfn=%08lx\n",
+            "%s: set_p2m_entry failed! mfn=%08lx\n",
+            __func__,
             mfn_x(get_gfn_query_unlocked(p2m->domain, gfn, &ot)));
     return rc;
 }
 
+/* Returns: True for success. */
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+    return set_typed_p2m_entry(d, gfn, mfn, p2m_map_foreign);
+}
+
+int
+set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct);
+}
+
 int
 clear_mmio_p2m_entry(struct domain *d, unsigned long gfn)
 {
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 43583b2..6fc71a1 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -70,6 +70,7 @@ typedef enum {
     p2m_ram_paging_in = 11,       /* Memory that is being paged in */
     p2m_ram_shared = 12,          /* Shared or sharable memory */
     p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
+    p2m_map_foreign = 14,         /* ram pages from foreign domain */
 } p2m_type_t;
 
 /*
@@ -180,6 +181,7 @@ typedef unsigned int p2m_query_t;
 #define p2m_is_sharable(_t) (p2m_to_mask(_t) & P2M_SHARABLE_TYPES)
 #define p2m_is_shared(_t)   (p2m_to_mask(_t) & P2M_SHARED_TYPES)
 #define p2m_is_broken(_t)   (p2m_to_mask(_t) & P2M_BROKEN_TYPES)
+#define p2m_is_foreign(_t)  (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))
 
 /* Per-p2m-table state */
 struct p2m_domain {
@@ -510,6 +512,8 @@ p2m_type_t p2m_change_type(struct domain *d, unsigned long gfn,
 int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn);
 
+/* Set foreign mfn in the current guest's p2m table. */
+int set_foreign_p2m_entry(struct domain *domp, unsigned long gfn, mfn_t mfn);
 
 /*
  * Populate-on-demand
-- 
1.7.2.3
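For reference, the new p2m_is_foreign() predicate is a single-bit mask test
like its siblings; with the declarations above it reduces to an equality
check on the type (a sketch, assuming the header's existing p2m_to_mask()
uses the usual one-bit-per-type encoding):

    /* p2m_is_foreign(t) is non-zero exactly when t == p2m_map_foreign,
     * i.e. when bit 14 of the type mask is set. */
    #define p2m_to_mask(_t)     (1UL << (_t))
    #define p2m_is_foreign(_t)  (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))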
In this patch, a new function, xenmem_add_foreign_to_pmap(), is added to
map pages from a foreign guest into the current dom0 for domU creation.
Such pages are typed p2m_map_foreign. Also, support is added here to
XENMEM_remove_from_physmap to remove such pages. Note, in the remove path,
we must release the refcount that was taken during the map phase.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm.c           |   92 ++++++++++++++++++++++++++++++++++++++++--
 xen/common/memory.c         |   38 +++++++++++++++--
 xen/include/public/memory.h |    2 +-
 3 files changed, 121 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fc8dded..c2fd7bf 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2810,7 +2810,7 @@ static struct domain *get_pg_owner(domid_t domid)
         goto out;
     }
 
-    if ( unlikely(paging_mode_translate(curr)) )
+    if ( !is_pvh_domain(curr) && unlikely(paging_mode_translate(curr)) )
     {
         MEM_LOG("Cannot mix foreign mappings with translated domains");
         goto out;
@@ -4520,9 +4520,83 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
     return 0;
 }
 
+/*
+ * Add frames from foreign domain to current domain's physmap. Similar to
+ * XENMAPSPACE_gmfn but the frame is foreign being mapped into current,
+ * and is not removed from foreign domain.
+ * Usage: libxl on pvh dom0 creating a guest and doing privcmd_ioctl_mmap.
+ * Side Effect: the mfn for fgfn will be refcounted so it is not lost
+ *              while mapped here. The refcnt is released in do_memory_op()
+ *              via XENMEM_remove_from_physmap.
+ * Returns: 0 ==> success
+ */
+static int xenmem_add_foreign_to_pmap(unsigned long fgfn, unsigned long gpfn,
+                                      domid_t foreign_domid)
+{
+    p2m_type_t p2mt, p2mt_prev;
+    int rc = 0;
+    unsigned long prev_mfn, mfn = 0;
+    struct domain *fdom, *currd = current->domain;
+    struct page_info *page = NULL;
+
+    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
+         !is_pvh_domain(currd) )
+        return -EINVAL;
+
+    if ( !is_control_domain(currd) ||
+         (fdom = get_pg_owner(foreign_domid)) == NULL )
+        return -EPERM;
+
+    /* following will take a refcnt on the mfn */
+    page = get_page_from_gfn(fdom, fgfn, &p2mt, P2M_ALLOC);
+    if ( !page || !p2m_is_valid(p2mt) )
+    {
+        if ( page )
+            put_page(page);
+        put_pg_owner(fdom);
+        return -EINVAL;
+    }
+    mfn = page_to_mfn(page);
+
+    /* Remove previously mapped page if it is present. */
+    prev_mfn = mfn_x(get_gfn(currd, gpfn, &p2mt_prev));
+    if ( mfn_valid(prev_mfn) )
+    {
+        if ( is_xen_heap_mfn(prev_mfn) )
+            /* Xen heap frames are simply unhooked from this phys slot */
+            guest_physmap_remove_page(currd, gpfn, prev_mfn, 0);
+        else
+            /* Normal domain memory is freed, to avoid leaking memory. */
+            guest_remove_page(currd, gpfn);
+    }
+    /*
+     * Create the new mapping. Can't use guest_physmap_add_page() because it
+     * will update the m2p table which will result in  mfn -> gpfn of dom0
+     * and not fgfn of domU.
+     */
+    if ( set_foreign_p2m_entry(currd, gpfn, _mfn(mfn)) == 0 )
+    {
+        dprintk(XENLOG_WARNING,
+                "guest_physmap_add_page failed. gpfn:%lx mfn:%lx fgfn:%lx\n",
+                gpfn, mfn, fgfn);
+        put_page(page);
+        rc = -EINVAL;
+    }
+
+    /*
+     * We must do this put_gfn after set_foreign_p2m_entry so another cpu
+     * doesn't populate the gpfn before us.
+     */
+    put_gfn(currd, gpfn);
+
+    put_pg_owner(fdom);
+    return rc;
+}
+
 static int xenmem_add_to_physmap_once(
     struct domain *d,
-    const struct xen_add_to_physmap *xatp)
+    const struct xen_add_to_physmap *xatp,
+    domid_t foreign_domid)
 {
     struct page_info *page = NULL;
     unsigned long gfn = 0; /* gcc ... */
@@ -4581,6 +4655,14 @@ static int xenmem_add_to_physmap_once(
             page = mfn_to_page(mfn);
             break;
         }
+
+        case XENMAPSPACE_gmfn_foreign:
+        {
+            rc = xenmem_add_foreign_to_pmap(xatp->idx, xatp->gpfn,
+                                            foreign_domid);
+            return rc;
+        }
+
         default:
             break;
     }
@@ -4646,7 +4728,7 @@ static int xenmem_add_to_physmap(struct domain *d,
         start_xatp = *xatp;
         while ( xatp->size > 0 )
         {
-            rc = xenmem_add_to_physmap_once(d, xatp);
+            rc = xenmem_add_to_physmap_once(d, xatp, DOMID_INVALID);
             if ( rc < 0 )
                 return rc;
 
@@ -4672,7 +4754,7 @@ static int xenmem_add_to_physmap(struct domain *d,
         return rc;
     }
 
-    return xenmem_add_to_physmap_once(d, xatp);
+    return xenmem_add_to_physmap_once(d, xatp, DOMID_INVALID);
 }
 
 static int xenmem_add_to_physmap_range(struct domain *d,
@@ -4696,7 +4778,7 @@ static int xenmem_add_to_physmap_range(struct domain *d,
         xatp.space = xatpr->space;
         xatp.idx = idx;
         xatp.gpfn = gpfn;
-        rc = xenmem_add_to_physmap_once(d, &xatp);
+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);
 
         if ( copy_to_guest_offset(xatpr->errs, xatpr->size-1, &rc, 1) )
             return -EFAULT;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 50b740f..d81df18 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -675,9 +675,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case XENMEM_remove_from_physmap:
     {
+        unsigned long mfn;
         struct xen_remove_from_physmap xrfp;
         struct page_info *page;
-        struct domain *d;
+        struct domain *d, *foreign_dom = NULL;
+        p2m_type_t p2mt, tp;
 
         if ( copy_from_guest(&xrfp, arg, 1) )
             return -EFAULT;
@@ -693,11 +695,37 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             return rc;
         }
 
-        page = get_page_from_gfn(d, xrfp.gpfn, NULL, P2M_ALLOC);
-        if ( page )
+        /*
+         * if PVH, the gfn could be mapped to a mfn from foreign domain by the
+         * user space tool during domain creation. We need to check for that,
+         * free it up from the p2m, and release refcnt on it. In such a case,
+         * page would be NULL and the following call would not have refcnt'd
+         * the page. See also xenmem_add_foreign_to_pmap().
+         */
+        page = get_page_from_gfn(d, xrfp.gpfn, &p2mt, P2M_ALLOC);
+
+        if ( page || p2m_is_foreign(p2mt) )
         {
-            guest_physmap_remove_page(d, xrfp.gpfn, page_to_mfn(page), 0);
-            put_page(page);
+            if ( page )
+                mfn = page_to_mfn(page);
+            else
+            {
+                mfn = mfn_x(get_gfn_query(d, xrfp.gpfn, &tp));
+                foreign_dom = page_get_owner(mfn_to_page(mfn));
+                ASSERT(is_pvh_domain(d));
+                ASSERT(d != foreign_dom);
+                ASSERT(p2m_is_foreign(tp));
+            }
+
+            guest_physmap_remove_page(d, xrfp.gpfn, mfn, 0);
+            if (page)
+                put_page(page);
+
+            if ( p2m_is_foreign(p2mt) )
+            {
+                put_page(mfn_to_page(mfn));
+                put_gfn(d, xrfp.gpfn);
+            }
         }
         else
             rc = -ENOENT;
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 7a26dee..e319b5d 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -208,7 +208,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
 #define XENMAPSPACE_gmfn_range   3 /* GMFN range, XENMEM_add_to_physmap only. */
 #define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
                                     * XENMEM_add_to_physmap_range only.
-                                    */
+                                    * (PVH x86 only) */
 /* ` } */
 
 /*
-- 
1.7.2.3
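To see how the pieces fit from the toolstack side, here is a hedged sketch
of the map/unmap cycle the commit message describes: a PVH dom0 kernel
(e.g. behind privcmd_ioctl_mmap) maps a foreign gfn into its own physmap
and later removes it, which also drops the refcount taken at map time.
Field names come from the hypercall handlers in patches 5 and 7;
set_xen_guest_handle() and HYPERVISOR_memory_op() are the usual guest-side
wrappers and are assumed here rather than defined by this series:

    /* Hypothetical dom0-kernel-side helpers; not part of the patch. */
    static int map_foreign_gfn(domid_t fdom, xen_ulong_t fgfn,
                               xen_pfn_t local_gpfn)
    {
        int err = 0, rc;
        struct xen_add_to_physmap_range xatpr = {
            .domid         = DOMID_SELF,               /* map into caller */
            .space         = XENMAPSPACE_gmfn_foreign,
            .size          = 1,
            .foreign_domid = fdom,                     /* owner of fgfn   */
        };

        set_xen_guest_handle(xatpr.idxs, &fgfn);       /* gfn in fdom     */
        set_xen_guest_handle(xatpr.gpfns, &local_gpfn);/* slot in dom0    */
        set_xen_guest_handle(xatpr.errs, &err);        /* per-entry error */

        rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatpr);
        return rc ? rc : err;
    }

    static int unmap_foreign_gfn(xen_pfn_t local_gpfn)
    {
        /* Releases the map-time refcount, per the removal path above. */
        struct xen_remove_from_physmap xrfp = {
            .domid = DOMID_SELF,
            .gpfn  = local_gpfn,
        };

        return HYPERVISOR_memory_op(XENMEM_remove_from_physmap, &xrfp);
    }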
Add opt_dom0pvh. Note, pvh dom0 is disabled until the fixme in
domain_build.c is resolved. The fixme is added by the patch titled
"PVH dom0: construct_dom0 changes".

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/setup.c |   19 ++++++++++++++++---
 1 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index e33c34b..de30ef6 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
 static bool_t __initdata disable_smep;
 invbool_param("smep", disable_smep);
 
+/* Boot dom0 in pvh mode */
+bool_t __initdata opt_dom0pvh;
+boolean_param("dom0pvh", opt_dom0pvh);
+
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":      Disables both ACPI table parsing and interpreter. */
 /* "acpi=force":    Override the disable blacklist. */
@@ -545,7 +549,7 @@ void __init __start_xen(unsigned long mbi_p)
 {
     char *memmap_type = NULL;
     char *cmdline, *kextra, *loader;
-    unsigned int initrdidx;
+    unsigned int initrdidx, domcr_flags = 0;
     multiboot_info_t *mbi = __va(mbi_p);
     module_t *mod = (module_t *)__va(mbi->mods_addr);
     unsigned long nr_pages, raw_max_page, modules_headroom, *module_map;
@@ -1332,8 +1336,17 @@ void __init __start_xen(unsigned long mbi_p)
     if ( !tboot_protect_mem_regions() )
         panic("Could not protect TXT memory regions\n");
 
-    /* Create initial domain 0. */
-    dom0 = domain_create(0, DOMCRF_s3_integrity, 0);
+    /*
+     * Following removed when "pvh fixme" in domain_build.c is resolved.
+     * The fixme is added by patch "PVH dom0: construct_dom0 changes".
+     */
+    if ( opt_dom0pvh )
+        panic("You do not have the correct xen version for dom0 PVH\n");
+
+    /* Create initial domain 0. */
+    domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
+    domcr_flags |= DOMCRF_s3_integrity;
+    dom0 = domain_create(0, domcr_flags, 0);
     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0() == NULL) )
         panic("Error creating domain 0\n");
 
-- 
1.7.2.3
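Since boolean_param() registers a standard Xen boolean option, once the
interim panic is removed the feature would be turned on from the bootloader
along these lines (a sketch; the file paths are placeholders, and the
accepted spellings beyond "dom0pvh=1" -- parse_bool() also takes e.g.
true/false, yes/no -- are the usual ones, assumed rather than introduced
here). Note iommu=1 matches the requirement added in patch 1 of this
series, which also forbids iommu=dom0-passthrough:

    multiboot /boot/xen.gz dom0pvh=1 iommu=1 console=vga
    module /boot/vmlinuz root=/dev/sda1 ro
    module /boot/initrd.img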
Konrad Rzeszutek Wilk
2013-Nov-25 01:19 UTC
Re: [V2 PATCH 1/8] PVH dom0: iommu related changes
Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>- For now, iommu is required for PVH dom0. Check for that.
>- For pvh, we need to do mfn_to_gmfn before calling mapping function
> intel_iommu_map_page/amd_iommu_map_page which expects a gfn.
>
>Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
>---
> xen/drivers/passthrough/iommu.c | 17 +++++++++++++++--
> 1 files changed, 15 insertions(+), 2 deletions(-)
>
>diff --git a/xen/drivers/passthrough/iommu.c
>b/xen/drivers/passthrough/iommu.c
>index 93ad122..dbbd034 100644
>--- a/xen/drivers/passthrough/iommu.c
>+++ b/xen/drivers/passthrough/iommu.c
>@@ -125,15 +125,27 @@ int iommu_domain_init(struct domain *d)
>     return hd->platform_ops->init(d);
> }
>
>+static __init void check_dom0_pvh_reqs(struct domain *d)
>+{
>+    if ( !iommu_enabled )
>+        panic("Presently, iommu must be enabled for pvh dom0\n");
>+
>+    if ( iommu_passthrough )
>+        panic("For pvh dom0, dom0-passthrough must not be enabled\n");
>+}

That is not very "check" type. You panic!

Could you instead just disable PVH dom0 booting and boot it as a normal
PV guest (or try)?

>+
> void __init iommu_dom0_init(struct domain *d)
> {
>     struct hvm_iommu *hd = domain_hvm_iommu(d);
>
>+    if ( is_pvh_domain(d) )
>+        check_dom0_pvh_reqs(d);
>+
>     if ( !iommu_enabled )
>         return;
>
>     register_keyhandler('o', &iommu_p2m_table);
>-    d->need_iommu = !!iommu_dom0_strict;
>+    d->need_iommu = is_pvh_domain(d) || !!iommu_dom0_strict;
>     if ( need_iommu(d) )
>     {
>         struct page_info *page;
>@@ -141,12 +153,13 @@ void __init iommu_dom0_init(struct domain *d)
>         page_list_for_each ( page, &d->page_list )
>         {
>             unsigned long mfn = page_to_mfn(page);
>+            unsigned long gfn = mfn_to_gmfn(d, mfn);
>             unsigned int mapping = IOMMUF_readable;
>             if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
>                  ((page->u.inuse.type_info & PGT_type_mask)
>                   == PGT_writable_page) )
>                 mapping |= IOMMUF_writable;
>-            hd->platform_ops->map_page(d, mfn, mfn, mapping);
>+            hd->platform_ops->map_page(d, gfn, mfn, mapping);
>             if ( !(i++ & 0xfffff) )
>                 process_pending_softirqs();
>         }
>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -125,15 +125,27 @@ int iommu_domain_init(struct domain *d)
>      return hd->platform_ops->init(d);
>  }
>  
> +static __init void check_dom0_pvh_reqs(struct domain *d)
> +{
> +    if ( !iommu_enabled )
> +        panic("Presently, iommu must be enabled for pvh dom0\n");
> +
> +    if ( iommu_passthrough )
> +        panic("For pvh dom0, dom0-passthrough must not be enabled\n");
> +}
> +
>  void __init iommu_dom0_init(struct domain *d)
>  {
>      struct hvm_iommu *hd = domain_hvm_iommu(d);
>  
> +    if ( is_pvh_domain(d) )
> +        check_dom0_pvh_reqs(d);
> +
>      if ( !iommu_enabled )
>          return;
>  
>      register_keyhandler('o', &iommu_p2m_table);
> -    d->need_iommu = !!iommu_dom0_strict;
> +    d->need_iommu = is_pvh_domain(d) || !!iommu_dom0_strict;

I'd prefer if you simply forced iommu_dom0_strict on in
check_dom0_pvh_reqs().

Jan
Jan Beulich
2013-Nov-25 08:43 UTC
Re: [V2 PATCH 2/8] PVH dom0: create update_memory_mapping() function
>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -46,6 +46,66 @@ static int gdbsx_guest_mem_io(
>      return (iop->remain ? -EFAULT : 0);
>  }
>  
> +static int update_memory_mapping(struct domain *d, unsigned long gfn,
> +                                 unsigned long mfn, unsigned long nr_mfns,
> +                                 bool_t add)
> +{
> +    unsigned long i;
> +    int ret;
> +
> +    ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add);
> +    if ( ret )
> +        return ret;

I think I raised this point more than once before: What is this good
for when the function (later) gets called during Dom0 construction?
If it denied access, the system would fail to boot. Hence denying
access here is pointless, and hence the check should remain in the
caller of this function.

> +    if ( add )
> +    {

Furthermore I assume that the Dom0 construction code path would
call the function with "add" set only, hence only that portion of the
code really needs breaking out.

> +        printk(XENLOG_G_INFO
> +               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> +               d->domain_id, gfn, mfn, nr_mfns);

And this printk is surely going to be rather noisy for Dom0, so
should remain in the caller too.

> +
> +        ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
> +        if ( !ret && paging_mode_translate(d) )
> +        {
> +            for ( i = 0; !ret && i < nr_mfns; i++ )
> +                if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
> +                    ret = -EIO;
> +            if ( ret )
> +            {
> +                printk(XENLOG_G_WARNING
> +                       "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
> +                       d->domain_id, gfn + i, mfn + i);

Whereas this should perhaps have its G_WARNING be conditional, with
it being plain XENLOG_ERR when is_hardware_domain(d) (if being a
better fit, you could even avoid recovery here for Dom0, perhaps by
calling panic() instead of printk() in that case).

Jan
>>> On 25.11.13 at 02:19, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>>- For now, iommu is required for PVH dom0. Check for that.
>>- For pvh, we need to do mfn_to_gmfn before calling mapping function
>> intel_iommu_map_page/amd_iommu_map_page which expects a gfn.
>>
>>Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
>>---
>> xen/drivers/passthrough/iommu.c | 17 +++++++++++++++--
>> 1 files changed, 15 insertions(+), 2 deletions(-)
>>
>>diff --git a/xen/drivers/passthrough/iommu.c
>>b/xen/drivers/passthrough/iommu.c
>>index 93ad122..dbbd034 100644
>>--- a/xen/drivers/passthrough/iommu.c
>>+++ b/xen/drivers/passthrough/iommu.c
>>@@ -125,15 +125,27 @@ int iommu_domain_init(struct domain *d)
>>     return hd->platform_ops->init(d);
>> }
>>
>>+static __init void check_dom0_pvh_reqs(struct domain *d)
>>+{
>>+    if ( !iommu_enabled )
>>+        panic("Presently, iommu must be enabled for pvh dom0\n");
>>+
>>+    if ( iommu_passthrough )
>>+        panic("For pvh dom0, dom0-passthrough must not be enabled\n");
>>+}
> 
> That is not very "check" type. You panic!
> 
> Could you instead just disable PVH dom0 booting and boot it as a normal PV
> guest (or try)?

Aiui it's too late in the game already (i.e. would require starting
over, which surely gets too complicated for initial support).

Jan
Jan Beulich
2013-Nov-25 09:03 UTC
Re: [V2 PATCH 7/8] pvh dom0: Add and remove foreign pages
>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> +static int xenmem_add_foreign_to_pmap(unsigned long fgfn, unsigned long gpfn,
> +                                      domid_t foreign_domid)
> +{
> +    p2m_type_t p2mt, p2mt_prev;
> +    int rc = 0;
> +    unsigned long prev_mfn, mfn = 0;
> +    struct domain *fdom, *currd = current->domain;
> +    struct page_info *page = NULL;
> +
> +    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
> +         !is_pvh_domain(currd) )
> +        return -EINVAL;
> +
> +    if ( !is_control_domain(currd) ||
> +         (fdom = get_pg_owner(foreign_domid)) == NULL )
> +        return -EPERM;

Is this the right approach (i.e. shouldn't this be an XSM call)?
Cc-ing Daniel...

> +    /* following will take a refcnt on the mfn */
> +    page = get_page_from_gfn(fdom, fgfn, &p2mt, P2M_ALLOC);
> +    if ( !page || !p2m_is_valid(p2mt) )
> +    {
> +        if ( page )
> +            put_page(page);
> +        put_pg_owner(fdom);
> +        return -EINVAL;
> +    }
> +    mfn = page_to_mfn(page);
> +
> +    /* Remove previously mapped page if it is present. */
> +    prev_mfn = mfn_x(get_gfn(currd, gpfn, &p2mt_prev));
> +    if ( mfn_valid(prev_mfn) )
> +    {
> +        if ( is_xen_heap_mfn(prev_mfn) )
> +            /* Xen heap frames are simply unhooked from this phys slot */
> +            guest_physmap_remove_page(currd, gpfn, prev_mfn, 0);
> +        else
> +            /* Normal domain memory is freed, to avoid leaking memory. */
> +            guest_remove_page(currd, gpfn);
> +    }
> +    /*
> +     * Create the new mapping. Can't use guest_physmap_add_page() because it
> +     * will update the m2p table which will result in mfn -> gpfn of dom0
> +     * and not fgfn of domU.
> +     */
> +    if ( set_foreign_p2m_entry(currd, gpfn, _mfn(mfn)) == 0 )
> +    {
> +        dprintk(XENLOG_WARNING,
> +                "guest_physmap_add_page failed. gpfn:%lx mfn:%lx fgfn:%lx\n",
> +                gpfn, mfn, fgfn);

Either gdprintk() or printk(XENLOG_G_...). And to be useful this
needs to identify both the mapping domain (i.e. if not using
gdprintk()) and the subject one.

> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -208,7 +208,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
>  #define XENMAPSPACE_gmfn_range   3 /* GMFN range, XENMEM_add_to_physmap
>  only. */
>  #define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
>                                      * XENMEM_add_to_physmap_range only.
> -                                    */
> +                                    * (PVH x86 only) */

Isn't this something that got introduced for ARM? If so, the way
you alter the comment is wrong.

Jan
>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> Add opt_dom0pvh. Note, pvh dom0 is disabled until the fixme in
> domain_build.c is resolved. The fixme is added by patch title:
> "PVH dom0: construct_dom0 changes"

Pretty pointless a patch then.

Jan

> Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
> ---
>  xen/arch/x86/setup.c |   19 ++++++++++++++++---
>  1 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index e33c34b..de30ef6 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
>  static bool_t __initdata disable_smep;
>  invbool_param("smep", disable_smep);
>  
> +/* Boot dom0 in pvh mode */
> +bool_t __initdata opt_dom0pvh;
> +boolean_param("dom0pvh", opt_dom0pvh);
> +
>  /* **** Linux config option: propagated to domain0. */
>  /* "acpi=off":      Disables both ACPI table parsing and interpreter. */
>  /* "acpi=force":    Override the disable blacklist. */
> @@ -545,7 +549,7 @@ void __init __start_xen(unsigned long mbi_p)
>  {
>      char *memmap_type = NULL;
>      char *cmdline, *kextra, *loader;
> -    unsigned int initrdidx;
> +    unsigned int initrdidx, domcr_flags = 0;
>      multiboot_info_t *mbi = __va(mbi_p);
>      module_t *mod = (module_t *)__va(mbi->mods_addr);
>      unsigned long nr_pages, raw_max_page, modules_headroom, *module_map;
> @@ -1332,8 +1336,17 @@ void __init __start_xen(unsigned long mbi_p)
>      if ( !tboot_protect_mem_regions() )
>          panic("Could not protect TXT memory regions\n");
>  
> -    /* Create initial domain 0. */
> -    dom0 = domain_create(0, DOMCRF_s3_integrity, 0);
> +    /*
> +     * Following removed when "pvh fixme" in domain_build.c is resolved.
> +     * The fixme is added by patch "PVH dom0: construct_dom0 changes".
> +     */
> +    if ( opt_dom0pvh )
> +        panic("You do not have the correct xen version for dom0 PVH\n");
> +
> +    /* Create initial domain 0. */
> +    domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
> +    domcr_flags |= DOMCRF_s3_integrity;
> +    dom0 = domain_create(0, domcr_flags, 0);
>      if ( IS_ERR(dom0) || (alloc_dom0_vcpu0() == NULL) )
>          panic("Error creating domain 0\n");
> 
> -- 
> 1.7.2.3
>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> These patches implement PVH dom0. Please note there is 1 fixme in
> this entire series that is being addressed under a different patch
> series. PVH dom0 creation is disabled until then.
> 
> Patches 1 thru 4 implement changes in and around construct_dom0.
> Patches 5 thru 7 are to support tool stack on PVH dom0, and have
> been mostly looked at in the past. Finally, patch 8 adds option to
> boot a dom0 in PVH mode.
> 
> These patches are based on c/s: b18e2d9 with the addition of fixes
> in epte_get_entry_emt and Roger's AP bringup/cleanup patches.

George,

assuming we can get the remaining issues sorted out within
reasonable time, I'm inclined to recommend taking these despite
the code freeze. What's your opinion in this regard?

Jan
On 11/25/2013 09:06 AM, Jan Beulich wrote:
>>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>> These patches implement PVH dom0. Please note there is 1 fixme in
>> this entire series that is being addressed under a different patch
>> series. PVH dom0 creation is disabled until then.
>>
>> Patches 1 thru 4 implement changes in and around construct_dom0.
>> Patches 5 thru 7 are to support tool stack on PVH dom0, and have
>> been mostly looked at in the past. Finally, patch 8 adds option to
>> boot a dom0 in PVH mode.
>>
>> These patches are based on c/s: b18e2d9 with the addition of fixes
>> in epte_get_entry_emt and Roger's AP bringup/cleanup patches.
>
> George,
>
> assuming we can get the remaining issues sorted out within
> reasonable time, I'm inclined to recommend taking these despite
> the code freeze. What's your opinion in this regard?

I hadn't even looked at them, assuming that "PVH for dom0" had missed
the feature freeze.

Within our decision framework, we would accept this series if we
considered PVH dom0 to be a "blocker" for the release -- a feature that
is strategically important enough to risk slipping the release
significantly. That's the basis on which the ARMv8 stuff got in.

I assume (without having looked at all) that this series introduces new
interfaces for PVH. So in addition to the normal risk that a series
like this may break existing functionality, there is the additional risk
that by rushing the feature in at the last minute, we may not have had
enough time for the interface to be reviewed, and we may end up having
to support an interface that wasn't well thought-out.

So the important questions are:

* Is there a good reason that this feature can't wait until 4.5?

* What is the risk of these changes breaking existing functionality?

* What is the risk that the new interfaces will turn out to be a bad fit
or difficult to support going forward?

 -George
On 25/11/13 11:49, George Dunlap wrote:
> On 11/25/2013 09:06 AM, Jan Beulich wrote:
>>>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>>> These patches implement PVH dom0. Please note there is 1 fixme in
>>> this entire series that is being addressed under a different patch
>>> series. PVH dom0 creation is disabled until then.
>>>
>>> Patches 1 thru 4 implement changes in and around construct_dom0.
>>> Patches 5 thru 7 are to support tool stack on PVH dom0, and have
>>> been mostly looked at in the past. Finally, patch 8 adds option to
>>> boot a dom0 in PVH mode.
>>>
>>> These patches are based on c/s: b18e2d9 with the addition of fixes
>>> in epte_get_entry_emt and Roger's AP bringup/cleanup patches.
>> George,
>>
>> assuming we can get the remaining issues sorted out within
>> reasonable time, I'm inclined to recommend taking these despite
>> the code freeze. What's your opinion in this regard?
>
> I hadn't even looked at them, assuming that "PVH for dom0" had missed
> the feature freeze.
>
> Within our decision framework, we would accept this series if we
> considered PVH dom0 to be a "blocker" for the release -- a feature that
> is strategically important enough to risk slipping the release
> significantly. That's the basis on which the ARMv8 stuff got in.
>
> I assume (without having looked at all) that this series introduces new
> interfaces for PVH. So in addition to the normal risk that a series
> like this may break existing functionality, there is the additional risk
> that by rushing the feature in at the last minute, we may not have had
> enough time for the interface to be reviewed, and we may end up having
> to support an interface that wasn't well thought-out.
>
> So the important questions are:
>
> * Is there a good reason that this feature can't wait until 4.5?
>
> * What is the risk of these changes breaking existing functionality?

I agree with everything above...

> * What is the risk that the new interfaces will turn out to be a bad fit
> or difficult to support going forward?

But I don't think this applies to PVH Dom0, the DomU interface has
already been marked as experimental, and subject to change. The same
applies here, Dom0 PVH interface would be considered experimental.
On 23/11/13 01:03, Mukesh Rathor wrote:
> Hi,
>
> These patches implement PVH dom0. Please note there is 1 fixme in
> this entire series that is being addressed under a different patch
> series. PVH dom0 creation is disabled until then.

Hello,

Thanks for the series and the git branch; it makes testing much easier.

Where is the solution for the fixme being worked on? I mean, it would
be good if you could put a link to the patch series that addresses this
issue, and rebase your Dom0 series on top of that so at least Dom0 PVH
is functional and we can test it.
>>> On 25.11.13 at 11:57, Roger Pau Monné <roger.pau@citrix.com> wrote:
> On 25/11/13 11:49, George Dunlap wrote:
>> * What is the risk that the new interfaces will turn out to be a bad fit
>> or difficult to support going forward?
>
> But I don't think this applies to PVH Dom0, the DomU interface has
> already been marked as experimental, and subject to change. The same
> applies here, Dom0 PVH interface would be considered experimental.

Actually I don't think we were viewing it that way - with 4.4 out,
consumers should be permitted to make use of the interfaces
even if the implementation is still experimental. If we didn't stick
to the interfaces once they're in a release, proper consumers
couldn't appear (i.e. the Linux patch set would not be able to
be pushed into Linus'es tree).

Jan
>>> On 25.11.13 at 11:49, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> So the important questions are:
>
> * Is there a good reason that this feature can't wait until 4.5?

Because it's still part of the original feature (having been aiming at
Dom0 support only initially) that has got pushed back twice already.

> * What is the risk of these changes breaking existing functionality?

From my looking over the code, not more than the rest of the PVH stuff.

> * What is the risk that the new interfaces will turn out to be a bad fit
> or difficult to support going forward?

Iirc the only interface wise change it does is implementing a so far
ARM-only sub-hypercall.

Jan
Daniel De Graaf
2013-Nov-25 19:00 UTC
Re: [V2 PATCH 7/8] pvh dom0: Add and remove foreign pages
On 11/25/2013 04:03 AM, Jan Beulich wrote:
>>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>> +static int xenmem_add_foreign_to_pmap(unsigned long fgfn, unsigned long gpfn,
>> +                                      domid_t foreign_domid)
>> +{
>> +    p2m_type_t p2mt, p2mt_prev;
>> +    int rc = 0;
>> +    unsigned long prev_mfn, mfn = 0;
>> +    struct domain *fdom, *currd = current->domain;
>> +    struct page_info *page = NULL;
>> +
>> +    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
>> +         !is_pvh_domain(currd) )
>> +        return -EINVAL;
>> +
>> +    if ( !is_control_domain(currd) ||
>> +         (fdom = get_pg_owner(foreign_domid)) == NULL )
>> +        return -EPERM;
>
> Is this the right approach (i.e. shouldn't this be an XSM call)? Cc-ing
> Daniel...

Yes, this should be an XSM call; it needs to explicitly check if currd
has the right to access pages from fdom.

For efficiency, rather than checking permissions here for each page,
check once in xenmem_add_to_physmap_range or next to the existing check
in arch_memory_op; in that case, combining the two checks into one as is
done for xsm_mmu_update may be preferred. Moving the other EINVAL checks
in addition could also be useful, although that requires duplicating
them in xenmem_add_to_physmap.

-- 
Daniel De Graaf
National Security Agency
Mukesh Rathor
2013-Nov-25 23:20 UTC
Re: [V2 PATCH 2/8] PVH dom0: create update_memory_mapping() function
On Mon, 25 Nov 2013 08:43:32 +0000
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -46,6 +46,66 @@ static int gdbsx_guest_mem_io(
> >      return (iop->remain ? -EFAULT : 0);
> >  }
> >
> > +static int update_memory_mapping(struct domain *d, unsigned long gfn,
> > +                                 unsigned long mfn, unsigned long nr_mfns,
> > +                                 bool_t add)
> > +{
> > +    unsigned long i;
> > +    int ret;
> > +
> > +    ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add);
> > +    if ( ret )
> > +        return ret;
>
> I think I raised this point more than once before: What is this good
> for when the function (later) gets called during Dom0 construction?
> If it denied access, the system would fail to boot. Hence denying
> access here is pointless, and hence the check should remain in the
> caller of this function.

Well, the point was to stop unauthorized use. But anyway, moved.

> > +    if ( add )
> > +    {
>
> Furthermore I assume that the Dom0 construction code path
> would call the function with "add" set only, hence only that
> portion of the code really needs breaking out.
>
> > +        printk(XENLOG_G_INFO
> > +               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> > +               d->domain_id, gfn, mfn, nr_mfns);
>
> And this printk is surely going to be rather noisy for Dom0, so
> should remain in the caller too.
>
> > +
> > +        ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
> > +        if ( !ret && paging_mode_translate(d) )
> > +        {
> > +            for ( i = 0; !ret && i < nr_mfns; i++ )
> > +                if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
> > +                    ret = -EIO;
> > +            if ( ret )
> > +            {
> > +                printk(XENLOG_G_WARNING
> > +                       "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
> > +                       d->domain_id, gfn + i, mfn + i);
>
> Whereas this should perhaps have its G_WARNING be conditional,
> with it being plain XENLOG_ERR when is_hardware_domain(d) (if
> being a better fit, you could even avoid recovery here for Dom0,
> perhaps by calling panic() instead of printk() in that case).

panic sounds good. Done.

thanks
mukesh
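To make the agreed change concrete, a rough sketch of avoiding recovery
for the hardware domain at that spot -- the exact shape is an assumption,
not the code of the later revision:

    /* Sketch: fail hard for the hardware domain instead of recovering. */
    if ( ret )
    {
        if ( is_hardware_domain(d) )
            panic("memory_map: dom%d gfn=%lx mfn=%lx ret=%d\n",
                  d->domain_id, gfn + i, mfn + i, ret);
        printk(XENLOG_G_WARNING
               "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
               d->domain_id, gfn + i, mfn + i);
        /* ... existing unwind path for non-hardware domains ... */
    }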
Mukesh Rathor
2013-Nov-26 00:32 UTC
Re: [V2 PATCH 7/8] pvh dom0: Add and remove foreign pages
On Mon, 25 Nov 2013 14:00:35 -0500
Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:

> On 11/25/2013 04:03 AM, Jan Beulich wrote:
> >>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> >> +static int xenmem_add_foreign_to_pmap(unsigned long fgfn,
> >> unsigned long gpfn,
> >> +                                      domid_t foreign_domid)
> >> +{
> >> +    p2m_type_t p2mt, p2mt_prev;
> >> +    int rc = 0;
> >> +    unsigned long prev_mfn, mfn = 0;
> >> +    struct domain *fdom, *currd = current->domain;
> >> +    struct page_info *page = NULL;
> >> +
> >> +    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
> >> +         !is_pvh_domain(currd) )
> >> +        return -EINVAL;
> >> +
> >> +    if ( !is_control_domain(currd) ||
> >> +         (fdom = get_pg_owner(foreign_domid)) == NULL )
> >> +        return -EPERM;
> >
> > Is this the right approach (i.e. shouldn't this be an XSM call)?
> > Cc-ing Daniel...
>
> Yes, this should be an XSM call; it needs to explicitly check if currd
> has the right to access pages from fdom.

I thought the control_domain would always have the right to access
pages from fdoms. If not, can you please give some hints on which
xsm call I need to use? Glancing at the xsm file, I can't figure it
out quickly...

thanks
mukesh
Mukesh Rathor
2013-Nov-26 01:18 UTC
Re: [V2 PATCH 8/8] pvh dom0: add opt_dom0pvh to setup.c
On Mon, 25 Nov 2013 09:04:47 +0000
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> > Add opt_dom0pvh. Note, pvh dom0 is disabled until the fixme in
> > domain_build.c is resolved. The fixme is added by patch title:
> > "PVH dom0: construct_dom0 changes"
>
> Pretty pointless a patch then.

It helps others know how pvh is enabled, and other contributors can
quickly use it to test things.

thanks
mukesh
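(For readers following along: since opt_dom0pvh is registered via
boolean_param(), enabling it once the panic above is removed amounts to
appending "dom0pvh" -- or "dom0pvh=1" -- to the xen.gz line of the
bootloader configuration; the exact spellings accepted are whatever
Xen's boolean command-line parsing takes.)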
On Sat, Nov 23, 2013 at 12:03 AM, Mukesh Rathor
<mukesh.rathor@oracle.com> wrote:
> - For now, iommu is required for PVH dom0. Check for that.
> - For pvh, we need to do mfn_to_gmfn before calling mapping function
>   intel_iommu_map_page/amd_iommu_map_page which expects a gfn.
>
> Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Daniel De Graaf
2013-Nov-26 15:03 UTC
Re: [V2 PATCH 7/8] pvh dom0: Add and remove foreign pages
On 11/25/2013 07:32 PM, Mukesh Rathor wrote:
> On Mon, 25 Nov 2013 14:00:35 -0500
> Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
>
>> On 11/25/2013 04:03 AM, Jan Beulich wrote:
>>>>>> On 23.11.13 at 01:03, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>>>> +static int xenmem_add_foreign_to_pmap(unsigned long fgfn,
>>>> unsigned long gpfn,
>>>> +                                      domid_t foreign_domid)
>>>> +{
>>>> +    p2m_type_t p2mt, p2mt_prev;
>>>> +    int rc = 0;
>>>> +    unsigned long prev_mfn, mfn = 0;
>>>> +    struct domain *fdom, *currd = current->domain;
>>>> +    struct page_info *page = NULL;
>>>> +
>>>> +    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
>>>> +         !is_pvh_domain(currd) )
>>>> +        return -EINVAL;
>>>> +
>>>> +    if ( !is_control_domain(currd) ||
>>>> +         (fdom = get_pg_owner(foreign_domid)) == NULL )
>>>> +        return -EPERM;
>>>
>>> Is this the right approach (i.e. shouldn't this be an XSM call)?
>>> Cc-ing Daniel...
>>
>> Yes, this should be an XSM call; it needs to explicitly check if currd
>> has the right to access pages from fdom.
>
> I thought the control_domain would always have the right to access
> pages from fdoms.

This is true unless you are creating a system with multiple control
domains or with a control domain that gives up privileges after setting
up some initial boot domains that contain secrets (disk/network
encryption, vTPMs, etc). Preventing the control domain from being able
to access pages in such domains means a compromised control domain does
not compromise the entire system.

> If not, can you please give some hints on which
> xsm call I need to use? Glancing at the xsm file, I can't figure it
> out quickly...
>
> thanks
> mukesh

This will either need a new XSM hook or a change to the prototype of
the xsm_add_to_physmap hook to add a new parameter for the foreign
domain; the latter seems the simplest change, passing NULL for pg_src
when not using XENMAPSPACE_gmfn_foreign. The hook would look
something like this:

int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *curr,
                       struct domain *target, struct domain *pg_src)
{
    int rc;
    XSM_ASSERT_ACTION(XSM_TARGET);
    rc = xsm_default_action(action, curr, target);
    if ( pg_src && !rc )
        rc = xsm_default_action(action, curr, pg_src);
    return rc;
}

with the corresponding FLASK hook:

{ ...
    rc = domain_has_perm(curr, target, SECCLASS_MMU, MMU__PHYSMAP);
    if ( pg_src && !rc )
        rc = domain_has_perm(curr, pg_src, SECCLASS_MMU,
                             MMU__MAP_READ|MMU__MAP_WRITE);
}

This will require pulling the get_pg_owner(foreign_domid) up a few
levels in order to have the struct domain* available instead of the
domid, but that doesn't seem like it would cause any issues.

-- 
Daniel De Graaf
National Security Agency
George Dunlap
2013-Nov-26 16:00 UTC
Re: [V2 PATCH 6/8] PVH dom0: Introduce p2m_map_foreign
On Sat, Nov 23, 2013 at 12:03 AM, Mukesh Rathor
<mukesh.rathor@oracle.com> wrote:
> In this patch, a new type p2m_map_foreign is introduced for pages
> that the toolstack on PVH dom0 maps from foreign domains that it is
> creating or supporting during its run time.
>
> Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>

Why is this needed? Does ARM have a concept of p2m_map_foreign? grep
doesn't turn it up... how does ARM dom0 deal with foreign mappings,
and is there a reason we can't copy their approach?

 -George
On Tue, 2013-11-26 at 16:00 +0000, George Dunlap wrote:
> On Sat, Nov 23, 2013 at 12:03 AM, Mukesh Rathor
> <mukesh.rathor@oracle.com> wrote:
> > In this patch, a new type p2m_map_foreign is introduced for pages
> > that the toolstack on PVH dom0 maps from foreign domains that it is
> > creating or supporting during its run time.
> >
> > Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
>
> Why is this needed? Does ARM have a concept of p2m_map_foreign? grep
> doesn't turn it up... how does ARM dom0 deal with foreign mappings,
> and is there a reason we can't copy their approach?

The p2m types are currently per-arch. ARM doesn't currently have much in
the way of the concept of such things, but it should eventually and
foreign map seems likely to be one of them.

Ian.
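To make the per-arch point concrete, a sketch of what the x86 side of
introducing such a type amounts to -- the numeric value and the
predicate follow the series' naming but are assumptions, not verified
against the final commit:

    /* xen/include/asm-x86/p2m.h (sketch) */
    typedef enum {
        /* ... existing types 0..12 elided ... */
        p2m_ram_broken  = 13,  /* broken page, access causes domain crash */
        p2m_map_foreign = 14,  /* RAM page mapped from a foreign domain */
    } p2m_type_t;

    #define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))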
On Mon, Nov 25, 2013 at 11:02 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 25.11.13 at 11:57, Roger Pau Monné <roger.pau@citrix.com> wrote:
>> On 25/11/13 11:49, George Dunlap wrote:
>>> * What is the risk that the new interfaces will turn out to be a bad fit
>>> or difficult to support going forward?
>>
>> But I don't think this applies to PVH Dom0, the DomU interface has
>> already been marked as experimental, and subject to change. The same
>> applies here, Dom0 PVH interface would be considered experimental.
>
> Actually I don't think we were viewing it that way - with 4.4 out,
> consumers should be permitted to make use of the interfaces
> even if the implementation is still experimental. If we didn't stick
> to the interfaces once they're in a release, proper consumers
> couldn't appear (i.e. the Linux patch set would not be able to
> be pushed into Linus'es tree).

Right; but this is an argument against checking it in for 4.4.

If Mukesh wants the non-final stuff to be checked into Xen, I think we
have to agree not to push the Linux side until it's something we can
agree to support.

 -George
George Dunlap
2013-Nov-26 17:35 UTC
Re: [V2 PATCH 6/8] PVH dom0: Introduce p2m_map_foreign
On 11/26/2013 04:11 PM, Ian Campbell wrote:
> On Tue, 2013-11-26 at 16:00 +0000, George Dunlap wrote:
>> On Sat, Nov 23, 2013 at 12:03 AM, Mukesh Rathor
>> <mukesh.rathor@oracle.com> wrote:
>>> In this patch, a new type p2m_map_foreign is introduced for pages
>>> that the toolstack on PVH dom0 maps from foreign domains that it is
>>> creating or supporting during its run time.
>>>
>>> Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
>>
>> Why is this needed? Does ARM have a concept of p2m_map_foreign? grep
>> doesn't turn it up... how does ARM dom0 deal with foreign mappings,
>> and is there a reason we can't copy their approach?
>
> The p2m types are currently per-arch. ARM doesn't currently have much in
> the way of the concept of such things, but it should eventually and
> foreign map seems likely to be one of them.

Oh, right -- I see now; the interface Mukesh is implementing --
XENMAPSPACE_gmfn_foreign and XENMEM_add_to_physmap_range -- is
basically the same as the ARM dom0 interface. That's an important piece
of information.

 -George
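For reference, a hedged sketch of how a dom0 kernel acting for the
toolstack would drive that interface -- field names as in
xen/include/public/memory.h of this era; the variable names and the
omitted error handling are made up:

    /* Map nr gfns of a DomU under construction into dom0's own p2m.
     * fgfns[] holds gfns in the foreign (DomU) p2m, gpfns[] the dom0 slots. */
    struct xen_add_to_physmap_range xatpr = {
        .domid         = DOMID_SELF,               /* map into the caller */
        .space         = XENMAPSPACE_gmfn_foreign, /* source: another domain */
        .size          = nr,
        .foreign_domid = domu_id,                  /* the foreign (DomU) id */
    };
    set_xen_guest_handle(xatpr.idxs, fgfns);
    set_xen_guest_handle(xatpr.gpfns, gpfns);
    rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatpr);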
>>> George Dunlap <George.Dunlap@eu.citrix.com> 11/26/13 6:21 PM >>>
>On Mon, Nov 25, 2013 at 11:02 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 25.11.13 at 11:57, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>> On 25/11/13 11:49, George Dunlap wrote:
>>>> * What is the risk that the new interfaces will turn out to be a bad fit
>>>> or difficult to support going forward?
>>>
>>> But I don't think this applies to PVH Dom0, the DomU interface has
>>> already been marked as experimental, and subject to change. The same
>>> applies here, Dom0 PVH interface would be considered experimental.
>>
>> Actually I don't think we were viewing it that way - with 4.4 out,
>> consumers should be permitted to make use of the interfaces
>> even if the implementation is still experimental. If we didn't stick
>> to the interfaces once they're in a release, proper consumers
>> couldn't appear (i.e. the Linux patch set would not be able to
>> be pushed into Linus'es tree).
>
>Right; but this is an argument against checking it in for 4.4.
>
>If Mukesh wants the non-final stuff to be checked into Xen, I think we
>have to agree not to push the Linux side until it's something we can
>agree to support.

Actually my earlier comment was meant to be on the DomU side only.
And I said elsewhere that there's not really any new interface in the
Dom0 code presented so far, so checking that in wouldn't nail down
anything that isn't already nailed down due to DomU support.

Jan
Mukesh Rathor
2013-Nov-27 01:19 UTC
Re: [V2 PATCH 7/8] pvh dom0: Add and remove foreign pages
On Tue, 26 Nov 2013 10:03:52 -0500
Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:

> On 11/25/2013 07:32 PM, Mukesh Rathor wrote:
> > On Mon, 25 Nov 2013 14:00:35 -0500
> > Daniel De Graaf <dgdegra@tycho.nsa.gov> wrote:
...........
> the xsm_add_to_physmap hook to add a new parameter for the foreign
> domain; the latter seems the simplest change, passing NULL for pg_src
> when not using XENMAPSPACE_gmfn_foreign. The hook would look
> something like this:
>
> int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *curr,
>                        struct domain *target, struct domain *pg_src)
> {
>     int rc;
>     XSM_ASSERT_ACTION(XSM_TARGET);
>     rc = xsm_default_action(action, curr, target);
>     if ( pg_src && !rc )
>         rc = xsm_default_action(action, curr, pg_src);
>     return rc;
> }
>
> with the corresponding FLASK hook:
>
> { ...
>     rc = domain_has_perm(curr, target, SECCLASS_MMU, MMU__PHYSMAP);
>     if ( pg_src && !rc )
>         rc = domain_has_perm(curr, pg_src, SECCLASS_MMU,
>                              MMU__MAP_READ|MMU__MAP_WRITE);
> }
>
> This will require pulling the get_pg_owner(foreign_domid) up a few
> levels in order to have the struct domain* available instead of the
> domid, but that doesn't seem like it would cause any issues.

Thanks a lot for the detailed help. Please lmk if the following looks ok:

Mukesh

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fc8dded..11c9a89 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4733,7 +4733,7 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;

-        if ( xsm_add_to_physmap(XSM_TARGET, current->domain, d) )
+        if ( xsm_add_to_physmap(XSM_TARGET, current->domain, d, NULL) )
         {
             rcu_unlock_domain(d);
             return -EPERM;
@@ -4759,7 +4759,7 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
     case XENMEM_add_to_physmap_range:
     {
         struct xen_add_to_physmap_range xatpr;
-        struct domain *d;
+        struct domain *d, *fd = NULL;

         if ( copy_from_guest(&xatpr, arg, 1) )
             return -EFAULT;
@@ -4772,7 +4772,17 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;

-        if ( (rc = xsm_add_to_physmap(XSM_TARGET, current->domain, d)) )
+        if ( xatpr.foreign_domid )
+        {
+            if ( (fd = rcu_lock_domain_by_any_id(xatpr.foreign_domid)) == NULL )
+            {
+                rcu_unlock_domain(d);
+                return -ESRCH;
+            }
+            rcu_unlock_domain(fd);
+        }
+
+        if ( (rc = xsm_add_to_physmap(XSM_TARGET, current->domain, d, fd)) )
         {
             rcu_unlock_domain(d);
             return rc;

diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index eb9e1a1..34c097d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -467,10 +467,16 @@ static XSM_INLINE int xsm_pci_config_permission(XSM_DEFAULT_ARG struct domain *d
     return xsm_default_action(action, current->domain, d);
 }

-static XSM_INLINE int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *d1, struct domain *d2)
+static XSM_INLINE int xsm_add_to_physmap(XSM_DEFAULT_ARG struct domain *d1, struct domain *d2, struct domain *d3)
 {
+    int rc;
+
     XSM_ASSERT_ACTION(XSM_TARGET);
-    return xsm_default_action(action, d1, d2);
+    rc = xsm_default_action(action, d1, d2);
+    if ( d3 && !rc )
+        rc = xsm_default_action(action, d1, d3);
+
+    return rc;
 }

 static XSM_INLINE int xsm_remove_from_physmap(XSM_DEFAULT_ARG struct domain *d1, struct domain *d2)

diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 1939453..2d29a2f 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -90,7 +90,7 @@ struct xsm_operations {
     int (*memory_adjust_reservation) (struct domain *d1, struct domain *d2);
     int (*memory_stat_reservation) (struct domain *d1, struct domain *d2);
     int (*memory_pin_page) (struct domain *d1, struct domain *d2, struct page_info *page);
-    int (*add_to_physmap) (struct domain *d1, struct domain *d2);
+    int (*add_to_physmap) (struct domain *d1, struct domain *d2, struct domain *d3);
     int (*remove_from_physmap) (struct domain *d1, struct domain *d2);
     int (*claim_pages) (struct domain *d);
@@ -344,9 +344,9 @@ static inline int xsm_memory_pin_page(xsm_default_t def, struct domain *d1, stru
     return xsm_ops->memory_pin_page(d1, d2, page);
 }

-static inline int xsm_add_to_physmap(xsm_default_t def, struct domain *d1, struct domain *d2)
+static inline int xsm_add_to_physmap(xsm_default_t def, struct domain *d1, struct domain *d2, struct domain *d3)
 {
-    return xsm_ops->add_to_physmap(d1, d2);
+    return xsm_ops->add_to_physmap(d1, d2, d3);
 }

 static inline int xsm_remove_from_physmap(xsm_default_t def, struct domain *d1, struct domain *d2)

diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index b1e2593..e541dd3 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1068,9 +1068,15 @@ static inline int flask_tmem_control(void)
     return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
 }

-static int flask_add_to_physmap(struct domain *d1, struct domain *d2)
+static int flask_add_to_physmap(struct domain *d1, struct domain *d2, struct domain *d3)
 {
-    return domain_has_perm(d1, d2, SECCLASS_MMU, MMU__PHYSMAP);
+    int rc;
+
+    rc = domain_has_perm(d1, d2, SECCLASS_MMU, MMU__PHYSMAP);
+    if ( d3 && !rc )
+        rc = domain_has_perm(d1, d3, SECCLASS_MMU,
+                             MMU__MAP_READ|MMU__MAP_WRITE);
+    return rc;
 }

 static int flask_remove_from_physmap(struct domain *d1, struct domain *d2)
On Tue, 26 Nov 2013 18:06:43 +0000
"Jan Beulich" <jbeulich@suse.com> wrote:

> >>> George Dunlap <George.Dunlap@eu.citrix.com> 11/26/13 6:21 PM >>>
> >On Mon, Nov 25, 2013 at 11:02 AM, Jan Beulich <JBeulich@suse.com>
> >wrote:
> >>>>> On 25.11.13 at 11:57, Roger Pau Monné <roger.pau@citrix.com>
> >>>>> wrote:
> >>> On 25/11/13 11:49, George Dunlap wrote:
> >>>> * What is the risk that the new interfaces will turn out to be a
> >>>> bad fit or difficult to support going forward?
> >>>
> >>> But I don't think this applies to PVH Dom0, the DomU interface has
> >>> already been marked as experimental, and subject to change. The
> >>> same applies here, Dom0 PVH interface would be considered
> >>> experimental.
> >>
> >> Actually I don't think we were viewing it that way - with 4.4 out,
> >> consumers should be permitted to make use of the interfaces
> >> even if the implementation is still experimental. If we didn't
> >> stick to the interfaces once they're in a release, proper consumers
> >> couldn't appear (i.e. the Linux patch set would not be able to
> >> be pushed into Linus'es tree).
> >
> >Right; but this is an argument against checking it in for 4.4.
> >
> >If Mukesh wants the non-final stuff to be checked into Xen, I think
> >we have to agree not to push the Linux side until it's something we
> >can agree to support.
>
> Actually my earlier comment was meant to be on the DomU side only.
> And I said elsewhere that there's not really any new interface in the
> Dom0 code presented so far, so checking that in wouldn't nail down
> anything that isn't already nailed down due to DomU support.
>
> Jan

Correct, for everything Jan has said in this thread.

Mukesh