thr3ads.net - Xen devel - [PATCH v4 0/9] xen/arm: live migration support in arndale board [Oct 2013]

If this information is useful, please help other people find it:
Share via:

Jaeyong Yoo

2013-Oct-04 04:43 UTC

[PATCH v4 0/9] xen/arm: live migration support in arndale board

Hello xen-devel,
here goes the v4 patch series. The major changes in this version is the
following:

  1) VLPT is improved. In version 3, we manually construct the xen''s
page table
     to establish VLPT, but in this version, by simply copying the guest
p2m''s
     entry into xen''s page table, we can accomplish the same with
smaller LOC
     and faster dirty-page detection.

  2) Enable SMP-domu live migration.

  3) Stability improved. For single-core domU live migration, we can migrate 512
     times in a row (512 is the maximum number of domain ID, as far as I know).
     For SMP-core domU live migration, we can migrate 50 ~ 300 times in a row,
     (minimum 50), and after that dom0 or domU crashes. We are trying to figure
     out the reason of faults. I hope I could get some feedbacks from community
     members about this.

And, here goes the brief description of each patch.

Patch 1 implements hvm save/restore.

Patch 2 implements vcpu save/restore by adding more required registers for live
migration.

Patch 3 implements ''set_memory_map'' hyerpcall for telling
hypervisor about the DomU''s memory map. This memory map is used for
dirty-page tracing (Patch 4, 7, 8, and 9)

Patch 4 implements ''get_maximum_gpfn''

Patch 5 implements ''modify_returncode'' for switching the
return value of suspend hypercall from domU.

Patch 6 is an obvious fix of a clear_guest_offset macro.

Patch 7 implements base functions for VLPT.

Patch 8 implements dirty-page tracing by using VLPT.

Patch 9 implements the toolstack part for live migration of ARM.


Best,
Jaeyong


Alexey Sokolov, Elena Pyatunina, Evgeny Fedotov, and Nikolay Martyanov (1):
  xen/arm: Implement toolstack for xl restore/save and migrate

Alexey Sokolov (1):
  xen/arm: Implement modify_returncode

Evgeny Fedotov (2):
  xen/arm: Implement set_memory_map hypercall
  xen/arm: Implement get_maximum_gpfn hypercall for arm

Jaeyong Yoo and Evgeny Fedotov (1):
  xen/arm: Implement hvm save and restore

Jaeyong Yoo and Alexey Sokolov (1):
  xen/arm: Add more registers for saving and restoring vcpu  registers

Jaeyong Yoo and Elena Pyatunina (1)
  xen/arm: Implement hypercall for dirty page tracing 

Jaeyong Yoo (2):
  xen/arm: Implement virtual-linear page table for guest      p2m
    mapping in live migration
  xen/arm: Fixing clear_guest_offset macro

 config/arm32.mk                          |   1 +
 tools/include/xen-foreign/reference.size |   2 +-
 tools/libxc/Makefile                     |   5 +
 tools/libxc/xc_arm_migrate.c             | 805 +++++++++++++++++++++++++++++++
 tools/libxc/xc_dom_arm.c                 |  12 +-
 tools/libxc/xc_domain.c                  |  44 ++
 tools/libxc/xc_resume.c                  |  25 +
 tools/libxc/xenctrl.h                    |  23 +
 tools/misc/Makefile                      |   4 +
 xen/arch/arm/Makefile                    |   1 +
 xen/arch/arm/domain.c                    |  52 ++
 xen/arch/arm/domctl.c                    | 136 +++++-
 xen/arch/arm/hvm.c                       | 228 +++++++++
 xen/arch/arm/mm.c                        | 190 +++++++-
 xen/arch/arm/p2m.c                       | 300 ++++++++++++
 xen/arch/arm/save.c                      |  66 +++
 xen/arch/arm/traps.c                     |  21 +-
 xen/common/Makefile                      |   2 +
 xen/include/asm-arm/config.h             |   6 +
 xen/include/asm-arm/domain.h             |  14 +
 xen/include/asm-arm/guest_access.h       |   5 +-
 xen/include/asm-arm/hvm/support.h        |  29 ++
 xen/include/asm-arm/mm.h                 |  29 ++
 xen/include/asm-arm/p2m.h                |   4 +
 xen/include/asm-arm/processor.h          |   2 +
 xen/include/public/arch-arm.h            |  35 ++
 xen/include/public/arch-arm/hvm/save.h   |  66 +++
 xen/include/public/memory.h              |  15 +-
 xen/include/xsm/dummy.h                  |   5 +
 xen/include/xsm/xsm.h                    |   5 +
 30 files changed, 2119 insertions(+), 13 deletions(-)
 create mode 100644 tools/libxc/xc_arm_migrate.c
 create mode 100644 xen/arch/arm/save.c
 create mode 100644 xen/include/asm-arm/hvm/support.h

-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:43 UTC

head link

[PATCH v4 1/9] xen/arm: Implement hvm save and restore

Implement save/restore of hvm context hypercall. In hvm
context save/restore, we save gic, timer and vfp registers.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/arch/arm/Makefile                  |   1 +
 xen/arch/arm/domctl.c                  |  88 ++++++++++++-
 xen/arch/arm/hvm.c                     | 228 +++++++++++++++++++++++++++++++++
 xen/arch/arm/save.c                    |  66 ++++++++++
 xen/common/Makefile                    |   2 +
 xen/include/asm-arm/hvm/support.h      |  29 +++++
 xen/include/public/arch-arm/hvm/save.h |  66 ++++++++++
 7 files changed, 479 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/save.c
 create mode 100644 xen/include/asm-arm/hvm/support.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 003ac84..8910a6c 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -31,6 +31,7 @@ obj-y += vuart.o
 obj-y += hvm.o
 obj-y += device.o
 obj-y += decode.o
+obj-y += save.o
 
 #obj-bin-y += ....o
 
diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 851ee40..86c07de 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -9,12 +9,98 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/hvm/save.h>
+#include <xen/guest_access.h>
 #include <public/domctl.h>
 
 long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 {
-    return -ENOSYS;
+    long ret = 0;
+    bool_t copyback = 0;
+
+    switch ( domctl->cmd )
+    {
+    case XEN_DOMCTL_sethvmcontext:
+    {
+        struct hvm_domain_context c = { .size = domctl->u.hvmcontext.size };
+
+        ret = -ENOMEM;
+        if ( (c.data = xmalloc_bytes(c.size)) == NULL )
+            goto sethvmcontext_out;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) !=
0 )
+            goto sethvmcontext_out;
+
+        domain_pause(d);
+        ret = hvm_load(d, &c);
+        domain_unpause(d);
+
+    sethvmcontext_out:
+        if ( c.data != NULL )
+            xfree(c.data);
+    }
+    break;
+    case XEN_DOMCTL_gethvmcontext:
+    {
+        struct hvm_domain_context c = { 0 };
+
+        ret = -EINVAL;
+
+        c.size = hvm_save_size(d);
+
+        if ( guest_handle_is_null(domctl->u.hvmcontext.buffer) )
+        {
+            /* Client is querying for the correct buffer size */
+            domctl->u.hvmcontext.size = c.size;
+            ret = 0;
+            goto gethvmcontext_out;
+        }
+        /* Check that the client has a big enough buffer */
+        ret = -ENOSPC;
+        if ( domctl->u.hvmcontext.size < c.size )
+        {
+            printk("(gethvmcontext) size error: %d and %d\n",
+                   domctl->u.hvmcontext.size, c.size );
+            goto gethvmcontext_out;
+        }
+
+        /* Allocate our own marshalling buffer */
+        ret = -ENOMEM;
+        if ( (c.data = xmalloc_bytes(c.size)) == NULL )
+        {
+            printk("(gethvmcontext) xmalloc_bytes failed: %d\n",
c.size );
+            goto gethvmcontext_out;
+        }
+
+        domain_pause(d);
+        ret = hvm_save(d, &c);
+        domain_unpause(d);
+
+        domctl->u.hvmcontext.size = c.cur;
+        if ( copy_to_guest(domctl->u.hvmcontext.buffer, c.data, c.size) != 0
)
+        {
+            printk("(gethvmcontext) copy to guest failed\n");
+            ret = -EFAULT;
+        }
+
+    gethvmcontext_out:
+        copyback = 1;
+
+        if ( c.data != NULL )
+            xfree(c.data);
+    }
+    break;
+
+    default:
+        return -EINVAL;
+    }
+
+    if ( copyback && __copy_to_guest(u_domctl, domctl, 1) )
+        ret = -EFAULT;
+
+    return ret;
 }
 
 void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
diff --git a/xen/arch/arm/hvm.c b/xen/arch/arm/hvm.c
index 471c4cd..befd75d 100644
--- a/xen/arch/arm/hvm.c
+++ b/xen/arch/arm/hvm.c
@@ -7,11 +7,13 @@
 
 #include <xsm/xsm.h>
 
+#include <xen/hvm/save.h>
 #include <public/xen.h>
 #include <public/hvm/params.h>
 #include <public/hvm/hvm_op.h>
 
 #include <asm/hypercall.h>
+#include <asm/gic.h>
 
 long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
@@ -65,3 +67,229 @@ long do_hvm_op(unsigned long op,
XEN_GUEST_HANDLE_PARAM(void) arg)
 
     return rc;
 }
+
+static void vgic_irq_rank_save(struct vgic_rank *ext,
+                               struct vgic_irq_rank *rank)
+{
+    spin_lock(&rank->lock);
+    /* Some of VGIC registers are not used yet, it is for a future usage */
+    /* IENABLE, IACTIVE, IPEND,  PENDSGI registers */
+    ext->ienable = rank->ienable;
+    ext->iactive = rank->iactive;
+    ext->ipend = rank->ipend;
+    ext->pendsgi = rank->pendsgi;
+    /* ICFG */
+    ext->icfg[0] = rank->icfg[0];
+    ext->icfg[1] = rank->icfg[1];
+    /* IPRIORITY */
+    memcpy(ext->ipriority, rank->ipriority, sizeof(rank->ipriority));
+    /* ITARGETS */
+    memcpy(ext->itargets, rank->itargets, sizeof(rank->itargets));
+    spin_unlock(&rank->lock);
+}
+
+static void vgic_irq_rank_restore(struct vgic_irq_rank *rank,
+                                  struct vgic_rank *ext)
+{
+    spin_lock(&rank->lock);
+    /* IENABLE, IACTIVE, IPEND,  PENDSGI registers */
+    rank->ienable = ext->ienable;
+    rank->iactive = ext->iactive;
+    rank->ipend = ext->ipend;
+    rank->pendsgi = ext->pendsgi;
+    /* ICFG */
+    rank->icfg[0] = ext->icfg[0];
+    rank->icfg[1] = ext->icfg[1];
+    /* IPRIORITY */
+    memcpy(rank->ipriority, ext->ipriority, sizeof(rank->ipriority));
+    /* ITARGETS */
+    memcpy(rank->itargets, ext->itargets, sizeof(rank->itargets));
+    spin_unlock(&rank->lock);
+}
+
+
+static int gic_save(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_hw_gic ctxt;
+    struct vcpu *v;
+
+    /* Save the state of GICs */
+    for_each_vcpu( d, v )
+    {
+        ctxt.gic_hcr = v->arch.gic_hcr;
+        ctxt.gic_vmcr = v->arch.gic_vmcr;
+        ctxt.gic_apr = v->arch.gic_apr;
+
+        /* Save list registers and masks */
+        /* (it is not necessary to save/restore them, but LR state can have
+         * influence on downtime after Live Migration (to be tested)
+         */
+        memcpy(ctxt.gic_lr, v->arch.gic_lr, sizeof(ctxt.gic_lr));
+        ctxt.lr_mask = v->arch.lr_mask;
+        ctxt.event_mask = v->arch.event_mask;
+
+        /* Save PPI states (per-CPU) */
+        /* It is necessary if SMP enabled */
+        vgic_irq_rank_save(&ctxt.ppi_state, 
&v->arch.vgic.private_irqs);
+
+        if ( hvm_save_entry(GIC, v->vcpu_id, h, &ctxt) != 0 )
+            return 1;
+    }
+    return 0;
+}
+
+static int gic_load(struct domain *d, hvm_domain_context_t *h)
+{
+    int vcpuid;
+    struct hvm_hw_gic ctxt;
+    struct vcpu *v;
+
+    /* Which vcpu is this? */
+    vcpuid = hvm_load_instance(h);
+    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+    {
+        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no vcpu%u\n",
+                d->domain_id, vcpuid);
+        return -EINVAL;
+    }
+
+    if ( hvm_load_entry(GIC, h, &ctxt) != 0 )
+        return -EINVAL;
+
+    v->arch.gic_hcr = ctxt.gic_hcr;
+    v->arch.gic_vmcr = ctxt.gic_vmcr;
+    v->arch.gic_apr = ctxt.gic_apr;
+
+    /* Restore list registers and masks */
+    memcpy(v->arch.gic_lr, ctxt.gic_lr, sizeof(ctxt.gic_lr));
+    v->arch.lr_mask = ctxt.lr_mask;
+    v->arch.event_mask = ctxt.event_mask;
+
+    /* Restore PPI states */
+    vgic_irq_rank_restore(&v->arch.vgic.private_irqs,
&ctxt.ppi_state);
+
+    return 0;
+}
+
+HVM_REGISTER_SAVE_RESTORE(GIC, gic_save, gic_load, 1, HVMSR_PER_VCPU);
+
+static int timer_save(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_hw_timer ctxt;
+    struct vcpu *v;
+    struct vtimer *t;
+    int i;
+
+    /* Save the state of vtimer and ptimer */
+    for_each_vcpu( d, v )
+    {
+        t = &v->arch.virt_timer;
+        for ( i = 0; i < 2; i++ )
+        {
+            ctxt.cval = t->cval;
+            ctxt.ctl = t->ctl;
+            ctxt.vtb_offset = i ? d->arch.phys_timer_base.offset :
+                d->arch.virt_timer_base.offset;
+            ctxt.type = i;
+            if ( hvm_save_entry(A15_TIMER, v->vcpu_id, h, &ctxt) != 0 )
+                return 1;
+            t = &v->arch.phys_timer;
+        }
+    }
+
+    return 0;
+}
+
+static int timer_load(struct domain *d, hvm_domain_context_t *h)
+{
+    int vcpuid;
+    struct hvm_hw_timer ctxt;
+    struct vcpu *v;
+    struct vtimer *t = NULL;
+
+    /* Which vcpu is this? */
+    vcpuid = hvm_load_instance(h);
+
+    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+    {
+        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no vcpu%u\n",
+                d->domain_id, vcpuid);
+        return -EINVAL;
+    }
+
+    if ( hvm_load_entry(A15_TIMER, h, &ctxt) != 0 )
+        return -EINVAL;
+
+
+    if ( ctxt.type == TIMER_TYPE_VIRT )
+    {
+        t = &v->arch.virt_timer;
+        d->arch.virt_timer_base.offset = ctxt.vtb_offset;
+
+    }
+    else
+    {
+        t = &v->arch.phys_timer;
+        d->arch.phys_timer_base.offset = ctxt.vtb_offset;
+    }
+
+    t->cval = ctxt.cval;
+    t->ctl = ctxt.ctl;
+    t->v = v;
+
+    return 0;
+}
+
+HVM_REGISTER_SAVE_RESTORE(A15_TIMER, timer_save, timer_load, 2,
HVMSR_PER_VCPU);
+
+static int vfp_save(struct domain *d, hvm_domain_context_t *h)
+{
+    struct hvm_hw_vfp ctxt;
+    struct vcpu *v;
+
+    /* Save the state of VFP */
+    for_each_vcpu( d, v )
+    {
+        if ( sizeof(v->arch.vfp) != sizeof (ctxt) )
+        {
+            dprintk(XENLOG_G_ERR, "HVM: check VFP structure
match\n");
+            return -EINVAL;
+        }
+        memcpy(&ctxt, &v->arch.vfp, sizeof(ctxt));
+
+        if ( hvm_save_entry(VFP, v->vcpu_id, h, &ctxt) != 0 )
+            return 1;
+    }
+    return 0;
+}
+
+static int vfp_load(struct domain *d, hvm_domain_context_t *h)
+{
+    int vcpuid;
+    struct hvm_hw_vfp ctxt;
+    struct vcpu *v;
+
+    /* Which vcpu is this? */
+    vcpuid = hvm_load_instance(h);
+    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+    {
+        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no vcpu%u\n",
+                d->domain_id, vcpuid);
+        return -EINVAL;
+    }
+
+    if ( sizeof(v->arch.vfp) != sizeof (ctxt) )
+    {
+        dprintk(XENLOG_G_ERR, "HVM: check VFP structure match\n");
+        return -EINVAL;
+    }
+
+    if ( hvm_load_entry(VFP, h, &ctxt) != 0 )
+        return -EINVAL;
+
+    memcpy(&v->arch.vfp, &ctxt, sizeof(ctxt));
+
+    return 0;
+}
+
+HVM_REGISTER_SAVE_RESTORE(VFP, vfp_save, vfp_load, 1, HVMSR_PER_VCPU);
diff --git a/xen/arch/arm/save.c b/xen/arch/arm/save.c
new file mode 100644
index 0000000..c923910
--- /dev/null
+++ b/xen/arch/arm/save.c
@@ -0,0 +1,66 @@
+/*
+ * hvm/save.c: Save and restore HVM guest''s emulated hardware state
for ARM.
+ *
+ * Copyright (c) 2013, Samsung Electronics.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <public/hvm/save.h>
+
+void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
+{
+    hdr->cpuid = READ_SYSREG32(MIDR_EL1);
+}
+
+int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
+{
+    uint32_t cpuid;
+
+    if ( hdr->magic != HVM_FILE_MAGIC )
+    {
+        printk(XENLOG_G_ERR "HVM%d restore: bad magic number
%#"PRIx32"\n",
+               d->domain_id, hdr->magic);
+        return -1;
+    }
+
+    if ( hdr->version != HVM_FILE_VERSION )
+    {
+        printk(XENLOG_G_ERR "HVM%d restore: unsupported version
%u\n",
+               d->domain_id, hdr->version);
+        return -1;
+    }
+
+    cpuid = READ_SYSREG32(MIDR_EL1);
+    if ( hdr->cpuid != cpuid )
+    {
+        printk(XENLOG_G_INFO "HVM%d restore: VM saved on one CPU "
+               "(%#"PRIx32") and restored on another
(%#"PRIx32").\n",
+               d->domain_id, hdr->cpuid, cpuid);
+        return -1;
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 6da4651..733b19f 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -61,6 +61,8 @@ subdir-$(CONFIG_COMPAT) += compat
 
 subdir-$(x86_64) += hvm
 
+subdir-$(CONFIG_ARM) += hvm
+
 subdir-$(coverage) += gcov
 
 subdir-y += libelf
diff --git a/xen/include/asm-arm/hvm/support.h
b/xen/include/asm-arm/hvm/support.h
new file mode 100644
index 0000000..8311f2f
--- /dev/null
+++ b/xen/include/asm-arm/hvm/support.h
@@ -0,0 +1,29 @@
+/*
+ * support.h: HVM support routines used by ARMv7 VE.
+ *
+ * Copyright (c) 2012, Citrix Systems
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef __ASM_ARM_HVM_SUPPORT_H__
+#define __ASM_ARM_HVM_SUPPORT_H__
+
+#include <xen/types.h>
+#include <public/hvm/ioreq.h>
+#include <xen/sched.h>
+#include <xen/hvm/save.h>
+#include <asm/processor.h>
+
+#endif /* __ASM_ARM_HVM_SUPPORT_H__ */
diff --git a/xen/include/public/arch-arm/hvm/save.h
b/xen/include/public/arch-arm/hvm/save.h
index 75b8e65..7e55384 100644
--- a/xen/include/public/arch-arm/hvm/save.h
+++ b/xen/include/public/arch-arm/hvm/save.h
@@ -26,6 +26,72 @@
 #ifndef __XEN_PUBLIC_HVM_SAVE_ARM_H__
 #define __XEN_PUBLIC_HVM_SAVE_ARM_H__
 
+#define HVM_FILE_MAGIC   0x92385520
+#define HVM_FILE_VERSION 0x00000001
+
+
+struct hvm_save_header
+{
+    uint32_t magic;             /* Must be HVM_FILE_MAGIC */
+    uint32_t version;           /* File format version */
+    uint64_t changeset;         /* Version of Xen that saved this file */
+    uint32_t cpuid;             /* MIDR_EL1 on the saving machine */
+};
+
+DECLARE_HVM_SAVE_TYPE(HEADER, 1, struct hvm_save_header);
+
+struct vgic_rank
+{
+    uint32_t ienable, iactive, ipend, pendsgi;
+    uint32_t icfg[2];
+    uint32_t ipriority[8];
+    uint32_t itargets[8];
+};
+
+struct hvm_hw_gic
+{
+    uint32_t gic_hcr;
+    uint32_t gic_vmcr;
+    uint32_t gic_apr;
+    uint32_t gic_lr[64];
+    uint64_t event_mask;
+    uint64_t lr_mask;
+    struct vgic_rank ppi_state;
+};
+
+DECLARE_HVM_SAVE_TYPE(GIC, 2, struct hvm_hw_gic);
+
+#define TIMER_TYPE_VIRT 0
+#define TIMER_TYPE_PHYS 1
+
+struct hvm_hw_timer
+{
+    uint64_t vtb_offset;
+    uint32_t ctl;
+    uint64_t cval;
+    uint32_t type;
+};
+
+DECLARE_HVM_SAVE_TYPE(A15_TIMER, 3, struct hvm_hw_timer);
+
+struct hvm_hw_vfp
+{
+    uint64_t fpregs1[16]; /* {d0-d15} */
+    uint64_t fpregs2[16]; /* {d16-d31} */
+    uint32_t fpexc;
+    uint32_t fpscr;
+    /* VFP implementation specific state */
+    uint32_t fpinst;
+    uint32_t fpinst2;
+};
+
+DECLARE_HVM_SAVE_TYPE(VFP, 4, struct hvm_hw_vfp);
+
+/*
+ * Largest type-code in use
+ */
+#define HVM_SAVE_CODE_MAX 4
+
 #endif
 
 /*
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:43 UTC

head link

[PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Add more registers for saving and restoring vcpu registers for live migration.
Those registers are selected from the registers stored when vcpu context
switching.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 tools/include/xen-foreign/reference.size |  2 +-
 xen/arch/arm/domain.c                    | 34 +++++++++++++++++++++++++++++++
 xen/arch/arm/domctl.c                    | 35 ++++++++++++++++++++++++++++++++
 xen/include/public/arch-arm.h            | 35 ++++++++++++++++++++++++++++++++
 4 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-foreign/reference.size
b/tools/include/xen-foreign/reference.size
index b3347b4..dd7b0f8 100644
--- a/tools/include/xen-foreign/reference.size
+++ b/tools/include/xen-foreign/reference.size
@@ -5,7 +5,7 @@ start_info                |       -       -    1112    1168
 trap_info                 |       -       -       8      16
 cpu_user_regs             |       -       -      68     200
 vcpu_guest_core_regs      |     304     304       -       -
-vcpu_guest_context        |     336     336    2800    5168
+vcpu_guest_context        |     440     440    2800    5168
 arch_vcpu_info            |       0       0      24      16
 vcpu_time_info            |      32      32      32      32
 vcpu_info                 |      48      48      64      64
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index cb0424d..d3bac4d 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -601,6 +601,40 @@ int arch_set_info_guest(
     v->arch.ttbr1 = ctxt->ttbr1;
     v->arch.ttbcr = ctxt->ttbcr;
 
+    v->arch.dacr = ctxt->dacr;
+    v->arch.ifar = ctxt->ifar;
+    v->arch.ifsr = ctxt->ifsr;
+    v->arch.dfar = ctxt->dfar;
+    v->arch.dfsr = ctxt->dfsr;
+
+#ifdef CONFIG_ARM_32
+    v->arch.mair0 = ctxt->mair0;
+    v->arch.mair1 = ctxt->mair1;
+#else
+    v->arch.mair = ctxt->mair;
+#endif
+
+    /* Control Registers */
+    v->arch.actlr = ctxt->actlr;
+    v->arch.cpacr = ctxt->cpacr;
+
+    v->arch.contextidr = ctxt->contextidr;
+    v->arch.tpidr_el0 = ctxt->tpidr_el0;
+    v->arch.tpidr_el1 = ctxt->tpidr_el1;
+    v->arch.tpidrro_el0 = ctxt->tpidrro_el0;
+
+    /* CP 15 */
+    v->arch.csselr = ctxt->csselr;
+
+    v->arch.afsr0 = ctxt->afsr0;
+    v->arch.afsr1 = ctxt->afsr1;
+    v->arch.vbar = ctxt->vbar;
+    v->arch.par = ctxt->par;
+    v->arch.teecr = ctxt->teecr;
+    v->arch.teehbr = ctxt->teehbr;
+    v->arch.joscr = ctxt->joscr;
+    v->arch.jmcr = ctxt->jmcr;
+
     v->is_initialised = 1;
 
     if ( ctxt->flags & VGCF_online )
diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 86c07de..78fb322 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -115,6 +115,41 @@ void arch_get_info_guest(struct vcpu *v,
vcpu_guest_context_u c)
     ctxt->ttbr1 = v->arch.ttbr1;
     ctxt->ttbcr = v->arch.ttbcr;
 
+    ctxt->dacr = v->arch.dacr;
+    ctxt->ifar = v->arch.ifar;
+    ctxt->ifsr = v->arch.ifsr;
+    ctxt->dfar = v->arch.dfar;
+    ctxt->dfsr = v->arch.dfsr;
+
+#ifdef CONFIG_ARM_32
+    ctxt->mair0 = v->arch.mair0;
+    ctxt->mair1 = v->arch.mair1;
+#else
+    ctxt->mair = v->arch.mair;
+#endif
+
+    /* Control Registers */
+    ctxt->actlr = v->arch.actlr;
+    ctxt->sctlr = v->arch.sctlr;
+    ctxt->cpacr = v->arch.cpacr;
+
+    ctxt->contextidr = v->arch.contextidr;
+    ctxt->tpidr_el0 = v->arch.tpidr_el0;
+    ctxt->tpidr_el1 = v->arch.tpidr_el1;
+    ctxt->tpidrro_el0 = v->arch.tpidrro_el0;
+
+    /* CP 15 */
+    ctxt->csselr = v->arch.csselr;
+
+    ctxt->afsr0 = v->arch.afsr0;
+    ctxt->afsr1 = v->arch.afsr1;
+    ctxt->vbar = v->arch.vbar;
+    ctxt->par = v->arch.par;
+    ctxt->teecr = v->arch.teecr;
+    ctxt->teehbr = v->arch.teehbr;
+    ctxt->joscr = v->arch.joscr;
+    ctxt->jmcr = v->arch.jmcr;
+
     if ( !test_bit(_VPF_down, &v->pause_flags) )
         ctxt->flags |= VGCF_online;
 }
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 5d359af..bf6dc9a 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -253,6 +253,41 @@ struct vcpu_guest_context {
 
     uint32_t sctlr, ttbcr;
     uint64_t ttbr0, ttbr1;
+    uint32_t ifar, dfar;
+    uint32_t ifsr, dfsr;
+    uint32_t dacr;
+    uint64_t par;
+
+#ifdef CONFIG_ARM_32
+    uint32_t mair0, mair1;
+    uint32_t tpidr_el0;
+    uint32_t tpidr_el1;
+    uint32_t tpidrro_el0;
+    uint32_t vbar;
+#else
+    uint64_t mair;
+    uint64_t tpidr_el0;
+    uint64_t tpidr_el1;
+    uint64_t tpidrro_el0;
+    uint64_t vbar;
+#endif
+
+    /* Control Registers */
+    uint32_t actlr;
+    uint32_t cpacr;
+
+    uint32_t afsr0, afsr1;
+
+    uint32_t contextidr;
+
+    uint32_t teecr, teehbr; /* ThumbEE, 32-bit guests only */
+
+#ifdef CONFIG_ARM_32
+    uint32_t joscr, jmcr;
+#endif
+
+    /* CP 15 */
+    uint32_t csselr;
 };
 typedef struct vcpu_guest_context vcpu_guest_context_t;
 DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:43 UTC

head link

[PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

From: Evgeny Fedotov <e.fedotov@samsung.com>

When creating domU in toolstack, pass the guest memory
map info to the hypervisor, and the hypervisor stores those info in
arch_domain for later use.

Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
---
 tools/libxc/xc_dom_arm.c     | 12 +++++++-
 tools/libxc/xc_domain.c      | 44 ++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h        | 23 +++++++++++++++
 xen/arch/arm/domain.c        |  3 ++
 xen/arch/arm/mm.c            | 68 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/domain.h |  2 ++
 xen/include/asm-arm/mm.h     |  1 +
 xen/include/public/memory.h  | 15 ++++++++--
 xen/include/xsm/dummy.h      |  5 ++++
 xen/include/xsm/xsm.h        |  5 ++++
 10 files changed, 175 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index df59ffb..20c9095 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -166,6 +166,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 {
     int rc;
     xen_pfn_t pfn, allocsz, i;
+    struct dt_mem_info memmap;
 
     dom->shadow_enabled = 1;
 
@@ -191,7 +192,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             0, 0, &dom->p2m_host[i]);
     }
 
-    return 0;
+    /* setup guest memory map */
+    memmap.nr_banks = 2;
+    memmap.bank[0].start = (dom->rambase_pfn << PAGE_SHIFT_ARM);
+    memmap.bank[0].size = (dom->total_pages << PAGE_SHIFT_ARM);
+    /*The end of main memory: magic pages */
+    memmap.bank[1].start = memmap.bank[0].start + memmap.bank[0].size;
+    memmap.bank[1].size = NR_MAGIC_PAGES << PAGE_SHIFT_ARM;
+
+    return xc_domain_set_memory_map(dom->xch, dom->guest_domid,
&memmap);
+
 }
 
 int arch_setup_bootearly(struct xc_dom_image *dom)
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 81316d3..7b283f1 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -662,7 +662,51 @@ int xc_domain_set_memmap_limit(xc_interface *xch,
     return -1;
 }
 #endif
+#if defined(__arm__)
+int xc_domain_get_memory_map(xc_interface *xch,
+                              uint32_t domid,
+                              struct dt_mem_info *map)
+{
+    int rc;
+    struct xen_arm_memory_map fmap = {
+           .domid = domid
+    };
+
+    DECLARE_HYPERCALL_BOUNCE(map, sizeof(struct dt_mem_info),
+                                 XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( !map || xc_hypercall_bounce_pre(xch, map) )
+            return -1;
+    set_xen_guest_handle(fmap.buffer, map);
 
+    rc = do_memory_op(xch, XENMEM_memory_map, &fmap, sizeof(fmap));
+
+    xc_hypercall_bounce_post(xch, map);
+
+    return rc;
+}
+
+int xc_domain_set_memory_map(xc_interface *xch,
+                              uint32_t domid,
+                              struct dt_mem_info *map)
+{
+    int rc;
+    struct xen_arm_memory_map fmap = {
+        .domid = domid
+    };
+    DECLARE_HYPERCALL_BOUNCE(map, sizeof(struct dt_mem_info),
+                                    XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( !map || xc_hypercall_bounce_pre(xch, map) )
+                return -1;
+    set_xen_guest_handle(fmap.buffer, map);
+
+    rc = do_memory_op(xch, XENMEM_set_memory_map, &fmap, sizeof(fmap));
+
+    xc_hypercall_bounce_post(xch, map);
+    return rc;
+}
+#endif
 int xc_domain_set_time_offset(xc_interface *xch,
                               uint32_t domid,
                               int32_t time_offset_seconds)
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 58d51f3..c1468fa 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1122,6 +1122,29 @@ int xc_get_machine_memory_map(xc_interface *xch,
                               struct e820entry entries[],
                               uint32_t max_entries);
 #endif
+
+#if defined(__arm__)
+#define NR_MEM_BANKS 8
+typedef uint64_t paddr_t;
+
+struct membank {
+    paddr_t start;
+    paddr_t size;
+};
+
+struct dt_mem_info {
+    int nr_banks;
+    struct membank bank[NR_MEM_BANKS];
+};
+
+int xc_domain_set_memory_map(xc_interface *xch,
+                              uint32_t domid,
+                              struct dt_mem_info *map);
+int xc_domain_get_memory_map(xc_interface *xch,
+                              uint32_t domid,
+                              struct dt_mem_info *map);
+#endif
+
 int xc_domain_set_time_offset(xc_interface *xch,
                               uint32_t domid,
                               int32_t time_offset_seconds);
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index d3bac4d..57be341 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -509,6 +509,9 @@ int arch_domain_create(struct domain *d, unsigned int
domcr_flags)
     /* Default the virtual ID to match the physical */
     d->arch.vpidr = boot_cpu_data.midr.bits;
 
+    spin_lock_init(&d->arch.map_lock);
+    d->arch.map_domain.nr_banks = 0;
+
     clear_page(d->shared_info);
     share_xen_page_with_guest(
         virt_to_page(d->shared_info), d, XENSHARE_writable);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 474dfef..f5d9cf4 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1156,6 +1156,74 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void)
arg)
 
         return rc;
     }
+    case XENMEM_set_memory_map:
+    {
+        struct xen_arm_memory_map fmap;
+        struct domain *d;
+        struct dt_mem_info info;
+
+        if ( copy_from_guest(&fmap, arg, 1) )
+            return -EFAULT;
+
+        if ( copy_from_guest(&info, fmap.buffer, 1) )
+        {
+            return -EFAULT;
+        }
+
+        if ( info.nr_banks > NR_MEM_BANKS )
+              return -EINVAL;
+
+        d = rcu_lock_domain_by_any_id(fmap.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = xsm_domain_memory_map(XSM_TARGET, d);
+        if ( rc )
+        {
+            rcu_unlock_domain(d);
+            return rc;
+        }
+        spin_lock(&d->arch.map_lock);
+        d->arch.map_domain = info;
+        spin_unlock(&d->arch.map_lock);
+
+        rcu_unlock_domain(d);
+        return rc;
+        }
+
+        case XENMEM_memory_map:
+        {
+        /* get the domain''s memory map as it was stored */
+        struct xen_arm_memory_map fmap;
+        struct domain *d;
+        struct dt_mem_info info;
+
+        if ( copy_from_guest(&fmap, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(fmap.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        spin_lock(&d->arch.map_lock);
+        info = d->arch.map_domain;
+        spin_unlock(&d->arch.map_lock);
+
+        if ( copy_to_guest(fmap.buffer, &info, 1) )
+        {
+            rcu_unlock_domain(d);
+            return -EFAULT;
+        }
+
+        if ( copy_to_guest(arg, &fmap, 1) )
+        {
+            rcu_unlock_domain(d);
+            return -EFAULT;
+        }
+
+        rcu_unlock_domain(d);
+        return 0;
+    }
     /* XXX: memsharing not working yet */
     case XENMEM_get_sharing_shared_pages:
     case XENMEM_get_sharing_freed_pages:
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 67bfbbc..755c7af 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -112,6 +112,8 @@ struct arch_domain
         spinlock_t                  lock;
     } vuart;
 
+    struct dt_mem_info map_domain;
+    spinlock_t         map_lock;
 }  __cacheline_aligned;
 
 struct arch_vcpu
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index ce66099..5142524 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -5,6 +5,7 @@
 #include <xen/kernel.h>
 #include <asm/page.h>
 #include <public/xen.h>
+#include <xen/device_tree.h>
 
 /* Align Xen to a 2 MiB boundary. */
 #define XEN_PADDR_ALIGN (1 << 21)
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 7a26dee..264fb8f 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -283,9 +283,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_remove_from_physmap_t);
 /*#define XENMEM_translate_gpfn_list  8*/
 
 /*
- * Returns the pseudo-physical memory map as it was when the domain
+ * x86: returns the pseudo-physical memory map as it was when the domain
  * was started (specified by XENMEM_set_memory_map).
  * arg == addr of xen_memory_map_t.
+ * ARM: returns the pseudo-physical memory map as it was set
+ * (specified by XENMEM_set_memory_map).
+ * arg == addr of xen_arm_memory_map_t.
  */
 #define XENMEM_memory_map           9
 struct xen_memory_map {
@@ -315,7 +318,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_memory_map_t);
 /*
  * Set the pseudo-physical memory map of a domain, as returned by
  * XENMEM_memory_map.
- * arg == addr of xen_foreign_memory_map_t.
+ * x86: arg == addr of xen_foreign_memory_map_t.
+ * ARM: arg == addr of xen_arm_memory_map_t
  */
 #define XENMEM_set_memory_map       13
 struct xen_foreign_memory_map {
@@ -325,6 +329,13 @@ struct xen_foreign_memory_map {
 typedef struct xen_foreign_memory_map xen_foreign_memory_map_t;
 DEFINE_XEN_GUEST_HANDLE(xen_foreign_memory_map_t);
 
+struct xen_arm_memory_map {
+    domid_t domid;
+    XEN_GUEST_HANDLE(void) buffer;
+};
+typedef struct xen_arm_memory_map xen_arm_memory_map_t;
+DEFINE_XEN_GUEST_HANDLE(xen_arm_memory_map_t);
+
 #define XENMEM_set_pod_target       16
 #define XENMEM_get_pod_target       17
 struct xen_pod_target {
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 052f3e0..427f800 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -627,4 +627,9 @@ static XSM_INLINE int xsm_map_gmfn_foreign(XSM_DEFAULT_ARG
struct domain *d, str
     XSM_ASSERT_ACTION(XSM_TARGET);
     return xsm_default_action(action, d, t);
 }
+static XSM_INLINE int xsm_domain_memory_map(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
 #endif
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 1939453..9764011 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -625,6 +625,11 @@ static inline int xsm_map_gmfn_foreign (struct domain *d,
struct domain *t)
 {
     return xsm_ops->map_gmfn_foreign(d, t);
 }
+static inline int xsm_domain_memory_map(xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->domain_memory_map(d);
+}
+
 #endif /* CONFIG_ARM */
 
 #endif /* XSM_NO_WRAPPERS */
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 4/9] xen/arm: Implement get_maximum_gpfn hypercall for arm

From: Evgeny Fedotov <e.fedotov@samsung.com>

By using the memory map info in arch_domain (from
set_memory_map hypercall) implement get_maximum_gpfn hypercall.

Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
---
 xen/arch/arm/mm.c        | 21 ++++++++++++++++++++-
 xen/include/asm-arm/mm.h |  2 ++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index f5d9cf4..120eae8 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -918,9 +918,28 @@ int page_is_ram_type(unsigned long mfn, unsigned long
mem_type)
     return 0;
 }
 
+void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t *end)
+{
+    int nr_banks = d->arch.map_domain.nr_banks;
+
+    ASSERT(nr_banks != 0);
+
+    spin_lock(&d->arch.map_lock);
+    if ( start )
+        *start = d->arch.map_domain.bank[0].start;
+    if ( end )
+        *end = d->arch.map_domain.bank[nr_banks-1].start +
+               d->arch.map_domain.bank[nr_banks-1].size;
+    spin_unlock(&d->arch.map_lock);
+}
+
 unsigned long domain_get_maximum_gpfn(struct domain *d)
 {
-    return -ENOSYS;
+    vaddr_t end;
+
+    get_gma_start_end(d, NULL, &end);
+
+    return (unsigned long) (end >> PAGE_SHIFT);
 }
 
 void share_xen_page_with_guest(struct page_info *page,
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 5142524..021e8d6 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -342,6 +342,8 @@ static inline void put_page_and_type(struct page_info *page)
     put_page(page);
 }
 
+void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t *end);
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 5/9] xen/arm: Implement modify_returncode

From: Alexey Sokolov <sokolov.a@samsung.com>

Making sched_op in do_suspend (driver/xen/manage.c) returns 0 on
the success of suspend.

Singed-off-by: Alexey Sokolov <sokolov.a@samsung.com>
---
 tools/libxc/xc_resume.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index cb61650..934158c 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -73,6 +73,31 @@ static int modify_returncode(xc_interface *xch, uint32_t
domid)
     return 0;
 }
 
+#elif defined(__arm__)
+
+static int modify_returncode(xc_interface *xch, uint32_t domid)
+{
+    vcpu_guest_context_any_t ctxt;
+    xc_dominfo_t info;
+    int rc;
+
+    if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 )
+    {
+        PERROR("Could not get domain info");
+        return -1;
+    }
+
+    if ( (rc = xc_vcpu_getcontext(xch, domid, 0, &ctxt)) != 0 )
+        return rc;
+
+    ctxt.c.user_regs.r0_usr = 1;
+
+    if ( (rc = xc_vcpu_setcontext(xch, domid, 0, &ctxt)) != 0 )
+        return rc;
+
+    return 0;
+}
+
 #else
 
 static int modify_returncode(xc_interface *xch, uint32_t domid)
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

Fix the the broken macro ''clear_guest_offset'' in arm.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/include/asm-arm/guest_access.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/include/asm-arm/guest_access.h
b/xen/include/asm-arm/guest_access.h
index 34aae14..8ff088f 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -77,8 +77,9 @@ unsigned long raw_clear_guest(void *to, unsigned len);
  * Clear an array of objects in guest context via a guest handle,
  * specifying an offset into the guest array.
  */
-#define clear_guest_offset(hnd, off, ptr, nr) ({      \
-    raw_clear_guest(_d+(off), nr);  \
+#define clear_guest_offset(hnd, off, nr) ({    \
+    void *_d = (hnd).p;                        \
+    raw_clear_guest(_d+(off), nr);             \
 })
 
 /*
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

Implement vlpt (virtual-linear page table) for fast accessing of 3rd
PTE of guest p2m. When creating a mapping for vlpt, just copy the
1st level PTE of guest p2m to the xen''s 2nd level PTE. Then the mapping
becomes the following:
    xen''s 1st PTE -->
    xen''s 2nd PTE (which is the same as 1st PTE of guest p2m) -->
    guest p2m''s 2nd PTE -->
    guest p2m''s 3rd PTE (the memory contents where the vlpt points)
For more info about vlpt, see:
http://www.technovelty.org/linux/virtual-linear-page-table.html

This function is used in dirty-page tracing: when domU write-fault is trapped by
xen,
xen can immediately locate the 3rd PTE of guest p2m.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/arch/arm/mm.c            | 78 ++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/config.h |  6 ++++
 xen/include/asm-arm/domain.h |  7 ++++
 xen/include/asm-arm/mm.h     | 22 +++++++++++++
 4 files changed, 113 insertions(+)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 120eae8..98edbc9 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
         return 1;
     return 0;
 }
+
+/* flush the vlpt area */
+static void flush_vlpt(struct domain *d)
+{
+    int flush_size;
+    flush_size = (d->arch.dirty.second_lvl_end -
+                  d->arch.dirty.second_lvl_start) << SECOND_SHIFT;
+    /* flushing the 3rd level mapping */
+    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
+                                flush_size);
+}
+
+/* restore the xen page table for vlpt mapping for domain d */
+void restore_vlpt(struct domain *d)
+{
+    int i, j = 0;
+    int need_flush = 0;
+
+    for ( i = d->arch.dirty.second_lvl_start;
+          i < d->arch.dirty.second_lvl_end;
+          ++i, ++j )
+    {
+        if ( xen_second[i].bits != d->arch.dirty.second_lvl[j].bits )
+        {
+            write_pte(&xen_second[i], d->arch.dirty.second_lvl[j]);
+            need_flush = 1;
+        }
+    }
+
+    if ( need_flush ) flush_vlpt(d);
+}
+
+/* setting up the xen page table for vlpt mapping for domain d */
+void prepare_vlpt(struct domain *d)
+{
+    int xen_second_linear_base;
+    int gp2m_start_index, gp2m_end_index;
+    struct p2m_domain *p2m = &d->arch.p2m;
+    struct page_info *second_lvl_page;
+    vaddr_t gma_start = 0;
+    vaddr_t gma_end = 0;
+    lpae_t *first;
+    int i, j;
+
+    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);
+    get_gma_start_end(d, &gma_start, &gma_end);
+
+    gp2m_start_index = gma_start >> FIRST_SHIFT;
+    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
+
+    second_lvl_page = alloc_domheap_page(NULL, 0);
+    d->arch.dirty.second_lvl = map_domain_page_global(
+                                 page_to_mfn(second_lvl_page) );
+
+    first = __map_domain_page(p2m->first_level);
+    for ( i = gp2m_start_index, j = 0; i < gp2m_end_index; ++i, ++j )
+    {
+        write_pte(&xen_second[xen_second_linear_base+i],
+                  first[i]);
+
+        /* we copy the mapping into domain''s structure as a reference
in case
+         * of the context switch (used in restore_vlpt) */
+        d->arch.dirty.second_lvl[j] = first[i];
+    }
+    unmap_domain_page(first);
+
+    /* storing the start and end index */
+    d->arch.dirty.second_lvl_start = xen_second_linear_base +
gp2m_start_index;
+    d->arch.dirty.second_lvl_end = xen_second_linear_base + gp2m_end_index;
+
+    flush_vlpt(d);
+}
+
+void cleanup_vlpt(struct domain *d)
+{
+    unmap_domain_page_global(d->arch.dirty.second_lvl);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index 9e395c2..47fc26e 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -87,6 +87,7 @@
  *   0  -   8M   <COMMON>
  *
  *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
+ * 128M - 256M   Virtual-linear mapping to P2M table
  * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
  *                    space
  *
@@ -132,6 +133,9 @@
 
 #define VMAP_VIRT_END    XENHEAP_VIRT_START
 
+#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
+#define VIRT_LIN_P2M_END       VMAP_VIRT_START
+
 #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */
 
 /* Number of domheap pagetable pages required at the second level (2MB
mappings) */
@@ -158,6 +162,8 @@
 
 #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
 
+/*TODO (ARM_64): define VIRT_LIN_P2M_START VIRT_LIN_P2M_END */
+
 #endif
 
 /* Fixmap slots */
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 755c7af..7e7743a 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -112,6 +112,13 @@ struct arch_domain
         spinlock_t                  lock;
     } vuart;
 
+    /* dirty-page tracing */
+    struct {
+        int second_lvl_start;            /* for context switch */
+        int second_lvl_end;
+        lpae_t *second_lvl;
+    } dirty;
+
     struct dt_mem_info map_domain;
     spinlock_t         map_lock;
 }  __cacheline_aligned;
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 021e8d6..b9cbcf2 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -344,6 +344,28 @@ static inline void put_page_and_type(struct page_info
*page)
 
 void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t *end);
 
+/* vlpt (virtual linear page table) related functions */
+void prepare_vlpt(struct domain *d);
+void cleanup_vlpt(struct domain *d);
+void restore_vlpt(struct domain *d);
+
+/* calculate the xen''s virtual address for accessing the leaf PTE of
+ * a given address (GPA) */
+static inline lpae_t * get_vlpt_3lvl_pte(vaddr_t addr)
+{
+    unsigned long ipa_third = third_table_offset(addr);
+    unsigned long ipa_second = second_table_offset(addr);
+    unsigned long ipa_first = first_table_offset(addr);
+
+    /* guest p2m''s first level index is the index of xen''s
second level,
+       and guest p2m''s second level index is the index of
xen''s third level */
+    return (lpae_t *)(
+        ( (second_linear_offset(VIRT_LIN_P2M_START) + ipa_first)
+           << SECOND_SHIFT) +
+        (ipa_second << THIRD_SHIFT) +
+        (ipa_third << 3) );
+}
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Add hypercall (shadow op: enable/disable and clean/peek dirted page bitmap).
It consists of two parts: dirty page detecting and saving.
For detecting, we setup the guest p2m''s leaf PTE read-only and whenever
the guest tries to write something, permission fault happens and traps into xen.
The permission-faulted GPA should be saved for the toolstack (when it wants to
see
which pages are dirted). For this purpose, we temporarily save the GPAs into
linked
list by using ''add_mapped_vaddr'' function and when toolstack
wants
(log_dirty_op function) the GPAs are copied into bitmap and the linnked list is
flushed.

Additionally, for supporting parallel migration of domUs, vlpt area should be
context
switched.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/arch/arm/domain.c           |  15 ++
 xen/arch/arm/domctl.c           |  13 ++
 xen/arch/arm/mm.c               |  23 ++-
 xen/arch/arm/p2m.c              | 300 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c            |  21 ++-
 xen/include/asm-arm/domain.h    |   5 +
 xen/include/asm-arm/mm.h        |   4 +
 xen/include/asm-arm/p2m.h       |   4 +
 xen/include/asm-arm/processor.h |   2 +
 9 files changed, 382 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 57be341..40ebeed 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -215,6 +215,10 @@ static void ctxt_switch_to(struct vcpu *n)
     WRITE_SYSREG(hcr, HCR_EL2);
     isb();
 
+    /* vlpt area context switching for dirty-page tracing */
+    if ( n->domain->arch.dirty.mode )
+        restore_vlpt(n->domain);
+
     /* This is could trigger an hardware interrupt from the virtual
      * timer. The interrupt needs to be injected into the guest. */
     virt_timer_restore(n);
@@ -512,6 +516,17 @@ int arch_domain_create(struct domain *d, unsigned int
domcr_flags)
     spin_lock_init(&d->arch.map_lock);
     d->arch.map_domain.nr_banks = 0;
 
+    /* init for dirty-page tracing */
+    d->arch.dirty.count = 0;
+    d->arch.dirty.mode = 0;
+    d->arch.dirty.head = NULL;
+    d->arch.dirty.mvn_head = NULL;
+    spin_lock_init(&d->arch.dirty.lock);
+
+    d->arch.dirty.second_lvl_start = 0;
+    d->arch.dirty.second_lvl_end = 0;
+    d->arch.dirty.second_lvl = NULL;
+
     clear_page(d->shared_info);
     share_xen_page_with_guest(
         virt_to_page(d->shared_info), d, XENSHARE_writable);
diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
index 78fb322..7edce12 100644
--- a/xen/arch/arm/domctl.c
+++ b/xen/arch/arm/domctl.c
@@ -92,6 +92,19 @@ long arch_do_domctl(struct xen_domctl *domctl, struct domain
*d,
             xfree(c.data);
     }
     break;
+    case XEN_DOMCTL_shadow_op:
+    {
+        domain_pause(d);
+        ret = dirty_mode_op(d, &domctl->u.shadow_op);
+        domain_unpause(d);
+
+        if ( (&domctl->u.shadow_op)->op == XEN_DOMCTL_SHADOW_OP_CLEAN
||
+             (&domctl->u.shadow_op)->op == XEN_DOMCTL_SHADOW_OP_PEEK
)
+        {
+            copyback = 1;
+        }
+    }
+    break;
 
     default:
         return -EINVAL;
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 98edbc9..5b9d26d 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -838,7 +838,6 @@ void destroy_xen_mappings(unsigned long v, unsigned long e)
     create_xen_entries(REMOVE, v, 0, (e - v) >> PAGE_SHIFT, 0);
 }
 
-enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
 static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
 {
     lpae_t pte;
@@ -1460,6 +1459,28 @@ void cleanup_vlpt(struct domain *d)
     unmap_domain_page_global(d->arch.dirty.second_lvl);
 }
 
+/* routine for dirty-page tracing
+ *
+ * On first write, it page faults, its entry is changed to read-write,
+ * and on retry the write succeeds.
+ *
+ * for locating p2m of the faulting entry, we use virtual-linear page table.
+ */
+void handle_page_fault(struct domain *d, paddr_t addr)
+{
+    lpae_t *vlp2m_pte = get_vlpt_3lvl_pte(addr);
+    if ( vlp2m_pte->p2m.valid && vlp2m_pte->p2m.write == 0 )
+    {
+        lpae_t pte = *vlp2m_pte;
+        pte.p2m.write = 1;
+        write_pte(vlp2m_pte, pte);
+        flush_tlb_local();
+
+        /* in order to remove mappings in cleanup stage */
+        add_mapped_vaddr(d, addr);
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2d09fef..3b2b11a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -6,6 +6,8 @@
 #include <xen/bitops.h>
 #include <asm/flushtlb.h>
 #include <asm/gic.h>
+#include <xen/guest_access.h>
+#include <xen/pfn.h>
 
 void dump_p2m_lookup(struct domain *d, paddr_t addr)
 {
@@ -408,6 +410,304 @@ unsigned long gmfn_to_mfn(struct domain *d, unsigned long
gpfn)
     return p >> PAGE_SHIFT;
 }
 
+#define MAX_VA_PER_NODE (PAGE_SIZE - 2 * sizeof(void *) - sizeof(int))\
+                        / sizeof(unsigned long)
+
+/* An array-based linked list for storing virtual addresses (i.e., the 3rd
+ * level table mapping) that should be destroyed after live migration */
+struct mapped_va_node
+{
+    struct page_info *next;
+    struct mapped_va_node *mvn_next;
+    int items;
+    unsigned long vaddrs[MAX_VA_PER_NODE];
+};
+
+int add_mapped_vaddr(struct domain *d, unsigned long va)
+{
+    struct page_info *page_head;
+    struct mapped_va_node *mvn_head;
+
+    spin_lock(&d->arch.dirty.lock);
+
+    page_head = d->arch.dirty.head;
+    mvn_head = d->arch.dirty.mvn_head;
+    if ( !mvn_head )
+    {
+        page_head = alloc_domheap_page(NULL, 0);
+        if ( !page_head )
+        {
+            spin_unlock(&d->arch.dirty.lock);
+            return -ENOMEM;
+        }
+
+        mvn_head = map_domain_page_global(__page_to_mfn(page_head));
+        mvn_head->items = 0;
+        mvn_head->next = NULL;
+        mvn_head->mvn_next = NULL;
+        d->arch.dirty.head = page_head;
+        d->arch.dirty.mvn_head = mvn_head;
+    }
+
+    if ( mvn_head->items == MAX_VA_PER_NODE )
+    {
+        struct page_info *page;
+
+        page = alloc_domheap_page(NULL, 0);
+        if ( !page )
+        {
+            spin_unlock(&d->arch.dirty.lock);
+            return -ENOMEM;
+        }
+
+        mvn_head = map_domain_page_global(__page_to_mfn(page));
+        mvn_head->items = 0;
+        mvn_head->next = d->arch.dirty.head;
+        mvn_head->mvn_next = d->arch.dirty.mvn_head;
+
+        d->arch.dirty.mvn_head = mvn_head;
+        d->arch.dirty.head = page;
+    }
+
+    mvn_head->vaddrs[mvn_head->items] = va;
+    mvn_head->items ++;
+
+    spin_unlock(&d->arch.dirty.lock);
+    return 0;
+}
+
+/* get the dirty bitmap from the linked list stored by add_mapped_vaddr
+ * and also clear the mapped vaddrs linked list */
+static void get_dirty_bitmap(struct domain *d, uint8_t *bitmap[])
+{
+    struct page_info *page_head;
+    struct mapped_va_node *mvn_head;
+    struct page_info *tmp_page;
+    struct mapped_va_node *tmp_mvn;
+    vaddr_t gma_start = 0;
+    vaddr_t gma_end = 0;
+
+    /* this hypercall is called from domain 0, and we don''t know which
guest''s
+     * vlpt is mapped in xen_second, so, to be sure, we restore vlpt here */
+    restore_vlpt(d);
+
+    spin_lock(&d->arch.dirty.lock);
+
+    page_head = d->arch.dirty.head;
+    mvn_head = d->arch.dirty.mvn_head;
+    get_gma_start_end(d, &gma_start, &gma_end);
+
+    while ( page_head )
+    {
+        int i;
+
+        for ( i = 0; i < mvn_head->items; ++i )
+        {
+            unsigned int bit_offset;
+            int bitmap_index, bitmap_offset;
+            lpae_t *vlp2m_pte;
+
+            bit_offset = third_linear_offset(mvn_head->vaddrs[i] -
gma_start);
+            bitmap_index = bit_offset >> (PAGE_SHIFT + 3);
+            bitmap_offset = bit_offset & ((1ul << (PAGE_SHIFT + 3)) -
1);
+
+            __set_bit(bitmap_offset, bitmap[bitmap_index]);
+
+            vlp2m_pte = get_vlpt_3lvl_pte(mvn_head->vaddrs[i]);
+            vlp2m_pte->p2m.write = 0;
+        }
+
+        tmp_page = page_head;
+        page_head = mvn_head->next;
+
+        tmp_mvn = mvn_head;
+        mvn_head = mvn_head->mvn_next;
+
+        unmap_domain_page_global(tmp_mvn);
+        free_domheap_page(tmp_page);
+    }
+
+    d->arch.dirty.head = NULL;
+    d->arch.dirty.mvn_head = NULL;
+
+    spin_unlock(&d->arch.dirty.lock);
+}
+
+
+/* Change types across all p2m entries in a domain */
+static void p2m_change_entry_type_global(struct domain *d, enum mg nt)
+{
+    struct p2m_domain *p2m = &d->arch.p2m;
+    vaddr_t ram_base;
+    int i1, i2, i3;
+    int first_index, second_index, third_index;
+    lpae_t *first = __map_domain_page(p2m->first_level);
+    lpae_t pte, *second = NULL, *third = NULL;
+
+    get_gma_start_end(d, &ram_base, NULL);
+
+    first_index = first_table_offset((uint64_t)ram_base);
+    second_index = second_table_offset((uint64_t)ram_base);
+    third_index = third_table_offset((uint64_t)ram_base);
+
+    BUG_ON( !first && "Can''t map first level p2m."
);
+
+    spin_lock(&p2m->lock);
+
+    for ( i1 = first_index; i1 < LPAE_ENTRIES*2; ++i1 )
+    {
+        lpae_walk_t first_pte = first[i1].walk;
+        if ( !first_pte.valid || !first_pte.table )
+            goto out;
+
+        second = map_domain_page(first_pte.base);
+        BUG_ON( !second && "Can''t map second level
p2m.");
+        for ( i2 = second_index; i2 < LPAE_ENTRIES; ++i2 )
+        {
+            lpae_walk_t second_pte = second[i2].walk;
+
+            /* a nasty hack for vlpt due to the difference
+             * of page table entry layout between p2m and pt */
+            second[i2].pt.ro = 0;
+
+            if ( !second_pte.valid || !second_pte.table )
+                goto out;
+
+            third = map_domain_page(second_pte.base);
+            BUG_ON( !third && "Can''t map third level
p2m.");
+
+            for ( i3 = third_index; i3 < LPAE_ENTRIES; ++i3 )
+            {
+                lpae_walk_t third_pte = third[i3].walk;
+                int write = 0;
+                if ( !third_pte.valid )
+                    goto out;
+
+                pte = third[i3];
+                if ( pte.p2m.write == 1 && nt == mg_ro )
+                {
+                    pte.p2m.write = 0;
+                    write = 1;
+                }
+                else if ( pte.p2m.write == 0 && nt == mg_rw )
+                {
+                    pte.p2m.write = 1;
+                    write = 1;
+                }
+                if ( write )
+                    write_pte(&third[i3], pte);
+            }
+            unmap_domain_page(third);
+
+            third = NULL;
+            third_index = 0;
+        }
+        unmap_domain_page(second);
+
+        second = NULL;
+        second_index = 0;
+        third_index = 0;
+    }
+
+out:
+    flush_tlb_all_local();
+    if ( third ) unmap_domain_page(third);
+    if ( second ) unmap_domain_page(second);
+    if ( first ) unmap_domain_page(first);
+
+    spin_unlock(&p2m->lock);
+}
+
+/* Read a domain''s log-dirty bitmap and stats.
+ * If the operation is a CLEAN, clear the bitmap and stats. */
+int log_dirty_op(struct domain *d, xen_domctl_shadow_op_t *sc)
+{
+    unsigned long gmfn_start;
+    unsigned long gmfn_end;
+    unsigned long gmfns;
+    unsigned int bitmap_pages;
+    int rc = 0, clean = 0, peek = 1;
+    uint8_t *bitmap[256]; /* bitmap[256] covers 32GB ram */
+    int i;
+
+    BUG_ON( !d->arch.map_domain.nr_banks );
+
+    gmfn_start = d->arch.map_domain.bank[0].start >> PAGE_SHIFT;
+    gmfn_end = domain_get_maximum_gpfn(d);
+    gmfns = gmfn_end - gmfn_start;
+    bitmap_pages = PFN_UP((gmfns + 7) / 8);
+
+    if ( guest_handle_is_null(sc->dirty_bitmap) )
+    {
+        peek = 0;
+    }
+    else
+    {
+        /* mapping to the bitmap from guest param */
+        vaddr_t to = (vaddr_t)sc->dirty_bitmap.p;
+
+        BUG_ON( to & ~PAGE_MASK && "offset not aligned to PAGE
SIZE");
+
+        for ( i = 0; i < bitmap_pages; ++i )
+        {
+            paddr_t g;
+            rc = gvirt_to_maddr(to, &g);
+            if ( rc )
+                return rc;
+            bitmap[i] = map_domain_page(g>>PAGE_SHIFT);
+            memset(bitmap[i], 0x00, PAGE_SIZE);
+            to += PAGE_SIZE;
+        }
+    }
+
+    clean = (sc->op == XEN_DOMCTL_SHADOW_OP_CLEAN);
+    sc->stats.dirty_count = d->arch.dirty.count;
+
+    get_dirty_bitmap(d, bitmap);
+    if ( peek )
+    {
+        for ( i = 0; i < bitmap_pages; ++i )
+        {
+            unmap_domain_page(bitmap[i]);
+        }
+    }
+
+    return 0;
+}
+
+long dirty_mode_op(struct domain *d, xen_domctl_shadow_op_t *sc)
+{
+    long ret = 0;
+    switch (sc->op)
+    {
+        case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
+        case XEN_DOMCTL_SHADOW_OP_OFF:
+        {
+            enum mg nt = sc->op == XEN_DOMCTL_SHADOW_OP_OFF ? mg_rw : mg_ro;
+
+            d->arch.dirty.mode = sc->op == XEN_DOMCTL_SHADOW_OP_OFF ? 0 :
1;
+            p2m_change_entry_type_global(d, nt);
+
+            if ( sc->op == XEN_DOMCTL_SHADOW_OP_OFF )
+                cleanup_vlpt(d);
+            else
+                prepare_vlpt(d);
+        }
+        break;
+
+        case XEN_DOMCTL_SHADOW_OP_CLEAN:
+        case XEN_DOMCTL_SHADOW_OP_PEEK:
+        {
+            ret = log_dirty_op(d, sc);
+        }
+        break;
+
+        default:
+            return -ENOSYS;
+    }
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 4c0fc32..3b78ed2 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1313,6 +1313,8 @@ static void do_trap_data_abort_guest(struct cpu_user_regs
*regs,
     const char *msg;
     int rc, level = -1;
     mmio_info_t info;
+    int page_fault = ((dabt.dfsc & FSC_MASK) =+                     
(FSC_FLT_PERM + FSC_3D_LEVEL) && dabt.write);
 
     if ( !check_conditional_instr(regs, hsr) )
     {
@@ -1327,22 +1329,23 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
     info.gva = READ_SYSREG64(FAR_EL2);
 #endif
 
-    if (dabt.s1ptw)
+    if ( dabt.s1ptw && !page_fault )
         goto bad_data_abort;
 
     rc = gva_to_ipa(info.gva, &info.gpa);
-    if ( rc == -EFAULT )
+    if ( rc == -EFAULT && !page_fault )
         goto bad_data_abort;
 
     /* XXX: Decode the instruction if ISS is not valid */
-    if ( !dabt.valid )
+    if ( !dabt.valid && !page_fault )
         goto bad_data_abort;
 
     /*
      * Erratum 766422: Thumb store translation fault to Hypervisor may
      * not have correct HSR Rt value.
      */
-    if ( cpu_has_erratum_766422() && (regs->cpsr & PSR_THUMB)
&& dabt.write )
+    if ( cpu_has_erratum_766422() && (regs->cpsr & PSR_THUMB)
&& dabt.write
+            && !page_fault)
     {
         rc = decode_instruction(regs, &info.dabt);
         if ( rc )
@@ -1358,6 +1361,16 @@ static void do_trap_data_abort_guest(struct cpu_user_regs
*regs,
         return;
     }
 
+    /* handle permission fault on write */
+    if ( page_fault )
+    {
+        if ( current->domain->arch.dirty.mode == 0 )
+           goto bad_data_abort;
+
+        handle_page_fault(current->domain, info.gpa);
+        return;
+    }
+
 bad_data_abort:
 
     msg = decode_fsc( dabt.dfsc, &level);
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 7e7743a..de349e9 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -114,6 +114,11 @@ struct arch_domain
 
     /* dirty-page tracing */
     struct {
+        spinlock_t lock;
+        int mode;
+        unsigned int count;
+        struct page_info *head;     /* maintain the mapped vaddrs */
+        struct mapped_va_node *mvn_head;  /* va of the head */
         int second_lvl_start;            /* for context switch */
         int second_lvl_end;
         lpae_t *second_lvl;
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index b9cbcf2..c0eade0 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -366,6 +366,10 @@ static inline lpae_t * get_vlpt_3lvl_pte(vaddr_t addr)
         (ipa_third << 3) );
 }
 
+/* dirty-page tracing */
+enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
+void handle_page_fault(struct domain *d, paddr_t addr);
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index c660820..0bac332 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -2,6 +2,7 @@
 #define _XEN_P2M_H
 
 #include <xen/mm.h>
+#include <public/domctl.h>
 
 struct domain;
 
@@ -110,6 +111,9 @@ static inline int get_page_and_type(struct page_info *page,
     return rc;
 }
 
+long dirty_mode_op(struct domain *d, xen_domctl_shadow_op_t *sc);
+int add_mapped_vaddr(struct domain *d, unsigned long va);
+
 #endif /* _XEN_P2M_H */
 
 /*
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 5294421..fced6ad 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -399,6 +399,8 @@ union hsr {
 #define FSC_CPR        (0x3a) /* Coprocossor Abort */
 
 #define FSC_LL_MASK    (0x03<<0)
+#define FSC_MASK       (0x3f) /* Fault status mask */
+#define FSC_3D_LEVEL   (0x03) /* Third level fault*/
 
 /* Time counter hypervisor control register */
 #define CNTHCTL_PA      (1u<<0)  /* Kernel/user access to physical
counter */
-- 
1.8.1.2

Jaeyong Yoo

2013-Oct-04 04:44 UTC

head link

[PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

From: Alexey Sokolov <sokolov.a@samsung.com>

Implement for xl restore/save (which are also used for migrate) operation in
xc_arm_migrate.c and make it compilable.
The overall process of save is the following:
1) save guest parameters (i.e., memory map, console and store pfn, etc)
2) save memory (if it is live, perform dirty-page tracing)
3) save hvm states (i.e., gic, timer, etc)
4) save vcpu registers (i.e., pc, sp, lr, etc) The overall process of restore
   is the same to the one in save.

Singed-off-by: Alexey Sokolov <sokolov.a@samsung.com>
---
 config/arm32.mk              |   1 +
 tools/libxc/Makefile         |   5 +
 tools/libxc/xc_arm_migrate.c | 805 +++++++++++++++++++++++++++++++++++++++++++
 tools/misc/Makefile          |   4 +
 4 files changed, 815 insertions(+)
 create mode 100644 tools/libxc/xc_arm_migrate.c

diff --git a/config/arm32.mk b/config/arm32.mk
index aa79d22..01374c9 100644
--- a/config/arm32.mk
+++ b/config/arm32.mk
@@ -1,6 +1,7 @@
 CONFIG_ARM := y
 CONFIG_ARM_32 := y
 CONFIG_ARM_$(XEN_OS) := y
+CONFIG_MIGRATE := y
 
 CONFIG_XEN_INSTALL_SUFFIX : 
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 512a994..05dfef4 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -42,8 +42,13 @@ CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
 GUEST_SRCS-y : GUEST_SRCS-y += xg_private.c xc_suspend.c
 ifeq ($(CONFIG_MIGRATE),y)
+ifeq ($(CONFIG_X86),y)
 GUEST_SRCS-y += xc_domain_restore.c xc_domain_save.c
 GUEST_SRCS-y += xc_offline_page.c xc_compression.c
+endif
+ifeq ($(CONFIG_ARM),y)
+GUEST_SRCS-y += xc_arm_migrate.c
+endif
 else
 GUEST_SRCS-y += xc_nomigrate.c
 endif
diff --git a/tools/libxc/xc_arm_migrate.c b/tools/libxc/xc_arm_migrate.c
new file mode 100644
index 0000000..52f06b5
--- /dev/null
+++ b/tools/libxc/xc_arm_migrate.c
@@ -0,0 +1,805 @@
+/******************************************************************************
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
USA
+ *
+ * Copyright (c) 2013, Samsung Electronics
+ */
+
+#include <inttypes.h>
+#include <errno.h>
+#include <xenctrl.h>
+#include <xenguest.h>
+
+#include <unistd.h>
+#include <xc_private.h>
+#include <xc_dom.h>
+#include "xc_bitops.h"
+#include "xg_private.h"
+
+#define DEF_MAX_ITERS          29 /* limit us to 30 times round loop   */
+#define DEF_MAX_FACTOR         3  /* never send more than 3x p2m_size  */
+#define DEF_MIN_DIRTY_PER_ITER 50 /* dirty page count to define last iter */
+#define DEF_PROGRESS_RATE      50 /* progress bar update rate */
+
+/* Enable this macro for debug only: "static" migration instead of
live */
+/*
+#define DISABLE_LIVE_MIGRATION
+*/
+
+/* Enable this macro for debug only: additional debug info */
+#define ARM_MIGRATE_VERBOSE
+
+/*
+ * Guest params to save: used HVM params, save flags, memory map
+ */
+typedef struct guest_params
+{
+    unsigned long console_pfn;
+    unsigned long store_pfn;
+    uint32_t flags;
+    struct dt_mem_info memmap; /* Memory map */
+    uint32_t max_vcpu_id;
+} guest_params_t;
+
+static int suspend_and_state(int (*suspend)(void*), void *data,
+                             xc_interface *xch, int dom)
+{
+    xc_dominfo_t info;
+    if ( !(*suspend)(data) )
+    {
+        ERROR("Suspend request failed");
+        return -1;
+    }
+
+    if ( (xc_domain_getinfo(xch, dom, 1, &info) != 1) ||
+         !info.shutdown || (info.shutdown_reason != SHUTDOWN_suspend) )
+    {
+        ERROR("Domain is not in suspended state after suspend
attempt");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int write_exact_handled(xc_interface *xch, int fd, const void *data,
+                               size_t size)
+{
+    if ( write_exact(fd, data, size) )
+    {
+        ERROR("Write failed, check space");
+        return -1;
+    }
+    return 0;
+}
+
+/* ============ Memory ============= */
+static int save_memory(xc_interface *xch, int io_fd, uint32_t dom,
+                       struct save_callbacks *callbacks,
+                       uint32_t max_iters, uint32_t max_factor,
+                       guest_params_t *params)
+{
+    int live =  !!(params->flags & XCFLAGS_LIVE);
+    int debug =  !!(params->flags & XCFLAGS_DEBUG);
+    xen_pfn_t i;
+    char reportbuf[80];
+    int iter = 0;
+    int last_iter = !live;
+    int total_dirty_pages_num = 0;
+    int dirty_pages_on_prev_iter_num = 0;
+    int count = 0;
+    char *page = 0;
+    xen_pfn_t *busy_pages = 0;
+    int busy_pages_count = 0;
+    int busy_pages_max = 256;
+
+    DECLARE_HYPERCALL_BUFFER(unsigned long, to_send);
+
+    /* We suppose that guest''s memory base is the first region base */
+    xen_pfn_t start = (params->memmap.bank[0].start  >> PAGE_SHIFT);
+    const xen_pfn_t end = xc_domain_maximum_gpfn(xch, dom);
+    const xen_pfn_t mem_size = end - start;
+    if ( debug )
+    {
+        IPRINTF("(save mem) start=%llx end=%llx!\n", start, end);
+    }
+
+    if ( write_exact_handled(xch, io_fd, &end, sizeof(xen_pfn_t)) )
+        return -1;
+
+    if ( live )
+    {
+        if ( xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
+                    NULL, 0, NULL, 0, NULL) < 0 )
+        {
+            ERROR("Couldn''t enable log-dirty mode !\n");
+            return -1;
+        }
+
+        max_iters  = max_iters  ? : DEF_MAX_ITERS;
+        max_factor = max_factor ? : DEF_MAX_FACTOR;
+
+        if ( debug )
+            IPRINTF("Log-dirty mode enabled, max_iters=%d,
max_factor=%d!\n",
+                    max_iters, max_factor);
+    }
+
+    to_send = xc_hypercall_buffer_alloc_pages(xch, to_send,
+                                              NRPAGES(bitmap_size(mem_size)));
+    if ( !to_send )
+    {
+        ERROR("Couldn''t allocate to_send array!\n");
+        return -1;
+    }
+
+    /* send all pages on first iter */
+    memset(to_send, 0xff, bitmap_size(mem_size));
+
+    for ( ; ; )
+    {
+        int dirty_pages_on_current_iter_num = 0;
+        int frc;
+        iter++;
+
+        snprintf(reportbuf, sizeof(reportbuf),
+                 "Saving memory: iter %d (last sent %u)",
+                 iter, dirty_pages_on_prev_iter_num);
+
+        xc_report_progress_start(xch, reportbuf, mem_size);
+
+        if ( (iter > 1 &&
+              dirty_pages_on_prev_iter_num < DEF_MIN_DIRTY_PER_ITER) ||
+             (iter == max_iters) ||
+             (total_dirty_pages_num >= mem_size*max_factor) )
+        {
+            if ( debug )
+                IPRINTF("Last iteration");
+            last_iter = 1;
+        }
+
+        if ( last_iter )
+        {
+            if ( suspend_and_state(callbacks->suspend, callbacks->data,
+                                   xch, dom) )
+            {
+                ERROR("Domain appears not to have suspended");
+                return -1;
+            }
+        }
+        if ( live && iter > 1 )
+        {
+            frc = xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_CLEAN,
+                                    HYPERCALL_BUFFER(to_send), mem_size,
+                                                     NULL, 0, NULL);
+            if ( frc != mem_size )
+            {
+                ERROR("Error peeking shadow bitmap");
+                xc_hypercall_buffer_free_pages(xch, to_send,
+                                               NRPAGES(bitmap_size(mem_size)));
+                return -1;
+            }
+        }
+
+        busy_pages = malloc(sizeof(xen_pfn_t) * busy_pages_max);
+
+        for ( i = start; i < end; ++i )
+        {
+            if ( test_bit(i - start, to_send) )
+            {
+                page = xc_map_foreign_range(xch, dom, PAGE_SIZE, PROT_READ, i);
+                if ( !page )
+                {
+                    /* This page is mapped elsewhere, should be resent later */
+                    busy_pages[busy_pages_count] = i;
+                    busy_pages_count++;
+                    if ( busy_pages_count >= busy_pages_max )
+                    {
+                        busy_pages_max += 256;
+                        busy_pages = realloc(busy_pages, sizeof(xen_pfn_t) *
+                                                         busy_pages_max);
+                    }
+                    continue;
+                }
+
+                if ( write_exact_handled(xch, io_fd, &i, sizeof(i)) ||
+                     write_exact_handled(xch, io_fd, page, PAGE_SIZE) )
+                {
+                    munmap(page, PAGE_SIZE);
+                    free(busy_pages);
+                    return -1;
+                }
+                count++;
+                munmap(page, PAGE_SIZE);
+
+                if ( (i % DEF_PROGRESS_RATE) == 0 )
+                    xc_report_progress_step(xch, i - start, mem_size);
+                dirty_pages_on_current_iter_num++;
+            }
+        }
+
+        while ( busy_pages_count )
+        {
+            /* Send busy pages */
+            busy_pages_count--;
+            i = busy_pages[busy_pages_count];
+            if ( test_bit(i - start, to_send) )
+            {
+                page = xc_map_foreign_range(xch, dom, PAGE_SIZE,PROT_READ, i);
+                if ( !page )
+                {
+                    IPRINTF("WARNING: 2nd attempt to save page "
+                            "busy failed pfn=%llx", i);
+                    continue;
+                }
+
+                if ( debug )
+                {
+                    IPRINTF("save mem: resend busy page %llx\n", i);
+                }
+
+                if ( write_exact_handled(xch, io_fd, &i, sizeof(i)) ||
+                     write_exact_handled(xch, io_fd, page, PAGE_SIZE) )
+                {
+                    munmap(page, PAGE_SIZE);
+                    free(busy_pages);
+                    return -1;
+                }
+                count++;
+                munmap(page, PAGE_SIZE);
+                dirty_pages_on_current_iter_num++;
+            }
+        }
+        free(busy_pages);
+
+        if ( debug )
+            IPRINTF("Dirty pages=%d",
dirty_pages_on_current_iter_num);
+
+        xc_report_progress_step(xch, mem_size, mem_size);
+
+        dirty_pages_on_prev_iter_num = dirty_pages_on_current_iter_num;
+        total_dirty_pages_num += dirty_pages_on_current_iter_num;
+
+        if ( last_iter )
+        {
+            xc_hypercall_buffer_free_pages(xch, to_send,
+                                           NRPAGES(bitmap_size(mem_size)));
+            if ( live )
+            {
+                if ( xc_shadow_control(xch, dom, XEN_DOMCTL_SHADOW_OP_OFF,
+                                       NULL, 0, NULL, 0, NULL) < 0 )
+                    ERROR("Couldn''t disable log-dirty
mode");
+            }
+            break;
+        }
+    }
+    if ( debug )
+    {
+        IPRINTF("save mem: pages count = %d\n", count);
+    }
+
+    i = (xen_pfn_t) -1; /* end page marker */
+    return write_exact_handled(xch, io_fd, &i, sizeof(i));
+}
+
+static int restore_memory(xc_interface *xch, int io_fd, uint32_t dom,
+                          guest_params_t *params)
+{
+    xen_pfn_t end;
+    xen_pfn_t gpfn;
+    int debug =  !!(params->flags & XCFLAGS_DEBUG);
+    int count = 0;
+    char *page;
+
+    /* We suppose that guest''s memory base is the first region base */
+    xen_pfn_t start = (params->memmap.bank[0].start >> PAGE_SHIFT);
+
+    if ( read_exact(io_fd, &end, sizeof(xen_pfn_t)) )
+    {
+        PERROR("First read of incoming memory failed");
+        return -1;
+    }
+
+    /* TODO allocate several pages per call */
+    for ( gpfn = start; gpfn < end; ++gpfn )
+    {
+        if ( xc_domain_populate_physmap_exact(xch, dom, 1, 0, 0, &gpfn) )
+        {
+            PERROR("Memory allocation for a new domain failed");
+            return -1;
+        }
+    }
+    while ( 1 )
+    {
+
+        if ( read_exact(io_fd, &gpfn, sizeof(gpfn)) )
+        {
+            PERROR("GPFN read failed during memory transfer,
count=%d", count);
+            return -1;
+        }
+        if ( gpfn == (xen_pfn_t) -1 ) break; /* end page marker */
+
+        if ( gpfn < start || gpfn >= end )
+        {
+            ERROR("GPFN %llx doesn''t belong to RAM address space,
count=%d",
+                    gpfn, count);
+            return -1;
+        }
+        page = xc_map_foreign_range(xch, dom, PAGE_SIZE,
+                                    PROT_READ | PROT_WRITE, gpfn);
+        if ( !page )
+        {
+            PERROR("xc_map_foreign_range failed, pfn=%llx", gpfn);
+            return -1;
+        }
+        if ( read_exact(io_fd, page, PAGE_SIZE) )
+        {
+            PERROR("Page data read failed during memory transfer,
pfn=%llx",
+                    gpfn);
+            return -1;
+        }
+        munmap(page, PAGE_SIZE);
+        count++;
+    }
+
+    if ( debug )
+    {
+        IPRINTF("Memory restored, pages count=%d", count);
+    }
+    return 0;
+}
+
+/* ============ HVM context =========== */
+static int save_armhvm(xc_interface *xch, int io_fd, uint32_t dom, int debug)
+{
+    /* HVM: a buffer for holding HVM context */
+    uint32_t hvm_buf_size = 0;
+    uint8_t *hvm_buf = NULL;
+    uint32_t rec_size;
+    int retval = -1;
+
+    /* Need another buffer for HVM context */
+    hvm_buf_size = xc_domain_hvm_getcontext(xch, dom, 0, 0);
+    if ( hvm_buf_size == -1 )
+    {
+        ERROR("Couldn''t get HVM context size from Xen");
+        goto out;
+    }
+    hvm_buf = malloc(hvm_buf_size);
+
+    if ( !hvm_buf )
+    {
+        ERROR("Couldn''t allocate memory for hvm buffer");
+        goto out;
+    }
+
+    /* Get HVM context from Xen and save it too */
+    if ( (rec_size = xc_domain_hvm_getcontext(xch, dom, hvm_buf,
+                    hvm_buf_size)) == -1 )
+    {
+        ERROR("HVM:Could not get hvm buffer");
+        goto out;
+    }
+
+    if ( debug )
+        IPRINTF("HVM save size %d %d", hvm_buf_size, rec_size);
+
+    if ( write_exact_handled(xch, io_fd, &rec_size, sizeof(uint32_t)) )
+        goto out;
+
+    if ( write_exact_handled(xch, io_fd, hvm_buf, rec_size) )
+    {
+        goto out;
+    }
+    retval = 0;
+
+out:
+    if ( hvm_buf )
+        free (hvm_buf);
+    return retval;
+}
+
+static int restore_armhvm(xc_interface *xch, int io_fd,
+                          uint32_t dom, int debug)
+{
+    uint32_t rec_size;
+    uint32_t hvm_buf_size = 0;
+    uint8_t *hvm_buf = NULL;
+    int frc = 0;
+    int retval = -1;
+
+    if ( read_exact(io_fd, &rec_size, sizeof(uint32_t)) )
+    {
+        PERROR("Could not read HVM size");
+        goto out;
+    }
+
+    if ( !rec_size )
+    {
+        ERROR("Zero HVM size");
+        goto out;
+    }
+
+    hvm_buf_size = xc_domain_hvm_getcontext(xch, dom, 0, 0);
+    if ( hvm_buf_size != rec_size )
+    {
+        ERROR("HVM size for this domain is not the same as stored");
+    }
+
+    hvm_buf = malloc(hvm_buf_size);
+    if ( !hvm_buf )
+    {
+        ERROR("Couldn''t allocate memory");
+        goto out;
+    }
+
+    if ( read_exact(io_fd, hvm_buf, hvm_buf_size) )
+    {
+        PERROR("Could not read HVM context");
+        goto out;
+    }
+
+    frc = xc_domain_hvm_setcontext(xch, dom, hvm_buf, hvm_buf_size);
+    if ( frc )
+    {
+        ERROR("error setting the HVM context");
+        goto out;
+    }
+    retval = 0;
+
+    if ( debug )
+    {
+            IPRINTF("HVM restore size %d %d", hvm_buf_size,
rec_size);
+    }
+out:
+    if ( hvm_buf )
+        free (hvm_buf);
+    return retval;
+}
+
+/* ================= Console & Xenstore & Memory map =========== */
+static int save_guest_params(xc_interface *xch, int io_fd,
+                             uint32_t dom, uint32_t flags,
+                             guest_params_t *params)
+{
+    size_t sz = sizeof(guest_params_t);
+    xc_dominfo_t dom_info;
+
+    if ( xc_domain_get_memory_map(xch, dom, &params->memmap) )
+    {
+        ERROR("Can''t get memory map");
+        return -1;
+    }
+
+    if ( flags & XCFLAGS_DEBUG )
+    {
+        IPRINTF("Guest param save size: %d ", sz);
+        IPRINTF("Guest memory map save %d entries",
params->memmap.nr_banks);
+    }
+
+    if ( xc_get_hvm_param(xch, dom, HVM_PARAM_CONSOLE_PFN,
+            &params->console_pfn) )
+    {
+        ERROR("Can''t get console gpfn");
+        return -1;
+    }
+
+    if ( xc_get_hvm_param(xch, dom, HVM_PARAM_STORE_PFN,
&params->store_pfn) )
+    {
+        ERROR("Can''t get store gpfn");
+        return -1;
+    }
+
+    if ( xc_domain_getinfo(xch, dom, 1, &dom_info ) < 0)
+    {
+        ERROR("Can''t get domain info for dom %d", dom);
+        return -1;
+    }
+    params->max_vcpu_id = dom_info.max_vcpu_id;
+
+    params->flags = flags;
+
+    if ( write_exact_handled(xch, io_fd, params, sz) )
+    {
+        return -1;
+    }
+
+    return 0;
+}
+
+static int restore_guest_params(xc_interface *xch, int io_fd,
+                                uint32_t dom, guest_params_t *params)
+{
+    size_t sz = sizeof(guest_params_t);
+
+    if ( read_exact(io_fd, params, sizeof(guest_params_t)) )
+    {
+        PERROR("Can''t read guest params");
+        return -1;
+    }
+
+    if ( params->flags & XCFLAGS_DEBUG )
+    {
+        IPRINTF("Guest param restore size: %d ", sz);
+        IPRINTF("Guest memory map restore %d entries",
+                params->memmap.nr_banks);
+    }
+
+    if ( xc_domain_set_memory_map(xch, dom, &params->memmap) )
+    {
+        ERROR("Can''t set memory map");
+        return -1;
+    }
+    /* Set max. number of vcpus as max_vcpu_id + 1 */
+    if ( xc_domain_max_vcpus(xch, dom, params->max_vcpu_id + 1) )
+    {
+        ERROR("Can''t set max vcpu number for domain");
+        return -1;
+    }
+
+    return 0;
+}
+
+static int set_guest_params(xc_interface *xch, int io_fd, uint32_t dom,
+                            guest_params_t *params, unsigned int
console_evtchn,
+                            domid_t console_domid, unsigned int store_evtchn,
+                            domid_t store_domid)
+{
+    int rc = 0;
+
+    if ( (rc = xc_clear_domain_page(xch, dom, params->console_pfn)) )
+    {
+        ERROR("Can''t clear console page");
+        return rc;
+    }
+
+    if ( (rc = xc_clear_domain_page(xch, dom, params->store_pfn)) )
+    {
+        ERROR("Can''t clear xenstore page");
+        return rc;
+    }
+
+    if ( (rc = xc_dom_gnttab_hvm_seed(xch, dom, params->console_pfn,
+                                      params->store_pfn, console_domid,
+                                      store_domid)) )
+    {
+        ERROR("Can''t grant console and xenstore pages");
+        return rc;
+    }
+
+    if ( (rc = xc_set_hvm_param(xch, dom, HVM_PARAM_CONSOLE_PFN,
+                                params->console_pfn)) )
+    {
+        ERROR("Can''t set console gpfn");
+        return rc;
+    }
+
+    if ( (rc = xc_set_hvm_param(xch, dom, HVM_PARAM_STORE_PFN,
+                                params->store_pfn)) )
+    {
+        ERROR("Can''t set xenstore gpfn");
+        return rc;
+    }
+
+    if ( (rc = xc_set_hvm_param(xch, dom, HVM_PARAM_CONSOLE_EVTCHN,
+                                console_evtchn)) )
+    {
+        ERROR("Can''t set console event channel");
+        return rc;
+    }
+
+    if ( (rc = xc_set_hvm_param(xch, dom, HVM_PARAM_STORE_EVTCHN,
+                                store_evtchn)) )
+    {
+        ERROR("Can''t set xenstore event channel");
+        return rc;
+    }
+    return 0;
+}
+
+/* ====================== VCPU ============== */
+static int save_vcpus(xc_interface *xch, int io_fd, uint32_t dom,
+        guest_params_t *params)
+{
+    vcpu_guest_context_any_t ctxt;
+    int rc, nr_online = 0;
+    uint32_t i;
+    xc_vcpuinfo_t info;
+    int debug = !!(params->flags & XCFLAGS_DEBUG);
+
+    for ( i = 0; i<= params->max_vcpu_id; i++ )
+    {
+        rc = xc_vcpu_getinfo(xch, dom, i, &info);
+        /* Store contexts for online vcpus only;
+         * vcpu id is also stored before */
+        if ( rc || info.online == 0 ) continue;
+        if ( xc_vcpu_getcontext(xch, dom, i, &ctxt) ) continue;
+        rc = write_exact_handled(xch, io_fd, &i, sizeof(i));
+        if ( rc ) return rc;
+        rc = write_exact_handled(xch, io_fd, &ctxt, sizeof(ctxt));
+        if ( rc ) return rc;
+        nr_online++;
+    }
+    /* end marker */
+    i = params->max_vcpu_id + 1;
+    rc = write_exact_handled(xch, io_fd, &i, sizeof(i));
+    if ( rc ) return rc;
+    if ( debug )
+        IPRINTF("save: max_vcpu=%d, nr_online=%d\n", i, nr_online);
+    return 0;
+}
+
+static int restore_vcpus(xc_interface *xch, int io_fd, uint32_t dom,
+        guest_params_t *params)
+{
+    int rc = 0;
+    uint32_t i;
+    int nr_online = 0;
+    int debug = !!(params->flags & XCFLAGS_DEBUG);
+
+    DECLARE_DOMCTL;
+    DECLARE_HYPERCALL_BUFFER(vcpu_guest_context_any_t, ctxt);
+
+    ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
+
+    while ( 1 )
+    {
+        memset(ctxt, 0, sizeof(*ctxt));
+
+        if ( read_exact(io_fd, &i, sizeof(i)) )
+        {
+            PERROR("VCPU context read failed");
+            rc = -1;
+            break;
+        }
+        if ( i > params->max_vcpu_id ) break;
+        if ( read_exact(io_fd, ctxt, sizeof(*ctxt)) )
+        {
+            PERROR("VCPU context read failed");
+            rc = -1;
+            break;
+        }
+        nr_online++;
+        memset(&domctl, 0, sizeof(domctl));
+        domctl.cmd = XEN_DOMCTL_setvcpucontext;
+        domctl.domain = dom;
+        domctl.u.vcpucontext.vcpu = i;
+        set_xen_guest_handle(domctl.u.vcpucontext.ctxt, ctxt);
+        rc = do_domctl(xch, &domctl);
+        if ( rc )
+        {
+            ERROR("VCPU context set failed (error %d)", rc);
+            break;
+        }
+    }
+    if ( debug )
+        IPRINTF("restore: max_vcpu=%d, nr_online=%d\n",
+                params->max_vcpu_id + 1, nr_online);
+
+    xc_hypercall_buffer_free(xch, ctxt);
+    return rc;
+}
+
+/* ================== Main ============== */
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
+                   uint32_t max_iters, uint32_t max_factor, uint32_t flags,
+                   struct save_callbacks *callbacks, int hvm,
+                   unsigned long vm_generationid_addr)
+{
+    int debug;
+    guest_params_t params;
+
+#ifdef ARM_MIGRATE_VERBOSE
+    flags |= XCFLAGS_DEBUG;
+#endif
+
+#ifdef DISABLE_LIVE_MIGRATION
+    flags &= ~(XCFLAGS_LIVE);
+#endif
+
+    debug = !!(flags & XCFLAGS_DEBUG);
+    if ( save_guest_params(xch, io_fd, dom, flags, &params) )
+    {
+       ERROR("Can''t save guest params");
+       return -1;
+    }
+
+    if ( save_memory(xch, io_fd, dom, callbacks, max_iters,
+            max_factor, &params) )
+    {
+        ERROR("Memory not saved");
+        return -1;
+    }
+
+    if ( save_armhvm(xch, io_fd, dom, debug) )
+    {
+        ERROR("HVM not saved");
+        return -1;
+    }
+
+    if ( save_vcpus(xch, io_fd, dom, &params) )
+    {
+        ERROR("VCPU(s) not saved");
+        return -1;
+    }
+    if ( debug )
+    {
+        IPRINTF("Domain %d saved", dom);
+    }
+    return 0;
+}
+
+int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
+                      unsigned int store_evtchn, unsigned long *store_gpfn,
+                      domid_t store_domid, unsigned int console_evtchn,
+                      unsigned long *console_gpfn, domid_t console_domid,
+                      unsigned int hvm, unsigned int pae, int superpages,
+                      int no_incr_generationid,
+                      unsigned long *vm_generationid_addr,
+                      struct restore_callbacks *callbacks)
+{
+    guest_params_t params;
+    int debug = 1;
+
+    if ( restore_guest_params(xch, io_fd, dom, &params) )
+    {
+        ERROR("Can''t restore guest params");
+        return -1;
+    }
+    debug = !!(params.flags & XCFLAGS_DEBUG);
+
+    if ( restore_memory(xch, io_fd, dom, &params) )
+    {
+        ERROR("Can''t restore memory");
+        return -1;
+    }
+    if ( set_guest_params(xch, io_fd, dom, &params,
+                console_evtchn, console_domid,
+                store_evtchn, store_domid) )
+    {
+        ERROR("Can''t setup guest params");
+        return -1;
+    }
+
+    /* Setup console and store PFNs to caller */
+    *console_gpfn = params.console_pfn;
+    *store_gpfn = params.store_pfn;
+
+    if ( restore_armhvm(xch, io_fd, dom, debug) )
+    {
+        ERROR("HVM not restored");
+        return -1;
+    }
+
+    if ( restore_vcpus(xch, io_fd, dom, &params) )
+    {
+        ERROR("Can''t restore VCPU(s)");
+        return -1;
+    }
+
+    if ( debug )
+    {
+         IPRINTF("Domain %d restored", dom);
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 17aeda5..29a1f66 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -11,7 +11,9 @@ HDRS     = $(wildcard *.h)
 
 TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
xenlockprof xenwatchdogd xencov
 TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash xen-lowmemd
xen-mfndump
+ifeq ($(CONFIG_X86),y)
 TARGETS-$(CONFIG_MIGRATE) += xen-hptool
+endif
 TARGETS := $(TARGETS-y)
 
 SUBDIRS := $(SUBDIRS-y)
@@ -23,7 +25,9 @@ INSTALL_BIN := $(INSTALL_BIN-y)
 INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm xen-tmem-list-parse
gtraceview \
 	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvmcrash xen-lowmemd xen-mfndump
+ifeq ($(CONFIG_X86),y)
 INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
+endif
 INSTALL_SBIN := $(INSTALL_SBIN-y)
 
 INSTALL_PRIVBIN-y := xenpvnetboot
-- 
1.8.1.2

Julien Grall

2013-Oct-07 12:49 UTC

head link

Re: [PATCH v4 1/9] xen/arm: Implement hvm save and restore

On 10/04/2013 05:43 AM, Jaeyong Yoo wrote:> Implement save/restore of hvm context hypercall. In hvm
> context save/restore, we save gic, timer and vfp registers.
> 
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> ---
>  xen/arch/arm/Makefile                  |   1 +
>  xen/arch/arm/domctl.c                  |  88 ++++++++++++-
>  xen/arch/arm/hvm.c                     | 228
+++++++++++++++++++++++++++++++++
>  xen/arch/arm/save.c                    |  66 ++++++++++
>  xen/common/Makefile                    |   2 +
>  xen/include/asm-arm/hvm/support.h      |  29 +++++
>  xen/include/public/arch-arm/hvm/save.h |  66 ++++++++++
>  7 files changed, 479 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/arm/save.c
>  create mode 100644 xen/include/asm-arm/hvm/support.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 003ac84..8910a6c 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -31,6 +31,7 @@ obj-y += vuart.o
>  obj-y += hvm.o
>  obj-y += device.o
>  obj-y += decode.o
> +obj-y += save.o
>  
>  #obj-bin-y += ....o
>  
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index 851ee40..86c07de 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -9,12 +9,98 @@
>  #include <xen/lib.h>
>  #include <xen/errno.h>
>  #include <xen/sched.h>
> +#include <xen/hvm/save.h>
> +#include <xen/guest_access.h>
>  #include <public/domctl.h>
>  
>  long arch_do_domctl(struct xen_domctl *domctl, struct domain *d,
>                      XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  {
> -    return -ENOSYS;
> +    long ret = 0;
> +    bool_t copyback = 0;
> +
> +    switch ( domctl->cmd )
> +    {
> +    case XEN_DOMCTL_sethvmcontext:
> +    {
> +        struct hvm_domain_context c = { .size =
domctl->u.hvmcontext.size };
> +
> +        ret = -ENOMEM;
> +        if ( (c.data = xmalloc_bytes(c.size)) == NULL )
> +            goto sethvmcontext_out;
> +
> +        ret = -EFAULT;
> +        if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer,
c.size) != 0 )
> +            goto sethvmcontext_out;
> +
> +        domain_pause(d);
> +        ret = hvm_load(d, &c);
> +        domain_unpause(d);
> +
> +    sethvmcontext_out:
> +        if ( c.data != NULL )
> +            xfree(c.data);
> +    }
> +    break;
> +    case XEN_DOMCTL_gethvmcontext:
> +    {
> +        struct hvm_domain_context c = { 0 };
> +
> +        ret = -EINVAL;
> +
> +        c.size = hvm_save_size(d);
> +
> +        if ( guest_handle_is_null(domctl->u.hvmcontext.buffer) )
> +        {
> +            /* Client is querying for the correct buffer size */
> +            domctl->u.hvmcontext.size = c.size;
> +            ret = 0;
> +            goto gethvmcontext_out;
> +        }
> +        /* Check that the client has a big enough buffer */
> +        ret = -ENOSPC;
> +        if ( domctl->u.hvmcontext.size < c.size )
> +        {
> +            printk("(gethvmcontext) size error: %d and %d\n",
> +                   domctl->u.hvmcontext.size, c.size );
> +            goto gethvmcontext_out;
> +        }
> +
> +        /* Allocate our own marshalling buffer */
> +        ret = -ENOMEM;
> +        if ( (c.data = xmalloc_bytes(c.size)) == NULL )
> +        {
> +            printk("(gethvmcontext) xmalloc_bytes failed: %d\n",
c.size );
> +            goto gethvmcontext_out;
> +        }
> +
> +        domain_pause(d);
> +        ret = hvm_save(d, &c);
> +        domain_unpause(d);
> +
> +        domctl->u.hvmcontext.size = c.cur;
> +        if ( copy_to_guest(domctl->u.hvmcontext.buffer, c.data, c.size)
!= 0 )
> +        {
> +            printk("(gethvmcontext) copy to guest failed\n");
> +            ret = -EFAULT;
> +        }
> +
> +    gethvmcontext_out:
> +        copyback = 1;
> +
> +        if ( c.data != NULL )
> +            xfree(c.data);
> +    }
> +    break;
> +
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    if ( copyback && __copy_to_guest(u_domctl, domctl, 1) )
> +        ret = -EFAULT;
> +
> +    return ret;
>  }
>  
>  void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
> diff --git a/xen/arch/arm/hvm.c b/xen/arch/arm/hvm.c
> index 471c4cd..befd75d 100644
> --- a/xen/arch/arm/hvm.c
> +++ b/xen/arch/arm/hvm.c
> @@ -7,11 +7,13 @@
>  
>  #include <xsm/xsm.h>
>  
> +#include <xen/hvm/save.h>
>  #include <public/xen.h>
>  #include <public/hvm/params.h>
>  #include <public/hvm/hvm_op.h>
>  
>  #include <asm/hypercall.h>
> +#include <asm/gic.h>
>  
>  long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
> @@ -65,3 +67,229 @@ long do_hvm_op(unsigned long op,
XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      return rc;
>  }
> +
> +static void vgic_irq_rank_save(struct vgic_rank *ext,
> +                               struct vgic_irq_rank *rank)
> +{
> +    spin_lock(&rank->lock);
> +    /* Some of VGIC registers are not used yet, it is for a future usage
*/
> +    /* IENABLE, IACTIVE, IPEND,  PENDSGI registers */
> +    ext->ienable = rank->ienable;
> +    ext->iactive = rank->iactive;
> +    ext->ipend = rank->ipend;
> +    ext->pendsgi = rank->pendsgi;
> +    /* ICFG */
> +    ext->icfg[0] = rank->icfg[0];
> +    ext->icfg[1] = rank->icfg[1];Can you use memcpy?
> +    /* IPRIORITY */
> +    memcpy(ext->ipriority, rank->ipriority,
sizeof(rank->ipriority));
> +    /* ITARGETS */
> +    memcpy(ext->itargets, rank->itargets,
sizeof(rank->itargets));
> +    spin_unlock(&rank->lock);
> +}
> +
> +static void vgic_irq_rank_restore(struct vgic_irq_rank *rank,
> +                                  struct vgic_rank *ext)
> +{
> +    spin_lock(&rank->lock);
> +    /* IENABLE, IACTIVE, IPEND,  PENDSGI registers */
> +    rank->ienable = ext->ienable;
> +    rank->iactive = ext->iactive;
> +    rank->ipend = ext->ipend;
> +    rank->pendsgi = ext->pendsgi;
> +    /* ICFG */
> +    rank->icfg[0] = ext->icfg[0];
> +    rank->icfg[1] = ext->icfg[1];
Same here.
> +    /* IPRIORITY */
> +    memcpy(rank->ipriority, ext->ipriority,
sizeof(rank->ipriority));
> +    /* ITARGETS */
> +    memcpy(rank->itargets, ext->itargets,
sizeof(rank->itargets));
> +    spin_unlock(&rank->lock);
> +}
> +
> +
> +static int gic_save(struct domain *d, hvm_domain_context_t *h)
> +{
> +    struct hvm_hw_gic ctxt;
> +    struct vcpu *v;
> +
> +    /* Save the state of GICs */
> +    for_each_vcpu( d, v )
> +    {
> +        ctxt.gic_hcr = v->arch.gic_hcr;
> +        ctxt.gic_vmcr = v->arch.gic_vmcr;
> +        ctxt.gic_apr = v->arch.gic_apr;
> +
> +        /* Save list registers and masks */
> +        /* (it is not necessary to save/restore them, but LR state can
have
> +         * influence on downtime after Live Migration (to be tested)
> +         */
> +        memcpy(ctxt.gic_lr, v->arch.gic_lr, sizeof(ctxt.gic_lr));
> +        ctxt.lr_mask = v->arch.lr_mask;
> +        ctxt.event_mask = v->arch.event_mask;
> +
> +        /* Save PPI states (per-CPU) */
> +        /* It is necessary if SMP enabled */
> +        vgic_irq_rank_save(&ctxt.ppi_state, 
&v->arch.vgic.private_irqs);
> +
> +        if ( hvm_save_entry(GIC, v->vcpu_id, h, &ctxt) != 0 )
> +            return 1;
> +    }
> +    return 0;
> +}
> +
> +static int gic_load(struct domain *d, hvm_domain_context_t *h)
> +{
> +    int vcpuid;
> +    struct hvm_hw_gic ctxt;
> +    struct vcpu *v;
> +
> +    /* Which vcpu is this? */
> +    vcpuid = hvm_load_instance(h);
> +    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL
)
> +    {
> +        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no
vcpu%u\n",
> +                d->domain_id, vcpuid);
> +        return -EINVAL;
> +    }
> +
> +    if ( hvm_load_entry(GIC, h, &ctxt) != 0 )
> +        return -EINVAL;
> +
> +    v->arch.gic_hcr = ctxt.gic_hcr;
> +    v->arch.gic_vmcr = ctxt.gic_vmcr;
> +    v->arch.gic_apr = ctxt.gic_apr;
> +
> +    /* Restore list registers and masks */
> +    memcpy(v->arch.gic_lr, ctxt.gic_lr, sizeof(ctxt.gic_lr));
> +    v->arch.lr_mask = ctxt.lr_mask;
> +    v->arch.event_mask = ctxt.event_mask;
> +
> +    /* Restore PPI states */
> +    vgic_irq_rank_restore(&v->arch.vgic.private_irqs,
&ctxt.ppi_state);
> +
> +    return 0;
> +}
> +
> +HVM_REGISTER_SAVE_RESTORE(GIC, gic_save, gic_load, 1, HVMSR_PER_VCPU);
> +
> +static int timer_save(struct domain *d, hvm_domain_context_t *h)
> +{
> +    struct hvm_hw_timer ctxt;
> +    struct vcpu *v;
> +    struct vtimer *t;
> +    int i;
> +
> +    /* Save the state of vtimer and ptimer */
> +    for_each_vcpu( d, v )
> +    {
> +        t = &v->arch.virt_timer;
> +        for ( i = 0; i < 2; i++ )
> +        {
> +            ctxt.cval = t->cval;
> +            ctxt.ctl = t->ctl;
> +            ctxt.vtb_offset = i ? d->arch.phys_timer_base.offset :
> +                d->arch.virt_timer_base.offset;
> +            ctxt.type = i;
> +            if ( hvm_save_entry(A15_TIMER, v->vcpu_id, h, &ctxt) !=
0 )
> +                return 1;
> +            t = &v->arch.phys_timer;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +static int timer_load(struct domain *d, hvm_domain_context_t *h)
> +{
> +    int vcpuid;
> +    struct hvm_hw_timer ctxt;
> +    struct vcpu *v;
> +    struct vtimer *t = NULL;
> +
> +    /* Which vcpu is this? */
> +    vcpuid = hvm_load_instance(h);
> +
> +    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL
)
> +    {
> +        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no
vcpu%u\n",
> +                d->domain_id, vcpuid);
> +        return -EINVAL;
> +    }
> +
> +    if ( hvm_load_entry(A15_TIMER, h, &ctxt) != 0 )
> +        return -EINVAL;
> +
> +
> +    if ( ctxt.type == TIMER_TYPE_VIRT )
> +    {
> +        t = &v->arch.virt_timer;
> +        d->arch.virt_timer_base.offset = ctxt.vtb_offset;
> +
> +    }
> +    else
> +    {
> +        t = &v->arch.phys_timer;
> +        d->arch.phys_timer_base.offset = ctxt.vtb_offset;
> +    }
> +
> +    t->cval = ctxt.cval;
> +    t->ctl = ctxt.ctl;
> +    t->v = v;
> +
> +    return 0;
> +}
> +
> +HVM_REGISTER_SAVE_RESTORE(A15_TIMER, timer_save, timer_load, 2,
HVMSR_PER_VCPU);
> +
> +static int vfp_save(struct domain *d, hvm_domain_context_t *h)
> +{
> +    struct hvm_hw_vfp ctxt;
> +    struct vcpu *v;
> +
> +    /* Save the state of VFP */
> +    for_each_vcpu( d, v )
> +    {
> +        if ( sizeof(v->arch.vfp) != sizeof (ctxt) )
> +        {
> +            dprintk(XENLOG_G_ERR, "HVM: check VFP structure
match\n");
> +            return -EINVAL;
> +        }
> +        memcpy(&ctxt, &v->arch.vfp, sizeof(ctxt));
> +
> +        if ( hvm_save_entry(VFP, v->vcpu_id, h, &ctxt) != 0 )
> +            return 1;
> +    }
> +    return 0;
> +}
> +
> +static int vfp_load(struct domain *d, hvm_domain_context_t *h)
> +{
> +    int vcpuid;
> +    struct hvm_hw_vfp ctxt;
> +    struct vcpu *v;
> +
> +    /* Which vcpu is this? */
> +    vcpuid = hvm_load_instance(h);
> +    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL
)
> +    {
> +        dprintk(XENLOG_G_ERR, "HVM restore: dom%u has no
vcpu%u\n",
> +                d->domain_id, vcpuid);
> +        return -EINVAL;
> +    }
> +
> +    if ( sizeof(v->arch.vfp) != sizeof (ctxt) )
> +    {
> +        dprintk(XENLOG_G_ERR, "HVM: check VFP structure
match\n");
> +        return -EINVAL;
> +    }
> +
> +    if ( hvm_load_entry(VFP, h, &ctxt) != 0 )
> +        return -EINVAL;
> +
> +    memcpy(&v->arch.vfp, &ctxt, sizeof(ctxt));
> +
> +    return 0;
> +}
> +
> +HVM_REGISTER_SAVE_RESTORE(VFP, vfp_save, vfp_load, 1, HVMSR_PER_VCPU);
> diff --git a/xen/arch/arm/save.c b/xen/arch/arm/save.c
> new file mode 100644
> index 0000000..c923910
> --- /dev/null
> +++ b/xen/arch/arm/save.c
> @@ -0,0 +1,66 @@
> +/*
> + * hvm/save.c: Save and restore HVM guest''s emulated hardware
state for ARM.
> + *
> + * Copyright (c) 2013, Samsung Electronics.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along
with
> + * this program; if not, write to the Free Software Foundation, Inc., 59
Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#include <asm/hvm/support.h>
> +#include <public/hvm/save.h>
> +
> +void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
> +{
> +    hdr->cpuid = READ_SYSREG32(MIDR_EL1);
> +}
> +
> +int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
> +{
> +    uint32_t cpuid;
> +
> +    if ( hdr->magic != HVM_FILE_MAGIC )
> +    {
> +        printk(XENLOG_G_ERR "HVM%d restore: bad magic number
%#"PRIx32"\n",
> +               d->domain_id, hdr->magic);
> +        return -1;
> +    }
> +
> +    if ( hdr->version != HVM_FILE_VERSION )
> +    {
> +        printk(XENLOG_G_ERR "HVM%d restore: unsupported version
%u\n",
> +               d->domain_id, hdr->version);
> +        return -1;
> +    }
> +
> +    cpuid = READ_SYSREG32(MIDR_EL1);
> +    if ( hdr->cpuid != cpuid )
> +    {
> +        printk(XENLOG_G_INFO "HVM%d restore: VM saved on one CPU
"
> +               "(%#"PRIx32") and restored on another
(%#"PRIx32").\n",
> +               d->domain_id, hdr->cpuid, cpuid);
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 6da4651..733b19f 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -61,6 +61,8 @@ subdir-$(CONFIG_COMPAT) += compat
>  
>  subdir-$(x86_64) += hvm
>  
> +subdir-$(CONFIG_ARM) += hvm
> +
>  subdir-$(coverage) += gcov
>  
>  subdir-y += libelf
> diff --git a/xen/include/asm-arm/hvm/support.h
b/xen/include/asm-arm/hvm/support.h
> new file mode 100644
> index 0000000..8311f2f
> --- /dev/null
> +++ b/xen/include/asm-arm/hvm/support.h
> @@ -0,0 +1,29 @@
> +/*
> + * support.h: HVM support routines used by ARMv7 VE.
> + *
> + * Copyright (c) 2012, Citrix Systems
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along
with
> + * this program; if not, write to the Free Software Foundation, Inc., 59
Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#ifndef __ASM_ARM_HVM_SUPPORT_H__
> +#define __ASM_ARM_HVM_SUPPORT_H__
> +
> +#include <xen/types.h>
> +#include <public/hvm/ioreq.h>
> +#include <xen/sched.h>
> +#include <xen/hvm/save.h>
> +#include <asm/processor.h>
> +
> +#endif /* __ASM_ARM_HVM_SUPPORT_H__ */
> diff --git a/xen/include/public/arch-arm/hvm/save.h
b/xen/include/public/arch-arm/hvm/save.h
> index 75b8e65..7e55384 100644
> --- a/xen/include/public/arch-arm/hvm/save.h
> +++ b/xen/include/public/arch-arm/hvm/save.h
> @@ -26,6 +26,72 @@
>  #ifndef __XEN_PUBLIC_HVM_SAVE_ARM_H__
>  #define __XEN_PUBLIC_HVM_SAVE_ARM_H__
>  
> +#define HVM_FILE_MAGIC   0x92385520
> +#define HVM_FILE_VERSION 0x00000001
> +
> +
> +struct hvm_save_header
> +{
> +    uint32_t magic;             /* Must be HVM_FILE_MAGIC */
> +    uint32_t version;           /* File format version */
> +    uint64_t changeset;         /* Version of Xen that saved this file */
> +    uint32_t cpuid;             /* MIDR_EL1 on the saving machine */
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(HEADER, 1, struct hvm_save_header);
> +
> +struct vgic_rank
> +{
> +    uint32_t ienable, iactive, ipend, pendsgi;
> +    uint32_t icfg[2];
> +    uint32_t ipriority[8];
> +    uint32_t itargets[8];
> +};
> +
> +struct hvm_hw_gic
> +{
> +    uint32_t gic_hcr;
> +    uint32_t gic_vmcr;
> +    uint32_t gic_apr;
> +    uint32_t gic_lr[64];
> +    uint64_t event_mask;
> +    uint64_t lr_mask;
> +    struct vgic_rank ppi_state;
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(GIC, 2, struct hvm_hw_gic);
> +
> +#define TIMER_TYPE_VIRT 0
> +#define TIMER_TYPE_PHYS 1
> +
> +struct hvm_hw_timer
> +{
> +    uint64_t vtb_offset;
> +    uint32_t ctl;
> +    uint64_t cval;
> +    uint32_t type;
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(A15_TIMER, 3, struct hvm_hw_timer);
> +
> +struct hvm_hw_vfp
> +{
> +    uint64_t fpregs1[16]; /* {d0-d15} */
> +    uint64_t fpregs2[16]; /* {d16-d31} */
> +    uint32_t fpexc;
> +    uint32_t fpscr;
> +    /* VFP implementation specific state */
> +    uint32_t fpinst;
> +    uint32_t fpinst2;
> +};
> +
> +DECLARE_HVM_SAVE_TYPE(VFP, 4, struct hvm_hw_vfp);
> +
> +/*
> + * Largest type-code in use
> + */
> +#define HVM_SAVE_CODE_MAX 4
> +
>  #endif
>  
>  /*
> 

-- 
Julien Grall

Julien Grall

2013-Oct-07 12:53 UTC

head link

Re: [PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

On 10/04/2013 05:44 AM, Jaeyong Yoo wrote:> Fix the the broken macro ''clear_guest_offset'' in arm.
> 
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>Reviewed-by: Julien Grall
<julien.grall@linaro.org>> ---
>  xen/include/asm-arm/guest_access.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/include/asm-arm/guest_access.h
b/xen/include/asm-arm/guest_access.h
> index 34aae14..8ff088f 100644
> --- a/xen/include/asm-arm/guest_access.h
> +++ b/xen/include/asm-arm/guest_access.h
> @@ -77,8 +77,9 @@ unsigned long raw_clear_guest(void *to, unsigned len);
>   * Clear an array of objects in guest context via a guest handle,
>   * specifying an offset into the guest array.
>   */
> -#define clear_guest_offset(hnd, off, ptr, nr) ({      \
> -    raw_clear_guest(_d+(off), nr);  \
> +#define clear_guest_offset(hnd, off, nr) ({    \
> +    void *_d = (hnd).p;                        \
> +    raw_clear_guest(_d+(off), nr);             \
>  })
>  
>  /*
> 

-- 
Julien Grall

Julien Grall

2013-Oct-07 13:02 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On 10/04/2013 05:44 AM, Jaeyong Yoo wrote:> Add hypercall (shadow op: enable/disable and clean/peek dirted page
bitmap).
> It consists of two parts: dirty page detecting and saving.
> For detecting, we setup the guest p2m''s leaf PTE read-only and
whenever
> the guest tries to write something, permission fault happens and traps into
xen.
> The permission-faulted GPA should be saved for the toolstack (when it wants
to see
> which pages are dirted). For this purpose, we temporarily save the GPAs
into linked
> list by using ''add_mapped_vaddr'' function and when
toolstack wants
> (log_dirty_op function) the GPAs are copied into bitmap and the linnked
list is flushed.
> 
> Additionally, for supporting parallel migration of domUs, vlpt area should
be context
> switched.
> 
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> ---
[..]
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 4c0fc32..3b78ed2 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1313,6 +1313,8 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>      const char *msg;
>      int rc, level = -1;
>      mmio_info_t info;
> +    int page_fault = ((dabt.dfsc & FSC_MASK) => +                  
(FSC_FLT_PERM + FSC_3D_LEVEL) && dabt.write);
>  
>      if ( !check_conditional_instr(regs, hsr) )
>      {
> @@ -1327,22 +1329,23 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>      info.gva = READ_SYSREG64(FAR_EL2);
>  #endif
>  
> -    if (dabt.s1ptw)
> +    if ( dabt.s1ptw && !page_fault )
I think checking !page_fault is nearly everywhere is error-prone when
this function will be modified.

Can you do something like this?

if ( page_fault )
  // Your code to handle page fault
else
{
  // handle_mmio
}

It will avoid && !page_fault.
>          goto bad_data_abort;
>  
>      rc = gva_to_ipa(info.gva, &info.gpa);
> -    if ( rc == -EFAULT )
> +    if ( rc == -EFAULT && !page_fault )
>          goto bad_data_abort;
>  
>      /* XXX: Decode the instruction if ISS is not valid */
> -    if ( !dabt.valid )
> +    if ( !dabt.valid && !page_fault )
>          goto bad_data_abort;
>  
>      /*
>       * Erratum 766422: Thumb store translation fault to Hypervisor may
>       * not have correct HSR Rt value.
>       */
> -    if ( cpu_has_erratum_766422() && (regs->cpsr &
PSR_THUMB) && dabt.write )
> +    if ( cpu_has_erratum_766422() && (regs->cpsr &
PSR_THUMB) && dabt.write
> +            && !page_fault)
>      {
>          rc = decode_instruction(regs, &info.dabt);
>          if ( rc )
> @@ -1358,6 +1361,16 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>          return;
>      }
>  
> +    /* handle permission fault on write */
> +    if ( page_fault )
> +    {
> +        if ( current->domain->arch.dirty.mode == 0 )
> +           goto bad_data_abort;
> +
> +        handle_page_fault(current->domain, info.gpa);
You must call advance_pc(regs, hsr) here.
> +        return;
> +    }
> +
>  bad_data_abort:
-- 
Julien Grall

Eugene Fedotov

2013-Oct-07 15:51 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

07.10.2013 17:02, Julien Grall:> I think checking !page_fault is nearly everywhere is error-prone when
> this function will be modified.
>
> Can you do something like this?
>
> if ( page_fault )
>    // Your code to handle page fault
> else
> {
>    // handle_mmio
> }
>
> It will avoid && !page_fault.Yes, but I think at page fault condition we should check whether address 
belong MMIO regions (page fault at MMIO addr is possible, but we don''t 
need that memory to transfer)>
>> >          goto bad_data_abort;
>> >  
>> >      rc = gva_to_ipa(info.gva, &info.gpa);
>> >-    if ( rc == -EFAULT )
>> >+    if ( rc == -EFAULT && !page_fault )
>> >          goto bad_data_abort;
>> >  
>> >      /* XXX: Decode the instruction if ISS is not valid */
>> >-    if ( !dabt.valid )
>> >+    if ( !dabt.valid && !page_fault )
>> >          goto bad_data_abort;
>> >  
>> >      /*
>> >       * Erratum 766422: Thumb store translation fault to
Hypervisor may
>> >       * not have correct HSR Rt value.
>> >       */
>> >-    if ( cpu_has_erratum_766422() && (regs->cpsr &
PSR_THUMB) && dabt.write )
>> >+    if ( cpu_has_erratum_766422() && (regs->cpsr &
PSR_THUMB) && dabt.write
>> >+            && !page_fault)
>> >      {
>> >          rc = decode_instruction(regs, &info.dabt);
>> >          if ( rc )
>> >@@ -1358,6 +1361,16 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>> >          return;
>> >      }
>> >  
>> >+    /* handle permission fault on write */
>> >+    if ( page_fault )
>> >+    {
>> >+        if ( current->domain->arch.dirty.mode == 0 )
>> >+           goto bad_data_abort;
>> >+
>> >+        handle_page_fault(current->domain, info.gpa);
> You must call advance_pc(regs, hsr) here.
>Let me explain. I think the difference between handle_page_fault and 
handle_mmio is that the ongoing memory operation (that trapped here) 
should be repeated after handle_page_fault (the page fault handler 
clears read-only bit), but should be stepped out after handle_mmio (to 
do this we call advance_pc to increase the pc register). So, advance_pc 
is intentionally missed.

Best regards,
Evgeny.

Ian Campbell

2013-Oct-10 10:40 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index 5d359af..bf6dc9a 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -253,6 +253,41 @@ struct vcpu_guest_context {
>  
>      uint32_t sctlr, ttbcr;
>      uint64_t ttbr0, ttbr1;
> +    uint32_t ifar, dfar;
> +    uint32_t ifsr, dfsr;
> +    uint32_t dacr;
> +    uint64_t par;
> +
> +#ifdef CONFIG_ARM_32
I''m afraid a per arch ifdef isn''t allowed in the
include/public tree.
The interface should be identical for both 32 and 64 bit callers. Also
think of 32-on-64 guests etc.

Also, this struct is guest facing (via VCPUOP_initialise) but many/all
of these new registers are not things which a guest needs to specify via
a hypercall. IOW I think many of them should be part of some toolstack
private save/restore interface.

I think x86 has a hvm_hw_cpu struct, which I think seems more
appropriate. Tim, what do you think?
> +    uint32_t mair0, mair1;
> +    uint32_t tpidr_el0;
> +    uint32_t tpidr_el1;
> +    uint32_t tpidrro_el0;
> +    uint32_t vbar;
> +#else
> +    uint64_t mair;
> +    uint64_t tpidr_el0;
> +    uint64_t tpidr_el1;
> +    uint64_t tpidrro_el0;
> +    uint64_t vbar;
> +#endif
> +
> +    /* Control Registers */
> +    uint32_t actlr;
> +    uint32_t cpacr;
> +
> +    uint32_t afsr0, afsr1;
> +
> +    uint32_t contextidr;
> +
> +    uint32_t teecr, teehbr; /* ThumbEE, 32-bit guests only */
> +
> +#ifdef CONFIG_ARM_32
> +    uint32_t joscr, jmcr;
> +#endif
> +
> +    /* CP 15 */
> +    uint32_t csselr;
>  };
>  typedef struct vcpu_guest_context vcpu_guest_context_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);

Ian Campbell

2013-Oct-10 10:56 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:> From: Evgeny Fedotov <e.fedotov@samsung.com>
> 
> When creating domU in toolstack, pass the guest memory
> map info to the hypervisor, and the hypervisor stores those info in
> arch_domain for later use.
> 
> Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
> ---
>  tools/libxc/xc_dom_arm.c     | 12 +++++++-
>  tools/libxc/xc_domain.c      | 44 ++++++++++++++++++++++++++++
>  tools/libxc/xenctrl.h        | 23 +++++++++++++++
>  xen/arch/arm/domain.c        |  3 ++
>  xen/arch/arm/mm.c            | 68
++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/domain.h |  2 ++
>  xen/include/asm-arm/mm.h     |  1 +
>  xen/include/public/memory.h  | 15 ++++++++--
>  xen/include/xsm/dummy.h      |  5 ++++
>  xen/include/xsm/xsm.h        |  5 ++++
>  10 files changed, 175 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
> index df59ffb..20c9095 100644
> --- a/tools/libxc/xc_dom_arm.c
> +++ b/tools/libxc/xc_dom_arm.c
> @@ -166,6 +166,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>  {
>      int rc;
>      xen_pfn_t pfn, allocsz, i;
> +    struct dt_mem_info memmap;
>  
>      dom->shadow_enabled = 1;
>  
> @@ -191,7 +192,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              0, 0, &dom->p2m_host[i]);
>      }
>  
> -    return 0;
> +    /* setup guest memory map */
> +    memmap.nr_banks = 2;
> +    memmap.bank[0].start = (dom->rambase_pfn << PAGE_SHIFT_ARM);
> +    memmap.bank[0].size = (dom->total_pages << PAGE_SHIFT_ARM);
> +    /*The end of main memory: magic pages */
> +    memmap.bank[1].start = memmap.bank[0].start + memmap.bank[0].size;
> +    memmap.bank[1].size = NR_MAGIC_PAGES << PAGE_SHIFT_ARM;
Are the 0 and 1 here hardcoded magic numbers?
> +    return xc_domain_set_memory_map(dom->xch, dom->guest_domid,
&memmap);
I think this is using set_memory_map in a different way to it is used
for x86 (where it gives the PV e820 map, a PV version of a BIOS provided
datastructure).

The guest should get its memory map via DTB not via a PV hypercall. I
know the guest isn''t using get_memory_map but I don''t think we
should
even make it available.

On x86 the magic pages are handled specially in the save restore code
via HVM_PARAMS and not exposed to the guest via this memory map call. I
think that''s the way to go here too.

We do need a way to remember and return the guest RAM layout (well, for
now the RAM base is hardcoded, so we could fudge it) but I think a
toolstack internal mechanism like a save record would be better.

Ian.

Ian Campbell

2013-Oct-10 12:00 UTC

head link

Re: [PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

On Mon, 2013-10-07 at 13:53 +0100, Julien Grall wrote:> On 10/04/2013 05:44 AM, Jaeyong Yoo wrote:
> > Fix the the broken macro ''clear_guest_offset'' in
arm.
> > 
> > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> Reviewed-by: Julien Grall <julien.grall@linaro.org>
acked + applied. thanks.

Julien Grall

2013-Oct-10 14:10 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On 10/07/2013 04:51 PM, Eugene Fedotov wrote:
> Let me explain. I think the difference between handle_page_fault and
> handle_mmio is that the ongoing memory operation (that trapped here)
> should be repeated after handle_page_fault (the page fault handler
> clears read-only bit), but should be stepped out after handle_mmio (to
> do this we call advance_pc to increase the pc register). So, advance_pc
> is intentionally missed.
Oh right, thanks for the explanation!

-- 
Julien Grall

Jaeyong Yoo

2013-Oct-11 08:30 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: Thursday, October 10, 2013 7:41 PM
> To: Jaeyong Yoo
> Cc: Tim Deegan; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers for
> saving and restoring vcpu registers
> 
> On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> > diff --git a/xen/include/public/arch-arm.h
> > b/xen/include/public/arch-arm.h index 5d359af..bf6dc9a 100644
> > --- a/xen/include/public/arch-arm.h
> > +++ b/xen/include/public/arch-arm.h
> > @@ -253,6 +253,41 @@ struct vcpu_guest_context {
> >
> >      uint32_t sctlr, ttbcr;
> >      uint64_t ttbr0, ttbr1;
> > +    uint32_t ifar, dfar;
> > +    uint32_t ifsr, dfsr;
> > +    uint32_t dacr;
> > +    uint64_t par;
> > +
> > +#ifdef CONFIG_ARM_32
> 
> I''m afraid a per arch ifdef isn''t allowed in the
include/public tree.
> The interface should be identical for both 32 and 64 bit callers. Also
> think of 32-on-64 guests etc.
> 
> Also, this struct is guest facing (via VCPUOP_initialise) but many/all of
> these new registers are not things which a guest needs to specify via a
> hypercall. IOW I think many of them should be part of some toolstack
> private save/restore interface.
I see, the guest can specify something like sctlr, and ttbr/ttbcr, and the
others
should be hidden inside hvm save/restore. 
> 
> I think x86 has a hvm_hw_cpu struct, which I think seems more appropriate.
> Tim, what do you think?
> 
> > +    uint32_t mair0, mair1;
> > +    uint32_t tpidr_el0;
> > +    uint32_t tpidr_el1;
> > +    uint32_t tpidrro_el0;
> > +    uint32_t vbar;
> > +#else
> > +    uint64_t mair;
> > +    uint64_t tpidr_el0;
> > +    uint64_t tpidr_el1;
> > +    uint64_t tpidrro_el0;
> > +    uint64_t vbar;
> > +#endif
> > +
> > +    /* Control Registers */
> > +    uint32_t actlr;
> > +    uint32_t cpacr;
> > +
> > +    uint32_t afsr0, afsr1;
> > +
> > +    uint32_t contextidr;
> > +
> > +    uint32_t teecr, teehbr; /* ThumbEE, 32-bit guests only */
> > +
> > +#ifdef CONFIG_ARM_32
> > +    uint32_t joscr, jmcr;
> > +#endif
> > +
> > +    /* CP 15 */
> > +    uint32_t csselr;
> >  };
> >  typedef struct vcpu_guest_context vcpu_guest_context_t;
> > DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ian Campbell

2013-Oct-11 08:43 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > Sent: Thursday, October 10, 2013 7:41 PM
> > To: Jaeyong Yoo
> > Cc: Tim Deegan; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers
for
> > saving and restoring vcpu registers
> > 
> > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> > > diff --git a/xen/include/public/arch-arm.h
> > > b/xen/include/public/arch-arm.h index 5d359af..bf6dc9a 100644
> > > --- a/xen/include/public/arch-arm.h
> > > +++ b/xen/include/public/arch-arm.h
> > > @@ -253,6 +253,41 @@ struct vcpu_guest_context {
> > >
> > >      uint32_t sctlr, ttbcr;
> > >      uint64_t ttbr0, ttbr1;
> > > +    uint32_t ifar, dfar;
> > > +    uint32_t ifsr, dfsr;
> > > +    uint32_t dacr;
> > > +    uint64_t par;
> > > +
> > > +#ifdef CONFIG_ARM_32
> > 
> > I''m afraid a per arch ifdef isn''t allowed in the
include/public tree.
> > The interface should be identical for both 32 and 64 bit callers. Also
> > think of 32-on-64 guests etc.
> > 
> > Also, this struct is guest facing (via VCPUOP_initialise) but many/all
of
> > these new registers are not things which a guest needs to specify via
a
> > hypercall. IOW I think many of them should be part of some toolstack
> > private save/restore interface.
> 
> I see, the guest can specify something like sctlr, and ttbr/ttbcr, and the
> others should be hidden inside hvm save/restore. 
Right, the important thing is that all that additional state is only
visible to the toolstack and the hypervisor, not to guests.

Actually the guest shouldn''t really see this interface anyway,
that''s
really a hold over from x86. On ARM only the toolstack really needs to
use this struct.

I wonder if we can drop struct vcpu_guest_context from the guest facing
ABI on ARM. I see that we already don''t expose VCPUOP_initialise and
the
only other user is XEN_DOMCTL_(sg)etvcpucontext.

I wonder if we could on x86 do:
        typedef vcpu_guest_context_t domctl_guest_context_t
and on ARM s/vcpu_guest_context/domtctl_guest_context/ (plus appropriate
__XEN_TOOLS__ ifdeffery) and make the domctl''s use the new name only
while VCPUOP uses the old name?

Can anyone think of a reason why a guest might need an interface to
set/get VCPU state directly like this on ARM? Or why this might not work
for x86?

Ian.

Eugene Fedotov

2013-Oct-11 10:06 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

10.10.2013 14:56, Ian Campbell:> On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
>> From: Evgeny Fedotov <e.fedotov@samsung.com>
>>
>> When creating domU in toolstack, pass the guest memory
>> map info to the hypervisor, and the hypervisor stores those info in
>> arch_domain for later use.
>>
>> Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
>> ---
>>   tools/libxc/xc_dom_arm.c     | 12 +++++++-
>>   tools/libxc/xc_domain.c      | 44 ++++++++++++++++++++++++++++
>>   tools/libxc/xenctrl.h        | 23 +++++++++++++++
>>   xen/arch/arm/domain.c        |  3 ++
>>   xen/arch/arm/mm.c            | 68
++++++++++++++++++++++++++++++++++++++++++++
>>   xen/include/asm-arm/domain.h |  2 ++
>>   xen/include/asm-arm/mm.h     |  1 +
>>   xen/include/public/memory.h  | 15 ++++++++--
>>   xen/include/xsm/dummy.h      |  5 ++++
>>   xen/include/xsm/xsm.h        |  5 ++++
>>   10 files changed, 175 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
>> index df59ffb..20c9095 100644
>> --- a/tools/libxc/xc_dom_arm.c
>> +++ b/tools/libxc/xc_dom_arm.c
>> @@ -166,6 +166,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>>   {
>>       int rc;
>>       xen_pfn_t pfn, allocsz, i;
>> +    struct dt_mem_info memmap;
>>   
>>       dom->shadow_enabled = 1;
>>   
>> @@ -191,7 +192,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>>               0, 0, &dom->p2m_host[i]);
>>       }
>>   
>> -    return 0;
>> +    /* setup guest memory map */
>> +    memmap.nr_banks = 2;
>> +    memmap.bank[0].start = (dom->rambase_pfn <<
PAGE_SHIFT_ARM);
>> +    memmap.bank[0].size = (dom->total_pages <<
PAGE_SHIFT_ARM);
>> +    /*The end of main memory: magic pages */
>> +    memmap.bank[1].start = memmap.bank[0].start + memmap.bank[0].size;
>> +    memmap.bank[1].size = NR_MAGIC_PAGES << PAGE_SHIFT_ARM;
> Are the 0 and 1 here hardcoded magic numbers?Well, we hardcoded here two memory regions: the first one for RAM, the 
second one for "magic pages".>> +    return xc_domain_set_memory_map(dom->xch, dom->guest_domid,
&memmap);
> I think this is using set_memory_map in a different way to it is used
> for x86 (where it gives the PV e820 map, a PV version of a BIOS provided
> datastructure).Do you mean that using e820 structure for ARM implementation is better 
than dt_mem_info structure (that has been taken from libdt) and should 
be used in ARM implementation of get/set memory map?> The guest should get its memory map via DTB not via a PV hypercall. I
> know the guest isn''t using get_memory_map but I don''t
think we should
> even make it available.
OK, this hypercall will be available from dom0 only.>
> On x86 the magic pages are handled specially in the save restore code
> via HVM_PARAMS and not exposed to the guest via this memory map call. I
> think that''s the way to go here too.
>
> We do need a way to remember and return the guest RAM layout (well, for
> now the RAM base is hardcoded, so we could fudge it) but I think a
> toolstack internal mechanism like a save record would be better.
>
> Ian.
>So, get/set memory map hypercall should get/set only main RAM region? 
Two questions about details of next implementation:
1) Can we restrict memory map number of regions to 1 for ARM? In this 
case we don''t need list or array to define memory map structure.
2)We have already implemented store& restore of HVM params (magic pages 
PFNs) inside ARM toolstack, but I don''t know if we need to migrate 
contents of such pages. In current ARM toolstack we migrate magic pages 
contents too as a part of memory map. In x86 code I cannot find where 
the content of magic pages are saved or restored, so may be Xen 
developers are familiar about it.

Best regards,
Eugene.

Ian Campbell

2013-Oct-11 10:09 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Fri, 2013-10-11 at 14:06 +0400, Eugene Fedotov wrote:> 10.10.2013 14:56, Ian Campbell:
> > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> >> From: Evgeny Fedotov <e.fedotov@samsung.com>
> >>
> >> When creating domU in toolstack, pass the guest memory
> >> map info to the hypervisor, and the hypervisor stores those info
in
> >> arch_domain for later use.
> >>
> >> Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
> >> ---
> >>   tools/libxc/xc_dom_arm.c     | 12 +++++++-
> >>   tools/libxc/xc_domain.c      | 44 ++++++++++++++++++++++++++++
> >>   tools/libxc/xenctrl.h        | 23 +++++++++++++++
> >>   xen/arch/arm/domain.c        |  3 ++
> >>   xen/arch/arm/mm.c            | 68
++++++++++++++++++++++++++++++++++++++++++++
> >>   xen/include/asm-arm/domain.h |  2 ++
> >>   xen/include/asm-arm/mm.h     |  1 +
> >>   xen/include/public/memory.h  | 15 ++++++++--
> >>   xen/include/xsm/dummy.h      |  5 ++++
> >>   xen/include/xsm/xsm.h        |  5 ++++
> >>   10 files changed, 175 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
> >> index df59ffb..20c9095 100644
> >> --- a/tools/libxc/xc_dom_arm.c
> >> +++ b/tools/libxc/xc_dom_arm.c
> >> @@ -166,6 +166,7 @@ int arch_setup_meminit(struct xc_dom_image
*dom)
> >>   {
> >>       int rc;
> >>       xen_pfn_t pfn, allocsz, i;
> >> +    struct dt_mem_info memmap;
> >>   
> >>       dom->shadow_enabled = 1;
> >>   
> >> @@ -191,7 +192,16 @@ int arch_setup_meminit(struct xc_dom_image
*dom)
> >>               0, 0, &dom->p2m_host[i]);
> >>       }
> >>   
> >> -    return 0;
> >> +    /* setup guest memory map */
> >> +    memmap.nr_banks = 2;
> >> +    memmap.bank[0].start = (dom->rambase_pfn <<
PAGE_SHIFT_ARM);
> >> +    memmap.bank[0].size = (dom->total_pages <<
PAGE_SHIFT_ARM);
> >> +    /*The end of main memory: magic pages */
> >> +    memmap.bank[1].start = memmap.bank[0].start +
memmap.bank[0].size;
> >> +    memmap.bank[1].size = NR_MAGIC_PAGES << PAGE_SHIFT_ARM;
> > Are the 0 and 1 here hardcoded magic numbers?
> Well, we hardcoded here two memory regions: the first one for RAM, the 
> second one for "magic pages".
> >> +    return xc_domain_set_memory_map(dom->xch,
dom->guest_domid, &memmap);
> > I think this is using set_memory_map in a different way to it is used
> > for x86 (where it gives the PV e820 map, a PV version of a BIOS
provided
> > datastructure).
> Do you mean that using e820 structure for ARM implementation is better 
> than dt_mem_info structure (that has been taken from libdt) and should 
> be used in ARM implementation of get/set memory map?
I mean on x86 it is used as a way of communicating a memory map (in a
structure vaguely like an actual defined architecture specific layout,
the e820), whereas on ARM it is only used by the tools and should not be
exposed to the guest, since the guest gets its info from the DTB.
> > The guest should get its memory map via DTB not via a PV hypercall. I
> > know the guest isn''t using get_memory_map but I
don''t think we should
> > even make it available.
> OK, this hypercall will be available from dom0 only.
It''ll have to move away from the XENMEM space then.
> >
> > On x86 the magic pages are handled specially in the save restore code
> > via HVM_PARAMS and not exposed to the guest via this memory map call.
I
> > think that''s the way to go here too.
> 
> >
> > We do need a way to remember and return the guest RAM layout (well,
for
> > now the RAM base is hardcoded, so we could fudge it) but I think a
> > toolstack internal mechanism like a save record would be better.
> >
> > Ian.
> >
> So, get/set memory map hypercall should get/set only main RAM region?
If you have multiple RAM regions then it should save them, but the magic
pages are not RAM regions in that sense so should not be included.
> Two questions about details of next implementation:
> 1) Can we restrict memory map number of regions to 1 for ARM? In this 
> case we don''t need list or array to define memory map structure.
> 2)We have already implemented store& restore of HVM params (magic pages
> PFNs) inside ARM toolstack, but I don''t know if we need to migrate
> contents of such pages. In current ARM toolstack we migrate magic pages 
> contents too as a part of memory map. In x86 code I cannot find where 
> the content of magic pages are saved or restored, so may be Xen 
> developers are familiar about it.
I have a feeling they are not migrated, and I''m not sure. Perhaps Tim
knows...

Ian.
> 
> Best regards,
> Eugene.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Tim Deegan

2013-Oct-11 10:18 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

At 11:09 +0100 on 11 Oct (1381489749), Ian Campbell
wrote:> > Two questions about details of next implementation:
> > 1) Can we restrict memory map number of regions to 1 for ARM? In this 
> > case we don''t need list or array to define memory map
structure.
Sounds very sensible to me.
> > 2)We have already implemented store& restore of HVM params (magic
pages
> > PFNs) inside ARM toolstack, but I don''t know if we need to
migrate
> > contents of such pages. In current ARM toolstack we migrate magic
pages
> > contents too as a part of memory map. In x86 code I cannot find where 
> > the content of magic pages are saved or restored, so may be Xen 
> > developers are familiar about it.
> 
> I have a feeling they are not migrated, and I''m not sure. Perhaps
Tim
> knows...
IIRC they are saved as part of general guest memory, but in fact
they''ll
all be reset on the other side anyway.

Tim.

Tim Deegan

2013-Oct-11 10:22 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

At 09:43 +0100 on 11 Oct (1381484614), Ian Campbell
wrote:> On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > > Sent: Thursday, October 10, 2013 7:41 PM
> > > To: Jaeyong Yoo
> > > Cc: Tim Deegan; xen-devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more
registers for
> > > saving and restoring vcpu registers
> > > 
> > > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> > > > diff --git a/xen/include/public/arch-arm.h
> > > > b/xen/include/public/arch-arm.h index 5d359af..bf6dc9a
100644
> > > > --- a/xen/include/public/arch-arm.h
> > > > +++ b/xen/include/public/arch-arm.h
> > > > @@ -253,6 +253,41 @@ struct vcpu_guest_context {
> > > >
> > > >      uint32_t sctlr, ttbcr;
> > > >      uint64_t ttbr0, ttbr1;
> > > > +    uint32_t ifar, dfar;
> > > > +    uint32_t ifsr, dfsr;
> > > > +    uint32_t dacr;
> > > > +    uint64_t par;
> > > > +
> > > > +#ifdef CONFIG_ARM_32
> > > 
> > > I''m afraid a per arch ifdef isn''t allowed in
the include/public tree.
> > > The interface should be identical for both 32 and 64 bit callers.
Also
> > > think of 32-on-64 guests etc.
> > > 
> > > Also, this struct is guest facing (via VCPUOP_initialise) but
many/all of
> > > these new registers are not things which a guest needs to specify
via a
> > > hypercall. IOW I think many of them should be part of some
toolstack
> > > private save/restore interface.
> > 
> > I see, the guest can specify something like sctlr, and ttbr/ttbcr, and
the
> > others should be hidden inside hvm save/restore. 
> 
> Right, the important thing is that all that additional state is only
> visible to the toolstack and the hypervisor, not to guests.
> 
> Actually the guest shouldn''t really see this interface anyway,
that''s
> really a hold over from x86. On ARM only the toolstack really needs to
> use this struct.
> 
> I wonder if we can drop struct vcpu_guest_context from the guest facing
> ABI on ARM. I see that we already don''t expose VCPUOP_initialise
and the
> only other user is XEN_DOMCTL_(sg)etvcpucontext.
Does the guest need to have those on ARM? How are you arranging SMP
guest AP bringup?  If it''s using SCI then maybe that hypercall
interface
can be dropped.

Tim.

Ian Campbell

2013-Oct-11 10:25 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

On Fri, 2013-10-11 at 11:22 +0100, Tim Deegan wrote:> At 09:43 +0100 on 11 Oct (1381484614), Ian Campbell wrote:
> > On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:
> > > > -----Original Message-----
> > > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > > > Sent: Thursday, October 10, 2013 7:41 PM
> > > > To: Jaeyong Yoo
> > > > Cc: Tim Deegan; xen-devel@lists.xen.org
> > > > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more
registers for
> > > > saving and restoring vcpu registers
> > > > 
> > > > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> > > > > diff --git a/xen/include/public/arch-arm.h
> > > > > b/xen/include/public/arch-arm.h index 5d359af..bf6dc9a
100644
> > > > > --- a/xen/include/public/arch-arm.h
> > > > > +++ b/xen/include/public/arch-arm.h
> > > > > @@ -253,6 +253,41 @@ struct vcpu_guest_context {
> > > > >
> > > > >      uint32_t sctlr, ttbcr;
> > > > >      uint64_t ttbr0, ttbr1;
> > > > > +    uint32_t ifar, dfar;
> > > > > +    uint32_t ifsr, dfsr;
> > > > > +    uint32_t dacr;
> > > > > +    uint64_t par;
> > > > > +
> > > > > +#ifdef CONFIG_ARM_32
> > > > 
> > > > I''m afraid a per arch ifdef isn''t allowed
in the include/public tree.
> > > > The interface should be identical for both 32 and 64 bit
callers. Also
> > > > think of 32-on-64 guests etc.
> > > > 
> > > > Also, this struct is guest facing (via VCPUOP_initialise)
but many/all of
> > > > these new registers are not things which a guest needs to
specify via a
> > > > hypercall. IOW I think many of them should be part of some
toolstack
> > > > private save/restore interface.
> > > 
> > > I see, the guest can specify something like sctlr, and
ttbr/ttbcr, and the
> > > others should be hidden inside hvm save/restore. 
> > 
> > Right, the important thing is that all that additional state is only
> > visible to the toolstack and the hypervisor, not to guests.
> > 
> > Actually the guest shouldn''t really see this interface
anyway, that''s
> > really a hold over from x86. On ARM only the toolstack really needs to
> > use this struct.
> > 
> > I wonder if we can drop struct vcpu_guest_context from the guest
facing
> > ABI on ARM. I see that we already don''t expose
VCPUOP_initialise and the
> > only other user is XEN_DOMCTL_(sg)etvcpucontext.
> 
> Does the guest need to have those on ARM? How are you arranging SMP
> guest AP bringup?  If it''s using SCI then maybe that hypercall
interface
> can be dropped.
SMP bring up is done via PSCI, which is the firmware "hypercall"
interface, defined by ARM for bringing up physucal CPUs which we
implement within Xen for the guests'' benefit.

Ian

Jaeyong Yoo

2013-Oct-14 04:39 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: Friday, October 11, 2013 7:26 PM
> To: Tim Deegan
> Cc: Keir Fraser; Stefano Stabellini; Jan Beulich; Jaeyong Yoo; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers for
> saving and restoring vcpu registers
> 
> On Fri, 2013-10-11 at 11:22 +0100, Tim Deegan wrote:
> > At 09:43 +0100 on 11 Oct (1381484614), Ian Campbell wrote:
> > > On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:
> > > > > -----Original Message-----
> > > > > From: xen-devel-bounces@lists.xen.org
[mailto:xen-devel-
> > > > > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > > > > Sent: Thursday, October 10, 2013 7:41 PM
> > > > > To: Jaeyong Yoo
> > > > > Cc: Tim Deegan; xen-devel@lists.xen.org
> > > > > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add
more
> > > > > registers for saving and restoring vcpu registers
> > > > >
> > > > > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
> > > > > > diff --git a/xen/include/public/arch-arm.h
> > > > > > b/xen/include/public/arch-arm.h index
5d359af..bf6dc9a 100644
> > > > > > --- a/xen/include/public/arch-arm.h
> > > > > > +++ b/xen/include/public/arch-arm.h
> > > > > > @@ -253,6 +253,41 @@ struct vcpu_guest_context {
> > > > > >
> > > > > >      uint32_t sctlr, ttbcr;
> > > > > >      uint64_t ttbr0, ttbr1;
> > > > > > +    uint32_t ifar, dfar;
> > > > > > +    uint32_t ifsr, dfsr;
> > > > > > +    uint32_t dacr;
> > > > > > +    uint64_t par;
> > > > > > +
> > > > > > +#ifdef CONFIG_ARM_32
> > > > >
> > > > > I''m afraid a per arch ifdef isn''t
allowed in the include/public
> tree.
> > > > > The interface should be identical for both 32 and 64
bit
> > > > > callers. Also think of 32-on-64 guests etc.
> > > > >
> > > > > Also, this struct is guest facing (via
VCPUOP_initialise) but
> > > > > many/all of these new registers are not things which a
guest
> > > > > needs to specify via a hypercall. IOW I think many of
them
> > > > > should be part of some toolstack private save/restore
interface.
> > > >
> > > > I see, the guest can specify something like sctlr, and
ttbr/ttbcr,
> > > > and the others should be hidden inside hvm save/restore.
> > >
> > > Right, the important thing is that all that additional state is
only
> > > visible to the toolstack and the hypervisor, not to guests.
> > >
> > > Actually the guest shouldn''t really see this interface
anyway,
> > > that''s really a hold over from x86. On ARM only the
toolstack really
> > > needs to use this struct.
> > >
> > > I wonder if we can drop struct vcpu_guest_context from the guest
> > > facing ABI on ARM. I see that we already don''t expose
> > > VCPUOP_initialise and the only other user is
> XEN_DOMCTL_(sg)etvcpucontext.
> >
> > Does the guest need to have those on ARM? How are you arranging SMP
> > guest AP bringup?  If it''s using SCI then maybe that
hypercall
> > interface can be dropped.
> 
> SMP bring up is done via PSCI, which is the firmware "hypercall"
> interface, defined by ARM for bringing up physucal CPUs which we implement
> within Xen for the guests'' benefit.
So, could I remove XEN_DOMCTL_(sg)etvcpucontext and use HVM save/restore for
migrating vcpu registers? 

And, if I could, arch_get_info_guest becomes dangling function and would it
be better to remove this function too? or better to keep it for symmetry?

Jaeyong
> 
> Ian
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ian Campbell

2013-Oct-16 12:50 UTC

head link

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:> Implement vlpt (virtual-linear page table) for fast accessing of 3rd
> PTE of guest p2m. When creating a mapping for vlpt, just copy the
> 1st level PTE of guest p2m to the xen''s 2nd level PTE. Then the
mapping
> becomes the following:
>     xen''s 1st PTE -->
>     xen''s 2nd PTE (which is the same as 1st PTE of guest p2m)
-->
>     guest p2m''s 2nd PTE -->
>     guest p2m''s 3rd PTE (the memory contents where the vlpt
points)
> For more info about vlpt, see:
> http://www.technovelty.org/linux/virtual-linear-page-table.html
> 
> This function is used in dirty-page tracing: when domU write-fault is
trapped by xen,
> xen can immediately locate the 3rd PTE of guest p2m.
ISTR you had some numbers which backed up the choice of this solution.
Can you either include them here or link to the list archives?
> Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> ---
>  xen/arch/arm/mm.c            | 78
++++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/config.h |  6 ++++
>  xen/include/asm-arm/domain.h |  7 ++++
>  xen/include/asm-arm/mm.h     | 22 +++++++++++++
>  4 files changed, 113 insertions(+)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 120eae8..98edbc9 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
>          return 1;
>      return 0;
>  }
> +
> +/* flush the vlpt area */
> +static void flush_vlpt(struct domain *d)
> +{
> +    int flush_size;
> +    flush_size = (d->arch.dirty.second_lvl_end -
> +                  d->arch.dirty.second_lvl_start) <<
SECOND_SHIFT;
> +    /* flushing the 3rd level mapping */
> +    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
> +                                flush_size);
Shouldn''t the base here be VIRT_LIN_P2M_START +
(d->arch.dirty.second_lvl_start) << SECOND_SHIFT) or something like
that?

flush_xen_data_tlb_range_va just turns into a loop over the addresses,
so you might find you may as well do the flushes as you update the ptes
in the below, perhaps with an optimisation to pull the barriers outside
that loop.
> +}
> +
> +/* restore the xen page table for vlpt mapping for domain d */
> +void restore_vlpt(struct domain *d)
> +{
> +    int i, j = 0;
> +    int need_flush = 0;
> +
> +    for ( i = d->arch.dirty.second_lvl_start;
> +          i < d->arch.dirty.second_lvl_end;
> +          ++i, ++j )
> +    {
> +        if ( xen_second[i].bits != d->arch.dirty.second_lvl[j].bits )
> +        {
> +            write_pte(&xen_second[i], d->arch.dirty.second_lvl[j]);
> +            need_flush = 1;
> +        }
> +    }
> +
> +    if ( need_flush ) flush_vlpt(d);
> +}
> +
> +/* setting up the xen page table for vlpt mapping for domain d */
> +void prepare_vlpt(struct domain *d)
> +{
> +    int xen_second_linear_base;
> +    int gp2m_start_index, gp2m_end_index;
> +    struct p2m_domain *p2m = &d->arch.p2m;
> +    struct page_info *second_lvl_page;
> +    vaddr_t gma_start = 0;
> +    vaddr_t gma_end = 0;
> +    lpae_t *first;
> +    int i, j;
> +
> +    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);
> +    get_gma_start_end(d, &gma_start, &gma_end);
> +
> +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> +
> +    second_lvl_page = alloc_domheap_page(NULL, 0);
The p2m first is two concatenated pages with a total of 1024 entries
(which is needed to give the full 40 bit IPA space). I have a feeling
this means you need two pages here?

Or maybe we shouldn''t be supporting the full 40 bit addresses on
32-bit?

NB currently create_p2m_entries doesn''t actually support 40 bits, but I
need to fix that for an arm64 platform I''m looking at.
> +    d->arch.dirty.second_lvl = map_domain_page_global(
> +                                 page_to_mfn(second_lvl_page) );
> +
> +    first = __map_domain_page(p2m->first_level);
> +    for ( i = gp2m_start_index, j = 0; i < gp2m_end_index; ++i, ++j )
> +    {
> +        write_pte(&xen_second[xen_second_linear_base+i],
> +                  first[i]);
> +
> +        /* we copy the mapping into domain''s structure as a
reference in case
> +         * of the context switch (used in restore_vlpt) */
> +        d->arch.dirty.second_lvl[j] = first[i];
> +    }
> +    unmap_domain_page(first);
> +
> +    /* storing the start and end index */
> +    d->arch.dirty.second_lvl_start = xen_second_linear_base +
gp2m_start_index;
> +    d->arch.dirty.second_lvl_end = xen_second_linear_base +
gp2m_end_index;
> +
> +    flush_vlpt(d);
> +}
> +
> +void cleanup_vlpt(struct domain *d)
> +{
> +    unmap_domain_page_global(d->arch.dirty.second_lvl);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index 9e395c2..47fc26e 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -87,6 +87,7 @@
>   *   0  -   8M   <COMMON>
>   *
>   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
> + * 128M - 256M   Virtual-linear mapping to P2M table
>   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
>   *                    space
>   *
> @@ -132,6 +133,9 @@
>  
>  #define VMAP_VIRT_END    XENHEAP_VIRT_START
>  
> +#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
Can you put this up in the list of addresses in order please. 
> +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
> +
>  #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */
>  
>  /* Number of domheap pagetable pages required at the second level (2MB
mappings) */
> @@ -158,6 +162,8 @@
>  
>  #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
>  
> +/*TODO (ARM_64): define VIRT_LIN_P2M_START VIRT_LIN_P2M_END */
> +
>  #endif
>  
>  /* Fixmap slots */
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 755c7af..7e7743a 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -112,6 +112,13 @@ struct arch_domain
>          spinlock_t                  lock;
>      } vuart;
>  
> +    /* dirty-page tracing */
> +    struct {
> +        int second_lvl_start;            /* for context switch */
> +        int second_lvl_end;
> +        lpae_t *second_lvl;
> +    } dirty;
> +
>      struct dt_mem_info map_domain;
>      spinlock_t         map_lock;
>  }  __cacheline_aligned;
> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> index 021e8d6..b9cbcf2 100644
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -344,6 +344,28 @@ static inline void put_page_and_type(struct page_info
*page)
>  
>  void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t *end);
>  
> +/* vlpt (virtual linear page table) related functions */
> +void prepare_vlpt(struct domain *d);
> +void cleanup_vlpt(struct domain *d);
> +void restore_vlpt(struct domain *d);
> +
> +/* calculate the xen''s virtual address for accessing the leaf PTE
of
> + * a given address (GPA) */
> +static inline lpae_t * get_vlpt_3lvl_pte(vaddr_t addr)
Argument is an IPA and should therefore be a paddr_t I think?
> +{
> +    unsigned long ipa_third = third_table_offset(addr);
> +    unsigned long ipa_second = second_table_offset(addr);
> +    unsigned long ipa_first = first_table_offset(addr);
> +
> +    /* guest p2m''s first level index is the index of
xen''s second level,
> +       and guest p2m''s second level index is the index of
xen''s third level */
> +    return (lpae_t *)(
> +        ( (second_linear_offset(VIRT_LIN_P2M_START) + ipa_first)
> +           << SECOND_SHIFT) +
> +        (ipa_second << THIRD_SHIFT) +
> +        (ipa_third << 3) );
> +}
Can we assign the table base to an "lpae_t *table" and then use
&table[ipa_third] rather than hardcoding << 3?

I can''t quite shake the feeling that the
"ipa_first<<SECOND_SHIFT" and
"ipa_second<<THIRD_SHIFT" elements of this could be achieved
with a
single shift of the original address (and maybe a mask).
> +
>  #endif /*  __ARCH_ARM_MM__ */
>  /*
>   * Local variables:

Ian Campbell

2013-Oct-16 13:44 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:> Add hypercall (shadow op: enable/disable and clean/peek dirted page
bitmap).
"dirtied"
> It consists of two parts: dirty page detecting and saving.
> For detecting, we setup the guest p2m''s leaf PTE read-only and
whenever
> the guest tries to write something, permission fault happens and traps into
xen.
> The permission-faulted GPA should be saved for the toolstack (when it wants
to see
> which pages are dirted). For this purpose, we temporarily save the GPAs
into linked
"dirtied" again
> list by using ''add_mapped_vaddr'' function and when
toolstack wants
> (log_dirty_op function) the GPAs are copied into bitmap and the linnked
list is flushed.
"linked"
>      spin_lock_init(&d->arch.map_lock);
>      d->arch.map_domain.nr_banks = 0;
>  
> +    /* init for dirty-page tracing */
> +    d->arch.dirty.count = 0;
> +    d->arch.dirty.mode = 0;
> +    d->arch.dirty.head = NULL;
> +    d->arch.dirty.mvn_head = NULL;
> +    spin_lock_init(&d->arch.dirty.lock);
> +
> +    d->arch.dirty.second_lvl_start = 0;
> +    d->arch.dirty.second_lvl_end = 0;
> +    d->arch.dirty.second_lvl = NULL;
I think some/all of these belong in the previous patch which added the
fields.
> +
>      clear_page(d->shared_info);
>      share_xen_page_with_guest(
>          virt_to_page(d->shared_info), d, XENSHARE_writable);
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index 78fb322..7edce12 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -92,6 +92,19 @@ long arch_do_domctl(struct xen_domctl *domctl, struct
domain *d,
>              xfree(c.data);
>      }
>      break;
> +    case XEN_DOMCTL_shadow_op:
> +    {
> +        domain_pause(d);
> +        ret = dirty_mode_op(d, &domctl->u.shadow_op);
> +        domain_unpause(d);
> +
> +        if ( (&domctl->u.shadow_op)->op ==
XEN_DOMCTL_SHADOW_OP_CLEAN ||
> +             (&domctl->u.shadow_op)->op ==
XEN_DOMCTL_SHADOW_OP_PEEK )
(&domctl->u.shadow_op)->op is just a weird way to write
domctl->u.shadow_op.op isn''t it?

Do you only want to copyback on success or is it always correct?
> +        {
> +            copyback = 1;
> +        }
> +    }
> +    break;
> @@ -1460,6 +1459,28 @@ void cleanup_vlpt(struct domain *d)
>      unmap_domain_page_global(d->arch.dirty.second_lvl);
>  }
>  
> +/* routine for dirty-page tracing
> + *
> + * On first write, it page faults, its entry is changed to read-write,
> + * and on retry the write succeeds.
> + *
> + * for locating p2m of the faulting entry, we use virtual-linear page
table.
> + */
> +void handle_page_fault(struct domain *d, paddr_t addr)
> +{
> +    lpae_t *vlp2m_pte = get_vlpt_3lvl_pte(addr);
> +    if ( vlp2m_pte->p2m.valid && vlp2m_pte->p2m.write == 0 )
I think you need to distinguish p2m entries which are r/o due to log
dirty from those which are r/o due to wanting them to be r/o.

Maybe we don''t have the latter right now? We will eventually need p2m
types I think.
> +    {
> +        lpae_t pte = *vlp2m_pte;
> +        pte.p2m.write = 1;
> +        write_pte(vlp2m_pte, pte);
> +        flush_tlb_local();
> +
> +        /* in order to remove mappings in cleanup stage */
> +        add_mapped_vaddr(d, addr);
No locks or atomic operations here? How are races with the tools reading
the dirty bitmap addressed?  If it is by clever ordering of the checks
and pte writes then I think a comment would be in order.
> +    }
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 2d09fef..3b2b11a 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -6,6 +6,8 @@
>  #include <xen/bitops.h>
>  #include <asm/flushtlb.h>
>  #include <asm/gic.h>
> +#include <xen/guest_access.h>
> +#include <xen/pfn.h>
>  
>  void dump_p2m_lookup(struct domain *d, paddr_t addr)
>  {
> @@ -408,6 +410,304 @@ unsigned long gmfn_to_mfn(struct domain *d, unsigned
long gpfn)
>      return p >> PAGE_SHIFT;
>  }
>  
> +#define MAX_VA_PER_NODE (PAGE_SIZE - 2 * sizeof(void *) - sizeof(int))\
> +                        / sizeof(unsigned long)
> +
> +/* An array-based linked list for storing virtual addresses (i.e., the 3rd
> + * level table mapping) that should be destroyed after live migration */
> +struct mapped_va_node
> +{
> +    struct page_info *next;
> +    struct mapped_va_node *mvn_next;
> +    int items;
> +    unsigned long vaddrs[MAX_VA_PER_NODE];
> +};
> +
> +int add_mapped_vaddr(struct domain *d, unsigned long va)
> +{
This function seems awefuly expensive (contains allocations etc) for a
function called on every page fault during a migration.

Why are you remembering the full va of every fault when a single bit per
page would do?

I think you can allocate a bitmap when logdirty is enabled. At a bit per
page it shouldn''t be too huge.

You might be able to encode this in the p2m which you walk when the
tools ask for the bit map (i.e. p2m.type==writeable p2m.write==0 =>
dirty), but I think a bitmap would do the trick and be easier to
implement.
> +    struct page_info *page_head;
> +    struct mapped_va_node *mvn_head;
> +
> +    spin_lock(&d->arch.dirty.lock);
> +
> +    page_head = d->arch.dirty.head;
> +    mvn_head = d->arch.dirty.mvn_head;
> +    if ( !mvn_head )
> +    {
> +        page_head = alloc_domheap_page(NULL, 0);
> +        if ( !page_head )
> +        {
> +            spin_unlock(&d->arch.dirty.lock);
> +            return -ENOMEM;
> +        }
> +
> +        mvn_head = map_domain_page_global(__page_to_mfn(page_head));
> +        mvn_head->items = 0;
> +        mvn_head->next = NULL;
> +        mvn_head->mvn_next = NULL;
> +        d->arch.dirty.head = page_head;
> +        d->arch.dirty.mvn_head = mvn_head;
> +    }
> +
> +    if ( mvn_head->items == MAX_VA_PER_NODE )
> +    {
> +        struct page_info *page;
> +
> +        page = alloc_domheap_page(NULL, 0);
> +        if ( !page )
> +        {
> +            spin_unlock(&d->arch.dirty.lock);
> +            return -ENOMEM;
> +        }
> +
> +        mvn_head = map_domain_page_global(__page_to_mfn(page));
> +        mvn_head->items = 0;
> +        mvn_head->next = d->arch.dirty.head;
> +        mvn_head->mvn_next = d->arch.dirty.mvn_head;
> +
> +        d->arch.dirty.mvn_head = mvn_head;
> +        d->arch.dirty.head = page;
> +    }
> +
> +    mvn_head->vaddrs[mvn_head->items] = va;
> +    mvn_head->items ++;
> +
> +    spin_unlock(&d->arch.dirty.lock);
> +    return 0;
> +}
> +
> +/* get the dirty bitmap from the linked list stored by add_mapped_vaddr
> + * and also clear the mapped vaddrs linked list */
> +static void get_dirty_bitmap(struct domain *d, uint8_t *bitmap[])
> +{
> +    struct page_info *page_head;
> +    struct mapped_va_node *mvn_head;
> +    struct page_info *tmp_page;
> +    struct mapped_va_node *tmp_mvn;
> +    vaddr_t gma_start = 0;
> +    vaddr_t gma_end = 0;
> +
> +    /* this hypercall is called from domain 0, and we don''t know
which guest''s
> +     * vlpt is mapped in xen_second, so, to be sure, we restore vlpt here
*/
> +    restore_vlpt(d);
AFAICT you don''t actually use the vlpt here, just the domains list of
dirty addresses (which as above could become a bitmap)
> +
> +    spin_lock(&d->arch.dirty.lock);
> +
> +    page_head = d->arch.dirty.head;
> +    mvn_head = d->arch.dirty.mvn_head;
> +    get_gma_start_end(d, &gma_start, &gma_end);
> +
> +    while ( page_head )
> +    {
> +        int i;
> +
> +        for ( i = 0; i < mvn_head->items; ++i )
> +        {
> +            unsigned int bit_offset;
> +            int bitmap_index, bitmap_offset;
> +            lpae_t *vlp2m_pte;
> +
> +            bit_offset = third_linear_offset(mvn_head->vaddrs[i] -
gma_start);
> +            bitmap_index = bit_offset >> (PAGE_SHIFT + 3);
> +            bitmap_offset = bit_offset & ((1ul << (PAGE_SHIFT +
3)) - 1);
> +
> +            __set_bit(bitmap_offset, bitmap[bitmap_index]);
> +
> +            vlp2m_pte = get_vlpt_3lvl_pte(mvn_head->vaddrs[i]);
> +            vlp2m_pte->p2m.write = 0;
> +        }
> +
> +        tmp_page = page_head;
> +        page_head = mvn_head->next;
> +
> +        tmp_mvn = mvn_head;
> +        mvn_head = mvn_head->mvn_next;
> +
> +        unmap_domain_page_global(tmp_mvn);
> +        free_domheap_page(tmp_page);
> +    }
> +
> +    d->arch.dirty.head = NULL;
> +    d->arch.dirty.mvn_head = NULL;
> +
> +    spin_unlock(&d->arch.dirty.lock);
> +}
> +
> +
> +/* Change types across all p2m entries in a domain */
> +static void p2m_change_entry_type_global(struct domain *d, enum mg nt)
> +{
Stefano had some patches to generalise create_p2m_entries() into a p2m
walker infrastructure in his series to add XENMEM_exchange_and_pin. That
series turned out to be unnecessary but it could be resurrected for use
here rather than recoding a new one.
> +    struct p2m_domain *p2m = &d->arch.p2m;
> +    vaddr_t ram_base;
> +    int i1, i2, i3;
> +    int first_index, second_index, third_index;
> +    lpae_t *first = __map_domain_page(p2m->first_level);
> +    lpae_t pte, *second = NULL, *third = NULL;
> +
> +    get_gma_start_end(d, &ram_base, NULL);
> +
> +    first_index = first_table_offset((uint64_t)ram_base);
> +    second_index = second_table_offset((uint64_t)ram_base);
> +    third_index = third_table_offset((uint64_t)ram_base);
> +
> +    BUG_ON( !first && "Can''t map first level
p2m." );
> +
> +    spin_lock(&p2m->lock);
> +
> +    for ( i1 = first_index; i1 < LPAE_ENTRIES*2; ++i1 )
> +    {
> +        lpae_walk_t first_pte = first[i1].walk;
> +        if ( !first_pte.valid || !first_pte.table )
> +            goto out;
> +
> +        second = map_domain_page(first_pte.base);
> +        BUG_ON( !second && "Can''t map second level
p2m.");
> +        for ( i2 = second_index; i2 < LPAE_ENTRIES; ++i2 )
> +        {
> +            lpae_walk_t second_pte = second[i2].walk;
> +
> +            /* a nasty hack for vlpt due to the difference
> +             * of page table entry layout between p2m and pt */
> +            second[i2].pt.ro = 0;
What is the hack here?

The issue is that the p2m second level, which is a table entry in the
p2m is installed into the Xen page tables where it ends up the third
level, treated as a block entry.

That''s OK, because for both PT and P2M bits 2..10 are ignored for table
entries. So I think rather than this hack here we should simply ensure
that our non-leaf p2m''s have the correct bits in them to be used as p2m
table entries.
> +
> +            if ( !second_pte.valid || !second_pte.table )
> +                goto out;
> +
> +            third = map_domain_page(second_pte.base);
> +            BUG_ON( !third && "Can''t map third level
p2m.");
> +
> +            for ( i3 = third_index; i3 < LPAE_ENTRIES; ++i3 )
> +            {
> +                lpae_walk_t third_pte = third[i3].walk;
> +                int write = 0;
> +                if ( !third_pte.valid )
> +                    goto out;
> +
> +                pte = third[i3];
> +                if ( pte.p2m.write == 1 && nt == mg_ro )
> +                {
> +                    pte.p2m.write = 0;
> +                    write = 1;
> +                }
> +                else if ( pte.p2m.write == 0 && nt == mg_rw )
> +                {
> +                    pte.p2m.write = 1;
> +                    write = 1;
> +                }
> +                if ( write )
> +                    write_pte(&third[i3], pte);
> +            }
> +            unmap_domain_page(third);
> +
> +            third = NULL;
> +            third_index = 0;
> +        }
> +        unmap_domain_page(second);
> +
> +        second = NULL;
> +        second_index = 0;
> +        third_index = 0;
> +    }
> +
> +out:
> +    flush_tlb_all_local();
> +    if ( third ) unmap_domain_page(third);
> +    if ( second ) unmap_domain_page(second);
> +    if ( first ) unmap_domain_page(first);
> +
> +    spin_unlock(&p2m->lock);
> +}
> +
> +/* Read a domain''s log-dirty bitmap and stats.
> + * If the operation is a CLEAN, clear the bitmap and stats. */
> +int log_dirty_op(struct domain *d, xen_domctl_shadow_op_t *sc)
Can this not be static?
> +{
> +    unsigned long gmfn_start;
> +    unsigned long gmfn_end;
> +    unsigned long gmfns;
> +    unsigned int bitmap_pages;
> +    int rc = 0, clean = 0, peek = 1;
> +    uint8_t *bitmap[256]; /* bitmap[256] covers 32GB ram */
> +    int i;
> +
> +    BUG_ON( !d->arch.map_domain.nr_banks );
> +
> +    gmfn_start = d->arch.map_domain.bank[0].start >> PAGE_SHIFT;
> +    gmfn_end = domain_get_maximum_gpfn(d);
> +    gmfns = gmfn_end - gmfn_start;
> +    bitmap_pages = PFN_UP((gmfns + 7) / 8);
> +
> +    if ( guest_handle_is_null(sc->dirty_bitmap) )
> +    {
> +        peek = 0;
> +    }
> +    else
> +    {
> +        /* mapping to the bitmap from guest param */
> +        vaddr_t to = (vaddr_t)sc->dirty_bitmap.p;
This is not allowed, please use the guest handle interfaces and
copy_(to/from)_(user/guest) instead of diving into the guest handle
struct and open coding it yourself.

This might mean creating a temporary bitmap in hypervisor space, but if
you maintain the bitmap across the whole dirty period as suggested then
you should be able to simply copy it out.

Actually, for consistency you might need an atomic copy and clear
mechanism (otherwise you can perhaps loose updates), in which case
you''ll need a temporary buffer.
> +
> +        BUG_ON( to & ~PAGE_MASK && "offset not aligned to
PAGE SIZE");
> +
> +        for ( i = 0; i < bitmap_pages; ++i )
> +        {
> +            paddr_t g;
> +            rc = gvirt_to_maddr(to, &g);
> +            if ( rc )
> +                return rc;
> +            bitmap[i] = map_domain_page(g>>PAGE_SHIFT);
> +            memset(bitmap[i], 0x00, PAGE_SIZE);
> +            to += PAGE_SIZE;
> +        }
> +    }
> +
> +    clean = (sc->op == XEN_DOMCTL_SHADOW_OP_CLEAN);
> +    sc->stats.dirty_count = d->arch.dirty.count;
> +
> +    get_dirty_bitmap(d, bitmap);
> +    if ( peek )
> +    {
> +        for ( i = 0; i < bitmap_pages; ++i )
> +        {
> +            unmap_domain_page(bitmap[i]);
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +long dirty_mode_op(struct domain *d, xen_domctl_shadow_op_t *sc)
> +{
> +    long ret = 0;
> +    switch (sc->op)
> +    {
> +        case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
> +        case XEN_DOMCTL_SHADOW_OP_OFF:
> +        {
> +            enum mg nt = sc->op == XEN_DOMCTL_SHADOW_OP_OFF ? mg_rw :
mg_ro;
> +
> +            d->arch.dirty.mode = sc->op == XEN_DOMCTL_SHADOW_OP_OFF
? 0 : 1;
> +            p2m_change_entry_type_global(d, nt);
> +
> +            if ( sc->op == XEN_DOMCTL_SHADOW_OP_OFF )
> +                cleanup_vlpt(d);
> +            else
> +                prepare_vlpt(d);
> +        }
> +        break;
> +
> +        case XEN_DOMCTL_SHADOW_OP_CLEAN:
> +        case XEN_DOMCTL_SHADOW_OP_PEEK:
> +        {
> +            ret = log_dirty_op(d, sc);
> +        }
> +        break;
> +
> +        default:
> +            return -ENOSYS;
> +    }
> +    return ret;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 4c0fc32..3b78ed2 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1313,6 +1313,8 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>      const char *msg;
>      int rc, level = -1;
>      mmio_info_t info;
> +    int page_fault = ((dabt.dfsc & FSC_MASK) => +                  
(FSC_FLT_PERM + FSC_3D_LEVEL) && dabt.write);
>  
>      if ( !check_conditional_instr(regs, hsr) )
>      {
> @@ -1327,22 +1329,23 @@ static void do_trap_data_abort_guest(struct
cpu_user_regs *regs,
>      info.gva = READ_SYSREG64(FAR_EL2);
>  #endif
>  
> -    if (dabt.s1ptw)
> +    if ( dabt.s1ptw && !page_fault )
>          goto bad_data_abort;
I see this has been commented on elsewhere.

Ian.

Ian Campbell

2013-Oct-16 13:55 UTC

head link

Re: [PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:>  TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
xenlockprof xenwatchdogd xencov
>  TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash xen-lowmemd
xen-mfndump
> +ifeq ($(CONFIG_X86),y)
>  TARGETS-$(CONFIG_MIGRATE) += xen-hptool
> +endif
I know this isn''t your fault but I cannot imagine what xen-hptool
(memory hotplug tool) has to do with CONFIG_MIGRATE.

If it doesn''t build for ARM better to make it use TARGETS-$(CONFIG_X86)
I think.
>  TARGETS := $(TARGETS-y)
>  
>  SUBDIRS := $(SUBDIRS-y)
> @@ -23,7 +25,9 @@ INSTALL_BIN := $(INSTALL_BIN-y)
>  INSTALL_SBIN-y := xen-bugtool xen-python-path xenperf xenpm
xen-tmem-list-parse gtraceview \
>  	gtracestat xenlockprof xenwatchdogd xen-ringwatch xencov
>  INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx xen-hvmcrash xen-lowmemd
xen-mfndump
> +ifeq ($(CONFIG_X86),y)
>  INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
> +endif
>  INSTALL_SBIN := $(INSTALL_SBIN-y)
>  
>  INSTALL_PRIVBIN-y := xenpvnetboot

Jaeyong Yoo

2013-Oct-17 02:14 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Jaeyong Yoo
> Sent: Monday, October 14, 2013 1:39 PM
> To: ''Ian Campbell''; ''Tim Deegan''
> Cc: ''Stefano Stabellini''; ''Keir
Fraser''; ''Jan Beulich''; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers for
> saving and restoring vcpu registers
> 
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > Sent: Friday, October 11, 2013 7:26 PM
> > To: Tim Deegan
> > Cc: Keir Fraser; Stefano Stabellini; Jan Beulich; Jaeyong Yoo; xen-
> > devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers
> > for saving and restoring vcpu registers
> >
> > On Fri, 2013-10-11 at 11:22 +0100, Tim Deegan wrote:
> > > At 09:43 +0100 on 11 Oct (1381484614), Ian Campbell wrote:
> > > > On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:
> > > > > > -----Original Message-----
> > > > > > From: xen-devel-bounces@lists.xen.org
[mailto:xen-devel-
> > > > > > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > > > > > Sent: Thursday, October 10, 2013 7:41 PM
> > > > > > To: Jaeyong Yoo
> > > > > > Cc: Tim Deegan; xen-devel@lists.xen.org
> > > > > > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm:
Add more
> > > > > > registers for saving and restoring vcpu registers
> > > > > >
> > > > > > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo
wrote:
> > > > > > > diff --git a/xen/include/public/arch-arm.h
> > > > > > > b/xen/include/public/arch-arm.h index
5d359af..bf6dc9a
> > > > > > > 100644
> > > > > > > --- a/xen/include/public/arch-arm.h
> > > > > > > +++ b/xen/include/public/arch-arm.h
> > > > > > > @@ -253,6 +253,41 @@ struct
vcpu_guest_context {
> > > > > > >
> > > > > > >      uint32_t sctlr, ttbcr;
> > > > > > >      uint64_t ttbr0, ttbr1;
> > > > > > > +    uint32_t ifar, dfar;
> > > > > > > +    uint32_t ifsr, dfsr;
> > > > > > > +    uint32_t dacr;
> > > > > > > +    uint64_t par;
> > > > > > > +
> > > > > > > +#ifdef CONFIG_ARM_32
> > > > > >
> > > > > > I''m afraid a per arch ifdef
isn''t allowed in the
> > > > > > include/public
> > tree.
> > > > > > The interface should be identical for both 32 and
64 bit
> > > > > > callers. Also think of 32-on-64 guests etc.
> > > > > >
> > > > > > Also, this struct is guest facing (via
VCPUOP_initialise) but
> > > > > > many/all of these new registers are not things
which a guest
> > > > > > needs to specify via a hypercall. IOW I think many
of them
> > > > > > should be part of some toolstack private
save/restore interface.
> > > > >
> > > > > I see, the guest can specify something like sctlr, and
> > > > > ttbr/ttbcr, and the others should be hidden inside hvm
> save/restore.
> > > >
> > > > Right, the important thing is that all that additional state
is
> > > > only visible to the toolstack and the hypervisor, not to
guests.
> > > >
> > > > Actually the guest shouldn''t really see this
interface anyway,
> > > > that''s really a hold over from x86. On ARM only the
toolstack
> > > > really needs to use this struct.
> > > >
> > > > I wonder if we can drop struct vcpu_guest_context from the
guest
> > > > facing ABI on ARM. I see that we already don''t
expose
> > > > VCPUOP_initialise and the only other user is
> > XEN_DOMCTL_(sg)etvcpucontext.
> > >
> > > Does the guest need to have those on ARM? How are you arranging
SMP
> > > guest AP bringup?  If it''s using SCI then maybe that
hypercall
> > > interface can be dropped.
> >
> > SMP bring up is done via PSCI, which is the firmware
"hypercall"
> > interface, defined by ARM for bringing up physucal CPUs which we
> > implement within Xen for the guests'' benefit.
> 
> So, could I remove XEN_DOMCTL_(sg)etvcpucontext and use HVM save/restore
> for migrating vcpu registers?
> 
> And, if I could, arch_get_info_guest becomes dangling function and would
> it be better to remove this function too? or better to keep it for
> symmetry?
Are these questions missed?
> 
> Jaeyong
> 
> >
> > Ian
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Jaeyong Yoo

2013-Oct-17 08:58 UTC

head link

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

> -----Original Message-----
> From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
> Sent: Wednesday, October 16, 2013 9:51 PM
> To: Jaeyong Yoo
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 7/9] xen/arm: Implement virtual-linear
> page table for guest p2m mapping in live migration
> 
> On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:
> > Implement vlpt (virtual-linear page table) for fast accessing of 3rd
> > PTE of guest p2m. When creating a mapping for vlpt, just copy the 1st
> > level PTE of guest p2m to the xen''s 2nd level PTE. Then the
mapping
> > becomes the following:
> >     xen''s 1st PTE -->
> >     xen''s 2nd PTE (which is the same as 1st PTE of guest p2m)
-->
> >     guest p2m''s 2nd PTE -->
> >     guest p2m''s 3rd PTE (the memory contents where the vlpt
points)
> > For more info about vlpt, see:
> > http://www.technovelty.org/linux/virtual-linear-page-table.html
> >
> > This function is used in dirty-page tracing: when domU write-fault is
> > trapped by xen, xen can immediately locate the 3rd PTE of guest p2m.
> 
> ISTR you had some numbers which backed up the choice of this solution.
> Can you either include them here or link to the list archives?
I think link to the list is OK. The full text is too long, I guess.
> 
> > Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
> > ---
> >  xen/arch/arm/mm.c            | 78
> ++++++++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-arm/config.h |  6 ++++  xen/include/asm-arm/domain.h
> > |  7 ++++
> >  xen/include/asm-arm/mm.h     | 22 +++++++++++++
> >  4 files changed, 113 insertions(+)
> >
> > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index
> > 120eae8..98edbc9 100644
> > --- a/xen/arch/arm/mm.c
> > +++ b/xen/arch/arm/mm.c
> > @@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
> >          return 1;
> >      return 0;
> >  }
> > +
> > +/* flush the vlpt area */
> > +static void flush_vlpt(struct domain *d) {
> > +    int flush_size;
> > +    flush_size = (d->arch.dirty.second_lvl_end -
> > +                  d->arch.dirty.second_lvl_start) <<
SECOND_SHIFT;
> > +    /* flushing the 3rd level mapping */
> > +    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
> > +                                flush_size);
> 
> Shouldn''t the base here be VIRT_LIN_P2M_START +
> (d->arch.dirty.second_lvl_start) << SECOND_SHIFT) or something
like that?
Yes, right. It will also decrease the flush overhead.
> 
> flush_xen_data_tlb_range_va just turns into a loop over the addresses, so
> you might find you may as well do the flushes as you update the ptes in
> the below, perhaps with an optimisation to pull the barriers outside that
> loop.
You mean the barriers in write_pte? OK. It looks better.
> 
> > +}
> > +
> > +/* restore the xen page table for vlpt mapping for domain d */ void
> > +restore_vlpt(struct domain *d) {
> > +    int i, j = 0;
> > +    int need_flush = 0;
> > +
> > +    for ( i = d->arch.dirty.second_lvl_start;
> > +          i < d->arch.dirty.second_lvl_end;
> > +          ++i, ++j )
> > +    {
> > +        if ( xen_second[i].bits !=
d->arch.dirty.second_lvl[j].bits )
> > +        {
> > +            write_pte(&xen_second[i],
d->arch.dirty.second_lvl[j]);
> > +            need_flush = 1;
> > +        }
> > +    }
> > +
> > +    if ( need_flush ) flush_vlpt(d);
> > +}
> > +
> > +/* setting up the xen page table for vlpt mapping for domain d */
> > +void prepare_vlpt(struct domain *d) {
> > +    int xen_second_linear_base;
> > +    int gp2m_start_index, gp2m_end_index;
> > +    struct p2m_domain *p2m = &d->arch.p2m;
> > +    struct page_info *second_lvl_page;
> > +    vaddr_t gma_start = 0;
> > +    vaddr_t gma_end = 0;
> > +    lpae_t *first;
> > +    int i, j;
> > +
> > +    xen_second_linear_base =
second_linear_offset(VIRT_LIN_P2M_START);
> > +    get_gma_start_end(d, &gma_start, &gma_end);
> > +
> > +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> > +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> > +
> > +    second_lvl_page = alloc_domheap_page(NULL, 0);
> 
> The p2m first is two concatenated pages with a total of 1024 entries
> (which is needed to give the full 40 bit IPA space). I have a feeling this
> means you need two pages here?
> 
> Or maybe we shouldn''t be supporting the full 40 bit addresses on
32-bit?
For generality, I think it is better to support it. But, honestly, I''m
not
sure how many people would use 40 bit ARM guest.
> 
> NB currently create_p2m_entries doesn''t actually support 40 bits,
but I
> need to fix that for an arm64 platform I''m looking at.
> 
> > +    d->arch.dirty.second_lvl = map_domain_page_global(
> > +                                 page_to_mfn(second_lvl_page) );
> > +
> > +    first = __map_domain_page(p2m->first_level);
> > +    for ( i = gp2m_start_index, j = 0; i < gp2m_end_index; ++i,
++j )
> > +    {
> > +        write_pte(&xen_second[xen_second_linear_base+i],
> > +                  first[i]);
> > +
> > +        /* we copy the mapping into domain''s structure as a
reference in
> case
> > +         * of the context switch (used in restore_vlpt) */
> > +        d->arch.dirty.second_lvl[j] = first[i];
> > +    }
> > +    unmap_domain_page(first);
> > +
> > +    /* storing the start and end index */
> > +    d->arch.dirty.second_lvl_start = xen_second_linear_base +
> gp2m_start_index;
> > +    d->arch.dirty.second_lvl_end = xen_second_linear_base +
> > + gp2m_end_index;
> > +
> > +    flush_vlpt(d);
> > +}
> > +
> > +void cleanup_vlpt(struct domain *d)
> > +{
> > +    unmap_domain_page_global(d->arch.dirty.second_lvl);
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/include/asm-arm/config.h
> > b/xen/include/asm-arm/config.h index 9e395c2..47fc26e 100644
> > --- a/xen/include/asm-arm/config.h
> > +++ b/xen/include/asm-arm/config.h
> > @@ -87,6 +87,7 @@
> >   *   0  -   8M   <COMMON>
> >   *
> >   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
> > + * 128M - 256M   Virtual-linear mapping to P2M table
> >   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual
address
> >   *                    space
> >   *
> > @@ -132,6 +133,9 @@
> >
> >  #define VMAP_VIRT_END    XENHEAP_VIRT_START
> >
> > +#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
> 
> Can you put this up in the list of addresses in order please.
> 
> > +#define VIRT_LIN_P2M_END       VMAP_VIRT_START
> 
> > +
> >  #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */
> >
> >  /* Number of domheap pagetable pages required at the second level
> > (2MB mappings) */ @@ -158,6 +162,8 @@
> >
> >  #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
> >
> > +/*TODO (ARM_64): define VIRT_LIN_P2M_START VIRT_LIN_P2M_END */
> > +
> >  #endif
> >
> >  /* Fixmap slots */
> > diff --git a/xen/include/asm-arm/domain.h
> > b/xen/include/asm-arm/domain.h index 755c7af..7e7743a 100644
> > --- a/xen/include/asm-arm/domain.h
> > +++ b/xen/include/asm-arm/domain.h
> > @@ -112,6 +112,13 @@ struct arch_domain
> >          spinlock_t                  lock;
> >      } vuart;
> >
> > +    /* dirty-page tracing */
> > +    struct {
> > +        int second_lvl_start;            /* for context switch */
> > +        int second_lvl_end;
> > +        lpae_t *second_lvl;
> > +    } dirty;
> > +
> >      struct dt_mem_info map_domain;
> >      spinlock_t         map_lock;
> >  }  __cacheline_aligned;
> > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h index
> > 021e8d6..b9cbcf2 100644
> > --- a/xen/include/asm-arm/mm.h
> > +++ b/xen/include/asm-arm/mm.h
> > @@ -344,6 +344,28 @@ static inline void put_page_and_type(struct
> > page_info *page)
> >
> >  void get_gma_start_end(struct domain *d, vaddr_t *start, vaddr_t
> > *end);
> >
> > +/* vlpt (virtual linear page table) related functions */ void
> > +prepare_vlpt(struct domain *d); void cleanup_vlpt(struct domain *d);
> > +void restore_vlpt(struct domain *d);
> > +
> > +/* calculate the xen''s virtual address for accessing the
leaf PTE of
> > + * a given address (GPA) */
> > +static inline lpae_t * get_vlpt_3lvl_pte(vaddr_t addr)
> 
> Argument is an IPA and should therefore be a paddr_t I think?
Oh, right, and the type in traps.c is also paddr_t. It is my mistake.
> 
> > +{
> > +    unsigned long ipa_third = third_table_offset(addr);
> > +    unsigned long ipa_second = second_table_offset(addr);
> > +    unsigned long ipa_first = first_table_offset(addr);
> > +
> > +    /* guest p2m''s first level index is the index of
xen''s second level,
> > +       and guest p2m''s second level index is the index of
xen''s third
> level */
> > +    return (lpae_t *)(
> > +        ( (second_linear_offset(VIRT_LIN_P2M_START) + ipa_first)
> > +           << SECOND_SHIFT) +
> > +        (ipa_second << THIRD_SHIFT) +
> > +        (ipa_third << 3) );
> > +}
> 
> Can we assign the table base to an "lpae_t *table" and then use
> &table[ipa_third] rather than hardcoding << 3?
OK.
> 
> I can''t quite shake the feeling that the
"ipa_first<<SECOND_SHIFT" and
> "ipa_second<<THIRD_SHIFT" elements of this could be
achieved with a single
> shift of the original address (and maybe a mask).
Right, just one shift with LPAE_SHIFT on the addr is sufficient. I''m
surprise
you figure this out immediately.
> 
> > +
> >  #endif /*  __ARCH_ARM_MM__ */
> >  /*
> >   * Local variables:

Ian Campbell

2013-Oct-17 10:01 UTC

head link

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

On Thu, 2013-10-17 at 11:14 +0900, Jaeyong Yoo wrote:> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Jaeyong Yoo
> > Sent: Monday, October 14, 2013 1:39 PM
> > To: ''Ian Campbell''; ''Tim Deegan''
> > Cc: ''Stefano Stabellini''; ''Keir
Fraser''; ''Jan Beulich''; xen-
> > devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more registers
for
> > saving and restoring vcpu registers
> > 
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > bounces@lists.xen.org] On Behalf Of Ian Campbell
> > > Sent: Friday, October 11, 2013 7:26 PM
> > > To: Tim Deegan
> > > Cc: Keir Fraser; Stefano Stabellini; Jan Beulich; Jaeyong Yoo;
xen-
> > > devel@lists.xen.org
> > > Subject: Re: [Xen-devel] [PATCH v4 2/9] xen/arm: Add more
registers
> > > for saving and restoring vcpu registers
> > >
> > > On Fri, 2013-10-11 at 11:22 +0100, Tim Deegan wrote:
> > > > At 09:43 +0100 on 11 Oct (1381484614), Ian Campbell wrote:
> > > > > On Fri, 2013-10-11 at 17:30 +0900, Jaeyong Yoo wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: xen-devel-bounces@lists.xen.org
[mailto:xen-devel-
> > > > > > > bounces@lists.xen.org] On Behalf Of Ian
Campbell
> > > > > > > Sent: Thursday, October 10, 2013 7:41 PM
> > > > > > > To: Jaeyong Yoo
> > > > > > > Cc: Tim Deegan; xen-devel@lists.xen.org
> > > > > > > Subject: Re: [Xen-devel] [PATCH v4 2/9]
xen/arm: Add more
> > > > > > > registers for saving and restoring vcpu
registers
> > > > > > >
> > > > > > > On Fri, 2013-10-04 at 13:43 +0900, Jaeyong
Yoo wrote:
> > > > > > > > diff --git
a/xen/include/public/arch-arm.h
> > > > > > > > b/xen/include/public/arch-arm.h index
5d359af..bf6dc9a
> > > > > > > > 100644
> > > > > > > > --- a/xen/include/public/arch-arm.h
> > > > > > > > +++ b/xen/include/public/arch-arm.h
> > > > > > > > @@ -253,6 +253,41 @@ struct
vcpu_guest_context {
> > > > > > > >
> > > > > > > >      uint32_t sctlr, ttbcr;
> > > > > > > >      uint64_t ttbr0, ttbr1;
> > > > > > > > +    uint32_t ifar, dfar;
> > > > > > > > +    uint32_t ifsr, dfsr;
> > > > > > > > +    uint32_t dacr;
> > > > > > > > +    uint64_t par;
> > > > > > > > +
> > > > > > > > +#ifdef CONFIG_ARM_32
> > > > > > >
> > > > > > > I''m afraid a per arch ifdef
isn''t allowed in the
> > > > > > > include/public
> > > tree.
> > > > > > > The interface should be identical for both 32
and 64 bit
> > > > > > > callers. Also think of 32-on-64 guests etc.
> > > > > > >
> > > > > > > Also, this struct is guest facing (via
VCPUOP_initialise) but
> > > > > > > many/all of these new registers are not
things which a guest
> > > > > > > needs to specify via a hypercall. IOW I think
many of them
> > > > > > > should be part of some toolstack private
save/restore interface.
> > > > > >
> > > > > > I see, the guest can specify something like sctlr,
and
> > > > > > ttbr/ttbcr, and the others should be hidden inside
hvm
> > save/restore.
> > > > >
> > > > > Right, the important thing is that all that additional
state is
> > > > > only visible to the toolstack and the hypervisor, not
to guests.
> > > > >
> > > > > Actually the guest shouldn''t really see this
interface anyway,
> > > > > that''s really a hold over from x86. On ARM
only the toolstack
> > > > > really needs to use this struct.
> > > > >
> > > > > I wonder if we can drop struct vcpu_guest_context from
the guest
> > > > > facing ABI on ARM. I see that we already don''t
expose
> > > > > VCPUOP_initialise and the only other user is
> > > XEN_DOMCTL_(sg)etvcpucontext.
> > > >
> > > > Does the guest need to have those on ARM? How are you
arranging SMP
> > > > guest AP bringup?  If it''s using SCI then maybe
that hypercall
> > > > interface can be dropped.
> > >
> > > SMP bring up is done via PSCI, which is the firmware
"hypercall"
> > > interface, defined by ARM for bringing up physucal CPUs which we
> > > implement within Xen for the guests'' benefit.
> > 
> > So, could I remove XEN_DOMCTL_(sg)etvcpucontext and use HVM
save/restore
> > for migrating vcpu registers?
> > 
> > And, if I could, arch_get_info_guest becomes dangling function and
would
> > it be better to remove this function too? or better to keep it for
> > symmetry?
> 
> Are these questions missed?
Yes, sorry!

In principal we might be able to remove XEN_DOMCTL_(sg)etvcpucontext as
you suggest but I''m not sure to what extent tools are at liberty to
poke
inside HVM save/restore blobs? I suppose they are allowed to and can
ignore compatibility stuff like supporting older version''s layouts?

But to be honest I think it''s probably easier to keep the domctls, so
that common(ish) code like xenctx.c continues to work for dumping basic
processor state to distinguish between platforms which use domctl and
those which use hvmsave.
> 
> > 
> > Jaeyong
> > 
> > >
> > > Ian
> > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > http://lists.xen.org/xen-devel
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
>

Ian Campbell

2013-Oct-17 10:06 UTC

head link

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

On Thu, 2013-10-17 at 17:58 +0900, Jaeyong Yoo wrote:> > > @@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
> > >          return 1;
> > >      return 0;
> > >  }
> > > +
> > > +/* flush the vlpt area */
> > > +static void flush_vlpt(struct domain *d) {
> > > +    int flush_size;
> > > +    flush_size = (d->arch.dirty.second_lvl_end -
> > > +                  d->arch.dirty.second_lvl_start) <<
SECOND_SHIFT;
> > > +    /* flushing the 3rd level mapping */
> > > +    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
> > > +                                flush_size);
> > 
> > Shouldn''t the base here be VIRT_LIN_P2M_START +
> > (d->arch.dirty.second_lvl_start) << SECOND_SHIFT) or
something like that?
> 
> Yes, right. It will also decrease the flush overhead.
> 
> > 
> > flush_xen_data_tlb_range_va just turns into a loop over the addresses,
so
> > you might find you may as well do the flushes as you update the ptes
in
> > the below, perhaps with an optimisation to pull the barriers outside
that
> > loop.
> 
> You mean the barriers in write_pte? OK. It looks better.
I meant the ones in flush_xen_data_tlb_range_va, but perhaps the ones in
write_pte too. I suppose in both cases are __foo variant without the
barriers could be made so we can do
	dsb
	for each page
		__write_pte
		__flush_xen_data....
	dsb
	isb
(or whatever the right barriers are!)
> > > +/* setting up the xen page table for vlpt mapping for domain d
*/
> > > +void prepare_vlpt(struct domain *d) {
> > > +    int xen_second_linear_base;
> > > +    int gp2m_start_index, gp2m_end_index;
> > > +    struct p2m_domain *p2m = &d->arch.p2m;
> > > +    struct page_info *second_lvl_page;
> > > +    vaddr_t gma_start = 0;
> > > +    vaddr_t gma_end = 0;
> > > +    lpae_t *first;
> > > +    int i, j;
> > > +
> > > +    xen_second_linear_base =
second_linear_offset(VIRT_LIN_P2M_START);
> > > +    get_gma_start_end(d, &gma_start, &gma_end);
> > > +
> > > +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> > > +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> > > +
> > > +    second_lvl_page = alloc_domheap_page(NULL, 0);
> > 
> > The p2m first is two concatenated pages with a total of 1024 entries
> > (which is needed to give the full 40 bit IPA space). I have a feeling
this
> > means you need two pages here?
> > 
> > Or maybe we shouldn''t be supporting the full 40 bit addresses
on 32-bit?
> 
> For generality, I think it is better to support it. But, honestly,
I''m not
> sure how many people would use 40 bit ARM guest.
All it takes is one peripheral at a high address, even if you aren''t
using a large amount of RAM etc.

I''ve got a 64-bit system on my desk which has stuff up in the 50 bit
range. It seems less likely to happen on 32-bit though...

Ian.

Jaeyong Yoo

2013-Oct-17 10:12 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

> -----Original Message-----
> From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
> Sent: Wednesday, October 16, 2013 10:45 PM
> To: Jaeyong Yoo
> Cc: xen-devel@lists.xen.org; Stefano Stabellini
> Subject: Re: [Xen-devel] [PATCH v4 8/9] xen/arm: Implement hypercall for
> dirty page tracing
> 
> On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:
> > Add hypercall (shadow op: enable/disable and clean/peek dirted page
> bitmap).
> 
> "dirtied"
> 
> > It consists of two parts: dirty page detecting and saving.
> > For detecting, we setup the guest p2m''s leaf PTE read-only
and
> > whenever the guest tries to write something, permission fault happens
> and traps into xen.
> > The permission-faulted GPA should be saved for the toolstack (when it
> > wants to see which pages are dirted). For this purpose, we temporarily
> > save the GPAs into linked
> 
> "dirtied" again
> 
> > list by using ''add_mapped_vaddr'' function and when
toolstack wants
> > (log_dirty_op function) the GPAs are copied into bitmap and the
linnked
> list is flushed.
> 
> "linked"
Oops, typos.
> 
> >      spin_lock_init(&d->arch.map_lock);
> >      d->arch.map_domain.nr_banks = 0;
> >
> > +    /* init for dirty-page tracing */
> > +    d->arch.dirty.count = 0;
> > +    d->arch.dirty.mode = 0;
> > +    d->arch.dirty.head = NULL;
> > +    d->arch.dirty.mvn_head = NULL;
> > +    spin_lock_init(&d->arch.dirty.lock);
> > +
> > +    d->arch.dirty.second_lvl_start = 0;
> > +    d->arch.dirty.second_lvl_end = 0;
> > +    d->arch.dirty.second_lvl = NULL;
> 
> I think some/all of these belong in the previous patch which added the
> fields.
Oops, right.
> 
> > +
> >      clear_page(d->shared_info);
> >      share_xen_page_with_guest(
> >          virt_to_page(d->shared_info), d, XENSHARE_writable); diff
> > --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c index
> > 78fb322..7edce12 100644
> > --- a/xen/arch/arm/domctl.c
> > +++ b/xen/arch/arm/domctl.c
> > @@ -92,6 +92,19 @@ long arch_do_domctl(struct xen_domctl *domctl,
struct
> domain *d,
> >              xfree(c.data);
> >      }
> >      break;
> > +    case XEN_DOMCTL_shadow_op:
> > +    {
> > +        domain_pause(d);
> > +        ret = dirty_mode_op(d, &domctl->u.shadow_op);
> > +        domain_unpause(d);
> > +
> > +        if ( (&domctl->u.shadow_op)->op ==
XEN_DOMCTL_SHADOW_OP_CLEAN ||
> > +             (&domctl->u.shadow_op)->op ==
XEN_DOMCTL_SHADOW_OP_PEEK
> > + )
> 
> (&domctl->u.shadow_op)->op is just a weird way to write
> domctl->u.shadow_op.op isn''t it?
Right!
> 
> Do you only want to copyback on success or is it always correct?
Oh, I think I missed the else condition here.
> 
> > +        {
> > +            copyback = 1;
> > +        }
> > +    }
> > +    break;
> 
> 
> > @@ -1460,6 +1459,28 @@ void cleanup_vlpt(struct domain *d)
> >      unmap_domain_page_global(d->arch.dirty.second_lvl);
> >  }
> >
> > +/* routine for dirty-page tracing
> > + *
> > + * On first write, it page faults, its entry is changed to
> > +read-write,
> > + * and on retry the write succeeds.
> > + *
> > + * for locating p2m of the faulting entry, we use virtual-linear page
> table.
> > + */
> > +void handle_page_fault(struct domain *d, paddr_t addr) {
> > +    lpae_t *vlp2m_pte = get_vlpt_3lvl_pte(addr);
> > +    if ( vlp2m_pte->p2m.valid && vlp2m_pte->p2m.write
== 0 )
> 
> I think you need to distinguish p2m entries which are r/o due to log dirty
> from those which are r/o due to wanting them to be r/o.
> 
> Maybe we don''t have the latter right now? We will eventually need
p2m
> types I think.
Yes, that should be distinguished. Actually, that was in my mind and apparently
I failed to remember. For doing this, I''m thinking to use un-used field
in pte
as an indicator of ''wanting to be r/o'' and check this
indicator in permission fault.
> 
> > +    {
> > +        lpae_t pte = *vlp2m_pte;
> > +        pte.p2m.write = 1;
> > +        write_pte(vlp2m_pte, pte);
> > +        flush_tlb_local();
> > +
> > +        /* in order to remove mappings in cleanup stage */
> > +        add_mapped_vaddr(d, addr);
> 
> No locks or atomic operations here? How are races with the tools reading
> the dirty bitmap addressed?  If it is by clever ordering of the checks and
> pte writes then I think a comment would be in order.
Actually, the lock is inside the add_mapped_vaddr function. 
> 
> > +    }
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index
> > 2d09fef..3b2b11a 100644
> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -6,6 +6,8 @@
> >  #include <xen/bitops.h>
> >  #include <asm/flushtlb.h>
> >  #include <asm/gic.h>
> > +#include <xen/guest_access.h>
> > +#include <xen/pfn.h>
> >
> >  void dump_p2m_lookup(struct domain *d, paddr_t addr)  { @@ -408,6
> > +410,304 @@ unsigned long gmfn_to_mfn(struct domain *d, unsigned long
> gpfn)
> >      return p >> PAGE_SHIFT;
> >  }
> >
> > +#define MAX_VA_PER_NODE (PAGE_SIZE - 2 * sizeof(void *) -
sizeof(int))\
> > +                        / sizeof(unsigned long)
> > +
> > +/* An array-based linked list for storing virtual addresses (i.e.,
> > +the 3rd
> > + * level table mapping) that should be destroyed after live migration
> > +*/ struct mapped_va_node {
> > +    struct page_info *next;
> > +    struct mapped_va_node *mvn_next;
> > +    int items;
> > +    unsigned long vaddrs[MAX_VA_PER_NODE]; };
> > +
> > +int add_mapped_vaddr(struct domain *d, unsigned long va) {
> 
> This function seems awefuly expensive (contains allocations etc) for a
> function called on every page fault during a migration.
> 
> Why are you remembering the full va of every fault when a single bit per
> page would do?
> 
> I think you can allocate a bitmap when logdirty is enabled. At a bit per
> page it shouldn''t be too huge.
> 
> You might be able to encode this in the p2m which you walk when the tools
> ask for the bit map (i.e. p2m.type==writeable p2m.write==0 => dirty),
but
> I think a bitmap would do the trick and be easier to implement.
There are three candidates:

1) bitmap
2) encoding into p2m
3) linked list of VAs. 

And, two functions for dirty-page tracing:

A) dirty-page handling (At trap)
B) getting dirty bitmap (for toolstack)

1) and 2) have advantage for A) but have to search the full range of pages at B)
to find out which page is dirtied and set the page as read-only for next dirty
detection. On the otherhand, 3) does not need to search the full range at B).
I prefer 3) due to the ''non-full range search'' but I
understand your concern.
Maybe I can do some measurement for each method to see which one better. 
> 
> > +    struct page_info *page_head;
> > +    struct mapped_va_node *mvn_head;
> > +
> > +    spin_lock(&d->arch.dirty.lock);
> > +
> > +    page_head = d->arch.dirty.head;
> > +    mvn_head = d->arch.dirty.mvn_head;
> > +    if ( !mvn_head )
> > +    {
> > +        page_head = alloc_domheap_page(NULL, 0);
> > +        if ( !page_head )
> > +        {
> > +            spin_unlock(&d->arch.dirty.lock);
> > +            return -ENOMEM;
> > +        }
> > +
> > +        mvn_head = map_domain_page_global(__page_to_mfn(page_head));
> > +        mvn_head->items = 0;
> > +        mvn_head->next = NULL;
> > +        mvn_head->mvn_next = NULL;
> > +        d->arch.dirty.head = page_head;
> > +        d->arch.dirty.mvn_head = mvn_head;
> > +    }
> > +
> > +    if ( mvn_head->items == MAX_VA_PER_NODE )
> > +    {
> > +        struct page_info *page;
> > +
> > +        page = alloc_domheap_page(NULL, 0);
> > +        if ( !page )
> > +        {
> > +            spin_unlock(&d->arch.dirty.lock);
> > +            return -ENOMEM;
> > +        }
> > +
> > +        mvn_head = map_domain_page_global(__page_to_mfn(page));
> > +        mvn_head->items = 0;
> > +        mvn_head->next = d->arch.dirty.head;
> > +        mvn_head->mvn_next = d->arch.dirty.mvn_head;
> > +
> > +        d->arch.dirty.mvn_head = mvn_head;
> > +        d->arch.dirty.head = page;
> > +    }
> > +
> > +    mvn_head->vaddrs[mvn_head->items] = va;
> > +    mvn_head->items ++;
> > +
> > +    spin_unlock(&d->arch.dirty.lock);
> > +    return 0;
> > +}
> > +
> > +/* get the dirty bitmap from the linked list stored by
> > +add_mapped_vaddr
> > + * and also clear the mapped vaddrs linked list */ static void
> > +get_dirty_bitmap(struct domain *d, uint8_t *bitmap[]) {
> > +    struct page_info *page_head;
> > +    struct mapped_va_node *mvn_head;
> > +    struct page_info *tmp_page;
> > +    struct mapped_va_node *tmp_mvn;
> > +    vaddr_t gma_start = 0;
> > +    vaddr_t gma_end = 0;
> > +
> > +    /* this hypercall is called from domain 0, and we don''t
know which
> guest''s
> > +     * vlpt is mapped in xen_second, so, to be sure, we restore vlpt
> here */
> > +    restore_vlpt(d);
> 
> AFAICT you don''t actually use the vlpt here, just the domains list
of
> dirty addresses (which as above could become a bitmap)
I think vlpt is still needed because at this stage, we have to reset the 
write permission of dirtied pages. Maybe some tricks for efficiently doing this?
> 
> > +
> > +    spin_lock(&d->arch.dirty.lock);
> > +
> > +    page_head = d->arch.dirty.head;
> > +    mvn_head = d->arch.dirty.mvn_head;
> > +    get_gma_start_end(d, &gma_start, &gma_end);
> > +
> > +    while ( page_head )
> > +    {
> > +        int i;
> > +
> > +        for ( i = 0; i < mvn_head->items; ++i )
> > +        {
> > +            unsigned int bit_offset;
> > +            int bitmap_index, bitmap_offset;
> > +            lpae_t *vlp2m_pte;
> > +
> > +            bit_offset = third_linear_offset(mvn_head->vaddrs[i] -
> gma_start);
> > +            bitmap_index = bit_offset >> (PAGE_SHIFT + 3);
> > +            bitmap_offset = bit_offset & ((1ul <<
(PAGE_SHIFT + 3)) -
> > + 1);
> > +
> > +            __set_bit(bitmap_offset, bitmap[bitmap_index]);
> > +
> > +            vlp2m_pte = get_vlpt_3lvl_pte(mvn_head->vaddrs[i]);
> > +            vlp2m_pte->p2m.write = 0;
> > +        }
> > +
> > +        tmp_page = page_head;
> > +        page_head = mvn_head->next;
> > +
> > +        tmp_mvn = mvn_head;
> > +        mvn_head = mvn_head->mvn_next;
> > +
> > +        unmap_domain_page_global(tmp_mvn);
> > +        free_domheap_page(tmp_page);
> > +    }
> > +
> > +    d->arch.dirty.head = NULL;
> > +    d->arch.dirty.mvn_head = NULL;
> > +
> > +    spin_unlock(&d->arch.dirty.lock); }
> > +
> > +
> > +/* Change types across all p2m entries in a domain */ static void
> > +p2m_change_entry_type_global(struct domain *d, enum mg nt) {
> 
> Stefano had some patches to generalise create_p2m_entries() into a p2m
> walker infrastructure in his series to add XENMEM_exchange_and_pin. That
> series turned out to be unnecessary but it could be resurrected for use
> here rather than recoding a new one.
p2m walker infrastructure sounds interesting. 
> 
> > +    struct p2m_domain *p2m = &d->arch.p2m;
> > +    vaddr_t ram_base;
> > +    int i1, i2, i3;
> > +    int first_index, second_index, third_index;
> > +    lpae_t *first = __map_domain_page(p2m->first_level);
> > +    lpae_t pte, *second = NULL, *third = NULL;
> > +
> > +    get_gma_start_end(d, &ram_base, NULL);
> > +
> > +    first_index = first_table_offset((uint64_t)ram_base);
> > +    second_index = second_table_offset((uint64_t)ram_base);
> > +    third_index = third_table_offset((uint64_t)ram_base);
> > +
> > +    BUG_ON( !first && "Can''t map first level
p2m." );
> > +
> > +    spin_lock(&p2m->lock);
> > +
> > +    for ( i1 = first_index; i1 < LPAE_ENTRIES*2; ++i1 )
> > +    {
> > +        lpae_walk_t first_pte = first[i1].walk;
> > +        if ( !first_pte.valid || !first_pte.table )
> > +            goto out;
> > +
> > +        second = map_domain_page(first_pte.base);
> > +        BUG_ON( !second && "Can''t map second
level p2m.");
> > +        for ( i2 = second_index; i2 < LPAE_ENTRIES; ++i2 )
> > +        {
> > +            lpae_walk_t second_pte = second[i2].walk;
> > +
> > +            /* a nasty hack for vlpt due to the difference
> > +             * of page table entry layout between p2m and pt */
> > +            second[i2].pt.ro = 0;
> 
> What is the hack here?
> 
> The issue is that the p2m second level, which is a table entry in the p2m
> is installed into the Xen page tables where it ends up the third level,
> treated as a block entry.
> 
> That''s OK, because for both PT and P2M bits 2..10 are ignored for
table
> entries. So I think rather than this hack here we should simply ensure
> that our non-leaf p2m''s have the correct bits in them to be used
as p2m
> table entries.
Oh, I see. So, should I put something like this in create_p2m_entries function?
> 
> > +
> > +            if ( !second_pte.valid || !second_pte.table )
> > +                goto out;
> > +
> > +            third = map_domain_page(second_pte.base);
> > +            BUG_ON( !third && "Can''t map third
level p2m.");
> > +
> > +            for ( i3 = third_index; i3 < LPAE_ENTRIES; ++i3 )
> > +            {
> > +                lpae_walk_t third_pte = third[i3].walk;
> > +                int write = 0;
> > +                if ( !third_pte.valid )
> > +                    goto out;
> > +
> > +                pte = third[i3];
> > +                if ( pte.p2m.write == 1 && nt == mg_ro )
> > +                {
> > +                    pte.p2m.write = 0;
> > +                    write = 1;
> > +                }
> > +                else if ( pte.p2m.write == 0 && nt == mg_rw )
> > +                {
> > +                    pte.p2m.write = 1;
> > +                    write = 1;
> > +                }
> > +                if ( write )
> > +                    write_pte(&third[i3], pte);
> > +            }
> > +            unmap_domain_page(third);
> > +
> > +            third = NULL;
> > +            third_index = 0;
> > +        }
> > +        unmap_domain_page(second);
> > +
> > +        second = NULL;
> > +        second_index = 0;
> > +        third_index = 0;
> > +    }
> > +
> > +out:
> > +    flush_tlb_all_local();
> > +    if ( third ) unmap_domain_page(third);
> > +    if ( second ) unmap_domain_page(second);
> > +    if ( first ) unmap_domain_page(first);
> > +
> > +    spin_unlock(&p2m->lock);
> > +}
> > +
> > +/* Read a domain''s log-dirty bitmap and stats.
> > + * If the operation is a CLEAN, clear the bitmap and stats. */ int
> > +log_dirty_op(struct domain *d, xen_domctl_shadow_op_t *sc)
> 
> Can this not be static?
Yes, right.
> 
> > +{
> > +    unsigned long gmfn_start;
> > +    unsigned long gmfn_end;
> > +    unsigned long gmfns;
> > +    unsigned int bitmap_pages;
> > +    int rc = 0, clean = 0, peek = 1;
> > +    uint8_t *bitmap[256]; /* bitmap[256] covers 32GB ram */
> > +    int i;
> > +
> > +    BUG_ON( !d->arch.map_domain.nr_banks );
> > +
> > +    gmfn_start = d->arch.map_domain.bank[0].start >>
PAGE_SHIFT;
> > +    gmfn_end = domain_get_maximum_gpfn(d);
> > +    gmfns = gmfn_end - gmfn_start;
> > +    bitmap_pages = PFN_UP((gmfns + 7) / 8);
> > +
> > +    if ( guest_handle_is_null(sc->dirty_bitmap) )
> > +    {
> > +        peek = 0;
> > +    }
> > +    else
> > +    {
> > +        /* mapping to the bitmap from guest param */
> > +        vaddr_t to = (vaddr_t)sc->dirty_bitmap.p;
> 
> This is not allowed, please use the guest handle interfaces and
> copy_(to/from)_(user/guest) instead of diving into the guest handle struct
> and open coding it yourself.
> 
> This might mean creating a temporary bitmap in hypervisor space, but if
> you maintain the bitmap across the whole dirty period as suggested then
> you should be able to simply copy it out.
Oh, I''m seeing more advantages for using bitmap.
> 
> Actually, for consistency you might need an atomic copy and clear
> mechanism (otherwise you can perhaps loose updates), in which case
you''ll
> need a temporary buffer.
> 
> > +
> > +        BUG_ON( to & ~PAGE_MASK && "offset not
aligned to PAGE
> > + SIZE");
> > +
> > +        for ( i = 0; i < bitmap_pages; ++i )
> > +        {
> > +            paddr_t g;
> > +            rc = gvirt_to_maddr(to, &g);
> > +            if ( rc )
> > +                return rc;
> > +            bitmap[i] = map_domain_page(g>>PAGE_SHIFT);
> > +            memset(bitmap[i], 0x00, PAGE_SIZE);
> > +            to += PAGE_SIZE;
> > +        }
> > +    }
> > +
> > +    clean = (sc->op == XEN_DOMCTL_SHADOW_OP_CLEAN);
> > +    sc->stats.dirty_count = d->arch.dirty.count;
> > +
> > +    get_dirty_bitmap(d, bitmap);
> > +    if ( peek )
> > +    {
> > +        for ( i = 0; i < bitmap_pages; ++i )
> > +        {
> > +            unmap_domain_page(bitmap[i]);
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +long dirty_mode_op(struct domain *d, xen_domctl_shadow_op_t *sc) {
> > +    long ret = 0;
> > +    switch (sc->op)
> > +    {
> > +        case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
> > +        case XEN_DOMCTL_SHADOW_OP_OFF:
> > +        {
> > +            enum mg nt = sc->op == XEN_DOMCTL_SHADOW_OP_OFF ?
mg_rw :
> > +mg_ro;
> > +
> > +            d->arch.dirty.mode = sc->op ==
XEN_DOMCTL_SHADOW_OP_OFF ? 0 :
> 1;
> > +            p2m_change_entry_type_global(d, nt);
> > +
> > +            if ( sc->op == XEN_DOMCTL_SHADOW_OP_OFF )
> > +                cleanup_vlpt(d);
> > +            else
> > +                prepare_vlpt(d);
> > +        }
> > +        break;
> > +
> > +        case XEN_DOMCTL_SHADOW_OP_CLEAN:
> > +        case XEN_DOMCTL_SHADOW_OP_PEEK:
> > +        {
> > +            ret = log_dirty_op(d, sc);
> > +        }
> > +        break;
> > +
> > +        default:
> > +            return -ENOSYS;
> > +    }
> > +    return ret;
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index
> > 4c0fc32..3b78ed2 100644
> > --- a/xen/arch/arm/traps.c
> > +++ b/xen/arch/arm/traps.c
> > @@ -1313,6 +1313,8 @@ static void do_trap_data_abort_guest(struct
> cpu_user_regs *regs,
> >      const char *msg;
> >      int rc, level = -1;
> >      mmio_info_t info;
> > +    int page_fault = ((dabt.dfsc & FSC_MASK) => > +        
(FSC_FLT_PERM + FSC_3D_LEVEL) && dabt.write);
> >
> >      if ( !check_conditional_instr(regs, hsr) )
> >      {
> > @@ -1327,22 +1329,23 @@ static void do_trap_data_abort_guest(struct
> cpu_user_regs *regs,
> >      info.gva = READ_SYSREG64(FAR_EL2);  #endif
> >
> > -    if (dabt.s1ptw)
> > +    if ( dabt.s1ptw && !page_fault )
> >          goto bad_data_abort;
> 
> I see this has been commented on elsewhere.
> 
> Ian.

Jaeyong Yoo

2013-Oct-17 10:14 UTC

head link

Re: [PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: Wednesday, October 16, 2013 10:55 PM
> To: Jaeyong Yoo
> Cc: Alexey Sokolov; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 9/9] xen/arm: Implement toolstack for
> xl restore/save and migrate
> 
> On Fri, 2013-10-04 at 13:44 +0900, Jaeyong Yoo wrote:
> >  TARGETS-y := xenperf xenpm xen-tmem-list-parse gtraceview gtracestat
> > xenlockprof xenwatchdogd xencov
> >  TARGETS-$(CONFIG_X86) += xen-detect xen-hvmctx xen-hvmcrash
> > xen-lowmemd xen-mfndump
> > +ifeq ($(CONFIG_X86),y)
> >  TARGETS-$(CONFIG_MIGRATE) += xen-hptool
> > +endif
> 
> I know this isn''t your fault but I cannot imagine what xen-hptool
(memory
> hotplug tool) has to do with CONFIG_MIGRATE.
> 
> If it doesn''t build for ARM better to make it use
TARGETS-$(CONFIG_X86) I
> think.
That sounds more reasonable. 


Thanks for all the comments of this patch series.
Jaeyong

Jaeyong Yoo

2013-Oct-17 10:25 UTC

head link

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

> -----Original Message-----
> From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
> Sent: Thursday, October 17, 2013 7:06 PM
> To: Jaeyong Yoo
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 7/9] xen/arm: Implement virtual-linear
> page table for guest p2m mapping in live migration
> 
> On Thu, 2013-10-17 at 17:58 +0900, Jaeyong Yoo wrote:
> > > > @@ -1382,6 +1382,84 @@ int is_iomem_page(unsigned long mfn)
> > > >          return 1;
> > > >      return 0;
> > > >  }
> > > > +
> > > > +/* flush the vlpt area */
> > > > +static void flush_vlpt(struct domain *d) {
> > > > +    int flush_size;
> > > > +    flush_size = (d->arch.dirty.second_lvl_end -
> > > > +                  d->arch.dirty.second_lvl_start)
<< SECOND_SHIFT;
> > > > +    /* flushing the 3rd level mapping */
> > > > +    flush_xen_data_tlb_range_va(VIRT_LIN_P2M_START,
> > > > +                                flush_size);
> > >
> > > Shouldn''t the base here be VIRT_LIN_P2M_START +
> > > (d->arch.dirty.second_lvl_start) << SECOND_SHIFT) or
something like
> that?
> >
> > Yes, right. It will also decrease the flush overhead.
> >
> > >
> > > flush_xen_data_tlb_range_va just turns into a loop over the
> > > addresses, so you might find you may as well do the flushes as
you
> > > update the ptes in the below, perhaps with an optimisation to
pull
> > > the barriers outside that loop.
> >
> > You mean the barriers in write_pte? OK. It looks better.
> 
> I meant the ones in flush_xen_data_tlb_range_va, but perhaps the ones in
> write_pte too. I suppose in both cases are __foo variant without the
> barriers could be made so we can do
> 	dsb
> 	for each page
> 		__write_pte
> 		__flush_xen_data....
> 	dsb
> 	isb
> (or whatever the right barriers are!)
> 
Oh, you meant put both write_pte and flushing into one loop.
I got it! No reason for having two loops. 
> > > > +/* setting up the xen page table for vlpt mapping for
domain d */
> > > > +void prepare_vlpt(struct domain *d) {
> > > > +    int xen_second_linear_base;
> > > > +    int gp2m_start_index, gp2m_end_index;
> > > > +    struct p2m_domain *p2m = &d->arch.p2m;
> > > > +    struct page_info *second_lvl_page;
> > > > +    vaddr_t gma_start = 0;
> > > > +    vaddr_t gma_end = 0;
> > > > +    lpae_t *first;
> > > > +    int i, j;
> > > > +
> > > > +    xen_second_linear_base >
second_linear_offset(VIRT_LIN_P2M_START);
> > > > +    get_gma_start_end(d, &gma_start, &gma_end);
> > > > +
> > > > +    gp2m_start_index = gma_start >> FIRST_SHIFT;
> > > > +    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
> > > > +
> > > > +    second_lvl_page = alloc_domheap_page(NULL, 0);
> > >
> > > The p2m first is two concatenated pages with a total of 1024
entries
> > > (which is needed to give the full 40 bit IPA space). I have a
> > > feeling this means you need two pages here?
> > >
> > > Or maybe we shouldn''t be supporting the full 40 bit
addresses on 32-
> bit?
> >
> > For generality, I think it is better to support it. But, honestly,
I''m
> > not sure how many people would use 40 bit ARM guest.
> 
> All it takes is one peripheral at a high address, even if you
aren''t using
> a large amount of RAM etc.
> 
> I''ve got a 64-bit system on my desk which has stuff up in the 50
bit range.
> It seems less likely to happen on 32-bit though...
I think I''d better go for generality.

Jaeyong

Ian Campbell

2013-Oct-17 10:30 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On Thu, 2013-10-17 at 19:12 +0900, Jaeyong Yoo wrote:
> > > @@ -1460,6 +1459,28 @@ void cleanup_vlpt(struct domain *d)
> > >      unmap_domain_page_global(d->arch.dirty.second_lvl);
> > >  }
> > >
> > > +/* routine for dirty-page tracing
> > > + *
> > > + * On first write, it page faults, its entry is changed to
> > > +read-write,
> > > + * and on retry the write succeeds.
> > > + *
> > > + * for locating p2m of the faulting entry, we use virtual-linear
page
> > table.
> > > + */
> > > +void handle_page_fault(struct domain *d, paddr_t addr) {
> > > +    lpae_t *vlp2m_pte = get_vlpt_3lvl_pte(addr);
> > > +    if ( vlp2m_pte->p2m.valid &&
vlp2m_pte->p2m.write == 0 )
> > 
> > I think you need to distinguish p2m entries which are r/o due to log
dirty
> > from those which are r/o due to wanting them to be r/o.
> > 
> > Maybe we don''t have the latter right now? We will eventually
need p2m
> > types I think.
> 
> Yes, that should be distinguished. Actually, that was in my mind and
apparently
> I failed to remember. For doing this, I''m thinking to use un-used
field in pte
> as an indicator of ''wanting to be r/o'' and check this
indicator in permission fault.
Yes, I think we can use the 4 avail bits to store a p2m type. ISTR
discussing this before, but perhaps it wasn''t with you!
> 
> > 
> > > +    {
> > > +        lpae_t pte = *vlp2m_pte;
> > > +        pte.p2m.write = 1;
> > > +        write_pte(vlp2m_pte, pte);
> > > +        flush_tlb_local();
> > > +
> > > +        /* in order to remove mappings in cleanup stage */
> > > +        add_mapped_vaddr(d, addr);
> > 
> > No locks or atomic operations here? How are races with the tools
reading
> > the dirty bitmap addressed?  If it is by clever ordering of the checks
and
> > pte writes then I think a comment would be in order.
> 
> Actually, the lock is inside the add_mapped_vaddr function. 
But that only covers the bitmap update, not the pte frobbing.
> 
> > 
> > > +    }
> > > +}
> > > +
> > >  /*
> > >   * Local variables:
> > >   * mode: C
> > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index
> > > 2d09fef..3b2b11a 100644
> > > --- a/xen/arch/arm/p2m.c
> > > +++ b/xen/arch/arm/p2m.c
> > > @@ -6,6 +6,8 @@
> > >  #include <xen/bitops.h>
> > >  #include <asm/flushtlb.h>
> > >  #include <asm/gic.h>
> > > +#include <xen/guest_access.h>
> > > +#include <xen/pfn.h>
> > >
> > >  void dump_p2m_lookup(struct domain *d, paddr_t addr)  { @@
-408,6
> > > +410,304 @@ unsigned long gmfn_to_mfn(struct domain *d, unsigned
long
> > gpfn)
> > >      return p >> PAGE_SHIFT;
> > >  }
> > >
> > > +#define MAX_VA_PER_NODE (PAGE_SIZE - 2 * sizeof(void *) -
sizeof(int))\
> > > +                        / sizeof(unsigned long)
> > > +
> > > +/* An array-based linked list for storing virtual addresses
(i.e.,
> > > +the 3rd
> > > + * level table mapping) that should be destroyed after live
migration
> > > +*/ struct mapped_va_node {
> > > +    struct page_info *next;
> > > +    struct mapped_va_node *mvn_next;
> > > +    int items;
> > > +    unsigned long vaddrs[MAX_VA_PER_NODE]; };
> > > +
> > > +int add_mapped_vaddr(struct domain *d, unsigned long va) {
> > 
> > This function seems awefuly expensive (contains allocations etc) for a
> > function called on every page fault during a migration.
> > 
> > Why are you remembering the full va of every fault when a single bit
per
> > page would do?
> > 
> > I think you can allocate a bitmap when logdirty is enabled. At a bit
per
> > page it shouldn''t be too huge.
> > 
> > You might be able to encode this in the p2m which you walk when the
tools
> > ask for the bit map (i.e. p2m.type==writeable p2m.write==0 =>
dirty), but
> > I think a bitmap would do the trick and be easier to implement.
> 
> There are three candidates:
> 
> 1) bitmap
> 2) encoding into p2m
> 3) linked list of VAs. 
> 
> And, two functions for dirty-page tracing:
> 
> A) dirty-page handling (At trap)
> B) getting dirty bitmap (for toolstack)
> 
> 1) and 2) have advantage for A) but have to search the full range of pages
at B)
> to find out which page is dirtied and set the page as read-only for next
dirty
> detection. On the otherhand, 3) does not need to search the full range at
B).
> I prefer 3) due to the ''non-full range search'' but I
understand your concern.
> Maybe I can do some measurement for each method to see which one better. 
If you have a bitmap then you can do a scan for each set bit, and the
offset of each bit corresponds to the guest pfn, which allows you to
directly calculate the address of the pte in the vlpt.

I don''t think doing a scan for each set bit is going to compare
unfavourably to walking a linked list. Certainly not once the memory
allocator gets involved.
> > > +/* get the dirty bitmap from the linked list stored by
> > > +add_mapped_vaddr
> > > + * and also clear the mapped vaddrs linked list */ static void
> > > +get_dirty_bitmap(struct domain *d, uint8_t *bitmap[]) {
> > > +    struct page_info *page_head;
> > > +    struct mapped_va_node *mvn_head;
> > > +    struct page_info *tmp_page;
> > > +    struct mapped_va_node *tmp_mvn;
> > > +    vaddr_t gma_start = 0;
> > > +    vaddr_t gma_end = 0;
> > > +
> > > +    /* this hypercall is called from domain 0, and we
don''t know which
> > guest''s
> > > +     * vlpt is mapped in xen_second, so, to be sure, we restore
vlpt
> > here */
> > > +    restore_vlpt(d);
> > 
> > AFAICT you don''t actually use the vlpt here, just the domains
list of
> > dirty addresses (which as above could become a bitmap)
> 
> I think vlpt is still needed because at this stage, we have to reset the 
> write permission of dirtied pages. Maybe some tricks for efficiently doing
this?
Ah, of course.

Do you need to be careful about the context switch of a dom0 vcpu which
has a foreign vlpt loaded? I suppose you unload it at the end of this
function so it is safe because Xen doesn''t preempt vcpus in hypervisor
mode.
> > 
> > > +    struct p2m_domain *p2m = &d->arch.p2m;
> > > +    vaddr_t ram_base;
> > > +    int i1, i2, i3;
> > > +    int first_index, second_index, third_index;
> > > +    lpae_t *first = __map_domain_page(p2m->first_level);
> > > +    lpae_t pte, *second = NULL, *third = NULL;
> > > +
> > > +    get_gma_start_end(d, &ram_base, NULL);
> > > +
> > > +    first_index = first_table_offset((uint64_t)ram_base);
> > > +    second_index = second_table_offset((uint64_t)ram_base);
> > > +    third_index = third_table_offset((uint64_t)ram_base);
> > > +
> > > +    BUG_ON( !first && "Can''t map first
level p2m." );
> > > +
> > > +    spin_lock(&p2m->lock);
> > > +
> > > +    for ( i1 = first_index; i1 < LPAE_ENTRIES*2; ++i1 )
> > > +    {
> > > +        lpae_walk_t first_pte = first[i1].walk;
> > > +        if ( !first_pte.valid || !first_pte.table )
> > > +            goto out;
> > > +
> > > +        second = map_domain_page(first_pte.base);
> > > +        BUG_ON( !second && "Can''t map
second level p2m.");
> > > +        for ( i2 = second_index; i2 < LPAE_ENTRIES; ++i2 )
> > > +        {
> > > +            lpae_walk_t second_pte = second[i2].walk;
> > > +
> > > +            /* a nasty hack for vlpt due to the difference
> > > +             * of page table entry layout between p2m and pt */
> > > +            second[i2].pt.ro = 0;
> > 
> > What is the hack here?
> > 
> > The issue is that the p2m second level, which is a table entry in the
p2m
> > is installed into the Xen page tables where it ends up the third
level,
> > treated as a block entry.
> > 
> > That''s OK, because for both PT and P2M bits 2..10 are ignored
for table
> > entries. So I think rather than this hack here we should simply ensure
> > that our non-leaf p2m''s have the correct bits in them to be
used as p2m
> > table entries.
> 
> Oh, I see. So, should I put something like this in create_p2m_entries
function?
Probably p2m_create_table or maybe mfn_to_p2m entry would be the right
place.

Ian.

Jaeyong Yoo

2013-Oct-17 11:05 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: Thursday, October 17, 2013 7:30 PM
> To: Jaeyong Yoo
> Cc: ''Stefano Stabellini''; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 8/9] xen/arm: Implement hypercall for
> dirty page tracing
> 
> On Thu, 2013-10-17 at 19:12 +0900, Jaeyong Yoo wrote:
> 
> > > > @@ -1460,6 +1459,28 @@ void cleanup_vlpt(struct domain *d)
> > > >      unmap_domain_page_global(d->arch.dirty.second_lvl);
> > > >  }
> > > >
> > > > +/* routine for dirty-page tracing
> > > > + *
> > > > + * On first write, it page faults, its entry is changed to
> > > > +read-write,
> > > > + * and on retry the write succeeds.
> > > > + *
> > > > + * for locating p2m of the faulting entry, we use
virtual-linear
> > > > +page
> > > table.
> > > > + */
> > > > +void handle_page_fault(struct domain *d, paddr_t addr) {
> > > > +    lpae_t *vlp2m_pte = get_vlpt_3lvl_pte(addr);
> > > > +    if ( vlp2m_pte->p2m.valid &&
vlp2m_pte->p2m.write == 0 )
> > >
> > > I think you need to distinguish p2m entries which are r/o due to
log
> > > dirty from those which are r/o due to wanting them to be r/o.
> > >
> > > Maybe we don''t have the latter right now? We will
eventually need
> > > p2m types I think.
> >
> > Yes, that should be distinguished. Actually, that was in my mind and
> > apparently I failed to remember. For doing this, I''m thinking
to use
> > un-used field in pte as an indicator of ''wanting to be
r/o'' and check
> this indicator in permission fault.
> 
> Yes, I think we can use the 4 avail bits to store a p2m type. ISTR
> discussing this before, but perhaps it wasn''t with you!
> 
> >
> > >
> > > > +    {
> > > > +        lpae_t pte = *vlp2m_pte;
> > > > +        pte.p2m.write = 1;
> > > > +        write_pte(vlp2m_pte, pte);
> > > > +        flush_tlb_local();
> > > > +
> > > > +        /* in order to remove mappings in cleanup stage */
> > > > +        add_mapped_vaddr(d, addr);
> > >
> > > No locks or atomic operations here? How are races with the tools
> > > reading the dirty bitmap addressed?  If it is by clever ordering
of
> > > the checks and pte writes then I think a comment would be in
order.
> >
> > Actually, the lock is inside the add_mapped_vaddr function.
> 
> But that only covers the bitmap update, not the pte frobbing.
I also locked with the same lock at the get_dirty_bitmap which is reading
the bitmap.
> 
> >
> > >
> > > > +    }
> > > > +}
> > > > +
> > > >  /*
> > > >   * Local variables:
> > > >   * mode: C
> > > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index
> > > > 2d09fef..3b2b11a 100644
> > > > --- a/xen/arch/arm/p2m.c
> > > > +++ b/xen/arch/arm/p2m.c
> > > > @@ -6,6 +6,8 @@
> > > >  #include <xen/bitops.h>
> > > >  #include <asm/flushtlb.h>
> > > >  #include <asm/gic.h>
> > > > +#include <xen/guest_access.h>
> > > > +#include <xen/pfn.h>
> > > >
> > > >  void dump_p2m_lookup(struct domain *d, paddr_t addr)  { @@
-408,6
> > > > +410,304 @@ unsigned long gmfn_to_mfn(struct domain *d,
unsigned
> > > > +long
> > > gpfn)
> > > >      return p >> PAGE_SHIFT;
> > > >  }
> > > >
> > > > +#define MAX_VA_PER_NODE (PAGE_SIZE - 2 * sizeof(void *) -
> sizeof(int))\
> > > > +                        / sizeof(unsigned long)
> > > > +
> > > > +/* An array-based linked list for storing virtual addresses
> > > > +(i.e., the 3rd
> > > > + * level table mapping) that should be destroyed after live
> > > > +migration */ struct mapped_va_node {
> > > > +    struct page_info *next;
> > > > +    struct mapped_va_node *mvn_next;
> > > > +    int items;
> > > > +    unsigned long vaddrs[MAX_VA_PER_NODE]; };
> > > > +
> > > > +int add_mapped_vaddr(struct domain *d, unsigned long va) {
> > >
> > > This function seems awefuly expensive (contains allocations etc)
for
> > > a function called on every page fault during a migration.
> > >
> > > Why are you remembering the full va of every fault when a single
bit
> > > per page would do?
> > >
> > > I think you can allocate a bitmap when logdirty is enabled. At a
bit
> > > per page it shouldn''t be too huge.
> > >
> > > You might be able to encode this in the p2m which you walk when
the
> > > tools ask for the bit map (i.e. p2m.type==writeable p2m.write==0
=>
> > > dirty), but I think a bitmap would do the trick and be easier to
> implement.
> >
> > There are three candidates:
> >
> > 1) bitmap
> > 2) encoding into p2m
> > 3) linked list of VAs.
> >
> > And, two functions for dirty-page tracing:
> >
> > A) dirty-page handling (At trap)
> > B) getting dirty bitmap (for toolstack)
> >
> > 1) and 2) have advantage for A) but have to search the full range of
> > pages at B) to find out which page is dirtied and set the page as
> > read-only for next dirty detection. On the otherhand, 3) does not need
> to search the full range at B).
> > I prefer 3) due to the ''non-full range search'' but I
understand your
> concern.
> > Maybe I can do some measurement for each method to see which one
better.
> 
> If you have a bitmap then you can do a scan for each set bit, and the
> offset of each bit corresponds to the guest pfn, which allows you to
> directly calculate the address of the pte in the vlpt.
> 
> I don''t think doing a scan for each set bit is going to compare
> unfavourably to walking a linked list. Certainly not once the memory
> allocator gets involved.
Oh, do you mean using something like "find_next_zero_bit"? At the
first
time,
I thought searching every bit-by-bit but, I just see findbit.S and it looks
quite optimized. 
> 
> > > > +/* get the dirty bitmap from the linked list stored by
> > > > +add_mapped_vaddr
> > > > + * and also clear the mapped vaddrs linked list */ static
void
> > > > +get_dirty_bitmap(struct domain *d, uint8_t *bitmap[]) {
> > > > +    struct page_info *page_head;
> > > > +    struct mapped_va_node *mvn_head;
> > > > +    struct page_info *tmp_page;
> > > > +    struct mapped_va_node *tmp_mvn;
> > > > +    vaddr_t gma_start = 0;
> > > > +    vaddr_t gma_end = 0;
> > > > +
> > > > +    /* this hypercall is called from domain 0, and we
don''t know
> > > > + which
> > > guest''s
> > > > +     * vlpt is mapped in xen_second, so, to be sure, we
restore
> > > > + vlpt
> > > here */
> > > > +    restore_vlpt(d);
> > >
> > > AFAICT you don''t actually use the vlpt here, just the
domains list
> > > of dirty addresses (which as above could become a bitmap)
> >
> > I think vlpt is still needed because at this stage, we have to reset
> > the write permission of dirtied pages. Maybe some tricks for
efficiently
> doing this?
> 
> Ah, of course.
> 
> Do you need to be careful about the context switch of a dom0 vcpu which
> has a foreign vlpt loaded? I suppose you unload it at the end of this
> function so it is safe because Xen doesn''t preempt vcpus in
hypervisor
> mode.

That is my concern now. For safety, unloading is necessary but it requires
flushing vlpt which is not a cheap operation.

Actually, there is one more concern:
Since we only context switch the vlpt area with ''migrating
domains'' when two
domains (one migrating and the other not) are context switched, the
non-migrating
domain have access to the foreign vlpt. In order to prevent the access of
foreign vlpt area, do you think unloading of vlpt when switching into
non-migrating
domains also necessary?
> 
> > >
> > > > +    struct p2m_domain *p2m = &d->arch.p2m;
> > > > +    vaddr_t ram_base;
> > > > +    int i1, i2, i3;
> > > > +    int first_index, second_index, third_index;
> > > > +    lpae_t *first = __map_domain_page(p2m->first_level);
> > > > +    lpae_t pte, *second = NULL, *third = NULL;
> > > > +
> > > > +    get_gma_start_end(d, &ram_base, NULL);
> > > > +
> > > > +    first_index = first_table_offset((uint64_t)ram_base);
> > > > +    second_index = second_table_offset((uint64_t)ram_base);
> > > > +    third_index = third_table_offset((uint64_t)ram_base);
> > > > +
> > > > +    BUG_ON( !first && "Can''t map
first level p2m." );
> > > > +
> > > > +    spin_lock(&p2m->lock);
> > > > +
> > > > +    for ( i1 = first_index; i1 < LPAE_ENTRIES*2; ++i1 )
> > > > +    {
> > > > +        lpae_walk_t first_pte = first[i1].walk;
> > > > +        if ( !first_pte.valid || !first_pte.table )
> > > > +            goto out;
> > > > +
> > > > +        second = map_domain_page(first_pte.base);
> > > > +        BUG_ON( !second && "Can''t map
second level p2m.");
> > > > +        for ( i2 = second_index; i2 < LPAE_ENTRIES; ++i2
)
> > > > +        {
> > > > +            lpae_walk_t second_pte = second[i2].walk;
> > > > +
> > > > +            /* a nasty hack for vlpt due to the difference
> > > > +             * of page table entry layout between p2m and
pt */
> > > > +            second[i2].pt.ro = 0;
> > >
> > > What is the hack here?
> > >
> > > The issue is that the p2m second level, which is a table entry in
> > > the p2m is installed into the Xen page tables where it ends up
the
> > > third level, treated as a block entry.
> > >
> > > That''s OK, because for both PT and P2M bits 2..10 are
ignored for
> > > table entries. So I think rather than this hack here we should
> > > simply ensure that our non-leaf p2m''s have the correct
bits in them
> > > to be used as p2m table entries.
> >
> > Oh, I see. So, should I put something like this in create_p2m_entries
> function?
> 
> Probably p2m_create_table or maybe mfn_to_p2m entry would be the right
> place.
OK.
> 
> Ian.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ian Campbell

2013-Oct-17 11:47 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On Thu, 2013-10-17 at 20:05 +0900, Jaeyong Yoo wrote:> > >
> > > >
> > > > > +    {
> > > > > +        lpae_t pte = *vlp2m_pte;
> > > > > +        pte.p2m.write = 1;
> > > > > +        write_pte(vlp2m_pte, pte);
> > > > > +        flush_tlb_local();
> > > > > +
> > > > > +        /* in order to remove mappings in cleanup
stage */
> > > > > +        add_mapped_vaddr(d, addr);
> > > >
> > > > No locks or atomic operations here? How are races with the
tools
> > > > reading the dirty bitmap addressed?  If it is by clever
ordering of
> > > > the checks and pte writes then I think a comment would be in
order.
> > >
> > > Actually, the lock is inside the add_mapped_vaddr function.
> > 
> > But that only covers the bitmap update, not the pte frobbing.
> 
> I also locked with the same lock at the get_dirty_bitmap which is reading
> the bitmap.
I meant the pte frobbing immediately before the call the
add_mapped_vaddr quoted above.
> > > There are three candidates:
> > >
> > > 1) bitmap
> > > 2) encoding into p2m
> > > 3) linked list of VAs.
> > >
> > > And, two functions for dirty-page tracing:
> > >
> > > A) dirty-page handling (At trap)
> > > B) getting dirty bitmap (for toolstack)
> > >
> > > 1) and 2) have advantage for A) but have to search the full range
of
> > > pages at B) to find out which page is dirtied and set the page as
> > > read-only for next dirty detection. On the otherhand, 3) does not
need
> > to search the full range at B).
> > > I prefer 3) due to the ''non-full range search''
but I understand your
> > concern.
> > > Maybe I can do some measurement for each method to see which one
better.
> > 
> > If you have a bitmap then you can do a scan for each set bit, and the
> > offset of each bit corresponds to the guest pfn, which allows you to
> > directly calculate the address of the pte in the vlpt.
> > 
> > I don''t think doing a scan for each set bit is going to
compare
> > unfavourably to walking a linked list. Certainly not once the memory
> > allocator gets involved.
> 
> Oh, do you mean using something like "find_next_zero_bit"?
find_next_bit probably since you want to find the 1s. I suppose you
could also initialise to all 1 and clear bits instead
>  At the first
> time,
> I thought searching every bit-by-bit but, I just see findbit.S and it looks
> quite optimized. 
Yes, I think it should be fast enough.
 > > > > > +/* get the dirty bitmap from the linked list stored by
> > > > > +add_mapped_vaddr
> > > > > + * and also clear the mapped vaddrs linked list */
static void
> > > > > +get_dirty_bitmap(struct domain *d, uint8_t *bitmap[])
{
> > > > > +    struct page_info *page_head;
> > > > > +    struct mapped_va_node *mvn_head;
> > > > > +    struct page_info *tmp_page;
> > > > > +    struct mapped_va_node *tmp_mvn;
> > > > > +    vaddr_t gma_start = 0;
> > > > > +    vaddr_t gma_end = 0;
> > > > > +
> > > > > +    /* this hypercall is called from domain 0, and we
don''t know
> > > > > + which
> > > > guest''s
> > > > > +     * vlpt is mapped in xen_second, so, to be sure,
we restore
> > > > > + vlpt
> > > > here */
> > > > > +    restore_vlpt(d);
> > > >
> > > > AFAICT you don''t actually use the vlpt here, just
the domains list
> > > > of dirty addresses (which as above could become a bitmap)
> > >
> > > I think vlpt is still needed because at this stage, we have to
reset
> > > the write permission of dirtied pages. Maybe some tricks for
efficiently
> > doing this?
> > 
> > Ah, of course.
> > 
> > Do you need to be careful about the context switch of a dom0 vcpu
which
> > has a foreign vlpt loaded? I suppose you unload it at the end of this
> > function so it is safe because Xen doesn''t preempt vcpus in
hypervisor
> > mode.
> 
> 
> That is my concern now. For safety, unloading is necessary but it requires
> flushing vlpt which is not a cheap operation.
I suppose it could be deferred either to the next load of a vplt, if it
is a different one, or the next context switch away from the vcpu.
You''d
just need to track which foreign vlpt was currently loaded on the vcpu
in vcpu->arch.something?
> Actually, there is one more concern:
> Since we only context switch the vlpt area with ''migrating
domains'' when two
> domains (one migrating and the other not) are context switched, the
> non-migrating
> domain have access to the foreign vlpt. In order to prevent the access of
> foreign vlpt area, do you think unloading of vlpt when switching into
> non-migrating
> domains also necessary?
The domain itself doesn''t have access to the vlpt since it is mapped in
hypervisor context so the only danger is that Xen accidentally uses the
vlpt for something while the non-migrating domain is scheduled. I think
we should just not write that bug ;-)

IOW I think flushing the vlpt lazily is OK.

Ian.

Jaeyong Yoo

2013-Oct-18 04:17 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Ian Campbell
> Sent: Thursday, October 17, 2013 8:47 PM
> To: Jaeyong Yoo
> Cc: ''Stefano Stabellini''; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v4 8/9] xen/arm: Implement hypercall for
> dirty page tracing
> 
> On Thu, 2013-10-17 at 20:05 +0900, Jaeyong Yoo wrote:
> > > >
> > > > >
> > > > > > +    {
> > > > > > +        lpae_t pte = *vlp2m_pte;
> > > > > > +        pte.p2m.write = 1;
> > > > > > +        write_pte(vlp2m_pte, pte);
> > > > > > +        flush_tlb_local();
> > > > > > +
> > > > > > +        /* in order to remove mappings in cleanup
stage */
> > > > > > +        add_mapped_vaddr(d, addr);
> > > > >
> > > > > No locks or atomic operations here? How are races with
the tools
> > > > > reading the dirty bitmap addressed?  If it is by clever
ordering
> > > > > of the checks and pte writes then I think a comment
would be in
> order.
> > > >
> > > > Actually, the lock is inside the add_mapped_vaddr function.
> > >
> > > But that only covers the bitmap update, not the pte frobbing.
> >
> > I also locked with the same lock at the get_dirty_bitmap which is
> > reading the bitmap.
> 
> I meant the pte frobbing immediately before the call the add_mapped_vaddr
> quoted above.
> 
I don''t think it is necessary because at the moment the dirty bitmap
is constructed based on the VAs in the linked list. If the PTE frobbing
happens
immediately before add_mapped_vaddr, the corresponding dirty-page would be 
checked at the next round get-dirty-gitmap. 

Now, I start thinking of using bitmap rather than linked list, and I think
it
is similar there. The critical section would be the one that writes into the
bitmap and the one that reads the bitmap (copy to the bitmap from
toolstack).

Jaeyong

Ian Campbell

2013-Oct-18 09:11 UTC

head link

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

On Fri, 2013-10-18 at 13:17 +0900, Jaeyong Yoo wrote:
> > I meant the pte frobbing immediately before the call the
add_mapped_vaddr
> > quoted above.
> > 
> 
> I don''t think it is necessary because at the moment the dirty
bitmap
> is constructed based on the VAs in the linked list. If the PTE frobbing
> happens
> immediately before add_mapped_vaddr, the corresponding dirty-page would be 
> checked at the next round get-dirty-gitmap. 
OK, thanks. Can we get a code comment on the ordering constraints here
please.
>  
> 
> Now, I start thinking of using bitmap rather than linked list, and I think
> it
> is similar there. The critical section would be the one that writes into
the
> bitmap and the one that reads the bitmap (copy to the bitmap from
> toolstack).
> 
> Jaeyong
>

Eugene Fedotov

2013-Oct-31 08:56 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

11.10.2013 14:09, Ian Campbell:> On Fri, 2013-10-11 at 14:06 +0400, Eugene Fedotov wrote:
>> 10.10.2013 14:56, Ian Campbell:
>>> On Fri, 2013-10-04 at 13:43 +0900, Jaeyong Yoo wrote:
>>>> From: Evgeny Fedotov <e.fedotov@samsung.com>
>>>>
>>>> When creating domU in toolstack, pass the guest memory
>>>> map info to the hypervisor, and the hypervisor stores those
info in
>>>> arch_domain for later use.
>>>>
>>>> Singed-off-by: Evgeny Fedotov <e.fedotov@samsung.com>
>>>> ---
>>>>    tools/libxc/xc_dom_arm.c     | 12 +++++++-
>>>>    tools/libxc/xc_domain.c      | 44
++++++++++++++++++++++++++++
>>>>    tools/libxc/xenctrl.h        | 23 +++++++++++++++
>>>>    xen/arch/arm/domain.c        |  3 ++
>>>>    xen/arch/arm/mm.c            | 68
++++++++++++++++++++++++++++++++++++++++++++
>>>>    xen/include/asm-arm/domain.h |  2 ++
>>>>    xen/include/asm-arm/mm.h     |  1 +
>>>>    xen/include/public/memory.h  | 15 ++++++++--
>>>>    xen/include/xsm/dummy.h      |  5 ++++
>>>>    xen/include/xsm/xsm.h        |  5 ++++
>>>>    10 files changed, 175 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/tools/libxc/xc_dom_arm.c
b/tools/libxc/xc_dom_arm.c
>>>> index df59ffb..20c9095 100644
>>>> --- a/tools/libxc/xc_dom_arm.c
>>>> +++ b/tools/libxc/xc_dom_arm.c
>>>> @@ -166,6 +166,7 @@ int arch_setup_meminit(struct xc_dom_image
*dom)
>>>>    {
>>>>        int rc;
>>>>        xen_pfn_t pfn, allocsz, i;
>>>> +    struct dt_mem_info memmap;
>>>>    
>>>>        dom->shadow_enabled = 1;
>>>>    
>>>> @@ -191,7 +192,16 @@ int arch_setup_meminit(struct xc_dom_image
*dom)
>>>>                0, 0, &dom->p2m_host[i]);
>>>>        }
>>>>    
>>>> -    return 0;
>>>> +    /* setup guest memory map */
>>>> +    memmap.nr_banks = 2;
>>>> +    memmap.bank[0].start = (dom->rambase_pfn <<
PAGE_SHIFT_ARM);
>>>> +    memmap.bank[0].size = (dom->total_pages <<
PAGE_SHIFT_ARM);
>>>> +    /*The end of main memory: magic pages */
>>>> +    memmap.bank[1].start = memmap.bank[0].start +
memmap.bank[0].size;
>>>> +    memmap.bank[1].size = NR_MAGIC_PAGES <<
PAGE_SHIFT_ARM;
>>> Are the 0 and 1 here hardcoded magic numbers?
>> Well, we hardcoded here two memory regions: the first one for RAM, the
>> second one for "magic pages".
>>>> +    return xc_domain_set_memory_map(dom->xch,
dom->guest_domid, &memmap);
>>> I think this is using set_memory_map in a different way to it is
used
>>> for x86 (where it gives the PV e820 map, a PV version of a BIOS
provided
>>> datastructure).
>> Do you mean that using e820 structure for ARM implementation is better
>> than dt_mem_info structure (that has been taken from libdt) and should
>> be used in ARM implementation of get/set memory map?
> I mean on x86 it is used as a way of communicating a memory map (in a
> structure vaguely like an actual defined architecture specific layout,
> the e820), whereas on ARM it is only used by the tools and should not be
> exposed to the guest, since the guest gets its info from the DTB.
>
>>> The guest should get its memory map via DTB not via a PV hypercall.
I
>>> know the guest isn''t using get_memory_map but I
don''t think we should
>>> even make it available.
>> OK, this hypercall will be available from dom0 only.
> It''ll have to move away from the XENMEM space then.Can we create new DOMCTL hypercall for get/set memory map?

Ian Campbell

2013-Nov-01 09:04 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Thu, 2013-10-31 at 12:56 +0400, Eugene Fedotov wrote:> Can we create new DOMCTL hypercall for get/set memory map?
You could but is it really necessary? The guest memory map is almost
entirely static apart from the RAM size (which can be discovered by
other means). If you base things on my "xen: arm: 64-bit guest support
and domU FDT autogeneration" series (at least v2) then all the necessary
#defines should be available to the tools as well.

Ian.

Ian Campbell

2013-Nov-01 09:14 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Thu, 2013-10-31 at 12:56 +0400, Eugene Fedotov wrote:> Can we create new DOMCTL hypercall for get/set memory map?
You could but do you really need it? The guest memory map is pretty much
static apart from RAM size and in the next version of my "xen: arm:
64-bit guest support and domU FDT autogeneration" series all the
relevant #defines should be available to the toolstack as well as the
hypervisor (and if not, we can fix that).

Any get/set memory map calls can be deferred until there is a need for a
more dynamic guest layout IMHO.

Ian.

Ian Campbell

2013-Nov-01 09:48 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Fri, 2013-11-01 at 13:51 +0400, Eugene Fedotov wrote:> 01.11.2013 13:14, Ian Campbell пишет:
> > On Thu, 2013-10-31 at 12:56 +0400, Eugene Fedotov wrote:
> >> Can we create new DOMCTL hypercall for get/set memory map?
> > You could but do you really need it? The guest memory map is pretty
much
> > static apart from RAM size and in the next version of my "xen:
arm:
> > 64-bit guest support and domU FDT autogeneration" series all the
> > relevant #defines should be available to the toolstack as well as the
> > hypervisor (and if not, we can fix that).
> >
> > Any get/set memory map calls can be deferred until there is a need for
a
> > more dynamic guest layout IMHO.
> >
> > Ian.
> >
> So it is not yet upstreamed: "xen: arm: 64-bit guest support and domU 
> FDT autogeneration" ?
Not yet.
> As to current patch series, how do the hypervisor acquire RAM size (and 
> base) in current upstream? (since only toolstack knows it yet, or I 
> missed something).
The toolstack makes a hypercall, via libxc:xc_domain_setmaxmem.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Eugene Fedotov

2013-Nov-01 09:51 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

01.11.2013 13:14, Ian Campbell пишет:> On Thu, 2013-10-31 at 12:56 +0400, Eugene Fedotov wrote:
>> Can we create new DOMCTL hypercall for get/set memory map?
> You could but do you really need it? The guest memory map is pretty much
> static apart from RAM size and in the next version of my "xen: arm:
> 64-bit guest support and domU FDT autogeneration" series all the
> relevant #defines should be available to the toolstack as well as the
> hypervisor (and if not, we can fix that).
>
> Any get/set memory map calls can be deferred until there is a need for a
> more dynamic guest layout IMHO.
>
> Ian.
>So it is not yet upstreamed: "xen: arm: 64-bit guest support and domU 
FDT autogeneration" ?
As to current patch series, how do the hypervisor acquire RAM size (and 
base) in current upstream? (since only toolstack knows it yet, or I 
missed something).

Best regards,
Evgeny.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Eugene Fedotov

2013-Nov-01 10:08 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

> The toolstack makes a hypercall, via libxc:xc_domain_setmaxmem.OK, RAM size we know. But, for the RAM base should we use (till the 
patch above applied) a macro like
#define GUEST_RAM_BASE 0x80000000 ?>
> Ian.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ian Campbell

2013-Nov-01 10:30 UTC

head link

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

On Fri, 2013-11-01 at 14:08 +0400, Eugene Fedotov wrote:> > The toolstack makes a hypercall, via libxc:xc_domain_setmaxmem.
> OK, RAM size we know. But, for the RAM base should we use (till the 
> patch above applied) a macro like
> #define GUEST_RAM_BASE 0x80000000 ?
Yes, some #define which matches what the builder is using will do until
the series lands.

Ian.

Xen devel - Oct 2013 - [PATCH v4 0/9] xen/arm: live migration support in arndale board

[PATCH v4 0/9] xen/arm: live migration support in arndale board

[PATCH v4 1/9] xen/arm: Implement hvm save and restore

[PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

[PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

[PATCH v4 4/9] xen/arm: Implement get_maximum_gpfn hypercall for arm

[PATCH v4 5/9] xen/arm: Implement modify_returncode

[PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

[PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

[PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

[PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

Re: [PATCH v4 1/9] xen/arm: Implement hvm save and restore

Re: [PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 6/9] xen/arm: Fixing clear_guest_offset macro

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

Re: [PATCH v4 2/9] xen/arm: Add more registers for saving and restoring vcpu registers

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 9/9] xen/arm: Implement toolstack for xl restore/save and migrate

Re: [PATCH v4 7/9] xen/arm: Implement virtual-linear page table for guest p2m mapping in live migration

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 8/9] xen/arm: Implement hypercall for dirty page tracing

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm

Re: [PATCH v4 3/9] xen/arm: Implement set_memory_map hypercall for arm