Hi all,
this patch series implements guest SMP support for ARM, using the ARM PSCI
interface for secondary cpu bringup. See each patch for a detailed changelog.

Stefano Stabellini (13):
  xen/arm: basic PSCI support, implement cpu_on and cpu_off
  xen/arm: allocate secondaries dom0 vcpus
  xen: introduce cpumask_from_bitmap
  xen/arm: support for guest SGI
  xen/arm: early_ioremap: allocate virtual addresses from top to bottom
  xen/arm: implement arch_vmap_virt_end
  xen/arm: compile and initialize vmap
  xen/arm: implement map_domain_page_global and unmap_domain_page_global
  xen: move VCPUOP_register_vcpu_info to common code
  xen/arm: support VCPUOP_register_vcpu_info.
  xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
  xen/arm: start the vtimer Xen timers on the processor they should be running on
  xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus

 xen/arch/arm/Makefile           |    1 +
 xen/arch/arm/domain.c           |   15 +++++
 xen/arch/arm/domain_build.c     |    8 ++-
 xen/arch/arm/mm.c               |  124 ++++++++++++++++++++++++++++++++++++---
 xen/arch/arm/psci.c             |   87 +++++++++++++++++++++++++++
 xen/arch/arm/setup.c            |    3 +
 xen/arch/arm/traps.c            |   48 +++++++++++++++-
 xen/arch/arm/vgic.c             |   57 ++++++++++++++++--
 xen/arch/arm/vtimer.c           |   22 +++++--
 xen/arch/x86/domain.c           |  113 ----------------------------------
 xen/common/domain.c             |  111 ++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/config.h    |    4 +-
 xen/include/asm-arm/domain.h    |   26 +++++---
 xen/include/asm-arm/gic.h       |    3 +
 xen/include/asm-arm/hypercall.h |    3 +
 xen/include/asm-arm/page.h      |    3 +
 xen/include/asm-arm/processor.h |    1 +
 xen/include/asm-arm/psci.h      |   24 ++++++++
 xen/include/asm-x86/domain.h    |    3 -
 xen/include/asm-x86/page.h      |    8 ---
 xen/include/xen/cpumask.h       |   11 ++++
 xen/include/xen/domain.h        |    3 +
 xen/include/xen/mm.h            |    7 ++
 xen/include/xen/sched.h         |    3 +
 24 files changed, 531 insertions(+), 157 deletions(-)

Cheers,

Stefano
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 01/13] xen/arm: basic PSCI support, implement cpu_on and cpu_off
Implement support for ARM Power State Coordination Interface, PSCI in short. Support HVC and SMC calls. Changes in v3: - move do_psci_* to psci.c; - trap SMC; - return PSCI error codes; - remove Linux boot procotol from secondary cpus. Changes in v2: - set is_initialised in arch_set_info_guest; - zero vcpu_guest_context before using it. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/Makefile | 1 + xen/arch/arm/domain.c | 2 + xen/arch/arm/psci.c | 87 +++++++++++++++++++++++++++++++++++++++ xen/arch/arm/traps.c | 47 ++++++++++++++++++++- xen/include/asm-arm/processor.h | 1 + xen/include/asm-arm/psci.h | 24 +++++++++++ 6 files changed, 161 insertions(+), 1 deletions(-) create mode 100644 xen/arch/arm/psci.c create mode 100644 xen/include/asm-arm/psci.h diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index 2106a4f..8f75044 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -5,6 +5,7 @@ subdir-y += platforms obj-y += early_printk.o obj-y += cpu.o obj-y += domain.o +obj-y += psci.o obj-y += domctl.o obj-y += sysctl.o obj-y += domain_build.o diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index eae42af..fee3790 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -561,6 +561,8 @@ int arch_set_info_guest( else set_bit(_VPF_down, &v->pause_flags); + v->is_initialised = 1; + return 0; } diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c new file mode 100644 index 0000000..28153ad --- /dev/null +++ b/xen/arch/arm/psci.c @@ -0,0 +1,87 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ */ + +#include <xen/errno.h> +#include <xen/sched.h> +#include <xen/types.h> + +#include <asm/current.h> +#include <asm/psci.h> + +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point) +{ + struct vcpu *v; + struct domain *d = current->domain; + struct vcpu_guest_context *ctxt; + int rc; + + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) + return PSCI_EINVAL; + + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) + return PSCI_EINVAL; + + if ( !v->is_initialised ) { + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) + return PSCI_DENIED; + + memset(ctxt, 0, sizeof(*ctxt)); + ctxt->user_regs.pc32 = entry_point; + ctxt->sctlr = SCTLR_BASE; + ctxt->ttbr0 = 0; + ctxt->ttbr1 = 0; + ctxt->ttbcr = 0; /* Defined Reset Value */ + ctxt->user_regs.cpsr = PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC; + ctxt->flags = VGCF_online; + + domain_lock(d); + rc = arch_set_info_guest(v, ctxt); + free_vcpu_guest_context(ctxt); + + if ( rc < 0 ) + { + domain_unlock(d); + return PSCI_DENIED; + } + domain_unlock(d); + } + + clear_bit(_VPF_down, &v->pause_flags); + vcpu_wake(v); + + return PSCI_SUCCESS; +} + +int do_psci_cpu_off(uint32_t power_state) +{ + struct vcpu *v = current; + if ( !test_and_set_bit(_VPF_down, &v->pause_flags) ) + vcpu_sleep_nosync(v); + return PSCI_SUCCESS; +} +int do_psci_cpu_suspend(uint32_t power_state, uint32_t entry_point) +{ + return PSCI_ENOSYS; +} +int do_psci_migrate(uint32_t vcpuid) +{ + return PSCI_ENOSYS; +} + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index da53675..733099a 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -34,6 +34,7 @@ #include <asm/event.h> #include <asm/regs.h> #include <asm/cpregs.h> +#include <asm/psci.h> #include "io.h" #include "vtimer.h" @@ -61,7 +62,7 @@ void __cpuinit init_traps(void) WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2); /* Setup hypervisor traps */ - WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI, HCR_EL2); + WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2); isb(); } @@ -618,6 +619,31 @@ static arm_hypercall_t arm_hypercall_table[] = { HYPERCALL(grant_table_op, 3), }; +#define __PSCI_cpu_suspend 0 +#define __PSCI_cpu_off 1 +#define __PSCI_cpu_on 2 +#define __PSCI_migrate 3 + +typedef int (*arm_psci_fn_t)(uint32_t, uint32_t); + +typedef struct { + arm_psci_fn_t fn; + int nr_args; +} arm_psci_t; + +#define PSCI(_name, _nr_args) \ + [ __PSCI_ ## _name ] = { \ + .fn = (arm_psci_fn_t) &do_psci_ ## _name, \ + .nr_args = _nr_args, \ + } + +static arm_psci_t arm_psci_table[] = { + PSCI(cpu_suspend, 2), + PSCI(cpu_off, 1), + PSCI(cpu_on, 2), + PSCI(migrate, 1), +}; + static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) { register_t *r; @@ -647,6 +673,20 @@ static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) } } +static void do_trap_psci(struct cpu_user_regs *regs) +{ + arm_psci_fn_t psci_call = NULL; + + if ( regs->r0 >= ARRAY_SIZE(arm_psci_table) ) + { + regs->r0 = -ENOSYS; + return; + } + + psci_call = arm_psci_table[regs->r0].fn; + regs->r0 = psci_call(regs->r1, regs->r2); +} + static void do_trap_hypercall(struct cpu_user_regs *regs, unsigned long iss) { arm_hypercall_fn_t call = NULL; @@ -959,8 +999,13 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs) case HSR_EC_HVC: if ( (hsr.iss & 0xff00) == 0xff00 ) return do_debug_trap(regs, hsr.iss & 0x00ff); + if ( 
hsr.iss == 0 ) + return do_trap_psci(regs); do_trap_hypercall(regs, hsr.iss); break; + case HSR_EC_SMC: + do_trap_psci(regs); + break; case HSR_EC_DATA_ABORT_GUEST: do_trap_data_abort_guest(regs, hsr.dabt); break; diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h index 1681ebf..72f7f6b 100644 --- a/xen/include/asm-arm/processor.h +++ b/xen/include/asm-arm/processor.h @@ -85,6 +85,7 @@ #define HSR_EC_CP14_64 0x0c #define HSR_EC_SVC 0x11 #define HSR_EC_HVC 0x12 +#define HSR_EC_SMC 0x13 #define HSR_EC_INSTR_ABORT_GUEST 0x20 #define HSR_EC_INSTR_ABORT_HYP 0x21 #define HSR_EC_DATA_ABORT_GUEST 0x24 diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h new file mode 100644 index 0000000..8511eb1 --- /dev/null +++ b/xen/include/asm-arm/psci.h @@ -0,0 +1,24 @@ +#ifndef __ASM_PSCI_H__ +#define __ASM_PSCI_H__ + +#define PSCI_SUCCESS 0 +#define PSCI_ENOSYS -1 +#define PSCI_EINVAL -2 +#define PSCI_DENIED -3 + +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point); +int do_psci_cpu_off(uint32_t power_state); +int do_psci_cpu_suspend(uint32_t power_state, uint32_t entry_point); +int do_psci_migrate(uint32_t vcpuid); + +#endif /* __ASM_PSCI_H__ */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ -- 1.7.2.5
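For illustration only (not part of the patch): a guest reaches this code by
issuing an HVC with an immediate of 0, the PSCI function number in r0 and the
arguments in r1/r2. A minimal guest-side sketch, assuming the function
numbering above (cpu_on == 2) and a hypothetical psci_cpu_on() wrapper:

    /* Hypothetical guest-side wrapper: start vcpu 'cpu' at 'entry_point'.
     * Relies on the numbering exposed by this patch (__PSCI_cpu_on == 2)
     * and the "hvc #0" conduit; the return value is one of the PSCI_*
     * codes from asm-arm/psci.h. */
    static int psci_cpu_on(unsigned int cpu, unsigned long entry_point)
    {
        register unsigned long fn    asm("r0") = 2;   /* cpu_on */
        register unsigned long vcpu  asm("r1") = cpu;
        register unsigned long entry asm("r2") = entry_point;

        asm volatile ("hvc #0"
                      : "+r" (fn)
                      : "r" (vcpu), "r" (entry)
                      : "memory");

        return fn;
    }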
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 02/13] xen/arm: allocate secondaries dom0 vcpus
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/domain_build.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index a6d8e9d..5dfa592 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -363,7 +363,7 @@ static void dtb_load(struct kernel_info *kinfo) int construct_dom0(struct domain *d) { struct kernel_info kinfo = {}; - int rc; + int rc, i, cpu = 0; struct vcpu *v = d->vcpu[0]; struct cpu_user_regs *regs = &v->arch.cpu_info->guest_cpu_user_regs; @@ -451,6 +451,12 @@ int construct_dom0(struct domain *d) } #endif + for ( i = 1; i < d->max_vcpus; i++ ) + { + cpu = cpumask_cycle(cpu, &cpu_online_map); + (void)alloc_vcpu(d, i, cpu); + } + local_abort_enable(); return 0; -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 03/13] xen: introduce cpumask_from_bitmap
Introduce a new cpumask function that takes a pointer to bits in memory, builds a cpumask with them and returns it. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: keir@xen.org CC: JBeulich@suse.com --- xen/include/xen/cpumask.h | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h index f931cf2..f6ad154 100644 --- a/xen/include/xen/cpumask.h +++ b/xen/include/xen/cpumask.h @@ -265,6 +265,17 @@ static inline const cpumask_t *cpumask_of(unsigned int cpu) #define cpumask_bits(maskp) ((maskp)->bits) +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits) +{ + cpumask_t mask; + int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits; + + memset(&mask, 0x00, sizeof(mask)); + bitmap_copy(mask.bits, bits, len); + + return mask; +} + static inline int cpumask_scnprintf(char *buf, int len, const cpumask_t *srcp) { -- 1.7.2.5
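Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 04/13] xen/arm: support for guest SGI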
Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right guest vcpu. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v3: - make use of cpumask_from_bitmap. --- xen/arch/arm/vgic.c | 55 ++++++++++++++++++++++++++++++++++++++++---- xen/include/asm-arm/gic.h | 3 ++ 2 files changed, 53 insertions(+), 5 deletions(-) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index b30da78..8912aab 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -370,6 +370,7 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) { + struct domain *d = v->domain; struct hsr_dabt dabt = info->dabt; struct cpu_user_regs *regs = guest_cpu_user_regs(); register_t *r = select_user_reg(regs, dabt.reg); @@ -498,11 +499,55 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) goto write_ignore; case GICD_SGIR: - if ( dabt.size != 2 ) goto bad_width; - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", - *r, gicd_reg - GICD_ICFGR); - return 0; - + { + cpumask_t vcpu_mask; + int virtual_irq; + int filter; + int vcpuid; + struct vcpu *vt; + int i; + unsigned long bits; + + if ( dabt.size != 2 ) goto bad_width; + + filter = (*r & GICD_SGI_TARGET_LIST_MASK); + virtual_irq = (*r & GICD_SGI_INTID_MASK); + + cpumask_clear(&vcpu_mask); + switch ( filter ) + { + case GICD_SGI_TARGET_LIST: + bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; + vcpu_mask = cpumask_from_bitmap(&bits, 8); + break; + case GICD_SGI_TARGET_OTHERS: + for ( i = 0; i < d->max_vcpus; i++ ) + { + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) + cpumask_set_cpu(i, &vcpu_mask); + } + case GICD_SGI_TARGET_SELF: + vcpu_mask = *cpumask_of(current->vcpu_id); + break; + default: + printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r); + return 0; + } + + for_each_cpu( vcpuid, &vcpu_mask ) + { + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL || + virtual_irq >= 16 ) + { + printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n", + *r, *cpumask_bits(&vcpu_mask)); + return 0; + } + vgic_vcpu_inject_irq(vt, virtual_irq, 1); + } + return 1; + } + case GICD_CPENDSGIR ... GICD_CPENDSGIRN: if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; printk("vGICD: unhandled %s write %#"PRIregister" to ICPENDSGIR%d\n", diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h index 92711d5..4f2c8b8 100644 --- a/xen/include/asm-arm/gic.h +++ b/xen/include/asm-arm/gic.h @@ -51,12 +51,15 @@ #define GICD_SPENDSGIRN (0xF2C/4) #define GICD_ICPIDR2 (0xFE8/4) +#define GICD_SGI_TARGET_LIST_SHIFT (24) +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) #define GICD_SGI_TARGET_LIST (0UL<<24) #define GICD_SGI_TARGET_OTHERS (1UL<<24) #define GICD_SGI_TARGET_SELF (2UL<<24) #define GICD_SGI_TARGET_SHIFT (16) #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) #define GICD_SGI_GROUP1 (1UL<<15) +#define GICD_SGI_INTID_MASK (0xFUL) #define GICC_CTLR (0x0000/4) #define GICC_PMR (0x0004/4) -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 05/13] xen/arm: early_ioremap: allocate virtual addresses from top to bottom
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/mm.c | 13 ++++++++----- 1 files changed, 8 insertions(+), 5 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index ba3140d..bd7baaf 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -433,15 +433,17 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) */ void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) { - static unsigned long virt_start = EARLY_VMAP_VIRT_START; - unsigned long ret_addr = virt_start; + static unsigned long virt_start = EARLY_VMAP_VIRT_END; paddr_t end = start + len; + len = ((len + SECOND_SIZE - 1) >> SECOND_SHIFT) << SECOND_SHIFT; + virt_start -= len; + ASSERT(!(start & (~SECOND_MASK))); ASSERT(!(virt_start & (~SECOND_MASK))); /* The range we need to map is too big */ - if ( virt_start + len >= EARLY_VMAP_VIRT_END ) + if ( virt_start >= EARLY_VMAP_VIRT_START ) return NULL; while ( start < end ) @@ -453,9 +455,10 @@ void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) start += SECOND_SIZE; virt_start += SECOND_SIZE; } - flush_xen_data_tlb_range_va(ret_addr, len); + virt_start -= len; + flush_xen_data_tlb_range_va(virt_start, len); - return (void*)ret_addr; + return (void*)virt_start; } enum mg { mg_clear, mg_ro, mg_rw, mg_rx }; -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 06/13] xen/arm: implement arch_vmap_virt_end
Move virt_start out of ioremap and rename it to early_vmap_start. Implement arch_vmap_virt_end by returning early_vmap_start. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/mm.c | 24 +++++++++++++++--------- 1 files changed, 15 insertions(+), 9 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index bd7baaf..4d4556b 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -431,34 +431,40 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) * start has to be 2MB aligned. * len has to be < EARLY_VMAP_VIRT_END - EARLY_VMAP_VIRT_START. */ +static unsigned long early_vmap_start = EARLY_VMAP_VIRT_END; void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) { - static unsigned long virt_start = EARLY_VMAP_VIRT_END; paddr_t end = start + len; + unsigned long map_start; len = ((len + SECOND_SIZE - 1) >> SECOND_SHIFT) << SECOND_SHIFT; - virt_start -= len; + early_vmap_start -= len; ASSERT(!(start & (~SECOND_MASK))); - ASSERT(!(virt_start & (~SECOND_MASK))); + ASSERT(!(early_vmap_start & (~SECOND_MASK))); /* The range we need to map is too big */ - if ( virt_start >= EARLY_VMAP_VIRT_START ) + if ( early_vmap_start >= EARLY_VMAP_VIRT_START ) return NULL; + map_start = early_vmap_start; while ( start < end ) { lpae_t e = mfn_to_xen_entry(start >> PAGE_SHIFT); e.pt.ai = attributes; - write_pte(xen_second + second_table_offset(virt_start), e); + write_pte(xen_second + second_table_offset(map_start), e); start += SECOND_SIZE; - virt_start += SECOND_SIZE; + map_start += SECOND_SIZE; } - virt_start -= len; - flush_xen_data_tlb_range_va(virt_start, len); + flush_xen_data_tlb_range_va(early_vmap_start, len); - return (void*)virt_start; + return (void*)early_vmap_start; +} + +void *__init arch_vmap_virt_end(void) +{ + return (void *)early_vmap_start; } enum mg { mg_clear, mg_ro, mg_rw, mg_rx }; -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 07/13] xen/arm: compile and initialize vmap
Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to VMAP_VIRT_END and VMAP_VIRT_START. Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM). Implement map_pages_to_xen and destroy_xen_mappings. Call vm_init from start_xen. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/mm.c | 92 ++++++++++++++++++++++++++++++++++++++++- xen/arch/arm/setup.c | 3 + xen/include/asm-arm/config.h | 4 +- xen/include/asm-arm/page.h | 3 + xen/include/asm-x86/page.h | 8 ---- xen/include/xen/mm.h | 7 +++ 6 files changed, 104 insertions(+), 13 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 4d4556b..240904e 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -33,6 +33,7 @@ #include <xen/err.h> #include <asm/page.h> #include <asm/current.h> +#include <asm/flushtlb.h> #include <public/memory.h> #include <xen/sched.h> #include <xsm/xsm.h> @@ -429,9 +430,9 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) /* Map the physical memory range start - start + len into virtual * memory and return the virtual address of the mapping. * start has to be 2MB aligned. - * len has to be < EARLY_VMAP_VIRT_END - EARLY_VMAP_VIRT_START. + * len has to be < VMAP_VIRT_END - VMAP_VIRT_START. */ -static unsigned long early_vmap_start = EARLY_VMAP_VIRT_END; +static unsigned long early_vmap_start = VMAP_VIRT_END; void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) { paddr_t end = start + len; @@ -444,7 +445,7 @@ void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) ASSERT(!(early_vmap_start & (~SECOND_MASK))); /* The range we need to map is too big */ - if ( early_vmap_start >= EARLY_VMAP_VIRT_START ) + if ( early_vmap_start >= VMAP_VIRT_START ) return NULL; map_start = early_vmap_start; @@ -467,6 +468,91 @@ void *__init arch_vmap_virt_end(void) return (void *)early_vmap_start; } +static int create_xen_table(lpae_t *entry) +{ + void *p; + lpae_t pte; + + p = alloc_xenheap_pages(0, 0); + if ( p == NULL ) + return -ENOMEM; + clear_page(p); + pte = mfn_to_xen_entry(virt_to_mfn(p)); + pte.pt.table = 1; + write_pte(entry, pte); + return 0; +} + +enum xenmap_operation { + INSERT, + REMOVE +}; + +static int create_xen_entries(enum xenmap_operation op, + unsigned long virt, + unsigned long mfn, + unsigned long nr_mfns) +{ + int rc; + unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE; + lpae_t pte; + lpae_t *third = NULL; + + for(; addr < addr_end; addr += PAGE_SIZE, mfn++) + { + if ( !xen_second[second_linear_offset(addr)].pt.valid || + !xen_second[second_linear_offset(addr)].pt.table ) + { + rc = create_xen_table(&xen_second[second_linear_offset(addr)]); + if ( rc < 0 ) { + printk("create_xen_entries: L2 failed\n"); + goto out; + } + } + + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); + + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base + << PAGE_SHIFT); + if ( third[third_table_offset(addr)].pt.valid ) + flush_tlb_local(); + + switch ( op ) { + case INSERT: + pte = mfn_to_xen_entry(mfn); + pte.pt.table = 1; + write_pte(&third[third_table_offset(addr)], pte); + break; + case REMOVE: + memset(&pte, 0x00, sizeof(pte)); + write_pte(&third[third_table_offset(addr)], pte); + break; + default: + printk("create_xen_entries: invalid op\n"); + break; + } + } + flush_xen_data_tlb_range_va(virt, PAGE_SIZE * nr_mfns); + + rc = 0; + +out: + return rc; +} + +int map_pages_to_xen(unsigned long virt, + unsigned long mfn, 
+ unsigned long nr_mfns, + unsigned int flags) +{ + ASSERT(flags == PAGE_HYPERVISOR); + return create_xen_entries(INSERT, virt, mfn, nr_mfns); +} +void destroy_xen_mappings(unsigned long v, unsigned long e) +{ + create_xen_entries(REMOVE, v, 0, (e - v) >> PAGE_SHIFT); +} + enum mg { mg_clear, mg_ro, mg_rw, mg_rx }; static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg) { diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index cfe3d94..59646d6 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -34,6 +34,7 @@ #include <xen/keyhandler.h> #include <xen/cpu.h> #include <xen/pfn.h> +#include <xen/vmap.h> #include <asm/page.h> #include <asm/current.h> #include <asm/setup.h> @@ -480,6 +481,8 @@ void __init start_xen(unsigned long boot_phys_offset, console_init_postirq(); + vm_init(); + do_presmp_initcalls(); for_each_present_cpu ( i ) diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h index 8be8563..1f87acc 100644 --- a/xen/include/asm-arm/config.h +++ b/xen/include/asm-arm/config.h @@ -95,11 +95,11 @@ #define FIXMAP_ADDR(n) (mk_unsigned_long(0x00400000) + (n) * PAGE_SIZE) #define BOOT_MISC_VIRT_START mk_unsigned_long(0x00600000) #define FRAMETABLE_VIRT_START mk_unsigned_long(0x02000000) -#define EARLY_VMAP_VIRT_START mk_unsigned_long(0x10000000) +#define VMAP_VIRT_START mk_unsigned_long(0x10000000) #define XENHEAP_VIRT_START mk_unsigned_long(0x40000000) #define DOMHEAP_VIRT_START mk_unsigned_long(0x80000000) -#define EARLY_VMAP_VIRT_END XENHEAP_VIRT_START +#define VMAP_VIRT_END XENHEAP_VIRT_START #define HYPERVISOR_VIRT_START XEN_VIRT_START #define DOMHEAP_ENTRIES 1024 /* 1024 2MB mapping slots */ diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h index 14e63eb..5287a92 100644 --- a/xen/include/asm-arm/page.h +++ b/xen/include/asm-arm/page.h @@ -58,6 +58,9 @@ #define DEV_WC BUFFERABLE #define DEV_CACHED WRITEBACK +#define PAGE_HYPERVISOR (MATTR_MEM) +#define MAP_SMALL_PAGES PAGE_HYPERVISOR + /* * Stage 2 Memory Type. * diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h index b2f3859..e53e1e5 100644 --- a/xen/include/asm-x86/page.h +++ b/xen/include/asm-x86/page.h @@ -338,14 +338,6 @@ l3_pgentry_t *virt_to_xen_l3e(unsigned long v); extern void set_pdx_range(unsigned long smfn, unsigned long emfn); -/* Map machine page range in Xen virtual address space. */ -int map_pages_to_xen( - unsigned long virt, - unsigned long mfn, - unsigned long nr_mfns, - unsigned int flags); -void destroy_xen_mappings(unsigned long v, unsigned long e); - /* Convert between PAT/PCD/PWT embedded in PTE flags and 3-bit cacheattr. */ static inline uint32_t pte_flags_to_cacheattr(uint32_t flags) { diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 28512fb..efc45c7 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -48,6 +48,13 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags); void free_xenheap_pages(void *v, unsigned int order); #define alloc_xenheap_page() (alloc_xenheap_pages(0,0)) #define free_xenheap_page(v) (free_xenheap_pages(v,0)) +/* Map machine page range in Xen virtual address space. */ +int map_pages_to_xen( + unsigned long virt, + unsigned long mfn, + unsigned long nr_mfns, + unsigned int flags); +void destroy_xen_mappings(unsigned long v, unsigned long e); /* Claim handling */ unsigned long domain_adjust_tot_pages(struct domain *d, long pages); -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 08/13] xen/arm: implement map_domain_page_global and unmap_domain_page_global
The implementation uses vmap and vunmap. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v3: - use vmap/vunmap to implement un/map_domain_page_global. --- xen/arch/arm/mm.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 240904e..6f561f6 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -36,6 +36,7 @@ #include <asm/flushtlb.h> #include <public/memory.h> #include <xen/sched.h> +#include <xen/vmap.h> #include <xsm/xsm.h> struct domain *dom_xen, *dom_io, *dom_cow; @@ -131,6 +132,16 @@ void clear_fixmap(unsigned map) flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE); } +void *map_domain_page_global(unsigned long mfn) +{ + return vmap(&mfn, 1); +} + +void unmap_domain_page_global(const void *va) +{ + vunmap(va); +} + /* Map a page of domheap memory */ void *map_domain_page(unsigned long mfn) { -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 09/13] xen: move VCPUOP_register_vcpu_info to common code
Move the implementation of VCPUOP_register_vcpu_info from x86 specific to commmon code. Move vcpu_info_mfn from an arch specific vcpu sub-field to the common vcpu struct. Move the initialization of vcpu_info_mfn to common code. Move unmap_vcpu_info and the call to unmap_vcpu_info at domain destruction time to common code. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: keir@xen.org CC: JBeulich@suse.com --- xen/arch/x86/domain.c | 113 ------------------------------------------ xen/common/domain.c | 111 +++++++++++++++++++++++++++++++++++++++++ xen/include/asm-x86/domain.h | 3 - xen/include/xen/domain.h | 3 + xen/include/xen/sched.h | 3 + 5 files changed, 117 insertions(+), 116 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 14b6d13..d1b6c64 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -387,8 +387,6 @@ int vcpu_initialise(struct vcpu *v) vmce_init_vcpu(v); - v->arch.vcpu_info_mfn = INVALID_MFN; - if ( is_hvm_domain(d) ) { rc = hvm_vcpu_initialise(v); @@ -954,99 +952,6 @@ void arch_vcpu_reset(struct vcpu *v) } } -/* - * Unmap the vcpu info page if the guest decided to place it somewhere - * else. This is only used from arch_domain_destroy, so there''s no - * need to do anything clever. - */ -static void -unmap_vcpu_info(struct vcpu *v) -{ - unsigned long mfn; - - if ( v->arch.vcpu_info_mfn == INVALID_MFN ) - return; - - mfn = v->arch.vcpu_info_mfn; - unmap_domain_page_global(v->vcpu_info); - - v->vcpu_info = &dummy_vcpu_info; - v->arch.vcpu_info_mfn = INVALID_MFN; - - put_page_and_type(mfn_to_page(mfn)); -} - -/* - * Map a guest page in and point the vcpu_info pointer at it. This - * makes sure that the vcpu_info is always pointing at a valid piece - * of memory, and it sets a pending event to make sure that a pending - * event doesn''t get missed. - */ -static int -map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) -{ - struct domain *d = v->domain; - void *mapping; - vcpu_info_t *new_info; - struct page_info *page; - int i; - - if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) - return -EINVAL; - - if ( v->arch.vcpu_info_mfn != INVALID_MFN ) - return -EINVAL; - - /* Run this command on yourself or on other offline VCPUS. */ - if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) - return -EINVAL; - - page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); - if ( !page ) - return -EINVAL; - - if ( !get_page_type(page, PGT_writable_page) ) - { - put_page(page); - return -EINVAL; - } - - mapping = __map_domain_page_global(page); - if ( mapping == NULL ) - { - put_page_and_type(page); - return -ENOMEM; - } - - new_info = (vcpu_info_t *)(mapping + offset); - - if ( v->vcpu_info == &dummy_vcpu_info ) - { - memset(new_info, 0, sizeof(*new_info)); - __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; - } - else - { - memcpy(new_info, v->vcpu_info, sizeof(*new_info)); - } - - v->vcpu_info = new_info; - v->arch.vcpu_info_mfn = page_to_mfn(page); - - /* Set new vcpu_info pointer /before/ setting pending flags. */ - wmb(); - - /* - * Mark everything as being pending just to make sure nothing gets - * lost. The domain will get a spurious event, but it can cope. 
- */ - vcpu_info(v, evtchn_upcall_pending) = 1; - for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) - set_bit(i, &vcpu_info(v, evtchn_pending_sel)); - - return 0; -} - long arch_do_vcpu_op( int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) @@ -1083,22 +988,6 @@ arch_do_vcpu_op( break; } - case VCPUOP_register_vcpu_info: - { - struct domain *d = v->domain; - struct vcpu_register_vcpu_info info; - - rc = -EFAULT; - if ( copy_from_guest(&info, arg, 1) ) - break; - - domain_lock(d); - rc = map_vcpu_info(v, info.mfn, info.offset); - domain_unlock(d); - - break; - } - /* * XXX Disable for 4.0.0: __update_vcpu_system_time() writes to the given * virtual address even when running in another domain''s address space. @@ -2025,8 +1914,6 @@ int domain_relinquish_resources(struct domain *d) * mappings. */ destroy_gdt(v); - - unmap_vcpu_info(v); } if ( d->arch.pv_domain.pirq_eoi_map != NULL ) diff --git a/xen/common/domain.c b/xen/common/domain.c index ce45d66..d21909f 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -33,6 +33,7 @@ #include <xen/xenoprof.h> #include <xen/irq.h> #include <asm/debugger.h> +#include <asm/p2m.h> #include <asm/processor.h> #include <public/sched.h> #include <public/sysctl.h> @@ -142,6 +143,7 @@ struct vcpu *alloc_vcpu( v->vcpu_info = ((vcpu_id < XEN_LEGACY_MAX_VCPUS) ? (vcpu_info_t *)&shared_info(d, vcpu_info[vcpu_id]) : &dummy_vcpu_info); + v->vcpu_info_mfn = INVALID_MFN; init_waitqueue_vcpu(v); } @@ -547,6 +549,7 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d) int domain_kill(struct domain *d) { int rc = 0; + struct vcpu *v; if ( d == current->domain ) return -EINVAL; @@ -571,6 +574,8 @@ int domain_kill(struct domain *d) BUG_ON(rc != -EAGAIN); break; } + for_each_vcpu ( d, v ) + unmap_vcpu_info(v); d->is_dying = DOMDYING_dead; /* Mem event cleanup has to go here because the rings * have to be put before we call put_domain. */ @@ -896,6 +901,96 @@ void vcpu_reset(struct vcpu *v) vcpu_unpause(v); } +/* + * Map a guest page in and point the vcpu_info pointer at it. This + * makes sure that the vcpu_info is always pointing at a valid piece + * of memory, and it sets a pending event to make sure that a pending + * event doesn''t get missed. + */ +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) +{ + struct domain *d = v->domain; + void *mapping; + vcpu_info_t *new_info; + struct page_info *page; + int i; + + if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) + return -EINVAL; + + if ( v->vcpu_info_mfn != INVALID_MFN ) + return -EINVAL; + + /* Run this command on yourself or on other offline VCPUS. */ + if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) + return -EINVAL; + + page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); + if ( !page ) + return -EINVAL; + + if ( !get_page_type(page, PGT_writable_page) ) + { + put_page(page); + return -EINVAL; + } + + mapping = __map_domain_page_global(page); + if ( mapping == NULL ) + { + put_page_and_type(page); + return -ENOMEM; + } + + new_info = (vcpu_info_t *)(mapping + offset); + + if ( v->vcpu_info == &dummy_vcpu_info ) + { + memset(new_info, 0, sizeof(*new_info)); + __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; + } + else + { + memcpy(new_info, v->vcpu_info, sizeof(*new_info)); + } + + v->vcpu_info = new_info; + v->vcpu_info_mfn = page_to_mfn(page); + + /* Set new vcpu_info pointer /before/ setting pending flags. */ + wmb(); + + /* + * Mark everything as being pending just to make sure nothing gets + * lost. 
The domain will get a spurious event, but it can cope. + */ + vcpu_info(v, evtchn_upcall_pending) = 1; + for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) + set_bit(i, &vcpu_info(v, evtchn_pending_sel)); + + return 0; +} + +/* + * Unmap the vcpu info page if the guest decided to place it somewhere + * else. This is only used from arch_domain_destroy, so there''s no + * need to do anything clever. + */ +void unmap_vcpu_info(struct vcpu *v) +{ + unsigned long mfn; + + if ( v->vcpu_info_mfn == INVALID_MFN ) + return; + + mfn = v->vcpu_info_mfn; + unmap_domain_page_global(v->vcpu_info); + + v->vcpu_info = &dummy_vcpu_info; + v->vcpu_info_mfn = INVALID_MFN; + + put_page_and_type(mfn_to_page(mfn)); +} long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) { @@ -1015,6 +1110,22 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) break; + case VCPUOP_register_vcpu_info: + { + struct domain *d = v->domain; + struct vcpu_register_vcpu_info info; + + rc = -EFAULT; + if ( copy_from_guest(&info, arg, 1) ) + break; + + domain_lock(d); + rc = map_vcpu_info(v, info.mfn, info.offset); + domain_unlock(d); + + break; + } + #ifdef VCPU_TRAP_NMI case VCPUOP_send_nmi: if ( !guest_handle_is_null(arg) ) diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index bdaf714..e7ed7c3 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -435,9 +435,6 @@ struct arch_vcpu struct paging_vcpu paging; - /* Guest-specified relocation of vcpu_info. */ - unsigned long vcpu_info_mfn; - uint32_t gdbsx_vcpu_event; /* A secondary copy of the vcpu time info. */ diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h index d4ac50f..d0263a9 100644 --- a/xen/include/xen/domain.h +++ b/xen/include/xen/domain.h @@ -52,6 +52,9 @@ void free_pirq_struct(void *); int vcpu_initialise(struct vcpu *v); void vcpu_destroy(struct vcpu *v); +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset); +void unmap_vcpu_info(struct vcpu *v); + int arch_domain_create(struct domain *d, unsigned int domcr_flags); void arch_domain_destroy(struct domain *d); diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index beadc42..7e129d9 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -192,6 +192,9 @@ struct vcpu struct waitqueue_vcpu *waitqueue_vcpu; + /* Guest-specified relocation of vcpu_info. */ + unsigned long vcpu_info_mfn; + struct arch_vcpu arch; }; -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: keir@xen.org CC: JBeulich@suse.com Changes in v3: - do not export all the vcpu_op hypercalls to ARM guests, only VCPUOP_register_vcpu_info. --- xen/arch/arm/domain.c | 13 +++++++++++++ xen/arch/arm/traps.c | 1 + xen/include/asm-arm/hypercall.h | 3 +++ 3 files changed, 17 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index fee3790..a676441 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -10,6 +10,7 @@ * GNU General Public License for more details. */ #include <xen/config.h> +#include <xen/hypercall.h> #include <xen/init.h> #include <xen/lib.h> #include <xen/sched.h> @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) } } + +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) +{ + switch ( cmd ) + { + case VCPUOP_register_vcpu_info: + return do_vcpu_op(cmd, vcpuid, arg); + default: + return -EINVAL; + } +} + long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) { return -ENOSYS; diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index 733099a..d69231c 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -617,6 +617,7 @@ static arm_hypercall_t arm_hypercall_table[] = { HYPERCALL(sysctl, 2), HYPERCALL(hvm_op, 2), HYPERCALL(grant_table_op, 3), + HYPERCALL(restricted_vcpu_op, 3), }; #define __PSCI_cpu_suspend 0 diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h index 0833ec4..8ab0cc4 100644 --- a/xen/include/asm-arm/hypercall.h +++ b/xen/include/asm-arm/hypercall.h @@ -4,6 +4,9 @@ #include <public/domctl.h> /* for arch_do_domctl */ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); +#define __HYPERVISOR_restricted_vcpu_op __HYPERVISOR_vcpu_op +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg); + #endif /* __ASM_ARM_HYPERCALL_H__ */ /* * Local variables: -- 1.7.2.5
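For reference, the guest-visible interface is the same one x86 guests already
use: the guest hands Xen the frame and offset of the memory where it wants its
vcpu_info placed. A rough guest-side sketch follows; the hypercall stub,
vcpu_info_page array and gfn_of() helper are hypothetical names, only
VCPUOP_register_vcpu_info and struct vcpu_register_vcpu_info come from the
public headers.

    /* Guest-side sketch: relocate this vcpu's vcpu_info into guest memory.
     * Per map_vcpu_info() it must be called by the vcpu itself, or for a
     * vcpu that is still offline. */
    static int register_vcpu_info(unsigned int vcpu)
    {
        struct vcpu_register_vcpu_info info;
        struct vcpu_info *vi = &vcpu_info_page[vcpu];  /* page-backed array */

        info.mfn    = gfn_of(vi);                      /* frame holding vi  */
        info.offset = (unsigned long)vi & ~PAGE_MASK;  /* offset within it  */
        info.rsvd   = 0;

        return HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, vcpu, &info);
    }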
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 11/13] xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
If we need to inject an irq into a guest that is running on a different processor, we shouldn''t just enqueue the irq into the lr_pending and inflight lists and wait for something to interrupt the guest execution. Send an IPI to the target pcpu so that Xen can inject the new interrupt returning to guest. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/vgic.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 8912aab..547a2ab 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -664,6 +664,8 @@ out: spin_unlock_irqrestore(&v->arch.vgic.lock, flags); /* we have a new higher priority irq, inject it into the guest */ vcpu_unblock(v); + if ( v != current && v->is_running ) + smp_send_event_check_mask(cpumask_of(v->processor)); } /* -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 12/13] xen/arm: start the vtimer Xen timers on the processor they should be running on
The Xen physical timer emulator and virtual timer driver use two internal Xen timers: initialize them on the processor the vcpu is going to be running on, rather than the processor that it''s creating the vcpu. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/vtimer.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index 1cb365e..2444851 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -48,7 +48,7 @@ int vcpu_vtimer_init(struct vcpu *v) { struct vtimer *t = &v->arch.phys_timer; - init_timer(&t->timer, phys_timer_expired, t, smp_processor_id()); + init_timer(&t->timer, phys_timer_expired, t, v->processor); t->ctl = 0; t->offset = NOW(); t->cval = NOW(); @@ -56,7 +56,7 @@ int vcpu_vtimer_init(struct vcpu *v) t->v = v; t = &v->arch.virt_timer; - init_timer(&t->timer, virt_timer_expired, t, smp_processor_id()); + init_timer(&t->timer, virt_timer_expired, t, v->processor); t->ctl = 0; t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); t->cval = 0; -- 1.7.2.5
Stefano Stabellini
2013-Apr-24 19:07 UTC
[PATCH v3 13/13] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/vtimer.c | 18 ++++++++++++++---- xen/include/asm-arm/domain.h | 26 +++++++++++++++++--------- 2 files changed, 31 insertions(+), 13 deletions(-) diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index 2444851..49858e7 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -46,20 +46,30 @@ static void virt_timer_expired(void *data) int vcpu_vtimer_init(struct vcpu *v) { + struct vtimer_base *b = &v->domain->arch.phys_timer_base; struct vtimer *t = &v->arch.phys_timer; + if ( !b->offset ) + b->offset = NOW(); + if ( !b->cval ) + b->cval = NOW(); init_timer(&t->timer, phys_timer_expired, t, v->processor); t->ctl = 0; - t->offset = NOW(); - t->cval = NOW(); + t->offset = b->offset; + t->cval = b->cval; t->irq = 30; t->v = v; + b = &v->domain->arch.virt_timer_base; + if ( !b->offset ) + b->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); + if ( !b->cval ) + b->cval = 0; t = &v->arch.virt_timer; init_timer(&t->timer, virt_timer_expired, t, v->processor); t->ctl = 0; - t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); - t->cval = 0; + t->offset = b->offset; + t->cval = b->cval; t->irq = 27; t->v = v; diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index 3fa266c2..d001802 100644 --- a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -47,6 +47,20 @@ enum domain_type { #define is_pv64_domain(d) (0) #endif +struct vtimer { + struct vcpu *v; + int irq; + struct timer timer; + uint32_t ctl; + uint64_t offset; + uint64_t cval; +}; + +struct vtimer_base { + uint64_t offset; + uint64_t cval; +}; + struct arch_domain { #ifdef CONFIG_ARM_64 @@ -61,6 +75,9 @@ struct arch_domain uint32_t vpidr; register_t vmpidr; + struct vtimer_base phys_timer_base; + struct vtimer_base virt_timer_base; + struct { /* * Covers access to other members of this struct _except_ for @@ -91,15 +108,6 @@ struct arch_domain } __cacheline_aligned; -struct vtimer { - struct vcpu *v; - int irq; - struct timer timer; - uint32_t ctl; - uint64_t offset; - uint64_t cval; -}; - struct arch_vcpu { struct { -- 1.7.2.5
>>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits)
> +{
> +    cpumask_t mask;
> +    int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits;

min(nr_bits, nr_cpumask_bits)

> +
> +    memset(&mask, 0x00, sizeof(mask));

bitmap_zero().

> +    bitmap_copy(mask.bits, bits, len);

Hard tab.

> +
> +    return mask;
> +}

And most importantly: Why? This isn't an operation that should
commonly be done, and hence having a utility function for this
seems to invite for abuse rather than really help.

Jan
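Folded together, the helper with these style comments applied might read as
follows (a sketch only, using min_t() to avoid a type mismatch between nr_bits
and nr_cpumask_bits; the "why" question above still stands):

    static inline cpumask_t cpumask_from_bitmap(const unsigned long *bits,
                                                unsigned int nr_bits)
    {
        cpumask_t mask;

        bitmap_zero(cpumask_bits(&mask), nr_cpumask_bits);
        bitmap_copy(cpumask_bits(&mask), bits,
                    min_t(unsigned int, nr_bits, nr_cpumask_bits));

        return mask;
    }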
Ian Campbell
2013-Apr-25 09:53 UTC
Re: [PATCH v3 01/13] xen/arm: basic PSCI support, implement cpu_on and cpu_off
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Implement support for ARM Power State Coordination Interface, PSCI in > short. > Support HVC and SMC calls. > > Changes in v3: > - move do_psci_* to psci.c; > - trap SMC; > - return PSCI error codes; > - remove Linux boot procotol from secondary cpus. > > Changes in v2: > - set is_initialised in arch_set_info_guest; > - zero vcpu_guest_context before using it. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > xen/arch/arm/Makefile | 1 + > xen/arch/arm/domain.c | 2 + > xen/arch/arm/psci.c | 87 +++++++++++++++++++++++++++++++++++++++ > xen/arch/arm/traps.c | 47 ++++++++++++++++++++- > xen/include/asm-arm/processor.h | 1 + > xen/include/asm-arm/psci.h | 24 +++++++++++ > 6 files changed, 161 insertions(+), 1 deletions(-) > create mode 100644 xen/arch/arm/psci.c > create mode 100644 xen/include/asm-arm/psci.h > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > index 2106a4f..8f75044 100644 > --- a/xen/arch/arm/Makefile > +++ b/xen/arch/arm/Makefile > @@ -5,6 +5,7 @@ subdir-y += platforms > obj-y += early_printk.o > obj-y += cpu.o > obj-y += domain.o > +obj-y += psci.o > obj-y += domctl.o > obj-y += sysctl.o > obj-y += domain_build.o > > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > new file mode 100644 > index 0000000..28153ad > --- /dev/null > +++ b/xen/arch/arm/psci.c > @@ -0,0 +1,87 @@ > +/* > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + */ > + > +#include <xen/errno.h> > +#include <xen/sched.h> > +#include <xen/types.h> > + > +#include <asm/current.h> > +#include <asm/psci.h> > + > +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point) > +{ > + struct vcpu *v; > + struct domain *d = current->domain; > + struct vcpu_guest_context *ctxt; > + int rc; > + > + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) > + return PSCI_EINVAL; > + > + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) > + return PSCI_EINVAL; > + > + if ( !v->is_initialised ) {If this is true then we eventually return success, should this be returning EINVAL or DENIED or something? 
[...]> +int do_psci_migrate(uint32_t vcpuid) > +{ > + return PSCI_ENOSYS; > +}Is this required or should we just not expose it in device tree?> + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) > + return PSCI_DENIED; > + > + memset(ctxt, 0, sizeof(*ctxt)); > + ctxt->user_regs.pc32 = entry_point; > + ctxt->sctlr = SCTLR_BASE; > + ctxt->ttbr0 = 0; > + ctxt->ttbr1 = 0; > + ctxt->ttbcr = 0; /* Defined Reset Value */ > + ctxt->user_regs.cpsr = PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC;This value is repeated in a couple of places now, perhaps we should have a ctxt_init which sets up our vcpus defined "reset" values all in one place?> + ctxt->flags = VGCF_online; > + > + domain_lock(d); > + rc = arch_set_info_guest(v, ctxt); > + free_vcpu_guest_context(ctxt); > + > + if ( rc < 0 ) > + { > + domain_unlock(d); > + return PSCI_DENIED; > + } > + domain_unlock(d); > + } > + > + clear_bit(_VPF_down, &v->pause_flags);ctxt->flags = VGCF_online would cause this to happen, I''d rather do that than repeat the logic here.> @@ -647,6 +673,20 @@ static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) > } > } > > +static void do_trap_psci(struct cpu_user_regs *regs) > +{ > + arm_psci_fn_t psci_call = NULL; > + > + if ( regs->r0 >= ARRAY_SIZE(arm_psci_table) ) > + { > + regs->r0 = -ENOSYS; > + return; > + } > + > + psci_call = arm_psci_table[regs->r0].fn;Might be NULL? Since Xen passes the valid PSCI numbers to the guest I think it should be illegal to call anything else, so rather than ENOSYS calling an unimplemented slots should result in domain_crash_synchronous. It seems like the general plan is for various firmware interfaces to use iss==0 and to communicate the valid r0 numbers via device tree, but I don''t think we can assume that all those different interfaces will have the same ENOSYS like value.> + regs->r0 = psci_call(regs->r1, regs->r2); > +} > + > static void do_trap_hypercall(struct cpu_user_regs *regs, unsigned long iss) > { > arm_hypercall_fn_t call = NULL; > @@ -959,8 +999,13 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs) > case HSR_EC_HVC: > if ( (hsr.iss & 0xff00) == 0xff00 ) > return do_debug_trap(regs, hsr.iss & 0x00ff); > + if ( hsr.iss == 0 ) > + return do_trap_psci(regs); > do_trap_hypercall(regs, hsr.iss);This is a pre-existing issue but this handles all no-zero iss as a Xen hypercall, I think we should switch ( hsr.iss ) case XEN_PSCI_TAG: ... case XEN_HYPERCALL_TAG: ... default: domain_crash_synchrous and pull the domain crash out of do_trap_hypercall. 
Maybe XEN_PSCI_TAG isn''t the right name if other interfaces are going to use it, but it''s only an internal name anyway.> break; > + case HSR_EC_SMC: > + do_trap_psci(regs); > + break; > case HSR_EC_DATA_ABORT_GUEST: > do_trap_data_abort_guest(regs, hsr.dabt); > break; > diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h > index 1681ebf..72f7f6b 100644 > --- a/xen/include/asm-arm/processor.h > +++ b/xen/include/asm-arm/processor.h > @@ -85,6 +85,7 @@ > #define HSR_EC_CP14_64 0x0c > #define HSR_EC_SVC 0x11 > #define HSR_EC_HVC 0x12 > +#define HSR_EC_SMC 0x13 > #define HSR_EC_INSTR_ABORT_GUEST 0x20 > #define HSR_EC_INSTR_ABORT_HYP 0x21 > #define HSR_EC_DATA_ABORT_GUEST 0x24 > diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h > new file mode 100644 > index 0000000..8511eb1 > --- /dev/null > +++ b/xen/include/asm-arm/psci.h > @@ -0,0 +1,24 @@ > +#ifndef __ASM_PSCI_H__ > +#define __ASM_PSCI_H__ > + > +#define PSCI_SUCCESS 0 > +#define PSCI_ENOSYS -1 > +#define PSCI_EINVAL -2 > +#define PSCI_DENIED -3 > + > +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point); > +int do_psci_cpu_off(uint32_t power_state); > +int do_psci_cpu_suspend(uint32_t power_state, uint32_t entry_point); > +int do_psci_migrate(uint32_t vcpuid); > + > +#endif /* __ASM_PSCI_H__ */ > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */
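The suggested dispatch might end up looking something like the sketch below.
XEN_PSCI_TAG is only a stand-in name for the iss==0 convention proposed here;
XEN_HYPERCALL_TAG is assumed to be the hypercall immediate the ABI already
defines, and the existing debug-trap special case is kept:

    case HSR_EC_HVC:
        if ( (hsr.iss & 0xff00) == 0xff00 )
            return do_debug_trap(regs, hsr.iss & 0x00ff);
        switch ( hsr.iss )
        {
        case XEN_PSCI_TAG:       /* hvc #0: PSCI/firmware interface */
            do_trap_psci(regs);
            break;
        case XEN_HYPERCALL_TAG:  /* hvc #XEN_HYPERCALL_TAG: Xen hypercall */
            do_trap_hypercall(regs, hsr.iss);
            break;
        default:                 /* unknown immediate: kill the guest */
            domain_crash_synchronous();
        }
        break;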
Ian Campbell
2013-Apr-25 10:01 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 2013-04-25 at 10:24 +0100, Jan Beulich wrote:
> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits)
> > +{
> > +    cpumask_t mask;
> > +    int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits;
>
> min(nr_bits, nr_cpumask_bits)
>
> > +
> > +    memset(&mask, 0x00, sizeof(mask));
>
> bitmap_zero().
>
> > +    bitmap_copy(mask.bits, bits, len);
>
> Hard tab.
>
> > +
> > +    return mask;
> > +}
>
> And most importantly: Why? This isn't an operation that should
> commonly be done, and hence having a utility function for this
> seems to invite for abuse rather than really help.

I suggested it, we are emulating an instruction which provides a bitmap
of the CPUs to operate on. It seemed nicer to me to have a helper rather
than to open-code accesses under the hood of the cpumask interface.

Ian.
Stefano Stabellini
2013-Apr-25 10:01 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 25 Apr 2013, Jan Beulich wrote:
> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits)
> > +{
> > +    cpumask_t mask;
> > +    int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits;
>
> min(nr_bits, nr_cpumask_bits)
>
> > +
> > +    memset(&mask, 0x00, sizeof(mask));
>
> bitmap_zero().
>
> > +    bitmap_copy(mask.bits, bits, len);
>
> Hard tab.
>
> > +
> > +    return mask;
> > +}
>
> And most importantly: Why? This isn't an operation that should
> commonly be done, and hence having a utility function for this
> seems to invite for abuse rather than really help.

TBH I have done it to address Ian's comment, I don't have a strong
opinion on this.
However it is true that from an API point of view, cpumask_from_bitmap
allows us to cover the new use case without breaking the cpumask
abstraction.
Ian Campbell
2013-Apr-25 10:08 UTC
Re: [PATCH v3 02/13] xen/arm: allocate secondaries dom0 vcpus
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  xen/arch/arm/domain_build.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index a6d8e9d..5dfa592 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -363,7 +363,7 @@ static void dtb_load(struct kernel_info *kinfo)
>  int construct_dom0(struct domain *d)
>  {
>      struct kernel_info kinfo = {};
> -    int rc;
> +    int rc, i, cpu = 0;

I don't think you need to init cpu here.

>      struct vcpu *v = d->vcpu[0];
>      struct cpu_user_regs *regs = &v->arch.cpu_info->guest_cpu_user_regs;
> @@ -451,6 +451,12 @@ int construct_dom0(struct domain *d)
>      }
>  #endif
>
> +    for ( i = 1; i < d->max_vcpus; i++ )
> +    {
> +        cpu = cpumask_cycle(cpu, &cpu_online_map);
> +        (void)alloc_vcpu(d, i, cpu);

Some sort of print on the first failure would be nice, or to be honest
looking at the sorts of failure modes alloc_vcpu has just bailing on the
whole thing might be best.

> +    }
> +
>      local_abort_enable();

Aside: I wonder what led this to get put here. It doesn't happen on
non-boot CPUs for one thing!

Ian.
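Something along these lines, perhaps (a sketch only, assuming it is acceptable
for construct_dom0() to fail outright at this point):

    for ( i = 1; i < d->max_vcpus; i++ )
    {
        cpu = cpumask_cycle(cpu, &cpu_online_map);
        if ( alloc_vcpu(d, i, cpu) == NULL )
        {
            printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu);
            return -EINVAL;
        }
    }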
>>> On 25.04.13 at 12:01, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
wrote:
> On Thu, 25 Apr 2013, Jan Beulich wrote:
>> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> wrote:
>> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int
> nr_bits)
>> > +{
>> > +    cpumask_t mask;
>> > +    int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits;
>>
>> min(nr_bits, nr_cpumask_bits)
>>
>> > +
>> > +    memset(&mask, 0x00, sizeof(mask));
>>
>> bitmap_zero().
>>
>> > +    bitmap_copy(mask.bits, bits, len);
>>
>> Hard tab.
>>
>> > +
>> > +    return mask;
>> > +}
>>
>> And most importantly: Why? This isn't an operation that should
>> commonly be done, and hence having a utility function for this
>> seems to invite for abuse rather than really help.
>
> TBH I have done it to address Ian's comment, I don't have a strong
> opinion on this.
> However it is true that from an API point of view, cpumask_from_bitmap
> allows us to cover the new use case without breaking the cpumask
> abstraction.

Rather than adding a new abstraction that's used for a single special
case, and if open coding is undesirable, I'd prefer if you used
bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask() if at all
possible.

Jan
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right > guest vcpu. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Changes in v3: > - make use of cpumask_from_bitmap. > --- > xen/arch/arm/vgic.c | 55 ++++++++++++++++++++++++++++++++++++++++---- > xen/include/asm-arm/gic.h | 3 ++ > 2 files changed, 53 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index b30da78..8912aab 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -370,6 +370,7 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) > > static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > { > + struct domain *d = v->domain; > struct hsr_dabt dabt = info->dabt; > struct cpu_user_regs *regs = guest_cpu_user_regs(); > register_t *r = select_user_reg(regs, dabt.reg); > @@ -498,11 +499,55 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > goto write_ignore; > > case GICD_SGIR: > - if ( dabt.size != 2 ) goto bad_width; > - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", > - *r, gicd_reg - GICD_ICFGR); > - return 0; > - > + {There seems to be enough meat here to be worth a separate function e.g. vgic_to_sgi(*r); only the dabt.size check need remain.> + cpumask_t vcpu_mask; > + int virtual_irq; > + int filter; > + int vcpuid; > + struct vcpu *vt; > + int i; > + unsigned long bits; > + > + if ( dabt.size != 2 ) goto bad_width; > + > + filter = (*r & GICD_SGI_TARGET_LIST_MASK); > + virtual_irq = (*r & GICD_SGI_INTID_MASK); > + > + cpumask_clear(&vcpu_mask); > + switch ( filter ) > + { > + case GICD_SGI_TARGET_LIST: > + bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; > + vcpu_mask = cpumask_from_bitmap(&bits, 8); > + break; > + case GICD_SGI_TARGET_OTHERS: > + for ( i = 0; i < d->max_vcpus; i++ ) > + { > + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) > + cpumask_set_cpu(i, &vcpu_mask); > + } > + case GICD_SGI_TARGET_SELF: > + vcpu_mask = *cpumask_of(current->vcpu_id); > + break; > + default: > + printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r);We''ve not been very good at this sort of thing, but this guest triggerable print should be gdprintk. Likewise the one below.> + return 0; > + } > + > + for_each_cpu( vcpuid, &vcpu_mask ) > + { > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL || > + virtual_irq >= 16 )This virtual_irq check could be outside the loop I think. Is it actually illegal (UNDEFINED etc) to send an SGI to a non-existent processor? Actually not even undefined, just one which hasn''t been PSCI''d yet. Perhaps this could be silently ignored?> + { > + printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n", > + *r, *cpumask_bits(&vcpu_mask)); > + return 0; > + } > + vgic_vcpu_inject_irq(vt, virtual_irq, 1); > + } > + return 1; > + } > + > case GICD_CPENDSGIR ... 
GICD_CPENDSGIRN: > if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; > printk("vGICD: unhandled %s write %#"PRIregister" to ICPENDSGIR%d\n", > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > index 92711d5..4f2c8b8 100644 > --- a/xen/include/asm-arm/gic.h > +++ b/xen/include/asm-arm/gic.h > @@ -51,12 +51,15 @@ > #define GICD_SPENDSGIRN (0xF2C/4) > #define GICD_ICPIDR2 (0xFE8/4) > > +#define GICD_SGI_TARGET_LIST_SHIFT (24) > +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) > #define GICD_SGI_TARGET_LIST (0UL<<24) > #define GICD_SGI_TARGET_OTHERS (1UL<<24) > #define GICD_SGI_TARGET_SELF (2UL<<24)These hardcoded 24s should become GICD_SGI_TARGET_LIST_SHIFT since you''ve gone to the trouble of defining it.> #define GICD_SGI_TARGET_SHIFT (16) > #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) > #define GICD_SGI_GROUP1 (1UL<<15) > +#define GICD_SGI_INTID_MASK (0xFUL) > > #define GICC_CTLR (0x0000/4) > #define GICC_PMR (0x0004/4)
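A sketch of that factoring follows: everything but the width check moves into a
vgic_to_sgi() helper (the name is only the one suggested above). The body is
the v3 logic from this patch, with gdprintk for the guest-triggerable message
and with an explicit break after the TARGET_OTHERS loop, which in the hunk
above otherwise falls through into the SELF case:

    static int vgic_to_sgi(struct vcpu *v, register_t sgir)
    {
        struct domain *d = v->domain;
        int virtual_irq = sgir & GICD_SGI_INTID_MASK;
        int filter = sgir & GICD_SGI_TARGET_LIST_MASK;
        unsigned long bits;
        cpumask_t vcpu_mask;
        struct vcpu *vt;
        int i, vcpuid;

        cpumask_clear(&vcpu_mask);
        switch ( filter )
        {
        case GICD_SGI_TARGET_LIST:
            bits = (sgir & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT;
            vcpu_mask = cpumask_from_bitmap(&bits, 8);
            break;
        case GICD_SGI_TARGET_OTHERS:
            for ( i = 0; i < d->max_vcpus; i++ )
                if ( i != v->vcpu_id && d->vcpu[i] != NULL )
                    cpumask_set_cpu(i, &vcpu_mask);
            break;
        case GICD_SGI_TARGET_SELF:
            cpumask_set_cpu(v->vcpu_id, &vcpu_mask);
            break;
        default:
            gdprintk(XENLOG_WARNING,
                     "unhandled GICD_SGIR write %#"PRIregister
                     " with bad TargetListFilter\n", sgir);
            return 0;
        }

        for_each_cpu( vcpuid, &vcpu_mask )
        {
            if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ||
                 virtual_irq >= 16 )
                return 0;
            vgic_vcpu_inject_irq(vt, virtual_irq, 1);
        }
        return 1;
    }

    /* ... and in vgic_distr_mmio_write(): */
    case GICD_SGIR:
        if ( dabt.size != 2 ) goto bad_width;
        return vgic_to_sgi(v, *r);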
Ian Campbell
2013-Apr-25 10:27 UTC
Re: [PATCH v3 05/13] xen/arm: early_ioremap: allocate virtual addresses from top to bottom
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: ... because ... (I know now, but other people don''t, and I may not remember in six months time)> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > xen/arch/arm/mm.c | 13 ++++++++----- > 1 files changed, 8 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index ba3140d..bd7baaf 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -433,15 +433,17 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) > */ > void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) > { > - static unsigned long virt_start = EARLY_VMAP_VIRT_START; > - unsigned long ret_addr = virt_start; > + static unsigned long virt_start = EARLY_VMAP_VIRT_END; > paddr_t end = start + len; > > + len = ((len + SECOND_SIZE - 1) >> SECOND_SHIFT) << SECOND_SHIFT;These shifts are equivalent to & ~SECOND_MASK. Or you could use DIV_ROUND_UP + one shift.> + virt_start -= len; > + > ASSERT(!(start & (~SECOND_MASK))); > ASSERT(!(virt_start & (~SECOND_MASK))); > > /* The range we need to map is too big */ > - if ( virt_start + len >= EARLY_VMAP_VIRT_END ) > + if ( virt_start >= EARLY_VMAP_VIRT_START )Needs to be < doesn''t it?> return NULL; > > while ( start < end ) > @@ -453,9 +455,10 @@ void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) > start += SECOND_SIZE; > virt_start += SECOND_SIZE; > } > - flush_xen_data_tlb_range_va(ret_addr, len); > + virt_start -= len;You''ve done this twice now (once right after my comment about the shifts). Perhaps this is because of the virt_start manipulations inside the loop, but in that case I think that loop needs rewriting with its own variable etc etc, having virt_start go up and down like a yo-yo in this function is just begging to confuse people. Actually, if you drop the first virt_start -= len and make the range check if ( virt_start - len < EARLY.. ) then you can invert the loop to map backwards and decrement virt_start and end as you go along. That would work too IMHO (at least so far as I have one when just imagining what it would look like).> + flush_xen_data_tlb_range_va(virt_start, len); > > - return (void*)ret_addr; > + return (void*)virt_start; > } > > enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
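The restructuring suggested above (check for exhaustion before moving the cursor, then map downwards) can be modelled with a trivial standalone bump allocator. The start and end addresses below are made up and nothing is actually mapped; it only demonstrates the 2MB round-up and the inverted range check.

/* Top-down allocation over a fixed virtual range; illustrative only. */
#include <stdio.h>
#include <stddef.h>

#define SECOND_SHIFT          21                    /* 2MB sections */
#define SECOND_SIZE           (1UL << SECOND_SHIFT)
#define EARLY_VMAP_VIRT_START 0x10000000UL          /* hypothetical */
#define EARLY_VMAP_VIRT_END   0x10800000UL          /* hypothetical: 8MB range */

static unsigned long virt_start = EARLY_VMAP_VIRT_END;

static void *early_alloc(size_t len)
{
    /* Round len up to a whole number of 2MB sections. */
    len = ((len + SECOND_SIZE - 1) >> SECOND_SHIFT) << SECOND_SHIFT;

    /* Check for exhaustion before moving the cursor (a real version
     * would also guard against len > virt_start wrapping around). */
    if ( virt_start - len < EARLY_VMAP_VIRT_START )
        return NULL;

    virt_start -= len;
    return (void *)virt_start;
}

int main(void)
{
    printf("%p\n", early_alloc(1));                  /* one section     */
    printf("%p\n", early_alloc(3 * SECOND_SIZE));    /* three sections  */
    printf("%p\n", early_alloc(16 * SECOND_SIZE));   /* too big -> NULL */
    return 0;
}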
Ian Campbell
2013-Apr-25 10:29 UTC
Re: [PATCH v3 06/13] xen/arm: implement arch_vmap_virt_end
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Move virt_start out of ioremap and rename it to early_vmap_start. > Implement arch_vmap_virt_end by returning early_vmap_start. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > xen/arch/arm/mm.c | 24 +++++++++++++++--------- > 1 files changed, 15 insertions(+), 9 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index bd7baaf..4d4556b 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -431,34 +431,40 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) > * start has to be 2MB aligned. > * len has to be < EARLY_VMAP_VIRT_END - EARLY_VMAP_VIRT_START. > */ > +static unsigned long early_vmap_start = EARLY_VMAP_VIRT_END;__initdata I think?> void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) > { > - static unsigned long virt_start = EARLY_VMAP_VIRT_END; > paddr_t end = start + len; > + unsigned long map_start; > > len = ((len + SECOND_SIZE - 1) >> SECOND_SHIFT) << SECOND_SHIFT; > - virt_start -= len; > + early_vmap_start -= len; > > ASSERT(!(start & (~SECOND_MASK))); > - ASSERT(!(virt_start & (~SECOND_MASK))); > + ASSERT(!(early_vmap_start & (~SECOND_MASK))); > > /* The range we need to map is too big */ > - if ( virt_start >= EARLY_VMAP_VIRT_START ) > + if ( early_vmap_start >= EARLY_VMAP_VIRT_START ) > return NULL; > > + map_start = early_vmap_start; > while ( start < end ) > { > lpae_t e = mfn_to_xen_entry(start >> PAGE_SHIFT); > e.pt.ai = attributes; > - write_pte(xen_second + second_table_offset(virt_start), e); > + write_pte(xen_second + second_table_offset(map_start), e); > > start += SECOND_SIZE; > - virt_start += SECOND_SIZE; > + map_start += SECOND_SIZE; > } > - virt_start -= len; > - flush_xen_data_tlb_range_va(virt_start, len); > + flush_xen_data_tlb_range_va(early_vmap_start, len); > > - return (void*)virt_start; > + return (void*)early_vmap_start;Hrm, you may have just implemented my comments on the previous patch. I think you could perhaps just fold the two and together with the improved changelog I asked for it would be obvious and natural to reverse the mapping and implement arch_vmap_virt_end in one go.> +} > + > +void *__init arch_vmap_virt_end(void) > +{ > + return (void *)early_vmap_start; > } > > enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
Stefano Stabellini
2013-Apr-25 10:35 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 25 Apr 2013, Jan Beulich wrote:> >>> On 25.04.13 at 12:01, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > wrote: > > On Thu, 25 Apr 2013, Jan Beulich wrote: > >> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > wrote: > >> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int > > nr_bits) > >> > +{ > >> > + cpumask_t mask; > >> > + int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits; > >> > >> min(nr_bits, nr_cpumask_bits) > >> > >> > + > >> > + memset(&mask, 0x00, sizeof(mask)); > >> > >> bitmap_zero(). > >> > >> > + bitmap_copy(mask.bits, bits, len); > >> > >> Hard tab. > >> > >> > + > >> > + return mask; > >> > +} > >> > >> And most importantly: Why? This isn''t an operation that should > >> commonly be done, and hence having a utility function for this > >> seems to invite for abuse rather than really help. > > > > TBH I have done it to address Ian''s comment, I don''t have a strong > > opinion on this. > > However it is true that from an API point of view, cpumask_from_bitmap > > allows us to cover the new use case without breaking the cpumask > > abstraction. > > Rather than adding a new abstraction that''s used for a single > special case, and if open coding is undesirable, I''d prefer if you > used bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask() > if at all possible.Both functions do copy_to/from_guest and use xmalloc, I don''t think I can use them.
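For readers following the API debate, the helper under discussion amounts to little more than the following self-contained model (made-up sizes, plain libc, not the proposed Xen patch):

/* Minimal model of cpumask_from_bitmap(): copy at most NR_CPUS bits from
 * a raw bitmap into a fixed-size mask. Illustrative only. */
#include <stdio.h>
#include <string.h>

#define NR_CPUS        8
#define BITS_PER_LONG  ((int)(8 * sizeof(unsigned long)))

typedef struct { unsigned long bits[1]; } cpumask_t;

static cpumask_t cpumask_from_bitmap(const unsigned long *bits, int nr_bits)
{
    cpumask_t mask;
    int len = nr_bits < NR_CPUS ? nr_bits : NR_CPUS;   /* i.e. min() */
    int i;

    memset(&mask, 0, sizeof(mask));                    /* bitmap_zero() */
    for ( i = 0; i < len; i++ )                        /* bitmap_copy() */
        if ( bits[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)) )
            mask.bits[0] |= 1UL << i;

    return mask;
}

int main(void)
{
    unsigned long raw = 0xA5;   /* e.g. a GICD_SGIR CPUTargetList byte */
    cpumask_t m = cpumask_from_bitmap(&raw, 8);

    printf("mask = %#lx\n", m.bits[0]);
    return 0;
}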
Ian Campbell
2013-Apr-25 10:41 UTC
Re: [PATCH v3 07/13] xen/arm: compile and initialize vmap
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > VMAP_VIRT_END and VMAP_VIRT_START. > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM).So our vmap is 2MB mappings only? I suppose that''s ok, at least for now.> Implement map_pages_to_xen and destroy_xen_mappings.This involves moving the prototypes from x86 to generic, so needs Jan + Keir, CCd.> > Call vm_init from start_xen. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>> +static int create_xen_table(lpae_t *entry) > +{ > + void *p; > + lpae_t pte; > + > + p = alloc_xenheap_pages(0, 0); > + if ( p == NULL ) > + return -ENOMEM; > + clear_page(p); > + pte = mfn_to_xen_entry(virt_to_mfn(p)); > + pte.pt.table = 1; > + write_pte(entry, pte); > + return 0; > +} > + > +enum xenmap_operation { > + INSERT, > + REMOVE > +}; > + > +static int create_xen_entries(enum xenmap_operation op, > + unsigned long virt, > + unsigned long mfn, > + unsigned long nr_mfns)Shame this can''t be combined with create_p2m_entries, but that uses domain pages and this uses xenheap pages.> +{ > + int rc; > + unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE; > + lpae_t pte; > + lpae_t *third = NULL; > + > + for(; addr < addr_end; addr += PAGE_SIZE, mfn++) > + { > + if ( !xen_second[second_linear_offset(addr)].pt.valid || > + !xen_second[second_linear_offset(addr)].pt.table ) > + { > + rc = create_xen_table(&xen_second[second_linear_offset(addr)]); > + if ( rc < 0 ) { > + printk("create_xen_entries: L2 failed\n"); > + goto out; > + } > + } > + > + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); > + > + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base > + << PAGE_SHIFT); > + if ( third[third_table_offset(addr)].pt.valid ) > + flush_tlb_local();Why this flush? (I notice create_p2m_mapping does the same but with _all_local()) Isn''t it a bug for the third to be already mapped? that suggests something is overwriting the mapping, does vmap do that?> + > + switch ( op ) { > + case INSERT: > + pte = mfn_to_xen_entry(mfn); > + pte.pt.table = 1; > + write_pte(&third[third_table_offset(addr)], pte); > + break; > + case REMOVE: > + memset(&pte, 0x00, sizeof(pte));AKA: pte.bits = 0;> + write_pte(&third[third_table_offset(addr)], pte); > + break; > + default: > + printk("create_xen_entries: invalid op\n");ASSERT? This really can never happen.> + break; > + } > + } > + flush_xen_data_tlb_range_va(virt, PAGE_SIZE * nr_mfns); > + > + rc = 0; > + > +out: > + return rc; > +} > +[...] diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h> index 14e63eb..5287a92 100644 > --- a/xen/include/asm-arm/page.h > +++ b/xen/include/asm-arm/page.h > @@ -58,6 +58,9 @@ > #define DEV_WC BUFFERABLE > #define DEV_CACHED WRITEBACK > > +#define PAGE_HYPERVISOR (MATTR_MEM) > +#define MAP_SMALL_PAGES PAGE_HYPERVISOR > + > /* > * Stage 2 Memory Type. > * > diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h > index b2f3859..e53e1e5 100644 > --- a/xen/include/asm-x86/page.h > +++ b/xen/include/asm-x86/page.h > @@ -338,14 +338,6 @@ l3_pgentry_t *virt_to_xen_l3e(unsigned long v); > > extern void set_pdx_range(unsigned long smfn, unsigned long emfn); > > -/* Map machine page range in Xen virtual address space. 
*/ > -int map_pages_to_xen( > - unsigned long virt, > - unsigned long mfn, > - unsigned long nr_mfns, > - unsigned int flags); > -void destroy_xen_mappings(unsigned long v, unsigned long e); > - > /* Convert between PAT/PCD/PWT embedded in PTE flags and 3-bit cacheattr. */ > static inline uint32_t pte_flags_to_cacheattr(uint32_t flags) > { > diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h > index 28512fb..efc45c7 100644 > --- a/xen/include/xen/mm.h > +++ b/xen/include/xen/mm.h > @@ -48,6 +48,13 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags); > void free_xenheap_pages(void *v, unsigned int order); > #define alloc_xenheap_page() (alloc_xenheap_pages(0,0)) > #define free_xenheap_page(v) (free_xenheap_pages(v,0)) > +/* Map machine page range in Xen virtual address space. */ > +int map_pages_to_xen( > + unsigned long virt, > + unsigned long mfn, > + unsigned long nr_mfns, > + unsigned int flags); > +void destroy_xen_mappings(unsigned long v, unsigned long e); > > /* Claim handling */ > unsigned long domain_adjust_tot_pages(struct domain *d, long pages);
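The INSERT/REMOVE switch in create_xen_entries() reduces to the loop shape below, shown here as a standalone model over a flat array rather than real LPAE tables, so the "pte.bits = 0" suggestion for REMOVE is visible in isolation. None of this is Xen code.

/* Walk a flat array of 64-bit "ptes" one 4KB page at a time, writing a
 * fresh entry for INSERT and a zeroed one for REMOVE. Illustrative only. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define NR_ENTRIES 512
#define PTE_VALID  (1ULL << 0)
#define PTE_TABLE  (1ULL << 1)

enum xenmap_operation { INSERT, REMOVE };

static uint64_t third[NR_ENTRIES];          /* stands in for a level-3 table */

static void create_entries(enum xenmap_operation op, unsigned long virt,
                           unsigned long mfn, unsigned long nr_mfns)
{
    unsigned long addr, end = virt + nr_mfns * PAGE_SIZE;

    for ( addr = virt; addr < end; addr += PAGE_SIZE, mfn++ )
    {
        unsigned long idx = (addr >> PAGE_SHIFT) & (NR_ENTRIES - 1);
        uint64_t pte = 0;                   /* REMOVE case: pte.bits = 0 */

        if ( op == INSERT )
            pte = ((uint64_t)mfn << PAGE_SHIFT) | PTE_TABLE | PTE_VALID;

        third[idx] = pte;                   /* write_pte() in the real code */
    }
}

int main(void)
{
    create_entries(INSERT, 0, 0x1234, 4);
    printf("entry 0 after INSERT: %#llx\n", (unsigned long long)third[0]);
    create_entries(REMOVE, 0, 0, 4);
    printf("entry 0 after REMOVE: %#llx\n", (unsigned long long)third[0]);
    return 0;
}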
Ian Campbell
2013-Apr-25 10:43 UTC
Re: [PATCH v3 08/13] xen/arm: implement map_domain_page_global and unmap_domain_page_global
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> The implementation uses vmap and vunmap.It seems a little bit too good to be true ;-) I''ve CCd Jan as vmap author to make sure we''ve not missed something!> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com> Although since Jan wants x86 to eventually use it this seems like it could be implemented in xen/domain_page.h with a CONFIG_DOMAIN_PAGE_VMAP. Up to Jan...> > Changes in v3: > - use vmap/vunmap to implement un/map_domain_page_global. > --- > xen/arch/arm/mm.c | 11 +++++++++++ > 1 files changed, 11 insertions(+), 0 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 240904e..6f561f6 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -36,6 +36,7 @@ > #include <asm/flushtlb.h> > #include <public/memory.h> > #include <xen/sched.h> > +#include <xen/vmap.h> > #include <xsm/xsm.h> > > struct domain *dom_xen, *dom_io, *dom_cow; > @@ -131,6 +132,16 @@ void clear_fixmap(unsigned map) > flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE); > } > > +void *map_domain_page_global(unsigned long mfn) > +{ > + return vmap(&mfn, 1); > +} > + > +void unmap_domain_page_global(const void *va) > +{ > + vunmap(va); > +} > + > /* Map a page of domheap memory */ > void *map_domain_page(unsigned long mfn) > {
Ian Campbell
2013-Apr-25 10:45 UTC
Re: [PATCH v3 09/13] xen: move VCPUOP_register_vcpu_info to common code
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Move the implementation of VCPUOP_register_vcpu_info from x86 specific > to commmon code. > > Move vcpu_info_mfn from an arch specific vcpu sub-field to the common > vcpu struct. > Move the initialization of vcpu_info_mfn to common code. > > Move unmap_vcpu_info and the call to unmap_vcpu_info at domain > destruction time to common code. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: keir@xen.org > CC: JBeulich@suse.comMy ack isn''t worth much here, but FWIW looks good to me, assuming the bulk of it really is just code motion... Acked-by: Ian Campbell <ian.campbell@citrix.com>> --- > xen/arch/x86/domain.c | 113 ------------------------------------------ > xen/common/domain.c | 111 +++++++++++++++++++++++++++++++++++++++++ > xen/include/asm-x86/domain.h | 3 - > xen/include/xen/domain.h | 3 + > xen/include/xen/sched.h | 3 + > 5 files changed, 117 insertions(+), 116 deletions(-) > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 14b6d13..d1b6c64 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -387,8 +387,6 @@ int vcpu_initialise(struct vcpu *v) > > vmce_init_vcpu(v); > > - v->arch.vcpu_info_mfn = INVALID_MFN; > - > if ( is_hvm_domain(d) ) > { > rc = hvm_vcpu_initialise(v); > @@ -954,99 +952,6 @@ void arch_vcpu_reset(struct vcpu *v) > } > } > > -/* > - * Unmap the vcpu info page if the guest decided to place it somewhere > - * else. This is only used from arch_domain_destroy, so there''s no > - * need to do anything clever. > - */ > -static void > -unmap_vcpu_info(struct vcpu *v) > -{ > - unsigned long mfn; > - > - if ( v->arch.vcpu_info_mfn == INVALID_MFN ) > - return; > - > - mfn = v->arch.vcpu_info_mfn; > - unmap_domain_page_global(v->vcpu_info); > - > - v->vcpu_info = &dummy_vcpu_info; > - v->arch.vcpu_info_mfn = INVALID_MFN; > - > - put_page_and_type(mfn_to_page(mfn)); > -} > - > -/* > - * Map a guest page in and point the vcpu_info pointer at it. This > - * makes sure that the vcpu_info is always pointing at a valid piece > - * of memory, and it sets a pending event to make sure that a pending > - * event doesn''t get missed. > - */ > -static int > -map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) > -{ > - struct domain *d = v->domain; > - void *mapping; > - vcpu_info_t *new_info; > - struct page_info *page; > - int i; > - > - if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) > - return -EINVAL; > - > - if ( v->arch.vcpu_info_mfn != INVALID_MFN ) > - return -EINVAL; > - > - /* Run this command on yourself or on other offline VCPUS. */ > - if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) > - return -EINVAL; > - > - page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); > - if ( !page ) > - return -EINVAL; > - > - if ( !get_page_type(page, PGT_writable_page) ) > - { > - put_page(page); > - return -EINVAL; > - } > - > - mapping = __map_domain_page_global(page); > - if ( mapping == NULL ) > - { > - put_page_and_type(page); > - return -ENOMEM; > - } > - > - new_info = (vcpu_info_t *)(mapping + offset); > - > - if ( v->vcpu_info == &dummy_vcpu_info ) > - { > - memset(new_info, 0, sizeof(*new_info)); > - __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; > - } > - else > - { > - memcpy(new_info, v->vcpu_info, sizeof(*new_info)); > - } > - > - v->vcpu_info = new_info; > - v->arch.vcpu_info_mfn = page_to_mfn(page); > - > - /* Set new vcpu_info pointer /before/ setting pending flags. 
*/ > - wmb(); > - > - /* > - * Mark everything as being pending just to make sure nothing gets > - * lost. The domain will get a spurious event, but it can cope. > - */ > - vcpu_info(v, evtchn_upcall_pending) = 1; > - for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) > - set_bit(i, &vcpu_info(v, evtchn_pending_sel)); > - > - return 0; > -} > - > long > arch_do_vcpu_op( > int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) > @@ -1083,22 +988,6 @@ arch_do_vcpu_op( > break; > } > > - case VCPUOP_register_vcpu_info: > - { > - struct domain *d = v->domain; > - struct vcpu_register_vcpu_info info; > - > - rc = -EFAULT; > - if ( copy_from_guest(&info, arg, 1) ) > - break; > - > - domain_lock(d); > - rc = map_vcpu_info(v, info.mfn, info.offset); > - domain_unlock(d); > - > - break; > - } > - > /* > * XXX Disable for 4.0.0: __update_vcpu_system_time() writes to the given > * virtual address even when running in another domain''s address space. > @@ -2025,8 +1914,6 @@ int domain_relinquish_resources(struct domain *d) > * mappings. > */ > destroy_gdt(v); > - > - unmap_vcpu_info(v); > } > > if ( d->arch.pv_domain.pirq_eoi_map != NULL ) > diff --git a/xen/common/domain.c b/xen/common/domain.c > index ce45d66..d21909f 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -33,6 +33,7 @@ > #include <xen/xenoprof.h> > #include <xen/irq.h> > #include <asm/debugger.h> > +#include <asm/p2m.h> > #include <asm/processor.h> > #include <public/sched.h> > #include <public/sysctl.h> > @@ -142,6 +143,7 @@ struct vcpu *alloc_vcpu( > v->vcpu_info = ((vcpu_id < XEN_LEGACY_MAX_VCPUS) > ? (vcpu_info_t *)&shared_info(d, vcpu_info[vcpu_id]) > : &dummy_vcpu_info); > + v->vcpu_info_mfn = INVALID_MFN; > init_waitqueue_vcpu(v); > } > > @@ -547,6 +549,7 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d) > int domain_kill(struct domain *d) > { > int rc = 0; > + struct vcpu *v; > > if ( d == current->domain ) > return -EINVAL; > @@ -571,6 +574,8 @@ int domain_kill(struct domain *d) > BUG_ON(rc != -EAGAIN); > break; > } > + for_each_vcpu ( d, v ) > + unmap_vcpu_info(v); > d->is_dying = DOMDYING_dead; > /* Mem event cleanup has to go here because the rings > * have to be put before we call put_domain. */ > @@ -896,6 +901,96 @@ void vcpu_reset(struct vcpu *v) > vcpu_unpause(v); > } > > +/* > + * Map a guest page in and point the vcpu_info pointer at it. This > + * makes sure that the vcpu_info is always pointing at a valid piece > + * of memory, and it sets a pending event to make sure that a pending > + * event doesn''t get missed. > + */ > +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) > +{ > + struct domain *d = v->domain; > + void *mapping; > + vcpu_info_t *new_info; > + struct page_info *page; > + int i; > + > + if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) > + return -EINVAL; > + > + if ( v->vcpu_info_mfn != INVALID_MFN ) > + return -EINVAL; > + > + /* Run this command on yourself or on other offline VCPUS. 
*/ > + if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) > + return -EINVAL; > + > + page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); > + if ( !page ) > + return -EINVAL; > + > + if ( !get_page_type(page, PGT_writable_page) ) > + { > + put_page(page); > + return -EINVAL; > + } > + > + mapping = __map_domain_page_global(page); > + if ( mapping == NULL ) > + { > + put_page_and_type(page); > + return -ENOMEM; > + } > + > + new_info = (vcpu_info_t *)(mapping + offset); > + > + if ( v->vcpu_info == &dummy_vcpu_info ) > + { > + memset(new_info, 0, sizeof(*new_info)); > + __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; > + } > + else > + { > + memcpy(new_info, v->vcpu_info, sizeof(*new_info)); > + } > + > + v->vcpu_info = new_info; > + v->vcpu_info_mfn = page_to_mfn(page); > + > + /* Set new vcpu_info pointer /before/ setting pending flags. */ > + wmb(); > + > + /* > + * Mark everything as being pending just to make sure nothing gets > + * lost. The domain will get a spurious event, but it can cope. > + */ > + vcpu_info(v, evtchn_upcall_pending) = 1; > + for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) > + set_bit(i, &vcpu_info(v, evtchn_pending_sel)); > + > + return 0; > +} > + > +/* > + * Unmap the vcpu info page if the guest decided to place it somewhere > + * else. This is only used from arch_domain_destroy, so there''s no > + * need to do anything clever. > + */ > +void unmap_vcpu_info(struct vcpu *v) > +{ > + unsigned long mfn; > + > + if ( v->vcpu_info_mfn == INVALID_MFN ) > + return; > + > + mfn = v->vcpu_info_mfn; > + unmap_domain_page_global(v->vcpu_info); > + > + v->vcpu_info = &dummy_vcpu_info; > + v->vcpu_info_mfn = INVALID_MFN; > + > + put_page_and_type(mfn_to_page(mfn)); > +} > > long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) > { > @@ -1015,6 +1110,22 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) > > break; > > + case VCPUOP_register_vcpu_info: > + { > + struct domain *d = v->domain; > + struct vcpu_register_vcpu_info info; > + > + rc = -EFAULT; > + if ( copy_from_guest(&info, arg, 1) ) > + break; > + > + domain_lock(d); > + rc = map_vcpu_info(v, info.mfn, info.offset); > + domain_unlock(d); > + > + break; > + } > + > #ifdef VCPU_TRAP_NMI > case VCPUOP_send_nmi: > if ( !guest_handle_is_null(arg) ) > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h > index bdaf714..e7ed7c3 100644 > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -435,9 +435,6 @@ struct arch_vcpu > > struct paging_vcpu paging; > > - /* Guest-specified relocation of vcpu_info. */ > - unsigned long vcpu_info_mfn; > - > uint32_t gdbsx_vcpu_event; > > /* A secondary copy of the vcpu time info. */ > diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h > index d4ac50f..d0263a9 100644 > --- a/xen/include/xen/domain.h > +++ b/xen/include/xen/domain.h > @@ -52,6 +52,9 @@ void free_pirq_struct(void *); > int vcpu_initialise(struct vcpu *v); > void vcpu_destroy(struct vcpu *v); > > +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset); > +void unmap_vcpu_info(struct vcpu *v); > + > int arch_domain_create(struct domain *d, unsigned int domcr_flags); > > void arch_domain_destroy(struct domain *d); > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index beadc42..7e129d9 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -192,6 +192,9 @@ struct vcpu > > struct waitqueue_vcpu *waitqueue_vcpu; > > + /* Guest-specified relocation of vcpu_info. 
*/ > + unsigned long vcpu_info_mfn; > + > struct arch_vcpu arch; > }; >
>>> On 25.04.13 at 12:35, Stefano Stabellini <stefano.stabellini@eu.citrix.com>wrote:> On Thu, 25 Apr 2013, Jan Beulich wrote: >> >>> On 25.04.13 at 12:01, Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> wrote: >> > On Thu, 25 Apr 2013, Jan Beulich wrote: >> >> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> > wrote: >> >> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int >> > nr_bits) >> >> > +{ >> >> > + cpumask_t mask; >> >> > + int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits; >> >> >> >> min(nr_bits, nr_cpumask_bits) >> >> >> >> > + >> >> > + memset(&mask, 0x00, sizeof(mask)); >> >> >> >> bitmap_zero(). >> >> >> >> > + bitmap_copy(mask.bits, bits, len); >> >> >> >> Hard tab. >> >> >> >> > + >> >> > + return mask; >> >> > +} >> >> >> >> And most importantly: Why? This isn''t an operation that should >> >> commonly be done, and hence having a utility function for this >> >> seems to invite for abuse rather than really help. >> > >> > TBH I have done it to address Ian''s comment, I don''t have a strong >> > opinion on this. >> > However it is true that from an API point of view, cpumask_from_bitmap >> > allows us to cover the new use case without breaking the cpumask >> > abstraction. >> >> Rather than adding a new abstraction that''s used for a single >> special case, and if open coding is undesirable, I''d prefer if you >> used bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask() >> if at all possible. > > Both functions do copy_to/from_guest and use xmalloc, I don''t think I > can use them.Hmm, ugly. But okay then... Jan
Ian Campbell
2013-Apr-25 10:50 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: keir@xen.org > CC: JBeulich@suse.com > > Changes in v3: > - do not export all the vcpu_op hypercalls to ARM guests, only > VCPUOP_register_vcpu_info. > --- > xen/arch/arm/domain.c | 13 +++++++++++++ > xen/arch/arm/traps.c | 1 + > xen/include/asm-arm/hypercall.h | 3 +++ > 3 files changed, 17 insertions(+), 0 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index fee3790..a676441 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -10,6 +10,7 @@ > * GNU General Public License for more details. > */ > #include <xen/config.h> > +#include <xen/hypercall.h> > #include <xen/init.h> > #include <xen/lib.h> > #include <xen/sched.h> > @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) > } > } > > + > +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)This is a bit fugly but I suppose it''s no worse than the other alternatives I can think of. I don''t really like the "restricted" name but the other obvious alternative do_arch_vcpu_op is out because typically that''s called *from* do_foo_op not instead of. Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as per-arch on all architectures (basically a nop on x86) an option?> +{ > + switch ( cmd ) > + { > + case VCPUOP_register_vcpu_info: > + return do_vcpu_op(cmd, vcpuid, arg); > + default: > + return -EINVAL;ENOSYS I think.> + } > +} > + > long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) > { > return -ENOSYS; > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > index 733099a..d69231c 100644 > --- a/xen/arch/arm/traps.c > +++ b/xen/arch/arm/traps.c > @@ -617,6 +617,7 @@ static arm_hypercall_t arm_hypercall_table[] = { > HYPERCALL(sysctl, 2), > HYPERCALL(hvm_op, 2), > HYPERCALL(grant_table_op, 3), > + HYPERCALL(restricted_vcpu_op, 3), > }; > > #define __PSCI_cpu_suspend 0 > diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h > index 0833ec4..8ab0cc4 100644 > --- a/xen/include/asm-arm/hypercall.h > +++ b/xen/include/asm-arm/hypercall.h > @@ -4,6 +4,9 @@ > #include <public/domctl.h> /* for arch_do_domctl */ > int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); > > +#define __HYPERVISOR_restricted_vcpu_op __HYPERVISOR_vcpu_opI don''t think this needs it''s own #define, does it? (maybe that requires an alternative HYPERCALL macro, that would be fine IMHO).> +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg); > + > #endif /* __ASM_ARM_HYPERCALL_H__ */ > /* > * Local variables:
Stefano Stabellini
2013-Apr-25 10:57 UTC
Re: [PATCH v3 01/13] xen/arm: basic PSCI support, implement cpu_on and cpu_off
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > Implement support for ARM Power State Coordination Interface, PSCI in > > short. > > Support HVC and SMC calls. > > > > Changes in v3: > > - move do_psci_* to psci.c; > > - trap SMC; > > - return PSCI error codes; > > - remove Linux boot procotol from secondary cpus. > > > > Changes in v2: > > - set is_initialised in arch_set_info_guest; > > - zero vcpu_guest_context before using it. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > xen/arch/arm/Makefile | 1 + > > xen/arch/arm/domain.c | 2 + > > xen/arch/arm/psci.c | 87 +++++++++++++++++++++++++++++++++++++++ > > xen/arch/arm/traps.c | 47 ++++++++++++++++++++- > > xen/include/asm-arm/processor.h | 1 + > > xen/include/asm-arm/psci.h | 24 +++++++++++ > > 6 files changed, 161 insertions(+), 1 deletions(-) > > create mode 100644 xen/arch/arm/psci.c > > create mode 100644 xen/include/asm-arm/psci.h > > > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > > index 2106a4f..8f75044 100644 > > --- a/xen/arch/arm/Makefile > > +++ b/xen/arch/arm/Makefile > > @@ -5,6 +5,7 @@ subdir-y += platforms > > obj-y += early_printk.o > > obj-y += cpu.o > > obj-y += domain.o > > +obj-y += psci.o > > obj-y += domctl.o > > obj-y += sysctl.o > > obj-y += domain_build.o > > > > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > > new file mode 100644 > > index 0000000..28153ad > > --- /dev/null > > +++ b/xen/arch/arm/psci.c > > @@ -0,0 +1,87 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License as published by > > + * the Free Software Foundation; either version 2 of the License, or > > + * (at your option) any later version. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. 
> > + */ > > + > > +#include <xen/errno.h> > > +#include <xen/sched.h> > > +#include <xen/types.h> > > + > > +#include <asm/current.h> > > +#include <asm/psci.h> > > + > > +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point) > > +{ > > + struct vcpu *v; > > + struct domain *d = current->domain; > > + struct vcpu_guest_context *ctxt; > > + int rc; > > + > > + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) > > + return PSCI_EINVAL; > > + > > + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) > > + return PSCI_EINVAL; > > + > > + if ( !v->is_initialised ) { > > If this is true then we eventually return success, should this be > returning EINVAL or DENIED or something?Nope: this is done on purpose so that you can call cpu_on on a cpu that called cpu_off before.> > +int do_psci_migrate(uint32_t vcpuid) > > +{ > > + return PSCI_ENOSYS; > > +} > > Is this required or should we just not expose it in device tree?We should not expose it in device tree, but we might as well return a proper error here.> > + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) > > + return PSCI_DENIED; > > + > > + memset(ctxt, 0, sizeof(*ctxt)); > > + ctxt->user_regs.pc32 = entry_point; > > + ctxt->sctlr = SCTLR_BASE; > > + ctxt->ttbr0 = 0; > > + ctxt->ttbr1 = 0; > > + ctxt->ttbcr = 0; /* Defined Reset Value */ > > + ctxt->user_regs.cpsr = PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC; > > This value is repeated in a couple of places now, perhaps we should have > a ctxt_init which sets up our vcpus defined "reset" values all in one > place?The only repeated value I could find is "PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC", also used in construct_dom0. I would just introduce a constant called "CPSR_RESET_VALUE" and use it everywhere.> > + ctxt->flags = VGCF_online; > > + > > + domain_lock(d); > > + rc = arch_set_info_guest(v, ctxt); > > + free_vcpu_guest_context(ctxt); > > + > > + if ( rc < 0 ) > > + { > > + domain_unlock(d); > > + return PSCI_DENIED; > > + } > > + domain_unlock(d); > > + } > > + > > + clear_bit(_VPF_down, &v->pause_flags); > > ctxt->flags = VGCF_online would cause this to happen, I''d rather do that > than repeat the logic here.It is repeated because we might get here without getting inside the if statement when the vcpu is already initialized.> > @@ -647,6 +673,20 @@ static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) > > } > > } > > > > +static void do_trap_psci(struct cpu_user_regs *regs) > > +{ > > + arm_psci_fn_t psci_call = NULL; > > + > > + if ( regs->r0 >= ARRAY_SIZE(arm_psci_table) ) > > + { > > + regs->r0 = -ENOSYS; > > + return; > > + } > > + > > + psci_call = arm_psci_table[regs->r0].fn; > > Might be NULL?it can''t be NULL if r0 < ARRAY_SIZE(arm_psci_table)> Since Xen passes the valid PSCI numbers to the guest I think it should > be illegal to call anything else, so rather than ENOSYS calling an > unimplemented slots should result in domain_crash_synchronous. It seems > like the general plan is for various firmware interfaces to use iss==0 > and to communicate the valid r0 numbers via device tree, but I don''t > think we can assume that all those different interfaces will have the > same ENOSYS like value.The (not final) document is not super clear about this but my understanding is that they all have the same "not implemented" error code. Besides the mere fact that an ENOSYS-like error code is defined makes me inclined to return an error rather than crashing the domain. 
After all if the guest calls an vcpu_op hypercall that we don''t implement, we return error, we don''t crash the domain for it.> > + regs->r0 = psci_call(regs->r1, regs->r2); > > +} > > + > > static void do_trap_hypercall(struct cpu_user_regs *regs, unsigned long iss) > > { > > arm_hypercall_fn_t call = NULL; > > @@ -959,8 +999,13 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs) > > case HSR_EC_HVC: > > if ( (hsr.iss & 0xff00) == 0xff00 ) > > return do_debug_trap(regs, hsr.iss & 0x00ff); > > + if ( hsr.iss == 0 ) > > + return do_trap_psci(regs); > > do_trap_hypercall(regs, hsr.iss); > > This is a pre-existing issue but this handles all no-zero iss as a Xen > hypercall, I think we should > switch ( hsr.iss ) > case XEN_PSCI_TAG: ... > case XEN_HYPERCALL_TAG: ... > default: domain_crash_synchrous > and pull the domain crash out of do_trap_hypercall. > > Maybe XEN_PSCI_TAG isn''t the right name if other interfaces are going to > use it, but it''s only an internal name anyway.OK
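The proposed constant would collapse the repeated flag combination into one definition. A tiny standalone check of the resulting value (bit positions per the ARM ARM; the macro values match the ones used in the patch, as defined in asm-arm/processor.h as far as I recall):

#include <stdio.h>

#define PSR_MODE_SVC  0x13        /* M[4:0] = 0b10011 */
#define PSR_FIQ_MASK  (1 << 6)    /* F */
#define PSR_IRQ_MASK  (1 << 7)    /* I */
#define PSR_ABT_MASK  (1 << 8)    /* A */

#define CPSR_RESET_VALUE \
    (PSR_ABT_MASK | PSR_FIQ_MASK | PSR_IRQ_MASK | PSR_MODE_SVC)

int main(void)
{
    printf("CPSR_RESET_VALUE = %#x\n", CPSR_RESET_VALUE);   /* 0x1d3 */
    return 0;
}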
Ian Campbell
2013-Apr-25 11:00 UTC
Re: [PATCH v3 11/13] xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> If we need to inject an irq into a guest that is running on a different^VCPU> processor, we shouldn''t just enqueue the irq into the lr_pending and > inflight lists and wait for something to interrupt the guest execution. > Send an IPI to the target pcpu so that Xen can inject the new interrupt > returning to guest. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>> --- > xen/arch/arm/vgic.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index 8912aab..547a2ab 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -664,6 +664,8 @@ out: > spin_unlock_irqrestore(&v->arch.vgic.lock, flags); > /* we have a new higher priority irq, inject it into the guest */ > vcpu_unblock(v); > + if ( v != current && v->is_running ) > + smp_send_event_check_mask(cpumask_of(v->processor));I''m a bit concerned about races here, e.g. where v is busy going to sleep. I don''t know enough about the behaviour of the scheduler and vcpu_unblock etc to say, but it worries me... x86 has vcpu_kick which saves v->is_running before the unblock and then sends a softirq with much the same checks as you have -- that seems like a plausible race avoidance strategy, although whether IPI vs softirq is an important distinction I''ve no idea... BTW smp_send_event_check_mask(cpumask_of(v->processor)) is the same as smp_send_event_check_cpu(v->processor). Ian.
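For illustration, the x86-style ordering Ian alludes to looks roughly like the sketch below: sample is_running before the unblock, then poke the pcpu. This is written from memory as a picture of the ordering only, not compiled against the tree, and whether a softirq or a plain IPI is the right primitive is the open question.

/* Sketch only, modelled on x86's vcpu_kick(); names need checking. */
static void vgic_kick_vcpu(struct vcpu *v)
{
    bool_t running = v->is_running;

    vcpu_unblock(v);
    if ( running && v != current )
        smp_send_event_check_cpu(v->processor);
}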
Ian Campbell
2013-Apr-25 11:03 UTC
Re: [PATCH v3 12/13] xen/arm: start the vtimer Xen timers on the processor they should be running on
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> The Xen physical timer emulator and virtual timer driver use two > internal Xen timers: initialize them on the processor the vcpu is > going to be running on, rather than the processor that it''s creating the > vcpu.But this CPU can change as the vcpu migrates around the pcpus, do we not need to handle that case too? Surely we need some code in virt_timer_restore at least? Possibly a call to migrate_timer? Although perhaps there is a less expensive way when we know the timer isn''t running?> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > xen/arch/arm/vtimer.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index 1cb365e..2444851 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -48,7 +48,7 @@ int vcpu_vtimer_init(struct vcpu *v) > { > struct vtimer *t = &v->arch.phys_timer; > > - init_timer(&t->timer, phys_timer_expired, t, smp_processor_id()); > + init_timer(&t->timer, phys_timer_expired, t, v->processor); > t->ctl = 0; > t->offset = NOW(); > t->cval = NOW(); > @@ -56,7 +56,7 @@ int vcpu_vtimer_init(struct vcpu *v) > t->v = v; > > t = &v->arch.virt_timer; > - init_timer(&t->timer, virt_timer_expired, t, smp_processor_id()); > + init_timer(&t->timer, virt_timer_expired, t, v->processor); > t->ctl = 0; > t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > t->cval = 0;
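For concreteness, the kind of hook being asked about might look like the sketch below. migrate_timer() is the generic primitive in common/timer.c; whether calling it from virt_timer_restore() (or something cheaper, given the timer may be known to be stopped) is the right answer is exactly what the question above is probing. Hypothetical helper, not compiled.

static void vcpu_vtimer_migrate(struct vcpu *v, unsigned int new_cpu)
{
    /* Move both per-vcpu Xen timers to the pcpu the vcpu now runs on. */
    migrate_timer(&v->arch.phys_timer.timer, new_cpu);
    migrate_timer(&v->arch.virt_timer.timer, new_cpu);
}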
Ian Campbell
2013-Apr-25 11:21 UTC
Re: [PATCH v3 13/13] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > xen/arch/arm/vtimer.c | 18 ++++++++++++++---- > xen/include/asm-arm/domain.h | 26 +++++++++++++++++--------- > 2 files changed, 31 insertions(+), 13 deletions(-) > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index 2444851..49858e7 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -46,20 +46,30 @@ static void virt_timer_expired(void *data) > > int vcpu_vtimer_init(struct vcpu *v) > { > + struct vtimer_base *b = &v->domain->arch.phys_timer_base; > struct vtimer *t = &v->arch.phys_timer; > > + if ( !b->offset ) > + b->offset = NOW(); > + if ( !b->cval ) > + b->cval = NOW();Initialisation of domain global state should be done in domain_vtimer_init (which you may need to add) IMHO.> init_timer(&t->timer, phys_timer_expired, t, v->processor); > t->ctl = 0; > - t->offset = NOW(); > - t->cval = NOW(); > + t->offset = b->offset; > > + t->cval = b->cval; > t->irq = 30; > t->v = v; > > + b = &v->domain->arch.virt_timer_base; > + if ( !b->offset ) > + b->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > + if ( !b->cval ) > + b->cval = 0; > t = &v->arch.virt_timer; > init_timer(&t->timer, virt_timer_expired, t, v->processor); > t->ctl = 0; > - t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > - t->cval = 0; > + t->offset = b->offset; > + t->cval = b->cval;Are you sure this is as simple as this? At the very least I''m surprised that we don''t need to consult b->offset anywhere other than at init time, if vcpus are supposed to see a uniform view of time. Or is it the case that what actually needs to happen is that t->offset goes away and b->offset is used everywhere?> t->irq = 27; > t->v = v; > > diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h > index 3fa266c2..d001802 100644 > --- a/xen/include/asm-arm/domain.h > +++ b/xen/include/asm-arm/domain.h > @@ -47,6 +47,20 @@ enum domain_type { > #define is_pv64_domain(d) (0) > #endif > > +struct vtimer { > + struct vcpu *v; > + int irq; > + struct timer timer; > + uint32_t ctl; > + uint64_t offset; > + uint64_t cval; > +}; > + > +struct vtimer_base { > + uint64_t offset; > + uint64_t cval; > +};It would be consistent with what is there to inline this into arch_domain.> + > struct arch_domain > { > #ifdef CONFIG_ARM_64 > @@ -61,6 +75,9 @@ struct arch_domain > uint32_t vpidr; > register_t vmpidr; > > + struct vtimer_base phys_timer_base; > + struct vtimer_base virt_timer_base; > + > struct { > /* > * Covers access to other members of this struct _except_ for > @@ -91,15 +108,6 @@ struct arch_domain > > } __cacheline_aligned; > > -struct vtimer { > - struct vcpu *v; > - int irq; > - struct timer timer; > - uint32_t ctl; > - uint64_t offset; > - uint64_t cval; > -}; > - > struct arch_vcpu > { > struct {
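A sketch of the domain_vtimer_init() being asked for, with the expressions lifted from the patch itself; the function and its call site (presumably arch_domain_create()) are hypothetical.

/* Initialise the per-domain timer bases once, at domain creation,
 * instead of testing !b->offset in each vcpu's init. Sketch only. */
static int domain_vtimer_init(struct domain *d)
{
    d->arch.phys_timer_base.offset = NOW();
    d->arch.phys_timer_base.cval   = NOW();
    d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) +
                                     READ_SYSREG64(CNTVOFF_EL2);
    d->arch.virt_timer_base.cval   = 0;

    return 0;
}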
Ian Campbell
2013-Apr-25 11:31 UTC
Re: [PATCH v3 01/13] xen/arm: basic PSCI support, implement cpu_on and cpu_off
On Thu, 2013-04-25 at 11:57 +0100, Stefano Stabellini wrote:> On Thu, 25 Apr 2013, Ian Campbell wrote: > > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > > Implement support for ARM Power State Coordination Interface, PSCI in > > > short. > > > Support HVC and SMC calls. > > > > > > Changes in v3: > > > - move do_psci_* to psci.c; > > > - trap SMC; > > > - return PSCI error codes; > > > - remove Linux boot procotol from secondary cpus. > > > > > > Changes in v2: > > > - set is_initialised in arch_set_info_guest; > > > - zero vcpu_guest_context before using it. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > --- > > > xen/arch/arm/Makefile | 1 + > > > xen/arch/arm/domain.c | 2 + > > > xen/arch/arm/psci.c | 87 +++++++++++++++++++++++++++++++++++++++ > > > xen/arch/arm/traps.c | 47 ++++++++++++++++++++- > > > xen/include/asm-arm/processor.h | 1 + > > > xen/include/asm-arm/psci.h | 24 +++++++++++ > > > 6 files changed, 161 insertions(+), 1 deletions(-) > > > create mode 100644 xen/arch/arm/psci.c > > > create mode 100644 xen/include/asm-arm/psci.h > > > > > > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile > > > index 2106a4f..8f75044 100644 > > > --- a/xen/arch/arm/Makefile > > > +++ b/xen/arch/arm/Makefile > > > @@ -5,6 +5,7 @@ subdir-y += platforms > > > obj-y += early_printk.o > > > obj-y += cpu.o > > > obj-y += domain.o > > > +obj-y += psci.o > > > obj-y += domctl.o > > > obj-y += sysctl.o > > > obj-y += domain_build.o > > > > > > diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c > > > new file mode 100644 > > > index 0000000..28153ad > > > --- /dev/null > > > +++ b/xen/arch/arm/psci.c > > > @@ -0,0 +1,87 @@ > > > +/* > > > + * This program is free software; you can redistribute it and/or modify > > > + * it under the terms of the GNU General Public License as published by > > > + * the Free Software Foundation; either version 2 of the License, or > > > + * (at your option) any later version. > > > + * > > > + * This program is distributed in the hope that it will be useful, > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > > + * GNU General Public License for more details. > > > + */ > > > + > > > +#include <xen/errno.h> > > > +#include <xen/sched.h> > > > +#include <xen/types.h> > > > + > > > +#include <asm/current.h> > > > +#include <asm/psci.h> > > > + > > > +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point) > > > +{ > > > + struct vcpu *v; > > > + struct domain *d = current->domain; > > > + struct vcpu_guest_context *ctxt; > > > + int rc; > > > + > > > + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) > > > + return PSCI_EINVAL; > > > + > > > + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) > > > + return PSCI_EINVAL; > > > + > > > + if ( !v->is_initialised ) { > > > > If this is true then we eventually return success, should this be > > returning EINVAL or DENIED or something? > > Nope: this is done on purpose so that you can call cpu_on on a cpu that > called cpu_off before.But in that case you ignore the newly supplied entry point? Is that really the semantics of PSCI? Needs a comment I think.> > > +int do_psci_migrate(uint32_t vcpuid) > > > +{ > > > + return PSCI_ENOSYS; > > > +} > > > > Is this required or should we just not expose it in device tree? 
> > We should not expose it in device tree, but we might as well return a > proper error here.If it isn''t in device tree then the kernel doesn''t even know which r0 value this function is.> > > > > + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) > > > + return PSCI_DENIED; > > > + > > > + memset(ctxt, 0, sizeof(*ctxt)); > > > + ctxt->user_regs.pc32 = entry_point; > > > + ctxt->sctlr = SCTLR_BASE; > > > + ctxt->ttbr0 = 0; > > > + ctxt->ttbr1 = 0; > > > + ctxt->ttbcr = 0; /* Defined Reset Value */ > > > + ctxt->user_regs.cpsr = PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC; > > > > This value is repeated in a couple of places now, perhaps we should have > > a ctxt_init which sets up our vcpus defined "reset" values all in one > > place? > > The only repeated value I could find is > "PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC", also used in > construct_dom0. > I would just introduce a constant called "CPSR_RESET_VALUE" and use it > everywhere.Ack.> > > > > + ctxt->flags = VGCF_online; > > > + > > > + domain_lock(d); > > > + rc = arch_set_info_guest(v, ctxt); > > > + free_vcpu_guest_context(ctxt); > > > + > > > + if ( rc < 0 ) > > > + { > > > + domain_unlock(d); > > > + return PSCI_DENIED; > > > + } > > > + domain_unlock(d); > > > + } > > > + > > > + clear_bit(_VPF_down, &v->pause_flags); > > > > ctxt->flags = VGCF_online would cause this to happen, I''d rather do that > > than repeat the logic here. > > It is repeated because we might get here without getting inside the if > statement when the vcpu is already initialized.I think this ties into the lack of resetting the entry point I mentioned above. Perhaps this should be in an } else then?> > > > > @@ -647,6 +673,20 @@ static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) > > > } > > > } > > > > > > +static void do_trap_psci(struct cpu_user_regs *regs) > > > +{ > > > + arm_psci_fn_t psci_call = NULL; > > > + > > > + if ( regs->r0 >= ARRAY_SIZE(arm_psci_table) ) > > > + { > > > + regs->r0 = -ENOSYS; > > > + return; > > > + } > > > + > > > + psci_call = arm_psci_table[regs->r0].fn; > > > > Might be NULL? > > it can''t be NULL if r0 < ARRAY_SIZE(arm_psci_table)Not right now since you happen to have packed them tightly, but in the future?> > > Since Xen passes the valid PSCI numbers to the guest I think it should > > be illegal to call anything else, so rather than ENOSYS calling an > > unimplemented slots should result in domain_crash_synchronous. It seems > > like the general plan is for various firmware interfaces to use iss==0 > > and to communicate the valid r0 numbers via device tree, but I don''t > > think we can assume that all those different interfaces will have the > > same ENOSYS like value. > > The (not final) document is not super clear about this but my > understanding is that they all have the same "not implemented" error > code.My understanding is that some of these interfaces are rather ad-hoc vendor supplied interfaces, in which case I think they are unlikely to have any sort of consistent error return.> Besides the mere fact that an ENOSYS-like error code is defined makes me > inclined to return an error rather than crashing the domain.It''s defined for PSCI, but there is the potential for other firmware interfaces using smc/hvc #0 and their own distinct r0 values.> After all if the guest calls an vcpu_op hypercall that we don''t > implement, we return error, we don''t crash the domain for it.We do (or should!) 
crash the domain if it calls hvc with a non-Xen tag, which is more analogous to the situation here. When the guest uses the XEN tag then the return values are well defined, so we can, and should, return a Xen error code in those cases.> > > > > + regs->r0 = psci_call(regs->r1, regs->r2); > > > +} > > > + > > > static void do_trap_hypercall(struct cpu_user_regs *regs, unsigned long iss) > > > { > > > arm_hypercall_fn_t call = NULL; > > > @@ -959,8 +999,13 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs) > > > case HSR_EC_HVC: > > > if ( (hsr.iss & 0xff00) == 0xff00 ) > > > return do_debug_trap(regs, hsr.iss & 0x00ff); > > > + if ( hsr.iss == 0 ) > > > + return do_trap_psci(regs); > > > do_trap_hypercall(regs, hsr.iss); > > > > This is a pre-existing issue but this handles all no-zero iss as a Xen > > hypercall, I think we should > > switch ( hsr.iss ) > > case XEN_PSCI_TAG: ... > > case XEN_HYPERCALL_TAG: ... > > default: domain_crash_synchrous > > and pull the domain crash out of do_trap_hypercall. > > > > Maybe XEN_PSCI_TAG isn''t the right name if other interfaces are going to > > use it, but it''s only an internal name anyway. > > OK
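The switch-based dispatch can be prototyped standalone as below. The tag values are the ones used in this series (0 for PSCI) and the Xen ARM hypercall ABI (0xEA1); the debug-trap range check is omitted for brevity and exit() stands in for domain_crash_synchronous().

#include <stdio.h>
#include <stdlib.h>

#define XEN_PSCI_TAG       0x0    /* hsr.iss used for PSCI in this series */
#define XEN_HYPERCALL_TAG  0xEA1  /* HVC immediate for Xen hypercalls */

static void dispatch_hvc(unsigned int iss)
{
    switch ( iss )
    {
    case XEN_PSCI_TAG:
        printf("iss %#x -> do_trap_psci()\n", iss);
        break;
    case XEN_HYPERCALL_TAG:
        printf("iss %#x -> do_trap_hypercall()\n", iss);
        break;
    default:
        printf("iss %#x -> domain_crash_synchronous()\n", iss);
        exit(1);
    }
}

int main(void)
{
    dispatch_hvc(XEN_PSCI_TAG);
    dispatch_hvc(XEN_HYPERCALL_TAG);
    dispatch_hvc(0x123);            /* unknown tag */
    return 0;
}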
Stefano Stabellini
2013-Apr-25 11:38 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > CC: keir@xen.org > > CC: JBeulich@suse.com > > > > Changes in v3: > > - do not export all the vcpu_op hypercalls to ARM guests, only > > VCPUOP_register_vcpu_info. > > --- > > xen/arch/arm/domain.c | 13 +++++++++++++ > > xen/arch/arm/traps.c | 1 + > > xen/include/asm-arm/hypercall.h | 3 +++ > > 3 files changed, 17 insertions(+), 0 deletions(-) > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > index fee3790..a676441 100644 > > --- a/xen/arch/arm/domain.c > > +++ b/xen/arch/arm/domain.c > > @@ -10,6 +10,7 @@ > > * GNU General Public License for more details. > > */ > > #include <xen/config.h> > > +#include <xen/hypercall.h> > > #include <xen/init.h> > > #include <xen/lib.h> > > #include <xen/sched.h> > > @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) > > } > > } > > > > + > > +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) > > This is a bit fugly but I suppose it''s no worse than the other > alternatives I can think of. > > I don''t really like the "restricted" name but the other obvious > alternative do_arch_vcpu_op is out because typically that''s called > *from* do_foo_op not instead of. > > Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as > per-arch on all architectures (basically a nop on x86) an option?This is a question for the x86 maintainers.> > +{ > > + switch ( cmd ) > > + { > > + case VCPUOP_register_vcpu_info: > > + return do_vcpu_op(cmd, vcpuid, arg); > > + default: > > + return -EINVAL; > > ENOSYS I think.right> > + } > > +} > > + > > long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) > > { > > return -ENOSYS; > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > > index 733099a..d69231c 100644 > > --- a/xen/arch/arm/traps.c > > +++ b/xen/arch/arm/traps.c > > @@ -617,6 +617,7 @@ static arm_hypercall_t arm_hypercall_table[] = { > > HYPERCALL(sysctl, 2), > > HYPERCALL(hvm_op, 2), > > HYPERCALL(grant_table_op, 3), > > + HYPERCALL(restricted_vcpu_op, 3), > > }; > > > > #define __PSCI_cpu_suspend 0 > > diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h > > index 0833ec4..8ab0cc4 100644 > > --- a/xen/include/asm-arm/hypercall.h > > +++ b/xen/include/asm-arm/hypercall.h > > @@ -4,6 +4,9 @@ > > #include <public/domctl.h> /* for arch_do_domctl */ > > int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); > > > > +#define __HYPERVISOR_restricted_vcpu_op __HYPERVISOR_vcpu_op > > I don''t think this needs it''s own #define, does it? (maybe that requires > an alternative HYPERCALL macro, that would be fine IMHO).It does because of the way the HYPERCALL macro builds the arm_hypercall_table.
Stefano Stabellini
2013-Apr-25 11:38 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 25 Apr 2013, Jan Beulich wrote:> >>> On 25.04.13 at 12:35, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > wrote: > > On Thu, 25 Apr 2013, Jan Beulich wrote: > >> >>> On 25.04.13 at 12:01, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > >> wrote: > >> > On Thu, 25 Apr 2013, Jan Beulich wrote: > >> >> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > >> > wrote: > >> >> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int > >> > nr_bits) > >> >> > +{ > >> >> > + cpumask_t mask; > >> >> > + int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits; > >> >> > >> >> min(nr_bits, nr_cpumask_bits) > >> >> > >> >> > + > >> >> > + memset(&mask, 0x00, sizeof(mask)); > >> >> > >> >> bitmap_zero(). > >> >> > >> >> > + bitmap_copy(mask.bits, bits, len); > >> >> > >> >> Hard tab. > >> >> > >> >> > + > >> >> > + return mask; > >> >> > +} > >> >> > >> >> And most importantly: Why? This isn''t an operation that should > >> >> commonly be done, and hence having a utility function for this > >> >> seems to invite for abuse rather than really help. > >> > > >> > TBH I have done it to address Ian''s comment, I don''t have a strong > >> > opinion on this. > >> > However it is true that from an API point of view, cpumask_from_bitmap > >> > allows us to cover the new use case without breaking the cpumask > >> > abstraction. > >> > >> Rather than adding a new abstraction that''s used for a single > >> special case, and if open coding is undesirable, I''d prefer if you > >> used bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask() > >> if at all possible. > > > > Both functions do copy_to/from_guest and use xmalloc, I don''t think I > > can use them. > > Hmm, ugly. But okay then...Is that an ack? ;-)
Ian Campbell
2013-Apr-25 11:40 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On Thu, 2013-04-25 at 12:38 +0100, Stefano Stabellini wrote:> On Thu, 25 Apr 2013, Ian Campbell wrote: > > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > CC: keir@xen.org > > > CC: JBeulich@suse.com > > > > > > Changes in v3: > > > - do not export all the vcpu_op hypercalls to ARM guests, only > > > VCPUOP_register_vcpu_info. > > > --- > > > xen/arch/arm/domain.c | 13 +++++++++++++ > > > xen/arch/arm/traps.c | 1 + > > > xen/include/asm-arm/hypercall.h | 3 +++ > > > 3 files changed, 17 insertions(+), 0 deletions(-) > > > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > > index fee3790..a676441 100644 > > > --- a/xen/arch/arm/domain.c > > > +++ b/xen/arch/arm/domain.c > > > @@ -10,6 +10,7 @@ > > > * GNU General Public License for more details. > > > */ > > > #include <xen/config.h> > > > +#include <xen/hypercall.h> > > > #include <xen/init.h> > > > #include <xen/lib.h> > > > #include <xen/sched.h> > > > @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) > > > } > > > } > > > > > > + > > > +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) > > > > This is a bit fugly but I suppose it''s no worse than the other > > alternatives I can think of. > > > > I don''t really like the "restricted" name but the other obvious > > alternative do_arch_vcpu_op is out because typically that''s called > > *from* do_foo_op not instead of. > > > > Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as > > per-arch on all architectures (basically a nop on x86) an option? > > This is a question for the x86 maintainers.Yeah, sorry it was I just didn''t address it to them ;-)> > > + } > > > +} > > > + > > > long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) > > > { > > > return -ENOSYS; > > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > > > index 733099a..d69231c 100644 > > > --- a/xen/arch/arm/traps.c > > > +++ b/xen/arch/arm/traps.c > > > @@ -617,6 +617,7 @@ static arm_hypercall_t arm_hypercall_table[] = { > > > HYPERCALL(sysctl, 2), > > > HYPERCALL(hvm_op, 2), > > > HYPERCALL(grant_table_op, 3), > > > + HYPERCALL(restricted_vcpu_op, 3), > > > }; > > > > > > #define __PSCI_cpu_suspend 0 > > > diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h > > > index 0833ec4..8ab0cc4 100644 > > > --- a/xen/include/asm-arm/hypercall.h > > > +++ b/xen/include/asm-arm/hypercall.h > > > @@ -4,6 +4,9 @@ > > > #include <public/domctl.h> /* for arch_do_domctl */ > > > int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); > > > > > > +#define __HYPERVISOR_restricted_vcpu_op __HYPERVISOR_vcpu_op > > > > I don''t think this needs it''s own #define, does it? (maybe that requires > > an alternative HYPERCALL macro, that would be fine IMHO). > > It does because of the way the HYPERCALL macro builds the > arm_hypercall_table.Right, hence my call for an alternative macro, e.g. #define RESTICTED_HYPERCALL Ian.
Stefano Stabellini
2013-Apr-25 11:41 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Thu, 2013-04-25 at 12:38 +0100, Stefano Stabellini wrote: > > On Thu, 25 Apr 2013, Ian Campbell wrote: > > > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > CC: keir@xen.org > > > > CC: JBeulich@suse.com > > > > > > > > Changes in v3: > > > > - do not export all the vcpu_op hypercalls to ARM guests, only > > > > VCPUOP_register_vcpu_info. > > > > --- > > > > xen/arch/arm/domain.c | 13 +++++++++++++ > > > > xen/arch/arm/traps.c | 1 + > > > > xen/include/asm-arm/hypercall.h | 3 +++ > > > > 3 files changed, 17 insertions(+), 0 deletions(-) > > > > > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > > > index fee3790..a676441 100644 > > > > --- a/xen/arch/arm/domain.c > > > > +++ b/xen/arch/arm/domain.c > > > > @@ -10,6 +10,7 @@ > > > > * GNU General Public License for more details. > > > > */ > > > > #include <xen/config.h> > > > > +#include <xen/hypercall.h> > > > > #include <xen/init.h> > > > > #include <xen/lib.h> > > > > #include <xen/sched.h> > > > > @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) > > > > } > > > > } > > > > > > > > + > > > > +long do_restricted_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) > > > > > > This is a bit fugly but I suppose it''s no worse than the other > > > alternatives I can think of. > > > > > > I don''t really like the "restricted" name but the other obvious > > > alternative do_arch_vcpu_op is out because typically that''s called > > > *from* do_foo_op not instead of. > > > > > > Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as > > > per-arch on all architectures (basically a nop on x86) an option? > > > > This is a question for the x86 maintainers. > > Yeah, sorry it was I just didn''t address it to them ;-) > > > > > + } > > > > +} > > > > + > > > > long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) > > > > { > > > > return -ENOSYS; > > > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > > > > index 733099a..d69231c 100644 > > > > --- a/xen/arch/arm/traps.c > > > > +++ b/xen/arch/arm/traps.c > > > > @@ -617,6 +617,7 @@ static arm_hypercall_t arm_hypercall_table[] = { > > > > HYPERCALL(sysctl, 2), > > > > HYPERCALL(hvm_op, 2), > > > > HYPERCALL(grant_table_op, 3), > > > > + HYPERCALL(restricted_vcpu_op, 3), > > > > }; > > > > > > > > #define __PSCI_cpu_suspend 0 > > > > diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h > > > > index 0833ec4..8ab0cc4 100644 > > > > --- a/xen/include/asm-arm/hypercall.h > > > > +++ b/xen/include/asm-arm/hypercall.h > > > > @@ -4,6 +4,9 @@ > > > > #include <public/domctl.h> /* for arch_do_domctl */ > > > > int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); > > > > > > > > +#define __HYPERVISOR_restricted_vcpu_op __HYPERVISOR_vcpu_op > > > > > > I don''t think this needs it''s own #define, does it? (maybe that requires > > > an alternative HYPERCALL macro, that would be fine IMHO). > > > > It does because of the way the HYPERCALL macro builds the > > arm_hypercall_table. > > Right, hence my call for an alternative macro, e.g. #define > RESTICTED_HYPERCALLI see. That might be better.
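A self-contained toy of the alternative macro shape: the table index is passed explicitly, so the handler name and the hypercall number can differ without an extra #define in hypercall.h. Everything here is illustrative; the numeric values are assumptions noted in the comments and the real arm_hypercall_table entry layout should be checked against traps.c.

#include <stdio.h>

typedef long (*hcall_fn_t)(int cmd, int vcpuid, void *arg);

struct hcall_entry { hcall_fn_t fn; int nr_args; };

#define __HYPERVISOR_vcpu_op        24   /* assumed value, see public/xen.h */
#define VCPUOP_register_vcpu_info   10   /* assumed value, see public/vcpu.h */
#define ENOSYS                      38

static long do_restricted_vcpu_op(int cmd, int vcpuid, void *arg)
{
    (void)vcpuid; (void)arg;
    return cmd == VCPUOP_register_vcpu_info ? 0 : -ENOSYS;
}

/* The slot index is passed explicitly, so no __HYPERVISOR_restricted_vcpu_op
 * alias is needed. */
#define RESTRICTED_HYPERCALL(name, number, args) \
    [ number ] = { .fn = do_restricted_ ## name, .nr_args = args }

static struct hcall_entry table[64] = {
    RESTRICTED_HYPERCALL(vcpu_op, __HYPERVISOR_vcpu_op, 3),
};

int main(void)
{
    printf("%ld\n", table[__HYPERVISOR_vcpu_op].fn(VCPUOP_register_vcpu_info, 0, NULL));
    printf("%ld\n", table[__HYPERVISOR_vcpu_op].fn(0 /* some other VCPUOP */, 0, NULL));
    return 0;
}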
>>> On 25.04.13 at 13:38, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Thu, 25 Apr 2013, Jan Beulich wrote:
>> >>> On 25.04.13 at 12:35, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> > On Thu, 25 Apr 2013, Jan Beulich wrote:
>> >> >>> On 25.04.13 at 12:01, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> >> > On Thu, 25 Apr 2013, Jan Beulich wrote:
>> >> >> >>> On 24.04.13 at 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> >> >> > +static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits)
>> >> >> > +{
>> >> >> > +    cpumask_t mask;
>> >> >> > +    int len = nr_bits < nr_cpumask_bits ? nr_bits : nr_cpumask_bits;
>> >> >>
>> >> >> min(nr_bits, nr_cpumask_bits)
>> >> >>
>> >> >> > +
>> >> >> > +    memset(&mask, 0x00, sizeof(mask));
>> >> >>
>> >> >> bitmap_zero().
>> >> >>
>> >> >> > +    bitmap_copy(mask.bits, bits, len);
>> >> >>
>> >> >> Hard tab.
>> >> >>
>> >> >> > +
>> >> >> > +    return mask;
>> >> >> > +}
>> >> >>
>> >> >> And most importantly: Why? This isn't an operation that should
>> >> >> commonly be done, and hence having a utility function for this
>> >> >> seems to invite for abuse rather than really help.
>> >> >
>> >> > TBH I have done it to address Ian's comment, I don't have a strong
>> >> > opinion on this.
>> >> > However it is true that from an API point of view, cpumask_from_bitmap
>> >> > allows us to cover the new use case without breaking the cpumask
>> >> > abstraction.
>> >>
>> >> Rather than adding a new abstraction that's used for a single
>> >> special case, and if open coding is undesirable, I'd prefer if you
>> >> used bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask()
>> >> if at all possible.
>> >
>> > Both functions do copy_to/from_guest and use xmalloc, I don't think I
>> > can use them.
>>
>> Hmm, ugly. But okay then...
>
> Is that an ack? ;-)

No, this is at best an "I won't object to this anymore, but I still
don't like it".

Jan
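Putting Jan's style comments together, the helper from the patch might end up looking something like this; a sketch only, not necessarily what was committed, and min_t() may be needed instead of min() if nr_bits and nr_cpumask_bits have different types:

/* Sketch of the v3 helper with the review comments applied:
 * min() replaces the open-coded comparison and bitmap_zero()
 * replaces the memset(). */
static inline cpumask_t cpumask_from_bitmap(unsigned long *bits, int nr_bits)
{
    cpumask_t mask;
    int len = min(nr_bits, nr_cpumask_bits);

    bitmap_zero(mask.bits, nr_cpumask_bits);
    bitmap_copy(mask.bits, bits, len);

    return mask;
}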
On 25/04/2013 11:01, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:>> And most importantly: Why? This isn''t an operation that should >> commonly be done, and hence having a utility function for this >> seems to invite for abuse rather than really help. > > I suggested it, we are emulating an instruction which provides a bitmap > of the CPUs to operate on. It seemed nicer to me to have a helper rather > than to open-code accesses under the hood of the cpumask interface.Emulating a guest instruction? And it really wants to act on sets of real physical cpus (hence cpumask_t)? -- Keir
On 25/04/2013 13:49, "Jan Beulich" <JBeulich@suse.com> wrote:>>>>>>> And most importantly: Why? This isn''t an operation that should >>>>>>> commonly be done, and hence having a utility function for this >>>>>>> seems to invite for abuse rather than really help. >>>>>> >>>>>> TBH I have done it to address Ian''s comment, I don''t have a strong >>>>>> opinion on this. >>>>>> However it is true that from an API point of view, cpumask_from_bitmap >>>>>> allows us to cover the new use case without breaking the cpumask >>>>>> abstraction. >>>>> >>>>> Rather than adding a new abstraction that''s used for a single >>>>> special case, and if open coding is undesirable, I''d prefer if you >>>>> used bitmap_to_xenctl_bitmap() plus xenctl_bitmap_to_cpumask() >>>>> if at all possible. >>>> >>>> Both functions do copy_to/from_guest and use xmalloc, I don''t think I >>>> can use them. >>> >>> Hmm, ugly. But okay then... >> >> Is that an ack? ;-) > > No, this is at best an "I won''t object to this anymore, but I still > don''t like it".If I am convinced that the usage case is sane, I''ll ack a one-use helper. I have no problem with one-use helpers! To me this would be a perfectly reasonable cpumask_t constructor fn. -- Keir> Jan >
Keir Fraser
2013-Apr-25 14:29 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On 25/04/2013 12:38, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:>>> +long do_restricted_vcpu_op(int cmd, int vcpuid, >>> XEN_GUEST_HANDLE_PARAM(void) arg) >> >> This is a bit fugly but I suppose it''s no worse than the other >> alternatives I can think of. >> >> I don''t really like the "restricted" name but the other obvious >> alternative do_arch_vcpu_op is out because typically that''s called >> *from* do_foo_op not instead of. >> >> Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as >> per-arch on all architectures (basically a nop on x86) an option? > > This is a question for the x86 maintainers.Why not just call it do_arm_vcpu_op() or something? I.e., a new naming convention that indicates this arch-specific fn wraps the common hypercall fn? -- Keir
Ian Campbell
2013-Apr-25 14:39 UTC
Re: [PATCH v3 10/13] xen/arm: support VCPUOP_register_vcpu_info.
On Thu, 2013-04-25 at 15:29 +0100, Keir Fraser wrote:
> On 25/04/2013 12:38, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com>
> wrote:
>
> >>> +long do_restricted_vcpu_op(int cmd, int vcpuid,
> >>>                             XEN_GUEST_HANDLE_PARAM(void) arg)
> >>
> >> This is a bit fugly but I suppose it's no worse than the other
> >> alternatives I can think of.
> >>
> >> I don't really like the "restricted" name but the other obvious
> >> alternative do_arch_vcpu_op is out because typically that's called
> >> *from* do_foo_op not instead of.
> >>
> >> Is renaming do_vcpu_op to do_common_vcpu_op and adding do_vcpu_op as
> >> per-arch on all architectures (basically a nop on x86) an option?
> >
> > This is a question for the x86 maintainers.
>
> Why not just call it do_arm_vcpu_op() or something? I.e., a new naming
> convention that indicates this arch-specific fn wraps the common hypercall
> fn?

That is a very sane idea!

Ian.
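A minimal sketch of what Keir's naming suggestion could look like on the ARM side, purely as an illustration rather than the code that was eventually committed: the wrapper keeps the vcpu_op hypercall number but only lets VCPUOP_register_vcpu_info through to the common handler.

/* Illustrative sketch only: an ARM wrapper around the common
 * do_vcpu_op() that whitelists the single sub-op ARM guests need.
 * Whether -ENOSYS or -EINVAL is the right error for the rest is a
 * detail the real patch would have to settle. */
long do_arm_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
{
    switch ( cmd )
    {
    case VCPUOP_register_vcpu_info:
        return do_vcpu_op(cmd, vcpuid, arg);
    default:
        return -ENOSYS;
    }
}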
Ian Campbell
2013-Apr-25 14:42 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 2013-04-25 at 15:26 +0100, Keir Fraser wrote:
> On 25/04/2013 11:01, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:
>
> >> And most importantly: Why? This isn't an operation that should
> >> commonly be done, and hence having a utility function for this
> >> seems to invite for abuse rather than really help.
> >
> > I suggested it, we are emulating an instruction which provides a bitmap
> > of the CPUs to operate on. It seemed nicer to me to have a helper rather
> > than to open-code accesses under the hood of the cpumask interface.
>
> Emulating a guest instruction? And it really wants to act on sets of real
> physical cpus (hence cpumask_t)?

It's emulating a write to a GICD (read: APIC ;-)) register which causes
an SGI (read: IPI) to be sent. The value you write includes an 8-bit
mask indicating the target processors.

See "[PATCH v3 04/13] xen/arm: support for guest SGI" in this series.

Ian
Ian Campbell
2013-Apr-25 14:44 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
> And it really wants to act on sets of real physical cpus (hence cpumask_t)?

Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?

> It's emulating a write to a GICD (read: APIC ;-)) register which causes
> an SGI (read: IPI) to be sent. The value you write includes an 8-bit
> mask indicating the target processors.
>
> See "[PATCH v3 04/13] xen/arm: support for guest SGI" in this series.
>
> Ian
Ian Campbell
2013-Apr-25 14:51 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 2013-04-25 at 15:44 +0100, Ian Campbell wrote:
> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
> > And it really wants to act on sets of real physical cpus (hence cpumask_t)?
>
> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?

x86 defines for_each_set_bit in terms of find_first_bit and
find_next_bit (both of which we have on ARM). Seems like we should just
carry that over?

> > It's emulating a write to a GICD (read: APIC ;-)) register which causes
> > an SGI (read: IPI) to be sent. The value you write includes an 8-bit
> > mask indicating the target processors.
> >
> > See "[PATCH v3 04/13] xen/arm: support for guest SGI" in this series.
> >
> > Ian
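For context, the x86 macro being referred to is roughly the following, and carrying it over (or making it generic) would let the SGI emulation walk the 8-bit target list directly. The usage below is only a sketch based on the GICD_SGIR decoding in patch 04/13, with the vcpu and virtual_irq bounds checks omitted:

/* Roughly what the x86 macro looks like; it only relies on
 * find_first_bit()/find_next_bit(), which ARM already has. */
#define for_each_set_bit(iter, addr, size)                  \
    for ( (iter) = find_first_bit(addr, size);              \
          (iter) < (size);                                   \
          (iter) = find_next_bit(addr, size, (iter) + 1) )

/* Sketch of how the GICD_SGIR emulation could use it: walk the 8-bit
 * CPUTargetList without going through a cpumask_t at all. */
unsigned long bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT;
unsigned int vcpuid;

for_each_set_bit( vcpuid, &bits, 8 )
    vgic_vcpu_inject_irq(d->vcpu[vcpuid], virtual_irq, 1);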
Stefano Stabellini
2013-Apr-25 15:08 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 25 Apr 2013, Ian Campbell wrote:
> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
> > And it really wants to act on sets of real physical cpus (hence cpumask_t)?
>
> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?

It seems to me that cpumask_t is a pretty good fit for a vcpu mask.
On 25/04/2013 15:51, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:> On Thu, 2013-04-25 at 15:44 +0100, Ian Campbell wrote: >> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote: >>> And it really wants to act on sets of real physical cpus (hence cpumask_t)? >> >> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then? > > x86 defines for_each_set_bit in terms of find_first_bit and > find_next_bit (both of which we have on ARM). Seems like we should just > carry that over?Yup, seems a shame to duplicate it into asm-arm, but I can live with it. I can imagine some arch wanting a specialised version of it sometime. Certainly using cpumask_t for vcpus is very obviously wrong! -- Keir>>> It''s emulating a write to a GICD (read: APIC ;-)) register which causes >>> an SGI (read: IPI) to be sent. The value you write includes an 8 bit >>> mask indicating the target processors. >>> >>> See "[PATCH v3 04/13] xen/arm: support for guest SGI" in this series. >>> >>> Ian >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > >
On 25/04/2013 16:08, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:> On Thu, 25 Apr 2013, Ian Campbell wrote: >> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote: >>> And it really wants to act on sets of real physical cpus (hence cpumask_t)? >> >> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then? > > It seems to me that cpumask_t is a pretty good fit for a vcpu mask.It''s not. It''s bounded by NR_CPUS for a start. And it is good documentation for all uses of cpumasks to be applying to *physical* cpus, otherwise we will end up even more confused! I might consider a vcpumask_t of some sort if such an abstraction could be made to work out nicely (given different domains have different max# of vcpus). But we managed so far without. ;) -- Keir
>>> On 25.04.13 at 16:44, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
>> And it really wants to act on sets of real physical cpus (hence cpumask_t)?
>
> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?

I was about to say that, when I saw this reply of yours.

Jan
>>> On 25.04.13 at 17:08, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Thu, 25 Apr 2013, Ian Campbell wrote:
>> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
>> > And it really wants to act on sets of real physical cpus (hence cpumask_t)?
>>
>> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?
>
> It seems to me that cpumask_t is a pretty good fit for a vcpu mask.

No, not least because there's no correlation between the numbers of bits
in the two (i.e. you could easily have more vCPUs in a domain than there
are pCPUs in the system).

Jan
Ian Campbell
2013-Apr-25 15:23 UTC
Re: [PATCH v3 03/13] xen: introduce cpumask_from_bitmap
On Thu, 2013-04-25 at 16:13 +0100, Keir Fraser wrote:
> On 25/04/2013 15:51, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:
>
> > On Thu, 2013-04-25 at 15:44 +0100, Ian Campbell wrote:
> >> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote:
> >>> And it really wants to act on sets of real physical cpus (hence cpumask_t)?
> >>
> >> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then?
> >
> > x86 defines for_each_set_bit in terms of find_first_bit and
> > find_next_bit (both of which we have on ARM). Seems like we should just
> > carry that over?
>
> Yup, seems a shame to duplicate it into asm-arm, but I can live with it. I
> can imagine some arch wanting a specialised version of it sometime.

We could move it to generic code, perhaps with a containing ifndef for
the next arch that wants its own version...

Ian.
Stefano Stabellini
2013-Apr-25 17:04 UTC
Re: [PATCH v3 07/13] xen/arm: compile and initialize vmap
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > > VMAP_VIRT_END and VMAP_VIRT_START. > > > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM). > > So our vmap is 2MB mappings only? I suppose that''s ok, at least for now.No, I explained myself poorly, I meant that we only support 4K mappings so MAP_SMALL_PAGES is useless (always set).> > Implement map_pages_to_xen and destroy_xen_mappings. > > This involves moving the prototypes from x86 to generic, so needs Jan + > Keir, CCd. > > > > > Call vm_init from start_xen. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > +static int create_xen_table(lpae_t *entry) > > +{ > > + void *p; > > + lpae_t pte; > > + > > + p = alloc_xenheap_pages(0, 0); > > + if ( p == NULL ) > > + return -ENOMEM; > > + clear_page(p); > > + pte = mfn_to_xen_entry(virt_to_mfn(p)); > > + pte.pt.table = 1; > > + write_pte(entry, pte); > > + return 0; > > +} > > + > > +enum xenmap_operation { > > + INSERT, > > + REMOVE > > +}; > > + > > +static int create_xen_entries(enum xenmap_operation op, > > + unsigned long virt, > > + unsigned long mfn, > > + unsigned long nr_mfns) > > Shame this can''t be combined with create_p2m_entries, but that uses > domain pages and this uses xenheap pages. > > > +{ > > + int rc; > > + unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE; > > + lpae_t pte; > > + lpae_t *third = NULL; > > + > > + for(; addr < addr_end; addr += PAGE_SIZE, mfn++) > > + { > > + if ( !xen_second[second_linear_offset(addr)].pt.valid || > > + !xen_second[second_linear_offset(addr)].pt.table ) > > + { > > + rc = create_xen_table(&xen_second[second_linear_offset(addr)]); > > + if ( rc < 0 ) { > > + printk("create_xen_entries: L2 failed\n"); > > + goto out; > > + } > > + } > > + > > + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); > > + > > + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base > > + << PAGE_SHIFT); > > + if ( third[third_table_offset(addr)].pt.valid ) > > + flush_tlb_local(); > > Why this flush? (I notice create_p2m_mapping does the same but with > _all_local())I dropped the _all_local because we don''t want to flush all VMIDs here, only the hypervisor mappings. However I am not sure that this flush is actually necessary as we are going to flush the mapping later on after the pte has been written.> Isn''t it a bug for the third to be already mapped? that suggests > something is overwriting the mapping, does vmap do that?I don''t know, but I thought that this function should be able to handle that case.> > + > > + switch ( op ) { > > + case INSERT: > > + pte = mfn_to_xen_entry(mfn); > > + pte.pt.table = 1; > > + write_pte(&third[third_table_offset(addr)], pte); > > + break; > > + case REMOVE: > > + memset(&pte, 0x00, sizeof(pte)); > > AKA: pte.bits = 0;OK> > + write_pte(&third[third_table_offset(addr)], pte); > > + break; > > + default: > > + printk("create_xen_entries: invalid op\n"); > > ASSERT? This really can never happen.Right. A BUG might be better then.
On 25/04/2013 16:23, "Ian Campbell" <Ian.Campbell@citrix.com> wrote:> On Thu, 2013-04-25 at 16:13 +0100, Keir Fraser wrote: >> On 25/04/2013 15:51, "Ian Campbell" <Ian.Campbell@citrix.com> wrote: >> >>> On Thu, 2013-04-25 at 15:44 +0100, Ian Campbell wrote: >>>> On Thu, 2013-04-25 at 15:42 +0100, Ian Campbell wrote: >>>>> And it really wants to act on sets of real physical cpus (hence >>>>> cpumask_t)? >>>> >>>> Actually, vcpus. Perhaps cpumask_t is the wrong datatype then? >>> >>> x86 defines for_each_set_bit in terms of find_first_bit and >>> find_next_bit (both of which we have on ARM). Seems like we should just >>> carry that over? >> >> Yup, seems a shame to duplicate it into asm-arm, but I can live with it. I >> can imagine some arch wanting a specialised version of it sometime. > > We could move it to generic code, perhaps with a containing ifndef for > this next arch...Happy to see that, and have everyone #include xen/bitops.h rather than asm/bitops.h while we''re at it. -- Keir> Ian. >
On 04/24/2013 08:07 PM, Stefano Stabellini wrote:

> +        break;
> +    default:
> +        printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r);

The type of r is register_t, which is equal to u64 on arm64 and u32 on
arm32. You should use PRIregister.

> +            return 0;
> +        }
> +
> +        for_each_cpu( vcpuid, &vcpu_mask )
> +        {
> +            if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ||
> +                 virtual_irq >= 16 )
> +            {
> +                printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n",
> +                       *r, *cpumask_bits(&vcpu_mask));

Same here.

Cheers,
Julien
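Concretely, the fix Julien is asking for would look roughly like this, following the %#"PRIregister" pattern already used elsewhere in vgic.c (a sketch of the corrected format strings, not the resent patch):

/* register_t is printed with PRIregister so the format string is
 * correct on both arm32 (u32) and arm64 (u64). */
printk("vGICD: unhandled GICD_SGIR write %#"PRIregister
       " with wrong TargetListFilter field\n", *r);

printk("vGICD: GICD_SGIR write r=%#"PRIregister
       " vcpu_mask=%lx, wrong CPUTargetList\n",
       *r, *cpumask_bits(&vcpu_mask));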
Ian Campbell
2013-Apr-26 09:00 UTC
Re: [PATCH v3 07/13] xen/arm: compile and initialize vmap
On Thu, 2013-04-25 at 18:04 +0100, Stefano Stabellini wrote:> On Thu, 25 Apr 2013, Ian Campbell wrote: > > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > > Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > > > VMAP_VIRT_END and VMAP_VIRT_START. > > > > > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > > > > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM). > > > > So our vmap is 2MB mappings only? I suppose that''s ok, at least for now. > > No, I explained myself poorly, I meant that we only support 4K mappings > so MAP_SMALL_PAGES is useless (always set).Ah, that makes more sense, thanks. (I take it you''ll adjust the changelog?)> > > + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); > > > + > > > + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base > > > + << PAGE_SHIFT); > > > + if ( third[third_table_offset(addr)].pt.valid ) > > > + flush_tlb_local(); > > > > Why this flush? (I notice create_p2m_mapping does the same but with > > _all_local()) > > I dropped the _all_local because we don''t want to flush all VMIDs here, > only the hypervisor mappings. > However I am not sure that this flush is actually necessary as we are > going to flush the mapping later on after the pte has been written.Right, that was what I was really trying to ask: why is this flush necessary at all (not why did you drop the all_local bit). I can''t see any reason why it would be needed on INSERT (really UPDATE if it''s already present) and on REMOVE the flush is obviously needed, but not until after you''ve cleared the entry?> > Isn''t it a bug for the third to be already mapped? that suggests > > something is overwriting the mapping, does vmap do that? > > I don''t know, but I thought that this function should be able to handle > that case.At the vmap layer I think it must always be a bug for it to be trying to replace a mapping, the API simply doesn''t allow for the possibility. At the create_xen_mapping/map_pages_to_xen layer I think it would be very uncommon to be replacing an existing Xen mapping, to the extent that I think it should be an explicit separate UPDATE operation via an update_xen_mapping() style API. But until we have a real usecase for that I think we should just fail such attempts noisily -- they almost certainly represent something gone awry. Ian.
On Thu, 2013-04-25 at 20:03 +0100, Julien Grall wrote:
> On 04/24/2013 08:07 PM, Stefano Stabellini wrote:
>
>> +        break;
>> +    default:
>> +        printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r);
>
> The type of r is register_t, which is equal to u64 on arm64 and u32 on
> arm32. You should use PRIregister.

This is true, I expect it breaks the compile on arm64?

The cross compiler I use is at
http://www.linaro.org/engineering/armv8/#tab3
I've got gcc-linaro-aarch64-linux-gnu-4.7-2013.01-20130125_linux but no
reason why 2013.04 shouldn't also work (I've not tried the 4.8 which I
see has appeared there now).

If you want to go mad for it then instructions for cross-compiling tools
for arm64 are at
http://wiki.xen.org/wiki/Xen_ARM_with_Virtualization_Extensions/CrossCompiling
Annoyingly these don't work for the hypervisor right now, for some weird
linker-related reason which I've not got to the bottom of.

Ian.
On 04/26/2013 10:05 AM, Ian Campbell wrote:
> On Thu, 2013-04-25 at 20:03 +0100, Julien Grall wrote:
>> On 04/24/2013 08:07 PM, Stefano Stabellini wrote:
>>
>>> +        break;
>>> +    default:
>>> +        printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r);
>>
>> The type of r is register_t, which is equal to u64 on arm64 and u32 on
>> arm32. You should use PRIregister.
>
> This is true, I expect it breaks the compile on arm64?

Right.

> The cross compiler I use is at
> http://www.linaro.org/engineering/armv8/#tab3
> I've got gcc-linaro-aarch64-linux-gnu-4.7-2013.01-20130125_linux but no
> reason why 2013.04 shouldn't also work (I've not tried the 4.8 which I
> see has appeared there now)

I tried the latest, i.e. the 4.8: the compilation works. For the moment I
haven't tried to boot xen.

Julien
Stefano Stabellini
2013-Apr-26 12:02 UTC
Re: [PATCH v3 07/13] xen/arm: compile and initialize vmap
On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > Isn't it a bug for the third to be already mapped? That suggests
> > > something is overwriting the mapping, does vmap do that?
> >
> > I don't know, but I thought that this function should be able to handle
> > that case.
>
> At the vmap layer I think it must always be a bug for it to be trying to
> replace a mapping, the API simply doesn't allow for the possibility.

Empirical tests give evidence to the contrary: vmap does replace mappings.
On 24 April 2013 21:07, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:> Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right > guest vcpu. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Changes in v3: > - make use of cpumask_from_bitmap. > --- > xen/arch/arm/vgic.c | 55 ++++++++++++++++++++++++++++++++++++++++---- > xen/include/asm-arm/gic.h | 3 ++ > 2 files changed, 53 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index b30da78..8912aab 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -370,6 +370,7 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) > > static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > { > + struct domain *d = v->domain; > struct hsr_dabt dabt = info->dabt; > struct cpu_user_regs *regs = guest_cpu_user_regs(); > register_t *r = select_user_reg(regs, dabt.reg); > @@ -498,11 +499,55 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > goto write_ignore; > > case GICD_SGIR: > - if ( dabt.size != 2 ) goto bad_width; > - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", > - *r, gicd_reg - GICD_ICFGR); > - return 0; > - > + { > + cpumask_t vcpu_mask; > + int virtual_irq; > + int filter; > + int vcpuid; > + struct vcpu *vt; > + int i; > + unsigned long bits; > + > + if ( dabt.size != 2 ) goto bad_width; > + > + filter = (*r & GICD_SGI_TARGET_LIST_MASK); > + virtual_irq = (*r & GICD_SGI_INTID_MASK); > + > + cpumask_clear(&vcpu_mask); > + switch ( filter ) > + { > + case GICD_SGI_TARGET_LIST: > + bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; > + vcpu_mask = cpumask_from_bitmap(&bits, 8); > + break; > + case GICD_SGI_TARGET_OTHERS: > + for ( i = 0; i < d->max_vcpus; i++ ) > + { > + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) > + cpumask_set_cpu(i, &vcpu_mask); > + } > + case GICD_SGI_TARGET_SELF: > + vcpu_mask = *cpumask_of(current->vcpu_id); > + break; > + default: > + printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r); > + return 0; > + } > + > + for_each_cpu( vcpuid, &vcpu_mask ) > + { > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL || > + virtual_irq >= 16 ) > + { > + printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n", > + *r, *cpumask_bits(&vcpu_mask)); > + return 0; > + } > + vgic_vcpu_inject_irq(vt, virtual_irq, 1); > + } > + return 1; > + } > + > case GICD_CPENDSGIR ... GICD_CPENDSGIRN: > if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; > printk("vGICD: unhandled %s write %#"PRIregister" to ICPENDSGIR%d\n", > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > index 92711d5..4f2c8b8 100644 > --- a/xen/include/asm-arm/gic.h > +++ b/xen/include/asm-arm/gic.h > @@ -51,12 +51,15 @@ > #define GICD_SPENDSGIRN (0xF2C/4) > #define GICD_ICPIDR2 (0xFE8/4) > > +#define GICD_SGI_TARGET_LIST_SHIFT (24) > +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) > #define GICD_SGI_TARGET_LIST (0UL<<24) > #define GICD_SGI_TARGET_OTHERS (1UL<<24) > #define GICD_SGI_TARGET_SELF (2UL<<24) > #define GICD_SGI_TARGET_SHIFT (16) > #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) > #define GICD_SGI_GROUP1 (1UL<<15) > +#define GICD_SGI_INTID_MASK (0xFUL) > > #define GICC_CTLR (0x0000/4) > #define GICC_PMR (0x0004/4) > -- > 1.7.2.5 > >Hi, I''ve been running crashme ( executing random instructions ) on Xen on Arm installed on an Arndale board. 
I'm using Julien's dom0 and Xen branches. The Xen branch wasn't updated in
the last few days.

I managed to crash Xen from dom0 userspace:

(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Unhandled SGI 2 on CPU1
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Tracking this back to the source:
xen/arch/arm/gic.c:648 static void do_sgi
xen/arch/arm/gic.c:671 void gic_interrupt

From what I understood from the source, all SGIs except 0 & 1 would/will
cause this panic since they are just not handled. I was wondering if this
patch would fix that, because going over it didn't convince me it would.
The reason I'm asking rather than just trying it is that it's not easy to
reproduce a crashme crash.

Sander
Stefano Stabellini
2013-Apr-26 13:07 UTC
Re: [PATCH v3 04/13] xen/arm: support for guest SGI
On Fri, 26 Apr 2013, Sander Bogaert wrote:> On 24 April 2013 21:07, Stefano Stabellini > <stefano.stabellini@eu.citrix.com> wrote: > > Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right > > guest vcpu. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > Changes in v3: > > - make use of cpumask_from_bitmap. > > --- > > xen/arch/arm/vgic.c | 55 ++++++++++++++++++++++++++++++++++++++++---- > > xen/include/asm-arm/gic.h | 3 ++ > > 2 files changed, 53 insertions(+), 5 deletions(-) > > > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > > index b30da78..8912aab 100644 > > --- a/xen/arch/arm/vgic.c > > +++ b/xen/arch/arm/vgic.c > > @@ -370,6 +370,7 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) > > > > static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > > { > > + struct domain *d = v->domain; > > struct hsr_dabt dabt = info->dabt; > > struct cpu_user_regs *regs = guest_cpu_user_regs(); > > register_t *r = select_user_reg(regs, dabt.reg); > > @@ -498,11 +499,55 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > > goto write_ignore; > > > > case GICD_SGIR: > > - if ( dabt.size != 2 ) goto bad_width; > > - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", > > - *r, gicd_reg - GICD_ICFGR); > > - return 0; > > - > > + { > > + cpumask_t vcpu_mask; > > + int virtual_irq; > > + int filter; > > + int vcpuid; > > + struct vcpu *vt; > > + int i; > > + unsigned long bits; > > + > > + if ( dabt.size != 2 ) goto bad_width; > > + > > + filter = (*r & GICD_SGI_TARGET_LIST_MASK); > > + virtual_irq = (*r & GICD_SGI_INTID_MASK); > > + > > + cpumask_clear(&vcpu_mask); > > + switch ( filter ) > > + { > > + case GICD_SGI_TARGET_LIST: > > + bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; > > + vcpu_mask = cpumask_from_bitmap(&bits, 8); > > + break; > > + case GICD_SGI_TARGET_OTHERS: > > + for ( i = 0; i < d->max_vcpus; i++ ) > > + { > > + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) > > + cpumask_set_cpu(i, &vcpu_mask); > > + } > > + case GICD_SGI_TARGET_SELF: > > + vcpu_mask = *cpumask_of(current->vcpu_id); > > + break; > > + default: > > + printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r); > > + return 0; > > + } > > + > > + for_each_cpu( vcpuid, &vcpu_mask ) > > + { > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL || > > + virtual_irq >= 16 ) > > + { > > + printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n", > > + *r, *cpumask_bits(&vcpu_mask)); > > + return 0; > > + } > > + vgic_vcpu_inject_irq(vt, virtual_irq, 1); > > + } > > + return 1; > > + } > > + > > case GICD_CPENDSGIR ... 
GICD_CPENDSGIRN: > > if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; > > printk("vGICD: unhandled %s write %#"PRIregister" to ICPENDSGIR%d\n", > > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > > index 92711d5..4f2c8b8 100644 > > --- a/xen/include/asm-arm/gic.h > > +++ b/xen/include/asm-arm/gic.h > > @@ -51,12 +51,15 @@ > > #define GICD_SPENDSGIRN (0xF2C/4) > > #define GICD_ICPIDR2 (0xFE8/4) > > > > +#define GICD_SGI_TARGET_LIST_SHIFT (24) > > +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) > > #define GICD_SGI_TARGET_LIST (0UL<<24) > > #define GICD_SGI_TARGET_OTHERS (1UL<<24) > > #define GICD_SGI_TARGET_SELF (2UL<<24) > > #define GICD_SGI_TARGET_SHIFT (16) > > #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) > > #define GICD_SGI_GROUP1 (1UL<<15) > > +#define GICD_SGI_INTID_MASK (0xFUL) > > > > #define GICC_CTLR (0x0000/4) > > #define GICC_PMR (0x0004/4) > > -- > > 1.7.2.5 > > > > > > Hi, > > I''ve been running crashme ( executing random instructions ) on Xen on > Arm installed on an Arndale board. I''m using Julien''s dom0 and Xen > branch. The Xen branch wasn''t updated in a few days. > > I managed to crash Xen from dom0 userspace: > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 1: > (XEN) Unhandled SGI 2 on CPU1 > (XEN) **************************************** > (XEN) > (XEN) Reboot in five seconds... > > Tracking this back to the source: > xen/arch/arm/gic.c:648 static void do_sgi > xen/arch/arm/gic.c:671 void gic_interrupt > > >From what I understood from the source all sgi''s except 0 & 1 > would/will cause this panic since they are just not handled. I was > wondering if this patch would fix that because going over it didn''t > convince me it would. The reason I''m asking and not just trying it > because it''s not easy to reproduce a crashme crash.I don''t think so: this patch is for guest SGIs ("emulated" SGIs), while the crash is because of a physical SGI.
Ian Campbell
2013-Apr-26 13:21 UTC
Re: [PATCH v3 07/13] xen/arm: compile and initialize vmap
On Fri, 2013-04-26 at 13:02 +0100, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > > Isn't it a bug for the third to be already mapped? That suggests
> > > > something is overwriting the mapping, does vmap do that?
> > >
> > > I don't know, but I thought that this function should be able to handle
> > > that case.
> >
> > At the vmap layer I think it must always be a bug for it to be trying to
> > replace a mapping, the API simply doesn't allow for the possibility.
>
> Empirical tests give evidence to the contrary: vmap does replace
> mappings.

Empirical test == map, unmap, map, where the second map uses the same
address space as the first? vunmap is:

void vunmap(const void *va)
{
    unsigned long addr = (unsigned long)va;

    destroy_xen_mappings(addr, addr + PAGE_SIZE * vm_size(va));
    vm_free(va);
}

Are you sure our destroy_xen_mappings isn't buggy?

Ian.
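A minimal sketch of the map/unmap/map experiment Ian is describing, assuming the vmap()/vunmap() prototypes from common/vmap.c of the time (vmap taking a pointer to an MFN array and a count); if the second vmap() hands back the same virtual address as the first, the remap is reusing a VA whose mapping destroy_xen_mappings() should already have torn down:

/* Sketch only, not part of the series: map a frame, unmap it, map a
 * different frame and compare the virtual addresses.  If the reused
 * VA's third-level PTE is still valid on the second map, that points
 * at the destroy_xen_mappings() bug being discussed. */
static void __init vmap_remap_check(unsigned long mfn1, unsigned long mfn2)
{
    void *va1 = vmap(&mfn1, 1);
    void *va2;

    vunmap(va1);

    va2 = vmap(&mfn2, 1);
    printk("vmap reuse check: va1=%p va2=%p%s\n",
           va1, va2, (va1 == va2) ? " (VA reused)" : "");
    vunmap(va2);
}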
Stefano Stabellini
2013-Apr-26 13:23 UTC
Re: [PATCH v3 02/13] xen/arm: allocate secondaries dom0 vcpus
On Thu, 25 Apr 2013, Ian Campbell wrote:
> > +    }
> > +
> >      local_abort_enable();
>
> Aside: I wonder what led this to get put here. It doesn't happen on
> non-boot CPUs for one thing!

Interesting... it doesn't look wrong to me. Maybe we should enable it on
non-boot cpus too.
On Fri, 2013-04-26 at 13:12 +0100, Sander Bogaert wrote:
> I've been running crashme ( executing random instructions ) on Xen on
> Arm installed on an Arndale board. I'm using Julien's dom0 and Xen
> branches. The Xen branch wasn't updated in the last few days.
>
> I managed to crash Xen from dom0 userspace:
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 1:
> (XEN) Unhandled SGI 2 on CPU1
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> Tracking this back to the source:
> xen/arch/arm/gic.c:648 static void do_sgi
> xen/arch/arm/gic.c:671 void gic_interrupt
>
> From what I understood from the source all SGIs except 0 & 1
> would/will cause this panic since they are just not handled.

Do you know how the SGI was triggered? Because if guest/dom0 userspace can
trigger an arbitrary SGI then we have big problems!

Julien, does your tree have anything in it which might explain this?

Ian.
Ian Campbell
2013-Apr-26 13:25 UTC
Re: [PATCH v3 02/13] xen/arm: allocate secondaries dom0 vcpus
On Fri, 2013-04-26 at 14:23 +0100, Stefano Stabellini wrote:
> On Thu, 25 Apr 2013, Ian Campbell wrote:
> > > +    }
> > > +
> > >      local_abort_enable();
> >
> > Aside: I wonder what led this to get put here. It doesn't happen on
> > non-boot CPUs for one thing!
>
> Interesting... it doesn't look wrong to me.

The issue is that construct_dom0 really shouldn't be responsible for setting
up global host CPU state.

> Maybe we should enable it on non-boot cpus too.

I sent out "xen: arm: enable aborts on all physical processors." yesterday.

Ian
On 04/26/2013 01:12 PM, Sander Bogaert wrote:> On 24 April 2013 21:07, Stefano Stabellini > <stefano.stabellini@eu.citrix.com> wrote: >> Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right >> guest vcpu. >> >> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> >> Changes in v3: >> - make use of cpumask_from_bitmap. >> --- >> xen/arch/arm/vgic.c | 55 ++++++++++++++++++++++++++++++++++++++++---- >> xen/include/asm-arm/gic.h | 3 ++ >> 2 files changed, 53 insertions(+), 5 deletions(-) >> >> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c >> index b30da78..8912aab 100644 >> --- a/xen/arch/arm/vgic.c >> +++ b/xen/arch/arm/vgic.c >> @@ -370,6 +370,7 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) >> >> static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) >> { >> + struct domain *d = v->domain; >> struct hsr_dabt dabt = info->dabt; >> struct cpu_user_regs *regs = guest_cpu_user_regs(); >> register_t *r = select_user_reg(regs, dabt.reg); >> @@ -498,11 +499,55 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) >> goto write_ignore; >> >> case GICD_SGIR: >> - if ( dabt.size != 2 ) goto bad_width; >> - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", >> - *r, gicd_reg - GICD_ICFGR); >> - return 0; >> - >> + { >> + cpumask_t vcpu_mask; >> + int virtual_irq; >> + int filter; >> + int vcpuid; >> + struct vcpu *vt; >> + int i; >> + unsigned long bits; >> + >> + if ( dabt.size != 2 ) goto bad_width; >> + >> + filter = (*r & GICD_SGI_TARGET_LIST_MASK); >> + virtual_irq = (*r & GICD_SGI_INTID_MASK); >> + >> + cpumask_clear(&vcpu_mask); >> + switch ( filter ) >> + { >> + case GICD_SGI_TARGET_LIST: >> + bits = (*r & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; >> + vcpu_mask = cpumask_from_bitmap(&bits, 8); >> + break; >> + case GICD_SGI_TARGET_OTHERS: >> + for ( i = 0; i < d->max_vcpus; i++ ) >> + { >> + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) >> + cpumask_set_cpu(i, &vcpu_mask); >> + } >> + case GICD_SGI_TARGET_SELF: >> + vcpu_mask = *cpumask_of(current->vcpu_id); >> + break; >> + default: >> + printk("vGICD: unhandled GICD_SGIR write %x with wrong TargetListFilter field\n", *r); >> + return 0; >> + } >> + >> + for_each_cpu( vcpuid, &vcpu_mask ) >> + { >> + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL || >> + virtual_irq >= 16 ) >> + { >> + printk("vGICD: GICD_SGIR write r=%x vcpu_mask=%lx, wrong CPUTargetList\n", >> + *r, *cpumask_bits(&vcpu_mask)); >> + return 0; >> + } >> + vgic_vcpu_inject_irq(vt, virtual_irq, 1); >> + } >> + return 1; >> + } >> + >> case GICD_CPENDSGIR ... 
GICD_CPENDSGIRN: >> if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; >> printk("vGICD: unhandled %s write %#"PRIregister" to ICPENDSGIR%d\n", >> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h >> index 92711d5..4f2c8b8 100644 >> --- a/xen/include/asm-arm/gic.h >> +++ b/xen/include/asm-arm/gic.h >> @@ -51,12 +51,15 @@ >> #define GICD_SPENDSGIRN (0xF2C/4) >> #define GICD_ICPIDR2 (0xFE8/4) >> >> +#define GICD_SGI_TARGET_LIST_SHIFT (24) >> +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) >> #define GICD_SGI_TARGET_LIST (0UL<<24) >> #define GICD_SGI_TARGET_OTHERS (1UL<<24) >> #define GICD_SGI_TARGET_SELF (2UL<<24) >> #define GICD_SGI_TARGET_SHIFT (16) >> #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) >> #define GICD_SGI_GROUP1 (1UL<<15) >> +#define GICD_SGI_INTID_MASK (0xFUL) >> >> #define GICC_CTLR (0x0000/4) >> #define GICC_PMR (0x0004/4) >> -- >> 1.7.2.5 >> >> > > Hi, > > I''ve been running crashme ( executing random instructions ) on Xen on > Arm installed on an Arndale board. I''m using Julien''s dom0 and Xen > branch. The Xen branch wasn''t updated in a few days. > > I managed to crash Xen from dom0 userspace: > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 1: > (XEN) Unhandled SGI 2 on CPU1 > (XEN) **************************************** > (XEN) > (XEN) Reboot in five seconds... > > Tracking this back to the source: > xen/arch/arm/gic.c:648 static void do_sgi > xen/arch/arm/gic.c:671 void gic_interrupt >> From what I understood from the source all sgi''s except 0 & 1 > would/will cause this panic since they are just not handled. I was > wondering if this patch would fix that because going over it didn''t > convince me it would. The reason I''m asking and not just trying it > because it''s not easy to reproduce a crashme crash.Could you give me the latest commit ID/tag you use? On the latest arndale branch, SGI 2 should be handled. Xen use this SGI to call function on another CPU. -- Julien
On 04/26/2013 02:24 PM, Ian Campbell wrote:
> On Fri, 2013-04-26 at 13:12 +0100, Sander Bogaert wrote:
>> I've been running crashme ( executing random instructions ) on Xen on
>> Arm installed on an Arndale board. I'm using Julien's dom0 and Xen
>> branches. The Xen branch wasn't updated in the last few days.
>>
>> I managed to crash Xen from dom0 userspace:
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 1:
>> (XEN) Unhandled SGI 2 on CPU1
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> Tracking this back to the source:
>> xen/arch/arm/gic.c:648 static void do_sgi
>> xen/arch/arm/gic.c:671 void gic_interrupt
>>
>> From what I understood from the source all SGIs except 0 & 1
>> would/will cause this panic since they are just not handled.
>
> Do you know how the SGI was triggered? Because if guest/dom0 userspace
> can trigger an arbitrary SGI then we have big problems!
>
> Julien, does your tree have anything in it which might explain this?

This SGI should only be triggered to call a function on another CPU. But
it seems his xen tree doesn't have the patch applied. So nobody should
call this SGI.

 -- Julien
Stefano Stabellini
2013-Apr-26 13:39 UTC
Re: [PATCH v3 11/13] xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > If we need to inject an irq into a guest that is running on a different > > ^VCPU > > > processor, we shouldn''t just enqueue the irq into the lr_pending and > > inflight lists and wait for something to interrupt the guest execution. > > Send an IPI to the target pcpu so that Xen can inject the new interrupt > > returning to guest. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Acked-by: Ian Campbell <ian.campbell@citrix.com> > > > --- > > xen/arch/arm/vgic.c | 2 ++ > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > > index 8912aab..547a2ab 100644 > > --- a/xen/arch/arm/vgic.c > > +++ b/xen/arch/arm/vgic.c > > @@ -664,6 +664,8 @@ out: > > spin_unlock_irqrestore(&v->arch.vgic.lock, flags); > > /* we have a new higher priority irq, inject it into the guest */ > > vcpu_unblock(v); > > + if ( v != current && v->is_running ) > > + smp_send_event_check_mask(cpumask_of(v->processor)); > > I''m a bit concerned about races here, e.g. where v is busy going to > sleep. > I don''t know enough about the behaviour of the scheduler and > vcpu_unblock etc to say, but it worries me...That is that the vcpu_unblock call above is for. I think that should be enough to make sure that the scheduler doesn''t shelve the vcpu.> x86 has vcpu_kick which saves v->is_running before the unblock and then > sends a softirq with much the same checks as you have -- that seems like > a plausible race avoidance strategy, although whether IPI vs softirq is > an important distinction I''ve no idea...Yes, it looks like the code should be race free, although it might be possible to improve it by checking is_running before vcpu_unblock to avoid a useless IPI. I could make that change. I don''t think that the softirq vs IPI thing matters: the purpose is to force the other vcpu to check for events in the near future, the exact point in time shouldn''t make a difference.> BTW smp_send_event_check_mask(cpumask_of(v->processor)) is the same as > smp_send_event_check_cpu(v->processor).Yeah, I''ll make this change.
On 26-04-13 15:35, Julien Grall wrote:
> On 04/26/2013 02:24 PM, Ian Campbell wrote:
>
>> On Fri, 2013-04-26 at 13:12 +0100, Sander Bogaert wrote:
>>> I've been running crashme ( executing random instructions ) on Xen on
>>> Arm installed on an Arndale board. I'm using Julien's dom0 and Xen
>>> branches. The Xen branch wasn't updated in the last few days.
>>>
>>> I managed to crash Xen from dom0 userspace:
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 1:
>>> (XEN) Unhandled SGI 2 on CPU1
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds...
>>>
>>> Tracking this back to the source:
>>> xen/arch/arm/gic.c:648 static void do_sgi
>>> xen/arch/arm/gic.c:671 void gic_interrupt
>>>
>>> From what I understood from the source all SGIs except 0 & 1
>>> would/will cause this panic since they are just not handled.
>>
>> Do you know how the SGI was triggered? Because if guest/dom0 userspace
>> can trigger an arbitrary SGI then we have big problems!
>>
>> Julien, does your tree have anything in it which might explain this?
>
> This SGI should only be triggered to call a function on another CPU. But
> it seems his xen tree doesn't have the patch applied. So nobody should
> call this SGI.

I've been using 5ce4118f5768c6137d58888d57972bdfdf4c9aba in the branch
'arndale' of this repo:
git://xenbits.xen.org/people/julieng/xen-unstable.git .

I will try to reproduce this; I did log my initial seed for crashme and the
ones leading up to the crash. Is there something I can enable to get some
more output? I thought Xen would print some data ( I've seen that before,
the register contents & such + a stacktrace ).

I should have started a new thread for this, I thought it was related to
what this patch fixes.

Sander
Ian Campbell
2013-Apr-26 14:08 UTC
Re: [PATCH v3 11/13] xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
On Fri, 2013-04-26 at 14:39 +0100, Stefano Stabellini wrote:
> On Thu, 25 Apr 2013, Ian Campbell wrote:
> > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:
> > > If we need to inject an irq into a guest that is running on a different
> >
> >                                                 ^VCPU
> >
> > > processor, we shouldn't just enqueue the irq into the lr_pending and
> > > inflight lists and wait for something to interrupt the guest execution.
> > > Send an IPI to the target pcpu so that Xen can inject the new interrupt
> > > returning to guest.
> > >
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> >
> > > ---
> > >  xen/arch/arm/vgic.c |    2 ++
> > >  1 files changed, 2 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> > > index 8912aab..547a2ab 100644
> > > --- a/xen/arch/arm/vgic.c
> > > +++ b/xen/arch/arm/vgic.c
> > > @@ -664,6 +664,8 @@ out:
> > >      spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> > >      /* we have a new higher priority irq, inject it into the guest */
> > >      vcpu_unblock(v);
> > > +    if ( v != current && v->is_running )
> > > +        smp_send_event_check_mask(cpumask_of(v->processor));
> >
> > I'm a bit concerned about races here, e.g. where v is busy going to
> > sleep.
> > I don't know enough about the behaviour of the scheduler and
> > vcpu_unblock etc to say, but it worries me...
>
> That is what the vcpu_unblock call above is for.
> I think that should be enough to make sure that the scheduler doesn't
> shelve the vcpu.

If the vcpu is already running then vcpu_unblock won't do anything
(useful). If between vcpu_unblock and if ( v->is_running ) the vcpu,
currently running on another PCPU, blocks then you will not wake it up.

> > x86 has vcpu_kick which saves v->is_running before the unblock and then
> > sends a softirq with much the same checks as you have -- that seems like
> > a plausible race avoidance strategy, although whether IPI vs softirq is
> > an important distinction I've no idea...
>
> Yes, it looks like the code should be race free, although it might be
> possible to improve it by checking is_running before vcpu_unblock to
> avoid a useless IPI. I could make that change.

I think you should; even if I'm worrying about nothing it is more obviously
correct, and a spurious wake is better than a missed one.

> I don't think that the softirq vs IPI thing matters: the purpose is to
> force the other vcpu to check for events in the near future, the exact
> point in time shouldn't make a difference.

The comment on x86 suggests that the softirq serves a wider purpose, which
is why I was asking:

/*
 * Nothing to do here: we merely prevent notifiers from racing with checks
 * executed on return to guest context with interrupts enabled. See, for
 * example, xxx_intr_assist() executed on return to HVM guest context.
 */

Ian.
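To make the x86-style ordering concrete, an ARM kick helper along the lines being discussed could look roughly like this. This is a sketch only: the point is sampling is_running *before* the unblock, and whether an event-check IPI or a dedicated softirq is used is exactly the open question above.

/* Sketch: mirror x86's vcpu_kick() ordering so a vcpu that blocks in
 * the window between the unblock and the check cannot be missed; at
 * worst a spurious IPI is sent, never a needed one skipped. */
static void vgic_kick_vcpu(struct vcpu *v)
{
    bool_t running = v->is_running;

    vcpu_unblock(v);
    if ( running && v != current )
        smp_send_event_check_mask(cpumask_of(v->processor));
}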
On Fri, 2013-04-26 at 14:56 +0100, Sander Bogaert wrote:> On 26-04-13 15:35, Julien Grall wrote: > > On 04/26/2013 02:24 PM, Ian Campbell wrote: > > > >> On Fri, 2013-04-26 at 13:12 +0100, Sander Bogaert wrote: > >>> I''ve been running crashme ( executing random instructions ) on Xen on > >>> Arm installed on an Arndale board. I''m using Julien''s dom0 and Xen > >>> branch. The Xen branch wasn''t updated in a few days. > >>> > >>> I managed to crash Xen from dom0 userspace: > >>> (XEN) > >>> (XEN) **************************************** > >>> (XEN) Panic on CPU 1: > >>> (XEN) Unhandled SGI 2 on CPU1 > >>> (XEN) **************************************** > >>> (XEN) > >>> (XEN) Reboot in five seconds... > >>> > >>> Tracking this back to the source: > >>> xen/arch/arm/gic.c:648 static void do_sgi > >>> xen/arch/arm/gic.c:671 void gic_interrupt > >>> > >>> From what I understood from the source all sgi''s except 0 & 1 > >>> would/will cause this panic since they are just not handled. > >> > >> Do you know how the SGI was triggered, because if guest/dom0 userspace > >> can trigger an arbitrary SGI then we have big problems! > >> > >> Julien, does your tree have anything in it which might explain this? > > > > > > This SGI should only be triggered to call a function on another CPU. But > > it seems his xen tree doesn''t have the patch applied. So nobody should > > call this SGI. > > > > I''ve been using 5ce4118f5768c6137d58888d57972bdfdf4c9aba in the branch > ''arndale'' on this repo: > git://xenbits.xen.org/people/julieng/xen-unstable.git . > > I will try to reproduce this, I did log my initial seed for crashme and > the ones leading up to the crash. Is there something I can enable to get > some more output? I thought Xen would print some data ( I''ve seen that > before, the register content & such + a stacktrace ).I think you can change the panic into a BUG_ON(1) which will get you more info. If you can repro then please can you share crashme + the seed with us?> I should have started a new thread for this, I thought it was related to > what this patch fixes.No problem, please do start a new thread for any further updates. Ian.
Stefano Stabellini
2013-Apr-26 14:30 UTC
Re: [PATCH v3 12/13] xen/arm: start the vtimer Xen timers on the processor they should be running on
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > The Xen physical timer emulator and virtual timer driver use two > > internal Xen timers: initialize them on the processor the vcpu is > > going to be running on, rather than the processor that it''s creating the > > vcpu. > > But this CPU can change as the vcpu migrates around the pcpus, do we not > need to handle that case too? > > Surely we need some code in virt_timer_restore at least? Possibly a call > to migrate_timer? Although perhaps there is a less expensive way when we > know the timer isn''t running?I think that the comment above is misleading: the issue is that they need to be both initialized with the same starting time and because vcpu_vtimer_init is a vcpu initialization function they are called at different times, therefore they are going to have a different starting time on the two vcpus. So the problem is not the physical cpu time but the virtual cpu time.> > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > xen/arch/arm/vtimer.c | 4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > index 1cb365e..2444851 100644 > > --- a/xen/arch/arm/vtimer.c > > +++ b/xen/arch/arm/vtimer.c > > @@ -48,7 +48,7 @@ int vcpu_vtimer_init(struct vcpu *v) > > { > > struct vtimer *t = &v->arch.phys_timer; > > > > - init_timer(&t->timer, phys_timer_expired, t, smp_processor_id()); > > + init_timer(&t->timer, phys_timer_expired, t, v->processor); > > t->ctl = 0; > > t->offset = NOW(); > > t->cval = NOW(); > > @@ -56,7 +56,7 @@ int vcpu_vtimer_init(struct vcpu *v) > > t->v = v; > > > > t = &v->arch.virt_timer; > > - init_timer(&t->timer, virt_timer_expired, t, smp_processor_id()); > > + init_timer(&t->timer, virt_timer_expired, t, v->processor); > > t->ctl = 0; > > t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > > t->cval = 0; > > >
Stefano Stabellini
2013-Apr-26 14:30 UTC
Re: [PATCH v3 13/13] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Thu, 25 Apr 2013, Ian Campbell wrote:> On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote: > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > xen/arch/arm/vtimer.c | 18 ++++++++++++++---- > > xen/include/asm-arm/domain.h | 26 +++++++++++++++++--------- > > 2 files changed, 31 insertions(+), 13 deletions(-) > > > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > index 2444851..49858e7 100644 > > --- a/xen/arch/arm/vtimer.c > > +++ b/xen/arch/arm/vtimer.c > > @@ -46,20 +46,30 @@ static void virt_timer_expired(void *data) > > > > int vcpu_vtimer_init(struct vcpu *v) > > { > > + struct vtimer_base *b = &v->domain->arch.phys_timer_base; > > struct vtimer *t = &v->arch.phys_timer; > > > > + if ( !b->offset ) > > + b->offset = NOW(); > > + if ( !b->cval ) > > + b->cval = NOW(); > > Initialisation of domain global state should be done in > domain_vtimer_init (which you may need to add) IMHO.Right, that would be nicer> > init_timer(&t->timer, phys_timer_expired, t, v->processor); > > t->ctl = 0; > > - t->offset = NOW(); > > - t->cval = NOW(); > > + t->offset = b->offset; > > > > + t->cval = b->cval; > > t->irq = 30; > > t->v = v; > > > > + b = &v->domain->arch.virt_timer_base; > > + if ( !b->offset ) > > + b->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > > + if ( !b->cval ) > > + b->cval = 0; > > t = &v->arch.virt_timer; > > init_timer(&t->timer, virt_timer_expired, t, v->processor); > > t->ctl = 0; > > - t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > > - t->cval = 0; > > + t->offset = b->offset; > > + t->cval = b->cval; > > Are you sure this is as simple as this? > > At the very least I''m surprised that we don''t need to consult b->offset > anywhere other than at init time, if vcpus are supposed to see a uniform > view of time. > > Or is it the case that what actually needs to happen is that t->offset > goes away and b->offset is used everywhere?That would be another (probably better) way of doing it: the only important thing is that the start time on both vcpus is the same.> > > t->irq = 27; > > t->v = v; > > > > diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h > > index 3fa266c2..d001802 100644 > > --- a/xen/include/asm-arm/domain.h > > +++ b/xen/include/asm-arm/domain.h > > @@ -47,6 +47,20 @@ enum domain_type { > > #define is_pv64_domain(d) (0) > > #endif > > > > +struct vtimer { > > + struct vcpu *v; > > + int irq; > > + struct timer timer; > > + uint32_t ctl; > > + uint64_t offset; > > + uint64_t cval; > > +}; > > + > > +struct vtimer_base { > > + uint64_t offset; > > + uint64_t cval; > > +}; > > It would be consistent with what is there to inline this into > arch_domain.OK
Ian Campbell
2013-Apr-26 14:35 UTC
Re: [PATCH v3 12/13] xen/arm: start the vtimer Xen timers on the processor they should be running on
On Fri, 2013-04-26 at 15:30 +0100, Stefano Stabellini wrote:
> On Thu, 25 Apr 2013, Ian Campbell wrote:
> > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:
> > > The Xen physical timer emulator and virtual timer driver use two
> > > internal Xen timers: initialize them on the processor the vcpu is
> > > going to be running on, rather than the processor that is creating the
> > > vcpu.
> >
> > But this CPU can change as the vcpu migrates around the pcpus, do we not
> > need to handle that case too?
> >
> > Surely we need some code in virt_timer_restore at least? Possibly a call
> > to migrate_timer? Although perhaps there is a less expensive way when we
> > know the timer isn't running?
>
> I think that the comment above is misleading: the issue is that they
> need to be both initialized with the same starting time and because
> vcpu_vtimer_init is a vcpu initialization function they are called at
> different times, therefore they are going to have a different starting
> time on the two vcpus.
> So the problem is not the physical cpu time but the virtual cpu time.

OK, but in that case why is the value not just stored in the domain
struct instead of being duplicated in each vcpu?
Stefano Stabellini
2013-Apr-26 14:36 UTC
Re: [PATCH v3 12/13] xen/arm: start the vtimer Xen timers on the processor they should be running on
On Fri, 26 Apr 2013, Ian Campbell wrote:
> On Fri, 2013-04-26 at 15:30 +0100, Stefano Stabellini wrote:
> > On Thu, 25 Apr 2013, Ian Campbell wrote:
> > > On Wed, 2013-04-24 at 20:07 +0100, Stefano Stabellini wrote:
> > > > The Xen physical timer emulator and virtual timer driver use two
> > > > internal Xen timers: initialize them on the processor the vcpu is
> > > > going to be running on, rather than the processor that is creating the
> > > > vcpu.
> > >
> > > But this CPU can change as the vcpu migrates around the pcpus, do we not
> > > need to handle that case too?
> > >
> > > Surely we need some code in virt_timer_restore at least? Possibly a call
> > > to migrate_timer? Although perhaps there is a less expensive way when we
> > > know the timer isn't running?
> >
> > I think that the comment above is misleading: the issue is that they
> > need to be both initialized with the same starting time and because
> > vcpu_vtimer_init is a vcpu initialization function they are called at
> > different times, therefore they are going to have a different starting
> > time on the two vcpus.
> > So the problem is not the physical cpu time but the virtual cpu time.
>
> OK, but in that case why is the value not just stored in the domain
> struct instead of being duplicated in each vcpu?

That's right, in fact I am making that change now.
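A rough sketch of the direction being agreed on here: keep the shared starting values in the domain (initialised once in the domain_vtimer_init Ian asked for) and let each vcpu's timers pick them up. The arch_domain field names follow patch 13/13 but are otherwise illustrative; this is not the committed code.

/* Sketch only: domain-wide base values, set once per domain so every
 * vcpu's phys/virt timer starts from the same point. */
int domain_vtimer_init(struct domain *d)
{
    d->arch.phys_timer_base.offset = NOW();
    d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) +
                                     READ_SYSREG64(CNTVOFF_EL2);
    return 0;
}

int vcpu_vtimer_init(struct vcpu *v)
{
    struct vtimer *t = &v->arch.phys_timer;

    /* Per-vcpu Xen timers still live on the vcpu's processor. */
    init_timer(&t->timer, phys_timer_expired, t, v->processor);
    t->ctl = 0;
    t->offset = v->domain->arch.phys_timer_base.offset;
    t->cval = v->domain->arch.phys_timer_base.offset;
    t->irq = 30;
    t->v = v;

    t = &v->arch.virt_timer;
    init_timer(&t->timer, virt_timer_expired, t, v->processor);
    t->ctl = 0;
    t->offset = v->domain->arch.virt_timer_base.offset;
    t->cval = 0;
    t->irq = 27;
    t->v = v;

    return 0;
}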
Ian Campbell
2013-Apr-26 14:37 UTC
Re: [PATCH v3 13/13] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Fri, 2013-04-26 at 15:30 +0100, Stefano Stabellini wrote:
> > Are you sure this is as simple as this?
> >
> > At the very least I'm surprised that we don't need to consult b->offset
> > anywhere other than at init time, if vcpus are supposed to see a uniform
> > view of time.
> >
> > Or is it the case that what actually needs to happen is that t->offset
> > goes away and b->offset is used everywhere?
>
> That would be another (probably better) way of doing it: the only important
> thing is that the start time on both vcpus is the same.

As it stands, yes, although I can't help thinking that the virt timer ought
to be cleverer, and the offset adjusted based on stolen time or something
like that. Otherwise it isn't really distinct from physical time with holes
in it... But that's a big can-o-worms.

Ian.