Hi all, this patch series implement guest SMP support for ARM, using the ARM PSCI interface for secondary cpu bringup. See each patch for a detailed changelog. Stefano Stabellini (12): xen/arm: basic PSCI support, implement cpu_on and cpu_off xen/arm: allocate secondaries dom0 vcpus xen: move for_each_set_bit to xen/bitops.h xen/arm: support for guest SGI xen/arm: implement arch_vmap_virt_end xen/arm: compile and initialize vmap xen/arm: implement map_domain_page_global and unmap_domain_page_global xen: move VCPUOP_register_vcpu_info to common code xen/arm: support VCPUOP_register_vcpu_info. xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus xen/arm: run the vtimer Xen timers on the pcpu the vcpu is running on xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus xen/arch/arm/Makefile | 1 + xen/arch/arm/domain.c | 18 +++++ xen/arch/arm/domain_build.c | 14 ++++- xen/arch/arm/mm.c | 133 ++++++++++++++++++++++++++++++++++++--- xen/arch/arm/psci.c | 76 ++++++++++++++++++++++ xen/arch/arm/setup.c | 3 + xen/arch/arm/traps.c | 56 ++++++++++++++++- xen/arch/arm/vgic.c | 67 ++++++++++++++++++- xen/arch/arm/vtimer.c | 32 ++++++--- xen/arch/arm/vtimer.h | 1 + xen/arch/x86/domain.c | 113 --------------------------------- xen/arch/x86/i8259.c | 2 +- xen/arch/x86/mpparse.c | 2 +- xen/arch/x86/setup.c | 2 +- xen/arch/x86/traps.c | 2 +- xen/common/domain.c | 111 ++++++++++++++++++++++++++++++++ xen/include/asm-arm/config.h | 4 +- xen/include/asm-arm/domain.h | 24 +++++--- xen/include/asm-arm/gic.h | 9 ++- xen/include/asm-arm/hypercall.h | 2 + xen/include/asm-arm/page.h | 3 + xen/include/asm-arm/processor.h | 1 + xen/include/asm-arm/psci.h | 24 +++++++ xen/include/asm-x86/bitops.h | 11 --- xen/include/asm-x86/domain.h | 3 - xen/include/asm-x86/page.h | 8 --- xen/include/asm-x86/smp.h | 2 +- xen/include/asm-x86/system.h | 2 +- xen/include/public/arch-arm.h | 2 + xen/include/xen/bitops.h | 11 +++ xen/include/xen/domain.h | 3 + xen/include/xen/event.h | 2 +- xen/include/xen/mm.h | 7 ++ xen/include/xen/sched.h | 3 + xen/include/xen/softirq.h | 2 +- xen/xsm/xsm_policy.c | 2 +- 36 files changed, 573 insertions(+), 185 deletions(-) Cheers, Stefano
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 01/12] xen/arm: basic PSCI support, implement cpu_on and cpu_off
Implement support for ARM Power State Coordination Interface, PSCI in short. Support HVC and SMC calls. Changes in v3: - move do_psci_* to psci.c; - trap SMC; - return PSCI error codes; - remove Linux boot procotol from secondary cpus; - implement cpu_off; - use register_t as second parameter for arm_psci_fn_t; - unconditionally reset the guest vcpu context on cpu_on. Changes in v2: - set is_initialised in arch_set_info_guest; - zero vcpu_guest_context before using it. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/Makefile | 1 + xen/arch/arm/domain.c | 2 + xen/arch/arm/domain_build.c | 2 +- xen/arch/arm/psci.c | 76 +++++++++++++++++++++++++++++++++++++++ xen/arch/arm/traps.c | 50 +++++++++++++++++++++++++- xen/include/asm-arm/processor.h | 1 + xen/include/asm-arm/psci.h | 24 ++++++++++++ xen/include/public/arch-arm.h | 2 + 8 files changed, 156 insertions(+), 2 deletions(-) create mode 100644 xen/arch/arm/psci.c create mode 100644 xen/include/asm-arm/psci.h diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index 2106a4f..8f75044 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -5,6 +5,7 @@ subdir-y += platforms obj-y += early_printk.o obj-y += cpu.o obj-y += domain.o +obj-y += psci.o obj-y += domctl.o obj-y += sysctl.o obj-y += domain_build.o diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index eae42af..fee3790 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -561,6 +561,8 @@ int arch_set_info_guest( else set_bit(_VPF_down, &v->pause_flags); + v->is_initialised = 1; + return 0; } diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index a6d8e9d..3d76065 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -418,7 +418,7 @@ int construct_dom0(struct domain *d) regs->pc = (uint32_t)kinfo.entry; - regs->cpsr = PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC; + regs->cpsr = PSR_GUEST_INIT; #ifdef CONFIG_ARM_64 d->arch.type = kinfo.type; diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c new file mode 100644 index 0000000..562ef0b --- /dev/null +++ b/xen/arch/arm/psci.c @@ -0,0 +1,76 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include <xen/errno.h> +#include <xen/sched.h> +#include <xen/types.h> + +#include <asm/current.h> +#include <asm/psci.h> + +int do_psci_cpu_on(uint32_t vcpuid, register_t entry_point) +{ + struct vcpu *v; + struct domain *d = current->domain; + struct vcpu_guest_context *ctxt; + int rc; + + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) + return PSCI_EINVAL; + + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) + return PSCI_EINVAL; + + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) + return PSCI_DENIED; + + memset(ctxt, 0, sizeof(*ctxt)); + ctxt->user_regs.pc64 = (u64) entry_point; + ctxt->sctlr = SCTLR_BASE; + ctxt->ttbr0 = 0; + ctxt->ttbr1 = 0; + ctxt->ttbcr = 0; /* Defined Reset Value */ + ctxt->user_regs.cpsr = PSR_GUEST_INIT; + ctxt->flags = VGCF_online; + + domain_lock(d); + rc = arch_set_info_guest(v, ctxt); + free_vcpu_guest_context(ctxt); + + if ( rc < 0 ) + { + domain_unlock(d); + return PSCI_DENIED; + } + domain_unlock(d); + + vcpu_wake(v); + + return PSCI_SUCCESS; +} + +int do_psci_cpu_off(uint32_t power_state) +{ + struct vcpu *v = current; + if ( !test_and_set_bit(_VPF_down, &v->pause_flags) ) + vcpu_sleep_nosync(v); + return PSCI_SUCCESS; +} + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index da53675..4bd03e0 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -34,6 +34,7 @@ #include <asm/event.h> #include <asm/regs.h> #include <asm/cpregs.h> +#include <asm/psci.h> #include "io.h" #include "vtimer.h" @@ -61,7 +62,7 @@ void __cpuinit init_traps(void) WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2); /* Setup hypervisor traps */ - WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI, HCR_EL2); + WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2); isb(); } @@ -618,6 +619,29 @@ static arm_hypercall_t arm_hypercall_table[] = { HYPERCALL(grant_table_op, 3), }; +#define __PSCI_cpu_suspend 0 +#define __PSCI_cpu_off 1 +#define __PSCI_cpu_on 2 +#define __PSCI_migrate 3 + +typedef int (*arm_psci_fn_t)(uint32_t, register_t); + +typedef struct { + arm_psci_fn_t fn; + int nr_args; +} arm_psci_t; + +#define PSCI(_name, _nr_args) \ + [ __PSCI_ ## _name ] = { \ + .fn = (arm_psci_fn_t) &do_psci_ ## _name, \ + .nr_args = _nr_args, \ + } + +static arm_psci_t arm_psci_table[] = { + PSCI(cpu_off, 1), + PSCI(cpu_on, 2), +}; + static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) { register_t *r; @@ -647,6 +671,25 @@ static void do_debug_trap(struct cpu_user_regs *regs, unsigned int code) } } +static void do_trap_psci(struct cpu_user_regs *regs) +{ + arm_psci_fn_t psci_call = NULL; + + if ( regs->r0 >= ARRAY_SIZE(arm_psci_table) ) + { + domain_crash_synchronous(); + return; + } + + psci_call = arm_psci_table[regs->r0].fn; + if ( psci_call == NULL ) + { + domain_crash_synchronous(); + return; + } + regs->r0 = psci_call(regs->r1, regs->r2); +} + static void do_trap_hypercall(struct cpu_user_regs *regs, unsigned long iss) { arm_hypercall_fn_t call = NULL; @@ -959,8 +1002,13 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs) case HSR_EC_HVC: if ( (hsr.iss & 0xff00) == 0xff00 ) return do_debug_trap(regs, hsr.iss & 0x00ff); + if ( hsr.iss == 0 ) + return do_trap_psci(regs); do_trap_hypercall(regs, hsr.iss); break; + case HSR_EC_SMC: + do_trap_psci(regs); + break; case HSR_EC_DATA_ABORT_GUEST: do_trap_data_abort_guest(regs, hsr.dabt); break; diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h index 1681ebf..72f7f6b 100644 --- a/xen/include/asm-arm/processor.h +++ b/xen/include/asm-arm/processor.h @@ -85,6 +85,7 @@ #define HSR_EC_CP14_64 0x0c #define HSR_EC_SVC 0x11 #define HSR_EC_HVC 0x12 +#define HSR_EC_SMC 0x13 #define HSR_EC_INSTR_ABORT_GUEST 0x20 #define HSR_EC_INSTR_ABORT_HYP 0x21 #define HSR_EC_DATA_ABORT_GUEST 0x24 diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h new file mode 100644 index 0000000..8511eb1 --- /dev/null +++ b/xen/include/asm-arm/psci.h @@ -0,0 +1,24 @@ +#ifndef __ASM_PSCI_H__ +#define __ASM_PSCI_H__ + +#define PSCI_SUCCESS 0 +#define PSCI_ENOSYS -1 +#define PSCI_EINVAL -2 +#define PSCI_DENIED -3 + +int do_psci_cpu_on(uint32_t vcpuid, uint32_t entry_point); +int do_psci_cpu_off(uint32_t power_state); +int do_psci_cpu_suspend(uint32_t power_state, uint32_t entry_point); +int do_psci_migrate(uint32_t vcpuid); + +#endif /* __ASM_PSCI_H__ */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h index 746df8e..292cc30 100644 --- a/xen/include/public/arch-arm.h +++ b/xen/include/public/arch-arm.h @@ -236,6 +236,8 @@ typedef uint64_t xen_callback_t; #define PSR_BIG_ENDIAN (1<<9) /* Big Endian Mode */ #define PSR_JAZELLE (1<<24) /* Jazelle Mode */ +#define PSR_GUEST_INIT (PSR_ABT_MASK|PSR_FIQ_MASK|PSR_IRQ_MASK|PSR_MODE_SVC) + #endif /* __XEN_PUBLIC_ARCH_ARM_H__ */ /* -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 02/12] xen/arm: allocate secondaries dom0 vcpus
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v4: - check for alloc_vcpu errors. --- xen/arch/arm/domain_build.c | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 3d76065..9b50c3c 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -363,7 +363,7 @@ static void dtb_load(struct kernel_info *kinfo) int construct_dom0(struct domain *d) { struct kernel_info kinfo = {}; - int rc; + int rc, i, cpu; struct vcpu *v = d->vcpu[0]; struct cpu_user_regs *regs = &v->arch.cpu_info->guest_cpu_user_regs; @@ -451,6 +451,16 @@ int construct_dom0(struct domain *d) } #endif + for ( i = 1, cpu = 0; i < d->max_vcpus; i++ ) + { + cpu = cpumask_cycle(cpu, &cpu_online_map); + if ( alloc_vcpu(d, i, cpu) == NULL ) + { + printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu); + break; + } + } + local_abort_enable(); return 0; -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 03/12] xen: move for_each_set_bit to xen/bitops.h
Move for_each_set_bit from asm-x86/bitops.h to xen/bitops.h. Replace #include <asm/bitops.h> with #include <xen/bitops.h> everywhere. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: keir@xen.org CC: JBeulich@suse.com --- xen/arch/x86/i8259.c | 2 +- xen/arch/x86/mpparse.c | 2 +- xen/arch/x86/setup.c | 2 +- xen/arch/x86/traps.c | 2 +- xen/include/asm-x86/bitops.h | 11 ----------- xen/include/asm-x86/smp.h | 2 +- xen/include/asm-x86/system.h | 2 +- xen/include/xen/bitops.h | 11 +++++++++++ xen/include/xen/event.h | 2 +- xen/include/xen/softirq.h | 2 +- xen/xsm/xsm_policy.c | 2 +- 11 files changed, 20 insertions(+), 20 deletions(-) diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c index b537c5f..6fdcce8 100644 --- a/xen/arch/x86/i8259.c +++ b/xen/arch/x86/i8259.c @@ -16,7 +16,7 @@ #include <asm/system.h> #include <asm/io.h> #include <asm/desc.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <xen/delay.h> #include <asm/apic.h> #include <asm/asm_defns.h> diff --git a/xen/arch/x86/mpparse.c b/xen/arch/x86/mpparse.c index 97ab5d3..97d34bc 100644 --- a/xen/arch/x86/mpparse.c +++ b/xen/arch/x86/mpparse.c @@ -22,7 +22,7 @@ #include <xen/efi.h> #include <xen/sched.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/smp.h> #include <asm/acpi.h> #include <asm/mtrr.h> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index 43301a5..744335d 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -29,7 +29,7 @@ #include <public/version.h> #include <compat/platform.h> #include <compat/xen.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/smp.h> #include <asm/processor.h> #include <asm/mpspec.h> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index d36eddd..5200940 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -52,7 +52,7 @@ #include <asm/system.h> #include <asm/io.h> #include <asm/atomic.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/desc.h> #include <asm/debugreg.h> #include <asm/smp.h> diff --git a/xen/include/asm-x86/bitops.h b/xen/include/asm-x86/bitops.h index c3cbd26..f5a84ef 100644 --- a/xen/include/asm-x86/bitops.h +++ b/xen/include/asm-x86/bitops.h @@ -368,17 +368,6 @@ static inline unsigned int __scanbit(unsigned long val, unsigned long max) __find_next_zero_bit(addr,size,off))) /** - * for_each_set_bit - iterate over every set bit in a memory region - * @bit: The integer iterator - * @addr: The address to base the search on - * @size: The maximum size to search - */ -#define for_each_set_bit(bit, addr, size) \ - for ( (bit) = find_first_bit(addr, size); \ - (bit) < (size); \ - (bit) = find_next_bit(addr, size, (bit) + 1) ) - -/** * find_first_set_bit - find the first set bit in @word * @word: the word to search * diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h index 301f8c7..81f8610 100644 --- a/xen/include/asm-x86/smp.h +++ b/xen/include/asm-x86/smp.h @@ -12,7 +12,7 @@ #endif #ifndef __ASSEMBLY__ -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/mpspec.h> #endif diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index b0876d6..6ab7d56 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -2,7 +2,7 @@ #define __ASM_SYSTEM_H #include <xen/lib.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #define read_segment_register(name) \ ({ u16 __sel; \ diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h index c6a78b6..6054155 100644 --- a/xen/include/xen/bitops.h +++ b/xen/include/xen/bitops.h @@ -182,4 +182,15 @@ static inline __u32 ror32(__u32 word, unsigned int shift) #define __L16(_x) (((_x) & 0x0000ff00) ? ( 8 + __L8( (_x)>> 8)) : __L8( _x)) #define LOG_2(_x) (((_x) & 0xffff0000) ? (16 + __L16((_x)>>16)) : __L16(_x)) +/** + * for_each_set_bit - iterate over every set bit in a memory region + * @bit: The integer iterator + * @addr: The address to base the search on + * @size: The maximum size to search + */ +#define for_each_set_bit(bit, addr, size) \ + for ( (bit) = find_first_bit(addr, size); \ + (bit) < (size); \ + (bit) = find_next_bit(addr, size, (bit) + 1) ) + #endif diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 71c3e92..4ac39ad 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -12,7 +12,7 @@ #include <xen/sched.h> #include <xen/smp.h> #include <xen/softirq.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/event.h> /* diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h index 193351d..0c0d481 100644 --- a/xen/include/xen/softirq.h +++ b/xen/include/xen/softirq.h @@ -13,7 +13,7 @@ enum { #include <xen/lib.h> #include <xen/smp.h> -#include <asm/bitops.h> +#include <xen/bitops.h> #include <asm/current.h> #include <asm/hardirq.h> #include <asm/softirq.h> diff --git a/xen/xsm/xsm_policy.c b/xen/xsm/xsm_policy.c index 65be61d..cad7964 100644 --- a/xen/xsm/xsm_policy.c +++ b/xen/xsm/xsm_policy.c @@ -20,7 +20,7 @@ #include <xsm/xsm.h> #include <xen/multiboot.h> -#include <asm/bitops.h> +#include <xen/bitops.h> char *__initdata policy_buffer = NULL; u32 __initdata policy_size = 0; -- 1.7.2.5
Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right guest vcpu. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v4: - move the code to a separate function; - use gdprintk for debugging output; - make use of PRIregister; - replace the cpumask with a bitmask; - move the virtual_irq check outside the loop; - ignore non-existent target vcpus. Changes in v3: - make use of cpumask_from_bitmap. --- xen/arch/arm/vgic.c | 63 ++++++++++++++++++++++++++++++++++++++++++--- xen/include/asm-arm/gic.h | 9 ++++-- 2 files changed, 65 insertions(+), 7 deletions(-) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index b30da78..8d87609 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -17,6 +17,7 @@ * GNU General Public License for more details. */ +#include <xen/bitops.h> #include <xen/config.h> #include <xen/lib.h> #include <xen/init.h> @@ -368,6 +369,61 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) } } +static int vgic_to_sgi(struct vcpu *v, register_t sgir) +{ + struct domain *d = v->domain; + int virtual_irq; + int filter; + int vcpuid; + struct vcpu *vt; + int i; + unsigned long vcpu_mask = 0; + + ASSERT(d->max_vcpus < 8*sizeof(vcpu_mask)); + + filter = (sgir & GICD_SGI_TARGET_LIST_MASK); + virtual_irq = (sgir & GICD_SGI_INTID_MASK); + + switch ( filter ) + { + case GICD_SGI_TARGET_LIST: + vcpu_mask = (sgir & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; + break; + case GICD_SGI_TARGET_OTHERS: + for ( i = 0; i < d->max_vcpus; i++ ) + { + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) + set_bit(i, &vcpu_mask); + } + case GICD_SGI_TARGET_SELF: + set_bit(current->vcpu_id, &vcpu_mask); + break; + default: + gdprintk(XENLOG_WARNING, "vGICD: unhandled GICD_SGIR write %"PRIregister" with wrong TargetListFilter field\n", + sgir); + return 0; + } + if ( virtual_irq >= 16 ) + { + gdprintk(XENLOG_WARNING, "vGICD: try to send SGI %d that is >= 16\n", + virtual_irq); + return 0; + } + + + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) + { + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) + { + gdprintk(XENLOG_WARNING, "vGICD: GICD_SGIR write r=%"PRIregister" vcpu_mask=%lx, wrong CPUTargetList\n", + sgir, vcpu_mask); + continue; + } + vgic_vcpu_inject_irq(vt, virtual_irq, 1); + } + return 1; +} + static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) { struct hsr_dabt dabt = info->dabt; @@ -498,10 +554,9 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) goto write_ignore; case GICD_SGIR: - if ( dabt.size != 2 ) goto bad_width; - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", - *r, gicd_reg - GICD_ICFGR); - return 0; + if ( dabt.size != 2 ) + goto bad_width; + return vgic_to_sgi(v, *r); case GICD_CPENDSGIR ... GICD_CPENDSGIRN: if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h index 92711d5..061ebf4 100644 --- a/xen/include/asm-arm/gic.h +++ b/xen/include/asm-arm/gic.h @@ -51,12 +51,15 @@ #define GICD_SPENDSGIRN (0xF2C/4) #define GICD_ICPIDR2 (0xFE8/4) -#define GICD_SGI_TARGET_LIST (0UL<<24) -#define GICD_SGI_TARGET_OTHERS (1UL<<24) -#define GICD_SGI_TARGET_SELF (2UL<<24) +#define GICD_SGI_TARGET_LIST_SHIFT (24) +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT) +#define GICD_SGI_TARGET_LIST (0UL<<GICD_SGI_TARGET_LIST_SHIFT) +#define GICD_SGI_TARGET_OTHERS (1UL<<GICD_SGI_TARGET_LIST_SHIFT) +#define GICD_SGI_TARGET_SELF (2UL<<GICD_SGI_TARGET_LIST_SHIFT) #define GICD_SGI_TARGET_SHIFT (16) #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) #define GICD_SGI_GROUP1 (1UL<<15) +#define GICD_SGI_INTID_MASK (0xFUL) #define GICC_CTLR (0x0000/4) #define GICC_PMR (0x0004/4) -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 05/12] xen/arm: implement arch_vmap_virt_end
Move virt_start out of ioremap and rename it to early_vmap_start. Implement arch_vmap_virt_end by returning early_vmap_start. Allocate virtual addresses in early_ioremap from top to bottom so that later on when we initialize vmap, we can return the end of the vmap address space (the last address allocated by early_ioremap). Changes in v4: - merge with "early_ioremap: allocate virtual addresses from top to bottom" patch; - better commit message; - declare early_vmap_start __initdata; - use "& ~SECOND_MASK" instead of shifts. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/mm.c | 25 +++++++++++++++++-------- 1 files changed, 17 insertions(+), 8 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index ba3140d..ceca674 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -431,31 +431,40 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) * start has to be 2MB aligned. * len has to be < EARLY_VMAP_VIRT_END - EARLY_VMAP_VIRT_START. */ +static __initdata unsigned long early_vmap_start = EARLY_VMAP_VIRT_END; void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) { - static unsigned long virt_start = EARLY_VMAP_VIRT_START; - unsigned long ret_addr = virt_start; paddr_t end = start + len; + unsigned long map_start; + + len = (len + SECOND_SIZE - 1) & ~SECOND_MASK; + early_vmap_start -= len; ASSERT(!(start & (~SECOND_MASK))); - ASSERT(!(virt_start & (~SECOND_MASK))); + ASSERT(!(early_vmap_start & (~SECOND_MASK))); /* The range we need to map is too big */ - if ( virt_start + len >= EARLY_VMAP_VIRT_END ) + if ( early_vmap_start >= EARLY_VMAP_VIRT_START ) return NULL; + map_start = early_vmap_start; while ( start < end ) { lpae_t e = mfn_to_xen_entry(start >> PAGE_SHIFT); e.pt.ai = attributes; - write_pte(xen_second + second_table_offset(virt_start), e); + write_pte(xen_second + second_table_offset(map_start), e); start += SECOND_SIZE; - virt_start += SECOND_SIZE; + map_start += SECOND_SIZE; } - flush_xen_data_tlb_range_va(ret_addr, len); + flush_xen_data_tlb_range_va(early_vmap_start, len); + + return (void*)early_vmap_start; +} - return (void*)ret_addr; +void *__init arch_vmap_virt_end(void) +{ + return (void *)early_vmap_start; } enum mg { mg_clear, mg_ro, mg_rw, mg_rx }; -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 06/12] xen/arm: compile and initialize vmap
Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to VMAP_VIRT_END and VMAP_VIRT_START. Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM, because we only support 4K pages so as a matter of fact it is always set). Implement map_pages_to_xen and destroy_xen_mappings. Call vm_init from start_xen. Changes in v4: - remove flush_tlb_local() from create_xen_entries; - return error if a mapping is already present. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- xen/arch/arm/mm.c | 101 ++++++++++++++++++++++++++++++++++++++++- xen/arch/arm/setup.c | 3 + xen/include/asm-arm/config.h | 4 +- xen/include/asm-arm/page.h | 3 + xen/include/asm-x86/page.h | 8 --- xen/include/xen/mm.h | 7 +++ 6 files changed, 113 insertions(+), 13 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index ceca674..3c0b0f0 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -33,6 +33,7 @@ #include <xen/err.h> #include <asm/page.h> #include <asm/current.h> +#include <asm/flushtlb.h> #include <public/memory.h> #include <xen/sched.h> #include <xsm/xsm.h> @@ -429,9 +430,9 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe) /* Map the physical memory range start - start + len into virtual * memory and return the virtual address of the mapping. * start has to be 2MB aligned. - * len has to be < EARLY_VMAP_VIRT_END - EARLY_VMAP_VIRT_START. + * len has to be < VMAP_VIRT_END - VMAP_VIRT_START. */ -static __initdata unsigned long early_vmap_start = EARLY_VMAP_VIRT_END; +static __initdata unsigned long early_vmap_start = VMAP_VIRT_END; void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) { paddr_t end = start + len; @@ -444,7 +445,7 @@ void* __init early_ioremap(paddr_t start, size_t len, unsigned attributes) ASSERT(!(early_vmap_start & (~SECOND_MASK))); /* The range we need to map is too big */ - if ( early_vmap_start >= EARLY_VMAP_VIRT_START ) + if ( early_vmap_start >= VMAP_VIRT_START ) return NULL; map_start = early_vmap_start; @@ -467,6 +468,100 @@ void *__init arch_vmap_virt_end(void) return (void *)early_vmap_start; } +static int create_xen_table(lpae_t *entry) +{ + void *p; + lpae_t pte; + + p = alloc_xenheap_pages(0, 0); + if ( p == NULL ) + return -ENOMEM; + clear_page(p); + pte = mfn_to_xen_entry(virt_to_mfn(p)); + pte.pt.table = 1; + write_pte(entry, pte); + return 0; +} + +enum xenmap_operation { + INSERT, + REMOVE +}; + +static int create_xen_entries(enum xenmap_operation op, + unsigned long virt, + unsigned long mfn, + unsigned long nr_mfns) +{ + int rc; + unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE; + lpae_t pte; + lpae_t *third = NULL; + + for(; addr < addr_end; addr += PAGE_SIZE, mfn++) + { + if ( !xen_second[second_linear_offset(addr)].pt.valid || + !xen_second[second_linear_offset(addr)].pt.table ) + { + rc = create_xen_table(&xen_second[second_linear_offset(addr)]); + if ( rc < 0 ) { + printk("create_xen_entries: L2 failed\n"); + goto out; + } + } + + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); + + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base + << PAGE_SHIFT); + + switch ( op ) { + case INSERT: + if ( third[third_table_offset(addr)].pt.valid ) + { + printk("create_xen_entries: trying to replace an existing mapping addr=%lx mfn=%lx\n", + addr, mfn); + return -EINVAL; + } + pte = mfn_to_xen_entry(mfn); + pte.pt.table = 1; + write_pte(&third[third_table_offset(addr)], pte); + break; + case REMOVE: + if ( !third[third_table_offset(addr)].pt.valid ) + { + printk("create_xen_entries: trying to remove a non-existing mapping addr=%lx\n", + addr); + return -EINVAL; + } + pte.bits = 0; + write_pte(&third[third_table_offset(addr)], pte); + break; + default: + BUG(); + } + } + flush_xen_data_tlb_range_va(virt, PAGE_SIZE * nr_mfns); + + rc = 0; + +out: + return rc; +} + +int map_pages_to_xen(unsigned long virt, + unsigned long mfn, + unsigned long nr_mfns, + unsigned int flags) +{ + ASSERT(flags == PAGE_HYPERVISOR); + return create_xen_entries(INSERT, virt, mfn, nr_mfns); +} +void destroy_xen_mappings(unsigned long v, unsigned long e) +{ + create_xen_entries(REMOVE, v, 0, (e - v) >> PAGE_SHIFT); +} + enum mg { mg_clear, mg_ro, mg_rw, mg_rx }; static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg) { diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index cfe3d94..59646d6 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -34,6 +34,7 @@ #include <xen/keyhandler.h> #include <xen/cpu.h> #include <xen/pfn.h> +#include <xen/vmap.h> #include <asm/page.h> #include <asm/current.h> #include <asm/setup.h> @@ -480,6 +481,8 @@ void __init start_xen(unsigned long boot_phys_offset, console_init_postirq(); + vm_init(); + do_presmp_initcalls(); for_each_present_cpu ( i ) diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h index 8be8563..1f87acc 100644 --- a/xen/include/asm-arm/config.h +++ b/xen/include/asm-arm/config.h @@ -95,11 +95,11 @@ #define FIXMAP_ADDR(n) (mk_unsigned_long(0x00400000) + (n) * PAGE_SIZE) #define BOOT_MISC_VIRT_START mk_unsigned_long(0x00600000) #define FRAMETABLE_VIRT_START mk_unsigned_long(0x02000000) -#define EARLY_VMAP_VIRT_START mk_unsigned_long(0x10000000) +#define VMAP_VIRT_START mk_unsigned_long(0x10000000) #define XENHEAP_VIRT_START mk_unsigned_long(0x40000000) #define DOMHEAP_VIRT_START mk_unsigned_long(0x80000000) -#define EARLY_VMAP_VIRT_END XENHEAP_VIRT_START +#define VMAP_VIRT_END XENHEAP_VIRT_START #define HYPERVISOR_VIRT_START XEN_VIRT_START #define DOMHEAP_ENTRIES 1024 /* 1024 2MB mapping slots */ diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h index 14e63eb..5287a92 100644 --- a/xen/include/asm-arm/page.h +++ b/xen/include/asm-arm/page.h @@ -58,6 +58,9 @@ #define DEV_WC BUFFERABLE #define DEV_CACHED WRITEBACK +#define PAGE_HYPERVISOR (MATTR_MEM) +#define MAP_SMALL_PAGES PAGE_HYPERVISOR + /* * Stage 2 Memory Type. * diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h index b2f3859..e53e1e5 100644 --- a/xen/include/asm-x86/page.h +++ b/xen/include/asm-x86/page.h @@ -338,14 +338,6 @@ l3_pgentry_t *virt_to_xen_l3e(unsigned long v); extern void set_pdx_range(unsigned long smfn, unsigned long emfn); -/* Map machine page range in Xen virtual address space. */ -int map_pages_to_xen( - unsigned long virt, - unsigned long mfn, - unsigned long nr_mfns, - unsigned int flags); -void destroy_xen_mappings(unsigned long v, unsigned long e); - /* Convert between PAT/PCD/PWT embedded in PTE flags and 3-bit cacheattr. */ static inline uint32_t pte_flags_to_cacheattr(uint32_t flags) { diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 28512fb..efc45c7 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -48,6 +48,13 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags); void free_xenheap_pages(void *v, unsigned int order); #define alloc_xenheap_page() (alloc_xenheap_pages(0,0)) #define free_xenheap_page(v) (free_xenheap_pages(v,0)) +/* Map machine page range in Xen virtual address space. */ +int map_pages_to_xen( + unsigned long virt, + unsigned long mfn, + unsigned long nr_mfns, + unsigned int flags); +void destroy_xen_mappings(unsigned long v, unsigned long e); /* Claim handling */ unsigned long domain_adjust_tot_pages(struct domain *d, long pages); -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 07/12] xen/arm: implement map_domain_page_global and unmap_domain_page_global
The implementation uses vmap and vunmap. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Changes in v3: - use vmap/vunmap to implement un/map_domain_page_global. --- xen/arch/arm/mm.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 3c0b0f0..e11d393 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -36,6 +36,7 @@ #include <asm/flushtlb.h> #include <public/memory.h> #include <xen/sched.h> +#include <xen/vmap.h> #include <xsm/xsm.h> struct domain *dom_xen, *dom_io, *dom_cow; @@ -131,6 +132,16 @@ void clear_fixmap(unsigned map) flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE); } +void *map_domain_page_global(unsigned long mfn) +{ + return vmap(&mfn, 1); +} + +void unmap_domain_page_global(const void *va) +{ + vunmap(va); +} + /* Map a page of domheap memory */ void *map_domain_page(unsigned long mfn) { -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 08/12] xen: move VCPUOP_register_vcpu_info to common code
Move the implementation of VCPUOP_register_vcpu_info from x86 specific to commmon code. Move vcpu_info_mfn from an arch specific vcpu sub-field to the common vcpu struct. Move the initialization of vcpu_info_mfn to common code. Move unmap_vcpu_info and the call to unmap_vcpu_info at domain destruction time to common code. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: keir@xen.org CC: JBeulich@suse.com --- xen/arch/x86/domain.c | 113 ------------------------------------------ xen/common/domain.c | 111 +++++++++++++++++++++++++++++++++++++++++ xen/include/asm-x86/domain.h | 3 - xen/include/xen/domain.h | 3 + xen/include/xen/sched.h | 3 + 5 files changed, 117 insertions(+), 116 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 14b6d13..d1b6c64 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -387,8 +387,6 @@ int vcpu_initialise(struct vcpu *v) vmce_init_vcpu(v); - v->arch.vcpu_info_mfn = INVALID_MFN; - if ( is_hvm_domain(d) ) { rc = hvm_vcpu_initialise(v); @@ -954,99 +952,6 @@ void arch_vcpu_reset(struct vcpu *v) } } -/* - * Unmap the vcpu info page if the guest decided to place it somewhere - * else. This is only used from arch_domain_destroy, so there''s no - * need to do anything clever. - */ -static void -unmap_vcpu_info(struct vcpu *v) -{ - unsigned long mfn; - - if ( v->arch.vcpu_info_mfn == INVALID_MFN ) - return; - - mfn = v->arch.vcpu_info_mfn; - unmap_domain_page_global(v->vcpu_info); - - v->vcpu_info = &dummy_vcpu_info; - v->arch.vcpu_info_mfn = INVALID_MFN; - - put_page_and_type(mfn_to_page(mfn)); -} - -/* - * Map a guest page in and point the vcpu_info pointer at it. This - * makes sure that the vcpu_info is always pointing at a valid piece - * of memory, and it sets a pending event to make sure that a pending - * event doesn''t get missed. - */ -static int -map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) -{ - struct domain *d = v->domain; - void *mapping; - vcpu_info_t *new_info; - struct page_info *page; - int i; - - if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) - return -EINVAL; - - if ( v->arch.vcpu_info_mfn != INVALID_MFN ) - return -EINVAL; - - /* Run this command on yourself or on other offline VCPUS. */ - if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) - return -EINVAL; - - page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); - if ( !page ) - return -EINVAL; - - if ( !get_page_type(page, PGT_writable_page) ) - { - put_page(page); - return -EINVAL; - } - - mapping = __map_domain_page_global(page); - if ( mapping == NULL ) - { - put_page_and_type(page); - return -ENOMEM; - } - - new_info = (vcpu_info_t *)(mapping + offset); - - if ( v->vcpu_info == &dummy_vcpu_info ) - { - memset(new_info, 0, sizeof(*new_info)); - __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; - } - else - { - memcpy(new_info, v->vcpu_info, sizeof(*new_info)); - } - - v->vcpu_info = new_info; - v->arch.vcpu_info_mfn = page_to_mfn(page); - - /* Set new vcpu_info pointer /before/ setting pending flags. */ - wmb(); - - /* - * Mark everything as being pending just to make sure nothing gets - * lost. The domain will get a spurious event, but it can cope. - */ - vcpu_info(v, evtchn_upcall_pending) = 1; - for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) - set_bit(i, &vcpu_info(v, evtchn_pending_sel)); - - return 0; -} - long arch_do_vcpu_op( int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) @@ -1083,22 +988,6 @@ arch_do_vcpu_op( break; } - case VCPUOP_register_vcpu_info: - { - struct domain *d = v->domain; - struct vcpu_register_vcpu_info info; - - rc = -EFAULT; - if ( copy_from_guest(&info, arg, 1) ) - break; - - domain_lock(d); - rc = map_vcpu_info(v, info.mfn, info.offset); - domain_unlock(d); - - break; - } - /* * XXX Disable for 4.0.0: __update_vcpu_system_time() writes to the given * virtual address even when running in another domain''s address space. @@ -2025,8 +1914,6 @@ int domain_relinquish_resources(struct domain *d) * mappings. */ destroy_gdt(v); - - unmap_vcpu_info(v); } if ( d->arch.pv_domain.pirq_eoi_map != NULL ) diff --git a/xen/common/domain.c b/xen/common/domain.c index ce45d66..d21909f 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -33,6 +33,7 @@ #include <xen/xenoprof.h> #include <xen/irq.h> #include <asm/debugger.h> +#include <asm/p2m.h> #include <asm/processor.h> #include <public/sched.h> #include <public/sysctl.h> @@ -142,6 +143,7 @@ struct vcpu *alloc_vcpu( v->vcpu_info = ((vcpu_id < XEN_LEGACY_MAX_VCPUS) ? (vcpu_info_t *)&shared_info(d, vcpu_info[vcpu_id]) : &dummy_vcpu_info); + v->vcpu_info_mfn = INVALID_MFN; init_waitqueue_vcpu(v); } @@ -547,6 +549,7 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d) int domain_kill(struct domain *d) { int rc = 0; + struct vcpu *v; if ( d == current->domain ) return -EINVAL; @@ -571,6 +574,8 @@ int domain_kill(struct domain *d) BUG_ON(rc != -EAGAIN); break; } + for_each_vcpu ( d, v ) + unmap_vcpu_info(v); d->is_dying = DOMDYING_dead; /* Mem event cleanup has to go here because the rings * have to be put before we call put_domain. */ @@ -896,6 +901,96 @@ void vcpu_reset(struct vcpu *v) vcpu_unpause(v); } +/* + * Map a guest page in and point the vcpu_info pointer at it. This + * makes sure that the vcpu_info is always pointing at a valid piece + * of memory, and it sets a pending event to make sure that a pending + * event doesn''t get missed. + */ +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) +{ + struct domain *d = v->domain; + void *mapping; + vcpu_info_t *new_info; + struct page_info *page; + int i; + + if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) + return -EINVAL; + + if ( v->vcpu_info_mfn != INVALID_MFN ) + return -EINVAL; + + /* Run this command on yourself or on other offline VCPUS. */ + if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) ) + return -EINVAL; + + page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); + if ( !page ) + return -EINVAL; + + if ( !get_page_type(page, PGT_writable_page) ) + { + put_page(page); + return -EINVAL; + } + + mapping = __map_domain_page_global(page); + if ( mapping == NULL ) + { + put_page_and_type(page); + return -ENOMEM; + } + + new_info = (vcpu_info_t *)(mapping + offset); + + if ( v->vcpu_info == &dummy_vcpu_info ) + { + memset(new_info, 0, sizeof(*new_info)); + __vcpu_info(v, new_info, evtchn_upcall_mask) = 1; + } + else + { + memcpy(new_info, v->vcpu_info, sizeof(*new_info)); + } + + v->vcpu_info = new_info; + v->vcpu_info_mfn = page_to_mfn(page); + + /* Set new vcpu_info pointer /before/ setting pending flags. */ + wmb(); + + /* + * Mark everything as being pending just to make sure nothing gets + * lost. The domain will get a spurious event, but it can cope. + */ + vcpu_info(v, evtchn_upcall_pending) = 1; + for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) + set_bit(i, &vcpu_info(v, evtchn_pending_sel)); + + return 0; +} + +/* + * Unmap the vcpu info page if the guest decided to place it somewhere + * else. This is only used from arch_domain_destroy, so there''s no + * need to do anything clever. + */ +void unmap_vcpu_info(struct vcpu *v) +{ + unsigned long mfn; + + if ( v->vcpu_info_mfn == INVALID_MFN ) + return; + + mfn = v->vcpu_info_mfn; + unmap_domain_page_global(v->vcpu_info); + + v->vcpu_info = &dummy_vcpu_info; + v->vcpu_info_mfn = INVALID_MFN; + + put_page_and_type(mfn_to_page(mfn)); +} long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) { @@ -1015,6 +1110,22 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) break; + case VCPUOP_register_vcpu_info: + { + struct domain *d = v->domain; + struct vcpu_register_vcpu_info info; + + rc = -EFAULT; + if ( copy_from_guest(&info, arg, 1) ) + break; + + domain_lock(d); + rc = map_vcpu_info(v, info.mfn, info.offset); + domain_unlock(d); + + break; + } + #ifdef VCPU_TRAP_NMI case VCPUOP_send_nmi: if ( !guest_handle_is_null(arg) ) diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index bdaf714..e7ed7c3 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -435,9 +435,6 @@ struct arch_vcpu struct paging_vcpu paging; - /* Guest-specified relocation of vcpu_info. */ - unsigned long vcpu_info_mfn; - uint32_t gdbsx_vcpu_event; /* A secondary copy of the vcpu time info. */ diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h index d4ac50f..d0263a9 100644 --- a/xen/include/xen/domain.h +++ b/xen/include/xen/domain.h @@ -52,6 +52,9 @@ void free_pirq_struct(void *); int vcpu_initialise(struct vcpu *v); void vcpu_destroy(struct vcpu *v); +int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset); +void unmap_vcpu_info(struct vcpu *v); + int arch_domain_create(struct domain *d, unsigned int domcr_flags); void arch_domain_destroy(struct domain *d); diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index beadc42..7e129d9 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -192,6 +192,9 @@ struct vcpu struct waitqueue_vcpu *waitqueue_vcpu; + /* Guest-specified relocation of vcpu_info. */ + unsigned long vcpu_info_mfn; + struct arch_vcpu arch; }; -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 09/12] xen/arm: support VCPUOP_register_vcpu_info.
We don''t want to support the full vcpu_op hypercall interface, just VCPUOP_register_vcpu_info: introduce an internal ARM-only do_arm_vcpu_op function to filter out the vcpu_op hypercalls that we don''t want to support. Call do_arm_vcpu_op instead of do_vcpu_op from traps.c. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: keir@xen.org CC: JBeulich@suse.com Changes in v4: - introduce an HYPERCALL_ARM macro and use it for do_arm_vcpu_op. Changes in v3: - do not export all the vcpu_op hypercalls to ARM guests, only VCPUOP_register_vcpu_info. --- xen/arch/arm/domain.c | 13 +++++++++++++ xen/arch/arm/traps.c | 6 ++++++ xen/include/asm-arm/hypercall.h | 2 ++ 3 files changed, 21 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index fee3790..f7ec979 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -10,6 +10,7 @@ * GNU General Public License for more details. */ #include <xen/config.h> +#include <xen/hypercall.h> #include <xen/init.h> #include <xen/lib.h> #include <xen/sched.h> @@ -628,6 +629,18 @@ void arch_dump_domain_info(struct domain *d) } } + +long do_arm_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) +{ + switch ( cmd ) + { + case VCPUOP_register_vcpu_info: + return do_vcpu_op(cmd, vcpuid, arg); + default: + return -EINVAL; + } +} + long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg) { return -ENOSYS; diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index 4bd03e0..25f87cd 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -605,6 +605,11 @@ typedef struct { .nr_args = _nr_args, \ } +#define HYPERCALL_ARM(_name, _nr_args) \ + [ __HYPERVISOR_ ## _name ] = { \ + .fn = (arm_hypercall_fn_t) &do_arm_ ## _name, \ + .nr_args = _nr_args, \ + } static arm_hypercall_t arm_hypercall_table[] = { HYPERCALL(memory_op, 2), HYPERCALL(domctl, 1), @@ -617,6 +622,7 @@ static arm_hypercall_t arm_hypercall_table[] = { HYPERCALL(sysctl, 2), HYPERCALL(hvm_op, 2), HYPERCALL(grant_table_op, 3), + HYPERCALL_ARM(vcpu_op, 3), }; #define __PSCI_cpu_suspend 0 diff --git a/xen/include/asm-arm/hypercall.h b/xen/include/asm-arm/hypercall.h index 0833ec4..3327a96 100644 --- a/xen/include/asm-arm/hypercall.h +++ b/xen/include/asm-arm/hypercall.h @@ -4,6 +4,8 @@ #include <public/domctl.h> /* for arch_do_domctl */ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg); +long do_arm_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg); + #endif /* __ASM_ARM_HYPERCALL_H__ */ /* * Local variables: -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 10/12] xen/arm: send IPIs to inject irqs into guest vcpus running on different pcpus
If we need to inject an irq into a VCPU that is running on a different processor, we shouldn''t just enqueue the irq into the lr_pending and inflight lists and wait for something to interrupt the guest execution. Send an IPI to the target pcpu so that Xen can inject the new interrupt returning to guest. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Changes in v4: - check for is_running before vcpu_unblock; - use smp_send_event_check_cpu instead of smp_send_event_check_mask. --- xen/arch/arm/vgic.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 8d87609..eb97cd1 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -638,6 +638,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq, int virtual) struct vgic_irq_rank *rank = vgic_irq_rank(v, 8, idx); struct pending_irq *iter, *n = irq_to_pending(v, irq); unsigned long flags; + bool_t running; spin_lock_irqsave(&v->arch.vgic.lock, flags); @@ -673,7 +674,10 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq, int virtual) out: spin_unlock_irqrestore(&v->arch.vgic.lock, flags); /* we have a new higher priority irq, inject it into the guest */ + running = v->is_running; vcpu_unblock(v); + if ( running && v != current ) + smp_send_event_check_mask(cpumask_of(v->processor)); } /* -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 11/12] xen/arm: run the vtimer Xen timers on the pcpu the vcpu is running on
The Xen physical timer emulator and virtual timer driver use two internal Xen timers: initialize them on the pcpu the vcpu is running on, rather than the processor that it''s creating the vcpu. On vcpu restore migrate the phys_timer and the virt_timer to the pcpu the vcpu is running on. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v4: - migrate Xen timers on virt_timer_restore. --- xen/arch/arm/vtimer.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index 1cb365e..393aac3 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -48,7 +48,7 @@ int vcpu_vtimer_init(struct vcpu *v) { struct vtimer *t = &v->arch.phys_timer; - init_timer(&t->timer, phys_timer_expired, t, smp_processor_id()); + init_timer(&t->timer, phys_timer_expired, t, v->processor); t->ctl = 0; t->offset = NOW(); t->cval = NOW(); @@ -56,7 +56,7 @@ int vcpu_vtimer_init(struct vcpu *v) t->v = v; t = &v->arch.virt_timer; - init_timer(&t->timer, virt_timer_expired, t, smp_processor_id()); + init_timer(&t->timer, virt_timer_expired, t, v->processor); t->ctl = 0; t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); t->cval = 0; @@ -95,6 +95,8 @@ int virt_timer_restore(struct vcpu *v) return 0; stop_timer(&v->arch.virt_timer.timer); + migrate_timer(&v->arch.virt_timer.timer, v->processor); + migrate_timer(&v->arch.phys_timer.timer, v->processor); WRITE_SYSREG64(v->arch.virt_timer.offset, CNTVOFF_EL2); WRITE_SYSREG64(v->arch.virt_timer.cval, CNTV_CVAL_EL0); -- 1.7.2.5
Stefano Stabellini
2013-Apr-26 15:28 UTC
[PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
Introduce a domain wide vtimer initialization function to initialize the phys_timer and the virt_timer offsets. Use the domain phys_timer and virt_timer offsets throughout the vtimer code instead of the per-vcpu offsets. Remove the per-vcpu offsets from struct vtimer altogether. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v4: - introduce vcpu_domain_init; - inline phys_timer_base and virt_timer_base in arch_domain; - use phys_timer_base.offset and virt_timer_base.offset directly in vtimer code (remove offset field from struct vtimer). --- xen/arch/arm/domain.c | 3 +++ xen/arch/arm/vtimer.c | 26 +++++++++++++++++--------- xen/arch/arm/vtimer.h | 1 + xen/include/asm-arm/domain.h | 24 +++++++++++++++--------- 4 files changed, 36 insertions(+), 18 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index f7ec979..f26222a 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -485,6 +485,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) if ( (rc = domain_vgic_init(d)) != 0 ) goto fail; + if ( (rc = vcpu_domain_init(d)) != 0 ) + goto fail; + /* Domain 0 gets a real UART not an emulated one */ if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) goto fail; diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index 393aac3..782a255 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -44,21 +44,27 @@ static void virt_timer_expired(void *data) vgic_vcpu_inject_irq(t->v, 27, 1); } +int vcpu_domain_init(struct domain *d) +{ + d->arch.phys_timer_base.offset = NOW(); + d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) + + READ_SYSREG64(CNTVOFF_EL2); + return 0; +} + int vcpu_vtimer_init(struct vcpu *v) { struct vtimer *t = &v->arch.phys_timer; init_timer(&t->timer, phys_timer_expired, t, v->processor); t->ctl = 0; - t->offset = NOW(); - t->cval = NOW(); + t->cval = 0; t->irq = 30; t->v = v; t = &v->arch.virt_timer; init_timer(&t->timer, virt_timer_expired, t, v->processor); t->ctl = 0; - t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); t->cval = 0; t->irq = 27; t->v = v; @@ -84,7 +90,7 @@ int virt_timer_save(struct vcpu *v) !(v->arch.virt_timer.ctl & CNTx_CTL_MASK)) { set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v->arch.virt_timer.cval + - v->arch.virt_timer.offset - boot_count)); + v->domain->arch.virt_timer_base.offset - boot_count)); } return 0; } @@ -98,7 +104,7 @@ int virt_timer_restore(struct vcpu *v) migrate_timer(&v->arch.virt_timer.timer, v->processor); migrate_timer(&v->arch.phys_timer.timer, v->processor); - WRITE_SYSREG64(v->arch.virt_timer.offset, CNTVOFF_EL2); + WRITE_SYSREG64(v->domain->arch.virt_timer_base.offset, CNTVOFF_EL2); WRITE_SYSREG64(v->arch.virt_timer.cval, CNTV_CVAL_EL0); WRITE_SYSREG32(v->arch.virt_timer.ctl, CNTV_CTL_EL0); return 0; @@ -128,7 +134,8 @@ static int vtimer_emulate_32(struct cpu_user_regs *regs, union hsr hsr) if ( v->arch.phys_timer.ctl & CNTx_CTL_ENABLE ) { set_timer(&v->arch.phys_timer.timer, - v->arch.phys_timer.cval + v->arch.phys_timer.offset); + v->arch.phys_timer.cval + + v->domain->arch.phys_timer_base.offset); } else stop_timer(&v->arch.phys_timer.timer); @@ -137,7 +144,7 @@ static int vtimer_emulate_32(struct cpu_user_regs *regs, union hsr hsr) return 1; case HSR_CPREG32(CNTP_TVAL): - now = NOW() - v->arch.phys_timer.offset; + now = NOW() - v->domain->arch.phys_timer_base.offset; if ( cp32.read ) { *r = (uint32_t)(ns_to_ticks(v->arch.phys_timer.cval - now) & 0xffffffffull); @@ -149,7 +156,8 @@ static int vtimer_emulate_32(struct cpu_user_regs *regs, union hsr hsr) { v->arch.phys_timer.ctl &= ~CNTx_CTL_PENDING; set_timer(&v->arch.phys_timer.timer, - v->arch.phys_timer.cval + v->arch.phys_timer.offset); + v->arch.phys_timer.cval + + v->domain->arch.phys_timer_base.offset); } } @@ -174,7 +182,7 @@ static int vtimer_emulate_64(struct cpu_user_regs *regs, union hsr hsr) case HSR_CPREG64(CNTPCT): if ( cp64.read ) { - now = NOW() - v->arch.phys_timer.offset; + now = NOW() - v->domain->arch.phys_timer_base.offset; ticks = ns_to_ticks(now); *r1 = (uint32_t)(ticks & 0xffffffff); *r2 = (uint32_t)(ticks >> 32); diff --git a/xen/arch/arm/vtimer.h b/xen/arch/arm/vtimer.h index 690231d..bcf910e 100644 --- a/xen/arch/arm/vtimer.h +++ b/xen/arch/arm/vtimer.h @@ -20,6 +20,7 @@ #ifndef __ARCH_ARM_VTIMER_H__ #define __ARCH_ARM_VTIMER_H__ +extern int vcpu_domain_init(struct domain *d); extern int vcpu_vtimer_init(struct vcpu *v); extern int vtimer_emulate(struct cpu_user_regs *regs, union hsr hsr); extern int virt_timer_save(struct vcpu *v); diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index 3fa266c2..cca7416 100644 --- a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -47,6 +47,14 @@ enum domain_type { #define is_pv64_domain(d) (0) #endif +struct vtimer { + struct vcpu *v; + int irq; + struct timer timer; + uint32_t ctl; + uint64_t cval; +}; + struct arch_domain { #ifdef CONFIG_ARM_64 @@ -62,6 +70,13 @@ struct arch_domain register_t vmpidr; struct { + uint64_t offset; + } phys_timer_base; + struct { + uint64_t offset; + } virt_timer_base; + + struct { /* * Covers access to other members of this struct _except_ for * shared_irqs where each member contains its own locking. @@ -91,15 +106,6 @@ struct arch_domain } __cacheline_aligned; -struct vtimer { - struct vcpu *v; - int irq; - struct timer timer; - uint32_t ctl; - uint64_t offset; - uint64_t cval; -}; - struct arch_vcpu { struct { -- 1.7.2.5
Ian Campbell
2013-Apr-26 15:38 UTC
Re: [PATCH v4 01/12] xen/arm: basic PSCI support, implement cpu_on and cpu_off
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index eae42af..fee3790 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -561,6 +561,8 @@ int arch_set_info_guest( > else > set_bit(_VPF_down, &v->pause_flags); > > + v->is_initialised = 1; > + > return 0; > }This is (now) already there about 2 lines further up than the context shows... Ian.
Keir Fraser
2013-Apr-26 16:16 UTC
Re: [PATCH v4 03/12] xen: move for_each_set_bit to xen/bitops.h
On 26/04/2013 16:28, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:> Move for_each_set_bit from asm-x86/bitops.h to xen/bitops.h. > Replace #include <asm/bitops.h> with #include <xen/bitops.h> everywhere. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: keir@xen.org > CC: JBeulich@suse.comAcked-by: Keir Fraser <keir@xen.org>> --- > xen/arch/x86/i8259.c | 2 +- > xen/arch/x86/mpparse.c | 2 +- > xen/arch/x86/setup.c | 2 +- > xen/arch/x86/traps.c | 2 +- > xen/include/asm-x86/bitops.h | 11 ----------- > xen/include/asm-x86/smp.h | 2 +- > xen/include/asm-x86/system.h | 2 +- > xen/include/xen/bitops.h | 11 +++++++++++ > xen/include/xen/event.h | 2 +- > xen/include/xen/softirq.h | 2 +- > xen/xsm/xsm_policy.c | 2 +- > 11 files changed, 20 insertions(+), 20 deletions(-) > > diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c > index b537c5f..6fdcce8 100644 > --- a/xen/arch/x86/i8259.c > +++ b/xen/arch/x86/i8259.c > @@ -16,7 +16,7 @@ > #include <asm/system.h> > #include <asm/io.h> > #include <asm/desc.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <xen/delay.h> > #include <asm/apic.h> > #include <asm/asm_defns.h> > diff --git a/xen/arch/x86/mpparse.c b/xen/arch/x86/mpparse.c > index 97ab5d3..97d34bc 100644 > --- a/xen/arch/x86/mpparse.c > +++ b/xen/arch/x86/mpparse.c > @@ -22,7 +22,7 @@ > #include <xen/efi.h> > #include <xen/sched.h> > > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/smp.h> > #include <asm/acpi.h> > #include <asm/mtrr.h> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > index 43301a5..744335d 100644 > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -29,7 +29,7 @@ > #include <public/version.h> > #include <compat/platform.h> > #include <compat/xen.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/smp.h> > #include <asm/processor.h> > #include <asm/mpspec.h> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index d36eddd..5200940 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -52,7 +52,7 @@ > #include <asm/system.h> > #include <asm/io.h> > #include <asm/atomic.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/desc.h> > #include <asm/debugreg.h> > #include <asm/smp.h> > diff --git a/xen/include/asm-x86/bitops.h b/xen/include/asm-x86/bitops.h > index c3cbd26..f5a84ef 100644 > --- a/xen/include/asm-x86/bitops.h > +++ b/xen/include/asm-x86/bitops.h > @@ -368,17 +368,6 @@ static inline unsigned int __scanbit(unsigned long val, > unsigned long max) > __find_next_zero_bit(addr,size,off))) > > /** > - * for_each_set_bit - iterate over every set bit in a memory region > - * @bit: The integer iterator > - * @addr: The address to base the search on > - * @size: The maximum size to search > - */ > -#define for_each_set_bit(bit, addr, size) \ > - for ( (bit) = find_first_bit(addr, size); \ > - (bit) < (size); \ > - (bit) = find_next_bit(addr, size, (bit) + 1) ) > - > -/** > * find_first_set_bit - find the first set bit in @word > * @word: the word to search > * > diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h > index 301f8c7..81f8610 100644 > --- a/xen/include/asm-x86/smp.h > +++ b/xen/include/asm-x86/smp.h > @@ -12,7 +12,7 @@ > #endif > > #ifndef __ASSEMBLY__ > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/mpspec.h> > #endif > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index b0876d6..6ab7d56 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -2,7 +2,7 @@ > #define __ASM_SYSTEM_H > > #include <xen/lib.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > > #define read_segment_register(name) \ > ({ u16 __sel; \ > diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h > index c6a78b6..6054155 100644 > --- a/xen/include/xen/bitops.h > +++ b/xen/include/xen/bitops.h > @@ -182,4 +182,15 @@ static inline __u32 ror32(__u32 word, unsigned int shift) > #define __L16(_x) (((_x) & 0x0000ff00) ? ( 8 + __L8( (_x)>> 8)) : __L8( _x)) > #define LOG_2(_x) (((_x) & 0xffff0000) ? (16 + __L16((_x)>>16)) : __L16(_x)) > > +/** > + * for_each_set_bit - iterate over every set bit in a memory region > + * @bit: The integer iterator > + * @addr: The address to base the search on > + * @size: The maximum size to search > + */ > +#define for_each_set_bit(bit, addr, size) \ > + for ( (bit) = find_first_bit(addr, size); \ > + (bit) < (size); \ > + (bit) = find_next_bit(addr, size, (bit) + 1) ) > + > #endif > diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h > index 71c3e92..4ac39ad 100644 > --- a/xen/include/xen/event.h > +++ b/xen/include/xen/event.h > @@ -12,7 +12,7 @@ > #include <xen/sched.h> > #include <xen/smp.h> > #include <xen/softirq.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/event.h> > > /* > diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h > index 193351d..0c0d481 100644 > --- a/xen/include/xen/softirq.h > +++ b/xen/include/xen/softirq.h > @@ -13,7 +13,7 @@ enum { > > #include <xen/lib.h> > #include <xen/smp.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > #include <asm/current.h> > #include <asm/hardirq.h> > #include <asm/softirq.h> > diff --git a/xen/xsm/xsm_policy.c b/xen/xsm/xsm_policy.c > index 65be61d..cad7964 100644 > --- a/xen/xsm/xsm_policy.c > +++ b/xen/xsm/xsm_policy.c > @@ -20,7 +20,7 @@ > > #include <xsm/xsm.h> > #include <xen/multiboot.h> > -#include <asm/bitops.h> > +#include <xen/bitops.h> > > char *__initdata policy_buffer = NULL; > u32 __initdata policy_size = 0;
Keir Fraser
2013-Apr-26 16:17 UTC
Re: [PATCH v4 08/12] xen: move VCPUOP_register_vcpu_info to common code
On 26/04/2013 16:28, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:> Move the implementation of VCPUOP_register_vcpu_info from x86 specific > to commmon code. > > Move vcpu_info_mfn from an arch specific vcpu sub-field to the common > vcpu struct. > Move the initialization of vcpu_info_mfn to common code. > > Move unmap_vcpu_info and the call to unmap_vcpu_info at domain > destruction time to common code. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Acked-by: Ian Campbell <ian.campbell@citrix.com> > CC: keir@xen.org > CC: JBeulich@suse.comAcked-by: Keir Fraser <keir@xen.org>
Keir Fraser
2013-Apr-26 16:17 UTC
Re: [PATCH v4 09/12] xen/arm: support VCPUOP_register_vcpu_info.
On 26/04/2013 16:28, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:> We don''t want to support the full vcpu_op hypercall interface, just > VCPUOP_register_vcpu_info: introduce an internal ARM-only > do_arm_vcpu_op function to filter out the vcpu_op hypercalls that > we don''t want to support. > Call do_arm_vcpu_op instead of do_vcpu_op from traps.c. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: keir@xen.org > CC: JBeulich@suse.comAcked-by: Keir Fraser <keir@xen.org>
Ian Campbell
2013-Apr-26 16:22 UTC
Re: [PATCH v4 02/12] xen/arm: allocate secondaries dom0 vcpus
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> Trap writes to GICD_SGIR, parse the requests, inject SGIs into the right > guest vcpu. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Changes in v4: > - move the code to a separate function; > - use gdprintk for debugging output; > - make use of PRIregister; > - replace the cpumask with a bitmask; > - move the virtual_irq check outside the loop; > - ignore non-existent target vcpus. > > Changes in v3: > - make use of cpumask_from_bitmap. > --- > xen/arch/arm/vgic.c | 63 ++++++++++++++++++++++++++++++++++++++++++--- > xen/include/asm-arm/gic.h | 9 ++++-- > 2 files changed, 65 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index b30da78..8d87609 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -17,6 +17,7 @@ > * GNU General Public License for more details. > */ > > +#include <xen/bitops.h> > #include <xen/config.h> > #include <xen/lib.h> > #include <xen/init.h> > @@ -368,6 +369,61 @@ static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) > } > } > > +static int vgic_to_sgi(struct vcpu *v, register_t sgir) > +{ > + struct domain *d = v->domain; > + int virtual_irq; > + int filter; > + int vcpuid; > + struct vcpu *vt; > + int i; > + unsigned long vcpu_mask = 0; > + > + ASSERT(d->max_vcpus < 8*sizeof(vcpu_mask)); > + > + filter = (sgir & GICD_SGI_TARGET_LIST_MASK); > + virtual_irq = (sgir & GICD_SGI_INTID_MASK); > + > + switch ( filter ) > + { > + case GICD_SGI_TARGET_LIST: > + vcpu_mask = (sgir & GICD_SGI_TARGET_MASK) >> GICD_SGI_TARGET_SHIFT; > + break; > + case GICD_SGI_TARGET_OTHERS: > + for ( i = 0; i < d->max_vcpus; i++ ) > + { > + if ( i != current->vcpu_id && d->vcpu[i] != NULL ) > + set_bit(i, &vcpu_mask); > + } > + case GICD_SGI_TARGET_SELF: > + set_bit(current->vcpu_id, &vcpu_mask); > + break; > + default: > + gdprintk(XENLOG_WARNING, "vGICD: unhandled GICD_SGIR write %"PRIregister" with wrong TargetListFilter field\n", > + sgir); > + return 0; > + } > + if ( virtual_irq >= 16 ) > + { > + gdprintk(XENLOG_WARNING, "vGICD: try to send SGI %d that is >= 16\n", > + virtual_irq); > + return 0; > + }It occurs to me that the only way this can happen is if we have decoded the instruction incorrectly, because it''s only 4 bits. It can probably be an ASSERT/BUG_ON.> + > + > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > + { > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL )Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has not been PSCI''d up will pass this -- what do we do to them? Hopefully we don''t wake them up? Do we not want to check something like _VPF_down too? The same likely goes for the TARGET_OTHERS code too which builds the mask, I''m not sure how this isn''t racy against a cpu going down though. (personally I dislike assignments inside if()s, but they are not uncommon in the hypervisor code)> + { > + gdprintk(XENLOG_WARNING, "vGICD: GICD_SGIR write r=%"PRIregister" vcpu_mask=%lx, wrong CPUTargetList\n", > + sgir, vcpu_mask); > + continue; > + } > + vgic_vcpu_inject_irq(vt, virtual_irq, 1); > + } > + return 1; > +} > + > static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > { > struct hsr_dabt dabt = info->dabt; > @@ -498,10 +554,9 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info) > goto write_ignore; > > case GICD_SGIR: > - if ( dabt.size != 2 ) goto bad_width; > - printk("vGICD: unhandled write %#"PRIregister" to ICFGR%d\n", > - *r, gicd_reg - GICD_ICFGR); > - return 0; > + if ( dabt.size != 2 ) > + goto bad_width; > + return vgic_to_sgi(v, *r); > > case GICD_CPENDSGIR ... GICD_CPENDSGIRN: > if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width; > diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h > index 92711d5..061ebf4 100644 > --- a/xen/include/asm-arm/gic.h > +++ b/xen/include/asm-arm/gic.h > @@ -51,12 +51,15 @@ > #define GICD_SPENDSGIRN (0xF2C/4) > #define GICD_ICPIDR2 (0xFE8/4) > > -#define GICD_SGI_TARGET_LIST (0UL<<24) > -#define GICD_SGI_TARGET_OTHERS (1UL<<24) > -#define GICD_SGI_TARGET_SELF (2UL<<24) > +#define GICD_SGI_TARGET_LIST_SHIFT (24) > +#define GICD_SGI_TARGET_LIST_MASK (0x3UL << GICD_SGI_TARGET_LIST_SHIFT)Can you line the others up with these please, you are touching most of them anyway.> +#define GICD_SGI_TARGET_LIST (0UL<<GICD_SGI_TARGET_LIST_SHIFT) > +#define GICD_SGI_TARGET_OTHERS (1UL<<GICD_SGI_TARGET_LIST_SHIFT) > +#define GICD_SGI_TARGET_SELF (2UL<<GICD_SGI_TARGET_LIST_SHIFT) > #define GICD_SGI_TARGET_SHIFT (16) > #define GICD_SGI_TARGET_MASK (0xFFUL<<GICD_SGI_TARGET_SHIFT) > #define GICD_SGI_GROUP1 (1UL<<15) > +#define GICD_SGI_INTID_MASK (0xFUL) > > #define GICC_CTLR (0x0000/4) > #define GICC_PMR (0x0004/4)
Ian Campbell
2013-Apr-26 16:37 UTC
Re: [PATCH v4 05/12] xen/arm: implement arch_vmap_virt_end
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> Move virt_start out of ioremap and rename it to early_vmap_start. > Implement arch_vmap_virt_end by returning early_vmap_start. > > Allocate virtual addresses in early_ioremap from top to bottom so that > later on when we initialize vmap, we can return the end of the vmap > address space (the last address allocated by early_ioremap). > > Changes in v4: > - merge with "early_ioremap: allocate virtual addresses from top to > bottom" patch; > - better commit message; > - declare early_vmap_start __initdata; > - use "& ~SECOND_MASK" instead of shifts. > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell
2013-Apr-26 16:50 UTC
Re: [PATCH v4 06/12] xen/arm: compile and initialize vmap
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > VMAP_VIRT_END and VMAP_VIRT_START. > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM, because we > only support 4K pages so as a matter of fact it is always set). > > Implement map_pages_to_xen and destroy_xen_mappings. > > Call vm_init from start_xen. > > Changes in v4: > - remove flush_tlb_local() from create_xen_entries;Do you think the related one in create_p2m can go too?> +static int create_xen_table(lpae_t *entry) > +{ > + void *p; > + lpae_t pte; > + > + p = alloc_xenheap_pages(0, 0);You can use alloc_xenheap_page here.> + BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); > + > + third = __va((paddr_t)xen_second[second_linear_offset(addr)].pt.base > + << PAGE_SHIFT);Sometimes the cure is worse than the disease when it comes to 80 columns, and this would have been far from the longest line in this patch ;-) The cast + shift construct is available as pfn_to_paddr. Those are just nits really, if want to fix the pfn_to_paddr + alloc_xenheap_page things then: Acked-by: Ian Campbell <ian.campbell@citrix.com> The line length thing I''ll leave to you. Ian.
Stefano Stabellini
2013-Apr-27 14:55 UTC
Re: [PATCH v4 06/12] xen/arm: compile and initialize vmap
On Fri, 26 Apr 2013, Ian Campbell wrote:> On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > > VMAP_VIRT_END and VMAP_VIRT_START. > > > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM, because we > > only support 4K pages so as a matter of fact it is always set). > > > > Implement map_pages_to_xen and destroy_xen_mappings. > > > > Call vm_init from start_xen. > > > > Changes in v4: > > - remove flush_tlb_local() from create_xen_entries; > > Do you think the related one in create_p2m can go too?No, because the flush in create_p2m is the only flush present in that function, and because we change p2m mappings there, we need to make sure that the guest doesn''t keep accessing the old ones.
Stefano Stabellini
2013-Apr-28 14:26 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Fri, 26 Apr 2013, Ian Campbell wrote:> > + > > + > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > + { > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > don''t wake them up? Do we not want to check something like _VPF_down > too?I think you are right, we should test for _VPF_down too
Ian Campbell
2013-Apr-29 09:15 UTC
Re: [PATCH v4 06/12] xen/arm: compile and initialize vmap
On Sat, 2013-04-27 at 15:55 +0100, Stefano Stabellini wrote:> On Fri, 26 Apr 2013, Ian Campbell wrote: > > On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > > Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > > > VMAP_VIRT_END and VMAP_VIRT_START. > > > > > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > > > > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM, because we > > > only support 4K pages so as a matter of fact it is always set). > > > > > > Implement map_pages_to_xen and destroy_xen_mappings. > > > > > > Call vm_init from start_xen. > > > > > > Changes in v4: > > > - remove flush_tlb_local() from create_xen_entries; > > > > Do you think the related one in create_p2m can go too? > > No, because the flush in create_p2m is the only flush present in that > function, and because we change p2m mappings there, we need to make sure > that the guest doesn''t keep accessing the old ones.Isn''t that flush in the wrong place then, because a guest VCPU running on another PCPU could easily repopulate the TLB entry between that flush and the time we actually update the PTE. At the point of the current flush should be not remember that a flush is required and do it at the end? Or clear+flush here and reinstate later. BTW, this flushes all VMIDs, in principal we need to only flush the guest''s one, which requires this CPU to switch to that VMID, issue the (TLBIALL) and switch back to the callers VMID. An optimisation to keep in mind until later I think. It appears to not be possible to flsh individual IPAs, so given a lack of insight into the guest VA->IPA mapping we cannot optimise beyond a full flush, shame! Ian.
Ian Campbell
2013-Apr-29 09:15 UTC
Re: [PATCH v4 01/12] xen/arm: basic PSCI support, implement cpu_on and cpu_off
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> +int do_psci_cpu_on(uint32_t vcpuid, register_t entry_point) > +{ > + struct vcpu *v; > + struct domain *d = current->domain; > + struct vcpu_guest_context *ctxt; > + int rc; > + > + if ( (vcpuid < 0) || (vcpuid >= MAX_VIRT_CPUS) ) > + return PSCI_EINVAL; > + > + if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) > + return PSCI_EINVAL; > + > + if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) > + return PSCI_DENIED; > + > + memset(ctxt, 0, sizeof(*ctxt)); > + ctxt->user_regs.pc64 = (u64) entry_point; > + ctxt->sctlr = SCTLR_BASE; > + ctxt->ttbr0 = 0; > + ctxt->ttbr1 = 0; > + ctxt->ttbcr = 0; /* Defined Reset Value */ > + ctxt->user_regs.cpsr = PSR_GUEST_INIT; > + ctxt->flags = VGCF_online; > + > + domain_lock(d); > + rc = arch_set_info_guest(v, ctxt); > + free_vcpu_guest_context(ctxt);I''m slightly concerned that there might be other state which needs clearing when we are brining up a VCPU which has subsequently been running and blocked via cpu_down. I''m thinking of the sort of things which vcpu_reset clears up. However I guess we''ll find those out later.> + if ( rc < 0 ) > + { > + domain_unlock(d); > + return PSCI_DENIED; > + } > + domain_unlock(d); > + > + vcpu_wake(v); > + > + return PSCI_SUCCESS; > +} > + > +int do_psci_cpu_off(uint32_t power_state) > +{ > + struct vcpu *v = current; > + if ( !test_and_set_bit(_VPF_down, &v->pause_flags) ) > + vcpu_sleep_nosync(v); > + return PSCI_SUCCESS;We will return here but not make it back to the guest, because we will call into the scheduler (vcpu_sleep_nosync makes sure this is the case) and block. If cpu_up is subsequently called then it will overwrite the VCPU''s guest state and so it will return to the newly selected address. Good! Acked-by: Ian Campbell <ian.campbell@citrix.com>
On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote:> On Fri, 26 Apr 2013, Ian Campbell wrote: > > > + > > > + > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > + { > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > don''t wake them up? Do we not want to check something like _VPF_down > > too? > > I think you are right, we should test for _VPF_down tooAre there any interesting races here? Seems like there should be, unless we hold the domain lock or something? Ian.
Ian Campbell
2013-Apr-29 14:06 UTC
Re: [PATCH v4 11/12] xen/arm: run the vtimer Xen timers on the pcpu the vcpu is running on
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> The Xen physical timer emulator and virtual timer driver use two > internal Xen timers: initialize them on the pcpu the vcpu is running on, > rather than the processor that it''s creating the vcpu. > > On vcpu restore migrate the phys_timer and the virt_timer to the > pcpu the vcpu is running on. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>In so far as I understand Xen''s timer architecture: Acked-by: Ian Campbell <ian.campbell@citrix.com>> Changes in v4: > - migrate Xen timers on virt_timer_restore. > --- > xen/arch/arm/vtimer.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index 1cb365e..393aac3 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -48,7 +48,7 @@ int vcpu_vtimer_init(struct vcpu *v) > { > struct vtimer *t = &v->arch.phys_timer; > > - init_timer(&t->timer, phys_timer_expired, t, smp_processor_id()); > + init_timer(&t->timer, phys_timer_expired, t, v->processor); > t->ctl = 0; > t->offset = NOW(); > t->cval = NOW(); > @@ -56,7 +56,7 @@ int vcpu_vtimer_init(struct vcpu *v) > t->v = v; > > t = &v->arch.virt_timer; > - init_timer(&t->timer, virt_timer_expired, t, smp_processor_id()); > + init_timer(&t->timer, virt_timer_expired, t, v->processor); > t->ctl = 0; > t->offset = READ_SYSREG64(CNTVCT_EL0) + READ_SYSREG64(CNTVOFF_EL2); > t->cval = 0; > @@ -95,6 +95,8 @@ int virt_timer_restore(struct vcpu *v) > return 0; > > stop_timer(&v->arch.virt_timer.timer); > + migrate_timer(&v->arch.virt_timer.timer, v->processor); > + migrate_timer(&v->arch.phys_timer.timer, v->processor); > > WRITE_SYSREG64(v->arch.virt_timer.offset, CNTVOFF_EL2); > WRITE_SYSREG64(v->arch.virt_timer.cval, CNTV_CVAL_EL0);
Ian Campbell
2013-Apr-29 14:17 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> Introduce a domain wide vtimer initialization function to initialize > the phys_timer and the virt_timer offsets. > > Use the domain phys_timer and virt_timer offsets throughout the vtimer > code instead of the per-vcpu offsets. > > Remove the per-vcpu offsets from struct vtimer altogether. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Changes in v4: > - introduce vcpu_domain_init; > - inline phys_timer_base and virt_timer_base in arch_domain; > - use phys_timer_base.offset and virt_timer_base.offset directly in > vtimer code (remove offset field from struct vtimer). > --- > xen/arch/arm/domain.c | 3 +++ > xen/arch/arm/vtimer.c | 26 +++++++++++++++++--------- > xen/arch/arm/vtimer.h | 1 + > xen/include/asm-arm/domain.h | 24 +++++++++++++++--------- > 4 files changed, 36 insertions(+), 18 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index f7ec979..f26222a 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -485,6 +485,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > if ( (rc = domain_vgic_init(d)) != 0 ) > goto fail; > > + if ( (rc = vcpu_domain_init(d)) != 0 ) > + goto fail; > + > /* Domain 0 gets a real UART not an emulated one */ > if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) > goto fail; > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index 393aac3..782a255 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -44,21 +44,27 @@ static void virt_timer_expired(void *data) > vgic_vcpu_inject_irq(t->v, 27, 1); > } > > +int vcpu_domain_init(struct domain *d) > +{ > + d->arch.phys_timer_base.offset = NOW(); > + d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) + > + READ_SYSREG64(CNTVOFF_EL2);I know you are just moving this code but I don''t understand how this can make sense. When initialising dom0 these are, AFAICT, uninitialised. When initialising domU these are whatever the current domain (so, dom0) happens to have here. Also CNTVCT == Phys Count - Virt Offset, so adding Virt offset to it seems like an odd way to do it?> + return 0; > +} > + > int vcpu_vtimer_init(struct vcpu *v) > { > struct vtimer *t = &v->arch.phys_timer; > > init_timer(&t->timer, phys_timer_expired, t, v->processor); > t->ctl = 0; > - t->offset = NOW(); > - t->cval = NOW(); > + t->cval = 0; > t->irq = 30;Why this cval change? I don''t see the opposite being taken into account so I think this is an actual semantic change unrelated to the moving of the offset fields? Our handling of vcpu time is IMHO pretty, well, not well understood but other than this little change the rest of the patch seems to otherwise fall into the "can''t make anything worse" category. (I don''t know how to classify this hunk, not necessarily saying it is bad). Ian.
Stefano Stabellini
2013-Apr-29 15:47 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Mon, 29 Apr 2013, Ian Campbell wrote:> On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > + > > > > + > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > + { > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > don''t wake them up? Do we not want to check something like _VPF_down > > > too? > > > > I think you are right, we should test for _VPF_down too > > Are there any interesting races here? Seems like there should be, unless > we hold the domain lock or something?Considering that max_vcpus is static, d->vcpu[vcpuid] has been allocated at domain creation and that test_bit should be atomic, I don''t think there are any races here. What am I missing?
Stefano Stabellini
2013-Apr-29 16:14 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Mon, 29 Apr 2013, Ian Campbell wrote:> On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > Introduce a domain wide vtimer initialization function to initialize > > the phys_timer and the virt_timer offsets. > > > > Use the domain phys_timer and virt_timer offsets throughout the vtimer > > code instead of the per-vcpu offsets. > > > > Remove the per-vcpu offsets from struct vtimer altogether. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > Changes in v4: > > - introduce vcpu_domain_init; > > - inline phys_timer_base and virt_timer_base in arch_domain; > > - use phys_timer_base.offset and virt_timer_base.offset directly in > > vtimer code (remove offset field from struct vtimer). > > --- > > xen/arch/arm/domain.c | 3 +++ > > xen/arch/arm/vtimer.c | 26 +++++++++++++++++--------- > > xen/arch/arm/vtimer.h | 1 + > > xen/include/asm-arm/domain.h | 24 +++++++++++++++--------- > > 4 files changed, 36 insertions(+), 18 deletions(-) > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > index f7ec979..f26222a 100644 > > --- a/xen/arch/arm/domain.c > > +++ b/xen/arch/arm/domain.c > > @@ -485,6 +485,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > > if ( (rc = domain_vgic_init(d)) != 0 ) > > goto fail; > > > > + if ( (rc = vcpu_domain_init(d)) != 0 ) > > + goto fail; > > + > > /* Domain 0 gets a real UART not an emulated one */ > > if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) > > goto fail; > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > index 393aac3..782a255 100644 > > --- a/xen/arch/arm/vtimer.c > > +++ b/xen/arch/arm/vtimer.c > > @@ -44,21 +44,27 @@ static void virt_timer_expired(void *data) > > vgic_vcpu_inject_irq(t->v, 27, 1); > > } > > > > +int vcpu_domain_init(struct domain *d) > > +{ > > + d->arch.phys_timer_base.offset = NOW(); > > + d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) + > > + READ_SYSREG64(CNTVOFF_EL2); > > I know you are just moving this code but I don''t understand how this can > make sense. > > When initialising dom0 these are, AFAICT, uninitialised.I don''t think so: init_xen_time() is called before building dom0, so NOW() should be returning proper values here.> When > initialising domU these are whatever the current domain (so, dom0) > happens to have here.The offset is what we use to convert DomU time to Xen time and vice versa. Initializing phys_timer_base.offset to NOW() means that domU''s physical timer starts counting from NOW in terms of Xen time. Initializing virt_timer_base.offset to CNTVCT_EL0 + CNTVOFF_EL2 accomplishes the same thing for vtimer: offset = CNTVCT_EL0 + CNTVOFF_EL2 = Phys Count - CNTVOFF_EL2 + CNTVOFF_EL2 = Phys Count> Also CNTVCT == Phys Count - Virt Offset, so adding Virt offset to it > seems like an odd way to do it?I think the reason for it is that in the old document it wasn''t explicitly stated that Phys Count is actually identical to CNTPCT.> > + return 0; > > +} > > + > > int vcpu_vtimer_init(struct vcpu *v) > > { > > struct vtimer *t = &v->arch.phys_timer; > > > > init_timer(&t->timer, phys_timer_expired, t, v->processor); > > t->ctl = 0; > > - t->offset = NOW(); > > - t->cval = NOW(); > > + t->cval = 0; > > t->irq = 30; > > Why this cval change? I don''t see the opposite being taken into account > so I think this is an actual semantic change unrelated to the moving of > the offset fields? >cval is the compare value, I don''t think it matters what it is initialized to. I couldn''t think of a reason why it should be initialized to NOW(), so I changed to 0. I think it shouldn''t make any differences, but you are right, this is a semantic change, I will remove it from this patch.> Our handling of vcpu time is IMHO pretty, well, not well understood but > other than this little change the rest of the patch seems to otherwise > fall into the "can''t make anything worse" category. (I don''t know how to > classify this hunk, not necessarily saying it is bad).Right, it should be only code movement plus same initialization on both vcpus.
Stefano Stabellini
2013-Apr-29 16:20 UTC
Re: [PATCH v4 06/12] xen/arm: compile and initialize vmap
On Mon, 29 Apr 2013, Ian Campbell wrote:> On Sat, 2013-04-27 at 15:55 +0100, Stefano Stabellini wrote: > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > > > Rename EARLY_VMAP_VIRT_END and EARLY_VMAP_VIRT_START to > > > > VMAP_VIRT_END and VMAP_VIRT_START. > > > > > > > > Defining VMAP_VIRT_START triggers the compilation of common/vmap.c. > > > > > > > > Define PAGE_HYPERVISOR and MAP_SMALL_PAGES (unused on ARM, because we > > > > only support 4K pages so as a matter of fact it is always set). > > > > > > > > Implement map_pages_to_xen and destroy_xen_mappings. > > > > > > > > Call vm_init from start_xen. > > > > > > > > Changes in v4: > > > > - remove flush_tlb_local() from create_xen_entries; > > > > > > Do you think the related one in create_p2m can go too? > > > > No, because the flush in create_p2m is the only flush present in that > > function, and because we change p2m mappings there, we need to make sure > > that the guest doesn''t keep accessing the old ones. > > Isn''t that flush in the wrong place then, because a guest VCPU running > on another PCPU could easily repopulate the TLB entry between that flush > and the time we actually update the PTE. > > At the point of the current flush should be not remember that a flush is > required and do it at the end? Or clear+flush here and reinstate later.I agree, I''ll send a patch> BTW, this flushes all VMIDs, in principal we need to only flush the > guest''s one, which requires this CPU to switch to that VMID, issue the > (TLBIALL) and switch back to the callers VMID. An optimisation to keep > in mind until later I think. > > It appears to not be possible to flsh individual IPAs, so given a lack > of insight into the guest VA->IPA mapping we cannot optimise beyond a > full flush, shame!right
On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote:> On Mon, 29 Apr 2013, Ian Campbell wrote: > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > + > > > > > + > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > + { > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > too? > > > > > > I think you are right, we should test for _VPF_down too > > > > Are there any interesting races here? Seems like there should be, unless > > we hold the domain lock or something? > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > allocated at domain creation and that test_bit should be atomic, I don''t > think there are any races here. > What am I missing?The race between checking the bit and acting on the result. Ian.
Stefano Stabellini
2013-Apr-29 17:48 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Mon, 29 Apr 2013, Ian Campbell wrote:> On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > + > > > > > > + > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > + { > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > too? > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > Are there any interesting races here? Seems like there should be, unless > > > we hold the domain lock or something? > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > allocated at domain creation and that test_bit should be atomic, I don''t > > think there are any races here. > > What am I missing? > > The race between checking the bit and acting on the result.I don''t think that race can cause any problems: if vcpu0 is sending a TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who wins the race vcpu1 is going to receive the SGI on not, but the result is going to be always consistent: if vcpu1 shuts down before the SGI is sent, it is not going to receive it (actually it is going to receive it if it gets ever woken up), otherwise it is going to receive it before shutting down.
Stefano Stabellini
2013-Apr-29 17:51 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Mon, 29 Apr 2013, Stefano Stabellini wrote:> On Mon, 29 Apr 2013, Ian Campbell wrote: > > On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > > + > > > > > > > + > > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > > + { > > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > > too? > > > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > > > Are there any interesting races here? Seems like there should be, unless > > > > we hold the domain lock or something? > > > > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > > allocated at domain creation and that test_bit should be atomic, I don''t > > > think there are any races here. > > > What am I missing? > > > > The race between checking the bit and acting on the result. > > I don''t think that race can cause any problems: if vcpu0 is sending a > TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who > wins the race vcpu1 is going to receive the SGI on not, but the result > is going to be always consistent: if vcpu1 shuts down before the SGI is > sent, it is not going to receive it (actually it is going to receive it > if it gets ever woken up), otherwise it is going to receive it before > shutting down. >That makes me realize that we might not want to queue up interrupts when a cpu is offline, what do you think?
Ian Campbell
2013-Apr-30 08:38 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Mon, 2013-04-29 at 17:14 +0100, Stefano Stabellini wrote:> On Mon, 29 Apr 2013, Ian Campbell wrote: > > On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > > Introduce a domain wide vtimer initialization function to initialize > > > the phys_timer and the virt_timer offsets. > > > > > > Use the domain phys_timer and virt_timer offsets throughout the vtimer > > > code instead of the per-vcpu offsets. > > > > > > Remove the per-vcpu offsets from struct vtimer altogether. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > > > Changes in v4: > > > - introduce vcpu_domain_init; > > > - inline phys_timer_base and virt_timer_base in arch_domain; > > > - use phys_timer_base.offset and virt_timer_base.offset directly in > > > vtimer code (remove offset field from struct vtimer). > > > --- > > > xen/arch/arm/domain.c | 3 +++ > > > xen/arch/arm/vtimer.c | 26 +++++++++++++++++--------- > > > xen/arch/arm/vtimer.h | 1 + > > > xen/include/asm-arm/domain.h | 24 +++++++++++++++--------- > > > 4 files changed, 36 insertions(+), 18 deletions(-) > > > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > > index f7ec979..f26222a 100644 > > > --- a/xen/arch/arm/domain.c > > > +++ b/xen/arch/arm/domain.c > > > @@ -485,6 +485,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > > > if ( (rc = domain_vgic_init(d)) != 0 ) > > > goto fail; > > > > > > + if ( (rc = vcpu_domain_init(d)) != 0 ) > > > + goto fail; > > > + > > > /* Domain 0 gets a real UART not an emulated one */ > > > if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) > > > goto fail; > > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > > index 393aac3..782a255 100644 > > > --- a/xen/arch/arm/vtimer.c > > > +++ b/xen/arch/arm/vtimer.c > > > @@ -44,21 +44,27 @@ static void virt_timer_expired(void *data) > > > vgic_vcpu_inject_irq(t->v, 27, 1); > > > } > > > > > > +int vcpu_domain_init(struct domain *d) > > > +{ > > > + d->arch.phys_timer_base.offset = NOW(); > > > + d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) + > > > + READ_SYSREG64(CNTVOFF_EL2); > > > > I know you are just moving this code but I don''t understand how this can > > make sense. > > > > When initialising dom0 these are, AFAICT, uninitialised. > > I don''t think so: init_xen_time() is called before building dom0, so NOW() > should be returning proper values here.I meant the reads of CNTVCT_EL0 and CNTVOFF_EL2, especially the second one which is reset to an unknown value and we don''t set it.> > When > > initialising domU these are whatever the current domain (so, dom0) > > happens to have here. > > The offset is what we use to convert DomU time to Xen time and vice > versa. > Initializing phys_timer_base.offset to NOW() means that domU''s physical > timer starts counting from NOW in terms of Xen time. > > Initializing virt_timer_base.offset to CNTVCT_EL0 + CNTVOFF_EL2 > accomplishes the same thing for vtimer: > > offset = CNTVCT_EL0 + CNTVOFF_EL2 > = Phys Count - CNTVOFF_EL2 + CNTVOFF_EL2 > = Phys Count > > > Also CNTVCT == Phys Count - Virt Offset, so adding Virt offset to it > > seems like an odd way to do it? > > I think the reason for it is that in the old document it wasn''t > explicitly stated that Phys Count is actually identical to CNTPCT.So we could just read CNTPCT but don''t because the docs at the time didn''t suggest we could? By setting offset == phys-count we are making it so that virtual time for the domain appears to start from approx. 0, is that right? But then it ticks along at the same rate as phys time with no accounting for stolen or lost time etc? TBH I''m not even sure what stolen/lost time would be like for a clock which is supposed to be consistent across all VCPUs, or maybe that restriction is only for physical and hypervisor timers. As I say, our whole concept of virtual time is a little bit confused, partly I think because we don''t really know what the right answer is supposed to be. Your comment about the old docs made me go look at the new ARM -- there''s a bunch of new register in there from the looks of things, Ian.
On Mon, 2013-04-29 at 18:51 +0100, Stefano Stabellini wrote:> On Mon, 29 Apr 2013, Stefano Stabellini wrote: > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > > > + > > > > > > > > + > > > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > > > + { > > > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > > > too? > > > > > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > > > > > Are there any interesting races here? Seems like there should be, unless > > > > > we hold the domain lock or something? > > > > > > > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > > > allocated at domain creation and that test_bit should be atomic, I don''t > > > > think there are any races here. > > > > What am I missing? > > > > > > The race between checking the bit and acting on the result. > > > > I don''t think that race can cause any problems: if vcpu0 is sending a > > TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who > > wins the race vcpu1 is going to receive the SGI on not, but the result > > is going to be always consistent: if vcpu1 shuts down before the SGI is > > sent, it is not going to receive it (actually it is going to receive it > > if it gets ever woken up), otherwise it is going to receive it before > > shutting down.Did you test the case where the SGI gets delivered to a CPU which was up but now is down? We don''t want to crash because some resource has been freed etc...> That makes me realize that we might not want to queue up interrupts when > a cpu is offline, what do you think?Probably we should either avoid that or explicitly clear the queue when we bring a CPU up. Ian.
Ian Campbell
2013-Apr-30 14:59 UTC
Re: [PATCH v4 09/12] xen/arm: support VCPUOP_register_vcpu_info.
On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote:> We don''t want to support the full vcpu_op hypercall interface, just > VCPUOP_register_vcpu_info: introduce an internal ARM-only > do_arm_vcpu_op function to filter out the vcpu_op hypercalls that > we don''t want to support. > Call do_arm_vcpu_op instead of do_vcpu_op from traps.c. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>
Stefano Stabellini
2013-Apr-30 15:32 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Tue, 30 Apr 2013, Ian Campbell wrote:> On Mon, 2013-04-29 at 18:51 +0100, Stefano Stabellini wrote: > > On Mon, 29 Apr 2013, Stefano Stabellini wrote: > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > > > > + > > > > > > > > > + > > > > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > > > > + { > > > > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > > > > too? > > > > > > > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > > > > > > > Are there any interesting races here? Seems like there should be, unless > > > > > > we hold the domain lock or something? > > > > > > > > > > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > > > > allocated at domain creation and that test_bit should be atomic, I don''t > > > > > think there are any races here. > > > > > What am I missing? > > > > > > > > The race between checking the bit and acting on the result. > > > > > > I don''t think that race can cause any problems: if vcpu0 is sending a > > > TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who > > > wins the race vcpu1 is going to receive the SGI on not, but the result > > > is going to be always consistent: if vcpu1 shuts down before the SGI is > > > sent, it is not going to receive it (actually it is going to receive it > > > if it gets ever woken up), otherwise it is going to receive it before > > > shutting down. > > Did you test the case where the SGI gets delivered to a CPU which was up > but now is down? We don''t want to crash because some resource has been > freed etc...I haven''t actually tested it, but no resources are freed: vcpu_sleep_nosync only sets a scheduler flag.> > a cpu is offline, what do you think? > > Probably we should either avoid that or explicitly clear the queue when > we bring a CPU up.I think that clearing the queue when we bring the cpu up is better, I''ll add a patch for that.
On Tue, 2013-04-30 at 16:32 +0100, Stefano Stabellini wrote:> On Tue, 30 Apr 2013, Ian Campbell wrote: > > On Mon, 2013-04-29 at 18:51 +0100, Stefano Stabellini wrote: > > > On Mon, 29 Apr 2013, Stefano Stabellini wrote: > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > > > > > + > > > > > > > > > > + > > > > > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > > > > > + { > > > > > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > > > > > too? > > > > > > > > > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > > > > > > > > > Are there any interesting races here? Seems like there should be, unless > > > > > > > we hold the domain lock or something? > > > > > > > > > > > > > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > > > > > allocated at domain creation and that test_bit should be atomic, I don''t > > > > > > think there are any races here. > > > > > > What am I missing? > > > > > > > > > > The race between checking the bit and acting on the result. > > > > > > > > I don''t think that race can cause any problems: if vcpu0 is sending a > > > > TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who > > > > wins the race vcpu1 is going to receive the SGI on not, but the result > > > > is going to be always consistent: if vcpu1 shuts down before the SGI is > > > > sent, it is not going to receive it (actually it is going to receive it > > > > if it gets ever woken up), otherwise it is going to receive it before > > > > shutting down. > > > > Did you test the case where the SGI gets delivered to a CPU which was up > > but now is down? We don''t want to crash because some resource has been > > freed etc... > > I haven''t actually tested it, but no resources are freed: > vcpu_sleep_nosync only sets a scheduler flag.Good, so vcpus are only actually destroyed when we are destroying a domain. Do we need to be checking d->is_dying?> > > > a cpu is offline, what do you think? > > > > Probably we should either avoid that or explicitly clear the queue when > > we bring a CPU up. > > I think that clearing the queue when we bring the cpu up is better, I''ll > add a patch for that.
Stefano Stabellini
2013-Apr-30 16:34 UTC
Re: [PATCH v4 04/12] xen/arm: support for guest SGI
On Tue, 30 Apr 2013, Ian Campbell wrote:> On Tue, 2013-04-30 at 16:32 +0100, Stefano Stabellini wrote: > > On Tue, 30 Apr 2013, Ian Campbell wrote: > > > On Mon, 2013-04-29 at 18:51 +0100, Stefano Stabellini wrote: > > > > On Mon, 29 Apr 2013, Stefano Stabellini wrote: > > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > > On Mon, 2013-04-29 at 16:47 +0100, Stefano Stabellini wrote: > > > > > > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > > > > > > On Sun, 2013-04-28 at 15:26 +0100, Stefano Stabellini wrote: > > > > > > > > > On Fri, 26 Apr 2013, Ian Campbell wrote: > > > > > > > > > > > + > > > > > > > > > > > + > > > > > > > > > > > + for_each_set_bit( vcpuid, &vcpu_mask, d->max_vcpus ) > > > > > > > > > > > + { > > > > > > > > > > > + if ( vcpuid >= d->max_vcpus || (vt = d->vcpu[vcpuid]) == NULL ) > > > > > > > > > > > > > > > > > > > > Is d->vcpu[vcpuid] == NULL sufficient here? A vcpu which exists but has > > > > > > > > > > not been PSCI''d up will pass this -- what do we do to them? Hopefully we > > > > > > > > > > don''t wake them up? Do we not want to check something like _VPF_down > > > > > > > > > > too? > > > > > > > > > > > > > > > > > > I think you are right, we should test for _VPF_down too > > > > > > > > > > > > > > > > Are there any interesting races here? Seems like there should be, unless > > > > > > > > we hold the domain lock or something? > > > > > > > > > > > > > > > > > > > > > Considering that max_vcpus is static, d->vcpu[vcpuid] has been > > > > > > > allocated at domain creation and that test_bit should be atomic, I don''t > > > > > > > think there are any races here. > > > > > > > What am I missing? > > > > > > > > > > > > The race between checking the bit and acting on the result. > > > > > > > > > > I don''t think that race can cause any problems: if vcpu0 is sending a > > > > > TARGET_OTHERS SGI and vcpu1 is executing psci.cpu_off, depending on who > > > > > wins the race vcpu1 is going to receive the SGI on not, but the result > > > > > is going to be always consistent: if vcpu1 shuts down before the SGI is > > > > > sent, it is not going to receive it (actually it is going to receive it > > > > > if it gets ever woken up), otherwise it is going to receive it before > > > > > shutting down. > > > > > > Did you test the case where the SGI gets delivered to a CPU which was up > > > but now is down? We don''t want to crash because some resource has been > > > freed etc... > > > > I haven''t actually tested it, but no resources are freed: > > vcpu_sleep_nosync only sets a scheduler flag. > > Good, so vcpus are only actually destroyed when we are destroying a > domain. > > Do we need to be checking d->is_dying?I don''t think so: is_dying is used for toolstack driven domain kill operations.
Stefano Stabellini
2013-Apr-30 16:37 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Tue, 30 Apr 2013, Ian Campbell wrote:> On Mon, 2013-04-29 at 17:14 +0100, Stefano Stabellini wrote: > > On Mon, 29 Apr 2013, Ian Campbell wrote: > > > On Fri, 2013-04-26 at 16:28 +0100, Stefano Stabellini wrote: > > > > Introduce a domain wide vtimer initialization function to initialize > > > > the phys_timer and the virt_timer offsets. > > > > > > > > Use the domain phys_timer and virt_timer offsets throughout the vtimer > > > > code instead of the per-vcpu offsets. > > > > > > > > Remove the per-vcpu offsets from struct vtimer altogether. > > > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > > > > > Changes in v4: > > > > - introduce vcpu_domain_init; > > > > - inline phys_timer_base and virt_timer_base in arch_domain; > > > > - use phys_timer_base.offset and virt_timer_base.offset directly in > > > > vtimer code (remove offset field from struct vtimer). > > > > --- > > > > xen/arch/arm/domain.c | 3 +++ > > > > xen/arch/arm/vtimer.c | 26 +++++++++++++++++--------- > > > > xen/arch/arm/vtimer.h | 1 + > > > > xen/include/asm-arm/domain.h | 24 +++++++++++++++--------- > > > > 4 files changed, 36 insertions(+), 18 deletions(-) > > > > > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > > > index f7ec979..f26222a 100644 > > > > --- a/xen/arch/arm/domain.c > > > > +++ b/xen/arch/arm/domain.c > > > > @@ -485,6 +485,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) > > > > if ( (rc = domain_vgic_init(d)) != 0 ) > > > > goto fail; > > > > > > > > + if ( (rc = vcpu_domain_init(d)) != 0 ) > > > > + goto fail; > > > > + > > > > /* Domain 0 gets a real UART not an emulated one */ > > > > if ( d->domain_id && (rc = domain_uart0_init(d)) != 0 ) > > > > goto fail; > > > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > > > index 393aac3..782a255 100644 > > > > --- a/xen/arch/arm/vtimer.c > > > > +++ b/xen/arch/arm/vtimer.c > > > > @@ -44,21 +44,27 @@ static void virt_timer_expired(void *data) > > > > vgic_vcpu_inject_irq(t->v, 27, 1); > > > > } > > > > > > > > +int vcpu_domain_init(struct domain *d) > > > > +{ > > > > + d->arch.phys_timer_base.offset = NOW(); > > > > + d->arch.virt_timer_base.offset = READ_SYSREG64(CNTVCT_EL0) + > > > > + READ_SYSREG64(CNTVOFF_EL2); > > > > > > I know you are just moving this code but I don''t understand how this can > > > make sense. > > > > > > When initialising dom0 these are, AFAICT, uninitialised. > > > > I don''t think so: init_xen_time() is called before building dom0, so NOW() > > should be returning proper values here. > > I meant the reads of CNTVCT_EL0 and CNTVOFF_EL2, especially the second > one which is reset to an unknown value and we don''t set it. > > > > When > > > initialising domU these are whatever the current domain (so, dom0) > > > happens to have here. > > > > The offset is what we use to convert DomU time to Xen time and vice > > versa. > > Initializing phys_timer_base.offset to NOW() means that domU''s physical > > timer starts counting from NOW in terms of Xen time. > > > > Initializing virt_timer_base.offset to CNTVCT_EL0 + CNTVOFF_EL2 > > accomplishes the same thing for vtimer: > > > > offset = CNTVCT_EL0 + CNTVOFF_EL2 > > = Phys Count - CNTVOFF_EL2 + CNTVOFF_EL2 > > = Phys Count > > > > > Also CNTVCT == Phys Count - Virt Offset, so adding Virt offset to it > > > seems like an odd way to do it? > > > > I think the reason for it is that in the old document it wasn''t > > explicitly stated that Phys Count is actually identical to CNTPCT. > > So we could just read CNTPCT but don''t because the docs at the time > didn''t suggest we could?Right, I think so.> By setting offset == phys-count we are making it so that virtual time > for the domain appears to start from approx. 0, is that right?Yes> But then > it ticks along at the same rate as phys time with no accounting for > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > would be like for a clock which is supposed to be consistent across all > VCPUs, or maybe that restriction is only for physical and hypervisor > timers.Right, no accounting. I don''t know how the lost time accounting would look like either.
Ian Campbell
2013-May-01 09:53 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Tue, 2013-04-30 at 17:37 +0100, Stefano Stabellini wrote:> > But then > > it ticks along at the same rate as phys time with no accounting for > > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > > would be like for a clock which is supposed to be consistent across all > > VCPUs, or maybe that restriction is only for physical and hypervisor > > timers. > > Right, no accounting. I don''t know how the lost time accounting would > look like either.I''ve added this to the ARM_TODO wiki. I wonder if the right answer, by analogy with PV time on x86, will be something like: Reading the ARM Physical timer == Raw read of x86 TSC, i.e. you get a raw host system time. Reading the ARM virtual timer == The x86 PV clock protocol (e.g. the tsc*factor+offset), i.e. you get a time source which does not include stolen time and which ticks only when the guest is running (I think this is the x86 semantics, not 100% sure though). We also need to figure out whether we expect virtual time to remain in step across the domain -- if yes (this is what the physical timers do for example) then we need to understand what this means when VCPU0 runs but VCPU1 doesn''t. I don''t know what x86 does here... Ideally we would have a scheme which didn''t require us to emulate either virtual or physical time in the common case (e.g. migration among like systems). Ian.
Stefano Stabellini
2013-May-01 10:28 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, 1 May 2013, Ian Campbell wrote:> On Tue, 2013-04-30 at 17:37 +0100, Stefano Stabellini wrote: > > > But then > > > it ticks along at the same rate as phys time with no accounting for > > > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > > > would be like for a clock which is supposed to be consistent across all > > > VCPUs, or maybe that restriction is only for physical and hypervisor > > > timers. > > > > Right, no accounting. I don''t know how the lost time accounting would > > look like either. > > I''ve added this to the ARM_TODO wiki. > > I wonder if the right answer, by analogy with PV time on x86, will be > something like: > > Reading the ARM Physical timer == Raw read of x86 TSC, i.e. you get a > raw host system time. > > Reading the ARM virtual timer == The x86 PV clock protocol (e.g. the > tsc*factor+offset), i.e. you get a time source which does not include > stolen time and which ticks only when the guest is running (I think this > is the x86 semantics, not 100% sure though).there is another problem there: the "factor" is not available or configurable on ARM, so it can cause problems on VM migration.> We also need to figure out whether we expect virtual time to remain in > step across the domain -- if yes (this is what the physical timers do > for example) then we need to understand what this means when VCPU0 runs > but VCPU1 doesn''t. I don''t know what x86 does here... > > Ideally we would have a scheme which didn''t require us to emulate either > virtual or physical time in the common case (e.g. migration among like > systems).I am thinking that we might have to enable the PV timer on ARM after all.
Ian Campbell
2013-May-01 10:36 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, 2013-05-01 at 11:28 +0100, Stefano Stabellini wrote:> On Wed, 1 May 2013, Ian Campbell wrote: > > On Tue, 2013-04-30 at 17:37 +0100, Stefano Stabellini wrote: > > > > But then > > > > it ticks along at the same rate as phys time with no accounting for > > > > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > > > > would be like for a clock which is supposed to be consistent across all > > > > VCPUs, or maybe that restriction is only for physical and hypervisor > > > > timers. > > > > > > Right, no accounting. I don''t know how the lost time accounting would > > > look like either. > > > > I''ve added this to the ARM_TODO wiki. > > > > I wonder if the right answer, by analogy with PV time on x86, will be > > something like: > > > > Reading the ARM Physical timer == Raw read of x86 TSC, i.e. you get a > > raw host system time. > > > > Reading the ARM virtual timer == The x86 PV clock protocol (e.g. the > > tsc*factor+offset), i.e. you get a time source which does not include > > stolen time and which ticks only when the guest is running (I think this > > is the x86 semantics, not 100% sure though). > > there is another problem there: the "factor" is not available or > configurable on ARM, so it can cause problems on VM migration.Yes, that was what I was getting at with the common case for avoiding being migration among like systems, IOW if you migrate to a system with a different factor (which really == clock frequency, I think) then you need to emulate :-(> > We also need to figure out whether we expect virtual time to remain in > > step across the domain -- if yes (this is what the physical timers do > > for example) then we need to understand what this means when VCPU0 runs > > but VCPU1 doesn''t. I don''t know what x86 does here... > > > > Ideally we would have a scheme which didn''t require us to emulate either > > virtual or physical time in the common case (e.g. migration among like > > systems). > > I am thinking that we might have to enable the PV timer on ARM after > all.It wouldn''t be the *end* of the world and by making use of the availability of both physical and virtual time we may be able to come up with something even better? On x86 we share the pvclock algorithmn with KVM -- this would be something interesting to discuss with them on ARM too I think. Or I suppose we have some PV mechanism to reset the kernels idea of the clock frequency? Ian.
Stefano Stabellini
2013-May-01 11:15 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, 1 May 2013, Ian Campbell wrote:> On Wed, 2013-05-01 at 11:28 +0100, Stefano Stabellini wrote: > > On Wed, 1 May 2013, Ian Campbell wrote: > > > On Tue, 2013-04-30 at 17:37 +0100, Stefano Stabellini wrote: > > > > > But then > > > > > it ticks along at the same rate as phys time with no accounting for > > > > > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > > > > > would be like for a clock which is supposed to be consistent across all > > > > > VCPUs, or maybe that restriction is only for physical and hypervisor > > > > > timers. > > > > > > > > Right, no accounting. I don''t know how the lost time accounting would > > > > look like either. > > > > > > I''ve added this to the ARM_TODO wiki. > > > > > > I wonder if the right answer, by analogy with PV time on x86, will be > > > something like: > > > > > > Reading the ARM Physical timer == Raw read of x86 TSC, i.e. you get a > > > raw host system time. > > > > > > Reading the ARM virtual timer == The x86 PV clock protocol (e.g. the > > > tsc*factor+offset), i.e. you get a time source which does not include > > > stolen time and which ticks only when the guest is running (I think this > > > is the x86 semantics, not 100% sure though). > > > > there is another problem there: the "factor" is not available or > > configurable on ARM, so it can cause problems on VM migration. > > Yes, that was what I was getting at with the common case for avoiding > being migration among like systems, IOW if you migrate to a system with > a different factor (which really == clock frequency, I think) then you > need to emulate :-( > > > > We also need to figure out whether we expect virtual time to remain in > > > step across the domain -- if yes (this is what the physical timers do > > > for example) then we need to understand what this means when VCPU0 runs > > > but VCPU1 doesn''t. I don''t know what x86 does here... > > > > > > Ideally we would have a scheme which didn''t require us to emulate either > > > virtual or physical time in the common case (e.g. migration among like > > > systems). > > > > I am thinking that we might have to enable the PV timer on ARM after > > all. > > It wouldn''t be the *end* of the world and by making use of the > availability of both physical and virtual time we may be able to come up > with something even better? On x86 we share the pvclock algorithmn with > KVM -- this would be something interesting to discuss with them on ARM > too I think. > > Or I suppose we have some PV mechanism to reset the kernels idea of the > clock frequency?I think it is possible: if we exposed the "factor" to the guest and accounted for the stolen ticks in the vtimer offset, the guest would have everything it needs to calculate the time.
Ian Campbell
2013-May-01 11:26 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, 2013-05-01 at 12:15 +0100, Stefano Stabellini wrote:> > Or I suppose we have some PV mechanism to reset the kernels idea of the > > clock frequency? > > I think it is possible: if we exposed the "factor" to the guest and > accounted for the stolen ticks in the vtimer offset, the guest would > have everything it needs to calculate the time.Agreed. Exposing the factor may need to only consist of the guest rereading CNTFRQ after a resume. We can probably take this particular patch based on this understanding of where we are headed (it doesn''t make things worse). Shall we aim to have vtimer offset do the right thing in 4.3 though? Ian.
Konrad Rzeszutek Wilk
2013-May-01 13:28 UTC
Re: [PATCH v4 12/12] xen/arm: initialize virt_timer and phys_timer with the same values on all vcpus
On Wed, May 01, 2013 at 10:53:48AM +0100, Ian Campbell wrote:> On Tue, 2013-04-30 at 17:37 +0100, Stefano Stabellini wrote: > > > But then > > > it ticks along at the same rate as phys time with no accounting for > > > stolen or lost time etc? TBH I''m not even sure what stolen/lost time > > > would be like for a clock which is supposed to be consistent across all > > > VCPUs, or maybe that restriction is only for physical and hypervisor > > > timers. > > > > Right, no accounting. I don''t know how the lost time accounting would > > look like either. > > I''ve added this to the ARM_TODO wiki. > > I wonder if the right answer, by analogy with PV time on x86, will be > something like: > > Reading the ARM Physical timer == Raw read of x86 TSC, i.e. you get a > raw host system time. > > Reading the ARM virtual timer == The x86 PV clock protocol (e.g. the > tsc*factor+offset), i.e. you get a time source which does not include > stolen time and which ticks only when the guest is running (I think this > is the x86 semantics, not 100% sure though). > > We also need to figure out whether we expect virtual time to remain in > step across the domain -- if yes (this is what the physical timers do > for example) then we need to understand what this means when VCPU0 runs > but VCPU1 doesn''t. I don''t know what x86 does here...It uses the VCPU_vcpuruntime to figure out whether it was "offline" and add that to the ''stolen'' or what not delta. The arch/x86/xen/time.c has do_stolen_accounting which does some of this.> > Ideally we would have a scheme which didn''t require us to emulate either > virtual or physical time in the common case (e.g. migration among like > systems). > > Ian. > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >