Stefano Stabellini
2013-May-06 14:50 UTC
[PATCH RFC v2 0/4] xen/arm: CONFIG_PARAVIRT and stolen ticks accounting
Hi all, this patch series introduces stolen ticks accounting for Xen on ARM. Stolen ticks are clocksource ticks that have been "stolen" from the cpu, typically because Linux is running in a virtual machine and the vcpu has been descheduled. To account for these ticks we introduce CONFIG_PARAVIRT and pv_time_ops so that we can make use of: kernel/sched/cputime.c:steal_account_process_tick Stefano Stabellini (4): xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c arm: introduce CONFIG_PARAVIRT and pv_time_ops kernel: missing include in cputime.c xen/arm: account for stolen ticks arch/arm/Kconfig | 9 ++++ arch/arm/include/asm/paravirt.h | 19 ++++++++ arch/arm/kernel/Makefile | 1 + arch/arm/kernel/paravirt.c | 32 ++++++++++++++ arch/arm/xen/enlighten.c | 23 ++++++++++ arch/ia64/xen/time.c | 48 +++------------------ arch/x86/xen/time.c | 76 +-------------------------------- drivers/xen/Makefile | 2 +- drivers/xen/time.c | 91 +++++++++++++++++++++++++++++++++++++++ include/xen/xen-ops.h | 5 ++ kernel/sched/cputime.c | 1 + 11 files changed, 189 insertions(+), 118 deletions(-) create mode 100644 arch/arm/include/asm/paravirt.h create mode 100644 arch/arm/kernel/paravirt.c create mode 100644 drivers/xen/time.c Cheers, Stefano
Stefano Stabellini
2013-May-06 14:51 UTC
[PATCH RFC v2 1/4] xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c
Changes in v2: - leave do_stolen_accounting in arch/x86/xen/time.c; - use the new common functions in arch/ia64/xen/time.c. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: konrad.wilk@oracle.com --- arch/ia64/xen/time.c | 48 +++---------------------- arch/x86/xen/time.c | 76 +---------------------------------------- drivers/xen/Makefile | 2 +- drivers/xen/time.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++ include/xen/xen-ops.h | 5 +++ 5 files changed, 104 insertions(+), 118 deletions(-) create mode 100644 drivers/xen/time.c diff --git a/arch/ia64/xen/time.c b/arch/ia64/xen/time.c index 1f8244a..79a0b8c 100644 --- a/arch/ia64/xen/time.c +++ b/arch/ia64/xen/time.c @@ -34,53 +34,17 @@ #include "../kernel/fsyscall_gtod_data.h" -static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); static DEFINE_PER_CPU(unsigned long, xen_stolen_time); static DEFINE_PER_CPU(unsigned long, xen_blocked_time); /* taken from i386/kernel/time-xen.c */ static void xen_init_missing_ticks_accounting(int cpu) { - struct vcpu_register_runstate_memory_area area; - struct vcpu_runstate_info *runstate = &per_cpu(xen_runstate, cpu); - int rc; + xen_setup_runstate_info(&runstate); - memset(runstate, 0, sizeof(*runstate)); - - area.addr.v = runstate; - rc = HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, cpu, - &area); - WARN_ON(rc && rc != -ENOSYS); - - per_cpu(xen_blocked_time, cpu) = runstate->time[RUNSTATE_blocked]; - per_cpu(xen_stolen_time, cpu) = runstate->time[RUNSTATE_runnable] - + runstate->time[RUNSTATE_offline]; -} - -/* - * Runstate accounting - */ -/* stolen from arch/x86/xen/time.c */ -static void get_runstate_snapshot(struct vcpu_runstate_info *res) -{ - u64 state_time; - struct vcpu_runstate_info *state; - - BUG_ON(preemptible()); - - state = &__get_cpu_var(xen_runstate); - - /* - * The runstate info is always updated by the hypervisor on - * the current CPU, so there''s no need to use anything - * stronger than a compiler barrier when fetching it. - */ - do { - state_time = state->state_entry_time; - rmb(); - *res = *state; - rmb(); - } while (state->state_entry_time != state_time); + per_cpu(xen_blocked_time, cpu) = runstate.time[RUNSTATE_blocked]; + per_cpu(xen_stolen_time, cpu) = runstate.time[RUNSTATE_runnable] + + runstate.time[RUNSTATE_offline]; } #define NS_PER_TICK (1000000000LL/HZ) @@ -94,7 +58,7 @@ consider_steal_time(unsigned long new_itm) struct vcpu_runstate_info runstate; struct task_struct *p = current; - get_runstate_snapshot(&runstate); + xen_get_runstate_snapshot(&runstate); /* * Check for vcpu migration effect @@ -202,7 +166,7 @@ static unsigned long long xen_sched_clock(void) */ now = ia64_native_sched_clock(); - get_runstate_snapshot(&runstate); + xen_get_runstate_snapshot(&runstate); WARN_ON(runstate.state != RUNSTATE_running); diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 0296a95..18d0104 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -30,9 +30,6 @@ #define TIMER_SLOP 100000 #define NS_PER_TICK (1000000000LL / HZ) -/* runstate info updated by Xen */ -static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); - /* snapshots of runstate info */ static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot); @@ -40,77 +37,6 @@ static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot); static DEFINE_PER_CPU(u64, xen_residual_stolen); static DEFINE_PER_CPU(u64, xen_residual_blocked); -/* return an consistent snapshot of 64-bit time/counter value */ -static u64 get64(const u64 *p) -{ - u64 ret; - - if (BITS_PER_LONG < 64) { - u32 *p32 = (u32 *)p; - u32 h, l; - - /* - * Read high then low, and then make sure high is - * still the same; this will only loop if low wraps - * and carries into high. - * XXX some clean way to make this endian-proof? - */ - do { - h = p32[1]; - barrier(); - l = p32[0]; - barrier(); - } while (p32[1] != h); - - ret = (((u64)h) << 32) | l; - } else - ret = *p; - - return ret; -} - -/* - * Runstate accounting - */ -static void get_runstate_snapshot(struct vcpu_runstate_info *res) -{ - u64 state_time; - struct vcpu_runstate_info *state; - - BUG_ON(preemptible()); - - state = &__get_cpu_var(xen_runstate); - - /* - * The runstate info is always updated by the hypervisor on - * the current CPU, so there''s no need to use anything - * stronger than a compiler barrier when fetching it. - */ - do { - state_time = get64(&state->state_entry_time); - barrier(); - *res = *state; - barrier(); - } while (get64(&state->state_entry_time) != state_time); -} - -/* return true when a vcpu could run but has no real cpu to run on */ -bool xen_vcpu_stolen(int vcpu) -{ - return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable; -} - -void xen_setup_runstate_info(int cpu) -{ - struct vcpu_register_runstate_memory_area area; - - area.addr.v = &per_cpu(xen_runstate, cpu); - - if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, - cpu, &area)) - BUG(); -} - static void do_stolen_accounting(void) { struct vcpu_runstate_info state; @@ -118,7 +44,7 @@ static void do_stolen_accounting(void) s64 blocked, runnable, offline, stolen; cputime_t ticks; - get_runstate_snapshot(&state); + xen_get_runstate_snapshot(&state); WARN_ON(state.state != RUNSTATE_running); diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index eabd0ee..2bf461a 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -3,7 +3,7 @@ obj-y += manage.o obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o endif obj-$(CONFIG_X86) += fallback.o -obj-y += grant-table.o features.o events.o balloon.o +obj-y += grant-table.o features.o events.o balloon.o time.o obj-y += xenbus/ nostackp := $(call cc-option, -fno-stack-protector) diff --git a/drivers/xen/time.c b/drivers/xen/time.c new file mode 100644 index 0000000..c2e39d3 --- /dev/null +++ b/drivers/xen/time.c @@ -0,0 +1,91 @@ +/* + * Xen stolen ticks accounting. + */ +#include <linux/kernel.h> +#include <linux/kernel_stat.h> +#include <linux/math64.h> +#include <linux/gfp.h> + +#include <asm/xen/hypervisor.h> +#include <asm/xen/hypercall.h> + +#include <xen/events.h> +#include <xen/features.h> +#include <xen/interface/xen.h> +#include <xen/interface/vcpu.h> +#include <xen/xen-ops.h> + +/* runstate info updated by Xen */ +static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); + +/* return an consistent snapshot of 64-bit time/counter value */ +static u64 get64(const u64 *p) +{ + u64 ret; + + if (BITS_PER_LONG < 64) { + u32 *p32 = (u32 *)p; + u32 h, l; + + /* + * Read high then low, and then make sure high is + * still the same; this will only loop if low wraps + * and carries into high. + * XXX some clean way to make this endian-proof? + */ + do { + h = p32[1]; + barrier(); + l = p32[0]; + barrier(); + } while (p32[1] != h); + + ret = (((u64)h) << 32) | l; + } else + ret = *p; + + return ret; +} + +/* + * Runstate accounting + */ +void xen_get_runstate_snapshot(struct vcpu_runstate_info *res) +{ + u64 state_time; + struct vcpu_runstate_info *state; + + BUG_ON(preemptible()); + + state = &__get_cpu_var(xen_runstate); + + /* + * The runstate info is always updated by the hypervisor on + * the current CPU, so there''s no need to use anything + * stronger than a compiler barrier when fetching it. + */ + do { + state_time = get64(&state->state_entry_time); + barrier(); + *res = *state; + barrier(); + } while (get64(&state->state_entry_time) != state_time); +} + +/* return true when a vcpu could run but has no real cpu to run on */ +bool xen_vcpu_stolen(int vcpu) +{ + return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable; +} + +void xen_setup_runstate_info(int cpu) +{ + struct vcpu_register_runstate_memory_area area; + + area.addr.v = &per_cpu(xen_runstate, cpu); + + if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, + cpu, &area)) + BUG(); +} + diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index d6fe062..4fd4e47 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -3,6 +3,7 @@ #include <linux/percpu.h> #include <asm/xen/interface.h> +#include <xen/interface/vcpu.h> DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu); @@ -16,6 +17,10 @@ void xen_mm_unpin_all(void); void xen_timer_resume(void); void xen_arch_resume(void); +bool xen_vcpu_stolen(int vcpu); +void xen_setup_runstate_info(int cpu); +void xen_get_runstate_snapshot(struct vcpu_runstate_info *res); + int xen_setup_shutdown_event(void); extern unsigned long *xen_contiguous_bitmap; -- 1.7.2.5
Stefano Stabellini
2013-May-06 14:51 UTC
[PATCH RFC v2 2/4] arm: introduce CONFIG_PARAVIRT and pv_time_ops
Introduce CONFIG_PARAVIRT on ARM. The only paravirt interface supported is pv_time_ops.steal_clock. No runtime pvops patching yet. This allows us to make us of steal_account_process_tick for stolen ticks accounting. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: linux@arm.linux.org.uk CC: will.deacon@arm.com CC: nico@linaro.org CC: marc.zyngier@arm.com CC: cov@codeaurora.org CC: arnd@arndb.de CC: olof@lixom.net --- arch/arm/Kconfig | 9 +++++++++ arch/arm/include/asm/paravirt.h | 19 +++++++++++++++++++ arch/arm/kernel/Makefile | 1 + arch/arm/kernel/paravirt.c | 32 ++++++++++++++++++++++++++++++++ 4 files changed, 61 insertions(+), 0 deletions(-) create mode 100644 arch/arm/include/asm/paravirt.h create mode 100644 arch/arm/kernel/paravirt.c diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 344e299..35cb10a 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1887,12 +1887,21 @@ config XEN_DOM0 def_bool y depends on XEN +config PARAVIRT + bool "Enable paravirtualization code" + ---help--- + This changes the kernel so it can modify itself when it is run + under a hypervisor, potentially improving performance significantly + over full virtualization. However, when run without a hypervisor + the kernel is theoretically slower and slightly larger. + config XEN bool "Xen guest support on ARM (EXPERIMENTAL)" depends on ARM && AEABI && OF depends on CPU_V7 && !CPU_V6 depends on !GENERIC_ATOMIC64 select ARM_PSCI + select PARAVIRT help Say Y if you want to run Linux in a Virtual Machine on Xen on ARM. diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h new file mode 100644 index 0000000..3b95bc6 --- /dev/null +++ b/arch/arm/include/asm/paravirt.h @@ -0,0 +1,19 @@ +#ifndef _ASM_ARM_PARAVIRT_H +#define _ASM_ARM_PARAVIRT_H + +struct static_key; +extern struct static_key paravirt_steal_enabled; +extern struct static_key paravirt_steal_rq_enabled; + +struct pv_time_ops { + unsigned long long (*steal_clock)(int cpu); +}; +extern struct pv_time_ops pv_time_ops; + +static inline u64 paravirt_steal_clock(int cpu) +{ + return pv_time_ops.steal_clock(cpu); +} + + +#endif diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index dd9d90a..6764f60 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -86,5 +86,6 @@ ifeq ($(CONFIG_ARM_PSCI),y) obj-y += psci.o obj-$(CONFIG_SMP) += psci_smp.o endif +obj-$(CONFIG_PARAVIRT) += paravirt.o extra-y := $(head-y) vmlinux.lds diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c new file mode 100644 index 0000000..3e73fc8 --- /dev/null +++ b/arch/arm/kernel/paravirt.c @@ -0,0 +1,32 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Copyright (C) 2013 Citrix Systems + * + * Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com> + */ + +#include <linux/export.h> +#include <linux/jump_label.h> +#include <linux/types.h> +#include <asm/paravirt.h> + +struct static_key paravirt_steal_enabled; +struct static_key paravirt_steal_rq_enabled; + +static u64 native_steal_clock(int cpu) +{ + return 0; +} + +struct pv_time_ops pv_time_ops = { + .steal_clock = native_steal_clock, +}; +EXPORT_SYMBOL_GPL(pv_time_ops); -- 1.7.2.5
Stefano Stabellini
2013-May-06 14:51 UTC
[PATCH RFC v2 3/4] kernel: missing include in cputime.c
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: mingo@redhat.com CC: peterz@infradead.org --- kernel/sched/cputime.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index ed12cbb..488a5bf 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -5,6 +5,7 @@ #include <linux/static_key.h> #include <linux/context_tracking.h> #include "sched.h" +#include <asm/paravirt.h> #ifdef CONFIG_IRQ_TIME_ACCOUNTING -- 1.7.2.5
Stefano Stabellini
2013-May-06 14:51 UTC
[PATCH RFC v2 4/4] xen/arm: account for stolen ticks
Register the runstate_memory_area with the hypervisor. Use pv_time_ops.steal_clock to account for stolen ticks. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/xen/enlighten.c | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-) diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c index ee86bfa..2a5cc82 100644 --- a/arch/arm/xen/enlighten.c +++ b/arch/arm/xen/enlighten.c @@ -14,7 +14,10 @@ #include <xen/xen-ops.h> #include <asm/xen/hypervisor.h> #include <asm/xen/hypercall.h> +#include <asm/arch_timer.h> #include <asm/system_misc.h> +#include <asm/paravirt.h> +#include <linux/jump_label.h> #include <linux/interrupt.h> #include <linux/irqreturn.h> #include <linux/module.h> @@ -152,6 +155,20 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, } EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); +unsigned long long xen_stolen_accounting(int cpu) +{ + struct vcpu_runstate_info state; + + if (cpu != get_cpu()) + BUG(); + + xen_get_runstate_snapshot(&state); + + WARN_ON(state.state != RUNSTATE_running); + + return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline]; +} + static void __init xen_percpu_init(void *unused) { struct vcpu_register_vcpu_info info; @@ -170,6 +187,8 @@ static void __init xen_percpu_init(void *unused) BUG(); per_cpu(xen_vcpu, cpu) = vcpup; + xen_setup_runstate_info(cpu); + enable_percpu_irq(xen_events_irq, 0); } @@ -301,6 +320,10 @@ static int __init xen_init_events(void) on_each_cpu(xen_percpu_init, NULL, 0); + pv_time_ops.steal_clock = xen_stolen_accounting; + static_key_slow_inc(¶virt_steal_enabled); + static_key_slow_inc(¶virt_steal_rq_enabled); + return 0; } postcore_initcall(xen_init_events); -- 1.7.2.5
Ian Campbell
2013-May-07 09:17 UTC
Re: [Xen-devel] [PATCH RFC v2 4/4] xen/arm: account for stolen ticks
On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote:> Register the runstate_memory_area with the hypervisor. > Use pv_time_ops.steal_clock to account for stolen ticks. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > arch/arm/xen/enlighten.c | 23 +++++++++++++++++++++++ > 1 files changed, 23 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > index ee86bfa..2a5cc82 100644 > --- a/arch/arm/xen/enlighten.c > +++ b/arch/arm/xen/enlighten.c > @@ -14,7 +14,10 @@ > #include <xen/xen-ops.h> > #include <asm/xen/hypervisor.h> > #include <asm/xen/hypercall.h> > +#include <asm/arch_timer.h> > #include <asm/system_misc.h> > +#include <asm/paravirt.h> > +#include <linux/jump_label.h> > #include <linux/interrupt.h> > #include <linux/irqreturn.h> > #include <linux/module.h> > @@ -152,6 +155,20 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, > } > EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); > > +unsigned long long xen_stolen_accounting(int cpu) > +{ > + struct vcpu_runstate_info state; > + > + if (cpu != get_cpu())get_cpu disables preempt, so you need a matching put_cpu. But actually I think you just want smp_processor_id and you probably want the BUG_ON form to get unlikely etc. That said, you don''t use cpu for anything else, so why not drop it entirely?> + BUG(); > + > + xen_get_runstate_snapshot(&state); > + > + WARN_ON(state.state != RUNSTATE_running); > + > + return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline]; > +} > + > static void __init xen_percpu_init(void *unused) > { > struct vcpu_register_vcpu_info info; > @@ -170,6 +187,8 @@ static void __init xen_percpu_init(void *unused) > BUG(); > per_cpu(xen_vcpu, cpu) = vcpup; > > + xen_setup_runstate_info(cpu); > + > enable_percpu_irq(xen_events_irq, 0); > } > > @@ -301,6 +320,10 @@ static int __init xen_init_events(void) > > on_each_cpu(xen_percpu_init, NULL, 0); > > + pv_time_ops.steal_clock = xen_stolen_accounting; > + static_key_slow_inc(¶virt_steal_enabled); > + static_key_slow_inc(¶virt_steal_rq_enabled);We don''t seem to do this on x86 -- is that a bug on x86 on Xen?> + > return 0; > } > postcore_initcall(xen_init_events);
Ian Campbell
2013-May-07 09:23 UTC
Re: [Xen-devel] [PATCH RFC v2 2/4] arm: introduce CONFIG_PARAVIRT and pv_time_ops
On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote:> Introduce CONFIG_PARAVIRT on ARM.What about PARAVIRT_TIME_ACCOUNTING? I''m not sure what it is but it looks like a more lightweight version of pv stolen time?> The only paravirt interface supported is pv_time_ops.steal_clock. > No runtime pvops patching yet.Or indeed ever, I think. The use cases for patching on x86 are not things which carry over to ARM with virt extensions.> This allows us to make us of steal_account_process_tick for stolen ticks > accounting. > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: linux@arm.linux.org.uk > CC: will.deacon@arm.com > CC: nico@linaro.org > CC: marc.zyngier@arm.com > CC: cov@codeaurora.org > CC: arnd@arndb.de > CC: olof@lixom.net > --- > arch/arm/Kconfig | 9 +++++++++ > arch/arm/include/asm/paravirt.h | 19 +++++++++++++++++++ > arch/arm/kernel/Makefile | 1 + > arch/arm/kernel/paravirt.c | 32 ++++++++++++++++++++++++++++++++ > 4 files changed, 61 insertions(+), 0 deletions(-) > create mode 100644 arch/arm/include/asm/paravirt.h > create mode 100644 arch/arm/kernel/paravirt.c > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index 344e299..35cb10a 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -1887,12 +1887,21 @@ config XEN_DOM0 > def_bool y > depends on XEN > > +config PARAVIRT > + bool "Enable paravirtualization code" > + ---help--- > + This changes the kernel so it can modify itself when it is run > + under a hypervisor, potentially improving performance significantly > + over full virtualization. However, when run without a hypervisor > + the kernel is theoretically slower and slightly larger.I''m not sure this description (carried over from x86) are really true for ARM. e.g. the downsides there when not virtualised are in the PV MMU (pte operations) and interrupt masking stuff, which should never make its way onto ARM. I think it would be a worthwhile change to refactor the stolen time handling out from under the rather wide reaching umbrella of the x86 PARAVIRT option. (assuming PARAVIRT_TIME_ACCOUNTING isn''t already that)> + > config XEN > bool "Xen guest support on ARM (EXPERIMENTAL)" > depends on ARM && AEABI && OF > depends on CPU_V7 && !CPU_V6 > depends on !GENERIC_ATOMIC64 > select ARM_PSCI > + select PARAVIRT > help > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM. > > diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h > new file mode 100644 > index 0000000..3b95bc6 > --- /dev/null > +++ b/arch/arm/include/asm/paravirt.h > @@ -0,0 +1,19 @@ > +#ifndef _ASM_ARM_PARAVIRT_H > +#define _ASM_ARM_PARAVIRT_H > + > +struct static_key; > +extern struct static_key paravirt_steal_enabled; > +extern struct static_key paravirt_steal_rq_enabled; > + > +struct pv_time_ops { > + unsigned long long (*steal_clock)(int cpu); > +}; > +extern struct pv_time_ops pv_time_ops; > + > +static inline u64 paravirt_steal_clock(int cpu) > +{ > + return pv_time_ops.steal_clock(cpu); > +} > + > + > +#endif > diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile > index dd9d90a..6764f60 100644 > --- a/arch/arm/kernel/Makefile > +++ b/arch/arm/kernel/Makefile > @@ -86,5 +86,6 @@ ifeq ($(CONFIG_ARM_PSCI),y) > obj-y += psci.o > obj-$(CONFIG_SMP) += psci_smp.o > endif > +obj-$(CONFIG_PARAVIRT) += paravirt.o > > extra-y := $(head-y) vmlinux.lds > diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c > new file mode 100644 > index 0000000..3e73fc8 > --- /dev/null > +++ b/arch/arm/kernel/paravirt.c > @@ -0,0 +1,32 @@ > +/* > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * Copyright (C) 2013 Citrix Systems > + * > + * Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > + */ > + > +#include <linux/export.h> > +#include <linux/jump_label.h> > +#include <linux/types.h> > +#include <asm/paravirt.h> > + > +struct static_key paravirt_steal_enabled; > +struct static_key paravirt_steal_rq_enabled; > + > +static u64 native_steal_clock(int cpu) > +{ > + return 0; > +} > + > +struct pv_time_ops pv_time_ops = { > + .steal_clock = native_steal_clock, > +}; > +EXPORT_SYMBOL_GPL(pv_time_ops);This foo_ops.bar and native_bar thing is a bit of a hangover from the paravirt patching infrastructure on x86 and it doesn''t really apply here. Given that the call to paravirt_steal_time call is already protected by this static_key stuff I think it would be safe to leave the hook as NULL in the case where it is unused. Given all the different clock sources on ARM is there not an existing ops struct where this could live? Ian.
Stefano Stabellini
2013-May-07 11:58 UTC
Re: [Xen-devel] [PATCH RFC v2 4/4] xen/arm: account for stolen ticks
On Tue, 7 May 2013, Ian Campbell wrote:> On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote: > > Register the runstate_memory_area with the hypervisor. > > Use pv_time_ops.steal_clock to account for stolen ticks. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > arch/arm/xen/enlighten.c | 23 +++++++++++++++++++++++ > > 1 files changed, 23 insertions(+), 0 deletions(-) > > > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > index ee86bfa..2a5cc82 100644 > > --- a/arch/arm/xen/enlighten.c > > +++ b/arch/arm/xen/enlighten.c > > @@ -14,7 +14,10 @@ > > #include <xen/xen-ops.h> > > #include <asm/xen/hypervisor.h> > > #include <asm/xen/hypercall.h> > > +#include <asm/arch_timer.h> > > #include <asm/system_misc.h> > > +#include <asm/paravirt.h> > > +#include <linux/jump_label.h> > > #include <linux/interrupt.h> > > #include <linux/irqreturn.h> > > #include <linux/module.h> > > @@ -152,6 +155,20 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, > > } > > EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); > > > > +unsigned long long xen_stolen_accounting(int cpu) > > +{ > > + struct vcpu_runstate_info state; > > + > > + if (cpu != get_cpu()) > > get_cpu disables preempt, so you need a matching put_cpu.Oops, thanks.> But actually I think you just want smp_processor_id and you probably > want the BUG_ON form to get unlikely etc. > > That said, you don''t use cpu for anything else, so why not drop it > entirely?Ah, that''s right, legacy of the past. I''ll do that.> > + BUG(); > > + > > + xen_get_runstate_snapshot(&state); > > + > > + WARN_ON(state.state != RUNSTATE_running); > > + > > + return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline]; > > +} > > + > > static void __init xen_percpu_init(void *unused) > > { > > struct vcpu_register_vcpu_info info; > > @@ -170,6 +187,8 @@ static void __init xen_percpu_init(void *unused) > > BUG(); > > per_cpu(xen_vcpu, cpu) = vcpup; > > > > + xen_setup_runstate_info(cpu); > > + > > enable_percpu_irq(xen_events_irq, 0); > > } > > > > @@ -301,6 +320,10 @@ static int __init xen_init_events(void) > > > > on_each_cpu(xen_percpu_init, NULL, 0); > > > > + pv_time_ops.steal_clock = xen_stolen_accounting; > > + static_key_slow_inc(¶virt_steal_enabled); > > + static_key_slow_inc(¶virt_steal_rq_enabled); > > We don''t seem to do this on x86 -- is that a bug on x86 on Xen?On x86 we do all the accounting in do_stolen_accounting, called from our own interrupt handler (xen_timer_interrupt). I don''t think we would gain anything by using the common infrastructure, we would actually loose the idle ticks accounting we do there. Speaking of which, I don''t think that pv_time_ops.steal_clock would properly increase CPUTIME_IDLE the way we do in do_stolen_accounting. How much of an issue is that?
Stefano Stabellini
2013-May-07 12:15 UTC
Re: [Xen-devel] [PATCH RFC v2 2/4] arm: introduce CONFIG_PARAVIRT and pv_time_ops
On Tue, 7 May 2013, Ian Campbell wrote:> On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote: > > Introduce CONFIG_PARAVIRT on ARM. > > What about PARAVIRT_TIME_ACCOUNTING? I''m not sure what it is but it > looks like a more lightweight version of pv stolen time?PARAVIRT_TIME_ACCOUNTING selects PARAVIRT on x86 :-)> > The only paravirt interface supported is pv_time_ops.steal_clock. > > No runtime pvops patching yet. > > Or indeed ever, I think. The use cases for patching on x86 are not > things which carry over to ARM with virt extensions.Agreed> > This allows us to make us of steal_account_process_tick for stolen ticks > > accounting. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > CC: linux@arm.linux.org.uk > > CC: will.deacon@arm.com > > CC: nico@linaro.org > > CC: marc.zyngier@arm.com > > CC: cov@codeaurora.org > > CC: arnd@arndb.de > > CC: olof@lixom.net > > --- > > arch/arm/Kconfig | 9 +++++++++ > > arch/arm/include/asm/paravirt.h | 19 +++++++++++++++++++ > > arch/arm/kernel/Makefile | 1 + > > arch/arm/kernel/paravirt.c | 32 ++++++++++++++++++++++++++++++++ > > 4 files changed, 61 insertions(+), 0 deletions(-) > > create mode 100644 arch/arm/include/asm/paravirt.h > > create mode 100644 arch/arm/kernel/paravirt.c > > > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > > index 344e299..35cb10a 100644 > > --- a/arch/arm/Kconfig > > +++ b/arch/arm/Kconfig > > @@ -1887,12 +1887,21 @@ config XEN_DOM0 > > def_bool y > > depends on XEN > > > > +config PARAVIRT > > + bool "Enable paravirtualization code" > > + ---help--- > > + This changes the kernel so it can modify itself when it is run > > + under a hypervisor, potentially improving performance significantly > > + over full virtualization. However, when run without a hypervisor > > + the kernel is theoretically slower and slightly larger. > > I''m not sure this description (carried over from x86) are really true > for ARM. e.g. the downsides there when not virtualised are in the PV MMU > (pte operations) and interrupt masking stuff, which should never make > its way onto ARM.Right> I think it would be a worthwhile change to refactor the stolen time > handling out from under the rather wide reaching umbrella of the x86 > PARAVIRT option. (assuming PARAVIRT_TIME_ACCOUNTING isn''t already that)Actually PARAVIRT doesn''t mean much in common code, the only thing it covers is stolen time. What I mean to say is that just because we are introducing something called "PARAVIRT" on ARM, it doesn''t mean that it has to come with all sort of baggage.> > + > > config XEN > > bool "Xen guest support on ARM (EXPERIMENTAL)" > > depends on ARM && AEABI && OF > > depends on CPU_V7 && !CPU_V6 > > depends on !GENERIC_ATOMIC64 > > select ARM_PSCI > > + select PARAVIRT > > help > > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM. > > > > diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h > > new file mode 100644 > > index 0000000..3b95bc6 > > --- /dev/null > > +++ b/arch/arm/include/asm/paravirt.h > > @@ -0,0 +1,19 @@ > > +#ifndef _ASM_ARM_PARAVIRT_H > > +#define _ASM_ARM_PARAVIRT_H > > + > > +struct static_key; > > +extern struct static_key paravirt_steal_enabled; > > +extern struct static_key paravirt_steal_rq_enabled; > > + > > +struct pv_time_ops { > > + unsigned long long (*steal_clock)(int cpu); > > +}; > > +extern struct pv_time_ops pv_time_ops; > > + > > +static inline u64 paravirt_steal_clock(int cpu) > > +{ > > + return pv_time_ops.steal_clock(cpu); > > +} > > + > > + > > +#endif > > diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile > > index dd9d90a..6764f60 100644 > > --- a/arch/arm/kernel/Makefile > > +++ b/arch/arm/kernel/Makefile > > @@ -86,5 +86,6 @@ ifeq ($(CONFIG_ARM_PSCI),y) > > obj-y += psci.o > > obj-$(CONFIG_SMP) += psci_smp.o > > endif > > +obj-$(CONFIG_PARAVIRT) += paravirt.o > > > > extra-y := $(head-y) vmlinux.lds > > diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c > > new file mode 100644 > > index 0000000..3e73fc8 > > --- /dev/null > > +++ b/arch/arm/kernel/paravirt.c > > @@ -0,0 +1,32 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License version 2 as > > + * published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * Copyright (C) 2013 Citrix Systems > > + * > > + * Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > + */ > > + > > +#include <linux/export.h> > > +#include <linux/jump_label.h> > > +#include <linux/types.h> > > +#include <asm/paravirt.h> > > + > > +struct static_key paravirt_steal_enabled; > > +struct static_key paravirt_steal_rq_enabled; > > + > > +static u64 native_steal_clock(int cpu) > > +{ > > + return 0; > > +} > > + > > +struct pv_time_ops pv_time_ops = { > > + .steal_clock = native_steal_clock, > > +}; > > +EXPORT_SYMBOL_GPL(pv_time_ops); > > This foo_ops.bar and native_bar thing is a bit of a hangover from the > paravirt patching infrastructure on x86 and it doesn''t really apply > here. > > Given that the call to paravirt_steal_time call is already protected by > this static_key stuff I think it would be safe to leave the hook as NULL > in the case where it is unused.Good point> Given all the different clock sources on ARM is there not an existing > ops struct where this could live?I am not sure, but I would be happy to move it something arch-specific.
Ian Campbell
2013-May-07 12:56 UTC
Re: [Xen-devel] [PATCH RFC v2 2/4] arm: introduce CONFIG_PARAVIRT and pv_time_ops
On Tue, 2013-05-07 at 13:15 +0100, Stefano Stabellini wrote:> On Tue, 7 May 2013, Ian Campbell wrote: > > On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote: > > > Introduce CONFIG_PARAVIRT on ARM. > > > > What about PARAVIRT_TIME_ACCOUNTING? I''m not sure what it is but it > > looks like a more lightweight version of pv stolen time? > > PARAVIRT_TIME_ACCOUNTING selects PARAVIRT on x86 :-)Ah, that''s maybe what confused me. TBH its not at all clear to me what distinction the core code is trying to make with those two options, but do we not also want/need PARAVIRT_TIME_ACCOUNTING? Having reread the help text it seems to be some sort of "more accurate" accounting?> > I think it would be a worthwhile change to refactor the stolen time > > handling out from under the rather wide reaching umbrella of the x86 > > PARAVIRT option. (assuming PARAVIRT_TIME_ACCOUNTING isn''t already that) > > Actually PARAVIRT doesn''t mean much in common code, the only thing it > covers is stolen time. > What I mean to say is that just because we are introducing something > called "PARAVIRT" on ARM, it doesn''t mean that it has to come with all > sort of baggage.I was more concerned with perceived baggage than actual trunks full of skeletons. Ian.
Ian Campbell
2013-May-07 12:56 UTC
Re: [Xen-devel] [PATCH RFC v2 4/4] xen/arm: account for stolen ticks
> > > @@ -301,6 +320,10 @@ static int __init xen_init_events(void) > > > > > > on_each_cpu(xen_percpu_init, NULL, 0); > > > > > > + pv_time_ops.steal_clock = xen_stolen_accounting; > > > + static_key_slow_inc(¶virt_steal_enabled); > > > + static_key_slow_inc(¶virt_steal_rq_enabled); > > > > We don''t seem to do this on x86 -- is that a bug on x86 on Xen? > > On x86 we do all the accounting in do_stolen_accounting, called from our > own interrupt handler (xen_timer_interrupt). > I don''t think we would gain anything by using the common infrastructure, > we would actually loose the idle ticks accounting we do there. > > Speaking of which, I don''t think that pv_time_ops.steal_clock would > properly increase CPUTIME_IDLE the way we do in do_stolen_accounting. > > How much of an issue is that?Doesn''t the generic account_idle_time handle this? Ian.
Konrad Rzeszutek Wilk
2013-May-07 13:33 UTC
Re: [Xen-devel] [PATCH RFC v2 4/4] xen/arm: account for stolen ticks
On Tue, May 07, 2013 at 12:58:32PM +0100, Stefano Stabellini wrote:> On Tue, 7 May 2013, Ian Campbell wrote: > > On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote: > > > Register the runstate_memory_area with the hypervisor. > > > Use pv_time_ops.steal_clock to account for stolen ticks. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > --- > > > arch/arm/xen/enlighten.c | 23 +++++++++++++++++++++++ > > > 1 files changed, 23 insertions(+), 0 deletions(-) > > > > > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > > index ee86bfa..2a5cc82 100644 > > > --- a/arch/arm/xen/enlighten.c > > > +++ b/arch/arm/xen/enlighten.c > > > @@ -14,7 +14,10 @@ > > > #include <xen/xen-ops.h> > > > #include <asm/xen/hypervisor.h> > > > #include <asm/xen/hypercall.h> > > > +#include <asm/arch_timer.h> > > > #include <asm/system_misc.h> > > > +#include <asm/paravirt.h> > > > +#include <linux/jump_label.h> > > > #include <linux/interrupt.h> > > > #include <linux/irqreturn.h> > > > #include <linux/module.h> > > > @@ -152,6 +155,20 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, > > > } > > > EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); > > > > > > +unsigned long long xen_stolen_accounting(int cpu) > > > +{ > > > + struct vcpu_runstate_info state; > > > + > > > + if (cpu != get_cpu()) > > > > get_cpu disables preempt, so you need a matching put_cpu. > > Oops, thanks. > > > > But actually I think you just want smp_processor_id and you probably > > want the BUG_ON form to get unlikely etc. > > > > That said, you don''t use cpu for anything else, so why not drop it > > entirely? > > Ah, that''s right, legacy of the past. I''ll do that. > > > > > + BUG(); > > > + > > > + xen_get_runstate_snapshot(&state); > > > + > > > + WARN_ON(state.state != RUNSTATE_running); > > > + > > > + return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline]; > > > +} > > > + > > > static void __init xen_percpu_init(void *unused) > > > { > > > struct vcpu_register_vcpu_info info; > > > @@ -170,6 +187,8 @@ static void __init xen_percpu_init(void *unused) > > > BUG(); > > > per_cpu(xen_vcpu, cpu) = vcpup; > > > > > > + xen_setup_runstate_info(cpu); > > > + > > > enable_percpu_irq(xen_events_irq, 0); > > > } > > > > > > @@ -301,6 +320,10 @@ static int __init xen_init_events(void) > > > > > > on_each_cpu(xen_percpu_init, NULL, 0); > > > > > > + pv_time_ops.steal_clock = xen_stolen_accounting; > > > + static_key_slow_inc(¶virt_steal_enabled); > > > + static_key_slow_inc(¶virt_steal_rq_enabled); > > > > We don''t seem to do this on x86 -- is that a bug on x86 on Xen? > > On x86 we do all the accounting in do_stolen_accounting, called from our > own interrupt handler (xen_timer_interrupt). > I don''t think we would gain anything by using the common infrastructure, > we would actually loose the idle ticks accounting we do there. > > Speaking of which, I don''t think that pv_time_ops.steal_clock would > properly increase CPUTIME_IDLE the way we do in do_stolen_accounting. > > How much of an issue is that?Well, if you look in https://git.kernel.org/cgit/linux/kernel/git/konrad/xen.git/log/?h=devel/pvtime.v1.1 I tried to do use those two asm goto keys but it all started complaining to me during bootup with some WARNs. Never figured out why, but if somebody is interested in taking a look ...
Stefano Stabellini
2013-May-07 13:34 UTC
Re: [Xen-devel] [PATCH RFC v2 2/4] arm: introduce CONFIG_PARAVIRT and pv_time_ops
On Tue, 7 May 2013, Ian Campbell wrote:> On Tue, 2013-05-07 at 13:15 +0100, Stefano Stabellini wrote: > > On Tue, 7 May 2013, Ian Campbell wrote: > > > On Mon, 2013-05-06 at 15:51 +0100, Stefano Stabellini wrote: > > > > Introduce CONFIG_PARAVIRT on ARM. > > > > > > What about PARAVIRT_TIME_ACCOUNTING? I''m not sure what it is but it > > > looks like a more lightweight version of pv stolen time? > > > > PARAVIRT_TIME_ACCOUNTING selects PARAVIRT on x86 :-) > > Ah, that''s maybe what confused me. > > TBH its not at all clear to me what distinction the core code is trying > to make with those two options, but do we not also want/need > PARAVIRT_TIME_ACCOUNTING? Having reread the help text it seems to be > some sort of "more accurate" accounting?It is not clear to me either. I think that you are right, we probably want PARAVIRT_TIME_ACCOUNTING too.
Stefano Stabellini
2013-May-07 13:49 UTC
Re: [Xen-devel] [PATCH RFC v2 4/4] xen/arm: account for stolen ticks
On Tue, 7 May 2013, Ian Campbell wrote:> > > > @@ -301,6 +320,10 @@ static int __init xen_init_events(void) > > > > > > > > on_each_cpu(xen_percpu_init, NULL, 0); > > > > > > > > + pv_time_ops.steal_clock = xen_stolen_accounting; > > > > + static_key_slow_inc(¶virt_steal_enabled); > > > > + static_key_slow_inc(¶virt_steal_rq_enabled); > > > > > > We don''t seem to do this on x86 -- is that a bug on x86 on Xen? > > > > On x86 we do all the accounting in do_stolen_accounting, called from our > > own interrupt handler (xen_timer_interrupt). > > I don''t think we would gain anything by using the common infrastructure, > > we would actually loose the idle ticks accounting we do there. > > > > Speaking of which, I don''t think that pv_time_ops.steal_clock would > > properly increase CPUTIME_IDLE the way we do in do_stolen_accounting. > > > > How much of an issue is that? > > Doesn''t the generic account_idle_time handle this?AFAICT only if the rq is idle, while do_stolen_accounting would account for ticks in RUNSTATE_blocked