Hi all, this small patch series introduces stolen ticks accounting for Xen on ARM. Stolen ticks are clocksource ticks that have been "stolen" from the cpu, typically because Linux is running in a virtual machine and the vcpu has been descheduled. Stefano Stabellini (3): arm_arch_timer: introduce arch_timer_stolen_ticks xen: move do_stolen_accounting to drivers/xen/time.c xen/arm: account for stolen ticks arch/arm/include/asm/arch_timer.h | 5 + arch/arm/xen/enlighten.c | 4 + arch/x86/xen/time.c | 127 +------------------------------ drivers/clocksource/arm_arch_timer.c | 6 ++ drivers/xen/Makefile | 2 +- drivers/xen/time.c | 142 ++++++++++++++++++++++++++++++++++ include/xen/xen-ops.h | 4 + 7 files changed, 163 insertions(+), 127 deletions(-) create mode 100644 drivers/xen/time.c git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git lost_ticks_1 Cheers, Stefano
Stefano Stabellini
2013-May-01 19:27 UTC
[PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
Introduce a function, called arch_timer_stolen_ticks, called from the arch_timer interrupt handler to account for stolen ticks. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: linux@arm.linux.org.uk CC: john.stultz@linaro.org CC: catalin.marinas@arm.com CC: marc.zyngier@arm.com --- arch/arm/include/asm/arch_timer.h | 5 +++++ drivers/clocksource/arm_arch_timer.c | 6 ++++++ 2 files changed, 11 insertions(+), 0 deletions(-) diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h index 7ade91d..30db413 100644 --- a/arch/arm/include/asm/arch_timer.h +++ b/arch/arm/include/asm/arch_timer.h @@ -13,6 +13,11 @@ int arch_timer_of_register(void); int arch_timer_sched_clock_init(void); +/* per-platform function to calculate stolen ticks (clock cycles stolen + * to the vcpu by the hypervisor). + */ +extern void (*arch_timer_stolen_ticks)(void); + /* * These register accessors are marked inline so the compiler can * nicely work out which register we want, and chuck away the rest of diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index d7ad425..d1ddf0b 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -37,6 +37,8 @@ static int arch_timer_ppi[MAX_TIMER_PPI]; static struct clock_event_device __percpu *arch_timer_evt; +void (*arch_timer_stolen_ticks)(void); + static bool arch_timer_use_virtual = true; /* @@ -52,6 +54,10 @@ static inline irqreturn_t timer_handler(const int access, ctrl |= ARCH_TIMER_CTRL_IT_MASK; arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl); evt->event_handler(evt); + + if ( arch_timer_stolen_ticks != NULL ) + arch_timer_stolen_ticks(); + return IRQ_HANDLED; } -- 1.7.2.5
Stefano Stabellini
2013-May-01 19:27 UTC
[PATCH 2/3] xen: move do_stolen_accounting to drivers/xen/time.c
Move do_stolen_accounting, xen_vcpu_stolen and related functions and static variables to common code (drivers/xen/time.c). Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/x86/xen/time.c | 127 +------------------------------------------- drivers/xen/Makefile | 2 +- drivers/xen/time.c | 142 +++++++++++++++++++++++++++++++++++++++++++++++++ include/xen/xen-ops.h | 4 ++ 4 files changed, 148 insertions(+), 127 deletions(-) create mode 100644 drivers/xen/time.c diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 0296a95..278b7c2 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -28,131 +28,6 @@ /* Xen may fire a timer up to this many ns early */ #define TIMER_SLOP 100000 -#define NS_PER_TICK (1000000000LL / HZ) - -/* runstate info updated by Xen */ -static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); - -/* snapshots of runstate info */ -static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot); - -/* unused ns of stolen and blocked time */ -static DEFINE_PER_CPU(u64, xen_residual_stolen); -static DEFINE_PER_CPU(u64, xen_residual_blocked); - -/* return an consistent snapshot of 64-bit time/counter value */ -static u64 get64(const u64 *p) -{ - u64 ret; - - if (BITS_PER_LONG < 64) { - u32 *p32 = (u32 *)p; - u32 h, l; - - /* - * Read high then low, and then make sure high is - * still the same; this will only loop if low wraps - * and carries into high. - * XXX some clean way to make this endian-proof? - */ - do { - h = p32[1]; - barrier(); - l = p32[0]; - barrier(); - } while (p32[1] != h); - - ret = (((u64)h) << 32) | l; - } else - ret = *p; - - return ret; -} - -/* - * Runstate accounting - */ -static void get_runstate_snapshot(struct vcpu_runstate_info *res) -{ - u64 state_time; - struct vcpu_runstate_info *state; - - BUG_ON(preemptible()); - - state = &__get_cpu_var(xen_runstate); - - /* - * The runstate info is always updated by the hypervisor on - * the current CPU, so there''s no need to use anything - * stronger than a compiler barrier when fetching it. - */ - do { - state_time = get64(&state->state_entry_time); - barrier(); - *res = *state; - barrier(); - } while (get64(&state->state_entry_time) != state_time); -} - -/* return true when a vcpu could run but has no real cpu to run on */ -bool xen_vcpu_stolen(int vcpu) -{ - return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable; -} - -void xen_setup_runstate_info(int cpu) -{ - struct vcpu_register_runstate_memory_area area; - - area.addr.v = &per_cpu(xen_runstate, cpu); - - if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, - cpu, &area)) - BUG(); -} - -static void do_stolen_accounting(void) -{ - struct vcpu_runstate_info state; - struct vcpu_runstate_info *snap; - s64 blocked, runnable, offline, stolen; - cputime_t ticks; - - get_runstate_snapshot(&state); - - WARN_ON(state.state != RUNSTATE_running); - - snap = &__get_cpu_var(xen_runstate_snapshot); - - /* work out how much time the VCPU has not been runn*ing* */ - blocked = state.time[RUNSTATE_blocked] - snap->time[RUNSTATE_blocked]; - runnable = state.time[RUNSTATE_runnable] - snap->time[RUNSTATE_runnable]; - offline = state.time[RUNSTATE_offline] - snap->time[RUNSTATE_offline]; - - *snap = state; - - /* Add the appropriate number of ticks of stolen time, - including any left-overs from last time. */ - stolen = runnable + offline + __this_cpu_read(xen_residual_stolen); - - if (stolen < 0) - stolen = 0; - - ticks = iter_div_u64_rem(stolen, NS_PER_TICK, &stolen); - __this_cpu_write(xen_residual_stolen, stolen); - account_steal_ticks(ticks); - - /* Add the appropriate number of ticks of blocked time, - including any left-overs from last time. */ - blocked += __this_cpu_read(xen_residual_blocked); - - if (blocked < 0) - blocked = 0; - - ticks = iter_div_u64_rem(blocked, NS_PER_TICK, &blocked); - __this_cpu_write(xen_residual_blocked, blocked); - account_idle_ticks(ticks); -} /* Get the TSC speed from Xen */ static unsigned long xen_tsc_khz(void) @@ -390,7 +265,7 @@ static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) ret = IRQ_HANDLED; } - do_stolen_accounting(); + xen_stolen_accounting(); return ret; } diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index eabd0ee..2bf461a 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -3,7 +3,7 @@ obj-y += manage.o obj-$(CONFIG_HOTPLUG_CPU) += cpu_hotplug.o endif obj-$(CONFIG_X86) += fallback.o -obj-y += grant-table.o features.o events.o balloon.o +obj-y += grant-table.o features.o events.o balloon.o time.o obj-y += xenbus/ nostackp := $(call cc-option, -fno-stack-protector) diff --git a/drivers/xen/time.c b/drivers/xen/time.c new file mode 100644 index 0000000..d00ed24 --- /dev/null +++ b/drivers/xen/time.c @@ -0,0 +1,142 @@ +/* + * Xen stolen ticks accounting. + */ +#include <linux/kernel.h> +#include <linux/kernel_stat.h> +#include <linux/math64.h> +#include <linux/gfp.h> + +#include <asm/xen/hypervisor.h> +#include <asm/xen/hypercall.h> + +#include <xen/events.h> +#include <xen/features.h> +#include <xen/interface/xen.h> +#include <xen/interface/vcpu.h> +#include <xen/xen-ops.h> + +#define NS_PER_TICK (1000000000LL / HZ) + +/* runstate info updated by Xen */ +static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate); + +/* snapshots of runstate info */ +static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot); + +/* unused ns of stolen and blocked time */ +static DEFINE_PER_CPU(u64, xen_residual_stolen); +static DEFINE_PER_CPU(u64, xen_residual_blocked); + +/* return an consistent snapshot of 64-bit time/counter value */ +static u64 get64(const u64 *p) +{ + u64 ret; + + if (BITS_PER_LONG < 64) { + u32 *p32 = (u32 *)p; + u32 h, l; + + /* + * Read high then low, and then make sure high is + * still the same; this will only loop if low wraps + * and carries into high. + * XXX some clean way to make this endian-proof? + */ + do { + h = p32[1]; + barrier(); + l = p32[0]; + barrier(); + } while (p32[1] != h); + + ret = (((u64)h) << 32) | l; + } else + ret = *p; + + return ret; +} + +/* + * Runstate accounting + */ +static void get_runstate_snapshot(struct vcpu_runstate_info *res) +{ + u64 state_time; + struct vcpu_runstate_info *state; + + BUG_ON(preemptible()); + + state = &__get_cpu_var(xen_runstate); + + /* + * The runstate info is always updated by the hypervisor on + * the current CPU, so there''s no need to use anything + * stronger than a compiler barrier when fetching it. + */ + do { + state_time = get64(&state->state_entry_time); + barrier(); + *res = *state; + barrier(); + } while (get64(&state->state_entry_time) != state_time); +} + +/* return true when a vcpu could run but has no real cpu to run on */ +bool xen_vcpu_stolen(int vcpu) +{ + return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable; +} + +void xen_setup_runstate_info(int cpu) +{ + struct vcpu_register_runstate_memory_area area; + + area.addr.v = &per_cpu(xen_runstate, cpu); + + if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, + cpu, &area)) + BUG(); +} + +void xen_stolen_accounting(void) +{ + struct vcpu_runstate_info state; + struct vcpu_runstate_info *snap; + s64 blocked, runnable, offline, stolen; + cputime_t ticks; + + get_runstate_snapshot(&state); + + WARN_ON(state.state != RUNSTATE_running); + + snap = &__get_cpu_var(xen_runstate_snapshot); + + /* work out how much time the VCPU has not been runn*ing* */ + blocked = state.time[RUNSTATE_blocked] - snap->time[RUNSTATE_blocked]; + runnable = state.time[RUNSTATE_runnable] - snap->time[RUNSTATE_runnable]; + offline = state.time[RUNSTATE_offline] - snap->time[RUNSTATE_offline]; + + *snap = state; + + /* Add the appropriate number of ticks of stolen time, + including any left-overs from last time. */ + stolen = runnable + offline + __this_cpu_read(xen_residual_stolen); + + if (stolen < 0) + stolen = 0; + + ticks = iter_div_u64_rem(stolen, NS_PER_TICK, &stolen); + __this_cpu_write(xen_residual_stolen, stolen); + account_steal_ticks(ticks); + + /* Add the appropriate number of ticks of blocked time, + including any left-overs from last time. */ + blocked += __this_cpu_read(xen_residual_blocked); + + if (blocked < 0) + blocked = 0; + + ticks = iter_div_u64_rem(blocked, NS_PER_TICK, &blocked); + __this_cpu_write(xen_residual_blocked, blocked); + account_idle_ticks(ticks); +} diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index d6fe062..85bb673 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -16,6 +16,10 @@ void xen_mm_unpin_all(void); void xen_timer_resume(void); void xen_arch_resume(void); +bool xen_vcpu_stolen(int vcpu); +void xen_setup_runstate_info(int cpu); +void xen_stolen_accounting(void); + int xen_setup_shutdown_event(void); extern unsigned long *xen_contiguous_bitmap; -- 1.7.2.5
Register the runstate_memory_area with the hypervisor. Use arch_timer_stolen_ticks and xen_stolen_accounting to account for stolen ticks. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/xen/enlighten.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c index d30042e..8f040a0 100644 --- a/arch/arm/xen/enlighten.c +++ b/arch/arm/xen/enlighten.c @@ -14,6 +14,7 @@ #include <xen/xen-ops.h> #include <asm/xen/hypervisor.h> #include <asm/xen/hypercall.h> +#include <asm/arch_timer.h> #include <asm/system_misc.h> #include <linux/interrupt.h> #include <linux/irqreturn.h> @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) later ones fail to. */ per_cpu(xen_vcpu, cpu) = vcpup; } + + xen_setup_runstate_info(cpu); return 0; } @@ -272,6 +275,7 @@ static int __init xen_guest_init(void) if (!xen_initial_domain()) xenbus_probe(NULL); + arch_timer_stolen_ticks = xen_stolen_accounting; pm_power_off = xen_power_off; arm_pm_restart = xen_restart; -- 1.7.2.5
Christopher Covington
2013-May-01 20:36 UTC
Re: [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
Hi Stefano, On 05/01/2013 03:27 PM, Stefano Stabellini wrote:> Introduce a function, called arch_timer_stolen_ticks, called from the > arch_timer interrupt handler to account for stolen ticks.[...]> diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h > index 7ade91d..30db413 100644 > --- a/arch/arm/include/asm/arch_timer.h > +++ b/arch/arm/include/asm/arch_timer.h > @@ -13,6 +13,11 @@ > int arch_timer_of_register(void); > int arch_timer_sched_clock_init(void); > > +/* per-platform function to calculate stolen ticks (clock cycles stolen > + * to the vcpu by the hypervisor).Stolen from the vcpu by the hypervisor? Is the hypervisor adjusting the Virtual Offset Register? [...] Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Ian Campbell
2013-May-02 08:19 UTC
Re: [Xen-devel] [PATCH 2/3] xen: move do_stolen_accounting to drivers/xen/time.c
On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote:> Move do_stolen_accounting, xen_vcpu_stolen and related functions and > static variables to common code (drivers/xen/time.c). > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>> + /* work out how much time the VCPU has not been runn*ing* */I wonder what this emphasis on runn*ing* means? Ian.
Ian Campbell
2013-May-02 08:19 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On Wed, 2013-05-01 at 21:36 +0100, Christopher Covington wrote:> Hi Stefano, > > On 05/01/2013 03:27 PM, Stefano Stabellini wrote: > > Introduce a function, called arch_timer_stolen_ticks, called from the > > arch_timer interrupt handler to account for stolen ticks. > > [...] > > > diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h > > index 7ade91d..30db413 100644 > > --- a/arch/arm/include/asm/arch_timer.h > > +++ b/arch/arm/include/asm/arch_timer.h > > @@ -13,6 +13,11 @@ > > int arch_timer_of_register(void); > > int arch_timer_sched_clock_init(void); > > > > +/* per-platform function to calculate stolen ticks (clock cycles stolen > > + * to the vcpu by the hypervisor). > > Stolen from the vcpu by the hypervisor?Stolen is time where the VCPU wants to be running bit isn''t because the hypervisor has descheduled it, e.g. because another VCPU is being run. So yes, Stefano meant "from".> Is the hypervisor adjusting the Virtual Offset Register?The virtual offset register is useful when a VCPU is migrated to another system to account for the differences in physical time on the two hosts but isn''t useful for accounting for stolen time while running on a single host. e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it would not make sense (from the guests PoV) for NOW''==NOW+2 at the point where the timer goes off. Nor does it make sense to require that the guest actually be running for 5 before injecting the timer because that would mean real time elapsed time for the timer would be 5+3 in the case where 3 are stolen. So the virtual timer should appear to have been running even while time is being stolen and therefore stolen time needs to be accounted via some other means. Ian.
Ian Campbell
2013-May-02 08:21 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote:> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > index d30042e..8f040a0 100644 > --- a/arch/arm/xen/enlighten.c > +++ b/arch/arm/xen/enlighten.c > @@ -14,6 +14,7 @@ > #include <xen/xen-ops.h> > #include <asm/xen/hypervisor.h> > #include <asm/xen/hypercall.h> > +#include <asm/arch_timer.h> > #include <asm/system_misc.h> > #include <linux/interrupt.h> > #include <linux/irqreturn.h> > @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) > later ones fail to. */ > per_cpu(xen_vcpu, cpu) = vcpup; > } > + > + xen_setup_runstate_info(cpu);Is this called for the boot processor too? (do you have a linux-next branch where I can easily find the baseline for this series?) Ian.
Stefano Stabellini
2013-May-02 10:38 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Thu, 2 May 2013, Ian Campbell wrote:> On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote: > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > index d30042e..8f040a0 100644 > > --- a/arch/arm/xen/enlighten.c > > +++ b/arch/arm/xen/enlighten.c > > @@ -14,6 +14,7 @@ > > #include <xen/xen-ops.h> > > #include <asm/xen/hypervisor.h> > > #include <asm/xen/hypercall.h> > > +#include <asm/arch_timer.h> > > #include <asm/system_misc.h> > > #include <linux/interrupt.h> > > #include <linux/irqreturn.h> > > @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) > > later ones fail to. */ > > per_cpu(xen_vcpu, cpu) = vcpup; > > } > > + > > + xen_setup_runstate_info(cpu); > > Is this called for the boot processor too?Yep.> (do you have a linux-next branch where I can easily find the baseline > for this series?)git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git lost_ticks_1 It''s based on my 3.9-rc3-smp-6 branch currently in linux-next (that is based on 3.9-rc3). However lost_ticks_1 is not in linux-next yet.
Ian Campbell
2013-May-02 10:48 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Thu, 2013-05-02 at 11:38 +0100, Stefano Stabellini wrote:> On Thu, 2 May 2013, Ian Campbell wrote: > > On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote: > > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > > index d30042e..8f040a0 100644 > > > --- a/arch/arm/xen/enlighten.c > > > +++ b/arch/arm/xen/enlighten.c > > > @@ -14,6 +14,7 @@ > > > #include <xen/xen-ops.h> > > > #include <asm/xen/hypervisor.h> > > > #include <asm/xen/hypercall.h> > > > +#include <asm/arch_timer.h> > > > #include <asm/system_misc.h> > > > #include <linux/interrupt.h> > > > #include <linux/irqreturn.h> > > > @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) > > > later ones fail to. */ > > > per_cpu(xen_vcpu, cpu) = vcpup; > > > } > > > + > > > + xen_setup_runstate_info(cpu); > > > > Is this called for the boot processor too? > > Yep. > > > (do you have a linux-next branch where I can easily find the baseline > > for this series?) > > git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git lost_ticks_1Is that your Xen tree, I was after the Linux one...> It''s based on my 3.9-rc3-smp-6 branch currently in linux-next (that is > based on 3.9-rc3). > However lost_ticks_1 is not in linux-next yet.Which branch is in Linux next? It doesn''t seem to be your linux-next (last commit 2011). Ian.
Stefano Stabellini
2013-May-02 10:54 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Thu, 2 May 2013, Ian Campbell wrote:> On Thu, 2013-05-02 at 11:38 +0100, Stefano Stabellini wrote: > > On Thu, 2 May 2013, Ian Campbell wrote: > > > On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote: > > > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > > > index d30042e..8f040a0 100644 > > > > --- a/arch/arm/xen/enlighten.c > > > > +++ b/arch/arm/xen/enlighten.c > > > > @@ -14,6 +14,7 @@ > > > > #include <xen/xen-ops.h> > > > > #include <asm/xen/hypervisor.h> > > > > #include <asm/xen/hypercall.h> > > > > +#include <asm/arch_timer.h> > > > > #include <asm/system_misc.h> > > > > #include <linux/interrupt.h> > > > > #include <linux/irqreturn.h> > > > > @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) > > > > later ones fail to. */ > > > > per_cpu(xen_vcpu, cpu) = vcpup; > > > > } > > > > + > > > > + xen_setup_runstate_info(cpu); > > > > > > Is this called for the boot processor too? > > > > Yep. > > > > > (do you have a linux-next branch where I can easily find the baseline > > > for this series?) > > > > git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git lost_ticks_1 > > Is that your Xen tree, I was after the Linux one...nope, that''s my Linux tree disguised as a Xen tree to confuse the maintainers ;-)> > It''s based on my 3.9-rc3-smp-6 branch currently in linux-next (that is > > based on 3.9-rc3). > > However lost_ticks_1 is not in linux-next yet. > > Which branch is in Linux next? It doesn''t seem to be your linux-next > (last commit 2011).that would be linux-next from git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git
Ian Campbell
2013-May-02 11:02 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Thu, 2013-05-02 at 11:54 +0100, Stefano Stabellini wrote:> On Thu, 2 May 2013, Ian Campbell wrote: > > On Thu, 2013-05-02 at 11:38 +0100, Stefano Stabellini wrote: > > > On Thu, 2 May 2013, Ian Campbell wrote: > > > > On Wed, 2013-05-01 at 20:27 +0100, Stefano Stabellini wrote: > > > > > diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c > > > > > index d30042e..8f040a0 100644 > > > > > --- a/arch/arm/xen/enlighten.c > > > > > +++ b/arch/arm/xen/enlighten.c > > > > > @@ -14,6 +14,7 @@ > > > > > #include <xen/xen-ops.h> > > > > > #include <asm/xen/hypervisor.h> > > > > > #include <asm/xen/hypercall.h> > > > > > +#include <asm/arch_timer.h> > > > > > #include <asm/system_misc.h> > > > > > #include <linux/interrupt.h> > > > > > #include <linux/irqreturn.h> > > > > > @@ -172,6 +173,8 @@ static int __init xen_secondary_init(unsigned int cpu) > > > > > later ones fail to. */ > > > > > per_cpu(xen_vcpu, cpu) = vcpup; > > > > > } > > > > > + > > > > > + xen_setup_runstate_info(cpu); > > > > > > > > Is this called for the boot processor too? > > > > > > Yep. > > > > > > > (do you have a linux-next branch where I can easily find the baseline > > > > for this series?) > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git lost_ticks_1 > > > > Is that your Xen tree, I was after the Linux one... > > nope, that''s my Linux tree disguised as a Xen tree to confuse the > maintainers ;-);-)> > > It''s based on my 3.9-rc3-smp-6 branch currently in linux-next (that is > > > based on 3.9-rc3). > > > However lost_ticks_1 is not in linux-next yet. > > > > Which branch is in Linux next? It doesn''t seem to be your linux-next > > (last commit 2011). > > that would be linux-next from > git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.gitAh, I was looking in your xenbits tree -- is that deprecated? Ian.
Stefano Stabellini
2013-May-02 11:04 UTC
Re: [Xen-devel] [PATCH 3/3] xen/arm: account for stolen ticks
On Thu, 2 May 2013, Ian Campbell wrote:> > > > It''s based on my 3.9-rc3-smp-6 branch currently in linux-next (that is > > > > based on 3.9-rc3). > > > > However lost_ticks_1 is not in linux-next yet. > > > > > > Which branch is in Linux next? It doesn''t seem to be your linux-next > > > (last commit 2011). > > > > that would be linux-next from > > git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git > > Ah, I was looking in your xenbits tree -- is that deprecated?I guess it is, now that I have a kernel.org account. I might as well get rid of it.
Christopher Covington
2013-May-02 18:49 UTC
Re: [PATCH 2/3] xen: move do_stolen_accounting to drivers/xen/time.c
Hi Stefano, On 05/01/2013 03:27 PM, Stefano Stabellini wrote:> Move do_stolen_accounting, xen_vcpu_stolen and related functions and > static variables to common code (drivers/xen/time.c). > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > arch/x86/xen/time.c | 127 +-------------------------------------------In going through some of the code, it looked like there might be a few little bits that you might be interested in pulling out of arch/ia64/xen/time.c as well. [...] Regards, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Christopher Covington
2013-May-02 21:33 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
Hi Ian, On 05/02/2013 04:19 AM, Ian Campbell wrote:> On Wed, 2013-05-01 at 21:36 +0100, Christopher Covington wrote: >> Hi Stefano, >> >> On 05/01/2013 03:27 PM, Stefano Stabellini wrote: >>> Introduce a function, called arch_timer_stolen_ticks, called from the >>> arch_timer interrupt handler to account for stolen ticks. >> >> [...] >> >>> diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h >>> index 7ade91d..30db413 100644 >>> --- a/arch/arm/include/asm/arch_timer.h >>> +++ b/arch/arm/include/asm/arch_timer.h >>> @@ -13,6 +13,11 @@ >>> int arch_timer_of_register(void); >>> int arch_timer_sched_clock_init(void); >>> >>> +/* per-platform function to calculate stolen ticks (clock cycles stolen >>> + * to the vcpu by the hypervisor).[...]>> Is the hypervisor adjusting the Virtual Offset Register? > > The virtual offset register is useful when a VCPU is migrated to another > system to account for the differences in physical time on the two hosts > but isn''t useful for accounting for stolen time while running on a > single host. > > e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it > would not make sense (from the guests PoV) for NOW''==NOW+2 at the point > where the timer goes off. Nor does it make sense to require that the > guest actually be running for 5 before injecting the timer because that > would mean real time elapsed time for the timer would be 5+3 in the case > where 3 are stolen.This is a bit of an aside, but I think that hiding time spent at higher privilege levels can be a quite sensible approach to timekeeping in a virtualized environment, but I understand that it''s not the approach taken with Xen, and as you pointed out above, adjusting the Virtual Offset Register by itself isn''t enough to implement that approach.> So the virtual timer should appear to have been running even while time > is being stolen and therefore stolen time needs to be accounted via some > other means.Something that''s not currently obvious to me is that given that the stolen cycle accounting should be done, what makes the architected timer interrupt handler the ideal place to do it? Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Ian Campbell
2013-May-03 08:26 UTC
Re: [Xen-devel] [PATCH 2/3] xen: move do_stolen_accounting to drivers/xen/time.c
On Thu, 2013-05-02 at 19:49 +0100, Christopher Covington wrote:> Hi Stefano, > > On 05/01/2013 03:27 PM, Stefano Stabellini wrote: > > Move do_stolen_accounting, xen_vcpu_stolen and related functions and > > static variables to common code (drivers/xen/time.c). > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > arch/x86/xen/time.c | 127 +------------------------------------------- > > In going through some of the code, it looked like there might be a few little > bits that you might be interested in pulling out of arch/ia64/xen/time.c as well.ia64 in the hypervisor was deprecated and with the 4.3 release has been removed. AFAIK ia64/Xen support has been broken in the mainline kernel for year. Ian
Stefano Stabellini
2013-May-03 10:29 UTC
Re: [Xen-devel] [PATCH 2/3] xen: move do_stolen_accounting to drivers/xen/time.c
On Fri, 3 May 2013, Ian Campbell wrote:> On Thu, 2013-05-02 at 19:49 +0100, Christopher Covington wrote: > > Hi Stefano, > > > > On 05/01/2013 03:27 PM, Stefano Stabellini wrote: > > > Move do_stolen_accounting, xen_vcpu_stolen and related functions and > > > static variables to common code (drivers/xen/time.c). > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > --- > > > arch/x86/xen/time.c | 127 +------------------------------------------- > > > > In going through some of the code, it looked like there might be a few little > > bits that you might be interested in pulling out of arch/ia64/xen/time.c as well. > > ia64 in the hypervisor was deprecated and with the 4.3 release has been > removed. > > AFAIK ia64/Xen support has been broken in the mainline kernel for year.Yes, that''s why it didn''t occur to me to do anything about it. Also, the ia64 code differs from the x86 code, I can''t just replace it with the newly introduced common code unfortunately.
Stefano Stabellini
2013-May-03 10:43 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On Thu, 2 May 2013, Christopher Covington wrote:> > So the virtual timer should appear to have been running even while time > > is being stolen and therefore stolen time needs to be accounted via some > > other means. > > Something that''s not currently obvious to me is that given that the stolen > cycle accounting should be done, what makes the architected timer interrupt > handler the ideal place to do it?That is a good question and I would appreciate suggestions to improve the patch. Given that Xen x86 and ia64 does stolen time accounting from the timer interrupt handler: arch/x86/xen/time.c:xen_timer_interrupt arch/ia64/kernel/time.c:timer_interrupt and given that the arch_timer is the only timer used by Xen on ARM and that it includes a virt_timer that is made on purpose to be used by virtual machines, I thought that it might be a good place for it. I also thought that doing it this way, KVM should be able to reuse the same hook.
Marc Zyngier
2013-May-03 10:54 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On 03/05/13 11:43, Stefano Stabellini wrote:> On Thu, 2 May 2013, Christopher Covington wrote: >>> So the virtual timer should appear to have been running even while time >>> is being stolen and therefore stolen time needs to be accounted via some >>> other means. >> >> Something that''s not currently obvious to me is that given that the stolen >> cycle accounting should be done, what makes the architected timer interrupt >> handler the ideal place to do it? > > That is a good question and I would appreciate suggestions to improve > the patch. > > Given that Xen x86 and ia64 does stolen time accounting from the timer > interrupt handler: > > arch/x86/xen/time.c:xen_timer_interrupt > arch/ia64/kernel/time.c:timer_interrupt > > and given that the arch_timer is the only timer used by Xen on ARM and > that it includes a virt_timer that is made on purpose to be used by > virtual machines, I thought that it might be a good place for it. > > I also thought that doing it this way, KVM should be able to reuse the > same hook.Indeed. I just need to understand how time stealing works there ;-). Now, KVM is not necessarily limited to arch_timers, and we''ve run KVM using a QEMU-provided timer in the past. Can you think of a more generic location for this hook? Possibly something that would satisfy the requirements of other architectures while we''re at it? M. -- Jazz is not dead. It just smells funny...
Stefano Stabellini
2013-May-05 16:47 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On Fri, 3 May 2013, Marc Zyngier wrote:> On 03/05/13 11:43, Stefano Stabellini wrote: > > On Thu, 2 May 2013, Christopher Covington wrote: > >>> So the virtual timer should appear to have been running even while time > >>> is being stolen and therefore stolen time needs to be accounted via some > >>> other means. > >> > >> Something that''s not currently obvious to me is that given that the stolen > >> cycle accounting should be done, what makes the architected timer interrupt > >> handler the ideal place to do it? > > > > That is a good question and I would appreciate suggestions to improve > > the patch. > > > > Given that Xen x86 and ia64 does stolen time accounting from the timer > > interrupt handler: > > > > arch/x86/xen/time.c:xen_timer_interrupt > > arch/ia64/kernel/time.c:timer_interrupt > > > > and given that the arch_timer is the only timer used by Xen on ARM and > > that it includes a virt_timer that is made on purpose to be used by > > virtual machines, I thought that it might be a good place for it. > > > > I also thought that doing it this way, KVM should be able to reuse the > > same hook. > > Indeed. I just need to understand how time stealing works there ;-). > > Now, KVM is not necessarily limited to arch_timers, and we''ve run KVM > using a QEMU-provided timer in the past. Can you think of a more generic > location for this hook? Possibly something that would satisfy the > requirements of other architectures while we''re at it?Probably the best option would be to reuse kernel/sched/cputime.c:steal_account_process_tick but that also means introducing CONFIG_PARAVIRT on arm. I am up for that, what do you think?
Konrad Rzeszutek Wilk
2013-May-06 14:35 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
> > e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it > > would not make sense (from the guests PoV) for NOW''==NOW+2 at the point > > where the timer goes off. Nor does it make sense to require that the > > guest actually be running for 5 before injecting the timer because that > > would mean real time elapsed time for the timer would be 5+3 in the case > > where 3 are stolen. > > This is a bit of an aside, but I think that hiding time spent at higher > privilege levels can be a quite sensible approach to timekeeping in a > virtualized environment, but I understand that it''s not the approach taken > with Xen, and as you pointed out above, adjusting the Virtual Offset Register > by itself isn''t enough to implement that approach.This is the approach taken by Xen and KVM. Look in CONFIG_PARAVIRT_CLOCK for implementation. In the user-space, the entry in ''top'' of "stolen" (%st) is for this exact value.
Christopher Covington
2013-May-06 14:55 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On 05/05/2013 12:47 PM, Stefano Stabellini wrote:> On Fri, 3 May 2013, Marc Zyngier wrote: >> On 03/05/13 11:43, Stefano Stabellini wrote: >>> On Thu, 2 May 2013, Christopher Covington wrote: >>>>> So the virtual timer should appear to have been running even while time >>>>> is being stolen and therefore stolen time needs to be accounted via some >>>>> other means. >>>> >>>> Something that''s not currently obvious to me is that given that the stolen >>>> cycle accounting should be done, what makes the architected timer interrupt >>>> handler the ideal place to do it? >>> >>> That is a good question and I would appreciate suggestions to improve >>> the patch. >>> >>> Given that Xen x86 and ia64 does stolen time accounting from the timer >>> interrupt handler: >>> >>> arch/x86/xen/time.c:xen_timer_interrupt >>> arch/ia64/kernel/time.c:timer_interrupt >>> >>> and given that the arch_timer is the only timer used by Xen on ARM and >>> that it includes a virt_timer that is made on purpose to be used by >>> virtual machines, I thought that it might be a good place for it. >>> >>> I also thought that doing it this way, KVM should be able to reuse the >>> same hook. >> >> Indeed. I just need to understand how time stealing works there ;-). >> >> Now, KVM is not necessarily limited to arch_timers, and we''ve run KVM >> using a QEMU-provided timer in the past. Can you think of a more generic >> location for this hook? Possibly something that would satisfy the >> requirements of other architectures while we''re at it? > > Probably the best option would be to reuse > > kernel/sched/cputime.c:steal_account_process_tick > > but that also means introducing CONFIG_PARAVIRT on arm. > I am up for that, what do you think?That sounds like a good move to me. Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Christopher Covington
2013-May-07 16:17 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
Hi Konrad, On 05/06/2013 10:35 AM, Konrad Rzeszutek Wilk wrote:>>> e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it >>> would not make sense (from the guests PoV) for NOW''==NOW+2 at the point >>> where the timer goes off. Nor does it make sense to require that the >>> guest actually be running for 5 before injecting the timer because that >>> would mean real time elapsed time for the timer would be 5+3 in the case >>> where 3 are stolen. >> >> This is a bit of an aside, but I think that hiding time spent at higher >> privilege levels can be a quite sensible approach to timekeeping in a >> virtualized environment, but I understand that it''s not the approach taken >> with Xen, and as you pointed out above, adjusting the Virtual Offset Register >> by itself isn''t enough to implement that approach. > > This is the approach taken by Xen and KVM. Look in CONFIG_PARAVIRT_CLOCK for > implementation. In the user-space, the entry in ''top'' of "stolen" (%st) > is for this exact value.I may have been unclear with my terms, sorry. When I refer to time being "hidden", I mean that kernel level software (supervisor mode, EL1) cannot detect the passage of that time at all. I don''t know whether this would really work, but I imagine one might be able to get close with the current virtualization facilities for ARM. Am I correct in interpreting that what you''re referring to is the deployment of paravirtualization code that ensures (observable) "stolen" time is factored into kernel decision-making? Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Stefano Stabellini
2013-May-08 11:19 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On Tue, 7 May 2013, Christopher Covington wrote:> Hi Konrad, > > On 05/06/2013 10:35 AM, Konrad Rzeszutek Wilk wrote: > >>> e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it > >>> would not make sense (from the guests PoV) for NOW''==NOW+2 at the point > >>> where the timer goes off. Nor does it make sense to require that the > >>> guest actually be running for 5 before injecting the timer because that > >>> would mean real time elapsed time for the timer would be 5+3 in the case > >>> where 3 are stolen. > >> > >> This is a bit of an aside, but I think that hiding time spent at higher > >> privilege levels can be a quite sensible approach to timekeeping in a > >> virtualized environment, but I understand that it''s not the approach taken > >> with Xen, and as you pointed out above, adjusting the Virtual Offset Register > >> by itself isn''t enough to implement that approach. > > > > This is the approach taken by Xen and KVM. Look in CONFIG_PARAVIRT_CLOCK for > > implementation. In the user-space, the entry in ''top'' of "stolen" (%st) > > is for this exact value. > > I may have been unclear with my terms, sorry. When I refer to time being > "hidden", I mean that kernel level software (supervisor mode, EL1) cannot > detect the passage of that time at all. I don''t know whether this would really > work, but I imagine one might be able to get close with the current > virtualization facilities for ARM. > > Am I correct in interpreting that what you''re referring to is the deployment > of paravirtualization code that ensures (observable) "stolen" time is factored > into kernel decision-making?Although it might be possible to hide the real time flow from the VM, it is inadvisable: what would happen when the VM needs to deal with a real hardware device? Or just send packets over the network? This is why it is much safer and more reliable to expose the stolen ticks to the VM.
Christopher Covington
2013-May-08 11:48 UTC
Re: [Xen-devel] [PATCH 1/3] arm_arch_timer: introduce arch_timer_stolen_ticks
On 05/08/2013 07:19 AM, Stefano Stabellini wrote:> On Tue, 7 May 2013, Christopher Covington wrote: >> Hi Konrad, >> >> On 05/06/2013 10:35 AM, Konrad Rzeszutek Wilk wrote: >>>>> e.g. if a VCPU sets a timer for NOW+5, but 3 are stolen in the middle it >>>>> would not make sense (from the guests PoV) for NOW''==NOW+2 at the point >>>>> where the timer goes off. Nor does it make sense to require that the >>>>> guest actually be running for 5 before injecting the timer because that >>>>> would mean real time elapsed time for the timer would be 5+3 in the case >>>>> where 3 are stolen. >>>> >>>> This is a bit of an aside, but I think that hiding time spent at higher >>>> privilege levels can be a quite sensible approach to timekeeping in a >>>> virtualized environment, but I understand that it''s not the approach taken >>>> with Xen, and as you pointed out above, adjusting the Virtual Offset Register >>>> by itself isn''t enough to implement that approach. >>> >>> This is the approach taken by Xen and KVM. Look in CONFIG_PARAVIRT_CLOCK for >>> implementation. In the user-space, the entry in ''top'' of "stolen" (%st) >>> is for this exact value. >> >> I may have been unclear with my terms, sorry. When I refer to time being >> "hidden", I mean that kernel level software (supervisor mode, EL1) cannot >> detect the passage of that time at all. I don''t know whether this would really >> work, but I imagine one might be able to get close with the current >> virtualization facilities for ARM. >> >> Am I correct in interpreting that what you''re referring to is the deployment >> of paravirtualization code that ensures (observable) "stolen" time is factored >> into kernel decision-making? > > Although it might be possible to hide the real time flow from the VM, it > is inadvisable: what would happen when the VM needs to deal with a real > hardware device? Or just send packets over the network? This is why it > is much safer and more reliable to expose the stolen ticks to the VM.One would probably have to emulate all the hardware. I don''t mean to imply that I think this is useful for everyday virtualization, but I speculate that such an approach might enable the analysis of kernels infected with VM-detecting malware or other niche use-cases. Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.