thr3ads.net - Xen devel - [Xen-devel] [PATCH] xen: always handle VIRQ

Ian Campbell

2010-Oct-15 10:52 UTC

[Xen-devel] [PATCH] xen: always handle VIRQ_TIMER first.

This ensures that system time is updated before calling any hard irq
handlers after a long period of ticklessness. If we do not do this
then hardirq will see a jiffies from before the period of ticklessness
and make incorrect decisions regarding timer expiry etc.

This resolves issues e.g. with USB keyboard timer repeats.

Based on a patch by Keir Fraser.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: keir@xen.org
---
 drivers/xen/events.c |   22 +++++++++++++++++++++-
 1 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index c9d1d4a..1496ba5 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -1052,6 +1052,7 @@ static void __xen_evtchn_do_upcall(struct pt_regs *regs)
 
 	do {
 		unsigned long pending_words;
+		int irq;
 
 		vcpu_info->evtchn_upcall_pending = 0;
 
@@ -1062,6 +1063,24 @@ static void __xen_evtchn_do_upcall(struct pt_regs *regs)
 		/* Clear master flag /before/ clearing selector flag. */
 		wmb();
 #endif
+
+		/*
+		 * Handle timer interrupts before all others, so that all
+		 * hardirq handlers see an up-to-date system time even if we
+		 * have just woken from a long idle period.
+		 */
+		irq = percpu_read(virq_to_irq[VIRQ_TIMER]);
+		if (irq != -1) {
+			int word_idx;
+			int bit_idx;
+			int port = evtchn_from_irq(irq);
+			word_idx = port / BITS_PER_LONG;
+			bit_idx = port % BITS_PER_LONG;
+			if (VALID_EVTCHN(port) &&
+			    (active_evtchns(cpu, s, word_idx) & (1UL<<bit_idx)))
+				(void)handle_irq(irq, regs);
+		}
+
 		pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
 		while (pending_words != 0) {
 			unsigned long pending_bits;
@@ -1071,9 +1090,10 @@ static void __xen_evtchn_do_upcall(struct pt_regs *regs)
 			while ((pending_bits = active_evtchns(cpu, s, word_idx)) != 0) {
 				int bit_idx = __ffs(pending_bits);
 				int port = (word_idx * BITS_PER_LONG) + bit_idx;
-				int irq = evtchn_to_irq[port];
 				struct irq_desc *desc;
 
+				irq = evtchn_to_irq[port];
+
 				mask_evtchn(port);
 				clear_evtchn(port);
 
-- 
1.5.6.5
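
To illustrate the index arithmetic in the hunk above, here is a minimal standalone sketch of how an event-channel port number maps to a word and a bit in a pending bitmap. The helper name `port_is_pending`, the `SKETCH_BITS_PER_LONG` macro and the flat bitmap are assumptions made for this example; the kernel's `active_evtchns()` takes different arguments and is not reproduced here.

```c
#include <limits.h>
#include <stdbool.h>

/* Bits in one unsigned long; stands in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical helper: report whether event-channel `port` is set in a
 * pending bitmap stored as an array of unsigned long words.
 */
static bool port_is_pending(const unsigned long *pending, unsigned int port)
{
	unsigned int word_idx = port / SKETCH_BITS_PER_LONG; /* which word */
	unsigned int bit_idx = port % SKETCH_BITS_PER_LONG;  /* which bit in it */

	return (pending[word_idx] & (1UL << bit_idx)) != 0;
}
```

The patch performs the same division and modulo on the timer's port so that it can test that one bit before entering the general scan loop.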


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2010-Oct-15 17:18 UTC

[Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 10/15/2010 03:52 AM, Ian Campbell wrote:
> This ensures that system time is updated before calling any hard irq
> handlers after a long period of ticklessness. If we do not do this
> then hardirq will see a jiffies from before the period of ticklessness
> and make incorrect decisions regarding timer expiry etc.
>
> This resolves issues e.g. with USB keyboard timer repeats.
>
> Based on a patch by Keir Fraser.
I talked about this with James, and it makes no sense to me at all.

    J

Keir Fraser

2010-Oct-15 18:30 UTC

[Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 15/10/2010 18:18, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>  On 10/15/2010 03:52 AM, Ian Campbell wrote:
>> This ensures that system time is updated before calling any hard irq
>> handlers after a long period of ticklessness. If we do not do this
>> then hardirq will see a jiffies from before the period of ticklessness
>> and make incorrect decisions regarding timer expiry etc.
>> 
>> This resolves issues e.g. with USB keyboard timer repeats.
>> 
>> Based on a patch by Keir Fraser.
> 
> I talked about this with James, and it makes no sense to me at all.
When guest resumes execution after a long period blocked, the unblocking
interrupt may be handled before the inevitable timer interrupt which
actually syncs up jiffies to current system time. The unblocking interrupt
sees old jiffies -- most hardirq handlers really don't care about time, but
it happens that USB keyboard repeat is handled at that level -- it sees the
key pressed at old jiffies and not released until new jiffies plus small
delta. The difference between old and new jiffies can easily be enough to
cause phantom key repeats.
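
The effect described above can be modelled in a few lines. This is only a sketch under assumed numbers: `would_repeat`, the 250-tick repeat delay and the tick values are hypothetical and are not taken from the USB keyboard driver.

```c
#include <stdbool.h>

/*
 * Hypothetical model: a key is treated as auto-repeating once it appears
 * to have been held for at least `repeat_delay` ticks, measured between
 * two samples of a tick counter (unsigned subtraction is wrap-safe).
 */
static bool would_repeat(unsigned long press_ticks,
			 unsigned long release_ticks,
			 unsigned long repeat_delay)
{
	return (release_ticks - press_ticks) >= repeat_delay;
}
```

With a press stamped at a stale counter of 100 and a release stamped at 1105 after the counter has caught up, `would_repeat(100, 1105, 250)` is true even though the key was held for only about 5 ticks. Stamping both events after the update, `would_repeat(1100, 1105, 250)`, is false.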

One question of course is whether the same hardirq key repeat mechanism can
be foxed simply by involuntary preemption of the guest. I suppose it could,
but it's vastly more unlikely than the systematic deterministic race
introduced by resume-from-block. Also we would hope that a runnable guest
would not be descheduled for as long periods as a guest can be voluntarily
blocked (bit arm waving that one I'll admit ;-).

 -- Keir


Jeremy Fitzhardinge

2010-Oct-15 21:11 UTC

[Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 10/15/2010 11:30 AM, Keir Fraser wrote:
> On 15/10/2010 18:18, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>
>>  On 10/15/2010 03:52 AM, Ian Campbell wrote:
>>> This ensures that system time is updated before calling any hard irq
>>> handlers after a long period of ticklessness. If we do not do this
>>> then hardirq will see a jiffies from before the period of ticklessness
>>> and make incorrect decisions regarding timer expiry etc.
>>>
>>> This resolves issues e.g. with USB keyboard timer repeats.
>>>
>>> Based on a patch by Keir Fraser.
>> I talked about this with James, and it makes no sense to me at all.
> When guest resumes execution after a long period blocked, the unblocking
> interrupt may be handled before the inevitable timer interrupt which
Why "inevitable"?  What if the next timer event is still some time in
the future?  Or are you assuming the timer is driven by the default Xen
100Hz timer?
> actually syncs up jiffies to current system time. The unblocking interrupt
> sees old jiffies -- most hardirq handlers really don't care about time, but
> it happens that USB keyboard repeat is handled at that level -- it sees the
> key pressed at old jiffies and not released until new jiffies plus small
> delta. The difference between old and new jiffies can easily be enough to
> cause phantom key repeats.
Yes, but...

If the system is idle and has disabled timer ticks, and the next
interrupt is from a piece of hardware, then jiffies will be out of date,
but there won't necessarily be a pending timer tick.  If a device
interrupt handler is allowed to rely on jiffies, then there must be some
generic mechanism to update jiffies before calling any interrupt
handler.  This situation doesn't seem like it is in any way Xen
dependent, and AFAIK there's no general requirement that timer
interrupts be handled first.

I'm guessing that this particular problem arises in the forward-port Xen
kernel as a side-effect of its bespoke time handling code (including
IDLE_NO_HZ) which is not doing something that the core time
infrastructure would normally do.  (I don't see why the forward-port
kernels couldn't use the existing Xen time support in mainline rather
than replacing it.)

Or perhaps there is a real bug here, but again, I don't think it is
Xen-specific, or should be addressed in Xen code.
> One question of course is whether the same hardirq key repeat mechanism can
> be foxed simply by involuntary preemption of the guest. I suppose it could,
> but it's vastly more unlikely than the systematic deterministic race
> introduced by resume-from-block. Also we would hope that a runnable guest
> would not be descheduled for as long periods as a guest can be voluntarily
> blocked (bit arm waving that one I'll admit ;-).
I've seen unexpected key repeats in guests when using kvm keyboards.

    J

Ian Campbell

2010-Oct-16 06:48 UTC

[Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On Fri, 2010-10-15 at 22:11 +0100, Jeremy Fitzhardinge wrote:
> This situation doesn't seem like it is in any way Xen
> dependent, and AFAIK there's no general requirement that timer
> interrupts be handled first.
It's not implicit somehow on native due to timer interrupt always being
IRQ0 or something like that?
> I'm guessing that this particular problem arises in the forward-port Xen
> kernel as a side-effect of its bespoke time handling code (including
> IDLE_NO_HZ) which is not doing something that the core time
> infrastructure would normally do.
You are right, it's very possible this is a forward-port Xen only
issue.

The patch is out there now so perhaps we should not worry about it and
revisit it if someone shows up with a plausible looking issue affecting
pvops.
>   (I don't see why the forward-port
> kernels couldn't use the existing Xen time support in mainline rather
> than replacing it.)
Agreed. Things like *front and the /dev/xen/* drivers would be good
first candidates for this sort of convergence too if someone were
interested. FWIW netback and blktap2 are already mostly converged in the
XCP tree which has made pushing patches back and forth much easier.

Ian.


Keir Fraser

2010-Oct-16 07:14 UTC

[Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 15/10/2010 22:11, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>> When guest resumes execution after a long period blocked, the unblocking
>> interrupt may be handled before the inevitable timer interrupt which
> 
> Why "inevitable"?  What if the next timer event is still some time in
> the future?  Or are you assuming the timer is driven by the default Xen
> 100Hz timer?
Do you sometimes disable, or indeed never use, VCPUOP_set_periodic_timer?

Hmmm... Perhaps as you suggest this would be a generic issue with any
tickless kernel, and the correct upstream fix for issues such as USB kbd
repeat -- if indeed such issues still exist -- is to fix such hardirq
handlers to not depend on jiffies.

We fixed it the way we did in 'classic Xen' patched kernels since it seemed
architecturally neatest. I can accept that in the tickless kernel world that
may not actually be true.

 -- Keir

Jeremy Fitzhardinge

2010-Oct-17 06:11 UTC

Re: [Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 10/16/2010 12:14 AM, Keir Fraser wrote:
> On 15/10/2010 22:11, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>
>>> When guest resumes execution after a long period blocked, the unblocking
>>> interrupt may be handled before the inevitable timer interrupt which
>> Why "inevitable"?  What if the next timer event is still some time in
>> the future?  Or are you assuming the timer is driven by the default Xen
>> 100Hz timer?
> Do you sometimes disable, or indeed never use, VCPUOP_set_periodic_timer?
I disable it ASAP at boot and always use VCPUOP_set_singleshot_timer
from then on.
> Hmmm... Perhaps as you suggest this would be a generic issue with any
> tickless kernel, and the correct upstream fix for issues such as USB kbd
> repeat -- if indeed such issues still exist -- is to fix such hardirq
> handlers to not depend on jiffies.
>
> We fixed it the way we did in 'classic Xen' patched kernels since it seemed
> architecturally neatest. I can accept that in the tickless kernel world that
> may not actually be true.
I think (but I haven't spelunked into that code lately) that after a
tickless idle period it will update jiffies N ticks based on the
clocksource, and then run any other interrupt handler code, so jiffies
will always appear to be up to date.

Ah, yes, here it is:

/**
 * tick_nohz_update_jiffies - update jiffies when idle was interrupted
 *
 * Called from interrupt entry when the CPU was idle
 *
 * In case the sched_tick was stopped on this CPU, we have to check if jiffies
 * must be updated. Otherwise an interrupt handler could use a stale jiffy
 * value. We do this unconditionally on any cpu, as we don't know whether the
 * cpu, which has the update task assigned is in a long sleep.
 */
static void tick_nohz_update_jiffies(ktime_t now)
{
	...
}
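
As a rough illustration of the idea in that comment (and not the body of `tick_nohz_update_jiffies()`, which is elided above), the catch-up step might look like the following sketch. The `tick_state` struct, the field names and `catch_up_jiffies` are all hypothetical, and the clock is assumed to be monotonic.

```c
#include <stdint.h>

/* Hypothetical tick bookkeeping; names are illustrative only. */
struct tick_state {
	uint64_t last_update_ns; /* time of the last accounted tick */
	uint64_t period_ns;      /* length of one tick in nanoseconds */
	unsigned long jiffies;   /* tick counter seen by handlers */
};

/*
 * Advance the counter by however many whole ticks have elapsed since the
 * last update. Intended to run on interrupt entry, before any handler,
 * so that handlers never observe a stale count. Assumes now_ns never
 * goes backwards.
 */
static void catch_up_jiffies(struct tick_state *ts, uint64_t now_ns)
{
	uint64_t delta = now_ns - ts->last_update_ns;
	uint64_t ticks;

	if (delta < ts->period_ns)
		return; /* less than one tick elapsed, nothing to do */

	ticks = delta / ts->period_ns;
	ts->jiffies += (unsigned long)ticks;
	ts->last_update_ns += ticks * ts->period_ns;
}
```

Because the update is keyed off a clock read rather than a pending timer event, it does not matter which interrupt happens to wake the CPU.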



    J

Keir Fraser

2010-Oct-17 07:38 UTC

Re: [Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 17/10/2010 07:11, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>> We fixed it the way we did in 'classic Xen' patched kernels since it seemed
>> architecturally neatest. I can accept that in the tickless kernel world that
>> may not actually be true.
> 
> I think (but I haven't spelunked into that code lately) that after a
> tickless idle period it will update jiffies N ticks based on the
> clocksource, and then run any other interrupt handler code, so jiffies
> will always appear to be up to date.
Okay, that should suffice. That presumably calls into the clocksource and
gives us our opportunity to sync with Xen's current system time. Effectively
it's just hooking into the interrupt handling preamble at a different, more
generic, point. :-)

 -- Keir



Jan Beulich

2010-Oct-18 13:20 UTC

Re: [Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

>>> On 17.10.10 at 08:11, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> I think (but I haven't spelunked into that code lately) that after a
> tickless idle period it will update jiffies N ticks based on the
> clocksource, and then run any other interrupt handler code, so jiffies
> will always appear to be up to date.
> 
> Ah, yes, here it is:
> 
> /**
>  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
>  *
>  * Called from interrupt entry when the CPU was idle
>  *
>  * In case the sched_tick was stopped on this CPU, we have to check if jiffies
>  * must be updated. Otherwise an interrupt handler could use a stale jiffy
>  * value. We do this unconditionally on any cpu, as we don't know whether the
>  * cpu, which has the update task assigned is in a long sleep.
>  */
> static void tick_nohz_update_jiffies(ktime_t now)
> {
> 	...
> }
But this is available only with CONFIG_NO_HZ, which is a freely
selectable option. So perhaps the code should still be added
inside an #ifndef CONFIG_NO_HZ?

Jan


Jeremy Fitzhardinge

2010-Oct-18 16:52 UTC

Re: [Xen-devel] Re: [PATCH] xen: always handle VIRQ_TIMER first.

On 10/18/2010 06:20 AM, Jan Beulich wrote:
>>>> On 17.10.10 at 08:11, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>> I think (but I haven't spelunked into that code lately) that after a
>> tickless idle period it will update jiffies N ticks based on the
>> clocksource, and then run any other interrupt handler code, so jiffies
>> will always appear to be up to date.
>>
>> Ah, yes, here it is:
>>
>> /**
>>  * tick_nohz_update_jiffies - update jiffies when idle was interrupted
>>  *
>>  * Called from interrupt entry when the CPU was idle
>>  *
>>  * In case the sched_tick was stopped on this CPU, we have to check if jiffies
>>  * must be updated. Otherwise an interrupt handler could use a stale jiffy
>>  * value. We do this unconditionally on any cpu, as we don't know whether the
>>  * cpu, which has the update task assigned is in a long sleep.
>>  */
>> static void tick_nohz_update_jiffies(ktime_t now)
>> {
>> 	...
>> }
> But this is available only with CONFIG_NO_HZ, which is a freely
> selectable option. So perhaps the code should still be added
> inside an #ifndef CONFIG_NO_HZ?
Non-NO_HZ is a pretty pessimal configuration for a VM, or indeed any
system which cares about power.  Are there any use cases for which it's a
good idea?

However, you could change it to always update jiffies on any interrupt
entrypoint, regardless of whether it's coming from an idle state.  Or
even just change "jiffies" into a macro which calls a function to just
compute the current value without needing to rely on interrupts at all.
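
A minimal sketch of that last idea, assuming a nanosecond clock and a 100 Hz tick. Everything here is hypothetical: `fake_clock_ns` stands in for a real clocksource read, and the macro is named `jiffies_now` rather than `jiffies` to make clear that this is not the kernel's symbol.

```c
#include <stdint.h>

/* Assumed tick length: 10 ms, i.e. a 100 Hz tick (illustrative only). */
#define TICK_PERIOD_NS 10000000ULL

/* Stand-in for a clocksource: a settable nanosecond counter. */
static uint64_t fake_clock_ns;

static uint64_t read_clock_ns(void)
{
	return fake_clock_ns;
}

/* Derive the tick count from the clock on every read. */
static unsigned long compute_jiffies(void)
{
	return (unsigned long)(read_clock_ns() / TICK_PERIOD_NS);
}

/* Readers use `jiffies_now` like a variable; no tick interrupt is needed. */
#define jiffies_now compute_jiffies()
```

The trade-off is that every read now costs a clock access instead of a plain memory load, which may matter on hot paths.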

    J
