PowerPC's timer interrupt (called the decrementer) is a one-shot timer, not periodic. When it goes off and we enter the hypervisor, we first set it very high so it won't interrupt hypervisor code, then call raise_softirq(TIMER_SOFTIRQ). We know that timer_softirq_action() will then call reprogram_timer(), which will reprogram the decrementer to the appropriate value.

We recently uncovered a bug on PowerPC where, if a timer tick arrives just inside schedule() while interrupts are still enabled, the decrementer is never reprogrammed to that appropriate value. This is because once inside schedule(), we never handle any subsequent softirqs: we call context_switch() and resume the guest.

I believe the tick problem affects periodic timers (i.e. x86) as well, though less drastically. With a CPU-bound guest, it would result in dropped ticks: TIMER_SOFTIRQ is set and not handled, and when the timer expires again it is re-set. In other cases, it would result in some timer ticks being delivered very late. I don't know what effect this might have on guests, perhaps with sensitive time-slewing code.

In addition, when SCHEDULE_SOFTIRQ is set, all "greater" softirqs (including NMI) will not be handled until the next hypervisor invocation. This is pretty anti-social behavior for a softirq handler.

One solution would be to have schedule() *not* call context_switch() directly, but rather set a flag (or a "next vcpu" pointer) and return. That would allow other softirqs to be processed normally. Once do_softirq() returns to assembly, we can check the "next vcpu" pointer and call context_switch().

(This solution would enable a PowerPC optimization as well: we would like to lazily save non-volatile registers. We can't do this unless the exception handler regains control from do_softirq() before context_switch() is called.)

Thoughts?

--
Hollis Blanchard
IBM Linux Technology Center
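For concreteness, the "set a next-vcpu pointer and return" scheme might look roughly like the sketch below. next_vcpu[], pick_next_vcpu() and exception_exit() are illustrative names only, not existing Xen symbols; smp_processor_id(), do_softirq() and context_switch() are the existing calls.

    /* Rough sketch only; next_vcpu[], pick_next_vcpu() and exception_exit()
     * are made-up names for illustration. */
    static struct vcpu *next_vcpu[NR_CPUS];  /* set by schedule(), read on exit */

    static void schedule_softirq_action(void)
    {
        struct vcpu *next = pick_next_vcpu();   /* scheduler decision, as today */

        /* Instead of calling context_switch() here, just record the choice ... */
        next_vcpu[smp_processor_id()] = next;
        /* ... and return, so do_softirq() can run the remaining handlers. */
    }

    /* Run from the exception exit path, after do_softirq() has returned. */
    void exception_exit(void)
    {
        struct vcpu *next = next_vcpu[smp_processor_id()];

        if ( next != NULL && next != current )
        {
            next_vcpu[smp_processor_id()] = NULL;
            /* Non-volatile registers can be saved lazily, just before this call. */
            context_switch(current, next);
        }
    }

The point is that the schedule softirq handler becomes an ordinary handler that returns, so TIMER_SOFTIRQ and the rest still get processed in the same do_softirq() pass, and the actual switch (and any lazy register save) happens once control is back in the exception exit path.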
On 15/12/06 17:27, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> We recently uncovered a bug on PowerPC where if a timer tick arrives
> just inside schedule() while interrupts are still enabled, the
> decrementer is never reprogrammed to that appropriate value. This is
> because once inside schedule(), we never handle any subsequent softirqs:
> we call context_switch() and resume the guest.

Easily fixed. You need to handle softirqs in the exit path to guest context. You need to do this final check with interrupts disabled to avoid races.

-- Keir
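Roughly, that final check amounts to a loop like the one below. This is a sketch in C assuming the existing softirq_pending(), do_softirq() and local_irq_disable()/local_irq_enable() primitives; on x86 the equivalent check lives in the assembly exit path, and exit_to_guest_check() is just an illustrative name.

    void exit_to_guest_check(void)
    {
        for ( ; ; )
        {
            local_irq_disable();                        /* close the race window */
            if ( !softirq_pending(smp_processor_id()) )
                break;                                  /* nothing left: resume guest */
            local_irq_enable();
            do_softirq();                               /* may raise further softirqs */
        }
        /* Interrupts stay disabled from here until guest state is fully restored. */
    }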
On Fri, 2006-12-15 at 17:36 +0000, Keir Fraser wrote:

> On 15/12/06 17:27, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:
>
> > We recently uncovered a bug on PowerPC where if a timer tick arrives
> > just inside schedule() while interrupts are still enabled, the
> > decrementer is never reprogrammed to that appropriate value. This is
> > because once inside schedule(), we never handle any subsequent softirqs:
> > we call context_switch() and resume the guest.
>
> Easily fixed. You need to handle softirqs in the exit path to guest context.
> You need to do this final check with interrupts disabled to avoid races.

Ah OK, I see now how x86 is doing that. I don't think that code flow really makes sense: why would you jump out of do_softirq() into assembly just to call do_softirq() again? Also, that doesn't solve the lazy register saving problem.

However, I think I see how we can implement our desired context_switch() scheme in arch-specific code. The context_switch() call in schedule() will return, so please don't add a BUG() after that. :)

--
Hollis Blanchard
IBM Linux Technology Center
On 15/12/06 19:09, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> Ah OK, I see now how x86 is doing that. I don't think that code flow
> really makes sense: why would you jump out of do_softirq() into assembly
> just to call do_softirq() again?

Well, that's the way it works out on x86. It is a bit odd, but it works and is unlikely to affect performance. I think returning from schedule() would have its own problems (e.g., a context switch from the idle domain to a guest would return to the idle loop, which we'd need explicit code to bail from, etc.).

> Also, that doesn't solve the lazy register saving problem.

I assume this is a PPC-specific issue?

> However, I think I see how we can implement our desired context_switch()
> scheme in arch-specific code. The context_switch() call in schedule()
> will return, so please don't add a BUG() after that. :)

We already support this mode of operation for IA64, which always returns from schedule().

-- Keir
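To illustrate the idle-loop point: the x86 idle loop is roughly the shape sketched below (simplified), and if schedule() merely returned after picking a guest vcpu, the loop would need an explicit bail-out of the kind shown. The is_idle_vcpu() test is illustrative of that extra code, not something the current loop contains.

    void idle_loop(void)
    {
        for ( ; ; )
        {
            safe_halt();       /* sti; hlt -- wait for an interrupt */
            do_softirq();      /* on x86 today, a switch away from idle does not
                                * come back here until idle is rescheduled */

            /* Needed only if schedule() returned after picking a guest: */
            if ( !is_idle_vcpu(current) )
                break;         /* fall through to the exit-to-guest path */
        }
    }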
On Fri, 2006-12-15 at 20:00 +0000, Keir Fraser wrote:

> > Also, that doesn't solve the lazy register saving problem.
>
> I assume this is a PPC-specific issue?

It's an issue with any architecture with a large number of registers which aren't automatically saved by hardware (and a C ABI that makes some of them non-volatile).

x86 has a small number of registers. ia64 automatically saves them (from what I understand). So of the currently-supported architectures, yes, that leaves PowerPC.

--
Hollis Blanchard
IBM Linux Technology Center
On 15/12/06 20:41, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:

> It's an issue with any architecture with a large number of registers
> which aren't automatically saved by hardware (and a C ABI that makes
> some of them non-volatile).
>
> x86 has a small number of registers. ia64 automatically saves them (from
> what I understand). So of the currently-supported architectures, yes,
> that leaves PowerPC.

I see. It sounds like returning from context_switch() is perhaps the right thing for powerpc. That would be easier if you have per-cpu stacks (like ia64). If not, there are issues in saving register state later (and hence delaying your call to context_saved()), as there are calls to do_softirq() outside your asm code (well, not many, but there is one in domain.c for example) where you won't end up executing your do_softirq() wrapper. In general we'd like to reserve the right to include voluntary yield points, and that won't mix well with lazy register saves and per-physical-cpu stacks.

-- Keir
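The wrapper being referred to might look something like the sketch below (powerpc_do_softirq() and save_nonvolatile_regs() are illustrative names, and next_vcpu[] is the hypothetical per-cpu pointer from the earlier sketch). The problem is that a plain do_softirq() call from C, such as the one in domain.c, never goes through this wrapper, so the lazily-skipped non-volatile registers would not get saved before context_switch() ran.

    /* Illustrative only: entered from the exception path instead of calling
     * do_softirq() directly, so non-volatile registers can be saved lazily. */
    void powerpc_do_softirq(struct cpu_user_regs *regs)
    {
        struct vcpu *next;

        /* At this point only the volatile registers have been saved into regs. */
        do_softirq();

        next = next_vcpu[smp_processor_id()];
        if ( next != NULL )
        {
            next_vcpu[smp_processor_id()] = NULL;
            save_nonvolatile_regs(regs);   /* lazy save, only when actually switching */
            context_switch(current, next);
        }
    }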
On Fri, 2006-12-15 at 21:39 +0000, Keir Fraser wrote:

> On 15/12/06 20:41, "Hollis Blanchard" <hollisb@us.ibm.com> wrote:
>
> > It's an issue with any architecture with a large number of registers
> > which aren't automatically saved by hardware (and a C ABI that makes
> > some of them non-volatile).
> >
> > x86 has a small number of registers. ia64 automatically saves them (from
> > what I understand). So of the currently-supported architectures, yes,
> > that leaves PowerPC.
>
> I see. It sounds like returning from context_switch() is perhaps the right
> thing for powerpc. That would be easier if you have per-cpu stacks (like
> ia64).

Yup, we have per-cpu stacks.

> If not there are issues in saving register state later (and hence
> delaying your call to context_saved()) as there are calls to do_softirq()
> outside your asm code (well, not many, but there is one in domain.c for
> example) where you won't end up executing your do_softirq() wrapper. In
> general we'd like to reserve the right to include voluntary yield points,
> and that won't mix well with lazy register saves and per-physical-cpu
> stacks.

Oh, we have per-physical-cpu stacks. We can do that because there's no such thing as a "hypervisor thread" which could block in hypervisor space and need to be restored later. Are you saying that in the future you want to have hypervisor threads, and so we'll need per-VIRTUAL-cpu stacks?

--
Hollis Blanchard
IBM Linux Technology Center