thr3ads.net - Xen devel - [Xen-devel] fooey. no interrupts. [Aug 2004]

ron minnich

2004-Aug-09 03:39 UTC

[Xen-devel] fooey. no interrupts.

I realized a few days ago, when I got back to xen/plan 9, that I'm
not getting interrupts after the first few. This is with a very recent pull.
What's amazing is that it got as far as it did, but I am processing pending
interrupt stuff in spllo(), so that explains a lot. What I'm not getting are
the asynchronous calls to evtchn_do_upcall.

The mask is zero. I've enabled VIRQ_TIMER. Yet I'm only getting one set of
interrupts, and it looks like no more arrive. My loop for picking up events is
pretty much the same as the Linux loop -- I just took that code. I am
clearing out evtchn_upcall_pending and evtchn_pending_sel. I am clearing
the mask to 0 at the end of the interrupt.

What's a reasonable set of things to look for? I'm stumped.

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]




-------------------------------------------------------
This SF.Net email is sponsored by OSTG. Have you noticed the changes on
Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now,
one more big change to announce. We are now OSTG- Open Source Technology
Group. Come see the changes on the new OSTG site. www.ostg.com
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel

Keir Fraser

2004-Aug-09 07:28 UTC

Re: [Xen-devel] fooey. no interrupts.

> I've just realized a few days ago, when I get back to xen/plan 9, that I'm
> not getting interrupts after the first few. This with a very recent pull.
> What's amazing that it got as far as it did, but I am processing pending
> interrupt stuff in spllo() so that explains a lot. What I'm not getting is
> the asynchronous calls to evtchn_do_upcall.
>
> The mask is zero. I've enabled VIRQ_TIMER. Yet I'm only getting one set of
> interrupts and it looks like no more. My loop for picking up events is
> pretty much the same as the linux loop -- I just took that code. I am
> clearing out evtchn_upcall_pending and evtchn_pending_sel. I am clearing
> the mask to 0 at the end of the interrupt.
>
> What's a reasonable set of things to look for? I'm stumped.

The Linux code sets the evtchn_mask before clearing evtchn_pending,
then clears the evtchn_mask after calling the interrupt handler. Are
you doing the setting but forgetting the clearing?

The order that Linux has for this stuff, to avoid races, is:

 1. Test-and-clear evtchn_upcall_pending flag
 2. Read-and-clear (XCHG) the evtchn_pending_sel
 3. For each set bit @i in the sel:
 4.   Read evtchn_pending[@i]
 5.   For each set bit @j in the word:
 6.     Set evtchn_mask[@i*32+@j]
 7.     Clear evtchn_pending[@i*32+@j]
 8.     ....do interrupt work...
 9.     Clear evtchn_mask[@i*32+@j]

The fact that step 2 is a real XCHG instruction is important, as it
also acts as a memory barrier (not important if you're not running on
an SMP machine). Also, all your bit-munging instructions must have the
LOCK prefix if you're running on an SMP machine.
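
A minimal sketch of the nine steps above, as a plain C model rather than the actual Linux source. The struct evt_state layout, the 32-word array size, and the names process_events and record_port are assumptions made for illustration; the __atomic_* and __builtin_ctz builtins are GCC/Clang extensions standing in for LOCKed bitops and XCHG.

```c
#include <stdint.h>

#define EVT_WORDS 32   /* assumed: 32 words x 32 bits = 1024 ports */

/* Hypothetical stand-in for the shared_info fields named in the list. */
struct evt_state {
    uint8_t  upcall_pending;          /* evtchn_upcall_pending */
    uint32_t pending_sel;             /* evtchn_pending_sel    */
    uint32_t pending[EVT_WORDS];      /* evtchn_pending[]      */
    uint32_t mask[EVT_WORDS];         /* evtchn_mask[]         */
};

/* Demonstration handler: remembers the last port it was given. */
unsigned last_port;
void record_port(unsigned port) { last_port = port; }

/* Runs steps 1-9 once; returns how many ports were handled. */
unsigned process_events(struct evt_state *s, void (*handle)(unsigned port))
{
    unsigned handled = 0;

    /* Step 1: test-and-clear the upcall-pending flag. */
    if (!__atomic_exchange_n(&s->upcall_pending, 0, __ATOMIC_SEQ_CST))
        return 0;

    /* Step 2: read-and-clear the selector with an atomic exchange. */
    uint32_t sel = __atomic_exchange_n(&s->pending_sel, 0, __ATOMIC_SEQ_CST);

    while (sel != 0) {                                   /* step 3 */
        unsigned i = (unsigned)__builtin_ctz(sel);
        sel &= sel - 1;                                  /* drop lowest set bit */

        /* Step 4: snapshot the pending word selected by bit i. */
        uint32_t word = __atomic_load_n(&s->pending[i], __ATOMIC_SEQ_CST);
        while (word != 0) {                              /* step 5 */
            unsigned j = (unsigned)__builtin_ctz(word);
            uint32_t bit = UINT32_C(1) << j;
            word &= word - 1;

            __atomic_fetch_or(&s->mask[i], bit, __ATOMIC_SEQ_CST);      /* step 6 */
            __atomic_fetch_and(&s->pending[i], ~bit, __ATOMIC_SEQ_CST); /* step 7 */
            handle(i * 32 + j);                                         /* step 8 */
            __atomic_fetch_and(&s->mask[i], ~bit, __ATOMIC_SEQ_CST);    /* step 9 */
            handled++;
        }
    }
    return handled;
}
```

Step 9 here is a bare clear, exactly as the list states it; it does not re-check for an event that became pending while the port was masked.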

Unmasking evtchn_upcall_mask and evtchn_mask[] needs special attention
because a pending interrupt will not automatically get raised as it
would on real hardware -- think of it like an edge-triggered
interrupt where you lost the edge because the line was masked. So what
we do in Linux is:

Clearing evtchn_upcall_mask:
 1. Clear evtchn_upcall_mask
 2. Barrier [just a compiler barrier, not a CPU barrier]
 3. If ( evtchn_upcall_pending) do_evtchn_processing()
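
The three steps for clearing evtchn_upcall_mask can be sketched as follows. This is a model under stated assumptions, not the Linux code: struct vcpu_flags, enable_upcalls, and count_processing are hypothetical names, and the barrier macro uses GCC/Clang inline-asm syntax for a compiler-only barrier.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two per-VCPU flags in vcpu_data[0]. */
struct vcpu_flags {
    uint8_t evtchn_upcall_pending;
    uint8_t evtchn_upcall_mask;
};

/* Compiler barrier only: stops reordering by the compiler, not the CPU. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Demonstration callback: counts how often event processing was run. */
unsigned processed;
void count_processing(void) { processed++; }

/* Returns 1 if work was pending at unmask time and was processed, else 0. */
int enable_upcalls(struct vcpu_flags *v, void (*do_evtchn_processing)(void))
{
    v->evtchn_upcall_mask = 0;          /* step 1: unmask          */
    barrier();                          /* step 2: order the check */
    if (v->evtchn_upcall_pending) {     /* step 3: catch lost edge */
        do_evtchn_processing();
        return 1;
    }
    return 0;
}
```

The check in step 3 is what makes up for the lost edge: nothing else will run the handler for an event that arrived while the mask was set.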

Clearing evtchn_mask[]:
 A bit more involved; see unmask_evtchn() in include/asm-xen/evtchn.h

Sticking close to the Linux code, and making sure the underlying
bitops are LOCKed, is important! I guess yours is unlikely to be a
subtle race if you only ever receive precisely one VIRQ. :-)

 -- Keir



ron minnich

2004-Aug-09 13:49 UTC

Re: [Xen-devel] fooey. no interrupts.

On Mon, 9 Aug 2004, Keir Fraser wrote:
> 
> The Linux code sets the evtchn_mask before clearing evtchn_pending,
> then clears the evtchn_mask after calling the interrupt handler. Are
> you doing the setting but forgetting the clearing?
sadly, no. 
> 
> The order that Linux has for this stuff, to avoid races, is:
> 
>  1. Test-and-clear evtchn_upcall_pending flag
>  2. Read-and-clear (XCHG) the evtchn_pending_sel
>  3. For each set bit @i in the sel:
>  4.   Read evtchn_pending[@i]
>  5.   For each set bit @j in the word:
>  6.     Set evtchn_mask[@i*32+@j]
>  7.     Clear evtchn_pending[@i*32+@j]
>  8.     ....do interrupt work...
>  9.     Clear evtchn_mask[@i*32+@j]

yeah, I'm actually using that code. 

oh well.

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





ron minnich

2004-Aug-09 13:50 UTC

Re: [Xen-devel] fooey. no interrupts.

On Mon, 9 Aug 2004, Keir Fraser wrote:
> Unmasking evtchn_upcall_mask and evtchn_mask[] need special attention
> because a pending interrupt will not automatically get raised as it
> would on real hardware ---- think of it like an edge-triggered
> interrupt where you lost the edge because the line was masked. So what
> we do in Linux is:

in any event, if clock interrupts are enabled, shouldn't I get an 
interrupt 500 ms later? Or am I misreading the code?

ron
-- 
LANL CCS-1 email flavor:
***** Correspondence   [X]
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [ ]
***** DUSA LACSI-CS    [ ]





Keir Fraser

2004-Aug-09 13:52 UTC

Re: [Xen-devel] fooey. no interrupts.

> On Mon, 9 Aug 2004, Keir Fraser wrote:
> 
> > Unmasking evtchn_upcall_mask and evtchn_mask[] need special attention
> > because a pending interrupt will not automatically get raised as it
> > would on real hardware ---- think of it like an edge-triggered
> > interrupt where you lost the edge because the line was masked. So what
> > we do in Linux is:
> 
> in any event, if clock interrupts are enabled, shouldn't I get an 
> interrupt 500 ms later? Or am I misreading the code.

Yes, you'll get the interrupt sometime later, the next time Xen is
returning execution to you.

 -- Keir



ron minnich

2004-Aug-09 13:54 UTC

Re: [Xen-devel] fooey. no interrupts.

On Mon, 9 Aug 2004, Keir Fraser wrote:
> > in any event, if clock interrupts are enabled, shouldn't I get an
> > interrupt 500 ms later? Or am I misreading the code.
> 
> Yes, you'll get the interrupt sometime later, the next time Xen is
> returning execution to you.

So that's the really weird problem. I'm not worried at this point that
they are delayed: the mask is 0 and they're not happening at all. 

Anyway, I've got a way to dump some info, so more later.

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





ron minnich

2004-Aug-09 14:12 UTC

Re: [Xen-devel] fooey. no interrupts.

OK, some more data. 

in a timer interrupt (first and only) I see this:
xentimerread: hz 0 cpu_hz 598736060 shadow_system_time 333ac5398180
 settin timer to 0x333bfecedc00

So the time is 0x333ac5398180, and I set the timer to 0x333bfecedc00 via 
HYPERVISOR_set_timer_op. I'm assuming that once the system time gets to
0x333bfecedc00 I'll get an interrupt. 

Now we wait ... in this loop, I'm printing the time and the values of the mask,
the pending, and the pending_sel values in shared info:

ipending: @0x333bfe262600 islo 0x0, pending 0x0, pending_sel 0x0
ipending: @0x333bfebebc80 islo 0x0, pending 0x0, pending_sel 0x0
ipending: @0x333bff575300 islo 0x0, pending 0x0, pending_sel 0x0
ipending: @0x333bffefe980 islo 0x0, pending 0x0, pending_sel 0x0
ipending: @0x333c00888000 islo 0x0, pending 0x0, pending_sel 0x0

Note that there is no change in pending as we cross the time at which an 
interrupt should have happened. I would expect pending to be set to 
some non-zero value once the system time had passed 0x333bfecedc00.

My reading of HYPERVISOR_set_timer_op is that you set an absolute value,
and when the time is greater than that value, you get called with an interrupt; 
is that wrong?
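
The arithmetic on the logged values can be checked with a short sketch. It assumes the times are in nanoseconds, which is consistent with the ipending lines being 10,000,000 units (10 ms) apart; the helper names ns_until and deadline_passed are made up for illustration. Under that assumption the deadline is about 5.26 s after shadow_system_time and falls between the second and third ipending samples.

```c
#include <stdint.h>

#define NS_PER_MS UINT64_C(1000000)

/* Values copied from the log in this message. */
#define T_NOW      UINT64_C(0x333ac5398180)  /* shadow_system_time         */
#define T_DEADLINE UINT64_C(0x333bfecedc00)  /* value given to set_timer_op */

/* Nanoseconds remaining until an absolute deadline (0 if already reached). */
uint64_t ns_until(uint64_t now, uint64_t deadline)
{
    return deadline > now ? deadline - now : 0;
}

/* True once the time is greater than the absolute deadline. */
int deadline_passed(uint64_t now, uint64_t deadline)
{
    return now > deadline;
}
```

For example, ns_until(T_NOW, T_DEADLINE) gives 5261056640, and deadline_passed() first becomes true at the sample taken at 0x333bff575300.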

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





Keir Fraser

2004-Aug-09 14:18 UTC

Re: [Xen-devel] fooey. no interrupts.

This is correct, and you should take an interrupt every 10ms in any
case. 

Interesting values to print are the master mask, master pending,
pending_sel, and the words in the pend and mask arrays that contain
the bits for the event channel that you are interested in.
(Remember that the event channel will have a different index to the
VIRQ number).
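
A sketch of such a dump, assuming 32-bit words so that port p lives in word p>>5 at bit p&31. The struct evt_dump layout and the function names are hypothetical; port is the event channel the VIRQ is bound to, not the VIRQ number itself, and must be below 1024 in this model.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical copy of the fields worth printing. */
struct evt_dump {
    uint8_t  upcall_mask;      /* master mask    */
    uint8_t  upcall_pending;   /* master pending */
    uint32_t pending_sel;
    uint32_t pending[32];
    uint32_t mask[32];
};

/* Index of the 32-bit word that holds a port's bit. */
unsigned evt_word(unsigned port) { return port >> 5; }

/* Bit within that word. */
uint32_t evt_bit(unsigned port) { return UINT32_C(1) << (port & 31); }

/* Formats the master flags, the selector, and the two words covering port. */
int format_evt_state(char *buf, size_t len,
                     const struct evt_dump *s, unsigned port)
{
    unsigned w = evt_word(port);
    return snprintf(buf, len,
        "upcall_mask=%u upcall_pending=%u sel=0x%08" PRIx32
        " pend[%u]=0x%08" PRIx32 " mask[%u]=0x%08" PRIx32,
        (unsigned)s->upcall_mask, (unsigned)s->upcall_pending,
        s->pending_sel, w, s->pending[w], w, s->mask[w]);
}
```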

 -- Keir
> OK, some more data. 
> 
> in a timer interrupt (first and only) I see this:
> xentimerread: hz 0 cpu_hz 598736060 shadow_system_time 333ac5398180
>  settin timer to 0x333bfecedc00
> 
> So the time is 333ac5398180, I set the timer to 0x333bfecedc00 via 
> HYPERVISOR_set_timer_op. I'm assuming that once the system time gets to
> 0x333bfecedc00 then I'll get an interrupt. 
> 
> Now we wait ... in this loop, I'm print time and the values of the mask,
> the pending, and the pending_sel values in shared info:
> 
> ipending: @0x333bfe262600 islo 0x0, pending 0x0, pending_sel 0x0
> ipending: @0x333bfebebc80 islo 0x0, pending 0x0, pending_sel 0x0
> ipending: @0x333bff575300 islo 0x0, pending 0x0, pending_sel 0x0
> ipending: @0x333bffefe980 islo 0x0, pending 0x0, pending_sel 0x0
> ipending: @0x333c00888000 islo 0x0, pending 0x0, pending_sel 0x0
> 
> note that there is no changing in pending as we cross the time for an 
> interrupt to have happened. I would expect to see pending to get set to 
> some non-zero value once the system time had passed 0x333bfecedc00.
> 
> My reading of HYPERVISOR_set_timer_op is that you set an absolute value,
> and when the time is > than that value, you get called with an interrupt;
> is that wrong?
> 
> ron
> 
> -- 
> LANL CCS-1 email flavor:
> ***** Correspondence   []
> ***** DUSA LACSI-HW    [ ]
> ***** DUSA LACSI-OS    [x ]
> ***** DUSA LACSI-CS    [ ]
> 
> 



ron minnich

2004-Aug-09 14:26 UTC

Re: [Xen-devel] fooey. no interrupts.

thanks again, you cleared it up for me; pilot error. 

I'm amazed it got this far, as in "How did this *EVER* work?"

Oh well, off to work, fix it tonight.

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





ron minnich

2004-Aug-09 19:54 UTC

Re: [Xen-devel] fooey. no interrupts.

On Mon, 9 Aug 2004, Keir Fraser wrote:
> Interesting values to print are the master mask, master pending,
> pending_sel, and the words in the pend and mask arrays that contain
> the bits for the event channel that you are interested in.
> (Remember that the event channel will have a different index to the
> VIRQ number).
I understand my confusion better. 

cli on linux on xen is this:
HYPERVISOR_shared_info->vcpu_data[0].evtchn_upcall_mask = 1;

That disables all interrupts? I'm confused on that: how does this relate
to the evtchn_mask? 

For cpu 0 do I need to clear BOTH of these for interrupts to happen, and 
then in the domain itself only mess with the one for vcpu 0? I'm looking
at the Linux U kernel code but want to make sure I get this right. Is this 
stuff really firmly tested and laid out, or still somewhat tentative due to 
the fact that it's not really tested with vcpu > 0?

thanks

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   [X]
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [ ]
***** DUSA LACSI-CS    [ ]




-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel

ron minnich

2004-Aug-09 20:24 UTC

Re: [Xen-devel] fooey. no interrupts.

One last note: 

I am (weirdly) getting further, a proc is running and asking for input. 

But I still get to this weird state: 
global irupt mask is 0xfffffff8, global pending is 7, pending_sel is 0, 
vcpu_data[0].mask is 0, and vcpu_data[0].evtchn_upcall_pending is 0. 

Seems to me I should be taking some async upcalls. 

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





Keir Fraser

2004-Aug-09 23:25 UTC

Re: [Xen-devel] fooey. no interrupts.

> One last note: 
> 
> I am (weirdly) getting further, a proc is running and asking for input. 
> 
> But I still get to this weird state: 
> global irupt mask is 0xfffffff8, global pending is 7, pending_sel is 0, 
> vcpu_data[0].mask is 0, and vcpu_data[0].evtchn_upcall_pending is 0. 
> 
> Seems to me I should be taking some async upcalls. 

What's 'global irupt mask' and 'global pending'?

You won't get an async upcall until evtchn_upcall_pending becomes
non-zero. That won't occur until one of the bits in pending_sel
becomes set. Which, in turn, won't occur until an event channel whose
evtchn_mask[] bit is clear has its bit set in the evtchn_pending[] array.
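
A small sketch of how to spot the state described in the quoted report, assuming that 'global irupt mask' and 'global pending' mean evtchn_mask[0] and evtchn_pending[0]. The function names are hypothetical. With mask 0xfffffff8 and pending 7, ports 0-2 are pending and unmasked, yet pending_sel and evtchn_upcall_pending are both 0, so nothing will announce them and the guest has to scan for them itself.

```c
#include <stdint.h>

/* Ports in one 32-bit word that are pending and not masked. */
uint32_t unmasked_pending(uint32_t pending_word, uint32_t mask_word)
{
    return pending_word & ~mask_word;
}

/*
 * True when deliverable events exist in word word_index (0..31) but neither
 * the selector bit nor the upcall-pending flag records them.
 */
int needs_manual_scan(uint32_t pending_word, uint32_t mask_word,
                      uint32_t pending_sel, unsigned word_index,
                      uint8_t upcall_pending)
{
    uint32_t sel_bit = UINT32_C(1) << (word_index & 31);
    return unmasked_pending(pending_word, mask_word) != 0 &&
           (pending_sel & sel_bit) == 0 &&
           !upcall_pending;
}
```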

 -- Keir



Keir Fraser

2004-Aug-09 23:32 UTC

Re: [Xen-devel] fooey. no interrupts.

> I understand my confusion better. 
> 
> cli on linux on xen is this:
> HYPERVISOR_shared_info->vcpu_data[0].evtchn_upcall_mask = 1;
> 
> That disables all interrupts? I'm confused on that, how does this relate
> to the evtchn_mask? 

The purpose of the evtchn_mask[] array is to disallow callbacks at
per-channel granularity, e.g., for scheduling purposes.

The evtchn_upcall_mask is intended to disallow callbacks in general,
when your OS is in a state where it cannot handle them; i.e., it's there to
allow easy reentrancy control in your OS.

> For cpu 0 do I need to clear BOTH of these for interrupts to happen, and 
> then in the domain itself only mess with the one for vcpu 0? I'm looking
> at linux U kernel code but want to make sure I get this right. Is this 
> stuff really firmly tested and laid out or still somewhat tentative due to 
> the fact that it's not really tested with vcpu > 0?

For a particular event channel @e to fire you an async callback, you
need:
 1. The original value of bit @e in evtchn_pending[] must be zero.
 2. The value of bit @e in evtchn_mask[] must be zero.
 3. The original value of bit (@e>>5) in evtchn_pending_sel must be zero.
 4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
 5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.

If these 5 requirements are satisfied then you _will_ receive a
callback.

 -- Keir



ron minnich

2004-Aug-10 03:15 UTC

Re: [Xen-devel] fooey. no interrupts.

I've got a question about this code. 


static inline void evtchn_set_pending(struct domain *d, int port)
{
    shared_info_t *s = d->shared_info;
    if ( !test_and_set_bit(port,    &s->evtchn_pending[0]) &&
         !test_bit        (port,    &s->evtchn_mask[0])    &&
         !test_and_set_bit(port>>5, &s->evtchn_pending_sel) )
    {
        /* The VCPU pending flag must be set /after/ update to evtchn-pend. */
        s->vcpu_data[0].evtchn_upcall_pending = 1;
        guest_async_callback(d);
    }
}

So you'll get an upcall IFF: the bit 'port' in evtchn_pending WAS 0, the
bit 'port' in the mask IS 0, and the bit 'port >> 5' in the
evtchn_pending_sel WAS 0. 

OK, here's my question: suppose the first test_and_set_bit fails because
the bit in evtchn_pending[0] was already set? You'll never get called,
that's what, as far as I can tell. And this is exactly what I'm seeing.
I've got bits 0,1,2 set in evtchn_pending, but the guest_async_callback is
never happening, since the test_and_set_bit returns 1.  I'm missing an
interrupt, due to a plethora of debug prints in my kernel, and I'm not
seeing another one.

To me, it looks like I'm exercising a race condition in the function 
shown above. 

Here is my question: why isn't this code something like:

static inline void evtchn_set_pending(struct domain *d, int port)
{
    shared_info_t *s = d->shared_info;
    set_bit(port, &s->evtchn_pending[0]);
    if ( !test_bit        (port,    &s->evtchn_mask[0])    &&
         !test_and_set_bit(port>>5, &s->evtchn_pending_sel) )
    {
        /* The VCPU pending flag must be set /after/ update to evtchn-pend. */
        s->vcpu_data[0].evtchn_upcall_pending = 1;
        guest_async_callback(d);
    }
}

In other words, I don't see the reason for the first test_and_set_bit, 
given that the bit may have been set by an earlier call to 
evtchn_set_pending while the channel was masked; the next time you call, 
the first test_and_set_bit will fail. 

So, what's the reason for that first TAS? 
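
The scenario can be replayed with a single-threaded model of the first quoted function. This is a sketch, not Xen code: struct set_model and model_set_pending are hypothetical, plain integer operations stand in for the atomic bitops, and only ports 0-31 are modeled, so port>>5 is always 0.

```c
#include <stdint.h>

/* Hypothetical one-word model of the state evtchn_set_pending touches. */
struct set_model {
    uint32_t pending;        /* evtchn_pending[0]               */
    uint32_t mask;           /* evtchn_mask[0]                  */
    uint32_t sel;            /* evtchn_pending_sel              */
    int      upcall_pending; /* evtchn_upcall_pending           */
    unsigned callbacks;      /* counts guest_async_callback()   */
};

/* Same short-circuit structure as the quoted function, for port < 32. */
void model_set_pending(struct set_model *m, unsigned port)
{
    uint32_t bit = UINT32_C(1) << (port & 31);

    int was_pending = (m->pending & bit) != 0;
    m->pending |= bit;                 /* test_and_set_bit on pending */
    if (was_pending)
        return;                        /* not a 0->1 edge: stop here  */
    if (m->mask & bit)
        return;                        /* masked: the edge is dropped */

    int sel_was_set = (m->sel & 1u) != 0;
    m->sel |= 1u;                      /* test_and_set_bit on the sel */
    if (sel_was_set)
        return;

    m->upcall_pending = 1;
    m->callbacks++;
}
```

Calling it once with the port masked, then clearing the mask bit by hand and calling it again, leaves callbacks at 0 in this model, which matches the behaviour described in the message.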

thanks

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   [X]
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [ ]
***** DUSA LACSI-CS    [ ]





ron minnich

2004-Aug-10 03:24 UTC

Re: [Xen-devel] fooey. no interrupts.

On Tue, 10 Aug 2004, Keir Fraser wrote:
> For a particular event channel @e to fire you an async callback, you
> need:
>  1. The original value of bit @e in evtchn_pending[] must be zero.
>  2. The value of bit @e in evtchn_mask[] must be zero.

Is there a race condition here? Let's pretend this is the 10ms interrupt:
	@e gets set in evtchn_pending
	@e in evtchn_mask is set (not zero) because it is still masked,
		as Plan 9 is still in the interrupt printing lots of info 
		for me. 

From my reading of this, if your interrupt handler takes more than 10 ms,
then you'll never get another timer interrupt @e, since the evtchn_pending
bit is now 1. Is there any way in which a callback for timer ints will occur 
if @e in evtchn_pending was set and the mask is now zero? I can't see it.

>  3. The original value of bit (@e>>5) in evtchn_pending_sel must be zero.

OK, this I can see. 

>  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
>  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.

OK, this I don't totally see. From the code I posted before, it seems to
me only the first three conditions matter. 

Thanks!

ron

-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





ron minnich

2004-Aug-10 03:40 UTC

Re: [Xen-devel] fooey. no interrupts.

Further thinking about the whole interrupt thing. It seems to me that all 
the interrupts are edge-triggered, see: 
     *  2. MASK -- if this bit is clear then a 0->1 transition of PENDING
     *     will cause an asynchronous upcall to be scheduled. This bit is only
     *     updated by the guest. It is read-only within Xen. If a channel
     *     becomes pending while the channel is masked then the 'edge' is lost
     *     (i.e., when the channel is unmasked, the guest must manually handle
     *     pending notifications as no upcall will be scheduled by Xen).

But what we want in some cases (timer in particular) are level interrupts. 
So this code: 

static inline void evtchn_set_pending(struct domain *d, int port)
{
    shared_info_t *s = d->shared_info;
    if ( !test_and_set_bit(port,    &s->evtchn_pending[0]) &&
         !test_bit        (port,    &s->evtchn_mask[0])    &&
         !test_and_set_bit(port>>5, &s->evtchn_pending_sel) )

	etc. 

is really testing for edges (which is fine) but in some cases we really do 
want a level. 

Sadly this does complicate life, but at the same time I'd argue that
VIRQ_TIMER should be a level interrupt. I can't see any way out of this
race condition otherwise.

Does this make sense or am I totally off base? I do think the comment 
above (from hypervisor-if.h) very clearly explains the potential race 
condition. I've fallen into it in a big way, but I think it is a problem
others may fall into as well. 

I'm going to add a trivial function evtchn_set_pending_level and call it
out of send_guest_virq and see if it helps my problem. My guess is it 
will. 

Thanks for your patience on this one, Keir.

ron
-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]





Keir Fraser

2004-Aug-10 09:07 UTC

Re: [Xen-devel] fooey. no interrupts.

> Sadly this does complicate life but at the same time I'd argue that
> VIRQ_TIMER should be a level interrupt. I can't see any way out of this
> race condition otherwise.
> 
> Does this make sense or am I totally off base? I do think the comment 
> above (from hypervisor-if.h) very clearly explains the potential race 
> condition. I've fallen into it in a big way, but I think it is a problem
> others may fall into as well. 
> 
> I'm going to add a trivial function evtchn_set_pending_level and call it
> out of send_guest_virq and see if it helps my problem. My guess is it 
> will. 

If you use this function (from evtchn.h) to unmask individual event
channels then you will not experience the race. Changes to Xen are not
required: 

[NB. synch_*_bit forces use of SMP-safe atomic bit operations (i.e., on 
 x86 they will use the LOCK prefix). I need to deliberately specify
 this because the guest OS is UP, and so its usual *_bit operations
 are not SMP-safe! You may want to watch out for this one yourself.]

static inline void unmask_evtchn(int port)
{
    shared_info_t *s = HYPERVISOR_shared_info;

    synch_clear_bit(port, &s->evtchn_mask[0]);

    /*
     * The following is basically the equivalent of 'hw_resend_irq'. Just
     * like a real IO-APIC we 'lose the interrupt edge' if the channel is
     * masked.
     */
    if (  synch_test_bit        (port,    &s->evtchn_pending[0]) &&
         !synch_test_and_set_bit(port>>5, &s->evtchn_pending_sel) )
    {
        s->vcpu_data[0].evtchn_upcall_pending = 1;
        if ( !s->vcpu_data[0].evtchn_upcall_mask )
            force_evtchn_callback();
    }
}
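
The effect of unmask_evtchn() can be checked with a single-threaded model. This is a sketch under assumptions: struct unmask_model and model_unmask_evtchn are hypothetical names, plain integer operations replace the synch_* bitops, only ports 0-31 are modeled, and a counter stands in for force_evtchn_callback().

```c
#include <stdint.h>

/* Hypothetical one-word model of the state unmask_evtchn() touches. */
struct unmask_model {
    uint32_t pending;        /* evtchn_pending[0]                   */
    uint32_t mask;           /* evtchn_mask[0]                      */
    uint32_t sel;            /* evtchn_pending_sel                  */
    int      upcall_pending; /* vcpu_data[0].evtchn_upcall_pending  */
    int      upcall_mask;    /* vcpu_data[0].evtchn_upcall_mask     */
    unsigned forced;         /* counts force_evtchn_callback() uses */
};

/* Mirrors the control flow of unmask_evtchn() for port < 32. */
void model_unmask_evtchn(struct unmask_model *m, unsigned port)
{
    uint32_t bit = UINT32_C(1) << (port & 31);

    m->mask &= ~bit;                       /* synch_clear_bit on the mask */

    /* Re-raise by hand an event that arrived while the port was masked. */
    if ((m->pending & bit) && !(m->sel & 1u)) {
        m->sel |= 1u;                      /* test-and-set of the sel bit */
        m->upcall_pending = 1;
        if (!m->upcall_mask)
            m->forced++;                   /* force_evtchn_callback()     */
    }
}
```

Starting from a pending, masked port, one call clears the mask bit and counts one forced callback; if upcalls are masked it only sets upcall_pending, leaving delivery to whoever clears evtchn_upcall_mask later.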



Christian Limpach

2004-Aug-10 09:07 UTC

Re: [Xen-devel] fooey. no interrupts.

On Mon, Aug 09, 2004 at 09:24:44PM -0600, ron minnich wrote:
> > For a particular event channel @e to fire you an async callback, you
> > need:
> >  1. The original value of bit @e in evtchn_pending[] must be zero.
> >  2. The value of bit @e in evtchn_mask[] must be zero.
> 
> Is there a race condition here? Let's pretend this is the 10ms interrupt
> 	@e gets set in evtchn_pending
> 	@e in evtchn_mask is set (not zero) because it is still masked
> 		as Plan is still in the interrupt printing lots of info 
> 		for me. 

You have to check for a pending interrupt when you unmask an interrupt.
This can be done atomically by disabling interrupts with evtchn_upcall_mask.

> >  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
> >  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.
> 
> OK, this I don't totally see. From the code I posted before, it seems to
> me only the first three conditions matter. 

4 is not true: we don't test it, and we set it unconditionally.

    christian




Keir Fraser

2004-Aug-10 09:09 UTC

Re: [Xen-devel] fooey. no interrupts.

> >  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
> >  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.
> 
> OK, this I don't totally see. From the code I posted before, it seems to
> me only the first three conditions matter. 

The first three conditions cause us to decide whether or not to
schedule the target domain, sending a cross-cpu interrupt if
necessary. The final two are checks just before calling back to the
guest OS, just to check whether it is in a position to receive async
callbacks. The final two are only ever accessed on the CPU that is
running the guest, which is why we can access/update them using
non-atomic operations and compiler barriers (rather than atomic ops
and CPU barriers).

 -- Keir
> Thanks!
> 
> ron
> 
> -- 
> LANL CCS-1 email flavor:
> ***** Correspondence   []
> ***** DUSA LACSI-HW    [ ]
> ***** DUSA LACSI-OS    [x ]
> ***** DUSA LACSI-CS    [ ]
> 
> 



ron minnich

2004-Aug-10 13:32 UTC

Re: [Xen-devel] fooey. no interrupts.

On Tue, 10 Aug 2004, Keir Fraser wrote:
> 
> > >  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
> > >  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.
> > 
> > OK, this I don't totally see. From the code I posted before, it seems to
> > me only the first three conditions matter.
> 
> The first three conditions cause us to decide whether or not to
> schedule the target domain, sending a cross-cpu interrupt if
> necessary. The final two are checks just before calling back to the
> guest OS, just to check whether it is in a position to receive async
> callbacks. 
Keir, I don't see that in the code, and Christian sent a note that left me
thinking it does not work that way.

As Christian said, (4) doesn't do anything conditional; it does this:
        /* The VCPU pending flag must be set /after/ update to evtchn-pend. */
        s->vcpu_data[0].evtchn_upcall_pending = 1;
        guest_async_callback(d);

which looks pretty unconditional to me. Is there something else I'm
missing?

thanks

ron


-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [x ]
***** DUSA LACSI-OS    [ ]
***** DUSA LACSI-CS    [ ]





Keir Fraser

2004-Aug-10 13:35 UTC

Re: [Xen-devel] fooey. no interrupts.

> On Tue, 10 Aug 2004, Keir Fraser wrote:
> 
> > 
> > > >  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
> > > >  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.
> > > 
> > > OK, this I don't totally see. From the code I posted before, it seems to
> > > me only the first three conditions matter.
> > 
> > The first three conditions cause us to decide whether or not to
> > schedule the target domain, sending a cross-cpu interrupt if
> > necessary. The final two are checks just before calling back to the
> > guest OS, just to check whether it is in a position to receive async
> > callbacks. 
> 
> Keir, I don't see that in the code and Christian sent a note that left me
> thinking it does not work that way.
> 
> as Christian said, (4) doesn't do anything conditional, it does this:
>         /* The VCPU pending flag must be set /after/ update to evtchn-pend. */
>         s->vcpu_data[0].evtchn_upcall_pending = 1;
>         guest_async_callback(d);
> 
> which looks pretty unconditional to me. Is there something else I'm
> missing?
No, I'd forgotten how the code worked -- Christian is correct.
evtchn_upcall_pending is set unconditionally on the CPU that is
transmitting the event. evtchn_upcall_mask is only checked immediately
before return to your guest OS to determine whether or not to create
an async callback frame.

 -- Keir



Christian Limpach

2004-Aug-10 13:51 UTC

Re: [Xen-devel] fooey. no interrupts.

On Tue, Aug 10, 2004 at 02:35:08PM +0100, Keir Fraser wrote:
> > On Tue, 10 Aug 2004, Keir Fraser wrote:
> > 
> > > 
> > > > >  4. The original value of vcpu_data[0].evtchn_upcall_pending must be zero.
> > > > >  5. The value of vcpu_data[0].evtchn_upcall_mask must be zero.
> > > > 
> > > > OK, this I don't totally see. From the code I posted before, it seems to
> > > > me only the first three conditions matter.
> > > 
> > > The first three conditions cause us to decide whether or not to
> > > schedule the target domain, sending a cross-cpu interrupt if
> > > necessary. The final two are checks just before calling back to the
> > > guest OS, just to check whether it is in a position to receive async
> > > callbacks.
> > 
> > Keir, I don't see that in the code and Christian sent a note that left me
> > thinking it does not work that way.
> > 
> > as Christian said, (4) doesn't do anything conditional, it does this:
> >         /* The VCPU pending flag must be set /after/ update to evtchn-pend. */
> >         s->vcpu_data[0].evtchn_upcall_pending = 1;
> >         guest_async_callback(d);
> > 
> > which looks pretty unconditional to me. Is there something else I'm
> > missing?
> 
> No, I'd forgotten how the code worked -- Christian is correct.
> evtchn_upcall_pending is set unconditionally on the CPU that is
> transmitting the event. evtchn_upcall_mask is only checked immediately
> before return to your guest OS to determine whether or not to create
> an async callback frame.
For completeness' sake: evtchn_upcall_pending is also checked
immediately before return to your guest OS to determine whether
or not to create an async callback frame. The two flags have opposite
senses: evtchn_upcall_pending needs to be set, while evtchn_upcall_mask
needs to be clear.
See /*test_guest_events:*/ in xen/arch/x86/x86_32/entry.S
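
Put as C rather than assembly, the two halves described in this exchange might look like the sketch below. It is a paraphrase of what the thread says, not a transcription of entry.S: struct vcpu_data_sketch and both function names are hypothetical, and only the two flag names are taken from the messages above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for vcpu_data[0]; field names follow the thread. */
struct vcpu_data_sketch {
    uint8_t evtchn_upcall_pending;
    uint8_t evtchn_upcall_mask;
};

/* Sender side: the flag is set unconditionally, with no test of its
 * previous value. */
void mark_upcall_pending_sketch(struct vcpu_data_sketch *v)
{
    v->evtchn_upcall_pending = 1;
}

/* Return-to-guest side: a callback frame is created only when the pending
 * flag is set and the mask flag is clear. */
bool should_create_callback_frame_sketch(const struct vcpu_data_sketch *v)
{
    return v->evtchn_upcall_pending != 0 && v->evtchn_upcall_mask == 0;
}
```

Under this reading, an event that arrives while evtchn_upcall_mask is set is not lost by Xen: the pending flag stays set, and it is the guest's job to notice it when it clears the mask.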

    christian



-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel

ron minnich

2004-Aug-10 15:46 UTC

Re: [Xen-devel] fooey. no interrupts.

It looks like I'm falling into the gap between how Linux does interrupts
and how Plan 9 does them.

I'm still seeing the race condition, even with the suggestions you posted
in place in my interrupt handler. I'll try to work this a bit more, but it
does seem to me that if you're taking too long in your interrupt handler
you will get bit by the race -- Xen will, I assume, pre-empt dom1 in the
event of a clock interrupt, and at that point it is game over. I think
long term a "level interrupt" construct may be needed, but that is
conjecture on my part.

Trying to latch levels in a UP is a lot different than latching them in
hardware, since the hardware is more or less running all the time, but the
Xen/Dom0/Dom1 is all time shared. Looking at the race sequence I think I'm
seeing, I'm still not convinced it is completely avoidable, but I will hope
I am wrong.
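
One way a guest could approximate level-triggered behaviour on its side is to re-test evtchn_upcall_pending every time it clears evtchn_upcall_mask, and to keep running the handler until the flag stays clear. The sketch below only illustrates that idea; it is not taken from the Plan 9 port or from Linux, and it is not claimed to close the race discussed here. struct vcpu_flags_sketch, enable_upcalls_sketch() and the sample handler are hypothetical names, and the handler is assumed to clear evtchn_upcall_pending before it returns.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-VCPU flags named in the thread. */
struct vcpu_flags_sketch {
    volatile uint8_t evtchn_upcall_pending;
    volatile uint8_t evtchn_upcall_mask;
};

typedef void (*upcall_fn)(struct vcpu_flags_sketch *);

/* Sample handler for illustration: a real one would scan the event
 * bitmaps; this one only clears the pending flag. */
void clear_pending_handler_sketch(struct vcpu_flags_sketch *v)
{
    v->evtchn_upcall_pending = 0;
}

/*
 * Re-enable async callbacks, then re-test the pending flag so that events
 * latched while callbacks were masked are still handled.
 * Returns how many times the handler was run by hand.
 */
int enable_upcalls_sketch(struct vcpu_flags_sketch *v, upcall_fn do_upcall)
{
    int runs = 0;

    for (;;) {
        v->evtchn_upcall_mask = 0;
        atomic_signal_fence(memory_order_seq_cst); /* unmask before test */
        if (!v->evtchn_upcall_pending)
            break;
        /* Run the handler with callbacks masked, as a real upcall would. */
        v->evtchn_upcall_mask = 1;
        atomic_signal_fence(memory_order_seq_cst);
        do_upcall(v); /* must clear evtchn_upcall_pending */
        runs++;
    }
    return runs;
}
```

A routine like spllo() could call enable_upcalls_sketch() instead of simply clearing the mask, so that a timer event delivered during a long handler is picked up as soon as callbacks are re-enabled.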

thanks for the good suggestions!

ron


-- 
LANL CCS-1 email flavor:
***** Correspondence   []
***** DUSA LACSI-HW    [ ]
***** DUSA LACSI-OS    [x ]
***** DUSA LACSI-CS    [ ]




