thr3ads.net - Xen devel - [Xen-devel] [PATCH 0/5] Add MSI support to XEN [Mar 2008]


Shan, Haitao

2008-Mar-27 06:55 UTC

[Xen-devel] [PATCH 0/5] Add MSI support to XEN

Hi, Keir,
 
    These patches are a rebased version of Yunhong's original patches,
which were sent out before XEN 3.2 was released. They enable MSI
support and limited MSI-X support in XEN. Here is the original
description of the patches from Yunhong's mail.
 
The basic ideas are:
1) Keep the vector as a global resource owned by Xen, while splitting
pirq into per-domain information.
2) The Domain0 kernel will operate the MSI resource for domain0/domU,
while QEMU will operate the MSI resource for HVM domains.
3) Xen will do the EOI for MSI interrupts.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
 
    Few changes have been made compared with the original patches. But
there are some issues on which we need your comments.
    1> The ACK-NEW method is necessary to avoid the IRQ storm, but it
causes a deadlock.
         During my tests, I did find that a deadlock can occur with the
patches applied. When a NIC device is assigned to an HVM domain, the
scenario is: Dom0 is waiting for the IDE interrupt (vector 0x21); the
HVM domain is waiting for qemu's IDE emulation and is thus blocked; the
NIC interrupt (MSI vector 0x31) is waiting for injection into the HVM
domain, since the domain is blocked; and the IDE interrupt is waiting
behind the NIC interrupt, since the NIC interrupt is of higher priority
but has not yet been ACKed by XEN. The deadlock is easy to observe when
the IDE and NIC interrupts are delivered to the same CPU and the guest
OS is Vista.
    2> Without ACK-NEW, some naughty NIC devices, as we observed, will
cause IRQ storms. Yunhong can comment more on this. Basically, writing
EOI without masking the source of the MSI causes an IRQ storm. Although
the reason is under investigation, XEN should handle such a bogus
device anyhow, right?
    3> Using ACK-OLD and masking the MSI when writing EOI could be a
solution. However, XEN does not own the PCI configuration space.
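
The deadlock in item 1> follows from how the local APIC prioritises
vectors. As a rough illustration, here is a minimal C sketch, assuming a
simplified local APIC model that ignores the TPR; the struct and function
names are hypothetical and are not taken from the patches. A vector's
priority class is its upper four bits, and a pending vector is dispatched
only if its class is strictly higher than that of the highest in-service
vector, so an un-EOIed NIC vector 0x31 (class 3) holds back the IDE vector
0x21 (class 2) on the same CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a local APIC in-service register (ISR):
 * one bit per vector, 256 vectors. The TPR is ignored. */
struct lapic_model {
    uint32_t isr[8];
};

/* The priority class of a vector is its upper four bits. */
static unsigned int prio_class(unsigned int vector)
{
    assert(vector < 256);
    return vector >> 4;
}

/* Mark a vector as in service (dispatched, not yet EOIed). */
static void lapic_accept(struct lapic_model *l, unsigned int vector)
{
    assert(vector < 256);
    l->isr[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* EOI clears the highest-numbered in-service vector. */
static void lapic_eoi(struct lapic_model *l)
{
    for (int v = 255; v >= 0; v--) {
        if (l->isr[v / 32] & (UINT32_C(1) << (v % 32))) {
            l->isr[v / 32] &= ~(UINT32_C(1) << (v % 32));
            return;
        }
    }
}

/* Priority class of the highest in-service vector, 0 when idle. */
static int highest_isr_class(const struct lapic_model *l)
{
    for (int v = 255; v >= 0; v--)
        if (l->isr[v / 32] & (UINT32_C(1) << (v % 32)))
            return v >> 4;
    return 0;
}

/* A pending vector is dispatched only if its class is strictly
 * higher than that of the highest in-service vector. */
static bool lapic_can_deliver(const struct lapic_model *l,
                              unsigned int vector)
{
    return (int)prio_class(vector) > highest_isr_class(l);
}
```

In this model, once lapic_accept() has been called for 0x31,
lapic_can_deliver() returns false for 0x21 until lapic_eoi() runs, which
is the dependency that ACK-NEW turns into a deadlock when the EOI has to
wait for a blocked guest.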
 
    We also tried some workarounds.
    One workaround might be to use a timer to force an EOI within some
time interval. This method is already implemented in the VT-d code.
However, with this approach, once the timer fires and the EOI is
written, this is essentially the same approach as option 2.
    Another approach is never to deliver these two IRQs to the same CPU.
But this is really ugly and cannot be applied to UP systems.
    We have also considered using the VT-d 2 interrupt remapping
feature. According to the spec, there is no bit in the remapping table
to mask the interrupt. Therefore, this cannot be combined with option 2
to solve the issue. Masking the interrupt still requires access to the
PCI configuration space.
 
    We think the cleanest method may be to move ownership of the PCI
configuration space from dom0 to the VMM. However, this is a big
change. It should be well discussed in the community, and it needs your
comments.
 
    This patch series can serve as discussion material. What are your
comments on the patches and the issues, Keir?
    
Thanks!
Haitao Shan
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2008-Mar-27 07:56 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Thanks,

I'll have to look at the patches regarding the per-domain pirq changes. That
sounds like it probably makes sense, but I seem to remember there were big
changes to the irq architecture and irq naming in the hypervisor in previous
iterations of these patches, which I didn't understand.

This IRQ storm issue still needs properly resolving. No one has yet explained
how a message-based interrupt source can cause an irq storm. Storms are
inherently a property of level-triggered sources, where ACK/EOI immediately
causes re-sampling of the interrupt line and re-assertion of the interrupt
at the CPU. How can anything similar happen with MSI? You (Intel) are
probably uniquely placed to answer this question, since you manufacture the
chipset and NIC which exhibit this problem.

 -- Keir


Espen Skoglund

2008-Mar-27 17:32 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Preventing interrupt storms by masking the interrupt in the MSI/MSI-X
capability structure or MSI-X table within the interrupt handler is
insane.  It requires accesses over the PCI/PCIe bus and is clearly
something you want to avoid on the fast path.
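
For reference, the per-vector mask under discussion lives in the MSI-X
table, which the device exposes through a memory BAR. The following is a
minimal C sketch of what masking one entry involves, assuming the table
has already been mapped; the helper names are hypothetical and an ordinary
array can stand in for the MMIO mapping. The read-back after the write is
a common way to make sure a posted write has reached the device, and it is
that round trip over the bus that makes masking costly on the fast path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One MSI-X table entry: four 32-bit fields, 16 bytes in total. */
struct msix_entry {
    uint32_t msg_addr_lo;
    uint32_t msg_addr_hi;
    uint32_t msg_data;
    uint32_t vector_ctrl;   /* bit 0 is the per-vector mask bit */
};

#define MSIX_VECTOR_CTRL_MASKBIT UINT32_C(1)

/* Set or clear the mask bit of entry idx (hypothetical helper). */
static void msix_set_mask(volatile struct msix_entry *table,
                          unsigned int idx, bool masked)
{
    assert(table);

    uint32_t ctrl = table[idx].vector_ctrl;   /* read over the bus */

    if (masked)
        ctrl |= MSIX_VECTOR_CTRL_MASKBIT;
    else
        ctrl &= ~MSIX_VECTOR_CTRL_MASKBIT;

    table[idx].vector_ctrl = ctrl;            /* posted write */
    (void)table[idx].vector_ctrl;             /* read back to flush it */
}

/* Report whether entry idx is currently masked. */
static bool msix_is_masked(const volatile struct msix_entry *table,
                           unsigned int idx)
{
    assert(table);
    return (table[idx].vector_ctrl & MSIX_VECTOR_CTRL_MASKBIT) != 0;
}
```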

	eSk



Caitlin Bestler

2008-Mar-27 22:09 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Espen Skoglund wrote:
> Preventing interrupt storms by masking the interrupt in the MSI/MSI-X
> capability structure or MSI-X table within the interrupt handler is
> insane.  It requires accesses over the PCI/PCIe bus and is clearly
> something you want to avoid on the fast path.
>
> 	eSk
>
I agree. Interrupt mitigation schemes should already be part of the
host/device interface that is being assigned to the HVM guest. The
HVM guest should already know how to use it.

> [Haitao Shan]
> >     There are no much changes made compared with the original patches.
> > But there do have some issues that we need your kind comments.
>
> >   1> ACK-NEW method is necessary to avoid IRQ storm. But it causes the
> > deadlock.
> >          During my tests, I do find there can be deadlock with patches
> > applied. When assigned a NIC device to HVM domain, the scenario is: Dom0
> > is waiting to IDE interrupt (vector 0x21); HVM domain is waiting for
> > qemu's IDE emulation and thus blocked; NIC interrupt (MSI vector 0x31)
> > is waiting for injection to HVM domain since it is blocked now; IDE
> > interrupt is waiting for NIC interrupt since NIC interrupt is of high
> > priority but not ACKed by XEN now. When IDE interrupt and NIC interrupt
> > are delivered to the same CPU, and when guest OS is Vista, the
> > phenomenon is easy to be observed.
>
> >   2> Without ACK-NEW, some naughty NIC devices as we observed will
> > bring IRQ storms. For this phenomenon, I think Yunhong can comment more.
> > Basically, writing EOI without mask the source of MSI will bring IRQ
> > storm. Although the reason is under investigation, XEN should anyhow
> > handle such bogus device, right?

Device assignment should deliver the device to the HVM, with all of its
warts as well as all of its features. Isn't the ultimate point to use
the same driver in the HVM guest whether Xen is present or not?



Jiang, Yunhong

2008-Mar-28 01:48 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

We do not mask every time an interrupt happens. Instead, we mask only
when a second interrupt arrives while the previous one is still
pending. It should be something like handle_edge_irq() in upstream
Linux.
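
The flow referred to here can be sketched as follows. This is a
simplified, single-threaded C model loosely patterned on the Linux edge
handler, not the kernel code itself; the struct, the counters and the
nested-interrupt handler are hypothetical. The source is masked only when
a new edge arrives while the previous one is still being handled, and it
is unmasked again before the pending edge is replayed.

```c
#include <assert.h>
#include <stddef.h>

enum {
    IRQ_INPROGRESS = 1 << 0,   /* a handler is currently running */
    IRQ_PENDING    = 1 << 1,   /* another edge arrived meanwhile */
    IRQ_MASKED     = 1 << 2,   /* source is masked at the device */
};

struct irq_model {
    unsigned int status;
    unsigned int mask_accesses;   /* counts mask/unmask device accesses */
    unsigned int handled;         /* counts handler invocations */
    void (*handler)(struct irq_model *d);
};

static void edge_irq(struct irq_model *d)
{
    assert(d != NULL);

    /* Second edge while the first is still in progress: record it,
     * mask the source, and let the running instance replay it. */
    if (d->status & IRQ_INPROGRESS) {
        d->status |= IRQ_PENDING | IRQ_MASKED;
        d->mask_accesses++;
        return;
    }

    d->status |= IRQ_INPROGRESS;
    do {
        unsigned int both = IRQ_PENDING | IRQ_MASKED;

        /* Unmask before replaying an edge that was masked above. */
        if ((d->status & both) == both) {
            d->status &= ~(unsigned int)IRQ_MASKED;
            d->mask_accesses++;
        }
        d->status &= ~(unsigned int)IRQ_PENDING;

        d->handled++;
        if (d->handler)
            d->handler(d);
    } while (d->status & IRQ_PENDING);
    d->status &= ~(unsigned int)IRQ_INPROGRESS;
}

/* Example handler: during its first run a second edge arrives. */
static void handler_nested_once(struct irq_model *d)
{
    if (d->handled == 1)
        edge_irq(d);
}
```

With no nested edge the model never touches the device mask, so the cost
of masking is paid only in the overlapping case.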

-- Yunhong Jiang


Keir Fraser

2008-Mar-28 07:24 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

This requires the guest to call back into Xen to signal EOI (as we already
do for legacy level-triggered interrupts). We shouldn't really need to do
that for MSI and it's rather more expensive than a couple of accesses over
the PCI bus!

It's this callback into Xen, which we do not really understand why it's
needed, which I'm railing against. Is there some fundamental aspect of MSI
we do not understand, or are we working around one brain-dead or buggy
device?

 -- Keir


Jiang, Yunhong

2008-Mar-28 08:40 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

I'd like to describe some experiments I did after I discovered this issue.

The device was an 82575EB NIC, and the driver I used was igb 1.0.8
(search http://sourceforge.net/project/showfiles.php?group_id=42302 for
it).
The LSC interrupt is a link status change interrupt. It can happen
physically, or it can be triggered by software, as the driver does in
igb_open() at igb_main.c line 1496, which writes to a special register
(E1000_ICS) to trigger an interrupt event.

I repeated the experiment on Linux 2.6.23 with this driver. I tried to
a) change handle_edge_irq() from mask/ack to ack-only when an interrupt
arrives while the previous one is still in progress (see the patch
below), and b) comment out line 1496 in the driver.

The results are:
1) With mask and ack, the interrupt happens 3 times; the last 2 are
masked because they arrive while the first one is still pending for the
ISR's handler. The system is OK.
2) With ack and no mask, the interrupt happens continuously and the
system hangs forever.
3) With ack and no mask, and with line 1496 removed (i.e. no
software-triggered interrupt), the interrupt happens twice and the
system is OK.

So I suppose the problem happens only when the interrupt is triggered by
software. I also consulted the HW engineer but didn't get confirmation;
the only answer I got is that PCI-E needs a rising edge before sending
the 2nd interrupt :(

I'm not sure whether there are other BRAIN-DEAD devices like this. I
only have this device to test the MSI-X function, but we may need to
make sure it will not break the whole system.

The callback is needed because we are using the ACK-new method to work
around this issue. Yes, it is expensive. Also, the ACK-new method may
cause a deadlock, as Haitao described in his mail.

But if we move the config space to the HV, then we don't need the
ACK-new method. That should be OK, but admittedly it should be the last
method we turn to, since the config space should be owned by domain0.

Thanks
-- Yunhong Jiang

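The patch to ack without masking the MSI-X interrupt is below. Note that
the diff runs from the modified tree to the pristine 2.6.23 tree, so the
lines marked '-' are the experimental changes: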

--- kernel/irq/chip.c   2008-03-28 13:23:51.000000000 -0400
+++ ../linux-2.6.23/kernel/irq/chip.c   2007-10-09 16:31:38.000000000 -0400
@@ -439,9 +439,7 @@
  *     the handler was running. If all pending interrupts are handled, the
  *     loop is left.
  */
-
-extern struct irq_chip msi_chip ;
-void
+void fastcall
 handle_edge_irq(unsigned int irq, struct irq_desc *desc)
 {
        const unsigned int cpu = smp_processor_id();
@@ -457,23 +455,11 @@
         */
        if (unlikely((desc->status & (IRQ_INPROGRESS | IRQ_DISABLED)) ||
                    !desc->action)) {
-
-        if (desc->chip == &msi_chip)
-            printk("mask msi chip irq %x cpu %x desc->status %x desc->action %p tsc %lx\n", irq, cpu, desc->status, desc->action, tsc_this);
-
                desc->status |= (IRQ_PENDING | IRQ_MASKED);
-        if (desc->chip == &msi_chip)
-        {
-               desc->chip->ack(irq);
-        }else
                mask_ack_irq(desc, irq);
-
                goto out_unlock;
        }


Keir Fraser

2008-Mar-28 09:16 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

On 28/3/08 08:40, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> The investigation result is,
> 1) if mask and ack the interrupt, the interrupt will happen 3 times, the
> last 2 is masked because they happened when the first one is still
> pending for ISR's handler, the system is ok.

How can you tell it happened three times? If the interrupt is pending in the
ISR then only one further pending interrupt can become visible to software
as there is only one pending bit per vector in the IRR.

> So I suppose the problem happens only if trigger the interrupt by
> software. I consulted the HW engineer also but didn't get confirmation,
> the only answer I got is, the PCI-E need a rising edge before send the
> 2nd interrupt :(

That answer means very little to me. One interesting question to have
answered would be: is this a closed-loop or open-loop interrupt storm? I.e.,
does the device somehow detect APIC EOI and then trigger re-send of the MSI
(closed loop) or is this an initialisation-time-only open-loop storm where
the device is spitting out the MSI regularly until some device register gets
written by the interrupt service routine?

Given the circumstances, I'm inclined to think it is the latter. Especially
since I think the former is impossible as APIC EOI is not visible outside
the processor unless the interrupt came from a level-triggered IO-APIC pin,
and even then the EOI would not be visible across the PCI bus!

Also it seems *very* likely that this is just an initialisation-time thing,
and the device probably behaves very nicely after it is bootstrapped. In
light of this I think we should treat MSI sources as ACKTYPE_NONE in Xen
(i.e., require no callback from guest to hypervisor on completion of the
interrupt handler). We can then handle the interrupt storm entirely within
the hypervisor by detecting the storm and masking the interrupt and only
unmasking on some timeout.
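
One way the detect, mask and timed-unmask idea could look is sketched here
in C. This is only an illustration under assumed parameters: the window
length, threshold and mask duration are made-up values, the tick source is
abstract, and none of the names come from Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tuning values, chosen only for illustration. */
#define STORM_WINDOW_TICKS  UINT64_C(1000)   /* length of a counting window */
#define STORM_THRESHOLD     100u             /* interrupts allowed per window */
#define STORM_MASK_TICKS    UINT64_C(10000)  /* how long to keep the mask */

struct storm_detector {
    uint64_t window_start;   /* tick at which the current window began */
    unsigned int count;      /* interrupts seen in the current window */
    bool masked;             /* true while the source is held masked */
    uint64_t unmask_at;      /* tick at which the mask may be lifted */
};

/* Called for every interrupt from the source. Returns true if the
 * interrupt should be forwarded, false if it is suppressed. */
static bool storm_on_irq(struct storm_detector *s, uint64_t now)
{
    assert(s);

    if (s->masked)
        return false;

    /* Start a new counting window once the old one has expired. */
    if (now - s->window_start >= STORM_WINDOW_TICKS) {
        s->window_start = now;
        s->count = 0;
    }

    /* Too many interrupts in one window: treat it as a storm. */
    if (++s->count > STORM_THRESHOLD) {
        s->masked = true;
        s->unmask_at = now + STORM_MASK_TICKS;
        return false;
    }
    return true;
}

/* Called from a periodic timer. Lifts the mask after the timeout. */
static void storm_on_timer(struct storm_detector *s, uint64_t now)
{
    assert(s);

    if (s->masked && now >= s->unmask_at) {
        s->masked = false;
        s->window_start = now;
        s->count = 0;
    }
}
```

In a real hypervisor the point where masked becomes true is where the
device-side mask would have to be written, which is why this approach
still depends on Xen being able to reach the MSI mask bits.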

In your tests, how aggressive was the IRQ storm? If you looked at the
interrupted EIP on each interrupt, was it immediately after the APIC was
EOIed and EFLAGS.IF set back to 1, or was it some time after? This tells us
how aggressively the device is sending out MSIs, and may determine how
cunning we need to be regarding interrupt storm detection.

> I'm not sure if there are any other BRAIN-DEAD device like this, I only
> have this device to test MSI-X function, but we may need make sure it
> will not break the whole system.

Yes, we have to handle this case, unfortunately.

> The call-back to guest because we are using the ACK-new method to work
> around this issue. Yes, it is expensive, Also, this ACK-new method may
> cause deadlock as Haitao suggested in the mail.

Yes, that sucks. See my previous email -- if possible it would be great to
teach Xen enough about the PCI config space to be able to mask MSIs.

> But if we move the config space to HV, then we don't need this ACK-new
> method, that should be ok, but admittedly, that should be the last
> method we turn to, since config-space should be owned by domain0.

A partial movement into the hypervisor may be the best of a choice of evils.

 -- Keir




Jiang, Yunhong

2008-Mar-28 09:35 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

xen-devel-bounces@lists.xensource.com wrote:
> On 28/3/08 08:40, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> The investigation result is,
>> 1) if mask and ack the interrupt, the interrupt will happen 3 times, the
>> last 2 is masked because they happened when the first one is still
>> pending for ISR's handler, the system is ok.
>
> How can you tell it happened three times? If the interrupt is pending in
> the ISR then only one further pending interrupt can become visible to
> software as there is only one pending bit per vector in the IRR.

There are two types of MSI interrupt: one for receive/transmit and one for
other events (this is the one causing the storm). I added a printk for the
case where an interrupt happens while the previous one is in progress. Then
I added the number of printk lines to the count in /proc/interrupts. The
count in /proc/interrupts is only 1.

>> So I suppose the problem happens only if trigger the interrupt by
>> software. I consulted the HW engineer also but didn't get confirmation,
>> the only answer I got is, the PCI-E need a rising edge before send the
>> 2nd interrupt :(
>
> That answer means very little to me. One interesting question to have
> answered would be: is this a closed-loop or open-loop interrupt storm?
> I.e., does the device somehow detect APIC EOI and then trigger re-send of
> the MSI (closed loop) or is this an initialisation-time-only open-loop
> storm where the device is spitting out the MSI regularly until some device
> register gets written by the interrupt service routine?
>
> Given the circumstances, I'm inclined to think it is the latter.
> Especially since I think the former is impossible as APIC EOI is not
> visible outside the processor unless the interrupt came from a
> level-triggered IO-APIC pin, and even then the EOI would not be visible
> across the PCI bus!
>
> Also it seems *very* likely that this is just an initialisation-time
> thing, and the device probably behaves very nicely after it is
> bootstrapped.

I can't tell, because this interrupt didn't happen again after the device
was up. Maybe I can change the driver to do more experiments.

> In light of this I think we should treat MSI sources as ACKTYPE_NONE in
> Xen (i.e., require no callback from guest to hypervisor on completion of
> the interrupt handler). We can then handle the interrupt storm entirely
> within the hypervisor by detecting the storm and masking the interrupt and
> only unmasking on some timeout.
>
> In your tests, how aggressive was the IRQ storm? If you looked at the
> interrupted EIP on each interrupt, was it immediately after the APIC was
> EOIed and EFLAGS.IF set back to 1, or was it some time after? This tells
> us how aggressively the device is sending out MSIs, and may determine how
> cunning we need to be regarding interrupt storm detection.

I will try that.

>> I'm not sure if there are any other BRAIN-DEAD device like this, I only
>> have this device to test MSI-X function, but we may need make sure it
>> will not break the whole system.
>
> Yes, we have to handle this case, unfortunately.
>
>> The call-back to guest because we are using the ACK-new method to work
>> around this issue. Yes, it is expensive, Also, this ACK-new method may
>> cause deadlock as Haitao suggested in the mail.
>
> Yes, that sucks. See my previous email -- if possible it would be great
> to teach Xen enough about the PCI config space to be able to mask MSIs.

In fact, Xen is currently already trying to access config space, although
that is still a bug at present. In VT-d, Xen tries to access FLR
directly :)

>> But if we move the config space to HV, then we don't need this ACK-new
>> method, that should be ok, but admittedly, that should be the last
>> method we turn to, since config-space should be owned by domain0.
>
> A partial movement into the hypervisor may be the best of a choice of
> evils.

Sure, we will do that!

> -- Keir

Espen Skoglund

2008-Mar-28 11:37 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

That is true.  I was quite puzzled with the requirement of the
callback into Xen myself.  In standard Linux MSI interrupts are
treated as edge triggered and are just acked in the local APIC upon
delivery.

	eSk



[Keir Fraser]> This requires the guest to call back into Xen to signal EOI (as we already
> do for legacy level-triggered interrupts). We shouldn''t really
need to do
> that for MSI and it''s rather more expensive than a couple of
accesses over
> the PCI bus!
> It''s this callback into Xen, which we do not really understand why
it''s
> needed, which I''m railing against. Is there some fundamental
aspect of MSI
> we do not understand, or are we working around one brain-dead or buggy
> device?
>  -- Keir
> On 28/3/08 01:48, "Jiang, Yunhong"
<yunhong.jiang@intel.com> wrote:
>> Not masking each time when interrupt happen, instead, we do that only
>> when the second interrupt happen while the previous one is still
>> pending, it should be something like handle_edge_irqs() in upstream
>> linux.
>> 
>> -- Yunhong Jiang
>> 
>> Espen Skoglund <mailto:espen.skoglund@netronome.com> wrote:
>>> Preventing interrupt storms by masking the interrupt in the MSI/MSI-X
>>> capabilty structure or MSI-X table within the interrupt handler is
>>> insane.  It requires accesses over the PCI/PCIe bus and is clearly
>>> something you want to avoid on the fast path.
>>>
>>> eSk
>>>
>>>
>>> [Haitao Shan]
>>>> There are no much changes made compared with the original patches.
>>>> But there do have some issues that we need your kind comments.
>>>> 1> ACK-NEW method is necessary to avoid IRQ storm. But it causes the
>>>> deadlock. During my tests, I do find there can be deadlock with
>>>> patches applied. When assigned a NIC device to HVM domain, the
>>>> scenario is: Dom0 is waiting to IDE interrupt (vector 0x21); HVM
>>>> domain is waiting for qemu's IDE emulation and thus blocked; NIC
>>>> interrupt (MSI vector 0x31) is waiting for injection to HVM domain
>>>> since it is blocked now; IDE interrupt is waiting for NIC interrupt
>>>> since NIC interrupt is of high priority but not ACKed by XEN now.
>>>> When IDE interrupt and NIC interrupt are delivered to the same CPU,
>>>> and when guest OS is Vista, the phenomenon is easy to be observed.
>>>> 2> Without ACK-NEW, some naughty NIC devices as we observed will
>>>> bring IRQ storms. For this phenomenon, I think Yunhong can comment
>>>> more. Basically, writing EOI without mask the source of MSI will
>>>> bring IRQ storm. Although the reason is under investigation, XEN
>>>> should anyhow handle such bogous device, right?
>>>> 3> Using ACK-OLD and masking the MSI when writing EOI can be
>>>> solution. However, XEN does not own PCI configuration spaces.
>> 
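
The priority inversion in the deadlock scenario quoted above (IDE on vector 0x21 stuck behind the un-EOIed NIC MSI on vector 0x31) follows from how the x86 local APIC orders interrupts: the priority class of a vector is its upper four bits, and a pending vector is only dispensed to the CPU if its class is higher than both the task priority and the class of the highest vector still in service. A minimal C sketch of that rule follows; the function names are hypothetical and this is not code from the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Priority class of an interrupt vector: bits 7:4. */
static unsigned prio_class(uint8_t vector)
{
    return vector >> 4;
}

/*
 * Would the local APIC dispense 'pending' to the CPU right now?
 * 'highest_in_service' is the highest vector not yet EOIed (0 if none),
 * 'tpr' is the task priority register value.
 */
static bool apic_can_deliver(uint8_t pending, uint8_t highest_in_service,
                             uint8_t tpr)
{
    unsigned ppr = prio_class(tpr);

    if (prio_class(highest_in_service) > ppr)
        ppr = prio_class(highest_in_service);

    return prio_class(pending) > ppr;
}
```

With the vectors from the report, `apic_can_deliver(0x21, 0x31, 0)` is false: as long as the EOI for 0x31 is deferred until the blocked HVM guest has handled it, the IDE interrupt that Dom0 is waiting for cannot be delivered on that CPU.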

Keir Fraser

2008-Mar-28 11:53 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

I think Linux EOIs on ->end() not on ->ack(). Which is fine since Linux
doesn't defer or otherwise schedule ISR handlers.

 -- Keir

On 28/3/08 11:37, "Espen Skoglund" <espen.skoglund@netronome.com> wrote:
> That is true.  I was quite puzzled with the requirement of the
> callback into Xen myself.  In standard Linux MSI interrupts are
> treated as edge triggered and are just acked in the local APIC upon
> delivery.
> 
> eSk



Espen Skoglund

2008-Mar-28 12:15 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Just checked this.  Linux does the local APIC EOI on ->ack().

	eSk


[Keir Fraser]
> I think Linux EOIs on ->end() not on ->ack(). Which is fine since
> Linux doesn't defer or otherwise schedule ISR handlers.
>
>  -- Keir



Keir Fraser

2008-Mar-28 13:00 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Oh yes, that is true. They then have special logic for detecting nested
delivery and mask/unmask in that case. Fair enough, and similar to what we
should do in Xen.

 -- Keir
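
As a rough illustration of the nested-delivery logic discussed here (EOI on ->ack(), and mask/unmask only when a second interrupt arrives while the first is still being handled), the following user-space model in C may help. It is only a sketch: the struct, its fields and the function names are hypothetical, not the Linux or Xen ones, and "masking" is just a counter standing in for a write to the device's MSI mask bit.

```c
#include <stdbool.h>

/* Hypothetical per-IRQ state; not the real Linux or Xen descriptor. */
struct irq_model {
    bool in_progress;   /* a handler is currently running */
    bool pending;       /* another interrupt arrived meanwhile */
    bool masked;        /* source is masked at the device */
    int  eoi_count;     /* local APIC EOIs issued */
    int  mask_writes;   /* mask operations (the expensive PCI accesses) */
    int  handler_runs;  /* how often the handler body ran */
    int  nested_left;   /* demo only: interrupts to inject mid-handler */
};

static void deliver_edge_irq(struct irq_model *irq);

/* Demo handler: optionally simulates a new interrupt arriving while it runs. */
static void demo_handler(struct irq_model *irq)
{
    if (irq->nested_left > 0) {
        irq->nested_left--;
        deliver_edge_irq(irq);
    }
}

static void deliver_edge_irq(struct irq_model *irq)
{
    if (irq->in_progress) {
        /* Nested delivery: remember it, mask the source, then EOI. */
        irq->pending = true;
        if (!irq->masked) {
            irq->masked = true;
            irq->mask_writes++;
        }
        irq->eoi_count++;
        return;
    }

    irq->eoi_count++;               /* common case: EOI at once, no masking */
    irq->in_progress = true;
    do {
        if (irq->pending && irq->masked)
            irq->masked = false;    /* unmask before replaying the handler */
        irq->pending = false;
        irq->handler_runs++;
        demo_handler(irq);
    } while (irq->pending);
    irq->in_progress = false;
}
```

In this model the common path never touches the mask bit; the costly PCI access happens only when an interrupt really does arrive during handling, which is the property being argued for in the thread.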

On 28/3/08 12:15, "Espen Skoglund" <espen.skoglund@netronome.com> wrote:
> Just checked this.  Linux does the local APIC EOI on ->ack().
> 
> eSk



Jiang, Yunhong

2008-Mar-31 13:57 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Keir, when I tried to get the IP address today, I suddenly found that I
can't reproduce the problem anymore. Also, originally, if I removed the code
that triggers the software LSC interrupt, the NIC could still work and get
an IP address, but now if I remove that code, the NIC no longer works.
This is really strange to me: I didn't change anything on the system, and
I don't know of any change in the lab environment that could cause this.
But before today I could reproduce it every time.

I am really frustrated by this :-( . Do you think we still need to move the
config space access down? Now the only reason to move it down is that
ack_edge_ioapic_irq() does the mask, and this mask can make the HV more
robust.

Thanks
-- Yunhong Jiang


Jiang, Yunhong <> wrote:
> xen-devel-bounces@lists.xensource.com <> wrote:
>> On 28/3/08 08:40, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>
>>> The investigation result is,
>>> 1) if mask and ack the interrupt, the interrupt will happen 3 times,
>>> the last 2 is masked because they happened when the first one is still
>>> pending for ISR's handler, the system is ok.
>>
>> How can you tell it happened three times? If the interrupt is pending
>> in the ISR then only one further pending interrupt can become visible
>> to software as there is only one pending bit per vector in the IRR.
>
> There are two types of MSI interrupt, one for receive/transmit,
> one for other (this is the one that causes the storm). I added a printk
> for when an interrupt happens while the previous one is in progress. Then
> I added the print number and the output in /proc/interrupts. The output
> in /proc/interrupts is only 1.
>
>>
>>> So I suppose the problem happens only if trigger the interrupt by
>>> software. I consulted the HW engineer also but didn't get confirmation,
>>> the only answer I got is, the PCI-E need a rising edge before send the
>>> 2nd interrupt :(
>>
>> That answer means very little to me. One interesting question to have
>> answered would be: is this a closed-loop or open-loop interrupt storm?
>> I.e., does the device somehow detect APIC EOI and then trigger re-send
>> of the MSI (closed loop) or is this an initialisation-time-only
>> open-loop storm where the device is spitting out the MSI regularly
>> until some device register gets written by the interrupt service
>> routine?
>>
>> Given the circumstances, I'm inclined to think it is the latter.
>> Especially since I think the former is impossible as APIC EOI is not
>> visible outside the processor unless the interrupt came from a
>> level-triggered IO-APIC pin, and even then the EOI would not be visible
>> across the PCI bus!
>>
>> Also it seems *very* likely that this is just an initialisation-time
>> thing, and the device probably behaves very nicely after it is
>> bootstrapped. In
>
> I can't tell this because this interrupt didn't happen again
> after the device is up. Maybe I can change the driver to do more
> experiments.
>
>> light of this I think we should treat MSI sources as ACKTYPE_NONE in
>> Xen (i.e., require no callback from guest to hypervisor on completion
>> of the interrupt handler). We can then handle the interrupt storm
>> entirely within the hypervisor by detecting the storm and masking the
>> interrupt and only unmasking on some timeout.
>>
>> In your tests, how aggressive was the IRQ storm? If you looked at the
>> interrupted EIP on each interrupt, was it immediately after the APIC
>> was EOIed and EFLAGS.IF set back to 1, or was it some time after? This
>> tells us how aggressively the device is sending out EOIs, and may
>> determine how cunning we need to be regarding interrupt storm
>> detection.
>
> I will try that.
>
>>
>>> I'm not sure if there are any other BRAIN-DEAD device like this, I only
>>> have this device to test MSI-X function, but we may need make sure it
>>> will not break the whole system.
>>
>> Yes, we have to handle this case, unfortunately.
>>
>>> The call-back to guest because we are using the ACK-new method to work
>>> around this issue. Yes, it is expensive, Also, this ACK-new method may
>>> cause deadlock as Haitao suggested in the mail.
>>
>> Yes, that sucks. See my previous email -- if possible it would be great
>> to teach Xen enough about the PCI config space to be able to mask MSIs.
>
> In fact, currently xen is already trying to access config
> space, although that is a bug still currently. In vt-d, xen try to
> access FLR directly :)
>
>>
>>> But if we move the config space to HV, then we don't need this ACK-new
>>> method, that should be ok, but admittedly, that should be the last
>>> method we turn to, since config-space should be owned by domain0.
>>
>> A partial movement into the hypervisor may be the best of a choice of
>> evils.
>
> Sure, we will do that!
>
>> -- Keir

Keir Fraser

2008-Mar-31 14:14 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

On 31/3/08 14:57, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> Keir, when I try to get the ip address today, I suddenly found I can't
> reproduce it anymore, also originally if I removed the code that trigger
> the software LSC interrupt, the NIC can still work and get IP address,
> but now if I remove that code, the NIC can't work anymore.
> It is really strange to me, I didn't change anything to the system. Also
> I don't know any changes in the lab environment that may cause this
> change. But I do can reproduce it before each time.
>
> Really frustrated to get this :-( , do you think we still need move the
> config space access down, now the only reasons to move this down is,
> ack_edge_ioapic_irq() did the mask, and this mask can make HV more
> robust.

So, if you leave the driver as it is (triggering the software LSC
interrupt), do APIC EOI in Xen before executing the interrupt handler in
dom0, and do not mask the MSI at all, then you no longer hang?

That's a weird change in behaviour if so!

I wonder whether there is a timing issue of some sort, and it depends if the
NIC generates the software-triggered interrupt at a fast enough rate that
the host CPU fails to make progress if it doesn't mask the MSI? You haven't
changed test machine at all, or put the NIC in a different PCI slot, or
anything like that?

 -- Keir




Keir Fraser

2008-Mar-31 14:15 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

On 31/3/08 15:14, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> I wonder whether there is a timing issue of some sort, and it depends if
> the NIC generates the software-triggered interrupt at a fast enough rate
> that the host CPU fails to make progress if it doesn't mask the MSI? You
> haven't changed test machine at all, or put the NIC in a different PCI
> slot, or anything like that?

Also, it's got to be worth kicking your hardware guys again to find out
from them exactly what happens when that software-triggered interrupt
register gets written by the device driver. Their previous response didn't
sound very enlightening.

 -- Keir




Jiang, Yunhong

2008-Mar-31 14:25 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Keir Fraser <mailto:keir.fraser@eu.citrix.com> wrote:
> On 31/3/08 14:57, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> Keir, when I try to get the ip address today, I suddenly found I can't
>> reproduce it anymore, also originally if I removed the code that trigger
>> the software LSC interrupt, the NIC can still work and get IP address,
>> but now if I remove that code, the NIC can't work anymore.
>> It is really strange to me, I didn't change anything to the system. Also
>> I don't know any changes in the lab environment that may cause this
>> change. But I do can reproduce it before each time.
>>
>> Really frustrated to get this :-( , do you think we still need move the
>> config space access down, now the only reasons to move this down is,
>> ack_edge_ioapic_irq() did the mask, and this mask can make HV more
>> robust.
>
> So, if you leave the driver as it is (triggering the software LSC
> interrupt), do APIC EOI in Xen before executing the interrupt handler in
> dom0, and do not mask the MSI at all, then you no longer hang?

I usually do the experiments in the Linux kernel, and there it no longer
hangs.

> That's a weird change in behaviour if so!
>
> I wonder whether there is a timing issue of some sort, and it depends if
> the NIC generates the software-triggered interrupt at a fast enough rate
> that the host CPU fails to make progress if it doesn't mask the MSI? You
> haven't changed test machine at all, or put the NIC in a different PCI
> slot, or anything like that?

I haven't changed anything at all. The machine is in the lab, which is far
away from my cube, and I just stayed at home over the weekend.

> -- Keir

Keir Fraser

2008-Mar-31 14:33 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

On 31/3/08 15:25, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> So, if you leave the driver as it is (triggering the software LSC
>> interrupt), do APIC EOI in Xen before executing the interrupt
>> handler in dom0, and do not mask the MSI at all, then you no longer
>> hang?
>
> I usually do experiment in linux kernel, and it no longer hang.

Well, I'd be okay with an initial implementation which does not allow Xen
to mask MSIs. But still I think it will be cleaner and more extensible to have
Xen program the MSI registers anyway. This will hide details like interrupt
vector, APIC destination mode, etc. from the MSI-capable guest, and also
will make it easier to support things like changing interrupt affinity on
the fly (since it will not be necessary to get dom0 involved in that).
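
To make concrete which details would be hidden from the guest if Xen programs the MSI registers, here is a sketch of how an x86 MSI address/data pair encodes the destination and the vector. It uses the architectural layout (address 0xFEExxxxx with the destination APIC ID in bits 19:12 and the destination mode in bit 2; vector in data bits 7:0, with fixed delivery and edge trigger encoded as zeros). The struct and function names are made up for illustration and are not taken from the patches.

```c
#include <stdint.h>

/* Hypothetical container for the two values written to the device. */
struct msi_msg_model {
    uint32_t address_lo;
    uint16_t data;
};

#define MSI_ADDR_BASE            0xFEE00000u
#define MSI_ADDR_DEST_ID_SHIFT   12
#define MSI_ADDR_DESTMODE_LOGIC  (1u << 2)
#define MSI_DATA_DELIVERY_FIXED  (0u << 8)
#define MSI_DATA_TRIGGER_EDGE    (0u << 15)

static struct msi_msg_model msi_compose(uint8_t vector, uint8_t dest_apic_id,
                                        int logical_dest)
{
    struct msi_msg_model m;

    m.address_lo = MSI_ADDR_BASE |
                   ((uint32_t)dest_apic_id << MSI_ADDR_DEST_ID_SHIFT);
    if (logical_dest)
        m.address_lo |= MSI_ADDR_DESTMODE_LOGIC;

    m.data = (uint16_t)(MSI_DATA_DELIVERY_FIXED | MSI_DATA_TRIGGER_EDGE |
                        vector);
    return m;
}
```

Under this layout, changing interrupt affinity amounts to recomputing the address with a different destination ID and rewriting it, which the hypervisor could do without the guest ever seeing the real vector.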

Once you have Xen able to write the MSI registers, I suppose it is not much
extra work to implement some kind of interrupt mitigation scheme involving
mask/enable bits of the MSI configuration register.
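
One possible shape for such a mitigation scheme, following the earlier suggestion of detecting the storm, masking the interrupt, and only unmasking on some timeout, is sketched below. Everything in it is an assumption for illustration: the names, the tick-based time source, and the window, threshold and timeout values are hypothetical, and the actual mask and unmask writes to the device are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define STORM_WINDOW     1000u  /* ticks per counting window (assumed) */
#define STORM_THRESHOLD  100u   /* more than this per window is a storm */
#define STORM_TIMEOUT    5000u  /* ticks to keep the source masked */

struct storm_detector {
    uint64_t window_start;  /* start of the current counting window */
    uint64_t masked_until;  /* valid while masked: when to unmask */
    unsigned count;         /* interrupts seen in the current window */
    bool     masked;
};

/* Call on every interrupt; returns true if the caller should mask now. */
static bool storm_on_interrupt(struct storm_detector *d, uint64_t now)
{
    if (now - d->window_start >= STORM_WINDOW) {
        d->window_start = now;
        d->count = 0;
    }
    if (++d->count > STORM_THRESHOLD && !d->masked) {
        d->masked = true;
        d->masked_until = now + STORM_TIMEOUT;
        return true;
    }
    return false;
}

/* Call from a periodic timer; returns true if the caller should unmask. */
static bool storm_on_timer(struct storm_detector *d, uint64_t now)
{
    if (d->masked && now >= d->masked_until) {
        d->masked = false;
        d->count = 0;
        d->window_start = now;
        return true;
    }
    return false;
}
```

A well-behaved device never crosses the threshold, so the fast path stays free of PCI config accesses; only a misbehaving source pays for the mask and the later unmask.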

 -- Keir




Shan, Haitao

2008-Apr-01 02:39 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Hi, Keir,

I am working on that and incorporating your comments. I will post the
updated patches once I have finished. Thanks for your help!

Best Regards
Haitao Shan

Keir Fraser wrote:
> On 31/3/08 15:25, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>> So, if you leave the driver as it is (triggering the software LSC
>>> interrupt), do APIC EOI in Xen before executing the interrupt
>>> handler in dom0, and do not mask the MSI at all, then you no longer
>>> hang?
>>
>> I usually do experiment in linux kernel, and it no longer hang.
>
> Well, I'd be okay with an initial implementation which does not allow
> Xen to mask MSIs. But still I think it will be cleaner and more
> extensible to have Xen program the MSI registers anyway. This will
> hide details like interrupt vector, APIC destination mode, etc. from
> the MSI-capable guest, and also will make it easier to support things
> like changing interrupt affinity on the fly (since it will not be
> necessary to get dom0 involved in that). 
> 
> Once you have Xen able to write the MSI registers, I suppose it is
> not much extra work to implement some kind of interrupt mitigation
> scheme involving mask/enable bits of the MSI configuration register.
> 
>  -- Keir

Neil Turton

2008-Apr-02 14:55 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

I tried this patch and MSI seems to work fine with a driver in DOM0.  It
didn't work with MSI-X though because pci_vector_resources returned 8
and I have 10 MSI capable devices in the machine.  I've only got 6
Phys-irq interrupts listed in /proc/interrupts so I'd expect there to be
more vectors free.  I applied the debugging patch below and got the
following output.

diff -r 9bb373519b68 arch/i386/pci/irq-xen.c
--- a/arch/i386/pci/irq-xen.c	Tue Apr 01 14:15:23 2008 +0100
+++ b/arch/i386/pci/irq-xen.c	Wed Apr 02 13:19:05 2008 +0100
@@ -1192,6 +1192,7 @@ int pci_vector_resources(int last, int n
 	int offset = (last % 8);

 	while (next < FIRST_SYSTEM_VECTOR) {
+		printk("next=%d count=%d\n", next, count);
 		next += 8;
 #ifdef CONFIG_X86_64
 		if (next == IA32_SYSCALL_VECTOR)

[pci_vector_resources(176, 1) called]
next=176 count=1
next=184 count=2
next=192 count=3
next=200 count=4
next=208 count=5
next=216 count=6
next=224 count=7
next=232 count=8
[pci_vector_resources returned 8]
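
The count of 8 in this trace can be reproduced outside the kernel. The user-space sketch below is a model, not the code from the tree: only the first few lines of the loop are visible in the diff context above, so the rest of the loop body is assumed to match the upstream Linux function of that era, and the values FIRST_SYSTEM_VECTOR = 0xef, FIRST_DEVICE_VECTOR = 0x31 and a syscall vector of 0x80 are likewise assumptions that are not stated anywhere in this thread.

```c
/* Assumed vector layout; none of these values appear in the thread. */
#define FIRST_DEVICE_VECTOR  0x31
#define FIRST_SYSTEM_VECTOR  0xef
#define SYSCALL_VECTOR       0x80

/*
 * Model of the counting walk: starting from the last allocated vector,
 * step through the vector space in strides of 8 and count the slots
 * that are still free, skipping the syscall vector.
 */
static int vector_resources_model(int last, int nr_released)
{
    int count = nr_released;
    int next = last;
    int offset = last % 8;

    while (next < FIRST_SYSTEM_VECTOR) {
        next += 8;
        if (next == SYSCALL_VECTOR)
            continue;
        count++;
        if (next >= FIRST_SYSTEM_VECTOR) {
            if (offset % 8) {
                /* Wrap around and continue on the next offset. */
                next = FIRST_DEVICE_VECTOR + offset;
                offset++;
                continue;
            }
            count--;    /* ran off the end: undo the last increment */
        }
    }
    return count;
}
```

Under these assumptions `vector_resources_model(176, 1)` returns 8, matching the trace: the walk only counts slots above `last` in steps of 8, so the answer depends on where `last` sits and on its offset rather than on how many vectors are actually in use.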


Shan, Haitao

2008-Apr-03 12:11 UTC

RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

Hi, Neil

Thanks for trying the patches. The problem is caused by an incompatibility
between Xen and the Dom0 kernel.
pci_vector_resources() calculates the number of available vectors. Xen
assigns vectors starting from vector 0x20 with offset = 0, which confuses
the code in pci_vector_resources().
Maybe we should replace the function with a hypercall that returns the
number of available vectors.
What do you think about it, Keir?
Thanks!

Shan Haitao

-----Original Message-----
From: Neil Turton [mailto:nturton@solarflare.com] 
Sent: April 2, 2008 22:56
To: Shan, Haitao
Cc: Keir Fraser; xen-devel; Tian, Kevin; Jiang, Yunhong; Li, Xin B
Subject: Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

I tried this patch and MSI seems to work fine with a driver in DOM0.  It
didn't work with MSI-X though because pci_vector_resources returned 8
and I have 10 MSI capable devices in the machine.  I've only got 6
Phys-irq interrupts listed in /proc/interrupts so I'd expect there to be
more vectors free.  I applied the debugging patch below and got the
following output.

diff -r 9bb373519b68 arch/i386/pci/irq-xen.c
--- a/arch/i386/pci/irq-xen.c	Tue Apr 01 14:15:23 2008 +0100
+++ b/arch/i386/pci/irq-xen.c	Wed Apr 02 13:19:05 2008 +0100
@@ -1192,6 +1192,7 @@ int pci_vector_resources(int last, int n
 	int offset = (last % 8);

 	while (next < FIRST_SYSTEM_VECTOR) {
+		printk("next=%d count=%d\n", next, count);
 		next += 8;
 #ifdef CONFIG_X86_64
 		if (next == IA32_SYSCALL_VECTOR)

[pci_vector_resources(176, 1) called]
next=176 count=1
next=184 count=2
next=192 count=3
next=200 count=4
next=208 count=5
next=216 count=6
next=224 count=7
next=232 count=8
[pci_vector_resources returned 8]


Keir Fraser

2008-Apr-03 12:31 UTC

Re: [Xen-devel] [PATCH 0/5] Add MSI support to XEN

On 3/4/08 13:11, "Shan, Haitao" <haitao.shan@intel.com> wrote:
> Thanks for trying the patches. The problem is caused by incompatibility
> between Xen and Dom0 kernel.
> Pci_vector_resources is to calculate available vectors. Xen assigns vector
> by start with vector 0x20 and offset = 0. This will confuse the code in
> pci_vector_resources.
> Maybe we should replace the function with a hypercall to acquire the number
> of available vectors.
> How do you think about it, Keir?

I may not understand the issue here, but in principle I do not particularly
want to have anything outside Xen handling real IRQ vectors. In which case
this confusion should not exist in the first place? I know the last round of
patches did have dom0 poking the MSI registers, and hence it knew about real
vectors, but that's being changed in the next round, right?

 -- Keir


