Some recent changes from Espen Skoglund
http://markmail.org/thread/76szuiywgd5dn2x4 suggest that it should now
be possible to map an MSI-X interrupt direct to a guest. However, when
trying to do this, I get errors, and before I delve too deep I wondered
if there was something obvious I'm doing wrong.

My approach is that dom0 does something like this:

  struct physdev_map_pirq map_irq;

  map_irq.domid = guest_domid;
  map_irq.type = MAP_PIRQ_TYPE_MSI;
  map_irq.index = -1;
  map_irq.pirq = -1;
  map_irq.bus = pci_dev->bus->number;
  map_irq.devfn = pci_dev->devfn;
  map_irq.entry_nr = msix_entry;
  map_irq.table_base = msix_table;

  rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);

This call succeeds. dom0 then tells the guest the value of map_irq.pirq
(which seems to have a sensible value given the other interrupts mapped
to that domain) that it gets back from this hypercall, and the guest
tries to call request_irq() using that value. The call to request_irq()
returns -38 (ENOSYS).

Any ideas what the problem could be? I'm happy to investigate.

Kieran

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
>>> Kieran Mansley <kmansley@solarflare.com> 15.04.09 17:43 >>>
>My approach is that dom0 does something like this:
>
>  struct physdev_map_pirq map_irq;
>
>  map_irq.domid = guest_domid;
>  map_irq.type = MAP_PIRQ_TYPE_MSI;
>  map_irq.index = -1;
>  map_irq.pirq = -1;
>  map_irq.bus = pci_dev->bus->number;
>  map_irq.devfn = pci_dev->devfn;
>  map_irq.entry_nr = msix_entry;
>  map_irq.table_base = msix_table;
>
>  rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);

But you don't do this yourself, you get this done for you, right?

>This call succeeds. dom0 then tells the guest the value of map_irq.pirq
>(which seems to have a sensible value given the other interrupts mapped
>to that domain) that it gets back from this hypercall, and the guest
>tries to call request_irq() using that value. The call to request_irq()
>returns -38 (ENOSYS).
>
>Any ideas what the problem could be? I'm happy to investigate.

What kernel version is it that's running in the guest? Sounds like some
preparatory step might be missing from the particular kernel version you
have.

Jan
Do you also map the vector to domU before requesting it? You should
be able to map it as a GSI. You'll also need c/s 791 or newer for the
domU kernel.

    eSk
Oh. I just noticed that you map the MSI directly to the guest. You
should first map it into dom0 as an MSI (to set up the MSI-X
descriptor) and then map it to domU as a GSI.

    eSk

[Espen Skoglund]
> Do you also map the vector to domU before requesting it? You should
> be able to map it as a GSI. You'll also need c/s 791 or newer for the
> domU kernel.
On Wed, 2009-04-15 at 16:58 +0100, Jan Beulich wrote:
> >>> Kieran Mansley <kmansley@solarflare.com> 15.04.09 17:43 >>>
> >  rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
>
> But you don't do this yourself, you get this done for you, right?

No, I was doing it myself. What approach would you recommend?

> What kernel version is it that's running in the guest? Sounds like some
> preparatory step might be missing from the particular kernel version you
> have.

c/s 855 from the linux-2.6.18-xen.hg tree.

Thanks

Kieran
>>> Kieran Mansley <kmansley@solarflare.com> 15.04.09 18:12 >>>
>On Wed, 2009-04-15 at 16:58 +0100, Jan Beulich wrote:
>> But you don't do this yourself, you get this done for you, right?
>
>No, I was doing it myself. What approach would you recommend?

In general this should be taken care of by the code in drivers/pci/msi-xen.c,
unless of course you're trying to pass through less than a full PCI device.

>> What kernel version is it that's running in the guest? Sounds like some
>> preparatory step might be missing from the particular kernel version you
>> have.
>
>c/s 855 from the linux-2.6.18-xen.hg tree.

That should have all that's needed (as Espen also indicated). You'll have to
hunt down where that -ENOSYS gets generated.

Jan
On Wed, 2009-04-15 at 17:10 +0100, Espen Skoglund wrote:
> Oh. I just noticed that you map the MSI directly to the guest. You
> should first map it into dom0 as an MSI (to set up the MSI-X
> descriptor)

OK, I'll check that: I thought our driver might be doing that already,
but apparently not.

> and then map it to domU as a GSI.

The GSI bit is something I wasn't expecting. Any hints on the API for
mapping this? Having done so I assume that the request_irq() in the
guest should then succeed.

Thanks very much for your help.

Kieran
On Wed, 2009-04-15 at 17:21 +0100, Jan Beulich wrote:
> In general this should be taken care of by the code in drivers/pci/msi-xen.c,
> unless of course you're trying to pass through less than a full PCI device.

Exactly right: I'm not trying to pass through the whole device, just get
the interrupt for a queue to the guest.

Thanks

Kieran
[Kieran Mansley]
> On Wed, 2009-04-15 at 17:10 +0100, Espen Skoglund wrote:
>> Oh. I just noticed that you map the MSI directly to the guest.
>> You should first map it into dom0 as an MSI (to set up the MSI-X
>> descriptor)

> OK, I'll check that: I thought our driver might be doing that
> already, but apparently not.

As Jan was saying, your dom0 kernel should take care of this. Just
use the standard pci_enable_msix() call to set it up.

>> and then map it to domU as a GSI.

> The GSI bit is something I wasn't expecting. Any hints on the API
> for mapping this? Having done so I assume that the request_irq() in
> the guest should then succeed.

The API is the same as used for mapping regular PIRQs (they just
happen to be MSI-X vectors). In particular, xend will use the right
API when mapping a PIRQ to the guest.

    eSk
On Wed, 2009-04-15 at 17:30 +0100, Espen Skoglund wrote:
> The API is the same as used for mapping regular PIRQs (they just
> happen to be MSI-X vectors). In particular, xend will use the right
> API when mapping a PIRQ to the guest.

Changing the code to do as I think you describe gives a different error:
the call to PHYSDEVOP_map_pirq to map the GSI into the guest fails. From
reading the code (xen/arch/x86/physdev.c:58), it seems to expect an IRQ
number to be passed in, whereas what we have is an MSI-X vector, which
falls outside the range of IRQ numbers. The frustrating part is that the
code that handles PHYSDEVOP_map_pirq in the hypervisor just turns the
IRQ back into a vector.

For the record, this is what I'm doing now:

1) Call pci_enable_msix() to set up the MSI-X tables and allocate a set
of vectors. This works fine. As I understand things, this calls, among
other things, PHYSDEVOP_map_pirq to map it into dom0.

2a) Attempt to map this to the guest using PHYSDEVOP_map_pirq with
parameters:

  map_irq.domid = guest_domid;
  map_irq.type = MAP_PIRQ_TYPE_GSI;
  map_irq.index = vector;
  map_irq.pirq = -1;

=> gives an error in the hypervisor:
(XEN) physdev.c:61: dom1: map invalid irq 510

2b) (This is what we tried before) Attempt to map this to the guest
using PHYSDEVOP_map_pirq with parameters:

  map_irq.domid = guest_domid;
  map_irq.type = MAP_PIRQ_TYPE_MSI;
  map_irq.index = -1;
  map_irq.pirq = -1;
  map_irq.bus = pci_dev->bus->number;
  map_irq.devfn = pci_dev->devfn;
  map_irq.entry_nr = msix_entry;
  map_irq.table_base = msix_table;

=> This gives no error, and returns a sane-looking value in
map_irq.pirq, but it fails with ENOSYS when the guest calls
request_irq().

Can you suggest what I might be doing wrong? I'm basing the code in
(2a) on the approach taken for xc_physdev_map_pirq in xend, but it gets
the physical IRQ passed to it.

One point that is causing some confusion is the need to map the MSI-X
into dom0 and get a physical IRQ in the first place before mapping it to
the guest. Is this strictly necessary? It suggests that, once it has
also been mapped to the guest, the interrupt may be delivered to two
different domains.

Thanks

Kieran
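The "map invalid irq 510" failure in step 2a is consistent with a plain bounds check on the index of a GSI-type request. A minimal sketch of such a check follows. The function name, the SKETCH_NR_IRQS constant and its value of 256 are assumptions made for illustration; this is not the actual code at xen/arch/x86/physdev.c.

```c
#include <errno.h>
#include <stdio.h>

/* Assumption: a build with 256 IRQ slots. The real limit depends on
 * how the hypervisor was configured. */
#define SKETCH_NR_IRQS 256

/* Hypothetical stand-in for the range check applied to the index of a
 * GSI-type map request. Returns 0 if the index is in range, or -EINVAL
 * after logging a message in the style seen on the Xen console. */
int sketch_check_gsi_index(int domid, int index)
{
    if (index < 0 || index >= SKETCH_NR_IRQS) {
        printf("dom%d: map invalid irq %d\n", domid, index);
        return -EINVAL;
    }
    return 0;
}
```

Under these assumptions, passing 510 as the index is rejected before any IRQ-to-vector translation happens, whereas a small index such as 16 passes the check.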
[Kieran Mansley]
> 2a) Attempt to map this to the guest using PHYSDEVOP_map_pirq with
> parameters:
>   map_irq.domid = guest_domid;
>   map_irq.type = MAP_PIRQ_TYPE_GSI;
>   map_irq.index = vector;
>   map_irq.pirq = -1;

> => gives an error in the hypervisor:
> (XEN) physdev.c:61: dom1: map invalid irq 510

IRQ510? This definitely sounds wrong. This can't possibly be the
"PIRQ" assigned to the MSI vector.

> One point that is causing some confusion is the need to map the
> MSI-X into dom0 and get a physical IRQ in the first place before
> mapping it to the guest. Is this strictly necessary? It suggests
> that, once it has also been mapped to the guest the interrupt may be
> delivered to two different domains.

Only if two different domains bind to the pirq. MSI-X vectors could
in theory be shared between multiple domains. I have not attempted
this, and don't know if it would work with the current code base.

    eSk
On Thu, 2009-04-16 at 16:39 +0100, Espen Skoglund wrote:
> > 2a) Attempt to map this to the guest using PHYSDEVOP_map_pirq with
> > parameters:
> >   map_irq.domid = guest_domid;
> >   map_irq.type = MAP_PIRQ_TYPE_GSI;
> >   map_irq.index = vector;
> >   map_irq.pirq = -1;
>
> > => gives an error in the hypervisor:
> > (XEN) physdev.c:61: dom1: map invalid irq 510
>
> IRQ510? This definitely sounds wrong. This can't possibly be the
> "PIRQ" assigned to the MSI vector.

It's not - it's the vector itself. Sadly the code in question
(msi-xen.c) seems to confuse the use of pirq and vector in the variable
names, so I'm not exactly sure how to describe it. It's the returned
value from msi_map_vector() when called by pci_enable_msix() in dom0.

What is the magic step I'm missing to go from the vector that I get back
from pci_enable_msix() to the value for "PIRQ assigned to the MSI
vector" to give to PHYSDEVOP_map_pirq?

Thanks

Kieran
>>> Kieran Mansley <kmansley@solarflare.com> 16.04.09 17:47 >>>
>On Thu, 2009-04-16 at 16:39 +0100, Espen Skoglund wrote:
>> IRQ510? This definitely sounds wrong. This can't possibly be the
>> "PIRQ" assigned to the MSI vector.
>
>It's not - it's the vector itself. Sadly the code in question
>(msi-xen.c) seems to confuse the use of pirq and vector in the variable
>names, so I'm not exactly sure how to describe it. It's the returned
>value from msi_map_vector() when called by pci_enable_msix() in dom0.

No, there simply cannot be a vector 510 - x86 is limited to 256 vectors.
What you get back here is a (Xen) IRQ number. The question is why this is
outside the default NR_IRQS range - are you building Xen with support for
more than 256 IRQs? See get_free_pirq(), which starts its iteration at
NR_IRQS-1 for the case you're interested in. Or is 510 perhaps a Linux
IRQ number rather than a Xen one?

PHYSDEVOP_map_pirq also returns vector information, but I strongly believe
this is actually a mistake, as no guest should ever care about the vector
Xen uses for a particular interrupt.

>What is the magic step I'm missing to go from the vector that I get back
>from pci_enable_msix() to the value for "PIRQ assigned to the MSI
>vector" to give to PHYSDEVOP_map_pirq?

I would think your problems begin with msi_map_pirq_to_vector() not having
a way to know that the particular IRQ is not to go to Dom0, but to a DomU:
msi_get_dev_owner() only considers the whole device. You may need to
somehow undo this for those IRQs that you want to pass through (since
you want the Xen PIRQ number here in order to pass to the DomU, not the
Linux one).

Whether just obtaining the Xen PIRQ number, rather than undoing the whole
operation, would work I'm not really certain, but I would assume it would
at least have the unintended side effect of sharing the IRQ between DomU
and Dom0.

Otoh - did you check whether the VMDQ and/or SR-IOV work already contains
a solution to your problem? I didn't look closely at that code yet, but
would suppose that passing through IRQs without the whole device should
also be used there in some way.

Jan
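To illustrate the get_free_pirq() behaviour described above, here is a hypothetical allocator that hands out slots starting from the top of the table and working downwards. The table size of 512, the array and the function name are all assumptions for the sake of the example; the real Xen function may differ in its bounds and error codes.

```c
#include <stdbool.h>

/* Assumption: a table with more than 256 slots, chosen only to show
 * how values above 255 could be handed out. */
#define SKETCH_PIRQ_SLOTS 512

/* Hypothetical per-domain "slot in use" table, all free initially. */
static bool sketch_pirq_in_use[SKETCH_PIRQ_SLOTS];

/* Scan from the highest slot downwards and claim the first free one.
 * Returns the slot number, or -1 if every slot is taken. */
int sketch_get_free_pirq(void)
{
    for (int i = SKETCH_PIRQ_SLOTS - 1; i >= 0; i--) {
        if (!sketch_pirq_in_use[i]) {
            sketch_pirq_in_use[i] = true;
            return i;
        }
    }
    return -1;
}
```

With a 512-entry table the first two allocations in this sketch come back as 511 and 510. With a 256-entry table no value above 255 could be produced this way, which is why a number like 510 raises the question of how Xen was built or whether it is a Linux IRQ number instead.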
xen-devel-bounces@lists.xensource.com wrote:
>>>> Kieran Mansley <kmansley@solarflare.com> 16.04.09 17:47 >>>
>> It's not - it's the vector itself. Sadly the code in question
>> (msi-xen.c) seems to confuse the use of pirq and vector in the variable
>> names, so I'm not exactly sure how to describe it. It's the returned
>> value from msi_map_vector() when called by pci_enable_msix() in dom0.

One tricky thing here is that we can't change the definition of
msix_entry. BTW, msi_map_vector() is a bit confusing.

> No, there simply cannot be a vector 510 - x86 is limited to 256
> vectors. What you get back here is a (Xen) IRQ number. The question
> is why this is outside the default NR_IRQS range - are you building
> Xen with support for more than 256 IRQs? See get_free_pirq(), which
> starts its iteration at NR_IRQS-1 for the case you're interested in.
> Or is 510 perhaps a Linux IRQ number rather than a Xen one?
>
> PHYSDEVOP_map_pirq also returns vector information, but I strongly
> believe this is actually a mistake, as no guest should ever care about
> the vector Xen uses for a particular interrupt.

Agreed, this is probably wrong. The guest should have no idea of the
vector. Not sure if there is any special reason for it.

> Otoh - did you check whether the VMDQ and/or SR-IOV work already
> contains a solution to your problem? I didn't look closely at that
> code yet, but would suppose that there passing through IRQs without
> the whole devices should also be used in some way.

I suspect SR-IOV will have no such usage model, since with SR-IOV the
guest will have a different VF. No idea about VMDq.