This patch set adds support for msi in Xen dom0. It's based on the
pci notifier patches of Weidong Han (on rebase/pci branch) and
contains the following 3 patches.

[PATCH 1/3] xen: make pci notifier work with booting devices
[PATCH 2/3] xen: add msi support for dom0
[PATCH 3/3] xen: re-enable msi (effectively revert bf89bc29)

One of the problems left is how to save/restore MSI across S3. Since
pci_restore_msi_state() now doesn't have any arch specific hook, the
code in arch/x86/ won't get a chance to run during S3 wakeup, so
write_msi_msg() is called instead of xen specific functions. One of
the possible solutions (and the one I prefer) is to add something like
arch_pci_restore_msi, but that involves slightly changing
drivers/pci/msi.c, which probably needs more thinking and discussion.

An alternative is to trap and emulate any access to pci configuration
space. In that case, nothing in dom0 needs changing, and write_msi_msg
can be reused, but considerable logic may need to change in the Xen
hypervisor.

Thanks,
Qing

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
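The first option in the cover letter, an arch hook consulted by the generic restore path, could look roughly like the userspace sketch below. It is only an illustration under stated assumptions: every name ending in _stub, as well as set_on_xen, is hypothetical, and the real arch_pci_restore_msi (if it were added) might have a different signature and return convention.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device with MSI enabled. */
struct msi_dev_stub {
    int msg_writes;   /* how many times the MSI message was rewritten */
};

/* Hypothetical switch standing in for a check such as xen_domain(). */
static bool on_xen;

void set_on_xen(bool value)
{
    on_xen = value;
}

/* Stand-in for write_msi_msg(): reprogram address/data on the device. */
static void write_msi_msg_stub(struct msi_dev_stub *dev)
{
    dev->msg_writes++;
}

/*
 * Hypothetical arch hook. Returns true when the architecture (here: Xen)
 * restores the MSI state itself, so generic code must not touch the device.
 */
static bool arch_pci_restore_msi_stub(struct msi_dev_stub *dev)
{
    (void)dev;
    return on_xen;
}

/* Generic resume path: ask the arch hook first, else do the plain write. */
void pci_restore_msi_state_stub(struct msi_dev_stub *dev)
{
    if (arch_pci_restore_msi_stub(dev))
        return;
    write_msi_msg_stub(dev);
}
```

With on_xen left false the device message is rewritten on resume, as on bare metal; once set_on_xen(true) has been called the generic write is skipped. The cost of this option is the one the cover letter names: the generic code in drivers/pci/msi.c has to learn about the hook.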
Qing He
2009-Aug-18 05:45 UTC
[Xen-devel] [PATCH 1/3] xen: make pci notifier work with booting devices
Change fs_initcall to arch_initcall so that the pci notifier also
handles pci devices that are already present at boot.

Signed-off-by: Qing He <qing.he@intel.com>
---
 drivers/xen/pci.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
index b13e054..5156278 100644
--- a/drivers/xen/pci.c
+++ b/drivers/xen/pci.c
@@ -113,4 +113,4 @@ static int __init register_xen_pci_notifier(void)
 	return bus_register_notifier(&pci_bus_type, &device_nb);
 }
 
-fs_initcall(register_xen_pci_notifier);
+arch_initcall(register_xen_pci_notifier);
-- 
1.6.0
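Why the initcall level matters here can be shown with a small userspace model. This is a sketch, assuming that boot-time PCI devices are added to the bus around subsys_initcall time; the three levels mirror the kernel's ordering (arch, then subsys, then fs), while boot_and_count, add_device and count_device are hypothetical names made up for the illustration and are not kernel functions.

```c
#include <stddef.h>

/* Subset of the kernel's initcall levels, in execution order. */
enum initcall_level { LEVEL_ARCH = 3, LEVEL_SUBSYS = 4, LEVEL_FS = 5 };

typedef void (*device_added_fn)(int devfn);

static device_added_fn notifier;  /* the single registered notifier, if any */
static int devices_seen;          /* devices the notifier was told about */

static void count_device(int devfn)
{
    (void)devfn;
    devices_seen++;
}

/* Adding a device only notifies whoever is registered at that moment. */
static void add_device(int devfn)
{
    if (notifier)
        notifier(devfn);
}

/*
 * Walk the levels in order. The notifier is registered at notifier_level;
 * in this model the ndev boot devices are enumerated at LEVEL_SUBSYS.
 * Returns how many boot devices the notifier observed.
 */
int boot_and_count(enum initcall_level notifier_level, int ndev)
{
    notifier = NULL;
    devices_seen = 0;

    for (int level = LEVEL_ARCH; level <= LEVEL_FS; level++) {
        if (level == (int)notifier_level)
            notifier = count_device;
        if (level == LEVEL_SUBSYS)
            for (int i = 0; i < ndev; i++)
                add_device(i);
    }
    return devices_seen;
}
```

In this model a notifier registered at LEVEL_ARCH sees every boot device, while one registered at LEVEL_FS sees none of them, because it is registered only after enumeration has finished.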
This patch adds msi support for dom0. It is based on the
arch_setup_msi_irqs hook: xen_setup_msi_irqs is called when running in a
Xen domain. No interrupt remapping is handled, since that feature isn't
exposed to Xen domains at this time.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
---
 arch/x86/include/asm/xen/pci.h  |   20 +++++++++
 arch/x86/kernel/apic/io_apic.c  |    9 +++-
 arch/x86/xen/pci.c              |   24 ++++++++++
 drivers/xen/events.c            |   91 ++++++++++++++++++++++++++++++++++++++-
 include/xen/interface/physdev.h |   30 +++++++++++++
 5 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
index 0563fc6..714443b 100644
--- a/arch/x86/include/asm/xen/pci.h
+++ b/arch/x86/include/asm/xen/pci.h
@@ -3,11 +3,31 @@
 
 #ifdef CONFIG_XEN_DOM0_PCI
 int xen_register_gsi(u32 gsi, int triggering, int polarity);
+int xen_create_msi_irq(struct pci_dev *dev,
+			struct msi_desc *msidesc,
+			int type);
+int xen_destroy_irq(int irq);
+int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 #else
 static inline int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
 	return -1;
 }
+
+static int xen_create_msi_irq(struct pci_dev *dev,
+			struct msi_desc *msidesc,
+			int type)
+{
+	return -1;
+}
+static int xen_destroy_irq(int irq)
+{
+	return -1;
+}
+static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+{
+	return -1;
+}
 #endif
 
 #endif /* _ASM_X86_XEN_PCI_H */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index b562550..ce82ddb 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -66,6 +66,7 @@
 #include <asm/xen/hypervisor.h>
 
 #include <asm/apic.h>
+#include <asm/xen/pci.h>
 
 #define __apicdebuginit(type) static type __init
 
@@ -3502,6 +3503,9 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	if (type == PCI_CAP_ID_MSI && nvec > 1)
 		return 1;
 
+	if (xen_domain())
+		return xen_setup_msi_irqs(dev, nvec, type);
+
 	irq_want = nr_irqs_gsi;
 	sub_handle = 0;
 	list_for_each_entry(msidesc, &dev->msi_list, list) {
@@ -3550,7 +3554,10 @@ error:
 
 void arch_teardown_msi_irq(unsigned int irq)
 {
-	destroy_irq(irq);
+	if (xen_domain())
+		xen_destroy_irq(irq);
+	else
+		destroy_irq(irq);
 }
 
 #if defined (CONFIG_DMAR) || defined (CONFIG_INTR_REMAP)
diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 07b59fe..c0ef627 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -1,12 +1,14 @@
 #include <linux/kernel.h>
 #include <linux/acpi.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 
 #include <asm/mpspec.h>
 #include <asm/io_apic.h>
 #include <asm/pci_x86.h>
 
 #include <asm/xen/hypervisor.h>
+#include <asm/xen/pci.h>
 
 #include <xen/interface/xen.h>
 #include <xen/events.h>
@@ -84,3 +86,25 @@ void __init xen_setup_pirqs(void)
 			polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH);
 	}
 }
+
+int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+{
+	unsigned int irq;
+	int ret;
+	struct msi_desc *msidesc;
+
+	list_for_each_entry(msidesc, &dev->msi_list, list) {
+		irq = xen_create_msi_irq(dev, msidesc, type);
+		if (irq == 0)
+			return -1;
+
+		ret = set_irq_msi(irq, msidesc);
+		if (ret)
+			goto error;
+	}
+	return 0;
+
+error:
+	xen_destroy_irq(irq);
+	return ret;
+}
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index af2aad4..eef4834 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -28,6 +28,9 @@
 #include <linux/string.h>
 #include <linux/bootmem.h>
 #include <linux/irqnr.h>
+#include <linux/pci_regs.h>
+#include <linux/pci.h>
+#include <linux/msi.h>
 
 #include <asm/ptrace.h>
 #include <asm/irq.h>
@@ -42,6 +45,8 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
+#include "../pci/msi.h"
+
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -560,14 +565,98 @@ int xen_allocate_pirq(unsigned gsi, char *name)
 	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) {
 		dynamic_irq_cleanup(irq);
 		irq = -ENOSPC;
+		goto out;
+	}
+
+	irq_info[irq] = mk_pirq_info(0, gsi, irq_op.vector);
+out:
+	spin_unlock(&irq_mapping_update_lock);
+	return irq;
+}
+
+int xen_destroy_irq(int irq)
+{
+	struct irq_desc *desc;
+	struct physdev_unmap_pirq unmap_irq;
+	int rc = -ENOENT;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	desc = irq_to_desc(irq);
+	if (!desc)
+		goto out;
+
+	unmap_irq.pirq = irq;
+	unmap_irq.domid = DOMID_SELF;
+	rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
+	if (rc) {
+		printk(KERN_WARNING "unmap irq failed %x\n", rc);
 		goto out;
 	}
 
-	irq_info[irq] = mk_pirq_info(0, gsi, irq_op.vector);
+	irq_info[irq] = mk_unbound_info();
+
+	dynamic_irq_cleanup(irq);
 
 out:
 	spin_unlock(&irq_mapping_update_lock);
+	return rc;
+}
 
+int xen_create_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int type)
+{
+	int irq = 0;
+	struct physdev_map_pirq map_irq;
+	int rc;
+	domid_t domid = DOMID_SELF;
+	int pos;
+	u32 table_offset, bir;
+
+	memset(&map_irq, 0, sizeof(map_irq));
+	map_irq.domid = domid;
+	map_irq.type = MAP_PIRQ_TYPE_MSI;
+	map_irq.index = -1;
+	map_irq.bus = dev->bus->number;
+	map_irq.devfn = dev->devfn;
+
+	if (type == PCI_CAP_ID_MSIX) {
+		pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
+
+		pci_read_config_dword(dev, msix_table_offset_reg(pos),
+					&table_offset);
+		bir = (u8)(table_offset & PCI_MSIX_FLAGS_BIRMASK);
+
+		map_irq.table_base = pci_resource_start(dev, bir);
+		map_irq.entry_nr = msidesc->msi_attrib.entry_nr;
+	}
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = find_unbound_irq();
+
+	if (irq == -1)
+		goto out;
+
+	map_irq.pirq = irq;
+
+	rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
+	if (rc) {
+
+		printk(KERN_WARNING "xen map irq failed %x\n", rc);
+
+		dynamic_irq_cleanup(irq);
+
+		irq = -1;
+		goto out;
+	}
+
+	irq_info[irq] = mk_pirq_info(0, -1, map_irq.index);
+	set_irq_chip_and_handler_name(irq, &xen_pirq_chip,
+			handle_level_irq,
+			(type == PCI_CAP_ID_MSIX) ? "msi-x":"msi");
+
+out:
+	spin_unlock(&irq_mapping_update_lock);
 	return irq;
 }
 
diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h
index 7a7d007..ac5de37 100644
--- a/include/xen/interface/physdev.h
+++ b/include/xen/interface/physdev.h
@@ -106,6 +106,36 @@ struct physdev_irq {
 	uint32_t vector;
 };
 
+#define MAP_PIRQ_TYPE_MSI 0x0
+#define MAP_PIRQ_TYPE_GSI 0x1
+#define MAP_PIRQ_TYPE_UNKNOWN 0x2
+
+#define PHYSDEVOP_map_pirq 13
+struct physdev_map_pirq {
+	domid_t domid;
+	/* IN */
+	int type;
+	/* IN */
+	int index;
+	/* IN or OUT */
+	int pirq;
+	/* IN */
+	int bus;
+	/* IN */
+	int devfn;
+	/* IN */
+	int entry_nr;
+	/* IN */
+	uint64_t table_base;
+};
+
+#define PHYSDEVOP_unmap_pirq 14
+struct physdev_unmap_pirq {
+	domid_t domid;
+	/* IN */
+	int pirq;
+};
+
 #define PHYSDEVOP_manage_pci_add 15
 #define PHYSDEVOP_manage_pci_remove 16
 struct physdev_manage_pci {
-- 
1.6.0
Remove the pci_no_msi() call for Xen dom0, and move pci_no_msi out of
the public header.

This effectively reverts commit bf89bc290d429ce223c1018628130ddabc66614e
    xen: disable MSI

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
---
 arch/x86/xen/apic.c |    3 ---
 drivers/pci/pci.h   |    2 ++
 include/linux/pci.h |    6 ------
 3 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 496f07d..ee0db39 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -1,7 +1,6 @@
 #include <linux/kernel.h>
 #include <linux/threads.h>
 #include <linux/bitmap.h>
-#include <linux/pci.h>
 
 #include <asm/io_apic.h>
 #include <asm/acpi.h>
@@ -48,8 +47,6 @@ void xen_init_apic(void)
 	if (!xen_initial_domain())
 		return;
 
-	pci_no_msi();
-
 #ifdef CONFIG_ACPI
 	/*
 	 * Pretend ACPI found our lapic even though we've disabled it,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 79ada7b..d03f6b9 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -111,8 +111,10 @@ extern struct rw_semaphore pci_bus_sem;
 extern unsigned int pci_pm_d3_delay;
 
 #ifdef CONFIG_PCI_MSI
+void pci_no_msi(void);
 extern void pci_msi_init_pci_dev(struct pci_dev *dev);
 #else
+static inline void pci_no_msi(void) { }
 static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { }
 #endif
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 75b0645..e831a10 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1256,11 +1256,5 @@ static inline irqreturn_t pci_sriov_migration(struct pci_dev *dev)
 }
 #endif
 
-#ifdef CONFIG_PCI_MSI
-void pci_no_msi(void);
-#else
-static inline void pci_no_msi(void) { }
-#endif
-
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */
-- 
1.6.0
Jeremy Fitzhardinge
2009-Aug-18 20:24 UTC
Re: [Xen-devel] [PATCH 0/3] xen: msi support for Xen dom0
On 08/17/09 22:45, Qing He wrote:
> This patch set adds support for msi in Xen dom0. It's based on the
> pci notifier patches of Weidong Han (on rebase/pci branch) and
> contains the following 3 patches.
>
> [PATCH 1/3] xen: make pci notifier work with booting devices
> [PATCH 2/3] xen: add msi support for dom0
> [PATCH 3/3] xen: re-enable msi (effectively revert bf89bc29)
>

Thanks, I've applied these to rebase/dom0/msi for now.  I haven't
tested them (or really compiled them) yet, so please look at the branch
and see that everything's OK.

> One of the problems left is how to save/restore MSI across S3. Since
> pci_restore_msi_state() now doesn't have any arch specific hook, the
> code in arch/x86/ won't get a chance to run during S3 wakeup, so
> write_msi_msg() is called instead of xen specific functions. One of
> the possible solutions (and the one I prefer) is to add something like
> arch_pci_restore_msi, but that involves slightly changing
> drivers/pci/msi.c, which probably needs more thinking and discussion.
>
> An alternative is to trap and emulate any access to pci configuration
> space. In that case, nothing in dom0 needs changing, and write_msi_msg
> can be reused, but considerable logic may need to change in the Xen
> hypervisor.
>

The approach taken by 2/3 is not really going to fly upstream, and is
broadly incompatible with my intended design for interrupt handling,
which is to decouple the Xen/dom0 aspects of interrupt handling from the
apic/ioapic code entirely.  I don't know what impact this will have on
MSI support.  I'd appreciate it if you could look at the
rebase/dom0/new-interrupt-routing branch and comment on it.

I'm not actually sure this approach is going to work; so far it just
locks up the machine shortly after ACPI initialization.  Trapping and
emulating (IO-)APIC accesses may well turn out to be simpler (on the
Linux side, at least) and more robust in the end...

    J
On Wed, 2009-08-19 at 04:24 +0800, Jeremy Fitzhardinge wrote:
> On 08/17/09 22:45, Qing He wrote:
> > This patch set adds support for msi in Xen dom0. It's based on the
> > pci notifier patches of Weidong Han (on rebase/pci branch) and
> > contains the following 3 patches.
> >
> > [PATCH 1/3] xen: make pci notifier work with booting devices
> > [PATCH 2/3] xen: add msi support for dom0
> > [PATCH 3/3] xen: re-enable msi (effectively revert bf89bc29)
> >
>
> Thanks, I've applied these to rebase/dom0/msi for now.  I haven't
> tested them (or really compiled them) yet, so please look at the branch
> and see that everything's OK.
>
> > One of the problems left is how to save/restore MSI across S3. Since
> > pci_restore_msi_state() now doesn't have any arch specific hook, the
> > code in arch/x86/ won't get a chance to run during S3 wakeup, so
> > write_msi_msg() is called instead of xen specific functions. One of
> > the possible solutions (and the one I prefer) is to add something like
> > arch_pci_restore_msi, but that involves slightly changing
> > drivers/pci/msi.c, which probably needs more thinking and discussion.
> >
> > An alternative is to trap and emulate any access to pci configuration
> > space. In that case, nothing in dom0 needs changing, and write_msi_msg
> > can be reused, but considerable logic may need to change in the Xen
> > hypervisor.
> >
>
> The approach taken by 2/3 is not really going to fly upstream, and is
> broadly incompatible with my intended design for interrupt handling,
> which is to decouple the Xen/dom0 aspects of interrupt handling from the
> apic/ioapic code entirely.  I don't know what impact this will have on
> MSI support.  I'd appreciate it if you could look at the
> rebase/dom0/new-interrupt-routing branch and comment on it.

I'll have a look at it first.

>
> I'm not actually sure this approach is going to work; so far it just
> locks up the machine shortly after ACPI initialization.  Trapping and
> emulating (IO-)APIC accesses may well turn out to be simpler (on the
> Linux side, at least) and more robust in the end...

MSI by nature is vector based, but xen uses pirq as the interface to
communicate with dom0, 2/3 is actually used to handle the pirq thing
instead of solely vectors.

So trapping and emulating for msi will first need a decoupling of that,
allowing dom0 to use only vector to index its msi. This requires changing
of the current pirq based xen irq, which seems quite annoying at Xen side,
especially if per-cpu vector is in.

Thanks,
Qing

>
>     J
Jeremy Fitzhardinge
2009-Aug-19 19:22 UTC
Re: [Xen-devel] [PATCH 0/3] xen: msi support for Xen dom0
On 08/18/09 18:45, Qing He wrote:
> On Wed, 2009-08-19 at 04:24 +0800, Jeremy Fitzhardinge wrote:
>
>> On 08/17/09 22:45, Qing He wrote:
>>
>>> This patch set adds support for msi in Xen dom0. It's based on the
>>> pci notifier patches of Weidong Han (on rebase/pci branch) and
>>> contains the following 3 patches.
>>>
>>> [PATCH 1/3] xen: make pci notifier work with booting devices
>>> [PATCH 2/3] xen: add msi support for dom0
>>> [PATCH 3/3] xen: re-enable msi (effectively revert bf89bc29)
>>>
>> Thanks, I've applied these to rebase/dom0/msi for now.  I haven't
>> tested them (or really compiled them) yet, so please look at the branch
>> and see that everything's OK.
>>
>>> One of the problems left is how to save/restore MSI across S3. Since
>>> pci_restore_msi_state() now doesn't have any arch specific hook, the
>>> code in arch/x86/ won't get a chance to run during S3 wakeup, so
>>> write_msi_msg() is called instead of xen specific functions. One of
>>> the possible solutions (and the one I prefer) is to add something like
>>> arch_pci_restore_msi, but that involves slightly changing
>>> drivers/pci/msi.c, which probably needs more thinking and discussion.
>>>
>>> An alternative is to trap and emulate any access to pci configuration
>>> space. In that case, nothing in dom0 needs changing, and write_msi_msg
>>> can be reused, but considerable logic may need to change in the Xen
>>> hypervisor.
>>>
>> The approach taken by 2/3 is not really going to fly upstream, and is
>> broadly incompatible with my intended design for interrupt handling,
>> which is to decouple the Xen/dom0 aspects of interrupt handling from the
>> apic/ioapic code entirely.  I don't know what impact this will have on
>> MSI support.  I'd appreciate it if you could look at the
>> rebase/dom0/new-interrupt-routing branch and comment on it.
>>
> I'll have a look at it first.
>
>> I'm not actually sure this approach is going to work; so far it just
>> locks up the machine shortly after ACPI initialization.  Trapping and
>> emulating (IO-)APIC accesses may well turn out to be simpler (on the
>> Linux side, at least) and more robust in the end...
>>
> MSI by nature is vector based, but xen uses pirq as the interface to
> communicate with dom0, 2/3 is actually used to handle the pirq thing
> instead of solely vectors.
>

Yes, that's awkward.  The new-interrupt-routing branch moves towards
eliminating vectors from the kernel<->xen interface, replacing them
entirely with pirqs.  Part of the motivation for this is to insulate the
kernel from Xen's decisions on how to allocate and route vectors (ie,
the number of vectors and whether they're percpu or not).

Extending this to MSI would presumably require Xen to do the actual PCI
(config space?) programming given a pirq for the device interrupt.
Perhaps this could be best achieved by making the x86
arch_setup_msi_irqs call through a function pointer so we can direct it
to a Xen-specific version.  Would that be clean?

> So trapping and emulating for msi will first need a decoupling of that,
> allowing dom0 to use only vector to index its msi. This requires changing
> of the current pirq based xen irq, which seems quite annoying at Xen side,
> especially if per-cpu vector is in.
>

What about making pirq == vector, so the "vector" seen by the kernel
isn't a real vector at all, and Xen remaps from the pirq/vector to the
real vector when emulating the write?

    J
On Thu, 2009-08-20 at 03:22 +0800, Jeremy Fitzhardinge wrote:
> >> I'm not actually sure this approach is going to work; so far it just
> >> locks up the machine shortly after ACPI initialization.  Trapping and
> >> emulating (IO-)APIC accesses may well turn out to be simpler (on the
> >> Linux side, at least) and more robust in the end...
> >>
> > MSI by nature is vector based, but xen uses pirq as the interface to
> > communicate with dom0, 2/3 is actually used to handle the pirq thing
> > instead of solely vectors.
> >
>
> Yes, that's awkward.  The new-interrupt-routing branch moves towards
> eliminating vectors from the kernel<->xen interface, replacing them
> entirely with pirqs.  Part of the motivation for this is to insulate the
> kernel from Xen's decisions on how to allocate and route vectors (ie,
> the number of vectors and whether they're percpu or not).
>
> Extending this to MSI would presumably require Xen to do the actual PCI
> (config space?) programming given a pirq for the device interrupt.
> Perhaps this could be best achieved by making the x86
> arch_setup_msi_irqs call through a function pointer so we can direct it
> to a Xen-specific version.  Would that be clean?

I finished reading the new-interrupt-routing branch and revisited MSI.
Basically, I think this branch is OK: exposing Xen vectors to the guest
is quite weird (the same goes for the `index' field of the map_pirq
call), and this `vector' is now even renamed to `irq' because of per-cpu
vectors.

And my MSI patches don't conflict significantly with the
new-interrupt-routing branch. We can consider the setting up of MSI as
two aspects: one is programming the interrupt source (by writing to
config space and MSIX mmio), the other is software: creating new irqs,
and starting them. The method I use now is to add a branch
in arch_setup_msi_irqs, to (1) set the irq_chip of these new irqs to
pirq; (2) avoid programming the config space. This can be changed to
function pointers with no additional pain.

> > So trapping and emulating for msi will first need a decoupling of that,
> > allowing dom0 to use only vector to index its msi. This requires changing
> > of the current pirq based xen irq, which seems quite annoying at Xen side,
> > especially if per-cpu vector is in.
> >
>
> What about making pirq == vector, so the "vector" seen by the kernel
> isn't a real vector at all, and Xen remaps from the pirq/vector to the
> real vector when emulating the write?

But the vector is allocated by the kernel, right? How can Xen know which
vector (and its CPU affinity) is related to a pirq?

For suspend/resume support of MSI devices, what the kernel does on bare
metal is simply to reprogram the interrupt source, including writing
vectors and setting the enable bit. On Xen, the reprogramming is done in
the hypervisor, so we just need to bypass the kernel reprogramming: if
using T&E, just ignore address/data writes, and if not, an arch specific
hook is needed.

And what do you mean by I/O APIC T&E? Since dom0 is eventually notified
by evtchn, a relation between vector and evtchn has to be established,
which doesn't exist yet. I once thought of mapping evtchns for (vcpu,
vector) pairs at vector allocation time, so that everything else can be
kept untouched. But that seems to involve too much change.

Thanks,
Qing

>
>     J
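The function-pointer variant discussed in this exchange can be pictured with the following userspace sketch. It is not code from either branch: all names ending in _stub, the msi_backend enum, use_xen_msi_ops and last_msi_backend are hypothetical, and the real arch_setup_msi_irqs also takes a type argument that is left out here for brevity.

```c
/* Hypothetical stand-in for struct pci_dev. */
struct pci_dev_stub {
    int devfn;
};

enum msi_backend { BACKEND_NONE, BACKEND_NATIVE, BACKEND_XEN };

/* Records which implementation handled the most recent setup request. */
static enum msi_backend last_backend = BACKEND_NONE;

/* Bare metal: would allocate vectors and program the device itself. */
static int native_setup_msi_irqs_stub(struct pci_dev_stub *dev, int nvec)
{
    (void)dev;
    (void)nvec;
    last_backend = BACKEND_NATIVE;
    return 0;
}

/*
 * Xen dom0: would ask the hypervisor for a pirq, bind the irq to the pirq
 * irq_chip, and leave config space programming to Xen.
 */
static int xen_setup_msi_irqs_stub(struct pci_dev_stub *dev, int nvec)
{
    (void)dev;
    (void)nvec;
    last_backend = BACKEND_XEN;
    return 0;
}

/* The indirection: defaults to the native implementation. */
static int (*setup_msi_irqs_op)(struct pci_dev_stub *dev, int nvec) =
    native_setup_msi_irqs_stub;

/* Called once during early Xen setup in this model. */
void use_xen_msi_ops(void)
{
    setup_msi_irqs_op = xen_setup_msi_irqs_stub;
}

/* Generic entry point: no xen_domain() branch, just the indirect call. */
int arch_setup_msi_irqs_stub(struct pci_dev_stub *dev, int nvec)
{
    return setup_msi_irqs_op(dev, nvec);
}

enum msi_backend last_msi_backend(void)
{
    return last_backend;
}
```

Compared with the `if (xen_domain())` branch in patch 2/3, the io_apic.c side no longer needs to know that Xen exists; the pointer is switched once, before any device asks for MSI.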
Jeremy Fitzhardinge
2009-Sep-02 22:55 UTC
Re: [Xen-devel] [PATCH 0/3] xen: msi support for Xen dom0
On 08/31/09 18:54, Qing He wrote:
> I finished reading the new-interrupt-routing branch and revisited MSI.
> Basically, I think this branch is OK: exposing Xen vectors to the guest
> is quite weird (the same goes for the `index' field of the map_pirq
> call), and this `vector' is now even renamed to `irq' because of per-cpu
> vectors.
>

OK, good.

> And my MSI patches don't conflict significantly with the
> new-interrupt-routing branch. We can consider the setting up of MSI as
> two aspects: one is programming the interrupt source (by writing to
> config space and MSIX mmio), the other is software: creating new irqs,
> and starting them. The method I use now is to add a branch
> in arch_setup_msi_irqs, to (1) set the irq_chip of these new irqs to
> pirq; (2) avoid programming the config space. This can be changed to
> function pointers with no additional pain.
>

Yep.

>>> So trapping and emulating for msi will first need a decoupling of that,
>>> allowing dom0 to use only vector to index its msi. This requires changing
>>> of the current pirq based xen irq, which seems quite annoying at Xen side,
>>> especially if per-cpu vector is in.
>>>
>> What about making pirq == vector, so the "vector" seen by the kernel
>> isn't a real vector at all, and Xen remaps from the pirq/vector to the
>> real vector when emulating the write?
>>
> But the vector is allocated by the kernel, right? How can Xen know which
> vector (and its CPU affinity) is related to a pirq?
>

In the current dom0 scheme, the kernel does a hypercall to ask Xen to
allocate the vector for a given pirq.

> For suspend/resume support of MSI devices, what the kernel does on bare
> metal is simply to reprogram the interrupt source, including writing
> vectors and setting the enable bit. On Xen, the reprogramming is done in
> the hypervisor, so we just need to bypass the kernel reprogramming: if
> using T&E, just ignore address/data writes, and if not, an arch specific
> hook is needed.
>

That's fairly straightforward, since the PCI config space is mapped
anyway.  We just need to have some way to create mappings that will trap
(ie, we can't use the ptes the kernel establishes as-is, as they won't
cause a pagefault).

> And what do you mean by I/O APIC T&E? Since dom0 is eventually notified
> by evtchn, a relation between vector and evtchn has to be established,
> which doesn't exist yet. I once thought of mapping evtchns for (vcpu,
> vector) pairs at vector allocation time, so that everything else can be
> kept untouched. But that seems to involve too much change.
>

At the moment we need to paravirtualize IO APIC register writes via
hypercall, specifically so that Xen can completely establish the
apic+pin -> vector -> pirq -> evtchn mapping.  It needs to be a
hypercall because the IO APIC doesn't have an explicit mapping within
the dom0 address space.  If we created such a mapping and trapped reads
and writes to it, it would be logically equivalent to the hypercall, but
wouldn't require kernel changes.

However, this is moot in the new-interrupt-routing branch as we take
over the whole interrupt path, so there's no problem making hypercalls
(either to update the IO APIC via register writes, or the new hypercall
to establish a gsi -> pirq mapping).

    J