Jan Beulich
2008-Sep-08 11:17 UTC
[Xen-devel] Re: Linux patch "translate Xen-provided PIRQs"
>>> Keir Fraser <keir.fraser@eu.citrix.com> 06.09.08 14:47 >>> >On 6/9/08 10:00, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: > >> I reverted this from the tree since it breaks the -xenU build and also mainly >> because I thought it was responsible for breaking our automated tests. >> Actually I''m not so sure about the latter, and the build breakage is easily >> fixed. I''ll do some more investigation on Monday and reinstate the patch (with >> build fix) if it looks okay to do so. >> >> -- Keir > >Jan, > >I fixed the build failure easily enough by making the ifdef-else''ed >definition of NR_PIRQS check for defined(MAX_IO_APICS). > >However, save/restore is definitely broken (I got the same result with both >default -xen and -xenU configs). The appended backtrace is reported >immediately upon resume (it''s the only BUG_ON in restore_cpu_ipis()). I >don''t get this failure with the patch reverted (although automated tests are >still failing save/restore for some unknown reason; but it works for me >manually).Pretty easily fixed - there was another place where the width of the event channel part of the irq_info array was hard-coded to 16 bits, which I overlooked. New patch below.>I''m not personally bothered enough about this patch to fix it myself (I know >that Jeremy has other approaches in pv_ops, which has to be what we move to >during 3.4 development), so this is FYI if you want to fix it up for the >2.6.18 tree.So you really think this will not only be available, but also stable and at feature parity with the current tree within about half a year? I''ve got other significant changes to event channel handling pending that I would only bother to make apply to 2.6.18 if there''s really going to be any use (and testing) for them in that tree. They mostly have to do with the scalability issue of currently binding per-CPU IPIs and VIRQs to one (kernel) IRQ for each CPU, whereas they should really use a single per-CPU IRQ to bind all event channels resulting from the same IPI/VIRQ (Jeremy indicated he had something similar in mind for pv-ops). Jan **************************************************** Previously, the kernel depended upon Xen''s NR_IRQS to be no larger than the kernel''s NR_PIRQS. As usual, written and tested on 2.6.27-rc5 and made apply to the 2.6.18 tree without further testing. Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: head-2008-09-01/arch/i386/kernel/io_apic-xen.c ==================================================================--- head-2008-09-01.orig/arch/i386/kernel/io_apic-xen.c 2008-08-22 15:28:26.000000000 +0200 +++ head-2008-09-01/arch/i386/kernel/io_apic-xen.c 2008-09-03 10:47:24.000000000 +0200 @@ -52,6 +52,7 @@ #include <xen/interface/xen.h> #include <xen/interface/physdev.h> +#include <xen/evtchn.h> /* Fake i8259 */ #define make_8259A_irq(_irq) (io_apic_irqs &= ~(1UL<<(_irq))) @@ -1275,7 +1276,7 @@ static void ioapic_register_intr(int irq set_intr_gate(vector, interrupt[idx]); } #else -#define ioapic_register_intr(_irq,_vector,_trigger) ((void)0) +#define ioapic_register_intr(irq, vector, trigger) evtchn_register_pirq(irq) #endif static void __init setup_IO_APIC_irqs(void) Index: head-2008-09-01/arch/x86_64/kernel/io_apic-xen.c ==================================================================--- head-2008-09-01.orig/arch/x86_64/kernel/io_apic-xen.c 2008-08-22 15:28:26.000000000 +0200 +++ head-2008-09-01/arch/x86_64/kernel/io_apic-xen.c 2008-09-03 10:47:28.000000000 +0200 @@ -81,6 +81,7 @@ int sis_apic_bug; /* not actually suppor #include <xen/interface/xen.h> #include <xen/interface/physdev.h> +#include <xen/evtchn.h> /* Fake i8259 */ #define make_8259A_irq(_irq) (io_apic_irqs &= ~(1UL<<(_irq))) @@ -840,7 +841,7 @@ static void ioapic_register_intr(int irq set_intr_gate(vector, interrupt[idx]); } #else -#define ioapic_register_intr(_irq,_vector,_trigger) ((void)0) +#define ioapic_register_intr(irq, vector, trigger) evtchn_register_pirq(irq) #endif /* !CONFIG_XEN */ static void __init setup_IO_APIC_irqs(void) Index: head-2008-09-01/drivers/pci/msi-xen.c ==================================================================--- head-2008-09-01.orig/drivers/pci/msi-xen.c 2008-09-03 13:17:09.000000000 +0200 +++ head-2008-09-01/drivers/pci/msi-xen.c 2008-09-03 14:38:24.000000000 +0200 @@ -17,6 +17,8 @@ #include <linux/pci.h> #include <linux/proc_fs.h> +#include <xen/evtchn.h> + #include <asm/errno.h> #include <asm/io.h> #include <asm/smp.h> @@ -231,13 +233,15 @@ static int msi_unmap_pirq(struct pci_dev int rc; unmap.domid = msi_get_dev_owner(dev); - unmap.pirq = pirq; + unmap.pirq = evtchn_get_xen_pirq(pirq); if ((rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap))) printk(KERN_WARNING "unmap irq %x failed\n", pirq); if (rc < 0) return rc; + + evtchn_map_pirq(pirq, 0); return 0; } @@ -272,7 +276,7 @@ static int msi_map_pirq_to_vector(struct map_irq.domid = domid; map_irq.type = MAP_PIRQ_TYPE_MSI; map_irq.index = -1; - map_irq.pirq = pirq; + map_irq.pirq = pirq < 0 ? -1 : evtchn_get_xen_pirq(pirq); map_irq.bus = dev->bus->number; map_irq.devfn = dev->devfn; map_irq.entry_nr = entry_nr; @@ -283,8 +287,12 @@ static int msi_map_pirq_to_vector(struct if (rc < 0) return rc; + /* This happens when MSI support is not enabled in Xen. */ + if (rc == 0 && map_irq.pirq < 0) + return -ENOSYS; - return map_irq.pirq; + BUG_ON(map_irq.pirq <= 0); + return evtchn_map_pirq(pirq, map_irq.pirq); } static int msi_map_vector(struct pci_dev *dev, int entry_nr, u64 table_base) Index: head-2008-09-01/drivers/xen/core/evtchn.c ==================================================================--- head-2008-09-01.orig/drivers/xen/core/evtchn.c 2008-09-03 13:17:09.000000000 +0200 +++ head-2008-09-01/drivers/xen/core/evtchn.c 2008-09-03 16:05:34.000000000 +0200 @@ -83,13 +83,27 @@ enum { IRQT_VIRQ, IRQT_IPI, IRQT_LOCAL_PORT, - IRQT_CALLER_PORT + IRQT_CALLER_PORT, + _IRQT_COUNT }; +#define _IRQT_BITS 4 +#define _EVTCHN_BITS 12 +#define _INDEX_BITS (32 - _IRQT_BITS - _EVTCHN_BITS) + /* Constructor for packed IRQ information. */ static inline u32 mk_irq_info(u32 type, u32 index, u32 evtchn) { - return ((type << 24) | (index << 16) | evtchn); + BUILD_BUG_ON(_IRQT_COUNT > (1U << _IRQT_BITS)); + + BUILD_BUG_ON(NR_PIRQS > (1U << _INDEX_BITS)); + BUILD_BUG_ON(NR_VIRQS > (1U << _INDEX_BITS)); + BUILD_BUG_ON(NR_IPIS > (1U << _INDEX_BITS)); + BUG_ON(index >> _INDEX_BITS); + + BUILD_BUG_ON(NR_EVENT_CHANNELS > (1U << _EVTCHN_BITS)); + + return ((type << (32 - _IRQT_BITS)) | (index << _EVTCHN_BITS) | evtchn); } /* Convenient shorthand for packed representation of an unbound IRQ. */ @@ -102,17 +116,17 @@ static inline u32 mk_irq_info(u32 type, static inline unsigned int evtchn_from_irq(int irq) { - return (u16)(irq_info[irq]); + return irq_info[irq] & ((1U << _EVTCHN_BITS) - 1); } static inline unsigned int index_from_irq(int irq) { - return (u8)(irq_info[irq] >> 16); + return (irq_info[irq] >> _EVTCHN_BITS) & ((1U << _INDEX_BITS) - 1); } static inline unsigned int type_from_irq(int irq) { - return (u8)(irq_info[irq] >> 24); + return irq_info[irq] >> (32 - _IRQT_BITS); } /* IRQ <-> VIRQ mapping. */ @@ -882,6 +896,60 @@ static struct irq_chip dynirq_chip = { .retrigger = resend_irq_on_evtchn, }; +void evtchn_register_pirq(int irq) +{ + irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, 0); +} + +#ifndef CONFIG_X86_IO_APIC +#undef IO_APIC_IRQ +#define IO_APIC_IRQ(irq) ((irq) >= pirq_to_irq(16)) +#endif + +int evtchn_map_pirq(int irq, int xen_pirq) +{ + if (irq < 0) { + static DEFINE_SPINLOCK(irq_alloc_lock); + + irq = pirq_to_irq(NR_PIRQS - 1); + spin_lock(&irq_alloc_lock); + do { + if (!IO_APIC_IRQ(irq)) + continue; + if (!index_from_irq(irq)) { + BUG_ON(type_from_irq(irq) != IRQT_UNBOUND); + irq_info[irq] = mk_irq_info(IRQT_PIRQ, + xen_pirq, 0); + break; + } + } while (--irq); + spin_unlock(&irq_alloc_lock); + if (irq < pirq_to_irq(16)) + return -ENOSPC; + } else if (!xen_pirq) { + if (unlikely(type_from_irq(irq) != IRQT_PIRQ)) + return -EINVAL; + irq_info[irq] = IRQ_UNBOUND; + return 0; + } else if (type_from_irq(irq) != IRQT_PIRQ + || index_from_irq(irq) != xen_pirq) { + printk(KERN_ERR "IRQ#%d is already mapped to %d:%u - " + "cannot map to PIRQ#%u\n", + irq, type_from_irq(irq), index_from_irq(irq), xen_pirq); + return -EINVAL; + } + return index_from_irq(irq) ? irq : -EINVAL; +} + +int evtchn_get_xen_pirq(int irq) +{ + if (!IO_APIC_IRQ(irq)) + return irq; + if (unlikely(type_from_irq(irq) != IRQT_PIRQ)) + return 0; + return index_from_irq(irq); +} + static inline void pirq_unmask_notify(int pirq) { struct physdev_eoi eoi = { .irq = pirq }; @@ -914,7 +982,7 @@ static unsigned int startup_pirq(unsigne if (VALID_EVTCHN(evtchn)) goto out; - bind_pirq.pirq = irq; + bind_pirq.pirq = evtchn_get_xen_pirq(irq); /* NB. We are happy to share unless we are probing. */ bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE; if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) { @@ -929,7 +997,7 @@ static unsigned int startup_pirq(unsigne evtchn_to_irq[evtchn] = irq; bind_evtchn_to_cpu(evtchn, 0); - irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, evtchn); + irq_info[irq] = mk_irq_info(IRQT_PIRQ, bind_pirq.pirq, evtchn); out: unmask_evtchn(evtchn); @@ -954,7 +1022,7 @@ static void shutdown_pirq(unsigned int i bind_evtchn_to_cpu(evtchn, 0); evtchn_to_irq[evtchn] = -1; - irq_info[irq] = IRQ_UNBOUND; + irq_info[irq] = mk_irq_info(IRQT_PIRQ, index_from_irq(irq), 0); } static void enable_pirq(unsigned int irq) @@ -1210,7 +1278,7 @@ static int evtchn_resume(struct sys_devi /* No IRQ <-> event-channel mappings. */ for (irq = 0; irq < NR_IRQS; irq++) - irq_info[irq] &= ~0xFFFF; /* zap event-channel binding */ + irq_info[irq] &= ~((1U << _EVTCHN_BITS) - 1); for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) evtchn_to_irq[evtchn] = -1; Index: head-2008-09-01/include/asm-i386/mach-xen/irq_vectors.h ==================================================================--- head-2008-09-01.orig/include/asm-i386/mach-xen/irq_vectors.h 2008-09-03 13:17:09.000000000 +0200 +++ head-2008-09-01/include/asm-i386/mach-xen/irq_vectors.h 2008-09-03 11:09:40.000000000 +0200 @@ -37,7 +37,13 @@ */ #define PIRQ_BASE 0 -#define NR_PIRQS 256 +#if !defined(MAX_IO_APICS) +# define NR_PIRQS (NR_VECTORS + 32 * NR_CPUS) +#elif NR_CPUS < MAX_IO_APICS +# define NR_PIRQS (NR_VECTORS + 32 * NR_CPUS) +#else +# define NR_PIRQS (NR_VECTORS + 32 * MAX_IO_APICS) +#endif #define DYNIRQ_BASE (PIRQ_BASE + NR_PIRQS) #define NR_DYNIRQS 256 Index: head-2008-09-01/include/xen/evtchn.h ==================================================================--- head-2008-09-01.orig/include/xen/evtchn.h 2008-09-03 13:17:09.000000000 +0200 +++ head-2008-09-01/include/xen/evtchn.h 2008-09-03 14:34:50.000000000 +0200 @@ -117,6 +117,13 @@ asmlinkage void evtchn_do_upcall(struct /* Entry point for notifications into the userland character device. */ void evtchn_device_upcall(int port); +/* Mark a PIRQ as unavailable for dynamic allocation. */ +void evtchn_register_pirq(int irq); +/* Map a Xen-supplied PIRQ to a dynamically allocated one. */ +int evtchn_map_pirq(int irq, int xen_pirq); +/* Look up a Xen-supplied PIRQ for a dynamically allocated one. */ +int evtchn_get_xen_pirq(int irq); + void mask_evtchn(int port); void disable_all_local_evtchn(void); void unmask_evtchn(int port); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Sep-08 12:12 UTC
[Xen-devel] Re: Linux patch "translate Xen-provided PIRQs"
On 8/9/08 12:17, "Jan Beulich" <jbeulich@novell.com> wrote:>> I''m not personally bothered enough about this patch to fix it myself (I know >> that Jeremy has other approaches in pv_ops, which has to be what we move to >> during 3.4 development), so this is FYI if you want to fix it up for the >> 2.6.18 tree. > > So you really think this will not only be available, but also stable and at > feature parity with the current tree within about half a year? I''ve got > other significant changes to event channel handling pending that I would > only bother to make apply to 2.6.18 if there''s really going to be any use > (and testing) for them in that tree. They mostly have to do with the > scalability issue of currently binding per-CPU IPIs and VIRQs to one > (kernel) IRQ for each CPU, whereas they should really use a single per-CPU > IRQ to bind all event channels resulting from the same IPI/VIRQ (Jeremy > indicated he had something similar in mind for pv-ops).I believe we will be testing and running a xenbits pv_ops-based tree within 6 months, yes. Obviously we will have branches containing patches not yet in, or perhaps suitable for, upstream. Actually I hope we can move decisively towards this in the next couple of months, once Jeremy has the basic dom0 stuff working well enough. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel