Jeremy Fitzhardinge
2010-Oct-16 00:15 UTC
[Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling
On 10/12/2010 01:29 PM, tip-bot for Thomas Gleixner wrote:> Commit-ID: 77dff1c755c3218691e95e7e38ee14323b35dbdb > Gitweb: http://git.kernel.org/tip/77dff1c755c3218691e95e7e38ee14323b35dbdb > Author: Thomas Gleixner <tglx@linutronix.de> > AuthorDate: Wed, 29 Sep 2010 17:37:10 +0200 > Committer: Thomas Gleixner <tglx@linutronix.de> > CommitDate: Tue, 12 Oct 2010 16:53:44 +0200 > > x86: xen: Sanitise sparse_irq handling > > There seems to be more cleanups possible, but that''s left to the xen > experts :)This causes the kernel to fail to boot under Xen. The WARN_ON(res !irq) triggers and nobody is very happy about the results. It looks like the patch below might be sufficient, but then the system fails due to block IO errors (not sure if that''s related). J diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 7d24b0d..a11fabb 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -338,31 +338,7 @@ static void unmask_evtchn(int port) static int find_unbound_irq(void) { - struct irq_data *data; - int irq, res; - - for (irq = 0; irq < nr_irqs; irq++) { - data = irq_get_irq_data(irq); - /* only 0->15 have init''d desc; handle irq > 16 */ - if (!data) - break; - if (data->chip == &no_irq_chip) - break; - if (data->chip != &xen_dynamic_chip) - continue; - if (irq_info[irq].type == IRQT_UNBOUND) - return irq; - } - - if (irq == nr_irqs) - panic("No available IRQ to bind to: increase nr_irqs!\n"); - - res = irq_alloc_desc_at(irq, 0); - - if (WARN_ON(res != irq)) - return -1; - - return irq; + return irq_alloc_descs(-1, 0, 1, 0); } int bind_evtchn_to_irq(unsigned int evtchn)> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Reviewed-by: Ingo Molnar <mingo@elte.hu> > Cc: Jeremy Fitzhardinge <jeremy@xensource.com> > --- > drivers/xen/events.c | 23 +++++++++++------------ > 1 files changed, 11 insertions(+), 12 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 13365ba..7d24b0d 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -338,30 +338,29 @@ static void unmask_evtchn(int port) > > static int find_unbound_irq(void) > { > - int irq; > - struct irq_desc *desc; > + struct irq_data *data; > + int irq, res; > > for (irq = 0; irq < nr_irqs; irq++) { > - desc = irq_to_desc(irq); > + data = irq_get_irq_data(irq); > /* only 0->15 have init''d desc; handle irq > 16 */ > - if (desc == NULL) > + if (!data) > break; > - if (desc->chip == &no_irq_chip) > + if (data->chip == &no_irq_chip) > break; > - if (desc->chip != &xen_dynamic_chip) > + if (data->chip != &xen_dynamic_chip) > continue; > if (irq_info[irq].type == IRQT_UNBOUND) > - break; > + return irq; > } > > if (irq == nr_irqs) > panic("No available IRQ to bind to: increase nr_irqs!\n"); > > - desc = irq_to_desc_alloc_node(irq, 0); > - if (WARN_ON(desc == NULL)) > - return -1; > + res = irq_alloc_desc_at(irq, 0); > > - dynamic_irq_init_keep_chip_data(irq); > + if (WARN_ON(res != irq)) > + return -1; > > return irq; > } > @@ -495,7 +494,7 @@ static void unbind_from_irq(unsigned int irq) > if (irq_info[irq].type != IRQT_UNBOUND) { > irq_info[irq] = mk_unbound_info(); > > - dynamic_irq_cleanup(irq); > + irq_free_desc(irq); > } > > spin_unlock(&irq_mapping_update_lock);_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-16 00:17 UTC
Re: [Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling
On 10/15/2010 05:15 PM, Jeremy Fitzhardinge wrote:> On 10/12/2010 01:29 PM, tip-bot for Thomas Gleixner wrote: >> Commit-ID: 77dff1c755c3218691e95e7e38ee14323b35dbdb >> Gitweb: http://git.kernel.org/tip/77dff1c755c3218691e95e7e38ee14323b35dbdb >> Author: Thomas Gleixner <tglx@linutronix.de> >> AuthorDate: Wed, 29 Sep 2010 17:37:10 +0200 >> Committer: Thomas Gleixner <tglx@linutronix.de> >> CommitDate: Tue, 12 Oct 2010 16:53:44 +0200 >> >> x86: xen: Sanitise sparse_irq handling >> >> There seems to be more cleanups possible, but that''s left to the xen >> experts :) > This causes the kernel to fail to boot under Xen. The WARN_ON(res !> irq) triggers and nobody is very happy about the results.Of course the really interesting question is whether this sparse irq rework allows us to hang our extra per-irq information of the irq_data structure now, rather than having to maintain all these auxiliary arrays? Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
H. Peter Anvin
2010-Oct-16 02:01 UTC
Re: [Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling
It should. "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:> On 10/15/2010 05:15 PM, Jeremy Fitzhardinge wrote: >> On 10/12/2010 01:29 PM, tip-bot for Thomas Gleixner wrote: >>> Commit-ID: 77dff1c755c3218691e95e7e38ee14323b35dbdb >>> Gitweb: http://git.kernel.org/tip/77dff1c755c3218691e95e7e38ee14323b35dbdb >>> Author: Thomas Gleixner <tglx@linutronix.de> >>> AuthorDate: Wed, 29 Sep 2010 17:37:10 +0200 >>> Committer: Thomas Gleixner <tglx@linutronix.de> >>> CommitDate: Tue, 12 Oct 2010 16:53:44 +0200 >>> >>> x86: xen: Sanitise sparse_irq handling >>> >>> There seems to be more cleanups possible, but that''s left to the xen >>> experts :) >> This causes the kernel to fail to boot under Xen. The WARN_ON(res !>> irq) triggers and nobody is very happy about the results. > >Of course the really interesting question is whether this sparse irq >rework allows us to hang our extra per-irq information of the irq_data >structure now, rather than having to maintain all these auxiliary arrays? > >Thanks, > J-- Sent from my mobile phone. Please pardon any lack of formatting. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:22 UTC
[PATCH 00/05] xen: events: cleanups after irq core improvements (Was: Re: [Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling)
On Sat, 2010-10-16 at 01:17 +0100, Jeremy Fitzhardinge wrote:> Of course the really interesting question is whether this sparse irq > rework allows us to hang our extra per-irq information of the irq_data > structure now, rather than having to maintain all these auxiliary > arrays?On Sat, 2010-10-16 at 03:01 +0100, H. Peter Anvin wrote:> It should.I''m about to followup with an initial stab at some cleanups which are made possible by these core changes, including hanging the per-irq info off the handler_data. These patches are on top of recent Linus master tree plus Konrad''s swiotlb-xen tree and Stefano''s PVHVM tree since the latter in particular touches the same area. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:23 UTC
[Xen-devel] [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
Encapsulate allocate and free in xen_irq_alloc and xen_irq_free. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/events.c | 68 ++++++++++++++++++++----------------------------- 1 files changed, 28 insertions(+), 40 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 97612f5..c8f3e43 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -394,41 +394,29 @@ static int find_unbound_pirq(void) return -1; } -static int find_unbound_irq(void) +static int xen_irq_alloc(void) { - struct irq_data *data; - int irq, res; - int start = get_nr_hw_irqs(); + int irq = irq_alloc_desc(0); - if (start == nr_irqs) - goto no_irqs; - - /* nr_irqs is a magic value. Must not use it.*/ - for (irq = nr_irqs-1; irq > start; irq--) { - data = irq_get_irq_data(irq); - /* only 0->15 have init''d desc; handle irq > 16 */ - if (!data) - break; - if (data->chip == &no_irq_chip) - break; - if (data->chip != &xen_dynamic_chip) - continue; - if (irq_info[irq].type == IRQT_UNBOUND) - return irq; - } - - if (irq == start) - goto no_irqs; + if (irq < 0) + panic("No available IRQ to bind to: increase nr_irqs!\n"); - res = irq_alloc_desc_at(irq, 0); + return irq; +} - if (WARN_ON(res != irq)) - return -1; +static void xen_irq_alloc_specific(unsigned irq) +{ + int res = irq_alloc_desc_at(irq, 0); - return irq; + if (res < 0) + panic("No available IRQ: increase nr_irqs!\n"); + if (res != irq) + panic("Unable to allocate to IRQ%d\n", irq); +} -no_irqs: - panic("No available IRQ to bind to: increase nr_irqs!\n"); +static void xen_irq_free(unsigned irq) +{ + irq_free_desc(irq); } static bool identity_mapped_irq(unsigned irq) @@ -627,9 +615,9 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name) if (identity_mapped_irq(gsi) || (!xen_initial_domain() && xen_pv_domain())) { irq = gsi; - irq_alloc_desc_at(irq, 0); + xen_irq_alloc_specific(irq); } else - irq = find_unbound_irq(); + irq = xen_irq_alloc(); set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_level_irq, name); @@ -642,7 +630,7 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name) * this in the priv domain. */ if (xen_initial_domain() && HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) { - irq_free_desc(irq); + xen_irq_free(irq); irq = -ENOSPC; goto out; } @@ -665,7 +653,7 @@ void xen_allocate_pirq_msi(char *name, int *irq, int *pirq) { spin_lock(&irq_mapping_update_lock); - *irq = find_unbound_irq(); + *irq = xen_irq_alloc(); if (*irq == -1) goto out; @@ -712,7 +700,7 @@ int xen_create_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int type) spin_lock(&irq_mapping_update_lock); - irq = find_unbound_irq(); + irq = xen_irq_alloc(); if (irq == -1) goto out; @@ -721,7 +709,7 @@ int xen_create_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int type) if (rc) { printk(KERN_WARNING "xen map irq failed %d\n", rc); - irq_free_desc(irq); + xen_irq_free(irq); irq = -1; goto out; @@ -762,7 +750,7 @@ int xen_destroy_irq(int irq) } irq_info[irq] = mk_unbound_info(); - irq_free_desc(irq); + xen_irq_free(irq); out: spin_unlock(&irq_mapping_update_lock); @@ -788,7 +776,7 @@ int bind_evtchn_to_irq(unsigned int evtchn) irq = evtchn_to_irq[evtchn]; if (irq == -1) { - irq = find_unbound_irq(); + irq = xen_irq_alloc(); set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, handle_fasteoi_irq, "event"); @@ -813,7 +801,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) irq = per_cpu(ipi_to_irq, cpu)[ipi]; if (irq == -1) { - irq = find_unbound_irq(); + irq = xen_irq_alloc(); if (irq < 0) goto out; @@ -849,7 +837,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) irq = per_cpu(virq_to_irq, cpu)[virq]; if (irq == -1) { - irq = find_unbound_irq(); + irq = xen_irq_alloc(); set_irq_chip_and_handler_name(irq, &xen_percpu_chip, handle_percpu_irq, "virq"); @@ -908,7 +896,7 @@ static void unbind_from_irq(unsigned int irq) if (irq_info[irq].type != IRQT_UNBOUND) { irq_info[irq] = mk_unbound_info(); - irq_free_desc(irq); + xen_irq_free(irq); } spin_unlock(&irq_mapping_update_lock); -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:23 UTC
[Xen-devel] [PATCH 2/5] xen: events: turn irq_info constructors into initialiser functions
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/events.c | 102 ++++++++++++++++++++++++++++++++------------------ 1 files changed, 65 insertions(+), 37 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index c8f3e43..33fae3d 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -129,46 +129,76 @@ static struct irq_chip xen_dynamic_chip; static struct irq_chip xen_percpu_chip; static struct irq_chip xen_pirq_chip; -/* Constructor for packed IRQ information. */ -static struct irq_info mk_unbound_info(void) +/* Get info for IRQ */ +static struct irq_info *info_for_irq(unsigned irq) { - return (struct irq_info) { .type = IRQT_UNBOUND }; + return &irq_info[irq]; } -static struct irq_info mk_evtchn_info(unsigned short evtchn) +/* Constructors for packed IRQ information. */ +static void xen_irq_info_common_init(struct irq_info *info, + enum xen_irq_type type, + unsigned short evtchn, + unsigned short cpu) { - return (struct irq_info) { .type = IRQT_EVTCHN, .evtchn = evtchn, - .cpu = 0 }; + + BUG_ON(info->type != IRQT_UNBOUND && info->type != type); + + info->type = type; + info->evtchn = evtchn; + info->cpu = cpu; } -static struct irq_info mk_ipi_info(unsigned short evtchn, enum ipi_vector ipi) +static void xen_irq_info_evtchn_init(unsigned irq, + unsigned short evtchn) { - return (struct irq_info) { .type = IRQT_IPI, .evtchn = evtchn, - .cpu = 0, .u.ipi = ipi }; + struct irq_info *info = info_for_irq(irq); + + xen_irq_info_common_init(info, IRQT_EVTCHN, evtchn, 0); } -static struct irq_info mk_virq_info(unsigned short evtchn, unsigned short virq) +static void xen_irq_info_ipi_init(unsigned irq, + unsigned short evtchn, + enum ipi_vector ipi) { - return (struct irq_info) { .type = IRQT_VIRQ, .evtchn = evtchn, - .cpu = 0, .u.virq = virq }; + struct irq_info *info = info_for_irq(irq); + + xen_irq_info_common_init(info, IRQT_IPI, evtchn, 0); + + info->u.ipi = ipi; } -static struct irq_info mk_pirq_info(unsigned short evtchn, unsigned short pirq, - unsigned short gsi, unsigned short vector) +static void xen_irq_info_virq_init(unsigned irq, + unsigned short evtchn, + unsigned short virq) { - return (struct irq_info) { .type = IRQT_PIRQ, .evtchn = evtchn, - .cpu = 0, - .u.pirq = { .pirq = pirq, .gsi = gsi, .vector = vector } }; + struct irq_info *info = info_for_irq(irq); + + xen_irq_info_common_init(info, IRQT_VIRQ, evtchn, 0); + + info->u.virq = virq; } -/* - * Accessors for packed IRQ information. - */ -static struct irq_info *info_for_irq(unsigned irq) +static void xen_irq_info_pirq_init(unsigned irq, + unsigned short evtchn, + unsigned short pirq, + unsigned short gsi, + unsigned short vector, + unsigned char flags) { - return &irq_info[irq]; + struct irq_info *info = info_for_irq(irq); + + xen_irq_info_common_init(info, IRQT_PIRQ, evtchn, 0); + + info->u.pirq.pirq = pirq; + info->u.pirq.gsi = gsi; + info->u.pirq.vector = vector; + info->u.pirq.flags = flags; } +/* + * Accessors for packed IRQ information. + */ static unsigned int evtchn_from_irq(unsigned irq) { return info_for_irq(irq)->evtchn; @@ -416,6 +446,7 @@ static void xen_irq_alloc_specific(unsigned irq) static void xen_irq_free(unsigned irq) { + irq_info[irq].type = IRQT_UNBOUND; irq_free_desc(irq); } @@ -635,8 +666,8 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name) goto out; } - irq_info[irq] = mk_pirq_info(0, pirq, gsi, irq_op.vector); - irq_info[irq].u.pirq.flags |= shareable ? PIRQ_SHAREABLE : 0; + xen_irq_info_pirq_init(irq, 0, pirq, gsi, irq_op.vector, + shareable ? PIRQ_SHAREABLE : 0); pirq_to_irq[pirq] = irq; out: @@ -664,7 +695,7 @@ void xen_allocate_pirq_msi(char *name, int *irq, int *pirq) set_irq_chip_and_handler_name(*irq, &xen_pirq_chip, handle_level_irq, name); - irq_info[*irq] = mk_pirq_info(0, *pirq, 0, 0); + xen_irq_info_pirq_init(*irq, 0, *pirq, 0, 0, 0); pirq_to_irq[*pirq] = *irq; out: @@ -714,7 +745,7 @@ int xen_create_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc, int type) irq = -1; goto out; } - irq_info[irq] = mk_pirq_info(0, map_irq.pirq, 0, map_irq.index); + xen_irq_info_pirq_init(irq, 0, map_irq.pirq, 0, map_irq.index, 0); set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_level_irq, @@ -748,7 +779,6 @@ int xen_destroy_irq(int irq) goto out; } } - irq_info[irq] = mk_unbound_info(); xen_irq_free(irq); @@ -782,7 +812,7 @@ int bind_evtchn_to_irq(unsigned int evtchn) handle_fasteoi_irq, "event"); evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_evtchn_info(evtchn); + xen_irq_info_evtchn_init(irq, evtchn); } spin_unlock(&irq_mapping_update_lock); @@ -815,7 +845,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) evtchn = bind_ipi.port; evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_ipi_info(evtchn, ipi); + xen_irq_info_ipi_init(irq, evtchn, ipi); per_cpu(ipi_to_irq, cpu)[ipi] = irq; bind_evtchn_to_cpu(evtchn, cpu); @@ -850,7 +880,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) evtchn = bind_virq.port; evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_virq_info(evtchn, virq); + xen_irq_info_virq_init(irq, evtchn, virq); per_cpu(virq_to_irq, cpu)[virq] = irq; @@ -893,11 +923,9 @@ static void unbind_from_irq(unsigned int irq) evtchn_to_irq[evtchn] = -1; } - if (irq_info[irq].type != IRQT_UNBOUND) { - irq_info[irq] = mk_unbound_info(); + BUG_ON(irq_info[irq].type == IRQT_UNBOUND); - xen_irq_free(irq); - } + xen_irq_free(irq); spin_unlock(&irq_mapping_update_lock); } @@ -1158,7 +1186,7 @@ void rebind_evtchn_irq(int evtchn, int irq) BUG_ON(info->type == IRQT_UNBOUND); evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_evtchn_info(evtchn); + xen_irq_info_evtchn_init(irq, evtchn); spin_unlock(&irq_mapping_update_lock); @@ -1285,7 +1313,7 @@ static void restore_cpu_virqs(unsigned int cpu) /* Record the new mapping. */ evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_virq_info(evtchn, virq); + xen_irq_info_virq_init(irq, evtchn, virq); bind_evtchn_to_cpu(evtchn, cpu); /* Ready for use. */ @@ -1313,7 +1341,7 @@ static void restore_cpu_ipis(unsigned int cpu) /* Record the new mapping. */ evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_ipi_info(evtchn, ipi); + xen_irq_info_ipi_init(irq, evtchn, ipi); bind_evtchn_to_cpu(evtchn, cpu); /* Ready for use. */ -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:23 UTC
[Xen-devel] [PATCH 3/5] xen: events: push setup of irq<->{evtchn, pirq} maps into irq_info init functions
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/events.c | 23 +++++++++-------------- 1 files changed, 9 insertions(+), 14 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 33fae3d..94055ea 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -137,6 +137,7 @@ static struct irq_info *info_for_irq(unsigned irq) /* Constructors for packed IRQ information. */ static void xen_irq_info_common_init(struct irq_info *info, + unsigned irq, enum xen_irq_type type, unsigned short evtchn, unsigned short cpu) @@ -147,6 +148,8 @@ static void xen_irq_info_common_init(struct irq_info *info, info->type = type; info->evtchn = evtchn; info->cpu = cpu; + + evtchn_to_irq[evtchn] = irq; } static void xen_irq_info_evtchn_init(unsigned irq, @@ -154,7 +157,7 @@ static void xen_irq_info_evtchn_init(unsigned irq, { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, IRQT_EVTCHN, evtchn, 0); + xen_irq_info_common_init(info, irq, IRQT_EVTCHN, evtchn, 0); } static void xen_irq_info_ipi_init(unsigned irq, @@ -163,18 +166,17 @@ static void xen_irq_info_ipi_init(unsigned irq, { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, IRQT_IPI, evtchn, 0); + xen_irq_info_common_init(info, irq, IRQT_IPI, evtchn, 0); info->u.ipi = ipi; } - static void xen_irq_info_virq_init(unsigned irq, unsigned short evtchn, unsigned short virq) { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, IRQT_VIRQ, evtchn, 0); + xen_irq_info_common_init(info, irq, IRQT_VIRQ, evtchn, 0); info->u.virq = virq; } @@ -188,12 +190,14 @@ static void xen_irq_info_pirq_init(unsigned irq, { struct irq_info *info = info_for_irq(irq); - xen_irq_info_common_init(info, IRQT_PIRQ, evtchn, 0); + xen_irq_info_common_init(info, irq, IRQT_PIRQ, evtchn, 0); info->u.pirq.pirq = pirq; info->u.pirq.gsi = gsi; info->u.pirq.vector = vector; info->u.pirq.flags = flags; + + pirq_to_irq[pirq] = irq; } /* @@ -668,11 +672,9 @@ int xen_map_pirq_gsi(unsigned pirq, unsigned gsi, int shareable, char *name) xen_irq_info_pirq_init(irq, 0, pirq, gsi, irq_op.vector, shareable ? PIRQ_SHAREABLE : 0); - pirq_to_irq[pirq] = irq; out: spin_unlock(&irq_mapping_update_lock); - return irq; } @@ -696,7 +698,6 @@ void xen_allocate_pirq_msi(char *name, int *irq, int *pirq) handle_level_irq, name); xen_irq_info_pirq_init(*irq, 0, *pirq, 0, 0, 0); - pirq_to_irq[*pirq] = *irq; out: spin_unlock(&irq_mapping_update_lock); @@ -811,7 +812,6 @@ int bind_evtchn_to_irq(unsigned int evtchn) set_irq_chip_and_handler_name(irq, &xen_dynamic_chip, handle_fasteoi_irq, "event"); - evtchn_to_irq[evtchn] = irq; xen_irq_info_evtchn_init(irq, evtchn); } @@ -844,7 +844,6 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) BUG(); evtchn = bind_ipi.port; - evtchn_to_irq[evtchn] = irq; xen_irq_info_ipi_init(irq, evtchn, ipi); per_cpu(ipi_to_irq, cpu)[ipi] = irq; @@ -879,7 +878,6 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu) BUG(); evtchn = bind_virq.port; - evtchn_to_irq[evtchn] = irq; xen_irq_info_virq_init(irq, evtchn, virq); per_cpu(virq_to_irq, cpu)[virq] = irq; @@ -1185,7 +1183,6 @@ void rebind_evtchn_irq(int evtchn, int irq) so there should be a proper type */ BUG_ON(info->type == IRQT_UNBOUND); - evtchn_to_irq[evtchn] = irq; xen_irq_info_evtchn_init(irq, evtchn); spin_unlock(&irq_mapping_update_lock); @@ -1312,7 +1309,6 @@ static void restore_cpu_virqs(unsigned int cpu) evtchn = bind_virq.port; /* Record the new mapping. */ - evtchn_to_irq[evtchn] = irq; xen_irq_info_virq_init(irq, evtchn, virq); bind_evtchn_to_cpu(evtchn, cpu); @@ -1340,7 +1336,6 @@ static void restore_cpu_ipis(unsigned int cpu) evtchn = bind_ipi.port; /* Record the new mapping. */ - evtchn_to_irq[evtchn] = irq; xen_irq_info_ipi_init(irq, evtchn, ipi); bind_evtchn_to_cpu(evtchn, cpu); -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:23 UTC
[Xen-devel] [PATCH 4/5] xen: events: dynamically allocate irq info structures
Removes nr_irq sized array allocation at start of day. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/events.c | 50 +++++++++++++++++++++++++++++++++++++++++--------- 1 files changed, 41 insertions(+), 9 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 94055ea..9b58505 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -56,6 +56,8 @@ */ static DEFINE_SPINLOCK(irq_mapping_update_lock); +static LIST_HEAD(xen_irq_list_head); + /* IRQ <-> VIRQ mapping. */ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; @@ -85,6 +87,7 @@ enum xen_irq_type { */ struct irq_info { + struct list_head list; enum xen_irq_type type; /* type */ unsigned short evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ @@ -103,7 +106,6 @@ struct irq_info #define PIRQ_NEEDS_EOI (1 << 0) #define PIRQ_SHAREABLE (1 << 1) -static struct irq_info *irq_info; static int *pirq_to_irq; static int nr_pirqs; @@ -132,7 +134,7 @@ static struct irq_chip xen_pirq_chip; /* Get info for IRQ */ static struct irq_info *info_for_irq(unsigned irq) { - return &irq_info[irq]; + return get_irq_data(irq); } /* Constructors for packed IRQ information. */ @@ -315,7 +317,7 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); __set_bit(chn, cpu_evtchn_mask(cpu)); - irq_info[irq].cpu = cpu; + info_for_irq(irq)->cpu = cpu; } static void init_evtchn_cpu_bindings(void) @@ -428,6 +430,21 @@ static int find_unbound_pirq(void) return -1; } +static void xen_irq_init(unsigned irq) +{ + struct irq_info *info; + + info = kmalloc(sizeof(*info), GFP_KERNEL); + if (info == NULL) + panic("Unable to allocate metadata for IRQ%d\n", irq); + + info->type = IRQT_UNBOUND; + + set_irq_data(irq, info); + + list_add_tail(&info->list, &xen_irq_list_head); +} + static int xen_irq_alloc(void) { int irq = irq_alloc_desc(0); @@ -435,6 +452,8 @@ static int xen_irq_alloc(void) if (irq < 0) panic("No available IRQ to bind to: increase nr_irqs!\n"); + xen_irq_init(irq); + return irq; } @@ -446,11 +465,20 @@ static void xen_irq_alloc_specific(unsigned irq) panic("No available IRQ: increase nr_irqs!\n"); if (res != irq) panic("Unable to allocate to IRQ%d\n", irq); + + xen_irq_init(irq); } static void xen_irq_free(unsigned irq) { - irq_info[irq].type = IRQT_UNBOUND; + struct irq_info *info = get_irq_data(irq); + + list_del(&info->list); + + set_irq_data(irq, NULL); + + kfree(info); + irq_free_desc(irq); } @@ -921,7 +949,7 @@ static void unbind_from_irq(unsigned int irq) evtchn_to_irq[evtchn] = -1; } - BUG_ON(irq_info[irq].type == IRQT_UNBOUND); + BUG_ON(info_for_irq(irq)->type == IRQT_UNBOUND); xen_irq_free(irq); @@ -1400,7 +1428,10 @@ void xen_poll_irq(int irq) void xen_irq_resume(void) { - unsigned int cpu, irq, evtchn; + unsigned int cpu, evtchn; + struct irq_info *info; + + spin_lock(&irq_mapping_update_lock); init_evtchn_cpu_bindings(); @@ -1409,8 +1440,8 @@ void xen_irq_resume(void) mask_evtchn(evtchn); /* No IRQ <-> event-channel mappings. */ - for (irq = 0; irq < nr_irqs; irq++) - irq_info[irq].evtchn = 0; /* zap event-channel binding */ + list_for_each_entry(info, &xen_irq_list_head, list) + info->evtchn = 0; /* zap event-channel binding */ for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) evtchn_to_irq[evtchn] = -1; @@ -1419,6 +1450,8 @@ void xen_irq_resume(void) restore_cpu_virqs(cpu); restore_cpu_ipis(cpu); } + + spin_unlock(&irq_mapping_update_lock); } static struct irq_chip xen_dynamic_chip __read_mostly = { @@ -1508,7 +1541,6 @@ void __init xen_init_IRQ(void) cpu_evtchn_mask_p = kcalloc(nr_cpu_ids, sizeof(struct cpu_evtchn_s), GFP_KERNEL); - irq_info = kcalloc(nr_irqs, sizeof(*irq_info), GFP_KERNEL); rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_nr_pirqs, &op_nr_pirqs); if (rc < 0) { -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 16:23 UTC
[Xen-devel] [PATCH 5/5] xen: events: use per-cpu variable for cpu_evtchn_mask
I can''t see any reason why it isn''t already. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/events.c | 31 +++++++++++-------------------- 1 files changed, 11 insertions(+), 20 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 9b58505..144ff72 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -110,19 +110,9 @@ static int *pirq_to_irq; static int nr_pirqs; static int *evtchn_to_irq; -struct cpu_evtchn_s { - unsigned long bits[NR_EVENT_CHANNELS/BITS_PER_LONG]; -}; -static __initdata struct cpu_evtchn_s init_evtchn_mask = { - .bits[0 ... (NR_EVENT_CHANNELS/BITS_PER_LONG)-1] = ~0ul, -}; -static struct cpu_evtchn_s *cpu_evtchn_mask_p = &init_evtchn_mask; - -static inline unsigned long *cpu_evtchn_mask(int cpu) -{ - return cpu_evtchn_mask_p[cpu].bits; -} +static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS/BITS_PER_LONG], + cpu_evtchn_mask); /* Xen will never allocate port zero for any purpose. */ #define VALID_EVTCHN(chn) ((chn) != 0) @@ -301,7 +291,7 @@ static inline unsigned long active_evtchns(unsigned int cpu, unsigned int idx) { return (sh->evtchn_pending[idx] & - cpu_evtchn_mask(cpu)[idx] & + per_cpu(cpu_evtchn_mask, cpu)[idx] & ~sh->evtchn_mask[idx]); } @@ -314,8 +304,8 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) cpumask_copy(irq_to_desc(irq)->affinity, cpumask_of(cpu)); #endif - __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); - __set_bit(chn, cpu_evtchn_mask(cpu)); + __clear_bit(chn, per_cpu(cpu_evtchn_mask, cpu_from_irq(irq))); + __set_bit(chn, per_cpu(cpu_evtchn_mask, cpu)); info_for_irq(irq)->cpu = cpu; } @@ -332,7 +322,11 @@ static void init_evtchn_cpu_bindings(void) } #endif - memset(cpu_evtchn_mask(0), ~0, sizeof(struct cpu_evtchn_s)); + printk(KERN_CRIT "%s: CPU0 at %p size %zd\n", __func__, + per_cpu(cpu_evtchn_mask, 0), + sizeof(per_cpu(cpu_evtchn_mask, 0))); + memset(per_cpu(cpu_evtchn_mask, 0), ~0, + sizeof(per_cpu(cpu_evtchn_mask, 0))); } static inline void clear_evtchn(int port) @@ -1034,7 +1028,7 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id) { struct shared_info *sh = HYPERVISOR_shared_info; int cpu = smp_processor_id(); - unsigned long *cpu_evtchn = cpu_evtchn_mask(cpu); + unsigned long *cpu_evtchn = per_cpu(cpu_evtchn_mask, cpu); int i; unsigned long flags; static DEFINE_SPINLOCK(debug_lock); @@ -1539,9 +1533,6 @@ void __init xen_init_IRQ(void) int i, rc; struct physdev_nr_pirqs op_nr_pirqs; - cpu_evtchn_mask_p = kcalloc(nr_cpu_ids, sizeof(struct cpu_evtchn_s), - GFP_KERNEL); - rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_nr_pirqs, &op_nr_pirqs); if (rc < 0) { nr_pirqs = nr_irqs; -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-25 17:35 UTC
[Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Mon, Oct 25, 2010 at 05:23:29PM +0100, Ian Campbell wrote:> Encapsulate allocate and free in xen_irq_alloc and xen_irq_free. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > drivers/xen/events.c | 68 ++++++++++++++++++++----------------------------- > 1 files changed, 28 insertions(+), 40 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 97612f5..c8f3e43 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -394,41 +394,29 @@ static int find_unbound_pirq(void) > return -1; > } > > -static int find_unbound_irq(void) > +static int xen_irq_alloc(void) > { > - struct irq_data *data; > - int irq, res; > - int start = get_nr_hw_irqs(); > + int irq = irq_alloc_desc(0); > > - if (start == nr_irqs) > - goto no_irqs; > - > - /* nr_irqs is a magic value. Must not use it.*/ > - for (irq = nr_irqs-1; irq > start; irq--) { > - data = irq_get_irq_data(irq); > - /* only 0->15 have init''d desc; handle irq > 16 */ > - if (!data) > - break; > - if (data->chip == &no_irq_chip) > - break; > - if (data->chip != &xen_dynamic_chip) > - continue; > - if (irq_info[irq].type == IRQT_UNBOUND) > - return irq; > - } > - > - if (irq == start) > - goto no_irqs; > + if (irq < 0) > + panic("No available IRQ to bind to: increase nr_irqs!\n"); > > - res = irq_alloc_desc_at(irq, 0); > + return irq; > +}So I am curious what the /proc/interrupts looks?The issue (and the reason for this implementation above) was that under PV with PCI devices we would overlap PCI devices IRQs with Xen event channels. So we could have a USB device at IRQ 16 _and_ also a xen_spinlock4 handler. That would throw off the system since the xen_spinlock4 was an edge type handler while the USB device was an level (at least on my box). But with this shinny sparse_irq rework, maybe this is not an issue anymore? Can we mix a level and edge chip handler under one IRQ? What do you see when you pass in a PCI device and say give the guest 32 CPUs?? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-25 18:02 UTC
[Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Mon, 2010-10-25 at 18:35 +0100, Konrad Rzeszutek Wilk wrote:> So I am curious what the /proc/interrupts looks?The issue (and the reason > for this implementation above) was that under PV with PCI devices we would > overlap PCI devices IRQs with Xen event channels. So we could have a USB device > at IRQ 16 _and_ also a xen_spinlock4 handler. That would throw off the system > since the xen_spinlock4 was an edge type handler while the USB device was an > level (at least on my box).I suspect what we should really be doing is to segregate the different classes of event channel in IRQ space. I _think_ this new stuff is happy with a discontinuous (but presumably clustered) IRQ space, I should probably check. e.g. regular interdomain event channels, VIRQs and the like should probably request allocations from some range higher than nr_hw_irqs, thus avoiding conflicts with hardware PIRQ event channels which would ask for a 1-1 mapping with the GSI (i.e. same interrupt numbers as the device would get under native, AIUI). We might even decide to start the interdomain event channel range even higher than nr_hw_irqs in order to leave room for the more dynamic h/w PIRQs (e.g. MSIs) just after nr_hw_irqs. Assuming this is consistent with what would happen on native then it is probably worthwhile.> But with this shinny sparse_irq rework, maybe this is not an issue anymore? > Can we mix a level and edge chip handler under one IRQ?I doubt the sparse irq rework had any impact on this aspect, but it does help us more easily arrange for them not to be shared in that way in the first place.> What do you see when you pass in a PCI device and say give the guest 32 CPUs??I can try tomorrow and see, based on what you say above without implementing what I described I suspect the answer will be "carnage". Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-25 23:03 UTC
[Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/25/2010 10:35 AM, Konrad Rzeszutek Wilk wrote:> On Mon, Oct 25, 2010 at 05:23:29PM +0100, Ian Campbell wrote: >> Encapsulate allocate and free in xen_irq_alloc and xen_irq_free. >> >> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> >> --- >> drivers/xen/events.c | 68 ++++++++++++++++++++----------------------------- >> 1 files changed, 28 insertions(+), 40 deletions(-) >> >> diff --git a/drivers/xen/events.c b/drivers/xen/events.c >> index 97612f5..c8f3e43 100644 >> --- a/drivers/xen/events.c >> +++ b/drivers/xen/events.c >> @@ -394,41 +394,29 @@ static int find_unbound_pirq(void) >> return -1; >> } >> >> -static int find_unbound_irq(void) >> +static int xen_irq_alloc(void) >> { >> - struct irq_data *data; >> - int irq, res; >> - int start = get_nr_hw_irqs(); >> + int irq = irq_alloc_desc(0); >> >> - if (start == nr_irqs) >> - goto no_irqs; >> - >> - /* nr_irqs is a magic value. Must not use it.*/ >> - for (irq = nr_irqs-1; irq > start; irq--) { >> - data = irq_get_irq_data(irq); >> - /* only 0->15 have init''d desc; handle irq > 16 */ >> - if (!data) >> - break; >> - if (data->chip == &no_irq_chip) >> - break; >> - if (data->chip != &xen_dynamic_chip) >> - continue; >> - if (irq_info[irq].type == IRQT_UNBOUND) >> - return irq; >> - } >> - >> - if (irq == start) >> - goto no_irqs; >> + if (irq < 0) >> + panic("No available IRQ to bind to: increase nr_irqs!\n"); >> >> - res = irq_alloc_desc_at(irq, 0); >> + return irq; >> +} > So I am curious what the /proc/interrupts looks?The issue (and the reason > for this implementation above) was that under PV with PCI devices we would > overlap PCI devices IRQs with Xen event channels. So we could have a USB device > at IRQ 16 _and_ also a xen_spinlock4 handler. That would throw off the system > since the xen_spinlock4 was an edge type handler while the USB device was an > level (at least on my box).What? Why? How? Surely if we''re asking the irq subsystem to allocate us an irq, it will return a fresh never-before-used (and certainly not shared) irq? Shared irqs only make sense if multiple devices are actually sharing, say, a wire on the board. Or am I missing something? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-25 23:03 UTC
Re: [PATCH 00/05] xen: events: cleanups after irq core improvements (Was: Re: [Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling)
On 10/25/2010 09:22 AM, Ian Campbell wrote:> I''m about to followup with an initial stab at some cleanups which are > made possible by these core changes, including hanging the per-irq info > off the handler_data. > > These patches are on top of recent Linus master tree plus Konrad''s > swiotlb-xen tree and Stefano''s PVHVM tree since the latter in particular > touches the same area.This looks pretty good. How much testing have you given it? Do you have a git branch I can pull? Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
H. Peter Anvin
2010-Oct-25 23:05 UTC
[Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/25/2010 04:03 PM, Jeremy Fitzhardinge wrote:> > What? Why? How? Surely if we''re asking the irq subsystem to allocate > us an irq, it will return a fresh never-before-used (and certainly not > shared) irq? Shared irqs only make sense if multiple devices are > actually sharing, say, a wire on the board. > > Or am I missing something? >I think the number is not necessarily "never before used", but rather "not currently used". -hpa _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-25 23:21 UTC
[Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/25/2010 04:05 PM, H. Peter Anvin wrote:> On 10/25/2010 04:03 PM, Jeremy Fitzhardinge wrote: >> What? Why? How? Surely if we''re asking the irq subsystem to allocate >> us an irq, it will return a fresh never-before-used (and certainly not >> shared) irq? Shared irqs only make sense if multiple devices are >> actually sharing, say, a wire on the board. >> >> Or am I missing something? >> > I think the number is not necessarily "never before used", but rather > "not currently used".Yeah, that''s what I meant. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-26 07:25 UTC
Re: [PATCH 00/05] xen: events: cleanups after irq core improvements (Was: Re: [Xen-devel] Re: [tip:irq/core] x86: xen: Sanitise sparse_irq handling)
On Tue, 2010-10-26 at 00:03 +0100, Jeremy Fitzhardinge wrote:> On 10/25/2010 09:22 AM, Ian Campbell wrote: > > I''m about to followup with an initial stab at some cleanups which are > > made possible by these core changes, including hanging the per-irq info > > off the handler_data. > > > > These patches are on top of recent Linus master tree plus Konrad''s > > swiotlb-xen tree and Stefano''s PVHVM tree since the latter in particular > > touches the same area. > > This looks pretty good. How much testing have you given it?Not as much as I would like. I''ve tested normal PV domU operation and live migration and that''s about it. I''ve not tested anything involving hardware interrupts (so no passthrough, dom0, msi etc).> Do you have a git branch I can pull?The patches are currently based on trees in linux-next since they depend on your, Konrad''s and Stefano''s trees (especially Stefano''s, I think) so I don''t yet have a stable base for a branch. Once those flow through I can produce a proper branch. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-26 08:15 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Mon, 2010-10-25 at 19:02 +0100, Ian Campbell wrote:> > > > What do you see when you pass in a PCI device and say give the guest > 32 CPUs?? > > I can try tomorrow and see, based on what you say above without > implementing what I described I suspect the answer will be "carnage".Actually, it looks like multi-vcpu is broken, I only see 1 regardless of how many I configured. It''s not clear if this is breakage in Linus'' tree, something I pulled in from one of Jeremy''s, yours or Stefano''s trees or some local pebcak. I''ll investigate... Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-26 14:17 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Mon, Oct 25, 2010 at 04:03:19PM -0700, Jeremy Fitzhardinge wrote:> On 10/25/2010 10:35 AM, Konrad Rzeszutek Wilk wrote: > > On Mon, Oct 25, 2010 at 05:23:29PM +0100, Ian Campbell wrote: > >> Encapsulate allocate and free in xen_irq_alloc and xen_irq_free. > >> > >> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > >> --- > >> drivers/xen/events.c | 68 ++++++++++++++++++++----------------------------- > >> 1 files changed, 28 insertions(+), 40 deletions(-) > >> > >> diff --git a/drivers/xen/events.c b/drivers/xen/events.c > >> index 97612f5..c8f3e43 100644 > >> --- a/drivers/xen/events.c > >> +++ b/drivers/xen/events.c > >> @@ -394,41 +394,29 @@ static int find_unbound_pirq(void) > >> return -1; > >> } > >> > >> -static int find_unbound_irq(void) > >> +static int xen_irq_alloc(void) > >> { > >> - struct irq_data *data; > >> - int irq, res; > >> - int start = get_nr_hw_irqs(); > >> + int irq = irq_alloc_desc(0); > >> > >> - if (start == nr_irqs) > >> - goto no_irqs; > >> - > >> - /* nr_irqs is a magic value. Must not use it.*/ > >> - for (irq = nr_irqs-1; irq > start; irq--) { > >> - data = irq_get_irq_data(irq); > >> - /* only 0->15 have init''d desc; handle irq > 16 */ > >> - if (!data) > >> - break; > >> - if (data->chip == &no_irq_chip) > >> - break; > >> - if (data->chip != &xen_dynamic_chip) > >> - continue; > >> - if (irq_info[irq].type == IRQT_UNBOUND) > >> - return irq; > >> - } > >> - > >> - if (irq == start) > >> - goto no_irqs; > >> + if (irq < 0) > >> + panic("No available IRQ to bind to: increase nr_irqs!\n"); > >> > >> - res = irq_alloc_desc_at(irq, 0); > >> + return irq; > >> +} > > So I am curious what the /proc/interrupts looks?The issue (and the reason > > for this implementation above) was that under PV with PCI devices we would > > overlap PCI devices IRQs with Xen event channels. So we could have a USB device > > at IRQ 16 _and_ also a xen_spinlock4 handler. That would throw off the system > > since the xen_spinlock4 was an edge type handler while the USB device was an > > level (at least on my box). > > What? Why? How? Surely if we''re asking the irq subsystem to allocateImagine a PV guest with PCI passthrough. Normally the first 16 IRQs are reserved for "legacy" devices. And the IRQs after that are up for grabs. Since the Xen event channels are initialized much much earlier than any PCI devices, they end up using the IRQs right after 16 -which is OK if you don''t have any PCI devices. If you have a PCI device that is using IRQ 17 it ends up colliding with an event channel. Now, I have to confess I did not look carefully at the sparse_irq rework so it might be that the IRQ numbur is not as important as it was before 2.6.37.> us an irq, it will return a fresh never-before-used (and certainly not > shared) irq? Shared irqs only make sense if multiple devices are > actually sharing, say, a wire on the board.Right, and in this case we end up trying to use the IRQ for a physical device and find out that the IRQ has/is being aleady used for an event channel.> > Or am I missing something?Event channels are allocated before PCI devices so they get to usurp the IRQ chip for the IRQ that belongs to the PCI device. Keep in mind that this is not possible under Dom0, as we have the IOAPIC information, so we know that IRQ0-48 are reserved for GSI''s for three of the IOAPIC. In PV with PCI passthrough such information is not present and the kernel assumes no IOAPICs, and hence no GSI. a). Maybe one way to do this is set the GSI high watermark to be the same as the host (so move it from the legacy IRQ 16 to 48 for example). This would require fiddling with the shared_info structure.. b) Another approach was to allocate event-channel IRQs and virtual IRQs from the highest available IRQ and continue down . Physical IRQs would be allocated from the legacy IRQ up to whatever is available. c) 2.6.18 kernels made a division right at 255, so anything under 255 was to be used for physical IRQs, while anything above that for event channels and vitual IRQs. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-26 14:30 UTC
[Xen-devel] Re: [PATCH 4/5] xen: events: dynamically allocate irq info structures
On Mon, Oct 25, 2010 at 05:23:32PM +0100, Ian Campbell wrote:> Removes nr_irq sized array allocation at start of day. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > drivers/xen/events.c | 50 +++++++++++++++++++++++++++++++++++++++++--------- > 1 files changed, 41 insertions(+), 9 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 94055ea..9b58505 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -56,6 +56,8 @@ > */ > static DEFINE_SPINLOCK(irq_mapping_update_lock); > > +static LIST_HEAD(xen_irq_list_head); > + > /* IRQ <-> VIRQ mapping. */ > static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; > > @@ -85,6 +87,7 @@ enum xen_irq_type { > */ > struct irq_info > { > + struct list_head list; > enum xen_irq_type type; /* type */ > unsigned short evtchn; /* event channel */ > unsigned short cpu; /* cpu bound */ > @@ -103,7 +106,6 @@ struct irq_info > #define PIRQ_NEEDS_EOI (1 << 0) > #define PIRQ_SHAREABLE (1 << 1) > > -static struct irq_info *irq_info; > static int *pirq_to_irq; > static int nr_pirqs; > > @@ -132,7 +134,7 @@ static struct irq_chip xen_pirq_chip; > /* Get info for IRQ */ > static struct irq_info *info_for_irq(unsigned irq) > { > - return &irq_info[irq]; > + return get_irq_data(irq); > } > > /* Constructors for packed IRQ information. */ > @@ -315,7 +317,7 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); > __set_bit(chn, cpu_evtchn_mask(cpu)); > > - irq_info[irq].cpu = cpu; > + info_for_irq(irq)->cpu = cpu; > } > > static void init_evtchn_cpu_bindings(void) > @@ -428,6 +430,21 @@ static int find_unbound_pirq(void) > return -1; > } > > +static void xen_irq_init(unsigned irq) > +{ > + struct irq_info *info; > + > + info = kmalloc(sizeof(*info), GFP_KERNEL); > + if (info == NULL) > + panic("Unable to allocate metadata for IRQ%d\n", irq);There is a bunch of panic around allocating IRQs. There is one earlier in xen_irq_alloc too, and I was wondering - would it make sense to perhaps print an error to both the hypervisor and the kernel and just return -1 as an IRQ and let the kernel continue with its normal failure path? I am thinking just in terms of making the system still be able to work even if parts of it are busted, instead of just crashing the system. Not sure which philosophy is domiant in the Linux kernel? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-26 14:31 UTC
[Xen-devel] Re: [PATCH 3/5] xen: events: push setup of irq<->{evtchn, pirq} maps into irq_info init functions
On Mon, Oct 25, 2010 at 05:23:31PM +0100, Ian Campbell wrote: The description and title aren''t really that descriptive. Could you redo it a bit, please? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-26 14:36 UTC
[Xen-devel] Re: [PATCH 5/5] xen: events: use per-cpu variable for cpu_evtchn_mask
On Mon, Oct 25, 2010 at 05:23:33PM +0100, Ian Campbell wrote:> I can''t see any reason why it isn''t already. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > drivers/xen/events.c | 31 +++++++++++-------------------- > 1 files changed, 11 insertions(+), 20 deletions(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 9b58505..144ff72 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -110,19 +110,9 @@ static int *pirq_to_irq; > static int nr_pirqs; > > static int *evtchn_to_irq; > -struct cpu_evtchn_s { > - unsigned long bits[NR_EVENT_CHANNELS/BITS_PER_LONG]; > -}; > > -static __initdata struct cpu_evtchn_s init_evtchn_mask = { > - .bits[0 ... (NR_EVENT_CHANNELS/BITS_PER_LONG)-1] = ~0ul, > -}; > -static struct cpu_evtchn_s *cpu_evtchn_mask_p = &init_evtchn_mask; > - > -static inline unsigned long *cpu_evtchn_mask(int cpu) > -{ > - return cpu_evtchn_mask_p[cpu].bits; > -} > +static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS/BITS_PER_LONG], > + cpu_evtchn_mask); > > /* Xen will never allocate port zero for any purpose. */ > #define VALID_EVTCHN(chn) ((chn) != 0) > @@ -301,7 +291,7 @@ static inline unsigned long active_evtchns(unsigned int cpu, > unsigned int idx) > { > return (sh->evtchn_pending[idx] & > - cpu_evtchn_mask(cpu)[idx] & > + per_cpu(cpu_evtchn_mask, cpu)[idx] & > ~sh->evtchn_mask[idx]); > } > > @@ -314,8 +304,8 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > cpumask_copy(irq_to_desc(irq)->affinity, cpumask_of(cpu)); > #endif > > - __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); > - __set_bit(chn, cpu_evtchn_mask(cpu)); > + __clear_bit(chn, per_cpu(cpu_evtchn_mask, cpu_from_irq(irq))); > + __set_bit(chn, per_cpu(cpu_evtchn_mask, cpu)); > > info_for_irq(irq)->cpu = cpu; > } > @@ -332,7 +322,11 @@ static void init_evtchn_cpu_bindings(void) > } > #endif > > - memset(cpu_evtchn_mask(0), ~0, sizeof(struct cpu_evtchn_s)); > + printk(KERN_CRIT "%s: CPU0 at %p size %zd\n", __func__,If this is per_cpu, wouldn''t the comment ''CPU0 at'' be improper? Hmm, so we use percpu structure, but all of the event channel and such are set for CPU0. So what is the benefit of this, when the interrupts are _only_ happening on CPU0? Oh, right, there is also the rebind cpu function somewhere so that the spinlock, timers, etc are allocated on other CPUs, right? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-26 14:50 UTC
[Xen-devel] Re: [PATCH 5/5] xen: events: use per-cpu variable for cpu_evtchn_mask
On Tue, 2010-10-26 at 15:36 +0100, Konrad Rzeszutek Wilk wrote:> On Mon, Oct 25, 2010 at 05:23:33PM +0100, Ian Campbell wrote: > > I can''t see any reason why it isn''t already. > > > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > --- > > drivers/xen/events.c | 31 +++++++++++-------------------- > > 1 files changed, 11 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > > index 9b58505..144ff72 100644 > > --- a/drivers/xen/events.c > > +++ b/drivers/xen/events.c > > @@ -110,19 +110,9 @@ static int *pirq_to_irq; > > static int nr_pirqs; > > > > static int *evtchn_to_irq; > > -struct cpu_evtchn_s { > > - unsigned long bits[NR_EVENT_CHANNELS/BITS_PER_LONG]; > > -}; > > > > -static __initdata struct cpu_evtchn_s init_evtchn_mask = { > > - .bits[0 ... (NR_EVENT_CHANNELS/BITS_PER_LONG)-1] = ~0ul, > > -}; > > -static struct cpu_evtchn_s *cpu_evtchn_mask_p = &init_evtchn_mask; > > - > > -static inline unsigned long *cpu_evtchn_mask(int cpu) > > -{ > > - return cpu_evtchn_mask_p[cpu].bits; > > -} > > +static DEFINE_PER_CPU(unsigned long [NR_EVENT_CHANNELS/BITS_PER_LONG], > > + cpu_evtchn_mask); > > > > /* Xen will never allocate port zero for any purpose. */ > > #define VALID_EVTCHN(chn) ((chn) != 0) > > @@ -301,7 +291,7 @@ static inline unsigned long active_evtchns(unsigned int cpu, > > unsigned int idx) > > { > > return (sh->evtchn_pending[idx] & > > - cpu_evtchn_mask(cpu)[idx] & > > + per_cpu(cpu_evtchn_mask, cpu)[idx] & > > ~sh->evtchn_mask[idx]); > > } > > > > @@ -314,8 +304,8 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) > > cpumask_copy(irq_to_desc(irq)->affinity, cpumask_of(cpu)); > > #endif > > > > - __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); > > - __set_bit(chn, cpu_evtchn_mask(cpu)); > > + __clear_bit(chn, per_cpu(cpu_evtchn_mask, cpu_from_irq(irq))); > > + __set_bit(chn, per_cpu(cpu_evtchn_mask, cpu)); > > > > info_for_irq(irq)->cpu = cpu; > > } > > @@ -332,7 +322,11 @@ static void init_evtchn_cpu_bindings(void) > > } > > #endif > > > > - memset(cpu_evtchn_mask(0), ~0, sizeof(struct cpu_evtchn_s)); > > + printk(KERN_CRIT "%s: CPU0 at %p size %zd\n", __func__, > > If this is per_cpu, wouldn''t the comment ''CPU0 at'' be improper?Oops, that was just debug anyway.> Hmm, so we use percpu structure, but all of the event channel and such > are set for CPU0. So what is the benefit of this, when the interrupts > are _only_ happening on CPU0?All event channels start off bound to VCPU0 so we need to initialise cpu_evtchn_mask(0) to have all ones and leave all the other CPU''s bitmaps with all zeros.> Oh, right, there is also the rebind cpu function somewhere so that the > spinlock, timers, etc are allocated on other CPUs, right?All evtchn''s start off bound to VCPU 0 but the kernel can rebind as necessary. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-26 16:37 UTC
[Xen-devel] Re: [PATCH 4/5] xen: events: dynamically allocate irq info structures
On 10/26/2010 07:30 AM, Konrad Rzeszutek Wilk wrote:> On Mon, Oct 25, 2010 at 05:23:32PM +0100, Ian Campbell wrote: >> Removes nr_irq sized array allocation at start of day. >> >> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> >> --- >> drivers/xen/events.c | 50 +++++++++++++++++++++++++++++++++++++++++--------- >> 1 files changed, 41 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/xen/events.c b/drivers/xen/events.c >> index 94055ea..9b58505 100644 >> --- a/drivers/xen/events.c >> +++ b/drivers/xen/events.c >> @@ -56,6 +56,8 @@ >> */ >> static DEFINE_SPINLOCK(irq_mapping_update_lock); >> >> +static LIST_HEAD(xen_irq_list_head); >> + >> /* IRQ <-> VIRQ mapping. */ >> static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1}; >> >> @@ -85,6 +87,7 @@ enum xen_irq_type { >> */ >> struct irq_info >> { >> + struct list_head list; >> enum xen_irq_type type; /* type */ >> unsigned short evtchn; /* event channel */ >> unsigned short cpu; /* cpu bound */ >> @@ -103,7 +106,6 @@ struct irq_info >> #define PIRQ_NEEDS_EOI (1 << 0) >> #define PIRQ_SHAREABLE (1 << 1) >> >> -static struct irq_info *irq_info; >> static int *pirq_to_irq; >> static int nr_pirqs; >> >> @@ -132,7 +134,7 @@ static struct irq_chip xen_pirq_chip; >> /* Get info for IRQ */ >> static struct irq_info *info_for_irq(unsigned irq) >> { >> - return &irq_info[irq]; >> + return get_irq_data(irq); >> } >> >> /* Constructors for packed IRQ information. */ >> @@ -315,7 +317,7 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) >> __clear_bit(chn, cpu_evtchn_mask(cpu_from_irq(irq))); >> __set_bit(chn, cpu_evtchn_mask(cpu)); >> >> - irq_info[irq].cpu = cpu; >> + info_for_irq(irq)->cpu = cpu; >> } >> >> static void init_evtchn_cpu_bindings(void) >> @@ -428,6 +430,21 @@ static int find_unbound_pirq(void) >> return -1; >> } >> >> +static void xen_irq_init(unsigned irq) >> +{ >> + struct irq_info *info; >> + >> + info = kmalloc(sizeof(*info), GFP_KERNEL); >> + if (info == NULL) >> + panic("Unable to allocate metadata for IRQ%d\n", irq); > There is a bunch of panic around allocating IRQs. There is one earlier > in xen_irq_alloc too, and I was wondering - would it make sense to > perhaps print an error to both the hypervisor and the kernel and > just return -1 as an IRQ and let the kernel continue with its normal > failure path? > > I am thinking just in terms of making the system still be able to > work even if parts of it are busted, instead of just crashing the system. > > Not sure which philosophy is domiant in the Linux kernel?Normally to print something and struggle on, rather than crash over every little thing. There''s been a general tendency to convert BUG into WARN, for example. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-26 16:44 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/26/2010 07:17 AM, Konrad Rzeszutek Wilk wrote:> On Mon, Oct 25, 2010 at 04:03:19PM -0700, Jeremy Fitzhardinge wrote: >> On 10/25/2010 10:35 AM, Konrad Rzeszutek Wilk wrote: >>> On Mon, Oct 25, 2010 at 05:23:29PM +0100, Ian Campbell wrote: >>>> Encapsulate allocate and free in xen_irq_alloc and xen_irq_free. >>>> >>>> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> >>>> --- >>>> drivers/xen/events.c | 68 ++++++++++++++++++++----------------------------- >>>> 1 files changed, 28 insertions(+), 40 deletions(-) >>>> >>>> diff --git a/drivers/xen/events.c b/drivers/xen/events.c >>>> index 97612f5..c8f3e43 100644 >>>> --- a/drivers/xen/events.c >>>> +++ b/drivers/xen/events.c >>>> @@ -394,41 +394,29 @@ static int find_unbound_pirq(void) >>>> return -1; >>>> } >>>> >>>> -static int find_unbound_irq(void) >>>> +static int xen_irq_alloc(void) >>>> { >>>> - struct irq_data *data; >>>> - int irq, res; >>>> - int start = get_nr_hw_irqs(); >>>> + int irq = irq_alloc_desc(0); >>>> >>>> - if (start == nr_irqs) >>>> - goto no_irqs; >>>> - >>>> - /* nr_irqs is a magic value. Must not use it.*/ >>>> - for (irq = nr_irqs-1; irq > start; irq--) { >>>> - data = irq_get_irq_data(irq); >>>> - /* only 0->15 have init''d desc; handle irq > 16 */ >>>> - if (!data) >>>> - break; >>>> - if (data->chip == &no_irq_chip) >>>> - break; >>>> - if (data->chip != &xen_dynamic_chip) >>>> - continue; >>>> - if (irq_info[irq].type == IRQT_UNBOUND) >>>> - return irq; >>>> - } >>>> - >>>> - if (irq == start) >>>> - goto no_irqs; >>>> + if (irq < 0) >>>> + panic("No available IRQ to bind to: increase nr_irqs!\n"); >>>> >>>> - res = irq_alloc_desc_at(irq, 0); >>>> + return irq; >>>> +} >>> So I am curious what the /proc/interrupts looks?The issue (and the reason >>> for this implementation above) was that under PV with PCI devices we would >>> overlap PCI devices IRQs with Xen event channels. So we could have a USB device >>> at IRQ 16 _and_ also a xen_spinlock4 handler. That would throw off the system >>> since the xen_spinlock4 was an edge type handler while the USB device was an >>> level (at least on my box). >> What? Why? How? Surely if we''re asking the irq subsystem to allocate > Imagine a PV guest with PCI passthrough. Normally the first 16 IRQs > are reserved for "legacy" devices. And the IRQs after that are up for grabs. > > Since the Xen event channels are initialized much much earlier than > any PCI devices, they end up using the IRQs right after 16 -which is OK > if you don''t have any PCI devices. If you have a PCI device that is > using IRQ 17 it ends up colliding with an event channel.Well, only because of the general tendency to try and allocate irq==gsi. If we don''t care about that (and we don''t particularly) then we can allocate any irq we like and map it to any gsi/pirq. In fact, Stefano''s series explicitly implements this.> Now, I have to confess I did not look carefully at the sparse_irq rework > so it might be that the IRQ numbur is not as important as it was > before 2.6.37.It was never very important. There was just a general policy to try and keep the irq for a device the same as it would be for native. But that''s probably only slightly relevant for dom0 and completely fictional for domU w/ passthrough.>> us an irq, it will return a fresh never-before-used (and certainly not >> shared) irq? Shared irqs only make sense if multiple devices are >> actually sharing, say, a wire on the board. > Right, and in this case we end up trying to use the IRQ for a physical > device and find out that the IRQ has/is being aleady used for an > event channel.In that case we should use dynamic allocation for everything. Or try to work out distinct irq ranges for different interrupts if you really want to keep irq==gsi.>> Or am I missing something? > Event channels are allocated before PCI devices so they get to usurp > the IRQ chip for the IRQ that belongs to the PCI device. > > Keep in mind that this is not possible under Dom0, as we have the > IOAPIC information, so we know that IRQ0-48 are reserved for GSI''s > for three of the IOAPIC. In PV with PCI passthrough such information > is not present and the kernel assumes no IOAPICs, and hence no > GSI. > > a). Maybe one way to do this is set the GSI high watermark to be the > same as the host (so move it from the legacy IRQ 16 to 48 for example). > This would require fiddling with the shared_info structure.. > > b) Another approach was to allocate event-channel IRQs and virtual IRQs > from the highest available IRQ and continue down . Physical IRQs would be > allocated from the legacy IRQ up to whatever is available. > > c) 2.6.18 kernels made a division right at 255, so anything under 255 was to be > used for physical IRQs, while anything above that for event channels and > vitual IRQs.d) dynamically allocate all irqs for all event channel types. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-26 17:08 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
> In that case we should use dynamic allocation for everything. Or try to > work out distinct irq ranges for different interrupts if you really want > to keep irq==gsi.Some little alarm bells are ringing in the back of my head about irq != gsi. I think the issue was the permission. When a PCI device is allocated to the PV guest, we do a bunch of xc_* calls to allow the domain to use the BARs and the IRQ. I believe when the guest boots and tries to map the event channel with the physical IRQ, one of the arguments is that GSI. And if we provide a bogus GSI, well, we won''t get the INTx to the guest. As you mentioned, Stefano''s patch add a new element to the tuple that can contain the GSI value. At which point we can make the guest IRQ != GSI, as long as we can contain the <gsi, event channel> mapping present so that for the hypercalls we can give it the right GSI. The MSI/MSI-X use a completly different mechanism that does not all of this complication, so we are OK with that. .. snip ..> d) dynamically allocate all irqs for all event channel types.<nods> Ok, you sold me on this idea. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2010-Oct-26 19:49 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Tue, 26 Oct 2010, Ian Campbell wrote:> On Mon, 2010-10-25 at 19:02 +0100, Ian Campbell wrote: > > > > > > > What do you see when you pass in a PCI device and say give the guest > > 32 CPUs?? > > > > I can try tomorrow and see, based on what you say above without > > implementing what I described I suspect the answer will be "carnage". > > Actually, it looks like multi-vcpu is broken, I only see 1 regardless of > how many I configured. It''s not clear if this is breakage in Linus'' > tree, something I pulled in from one of Jeremy''s, yours or Stefano''s > trees or some local pebcak. I''ll investigate...I found the bug, it was introduced by: "xen: use vcpu_ops to setup cpu masks" I have added the fix at the end of my branch and I am also appending the fix here. --- xen: initialize cpu masks for pv guests in xen_smp_init Pv guests don''t have ACPI and need the cpu masks to be set correctly as early as possible so we call xen_fill_possible_map from xen_smp_init. On the other hand the initial domain supports ACPI so in this case we skip xen_fill_possible_map and rely on it. However Xen might limit the number of cpus usable by the domain, so we filter those masks during smp initialization using the VCPUOP_is_up hypercall. It is important that the filtering is done before xen_setup_vcpu_info_placement. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 1386767..834dfeb 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -28,6 +28,7 @@ #include <asm/xen/interface.h> #include <asm/xen/hypercall.h> +#include <xen/xen.h> #include <xen/page.h> #include <xen/events.h> @@ -156,6 +157,25 @@ static void __init xen_fill_possible_map(void) { int i, rc; + if (xen_initial_domain()) + return; + + for (i = 0; i < nr_cpu_ids; i++) { + rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL); + if (rc >= 0) { + num_processors++; + set_cpu_possible(i, true); + } + } +} + +static void __init xen_filter_cpu_maps(void) +{ + int i, rc; + + if (!xen_initial_domain()) + return; + num_processors = 0; disabled_cpus = 0; for (i = 0; i < nr_cpu_ids; i++) { @@ -179,6 +199,7 @@ static void __init xen_smp_prepare_boot_cpu(void) old memory can be recycled */ make_lowmem_page_readwrite(xen_initial_gdt); + xen_filter_cpu_maps(); xen_setup_vcpu_info_placement(); } @@ -195,8 +216,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus) if (xen_smp_intr_init(0)) BUG(); - xen_fill_possible_map(); - if (!alloc_cpumask_var(&xen_cpu_initialized_map, GFP_KERNEL)) panic("could not allocate xen_cpu_initialized_map\n"); @@ -487,5 +506,6 @@ static const struct smp_ops xen_smp_ops __initdata = { void __init xen_smp_init(void) { smp_ops = xen_smp_ops; + xen_fill_possible_map(); xen_init_spinlocks(); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-26 20:20 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/26/2010 12:49 PM, Stefano Stabellini wrote:> On Tue, 26 Oct 2010, Ian Campbell wrote: >> On Mon, 2010-10-25 at 19:02 +0100, Ian Campbell wrote: >>> >>>> What do you see when you pass in a PCI device and say give the guest >>> 32 CPUs?? >>> >>> I can try tomorrow and see, based on what you say above without >>> implementing what I described I suspect the answer will be "carnage". >> Actually, it looks like multi-vcpu is broken, I only see 1 regardless of >> how many I configured. It''s not clear if this is breakage in Linus'' >> tree, something I pulled in from one of Jeremy''s, yours or Stefano''s >> trees or some local pebcak. I''ll investigate... > > I found the bug, it was introduced by: > > "xen: use vcpu_ops to setup cpu masks" > > I have added the fix at the end of my branch and I am also appending the > fix here.Acked. J> --- > > > xen: initialize cpu masks for pv guests in xen_smp_init > > Pv guests don''t have ACPI and need the cpu masks to be set > correctly as early as possible so we call xen_fill_possible_map from > xen_smp_init. > On the other hand the initial domain supports ACPI so in this case we skip > xen_fill_possible_map and rely on it. However Xen might limit the number > of cpus usable by the domain, so we filter those masks during smp > initialization using the VCPUOP_is_up hypercall. > It is important that the filtering is done before > xen_setup_vcpu_info_placement. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 1386767..834dfeb 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -28,6 +28,7 @@ > #include <asm/xen/interface.h> > #include <asm/xen/hypercall.h> > > +#include <xen/xen.h> > #include <xen/page.h> > #include <xen/events.h> > > @@ -156,6 +157,25 @@ static void __init xen_fill_possible_map(void) > { > int i, rc; > > + if (xen_initial_domain()) > + return; > + > + for (i = 0; i < nr_cpu_ids; i++) { > + rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL); > + if (rc >= 0) { > + num_processors++; > + set_cpu_possible(i, true); > + } > + } > +} > + > +static void __init xen_filter_cpu_maps(void) > +{ > + int i, rc; > + > + if (!xen_initial_domain()) > + return; > + > num_processors = 0; > disabled_cpus = 0; > for (i = 0; i < nr_cpu_ids; i++) { > @@ -179,6 +199,7 @@ static void __init xen_smp_prepare_boot_cpu(void) > old memory can be recycled */ > make_lowmem_page_readwrite(xen_initial_gdt); > > + xen_filter_cpu_maps(); > xen_setup_vcpu_info_placement(); > } > > @@ -195,8 +216,6 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus) > if (xen_smp_intr_init(0)) > BUG(); > > - xen_fill_possible_map(); > - > if (!alloc_cpumask_var(&xen_cpu_initialized_map, GFP_KERNEL)) > panic("could not allocate xen_cpu_initialized_map\n"); > > @@ -487,5 +506,6 @@ static const struct smp_ops xen_smp_ops __initdata = { > void __init xen_smp_init(void) > { > smp_ops = xen_smp_ops; > + xen_fill_possible_map(); > xen_init_spinlocks(); > } >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2010-Oct-28 12:43 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Tue, 26 Oct 2010, Konrad Rzeszutek Wilk wrote:> > In that case we should use dynamic allocation for everything. Or try to > > work out distinct irq ranges for different interrupts if you really want > > to keep irq==gsi. > > Some little alarm bells are ringing in the back of my head about irq != gsi. > > I think the issue was the permission. When a PCI device is allocated to the > PV guest, we do a bunch of xc_* calls to allow the domain to use the BARs > and the IRQ. I believe when the guest boots and tries to map the > event channel with the physical IRQ, one of the arguments is that GSI. And > if we provide a bogus GSI, well, we won''t get the INTx to the guest. > > As you mentioned, Stefano''s patch add a new element to the tuple that can > contain the GSI value. At which point we can make the guest IRQ != GSI, > as long as we can contain the <gsi, event channel> mapping present so > that for the hypercalls we can give it the right GSI. > > The MSI/MSI-X use a completly different mechanism that does not all > of this complication, so we are OK with that. > > .. snip .. > > > d) dynamically allocate all irqs for all event channel types. > > <nods> Ok, you sold me on this idea. >Even though dynamic allocation might seem possible for both pirqs and irqs, there are some severe limitations: - Xen won''t allocate pirq numbers lower than 16 (probably because it expects pirq == gsi for the first 16 gsi), so it might run out of pirqs if we ask Xen to always choose the pirq number for us. As a consequence it is safer to keep using pirq == gsi, at least for the first 16 gsis. This limitation should probably be fixed in Xen, but we need to support older hypervisors so we cannot rely on the fix to be present. - Linux expects irq == gsi, see arch/x86/kernel/acpi/boot.c:gsi_to_irq /* Provide an identity mapping of gsi == irq * except on truly weird platforms that have * non isa irqs in the first 16 gsis. */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Sander Eikelenboom
2010-Oct-28 12:57 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
Hello Stefano, Thursday, October 28, 2010, 2:43:12 PM, you wrote:> On Tue, 26 Oct 2010, Konrad Rzeszutek Wilk wrote: >> > In that case we should use dynamic allocation for everything. Or try to >> > work out distinct irq ranges for different interrupts if you really want >> > to keep irq==gsi. >> >> Some little alarm bells are ringing in the back of my head about irq != gsi. >> >> I think the issue was the permission. When a PCI device is allocated to the >> PV guest, we do a bunch of xc_* calls to allow the domain to use the BARs >> and the IRQ. I believe when the guest boots and tries to map the >> event channel with the physical IRQ, one of the arguments is that GSI. And >> if we provide a bogus GSI, well, we won''t get the INTx to the guest. >> >> As you mentioned, Stefano''s patch add a new element to the tuple that can >> contain the GSI value. At which point we can make the guest IRQ != GSI, >> as long as we can contain the <gsi, event channel> mapping present so >> that for the hypercalls we can give it the right GSI. >> >> The MSI/MSI-X use a completly different mechanism that does not all >> of this complication, so we are OK with that. >> >> .. snip .. >> >> > d) dynamically allocate all irqs for all event channel types. >> >> <nods> Ok, you sold me on this idea. >>> Even though dynamic allocation might seem possible for both pirqs and > irqs, there are some severe limitations:> - Xen won''t allocate pirq numbers lower than 16 (probably because it > expects pirq == gsi for the first 16 gsi), so it might run out > of pirqs if we ask Xen to always choose the pirq number for us. As a > consequence it is safer to keep using pirq == gsi, at least for the > first 16 gsis. This limitation should probably be fixed in Xen, but we > need to support older hypervisors so we cannot rely on the fix to be > present.I don''t know if this discussion is for dom0 kernels only., if it is .. is that support of older hypervisors necessarily true ? If i read the xen pvops wiki: NOTE! xen/stable-2.6.32.x versions after June 2010 (2.6.32.15 and newer) require at least Xen 4.0.1-rc2 or newer to work properly! xend and xenstored will fail to start if using those kernel versions with for example Xen 4.0.0. There''s an issue with creating/using /dev/xen/ device nodes, which has been fixed in Xen 4.0.1-rc2 and newer versions. See this patch: http://xenbits.xen.org/xen-4.0-testing.hg?rev/0e1521f654f2 and discussion at: http://lists.xensource.com/archives/html/xen-devel/2010-06/msg01129.html for more information. So a pvops dom0 kernel all ready seems to require a very recent hypervisor. So for a dom0 pvops kernel there has to be much less worry about support for older hypervisors for Xen guests it would only apply for pv domains ? So this could perhaps also be the opportunity to change things ? -- Sander> - Linux expects irq == gsi, see arch/x86/kernel/acpi/boot.c:gsi_to_irq> /* Provide an identity mapping of gsi == irq > * except on truly weird platforms that have > * non isa irqs in the first 16 gsis. > */-- Best regards, Sander mailto:linux@eikelenboom.it _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2010-Oct-28 13:12 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On Thu, 28 Oct 2010, Sander Eikelenboom wrote:> > - Xen won''t allocate pirq numbers lower than 16 (probably because it > > expects pirq == gsi for the first 16 gsi), so it might run out > > of pirqs if we ask Xen to always choose the pirq number for us. As a > > consequence it is safer to keep using pirq == gsi, at least for the > > first 16 gsis. This limitation should probably be fixed in Xen, but we > > need to support older hypervisors so we cannot rely on the fix to be > > present. > > I don''t know if this discussion is for dom0 kernels only., if it is .. is that support of older hypervisors necessarily true ? > > If i read the xen pvops wiki: > NOTE! xen/stable-2.6.32.x versions after June 2010 (2.6.32.15 and newer) require at least Xen 4.0.1-rc2 or newer to work properly! > xend and xenstored will fail to start if using those kernel versions with for example Xen 4.0.0. > There''s an issue with creating/using /dev/xen/ device nodes, which has been fixed in Xen 4.0.1-rc2 and newer versions. > See this patch: http://xenbits.xen.org/xen-4.0-testing.hg?rev/0e1521f654f2 and discussion at: http://lists.xensource.com/archives/html/xen-devel/2010-06/msg01129.html for more information. > > So a pvops dom0 kernel all ready seems to require a very recent hypervisor. So for a dom0 pvops kernel there has to be much less worry about support for older hypervisors for Xen guests it would only apply for pv domains ? > So this could perhaps also be the opportunity to change things ? >That is true, however we should try to avoid introducing new incompatibilities unless really necessary. In this case it should be sufficient to keep using pirq == gsi for gsi < 16. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-28 16:22 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/28/2010 05:43 AM, Stefano Stabellini wrote:> On Tue, 26 Oct 2010, Konrad Rzeszutek Wilk wrote: >>> In that case we should use dynamic allocation for everything. Or try to >>> work out distinct irq ranges for different interrupts if you really want >>> to keep irq==gsi. >> Some little alarm bells are ringing in the back of my head about irq != gsi. >> >> I think the issue was the permission. When a PCI device is allocated to the >> PV guest, we do a bunch of xc_* calls to allow the domain to use the BARs >> and the IRQ. I believe when the guest boots and tries to map the >> event channel with the physical IRQ, one of the arguments is that GSI. And >> if we provide a bogus GSI, well, we won''t get the INTx to the guest. >> >> As you mentioned, Stefano''s patch add a new element to the tuple that can >> contain the GSI value. At which point we can make the guest IRQ != GSI, >> as long as we can contain the <gsi, event channel> mapping present so >> that for the hypercalls we can give it the right GSI. >> >> The MSI/MSI-X use a completly different mechanism that does not all >> of this complication, so we are OK with that. >> >> .. snip .. >> >>> d) dynamically allocate all irqs for all event channel types. >> <nods> Ok, you sold me on this idea. >> > > Even though dynamic allocation might seem possible for both pirqs and > irqs, there are some severe limitations: > > > - Xen won''t allocate pirq numbers lower than 16 (probably because it > expects pirq == gsi for the first 16 gsi), so it might run out > of pirqs if we ask Xen to always choose the pirq number for us. As a > consequence it is safer to keep using pirq == gsi, at least for the > first 16 gsis. This limitation should probably be fixed in Xen, but we > need to support older hypervisors so we cannot rely on the fix to be > present. > > > - Linux expects irq == gsi, see arch/x86/kernel/acpi/boot.c:gsi_to_irq > > /* Provide an identity mapping of gsi == irq > * except on truly weird platforms that have > * non isa irqs in the first 16 gsis. > */Yes, we always have to identity map legacy ISA interrupts, and we should never attempt to dynamically allocate in that region. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Oct-29 15:51 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
> > So a pvops dom0 kernel all ready seems to require a very recent hypervisor. So for a dom0 pvops kernel there has to be much less worry about support for older hypervisors for Xen guests it would only apply for pv domains ? > > So this could perhaps also be the opportunity to change things ? > > > > That is true, however we should try to avoid introducing new > incompatibilities unless really necessary.<nods> Especially as there is nothing wrong with using the 4.0 Xen Hypervisor (released months ago) with 2.6.37.> In this case it should be sufficient to keep using pirq == gsi for > gsi < 16.So we would still need our own interrupt allocation mechanism to deal with this then. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-29 16:16 UTC
Re: [Xen-devel] Re: [PATCH 1/5] xen: events: use irq_alloc_desc(_at) instead of open-coding an IRQ allocator.
On 10/29/2010 08:51 AM, Konrad Rzeszutek Wilk wrote:>> In this case it should be sufficient to keep using pirq == gsi for >> gsi < 16. > So we would still need our own interrupt allocation mechanism to > deal with this then.Well, only minimally. The core interrupt allocator never returns irqs < 16, so we''re free to directly use them. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel