Ian Campbell
2010-Feb-26  11:42 UTC
[Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
With a single VCPU domain 0 (either due to hardware or dom0_max_vcpus=1)
and CONFIG_SPARSE_IRQ on xen/next I see:
        Kernel panic - not syncing: No available IRQ to bind to: increase nr_irqs! (currently 256, started from 256)
        
        Pid: 0, comm: swapper Not tainted 2.6.32-x86_64-xen0 #5
        Call Trace:
         [<ffffffff813ae4a5>] panic+0xa0/0x17f
         [<ffffffff8100fb5f>] ? xen_restore_fl_direct_end+0x0/0x1
         [<ffffffff8118990e>] ? kvasprintf+0x6e/0x90
         [<ffffffff811cd87a>] find_unbound_irq+0x8a/0xb0
         [<ffffffff811cd941>] bind_virq_to_irq+0xa1/0x190
         [<ffffffff813ae5eb>] ? printk+0x67/0x6c
         [<ffffffff8100f7b0>] ? xen_timer_interrupt+0x0/0x1a0
         [<ffffffff811cde6d>] bind_virq_to_irqhandler+0x2d/0x80
         [<ffffffff8100f6e9>] xen_setup_timer+0x59/0x120
         [<ffffffff815cc8e1>] xen_time_init+0xa0/0xcf
         [<ffffffff815cd530>] x86_late_time_init+0xa/0x11
         [<ffffffff815c8d65>] start_kernel+0x31e/0x442
         [<ffffffff815c82b9>] x86_64_start_reservations+0x99/0xb9
         [<ffffffff815cba63>] xen_start_kernel+0x6a4/0x76e
it appears that nr_irqs == get_nr_irqs_gsi() in this configuration.
Seems to impact 32(on64) and 64 bit kernels.
xen/master (2.6.31.6) appears fine. I glanced through the diff between
xen/master and xen/next and nothing leaps out. xen/next is missing
e459de959 "Find an unbound irq number in reverse order (high to low)."
but I don't see how that would make a difference (and it doesn't).
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
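A minimal sketch of the failure mode reported above, assuming the allocator scans the IRQ numbers from the first non-hardware slot up to nr_irqs and gives up when it finds nothing. The function name, the bound[] array and the -1 return value are hypothetical and only model the behaviour; this is not the kernel's find_unbound_irq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model (not the kernel code): scan [first_dynamic, nr_irqs)
 * for a slot that is not yet bound. Returns the slot number, or -1 in the
 * situation where the kernel would panic with "No available IRQ to bind to".
 */
static int model_find_unbound_irq(const bool *bound, int first_dynamic,
                                  int nr_irqs)
{
    assert(bound != NULL && first_dynamic >= 0 && first_dynamic <= nr_irqs);

    for (int irq = first_dynamic; irq < nr_irqs; irq++) {
        if (!bound[irq])
            return irq;
    }
    return -1; /* empty or exhausted range: nothing left to bind */
}
```

With first_dynamic and nr_irqs both 256, as in the panic message, the range is empty and the very first VIRQ bind (the timer) cannot succeed.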
Ian Campbell
2010-Feb-26  12:05 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Fri, 2010-02-26 at 11:42 +0000, Ian Campbell wrote:
> xen/master (2.6.31.6) appears fine. I glanced through the diff between
> xen/master and xen/next and nothing leaps out. xen/next is missing
> e459de959 "Find an unbound irq number in reverse order (high to low)."
> but I don't see how that would make a difference (and it doesn't).

Under 2.6.31.6 acpi_probe_gsi (and hence probe_nr_irqs_gsi) returns 24
while under 2.6.32 it returns 256. Part of it might be 48beb917f ""
which is only in xen/next and contains:

--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1039,6 +1039,9 @@ int __init acpi_probe_gsi(void)
 			max_gsi = gsi;
 	}
 
+	if (xen_initial_domain())
+		max_gsi += 255; /* Plus maximum entries of an ioapic. */
+
 	return max_gsi + 1;
 }

Which looks mighty suspicious to me... However simply removing that
causes acpi_probe_gsi to return 16 (instead of 24) and I run out of
interrupts for use by real hardware (specifically my disk controller).
If I hack acpi_probe_gsi to return at least 24 everything works OK so
it seems the error is only at the detection stage.

The kernel logs contain this diff between 2.6.31.6 and 2.6.32 which I
suspect is relevant...

@@ -349,167 +341,125 @@
 ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
 ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
 ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
-IOAPIC[0]: Unable to change apic_id!
-IOAPIC[0]: apic_id 255, version 32, address 0xfec00000, GSI 0-23
+IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-0
 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
+ERROR: Unable to locate IOAPIC for GSI 2
 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
-ACPI: IRQ0 used by override.
-ACPI: IRQ2 used by override.
-ACPI: IRQ9 used by override.
+ERROR: Unable to locate IOAPIC for GSI 9
 Enabling APIC mode: Flat. Using 1 I/O APICs
 Using ACPI (MADT) for SMP configuration information

2a4ab640 "ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n"
moved some stuff around in this area but io_apic_get_redir_entries()
(which appears to be the key function defining the GSI range) doesn't
seem to have changed.

Ian.
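The log diff above can be read against the layout of the IO APIC version register (index 0x01): bits 0-7 hold the version and bits 16-23 hold the index of the last redirection entry. The sketch below decodes a raw value under that layout. The helper names are hypothetical, and the raw value 0x00170020 is reconstructed from the "version 32 ... GSI 0-23" log line rather than taken from any message in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Version field: bits 0-7 of the version register. */
static unsigned model_ioapic_version(uint32_t reg_01)
{
    return reg_01 & 0xffu;
}

/* Index of the last redirection entry: bits 16-23. */
static unsigned model_ioapic_max_entry(uint32_t reg_01)
{
    return (reg_01 >> 16) & 0xffu;
}

/* Last GSI served by an IO APIC whose first GSI is gsi_base. */
static unsigned model_gsi_end(unsigned gsi_base, uint32_t reg_01)
{
    assert(gsi_base <= UINT32_MAX - 255u); /* the sum must not wrap */
    return gsi_base + model_ioapic_max_entry(reg_01);
}
```

A register that reads back as all zeros decodes to version 0 and a GSI range of 0-0, which is what the 2.6.32 log shows.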
Ian Campbell
2010-Feb-26  16:17 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Fri, 2010-02-26 at 12:05 +0000, Ian Campbell wrote:
>
> Which looks mighty suspicious to me... However simply removing that
> causes acpi_probe_gsi to return 16 (instead of 24) and I run out of
> interrupts for use by real hardware (specifically my disk controller).
> If I hack acpi_probe_gsi to return at least 24 everything works OK so
> it seems the error is only at the detection stage.

So this seems to all relate to the removal of the xen_io_apic_(read|
write) stuff.

I can see that the GSI routing stuff is effectively replaced by
PHYSDEVOP_setup_gsi but I don't see what replaces the IO APIC
enumeration. We still map a dummy page for FIX_IO_APIC_* and
io_apic_(read|write) now go at that direct (and therefore get 0s back).
If the intention is not to enumerate the IO APICs in this way then what
seems to be missing is the part which discovers the number of GSIs in
the system and I'm not sure what is supposed to replace that.

Perhaps the "max_gsi += 256" thing is simply supposed to cover the
largest possible use but in that case I think we also need to bump
nr_irqs to leave some space for dynamically created IRQ sources.

I guess I'm not sure what the intended approach here is, let alone what
the right answer is.

Ian.
Jeremy Fitzhardinge
2010-Feb-26  17:28 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On 02/26/2010 08:17 AM, Ian Campbell wrote:
> On Fri, 2010-02-26 at 12:05 +0000, Ian Campbell wrote:
>
>> Which looks mighty suspicious to me... However simply removing that
>> causes acpi_probe_gsi to return 16 (instead of 24) and I run out of
>> interrupts for use by real hardware (specifically my disk controller).
>> If I hack acpi_probe_gsi to return at least 24 everything works OK so
>> it seems the error is only at the detection stage.
>>
> So this seems to all relate to the removal of the xen_io_apic_(read|
> write) stuff.
>

Yep.

> I can see that the GSI routing stuff is effectively replaced by
> PHYSDEVOP_setup_gsi but I don't see what replaces the IO APIC
> enumeration. We still map a dummy page for FIX_IO_APIC_* and
> io_apic_(read|write) now go at that direct (and therefore get 0s back).
> If the intention is not to enumerate the IO APICs in this way then what
> seems to be missing is the part which discovers the number of GSIs in
> the system and I'm not sure what is supposed to replace that.
>

Nothing, as yet. The "+= 256" is definitely a hack, and we need to come
up with a sound way to resolve it. There seem to be three possibilities:

    * Let the kernel see the IO APICs for the purposes of enumeration,
      but nothing else (which seems to defeat the point of the exercise)
    * Make up a fake Xen IO APIC mapping which just contains static
      state for the config registers. (I don't think this will work,
      because the IO APIC registers aren't simply memory-mapped)
    * Add an interface to Xen so it can return the results of its own IO
      APIC enumeration, and use that in dom0. I think this is probably
      most consistent with the idea that "Xen owns all the APICs", but
      I'm not sure how to wire it into the Linux side.

Ideally we should also be able to get rid of the fake IO APIC mappings
because nothing in Linux will even attempt to access them, but I suspect
in practice it will be easier to let some probe code poke at them and
find they're not there rather than try and disable the probe.

I think Xiantao has some thoughts on this too.

    J
Erik Brakkee
2010-Feb-28  12:01 UTC
[Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Ian Campbell <Ian.Campbell <at> citrix.com> writes:
>
> With a single VCPU domain 0 (either due to hardware or dom0_max_vcpus=1)
> and CONFIG_SPARSE_IRQ on xen/next I see:
>
> Kernel panic - not syncing: No available IRQ to bind to: increase nr_irqs! (currently 256, started from 256)
>

The strange thing is that the kernel panic at boot also appears with xen/master
now. I am using the .config from
http://wiki.xensource.com/xenwiki/XenParavirtOps. This behavior is occurring on
a Sony Vaio F11 laptop (Core i7 720QM, 8GB memory).

Is there a simple workaround for this problem?
Erik Brakkee
2010-Feb-28  20:12 UTC
[Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Erik Brakkee wrote:
>
> The strange thing is that the kernel panic at boot also appears with xen/master
> now. I am using the .config from
> http://wiki.xensource.com/xenwiki/XenParavirtOps. This behavior is occurring on
> a Sony Vaio F11 laptop (Core i7 720QM, 8GB memory).
>
> Is there a simple workaround for this problem?

Disabling CONFIG_SPARSE_IRQ also does not help. And I also use the default
settings for vcpus for dom0.
Ian Campbell
2010-Mar-01  09:41 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Fri, 2010-02-26 at 17:28 +0000, Jeremy Fitzhardinge wrote:
> On 02/26/2010 08:17 AM, Ian Campbell wrote:
> > On Fri, 2010-02-26 at 12:05 +0000, Ian Campbell wrote:
> >
> >> Which looks mighty suspicious to me... However simply removing that
> >> causes acpi_probe_gsi to return 16 (instead of 24) and I run out of
> >> interrupts for use by real hardware (specifically my disk controller).
> >> If I hack acpi_probe_gsi to return at least 24 everything works OK so
> >> it seems the error is only at the detection stage.
> >>
> > So this seems to all relate to the removal of the xen_io_apic_(read|
> > write) stuff.
> >
>
> Yep.
>
> > I can see that the GSI routing stuff is effectively replaced by
> > PHYSDEVOP_setup_gsi but I don't see what replaces the IO APIC
> > enumeration. We still map a dummy page for FIX_IO_APIC_* and
> > io_apic_(read|write) now go at that direct (and therefore get 0s back).
> > If the intention is not to enumerate the IO APICs in this way then what
> > seems to be missing is the part which discovers the number of GSIs in
> > the system and I'm not sure what is supposed to replace that.
> >
>
> Nothing, as yet. The "+= 256" is definitely a hack, and we need to come
> up with a sound way to resolve it. There seem to be three possibilities:
>
>     * Let the kernel see the IO APICs for the purposes of enumeration,
>       but nothing else (which seems to defeat the point of the exercise)
>     * Make up a fake Xen IO APIC mapping which just contains static
>       state for the config registers. (I don't think this will work,
>       because the IO APIC registers aren't simply memory-mapped)

Unfortunately IIRC they are index+data style, which is a pain.

>     * Add an interface to Xen so it can return the results of its own IO
>       APIC enumeration, and use that in dom0. I think this is probably
>       most consistent with the idea that "Xen owns all the APICs", but
>       I'm not sure how to wire it into the Linux side.

Yes, that's the lines I was thinking along as well, but I wasn't sure
how to go about it either. For example I suppose you'd need to get in at
the acpi_parse_ioapic-ish level which isn't going to fly. Could we
nobble the APIC portion of the ACPI tables or otherwise arrange for the
generic code to find no IO APICs and instead enumerate and register them
in the Xen specific code? I noticed we have xen_io_apic_init() which
doesn't seem to be called from anywhere. There is also a
"skip_ioapic_setup" variable already defined which might be useful?

Regardless of the mechanism for detecting the number of hardware
interrupts required I think we need a mechanism to cause some extra
interrupts to be available for VIRQ and backend use. I think previously
we've just been lucky that the core code overestimated the number of
h/w interrupt sources so we got a few free ones for our purposes. It
looks like the upstream x86 guys are doing some work to make interrupts
be more dynamically allocated (the radix tree irq_desc stuff) which
looks like it would be very useful for us once it lands. In 2.6.18 we
explicitly left space for a number of dynamic IRQs which seems like a
reasonable approach in the interim, I'll cook up a patch.

Ian.
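To illustrate why "index+data style" registers defeat a static dummy page, here is a minimal sketch. It assumes the usual IO APIC window layout (index register at byte offset 0x00, data window at 0x10); the device model, the struct and every function name are hypothetical. A real device reacts to the index write, whereas plain memory does not.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_IOREGSEL 0x00u /* byte offset of the index (select) register */
#define MODEL_IOWIN    0x10u /* byte offset of the data (window) register */

/* Hypothetical device model: writing the index selects which internal
 * register the data window exposes. */
struct model_ioapic {
    uint32_t index;
    uint32_t regs[64];
};

static void model_dev_write(struct model_ioapic *dev, uint32_t offset,
                            uint32_t value)
{
    assert(offset == MODEL_IOREGSEL || offset == MODEL_IOWIN);
    if (offset == MODEL_IOREGSEL)
        dev->index = value % 64u;
    else
        dev->regs[dev->index] = value;
}

static uint32_t model_dev_read(const struct model_ioapic *dev, uint32_t offset)
{
    assert(offset == MODEL_IOREGSEL || offset == MODEL_IOWIN);
    return offset == MODEL_IOREGSEL ? dev->index : dev->regs[dev->index];
}

/* Register read against the device model: select, then read the window. */
static uint32_t model_dev_read_reg(struct model_ioapic *dev, uint32_t reg)
{
    model_dev_write(dev, MODEL_IOREGSEL, reg);
    return model_dev_read(dev, MODEL_IOWIN);
}

/* The same access pattern against a dummy page: the index write lands in
 * ordinary memory, so the window offset still holds whatever was there. */
static uint32_t model_page_read_reg(uint32_t *page, uint32_t reg)
{
    page[MODEL_IOREGSEL / 4] = reg;
    return page[MODEL_IOWIN / 4];
}
```

Against a zero-filled page every register reads back as 0, which matches the "get 0s back" observation earlier in the thread; a static page could at best answer for one register.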
Ian Campbell
2010-Mar-01  10:33 UTC
Re: [Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Sun, 2010-02-28 at 12:01 +0000, Erik Brakkee wrote:
> Ian Campbell <Ian.Campbell <at> citrix.com> writes:
>
> >
> > With a single VCPU domain 0 (either due to hardware or dom0_max_vcpus=1)
> > and CONFIG_SPARSE_IRQ on xen/next I see:
> >
> > Kernel panic - not syncing: No available IRQ to bind to: increase nr_irqs! (currently 256, started from 256)
> >
>
> The strange thing is that the kernel panic at boot also appears with xen/master
> now. I am using the .config from
> http://wiki.xensource.com/xenwiki/XenParavirtOps. This behavior is occurring on
> a Sony Vaio F11 laptop (Core i7 720QM, 8GB memory).
>
> Is there a simple workaround for this problem?

I think the xen/master case is actually down to my "fix off-by-one error
in find_unbound_irq" change -- should probably be reverted.

Ian.
Ian Campbell
2010-Mar-01  11:27 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Mon, 2010-03-01 at 09:41 +0000, Ian Campbell wrote:
>
> Regardless of the mechanism for detecting the number of hardware
> interrupts required I think we need a mechanism to cause some extra
> interrupts to be available for VIRQ and backend use. I think previously
> we've just been lucky that the core code overestimated the number of
> h/w interrupt sources so we got a few free ones for our purposes. It
> looks like the upstream x86 guys are doing some work to make interrupts
> be more dynamically allocated (the radix tree irq_desc stuff) which
> looks like it would be very useful for us once it lands. In 2.6.18 we
> explicitly left space for a number of dynamic IRQs which seems like a
> reasonable approach in the interim, I'll cook up a patch.

How about this vs. xen/xen/dom0/apic-next:

The following changes since commit dc23f2c13cc3c0080af806b318cf63850778c4c2:
  Jeremy Fitzhardinge (1):
        xen/apic: add missing header

are available in the git repository at:

  git://xenbits.xensource.com/people/ianc/linux-2.6.git for-jeremy/apic

Ian Campbell (1):
      xen: allow some overhead in IRQ space for dynamic IRQs

 arch/x86/include/asm/irq_vectors.h |   14 +++++++++++---
 arch/x86/kernel/apic/io_apic.c     |    4 ++++
 arch/x86/xen/enlighten.c           |    4 ++++
 3 files changed, 19 insertions(+), 3 deletions(-)

---

From 6d4a9168207ade237098a401270959ecc0bdd1e9 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@citrix.com>
Date: Mon, 1 Mar 2010 11:21:15 +0000
Subject: [PATCH] xen: allow some overhead in IRQ space for dynamic IRQs
 such as VIRQs and backend event channels.

This is an interim solution until x86 interrupts become totally dynamic.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 arch/x86/include/asm/irq_vectors.h |   14 +++++++++++---
 arch/x86/kernel/apic/io_apic.c     |    4 ++++
 arch/x86/xen/enlighten.c           |    4 ++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 5b21f0e..db2aef4 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -157,6 +157,14 @@ static inline int invalid_vm86_irq(int irq)
 #define CPU_VECTOR_LIMIT ( 8 * NR_CPUS )
 #define IO_APIC_VECTOR_LIMIT ( 32 * MAX_IO_APICS )
 
+#ifndef __ASSEMBLY__
+# if defined(CONFIG_X86_IO_APIC) && defined(CONFIG_SPARSE_IRQ)
+extern int nr_dynamic_irqs;
+# else
+# define NR_DYNAMIC_IRQS 256
+# endif
+#endif
+
 #ifdef CONFIG_X86_IO_APIC
 # ifdef CONFIG_SPARSE_IRQ
 # define NR_IRQS \
@@ -165,13 +173,13 @@ static inline int invalid_vm86_irq(int irq)
 (NR_VECTORS + IO_APIC_VECTOR_LIMIT))
 # else
 # if NR_CPUS < MAX_IO_APICS
-# define NR_IRQS (NR_VECTORS + 4*CPU_VECTOR_LIMIT)
+# define NR_IRQS (NR_VECTORS + 4*CPU_VECTOR_LIMIT) + NR_DYNAMIC_IRQS
 # else
-# define NR_IRQS (NR_VECTORS + IO_APIC_VECTOR_LIMIT)
+# define NR_IRQS (NR_VECTORS + IO_APIC_VECTOR_LIMIT) + NR_DYNAMIC_IRQS
 # endif
 # endif
 #else /* !CONFIG_X86_IO_APIC: */
-# define NR_IRQS NR_IRQS_LEGACY
+# define NR_IRQS NR_IRQS_LEGACY + NR_DYNAMIC_IRQS
 #endif
 
 #endif /* _ASM_X86_IRQ_VECTORS_H */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c074c1b..3ea627d 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3869,6 +3869,8 @@ int get_nr_irqs_gsi(void)
 }
 
 #ifdef CONFIG_SPARSE_IRQ
+int nr_dynamic_irqs;
+
 int __init arch_probe_nr_irqs(void)
 {
 	int nr;
@@ -3886,6 +3888,8 @@ int __init arch_probe_nr_irqs(void)
 	if (nr < nr_irqs)
 		nr_irqs = nr;
 
+	nr_irqs += nr_dynamic_irqs;
+
 	return 0;
 }
 #endif
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 9dea797..421e3ee 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -986,6 +986,10 @@ asmlinkage void __init xen_start_kernel(void)
 	pv_apic_ops = xen_apic_ops;
 	pv_mmu_ops = xen_mmu_ops;
 
+#ifdef CONFIG_SPARSE_IRQ
+	nr_dynamic_irqs += 256;
+#endif
+
 	xen_init_irq_ops();
 	xen_init_cpuid_mask();
 
-- 
1.5.6.5
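A rough sketch of the arithmetic the patch changes. Only the final clamp and the "+= nr_dynamic_irqs" line come from the patch context above. The cap at NR_VECTORS * nr_cpu_ids and the "nr_irqs_gsi + 8 * nr_cpu_ids" estimate are assumptions about the surrounding 2.6.32 code and leave out any MSI headroom, so treat the exact numbers as illustrative. All names are hypothetical.

```c
#include <assert.h>

#define MODEL_NR_VECTORS 256 /* assumed value of NR_VECTORS on x86 */

/* Hypothetical model of the nr_irqs calculation with the patch applied. */
static int model_probe_nr_irqs(int nr_irqs, int nr_cpu_ids, int nr_irqs_gsi,
                               int nr_dynamic_irqs)
{
    int nr;

    assert(nr_cpu_ids > 0 && nr_irqs_gsi >= 0 && nr_dynamic_irqs >= 0);

    /* Assumed cap: no more IRQs than vectors across all CPUs. */
    if (nr_irqs > MODEL_NR_VECTORS * nr_cpu_ids)
        nr_irqs = MODEL_NR_VECTORS * nr_cpu_ids;

    /* Assumed estimate of what the hardware needs. */
    nr = nr_irqs_gsi + 8 * nr_cpu_ids;
    if (nr < nr_irqs)
        nr_irqs = nr;

    /* The line the patch adds: reserve space on top for dynamic IRQs. */
    nr_irqs += nr_dynamic_irqs;
    return nr_irqs;
}

/* Slots left above the hardware GSI range for VIRQs and event channels. */
static int model_free_dynamic_slots(int nr_irqs, int nr_irqs_gsi)
{
    return nr_irqs > nr_irqs_gsi ? nr_irqs - nr_irqs_gsi : 0;
}
```

Under these assumptions a single-VCPU dom0 with nr_irqs_gsi at 256 ends up with nr_irqs at 256 and no free slots, which would explain why the problem shows up only with one VCPU; adding 256 dynamic IRQs lifts nr_irqs to 512.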
Zhang, Xiantao
2010-Mar-01  12:34 UTC
RE: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Jeremy Fitzhardinge wrote:
> On 02/26/2010 08:17 AM, Ian Campbell wrote:
>> On Fri, 2010-02-26 at 12:05 +0000, Ian Campbell wrote:
>>
>>> Which looks mighty suspicious to me... However simply removing that
>>> causes acpi_probe_gsi to return 16 (instead of 24) and I run out of
>>> interrupts for use by real hardware (specifically my disk
>>> controller). If I hack acpi_probe_gsi to return at least 24
>>> everything works OK so it seems the error is only at the detection
>>> stage.
>>>
>> So this seems to all relate to the removal of the xen_io_apic_(read|
>> write) stuff.
>>
>
> Yep.
>
>> I can see that the GSI routing stuff is effectively replaced by
>> PHYSDEVOP_setup_gsi but I don't see what replaces the IO APIC
>> enumeration. We still map a dummy page for FIX_IO_APIC_* and
>> io_apic_(read|write) now go at that direct (and therefore get 0s
>> back). If the intention is not to enumerate the IO APICs in this way
>> then what seems to be missing is the part which discovers the number
>> of GSIs in the system and I'm not sure what is supposed to replace
>> that.
>>
>
> Nothing, as yet. The "+= 256" is definitely a hack, and we need to
> come up with a sound way to resolve it. There seem to be three
> possibilities:
>
>     * Let the kernel see the IO APICs for the purposes of enumeration,
>       but nothing else (which seems to defeat the point of the
>       exercise)
>     * Make up a fake Xen IO APIC mapping which just contains static
>       state for the config registers. (I don't think this will work,
>       because the IO APIC registers aren't simply memory-mapped)
>     * Add an interface to Xen so it can return the results of its own
>       IO APIC enumeration, and use that in dom0. I think this is
>       probably most consistent with the idea that "Xen owns all the
>       APICs", but I'm not sure how to wire it into the Linux side.
>
> Ideally we should also be able to get rid of the fake IO APIC mappings
> because nothing in Linux will even attempt to access them, but I
> suspect in practice it will be easier to let some probe code poke at
> them and find they're not there rather than try and disable the probe.

Currently, ioapic access only exists at the kernel's boot time, to probe
some info related to the ioapic (e.g. ioapic version, ioapic's rte
number), and there is no access to the ioapic at runtime; this is why we
still need the dummy page there.
To remove the hack, we can use your third method with the existing
interface PHYSDEVOP_apic_read to read the redirect entry number of the
ioapic. Attached the patch. What's your opinion ? :)

From e5a75b3f2f40e56de714818b51932e6f36491f56 Mon Sep 17 00:00:00 2001
From: Xiantao Zhang <xiantao.zhang@intel.com>
Date: Mon, 1 Mar 2010 19:06:43 -0500
Subject: [PATCH] x86: ioapic: Remove the hack for calculating nr_irq_gsi for Xen.

Read the entry number through the hypercall PHYSDEVOP_apic_read, but
the default value is also set to 255 if PHYSDEVOP_apic_read doesn't
exist.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
---
 arch/x86/include/asm/io_apic.h |    1 +
 arch/x86/kernel/acpi/boot.c    |    3 ---
 arch/x86/kernel/apic/io_apic.c |    5 +++++
 arch/x86/xen/pci.c             |   20 ++++++++++++++++++++
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 2fc09d3..c58a838 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -172,6 +172,7 @@ extern int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
 
 extern void probe_nr_irqs_gsi(void);
 extern int get_nr_irqs_gsi(void);
+extern void set_nr_irqs_gsi(int nr_gsi);
 
 extern int setup_ioapic_entry(int apic, int irq,
 			      struct IO_APIC_route_entry *entry,
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 21fc029..7ba650f 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -869,9 +869,6 @@ int __init acpi_probe_gsi(void)
 			max_gsi = gsi;
 	}
 
-	if (xen_initial_domain())
-		max_gsi += 255; /* Plus maximum entries of an ioapic. */
-
 	return max_gsi + 1;
 }
 
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 68acd64..e116f7f 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3831,6 +3831,11 @@ int get_nr_irqs_gsi(void)
 	return nr_irqs_gsi;
 }
 
+void set_nr_irqs_gsi(int nr_gsi)
+{
+	nr_irqs_gsi = nr_gsi;
+}
+
 #ifdef CONFIG_SPARSE_IRQ
 int __init arch_probe_nr_irqs(void)
 {
diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index f999ad8..1839d5f 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -78,6 +78,26 @@ void __init xen_setup_pirqs(void)
 		for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
 			xen_allocate_pirq(irq, 0, "xt-pic");
 		return;
+	} else {
+		struct physdev_apic apic_op;
+		int ret;
+		union IO_APIC_reg_01 reg_01;
+		int nr_gsi = get_nr_irqs_gsi();
+
+		apic_op.apic_physbase = mp_ioapics[nr_ioapics - 1].apicaddr;
+		apic_op.reg = 1;
+		ret = HYPERVISOR_physdev_op(PHYSDEVOP_apic_read, &apic_op);
+		if (ret) {
+			nr_gsi += 255;
+			printk("PHYSDEVOP_apic_read error,"
+				"set to max value(255) for entry number!\n");
+		} else {
+			reg_01.raw = apic_op.value;
+			nr_gsi += reg_01.bits.entries;
+		}
+		if (nr_ioapics == 1)
+			nr_gsi -= NR_IRQS_LEGACY;
+		set_nr_irqs_gsi(nr_gsi);
 	}
 
 	/* Pre-allocate legacy irqs */
-- 
1.6.0.rc1
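The GSI arithmetic in the xen_setup_pirqs() hunk above can be followed with a small standalone model. It mirrors the patch's three steps, with the hypercall's return code and register value passed in as plain arguments. The function name and the sample inputs are hypothetical, and NR_IRQS_LEGACY is assumed to be 16.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_IRQS_LEGACY 16 /* assumed value of NR_IRQS_LEGACY */

/*
 * Hypothetical model of the patch's calculation:
 *  - on hypercall failure, fall back to adding 255;
 *  - otherwise add the "entries" field (bits 16-23) of register 0x01;
 *  - with a single IO APIC, subtract the legacy IRQ count.
 */
static int model_xen_nr_gsi(int nr_gsi, int nr_ioapics, int hypercall_ret,
                            uint32_t reg_01)
{
    assert(nr_ioapics >= 1);

    if (hypercall_ret)
        nr_gsi += 255;
    else
        nr_gsi += (int)((reg_01 >> 16) & 0xffu);

    if (nr_ioapics == 1)
        nr_gsi -= MODEL_NR_IRQS_LEGACY;

    return nr_gsi;
}
```

For example, starting from a count of 16 with one IO APIC, a successful read of 0x00170020 gives 16 + 23 - 16 = 23, and a failed hypercall gives 16 + 255 - 16 = 255. The starting count of 16 is made up for illustration.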
Erik Brakkee
2010-Mar-01  17:44 UTC
Re: [Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Ian Campbell wrote:
> On Sun, 2010-02-28 at 12:01 +0000, Erik Brakkee wrote:
>
>>
>> Is there a simple workaround for this problem?
>>
> I think the xen/master case is actually down to my "fix off-by-one error
> in find_unbound_irq" change -- should probably be reverted.
>
> Ian.
>
>

I have been desperately trying to get a stable xen configuration running for
some time now. One of the things I have tried was to compile a xen release
from source including kernel version 2.6.18, but that failed to boot because
of an incompatibility between the LVM version of opensuse 11.2 and that old
kernel. The current opensuse 11.2 contains xen 3.4.1, but there I have a
serious stability issue (https://bugzilla.novell.com/show_bug.cgi?id=583867),
so I was hoping to get the xen/master branch to work.

It would be good to have a stable xen version again with a recent kernel.
Konrad Rzeszutek Wilk
2010-Mar-01  17:47 UTC
Re: [Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Mon, Mar 01, 2010 at 10:33:17AM +0000, Ian Campbell wrote:
> On Sun, 2010-02-28 at 12:01 +0000, Erik Brakkee wrote:
> > Ian Campbell <Ian.Campbell <at> citrix.com> writes:
> >
> > >
> > > With a single VCPU domain 0 (either due to hardware or dom0_max_vcpus=1)
> > > and CONFIG_SPARSE_IRQ on xen/next I see:
> > >
> > > Kernel panic - not syncing: No available IRQ to bind to: increase nr_irqs! (currently 256, started from 256)
> > >
> >
> > The strange thing is that the kernel panic at boot also appears with xen/master
> > now. I am using the .config from
> > http://wiki.xensource.com/xenwiki/XenParavirtOps. This behavior is occurring on
> > a Sony Vaio F11 laptop (Core i7 720QM, 8GB memory).
> >
> > Is there a simple workaround for this problem?
>
> I think the xen/master case is actually down to my "fix off-by-one error
> in find_unbound_irq" change -- should probably be reverted.

Shouldn't your other patch, the one that inserts the

	if (start == nr_irqs) goto out;

be added instead?
Ian Campbell
2010-Mar-01  19:36 UTC
Re: [Xen-devel] Re: CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Mon, 2010-03-01 at 17:47 +0000, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 01, 2010 at 10:33:17AM +0000, Ian Campbell wrote:
> > On Sun, 2010-02-28 at 12:01 +0000, Erik Brakkee wrote:
> > > Ian Campbell <Ian.Campbell <at> citrix.com> writes:
> > >
> > > >
> > > > With a single VCPU domain 0 (either due to hardware or dom0_max_vcpus=1)
> > > > and CONFIG_SPARSE_IRQ on xen/next I see:
> > > >
> > > > Kernel panic - not syncing: No available IRQ to bind to: increase nr_irqs! (currently 256, started from 256)
> > > >
> > >
> > > The strange thing is that the kernel panic at boot also appears with xen/master
> > > now. I am using the .config from
> > > http://wiki.xensource.com/xenwiki/XenParavirtOps. This behavior is occurring on
> > > a Sony Vaio F11 laptop (Core i7 720QM, 8GB memory).
> > >
> > > Is there a simple workaround for this problem?
> >
> > I think the xen/master case is actually down to my "fix off-by-one error
> > in find_unbound_irq" change -- should probably be reverted.
>
> Shouldn't your other patch, the one that inserts the
>
> 	if (start == nr_irqs) goto out;
>
> be added instead?

Yes, I hadn't written it when I wrote the above.

Ian.
Jeremy Fitzhardinge
2010-Mar-03  22:35 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On 03/01/2010 04:34 AM, Zhang, Xiantao wrote:
> Currently, ioapic access only exists at the kernel's boot time, to probe
> some info related to the ioapic (e.g. ioapic version, ioapic's rte
> number), and there is no access to the ioapic at runtime; this is why we
> still need the dummy page there.
> To remove the hack, we can use your third method with the existing
> interface PHYSDEVOP_apic_read to read the redirect entry number of the
> ioapic. Attached the patch. What's your opinion ? :)
> From e5a75b3f2f40e56de714818b51932e6f36491f56 Mon Sep 17 00:00:00 2001
> From: Xiantao Zhang <xiantao.zhang@intel.com>
> Date: Mon, 1 Mar 2010 19:06:43 -0500
> Subject: [PATCH] x86: ioapic: Remove the hack for calculating nr_irq_gsi for Xen.
>
> Read the entry number through the hypercall PHYSDEVOP_apic_read, but
> the default value is also set to 255 if PHYSDEVOP_apic_read doesn't
> exist.
>

This doesn't look too bad, but I wonder if there's a cleaner way of
integrating it into the ioapic code path. Hm, nothing obvious. I'd
almost be tempted to just add something like:

	nr_irqs_gsi = xen_probe_gsi(nr_irqs_gsi);

to the end of probe_nr_irqs_gsi(). It does have the downside of adding
Xen-specific code here, but it has the upside of being fairly clear and
to the point, and doesn't add a somewhat arbitrary interface like
set_nr_irqs_gsi().

At the very least, I think we can get rid of get_nr_irqs_gsi() as we add
set_nr_irqs_gsi()...

And should set_nr_irqs_gsi() refuse to decrease nr_irqs_gsi? Perhaps it
should be add_nr_irqs_gsi()?

    J
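The two interface shapes floated above could look roughly like the following sketch. Neither function exists in the thread's patches. The names carry a model_ prefix to mark them as hypothetical, and the "never shrink" behaviour is one possible reading of the question about refusing to decrease nr_irqs_gsi.

```c
#include <assert.h>

/*
 * Hypothetical shape of a probe hook called at the end of GSI probing:
 * take the generic result and return a count that is never smaller.
 */
static int model_xen_probe_gsi(int nr_irqs_gsi, int xen_reported_gsi)
{
    assert(nr_irqs_gsi >= 0);
    return xen_reported_gsi > nr_irqs_gsi ? xen_reported_gsi : nr_irqs_gsi;
}

/*
 * Hypothetical add-only alternative to a setter: a non-positive request
 * leaves the count unchanged, so callers cannot decrease it.
 */
static int model_add_nr_irqs_gsi(int nr_irqs_gsi, int extra)
{
    assert(nr_irqs_gsi >= 0);
    return extra > 0 ? nr_irqs_gsi + extra : nr_irqs_gsi;
}
```

Either shape keeps the invariant in one place instead of relying on every caller of a plain setter to pass a sane value.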
Zhang, Xiantao
2010-Mar-05  02:41 UTC
RE: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Jeremy Fitzhardinge wrote:
> On 03/01/2010 04:34 AM, Zhang, Xiantao wrote:
>> Currently, ioapic access only exists at the kernel's boot time, to
>> probe some info related to the ioapic (e.g. ioapic version, ioapic's
>> rte number), and there is no access to the ioapic at runtime; this is
>> why we still need the dummy page there. To remove the hack, we can use
>> your third method with the existing interface PHYSDEVOP_apic_read to
>> read the redirect entry number of the ioapic. Attached the patch.
>> What's your opinion ? :)
>> From e5a75b3f2f40e56de714818b51932e6f36491f56 Mon Sep 17 00:00:00 2001
>> From: Xiantao Zhang <xiantao.zhang@intel.com>
>> Date: Mon, 1 Mar 2010 19:06:43 -0500
>> Subject: [PATCH] x86: ioapic: Remove the hack for calculating
>> nr_irq_gsi for Xen.
>>
>> Read the entry number through the hypercall PHYSDEVOP_apic_read, but
>> the default value is also set to 255 if PHYSDEVOP_apic_read doesn't
>> exist.
>>
>
> This doesn't look too bad, but I wonder if there's a cleaner way of
> integrating it into the ioapic code path. Hm, nothing obvious. I'd
> almost be tempted to just add something like:
>
> 	nr_irqs_gsi = xen_probe_gsi(nr_irqs_gsi);
>
> to the end of probe_nr_irqs_gsi(). It does have the downside of
> adding Xen-specific code here, but it has the upside of being fairly
> clear and to the point, and doesn't add a somewhat arbitrary
> interface like set_nr_irqs_gsi().
>
> At the very least, I think we can get rid of get_nr_irqs_gsi() as we
> add set_nr_irqs_gsi()...
>
> And should set_nr_irqs_gsi() refuse to decrease nr_irqs_gsi? Perhaps
> it should be add_nr_irqs_gsi()?

Okay, add_nr_irqs_gsi may be a better option, and it can avoid
{get,set}_nr_irqs_gsi as well.
I will cook a patch for that.

Xiantao
Ian Campbell
2010-Mar-19  15:07 UTC
RE: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On Fri, 2010-03-05 at 02:41 +0000, Zhang, Xiantao wrote:
> Jeremy Fitzhardinge wrote:

> Okay, add_nr_irqs_gsi may be a better option, and it can avoid
> {get,set}_nr_irqs_gsi as well.
> I will cook a patch for that.

Sorry, I've kind of dropped the ball on keeping up with this thread.

It looks like you guys have got the issue of identifying the correct
number of physical interrupts and handling the APIC setup correctly etc
in hand but do we also need to consider taking the patch in
http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00028.html
as well in order to ensure we have plenty of dynamic interrupts until
such a time as they can be fully dynamically allocated?

Ian.
Thomas Schwinge
2010-Mar-19  18:11 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
Hello!

Ian, thanks for pointing me to this patch:

On Mon, Mar 01, 2010 at 11:27:17AM +0000, Ian Campbell wrote:
> From 6d4a9168207ade237098a401270959ecc0bdd1e9 Mon Sep 17 00:00:00 2001
> From: Ian Campbell <ian.campbell@citrix.com>
> Date: Mon, 1 Mar 2010 11:21:15 +0000
> Subject: [PATCH] xen: allow some overhead in IRQ space for dynamic IRQs
>  such as VIRQs and backend event channels.
>
> This is an interim solution until x86 interrupts become totally dynamic.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
>  arch/x86/include/asm/irq_vectors.h |   14 +++++++++++---
>  arch/x86/kernel/apic/io_apic.c     |    4 ++++
>  arch/x86/xen/enlighten.c           |    4 ++++
>  3 files changed, 19 insertions(+), 3 deletions(-)

This patch fixes the issue (``No available IRQ to bind to: increase
nr_irqs!'') that I reported in
<http://lists.debian.org/debian-kernel/2010/03/msg00665.html>, thus feel
free to add ``Tested-by: Thomas Schwinge <thomas@schwinge.name>''.

I put this one commit on top of whatever Debian's
linux-image-2.6.32-4-xen-amd64 2.6.32-10 package has, and am now running
that one on my AMD Sempron system: uni-processor, no hardware
virtualization.

Regards,
 Thomas
Jeremy Fitzhardinge
2010-Mar-19  23:45 UTC
Re: [Xen-devel] CONFIG_SPARSE_IRQ breaks single VCPU domain 0 between xen/master and xen/next
On 03/19/2010 08:07 AM, Ian Campbell wrote:
> It looks like you guys have got the issue of identifying the correct
> number of physical interrupts and handling the APIC setup correctly etc
> in hand but do we also need to consider taking the patch in
> http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00028.html
> as well in order to ensure we have plenty of dynamic interrupts until
> such a time as they can be fully dynamically allocated?
>

Just reminded myself of this by booting a UP nested Xen. Pulled.

    J