Ian Campbell
2011-Feb-16 14:59 UTC
[Xen-devel] Xen and incorporating event channels in to nr_irqs
Thomas,

I wonder if you have any advice on the following.

Under Xen we have 1024 event channels per-VM which can be injected into any VCPU. We map these into IRQs and inject them into the system through the generic IRQ mechanisms.

Event channels are independent from the normal x86 concept of a vector, although the two can coexist, e.g. in an HVM guest with PV extensions you get both 256 vectors per CPU and 1024 event channels.

In some cases there is some rough equivalence between event channels and x86 vectors. Specifically, in domain 0 or HVM guests with the right PV extensions, host GSIs or emulated GSIs respectively can be bound to an event channel as a "pirq". In this case we allocate IRQs such that GSI==IRQ for consistency with the same kernel running natively.

For all other event channels we allocate the IRQs dynamically. Since both event channels and x86 vectors can exist simultaneously we always allocate an IRQ for dynamic event channels from above nr_irqs_gsi (somewhat similar to MSIs on native I guess). Since nr_irqs_gsi under Xen is always an overestimate compared with the actual number of host GSIs (or accurate in the HVM with PV extensions case) there is no problem with clashes between the 1-1 GSI==IRQ range and the dynamic range.

However, because nr_irqs on x86, including when running under Xen, is derived from NR_VECTORS * nr_cpu_ids, it is often the case that we can run out of available IRQ numbers above the nr_irqs_gsi limit. In fact it is sometimes the case that nr_irqs_gsi >= nr_irqs, in which case no dynamic event channels can be allocated at all!

To work around this Xen currently tries to allocate an IRQ from nr_irqs_gsi..nr_irqs, but if that doesn't work it will fall back to using the IRQ space below nr_irqs_gsi. This risks clashing with allocation in the 1-1 GSI<->IRQ region.
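The allocation behaviour described above can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions, not the code in the Xen events driver: the names sketch_irq_space, sketch_find_unbound_irq and SKETCH_MAX_IRQS are hypothetical, the "used" array stands in for whatever bookkeeping the kernel really does, and the downward search order in the fallback is an assumption (the mail only says that the space below nr_irqs_gsi is used).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical table size, chosen only for this illustration. */
#define SKETCH_MAX_IRQS 64

struct sketch_irq_space {
	int nr_irqs_gsi;             /* first IRQ above the 1-1 GSI==IRQ range */
	int nr_irqs;                 /* total number of IRQ numbers available */
	bool used[SKETCH_MAX_IRQS];  /* true once an IRQ number is bound */
};

/*
 * Pick an IRQ number for a dynamic event channel.
 * Returns the IRQ number, or -1 if every number is taken.
 */
int sketch_find_unbound_irq(struct sketch_irq_space *s)
{
	int irq, top;

	assert(s->nr_irqs >= 0 && s->nr_irqs <= SKETCH_MAX_IRQS);
	assert(s->nr_irqs_gsi >= 0);

	/* Preferred range: nr_irqs_gsi .. nr_irqs - 1 (may be empty). */
	for (irq = s->nr_irqs_gsi; irq < s->nr_irqs; irq++) {
		if (!s->used[irq]) {
			s->used[irq] = true;
			return irq;
		}
	}

	/*
	 * Workaround: fall back to the space below nr_irqs_gsi. Searching
	 * downwards is an assumption of this sketch. Any number handed out
	 * here may later be wanted by a GSI, which is the clash described.
	 */
	top = s->nr_irqs_gsi < s->nr_irqs ? s->nr_irqs_gsi : s->nr_irqs;
	for (irq = top - 1; irq >= 0; irq--) {
		if (!s->used[irq]) {
			s->used[irq] = true;
			return irq;
		}
	}

	return -1;
}
```

With nr_irqs_gsi = 16 and nr_irqs = 18 only two dynamic IRQs fit in the preferred range, so the third request already lands in the GSI region. With nr_irqs_gsi >= nr_irqs the preferred range is empty and every request takes the fallback path.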
I'd very much like to remove this workaround (better described as a hack I think) but in order to do so I need to make sure there are plenty of IRQs between nr_irqs_gsi and nr_irqs. Effectively what we would like to do is:
        nr_irqs += NR_EVENT_CHANNELS;
somewhere, except obviously we don't want to just drop that into generic code!

Do you have any hints as to an appropriate existing interface which Xen could use here?

If not, any suggestions for what sort of interface might be acceptable to add?

For example I was wondering about adding x86_info.irqs.probe_nr_irqs, which returns a platform-specific additional number of IRQs, and having arch_probe_nr_irqs add that value into its calculations.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Thomas Gleixner
2011-Feb-16 15:56 UTC
[Xen-devel] Re: Xen and incorporating event channels in to nr_irqs
Ian,

On Wed, 16 Feb 2011, Ian Campbell wrote:
> I'd very much like to remove this workaround (better described as a hack
> I think) but in order to do so I need to make sure there are plenty of
> IRQs between nr_irqs_gsi and nr_irqs. Effectively what we would like to
> do is:
>         nr_irqs += NR_EVENT_CHANNELS;
> somewhere, except obviously we don't want to just drop that into generic
> code!
>
> Do you have any hints as to an appropriate existing interface which
> Xen could use here?
>
> If not, any suggestions for what sort of interface might be acceptable to
> add?
>
> For example I was wondering about adding x86_info.irqs.probe_nr_irqs,
> which returns a platform-specific additional number of IRQs, and having
> arch_probe_nr_irqs add that value into its calculations.

I'm about to remove the nr_irqs NR_IRQS limitation. It's silly when we deal with sparse irqs. So the idea is to have the initial nr_irqs set in early boot to have a sensible size for allocating stuff. Later on we can expand nr_irqs when the need arises.

It's not only Xen which wants to eliminate the limitation. Think about irq expanders which are detected late in the boot. We have no sensible way to reserve enough numbers for them at early boot as we don't know whether that hardware is there or not.

So my plan for .39 is to ignore the NR_IRQS limitation in the sparse case and make nr_irqs expandable, of course with a sensible upper limit in the core code itself. It's basically the allocation bitmap which limits it, but I doubt we'll hit 1 Million irq numbers in the foreseeable future.

Thanks,

	tglx
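The plan outlined in this reply (an initial nr_irqs that can grow on demand, capped only by the size of the allocation bitmap) might look roughly like the following user-space C sketch. It is an illustration of the idea, not the genirq implementation: every identifier prefixed with sketch_ or SKETCH_ is hypothetical, and the 4096-bit bitmap size is an arbitrary value picked for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical hard upper limit: the size of the allocation bitmap. */
#define SKETCH_IRQ_BITMAP_BITS 4096u
#define SKETCH_BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)

struct sketch_irq_alloc {
	unsigned int nr_irqs;  /* current soft limit, set small at "early boot" */
	unsigned long bitmap[SKETCH_IRQ_BITMAP_BITS / SKETCH_BITS_PER_LONG];
};

/* True if the given IRQ number is already allocated. */
bool sketch_irq_in_use(const struct sketch_irq_alloc *a, unsigned int irq)
{
	return (a->bitmap[irq / SKETCH_BITS_PER_LONG] >>
		(irq % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Raise nr_irqs to at least nr; fail only when the bitmap cannot hold it. */
int sketch_expand_nr_irqs(struct sketch_irq_alloc *a, unsigned int nr)
{
	if (nr > SKETCH_IRQ_BITMAP_BITS)
		return -1;
	if (nr > a->nr_irqs)
		a->nr_irqs = nr;
	return 0;
}

/*
 * Allocate the first free IRQ number >= from, growing nr_irqs when the
 * number found lies beyond the current limit. Returns -1 when the bitmap
 * has no free bit at or above "from".
 */
int sketch_alloc_irq(struct sketch_irq_alloc *a, unsigned int from)
{
	unsigned int irq;

	assert(a->nr_irqs <= SKETCH_IRQ_BITMAP_BITS);

	for (irq = from; irq < SKETCH_IRQ_BITMAP_BITS; irq++) {
		if (sketch_irq_in_use(a, irq))
			continue;
		if (sketch_expand_nr_irqs(a, irq + 1) != 0)
			return -1;
		a->bitmap[irq / SKETCH_BITS_PER_LONG] |=
			1UL << (irq % SKETCH_BITS_PER_LONG);
		return (int)irq;
	}

	return -1;
}
```

In this model a caller such as Xen would simply ask for an IRQ at or above nr_irqs_gsi. A request that would previously have failed because nr_irqs was too small now succeeds and raises nr_irqs, so no fallback into the GSI range is needed.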
Ian Campbell
2011-Feb-16 16:12 UTC
[Xen-devel] Re: Xen and incorporating event channels in to nr_irqs
On Wed, 2011-02-16 at 15:56 +0000, Thomas Gleixner wrote:
> I'm about to remove the nr_irqs NR_IRQS limitation. It's silly when we
> deal with sparse irqs. So the idea is to have the initial nr_irqs set
> in early boot to have a sensible size for allocating stuff. Later on
> we can expand nr_irqs when the need arises.
>
> It's not only Xen which wants to eliminate the limitation. Think about
> irq expanders which are detected late in the boot. We have no sensible
> way to reserve enough numbers for them at early boot as we don't know
> whether that hardware is there or not.
>
> So my plan for .39 is to ignore the NR_IRQS limitation in the sparse
> case and make nr_irqs expandable, of course with a sensible upper limit
> in the core code itself. It's basically the allocation bitmap which
> limits it, but I doubt we'll hit 1 Million irq numbers in the
> foreseeable future.

That sounds ideal, thanks!

I was hoping to get rid of the workaround in Xen events.c in the 2.6.39 timeframe too.

If you let me know when you have something I can test, I'll combine it with the Xen side and give it a spin.

On a vaguely related note, what is the future of non-sparse IRQs (on x86 and/or generally)?

Ian.
Thomas Gleixner
2011-Feb-16 16:25 UTC
[Xen-devel] Re: Xen and incorporating event channels in to nr_irqs
On Wed, 16 Feb 2011, Ian Campbell wrote:
> On Wed, 2011-02-16 at 15:56 +0000, Thomas Gleixner wrote:
> > I'm about to remove the nr_irqs NR_IRQS limitation. It's silly when we
> > deal with sparse irqs. So the idea is to have the initial nr_irqs set
> > in early boot to have a sensible size for allocating stuff. Later on
> > we can expand nr_irqs when the need arises.
> >
> > It's not only Xen which wants to eliminate the limitation. Think about
> > irq expanders which are detected late in the boot. We have no sensible
> > way to reserve enough numbers for them at early boot as we don't know
> > whether that hardware is there or not.
> >
> > So my plan for .39 is to ignore the NR_IRQS limitation in the sparse
> > case and make nr_irqs expandable, of course with a sensible upper limit
> > in the core code itself. It's basically the allocation bitmap which
> > limits it, but I doubt we'll hit 1 Million irq numbers in the
> > foreseeable future.
>
> That sounds ideal, thanks!
>
> I was hoping to get rid of the workaround in Xen events.c in the 2.6.39
> timeframe too.
>
> If you let me know when you have something I can test I'll combine with
> the Xen side and give it a spin.
>
> On a vaguely related note, what is the future of non-sparse IRQs (on x86
> and/or generally)?

In general I want to switch everything over to SPARSE_IRQ. When the open coded access to irq_desc[] is gone, which should be mostly the case in .39, then switching everything over should be a smooth thing. For those archs which do not want to adjust the numbers dynamically we simply allocate NR_IRQS in early_irq_init(). So they won't even notice.

Thanks,

	tglx
Ian Campbell
2011-Feb-16 16:27 UTC
[Xen-devel] Re: Xen and incorporating event channels in to nr_irqs
On Wed, 2011-02-16 at 16:25 +0000, Thomas Gleixner wrote:
> On Wed, 16 Feb 2011, Ian Campbell wrote:
> > On a vaguely related note, what is the future of non-sparse IRQs (on x86
> > and/or generally)?
>
> In general I want to switch everything over to SPARSE_IRQ. When the
> open coded access to irq_desc[] is gone, which should be mostly the
> case in .39, then switching everything over should be a smooth
> thing. For those archs which do not want to adjust the numbers
> dynamically we simply allocate NR_IRQS in early_irq_init(). So they
> won't even notice.

Sweet, I won't worry myself unduly over the non-SPARSE_IRQ case then.

Thanks,
Ian.