Konrad Rzeszutek Wilk
2013-May-22 16:21 UTC
[Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
During the hackathon we chatted about Intel APIC virtualization and how it works with the current Linux PVHVM. Or rather, how it does not, per my understanding. I am trying to visualize how this would work with a 10Gb NIC that is passed through to a guest.

This slide deck (starting at page 6) gives an idea of what it is: http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-virt-intel-vt-feat-nakajima.pdf

Page 9 goes into the details of what this does - APIC reads don't trap, but writes do cause a VMEXIT. OK, that is similar to how the PVHVM callback vector + events work. If the NIC raises an interrupt, it hits the hypervisor, which sets the right event channel. Then the guest is interrupted with vector 0xf3 (the callback vector), goes straight to __xen_evtchn_do_upcall, reads the event channel, and calls the NIC driver's IRQ handler. If it needs to write (say to do an IPI or mask another CPU's IRQ) it will do a hypercall and exit (there are optimizations to avoid this if the masking, etc., is done on the local CPU). For that case the PVHVM event channel machinery gives the same benefit.

The next part is "Virtual-interrupt delivery". Here it says: "CPU delivers virtual interrupts to guest (including virtual IPIs)." Not much in the way of details, but this slide deck gives a better idea (pages 7 and 8) and then goes into details: http://www.linux-kvm.org/wiki/images/7/70/2012-forum-nakajima_apicv.pdf
The Intel Software Developer's Manual, starting at section 29.1, also covers it in detail.

Per my understanding, the CPU sets the SVI and RVI to tell the hypervisor which vector is currently being executed and which one is coming next. Those vectors are chosen by the OS. It could use vector 0xfa for a NIC driver and a lower one for IPIs or such. The hypervisor sets a VISR (a bitmap) of all the vectors that a guest is allowed to execute without a VMEXIT. In all likelihood it will just mask out the vectors it is using itself and let the guest have free range.

Which means that if this is set to be higher than the hypervisor timer or IPI callback, the guest can run unbounded. Also, it would seem that this value has to be reset often when migrating a guest between pCPUs. And it would appear that this value is static - meaning the guest only sets these vectors once and the hypervisor is responsible for managing the priority of that guest and other guests (say dom0) on the CPU.

For example, we have a guest with a 10Gb NIC and the guest has decided to use vector 0x80 for it (assume a UP guest). Dom0 has a SAS controller and is using event numbers 30, 31, 32, and 33 (there are only 4 pCPUs). The hypervisor maps them to 0x58, 0x68, 0x78, and 0x88 and spreads those vectors across the pCPUs. The guest is running on pCPU1 and there are two vectors - 0x80 and 0x58. The one assigned to the guest wins and the dom0 SAS controller is preempted.

The solution for that seems to require some interaction with the guest when it allocates the vectors so that they are always below the dom0 priority vectors. Or the hypervisor has to dynamically shuffle its own vectors to be higher priority.

Or is there a guest vector <-> hypervisor vector lookup table that the CPU can use? So the hypervisor can say: vector 0x80 in the guest actually maps to vector 0x48 in the hypervisor?

Now, the above example assumed a simple HVM Linux kernel that does not use PV extensions. Currently Linux on HVM will enable the event system and use one vector for a callback (0xf3).
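As a concrete illustration of the read side, here is a rough, untested C sketch of the guest's upcall path (evtchn_pending and evtchn_handlers are made-up stand-ins for the real shared-info structures in drivers/xen/events.c):

    #include <stdint.h>

    #define NR_EVENT_CHANNELS 1024

    /* Stand-ins for the pending bitmap in the shared-info page and the
     * per-port handler table; illustrative only. */
    static uint64_t evtchn_pending[NR_EVENT_CHANNELS / 64];
    static void (*evtchn_handlers[NR_EVENT_CHANNELS])(unsigned int port);

    /* Guest entry point for the callback vector (0xf3).  Delivery is
     * cheap: the guest just scans a bitmap it shares with the
     * hypervisor, so no VMEXIT is taken on the read side. */
    void xen_evtchn_do_upcall_sketch(void)
    {
        for (unsigned int port = 0; port < NR_EVENT_CHANNELS; port++) {
            uint64_t mask = 1ULL << (port % 64);
            if (evtchn_pending[port / 64] & mask) {
                evtchn_pending[port / 64] &= ~mask;  /* ack the event */
                if (evtchn_handlers[port])
                    evtchn_handlers[port](port);     /* e.g. the NIC's IRQ handler */
            }
        }
    }

(Masking an event on a remote vCPU, by contrast, is a write-side operation and goes through a hypercall.)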
For this to work, where we mix the event callback and a real physical device vector along with access to the virtual APIC, we would need some knowledge of which devices (or vectors) can use the event path and which cannot. Am I on the right track?
Jan Beulich
2013-May-23 07:49 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
>>> On 22.05.13 at 18:21, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> [...]
> The solution for that seems to require some interaction with the
> guest when it allocates the vectors so that they are always below
> the dom0 priority vectors. Or the hypervisor has to dynamically shuffle
> its own vectors to be higher priority.
>
> Or is there a guest vector <-> hypervisor vector lookup table that
> the CPU can use? So the hypervisor can say: vector 0x80 in the
> guest actually maps to vector 0x48 in the hypervisor?

It is my understanding that the vector spaces are separate, and hence guest interrupts can't block host ones (like the timer). Iirc there's some sort of flag bit in the IRTE to tell whether an interrupt should get delivered directly to the guest, or to the hypervisor.
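If I read the VT-d spec correctly, the distinction looks roughly like this (an abbreviated, non-authoritative C sketch; the field positions are illustrative, check the spec before relying on them):

    #include <stdint.h>

    /* Abbreviated sketch of a VT-d Interrupt Remapping Table Entry.
     * The IM ("IRTE Mode") bit selects between remapped delivery (into
     * the hypervisor's vector space) and posted delivery (recorded in
     * a posted-interrupt descriptor for direct injection into the
     * guest).  Field widths/positions here are not authoritative. */
    struct irte_sketch {
        uint64_t present : 1;
        uint64_t im      : 1;   /* 0 = remapped (host), 1 = posted (guest) */
        uint64_t vector  : 8;   /* host vector, or guest vector when posted */
        uint64_t other   : 54;  /* destination etc., elided */
        uint64_t pi_desc_addr;  /* posted-interrupt descriptor, if im == 1 */
    };

Jan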
Zhang, Yang Z
2013-May-23 08:25 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Jan Beulich wrote on 2013-05-23:
> [...]
> It is my understanding that the vector spaces are separate, and
> hence guest interrupts can't block host ones (like the timer).

Right. Virtual interrupt delivery is only for delivering guest virtual interrupts (from emulated devices and assigned devices), which live in the guest's vector space. It has nothing to do with other guests.

> Iirc there's some sort of flag bit in the IRTE to tell whether an
> interrupt should get delivered directly to the guest, or to the
> hypervisor.

I think you are talking about posted interrupts.

Best regards,
Yang
Konrad Rzeszutek Wilk
2013-May-24 14:30 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Thu, May 23, 2013 at 08:49:50AM +0100, Jan Beulich wrote:
> [...]
> It is my understanding that the vector spaces are separate, and
> hence guest interrupts can't block host ones (like the timer). Iirc
> there's some sort of flag bit in the IRTE to tell whether an interrupt
> should get delivered directly to the guest, or to the hypervisor.

Ah, so the VT-d interrupt remapping table would help in setting Xen's vectors. Got it.
Konrad Rzeszutek Wilk
2013-May-24 14:31 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Thu, May 23, 2013 at 08:25:06AM +0000, Zhang, Yang Z wrote:
> [...]
> Right. Virtual interrupt delivery is only for delivering guest virtual
> interrupts (from emulated devices and assigned devices), which live in
> the guest's vector space. It has nothing to do with other guests.

OK, in which case Linux ~v2.6.32 (when the event callback mechanism was introduced for HVM guests) will _not_ take advantage of this, right?

Is there a way to solve this so that they _will_ take advantage of it?
Zhang, Yang Z
2013-May-27 04:56 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Konrad Rzeszutek Wilk wrote on 2013-05-24:
> [...]
> OK, in which case Linux ~v2.6.32 (when the event callback mechanism was
> introduced for HVM guests) will _not_ take advantage of this, right?

Yes, the event mechanism cannot benefit from it.

> Is there a way to solve this so that they _will_ take advantage of it?

Perhaps not. Virtual interrupt delivery relies on the EOI logic to inject the pending interrupt, but the event channel mechanism doesn't have such a thing.

Best regards,
Yang
Stefano Stabellini
2013-May-27 10:43 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Mon, 27 May 2013, Zhang, Yang Z wrote:
> [...]
> Right. Virtual interrupt delivery is only for delivering guest virtual
> interrupts (from emulated devices and assigned devices), which live in
> the guest's vector space. It has nothing to do with other guests.

I think you mean "It has nothing to do with _the hypervisor_"?

> > OK, in which case Linux ~v2.6.32 (when the event callback mechanism was
> > introduced for HVM guests) will _not_ take advantage of this, right?
> Yes, the event mechanism cannot benefit from it.

I think that Konrad was referring to the vector callback mechanism:

linux side  drivers/xen/events.c:xen_callback_vector
xen side    xen/arch/x86/hvm/irq.c:hvm_set_callback_via

Also see:

commit e5fd1f6505c43440bc2450253c79c80174b693bc
Author: Keir Fraser <keir.fraser@citrix.com>
Date:   Tue May 25 11:28:58 2010 +0100

    x86 hvm: implement vector callback for evtchn delivery

    Signed-off-by: Sheng Yang <sheng@linux.intel.com>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

From the guest point of view it looks like a normal vector callback (similar to an IPI).

> > Is there a way to solve this so that they _will_ take advantage of it?
> Perhaps not. Virtual interrupt delivery relies on the EOI logic to
> inject the pending interrupt, but the event channel mechanism doesn't
> have such a thing.

It's true that we don't do any EOIs with the vector callback mechanism, the same way the operating system doesn't do any EOIs when it receives an IPI.

Can IPIs take advantage of virtual interrupt delivery?
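PS: for reference, the registration side boils down to setting HVM_PARAM_CALLBACK_IRQ. Roughly (a sketch; hypercall_set_param() stands in for the real HVMOP_set_param plumbing, and the "via" encoding follows Xen's public hvm/params.h, where bits 63:56 select the delivery type and type 2 means "vector"):

    #include <stdint.h>

    #define HVM_PARAM_CALLBACK_IRQ   0
    #define CALLBACK_VIA_TYPE_VECTOR 2ULL

    /* Stand-in for the HVMOP_set_param hypercall. */
    extern int hypercall_set_param(unsigned int param, uint64_t value);

    /* Ask Xen to deliver event channel upcalls as a plain vector,
     * e.g. 0xf3. */
    int register_callback_vector(uint8_t vector)
    {
        uint64_t via = (CALLBACK_VIA_TYPE_VECTOR << 56) | vector;
        return hypercall_set_param(HVM_PARAM_CALLBACK_IRQ, via);
    }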
Zhang, Yang Z
2013-May-28 10:51 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Stefano Stabellini wrote on 2013-05-27:
> On Mon, 27 May 2013, Zhang, Yang Z wrote:
> > [...]
> I think you mean "It has nothing to do with _the hypervisor_"?

Yes. The hypervisor and the guest have separate vector spaces.

> I think that Konrad was referring to the vector callback mechanism:

You are right. What I meant was the vector callback mechanism.

> It's true that we don't do any EOIs with the vector callback mechanism,
> the same way the operating system doesn't do any EOIs when it receives
> an IPI.

IPIs also need an EOI.

> Can IPIs take advantage of virtual interrupt delivery?

Best regards,
Yang
Konrad Rzeszutek Wilk
2013-May-28 13:22 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Tue, May 28, 2013 at 10:51:21AM +0000, Zhang, Yang Z wrote:
> Stefano Stabellini wrote on 2013-05-27:
> > [...]
> > It's true that we don't do any EOIs with the vector callback mechanism,
> > the same way the operating system doesn't do any EOIs when it receives
> > an IPI.
> IPIs also need an EOI.

Can we fix this to use the HVM mechanism (which would use the virtual APIC and EOI) for:
 - passthrough devices
 - IPIs
and for everything else use the vector callback mechanism?

Which baremetal mechanism is needed? The x2APIC one?

> > Can IPIs take advantage of virtual interrupt delivery?

From my reading - yes. Yang, that is correct, right?
Stefano Stabellini
2013-May-28 16:10 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Tue, 28 May 2013, Zhang, Yang Z wrote:
> Stefano Stabellini wrote on 2013-05-27:
> > [...]
> > It's true that we don't do any EOIs with the vector callback mechanism,
> > the same way the operating system doesn't do any EOIs when it receives
> > an IPI.
> IPIs also need an EOI.

Ooops, you are right.

Does guest EOI still cause a trap into Xen?
Zhang, Yang Z
2013-May-29 00:40 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Stefano Stabellini wrote on 2013-05-29:
> On Tue, 28 May 2013, Zhang, Yang Z wrote:
> > [...]
> > IPIs also need an EOI.
>
> Ooops, you are right.
>
> Does guest EOI still cause a trap into Xen?

It depends on the bit in the EOI exit bitmap. If it is set, then the EOI will still cause a vmexit (an EOI-induced vmexit). Otherwise, no vmexit happens.

The following pseudocode details the behavior of EOI virtualization:

    Vector ← SVI;
    VISR[Vector] ← 0;
    IF any bits set in VISR
        THEN SVI ← highest index of bit set in VISR
        ELSE SVI ← 0;
    FI;
    perform PPR virtualization;
    IF EOI_exit_bitmap[Vector] = 1
        THEN cause EOI-induced VM exit with Vector as exit qualification;
        ELSE evaluate pending virtual interrupts;
    FI;
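Or, transliterated into a C-style sketch of the same logic (the names are invented; the real behavior is performed by the CPU on the virtual-APIC page and VMCS state, this is just the pseudocode above re-expressed):

    #include <stdint.h>
    #include <stdbool.h>

    #define NVEC 256

    struct vapic {
        bool    visr[NVEC];             /* virtual in-service register bits */
        bool    eoi_exit_bitmap[NVEC];  /* which EOIs must trap to the VMM  */
        uint8_t svi;                    /* highest in-service vector        */
    };

    extern void perform_ppr_virtualization(struct vapic *v);
    extern void eoi_induced_vmexit(uint8_t exit_qualification);
    extern void evaluate_pending_virtual_interrupts(struct vapic *v);

    void virtualize_eoi(struct vapic *v)
    {
        uint8_t vector = v->svi;

        v->visr[vector] = false;

        /* SVI becomes the highest remaining in-service vector, or 0. */
        v->svi = 0;
        for (int i = NVEC - 1; i >= 0; i--) {
            if (v->visr[i]) { v->svi = (uint8_t)i; break; }
        }

        perform_ppr_virtualization(v);

        if (v->eoi_exit_bitmap[vector])
            eoi_induced_vmexit(vector);              /* trap to Xen */
        else
            evaluate_pending_virtual_interrupts(v);  /* no vmexit   */
    }

Best regards,
Yang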
Stefano Stabellini
2013-May-29 10:07 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Wed, 29 May 2013, Zhang, Yang Z wrote:
> [...]
> > Does guest EOI still cause a trap into Xen?
> It depends on the bit in the EOI exit bitmap. If it is set, then the EOI
> will still cause a vmexit (an EOI-induced vmexit). Otherwise, no vmexit
> happens.
>
> The following pseudocode details the behavior of EOI virtualization:
> [...]
>     IF EOI_exit_bitmap[Vector] = 1
>         THEN cause EOI-induced VM exit with Vector as exit qualification;
>         ELSE evaluate pending virtual interrupts;
>     FI;

Thanks for the explanation.

At this point I wonder: would vector callbacks, which don't do any guest EOIs, create any problems for this new virtual interrupt delivery mechanism? If the guest does not do any EOIs after receiving a vector callback, then other pending interrupts are never evaluated (the last ELSE condition in your pseudocode cannot happen), is that correct?

In any case we could consider introducing an ack_APIC_irq() call at the beginning of xen_evtchn_do_upcall, so that the vector callback mechanism can take advantage of posted interrupts too. Of course we would do that only if posted interrupts are available.
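Something like this untested sketch of the proposed change (cpu_has_vid is a made-up flag for "virtual interrupt delivery is in use", and evtchn_scan_and_dispatch() stands in for the existing body of the upcall):

    #include <stdbool.h>

    extern bool cpu_has_vid;
    extern void ack_APIC_irq(void);
    extern void evtchn_scan_and_dispatch(void);

    /* EOI the (virtual) local APIC on entry, so that the "evaluate
     * pending virtual interrupts" branch of the EOI-virtualization
     * pseudocode above can actually run. */
    void xen_evtchn_do_upcall(void)
    {
        if (cpu_has_vid)
            ack_APIC_irq();

        evtchn_scan_and_dispatch();
    }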
Konrad Rzeszutek Wilk
2013-Jun-03 12:59 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
> [...]
> At this point I wonder: would vector callbacks, which don't do any
> guest EOIs, create any problems for this new virtual interrupt delivery
> mechanism? If the guest does not do any EOIs after receiving a vector
> callback, then other pending interrupts are never evaluated (the last
> ELSE condition in your pseudocode cannot happen), is that correct?
>
> In any case we could consider introducing an ack_APIC_irq() call at the
> beginning of xen_evtchn_do_upcall, so that the vector callback mechanism
> can take advantage of posted interrupts too. Of course we would do that
> only if posted interrupts are available.

Or just split the mechanism: use the event callback for "legacy" type events, and for PCI passthrough devices (where the host supports posted interrupts) just use the baremetal implementation.

That would entail some form of hypercall to identify whether a PCIe device is a "posted-interrupt" candidate, and if so not use the event channel mechanism for it.
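Something along these lines, as an entirely hypothetical sketch (PHYSDEVOP_get_posted_intr and its number are made up - no such hypercall exists today):

    #include <stdbool.h>
    #include <stdint.h>

    #define PHYSDEVOP_get_posted_intr 99   /* made-up op number */

    struct physdev_get_posted_intr {
        uint16_t seg, bus, devfn;   /* in: the PCIe device   */
        uint8_t  posted;            /* out: 1 if a candidate */
    };

    extern int hypercall_physdev_op(unsigned int op, void *arg);

    /* Decide per device: baremetal MSI/APIC path for posted-interrupt
     * candidates, event channels for everything else. */
    bool msi_should_use_evtchn(uint16_t seg, uint16_t bus, uint16_t devfn)
    {
        struct physdev_get_posted_intr arg = { seg, bus, devfn, 0 };

        if (hypercall_physdev_op(PHYSDEVOP_get_posted_intr, &arg) == 0 &&
            arg.posted)
            return false;   /* use the baremetal implementation */
        return true;        /* fall back to the event channel mechanism */
    }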
Stefano Stabellini
2013-Jun-03 15:22 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Mon, 3 Jun 2013, Konrad Rzeszutek Wilk wrote:
> [...]
> Or just split the mechanism: use the event callback for "legacy" type
> events, and for PCI passthrough devices (where the host supports posted
> interrupts) just use the baremetal implementation.
>
> That would entail some form of hypercall to identify whether a PCIe
> device is a "posted-interrupt" candidate, and if so not use the event
> channel mechanism for it.

That should also work.
Zhang, Yang Z
2013-Jun-05 00:37 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Stefano Stabellini wrote on 2013-06-03:
> On Mon, 3 Jun 2013, Konrad Rzeszutek Wilk wrote:
> > > At this point I wonder: would vector callbacks, which don't do any
> > > guest EOIs, create any problems for this new virtual interrupt
> > > delivery mechanism? If the guest does not do any EOIs after receiving
> > > a vector callback, then other pending interrupts are never evaluated
> > > (the last ELSE condition in your pseudocode cannot happen), is that
> > > correct?

No. The vector callback mechanism totally bypasses the LAPIC (please correct me if I am wrong): it does not touch the vIRR or vISR, and no EOI is needed. So if we want to benefit from virtual interrupt delivery, we would have to change the vector callback mechanism to use the LAPIC. But that appears to defeat the original purpose of introducing the vector callback.

> > > In any case we could consider introducing an ack_APIC_irq() call at
> > > the beginning of xen_evtchn_do_upcall, so that the vector callback
> > > mechanism can take advantage of posted interrupts too.

Currently, we don't deliver "hvm_intsrc_vector" interrupts via posted interrupts, so we cannot take advantage of posted interrupts even by adding ack_APIC_irq() to xen_evtchn_do_upcall.

> > Or just split the mechanism: use the event callback for "legacy" type
> > events, and for PCI passthrough devices (where the host supports posted
> > interrupts) just use the baremetal implementation.
> >
> > That would entail some form of hypercall to identify whether a PCIe
> > device is a "posted-interrupt" candidate, and if so not use the event
> > channel mechanism for it.

What do you mean by "a PCIe device is a posted-interrupt candidate"? Do you mean pass-through devices currently use event channels?

> That should also work.

Best regards,
Yang
Stefano Stabellini
2013-Jun-05 12:51 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Wed, 5 Jun 2013, Zhang, Yang Z wrote:
> > > Or just split the mechanism: use the event callback for "legacy" type
> > > events, and for PCI passthrough devices (where the host supports
> > > posted interrupts) just use the baremetal implementation.
> > >
> > > That would entail some form of hypercall to identify whether a PCIe
> > > device is a "posted-interrupt" candidate, and if so not use the event
> > > channel mechanism for it.
> What do you mean by "a PCIe device is a posted-interrupt candidate"? Do
> you mean pass-through devices currently use event channels?

On Linux, yes: Linux is going to remap the MSI/MSI-X onto an event channel. Therefore, if we want to use posted interrupts with pass-through devices, we would need to disable event channel remapping for them.
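The decision point would be in the guest's MSI setup path; schematically (a sketch, not the literal arch/x86/pci/xen.c code, and device_is_posted_intr_candidate() is hypothetical):

    #include <stdbool.h>

    struct pci_dev;

    extern bool device_is_posted_intr_candidate(struct pci_dev *dev); /* hypothetical probe */
    extern int  xen_bind_msi_to_pirq(struct pci_dev *dev);   /* current evtchn remap path */
    extern int  native_setup_msi_irqs(struct pci_dev *dev);  /* baremetal MSI path */

    /* Today the HVM path unconditionally remaps the MSI onto a
     * pirq/event channel; for posted interrupts we'd take the native
     * path instead. */
    int hvm_setup_msi_sketch(struct pci_dev *dev)
    {
        if (device_is_posted_intr_candidate(dev))
            return native_setup_msi_irqs(dev);
        return xen_bind_msi_to_pirq(dev);
    }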
Zhang, Yang Z
2013-Jun-06 03:08 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Stefano Stabellini wrote on 2013-06-05:
> On Wed, 5 Jun 2013, Zhang, Yang Z wrote:
> > What do you mean by "a PCIe device is a posted-interrupt candidate"? Do
> > you mean pass-through devices currently use event channels?
>
> On Linux, yes: Linux is going to remap the MSI/MSI-X onto an event
> channel. Therefore, if we want to use posted interrupts with
> pass-through devices, we would need to disable event channel remapping
> for them.

Then we must disable event channel remapping if posted interrupts are used.

Best regards,
Yang
Konrad Rzeszutek Wilk
2013-Jun-06 13:02 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Thu, Jun 06, 2013 at 03:08:07AM +0000, Zhang, Yang Z wrote:
> Stefano Stabellini wrote on 2013-06-05:
> > On Linux, yes: Linux is going to remap the MSI/MSI-X onto an event
> > channel. Therefore, if we want to use posted interrupts with
> > pass-through devices, we would need to disable event channel remapping
> > for them.
> Then we must disable event channel remapping if posted interrupts are used.

That was my understanding as well. Or at least keep some form of event channel for inter-domain communication (netback/netfront, blkback/blkfront) - which should still be present. Then for PCIe (so MSI and MSI-X) and IPIs, do not use the event channel mechanism and just use the normal HVM-type ack system.

Any ETA on when patches for this would surface?
Konrad Rzeszutek Wilk
2013-Jul-15 14:54 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
On Thu, Jun 06, 2013 at 03:08:07AM +0000, Zhang, Yang Z wrote:
> Stefano Stabellini wrote on 2013-06-05:
> > On Linux, yes: Linux is going to remap the MSI/MSI-X onto an event
> > channel. Therefore, if we want to use posted interrupts with
> > pass-through devices, we would need to disable event channel remapping
> > for them.
> Then we must disable event channel remapping if posted interrupts are used.

When can we expect some of these patches to be posted? Thanks!
Zhang, Yang Z
2013-Jul-16 02:12 UTC
Re: [Xenhackthon] Virtualized APIC registers - virtual interrupt delivery.
Konrad Rzeszutek Wilk wrote on 2013-07-15:
> [...]
> > Then we must disable event channel remapping if posted interrupts are
> > used.
>
> When can we expect some of these patches to be posted? Thanks!

Sorry, I don't have much time to work on this. Perhaps it will be delayed.

Best regards,
Yang