On 2014/11/11 23:11, Pawel Moll wrote:
> On Tue, 2014-11-04 at 09:35 +0000, Shannon Zhao wrote:
>> As the current virtio-mmio only supports a single irq, some advanced features such as vhost-net with irqfd are not supported. And net performance is not the best without vhost-net and irqfd support.
>
> Could you, please, help me understand where the main issue is? Is it about:
>
> 1. The fact that the existing implementation blindly kicks all queues, instead of only the updated ones?
>
> or:
>
> 2. Literally having a dedicated interrupt line (remember, we're talking "real" interrupts here, not message signalled ones) per queue, so they can be handled by different processors at the same time?
>
The main issue is that the current virtio-mmio supports only one interrupt, which is shared by config and queues. Therefore the virtio-mmio driver has to read "VIRTIO_MMIO_INTERRUPT_STATUS" to get the interrupt reason and work out whom the interrupt is for. If we use vhost-net, which uses irqfd to inject the interrupt, vhost-net doesn't update "VIRTIO_MMIO_INTERRUPT_STATUS", so the guest driver can't read the interrupt reason and never calls a handler to process it. So we can assign a dedicated interrupt line per queue to virtio-mmio and it can work with irqfd.

> Now, if it's only about 1, the simplest solution would be to extend the VIRTIO_MMIO_INTERRUPT_STATUS register to signal up to 30 queues' "readiness" in bits 2-31, still keeping bit 0 as a "combined" VIRTIO_MMIO_INT_VRING. In the case when VIRTIO_MMIO_INT_VRING is set and none of the "individual" bits is (a device which doesn't support this feature, or one that has more than 30 queues and one of those is ready), we would fall back to the original "kick all queues" approach. This could be a useful (and pretty simple) extension. In the worst case scenario it could be a post-1.0 standard addition, as it would provide backward compatibility.
>
> However, if it's about 2, we're talking larger changes here. From the device perspective, we can define it as having a per-queue (plus one for config) interrupt output *and* a "combined" output, being a simple logical "or" of all the others. Then, the Device Tree bindings would be used to express the implementation choices (I'd keep the kernel parameter approach supporting the single interrupt case only). This is a very popular and well understood approach for memory mapped peripherals (for example, see the . It allows the system integrator to make a decision when it comes to the latency vs number of interrupt lines trade-off. The main issue is that we can't really impose a limit on the number of queues, and therefore on the number of interrupts. This would require adding a new "interrupt acknowledge" register, which would take the number of the queue (or a symbolic value for the config one) instead of a bit mask.
Yes, maybe we should add a new "interrupt acknowledge" register for the backend and frontend to consult about the number of queues.
> And I must say that I'm not enjoying the idea of such a substantial change to the specification this late in the process... (in other words: you'll have to put extra effort into convincing me :-)
>
>> This patch supports virtio-mmio requesting multiple irqs like virtio-pci. With this patch and qemu assigning multiple irqs to a virtio-mmio device, it's ok to use vhost-net with irqfd on arm/arm64.
>
> Could you please tell me how many queues (interrupts) are we talking about in this case? 5? A dozen? Hundreds?
Theoretically the number of interrupts has no limit, but given the limit on ARM interrupt lines, the number should be less than the number of available ARM interrupt lines. In the real situation, I think, the number is generally less than 17 (8 pairs of vring interrupts and one config interrupt).

> Disclaimer: I have no personal experience with virtio and networking (due to the way our Fast Models are implemented, I mostly use block devices and the 9p protocol over virtio, and I get enough performance from them :-).
>
>> As arm doesn't support msi-x now,
>
> To be precise: "ARM" does "support" MSI-X :-) (google for GICv2m)
Sorry, I meant ARM with GICv2.
>
> The correct statement would be: "normal memory mapped devices have no interface for message signalled interrupts (like MSI-X)"
>
Yes, that's right.
>> we use GSI for multiple irq.
>
> I'm not sure what GSI stands for, but looking at the code I assume it's just a "normal" peripheral interrupt.
>
>> In this patch we use "vm_try_to_find_vqs" to check whether multiple irqs are supported, like virtio-pci.
>
> Yeah, I can see that you have followed virtio-pci quite literally. I'm particularly not convinced by the one interrupt for config, one for all queues option. Doesn't make any sense to me here.
>
About one interrupt for all queues: it's not a typical case, but it offers one more choice for users. Users should configure the number of interrupts according to their situation.
>> Is this the right direction? Is there another way to make virtio-mmio support multiple irqs? Hope for feedback.
>
> One point I'd like to make is that the device was intentionally designed with simplicity in mind first, performance later (something about "embedded" etc. :-). Changing this assumption is of course possible, but
Ah, I think ARM is not only about embedded things. Maybe it could have wider applications, such as micro servers. Just my personal opinion.
> - I must say - makes me slightly uncomfortable... The extensions we're discussing here seem doable, but I've noticed your other patches dealing with a shared memory region and I didn't like them at all, sorry.
>
The approach with a shared memory region has been dropped, as you can see from the mailing list. The approach in this patch gets a net performance improvement of about 30%. This may make sense for platforms without MSI support (e.g. ARM with GICv2).
> I see the subject has already been touched in the discussions, but let me bring PCI to the surface again. We're getting more server-class SoCs in the market, which obviously bring PCI with them to both the arm and arm64 worlds, something unheard of in the "mobile past". I believe the PCI patches for arm64 have already been merged in the kernel.
>
> Therefore: I'm not your boss so, obviously, I can't tell you what to do, but could you consider redirecting your efforts into getting "ARM PCI" up and running in qemu, so you can simply use the existing infrastructure? This would save us a lot of work and pain in doing late functional changes to the standard, and it will probably be more future-proof from your perspective (PCI will happen, sooner or later - you can make it sooner ;-)
>
> Regards
>
> Pawel

-- Shannon
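[The shared-interrupt demultiplexing discussed in the exchange above looks roughly like the sketch below on the guest side. This is a simplified adaptation of the handler in the kernel's drivers/virtio/virtio_mmio.c, not the verbatim driver code: the internal structs are trimmed down and the virtqueue-list locking is omitted.]

	#include <linux/interrupt.h>
	#include <linux/io.h>
	#include <linux/list.h>
	#include <linux/virtio.h>
	#include <linux/virtio_ring.h>
	#include <linux/virtio_mmio.h>

	/* Trimmed-down versions of the driver's internal types */
	struct virtio_mmio_vq_info {
		struct virtqueue *vq;
		struct list_head node;
	};

	struct virtio_mmio_device {
		struct virtio_device vdev;
		void __iomem *base;
		struct list_head virtqueues;	/* of struct virtio_mmio_vq_info */
	};

	/* One shared line for everything: the handler must read the status
	 * register to learn whether config space or some virtqueue needs
	 * attention, and has to kick every queue in the latter case. */
	static irqreturn_t vm_interrupt(int irq, void *opaque)
	{
		struct virtio_mmio_device *vm_dev = opaque;
		struct virtio_mmio_vq_info *info;
		unsigned long status;
		irqreturn_t ret = IRQ_NONE;

		/* Read and acknowledge the interrupt reason(s) */
		status = readl(vm_dev->base + VIRTIO_MMIO_INTERRUPT_STATUS);
		writel(status, vm_dev->base + VIRTIO_MMIO_INTERRUPT_ACK);

		if (status & VIRTIO_MMIO_INT_CONFIG) {
			virtio_config_changed(&vm_dev->vdev);
			ret = IRQ_HANDLED;
		}

		if (status & VIRTIO_MMIO_INT_VRING) {
			/* No per-queue information: kick all virtqueues */
			list_for_each_entry(info, &vm_dev->virtqueues, node)
				ret |= vring_interrupt(irq, info->vq);
		}

		return ret;
	}

[An irqfd-injected interrupt arrives with the status register still reading 0, so both branches are skipped and the handler returns IRQ_NONE - exactly the failure mode Shannon describes.]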
On Wed, 2014-11-12 at 08:32 +0000, Shannon Zhao wrote:
> On 2014/11/11 23:11, Pawel Moll wrote:
> > On Tue, 2014-11-04 at 09:35 +0000, Shannon Zhao wrote:
> >> As the current virtio-mmio only supports a single irq, some advanced features such as vhost-net with irqfd are not supported. And net performance is not the best without vhost-net and irqfd support.
> >
> > Could you, please, help me understand where the main issue is? Is it about:
> >
> > 1. The fact that the existing implementation blindly kicks all queues, instead of only the updated ones?
> >
> > or:
> >
> > 2. Literally having a dedicated interrupt line (remember, we're talking "real" interrupts here, not message signalled ones) per queue, so they can be handled by different processors at the same time?
> >
> The main issue is that the current virtio-mmio supports only one interrupt, which is shared by config and queues.

Yes.

Additional question: is your device changing the configuration space often? Other devices (for example block devices) change configuration very rarely (e.g. a change of the media size, usually meaning hot-un-plugging), so unless you tell me that the config space changes frequently, I'll consider the config event irrelevant from the performance point of view.

> Therefore the virtio-mmio driver has to read "VIRTIO_MMIO_INTERRUPT_STATUS" to get the interrupt reason and work out whom the interrupt is for.

Correct.

> If we use vhost-net, which uses irqfd to inject the interrupt, vhost-net doesn't update "VIRTIO_MMIO_INTERRUPT_STATUS", so the guest driver can't read the interrupt reason and never calls a handler to process it.

Ok, so this is the bit I don't understand. Explain, please, how this whole thing works. And keep in mind that I've just looked up "irqfd" for the first time in my life, so I know nothing about its implementation. The same applies to "vhost-net". I'm simply coming from a completely different background, so bear with me on this.

Now, the virtio-mmio device *must* behave in a certain way, as specified in both the old and the new version of the spec. If the Used Ring of *any* of the queues has been updated, the device *must* set bit 0 of the interrupt status register, and - in effect - generate the interrupt.

But you are telling me that in some circumstances the interrupt is *not* generated when a queue requires handling. Sounds like a bug to me, as it is not following the requirements of the device specification.

It is quite likely that I simply don't see some facts which are obvious for you, so please - help me understand.

> So we can assign a dedicated interrupt line per queue to virtio-mmio and it can work with irqfd.
>
If my reasoning above is correct, I'd say that you are just working around a functional bug, not improving performance. And this is a wrong way of solving the problem.

> Theoretically the number of interrupts has no limit, but given the limit on ARM interrupt lines, the number should be less than the number of available ARM interrupt lines.

Let me just emphasize the fact that virtio-mmio is not related to ARM in any way; it's just a memory mapped device with a generic interrupt output. Any limitations of a particular hardware platform (I guess you're referring to the maximum number of peripheral interrupts a GIC can take) are irrelevant - if we go with multiple interrupts, it must cater for as many interrupt lines as one wishes... :-)
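[Pawel's point about the mandated device behaviour can be illustrated with a small hypothetical device-model fragment. The struct, field and callback names here are invented for this sketch; this is not QEMU's actual code.]

	#include <stdint.h>

	#define VIRTIO_MMIO_INT_VRING	(1 << 0)	/* bit 0: some Used Ring was updated */
	#define VIRTIO_MMIO_INT_CONFIG	(1 << 1)	/* bit 1: config space changed */

	/* Hypothetical device-model state, invented for this sketch */
	struct vm_device {
		uint32_t interrupt_status;	/* backs VIRTIO_MMIO_INTERRUPT_STATUS */
		void (*raise_irq)(struct vm_device *dev);	/* asserts the interrupt line */
	};

	/* What the spec obliges the device to do whenever any virtqueue's
	 * Used Ring is updated: set bit 0 first, then generate the interrupt. */
	static void vm_notify_used_ring(struct vm_device *dev)
	{
		dev->interrupt_status |= VIRTIO_MMIO_INT_VRING;
		dev->raise_irq(dev);
	}

	/* The irqfd path under discussion effectively performs only the
	 * raise_irq() step, inside the kernel, leaving interrupt_status
	 * untouched - so the guest handler finds no reason to act on. */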
> In the real situation, I think, the number is generally less than 17 (8 pairs of vring interrupts and one config interrupt).

... but of course we could optimize for the real scenarios. And I'm glad to hear that the number you quoted is less than 30 :-)

> >> we use GSI for multiple irq.
> >
> > I'm not sure what GSI stands for, but looking at the code I assume it's just a "normal" peripheral interrupt.
> >
> >> In this patch we use "vm_try_to_find_vqs" to check whether multiple irqs are supported, like virtio-pci.
> >
> > Yeah, I can see that you have followed virtio-pci quite literally. I'm particularly not convinced by the one interrupt for config, one for all queues option. Doesn't make any sense to me here.
> >
> About one interrupt for all queues: it's not a typical case, but it offers one more choice for users. Users should configure the number of interrupts according to their situation.

> >> Is this the right direction? Is there another way to make virtio-mmio support multiple irqs? Hope for feedback.
> >
> > One point I'd like to make is that the device was intentionally designed with simplicity in mind first, performance later (something about "embedded" etc. :-). Changing this assumption is of course possible, but
> Ah, I think ARM is not only about embedded things. Maybe it could have wider applications, such as micro servers. Just my personal opinion.
>
> > - I must say - makes me slightly uncomfortable... The extensions we're discussing here seem doable, but I've noticed your other patches dealing with a shared memory region and I didn't like them at all, sorry.
> >
> The approach with a shared memory region has been dropped, as you can see from the mailing list.

Glad to hear that :-)

> The approach in this patch gets a net performance improvement of about 30%. This may make sense for platforms without MSI support (e.g. ARM with GICv2).

I appreciate the improvement. I'm just cautious when it comes to significant changes to the standard so late in the process.

Pawel
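[For context, the per-queue interrupt scheme being debated would look roughly as follows on the guest side. This is a hedged sketch, not the patch's actual vm_try_to_find_vqs: the vm_multi_irq_dev struct, its irqs[]/vqs[] arrays and the vm_config_interrupt() helper are all invented for illustration.]

	#include <linux/interrupt.h>
	#include <linux/virtio.h>
	#include <linux/virtio_ring.h>

	/* Hypothetical per-device state, invented for this sketch */
	struct vm_multi_irq_dev {
		int irqs[17];			/* [0] = config, [1..nvqs] = vrings */
		struct virtqueue *vqs[16];
		int nvqs;
	};

	/* Hypothetical config-change handler (a real driver would forward
	 * to virtio_config_changed() here) */
	static irqreturn_t vm_config_interrupt(int irq, void *opaque)
	{
		return IRQ_HANDLED;
	}

	/* One dedicated line per virtqueue: each line identifies its queue
	 * by itself, so the hot path needs no status register read or ack. */
	static int vm_request_per_vq_irqs(struct vm_multi_irq_dev *dev)
	{
		int i, err;

		/* Line 0 carries configuration change events only */
		err = request_irq(dev->irqs[0], vm_config_interrupt, 0,
				  "virtio-mmio-config", dev);
		if (err)
			return err;

		/* One line per virtqueue, handled directly by vring_interrupt() */
		for (i = 0; i < dev->nvqs; i++) {
			err = request_irq(dev->irqs[i + 1], vring_interrupt, 0,
					  "virtio-mmio-vq", dev->vqs[i]);
			if (err)
				goto fail;
		}
		return 0;

	fail:
		while (--i >= 0)
			free_irq(dev->irqs[i + 1], dev->vqs[i]);
		free_irq(dev->irqs[0], dev);
		return err;
	}

[The design choice is visible in the loop: because each line maps to exactly one virtqueue, vring_interrupt() can be registered directly as the handler, and an irqfd raising that line needs no status register update for the guest to dispatch correctly.]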
On 2014/11/13 2:33, Pawel Moll wrote:
> On Wed, 2014-11-12 at 08:32 +0000, Shannon Zhao wrote:
>> On 2014/11/11 23:11, Pawel Moll wrote:
>>> On Tue, 2014-11-04 at 09:35 +0000, Shannon Zhao wrote:
>>>> As the current virtio-mmio only supports a single irq, some advanced features such as vhost-net with irqfd are not supported. And net performance is not the best without vhost-net and irqfd support.
>>>
>>> Could you, please, help me understand where the main issue is? Is it about:
>>>
>>> 1. The fact that the existing implementation blindly kicks all queues, instead of only the updated ones?
>>>
>>> or:
>>>
>>> 2. Literally having a dedicated interrupt line (remember, we're talking "real" interrupts here, not message signalled ones) per queue, so they can be handled by different processors at the same time?
>>>
>> The main issue is that the current virtio-mmio supports only one interrupt, which is shared by config and queues.
>
> Yes.
>
> Additional question: is your device changing the configuration space often? Other devices (for example block devices) change configuration very rarely (e.g. a change of the media size, usually meaning hot-un-plugging), so unless you tell me that the config space changes frequently, I'll consider the config event irrelevant from the performance point of view.
>
No, the configuration space is not changed often.

>> Therefore the virtio-mmio driver has to read "VIRTIO_MMIO_INTERRUPT_STATUS" to get the interrupt reason and work out whom the interrupt is for.
>
> Correct.
>
>> If we use vhost-net, which uses irqfd to inject the interrupt, vhost-net doesn't update "VIRTIO_MMIO_INTERRUPT_STATUS", so the guest driver can't read the interrupt reason and never calls a handler to process it.
>
> Ok, so this is the bit I don't understand. Explain, please, how this whole thing works. And keep in mind that I've just looked up "irqfd" for the first time in my life, so I know nothing about its implementation. The same applies to "vhost-net". I'm simply coming from a completely different background, so bear with me on this.
>
Ok, sorry, I glossed over this. When we use plain virtio-mmio, or vhost-net without irqfd, the device uses qemu_set_irq (within qemu) to inject the interrupt, and at the same time qemu updates "VIRTIO_MMIO_INTERRUPT_STATUS" to tell the guest driver whom the interrupt is for. All of this is done by qemu. Though qemu_set_irq eventually calls kvm_set_irq to do the real interrupt injection, the trigger is qemu, and it can update the "VIRTIO_MMIO_INTERRUPT_STATUS" register before injecting the interrupt.

But if we use vhost-net with irqfd, the device uses the irqfd (eventfd-based) mechanism to inject the interrupt. When an interrupt happens, it doesn't pass through qemu; the irqfd ends up calling kvm_set_irq to inject the interrupt directly. All of this is done in kvm, and kvm doesn't update the "VIRTIO_MMIO_INTERRUPT_STATUS" register. So if the guest driver still uses the old interrupt handler, it has to read the "VIRTIO_MMIO_INTERRUPT_STATUS" register to get the interrupt reason and run different handlers for different reasons. But the register contains nothing, and none of the handlers will be called. Have I made myself clear? :-)
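[On the host side, the irqfd path Shannon describes amounts to wiring a single eventfd into both vhost and KVM. The sketch below uses the real VHOST_SET_VRING_CALL and KVM_IRQFD ioctls, but the surrounding setup (the open vhost-net and KVM VM file descriptors, and the chosen GSI) is assumed to exist elsewhere.]

	#include <sys/eventfd.h>
	#include <sys/ioctl.h>
	#include <linux/vhost.h>
	#include <linux/kvm.h>

	/* Wire one eventfd into both vhost (as the vring "call" notifier)
	 * and KVM (as an irqfd triggering `gsi`). vhost_fd and vm_fd are
	 * assumed to be already-open vhost-net and KVM VM descriptors. */
	static int wire_irqfd(int vhost_fd, int vm_fd, unsigned int gsi)
	{
		int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
		if (efd < 0)
			return -1;

		/* vhost-net signals this eventfd whenever it updates a Used Ring... */
		struct vhost_vring_file call = { .index = 0, .fd = efd };
		if (ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call) < 0)
			return -1;

		/* ...and KVM injects `gsi` into the guest whenever the eventfd
		 * is signalled. Nothing on this path touches
		 * VIRTIO_MMIO_INTERRUPT_STATUS. */
		struct kvm_irqfd irqfd = { .fd = efd, .gsi = gsi };
		if (ioctl(vm_fd, KVM_IRQFD, &irqfd) < 0)
			return -1;

		return efd;
	}

[Once this wiring is in place, injection happens entirely in the kernel, so there is no point at which QEMU could update the status register - which is exactly the situation described above.]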
> Now, the virtio-mmio device *must* behave in a certain way, as specified in both the old and the new version of the spec. If the Used Ring of *any* of the queues has been updated, the device *must* set bit 0 of the interrupt status register, and - in effect - generate the interrupt.
>
> But you are telling me that in some circumstances the interrupt is *not* generated when a queue requires handling. Sounds like a bug to me, as it is not following the requirements of the device specification.
>
> It is quite likely that I simply don't see some facts which are obvious for you, so please - help me understand.
>
>> So we can assign a dedicated interrupt line per queue to virtio-mmio and it can work with irqfd.
>>
> If my reasoning above is correct, I'd say that you are just working around a functional bug, not improving performance. And this is a wrong way of solving the problem.
>
>> Theoretically the number of interrupts has no limit, but given the limit on ARM interrupt lines, the number should be less than the number of available ARM interrupt lines.
>
> Let me just emphasize the fact that virtio-mmio is not related to ARM in any way; it's just a memory mapped device with a generic interrupt output. Any limitations of a particular hardware platform (I guess you're referring to the maximum number of peripheral interrupts a GIC can take) are irrelevant - if we go with multiple interrupts, it must cater for as many interrupt lines as one wishes... :-)
>
>> In the real situation, I think, the number is generally less than 17 (8 pairs of vring interrupts and one config interrupt).
>
> ... but of course we could optimize for the real scenarios. And I'm glad to hear that the number you quoted is less than 30 :-)
>
>>>> we use GSI for multiple irq.
>>>
>>> I'm not sure what GSI stands for, but looking at the code I assume it's just a "normal" peripheral interrupt.
>>>
>>>> In this patch we use "vm_try_to_find_vqs" to check whether multiple irqs are supported, like virtio-pci.
>>>
>>> Yeah, I can see that you have followed virtio-pci quite literally. I'm particularly not convinced by the one interrupt for config, one for all queues option. Doesn't make any sense to me here.
>>>
>> About one interrupt for all queues: it's not a typical case, but it offers one more choice for users. Users should configure the number of interrupts according to their situation.
>
>>>> Is this the right direction? Is there another way to make virtio-mmio support multiple irqs? Hope for feedback.
>>>
>>> One point I'd like to make is that the device was intentionally designed with simplicity in mind first, performance later (something about "embedded" etc. :-). Changing this assumption is of course possible, but
>> Ah, I think ARM is not only about embedded things. Maybe it could have wider applications, such as micro servers. Just my personal opinion.
>>
>>> - I must say - makes me slightly uncomfortable... The extensions we're discussing here seem doable, but I've noticed your other patches dealing with a shared memory region and I didn't like them at all, sorry.
>>>
>> The approach with a shared memory region has been dropped, as you can see from the mailing list.
>
> Glad to hear that :-)
>
>> The approach in this patch gets a net performance improvement of about 30%. This may make sense for platforms without MSI support (e.g. ARM with GICv2).
>
> I appreciate the improvement. I'm just cautious when it comes to significant changes to the standard so late in the process.
>
Sorry, I didn't notice it's at the final phase of the process.
It's the first time in my life I've made a change to the virtio spec, just as it was your first time seeing irqfd. :-) In a word, the multi-IRQ support makes vhost-net with irqfd work on top of virtio-mmio, and irqfd reduces vm-exits and context switches between qemu and kvm. So it can bring a performance improvement.

> Pawel

-- Shannon