Simon Horman
2009-Mar-09 09:01 UTC
[Xen-devel] [rfc 0/2] allow pass-through devices to share GSI
As discussed recently in the thread "[rfc 00/18] ioemu: use devfn instead of
slots as the unit for passthrough", it would be useful if pass-through
devices could share GSI. The motivation for this is:

* Allow multi-function devices to be passed through as multi-function
  devices.
  - This implies that the devices may have functions that use INTB, C or D.
    With the current scheme this would clash with the GSI of INTA on
    devices 13, 14, 15, 21, 22, 23, 29, 30 and 31. The INTx/device-to-GSI
    mapping is described in xen/include/asm-x86/hvm/irq.h as:

        #define hvm_pci_intx_gsi(dev, intx)  \
            (((((dev)<<2) + ((dev)>>3) + (intx)) & 31) + 16)

    and is illustrated in this diagram:
    http://lists.xensource.com/archives/html/xen-devel/2009-02/pngmIO1Sm0VEX.png

* Allow more than two pass-through devices.
  - This will place more contention on the GSI space, and allocation
    becomes a lot simpler if GSI sharing is allowed.

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
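For reference, here is a standalone sketch (not part of this series, and not
Xen code) that simply evaluates the hvm_pci_intx_gsi() macro quoted above and
reports, for each device's INTB/C/D pin, which other device's INTA shares the
resulting GSI. Running it shows, for example, that INTB/C/D of devices 5, 6
and 7 land on the INTA GSIs of devices 13, 14, 15, 21, 22, 23, 29, 30 and 31,
the clash list mentioned above.

    /* Standalone illustration only: evaluate the macro from
     * xen/include/asm-x86/hvm/irq.h and print INTB/C/D vs. INTA GSI clashes. */
    #include <stdio.h>

    #define hvm_pci_intx_gsi(dev, intx) \
        (((((dev) << 2) + ((dev) >> 3) + (intx)) & 31) + 16)

    int main(void)
    {
        for (int dev = 0; dev < 32; dev++)
            for (int intx = 1; intx < 4; intx++) {      /* INTB, INTC, INTD */
                int gsi = hvm_pci_intx_gsi(dev, intx);

                for (int other = 0; other < 32; other++)
                    if (other != dev && hvm_pci_intx_gsi(other, 0) == gsi)
                        printf("dev %2d INT%c -> GSI %2d, clashes with dev %2d INTA\n",
                               dev, 'A' + intx, gsi, other);
            }
        return 0;
    }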
Simon Horman
2009-Mar-09 09:01 UTC
[Xen-devel] [rfc 1/2] pt_irq_time_out() should act on all machine_irq
In pt_irq_time_out() the following code loops through all used guest_gsi:

    list_for_each_entry ( digl, &irq_map->digl_list, list )
    {
        guest_gsi = digl->gsi;
        machine_gsi = dpci->girq[guest_gsi].machine_gsi;
        ...
    }

And a little later on machine_gsi is used. That is, the last machine_gsi
found is used, rather than all of the machine_gsi that are found.

This seems to be incorrect to me, but I am unsure of how to test this.

This code appears to have been introduced in
"vt-d: Support intra-domain shared interrupt" by Weidong Han.

Cc: Weidong Han <weidong.han@intel.com>
Cc: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>

Index: xen-unstable.hg/xen/drivers/passthrough/io.c
===================================================================
--- xen-unstable.hg.orig/xen/drivers/passthrough/io.c   2009-03-09 12:44:48.000000000 +1100
+++ xen-unstable.hg/xen/drivers/passthrough/io.c        2009-03-09 12:58:28.000000000 +1100
@@ -37,6 +37,9 @@ static void pt_irq_time_out(void *data)
     struct hvm_irq_dpci *dpci = NULL;
     struct dev_intx_gsi_link *digl;
     uint32_t device, intx;
+    DECLARE_BITMAP(machine_gsi_map, NR_IRQS);
+
+    bitmap_zero(machine_gsi_map, NR_IRQS);
 
     spin_lock(&irq_map->dom->event_lock);
 
@@ -46,16 +49,31 @@ static void pt_irq_time_out(void *data)
     {
         guest_gsi = digl->gsi;
         machine_gsi = dpci->girq[guest_gsi].machine_gsi;
+        set_bit(machine_gsi, machine_gsi_map);
         device = digl->device;
         intx = digl->intx;
         hvm_pci_intx_deassert(irq_map->dom, device, intx);
     }
 
-    clear_bit(machine_gsi, dpci->dirq_mask);
-    vector = domain_irq_to_vector(irq_map->dom, machine_gsi);
-    dpci->mirq[machine_gsi].pending = 0;
+    for ( machine_gsi = find_first_bit(machine_gsi_map, NR_IRQS);
+          machine_gsi < NR_IRQS;
+          machine_gsi = find_next_bit(machine_gsi_map, NR_IRQS,
+                                      machine_gsi + 1) )
+    {
+        clear_bit(machine_gsi, dpci->dirq_mask);
+        vector = domain_irq_to_vector(irq_map->dom, machine_gsi);
+        dpci->mirq[machine_gsi].pending = 0;
+    }
+
     spin_unlock(&irq_map->dom->event_lock);
-    pirq_guest_eoi(irq_map->dom, machine_gsi);
+
+    for ( machine_gsi = find_first_bit(machine_gsi_map, NR_IRQS);
+          machine_gsi < NR_IRQS;
+          machine_gsi = find_next_bit(machine_gsi_map, NR_IRQS,
+                                      machine_gsi + 1) )
+    {
+        pirq_guest_eoi(irq_map->dom, machine_gsi);
+    }
 }
 
 int pt_irq_create_bind_vtd(

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
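To show the shape of the fix outside the Xen tree, here is a small
self-contained sketch in plain userspace C. The bitmap helpers are
hand-rolled stand-ins for DECLARE_BITMAP()/set_bit()/find_first_bit()/
find_next_bit(), and the GSI numbers are made up. It shows the two-phase
pattern the patch adopts: record every machine GSI seen while walking the
list, then act exactly once per distinct machine GSI rather than only on
whichever value happened to be left in machine_gsi.

    /* Illustrative sketch only -- plain userspace C, not Xen code. */
    #include <stdio.h>

    #define NR_IRQS       256
    #define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

    static void set_bit(int nr, unsigned long *map)
    {
        map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
    }

    static int test_bit(int nr, const unsigned long *map)
    {
        return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
    }

    int main(void)
    {
        unsigned long machine_gsi_map[NR_IRQS / BITS_PER_LONG] = { 0 };
        /* Several digl entries may resolve to the same machine GSI. */
        int machine_gsis[] = { 33, 47, 33, 47, 52 };
        int i, gsi;

        /* Phase 1: collect, as the patch does inside the digl_list walk. */
        for (i = 0; i < (int)(sizeof(machine_gsis) / sizeof(machine_gsis[0])); i++)
            set_bit(machine_gsis[i], machine_gsi_map);

        /* Phase 2: act once per distinct machine GSI (clear_bit, reset
         * pending and EOI in the patch), not just on the last value seen. */
        for (gsi = 0; gsi < NR_IRQS; gsi++)
            if (test_bit(gsi, machine_gsi_map))
                printf("would EOI machine GSI %d\n", gsi);

        return 0;
    }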
Simon Horman
2009-Mar-09 09:01 UTC
[Xen-devel] [rfc 2/2] allow pass-through devices to share GSI
Allow multiple pass-through devices to use the same guest_gsi.

The motivation for this is:

* Allow multi-function devices to be passed through as multi-function
  devices.
  - This implies that the devices may have functions that use INTB, C or D.
    With the current scheme this would clash with the GSI of INTA on
    devices 13, 14, 15, 21, 22, 23, 29, 30 and 31. The INTx/device-to-GSI
    mapping is described in xen/include/asm-x86/hvm/irq.h as:

        #define hvm_pci_intx_gsi(dev, intx)  \
            (((((dev)<<2) + ((dev)>>3) + (intx)) & 31) + 16)

    and is illustrated in this diagram:
    http://lists.xensource.com/archives/html/xen-devel/2009-02/pngmIO1Sm0VEX.png

* Allow more than two pass-through devices.
  - This will place more contention on the GSI space, and allocation
    becomes a lot simpler if GSI sharing is allowed.

Cc: Weidong Han <weidong.han@intel.com>
Cc: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>

---

This seems to work. Tested by changing the GSI mapping so that dev 6 and 7
both use the same GSI, then passing through two ethernet NICs. I was able to
ping using both NICs and confirmed with some debugging code that they were
assigned the same guest GSI.

 xen/drivers/passthrough/io.c |  111 ++++++++++++++++++++++++++----------------
 xen/include/xen/hvm/irq.h    |    4 -
 2 files changed, 71 insertions(+), 44 deletions(-)

Index: xen-unstable.hg/xen/include/xen/hvm/irq.h
===================================================================
--- xen-unstable.hg.orig/xen/include/xen/hvm/irq.h      2009-03-09 19:51:14.000000000 +1100
+++ xen-unstable.hg/xen/include/xen/hvm/irq.h           2009-03-09 19:51:44.000000000 +1100
@@ -60,7 +60,7 @@ struct hvm_mirq_dpci_mapping {
 };
 
 struct hvm_girq_dpci_mapping {
-    uint8_t valid;
+    struct list_head list;
     uint8_t device;
     uint8_t intx;
     uint8_t machine_gsi;
@@ -75,7 +75,7 @@ struct hvm_irq_dpci {
     DECLARE_BITMAP(mapping, NR_IRQS);
     struct hvm_mirq_dpci_mapping mirq[NR_IRQS];
     /* Guest IRQ to guest device/intx mapping. */
-    struct hvm_girq_dpci_mapping girq[NR_IRQS];
+    struct list_head girq[NR_IRQS];
     uint8_t msi_gvec_pirq[NR_VECTORS];
     DECLARE_BITMAP(dirq_mask, NR_IRQS);
     /* Record of mapped ISA IRQs */
Index: xen-unstable.hg/xen/drivers/passthrough/io.c
===================================================================
--- xen-unstable.hg.orig/xen/drivers/passthrough/io.c   2009-03-09 19:51:35.000000000 +1100
+++ xen-unstable.hg/xen/drivers/passthrough/io.c        2009-03-09 19:51:44.000000000 +1100
@@ -36,6 +36,7 @@ static void pt_irq_time_out(void *data)
     int vector;
     struct hvm_irq_dpci *dpci = NULL;
     struct dev_intx_gsi_link *digl;
+    struct hvm_girq_dpci_mapping *girq;
     uint32_t device, intx;
     DECLARE_BITMAP(machine_gsi_map, NR_IRQS);
 
@@ -48,8 +49,11 @@ static void pt_irq_time_out(void *data)
     list_for_each_entry ( digl, &irq_map->digl_list, list )
     {
         guest_gsi = digl->gsi;
-        machine_gsi = dpci->girq[guest_gsi].machine_gsi;
-        set_bit(machine_gsi, machine_gsi_map);
+        list_for_each_entry ( girq, &dpci->girq[guest_gsi], list )
+        {
+            machine_gsi = girq->machine_gsi;
+            set_bit(machine_gsi, machine_gsi_map);
+        }
         device = digl->device;
         intx = digl->intx;
         hvm_pci_intx_deassert(irq_map->dom, device, intx);
@@ -83,6 +87,7 @@ int pt_irq_create_bind_vtd(
     uint32_t machine_gsi, guest_gsi;
     uint32_t device, intx, link;
     struct dev_intx_gsi_link *digl;
+    struct hvm_girq_dpci_mapping *girq;
     int rc, pirq = pt_irq_bind->machine_irq;
 
     if ( pirq < 0 || pirq >= NR_IRQS )
@@ -101,7 +106,10 @@ int pt_irq_create_bind_vtd(
         }
         memset(hvm_irq_dpci, 0, sizeof(*hvm_irq_dpci));
         for ( int i = 0; i < NR_IRQS; i++ )
+        {
             INIT_LIST_HEAD(&hvm_irq_dpci->mirq[i].digl_list);
+            INIT_LIST_HEAD(&hvm_irq_dpci->girq[i]);
+        }
 
         if ( domain_set_irq_dpci(d, hvm_irq_dpci) == 0 )
         {
@@ -158,6 +166,14 @@ int pt_irq_create_bind_vtd(
             return -ENOMEM;
         }
 
+        girq = xmalloc(struct hvm_girq_dpci_mapping);
+        if ( !girq )
+        {
+            xfree(digl);
+            spin_unlock(&d->event_lock);
+            return -ENOMEM;
+        }
+
         digl->device = device;
         digl->intx = intx;
         digl->gsi = guest_gsi;
@@ -165,10 +181,10 @@ int pt_irq_create_bind_vtd(
         list_add_tail(&digl->list,
                       &hvm_irq_dpci->mirq[machine_gsi].digl_list);
 
-        hvm_irq_dpci->girq[guest_gsi].valid = 1;
-        hvm_irq_dpci->girq[guest_gsi].device = device;
-        hvm_irq_dpci->girq[guest_gsi].intx = intx;
-        hvm_irq_dpci->girq[guest_gsi].machine_gsi = machine_gsi;
+        girq->device = device;
+        girq->intx = intx;
+        girq->machine_gsi = machine_gsi;
+        list_add_tail(&girq->list, &hvm_irq_dpci->girq[guest_gsi]);
 
         /* Bind the same mirq once in the same domain */
         if ( !test_and_set_bit(machine_gsi, hvm_irq_dpci->mapping))
@@ -203,10 +219,8 @@ int pt_irq_create_bind_vtd(
                 kill_timer(&hvm_irq_dpci->hvm_timer[vector]);
                 hvm_irq_dpci->mirq[machine_gsi].dom = NULL;
                 clear_bit(machine_gsi, hvm_irq_dpci->mapping);
-                hvm_irq_dpci->girq[guest_gsi].machine_gsi = 0;
-                hvm_irq_dpci->girq[guest_gsi].intx = 0;
-                hvm_irq_dpci->girq[guest_gsi].device = 0;
-                hvm_irq_dpci->girq[guest_gsi].valid = 0;
+                list_del(&girq->list);
+                xfree(girq);
                 list_del(&digl->list);
                 hvm_irq_dpci->link_cnt[link]--;
                 spin_unlock(&d->event_lock);
@@ -231,6 +245,7 @@ int pt_irq_destroy_bind_vtd(
     uint32_t device, intx, link;
     struct list_head *digl_list, *tmp;
     struct dev_intx_gsi_link *digl;
+    struct hvm_girq_dpci_mapping *girq;
 
     machine_gsi = pt_irq_bind->machine_irq;
     device = pt_irq_bind->u.pci.device;
@@ -253,8 +268,16 @@ int pt_irq_destroy_bind_vtd(
     }
 
     hvm_irq_dpci->link_cnt[link]--;
-    memset(&hvm_irq_dpci->girq[guest_gsi], 0,
-           sizeof(struct hvm_girq_dpci_mapping));
+
+    list_for_each_entry ( girq, &hvm_irq_dpci->girq[guest_gsi], list )
+    {
+        if ( girq->machine_gsi == machine_gsi )
+        {
+            list_del(&girq->list);
+            xfree(girq);
+            break;
+        }
+    }
 
     /* clear the mirq info */
     if ( test_bit(machine_gsi, hvm_irq_dpci->mapping))
@@ -422,13 +445,39 @@ void hvm_dirq_assist(struct vcpu *v)
     }
 }
 
+static void __hvm_dpci_eoi(struct domain *d,
+                           struct hvm_irq_dpci *hvm_irq_dpci,
+                           struct hvm_girq_dpci_mapping *girq,
+                           union vioapic_redir_entry *ent)
+{
+    uint32_t device, intx, machine_gsi;
+
+    device = girq->device;
+    intx = girq->intx;
+    hvm_pci_intx_deassert(d, device, intx);
+
+    machine_gsi = girq->machine_gsi;
+
+    /*
+     * No need to get vector lock for timer
+     * since interrupt is still not EOIed
+     */
+    if ( --hvm_irq_dpci->mirq[machine_gsi].pending ||
+         ( ent && ent->fields.mask ) ||
+         ! pt_irq_need_timer(hvm_irq_dpci->mirq[machine_gsi].flags) )
+        return;
+
+    stop_timer(&hvm_irq_dpci->hvm_timer[domain_irq_to_vector(d, machine_gsi)]);
+    pirq_guest_eoi(d, machine_gsi);
+}
+
 void hvm_dpci_eoi(struct domain *d, unsigned int guest_gsi,
                   union vioapic_redir_entry *ent)
 {
-    struct hvm_irq_dpci *hvm_irq_dpci = NULL;
-    uint32_t device, intx, machine_gsi;
+    struct hvm_irq_dpci *hvm_irq_dpci;
+    struct hvm_girq_dpci_mapping *girq;
 
-    if ( !iommu_enabled)
+    if ( !iommu_enabled )
         return;
 
     if ( guest_gsi < NR_ISAIRQS )
@@ -440,34 +489,12 @@ void hvm_dpci_eoi(struct domain *d, unsi
     spin_lock(&d->event_lock);
     hvm_irq_dpci = domain_get_irq_dpci(d);
 
-    if((hvm_irq_dpci == NULL) ||
-       (guest_gsi >= NR_ISAIRQS &&
-        !hvm_irq_dpci->girq[guest_gsi].valid) )
-    {
-        spin_unlock(&d->event_lock);
-        return;
-    }
+    if ( !hvm_irq_dpci )
+        goto unlock;
 
-    device = hvm_irq_dpci->girq[guest_gsi].device;
-    intx = hvm_irq_dpci->girq[guest_gsi].intx;
-    hvm_pci_intx_deassert(d, device, intx);
+    list_for_each_entry ( girq, &hvm_irq_dpci->girq[guest_gsi], list )
+        __hvm_dpci_eoi(d, hvm_irq_dpci, girq, ent);
 
-    machine_gsi = hvm_irq_dpci->girq[guest_gsi].machine_gsi;
-    if ( --hvm_irq_dpci->mirq[machine_gsi].pending == 0 )
-    {
-        if ( (ent == NULL) || !ent->fields.mask )
-        {
-            /*
-             * No need to get vector lock for timer
-             * since interrupt is still not EOIed
-             */
-            if ( pt_irq_need_timer(hvm_irq_dpci->mirq[machine_gsi].flags) )
-            {
-                stop_timer(&hvm_irq_dpci->hvm_timer[
-                    domain_irq_to_vector(d, machine_gsi)]);
-                pirq_guest_eoi(d, machine_gsi);
-            }
-        }
-    }
+unlock:
     spin_unlock(&d->event_lock);
 }

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en
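As a side note on the data-structure change, here is a minimal userspace
sketch of the shape this patch introduces. It uses a hand-rolled singly
linked list and made-up GSI numbers rather than Xen's list_head API: each
guest GSI heads a list of {device, INTx, machine GSI} bindings instead of
holding a single slot, so EOI of one guest GSI can deassert and EOI every
pass-through function sharing it.

    /* Illustrative userspace sketch only -- simplified stand-in types,
     * not the Xen list_head API. */
    #include <stdio.h>
    #include <stdlib.h>

    #define NR_GUEST_GSIS 48

    struct girq_binding {
        unsigned int device, intx, machine_gsi;
        struct girq_binding *next;
    };

    /* One list head per guest GSI (the girq[] array in the patch). */
    static struct girq_binding *girq[NR_GUEST_GSIS];

    /* Analogue of the list_add_tail() added to pt_irq_create_bind_vtd(). */
    static void bind(unsigned int guest_gsi, unsigned int device,
                     unsigned int intx, unsigned int machine_gsi)
    {
        struct girq_binding *b = malloc(sizeof(*b));

        if ( !b )
            return;
        b->device = device;
        b->intx = intx;
        b->machine_gsi = machine_gsi;
        b->next = girq[guest_gsi];
        girq[guest_gsi] = b;
    }

    /* Analogue of the list_for_each_entry() loop that now drives
     * __hvm_dpci_eoi(): visit every binding sharing the guest GSI. */
    static void eoi(unsigned int guest_gsi)
    {
        for ( struct girq_binding *b = girq[guest_gsi]; b; b = b->next )
            printf("deassert dev %u INT%c, EOI machine GSI %u\n",
                   b->device, 'A' + b->intx, b->machine_gsi);
    }

    int main(void)
    {
        bind(36, 6, 0, 33);   /* two devices sharing guest GSI 36 (made-up) */
        bind(36, 7, 0, 38);
        eoi(36);
        return 0;
    }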
Han, Weidong
2009-Mar-10 07:40 UTC
[Xen-devel] RE: [rfc 1/2] pt_irq_time_out() should act on all machine_irq
Simon Horman wrote:
> In pt_irq_time_out() the following code loops through all used guest_gsi:
>
>     list_for_each_entry ( digl, &irq_map->digl_list, list )
>     {
>         guest_gsi = digl->gsi;
>         machine_gsi = dpci->girq[guest_gsi].machine_gsi;
>         ...
>     }
>
> And a little later on machine_gsi is used. That is, the last machine_gsi
> found is used, rather than all of the machine_gsi that are found.
>
> This seems to be incorrect to me, but I am unsure of how to test this.

The timer is set for each machine GSI, so all the machine_gsi found in the
loop are the same. More than one device may share a machine GSI, but the
assumption is that they won't share a guest GSI. digl_list contains all the
guest GSIs which correspond to this one machine GSI.

Now that you want pass-through devices to share a guest GSI, you obviously
need to change this. Your change looks fine to me.

Regards,
Weidong
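To restate that invariant in code form, here is a tiny standalone sketch
(simplified stand-in types and made-up numbers, not Xen code): digl_list
hangs off a single machine GSI's mirq entry, so when two devices share that
machine GSI the per-guest-GSI lookups in the pt_irq_time_out() loop all
return the same value, provided no two devices share a guest GSI.

    /* Illustrative sketch only: simplified stand-ins for the pre-patch
     * structures, demonstrating that every girq[digl->gsi].machine_gsi
     * looked up while walking one mirq entry's digl_list is the same. */
    #include <assert.h>
    #include <stdio.h>

    #define NR_GUEST_GSIS 48

    struct digl { unsigned int device, intx, gsi; };  /* guest device/INTx link */
    struct girq { unsigned int machine_gsi; };        /* pre-patch: one slot per guest GSI */

    int main(void)
    {
        /* Two devices sharing machine GSI 33, bound to distinct guest GSIs. */
        struct digl digl_list[] = { { 5, 0, 36 }, { 6, 0, 40 } };
        struct girq girq[NR_GUEST_GSIS] = { { 0 } };
        unsigned int machine_gsi = 33, i;

        girq[36].machine_gsi = machine_gsi;
        girq[40].machine_gsi = machine_gsi;

        /* The pt_irq_time_out() walk: every lookup yields the same machine GSI. */
        for ( i = 0; i < sizeof(digl_list) / sizeof(digl_list[0]); i++ )
        {
            assert(girq[digl_list[i].gsi].machine_gsi == machine_gsi);
            printf("guest GSI %u -> machine GSI %u\n",
                   digl_list[i].gsi, girq[digl_list[i].gsi].machine_gsi);
        }
        return 0;
    }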
Simon Horman
2009-Mar-10 07:46 UTC
Re: [Xen-devel] RE: [rfc 1/2] pt_irq_time_out() should act on all machine_irq
On Tue, Mar 10, 2009 at 03:40:21PM +0800, Han, Weidong wrote:
> The timer is set for each machine GSI, so all the machine_gsi found in the
> loop are the same. More than one device may share a machine GSI, but the
> assumption is that they won't share a guest GSI. digl_list contains all the
> guest GSIs which correspond to this one machine GSI.

Ok, thanks. I understand now.

> Now that you want pass-through devices to share a guest GSI, you obviously
> need to change this. Your change looks fine to me.

Thanks

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en