Zhang, Fengzhe
2011-Jan-26 08:02 UTC
[Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
vtd: Fix for irq bind failure after PCI attaching 32 times

Originally, when detaching a PCI device, pirq_to_emuirq and pirq_to_irq
are freed via the hypercall do_physdev_op.  Now, in function
pt_irq_destroy_bind_vtd, duplicated logic has been added that frees
pirq_to_emuirq but not pirq_to_irq.  This causes do_physdev_op to fail
to free both the emuirq and the irq.  After attaching a PCI device 32
times, irq resources run out.  This patch removes the redundant logic.

Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>

diff -r 003acf02d416 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Jan 20 17:04:06 2011 +0000
+++ b/xen/drivers/passthrough/io.c	Wed Jan 26 23:05:33 2011 +0800
@@ -375,7 +375,6 @@
             hvm_irq_dpci->mirq[machine_gsi].dom = NULL;
             hvm_irq_dpci->mirq[machine_gsi].flags = 0;
             clear_bit(machine_gsi, hvm_irq_dpci->mapping);
-            unmap_domain_pirq_emuirq(d, machine_gsi);
         }
     }
     spin_unlock(&d->event_lock);
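The leak is easy to model outside Xen.  The standalone C program below
is a sketch only: it stands in for the two per-domain tables named in
the changelog (pirq_to_emuirq and pirq_to_irq) with plain boolean
arrays, and it assumes, as the changelog describes, that the combined
unmap path aborts as soon as the emuirq entry is found to be already
freed.  None of this is the actual Xen source.

    /* Standalone model of the leak described in the changelog above.
     * Not Xen code: the two arrays stand in for pirq_to_emuirq and
     * pirq_to_irq, and unmap_pirq() models the assumed abort-early
     * behaviour of the do_physdev_op unmap path. */
    #include <stdio.h>
    #include <stdbool.h>

    #define NR_PIRQS 32

    static bool pirq_to_emuirq[NR_PIRQS];  /* true = entry allocated */
    static bool pirq_to_irq[NR_PIRQS];

    static int map_pirq(void)              /* models device attach */
    {
        for (int i = 0; i < NR_PIRQS; i++)
            if (!pirq_to_irq[i]) {
                pirq_to_irq[i] = pirq_to_emuirq[i] = true;
                return i;
            }
        return -1;                         /* irq resources run out */
    }

    static int unmap_pirq(int pirq)        /* models the unmap path */
    {
        if (!pirq_to_emuirq[pirq])
            return -1;                     /* emuirq already freed: abort */
        pirq_to_emuirq[pirq] = false;
        pirq_to_irq[pirq] = false;         /* only reached on success */
        return 0;
    }

    int main(void)
    {
        for (int n = 0; ; n++) {
            int pirq = map_pirq();
            if (pirq < 0) {
                printf("out of irqs after %d attach/detach cycles\n", n);
                return 0;
            }
            /* the redundant free in pt_irq_destroy_bind_vtd: */
            pirq_to_emuirq[pirq] = false;
            unmap_pirq(pirq);              /* aborts; pirq_to_irq leaks */
        }
    }

Built with a C99 compiler, the model reports running out of irqs after
exactly 32 attach/detach cycles; removing the redundant-free line lets
it loop indefinitely, which is the effect of the patch.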
Keir Fraser
2011-Jan-26 08:41 UTC
Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On 26/01/2011 08:02, "Zhang, Fengzhe" <fengzhe.zhang@intel.com> wrote:

> vtd: Fix for irq bind failure after PCI attaching 32 times
>
> Originally, when detaching a PCI device, pirq_to_emuirq and pirq_to_irq
> are freed via the hypercall do_physdev_op.  Now, in function
> pt_irq_destroy_bind_vtd, duplicated logic has been added that frees
> pirq_to_emuirq but not pirq_to_irq.  This causes do_physdev_op to fail
> to free both the emuirq and the irq.  After attaching a PCI device 32
> times, irq resources run out.  This patch removes the redundant logic.

This needs an Ack, or alternative fix, from Stefano (cc'ed).

 -- Keir

> Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>
>
> [...]
Stefano Stabellini
2011-Jan-26 11:10 UTC
Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Wed, 26 Jan 2011, Zhang, Fengzhe wrote:

> vtd: Fix for irq bind failure after PCI attaching 32 times
>
> Originally, when detaching a PCI device, pirq_to_emuirq and pirq_to_irq
> are freed via the hypercall do_physdev_op.  Now, in function
> pt_irq_destroy_bind_vtd, duplicated logic has been added that frees
> pirq_to_emuirq but not pirq_to_irq.  This causes do_physdev_op to fail
> to free both the emuirq and the irq.  After attaching a PCI device 32
> times, irq resources run out.  This patch removes the redundant logic.
>
> Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>

It looks OK in principle, but if the theory is that we should always
call xc_physdev_unmap_pirq after xc_domain_unbind_pt_irq, I can find an
instance of xc_domain_unbind_pt_irq without any corresponding
xc_physdev_unmap_pirq.

Take a look at hw/pass-through.c:pt_reset_interrupt_and_io_mapping in
qemu:

    if (ptdev->msi_trans_en == 0 && ptdev->machine_irq) {
        if (xc_domain_unbind_pt_irq(xc_handle, domid, ptdev->machine_irq,
                                    PT_IRQ_TYPE_PCI, 0, e_device, e_intx, 0))
            PT_LOG("Error: Unbinding of interrupt failed!\n");
    }

but there is no following xc_physdev_unmap_pirq if MSI and MSI-X are
disabled.
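If the theory holds, the missing pairing would look something like the
sketch below.  This is illustrative only, not a tested patch: the
xc_physdev_unmap_pirq call and its argument order are an assumption,
modeled on the idea that every xc_domain_unbind_pt_irq should be
followed by an xc_physdev_unmap_pirq of the same pirq.

    if (ptdev->msi_trans_en == 0 && ptdev->machine_irq) {
        if (xc_domain_unbind_pt_irq(xc_handle, domid, ptdev->machine_irq,
                                    PT_IRQ_TYPE_PCI, 0, e_device, e_intx, 0))
            PT_LOG("Error: Unbinding of interrupt failed!\n");

        /* Assumed follow-up: also free the host pirq, so that the
         * physdev unmap path can release the irq on the Xen side. */
        if (xc_physdev_unmap_pirq(xc_handle, domid, ptdev->machine_irq))
            PT_LOG("Error: Unmapping of interrupt failed!\n");
    }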
Zhang, Fengzhe
2011-Jan-27 07:39 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
Hi, Stefano,

Here is the calling graph that causes the bug:

unregister_real_device (ioemu)
 |
 +----> pt_msix_disable (ioemu)
          |
          +----> xc_domain_unbind_msi_irq (ioemu)
          |        |
          |        +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen)
          |                 |
          |                 +----> unmap_domain_pirq_emuirq (xen)   // freed pirq_to_emuirq
          |
          +----> xc_physdev_unmap_pirq (ioemu)
                   |
                   +----> do_physdev_op (xen)
                            |
                            +----> physdev_unmap_pirq (xen)
                                     |
                                     +----> unmap_domain_pirq_emuirq (xen)   // found pirq_to_emuirq already freed, abort
                                     |
                                     +----> unmap_domain_pirq (xen)   // not called

The code path you mentioned is not taken for a VF device, as its
ptdev->machine_irq is 0.

-----Original Message-----
From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
Sent: Wednesday, January 26, 2011 7:11 PM
To: Zhang, Fengzhe
Cc: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times

[...]
Stefano Stabellini
2011-Jan-27 11:16 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Thu, 27 Jan 2011, Zhang, Fengzhe wrote:

> Hi, Stefano,
>
> Here is the calling graph that causes the bug:
>
> [...]
>
> The code path you mentioned is not taken for a VF device, as its
> ptdev->machine_irq is 0.

Thank you for the clarification. I think your patch is correct and
should be applied.

I was just wondering if it is possible to think of another scenario
that would trigger the same kind of bug by calling
pt_reset_interrupt_and_io_mapping several times in ioemu, because it
seems to me that there is no xc_physdev_unmap_pirq call there if
ptdev->machine_irq != 0.
Stefano Stabellini
2011-Feb-02 17:35 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Thu, 27 Jan 2011, Zhang, Fengzhe wrote:

> Here is the calling graph that causes the bug:
>
> [...]
>
> The code path you mentioned is not taken for a VF device, as its
> ptdev->machine_irq is 0.

It has just occurred to me that all this only happens with guest
cooperation: unregister_real_device is called on PCI hotplug in
response to the guest's action.
That means that a guest that doesn't support PCI hot-unplug (or a
malicious guest) won't do anything in response to the ACPI SCI
interrupt we send, therefore unregister_real_device will never be
called and we will be leaking MSIs in the host!

Of course we could solve it by adding a new xenstore command to qemu
that calls unregister_real_device directly, but it seems to me that
relying on qemu to free hypervisor/dom0 resources is not a good idea.

Xen knows all the pirqs remapped to this domain, so wouldn't it be
possible for Xen to call pt_irq_destroy_bind_vtd and physdev_unmap_pirq
on domain_kill?
I think that Xen shouldn't leak pirqs no matter what the toolstack or
qemu do.
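A destroy-time sweep along those lines could look roughly like the
sketch below.  It is illustrative only, not a proposed patch:
unmap_domain_pirq, d->nr_pirqs and d->event_lock are real Xen names,
but pirq_is_mapped() is a hypothetical test standing in for whatever
check the real code would need.

    /* Illustrative sketch -- not a real Xen patch.  Walk the domain's
     * pirq space on destruction and unmap anything still bound, so a
     * guest that never acks the hot-unplug cannot make us leak pirqs. */
    static void sweep_domain_pirqs(struct domain *d)
    {
        int pirq;

        spin_lock(&d->event_lock);
        for ( pirq = 0; pirq < d->nr_pirqs; pirq++ )
        {
            if ( !pirq_is_mapped(d, pirq) )   /* hypothetical test */
                continue;
            unmap_domain_pirq(d, pirq);
        }
        spin_unlock(&d->event_lock);
    }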
Stefano Stabellini
2011-Feb-03 15:22 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Wed, 2 Feb 2011, Stefano Stabellini wrote:

> It has just occurred to me that all this only happens with guest
> cooperation: unregister_real_device is called on PCI hotplug in
> response to the guest's action.
> That means that a guest that doesn't support PCI hot-unplug (or a
> malicious guest) won't do anything in response to the ACPI SCI
> interrupt we send, therefore unregister_real_device will never be
> called and we will be leaking MSIs in the host!
>
> [...]
>
> Xen knows all the pirqs remapped to this domain, so wouldn't it be
> possible for Xen to call pt_irq_destroy_bind_vtd and
> physdev_unmap_pirq on domain_kill?
> I think that Xen shouldn't leak pirqs no matter what the toolstack or
> qemu do.

Actually, it looks like Xen is cleaning up after itself:

arch_domain_destroy
 |
 +--> pci_release_devices
 |      |
 |      +--> pci_cleanup_msi
 |             |
 |             +--> msi_free_irq
 |                    |
 |                    +--> iommu_update_ire_from_msi
 |                         (should clean the vtd binding, like pt_irq_destroy_bind_vtd)
 |
 +--> free_domain_pirqs
        |
        +--> unmap_domain_pirq

So it doesn't actually matter whether the guest supports PCI hotplug:
if it doesn't, Xen won't leak any resources anyway.
Am I right? Could you please confirm this?
Ian Jackson
2011-Feb-03 16:54 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
Stefano Stabellini writes ("RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times"):
> So it doesn't actually matter whether the guest supports PCI hotplug:
> if it doesn't, Xen won't leak any resources anyway.
> Am I right? Could you please confirm this?

Is there some way to get a list of the MSIs etc. which have been
reserved/allocated?  If so, my automatic test system could check that
they weren't leaked.  (Although this will be of more interest when I
make it do some testing of PCI passthrough...)

Ian.
Stefano Stabellini
2011-Feb-14 14:17 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
ping?

On Thu, 3 Feb 2011, Stefano Stabellini wrote:

> [...]
>
> Actually, it looks like Xen is cleaning up after itself:
>
> arch_domain_destroy
>  |
>  +--> pci_release_devices
>  |      |
>  |      +--> pci_cleanup_msi
>  |             |
>  |             +--> msi_free_irq
>  |                    |
>  |                    +--> iommu_update_ire_from_msi
>  |                         (should clean the vtd binding, like pt_irq_destroy_bind_vtd)
>  |
>  +--> free_domain_pirqs
>         |
>         +--> unmap_domain_pirq
>
> So it doesn't actually matter whether the guest supports PCI hotplug:
> if it doesn't, Xen won't leak any resources anyway.
> Am I right? Could you please confirm this?
Kay, Allen M
2011-Feb-25 01:41 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
> |      +--> iommu_update_ire_from_msi
> |           (should clean the vtd binding, like pt_irq_destroy_bind_vtd)

Stefano,

iommu_update_ire_from_msi() maps to intremap.c/msi_msg_write_remap_rte()
with VT-d hardware.  This function deals strictly with VT-d interrupt
remapping hardware tasks; it does not do any irq cleanup for Xen.

Allen

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Stefano Stabellini
Sent: Thursday, February 03, 2011 7:23 AM
To: Stefano Stabellini
Cc: xen-devel@lists.xensource.com; Zhang, Xiantao; Zhang, Fengzhe
Subject: RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times

[...]