Zhang, Fengzhe
2011-Jan-26  08:02 UTC
[Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
vtd: Fix for irq bind failure after PCI attaching 32 times
Originally when detaching a PCI device, pirq_to_emuirq and pirq_to_irq are freed
via hypercall do_physdev_op. Now in function pt_irq_destroy_bind_vtd, duplicated
logic is added to free pirq_to_emuirq, but not pirq_to_irq. This causes
do_physdev_op fail to free both emuirq and irq. After attaching a PCI device for
32 times, irq resources run out. This patch removes the redundant logic.
Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>
diff -r 003acf02d416 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Jan 20 17:04:06 2011 +0000
+++ b/xen/drivers/passthrough/io.c	Wed Jan 26 23:05:33 2011 +0800
@@ -375,7 +375,6 @@
             hvm_irq_dpci->mirq[machine_gsi].dom   = NULL;
             hvm_irq_dpci->mirq[machine_gsi].flags = 0;
             clear_bit(machine_gsi, hvm_irq_dpci->mapping);
-            unmap_domain_pirq_emuirq(d, machine_gsi);
         }
     }
     spin_unlock(&d->event_lock);
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jan-26  08:41 UTC
Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On 26/01/2011 08:02, "Zhang, Fengzhe" <fengzhe.zhang@intel.com> wrote:> vtd: Fix for irq bind failure after PCI attaching 32 times > > Originally when detaching a PCI device, pirq_to_emuirq and pirq_to_irq are > freed via hypercall do_physdev_op. Now in function pt_irq_destroy_bind_vtd, > duplicated logic is added to free pirq_to_emuirq, but not pirq_to_irq. This > causes do_physdev_op fail to free both emuirq and irq. After attaching a PCI > device for 32 times, irq resources run out. This patch removes the redundant > logic.This needs an Ack, or alternative fix, from Stefano (cc''ed). -- Keir> Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> > > diff -r 003acf02d416 xen/drivers/passthrough/io.c > --- a/xen/drivers/passthrough/io.c Thu Jan 20 17:04:06 2011 +0000 > +++ b/xen/drivers/passthrough/io.c Wed Jan 26 23:05:33 2011 +0800 > @@ -375,7 +375,6 @@ > hvm_irq_dpci->mirq[machine_gsi].dom = NULL; > hvm_irq_dpci->mirq[machine_gsi].flags = 0; > clear_bit(machine_gsi, hvm_irq_dpci->mapping); > - unmap_domain_pirq_emuirq(d, machine_gsi); > } > } > spin_unlock(&d->event_lock); > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Jan-26  11:10 UTC
Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Wed, 26 Jan 2011, Zhang, Fengzhe wrote:> vtd: Fix for irq bind failure after PCI attaching 32 times > > Originally when detaching a PCI device, pirq_to_emuirq and pirq_to_irq are freed via hypercall do_physdev_op. Now in function pt_irq_destroy_bind_vtd, duplicated logic is added to free pirq_to_emuirq, but not pirq_to_irq. This causes do_physdev_op fail to free both emuirq and irq. After attaching a PCI device for 32 times, irq resources run out. This patch removes the redundant logic. > > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> >It looks OK in principle, but if the theory is that we should always call xc_physdev_unmap_pirq after xc_domain_unbind_pt_irq, I can find an instance of xc_domain_unbind_pt_irq without any corresponding xc_physdev_unmap_pirq. Take a look at hw/pass-through.c:pt_reset_interrupt_and_io_mapping in qemu: if (ptdev->msi_trans_en == 0 && ptdev->machine_irq) { if (xc_domain_unbind_pt_irq(xc_handle, domid, ptdev->machine_irq, PT_IRQ_TYPE_PCI, 0, e_device, e_intx, 0)) PT_LOG("Error: Unbinding of interrupt failed!\n"); } but there is no following xc_physdev_unmap_pirq if MSI and MSIX are disabled. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhang, Fengzhe
2011-Jan-27  07:39 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
Hi, Stefano,
Here is the calling graph that cause the bug:
unregister_real_device (ioemu)
    |
    +----> pt_msix_disable (ioemu)
            |
            +----> xc_domain_unbind_msi_irq (ioemu)
            |       |
            |       +----> do_domctl (xen) ----> arch_do_domctl (xen)
----> pt_irq_destroy_bind_vtd (xen)
            |              |
            |              +----> unmap_domain_pirq_emuirq (xen)  //freed
pirq_to_emuirq
            |
            +----> xc_physdev_unmap_pirq (ioemu)
                   |
                   +----> do_physdev_op (xen) 
                           |
                           +----> physdev_unmap_pirq (xen)
                                   |
                                   +----> unmap_domain_pirq_emuirq (xen) 
//found pirq_to_emuirq already freed, abort
                                   |
                                   +----> unmap_domain_pirq (xen)    //not
called
The code path you mentioned is not taken for VF dev as its ptdev->machine_irq
is 0.
-----Original Message-----
From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com] 
Sent: Wednesday, January 26, 2011 7:11 PM
To: Zhang, Fengzhe
Cc: xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI
attaching 32 times
On Wed, 26 Jan 2011, Zhang, Fengzhe wrote:> vtd: Fix for irq bind failure after PCI attaching 32 times
> 
> Originally when detaching a PCI device, pirq_to_emuirq and pirq_to_irq are
freed via hypercall do_physdev_op. Now in function pt_irq_destroy_bind_vtd,
duplicated logic is added to free pirq_to_emuirq, but not pirq_to_irq. This
causes do_physdev_op fail to free both emuirq and irq. After attaching a PCI
device for 32 times, irq resources run out. This patch removes the redundant
logic.
> 
> Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>
> 
It looks OK in principle, but if the theory is that we should always
call xc_physdev_unmap_pirq after xc_domain_unbind_pt_irq, I can find an
instance of xc_domain_unbind_pt_irq without any corresponding
xc_physdev_unmap_pirq.
Take a look at hw/pass-through.c:pt_reset_interrupt_and_io_mapping in
qemu:
if (ptdev->msi_trans_en == 0 && ptdev->machine_irq)
{
    if (xc_domain_unbind_pt_irq(xc_handle, domid, ptdev->machine_irq,
                    PT_IRQ_TYPE_PCI, 0, e_device, e_intx, 0))
        PT_LOG("Error: Unbinding of interrupt failed!\n");
}
but there is no following xc_physdev_unmap_pirq if MSI and MSIX are
disabled.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Jan-27  11:16 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Thu, 27 Jan 2011, Zhang, Fengzhe wrote:> Hi, Stefano, > > Here is the calling graph that cause the bug: > > unregister_real_device (ioemu) > | > +----> pt_msix_disable (ioemu) > | > +----> xc_domain_unbind_msi_irq (ioemu) > | | > | +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen) > | | > | +----> unmap_domain_pirq_emuirq (xen) //freed pirq_to_emuirq > | > +----> xc_physdev_unmap_pirq (ioemu) > | > +----> do_physdev_op (xen) > | > +----> physdev_unmap_pirq (xen) > | > +----> unmap_domain_pirq_emuirq (xen) //found pirq_to_emuirq already freed, abort > | > +----> unmap_domain_pirq (xen) //not called > > The code path you mentioned is not taken for VF dev as its ptdev->machine_irq is 0. >Thank you for the clarification. I think your patch is correct and should be applied. I was just wondering if it is possible to think of another scenario that would trigger the same kind of bug calling several times pt_reset_interrupt_and_io_mapping in ioemu, because it seems to me that there is no xc_physdev_unmap_pirq call there if ptdev->machine_irq != 0. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Feb-02  17:35 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Thu, 27 Jan 2011, Zhang, Fengzhe wrote:> Hi, Stefano, > > Here is the calling graph that cause the bug: > > unregister_real_device (ioemu) > | > +----> pt_msix_disable (ioemu) > | > +----> xc_domain_unbind_msi_irq (ioemu) > | | > | +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen) > | | > | +----> unmap_domain_pirq_emuirq (xen) //freed pirq_to_emuirq > | > +----> xc_physdev_unmap_pirq (ioemu) > | > +----> do_physdev_op (xen) > | > +----> physdev_unmap_pirq (xen) > | > +----> unmap_domain_pirq_emuirq (xen) //found pirq_to_emuirq already freed, abort > | > +----> unmap_domain_pirq (xen) //not called > > The code path you mentioned is not taken for VF dev as its ptdev->machine_irq is 0.It has just occurred to me that all this only happens with guest cooperation: unregister_real_device is called on pci hotplug in response to guest''s action. That means that a guest that doesn''t support pci hot-unplug (or a malicious guest) won''t do anything in response to the acpi SCI interrupt we send, therefore unregister_real_device will never be called and we will be leaking MSIs in the host! Of course we could solve it adding a new xenstore command to qemu that calls unregister_real_device directly, but it seems to me that relying on qemu to free some hypervisor/dom0 resources is not a good idea. Xen knows all the pirq remapped to this domain, so wouldn''t it be possible for Xen to call pt_irq_destroy_bind_vtd and physdev_unmap_pirq on domain_kill? I think that Xen shouldn''t leak pirqs no matter what the toolstack or qemu do. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Feb-03  15:22 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
On Wed, 2 Feb 2011, Stefano Stabellini wrote:> On Thu, 27 Jan 2011, Zhang, Fengzhe wrote: > > Hi, Stefano, > > > > Here is the calling graph that cause the bug: > > > > unregister_real_device (ioemu) > > | > > +----> pt_msix_disable (ioemu) > > | > > +----> xc_domain_unbind_msi_irq (ioemu) > > | | > > | +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen) > > | | > > | +----> unmap_domain_pirq_emuirq (xen) //freed pirq_to_emuirq > > | > > +----> xc_physdev_unmap_pirq (ioemu) > > | > > +----> do_physdev_op (xen) > > | > > +----> physdev_unmap_pirq (xen) > > | > > +----> unmap_domain_pirq_emuirq (xen) //found pirq_to_emuirq already freed, abort > > | > > +----> unmap_domain_pirq (xen) //not called > > > > The code path you mentioned is not taken for VF dev as its ptdev->machine_irq is 0. > > It has just occurred to me that all this only happens with guest > cooperation: unregister_real_device is called on pci hotplug in response > to guest''s action. > That means that a guest that doesn''t support pci hot-unplug (or a > malicious guest) won''t do anything in response to the acpi SCI interrupt > we send, therefore unregister_real_device will never be called and we > will be leaking MSIs in the host! > > Of course we could solve it adding a new xenstore command to qemu that > calls unregister_real_device directly, but it seems to me that relying > on qemu to free some hypervisor/dom0 resources is not a good idea. > > Xen knows all the pirq remapped to this domain, so wouldn''t it be > possible for Xen to call pt_irq_destroy_bind_vtd and physdev_unmap_pirq > on domain_kill? > I think that Xen shouldn''t leak pirqs no matter what the toolstack or > qemu do. >actually it looks like xen is cleaning up after itself: arch_domain_destroy | +--> pci_release_devices | | | +--> pci_cleanup_msi | | | +--> msi_free_irq | | | +--> iommu_update_ire_from_msi | (should clean the vtd binding, like pt_irq_destroy_bind_vtd) | | +--> free_domain_pirqs | +--> unmap_domain_pirq so it doesn''t actually matter if the guest supports pci hotplug or not, because if it doesn''t, xen won''t leak any resources anyway. Am I right? Could you please confirm this? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2011-Feb-03  16:54 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
Stefano Stabellini writes ("RE: [Xen-devel] [PATCH]vtd: Fix for irq bind
failure after PCI attaching 32 times"):> so it doesn''t actually matter if the guest supports pci hotplug or
not,
> because if it doesn''t, xen won''t leak any resources
anyway.
> Am I right? Could you please confirm this?
Is there some way to get a list of the MSIs etc. which have been
reserved/allocated ?  If so my automatic test system could check they
weren''t leaked.
(Although this will be of more interest when I make it do some testing
of pci passthrough...)
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Stefano Stabellini
2011-Feb-14  14:17 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
ping? On Thu, 3 Feb 2011, Stefano Stabellini wrote:> On Wed, 2 Feb 2011, Stefano Stabellini wrote: > > On Thu, 27 Jan 2011, Zhang, Fengzhe wrote: > > > Hi, Stefano, > > > > > > Here is the calling graph that cause the bug: > > > > > > unregister_real_device (ioemu) > > > | > > > +----> pt_msix_disable (ioemu) > > > | > > > +----> xc_domain_unbind_msi_irq (ioemu) > > > | | > > > | +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen) > > > | | > > > | +----> unmap_domain_pirq_emuirq (xen) //freed pirq_to_emuirq > > > | > > > +----> xc_physdev_unmap_pirq (ioemu) > > > | > > > +----> do_physdev_op (xen) > > > | > > > +----> physdev_unmap_pirq (xen) > > > | > > > +----> unmap_domain_pirq_emuirq (xen) //found pirq_to_emuirq already freed, abort > > > | > > > +----> unmap_domain_pirq (xen) //not called > > > > > > The code path you mentioned is not taken for VF dev as its ptdev->machine_irq is 0. > > > > It has just occurred to me that all this only happens with guest > > cooperation: unregister_real_device is called on pci hotplug in response > > to guest''s action. > > That means that a guest that doesn''t support pci hot-unplug (or a > > malicious guest) won''t do anything in response to the acpi SCI interrupt > > we send, therefore unregister_real_device will never be called and we > > will be leaking MSIs in the host! > > > > Of course we could solve it adding a new xenstore command to qemu that > > calls unregister_real_device directly, but it seems to me that relying > > on qemu to free some hypervisor/dom0 resources is not a good idea. > > > > Xen knows all the pirq remapped to this domain, so wouldn''t it be > > possible for Xen to call pt_irq_destroy_bind_vtd and physdev_unmap_pirq > > on domain_kill? > > I think that Xen shouldn''t leak pirqs no matter what the toolstack or > > qemu do. > > > > actually it looks like xen is cleaning up after itself: > > arch_domain_destroy > | > +--> pci_release_devices > | | > | +--> pci_cleanup_msi > | | > | +--> msi_free_irq > | | > | +--> iommu_update_ire_from_msi > | (should clean the vtd binding, like pt_irq_destroy_bind_vtd) > | > | > +--> free_domain_pirqs > | > +--> unmap_domain_pirq > > > so it doesn''t actually matter if the guest supports pci hotplug or not, > because if it doesn''t, xen won''t leak any resources anyway. > Am I right? Could you please confirm this?_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Feb-25  01:41 UTC
RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times
> | +--> iommu_update_ire_from_msi > | (should clean the vtd binding, like pt_irq_destroy_bind_vtd)Stefano, iommu_update_ire_from_msi() maps to intremap.c/msi_msg_write_remap_rte() with vt-d hardware. This function strictly deals with VT-d interrupt remapping HW related tasks. It does not do any irq cleanup for xen. Allen -----Original Message----- From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Stefano Stabellini Sent: Thursday, February 03, 2011 7:23 AM To: Stefano Stabellini Cc: xen-devel@lists.xensource.com; Zhang, Xiantao; Zhang, Fengzhe Subject: RE: [Xen-devel] [PATCH]vtd: Fix for irq bind failure after PCI attaching 32 times On Wed, 2 Feb 2011, Stefano Stabellini wrote:> On Thu, 27 Jan 2011, Zhang, Fengzhe wrote: > > Hi, Stefano, > > > > Here is the calling graph that cause the bug: > > > > unregister_real_device (ioemu) > > | > > +----> pt_msix_disable (ioemu) > > | > > +----> xc_domain_unbind_msi_irq (ioemu) > > | | > > | +----> do_domctl (xen) ----> arch_do_domctl (xen) ----> pt_irq_destroy_bind_vtd (xen) > > | | > > | +----> unmap_domain_pirq_emuirq (xen) //freed pirq_to_emuirq > > | > > +----> xc_physdev_unmap_pirq (ioemu) > > | > > +----> do_physdev_op (xen) > > | > > +----> physdev_unmap_pirq (xen) > > | > > +----> unmap_domain_pirq_emuirq (xen) //found pirq_to_emuirq already freed, abort > > | > > +----> unmap_domain_pirq (xen) //not called > > > > The code path you mentioned is not taken for VF dev as its ptdev->machine_irq is 0. > > It has just occurred to me that all this only happens with guest > cooperation: unregister_real_device is called on pci hotplug in response > to guest''s action. > That means that a guest that doesn''t support pci hot-unplug (or a > malicious guest) won''t do anything in response to the acpi SCI interrupt > we send, therefore unregister_real_device will never be called and we > will be leaking MSIs in the host! > > Of course we could solve it adding a new xenstore command to qemu that > calls unregister_real_device directly, but it seems to me that relying > on qemu to free some hypervisor/dom0 resources is not a good idea. > > Xen knows all the pirq remapped to this domain, so wouldn''t it be > possible for Xen to call pt_irq_destroy_bind_vtd and physdev_unmap_pirq > on domain_kill? > I think that Xen shouldn''t leak pirqs no matter what the toolstack or > qemu do. >actually it looks like xen is cleaning up after itself: arch_domain_destroy | +--> pci_release_devices | | | +--> pci_cleanup_msi | | | +--> msi_free_irq | | | +--> iommu_update_ire_from_msi | (should clean the vtd binding, like pt_irq_destroy_bind_vtd) | | +--> free_domain_pirqs | +--> unmap_domain_pirq so it doesn''t actually matter if the guest supports pci hotplug or not, because if it doesn''t, xen won''t leak any resources anyway. Am I right? Could you please confirm this? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel