Tim Deegan
2011-Jul-18 16:38 UTC
[Xen-devel] [PATCH] [RFC] VT-d: always clean up dpci timers.
If a VM has all its PCI devices deassigned, need_iommu(d) becomes false but it might still have DPCI EOI timers that were init_timer()d but not yet kill_timer()d. That causes xen to crash later because the linked list of inactive timers gets corrupted, e.g.: (XEN) Xen call trace: (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89 (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28 (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Assertion ''entry->next->prev == entry'' failed at /local/scratch/tdeegan/xen-unstable.hg/xen/include:172 (XEN) **************************************** The following patch makes sure that the domain destruction path always clears up the DPCI state even if !needs_iommu(d). Although it fixes the crash for me, I''m sufficiently confused by this code that I don''t know whether it''s enough. If the dpci timer state gets freed earlier than pci_clean_dpci_irqs() then there''s still a race, and some other function (reassign_device_ownership() ?) needs to sort out the timers when the PCI card is deassigned. Allen, can you comment? Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> diff -r ab6551e30841 xen/drivers/passthrough/pci.c --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100 +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100 @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d if ( !iommu_enabled ) return; - if ( !is_hvm_domain(d) || !need_iommu(d) ) + if ( !is_hvm_domain(d) ) return; spin_lock(&d->event_lock); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jul-21 01:08 UTC
[Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
Hi Tim, Can you provide the code flow that can cause this failure? In pci_release_devices(), pci_clean_dpci_irqs() is called before "d->need_iommu = 0" in deassign_device(). If this path is taken, then it should not return from !need_iommu(d) in pci_clean_dpci_irqs(). Allen -----Original Message----- From: Tim Deegan [mailto:Tim.Deegan@citrix.com] Sent: Monday, July 18, 2011 9:39 AM To: xen-devel@lists.xensource.com Cc: keir@xen.org; Kay, Allen M Subject: [PATCH] [RFC] VT-d: always clean up dpci timers. If a VM has all its PCI devices deassigned, need_iommu(d) becomes false but it might still have DPCI EOI timers that were init_timer()d but not yet kill_timer()d. That causes xen to crash later because the linked list of inactive timers gets corrupted, e.g.: (XEN) Xen call trace: (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89 (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28 (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Assertion ''entry->next->prev == entry'' failed at /local/scratch/tdeegan/xen-unstable.hg/xen/include:172 (XEN) **************************************** The following patch makes sure that the domain destruction path always clears up the DPCI state even if !needs_iommu(d). Although it fixes the crash for me, I''m sufficiently confused by this code that I don''t know whether it''s enough. If the dpci timer state gets freed earlier than pci_clean_dpci_irqs() then there''s still a race, and some other function (reassign_device_ownership() ?) needs to sort out the timers when the PCI card is deassigned. Allen, can you comment? Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> diff -r ab6551e30841 xen/drivers/passthrough/pci.c --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100 +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100 @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d if ( !iommu_enabled ) return; - if ( !is_hvm_domain(d) || !need_iommu(d) ) + if ( !is_hvm_domain(d) ) return; spin_lock(&d->event_lock); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jul-21 08:50 UTC
Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote:> Hi Tim, > > Can you provide the code flow that can cause this failure? > > In pci_release_devices(), pci_clean_dpci_irqs() is called before > "d->need_iommu = 0" in deassign_device(). If this path is taken, then > it should not return from !need_iommu(d) in pci_clean_dpci_irqs().The problem is that the xl toolstack has already deassigned the domain''s devices, using a hypercall to invoke deassign_device(), so by the time the domain is destroyed, pci_release_devices() can''t tell that it once had a PCI device passed through. It seems like the Right Thing[tm] would be for deassign_device() to find and undo the relevant IRQ plumbing but I couldn''t see how. Is there a mapping from bdf to irq in the iommu code or are they handled entirely separately? Tim.> Allen > > -----Original Message----- > From: Tim Deegan [mailto:Tim.Deegan@citrix.com] > Sent: Monday, July 18, 2011 9:39 AM > To: xen-devel@lists.xensource.com > Cc: keir@xen.org; Kay, Allen M > Subject: [PATCH] [RFC] VT-d: always clean up dpci timers. > > If a VM has all its PCI devices deassigned, need_iommu(d) becomes false > but it might still have DPCI EOI timers that were init_timer()d but not > yet kill_timer()d. That causes xen to crash later because the linked > list of inactive timers gets corrupted, e.g.: > > (XEN) Xen call trace: > (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f > (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd > (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89 > (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28 > (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Assertion ''entry->next->prev == entry'' failed at /local/scratch/tdeegan/xen-unstable.hg/xen/include:172 > (XEN) **************************************** > > The following patch makes sure that the domain destruction path always > clears up the DPCI state even if !needs_iommu(d). > > Although it fixes the crash for me, I''m sufficiently confused by this > code that I don''t know whether it''s enough. If the dpci timer state > gets freed earlier than pci_clean_dpci_irqs() then there''s still a race, > and some other function (reassign_device_ownership() ?) needs to sort > out the timers when the PCI card is deassigned. > > Allen, can you comment? > > Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> > > diff -r ab6551e30841 xen/drivers/passthrough/pci.c > --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100 > +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100 > @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d > if ( !iommu_enabled ) > return; > > - if ( !is_hvm_domain(d) || !need_iommu(d) ) > + if ( !is_hvm_domain(d) ) > return; > > spin_lock(&d->event_lock); > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel-- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jul-21 09:01 UTC
Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
On 21/07/2011 09:50, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:> At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote: >> Hi Tim, >> >> Can you provide the code flow that can cause this failure? >> >> In pci_release_devices(), pci_clean_dpci_irqs() is called before >> "d->need_iommu = 0" in deassign_device(). If this path is taken, then >> it should not return from !need_iommu(d) in pci_clean_dpci_irqs(). > > The problem is that the xl toolstack has already deassigned the domain''s > devices, using a hypercall to invoke deassign_device(), so by the time > the domain is destroyed, pci_release_devices() can''t tell that it once > had a PCI device passed through. > > It seems like the Right Thing[tm] would be for deassign_device() to find > and undo the relevant IRQ plumbing but I couldn''t see how. Is there a > mapping from bdf to irq in the iommu code or are they handled entirely > separately?Could we make need_iommu(d) sticky? Being able to clear it doesn''t seem an important case (such a domain is probably being torn down anyway) and clearly it can lead to fragility. The fact that presumably we''d end up doing unnecessary IOMMU PT work for the remaining lifetime of the domain doesn''t seem a major downside to me. -- Keir> Tim. > >> Allen >> >> -----Original Message----- >> From: Tim Deegan [mailto:Tim.Deegan@citrix.com] >> Sent: Monday, July 18, 2011 9:39 AM >> To: xen-devel@lists.xensource.com >> Cc: keir@xen.org; Kay, Allen M >> Subject: [PATCH] [RFC] VT-d: always clean up dpci timers. >> >> If a VM has all its PCI devices deassigned, need_iommu(d) becomes false >> but it might still have DPCI EOI timers that were init_timer()d but not >> yet kill_timer()d. That causes xen to crash later because the linked >> list of inactive timers gets corrupted, e.g.: >> >> (XEN) Xen call trace: >> (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f >> (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd >> (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89 >> (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28 >> (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c >> (XEN) >> (XEN) >> (XEN) **************************************** >> (XEN) Panic on CPU 0: >> (XEN) Assertion ''entry->next->prev == entry'' failed at >> /local/scratch/tdeegan/xen-unstable.hg/xen/include:172 >> (XEN) **************************************** >> >> The following patch makes sure that the domain destruction path always >> clears up the DPCI state even if !needs_iommu(d). >> >> Although it fixes the crash for me, I''m sufficiently confused by this >> code that I don''t know whether it''s enough. If the dpci timer state >> gets freed earlier than pci_clean_dpci_irqs() then there''s still a race, >> and some other function (reassign_device_ownership() ?) needs to sort >> out the timers when the PCI card is deassigned. >> >> Allen, can you comment? >> >> Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> >> >> diff -r ab6551e30841 xen/drivers/passthrough/pci.c >> --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100 >> +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100 >> @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d >> if ( !iommu_enabled ) >> return; >> >> - if ( !is_hvm_domain(d) || !need_iommu(d) ) >> + if ( !is_hvm_domain(d) ) >> return; >> >> spin_lock(&d->event_lock); >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jul-22 07:33 UTC
RE: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
> Is there a mapping from bdf to irq in the iommu code or are they handled > entirely separately?I think they are handled separately inside of xen. Irq stuff is mostly about mirq to girq mapping. There should be BDF to irq correlation in user mode code. Allen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jul-25 14:21 UTC
Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
At 10:01 +0100 on 21 Jul (1311242513), Keir Fraser wrote:> On 21/07/2011 09:50, "Tim Deegan" <Tim.Deegan@citrix.com> wrote: > > > At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote: > >> Hi Tim, > >> > >> Can you provide the code flow that can cause this failure? > >> > >> In pci_release_devices(), pci_clean_dpci_irqs() is called before > >> "d->need_iommu = 0" in deassign_device(). If this path is taken, then > >> it should not return from !need_iommu(d) in pci_clean_dpci_irqs(). > > > > The problem is that the xl toolstack has already deassigned the domain''s > > devices, using a hypercall to invoke deassign_device(), so by the time > > the domain is destroyed, pci_release_devices() can''t tell that it once > > had a PCI device passed through. > > > > It seems like the Right Thing[tm] would be for deassign_device() to find > > and undo the relevant IRQ plumbing but I couldn''t see how. Is there a > > mapping from bdf to irq in the iommu code or are they handled entirely > > separately? > > Could we make need_iommu(d) sticky? Being able to clear it doesn''t seem an > important case (such a domain is probably being torn down anyway) and > clearly it can lead to fragility. The fact that presumably we''d end up doing > unnecessary IOMMU PT work for the remaining lifetime of the domain doesn''t > seem a major downside to me.If you prefer it that way. TBH I think I prefer the other way though: things that gate on need_iommu() should be cleaned up by iommu->teardown(). --8<---- PCI passthrough: don''t tear down IOMMU when the last device is deassigned. Otherwise there''s a risk that not all iommu-related state will get torn down during domain destruction. Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> diff -r 9dbbf1631193 xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Mon Jul 25 14:21:13 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Mon Jul 25 15:14:31 2011 +0100 @@ -296,12 +296,6 @@ int deassign_device(struct domain *d, u8 return ret; } - if ( !has_arch_pdevs(d) && need_iommu(d) ) - { - d->need_iommu = 0; - hd->platform_ops->teardown(d); - } - return ret; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jul-25 14:47 UTC
Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
On 25/07/2011 15:21, "Tim Deegan" <Tim.Deegan@citrix.com> wrote:>> Could we make need_iommu(d) sticky? Being able to clear it doesn''t seem an >> important case (such a domain is probably being torn down anyway) and >> clearly it can lead to fragility. The fact that presumably we''d end up doing >> unnecessary IOMMU PT work for the remaining lifetime of the domain doesn''t >> seem a major downside to me. > > If you prefer it that way. TBH I think I prefer the other way though: > things that gate on need_iommu() should be cleaned up by > iommu->teardown().Is your original patch still proposed as a valid alternative to this one? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2011-Jul-25 15:04 UTC
Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
At 15:47 +0100 on 25 Jul (1311608879), Keir Fraser wrote:> On 25/07/2011 15:21, "Tim Deegan" <Tim.Deegan@citrix.com> wrote: > > >> Could we make need_iommu(d) sticky? Being able to clear it doesn''t seem an > >> important case (such a domain is probably being torn down anyway) and > >> clearly it can lead to fragility. The fact that presumably we''d end up doing > >> unnecessary IOMMU PT work for the remaining lifetime of the domain doesn''t > >> seem a major downside to me. > > > > If you prefer it that way. TBH I think I prefer the other way though: > > things that gate on need_iommu() should be cleaned up by > > iommu->teardown(). > > Is your original patch still proposed as a valid alternative to this one?Yes; I prefer the original. Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel