Kouya Shimura
2009-Mar-06 02:55 UTC
[Xen-devel] [PATCH][RFC] fix some spinlock issues in vmsi
Hi,

This patch fixes some spinlock issues related to changesets:
19246:9bc5799566be "passthrough MSI-X mask bit acceleration"
19263:9c5b4efc934d "hvm: passthrough MSI-X: fix ia64 link and MSI-X clean up"

Without this patch, I got:
(XEN) Assertion '_spin_is_locked(&pcidevs_lock)' failed at vmsi.c:389
or
(XEN) Xen BUG at spinlock.c:25

I'm not sure this fix is right. Please review it.

- d->arch.hvm_domain.msixtbl_list_lock is redundant.
  irq_desc->lock covers it, thus remove the spinlock.

- ASSERT(spin_is_locked(&pcidevs_lock)) is not needed.
  At first, I intended to add the enclosure of pcidevs_lock.
  But this assertion seems pointless: pcidevs_lock is for
  alldevs_list, and the msixtbl_pt_xxx functions never refer to it.

- In a "debug=y" environment, xmalloc must not be called from both the
  irq-enabled and the irq-disabled state. Otherwise, an assertion
  failure occurs in check_lock(). So, in msixtbl_pt_register, xmalloc
  is called beforehand. I think this is ugly, though.

Thanks,
Kouya

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Qing He
2009-Mar-06 05:31 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On Fri, 2009-03-06 at 10:55 +0800, Kouya Shimura wrote:
> Hi,
>
> This patch fixes some spinlock issues related to changesets:
> 19246:9bc5799566be "passthrough MSI-X mask bit acceleration"
> 19263:9c5b4efc934d "hvm: passthrough MSI-X: fix ia64 link and MSI-X clean up"
>
> Without this patch, I got:
> (XEN) Assertion '_spin_is_locked(&pcidevs_lock)' failed at vmsi.c:389
> or
> (XEN) Xen BUG at spinlock.c:25
>
> I'm not sure this fix is right. Please review it.
>
> - d->arch.hvm_domain.msixtbl_list_lock is redundant.
>   irq_desc->lock covers it, thus remove the spinlock.

No, it's not redundant. msixtbl_list_lock is there to protect msixtbl_list, which may contain entries for multiple irqs. Consider two pt_bind_irq calls happening simultaneously: irq_desc->lock wouldn't be enough to protect the list.

> - ASSERT(spin_is_locked(&pcidevs_lock)) is not needed.
>   At first, I intended to add the enclosure of pcidevs_lock.
>   But this assertion seems pointless. pcidevs_lock is
>   for alldevs_list and the msixtbl_pt_xxx functions never refer to it.

I was thinking that pcidevs_lock was already held for msixtbl_pt_xxx, but obviously I was wrong :-) However, it's not that pointless: msixtbl_pt_xxx refers to the contents of pci_dev, and the use of pcidevs_lock is intended to protect them against destructive modification, like an improper PHYSDEVOP_manage_pci_remove. It may not be necessary for a well-written device model, but that's something we can't rely on.

A similar problem also exists in msixtbl_write/msixtbl_read. Currently, these functions access the mapped MSI-X table using the pointer stored in pci_dev, and when a pci_dev is removed without the guest's knowledge, the pci_dev along with the fixmaps will be removed, which will definitely frustrate these functions. Unfortunately, adding spinlocks to msixtbl_{read,write} seems quite expensive, especially for multiple guest threads; that's why an RCU-based lock is used here.

For a completely secure handling of this, maybe we need a less brutal, asynchronous pci_remove hypercall or something.

> - In a "debug=y" environment, xmalloc must not be called from both the
>   irq-enabled and the irq-disabled state. Otherwise, an assertion
>   failure occurs in check_lock(). So, in msixtbl_pt_register, xmalloc
>   is called beforehand. I think this is ugly, though.

This is fine with me, thank you for pointing it out.

Thanks,
Qing
Kouya Shimura
2009-Mar-06 07:31 UTC
Re: [Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
Hi Qing,

Thank you for the review and elaboration.

Qing He writes:
> > - d->arch.hvm_domain.msixtbl_list_lock is redundant.
> >   irq_desc->lock covers it, thus remove the spinlock.
>
> No, it's not redundant. msixtbl_list_lock is there to protect
> msixtbl_list, which may contain entries for multiple irqs. Consider
> two pt_bind_irq calls happening simultaneously: irq_desc->lock
> wouldn't be enough to protect the list.

I see, although two pt_bind_irq calls are hardly ever simultaneous.

> > - ASSERT(spin_is_locked(&pcidevs_lock)) is not needed.
> >   At first, I intended to add the enclosure of pcidevs_lock.
> >   But this assertion seems pointless. pcidevs_lock is
> >   for alldevs_list and the msixtbl_pt_xxx functions never refer to it.
>
> I was thinking that pcidevs_lock was already held for msixtbl_pt_xxx,
> but obviously I was wrong :-) However, it's not that pointless:
> msixtbl_pt_xxx refers to the contents of pci_dev, and the use of
> pcidevs_lock is intended to protect them against destructive
> modification, like an improper PHYSDEVOP_manage_pci_remove. It may not
> be necessary for a well-written device model, but that's something we
> can't rely on.

Okay, one more question. If ASSERT(spin_is_locked(&pcidevs_lock)) is needed, does msixtbl_list_lock become redundant? (i.e. pcidevs_lock must cover it.) Anyway, msixtbl_list_lock had better remain for the future...

Thanks,
Kouya
Keir Fraser
2009-Mar-06 08:28 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On 06/03/2009 05:31, "Qing He" <qing.he@intel.com> wrote:
> A similar problem also exists in msixtbl_write/msixtbl_read. Currently,
> these functions access the mapped MSI-X table using the pointer stored
> in pci_dev, and when a pci_dev is removed without the guest's
> knowledge, the pci_dev along with the fixmaps will be removed, which
> will definitely frustrate these functions. Unfortunately, adding
> spinlocks to msixtbl_{read,write} seems quite expensive, especially for
> multiple guest threads; that's why an RCU-based lock is used here. For
> a completely secure handling of this, maybe we need a less brutal,
> asynchronous pci_remove hypercall or something.

Add the locks, making it correct. Then optimise later as necessary, using RCU or (perhaps easier) just smaller locks: per MSI-X, per pci_dev, or whatever. Throwing in racy code condemns us to subtle races forever more.

 -- Keir
Qing He
2009-Mar-06 09:21 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On Fri, 2009-03-06 at 16:28 +0800, Keir Fraser wrote:
> Add the locks, making it correct. Then optimise later as necessary,
> using RCU or (perhaps easier) just smaller locks: per MSI-X, per
> pci_dev, or whatever. Throwing in racy code condemns us to subtle
> races forever more.

Ok, I'll make a patch to use the locks.

Do you think it's OK to assume that the guest has already freed the device when remove_pci_device is called on hotplug?

Thanks,
Qing
Keir Fraser
2009-Mar-06 12:50 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On 06/03/2009 09:21, "Qing He" <qing.he@intel.com> wrote:
>> Add the locks, making it correct. Then optimise later as necessary,
>> using RCU or (perhaps easier) just smaller locks: per MSI-X, per
>> pci_dev, or whatever. Throwing in racy code condemns us to subtle
>> races forever more.
>
> Ok, I'll make a patch to use the locks.
>
> Do you think it's OK to assume that the guest has already freed the
> device when remove_pci_device is called on hotplug?

Do you mean deinitialised and unloaded the device driver?

 -- Keir
Qing He
2009-Mar-06 17:43 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On Fri, 2009-03-06 at 20:50 +0800, Keir Fraser wrote:
> On 06/03/2009 09:21, "Qing He" <qing.he@intel.com> wrote:
> > Ok, I'll make a patch to use the locks.
> >
> > Do you think it's OK to assume that the guest has already freed the
> > device when remove_pci_device is called on hotplug?
>
> Do you mean deinitialised and unloaded the device driver?

Yes, the race we just talked about happens here. Consider the following case: when remove_pci_device is called, the device data structure is removed by the hypervisor. But there is no guarantee in the hypervisor that the guest already knows about this; for example, the guest may continue to access the device's resources. It's in this case that msixtbl_{read,write} has a small race.

I don't think this scenario is desirable, but does it happen in the current logic?

Thanks,
Qing
Keir Fraser
2009-Mar-06 18:12 UTC
[Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On 06/03/2009 17:43, "Qing He" <qing.he@intel.com> wrote:
>> Do you mean deinitialised and unloaded the device driver?
>
> Yes, the race we just talked about happens here. Consider the
> following case: when remove_pci_device is called, the device data
> structure is removed by the hypervisor. But there is no guarantee in
> the hypervisor that the guest already knows about this; for example,
> the guest may continue to access the device's resources. It's in this
> case that msixtbl_{read,write} has a small race.
>
> I don't think this scenario is desirable, but does it happen in the
> current logic?

Uh, well, since dom0 is in fact mostly in charge of PCI access, there's not much we can do in Xen to enforce this. At that level we kind of have to assume that dom0 has sequenced hotplug/hot-unplug sensibly.

 -- Keir
Jiang, Yunhong
2009-Mar-09 03:19 UTC
[Xen-devel] RE: [PATCH][RFC] fix some spinlock issues in vmsi
I was also confused by this when I tried to work on the pci_dev lock. The current implementation doesn't assume the device is not owned by a guest; that is, the IOMMU and MSI may both still be enabled for a guest when dom0 invokes the PHYSDEVOP_manage_pci_remove hypercall. This causes a lot of race conditions and makes PCI pass-through very complex.

Maybe we can simplify things in two steps:

a) Instead of letting the Xen HV handle the situation where the device
   is still owned by another guest, we can have pciback do that before
   this hypercall is invoked. For a PV driver domain this will be
   simple, since dom0's kernel should invoke pciback's remove()
   function. For an HVM domain, I think it should be OK in a
   stub-domain environment. This will make the HV simpler and more
   robust, and doesn't make things worse.

b) Find some mechanism by which dom0 can trigger a virtual hotplug
   event in the guest, so that hotplug on the host side will not cause
   trouble for the guest. There is virtual hotplug for PV guests, but
   it is an async method; I'm not sure whether that can be integrated
   with dom0's hotplug event handling. Not sure if this is feasible.

CC'ing Esp and Winston also.

Thanks,
Yunhong Jiang

xen-devel-bounces@lists.xensource.com <> wrote:
> On 06/03/2009 17:43, "Qing He" <qing.he@intel.com> wrote:
> > I don't think this scenario is desirable, but does it happen in the
> > current logic?
>
> Uh, well, since dom0 is in fact mostly in charge of PCI access,
> there's not much we can do in Xen to enforce this. At that level we
> kind of have to assume that dom0 has sequenced hotplug/hot-unplug
> sensibly.
>
> -- Keir
Qing He
2009-Mar-10 01:11 UTC
Re: [Xen-devel] Re: [PATCH][RFC] fix some spinlock issues in vmsi
On Fri, 2009-03-06 at 15:31 +0800, Kouya Shimura wrote:
> Qing He writes:
> > No, it's not redundant. msixtbl_list_lock is there to protect
> > msixtbl_list, which may contain entries for multiple irqs. Consider
> > two pt_bind_irq calls happening simultaneously: irq_desc->lock
> > wouldn't be enough to protect the list.
>
> I see, although two pt_bind_irq calls are hardly ever simultaneous.

The problem here, as in many other places, is whether we can safely assume that dom0's operations are completely within expectation.

> > I was thinking that pcidevs_lock was already held for msixtbl_pt_xxx,
> > but obviously I was wrong :-) However, it's not that pointless:
> > msixtbl_pt_xxx refers to the contents of pci_dev, and the use of
> > pcidevs_lock is intended to protect them against destructive
> > modification, like an improper PHYSDEVOP_manage_pci_remove. It may
> > not be necessary for a well-written device model, but that's
> > something we can't rely on.
>
> Okay, one more question. If ASSERT(spin_is_locked(&pcidevs_lock)) is
> needed, does msixtbl_list_lock become redundant? (i.e. pcidevs_lock
> must cover it.) Anyway, msixtbl_list_lock had better remain for the
> future...

I'm OK with that, though I think msixtbl_list_lock is closer to the list. Are you going to use something named like this?

Thanks,
Qing