Hi, Jan and Andres,

We have hit a deadlock between p2m_lock and event_lock in Xen. The issue
appeared with the patch series "Synchronized p2m lookups" (Xen changeset
24770 and later); the deadlock can be triggered in a specific case:
assigning a PCIe device with MSI-X capability to an HVM guest (with EPT).
I've dumped all processor registers when dom0 hangs (log attached). The
deadlock happens as follows:

====CPU0====
map_domain_pirq()                   Grab event_lock
  /
pci_enable_msi()
  /
msix_capability_init()
  /
p2m_change_entry_type_global()      Trying to acquire p2m_lock

====CPU9====
hvm_hap_nested_page_fault() -> get_gfn_type_access()   Grab p2m_lock
  /
handle_mmio()
  /
...
  /
notify_via_xen_event_channel()      Trying to acquire event_lock

The event_lock is used everywhere in Xen, so I only have a workaround patch
to propose, not a final fix. Any good suggestions?

diff -r f61120046915 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Wed Mar 07 11:50:31 2012 +0100
+++ b/xen/arch/x86/irq.c	Sat Mar 10 02:06:18 2012 +0800
@@ -1875,10 +1875,12 @@ int map_domain_pirq(
         if ( !cpu_has_apic )
             goto done;
 
+        spin_unlock(&d->event_lock);
         pdev = pci_get_pdev(msi->seg, msi->bus, msi->devfn);
         ret = pci_enable_msi(msi, &msi_desc);
         if ( ret )
             goto done;
+        spin_lock(&d->event_lock);
 
         spin_lock_irqsave(&desc->lock, flags);

Best Regards,
Xudong Hao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
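The two traces above form a classic ABBA lock-ordering inversion: CPU0 holds event_lock and wants p2m_lock, while CPU9 holds p2m_lock and wants event_lock. A minimal illustration of why a fixed global acquisition order removes the cycle, written in plain pthreads (this is not Xen code; all names are invented for the example):

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for event_lock and p2m_lock. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* Always acquire the two locks in one fixed global order (by address
 * here).  If every path does this, the ABBA cycle seen in the traces
 * above cannot form, because no thread ever holds the "later" lock
 * while waiting for the "earlier" one. */
static void ordered_double_lock(pthread_mutex_t *x, pthread_mutex_t *y)
{
    if (x > y) { pthread_mutex_t *t = x; x = y; y = t; }
    pthread_mutex_lock(x);
    pthread_mutex_lock(y);
}

static void ordered_double_unlock(pthread_mutex_t *x, pthread_mutex_t *y)
{
    pthread_mutex_unlock(x);
    pthread_mutex_unlock(y);
}

/* Both workers want both locks; one conceptually "maps a pirq", the
 * other "handles a page fault".  Each goes through the ordered helper,
 * so the run always terminates. */
static void *worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < 1000; i++) {
        ordered_double_lock(&lock_a, &lock_b);
        shared_counter++;
        ordered_double_unlock(&lock_a, &lock_b);
    }
    return NULL;
}

int run_demo(void)
{
    pthread_t t1, t2;
    shared_counter = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

The Xen case is harder than this sketch because the two acquisitions live in different subsystems, which is why the thread below looks at breaking one of the two hold-and-wait paths instead.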
Hi,

At 10:58 +0000 on 09 Mar (1331290728), Hao, Xudong wrote:
> ====CPU0====
> map_domain_pirq()                   Grab event_lock
>   ...
> p2m_change_entry_type_global()      Trying to acquire p2m_lock
>
> ====CPU9====
> hvm_hap_nested_page_fault() -> get_gfn_type_access()   Grab p2m_lock
>   ...
> notify_via_xen_event_channel()      Trying to acquire event_lock
>
> The event_lock is used everywhere in Xen, so I only have a workaround
> patch to propose, not a final fix. Any good suggestions?
> [...]

I don't know about the event lock, but it seems unwise to call in to
handle_mmio() with a gfn lock held.  How about fixing the other path?

diff -r 04673ecb9d78 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c	Thu Mar 08 16:40:05 2012 +0000
+++ b/xen/arch/x86/hvm/hvm.c	Fri Mar 09 11:15:25 2012 +0000
@@ -1324,10 +1324,11 @@ int hvm_hap_nested_page_fault(unsigned l
     if ( (p2mt == p2m_mmio_dm) ||
          (access_w && (p2mt == p2m_ram_ro)) )
     {
+        put_gfn(p2m->domain, gfn);
         if ( !handle_mmio() )
             hvm_inject_exception(TRAP_gp_fault, 0, 0);
         rc = 1;
-        goto out_put_gfn;
+        goto out;
     }
 
 #ifdef __x86_64__
@@ -1379,6 +1380,7 @@ int hvm_hap_nested_page_fault(unsigned l
 
 out_put_gfn:
     put_gfn(p2m->domain, gfn);
+out:
     if ( paged )
         p2m_mem_paging_populate(v->domain, gfn);
     if ( req_ptr )
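Tim's patch applies the general release-before-callout rule: finish the work that genuinely needs the fine-grained lock, drop it, and only then enter a path that may take unrelated locks. A compact sketch of the shape of that transformation (not Xen code; the names and the callback are invented):

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for the gfn/p2m lock. */
static pthread_mutex_t fine_lock = PTHREAD_MUTEX_INITIALIZER;

/* A callout that may itself take other locks, in the spirit of
 * handle_mmio() -> notify_via_xen_event_channel(). */
static int callout(int v)
{
    return v + 1;
}

/* Before the fix, the callout ran inside the critical section.  After:
 * snapshot what we need under the lock, release it, then make the
 * callout with no locks held, so no ordering cycle is possible. */
int fault_handler(int state)
{
    int snapshot;

    pthread_mutex_lock(&fine_lock);
    snapshot = state * 2;              /* work that needs the lock */
    pthread_mutex_unlock(&fine_lock);  /* release before calling out */

    return callout(snapshot);
}
```

The cost of the pattern is that the snapshot may be stale by the time the callout runs; as the rest of the thread shows, Xen compensates by pinning the page with a reference instead of holding the lock.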
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: Friday, March 09, 2012 7:20 PM
> To: Hao, Xudong
> Cc: JBeulich@suse.com; Andres Lagar-Cavilla; xen-devel@lists.xensource.com;
> Keir Fraser; Zhang, Xiantao
> Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock
>
> I don't know about the event lock, but it seems unwise to call in to
> handle_mmio() with a gfn lock held.  How about fixing the other path?
> [...]

Yes, that's fine to release the p2m lock earlier than handle_mmio().
>> I don't know about the event lock, but it seems unwise to call in to
>> handle_mmio() with a gfn lock held.  How about fixing the other path?
>> [...]
>
> Yes, that's fine to release the p2m lock earlier than handle_mmio().

Ack

Thanks,
Andres
At 08:29 -0800 on 09 Mar (1331281767), Andres Lagar-Cavilla wrote:
> >> I don't know about the event lock, but it seems unwise to call in to
> >> handle_mmio() with a gfn lock held.  How about fixing the other path?
> >> [...]
> >
> > Yes, that's fine to release the p2m lock earlier than handle_mmio().
>
> Ack

OK, applied.

Tim.
Hi, Tim and Andres,

The patch fixes part of this issue. Inside handle_mmio(), hvmemul_do_io()
is called, and the p2m lock is taken again via get_gfn_unshare(), so the
deadlock can still trigger:

(XEN) Xen call trace:
(XEN)    [<ffff82c4801261a3>] _spin_lock+0x1b/0xa8
(XEN)    [<ffff82c4801070d3>] notify_via_xen_event_channel+0x21/0x106
(XEN)    [<ffff82c4801b6883>] hvm_buffered_io_send+0x1f1/0x21b
(XEN)    [<ffff82c4801bbd3a>] stdvga_intercept_mmio+0x491/0x4c7
(XEN)    [<ffff82c4801b5d58>] hvm_io_intercept+0x218/0x244
(XEN)    [<ffff82c4801aa931>] hvmemul_do_io+0x55a/0x716
(XEN)    [<ffff82c4801aab1a>] hvmemul_do_mmio+0x2d/0x2f
(XEN)    [<ffff82c4801ab239>] hvmemul_write+0x181/0x1a2
(XEN)    [<ffff82c4801963f0>] x86_emulate+0xcad3/0xfbdf
(XEN)    [<ffff82c4801a9d2e>] hvm_emulate_one+0x120/0x1af
(XEN)    [<ffff82c4801b63cb>] handle_mmio+0x4e/0x1d1
(XEN)    [<ffff82c4801afd72>] hvm_hap_nested_page_fault+0x210/0x37f
(XEN)    [<ffff82c4801d2419>] vmx_vmexit_handler+0x1523/0x17d0

Thanks,
-Xudong

> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: Saturday, March 10, 2012 12:56 AM
> Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock
> [...]
>
> OK, applied.
>
> Tim.
> Hi, Tim and Andres,
> The patch fixes part of this issue. Inside handle_mmio(), hvmemul_do_io()
> is called, and the p2m lock is taken again via get_gfn_unshare(), so the
> deadlock can still trigger:
> [call trace trimmed]

Thanks for the report, very useful. I'll look into it.

Andres
> Hi, Tim and Andres,
> The patch fixes part of this issue. Inside handle_mmio(), hvmemul_do_io()
> is called, and the p2m lock is taken again via get_gfn_unshare(), so the
> deadlock can still trigger.

I have a question before I dive into lock untangling:

msix_capability_init ->
    p2m_change_entry_type_global(dev->domain, p2m_mmio_direct, p2m_mmio_direct);

Huh?  This achieves ... nothing.  Almost.  It flushes a bunch of TLBs, but
that can be done with significantly less effort.  Am I missing something?

Andres
> Hi, Tim and Andres,
> The patch fixes part of this issue. Inside handle_mmio(), hvmemul_do_io()
> is called, and the p2m lock is taken again via get_gfn_unshare(), so the
> deadlock can still trigger.

Typically hvmemul_do_io() gets the zero gfn, because in many cases that's
the 'ram_gpa' it is passed.  However, in the case of mmio, and particularly
stdvga, ram_gpa is the data to be copied to the framebuffer.  So it is in
principle ok to get_gfn in hvmemul_do_io().

There are two solutions:

1. msix_capability_init() does not call p2m_change_entry_type_global().
   See my previous email.  If we want to resync the
   EPT/NPT/traditional/VTD/IOMMU/superduper TLBs, we can just do that
   explicitly.  I hope.

2. hvmemul_do_io() gets a ref on the mfn underlying ram_gpa, and holds
   that for the critical section, instead of the p2m lock.  One way to
   achieve this is:

    /* Check for paged out page */
    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
    if ( this or that )
    { ... handle ... }
    if ( mfn_valid(ram_mfn) )
        get_page(mfn_to_page(ram_mfn), curr->domain);
    put_gfn(curr->domain, ram_gfn);

    /* replace all put_gfn in all exit paths by put_page */

This will ensure the target page is live and sane while not holding the
p2m lock.  Xudong, did that make sense?  Do you think you could try that
and report back?

Thanks!
Andres
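Solution 2 above is the classic "convert a lock into a reference" pattern: hold the p2m lock only long enough to pin the page with a refcount, then drop the lock before the long emulation path runs. A generic sketch of that pattern (not Xen code; `page_t`, `pin_page`, and the rest are invented stand-ins for struct page_info, get_page, and put_page):

```c
#include <assert.h>
#include <pthread.h>

/* Toy page with a reference count, standing in for struct page_info. */
typedef struct {
    pthread_mutex_t lock;  /* stands in for the p2m lock       */
    int refcount;          /* stands in for the page refcount  */
    int data;
} page_t;

/* Take the lock only long enough to grab a reference ("get_page"),
 * then drop it.  The caller may now use the page without holding a
 * lock that other subsystems also take. */
static void pin_page(page_t *pg)
{
    pthread_mutex_lock(&pg->lock);
    pg->refcount++;
    pthread_mutex_unlock(&pg->lock);
}

static void unpin_page(page_t *pg)
{
    pthread_mutex_lock(&pg->lock);
    pg->refcount--;
    pthread_mutex_unlock(&pg->lock);
}

/* The long operation (think handle_mmio/hvmemul_do_io) runs with only
 * a reference held; the reference keeps the page live, while leaving
 * the lock free for other paths. */
static int emulate_with_pinned_page(page_t *pg)
{
    int v;
    pin_page(pg);
    v = pg->data * 2;   /* "long" work, lock not held */
    unpin_page(pg);
    return v;
}

/* Small driver: returns 1 when the emulation result is as expected and
 * the reference count is balanced afterwards. */
int demo(void)
{
    page_t pg = { PTHREAD_MUTEX_INITIALIZER, 0, 21 };
    int v = emulate_with_pinned_page(&pg);
    return (v == 42 && pg.refcount == 0) ? 1 : 0;
}
```

The essential invariant is the one Andres states: the page stays live and sane for the whole critical section, but nothing else in the system is blocked on the p2m lock meanwhile.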
I prefer the 2nd; I made a patch and testing shows it works.

diff -r 5d20d2f6ffed xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c	Fri Mar 09 16:54:24 2012 +0000
+++ b/xen/arch/x86/hvm/emulate.c	Wed Mar 14 15:11:52 2012 -0400
@@ -60,20 +60,23 @@
     ioreq_t *p = get_ioreq(curr);
     unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
     p2m_type_t p2mt;
-    mfn_t ram_mfn;
+    unsigned long ram_mfn;
     int rc;
 
     /* Check for paged out page */
-    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
+    ram_mfn = mfn_x(get_gfn_unshare(curr->domain, ram_gfn, &p2mt));
+    if ( mfn_valid(ram_mfn) )
+        get_page(mfn_to_page(ram_mfn), curr->domain);
+    put_gfn(curr->domain, ram_gfn);
     if ( p2m_is_paging(p2mt) )
     {
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         p2m_mem_paging_populate(curr->domain, ram_gfn);
         return X86EMUL_RETRY;
     }
     if ( p2m_is_shared(p2mt) )
     {
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         return X86EMUL_RETRY;
     }
 
@@ -87,7 +90,7 @@
         ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
         if ( dir == IOREQ_READ )
             memset(p_data, ~0, size);
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -108,7 +111,7 @@
             unsigned int bytes = vio->mmio_large_write_bytes;
             if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
             {
-                put_gfn(curr->domain, ram_gfn);
+                put_page(mfn_to_page(ram_mfn));
                 return X86EMUL_OKAY;
             }
         }
@@ -120,7 +123,7 @@
             {
                 memcpy(p_data, &vio->mmio_large_read[addr - pa],
                        size);
-                put_gfn(curr->domain, ram_gfn);
+                put_page(mfn_to_page(ram_mfn));
                 return X86EMUL_OKAY;
             }
         }
@@ -134,7 +137,7 @@
         vio->io_state = HVMIO_none;
         if ( p_data == NULL )
         {
-            put_gfn(curr->domain, ram_gfn);
+            put_page(mfn_to_page(ram_mfn));
             return X86EMUL_UNHANDLEABLE;
         }
         goto finish_access;
@@ -144,11 +147,11 @@
              (addr == (vio->mmio_large_write_pa +
                        vio->mmio_large_write_bytes)) )
         {
-            put_gfn(curr->domain, ram_gfn);
+            put_page(mfn_to_page(ram_mfn));
            return X86EMUL_RETRY;
         }
     default:
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -156,7 +159,7 @@
     {
         gdprintk(XENLOG_WARNING, "WARNING: io already pending (%d)?\n",
                  p->state);
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         return X86EMUL_UNHANDLEABLE;
     }
 
@@ -208,7 +211,7 @@
 
     if ( rc != X86EMUL_OKAY )
     {
-        put_gfn(curr->domain, ram_gfn);
+        put_page(mfn_to_page(ram_mfn));
         return rc;
     }
 
@@ -244,7 +247,7 @@
         }
     }
 
-    put_gfn(curr->domain, ram_gfn);
+    put_page(mfn_to_page(ram_mfn));
     return X86EMUL_OKAY;
 }

Thanks,
-Xudong

> -----Original Message-----
> From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org]
> Sent: Wednesday, March 14, 2012 2:46 AM
> Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock
>
> There are two solutions:
> 1. msix_capability_init() does not call p2m_change_entry_type_global().
> 2. hvmemul_do_io() gets a ref on the mfn underlying ram_gpa, and holds
>    that for the critical section, instead of the p2m lock.
> [...]
The get_page() and put_page() should be used in pairs. You cannot call put_page() separately when get_page() is not called before. best regards yang> -----Original Message----- > From: xen-devel-bounces@lists.xen.org > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Hao, Xudong > Sent: Wednesday, March 14, 2012 3:13 PM > To: andres@lagarcavilla.org > Cc: Keir Fraser; xen-devel@lists.xensource.com; Tim Deegan; JBeulich@suse.com; > Zhang, Xiantao > Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock > > I prefer to the 2nd, I made a patch and testing show it works. > > diff -r 5d20d2f6ffed xen/arch/x86/hvm/emulate.c > --- a/xen/arch/x86/hvm/emulate.c Fri Mar 09 16:54:24 2012 +0000 > +++ b/xen/arch/x86/hvm/emulate.c Wed Mar 14 15:11:52 2012 -0400 > @@ -60,20 +60,23 @@ > ioreq_t *p = get_ioreq(curr); > unsigned long ram_gfn = paddr_to_pfn(ram_gpa); > p2m_type_t p2mt; > - mfn_t ram_mfn; > + unsigned long ram_mfn; > int rc; > > /* Check for paged out page */ > - ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt); > + ram_mfn = mfn_x(get_gfn_unshare(curr->domain, ram_gfn, &p2mt)); > + if ( mfn_valid(ram_mfn) ) > + get_page(mfn_to_page(ram_mfn), curr->domain); > + put_gfn(curr->domain, ram_gfn); > if ( p2m_is_paging(p2mt) ) > { > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > p2m_mem_paging_populate(curr->domain, ram_gfn); > return X86EMUL_RETRY; > } > if ( p2m_is_shared(p2mt) ) > { > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_RETRY; > } > > @@ -87,7 +90,7 @@ > ASSERT(p_data != NULL); /* cannot happen with a REP prefix */ > if ( dir == IOREQ_READ ) > memset(p_data, ~0, size); > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_UNHANDLEABLE; > } > > @@ -108,7 +111,7 @@ > unsigned int bytes = vio->mmio_large_write_bytes; > if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) ) > { > - put_gfn(curr->domain, ram_gfn); > + 
put_page(mfn_to_page(ram_mfn)); > return X86EMUL_OKAY; > } > } > @@ -120,7 +123,7 @@ > { > memcpy(p_data, &vio->mmio_large_read[addr - pa], > size); > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_OKAY; > } > } > @@ -134,7 +137,7 @@ > vio->io_state = HVMIO_none; > if ( p_data == NULL ) > { > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_UNHANDLEABLE; > } > goto finish_access; > @@ -144,11 +147,11 @@ > (addr == (vio->mmio_large_write_pa + > vio->mmio_large_write_bytes)) ) > { > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_RETRY; > } > default: > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_UNHANDLEABLE; > } > > @@ -156,7 +159,7 @@ > { > gdprintk(XENLOG_WARNING, "WARNING: io already pending > (%d)?\n", > p->state); > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_UNHANDLEABLE; > } > > @@ -208,7 +211,7 @@ > > if ( rc != X86EMUL_OKAY ) > { > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return rc; > } > > @@ -244,7 +247,7 @@ > } > } > > - put_gfn(curr->domain, ram_gfn); > + put_page(mfn_to_page(ram_mfn)); > return X86EMUL_OKAY; > } > > > Thanks, > -Xudong > > > > -----Original Message----- > > From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org] > > Sent: Wednesday, March 14, 2012 2:46 AM > > To: Hao, Xudong > > Cc: Tim Deegan; Keir Fraser; xen-devel@lists.xensource.com; Zhang, Xiantao; > > JBeulich@suse.com > > Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock > > > > > Hi, Tim and Andres > > > The patch fix part of this issue. In handle_mmio, function > > > hvmemul_do_io() is called and p2m lock was held again by calling > > > get_gfn_unshare(), still trigger a deadlocks. > > > > Typically hvmemul_do_io gets the zero gfn, because in many cases that''s the > > ''rma_gpa'' it is passed. 
However, in the case of mmio, and particularly stdvga, > > ram_gpa is the data to be copied to the framebuffer. So it is in principle ok to > > get_gfn in hvmemul_do_io. > > > > There are two solutions > > 1. msix_capability_init does not call p2m_change_entry_type_global. See my > > previous email. If we want to resync the > > EPT/NPT/traditional/VTD/IOMMU/superduper TLBs, we can just do that > > explicitly. I hope. > > > > 2. hvmemul_do_io does gets a ref on the mfn underlying ram_gpa, and holds > > that for the critical section, instead of the p2m lock. One way to achieve this is > > > > /* Check for paged out page */ > > ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt); > > if ( this or that ) > > { ... handle ... } > > if ( mfn_valid(ram_mfn) ) > > get_page(mfn_to_page(ram_mfn, curr->domain)); > > put_gfn(curr->domain, ram_gfn) > > > > /* replace all put_gfn in all exit paths by put_page */ > > > > This will ensure the target page is live and sane while not holding the p2m lock. > > Xudong, did that make sense? Do you think you could try that and report back? > > > > Thanks! 
> > Andres
> >
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c4801261a3>] _spin_lock+0x1b/0xa8
> > (XEN)    [<ffff82c4801070d3>] notify_via_xen_event_channel+0x21/0x106
> > (XEN)    [<ffff82c4801b6883>] hvm_buffered_io_send+0x1f1/0x21b
> > (XEN)    [<ffff82c4801bbd3a>] stdvga_intercept_mmio+0x491/0x4c7
> > (XEN)    [<ffff82c4801b5d58>] hvm_io_intercept+0x218/0x244
> > (XEN)    [<ffff82c4801aa931>] hvmemul_do_io+0x55a/0x716
> > (XEN)    [<ffff82c4801aab1a>] hvmemul_do_mmio+0x2d/0x2f
> > (XEN)    [<ffff82c4801ab239>] hvmemul_write+0x181/0x1a2
> > (XEN)    [<ffff82c4801963f0>] x86_emulate+0xcad3/0xfbdf
> > (XEN)    [<ffff82c4801a9d2e>] hvm_emulate_one+0x120/0x1af
> > (XEN)    [<ffff82c4801b63cb>] handle_mmio+0x4e/0x1d1
> > (XEN)    [<ffff82c4801afd72>] hvm_hap_nested_page_fault+0x210/0x37f
> > (XEN)    [<ffff82c4801d2419>] vmx_vmexit_handler+0x1523/0x17d0
> >
> > Thanks,
> > -Xudong
> >
> > >> -----Original Message-----
> > >> From: Tim Deegan [mailto:tim@xen.org]
> > >> Sent: Saturday, March 10, 2012 12:56 AM
> > >> To: Andres Lagar-Cavilla
> > >> Cc: Hao, Xudong; Keir Fraser; xen-devel@lists.xensource.com; Zhang,
> > >> Xiantao; JBeulich@suse.com
> > >> Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock
> > >>
> > >> At 08:29 -0800 on 09 Mar (1331281767), Andres Lagar-Cavilla wrote:
> > >> > >> I don't know about the event lock, but it seems unwise to call
> > >> > >> in to handle_mmio with a gfn lock held. How about fixing the
> > >> > >> other path?
> > >>
> > >> diff -r 04673ecb9d78 xen/arch/x86/hvm/hvm.c
> > >> --- a/xen/arch/x86/hvm/hvm.c Thu Mar 08 16:40:05 2012 +0000
> > >> +++ b/xen/arch/x86/hvm/hvm.c Fri Mar 09 11:15:25 2012 +0000
> > >> @@ -1324,10 +1324,11 @@ int hvm_hap_nested_page_fault(unsigned l
> > >>      if ( (p2mt == p2m_mmio_dm) ||
> > >>           (access_w && (p2mt == p2m_ram_ro)) )
> > >>      {
> > >> +        put_gfn(p2m->domain, gfn);
> > >>          if ( !handle_mmio() )
> > >>              hvm_inject_exception(TRAP_gp_fault, 0, 0);
> > >>          rc = 1;
> > >> -        goto out_put_gfn;
> > >> +        goto out;
> > >>      }
> > >>
> > >>  #ifdef __x86_64__
> > >> @@ -1379,6 +1380,7 @@ int hvm_hap_nested_page_fault(unsigned l
> > >>
> > >>  out_put_gfn:
> > >>      put_gfn(p2m->domain, gfn);
> > >> +out:
> > >>      if ( paged )
> > >>          p2m_mem_paging_populate(v->domain, gfn);
> > >>      if ( req_ptr )
> > >> > >
> > >> > > Yes, that's fine to release the p2m lock earlier than handle_mmio.
> > >> >
> > >> > Ack
> > >>
> > >> OK, applied.
> > >>
> > >> Tim.
>>> On 13.03.12 at 19:26, "Andres Lagar-Cavilla" <andres@lagarcavilla.org> wrote:
>> Hi, Tim and Andres
>> The patch fix part of this issue. In handle_mmio, function hvmemul_do_io()
>> is called and p2m lock was held again by calling get_gfn_unshare(), still
>> trigger a deadlocks.
>
> I have a question before I dive into lock untangling
>
> msix_capability_init ->
> p2m_change_entry_type_global(dev->domain, p2m_mmio_direct, p2m_mmio_direct);
>
> Huh? This achieves ... nothing. Almost. It flushes a bunch of TLBs, but
> that can be done with significantly less effort. Am I missing something?

Yes - the purpose of this isn't to flush any TLBs, but to enforce the
immediately preceding addition to the mmio_ro_ranges range set.

Jan
>>>> On 13.03.12 at 19:26, "Andres Lagar-Cavilla" <andres@lagarcavilla.org>
>>>> wrote:
>>> Hi, Tim and Andres
>>> The patch fix part of this issue. In handle_mmio, function hvmemul_do_io()
>>> is called and p2m lock was held again by calling get_gfn_unshare(), still
>>> trigger a deadlocks.
>>
>> I have a question before I dive into lock untangling
>>
>> msix_capability_init ->
>> p2m_change_entry_type_global(dev->domain, p2m_mmio_direct,
>> p2m_mmio_direct);
>>
>> Huh? This achieves ... nothing. Almost. It flushes a bunch of TLBs, but
>> that can be done with significantly less effort. Am I missing something?
>
> Yes - the purpose of this isn't to flush any TLBs, but to enforce the
> immediately preceding addition to the mmio_ro_ranges range set.

Because p2m entries of type mmio_direct have their permissions (re)computed
as a function of rangesets. Got it. Thanks.

Andres

> Jan
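The mechanism Jan describes can be sketched in miniature. Everything below is illustrative only — made-up types, not Xen's real rangeset or p2m API — but it shows the idea: the effective permission of a direct-MMIO frame is a pure function of the read-only range set, so re-running the "type change" is what forces a freshly added range to take effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's rangeset (inclusive bounds). */
struct range { uint64_t s, e; };
struct rangeset { const struct range *r; unsigned int nr; };

static bool rangeset_contains(const struct rangeset *rs, uint64_t frame)
{
    for ( unsigned int i = 0; i < rs->nr; i++ )
        if ( frame >= rs->r[i].s && frame <= rs->r[i].e )
            return true;
    return false;
}

enum access { ACCESS_RW, ACCESS_RO };

/* Effective access for a direct-MMIO frame: read-only iff the frame
 * sits in the read-only range set (the mmio_ro_ranges idea). The
 * p2m entry keeps its mmio_direct type; only the permission derived
 * from the range set changes when entries are recomputed. */
enum access mmio_direct_access(const struct rangeset *mmio_ro,
                               uint64_t frame)
{
    return rangeset_contains(mmio_ro, frame) ? ACCESS_RO : ACCESS_RW;
}
```

In this model, adding a frame to the read-only range set has no visible effect until the entries are recomputed — which is what the seemingly no-op p2m_mmio_direct -> p2m_mmio_direct change triggers.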
> I prefer to the 2nd, I made a patch and testing show it works.

Not only is it preferable, as per Jan's email, it is the only way to go.
Thanks for putting together the patch and testing it. However, let me rework
it a bit (I'll add your Signed-off). If the new version works well, then
let's request to get it applied.

Cheers
Andres

> diff -r 5d20d2f6ffed xen/arch/x86/hvm/emulate.c
> --- a/xen/arch/x86/hvm/emulate.c Fri Mar 09 16:54:24 2012 +0000
> +++ b/xen/arch/x86/hvm/emulate.c Wed Mar 14 15:11:52 2012 -0400
> @@ -60,20 +60,23 @@
>      ioreq_t *p = get_ioreq(curr);
>      unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
>      p2m_type_t p2mt;
> -    mfn_t ram_mfn;
> +    unsigned long ram_mfn;
>      int rc;
>
>      /* Check for paged out page */
> -    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
> +    ram_mfn = mfn_x(get_gfn_unshare(curr->domain, ram_gfn, &p2mt));
> +    if ( mfn_valid(ram_mfn) )
> +        get_page(mfn_to_page(ram_mfn), curr->domain);
> +    put_gfn(curr->domain, ram_gfn);
>      if ( p2m_is_paging(p2mt) )
>      {
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          p2m_mem_paging_populate(curr->domain, ram_gfn);
>          return X86EMUL_RETRY;
>      }
>      if ( p2m_is_shared(p2mt) )
>      {
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          return X86EMUL_RETRY;
>      }
>
> @@ -87,7 +90,7 @@
>          ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
>          if ( dir == IOREQ_READ )
>              memset(p_data, ~0, size);
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          return X86EMUL_UNHANDLEABLE;
>      }
>
> @@ -108,7 +111,7 @@
>          unsigned int bytes = vio->mmio_large_write_bytes;
>          if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
>          {
> -            put_gfn(curr->domain, ram_gfn);
> +            put_page(mfn_to_page(ram_mfn));
>              return X86EMUL_OKAY;
>          }
>      }
> @@ -120,7 +123,7 @@
>          {
>              memcpy(p_data, &vio->mmio_large_read[addr - pa],
>                     size);
> -            put_gfn(curr->domain, ram_gfn);
> +            put_page(mfn_to_page(ram_mfn));
>              return X86EMUL_OKAY;
>          }
>      }
> @@ -134,7 +137,7 @@
>          vio->io_state = HVMIO_none;
>          if ( p_data == NULL )
>          {
> -            put_gfn(curr->domain, ram_gfn);
> +            put_page(mfn_to_page(ram_mfn));
>              return X86EMUL_UNHANDLEABLE;
>          }
>          goto finish_access;
> @@ -144,11 +147,11 @@
>              (addr == (vio->mmio_large_write_pa +
>                        vio->mmio_large_write_bytes)) )
>          {
> -            put_gfn(curr->domain, ram_gfn);
> +            put_page(mfn_to_page(ram_mfn));
>              return X86EMUL_RETRY;
>          }
>      default:
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          return X86EMUL_UNHANDLEABLE;
>      }
>
> @@ -156,7 +159,7 @@
>      {
>          gdprintk(XENLOG_WARNING, "WARNING: io already pending (%d)?\n",
>                   p->state);
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          return X86EMUL_UNHANDLEABLE;
>      }
>
> @@ -208,7 +211,7 @@
>
>      if ( rc != X86EMUL_OKAY )
>      {
> -        put_gfn(curr->domain, ram_gfn);
> +        put_page(mfn_to_page(ram_mfn));
>          return rc;
>      }
>
> @@ -244,7 +247,7 @@
>      }
>      }
>
> -    put_gfn(curr->domain, ram_gfn);
> +    put_page(mfn_to_page(ram_mfn));
>      return X86EMUL_OKAY;
>  }
>
> Thanks,
> -Xudong
>
> [...]
Can you give this a try? (Tim, Keir, Jan, if Xudong reports success, can you
please apply?)

Thanks,
Andres

# HG changeset patch
# User Andres Lagar-Cavilla <andres@lagarcavilla.org>
# Date 1331737660 14400
# Node ID fe10f0433f6279091c193127d95d4d39b44a72ed
# Parent  5d20d2f6ffed0a49f030f04a8870f1926babbcbf
x86/mm: Fix deadlock between p2m and event channel locks.

The hvm io emulation code holds the p2m lock for the duration of the
emulation, which may include sending an event to qemu. On a separate path,
map_domain_pirq grabs the event channel and p2m locks in opposite order.

Fix this by ensuring liveness of the ram_gfn used by io emulation, with a
page ref.

Reported-by: "Hao, Xudong" <xudong.hao@intel.com>
Signed-off-by: "Hao, Xudong" <xudong.hao@intel.com>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 5d20d2f6ffed -r fe10f0433f62 xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -77,6 +77,17 @@ static int hvmemul_do_io(
         return X86EMUL_RETRY;
     }

+    /* Maintain a ref on the mfn to ensure liveness. Put the gfn
+     * to avoid potential deadlock wrt event channel lock, later. */
+    if ( mfn_valid(mfn_x(ram_mfn)) )
+        if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
+                       curr->domain) )
+        {
+            put_gfn(curr->domain, ram_gfn);
+            return X86EMUL_RETRY;
+        }
+    put_gfn(curr->domain, ram_gfn);
+
     /*
      * Weird-sized accesses have undefined behaviour: we discard writes
      * and read all-ones.
@@ -87,7 +98,8 @@ static int hvmemul_do_io(
         ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
         if ( dir == IOREQ_READ )
             memset(p_data, ~0, size);
-        put_gfn(curr->domain, ram_gfn);
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            put_page(mfn_to_page(mfn_x(ram_mfn)));
         return X86EMUL_UNHANDLEABLE;
     }

@@ -108,7 +120,8 @@ static int hvmemul_do_io(
         unsigned int bytes = vio->mmio_large_write_bytes;
         if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
         {
-            put_gfn(curr->domain, ram_gfn);
+            if ( mfn_valid(mfn_x(ram_mfn)) )
+                put_page(mfn_to_page(mfn_x(ram_mfn)));
             return X86EMUL_OKAY;
         }
     }
@@ -120,7 +133,8 @@ static int hvmemul_do_io(
         {
             memcpy(p_data, &vio->mmio_large_read[addr - pa],
                    size);
-            put_gfn(curr->domain, ram_gfn);
+            if ( mfn_valid(mfn_x(ram_mfn)) )
+                put_page(mfn_to_page(mfn_x(ram_mfn)));
             return X86EMUL_OKAY;
         }
     }
@@ -134,7 +148,8 @@ static int hvmemul_do_io(
         vio->io_state = HVMIO_none;
         if ( p_data == NULL )
         {
-            put_gfn(curr->domain, ram_gfn);
+            if ( mfn_valid(mfn_x(ram_mfn)) )
+                put_page(mfn_to_page(mfn_x(ram_mfn)));
             return X86EMUL_UNHANDLEABLE;
         }
         goto finish_access;
@@ -144,11 +159,13 @@ static int hvmemul_do_io(
             (addr == (vio->mmio_large_write_pa +
                       vio->mmio_large_write_bytes)) )
         {
-            put_gfn(curr->domain, ram_gfn);
+            if ( mfn_valid(mfn_x(ram_mfn)) )
+                put_page(mfn_to_page(mfn_x(ram_mfn)));
             return X86EMUL_RETRY;
         }
     default:
-        put_gfn(curr->domain, ram_gfn);
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            put_page(mfn_to_page(mfn_x(ram_mfn)));
         return X86EMUL_UNHANDLEABLE;
     }

@@ -156,7 +173,8 @@ static int hvmemul_do_io(
     {
         gdprintk(XENLOG_WARNING, "WARNING: io already pending (%d)?\n",
                  p->state);
-        put_gfn(curr->domain, ram_gfn);
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            put_page(mfn_to_page(mfn_x(ram_mfn)));
         return X86EMUL_UNHANDLEABLE;
     }

@@ -208,7 +226,8 @@ static int hvmemul_do_io(

     if ( rc != X86EMUL_OKAY )
     {
-        put_gfn(curr->domain, ram_gfn);
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            put_page(mfn_to_page(mfn_x(ram_mfn)));
         return rc;
     }

@@ -244,7 +263,8 @@ static int hvmemul_do_io(
     }
     }

-    put_gfn(curr->domain, ram_gfn);
+    if ( mfn_valid(mfn_x(ram_mfn)) )
+        put_page(mfn_to_page(mfn_x(ram_mfn)));
     return X86EMUL_OKAY;
 }

> I prefer to the 2nd, I made a patch and testing show it works.
>
> [...]
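The shape of the fix — take a reference on the page while the p2m lock is held, then drop the lock before doing anything that may take the event lock — can be sketched generically. All names below are simplified stand-ins mirroring the Xen calls, not the real implementation (the lock is modelled by a plain flag):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the p2m lock: set while the translation is "locked". */
static bool p2m_locked;

struct page { int refcount; };

/* Like get_gfn(): look up the translation with the p2m lock held. */
static struct page *get_gfn(struct page *pg)
{
    p2m_locked = true;
    return pg;
}

static void put_gfn(void)            { p2m_locked = false; }   /* drop lock */
static void get_page(struct page *p) { p->refcount++; }        /* take ref */
static void put_page(struct page *p) { p->refcount--; }        /* drop ref */

/* The fixed hvmemul_do_io() shape: ref the page, drop the p2m lock,
 * then do the long operation that may need other locks. */
int emulate_io(struct page *pg)
{
    struct page *p = get_gfn(pg);

    get_page(p);          /* keep the frame live ... */
    put_gfn();            /* ... without holding the p2m lock */

    assert(!p2m_locked);  /* safe to take event_lock from here on */
    /* ... send the ioreq to qemu, wait for completion, etc. ... */

    put_page(p);
    return 0;
}
```

The reference keeps the frame from being freed or paged out behind the emulator's back, while the p2m lock is no longer part of any path that also takes the event lock, breaking the inverted lock order.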
Works by tested. Thanks, -Xudong> -----Original Message----- > From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org] > Sent: Wednesday, March 14, 2012 11:11 PM > To: Hao, Xudong > Cc: Tim Deegan; Keir Fraser; xen-devel@lists.xensource.com; Zhang, Xiantao; > JBeulich@suse.com > Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock > > Can you give this a try? (Tim, Keir, Jan, if Xudong reports success, can you > please apply?) > > Thanks, > Andres > > # HG changeset patch > # User Andres Lagar-Cavilla <andres@lagarcavilla.org> # Date 1331737660 > 14400 # Node ID fe10f0433f6279091c193127d95d4d39b44a72ed > # Parent 5d20d2f6ffed0a49f030f04a8870f1926babbcbf > x86/mm: Fix deadlock between p2m and event channel locks. > > The hvm io emulation code holds the p2m lock for the duration of the emulation, > which may include sending an event to qemu. On a separate path, > map_domain_pirq grabs the event channel and p2m locks in opposite order. > > Fix this by ensuring liveness of the ram_gfn used by io emulation, with a page > ref. > > Reported-by: "Hao, Xudong" <xudong.hao@intel.com> > Signed-off-by: "Hao, Xudong" <xudong.hao@intel.com> > Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> > > diff -r 5d20d2f6ffed -r fe10f0433f62 xen/arch/x86/hvm/emulate.c > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -77,6 +77,17 @@ static int hvmemul_do_io( > return X86EMUL_RETRY; > } > > + /* Maintain a ref on the mfn to ensure liveness. Put the gfn > + * to avoid potential deadlock wrt event channel lock, later. */ > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + if ( !get_page(mfn_to_page(mfn_x(ram_mfn)), > + curr->domain) ) > + { > + put_gfn(curr->domain, ram_gfn); > + return X86EMUL_RETRY; > + } > + put_gfn(curr->domain, ram_gfn); > + > /* > * Weird-sized accesses have undefined behaviour: we discard writes > * and read all-ones. 
> @@ -87,7 +98,8 @@ static int hvmemul_do_io( > ASSERT(p_data != NULL); /* cannot happen with a REP prefix */ > if ( dir == IOREQ_READ ) > memset(p_data, ~0, size); > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_UNHANDLEABLE; > } > > @@ -108,7 +120,8 @@ static int hvmemul_do_io( > unsigned int bytes = vio->mmio_large_write_bytes; > if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) ) > { > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_OKAY; > } > } > @@ -120,7 +133,8 @@ static int hvmemul_do_io( > { > memcpy(p_data, &vio->mmio_large_read[addr - pa], > size); > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_OKAY; > } > } > @@ -134,7 +148,8 @@ static int hvmemul_do_io( > vio->io_state = HVMIO_none; > if ( p_data == NULL ) > { > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_UNHANDLEABLE; > } > goto finish_access; > @@ -144,11 +159,13 @@ static int hvmemul_do_io( > (addr == (vio->mmio_large_write_pa + > vio->mmio_large_write_bytes)) ) > { > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_RETRY; > } > default: > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_UNHANDLEABLE; > } > > @@ -156,7 +173,8 @@ static int hvmemul_do_io( > { > gdprintk(XENLOG_WARNING, "WARNING: io already pending > (%d)?\n", > p->state); > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_UNHANDLEABLE; > } > > @@ -208,7 +226,8 @@ static int hvmemul_do_io( > > if ( rc != X86EMUL_OKAY ) > { > - put_gfn(curr->domain, 
ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return rc; > } > > @@ -244,7 +263,8 @@ static int hvmemul_do_io( > } > } > > - put_gfn(curr->domain, ram_gfn); > + if ( mfn_valid(mfn_x(ram_mfn)) ) > + put_page(mfn_to_page(mfn_x(ram_mfn))); > return X86EMUL_OKAY; > } > > > > > > I prefer to the 2nd, I made a patch and testing show it works. > > > > diff -r 5d20d2f6ffed xen/arch/x86/hvm/emulate.c > > --- a/xen/arch/x86/hvm/emulate.c Fri Mar 09 16:54:24 2012 +0000 > > +++ b/xen/arch/x86/hvm/emulate.c Wed Mar 14 15:11:52 2012 -0400 > > @@ -60,20 +60,23 @@ > > ioreq_t *p = get_ioreq(curr); > > unsigned long ram_gfn = paddr_to_pfn(ram_gpa); > > p2m_type_t p2mt; > > - mfn_t ram_mfn; > > + unsigned long ram_mfn; > > int rc; > > > > /* Check for paged out page */ > > - ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt); > > + ram_mfn = mfn_x(get_gfn_unshare(curr->domain, ram_gfn, &p2mt)); > > + if ( mfn_valid(ram_mfn) ) > > + get_page(mfn_to_page(ram_mfn), curr->domain); > > + put_gfn(curr->domain, ram_gfn); > > if ( p2m_is_paging(p2mt) ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > p2m_mem_paging_populate(curr->domain, ram_gfn); > > return X86EMUL_RETRY; > > } > > if ( p2m_is_shared(p2mt) ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_RETRY; > > } > > > > @@ -87,7 +90,7 @@ > > ASSERT(p_data != NULL); /* cannot happen with a REP prefix */ > > if ( dir == IOREQ_READ ) > > memset(p_data, ~0, size); > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_UNHANDLEABLE; > > } > > > > @@ -108,7 +111,7 @@ > > unsigned int bytes = vio->mmio_large_write_bytes; > > if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_OKAY; > > } > > } > > @@ -120,7 +123,7 @@ > > { > > memcpy(p_data, 
&vio->mmio_large_read[addr - pa], > > size); > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_OKAY; > > } > > } > > @@ -134,7 +137,7 @@ > > vio->io_state = HVMIO_none; > > if ( p_data == NULL ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_UNHANDLEABLE; > > } > > goto finish_access; > > @@ -144,11 +147,11 @@ > > (addr == (vio->mmio_large_write_pa + > > vio->mmio_large_write_bytes)) ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_RETRY; > > } > > default: > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_UNHANDLEABLE; > > } > > > > @@ -156,7 +159,7 @@ > > { > > gdprintk(XENLOG_WARNING, "WARNING: io already pending > (%d)?\n", > > p->state); > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_UNHANDLEABLE; > > } > > > > @@ -208,7 +211,7 @@ > > > > if ( rc != X86EMUL_OKAY ) > > { > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return rc; > > } > > > > @@ -244,7 +247,7 @@ > > } > > } > > > > - put_gfn(curr->domain, ram_gfn); > > + put_page(mfn_to_page(ram_mfn)); > > return X86EMUL_OKAY; > > } > > > > > > Thanks, > > -Xudong > > > > > >> -----Original Message----- > >> From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org] > >> Sent: Wednesday, March 14, 2012 2:46 AM > >> To: Hao, Xudong > >> Cc: Tim Deegan; Keir Fraser; xen-devel@lists.xensource.com; Zhang, > >> Xiantao; JBeulich@suse.com > >> Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock > >> > >> > Hi, Tim and Andres > >> > The patch fix part of this issue. In handle_mmio, function > >> > hvmemul_do_io() is called and p2m lock was held again by calling > >> > get_gfn_unshare(), still trigger a deadlocks. > >> > >> Typically hvmemul_do_io gets the zero gfn, because in many cases > >> that''s the ''rma_gpa'' it is passed. 
However, in the case of mmio, and > >> particularly stdvga, ram_gpa is the data to be copied to the > >> framebuffer. So it is in principle ok to get_gfn in hvmemul_do_io. > >> > >> There are two solutions > >> 1. msix_capability_init does not call p2m_change_entry_type_global. > >> See my previous email. If we want to resync the > >> EPT/NPT/traditional/VTD/IOMMU/superduper TLBs, we can just do that > >> explicitly. I hope. > >> > >> 2. hvmemul_do_io does gets a ref on the mfn underlying ram_gpa, and > >> holds that for the critical section, instead of the p2m lock. One way > >> to achieve this is > >> > >> /* Check for paged out page */ > >> ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt); > >> if ( this or that ) > >> { ... handle ... } > >> if ( mfn_valid(ram_mfn) ) > >> get_page(mfn_to_page(ram_mfn, curr->domain)); > >> put_gfn(curr->domain, ram_gfn) > >> > >> /* replace all put_gfn in all exit paths by put_page */ > >> > >> This will ensure the target page is live and sane while not holding > >> the p2m lock. > >> Xudong, did that make sense? Do you think you could try that and > >> report back? > >> > >> Thanks! 
> >> Andres > >> > >> > > >> > (XEN) Xen call trace: > >> > (XEN) [<ffff82c4801261a3>] _spin_lock+0x1b/0xa8 > >> > (XEN) [<ffff82c4801070d3>] > notify_via_xen_event_channel+0x21/0x106 > >> > (XEN) [<ffff82c4801b6883>] hvm_buffered_io_send+0x1f1/0x21b > >> > (XEN) [<ffff82c4801bbd3a>] stdvga_intercept_mmio+0x491/0x4c7 > >> > (XEN) [<ffff82c4801b5d58>] hvm_io_intercept+0x218/0x244 > >> > (XEN) [<ffff82c4801aa931>] hvmemul_do_io+0x55a/0x716 > >> > (XEN) [<ffff82c4801aab1a>] hvmemul_do_mmio+0x2d/0x2f > >> > (XEN) [<ffff82c4801ab239>] hvmemul_write+0x181/0x1a2 > >> > (XEN) [<ffff82c4801963f0>] x86_emulate+0xcad3/0xfbdf > >> > (XEN) [<ffff82c4801a9d2e>] hvm_emulate_one+0x120/0x1af > >> > (XEN) [<ffff82c4801b63cb>] handle_mmio+0x4e/0x1d1 > >> > (XEN) [<ffff82c4801afd72>] > hvm_hap_nested_page_fault+0x210/0x37f > >> > (XEN) [<ffff82c4801d2419>] vmx_vmexit_handler+0x1523/0x17d0 > >> > > >> > Thanks, > >> > -Xudong > >> > > >> >> -----Original Message----- > >> >> From: Tim Deegan [mailto:tim@xen.org] > >> >> Sent: Saturday, March 10, 2012 12:56 AM > >> >> To: Andres Lagar-Cavilla > >> >> Cc: Hao, Xudong; Keir Fraser; xen-devel@lists.xensource.com; > >> >> Zhang, Xiantao; JBeulich@suse.com > >> >> Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock > >> >> > >> >> At 08:29 -0800 on 09 Mar (1331281767), Andres Lagar-Cavilla wrote: > >> >> > >> I don''t know about the event lock, but it seems unwise to > >> >> > >> call in to handle_mmio with a gfn lock held. How about > >> >> > >> fixing the other > >> >> path? 
> >> >> > >>
> >> >> > >> diff -r 04673ecb9d78 xen/arch/x86/hvm/hvm.c
> >> >> > >> --- a/xen/arch/x86/hvm/hvm.c Thu Mar 08 16:40:05 2012 +0000
> >> >> > >> +++ b/xen/arch/x86/hvm/hvm.c Fri Mar 09 11:15:25 2012 +0000
> >> >> > >> @@ -1324,10 +1324,11 @@ int hvm_hap_nested_page_fault(unsigned l
> >> >> > >>      if ( (p2mt == p2m_mmio_dm) ||
> >> >> > >>           (access_w && (p2mt == p2m_ram_ro)) )
> >> >> > >>      {
> >> >> > >> +        put_gfn(p2m->domain, gfn);
> >> >> > >>          if ( !handle_mmio() )
> >> >> > >>              hvm_inject_exception(TRAP_gp_fault, 0, 0);
> >> >> > >>          rc = 1;
> >> >> > >> -        goto out_put_gfn;
> >> >> > >> +        goto out;
> >> >> > >>      }
> >> >> > >>
> >> >> > >> #ifdef __x86_64__
> >> >> > >> @@ -1379,6 +1380,7 @@ int hvm_hap_nested_page_fault(unsigned l
> >> >> > >>
> >> >> > >> out_put_gfn:
> >> >> > >>      put_gfn(p2m->domain, gfn);
> >> >> > >> +out:
> >> >> > >>      if ( paged )
> >> >> > >>          p2m_mem_paging_populate(v->domain, gfn);
> >> >> > >>      if ( req_ptr )
> >> >> > >
> >> >> > > Yes, that's fine to release the p2m lock earlier than handle_mmio.
> >> >> >
> >> >> > Ack
> >> >>
> >> >> OK, applied.
> >> >>
> >> >> Tim.
> >> >
> >> >
> >
> >
> Works by tested.
>
> Thanks,

Thanks to you!
Andres

> -Xudong
>
>
>> -----Original Message-----
>> From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org]
>> Sent: Wednesday, March 14, 2012 11:11 PM
>> To: Hao, Xudong
>> Cc: Tim Deegan; Keir Fraser; xen-devel@lists.xensource.com; Zhang, Xiantao;
>> JBeulich@suse.com
>> Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock
>>
>> Can you give this a try? (Tim, Keir, Jan, if Xudong reports success, can you
>> please apply?)
>>
>> Thanks,
>> Andres
>>
>> # HG changeset patch
>> # User Andres Lagar-Cavilla <andres@lagarcavilla.org>
>> # Date 1331737660 14400
>> # Node ID fe10f0433f6279091c193127d95d4d39b44a72ed
>> # Parent 5d20d2f6ffed0a49f030f04a8870f1926babbcbf
>> x86/mm: Fix deadlock between p2m and event channel locks.
>>
>> The hvm io emulation code holds the p2m lock for the duration of the
>> emulation, which may include sending an event to qemu. On a separate path,
>> map_domain_pirq grabs the event channel and p2m locks in opposite order.
>>
>> Fix this by ensuring liveness of the ram_gfn used by io emulation, with
>> a page ref.
>>
>> Reported-by: "Hao, Xudong" <xudong.hao@intel.com>
>> Signed-off-by: "Hao, Xudong" <xudong.hao@intel.com>
>> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
>>
>> diff -r 5d20d2f6ffed -r fe10f0433f62 xen/arch/x86/hvm/emulate.c
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -77,6 +77,17 @@ static int hvmemul_do_io(
>>          return X86EMUL_RETRY;
>>      }
>>
>> +    /* Maintain a ref on the mfn to ensure liveness. Put the gfn
>> +     * to avoid potential deadlock wrt event channel lock, later. */
>> +    if ( mfn_valid(mfn_x(ram_mfn)) )
>> +        if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
>> +                       curr->domain) )
>> +        {
>> +            put_gfn(curr->domain, ram_gfn);
>> +            return X86EMUL_RETRY;
>> +        }
>> +    put_gfn(curr->domain, ram_gfn);
>> +
>>      /*
>>       * Weird-sized accesses have undefined behaviour: we discard writes
>>       * and read all-ones.
>> @@ -87,7 +98,8 @@ static int hvmemul_do_io(
>>          ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
>>          if ( dir == IOREQ_READ )
>>              memset(p_data, ~0, size);
>> -        put_gfn(curr->domain, ram_gfn);
>> +        if ( mfn_valid(mfn_x(ram_mfn)) )
>> +            put_page(mfn_to_page(mfn_x(ram_mfn)));
>>          return X86EMUL_UNHANDLEABLE;
>>      }
>>
>> @@ -108,7 +120,8 @@ static int hvmemul_do_io(
>>              unsigned int bytes = vio->mmio_large_write_bytes;
>>              if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
>>              {
>> -                put_gfn(curr->domain, ram_gfn);
>> +                if ( mfn_valid(mfn_x(ram_mfn)) )
>> +                    put_page(mfn_to_page(mfn_x(ram_mfn)));
>>                  return X86EMUL_OKAY;
>>              }
>>          }
>> @@ -120,7 +133,8 @@ static int hvmemul_do_io(
>>              {
>>                  memcpy(p_data, &vio->mmio_large_read[addr - pa], size);
>> -                put_gfn(curr->domain, ram_gfn);
>> +                if ( mfn_valid(mfn_x(ram_mfn)) )
>> +                    put_page(mfn_to_page(mfn_x(ram_mfn)));
>>                  return X86EMUL_OKAY;
>>              }
>>          }
>> @@ -134,7 +148,8 @@ static int hvmemul_do_io(
>>          vio->io_state = HVMIO_none;
>>          if ( p_data == NULL )
>>          {
>> -            put_gfn(curr->domain, ram_gfn);
>> +            if ( mfn_valid(mfn_x(ram_mfn)) )
>> +                put_page(mfn_to_page(mfn_x(ram_mfn)));
>>              return X86EMUL_UNHANDLEABLE;
>>          }
>>          goto finish_access;
>> @@ -144,11 +159,13 @@ static int hvmemul_do_io(
>>              (addr == (vio->mmio_large_write_pa +
>>                        vio->mmio_large_write_bytes)) )
>>          {
>> -            put_gfn(curr->domain, ram_gfn);
>> +            if ( mfn_valid(mfn_x(ram_mfn)) )
>> +                put_page(mfn_to_page(mfn_x(ram_mfn)));
>>              return X86EMUL_RETRY;
>>          }
>>      default:
>> -        put_gfn(curr->domain, ram_gfn);
>> +        if ( mfn_valid(mfn_x(ram_mfn)) )
>> +            put_page(mfn_to_page(mfn_x(ram_mfn)));
>>          return X86EMUL_UNHANDLEABLE;
>>      }
>>
>> @@ -156,7 +173,8 @@ static int hvmemul_do_io(
>>      {
>>          gdprintk(XENLOG_WARNING, "WARNING: io already pending (%d)?\n",
>>                   p->state);
>> -        put_gfn(curr->domain, ram_gfn);
>> +        if ( mfn_valid(mfn_x(ram_mfn)) )
>> +            put_page(mfn_to_page(mfn_x(ram_mfn)));
>>          return X86EMUL_UNHANDLEABLE;
>>      }
>>
>> @@ -208,7 +226,8 @@ static int hvmemul_do_io(
>>
>>      if ( rc != X86EMUL_OKAY )
>>      {
>> -        put_gfn(curr->domain, ram_gfn);
>> +        if ( mfn_valid(mfn_x(ram_mfn)) )
>> +            put_page(mfn_to_page(mfn_x(ram_mfn)));
>>          return rc;
>>      }
>>
>> @@ -244,7 +263,8 @@ static int hvmemul_do_io(
>>          }
>>      }
>>
>> -    put_gfn(curr->domain, ram_gfn);
>> +    if ( mfn_valid(mfn_x(ram_mfn)) )
>> +        put_page(mfn_to_page(mfn_x(ram_mfn)));
>>      return X86EMUL_OKAY;
>>  }
>>
>>
>>
>> > I prefer the 2nd; I made a patch and testing shows it works.
>> >
>> > diff -r 5d20d2f6ffed xen/arch/x86/hvm/emulate.c
>> > --- a/xen/arch/x86/hvm/emulate.c Fri Mar 09 16:54:24 2012 +0000
>> > +++ b/xen/arch/x86/hvm/emulate.c Wed Mar 14 15:11:52 2012 -0400
>> > @@ -60,20 +60,23 @@
>> >      ioreq_t *p = get_ioreq(curr);
>> >      unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
>> >      p2m_type_t p2mt;
>> > -    mfn_t ram_mfn;
>> > +    unsigned long ram_mfn;
>> >      int rc;
>> >
>> >      /* Check for paged out page */
>> > -    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
>> > +    ram_mfn = mfn_x(get_gfn_unshare(curr->domain, ram_gfn, &p2mt));
>> > +    if ( mfn_valid(ram_mfn) )
>> > +        get_page(mfn_to_page(ram_mfn), curr->domain);
>> > +    put_gfn(curr->domain, ram_gfn);
>> >      if ( p2m_is_paging(p2mt) )
>> >      {
>> > -        put_gfn(curr->domain, ram_gfn);
>> > +        put_page(mfn_to_page(ram_mfn));
>> >          p2m_mem_paging_populate(curr->domain, ram_gfn);
>> >          return X86EMUL_RETRY;
>> >      }
>> >      if ( p2m_is_shared(p2mt) )
>> >      {
>> > -        put_gfn(curr->domain, ram_gfn);
>> > +        put_page(mfn_to_page(ram_mfn));
>> >          return X86EMUL_RETRY;
>> >      }
>> >
>> > @@ -87,7 +90,7 @@
>> >      ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
>> >      if ( dir == IOREQ_READ )
>> >          memset(p_data, ~0, size);
>> > -    put_gfn(curr->domain, ram_gfn);
>> > +    put_page(mfn_to_page(ram_mfn));
>> >      return X86EMUL_UNHANDLEABLE;
>> >  }
>> >
>> > @@ -108,7 +111,7 @@
>> >              unsigned int bytes = vio->mmio_large_write_bytes;
>> >              if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
>> >              {
>> > -                put_gfn(curr->domain, ram_gfn);
>> > +                put_page(mfn_to_page(ram_mfn));
>> >                  return X86EMUL_OKAY;
>> >              }
>> >          }
>> > @@ -120,7 +123,7 @@
>> >              {
>> >                  memcpy(p_data, &vio->mmio_large_read[addr - pa], size);
>> > -                put_gfn(curr->domain, ram_gfn);
>> > +                put_page(mfn_to_page(ram_mfn));
>> >                  return X86EMUL_OKAY;
>> >              }
>> >          }
>> > @@ -134,7 +137,7 @@
>> >          vio->io_state = HVMIO_none;
>> >          if ( p_data == NULL )
>> >          {
>> > -            put_gfn(curr->domain, ram_gfn);
>> > +            put_page(mfn_to_page(ram_mfn));
>> >              return X86EMUL_UNHANDLEABLE;
>> >          }
>> >          goto finish_access;
>> > @@ -144,11 +147,11 @@
>> >              (addr == (vio->mmio_large_write_pa +
>> >                        vio->mmio_large_write_bytes)) )
>> >          {
>> > -            put_gfn(curr->domain, ram_gfn);
>> > +            put_page(mfn_to_page(ram_mfn));
>> >              return X86EMUL_RETRY;
>> >          }
>> >      default:
>> > -        put_gfn(curr->domain, ram_gfn);
>> > +        put_page(mfn_to_page(ram_mfn));
>> >          return X86EMUL_UNHANDLEABLE;
>> >      }
>> >
>> > @@ -156,7 +159,7 @@
>> >      {
>> >          gdprintk(XENLOG_WARNING, "WARNING: io already pending (%d)?\n",
>> >                   p->state);
>> > -        put_gfn(curr->domain, ram_gfn);
>> > +        put_page(mfn_to_page(ram_mfn));
>> >          return X86EMUL_UNHANDLEABLE;
>> >      }
>> >
>> > @@ -208,7 +211,7 @@
>> >
>> >      if ( rc != X86EMUL_OKAY )
>> >      {
>> > -        put_gfn(curr->domain, ram_gfn);
>> > +        put_page(mfn_to_page(ram_mfn));
>> >          return rc;
>> >      }
>> >
>> > @@ -244,7 +247,7 @@
>> >          }
>> >      }
>> >
>> > -    put_gfn(curr->domain, ram_gfn);
>> > +    put_page(mfn_to_page(ram_mfn));
>> >      return X86EMUL_OKAY;
>> >  }
>> >
>> >
>> > Thanks,
>> > -Xudong
>> >
>> >
>> >> -----Original Message-----
>> >> From: Andres Lagar-Cavilla [mailto:andres@lagarcavilla.org]
>> >> Sent: Wednesday, March 14, 2012 2:46 AM
>> >> To: Hao, Xudong
>> >> Cc: Tim Deegan; Keir Fraser; xen-devel@lists.xensource.com; Zhang,
>> >> Xiantao; JBeulich@suse.com
>> >> Subject: RE: [Xen-devel] Deadlocks by p2m_lock and event_lock
>> >>
>> >> > Hi, Tim and Andres
>> >> > The patch fixes part of this issue.
In handle_mmio, function
>> >> > hvmemul_do_io() is called and the p2m lock is taken again by calling
>> >> > get_gfn_unshare(), which still triggers a deadlock.
>> >>
>> >> Typically hvmemul_do_io gets the zero gfn, because in many cases
>> >> that's the 'ram_gpa' it is passed. However, in the case of mmio, and
>> >> particularly stdvga, ram_gpa is the data to be copied to the
>> >> framebuffer. So it is in principle ok to get_gfn in hvmemul_do_io.
>> >>
>> >> There are two solutions:
>> >> 1. msix_capability_init does not call p2m_change_entry_type_global.
>> >> See my previous email. If we want to resync the
>> >> EPT/NPT/traditional/VTD/IOMMU/superduper TLBs, we can just do that
>> >> explicitly. I hope.
>> >>
>> >> 2. hvmemul_do_io gets a ref on the mfn underlying ram_gpa, and
>> >> holds that for the critical section, instead of the p2m lock. One way
>> >> to achieve this is:
>> >>
>> >> /* Check for paged out page */
>> >> ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
>> >> if ( this or that )
>> >>     { ... handle ... }
>> >> if ( mfn_valid(ram_mfn) )
>> >>     get_page(mfn_to_page(ram_mfn), curr->domain);
>> >> put_gfn(curr->domain, ram_gfn);
>> >>
>> >> /* replace all put_gfn calls on all exit paths with put_page */
>> >>
>> >> This will ensure the target page is live and sane while not holding
>> >> the p2m lock.
>> >> Xudong, did that make sense? Do you think you could try that and
>> >> report back?
>> >>
>> >> Thanks!
>> >> Andres
>> >>
>> >> >
>> >> > (XEN) Xen call trace:
>> >> > (XEN)    [<ffff82c4801261a3>] _spin_lock+0x1b/0xa8
>> >> > (XEN)    [<ffff82c4801070d3>] notify_via_xen_event_channel+0x21/0x106
>> >> > (XEN)    [<ffff82c4801b6883>] hvm_buffered_io_send+0x1f1/0x21b
>> >> > (XEN)    [<ffff82c4801bbd3a>] stdvga_intercept_mmio+0x491/0x4c7
>> >> > (XEN)    [<ffff82c4801b5d58>] hvm_io_intercept+0x218/0x244
>> >> > (XEN)    [<ffff82c4801aa931>] hvmemul_do_io+0x55a/0x716
>> >> > (XEN)    [<ffff82c4801aab1a>] hvmemul_do_mmio+0x2d/0x2f
>> >> > (XEN)    [<ffff82c4801ab239>] hvmemul_write+0x181/0x1a2
>> >> > (XEN)    [<ffff82c4801963f0>] x86_emulate+0xcad3/0xfbdf
>> >> > (XEN)    [<ffff82c4801a9d2e>] hvm_emulate_one+0x120/0x1af
>> >> > (XEN)    [<ffff82c4801b63cb>] handle_mmio+0x4e/0x1d1
>> >> > (XEN)    [<ffff82c4801afd72>] hvm_hap_nested_page_fault+0x210/0x37f
>> >> > (XEN)    [<ffff82c4801d2419>] vmx_vmexit_handler+0x1523/0x17d0
>> >> >
>> >> > Thanks,
>> >> > -Xudong
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Tim Deegan [mailto:tim@xen.org]
>> >> >> Sent: Saturday, March 10, 2012 12:56 AM
>> >> >> To: Andres Lagar-Cavilla
>> >> >> Cc: Hao, Xudong; Keir Fraser; xen-devel@lists.xensource.com;
>> >> >> Zhang, Xiantao; JBeulich@suse.com
>> >> >> Subject: Re: [Xen-devel] Deadlocks by p2m_lock and event_lock
>> >> >>
>> >> >> At 08:29 -0800 on 09 Mar (1331281767), Andres Lagar-Cavilla wrote:
>> >> >> > >> I don't know about the event lock, but it seems unwise to
>> >> >> > >> call in to handle_mmio with a gfn lock held. How about
>> >> >> > >> fixing the other path?
>> >> >> > >>
>> >> >> > >> diff -r 04673ecb9d78 xen/arch/x86/hvm/hvm.c
>> >> >> > >> --- a/xen/arch/x86/hvm/hvm.c Thu Mar 08 16:40:05 2012 +0000
>> >> >> > >> +++ b/xen/arch/x86/hvm/hvm.c Fri Mar 09 11:15:25 2012 +0000
>> >> >> > >> @@ -1324,10 +1324,11 @@ int hvm_hap_nested_page_fault(unsigned l
>> >> >> > >>      if ( (p2mt == p2m_mmio_dm) ||
>> >> >> > >>           (access_w && (p2mt == p2m_ram_ro)) )
>> >> >> > >>      {
>> >> >> > >> +        put_gfn(p2m->domain, gfn);
>> >> >> > >>          if ( !handle_mmio() )
>> >> >> > >>              hvm_inject_exception(TRAP_gp_fault, 0, 0);
>> >> >> > >>          rc = 1;
>> >> >> > >> -        goto out_put_gfn;
>> >> >> > >> +        goto out;
>> >> >> > >>      }
>> >> >> > >>
>> >> >> > >> #ifdef __x86_64__
>> >> >> > >> @@ -1379,6 +1380,7 @@ int hvm_hap_nested_page_fault(unsigned l
>> >> >> > >>
>> >> >> > >> out_put_gfn:
>> >> >> > >>      put_gfn(p2m->domain, gfn);
>> >> >> > >> +out:
>> >> >> > >>      if ( paged )
>> >> >> > >>          p2m_mem_paging_populate(v->domain, gfn);
>> >> >> > >>      if ( req_ptr )
>> >> >> > >
>> >> >> > > Yes, that's fine to release the p2m lock earlier than handle_mmio.
>> >> >> >
>> >> >> > Ack
>> >> >>
>> >> >> OK, applied.
>> >> >>
>> >> >> Tim.
>> >> >
>> >>
>> >
>> >
>> >