Andres Lagar-Cavilla
2012-Apr-24 19:34 UTC
[PATCH 0 of 3] x86/mm: Relieve contention of p2m lock on three hot paths
This series is motivated by recent scalability issues reported by Yang Z Zhang from Intel. The general pattern used is to perform a synchronized p2m query to take care of paging in, PoD allocation or unsharing, and then take a reference to the underlying page. The reference ensures the page can be safely mapped by the caller, as it won't be paged out. The p2m lock can then be released, and is not held while the actual work is performed on the page. This reduces p2m locked critical sections to a minimum.

The pattern is applied to page table walking and emulation of rep movs. By checking for the bogus zero ram_gpa value in the general emulation function, we also reduce contention on the p2m lock for that hot path.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

 xen/arch/x86/mm/hap/guest_walk.c |   6 ++++-
 xen/arch/x86/hvm/emulate.c       |  27 +++++++--------------
 xen/arch/x86/hvm/emulate.c       |  48 +++++++++++++++++++++------------------
 3 files changed, 40 insertions(+), 41 deletions(-)
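In outline, the pattern each of the three patches applies is the following (a minimal sketch only, assuming the usual Xen declarations in scope -- struct domain *d, unsigned long gfn, mfn_t mfn, p2m_type_t p2mt, struct page_info *page -- with do_long_running_work() a hypothetical stand-in for the caller's real job):

    /* Query the p2m under the gfn lock: this pages in, allocates PoD
     * memory, or unshares, as needed. */
    mfn = get_gfn_unshare(d, gfn, &p2mt);
    if ( !mfn_valid(mfn_x(mfn)) )
    {
        put_gfn(d, gfn);
        return X86EMUL_RETRY;
    }

    /* Pin the underlying page so it cannot be freed or paged out... */
    page = mfn_to_page(mfn_x(mfn));
    if ( !get_page(page, d) )
    {
        put_gfn(d, gfn);
        return X86EMUL_RETRY;
    }

    /* ...then drop the gfn ref (and with it the p2m lock) before the
     * slow part, keeping the locked critical section minimal. */
    put_gfn(d, gfn);

    do_long_running_work(page);    /* p2m lock not held here */

    put_page(page);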
Andres Lagar-Cavilla
2012-Apr-24 19:34 UTC
[PATCH 1 of 3] x86/mm: Relieve contention for p2m lock in gva_to_gfn
 xen/arch/x86/mm/hap/guest_walk.c |  6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

We don't need to hold the p2m lock for the duration of the guest walk. We
need to ensure liveness of the top level page.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 34c7e6be9265 -r 58fd70123787 xen/arch/x86/mm/hap/guest_walk.c
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -52,6 +52,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
 {
     uint32_t missing;
     mfn_t top_mfn;
+    struct page_info *top_page;
     void *top_map;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -85,13 +86,16 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
 
     /* Map the top-level table and call the tree-walker */
     ASSERT(mfn_valid(mfn_x(top_mfn)));
+    top_page = mfn_to_page(mfn_x(top_mfn));
+    ASSERT(get_page(top_page, p2m->domain));
+    __put_gfn(p2m, top_gfn);
     top_map = map_domain_page(mfn_x(top_mfn));
 #if GUEST_PAGING_LEVELS == 3
     top_map += (cr3 & ~(PAGE_MASK | 31));
 #endif
     missing = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, top_map);
     unmap_domain_page(top_map);
-    __put_gfn(p2m, top_gfn);
+    put_page(top_page);
 
     /* Interpret the answer */
     if ( missing == 0 )
Andres Lagar-Cavilla
2012-Apr-24 19:34 UTC
[PATCH 2 of 3] x86/emulate: Relieve contention of p2m lock in emulation of rep movs
 xen/arch/x86/hvm/emulate.c |  27 +++++++++------------------
 1 files changed, 9 insertions(+), 18 deletions(-)

get_two_gfns is used to query the source and target physical addresses of the
emulated rep movs. It is not necessary to hold onto the two gfns for the
duration of the emulation: further calls down the line will do the appropriate
unsharing or paging in, and unwind correctly on failure.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 58fd70123787 -r 2ffc676120b8 xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -714,25 +714,23 @@ static int hvmemul_rep_movs(
     if ( rc != X86EMUL_OKAY )
         return rc;
 
+    /* The query on the gfns is to establish the need for mmio. In the two mmio
+     * cases, a proper get_gfn for the "other" gfn will be enacted, with paging in
+     * or unsharing if necessary. In the memmove case, the gfn might change given
+     * the bytes mov'ed, and, again, proper get_gfn's will be enacted in
+     * __hvm_copy. */
     get_two_gfns(current->domain, sgpa >> PAGE_SHIFT, &sp2mt, NULL, NULL,
                  current->domain, dgpa >> PAGE_SHIFT, &dp2mt, NULL, NULL,
                  P2M_ALLOC, &tg);
-
+    put_two_gfns(&tg);
+
     if ( !p2m_is_ram(sp2mt) && !p2m_is_grant(sp2mt) )
-    {
-        rc = hvmemul_do_mmio(
+        return hvmemul_do_mmio(
             sgpa, reps, bytes_per_rep, dgpa, IOREQ_READ, df, NULL);
-        put_two_gfns(&tg);
-        return rc;
-    }
 
     if ( !p2m_is_ram(dp2mt) && !p2m_is_grant(dp2mt) )
-    {
-        rc = hvmemul_do_mmio(
+        return hvmemul_do_mmio(
             dgpa, reps, bytes_per_rep, sgpa, IOREQ_WRITE, df, NULL);
-        put_two_gfns(&tg);
-        return rc;
-    }
 
     /* RAM-to-RAM copy: emulate as equivalent of memmove(dgpa, sgpa, bytes). */
     bytes = *reps * bytes_per_rep;
@@ -747,10 +745,7 @@ static int hvmemul_rep_movs(
      * can be emulated by a source-to-buffer-to-destination block copy.
      */
     if ( ((dgpa + bytes_per_rep) > sgpa) && (dgpa < (sgpa + bytes)) )
-    {
-        put_two_gfns(&tg);
         return X86EMUL_UNHANDLEABLE;
-    }
 
     /* Adjust destination address for reverse copy. */
     if ( df )
@@ -759,10 +754,7 @@ static int hvmemul_rep_movs(
     /* Allocate temporary buffer. Fall back to slow emulation if this fails. */
     buf = xmalloc_bytes(bytes);
     if ( buf == NULL )
-    {
-        put_two_gfns(&tg);
         return X86EMUL_UNHANDLEABLE;
-    }
 
     /*
      * We do a modicum of checking here, just for paranoia's sake and to
@@ -773,7 +765,6 @@ static int hvmemul_rep_movs(
     rc = hvm_copy_to_guest_phys(dgpa, buf, bytes);
 
     xfree(buf);
-    put_two_gfns(&tg);
 
     if ( rc == HVMCOPY_gfn_paged_out )
         return X86EMUL_RETRY;
Andres Lagar-Cavilla
2012-Apr-24 19:34 UTC
[PATCH 3 of 3] x86/emulation: No need to get_gfn on zero ram_gpa
 xen/arch/x86/hvm/emulate.c |  48 ++++++++++++++++++++++++---------------------
 1 files changed, 26 insertions(+), 22 deletions(-)

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

diff -r 2ffc676120b8 -r 7a7443e80b99 xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -60,33 +60,37 @@ static int hvmemul_do_io(
     ioreq_t *p = get_ioreq(curr);
     unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
     p2m_type_t p2mt;
-    mfn_t ram_mfn;
+    mfn_t ram_mfn = _mfn(INVALID_MFN);
     int rc;
 
-    /* Check for paged out page */
-    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_gfn(curr->domain, ram_gfn);
-        p2m_mem_paging_populate(curr->domain, ram_gfn);
-        return X86EMUL_RETRY;
-    }
-    if ( p2m_is_shared(p2mt) )
-    {
-        put_gfn(curr->domain, ram_gfn);
-        return X86EMUL_RETRY;
-    }
-
-    /* Maintain a ref on the mfn to ensure liveness. Put the gfn
-     * to avoid potential deadlock wrt event channel lock, later. */
-    if ( mfn_valid(mfn_x(ram_mfn)) )
-        if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
-                       curr->domain) )
+    /* Many callers pass a stub zero ram_gpa address. */
+    if ( ram_gfn != 0 )
+    {
+        /* Check for paged out page */
+        ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
+        if ( p2m_is_paging(p2mt) )
         {
-            put_gfn(curr->domain, ram_gfn);
+            put_gfn(curr->domain, ram_gfn);
+            p2m_mem_paging_populate(curr->domain, ram_gfn);
             return X86EMUL_RETRY;
         }
-        put_gfn(curr->domain, ram_gfn);
+        if ( p2m_is_shared(p2mt) )
+        {
+            put_gfn(curr->domain, ram_gfn);
+            return X86EMUL_RETRY;
+        }
+
+        /* Maintain a ref on the mfn to ensure liveness. Put the gfn
+         * to avoid potential deadlock wrt event channel lock, later. */
+        if ( mfn_valid(mfn_x(ram_mfn)) )
+            if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
+                           curr->domain) )
+            {
+                put_gfn(curr->domain, ram_gfn);
+                return X86EMUL_RETRY;
+            }
+        put_gfn(curr->domain, ram_gfn);
+    }
 
     /*
      * Weird-sized accesses have undefined behaviour: we discard writes
Tim Deegan
2012-Apr-25 12:51 UTC
Re: [PATCH 1 of 3] x86/mm: Relieve contention for p2m lock in gva_to_gfn
At 15:34 -0400 on 24 Apr (1335281651), Andres Lagar-Cavilla wrote:
> xen/arch/x86/mm/hap/guest_walk.c |  6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> 
> We don't need to hold the p2m lock for the duration of the guest walk. We need
> to ensure liveness of the top level page.

I'm not sure I want to start taking these s/gfn-lock/get-page/ changes
piecemeal without a clear understanding of why the gfn-locking is not
needed. My understanding was that

 - get_page protects against the page being freed and reused.
 - gfn-lock protects against the GFN being reused, or the page
   being reused within the VM.

Are there some cases where we only care about the first of these and
some where we care about both? In other words, can we replace the
gfn-locking everywhere with get_page/put_page (i.e. get rid of put_gfn)?

Also:

> @@ -85,13 +86,16 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
> 
>      /* Map the top-level table and call the tree-walker */
>      ASSERT(mfn_valid(mfn_x(top_mfn)));
> +    top_page = mfn_to_page(mfn_x(top_mfn));
> +    ASSERT(get_page(top_page, p2m->domain));

ASSERT(foo) is compiled out on non-debug builds, so that's definitely
not what you want.

I personally dislike this idiom with BUG_ON() as well, because even
though BUG_ON(some side-effecting statement) may be correct, my eye
tends to skip over it when skimming code.

Cheers,

Tim.
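For reference, a form of that hunk that keeps the side effect on release builds (where NDEBUG compiles ASSERT() away) might look like this -- a sketch only; the failure path shown is illustrative, not taken from the patch:

    top_page = mfn_to_page(mfn_x(top_mfn));
    /* Keep get_page() outside ASSERT() so it still executes when
     * assertions are compiled out. */
    if ( !get_page(top_page, p2m->domain) )
    {
        /* Should be unreachable while the gfn ref is held, but fail
         * safely rather than continue with an unpinned page. */
        __put_gfn(p2m, top_gfn);
        return INVALID_GFN;
    }
    __put_gfn(p2m, top_gfn);
    /* ... proceed to map and walk as in the patch ... */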
Tim Deegan
2012-Apr-25 12:54 UTC
Re: [PATCH 2 of 3] x86/emulate: Relieve contention of p2m lock in emulation of rep movs
At 15:34 -0400 on 24 Apr (1335281652), Andres Lagar-Cavilla wrote:
> xen/arch/x86/hvm/emulate.c |  27 +++++++++------------------
>  1 files changed, 9 insertions(+), 18 deletions(-)
> 
> 
> get_two_gfns is used to query the source and target physical addresses of the
> emulated rep movs. It is not necessary to hold onto the two gfns for the
> duration of the emulation: further calls down the line will do the appropriate
> unsharing or paging in, and unwind correctly on failure.
> 
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> 
> diff -r 58fd70123787 -r 2ffc676120b8 xen/arch/x86/hvm/emulate.c
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -714,25 +714,23 @@ static int hvmemul_rep_movs(
>      if ( rc != X86EMUL_OKAY )
>          return rc;
> 
> +    /* The query on the gfns is to establish the need for mmio. In the two mmio
> +     * cases, a proper get_gfn for the "other" gfn will be enacted, with paging in
> +     * or unsharing if necessary. In the memmove case, the gfn might change given
> +     * the bytes mov'ed, and, again, proper get_gfn's will be enacted in
> +     * __hvm_copy. */
>      get_two_gfns(current->domain, sgpa >> PAGE_SHIFT, &sp2mt, NULL, NULL,
>                   current->domain, dgpa >> PAGE_SHIFT, &dp2mt, NULL, NULL,
>                   P2M_ALLOC, &tg);
> -
> +    put_two_gfns(&tg);

If we're just going to put_ these immediately, we don't need to use
get_two_gfns().

Tim.
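A sketch of what that simplification might look like -- assuming the plain get_gfn()/put_gfn() pair (get_gfn being the P2M_ALLOC flavour of the query, matching the flags the existing call passes); this is not code from the thread:

    /* Two independent type queries. No cross-gfn lock ordering is
     * needed, because neither gfn ref is held across the other. */
    (void)get_gfn(current->domain, sgpa >> PAGE_SHIFT, &sp2mt);
    put_gfn(current->domain, sgpa >> PAGE_SHIFT);

    (void)get_gfn(current->domain, dgpa >> PAGE_SHIFT, &dp2mt);
    put_gfn(current->domain, dgpa >> PAGE_SHIFT);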
Tim Deegan
2012-Apr-25 13:11 UTC
Re: [PATCH 3 of 3] x86/emulation: No need to get_gfn on zero ram_gpa
At 15:34 -0400 on 24 Apr (1335281653), Andres Lagar-Cavilla wrote:
> xen/arch/x86/hvm/emulate.c |  48 ++++++++++++++++++++++++---------------------
>  1 files changed, 26 insertions(+), 22 deletions(-)
> 
> 
> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> 
> diff -r 2ffc676120b8 -r 7a7443e80b99 xen/arch/x86/hvm/emulate.c
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -60,33 +60,37 @@ static int hvmemul_do_io(
>      ioreq_t *p = get_ioreq(curr);
>      unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
>      p2m_type_t p2mt;
> -    mfn_t ram_mfn;
> +    mfn_t ram_mfn = _mfn(INVALID_MFN);
>      int rc;
> 
> -    /* Check for paged out page */
> -    ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
> -    if ( p2m_is_paging(p2mt) )
> -    {
> -        put_gfn(curr->domain, ram_gfn);
> -        p2m_mem_paging_populate(curr->domain, ram_gfn);
> -        return X86EMUL_RETRY;
> -    }
> -    if ( p2m_is_shared(p2mt) )
> -    {
> -        put_gfn(curr->domain, ram_gfn);
> -        return X86EMUL_RETRY;
> -    }
> -
> -    /* Maintain a ref on the mfn to ensure liveness. Put the gfn
> -     * to avoid potential deadlock wrt event channel lock, later. */
> -    if ( mfn_valid(mfn_x(ram_mfn)) )
> -        if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
> -                       curr->domain) )
> +    /* Many callers pass a stub zero ram_gpa address. */
> +    if ( ram_gfn != 0 )

To safely gate on this, the 'stub' value needs to be made into something
that can't be confused with a real paddr, say, ram_gpa == -1. Otherwise
we lose protection for IO where the target is in page zero.

> +    {
> +        /* Check for paged out page */
> +        ram_mfn = get_gfn_unshare(curr->domain, ram_gfn, &p2mt);
> +        if ( p2m_is_paging(p2mt) )
>          {
> -            put_gfn(curr->domain, ram_gfn);
> +            put_gfn(curr->domain, ram_gfn);
> +            p2m_mem_paging_populate(curr->domain, ram_gfn);
>              return X86EMUL_RETRY;
>          }
> -        put_gfn(curr->domain, ram_gfn);
> +        if ( p2m_is_shared(p2mt) )
> +        {
> +            put_gfn(curr->domain, ram_gfn);
> +            return X86EMUL_RETRY;
> +        }
> +
> +        /* Maintain a ref on the mfn to ensure liveness. Put the gfn
> +         * to avoid potential deadlock wrt event channel lock, later. */
> +        if ( mfn_valid(mfn_x(ram_mfn)) )
> +            if ( !get_page(mfn_to_page(mfn_x(ram_mfn)),
> +                           curr->domain) )
> +            {
> +                put_gfn(curr->domain, ram_gfn);
> +                return X86EMUL_RETRY;
> +            }
> +        put_gfn(curr->domain, ram_gfn);
> +    }
> 
>      /*
>       * Weird-sized accesses have undefined behaviour: we discard writes
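One way to realise that suggestion -- a sketch with a hypothetical sentinel name (NO_RAM_GPA is not from the thread), and the real fix would also have to update every caller that currently passes 0:

    /* -1 can never be a real guest physical address, so genuine IO
     * targeting guest page zero still gets the full paging/sharing
     * checks. */
    #define NO_RAM_GPA ((paddr_t)-1)

    if ( ram_gpa != NO_RAM_GPA )
    {
        unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
        /* ... get_gfn_unshare()/get_page()/put_gfn() as in the patch ... */
    }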
Andres Lagar-Cavilla
2012-Apr-25 15:33 UTC
Re: [PATCH 1 of 3] x86/mm: Relieve contention for p2m lock in gva_to_gfn
> At 15:34 -0400 on 24 Apr (1335281651), Andres Lagar-Cavilla wrote:
>> xen/arch/x86/mm/hap/guest_walk.c |  6 +++++-
>>  1 files changed, 5 insertions(+), 1 deletions(-)
>>
>>
>> We don't need to hold the p2m lock for the duration of the guest walk.
>> We need to ensure liveness of the top level page.
>
> I'm not sure I want to start taking these s/gfn-lock/get-page/ changes
> piecemeal without a clear understanding of why the gfn-locking is not
> needed. My understanding was that
>
> - get_page protects against the page being freed and reused.
> - gfn-lock protects against the GFN being reused, or the page
>   being reused within the VM.

Most cases gfn_lock protects against are already taken care of by the end
of a query. From there on, get_page will protect you against paging out.
gfn_lock will further protect the caller from the gfn being replaced. That
is the risk we are taking in these code paths, and the assumption is that
the VM would not try that on its own pagetables, GDTs, or mmio targets. If
it did, Xen won't misbehave, but the VM will shoot itself in the foot.
Deservedly.

The gfn/p2m lock will also prevent modifications elsewhere in the p2m, and
that is good motivation for a fine-grained lock. These code paths don't
care about modifications elsewhere in the p2m.

I agree it's rather ad hoc, and I'm not happy about it. I would prefer a
proper construct that buries all the details, or finding out whether
rwlocks are feasible. I'm trying to find ways to relieve contention and
address Yang's problem before the window closes. It hasn't seemed that
successful anyway.

Andres

> Are there some cases where we only care about the first of these and
> some where we care about both? In other words, can we replace the
> gfn-locking everywhere with get_page/put_page (i.e. get rid of put_gfn)?
>
> Also:
>
>> @@ -85,13 +86,16 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
>>
>>      /* Map the top-level table and call the tree-walker */
>>      ASSERT(mfn_valid(mfn_x(top_mfn)));
>> +    top_page = mfn_to_page(mfn_x(top_mfn));
>> +    ASSERT(get_page(top_page, p2m->domain));
>
> ASSERT(foo) is compiled out on non-debug builds, so that's definitely
> not what you want.
>
> I personally dislike this idiom with BUG_ON() as well, because even
> though BUG_ON(some side-effecting statement) may be correct, my eye
> tends to skip over it when skimming code.
>
> Cheers,
>
> Tim.
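A "proper construct that buries all the details" could be as small as a helper like the following -- purely hypothetical, with an illustrative name; it simply packages the query/pin/unlock sequence the three patches open-code:

    /* Query the p2m, pin the backing page, drop the gfn ref: the
     * caller gets page liveness without holding the p2m lock.
     * Hypothetical helper, not part of the series. */
    static struct page_info *get_page_from_gfn_unshare(
        struct domain *d, unsigned long gfn, p2m_type_t *t)
    {
        struct page_info *page = NULL;
        mfn_t mfn = get_gfn_unshare(d, gfn, t);

        if ( mfn_valid(mfn_x(mfn)) )
        {
            page = mfn_to_page(mfn_x(mfn));
            if ( !get_page(page, d) )
                page = NULL;
        }
        put_gfn(d, gfn);
        return page;    /* NULL means retry or fail; caller decides */
    }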