Jan Beulich
2013-Nov-27 08:07 UTC
[PATCH 0/2]

1: fix locking in offline_page()
2: use return value of domain_adjust_tot_pages() where feasible

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich
2013-Nov-27 08:25 UTC
[PATCH 1/2] fix locking in offline_page()

Coverity ID 1055655

Apart from the Coverity-detected lock order reversal (a domain's
page_alloc_lock taken with the heap lock already held), calling
put_page() with heap_lock is a bad idea too (as a possible descendant
from put_page() is free_heap_pages(), which wants to take this very
lock).

From all I can tell the region over which heap_lock was held was far
too large: All we need to protect are the call to mark_page_offline()
and reserve_heap_page() (and I'd even put under question the need for
the former). Hence by slightly re-arranging the if/else-if chain we
can drop the lock much earlier, at once no longer covering the two
put_page() invocations.

Once at it, do a little bit of other cleanup: Put the "pod_replace"
code path inline rather than at its own label, and drop the effectively
unused variable "ret".

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -957,7 +957,6 @@ int offline_page(unsigned long mfn, int
 {
     unsigned long old_info = 0;
     struct domain *owner;
-    int ret = 0;
     struct page_info *pg;
 
     if ( !mfn_valid(mfn) )
@@ -1007,16 +1006,28 @@ int offline_page(unsigned long mfn, int
     if ( page_state_is(pg, offlined) )
     {
         reserve_heap_page(pg);
-        *status = PG_OFFLINE_OFFLINED;
+
+        spin_unlock(&heap_lock);
+
+        *status = broken ? PG_OFFLINE_OFFLINED | PG_OFFLINE_BROKEN
+                         : PG_OFFLINE_OFFLINED;
+        return 0;
     }
-    else if ( (owner = page_get_owner_and_reference(pg)) )
+
+    spin_unlock(&heap_lock);
+
+    if ( (owner = page_get_owner_and_reference(pg)) )
     {
         if ( p2m_pod_offline_or_broken_hit(pg) )
-            goto pod_replace;
+        {
+            put_page(pg);
+            p2m_pod_offline_or_broken_replace(pg);
+            *status = PG_OFFLINE_OFFLINED;
+        }
         else
         {
             *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
-              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
+                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
             /* Release the reference since it will not be allocated anymore */
             put_page(pg);
         }
@@ -1024,7 +1035,7 @@ int offline_page(unsigned long mfn, int
     else if ( old_info & PGC_xen_heap )
     {
         *status = PG_OFFLINE_XENPAGE | PG_OFFLINE_PENDING |
-          (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT);
+                  (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT);
     }
     else
     {
@@ -1043,21 +1054,7 @@ int offline_page(unsigned long mfn, int
     if ( broken )
         *status |= PG_OFFLINE_BROKEN;
 
-    spin_unlock(&heap_lock);
-
-    return ret;
-
-pod_replace:
-    put_page(pg);
-    spin_unlock(&heap_lock);
-
-    p2m_pod_offline_or_broken_replace(pg);
-    *status = PG_OFFLINE_OFFLINED;
-
-    if ( broken )
-        *status |= PG_OFFLINE_BROKEN;
-
-    return ret;
+    return 0;
 }
 
 /*
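The deadlock described in the commit message reduces to a small,
self-contained sketch. Pthread mutexes stand in for Xen's non-recursive
spinlocks, and the functions with a _sketch suffix are hypothetical
stand-ins, not the real page_alloc.c code:

/* gcc -pthread sketch.c ; illustrative only, not the real Xen code */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for free_heap_pages(): unconditionally takes heap_lock. */
static void free_heap_pages_sketch(void)
{
    pthread_mutex_lock(&heap_lock);
    /* ... merge the freed page back into the heap ... */
    pthread_mutex_unlock(&heap_lock);
}

/* Stand-in for put_page(): dropping the last reference frees the page,
 * i.e. may descend into free_heap_pages_sketch(). */
static void put_page_sketch(int last_ref)
{
    if ( last_ref )
        free_heap_pages_sketch();   /* re-takes heap_lock */
}

int main(void)
{
    /* Old shape: put_page() reached with heap_lock held would
     * self-deadlock on the non-recursive lock (and taking a domain's
     * page_alloc_lock in that region inverts the lock order, which is
     * what Coverity flagged). New shape: unlock first, then release. */
    pthread_mutex_lock(&heap_lock);
    /* mark_page_offline(); reserve_heap_page(); -- the only work that
     * actually needs the heap lock */
    pthread_mutex_unlock(&heap_lock);

    put_page_sketch(1);             /* safe: heap_lock no longer held */
    puts("ok: no reference dropped under heap_lock");
    return 0;
}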
Jan Beulich
2013-Nov-27 08:26 UTC
[PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
This is generally cheaper than re-reading ->tot_pages.

While doing so I also noticed an improper use (lacking error handling)
of get_domain() as well as lacks of ->is_dying checks in the memory
sharing code, which the patch fixes at once. In the course of doing
this I further noticed other error paths there pointlessly calling
put_page() et al with ->page_alloc_lock still held, which is also being
reversed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -611,9 +611,16 @@ static int page_make_sharable(struct dom
                               struct page_info *page,
                               int expected_refcnt)
 {
-    int drop_dom_ref;
+    bool_t drop_dom_ref;
+
     spin_lock(&d->page_alloc_lock);
 
+    if ( d->is_dying )
+    {
+        spin_unlock(&d->page_alloc_lock);
+        return -EBUSY;
+    }
+
     /* Change page type and count atomically */
     if ( !get_page_and_type(page, d, PGT_shared_page) )
     {
@@ -624,8 +631,8 @@ static int page_make_sharable(struct dom
     /* Check it wasn't already sharable and undo if it was */
     if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
     {
-        put_page_and_type(page);
         spin_unlock(&d->page_alloc_lock);
+        put_page_and_type(page);
         return -EEXIST;
     }
 
@@ -633,15 +640,14 @@ static int page_make_sharable(struct dom
      * the second from get_page_and_type at the top of this function */
     if ( page->count_info != (PGC_allocated | (2 + expected_refcnt)) )
     {
+        spin_unlock(&d->page_alloc_lock);
         /* Return type count back to zero */
         put_page_and_type(page);
-        spin_unlock(&d->page_alloc_lock);
         return -E2BIG;
     }
 
     page_set_owner(page, dom_cow);
-    domain_adjust_tot_pages(d, -1);
-    drop_dom_ref = (d->tot_pages == 0);
+    drop_dom_ref = !domain_adjust_tot_pages(d, -1);
     page_list_del(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
 
@@ -659,6 +665,13 @@ static int page_make_private(struct doma
 
     spin_lock(&d->page_alloc_lock);
 
+    if ( d->is_dying )
+    {
+        spin_unlock(&d->page_alloc_lock);
+        put_page(page);
+        return -EBUSY;
+    }
+
     /* We can only change the type if count is one */
     /* Because we are locking pages individually, we need to drop
      * the lock here, while the page is typed. We cannot risk the
@@ -666,8 +679,8 @@ static int page_make_private(struct doma
     expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2);
     if ( page->u.inuse.type_info != expected_type )
     {
-        put_page(page);
         spin_unlock(&d->page_alloc_lock);
+        put_page(page);
         return -EEXIST;
     }
 
@@ -682,7 +695,7 @@ static int page_make_private(struct doma
     page_set_owner(page, d);
 
     if ( domain_adjust_tot_pages(d, 1) == 1 )
-        get_domain(d);
+        get_knownalive_domain(d);
     page_list_add_tail(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
 
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -475,8 +475,8 @@ static long memory_exchange(XEN_GUEST_HA
                              (j * (1UL << exch.out.extent_order)));
 
                 spin_lock(&d->page_alloc_lock);
-                domain_adjust_tot_pages(d, -dec_count);
-                drop_dom_ref = (dec_count && !d->tot_pages);
+                drop_dom_ref = (dec_count &&
+                                !domain_adjust_tot_pages(d, -dec_count));
                 spin_unlock(&d->page_alloc_lock);
 
                 if ( drop_dom_ref )
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1516,8 +1516,9 @@ struct page_info *alloc_domheap_pages(
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
 {
-    int i, drop_dom_ref;
     struct domain *d = page_get_owner(pg);
+    unsigned int i;
+    bool_t drop_dom_ref;
 
     ASSERT(!in_irq());
 
@@ -1545,8 +1546,7 @@ void free_domheap_pages(struct page_info
             page_list_del2(&pg[i], &d->page_list, &d->arch.relmem_list);
         }
 
-        domain_adjust_tot_pages(d, -(1 << order));
-        drop_dom_ref = (d->tot_pages == 0);
+        drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
 
         spin_unlock_recursive(&d->page_alloc_lock);
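Taken together, the hunks apply one idiom: check ->is_dying and adjust
the page count under ->page_alloc_lock, derive drop_dom_ref from the
total that domain_adjust_tot_pages() now returns, and drop references
only after unlocking. A condensed, compilable sketch of that idiom
follows; the struct fields, function bodies and caller are simplified
assumptions, not the real Xen definitions:

#include <errno.h>
#include <stdbool.h>

struct domain {
    long tot_pages;
    bool is_dying;
    /* page_alloc_lock, page lists, ... elided */
};

/* Assumed semantics, per the patch: apply the delta and hand back the
 * updated total, so callers need not re-read ->tot_pages. */
static long domain_adjust_tot_pages(struct domain *d, long pages)
{
    return d->tot_pages += pages;
}

/* Hypothetical caller showing the idiom all four hunks converge on. */
static int give_up_one_page(struct domain *d)
{
    bool drop_dom_ref;

    /* spin_lock(&d->page_alloc_lock); */

    if ( d->is_dying )          /* the newly added early bail-out */
    {
        /* spin_unlock(&d->page_alloc_lock); */
        return -EBUSY;
    }

    /* One statement replaces adjust-then-re-read of ->tot_pages. */
    drop_dom_ref = !domain_adjust_tot_pages(d, -1);

    /* spin_unlock(&d->page_alloc_lock); */

    /* put_page()/put_domain() only happen here, with the lock gone,
     * mirroring the reordered error paths above. */
    return drop_dom_ref ? 1 : 0;
}

int main(void)
{
    struct domain d = { .tot_pages = 1, .is_dying = false };
    return give_up_one_page(&d) == 1 ? 0 : 1;  /* last page: drop ref */
}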
On 27/11/13 08:25, Jan Beulich wrote:
>          else
>          {
>              *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING |
> -              (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);
> +                      (owner->domain_id << PG_OFFLINE_OWNER_SHIFT);

domain_id will be promoted from a uint16_t to an int32_t, then shifted
left by 16 bits, which is undefined if the top bit of domain_id was set.

Do we care about the likelihood of a domain_id with the top bit set? I
certainly can't see how one would get into that state.

~Andrew
>>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> domain_id will be promoted from a uint16_t to an int32_t, then shifted
> left by 16 bits, which is undefined if the top bit of domain_id was set.

That promotion is done by zero extension, so I don't figure what
you think is undefined here.

Furthermore I suppose you realize that all the patch does here is
adjust indentation.

Jan
Andrew Cooper
2013-Nov-27 10:07 UTC
Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
On 27/11/13 08:26, Jan Beulich wrote:
> This is generally cheaper than re-reading ->tot_pages.
>
> While doing so I also noticed an improper use (lacking error handling)
> of get_domain() as well as lacks of ->is_dying checks in the memory
> sharing code, which the patch fixes at once. In the course of doing
> this I further noticed other error paths there pointlessly calling
> put_page() et al with ->page_alloc_lock still held, which is also being
> reversed.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
> >>> On 27.11.13 at 11:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > domain_id will be promoted from a uint16_t to an int32_t, then shifted
> > left by 16 bits, which is undefined if the top bit of domain_id was set.
>
> That promotion is done by zero extension, so I don't figure what
> you think is undefined here.

If the domid has bit 15 set, it will be shifted into the sign bit of
the promoted integer; it should be explicitly cast to an unsigned type
before the shift.

Tim.
At 08:07 +0000 on 27 Nov (1385536046), Jan Beulich wrote:
> 1: fix locking in offline_page()
> 2: use return value of domain_adjust_tot_pages() where feasible
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Tim Deegan <tim@xen.org>
>>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> At 10:05 +0000 on 27 Nov (1385543154), Jan Beulich wrote:
>> That promotion is done by zero extension, so I don't figure what
>> you think is undefined here.
>
> If the domid has bit 15 set, it will be shifted into the sign bit of
> the promoted integer; it should be explicitly cast to an unsigned type
> before the shift.

Again - I don't see this. Promotion happens before the shift,
i.e. 0x8000 -> 0x00008000 -> 0x80000000.

Jan
At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
> >>> On 27.11.13 at 11:34, Tim Deegan <tim@xen.org> wrote:
> > If the domid has bit 15 set, it will be shifted into the sign bit of
> > the promoted integer; it should be explicitly cast to an unsigned type
> > before the shift.
>
> Again - I don't see this. Promotion happens before the shift,
> i.e. 0x8000 -> 0x00008000 -> 0x80000000.

AIUI the default promotion is to a signed integer if the value will
fit, i.e.:
  (unsigned short) 0x8000
  promoted     (signed int) 0x00008000
  shifted left (signed int) 0x80000000  (undefined behaviour)

(but I don't have my copy of the standard here to check).

Tim.
At 11:48 +0100 on 27 Nov (1385549319), Tim Deegan wrote:
> AIUI the default promotion is to a signed integer if the value will
> fit, i.e.:
>   (unsigned short) 0x8000
>   promoted     (signed int) 0x00008000
>   shifted left (signed int) 0x80000000  (undefined behaviour)
>
> (but I don't have my copy of the standard here to check).

(and AFAICT there's no way for a real domain's domain_id to have the
top bit set so it's a moot point).

Tim.
>>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
> At 10:43 +0000 on 27 Nov (1385545399), Jan Beulich wrote:
>> Again - I don't see this. Promotion happens before the shift,
>> i.e. 0x8000 -> 0x00008000 -> 0x80000000.
>
> AIUI the default promotion is to a signed integer if the value will
> fit, i.e.:
>   (unsigned short) 0x8000
>   promoted     (signed int) 0x00008000
>   shifted left (signed int) 0x80000000  (undefined behaviour)
>
> (but I don't have my copy of the standard here to check).

Right - but the promotion (as you also show) is done via zero
extension. Hence, and because left shifts are ignorant of signed-ness,
no need for a cast.

Jan
George Dunlap
2013-Nov-27 14:44 UTC
Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
On 11/27/2013 08:26 AM, Jan Beulich wrote:
> This is generally cheaper than re-reading ->tot_pages.
>
> While doing so I also noticed an improper use (lacking error handling)
> of get_domain() as well as lacks of ->is_dying checks in the memory
> sharing code, which the patch fixes at once. In the course of doing
> this I further noticed other error paths there pointlessly calling
> put_page() et al with ->page_alloc_lock still held, which is also being
> reversed.

Thus hiding two very important changes underneath a summary that looks
completely unimportant.

If this patch only did as the title suggests, I would question whether
it should be included for 4.4, since it seems to have little benefit.
Are the other two changes bug fixes?

In any case, the summary should indicate to someone just browsing
through what important changes might be inside.

 -George
On 11/27/2013 08:25 AM, Jan Beulich wrote:> Coverity ID 1055655 > > Apart from the Coverity-detected lock order reversal (a domain''s > page_alloc_lock taken with the heap lock already held), calling > put_page() with heap_lock is a bad idea too (as a possible descendant > from put_page() is free_heap_pages(), which wants to take this very > lock).This sounds like a bug fix, so doesn''t need a freeze exception at this time. -George> > From all I can tell the region over which heap_lock was held was far > too large: All we need to protect are the call to mark_page_offline() > and reserve_heap_page() (and I''d even put under question the need for > the former). Hence by slightly re-arranging the if/else-if chain we > can drop the lock much earlier, at once no longer covering the two > put_page() invocations. > > Once at it, do a little bit of other cleanup: Put the "pod_replace" > code path inline rather than at its own label, and drop the effectively > unused variable "ret". > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -957,7 +957,6 @@ int offline_page(unsigned long mfn, int > { > unsigned long old_info = 0; > struct domain *owner; > - int ret = 0; > struct page_info *pg; > > if ( !mfn_valid(mfn) ) > @@ -1007,16 +1006,28 @@ int offline_page(unsigned long mfn, int > if ( page_state_is(pg, offlined) ) > { > reserve_heap_page(pg); > - *status = PG_OFFLINE_OFFLINED; > + > + spin_unlock(&heap_lock); > + > + *status = broken ? PG_OFFLINE_OFFLINED | PG_OFFLINE_BROKEN > + : PG_OFFLINE_OFFLINED; > + return 0; > } > - else if ( (owner = page_get_owner_and_reference(pg)) ) > + > + spin_unlock(&heap_lock); > + > + if ( (owner = page_get_owner_and_reference(pg)) ) > { > if ( p2m_pod_offline_or_broken_hit(pg) ) > - goto pod_replace; > + { > + put_page(pg); > + p2m_pod_offline_or_broken_replace(pg); > + *status = PG_OFFLINE_OFFLINED; > + } > else > { > *status = PG_OFFLINE_OWNED | PG_OFFLINE_PENDING | > - (owner->domain_id << PG_OFFLINE_OWNER_SHIFT); > + (owner->domain_id << PG_OFFLINE_OWNER_SHIFT); > /* Release the reference since it will not be allocated anymore */ > put_page(pg); > } > @@ -1024,7 +1035,7 @@ int offline_page(unsigned long mfn, int > else if ( old_info & PGC_xen_heap ) > { > *status = PG_OFFLINE_XENPAGE | PG_OFFLINE_PENDING | > - (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT); > + (DOMID_XEN << PG_OFFLINE_OWNER_SHIFT); > } > else > { > @@ -1043,21 +1054,7 @@ int offline_page(unsigned long mfn, int > if ( broken ) > *status |= PG_OFFLINE_BROKEN; > > - spin_unlock(&heap_lock); > - > - return ret; > - > -pod_replace: > - put_page(pg); > - spin_unlock(&heap_lock); > - > - p2m_pod_offline_or_broken_replace(pg); > - *status = PG_OFFLINE_OFFLINED; > - > - if ( broken ) > - *status |= PG_OFFLINE_BROKEN; > - > - return ret; > + return 0; > } > > /* > > >
Jan Beulich
2013-Nov-27 15:46 UTC
Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
>>> On 27.11.13 at 15:44, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> Thus hiding two very important changes underneath a summary that looks
> completely unimportant.
>
> If this patch only did as the title suggests, I would question whether
> it should be included for 4.4, since it seems to have little benefit.
> Are the other two changes bug fixes?

I'm sure the missing ->is_dying check is one. I'm not sure the
put vs unlock ordering is just inefficient or could actively cause
issues.

> In any case, the summary should indicate to someone just browsing
> through what important changes might be inside.

The issue is that you can't really put all three things in the title.
And hence I decided to keep the title what it was supposed to be
when I started correcting this code.

With - in my understanding - the memory sharing code still being
experimental, it also didn't really seem worthwhile splitting these
changes out into separate patches (I generally try to do so when
spotting isolated issues, but tend to keep things together when
in the course of doing other adjustments I recognize further small
issues in code that's not production ready anyway, i.e. not
needing to be backported, and hence the title not needing to hint
at that).

In any event, as to the freeze: The code was written and would
have been submitted before the code freeze if I hadn't been
required to hold it back until after XSA-74 went public (which
goes back to our [unwritten?] policy of not publishing changes
that were found necessary/desirable in the course of researching
a specific security issue).

Jan
George Dunlap
2013-Nov-27 16:51 UTC
Re: [PATCH 2/2] use return value of domain_adjust_tot_pages() where feasible
On 11/27/2013 03:46 PM, Jan Beulich wrote:
>>>> On 27.11.13 at 15:44, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> If this patch only did as the title suggests, I would question whether
>> it should be included for 4.4, since it seems to have little benefit.
>> Are the other two changes bug fixes?
>
> I'm sure the missing ->is_dying check is one. I'm not sure the
> put vs unlock ordering is just inefficient or could actively cause
> issues.
>
>> In any case, the summary should indicate to someone just browsing
>> through what important changes might be inside.
>
> The issue is that you can't really put all three things in the title.
> And hence I decided to keep the title what it was supposed to be
> when I started correcting this code.

I think I would call it something like, "Various fix-ups to mm-related
code". That would allow anyone scanning it to know that there were a
number of fix-ups, and they were in the MM code; and would prompt them
to look further if it seemed like something they might be looking for
(either fixes to backport, or the source of a bug that was introduced).

> With - in my understanding - the memory sharing code still being
> experimental, it also didn't really seem worthwhile splitting these
> changes out into separate patches (I generally try to do so when
> spotting isolated issues, but tend to keep things together when
> in the course of doing other adjustments I recognize further small
> issues in code that's not production ready anyway, i.e. not
> needing to be backported, and hence the title not needing to hint
> at that).
>
> In any event, as to the freeze: The code was written and would
> have been submitted before the code freeze if I hadn't been
> required to hold it back until after XSA-74 went public (which
> goes back to our [unwritten?] policy of not publishing changes
> that were found necessary/desirable in the course of researching
> a specific security issue).

Unfortunately, the "unfairness" of having been held back for a security
embargo (or in the dom0 PVH case, having been submitted a year ago)
doesn't change the realities of the situation: that making changes
risks introducing bugs which delay the release, or worse, are not found
until after the release. That may be a reason to consider it "having
been submitted before the feature freeze", but it can't tilt the scales
of a cost/benefits analysis.

Anyway, we're only half-way through the code freeze, and these do look
like bug fixes; I'm inclined to say they're OK.

 -George
At 10:55 +0000 on 27 Nov (1385546118), Jan Beulich wrote:
> >>> On 27.11.13 at 11:48, Tim Deegan <tim@xen.org> wrote:
> > AIUI the default promotion is to a signed integer if the value will
> > fit, i.e.:
> >   (unsigned short) 0x8000
> >   promoted     (signed int) 0x00008000
> >   shifted left (signed int) 0x80000000  (undefined behaviour)
>
> Right - but the promotion (as you also show) is done via zero
> extension. Hence, and because left shifts are ignorant of signed-ness,
> no need for a cast.

No: left-shifting that set bit into the sign bit of the promoted value
is undefined behaviour. I still don't have my standard reference to
hand, but e.g. http://blog.regehr.org/archives/738

Tim.
>>> Tim Deegan <tim@xen.org> 11/28/13 11:26 AM >>>
>At 10:55 +0000 on 27 Nov (1385546118), Jan Beulich wrote:
>> Right - but the promotion (as you also show) is done via zero
>> extension. Hence, and because left shifts are ignorant of signed-ness,
>> no need for a cast.
>
>No: left-shifting that set bit into the sign bit of the promoted value
>is undefined behaviour. I still don't have my standard reference to
>hand, but e.g. http://blog.regehr.org/archives/738

Ah, indeed. I can certainly add a cast there, but as said before
- the value can't be negative as we only permit 2^^15 domains,
- the change to the line in question was only white space adjustment.

Jan
At 14:38 +0000 on 28 Nov (1385645907), Jan Beulich wrote:
> >>> Tim Deegan <tim@xen.org> 11/28/13 11:26 AM >>>
> >No: left-shifting that set bit into the sign bit of the promoted value
> >is undefined behaviour. I still don't have my standard reference to
> >hand, but e.g. http://blog.regehr.org/archives/738
>
> Ah, indeed. I can certainly add a cast there, but as said before
> - the value can't be negative as we only permit 2^^15 domains,

Absolutely. My reviewed-by stands on the original patch.

Tim.
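Tim's promotion chain can be checked with a few standalone lines.
PG_OFFLINE_OWNER_SHIFT is assumed to be 16 here, matching the "shifted
left by 16 bits" in Andrew's original note, and the cast shown is the
conventional fix:

#include <stdint.h>
#include <stdio.h>

#define PG_OFFLINE_OWNER_SHIFT 16   /* assumed; "shifted left by 16 bits" */

int main(void)
{
    uint16_t domid = 0x8000;        /* hypothetical: top bit set */

    /* Integer promotion: uint16_t fits in int, so the left operand of
     * << becomes a *signed* int holding 0x00008000 (zero-extended, as
     * Jan says). Shifting a 1 into the sign bit of that signed int is
     * undefined behaviour (C99 6.5.7), hence Tim's cast: */
    uint32_t status = (uint32_t)domid << PG_OFFLINE_OWNER_SHIFT;

    printf("owner field: %#010x\n", (unsigned int)status); /* 0x80000000 */
    return 0;
}

As the thread concludes, the point is moot in practice: valid domain
IDs never have bit 15 set.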
On 27/11/2013 08:07, "Jan Beulich" <JBeulich@suse.com> wrote:

> 1: fix locking in offline_page()
> 2: use return value of domain_adjust_tot_pages() where feasible
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Keir Fraser <keir@xen.org>