thr3ads.net - Xen devel - [PATCH 00 of 11] Automatic NUMA placement for xl [May 2012]

If this information is useful, please help other people find it:
Share via:

Dario Faggioli

2012-May-31 12:11 UTC

[PATCH 00 of 11] Automatic NUMA placement for xl

Hello Everyone,

This posting follows the RFC subject-ed "Automatically place guest on
host''s
NUMA nodes with xl" (patchbomb.1334150267@Solace , 4/11/2012),

        _however_

the key bits have been pretty much rewritten, aiming at 4.2 inclusion. In fact,
automatic NUMA placement is something xm/xend does, and xl should include
something like that, for the sake of feature parity.  For the same reason, many
of the functionalities that were part of the RFC are being kept out (e.g., NUMA
aware scheduling in the hypervisor).

Of course, while at it, all the review comments the RFC got, which involved
patches that are part of this series too, have been considered and applied.

So, here it is, in some more details, what this is all about:

  1/11 libxc: abstract xenctl_cpumap to just xenctl_map
  2/11 libxl: abstract libxl_cpumap to just libxl_map
  3/11 libxc, libxl: introduce xc_nodemap_t and libxl_nodemap
  4/11 libxl: expand the libxl_{cpu,node}map API a bit
  5/11 libxl: add a new Array type to the IDL
  6/11 libxl: introduce libxl_get_numainfo()

Introduce the data structures, calls and infrastructure needed for retrieving
all the information about the NUMA-ness of the system and deal with them at the
toolstack level.

  7/11 xen: enhance dump_numa output
  8/11 xl: add more NUMA information to `xl info -n''

Are just output and debugging enhancements.

  9/11 libxl, xl: enable automatic placement of guests on NUMA nodes
 10/11 libxl, xl: heuristics for reordering NUMA placement candidates

Are where the automatic placement is really implemented. Searching for suitable
placement candidates and manipulating these candidates a bit is being done
within libxl, by adding a couple of API calls. This was not like this in the
previous version, but I really think it is something other toolstack wanting to
build on top of libxl could benefit of this, instead of re-implementing
everything from scratch. On the other hand, The key bits of the algorithm
(i.e., which candidate to chose, how, whether or not to mangle it a bit, etc.)
are implemented at the xl level.

 11/11 Some automatic NUMA placement documentation

Puts some more details about the implementation and the usage of the new
feature directly in the tree.

Thanks a lot and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

More specifically:
 1. replaces xenctl_cpumap with xenctl_map
 2. provides bitmap_to_xenctl_map and the reverse;
 3. re-implement cpumask_to_xenctl_map with bitmap_to_xenctl_map
    and the reverse;

Other than #3, no functional changes. Interface only slightly
afected.

This is in preparation of introducing NUMA nodes maps.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.eu.com>

diff --git a/tools/libxc/xc_cpupool.c b/tools/libxc/xc_cpupool.c
--- a/tools/libxc/xc_cpupool.c
+++ b/tools/libxc/xc_cpupool.c
@@ -90,7 +90,7 @@ xc_cpupoolinfo_t *xc_cpupool_getinfo(xc_
     sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_INFO;
     sysctl.u.cpupool_op.cpupool_id = poolid;
     set_xen_guest_handle(sysctl.u.cpupool_op.cpumap.bitmap, local);
-    sysctl.u.cpupool_op.cpumap.nr_cpus = local_size * 8;
+    sysctl.u.cpupool_op.cpumap.nr_elems = local_size * 8;
 
     err = do_sysctl_save(xch, &sysctl);
 
@@ -184,7 +184,7 @@ xc_cpumap_t xc_cpupool_freeinfo(xc_inter
     sysctl.cmd = XEN_SYSCTL_cpupool_op;
     sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_FREEINFO;
     set_xen_guest_handle(sysctl.u.cpupool_op.cpumap.bitmap, local);
-    sysctl.u.cpupool_op.cpumap.nr_cpus = mapsize * 8;
+    sysctl.u.cpupool_op.cpumap.nr_elems = mapsize * 8;
 
     err = do_sysctl_save(xch, &sysctl);
 
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -142,7 +142,7 @@ int xc_vcpu_setaffinity(xc_interface *xc
 
     set_xen_guest_handle(domctl.u.vcpuaffinity.cpumap.bitmap, local);
 
-    domctl.u.vcpuaffinity.cpumap.nr_cpus = cpusize * 8;
+    domctl.u.vcpuaffinity.cpumap.nr_elems = cpusize * 8;
 
     ret = do_domctl(xch, &domctl);
 
@@ -182,7 +182,7 @@ int xc_vcpu_getaffinity(xc_interface *xc
     domctl.u.vcpuaffinity.vcpu = vcpu;
 
     set_xen_guest_handle(domctl.u.vcpuaffinity.cpumap.bitmap, local);
-    domctl.u.vcpuaffinity.cpumap.nr_cpus = cpusize * 8;
+    domctl.u.vcpuaffinity.cpumap.nr_elems = cpusize * 8;
 
     ret = do_domctl(xch, &domctl);
 
diff --git a/tools/libxc/xc_tbuf.c b/tools/libxc/xc_tbuf.c
--- a/tools/libxc/xc_tbuf.c
+++ b/tools/libxc/xc_tbuf.c
@@ -134,7 +134,7 @@ int xc_tbuf_set_cpu_mask(xc_interface *x
     bitmap_64_to_byte(bytemap, &mask64, sizeof (mask64) * 8);
 
     set_xen_guest_handle(sysctl.u.tbuf_op.cpu_mask.bitmap, bytemap);
-    sysctl.u.tbuf_op.cpu_mask.nr_cpus = sizeof(bytemap) * 8;
+    sysctl.u.tbuf_op.cpu_mask.nr_elems = sizeof(bytemap) * 8;
 
     ret = do_sysctl(xch, &sysctl);
 
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -1545,8 +1545,7 @@ long do_mca(XEN_GUEST_HANDLE(xen_mc_t) u
             cpumap = &cpu_online_map;
         else
         {
-            ret = xenctl_cpumap_to_cpumask(&cmv,
-                                           &op->u.mc_inject_v2.cpumap);
+            ret = xenctl_map_to_cpumask(&cmv,
&op->u.mc_inject_v2.cpumap);
             if ( ret )
                 break;
             cpumap = cmv;
diff --git a/xen/arch/x86/platform_hypercall.c
b/xen/arch/x86/platform_hypercall.c
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -365,7 +365,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
     {
         uint32_t cpu;
         uint64_t idletime, now = NOW();
-        struct xenctl_cpumap ctlmap;
+        struct xenctl_map ctlmap;
         cpumask_var_t cpumap;
         XEN_GUEST_HANDLE(uint8) cpumap_bitmap;
         XEN_GUEST_HANDLE(uint64) idletimes;
@@ -378,11 +378,11 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
         if ( cpufreq_controller != FREQCTL_dom0_kernel )
             break;
 
-        ctlmap.nr_cpus  = op->u.getidletime.cpumap_nr_cpus;
+        ctlmap.nr_elems  = op->u.getidletime.cpumap_nr_cpus;
         guest_from_compat_handle(cpumap_bitmap,
                                  op->u.getidletime.cpumap_bitmap);
         ctlmap.bitmap.p = cpumap_bitmap.p; /* handle -> handle_64 conversion
*/
-        if ( (ret = xenctl_cpumap_to_cpumask(&cpumap, &ctlmap)) != 0 )
+        if ( (ret = xenctl_map_to_cpumask(&cpumap, &ctlmap)) != 0 )
             goto out;
         guest_from_compat_handle(idletimes, op->u.getidletime.idletime);
 
@@ -401,7 +401,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
 
         op->u.getidletime.now = now;
         if ( ret == 0 )
-            ret = cpumask_to_xenctl_cpumap(&ctlmap, cpumap);
+            ret = cpumask_to_xenctl_map(&ctlmap, cpumap);
         free_cpumask_var(cpumap);
 
         if ( ret == 0 && copy_to_guest(u_xenpf_op, op, 1) )
diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -489,7 +489,7 @@ int cpupool_do_sysctl(struct xen_sysctl_
         op->cpupool_id = c->cpupool_id;
         op->sched_id = c->sched->sched_id;
         op->n_dom = c->n_dom;
-        ret = cpumask_to_xenctl_cpumap(&op->cpumap, c->cpu_valid);
+        ret = cpumask_to_xenctl_map(&op->cpumap, c->cpu_valid);
         cpupool_put(c);
     }
     break;
@@ -584,7 +584,7 @@ int cpupool_do_sysctl(struct xen_sysctl_
 
     case XEN_SYSCTL_CPUPOOL_OP_FREEINFO:
     {
-        ret = cpumask_to_xenctl_cpumap(
+        ret = cpumask_to_xenctl_map(
             &op->cpumap, &cpupool_free_cpus);
     }
     break;
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -31,28 +31,29 @@
 static DEFINE_SPINLOCK(domctl_lock);
 DEFINE_SPINLOCK(vcpu_alloc_lock);
 
-int cpumask_to_xenctl_cpumap(
-    struct xenctl_cpumap *xenctl_cpumap, const cpumask_t *cpumask)
+int bitmap_to_xenctl_map(struct xenctl_map *xenctl_map,
+                         const unsigned long *bitmap,
+                         unsigned int nbits)
 {
     unsigned int guest_bytes, copy_bytes, i;
     uint8_t zero = 0;
     int err = 0;
-    uint8_t *bytemap = xmalloc_array(uint8_t, (nr_cpu_ids + 7) / 8);
+    uint8_t *bytemap = xmalloc_array(uint8_t, (nbits + 7) / 8);
 
     if ( !bytemap )
         return -ENOMEM;
 
-    guest_bytes = (xenctl_cpumap->nr_cpus + 7) / 8;
-    copy_bytes  = min_t(unsigned int, guest_bytes, (nr_cpu_ids + 7) / 8);
+    guest_bytes = (xenctl_map->nr_elems + 7) / 8;
+    copy_bytes  = min_t(unsigned int, guest_bytes, (nbits + 7) / 8);
 
-    bitmap_long_to_byte(bytemap, cpumask_bits(cpumask), nr_cpu_ids);
+    bitmap_long_to_byte(bytemap, bitmap, nbits);
 
     if ( copy_bytes != 0 )
-        if ( copy_to_guest(xenctl_cpumap->bitmap, bytemap, copy_bytes) )
+        if ( copy_to_guest(xenctl_map->bitmap, bytemap, copy_bytes) )
             err = -EFAULT;
 
     for ( i = copy_bytes; !err && i < guest_bytes; i++ )
-        if ( copy_to_guest_offset(xenctl_cpumap->bitmap, i, &zero, 1) )
+        if ( copy_to_guest_offset(xenctl_map->bitmap, i, &zero, 1) )
             err = -EFAULT;
 
     xfree(bytemap);
@@ -60,36 +61,58 @@ int cpumask_to_xenctl_cpumap(
     return err;
 }
 
-int xenctl_cpumap_to_cpumask(
-    cpumask_var_t *cpumask, const struct xenctl_cpumap *xenctl_cpumap)
+int xenctl_map_to_bitmap(unsigned long *bitmap,
+                         const struct xenctl_map *xenctl_map,
+                         unsigned int nbits)
 {
     unsigned int guest_bytes, copy_bytes;
     int err = 0;
-    uint8_t *bytemap = xzalloc_array(uint8_t, (nr_cpu_ids + 7) / 8);
+    uint8_t *bytemap = xzalloc_array(uint8_t, (nbits + 7) / 8);
 
     if ( !bytemap )
         return -ENOMEM;
 
-    guest_bytes = (xenctl_cpumap->nr_cpus + 7) / 8;
-    copy_bytes  = min_t(unsigned int, guest_bytes, (nr_cpu_ids + 7) / 8);
+    guest_bytes = (xenctl_map->nr_elems + 7) / 8;
+    copy_bytes  = min_t(unsigned int, guest_bytes, (nbits + 7) / 8);
 
     if ( copy_bytes != 0 )
     {
-        if ( copy_from_guest(bytemap, xenctl_cpumap->bitmap, copy_bytes) )
+        if ( copy_from_guest(bytemap, xenctl_map->bitmap, copy_bytes) )
             err = -EFAULT;
-        if ( (xenctl_cpumap->nr_cpus & 7) && (guest_bytes <=
sizeof(bytemap)) )
-            bytemap[guest_bytes-1] &= ~(0xff <<
(xenctl_cpumap->nr_cpus & 7));
+        if ( (xenctl_map->nr_elems & 7) && (guest_bytes <=
sizeof(bytemap)) )
+            bytemap[guest_bytes-1] &= ~(0xff <<
(xenctl_map->nr_elems & 7));
     }
 
-    if ( err )
-        /* nothing */;
-    else if ( alloc_cpumask_var(cpumask) )
-        bitmap_byte_to_long(cpumask_bits(*cpumask), bytemap, nr_cpu_ids);
+    if ( !err )
+        bitmap_byte_to_long(bitmap, bytemap, nbits);
+
+    xfree(bytemap);
+
+    return err;
+}
+
+int cpumask_to_xenctl_map(struct xenctl_map *xenctl_cpumap,
+                          const cpumask_t *cpumask)
+{
+    return bitmap_to_xenctl_map(xenctl_cpumap, cpumask_bits(cpumask),
+                                nr_cpu_ids);
+}
+
+int xenctl_map_to_cpumask(cpumask_var_t *cpumask,
+                          const struct xenctl_map *xenctl_cpumap)
+{
+    int err = 0;
+
+    if ( alloc_cpumask_var(cpumask) ) {
+        err = xenctl_map_to_bitmap(cpumask_bits(*cpumask), xenctl_cpumap,
+                                   nr_cpu_ids);
+        /* In case of error, cleanup is up to us, as the caller won''t
care! */
+        if ( err )
+            free_cpumask_var(*cpumask);
+    }
     else
         err = -ENOMEM;
 
-    xfree(bytemap);
-
     return err;
 }
 
@@ -617,7 +640,7 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
         {
             cpumask_var_t new_affinity;
 
-            ret = xenctl_cpumap_to_cpumask(
+            ret = xenctl_map_to_cpumask(
                 &new_affinity, &op->u.vcpuaffinity.cpumap);
             if ( !ret )
             {
@@ -627,7 +650,7 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
         }
         else
         {
-            ret = cpumask_to_xenctl_cpumap(
+            ret = cpumask_to_xenctl_map(
                 &op->u.vcpuaffinity.cpumap, v->cpu_affinity);
         }
 
diff --git a/xen/common/trace.c b/xen/common/trace.c
--- a/xen/common/trace.c
+++ b/xen/common/trace.c
@@ -384,7 +384,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc
     {
         cpumask_var_t mask;
 
-        rc = xenctl_cpumap_to_cpumask(&mask, &tbc->cpu_mask);
+        rc = xenctl_map_to_cpumask(&mask, &tbc->cpu_mask);
         if ( !rc )
         {
             cpumask_copy(&tb_cpu_mask, mask);
diff --git a/xen/include/public/arch-x86/xen-mca.h
b/xen/include/public/arch-x86/xen-mca.h
--- a/xen/include/public/arch-x86/xen-mca.h
+++ b/xen/include/public/arch-x86/xen-mca.h
@@ -414,7 +414,7 @@ struct xen_mc_mceinject {
 
 struct xen_mc_inject_v2 {
 	uint32_t flags;
-	struct xenctl_cpumap cpumap;
+	struct xenctl_map cpumap;
 };
 #endif
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -283,7 +283,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_getvc
 /* XEN_DOMCTL_getvcpuaffinity */
 struct xen_domctl_vcpuaffinity {
     uint32_t  vcpu;              /* IN */
-    struct xenctl_cpumap cpumap; /* IN/OUT */
+    struct xenctl_map cpumap;    /* IN/OUT */
 };
 typedef struct xen_domctl_vcpuaffinity xen_domctl_vcpuaffinity_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpuaffinity_t);
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -71,8 +71,8 @@ struct xen_sysctl_tbuf_op {
 #define XEN_SYSCTL_TBUFOP_disable      5
     uint32_t cmd;
     /* IN/OUT variables */
-    struct xenctl_cpumap cpu_mask;
-    uint32_t             evt_mask;
+    struct xenctl_map cpu_mask;
+    uint32_t          evt_mask;
     /* OUT variables */
     uint64_aligned_t buffer_mfn;
     uint32_t size;  /* Also an IN variable! */
@@ -531,7 +531,7 @@ struct xen_sysctl_cpupool_op {
     uint32_t domid;       /* IN: M              */
     uint32_t cpu;         /* IN: AR             */
     uint32_t n_dom;       /*            OUT: I  */
-    struct xenctl_cpumap cpumap; /*     OUT: IF */
+    struct xenctl_map cpumap;    /*     OUT: IF */
 };
 typedef struct xen_sysctl_cpupool_op xen_sysctl_cpupool_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpupool_op_t);
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -822,9 +822,9 @@ typedef uint8_t xen_domain_handle_t[16];
 #endif
 
 #ifndef __ASSEMBLY__
-struct xenctl_cpumap {
+struct xenctl_map {
     XEN_GUEST_HANDLE_64(uint8) bitmap;
-    uint32_t nr_cpus;
+    uint32_t nr_elems;
 };
 #endif
 
diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
--- a/xen/include/xen/cpumask.h
+++ b/xen/include/xen/cpumask.h
@@ -424,8 +424,8 @@ extern cpumask_t cpu_present_map;
 #define for_each_present_cpu(cpu)  for_each_cpu(cpu, &cpu_present_map)
 
 /* Copy to/from cpumap provided by control tools. */
-struct xenctl_cpumap;
-int cpumask_to_xenctl_cpumap(struct xenctl_cpumap *, const cpumask_t *);
-int xenctl_cpumap_to_cpumask(cpumask_var_t *, const struct xenctl_cpumap *);
+struct xenctl_map;
+int cpumask_to_xenctl_map(struct xenctl_map *, const cpumask_t *);
+int xenctl_map_to_cpumask(cpumask_var_t *, const struct xenctl_map *);
 
 #endif /* __XEN_CPUMASK_H */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -2,7 +2,7 @@
 # ! - needs translation
 # ? - needs checking
 ?	dom0_vga_console_info		xen.h
-?	xenctl_cpumap			xen.h
+?	xenctl_map			xen.h
 ?	mmu_update			xen.h
 !	mmuext_op			xen.h
 !	start_info			xen.h

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

More specifically:
 1. introduces struct libxl_map;
 2. re-implement libxl_cpumap_* on top of struct libxl_map_*;

No functional nor interface changes at all.

This is in preparation of the introduction of NUMA nodes maps.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.eu.com>

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -282,11 +282,17 @@ typedef uint32_t libxl_hwcap[8];
 
 typedef uint64_t libxl_ev_user;
 
-typedef struct {
+struct libxl_map {
     uint32_t size;          /* number of bytes in map */
     uint8_t *map;
-} libxl_cpumap;
-void libxl_cpumap_dispose(libxl_cpumap *map);
+};
+void libxl_map_dispose(struct libxl_map *map);
+
+typedef struct libxl_map libxl_cpumap;
+static inline void libxl_cpumap_dispose(libxl_cpumap *cpumap)
+{
+    return libxl_map_dispose(cpumap);
+}
 
 typedef struct {
     /*
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -488,47 +488,53 @@ int libxl_mac_to_device_nic(libxl_ctx *c
     return rc;
 }
 
+void libxl_map_dispose(struct libxl_map *map)
+{
+    free(map->map);
+}
+
+static int libxl_map_alloc(libxl_ctx *ctx, struct libxl_map *map, int n_elems)
+{
+    int sz;
+
+    sz = (n_elems + 7) / 8;
+    map->map = calloc(sz, sizeof(*map->map));
+    if (!map->map)
+        return ERROR_NOMEM;
+    map->size = sz;
+    return 0;
+}
+
+int libxl_map_test(struct libxl_map *map, int elem)
+{
+    if (elem >= map->size * 8)
+        return 0;
+    return (map->map[elem / 8] & (1 << (elem & 7))) ? 1 : 0;
+}
+
+void libxl_map_set(struct libxl_map *map, int elem)
+{
+    if (elem >= map->size * 8)
+        return;
+    map->map[elem / 8] |= 1 << (elem & 7);
+}
+
+void libxl_map_reset(struct libxl_map *map, int elem)
+{
+    if (elem >= map->size * 8)
+        return;
+    map->map[elem / 8] &= ~(1 << (elem & 7));
+}
+
 int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap)
 {
     int max_cpus;
-    int sz;
 
     max_cpus = libxl_get_max_cpus(ctx);
     if (max_cpus == 0)
         return ERROR_FAIL;
 
-    sz = (max_cpus + 7) / 8;
-    cpumap->map = calloc(sz, sizeof(*cpumap->map));
-    if (!cpumap->map)
-        return ERROR_NOMEM;
-    cpumap->size = sz;
-    return 0;
-}
-
-void libxl_cpumap_dispose(libxl_cpumap *map)
-{
-    free(map->map);
-}
-
-int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
-{
-    if (cpu >= cpumap->size * 8)
-        return 0;
-    return (cpumap->map[cpu / 8] & (1 << (cpu & 7))) ? 1 : 0;
-}
-
-void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu)
-{
-    if (cpu >= cpumap->size * 8)
-        return;
-    cpumap->map[cpu / 8] |= 1 << (cpu & 7);
-}
-
-void libxl_cpumap_reset(libxl_cpumap *cpumap, int cpu)
-{
-    if (cpu >= cpumap->size * 8)
-        return;
-    cpumap->map[cpu / 8] &= ~(1 << (cpu & 7));
+    return libxl_map_alloc(ctx, cpumap, max_cpus);
 }
 
 int libxl_get_max_cpus(libxl_ctx *ctx)
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -63,21 +63,46 @@ int libxl_devid_to_device_nic(libxl_ctx 
 int libxl_vdev_to_device_disk(libxl_ctx *ctx, uint32_t domid, const char *vdev,
                                libxl_device_disk *disk);
 
+int libxl_map_test(struct libxl_map *map, int elem);
+void libxl_map_set(struct libxl_map *map, int elem);
+void libxl_map_reset(struct libxl_map *map, int elem);
+static inline void libxl_map_set_any(struct libxl_map *map)
+{
+    memset(map->map, -1, map->size);
+}
+static inline void libxl_map_set_none(struct libxl_map *map)
+{
+    memset(map->map, 0, map->size);
+}
+static inline int libxl_map_elem_valid(struct libxl_map *map, int elem)
+{
+    return elem >= 0 && elem < (map->size * 8);
+}
+
 int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap);
-int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu);
-void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu);
-void libxl_cpumap_reset(libxl_cpumap *cpumap, int cpu);
+static inline int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
+{
+    return libxl_map_test(cpumap, cpu);
+}
+static inline void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu)
+{
+    libxl_map_set(cpumap, cpu);
+}
+static inline void libxl_cpumap_reset(libxl_cpumap *cpumap, int cpu)
+{
+    libxl_map_reset(cpumap, cpu);
+}
 static inline void libxl_cpumap_set_any(libxl_cpumap *cpumap)
 {
-    memset(cpumap->map, -1, cpumap->size);
+    libxl_map_set_any(cpumap);
 }
 static inline void libxl_cpumap_set_none(libxl_cpumap *cpumap)
 {
-    memset(cpumap->map, 0, cpumap->size);
+    libxl_map_set_none(cpumap);
 }
 static inline int libxl_cpumap_cpu_valid(libxl_cpumap *cpumap, int cpu)
 {
-    return cpu >= 0 && cpu < (cpumap->size * 8);
+    return libxl_map_elem_valid(cpumap, cpu);
 }
 #define libxl_for_each_cpu(var, map) for (var = 0; var < (map).size * 8;
var++)
 #define libxl_for_each_set_cpu(v, m) for (v = 0; v < (m).size * 8; v++) \

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

As NUMA node-related counterparts of xc_cpumap_t and libxl_cpumap.

This is in preparation of making it possible to manipulate
NUMA nodes from the toolstack(s).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -35,11 +35,30 @@ int xc_get_max_cpus(xc_interface *xch)
     return max_cpus;
 }
 
+int xc_get_max_nodes(xc_interface *xch)
+{
+    static int max_nodes = 0;
+    xc_physinfo_t physinfo;
+
+    if ( max_nodes )
+        return max_nodes;
+
+    if ( !xc_physinfo(xch, &physinfo) )
+        max_nodes = physinfo.max_node_id + 1;
+
+    return max_nodes;
+}
+
 int xc_get_cpumap_size(xc_interface *xch)
 {
     return (xc_get_max_cpus(xch) + 7) / 8;
 }
 
+int xc_get_nodemap_size(xc_interface *xch)
+{
+    return (xc_get_max_nodes(xch) + 7) / 8;
+}
+
 xc_cpumap_t xc_cpumap_alloc(xc_interface *xch)
 {
     int sz;
@@ -50,6 +69,16 @@ xc_cpumap_t xc_cpumap_alloc(xc_interface
     return calloc(1, sz);
 }
 
+xc_nodemap_t xc_nodemap_alloc(xc_interface *xch)
+{
+    int sz;
+
+    sz = xc_get_nodemap_size(xch);
+    if (sz == 0)
+        return NULL;
+    return calloc(1, sz);
+}
+
 int xc_readconsolering(xc_interface *xch,
                        char *buffer,
                        unsigned int *pnr_chars,
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -330,6 +330,20 @@ int xc_get_cpumap_size(xc_interface *xch
 xc_cpumap_t xc_cpumap_alloc(xc_interface *xch);
 
 /*
+ * NODEMAP handling
+ */
+typedef uint8_t *xc_nodemap_t;
+
+/* return maximum number of NUMA nodes the hypervisor supports */
+int xc_get_max_nodes(xc_interface *xch);
+
+/* return array size for nodemap */
+int xc_get_nodemap_size(xc_interface *xch);
+
+/* allocate a nodemap */
+xc_nodemap_t xc_nodemap_alloc(xc_interface *xch);
+
+/*
  * DOMAIN DEBUGGING FUNCTIONS
  */
 
diff --git a/tools/libxl/gentest.py b/tools/libxl/gentest.py
--- a/tools/libxl/gentest.py
+++ b/tools/libxl/gentest.py
@@ -20,7 +20,7 @@ def randomize_case(s):
 def randomize_enum(e):
     return random.choice([v.name for v in e.values])
 
-handcoded = ["libxl_cpumap", "libxl_key_value_list",
+handcoded = ["libxl_cpumap", "libxl_nodemap",
"libxl_key_value_list",
              "libxl_cpuid_policy_list",
"libxl_file_reference",
              "libxl_string_list"]
 
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -294,6 +294,12 @@ static inline void libxl_cpumap_dispose(
     return libxl_map_dispose(cpumap);
 }
 
+typedef struct libxl_map libxl_nodemap;
+static inline void libxl_nodemap_dispose(libxl_nodemap *nodemap)
+{
+    return libxl_map_dispose(nodemap);
+}
+
 typedef struct {
     /*
      * Path is always set if the file reference is valid. However if
@@ -549,6 +555,9 @@ int libxl_domain_preserve(libxl_ctx *ctx
 /* get max. number of cpus supported by hypervisor */
 int libxl_get_max_cpus(libxl_ctx *ctx);
 
+/* get max. number of NUMA nodes supported by hypervisor */
+int libxl_get_max_nodes(libxl_ctx *ctx);
+
 /*
  * Run the configured bootloader for a PV domain and update
  * info->kernel, info->u.pv.ramdisk and info->u.pv.cmdline as
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -11,6 +11,7 @@ libxl_domid = Builtin("domid", json_fn  libxl_uuid =
Builtin("uuid", passby=PASS_BY_REFERENCE)
 libxl_mac = Builtin("mac", passby=PASS_BY_REFERENCE)
 libxl_cpumap = Builtin("cpumap",
dispose_fn="libxl_cpumap_dispose", passby=PASS_BY_REFERENCE)
+libxl_nodemap = Builtin("nodemap",
dispose_fn="libxl_nodemap_dispose", passby=PASS_BY_REFERENCE)
 libxl_cpuid_policy_list = Builtin("cpuid_policy_list",
dispose_fn="libxl_cpuid_dispose", passby=PASS_BY_REFERENCE)
 
 libxl_string_list = Builtin("string_list",
dispose_fn="libxl_string_list_dispose", passby=PASS_BY_REFERENCE)
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -537,11 +537,27 @@ int libxl_cpumap_alloc(libxl_ctx *ctx, l
     return libxl_map_alloc(ctx, cpumap, max_cpus);
 }
 
+int libxl_nodemap_alloc(libxl_ctx *ctx, libxl_nodemap *nodemap)
+{
+    int max_nodes;
+
+    max_nodes = libxl_get_max_nodes(ctx);
+    if (max_nodes == 0)
+        return ERROR_FAIL;
+
+    return libxl_map_alloc(ctx, nodemap, max_nodes);
+}
+
 int libxl_get_max_cpus(libxl_ctx *ctx)
 {
     return xc_get_max_cpus(ctx->xch);
 }
 
+int libxl_get_max_nodes(libxl_ctx *ctx)
+{
+    return xc_get_max_nodes(ctx->xch);
+}
+
 int libxl__enum_from_string(const libxl_enum_string_table *t,
                             const char *s, int *e)
 {
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -108,6 +108,35 @@ static inline int libxl_cpumap_cpu_valid
 #define libxl_for_each_set_cpu(v, m) for (v = 0; v < (m).size * 8; v++) \
                                              if (libxl_cpumap_test(&(m),
v))
 
+int libxl_nodemap_alloc(libxl_ctx *ctx, libxl_nodemap *nodemap);
+static inline int libxl_nodemap_test(libxl_nodemap *nodemap, int node)
+{
+    return libxl_map_test(nodemap, node);
+}
+static inline void libxl_nodemap_set(libxl_nodemap *nodemap, int node)
+{
+    libxl_map_set(nodemap, node);
+}
+static inline void libxl_nodemap_reset(libxl_nodemap *nodemap, int node)
+{
+    libxl_map_reset(nodemap, node);
+}
+static inline void libxl_nodemap_set_any(libxl_nodemap *nodemap)
+{
+    libxl_map_set_any(nodemap);
+}
+static inline void libxl_nodemap_set_none(libxl_nodemap *nodemap)
+{
+    libxl_map_set_none(nodemap);
+}
+static inline int libxl_nodemap_node_valid(libxl_nodemap *nodemap, int node)
+{
+    return libxl_map_elem_valid(nodemap, node);
+}
+#define libxl_for_each_node(var, map) for (var = 0; var < (map).size * 8;
var++)
+#define libxl_for_each_set_node(v, m) for (v = 0; v < (m).size * 8; v++) \
+                                              if (libxl_nodemap_test(&(m),
v))
+
 static inline uint32_t libxl__sizekb_to_mb(uint32_t s) {
     return (s + 1023) / 1024;
 }
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -116,6 +116,21 @@ int xenctl_map_to_cpumask(cpumask_var_t 
     return err;
 }
 
+int nodemask_to_xenctl_map(struct xenctl_map *xenctl_nodemap,
+                           const nodemask_t *nodemask)
+{
+    return bitmap_to_xenctl_map(xenctl_nodemap, cpumask_bits(nodemask),
+                                MAX_NUMNODES);
+}
+
+int xenctl_map_to_nodemask(nodemask_t *nodemask,
+                               const struct xenctl_map *xenctl_nodemap)
+{
+    return xenctl_map_to_bitmap(nodes_addr(*nodemask),
+                                xenctl_nodemap,
+                                MAX_NUMNODES);
+}
+
 static inline int is_free_domid(domid_t dom)
 {
     struct domain *d;

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

By adding copying and *_is_full/*_is_empty facilities.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -505,6 +505,35 @@ static int libxl_map_alloc(libxl_ctx *ct
     return 0;
 }
 
+static void libxl_map_copy(struct libxl_map *dptr,
+                           const struct libxl_map *sptr)
+{
+    int sz;
+
+    sz = dptr->size = sptr->size;
+    memcpy(dptr->map, sptr->map, sz * sizeof(*dptr->map));
+}
+
+int libxl_map_is_full(struct libxl_map *map)
+{
+    int i;
+
+    for (i = 0; i < map->size; i++)
+        if (map->map[i] != (uint8_t)-1)
+            return -1;
+    return 0;
+}
+
+int libxl_map_is_empty(struct libxl_map *map)
+{
+    int i;
+
+    for (i = 0; i < map->size; i++)
+        if (map->map[i])
+            return -1;
+    return 0;
+}
+
 int libxl_map_test(struct libxl_map *map, int elem)
 {
     if (elem >= map->size * 8)
@@ -548,6 +577,18 @@ int libxl_nodemap_alloc(libxl_ctx *ctx, 
     return libxl_map_alloc(ctx, nodemap, max_nodes);
 }
 
+void libxl_cpumap_copy(/*XXX libxl_ctx *ctx,*/ libxl_cpumap *dst,
+                       const libxl_cpumap *src)
+{
+    libxl_map_copy(dst, src);
+}
+
+void libxl_nodemap_copy(/*XXX libxl_ctx *ctx,*/ libxl_nodemap *dst,
+                       const libxl_nodemap *src)
+{
+   libxl_map_copy(dst, src);
+}
+
 int libxl_get_max_cpus(libxl_ctx *ctx)
 {
     return xc_get_max_cpus(ctx->xch);
diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
--- a/tools/libxl/libxl_utils.h
+++ b/tools/libxl/libxl_utils.h
@@ -63,6 +63,8 @@ int libxl_devid_to_device_nic(libxl_ctx 
 int libxl_vdev_to_device_disk(libxl_ctx *ctx, uint32_t domid, const char *vdev,
                                libxl_device_disk *disk);
 
+int libxl_map_is_full(struct libxl_map *map);
+int libxl_map_is_empty(struct libxl_map *map);
 int libxl_map_test(struct libxl_map *map, int elem);
 void libxl_map_set(struct libxl_map *map, int elem);
 void libxl_map_reset(struct libxl_map *map, int elem);
@@ -80,6 +82,16 @@ static inline int libxl_map_elem_valid(s
 }
 
 int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap);
+void libxl_cpumap_copy(/*XXX libxl_ctx *ctx, */ libxl_cpumap *dst,
+                       const libxl_cpumap *src);
+static inline int libxl_cpumap_is_full(libxl_cpumap *cpumap)
+{
+    return libxl_map_is_full(cpumap);
+}
+static inline int libxl_cpumap_is_empty(libxl_cpumap *cpumap)
+{
+    return libxl_map_is_empty(cpumap);
+}
 static inline int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
 {
     return libxl_map_test(cpumap, cpu);
@@ -109,6 +121,16 @@ static inline int libxl_cpumap_cpu_valid
                                              if (libxl_cpumap_test(&(m),
v))
 
 int libxl_nodemap_alloc(libxl_ctx *ctx, libxl_nodemap *nodemap);
+void libxl_nodemap_copy(/*XXX libxl_ctx *ctx, */ libxl_nodemap *dst,
+                       const libxl_nodemap *src);
+static inline int libxl_nodemap_is_full(libxl_nodemap *nodemap)
+{
+    return libxl_map_is_full(nodemap);
+}
+static inline int libxl_nodemap_is_empty(libxl_nodemap *nodemap)
+{
+    return libxl_map_is_empty(nodemap);
+}
 static inline int libxl_nodemap_test(libxl_nodemap *nodemap, int node)
 {
     return libxl_map_test(nodemap, node);

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 05 of 11] libxl: add a new Array type to the IDL

And make all the required infrastructure updates to enable this.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxl/gentest.py b/tools/libxl/gentest.py
--- a/tools/libxl/gentest.py
+++ b/tools/libxl/gentest.py
@@ -28,6 +28,18 @@ def gen_rand_init(ty, v, indent = "    "
     s = ""
     if isinstance(ty, idl.Enumeration):
         s += "%s = %s;\n" % (ty.pass_arg(v, parent is None),
randomize_enum(ty))
+    elif isinstance(ty, idl.Array):
+        if parent is None:
+            raise Exception("Array type must have a parent")
+        s += "%s = rand()%%8;\n" % (parent + ty.lenvar.name)
+        s += "%s = calloc(%s, sizeof(*%s));\n" % \
+            (v, parent + ty.lenvar.name, v)
+        s += "{\n"
+        s += "    int i;\n"
+        s += "    for (i=0; i<%s; i++)\n" % (parent +
ty.lenvar.name)
+        s += gen_rand_init(ty.elem_type, v+"[i]",
+                           indent + "        ", parent)
+        s += "}\n"
     elif isinstance(ty, idl.KeyedUnion):
         if parent is None:
             raise Exception("KeyedUnion type must have a parent")
diff --git a/tools/libxl/gentypes.py b/tools/libxl/gentypes.py
--- a/tools/libxl/gentypes.py
+++ b/tools/libxl/gentypes.py
@@ -11,8 +11,12 @@ def libxl_C_instance_of(ty, instancename
             return libxl_C_type_define(ty)
         else:
             return libxl_C_type_define(ty) + " " + instancename
-    else:
-        return ty.typename + " " + instancename
+
+    s = ""
+    if isinstance(ty, idl.Array):
+        s += libxl_C_instance_of(ty.lenvar.type, ty.lenvar.name) +
";\n"
+
+    return s + ty.typename + " " + instancename
 
 def libxl_C_type_define(ty, indent = ""):
     s = ""
@@ -66,6 +70,18 @@ def libxl_C_type_dispose(ty, v, indent              s +=
libxl_C_type_dispose(f.type, fexpr, indent + "    ", nparent)
             s += "    break;\n"
         s += "}\n"
+    elif isinstance(ty, idl.Array):
+        if parent is None:
+            raise Exception("Array type must have a parent")
+        s += "{\n"
+        if ty.elem_type.dispose_fn is not None:
+            s += "    int i;\n"
+            s += "    for (i=0; i<%s; i++)\n" % (parent +
ty.lenvar.name)
+            s += libxl_C_type_dispose(ty.elem_type, v+"[i]",
+                                      indent + "        ", parent)
+        if ty.dispose_fn is not None:
+            s += "    %s(%s);\n" % (ty.dispose_fn, ty.pass_arg(v,
parent is None))
+        s += "}\n"
     elif isinstance(ty, idl.Struct) and (parent is None or ty.dispose_fn is
None):
         for f in [f for f in ty.fields if not f.const]:
             (nparent,fexpr) = ty.member(v, f, parent is None)
@@ -164,7 +180,24 @@ def libxl_C_type_gen_json(ty, v, indent 
     s = ""
     if parent is None:
         s += "yajl_gen_status s;\n"
-    if isinstance(ty, idl.Enumeration):
+
+    if isinstance(ty, idl.Array):
+        if parent is None:
+            raise Exception("Array type must have a parent")
+        s += "{\n"
+        s += "    int i;\n"
+        s += "    s = yajl_gen_array_open(hand);\n"
+        s += "    if (s != yajl_gen_status_ok)\n"
+        s += "        goto out;\n"
+        s += "    for (i=0; i<%s; i++) {\n" % (parent +
ty.lenvar.name)
+        s += libxl_C_type_gen_json(ty.elem_type, v+"[i]",
+                                   indent + "        ", parent)
+        s += "    }\n"
+        s += "    s = yajl_gen_array_close(hand);\n"
+        s += "    if (s != yajl_gen_status_ok)\n"
+        s += "        goto out;\n"
+        s += "}\n"
+    elif isinstance(ty, idl.Enumeration):
         s += "s = libxl__yajl_gen_enum(hand, %s_to_string(%s));\n" %
(ty.typename, ty.pass_arg(v, parent is None))
         s += "if (s != yajl_gen_status_ok)\n"
         s += "    goto out;\n"
diff --git a/tools/libxl/idl.py b/tools/libxl/idl.py
--- a/tools/libxl/idl.py
+++ b/tools/libxl/idl.py
@@ -266,6 +266,17 @@ string = Builtin("char *", namespace = N
                  json_fn = "libxl__string_gen_json",
                  autogenerate_json = False)
 
+class Array(Type):
+    """An array of the same type"""
+    def __init__(self, elem_type, lenvar_name, **kwargs):
+        kwargs.setdefault(''dispose_fn'',
''free'')
+        Type.__init__(self, namespace=elem_type.namespace,
typename=elem_type.rawname + " *", **kwargs)
+
+        lv_kwargs = dict([(x.lstrip(''lenvar_''),y) for (x,y)
in kwargs.items() if x.startswith(''lenvar_'')])
+
+        self.lenvar = Field(integer, lenvar_name, **lv_kwargs)
+        self.elem_type = elem_type
+
 class OrderedDict(dict):
     """A dictionary which remembers insertion order.
 
diff --git a/tools/libxl/idl.txt b/tools/libxl/idl.txt
--- a/tools/libxl/idl.txt
+++ b/tools/libxl/idl.txt
@@ -145,12 +145,24 @@ idl.KeyedUnion
 
  A subclass of idl.Aggregate which represents the C union type
  where the currently valid member of the union can be determined based
- upon another member in the containing type.
+ upon another member in the containing type. An idl.KeyedUnion must
+ always be a member of a containing idl.Aggregate type.
 
- The KeyedUnion.keyvar contains an idl.type the member of the
+ The KeyedUnion.keyvar contains an idl.Type the member of the
  containing type which determines the valid member of the union. The
  must be an instance of the Enumeration type.
 
+idl.Array
+
+  A class representing an array or similar elements. An idl.Array must
+  always be an idl.Field of a containing idl.Aggregate.
+
+  idl.Array.elem_type contains an idl.Type which is the type of all
+  elements of the array.
+
+  idl.Array.len_var contains an idl.Field which is added to the parent
+  idl.Aggregate and will contain the length of the array.
+
 Standard Types
 --------------
 
diff --git a/tools/ocaml/libs/xl/genwrap.py b/tools/ocaml/libs/xl/genwrap.py
--- a/tools/ocaml/libs/xl/genwrap.py
+++ b/tools/ocaml/libs/xl/genwrap.py
@@ -54,7 +54,8 @@ def ocaml_type_of(ty):
             return "int%d" % ty.width
         else:
             return "int"
-
+    elif isinstance(ty,idl.Array):
+        return "%s list" % ocaml_type_of(ty.elem_type)
     elif isinstance(ty,idl.Builtin):
         if not builtins.has_key(ty.typename):
             raise NotImplementedError("Unknown Builtin %s (%s)" %
(ty.typename, type(ty)))
diff --git a/tools/python/genwrap.py b/tools/python/genwrap.py
--- a/tools/python/genwrap.py
+++ b/tools/python/genwrap.py
@@ -4,7 +4,7 @@ import sys,os
 
 import idl
 
-(TYPE_DEFBOOL, TYPE_BOOL, TYPE_INT, TYPE_UINT, TYPE_STRING, TYPE_AGGREGATE) =
range(6)
+(TYPE_DEFBOOL, TYPE_BOOL, TYPE_INT, TYPE_UINT, TYPE_STRING, TYPE_ARRAY,
TYPE_AGGREGATE) = range(7)
 
 def py_type(ty):
     if ty == idl.bool:
@@ -18,6 +18,8 @@ def py_type(ty):
             return TYPE_INT
         else:
             return TYPE_UINT
+    if isinstance(ty, idl.Array):
+        return TYPE_ARRAY
     if isinstance(ty, idl.Aggregate):
         return TYPE_AGGREGATE
     if ty == idl.string:
@@ -74,7 +76,7 @@ def py_attrib_get(ty, f):
         l.append(''    return
genwrap__ull_get(self->obj.%s);''%f.name)
     elif t == TYPE_STRING:
         l.append(''    return
genwrap__string_get(&self->obj.%s);''%f.name)
-    elif t == TYPE_AGGREGATE:
+    elif t == TYPE_AGGREGATE or t == TYPE_ARRAY:
         l.append(''    PyErr_SetString(PyExc_NotImplementedError,
"Getting %s");''%ty.typename)
         l.append(''    return NULL;'')
     else:
@@ -105,7 +107,7 @@ def py_attrib_set(ty, f):
         l.append(''    return ret;'')
     elif t == TYPE_STRING:
         l.append(''    return genwrap__string_set(v,
&self->obj.%s);''%f.name)
-    elif t == TYPE_AGGREGATE:
+    elif t == TYPE_AGGREGATE or t == TYPE_ARRAY:
         l.append(''    PyErr_SetString(PyExc_NotImplementedError,
"Setting %s");''%ty.typename)
         l.append(''    return -1;'')
     else:

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Make some NUMA node information available to the toolstack. Achieve
this by means of xc_numainfo(), which exposes memory size and amount
of free memory of each node, as well as the relative distances of
each node to all the others.

For properly exposing distances we need the IDL to support arrays.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3014,6 +3014,83 @@ fail:
     return ret;
 }
 
+libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
+{
+    xc_numainfo_t ninfo;
+    DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
+    DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
+    DECLARE_HYPERCALL_BUFFER(uint32_t, node_distances);
+    libxl_numainfo *ret = NULL;
+    int i, j, max_nodes;
+
+    max_nodes = libxl_get_max_nodes(ctx);
+    if (max_nodes == 0)
+    {
+        LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of
NODES");
+        return NULL;
+    }
+
+    memsize = xc_hypercall_buffer_alloc
+        (ctx->xch, memsize, sizeof(*memsize) * max_nodes);
+    memfree = xc_hypercall_buffer_alloc
+        (ctx->xch, memfree, sizeof(*memfree) * max_nodes);
+    node_distances = xc_hypercall_buffer_alloc
+        (ctx->xch, node_distances, sizeof(*node_distances) * max_nodes *
max_nodes);
+    if ((memsize == NULL) || (memfree == NULL) || (node_distances == NULL)) {
+        LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
+                            "Unable to allocate hypercall
arguments");
+        goto fail;
+    }
+
+    set_xen_guest_handle(ninfo.node_to_memsize, memsize);
+    set_xen_guest_handle(ninfo.node_to_memfree, memfree);
+    set_xen_guest_handle(ninfo.node_to_node_distance, node_distances);
+    ninfo.max_node_index = max_nodes - 1;
+    if (xc_numainfo(ctx->xch, &ninfo) != 0) {
+        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "getting numainfo");
+        goto fail;
+    }
+
+    if (ninfo.max_node_index < max_nodes - 1)
+        max_nodes = ninfo.max_node_index + 1;
+
+    ret = malloc(sizeof(libxl_numainfo) * max_nodes);
+    if (ret == NULL) {
+        LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
+                            "Unable to allocate return value");
+        goto fail;
+    }
+    for (i = 0; i < max_nodes; i++) {
+        ret[i].dists = malloc(sizeof(*node_distances) * max_nodes);
+        if (ret == NULL) {
+            for (j = i; j >= 0; j--)
+                free(ret[i].dists);
+            free(ret);
+            goto fail;
+        }
+    }
+
+    for (i = 0; i < max_nodes; i++) {
+#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
+    LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
+        ret[i].size = V(memsize, i);
+        ret[i].free = V(memfree, i);
+        ret[i].num_dists = max_nodes;
+        for (j = 0; j < ret[i].num_dists; j++)
+            ret[i].dists[j] = V(node_distances, i * max_nodes + j);
+#undef V
+    }
+
+fail:
+    xc_hypercall_buffer_free(ctx->xch, memsize);
+    xc_hypercall_buffer_free(ctx->xch, memfree);
+    xc_hypercall_buffer_free(ctx->xch, node_distances);
+
+    if (ret)
+        *nr = max_nodes;
+    return ret;
+}
+
 const libxl_version_info* libxl_get_version_info(libxl_ctx *ctx)
 {
     union {
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -810,6 +810,9 @@ int libxl_get_physinfo(libxl_ctx *ctx, l
 #define LIBXL_CPUTOPOLOGY_INVALID_ENTRY (~(uint32_t)0)
 libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nr);
 void libxl_cputopology_list_free(libxl_cputopology *, int nr);
+#define LIBXL_NUMAINFO_INVALID_ENTRY (~(uint32_t)0)
+libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr);
+void libxl_numainfo_list_free(libxl_numainfo *, int nr);
 libxl_vcpuinfo *libxl_list_vcpu(libxl_ctx *ctx, uint32_t domid,
                                        int *nb_vcpu, int *nrcpus);
 int libxl_set_vcpuaffinity(libxl_ctx *ctx, uint32_t domid, uint32_t vcpuid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -427,6 +427,12 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_hvm_directio", bool),
     ], dir=DIR_OUT)
 
+libxl_numainfo = Struct("numainfo", [
+    ("size", uint64),
+    ("free", uint64),
+    ("dists", Array(uint32, "num_dists")),
+    ], dir=DIR_OUT)
+
 libxl_cputopology = Struct("cputopology", [
     ("core", uint32),
     ("socket", uint32),
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -613,6 +613,14 @@ int libxl__enum_from_string(const libxl_
     return ERROR_FAIL;
 }
 
+void libxl_numainfo_list_free(libxl_numainfo *list, int nr)
+{
+    int i;
+    for (i = 0; i < nr; i++)
+        libxl_numainfo_dispose(&list[i]);
+    free(list);
+}
+
 void libxl_cputopology_list_free(libxl_cputopology *list, int nr)
 {
     int i;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -484,6 +484,7 @@ typedef struct xen_sysctl_topologyinfo x
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_topologyinfo_t);
 
 /* XEN_SYSCTL_numainfo */
+#define INVALID_NUMAINFO_ID (~0U)
 struct xen_sysctl_numainfo {
     /*
      * IN: maximum addressable entry in the caller-provided arrays.

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 07 of 11] xen: enhance dump_numa output

By showing the number of free pages on each node.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -365,10 +365,11 @@ static void dump_numa(unsigned char key)
 
 	for_each_online_node(i) {
 		paddr_t pa = (paddr_t)(NODE_DATA(i)->node_start_pfn + 1)<<
PAGE_SHIFT;
-		printk("idx%d -> NODE%d start->%lu size->%lu\n",
+		printk("idx%d -> NODE%d start->%lu size->%lu
free->%lu\n",
 			  i, NODE_DATA(i)->node_id,
 			  NODE_DATA(i)->node_start_pfn,
-			  NODE_DATA(i)->node_spanned_pages);
+			  NODE_DATA(i)->node_spanned_pages,
+                          avail_node_heap_pages(i));
 		/* sanity check phys_to_nid() */
 		printk("phys_to_nid(%"PRIpaddr") -> %d should be
%d\n", pa, phys_to_nid(pa),
 			  NODE_DATA(i)->node_id);

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

So that the user knows how much memory there is on each node and
how far they are from each others.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4217,6 +4217,36 @@ static void output_physinfo(void)
     return;
 }
 
+static void output_numainfo(void)
+{
+    libxl_numainfo *info;
+    int i, j, nr;
+
+    info = libxl_get_numainfo(ctx, &nr);
+    if (info == NULL) {
+        fprintf(stderr, "libxl_get_numainfo failed.\n");
+        return;
+    }
+
+    printf("numa_info              :\n");
+    printf("node:    memsize    memfree    distances\n");
+
+    for (i = 0; i < nr; i++) {
+        if (info[i].size != LIBXL_NUMAINFO_INVALID_ENTRY) {
+            printf("%4d:    %6"PRIu64"     %6"PRIu64" 
%d", i,
+                   info[i].size / (1 << 20), info[i].free / (1 <<
20),
+                   info[i].dists[0]);
+            for (j = 1; j < info[i].num_dists; j++)
+                printf(",%d", info[i].dists[j]);
+            printf("\n");
+        }
+    }
+
+    libxl_numainfo_list_free(info, nr);
+
+    return;
+}
+
 static void output_topologyinfo(void)
 {
     libxl_cputopology *info;
@@ -4239,8 +4269,6 @@ static void output_topologyinfo(void)
 
     libxl_cputopology_list_free(info, nr);
 
-    printf("numa_info              : none\n");
-
     return;
 }
 
@@ -4250,8 +4278,10 @@ static void info(int numa)
 
     output_physinfo();
 
-    if (numa)
+    if (numa) {
         output_topologyinfo();
+        output_numainfo();
+    }
 
     output_xeninfo();

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

If a domain does not have a VCPU affinity, try to pin it automatically
to some PCPUs. This is done taking into account the NUMA characteristics
of the host: we look for a combination of host''s NUMA nodes that has
enough
free memoy for the new domain, and pin it to the VCPUs of those nodes.
Smaller combinations are considered first, to avoid spreading the
domain''s memory among too many nodes.

After that, we also ensure that the chosen combination comprises at
least the same number of PCPUs than the number of VCPUs of the domain,
increasing the number of nodes it is made up of if that is not the case.
When adding new nodes to a combination, node distances are taken into
account, trying to minimize the peformance impact for the domain.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -111,8 +111,8 @@ created online and the remainder will be
 
 =item B<cpus="CPU-LIST">
 
-List of which cpus the guest is allowed to use. Default behavior is
-`all cpus`. A C<CPU-LIST> may be specified as follows:
+List of which cpus the guest is allowed to use. By default xl will
+pick some cpus (see below). A C<CPU-LIST> may be specified as follows:
 
 =over 4
 
@@ -132,6 +132,12 @@ run on cpu #3 of the host.
 
 =back
 
+If this option is not specified, xl automatically try to place the new
+domain on the host''s NUMA nodes (provided the host has more than one
NUMA
+node) by pinning it to the cpus of those nodes. The employed heuristics
+tries to maximize performance for the domain itself, while also achieving
+efficient resource utilization.
+
 =item B<cpu_weight=WEIGHT>
 
 A domain with a weight of 512 will get twice as much CPU as a domain
diff --git a/tools/libxl/gentest.py b/tools/libxl/gentest.py
--- a/tools/libxl/gentest.py
+++ b/tools/libxl/gentest.py
@@ -131,6 +131,19 @@ static void libxl_cpumap_rand_init(libxl
     }
 }
 
+static void libxl_nodemap_rand_init(libxl_nodemap *nodemap)
+{
+    int i;
+    nodemap->size = rand() % 16;
+    nodemap->map = calloc(nodemap->size, sizeof(*nodemap->map));
+    libxl_for_each_node(i, *nodemap) {
+        if (rand() % 2)
+            libxl_nodemap_set(nodemap, i);
+        else
+            libxl_nodemap_reset(nodemap, i);
+    }
+}
+
 static void libxl_key_value_list_rand_init(libxl_key_value_list *pkvl)
 {
     int i, nr_kvp = rand() % 16;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -626,6 +626,15 @@ libxl_cpupoolinfo * libxl_list_cpupool(l
 libxl_vminfo * libxl_list_vm(libxl_ctx *ctx, int *nb_vm);
 void libxl_vminfo_list_free(libxl_vminfo *list, int nr);
 
+/* Automatic NUMA placement */
+libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
+                                        libxl_domain_build_info *b_info,
+                                        int min_nodes, int *nr_cndts);
+int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
+                                  int min_cpus, int max_nodes,
+                                  libxl_numa_candidate *candidate);
+void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int nr);
+
 /*
  * Devices
  * ======diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c
--- a/tools/libxl/libxl_json.c
+++ b/tools/libxl/libxl_json.c
@@ -119,6 +119,26 @@ out:
     return s;
 }
 
+yajl_gen_status libxl_nodemap_gen_json(yajl_gen hand,
+                                       libxl_nodemap *nodemap)
+{
+    yajl_gen_status s;
+    int i;
+
+    s = yajl_gen_array_open(hand);
+    if (s != yajl_gen_status_ok) goto out;
+
+    libxl_for_each_node(i, *nodemap) {
+        if (libxl_nodemap_test(nodemap, i)) {
+            s = yajl_gen_integer(hand, i);
+            if (s != yajl_gen_status_ok) goto out;
+        }
+    }
+    s = yajl_gen_array_close(hand);
+out:
+    return s;
+}
+
 yajl_gen_status libxl_key_value_list_gen_json(yajl_gen hand,
                                               libxl_key_value_list *pkvl)
 {
diff --git a/tools/libxl/libxl_json.h b/tools/libxl/libxl_json.h
--- a/tools/libxl/libxl_json.h
+++ b/tools/libxl/libxl_json.h
@@ -27,6 +27,7 @@ yajl_gen_status libxl_domid_gen_json(yaj
 yajl_gen_status libxl_uuid_gen_json(yajl_gen hand, libxl_uuid *p);
 yajl_gen_status libxl_mac_gen_json(yajl_gen hand, libxl_mac *p);
 yajl_gen_status libxl_cpumap_gen_json(yajl_gen hand, libxl_cpumap *p);
+yajl_gen_status libxl_nodemap_gen_json(yajl_gen hand, libxl_nodemap *p);
 yajl_gen_status libxl_cpuid_policy_list_gen_json(yajl_gen hand,
                                                  libxl_cpuid_policy_list *p);
 yajl_gen_status libxl_string_list_gen_json(yajl_gen hand, libxl_string_list
*p);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -439,6 +439,12 @@ libxl_cputopology = Struct("cputopology"
     ("node", uint32),
     ], dir=DIR_OUT)
 
+libxl_numa_candidate = Struct("numa_candidate", [
+    ("nr_nodes", integer),
+    ("free_memkb", uint32),
+    ("nodemap", libxl_nodemap),
+    ], dir=DIR_OUT)
+
 libxl_sched_credit_domain = Struct("sched_credit_domain", [
     ("weight", integer),
     ("cap", integer),
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -613,6 +613,250 @@ int libxl__enum_from_string(const libxl_
     return ERROR_FAIL;
 }
 
+static uint64_t node_to_nodemap_distance(const libxl_numainfo *ninfo,
+                                         libxl_nodemap *nodemap,
+                                         int node)
+{
+    int i;
+    uint64_t dist = 0;
+
+    /* Sum of the distances of a node from all the nodes in the map */
+    libxl_for_each_set_node(i, *nodemap)
+        dist += labs(ninfo[node].dists[i]);
+
+    return dist;
+}
+
+static int add_node_to_nodemap(const libxl_numainfo *ninfo, int nr_nodes,
+                               libxl_nodemap *nodemap)
+{
+    int i, node = -1;
+    uint64_t min_dist = UINT64_MAX;
+
+    /* Find the node closest to our nodemap */
+    for (i = 0; i < nr_nodes && !libxl_nodemap_test(nodemap, i);
i++) {
+        if (node_to_nodemap_distance(ninfo, nodemap, i) < min_dist) {
+            min_dist = node_to_nodemap_distance(ninfo, nodemap, i);
+            node = i;
+        }
+    }
+    if (node >= 0)
+        libxl_nodemap_set(nodemap, node);
+
+    return node;
+}
+
+/*
+ * The following two uility functions compute the node maps
+ * coresponding to the [ n! / (k! * (n - k)!) ] combinations
+ * of @size nodes within a set that is @nr_nodes big, without
+ * repetition. The algorithm used for generating them uses
+ * a vector containing the indexes in the set of the elements
+ * of the i-eth combination to generate the (i+1)-eth, and
+ * ensures they come in lexicographic order.
+ *
+ * For example, with n = 5 and k = 3, calling numa_node_set_init()
+ * will generate "0, 1, 2", while subsequent calls to
+ * numa_node_set_next() yield the following:
+ * "0, 1, 2
+ *  0, 1, 3
+ *  0, 1, 4
+ *  0, 2, 3
+ *  0, 2, 4
+ *  0, 3, 4
+ *  1, 2, 3
+ *  1, 2, 4
+ *  1, 3, 4
+ *  2, 3, 4"
+ */
+static void numa_node_set_init(int nr_nodes, int size, int *idxs,
+                              libxl_nodemap *nodemap)
+{
+    int i;
+
+    /* First set always is "0, 1, 2, ..., size-1" */
+    libxl_nodemap_set_none(nodemap);
+    for (i = 0; i < size; i++) {
+        libxl_nodemap_set(nodemap, i);
+        idxs[i] = i;
+    }
+}
+
+static int numa_node_set_next(int nr_nodes, int size, int *idxs,
+                              libxl_nodemap *nodemap)
+{
+    int i;
+
+    /* Check whether we just need to increase the rightmost index */
+    if (idxs[size - 1] < nr_nodes - 1) {
+        idxs[size - 1]++;
+        goto build_nodemap;
+    }
+
+    /* Find the rightmost element from where to start the increasing */
+    for (i = size - 1; idxs[i] == nr_nodes - size + i; i--) {
+        if (i <= 0) {
+            /* No more valid combinations! */
+            return -1;
+        }
+    }
+    for (idxs[i]++, i += 1; i < size; i++)
+        idxs[i] = idxs[i - 1] + 1;
+
+ build_nodemap:
+    libxl_nodemap_set_none(nodemap);
+    for (i = 0; i < size; i++)
+        libxl_nodemap_set(nodemap, idxs[i]);
+
+    return 0;
+}
+
+libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
+                                        libxl_domain_build_info *b_info,
+                                        int min_nodes, int *nr_cndts)
+{
+    uint32_t dom_memkb, nodes_memkb;
+    libxl_numa_candidate *cndts = NULL;
+    libxl_nodemap suitable_nodes;
+    int *suitable_nodes_idx;
+    libxl_numainfo *ninfo;
+    int nr_nodes, i, next_comb_ok;
+
+    ninfo = libxl_get_numainfo(ctx, &nr_nodes);
+    if (ninfo == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_get_numainfo
failed.\n");
+        goto out;
+    }
+
+    /* Prepare the nodemap and the index array for the combinations */
+    if (libxl_nodemap_alloc(ctx, &suitable_nodes) < 0) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_nodemap_alloc
failed\n");
+        goto out_numainfo;
+    }
+    suitable_nodes_idx = malloc(sizeof(int) * nr_nodes);
+    if (suitable_nodes_idx == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "malloc failed\n");
+        goto out_nodemap;
+    }
+
+    /* Memoy needed by the domain */
+    if (libxl_domain_need_memory(ctx, b_info, &dom_memkb)) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_domain_need_memory
failed\n");
+        goto out_nodemap_idx;
+    }
+
+    *nr_cndts = 0;
+    while (1) {
+        /* Avoid considering a silly number of nodes */
+        if (min_nodes > nr_nodes)
+            break;
+
+        /* Generate the combinations and check their cumulative free memory */
+        numa_node_set_init(nr_nodes, min_nodes, suitable_nodes_idx,
+                           &suitable_nodes);
+        do {
+            nodes_memkb = 0;
+            libxl_for_each_set_node(i, suitable_nodes)
+                nodes_memkb += ninfo[i].free / 1024;
+
+            /* If free memory is enough, this is a valid candidate */
+            if (nodes_memkb >= dom_memkb) {
+                cndts = realloc(cndts, sizeof(cndts[0]) * (*nr_cndts+1));
+                if (cndts == NULL) {
+                    LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "realloc
failed\n");
+                    goto out_nodemap_idx;
+                }
+
+                libxl_nodemap_alloc(ctx, &cndts[*nr_cndts].nodemap);
+                libxl_nodemap_copy(&cndts[*nr_cndts].nodemap,
&suitable_nodes);
+                cndts[*nr_cndts].free_memkb = nodes_memkb;
+                cndts[*nr_cndts].nr_nodes = min_nodes;
+                (*nr_cndts)++;
+            }
+
+            next_comb_ok = numa_node_set_next(nr_nodes, min_nodes,
+                                              suitable_nodes_idx,
+                                              &suitable_nodes) == 0;
+        } while (next_comb_ok);
+
+        min_nodes++;
+    }
+
+ out_nodemap_idx:
+    free(suitable_nodes_idx);
+ out_nodemap:
+    libxl_nodemap_dispose(&suitable_nodes);
+ out_numainfo:
+    libxl_numainfo_list_free(ninfo, nr_nodes);
+ out:
+    return cndts;
+}
+
+int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
+                                  int min_cpus, int max_nodes,
+                                  libxl_numa_candidate *candidate)
+{
+    int dom_nodes, nodes_cpus;
+    libxl_cputopology *tinfo;
+    libxl_numainfo *ninfo;
+    int nr_nodes, nr_cpus;
+    int i, rc;
+
+    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
+    if (tinfo == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_get_topologyinfo
failed\n");
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+
+    ninfo = libxl_get_numainfo(ctx, &nr_nodes);
+    if (ninfo == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_get_numainfo
failed\n");
+        rc = ERROR_NOMEM;
+        goto out_topologyinfo;
+    }
+
+    if (max_nodes == -1 || max_nodes > nr_nodes)
+        max_nodes = nr_nodes;
+
+    /* If a candidate is short in PCPUs, just add more nodes */
+    dom_nodes = candidate->nr_nodes;
+    while (1) {
+        rc = ERROR_FAIL;
+
+        if (candidate->nr_nodes > max_nodes)
+            break;
+
+        for (i = 0, nodes_cpus = 0; i < nr_cpus; i++) {
+            if (libxl_nodemap_test(&candidate->nodemap, tinfo[i].node))
+                nodes_cpus++;
+        }
+
+        /* Either we have enough PCPUs, and thus we''re done ... */
+        if (nodes_cpus >= min_cpus) {
+            rc = 0;
+            break;
+        }
+        /* ... Or we need more, and we want to try adding the closest node */
+        add_node_to_nodemap(ninfo, nr_nodes, &candidate->nodemap);
+        candidate->nr_nodes++;
+    }
+
+    libxl_numainfo_list_free(ninfo, nr_nodes);
+ out_topologyinfo:
+    libxl_cputopology_list_free(tinfo, nr_cpus);
+ out:
+    return rc;
+}
+
+void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int nr)
+{
+    int i;
+    for (i = 0; i < nr; i++)
+        libxl_numa_candidate_dispose(&list[i]);
+    free(list);
+}
+
 void libxl_numainfo_list_free(libxl_numainfo *list, int nr)
 {
     int i;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -490,6 +490,122 @@ static void split_string_into_string_lis
     free(s);
 }
 
+/* How many PCPUs are there on each node? */
+static int cpus_per_node(libxl_cputopology *tinfo, int nr_cpus)
+{
+    libxl_numainfo *ninfo;
+    int nr_nodes, j, i;
+    int cpus_nodes = 0;
+
+    ninfo = libxl_get_numainfo(ctx, &nr_nodes);
+    if (ninfo == NULL)
+        return 0;
+
+    /* This makes sense iff # of PCPUs is the same for all nodes */
+    for (j = 0; j < nr_nodes; j++) {
+        int curr_cpus = 0;
+
+        for (i = 0; i < nr_cpus; i++) {
+            if (tinfo[i].node == j)
+                curr_cpus++;
+        }
+        cpus_nodes = cpus_nodes == 0 ? curr_cpus : cpus_nodes;
+
+        /* Make sure the above holds, or turn the whole thing off! */
+        if (cpus_nodes != curr_cpus) {
+            cpus_nodes = 0;
+            break;
+        }
+    }
+
+    libxl_numainfo_dispose(ninfo);
+    return cpus_nodes;
+}
+
+/* Try to achieve "optimal" NUMA placement */
+static int place_domain(libxl_domain_build_info *b_info)
+{
+    int dom_min_nodes, dom_max_nodes;
+    int dom_needs_cpus, cpus_nodes;
+    libxl_cputopology *tinfo;
+    libxl_cpupoolinfo *pinfo;
+    int nr_cpus, nr_pools;
+    libxl_numa_candidate *candidates;
+    int candidate, nr_candidates;
+    int i, err;
+
+    /* First of all, if the domain has its cpu-affinity, just don''t
bother */
+    if (libxl_cpumap_is_full(&b_info->cpumap) != 0)
+        return 0;
+
+    /* If cpupools are in use, better not to mess with them */
+    pinfo = libxl_list_cpupool(ctx, &nr_pools);
+    if (!pinfo) {
+        fprintf(stderr, "error getting cpupool info\n");
+        err = ENOMEM;
+        goto out;
+    }
+    if (nr_pools > 1) {
+        fprintf(stderr, "skip numa placement as cpupools are being
used\n");
+        err = 0;
+        goto out_poolinfo;
+    }
+
+    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
+    if (tinfo == NULL) {
+        fprintf(stderr, "libxl_get_topologyinfo failed.\n");
+        err = ENOMEM;
+        goto out_poolinfo;
+    }
+
+    /* If possible, start with a sensible minimum number of nodes */
+    cpus_nodes = cpus_per_node(tinfo, nr_cpus);
+    dom_needs_cpus = b_info->max_vcpus;
+
+    if (cpus_nodes != 0)
+        dom_min_nodes = (dom_needs_cpus + cpus_nodes - 1) / cpus_nodes;
+    else
+        dom_min_nodes = 1;
+
+    /* Let''s fit the domain memory-wise (ENOSPC if we don''t
manage) */
+    candidates = libxl_domain_numa_candidates(ctx, b_info, dom_min_nodes,
+                                              &nr_candidates);
+    if (!candidates) {
+        err = ENOSPC;
+        goto out_topologyinfo;
+    }
+
+    /* Pick a candidate and ensure it gives us enough PCPUs */
+    dom_max_nodes = -1; err = ERROR_FAIL;
+    for (candidate = 0; err && candidate < nr_candidates;
candidate++) {
+        /* We try adding nodes up to all of them (dom_max_nodes = -1) */
+        err = libxl_numa_candidate_add_cpus(ctx, dom_needs_cpus, dom_max_nodes,
+                                            &candidates[candidate]);
+    }
+
+    if (err == ERROR_FAIL) {
+        /* Report back we didn''t find a candidate with enough cpus */
+        err = ESRCH;
+    } else if (!err) {
+        /* Things went fine. Commit the map into domain''s build info
*/
+        libxl_cpumap_set_none(&b_info->cpumap);
+        for (i = 0; i < nr_cpus; i++) {
+            if (libxl_nodemap_test(&candidates[candidate-1].nodemap,
+                                   tinfo[i].node))
+                libxl_cpumap_set(&b_info->cpumap, i);
+        }
+    }
+
+    libxl_numa_candidates_list_free(candidates, nr_candidates);
+out_topologyinfo:
+    libxl_cputopology_list_free(tinfo, nr_cpus);
+out_poolinfo:
+    for (i = 0; i < nr_pools; i++)
+        libxl_cpupoolinfo_dispose(&pinfo[i]);
+out:
+    return err;
+}
+
 static int vcpupin_parse(char *cpu, libxl_cpumap *cpumap)
 {
     libxl_cpumap exclude_cpumap;
@@ -1738,6 +1854,16 @@ start:
         goto error_out;
     }
 
+    ret = place_domain(&d_config.b_info);
+    if (ret == ESRCH || ret == ENOSPC) {
+        fprintf(stderr, "Failed to place the domain, fallback to all
nodes\n");
+        /* Resetting domain''s cpu-affinity is enough for that */
+        libxl_cpumap_set_any(&d_config.b_info.cpumap);
+    } else if (ret != 0) {
+        ret = ERROR_FAIL;
+        goto error_out;
+    }
+
     libxl_asyncprogress_how autoconnect_console_how_buf;
     if ( dom_info->console_autoconnect ) {
         autoconnect_console_how_buf.callback = autoconnect_console;
diff --git a/tools/python/xen/lowlevel/xl/xl.c
b/tools/python/xen/lowlevel/xl/xl.c
--- a/tools/python/xen/lowlevel/xl/xl.c
+++ b/tools/python/xen/lowlevel/xl/xl.c
@@ -320,6 +320,23 @@ PyObject *attrib__libxl_file_reference_g
     return genwrap__string_get(&pptr->path);
 }
 
+PyObject *attrib__libxl_nodemap_get(libxl_nodemap *pptr)
+{
+    PyObject *nodelist = NULL;
+    int i;
+
+    nodelist = PyList_New(0);
+    libxl_for_each_node(i, *pptr) {
+        if ( libxl_nodemap_test(pptr, i) ) {
+            PyObject* pyint = PyInt_FromLong(i);
+
+            PyList_Append(nodelist, pyint);
+            Py_DECREF(pyint);
+        }
+    }
+    return nodelist;
+}
+
 PyObject *attrib__libxl_hwcap_get(libxl_hwcap *pptr)
 {
     PyErr_SetString(PyExc_NotImplementedError, "Getting hwcap");

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 10 of 11] libxl, xl: heuristics for reordering NUMA placement candidates

Once we know which ones of all the possible combinations
represents valid placement candidates for a domain, use some
heuistics for deciding which one to pick (instead to just
taking the first one).

First of all, the smaller candidates are better both from the
domain''s point of view (fewer memory spreading among nodes) and
fom the system point of view (fewer memoy fragmentation). In
case of candidates of equal sizes, the one that has the greater
amount of memory by at least 10% wins, as this is (again) good
for keeping the fragmentation small. Finally, the number of
domains running on the nodes involved in the combinations is
checked, and the "least populated" candidate is the one that
is considered.

This makes the wholle automatic NUMA placement mechanism very
similar to what xm/xend does, although no memory considerations
are present there.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -633,6 +633,8 @@ libxl_numa_candidate *libxl_domain_numa_
 int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
                                   int min_cpus, int max_nodes,
                                   libxl_numa_candidate *candidate);
+int libxl_numa_candidate_count_domains(libxl_ctx *ctx,
+                                       libxl_numa_candidate *candidate);
 void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int nr);
 
 /*
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -441,6 +441,7 @@ libxl_cputopology = Struct("cputopology"
 
 libxl_numa_candidate = Struct("numa_candidate", [
     ("nr_nodes", integer),
+    ("nr_domains", integer),
     ("free_memkb", uint32),
     ("nodemap", libxl_nodemap),
     ], dir=DIR_OUT)
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -849,6 +849,70 @@ int libxl_numa_candidate_add_cpus(libxl_
     return rc;
 }
 
+int libxl_numa_candidate_count_domains(libxl_ctx *ctx,
+                                       libxl_numa_candidate *candidate)
+{
+    libxl_nodemap dom_nodemap;
+    libxl_cputopology *tinfo;
+    int nr_doms, nr_cpus, rc = 0;
+    libxl_dominfo *dinfo;
+    int i, j, k;
+
+    dinfo = libxl_list_domain(ctx, &nr_doms);
+    if (dinfo == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_list_domain
failed\n");
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+
+    if (libxl_nodemap_alloc(ctx, &dom_nodemap) < 0) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_nodemap_alloc
failed\n");
+        rc = ERROR_NOMEM;
+        goto out_dominfo;
+    }
+
+    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
+    if (tinfo == NULL) {
+        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_get_topologyinfo
failed\n");
+        rc = ERROR_NOMEM;
+        goto out_nodemap;
+    }
+
+    candidate->nr_domains = 0;
+    for (i = 0; i < nr_doms; i++) {
+        libxl_vcpuinfo *vinfo;
+        int nr_vcpus, nr_cpus;
+
+        vinfo = libxl_list_vcpu(ctx, dinfo[i].domid, &nr_vcpus,
&nr_cpus);
+        if (vinfo == NULL)
+            continue;
+
+        libxl_nodemap_set_none(&dom_nodemap);
+        for (j = 0; j < nr_vcpus; j++) {
+            libxl_for_each_set_cpu(k, vinfo[j].cpumap)
+                libxl_nodemap_set(&dom_nodemap, tinfo[k].node);
+        }
+
+        libxl_for_each_set_node(j, dom_nodemap) {
+            if (libxl_nodemap_test(&candidate->nodemap, j)) {
+                candidate->nr_domains++;
+                goto found;
+            }
+        }
+ found:
+        libxl_vcpuinfo_list_free(vinfo, nr_vcpus);
+    }
+
+
+    libxl_cputopology_list_free(tinfo, nr_cpus);
+ out_nodemap:
+    libxl_nodemap_dispose(&dom_nodemap);
+ out_dominfo:
+    libxl_dominfo_list_free(dinfo, nr_doms);
+ out:
+    return rc;
+}
+
 void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int nr)
 {
     int i;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -522,6 +522,34 @@ static int cpus_per_node(libxl_cputopolo
     return cpus_nodes;
 }
 
+/*
+ * The NUMA placement candidates are reordered according to the following
+ * heuristics:
+ *  - candidates involving fewer nodes come first. In case two (or
+ *    more) candidates span the same number of nodes,
+ *  - candidates with greater amount of free memory come first. In
+ *    case two (or more) candidates differ in their amount of free
+ *    memory by less than 10%,
+ *  - candidates with fewer domains insisting on them at the time of
+ *    this call come first.
+ */
+static int candidates_cmpf(const void *v1, const void *v2)
+{
+    const libxl_numa_candidate *c1 = (const libxl_numa_candidate*) v1;
+    const libxl_numa_candidate *c2 = (const libxl_numa_candidate*) v2;
+    double mem_diff = labs(c1->free_memkb - c2->free_memkb);
+    double mem_avg = (c1->free_memkb + c2->free_memkb) / 2.0;
+
+    if (c1->nr_nodes != c2->nr_nodes)
+        return c1->nr_nodes - c2->nr_nodes;
+
+    if ((mem_diff / mem_avg) * 100.0 < 10.0 &&
+        c1->nr_domains != c2->nr_domains)
+        return c1->nr_domains - c2->nr_domains;
+
+    return c2->free_memkb - c1->free_memkb;
+}
+
 /* Try to achieve "optimal" NUMA placement */
 static int place_domain(libxl_domain_build_info *b_info)
 {
@@ -575,6 +603,18 @@ static int place_domain(libxl_domain_bui
         goto out_topologyinfo;
     }
 
+    /* Account for the number of domains insisting on a candidate placement */
+    for (i = 0; i < nr_candidates; i++) {
+        if (libxl_numa_candidate_count_domains(ctx, &candidates[i])) {
+            fprintf(stderr, "libxl_numa_candidate_count_domains
failed\n");
+            err = ENOMEM;
+            goto out_cndtslist;
+        }
+    }
+
+    /* Reorder candidates (see @candidates_cmpf for the heuristics) */
+    qsort(candidates, nr_candidates, sizeof(candidates[0]), candidates_cmpf);
+
     /* Pick a candidate and ensure it gives us enough PCPUs */
     dom_max_nodes = -1; err = ERROR_FAIL;
     for (candidate = 0; err && candidate < nr_candidates;
candidate++) {
@@ -596,6 +636,7 @@ static int place_domain(libxl_domain_bui
         }
     }
 
+out_cndtslist:
     libxl_numa_candidates_list_free(candidates, nr_candidates);
 out_topologyinfo:
     libxl_cputopology_list_free(tinfo, nr_cpus);

Dario Faggioli

2012-May-31 12:11 UTC

head link

[PATCH 11 of 11] Some automatic NUMA placement documentation

About rationale, usage and API.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

diff --git a/docs/misc/xl-numa-placement.markdonw
b/docs/misc/xl-numa-placement.markdonw
new file mode 100644
--- /dev/null
+++ b/docs/misc/xl-numa-placement.markdonw
@@ -0,0 +1,113 @@
+# Guest Automatic NUMA Placement in libxl and xl #
+
+## Rationale ##
+
+The Xen hypervisor deals with Non-Uniform Memory Access (NUMA])
+machines by assigning to its domain a "node affinity", i.e., a set of
NUMA
+nodes of the host from which it gets its memory allocated.
+
+NUMA awareness becomes very important as soon as many domains start running
+memory-intensive workloads on a shared host. In fact, the cost of accessing
+non node-local memory locations is very high, and the performance degradation
+is likely to be noticeable.
+
+## Guest Placement in xl ##
+
+If using xl for creating and managing guests, it is very easy to ask
+for both manual or automatic placement of them across the host''s NUMA
+nodes.
+
+Note that xm/xend does the very same thing, the only differences residing
+in the details of the heuristics adopted for the placement (see below).
+
+### Manual Guest Placement with xl ###
+
+Thanks to the "cpus=" option, it is possible to specify where a
domain
+should be created and scheduled on, directly in its config file. This
+affects NUMA placement and memory accesses as the hypervisor constructs
+the node affinity of a VM basing right on its CPU affinity when it is
+created.
+
+This is very simple and effective, but requires the user/system
+administrator to explicitly specify affinities for each and every domain,
+or Xen won''t be able to enable guarantee the locality for their memory
+accesses.
+
+### Automatic Guest Placement with xl ###
+
+In case no "cpus=" option is specified in the config file, xl tries
+to figure out on its own on which node(s) the domain could fit best.
+
+First of all, it needs to find a node (or a set of nodes) that have
+enough free memory for accommodating the domain. After that, the actual
+decision on where to put the new guest happens by generating all the
+possible combinations of nodes that satisfies the above and chose among
+them according to the following heuristics:
+
+  *  candidates involving fewer nodes come first. In case two (or more)
+     candidates span the same number of nodes,
+  *  candidates with greater amount of free memory come first. In case
+     two (or more) candidates differ in their amount of free memory by
+     less than 10%,
+  *  candidates with fewer domains already placed on them come first.
+
+Giving preference to small candidates ensures better performance for
+the guest, as it avoid spreading its memory among different nodes.
+Using the nodes that have the biggest amounts of free memory helps
+keeping the memory fragmentation small, from a system wide perspective.
+Finally, in case more candidates fulfil these criteria by the same
+extent, choosing the candidate that is hosting fewer domain helps
+balancing the load on the various nodes.
+
+The last step is figuring out whether the selected candidate contains
+at least as much CPUs as the number of VCPUs of the VM. The current
+solution for the case when this is not verified is just to add some
+more nodes, until the condition turns into being true. When doing
+this, the nodes with the least possible distance from the ones
+already in the nodemap are considered.
+
+## Guest Placement in libxl ##
+
+xl achieves automatic NUMA placement by means of the following API
+calls, provided by libxl.
+
+        libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
+                                            libxl_domain_build_info *b_info,
+                                            int min_nodes, int *nr_cndts);
+
+This is what should be used to generate the full set of placement
+candidates. In fact, the function returns an array of containing nr_cndts
+libxl_numa_candidate (see below). Each candidate is basically a set of nodes
+that has been checked against the memory requirement derived from the
+provided libxl_domain_build_info.
+
+        int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
+                                          int min_cpus, int max_nodes,
+                                          libxl_numa_candidate *candidate);
+
+This is what should be used to ensure a placement candidate has at least
+min_cpus CPUs. In case it does not, the function also take care of
+adding more nodes to the candidate itself (up to when the value specified
+in max_nodes is reached). When adding new nodes, the one that has the
+smallest "distance" from the current node map is selected at each
step.
+
+        libxl_numa_candidate_count_domains(libxl_ctx *ctx,
+                                           libxl_numa_candidate *candidate);
+
+This is what counts the number of domains that are currently pinned
+to the CPUs of the nodes of a given candidate.
+
+Finally, a placement candidate is represented by the following data
+structure:
+
+        typedef struct libxl_numa_candidate {
+            int nr_nodes;
+            int nr_domains;
+            uint32_t free_memkb;
+            libxl_nodemap nodemap;
+        } libxl_numa_candidate;
+
+It basically tells what are the nodes the candidate spans (in the nodemap),
+how many of them there are (nr_nodes), how much free memory they can
+provide all together (free_memkb) and how many domains are running pinned
+to their CPUs (nr_domains).

Ian Jackson

2012-May-31 14:10 UTC

head link

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Dario Faggioli writes ("[PATCH 01 of 11] libxc: abstract xenctl_cpumap to
just xenctl_map"):> More specifically:
>  1. replaces xenctl_cpumap with xenctl_map
>  2. provides bitmap_to_xenctl_map and the reverse;
>  3. re-implement cpumask_to_xenctl_map with bitmap_to_xenctl_map
>     and the reverse;
> 
> Other than #3, no functional changes. Interface only slightly
> afected.
This looks plausible to me but it needs a review from a hypervisor
maintainer too.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian Jackson

2012-May-31 14:11 UTC

head link

Re: [PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

Dario Faggioli writes ("[PATCH 02 of 11] libxl: abstract libxl_cpumap to
just libxl_map"):> More specifically:
>  1. introduces struct libxl_map;
>  2. re-implement libxl_cpumap_* on top of struct libxl_map_*;
> 
> No functional nor interface changes at all.
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian Jackson

2012-May-31 14:12 UTC

head link

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

Dario Faggioli writes ("[PATCH 03 of 11] libxc, libxl: introduce
xc_nodemap_t and libxl_nodemap"):> As NUMA node-related counterparts of xc_cpumap_t and libxl_cpumap.
Wouldn''t it be possible to just use `libxl_map'' type directly,
with
the caller knowing whether it contains node or cpu numbers ?

As it is this has a whole lot of duplicated wrapper code.

Ian.

Ian Jackson

2012-May-31 14:13 UTC

head link

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Dario Faggioli writes ("[PATCH 04 of 11] libxl: expand the
libxl_{cpu,node}map API a bit"):> +void libxl_cpumap_copy(/*XXX libxl_ctx *ctx,*/ libxl_cpumap *dst,
XXX ?

Ian.

Ian Jackson

2012-May-31 14:22 UTC

head link

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Dario Faggioli writes ("[PATCH 06 of 11] libxl: introduce
libxl_get_numainfo()"):> Make some NUMA node information available to the toolstack. Achieve
> this by means of xc_numainfo(), which exposes memory size and amount
> of free memory of each node, as well as the relative distances of
> each node to all the others.
...> +#define LIBXL_NUMAINFO_INVALID_ENTRY (~(uint32_t)0)
Is there some reason we can''t just make sure we use the same value for
this in both places ?  That would avoid the need for this:
> +#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
> +    LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
I appreciate that the types aren''t the same.  In libxc it''s an
unsigned long.  But shouldn''t they be the same ?
> +libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
> +{
> +    xc_numainfo_t ninfo;
> +    DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
> +    DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
> +    DECLARE_HYPERCALL_BUFFER(uint32_t, node_distances);
> +    libxl_numainfo *ret = NULL;
> +    int i, j, max_nodes;
> +
> +    max_nodes = libxl_get_max_nodes(ctx);
> +    if (max_nodes == 0)
> +    {
> +        LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of
NODES");
> +        return NULL;
> +    }
> +
> +    memsize = xc_hypercall_buffer_alloc
> +        (ctx->xch, memsize, sizeof(*memsize) * max_nodes);
> +    memfree = xc_hypercall_buffer_alloc
> +        (ctx->xch, memfree, sizeof(*memfree) * max_nodes);
> +    node_distances = xc_hypercall_buffer_alloc
> +        (ctx->xch, node_distances, sizeof(*node_distances) * max_nodes
* max_nodes);
This kind of stuff normally lives in libxc.  I appreciate that we have
a bit of it in libxl already, but do we want to perpetuate that ?
> +    ret = malloc(sizeof(libxl_numainfo) * max_nodes);
> +    if (ret == NULL) {
In libxl you can use libxl__zalloc(NULL,...) (and don''t have to check
for errors because it can''t fail).  But perhaps this is going into
libxc ?

I''d like to hear what other people say about putting this in libxl
vs. libxc.
> +        ret[i].dists = malloc(sizeof(*node_distances) * max_nodes);
Again.  And the lack of error exits will simplify this a lot.

Ian.

Ian Jackson

2012-May-31 14:23 UTC

head link

Re: [PATCH 07 of 11] xen: enhance dump_numa output

Dario Faggioli writes ("[PATCH 07 of 11] xen: enhance dump_numa
output"):> By showing the number of free pages on each node.
...> -		printk("idx%d -> NODE%d start->%lu size->%lu\n",
> +		printk("idx%d -> NODE%d start->%lu size->%lu
free->%lu\n",
>  			  i, NODE_DATA(i)->node_id,
>  			  NODE_DATA(i)->node_start_pfn,
> -			  NODE_DATA(i)->node_spanned_pages);
> +			  NODE_DATA(i)->node_spanned_pages,
> +                          avail_node_heap_pages(i));
This seems to have an indentation error.

Otherwise,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


Ian.

Ian Jackson

2012-May-31 14:24 UTC

head link

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

Dario Faggioli writes ("[PATCH 08 of 11] xl: add more NUMA information to
`xl info -n''"):> So that the user knows how much memory there is on each node and
> how far they are from each others.
Perhaps the json output machinery can do this for us ?  If that''s
sufficiently legible then it would be an improvement on this
open-coded printer.

Also people might try to parse the output from open-coded printer
which would be annoying when we try to extend things.

Ian.

Dario Faggioli

2012-May-31 14:30 UTC

head link

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

On Thu, 2012-05-31 at 15:13 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 04 of 11] libxl: expand the
libxl_{cpu,node}map API a bit"):
> > +void libxl_cpumap_copy(/*XXX libxl_ctx *ctx,*/ libxl_cpumap *dst,
> 
> XXX ?
> Yep, I left it on purpose but I also wanted to mention that in the
changelog, which I apparently missed. Sorry.

However, now that we are here, the whole point is, does that function
need a libxl_ctx argument? It doesn''t use it, but I was concerned about
consistency with the other libxl_cpumap_* calls, and maybe future
needs...

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-May-31 14:32 UTC

head link

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

On Thu, 2012-05-31 at 15:12 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 03 of 11] libxc, libxl: introduce
xc_nodemap_t and libxl_nodemap"):
> > As NUMA node-related counterparts of xc_cpumap_t and libxl_cpumap.
> 
> Wouldn''t it be possible to just use `libxl_map'' type
directly, with
> the caller knowing whether it contains node or cpu numbers ?
> Definitely possible.
> As it is this has a whole lot of duplicated wrapper code.
> I agree, and I will do that.

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-May-31 14:35 UTC

head link

Re: [PATCH 07 of 11] xen: enhance dump_numa output

On Thu, 2012-05-31 at 15:23 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 07 of 11] xen: enhance dump_numa
output"):
> > By showing the number of free pages on each node.
> ...
> > -		printk("idx%d -> NODE%d start->%lu size->%lu\n",
> > +		printk("idx%d -> NODE%d start->%lu size->%lu
free->%lu\n",
> >  			  i, NODE_DATA(i)->node_id,
> >  			  NODE_DATA(i)->node_start_pfn,
> > -			  NODE_DATA(i)->node_spanned_pages);
> > +			  NODE_DATA(i)->node_spanned_pages,
> > +                          avail_node_heap_pages(i));
> 
> This seems to have an indentation error.
> Yep, that file has tabs... :-/

Will fix it. Thanks,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-May-31 14:40 UTC

head link

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

On Thu, 2012-05-31 at 15:24 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 08 of 11] xl: add more NUMA information
to `xl info -n''"):
> > So that the user knows how much memory there is on each node and
> > how far they are from each others.
> 
> Perhaps the json output machinery can do this for us ?  If that''s
> sufficiently legible then it would be an improvement on this
> open-coded printer.
> 
> Also people might try to parse the output from open-coded printer
> which would be annoying when we try to extend things.
> Not sure I got 100% of what you mean. I put something together on the
line of what ''void output_topologyinfo(void)'' does, but I can
do it some
other way as soon as I understand how you''re suggesting to do it :-P

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

George Dunlap

2012-May-31 14:47 UTC

head link

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

On 31/05/12 15:10, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 01 of 11] libxc: abstract xenctl_cpumap
to just xenctl_map"):
>> More specifically:
>>   1. replaces xenctl_cpumap with xenctl_map
>>   2. provides bitmap_to_xenctl_map and the reverse;
>>   3. re-implement cpumask_to_xenctl_map with bitmap_to_xenctl_map
>>      and the reverse;
>>
>> Other than #3, no functional changes. Interface only slightly
>> afected.
> This looks plausible to me but it needs a review from a hypervisor
> maintainer too.What exactly are you looking for from a hypervisor maintianer?

I was going to say it looks plausible to me but needs a review from a 
tools maintianer. :-)

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>

George Dunlap

2012-May-31 14:54 UTC

head link

Re: [PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

On 31/05/12 13:11, Dario Faggioli wrote:> More specifically:
>   1. introduces struct libxl_map;
>   2. re-implement libxl_cpumap_* on top of struct libxl_map_*;
>
> No functional nor interface changes at all.
>
> This is in preparation of the introduction of NUMA nodes maps.
>
> Signed-off-by: Dario Faggioli<dario.faggioli@citrix.eu.com>Acked-by: George Dunlap
<george.dunlap@eu.citrix.com>>
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -282,11 +282,17 @@ typedef uint32_t libxl_hwcap[8];
>
>   typedef uint64_t libxl_ev_user;
>
> -typedef struct {
> +struct libxl_map {
>       uint32_t size;          /* number of bytes in map */
>       uint8_t *map;
> -} libxl_cpumap;
> -void libxl_cpumap_dispose(libxl_cpumap *map);
> +};
> +void libxl_map_dispose(struct libxl_map *map);
> +
> +typedef struct libxl_map libxl_cpumap;
> +static inline void libxl_cpumap_dispose(libxl_cpumap *cpumap)
> +{
> +    return libxl_map_dispose(cpumap);
> +}
>
>   typedef struct {
>       /*
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -488,47 +488,53 @@ int libxl_mac_to_device_nic(libxl_ctx *c
>       return rc;
>   }
>
> +void libxl_map_dispose(struct libxl_map *map)
> +{
> +    free(map->map);
> +}
> +
> +static int libxl_map_alloc(libxl_ctx *ctx, struct libxl_map *map, int
n_elems)
> +{
> +    int sz;
> +
> +    sz = (n_elems + 7) / 8;
> +    map->map = calloc(sz, sizeof(*map->map));
> +    if (!map->map)
> +        return ERROR_NOMEM;
> +    map->size = sz;
> +    return 0;
> +}
> +
> +int libxl_map_test(struct libxl_map *map, int elem)
> +{
> +    if (elem>= map->size * 8)
> +        return 0;
> +    return (map->map[elem / 8]&  (1<<  (elem&  7))) ? 1 :
0;
> +}
> +
> +void libxl_map_set(struct libxl_map *map, int elem)
> +{
> +    if (elem>= map->size * 8)
> +        return;
> +    map->map[elem / 8] |= 1<<  (elem&  7);
> +}
> +
> +void libxl_map_reset(struct libxl_map *map, int elem)
> +{
> +    if (elem>= map->size * 8)
> +        return;
> +    map->map[elem / 8]&= ~(1<<  (elem&  7));
> +}
> +
>   int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap)
>   {
>       int max_cpus;
> -    int sz;
>
>       max_cpus = libxl_get_max_cpus(ctx);
>       if (max_cpus == 0)
>           return ERROR_FAIL;
>
> -    sz = (max_cpus + 7) / 8;
> -    cpumap->map = calloc(sz, sizeof(*cpumap->map));
> -    if (!cpumap->map)
> -        return ERROR_NOMEM;
> -    cpumap->size = sz;
> -    return 0;
> -}
> -
> -void libxl_cpumap_dispose(libxl_cpumap *map)
> -{
> -    free(map->map);
> -}
> -
> -int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
> -{
> -    if (cpu>= cpumap->size * 8)
> -        return 0;
> -    return (cpumap->map[cpu / 8]&  (1<<  (cpu&  7))) ? 1
: 0;
> -}
> -
> -void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu)
> -{
> -    if (cpu>= cpumap->size * 8)
> -        return;
> -    cpumap->map[cpu / 8] |= 1<<  (cpu&  7);
> -}
> -
> -void libxl_cpumap_reset(libxl_cpumap *cpumap, int cpu)
> -{
> -    if (cpu>= cpumap->size * 8)
> -        return;
> -    cpumap->map[cpu / 8]&= ~(1<<  (cpu&  7));
> +    return libxl_map_alloc(ctx, cpumap, max_cpus);
>   }
>
>   int libxl_get_max_cpus(libxl_ctx *ctx)
> diff --git a/tools/libxl/libxl_utils.h b/tools/libxl/libxl_utils.h
> --- a/tools/libxl/libxl_utils.h
> +++ b/tools/libxl/libxl_utils.h
> @@ -63,21 +63,46 @@ int libxl_devid_to_device_nic(libxl_ctx
>   int libxl_vdev_to_device_disk(libxl_ctx *ctx, uint32_t domid, const char
*vdev,
>                                  libxl_device_disk *disk);
>
> +int libxl_map_test(struct libxl_map *map, int elem);
> +void libxl_map_set(struct libxl_map *map, int elem);
> +void libxl_map_reset(struct libxl_map *map, int elem);
> +static inline void libxl_map_set_any(struct libxl_map *map)
> +{
> +    memset(map->map, -1, map->size);
> +}
> +static inline void libxl_map_set_none(struct libxl_map *map)
> +{
> +    memset(map->map, 0, map->size);
> +}
> +static inline int libxl_map_elem_valid(struct libxl_map *map, int elem)
> +{
> +    return elem>= 0&&  elem<  (map->size * 8);
> +}
> +
>   int libxl_cpumap_alloc(libxl_ctx *ctx, libxl_cpumap *cpumap);
> -int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu);
> -void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu);
> -void libxl_cpumap_reset(libxl_cpumap *cpumap, int cpu);
> +static inline int libxl_cpumap_test(libxl_cpumap *cpumap, int cpu)
> +{
> +    return libxl_map_test(cpumap, cpu);
> +}
> +static inline void libxl_cpumap_set(libxl_cpumap *cpumap, int cpu)
> +{
> +    libxl_map_set(cpumap, cpu);
>   {
> -    memset(cpumap->map, -1, cpumap->size);
> +    libxl_map_set_any(cpumap);
>   }
>   static inline void libxl_cpumap_set_none(libxl_cpumap *cpumap)
>   {
> -    memset(cpumap->map, 0, cpumap->size);
> +    libxl_map_set_none(cpumap);
>   }
>   static inline int libxl_cpumap_cpu_valid(libxl_cpumap *cpumap, int cpu)
>   {
> -    return cpu>= 0&&  cpu<  (cpumap->size * 8);
> +    return libxl_map_elem_valid(cpumap, cpu);
>   }
>   #define libxl_for_each_cpu(var, map) for (var = 0; var<  (map).size *
8; var++)
>   #define libxl_for_each_set_cpu(v, m) for (v = 0; v<  (m).size * 8;
v++) \

George Dunlap

2012-May-31 14:55 UTC

head link

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

On Thu, May 31, 2012 at 3:47 PM, George Dunlap
<george.dunlap@eu.citrix.com> wrote:>> This looks plausible to me but it needs a review from a hypervisor
>> maintainer too.
>
> What exactly are you looking for from a hypervisor maintianer?
>
> I was going to say it looks plausible to me but needs a review from a tools
> maintianer. :-)
>
> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Er, sorry -- seems I counted something wrong and reviewed the wrong
patch.  This ack shoudl go to patch #2.  Let me come back to this one.

 -George

Dario Faggioli

2012-May-31 14:57 UTC

head link

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

On Thu, 2012-05-31 at 15:22 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 06 of 11] libxl: introduce
libxl_get_numainfo()"):
> > Make some NUMA node information available to the toolstack. Achieve
> > this by means of xc_numainfo(), which exposes memory size and amount
> > of free memory of each node, as well as the relative distances of
> > each node to all the others.
> ...
> > +#define LIBXL_NUMAINFO_INVALID_ENTRY (~(uint32_t)0)
> 
> Is there some reason we can''t just make sure we use the same value
for
> this in both places ?  That would avoid the need for this:
> Sorry, with "both places" being? Maybe you''re talking about
reusing
LIBXL_CPUTOPOLOGY_INVALID_ENTRY here as well? If yes, I think it is
possible and I also thought doing it like that... Or was it something
different you were saying?
> > +#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
> > +    LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
> 
> I appreciate that the types aren''t the same.  In libxc
it''s an
> unsigned long.  But shouldn''t they be the same ?
> I again am not sure about what types you are talking about here I''m
afraid... :-(
> > +libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
> > +{
> > +    xc_numainfo_t ninfo;
> > +    DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
> > +    DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
> > +    DECLARE_HYPERCALL_BUFFER(uint32_t, node_distances);
> > +    libxl_numainfo *ret = NULL;
> > +    int i, j, max_nodes;
> > +
> > +    max_nodes = libxl_get_max_nodes(ctx);
> > +    if (max_nodes == 0)
> > +    {
> > +        LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number
of NODES");
> > +        return NULL;
> > +    }
> > +
> > +    memsize = xc_hypercall_buffer_alloc
> > +        (ctx->xch, memsize, sizeof(*memsize) * max_nodes);
> > +    memfree = xc_hypercall_buffer_alloc
> > +        (ctx->xch, memfree, sizeof(*memfree) * max_nodes);
> > +    node_distances = xc_hypercall_buffer_alloc
> > +        (ctx->xch, node_distances, sizeof(*node_distances) *
max_nodes * max_nodes);
> 
> This kind of stuff normally lives in libxc.  I appreciate that we have
> a bit of it in libxl already, but do we want to perpetuate that ?
> Yes, I did this like it is mainly because it is exactly what
libxl_get_cpu_topology() does and thus I thought it to be the way to
go. :-)
> > +    ret = malloc(sizeof(libxl_numainfo) * max_nodes);
> > +    if (ret == NULL) {
> 
> In libxl you can use libxl__zalloc(NULL,...) (and don''t have to
check
> for errors because it can''t fail).  But perhaps this is going into
> libxc ?
> About libxl__zalloc(), noted. Thanks.

About moving this all, I can of couse look into doing that, if wanted.
> I''d like to hear what other people say about putting this in libxl
> vs. libxc.
>Sure, just let me know what people prefer...

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

George Dunlap

2012-May-31 15:01 UTC

head link

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

On Thu, May 31, 2012 at 1:11 PM, Dario Faggioli <raistlin@linux.it>
wrote:> More specifically:
>  1. replaces xenctl_cpumap with xenctl_map
>  2. provides bitmap_to_xenctl_map and the reverse;
>  3. re-implement cpumask_to_xenctl_map with bitmap_to_xenctl_map
>    and the reverse;
>
> Other than #3, no functional changes. Interface only slightly
> afected.
>
> This is in preparation of introducing NUMA nodes maps.
Dario,

What changes are there in this since the last time you posted this?
Anything other than updating the patch description?

 -George
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.eu.com>
>
> diff --git a/tools/libxc/xc_cpupool.c b/tools/libxc/xc_cpupool.c
> --- a/tools/libxc/xc_cpupool.c
> +++ b/tools/libxc/xc_cpupool.c
> @@ -90,7 +90,7 @@ xc_cpupoolinfo_t *xc_cpupool_getinfo(xc_
>     sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_INFO;
>     sysctl.u.cpupool_op.cpupool_id = poolid;
>     set_xen_guest_handle(sysctl.u.cpupool_op.cpumap.bitmap, local);
> -    sysctl.u.cpupool_op.cpumap.nr_cpus = local_size * 8;
> +    sysctl.u.cpupool_op.cpumap.nr_elems = local_size * 8;
>
>     err = do_sysctl_save(xch, &sysctl);
>
> @@ -184,7 +184,7 @@ xc_cpumap_t xc_cpupool_freeinfo(xc_inter
>     sysctl.cmd = XEN_SYSCTL_cpupool_op;
>     sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_FREEINFO;
>     set_xen_guest_handle(sysctl.u.cpupool_op.cpumap.bitmap, local);
> -    sysctl.u.cpupool_op.cpumap.nr_cpus = mapsize * 8;
> +    sysctl.u.cpupool_op.cpumap.nr_elems = mapsize * 8;
>
>     err = do_sysctl_save(xch, &sysctl);
>
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -142,7 +142,7 @@ int xc_vcpu_setaffinity(xc_interface *xc
>
>     set_xen_guest_handle(domctl.u.vcpuaffinity.cpumap.bitmap, local);
>
> -    domctl.u.vcpuaffinity.cpumap.nr_cpus = cpusize * 8;
> +    domctl.u.vcpuaffinity.cpumap.nr_elems = cpusize * 8;
>
>     ret = do_domctl(xch, &domctl);
>
> @@ -182,7 +182,7 @@ int xc_vcpu_getaffinity(xc_interface *xc
>     domctl.u.vcpuaffinity.vcpu = vcpu;
>
>     set_xen_guest_handle(domctl.u.vcpuaffinity.cpumap.bitmap, local);
> -    domctl.u.vcpuaffinity.cpumap.nr_cpus = cpusize * 8;
> +    domctl.u.vcpuaffinity.cpumap.nr_elems = cpusize * 8;
>
>     ret = do_domctl(xch, &domctl);
>
> diff --git a/tools/libxc/xc_tbuf.c b/tools/libxc/xc_tbuf.c
> --- a/tools/libxc/xc_tbuf.c
> +++ b/tools/libxc/xc_tbuf.c
> @@ -134,7 +134,7 @@ int xc_tbuf_set_cpu_mask(xc_interface *x
>     bitmap_64_to_byte(bytemap, &mask64, sizeof (mask64) * 8);
>
>     set_xen_guest_handle(sysctl.u.tbuf_op.cpu_mask.bitmap, bytemap);
> -    sysctl.u.tbuf_op.cpu_mask.nr_cpus = sizeof(bytemap) * 8;
> +    sysctl.u.tbuf_op.cpu_mask.nr_elems = sizeof(bytemap) * 8;
>
>     ret = do_sysctl(xch, &sysctl);
>
> diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
> --- a/xen/arch/x86/cpu/mcheck/mce.c
> +++ b/xen/arch/x86/cpu/mcheck/mce.c
> @@ -1545,8 +1545,7 @@ long do_mca(XEN_GUEST_HANDLE(xen_mc_t) u
>             cpumap = &cpu_online_map;
>         else
>         {
> -            ret = xenctl_cpumap_to_cpumask(&cmv,
> -                                          
&op->u.mc_inject_v2.cpumap);
> +            ret = xenctl_map_to_cpumask(&cmv,
&op->u.mc_inject_v2.cpumap);
>             if ( ret )
>                 break;
>             cpumap = cmv;
> diff --git a/xen/arch/x86/platform_hypercall.c
b/xen/arch/x86/platform_hypercall.c
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -365,7 +365,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
>     {
>         uint32_t cpu;
>         uint64_t idletime, now = NOW();
> -        struct xenctl_cpumap ctlmap;
> +        struct xenctl_map ctlmap;
>         cpumask_var_t cpumap;
>         XEN_GUEST_HANDLE(uint8) cpumap_bitmap;
>         XEN_GUEST_HANDLE(uint64) idletimes;
> @@ -378,11 +378,11 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
>         if ( cpufreq_controller != FREQCTL_dom0_kernel )
>             break;
>
> -        ctlmap.nr_cpus  = op->u.getidletime.cpumap_nr_cpus;
> +        ctlmap.nr_elems  = op->u.getidletime.cpumap_nr_cpus;
>         guest_from_compat_handle(cpumap_bitmap,
>                                  op->u.getidletime.cpumap_bitmap);
>         ctlmap.bitmap.p = cpumap_bitmap.p; /* handle -> handle_64
conversion */
> -        if ( (ret = xenctl_cpumap_to_cpumask(&cpumap, &ctlmap)) !=
0 )
> +        if ( (ret = xenctl_map_to_cpumask(&cpumap, &ctlmap)) != 0
)
>             goto out;
>         guest_from_compat_handle(idletimes, op->u.getidletime.idletime);
>
> @@ -401,7 +401,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE(xe
>
>         op->u.getidletime.now = now;
>         if ( ret == 0 )
> -            ret = cpumask_to_xenctl_cpumap(&ctlmap, cpumap);
> +            ret = cpumask_to_xenctl_map(&ctlmap, cpumap);
>         free_cpumask_var(cpumap);
>
>         if ( ret == 0 && copy_to_guest(u_xenpf_op, op, 1) )
> diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
> --- a/xen/common/cpupool.c
> +++ b/xen/common/cpupool.c
> @@ -489,7 +489,7 @@ int cpupool_do_sysctl(struct xen_sysctl_
>         op->cpupool_id = c->cpupool_id;
>         op->sched_id = c->sched->sched_id;
>         op->n_dom = c->n_dom;
> -        ret = cpumask_to_xenctl_cpumap(&op->cpumap,
c->cpu_valid);
> +        ret = cpumask_to_xenctl_map(&op->cpumap, c->cpu_valid);
>         cpupool_put(c);
>     }
>     break;
> @@ -584,7 +584,7 @@ int cpupool_do_sysctl(struct xen_sysctl_
>
>     case XEN_SYSCTL_CPUPOOL_OP_FREEINFO:
>     {
> -        ret = cpumask_to_xenctl_cpumap(
> +        ret = cpumask_to_xenctl_map(
>             &op->cpumap, &cpupool_free_cpus);
>     }
>     break;
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -31,28 +31,29 @@
>  static DEFINE_SPINLOCK(domctl_lock);
>  DEFINE_SPINLOCK(vcpu_alloc_lock);
>
> -int cpumask_to_xenctl_cpumap(
> -    struct xenctl_cpumap *xenctl_cpumap, const cpumask_t *cpumask)
> +int bitmap_to_xenctl_map(struct xenctl_map *xenctl_map,
> +                         const unsigned long *bitmap,
> +                         unsigned int nbits)
>  {
>     unsigned int guest_bytes, copy_bytes, i;
>     uint8_t zero = 0;
>     int err = 0;
> -    uint8_t *bytemap = xmalloc_array(uint8_t, (nr_cpu_ids + 7) / 8);
> +    uint8_t *bytemap = xmalloc_array(uint8_t, (nbits + 7) / 8);
>
>     if ( !bytemap )
>         return -ENOMEM;
>
> -    guest_bytes = (xenctl_cpumap->nr_cpus + 7) / 8;
> -    copy_bytes  = min_t(unsigned int, guest_bytes, (nr_cpu_ids + 7) / 8);
> +    guest_bytes = (xenctl_map->nr_elems + 7) / 8;
> +    copy_bytes  = min_t(unsigned int, guest_bytes, (nbits + 7) / 8);
>
> -    bitmap_long_to_byte(bytemap, cpumask_bits(cpumask), nr_cpu_ids);
> +    bitmap_long_to_byte(bytemap, bitmap, nbits);
>
>     if ( copy_bytes != 0 )
> -        if ( copy_to_guest(xenctl_cpumap->bitmap, bytemap, copy_bytes)
)
> +        if ( copy_to_guest(xenctl_map->bitmap, bytemap, copy_bytes) )
>             err = -EFAULT;
>
>     for ( i = copy_bytes; !err && i < guest_bytes; i++ )
> -        if ( copy_to_guest_offset(xenctl_cpumap->bitmap, i, &zero,
1) )
> +        if ( copy_to_guest_offset(xenctl_map->bitmap, i, &zero, 1)
)
>             err = -EFAULT;
>
>     xfree(bytemap);
> @@ -60,36 +61,58 @@ int cpumask_to_xenctl_cpumap(
>     return err;
>  }
>
> -int xenctl_cpumap_to_cpumask(
> -    cpumask_var_t *cpumask, const struct xenctl_cpumap *xenctl_cpumap)
> +int xenctl_map_to_bitmap(unsigned long *bitmap,
> +                         const struct xenctl_map *xenctl_map,
> +                         unsigned int nbits)
>  {
>     unsigned int guest_bytes, copy_bytes;
>     int err = 0;
> -    uint8_t *bytemap = xzalloc_array(uint8_t, (nr_cpu_ids + 7) / 8);
> +    uint8_t *bytemap = xzalloc_array(uint8_t, (nbits + 7) / 8);
>
>     if ( !bytemap )
>         return -ENOMEM;
>
> -    guest_bytes = (xenctl_cpumap->nr_cpus + 7) / 8;
> -    copy_bytes  = min_t(unsigned int, guest_bytes, (nr_cpu_ids + 7) / 8);
> +    guest_bytes = (xenctl_map->nr_elems + 7) / 8;
> +    copy_bytes  = min_t(unsigned int, guest_bytes, (nbits + 7) / 8);
>
>     if ( copy_bytes != 0 )
>     {
> -        if ( copy_from_guest(bytemap, xenctl_cpumap->bitmap,
copy_bytes) )
> +        if ( copy_from_guest(bytemap, xenctl_map->bitmap, copy_bytes) )
>             err = -EFAULT;
> -        if ( (xenctl_cpumap->nr_cpus & 7) && (guest_bytes
<= sizeof(bytemap)) )
> -            bytemap[guest_bytes-1] &= ~(0xff <<
(xenctl_cpumap->nr_cpus & 7));
> +        if ( (xenctl_map->nr_elems & 7) && (guest_bytes
<= sizeof(bytemap)) )
> +            bytemap[guest_bytes-1] &= ~(0xff <<
(xenctl_map->nr_elems & 7));
>     }
>
> -    if ( err )
> -        /* nothing */;
> -    else if ( alloc_cpumask_var(cpumask) )
> -        bitmap_byte_to_long(cpumask_bits(*cpumask), bytemap, nr_cpu_ids);
> +    if ( !err )
> +        bitmap_byte_to_long(bitmap, bytemap, nbits);
> +
> +    xfree(bytemap);
> +
> +    return err;
> +}
> +
> +int cpumask_to_xenctl_map(struct xenctl_map *xenctl_cpumap,
> +                          const cpumask_t *cpumask)
> +{
> +    return bitmap_to_xenctl_map(xenctl_cpumap, cpumask_bits(cpumask),
> +                                nr_cpu_ids);
> +}
> +
> +int xenctl_map_to_cpumask(cpumask_var_t *cpumask,
> +                          const struct xenctl_map *xenctl_cpumap)
> +{
> +    int err = 0;
> +
> +    if ( alloc_cpumask_var(cpumask) ) {
> +        err = xenctl_map_to_bitmap(cpumask_bits(*cpumask), xenctl_cpumap,
> +                                   nr_cpu_ids);
> +        /* In case of error, cleanup is up to us, as the caller
won''t care! */
> +        if ( err )
> +            free_cpumask_var(*cpumask);
> +    }
>     else
>         err = -ENOMEM;
>
> -    xfree(bytemap);
> -
>     return err;
>  }
>
> @@ -617,7 +640,7 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
>         {
>             cpumask_var_t new_affinity;
>
> -            ret = xenctl_cpumap_to_cpumask(
> +            ret = xenctl_map_to_cpumask(
>                 &new_affinity, &op->u.vcpuaffinity.cpumap);
>             if ( !ret )
>             {
> @@ -627,7 +650,7 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
>         }
>         else
>         {
> -            ret = cpumask_to_xenctl_cpumap(
> +            ret = cpumask_to_xenctl_map(
>                 &op->u.vcpuaffinity.cpumap, v->cpu_affinity);
>         }
>
> diff --git a/xen/common/trace.c b/xen/common/trace.c
> --- a/xen/common/trace.c
> +++ b/xen/common/trace.c
> @@ -384,7 +384,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc
>     {
>         cpumask_var_t mask;
>
> -        rc = xenctl_cpumap_to_cpumask(&mask, &tbc->cpu_mask);
> +        rc = xenctl_map_to_cpumask(&mask, &tbc->cpu_mask);
>         if ( !rc )
>         {
>             cpumask_copy(&tb_cpu_mask, mask);
> diff --git a/xen/include/public/arch-x86/xen-mca.h
b/xen/include/public/arch-x86/xen-mca.h
> --- a/xen/include/public/arch-x86/xen-mca.h
> +++ b/xen/include/public/arch-x86/xen-mca.h
> @@ -414,7 +414,7 @@ struct xen_mc_mceinject {
>
>  struct xen_mc_inject_v2 {
>        uint32_t flags;
> -       struct xenctl_cpumap cpumap;
> +       struct xenctl_map cpumap;
>  };
>  #endif
>
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -283,7 +283,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_getvc
>  /* XEN_DOMCTL_getvcpuaffinity */
>  struct xen_domctl_vcpuaffinity {
>     uint32_t  vcpu;              /* IN */
> -    struct xenctl_cpumap cpumap; /* IN/OUT */
> +    struct xenctl_map cpumap;    /* IN/OUT */
>  };
>  typedef struct xen_domctl_vcpuaffinity xen_domctl_vcpuaffinity_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpuaffinity_t);
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -71,8 +71,8 @@ struct xen_sysctl_tbuf_op {
>  #define XEN_SYSCTL_TBUFOP_disable      5
>     uint32_t cmd;
>     /* IN/OUT variables */
> -    struct xenctl_cpumap cpu_mask;
> -    uint32_t             evt_mask;
> +    struct xenctl_map cpu_mask;
> +    uint32_t          evt_mask;
>     /* OUT variables */
>     uint64_aligned_t buffer_mfn;
>     uint32_t size;  /* Also an IN variable! */
> @@ -531,7 +531,7 @@ struct xen_sysctl_cpupool_op {
>     uint32_t domid;       /* IN: M              */
>     uint32_t cpu;         /* IN: AR             */
>     uint32_t n_dom;       /*            OUT: I  */
> -    struct xenctl_cpumap cpumap; /*     OUT: IF */
> +    struct xenctl_map cpumap;    /*     OUT: IF */
>  };
>  typedef struct xen_sysctl_cpupool_op xen_sysctl_cpupool_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpupool_op_t);
> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -822,9 +822,9 @@ typedef uint8_t xen_domain_handle_t[16];
>  #endif
>
>  #ifndef __ASSEMBLY__
> -struct xenctl_cpumap {
> +struct xenctl_map {
>     XEN_GUEST_HANDLE_64(uint8) bitmap;
> -    uint32_t nr_cpus;
> +    uint32_t nr_elems;
>  };
>  #endif
>
> diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
> --- a/xen/include/xen/cpumask.h
> +++ b/xen/include/xen/cpumask.h
> @@ -424,8 +424,8 @@ extern cpumask_t cpu_present_map;
>  #define for_each_present_cpu(cpu)  for_each_cpu(cpu, &cpu_present_map)
>
>  /* Copy to/from cpumap provided by control tools. */
> -struct xenctl_cpumap;
> -int cpumask_to_xenctl_cpumap(struct xenctl_cpumap *, const cpumask_t *);
> -int xenctl_cpumap_to_cpumask(cpumask_var_t *, const struct xenctl_cpumap
*);
> +struct xenctl_map;
> +int cpumask_to_xenctl_map(struct xenctl_map *, const cpumask_t *);
> +int xenctl_map_to_cpumask(cpumask_var_t *, const struct xenctl_map *);
>
>  #endif /* __XEN_CPUMASK_H */
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -2,7 +2,7 @@
>  # ! - needs translation
>  # ? - needs checking
>  ?      dom0_vga_console_info           xen.h
> -?      xenctl_cpumap                   xen.h
> +?      xenctl_map                      xen.h
>  ?      mmu_update                      xen.h
>  !      mmuext_op                       xen.h
>  !      start_info                      xen.h
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Ian Jackson

2012-May-31 15:02 UTC

head link

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

Dario Faggioli writes ("[PATCH 09 of 11] libxl, xl: enable automatic
placement of guests on NUMA nodes"):> If a domain does not have a VCPU affinity, try to pin it automatically
> to some PCPUs. This is done taking into account the NUMA characteristics
> of the host: we look for a combination of host''s NUMA nodes that
has enough
> free memoy for the new domain, and pin it to the VCPUs of those nodes.
> Smaller combinations are considered first, to avoid spreading the
> domain''s memory among too many nodes.
Thanks for this.  Here are my comments:
> +static void libxl_nodemap_rand_init(libxl_nodemap *nodemap)
> +{
> +    int i;
> +    nodemap->size = rand() % 16;
> +    nodemap->map = calloc(nodemap->size, sizeof(*nodemap->map));
> +    libxl_for_each_node(i, *nodemap) {
> +        if (rand() % 2)
> +            libxl_nodemap_set(nodemap, i);
> +        else
> +            libxl_nodemap_reset(nodemap, i);
> +    }
> +}
For your random number generation, please use nrand48, with a seed in
the libxl ctx.  (This means you''ll need to take out the ctx lock.)
And provide a way to set the seed.
> +/* Automatic NUMA placement */
> +libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
> +                                        libxl_domain_build_info *b_info,
> +                                        int min_nodes, int *nr_cndts);
> +int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> +                                  int min_cpus, int max_nodes,
> +                                  libxl_numa_candidate *candidate);
> +void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int nr);
This interface documentation is deficient, I''m afraid.  Please explain
how these functions are supposed to be used.
> diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c
> --- a/tools/libxl/libxl_json.c
> +++ b/tools/libxl/libxl_json.c
> @@ -119,6 +119,26 @@ out:
>      return s;
>  }
> 
> +yajl_gen_status libxl_nodemap_gen_json(yajl_gen hand,
> +                                       libxl_nodemap *nodemap)
> +{
> +    yajl_gen_status s;
> +    int i;
> +
> +    s = yajl_gen_array_open(hand);
> +    if (s != yajl_gen_status_ok) goto out;
> +
> +    libxl_for_each_node(i, *nodemap) {
> +        if (libxl_nodemap_test(nodemap, i)) {
> +            s = yajl_gen_integer(hand, i);
> +            if (s != yajl_gen_status_ok) goto out;
> +        }
> +    }
> +    s = yajl_gen_array_close(hand);
> +out:
> +    return s;
> +}
This is a copy of _cpumap_gen_json.  Bad enough to have one of these,
let along two.
> +/*
> + * The following two uility functions compute the node maps
> + * coresponding to the [ n! / (k! * (n - k)!) ] combinations
> + * of @size nodes within a set that is @nr_nodes big, without
> + * repetition. The algorithm used for generating them uses
> + * a vector containing the indexes in the set of the elements
> + * of the i-eth combination to generate the (i+1)-eth, and
> + * ensures they come in lexicographic order.
Can you please avoid this `@'' ?  We aren''t using doxygen (and
never
will I hope) and it''s just noise.  Also I think the "..."
here (and
later) are wrong since these aren''t strings.  If you want to talk
about them as arrays and want something to indicate the grouping you
could use { }.
> +static int numa_node_set_next(int nr_nodes, int size, int *idxs,
> +                              libxl_nodemap *nodemap)
You should at least write     int idxs[]    and explain that the
caller is expected to provide an array of size `size'' which is private
to the implementation of numa_node_set_...
> +{
> +    int i;
> +
> +    /* Check whether we just need to increase the rightmost index */
> +    if (idxs[size - 1] < nr_nodes - 1) {
> +        idxs[size - 1]++;
> +        goto build_nodemap;
Is there something wrong with `else'' ?  Or maybe you mean `goto
out'' ?

But, I think in fact this special case is unnecessary, isn''t it ?
> +    /* Find the rightmost element from where to start the increasing */
> +    for (i = size - 1; idxs[i] == nr_nodes - size + i; i--) {
Since if  idxs[size-1] == nr_nodes-1  this loop''s first test of the
condition with  i == size-1  becomes   idxs[size-1]==nr_nodes-1
and we therefore execute this
> +    for (idxs[i]++, i += 1; i < size; i++)
> +        idxs[i] = idxs[i - 1] + 1;
with i==size-1 and thus we increment idxs[size-1] and quit the loop
right away.

Furthermore, that last loop is really rather confusing.  The function
would benefit from a comment describing the algorith.   `i += 1'' is
un-idiomatic, too.

> +libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
> +                                        libxl_domain_build_info *b_info,
> +                                        int min_nodes, int *nr_cndts)
I think it would be better if this function returned a libxl error
code.  That way in would be possible to distinguish the various error
cases with different error codes.  You should define new error codes
if you need them.

For the libxl internal subcalls, do this:
    rc = libxl_get_some_information(CTX, ...);
    if (rc) goto out;
> +    suitable_nodes_idx = malloc(sizeof(int) * nr_nodes);
Use GCNEW_ARRAY and ditch the error handling.  You will need to
GC_INIT and GC_FREE.  This will give you a gc and then you should
write `CTX'' rather than `ctx'' so that your code can easily be
moved
into an inner function.
> +        /* Generate the combinations and check their cumulative free
memory */
> +        numa_node_set_init(nr_nodes, min_nodes, suitable_nodes_idx,
> +                           &suitable_nodes);
> +        do {
> +            nodes_memkb = 0;
> +            libxl_for_each_set_node(i, suitable_nodes)
> +                nodes_memkb += ninfo[i].free / 1024;
This should be a for loop.  If necessary, numa_node_set_* should be
changed so that they can easily be used in a for loop.

This would be acceptable, for example:

       for (rc = numa_node_set_init(nr_nodes, min_nodes, suitable_nodes_idx,
                                    &suitable_nodes);
            !rc;
            rc = numa_node_set_next(nr_nodes, min_nodes, suitable_nodes_idx,
                                    &suitable_nodes)) {

Perhaps numa_node_set_ should take a struct for all these arguments.
Then you could write:

        for (rc = numa_node_set_init(&numa_node_set_iter, nr_nodes,
min_nodes);
             !rc;
             rc = numa_node_set_next(&numa_node_set_iter) {

Or even better if you changed the interface a bit:

        for (numa_node_set_init(gc, &numa_node_set_iter, nr_nodes,
min_nodes);
             numa_node_set_get(&numa_node_set_iter, &suitable_nodes);
             numa_node_set_next(&numa_node_set_iter) {

which would allow you to make numa_node_set_iter entirely opaque and
move the allocation of the idx array into the numa_node_set_*
functions.
> +                cndts = realloc(cndts, sizeof(cndts[0]) * (*nr_cndts+1));
> +                if (cndts == NULL) {
Firstly, use libxl__zrealloc(0, ...) and ditch the error handling.
Secondly, this reallocs the array every time we add a node.  It would
be better to keep a separate note of the array size and reallocate in
bigger chunks.
> + out_nodemap_idx:
> +    free(suitable_nodes_idx);
> + out_nodemap:
> +    libxl_nodemap_dispose(&suitable_nodes);
> + out_numainfo:
> +    libxl_numainfo_list_free(ninfo, nr_nodes);
> + out:
> +    return cndts;
> +}
Please don''t use this error handling style.  Instead, initialise all
the resource variables to 0 right at the top of the function, and
unconditionally free them.

For the output variable cndts, you can unconditionally free it at the
end if you do this on the success exit path:
> +int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> +                                  int min_cpus, int max_nodes,
> +                                  libxl_numa_candidate *candidate)
> +{
> +    int dom_nodes, nodes_cpus;
> +    libxl_cputopology *tinfo;
> +    libxl_numainfo *ninfo;
> +    int nr_nodes, nr_cpus;
> +    int i, rc;
> +
> +    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
> +    if (tinfo == NULL) {
> +        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl_get_topologyinfo
failed\n");
> +        rc = ERROR_NOMEM;
We aren''t supposed to be returning ERROR_NOMEM any more because all
our memory allocation is now assumed to be infallible.  Shouldn''t this
be ERROR_FAIL ?  (And in the next stanza.)
> +    if (max_nodes == -1 || max_nodes > nr_nodes)
> +        max_nodes = nr_nodes;
I''m afraid I can''t really make much sense of this algorithm
without
thinking very hard.  Maybe if the interface doc comment had explained
what it was supposed to do it would be obvious...
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -490,6 +490,122 @@ static void split_string_into_string_lis
...> +/* How many PCPUs are there on each node? */
> +static int cpus_per_node(libxl_cputopology *tinfo, int nr_cpus)
> +{
> +    libxl_numainfo *ninfo;
> +    int nr_nodes, j, i;
> +    int cpus_nodes = 0;
> +
> +    ninfo = libxl_get_numainfo(ctx, &nr_nodes);
> +    if (ninfo == NULL)
> +        return 0;
> +
> +    /* This makes sense iff # of PCPUs is the same for all nodes */
And what if it isn''t ?  Later I see:
> +    if (cpus_nodes != 0)
> +        dom_min_nodes = (dom_needs_cpus + cpus_nodes - 1) / cpus_nodes;
> +    else
> +        dom_min_nodes = 1;
which seems to suggest we''ll pile the whole thing onto one pcpu.
> +/* Try to achieve "optimal" NUMA placement */
> +static int place_domain(libxl_domain_build_info *b_info)
> +{...
 +    if (nr_pools > 1) {> +        fprintf(stderr, "skip numa placement as cpupools are being
used\n");
> +        err = 0;
> +        goto out_poolinfo;
> +    }
Perhaps in the case of cpupools the right answer is to consider only
those pcpus in the intended pool ?  Or is that likely to do more harm
than good, with people who are currently doing their numa placement
with cpupools ?
> +    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
> +    if (tinfo == NULL) {
> +        fprintf(stderr, "libxl_get_topologyinfo failed.\n");
> +        err = ENOMEM;
libxl functions should return libxl error codes, not errno values.
You will probably need to introduce some.
> +    if (err == ERROR_FAIL) {
> +        /* Report back we didn''t find a candidate with enough
cpus */
> +        err = ESRCH;
I don''t think we should ever turn ERROR_FAIL into something else.
ERROR_FAIL means `something went wrong in a way that shouldn''t
happen''.

> +out_topologyinfo:
> +    libxl_cputopology_list_free(tinfo, nr_cpus);
> +out_poolinfo:
> +    for (i = 0; i < nr_pools; i++)
> +        libxl_cpupoolinfo_dispose(&pinfo[i]);
> +out:
> +    return err;
> +}
Again, please use an idempotent error handling style.


Ian.

Dario Faggioli

2012-May-31 15:08 UTC

head link

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

On Thu, 2012-05-31 at 16:01 +0100, George Dunlap wrote:> Dario,
> 
> What changes are there in this since the last time you posted this?
> Anything other than updating the patch description?
> Nope. Patches 1, 2 and 3 are just what I sent with your comments applied
about code and changelogs.

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Jackson

2012-May-31 15:08 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Dario Faggioli writes ("[PATCH 11 of 11] Some automatic NUMA placement
documentation"):> About rationale, usage and API.
...> +## Guest Placement in libxl ##
Oh here''s the API documentation!

In general I would prefer to see docs come in the same patch but I
guess I can read it here:
> +xl achieves automatic NUMA placement by means of the following API
> +calls, provided by libxl.
Can you try to write some more general comment about what order these
functions should be called in ?

Or to put it another way:
> +        libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
> +                                            libxl_domain_build_info
*b_info,
> +                                            int min_nodes, int *nr_cndts);
> +
> +This is what should be used to generate the full set of placement
> +candidates. In fact, the function returns an array of containing nr_cndts
> +libxl_numa_candidate (see below). Each candidate is basically a set of
nodes
> +that has been checked against the memory requirement derived from the
> +provided libxl_domain_build_info.
That tells me what the function does.  But my starting point is that I
have no idea when or why I might want to `generate the full set of
placement candidates''.  If this is something I need to do the docs
need to explain that.

This goes double for this function:
> +        int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> +                                          int min_cpus, int max_nodes,
> +                                          libxl_numa_candidate
*candidate);
> +
> +This is what should be used to ensure a placement candidate has at least
> +min_cpus CPUs. In case it does not, the function also take care of
> +adding more nodes to the candidate itself (up to when the value specified
> +in max_nodes is reached). When adding new nodes, the one that has the
> +smallest "distance" from the current node map is selected at
each step.
`add_cpus'' doesn''t seem the same as `ensure a candidate has at
least
min_cpus CPUs''.  In what sense are the CPUs added ?


And, as before, why might I want to call this ?  And when would I call
it ?  Why does the interface to libxl expose this rather than just
offering a single function

 libxl_do_automatic_numa_placement_according_to_heuristics

?

> +        libxl_numa_candidate_count_domains(libxl_ctx *ctx,
> +                                           libxl_numa_candidate
*candidate);
> +
> +This is what counts the number of domains that are currently pinned
> +to the CPUs of the nodes of a given candidate.
Why is that useful ?  It is used by some of your other code so I guess
this is a facility which is useful to the implementors of other
placement algorithms ?

Ian.

Dario Faggioli

2012-May-31 15:41 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

On Thu, 2012-05-31 at 16:08 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 11 of 11] Some automatic NUMA placement
documentation"):
> > About rationale, usage and API.
> ...
> > +## Guest Placement in libxl ##
> 
> Oh here''s the API documentation!
> Yep. :-)
> In general I would prefer to see docs come in the same patch but I
> guess I can read it here:
> I can put it there. Would you recommend _moving_ it in the source file o
also leaving a copy, o maybe just some pointers, here?
> > +xl achieves automatic NUMA placement by means of the following API
> > +calls, provided by libxl.
> 
> Can you try to write some more general comment about what order these
> functions should be called in ?
> Yes I can, good point.

<snip>
> > +        int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> > +                                          int min_cpus, int
max_nodes,
> > +                                          libxl_numa_candidate
*candidate);
> > +
> > +This is what should be used to ensure a placement candidate has at
least
> > +min_cpus CPUs. In case it does not, the function also take care of
> > +adding more nodes to the candidate itself (up to when the value
specified
> > +in max_nodes is reached). When adding new nodes, the one that has the
> > +smallest "distance" from the current node map is selected
at each step.
> 
> `add_cpus'' doesn''t seem the same as `ensure a candidate
has at least
> min_cpus CPUs''.  In what sense are the CPUs added ?
> In the sense that, if you have a candidate placement which comprises,
say, 2 nodes, summing up to 12 CPUs, but you want your VM to have access
to a minimum of 16 CPUs the this checks that and add more nodes to the
candidate placement until it contains enough of them for giving you the
number of CPUs you asked for.
> And, as before, why might I want to call this ?  And when would I call
> it ?  Why does the interface to libxl expose this rather than just
> offering a single function
> 
>  libxl_do_automatic_numa_placement_according_to_heuristics
> 
> ?
> Right, that is of couse an option, and I thought about going that way
for quite a while. However, what I think is best for libxl is to provide
a set of building blocks for its user to be able to implement the actual
heuristics.

That''s why I tried to offer a ''generate the placement
candidates'' and
''manipulate the placement candidate'' kind of API, to make it
possible
for lower levels to use it when putting their own placing algorithm
together. It is also kind of extendable, I mean, we can always add
another ''manipulate the candidates in some other way'' function
if needed
or wanted by us or whatever other libxl user.

Doing it the other way, i.e., one big function doing everything, would
mean that as soon as we want to change or improve the placement
heuristics, we need to modify the behaviour of that API call, which I
think it is suboptimal. Also, if some other user of libxl does not like
the heuristics I came out with, they have to reimplement the whole
placement.

So, like it is right now, the actual heuristics is implemented in xl, on
top of these placement candidate generation and manipulation facilities,
which I finally decided it was the way I liked this whole thing
most. :-)

Again, you''re right in asking this reasoning to be part of the
documentation, and to be put in the proper place, and I will do that.
However, now that I''ve put it here, do you think it makes some sense?
> > +        libxl_numa_candidate_count_domains(libxl_ctx *ctx,
> > +                                           libxl_numa_candidate
*candidate);
> > +
> > +This is what counts the number of domains that are currently pinned
> > +to the CPUs of the nodes of a given candidate.
> 
> Why is that useful ?  It is used by some of your other code so I guess
> this is a facility which is useful to the implementors of other
> placement algorithms ?
> As I tried to explain above, yes, exactly like that.

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

George Dunlap

2012-May-31 15:41 UTC

head link

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

On 31/05/12 13:11, Dario Faggioli wrote:> As NUMA node-related counterparts of xc_cpumap_t and libxl_cpumap.
>
> This is in preparation of making it possible to manipulate
> NUMA nodes from the toolstack(s).
>
> Signed-off-by: Dario Faggioli<dario.faggioli@citrix.com>
>
> diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> --- a/tools/libxc/xc_misc.c
> +++ b/tools/libxc/xc_misc.c
> @@ -35,11 +35,30 @@ int xc_get_max_cpus(xc_interface *xch)
>       return max_cpus;
>   }
>
> +int xc_get_max_nodes(xc_interface *xch)
> +{
> +    static int max_nodes = 0;
> +    xc_physinfo_t physinfo;
> +
> +    if ( max_nodes )
> +        return max_nodes;
> +
> +    if ( !xc_physinfo(xch,&physinfo) )
> +        max_nodes = physinfo.max_node_id + 1;
> +
> +    return max_nodes;What does xc_physinfo() return?  Should this return the error message 
rather than 0 if it fails?

  -George

George Dunlap

2012-May-31 15:51 UTC

head link

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

On 31/05/12 13:11, Dario Faggioli wrote:> By adding copying and *_is_full/*_is_empty facilities.
>
> Signed-off-by: Dario Faggioli<dario.faggioli@citrix.com>
>
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -505,6 +505,35 @@ static int libxl_map_alloc(libxl_ctx *ct
>       return 0;
>   }
>
> +static void libxl_map_copy(struct libxl_map *dptr,
> +                           const struct libxl_map *sptr)
> +{
> +    int sz;
> +
> +    sz = dptr->size = sptr->size;
> +    memcpy(dptr->map, sptr->map, sz * sizeof(*dptr->map));
> +}
> +
> +int libxl_map_is_full(struct libxl_map *map)
> +{
> +    int i;
> +
> +    for (i = 0; i<  map->size; i++)
> +        if (map->map[i] != (uint8_t)-1)
> +            return -1;
> +    return 0;
> +}Why are you returning -1 and 0 for false and true, rather than 0 and 1?  
None of the other libxl "_is_" functions (e.g., libxl_is_stubom()) do
that.

  -George

George Dunlap

2012-May-31 15:54 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On 31/05/12 13:11, Dario Faggioli wrote:> And make all the required infrastructure updates to enable this.
>
> Signed-off-by: Ian Campbell<ian.campbell@citrix.com>
> Tested-by: Dario Faggioli<dario.faggioli@citrix.com>This is beyond my ken, so I''m going to skip this one. :-)

  -George

Dario Faggioli

2012-May-31 16:01 UTC

head link

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

On Thu, 2012-05-31 at 16:51 +0100, George Dunlap wrote:> > +int libxl_map_is_full(struct libxl_map *map)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i<  map->size; i++)
> > +        if (map->map[i] != (uint8_t)-1)
> > +            return -1;
> > +    return 0;
> > +}
> Why are you returning -1 and 0 for false and true, rather than 0 and 1?  
> None of the other libxl "_is_" functions (e.g.,
libxl_is_stubom()) do that.
> Oh, well, it seemed natural to me as well for true to be 1, but that
would have meant giving 0 a different meaning than ''everything is
ok'',
and I didn''t find relevant examples of it being done... Thanks for
pointing this out, I''ll change it.

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-May-31 16:09 UTC

head link

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

On Thu, 2012-05-31 at 16:41 +0100, George Dunlap wrote:> > +int xc_get_max_nodes(xc_interface *xch)
> > +{
> > +    static int max_nodes = 0;
> > +    xc_physinfo_t physinfo;
> > +
> > +    if ( max_nodes )
> > +        return max_nodes;
> > +
> > +    if ( !xc_physinfo(xch,&physinfo) )
> > +        max_nodes = physinfo.max_node_id + 1;
> > +
> > +    return max_nodes;
> What does xc_physinfo() return?  Should this return the error message 
> rather than 0 if it fails?
> 
int xc_get_max_cpus(xc_interface *xch)
{
    static int max_cpus = 0;
    xc_physinfo_t physinfo;

    if ( max_cpus )
        return max_cpus;

    if ( !xc_physinfo(xch, &physinfo) )
        max_cpus = physinfo.max_cpu_id + 1;

    return max_cpus;
}

And I guess the reason for this is this is used (both in libxc and
libxl) in a few places to determine the size of the array that will host
the cpu map. Thus, getting a 0 ensures we do not allocate a random
amount of memory.

Of course it should be more than possible to change it, but going though
all the callers is needed. If you want me to do that, just say it, and I
will perhaps deal with both this "original" and my
"variant", right?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-May-31 16:27 UTC

head link

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

On Thu, 2012-05-31 at 16:02 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 09 of 11] libxl, xl: enable automatic
placement of guests on NUMA nodes"):
> > If a domain does not have a VCPU affinity, try to pin it automatically
> > to some PCPUs. This is done taking into account the NUMA
characteristics
> > of the host: we look for a combination of host''s NUMA nodes
that has enough
> > free memoy for the new domain, and pin it to the VCPUs of those nodes.
> > Smaller combinations are considered first, to avoid spreading the
> > domain''s memory among too many nodes.
> 
> Thanks for this.  Here are my comments:
> Thanks to you. :-)
> > +static void libxl_nodemap_rand_init(libxl_nodemap *nodemap)
> > +{
> > +    int i;
> > +    nodemap->size = rand() % 16;
> > +    nodemap->map = calloc(nodemap->size,
sizeof(*nodemap->map));
> > +    libxl_for_each_node(i, *nodemap) {
> > +        if (rand() % 2)
> > +            libxl_nodemap_set(nodemap, i);
> > +        else
> > +            libxl_nodemap_reset(nodemap, i);
> > +    }
> > +}
> 
> For your random number generation, please use nrand48, with a seed in
> the libxl ctx.  (This means you''ll need to take out the ctx lock.)
> And provide a way to set the seed.
> Sounds reasonable, I''ll look into this. As a side note, as I
deliberately took inspiration from libxl_cpumap_rand_init() for this,
should I apply what you just said to that function too?
> > +/* Automatic NUMA placement */
> > +libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
> > +                                        libxl_domain_build_info
*b_info,
> > +                                        int min_nodes, int
*nr_cndts);
> > +int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> > +                                  int min_cpus, int max_nodes,
> > +                                  libxl_numa_candidate *candidate);
> > +void libxl_numa_candidates_list_free(libxl_numa_candidate *list, int
nr);
> 
> This interface documentation is deficient, I''m afraid.  Please
explain
> how these functions are supposed to be used.
> Sure, as per the other mail, I''ll put the API documentation here.
> > +yajl_gen_status libxl_nodemap_gen_json(yajl_gen hand,
> > +                                       libxl_nodemap *nodemap)
> > +{
> > +    yajl_gen_status s;
> > +    int i;
> > +
> > +    s = yajl_gen_array_open(hand);
> > +    if (s != yajl_gen_status_ok) goto out;
> > +
> > +    libxl_for_each_node(i, *nodemap) {
> > +        if (libxl_nodemap_test(nodemap, i)) {
> > +            s = yajl_gen_integer(hand, i);
> > +            if (s != yajl_gen_status_ok) goto out;
> > +        }
> > +    }
> > +    s = yajl_gen_array_close(hand);
> > +out:
> > +    return s;
> > +}
> 
> This is a copy of _cpumap_gen_json.  Bad enough to have one of these,
> let along two.
> So, perhaps you''re suggesting merging the two of them?
> > +static int numa_node_set_next(int nr_nodes, int size, int *idxs,
> > +                              libxl_nodemap *nodemap)
> 
> You should at least write     int idxs[]    and explain that the
> caller is expected to provide an array of size `size'' which is
private
> to the implementation of numa_node_set_...
> 
> <snip>
>I like this and the other suggestions about restructuring the
combination generation and iteration, and I''ll also try to improve the
comments and the documentation.
> > +libxl_numa_candidate *libxl_domain_numa_candidates(libxl_ctx *ctx,
> > +                                        libxl_domain_build_info
*b_info,
> > +                                        int min_nodes, int *nr_cndts)
> 
> I think it would be better if this function returned a libxl error
> code.  That way in would be possible to distinguish the various error
> cases with different error codes.  You should define new error codes
> if you need them.
> 
> For the libxl internal subcalls, do this:
>     rc = libxl_get_some_information(CTX, ...);
>     if (rc) goto out;
> Ok, I will move the array of candidates among the parameters...
> > +                cndts = realloc(cndts, sizeof(cndts[0]) *
(*nr_cndts+1));
> > +                if (cndts == NULL) {
> 
> Firstly, use libxl__zrealloc(0, ...) and ditch the error handling.
> Secondly, this reallocs the array every time we add a node.  It would
> be better to keep a separate note of the array size and reallocate in
> bigger chunks.
> Yes, that''s very bad, I agree.
> > + out_nodemap_idx:
> > +    free(suitable_nodes_idx);
> > + out_nodemap:
> > +    libxl_nodemap_dispose(&suitable_nodes);
> > + out_numainfo:
> > +    libxl_numainfo_list_free(ninfo, nr_nodes);
> > + out:
> > +    return cndts;
> > +}
> 
> Please don''t use this error handling style.  Instead, initialise
all
> the resource variables to 0 right at the top of the function, and
> unconditionally free them.
> Ok.
> For the output variable cndts, you can unconditionally free it at the
> end if you do this on the success exit path:
> No, that is up to the caller to free...
> > +int libxl_numa_candidate_add_cpus(libxl_ctx *ctx,
> > +                                  int min_cpus, int max_nodes,
> > +                                  libxl_numa_candidate *candidate)
> > +{
> > +    int dom_nodes, nodes_cpus;
> > +    libxl_cputopology *tinfo;
> > +    libxl_numainfo *ninfo;
> > +    int nr_nodes, nr_cpus;
> > +    int i, rc;
> > +
> > +    tinfo = libxl_get_cpu_topology(ctx, &nr_cpus);
> > +    if (tinfo == NULL) {
> > +        LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
"libxl_get_topologyinfo failed\n");
> > +        rc = ERROR_NOMEM;
> 
> We aren''t supposed to be returning ERROR_NOMEM any more because
all
> our memory allocation is now assumed to be infallible.  Shouldn''t
this
> be ERROR_FAIL ?  (And in the next stanza.)
> Right. What I need is to distinguish, in the caller, whether this failed
because it did not find and couldn''t add enough CPUs --which would mean
"bad, but we just reset the affinity and avoid placing the domain"--
or
for some other reason --which would mean "bad, we must bail out". But
I
guess I can manage doing this with the ERROR_s (or adding my own ones,
if that is the case).
> > +/* How many PCPUs are there on each node? */
> > +static int cpus_per_node(libxl_cputopology *tinfo, int nr_cpus)
> > +{
> > +    libxl_numainfo *ninfo;
> > +    int nr_nodes, j, i;
> > +    int cpus_nodes = 0;
> > +
> > +    ninfo = libxl_get_numainfo(ctx, &nr_nodes);
> > +    if (ninfo == NULL)
> > +        return 0;
> > +
> > +    /* This makes sense iff # of PCPUs is the same for all nodes */
> 
> And what if it isn''t ?  Later I see:
> 
> > +    if (cpus_nodes != 0)
> > +        dom_min_nodes = (dom_needs_cpus + cpus_nodes - 1) /
cpus_nodes;
> > +    else
> > +        dom_min_nodes = 1;
> 
> which seems to suggest we''ll pile the whole thing onto one pcpu.
> You''re again right I might have spent a few more words on this...
It''s
just I fear bearing to verbose (I really tend to talk and write too
much!).

This whole comes from an hint from George during his review of my RFC,
where he said we should check how many CPUs we have on each node so
that, if we know we have 6 CPUs on each node and the domain wants 8
CPUs, we can avoid looking for solutions comprising less than 2 nodes.
This is all true, but it depends on the fact each node has the same
amount of CPUs in it, otherwise, math becomes way more complex than just
that division.

Therefore, what this is trying to do is check whether or not we are in
that situation and, if not, just turn this kind of optimization off and
tell the rest of the algorithm to give use _all_ the candidates,
starting from a minimum number of nodes equal to one (which is the
meaning of dom_min_nodes = 1 at this stage).
> > +/* Try to achieve "optimal" NUMA placement */
> > +static int place_domain(libxl_domain_build_info *b_info)
> > +{
> ...
>  +    if (nr_pools > 1) {
> > +        fprintf(stderr, "skip numa placement as cpupools are
being used\n");
> > +        err = 0;
> > +        goto out_poolinfo;
> > +    }
> 
> Perhaps in the case of cpupools the right answer is to consider only
> those pcpus in the intended pool ?  Or is that likely to do more harm
> than good, with people who are currently doing their numa placement
> with cpupools ?
> I''m not sure. What you are saying is what I wanted to have, but then I
thought, as it is for VCPU affinity, if the user is asking something
specific about CPU usage, it may be better to leave the whole thing with
him and turning all the automation down.

However, I think I can make the algorithm work while taking cpupools
into account, it''s just a matter of deciding. Maybe hearing something
from George and Juergen could be useful...

Thanks a lot for the quick and detailed review. :-)

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Campbell

2012-May-31 16:42 UTC

head link

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

On Thu, 2012-05-31 at 17:27 +0100, Dario Faggioli wrote:> On Thu, 2012-05-31 at 16:02 +0100, Ian Jackson wrote:
> > Dario Faggioli writes ("[PATCH 09 of 11] libxl, xl: enable
automatic placement of guests on NUMA nodes"):
> > > If a domain does not have a VCPU affinity, try to pin it
automatically
> > > to some PCPUs. This is done taking into account the NUMA
characteristics
> > > of the host: we look for a combination of host''s NUMA
nodes that has enough
> > > free memoy for the new domain, and pin it to the VCPUs of those
nodes.
> > > Smaller combinations are considered first, to avoid spreading the
> > > domain''s memory among too many nodes.
> > 
> > Thanks for this.  Here are my comments:
> > 
> Thanks to you. :-)
> 
> > > +static void libxl_nodemap_rand_init(libxl_nodemap *nodemap)
> > > +{
> > > +    int i;
> > > +    nodemap->size = rand() % 16;
> > > +    nodemap->map = calloc(nodemap->size,
sizeof(*nodemap->map));
> > > +    libxl_for_each_node(i, *nodemap) {
> > > +        if (rand() % 2)
> > > +            libxl_nodemap_set(nodemap, i);
> > > +        else
> > > +            libxl_nodemap_reset(nodemap, i);
> > > +    }
> > > +}
> > 
> > For your random number generation, please use nrand48, with a seed in
> > the libxl ctx.  (This means you''ll need to take out the ctx
lock.)
> > And provide a way to set the seed.
> > 
> Sounds reasonable, I''ll look into this. As a side note, as I
> deliberately took inspiration from libxl_cpumap_rand_init() for this,
> should I apply what you just said to that function too?
This code is in gentest.py not libxl proper, it doesn''t need a seed in
the ctx nor to use good random numbers, Ian J was mistakenly think this
was actual libxl functionality I think.

Ian.

Dario Faggioli

2012-May-31 16:56 UTC

head link

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

On Thu, 2012-05-31 at 17:42 +0100, Ian Campbell wrote:> > > For your random number generation, please use nrand48, with a
seed in
> > > the libxl ctx.  (This means you''ll need to take out the
ctx lock.)
> > > And provide a way to set the seed.
> > > 
> > Sounds reasonable, I''ll look into this. As a side note, as I
> > deliberately took inspiration from libxl_cpumap_rand_init() for this,
> > should I apply what you just said to that function too?
> 
> This code is in gentest.py not libxl proper, 
>It is, indeed, as it is its _cpumap_ counterpart from which I copied it,
as soon as I started getting build errors because of an obscure
libxl_nodemap_rand_init() (which I didn''t knew anything of) being
missing! :-P

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

George Dunlap

2012-Jun-01 16:44 UTC

head link

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

On 31/05/12 15:22, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 06 of 11] libxl: introduce
libxl_get_numainfo()"):
>> Make some NUMA node information available to the toolstack. Achieve
>> this by means of xc_numainfo(), which exposes memory size and amount
>> of free memory of each node, as well as the relative distances of
>> each node to all the others.
> ...
>> +#define LIBXL_NUMAINFO_INVALID_ENTRY (~(uint32_t)0)
> Is there some reason we can''t just make sure we use the same value
for
> this in both places ?  That would avoid the need for this:
>
>> +#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
>> +    LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
> I appreciate that the types aren''t the same.  In libxc
it''s an
> unsigned long.  But shouldn''t they be the same ?It looks like Dario was just following suit from the topology interface:

xen/include/syctl.h: #define INVALID_TOPOLOGY_ID  (~0U)
tools/libxl/libxl.h: #define LIBXL_CPUTOPOLOGY_INVALID_ENTRY (~(uint32_t)0)

It might be worth seeing who wrote those and asking that person why they 
thought it was important.  It might be something to do with making the 
libxl type a fixed size...?> In libxl you can use libxl__zalloc(NULL,...) (and don''t have to
check
> for errors because it can''t fail). But perhaps this is going into 
> libxc ? I''d like to hear what other people say about putting this
in
> libxl vs. libxc.I dunno, whom do we expect to be calling libxc?  If the answer is "One 
day, only libxl", then it''s just a matter of taste.  Otherwise, we
need
to ask whether someone might want to call such a function w/o having to 
link to libxl.

  -George

George Dunlap

2012-Jun-01 16:56 UTC

head link

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

On 31/05/12 15:24, Ian Jackson wrote:> Dario Faggioli writes ("[PATCH 08 of 11] xl: add more NUMA information
to `xl info -n''"):
>> So that the user knows how much memory there is on each node and
>> how far they are from each others.
> Perhaps the json output machinery can do this for us ?  If that''s
> sufficiently legible then it would be an improvement on this
> open-coded printer.That sounds like a different patch series -- after all, the topology 
info is also "open-coded" at the moment.  I think there''s
enough work to
do with the NUMA placement stuff for now. :-)

My only nit (somewhat pedantic) is that "x / (1 << n)" will come
up with
exactly the same answer as "x >> n", but with a very fast
bit-shift
rather than a very slow integer divison.  Not that it matters for this 
purpose; but it does make the expression more complicated-looking than 
it needs to me.

But other than that, I think this patch is probably good as it is.

  -George

Ian Jackson

2012-Jun-01 16:58 UTC

head link

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

George Dunlap writes ("Re: [PATCH 06 of 11] libxl: introduce
libxl_get_numainfo()"):> I dunno, whom do we expect to be calling libxc?  If the answer is "One
> day, only libxl", then it''s just a matter of taste. 
Otherwise, we need
> to ask whether someone might want to call such a function w/o having to 
> link to libxl.
Perhaps the right thing is to cross that bridge when we come to it.

Ian.

Ian Jackson

2012-Jun-08 13:54 UTC

head link

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 04 of 11] libxl: expand the
libxl_{cpu, node}map API a bit"):> On Thu, 2012-05-31 at 15:13 +0100, Ian Jackson wrote:
> > Dario Faggioli writes ("[PATCH 04 of 11] libxl: expand the
libxl_{cpu,node}map API a bit"):
> > > +void libxl_cpumap_copy(/*XXX libxl_ctx *ctx,*/ libxl_cpumap
*dst,
> > 
> > XXX ?
> 
> Yep, I left it on purpose but I also wanted to mention that in the
> changelog, which I apparently missed. Sorry.
> 
> However, now that we are here, the whole point is, does that function
> need a libxl_ctx argument? It doesn''t use it, but I was concerned
about
> consistency with the other libxl_cpumap_* calls, and maybe future
> needs...
Yes.  It should take a ctx.

Ian.

Ian Jackson

2012-Jun-08 14:01 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 11 of 11] Some automatic
NUMA placement documentation"):> On Thu, 2012-05-31 at 16:08 +0100, Ian Jackson wrote:
> > In general I would prefer to see docs come in the same patch but I
> > guess I can read it here:
> 
> I can put it there. Would you recommend _moving_ it in the source file o
> also leaving a copy, o maybe just some pointers, here?
In general I don''t mind whether API docs are in a separate file or in
the source headers.  What we have here is sufficiently complex that it
seems to merit its own file.  A reference to it from libxl.h would be
appropriate.
> Right, that is of couse an option, and I thought about going that way
> for quite a while. However, what I think is best for libxl is to provide
> a set of building blocks for its user to be able to implement the actual
> heuristics.
The difficulty with this is that this is supposedly becoming a stable
API.  If we change the approach later, we may want to change the API.
Do we know clearly enough what building blocks are needed and what
they should be like ?

I think it would be easier, at this stage for 4.2, simply to offer a
single `do something sensible'' heuristic, and then think about more
complicated control by the caller for 4.3.
> Doing it the other way, i.e., one big function doing everything, would
> mean that as soon as we want to change or improve the placement
> heuristics, we need to modify the behaviour of that API call, which I
> think it is suboptimal.
If we have a function whose documented behaviour is `try to do a
roughly optimal thing'' then improving its optimisation is not a change
to the API semantics.

Specifically, existing code then will, when upgraded to a newer libxl,
behaviour differently - better, we hope.  Is that not the intent ?
>   Also, if some other user of libxl does not like
> the heuristics I came out with, they have to reimplement the whole
> placement.
I think users of 4.2 won''t be doing that.  That is, we can tell them
to please wait for 4.3 when we will have a richer API for this kind of
thing.
> So, like it is right now, the actual heuristics is implemented in xl, on
> top of these placement candidate generation and manipulation facilities,
> which I finally decided it was the way I liked this whole thing
> most. :-)
So how much of the code currently in libxl should be reproduced in
(say) libvirt ?
> Again, you''re right in asking this reasoning to be part of the
> documentation, and to be put in the proper place, and I will do that.
> However, now that I''ve put it here, do you think it makes some
sense?
I think it''s fine for it to be in a separate file.  But I think it
should ideally be in the same patch as the implementation.
> > > +        libxl_numa_candidate_count_domains(libxl_ctx *ctx,
> > > +                                           libxl_numa_candidate
*candidate);
> > > +
> > > +This is what counts the number of domains that are currently
pinned
> > > +to the CPUs of the nodes of a given candidate.
> > 
> > Why is that useful ?  It is used by some of your other code so I guess
> > this is a facility which is useful to the implementors of other
> > placement algorithms ?
> 
> As I tried to explain above, yes, exactly like that.
Are we really sure that we want to support this API for a long time ?

Ian.

George Dunlap

2012-Jun-08 14:03 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

On 08/06/12 15:01, Ian Jackson wrote:> Dario Faggioli writes ("Re: [Xen-devel] [PATCH 11 of 11] Some
automatic NUMA placement documentation"):
>> On Thu, 2012-05-31 at 16:08 +0100, Ian Jackson wrote:
>>> In general I would prefer to see docs come in the same patch but I
>>> guess I can read it here:
>> I can put it there. Would you recommend _moving_ it in the source file
o
>> also leaving a copy, o maybe just some pointers, here?
> In general I don''t mind whether API docs are in a separate file or
in
> the source headers.  What we have here is sufficiently complex that it
> seems to merit its own file.  A reference to it from libxl.h would be
> appropriate.
>
>> Right, that is of couse an option, and I thought about going that way
>> for quite a while. However, what I think is best for libxl is to
provide
>> a set of building blocks for its user to be able to implement the
actual
>> heuristics.
> The difficulty with this is that this is supposedly becoming a stable
> API.  If we change the approach later, we may want to change the API.
> Do we know clearly enough what building blocks are needed and what
> they should be like ?
>
> I think it would be easier, at this stage for 4.2, simply to offer a
> single `do something sensible'' heuristic, and then think about
more
> complicated control by the caller for 4.3.In fact, I think it would probably be better to take Ian Campbell''s 
suggestion and either just do it in xl completely (with no libxl 
changes) or do it entirely with libxl, with no external interface.  Then 
we''ll have something better than nothing, and the whole 4.3 dev cycle
to
come up with a good interface we can commit to.

  -George

Ian Jackson

2012-Jun-08 14:03 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Dario Faggioli writes ("[Xen-devel] [PATCH 05 of 11] libxl: add a new Array
type to the IDL"):> And make all the required infrastructure updates to enable this.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
You write `From: Dario'' in the headers but it has a S-o-b from Ian.

I think you should write
  From: Ian Campbell <ian.campbell@citrix.com>
if he actually wrote it.

git-commit --author will do this for you and I think git-send-email
will then generate appropriate output.

Can you point me at your git tree so I can easily clone it and see
what the output from this new IDL generator looks like ?

Thanks,
Ian.

Ian Jackson

2012-Jun-08 14:06 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

George Dunlap writes ("Re: [Xen-devel] [PATCH 11 of 11] Some automatic NUMA
placement documentation"):> On 08/06/12 15:01, Ian Jackson wrote:
> > I think it would be easier, at this stage for 4.2, simply to offer a
> > single `do something sensible'' heuristic, and then think
about more
> > complicated control by the caller for 4.3.
>
> In fact, I think it would probably be better to take Ian
Campbell''s
> suggestion and either just do it in xl completely (with no libxl 
> changes) or do it entirely with libxl, with no external interface.  Then 
> we''ll have something better than nothing, and the whole 4.3 dev
cycle to
> come up with a good interface we can commit to.
Yes.

Ian.

Dario Faggioli

2012-Jun-08 14:35 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

On Fri, 2012-06-08 at 15:06 +0100, Ian Jackson wrote:> George Dunlap writes ("Re: [Xen-devel] [PATCH 11 of 11] Some automatic
NUMA placement documentation"):
> > On 08/06/12 15:01, Ian Jackson wrote:
> > > I think it would be easier, at this stage for 4.2, simply to
offer a
> > > single `do something sensible'' heuristic, and then think
about more
> > > complicated control by the caller for 4.3.
> >
> > In fact, I think it would probably be better to take Ian
Campbell''s
> > suggestion and either just do it in xl completely (with no libxl 
> > changes) or do it entirely with libxl, with no external interface. 
Then
> > we''ll have something better than nothing, and the whole 4.3
dev cycle to
> > come up with a good interface we can commit to.
> 
> Yes.
> That''s fine. As also discussed on IRC with Ian Campbell and George,
I''m
putting everything inside libxl, but with _no_ public API exposure of it
at all (as per analogy to it being in xend).

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-Jun-08 15:14 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Fri, 2012-06-08 at 15:03 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[Xen-devel] [PATCH 05 of 11] libxl: add a new
Array type to the IDL"):
> > And make all the required infrastructure updates to enable this.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
> 
> You write `From: Dario'' in the headers but it has a S-o-b from
Ian.
> Right.
> I think you should write
>   From: Ian Campbell <ian.campbell@citrix.com>
> if he actually wrote it.
> Yes, this is by Ian...
> git-commit --author will do this for you and I think git-send-email
> will then generate appropriate output.
> Oh, thanks... Do you happen to know what a mercurial equivalent for this
could be? :-P
> Can you point me at your git tree so I can easily clone it and see
> what the output from this new IDL generator looks like ?
> Well, it''s an hg tree + patchqueue that I don''t have anywhere
public. I
just put a fresh clone of xen-unstable plus this patch here (on a Citrix
NIS machine):

 ~dariof/Repositories/xen-unstable-array.hg

is that fine for now?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Jackson

2012-Jun-08 15:17 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 05 of 11] libxl: add a new
Array type to the IDL"):> On Fri, 2012-06-08 at 15:03 +0100, Ian Jackson wrote:
> > git-commit --author will do this for you and I think git-send-email
> > will then generate appropriate output.
> 
> Oh, thanks... Do you happen to know what a mercurial equivalent for this
> could be? :-P
hg ci -u
> > Can you point me at your git tree so I can easily clone it and see
> > what the output from this new IDL generator looks like ?
> 
> Well, it''s an hg tree + patchqueue that I don''t have
anywhere public. I
> just put a fresh clone of xen-unstable plus this patch here (on a Citrix
> NIS machine):
> 
>  ~dariof/Repositories/xen-unstable-array.hg
> 
> is that fine for now?
Thanks, that''s great.  I''ll take a look.

Ian.

Dario Faggioli

2012-Jun-08 15:19 UTC

head link

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

On Fri, 2012-06-08 at 15:01 +0100, Ian Jackson wrote:> > Right, that is of couse an option, and I thought about going that way
> > for quite a while. However, what I think is best for libxl is to
provide
> > a set of building blocks for its user to be able to implement the
actual
> > heuristics.
> 
> The difficulty with this is that this is supposedly becoming a stable
> API.  If we change the approach later, we may want to change the API.
> Do we know clearly enough what building blocks are needed and what
> they should be like ?
> No, you all are right, not yet... More investigation is needed here and
it is not now the moment to do it.
> > Doing it the other way, i.e., one big function doing everything, would
> > mean that as soon as we want to change or improve the placement
> > heuristics, we need to modify the behaviour of that API call, which I
> > think it is suboptimal.
> 
> If we have a function whose documented behaviour is `try to do a
> roughly optimal thing'' then improving its optimisation is not a
change
> to the API semantics.
> 
> Specifically, existing code then will, when upgraded to a newer libxl,
> behaviour differently - better, we hope.  Is that not the intent ?
> It is. I''m changing this all into something like "when you do not
say
anything, libxl will place the domain `sensibly`, +/- as xend was
doing". We''ll then add all the necessary interface and API in
place
during 4.3.
> > So, like it is right now, the actual heuristics is implemented in xl,
on
> > top of these placement candidate generation and manipulation
facilities,
> > which I finally decided it was the way I liked this whole thing
> > most. :-)
> 
> So how much of the code currently in libxl should be reproduced in
> (say) libvirt ?
> Again, I agree. Let''s leave this for now, but I''ll definitely
investigate this later.

Thanks a lot for looking into this, :-)
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Jackson

2012-Jun-08 15:37 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 05 of 11] libxl: add a new
Array type to the IDL"):> On Fri, 2012-06-08 at 15:03 +0100, Ian Jackson wrote:
> > Can you point me at your git tree so I can easily clone it and see
> > what the output from this new IDL generator looks like ?
> 
> Well, it''s an hg tree + patchqueue that I don''t have
anywhere public. I
> just put a fresh clone of xen-unstable plus this patch here (on a Citrix
> NIS machine):
> 
>  ~dariof/Repositories/xen-unstable-array.hg
> 
> is that fine for now?
Uh, can you please push your whole queue there ?  In order to review
the IDL generator output I naturally need a user of the new array
type.

Ian.

Dario Faggioli

2012-Jun-08 15:52 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Fri, 2012-06-08 at 16:37 +0100, Ian Jackson wrote:> > Well, it''s an hg tree + patchqueue that I don''t have
anywhere public. I
> > just put a fresh clone of xen-unstable plus this patch here (on a
Citrix
> > NIS machine):
> > 
> >  ~dariof/Repositories/xen-unstable-array.hg
> > 
> > is that fine for now?
> 
> Uh, can you please push your whole queue there ?  In order to review
> the IDL generator output I naturally need a user of the new array
> type.
> Gah, right! The whole queue is under deep restructuring, but I should
have pushed the patch that constitutes the direct user of the new
type...

Thanks,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Jackson

2012-Jun-08 15:57 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 05 of 11] libxl: add a new
Array type to the IDL"):> Gah, right! The whole queue is under deep restructuring, but I should
> have pushed the patch that constitutes the direct user of the new
> type...
I''m happy to wait.

Ian.

Ian Campbell

2012-Jun-12 09:02 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Fri, 2012-06-08 at 16:14 +0100, Dario Faggioli wrote:> On Fri, 2012-06-08 at 15:03 +0100, Ian Jackson wrote:
> > git-commit --author will do this for you and I think git-send-email
> > will then generate appropriate output.
> > 
> Oh, thanks... Do you happen to know what a mercurial equivalent for this
> could be? :-P
Ian J answered for the general commit case but in case you are
interested for the mq case you can put a comment at the top of the patch
(before the changelog) with the same form as what "hg export" gives
you.
e.g.:
        # HG changeset patch
        # User Ian Campbell <ian.campbell@citrix.com>

This makes qpush do the right thing (IIRC, I didn''t try it just
now...).

Ian.

Dario Faggioli

2012-Jun-13 06:59 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Tue, 2012-06-12 at 10:02 +0100, Ian Campbell wrote:> On Fri, 2012-06-08 at 16:14 +0100, Dario Faggioli wrote:
> > On Fri, 2012-06-08 at 15:03 +0100, Ian Jackson wrote:
> > > git-commit --author will do this for you and I think
git-send-email
> > > will then generate appropriate output.
> > > 
> > Oh, thanks... Do you happen to know what a mercurial equivalent for
this
> > could be? :-P
> 
> Ian J answered for the general commit case but in case you are
> interested for the mq case you can put a comment at the top of the patch
> (before the changelog) with the same form as what "hg export"
gives you.
> e.g.:
>         # HG changeset patch
>         # User Ian Campbell <ian.campbell@citrix.com>
> I am, indeed, interested in patchqueues... So thanks to both of you. :-)
> This makes qpush do the right thing (IIRC, I didn''t try it just
now...).
> I see. It seems `hg qrefresh'' also takes a -u argument which is doing
the trick.

Anyway, thanks again,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-Jun-18 12:06 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Wed, 2012-06-13 at 08:59 +0200, Dario Faggioli wrote:
> > Ian J answered for the general commit case but in case you are
> > interested for the mq case you can put a comment at the top of the
patch
> > (before the changelog) with the same form as what "hg
export" gives you.
> > e.g.:
> >         # HG changeset patch
> >         # User Ian Campbell <ian.campbell@citrix.com>
> > 
> I am, indeed, interested in patchqueues... So thanks to both of you. :-)
> 
> > This makes qpush do the right thing (IIRC, I didn''t try it
just now...).
> > 
> I see. It seems `hg qrefresh'' also takes a -u argument which is
doing
> the trick.
> For the record, I tried the above and it seemed to work on my local
copy, but not while sending via patchbomb... :-( I''ll try
IanC''s
suggestion for next time... Sorry. :-(

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Dario Faggioli

2012-Jun-21 14:32 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Wed, 2012-06-13 at 08:59 +0200, Dario Faggioli wrote:> > Ian J answered for the general commit case but in case you are
> > interested for the mq case you can put a comment at the top of the
patch
> > (before the changelog) with the same form as what "hg
export" gives you.
> > e.g.:
> >         # HG changeset patch
> >         # User Ian Campbell <ian.campbell@citrix.com>
> > So, I have this in place, as I did when sent v2. However, as I suppose
this is not doing its job, as I see this:

From: 	Dario Faggioli <raistlinxxx>
To: 	xen-develxxx
Cc: 	...
Subject: 	[PATCH 02 of 10 v2] libxl: add a new Array type to the IDL

Maybe I should remove the --plain option from my patchbomb line, so that
those comments make it to the list?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Ian Campbell

2012-Jun-21 14:35 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Thu, 2012-06-21 at 15:32 +0100, Dario Faggioli wrote:> On Wed, 2012-06-13 at 08:59 +0200, Dario Faggioli wrote:
> > > Ian J answered for the general commit case but in case you are
> > > interested for the mq case you can put a comment at the top of
the patch
> > > (before the changelog) with the same form as what "hg
export" gives you.
> > > e.g.:
> > >         # HG changeset patch
> > >         # User Ian Campbell <ian.campbell@citrix.com>
> > > 
> So, I have this in place, as I did when sent v2. However, as I suppose
> this is not doing its job, as I see this:
> 
> From: 	Dario Faggioli <raistlinxxx>
> To: 	xen-develxxx
> Cc: 	...
> Subject: 	[PATCH 02 of 10 v2] libxl: add a new Array type to the IDL
> 
> Maybe I should remove the --plain option from my patchbomb line, so that
> those comments make it to the list?
That would work, I don''t normally use --plain.

Ian.

Dario Faggioli

2012-Jun-21 14:35 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Wed, 2012-06-13 at 08:59 +0200, Dario Faggioli wrote:> > Ian J answered for the general commit case but in case you are
> > interested for the mq case you can put a comment at the top of the
patch
> > (before the changelog) with the same form as what "hg
export" gives you.
> > e.g.:
> >         # HG changeset patch
> >         # User Ian Campbell <ian.campbell@citrix.com>
> > So, I have this in place, as I did when sent v2. However, as I suppose
this is not doing its job, as I see this:

From: 	Dario Faggioli <raistlinxxx>
To: 	xen-develxxx
Cc: 	...
Subject: 	[PATCH 02 of 10 v2] libxl: add a new Array type to the IDL

Maybe I should remove the --plain option from my patchbomb line, so that
those comments make it to the list?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Ian Jackson

2012-Jun-26 16:28 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Dario Faggioli writes ("[Xen-devel] [PATCH 05 of 11] libxl: add a new Array
type to the IDL"):> And make all the required infrastructure updates to enable this.
> diff --git a/tools/libxl/idl.txt b/tools/libxl/idl.txt
> --- a/tools/libxl/idl.txt
> +++ b/tools/libxl/idl.txt
> @@ -145,12 +145,24 @@ idl.KeyedUnion
>  
>   A subclass of idl.Aggregate which represents the C union type
>   where the currently valid member of the union can be determined based
> - upon another member in the containing type.
> + upon another member in the containing type. An idl.KeyedUnion must
> + always be a member of a containing idl.Aggregate type.
>  
> - The KeyedUnion.keyvar contains an idl.type the member of the
> + The KeyedUnion.keyvar contains an idl.Type the member of the
>   containing type which determines the valid member of the union. The
>   must be an instance of the Enumeration type.
Something seems to have been mangled here in this sentence (although
all your patch does is change the case of the name).
> +idl.Array
> +
> +  A class representing an array or similar elements. An idl.Array must                                   ^^
of.
> +  always be an idl.Field of a containing idl.Aggregate.
> +
> +  idl.Array.elem_type contains an idl.Type which is the type of all
> +  elements of the array.
More natural in English would be, I think, "... the type of each
element of the array".
> +  idl.Array.len_var contains an idl.Field which is added to the parent
> +  idl.Aggregate and will contain the length of the array.
I think you should explicitly say that len_var MUST be a field named
"num_XXX" where the original field is named XXX, as previously stated.

In the future this will probably be enforced by the idl compiler.


Aside from that,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


Thanks,
Ian.

Ian Campbell

2012-Jun-26 16:30 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Tue, 2012-06-26 at 17:28 +0100, Ian Jackson wrote:> Dario Faggioli writes ("[Xen-devel] [PATCH 05 of 11] libxl: add a new
Array type to the IDL"):
> > And make all the required infrastructure updates to enable this.
> 
> > diff --git a/tools/libxl/idl.txt b/tools/libxl/idl.txt
> > --- a/tools/libxl/idl.txt
> > +++ b/tools/libxl/idl.txt
> > @@ -145,12 +145,24 @@ idl.KeyedUnion
> >  
> >   A subclass of idl.Aggregate which represents the C union type
> >   where the currently valid member of the union can be determined
based
> > - upon another member in the containing type.
> > + upon another member in the containing type. An idl.KeyedUnion must
> > + always be a member of a containing idl.Aggregate type.
> >  
> > - The KeyedUnion.keyvar contains an idl.type the member of the
> > + The KeyedUnion.keyvar contains an idl.Type the member of the
> >   containing type which determines the valid member of the union. The
> >   must be an instance of the Enumeration type.
> 
> Something seems to have been mangled here in this sentence (although
> all your patch does is change the case of the name).
> 
> > +idl.Array
> > +
> > +  A class representing an array or similar elements. An idl.Array
must
>                                    ^^
> of.
> 
> > +  always be an idl.Field of a containing idl.Aggregate.
> > +
> > +  idl.Array.elem_type contains an idl.Type which is the type of all
> > +  elements of the array.
> 
> More natural in English would be, I think, "... the type of each
> element of the array".
> 
> > +  idl.Array.len_var contains an idl.Field which is added to the
parent
> > +  idl.Aggregate and will contain the length of the array.
> 
> I think you should explicitly say that len_var MUST be a field named
> "num_XXX" where the original field is named XXX, as previously
stated.
> 
> In the future this will probably be enforced by the idl compiler.
> 
> 
> Aside from that,
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Thanks.

Dario, since this is my patch do you want me to address the above and
resend?
> 
> 
> Thanks,
> Ian.

Dario Faggioli

2012-Jun-26 16:58 UTC

head link

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

On Tue, 2012-06-26 at 17:30 +0100, Ian Campbell wrote:> > Aside from that,
> > 
> > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> Thanks.
> 
> Dario, since this is my patch do you want me to address the above and
> resend?
> As you wish. If that doesn''t bother you too much, I''m fine
with that.
Otherwise, I can do it when resending the whole series...

Let''s say that, if I don''t see the patch on the list when I
come to
resend, I''ll apply these changes myself, ok?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Xen devel - May 2012 - [PATCH 00 of 11] Automatic NUMA placement for xl

[PATCH 00 of 11] Automatic NUMA placement for xl

[PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

[PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

[PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

[PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

[PATCH 05 of 11] libxl: add a new Array type to the IDL

[PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

[PATCH 07 of 11] xen: enhance dump_numa output

[PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

[PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

[PATCH 10 of 11] libxl, xl: heuristics for reordering NUMA placement candidates

[PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Re: [PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Re: [PATCH 07 of 11] xen: enhance dump_numa output

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

Re: [PATCH 07 of 11] xen: enhance dump_numa output

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Re: [PATCH 02 of 11] libxl: abstract libxl_cpumap to just libxl_map

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

Re: [PATCH 01 of 11] libxc: abstract xenctl_cpumap to just xenctl_map

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Re: [PATCH 03 of 11] libxc, libxl: introduce xc_nodemap_t and libxl_nodemap

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

Re: [PATCH 09 of 11] libxl, xl: enable automatic placement of guests on NUMA nodes

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Re: [PATCH 08 of 11] xl: add more NUMA information to `xl info -n''

Re: [PATCH 06 of 11] libxl: introduce libxl_get_numainfo()

Re: [PATCH 04 of 11] libxl: expand the libxl_{cpu, node}map API a bit

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 11 of 11] Some automatic NUMA placement documentation

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL

Re: [PATCH 05 of 11] libxl: add a new Array type to the IDL