Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH v13] claim and its friends for allocating multiple self-ballooning guests
The patch (mmu: Introduce XENMEM_claim_pages (subop of memory ops)) is already in the hypervisor and describes in detail the problem, the solution, and the alternative solutions. This series builds upon that new hypercall to expand the toolstack to use it.

The patches follow the normal code flow: first the patch implementing the two hypercall wrappers, XENMEM_claim_pages and XENMEM_get_outstanding_pages, then the patches to utilize them in libxc. The hypercalls are only used if the toolstack (libxl) sets claim_mode to 1 (true). Then come the toolstack (libxl + xl) patches. They revolve around two different changes:

 1) Add a 'claim_mode=0|1' global configuration value that determines
    whether the claim hypercall should be used as part of guest creation.

 2) As part of 'xl info', output how many pages are claimed by different
    guests. This is more of a diagnostic patch.

Note that these two patches:

 [PATCH 4/6] xc: export outstanding_pages value in xc_dominfo
 [PATCH 5/6] xl: export 'outstanding_pages' value from xcinfo

could very well be squashed together. I don't know whether that is OK with the maintainers or not, so I left them as two separate ones. I am OK with them being squashed.
These patches are also visible at:

 git://xenbits.xen.org/people/konradwilk/xen.git claim.v13

 docs/man/xl.conf.pod.5               | 41 ++++++++++++++++++++++++++++++++++++
 tools/examples/xend-config.sxp       |  5 +++++
 tools/examples/xl.conf               |  6 ++++++
 tools/libxc/xc_dom.h                 |  1 +
 tools/libxc/xc_dom_x86.c             | 12 +++++++++++
 tools/libxc/xc_domain.c              | 31 +++++++++++++++++++++++++++
 tools/libxc/xc_hvm_build_x86.c       | 23 ++++++++++++++++----
 tools/libxc/xenctrl.h                |  7 ++++++
 tools/libxc/xenguest.h               |  2 ++
 tools/libxl/libxl.c                  | 13 ++++++++++++
 tools/libxl/libxl.h                  |  2 +-
 tools/libxl/libxl_dom.c              |  3 ++-
 tools/libxl/libxl_types.idl          |  3 ++-
 tools/libxl/xl.c                     |  5 +++++
 tools/libxl/xl.h                     |  1 +
 tools/libxl/xl_cmdimpl.c             | 26 +++++++++++++++++++++++
 tools/python/xen/lowlevel/xc/xc.c    | 29 ++++++++++++++++---------
 tools/python/xen/xend/XendOptions.py |  8 +++++++
 tools/python/xen/xend/balloon.py     |  4 ++++
 tools/python/xen/xend/image.py       | 13 +++++++++---
 20 files changed, 215 insertions(+), 20 deletions(-)

Dan Magenheimer (2):
  xc: use XENMEM_claim_pages hypercall during guest creation.
  xc: export outstanding_pages value in xc_dominfo structure.

Konrad Rzeszutek Wilk (4):
  xl: Implement XENMEM_claim_pages support via 'claim_mode' global config
  xend: Implement XENMEM_claim_pages support via 'claim-mode' global config
  xl: export 'outstanding_pages' value from xcinfo
  xl: 'xl info' print outstanding claims if enabled (claim_mode=1 in xl.conf)
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 1/6] xc: use XENMEM_claim_pages hypercall during guest creation.
From: Dan Magenheimer <dan.magenheimer@oracle.com>

We add an extra parameter to the structures passed to the PV routine (arch_setup_meminit) and HVM routine (setup_guest) that determines whether the claim hypercall is to be done. The content of 'claim_enabled' is defined as an 'int' in case the hypercall expands in the future with extra flags (for example for per-NUMA allocation). For right now the proper values are: 0 to disable it or 1 to enable it.

If the hypervisor does not support this function, xc_domain_claim_pages and xc_domain_get_outstanding_pages will silently return 0 (and set errno to zero).

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v2: Updated per Ian's recommendations]
[v3: Added support for out-of-sync hypervisor]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_dom.h           |  1 +
 tools/libxc/xc_dom_x86.c       | 12 ++++++++++++
 tools/libxc/xc_domain.c        | 30 ++++++++++++++++++++++++++++++
 tools/libxc/xc_hvm_build_x86.c | 23 +++++++++++++++++++----
 tools/libxc/xenctrl.h          |  6 ++++++
 tools/libxc/xenguest.h         |  2 ++
 6 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_dom.h b/tools/libxc/xc_dom.h
index 779b9d4..ac36600 100644
--- a/tools/libxc/xc_dom.h
+++ b/tools/libxc/xc_dom.h
@@ -135,6 +135,7 @@ struct xc_dom_image {
     domid_t guest_domid;
     int8_t vhpt_size_log2; /* for IA64 */
     int8_t superpages;
+    int claim_enabled; /* 0 by default, 1 enables it */
     int shadow_enabled;

     int xen_version;

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index eb9ac07..d89526d 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -706,6 +706,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
     }
     else
     {
+        /* try to claim pages for early warning of insufficient memory avail */
+        if ( dom->claim_enabled ) {
+            rc = xc_domain_claim_pages(dom->xch, dom->guest_domid,
+                                       dom->total_pages);
+            if ( rc )
+                return rc;
+        }
         /* setup initial p2m */
         for ( pfn = 0; pfn < dom->total_pages; pfn++ )
             dom->p2m_host[pfn] = pfn;
@@ -722,6 +729,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 dom->xch, dom->guest_domid, allocsz,
                 0, 0, &dom->p2m_host[i]);
         }
+
+        /* Ensure no unclaimed pages are left unused.
+         * OK to call if hadn't done the earlier claim call. */
+        (void)xc_domain_claim_pages(dom->xch, dom->guest_domid,
+                                    0 /* cancels the claim */);
     }

     return rc;

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 480ce91..299c907 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -775,6 +775,36 @@ int xc_domain_add_to_physmap(xc_interface *xch,
     return do_memory_op(xch, XENMEM_add_to_physmap, &xatp, sizeof(xatp));
 }

+int xc_domain_claim_pages(xc_interface *xch,
+                          uint32_t domid,
+                          unsigned long nr_pages)
+{
+    int err;
+    struct xen_memory_reservation reservation = {
+        .nr_extents   = nr_pages,
+        .extent_order = 0,
+        .mem_flags    = 0, /* no flags */
+        .domid        = domid
+    };
+
+    set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);
+
+    err = do_memory_op(xch, XENMEM_claim_pages, &reservation, sizeof(reservation));
+    /* Ignore it if the hypervisor does not support the call. */
+    if (err == -1 && errno == ENOSYS)
+        err = errno = 0;
+    return err;
+}
+
+unsigned long xc_domain_get_outstanding_pages(xc_interface *xch)
+{
+    long ret = do_memory_op(xch, XENMEM_get_outstanding_pages, NULL, 0);
+
+    /* Ignore it if the hypervisor does not support the call. */
+    if (ret == -1 && errno == ENOSYS)
+        ret = errno = 0;
+    return ret;
+}
+
 int xc_domain_populate_physmap(xc_interface *xch,
                                uint32_t domid,
                                unsigned long nr_extents,

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 3b5d777..ab33a7f 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -252,6 +252,7 @@ static int setup_guest(xc_interface *xch,
     unsigned long stat_normal_pages = 0, stat_2mb_pages = 0,
         stat_1gb_pages = 0;
     int pod_mode = 0;
+    int claim_enabled = args->claim_enabled;

     if ( nr_pages > target_pages )
         pod_mode = XENMEMF_populate_on_demand;
@@ -329,6 +330,16 @@ static int setup_guest(xc_interface *xch,
             xch, dom, 0xa0, 0, pod_mode, &page_array[0x00]);
         cur_pages = 0xc0;
         stat_normal_pages = 0xc0;
+
+        /* try to claim pages for early warning of insufficient memory available */
+        if ( claim_enabled ) {
+            rc = xc_domain_claim_pages(xch, dom, nr_pages - cur_pages);
+            if ( rc != 0 )
+            {
+                PERROR("Could not allocate memory for HVM guest as we cannot claim memory!");
+                goto error_out;
+            }
+        }

         while ( (rc == 0) && (nr_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
@@ -506,12 +517,16 @@ static int setup_guest(xc_interface *xch,
         munmap(page0, PAGE_SIZE);
     }

-    free(page_array);
-    return 0;
-
+    rc = 0;
+    goto out;
 error_out:
+    rc = -1;
+ out:
+    /* ensure no unclaimed pages are left unused */
+    xc_domain_claim_pages(xch, dom, 0 /* cancels the claim */);
+
     free(page_array);
-    return -1;
+    return rc;
 }

 /* xc_hvm_build:

diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 32122fd..e695456 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1129,6 +1129,12 @@ int xc_domain_populate_physmap_exact(xc_interface *xch,
                                      unsigned int mem_flags,
                                      xen_pfn_t *extent_start);

+int xc_domain_claim_pages(xc_interface *xch,
+                          uint32_t domid,
+                          unsigned long nr_pages);
+
+unsigned long xc_domain_get_outstanding_pages(xc_interface *xch);
+
 int xc_domain_memory_exchange_pages(xc_interface *xch,
                                     int domid,
                                     unsigned long nr_in_extents,

diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
index 7d4ac33..4714bd2 100644
--- a/tools/libxc/xenguest.h
+++ b/tools/libxc/xenguest.h
@@ -231,6 +231,8 @@ struct xc_hvm_build_args {
     /* Extra SMBIOS structures passed to HVMLOADER */
     struct xc_hvm_firmware_module smbios_module;
+    /* Whether to use claim hypercall (1 - enable, 0 - disable). */
+    int claim_enabled;
 };

 /**
-- 
1.8.0.2
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 2/6] xl: Implement XENMEM_claim_pages support via ''claim_mode'' global config
The XENMEM_claim_pages hypercall operates per domain, but it should be used system-wide. As such this patch introduces a global configuration option, 'claim_mode', that is disabled by default. If this option is enabled, then when a guest is created there is a guarantee that memory is available for the guest. This is a particularly acute problem on hosts with memory over-provisioned guests that use tmem and have self-balloon enabled (which is the default option for them). The self-balloon mechanism can deflate/inflate the balloon quickly, so the amount of free memory (which 'xl info' can show) is stale the moment it is printed. When claim is enabled, a reservation for the amount of memory ('memory' in the guest config) is set, which is then reduced as the domain's memory is populated and eventually reaches zero.

If the reservation cannot be met, guest creation fails immediately instead of taking seconds/minutes (depending on the size of the guest) while the guest is populated.

Note that to enable tmem-type guests, one needs to provide 'tmem' on the Xen hypervisor command line as well as on the Linux kernel command line.

There are two boolean options:

 (0) No claim is made. Memory population during guest creation will be
     attempted as normal and may fail due to memory exhaustion.

 (1) Normal memory and the freeable pool of ephemeral pages (tmem) are used
     when calculating whether there is enough memory free to launch a guest.
     This guarantees immediate feedback whether the guest can be launched due
     to memory exhaustion (which can take a long time to find out if launching
     massively huge guests), and allows guests to be started in parallel.
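Since the option is global, enabling it host-wide is a one-line change to xl.conf. A sketch based on the example file added by this patch (the path is the usual install location, an assumption here):

```
# /etc/xen/xl.conf
# Fail guest creation immediately if the memory claim cannot be staked,
# instead of timing out partway through population.
claim_mode=1
```

Every subsequent 'xl create' then stakes a claim for the guest's 'memory' amount before populating it.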
[v1: Removed own claim_mode type, using just bool, improved docs, all per Ian's suggestion]
[v2: Updated the comments]
[v3: Rebase on top 733b9c524dbc2bec318bfc3588ed1652455d30ec (xl: add vif.default.script)]
[v4: Fixed up comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 docs/man/xl.conf.pod.5      | 41 +++++++++++++++++++++++++++++++++++++++++
 tools/examples/xl.conf      |  6 ++++++
 tools/libxl/libxl.h         |  1 -
 tools/libxl/libxl_dom.c     |  3 ++-
 tools/libxl/libxl_types.idl |  2 +-
 tools/libxl/xl.c            |  5 +++++
 tools/libxl/xl.h            |  1 +
 tools/libxl/xl_cmdimpl.c    |  2 ++
 8 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
index 7b9fcac..bb92bfc 100644
--- a/docs/man/xl.conf.pod.5
+++ b/docs/man/xl.conf.pod.5
@@ -108,6 +108,47 @@ Configures the name of the first block device to be used for temporary
 block device allocations by the toolstack.
 The default choice is "xvda".

+=item B<claim_mode=BOOLEAN>
+
+If this option is enabled then when a guest is created there will be a
+guarantee that there is memory available for the guest. This is a particularly
+acute problem on hosts with memory over-provisioned guests that use
+tmem and have self-balloon enabled (which is the default option).
+The self-balloon mechanism can deflate/inflate the balloon quickly and the
+amount of free memory (which C<xl info> can show) is stale the moment
+it is printed. When claim is enabled a reservation for the amount of
+memory (see 'memory' in xl.conf(5)) is set, which is then reduced as the
+domain's memory is populated and eventually reaches zero.
+
+If the reservation cannot be met the guest creation fails immediately
+instead of taking seconds/minutes (depending on the size of the guest)
+while the guest is populated.
+
+Note that to enable tmem type guests, one needs to provide C<tmem> on the
+Xen hypervisor argument as well as on the Linux kernel command line.
+
+Note that the claim call is not attempted if the C<superpages> option is
+used in the guest config (see xl.cfg(5)).
+
+Default: C<0>
+
+=over 4
+
+=item C<0>
+
+No claim is made. Memory population during guest creation will be
+attempted as normal and may fail due to memory exhaustion.
+
+=item C<1>
+
+Normal memory and freeable pool of ephemeral pages (tmem) is used when
+calculating whether there is enough memory free to launch a guest.
+This guarantees immediate feedback whether the guest can be launched due
+to memory exhaustion (which can take a long time to find out if launching
+massively huge guests).
+
+=back
+
 =back

 =head1 SEE ALSO

diff --git a/tools/examples/xl.conf b/tools/examples/xl.conf
index b0caa32..f386bb9 100644
--- a/tools/examples/xl.conf
+++ b/tools/examples/xl.conf
@@ -26,3 +26,9 @@

 # default bridge device to use with vif-bridge hotplug scripts
 #vif.default.bridge="xenbr0"
+
+# Reserve a claim of memory when launching a guest. This guarantees immediate
+# feedback whether the guest can be launched due to memory exhaustion
+# (which can take a long time to find out if launching huge guests).
+# see xl.conf(5) for details.
+#claim_mode=0

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 030aa86..c4ad58b 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -579,7 +579,6 @@ int libxl_wait_for_free_memory(libxl_ctx *ctx, uint32_t domid, uint32_t memory_k
 /* wait for the memory target of a domain to be reached */
 int libxl_wait_for_memory_target(libxl_ctx *ctx, uint32_t domid, int wait_secs);

-
 int libxl_vncviewer_exec(libxl_ctx *ctx, uint32_t domid, int autopass);
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num, libxl_console_type type);
 /* libxl_primary_console_exec finds the domid and console number

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index de555ee..cffa0d6 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -367,6 +367,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     dom->console_domid = state->console_domid;
     dom->xenstore_evtchn = state->store_port;
     dom->xenstore_domid = state->store_domid;
+    dom->claim_enabled = libxl_defbool_val(info->claim_mode);

     if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
         LOGE(ERROR, "xc_dom_boot_xen_init failed");
@@ -601,7 +602,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
      */
     args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
     args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
-
+    args.claim_enabled = libxl_defbool_val(info->claim_mode);
     if (libxl__domain_firmware(gc, info, &args)) {
         LOG(ERROR, "initializing domain firmware failed");
         goto out;

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f3c212b..0f1f118 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -293,7 +293,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("ioports", Array(libxl_ioport_range, "num_ioports")),
     ("irqs", Array(uint32, "num_irqs")),
     ("iomem", Array(libxl_iomem_range, "num_iomem")),
-
+    ("claim_mode", libxl_defbool),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware", string),
                                        ("bios", libxl_bios_type),

diff --git a/tools/libxl/xl.c b/tools/libxl/xl.c
index 4c598db..92716d0 100644
--- a/tools/libxl/xl.c
+++ b/tools/libxl/xl.c
@@ -45,6 +45,7 @@ char *default_vifscript = NULL;
 char *default_bridge = NULL;
 char *default_gatewaydev = NULL;
 enum output_format default_output_format = OUTPUT_FORMAT_JSON;
+int global_claim_mode = 0;

 static xentoollog_level minmsglevel = XTL_PROGRESS;

@@ -134,6 +135,10 @@ static void parse_global_config(const char *configfile,
     }
     if (!xlu_cfg_get_string (config, "blkdev_start", &buf, 0))
         blkdev_start = strdup(buf);
+
+    if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
+        global_claim_mode = 1;
+
     xlu_cfg_destroy(config);
 }

diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index b881f92..4db4aac 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -145,6 +145,7 @@ int xl_child_pid(xlchildnum); /* returns 0 if child struct is not in use */
 extern int autoballoon;
 extern int run_hotplug_scripts;
 extern int dryrun_only;
+extern int global_claim_mode;
 extern char *lockfile;
 extern char *default_vifscript;
 extern char *default_bridge;

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 2d40f8f..880905e 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -757,6 +757,8 @@ static void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
         b_info->max_memkb = l * 1024;

+    libxl_defbool_set(&b_info->claim_mode, global_claim_mode);
+
     if (xlu_cfg_get_string (config, "on_poweroff", &buf, 0))
         buf = "destroy";
     if (!parse_action_on_shutdown(buf, &d_config->on_poweroff)) {
-- 
1.8.0.2
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 3/6] xend: Implement XENMEM_claim_pages support via ''claim-mode'' global config
The XENMEM_claim_pages hypercall operates per domain, but it should be used system-wide. As such this patch introduces a global configuration option, 'claim-mode', that is disabled by default. If this option is enabled, then when a guest is created there is a guarantee that memory is available for the guest. This is a particularly acute problem on hosts with memory over-provisioned guests that use tmem and have self-balloon enabled (which is the default option for them). The self-balloon mechanism can deflate/inflate the balloon quickly, so the amount of free memory (which 'xm info' can show) is stale the moment it is printed. When claim is enabled, a reservation for the amount of memory ('memory' in the guest config) is set, which is then reduced as the domain's memory is populated and eventually reaches zero.

If the reservation cannot be met, guest creation fails immediately instead of taking seconds/minutes (depending on the size of the guest) while the guest is populated.

Note that to enable tmem-type guests, one needs to provide 'tmem' on the Xen hypervisor command line as well as on the Linux kernel command line.

There are two boolean options:

 (false) No claim is made. Memory population during guest creation will be
         attempted as normal and may fail due to memory exhaustion.

 (true)  Normal memory and the freeable pool of ephemeral pages (tmem) are
         used when calculating whether there is enough memory free to launch
         a guest. This guarantees immediate feedback whether the guest can be
         launched due to memory exhaustion (which can take a long time to find
         out if launching massively huge guests), and allows guests to be
         started in parallel.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/examples/xend-config.sxp       |  5 +++++
 tools/python/xen/lowlevel/xc/xc.c    | 29 +++++++++++++++++++----------
 tools/python/xen/xend/XendOptions.py |  8 ++++++++
 tools/python/xen/xend/balloon.py     |  4 ++++
 tools/python/xen/xend/image.py       | 13 ++++++++++---
 5 files changed, 46 insertions(+), 13 deletions(-)

diff --git a/tools/examples/xend-config.sxp b/tools/examples/xend-config.sxp
index 0896a27..4d5816c 100644
--- a/tools/examples/xend-config.sxp
+++ b/tools/examples/xend-config.sxp
@@ -302,3 +302,8 @@
 # command lsscsi, e.g. ('16:0:0:0' '15:0')
 # (pscsi-device-mask ('*'))

+# Reserve a claim of memory when launching a guest. This guarantees immediate
+# feedback whether the guest can be launched due to memory exhaustion
+# (which can take a long time to find out if launching huge guests).
+# see xl.conf(5) for details.
+# (claim-mode no)

diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index e220f68..3540313 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -455,6 +455,7 @@ static PyObject *pyxc_linux_build(XcObject *self,
     int store_evtchn, console_evtchn;
     int vhpt = 0;
     int superpages = 0;
+    int claim = 0;
     unsigned int mem_mb;
     unsigned long store_mfn = 0;
     unsigned long console_mfn = 0;
@@ -467,14 +468,15 @@ static PyObject *pyxc_linux_build(XcObject *self,
                                 "console_evtchn", "image",
                                 /* optional */
                                 "ramdisk", "cmdline", "flags",
-                                "features", "vhpt", "superpages", NULL };
+                                "features", "vhpt", "superpages",
+                                "claim_mode", NULL };

-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiis|ssisii", kwd_list,
+    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiis|ssisiii", kwd_list,
                                       &domid, &store_evtchn, &mem_mb,
                                       &console_evtchn, &image,
                                       /* optional */
                                       &ramdisk, &cmdline, &flags,
-                                      &features, &vhpt, &superpages) )
+                                      &features, &vhpt, &superpages, &claim) )
         return NULL;

     xc_dom_loginit(self->xc_handle);
@@ -486,6 +488,8 @@ static PyObject *pyxc_linux_build(XcObject *self,

     dom->superpages = superpages;

+    dom->claim_enabled = claim;
+
     if ( xc_dom_linux_build(self->xc_handle, dom, domid, mem_mb, image,
                             ramdisk, flags, store_evtchn, &store_mfn,
                             console_evtchn, &console_mfn) != 0 ) {
@@ -944,16 +948,16 @@ static PyObject *pyxc_hvm_build(XcObject *self,
 #endif
     int i;
     char *image;
-    int memsize, target=-1, vcpus = 1, acpi = 0, apic = 1;
+    int memsize, target=-1, vcpus = 1, acpi = 0, apic = 1, claim = 0;
     PyObject *vcpu_avail_handle = NULL;
     uint8_t vcpu_avail[(HVM_MAX_VCPUS + 7)/8];
-
+    struct xc_hvm_build_args pargs = {};
     static char *kwd_list[] = { "domid", "memsize", "image", "target", "vcpus",
-                                "vcpu_avail", "acpi", "apic", NULL };
-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iis|iiOii", kwd_list,
+                                "vcpu_avail", "acpi", "apic", "claim_mode", NULL };
+    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iis|iiOiii", kwd_list,
                                       &dom, &memsize, &image, &target, &vcpus,
-                                      &vcpu_avail_handle, &acpi, &apic) )
+                                      &vcpu_avail_handle, &acpi, &apic, &claim) )
         return NULL;

     memset(vcpu_avail, 0, sizeof(vcpu_avail));
@@ -984,8 +988,13 @@ static PyObject *pyxc_hvm_build(XcObject *self,
     if ( target == -1 )
         target = memsize;

-    if ( xc_hvm_build_target_mem(self->xc_handle, dom, memsize,
-                                 target, image) != 0 )
+    memset(&pargs, 0, sizeof(struct xc_hvm_build_args));
+    pargs.mem_size = (uint64_t)memsize << 20;
+    pargs.mem_target = (uint64_t)target << 20;
+    pargs.image_file_name = image;
+    pargs.claim_enabled = claim;
+
+    if ( xc_hvm_build(self->xc_handle, dom, &pargs) != 0 )
         return pyxc_error_to_exception(self->xc_handle);

 #if !defined(__ia64__)

diff --git a/tools/python/xen/xend/XendOptions.py b/tools/python/xen/xend/XendOptions.py
index cc6f38e..275efdc 100644
--- a/tools/python/xen/xend/XendOptions.py
+++ b/tools/python/xen/xend/XendOptions.py
@@ -154,6 +154,12 @@ class XendOptions:
     use loose check automatically if necessary."""
     pci_dev_assign_strict_check_default = True

+    """Reserve a claim of memory when launching a guest. This guarantees
+    immediate feedback whether the guest can be launched due to memory
+    exhaustion (which can take a long time to find out if launching huge
+    guests)."""
+    claim_mode_default = False
+
     def __init__(self):
         self.configure()

@@ -436,6 +442,8 @@ class XendOptions:
     def get_pscsi_device_mask(self):
         return self.get_config_value("pscsi-device-mask", self.xend_pscsi_device_mask)

+    def get_claim_mode(self):
+        return self.get_config_bool("claim-mode", self.claim_mode_default)

 class XendOptionsFile(XendOptions):

diff --git a/tools/python/xen/xend/balloon.py b/tools/python/xen/xend/balloon.py
index 89965d7..dcd4caa 100644
--- a/tools/python/xen/xend/balloon.py
+++ b/tools/python/xen/xend/balloon.py
@@ -116,6 +116,10 @@ def free(need_mem, dominfo):
     dom0_ballooning = xoptions.get_enable_dom0_ballooning()
     dom0_alloc = get_dom0_current_alloc()

+    # let it handle it all via the claim option
+    if (xoptions.get_claim_mode() and not dom0_ballooning):
+        return
+
     retries = 0
     sleep_time = SLEEP_TIME_GROWTH
     new_alloc = 0

diff --git a/tools/python/xen/xend/image.py b/tools/python/xen/xend/image.py
index 832c168..7d3716d 100644
--- a/tools/python/xen/xend/image.py
+++ b/tools/python/xen/xend/image.py
@@ -84,6 +84,7 @@ class ImageHandler:

     ostype = None
     superpages = 0
+    claim_mode = 0
     memory_sharing = 0

     def __init__(self, vm, vmConfig):
@@ -95,7 +96,9 @@ class ImageHandler:
         self.kernel = None
         self.ramdisk = None
         self.cmdline = None
-
+        xoptions = XendOptions.instance()
+        if (xoptions.get_claim_mode() == True):
+            self.claim_mode = 1
         self.configure(vmConfig)

     def configure(self, vmConfig):
@@ -729,6 +732,7 @@ class LinuxImageHandler(ImageHandler):
         log.debug("features       = %s", self.vm.getFeatures())
         log.debug("flags          = %d", self.flags)
         log.debug("superpages     = %d", self.superpages)
+        log.debug("claim_mode     = %d", self.claim_mode)

         if arch.type == "ia64":
             log.debug("vhpt          = %d", self.vhpt)
@@ -742,7 +746,8 @@ class LinuxImageHandler(ImageHandler):
                              features       = self.vm.getFeatures(),
                              flags          = self.flags,
                              vhpt           = self.vhpt,
-                              superpages     = self.superpages)
+                              superpages     = self.superpages,
+                              claim_mode     = self.claim_mode)

     def getBitSize(self):
         return xc.getBitSize(image    = self.kernel,
@@ -956,6 +961,7 @@ class HVMImageHandler(ImageHandler):
         log.debug("vcpu_avail  = %li", self.vm.getVCpuAvail())
         log.debug("acpi        = %d", self.acpi)
         log.debug("apic        = %d", self.apic)
+        log.debug("claim_mode  = %d", self.claim_mode)

         rc = xc.hvm_build(domid          = self.vm.getDomid(),
                           image          = self.loader,
@@ -964,7 +970,8 @@ class HVMImageHandler(ImageHandler):
                           vcpus          = self.vm.getVCpuCount(),
                           vcpu_avail     = self.vm.getVCpuAvail(),
                           acpi           = self.acpi,
-                          apic           = self.apic)
+                          apic           = self.apic,
+                          claim_mode     = self.claim_mode)

         rc['notes'] = { 'SUSPEND_CANCEL': 1 }

         rc['store_mfn'] = xc.hvm_get_param(self.vm.getDomid(),
-- 
1.8.0.2
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 4/6] xc: export outstanding_pages value in xc_dominfo structure.
From: Dan Magenheimer <dan.magenheimer@oracle.com>

This patch provides the value of the currently outstanding pages claimed for a specific domain. This value influences the global outstanding claims value (see the patch "xl: 'xl info' print outstanding claims if enabled") returned via the xc_domain_get_outstanding_pages hypercall. The per-domain value decrements as memory is populated for the guest and eventually reaches zero.

This patch is necessary for the "xl: export 'outstanding_pages' value from xcinfo" patch.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v2: s/unclaimed_pages/outstanding_pages/ per Tim's suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_domain.c | 1 +
 tools/libxc/xenctrl.h   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 299c907..1676bd7 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -234,6 +234,7 @@ int xc_domain_getinfo(xc_interface *xch,
     info->ssidref = domctl.u.getdomaininfo.ssidref;
     info->nr_pages = domctl.u.getdomaininfo.tot_pages;
+    info->nr_outstanding_pages = domctl.u.getdomaininfo.outstanding_pages;
     info->nr_shared_pages = domctl.u.getdomaininfo.shr_pages;
     info->nr_paged_pages = domctl.u.getdomaininfo.paged_pages;
     info->max_memkb = domctl.u.getdomaininfo.max_pages << (PAGE_SHIFT-10);

diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index e695456..2a4d4df 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -364,6 +364,7 @@ typedef struct xc_dominfo {
         hvm:1, debugged:1;
     unsigned int  shutdown_reason; /* only meaningful if shutdown==1 */
     unsigned long nr_pages; /* current number, not maximum */
+    unsigned long nr_outstanding_pages;
     unsigned long nr_shared_pages;
     unsigned long nr_paged_pages;
     unsigned long shared_info_frame;
-- 
1.8.0.2
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 5/6] xl: export ''outstanding_pages'' value from xcinfo
This patch provides the value of the currently outstanding pages claimed for a specific domain. This value influences the global outstanding claims value (see the patch "xl: 'xl info' print outstanding claims if enabled") returned via the xc_domain_get_outstanding_pages hypercall. The per-domain value decrements as memory is populated for the guest and eventually reaches zero. With this patch it is possible to utilize this field.

Acked-by: Ian Campbell <ian.campbell@citrix.com>
[v2: s/unclaimed/outstanding/ per Tim's suggestion]
[v3: Don't use SXP printout file per Ian's suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxl/libxl.c         | 1 +
 tools/libxl/libxl_types.idl | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 572c2c6..1f5dc3f 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -528,6 +528,7 @@ static void xcinfo2xlinfo(const xc_domaininfo_t *xcinfo,
     else
         xlinfo->shutdown_reason = ~0;

+    xlinfo->outstanding_memkb = PAGE_TO_MEMKB(xcinfo->outstanding_pages);
     xlinfo->current_memkb = PAGE_TO_MEMKB(xcinfo->tot_pages);
     xlinfo->shared_memkb = PAGE_TO_MEMKB(xcinfo->shr_pages);
     xlinfo->paged_memkb = PAGE_TO_MEMKB(xcinfo->paged_pages);

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 0f1f118..be8cea2 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -196,6 +196,7 @@ libxl_dominfo = Struct("dominfo",[
     # Otherwise set to a value guaranteed not to clash with any valid
     # LIBXL_SHUTDOWN_REASON_* constant.
     ("shutdown_reason", libxl_shutdown_reason),
+    ("outstanding_memkb", MemKB),
     ("current_memkb", MemKB),
     ("shared_memkb", MemKB),
     ("paged_memkb", MemKB),
-- 
1.8.0.2
Konrad Rzeszutek Wilk
2013-Mar-27 20:55 UTC
[PATCH 6/6] xl: ''xl info'' print outstanding claims if enabled (claim_mode=1 in xl.conf)
This patch provides the value of the currently outstanding pages claimed for all domains. This is a total global value that influences the hypervisor's memory-management system. When a claim call is made, a reservation for a specific amount of pages is set, and a global value is incremented. This global value is then reduced as each domain's memory is populated and eventually reaches zero. The toolstack can also choose to set a domain's claim to zero, which cancels the reservation and decrements the global value by the amount of the claim that has not yet been satisfied.

If the reservation cannot be met, guest creation fails immediately instead of taking seconds or minutes (depending on the size of the guest) while the toolstack populates memory. See the patch "xl: Implement XENMEM_claim_pages support via 'claim_mode' global config" for details on how it is implemented.

The value fluctuates quite often, so it is stale by the time it reaches user-space; however, it is useful for diagnostic purposes. It is only printed when the global "claim_mode" option in xl.conf(5) is set to enabled (1).
[v1: s/unclaimed/outstanding/]
[v2: Made libxl_get_claiminfo return just MemKB, suggested by Ian]
[v3: Made libxl_get_claiminfo return MemMB to conform to the other values printed]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxl/libxl.c      | 12 ++++++++++++
 tools/libxl/libxl.h      |  1 +
 tools/libxl/xl_cmdimpl.c | 24 ++++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 1f5dc3f..5a91d66 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4058,6 +4058,18 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
     return ret;
 }

+uint64_t libxl_get_claiminfo(libxl_ctx *ctx)
+{
+    long l;
+
+    l = xc_domain_get_outstanding_pages(ctx->xch);
+    if (l < 0)
+        return l;
+
+    /* In MB */
+    return (l >> 8);
+}
+
 const libxl_version_info* libxl_get_version_info(libxl_ctx *ctx)
 {
     union {

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c4ad58b..5dab24b 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -579,6 +579,7 @@ int libxl_wait_for_free_memory(libxl_ctx *ctx, uint32_t domid, uint32_t memory_k
 /* wait for the memory target of a domain to be reached */
 int libxl_wait_for_memory_target(libxl_ctx *ctx, uint32_t domid, int wait_secs);

+uint64_t libxl_get_claiminfo(libxl_ctx *ctx);
 int libxl_vncviewer_exec(libxl_ctx *ctx, uint32_t domid, int autopass);
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num, libxl_console_type type);
 /* libxl_primary_console_exec finds the domid and console number

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 880905e..a5f0f56 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4650,6 +4650,29 @@ static void output_topologyinfo(void)
     return;
 }

+static void output_claim(void)
+{
+    long l;
+
+    /*
+     * Note that xl.c (which calls us) has already read the
+     * 'claim_mode' value from the global configuration.
+     */
+    if (!global_claim_mode)
+        return;
+
+    l = libxl_get_claiminfo(ctx);
+    if (l < 0) {
+        fprintf(stderr, "libxl_get_claiminfo failed. errno: %d (%s)\n",
+                errno, strerror(errno));
+        return;
+    }
+
+    printf("outstanding_claims     : %ld\n", l);
+
+    return;
+}
+
 static void print_info(int numa)
 {
     output_nodeinfo();
@@ -4660,6 +4683,7 @@ static void print_info(int numa)
         output_topologyinfo();
         output_numainfo();
     }
+    output_claim();

     output_xeninfo();
-- 
1.8.0.2
Ian Jackson
2013-Mar-28 16:17 UTC
Re: [PATCH v13] claim and its friends for allocating multiple self-ballooning guests
Konrad Rzeszutek Wilk writes ("[PATCH v13] claim and its friends for allocating multiple self-ballooning guests"):
> The patch (mmu: Introduce XENMEM_claim_pages (subop of memory ops)
> is already in the hypervisor and described in details the
> problem/solution/alternative solutions. This builds upon that new
> hypercall to expand the toolstack to use it.
...
> These patches are also visible at:
> git://xenbits.xen.org/people/konradwilk/xen.git claim.v13

Thanks for making that convenient. I'll review the individual patches
in a moment.

Ian.
Ian Jackson
2013-Mar-28 16:23 UTC
Re: [PATCH 1/6] xc: use XENMEM_claim_pages hypercall during guest creation.
Konrad Rzeszutek Wilk writes ("[PATCH 1/6] xc: use XENMEM_claim_pages hypercall during guest creation."):
> We add an extra parameter to the structures passed to the
> PV routine (arch_setup_meminit) and HVM routine (setup_guest)
> that determines whether the claim hypercall is to be done.

This looks plausible to me, except that you seem to have missed a
comment of Ian Campbell's on the hypercall buffers.

> +int xc_domain_claim_pages(xc_interface *xch,
> +                          uint32_t domid,
> +                          unsigned long nr_pages)
> +{
> +    int err;
> +    struct xen_memory_reservation reservation = {
> +        .nr_extents   = nr_pages,
> +        .extent_order = 0,
> +        .mem_flags    = 0, /* no flags */
> +        .domid        = domid
> +    };
> +
> +    set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);

In response to which Ian C wrote in
<1363170195.32410.124.camel@zakaz.uk.xensource.com>:

    This is unused? I think you just need:
        set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);
    and drop the declaration of the bounce above.

    (personally I think a new arg struct for this subop would have been
    more obvious than forcing it into the reservation struct, but what's
    done is done)

thanks,
Ian.
Ian Jackson
2013-Mar-28 16:39 UTC
Re: [PATCH 2/6] xl: Implement XENMEM_claim_pages support via ''claim_mode'' global config
Konrad Rzeszutek Wilk writes ("[PATCH 2/6] xl: Implement XENMEM_claim_pages support via 'claim_mode' global config"):
> The XENMEM_claim_pages hypercall operates per domain and it should be
> used system wide. As such this patch introduces a global configuration
> option 'claim_mode' that by default is disabled.

This mostly looks good to me.

> +=item B<claim_mode=BOOLEAN>
> +
> +If this option is enabled then when a guest is created there will be an
> +guarantee that there is memory available for the guest. This is an particularly

Sorry to be picky, but can I ask you to wrap this document to 70-75
characters? At this width (looks like exactly 80) it inevitably
generates wrap damage when a patch or quoted code is shown in an
80-column window.

> +Note that to enable tmem type guest, one need to provide C<tmem> on the
> +Xen hypervisor argument and as well on the Linux kernel command line.

"to enable tmem type guest" - shouldn't that be "guests"? And "one
needs"?

> +    if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
> +        global_claim_mode = 1;

This should set global_claim_mode to something depending on l, not 1,
I think?

And perhaps global_claim_mode should be a libxl_defbool, so that we
can inherit the libxl default more directly? At the moment the
default is set in xl and again in libxl, which I think is not ideal.
It's better to try to set the default only in one place.

Also I think you need to call libxl_defbool_setdefault somewhere in
libxl, probably in libxl__domain_create_info_setdefault.

Is there some reason why this variable is called "global_claim_mode"
and not just "claim_mode"? Other globals in xl aren't marked in this
way.

Ian.
Ian Jackson
2013-Mar-28 16:41 UTC
Re: [PATCH 3/6] xend: Implement XENMEM_claim_pages support via ''claim-mode'' global config
Konrad Rzeszutek Wilk writes ("[PATCH 3/6] xend: Implement XENMEM_claim_pages support via 'claim-mode' global config"):
> The XENMEM_claim_pages hypercall operates per domain and it should be
> used system wide. As such this patch introduces a global configuration
> option 'claim-mode' that by default is disabled.

I think at this stage we aren't accepting new features in xend.

Feel free to consult the other maintainers/committers if you think we
need to make an exception here.

On the plus side this means that the xend code is not changing much
and carrying a local patch ought to be straightforward for those
people still using xend.

Ian.
Ian Jackson
2013-Mar-28 16:43 UTC
Re: [PATCH 4/6] xc: export outstanding_pages value in xc_dominfo structure.
Konrad Rzeszutek Wilk writes ("[PATCH 4/6] xc: export outstanding_pages value in xc_dominfo structure."):
> This patch provides the value of the currently outstanding pages
> claimed for a specific domain. This is a value that influences
> the global outstanding claims value (See patch: "xl: 'xl info'
> print outstanding claims if enabled") returned via
> xc_domain_get_outstanding_pages hypercall. This domain value
> decrements as the memory is populated for the guest and
> eventually reaches zero.
>
> This patch is necessary for the "xl: export 'outstanding_pages' value
> from xcinfo" patch.

This looks good as far as it goes, but this or a subsequent patch
needs a proper documentation hunk (perhaps several) explaining the
semantics of the new value.

Ian.
Ian Jackson
2013-Mar-28 16:44 UTC
Re: [PATCH 5/6] xl: export ''outstanding_pages'' value from xcinfo
Konrad Rzeszutek Wilk writes ("[PATCH 5/6] xl: export 'outstanding_pages' value from xcinfo"):
> This patch provides the value of the currently outstanding pages
> claimed for a specific domain. This is a value that influences
> the global outstanding claims value (See patch: "xl: 'xl info'
> print outstanding claims if enabled") returned via
> xc_domain_get_outstanding_pages hypercall. This domain value
> decrements as the memory is populated for the guest and
> eventually reaches zero.

Again this patch is fine as far as it goes, but we need an update to
the documentation with a clear and accurate definition of what these
values are.

Ian.
Ian Jackson
2013-Mar-28 16:47 UTC
Re: [PATCH 6/6] xl: ''xl info'' print outstanding claims if enabled (claim_mode=1 in xl.conf)
Konrad Rzeszutek Wilk writes ("[PATCH 6/6] xl: 'xl info' print outstanding claims if enabled (claim_mode=1 in xl.conf)"):
> This patch provides the value of the currently outstanding pages
> claimed for all domains. This is a total global value that influences
> the hypervisor's MM system.
>
> When a claim call is done, a reservation for a specific amount of pages
> is set and also a global value is incremented. This global value is then
> reduced as the domain's memory is populated and eventually reaches zero.
> The toolstack can also choose to set the domain's claim to zero which
> cancels the reservation and decrements the global value by the amount
> of claim that has not been satisfied.

This description is good, but something like it needs to be in the
documentation.

> +uint64_t libxl_get_claiminfo(libxl_ctx *ctx)
> +{
> +    long l;
> +
> +    l = xc_domain_get_outstanding_pages(ctx->xch);
> +    if (l < 0)
> +        return l;
> +
> +    /* In MB */
> +    return (l >> 8);
> +}

libxl functions should return libxl error values, not whatever you got
from libxc. I don't mind very much what the libxc function returns but
the libxl function needs to handle the error properly. See Ian C's
comments on the previous version of this patch.

Thanks,
Ian.
Ian Jackson
2013-Mar-28 16:50 UTC
Re: [PATCH v13] claim and its friends for allocating multiple self-ballooning guests
Konrad Rzeszutek Wilk writes ("[PATCH v13] claim and its friends for allocating multiple self-ballooning guests"):
> The patch (mmu: Introduce XENMEM_claim_pages (subop of memory ops)
> is already in the hypervisor and described in details the
> problem/solution/alternative solutions. This builds upon that new
> hypercall to expand the toolstack to use it.

I've reviewed this series now and it's looking close.

When we get to a final version that I'm happy with I would like to get
an opinion from Ian C before actually committing it. He's away until
after feature freeze, but you've posted this already several times
before the freeze and I expect the final version to make it before
George declares code freeze.

CC George.

Ian.
Konrad Rzeszutek Wilk
2013-Mar-29 13:12 UTC
Re: [PATCH 1/6] xc: use XENMEM_claim_pages hypercall during guest creation.
On Thu, Mar 28, 2013 at 04:23:11PM +0000, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH 1/6] xc: use XENMEM_claim_pages hypercall during guest creation."):
> > We add an extra parameter to the structures passed to the
> > PV routine (arch_setup_meminit) and HVM routine (setup_guest)
> > that determines whether the claim hypercall is to be done.
>
> This looks plausible to me, except that you seem to have missed a
> comment of Ian Campbell's on the hypercall buffers.
>
> > +int xc_domain_claim_pages(xc_interface *xch,
> > +                          uint32_t domid,
> > +                          unsigned long nr_pages)
> > +{
> > +    int err;
> > +    struct xen_memory_reservation reservation = {
> > +        .nr_extents   = nr_pages,
> > +        .extent_order = 0,
> > +        .mem_flags    = 0, /* no flags */
> > +        .domid        = domid
> > +    };
> > +
> > +    set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);
>
> In response to which Ian C wrote in
> <1363170195.32410.124.camel@zakaz.uk.xensource.com>:
>
>     This is unused? I think you just need:
>         set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);
>     and drop the declaration of the bounce above.

I think that is what I did?
The original patch (v11 posting) had this:

[Also available here:
http://xenbits.xen.org/gitweb/?p=people/konradwilk/xen.git;a=blobdiff;f=tools/libxc/xc_domain.c;h=af7ef66c041652309b44f0437ab402a4dfa18ad7;hp=480ce91500dd4e90a420e0407387205f76128752;hb=2430df20d51ad1a53a47831396ba6257f2e732ec;hpb=1a5757996a197abb5660d159fba843eb5e7aa5af
in the claim.v11 branch]

+int xc_domain_claim_pages(xc_interface *xch,
+                          uint32_t domid,
+                          unsigned long nr_pages,
+                          unsigned int claim_flag)
+{
+    int err;
+    xen_pfn_t *extent_start = NULL;
+    DECLARE_HYPERCALL_BOUNCE(extent_start, 0, XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+    struct xen_memory_reservation reservation = {
+        .nr_extents   = nr_pages,
+        .extent_order = 0,
+        .mem_flags    = claim_flag,
+        .domid        = domid
+    };
+
+    set_xen_guest_handle(reservation.extent_start, extent_start);
+
+    err = do_memory_op(xch, XENMEM_claim_pages, &reservation, sizeof(reservation));
+    return err;
+}

And he suggested that I drop the bounce and just use the BUFFER_NULL.
The patch I posted (v12 and this v13) does this:

+int xc_domain_claim_pages(xc_interface *xch,
+                          uint32_t domid,
+                          unsigned long nr_pages)
+{
+    int err;
+    struct xen_memory_reservation reservation = {
+        .nr_extents   = nr_pages,
+        .extent_order = 0,
+        .mem_flags    = 0, /* no flags */
+        .domid        = domid
+    };
+
+    set_xen_guest_handle(reservation.extent_start, HYPERCALL_BUFFER_NULL);
+
+    err = do_memory_op(xch, XENMEM_claim_pages, &reservation, sizeof(reservation));
+    /* Ignore it if the hypervisor does not support the call. */
+    if (err == -1 && errno == ENOSYS)
+        err = errno = 0;
+    return err;
+}

Which I believe does what he suggested? I also added the check for err
and errno as he suggested in another review.
Konrad Rzeszutek Wilk
2013-Mar-29 13:27 UTC
Re: [PATCH 3/6] xend: Implement XENMEM_claim_pages support via ''claim-mode'' global config
On Thu, Mar 28, 2013 at 04:41:39PM +0000, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH 3/6] xend: Implement XENMEM_claim_pages support via 'claim-mode' global config"):
> > The XENMEM_claim_pages hypercall operates per domain and it should be
> > used system wide. As such this patch introduces a global configuration
> > option 'claim-mode' that by default is disabled.
>
> I think at this stage we aren't accepting new features in xend.

Indeed. The cover letter mentioned that this is more of 'this is how
it would be in xend - but since xend is deprecated it is more for
people to get a feel for it, and if they would like to, put it in
their local version'.

> Feel free to consult the other maintainers/committers if you think we
> need to make an exception here.

No need. I will just drop it out of the patchset and the git branch.

> On the plus side this means that the xend code is not changing much
> and carrying a local patch ought to be straightforward for those
> people still using xend.

Exactly.

> Ian.
Konrad Rzeszutek Wilk
2013-Mar-29 13:31 UTC
Re: [PATCH 6/6] xl: ''xl info'' print outstanding claims if enabled (claim_mode=1 in xl.conf)
On Thu, Mar 28, 2013 at 04:47:17PM +0000, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH 6/6] xl: 'xl info' print outstanding claims if enabled (claim_mode=1 in xl.conf)"):
> > This patch provides the value of the currently outstanding pages
> > claimed for all domains. This is a total global value that influences
> > the hypervisor's MM system.
> >
> > When a claim call is done, a reservation for a specific amount of pages
> > is set and also a global value is incremented. This global value is then
> > reduced as the domain's memory is populated and eventually reaches zero.
> > The toolstack can also choose to set the domain's claim to zero which
> > cancels the reservation and decrements the global value by the amount
> > of claim that has not been satisfied.
>
> This description is good, but something like it needs to be in the
> documentation.

The documentation being xl.conf, right? It is already in xl.conf
thanks to the first patch.

> > +uint64_t libxl_get_claiminfo(libxl_ctx *ctx)
> > +{
> > +    long l;
> > +
> > +    l = xc_domain_get_outstanding_pages(ctx->xch);
> > +    if (l < 0)
> > +        return l;
> > +
> > +    /* In MB */
> > +    return (l >> 8);
> > +}
>
> libxl functions should return libxl error values, not whatever you got
> from libxc.

OK.

> I don't mind very much what the libxc function returns but the libxl
> function needs to handle the error properly. See Ian C's comments on
> the previous version of this patch.

OK, I probably missed something in his reply - the errno that we would
mostly get (-ENOSYS) is neutered, so it would not pipe up to libxl
(first patch). However, there is of course nothing stopping us from
adding better error handling in this code in case we get -EPERM or
such. Patch shortly coming up.

> Thanks,
> Ian.
Konrad Rzeszutek Wilk
2013-Mar-29 19:30 UTC
Re: [PATCH 2/6] xl: Implement XENMEM_claim_pages support via ''claim_mode'' global config
On Thu, Mar 28, 2013 at 04:39:57PM +0000, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH 2/6] xl: Implement XENMEM_claim_pages support via 'claim_mode' global config"):
> > The XENMEM_claim_pages hypercall operates per domain and it should be
> > used system wide. As such this patch introduces a global configuration
> > option 'claim_mode' that by default is disabled.
>
> This mostly looks good to me.
>
> > +=item B<claim_mode=BOOLEAN>
> > +
> > +If this option is enabled then when a guest is created there will be an
> > +guarantee that there is memory available for the guest. This is an particularly
>
> Sorry to be picky, but can I ask you to wrap this document to 70-75
> characters? At this width (looks like exactly 80) it inevitably
> generates wrap damage when a patch or quoted code is shown in an
> 80-column window.

Of course.

> > +Note that to enable tmem type guest, one need to provide C<tmem> on the
> > +Xen hypervisor argument and as well on the Linux kernel command line.
>
> "to enable tmem type guest" - shouldn't that be "guests"? And "one
> needs"?

Yes. Fixed.

> > +    if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
> > +        global_claim_mode = 1;
>
> This should set global_claim_mode to something depending on l, not 1,
> I think?

Yes.

> And perhaps global_claim_mode should be a libxl_defbool, so that we
> can inherit the libxl default more directly? At the moment the

Yes. The one oddity is that I had to use the set_default_bool in the
parse_global_config - otherwise with the last patch (xl: 'xl info'
print outstanding claims if enabled (claim_mode=1 in xl.conf)) we
would have gotten the non-initialized value and the assert would be
triggered. This is b/c the libxl__domain_create_info_setdefault does
not get called when 'xl info' is called.

> default is set in xl and again in libxl, which I think is not ideal.
> It's better to try to set the default only in one place.

Right.
Done.

> Also I think you need to call libxl_defbool_setdefault somewhere in
> libxl, probably in libxl__domain_create_info_setdefault.

I did it in libxl__domain_build_info_setdefault since it is part of
the b_info structure. But I am wondering if that could be removed as
we do it also in parse_global_config?

> Is there some reason why this variable is called "global_claim_mode"
> and not just "claim_mode"? Other globals in xl aren't marked in this
> way.

Brainfart on my side. Fixed.

Is this OK with you? Should I remove the libxl_defbool_setdefault in
the libxl__domain_create_info_setdefault?

commit 2e11cfaa12aef6ef857a77dd63b256efb24fc99d
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Mar 18 16:23:39 2013 -0400

    xl: Implement XENMEM_claim_pages support via 'claim_mode' global config

    The XENMEM_claim_pages hypercall operates per domain and it should be
    used system wide. As such this patch introduces a global configuration
    option 'claim_mode' that by default is disabled.

    If this option is enabled then when a guest is created there will be a
    guarantee that there is memory available for the guest. This is a
    particularly acute problem on hosts with memory over-provisioned
    guests that use tmem and have self-balloon enabled (which is the
    default option for them). The self-balloon mechanism can
    deflate/inflate the balloon quickly and the amount of free memory
    (which 'xl info' can show) is stale the moment it is printed. When
    claim is enabled a reservation for the amount of memory ('memory' in
    guest config) is set, which is then reduced as the domain's memory is
    populated and eventually reaches zero.

    If the reservation cannot be met the guest creation fails immediately
    instead of taking seconds/minutes (depending on the size of the
    guest) while the guest is populated.

    Note that to enable tmem type guests, one needs to provide 'tmem' on
    the Xen hypervisor argument and as well on the Linux kernel command
    line.
    There are two boolean options:

    (0) No claim is made. Memory population during guest creation will
    be attempted as normal and may fail due to memory exhaustion.

    (1) Normal memory and freeable pool of ephemeral pages (tmem) is
    used when calculating whether there is enough memory free to launch
    a guest. This guarantees immediate feedback whether the guest can be
    launched due to memory exhaustion (which can take a long time to
    find out if launching massively huge guests) and in parallel.

    [v1: Removed own claim_mode type, using just bool, improved docs,
         all per Ian's suggestion]
    [v2: Updated the comments]
    [v3: Rebase on top 733b9c524dbc2bec318bfc3588ed1652455d30ec
         (xl: add vif.default.script)]
    [v4: Fixed up comments]
    [v5: s/global_claim_mode/claim_mode/]
    [v6: Ian Jackson's feedback: use libxl_defbool, better comments, etc]

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
index 7b9fcac..08c7120 100644
--- a/docs/man/xl.conf.pod.5
+++ b/docs/man/xl.conf.pod.5
@@ -108,6 +108,47 @@ Configures the name of the first block device to be used for temporary
 block device allocations by the toolstack.
 The default choice is "xvda".
 
+=item B<claim_mode=BOOLEAN>
+
+If this option is enabled then when a guest is created there will be an
+guarantee that there is memory available for the guest. This is an
+particularly acute problem on hosts with memory over-provisioned guests
+that use tmem and have self-balloon enabled (which is the default
+option). The self-balloon mechanism can deflate/inflate the balloon
+quickly and the amount of free memory (which C<xl info> can show) is
+stale the moment it is printed. When claim is enabled a reservation for
+the amount of memory (see 'memory' in xl.conf(5)) is set, which is then
+reduced as the domain's memory is populated and eventually reaches zero.
+
+If the reservation cannot be meet the guest creation fails immediately
+instead of taking seconds/minutes (depending on the size of the guest)
+while the guest is populated.
+
+Note that to enable tmem type guests, one needs to provide C<tmem> on the
+Xen hypervisor argument and as well on the Linux kernel command line.
+
+Note that the claim call is not attempted if C<superpages> option is
+used in the guest config (see xl.cfg(5)).
+
+Default: C<0>
+
+=over 4
+
+=item C<0>
+
+No claim is made. Memory population during guest creation will be
+attempted as normal and may fail due to memory exhaustion.
+
+=item C<1>
+
+Normal memory and freeable pool of ephemeral pages (tmem) is used when
+calculating whether there is enough memory free to launch a guest.
+This guarantees immediate feedback whether the guest can be launched due
+to memory exhaustion (which can take a long time to find out if launching
+massively huge guests).
+
+=back
+
 =back
 
 =head1 SEE ALSO
diff --git a/tools/examples/xl.conf b/tools/examples/xl.conf
index b0caa32..f386bb9 100644
--- a/tools/examples/xl.conf
+++ b/tools/examples/xl.conf
@@ -26,3 +26,9 @@
 
 # default bridge device to use with vif-bridge hotplug scripts
 #vif.default.bridge="xenbr0"
+
+# Reserve a claim of memory when launching a guest. This guarantees immediate
+# feedback whether the guest can be launched due to memory exhaustion
+# (which can take a long time to find out if launching huge guests).
+# see xl.conf(5) for details.
+#claim_mode=0
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 030aa86..c4ad58b 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -579,7 +579,6 @@ int libxl_wait_for_free_memory(libxl_ctx *ctx, uint32_t domid, uint32_t memory_k
 /* wait for the memory target of a domain to be reached */
 int libxl_wait_for_memory_target(libxl_ctx *ctx, uint32_t domid, int wait_secs);
 
-
 int libxl_vncviewer_exec(libxl_ctx *ctx, uint32_t domid, int autopass);
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num, libxl_console_type type);
 /* libxl_primary_console_exec finds the domid and console number
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 30a4507..ae72f21 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -196,6 +196,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
     if (b_info->target_memkb == LIBXL_MEMKB_DEFAULT)
         b_info->target_memkb = b_info->max_memkb;
 
+    libxl_defbool_setdefault(&b_info->claim_mode, false);
+
     libxl_defbool_setdefault(&b_info->localtime, false);
     libxl_defbool_setdefault(&b_info->disable_migrate, false);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 2dd429f..92a6628 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -371,6 +371,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     dom->console_domid = state->console_domid;
     dom->xenstore_evtchn = state->store_port;
     dom->xenstore_domid = state->store_domid;
+    dom->claim_enabled = libxl_defbool_val(info->claim_mode);
 
     if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
         LOGE(ERROR, "xc_dom_boot_xen_init failed");
@@ -605,7 +606,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
      */
     args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
     args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
-
+    args.claim_enabled = libxl_defbool_val(info->claim_mode);
     if (libxl__domain_firmware(gc, info, &args)) {
         LOG(ERROR, "initializing domain firmware failed");
         goto out;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f3c212b..0f1f118 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -293,7 +293,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("ioports", Array(libxl_ioport_range, "num_ioports")),
     ("irqs", Array(uint32, "num_irqs")),
     ("iomem", Array(libxl_iomem_range, "num_iomem")),
-
+    ("claim_mode", libxl_defbool),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
           [("hvm", Struct(None, [("firmware", string),
                                  ("bios", libxl_bios_type),
diff --git a/tools/libxl/xl.c b/tools/libxl/xl.c
index 4c598db..211facd 100644
--- a/tools/libxl/xl.c
+++ b/tools/libxl/xl.c
@@ -45,6 +45,7 @@ char *default_vifscript = NULL;
 char *default_bridge = NULL;
 char *default_gatewaydev = NULL;
 enum output_format default_output_format = OUTPUT_FORMAT_JSON;
+libxl_defbool claim_mode;
 
 static xentoollog_level minmsglevel = XTL_PROGRESS;
 
@@ -134,6 +135,10 @@ static void parse_global_config(const char *configfile,
     }
     if (!xlu_cfg_get_string (config, "blkdev_start", &buf, 0))
         blkdev_start = strdup(buf);
+
+    libxl_defbool_setdefault(&claim_mode, false);
+    (void)xlu_cfg_get_defbool (config, "claim_mode", &claim_mode, 0);
+
     xlu_cfg_destroy(config);
 }
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index b881f92..4c5e5d1 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -145,6 +145,7 @@ int xl_child_pid(xlchildnum); /* returns 0 if child struct is not in use */
 extern int autoballoon;
 extern int run_hotplug_scripts;
 extern int dryrun_only;
+extern libxl_defbool claim_mode;
 extern char *lockfile;
 extern char *default_vifscript;
 extern char *default_bridge;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 2d40f8f..c8b0a99 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -757,6 +757,8 @@ static void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
         b_info->max_memkb = l * 1024;
 
+    b_info->claim_mode = claim_mode;
+
     if (xlu_cfg_get_string (config, "on_poweroff", &buf, 0))
         buf = "destroy";
     if (!parse_action_on_shutdown(buf, &d_config->on_poweroff)) {
Konrad Rzeszutek Wilk
2013-Mar-29 20:07 UTC
Re: [PATCH 5/6] xl: export ''outstanding_pages'' value from xcinfo
On Thu, Mar 28, 2013 at 04:44:22PM +0000, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH 5/6] xl: export 'outstanding_pages' value from xcinfo"):
> > This patch provides the value of the currently outstanding pages
> > claimed for a specific domain. This is a value that influences
> > the global outstanding claims value (See patch: "xl: 'xl info'
> > print outstanding claims if enabled") returned via
> > xc_domain_get_outstanding_pages hypercall. This domain value
> > decrements as the memory is populated for the guest and
> > eventually reaches zero.
>
> Again this patch is fine as far as it goes but we need an update to
> the documentation with a clear and accurate definition of what these
> values are.

The original patch (v11) had this outputted to SXP, but as that is in
the 'do-not-touch' category that part had to be removed. And the
'libxl_dominfo' structure that comes out of this change is only
consumed in a couple of places, and only certain attributes:

 - domid in 'list_domains_details'.
 - domid, vcpu_online, blocked, paused, shutdown, dying, cpu_time,
   ssidref in 'list_domains'.
 - vcpu_online in 'main_cpupoolnumasplit'.
 - current_memkb, shared_memkb, domid, shutdown in 'sharing'.

Nobody seems to be using 'page_memkb', nor 'vcpu_max_id'.

I could provide the 'outstanding_memkb' value as part of
'list_domains' but that would alter the existing format it has. I
could create an 'xl claim' that would be similar to 'xl list' but also
include the claim information. But I am not sure whether there is a
need for it. The 'outstanding_claims' via 'xl info' is good enough for
it.

So your call - my thinking was that this could be in the code to be
potentially used in the future (hand-waving how), but I am also OK
with just dropping this patch and the earlier:

    xl/xc: export outstanding_pages value in xc_dominfo structure.

Let me actually drop it from the next posting - it is always nicer to
have a slimmed-down version of patchsets.
Konrad Rzeszutek Wilk
2013-Mar-29 20:17 UTC
Re: [PATCH 3/6] xend: Implement XENMEM_claim_pages support via ''claim-mode'' global config
On Fri, Mar 29, 2013 at 09:27:30AM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Mar 28, 2013 at 04:41:39PM +0000, Ian Jackson wrote:
> > Konrad Rzeszutek Wilk writes ("[PATCH 3/6] xend: Implement XENMEM_claim_pages support via 'claim-mode' global config"):
> > > The XENMEM_claim_pages hypercall operates per domain and it should be
> > > used system wide. As such this patch introduces a global configuration
> > > option 'claim-mode' that by default is disabled.
> >
> > I think at this stage we aren't accepting new features in xend.
>
> Indeed. The cover letter mentioned that this is more of 'this is how
> it would be in xend - but since xend is deprecated it is more for
> people to get a feel for it, and if they would like to, put it in
> their local version'.

It did not!? It looks like the cover letter I used was a different
version than I typed up. Ugh. Sorry about that.
George Dunlap
2013-Apr-02 11:10 UTC
Re: [PATCH v13] claim and its friends for allocating multiple self-ballooning guests
On 28/03/13 16:50, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH v13] claim and its friends for allocating multiple self-ballooning guests"):
> > The patch (mmu: Introduce XENMEM_claim_pages (subop of memory ops)
> > is already in the hypervisor and described in details the
> > problem/solution/alternative solutions. This builds upon that new
> > hypercall to expand the toolstack to use it.
>
> I've reviewed this series now and it's looking close.
>
> When we get to a final version that I'm happy with I would like to get
> an opinion from Ian C before actually committing it. He's away until
> after feature freeze, but you've posted this already several times
> before the freeze and I expect the final version to make it before
> George declares code freeze.

I was planning on extending the code freeze for a week for exactly
this kind of reason.

 -George