Hi, Here are extension and fix for balloon driver: - xen/balloon: Notify a host about a guest memory size limit, - xen/balloon: Enforce various limits on target. They are posted as a reference for libxl memory management patches which I posted today. Please do not apply them yet. Daniel
Daniel Kiper
2013-Apr-29 11:37 UTC
[PATCH v2 1/2] xen/balloon: Notify a host about a guest memory size limit
Notify a host about a guest memory size limit. Idea of this patch was discussed with Ian Campbell and Ian Jackson. Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> --- drivers/xen/balloon.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index a56776d..856661f 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -65,6 +65,7 @@ #include <xen/balloon.h> #include <xen/features.h> #include <xen/page.h> +#include <xen/xenbus.h> /* * balloon_process() state: @@ -586,6 +587,9 @@ static void __init balloon_add_region(unsigned long start_pfn, static int __init balloon_init(void) { int i; +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG + int rc; +#endif if (!xen_domain()) return -ENODEV; @@ -621,6 +625,27 @@ static int __init balloon_init(void) balloon_add_region(PFN_UP(xen_extra_mem[i].start), PFN_DOWN(xen_extra_mem[i].size)); +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG + /* + * Notify a host about our memory size limit. + * + * The guest-max value is in KiB, so MAX_DOMAIN_PAGES should + * be converted respectively. PAGE_SHIFT converts pages to bytes, + * hence PAGE_SHIFT - 10. + */ + rc = xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu", + MAX_DOMAIN_PAGES << (PAGE_SHIFT - 10)); + + if (rc < 0) + pr_info("xen/balloon: Memory hotplug may not be supported " + "in some environments: %i\n", rc); +#else + xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu", + (balloon_stats.current_pages + + balloon_stats.balloon_low + + balloon_stats.balloon_high) << (PAGE_SHIFT - 10)); +#endif + return 0; } -- 1.7.10.4
Daniel Kiper
2013-Apr-29 11:37 UTC
[PATCH v2 2/2] xen/balloon: Enforce various limits on target
This patch enforces on target limit statically defined in Linux Kernel source and limit defined by hypervisor or host. This way the balloon driver should not attempt to populate pages above given limits because they may fail. Particularly this patch fixes bug which led to flood of dom0 kernel log with messages similar to: System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added xen_balloon: reserve_additional_memory: add_memory() failed: -17 v2 - suggestions/fixes: - always use maximum reservation as a refernce (suggested by David Vrabel), - change logging level for dom0 (suggested by Konrad Rzeszutek Wilk), - improve commit comment (suggested by David Vrabel). Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> --- drivers/xen/balloon.c | 40 ++++++++++++++++++++++++++++++++++++++-- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 856661f..a3a8eaa 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -81,7 +81,6 @@ enum bp_state { BP_ECANCELED }; - static DEFINE_MUTEX(balloon_mutex); struct balloon_stats balloon_stats; @@ -90,6 +89,12 @@ EXPORT_SYMBOL_GPL(balloon_stats); /* We increase/decrease in batches which fit in a page */ static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)]; +/* + * Extra internal memory reserved by libxl. + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. + */ +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) + #ifdef CONFIG_HIGHMEM #define inc_totalhigh_pages() (totalhigh_pages++) #define dec_totalhigh_pages() (totalhigh_pages--) @@ -491,11 +496,42 @@ static void balloon_process(struct work_struct *work) mutex_unlock(&balloon_mutex); } -/* Resets the Xen limit, sets new target, and kicks off processing. */ +/* Enforce limits, set new target and kick off processing. */ void balloon_set_new_target(unsigned long target) { + domid_t domid = DOMID_SELF; + int rc; + + /* Enforce statically defined limit. */ + target = min(target, MAX_DOMAIN_PAGES); + + rc = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); + + if (xen_initial_domain()) { + if (rc <= 0) { + pr_debug("xen_balloon: %s: Initial domain target limit " + "could not be established: %i\n", + __func__, rc); + goto no_host_limit; + } + } else { + if (rc <= 0) { + pr_info("xen_balloon: %s: Guest domain target limit " + "could not be established: %i\n", __func__, rc); + goto no_host_limit; + } + + /* Do not take into account memory reserved for internal stuff. */ + rc -= LIBXL_MAXMEM_CONSTANT_PAGES; + } + + /* Enforce hypervisor/host defined limit. */ + target = min_t(unsigned long, target, rc); + +no_host_limit: /* No need for lock. Not read-modify-write updates. */ balloon_stats.target_pages = target; + schedule_delayed_work(&balloon_worker, 0); } EXPORT_SYMBOL_GPL(balloon_set_new_target); -- 1.7.10.4
Ian Campbell
2013-Apr-29 14:35 UTC
Re: [PATCH v2 1/2] xen/balloon: Notify a host about a guest memory size limit
On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote:> Notify a host about a guest memory size limit. > > Idea of this patch was discussed with Ian Campbell and Ian Jackson. > > Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> > --- > drivers/xen/balloon.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c > index a56776d..856661f 100644 > --- a/drivers/xen/balloon.c > +++ b/drivers/xen/balloon.c > @@ -65,6 +65,7 @@ > #include <xen/balloon.h> > #include <xen/features.h> > #include <xen/page.h> > +#include <xen/xenbus.h> > > /* > * balloon_process() state: > @@ -586,6 +587,9 @@ static void __init balloon_add_region(unsigned long start_pfn, > static int __init balloon_init(void) > { > int i; > +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG > + int rc; > +#endif > > if (!xen_domain()) > return -ENODEV; > @@ -621,6 +625,27 @@ static int __init balloon_init(void) > balloon_add_region(PFN_UP(xen_extra_mem[i].start), > PFN_DOWN(xen_extra_mem[i].size)); > > +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG > + /* > + * Notify a host about our memory size limit. > + * > + * The guest-max value is in KiB, so MAX_DOMAIN_PAGES should > + * be converted respectively. PAGE_SHIFT converts pages to bytes, > + * hence PAGE_SHIFT - 10. > + */ > + rc = xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu", > + MAX_DOMAIN_PAGES << (PAGE_SHIFT - 10)); > + > + if (rc < 0) > + pr_info("xen/balloon: Memory hotplug may not be supported " > + "in some environments: %i\n", rc); > +#else > + xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu",Would it be OK to not do anything in the non-hotplug case? That''s just the historical behaviour right? Would you expect this value to differ from static-max?> + (balloon_stats.current_pages + > + balloon_stats.balloon_low + > + balloon_stats.balloon_high) << (PAGE_SHIFT - 10)); > +#endif > + > return 0; > } >
Ian Campbell
2013-Apr-29 14:44 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote:> This patch enforces on target limit statically defined in Linux Kernel > source and limit defined by hypervisor or host. This way the balloon > driver should not attempt to populate pages above given limits > because they may fail. > > Particularly this patch fixes bug which led to flood > of dom0 kernel log with messages similar to: > > System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added > xen_balloon: reserve_additional_memory: add_memory() failed: -17I think it would be OK to simply tone down this message (and perhaps add the failed pages to the balloon, if that makes sense). This isn''t dissimilar to increase_reservation failing.> +/* > + * Extra internal memory reserved by libxl. > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > + */ > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE)I think we need to find a way to achieve your aims which doesn''t require leaking internal implementation details of libxl into the guest kernels. What happens if libxl decides to double this?> + > #ifdef CONFIG_HIGHMEM > #define inc_totalhigh_pages() (totalhigh_pages++) > #define dec_totalhigh_pages() (totalhigh_pages--) > @@ -491,11 +496,42 @@ static void balloon_process(struct work_struct *work) > mutex_unlock(&balloon_mutex); > } > > -/* Resets the Xen limit, sets new target, and kicks off processing. */ > +/* Enforce limits, set new target and kick off processing. */ > void balloon_set_new_target(unsigned long target) > { > + domid_t domid = DOMID_SELF; > + int rc; > + > + /* Enforce statically defined limit. */ > + target = min(target, MAX_DOMAIN_PAGES); > + > + rc = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > + > + if (xen_initial_domain()) { > + if (rc <= 0) { > + pr_debug("xen_balloon: %s: Initial domain target limit " > + "could not be established: %i\n", > + __func__, rc); > + goto no_host_limit; > + } > + } else { > + if (rc <= 0) { > + pr_info("xen_balloon: %s: Guest domain target limit " > + "could not be established: %i\n", __func__, rc); > + goto no_host_limit; > + } > + > + /* Do not take into account memory reserved for internal stuff. */ > + rc -= LIBXL_MAXMEM_CONSTANT_PAGES; > + }Why is this needed? Wouldn''t it be a toolstack bug to set the target greater than this limit? But if it did ask then it would no doubt be expecting the guest to try and reach that limit (perhaps it intends to raise the maximum later?). In any case it should be handled the same way as a failure in increase_reservation is always handled, shouldn''t it? No need for a special case. I don''t think this change has anything to do with the add_memory() failure you mention in the commit message.> + > + /* Enforce hypervisor/host defined limit. */ > + target = min_t(unsigned long, target, rc); > + > +no_host_limit: > /* No need for lock. Not read-modify-write updates. */ > balloon_stats.target_pages = target; > + > schedule_delayed_work(&balloon_worker, 0); > } > EXPORT_SYMBOL_GPL(balloon_set_new_target);
Daniel Kiper
2013-Apr-30 12:40 UTC
Re: [PATCH v2 1/2] xen/balloon: Notify a host about a guest memory size limit
On Mon, Apr 29, 2013 at 03:35:11PM +0100, Ian Campbell wrote:> On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote: > > Notify a host about a guest memory size limit. > > > > Idea of this patch was discussed with Ian Campbell and Ian Jackson. > > > > Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> > > --- > > drivers/xen/balloon.c | 25 +++++++++++++++++++++++++ > > 1 file changed, 25 insertions(+) > > > > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c > > index a56776d..856661f 100644 > > --- a/drivers/xen/balloon.c > > +++ b/drivers/xen/balloon.c > > @@ -65,6 +65,7 @@ > > #include <xen/balloon.h> > > #include <xen/features.h> > > #include <xen/page.h> > > +#include <xen/xenbus.h> > > > > /* > > * balloon_process() state: > > @@ -586,6 +587,9 @@ static void __init balloon_add_region(unsigned long start_pfn, > > static int __init balloon_init(void) > > { > > int i; > > +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG > > + int rc; > > +#endif > > > > if (!xen_domain()) > > return -ENODEV; > > @@ -621,6 +625,27 @@ static int __init balloon_init(void) > > balloon_add_region(PFN_UP(xen_extra_mem[i].start), > > PFN_DOWN(xen_extra_mem[i].size)); > > > > +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG > > + /* > > + * Notify a host about our memory size limit. > > + * > > + * The guest-max value is in KiB, so MAX_DOMAIN_PAGES should > > + * be converted respectively. PAGE_SHIFT converts pages to bytes, > > + * hence PAGE_SHIFT - 10. > > + */ > > + rc = xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu", > > + MAX_DOMAIN_PAGES << (PAGE_SHIFT - 10)); > > + > > + if (rc < 0) > > + pr_info("xen/balloon: Memory hotplug may not be supported " > > + "in some environments: %i\n", rc); > > +#else > > + xenbus_printf(XBT_NIL, "memory", "guest-max", "%lu", > > Would it be OK to not do anything in the non-hotplug case? That''s justAs I stated earlier I think that every new polite guest, with or without memory hotplug, should inform about its maximum supported memory size.> the historical behaviour right?Yes, it is.> Would you expect this value to differ from static-max?On my test system it is static-max + 8 MiB. Daniel
Daniel Kiper
2013-Apr-30 12:59 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Mon, Apr 29, 2013 at 03:44:09PM +0100, Ian Campbell wrote:> On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote: > > > This patch enforces on target limit statically defined in Linux Kernel > > source and limit defined by hypervisor or host. This way the balloon > > driver should not attempt to populate pages above given limits > > because they may fail. > > > > Particularly this patch fixes bug which led to flood > > of dom0 kernel log with messages similar to: > > > > System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added > > xen_balloon: reserve_additional_memory: add_memory() failed: -17 > > I think it would be OK to simply tone down this message (and perhaps add > the failed pages to the balloon, if that makes sense). This isn''t > dissimilar to increase_reservation failing.If add_memory() fails it is hard error. It means that we do not know where new or ballooned pages should be placed.> > +/* > > + * Extra internal memory reserved by libxl. > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > + */ > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > I think we need to find a way to achieve your aims which doesn''t require > leaking internal implementation details of libxl into the guest kernels. > What happens if libxl decides to double this?I agree that this is not elegant solution. However, if we would like to be in line with docs/misc/libxl_memory.txt (this is correct path) this is a must. Once I thought that this value could be passed via xenstore but I think it is rather small chance it would be changed in near future. As I know this slack is reserved now just in case (correct me if I am wrong). If this value will be changed we could pass new value via xenstore (or other convenient mechanism).> > + > > #ifdef CONFIG_HIGHMEM > > #define inc_totalhigh_pages() (totalhigh_pages++) > > #define dec_totalhigh_pages() (totalhigh_pages--) > > @@ -491,11 +496,42 @@ static void balloon_process(struct work_struct *work) > > mutex_unlock(&balloon_mutex); > > } > > > > -/* Resets the Xen limit, sets new target, and kicks off processing. */ > > +/* Enforce limits, set new target and kick off processing. */ > > void balloon_set_new_target(unsigned long target) > > { > > + domid_t domid = DOMID_SELF; > > + int rc; > > + > > + /* Enforce statically defined limit. */ > > + target = min(target, MAX_DOMAIN_PAGES); > > + > > + rc = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > > + > > + if (xen_initial_domain()) { > > + if (rc <= 0) { > > + pr_debug("xen_balloon: %s: Initial domain target limit " > > + "could not be established: %i\n", > > + __func__, rc); > > + goto no_host_limit; > > + } > > + } else { > > + if (rc <= 0) { > > + pr_info("xen_balloon: %s: Guest domain target limit " > > + "could not be established: %i\n", __func__, rc); > > + goto no_host_limit; > > + } > > + > > + /* Do not take into account memory reserved for internal stuff. */ > > + rc -= LIBXL_MAXMEM_CONSTANT_PAGES; > > + } > > Why is this needed? Wouldn''t it be a toolstack bug to set the target > greater than this limit? But if it did ask then it would no doubt be > expecting the guest to try and reach that limit (perhaps it intends to > raise the maximum later?).For domU XENMEM_maximum_reservation is always equal <user_requested_maximum> + LIBXL_MAXMEM_CONSTANT_PAGES. Acording to docs/misc/libxl_memory.txt LIBXL_MAXMEM_CONSTANT_PAGES is reserved for extra internal. It means that we should not allow balloon driver to reserve more than user_requested_maximum. Daniel
Ian Campbell
2013-Apr-30 13:44 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Tue, 2013-04-30 at 13:59 +0100, Daniel Kiper wrote:> On Mon, Apr 29, 2013 at 03:44:09PM +0100, Ian Campbell wrote: > > On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote: > > > > > This patch enforces on target limit statically defined in Linux Kernel > > > source and limit defined by hypervisor or host. This way the balloon > > > driver should not attempt to populate pages above given limits > > > because they may fail. > > > > > > Particularly this patch fixes bug which led to flood > > > of dom0 kernel log with messages similar to: > > > > > > System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added > > > xen_balloon: reserve_additional_memory: add_memory() failed: -17 > > > > I think it would be OK to simply tone down this message (and perhaps add > > the failed pages to the balloon, if that makes sense). This isn''t > > dissimilar to increase_reservation failing. > > If add_memory() fails it is hard error. It means that we do not > know where new or ballooned pages should be placed.I see that add_memory() is a generic or arch level function rather than a ballooning specific one. Under what circumstances can it fail and how do they relate the the setting of the balloon target?> > > +/* > > > + * Extra internal memory reserved by libxl. > > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > > + */ > > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > > > I think we need to find a way to achieve your aims which doesn''t require > > leaking internal implementation details of libxl into the guest kernels. > > What happens if libxl decides to double this? > > I agree that this is not elegant solution. However, if we would like to > be in line with docs/misc/libxl_memory.txt (this is correct path) this > is a must.I''m not sure about this, that file describes the toolstacks view of the memory in a system. That need not necessarily correspond with the guest''s ideas (although you would hope it would be a superset). Surely it is logically wrong to bake toolstack specific knowledge in the guest? If someone can describe a meaningful semantic for this number in a toolstack independent way then perhaps it would be appropriate to do something with it. I''ve no idea, having looked at both the document and the code, what this value actually is.> Once I thought that this value could be passed via xenstore > but I think it is rather small chance it would be changed in near > future. As I know this slack is reserved now just in case (correct me > if I am wrong). If this value will be changed we could pass new value > via xenstore (or other convenient mechanism). > > > > + > > > #ifdef CONFIG_HIGHMEM > > > #define inc_totalhigh_pages() (totalhigh_pages++) > > > #define dec_totalhigh_pages() (totalhigh_pages--) > > > @@ -491,11 +496,42 @@ static void balloon_process(struct work_struct *work) > > > mutex_unlock(&balloon_mutex); > > > } > > > > > > -/* Resets the Xen limit, sets new target, and kicks off processing. */ > > > +/* Enforce limits, set new target and kick off processing. */ > > > void balloon_set_new_target(unsigned long target) > > > { > > > + domid_t domid = DOMID_SELF; > > > + int rc; > > > + > > > + /* Enforce statically defined limit. */ > > > + target = min(target, MAX_DOMAIN_PAGES); > > > + > > > + rc = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid); > > > + > > > + if (xen_initial_domain()) { > > > + if (rc <= 0) { > > > + pr_debug("xen_balloon: %s: Initial domain target limit " > > > + "could not be established: %i\n", > > > + __func__, rc); > > > + goto no_host_limit; > > > + } > > > + } else { > > > + if (rc <= 0) { > > > + pr_info("xen_balloon: %s: Guest domain target limit " > > > + "could not be established: %i\n", __func__, rc); > > > + goto no_host_limit; > > > + } > > > + > > > + /* Do not take into account memory reserved for internal stuff. */ > > > + rc -= LIBXL_MAXMEM_CONSTANT_PAGES; > > > + } > > > > Why is this needed? Wouldn''t it be a toolstack bug to set the target > > greater than this limit? But if it did ask then it would no doubt be > > expecting the guest to try and reach that limit (perhaps it intends to > > raise the maximum later?). > > For domU XENMEM_maximum_reservation is always equal > <user_requested_maximum> + LIBXL_MAXMEM_CONSTANT_PAGES. > Acording to docs/misc/libxl_memory.txt LIBXL_MAXMEM_CONSTANT_PAGES > is reserved for extra internal. It means that we should not allow > balloon driver to reserve more than user_requested_maximum.Why not? What is the downside of reserving a little more than we should? If the toolstack cares then it will presumably never set a target which exceeds <user_requested_maximum>. Ian.
Daniel Kiper
2013-Apr-30 18:58 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Tue, Apr 30, 2013 at 02:44:18PM +0100, Ian Campbell wrote:> On Tue, 2013-04-30 at 13:59 +0100, Daniel Kiper wrote: > > On Mon, Apr 29, 2013 at 03:44:09PM +0100, Ian Campbell wrote: > > > On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote: > > > > > > > This patch enforces on target limit statically defined in Linux Kernel > > > > source and limit defined by hypervisor or host. This way the balloon > > > > driver should not attempt to populate pages above given limits > > > > because they may fail. > > > > > > > > Particularly this patch fixes bug which led to flood > > > > of dom0 kernel log with messages similar to: > > > > > > > > System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added > > > > xen_balloon: reserve_additional_memory: add_memory() failed: -17 > > > > > > I think it would be OK to simply tone down this message (and perhaps add > > > the failed pages to the balloon, if that makes sense). This isn''t > > > dissimilar to increase_reservation failing. > > > > If add_memory() fails it is hard error. It means that we do not > > know where new or ballooned pages should be placed. > > I see that add_memory() is a generic or arch level function rather than > a ballooning specific one. Under what circumstances can it fail and how > do they relate the the setting of the balloon target?It is generic function with some references to arch code. It is called when pages could not be taken from balloon any more and must be hotplugged. It reserves memory resource (start address and size is calculated on the base of target and max_pfn) and build some structures for new memory. It may fail in many places. In this case it failed because placement of new resource was incorrectly calculated because algorithm is very simple and not cover all cases. I am going to fix this issue but a bit later. However, first of all memory hotplug should not be activated because memory allocation limit was reached in this case. This patch solves this issue.> > > > +/* > > > > + * Extra internal memory reserved by libxl. > > > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > > > + */ > > > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > > > > > I think we need to find a way to achieve your aims which doesn''t require > > > leaking internal implementation details of libxl into the guest kernels. > > > What happens if libxl decides to double this? > > > > I agree that this is not elegant solution. However, if we would like to > > be in line with docs/misc/libxl_memory.txt (this is correct path) this > > is a must. > > I''m not sure about this, that file describes the toolstacks view of the > memory in a system. That need not necessarily correspond with the > guest''s ideas (although you would hope it would be a superset). > > Surely it is logically wrong to bake toolstack specific knowledge in the > guest? If someone can describe a meaningful semantic for this number in > a toolstack independent way then perhaps it would be appropriate to do > something with it. I''ve no idea, having looked at both the document and > the code, what this value actually is.This was added by commit 9905ac2b90a3e7cecd9e7dfe21c252362e7080b2 (libxenlight: implement libxl_set_memory_target). It was written by Keir and signed off by Stefano (both are CCed here). Guys, why did you added LIBXL_MAXMEM_CONSTANT? What does it mean? Daniel
Stefano Stabellini
2013-May-02 11:34 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Tue, 30 Apr 2013, Daniel Kiper wrote:> > > > > +/* > > > > > + * Extra internal memory reserved by libxl. > > > > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > > > > + */ > > > > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > > > > > > > I think we need to find a way to achieve your aims which doesn''t require > > > > leaking internal implementation details of libxl into the guest kernels. > > > > What happens if libxl decides to double this? > > > > > > I agree that this is not elegant solution. However, if we would like to > > > be in line with docs/misc/libxl_memory.txt (this is correct path) this > > > is a must. > > > > I''m not sure about this, that file describes the toolstacks view of the > > memory in a system. That need not necessarily correspond with the > > guest''s ideas (although you would hope it would be a superset). > > > > Surely it is logically wrong to bake toolstack specific knowledge in the > > guest? If someone can describe a meaningful semantic for this number in > > a toolstack independent way then perhaps it would be appropriate to do > > something with it. I''ve no idea, having looked at both the document and > > the code, what this value actually is. > > This was added by commit 9905ac2b90a3e7cecd9e7dfe21c252362e7080b2 > (libxenlight: implement libxl_set_memory_target). It was written > by Keir and signed off by Stefano (both are CCed here). Guys, > why did you added LIBXL_MAXMEM_CONSTANT? What does it mean?libxl inherits the memory model from xapi. LIBXL_MAXMEM_CONSTANT corresponds to "extra internal" in xapi, an amount of memory that is not allocated to the domain but it is left as a slack on top of the actual memory target to determine the Xen maxmem. I believe it comes from empirical measurements and stress testing on the platform. The xapi guys, CC''ed, might have more insights on what exactly is. I dislike having to pull this "hack" into Linux, but if it is actually important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. I would add a big comment on top saying: "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT kilobytes unused, let''s be gentle and do that."
Konrad Rzeszutek Wilk
2013-May-02 18:04 UTC
Re: [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote:> On Tue, 30 Apr 2013, Daniel Kiper wrote: > > > > > > +/* > > > > > > + * Extra internal memory reserved by libxl. > > > > > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > > > > > + */ > > > > > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > > > > > > > > > I think we need to find a way to achieve your aims which doesn''t require > > > > > leaking internal implementation details of libxl into the guest kernels. > > > > > What happens if libxl decides to double this? > > > > > > > > I agree that this is not elegant solution. However, if we would like to > > > > be in line with docs/misc/libxl_memory.txt (this is correct path) this > > > > is a must. > > > > > > I''m not sure about this, that file describes the toolstacks view of the > > > memory in a system. That need not necessarily correspond with the > > > guest''s ideas (although you would hope it would be a superset). > > > > > > Surely it is logically wrong to bake toolstack specific knowledge in the > > > guest? If someone can describe a meaningful semantic for this number in > > > a toolstack independent way then perhaps it would be appropriate to do > > > something with it. I''ve no idea, having looked at both the document and > > > the code, what this value actually is. > > > > This was added by commit 9905ac2b90a3e7cecd9e7dfe21c252362e7080b2 > > (libxenlight: implement libxl_set_memory_target). It was written > > by Keir and signed off by Stefano (both are CCed here). Guys, > > why did you added LIBXL_MAXMEM_CONSTANT? What does it mean? > > libxl inherits the memory model from xapi. > LIBXL_MAXMEM_CONSTANT corresponds to "extra internal" in xapi, an amount > of memory that is not allocated to the domain but it is left as a slack > on top of the actual memory target to determine the Xen maxmem. > I believe it comes from empirical measurements and stress testing on the > platform.What is this ''slack'' used for?> The xapi guys, CC''ed, might have more insights on what exactly is. > > > I dislike having to pull this "hack" into Linux, but if it is actually > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > I would add a big comment on top saying: > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > kilobytes unused, let''s be gentle and do that."
Ian Campbell
2013-May-03 08:04 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Tue, 2013-04-30 at 19:58 +0100, Daniel Kiper wrote:> On Tue, Apr 30, 2013 at 02:44:18PM +0100, Ian Campbell wrote: > > On Tue, 2013-04-30 at 13:59 +0100, Daniel Kiper wrote: > > > On Mon, Apr 29, 2013 at 03:44:09PM +0100, Ian Campbell wrote: > > > > On Mon, 2013-04-29 at 12:37 +0100, Daniel Kiper wrote: > > > > > > > > > This patch enforces on target limit statically defined in Linux Kernel > > > > > source and limit defined by hypervisor or host. This way the balloon > > > > > driver should not attempt to populate pages above given limits > > > > > because they may fail. > > > > > > > > > > Particularly this patch fixes bug which led to flood > > > > > of dom0 kernel log with messages similar to: > > > > > > > > > > System RAM resource [mem 0x1b8000000-0x1bfffffff] cannot be added > > > > > xen_balloon: reserve_additional_memory: add_memory() failed: -17 > > > > > > > > I think it would be OK to simply tone down this message (and perhaps add > > > > the failed pages to the balloon, if that makes sense). This isn''t > > > > dissimilar to increase_reservation failing. > > > > > > If add_memory() fails it is hard error. It means that we do not > > > know where new or ballooned pages should be placed. > > > > I see that add_memory() is a generic or arch level function rather than > > a ballooning specific one. Under what circumstances can it fail and how > > do they relate the the setting of the balloon target? > > It is generic function with some references to arch code. It is called > when pages could not be taken from balloon any more and must be hotplugged. > It reserves memory resource (start address and size is calculated on the > base of target and max_pfn) and build some structures for new memory. > It may fail in many places. In this case it failed because placement > of new resource was incorrectly calculated because algorithm is very > simple and not cover all cases. I am going to fix this issue but > a bit later.I think you should fix that core issue first, otherwise the requirement for messing with the balloon target is not very obvious. Ian.
Ian Campbell
2013-May-03 08:15 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote:> On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote: > > On Tue, 30 Apr 2013, Daniel Kiper wrote: > > > > > > > +/* > > > > > > > + * Extra internal memory reserved by libxl. > > > > > > > + * Check tools/libxl/libxl_memory.txt file in Xen source for more details. > > > > > > > + */ > > > > > > > +#define LIBXL_MAXMEM_CONSTANT_PAGES (1024 * 1024 / PAGE_SIZE) > > > > > > > > > > > > I think we need to find a way to achieve your aims which doesn''t require > > > > > > leaking internal implementation details of libxl into the guest kernels. > > > > > > What happens if libxl decides to double this? > > > > > > > > > > I agree that this is not elegant solution. However, if we would like to > > > > > be in line with docs/misc/libxl_memory.txt (this is correct path) this > > > > > is a must. > > > > > > > > I''m not sure about this, that file describes the toolstacks view of the > > > > memory in a system. That need not necessarily correspond with the > > > > guest''s ideas (although you would hope it would be a superset). > > > > > > > > Surely it is logically wrong to bake toolstack specific knowledge in the > > > > guest? If someone can describe a meaningful semantic for this number in > > > > a toolstack independent way then perhaps it would be appropriate to do > > > > something with it. I''ve no idea, having looked at both the document and > > > > the code, what this value actually is. > > > > > > This was added by commit 9905ac2b90a3e7cecd9e7dfe21c252362e7080b2 > > > (libxenlight: implement libxl_set_memory_target). It was written > > > by Keir and signed off by Stefano (both are CCed here). Guys, > > > why did you added LIBXL_MAXMEM_CONSTANT? What does it mean? > > > > libxl inherits the memory model from xapi. > > LIBXL_MAXMEM_CONSTANT corresponds to "extra internal" in xapi, an amount > > of memory that is not allocated to the domain but it is left as a slack > > on top of the actual memory target to determine the Xen maxmem. > > I believe it comes from empirical measurements and stress testing on the > > platform. > > What is this ''slack'' used for?Did you read Stefano''s next sentence?> > The xapi guys, CC''ed, might have more insights on what exactly is.I think that unless someone can remember what this issue was we should just chalk it up to a historical artefact of something xapi (or maybe some historical guest) was doing which we have no reason to believe needs to be carried over to libxl. IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 cycle and see how it goes. If someone can show either empirical evidence or (better) logically argue that a fudge is required then we can always put it back (or it may turn out to be the caller''s issue, in which case they can deal with it, hopefully xapi-on-libxl won''t apply this fudge twice...). Alternatively I''m also strongly considering having debug builds of the toolstack randomise the amount of slack, that ought to shake out any lingering issues...> > I dislike having to pull this "hack" into Linux, but if it is actually > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > I would add a big comment on top saying: > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > kilobytes unused, let''s be gentle and do that."It seems to me that this change in Linux is really just papering over the underlying issue. Or at the very least no one has adequately explained what that real issue is and why this change is relevant to it and/or an appropriate fix for it. The guest knows what target the toolstack has set for it is (its in the target xenstore node), I don''t see any reason for the guest to be further second guessing that value by examining maxmem (adjusted by a fudge factor or otherwise). If the guest is seeing failures to increase its reservation when trying to meet that target then either the toolstack was buggy in asking it to hit a target greater than its maxmem or it is hitting one of the other reason for increase reservation failures. Since it needs to deal with the latter anyway I don''t see any reason to special case maxmem as a cause for a failure. Ian.
Daniel Kiper
2013-May-03 13:00 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, May 03, 2013 at 09:15:32AM +0100, Ian Campbell wrote:> On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote: > > On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote:[...]> > > The xapi guys, CC''ed, might have more insights on what exactly is. > > I think that unless someone can remember what this issue was we should > just chalk it up to a historical artefact of something xapi (or maybe > some historical guest) was doing which we have no reason to believe > needs to be carried over to libxl. > > IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 > cycle and see how it goes. If someone can show either empirical evidence > or (better) logically argue that a fudge is required then we can always > put it back (or it may turn out to be the caller''s issue, in which case > they can deal with it, hopefully xapi-on-libxl won''t apply this fudge > twice...). > > Alternatively I''m also strongly considering having debug builds of the > toolstack randomise the amount of slack, that ought to shake out any > lingering issues...Do you suggest to postopone this work until 4.4 merge window? If yes, then I think that at least "xen/balloon: Enforce various limits on target" patch (without this crazy libxl hack) should be applied.> > > I dislike having to pull this "hack" into Linux, but if it is actually > > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > > I would add a big comment on top saying: > > > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > > kilobytes unused, let''s be gentle and do that." > > It seems to me that this change in Linux is really just papering over > the underlying issue. Or at the very least no one has adequately > explained what that real issue is and why this change is relevant to it > and/or an appropriate fix for it. > > The guest knows what target the toolstack has set for it is (its in the > target xenstore node), I don''t see any reason for the guest to be > further second guessing that value by examining maxmem (adjusted by a > fudge factor or otherwise). If the guest is seeing failures to increase > its reservation when trying to meet that target then either the > toolstack was buggy in asking it to hit a target greater than its maxmem > or it is hitting one of the other reason for increase reservation > failures. Since it needs to deal with the latter anyway I don''t see any > reason to special case maxmem as a cause for a failure.Do not forget that guest may change target itself. Additionally, we would like to introduce xm compatibility mode which is a bit different then xl normal behavior. I do not mention that it is always worth check the limits. It will save us a lot of trouble later. Daniel
Ian Campbell
2013-May-03 13:21 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, 2013-05-03 at 14:00 +0100, Daniel Kiper wrote:> On Fri, May 03, 2013 at 09:15:32AM +0100, Ian Campbell wrote: > > On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote: > > > On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote: > > [...] > > > > > The xapi guys, CC''ed, might have more insights on what exactly is. > > > > I think that unless someone can remember what this issue was we should > > just chalk it up to a historical artefact of something xapi (or maybe > > some historical guest) was doing which we have no reason to believe > > needs to be carried over to libxl. > > > > IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 > > cycle and see how it goes. If someone can show either empirical evidence > > or (better) logically argue that a fudge is required then we can always > > put it back (or it may turn out to be the caller''s issue, in which case > > they can deal with it, hopefully xapi-on-libxl won''t apply this fudge > > twice...). > > > > Alternatively I''m also strongly considering having debug builds of the > > toolstack randomise the amount of slack, that ought to shake out any > > lingering issues... > > Do you suggest to postopone this work until 4.4 merge window?4.4 is a Xen version, this is a Linux patch so I''m not sure what you mean.> If yes, then I think that at least "xen/balloon: Enforce various limits on target" > patch (without this crazy libxl hack) should be applied.You mean this patch with only the MAX_DOMAIN_PAGES clamping? That seems plausible.> > > > I dislike having to pull this "hack" into Linux, but if it is actually > > > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > > > I would add a big comment on top saying: > > > > > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > > > kilobytes unused, let''s be gentle and do that." > > > > It seems to me that this change in Linux is really just papering over > > the underlying issue. Or at the very least no one has adequately > > explained what that real issue is and why this change is relevant to it > > and/or an appropriate fix for it. > > > > The guest knows what target the toolstack has set for it is (its in the > > target xenstore node), I don''t see any reason for the guest to be > > further second guessing that value by examining maxmem (adjusted by a > > fudge factor or otherwise). If the guest is seeing failures to increase > > its reservation when trying to meet that target then either the > > toolstack was buggy in asking it to hit a target greater than its maxmem > > or it is hitting one of the other reason for increase reservation > > failures. Since it needs to deal with the latter anyway I don''t see any > > reason to special case maxmem as a cause for a failure. > > Do not forget that guest may change target itself.Yes it can, and that can fail either due to maxmem or due to ENOMEM, and the kernel needs prepared to deal with that when it happens.> Additionally, we would like to introduce xm compatibility > mode which is a bit different then xl normal behavior.When then you really don''t want to be baking specifics of the current model into the kernel, do you.> I do not mention that it is always worth check the limits. > It will save us a lot of trouble later.On the contrary, it seems to me that baking magic numbers into the kernel based on current toolstack behaviour is asking for trouble later. Ian.
Daniel Kiper
2013-May-03 13:47 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, May 03, 2013 at 02:21:24PM +0100, Ian Campbell wrote:> On Fri, 2013-05-03 at 14:00 +0100, Daniel Kiper wrote: > > On Fri, May 03, 2013 at 09:15:32AM +0100, Ian Campbell wrote: > > > On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote: > > > > On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote: > > > > [...] > > > > > > > The xapi guys, CC''ed, might have more insights on what exactly is. > > > > > > I think that unless someone can remember what this issue was we should > > > just chalk it up to a historical artefact of something xapi (or maybe > > > some historical guest) was doing which we have no reason to believe > > > needs to be carried over to libxl. > > > > > > IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 > > > cycle and see how it goes. If someone can show either empirical evidence > > > or (better) logically argue that a fudge is required then we can always > > > put it back (or it may turn out to be the caller''s issue, in which case > > > they can deal with it, hopefully xapi-on-libxl won''t apply this fudge > > > twice...). > > > > > > Alternatively I''m also strongly considering having debug builds of the > > > toolstack randomise the amount of slack, that ought to shake out any > > > lingering issues... > > > > Do you suggest to postopone this work until 4.4 merge window? > > 4.4 is a Xen version, this is a Linux patch so I''m not sure what you > mean.I thought about my libxl patches. Should I post new version without LIBXL_MAXMEM_CONSTANT when 4.4 merge window open?> > If yes, then I think that at least "xen/balloon: Enforce various limits on target" > > patch (without this crazy libxl hack) should be applied. > > You mean this patch with only the MAX_DOMAIN_PAGES clamping? That seems > plausible.... and with XENMEM_maximum_reservation check but wihtout LIBXL_MAXMEM_CONSTANT_PAGES.> > > > > I dislike having to pull this "hack" into Linux, but if it is actually > > > > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > > > > I would add a big comment on top saying: > > > > > > > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > > > > kilobytes unused, let''s be gentle and do that." > > > > > > It seems to me that this change in Linux is really just papering over > > > the underlying issue. Or at the very least no one has adequately > > > explained what that real issue is and why this change is relevant to it > > > and/or an appropriate fix for it. > > > > > > The guest knows what target the toolstack has set for it is (its in the > > > target xenstore node), I don''t see any reason for the guest to be > > > further second guessing that value by examining maxmem (adjusted by a > > > fudge factor or otherwise). If the guest is seeing failures to increase > > > its reservation when trying to meet that target then either the > > > toolstack was buggy in asking it to hit a target greater than its maxmem > > > or it is hitting one of the other reason for increase reservation > > > failures. Since it needs to deal with the latter anyway I don''t see any > > > reason to special case maxmem as a cause for a failure. > > > > Do not forget that guest may change target itself. > > Yes it can, and that can fail either due to maxmem or due to ENOMEM, and > the kernel needs prepared to deal with that when it happens.Sure but why we would like to fail in endless loop if maxmem case could be easliy detected by checking XENMEM_maximum_reservation?> > Additionally, we would like to introduce xm compatibility > > mode which is a bit different then xl normal behavior. > > When then you really don''t want to be baking specifics of the current > model into the kernel, do you.Hmmm... Little misunderstanding. As I stated a few times I do not want bake any libxl or Xen stuff into Linux Kernel (including LIBXL_MAXMEM_CONSTANT). I just want to check limits which I think make sense in this case.> > I do not mention that it is always worth check the limits. > > It will save us a lot of trouble later. > > On the contrary, it seems to me that baking magic numbers into the > kernel based on current toolstack behaviour is asking for trouble later.Ditto... Daniel
Ian Campbell
2013-May-03 14:11 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, 2013-05-03 at 14:47 +0100, Daniel Kiper wrote:> On Fri, May 03, 2013 at 02:21:24PM +0100, Ian Campbell wrote: > > On Fri, 2013-05-03 at 14:00 +0100, Daniel Kiper wrote: > > > On Fri, May 03, 2013 at 09:15:32AM +0100, Ian Campbell wrote: > > > > On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote: > > > > > On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote: > > > > > > [...] > > > > > > > > > The xapi guys, CC''ed, might have more insights on what exactly is. > > > > > > > > I think that unless someone can remember what this issue was we should > > > > just chalk it up to a historical artefact of something xapi (or maybe > > > > some historical guest) was doing which we have no reason to believe > > > > needs to be carried over to libxl. > > > > > > > > IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 > > > > cycle and see how it goes. If someone can show either empirical evidence > > > > or (better) logically argue that a fudge is required then we can always > > > > put it back (or it may turn out to be the caller''s issue, in which case > > > > they can deal with it, hopefully xapi-on-libxl won''t apply this fudge > > > > twice...). > > > > > > > > Alternatively I''m also strongly considering having debug builds of the > > > > toolstack randomise the amount of slack, that ought to shake out any > > > > lingering issues... > > > > > > Do you suggest to postopone this work until 4.4 merge window? > > > > 4.4 is a Xen version, this is a Linux patch so I''m not sure what you > > mean. > > I thought about my libxl patches. Should I post new version without > LIBXL_MAXMEM_CONSTANT when 4.4 merge window open?Xen doesn''t have a concept of a merge window. However we are currently in feature freeze for 4.3. You''d need to speak to the release manager (George Dunlap) to see if your xen side patches are acceptable at this stage of the release. AIUI we were intending to do an RC1 release on Monday or Tuesday. It''s probably too late to be making any major changes to libxl''s behaviour without an extremely good rationale.> > > If yes, then I think that at least "xen/balloon: Enforce various limits on target" > > > patch (without this crazy libxl hack) should be applied. > > > > You mean this patch with only the MAX_DOMAIN_PAGES clamping? That seems > > plausible. > > ... and with XENMEM_maximum_reservation check but wihtout LIBXL_MAXMEM_CONSTANT_PAGES.I am less convinced by this bit, but not as dead against it as I am against anything with LIBXL_MAXMEM_CONSTANT_PAGES in it.> > > > > > I dislike having to pull this "hack" into Linux, but if it is actually > > > > > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > > > > > I would add a big comment on top saying: > > > > > > > > > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > > > > > kilobytes unused, let''s be gentle and do that." > > > > > > > > It seems to me that this change in Linux is really just papering over > > > > the underlying issue. Or at the very least no one has adequately > > > > explained what that real issue is and why this change is relevant to it > > > > and/or an appropriate fix for it. > > > > > > > > The guest knows what target the toolstack has set for it is (its in the > > > > target xenstore node), I don''t see any reason for the guest to be > > > > further second guessing that value by examining maxmem (adjusted by a > > > > fudge factor or otherwise). If the guest is seeing failures to increase > > > > its reservation when trying to meet that target then either the > > > > toolstack was buggy in asking it to hit a target greater than its maxmem > > > > or it is hitting one of the other reason for increase reservation > > > > failures. Since it needs to deal with the latter anyway I don''t see any > > > > reason to special case maxmem as a cause for a failure. > > > > > > Do not forget that guest may change target itself. > > > > Yes it can, and that can fail either due to maxmem or due to ENOMEM, and > > the kernel needs prepared to deal with that when it happens. > > Sure but why we would like to fail in endless loop if maxmem case > could be easliy detected by checking XENMEM_maximum_reservation?That endless loop is deliberate. When a target is set the balloon driver is supposed to try to reach it and if it fails at any given moment it is supposed to try again. This all relates to the changes made in bc2c0303226e. Now you could argue that this case is subtly different from the ENOMEM case which was the primary focus of that commit but have you thought about the behaviour you want in the case where maximum_reservation is subsequently raised? IMHO there''s no reason why the balloon driver shouldn''t then automatically make further progress towards the target. If the infinite loop bothers you then perhaps an exponential backoff in the frequency of attempts would be a suitable middle ground?> > > Additionally, we would like to introduce xm compatibility > > > mode which is a bit different then xl normal behavior. > > > > When then you really don''t want to be baking specifics of the current > > model into the kernel, do you. > > Hmmm... Little misunderstanding. As I stated a few times I do not > want bake any libxl or Xen stuff into Linux Kernel (including > LIBXL_MAXMEM_CONSTANT). I just want to check limits which I think > make sense in this case.Sorry, I never noticed you saying that. Where was it? Ian.
Daniel Kiper
2013-May-03 15:47 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, May 03, 2013 at 03:11:18PM +0100, Ian Campbell wrote:> On Fri, 2013-05-03 at 14:47 +0100, Daniel Kiper wrote: > > On Fri, May 03, 2013 at 02:21:24PM +0100, Ian Campbell wrote: > > > On Fri, 2013-05-03 at 14:00 +0100, Daniel Kiper wrote: > > > > On Fri, May 03, 2013 at 09:15:32AM +0100, Ian Campbell wrote: > > > > > On Thu, 2013-05-02 at 19:04 +0100, Konrad Rzeszutek Wilk wrote: > > > > > > On Thu, May 02, 2013 at 12:34:32PM +0100, Stefano Stabellini wrote: > > > > > > > > [...] > > > > > > > > > > > The xapi guys, CC''ed, might have more insights on what exactly is. > > > > > > > > > > I think that unless someone can remember what this issue was we should > > > > > just chalk it up to a historical artefact of something xapi (or maybe > > > > > some historical guest) was doing which we have no reason to believe > > > > > needs to be carried over to libxl. > > > > > > > > > > IOW I''m suggesting we set LIBXL_MAXMEM_CONSTANT to 0 early in the 4.4 > > > > > cycle and see how it goes. If someone can show either empirical evidence > > > > > or (better) logically argue that a fudge is required then we can always > > > > > put it back (or it may turn out to be the caller''s issue, in which case > > > > > they can deal with it, hopefully xapi-on-libxl won''t apply this fudge > > > > > twice...). > > > > > > > > > > Alternatively I''m also strongly considering having debug builds of the > > > > > toolstack randomise the amount of slack, that ought to shake out any > > > > > lingering issues... > > > > > > > > Do you suggest to postopone this work until 4.4 merge window? > > > > > > 4.4 is a Xen version, this is a Linux patch so I''m not sure what you > > > mean. > > > > I thought about my libxl patches. Should I post new version without > > LIBXL_MAXMEM_CONSTANT when 4.4 merge window open? > > Xen doesn''t have a concept of a merge window. However we are currently > in feature freeze for 4.3. You''d need to speak to the release managerYes, I know.> (George Dunlap) to see if your xen side patches are acceptable at this > stage of the release. > > AIUI we were intending to do an RC1 release on Monday or Tuesday. It''s > probably too late to be making any major changes to libxl''s behaviour > without an extremely good rationale.I do not insist on that because I am aware that removal of LIBXL_MAXMEM_CONSTANT is too big change. Earlier I thought that it will be possible because changes were not so drastic. I will post new version of libxl patches after releasing Xen 4.3.> > > > If yes, then I think that at least "xen/balloon: Enforce various limits on target" > > > > patch (without this crazy libxl hack) should be applied. > > > > > > You mean this patch with only the MAX_DOMAIN_PAGES clamping? That seems > > > plausible. > > > > ... and with XENMEM_maximum_reservation check but wihtout LIBXL_MAXMEM_CONSTANT_PAGES. > > I am less convinced by this bit, but not as dead against it as I am > against anything with LIBXL_MAXMEM_CONSTANT_PAGES in it. > > > > > > > > I dislike having to pull this "hack" into Linux, but if it is actually > > > > > > > important to leave LIBXL_MAXMEM_CONSTANT unused, then it is worth doing. > > > > > > > I would add a big comment on top saying: > > > > > > > > > > > > > > "libxl seems to think that we need to leave LIBXL_MAXMEM_CONSTANT > > > > > > > kilobytes unused, let''s be gentle and do that." > > > > > > > > > > It seems to me that this change in Linux is really just papering over > > > > > the underlying issue. Or at the very least no one has adequately > > > > > explained what that real issue is and why this change is relevant to it > > > > > and/or an appropriate fix for it. > > > > > > > > > > The guest knows what target the toolstack has set for it is (its in the > > > > > target xenstore node), I don''t see any reason for the guest to be > > > > > further second guessing that value by examining maxmem (adjusted by a > > > > > fudge factor or otherwise). If the guest is seeing failures to increase > > > > > its reservation when trying to meet that target then either the > > > > > toolstack was buggy in asking it to hit a target greater than its maxmem > > > > > or it is hitting one of the other reason for increase reservation > > > > > failures. Since it needs to deal with the latter anyway I don''t see any > > > > > reason to special case maxmem as a cause for a failure. > > > > > > > > Do not forget that guest may change target itself. > > > > > > Yes it can, and that can fail either due to maxmem or due to ENOMEM, and > > > the kernel needs prepared to deal with that when it happens. > > > > Sure but why we would like to fail in endless loop if maxmem case > > could be easliy detected by checking XENMEM_maximum_reservation? > > That endless loop is deliberate. When a target is set the balloon driver > is supposed to try to reach it and if it fails at any given moment it is > supposed to try again. This all relates to the changes made in > bc2c0303226e. > > Now you could argue that this case is subtly different from the ENOMEM > case which was the primary focus of that commit but have you thought > about the behaviour you want in the case where maximum_reservation is > subsequently raised? IMHO there''s no reason why the balloon driver > shouldn''t then automatically make further progress towards the target.OK, now it makes sens. Do we assume the same behavior for dom0? Could we change maximum_reservation for dom0 using xl?> If the infinite loop bothers you then perhaps an exponential backoff in > the frequency of attempts would be a suitable middle ground?Relevant patches made by me are merged some time ago.> > > > Additionally, we would like to introduce xm compatibility > > > > mode which is a bit different then xl normal behavior. > > > > > > When then you really don''t want to be baking specifics of the current > > > model into the kernel, do you. > > > > Hmmm... Little misunderstanding. As I stated a few times I do not > > want bake any libxl or Xen stuff into Linux Kernel (including > > LIBXL_MAXMEM_CONSTANT). I just want to check limits which I think > > make sense in this case. > > Sorry, I never noticed you saying that. Where was it?Here http://lists.xen.org/archives/html/xen-devel/2013-04/msg03259.html I stated that I do not like this constant. I explained why this was done in that way. Later I found relevant commit which introduced it and asked authors about it. I think that shows that I am not happy with LIBXL_MAXMEM_CONSTANT and I am looking for good solution for this problem. Daniel
Ian Campbell
2013-May-03 16:00 UTC
Re: [Xen-devel] [PATCH v2 2/2] xen/balloon: Enforce various limits on target
On Fri, 2013-05-03 at 16:47 +0100, Daniel Kiper wrote:> > > > > Do not forget that guest may change target itself. > > > > > > > > Yes it can, and that can fail either due to maxmem or due to ENOMEM, and > > > > the kernel needs prepared to deal with that when it happens. > > > > > > Sure but why we would like to fail in endless loop if maxmem case > > > could be easliy detected by checking XENMEM_maximum_reservation? > > > > That endless loop is deliberate. When a target is set the balloon driver > > is supposed to try to reach it and if it fails at any given moment it is > > supposed to try again. This all relates to the changes made in > > bc2c0303226e. > > > > Now you could argue that this case is subtly different from the ENOMEM > > case which was the primary focus of that commit but have you thought > > about the behaviour you want in the case where maximum_reservation is > > subsequently raised? IMHO there''s no reason why the balloon driver > > shouldn''t then automatically make further progress towards the target. > > OK, now it makes sens. Do we assume the same behavior for dom0? > Could we change maximum_reservation for dom0 using xl?I don''t think there''s any reason to special case dom0 here.> > If the infinite loop bothers you then perhaps an exponential backoff in > > the frequency of attempts would be a suitable middle ground? > > Relevant patches made by me are merged some time ago.Great!> > > > > Additionally, we would like to introduce xm compatibility > > > > > mode which is a bit different then xl normal behavior. > > > > > > > > When then you really don''t want to be baking specifics of the current > > > > model into the kernel, do you. > > > > > > Hmmm... Little misunderstanding. As I stated a few times I do not > > > want bake any libxl or Xen stuff into Linux Kernel (including > > > LIBXL_MAXMEM_CONSTANT). I just want to check limits which I think > > > make sense in this case. > > > > Sorry, I never noticed you saying that. Where was it? > > Here http://lists.xen.org/archives/html/xen-devel/2013-04/msg03259.html > I stated that I do not like this constant. I explained why this was done > in that way.Sorry, I read that mail as arguing that it must be done this way ("this is a must").> Later I found relevant commit which introduced it and asked > authors about it. I think that shows that I am not happy with > LIBXL_MAXMEM_CONSTANT and I am looking for good solution for this problem.Ian.