Konrad Rzeszutek Wilk
2013-Sep-25 20:40 UTC
[RFC] Make xl vcpu-set work in overcommit and with PV guests. (v2).
Hey,

In Xen 4.3 the 'xl vcpu-set' command would not work for CPU overcommit - meaning where you request more vCPUs than the host has pCPUs. We added the --ignore-host parameter to allow the system admin to still do this operation, as a stop-gap solution.

For Xen 4.4 I had posted two (three?) sets of patches that try different things for this. The discussion narrowed down to (and please correct me if I am incorrect): let's still print the warning but let the operation go ahead without requiring the --ignore-host parameter. Since the parameter is baked in it still has to work, but we can just ignore it. The first patch does that.

The second patch is a bug I found where 'xl vcpu-set' would not work on PV guests. I think the same issue is in Xen 4.3 but I am not sure. Patch #2 fixes that.

The patches are also at:

 git://xenbits.xen.org/people/konradwilk/xen.git vcpu.v1.1

The diffstat and log is:

 docs/man/xl.pod.1         | 15 ++++++++++++++-
 tools/libxl/libxl.c       | 28 ++++++++++++++++++++--------
 tools/libxl/xl_cmdimpl.c  | 28 ++++++++++++----------------
 tools/libxl/xl_cmdtable.c |  2 +-
 4 files changed, 47 insertions(+), 26 deletions(-)

Konrad Rzeszutek Wilk (2):
      xl: neuter vcpu-set --ignore-host.
      xl/vcpuset: Make it work for PV guests.
Konrad Rzeszutek Wilk
2013-Sep-25 20:40 UTC
[PATCH 1/2] xl: neuter vcpu-set --ignore-host.
When Xen 4.3 was released we had a discussion whether we should
allow the vcpu-set command to let the user set more vCPUs than
physical CPUs for a guest (it didn't). The author brought up:
 - Xend used to do it,
 - If a user wants to do it, let them do it,
 - The original author of the change did not realize the
   side-effect his patch caused and had no intention of changing it.
 - The user can already boot a massively overcommitted guest by
   having a large 'vcpus=' value in the guest config and we allow
   that.

Since we were close to the release we added the --ignore-host parameter
as a mechanism for a user to still set more vCPUs than the physical
machine has, as a stop-gap.

This patch keeps said option but neuters the check so that we
can overcommit. In other words - by default the user is
allowed to set as many vCPUs as they would like.

Furthermore mention this parameter change in the man-page.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 docs/man/xl.pod.1         | 15 ++++++++++++++-
 tools/libxl/xl_cmdimpl.c  | 28 ++++++++++++----------------
 tools/libxl/xl_cmdtable.c |  2 +-
 3 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 5975d7b..1199d01 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -597,7 +597,7 @@ This command is only available for HVM domains.
 Moves a domain out of the paused state. This will allow a previously
 paused domain to now be eligible for scheduling by the Xen hypervisor.
 
-=item B<vcpu-set> I<domain-id> I<vcpu-count>
+=item B<vcpu-set> I<OPTION> I<domain-id> I<vcpu-count>
 
 Enables the I<vcpu-count> virtual CPUs for the domain in question.
 Like mem-set, this command can only allocate up to the maximum virtual
@@ -614,6 +614,19 @@ quietly ignored.
 Some guests may need to actually bring the newly added CPU online
 after B<vcpu-set>, go to B<SEE ALSO> section for information.
 
+B<OPTION>
+
+=over 4
+
+=item B<-i>, B<--ignore-host>
+
+Deprecated. Used to allow the user to increase the current number of
+active VCPUs, if it was greater than physical number of CPUs.
+This seatbelt option was introduced due to being (depending on the type
+of workload and guest OS) performance drawbacks of CPU overcommitting.
+
+=back
+
 =item B<vcpu-list> [I<domain-id>]
 
 Lists VCPU information for a specific domain. If no domain is
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 3d7eaad..ecab9a6 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4536,11 +4536,12 @@ int main_vcpupin(int argc, char **argv)
     return 0;
 }
 
-static void vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
+static void vcpuset(uint32_t domid, const char* nr_vcpus)
 {
     char *endptr;
     unsigned int max_vcpus, i;
     libxl_bitmap cpumap;
+    unsigned int host_cpu;
 
     max_vcpus = strtoul(nr_vcpus, &endptr, 10);
     if (nr_vcpus == endptr) {
@@ -4549,19 +4550,14 @@ static void vcpuset(uint32_t domid, const char* nr_vcpus, int check_host)
     }
 
     /*
-     * Maximum amount of vCPUS the guest is allowed to set is limited
-     * by the host's amount of pCPUs.
+     * Warn if maximum amount of vCPUS the guest wants is higher than
+     * the host's amount of pCPUs.
      */
-    if (check_host) {
-        unsigned int host_cpu = libxl_get_max_cpus(ctx);
-        if (max_vcpus > host_cpu) {
-            fprintf(stderr, "You are overcommmitting! You have %d physical " \
-                    " CPUs and want %d vCPUs! Aborting, use --ignore-host to " \
-                    " continue\n", host_cpu, max_vcpus);
-            return;
-        }
-        /* NB: This also limits how many are set in the bitmap */
-        max_vcpus = (max_vcpus > host_cpu ? host_cpu : max_vcpus);
+    host_cpu = libxl_get_max_cpus(ctx);
+    if (max_vcpus > host_cpu) {
+        fprintf(stderr, "WARNING: You are overcommmitting! You have %d" \
+                " physical CPUs and want %d vCPUs! Continuing..\n",
+                host_cpu, max_vcpus);
     }
     if (libxl_cpu_bitmap_alloc(ctx, &cpumap, max_vcpus)) {
         fprintf(stderr, "libxl_cpu_bitmap_alloc failed\n");
@@ -4582,17 +4578,17 @@ int main_vcpuset(int argc, char **argv)
         {"ignore-host", 0, 0, 'i'},
         {0, 0, 0, 0}
     };
-    int opt, check_host = 1;
+    int opt;
 
     SWITCH_FOREACH_OPT(opt, "i", opts, "vcpu-set", 2) {
     case 'i':
-        check_host = 0;
+        /* deprecated. */;
         break;
    default:
        break;
    }
 
-    vcpuset(find_domain(argv[optind]), argv[optind + 1], check_host);
+    vcpuset(find_domain(argv[optind]), argv[optind + 1]);
    return 0;
 }
 
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 326a660..2ed9715 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -219,7 +219,7 @@ struct cmd_spec cmd_table[] = {
       &main_vcpuset, 0, 1,
       "Set the number of active VCPUs allowed for the domain",
       "[option] <Domain> <vCPUs>",
-      "-i, --ignore-host Don't limit the vCPU based on the host CPU count",
+      "-i, --ignore-host Don't limit the vCPU based on the host CPU count (deprecated)",
     },
     { "vm-list",
       &main_vm_list, 0, 0,
-- 
1.8.3.1
Konrad Rzeszutek Wilk
2013-Sep-25 20:40 UTC
[PATCH 2/2] xl/vcpuset: Make it work for PV guests.
When we try to set the number of VCPUs for a PV guest we might
have QEMU running serving the framebuffer, or not. Either way
we should not use QMP when trying to alter the number of VCPUs
a PV guest can have.

This fixes the bug where for a PV guest 'xl vcpu-set' results in:

libxl: error: libxl_qmp.c:702:libxl__qmp_initialize: Connection error: No such file or directory

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxl/libxl.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 0879f23..8786074 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4307,16 +4307,28 @@ int libxl_set_vcpuonline(libxl_ctx *ctx, uint32_t domid, libxl_bitmap *cpumap)
 {
     GC_INIT(ctx);
     int rc;
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+
+    libxl_domain_type type = libxl__domain_type(gc, domid);
+    if (type == LIBXL_DOMAIN_TYPE_INVALID) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (type == LIBXL_DOMAIN_TYPE_PV)
         rc = libxl__set_vcpuonline_xenstore(gc, domid, cpumap);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        rc = libxl__set_vcpuonline_qmp(gc, domid, cpumap);
-        break;
-    default:
-        rc = ERROR_INVAL;
+    else {
+        switch (libxl__device_model_version_running(gc, domid)) {
+        case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+            rc = libxl__set_vcpuonline_xenstore(gc, domid, cpumap);
+            break;
+        case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+            rc = libxl__set_vcpuonline_qmp(gc, domid, cpumap);
+            break;
+        default:
+            rc = ERROR_INVAL;
+        }
     }
+out:
     GC_FREE;
     return rc;
 }
-- 
1.8.3.1
On mer, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
> This patch keeps said option but neuters the check so that we
> can overcommit. In other words - by default the user is
> allowed to set as many vCPUs as they would like.
>
What about using the parameter to silence the warning? I mean, by default
we allow more vCPUs than pCPUs and print the warning. With '-i' we allow
that too (of course) and _do_not_ print the warning.

It's definitely not a big deal, it's just one way of not having a
completely useless and neglected param around... It materialized in my
mind while reading the description of the change, and I felt like I was
sharing it. :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On Wed, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
> When Xen 4.3 was released we had a discussion whether we should
> allow the vcpu-set command to let the user set more vCPUs than
> physical CPUs for a guest (it didn't). The author brought up:
> - Xend used to do it,

IMHO xend is buggy here. If it were being maintained I'd encourage a
patch to file this particular sharp edge off.

> - If a user wants to do it, let them do it,

We do, we have an option for those who know what they are doing to use
in the tiny minority of cases where they need to do this.

> - The original author of the change did not realize the
>   side-effect his patch caused and had no intention of changing it.

A happy accident then.

> - The user can already boot a massively overcommitted guest by
>   having a large 'vcpus=' value in the guest config and we allow
>   that.

IMHO this is an xl bug, I'd be happy to see a patch to fix this and
require an override here too.

> Since we were close to the release we added the --ignore-host parameter
> as a mechanism for a user to still set more vCPUs than the physical
> machine has, as a stop-gap.
>
> This patch keeps said option but neuters the check so that we
> can overcommit. In other words - by default the user is
> allowed to set as many vCPUs as they would like.

And why would a naive user want to do this? Non-naive users can use the
option if this is what they really want, and are probably grateful for
the catch if they didn't intend to overcommit, which is almost always,
even for expert users.

This change needs far better rationalisation than "because xend did it"
and "because we can". IMHO.

> Furthermore mention this parameter change in the man-page.
Ian Campbell
2013-Sep-26 09:10 UTC
Re: [PATCH 2/2] xl/vcpuset: Make it work for PV guests.
On Wed, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
> When we try to set the number of VCPUs for a PV guest we might
> have QEMU running serving the framebuffer, or not. Either way
> we should not use QMP when trying to alter the number of VCPUs
> a PV guest can have.
[...]
> +    libxl_domain_type type = libxl__domain_type(gc, domid);
> +    if (type == LIBXL_DOMAIN_TYPE_INVALID) {
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    if (type == LIBXL_DOMAIN_TYPE_PV)

Please switch on libxl__domain_type(), that way we will get a build
error if/when a new type is added.
Konrad Rzeszutek Wilk
2013-Sep-26 12:45 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
On Thu, Sep 26, 2013 at 09:23:28AM +0200, Dario Faggioli wrote:
> On mer, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
> > This patch keeps said option but neuters the check so that we
> > can overcommit. In other words - by default the user is
> > allowed to set as many vCPUs as they would like.
>
> What about using the parameter to silence the warning? I mean, by default
> we allow more vCPUs than pCPUs and print the warning. With '-i' we allow
> that too (of course) and _do_not_ print the warning.

Good point. Let me redo it that way.
Konrad Rzeszutek Wilk
2013-Sep-26 12:48 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
On Thu, Sep 26, 2013 at 10:06:31AM +0100, Ian Campbell wrote:
> On Wed, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
> > When Xen 4.3 was released we had a discussion whether we should
> > allow the vcpu-set command to let the user set more vCPUs than
> > physical CPUs for a guest (it didn't). The author brought up:
> > - Xend used to do it,
>
> IMHO xend is buggy here. If it were being maintained I'd encourage a
> patch to file this particular sharp edge off.
>
> > - The user can already boot a massively overcommitted guest by
> >   having a large 'vcpus=' value in the guest config and we allow
> >   that.
>
> IMHO this is an xl bug, I'd be happy to see a patch to fix this and
> require an override here too.

I think I posted one some time ago, but I don't recall anybody
commenting on it. Will repost it.

> > This patch keeps said option but neuters the check so that we
> > can overcommit. In other words - by default the user is
> > allowed to set as many vCPUs as they would like.
>
> And why would a naive user want to do this? Non-naive users can use the
> option if this is what they really want, and are probably grateful for
> the catch if they didn't intend to overcommit, which is almost always,
> even for expert users.
>
> This change needs far better rationalisation than "because xend did it"
> and "because we can". IMHO.

I am going to defer to George here. His viewpoint (I am going to
probably mangle it up) was that - if the user wants to do it, let
him/her do it without us putting obstacles in the way. And I think
Ian Jackson was ambivalent here and was deferring to George.

George?
Konrad Rzeszutek Wilk
2013-Sep-26 15:28 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
> > - The user can already boot a massively overcommitted guest by
> >   having a large 'vcpus=' value in the guest config and we allow
> >   that.
>
> IMHO this is an xl bug, I'd be happy to see a patch to fix this and
> require an override here too.

I actually think that doing vCPU overcommit is an OK process. If you go
down the path of 'don't do this b/c it can cause performance degradation'
you might end up with tons of things that we should be turning off:
 - don't use file but use phy for block.
 - if you have 40GB SR-IOV, use that instead of vif.
 - booting PV? You should be booting it in HVM mode on latest machines.
 - etc.

They are all in some cases subjective and the user can have a legitimate
reason to do this instead of using one we think is better.

I want the user to be able to make that choice without constraining
them. This is open source after all - we lift the barriers, not put them
in.

> > Since we were close to the release we added the --ignore-host parameter
> > as a mechanism for a user to still set more vCPUs than the physical
> > machine has, as a stop-gap.
> >
> > This patch keeps said option but neuters the check so that we
> > can overcommit. In other words - by default the user is
> > allowed to set as many vCPUs as they would like.
>
> And why would a naive user want to do this? Non-naive users can use the
> option if this is what they really want, and are probably grateful for
> the catch if they didn't intend to overcommit, which is almost always,
> even for expert users.

I think adding the WARNING is a good idea (which is how it does it
right now). But this is similar to running a guest on a NUMA machine
without putting it in proper NUMA containers - we should WARN, not just
outright stop the guest.

> This change needs far better rationalisation than "because xend did it"
> and "because we can". IMHO.

From my non-technical view (and I am not sure if I had made this clear)
there are a lot of users that use 'xend' and want to switch to 'xl'. As
such, to make this backwards compatible, even bugs have to be considered.
Perhaps another way to do this is to have a global flag - 'xend_compatible'
- where even things that are bugs are expected to work certain ways.

I do get your frustration - why would a normal user want to shoot
themselves in the foot with VCPU over-subscription? I have some faint
clue - but I do get a stream of requests from customers demanding it.
And if they pay to shoot themselves in the foot - well, here is a cocked
gun and let me point the gun at your foot and you can pull the trigger.

Lastly, now that the PV ticketlocks are in and they work for both PV and
HVM I am curious how many people are going to start using it.

Sorry about the long twisted answer - I hope this will get the discussion
a bit more going forward so we can decide what we want to do for Xen 4.4.
On Thu, 2013-09-26 at 11:28 -0400, Konrad Rzeszutek Wilk wrote:
> I actually think that doing vCPU overcommit is an OK process. If you go
> down the path of 'don't do this b/c it can cause performance degradation'
> you might end up with tons of things that we should be turning off:
>  - don't use file but use phy for block.
>  - if you have 40GB SR-IOV, use that instead of vif.
>  - booting PV? You should be booting it in HVM mode on latest machines.
>  - etc.

Those are all legitimate choices for a user to make.

Overcommitting VCPUs is not. Choosing to overcommit VCPUs is useful in
exactly one scenario -- testing how bad CPU overcommit makes things.
That is an expert use case and a legitimate reason to use an override.

I've asked *repeatedly* now for an actual use case where a user would
want to overcommit. IMHO it is always an accident/user error to do this.
If you have an actual use case then please share it; the presence of
such a thing would go a long way to changing my opinion.

> They are all in some cases subjective and the user can have a legitimate
> reason to do this instead of using one we think is better.
>
> I want the user to be able to make that choice without constraining
> them. This is open source after all - we lift the barriers, not put them
> in.

I'm not even going to dignify this kind of vapid tosh with a response.

> I think adding the WARNING is a good idea (which is how it does it
> right now). But this is similar to running a guest on a NUMA machine
> without putting it in proper NUMA containers

It's not at all the same as that. In that case we can apply sensible
defaults etc, as Dario has been doing.

> - we should WARN, not just outright stop the guest.
>
> From my non-technical view (and I am not sure if I had made this clear)
> there are a lot of users that use 'xend' and want to switch to 'xl'. As
> such, to make this backwards compatible, even bugs have to be considered.
> Perhaps another way to do this is to have a global flag - 'xend_compatible'
> - where even things that are bugs are expected to work certain ways.

If xend were still being maintained this is exactly the sort of
improvement I would advocate making to it. Yes, that would mean xend
behaved differently for xend users in the next release. I would call it
an improvement or a bugfix, not an incompatibility. You seem to think
xend has or had some sort of CLI compatibility guarantee, which it
did/does not; when it was developed it was changed regularly (for,
mostly, good reasons).

There is no reason to reimplement xend bugs in xl. Nor is "xl must be
compatible with xend" a reason to avoid making progress which we would
have been making with xend anyway if it were still being developed.

> I do get your frustration - why would a normal user want to shoot
> themselves in the foot with VCPU over-subscription? I have some faint
> clue - but I do get a stream of requests from customers demanding it.

And not a single one has explained to you why they want it? Or perhaps
you could explain this faint clue of yours?

I'm not saying we can't make this change. I'm saying you haven't even
come close to giving a reasonable justification for it. I seem to
remember saying exactly the same thing last time we went around this
mulberry bush too.

> And if they pay to shoot themselves in the foot - well, here is a cocked
> gun and let me point the gun at your foot and you can pull the trigger.

They can use the override.

> Lastly, now that the PV ticketlocks are in and they work for both PV and
> HVM I am curious how many people are going to start using it.

Why would they? What possible benefit is there to doing this whether or
not PV ticketlocks are available?
On 26/09/13 16:47, Ian Campbell wrote:
> On Thu, 2013-09-26 at 11:28 -0400, Konrad Rzeszutek Wilk wrote:
>>>> - The user can already boot a massively overcommitted guest by
>>>> having a large 'vcpus=' value in the guest config and we allow
>>>> that.
>>> IMHO this is an xl bug, I'd be happy to see a patch to fix this and
>>> require an override here too.
>> I actually think that doing vCPU overcommit is an OK process. If you go
>> down the path of 'don't do this b/c it can cause performance degradation'
>> you might end up with tons of things that we should be turning off:
>> - don't use file but use phy for block.
>> - if you have 40GB SR-IOV, use that instead of vif.
>> - booting PV? You should be booting it in HVM mode on latest machines.
>> - etc.
> Those are all legitimate choices for a user to make.
>
> Overcommitting VCPUs is not.

Depends how the overcommitting happens. What about:

* A user creating N VMs which are individually undercommitted but which
have the same effect as creating 1 VM which is stupidly overcommitted.

* Power management deciding to shut down some of the PCPUs because it can
service all the current VCPUs from somewhat idle domains on fewer PCPUs.

While I agree that creating a single VM which is overcommitted in terms of
VCPUs is either a user error or a power-user move, that alone is not a
justification for making it impossible/very hard to do.

My two cents.

~Andrew
On Thu, 2013-09-26 at 17:01 +0100, Andrew Cooper wrote:
> On 26/09/13 16:47, Ian Campbell wrote:
> > On Thu, 2013-09-26 at 11:28 -0400, Konrad Rzeszutek Wilk wrote:
> >>>> - The user can already boot a massively overcommitted guest by
> >>>> having a large 'vcpus=' value in the guest config and we allow
> >>>> that.
> >>> IMHO this is an xl bug, I'd be happy to see a patch to fix this and
> >>> require an override here too.
> >> I actually think that doing vCPU overcommit is an OK process. If you
> >> go down the path of 'don't do this b/c it can cause performance
> >> degradation' you might end up with tons of things that we should be
> >> turning off:
> >> - don't use file but use phy for block.
> >> - if you have 40GB SR-IOV, use that instead of vif.
> >> - booting PV? You should be booting it in HVM mode on latest machines.
> >> - etc.
> > Those are all legitimate choices for a user to make.
> >
> > Overcommitting VCPUs is not.
>
> Depends how the overcommitting happens. What about:
>
> * A user creating N VMs which are individually undercommitted but which
> have the same effect as creating 1 VM which is stupidly overcommitted.

That's a different, and legitimate, case. The case here is creating a
single VM which has more VCPUs than PCPUs.

> * Power management deciding to shut down some of the PCPUs because it can
> service all the current VCPUs from somewhat idle domains on fewer PCPUs.

Hopefully these checks are based on the actual total and not the
artificially lowered limit. If not then that would be a bug.

> While I agree that creating a single VM which is overcommitted in terms
> of VCPUs is either a user error or a power-user move, that alone is not a
> justification for making it impossible/very hard to do.

It's not even as obscure as "very hard"; there is a simple CLI option to
allow it!

Ian.
On 26/09/13 13:48, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 26, 2013 at 10:06:31AM +0100, Ian Campbell wrote:
>> On Wed, 2013-09-25 at 16:40 -0400, Konrad Rzeszutek Wilk wrote:
>>> When Xen 4.3 was released we had a discussion whether we should
>>> allow the vcpu-set command to allow the user to set more than
>>> physical CPUs for a guest (it didn't). The author brought up:
>>> - Xend used to do it,
>> IMHO xend is buggy here. If it were being maintained I would encourage
>> a patch to file this particular sharp edge off.
>>
>>> - If a user wants to do it, let them do it,
>> We do, we have an option for those who know what they are doing to use
>> in the tiny minority of cases where they need to do this.
>>
>>> - The original author of the change did not realize the
>>> side-effect his patch caused this and had no intention of changing it.
>> A happy accident then.
>>
>>> - The user can already boot a massively overcommitted guest by
>>> having a large 'vcpus=' value in the guest config and we allow
>>> that.
>> IMHO this is an xl bug, I'd be happy to see a patch to fix this and
>> require an override here too.
> I think I posted one some time ago, but I don't recall anybody
> commenting on it. Will repost it.
>>> Since we were close to the release we added the --ignore-host
>>> parameter as a mechanism for a user to still set more vCPUs than the
>>> physical machine has, as a stop-gap.
>>>
>>> This patch keeps said option but neuters the check so that we
>>> can overcommit. In other words - by default the user is
>>> allowed to set as many vCPUs as they would like.
>> And why would a naive user want to do this? Non-naive users can use the
>> option if this is what they really want, and are probably grateful for
>> the catch if they didn't intend to overcommit, which is almost always,
>> even for expert users.
>>
>> This change needs far better rationalisation than "because xend did it"
>> and "because we can". IMHO.
> I am going to defer to George here.
> His viewpoint (I am going to probably mangle it up) was that if the user
> wants to do it, let him/her do it without us putting up obstacles.
>
> And I think Ian Jackson was ambivalent here and was deferring to George.

So I've gone back and read the original thread, and what I actually said
was:

"So I think the right thing to do long-term is to make it possible to do in
xl. Having a "seatbelt" restriction by default that can be overridden would
be OK with me, but I think a warning message when vcpus > pcpus would
suffice."

And my summary of mine and IanC's positions at the time (which IanC did not
dispute) was:

"We both agree that "vcpus > pcpus" is a bad configuration. I think ideally
we should support it (because administrators should be allowed to shoot
themselves in the foot) and Ian[C] seems to be making the case that we
shouldn't support it."

IanJ, as I understood him, agreed with me that it should be *possible*.

As IanC points out, it is possible -- you just have to add "--ignore-host".

So given what all of us think, keeping the "seatbelt" is probably the best
compromise. IanC is happy that a hapless user will not accidentally shoot
his own foot, and IanJ and I are happy that a skilled user can shoot her
own foot if she really wants to.

-George
Konrad Rzeszutek Wilk
2013-Sep-27 01:44 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
On Thu, Sep 26, 2013 at 05:25:17PM +0100, George Dunlap wrote:
. snip..
> So given what all of us think, keeping the "seatbelt" is probably
> the best compromise. IanC is happy that a hapless user will not
> accidentally shoot his own foot, and IanJ and I are happy that a
> skilled user can shoot her own foot if she really wants to.

Excellent. Let me prep a patch with said seatbelt option.
Konrad Rzeszutek Wilk
2013-Sep-27 01:52 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
. snip..
> > I do get your frustration - why would a normal user want to shoot
> > themselves in the foot with VCPU over-subscription? I have some faint
> > clue - but I do get a stream of requests from customers demanding it.
>
> And not a single one has explained to you why they want it?
>
> Or perhaps you could explain this faint clue of yours?

I believe it is mostly driven by VMware making statements that this is an
OK scenario (see ESXi CPU considerations in
http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf)

> I'm not saying we can't make this change. I'm saying you haven't even
> come close to giving a reasonable justification for it. I seem to
> remember saying exactly the same thing last time we went around this
> mulberry bush too.

I learned two new idioms today - mulberry bush and tosh :-)

> > And if they pay to
> > shoot themselves in the foot - well, here is a cocked gun and let me
> > point the gun at your foot and you can pull the trigger.
>
> They can use the override.

Yes they can. I was hoping we would allow a non-override mode for those who
really don't want any of these overrides. George suggested the "seatbelt"
option and that looks to be a good compromise to me.

> > Lastly, now that the PV ticketlocks are in and they work for both PV
> > and HVM I am curious how many people are going to start using it.
>
> Why would they? What possible benefit is there to doing this whether or
> not PV ticketlocks are available?

Because now one can run Linux guests without incurring huge latency waits
due to spinlock contention. This makes it possible to actually compile a
Linux kernel in massively overcommitted scenarios.
On Thu, 2013-09-26 at 21:52 -0400, Konrad Rzeszutek Wilk wrote:
> . snip..
> > > I do get your frustration - why would a normal user want to shoot
> > > themselves in the foot with VCPU over-subscription? I have some
> > > faint clue - but I do get a stream of requests from customers
> > > demanding it.
> >
> > And not a single one has explained to you why they want it?
> >
> > Or perhaps you could explain this faint clue of yours?
>
> I believe it is mostly driven by VMware making statements that this is
> an OK scenario (see ESXi CPU considerations in
> http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf)

The only reference to CPU (as opposed to memory) overcommit I can see in
there is:

	In most environments ESXi allows significant levels of CPU
	overcommitment (that is, running more vCPUs on a host than the
	total number of physical processor cores in that host) without
	impacting virtual machine performance.

I think this is pretty clearly referring to the case where the total of all
vCPUs across all guests exceeds the number of pCPUs. That kind of
overcommit is obviously fine, and not at all what I'm arguing against.

Nowhere in that doc did I get the impression that VMware were suggesting
that giving a single VM more vCPUs than the host has pCPUs was a useful or
sensible thing to do.

To be clear -- the patches you are proposing remove the safety catch
preventing the latter (1 VM with vCPUs > pCPUs), not the former
(sum_all_VMs(vCPUs) > pCPUs). If xl is disallowing sum_all_VMs(vCPUs) >
pCPUs then there is a real bug. I don't believe this to be the case though
(if it is then we've been talking at cross purposes all along).

> > I'm not saying we can't make this change. I'm saying you haven't even
> > come close to giving a reasonable justification for it. I seem to
> > remember saying exactly the same thing last time we went around this
> > mulberry bush too.
> I learned two new idioms today - mulberry bush and tosh :-)

Sorry about that, was a bit grumpy.

> > > And if they pay to
> > > shoot themselves in the foot - well, here is a cocked gun and let me
> > > point the gun at your foot and you can pull the trigger.
> >
> > They can use the override.
>
> Yes they can. I was hoping we would allow a non-override mode for those
> who really don't want any of these overrides. George suggested the
> "seatbelt" option and that looks to be a good compromise to me.

AIUI the "seatbelt" to which George was referring is the current behaviour
(i.e. the override). He said:

	So given what all of us think, keeping the "seatbelt" is
	probably the best compromise.

i.e. to him the seatbelt is the status quo.

So what do you mean by seatbelt?

Or do you just mean to add the restriction + override to xl create too?
That would be good.

> > > Lastly, now that the PV ticketlocks are in and they work for both PV
> > > and HVM I am curious how many people are going to start using it.
> >
> > Why would they? What possible benefit is there to doing this whether
> > or not PV ticketlocks are available?
>
> Because now one can run Linux guests without incurring huge latency
> waits due to spinlock contention. This makes it possible to actually
> compile a Linux kernel in massively overcommitted scenarios.

But *why*, for what possible reason would a user want to do that? That is
the key question which has never been answered, and that's why I keep
nacking this patch.

Ian.
Konrad Rzeszutek Wilk
2013-Sep-30 18:40 UTC
Re: [PATCH 1/2] xl: neuter vcpu-set --ignore-host.
On Fri, Sep 27, 2013 at 09:41:02AM +0100, Ian Campbell wrote:
> On Thu, 2013-09-26 at 21:52 -0400, Konrad Rzeszutek Wilk wrote:
. snip..
> > I believe it is mostly driven by VMware making statements that this is
> > an OK scenario (see ESXi CPU considerations in
> > http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf)
>
> The only reference to CPU (as opposed to memory) overcommit I can see in
> there is:
> 	In most environments ESXi allows significant levels of CPU
> 	overcommitment (that is, running more vCPUs on a host than the
> 	total number of physical processor cores in that host) without
> 	impacting virtual machine performance.
>
> I think this is pretty clearly referring to the case where the total of
> all vCPUs across all guests exceeds the number of pCPUs. That kind of
> overcommit is obviously fine, and not at all what I'm arguing against.

I read that as running one guest with more vCPUs than host CPUs.

> Nowhere in that doc did I get the impression that VMware were suggesting
> that giving a single VM more vCPUs than the host has pCPUs was a useful
> or sensible thing to do.
>
> To be clear -- the patches you are proposing remove the safety catch
> preventing the latter (1 VM with vCPUs > pCPUs), not the former
> (sum_all_VMs(vCPUs) > pCPUs). If xl is disallowing sum_all_VMs(vCPUs) >
> pCPUs then there is a real bug. I don't believe this to be the case
> though (if it is then we've been talking at cross purposes all along).

That scenario (sum_all_VMs(vCPUs) > pCPUs) is working semi-OK as long as
each VM's vCPUs <= pCPUs.

.. snip..
> > > They can use the override.
> >
> > Yes they can. I was hoping we would allow a non-override mode for
> > those who really don't want any of these overrides. George suggested
> > the "seatbelt" option and that looks to be a good compromise to me.
>
> AIUI the "seatbelt" to which George was referring is the current
> behaviour (i.e. the override). He said:
> 	So given what all of us think, keeping the "seatbelt" is
> 	probably the best compromise.
>
> i.e. to him the seatbelt is the status quo.
>
> So what do you mean by seatbelt?

I was thinking of something like this in /etc/xen/xl.conf:

	# Disallow (unless used with an --ignore-.. parameter) certain
	# operations that would be questionable from a performance
	# standpoint. For example, overcommitting a single guest to have
	# more VCPUs than there are physical CPUs.
	seatbelt=yes

> Or do you just mean to add the restriction + override to xl create too?
> That would be good.

That is a separate patch I will post. I definitely want to add the check
for it (so vCPUs > pCPUs => warn the user).

> > > > Lastly, now that the PV ticketlocks are in and they work for both
> > > > PV and HVM I am curious how many people are going to start using
> > > > it.
> > >
> > > Why would they? What possible benefit is there to doing this whether
> > > or not PV ticketlocks are available?
> >
> > Because now one can run Linux guests without incurring huge latency
> > waits due to spinlock contention. This makes it possible to actually
> > compile a Linux kernel in massively overcommitted scenarios.
>
> But *why*, for what possible reason would a user want to do that? That
> is the key question which has never been answered, and that's why I keep
> nacking this patch.

And why did the chicken cross the street? To get to the other side. [OK,
that is a pretty poor joke.]

My only reasoning behind this is:
 - Keep as much xend behavior as possible to ease the transition. It does
   not have to be on by default, but a global option to turn this on/off
   would be easy.
 - Users reading "best practices" and assuming that vCPU overcommit for a
   single guest should just work.

I will poke more at the customers about why they would want to do it, but
that will take some time to get an answer.
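[Editorial note: for readers following the thread, the policy being
negotiated above can be sketched in a few lines of illustrative Python.
This is a sketch only -- the function name, return values, and the
`seatbelt` knob are hypothetical models of the proposed xl.conf global and
the existing --ignore-host CLI flag, not the actual xl/libxl code.]

```python
def check_vcpu_set(requested_vcpus, host_pcpus,
                   ignore_host=False, seatbelt=True):
    """Model of the 'xl vcpu-set' overcommit policy under discussion.

    ignore_host models the existing --ignore-host CLI override;
    seatbelt models the proposed global xl.conf option.
    """
    if requested_vcpus <= host_pcpus:
        # No single-guest overcommit: always allowed.
        return "ok"
    if ignore_host or not seatbelt:
        # Overcommit permitted, but the user is warned.
        return "ok-with-warning"
    # Seatbelt fastened and no override given: refuse.
    return "refused"
```

Under this model the Xen 4.3 behaviour corresponds to `seatbelt=True`
(refuse unless `--ignore-host` is passed), while Konrad's proposed default
corresponds to `seatbelt=False` (warn but proceed).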