Laszlo Ersek
2022-Sep-05 11:25 UTC
[Libguestfs] [p2v PATCH 2/6] restrict vCPU topology to (a) fully populated physical, or (b) 1 * N * 1
Currently, the CPU topology for the converted domain is determined as follows: (1) main() [main.c] (2) set_config_defaults() [main.c] vcpus <-- sysconf (_SC_NPROCESSORS_ONLN) get_cpu_config() [cpuid.c] get_topology() [cpuid.c] sockets <-- lscpu (physical) cores <-- lscpu (physical) threads <-- lscpu (physical) (3) update_config_from_kernel_cmdline() [kernel-config.c] vcpus <-- kernel cmdline sockets <-- kernel cmdline cores <-- kernel cmdline threads <-- kernel cmdline (4) opt#1: kernel_conversion() [kernel.c] no change to topology info opt#2: gui_conversion() [gui.c] vcpus <-- GUI (5) ... start_conversion() [conversion.c] generate_physical_xml() [physical-xml.c] format vcpus, sockets, cores, threads In (2), we initialize the topology information from the physical machine. In (3), we override each property in isolation from the kernel command line (this is done for both kernel conversion and GUI conversion; in the former case, it is the only way for the user to influence each element of the topology). In (4), in case we picked GUI conversion, the VCPU number can be overridden yet another time on the GUI. In (5), we format the topology information into the domain XML. The problem is that it's easy to create inconsistent topologies that libvirt complains about. One example is that in step (4) on the GUI, if the flat VCPU count is reduced by the user, it can result in sockets partially populated with cores or threads, or in cores partially populated with threads. Another example (even if the user does not touch the VCPU count) is a partially onlined CPU topology on the physical machine: _SC_NPROCESSORS_ONLN does not reflect any topology information, and if the flat number it returns is less than the (sockets * cores * threads) product from "lscpu", then the online logical processors will be "compressed to the left" by libvirt just the same as after the manual VCPU count reduction. An over-complicated way to fix this would be the following: - parse the output of "lscpu -p" (which is available since RHEL6 -- "lscpu" is altogether missing in RHEL5), retrieving online-ness combined with full topology information - expose complete topology info on the GUI - format the complete topology information for libvirt. A simpler way (which is what this patch series chooses) is to remove some flexibility from virt-p2v's configuration interface. Namely, only allow the user to choose betwen (a) copying the physical CPU topology, *fully populated*, (b) specifying the core count N for a "1 socket * N cores/socket * 1 thread/core" topology. Option (a) eliminates the "partially onlined physical CPU topology" scenario; it produces a topology that exists in the physical world, and does not require p2v to maintain topology information more expressively than it currently does. Option (b) eliminates any conflicts between the physical (or any user-configured) sockets:cores:threads hierarchy and a manually set logical processor count. Option (b) produces the kind of flat CPU topology that QEMU has defaulted to since version 6.2, with a simple "-smp" option. Option (b) preserves the simple p2v user interface for entering only a logical processor count. In this patch: - delete the (sockets, cores, threads) triplet from the p2v config structure, moving it out to a stand-alone structure, - introduce a new boolean config element (taking priority above the existent flat VCPU count) for copying -- densely populated -- the physical CPU topology, - always render the full topology into the domain XML, handling both options (a) and (b) -- for option (a), invoke "lscpu" right when the physical topology is needed, - make option (b) the default (sticking with current practice), - do not expose option (a) on the GUI yet. (An interesting question is whether post-patch, in GUI mode, where the conversion runs in a separate thread, it is safe to fork a child process for "lscpu" from said thread, with popen(): start_conversion_thread() -> start_conversion() -> generate_physical_xml() -> get_cpu_topology() -> get_lscpu() -> popen(). This concern arises because in the child process, only the one thread returning from fork() exists, thus if any locks are held by other threads in the parent process, those will never be released in the child process. But such a fork() should be safe: existent code already starts e.g. nbdkit on the same call path, namely start_conversion_thread() -> start_conversion() -> start_nbd_server() -> start_nbdkit() -> fork().) Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1590721 Signed-off-by: Laszlo Ersek <lersek at redhat.com> --- generate-p2v-config.pl | 45 ++++++++-------- p2v.h | 6 +++ cpuid.c | 39 ++++++++------ gui.c | 4 +- main.c | 8 +-- physical-xml.c | 54 ++++++++++---------- test-virt-p2v-cmdline.sh | 5 +- 7 files changed, 88 insertions(+), 73 deletions(-) diff --git a/generate-p2v-config.pl b/generate-p2v-config.pl index 06a2e2c8b0e4..5c33f814e3e7 100755 --- a/generate-p2v-config.pl +++ b/generate-p2v-config.pl @@ -110,16 +110,19 @@ my @fields = [ ], ), ConfigString->new(name => 'guestname'), - ConfigInt->new(name => 'vcpus', value => 0), + ConfigSection->new( + name => 'vcpu', + elements => [ + ConfigBool->new(name => 'dense_topo'), + ConfigInt->new(name => 'cores', value => 0), + ], + ), ConfigUInt64->new(name => 'memory'), ConfigSection->new( name => 'cpu', elements => [ ConfigString->new(name => 'vendor'), ConfigString->new(name => 'model'), - ConfigUnsigned->new(name => 'sockets'), - ConfigUnsigned->new(name => 'cores'), - ConfigUnsigned->new(name => 'threads'), ConfigBool->new(name => 'acpi'), ConfigBool->new(name => 'apic'), ConfigBool->new(name => 'pae'), @@ -231,11 +234,21 @@ The name of the guest that is created. The default is to try to derive a name from the physical machine?s hostname (if possible) else use a randomly generated name.", ), - "p2v.vcpus" => manual_entry->new( + "p2v.vcpu.dense_topo" => manual_entry->new( + shortopt => "", # ignored for booleans + description => " +Copy the physical machine's CPU topology, densely populated, to the +guest. Disabled by default. If disabled, the C<p2v.vcpu.cores> setting +takes effect.", + ), + "p2v.vcpu.cores" => manual_entry->new( shortopt => "N", description => " -The number of virtual CPUs to give to the guest. The default is to -use the same as the number of physical CPUs.", +This setting is ignored if C<p2v.vcpu.dense_topo> is enabled. +Otherwise, it specifies the flat number of vCPU cores to give to the +guest (placing all of those cores into a single socket, and exposing one +thread per core). The default value is the number of online logical +processors on the physical machine.", ), "p2v.memory" => manual_entry->new( shortopt => "n(M|G)", @@ -258,24 +271,6 @@ the same CPU vendor as the physical machine.", description => " The vCPU model, eg. \"IvyBridge\". The default is to use the same CPU model as the physical machine.", - ), - "p2v.cpu.sockets" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU sockets to use. The default is to use the same as the -physical machine.", - ), - "p2v.cpu.cores" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU cores to use. The default is to use the same as the -physical machine.", - ), - "p2v.cpu.threads" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU hyperthreads to use. The default is to use the same -as the physical machine.", ), "p2v.cpu.acpi" => manual_entry->new( shortopt => "", # ignored for booleans diff --git a/p2v.h b/p2v.h index 20c4ac7e506a..32dc55d94de5 100644 --- a/p2v.h +++ b/p2v.h @@ -60,6 +60,12 @@ extern int feature_colours_option; extern int force_colour; /* cpuid.c */ +struct cpu_topo { + unsigned sockets; + unsigned cores; + unsigned threads; +}; +extern void get_cpu_topology (struct cpu_topo *topo); extern void get_cpu_config (struct cpu_config *); /* rtc.c */ diff --git a/cpuid.c b/cpuid.c index 84603bbb3374..31a1720dc23c 100644 --- a/cpuid.c +++ b/cpuid.c @@ -154,22 +154,32 @@ get_vendor (char **lscpu, struct cpu_config *cpu) } /** - * Read the CPU topology from lscpu output. + * Read the CPU topology from a separate lscpu invocation. */ -static void -get_topology (char **lscpu, struct cpu_config *cpu) +void +get_cpu_topology (struct cpu_topo *topo) { - const char *v; - - v = get_field (lscpu, "Socket(s)"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->sockets)); - v = get_field (lscpu, "Core(s) per socket"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->cores)); - v = get_field (lscpu, "Thread(s) per core"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->threads)); + + CLEANUP_FREE_STRING_LIST char **lscpu = NULL; + + lscpu = get_lscpu (); + + if (lscpu != NULL) { + const char *sockets, *cores, *threads; + + sockets = get_field (lscpu, "Socket(s)"); + cores = get_field (lscpu, "Core(s) per socket"); + threads = get_field (lscpu, "Thread(s) per core"); + if (sockets && cores && threads) { + ignore_value (sscanf (sockets, "%u", &topo->sockets)); + ignore_value (sscanf (cores, "%u", &topo->cores)); + ignore_value (sscanf (threads, "%u", &topo->threads)); + return; + } + } + topo->sockets = 1; + topo->cores = 1; + topo->threads = 1; } /** @@ -211,7 +221,6 @@ get_cpu_config (struct cpu_config *cpu) lscpu = get_lscpu (); if (lscpu != NULL) { get_vendor (lscpu, cpu); - get_topology (lscpu, cpu); get_flags (lscpu, cpu); } diff --git a/gui.c b/gui.c index 5c4f1343095a..b4a9fed18410 100644 --- a/gui.c +++ b/gui.c @@ -767,7 +767,7 @@ create_conversion_dialog (struct config *config) set_alignment (vcpus_label, 1., 0.5); vcpus_entry = gtk_entry_new (); gtk_label_set_mnemonic_widget (GTK_LABEL (vcpus_label), vcpus_entry); - snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpus); + snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpu.cores); gtk_entry_set_text (GTK_ENTRY (vcpus_entry), vcpus_str); table_attach (target_tbl, vcpus_entry, 1, 2, 1, 2, GTK_FILL, GTK_FILL, 1, 1); @@ -2010,7 +2010,7 @@ start_conversion_clicked (GtkWidget *w, gpointer data) return; } - config->vcpus = get_vcpus_from_conv_dlg (); + config->vcpu.cores = get_vcpus_from_conv_dlg (); config->memory = get_memory_from_conv_dlg (); /* Get the list of disks to be converted. */ diff --git a/main.c b/main.c index 0ebb7291c7ce..6ac94fcb5f8e 100644 --- a/main.c +++ b/main.c @@ -291,16 +291,18 @@ set_config_defaults (struct config *config) } config->guestname = strdup (hostname); + config->vcpu.dense_topo = false; + /* Defaults for #vcpus and memory are taken from the physical machine. */ i = sysconf (_SC_NPROCESSORS_ONLN); if (i == -1) { perror ("sysconf: _SC_NPROCESSORS_ONLN"); - config->vcpus = 1; + config->vcpu.cores = 1; } else if (i == 0) - config->vcpus = 1; + config->vcpu.cores = 1; else - config->vcpus = i; + config->vcpu.cores = i; i = sysconf (_SC_PHYS_PAGES); if (i == -1) { diff --git a/physical-xml.c b/physical-xml.c index 4e830ea7ee0c..c27869de1c35 100644 --- a/physical-xml.c +++ b/physical-xml.c @@ -62,6 +62,7 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, uint64_t memkb; CLEANUP_XMLFREETEXTWRITER xmlTextWriterPtr xo = NULL; size_t i; + struct cpu_topo topo; xo = xmlNewTextWriterFilename (filename, 0); if (xo == NULL) @@ -104,34 +105,35 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, string_format ("%" PRIu64, memkb); } end_element (); - single_element_format ("vcpu", "%d", config->vcpus); - - if (config->cpu.vendor || config->cpu.model || - config->cpu.sockets || config->cpu.cores || config->cpu.threads) { - /* https://libvirt.org/formatdomain.html#elementsCPU */ - start_element ("cpu") { - attribute ("match", "minimum"); - if (config->cpu.vendor) - single_element ("vendor", config->cpu.vendor); - if (config->cpu.model) { - start_element ("model") { - attribute ("fallback", "allow"); - string (config->cpu.model); - } end_element (); - } - if (config->cpu.sockets || config->cpu.cores || config->cpu.threads) { - start_element ("topology") { - if (config->cpu.sockets) - attribute_format ("sockets", "%u", config->cpu.sockets); - if (config->cpu.cores) - attribute_format ("cores", "%u", config->cpu.cores); - if (config->cpu.threads) - attribute_format ("threads", "%u", config->cpu.threads); - } end_element (); - } - } end_element (); + if (config->vcpu.dense_topo) + get_cpu_topology (&topo); + else { + topo.sockets = 1; + topo.cores = config->vcpu.cores; + topo.threads = 1; } + single_element_format ("vcpu", "%u", + topo.sockets * topo.cores * topo.threads); + + /* https://libvirt.org/formatdomain.html#elementsCPU */ + start_element ("cpu") { + attribute ("match", "minimum"); + if (config->cpu.vendor) + single_element ("vendor", config->cpu.vendor); + if (config->cpu.model) { + start_element ("model") { + attribute ("fallback", "allow"); + string (config->cpu.model); + } end_element (); + } + start_element ("topology") { + attribute_format ("sockets", "%u", topo.sockets); + attribute_format ("cores", "%u", topo.cores); + attribute_format ("threads", "%u", topo.threads); + } end_element (); + } end_element (); + switch (config->rtc.basis) { case BASIS_UNKNOWN: /* Don't emit any <clock> element. */ diff --git a/test-virt-p2v-cmdline.sh b/test-virt-p2v-cmdline.sh index cd76044195a3..37dc1662a593 100755 --- a/test-virt-p2v-cmdline.sh +++ b/test-virt-p2v-cmdline.sh @@ -26,7 +26,7 @@ out=test-virt-p2v-cmdline.out rm -f $out # The Linux kernel command line. -$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpus=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out +$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpu.cores=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out # For debugging purposes. cat $out @@ -37,7 +37,8 @@ grep "^remote\.port.*123" $out grep "^auth\.username.*user" $out grep "^auth\.sudo.*false" $out grep "^guestname.*test" $out -grep "^vcpus.*4" $out +grep "^vcpu.dense_topo.*false" $out +grep "^vcpu.cores.*4" $out grep "^memory.*"$((1024*1024*1024)) $out grep "^disks.*sda sdb sdc" $out grep "^removable.*sdd" $out
Richard W.M. Jones
2022-Sep-08 07:52 UTC
[Libguestfs] [p2v PATCH 2/6] restrict vCPU topology to (a) fully populated physical, or (b) 1 * N * 1
On Mon, Sep 05, 2022 at 01:25:27PM +0200, Laszlo Ersek wrote:> This concern arises because in the child process, only the one thread > returning from fork() exists, thus if any locks are held by other threads > in the parent process, those will never be released in the child process. > > But such a fork() should be safe: existent code already starts e.g. nbdkit > on the same call path, namely start_conversion_thread() -> > start_conversion() -> start_nbd_server() -> start_nbdkit() -> fork().)Yes, I think this ought to be safe. The thread model of virt-p2v is relatively simple.> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1590721 > Signed-off-by: Laszlo Ersek <lersek at redhat.com> > --- > generate-p2v-config.pl | 45 ++++++++-------- > p2v.h | 6 +++ > cpuid.c | 39 ++++++++------ > gui.c | 4 +- > main.c | 8 +-- > physical-xml.c | 54 ++++++++++---------- > test-virt-p2v-cmdline.sh | 5 +- > 7 files changed, 88 insertions(+), 73 deletions(-) > > diff --git a/generate-p2v-config.pl b/generate-p2v-config.pl > index 06a2e2c8b0e4..5c33f814e3e7 100755 > --- a/generate-p2v-config.pl > +++ b/generate-p2v-config.pl > @@ -110,16 +110,19 @@ my @fields = [ > ], > ), > ConfigString->new(name => 'guestname'), > - ConfigInt->new(name => 'vcpus', value => 0), > + ConfigSection->new( > + name => 'vcpu', > + elements => [ > + ConfigBool->new(name => 'dense_topo'), > + ConfigInt->new(name => 'cores', value => 0), > + ], > + ), > ConfigUInt64->new(name => 'memory'), > ConfigSection->new( > name => 'cpu', > elements => [ > ConfigString->new(name => 'vendor'), > ConfigString->new(name => 'model'), > - ConfigUnsigned->new(name => 'sockets'), > - ConfigUnsigned->new(name => 'cores'), > - ConfigUnsigned->new(name => 'threads'), > ConfigBool->new(name => 'acpi'), > ConfigBool->new(name => 'apic'), > ConfigBool->new(name => 'pae'), > @@ -231,11 +234,21 @@ The name of the guest that is created. The default is to try to > derive a name from the physical machine?s hostname (if possible) else > use a randomly generated name.", > ), > - "p2v.vcpus" => manual_entry->new( > + "p2v.vcpu.dense_topo" => manual_entry->new( > + shortopt => "", # ignored for booleans > + description => " > +Copy the physical machine's CPU topology, densely populated, to the > +guest. Disabled by default. If disabled, the C<p2v.vcpu.cores> setting > +takes effect.", > + ), > + "p2v.vcpu.cores" => manual_entry->new( > shortopt => "N", > description => " > -The number of virtual CPUs to give to the guest. The default is to > -use the same as the number of physical CPUs.", > +This setting is ignored if C<p2v.vcpu.dense_topo> is enabled. > +Otherwise, it specifies the flat number of vCPU cores to give to the > +guest (placing all of those cores into a single socket, and exposing one > +thread per core). The default value is the number of online logical > +processors on the physical machine.", > ), > "p2v.memory" => manual_entry->new( > shortopt => "n(M|G)", > @@ -258,24 +271,6 @@ the same CPU vendor as the physical machine.", > description => " > The vCPU model, eg. \"IvyBridge\". The default is to use the same > CPU model as the physical machine.", > - ), > - "p2v.cpu.sockets" => manual_entry->new( > - shortopt => "N", > - description => " > -Number of vCPU sockets to use. The default is to use the same as the > -physical machine.", > - ), > - "p2v.cpu.cores" => manual_entry->new( > - shortopt => "N", > - description => " > -Number of vCPU cores to use. The default is to use the same as the > -physical machine.", > - ), > - "p2v.cpu.threads" => manual_entry->new( > - shortopt => "N", > - description => " > -Number of vCPU hyperthreads to use. The default is to use the same > -as the physical machine.", > ), > "p2v.cpu.acpi" => manual_entry->new( > shortopt => "", # ignored for booleans > diff --git a/p2v.h b/p2v.h > index 20c4ac7e506a..32dc55d94de5 100644 > --- a/p2v.h > +++ b/p2v.h > @@ -60,6 +60,12 @@ extern int feature_colours_option; > extern int force_colour; > > /* cpuid.c */ > +struct cpu_topo { > + unsigned sockets; > + unsigned cores; > + unsigned threads; > +}; > +extern void get_cpu_topology (struct cpu_topo *topo); > extern void get_cpu_config (struct cpu_config *); > > /* rtc.c */ > diff --git a/cpuid.c b/cpuid.c > index 84603bbb3374..31a1720dc23c 100644 > --- a/cpuid.c > +++ b/cpuid.c > @@ -154,22 +154,32 @@ get_vendor (char **lscpu, struct cpu_config *cpu) > } > > /** > - * Read the CPU topology from lscpu output. > + * Read the CPU topology from a separate lscpu invocation. > */ > -static void > -get_topology (char **lscpu, struct cpu_config *cpu) > +void > +get_cpu_topology (struct cpu_topo *topo) > { > - const char *v; > - > - v = get_field (lscpu, "Socket(s)"); > - if (v) > - ignore_value (sscanf (v, "%u", &cpu->sockets)); > - v = get_field (lscpu, "Core(s) per socket"); > - if (v) > - ignore_value (sscanf (v, "%u", &cpu->cores)); > - v = get_field (lscpu, "Thread(s) per core"); > - if (v) > - ignore_value (sscanf (v, "%u", &cpu->threads)); > + > + CLEANUP_FREE_STRING_LIST char **lscpu = NULL; > + > + lscpu = get_lscpu (); > + > + if (lscpu != NULL) { > + const char *sockets, *cores, *threads; > + > + sockets = get_field (lscpu, "Socket(s)"); > + cores = get_field (lscpu, "Core(s) per socket"); > + threads = get_field (lscpu, "Thread(s) per core"); > + if (sockets && cores && threads) { > + ignore_value (sscanf (sockets, "%u", &topo->sockets)); > + ignore_value (sscanf (cores, "%u", &topo->cores)); > + ignore_value (sscanf (threads, "%u", &topo->threads)); > + return; > + } > + } > + topo->sockets = 1; > + topo->cores = 1; > + topo->threads = 1; > } > > /** > @@ -211,7 +221,6 @@ get_cpu_config (struct cpu_config *cpu) > lscpu = get_lscpu (); > if (lscpu != NULL) { > get_vendor (lscpu, cpu); > - get_topology (lscpu, cpu); > get_flags (lscpu, cpu); > } > > diff --git a/gui.c b/gui.c > index 5c4f1343095a..b4a9fed18410 100644 > --- a/gui.c > +++ b/gui.c > @@ -767,7 +767,7 @@ create_conversion_dialog (struct config *config) > set_alignment (vcpus_label, 1., 0.5); > vcpus_entry = gtk_entry_new (); > gtk_label_set_mnemonic_widget (GTK_LABEL (vcpus_label), vcpus_entry); > - snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpus); > + snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpu.cores); > gtk_entry_set_text (GTK_ENTRY (vcpus_entry), vcpus_str); > table_attach (target_tbl, vcpus_entry, > 1, 2, 1, 2, GTK_FILL, GTK_FILL, 1, 1); > @@ -2010,7 +2010,7 @@ start_conversion_clicked (GtkWidget *w, gpointer data) > return; > } > > - config->vcpus = get_vcpus_from_conv_dlg (); > + config->vcpu.cores = get_vcpus_from_conv_dlg (); > config->memory = get_memory_from_conv_dlg (); > > /* Get the list of disks to be converted. */ > diff --git a/main.c b/main.c > index 0ebb7291c7ce..6ac94fcb5f8e 100644 > --- a/main.c > +++ b/main.c > @@ -291,16 +291,18 @@ set_config_defaults (struct config *config) > } > config->guestname = strdup (hostname); > > + config->vcpu.dense_topo = false; > + > /* Defaults for #vcpus and memory are taken from the physical machine. */ > i = sysconf (_SC_NPROCESSORS_ONLN); > if (i == -1) { > perror ("sysconf: _SC_NPROCESSORS_ONLN"); > - config->vcpus = 1; > + config->vcpu.cores = 1; > } > else if (i == 0) > - config->vcpus = 1; > + config->vcpu.cores = 1; > else > - config->vcpus = i; > + config->vcpu.cores = i; > > i = sysconf (_SC_PHYS_PAGES); > if (i == -1) { > diff --git a/physical-xml.c b/physical-xml.c > index 4e830ea7ee0c..c27869de1c35 100644 > --- a/physical-xml.c > +++ b/physical-xml.c > @@ -62,6 +62,7 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, > uint64_t memkb; > CLEANUP_XMLFREETEXTWRITER xmlTextWriterPtr xo = NULL; > size_t i; > + struct cpu_topo topo; > > xo = xmlNewTextWriterFilename (filename, 0); > if (xo == NULL) > @@ -104,34 +105,35 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, > string_format ("%" PRIu64, memkb); > } end_element (); > > - single_element_format ("vcpu", "%d", config->vcpus); > - > - if (config->cpu.vendor || config->cpu.model || > - config->cpu.sockets || config->cpu.cores || config->cpu.threads) { > - /* https://libvirt.org/formatdomain.html#elementsCPU */ > - start_element ("cpu") { > - attribute ("match", "minimum"); > - if (config->cpu.vendor) > - single_element ("vendor", config->cpu.vendor); > - if (config->cpu.model) { > - start_element ("model") { > - attribute ("fallback", "allow"); > - string (config->cpu.model); > - } end_element (); > - } > - if (config->cpu.sockets || config->cpu.cores || config->cpu.threads) { > - start_element ("topology") { > - if (config->cpu.sockets) > - attribute_format ("sockets", "%u", config->cpu.sockets); > - if (config->cpu.cores) > - attribute_format ("cores", "%u", config->cpu.cores); > - if (config->cpu.threads) > - attribute_format ("threads", "%u", config->cpu.threads); > - } end_element (); > - } > - } end_element (); > + if (config->vcpu.dense_topo) > + get_cpu_topology (&topo); > + else { > + topo.sockets = 1; > + topo.cores = config->vcpu.cores; > + topo.threads = 1; > } > > + single_element_format ("vcpu", "%u", > + topo.sockets * topo.cores * topo.threads); > + > + /* https://libvirt.org/formatdomain.html#elementsCPU */ > + start_element ("cpu") { > + attribute ("match", "minimum"); > + if (config->cpu.vendor) > + single_element ("vendor", config->cpu.vendor); > + if (config->cpu.model) { > + start_element ("model") { > + attribute ("fallback", "allow"); > + string (config->cpu.model); > + } end_element (); > + } > + start_element ("topology") { > + attribute_format ("sockets", "%u", topo.sockets); > + attribute_format ("cores", "%u", topo.cores); > + attribute_format ("threads", "%u", topo.threads); > + } end_element (); > + } end_element (); > + > switch (config->rtc.basis) { > case BASIS_UNKNOWN: > /* Don't emit any <clock> element. */ > diff --git a/test-virt-p2v-cmdline.sh b/test-virt-p2v-cmdline.sh > index cd76044195a3..37dc1662a593 100755 > --- a/test-virt-p2v-cmdline.sh > +++ b/test-virt-p2v-cmdline.sh > @@ -26,7 +26,7 @@ out=test-virt-p2v-cmdline.out > rm -f $out > > # The Linux kernel command line. > -$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpus=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out > +$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpu.cores=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out > > # For debugging purposes. > cat $out > @@ -37,7 +37,8 @@ grep "^remote\.port.*123" $out > grep "^auth\.username.*user" $out > grep "^auth\.sudo.*false" $out > grep "^guestname.*test" $out > -grep "^vcpus.*4" $out > +grep "^vcpu.dense_topo.*false" $out > +grep "^vcpu.cores.*4" $out > grep "^memory.*"$((1024*1024*1024)) $out > grep "^disks.*sda sdb sdc" $out > grep "^removable.*sdd" $outReviewed-by: Richard W.M. Jones <rjones at redhat.com> Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Richard W.M. Jones
2022-Sep-08 08:03 UTC
[Libguestfs] [p2v PATCH 2/6] restrict vCPU topology to (a) fully populated physical, or (b) 1 * N * 1
On Mon, Sep 05, 2022 at 01:25:27PM +0200, Laszlo Ersek wrote:> + "p2v.vcpu.dense_topo" => manual_entry->new( > + shortopt => "", # ignored for booleans > + description => " > +Copy the physical machine's CPU topology, densely populated, to the > +guest. Disabled by default. If disabled, the C<p2v.vcpu.cores> setting > +takes effect.",I just realised I'm not completely sure what "densely populated" means here. I think we should have a bit more explanation. How about something like: "p2v.vcpu.dense_topo" => manual_entry->new( shortopt => "", # ignored for booleans description => " Copy the physical machine's complete CPU topology (sockets, cores and threads) to the guest. Disabled by default. If disabled, the C<p2v.vcpu.cores> setting takes effect.", (Which might also imply that we rename this something like "complete_topo" or "full_topo" but I'll leave that to you.) Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Daniel P. Berrangé
2022-Sep-08 14:37 UTC
[Libguestfs] [p2v PATCH 2/6] restrict vCPU topology to (a) fully populated physical, or (b) 1 * N * 1
On Mon, Sep 05, 2022 at 01:25:27PM +0200, Laszlo Ersek wrote:> Currently, the CPU topology for the converted domain is determined as > follows: > > (1) main() [main.c] > > (2) set_config_defaults() [main.c] > vcpus <-- sysconf (_SC_NPROCESSORS_ONLN) > get_cpu_config() [cpuid.c] > get_topology() [cpuid.c] > sockets <-- lscpu (physical) > cores <-- lscpu (physical) > threads <-- lscpu (physical) > > (3) update_config_from_kernel_cmdline() [kernel-config.c] > vcpus <-- kernel cmdline > sockets <-- kernel cmdline > cores <-- kernel cmdline > threads <-- kernel cmdline > > (4) opt#1: kernel_conversion() [kernel.c] > no change to topology info > opt#2: gui_conversion() [gui.c] > vcpus <-- GUI > > (5) ... > start_conversion() [conversion.c] > generate_physical_xml() [physical-xml.c] > format vcpus, sockets, cores, threads > > In (2), we initialize the topology information from the physical machine. > In (3), we override each property in isolation from the kernel command > line (this is done for both kernel conversion and GUI conversion; in the > former case, it is the only way for the user to influence each element of > the topology). In (4), in case we picked GUI conversion, the VCPU number > can be overridden yet another time on the GUI. In (5), we format the > topology information into the domain XML. > > The problem is that it's easy to create inconsistent topologies that > libvirt complains about. One example is that in step (4) on the GUI, if > the flat VCPU count is reduced by the user, it can result in sockets > partially populated with cores or threads, or in cores partially populated > with threads. Another example (even if the user does not touch the VCPU > count) is a partially onlined CPU topology on the physical machine: > _SC_NPROCESSORS_ONLN does not reflect any topology information, and if the > flat number it returns is less than the (sockets * cores * threads) > product from "lscpu", then the online logical processors will be > "compressed to the left" by libvirt just the same as after the manual VCPU > count reduction. > > An over-complicated way to fix this would be the following: > > - parse the output of "lscpu -p" (which is available since RHEL6 -- > "lscpu" is altogether missing in RHEL5), retrieving online-ness combined > with full topology information > > - expose complete topology info on the GUI > > - format the complete topology information for libvirt. > > A simpler way (which is what this patch series chooses) is to remove some > flexibility from virt-p2v's configuration interface. Namely, only allow > the user to choose betwen (a) copying the physical CPU topology, *fully > populated*, (b) specifying the core count N for a "1 socket * N > cores/socket * 1 thread/core" topology. > > Option (a) eliminates the "partially onlined physical CPU topology" > scenario; it produces a topology that exists in the physical world, and > does not require p2v to maintain topology information more expressively > than it currently does. > > Option (b) eliminates any conflicts between the physical (or any > user-configured) sockets:cores:threads hierarchy and a manually set > logical processor count. Option (b) produces the kind of flat CPU topology > that QEMU has defaulted to since version 6.2, with a simple "-smp" option. > Option (b) preserves the simple p2v user interface for entering only a > logical processor count.I would further say that option (b) is the *only* one that makes sense from a performance POV, if you are *not* doing 1:1 pCPU:vCPU pinning. The guest kernel scheduler will take different placement decisions based on the topology, and if guest CPUIs are floating across arbitrary host CPUs, those scheduler decisions are meaningless and potentially even harmful to performance. IOW, I'd say option (b) should be the default in the absence of any explicit override from the user. When libvirt integrates support for core scheduling, then it becomes more reasonable to consider setting different topologies even when not pinning. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|