Laszlo Ersek
2022-May-25 15:07 UTC
[Libguestfs] [PATCH] always 'max' for the appliance CPU model on all targes except ppc
+ Drew & Peter On 05/25/22 15:30, Daniel P. Berrang? wrote:> The current logic for selecting CPU model to use with the appliance > selects 'max' for all architectures except for ppc64le and aarch64. > > On aarch64, it selects 'host' if KVM is available which is identical > to 'max'. For TCG it selects 'cortex-a57'. The 'max' model is a > superset of 'cortex-a57', so it is reasonable to use 'max' for TCG > too. > > On ppc64, it selects no CPU model due to a historical bug where > using 'host' would break ability to fallback to TCG. Unfortunately > we can't use 'max' on PPC as this is actually an old G4 vintage > 32-bit model, rather than a synonym for 'host' / all-TCG-features > as on other targets. > > We can at least simplify the code to use 'max' in all scenarios > for appliance CPU model, and simply skip a CPU model for PPC. > > Signed-off-by: Daniel P. Berrang? <berrange at redhat.com> > --- > docs/C_SOURCE_FILES | 1 - > lib/Makefile.am | 1 - > lib/appliance-cpu.c | 93 ------------------------------------------ > lib/guestfs-internal.h | 3 -- > lib/launch-direct.c | 25 +++++------- > lib/launch-libvirt.c | 40 ++++++++---------- > po/POTFILES | 1 - > 7 files changed, 27 insertions(+), 137 deletions(-) > delete mode 100644 lib/appliance-cpu.c- The patch seems to do what it says in the commit message. - QEMU commit bab52d4bba3f ("target/arm: Add "-cpu max" support", 2018-03-09) confirms what the commit message says, about both TCG and KVM. - To smoke-test the TCG-related change, I've edited a long-term TCG aarch64 libvirt domain of mine, replacing "cortex-a57" with "max". Both edk2 and the Linux guest continued working. So I guess the TCG change is OK. - In additional support of the above, QEMU commit ddaebdda53fc ("target/arm: Unindent unnecessary else-clause", 2022-02-21) added a comment saying /* '-cpu max' for TCG: we currently do this as "A57 with extra things" */ - Although I was more surprised by the TCG-related statement initially (i.e. that "max" was a superset of "cortex-a57" when using TCG), now I'm actually more concerned about the KVM case. Specifically QEMU commit 0baa21be497d ("target/arm: Make KVM -cpu max exactly like -cpu host", 2022-02-21) eliminated a difference where "-cpu max" had been a superset of "-cpu host", featuring the "sve-max-vq" extra property. The fix is part of release v7.0.0. The difference was introduced in commits [1] 6fa8a37949d9 ("target/arm/cpu64: max cpu: Support sve properties with KVM", 2019-11-01) [2] 87014c6b3660 ("target/arm/kvm: host cpu: Add support for sve<N> properties", 2019-11-01) and apparently *deliberately*. Commit [1] brought a number of "sve" properties to "-cpu max" on KVM, including "sve-max-vq". Commit [2] factored a *subset* of those properties out to aarch64_add_sve_properties(), and intentionally enabled only that subset for "-cpu host" on KVM. Commits [1] and [2] were part of release v4.2.0. Therefore it seems that starting with qemu-4.2, but strictly preceding qemu-7.0, "-cpu max" and "-cpu host" are not "identical" when KVM is enabled; "-cpu max" has more features. Because of that, I think there are two options: (a) This extra feature is actually harmless, so we should only update the commit message (i.e., generally speaking, "-cpu max" has been a superset of "-cpu host" on KVM and of "-cpu cortex-a57" on TCG). (b) The feature actually presents a problem, and qemu in [v4.2.0, v7.0.0) will not start when KVM accel and "-cpu max" are requested simultaneously. In this case, I think the appliance needs to stick with "-cpu host" on KVM. I *think* that case (b) applies. Here's why: - QEMU commit 87014c6b3660 says "sve-max-vq [...] is difficult to use with KVM", and - commit 0baa21be497d does not eliminate the difference by *adding* "sve-max-vq" to "-cpu host"; the difference is eliminated by *removing* "sve-max-vq" from "-cpu max" on KVM! Thanks, Laszlo> > diff --git a/docs/C_SOURCE_FILES b/docs/C_SOURCE_FILES > index a367c93a0..a26487d99 100644 > --- a/docs/C_SOURCE_FILES > +++ b/docs/C_SOURCE_FILES > @@ -283,7 +283,6 @@ lib/actions-6.c > lib/actions-support.c > lib/actions-variants.c > lib/alloc.c > -lib/appliance-cpu.c > lib/appliance-kcmdline.c > lib/appliance-uefi.c > lib/appliance.c > diff --git a/lib/Makefile.am b/lib/Makefile.am > index 212bcb94a..40a4c9ac3 100644 > --- a/lib/Makefile.am > +++ b/lib/Makefile.am > @@ -71,7 +71,6 @@ libguestfs_la_SOURCES = \ > actions-variants.c \ > alloc.c \ > appliance.c \ > - appliance-cpu.c \ > appliance-kcmdline.c \ > appliance-uefi.c \ > available.c \ > diff --git a/lib/appliance-cpu.c b/lib/appliance-cpu.c > deleted file mode 100644 > index 54ac6e2e3..000000000 > --- a/lib/appliance-cpu.c > +++ /dev/null > @@ -1,93 +0,0 @@ > -/* libguestfs > - * Copyright (C) 2009-2020 Red Hat Inc. > - * > - * This library is free software; you can redistribute it and/or > - * modify it under the terms of the GNU Lesser General Public > - * License as published by the Free Software Foundation; either > - * version 2 of the License, or (at your option) any later version. > - * > - * This library is distributed in the hope that it will be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > - * Lesser General Public License for more details. > - * > - * You should have received a copy of the GNU Lesser General Public > - * License along with this library; if not, write to the Free Software > - * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA > - */ > - > -/** > - * The appliance choice of CPU model. > - */ > - > -#include <config.h> > - > -#include <stdio.h> > -#include <stdlib.h> > - > -#include "guestfs.h" > -#include "guestfs-internal.h" > - > -/** > - * Return the right CPU model to use as the qemu C<-cpu> parameter or > - * its equivalent in libvirt. This returns: > - * > - * =over 4 > - * > - * =item C<"host"> > - * > - * The literal string C<"host"> means use C<-cpu host>. > - * > - * =item C<"max"> > - * > - * The literal string C<"max"> means use C<-cpu max> (the best > - * possible). This requires awkward translation for libvirt. > - * > - * =item some string > - * > - * Some string such as C<"cortex-a57"> means use C<-cpu cortex-a57>. > - * > - * =item C<NULL> > - * > - * C<NULL> means no C<-cpu> option at all. Note returning C<NULL> > - * does not indicate an error. > - * > - * =back > - * > - * This is made unnecessarily hard and fragile because of two stupid > - * choices in QEMU: > - * > - * =over 4 > - * > - * =item * > - * > - * The default for C<qemu-system-aarch64 -M virt> is to emulate a > - * C<cortex-a15> (WTF?). > - * > - * =item * > - * > - * We don't know for sure if KVM will work, but C<-cpu host> is broken > - * with TCG, so we almost always pass a broken C<-cpu> flag if KVM is > - * semi-broken in any way. > - * > - * =back > - */ > -const char * > -guestfs_int_get_cpu_model (int kvm) > -{ > -#if defined(__aarch64__) > - /* With -M virt, the default -cpu is cortex-a15. Stupid. */ > - if (kvm) > - return "host"; > - else > - return "cortex-a57"; > -#elif defined(__powerpc64__) > - /* See discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1605071 */ > - return NULL; > -#else > - /* On most architectures we can use "max" to get the best possible CPU. > - * For recent qemu this should work even on TCG. > - */ > - return "max"; > -#endif > -} > diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h > index 16755cfb3..33037a26d 100644 > --- a/lib/guestfs-internal.h > +++ b/lib/guestfs-internal.h > @@ -718,9 +718,6 @@ extern const char *guestfs_int_drive_protocol_to_string (enum drive_protocol pro > /* appliance.c */ > extern int guestfs_int_build_appliance (guestfs_h *g, char **kernel, char **initrd, char **appliance); > > -/* appliance-cpu.c */ > -const char *guestfs_int_get_cpu_model (int kvm); > - > /* appliance-kcmdline.c */ > extern char *guestfs_int_appliance_command_line (guestfs_h *g, const char *appliance, int flags); > #define APPLIANCE_COMMAND_LINE_IS_TCG 1 > diff --git a/lib/launch-direct.c b/lib/launch-direct.c > index ff0eaeb62..f41711353 100644 > --- a/lib/launch-direct.c > +++ b/lib/launch-direct.c > @@ -354,7 +354,6 @@ launch_direct (guestfs_h *g, void *datav, const char *arg) > int force_tcg; > int force_kvm; > const char *accel_val = "kvm:tcg"; > - const char *cpu_model; > CLEANUP_FREE char *append = NULL; > CLEANUP_FREE_STRING_LIST char **argv = NULL; > > @@ -517,20 +516,18 @@ launch_direct (guestfs_h *g, void *datav, const char *arg) > #endif > } end_list (); > > - cpu_model = guestfs_int_get_cpu_model (has_kvm && !force_tcg); > - if (cpu_model) { > -#if defined(__x86_64__) > - /* Temporary workaround for RHBZ#2082806 */ > - if (STREQ (cpu_model, "max")) { > - start_list ("-cpu") { > - append_list (cpu_model); > - append_list ("la57=off"); > - } end_list (); > - } > - else > + /* > + * Can't use 'max' on ppc64 since it is an old G4 model > + * Also can't use 'host' due to TCG fallback issues > + * https://bugzilla.redhat.com/show_bug.cgi?id=1605071 > + */ > +#if !defined(__powerpc64__) > +# if defined(__x86_64__) > + arg ("-cpu", "max,la57=off"); > +# else > + arg ("-cpu", "max"); > +# endif > #endif > - arg ("-cpu", cpu_model); > - } > > if (g->smp > 1) > arg_format ("-smp", "%d", g->smp); > diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c > index 03d69e027..b97c91566 100644 > --- a/lib/launch-libvirt.c > +++ b/lib/launch-libvirt.c > @@ -1161,8 +1161,6 @@ construct_libvirt_xml_cpu (guestfs_h *g, > const struct libvirt_xml_params *params, > xmlTextWriterPtr xo) > { > - const char *cpu_model; > - > start_element ("memory") { > attribute ("unit", "MiB"); > string_format ("%d", g->memsize); > @@ -1173,30 +1171,24 @@ construct_libvirt_xml_cpu (guestfs_h *g, > string_format ("%d", g->memsize); > } end_element (); > > - cpu_model = guestfs_int_get_cpu_model (params->data->is_kvm); > - if (cpu_model) { > - start_element ("cpu") { > - if (STREQ (cpu_model, "host")) { > - attribute ("mode", "host-passthrough"); > - start_element ("model") { > - attribute ("fallback", "allow"); > - } end_element (); > - } > - else if (STREQ (cpu_model, "max")) { > - /* https://bugzilla.redhat.com/show_bug.cgi?id=1935572#c11 */ > - attribute ("mode", "maximum"); > + /* > + * Can't use 'max' on ppc64 since it is an old G4 model > + * Also can't use 'host' due to TCG fallback issues > + * https://bugzilla.redhat.com/show_bug.cgi?id=1605071 > + */ > +#if !defined(__powerpc64__) > + start_element ("cpu") { > + /* https://bugzilla.redhat.com/show_bug.cgi?id=1935572#c11 */ > + attribute ("mode", "maximum"); > #if defined(__x86_64__) > - /* Temporary workaround for RHBZ#2082806 */ > - start_element ("feature") { > - attribute ("policy", "disable"); > - attribute ("name", "la57"); > - } end_element (); > -#endif > - } > - else > - single_element ("model", cpu_model); > + /* Temporary workaround for RHBZ#2082806 */ > + start_element ("feature") { > + attribute ("policy", "disable"); > + attribute ("name", "la57"); > } end_element (); > - } > +#endif > + } end_element (); > +#endif > > single_element_format ("vcpu", "%d", g->smp); > > diff --git a/po/POTFILES b/po/POTFILES > index 32b975a04..67cdffb29 100644 > --- a/po/POTFILES > +++ b/po/POTFILES > @@ -329,7 +329,6 @@ lib/actions-6.c > lib/actions-support.c > lib/actions-variants.c > lib/alloc.c > -lib/appliance-cpu.c > lib/appliance-kcmdline.c > lib/appliance-uefi.c > lib/appliance.c >
Peter Maydell
2022-May-25 16:13 UTC
[Libguestfs] [PATCH] always 'max' for the appliance CPU model on all targes except ppc
On Wed, 25 May 2022 at 16:07, Laszlo Ersek <lersek at redhat.com> wrote:> > + Drew & Peter > > On 05/25/22 15:30, Daniel P. Berrang? wrote: > - The patch seems to do what it says in the commit message. > > - QEMU commit bab52d4bba3f ("target/arm: Add "-cpu max" support", > 2018-03-09) confirms what the commit message says, about both TCG and > KVM. > > - To smoke-test the TCG-related change, I've edited a long-term TCG > aarch64 libvirt domain of mine, replacing "cortex-a57" with "max". > Both edk2 and the Linux guest continued working. So I guess the TCG > change is OK.One thing to note here is that if you are using: * TCG -cpu max * 'virt' with no named version or with 'virt-7.0' or later * a Linux kernel version prior to v5.12 then a bug in Linux means it won't boot. (This is because of the LPA2 CPU feature which TCG -cpu max now emulates; older kernels were buggy and won't boot on an LPA2 CPU, including a real hardware one.) You might or might not feel this is something worth noting in release notes or equivalent.> - In additional support of the above, QEMU commit ddaebdda53fc > ("target/arm: Unindent unnecessary else-clause", 2022-02-21) added a > comment saying > > /* '-cpu max' for TCG: we currently do this as "A57 with extra things" */ > > - Although I was more surprised by the TCG-related statement initially > (i.e. that "max" was a superset of "cortex-a57" when using TCG), now > I'm actually more concerned about the KVM case.The TCG part is an implementation convenience (mostly it just means that 'max' has the A57's IMPDEF registers).> Specifically QEMU commit 0baa21be497d ("target/arm: Make KVM -cpu max > exactly like -cpu host", 2022-02-21) eliminated a difference where > "-cpu max" had been a superset of "-cpu host", featuring the > "sve-max-vq" extra property. > > The fix is part of release v7.0.0. > > The difference was introduced in commits > > [1] 6fa8a37949d9 ("target/arm/cpu64: max cpu: Support sve properties > with KVM", 2019-11-01) > > [2] 87014c6b3660 ("target/arm/kvm: host cpu: Add support for sve<N> > properties", 2019-11-01) > > and apparently *deliberately*.No, this was unintentional. See the discussion in this thread: https://lore.kernel.org/qemu-devel/20220203173640.shxkmatdcsfzzvtj at gator/ In particular, although KVM '-cpu max' had the sve-max-vq property, Drew notes "sve-max-vq won't work for any of the machines that support SVE that I know of". Which is why we removed it, rather than adding it to '-cpu host'.> Therefore it seems that starting with qemu-4.2, but strictly preceding > qemu-7.0, "-cpu max" and "-cpu host" are not "identical" when KVM is > enabled; "-cpu max" has more features. Because of that, I think there > are two options: > > (a) This extra feature is actually harmless, so we should only update > the commit message (i.e., generally speaking, "-cpu max" has been > a superset of "-cpu host" on KVM and of "-cpu cortex-a57" on TCG). > > (b) The feature actually presents a problem, and qemu in [v4.2.0, > v7.0.0) will not start when KVM accel and "-cpu max" are requested > simultaneously. In this case, I think the appliance needs to stick > with "-cpu host" on KVM.I don't understand why you think these are the only two options. The actual situation is: (c) -cpu max and -cpu host have always been identical on KVM, and this commit does not change that. There happens to have been a QOM property 'sve-max-vq' on 'max' that should not have existed there and that nobody can actually have been usefully setting, but now there isn't. thanks -- PMM