Laszlo Ersek
2022-Jul-20 12:55 UTC
[Libguestfs] [v2v PATCH] output/create_libvirt_xml: generate @check='none' CPU attribute
On 07/20/22 14:00, Richard W.M. Jones wrote:> On Wed, Jul 20, 2022 at 01:09:13PM +0200, Laszlo Ersek wrote: >> We currently don't generate any @check attribute for the /domain/cpu >> element, which causes the following libvirtd behavior >> <https://libvirt.org/formatdomain.html#cpu-model-and-topology>: >> >>> Once the domain starts, libvirt will automatically change the check >>> attribute to the best supported value to ensure the virtual CPU does >>> not change when the domain is migrated to another host >> >> Vera Wu reports that in practice, at least when the source CPU model >> is explicitly "qemu64", libvirtd sets @check='partial'. That's >> defined as: > > I think my main confusion is why is the _source_ model "qemu64"? > Doesn't that imply the source hypervisor is qemu and so we shouldn't > use virt-v2v at all? In fact from the BZ: > > # virt-v2v -i libvirt -ic qemu:///system esx6.7-rhel8.6-nvme-disk ... > > (Vera is doing this for a good reason - she wants to test the -oo > compressed option, and using a local qemu guest is the quickest way.) > > Anyway let's go with the assumption this is a bad idea, but for > testing purposes we often do this so we should try to make it work.Indeed this is for that "-i libvirt -ic qemu://..." use case. I commented similarly to you, in <https://bugzilla.redhat.com/show_bug.cgi?id=2047660#c11>:> Anyway I defer to Rich on whether we need a new bug for this. I don't > think that we absolutely need to investigate this, as I'd never expect > libvirt-to-libvirt conversion to be used in production.And then Vera filed 2107503. Continuing: On 07/20/22 14:00, Richard W.M. Jones wrote:> On Wed, Jul 20, 2022 at 01:09:13PM +0200, Laszlo Ersek wrote: > >>> Libvirt will check the guest CPU specification before starting a >>> domain >> >> This is a problem: the default "qemu64" CPU model exposes the SVM CPU >> flag, and that's unsupportable on Intel hosts. SVM is the AMD >> counterpart of VT-x; IOW, the flag effectively advertizes >> AMD-specific nesting to guests. >> >> With @check='partial', libvirt prevents the converted domain from >> starting on Intel hosts; but with @check='none', >> >>> Libvirt does no checking and it is up to the hypervisor to refuse to >>> start the domain if it cannot provide the requested CPU. With QEMU >>> this means no checking is done at all since the default behavior of >>> QEMU is to emit warnings, but start the domain anyway. >> >> We don't care about the migratability of the converted domain, so >> relax the libvirtd checks, by generating the @check='none' attribute. > > For testing we don't, but in general -o libvirt conversions we might > care about migratability.Without @check='none', we may not even be able to start the converted domain in-place (that is, on the libvirt host that the domain was first converted to -- even before we attempt migration). See below.> >> Consider adding @check='none' in two cases: >> >> (1) When the source domain specifies a CPU model. >> >> Generating @check='none' in this case fixes the issue reported by >> Vera. >> >> (2) When the source domain does not specify a CPU model, and the >> guest OS is assumed to work well with the default CPU model. >> >> Generating @check='none' in this case is actually a no-op. Going from >> "no CPU element" to "<cpu check='none'/>" does not change how >> libvirtd augments the domain config. Namely, >> >> (2.1) for x86 guests, we get >> >> <cpu mode='custom' match='exact' check='none'> >> <model fallback='forbid'>qemu64</model> >> </cpu> >> >> either way, >> >> (2.2) for aarch64 guests, we get >> >> <cpu mode='custom' match='exact' check='none'> >> <model fallback='forbid'>cortex-a15</model> >> </cpu> >> >> either way. >> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2107503 >> Signed-off-by: Laszlo Ersek <lersek at redhat.com> >> --- >> output/create_libvirt_xml.ml | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/output/create_libvirt_xml.ml b/output/create_libvirt_xml.ml >> index 531a4f75bf3e..0343d3194268 100644 >> --- a/output/create_libvirt_xml.ml >> +++ b/output/create_libvirt_xml.ml >> @@ -192,6 +192,7 @@ let create_libvirt_xml ?pool source inspect >> List.push_back cpu_attrs ("mode", "host-passthrough"); >> | Some model -> >> List.push_back cpu_attrs ("match", "minimum"); >> + List.push_back cpu_attrs ("check", "none"); > > So should we make this attribute conditional on the source CPU model > being qemu64, just so the change is minimal and we don't break > migratability for non testcases?In my opinion, having SVM in qemu64 is actually a bug in the qemu64 CPU model. Qemu64 is supposed to be the most compatible ("runs everywhere") CPU model, per <https://gitlab.com/qemu-project/qemu/-/blob/master/docs/system/cpu-models-x86.rst.inc#L330>. So why does it include a CPU feature that's impossible to satisfy with KVM on an Intel host? Libvirt is totally right to complain about that. With this patch, *at first sight*, we're working around the qemu64 CPU model breakage, by asking libvirt to ignore its own (otherwise justified) checks. However... assume the source virt host is a genuine AMD one, and the source VCPU model is (say) EPYC <https://gitlab.com/qemu-project/qemu/-/blob/master/docs/system/cpu-models-x86.rst.inc#L237>. Then the "svm" CPU feature (and some other AMD-only CPU flags) will show up just the same, and the libvirtd-side checks will prevent the converted domain from launching on an Intel host (with KVM) just the same. From all perspectives *except* virt-v2v's, this is arguably the correct behavior. But from virt-v2v's perspective, it is *still* wrong. We arguably don't care if we need to tweak the CPU model "incompatibly", we just want the converted guest to boot *somehow*. Now, one could argue that "-ic libvirtURI" and "-o libvirt" only support local libvirt connections, so it's impossible to have an AMD "source host" and an Intel "target host", when those hosts are actually the same. In other words, if the host were Intel, then EPYC would not be bootable in the first place (and indeed "qemu64" is only bootable in the first place because the original domain does include @check='none'). However, we also have the "-i libvirtxml" input mode, where the original domain XML may come from anywhere. In short, I don't know what to do about this. Modeling the *entire* <cpu> element in guestcaps (for the sake of transferring @check='none' along with the qemu64 model) would be totally overkill; this situation is never encountered in production where we convert mostly from vmware, i.e., where neither "qemu64" or "@check='none'" are relevant. I can spin a version 2 of this patch, restricting the hack to the qemu64 VCPU model, but then the commit message can only say, "this is yet another band-aid, namely for the *very specific* test environment -i libvirt -ic qemu://... -o libvirt". We could be best off with just producing "host-passthrough" indiscriminately in the libvirt output module (and explicitly writing off migratability). ... Compare how much simpler the QEMU output is: (match source.s_cpu_model, guestcaps.gcaps_default_cpu with | None, true -> () | None, false -> arg "-cpu" "host" | Some model, _ -> arg "-cpu" model ); The first two patterns match what we currently do in the libvirt output module, but the last pattern doesn't. If QEMU cannot satisfy the particular -cpu model, it will emit some warnings (see the docs quote above), but will start the domain anyway. That's where libvirt is stricter, with the default @check='partial'. Adding "@check='none'" would relax the libvirt output to the level of the QEMU output, if you will. Consider where "s_cpu_model" is used among all the output modules. It is consumed by the libvirt, qemu, and RHV (lib/create_ovf.ml) ones. QEMU does a very simple thing (see above). The OVF stuff is even simpler (this is what the QEMU logic was based upon): (match source.s_cpu_model with | None -> () | Some model -> List.push_back content_subnodes (e "CustomCpuName" [] [PCData model]) ); It's only the libvirt output that's super hairy. I think @check='none' relaxes it to the level of the other two. Laszlo> > Rich. > >> (match source.s_cpu_vendor with >> | None -> () >> | Some vendor -> >> -- >> 2.19.1.3.g30247aa5d201 >> >> _______________________________________________ >> Libguestfs mailing list >> Libguestfs at redhat.com >> https://listman.redhat.com/mailman/listinfo/libguestfs >
Richard W.M. Jones
2022-Jul-22 09:34 UTC
[Libguestfs] [v2v PATCH] output/create_libvirt_xml: generate @check='none' CPU attribute
Sorry for the delayed response to this. I see you've posted an updated patch, so this is just a bit of FYI. I originally added CPU modelling in commit 11505e4b84 (March 2017): https://github.com/libguestfs/virt-v2v/commit/11505e4b84ce8d7eda4e2a275fdcecc5f2a3288d What we were actually trying to achieve here was to preserve the CPU topology. I believe the request came from Bill Helgerson who was working on v2v in the proto-IMS product, and was working a lot with customers. You can see in the code before the patch is applied we only modelled the number of vCPUs. Afterwards we have: * number of vCPUs * vendor (eg. AMD) * model (eg. EPYC) * sockets * cores per socket * threads per core I think here only the first 1 and last 3 (#vCPUS, topology) are really important. I believe I added the vendor and model just because they were there, without necessarily thinking too deeply about the implications. As you covered in your email, what is the real meaning of converting a source guest using eg AMD/EPYC with virt-v2v to some target? Does it mean that the target must be able to emulate all EPYC feature (likely impossible if the target is Intel)? I would say it's not that important. This isn't live migration, and almost all guests can be booted interchangably on different x86_64 hardware. Is topology important? I would say yes, or at least it's much more important than vendor/model. Workloads may expect not just a number of vCPUs, but a particular layout, especially the larger and more complex ones. So ... my question now is, should we simply remove the vendor and model fields completely? I'll ACK the v2 patch anyway (in a minute). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html