Laszlo Ersek
2021-Sep-30 11:12 UTC
[Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)
(+libvirt-devel)

On 09/29/21 21:22, Richard W.M. Jones wrote:

> We currently partially install the virtio block drivers in the Windows
> guest (just enough to get the guest to boot on the target), and
> Windows itself re-installs the virtio block driver and other drivers
> it needs, and that's enough to get it to see C:
>
> As for other hard disk partitions, Windows does indeed contain a
> mapping to other drives in the Registry but IIRC it's not sensitive to
> the device driver (unlike Linux /dev/vdX vs /dev/sdX).  If you're
> interested in that, see libguestfs.git/daemon/inspect_fs_windows.ml:
> get_drive_mappings.  We never bothered with attempting to handle
> conversion of floppy drives or CD-ROMs for Windows.

OK. So AIUI, that means no work is needed here for Windows.

> On Linux we do better: We iterate over all the configuration files in
> /etc and change device paths.  The significance of this bug is we need
> to change (eg) /dev/hdc to /dev/<something>.  The difficulty is
> working out where the device will appear on the target and not having
> it conflict with any hard disk, something we partly control (see
> virt-v2v.git/convert/target_bus_assignment.ml*)

AIUI the conflict avoidance logic ("no overlapping disks") is already in
place. The question is how to translate device paths in /etc/fstab and
similar.

Please correct me if I'm wrong: at the moment, I believe virt-v2v parses
and manipulates the following elements and attributes in the domain XML:

  <target dev='hda' bus='ide'/>
  <target dev='hdb' bus='ide'/>
  <target dev='hdc' bus='ide'/>
  <target dev='hdd' bus='ide'/>
               ^^^       ^^^

My understanding is however that the target/@dev attribute is mostly
irrelevant:

  https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms

    The dev attribute indicates the "logical" device name. The actual
    device name specified is not guaranteed to map to the device name in
    the guest OS. Treat it as a device ordering hint. [...]

What actually matters is the target/@bus attribute, in combination with
the sibling element <address>. Such as:

  <target dev='hda' bus='ide'/>
                         ^^^
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
                                    ^       ^                   ^

  <target dev='hdb' bus='ide'/>
                         ^^^
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
                                    ^       ^                   ^

  <target dev='hdc' bus='ide'/>
                         ^^^
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
                                    ^       ^                   ^

  <target dev='hdd' bus='ide'/>
                         ^^^
  <address type='drive' controller='0' bus='1' target='0' unit='1'/>
                                    ^       ^                   ^

So, target/@dev should be mostly ignored; what matters is the following
tuple:

  (target/@bus, address/@controller, address/@bus, address/@unit)

Extracting just the tuples:

  (ide, 0, 0, 0)
  (ide, 0, 0, 1)
  (ide, 0, 1, 0)
  (ide, 0, 1, 1)

The first two components of each tuple -- i.e., (ide, 0) -- refer to the
following IDE controller:

  <controller type='ide' index='0'>
              ^^^^^^^^^^ ^^^^^^^^^
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
  </controller>

and then the rest of the components, such as (0, 0), (0, 1), (1, 0), (1,
1), identify the disk on that IDE controller.

(Side comment: the PCI location of the (first) IDE controller is fixed
in QEMU; if one tries to change it, libvirt complains: "Primary IDE
controller must have PCI address 0:0:1.1".)

(Side comment: on the QEMU command line, this maps to

  -device ide-cd,bus=ide.0,unit=0,... \
  -device ide-cd,bus=ide.0,unit=1,... \
  -device ide-cd,bus=ide.1,unit=0,... \
  -device ide-cd,bus=ide.1,unit=1,... \

)

Inside the guest, /dev/hd* nodes don't even exist, so it's unlikely that
/etc/fstab would refer to them. /etc/fstab can however refer to symlinks
under "/dev/disk/by-id" (for example):

  lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00001 -> ../../sr0
  lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00002 -> ../../sr1
  lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00003 -> ../../sr2
  lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00004 -> ../../sr3

Furthermore, we have pseudo-files (directories) such as:

  /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/block/sr0
  /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:1/0:0:1:0/block/sr1
  /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
  /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:1/1:0:1:0/block/sr3
                                                              ^   ^

So in order to map a device path from the original guest's "/etc/fstab",
such as "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003", to the original
domain XML's <disk> element, we have to do the following in the "source"
appliance:

  NODE=$(realpath /dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003)
  # -> /dev/sr2
  NODE=${NODE#/dev/}
  # -> sr2
  DEVPATH=$(ls -d /sys/devices/pci0000:00/0000:00:01.1/ata?/host?/target?:0:?/?:0:?:0/block/$NODE)
  # -> /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2

And then map the "1:0:0:0" pathname component from $DEVPATH to:

  <target dev='hdc' bus='ide'/>
  <address type='drive' controller='0' bus='1' target='0' unit='0'/>
                                       ^^^^^^^            ^^^^^^^^
                                       [1]:0:0:0          1:0:[0]:0

in the original domain XML. This tells us under what device node the
original guest sees the host-side file (<source> element).

After conversion, on the Q35 board, the inverse mapping is needed. We
start from the domain XML,

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='2'/>

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='3'/>

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='5'/>

  <controller type='sata' index='0'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
  </controller>

(Side comment: the PCI B/D/F of the SATA controller is also fixed in
QEMU; otherwise libvirt complains: "Primary SATA controller must have
PCI address 0:0:1f.2".)

(Side comment: the QEMU command line is

  -device ide-cd,bus=ide.0,... \
  -device ide-cd,bus=ide.1,... \
  -device ide-cd,bus=ide.2,... \
  -device ide-cd,bus=ide.3,... \
  -device ide-cd,bus=ide.4,... \
  -device ide-cd,bus=ide.5,... \

)

In the guest we have:

  /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sr0
  /sys/devices/pci0000:00/0000:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr1
  /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/block/sr2
  /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sr3
  /sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0/block/sr4
  /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0/block/sr5

So, assuming we mapped the original (i440fx)
"/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003" guest device path to some
<source> element (= host-side file) in the original domain XML, and
assuming virt-v2v assigned the same <source> element to the following
element on the Q35 board:

  <target dev='sd*' bus='sata'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
                                                          ^^^^^^^^

we find the device node in the destination appliance as follows:

  NODE=$(basename /sys/devices/pci0000:00/0000:00:1f.2/ata?/host?/target?:0:0/4:0:0:0/block/*)
                                                                              ^
                                                                              unit='4'
  # -> sr4

and then replace "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003" with
"/dev/sr4" in "/etc/fstab".

All this requires virt-v2v to parse complete <address> elements from the
original domain XML, and to generate complete <address> elements in the
destination domain XML. Is that feasible?

The @wwn and @serial attributes don't look safe to me, because the guest
could refer to them even when the original domain XML does not spell
them out (eg. "QM00003" is such a @serial). So neither can we trust that
a @serial is present in the original XML, nor can we just go ahead and
generate a @serial if it is absent. (The generation step could
immediately break references such as "QM00003" in the guest.)

/dev/disk/by-label and /dev/disk/by-uuid are based on media contents,
and multiple CD-ROMs may (read-only) map the same host-side file, so
those are not good for mapping either, I think. Also does not cover
CD-ROM devices that are empty (have no medium) at the time of
conversion, but "/etc/fstab" still refers to them (potentially with
"noauto"). So I think the only reliable ID is the hardware device path.

... Now if *that* needs to work when the original guest comes from a
different management application than libvirt, then I have no idea. The
original address (PCI B/D/F of the IDE controller, and IDE bus and unit
of the drive) need to be known somehow; otherwise we cannot associate
the guest-side "/dev/..." reference with the host-side file underlying
that CD-ROM.

Thanks,
Laszlo
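
[Editor's sketch] The two sysfs lookups walked through in the message above can be wrapped into small bash functions. This is only an illustration under stated assumptions, not virt-v2v code: the function names ide_addr_of_node and sata_node_of_unit are hypothetical, the controller's sysfs directory is passed in as a parameter (so the functions can also be run against a copy of the tree), and, like the walkthrough, the sketch assumes that the first component of the H:C:T:L directory name equals the IDE bus (or SATA unit) in the domain XML. That holds for the listings shown above but may not hold when other SCSI hosts are present in the guest.

```shell
#!/bin/bash

# Hypothetical helper (source side, i440fx): given the sysfs directory of
# the IDE controller and a block node name such as "sr2", print
# "<bus> <unit>" as used in <address type='drive' bus=... unit=.../>.
# Assumption: H (first field of H:C:T:L) is the IDE bus, T (third field)
# is the IDE unit, as in the listings quoted in the message.
ide_addr_of_node() {
    local ctrl_dir=$1 node=$2 path hctl bus unit
    for path in "$ctrl_dir"/ata*/host*/target*/*:*:*:*/block/"$node"; do
        [ -e "$path" ] || continue
        # The H:C:T:L directory sits two levels above the block node.
        hctl=$(basename "$(dirname "$(dirname "$path")")")
        bus=${hctl%%:*}      # first field
        unit=${hctl#*:*:}    # strip the first two fields ...
        unit=${unit%%:*}     # ... and keep the third
        printf '%s %s\n' "$bus" "$unit"
        return 0
    done
    return 1
}

# Hypothetical helper (destination side, Q35): given the sysfs directory
# of the SATA controller and the address/@unit value from the domain XML,
# print the block node name (for example "sr4").
# Assumption: unit N shows up as N:0:0:0, as in the listing quoted above.
sata_node_of_unit() {
    local ctrl_dir=$1 unit=$2 path
    for path in "$ctrl_dir"/ata*/host*/target*/"$unit":0:0:0/block/*; do
        [ -e "$path" ] || continue
        basename "$path"
        return 0
    done
    return 1
}

# Example calls inside the respective appliance (not executed here):
#   ide_addr_of_node  /sys/devices/pci0000:00/0000:00:01.1 sr2
#   sata_node_of_unit /sys/devices/pci0000:00/0000:00:1f.2 4
```

Both functions return non-zero when nothing matches, so a caller can leave the "/etc/fstab" entry untouched instead of guessing.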
Richard W.M. Jones
2021-Sep-30 11:53 UTC
[Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)
On Thu, Sep 30, 2021 at 01:12:39PM +0200, Laszlo Ersek wrote:

> All this requires virt-v2v to parse complete <address> elements from the
> original domain XML, and to generate complete <address> elements in the
> destination domain XML. Is that feasible?

The input is not always (in fact, hardly ever) full libvirt XML.  It's
input specific to the hypervisor.  For VMware it might be:

 - the *.vmx file (the real source of truth) (-i vmx)

 - partial libvirt XML generated by libvirt's vpx driver, but this is
   derived from information from VMware APIs and ultimately that comes
   from the *.vmx file (-i libvirt -ic esx:// or -ic vpx://)

 - the *.ovf file (-i ova)

 - nothing at all! (-i disk)

Also we don't currently try to find or rewrite /dev/disk/ paths in
guest configuration files.  The only rewriting that happens is for
/dev/[hs]d* block device filenames and a few others.  The actual code
that does this is convert/convert_linux.ml:remap_block_devices

So I wouldn't over-think this.  It's likely fine to identify such
devices and rewrite them as "/dev/cdrom", assuming that (I didn't
check) udev creates that symlink for any reasonably modern Linux.  And
if there's more than one attached CD to the source, only convert the
first one and warn about but drop the others.

Note that the aim of virt-v2v is to make it boot on the target, make
sure the network works, and get the user to a login prompt.  The MTV
management app around virt-v2v allows site-specific pre- and post-
configuration to happen (site-specific Ansible playbooks).  There also
exists ssh and remote console.  For people using virt-v2v on the
command line, if they get the guest to boot on the target and not
every device fully works, there is an expectation that they can log in
as root and fix small[*] things.

Rich.

[*] Obviously if the boot / system is completely broken, not that.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
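
[Editor's sketch] The simpler approach suggested in the message above (rewrite the first CD-ROM reference to "/dev/cdrom", warn about any others) could look roughly like the bash filter below. It is not the remap_block_devices code from convert/convert_linux.ml. The function name rewrite_cdrom_fstab is hypothetical, the caller is assumed to already know which guest device paths are CD-ROMs, and leaving the other CD-ROM lines unchanged while warning on stderr is an editorial choice, since the message does not say what should happen to those fstab entries.

```shell
#!/bin/bash

# Hypothetical filter: reads fstab-style text on stdin, writes the
# rewritten text to stdout.
#   $1      guest device path of the first CD-ROM (rewritten to /dev/cdrom)
#   $2...   guest device paths of further CD-ROMs (warned about, unchanged)
rewrite_cdrom_fstab() {
    local first=$1
    shift
    local line dev other
    while IFS= read -r line || [ -n "$line" ]; do
        # The device is the first whitespace-delimited field of the line.
        dev=${line%%[[:space:]]*}
        if [ -n "$dev" ] && [ "$dev" = "$first" ]; then
            # Swap only the device field; keep the rest of the line as-is.
            printf '%s%s\n' /dev/cdrom "${line#"$dev"}"
            continue
        fi
        for other in "$@"; do
            if [ "$dev" = "$other" ]; then
                echo "warning: $dev is an additional CD-ROM; entry left unchanged" >&2
            fi
        done
        printf '%s\n' "$line"
    done
}

# Example call (not executed here):
#   rewrite_cdrom_fstab /dev/sr0 /dev/sr1 < /etc/fstab > fstab.new
```

Whether "/dev/cdrom" exists in a given guest remains the open question raised in the message; the sketch does not check for it.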
Daniel P. Berrangé
2021-Sep-30 11:54 UTC
[Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v (was: test-v2v-cdrom: update the CD-ROM's bus to SATA in the converted domain)
On Thu, Sep 30, 2021 at 01:12:39PM +0200, Laszlo Ersek wrote:

> On 09/29/21 21:22, Richard W.M. Jones wrote:
> Please correct me if I'm wrong: at the moment, I believe virt-v2v parses
> and manipulates the following elements and attributes in the domain XML:
>
>   <target dev='hda' bus='ide'/>
>   <target dev='hdb' bus='ide'/>
>   <target dev='hdc' bus='ide'/>
>   <target dev='hdd' bus='ide'/>
>                ^^^       ^^^
>
> My understanding is however that the target/@dev attribute is mostly
> irrelevant:
>
>   https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms
>
>     The dev attribute indicates the "logical" device name. The actual
>     device name specified is not guaranteed to map to the device name in
>     the guest OS. Treat it as a device ordering hint. [...]

I won't say it is irrelevant. Functionally @dev is absolutely still
important, as it influences how the disk is attached to the VM. Rather
I would say that the @dev attribute is misleading to users, because
they mistakenly think it provides a guarantee that the disk will appear
with this name inside the guest.

> What actually matters is the target/@bus attribute, in combination with
> the sibling element <address>. Such as:
>
>   <target dev='hda' bus='ide'/>
>                          ^^^
>   <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>                                     ^       ^                   ^
>
>   <target dev='hdb' bus='ide'/>
>                          ^^^
>   <address type='drive' controller='0' bus='0' target='0' unit='1'/>
>                                     ^       ^                   ^
>
>   <target dev='hdc' bus='ide'/>
>                          ^^^
>   <address type='drive' controller='0' bus='1' target='0' unit='0'/>
>                                     ^       ^                   ^
>
>   <target dev='hdd' bus='ide'/>
>                          ^^^
>   <address type='drive' controller='0' bus='1' target='0' unit='1'/>
>                                     ^       ^                   ^
>
> So, target/@dev should be mostly ignored; what matters is the following
> tuple:
>
>   (target/@bus, address/@controller, address/@bus, address/@unit)

Yes, the <address/> is what libvirt internally drives all configuration
off, but in practice application developers almost never use the
<address> element directly. They will just give target/@dev and libvirt
will use that to automatically populate an <address/> element, in order
to reliably fixate the guest ABI thereafter.

>
> Extracting just the tuples:
>
>   (ide, 0, 0, 0)
>   (ide, 0, 0, 1)
>   (ide, 0, 1, 0)
>   (ide, 0, 1, 1)
>
> The first two components of each tuple -- i.e., (ide, 0) -- refer to the
> following IDE controller:
>
>   <controller type='ide' index='0'>
>               ^^^^^^^^^^ ^^^^^^^^^
>     <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
>   </controller>
>
> and then the rest of the components, such as (0, 0), (0, 1), (1, 0), (1,
> 1), identify the disk on that IDE controller.

Yes, that's correct for IDE.

> (Side comment: the PCI location of the (first) IDE controller is fixed
> in QEMU; if one tries to change it, libvirt complains: "Primary IDE
> controller must have PCI address 0:0:1.1".)

Yep, it's defined by the QEMU machine type and we just have to accept
that for IDE/SATA.

> Inside the guest, /dev/hd* nodes don't even exist, so it's unlikely that
> /etc/fstab would refer to them. /etc/fstab can however refer to symlinks
> under "/dev/disk/by-id" (for example):
>
>   lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00001 -> ../../sr0
>   lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00002 -> ../../sr1
>   lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00003 -> ../../sr2
>   lrwxrwxrwx. 1 root root 9 Sep 30 11:54 ata-QEMU_DVD-ROM_QM00004 -> ../../sr3
>
> Furthermore, we have pseudo-files (directories) such as:
>
>   /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/block/sr0
>   /sys/devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:1/0:0:1:0/block/sr1
>   /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
>   /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:1/1:0:1:0/block/sr3
>                                                               ^   ^
>
> So in order to map a device path from the original guest's "/etc/fstab",
> such as "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003", to the original
> domain XML's <disk> element, we have to do the following in the "source"
> appliance:
>
>   NODE=$(realpath /dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003)
>   # -> /dev/sr2
>   NODE=${NODE#/dev/}
>   # -> sr2
>   DEVPATH=$(ls -d /sys/devices/pci0000:00/0000:00:01.1/ata?/host?/target?:0:?/?:0:?:0/block/$NODE)
>   # -> /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr2
>
> And then map the "1:0:0:0" pathname component from $DEVPATH to:
>
>   <target dev='hdc' bus='ide'/>
>   <address type='drive' controller='0' bus='1' target='0' unit='0'/>
>                                        ^^^^^^^            ^^^^^^^^
>                                        [1]:0:0:0          1:0:[0]:0
>
> in the original domain XML. This tells us under what device node the
> original guest sees the host-side file (<source> element).

Yes, to map from the guest to the libvirt XML, you need to be working
in terms of the hardware buses/addresses.

> So, assuming we mapped the original (i440fx)
> "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003" guest device path to some
> <source> element (= host-side file) in the original domain XML, and
> assuming virt-v2v assigned the same <source> element to the following
> element on the Q35 board:
>
>   <target dev='sd*' bus='sata'/>
>   <address type='drive' controller='0' bus='0' target='0' unit='4'/>
>                                                           ^^^^^^^^
>
> we find the device node in the destination appliance as follows:
>
>   NODE=$(basename /sys/devices/pci0000:00/0000:00:1f.2/ata?/host?/target?:0:0/4:0:0:0/block/*)
>                                                                               ^
>                                                                               unit='4'
>   # -> sr4
>
> and then replace "/dev/disk/by-id/ata-QEMU_DVD-ROM_QM00003" with
> "/dev/sr4" in "/etc/fstab".

I wouldn't recommend using any of the /dev/* devices, as those are all
unstable when faced with changed guest configuration over time.
/etc/fstab should really use a stable /dev/disk/ symlink, so it can be
directly associated with the desired device based on the hardware
topology, rather than guest OS device probe order.

> /dev/disk/by-label and /dev/disk/by-uuid are based on media contents,
> and multiple CD-ROMs may (read-only) map the same host-side file, so
> those are not good for mapping either, I think. Also does not cover
> CD-ROM devices that are empty (have no medium) at the time of
> conversion, but "/etc/fstab" still refers to them (potentially with
> "noauto"). So I think the only reliable ID is the hardware device path.

Yep, the symlink based on hardware topology is the only thing that can
be stable and unique.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
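
[Editor's sketch] The message above recommends a stable "/dev/disk/" symlink based on hardware topology but does not name a directory. The sketch below assumes that this means the udev-maintained "/dev/disk/by-path" directory, which is an editorial assumption. The function name by_path_link_for is hypothetical, and the directory is passed as a parameter so the function can be tried on any directory of symlinks.

```shell
#!/bin/bash

# Hypothetical helper: print the first symlink in directory $1 whose
# target resolves to a node named $2 (for example "sr4").
# Assumption: $1 is a directory like /dev/disk/by-path whose entries are
# relative symlinks such as "../../sr4".
by_path_link_for() {
    local dir=$1 node=$2 link
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(basename "$(readlink "$link")")" = "$node" ]; then
            printf '%s\n' "$link"
            return 0
        fi
    done
    return 1
}

# Example call inside the destination appliance (not executed here):
#   by_path_link_for /dev/disk/by-path sr4
```

A converter following this advice would write the printed symlink into "/etc/fstab" instead of "/dev/sr4". Several links can point at the same node; this sketch simply returns the first one in glob order.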
Laszlo Ersek
2021-Oct-01 10:35 UTC
[Libguestfs] translating CD-ROM device paths from i440fx to Q35 in virt-v2v
Hi Rich,

(dropping libvirt-devel)

On 09/30/21 13:53, Richard W.M. Jones wrote:

> Also we don't currently try to find or rewrite /dev/disk/ paths in
> guest configuration files.  The only rewriting that happens is for
> /dev/[hs]d* block device filenames and a few others.  The actual code
> that does this is convert/convert_linux.ml:remap_block_devices
>
> So I wouldn't over-think this.  It's likely fine to identify such
> devices and rewrite them as "/dev/cdrom", assuming that (I didn't
> check) udev creates that symlink for any reasonably modern Linux.  And
> if there's more than one attached CD to the source, only convert the
> first one and warn about but drop the others.

So I've started reading "convert/convert_linux.ml" in parallel with the
OCaml manual. (I dabbled for a few weeks in Haskell when everyone else
did a few years ago, so it's not 100% unfamiliar.)

Question:

  let family =
    match inspect.i_distro with
    | "fedora" | "rhel" | "centos" | "scientificlinux" | "redhat-based"
    | "oraclelinux" -> `RHEL_family
    | "altlinux" -> `ALT_family
    | "sles" | "suse-based" | "opensuse" -> `SUSE_family
    | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> `Debian_family

Here, RHEL_family, ALT_family, SUSE_family, Debian_family are not plain
constructors ("fixed" variants) [1] but polymorphic ones [2]. Why?

Was this done *only* in order so that an explicit type definition such
as

  type os_family = RHEL_family | ALT_family | SUSE_family | Debian_family

could be avoided?

Based on my (incomplete) understanding of the OCaml docs, this looks
like a bad idea. We need no polymorphism here, and using a static type
(a fixed variant) is generally beneficial [3].

The following (minimally modified) definition at the OCaml REPL:

  let family1 distro =
    match distro with
    | "fedora" | "rhel" | "centos" | "scientificlinux" | "redhat-based"
    | "oraclelinux" -> `RHEL_family
    | "altlinux" -> `ALT_family
    | "sles" | "suse-based" | "opensuse" -> `SUSE_family
    | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> `Debian_family
    | _ -> assert false;;

deduces the following type:

  string -> [> `ALT_family | `Debian_family | `RHEL_family | `SUSE_family ] = <fun>

with the nasty "[>" mark at the start (allowing for further type
refinement [2], which I think we don't need here).

Conversely,

  type os_family = RHEL_family | ALT_family | SUSE_family | Debian_family;;

  let family2 distro =
    match distro with
    | "fedora" | "rhel" | "centos" | "scientificlinux" | "redhat-based"
    | "oraclelinux" -> RHEL_family
    | "altlinux" -> ALT_family
    | "sles" | "suse-based" | "opensuse" -> SUSE_family
    | "debian" | "ubuntu" | "linuxmint" | "kalilinux" -> Debian_family
    | _ -> assert false;;

(note the explicit os_family type definition, and the removal of the
backticks from the constructor names) comes back with the type:

  val family2 : string -> os_family = <fun>

Basically converting a string to an enum constant.

So, what's the reason for the polymorphic variant?

[1] https://ocaml.org/manual/coreexamples.html#s%3Atut-recvariants
[2] https://ocaml.org/manual/polyvariant.html#sec48
[3] https://ocaml.org/manual/polyvariant.html#s%3Apolyvariant-weaknesses

Thanks!
Laszlo