Laszlo Ersek
2021-Dec-23 10:36 UTC
[Libguestfs] [PATCH 0/3] resolve conflict between manual and libvirt-assigned PCI addresses
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 The first patch extends our current <qemu:commandline> hack, moving the virtio-net-pci device to slot 0x1E, where it is very unlikely to conflict with any libvirt-assigned PCI address. The second patch is only refactoring. The third patch resolves the conflict on libvirt >= 3.8.0 in a better way (suggested by Rich): such libvirt versions permit SLIRP network address and prefix specifications right in the <interface> element. Therefore we can have libvirt manage the virtio-net device for us, similarly to virtio-rng, virtio-scsi, and virtio-serial. The (updated) <qemu:commandline> hack is preserved for libvirt < 3.8.0. Gruesomely meticulously tested. (See the Notes sections on the patches.) Sanity-tested both back-ends *without* the "--network" switch as well. Thanks, Laszlo Laszlo Ersek (3): launch-libvirt: place our virtio-net-pci device in slot 0x1e lib: extract NETWORK_ADDRESS and NETWORK_PREFIX as macros launch-libvirt: add virtio-net via the standard <interface> element lib/guestfs-internal.h | 18 ++++++++++++++++++ lib/launch-direct.c | 2 +- lib/launch-libvirt.c | 34 ++++++++++++++++++++++++++++++---- 3 files changed, 49 insertions(+), 5 deletions(-) base-commit: 4af6d68e2d8b856d91fa5527216ea3db04556086 -- 2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:36 UTC
[Libguestfs] [PATCH 1/3] launch-libvirt: place our virtio-net-pci device in slot 0x1e
The <qemu:commandline> trick we use for adding our virtio-net-pci device in the libvirt backend can conflict with libvirtd's and QEMU's PCI address assignment. Try to mitigate that by placing our device in slot 0x1e on the root bus. In practice this could only conflict with a "dmi-to-pci-bridge" device model, which libvirtd itself places in slot 0x1e. However, given the XMLs we generate, and modern QEMU versions, libvirtd has no reason to auto-add "dmi-to-pci-bridge". Refer to <https://libvirt.org/formatdomain.html#controllers>. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 Signed-off-by: Laszlo Ersek <lersek at redhat.com> --- Notes: v1: - Regression-tested with LIBGUESTFS_BACKEND=libvirt virt-rescue -a test1.img --network - According to "virsh dumpxml", the <qemu:commandline> element now contains <qemu:arg value='-device'/> <qemu:arg value='virtio-net-pci,netdev=usernet,addr=1e.0'/> And the corresponding QEMU command line option is (per "ps"): -device virtio-net-pci,netdev=usernet,addr=1e.0 lib/guestfs-internal.h | 11 +++++++++++ lib/launch-libvirt.c | 4 +++- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h index 4097b33fd67a..8eb2dd3adddb 100644 --- a/lib/guestfs-internal.h +++ b/lib/guestfs-internal.h @@ -172,6 +172,17 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr) #define VIRTIO_DEVICE_NAME(type) type "-pci" #endif +/* Place the virtio-net controller in slot 0x1e on the root bus, on normal + * hardware with PCI. Refer to RHBZ#2034160. + */ +#ifdef HAVE_LIBVIRT_BACKEND +#if defined(__arm__) || defined(__s390x__) +#define VIRTIO_NET_PCI_ADDR "" +#else +#define VIRTIO_NET_PCI_ADDR ",addr=1e.0" +#endif +#endif + /* Guestfs handle and associated structures. */ /* State. */ diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c index 194530c49fc8..9e833693810f 100644 --- a/lib/launch-libvirt.c +++ b/lib/launch-libvirt.c @@ -1851,7 +1851,9 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g, } end_element (); start_element ("qemu:arg") { - attribute ("value", VIRTIO_DEVICE_NAME ("virtio-net") ",netdev=usernet"); + attribute ("value", (VIRTIO_DEVICE_NAME ("virtio-net") + ",netdev=usernet" + VIRTIO_NET_PCI_ADDR)); } end_element (); } -- 2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:37 UTC
[Libguestfs] [PATCH 2/3] lib: extract NETWORK_ADDRESS and NETWORK_PREFIX as macros
The 169.254.0.0/16 network specification (for the appliance) is currently duplicated between the direct backend and the libvirt backend. In a subsequent patch, we're going to need the network specification in yet another spot; extract it now to the NETWORK_ADDRESS and NETWORK_PREFIX macros (simply as strings). Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 Signed-off-by: Laszlo Ersek <lersek at redhat.com> --- Notes: v1: - Regression-tested with "virt-rescue --network" and "ping", using the default "direct" back-end, after setting "net.ipv4.ping_group_range" properly in sysctl on the host. - The direct back-end continues to generate the following QEMU option: -netdev user,id=usernet,net=169.254.0.0/16 - Did the same with the libvirt backend. The generated libvirt XML, per "virsh dumpxml", contains <qemu:commandline> <qemu:arg value='-netdev'/> <qemu:arg value='user,id=usernet,net=169.254.0.0/16'/> <qemu:arg value='-device'/> <qemu:arg value='virtio-net-pci,netdev=usernet,addr=1e.0'/> <qemu:env name='TMPDIR' value='/home/lacos/src/v2v/guestfs-tools/tmp'/> </qemu:commandline> and the consequent QEMU cmdline has -netdev user,id=usernet,net=169.254.0.0/16 \ -device virtio-net-pci,netdev=usernet,addr=1e.0 \ lib/guestfs-internal.h | 6 ++++++ lib/launch-direct.c | 2 +- lib/launch-libvirt.c | 3 ++- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h index 8eb2dd3adddb..e24d570f5aa7 100644 --- a/lib/guestfs-internal.h +++ b/lib/guestfs-internal.h @@ -183,6 +183,12 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr) #endif #endif +/* Network address and network mask (expressed as address prefix) that the + * appliance will see (if networking is enabled). + */ +#define NETWORK_ADDRESS "169.254.0.0" +#define NETWORK_PREFIX "16" + /* Guestfs handle and associated structures. */ /* State. */ diff --git a/lib/launch-direct.c b/lib/launch-direct.c index e5b9a561176e..4f038f4f04d5 100644 --- a/lib/launch-direct.c +++ b/lib/launch-direct.c @@ -689,7 +689,7 @@ launch_direct (guestfs_h *g, void *datav, const char *arg) start_list ("-netdev") { append_list ("user"); append_list ("id=usernet"); - append_list ("net=169.254.0.0/16"); + append_list ("net=" NETWORK_ADDRESS "/" NETWORK_PREFIX); } end_list (); start_list ("-device") { append_list (VIRTIO_DEVICE_NAME ("virtio-net")); diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c index 9e833693810f..266d88824686 100644 --- a/lib/launch-libvirt.c +++ b/lib/launch-libvirt.c @@ -1843,7 +1843,8 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g, } end_element (); start_element ("qemu:arg") { - attribute ("value", "user,id=usernet,net=169.254.0.0/16"); + attribute ("value", + "user,id=usernet,net=" NETWORK_ADDRESS "/" NETWORK_PREFIX); } end_element (); start_element ("qemu:arg") { -- 2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:37 UTC
[Libguestfs] [PATCH 3/3] launch-libvirt: add virtio-net via the standard <interface> element
Starting with version 3.8.0, libvirt allows us to specify the network address and network mask (as prefix) for SLIRP directly via the <interface> element in the domain XML: <https://libvirt.org/formatdomain.html#userspace-slirp-stack>. This means we don't need the <qemu:commandline> hack for virtio-net on such versions. Restrict the hack in construct_libvirt_xml_qemu_cmdline() to libvirt<3.8.0, and generate the proper <interface> element in construct_libvirt_xml_devices() on libvirt>=3.8.0. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 Suggested-by: Richard W.M. Jones <rjones at redhat.com> Signed-off-by: Laszlo Ersek <lersek at redhat.com> --- Notes: v1: - Tested with "virt-rescue --network" and "ping", using the "libvirt" back-end, after setting "net.ipv4.ping_group_range" properly in sysctl on the host. (Libvirt version: libvirt-daemon-7.6.0-3.fc35.x86_64.) - The generated libvirt XML, per "virsh dumpxml", contains <devices> <interface type='user'> <mac address='52:54:00:fe:23:48'/> <ip address='169.254.0.0' family='ipv4' prefix='16'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </interface> </devices> <qemu:commandline> <qemu:env name='TMPDIR' value='/home/lacos/src/v2v/guestfs-tools/tmp'/> </qemu:commandline> and the consequent QEMU cmdline has -netdev user,net=169.254.0.0/16,id=hostnet0 \ -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fe:23:48,bus=pci.0,addr=0x2 \ - The full set of PCI addresses that libvirt assigns to the virtio devices are, per "virsh dumpxml": <controller type='scsi' index='0' model='virtio-scsi'> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> <controller type='virtio-serial' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> <interface type='user'> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> <rng model='virtio'> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> - The same, verified with virsh -c ... qemu-monitor-command guestfs-... --hmp info qtree is: bus: main-system-bus dev: i440FX-pcihost, id "" bus: pci.0 dev: virtio-rng-pci, id "rng0" addr = 05.0 dev: virtio-net-pci, id "net0" addr = 02.0 dev: virtio-serial-pci, id "virtio-serial0" addr = 04.0 dev: virtio-scsi-pci, id "scsi0" addr = 03.0 lib/guestfs-internal.h | 3 ++- lib/launch-libvirt.c | 27 +++++++++++++++++++++++++-- 2 files changed, 27 insertions(+), 3 deletions(-) diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h index e24d570f5aa7..4a19e5c6bb47 100644 --- a/lib/guestfs-internal.h +++ b/lib/guestfs-internal.h @@ -173,7 +173,8 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr) #endif /* Place the virtio-net controller in slot 0x1e on the root bus, on normal - * hardware with PCI. Refer to RHBZ#2034160. + * hardware with PCI. Necessary only before libvirt 3.8.0. Refer to + * RHBZ#2034160. */ #ifdef HAVE_LIBVIRT_BACKEND #if defined(__arm__) || defined(__s390x__) diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c index 266d88824686..cc714c02efbf 100644 --- a/lib/launch-libvirt.c +++ b/lib/launch-libvirt.c @@ -1413,6 +1413,28 @@ construct_libvirt_xml_devices (guestfs_h *g, } end_element (); } end_element (); + /* Virtio-net NIC with SLIRP (= userspace) back-end, if networking is + * enabled. Starting with libvirt 3.8.0, we can specify the network address + * and prefix for SLIRP in the domain XML. Therefore, we can add the NIC + * via the standard <interface> element rather than <qemu:commandline>, and + * so libvirt can manage the PCI address of the virtio-net NIC like the PCI + * addresses of all other devices. Refer to RHBZ#2034160. + */ + if (g->enable_network && + guestfs_int_version_ge (¶ms->data->libvirt_version, 3, 8, 0)) { + start_element ("interface") { + attribute ("type", "user"); + start_element ("model") { + attribute ("type", "virtio"); + } end_element (); + start_element ("ip") { + attribute ("family", "ipv4"); + attribute ("address", NETWORK_ADDRESS); + attribute ("prefix", NETWORK_PREFIX); + } end_element (); + } end_element (); + } + /* Libvirt adds some devices by default. Indicate to libvirt * that we don't want them. */ @@ -1835,9 +1857,10 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g, } end_element (); /* Workaround because libvirt user networking cannot specify "net=" - * parameter. + * parameter. Necessary only before libvirt 3.8.0; refer to RHBZ#2034160. */ - if (g->enable_network) { + if (g->enable_network && + !guestfs_int_version_ge (¶ms->data->libvirt_version, 3, 8, 0)) { start_element ("qemu:arg") { attribute ("value", "-netdev"); } end_element (); -- 2.19.1.3.g30247aa5d201
Daniel P. Berrangé
2022-Jan-04 10:33 UTC
[Libguestfs] [PATCH 0/3] resolve conflict between manual and libvirt-assigned PCI addresses
On Thu, Dec 23, 2021 at 11:36:58AM +0100, Laszlo Ersek wrote:> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 > > The first patch extends our current <qemu:commandline> hack, moving the > virtio-net-pci device to slot 0x1E, where it is very unlikely to > conflict with any libvirt-assigned PCI address.Remind me why we still need to use <qemu:commandline> ? A need for this obviously reflects a failure of libvirt to address the libguestfs requirements. Can we fix the root cause here with proper XML support for whatever feature is missing. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|