Laszlo Ersek
2021-Dec-23 10:36 UTC
[Libguestfs] [PATCH 0/3] resolve conflict between manual and libvirt-assigned PCI addresses
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 The first patch extends our current <qemu:commandline> hack, moving the virtio-net-pci device to slot 0x1E, where it is very unlikely to conflict with any libvirt-assigned PCI address. The second patch is only refactoring. The third patch resolves the conflict on libvirt >= 3.8.0 in a better way (suggested by Rich): such libvirt versions permit SLIRP network address and prefix specifications right in the <interface> element. Therefore we can have libvirt manage the virtio-net device for us, similarly to virtio-rng, virtio-scsi, and virtio-serial. The (updated) <qemu:commandline> hack is preserved for libvirt < 3.8.0. Gruesomely meticulously tested. (See the Notes sections on the patches.) Sanity-tested both back-ends *without* the "--network" switch as well. Thanks, Laszlo Laszlo Ersek (3): launch-libvirt: place our virtio-net-pci device in slot 0x1e lib: extract NETWORK_ADDRESS and NETWORK_PREFIX as macros launch-libvirt: add virtio-net via the standard <interface> element lib/guestfs-internal.h | 18 ++++++++++++++++++ lib/launch-direct.c | 2 +- lib/launch-libvirt.c | 34 ++++++++++++++++++++++++++++++---- 3 files changed, 49 insertions(+), 5 deletions(-) base-commit: 4af6d68e2d8b856d91fa5527216ea3db04556086 -- 2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:36 UTC
[Libguestfs] [PATCH 1/3] launch-libvirt: place our virtio-net-pci device in slot 0x1e
The <qemu:commandline> trick we use for adding our virtio-net-pci device
in the libvirt backend can conflict with libvirtd's and QEMU's PCI
address
assignment. Try to mitigate that by placing our device in slot 0x1e on the
root bus. In practice this could only conflict with a
"dmi-to-pci-bridge"
device model, which libvirtd itself places in slot 0x1e. However, given
the XMLs we generate, and modern QEMU versions, libvirtd has no reason to
auto-add "dmi-to-pci-bridge". Refer to
<https://libvirt.org/formatdomain.html#controllers>.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160
Signed-off-by: Laszlo Ersek <lersek at redhat.com>
---
Notes:
v1:
- Regression-tested with
LIBGUESTFS_BACKEND=libvirt virt-rescue -a test1.img --network
- According to "virsh dumpxml", the <qemu:commandline>
element now
contains
<qemu:arg value='-device'/>
<qemu:arg
value='virtio-net-pci,netdev=usernet,addr=1e.0'/>
And the corresponding QEMU command line option is (per "ps"):
-device virtio-net-pci,netdev=usernet,addr=1e.0
lib/guestfs-internal.h | 11 +++++++++++
lib/launch-libvirt.c | 4 +++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h
index 4097b33fd67a..8eb2dd3adddb 100644
--- a/lib/guestfs-internal.h
+++ b/lib/guestfs-internal.h
@@ -172,6 +172,17 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr)
#define VIRTIO_DEVICE_NAME(type) type "-pci"
#endif
+/* Place the virtio-net controller in slot 0x1e on the root bus, on normal
+ * hardware with PCI. Refer to RHBZ#2034160.
+ */
+#ifdef HAVE_LIBVIRT_BACKEND
+#if defined(__arm__) || defined(__s390x__)
+#define VIRTIO_NET_PCI_ADDR ""
+#else
+#define VIRTIO_NET_PCI_ADDR ",addr=1e.0"
+#endif
+#endif
+
/* Guestfs handle and associated structures. */
/* State. */
diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
index 194530c49fc8..9e833693810f 100644
--- a/lib/launch-libvirt.c
+++ b/lib/launch-libvirt.c
@@ -1851,7 +1851,9 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g,
} end_element ();
start_element ("qemu:arg") {
- attribute ("value", VIRTIO_DEVICE_NAME
("virtio-net") ",netdev=usernet");
+ attribute ("value", (VIRTIO_DEVICE_NAME
("virtio-net")
+ ",netdev=usernet"
+ VIRTIO_NET_PCI_ADDR));
} end_element ();
}
--
2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:37 UTC
[Libguestfs] [PATCH 2/3] lib: extract NETWORK_ADDRESS and NETWORK_PREFIX as macros
The 169.254.0.0/16 network specification (for the appliance) is currently
duplicated between the direct backend and the libvirt backend. In a
subsequent patch, we're going to need the network specification in yet
another spot; extract it now to the NETWORK_ADDRESS and NETWORK_PREFIX
macros (simply as strings).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160
Signed-off-by: Laszlo Ersek <lersek at redhat.com>
---
Notes:
v1:
- Regression-tested with "virt-rescue --network" and
"ping", using the
default "direct" back-end, after setting
"net.ipv4.ping_group_range"
properly in sysctl on the host.
- The direct back-end continues to generate the following QEMU option:
-netdev user,id=usernet,net=169.254.0.0/16
- Did the same with the libvirt backend. The generated libvirt XML,
per "virsh dumpxml", contains
<qemu:commandline>
<qemu:arg value='-netdev'/>
<qemu:arg value='user,id=usernet,net=169.254.0.0/16'/>
<qemu:arg value='-device'/>
<qemu:arg
value='virtio-net-pci,netdev=usernet,addr=1e.0'/>
<qemu:env name='TMPDIR'
value='/home/lacos/src/v2v/guestfs-tools/tmp'/>
</qemu:commandline>
and the consequent QEMU cmdline has
-netdev user,id=usernet,net=169.254.0.0/16 \
-device virtio-net-pci,netdev=usernet,addr=1e.0 \
lib/guestfs-internal.h | 6 ++++++
lib/launch-direct.c | 2 +-
lib/launch-libvirt.c | 3 ++-
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h
index 8eb2dd3adddb..e24d570f5aa7 100644
--- a/lib/guestfs-internal.h
+++ b/lib/guestfs-internal.h
@@ -183,6 +183,12 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr)
#endif
#endif
+/* Network address and network mask (expressed as address prefix) that the
+ * appliance will see (if networking is enabled).
+ */
+#define NETWORK_ADDRESS "169.254.0.0"
+#define NETWORK_PREFIX "16"
+
/* Guestfs handle and associated structures. */
/* State. */
diff --git a/lib/launch-direct.c b/lib/launch-direct.c
index e5b9a561176e..4f038f4f04d5 100644
--- a/lib/launch-direct.c
+++ b/lib/launch-direct.c
@@ -689,7 +689,7 @@ launch_direct (guestfs_h *g, void *datav, const char *arg)
start_list ("-netdev") {
append_list ("user");
append_list ("id=usernet");
- append_list ("net=169.254.0.0/16");
+ append_list ("net=" NETWORK_ADDRESS "/"
NETWORK_PREFIX);
} end_list ();
start_list ("-device") {
append_list (VIRTIO_DEVICE_NAME ("virtio-net"));
diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
index 9e833693810f..266d88824686 100644
--- a/lib/launch-libvirt.c
+++ b/lib/launch-libvirt.c
@@ -1843,7 +1843,8 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g,
} end_element ();
start_element ("qemu:arg") {
- attribute ("value",
"user,id=usernet,net=169.254.0.0/16");
+ attribute ("value",
+ "user,id=usernet,net=" NETWORK_ADDRESS
"/" NETWORK_PREFIX);
} end_element ();
start_element ("qemu:arg") {
--
2.19.1.3.g30247aa5d201
Laszlo Ersek
2021-Dec-23 10:37 UTC
[Libguestfs] [PATCH 3/3] launch-libvirt: add virtio-net via the standard <interface> element
Starting with version 3.8.0, libvirt allows us to specify the network
address and network mask (as prefix) for SLIRP directly via the
<interface> element in the domain XML:
<https://libvirt.org/formatdomain.html#userspace-slirp-stack>. This means
we don't need the <qemu:commandline> hack for virtio-net on such
versions.
Restrict the hack in construct_libvirt_xml_qemu_cmdline() to
libvirt<3.8.0, and generate the proper <interface> element in
construct_libvirt_xml_devices() on libvirt>=3.8.0.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160
Suggested-by: Richard W.M. Jones <rjones at redhat.com>
Signed-off-by: Laszlo Ersek <lersek at redhat.com>
---
Notes:
v1:
- Tested with "virt-rescue --network" and "ping", using
the "libvirt"
back-end, after setting "net.ipv4.ping_group_range" properly in
sysctl
on the host. (Libvirt version: libvirt-daemon-7.6.0-3.fc35.x86_64.)
- The generated libvirt XML, per "virsh dumpxml", contains
<devices>
<interface type='user'>
<mac address='52:54:00:fe:23:48'/>
<ip address='169.254.0.0' family='ipv4'
prefix='16'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000'
bus='0x00' slot='0x02' function='0x0'/>
</interface>
</devices>
<qemu:commandline>
<qemu:env name='TMPDIR'
value='/home/lacos/src/v2v/guestfs-tools/tmp'/>
</qemu:commandline>
and the consequent QEMU cmdline has
-netdev user,net=169.254.0.0/16,id=hostnet0 \
-device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fe:23:48,bus=pci.0,addr=0x2
\
- The full set of PCI addresses that libvirt assigns to the virtio
devices are, per "virsh dumpxml":
<controller type='scsi' index='0'
model='virtio-scsi'>
<address type='pci' domain='0x0000'
bus='0x00' slot='0x03' function='0x0'/>
<controller type='virtio-serial' index='0'>
<address type='pci' domain='0x0000'
bus='0x00' slot='0x04' function='0x0'/>
<interface type='user'>
<address type='pci' domain='0x0000'
bus='0x00' slot='0x02' function='0x0'/>
<rng model='virtio'>
<address type='pci' domain='0x0000'
bus='0x00' slot='0x05' function='0x0'/>
- The same, verified with
virsh -c ... qemu-monitor-command guestfs-... --hmp info qtree
is:
bus: main-system-bus
dev: i440FX-pcihost, id ""
bus: pci.0
dev: virtio-rng-pci, id "rng0"
addr = 05.0
dev: virtio-net-pci, id "net0"
addr = 02.0
dev: virtio-serial-pci, id "virtio-serial0"
addr = 04.0
dev: virtio-scsi-pci, id "scsi0"
addr = 03.0
lib/guestfs-internal.h | 3 ++-
lib/launch-libvirt.c | 27 +++++++++++++++++++++++++--
2 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h
index e24d570f5aa7..4a19e5c6bb47 100644
--- a/lib/guestfs-internal.h
+++ b/lib/guestfs-internal.h
@@ -173,7 +173,8 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr)
#endif
/* Place the virtio-net controller in slot 0x1e on the root bus, on normal
- * hardware with PCI. Refer to RHBZ#2034160.
+ * hardware with PCI. Necessary only before libvirt 3.8.0. Refer to
+ * RHBZ#2034160.
*/
#ifdef HAVE_LIBVIRT_BACKEND
#if defined(__arm__) || defined(__s390x__)
diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
index 266d88824686..cc714c02efbf 100644
--- a/lib/launch-libvirt.c
+++ b/lib/launch-libvirt.c
@@ -1413,6 +1413,28 @@ construct_libvirt_xml_devices (guestfs_h *g,
} end_element ();
} end_element ();
+ /* Virtio-net NIC with SLIRP (= userspace) back-end, if networking is
+ * enabled. Starting with libvirt 3.8.0, we can specify the network address
+ * and prefix for SLIRP in the domain XML. Therefore, we can add the NIC
+ * via the standard <interface> element rather than
<qemu:commandline>, and
+ * so libvirt can manage the PCI address of the virtio-net NIC like the PCI
+ * addresses of all other devices. Refer to RHBZ#2034160.
+ */
+ if (g->enable_network &&
+ guestfs_int_version_ge (¶ms->data->libvirt_version, 3, 8,
0)) {
+ start_element ("interface") {
+ attribute ("type", "user");
+ start_element ("model") {
+ attribute ("type", "virtio");
+ } end_element ();
+ start_element ("ip") {
+ attribute ("family", "ipv4");
+ attribute ("address", NETWORK_ADDRESS);
+ attribute ("prefix", NETWORK_PREFIX);
+ } end_element ();
+ } end_element ();
+ }
+
/* Libvirt adds some devices by default. Indicate to libvirt
* that we don't want them.
*/
@@ -1835,9 +1857,10 @@ construct_libvirt_xml_qemu_cmdline (guestfs_h *g,
} end_element ();
/* Workaround because libvirt user networking cannot specify
"net="
- * parameter.
+ * parameter. Necessary only before libvirt 3.8.0; refer to RHBZ#2034160.
*/
- if (g->enable_network) {
+ if (g->enable_network &&
+ !guestfs_int_version_ge (¶ms->data->libvirt_version, 3,
8, 0)) {
start_element ("qemu:arg") {
attribute ("value", "-netdev");
} end_element ();
--
2.19.1.3.g30247aa5d201
Daniel P. Berrangé
2022-Jan-04 10:33 UTC
[Libguestfs] [PATCH 0/3] resolve conflict between manual and libvirt-assigned PCI addresses
On Thu, Dec 23, 2021 at 11:36:58AM +0100, Laszlo Ersek wrote:> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160 > > The first patch extends our current <qemu:commandline> hack, moving the > virtio-net-pci device to slot 0x1E, where it is very unlikely to > conflict with any libvirt-assigned PCI address.Remind me why we still need to use <qemu:commandline> ? A need for this obviously reflects a failure of libvirt to address the libguestfs requirements. Can we fix the root cause here with proper XML support for whatever feature is missing. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|