Daniel P. Berrangé
2022-Nov-02 15:20 UTC
Predictable and consistent net interface naming in guests
On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
> On Wed, 2 Nov 2022 10:43:10 -0400
> Laine Stump <laine at redhat.com> wrote:
>
> > On 11/1/22 7:46 AM, Igor Mammedov wrote:
> > > On Mon, 31 Oct 2022 14:48:54 +0000
> > > Daniel P. Berrangé <berrange at redhat.com> wrote:
> > >
> > >> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> > >>> Hi Igor and Laine,
> > >>>
> > >>> I would like to revive a 2 years old discussion [1] about consistent network
> > >>> interfaces in the guest.
> > >>>
> > >>> That discussion mentioned that a guest PCI address may change in two cases:
> > >>> - The PCI topology changes.
> > >>> - The machine type changes.
> > >>>
> > >>> Usually, the machine type is not expected to change, especially if one
> > >>> wants to allow migrations between nodes.
> > >>> I would hope to argue this should not be problematic in practice, because
> > >>> guest images would be made per a specific machine type.
> > >>>
> > >>> Regarding the PCI topology, I am not sure I understand what changes
> > >>> need to occur to the domxml for a defined guest PCI address to change.
> > >>> The only thing that I can think of is a scenario where hotplug/unplug is
> > >>> used,
> > >>> but even then I would expect existing devices to preserve their PCI address
> > >>> and the plug/unplug device to have a reserved address managed by the one
> > >>> acting on it (the management system).
> > >>>
> > >>> Could you please help clarify in which scenarios the PCI topology can cause
> > >>> a mess to the naming of interfaces in the guest?
> > >>>
> > >>> Are there any plans to add the acpi_index support?
> > >>
> > >> This was implemented a year & a half ago
> > >>
> > >>   https://libvirt.org/formatdomain.html#network-interfaces
> > >>
> > >> though due to QEMU limitations this only works for the old
> > >> i440fx chipset, not Q35 yet.
> > >
> > > Q35 should work partially too. In its case acpi-index support
> > > is limited to hotplug enabled root-ports and PCIe-PCI bridges.
> > > One also has to enable ACPI PCI hotplug (it's enabled by default
> > > on recent machine types) for it to work (i.e. it's not supported
> > > in native PCIe hotplug mode).
> > >
> > > So if mgmt can put nics on root-ports/bridges, then acpi-index
> > > should just work on Q35 as well.
> >
> > With only a few exceptions (e.g. the first ich9 audio device, which is
> > placed directly on the root bus at 00:1B.0 because that is where the
> > ich9 audio device is located on actual Q35 hardware), libvirt will
> > automatically put all PCI devices (including network interfaces) on a
> > pcie-root-port.
> >
> > After seeing reports that "acpi index doesn't work with Q35
> > machinetypes" I just assumed that was correct and didn't try it. But
> > after seeing the "should work partially" statement above, I tried it
> > just now and an <interface> of a Q35 guest that had its PCI address
> > auto-assigned by libvirt (and so was placed on a pcie-root-port), and
> > had <acpi index='4'/> was given the name "eno4". So what exactly is it
> > that *doesn't* work?
>
> From QEMU side:
> acpi-index requires:
>   1. acpi pci hotplug enabled (which is default on relatively new q35 machine types)
>   2. hotpluggable pci bus (root-port, various pci bridges)
>   3. NIC can be cold or hotplugged, guest should pick up acpi-index of the device
>      currently plugged into slot
> what doesn't work:
>   1. device attached to host-bridge directly (work in progress)
>      (q35)
>   2. devices attached to any PXB port and any hierarchy hanging off it (there are no plans to make it work)
>      (q35, pc)

I'd say this is still relatively important, as the PXBs are needed
to create a NUMA placement aware topology for guests, and I'd say it
is undesirable to lose acpi-index if a guest is updated to be NUMA
aware, or if a guest image can be deployed in either normal or NUMA
aware setups.

>   3. devices plugged into hot-plugged bridges/root-ports
>      (hotplugged bridge lacks ACPI description) (hard to fix, maybe not possible)
>      (q35, pc)

Not so bothered about that, since I think generally mgmt apps
pre-plug sufficient bridges to cope.

>   4. multifunction devices (it's undefined by spec, hence not supported)

Not a big deal.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
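For readers who want to reproduce Laine's experiment, here is a minimal sketch of the <interface> fragment he describes, built with Python's standard library. The network name "default", the virtio model, and the helper names build_interface and expected_guest_name are assumptions for illustration only. The "eno<index>" mapping is taken from the single observation reported in this thread (index 4 gave "eno4"); the actual name depends on the guest's naming policy.

```python
import xml.etree.ElementTree as ET


def build_interface(network: str, acpi_index: int) -> ET.Element:
    """Build a libvirt <interface> element with an <acpi index='N'/> child.

    Hypothetical helper: the network name and the virtio model are
    illustrative choices, not something stated in the thread.
    """
    if acpi_index < 1:
        # Assumption: the index is a positive integer.
        raise ValueError("acpi index is assumed to be a positive integer")
    iface = ET.Element("interface", {"type": "network"})
    ET.SubElement(iface, "source", {"network": network})
    ET.SubElement(iface, "model", {"type": "virtio"})
    # No <address> element is added, so libvirt auto-assigns the PCI address
    # (on Q35 that normally means a pcie-root-port, per the thread).
    ET.SubElement(iface, "acpi", {"index": str(acpi_index)})
    return iface


def expected_guest_name(acpi_index: int) -> str:
    """Name observed in the thread for an onboard-index NIC (4 -> eno4)."""
    return f"eno{acpi_index}"


iface = build_interface("default", 4)
print(ET.tostring(iface, encoding="unicode"))
print(expected_guest_name(4))
```

The printed fragment can be pasted into the <devices> section of a domain definition; whether the guest then honours the index depends on the QEMU-side conditions listed above.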
Igor Mammedov
2022-Nov-02 15:58 UTC
Predictable and consistent net interface naming in guests
On Wed, 2 Nov 2022 15:20:39 +0000
Daniel P. Berrangé <berrange at redhat.com> wrote:

> On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
> > On Wed, 2 Nov 2022 10:43:10 -0400
> > Laine Stump <laine at redhat.com> wrote:
> >
> > > On 11/1/22 7:46 AM, Igor Mammedov wrote:
> > > > On Mon, 31 Oct 2022 14:48:54 +0000
> > > > Daniel P. Berrangé <berrange at redhat.com> wrote:
> > > >
> > > >> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> > > >>> Hi Igor and Laine,
> > > >>>
> > > >>> I would like to revive a 2 years old discussion [1] about consistent network
> > > >>> interfaces in the guest.
> > > >>>
> > > >>> That discussion mentioned that a guest PCI address may change in two cases:
> > > >>> - The PCI topology changes.
> > > >>> - The machine type changes.
> > > >>>
> > > >>> Usually, the machine type is not expected to change, especially if one
> > > >>> wants to allow migrations between nodes.
> > > >>> I would hope to argue this should not be problematic in practice, because
> > > >>> guest images would be made per a specific machine type.
> > > >>>
> > > >>> Regarding the PCI topology, I am not sure I understand what changes
> > > >>> need to occur to the domxml for a defined guest PCI address to change.
> > > >>> The only thing that I can think of is a scenario where hotplug/unplug is
> > > >>> used,
> > > >>> but even then I would expect existing devices to preserve their PCI address
> > > >>> and the plug/unplug device to have a reserved address managed by the one
> > > >>> acting on it (the management system).
> > > >>>
> > > >>> Could you please help clarify in which scenarios the PCI topology can cause
> > > >>> a mess to the naming of interfaces in the guest?
> > > >>>
> > > >>> Are there any plans to add the acpi_index support?
> > > >>
> > > >> This was implemented a year & a half ago
> > > >>
> > > >>   https://libvirt.org/formatdomain.html#network-interfaces
> > > >>
> > > >> though due to QEMU limitations this only works for the old
> > > >> i440fx chipset, not Q35 yet.
> > > >
> > > > Q35 should work partially too. In its case acpi-index support
> > > > is limited to hotplug enabled root-ports and PCIe-PCI bridges.
> > > > One also has to enable ACPI PCI hotplug (it's enabled by default
> > > > on recent machine types) for it to work (i.e. it's not supported
> > > > in native PCIe hotplug mode).
> > > >
> > > > So if mgmt can put nics on root-ports/bridges, then acpi-index
> > > > should just work on Q35 as well.
> > >
> > > With only a few exceptions (e.g. the first ich9 audio device, which is
> > > placed directly on the root bus at 00:1B.0 because that is where the
> > > ich9 audio device is located on actual Q35 hardware), libvirt will
> > > automatically put all PCI devices (including network interfaces) on a
> > > pcie-root-port.
> > >
> > > After seeing reports that "acpi index doesn't work with Q35
> > > machinetypes" I just assumed that was correct and didn't try it. But
> > > after seeing the "should work partially" statement above, I tried it
> > > just now and an <interface> of a Q35 guest that had its PCI address
> > > auto-assigned by libvirt (and so was placed on a pcie-root-port), and
> > > had <acpi index='4'/> was given the name "eno4". So what exactly is it
> > > that *doesn't* work?
> >
> > From QEMU side:
> > acpi-index requires:
> >   1. acpi pci hotplug enabled (which is default on relatively new q35 machine types)
> >   2. hotpluggable pci bus (root-port, various pci bridges)
> >   3. NIC can be cold or hotplugged, guest should pick up acpi-index of the device
> >      currently plugged into slot
> > what doesn't work:
> >   1. device attached to host-bridge directly (work in progress)
> >      (q35)
> >   2. devices attached to any PXB port and any hierarchy hanging off it (there are no plans to make it work)
> >      (q35, pc)
>
> I'd say this is still relatively important, as the PXBs are needed
> to create a NUMA placement aware topology for guests, and I'd say it
> is undesirable to lose acpi-index if a guest is updated to be NUMA
> aware, or if a guest image can be deployed in either normal or NUMA
> aware setups.

It's not only Q35 but also PC. We basically do not generate an ACPI
hierarchy for PXBs at all, so neither ACPI hotplug nor the acpi-index
that depends on it would work. It's been so for many years, and no one
has asked to enable ACPI hotplug on them so far.
CCing Amnon so he can ask around whether we have a possible customer for this.

> >   3. devices plugged into hot-plugged bridges/root-ports
> >      (hotplugged bridge lacks ACPI description) (hard to fix, maybe not possible)
> >      (q35, pc)
>
> Not so bothered about that, since I think generally mgmt apps
> pre-plug sufficient bridges to cope.
>
> >   4. multifunction devices (it's undefined by spec, hence not supported)
>
> Not a big deal.
>
> With regards,
> Daniel
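The requirements and limitations listed in this thread can be summarised as a small check that a management application might run before relying on acpi-index. This is only a sketch of the rules as stated here, not QEMU or libvirt code: the Placement class, its field names, and acpi_index_blockers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Placement:
    """Hypothetical description of where a NIC sits in the guest PCI topology."""
    chipset: str = "q35"              # "q35" or "pc" (i440fx)
    acpi_pci_hotplug: bool = True     # ACPI PCI hotplug, not native PCIe hotplug
    on_host_bridge: bool = False      # attached directly to the host bridge
    behind_pxb: bool = False          # on a PXB port or anywhere below one
    parent_hotplugged: bool = False   # parent bridge/root-port was hot-plugged
    multifunction: bool = False       # NIC is a multifunction device


def acpi_index_blockers(p: Placement) -> List[str]:
    """Return the reasons (per this thread) why acpi-index would not work."""
    blockers = []
    if not p.acpi_pci_hotplug:
        blockers.append("ACPI PCI hotplug is not enabled")
    if p.on_host_bridge and p.chipset == "q35":
        blockers.append("device attached directly to the host bridge (q35)")
    if p.behind_pxb:
        blockers.append("no ACPI hierarchy is generated for PXBs (q35, pc)")
    if p.parent_hotplugged:
        blockers.append("hot-plugged bridge lacks an ACPI description (q35, pc)")
    if p.multifunction:
        blockers.append("multifunction devices are undefined by the spec")
    return blockers


# A NIC on a cold-plugged pcie-root-port with default settings: no blockers.
print(acpi_index_blockers(Placement()))
# The NUMA-aware case raised in the thread: NIC placed behind a PXB.
print(acpi_index_blockers(Placement(behind_pxb=True)))
```

An empty list means none of the limitations named in the thread apply; it does not prove the guest will pick up the index, since that still depends on the guest OS.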