On Thu, 30 Nov 2017 11:29:56 +0800, Jason Wang wrote:
> On 2017/11/29 03:27, Jesse Brandeburg wrote:
> > Hi, I'd like to get some feedback on a proposal to enhance virtio-net
> > to ease configuration of a VM and that would enable live migration of
> > passthrough network SR-IOV devices.
> >
> > Today we have SR-IOV network devices (VFs) that can be passed into a VM
> > in order to enable high performance networking directly within the VM.
> > The problem I am trying to address is that this configuration is
> > generally difficult to live-migrate.  There is documentation [1]
> > indicating that some OS/Hypervisor vendors will support live migration
> > of a system with a direct assigned networking device.  The problem I
> > see with these implementations is that the network configuration
> > requirements that are passed on to the owner of the VM are quite
> > complicated.  You have to set up bonding, you have to configure it to
> > enslave two interfaces, those interfaces (one is virtio-net, the other
> > is an SR-IOV device/driver like ixgbevf) must support MAC address
> > changes requested in the VM, and on and on...
> >
> > So, on to the proposal:
> > Modify the virtio-net driver to be a single VM network device that
> > enslaves an SR-IOV network device (inside the VM) with the same MAC
> > address.  This would cause the virtio-net driver to appear and work
> > like a simplified bonding/team driver.  The live migration problem
> > would be solved just like today's bonding solution, but the VM user's
> > networking config would be greatly simplified.
> >
> > At its simplest, it would appear something like this in the VM:
> >
> >   ============
> >   =  vnet0   =
> >   = (virtio- =----+
> >   =  net)    =    |
> >   =          =  ============
> >   =          =  = ixgbevf  =
> >   ============  ============
> >
> > (forgive the ASCII art)
> >
> > The fast path traffic would prefer the ixgbevf or other SR-IOV device
> > path, and fall back to virtio's transmit/receive when migrating.
> >
> > Compared to today's options this proposal would
> > 1) make virtio-net more sticky, allow fast path traffic at SR-IOV
> >    speeds
> > 2) simplify end user configuration in the VM (most if not all of the
> >    set up to enable migration would be done in the hypervisor)
> > 3) allow live migration via a simple link down and maybe a PCI
> >    hot-unplug of the SR-IOV device, with failover to the virtio-net
> >    driver core
> > 4) allow vendor agnostic hardware acceleration, and live migration
> >    between vendors if the VM OS has driver support for all the
> >    required SR-IOV devices.
> >
> > Runtime operation proposed:
> > - <in either order> virtio-net driver loads, SR-IOV driver loads
> > - virtio-net finds other NICs that match its MAC address, both by
> >   examining existing interfaces and by setting up a new device notifier
> > - virtio-net enslaves the first NIC with the same MAC address
> > - virtio-net brings up the slave, and makes it the "preferred" path
> > - virtio-net follows the behavior of an active backup bond/team
> > - virtio-net acts as the interface to the VM
> > - live migration initiates
> > - link goes down on SR-IOV, or SR-IOV device is removed
> > - failover to virtio-net as primary path
> > - migration continues to new host
> > - new host is started with virtio-net as primary
> > - if no SR-IOV, virtio-net stays primary
> > - hypervisor can hot-add SR-IOV NIC, with same MAC addr as virtio
> > - virtio-net notices new NIC and starts over at enslave step above
> >
> > Future ideas (brainstorming):
> > - Optimize fast east-west by having special rules to direct east-west
> >   traffic through the virtio-net traffic path
> >
> > Thanks for reading!
> > Jesse
>
> Cc netdev.
>
> Interesting, and this method is actually used by netvsc now:
>
> commit 0c195567a8f6e82ea5535cd9f1d54a1626dd233e
> Author: stephen hemminger <stephen at networkplumber.org>
> Date:   Tue Aug 1 19:58:53 2017 -0700
>
>     netvsc: transparent VF management
>
>     This patch implements transparent fail over from synthetic NIC to
>     SR-IOV virtual function NIC in Hyper-V environment. It is a better
>     alternative to using bonding as is done now. Instead, the receive and
>     transmit fail over is done internally inside the driver.
>
>     Using bonding driver has lots of issues because it depends on the
>     script being run early enough in the boot process and with sufficient
>     information to make the association. This patch moves all that
>     functionality into the kernel.
>
>     Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com>
>     Signed-off-by: David S. Miller <davem at davemloft.net>
>
> If my understanding is correct there's no need for any extension of the
> virtio spec.  If this is true, maybe you can start to prepare the patch?

IMHO this is as close to policy in the kernel as one can get.  User
land has all the information it needs to instantiate that bond/team
automatically.  In fact I'm trying to discuss this with NetworkManager
folks and Red Hat right now:

https://mail.gnome.org/archives/networkmanager-list/2017-November/msg00038.html

Can we flip the argument and ask why the kernel is supposed to be
responsible for this?  It's not like we run DHCP out of the kernel
on new interfaces...
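The runtime operation proposed above boils down to a netdevice notifier
inside the guest that matches NICs by MAC address and switches the
preferred data path.  A minimal sketch of that MAC-matching step,
assuming hypothetical names (failover_master, virtnet_bypass_enslave()
and virtnet_bypass_release() are invented here, not existing virtio-net
symbols, since no actual patch exists yet), could look roughly like:

/*
 * Hypothetical sketch only: failover_master and the virtnet_bypass_*()
 * hooks are invented names, not virtio-net symbols.
 */
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/notifier.h>

/* The virtio-net netdev acting as the always-present "master" path;
 * assumed to be filled in by the driver at probe time. */
static struct net_device *failover_master;

/* Hypothetical hooks into the driver's enslave/fail-over logic. */
static void virtnet_bypass_enslave(struct net_device *slave)
{
	/* link as an upper device, bring it up, mark it preferred */
}

static void virtnet_bypass_release(struct net_device *slave)
{
	/* drop the slave, fall back to the virtio TX/RX queues */
}

static int failover_netdev_event(struct notifier_block *nb,
				 unsigned long event, void *ptr)
{
	struct net_device *dev = netdev_notifier_info_to_dev(ptr);

	if (!failover_master || dev == failover_master)
		return NOTIFY_DONE;

	/* Only care about NICs whose MAC matches the virtio device. */
	if (!ether_addr_equal(dev->dev_addr, failover_master->dev_addr))
		return NOTIFY_DONE;

	switch (event) {
	case NETDEV_REGISTER:
		/* SR-IOV VF (or any NIC) with the same MAC showed up:
		 * make it the preferred data path. */
		virtnet_bypass_enslave(dev);
		break;
	case NETDEV_GOING_DOWN:
	case NETDEV_UNREGISTER:
		/* Link dropped or device hot-unplugged for migration:
		 * fail over to the paravirtual path. */
		virtnet_bypass_release(dev);
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block failover_notifier = {
	.notifier_call = failover_netdev_event,
};

static int __init failover_init(void)
{
	return register_netdevice_notifier(&failover_notifier);
}

static void __exit failover_exit(void)
{
	unregister_netdevice_notifier(&failover_notifier);
}

module_init(failover_init);
module_exit(failover_exit);
MODULE_LICENSE("GPL");

This is roughly the shape of the "transparent VF management" approach
the netvsc commit above describes: the association is made inside the
driver, so no early-boot script needs to know about it.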
Stephen Hemminger
2017-Nov-30 04:10 UTC
[RFC] virtio-net: help live migrate SR-IOV devices
On Wed, 29 Nov 2017 19:51:38 -0800 Jakub Kicinski <jakub.kicinski at netronome.com> wrote:

> On Thu, 30 Nov 2017 11:29:56 +0800, Jason Wang wrote:
> > On 2017/11/29 03:27, Jesse Brandeburg wrote:
> > [...]
> >
> > If my understanding is correct there's no need for any extension
> > of the virtio spec.  If this is true, maybe you can start to prepare
> > the patch?
>
> IMHO this is as close to policy in the kernel as one can get.  User
> land has all the information it needs to instantiate that bond/team
> automatically.  In fact I'm trying to discuss this with NetworkManager
> folks and Red Hat right now:
>
> https://mail.gnome.org/archives/networkmanager-list/2017-November/msg00038.html
>
> Can we flip the argument and ask why the kernel is supposed to be
> responsible for this?  It's not like we run DHCP out of the kernel
> on new interfaces...

Although "policy should not be in the kernel" is a great mantra,
it is not practical in the real world.

If you think it can be solved in userspace, then you haven't had to
deal with four different network initialization systems, multiple
orchestration systems, and customers on ancient Enterprise
distributions.
On Wed, 29 Nov 2017 20:10:09 -0800, Stephen Hemminger wrote:

> On Wed, 29 Nov 2017 19:51:38 -0800 Jakub Kicinski wrote:
> > On Thu, 30 Nov 2017 11:29:56 +0800, Jason Wang wrote:
> > > On 2017/11/29 03:27, Jesse Brandeburg wrote:
> > > [...]
> > >
> > > If my understanding is correct there's no need for any extension
> > > of the virtio spec.  If this is true, maybe you can start to
> > > prepare the patch?
> >
> > IMHO this is as close to policy in the kernel as one can get.  User
> > land has all the information it needs to instantiate that bond/team
> > automatically.  In fact I'm trying to discuss this with NetworkManager
> > folks and Red Hat right now:
> >
> > https://mail.gnome.org/archives/networkmanager-list/2017-November/msg00038.html
> >
> > Can we flip the argument and ask why the kernel is supposed to be
> > responsible for this?  It's not like we run DHCP out of the kernel
> > on new interfaces...
>
> Although "policy should not be in the kernel" is a great mantra,
> it is not practical in the real world.
>
> If you think it can be solved in userspace, then you haven't had to
> deal with four different network initialization systems, multiple
> orchestration systems, and customers on ancient Enterprise
> distributions.

I would accept that argument if anyone ever tried to get those
Enterprise distros to handle this use case.  From conversations I had
it seemed like no one ever did, and SR-IOV+virtio bonding is what has
been done to solve this since day 1 of SR-IOV networking.

For practical reasons it's easier to push this into the kernel, because
vendors rarely employ developers of the user space orchestration
systems.  Is that not the real problem here, potentially? :)
Michael S. Tsirkin
2017-Nov-30 13:54 UTC
[RFC] virtio-net: help live migrate SR-IOV devices
On Wed, Nov 29, 2017 at 07:51:38PM -0800, Jakub Kicinski wrote:

> On Thu, 30 Nov 2017 11:29:56 +0800, Jason Wang wrote:
> > On 2017/11/29 03:27, Jesse Brandeburg wrote:
> > [...]
> >
> > If my understanding is correct there's no need for any extension of
> > the virtio spec.  If this is true, maybe you can start to prepare the
> > patch?
>
> IMHO this is as close to policy in the kernel as one can get.  User
> land has all the information it needs to instantiate that bond/team
> automatically.

It does have this info (MAC addresses match), but where's the policy
here?  IMHO the policy has been set by the hypervisor already.
From the hypervisor's POV, adding passthrough is a commitment not to
migrate until the guest stops using the passthrough device.

Within the guest, the bond is required for purely functional reasons -
just to maintain a link up since we know SR-IOV will go away.
Maintaining an uninterrupted connection is not a policy - it's what
networking is about.

> In fact I'm trying to discuss this with NetworkManager
> folks and Red Hat right now:
>
> https://mail.gnome.org/archives/networkmanager-list/2017-November/msg00038.html

I thought we should do it too, for a while.

But now, I think that the real issue is this: the kernel exposes what
looks like two network devices to userspace, but in fact it is just one
backend device, just exposed by the hypervisor in a weird way for
compatibility reasons.

For example, you will not get better reliability or throughput by using
both of them - the only bonding mode that makes sense is fail over.

As another example, if the underlying physical device lost its link,
trying to use virtio won't help - it's only useful when the passthrough
device is gone for good.  As another example, there is no point in not
configuring a bond.  As a last example, depending on how the backend is
configured, virtio might not even work when the passthrough device is
active.

So from that point of view, showing two network devices to userspace is
a bug that we are asking userspace to work around.

> Can we flip the argument and ask why the kernel is supposed to be
> responsible for this?

Because if we show a single device to userspace the number of
misconfigured guests will go down, and we won't lose any useful
flexibility.

> It's not like we run DHCP out of the kernel
> on new interfaces...

Because one can set up a static IP, IPv6 doesn't always need DHCP, etc.

--
MST
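To make the fail-over-only argument concrete, here is a rough sketch of
what the transmit-side decision looks like, loosely modeled on the
netvsc approach quoted earlier; struct failover_priv and
virtio_fallback_xmit() are invented placeholders, not symbols from
either driver:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Invented private state: the currently enslaved VF, if any. */
struct failover_priv {
	struct net_device __rcu *vf_netdev;
};

/* Stand-in for the normal virtio-net transmit path (placeholder only). */
static netdev_tx_t virtio_fallback_xmit(struct sk_buff *skb,
					struct net_device *dev)
{
	/* ... would queue the skb on a virtio TX virtqueue ... */
	dev_kfree_skb_any(skb);
	return NETDEV_TX_OK;
}

static netdev_tx_t failover_start_xmit(struct sk_buff *skb,
				       struct net_device *dev)
{
	struct failover_priv *priv = netdev_priv(dev);
	struct net_device *vf;

	rcu_read_lock();
	vf = rcu_dereference(priv->vf_netdev);
	if (vf && netif_running(vf) && netif_carrier_ok(vf)) {
		/* Fast path: hand the packet to the passthrough VF. */
		skb->dev = vf;
		rcu_read_unlock();
		dev_queue_xmit(skb);
		return NETDEV_TX_OK;
	}
	rcu_read_unlock();

	/* Slow path: VF gone or link down (e.g. during migration),
	 * fall back to the paravirtual queues. */
	return virtio_fallback_xmit(skb, dev);
}

The key property is that only the fail-over decision lives in the
driver; addresses, routes and everything else userspace configures stay
on the single visible netdev.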
On Thu, 30 Nov 2017 15:54:40 +0200, Michael S. Tsirkin wrote:

> On Wed, Nov 29, 2017 at 07:51:38PM -0800, Jakub Kicinski wrote:
> > On Thu, 30 Nov 2017 11:29:56 +0800, Jason Wang wrote:
> > > [...]
> >
> > IMHO this is as close to policy in the kernel as one can get.  User
> > land has all the information it needs to instantiate that bond/team
> > automatically.
>
> It does have this info (MAC addresses match), but where's the policy
> here?  IMHO the policy has been set by the hypervisor already.
> From the hypervisor's POV, adding passthrough is a commitment not to
> migrate until the guest stops using the passthrough device.
>
> Within the guest, the bond is required for purely functional reasons -
> just to maintain a link up since we know SR-IOV will go away.
> Maintaining an uninterrupted connection is not a policy - it's what
> networking is about.
>
> > In fact I'm trying to discuss this with NetworkManager
> > folks and Red Hat right now:
> >
> > https://mail.gnome.org/archives/networkmanager-list/2017-November/msg00038.html
>
> I thought we should do it too, for a while.
>
> But now, I think that the real issue is this: the kernel exposes what
> looks like two network devices to userspace, but in fact it is just one
> backend device, just exposed by the hypervisor in a weird way for
> compatibility reasons.
>
> For example, you will not get better reliability or throughput by using
> both of them - the only bonding mode that makes sense is fail over.

Yes, I'm talking about fail over.

> As another example, if the underlying physical device lost its link,
> trying to use virtio won't help - it's only useful when the passthrough
> device is gone for good.  As another example, there is no point in not
> configuring a bond.  As a last example, depending on how the backend is
> configured, virtio might not even work when the passthrough device is
> active.
>
> So from that point of view, showing two network devices to userspace is
> a bug that we are asking userspace to work around.

I'm confused by what you're saying here.  IIRC the question is whether
we expose 2 netdevs or 3.  There will always be a virtio netdev and a
VF netdev.  I assume you're not suggesting hiding the VF netdev.

So the question is: do we expose a VF netdev and a combo virtio netdev
which is also a bond, or do we expose a VF netdev, a virtio netdev, and
an active/passive bond/team, which is a well understood and
architecturally correct construct?

> > Can we flip the argument and ask why the kernel is supposed to be
> > responsible for this?
>
> Because if we show a single device to userspace the number of
> misconfigured guests will go down, and we won't lose any useful
> flexibility.

Again, single device?

> > It's not like we run DHCP out of the kernel
> > on new interfaces...
>
> Because one can set up a static IP, IPv6 doesn't always need DHCP, etc.

But we don't handle LACP, etc.

Look, as much as I don't like this, I'm not going to argue about this
to death.  I just find it very dishonest to claim the kernel *has to*
do it, when no one seems to have made any honest attempts to solve this
in user space for the last 10 years :/
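For comparison, the userspace alternative being argued for above (VF
netdev + virtio netdev + active/passive bond) can be sketched as a
small program driving the bonding driver's sysfs interface.  The
interface names ens3 (virtio-net) and ens4 (VF) are placeholders, the
sketch assumes the bonding module is already loaded, and constraints
such as whether slaves must be down when enslaved vary by kernel
version:

/* Rough sketch: build an active-backup bond over virtio-net + VF by
 * hand, via /sys/class/net/...  "ens3"/"ens4" are placeholder names. */
#include <stdio.h>
#include <stdlib.h>

static void sysfs_write(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f || fputs(val, f) == EOF) {
		perror(path);
		exit(1);
	}
	fclose(f);
}

int main(void)
{
	/* Create bond0 and make it an active-backup (fail-over) bond. */
	sysfs_write("/sys/class/net/bonding_masters", "+bond0");
	sysfs_write("/sys/class/net/bond0/bonding/mode", "active-backup");
	sysfs_write("/sys/class/net/bond0/bonding/miimon", "100");

	/* Avoid MAC rewrites on the slaves (the VF often can't do them). */
	sysfs_write("/sys/class/net/bond0/bonding/fail_over_mac", "active");

	/* Enslave the virtio-net and VF interfaces, prefer the VF. */
	sysfs_write("/sys/class/net/bond0/bonding/slaves", "+ens3");
	sysfs_write("/sys/class/net/bond0/bonding/slaves", "+ens4");
	sysfs_write("/sys/class/net/bond0/bonding/primary", "ens4");

	return 0;
}

In practice this would be done by the distro's network scripts,
NetworkManager or an orchestration system rather than a one-off
program; the point is only that everything needed (the matching MAC,
which device is the VF) is already visible to userspace.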