Michael S. Tsirkin
2017-Dec-05 19:20 UTC
[RFC] virtio-net: help live migrate SR-IOV devices
On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
> Then we'll have a single solution for both netvsc and virtio (and any
> other PV device).
> And we could handle the VF DMA dirt issue agnostically.

For the record, I won't block patches adding this kist to virtio on the basis that they must be generic. It's not a lot of code, implementation can come first, prettify later.

But we do need to have a discussion about how devices are paired. I am not sure using just MAC works. E.g. some passthrough devices don't give host ability to set the MAC. Are these worth worrying about?

--
MST
On Tue, 5 Dec 2017 21:20:07 +0200 "Michael S. Tsirkin" <mst at redhat.com> wrote:
> On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
> > Then we'll have a single solution for both netvsc and virtio (and any
> > other PV device).
> > And we could handle the VF DMA dirt issue agnostically.
>
> For the record, I won't block patches adding this kist to virtio
> on the basis that they must be generic. It's not a lot
> of code, implementation can come first, prettify later.

Thanks, based on this discussion we're going to work on improving virtio-net first, but some of Achiad's points are good. I don't believe it should block the virtio work however.

In particular I'm really interested in figuring out how we can get to the point that virtio is able to make or implement some smart decisions about which NIC to pick for traffic delivery (its own paravirt path or the passthrough device path), if Achiad wants to develop the idea into some code, I'd be interested to review it.

> But we do need to have a discussion about how devices are paired.
> I am not sure using just MAC works. E.g. some passthrough
> devices don't give host ability to set the MAC.
> Are these worth worrying about?

I personally don't think that will be much of a problem, if a certain device has that issue, can't we just have the virtio-net device pick up the MAC address of the passthrough device? As long as they match things should work OK. It at least is an initial way to do the configuration that has at least some traction as workable, as proved by the Microsoft design.

FWIW, the Intel SR-IOV devices all accept a hypervisor/host provided MAC address.
Michael S. Tsirkin
2017-Dec-05 22:05 UTC
[RFC] virtio-net: help live migrate SR-IOV devices
On Tue, Dec 05, 2017 at 01:52:26PM -0800, Jesse Brandeburg wrote:
> On Tue, 5 Dec 2017 21:20:07 +0200
> "Michael S. Tsirkin" <mst at redhat.com> wrote:
>
> > On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
> > > Then we'll have a single solution for both netvsc and virtio (and any
> > > other PV device).
> > > And we could handle the VF DMA dirt issue agnostically.
> >
> > For the record, I won't block patches adding this kist to virtio
> > on the basis that they must be generic. It's not a lot
> > of code, implementation can come first, prettify later.
>
> Thanks, based on this discussion we're going to work on improving
> virtio-net first, but some of Achiad's points are good. I don't believe
> it should block the virtio work however.
>
> In particular I'm really interested in figuring out how we can get to
> the point that virtio is able to make or implement some smart decisions
> about which NIC to pick for traffic delivery (its own paravirt path or
> the passthrough device path), if Achiad wants to develop the idea into
> some code, I'd be interested to review it.
>
> > But we do need to have a discussion about how devices are paired.
> > I am not sure using just MAC works. E.g. some passthrough
> > devices don't give host ability to set the MAC.
> > Are these worth worrying about?
>
> I personally don't think that will be much of a problem, if a
> certain device has that issue, can't we just have the virtio-net device
> pick up the MAC address of the passthrough device?

Then what do you do after you have migrated to another box? The PT device there likely has a different MAC.

> As long as they match
> things should work OK. It at least is an initial way to do the
> configuration that has at least some traction as workable, as proved by
> the Microsoft design.

Yes - that design just implements what people have been doing for years using bond so of course it's workable.

> FWIW, the Intel SR-IOV devices all accept a hypervisor/host provided
> MAC address.

For VFs you often can program the MAC through the PF, but you typically can't do this for PFs. Or as another example consider nested virt with a VF passed through. The PF isn't there within the L1 guest, so it can't be used to program the MAC of the VF.

Still, we can always start small and require the same MAC, and add other ways to address issues later as we come up with them.

--
MST
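[Editorial note] The MAC-matching rule discussed above, and the migration concern raised against it, can be sketched roughly as follows. This is a minimal illustration only, not code from any posted patch: the `NetDev` record, the `"virtio"`/`"passthrough"` labels, and `pair_by_mac` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


def normalize_mac(mac: str) -> str:
    """Canonicalize a MAC address to lowercase, colon-separated form."""
    digits = mac.replace("-", "").replace(":", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


@dataclass(frozen=True)
class NetDev:
    name: str
    kind: str  # "virtio" or "passthrough" (hypothetical labels)
    mac: str


def pair_by_mac(devices: Iterable[NetDev]) -> Dict[str, Optional[str]]:
    """Map each virtio device name to the passthrough device sharing its MAC.

    A virtio device with no matching passthrough device maps to None.
    """
    devices = list(devices)
    passthrough_by_mac: Dict[str, str] = {}
    for dev in devices:
        if dev.kind == "passthrough":
            # Keep the first passthrough device seen for a given MAC.
            passthrough_by_mac.setdefault(normalize_mac(dev.mac), dev.name)
    return {
        dev.name: passthrough_by_mac.get(normalize_mac(dev.mac))
        for dev in devices
        if dev.kind == "virtio"
    }
```

Under these assumptions, a guest whose VF carries the same MAC as its virtio-net device is paired, while after migration to a host whose passthrough device has a different MAC (and no way to reprogram it) the virtio device is left unpaired, which is the case the thread asks about.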
On 5 December 2017 at 21:20, Michael S. Tsirkin <mst at redhat.com> wrote:
> On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
>> Then we'll have a single solution for both netvsc and virtio (and any
>> other PV device).
>> And we could handle the VF DMA dirt issue agnostically.
>
> For the record, I won't block patches adding this kist to virtio
> on the basis that they must be generic. It's not a lot
> of code, implementation can come first, prettify later.

It's not a lot of code either way.
So I fail to understand why not to do it right from the beginning.
For the record...

>
> But we do need to have a discussion about how devices are paired.
> I am not sure using just MAC works. E.g. some passthrough
> devices don't give host ability to set the MAC.
> Are these worth worrying about?
>
> --
> MST
On Wed, Dec 6, 2017 at 11:28 PM, achiad shochat <achiad.mellanox at gmail.com> wrote:
> On 5 December 2017 at 21:20, Michael S. Tsirkin <mst at redhat.com> wrote:
>> On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
>>> Then we'll have a single solution for both netvsc and virtio (and any
>>> other PV device).
>>> And we could handle the VF DMA dirt issue agnostically.
>>
>> For the record, I won't block patches adding this kist to virtio
>> on the basis that they must be generic. It's not a lot
>> of code, implementation can come first, prettify later.
>
> It's not a lot of code either way.
> So I fail to understand why not to do it right from the beginning.
> For the record...

What isn't a lot of code? If you are talking about the DMA dirtying then I would have to disagree. The big problem with the DMA is that we have to mark a page as dirty and non-migratable as soon as it is mapped for Rx DMA. It isn't until the driver has either unmapped the page or the device has been disabled that we can then allow the page to be migrated for being dirty. That ends up being the way we have to support this if we don't have the bonding solution.

With the bonding solution we could look at doing a lightweight DMA dirtying which would just require flagging pages as dirty after an unmap or sync call is performed. However it requires that we shut down the driver/device before we can complete the migration which means we have to have the paravirtualized fail-over approach.

As far as indicating that the interfaces are meant to be enslaved I wonder if we couldn't look at tweaking the PCI layout of the guest and use that to indicate that a given set of interfaces are meant to be bonded. For example the VFs are all meant to work as a part of a multi-function device. What if we were to make virtio-net function 0 of a PCI/PCIe device, and then place any direct assigned VFs that are meant to be a part of the bond in functions 1-7 of the device? Then it isn't too far off from the model we have on the host where if the VF goes away we would expect to see the traffic on the PF that is usually occupying function 0 of a given device.
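
[Editorial note] The PCI-layout idea above (virtio-net at function 0, bonded VFs at functions 1-7 of the same device) could be expressed as a grouping rule along the following lines. This is only a sketch under stated assumptions: the `(address, driver)` tuples, the `"virtio-net"` and `"vf"` driver labels, and the `group_bonds` helper are all hypothetical, and nothing in the thread specifies how a guest would actually discover or act on this layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_bdf(bdf: str) -> Tuple[str, int]:
    """Split a PCI address like '0000:00:05.1' into (slot, function)."""
    slot, _, func = bdf.rpartition(".")
    function = int(func, 16)  # raises ValueError on a malformed function
    if not slot or not 0 <= function <= 7:
        raise ValueError(f"not a PCI address: {bdf!r}")
    return slot, function


def group_bonds(devices: Iterable[Tuple[str, str]]) -> Dict[str, dict]:
    """Group functions of one PCI device into a proposed bond.

    devices: (pci_address, driver_label) pairs; labels are hypothetical.
    A slot forms a bond only if function 0 is the virtio-net device;
    every other function in that slot is treated as a bonded VF.
    """
    by_slot: Dict[str, Dict[int, Tuple[str, str]]] = defaultdict(dict)
    for bdf, driver in devices:
        slot, function = parse_bdf(bdf)
        by_slot[slot][function] = (bdf, driver)

    bonds: Dict[str, dict] = {}
    for slot, funcs in by_slot.items():
        primary = funcs.get(0)
        if primary is None or primary[1] != "virtio-net":
            continue  # no paravirtual function 0: not a bond under this rule
        vfs: List[str] = [funcs[fn][0] for fn in sorted(funcs) if fn != 0]
        bonds[slot] = {"primary": primary[0], "vfs": vfs}
    return bonds
```

With this rule a VF placed in a slot of its own (no virtio-net at function 0) is simply left unbonded, and pairing no longer depends on the MAC address at all.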