On Mon, Mar 14, 2016 at 01:13:24PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> > Michael pointed out that the virtio-vsock draft specification does not
> > address live migration and in fact currently precludes migration.
> >
> > Migration is fundamental so the device specification at least mustn't
> > preclude it. Having brainstormed migration with Matthew Benjamin and
> > Michael Tsirkin, I am now summarizing the approach that I want to
> > include in the next draft specification.
> >
> > Feedback and comments welcome! In the meantime I will implement this in
> > code and update the draft specification.
>
> Most of the issue seems to be a consequence of using a 4 byte CID.
>
> I think the right thing to do is just to teach guests
> about 64 bit CIDs.
>
> For now, can we drop guest CID from guest to host communication completely,
> making CID only host-visible? Maybe leave the space
> in the packet so we can add CID there later.
> It seems that in theory this will allow changing CID
> during migration, transparently to the guest.
>
> Guest visible CID is required for guest to guest communication -
> but IIUC that is not currently supported.
> Maybe that can be made conditional on 64 bit addressing.
> Alternatively, it seems much easier to accept that these channels get broken
> across migration.

I reached the conclusion that channels break across migration because:

1. 32-bit CIDs are in sockaddr_vm and we'd break AF_VSOCK ABI by
   changing it to 64-bit. Application code would be specific to
   virtio-vsock and wouldn't work with other AF_VSOCK transports that
   use the 32-bit sockaddr_vm struct.

2. Dropping guest CIDs from the protocol breaks network protocols that
   send addresses. NFS and netperf are the first two protocols I looked
   at and both transmit address information across the connection...
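
To illustrate point 1, here is a minimal Python sketch of why a 64-bit CID does not fit the existing address structure. The field layout (16-bit family, 16-bit reserved, 32-bit port, 32-bit CID, zero padding up to 16 bytes) is my reading of struct sockaddr_vm in the Linux UAPI header of that period. The helper names and the fallback value 40 for AF_VSOCK are assumptions for illustration, not taken from the thread.

```python
import socket
import struct

# AF_VSOCK is 40 on Linux; fall back to that value where Python's socket
# module does not export the constant (assumption, for portability only).
AF_VSOCK = getattr(socket, "AF_VSOCK", 40)

# Assumed layout of struct sockaddr_vm: family, reserved, port, cid,
# then zero padding up to sizeof(struct sockaddr) == 16 bytes.
SOCKADDR_VM = struct.Struct("=HHII4x")


def pack_sockaddr_vm(cid: int, port: int) -> bytes:
    """Hypothetical helper: serialize an address with a 32-bit CID field."""
    return SOCKADDR_VM.pack(AF_VSOCK, 0, port, cid)


def fits_sockaddr_vm(cid: int) -> bool:
    """Return True if the CID can be stored in the 32-bit svm_cid field."""
    try:
        pack_sockaddr_vm(cid, 0)
    except struct.error:
        return False
    return True
```

Any CID above 0xFFFFFFFF is rejected by the 32-bit field, so carrying it would need a different structure layout, which is the ABI break described in point 1.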
On Tue, Mar 15, 2016 at 03:15:29PM +0000, Stefan Hajnoczi wrote:
> On Mon, Mar 14, 2016 at 01:13:24PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> > > Michael pointed out that the virtio-vsock draft specification does not
> > > address live migration and in fact currently precludes migration.
> > >
> > > Migration is fundamental so the device specification at least mustn't
> > > preclude it. Having brainstormed migration with Matthew Benjamin and
> > > Michael Tsirkin, I am now summarizing the approach that I want to
> > > include in the next draft specification.
> > >
> > > Feedback and comments welcome! In the meantime I will implement this in
> > > code and update the draft specification.
> >
> > Most of the issue seems to be a consequence of using a 4 byte CID.
> >
> > I think the right thing to do is just to teach guests
> > about 64 bit CIDs.
> >
> > For now, can we drop guest CID from guest to host communication completely,
> > making CID only host-visible? Maybe leave the space
> > in the packet so we can add CID there later.
> > It seems that in theory this will allow changing CID
> > during migration, transparently to the guest.
> >
> > Guest visible CID is required for guest to guest communication -
> > but IIUC that is not currently supported.
> > Maybe that can be made conditional on 64 bit addressing.
> > Alternatively, it seems much easier to accept that these channels get broken
> > across migration.
>
> I reached the conclusion that channels break across migration because:
>
> 1. 32-bit CIDs are in sockaddr_vm and we'd break AF_VSOCK ABI by
>    changing it to 64-bit. Application code would be specific
>    virtio-vsock and wouldn't work with other AF_VSOCK transports that
>    use the 32-bit sockaddr_vm struct.

You don't have to repeat the IPv6 mistake. Make all 32 bit CIDs
64 bit CIDs by padding with 0s, then 64 bit apps can use
any CID.

Old 32 bit CID applications will not be able to use the extended
addresses, but hardcoding bugs
does not seem sane.

> 2. Dropping guest CIDs from the protocol breaks network protocols that
>    send addresses.

Stick it in config space if you really have to.
But why do you need it on each packet?

> NFS and netperf are the first two protocols I looked
>    at and both transmit address information across the connection...

Does netperf really attempt to get local IP
and then send that inline within the connection?

--
MST
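
The zero-padding idea can be sketched as follows. This is only an illustration of the arithmetic in Python, not a proposed wire format: the little-endian 8-byte encoding and the names widen_cid and narrow_cid are assumptions.

```python
import struct
from typing import Optional

CID32_MAX = 0xFFFFFFFF  # largest CID a legacy 32-bit field can hold


def widen_cid(cid32: int) -> bytes:
    """Encode a 32-bit CID as 8 little-endian bytes with the upper 4 bytes zero."""
    if not 0 <= cid32 <= CID32_MAX:
        raise ValueError("not a 32-bit CID")
    return struct.pack("<I", cid32) + b"\x00" * 4


def narrow_cid(raw: bytes) -> Optional[int]:
    """Decode an 8-byte CID; return None if a 32-bit application cannot hold it."""
    (cid64,) = struct.unpack("<Q", raw)
    return cid64 if cid64 <= CID32_MAX else None
```

Every existing 32-bit CID round-trips unchanged through the padded form. A CID above CID32_MAX decodes to None here, which corresponds to the case where old 32 bit CID applications cannot use the extended addresses.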
On Tue, Mar 15, 2016 at 06:12:55PM +0200, Michael S. Tsirkin wrote:
> On Tue, Mar 15, 2016 at 03:15:29PM +0000, Stefan Hajnoczi wrote:
> > On Mon, Mar 14, 2016 at 01:13:24PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> > > > Michael pointed out that the virtio-vsock draft specification does not
> > > > address live migration and in fact currently precludes migration.
> > > >
> > > > Migration is fundamental so the device specification at least mustn't
> > > > preclude it. Having brainstormed migration with Matthew Benjamin and
> > > > Michael Tsirkin, I am now summarizing the approach that I want to
> > > > include in the next draft specification.
> > > >
> > > > Feedback and comments welcome! In the meantime I will implement this in
> > > > code and update the draft specification.
> > >
> > > Most of the issue seems to be a consequence of using a 4 byte CID.
> > >
> > > I think the right thing to do is just to teach guests
> > > about 64 bit CIDs.
> > >
> > > For now, can we drop guest CID from guest to host communication completely,
> > > making CID only host-visible? Maybe leave the space
> > > in the packet so we can add CID there later.
> > > It seems that in theory this will allow changing CID
> > > during migration, transparently to the guest.
> > >
> > > Guest visible CID is required for guest to guest communication -
> > > but IIUC that is not currently supported.
> > > Maybe that can be made conditional on 64 bit addressing.
> > > Alternatively, it seems much easier to accept that these channels get broken
> > > across migration.
> >
> > I reached the conclusion that channels break across migration because:
> >
> > 1. 32-bit CIDs are in sockaddr_vm and we'd break AF_VSOCK ABI by
> >    changing it to 64-bit. Application code would be specific
> >    virtio-vsock and wouldn't work with other AF_VSOCK transports that
> >    use the 32-bit sockaddr_vm struct.
>
> You don't have to repeat the IPv6 mistake. Make all 32 bit CIDs
> 64 bit CIDs by padding with 0s, then 64 bit apps can use
> any CID.
>
> Old 32 bit CID applications will not be able to use the extended
> addresses, but hardcoding bugs
> does not seem sane.

A mixed 32-bit and 64-bit CID world is complex. The host doesn't know
in advance whether all applications (especially inside the guest) will
support 64-bit CIDs or not. 32-bit CID applications won't work if a
64-bit CID has been assigned.

It also opens up the question of how unique CIDs are allocated across
hosts.

Given that AF_VSOCK in Linux already exists in the 32-bit CID version,
I'd prefer to make virtio-vsock compatible with that for the time being.
Extensions can be added in the future but just implementing existing
AF_VSOCK semantics will already allow the applications to run.

> > 2. Dropping guest CIDs from the protocol breaks network protocols that
> >    send addresses.
>
> Stick it in config space if you really have to.
> But why do you need it on each packet?

If packets are implicitly guest<->host then adding guest<->guest
communication requires a virtio spec change. If packets contain
source/destination CIDs then allowing/forbidding guest<->host or
guest<->guest communication is purely a host policy decision. I think
it's worth keeping that in from the start.

> > NFS and netperf are the first two protocols I looked
> >    at and both transmit address information across the connection...
>
> Does netperf really attempt to get local IP
> and then send that inline within the connection?

Yes, netperf has separate control and data sockets. I think part of the
reason for this split is that the control connection can communicate the
address details for the data connection over a different protocol (TCP +
RDMA?), but I'm not sure.

Stefan
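
The host-policy argument can be sketched like this. It is a minimal Python model, assuming each packet carries a source and a destination CID; the names Packet, HostPolicy and may_deliver are hypothetical, and the value 2 for the host CID follows VMADDR_CID_HOST in Linux.

```python
from dataclasses import dataclass

HOST_CID = 2  # VMADDR_CID_HOST in Linux; guests get other CIDs


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet header carrying both endpoints explicitly."""
    src_cid: int
    dst_cid: int


@dataclass(frozen=True)
class HostPolicy:
    """Hypothetical host-side switches; nothing here is visible to the guest."""
    allow_guest_to_host: bool = True
    allow_guest_to_guest: bool = False


def may_deliver(pkt: Packet, policy: HostPolicy) -> bool:
    """Decide delivery from the CIDs in the packet and the host's policy."""
    if HOST_CID in (pkt.src_cid, pkt.dst_cid):
        return policy.allow_guest_to_host
    return policy.allow_guest_to_guest
```

With the CIDs in every packet, enabling guest<->guest traffic in this model means flipping allow_guest_to_guest on the host, with no change to the packet format. If packets were implicitly guest<->host there would be no dst_cid for such a check to inspect.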