thr3ads.net - Linux Virtualization - [virtio-dev] virtio-vsock live migration [Mar 2016]

Stefan Hajnoczi

2016-Mar-03 15:37 UTC

virtio-vsock live migration

Michael pointed out that the virtio-vsock draft specification does not
address live migration and in fact currently precludes migration.

Migration is fundamental so the device specification at least mustn't
preclude it. Having brainstormed migration with Matthew Benjamin and
Michael Tsirkin, I am now summarizing the approach that I want to
include in the next draft specification.

Feedback and comments welcome! In the meantime I will implement this in
code and update the draft specification.

1. Requirements

Virtio-vsock is a new AF_VSOCK transport. As such, it should provide at
least the same guarantees as the existing AF_VSOCK VMCI transport. This
is for consistency and to allow code reuse across any AF_VSOCK
transport.

Virtio-vsock aims to replace virtio-serial by providing the same
guest/host communication ability but with sockets API semantics that are
more popular and convenient for application developers. Therefore
virtio-vsock migration should provide at least the same level of
migration functionality as virtio-serial.

Ideally it should be possible to migrate applications using AF_VSOCK
together with the virtual machine so that guest<->host communication is
not interrupted. Neither AF_VSOCK VMCI nor virtio-serial supports this
today.

2. Basic disruptive migration flow

When the virtual machine migrates from the source host to the
destination host, the guest's CID may change. The CID namespace is
host-wide, so the guest's CID may already be in use on the destination
host. In that case the destination host allocates a new CID for the
incoming VM.
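
A minimal sketch of the collision case, assuming the destination host
tracks which CIDs are in use and that guest CIDs start at 3 (the host is
CID 2). The function and constant names are hypothetical and not taken
from the draft specification:

```python
# Hypothetical sketch: how a destination host might pick a CID for an
# incoming VM. Names are illustrative, not part of any spec or API.

FIRST_GUEST_CID = 3  # assumption: CIDs 0-2 are reserved, the host is CID 2


def assign_cid(preferred_cid, cids_in_use):
    """Return the CID the incoming guest will use on this host.

    Keeps the guest's old CID when it is free; otherwise returns the
    lowest unused guest CID, which means the guest's CID changes.
    """
    if preferred_cid >= FIRST_GUEST_CID and preferred_cid not in cids_in_use:
        return preferred_cid
    cid = FIRST_GUEST_CID
    while cid in cids_in_use:
        cid += 1
    return cid


# The source host gave the guest CID 5.
print(assign_cid(5, {3, 4}))     # CID 5 is free on the destination: kept
print(assign_cid(5, {3, 5, 6}))  # CID 5 is taken: the guest gets CID 4
```

Whenever the returned CID differs from the old one, the device has to
notify the guest.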

The device notifies the guest that the CID has changed. Guest sockets
are affected as follows:

* Established connections are reset (ECONNRESET) and the guest
application will have to reconnect.

* Listen sockets remain open. The only thing to note is that
connections from the host are now made to the new CID. This means
the local address of the listen socket is automatically updated to
the new CID.

* Sockets in other states are unchanged.

Applications must handle disruptive migration by reconnecting if
necessary after ECONNRESET.
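
The three rules above can be sketched as a small state model. This is
only an illustration of the intended guest-side behaviour; the class and
function names are made up for the example and do not come from any
driver:

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    LISTEN = "listen"
    CONNECTING = "connecting"
    ESTABLISHED = "established"
    RESET = "reset"  # the application sees ECONNRESET


@dataclass
class GuestSocket:
    local_cid: int
    local_port: int
    state: State


def on_cid_changed(sockets, new_cid):
    """Apply the disruptive migration rules to every guest socket."""
    for sock in sockets:
        if sock.state is State.ESTABLISHED:
            sock.state = State.RESET      # application must reconnect
        elif sock.state is State.LISTEN:
            sock.local_cid = new_cid      # stays open under the new CID
        # Sockets in any other state are left untouched.


sockets = [
    GuestSocket(5, 1234, State.LISTEN),
    GuestSocket(5, 40000, State.ESTABLISHED),
    GuestSocket(5, 40001, State.CONNECTING),
]
on_cid_changed(sockets, new_cid=7)
for sock in sockets:
    print(sock)
```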

3. Checkpoint/restore for seamless migration

Applications that wish to communicate across live migration can do so
but this requires extra application-specific checkpoint/restore code.

This is similar to the approach taken by the CRIU project where
getsockopt()/setsockopt() is used to migrate socket state. The
difference is that the application process is not automatically migrated
from the source host to the destination host. Therefore, the
application needs to migrate its own state somehow.

The flow is as follows:

The application on the source host must quiesce (stop sending/receiving)
and use getsockopt() to extract socket state information from the host
kernel.

A new instance of the application is started on the destination host and
given the state so it can restore the connection. The setsockopt()
syscall is used to restore socket state information.

The guest is given a list of <host_old_cid, host_new_cid, host_port,
guest_port> tuples for established connections that must not be reset
when the guest CID update notification is received. These connections
will carry on as if nothing changed.

Note that the connection's remote address is updated from host_old_cid
to host_new_cid. This allows remapping of CIDs (if necessary).
Typically this will be unused because the host always has the
well-known CID 2. In a guest<->guest scenario it may be used to remap
CIDs.
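
A rough sketch of how the guest could apply such a tuple list when the
CID update notification arrives, assuming connections are keyed by
(remote CID, remote port, local port). The data structure is
hypothetical; the draft does not define one:

```python
from dataclasses import dataclass


@dataclass
class Connection:
    remote_cid: int   # peer CID as seen by the guest
    remote_port: int  # host_port in the tuple
    local_port: int   # guest_port in the tuple
    reset: bool = False


def apply_cid_update(connections, preserved):
    """preserved holds (host_old_cid, host_new_cid, host_port, guest_port)."""
    remap = {(old, hport, gport): new for old, new, hport, gport in preserved}
    for conn in connections:
        key = (conn.remote_cid, conn.remote_port, conn.local_port)
        if key in remap:
            conn.remote_cid = remap[key]  # carries on, possibly remapped
        else:
            conn.reset = True             # disruptive path: ECONNRESET


conns = [
    Connection(2, 9000, 40000),   # host connection that was checkpointed
    Connection(2, 9001, 40001),   # host connection that was not
    Connection(8, 80, 40002),     # guest<->guest peer whose CID changes
]
apply_cid_update(conns, [(2, 2, 9000, 40000), (8, 11, 80, 40002)])
for conn in conns:
    print(conn)
```

Connections that are not listed fall back to the basic disruptive
behaviour, so the tuple list is purely an opt-in.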

For the time being I am focussing on the basic disruptive migration flow
only. Checkpoint/restore can be added with a feature bit in the future.
It is a lot more complex and I'm not sure whether there will be any
users yet.

Stefan

Michael S. Tsirkin

2016-Mar-10 23:56 UTC

virtio-vsock live migration

On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> Michael pointed out that the virtio-vsock draft specification does not
> address live migration and in fact currently precludes migration.
> 
> Migration is fundamental so the device specification at least mustn't
> preclude it.  Having brainstormed migration with Matthew Benjamin and
> Michael Tsirkin, I am now summarizing the approach that I want to
> include in the next draft specification.
> 
> Feedback and comments welcome!  In the meantime I will implement this in
> code and update the draft specification.
> 
> 1. Requirements
> 
> Virtio-vsock is a new AF_VSOCK transport.  As such, it should provide at
> least the same guarantees as the existing AF_VSOCK VMCI transport.  This
> is for consistency and to allow code reuse across any AF_VSOCK
> transport.
> 
> Virtio-vsock aims to replace virtio-serial by providing the same
> guest/host communication ability but with sockets API semantics that are
> more popular and convenient for application developers.  Therefore
> virtio-vsock migration should provide at least the same level of
> migration functionality as virtio-serial.
> 
> Ideally it should be possible to migrate applications using AF_VSOCK
> together with the virtual machine so that guest<->host communication is
> interrupted.  Neither AF_VSOCK VMCI nor virtio-serial support this
> today.
I'm not sure why you say this about virtio-serial.
It appears that if the host pre-connects to the destination
QEMU before migration, the backend reconnects transparently
on the destination.

> 2. Basic disruptive migration flow
> 
> When the virtual machine migrates from the source host to the
> destination host, the guest's CID may change.  The CID namespace is
> host-wide

BTW, I think CIDs would have to become per network namespace.
> so other hosts may have CID collisions and allocate a new CID
> for incoming migration VMs.
I guess all this is so that guest can retrieve its CID and
send it to host using some side-channel?

> The device notifies the guest that the CID has changed.  Guest sockets
> are affected as follows:
> 
>  * Established connections are reset (ECONNRESET) and the guest
>    application will have to reconnect.
> 
>  * Listen sockets remain open.  The only thing to note is that
>    connections from the host are now made to the new CID.  This means
>    the local address of the listen socket is automatically updated to
>    the new CID.
> 
>  * Sockets in other states are unchanged.
> 
> Applications must handle disruptive migration by reconnecting if
> necessary after ECONNRESET.
> 
> 3. Checkpoint/restore for seamless migration
> 
> Applications that wish to communicate across live migration can do so
> but this requires extra application-specific checkpoint/restore code.
> 
> This is similar to the approach taken by the CRIU project where
> getsockopt()/setsockopt() is used to migrate socket state.  The
> difference is that the application process is not automatically migrated
> from the source host to the destination host.  Therefore, the
> application needs to migrate its own state somehow.
> 
> The flow is as follows:
> 
> The application on the source host must quiesce (stop sending/receiving)
> and use getsockopt() to extract socket state information from the host
> kernel.
> 
> A new instance of the application is started on the destination host and
> given the state so it can restore the connection.  The setsockopt()
> syscall is used to restore socket state information.
> 
> The guest is given a list of <host_old_cid, host_new_cid, host_port,
> guest_port> tuples for established connections that must not be reset
> when the guest CID update notification is received.  These connections
> will carry on as if nothing changed.
> 
> Note that the connection's remote address is updated from host_old_cid
> to host_new_cid.  This allows remapping of CIDs (if necessary).
> Typically this will be unused because the host always has well-known CID
> 2.  In a guest<->guest scenario it may be used to remap CIDs.
> 
> 
> For the time being I am focussing on the basic disruptive migration flow
> only.  Checkpoint/restore can be added with a feature bit in the future.
> It is a lot more complex and I'm not sure whether there will be any
> users yet.
> 
> Stefan
This makes some things harder. For example, imagine a guest
reboot mixed with migration. We don't know why the connection
died, so we'll retry connections until when?

Could you please describe some user of vsock and show how
it recovers from disruptive migration?

-- 
MST

Michael S. Tsirkin

2016-Mar-14 11:13 UTC

[virtio-dev] virtio-vsock live migration

On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> Michael pointed out that the virtio-vsock draft specification does not
> address live migration and in fact currently precludes migration.
> 
> Migration is fundamental so the device specification at least mustn't
> preclude it.  Having brainstormed migration with Matthew Benjamin and
> Michael Tsirkin, I am now summarizing the approach that I want to
> include in the next draft specification.
> 
> Feedback and comments welcome!  In the meantime I will implement this in
> code and update the draft specification.
Most of the issue seems to be a consequence of using a 4-byte CID.

I think the right thing to do is just to teach guests
about 64-bit CIDs.

For now, can we drop guest CID from guest to host communication completely,
making CID only host-visible? Maybe leave the space
in the packet so we can add CID there later.
It seems that in theory this will allow changing CID
during migration, transparently to the guest.

Guest visible CID is required for guest to guest communication -
but IIUC that is not currently supported.
Maybe that can be made conditional on 64 bit addressing.
Alternatively, it seems much easier to accept that these channels get broken
across migration.

> 1. Requirements
> 
> Virtio-vsock is a new AF_VSOCK transport.  As such, it should provide at
> least the same guarantees as the existing AF_VSOCK VMCI transport.  This
> is for consistency and to allow code reuse across any AF_VSOCK
> transport.
> 
> Virtio-vsock aims to replace virtio-serial by providing the same
> guest/host communication ability but with sockets API semantics that are
> more popular and convenient for application developers.  Therefore
> virtio-vsock migration should provide at least the same level of
> migration functionality as virtio-serial.
> 
> Ideally it should be possible to migrate applications using AF_VSOCK
> together with the virtual machine so that guest<->host communication is
> interrupted.  Neither AF_VSOCK VMCI nor virtio-serial support this
> today.
> 
> 2. Basic disruptive migration flow
> 
> When the virtual machine migrates from the source host to the
> destination host, the guest's CID may change.  The CID namespace is
> host-wide so other hosts may have CID collisions and allocate a new CID
> for incoming migration VMs.
> 
> The device notifies the guest that the CID has changed.  Guest sockets
> are affected as follows:
> 
>  * Established connections are reset (ECONNRESET) and the guest
>    application will have to reconnect.
> 
>  * Listen sockets remain open.  The only thing to note is that
>    connections from the host are now made to the new CID.  This means
>    the local address of the listen socket is automatically updated to
>    the new CID.
> 
>  * Sockets in other states are unchanged.
> 
> Applications must handle disruptive migration by reconnecting if
> necessary after ECONNRESET.
> 
> 3. Checkpoint/restore for seamless migration
> 
> Applications that wish to communicate across live migration can do so
> but this requires extra application-specific checkpoint/restore code.
> 
> This is similar to the approach taken by the CRIU project where
> getsockopt()/setsockopt() is used to migrate socket state.  The
> difference is that the application process is not automatically migrated
> from the source host to the destination host.  Therefore, the
> application needs to migrate its own state somehow.
> 
> The flow is as follows:
> 
> The application on the source host must quiesce (stop sending/receiving)
> and use getsockopt() to extract socket state information from the host
> kernel.
> 
> A new instance of the application is started on the destination host and
> given the state so it can restore the connection.  The setsockopt()
> syscall is used to restore socket state information.
> 
> The guest is given a list of <host_old_cid, host_new_cid, host_port,
> guest_port> tuples for established connections that must not be reset
> when the guest CID update notification is received.  These connections
> will carry on as if nothing changed.
> 
> Note that the connection's remote address is updated from host_old_cid
> to host_new_cid.  This allows remapping of CIDs (if necessary).
> Typically this will be unused because the host always has well-known CID
> 2.  In a guest<->guest scenario it may be used to remap CIDs.
> 
> 
> For the time being I am focussing on the basic disruptive migration flow
> only.  Checkpoint/restore can be added with a feature bit in the future.
> It is a lot more complex and I'm not sure whether there will be any
> users yet.
> 
> Stefan

Stefan Hajnoczi

2016-Mar-15 15:10 UTC

virtio-vsock live migration

On Fri, Mar 11, 2016 at 01:56:05AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> > Michael pointed out that the virtio-vsock draft specification does not
> > address live migration and in fact currently precludes migration.
> > 
> > Migration is fundamental so the device specification at least mustn't
> > preclude it.  Having brainstormed migration with Matthew Benjamin and
> > Michael Tsirkin, I am now summarizing the approach that I want to
> > include in the next draft specification.
> > 
> > Feedback and comments welcome!  In the meantime I will implement this in
> > code and update the draft specification.
> > 
> > 1. Requirements
> > 
> > Virtio-vsock is a new AF_VSOCK transport.  As such, it should provide at
> > least the same guarantees as the existing AF_VSOCK VMCI transport.  This
> > is for consistency and to allow code reuse across any AF_VSOCK
> > transport.
> > 
> > Virtio-vsock aims to replace virtio-serial by providing the same
> > guest/host communication ability but with sockets API semantics that are
> > more popular and convenient for application developers.  Therefore
> > virtio-vsock migration should provide at least the same level of
> > migration functionality as virtio-serial.
> > 
> > Ideally it should be possible to migrate applications using AF_VSOCK
> > together with the virtual machine so that guest<->host communication is
> > interrupted.  Neither AF_VSOCK VMCI nor virtio-serial support this
> > today.
> 
> I'm not sure why do you say this about virtio serial.
> It appears that if host pre-connected to destination
> qemu before migration, backend reconnects transparently
> on destination.
You are right, virtio-serial supports keeping active ports open across
migration (as well as closing active ports across migration).  In
virtio-vsock the equivalent would be setsockopt() CRIU-style socket
migration which is not implemented today.
> > 2. Basic disruptive migration flow
> > 
> > When the virtual machine migrates from the source host to the
> > destination host, the guest's CID may change.  The CID namespace is
> > host-wide
> 
> 
> BTW, I think CIDs would have to become per network namespace.
Yes, I agree.
> > so other hosts may have CID collisions and allocate a new CID
> > for incoming migration VMs.
> 
> I guess all this is so that guest can retrieve its CID and
> send it to host using some side-channel?
Yes.
> > The device notifies the guest that the CID has changed.  Guest sockets
> > are affected as follows:
> > 
> >  * Established connections are reset (ECONNRESET) and the guest
> >    application will have to reconnect.
> > 
> >  * Listen sockets remain open.  The only thing to note is that
> >    connections from the host are now made to the new CID.  This means
> >    the local address of the listen socket is automatically updated to
> >    the new CID.
> > 
> >  * Sockets in other states are unchanged.
> > 
> > Applications must handle disruptive migration by reconnecting if
> > necessary after ECONNRESET.
> > 
> > 3. Checkpoint/restore for seamless migration
> > 
> > Applications that wish to communicate across live migration can do so
> > but this requires extra application-specific checkpoint/restore code.
> > 
> > This is similar to the approach taken by the CRIU project where
> > getsockopt()/setsockopt() is used to migrate socket state.  The
> > difference is that the application process is not automatically migrated
> > from the source host to the destination host.  Therefore, the
> > application needs to migrate its own state somehow.
> > 
> > The flow is as follows:
> > 
> > The application on the source host must quiesce (stop sending/receiving)
> > and use getsockopt() to extract socket state information from the host
> > kernel.
> > 
> > A new instance of the application is started on the destination host and
> > given the state so it can restore the connection.  The setsockopt()
> > syscall is used to restore socket state information.
> > 
> > The guest is given a list of <host_old_cid, host_new_cid, host_port,
> > guest_port> tuples for established connections that must not be reset
> > when the guest CID update notification is received.  These connections
> > will carry on as if nothing changed.
> > 
> > Note that the connection's remote address is updated from host_old_cid
> > to host_new_cid.  This allows remapping of CIDs (if necessary).
> > Typically this will be unused because the host always has well-known CID
> > 2.  In a guest<->guest scenario it may be used to remap CIDs.
> > 
> > 
> > For the time being I am focussing on the basic disruptive migration flow
> > only.  Checkpoint/restore can be added with a feature bit in the future.
> > It is a lot more complex and I'm not sure whether there will be any
> > users yet.
> > 
> > Stefan
> 
> This makes some things harder. For example, imagine a guest
> reboot mixed with migration. We don't know why did the connection
> die, so we'll retry connections until - when?
> 
> Could you please describe some user of vsock and show how
> it recovers from destructive migration?
qemu-guest-agent runs inside the guest with an AF_VSOCK listen socket.

libvirt arbitrates the qemu-guest-agent connection and provides an API
for applications to send commands.

When an application sends a command, libvirt checks if the connection to
qemu-guest-agent is established.  If there is no connection libvirt will
attempt to connect.

The command is sent to qemu-guest-agent and the response is handed back
to the application.  libvirt arbitrates access so commands from
multiple applications are serialized.

Live migration resets the established connection between
qemu-guest-agent and the source host's libvirt daemon.  When an
application issues the next qemu-guest-agent command the libvirt daemon
on the destination host notices there is no established connection yet
and starts a new one.

Libvirt refuses to send qemu-guest-agent commands while live migration
is in progress.
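
The connect-on-demand pattern described above could look roughly like
the sketch below. It is not libvirt's code: the class name, the single
recv() call and the way the connect callable is passed in are all
simplifications chosen for the example. Only socket.AF_VSOCK itself is a
real Python constant (Linux, Python 3.7 or later):

```python
import socket


def vsock_connect(cid, port):
    """Open an AF_VSOCK stream connection to (cid, port)."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    try:
        sock.connect((cid, port))
    except OSError:
        sock.close()
        raise
    return sock


class LazyAgentClient:
    """Connects on demand and forgets the connection when it is reset."""

    def __init__(self, connect):
        self._connect = connect  # callable returning a connected socket
        self._sock = None

    def command(self, request: bytes) -> bytes:
        if self._sock is None:           # no established connection yet
            self._sock = self._connect()
        try:
            self._sock.sendall(request)
            # Simplification: assumes the reply arrives in one recv().
            return self._sock.recv(4096)
        except ConnectionResetError:     # e.g. reset by live migration
            self._sock.close()
            self._sock = None            # the next command reconnects
            raise
```

A caller would construct it with something like
LazyAgentClient(lambda: vsock_connect(current_guest_cid(), 1234)), where
current_guest_cid() and port 1234 are placeholders. Looking the CID up
at connect time matters because the guest may have a new CID after
migration.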

Stefan

Stefan Hajnoczi

2016-Mar-15 15:15 UTC

[virtio-dev] virtio-vsock live migration

On Mon, Mar 14, 2016 at 01:13:24PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 03, 2016 at 03:37:37PM +0000, Stefan Hajnoczi wrote:
> > Michael pointed out that the virtio-vsock draft specification does not
> > address live migration and in fact currently precludes migration.
> > 
> > Migration is fundamental so the device specification at least mustn't
> > preclude it.  Having brainstormed migration with Matthew Benjamin and
> > Michael Tsirkin, I am now summarizing the approach that I want to
> > include in the next draft specification.
> > 
> > Feedback and comments welcome!  In the meantime I will implement this in
> > code and update the draft specification.
> 
> Most of the issue seems to be a consequence of using a 4 byte CID.
> 
> I think the right thing to do is just to teach guests
> about 64 bit CIDs.
> 
> For now, can we drop guest CID from guest to host communication completely,
> making CID only host-visible? Maybe leave the space
> in the packet so we can add CID there later.
> It seems that in theory this will allow changing CID
> during migration, transparently to the guest.
> 
> Guest visible CID is required for guest to guest communication -
> but IIUC that is not currently supported.
> Maybe that can be made conditional on 64 bit addressing.
> Alternatively, it seems much easier to accept that these channels get broken
> across migration.
I reached the conclusion that channels break across migration because:

1. 32-bit CIDs are in sockaddr_vm and we'd break AF_VSOCK ABI by
   changing it to 64-bit.  Application code would be specific to
   virtio-vsock and wouldn't work with other AF_VSOCK transports that
   use the 32-bit sockaddr_vm struct.

2. Dropping guest CIDs from the protocol breaks network protocols that
   send addresses.  NFS and netperf are the first two protocols I looked
   at and both transmit address information across the connection...
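
To illustrate point 1, here is a sketch of the sockaddr_vm layout packed
with Python's struct module. The field order and the 4 bytes of padding
are my reading of the Linux UAPI header of that time and should be
treated as an assumption. What matters is that the CID slot is a fixed
32-bit field inside a 16-byte address:

```python
import struct

AF_VSOCK = 40  # Linux address family number

# Assumed layout: family (u16), reserved (u16), port (u32), cid (u32),
# followed by 4 bytes of zero padding up to the size of struct sockaddr.
SOCKADDR_VM = "=HHII4x"


def pack_sockaddr_vm(cid, port):
    return struct.pack(SOCKADDR_VM, AF_VSOCK, 0, port, cid)


print(struct.calcsize(SOCKADDR_VM))      # total size in bytes
print(pack_sockaddr_vm(3, 1234).hex())

try:
    pack_sockaddr_vm(2**32, 1234)        # needs more than 32 bits
except struct.error as exc:
    print("rejected:", exc)
```

Widening the CID would change the size or meaning of this struct, which
is why existing AF_VSOCK applications would no longer be portable across
transports.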
