In this email message: http://www.gossamer-threads.com/lists/xen/devel/256737 a patch to libxl_domain_suspend is presented to enable live migrate with the qemu device model in xen-unstable. This patch has *NOT* been taken into the 4.2.1 tree (as far as I can see). In 4.2.0, adding this patch and attempting a live migrate using the qemu device model (using xl) produces a seg fault due to unitialised variables. Should I expect live-migrate of qemu-dm vms to work under 4.2.1? If so, should the patch (or a modification thereof) to remove the check from libxl_domain_suspend be applied to 4.2.1-testing or is there more to do? I am very happy to commit test resources to this. Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffeffff700 (LWP 5995)] 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 (gdb) bt #0 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007ffff6d4e970 in libxl__domain_suspend_common_switch_qemu_logdirty (domid=<optimized out>, enable=<optimized out>, user=0x7ffff00024e8) at libxl_dom.c:728 #2 0x00007ffff6d5c1ae in libxl__srm_callout_received_save (msg=0x7fffefffe41a " error", len=<optimized out>, user=0x7ffff00024e8) at _libxl_save_msgs_callout.c:162 #3 0x00007ffff6d5b736 in helper_stdout_readable (egc=0x7fffefffe5a0, ev=0x7ffff0002560, fd=38, events=<optimized out>, revents=<optimized out>) at libxl_save_callout.c:283 #4 0x00007ffff6d601f1 in afterpoll_internal (egc=0x7fffefffe5a0, poller=0x7ffff00028c0, nfds=4, fds=0x7ffff00048b0, now=...) at libxl_event.c:948 #5 0x00007ffff6d604db in eventloop_iteration (egc=0x7fffefffe5a0, poller=0x7ffff00028c0) at libxl_event.c:1368 #6 0x00007ffff6d616b3 in libxl__ao_inprogress (ao=0x7ffff0001d40, file=<optimized out>, line=<optimized out>, func=<optimized out>) at libxl_event.c:1614 #7 0x00007ffff6d3ab75 in libxl_domain_suspend (ctx=<optimized out>, domid=1, fd=10, flags=<optimized out>, ao_how=<optimized out>) at libxl.c:796 #8 0x000000000043677e in migrate_domain_send (ctx=0x7ffff0008860, domid=1, fd=10) at hypervisor/xen_libxl.c:587 #9 0x000000000043698a in live_migrate_send (hyperconn=0x7ffff0001c70, server=0x7ffff0001cb0, node_ip=0x7ffff00041e0 "10.157.128.20", fd=10) at hypervisor/xen_libxl.c:647 #10 0x0000000000422a70 in migrate_server_action (request=0x7ffff0002980) at action/node_action.c:1287 #11 0x00000000004240c1 in runAction (socket_fd=8) at action/handleaction.c:138 #12 0x00000000004179bd in runcomm (socket=0x8) at xvpagent.c:253 #13 0x0000000000427502 in trackedthread_run (arg=0x66df20) at util/util.c:179 #14 0x00007ffff5c9ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #15 0x00007ffff59ca4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6 #16 0x0000000000000000 in ?? () -- Alex Bligh
This particular segv seems to be because in libxl__domain_suspend_common_switch_qemu_logdirty in libxl_dom.c the variables ''got'', and ''got_ret'' do not appear to be initialised to NULL (got certainly should be, got_ret should be if there is any possibility of libxl__xs_read_checked not writing to got_ret which it doesn''t seem to be). Code inspection suggests this issue is still there in 4.2.1, hence my wondering whether other stuff needs bringing in from unstable. --On 11 December 2012 11:45:42 +0000 Alex Bligh <alex@alex.org.uk> wrote:> Program received signal SIGSEGV, Segmentation fault. > [Switching to Thread 0x7fffeffff700 (LWP 5995)] > 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > (gdb) bt ># 0 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 ># 1 0x00007ffff6d4e970 in libxl__domain_suspend_common_switch_qemu_logdirty (domid=<optimized out>, enable=<optimized out>, user=0x7ffff00024e8) at libxl_dom.c:728 ># 2 0x00007ffff6d5c1ae in libxl__srm_callout_received_save (msg=0x7fffefffe41a " error", len=<optimized out>, user=0x7ffff00024e8) at _libxl_save_msgs_callout.c:162 ># 3 0x00007ffff6d5b736 in helper_stdout_readable (egc=0x7fffefffe5a0, ev=0x7ffff0002560, fd=38, events=<optimized out>, revents=<optimized out>) at libxl_save_callout.c:283 ># 4 0x00007ffff6d601f1 in afterpoll_internal (egc=0x7fffefffe5a0, poller=0x7ffff00028c0, nfds=4, fds=0x7ffff00048b0, now=...) at libxl_event.c:948 ># 5 0x00007ffff6d604db in eventloop_iteration (egc=0x7fffefffe5a0, poller=0x7ffff00028c0) at libxl_event.c:1368 ># 6 0x00007ffff6d616b3 in libxl__ao_inprogress (ao=0x7ffff0001d40, file=<optimized out>, line=<optimized out>, func=<optimized out>) at libxl_event.c:1614 ># 7 0x00007ffff6d3ab75 in libxl_domain_suspend (ctx=<optimized out>, domid=1, fd=10, flags=<optimized out>, ao_how=<optimized out>) at libxl.c:796 ># 8 0x000000000043677e in migrate_domain_send (ctx=0x7ffff0008860, domid=1, fd=10) at hypervisor/xen_libxl.c:587 ># 9 0x000000000043698a in live_migrate_send (hyperconn=0x7ffff0001c70, server=0x7ffff0001cb0, node_ip=0x7ffff00041e0 "10.157.128.20", fd=10) at hypervisor/xen_libxl.c:647 ># 10 0x0000000000422a70 in migrate_server_action (request=0x7ffff0002980) at action/node_action.c:1287 ># 11 0x00000000004240c1 in runAction (socket_fd=8) at action/handleaction.c:138 ># 12 0x00000000004179bd in runcomm (socket=0x8) at xvpagent.c:253 ># 13 0x0000000000427502 in trackedthread_run (arg=0x66df20) at util/util.c:179 ># 14 0x00007ffff5c9ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 ># 15 0x00007ffff59ca4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6 ># 16 0x0000000000000000 in ?? ()-- Alex Bligh
On Tue, 2012-12-11 at 11:45 +0000, Alex Bligh wrote:> In this email message: > http://www.gossamer-threads.com/lists/xen/devel/256737 > a patch to libxl_domain_suspend is presented to enable live migrate > with the qemu device model in xen-unstable. This patch has *NOT* > been taken into the 4.2.1 tree (as far as I can see). > > In 4.2.0, adding this patch and attempting a live migrate using > the qemu device model (using xl) produces a seg fault due to > unitialised variables.Really using xl? because the stack trace suggests otherwise.> Should I expect live-migrate of qemu-dm vms to work under 4.2.1?Do you mean VMs using the upstream "qemu-xen" device model, as opposed to the default "qemu-xen-traditional" model? Migration of HVM guests in 4.2.x is only supported with the qemu-xen-traditional device model and AFAIK there is no plan to backport this support to 4.2. It still shouldn''t crash though. I''m not sure how it got this far since libxl on 4.2 explicitly checks the DM version before attempting to migrate and will refuse to even try with a qemu-xen DM. Ian.> If so, should the patch (or a modification thereof) to remove > the check from libxl_domain_suspend be applied to 4.2.1-testing > or is there more to do? > > I am very happy to commit test resources to this. > > Program received signal SIGSEGV, Segmentation fault. > [Switching to Thread 0x7fffeffff700 (LWP 5995)] > 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > (gdb) bt > #0 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > #1 0x00007ffff6d4e970 in libxl__domain_suspend_common_switch_qemu_logdirty (domid=<optimized out>, enable=<optimized out>, user=0x7ffff00024e8) at libxl_dom.c:728 > #2 0x00007ffff6d5c1ae in libxl__srm_callout_received_save (msg=0x7fffefffe41a " error", len=<optimized out>, user=0x7ffff00024e8) at _libxl_save_msgs_callout.c:162 > #3 0x00007ffff6d5b736 in helper_stdout_readable (egc=0x7fffefffe5a0, ev=0x7ffff0002560, fd=38, events=<optimized out>, revents=<optimized out>) at libxl_save_callout.c:283 > #4 0x00007ffff6d601f1 in afterpoll_internal (egc=0x7fffefffe5a0, poller=0x7ffff00028c0, nfds=4, fds=0x7ffff00048b0, now=...) at libxl_event.c:948 > #5 0x00007ffff6d604db in eventloop_iteration (egc=0x7fffefffe5a0, poller=0x7ffff00028c0) at libxl_event.c:1368 > #6 0x00007ffff6d616b3 in libxl__ao_inprogress (ao=0x7ffff0001d40, file=<optimized out>, line=<optimized out>, func=<optimized out>) at libxl_event.c:1614 > #7 0x00007ffff6d3ab75 in libxl_domain_suspend (ctx=<optimized out>, domid=1, fd=10, flags=<optimized out>, ao_how=<optimized out>) at libxl.c:796 > #8 0x000000000043677e in migrate_domain_send (ctx=0x7ffff0008860, domid=1, fd=10) at hypervisor/xen_libxl.c:587 > #9 0x000000000043698a in live_migrate_send (hyperconn=0x7ffff0001c70, server=0x7ffff0001cb0, node_ip=0x7ffff00041e0 "10.157.128.20", fd=10) at hypervisor/xen_libxl.c:647 > #10 0x0000000000422a70 in migrate_server_action (request=0x7ffff0002980) at action/node_action.c:1287 > #11 0x00000000004240c1 in runAction (socket_fd=8) at action/handleaction.c:138 > #12 0x00000000004179bd in runcomm (socket=0x8) at xvpagent.c:253 > #13 0x0000000000427502 in trackedthread_run (arg=0x66df20) at util/util.c:179 > #14 0x00007ffff5c9ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 > #15 0x00007ffff59ca4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6 > #16 0x0000000000000000 in ?? () > >
On Tue, 2012-12-11 at 12:05 +0000, Alex Bligh wrote:> This particular segv seems to be because in > libxl__domain_suspend_common_switch_qemu_logdirty > in libxl_dom.c the variables ''got'', and ''got_ret'' do not appear to be initialised > to NULL (got certainly should be, got_ret should be if there is any possibility > of libxl__xs_read_checked not writing to got_ret which it doesn''t seem to be). > Code inspection suggests this issue is still there in 4.2.1, hence my wondering > whether other stuff needs bringing in from unstable.libxl__xs_read_checked will always either initialise the variable (perhaps to NULL) or return an error. On both callsites we check for error and "goto out". I think the crash is because the code uses got_ret without checking if it was NULL, which can happen if the path is not present. Ian (J) does that make sense as something which is allowed to happen? As I said in my early mail I''m not sure why you are getting here at all though. Ian.> > --On 11 December 2012 11:45:42 +0000 Alex Bligh <alex@alex.org.uk> wrote: > > > Program received signal SIGSEGV, Segmentation fault. > > [Switching to Thread 0x7fffeffff700 (LWP 5995)] > > 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > > (gdb) bt > ># 0 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 > ># 1 0x00007ffff6d4e970 in libxl__domain_suspend_common_switch_qemu_logdirty (domid=<optimized out>, enable=<optimized out>, user=0x7ffff00024e8) at libxl_dom.c:728 > ># 2 0x00007ffff6d5c1ae in libxl__srm_callout_received_save (msg=0x7fffefffe41a " error", len=<optimized out>, user=0x7ffff00024e8) at _libxl_save_msgs_callout.c:162 > ># 3 0x00007ffff6d5b736 in helper_stdout_readable (egc=0x7fffefffe5a0, ev=0x7ffff0002560, fd=38, events=<optimized out>, revents=<optimized out>) at libxl_save_callout.c:283 > ># 4 0x00007ffff6d601f1 in afterpoll_internal (egc=0x7fffefffe5a0, poller=0x7ffff00028c0, nfds=4, fds=0x7ffff00048b0, now=...) at libxl_event.c:948 > ># 5 0x00007ffff6d604db in eventloop_iteration (egc=0x7fffefffe5a0, poller=0x7ffff00028c0) at libxl_event.c:1368 > ># 6 0x00007ffff6d616b3 in libxl__ao_inprogress (ao=0x7ffff0001d40, file=<optimized out>, line=<optimized out>, func=<optimized out>) at libxl_event.c:1614 > ># 7 0x00007ffff6d3ab75 in libxl_domain_suspend (ctx=<optimized out>, domid=1, fd=10, flags=<optimized out>, ao_how=<optimized out>) at libxl.c:796 > ># 8 0x000000000043677e in migrate_domain_send (ctx=0x7ffff0008860, domid=1, fd=10) at hypervisor/xen_libxl.c:587 > ># 9 0x000000000043698a in live_migrate_send (hyperconn=0x7ffff0001c70, server=0x7ffff0001cb0, node_ip=0x7ffff00041e0 "10.157.128.20", fd=10) at hypervisor/xen_libxl.c:647 > ># 10 0x0000000000422a70 in migrate_server_action (request=0x7ffff0002980) at action/node_action.c:1287 > ># 11 0x00000000004240c1 in runAction (socket_fd=8) at action/handleaction.c:138 > ># 12 0x00000000004179bd in runcomm (socket=0x8) at xvpagent.c:253 > ># 13 0x0000000000427502 in trackedthread_run (arg=0x66df20) at util/util.c:179 > ># 14 0x00007ffff5c9ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 > ># 15 0x00007ffff59ca4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6 > ># 16 0x0000000000000000 in ?? () > > >
Ian, (two messages in one)>> In 4.2.0, adding this patch and attempting a live migrate using >> the qemu device model (using xl) produces a seg fault due to >> unitialised variables. > > Really using xl? because the stack trace suggests otherwise.Sorry, libxl.>> Should I expect live-migrate of qemu-dm vms to work under 4.2.1? > > Do you mean VMs using the upstream "qemu-xen" device model, as opposed > to the default "qemu-xen-traditional" model?Yes, using upstream qemu-xen.> Migration of HVM guests in 4.2.x is only supported with the > qemu-xen-traditional device model and AFAIK there is no plan to backport > this support to 4.2.Ah. What would be involved in a backport? We use HVM guests under 4.2 and need qemu-xen for various reasons.> It still shouldn''t crash though. I''m not sure how it got this far since > libxl on 4.2 explicitly checks the DM version before attempting to > migrate and will refuse to even try with a qemu-xen DM.We had removed the check (per the pointer to the mail message I sent) for the qemu-xen model.> libxl__xs_read_checked will always either initialise the variable > (perhaps to NULL) or return an error. On both callsites we check for > error and "goto out".It returns NULL if the error is not ENOENT I think.> I think the crash is because the code uses got_ret without checking if > it was NULL, which can happen if the path is not present. Ian (J) does > that make sense as something which is allowed to happen?gdb showed one pointer was NULL and the other pointed to some rubbish (the latter is confusing). Alex> > Ian. > >> If so, should the patch (or a modification thereof) to remove >> the check from libxl_domain_suspend be applied to 4.2.1-testing >> or is there more to do? >> >> I am very happy to commit test resources to this. >> >> Program received signal SIGSEGV, Segmentation fault. >> [Switching to Thread 0x7fffeffff700 (LWP 5995)] >> 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 >> (gdb) bt >> # 0 0x00007ffff5a0862a in ?? () from /lib/x86_64-linux-gnu/libc.so.6 >> # 1 0x00007ffff6d4e970 in libxl__domain_suspend_common_switch_qemu_logdirty (domid=<optimized out>, enable=<optimized out>, user=0x7ffff00024e8) at libxl_dom.c:728 >> # 2 0x00007ffff6d5c1ae in libxl__srm_callout_received_save (msg=0x7fffefffe41a " error", len=<optimized out>, user=0x7ffff00024e8) at _libxl_save_msgs_callout.c:162 >> # 3 0x00007ffff6d5b736 in helper_stdout_readable (egc=0x7fffefffe5a0, ev=0x7ffff0002560, fd=38, events=<optimized out>, revents=<optimized out>) at libxl_save_callout.c:283 >> # 4 0x00007ffff6d601f1 in afterpoll_internal (egc=0x7fffefffe5a0, poller=0x7ffff00028c0, nfds=4, fds=0x7ffff00048b0, now=...) at libxl_event.c:948 >> # 5 0x00007ffff6d604db in eventloop_iteration (egc=0x7fffefffe5a0, poller=0x7ffff00028c0) at libxl_event.c:1368 >> # 6 0x00007ffff6d616b3 in libxl__ao_inprogress (ao=0x7ffff0001d40, file=<optimized out>, line=<optimized out>, func=<optimized out>) at libxl_event.c:1614 >> # 7 0x00007ffff6d3ab75 in libxl_domain_suspend (ctx=<optimized out>, domid=1, fd=10, flags=<optimized out>, ao_how=<optimized out>) at libxl.c:796 >> # 8 0x000000000043677e in migrate_domain_send (ctx=0x7ffff0008860, domid=1, fd=10) at hypervisor/xen_libxl.c:587 >> # 9 0x000000000043698a in live_migrate_send (hyperconn=0x7ffff0001c70, server=0x7ffff0001cb0, node_ip=0x7ffff00041e0 "10.157.128.20", fd=10) at hypervisor/xen_libxl.c:647 >> # 10 0x0000000000422a70 in migrate_server_action (request=0x7ffff0002980) at action/node_action.c:1287 >> # 11 0x00000000004240c1 in runAction (socket_fd=8) at action/handleaction.c:138 >> # 12 0x00000000004179bd in runcomm (socket=0x8) at xvpagent.c:253 >> # 13 0x0000000000427502 in trackedthread_run (arg=0x66df20) at util/util.c:179 >> # 14 0x00007ffff5c9ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 >> # 15 0x00007ffff59ca4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6 >> # 16 0x0000000000000000 in ?? () >> >> > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > >-- Alex Bligh
On Tue, 2012-12-11 at 15:07 +0000, Alex Bligh wrote:> > Migration of HVM guests in 4.2.x is only supported with the > > qemu-xen-traditional device model and AFAIK there is no plan to backport > > this support to 4.2. > > Ah. What would be involved in a backport? We use HVM guests under 4.2 and > need qemu-xen for various reasons.AFAIK its a pretty big qemu-side backport, plus you need the whole libxl series not just the final patch that you linked to. I don''t think this is a candidate for 4.2.x. You could try doing it locally though I guess.> > It still shouldn''t crash though. I''m not sure how it got this far since > > libxl on 4.2 explicitly checks the DM version before attempting to > > migrate and will refuse to even try with a qemu-xen DM. > > We had removed the check (per the pointer to the mail message I sent) > for the qemu-xen model.Oh well, then all bets are off really. Ian.
Stefano Stabellini
2012-Dec-11 15:26 UTC
Re: Xen 4.2.1 live migration with qemu device model
On Tue, 11 Dec 2012, Ian Campbell wrote:> On Tue, 2012-12-11 at 15:07 +0000, Alex Bligh wrote: > > > Migration of HVM guests in 4.2.x is only supported with the > > > qemu-xen-traditional device model and AFAIK there is no plan to backport > > > this support to 4.2. > > > > Ah. What would be involved in a backport? We use HVM guests under 4.2 and > > need qemu-xen for various reasons. > > AFAIK its a pretty big qemu-side backport, plus you need the whole libxl > series not just the final patch that you linked to. > > I don''t think this is a candidate for 4.2.x. You could try doing it > locally though I guess.This is the patch series that needs to be backported in QEMU: http://marc.info/?l=qemu-devel&m=134920288412400&w=2 And this is the libxl counterpart: http://marc.info/?l=xen-devel&m=134944750724252 I would be OK with backporting the QEMU side, but I''ll leave the decision on the libxl side up to you.
Stefono, Ian, --On 11 December 2012 15:26:25 +0000 Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:> This is the patch series that needs to be backported in QEMU: > http://marc.info/?l=qemu-devel&m=134920288412400&w=2 > > And this is the libxl counterpart: > http://marc.info/?l=xen-devel&m=134944750724252 > > I would be OK with backporting the QEMU side,That would be great and very useful (even if it doesn''t ultimately make 4.2.x)> but I''ll leave the > decision on the libxl side up to you.I''d already planned to try backporting this patchset. If it goes in reasonably cleanly, it''s probably within my competence, and I''m happy to test etc. We need qemu-xen for the better snapshotting capability on qcow2 (rebase in particular), and not having live-migrate on HVM is a bit of a PITA. -- Alex Bligh
Stefano Stabellini
2012-Dec-11 19:24 UTC
Re: Xen 4.2.1 live migration with qemu device model
On Tue, 11 Dec 2012, Alex Bligh wrote:> Stefono, Ian, > > --On 11 December 2012 15:26:25 +0000 Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote: > > > This is the patch series that needs to be backported in QEMU: > > http://marc.info/?l=qemu-devel&m=134920288412400&w=2 > > > > And this is the libxl counterpart: > > http://marc.info/?l=xen-devel&m=134944750724252 > > > > I would be OK with backporting the QEMU side, > > That would be great and very useful (even if it doesn''t ultimately make 4.2.x) > > > but I''ll leave the > > decision on the libxl side up to you. > > I''d already planned to try backporting this patchset. If it goes in reasonably cleanly, > it''s probably within my competence, and I''m happy to test etc. > > We need qemu-xen for the better snapshotting capability on qcow2 (rebase > in particular), and not having live-migrate on HVM is a bit of a PITA.It would be great if you could implement QEMU qcow2 snapshotting support in libxl, so that everything can happen seamlessly using a variation of xl save/restore. It is a while that we wanted that feature but we never got around to it.
--On 11 December 2012 19:24:54 +0000 Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:> It would be great if you could implement QEMU qcow2 snapshotting support > in libxl, so that everything can happen seamlessly using a variation of xl > save/restore.I''d guess that isn''t hard. We already have QEMU qcow2 snapshotting support on qemu-xen working via direct calls to QEMU - I think it ''worked just like kvm''. I seem to remember there is some fiddling to do with an open fd number in qemu for the live rebase, and some limitations (for instance I couldn''t see how to do a consistent snapshot of the same device presented both pv and emulated under HVM - which is theoretically an issue if you have one partition mounted one way and one the other). -- Alex Bligh
Stefano Stabellini
2012-Dec-12 13:04 UTC
QEMU qcow2 snapshotting (was Xen 4.2.1 live migration with qemu device model)
On Tue, 11 Dec 2012, Alex Bligh wrote:> --On 11 December 2012 19:24:54 +0000 Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote: > > > It would be great if you could implement QEMU qcow2 snapshotting support > > in libxl, so that everything can happen seamlessly using a variation of xl > > save/restore. > > I''d guess that isn''t hard. We already have QEMU qcow2 snapshotting support > on qemu-xen working via direct calls to QEMU - I think it ''worked just like > kvm''. I seem to remember there is some fiddling to do with an open fd number > in qemu for the live rebase, and some limitations (for instance I couldn''t > see how to do a consistent snapshot of the same device presented both pv > and emulated under HVM - which is theoretically an issue if you have > one partition mounted one way and one the other).It shouldn''t be hard: we already have pretty good QMP support in libxl, so it should be just a matter of plumbing through the new commands. QEMU supports both internal and external snapshots, but I think that the internal ones are more interesting (snapshot saved within the qcow2 file), which ones are you using? Also QEMU''s snapshots support both disk and/or the entire VM state (savevm vs. snapshot_blkdev): one thing that would be cool is if we could save the Xen VM state (AKA xl save) within the snapshot in the qcow2 image. Basically we are talking about a new pair of xl save/restore commands (or maybe options to the existing save/restore commands) that make use of QEMU qcow2 snapshots as storage for both the disk delta and the VM state. To do this we would probably need a new QEMU QMP command, but I think that it would be very useful.
Alex Bligh
2012-Dec-12 15:22 UTC
Re: QEMU qcow2 snapshotting (was Xen 4.2.1 live migration with qemu device model)
Stefano, --On 12 December 2012 13:04:04 +0000 Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:> QEMU supports both internal and external snapshots, but I think that the > internal ones are more interesting (snapshot saved within the qcow2 > file), which ones are you using?We currently use external snapshots + rebase. The rebase stuff has made external snapshots far more usable particularly for CoW base images. I shall try to have a look at this - but getting live migration working with the device model is the top priority for us. I had a quick look this morning and most of the changes are to the JSON library it seems! -- Alex Bligh
Ian Campbell writes ("Re: [Xen-devel] Xen 4.2.1 live migration with qemu device model"):> libxl__xs_read_checked will always either initialise the variable > (perhaps to NULL) or return an error. On both callsites we check for > error and "goto out".Yes. And the error handling case after the strcmp does indeed handle got_ret==NULL but the test fails to guard against this before running strcmp.> I think the crash is because the code uses got_ret without checking if > it was NULL, which can happen if the path is not present. Ian (J) does > that make sense as something which is allowed to happen?Yes. I think it is an error somewhere. I don''t understand how this is happening. This situation might occur if the qemu crashed and two separate attempts were made to send a logdirty command, I guess.> As I said in my early mail I''m not sure why you are getting here at all > though.Right. I think the patch below fixes the segfault but it doesn''t fix the underlying cause. Ian. Subject: libxl: qemu trad logdirty: Tolerate ENOENT on ret path It can happen in error conditions that lds->ret_path doesn''t exist, and libxl__xs_read_checked signals this by setting got_ret=NULL. If this happens, fail without crashing. Reported-by: Alex Bligh <alex@alex.org.uk>, Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 95da18e..7586a6c 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -725,7 +725,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret); if (rc) goto out; - if (strcmp(got, got_ret)) { + if (!got_ret || strcmp(got, got_ret)) { LOG(ERROR,"controlling logdirty: qemu was already sent" " command `%s'' (xenstore path `%s'') but result is `%s''", got, lds->cmd_path, got_ret ? got_ret : "<none>");
Alex Bligh
2012-Dec-13 10:56 UTC
[RFC] Enabling live-migrate on HVM on qemu-xen device model
This is a COMPILE TESTED ONLY RFC for a backport of the libxl elements of: http://marc.info/?l=xen-devel&m=134944750724252 Stefano Stabellini said he would look at backporting the qemu-xen side. My first question is whether this is the right approach: do we want to include the JSON changes too (presumably it changes the API). I am not 100% convinced they are necessary to get live migrate working. DO NOT RUN THIS CODE ON ANYTHING YOU CARE ABOUT.
Alex Bligh
2012-Dec-13 10:56 UTC
[PATCH 01/10] libxl_json: Export json_object related function.
Export libxl__json_object_alloc and libxl__json_object_append_to to use them in a later patch. Backported from xen-unstable patch: : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693129 -3600 : Node ID c9b80c7f8db1a5d26906a2298c481bf7e87fda94 : Parent 93e3e6a33e0a1ec9f92fc575334caa35e6dbc757 --- tools/libxl/libxl_internal.h | 14 ++++++++++++-- tools/libxl/libxl_json.c | 32 ++++++++++++++++---------------- 2 files changed, 28 insertions(+), 18 deletions(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index a135cd7..2959527 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1512,6 +1512,15 @@ static inline long long libxl__json_object_get_integer(const libxl__json_object return -1; } +/* + * NOGC can be used with those json_object functions, but the + * libxl__json_object* will need to be freed with libxl__json_object_free. + */ +_hidden libxl__json_object *libxl__json_object_alloc(libxl__gc *gc_opt, + libxl__json_node_type type); +_hidden int libxl__json_object_append_to(libxl__gc *gc_opt, + libxl__json_object *obj, + libxl__json_object *dst); _hidden libxl__json_object *libxl__json_array_get(const libxl__json_object *o, int i); _hidden @@ -1520,9 +1529,10 @@ libxl__json_map_node *libxl__json_map_node_get(const libxl__json_object *o, _hidden const libxl__json_object *libxl__json_map_get(const char *key, const libxl__json_object *o, libxl__json_node_type expected_type); -_hidden void libxl__json_object_free(libxl__gc *gc, libxl__json_object *obj); +_hidden void libxl__json_object_free(libxl__gc *gc_opt, + libxl__json_object *obj); -_hidden libxl__json_object *libxl__json_parse(libxl__gc *gc, const char *s); +_hidden libxl__json_object *libxl__json_parse(libxl__gc *gc_opt, const char *s); /* Based on /local/domain/$domid/dm-version xenstore key * default is qemu xen traditional */ diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c index caa8312..0b0cf2f 100644 --- a/tools/libxl/libxl_json.c +++ b/tools/libxl/libxl_json.c @@ -205,7 +205,7 @@ yajl_gen_status libxl__string_gen_json(yajl_gen hand, * libxl__json_object helper functions */ -static libxl__json_object *json_object_alloc(libxl__gc *gc, +libxl__json_object *libxl__json_object_alloc(libxl__gc *gc, libxl__json_node_type type) { libxl__json_object *obj; @@ -236,7 +236,7 @@ static libxl__json_object *json_object_alloc(libxl__gc *gc, return obj; } -static int json_object_append_to(libxl__gc *gc, +int libxl__json_object_append_to(libxl__gc *gc, libxl__json_object *obj, libxl__json_object *dst) { @@ -393,10 +393,10 @@ static int json_callback_null(void *opaque) DEBUG_GEN(ctx, null); - if ((obj = json_object_alloc(ctx->gc, JSON_NULL)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_NULL)) == NULL) return 0; - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } @@ -411,11 +411,11 @@ static int json_callback_boolean(void *opaque, int boolean) DEBUG_GEN_VALUE(ctx, bool, boolean); - if ((obj = json_object_alloc(ctx->gc, + if ((obj = libxl__json_object_alloc(ctx->gc, boolean ? JSON_TRUE : JSON_FALSE)) == NULL) return 0; - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } @@ -448,7 +448,7 @@ static int json_callback_number(void *opaque, const char *s, libxl_yajl_length l goto error; } - if ((obj = json_object_alloc(ctx->gc, JSON_DOUBLE)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_DOUBLE)) == NULL) return 0; obj->u.d = d; } else { @@ -458,7 +458,7 @@ static int json_callback_number(void *opaque, const char *s, libxl_yajl_length l goto error; } - if ((obj = json_object_alloc(ctx->gc, JSON_INTEGER)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_INTEGER)) == NULL) return 0; obj->u.i = i; } @@ -466,7 +466,7 @@ static int json_callback_number(void *opaque, const char *s, libxl_yajl_length l error: /* If the conversion fail, we just store the original string. */ - if ((obj = json_object_alloc(ctx->gc, JSON_NUMBER)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_NUMBER)) == NULL) return 0; t = malloc(len + 1); @@ -481,7 +481,7 @@ error: obj->u.string = t; out: - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } @@ -508,13 +508,13 @@ static int json_callback_string(void *opaque, const unsigned char *str, strncpy(t, (const char *) str, len); t[len] = 0; - if ((obj = json_object_alloc(ctx->gc, JSON_STRING)) == NULL) { + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_STRING)) == NULL) { free(t); return 0; } obj->u.string = t; - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } @@ -573,11 +573,11 @@ static int json_callback_start_map(void *opaque) DEBUG_GEN(ctx, map_open); - if ((obj = json_object_alloc(ctx->gc, JSON_MAP)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_MAP)) == NULL) return 0; if (ctx->current) { - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } @@ -615,11 +615,11 @@ static int json_callback_start_array(void *opaque) DEBUG_GEN(ctx, array_open); - if ((obj = json_object_alloc(ctx->gc, JSON_ARRAY)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_ARRAY)) == NULL) return 0; if (ctx->current) { - if (json_object_append_to(ctx->gc, obj, ctx->current) == -1) { + if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { libxl__json_object_free(ctx->gc, obj); return 0; } -- 1.7.4.1
This value from libxl__json_node_type is never used. Backported from xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693130 -3600 : Node ID 4a6d5d8cba4fc44f9bbda201188885868604b8e8 : Parent c9b80c7f8db1a5d26906a2298c481bf7e87fda94 --- tools/libxl/libxl_internal.h | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 2959527..5b285d4 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1428,7 +1428,6 @@ _hidden yajl_gen_status libxl__yajl_gen_asciiz(yajl_gen hand, const char *str); _hidden yajl_gen_status libxl__yajl_gen_enum(yajl_gen hand, const char *str); typedef enum { - JSON_ERROR, JSON_NULL, JSON_TRUE, JSON_FALSE, -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 03/10] libxl_json: Replace JSON_TRUE/FALSE by JSON_BOOL.
Those two JSON_TRUE and JSON_FALSE were types of node. But it''s better to have a unique JSON_BOOL type. Backported from xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693131 -3600 : Node ID 3f71aab0e2774ded0c5a03436c364fb031ba9aa0 : Parent 4a6d5d8cba4fc44f9bbda201188885868604b8e8 --- tools/libxl/libxl_internal.h | 15 +++++++++++++-- tools/libxl/libxl_json.c | 3 +-- tools/libxl/libxl_qmp.c | 3 ++- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 5b285d4..7dbd8af 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1429,8 +1429,7 @@ _hidden yajl_gen_status libxl__yajl_gen_enum(yajl_gen hand, const char *str); typedef enum { JSON_NULL, - JSON_TRUE, - JSON_FALSE, + JSON_BOOL, JSON_INTEGER, JSON_DOUBLE, /* number is store in string, it''s too big to be a long long or a double */ @@ -1444,6 +1443,7 @@ typedef enum { typedef struct libxl__json_object { libxl__json_node_type type; union { + bool b; long long i; double d; char *string; @@ -1462,6 +1462,10 @@ typedef struct { typedef struct libxl__yajl_ctx libxl__yajl_ctx; +static inline bool libxl__json_object_is_bool(const libxl__json_object *o) +{ + return o != NULL && o->type == JSON_BOOL; +} static inline bool libxl__json_object_is_string(const libxl__json_object *o) { return o != NULL && o->type == JSON_STRING; @@ -1479,6 +1483,13 @@ static inline bool libxl__json_object_is_array(const libxl__json_object *o) return o != NULL && o->type == JSON_ARRAY; } +static inline bool libxl__json_object_get_bool(const libxl__json_object *o) +{ + if (libxl__json_object_is_bool(o)) + return o->u.b; + else + return false; +} static inline const char *libxl__json_object_get_string(const libxl__json_object *o) { diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c index 0b0cf2f..98db465 100644 --- a/tools/libxl/libxl_json.c +++ b/tools/libxl/libxl_json.c @@ -411,8 +411,7 @@ static int json_callback_boolean(void *opaque, int boolean) DEBUG_GEN_VALUE(ctx, bool, boolean); - if ((obj = libxl__json_object_alloc(ctx->gc, - boolean ? JSON_TRUE : JSON_FALSE)) == NULL) + if ((obj = libxl__json_object_alloc(ctx->gc, JSON_BOOL)) == NULL) return 0; if (libxl__json_object_append_to(ctx->gc, obj, ctx->current) == -1) { diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c index e33b130..9e86c35 100644 --- a/tools/libxl/libxl_qmp.c +++ b/tools/libxl/libxl_qmp.c @@ -178,7 +178,8 @@ static int qmp_register_vnc_callback(libxl__qmp_handler *qmp, goto out; } - if (libxl__json_map_get("enabled", o, JSON_FALSE)) { + obj = libxl__json_map_get("enabled", o, JSON_BOOL); + if (!obj || !libxl__json_object_get_bool(obj)) { rc = 0; goto out; } -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 04/10] libxl_json: Introduce libxl__json_object_to_yajl_gen.
This function converts a libxl__json_object to yajl by calling every yajl_gen_* function on a preallocated yajl_gen hand. This helps to integrate a json_object into an already existing yajl_gen tree. This function is used in a later patch. Backported from xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693132 -3600 : Node ID 74dee58cfc0d2d6594f388db3b4d2ce91d1bb204 : Parent 3f71aab0e2774ded0c5a03436c364fb031ba9aa0 --- tools/libxl/libxl_internal.h | 3 ++ tools/libxl/libxl_json.c | 61 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 0 deletions(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 7dbd8af..b00ff61 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1539,6 +1539,9 @@ libxl__json_map_node *libxl__json_map_node_get(const libxl__json_object *o, _hidden const libxl__json_object *libxl__json_map_get(const char *key, const libxl__json_object *o, libxl__json_node_type expected_type); +_hidden yajl_status libxl__json_object_to_yajl_gen(libxl__gc *gc_opt, + yajl_gen hand, + libxl__json_object *param); _hidden void libxl__json_object_free(libxl__gc *gc_opt, libxl__json_object *obj); diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c index 98db465..72b52e8 100644 --- a/tools/libxl/libxl_json.c +++ b/tools/libxl/libxl_json.c @@ -381,6 +381,67 @@ const libxl__json_object *libxl__json_map_get(const char *key, return NULL; } +yajl_status libxl__json_object_to_yajl_gen(libxl__gc *gc, + yajl_gen hand, + libxl__json_object *obj) +{ + int idx = 0; + yajl_status rc; + + switch (obj->type) { + case JSON_NULL: + return yajl_gen_null(hand); + case JSON_BOOL: + return yajl_gen_bool(hand, obj->u.b); + case JSON_INTEGER: + return yajl_gen_integer(hand, obj->u.i); + case JSON_DOUBLE: + return yajl_gen_double(hand, obj->u.d); + case JSON_NUMBER: + return yajl_gen_number(hand, obj->u.string, strlen(obj->u.string)); + case JSON_STRING: + return libxl__yajl_gen_asciiz(hand, obj->u.string); + case JSON_MAP: { + libxl__json_map_node *node = NULL; + + rc = yajl_gen_map_open(hand); + if (rc != yajl_status_ok) + return rc; + for (idx = 0; idx < obj->u.map->count; idx++) { + if (flexarray_get(obj->u.map, idx, (void**)&node) != 0) + break; + + rc = libxl__yajl_gen_asciiz(hand, node->map_key); + if (rc != yajl_status_ok) + return rc; + rc = libxl__json_object_to_yajl_gen(gc, hand, node->obj); + if (rc != yajl_status_ok) + return rc; + } + return yajl_gen_map_close(hand); + } + case JSON_ARRAY: { + libxl__json_object *node = NULL; + + rc = yajl_gen_array_open(hand); + if (rc != yajl_status_ok) + return rc; + for (idx = 0; idx < obj->u.array->count; idx++) { + if (flexarray_get(obj->u.array, idx, (void**)&node) != 0) + break; + rc = libxl__json_object_to_yajl_gen(gc, hand, node); + if (rc != yajl_status_ok) + return rc; + } + return yajl_gen_array_close(hand); + } + case JSON_ANY: + /* JSON_ANY is not a valid value for obj->type. */ + ; + } + abort(); +} + /* * JSON callbacks -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 05/10] libxl_qmp: Introduces helpers to create an argument list.
Those functions will be used to create a "list" of parameters that contain more than just strings. This list is converted by qmp_send to a string to be sent to QEMU. Those functions will be used in the next two patches, so right now there are not compiled. Backported from xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693132 -3600 : Node ID 6f7847729f0f42614de516d15257ede7243f995f : Parent 74dee58cfc0d2d6594f388db3b4d2ce91d1bb204 --- tools/libxl/libxl_qmp.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 51 insertions(+), 0 deletions(-) diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c index 9e86c35..827f1b7 100644 --- a/tools/libxl/libxl_qmp.c +++ b/tools/libxl/libxl_qmp.c @@ -624,6 +624,57 @@ static void qmp_free_handler(libxl__qmp_handler *qmp) free(qmp); } +#if 0 +/* + * QMP Parameters Helpers + */ +static void qmp_parameters_common_add(libxl__gc *gc, + libxl__json_object **param, + const char *name, + libxl__json_object *obj) +{ + libxl__json_map_node *arg = NULL; + + if (!*param) { + *param = libxl__json_object_alloc(gc, JSON_MAP); + } + + arg = libxl__zalloc(gc, sizeof(*arg)); + + arg->map_key = libxl__strdup(gc, name); + arg->obj = obj; + + flexarray_append((*param)->u.map, arg); +} + +static void qmp_parameters_add_string(libxl__gc *gc, + libxl__json_object **param, + const char *name, const char *argument) +{ + libxl__json_object *obj; + + obj = libxl__json_object_alloc(gc, JSON_STRING); + obj->u.string = libxl__strdup(gc, argument); + + qmp_parameters_common_add(gc, param, name, obj); +} + +static void qmp_parameters_add_bool(libxl__gc *gc, + libxl__json_object **param, + const char *name, bool b) +{ + libxl__json_object *obj; + + obj = libxl__json_object_alloc(gc, JSON_BOOL); + obj->u.b = b; + qmp_parameters_common_add(gc, param, name, obj); +} + +#define QMP_PARAMETERS_SPRINTF(args, name, format, ...) \ + qmp_parameters_add_string(gc, args, name, \ + libxl__sprintf(gc, format, __VA_ARGS__)) +#endif + /* * API */ -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 06/10] libxl_qmp: Use qmp_parameters_* functions for param list of a QMP command.
Backported from xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693133 -3600 : Node ID be5d014f91dfbd67afacc3385c265243794a246f : Parent 6f7847729f0f42614de516d15257ede7243f995f --- tools/libxl/libxl_qmp.c | 89 ++++++++++++++++------------------------------- 1 files changed, 30 insertions(+), 59 deletions(-) diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c index 827f1b7..605e8f3 100644 --- a/tools/libxl/libxl_qmp.c +++ b/tools/libxl/libxl_qmp.c @@ -78,7 +78,7 @@ struct libxl__qmp_handler { }; static int qmp_send(libxl__qmp_handler *qmp, - const char *cmd, libxl_key_value_list *args, + const char *cmd, libxl__json_object *args, qmp_callback_t callback, void *opaque, qmp_request_context *context); @@ -503,7 +503,7 @@ static int qmp_next(libxl__gc *gc, libxl__qmp_handler *qmp) } static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp, - const char *cmd, libxl_key_value_list *args, + const char *cmd, libxl__json_object *args, qmp_callback_t callback, void *opaque, qmp_request_context *context) { @@ -527,7 +527,7 @@ static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp, yajl_gen_integer(hand, ++qmp->last_id_used); if (args) { libxl__yajl_gen_asciiz(hand, "arguments"); - libxl_key_value_list_gen_json(hand, args); + libxl__json_object_to_yajl_gen(gc, hand, args); } yajl_gen_map_close(hand); @@ -561,7 +561,7 @@ out: } static int qmp_send(libxl__qmp_handler *qmp, - const char *cmd, libxl_key_value_list *args, + const char *cmd, libxl__json_object *args, qmp_callback_t callback, void *opaque, qmp_request_context *context) { @@ -589,7 +589,7 @@ out: } static int qmp_synchronous_send(libxl__qmp_handler *qmp, const char *cmd, - libxl_key_value_list *args, + libxl__json_object *args, qmp_callback_t callback, void *opaque, int ask_timeout) { @@ -624,7 +624,6 @@ static void qmp_free_handler(libxl__qmp_handler *qmp) free(qmp); } -#if 0 /* * QMP Parameters Helpers */ @@ -659,6 +658,7 @@ static void qmp_parameters_add_string(libxl__gc *gc, qmp_parameters_common_add(gc, param, name, obj); } +#if 0 static void qmp_parameters_add_bool(libxl__gc *gc, libxl__json_object **param, const char *name, bool b) @@ -669,11 +669,11 @@ static void qmp_parameters_add_bool(libxl__gc *gc, obj->u.b = b; qmp_parameters_common_add(gc, param, name, obj); } +#endif #define QMP_PARAMETERS_SPRINTF(args, name, format, ...) \ qmp_parameters_add_string(gc, args, name, \ libxl__sprintf(gc, format, __VA_ARGS__)) -#endif /* * API @@ -801,8 +801,7 @@ out: int libxl__qmp_pci_add(libxl__gc *gc, int domid, libxl_device_pci *pcidev) { libxl__qmp_handler *qmp = NULL; - flexarray_t *parameters = NULL; - libxl_key_value_list args = NULL; + libxl__json_object *args = NULL; char *hostaddr = NULL; int rc = 0; @@ -815,31 +814,22 @@ int libxl__qmp_pci_add(libxl__gc *gc, int domid, libxl_device_pci *pcidev) if (!hostaddr) return -1; - parameters = flexarray_make(6, 1); - flexarray_append_pair(parameters, "driver", "xen-pci-passthrough"); - flexarray_append_pair(parameters, "id", - libxl__sprintf(gc, PCI_PT_QDEV_ID, - pcidev->bus, pcidev->dev, - pcidev->func)); - flexarray_append_pair(parameters, "hostaddr", hostaddr); + qmp_parameters_add_string(gc, &args, "driver", "xen-pci-passthrough"); + QMP_PARAMETERS_SPRINTF(&args, "id", PCI_PT_QDEV_ID, + pcidev->bus, pcidev->dev, pcidev->func); + qmp_parameters_add_string(gc, &args, "hostaddr", hostaddr); if (pcidev->vdevfn) { - flexarray_append_pair(parameters, "addr", - libxl__sprintf(gc, "%x.%x", - PCI_SLOT(pcidev->vdevfn), - PCI_FUNC(pcidev->vdevfn))); + QMP_PARAMETERS_SPRINTF(&args, "addr", "%x.%x", + PCI_SLOT(pcidev->vdevfn), PCI_FUNC(pcidev->vdevfn)); } - args = libxl__xs_kvs_of_flexarray(gc, parameters, parameters->count); - if (!args) - return -1; - rc = qmp_synchronous_send(qmp, "device_add", &args, + rc = qmp_synchronous_send(qmp, "device_add", args, NULL, NULL, qmp->timeout); if (rc == 0) { rc = qmp_synchronous_send(qmp, "query-pci", NULL, pci_add_callback, pcidev, qmp->timeout); } - flexarray_free(parameters); libxl__qmp_close(qmp); return rc; } @@ -847,24 +837,18 @@ int libxl__qmp_pci_add(libxl__gc *gc, int domid, libxl_device_pci *pcidev) static int qmp_device_del(libxl__gc *gc, int domid, char *id) { libxl__qmp_handler *qmp = NULL; - flexarray_t *parameters = NULL; - libxl_key_value_list args = NULL; + libxl__json_object *args = NULL; int rc = 0; qmp = libxl__qmp_initialize(gc, domid); if (!qmp) return ERROR_FAIL; - parameters = flexarray_make(2, 1); - flexarray_append_pair(parameters, "id", id); - args = libxl__xs_kvs_of_flexarray(gc, parameters, parameters->count); - if (!args) - return ERROR_NOMEM; + qmp_parameters_add_string(gc, &args, "id", id); - rc = qmp_synchronous_send(qmp, "device_del", &args, + rc = qmp_synchronous_send(qmp, "device_del", args, NULL, NULL, qmp->timeout); - flexarray_free(parameters); libxl__qmp_close(qmp); return rc; } @@ -882,56 +866,43 @@ int libxl__qmp_pci_del(libxl__gc *gc, int domid, libxl_device_pci *pcidev) int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename) { libxl__qmp_handler *qmp = NULL; - flexarray_t *parameters = NULL; - libxl_key_value_list args = NULL; + libxl__json_object *args = NULL; int rc = 0; qmp = libxl__qmp_initialize(gc, domid); if (!qmp) return ERROR_FAIL; - parameters = flexarray_make(2, 1); - if (!parameters) { - rc = ERROR_NOMEM; - goto out; - } - flexarray_append_pair(parameters, "filename", (char *)filename); - args = libxl__xs_kvs_of_flexarray(gc, parameters, parameters->count); + qmp_parameters_add_string(gc, &args, "filename", (char *)filename); if (!args) { rc = ERROR_NOMEM; - goto out2; + goto out; } - rc = qmp_synchronous_send(qmp, "xen-save-devices-state", &args, + rc = qmp_synchronous_send(qmp, "xen-save-devices-state", args, NULL, NULL, qmp->timeout); -out2: - flexarray_free(parameters); out: libxl__qmp_close(qmp); return rc; + } static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp, char *device, char *target, char *arg) { - flexarray_t *parameters = NULL; - libxl_key_value_list args = NULL; + libxl__json_object *args = NULL; int rc = 0; - parameters = flexarray_make(6, 1); - flexarray_append_pair(parameters, "device", device); - flexarray_append_pair(parameters, "target", target); - if (arg) - flexarray_append_pair(parameters, "arg", arg); - args = libxl__xs_kvs_of_flexarray(gc, parameters, parameters->count); - if (!args) - return ERROR_NOMEM; + qmp_parameters_add_string(gc, &args, "device", device); + qmp_parameters_add_string(gc, &args, "target", target); + if (arg) { + qmp_parameters_add_string(gc, &args, "arg", arg); + } - rc = qmp_synchronous_send(qmp, "change", &args, + rc = qmp_synchronous_send(qmp, "change", args, NULL, NULL, qmp->timeout); - flexarray_free(parameters); return rc; } -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 07/10] libxl_qmp: Simplify run of single QMP commands.
This new function connects to QEMU, sends the command and disconnects. Backport of xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693134 -3600 : Node ID f3890916496445c97d6778d6c986b0270ff707f2 : Parent be5d014f91dfbd67afacc3385c265243794a246f --- tools/libxl/libxl_qmp.c | 77 +++++++++++++--------------------------------- 1 files changed, 22 insertions(+), 55 deletions(-) diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c index 605e8f3..b09bf13 100644 --- a/tools/libxl/libxl_qmp.c +++ b/tools/libxl/libxl_qmp.c @@ -798,6 +798,23 @@ out: return rc; } +static int qmp_run_command(libxl__gc *gc, int domid, + const char *cmd, libxl__json_object *args, + qmp_callback_t callback, void *opaque) +{ + libxl__qmp_handler *qmp = NULL; + int rc = 0; + + qmp = libxl__qmp_initialize(gc, domid); + if (!qmp) + return ERROR_FAIL; + + rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout); + + libxl__qmp_close(qmp); + return rc; +} + int libxl__qmp_pci_add(libxl__gc *gc, int domid, libxl_device_pci *pcidev) { libxl__qmp_handler *qmp = NULL; @@ -836,21 +853,10 @@ int libxl__qmp_pci_add(libxl__gc *gc, int domid, libxl_device_pci *pcidev) static int qmp_device_del(libxl__gc *gc, int domid, char *id) { - libxl__qmp_handler *qmp = NULL; libxl__json_object *args = NULL; - int rc = 0; - - qmp = libxl__qmp_initialize(gc, domid); - if (!qmp) - return ERROR_FAIL; qmp_parameters_add_string(gc, &args, "id", id); - - rc = qmp_synchronous_send(qmp, "device_del", args, - NULL, NULL, qmp->timeout); - - libxl__qmp_close(qmp); - return rc; + return qmp_run_command(gc, domid, "device_del", args, NULL, NULL); } int libxl__qmp_pci_del(libxl__gc *gc, int domid, libxl_device_pci *pcidev) @@ -865,27 +871,10 @@ int libxl__qmp_pci_del(libxl__gc *gc, int domid, libxl_device_pci *pcidev) int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename) { - libxl__qmp_handler *qmp = NULL; libxl__json_object *args = NULL; - int rc = 0; - - qmp = libxl__qmp_initialize(gc, domid); - if (!qmp) - return ERROR_FAIL; - - qmp_parameters_add_string(gc, &args, "filename", (char *)filename); - if (!args) { - rc = ERROR_NOMEM; - goto out; - } - - rc = qmp_synchronous_send(qmp, "xen-save-devices-state", args, - NULL, NULL, qmp->timeout); - -out: - libxl__qmp_close(qmp); - return rc; + return qmp_run_command(gc, domid, "xen-save-devices-state", args, + NULL, NULL); } static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp, @@ -908,34 +897,12 @@ static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp, int libxl__qmp_stop(libxl__gc *gc, int domid) { - libxl__qmp_handler *qmp = NULL; - int rc = 0; - - qmp = libxl__qmp_initialize(gc, domid); - if (!qmp) - return ERROR_FAIL; - - rc = qmp_synchronous_send(qmp, "stop", NULL, - NULL, NULL, qmp->timeout); - - libxl__qmp_close(qmp); - return rc; + return qmp_run_command(gc, domid, "stop", NULL, NULL, NULL); } int libxl__qmp_resume(libxl__gc *gc, int domid) { - libxl__qmp_handler *qmp = NULL; - int rc = 0; - - qmp = libxl__qmp_initialize(gc, domid); - if (!qmp) - return ERROR_FAIL; - - rc = qmp_synchronous_send(qmp, "cont", NULL, - NULL, NULL, qmp->timeout); - - libxl__qmp_close(qmp); - return rc; + return qmp_run_command(gc, domid, "cont", NULL, NULL, NULL); } int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid, -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 08/10] libxl_qmp: Introduce libxl__qmp_set_global_dirty_log.
This function will enable or disable the global dirty log on QEMU, used during a migration. Backport of xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693135 -3600 : Node ID d4aec9eff7e6d15c2805957af620c82555553b3e : Parent f3890916496445c97d6778d6c986b0270ff707f2 --- tools/libxl/libxl_internal.h | 2 ++ tools/libxl/libxl_qmp.c | 12 ++++++++++-- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index b00ff61..f658562 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1400,6 +1400,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid); _hidden int libxl__qmp_resume(libxl__gc *gc, int domid); /* Save current QEMU state into fd. */ _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename); +/* Set dirty bitmap logging status */ +_hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable); /* close and free the QMP handler */ _hidden void libxl__qmp_close(libxl__qmp_handler *qmp); /* remove the socket file, if the file has already been removed, diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c index b09bf13..ac10f20 100644 --- a/tools/libxl/libxl_qmp.c +++ b/tools/libxl/libxl_qmp.c @@ -658,7 +658,6 @@ static void qmp_parameters_add_string(libxl__gc *gc, qmp_parameters_common_add(gc, param, name, obj); } -#if 0 static void qmp_parameters_add_bool(libxl__gc *gc, libxl__json_object **param, const char *name, bool b) @@ -669,7 +668,6 @@ static void qmp_parameters_add_bool(libxl__gc *gc, obj->u.b = b; qmp_parameters_common_add(gc, param, name, obj); } -#endif #define QMP_PARAMETERS_SPRINTF(args, name, format, ...) \ qmp_parameters_add_string(gc, args, name, \ @@ -905,6 +903,16 @@ int libxl__qmp_resume(libxl__gc *gc, int domid) return qmp_run_command(gc, domid, "cont", NULL, NULL, NULL); } +int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable) +{ + libxl__json_object *args = NULL; + + qmp_parameters_add_bool(gc, &args, "enable", enable); + + return qmp_run_command(gc, domid, "xen-set-global-dirty-log", args, + NULL, NULL); +} + int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid, const libxl_domain_config *guest_config) { -- 1.7.4.1
Alex Bligh
2012-Dec-13 10:57 UTC
[PATCH 09/10] libxl_dom: Call the right switch logdirty for the right DM.
This patch dispatch the switch logdirty call depending on which device model version is running. The call to qemu-xen right now is synchronous, not like the one to qemu-xen-traditional. Backport of xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693136 -3600 : Node ID 08fac5c2bf3dcbc493ce45091383f6ce1938f369 : Parent d4aec9eff7e6d15c2805957af620c82555553b3e --- tools/libxl/libxl_dom.c | 45 ++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 42 insertions(+), 3 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index e1de832..95da18e 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -685,10 +685,10 @@ static void logdirty_init(libxl__logdirty_switch *lds) libxl__ev_time_init(&lds->timeout); } -void libxl__domain_suspend_common_switch_qemu_logdirty - (int domid, unsigned enable, void *user) +static void domain_suspend_switch_qemu_xen_traditional_logdirty + (int domid, unsigned enable, + libxl__save_helper_state *shs) { - libxl__save_helper_state *shs = user; libxl__egc *egc = shs->egc; libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); libxl__logdirty_switch *lds = &dss->logdirty; @@ -756,6 +756,45 @@ void libxl__domain_suspend_common_switch_qemu_logdirty switch_logdirty_done(egc,dss,-1); } +static void domain_suspend_switch_qemu_xen_logdirty + (int domid, unsigned enable, + libxl__save_helper_state *shs) +{ + libxl__egc *egc = shs->egc; + libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); + STATE_AO_GC(dss->ao); + int rc; + + rc = libxl__qmp_set_global_dirty_log(gc, domid, enable); + if (!rc) { + libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0); + } else { + LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc); + libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1); + } +} + +void libxl__domain_suspend_common_switch_qemu_logdirty + (int domid, unsigned enable, void *user) +{ + libxl__save_helper_state *shs = user; + libxl__egc *egc = shs->egc; + libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); + STATE_AO_GC(dss->ao); + + switch (libxl__device_model_version_running(gc, domid)) { + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: + domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs); + break; + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: + domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs); + break; + default: + LOG(ERROR,"logdirty switch failed" + ", no valid device model version found, aborting suspend"); + libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1); + } +} static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev, const struct timeval *requested_abs) { -- 1.7.4.1
Backport of xen-unstable patch: : HG changeset patch : User Anthony PERARD <anthony.perard@citrix.com> : Date 1349693136 -3600 : Node ID 0995890022391682a2499a202c3c8608e1d3780a : Parent 08fac5c2bf3dcbc493ce45091383f6ce1938f369 --- tools/libxl/libxl.c | 17 ----------------- 1 files changed, 0 insertions(+), 17 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 4b4c5b0..9b14364 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -768,23 +768,6 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags, goto out_err; } - if (type == LIBXL_DOMAIN_TYPE_HVM && flags & LIBXL_SUSPEND_LIVE) { - switch (libxl__device_model_version_running(gc, domid)) { - case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: - LOG(ERROR, - "cannot live migrate HVM domains with qemu-xen device-model"); - rc = ERROR_FAIL; - goto out_err; - case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: - /* No problem */ - break; - case -1: - rc = ERROR_FAIL; - goto out_err; - default: abort(); - } - } - libxl__domain_suspend_state *dss; GCNEW(dss); -- 1.7.4.1
Alex Bligh
2012-Dec-13 11:03 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
Apologies, I missed the vital information that this is a patch for Xen-4.2, against current xen-4.2-testing. --On 13 December 2012 10:56:58 +0000 Alex Bligh <alex@alex.org.uk> wrote:> This is a COMPILE TESTED ONLY RFC for a backport of the libxl elements > of: > > http://marc.info/?l=xen-devel&m=134944750724252 > > Stefano Stabellini said he would look at backporting the qemu-xen > side. > > My first question is whether this is the right approach: do we want > to include the JSON changes too (presumably it changes the API). > I am not 100% convinced they are necessary to get live migrate > working. > > DO NOT RUN THIS CODE ON ANYTHING YOU CARE ABOUT. > >-- Alex Bligh
Ian Campbell
2012-Dec-13 11:38 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
Adding Ian J who looks after the tools portion of the stable branches. I''m personally not convinced that this change is appropriate for a stable branch backport, it''s a pretty large feature patch after all and not a bug fix. On Thu, 2012-12-13 at 10:56 +0000, Alex Bligh wrote:> This is a COMPILE TESTED ONLY RFC for a backport of the libxl elements > of: > > http://marc.info/?l=xen-devel&m=134944750724252 > > Stefano Stabellini said he would look at backporting the qemu-xen > side. > > My first question is whether this is the right approach: do we want > to include the JSON changes too (presumably it changes the API). > I am not 100% convinced they are necessary to get live migrate > working.You certainly need the JSON changes which let libxl make the QMP call to enable logdirty in the qemu process. I don''t know about the rest, presumably they only change the internal API and not the external API?> DO NOT RUN THIS CODE ON ANYTHING YOU CARE ABOUT. >
> Subject: libxl: qemu trad logdirty: Tolerate ENOENT on ret path > > It can happen in error conditions that lds->ret_path doesn''t exist, > and libxl__xs_read_checked signals this by setting got_ret=NULL. If > this happens, fail without crashing. > > Reported-by: Alex Bligh <alex@alex.org.uk>, > Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>Acked + applied, thanks.
Alex Bligh
2012-Dec-13 15:06 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
Ian, --On 13 December 2012 11:38:31 +0000 Ian Campbell <Ian.Campbell@citrix.com> wrote:> Adding Ian J who looks after the tools portion of the stable branches. > > I''m personally not convinced that this change is appropriate for a > stable branch backport, it''s a pretty large feature patch after all and > not a bug fix.My prime concern is for our own usage, which I can easily address using a local build. But beyond that, my concern is that Xen-4.3 is (as advertised) going to have qemu-xen as the default device model; no one needing HVM can currently use it at all (if they need live migrate which I guess most do). That''s a bit of a jump. Xen docs currently same qemu-xen is supported. Nowhere does it say "save under HVM you can''t live migrate". I''d argue lack of live migration for HVM *is* a bug. If you ignore the JSON stuff, the patch is quite small. We could always only enable it as a compile option... -- Alex Bligh
Ian Jackson
2012-Dec-13 15:09 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
Alex Bligh writes ("Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model"):> But beyond that, my concern is that Xen-4.3 is (as advertised) going > to have qemu-xen as the default device model; no one needing HVM > can currently use it at all (if they need live migrate which I guess > most do). That''s a bit of a jump. > > Xen docs currently same qemu-xen is supported. Nowhere does it say > "save under HVM you can''t live migrate". I''d argue lack of live migration > for HVM *is* a bug. > > If you ignore the JSON stuff, the patch is quite small. We could always > only enable it as a compile option...I think this is a very good argument. Ian.
Ian Campbell
2012-Dec-17 12:01 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
On Thu, 2012-12-13 at 15:06 +0000, Alex Bligh wrote:> But beyond that, my concern is that Xen-4.3 is (as advertised) going > to have qemu-xen as the default device model; no one needing HVM > can currently use it at all (if they need live migrate which I guess > most do). That''s a bit of a jump.> Xen docs currently same qemu-xen is supported. Nowhere does it say > "save under HVM you can''t live migrate". I''d argue lack of live migration > for HVM *is* a bug.It''s definitely at least a bug in the documentation -- this isn''t a feature which was forgotten about, it was explicitly decided hadn''t met the 4.2 deadline and wasn''t going to be ready in time to wait for. This should have been documented and in the release notes etc, sorry. We did at least manage to tag it tech preview in http://wiki.xen.org/wiki/Xen_Release_Features which implies that it is not yet fully formed.> If you ignore the JSON stuff, the patch is quite small. We could always > only enable it as a compile option...Ian.
Alex Bligh
2012-Dec-17 17:02 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
Ian, --On 17 December 2012 12:01:52 +0000 Ian Campbell <Ian.Campbell@citrix.com> wrote:> It''s definitely at least a bug in the documentation -- this isn''t a > feature which was forgotten about, it was explicitly decided hadn''t met > the 4.2 deadline and wasn''t going to be ready in time to wait for. This > should have been documented and in the release notes etc, sorry. > > We did at least manage to tag it tech preview in > http://wiki.xen.org/wiki/Xen_Release_Features which implies that it is > not yet fully formed.It is indeed tagged as ''tech preview''. But given that: a) HVM is hardly uncommon b) live-migrate is pretty essential c) qemu-xen device model is the default next time around and the /only/ way you can achieve certain things (e.g. rebaseable snapshots) d) the code is already written (at least for -unstable) and the patch is smallish (particularly if the JSON stuff is ignored) ... I''d suggest a backport to 4.2.x is a reasonable request. I''ve had a first go at the libxl part and Stefano got their first and volunteered to do the qemu bit (if he doesn''t have time I can have a go at that too). If necessary we can guard the code either with a compile time switch or (as the wiki page suggested to me) a hypervisor command line switch like TMEM and AVX, which if it''s not enabled would just error the call in the same way libxl does at the moment. -- Alex Bligh
Ian Campbell
2012-Dec-18 10:04 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
On Mon, 2012-12-17 at 17:02 +0000, Alex Bligh wrote:> Ian, > > --On 17 December 2012 12:01:52 +0000 Ian Campbell <Ian.Campbell@citrix.com> wrote: > > > It''s definitely at least a bug in the documentation -- this isn''t a > > feature which was forgotten about, it was explicitly decided hadn''t met > > the 4.2 deadline and wasn''t going to be ready in time to wait for. This > > should have been documented and in the release notes etc, sorry. > > > > We did at least manage to tag it tech preview in > > http://wiki.xen.org/wiki/Xen_Release_Features which implies that it is > > not yet fully formed. > > It is indeed tagged as ''tech preview''. > > But given that: > a) HVM is hardly uncommon > b) live-migrate is pretty essential > c) qemu-xen device model is the default next time around and the /only/ > way you can achieve certain things (e.g. rebaseable snapshots) > d) the code is already written (at least for -unstable) and the patch is > smallish (particularly if the JSON stuff is ignored)These are all true, but none of them are an argument that this is a bug fix rather than a feature request. Our general policy is not to backport features because the risk of regressions is higher, so feature backports need to be considered more carefully than a bugfix.> ... I''d suggest a backport to 4.2.x is a reasonable request.It''s certainly a reasonable request, and I''m not suggesting we shouldn''t consider it. My point is that it needs more careful consideration than a straight bugfix backport. For example your list above doesn''t include the flip sides which are: * Migration of PV guests * Migration of HVM Qemu-traditional guests * HVM Qemu-xen guests for users who don''t care about migration All of these run the risk of regression. Is this new feature worth that risk? Obviously you think so, but I''m not so sure.> I''ve had a > first go at the libxl part and Stefano got their first and volunteered > to do the qemu bit (if he doesn''t have time I can have a go at that too).It''s appreciated but ability and or willingness to do the work is only one part of the equation.> If necessary we can guard the code either with a compile time switch or > (as the wiki page suggested to me) a hypervisor command line switch like > TMEM and AVX, which if it''s not enabled would just error the call in the > same way libxl does at the moment.I don''t think we should have a compile time switch, we should either backport this new feature as a fully supported thing or not at all. Adding conditional features like this simply divides our testing resource. A hypervisor switch isn''t an option because the hypervisor isn''t involved in this decision at all. Ian.
George Dunlap
2012-Dec-18 11:04 UTC
Re: [RFC] Enabling live-migrate on HVM on qemu-xen device model
On Tue, Dec 18, 2012 at 10:04 AM, Ian Campbell <Ian.Campbell@citrix.com>wrote:> For example your list above doesn''t include the flip sides which are: > * Migration of PV guests > * Migration of HVM Qemu-traditional guests > * HVM Qemu-xen guests for users who don''t care about migration > > All of these run the risk of regression. Is this new feature worth that > risk? Obviously you think so, but I''m not so sure. >Looking just at the titles of the patch series, it seems that there''s only one patch which touches code not specific to qemu-xen. Assuming that''s just an "if(qemu_xen_upstream) { X } else { Y }", it seems like it would be a pretty low risk for #1 and #2. So the only potential risk would be if the changes to qemu-xen introduces a regression in other code. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel