Shriram Rajagopalan
2013-Nov-24 16:52 UTC
[PATCH 0 of 2] libxl: AO support in domain suspend ops
Make libxl''s suspend callback be asynchronous and refactor the code accordingly. Ian J''s patch on introducing the async suspend stub functions has been merged into patch 2. [1/2] Add an asynchronous version of the common suspend core (libxl__domain_suspend_common_callback), while retaining the existing synchronous version. Add a callout to Remus (libxl__domain_suspend_callback_remus) that gets called post domain suspend. Remus specific post-suspend tasks are implemented here. [2/2] Remove the synchronous version of libxl__domain_suspend_common_callback. Also remove the separate entry point for Remus (libxl__remus_domain_suspend_callback). Instead, introduce a common entry point (libxl__domain_suspend_callback). N.B. This series depends on the remus network buffering patch series. thanks shriram
Shriram Rajagopalan
2013-Nov-24 16:52 UTC
[PATCH 1 of 2] libxl: async version of domain suspend
# HG changeset patch # User Shriram Rajagopalan <rshriram@cs.ubc.ca> # Date 1385309833 28800 # Node ID c7fdc18830cd46018ced3afb1cde23121a716a6e # Parent 987d21e4498495e7bbd794bf6eb8d18a8219a081 libxl: async version of domain suspend Add functions that can be used to implement an asynchronous version of domain suspend. This patch does not replace the existing synchronous implementation of domain suspend. The code introduced in this patch is redundant, based on libxl__domain_suspend_common_callback. Also add a callout to Remus (libxl__domain_suspend_callback_remus) that gets called post domain suspend. Remus specific post-suspend tasks can be implemented here. Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_dom.c --- a/tools/libxl/libxl_dom.c Sun Nov 24 08:09:19 2013 -0800 +++ b/tools/libxl/libxl_dom.c Sun Nov 24 08:17:13 2013 -0800 @@ -1016,6 +1016,196 @@ int libxl__domain_resume_device_model(li return 0; } +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t domid); +static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs); +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs); +static void wait_for_suspend_req_ack(libxl__domain_suspend_state *dss); +static void wait_for_guest_suspend(libxl__domain_suspend_state *dss); +static void guest_suspended(libxl__domain_suspend_state *dss); +static int libxl__domain_suspend_callback_remus(libxl__domain_suspend_state *dss); + +/* Returns 0 if suspend request was cancelled and the guest did not + * respond during the cancellation process (see comments in function for + * explanation of race conditions). + * Returns 1 if guest had finally acked suspend request, during the + * cancellation process. + */ +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t domid) +{ + char *state = NULL; + xs_transaction_t t; + + /* + * Guest appears to not be responding. Cancel the suspend + * request. + * + * We re-read the suspend node and clear it within a + * transaction in order to handle the case where we race + * against the guest catching up and acknowledging the request + * at the last minute. + */ + LOG(ERROR, "guest didn''t acknowledge suspend, cancelling request"); + retry_transaction: + t = xs_transaction_start(CTX->xsh); + + state = libxl__domain_pvcontrol_read(gc, t, domid); + if (!state) state = ""; + + if (!strcmp(state, "suspend")) + libxl__domain_pvcontrol_write(gc, t, domid, ""); + + if (!xs_transaction_end(CTX->xsh, t, 0)) + if (errno == EAGAIN) + goto retry_transaction; + + /* + * Final check for guest acknowledgement. The guest may have + * acknowledged while we were cancelling the request in which + * case we lost the race while cancelling and should continue. + */ + if (!strcmp(state, "suspend")) { + LOG(ERROR, "guest didn''t acknowledge suspend, request cancelled"); + return 0; + } + + return 1; +} + +static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs) +{ + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); + STATE_AO_GC(dss->ao); + + libxl__ev_time_deregister(gc, &dss->timeout); + dss->watchdog--; + wait_for_suspend_req_ack(dss); +} + +static void wait_for_suspend_req_ack(libxl__domain_suspend_state *dss) +{ + char *state = NULL; + int ret; + int wait_time; + STATE_AO_GC(dss->ao); + + state = libxl__domain_pvcontrol_read(gc, XBT_NULL, dss->domid); + + if (!state) state = ""; + + if (!strcmp(state, "suspend") && dss->watchdog > 0) { + /* The first timeout is a short one (10ms), hoping that the + * guest would respond immediately. The subsequent timeouts + * are longer (40ms). + */ + wait_time = dss->watchdog_starting ? 10 : 40; + dss->watchdog_starting = 0; + ret = libxl__ev_time_register_rel(gc, &dss->timeout, + wait_for_ack_timeout, wait_time); + if (ret) { + LOG(ERROR, "unable to register timeout event to wait for" + " guest to suspend ack. Cancelling suspend request"); + goto cancel_req; + } + return; + } + + if (strcmp(state, "suspend")) { + ack: + LOG(DEBUG, "guest acknowledged suspend request"); + dss->guest_responded = 1; + dss->watchdog = 60; + dss->watchdog_starting = 1; + wait_for_guest_suspend(dss); + return; + } + + cancel_req: //either timer reg. failed or watchdog loop ended + if (cancel_xenbus_suspend_request(gc, dss->domid)) + goto ack; + + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, &dss->shs, 0); +} + +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs) +{ + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); + STATE_AO_GC(dss->ao); + + libxl__ev_time_deregister(gc, &dss->timeout); + dss->watchdog--; + wait_for_guest_suspend(dss); +} + +static void wait_for_guest_suspend(libxl__domain_suspend_state *dss) +{ + int ret; + int wait_time; + xc_domaininfo_t info; + STATE_AO_GC(dss->ao); + + ret = xc_domain_getinfolist(CTX->xch, dss->domid, 1, &info); + if (ret == 1 && info.domain == dss->domid && + (info.flags & XEN_DOMINF_shutdown)) { + int shutdown_reason; + + shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift) + & XEN_DOMINF_shutdownmask; + if (shutdown_reason == SHUTDOWN_suspend) { + LOG(DEBUG, "guest has suspended"); + guest_suspended(dss); + return; + } + } + + if (dss->watchdog > 0) { + /* The first timeout is a short one (10ms), hoping that the + * guest would respond immediately. The subsequent timeouts + * are longer (40ms). + */ + wait_time = dss->watchdog_starting ? 10 : 40; + dss->watchdog_starting = 0; + ret = libxl__ev_time_register_rel(gc, &dss->timeout, + wait_for_suspend_timeout, wait_time); + if (ret) { + LOG(ERROR, "unable to register timeout event to wait for" + " guest to suspend."); + goto err; + } + return; + } + + LOG(ERROR, "guest did not suspend"); + err: + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, + &dss->shs, 0); +} + +static void guest_suspended(libxl__domain_suspend_state *dss) +{ + int ret, ok = 1; + STATE_AO_GC(dss->ao); + if (dss->hvm) { + ret = libxl__domain_suspend_device_model(gc, dss); + if (ret) { + LOG(ERROR, "libxl__domain_suspend_device_model failed " + "ret=%d", ret); + ok = 0; + goto end; + } + } + + if (dss->remus_ctx) + ok = libxl__domain_suspend_callback_remus(dss); + + end: + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, + &dss->shs, ok); +} + int libxl__domain_suspend_common_callback(void *user) { libxl__save_helper_state *shs = user; @@ -1285,6 +1475,28 @@ void libxl__remus_teardown_done(libxl__e /*----- remus callbacks -----*/ +/* Remus specific tasks to be executed post domain suspend */ +static int libxl__domain_suspend_callback_remus(libxl__domain_suspend_state *dss) +{ + libxl__remus_ctx *remus_ctx = dss->remus_ctx; + STATE_AO_GC(dss->ao); + + /* REMUS TODO: Issue disk checkpoint reqs. */ + if (!remus_ctx->netbuf_ctx) + return 1; + + /* Start a new network + * buffer for the next epoch. If this operation fails, then act + * as though domain suspend failed -- libxc exits its infinite + * loop and ultimately, the replication stops. + */ + if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid, + remus_ctx)) + return 0; + + return 1; +} + static int libxl__remus_domain_suspend_callback(void *data) { libxl__save_helper_state *shs = data; @@ -1416,6 +1628,7 @@ void libxl__domain_suspend(libxl__egc *e &dss->shs.callbacks.save.a; logdirty_init(&dss->logdirty); + libxl__ev_time_init(&dss->timeout); switch (type) { case LIBXL_DOMAIN_TYPE_HVM: { diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_internal.h --- a/tools/libxl/libxl_internal.h Sun Nov 24 08:09:19 2013 -0800 +++ b/tools/libxl/libxl_internal.h Sun Nov 24 08:17:13 2013 -0800 @@ -2352,6 +2352,9 @@ struct libxl__domain_suspend_state { int hvm; int xcflags; int guest_responded; + int watchdog; + int watchdog_starting; + libxl__ev_time timeout; const char *dm_savefile; libxl__save_helper_state shs; libxl__logdirty_switch logdirty;
Shriram Rajagopalan
2013-Nov-24 16:52 UTC
[PATCH 2 of 2] libxl: make libxl__domain_suspend_callback be asynchronous
# HG changeset patch # User Shriram Rajagopalan <rshriram@cs.ubc.ca> # Date 1385310114 28800 # Node ID 2eddacbd701dce5d3bb395ff3c19c5bc7c9cfa68 # Parent c7fdc18830cd46018ced3afb1cde23121a716a6e libxl: make libxl__domain_suspend_callback be asynchronous Mark the suspend callback as asynchronous in the helper stub generator (libxl_save_msgs_gen.pl). libxl__domain_suspend_common_callback, the common synchronous core, which used to be provided directly as the callback function for the helper machinery, becomes libxl__domain_suspend_callback. It can now take a typesafe parameter. This function is further refactored to use the previously introduced code that relies on libxl''s event timer machinery instead of usleep calls to implement wait loops. The remus version of suspend callback is no longer called directly by the helper machinery. Instead, after a domain is successfully suspended, a remus callback (libxl__domain_suspend_callback_remus) is called, wherein remus specific tasks (e.g., creating a new network buffer) are executed. Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> diff -r c7fdc18830cd -r 2eddacbd701d tools/libxl/libxl_dom.c --- a/tools/libxl/libxl_dom.c Sun Nov 24 08:17:13 2013 -0800 +++ b/tools/libxl/libxl_dom.c Sun Nov 24 08:21:54 2013 -0800 @@ -1206,16 +1206,13 @@ static void guest_suspended(libxl__domai &dss->shs, ok); } -int libxl__domain_suspend_common_callback(void *user) +void libxl__domain_suspend_callback(void *data) { - libxl__save_helper_state *shs = user; + libxl__save_helper_state *shs = data; libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); STATE_AO_GC(dss->ao); unsigned long hvm_s_state = 0, hvm_pvdrv = 0; int ret; - char *state = "suspend"; - int watchdog; - xs_transaction_t t; /* Convenience aliases */ const uint32_t domid = dss->domid; @@ -1231,117 +1228,41 @@ int libxl__domain_suspend_common_callbac ret = xc_evtchn_notify(dss->xce, dss->suspend_eventchn); if (ret < 0) { LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret); - return 0; + goto err; } ret = xc_await_suspend(CTX->xch, dss->xce, dss->suspend_eventchn); if (ret < 0) { LOG(ERROR, "xc_await_suspend failed ret=%d", ret); - return 0; + goto err; } dss->guest_responded = 1; - goto guest_suspended; + guest_suspended(dss); + return; } + dss->watchdog = 60; + dss->watchdog_starting = 1; if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) { LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain"); ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend); if (ret < 0) { LOGE(ERROR, "xc_domain_shutdown failed"); - return 0; + goto err; } /* The guest does not (need to) respond to this sort of request. */ dss->guest_responded = 1; + wait_for_guest_suspend(dss); } else { LOG(DEBUG, "issuing %s suspend request via XenBus control node", dss->hvm ? "PVHVM" : "PV"); + libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend"); + LOG(DEBUG, "wait for the guest to acknowledge suspend request"); + wait_for_suspend_req_ack(dss); + } + return; - libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend"); - - LOG(DEBUG, "wait for the guest to acknowledge suspend request"); - watchdog = 60; - while (!strcmp(state, "suspend") && watchdog > 0) { - usleep(100000); - - state = libxl__domain_pvcontrol_read(gc, XBT_NULL, domid); - if (!state) state = ""; - - watchdog--; - } - - /* - * Guest appears to not be responding. Cancel the suspend - * request. - * - * We re-read the suspend node and clear it within a - * transaction in order to handle the case where we race - * against the guest catching up and acknowledging the request - * at the last minute. - */ - if (!strcmp(state, "suspend")) { - LOG(ERROR, "guest didn''t acknowledge suspend, cancelling request"); - retry_transaction: - t = xs_transaction_start(CTX->xsh); - - state = libxl__domain_pvcontrol_read(gc, t, domid); - if (!state) state = ""; - - if (!strcmp(state, "suspend")) - libxl__domain_pvcontrol_write(gc, t, domid, ""); - - if (!xs_transaction_end(CTX->xsh, t, 0)) - if (errno == EAGAIN) - goto retry_transaction; - - } - - /* - * Final check for guest acknowledgement. The guest may have - * acknowledged while we were cancelling the request in which - * case we lost the race while cancelling and should continue. - */ - if (!strcmp(state, "suspend")) { - LOG(ERROR, "guest didn''t acknowledge suspend, request cancelled"); - return 0; - } - - LOG(DEBUG, "guest acknowledged suspend request"); - dss->guest_responded = 1; - } - - LOG(DEBUG, "wait for the guest to suspend"); - watchdog = 60; - while (watchdog > 0) { - xc_domaininfo_t info; - - usleep(100000); - ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info); - if (ret == 1 && info.domain == domid && - (info.flags & XEN_DOMINF_shutdown)) { - int shutdown_reason; - - shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift) - & XEN_DOMINF_shutdownmask; - if (shutdown_reason == SHUTDOWN_suspend) { - LOG(DEBUG, "guest has suspended"); - goto guest_suspended; - } - } - - watchdog--; - } - - LOG(ERROR, "guest did not suspend"); - return 0; - - guest_suspended: - if (dss->hvm) { - ret = libxl__domain_suspend_device_model(gc, dss); - if (ret) { - LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", ret); - return 0; - } - } - return 1; + err: + libxl__xc_domain_saverestore_async_callback_done(shs->egc, shs, 0); } static inline char *physmap_path(libxl__gc *gc, uint32_t domid, @@ -1497,31 +1418,6 @@ static int libxl__domain_suspend_callbac return 1; } -static int libxl__remus_domain_suspend_callback(void *data) -{ - libxl__save_helper_state *shs = data; - libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs); - libxl__remus_ctx *remus_ctx = dss->remus_ctx; - STATE_AO_GC(dss->ao); - - /* REMUS TODO: Issue disk checkpoint reqs. */ - int ok = libxl__domain_suspend_common_callback(data); - - if (!remus_ctx->netbuf_ctx || !ok) goto out; - - /* The domain was suspended successfully. Start a new network - * buffer for the next epoch. If this operation fails, then act - * as though domain suspend failed -- libxc exits its infinite - * loop and ultimately, the replication stops. - */ - if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid, - remus_ctx)) - ok = 0; - - out: - return ok; -} - static int libxl__remus_domain_resume_callback(void *data) { libxl__save_helper_state *shs = data; @@ -1680,11 +1576,10 @@ void libxl__domain_suspend(libxl__egc *e memset(callbacks, 0, sizeof(*callbacks)); if (dss->remus_ctx != NULL) { - callbacks->suspend = libxl__remus_domain_suspend_callback; callbacks->postcopy = libxl__remus_domain_resume_callback; callbacks->checkpoint = libxl__remus_domain_checkpoint_callback; - } else - callbacks->suspend = libxl__domain_suspend_common_callback; + } + callbacks->suspend = libxl__domain_suspend_callback; callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty; dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save; diff -r c7fdc18830cd -r 2eddacbd701d tools/libxl/libxl_internal.h --- a/tools/libxl/libxl_internal.h Sun Nov 24 08:17:13 2013 -0800 +++ b/tools/libxl/libxl_internal.h Sun Nov 24 08:21:54 2013 -0800 @@ -2650,7 +2650,7 @@ _hidden void libxl__xc_domain_save_done( void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc, libxl__save_helper_state *shs, int return_value); -_hidden int libxl__domain_suspend_common_callback(void *data); +_hidden void libxl__domain_suspend_callback(void *data); _hidden void libxl__domain_suspend_common_switch_qemu_logdirty (int domid, unsigned int enable, void *data); _hidden int libxl__toolstack_save(uint32_t domid, uint8_t **buf, diff -r c7fdc18830cd -r 2eddacbd701d tools/libxl/libxl_save_msgs_gen.pl --- a/tools/libxl/libxl_save_msgs_gen.pl Sun Nov 24 08:17:13 2013 -0800 +++ b/tools/libxl/libxl_save_msgs_gen.pl Sun Nov 24 08:21:54 2013 -0800 @@ -23,7 +23,7 @@ our @msgs = ( STRING doing_what), ''unsigned long'', ''done'', ''unsigned long'', ''total''] ], - [ 3, ''scxW'', "suspend", [] ], + [ 3, ''scxA'', "suspend", [] ], [ 4, ''scxW'', "postcopy", [] ], [ 5, ''scxA'', "checkpoint", [] ], [ 6, ''scxA'', "switch_qemu_logdirty", [qw(int domid
Dong, Eddie
2013-Dec-11 07:38 UTC
Re: [PATCH 1 of 2] libxl: async version of domain suspend
Hi Raj: How many patches are left to re-enable Remus in Xen? We are very eager to see all them to be in :) Thanks, Eddie -----Original Message----- From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Shriram Rajagopalan Sent: Monday, November 25, 2013 12:52 AM To: xen-devel@lists.xen.org Cc: Ian Jackson; Ian Campbell Subject: [Xen-devel] [PATCH 1 of 2] libxl: async version of domain suspend # HG changeset patch # User Shriram Rajagopalan <rshriram@cs.ubc.ca> # Date 1385309833 28800 # Node ID c7fdc18830cd46018ced3afb1cde23121a716a6e # Parent 987d21e4498495e7bbd794bf6eb8d18a8219a081 libxl: async version of domain suspend Add functions that can be used to implement an asynchronous version of domain suspend. This patch does not replace the existing synchronous implementation of domain suspend. The code introduced in this patch is redundant, based on libxl__domain_suspend_common_callback. Also add a callout to Remus (libxl__domain_suspend_callback_remus) that gets called post domain suspend. Remus specific post-suspend tasks can be implemented here. Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_dom.c --- a/tools/libxl/libxl_dom.c Sun Nov 24 08:09:19 2013 -0800 +++ b/tools/libxl/libxl_dom.c Sun Nov 24 08:17:13 2013 -0800 @@ -1016,6 +1016,196 @@ int libxl__domain_resume_device_model(li return 0; } +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t +domid); static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs); +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval +*requested_abs); static void +wait_for_suspend_req_ack(libxl__domain_suspend_state *dss); static void +wait_for_guest_suspend(libxl__domain_suspend_state *dss); static void +guest_suspended(libxl__domain_suspend_state *dss); static int +libxl__domain_suspend_callback_remus(libxl__domain_suspend_state *dss); + +/* Returns 0 if suspend request was cancelled and the guest did not + * respond during the cancellation process (see comments in function +for + * explanation of race conditions). + * Returns 1 if guest had finally acked suspend request, during the + * cancellation process. + */ +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t domid) +{ + char *state = NULL; + xs_transaction_t t; + + /* + * Guest appears to not be responding. Cancel the suspend + * request. + * + * We re-read the suspend node and clear it within a + * transaction in order to handle the case where we race + * against the guest catching up and acknowledging the request + * at the last minute. + */ + LOG(ERROR, "guest didn''t acknowledge suspend, cancelling request"); + retry_transaction: + t = xs_transaction_start(CTX->xsh); + + state = libxl__domain_pvcontrol_read(gc, t, domid); + if (!state) state = ""; + + if (!strcmp(state, "suspend")) + libxl__domain_pvcontrol_write(gc, t, domid, ""); + + if (!xs_transaction_end(CTX->xsh, t, 0)) + if (errno == EAGAIN) + goto retry_transaction; + + /* + * Final check for guest acknowledgement. The guest may have + * acknowledged while we were cancelling the request in which + * case we lost the race while cancelling and should continue. + */ + if (!strcmp(state, "suspend")) { + LOG(ERROR, "guest didn''t acknowledge suspend, request cancelled"); + return 0; + } + + return 1; +} + +static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval *requested_abs) { + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); + STATE_AO_GC(dss->ao); + + libxl__ev_time_deregister(gc, &dss->timeout); + dss->watchdog--; + wait_for_suspend_req_ack(dss); +} + +static void wait_for_suspend_req_ack(libxl__domain_suspend_state *dss) +{ + char *state = NULL; + int ret; + int wait_time; + STATE_AO_GC(dss->ao); + + state = libxl__domain_pvcontrol_read(gc, XBT_NULL, dss->domid); + + if (!state) state = ""; + + if (!strcmp(state, "suspend") && dss->watchdog > 0) { + /* The first timeout is a short one (10ms), hoping that the + * guest would respond immediately. The subsequent timeouts + * are longer (40ms). + */ + wait_time = dss->watchdog_starting ? 10 : 40; + dss->watchdog_starting = 0; + ret = libxl__ev_time_register_rel(gc, &dss->timeout, + wait_for_ack_timeout, wait_time); + if (ret) { + LOG(ERROR, "unable to register timeout event to wait for" + " guest to suspend ack. Cancelling suspend request"); + goto cancel_req; + } + return; + } + + if (strcmp(state, "suspend")) { + ack: + LOG(DEBUG, "guest acknowledged suspend request"); + dss->guest_responded = 1; + dss->watchdog = 60; + dss->watchdog_starting = 1; + wait_for_guest_suspend(dss); + return; + } + + cancel_req: //either timer reg. failed or watchdog loop ended + if (cancel_xenbus_suspend_request(gc, dss->domid)) + goto ack; + + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, +&dss->shs, 0); } + +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, + const struct timeval +*requested_abs) { + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); + STATE_AO_GC(dss->ao); + + libxl__ev_time_deregister(gc, &dss->timeout); + dss->watchdog--; + wait_for_guest_suspend(dss); +} + +static void wait_for_guest_suspend(libxl__domain_suspend_state *dss) { + int ret; + int wait_time; + xc_domaininfo_t info; + STATE_AO_GC(dss->ao); + + ret = xc_domain_getinfolist(CTX->xch, dss->domid, 1, &info); + if (ret == 1 && info.domain == dss->domid && + (info.flags & XEN_DOMINF_shutdown)) { + int shutdown_reason; + + shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift) + & XEN_DOMINF_shutdownmask; + if (shutdown_reason == SHUTDOWN_suspend) { + LOG(DEBUG, "guest has suspended"); + guest_suspended(dss); + return; + } + } + + if (dss->watchdog > 0) { + /* The first timeout is a short one (10ms), hoping that the + * guest would respond immediately. The subsequent timeouts + * are longer (40ms). + */ + wait_time = dss->watchdog_starting ? 10 : 40; + dss->watchdog_starting = 0; + ret = libxl__ev_time_register_rel(gc, &dss->timeout, + wait_for_suspend_timeout, wait_time); + if (ret) { + LOG(ERROR, "unable to register timeout event to wait for" + " guest to suspend."); + goto err; + } + return; + } + + LOG(ERROR, "guest did not suspend"); + err: + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, + &dss->shs, 0); } + +static void guest_suspended(libxl__domain_suspend_state *dss) { + int ret, ok = 1; + STATE_AO_GC(dss->ao); + if (dss->hvm) { + ret = libxl__domain_suspend_device_model(gc, dss); + if (ret) { + LOG(ERROR, "libxl__domain_suspend_device_model failed " + "ret=%d", ret); + ok = 0; + goto end; + } + } + + if (dss->remus_ctx) + ok = libxl__domain_suspend_callback_remus(dss); + + end: + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, + &dss->shs, ok); } + int libxl__domain_suspend_common_callback(void *user) { libxl__save_helper_state *shs = user; @@ -1285,6 +1475,28 @@ void libxl__remus_teardown_done(libxl__e /*----- remus callbacks -----*/ +/* Remus specific tasks to be executed post domain suspend */ static +int libxl__domain_suspend_callback_remus(libxl__domain_suspend_state +*dss) { + libxl__remus_ctx *remus_ctx = dss->remus_ctx; + STATE_AO_GC(dss->ao); + + /* REMUS TODO: Issue disk checkpoint reqs. */ + if (!remus_ctx->netbuf_ctx) + return 1; + + /* Start a new network + * buffer for the next epoch. If this operation fails, then act + * as though domain suspend failed -- libxc exits its infinite + * loop and ultimately, the replication stops. + */ + if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid, + remus_ctx)) + return 0; + + return 1; +} + static int libxl__remus_domain_suspend_callback(void *data) { libxl__save_helper_state *shs = data; @@ -1416,6 +1628,7 @@ void libxl__domain_suspend(libxl__egc *e &dss->shs.callbacks.save.a; logdirty_init(&dss->logdirty); + libxl__ev_time_init(&dss->timeout); switch (type) { case LIBXL_DOMAIN_TYPE_HVM: { diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_internal.h --- a/tools/libxl/libxl_internal.h Sun Nov 24 08:09:19 2013 -0800 +++ b/tools/libxl/libxl_internal.h Sun Nov 24 08:17:13 2013 -0800 @@ -2352,6 +2352,9 @@ struct libxl__domain_suspend_state { int hvm; int xcflags; int guest_responded; + int watchdog; + int watchdog_starting; + libxl__ev_time timeout; const char *dm_savefile; libxl__save_helper_state shs; libxl__logdirty_switch logdirty; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Shriram Rajagopalan
2013-Dec-11 17:59 UTC
Re: [PATCH 1 of 2] libxl: async version of domain suspend
On Sunday, November 24, 2013, Shriram Rajagopalan wrote:> # HG changeset patch > # User Shriram Rajagopalan <rshriram@cs.ubc.ca <javascript:;>> > # Date 1385309833 28800 > # Node ID c7fdc18830cd46018ced3afb1cde23121a716a6e > # Parent 987d21e4498495e7bbd794bf6eb8d18a8219a081 > libxl: async version of domain suspend > > Add functions that can be used to implement an asynchronous version > of domain suspend. This patch does not replace the existing synchronous > implementation of domain suspend. The code introduced in this patch is > redundant, based on libxl__domain_suspend_common_callback. > > Also add a callout to Remus (libxl__domain_suspend_callback_remus) that > gets called post domain suspend. Remus specific post-suspend tasks can be > implemented here. > > Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca <javascript:;>> > > diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_dom.c > --- a/tools/libxl/libxl_dom.c Sun Nov 24 08:09:19 2013 -0800 > +++ b/tools/libxl/libxl_dom.c Sun Nov 24 08:17:13 2013 -0800 > @@ -1016,6 +1016,196 @@ int libxl__domain_resume_device_model(li > return 0; > } > > +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t domid); > +static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, > + const struct timeval *requested_abs); > +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, > + const struct timeval *requested_abs); > +static void wait_for_suspend_req_ack(libxl__domain_suspend_state *dss); > +static void wait_for_guest_suspend(libxl__domain_suspend_state *dss); > +static void guest_suspended(libxl__domain_suspend_state *dss); > +static int > libxl__domain_suspend_callback_remus(libxl__domain_suspend_state *dss); > + > +/* Returns 0 if suspend request was cancelled and the guest did not > + * respond during the cancellation process (see comments in function for > + * explanation of race conditions). > + * Returns 1 if guest had finally acked suspend request, during the > + * cancellation process. > + */ > +static int cancel_xenbus_suspend_request(libxl__gc *gc, uint32_t domid) > +{ > + char *state = NULL; > + xs_transaction_t t; > + > + /* > + * Guest appears to not be responding. Cancel the suspend > + * request. > + * > + * We re-read the suspend node and clear it within a > + * transaction in order to handle the case where we race > + * against the guest catching up and acknowledging the request > + * at the last minute. > + */ > + LOG(ERROR, "guest didn''t acknowledge suspend, cancelling request"); > + retry_transaction: > + t = xs_transaction_start(CTX->xsh); > + > + state = libxl__domain_pvcontrol_read(gc, t, domid); > + if (!state) state = ""; > + > + if (!strcmp(state, "suspend")) > + libxl__domain_pvcontrol_write(gc, t, domid, ""); > + > + if (!xs_transaction_end(CTX->xsh, t, 0)) > + if (errno == EAGAIN) > + goto retry_transaction; > + > + /* > + * Final check for guest acknowledgement. The guest may have > + * acknowledged while we were cancelling the request in which > + * case we lost the race while cancelling and should continue. > + */ > + if (!strcmp(state, "suspend")) { > + LOG(ERROR, "guest didn''t acknowledge suspend, request cancelled"); > + return 0; > + } > + > + return 1; > +} > + > +static void wait_for_ack_timeout(libxl__egc *egc, libxl__ev_time *ev, > + const struct timeval *requested_abs) > +{ > + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); > + STATE_AO_GC(dss->ao); > + > + libxl__ev_time_deregister(gc, &dss->timeout); > + dss->watchdog--; > + wait_for_suspend_req_ack(dss); > +} > + > +static void wait_for_suspend_req_ack(libxl__domain_suspend_state *dss) > +{ > + char *state = NULL; > + int ret; > + int wait_time; > + STATE_AO_GC(dss->ao); > + > + state = libxl__domain_pvcontrol_read(gc, XBT_NULL, dss->domid); > + > + if (!state) state = ""; > + > + if (!strcmp(state, "suspend") && dss->watchdog > 0) { > + /* The first timeout is a short one (10ms), hoping that the > + * guest would respond immediately. The subsequent timeouts > + * are longer (40ms). > + */ > + wait_time = dss->watchdog_starting ? 10 : 40; > + dss->watchdog_starting = 0; > + ret = libxl__ev_time_register_rel(gc, &dss->timeout, > + wait_for_ack_timeout, > wait_time); > + if (ret) { > + LOG(ERROR, "unable to register timeout event to wait for" > + " guest to suspend ack. Cancelling suspend request"); > + goto cancel_req; > + } > + return; > + } > + > + if (strcmp(state, "suspend")) { > + ack: > + LOG(DEBUG, "guest acknowledged suspend request"); > + dss->guest_responded = 1; > + dss->watchdog = 60; > + dss->watchdog_starting = 1; > + wait_for_guest_suspend(dss); > + return; > + } > + > + cancel_req: //either timer reg. failed or watchdog loop ended > + if (cancel_xenbus_suspend_request(gc, dss->domid)) > + goto ack; > + > + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, > &dss->shs, 0); > +} > + > +static void wait_for_suspend_timeout(libxl__egc *egc, libxl__ev_time *ev, > + const struct timeval *requested_abs) > +{ > + libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, timeout); > + STATE_AO_GC(dss->ao); > + > + libxl__ev_time_deregister(gc, &dss->timeout); > + dss->watchdog--; > + wait_for_guest_suspend(dss); > +} > + > +static void wait_for_guest_suspend(libxl__domain_suspend_state *dss) > +{ > + int ret; > + int wait_time; > + xc_domaininfo_t info; > + STATE_AO_GC(dss->ao); > + > + ret = xc_domain_getinfolist(CTX->xch, dss->domid, 1, &info); > + if (ret == 1 && info.domain == dss->domid && > + (info.flags & XEN_DOMINF_shutdown)) { > + int shutdown_reason; > + > + shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift) > + & XEN_DOMINF_shutdownmask; > + if (shutdown_reason == SHUTDOWN_suspend) { > + LOG(DEBUG, "guest has suspended"); > + guest_suspended(dss); > + return; > + } > + } > + > + if (dss->watchdog > 0) { > + /* The first timeout is a short one (10ms), hoping that the > + * guest would respond immediately. The subsequent timeouts > + * are longer (40ms). > + */ > + wait_time = dss->watchdog_starting ? 10 : 40; > + dss->watchdog_starting = 0; > + ret = libxl__ev_time_register_rel(gc, &dss->timeout, > + wait_for_suspend_timeout, > wait_time); > + if (ret) { > + LOG(ERROR, "unable to register timeout event to wait for" > + " guest to suspend."); > + goto err; > + } > + return; > + } > + > + LOG(ERROR, "guest did not suspend"); > + err: > + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, > + &dss->shs, 0); > +} > + > +static void guest_suspended(libxl__domain_suspend_state *dss) > +{ > + int ret, ok = 1; > + STATE_AO_GC(dss->ao); > + if (dss->hvm) { > + ret = libxl__domain_suspend_device_model(gc, dss); > + if (ret) { > + LOG(ERROR, "libxl__domain_suspend_device_model failed " > + "ret=%d", ret); > + ok = 0; > + goto end; > + } > + } > + > + if (dss->remus_ctx) > + ok = libxl__domain_suspend_callback_remus(dss); > + > + end: > + libxl__xc_domain_saverestore_async_callback_done(dss->shs.egc, > + &dss->shs, ok); > +} > + > int libxl__domain_suspend_common_callback(void *user) > { > libxl__save_helper_state *shs = user; > @@ -1285,6 +1475,28 @@ void libxl__remus_teardown_done(libxl__e > > /*----- remus callbacks -----*/ > > +/* Remus specific tasks to be executed post domain suspend */ > +static int > libxl__domain_suspend_callback_remus(libxl__domain_suspend_state *dss) > +{ > + libxl__remus_ctx *remus_ctx = dss->remus_ctx; > + STATE_AO_GC(dss->ao); > + > + /* REMUS TODO: Issue disk checkpoint reqs. */ > + if (!remus_ctx->netbuf_ctx) > + return 1; > + > + /* Start a new network > + * buffer for the next epoch. If this operation fails, then act > + * as though domain suspend failed -- libxc exits its infinite > + * loop and ultimately, the replication stops. > + */ > + if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid, > + remus_ctx)) > + return 0; > + > + return 1; > +} > + > static int libxl__remus_domain_suspend_callback(void *data) > { > libxl__save_helper_state *shs = data; > @@ -1416,6 +1628,7 @@ void libxl__domain_suspend(libxl__egc *e > &dss->shs.callbacks.save.a; > > logdirty_init(&dss->logdirty); > + libxl__ev_time_init(&dss->timeout); > > switch (type) { > case LIBXL_DOMAIN_TYPE_HVM: { > diff -r 987d21e44984 -r c7fdc18830cd tools/libxl/libxl_internal.h > --- a/tools/libxl/libxl_internal.h Sun Nov 24 08:09:19 2013 -0800 > +++ b/tools/libxl/libxl_internal.h Sun Nov 24 08:17:13 2013 -0800 > @@ -2352,6 +2352,9 @@ struct libxl__domain_suspend_state { > int hvm; > int xcflags; > int guest_responded; > + int watchdog; > + int watchdog_starting; > + libxl__ev_time timeout; > const char *dm_savefile; > libxl__save_helper_state shs; > libxl__logdirty_switch logdirty; > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org <javascript:;> > http://lists.xen.org/xen-devel >Ping! Ian J/Ian C - any comments on this? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Ian Jackson
2013-Dec-12 15:08 UTC
Re: [PATCH 1 of 2] libxl: async version of domain suspend
Shriram Rajagopalan writes ("Re: [Xen-devel] [PATCH 1 of 2] libxl: async version of domain suspend"):> On Sunday, November 24, 2013, Shriram Rajagopalan wrote:...> libxl: async version of domain suspendI don''t seem to have replied to this. Sorry about that and thanks for the ping. Unfortunately, it''s really not the right approach at all. The right approach would be a series of incremental patches each of which makes a smallish change and which can be reviewed. This is far from easy. I have started work on this myself in the meantime and I have (so far!) a 17-patch series which is far too disruptive to be suitable for 4.4. Indeed at this stage of the release these structural changes to the suspend/resume code aren''t really appropriate. It would be nice to be able to get improved support for Remus in 4.4 if we can do so without adding too much risk. In particular I think it would be fine to add code to support Remus network buffering if we can get the interfaces right and non-Remus code isn''t touched too much. I''m not sure what George (the release manager) will think of libxl: make libxl__domain_suspend_callback be asynchronous which I think is a precondition for code which runs the Remus network buffering script. I''m about to bounce a copy to him. Ian.
Shriram Rajagopalan
2013-Dec-12 18:17 UTC
Re: [PATCH 1 of 2] libxl: async version of domain suspend
On Thu, Dec 12, 2013 at 7:08 AM, Ian Jackson <Ian.Jackson@eu.citrix.com>wrote:> Shriram Rajagopalan writes ("Re: [Xen-devel] [PATCH 1 of 2] libxl: async > version of domain suspend"): > > On Sunday, November 24, 2013, Shriram Rajagopalan wrote: > ... > > libxl: async version of domain suspend > > I don''t seem to have replied to this. Sorry about that and thanks for > the ping. > > Unfortunately, it''s really not the right approach at all. The right > approach would be a series of incremental patches each of which makes > a smallish change and which can be reviewed. This is far from easy. > > I have started work on this myself in the meantime and I have (so > far!) a 17-patch series which is far too disruptive to be suitable for > 4.4. Indeed at this stage of the release these structural changes to > the suspend/resume code aren''t really appropriate. > > It would be nice to be able to get improved support for Remus in 4.4 > if we can do so without adding too much risk. In particular I think > it would be fine to add code to support Remus network buffering if we > can get the interfaces right and non-Remus code isn''t touched too > much. > >True. If that is the case, the only change to the domain suspend code that I request is to tweak those silly wait loops, one which enforces a mandatory 100ms wait before checking whether the guest acked the suspend request and another 100ms wait before even checking if the guest suspended. Matter of fact, if these hardcoded timer values can be reduced to something like a "10ms wait for first attempt, 40-50ms wait for subsequent attempts" then it might be beneficial to both remus and live migration. NB: This is simply a bug fix and not a change in the way the control flows. The code would still remain synchronous for the moment.> I''m not sure what George (the release manager) will think of > libxl: make libxl__domain_suspend_callback be asynchronous > which I think is a precondition for code which runs the Remus > network buffering script. I''m about to bounce a copy to him. > >I have kept the Remus series independent of this. Infact, the changes I suggested above are more than sufficient. And FYI, the network buffering script is a one time invocation when Remus starts. It is not called in the context of domain suspend callbacks. The only network buffering related code in the suspend callback is invocation of libnl3 APIs> Ian. > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Ian Jackson
2013-Dec-12 18:34 UTC
Re: [PATCH 1 of 2] libxl: async version of domain suspend
Shriram Rajagopalan writes ("Re: [Xen-devel] [PATCH 1 of 2] libxl: async version of domain suspend"):> True. If that is the case, the only change to the domain suspend > code that I request is to tweak those silly wait loops, one which > enforces a mandatory 100ms wait before checking whether the guest > acked the suspend request and another 100ms wait before even > checking if the guest suspended. Matter of fact, if these hardcoded > timer values can be reduced to something like a "10ms wait for first > attempt, 40-50ms wait for subsequent attempts" then it might be > beneficial to both remus and live migration. > > NB: This is simply a bug fix and not a change in the way the control flows. > The code would still remain synchronous for the moment.I wouldn''t object to changing those timeouts in the remus case. In the non-remus case I think at this stage of the release we should leave them unchanged. Ian.