Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 0 of 7 v5] Introduce the tapback daemon (most of blkback in user-space)
This patch series introduces the tapback daemon, the user-space daemon that acts as a device's back-end: essentially most of blkback in user space. The daemon is responsible for coordinating the front-end and tapdisk. It creates tapdisk processes as needed, instructs them to connect to/disconnect from the shared ring, and manages the state of the back-end. The shared ring between the front-end and the tapdisk is provided by a piece of code that lives inside the tapdisk and will be introduced by the next patch series.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>

---
Changed since v1:
The series has been largely reorganised:
  * Renamed the daemon from xenio to tapback.
  * Improved description in patch 0.
  * Merged structures and functions.
  * Disaggregated functionality from the core daemon source file to smaller
    ones in order to facilitate the review process and improve maintenance.

Changed since v2:
  * Added a new patch that ignores tapback binaries.
  * For the rest of the patches, see the description in each patch.

Changed since v3:
  * Replace the minor number with type:/path/to/file where necessary.
  * Create the daemon's control socket.
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 1 of 7 v5] blktap3/tapback: Introduce core defines and structure definitions
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com> --- Changed since v2: * Removed BUG_ON macro. * Fixed whitespace in tapback_backend_find_device. * Clarified back-end name (xenio). * Removed references to the "serial" thing. * Remove unused member dev from struct vbd. * Introduce prototype of function XenbusState2str. Changed since v3: * Use vbd3 as the back-end name, as this allows blktap2 and blktap3 to co-exist. * struct vbd now carries the type:/path/to/file as it is later used to tell the tapdisk which VBD must be connected to the sring. Changed since v4: * Minor documentation. diff --git a/tools/blktap3/tapback/tapback.h b/tools/blktap3/tapback/tapback.h new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/tapback.h @@ -0,0 +1,283 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. 
+ */
+
+#ifndef __TAPBACK_H__
+#define __TAPBACK_H__
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <assert.h>
+#include <string.h>
+#include <syslog.h>
+
+#include <xen/xen.h>
+#include <xenstore.h>
+#include <xen/io/xenbus.h>
+#include <xen/event_channel.h>
+#include <xen/grant_table.h>
+
+#include "tap-ctl.h"
+
+void tapback_log(int prio, const char *fmt, ...);
+/* Defined in log.c; declared extern so the header does not define it. */
+extern void (*tapback_vlog) (int prio, const char *fmt, va_list ap);
+
+#define DBG(_fmt, _args...)  tapback_log(LOG_DEBUG, "%s:%d "_fmt, __FILE__, \
+        __LINE__, ##_args)
+#define INFO(_fmt, _args...) tapback_log(LOG_INFO, _fmt, ##_args)
+#define WARN(_fmt, _args...) tapback_log(LOG_WARNING, "%s:%d "_fmt, __FILE__, \
+        __LINE__, ##_args)
+
+/* Wrapped in do/while (0) so the macro is safe inside if/else bodies. */
+#define WARN_ON(_cond, fmt, ...)        \
+    do {                                \
+        if (unlikely(_cond)) {          \
+            printf(fmt, ##__VA_ARGS__); \
+        }                               \
+    } while (0)
+
+/*
+ * Pre-defined XenStore path components used for running the XenBus protocol.
+ *
+ * To avoid confusion with blktap2, we'll use a new kind of device for libxl
+ * defining it in tools/libxl/libxl_types_internal.idl. This will be done by
+ * the patch that adds libxl support for blktap3. The temporary back-end name
+ * is "vbd3". Once blktap3 becomes the default back-end, its back-end name
+ * should be "vbd" and "vbd3" will be removed. TODO When that patch is sent,
+ * use the definition from there instead of hard-coding it here.
+ */
+#define XENSTORE_BACKEND        "backend"
+#define BLKTAP3_BACKEND_NAME    "vbd3"
+#define BLKTAP3_BACKEND_PATH    XENSTORE_BACKEND"/"BLKTAP3_BACKEND_NAME
+#define BLKTAP3_BACKEND_TOKEN   XENSTORE_BACKEND"-"BLKTAP3_BACKEND_NAME
+#define BLKTAP3_FRONTEND_TOKEN  "otherend-state"
+
+/*
+ * TODO Put the rest of the front-end nodes defined in blkif.h here and group
+ * them. e.g. FRONTEND_NODE_xxx.
+ */
+#define RING_REF        "ring-ref"
+#define FEAT_PERSIST    "feature-persistent"
+
+/**
+ * A Virtual Block Device (VBD) represents a block device in a guest VM.
+ * Contains all relevant information.
+ */
+typedef struct vbd {
+
+    /**
+     * Device name, as retrieved from XenStore at probe-time.
+     */
+    char *name;
+
+    /**
+     * The device ID. Same as vbd.name, we keep it around because the tapdisk
+     * control library wants it as an int and not as a string.
+     */
+    int devid;
+
+    /**
+     * For linked lists.
+     */
+    TAILQ_ENTRY(vbd) backend_entry;
+
+    /**
+     * The domain ID this VBD belongs to.
+     */
+    domid_t domid;
+
+    /**
+     * The root directory in XenStore for this VBD. This is where all
+     * directories and key/value pairs related to this VBD are stored.
+     */
+    char *frontend_path;
+
+    /**
+     * XenStore path to the VBD's state. This is just
+     * vbd.frontend_path + "/state", we keep it around so we don't have to
+     * allocate/free memory all the time.
+     */
+    char *frontend_state_path;
+
+    /**
+     * Indicates whether the tapdisk is connected to the shared ring.
+     */
+    bool connected;
+
+    /**
+     * Descriptor of the tapdisk process serving this virtual block device. We
+     * need this until the end of the VBD's lifetime in order to disconnect
+     * the tapdisk from the shared ring.
+     */
+    tap_list_t tap;
+
+    /*
+     * XXX We keep sector_size, sectors, and info because we need to
+     * communicate them to the front-end not only when the front-end goes to
+     * XenbusStateInitialised, but to XenbusStateConnected as well.
+     */
+
+    /**
+     * Sector size, supplied by the tapdisk, communicated to blkfront.
+     */
+    unsigned int sector_size;
+
+    /**
+     * Number of sectors, supplied by the tapdisk, communicated to blkfront.
+     */
+    unsigned long long sectors;
+
+    /**
+     * VDISK_???, defined in include/xen/interface/io/blkif.h.
+     */
+    unsigned int info;
+
+    /**
+     * type:/path/to/file
+     */
+    char *params;
+
+    /**
+     * /path/to/file
+     */
+    char *path;
+
+    /**
+     * type (vhd, aio, etc.)
+     */
+    char *type;
+
+} vbd_t;
+
+TAILQ_HEAD(tqh_vbd, vbd);
+
+/**
+ * The collection of all necessary handles and descriptors.
+ */
+struct _blktap3_daemon {
+
+    /**
+     * A handle to XenStore.
+     */
+    struct xs_handle *xs;
+
+    /**
+     * For executing transacted operations on XenStore.
+     */
+    xs_transaction_t xst;
+
+    /**
+     * The list of virtual block devices.
+     *
+     * TODO We sometimes have to scan the whole list to find the device/domain
+     * we're interested in, should we optimize this? E.g. use a hash table
+     * for O(1) access?
+     * TODO Replace with a hash table (hcreate etc.)?
+     */
+    struct tqh_vbd devices;
+
+    /**
+     * TODO From xen/include/public/io/blkif.h: "The maximum supported size of
+     * the request ring buffer"
+     */
+    int max_ring_page_order;
+
+    /**
+     * Unix domain socket for controlling the daemon.
+     */
+    int ctrl_sock;
+};
+
+extern struct _blktap3_daemon blktap3_daemon;
+
+#define tapback_backend_for_each_device(_device, _next) \
+    TAILQ_FOREACH_SAFE(_device, &blktap3_daemon.devices, backend_entry, _next)
+
+/**
+ * Iterates over all devices and returns the one for which the condition is
+ * true.
+ */
+#define tapback_backend_find_device(_device, _cond)     \
+do {                                                    \
+    vbd_t *__next;                                      \
+    int found = 0;                                      \
+    tapback_backend_for_each_device(_device, __next) {  \
+        if (_cond) {                                    \
+            found = 1;                                  \
+            break;                                      \
+        }                                               \
+    }                                                   \
+    if (!found)                                         \
+        _device = NULL;                                 \
+} while (0)
+
+/**
+ * Act in response to a change in the front-end XenStore path.
+ *
+ * TODO We only care about changes on the front-end's state. Document this.
+ * Also, execute the body of this function (blkback_frontend_changed) iff a
+ * change occurred on the state, otherwise immediately return.
+ *
+ * @param path the front-end's XenStore path that changed
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by tapback_read_watch
+ */
+int
+tapback_backend_handle_otherend_watch(const char * const path);
+
+/**
+ * Act in response to a change in the back-end directory in XenStore.
+ *
+ * If the path is "/backend" or "/backend/<backend name>", all devices are
+ * probed. Otherwise, the path should be
+ * "backend/<backend name>/<domid>/<device name>"
+ * (e.g. backend/<backend name>/1/51712), and in this case this specific
+ * device is probed.
+ *
+ * @param path the back-end's XenStore path that changed
+ * @returns 0 on success, an error code otherwise
+ *
+ * TODO We only care about changes on the domid/devid component, as this
+ * signifies device creation/removal. Changes to paths such as
+ * "backend/vbd3/29/51712/mode" or "backend/vbd3/29/51712/removable" are
+ * currently uninteresting and we shouldn't do anything.
+ *
+ * XXX Only called by tapback_read_watch.
+ */
+int
+tapback_backend_handle_backend_watch(char * const path);
+
+/**
+ * Converts XenbusState values to a printable string, e.g. XenbusStateConnected
+ * corresponds to "connected".
+ *
+ * @param xbs the XenbusState to convert
+ * @returns a printable string
+ */
+char *
+XenbusState2str(const XenbusState xbs);
+
+#endif /* __TAPBACK_H__ */
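The header above only declares XenbusState2str; its body is not part of this patch. As an illustration of what the documented behaviour ("XenbusStateConnected corresponds to connected") could look like, here is a minimal table-lookup sketch. It is an assumption, not the series' implementation: the enum sketch_xenbus_state is a local stand-in that mirrors the numeric values of enum xenbus_state from xen/io/xenbus.h so the example compiles without the Xen headers, and sketch_state2str, along with every string other than "connected", is a hypothetical choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local mirror of enum xenbus_state (xen/io/xenbus.h). */
enum sketch_xenbus_state {
    SKETCH_STATE_UNKNOWN = 0,
    SKETCH_STATE_INITIALISING = 1,
    SKETCH_STATE_INIT_WAIT = 2,
    SKETCH_STATE_INITIALISED = 3,
    SKETCH_STATE_CONNECTED = 4,
    SKETCH_STATE_CLOSING = 5,
    SKETCH_STATE_CLOSED = 6,
    SKETCH_STATE_RECONFIGURING = 7,
    SKETCH_STATE_RECONFIGURED = 8
};

/* Returns a printable name for the state; never returns NULL. */
static const char *
sketch_state2str(enum sketch_xenbus_state xbs)
{
    static const char *const names[] = {
        "unknown", "initialising", "init-wait", "initialised", "connected",
        "closing", "closed", "reconfiguring", "reconfigured"
    };
    const size_t n = sizeof(names) / sizeof(names[0]);

    /* The table must cover every enumerator. */
    assert(n == SKETCH_STATE_RECONFIGURED + 1);

    /* Out-of-range values, including negative ones, map to "invalid". */
    if ((size_t)xbs >= n)
        return "invalid";
    return names[xbs];
}
```

Returning a const char * from a static table avoids any allocation, so callers can pass the result straight to DBG/WARN without freeing it.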
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 2 of 7 v5] blktap3/tapback: Introduces functionality required to access XenStore
This patch introduces convenience functions that read/write values from/to XenStore. Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com> --- Changed since v2: * Introduced makefile to facilitate development. * Removed functions (v)mprintf as (v)asprintf suffice. * Ensure tapback_xs_vread (a) returns a NULL terminated string, and (b) the returned string doesn''t contain NULL characters, apart from the NULL-terminating one. * Removed unnecessary variable initialisations. * Minor whitespace clean up. * Fixed typo in tapback_xs_vread. * function tapback_xs_vread: handle corner case where the value returned by xs_read is a zero-length string. Changed since v3: * Corrected documentation of function tapback_device_read_otherend. diff --git a/tools/blktap3/tapback/Makefile b/tools/blktap3/tapback/Makefile new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/Makefile @@ -0,0 +1,31 @@ +XEN_ROOT := $(CURDIR)/../../../ +include $(XEN_ROOT)/tools/Rules.mk + +BLKTAP_ROOT := .. + +# -D_GNU_SOURCE is required by vasprintf. +override CFLAGS += \ + -I$(BLKTAP_ROOT)/include \ + -I$(BLKTAP_ROOT)/control \ + -D_GNU_SOURCE \ + $(CFLAGS_libxenstore) \ + $(CFLAGS_libxenctrl) \ + $(CFLAGS_xeninclude) \ + -Wall \ + -Wextra \ + -Werror + +# FIXME cause trouble +override CFLAGS += \ + -Wno-old-style-declaration \ + -Wno-sign-compare \ + -Wno-type-limits + +override LDFLAGS += \ + $(LDLIBS_libxenstore) \ + $(LDFLAGS_libxenctrl) + +clean: + rm -f *.o *.o.d .*.o.d + +.PHONY: clean install diff --git a/tools/blktap3/tapback/xenstore.c b/tools/blktap3/tapback/xenstore.c new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/xenstore.c @@ -0,0 +1,195 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. 
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#include <errno.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <xenstore.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "blktap3.h"
+#include "tapback.h"
+#include "xenstore.h"
+
+char *
+tapback_xs_vread(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, va_list ap)
+{
+    char *path, *data = NULL;
+    unsigned int len = 0;
+
+    assert(xs);
+
+    if (vasprintf(&path, fmt, ap) == -1)
+        goto fail;
+    assert(path);
+
+    data = xs_read(xs, xst, path, &len);
+    free(path);
+
+    if (!data)
+        return NULL;
+
+    /*
+     * Make sure the returned string is NUL-terminated.
+     */
+    if ((len > 0 && data[len - 1] != '\0') || (len == 0 && data[0] != '\0')) {
+        char *_data = strndup(data, len);
+        if (!_data)
+            /* TODO log error */
+            goto fail;
+        free(data);
+        data = _data;
+    }
+
+    /*
+     * Make sure the returned string does not contain NUL characters, apart
+     * from the terminating one.
+     *
+     * We should be checking for extraneous NULs before duplicating the
+     * buffer, but this way logic is simplified.
+     */
+    if (strlen(data) != len)
+        /* TODO log error */
+        goto fail;
+
+    return data;
+fail:
+    free(data);
+    return NULL;
+}
+
+__printf(3, 4)
+char *
+tapback_xs_read(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, ...)
+{ + va_list ap; + char *s; + + assert(xs); + + va_start(ap, fmt); + s = tapback_xs_vread(xs, xst, fmt, ap); + va_end(ap); + + return s; +} + +char * +tapback_device_read(const vbd_t * const device, const char * const path) +{ + assert(device); + assert(path); + + return tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, + "%s/%d/%s/%s", BLKTAP3_BACKEND_PATH, device->domid, device->name, + path); +} + +char * +tapback_device_read_otherend(vbd_t * const device, + const char * const path) +{ + assert(device); + assert(path); + + return tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s/%s", + device->frontend_path, path); +} + +__scanf(3, 4) +int +tapback_device_scanf_otherend(vbd_t * const device, + const char * const path, const char * const fmt, ...) +{ + va_list ap; + int n = 0; + char *s = NULL; + + assert(device); + assert(path); + + if (!(s = tapback_device_read_otherend(device, path))) + return -1; + va_start(ap, fmt); + n = vsscanf(s, fmt, ap); + free(s); + va_end(ap); + + return n; +} + +__printf(4, 5) +int +tapback_device_printf(vbd_t * const device, const char * const key, + const bool mkread, const char * const fmt, ...) 
+{ + va_list ap; + int err = 0; + char *path = NULL, *val = NULL; + bool nerr = false; + + assert(device); + assert(key); + + if (-1 == asprintf(&path, "%s/%d/%s/%s", BLKTAP3_BACKEND_PATH, + device->domid, device->name, key)) { + err = -errno; + goto fail; + } + + va_start(ap, fmt); + if (-1 == vasprintf(&val, fmt, ap)) + val = NULL; + va_end(ap); + + if (!val) { + err = -errno; + goto fail; + } + + if (!(nerr = xs_write(blktap3_daemon.xs, blktap3_daemon.xst, path, val, + strlen(val)))) { + err = -errno; + goto fail; + } + + if (mkread) { + struct xs_permissions perms = { + device->domid, + XS_PERM_READ + }; + + if (!(nerr = xs_set_permissions(blktap3_daemon.xs, blktap3_daemon.xst, + path, &perms, 1))) { + err = -errno; + goto fail; + } + } + +fail: + free(path); + free(val); + + return err; +} diff --git a/tools/blktap3/tapback/xenstore.h b/tools/blktap3/tapback/xenstore.h new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/xenstore.h @@ -0,0 +1,95 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + */ + +/** + * Retrieves the XenStore value of the specified key of the VBD''s front-end. + * The caller must free the returned buffer. 
+ *
+ * @param device the VBD
+ * @param path key under the front-end directory
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_device_read_otherend(vbd_t * const device,
+        const char * const path);
+
+/**
+ * Writes to XenStore backend/<backend name>/<domid>/<devname>/@key = @fmt.
+ *
+ * @param device the VBD
+ * @param key the key to write to
+ * @param mkread if true, the key is made readable by the VBD's domain
+ * @param fmt format
+ * @returns 0 on success, a negative error code otherwise
+ */
+__printf(4, 5)
+int
+tapback_device_printf(vbd_t * const device, const char * const key,
+        const bool mkread, const char * const fmt, ...);
+
+/**
+ * Reads the specified XenStore path under the front-end directory in a
+ * scanf-like manner.
+ *
+ * @param device the VBD
+ * @param path XenStore path to read
+ * @param fmt format
+ * @returns the number of items converted (as vsscanf does), or -1 if the
+ * path could not be read
+ */
+__scanf(3, 4)
+int
+tapback_device_scanf_otherend(vbd_t * const device,
+        const char * const path, const char * const fmt, ...);
+
+/**
+ * Retrieves the value of the specified key of the device from XenStore,
+ * i.e. backend/<backend name>/<domid>/<devname>/@path
+ * The caller must free the returned buffer.
+ *
+ * @param device the VBD
+ * @param path the XenStore key
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_device_read(const vbd_t * const device, const char * const path);
+
+/**
+ * Reads the specified XenStore path. The caller must free the returned buffer.
+ *
+ * @param xs handle to XenStore
+ * @param xst XenStore transaction
+ * @param fmt format
+ * @param ap arguments
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_xs_vread(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, va_list ap);
+
+/**
+ * Reads the specified XenStore path. The caller must free the returned buffer.
+ * + * @param xs handle to XenStore + * @param xst XenStore transaction + * @param fmt format + * @returns a buffer containing the value, or NULL on error + */ +__printf(3, 4) +char * +tapback_xs_read(struct xs_handle * const xs, xs_transaction_t xst, + const char * const fmt, ...);
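The changelog for this patch requires that tapback_xs_vread return a terminated string with no other zero bytes in it. The following minimal sketch shows one way to enforce both properties on a raw (buffer, length) pair without ever reading past the reported length. It is not the patch's code: sketch_normalise_value is a hypothetical helper, and it assumes the buffer was obtained from malloc and that ownership passes to the helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Takes ownership of buf (len bytes, not necessarily terminated).
 * Returns a freshly allocated, NUL-terminated copy, or NULL if the value
 * contains an embedded NUL byte or memory is exhausted. buf is always freed.
 */
static char *
sketch_normalise_value(char *buf, size_t len)
{
    char *copy;

    assert(buf);

    /* Reject values with a NUL anywhere inside the reported length. */
    if (memchr(buf, '\0', len)) {
        free(buf);
        return NULL;
    }

    /* Copy exactly len bytes and add the terminator ourselves. */
    copy = malloc(len + 1);
    if (copy) {
        memcpy(copy, buf, len);
        copy[len] = '\0';
    }
    free(buf);
    return copy;
}
```

Checking for embedded NULs before copying, rather than after, is the ordering the comment in tapback_xs_vread describes as the stricter alternative; the cost is one unconditional allocation.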
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 3 of 7 v5] blktap3/tapback: Logging for the tapback daemon
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com> --- Changed since v2: * minor whitespace clean up Changed since v4: * Fix patch topic. diff --git a/tools/blktap3/tapback/log.c b/tools/blktap3/tapback/log.c new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/log.c @@ -0,0 +1,33 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + */ + +#include "compiler.h" +#include <stdarg.h> +#include <syslog.h> + +void (*tapback_vlog) (int prio, const char *fmt, va_list ap) = vsyslog; + +__printf(2, 3) void +tapback_log(int prio, const char *fmt, ...) +{ + va_list ap; + va_start(ap, fmt); + tapback_vlog(prio, fmt, ap); + va_end(ap); +}
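Because tapback_vlog is a function pointer that defaults to vsyslog, the log sink can presumably be swapped at run time, for example when running in the foreground or in a test. The sketch below illustrates that pattern in isolation. All sketch_* names are hypothetical and the stderr and capture sinks are assumptions for the example, not functions from the series.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Holds the most recent message written by the capture sink. */
static char sketch_last[128];

/* Sink that formats into a buffer instead of emitting anything. */
static void
sketch_vlog_capture(int prio, const char *fmt, va_list ap)
{
    (void)prio;
    vsnprintf(sketch_last, sizeof(sketch_last), fmt, ap);
}

/* Sink that writes to stderr, prefixed with the priority. */
static void
sketch_vlog_stderr(int prio, const char *fmt, va_list ap)
{
    fprintf(stderr, "<%d> ", prio);
    vfprintf(stderr, fmt, ap);
}

/* The replaceable hook, analogous in shape to tapback_vlog. */
static void (*sketch_vlog)(int prio, const char *fmt, va_list ap) =
    sketch_vlog_stderr;

/* Variadic front-end, analogous in shape to tapback_log. */
static void
sketch_log(int prio, const char *fmt, ...)
{
    va_list ap;

    assert(fmt);
    va_start(ap, fmt);
    sketch_vlog(prio, fmt, ap);
    va_end(ap);
}
```

Assigning sketch_vlog = sketch_vlog_capture before calling sketch_log(6, "pid=%d", 42) leaves "pid=42" in sketch_last, which is the kind of redirection the function-pointer indirection makes possible.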
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 4 of 7 v5] blktap3/tapback: Introduce back-end XenStore path handler
This patch introduces the handler executed when the back-end XenStore path gets modified. A back-end XenStore path is modified as a result of a device creation/removal request. The device is created/removed depending on whether its path exists/does not exist in XenStore. Creating a device comprises creating the in-memory representation of it and adding it to the list of devices, locating the tapdisk designated to serve this VBD, and setting a XenStore watch on the front-end path of the newly-created device. Deleting a device comprises removing that XenStore watch and deallocating its in-memory representation.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>

---
Changed since v2:
  * Removed the "serial" thing.
  * Function tapback_backend_handle_backend_watch doesn't always have to scan
    the entire back-end sub-tree, this is left as a future optimisation
    (relevant comment inside code updated).
  * Replaced mprintf with asprintf.
  * Check for failures in tapback_backend_scan.
  * Minor code and whitespace clean up.
  * The tapback daemon searches for a suitable tapdisk and creates one if none
    is found. It also kills it when the front-end disconnects.

Changed since v3:
  * In function tapback_backend_destroy_device, pass the VDI's
    type:/path/to/file as there's no minor number any more.
  * Function blkback_find_tapdisk doesn't parse the type:/path/to/file any
    more, it expects it already parsed.
  * In function tapback_backend_create_device, parse type:/path/to/file and
    store it in the VBD.
  * Pass the type:/path/to/file to tap_ctl_info instead of the minor number.

Changed since v4:
  * Minor documentation, plus a minor debug print.

diff --git a/tools/blktap3/tapback/backend.c b/tools/blktap3/tapback/backend.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/backend.c
@@ -0,0 +1,474 @@
+/*
+ * Copyright (C) 2012 Citrix Ltd.
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + * + * This file contains the handler executed when the back-end XenStore path gets + * modified. + */ + +#include "tapback.h" +#include "xenstore.h" + +/** + * Removes the XenStore watch from the front-end. + * + * @param device the VBD whose front-end XenStore path should stop being + * watched + */ +static void +tapback_device_unwatch_frontend_state(vbd_t * const device) +{ + assert(device); + + if (device->frontend_state_path) + xs_unwatch(blktap3_daemon.xs, device->frontend_state_path, + BLKTAP3_FRONTEND_TOKEN); + + free(device->frontend_state_path); + device->frontend_state_path = NULL; +} + +/** + * Destroys and deallocates the back-end part of a VBD. 
+ *
+ * @param device the VBD to destroy
+ */
+static void
+tapback_backend_destroy_device(vbd_t * const device)
+{
+    int err;
+
+    assert(device);
+
+    /* device->name is a string; device->devid is an int and must not be
+     * printed with %s. */
+    DBG("removing device %d/%s\n", device->domid, device->name);
+
+    TAILQ_REMOVE(&blktap3_daemon.devices, device, backend_entry);
+
+    tapback_device_unwatch_frontend_state(device);
+
+    /*
+     * kill the tapdisk
+     */
+    err = tap_ctl_destroy(device->tap.pid, device->params, 0, NULL);
+    if (err)
+        WARN("failed to destroy tapdisk %d %s: %s (error ignored)\n",
+                device->tap.pid, device->params, strerror(-err));
+
+    free(device->frontend_path);
+    free(device->name);
+    free(device->type);
+    free(device->path);
+    free(device->params);
+    free(device);
+}
+
+/**
+ * Retrieves the tapdisk designated to serve this device, storing this
+ * information in the supplied tapdisk descriptor.
+ *
+ * @param type the tapdisk type (vhd, aio, etc.)
+ * @param path /path/to/file
+ * @param tap output parameter that receives the tapdisk process information.
+ * The parameter is undefined when the function returns a non-zero value.
+ * @returns 0 if a suitable tapdisk is found, ESRCH if no suitable tapdisk is
+ * found, and an error code in case of error
+ *
+ * TODO rename function
+ *
+ * XXX Only called by blkback_probe.
+ */
+static inline int
+blkback_find_tapdisk(const char *type, const char *path, tap_list_t *tap)
+{
+    struct tqh_tap_list list;
+    tap_list_t *_tap;
+    int err;
+
+    assert(type);
+    assert(path);
+    assert(tap);
+
+    err = tap_ctl_list(&list);
+    if (err) {
+        WARN("error listing tapdisks: %s\n", strerror(err));
+        goto out;
+    }
+
+    err = ESRCH;
+    if (!TAILQ_EMPTY(&list)) {
+        tap_list_for_each_entry(_tap, &list) {
+            if (_tap->type && !strcmp(_tap->type, type) && _tap->path
+                    && !strcmp(_tap->path, path)) {
+                err = 0;
+                /* Copy the whole structure, not just sizeof(pointer). */
+                memcpy(tap, _tap, sizeof(*tap));
+                break;
+            }
+        }
+        tap_ctl_list_free(&list);
+    } else
+        DBG("no tapdisks\n");
+
+out:
+    return err;
+}
+
+/**
+ * Creates a device and adds it to the list of devices.
+ * Initiates a XenStore watch to the front-end state.
+ * + * Creating the device implies initializing the handle and retrieving all the + * information of the tapdisk serving this VBD. + * + * @param domid the ID of the domain where the VBD is created + * @param name the name of the device + * @returns 0 on success, an error code otherwise + */ +static inline int +tapback_backend_create_device(const domid_t domid, const char * const name) +{ + vbd_t *device = NULL; + int err = 0; + + assert(name); + + DBG("creating device domid=%d, devid=%s\n", domid, name); + + if (!(device = calloc(1, sizeof(*device)))) { + WARN("error allocating memory\n"); + err = -errno; + goto out; + } + + device->domid = domid; + + TAILQ_INSERT_TAIL(&blktap3_daemon.devices, device, backend_entry); + + if (!(device->name = strdup(name))) { + err = -errno; + goto out; + } + + /* + * Get the front-end path from XenStore. We need this to talk to the + * front-end. + */ + if (!(device->frontend_path = tapback_device_read(device, "frontend"))) { + err = errno; + WARN("failed to read front-end path: %s\n", strerror(err)); + goto out; + } + + /* + * Get the file path backing the VBD. 
+ */ + device->params = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, + "%s/%d/%s/params", BLKTAP3_BACKEND_PATH, domid, name); + if (!device->params) { + err = errno; + WARN("failed to read backing file: %s\n", strerror(err)); + goto out; + } + DBG("need to find tapdisk serving \''%s\''\n", device->params); + + err = parse_params(device->params, &device->type, &device->path); + if (err) { + WARN("failed to parse params \''%s\'': %s\n", device->params, + strerror(-err)); + goto out; + } + + err = blkback_find_tapdisk(device->type, device->path, &device->tap); + if (!err) { + DBG("found tapdisk pid=%d\n", device->tap.pid); + } else if (err == ESRCH) { + /* TODO replace with tap_ctl_create */ + DBG("no such tapdisk\n"); + err = tap_ctl_spawn(); + if (err <= 0) { + WARN("failed to create tapdisk: %s\n", strerror(-err)); + goto out; + } + device->tap.pid = err; + DBG("spawned tapdisk pid=%d\n", device->tap.pid); + err = tap_ctl_open(device->tap.pid, device->params, 0, NULL, NULL); + if (err) { + WARN("failed to open %s on tapdisk pid=%d: %s\n", device->params, + device->tap.pid, strerror(-err)); + /* TODO The error handler assumes that there a device has been + * opened, so tapdisk will complain that there is no such image. + */ + goto out; + } + DBG("opened %s on tapdisk pid=%d\n", device->params, device->tap.pid); + } else { + WARN("error looking for tapdisk: %s\n", strerror(err)); + goto out; + } + + /* + * get the VBD parameters from the tapdisk + */ + if ((err = tap_ctl_info(device->tap.pid, device->params, &device->sectors, + &device->sector_size, &device->info))) { + WARN("error retrieving disk characteristics: %s\n", strerror(-err)); + goto out; + } + + /* + * Finally, watch the front-end path in XenStore for changes, i.e. + * /local/domain/<domid>/device/vbd/<devname>/state + * After this, we wait for the front-end to switch state to continue with + * the initialisation. 
+     */
+    if (asprintf(&device->frontend_state_path, "%s/state",
+                device->frontend_path) == -1) {
+        /* TODO log error */
+        err = -errno;
+        /* The pointer is undefined after a failed asprintf; reset it so the
+         * error path does not free or unwatch garbage. */
+        device->frontend_state_path = NULL;
+        goto out;
+    }
+    assert(device->frontend_state_path);
+
+    /*
+     * We use the same token for all front-end watches. We don't have to use a
+     * unique token for each front-end watch because when a front-end watch
+     * fires we are given the XenStore path that changed.
+     */
+    if (!xs_watch(blktap3_daemon.xs, device->frontend_state_path,
+                BLKTAP3_FRONTEND_TOKEN)) {
+        err = -errno;
+        /* Reset the pointer after freeing it: the error path below calls
+         * tapback_backend_destroy_device, which would otherwise unwatch and
+         * free it a second time. */
+        free(device->frontend_state_path);
+        device->frontend_state_path = NULL;
+        goto out;
+    }
+
+out:
+    if (err) {
+        WARN("error creating device: domid=%d name=%s err=%d (%s)\n",
+                domid, name, err, strerror(err));
+        if (device)
+            tapback_backend_destroy_device(device);
+    }
+    return err;
+}
+
+/**
+ * Creates (removes) a device depending on the existence (non-existence) of the
+ * "backend/<backend name>/@domid/@devname" XenStore path.
+ *
+ * @param domid the ID of the domain where the VBD is created
+ * @param devname device name
+ * @returns 0 on success, an error code otherwise
+ */
+static int
+tapback_backend_probe_device(const domid_t domid, const char * const devname)
+{
+    int should_exist = 0, create = 0, remove = 0;
+    vbd_t *device = NULL;
+    char * s = NULL;
+
+    assert(devname);
+
+    DBG("probing device domid=%d name=%s\n", domid, devname);
+
+    /*
+     * Ask XenStore if the device _should_ exist.
+     */
+    s = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s/%d/%s",
+            BLKTAP3_BACKEND_PATH, domid, devname);
+    should_exist = s != NULL;
+    free(s);
+
+    /*
+     * Search the device list for this specific device.
+     */
+    tapback_backend_find_device(device,
+            device->domid == domid && !strcmp(device->name, devname));
+
+    /*
+     * If XenStore says that the device should exist but it's not in our device
+     * list, we must create it. If it's the other way round, this is a removal.
+ */ + remove = device && !should_exist; + create = !device && should_exist; + + if (!create && !remove) { + /* + * A watch has triggered on a path we''re not interested in. + * TODO Check if we can avoid probing the device completely based on + * the path that triggered. + */ + return 0; + } + + /* + * Remember that remove and create may both be true at the same time, as + * this indicates that the device has been removed and re-created too fast. + * In this case, we too need to remove and re-create the device, + * respectively. + */ + + if (remove) + tapback_backend_destroy_device(device); + + if (create) { + const int err = tapback_backend_create_device(domid, devname); + if (0 != err) { + WARN("error creating device %s on domain %d: %s\n", devname, domid, + strerror(err)); + return err; + } + } + + return 0; +} + +/** + * Scans XenStore for all blktap3 devices and probes each one of them. + * + * XXX Only called by tapback_backend_handle_backend_watch. + */ +static int +tapback_backend_scan(void) +{ + vbd_t *device = NULL, *next = NULL; + unsigned int i = 0, j = 0, n = 0, m = 0; + char **dir = NULL; + int err = 0; + + DBG("scanning entire back-end\n"); + + /* + * scrap all non-existent devices + * TODO Why do we do this? Is this costly? + */ + + tapback_backend_for_each_device(device, next) { + err = tapback_backend_probe_device(device->domid, device->name); + if (err) { + WARN("error probing device %s of domain %d: %s\n", device->name, + device->domid, strerror(err)); + /* TODO Should we fail in this case of keep probing? */ + goto out; + } + } + + /* + * probe the new ones + * + * TODO We''re checking each and every device in each and every domain, + * could there be a performance issue in the presence of many VMs/VBDs? + * (e.g. 
boot-storm) + */ + if (!(dir = xs_directory(blktap3_daemon.xs, blktap3_daemon.xst, + BLKTAP3_BACKEND_PATH, &n))) { + err = errno; + if (err == ENOENT) + err = 0; + else + WARN("error listing %s: %s\n", BLKTAP3_BACKEND_PATH, + strerror(err)); + goto out; + } + + DBG("probing %d domains\n", n); + + for (i = 0; i < n; i++) { /* for each domain */ + char *path = NULL, **sub = NULL, *end = NULL; + domid_t domid = 0; + + /* + * Get the domain ID. + */ + domid = strtoul(dir[i], &end, 0); + if (*end != 0 || end == dir[i]) + continue; + + /* + * Read the devices of this domain. + */ + if (asprintf(&path, "%s/%d", BLKTAP3_BACKEND_PATH, domid) == -1) { + /* TODO log error */ + err = errno; + goto out; + } + sub = xs_directory(blktap3_daemon.xs, blktap3_daemon.xst, path, &m); + err = errno; + free(path); + + if (!sub) { + WARN("error listing %s: %s\n", path, strerror(err)); + goto out; + } + + /* + * Probe each device. + */ + for (j = 0; j < m; j++) { + err = tapback_backend_probe_device(domid, sub[j]); + if (err) { + WARN("error probing device %s of domain %d: %s\n", sub[j], + domid, strerror(err)); + goto out; + } + } + + free(sub); + } + +out: + free(dir); + return err; +} + +int +tapback_backend_handle_backend_watch(char * const path) +{ + char *s = NULL, *end = NULL, *name = NULL; + domid_t domid = 0; + + assert(path); + + s = strtok(path, "/"); + assert(!strcmp(s, XENSTORE_BACKEND)); + if (!(s = strtok(NULL, "/"))) + return tapback_backend_scan(); + + assert(!strcmp(s, BLKTAP3_BACKEND_NAME)); + if (!(s = strtok(NULL, "/"))) + return tapback_backend_scan(); + + domid = strtoul(s, &end, 0); + if (*end != 0 || end == s) { + WARN("invalid domain ID \''%s\''\n", s); + return EINVAL; + } + + /* + * TODO Optimisation: since we know which domain changed, we don''t have to + * scan the whole thing. Add the domid as an optional parameter to + * tapback_backend_scan. 
+ */ + if (!(name = strtok(NULL, "/"))) + return tapback_backend_scan(); + + /* + * Create or remove a specific device. + * + * TODO tapback_backend_probe_device reads xenstore again to see if the + * device should exist, but we already know that in the current function. + * Optimise this case. + */ + return tapback_backend_probe_device(domid, name); +}
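The path handling in tapback_backend_handle_backend_watch can be hard to follow in flattened form. Below is a minimal standalone sketch of the same decision: scan everything when the path stops at or before the domain ID, and probe one device when a device name follows. The names classify_backend_path, struct watch_target and enum watch_action are hypothetical and not part of the patch. The sketch also differs from the patch in two ways: it works on a private copy of the path (strtok in the patch modifies its argument), and it returns an "invalid" result where the patch asserts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, used only by this sketch. */
enum watch_action { WATCH_SCAN_ALL, WATCH_PROBE_DEVICE, WATCH_INVALID };

struct watch_target {
    enum watch_action action;
    unsigned long domid;    /* meaningful only for WATCH_PROBE_DEVICE */
    char name[64];          /* meaningful only for WATCH_PROBE_DEVICE */
};

/*
 * Classifies a XenStore path of the form
 * "backend/<backend name>[/<domid>[/<devname>[/...]]]".
 * Components after <devname> are ignored, as in the patch.
 */
static struct watch_target
classify_backend_path(const char *path, const char *backend_name)
{
    struct watch_target t = { WATCH_SCAN_ALL, 0, "" };
    const struct watch_target invalid = { WATCH_INVALID, 0, "" };
    char buf[256], *save = NULL, *s = NULL, *end = NULL;

    assert(path && backend_name);

    /* Tokenise a private copy so the caller's string is left intact. */
    if (strlen(path) >= sizeof(buf))
        return invalid;
    strcpy(buf, path);

    s = strtok_r(buf, "/", &save);
    if (!s || strcmp(s, "backend"))
        return invalid;

    if (!(s = strtok_r(NULL, "/", &save)))
        return t;                       /* "backend": scan everything */
    if (strcmp(s, backend_name))
        return invalid;

    if (!(s = strtok_r(NULL, "/", &save)))
        return t;                       /* "backend/<name>": scan everything */

    t.domid = strtoul(s, &end, 0);
    if (end == s || *end != '\0')
        return invalid;                 /* domain ID is not a number */

    if (!(s = strtok_r(NULL, "/", &save)))
        return t;                       /* only a domain ID: scan everything */

    if (strlen(s) >= sizeof(t.name))
        return invalid;
    strcpy(t.name, s);
    t.action = WATCH_PROBE_DEVICE;
    return t;
}
```

With this shape, a path such as "backend/vbd3/5/51712/state" yields a probe of device "51712" in domain 5, while "backend/vbd3/5" falls back to a full scan, which is the case the TODO in the patch proposes to optimise.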
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 5 of 7 v5] blktap3/tapback: Introduce front-end XenStore path handler
This patch introduces the handler executed when the front-end XenStore path of a VBD gets modified, which happens when the front-end switches state. The core of this functionality is connecting/disconnecting the tapdisk to/from the shared ring.

When the front-end goes to the Initialised or Connected state, the daemon reads all the necessary information that the front-end provides in XenStore, initiates the shared ring creation (the actual creation is performed by a library that will be introduced by another patch series), and instructs the tapdisk designated to serve this VBD to connect to the ring. Finally, it communicates all the necessary disk parameters to the front-end. When the front-end switches to Closed, the tapdisk is disconnected from the shared ring.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>

---
Changed since v2:
* Documented the frontend_state_change_map array of callbacks.
* Minor code and whitespace clean up.
* Added some debug prints.
* Don't fail if the front-end doesn't tell us whether or not it supports persistent grants, just assume it doesn't.

Changed since v3:
* Pass the type:/path/to/file to tap_ctl_connect_xenblkif instead of the minor number.

Changed since v4:
* Fix signs in return values.

diff --git a/tools/blktap3/include/blktap3.h b/tools/blktap3/include/blktap3.h --- a/tools/blktap3/include/blktap3.h +++ b/tools/blktap3/include/blktap3.h @@ -46,4 +46,19 @@ TAILQ_REMOVE(src, node, entry); \ TAILQ_INSERT_TAIL(dst, node, entry); +/** + * Block I/O protocol + * + * Taken from linux/drivers/block/xen-blkback/common.h so that blkfront can + * work both with blktap2 and blktap3. + * + * TODO linux/drivers/block/xen-blkback/common.h contains other definitions + * necessary for allowing tapdisk3 to talk directly to blkfront. Find a way to + * use the definitions from there. 
+ */ +enum blkif_protocol { + BLKIF_PROTOCOL_NATIVE = 1, + BLKIF_PROTOCOL_X86_32 = 2, + BLKIF_PROTOCOL_X86_64 = 3, +}; #endif /* __BLKTAP_3_H__ */ diff --git a/tools/blktap3/tapback/frontend.c b/tools/blktap3/tapback/frontend.c new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/frontend.c @@ -0,0 +1,417 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + * + * This file contains the handler executed when the front-end XenStore path of + * a VBD gets modified. + */ + +#include "tapback.h" +#include "xenstore.h" + +#include <xen/io/protocols.h> + +/** + * Switches the back-end state of the device by writing to XenStore. + * + * @param device the VBD + * @param state the state to switch to + * @returns 0 on success, an error code otherwise + */ +static int +tapback_device_switch_state(vbd_t * const device, + const XenbusState state) +{ + int err; + + assert(device); + + /* + * TODO Ensure @state contains a legitimate XenbusState value. + * TODO Check for valid state transitions? 
+ */ + + err = -tapback_device_printf(device, "state", false, "%u", state); + if (err) { + WARN("failed to switch back-end state to %s: %s\n", + XenbusState2str(state), strerror(err)); + } else + DBG("switched back-end state to %s\n", XenbusState2str(state)); + return err; +} + +/** + * Core functions that instructs the tapdisk to connect to the shared ring (if + * not already connected) and communicates essential information to the + * front-end. + * + * If the tapdisk is not already connected, all the necessary information is + * read from XenStore and the tapdisk gets connected using this information. + * + * TODO Why should this function be called on an already connected VBD? Why + * re-write the sector size etc. in XenStore for an already connected VBD? + * TODO rename function (no blkback, not only connects to tapdisk) + * + * @param xbdev the VBD the tapdisk should connect to + * @param state unused + * @returns 0 on success, an error code otherwise + * + * XXX Only called by blkback_frontend_changed, when the front-end switches to + * Initialised and Connected. + */ +static int +blkback_connect_tap(vbd_t * const bdev, + const XenbusState state __attribute__((unused))) +{ + evtchn_port_t port = 0; + grant_ref_t *gref = NULL; + int err = 0; + char *proto_str = NULL; + char *persistent_grants_str = NULL; + + assert(bdev); + + if (bdev->connected) { + DBG("front-end already connected to tapdisk.\n"); + } else { + /* + * TODO How can we make sure we''re not missing a node written by the + * front-end? Use xs_directory? + */ + int nr_pages = 0, proto = 0, order = 0; + bool persistent_grants = false; + + if (1 != tapback_device_scanf_otherend(bdev, "ring-page-order", "%d", + &order)) + order = 0; + + nr_pages = 1 << order; + + if (!(gref = calloc(nr_pages, sizeof(grant_ref_t)))) { + WARN("Failed to allocate memory for grant refs.\n"); + err = ENOMEM; + goto fail; + } + + /* + * Read the grant references. 
+ */ + if (order) { + int i = 0; + /* + * +10 is for INT_MAX, +1 for NULL termination + */ + static const size_t len = sizeof(RING_REF) + 10 + 1; + char ring_ref[len]; + for (i = 0; i < nr_pages; i++) { + if (snprintf(ring_ref, len, "%s%d", RING_REF, i) >= len) { + DBG("error printing to buffer\n"); + err = EINVAL; + goto fail; + } + if (1 != tapback_device_scanf_otherend(bdev, ring_ref, "%u", + &gref[i])) { + WARN("Failed to read grant ref 0x%x.\n", i); + err = ENOENT; + goto fail; + } + } + } else { + if (1 != tapback_device_scanf_otherend(bdev, RING_REF, "%u", + &gref[0])) { + WARN("Failed to read grant ref.\n"); + err = ENOENT; + goto fail; + } + } + + /* + * Read the event channel. + */ + if (1 != tapback_device_scanf_otherend(bdev, "event-channel", "%u", + &port)) { + WARN("Failed to read event channel.\n"); + err = ENOENT; + goto fail; + } + + /* + * Read the guest VM''s ABI. + */ + if (!(proto_str = tapback_device_read_otherend(bdev, "protocol"))) + proto = BLKIF_PROTOCOL_NATIVE; + else if (!strcmp(proto_str, XEN_IO_PROTO_ABI_X86_32)) + proto = BLKIF_PROTOCOL_X86_32; + else if (!strcmp(proto_str, XEN_IO_PROTO_ABI_X86_64)) + proto = BLKIF_PROTOCOL_X86_64; + else { + WARN("unsupported protocol %s\n", proto_str); + err = EINVAL; + goto fail; + } + + /* + * Does the front-end support persistent grants? + */ + persistent_grants_str = tapback_device_read_otherend(bdev, + FEAT_PERSIST); + if (persistent_grants_str) { + if (!strcmp(persistent_grants_str, "0")) + persistent_grants = false; + else if (!strcmp(persistent_grants_str, "1")) + persistent_grants = true; + else { + WARN("invalid %s value: %s\n", FEAT_PERSIST, + persistent_grants_str); + err = EINVAL; + goto fail; + } + } + else + DBG("front-end doesn''t support persistent grants\n"); + + /* + * persistent grants are not yet supported + */ + if (persistent_grants) + WARN("front-end supports persistent grants but we don''t\n"); + + /* + * Create the shared ring and ask the tapdisk to connect to it. 
+ */ + if ((err = tap_ctl_connect_xenblkif(bdev->tap.pid, bdev->params, + bdev->domid, bdev->devid, gref, order, port, proto, + NULL))) { + WARN("tapdisk failed to connect to the shared ring: %s\n", + strerror(err)); + goto fail; + } + DBG("tapdisk pid=%d, path=%s connected to shared ring\n", + bdev->tap.pid, bdev->path); + + bdev->connected = true; + } + + /* + * Write the number of sectors, sector size, and info to the back-end path + * in XenStore so that the front-end creates a VBD with the appropriate + * characteristics. + */ + if ((err = tapback_device_printf(bdev, "sector-size", true, "%u", + bdev->sector_size))) { + WARN("Failed to write sector-size.\n"); + goto fail; + } + + if ((err = tapback_device_printf(bdev, "sectors", true, "%llu", + bdev->sectors))) { + WARN("Failed to write sectors.\n"); + goto fail; + } + + if ((err = tapback_device_printf(bdev, "info", true, "%u", bdev->info))) { + WARN("Failed to write info.\n"); + goto fail; + } + + if ((err = tapback_device_switch_state(bdev, XenbusStateConnected))) { + WARN("failed to switch back-end state to connected: %s\n", + strerror(err)); + } + +fail: + if (err && bdev->connected) { + const int err2 = -tap_ctl_disconnect_xenblkif(bdev->tap.pid, + bdev->domid, bdev->devid, NULL); + if (err2) { + WARN("error disconnecting tapdisk from the shared ring (error " + "ignored): %s\n", strerror(err2)); + } + + bdev->connected = false; + } + + free(gref); + free(proto_str); + free(persistent_grants_str); + + return err; +} + +/** + * Callback that is executed when the front-end goes to StateClosed. + * + * Instructs the tapdisk to disconnect itself from the shared ring and switches + * the back-end state to StateClosed. + * + * @param xbdev the VBD whose tapdisk should be disconnected + * @param state unused + * @returns 0 on success, an error code otherwise + * + * XXX Only called by blkback_frontend_changed. 
+ */ +static inline int +backend_close(vbd_t * const bdev, + const XenbusState state __attribute__((unused))) +{ + int err = 0; + + assert(bdev); + + if (!bdev->connected) { + /* + * TODO Is this safe? Shouldn''t we report an error? + */ + DBG("tapdisk not connected\n"); + return 0; + } + + DBG("disconnecting domid=%d devid=%d from tapdisk pid=%d %s:%s\n", + bdev->domid, bdev->devid, bdev->tap.pid, bdev->type, bdev->path); + + if ((err = -tap_ctl_disconnect_xenblkif(bdev->tap.pid, bdev->domid, + bdev->devid, NULL))){ + + /* + * TODO I don''t see how tap_ctl_disconnect_xenblkif can return + * ESRCH, so this is probably wrong. Probably there''s another error + * code indicating that there''s no tapdisk process. + */ + if (errno == ESRCH) { + WARN("tapdisk not running\n"); + } else { + WARN("error disconnecting tapdisk from front-end: %s\n", + strerror(err)); + return err; + } + } + + bdev->connected = false; + + return tapback_device_switch_state(bdev, XenbusStateClosed); +} + +/** + * Acts on changes in the front-end state. + * + * TODO The back-end blindly follows the front-ends state transitions, should + * we check whether unexpected transitions are performed? + * + * @param xbdev the VBD whose front-end state changed + * @param state the new state + * @returns 0 on success, an error code otherwise + * + * XXX Only called by tapback_device_check_front-end_state. + */ +static inline int +blkback_frontend_changed(vbd_t * const xbdev, const XenbusState state) +{ + /* + * XXX The size of the array (9) comes from the XenbusState enum. + * + * TODO Send a patch that adds XenbusStateMin, XenbusStateMax, + * XenbusStateInvalid and in the XenbusState enum (located in xenbus.h). + * + * The front-end''s state is used as the array index. Each element contains + * a call-back function to be executed in response, and an optional state + * for the back-end to switch to. 
+ */ + struct frontend_state_change { + int (*fn)(vbd_t * const, const XenbusState); + const XenbusState state; + } static const frontend_state_change_map[] = { + [XenbusStateUnknown] = {NULL, 0}, + [XenbusStateInitialising] + = {tapback_device_switch_state, XenbusStateInitWait}, + [XenbusStateInitWait] = {NULL, 0}, + + /* blkback_connect_tap swicthes back-end state to Connected */ + [XenbusStateInitialised] = {blkback_connect_tap, 0}, + [XenbusStateConnected] = {blkback_connect_tap, 0}, + + [XenbusStateClosing] + = {tapback_device_switch_state, XenbusStateClosing}, + [XenbusStateClosed] = {backend_close, 0}, + [XenbusStateReconfiguring] = {NULL, 0}, + [XenbusStateReconfigured] = {NULL, 0} + }; + + assert(xbdev); + assert(state >= XenbusStateUnknown && state <= XenbusStateReconfigured); + + DBG("front-end domid=%d, devid=%s went into state %s\n", + xbdev->domid, xbdev->name, XenbusState2str(state)); + + if (frontend_state_change_map[state].fn) + return frontend_state_change_map[state].fn(xbdev, + frontend_state_change_map[state].state); + else + DBG("ignoring front-end''s domid=%d, devid=%s transition to state %s\n", + xbdev->domid, xbdev->name, XenbusState2str(state)); + return 0; +} + +int +tapback_backend_handle_otherend_watch(const char * const path) +{ + vbd_t *device = NULL; + int err = 0, state = 0; + char *s = NULL, *end = NULL; + + assert(path); + + /* + * Find the device that has the same front-end state path. + * + * There should definitely be such a device in our list, otherwise this + * function would not have executed at all, since we would not be waiting + * on that XenStore path. The XenStore path we wait for is: + * /local/domain/<domid>/device/vbd/<devname>/state. In order to watch this + * path, it means that we have received a device create request, so the + * device will be there. + * + * TODO Instead of this linear search we could do better (hash table etc). 
+ */ + tapback_backend_find_device(device, + !strcmp(device->frontend_state_path, path)); + if (!device) { + WARN("path \''%s\'' does not correspond to a known device\n", path); + return ENODEV; + } + + DBG("device: domid=%d name=%s\n", device->domid, device->name); + + /* + * Read the new front-end''s state. + */ + if (!(s = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s", + device->frontend_state_path))) { + err = errno; + goto fail; + } + state = strtol(s, &end, 0); + if (*end != 0 || end == s) { + err = EINVAL; + goto fail; + } + + err = blkback_frontend_changed(device, state); + +fail: + free(s); + return err; +}
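The frontend_state_change_map array in blkback_frontend_changed is a table-driven dispatch: the front-end's new state indexes an array of (callback, argument) pairs, and states without a callback are ignored. The following is a reduced sketch of that technique only. The enum fe_state is a local stand-in whose values are assumed to mirror XenbusState, struct vbd_sketch is hypothetical, and the callbacks merely record what the real ones would do (no XenStore or tapdisk calls are made). Unlike the patch, an out-of-range state is rejected with EINVAL rather than caught by assert.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-in for XenbusState (assumed to use the same numbering). */
enum fe_state {
    FE_UNKNOWN = 0, FE_INITIALISING, FE_INITWAIT, FE_INITIALISED,
    FE_CONNECTED, FE_CLOSING, FE_CLOSED, FE_RECONFIGURING,
    FE_RECONFIGURED, FE_STATE_COUNT
};

/* Hypothetical, much reduced device descriptor. */
struct vbd_sketch {
    enum fe_state backend_state;
    int connected;
};

typedef int (*state_fn)(struct vbd_sketch *, enum fe_state);

/* Records the new back-end state; the patch writes it to XenStore. */
static int switch_state(struct vbd_sketch *d, enum fe_state s)
{
    d->backend_state = s;
    return 0;
}

/* Stands in for blkback_connect_tap: connect, then go to Connected. */
static int connect_tap(struct vbd_sketch *d, enum fe_state unused)
{
    (void)unused;
    d->connected = 1;
    return switch_state(d, FE_CONNECTED);
}

/* Stands in for backend_close: disconnect, then go to Closed. */
static int close_backend(struct vbd_sketch *d, enum fe_state unused)
{
    (void)unused;
    d->connected = 0;
    return switch_state(d, FE_CLOSED);
}

/* Entries that are not listed are zero-initialised, i.e. fn == NULL. */
static const struct {
    state_fn fn;
    enum fe_state arg;
} state_map[FE_STATE_COUNT] = {
    [FE_INITIALISING] = { switch_state, FE_INITWAIT },
    [FE_INITIALISED]  = { connect_tap, FE_UNKNOWN },
    [FE_CONNECTED]    = { connect_tap, FE_UNKNOWN },
    [FE_CLOSING]      = { switch_state, FE_CLOSING },
    [FE_CLOSED]       = { close_backend, FE_UNKNOWN },
};

static int frontend_changed(struct vbd_sketch *d, int state)
{
    assert(d);
    if (state < 0 || state >= FE_STATE_COUNT)
        return EINVAL;          /* value read from XenStore is untrusted */
    if (!state_map[state].fn)
        return 0;               /* transition we do not react to */
    return state_map[state].fn(d, state_map[state].arg);
}
```

Sizing the array with a count enumerator is one way to avoid the hard-coded size that the XXX comment in the patch mentions; whether such an enumerator can be added to the real XenbusState enum is a separate question.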
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 6 of 7 v5] blktap3/tapback: Introduce the tapback daemon
This patch introduces the core of the tapback daemon, the user space daemon that acts as a device's back-end, essentially most of blkback in user space. Similar to blkback, the daemon monitors XenStore for device creation/removal requests and front-end state changes, and acts in response to them (communicates necessary information to the front-end, switches the back-end's state etc.). The daemon creates/destroys tapdisk processes as needed.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>

---
Changed since v1:
* Minor code clean up.

Changed since v2:
* Introduce function XenbusState2str for printing XenbusStates in a more human-friendly way.
* Don't print the log identifier if tapback is run in non-daemon mode, in order to make debug output more concise.

Changed since v3:
* Remove some debug prints as they spam the output.
* Use function abs() to report errors as some error codes are negated.
* Create the control socket.

Changed since v4:
* Don't emit a debug print if a XenStore path has been removed.

diff --git a/tools/blktap3/tapback/tapback.c b/tools/blktap3/tapback/tapback.c new file mode 100644 --- /dev/null +++ b/tools/blktap3/tapback/tapback.c @@ -0,0 +1,364 @@ +/* + * Copyright (C) 2012 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. 
+ * + * This file contains the core of the tapback daemon, the user space daemon + * that acts as a device''s back-end. + */ + +/* + * TODO Some of these includes may be useless. + * TODO Replace hard-coding strings with defines/const string. + */ +#include <stdlib.h> +#include <stdarg.h> +#include <assert.h> +#include <syslog.h> +#include <fcntl.h> +#include <unistd.h> +#include <libgen.h> +#include <getopt.h> +#include <sys/ioctl.h> +#include <sys/mount.h> +#include <syslog.h> + +#include "blktap3.h" +#include "stdio.h" /* TODO tap-ctl.h needs to include stdio.h */ +#include "tap-ctl.h" +#include "tapback.h" +#include <sys/types.h> +#include <sys/socket.h> +#include <sys/un.h> +#include <signal.h> + +void tapback_log(int prio, const char *fmt, ...); +void (*tapback_vlog) (int prio, const char *fmt, va_list ap); + +struct _blktap3_daemon blktap3_daemon; + +char *XenbusState2str(const XenbusState xbs) +{ + static char * const str[] = { + [XenbusStateUnknown] = "unknown", + [XenbusStateInitialising] = "initialising", + [XenbusStateInitWait] = "init wait", + [XenbusStateInitialised] = "initialised", + [XenbusStateConnected] = "connected", + [XenbusStateClosing] = "closing", + [XenbusStateClosed] = "closed", + [XenbusStateReconfiguring] = "reconfiguring", + [XenbusStateReconfigured] = "reconfigured" + }; + return str[xbs]; +} + +/** + * Read changes that occurred on the "backend/<backend name>" XenStore path + * or one of the front-end paths and act accordingly. + */ +static inline void +tapback_read_watch(void) +{ + char **watch = NULL, *path = NULL, *token = NULL; + unsigned int n = 0; + int err = 0, _abort = 0; + + /* read the change */ + watch = xs_read_watch(blktap3_daemon.xs, &n); + path = watch[XS_WATCH_PATH]; + token = watch[XS_WATCH_TOKEN]; + + /* + * TODO Put the body of "again:" into a loop instead of using goto. 
+ */ +again: + if (!(blktap3_daemon.xst = xs_transaction_start(blktap3_daemon.xs))) { + WARN("error starting transaction\n"); + goto fail; + } + + DBG("path = %s\n", path); + + /* + * The token indicates which XenStore watch triggered, the front-end one or + * the back-end one. + */ + if (!strcmp(token, BLKTAP3_FRONTEND_TOKEN)) { + err = tapback_backend_handle_otherend_watch(path); + } else if (!strcmp(token, BLKTAP3_BACKEND_TOKEN)) { + err = tapback_backend_handle_backend_watch(path); + } else { + WARN("invalid token \''%s\''\n", token); + err = EINVAL; + } + + _abort = !!err; + if (_abort) { + if (err != ENOENT) { + /* TODO Some functions return +err, others -err */ + DBG("aborting transaction: %s\n", strerror(abs(err))); + } + } + + err = xs_transaction_end(blktap3_daemon.xs, blktap3_daemon.xst, _abort); + blktap3_daemon.xst = 0; + if (!err) { + err = -errno; + /* + * This is OK according to xs_transaction_end''s semantics. + */ + if (EAGAIN == errno) + goto again; + DBG("error ending transaction: %s\n", strerror(err)); + } + +fail: + free(watch); + return; +} + +static void +tapback_backend_destroy(void) +{ + int err; + + if (blktap3_daemon.xs) { + xs_daemon_close(blktap3_daemon.xs); + blktap3_daemon.xs = NULL; + } + + err = unlink(TAPBACK_CTL_SOCK_PATH); + if (err == -1 && errno != ENOENT) { + err = errno; + WARN("failed to remove %s: %s\n", TAPBACK_CTL_SOCK_PATH, strerror(err)); + } +} + +static void +signal_cb(int signum) { + + assert(signum == SIGINT || signum == SIGTERM); + + /* TODO Check whether there are active VBDs? */ + tapback_backend_destroy(); + exit(0); +} + +/** + * Initializes the back-end descriptor. There is one back-end per tapback + * process. Also, it initiates a watch to XenStore on backend/<backend name>. 
+ * + * @returns 0 on success, an error code otherwise + */ +static inline int +tapback_backend_create(void) +{ + int err; + struct sockaddr_un local; + int len; + + TAILQ_INIT(&blktap3_daemon.devices); + blktap3_daemon.xst = XBT_NULL; + blktap3_daemon.ctrl_sock = -1; + + if (!(blktap3_daemon.xs = xs_daemon_open())) { + err = EINVAL; + goto fail; + } + + /* + * Watch the back-end. + */ + if (!xs_watch(blktap3_daemon.xs, BLKTAP3_BACKEND_PATH, + BLKTAP3_BACKEND_TOKEN)) { + err = errno; + goto fail; + } + + if (SIG_ERR == signal(SIGINT, signal_cb) || + SIG_ERR == signal(SIGTERM, signal_cb)) { + WARN("failed to register signal handlers\n"); + err = EINVAL; + goto fail; + } + + /* + * Create the control socket. + * XXX We don''t listen for connections as we don''t yet support any control + * commands. + */ + blktap3_daemon.ctrl_sock = socket(AF_UNIX, SOCK_STREAM, 0); + if (blktap3_daemon.ctrl_sock == -1) { + err = errno; + WARN("failed to create control socket: %s\n", strerror(errno)); + goto fail; + } + local.sun_family = AF_UNIX; + strcpy(local.sun_path, TAPBACK_CTL_SOCK_PATH); + err = unlink(local.sun_path); + if (err && errno != ENOENT) { + err = errno; + WARN("failed to remove %s: %s\n", local.sun_path, strerror(err)); + goto fail; + } + len = strlen(local.sun_path) + sizeof(local.sun_family); + err = bind(blktap3_daemon.ctrl_sock, (struct sockaddr *)&local, len); + if (err == -1) { + err = errno; + WARN("failed to bind to %s: %s\n", local.sun_path, strerror(err)); + goto fail; + } + + return 0; + +fail: + tapback_backend_destroy(); + + return err; +} + +/** + * Runs the daemon. + * + * Watches backend/<backend name> and the front-end devices. 
+ */ +static inline int +tapback_backend_run(void) +{ + const int fd = xs_fileno(blktap3_daemon.xs); + int err; + + do { + fd_set rfds; + int nfds = 0; + + FD_ZERO(&rfds); + FD_SET(fd, &rfds); + + /* poll the fd for changes in the XenStore path we''re interested in */ + if ((nfds = select(fd + 1, &rfds, NULL, NULL, NULL)) < 0) { + perror("error monitoring XenStore"); + err = -errno; + break; + } + + if (FD_ISSET(fd, &rfds)) + tapback_read_watch(); + DBG("--\n"); + } while (1); + + return err; +} + +static char *blkback_ident = NULL; + +static void +blkback_vlog_fprintf(const int prio, const char * const fmt, va_list ap) +{ + static const char *strprio[] = { + [LOG_DEBUG] = "DBG", + [LOG_INFO] = "INF", + [LOG_WARNING] = "WRN" + }; + + assert(LOG_DEBUG == prio || LOG_INFO == prio || LOG_WARNING == prio); + assert(strprio[prio]); + + fprintf(stderr, "%s[%s] ", blkback_ident, strprio[prio]); + vfprintf(stderr, fmt, ap); +} + +/** + * Print tapback''s usage instructions. + */ +static void +usage(FILE * const stream, const char * const prog) +{ + assert(stream); + assert(prog); + + fprintf(stream, + "usage: %s\n" + "\t[-D|--debug]\n" + "\t[-h|--help]\n", prog); +} + +int main(int argc, char **argv) +{ + const char *prog = NULL; + int opt_debug = 0; + int err = 0; + + prog = basename(argv[0]); + + opt_debug = 0; + + do { + const struct option longopts[] = { + {"help", 0, NULL, ''h''}, + {"debug", 0, NULL, ''D''}, + }; + int c; + + c = getopt_long(argc, argv, "h:D", longopts, NULL); + if (c < 0) + break; + + switch (c) { + case ''h'': + usage(stdout, prog); + return 0; + case ''D'': + opt_debug = 1; + break; + case ''?'': + goto usage; + } + } while (1); + + if (opt_debug) { + blkback_ident = ""; + tapback_vlog = blkback_vlog_fprintf; + } + else { + blkback_ident = BLKTAP3_BACKEND_TOKEN; + openlog(blkback_ident, 0, LOG_DAEMON); + } + + if (!opt_debug) { + if ((err = daemon(0, 0))) { + err = -errno; + goto fail; + } + } + + if ((err = tapback_backend_create())) { + 
WARN("error creating blkback: %s\n", strerror(err)); + goto fail; + } + + err = tapback_backend_run(); + + tapback_backend_destroy(); + +fail: + return err ? -err : 0; + +usage: + usage(stderr, prog); + return 1; +}
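tapback_read_watch carries a TODO about replacing the "again:" label with a loop. As a rough illustration of what that could look like, here is a self-contained sketch of the retry-on-EAGAIN pattern. It does not call libxenstore: struct fake_store, fake_txn_start and fake_txn_end are hypothetical stand-ins that imitate a transaction end failing with EAGAIN a fixed number of times, and run_in_transaction is an invented name. Returning the handler's error to the caller is also a choice of this sketch; the function in the patch returns void.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the XenStore handle. */
struct fake_store {
    int conflicts_left;     /* how many commits will still fail with EAGAIN */
    int attempts;           /* transactions started so far */
    int commits;            /* transactions committed successfully */
};

/* Imitates starting a transaction; never fails in this sketch. */
static bool fake_txn_start(struct fake_store *s)
{
    s->attempts++;
    return true;
}

/* Imitates ending a transaction: false with errno == EAGAIN on conflict. */
static bool fake_txn_end(struct fake_store *s, bool abort)
{
    if (abort)
        return true;
    if (s->conflicts_left > 0) {
        s->conflicts_left--;
        errno = EAGAIN;
        return false;
    }
    s->commits++;
    return true;
}

typedef int (*txn_body)(void *ctx);

/*
 * Runs body() inside a transaction, repeating the whole body while the
 * commit reports EAGAIN. Returns 0 on success, the error returned by
 * body() if the transaction was aborted, or the errno of a failed commit.
 */
static int run_in_transaction(struct fake_store *s, txn_body body, void *ctx)
{
    for (;;) {
        int err;

        if (!fake_txn_start(s))
            return EIO;

        err = body(ctx);

        /* A failing body aborts the transaction instead of committing. */
        if (fake_txn_end(s, err != 0))
            return err;

        if (errno != EAGAIN)
            return errno;
        /* EAGAIN: somebody else changed the store, so run the body again. */
    }
}

/* Example bodies used to exercise the loop. */
static int count_calls(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}

static int failing_body(void *ctx)
{
    (void)ctx;
    return ENOENT;
}
```

The point of the loop form is that the body is re-executed in full on every retry, which is what the goto in the patch does as well, so the handlers must tolerate being called more than once for the same watch event.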
Thanos Makatos
2013-Jul-15 11:38 UTC
[PATCH 7 of 7 v5] blktap3/tapback: Introduce tapback daemon Makefile
This patch introduces the Makefile that builds the tapback daemon.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>

---
Changed since v2:
* Use $(BINDIR) as the daemon's installation directory.
* Fixed whitespace.

Changed since v3:
* Explicitly use libblktapctl.3 to avoid conflicts with the blktap2 one.
* Merge patch that adds the tapback binary to the mercurial ignore list into this patch.

diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -375,3 +375,6 @@
 ^unmodified_drivers/linux-2.6/.*\.ko$
 ^unmodified_drivers/linux-2.6/.*\.mod\.c$
 ^LibVNCServer.*
+
+# blktap3
+^tools/blktap3/tapback/tapback$
diff --git a/tools/blktap3/tapback/Makefile b/tools/blktap3/tapback/Makefile
--- a/tools/blktap3/tapback/Makefile
+++ b/tools/blktap3/tapback/Makefile
@@ -3,6 +3,10 @@ include $(XEN_ROOT)/tools/Rules.mk
 
 BLKTAP_ROOT := ..
 
+INST_DIR ?= $(BINDIR)
+
+IBIN = tapback
+
 # -D_GNU_SOURCE is required by vasprintf.
 override CFLAGS += \
 	-I$(BLKTAP_ROOT)/include \
@@ -25,7 +29,20 @@ override LDFLAGS += \
 	$(LDLIBS_libxenstore) \
 	$(LDFLAGS_libxenctrl)
 
+TAPBACK-OBJS := log.o xenstore.o frontend.o backend.o
+
+TAPBACK-LIBS := $(BLKTAP_ROOT)/control/libblktapctl.so.3.0
+
+all: $(IBIN)
+
+$(IBIN): $(TAPBACK-OBJS) tapback.o
+	$(CC) -o $@ $^ $(TAPBACK-LIBS) $(LDFLAGS)
+
+install: all
+	$(INSTALL_DIR) -p $(DESTDIR)$(INST_DIR)
+	$(INSTALL_PROG) $(IBIN) $(DESTDIR)$(INST_DIR)
+
 clean:
-	rm -f *.o *.o.d .*.o.d
+	rm -f *.o *.o.d .*.o.d $(IBIN)
 
 .PHONY: clean install
Thanos Makatos
2013-Jul-15 14:25 UTC
Re: [PATCH 0 of 7 v5] Introduce the tapback daemon (most of blkback in user-space)
> -----Original Message-----
> From: Thanos Makatos [mailto:thanos.makatos@citrix.com]
> Sent: 15 July 2013 12:39
> To: xen-devel@lists.xen.org
> Cc: Thanos Makatos
> Subject: [PATCH 0 of 7 v5] Introduce the tapback daemon (most of
> blkback in user-space)
>
> This patch series introduces the tapback daemon, the user space daemon
> that acts as a device's back-end, essentially most of blkback in user
> space. The daemon is responsible for coordinating the front-end and
> tapdisk. It creates tapdisk process as needed, instructs them to
> connect to/disconnect from the shared ring, and manages the state of
> the back-end.
>
> The shared ring between the front-end and the tapdisk is provided by a
> piece of code that lives inside the tapdisk and will be introduced by
> the next patch series.
>
> Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
>
> ---
> Changed since v1:
> The series has been largely reorganised:
> * Renamed the daemon from xenio to tapback.
> * Improved description in patch 0.
> * Merged structures and functions.
> * Disaggregated functionality from the core daemon source file to
> smaller ones
> in order to facilitate the review process and improve maintenance.
>
> Changed since v2:
> * Added a new patch that ignores tapback binaries.
> * For the rest of the patches, see the description in each patch.
>
> Changed since v3:
> * Replace the minor number with type:/path/to/file where necessary.
> * Create the daemon's control socket.

What's missing from this series is automatically starting the daemon when necessary. What's the best approach to this? Should the daemon be started by some xen script in /etc/init.d? Should libxl spawn the daemon on demand?