Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 0 of 8 v3] blktap3: Introduce the tapback daemon (most of blkback in user-space).
This patch series introduces the tapback daemon, the user-space daemon that
acts as a device's back-end, essentially most of blkback in user space. The
daemon is responsible for coordinating the front-end and tapdisk. It creates
tapdisk processes as needed, instructs them to connect to/disconnect from the
shared ring, and manages the state of the back-end. The shared ring between
the front-end and the tapdisk is provided by a piece of code that lives inside
the tapdisk and will be introduced by the next patch series.

Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v1:
The series has been largely reorganised:
  * Renamed the daemon from xenio to tapback.
  * Improved description in patch 0.
  * Merged structures and functions.
  * Disaggregated functionality from the core daemon source file to smaller
    ones in order to facilitate the review process and improve maintenance.
Changed since v2:
  * Added a new patch that ignores tapback binaries.
  * For the rest of the patches, see the description in each patch.
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 1 of 8 v3] blktap3/tapback: Introduce core defines and structure definitions
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * Removed BUG_ON macro.
  * Fixed whitespace in tapback_backend_find_device.
  * Clarified back-end name (xenio).
  * Removed references to the "serial" thing.
  * Removed unused member dev from struct vbd.
  * Introduced prototype of function XenbusState2str.
diff --git a/tools/blktap3/tapback/tapback.h b/tools/blktap3/tapback/tapback.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/tapback.h
@@ -0,0 +1,244 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#ifndef __TAPBACK_H__
+#define __TAPBACK_H__
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <string.h>
+
+#include <xen/xen.h>
+#include <xenstore.h>
+#include <xen/io/xenbus.h>
+#include <xen/event_channel.h>
+#include <xen/grant_table.h>
+
+#include "tap-ctl.h"
+
+void tapback_log(int prio, const char *fmt, ...);
+extern void (*tapback_vlog) (int prio, const char *fmt, va_list ap);
+
+#define DBG(_fmt, _args...)  tapback_log(LOG_DEBUG, "%s:%d "_fmt, __FILE__, \
+        __LINE__, ##_args)
+#define INFO(_fmt, _args...) tapback_log(LOG_INFO, _fmt, ##_args)
+#define WARN(_fmt, _args...) tapback_log(LOG_WARNING, "%s:%d "_fmt, __FILE__, \
+        __LINE__, ##_args)
+
+#define WARN_ON(_cond, fmt, ...)    \
+    if (unlikely(_cond)) {          \
+        printf(fmt, ##__VA_ARGS__); \
+    }
+
+/*
+ * Pre-defined XenStore path components used for running the XenBus protocol.
+ *
+ * To avoid confusion with blktap2, we'll use a new kind of device for libxl
+ * defining it in tools/libxl/libxl_types_internal.idl. This will be done by
+ * the patch that adds libxl support for blktap3. The temporary back-end name
+ * is "xenio". Once blktap3 becomes the default back-end, its back-end name
+ * should be "vbd" and "xenio" will be removed. TODO When that patch is sent,
+ * use the definition from there instead of hard-coding it here.
+ */
+#define XENSTORE_BACKEND        "backend"
+#define BLKTAP3_BACKEND_NAME    "xenio"
+#define BLKTAP3_BACKEND_PATH    XENSTORE_BACKEND"/"BLKTAP3_BACKEND_NAME
+#define BLKTAP3_BACKEND_TOKEN   XENSTORE_BACKEND"-"BLKTAP3_BACKEND_NAME
+#define BLKTAP3_FRONTEND_TOKEN  "otherend-state"
+
+/*
+ * TODO Put the rest of the front-end nodes defined in blkif.h here and group
+ * them. e.g. FRONTEND_NODE_xxx.
+ */
+#define RING_REF                "ring-ref"
+#define FEAT_PERSIST            "feature-persistent"
+
+/**
+ * A Virtual Block Device (VBD), represents a block device in a guest VM.
+ * Contains all relevant information.
+ */
+typedef struct vbd {
+
+    /**
+     * Device name, as retrieved from XenStore at probe-time.
+     */
+    char *name;
+    
+    /**
+     * The device ID. Same as vbd.name, we keep it around because the tapdisk
+     * control library wants it as an int and not as a string.
+     */
+    int devid;
+
+    /**
+     * For linked lists.
+     */
+     TAILQ_ENTRY(vbd) backend_entry;
+
+    /**
+     * The domain ID this VBD belongs to.
+     */
+    domid_t domid;
+
+    /**
+     * The root directory in XenStore for this VBD. This is where all
+     * directories and key/value pairs related to this VBD are stored.
+     */
+    char *frontend_path;
+
+    /**
+     * XenStore path to the VBD's state. This is just
+     * vbd.frontend_path + "/state", we keep it around so we don't have to
+     * allocate/free memory all the time.
+     */
+    char *frontend_state_path;
+
+    /**
+     * Indicates whether the tapdisk is connected to the shared ring.
+     */
+    bool connected;
+
+    /**
+     * Descriptor of the tapdisk process serving this virtual block device. We
+     * need this until the end of the VBD's lifetime in order to disconnect
+     * the tapdisk from the shared ring.
+     */
+    tap_list_t tap;
+
+    /*
+     * XXX We keep sector_size, sectors, and info because we need to
+     * communicate them to the front-end not only when the front-end goes to
+     * XenbusStateInitialised, but to XenbusStateConnected as well.
+     */
+
+    /**
+     * Sector size, supplied by the tapdisk, communicated to blkfront.
+     */
+    unsigned int sector_size;
+
+    /**
+     * Number of sectors, supplied by the tapdisk, communicated to blkfront.
+     */
+    unsigned long long sectors;
+
+    /**
+     * VDISK_???, defined in include/xen/interface/io/blkif.h.
+     */
+    unsigned int info;
+
+} vbd_t;
+
+TAILQ_HEAD(tqh_vbd, vbd);
+
+/**
+ * The collection of all necessary handles and descriptors.
+ */
+struct _blktap3_daemon {
+
+    /**
+     * A handle to XenStore.
+     */
+    struct xs_handle *xs;
+
+    /**
+     * For executing transacted operations on XenStore.
+     */
+    xs_transaction_t xst;
+
+    /**
+     * The list of virtual block devices.
+     *
+     * TODO We sometimes have to scan the whole list to find the device/domain
+     * we're interested in, should we optimize this? E.g. use a hash table
+     * for O(1) access?
+     * TODO Replace with a hash table (hcreate etc.)?
+     */
+    struct tqh_vbd devices;
+
+    /**
+     * TODO From xen/include/public/io/blkif.h: "The maximum supported size of
+     * the request ring buffer" 
+     */
+    int max_ring_page_order;
+};
+
+extern struct _blktap3_daemon blktap3_daemon;
+
+#define tapback_backend_for_each_device(_device, _next)	\
+	TAILQ_FOREACH_SAFE(_device, &blktap3_daemon.devices, backend_entry, _next)
+
+/**
+ * Iterates over all devices and returns the one for which the condition is
+ * true.
+ */
+#define tapback_backend_find_device(_device, _cond)     \
+do {                                                    \
+    vbd_t *__next;                                      \
+    int found = 0;                                      \
+    tapback_backend_for_each_device(_device, __next) {  \
+        if (_cond) {                                    \
+            found = 1;                                  \
+            break;                                      \
+        }                                               \
+    }                                                   \
+    if (!found)                                         \
+        _device = NULL;                                 \
+} while (0)
+
+/**
+ * Act in response to a change in the front-end XenStore path.
+ *
+ * @param path the front-end's XenStore path that changed
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by tapback_read_watch
+ */
+int
+tapback_backend_handle_otherend_watch(const char * const path);
+
+/**
+ * Act in response to a change in the back-end directory in XenStore.
+ *
+ * If the path is "/backend" or "/backend/<backend name>", all devices are
+ * probed. Otherwise, the path should be
+ * "backend/<backend name>/<domid>/<device name>"
+ * (e.g. backend/<backend name>/1/51712), and in this case this specific device
+ * is probed.
+ *
+ * @param path the back-end's XenStore path that changed
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by tapback_read_watch.
+ */
+int
+tapback_backend_handle_backend_watch(char * const path);
+
+/**
+ * Converts XenbusState values to a printable string, e.g. XenbusStateConnected
+ * corresponds to "connected".
+ *
+ * @param xbs the XenbusState to convert
+ * @returns a printable string
+ */
+char *
+XenbusState2str(const XenbusState xbs);
+
+#endif /* __TAPBACK_H__ */
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 2 of 8 v3] blktap3/tapback: Introduces functionality required to access XenStore
This patch introduces convenience functions that read/write values from/to
XenStore.
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * Introduced makefile to facilitate development.
  * Removed functions (v)mprintf as (v)asprintf suffice.
  * Ensure tapback_xs_vread (a) returns a NULL terminated string, and (b) the
    returned string doesn't contain NULL characters, apart from the
    NULL-terminating one.
  * Removed unnecessary variable initialisations.
  * Minor whitespace clean up.
  * Fixed typo in tapback_xs_vread.
  * function tapback_xs_vread: handle corner case where the value returned by
    xs_read is a zero-length string.
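
For readers following the tapback_xs_vread changes listed above, here is a
minimal, stand-alone sketch of the same two guarantees: the result is always
terminated, and a value with an embedded terminator is rejected. It is not the
code from this patch. The helper name normalise_value is hypothetical, and it
uses memchr on a length-delimited buffer instead of the strndup/strrchr
sequence in xenstore.c below.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not part of the patch).
 *
 * Copies the first `len` bytes of `buf` into a freshly allocated,
 * terminated string. Returns NULL if the value contains an embedded
 * '\0' or if allocation fails. The caller must free the result.
 */
char *
normalise_value(const char *buf, size_t len)
{
    char *s;

    /* An embedded '\0' would silently truncate the value for C callers. */
    if (len > 0 && memchr(buf, '\0', len) != NULL)
        return NULL;

    s = malloc(len + 1);
    if (!s)
        return NULL;

    /* A zero-length value is valid and becomes the empty string. */
    if (len > 0)
        memcpy(s, buf, len);
    s[len] = '\0';

    return s;
}
```

Checking for embedded terminators before copying avoids the need to reason
about which bytes a later strlen-style scan would actually see.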
diff --git a/tools/blktap3/tapback/Makefile b/tools/blktap3/tapback/Makefile
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/Makefile
@@ -0,0 +1,31 @@
+XEN_ROOT := $(CURDIR)/../../../
+include $(XEN_ROOT)/tools/Rules.mk
+
+BLKTAP_ROOT := ..
+
+# -D_GNU_SOURCE is required by vasprintf.
+override CFLAGS += \
+    -I$(BLKTAP_ROOT)/include \
+    -I$(BLKTAP_ROOT)/control \
+    -D_GNU_SOURCE \
+    $(CFLAGS_libxenstore) \
+    $(CFLAGS_libxenctrl) \
+    $(CFLAGS_xeninclude) \
+    -Wall \
+    -Wextra \
+    -Werror
+
+# FIXME cause trouble
+override CFLAGS += \
+    -Wno-old-style-declaration \
+    -Wno-sign-compare \
+    -Wno-type-limits
+
+override LDFLAGS += \
+    $(LDLIBS_libxenstore) \
+    $(LDFLAGS_libxenctrl)
+
+clean:
+	rm -f *.o *.o.d .*.o.d
+
+.PHONY: clean install
diff --git a/tools/blktap3/tapback/xenstore.c b/tools/blktap3/tapback/xenstore.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/xenstore.c
@@ -0,0 +1,195 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <xenstore.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "blktap3.h"
+#include "tapback.h"
+#include "xenstore.h"
+
+char *
+tapback_xs_vread(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, va_list ap)
+{
+    char *path, *data = NULL;
+    unsigned int len = 0;
+
+    assert(xs);
+
+    if (vasprintf(&path, fmt, ap) == -1)
+        goto fail;
+    assert(path);
+
+    data = xs_read(xs, xst, path, &len);
+    free(path);
+
+    if (!data)
+        return NULL;
+
+    assert(len >= 0);
+
+    /*
+     * Make sure the returned string is NULL-terminated.
+     */
+    if ((len > 0 && data[len - 1] != '\0') || (len == 0 && data[0] != '\0')) {
+        char *_data = strndup(data, len);
+        if (!_data)
+            /* TODO log error */
+            goto fail;
+        free(data);
+        data = _data;
+    }
+
+    /*
+     * Make sure the returned string does not contain NULL characters, apart
+     * from the NULL-terminating one.
+     *
+     * We should be checking for extraneous NULLs before duplicating the
+     * buffer, but this way logic is simplified.
+     */
+    if (strrchr(data, '\0') - data != len)
+        /* TODO log error */
+        goto fail;
+
+    return data;
+fail:
+    free(data);
+    return NULL;
+}
+
+__printf(3, 4)
+char *
+tapback_xs_read(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, ...)
+{
+    va_list ap;
+    char *s;
+
+    assert(xs);
+
+    va_start(ap, fmt);
+    s = tapback_xs_vread(xs, xst, fmt, ap);
+    va_end(ap);
+
+    return s;
+}
+
+char *
+tapback_device_read(const vbd_t * const device, const char * const path)
+{
+    assert(device);
+    assert(path);
+
+    return tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst,
+            "%s/%d/%s/%s", BLKTAP3_BACKEND_PATH, device->domid, device->name,
+            path);
+}
+
+char *
+tapback_device_read_otherend(vbd_t * const device,
+        const char * const path)
+{
+    assert(device);
+    assert(path);
+
+    return tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s/%s",
+            device->frontend_path, path);
+}
+
+__scanf(3, 4)
+int
+tapback_device_scanf_otherend(vbd_t * const device,
+        const char * const path, const char * const fmt, ...)
+{
+    va_list ap;
+    int n = 0;
+    char *s = NULL;
+
+    assert(device);
+    assert(path);
+
+    if (!(s = tapback_device_read_otherend(device, path)))
+        return -1;
+    va_start(ap, fmt);
+    n = vsscanf(s, fmt, ap);
+    free(s);
+    va_end(ap);
+
+    return n;
+}
+
+__printf(4, 5)
+int
+tapback_device_printf(vbd_t * const device, const char * const key,
+        const bool mkread, const char * const fmt, ...)
+{
+    va_list ap;
+    int err = 0;
+    char *path = NULL, *val = NULL;
+    bool nerr = false;
+
+    assert(device);
+    assert(key);
+
+    if (-1 == asprintf(&path, "%s/%d/%s/%s", BLKTAP3_BACKEND_PATH,
+                device->domid, device->name, key)) {
+        err = -errno;
+        goto fail;
+    }
+
+    va_start(ap, fmt);
+    if (-1 == vasprintf(&val, fmt, ap))
+        val = NULL;
+    va_end(ap);
+
+    if (!val) {
+        err = -errno;
+        goto fail;
+    }
+
+    if (!(nerr = xs_write(blktap3_daemon.xs, blktap3_daemon.xst, path, val,
+                    strlen(val)))) {
+        err = -errno;
+        goto fail;
+    }
+
+    if (mkread) {
+        struct xs_permissions perms = {
+            device->domid,
+            XS_PERM_READ
+        };
+
+        if (!(nerr = xs_set_permissions(blktap3_daemon.xs, blktap3_daemon.xst,
+                        path, &perms, 1))) {
+            err = -errno;
+            goto fail;
+        }
+    }
+
+fail:
+    free(path);
+    free(val);
+
+    return err;
+}
diff --git a/tools/blktap3/tapback/xenstore.h b/tools/blktap3/tapback/xenstore.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/xenstore.h
@@ -0,0 +1,95 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+/**
+ * Retrieves the XenStore value of the specified key of the VBD's front-end.
+ * The caller must free the returned buffer.
+ *
+ * @param device the VBD
+ * @param path key under the front-end directory
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_device_read_otherend(vbd_t * const device,
+        const char * const path);
+
+/**
+ * Writes to XenStore backend/tapback/<domid>/<devname>/@key = @fmt.
+ *
+ * @param device the VBD
+ * @param key the key to write to
+ * @param mkread TODO
+ * @param fmt format
+ * @returns 0 on success, an error code otherwise
+ */
+__printf(4, 5)
+int
+tapback_device_printf(vbd_t * const device, const char * const key,
+        const bool mkread, const char * const fmt, ...);
+
+/**
+ * Reads the specified XenStore path under the front-end directory in a
+ * scanf-like manner.
+ *
+ * @param device the VBD
+ * @param path XenStore path to read
+ * @param fmt format
+ */
+__scanf(3, 4)
+int
+tapback_device_scanf_otherend(vbd_t * const device,
+        const char * const path, const char * const fmt, ...);
+
+/**
+ * Retrieves the value of the specified key of the device from XenStore,
+ * i.e. backend/tapback/<domid>/<devname>/@path
+ * The caller must free the returned buffer.
+ *
+ * @param device the VBD 
+ * @param path the XenStore key
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_device_read(const vbd_t * const device, const char * const path);
+
+/**
+ * Reads the specified XenStore path. The caller must free the returned buffer.
+ *
+ * @param xs handle to XenStore
+ * @param xst XenStore transaction 
+ * @param fmt format
+ * @param ap arguments
+ * @returns a buffer containing the value, or NULL on error
+ */
+char *
+tapback_xs_vread(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, va_list ap);
+
+/**
+ * Reads the specified XenStore path. The caller must free the returned buffer.
+ *
+ * @param xs handle to XenStore
+ * @param xst XenStore transaction
+ * @param fmt format
+ * @returns a buffer containing the value, or NULL on error
+ */
+__printf(3, 4)
+char *
+tapback_xs_read(struct xs_handle * const xs, xs_transaction_t xst,
+        const char * const fmt, ...);
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 3 of 8 v3] blktap3/tapback: Logging for the tapback daemon and libxenio
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * minor whitespace clean up
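
The log.c file below routes every message through a function pointer that
defaults to vsyslog. The following stand-alone sketch illustrates why that
indirection is convenient: the sink can be swapped at run time, for instance
to capture output in a buffer. All demo_* names are hypothetical and exist
only for this illustration; only the shape of the hook mirrors the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for vsyslog() on glibc */
#endif

#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* Same shape as the hook in log.c: defaults to vsyslog. */
void (*demo_vlog)(int prio, const char *fmt, va_list ap) = vsyslog;

/* Front-end that forwards its variadic arguments to the current sink. */
void
demo_log(int prio, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    demo_vlog(prio, fmt, ap);
    va_end(ap);
}

/* Hypothetical alternative sink: remembers the last message and priority. */
char demo_buf[128];
int demo_prio;

void
demo_capture_vlog(int prio, const char *fmt, va_list ap)
{
    demo_prio = prio;
    vsnprintf(demo_buf, sizeof(demo_buf), fmt, ap);
}
```

Assigning demo_capture_vlog to demo_vlog before calling demo_log redirects
all subsequent messages without touching any call site.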
diff --git a/tools/blktap3/tapback/log.c b/tools/blktap3/tapback/log.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/log.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#include "compiler.h"
+#include <stdarg.h>
+#include <syslog.h>
+
+void (*tapback_vlog) (int prio, const char *fmt, va_list ap) = vsyslog;
+
+__printf(2, 3) void
+tapback_log(int prio, const char *fmt, ...)
+{
+    va_list ap;
+    va_start(ap, fmt);
+    tapback_vlog(prio, fmt, ap);
+    va_end(ap);
+}
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 4 of 8 v3] blktap3/tapback: Introduce back-end XenStore path handler
This patch introduces the handler executed when the back-end XenStore path gets
modified. A back-end XenStore path is modified as a result of a device
creation/removal request. The device is created/removed depending on whether
its path exists/does not exist in XenStore.
Creating a device comprises creating the in-memory representation of it and
adding it to the list of devices, locating the tapdisk designated to serve
this VBD, and setting a XenStore watch to the front-end path of the
newly-created device.
Deleting a device comprises removing that XenStore watch and deallocating its
in-memory representation.
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * Removed the "serial" thing.
  * Function tapback_backend_handle_backend_watch doesn't always have to scan
    the entire back-end sub-tree, this is left as a future optimisation
    (relevant comment inside code updated).
  * Replaced mprintf with asprintf.
  * Check for failures in tapback_backend_scan.
  * Minor code and whitespace clean up.
  * The tapback daemon searches for a suitable tapdisk and creates one if none
    is found. It also kills it when the front-end disconnects.
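
The create/remove rule described above (compare what XenStore says against
the in-memory device list) can be summarised by the following stand-alone
sketch. The enum and function names are hypothetical and are not part of the
patch; the real logic lives in tapback_backend_probe_device below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum probe_action { PROBE_NONE, PROBE_CREATE, PROBE_REMOVE };

/*
 * in_xenstore:    the device's back-end path exists in XenStore
 * in_device_list: the daemon already has an in-memory vbd for it
 */
enum probe_action
probe_decide(bool in_xenstore, bool in_device_list)
{
    if (in_xenstore && !in_device_list)
        return PROBE_CREATE;    /* new device: allocate it and watch it */
    if (!in_xenstore && in_device_list)
        return PROBE_REMOVE;    /* path gone: unwatch and free it */
    return PROBE_NONE;          /* nothing changed for this device */
}
```

In this sketch, a watch that fires while both views already agree maps to
PROBE_NONE, which corresponds to the "spurious watch" case in the code.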
diff --git a/tools/blktap3/tapback/backend.c b/tools/blktap3/tapback/backend.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/backend.c
@@ -0,0 +1,503 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ *
+ * This file contains the handler executed when the back-end XenStore path gets
+ * modified.
+ */
+
+#include "tapback.h"
+#include "xenstore.h"
+#include "tap-ctl-info.h"
+
+/**
+ * Removes the XenStore watch from the front-end.
+ *
+ * @param device the VBD whose front-end XenStore path should stop being
+ * watched
+ */
+static void
+tapback_device_unwatch_frontend_state(vbd_t * const device)
+{
+    assert(device);
+
+    if (device->frontend_state_path)
+        xs_unwatch(blktap3_daemon.xs, device->frontend_state_path,
+                BLKTAP3_FRONTEND_TOKEN);
+
+    free(device->frontend_state_path);
+    device->frontend_state_path = NULL;
+}
+
+/**
+ * Destroys and deallocates the back-end part of a VBD.
+ *
+ * @param device the VBD to destroy
+ */
+static void
+tapback_backend_destroy_device(vbd_t * const device)
+{
+    int err;
+
+    assert(device);
+
+    TAILQ_REMOVE(&blktap3_daemon.devices, device, backend_entry);
+
+    tapback_device_unwatch_frontend_state(device);
+
+    /*
+     * kill the tapdisk
+     */
+    err = tap_ctl_destroy(device->tap.pid, device->tap.minor, 0, NULL);
+    if (err)
+        WARN("failed to destroy tapdisk %d/%d: %s (error ignored)\n",
+                device->tap.pid, device->tap.minor, strerror(err));
+
+    free(device->frontend_path);
+    free(device->name);
+
+    /*
+     * XXX device->bdev is expected to have been freed
+     */
+
+    free(device);
+}
+
+/**
+ * Retrieves the tapdisk designated to serve this device, storing this
+ * information in the supplied VBD handle.
+ *
+ * @param params <type>:/path/to/file
+ * @param tap output parameter that receives the tapdisk process information.
+ * The parameter is undefined when the function returns a non-zero value.
+ * @returns 0 if a suitable tapdisk is found, ESRCH if no suitable tapdisk is
+ * found, and an error code in case of error
+ *
+ * TODO rename function
+ *
+ * XXX Only called by blkback_probe.
+ */
+static inline int
+blkback_find_tapdisk(const char *params, tap_list_t *tap)
+{
+    struct tqh_tap_list list;
+    tap_list_t *_tap;
+    int err;
+    char *type;
+    char *path;
+    char *_params = NULL;
+
+    assert(params);
+    assert(tap);
+
+    _params = strdup(params);
+    if (!_params) {
+        WARN("failed to allocate memory\n");
+        err = ENOMEM;
+        goto out;
+    }
+
+    type = strtok(_params, ":");
+    if (type) {
+        path = strtok(NULL, ":");
+    }
+    if (!type || !path) {
+        WARN("malformed params \'%s\'\n", params);
+        err = EINVAL;
+        goto out;
+    }
+
+    err = tap_ctl_list(&list);
+    if (err) {
+        WARN("error listing tapdisks: %s\n", strerror(err));
+        goto out;
+    }
+
+    err = ESRCH;
+    if (!TAILQ_EMPTY(&list)) {
+        tap_list_for_each_entry(_tap, &list) {
+            if (_tap->type && !strcmp(_tap->type, type) && _tap->path
+                    && !strcmp(_tap->path, path)) {
+                err = 0;
+                memcpy(tap, _tap, sizeof(*tap));
+                break;
+            }
+        }
+        tap_ctl_list_free(&list);
+    } else
+        DBG("no tapdisks\n");
+
+out:
+    free(_params);
+    return err;
+}
+
+/**
+ * Creates a device and adds it to the list of devices.
+ * Initiates a XenStore watch to the front-end state.
+ *
+ * Creating the device implies initializing the handle and retrieving all the
+ * information of the tapdisk serving this VBD.
+ *
+ * @param domid the ID of the domain where the VBD is created
+ * @param name the name of the device
+ * @returns 0 on success, an error code otherwise
+ */
+static inline int
+tapback_backend_create_device(const domid_t domid, const char * const name)
+{
+    vbd_t *device = NULL;
+    int err = 0;
+    char *params = NULL;
+
+    assert(name);
+
+    DBG("creating device %d/%s\n", domid, name);
+
+    if (!(device = calloc(1, sizeof(*device)))) {
+        WARN("error allocating memory\n");
+        err = -errno;
+        goto out;
+    }
+
+    device->domid = domid;
+
+    TAILQ_INSERT_TAIL(&blktap3_daemon.devices, device, backend_entry);
+
+    if (!(device->name = strdup(name))) {
+        err = -errno;
+        goto out;
+    }
+
+    /*
+     * Get the front-end path from XenStore. We need this to talk to the
+     * front-end.
+     */
+    if (!(device->frontend_path = tapback_device_read(device, "frontend"))) {
+        err = errno;
+        WARN("failed to read front-end path: %s\n", strerror(err));
+        goto out;
+    }
+
+    /*
+     * Get the file path backing the VBD.
+     */
+    params = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst,
+            "%s/%d/%s/params", BLKTAP3_BACKEND_PATH, domid, name);
+    if (!params) {
+        err = errno;
+        WARN("failed to read backing file: %s\n", strerror(err));
+        goto out;
+    }
+    DBG("need to find tapdisk serving \'%s\'\n", params);
+
+    err = blkback_find_tapdisk(params, &device->tap);
+    if (!err) {
+        DBG("found tapdisk %d/%d\n", device->tap.pid, device->tap.minor);
+    } else if (err == ESRCH) {
+        /* FIXME replace with tap_ctl_create */
+        DBG("no such tapdisk\n");
+        err = tap_ctl_spawn();
+        if (err <= 0) {
+            WARN("failed to create tapdisk: %s\n", strerror(err));
+            goto out;
+        }
+        device->tap.pid = err;
+        DBG("spawned tapdisk %d\n", device->tap.pid);
+        device->tap.minor = 0;
+        err = tap_ctl_attach(device->tap.pid, device->tap.minor);
+        if (err) {
+            WARN("failed to attach tapdisk: %s\n", strerror(err));
+            goto out;
+        }
+        DBG("attached tapdisk %d\n", device->tap.pid);
+        err = tap_ctl_open(device->tap.pid, 0, params, 0, -1, NULL);
+        if (err) {
+            WARN("failed to open tapdisk: %s\n", strerror(err));
+            goto out;
+        }
+        DBG("opened tapdisk %d\n", device->tap.pid);
+    } else  {
+        WARN("error looking for tapdisk: %s\n", strerror(err));
+        goto out;
+    }
+
+    /*
+     * get the VBD parameters from the tapdisk
+     */
+    if ((err = tap_ctl_info(device->tap.pid, 0, &device->sectors,
+                    &device->sector_size, &device->info))) {
+        WARN("error retrieving disk characteristics: %s\n", strerror(-err));
+        goto out;
+    }
+
+    DBG("got %d-%d with tapdev %d/%d\n", device->domid, device->devid,
+            device->tap.pid, device->tap.minor);
+
+	/*
+	 * Finally, watch the front-end path in XenStore for changes, i.e.
+     * /local/domain/<domid>/device/vbd/<devname>/state
+     * After this, we wait for the front-end to switch state to continue with
+     * the initialisation.
+	 */
+    if (asprintf(&device->frontend_state_path, "%s/state",
+                device->frontend_path) == -1) {
+        /* TODO log error */
+        err = -errno;
+        goto out;
+    }
+    assert(device->frontend_state_path);
+
+    /*
+     * We use the same token for all front-end watches. We don't have to use a
+     * unique token for each front-end watch because when a front-end watch
+     * fires we are given the XenStore path that changed.
+     */
+    if (!xs_watch(blktap3_daemon.xs, device->frontend_state_path,
+                BLKTAP3_FRONTEND_TOKEN)) {
+        free(device->frontend_state_path);
+        err = -errno;
+        goto out;
+    }
+
+out:
+    if (err) {
+        WARN("error creating device: domid=%d name=%s err=%d (%s)\n",
+                domid, name, err, strerror(err));
+        if (device)
+            tapback_backend_destroy_device(device);
+    }
+    free(params);
+    return err;
+}
+
+/**
+ * Creates (removes) a device depending on the existence (non-existence) of the
+ * "backend/<backend name>/@domid/@devname" XenStore path.
+ *
+ * @param domid the ID of the domain where the VBD is created
+ * @param devname device name
+ * @returns 0 on success, an error code otherwise
+ */
+static int
+tapback_backend_probe_device(const domid_t domid, const char * const devname)
+{
+    int should_exist = 0, create = 0, remove = 0;
+    vbd_t *device = NULL;
+    char * s = NULL;
+
+    assert(devname);
+
+    DBG("probing device domid=%d name=%s\n", domid, devname);
+
+    /*
+     * Ask XenStore if the device _should_ exist.
+     */
+    s = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s/%d/%s",
+            BLKTAP3_BACKEND_PATH, domid, devname);
+    should_exist = s != NULL;
+    free(s);
+
+    /*
+     * Search the device list for this specific device.
+     */
+    tapback_backend_find_device(device,
+            device->domid == domid && !strcmp(device->name, devname));
+
+    /*
+     * If XenStore says that the device should exist but it's not in our device
+     * list, we must create it. If it's the other way round, this is a removal.
+     */
+    remove = device && !should_exist;
+    create = !device && should_exist;
+
+    DBG("should exist=%d device=%p remove=%d create=%d\n",
+        should_exist, device, remove, create);
+
+    if (!create && !remove) {
+        /*
+         * A watch has triggered on a path we're not interested in.
+         * TODO Check if we can avoid probing the device completely based on
+         * the path that triggered.
+         */
+        DBG("spurious XenStore watch triggered on back-end\n");
+        return 0;
+    }
+
+    /*
+     * As computed above, remove and create are mutually exclusive, so at most
+     * one of the two branches below runs. A device that is removed and
+     * re-created between two probes is therefore seen as unchanged and is not
+     * re-created here.
+     */
+
+    if (remove)
+        tapback_backend_destroy_device(device);
+
+    if (create) {
+        const int err = tapback_backend_create_device(domid, devname);
+        if (0 != err) {
+            WARN("error creating device %s on domain %d: %s\n", devname, domid,
+                    strerror(err));
+            return err;
+        }
+    }
+
+    return 0;
+}
+
+/**
+ * Scans XenStore for all blktap3 devices and probes each one of them.
+ *
+ * XXX Only called by tapback_backend_handle_backend_watch.
+ */
+static int
+tapback_backend_scan(void)
+{
+    vbd_t *device = NULL, *next = NULL;
+    unsigned int i = 0, j = 0, n = 0, m = 0;
+    char **dir = NULL;
+    int err = 0;
+
+    DBG("scanning the back-end\n");
+
+    /*
+     * scrap all non-existent devices
+     * FIXME Why do we do this?
+     * FIXME Is this costly?
+     */
+
+    tapback_backend_for_each_device(device, next) {
+        err = tapback_backend_probe_device(device->domid, device->name);
+        if (err) {
+            WARN("error probing device %s of domain %d: %s\n", device->name,
+                    device->domid, strerror(err));
+            /* TODO Should we fail in this case or keep probing? */
+            goto out;
+        }
+    }
+
+    /*
+     * probe the new ones
+     *
+     * FIXME We're checking each and every device in each and every domain,
+     * could there be a performance issue in the presence of many VMs/VBDs?
+     * (e.g. boot-storm)
+     */
+    if (!(dir = xs_directory(blktap3_daemon.xs, blktap3_daemon.xst,
+                    BLKTAP3_BACKEND_PATH, &n))) {
+        err = errno;
+        if (err == ENOENT)
+            err = 0;
+        else
+            WARN("error listing %s: %s\n", BLKTAP3_BACKEND_PATH,
+                    strerror(err));
+        goto out;
+    }
+
+    DBG("probing %d domains\n", n);
+
+    for (i = 0; i < n; i++) { /* for each domain */
+        char *path = NULL, **sub = NULL, *end = NULL;
+        domid_t domid = 0;
+
+        /*
+         * Get the domain ID.
+         */
+        domid = strtoul(dir[i], &end, 0);
+        if (*end != 0 || end == dir[i])
+            continue;
+
+        /*
+         * Read the devices of this domain.
+         */
+        if (asprintf(&path, "%s/%d", BLKTAP3_BACKEND_PATH, domid) == -1) {
+            /* TODO log error */
+            err = errno;
+            goto out;
+        }
+        sub = xs_directory(blktap3_daemon.xs, blktap3_daemon.xst, path, &m);
+        if (!sub) {
+            err = errno;
+            WARN("error listing %s: %s\n", path, strerror(err));
+            free(path);
+            goto out;
+        }
+        free(path);
+
+        /*
+         * Probe each device.
+         */
+        for (j = 0; j < m; j++) {
+            err = tapback_backend_probe_device(domid, sub[j]);
+            if (err) {
+                WARN("error probing device %s of domain %d: %s\n", sub[j],
+                        domid, strerror(err));
+                goto out;
+            }
+        }
+
+        free(sub);
+    }
+
+out:
+    free(dir);
+    return err;
+}
+
+int
+tapback_backend_handle_backend_watch(char * const path)
+{
+    char *s = NULL, *end = NULL, *name = NULL;
+    domid_t domid = 0;
+
+    assert(path);
+
+    DBG("handling watch triggered on path \'%s\'\n", path);
+
+    s = strtok(path, "/");
+    assert(!strcmp(s, XENSTORE_BACKEND));
+    if (!(s = strtok(NULL, "/")))
+        return tapback_backend_scan();
+
+    assert(!strcmp(s, BLKTAP3_BACKEND_NAME));
+    if (!(s = strtok(NULL, "/")))
+        return tapback_backend_scan();
+
+    domid = strtoul(s, &end, 0);
+    if (*end != 0 || end == s) {
+        WARN("invalid domain ID \'%s\'\n", s);
+        return EINVAL;
+    }
+
+    /*
+     * TODO Optimisation: since we know which domain changed, we don't have to
+     * scan the whole thing. Add the domid as an optional parameter to
+     * tapback_backend_scan.
+     */
+    if (!(name = strtok(NULL, "/")))
+        return tapback_backend_scan();
+
+    /*
+     * Create or remove a specific device.
+     *
+     * TODO tapback_backend_probe_device reads xenstore again to see if the
+     * device should exist, but we already know that in the current function.
+     * Optimise this case.
+     */
+    return tapback_backend_probe_device(domid, name);
+}
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 5 of 8 v3] blktap3/tapback: Introduce front-end XenStore path handler
This patch introduces the handler executed when the front-end XenStore path of
a VBD gets modified. This is done when the front-end switches state. The core
of this functionality is connecting/disconnecting the tapdisk to/from the
shared ring.
When the front-end goes to Initialised or Connected state, the daemon reads all
the necessary information from XenStore provided by the front-end, initiates
the shared ring creation (the actual creation is performed by a library that
will be introduced by another patch series), and instructs the tapdisk
designated to serve this VBD to connect to the ring. Finally, it communicates
to the front-end all the necessary disk parameters.
When the front-end switches to Closed, the tapdisk is disconnected from the
shared ring.
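As an illustration of the mapping described above, the following is a minimal,
self-contained sketch of a table that is indexed by the front-end's state and
yields the action the back-end takes. The enum names (fe_state, be_action) and
the function action_for_frontend_state are hypothetical stand-ins and are not
identifiers from this series; the handler in this patch stores callbacks
instead and uses the XenbusState enum from the Xen headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for XenbusState; values follow the same order. */
enum fe_state {
    FE_UNKNOWN = 0,
    FE_INITIALISING,
    FE_INIT_WAIT,
    FE_INITIALISED,
    FE_CONNECTED,
    FE_CLOSING,
    FE_CLOSED,
    FE_RECONFIGURING,
    FE_RECONFIGURED,
    FE_NR_STATES
};

/* Hypothetical names for what the back-end does in response. */
enum be_action {
    BE_IGNORE = 0,          /* nothing to do for this state */
    BE_SWITCH_INIT_WAIT,    /* switch back-end state to InitWait */
    BE_CONNECT_TAPDISK,     /* connect tapdisk to the shared ring */
    BE_SWITCH_CLOSING,      /* switch back-end state to Closing */
    BE_DISCONNECT_TAPDISK   /* disconnect tapdisk from the shared ring */
};

/*
 * Returns the action for a front-end state. States that are out of range or
 * have no entry in the table map to BE_IGNORE.
 */
static enum be_action
action_for_frontend_state(int state)
{
    /* Entries not listed here are zero-initialised, i.e. BE_IGNORE. */
    static const enum be_action map[FE_NR_STATES] = {
        [FE_INITIALISING] = BE_SWITCH_INIT_WAIT,
        [FE_INITIALISED]  = BE_CONNECT_TAPDISK,
        [FE_CONNECTED]    = BE_CONNECT_TAPDISK,
        [FE_CLOSING]      = BE_SWITCH_CLOSING,
        [FE_CLOSED]       = BE_DISCONNECT_TAPDISK,
    };

    if (state < 0 || state >= FE_NR_STATES)
        return BE_IGNORE;
    return map[state];
}
```

Unlike the handler in this patch, which asserts that the state is in range,
the sketch bounds-checks the index and ignores unknown values. Which of the two
is preferable for a value read from XenStore is a design choice left open here.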
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * Documented the frontend_state_change_map array of callbacks.
  * Minor code and whitespace clean up.
  * Added some debug prints.
  * Don't fail if the front-end doesn't tell us whether or not it supports
    persistent grants, just assume it doesn't.
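The last item above can be sketched in isolation as follows. This is only an
illustration under stated assumptions: parse_optional_flag is a hypothetical
helper that does not exist in this series, and its value argument stands for
the string read from the front-end's XenStore node, or NULL if the node is
absent.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: interprets an optional boolean XenStore node.
 *
 * @param value the node's contents, or NULL if the node does not exist
 * @param flag where the result is stored
 * @returns 0 on success, EINVAL if the node holds anything but "0" or "1"
 */
static int
parse_optional_flag(const char *value, bool *flag)
{
    assert(flag);

    if (!value) {
        /* The node is missing: assume the feature is not supported. */
        *flag = false;
        return 0;
    }
    if (!strcmp(value, "0")) {
        *flag = false;
        return 0;
    }
    if (!strcmp(value, "1")) {
        *flag = true;
        return 0;
    }
    /* The node exists but its contents are malformed. */
    return EINVAL;
}
```

A missing node is treated as "not supported" and is not an error, whereas a
node with unexpected contents is still rejected.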
diff --git a/tools/blktap3/include/blktap3.h b/tools/blktap3/include/blktap3.h
--- a/tools/blktap3/include/blktap3.h
+++ b/tools/blktap3/include/blktap3.h
@@ -44,4 +44,19 @@
     TAILQ_REMOVE(src, node, entry);             \
     TAILQ_INSERT_TAIL(dst, node, entry);
 
+/**
+ * Block I/O protocol
+ *
+ * Taken from linux/drivers/block/xen-blkback/common.h so that blkfront can
+ * work both with blktap2 and blktap3.
+ *
+ * TODO linux/drivers/block/xen-blkback/common.h contains other definitions
+ * necessary for allowing tapdisk3 to talk directly to blkfront. Find a way to
+ * use the definitions from there.
+ */
+enum blkif_protocol {
+       BLKIF_PROTOCOL_NATIVE = 1,
+       BLKIF_PROTOCOL_X86_32 = 2,
+       BLKIF_PROTOCOL_X86_64 = 3,
+};
 #endif /* __BLKTAP_3_H__ */
diff --git a/tools/blktap3/tapback/frontend.c b/tools/blktap3/tapback/frontend.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/frontend.c
@@ -0,0 +1,418 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ *
+ * This file contains the handler executed when the front-end XenStore path of
+ * a VBD gets modified.
+ */
+
+#include "tapback.h"
+#include "xenstore.h"
+#include "tap-ctl-xen.h"
+
+#include <xen/io/protocols.h>
+
+/**
+ * Switches the back-end state of the device by writing to XenStore.
+ *
+ * @param device the VBD
+ * @param state the state to switch to
+ * @returns 0 on success, an error code otherwise
+ */
+static int
+tapback_device_switch_state(vbd_t * const device,
+        const XenbusState state)
+{
+    int err;
+
+    assert(device);
+
+    /*
+     * TODO Ensure @state contains a legitimate XenbusState value.
+     * TODO Check for valid state transitions?
+     */
+
+    err = tapback_device_printf(device, "state", false, "%u", state);
+    if (err) {
+        WARN("failed to switch back-end state to %s: %s\n",
+                XenbusState2str(state), strerror(err));
+    } else
+        DBG("switched back-end state to %s\n", XenbusState2str(state));
+    return err;
+}
+
+/**
+ * Core function that instructs the tapdisk to connect to the shared ring (if
+ * not already connected) and communicates essential information to the
+ * front-end.
+ *
+ * If the tapdisk is not already connected, all the necessary information is
+ * read from XenStore and the tapdisk gets connected using this information.
+ *
+ * TODO Why should this function be called on an already connected VBD? Why
+ * re-write the sector size etc. in XenStore for an already connected VBD?
+ * TODO rename function (no blkback, not only connects to tapdisk)
+ *
+ * @param bdev the VBD the tapdisk should connect to
+ * @param state unused
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by blkback_frontend_changed, when the front-end switches to
+ * Initialised and Connected.
+ */
+static int
+blkback_connect_tap(vbd_t * const bdev,
+        const XenbusState state __attribute__((unused)))
+{
+    evtchn_port_t port = 0;
+    grant_ref_t *gref = NULL;
+    int err = 0;
+    char *proto_str = NULL;
+    char *persistent_grants_str = NULL;
+
+    assert(bdev);
+
+    if (bdev->connected) {
+        DBG("front-end already connected to tapdisk.\n");
+    } else {
+        /*
+         * TODO How can we make sure we're not missing a node written by the
+         * front-end? Use xs_directory?
+         */
+        int nr_pages = 0, proto = 0, order = 0;
+        bool persistent_grants = false;
+
+        if (1 != tapback_device_scanf_otherend(bdev, "ring-page-order", "%d",
+                    &order))
+            order = 0;
+
+        nr_pages = 1 << order;
+
+        if (!(gref = calloc(nr_pages, sizeof(grant_ref_t)))) {
+            WARN("Failed to allocate memory for grant refs.\n");
+            err = ENOMEM;
+            goto fail;
+        }
+
+        /*
+         * Read the grant references.
+         */
+        if (order) {
+            int i = 0;
+            /*
+             * +10 is for INT_MAX, +1 for NUL termination
+             */
+            static const size_t len = sizeof(RING_REF) + 10 + 1;
+            char ring_ref[len];
+            for (i = 0; i < nr_pages; i++) {
+                if (snprintf(ring_ref, len, "%s%d", RING_REF, i) >= len) {
+                    DBG("error printing to buffer\n");
+                    err = EINVAL;
+                    goto fail;
+                }
+                if (1 != tapback_device_scanf_otherend(bdev, ring_ref, "%u",
+                            &gref[i])) {
+                    WARN("Failed to read grant ref 0x%x.\n", i);
+                    err = ENOENT;
+                    goto fail;
+                }
+            }
+        } else {
+            if (1 != tapback_device_scanf_otherend(bdev, RING_REF, "%u",
+                        &gref[0])) {
+                WARN("Failed to read grant ref.\n");
+                err = ENOENT;
+                goto fail;
+            }
+        }
+
+        /*
+         * Read the event channel.
+         */
+        if (1 != tapback_device_scanf_otherend(bdev, "event-channel", "%u",
+                    &port)) {
+            WARN("Failed to read event channel.\n");
+            err = ENOENT;
+            goto fail;
+        }
+
+        /*
+         * Read the guest VM's ABI.
+         */
+        if (!(proto_str = tapback_device_read_otherend(bdev, "protocol")))
+            proto = BLKIF_PROTOCOL_NATIVE;
+        else if (!strcmp(proto_str, XEN_IO_PROTO_ABI_X86_32))
+            proto = BLKIF_PROTOCOL_X86_32;
+        else if (!strcmp(proto_str, XEN_IO_PROTO_ABI_X86_64))
+            proto = BLKIF_PROTOCOL_X86_64;
+        else {
+            WARN("unsupported protocol %s\n", proto_str);
+            err = EINVAL;
+            goto fail;
+        }
+
+        /*
+         * Does the front-end support persistent grants?
+         */
+        persistent_grants_str = tapback_device_read_otherend(bdev,
+                FEAT_PERSIST);
+        if (persistent_grants_str) {
+            if (!strcmp(persistent_grants_str, "0"))
+                persistent_grants = false;
+            else if (!strcmp(persistent_grants_str, "1"))
+                persistent_grants = true;
+            else {
+                WARN("invalid %s value: %s\n", FEAT_PERSIST,
+                        persistent_grants_str);
+                err = EINVAL;
+                goto fail;
+            }
+        }
+        else
+            DBG("front-end doesn't support persistent grants\n");
+
+        /*
+         * persistent grants are not yet supported
+         */
+        if (persistent_grants)
+            WARN("front-end supports persistent grants but we don't\n");
+
+        /*
+         * Create the shared ring and ask the tapdisk to connect to it.
+         */
+        if ((err = tap_ctl_connect_xenblkif(bdev->tap.pid, bdev->tap.minor,
+                        bdev->domid, bdev->devid, gref, order, port, proto,
+                        NULL))) {
+            WARN("tapdisk failed to connect to the shared ring: %s\n",
+                    strerror(err));
+            goto fail;
+        }
+        DBG("tapdisk %d/%d connected to shared ring\n", bdev->tap.pid,
+                bdev->tap.minor);
+
+        bdev->connected = true;
+    }
+
+    /*
+     * Write the number of sectors, sector size, and info to the back-end path
+     * in XenStore so that the front-end creates a VBD with the appropriate
+     * characteristics.
+     */
+    if ((err = tapback_device_printf(bdev, "sector-size", true, "%u",
+                    bdev->sector_size))) {
+        WARN("Failed to write sector-size.\n");
+        goto fail;
+    }
+
+    if ((err = tapback_device_printf(bdev, "sectors", true, "%llu",
+                    bdev->sectors))) {
+        WARN("Failed to write sectors.\n");
+        goto fail;
+    }
+
+    if ((err = tapback_device_printf(bdev, "info", true, "%u", bdev->info))) {
+        WARN("Failed to write info.\n");
+        goto fail;
+    }
+
+    if ((err = tapback_device_switch_state(bdev, XenbusStateConnected))) {
+        WARN("failed to switch back-end state to connected: %s\n",
+                strerror(err));
+    }
+
+fail:
+    if (err && bdev->connected) {
+        const int err2 = tap_ctl_disconnect_xenblkif(bdev->tap.pid,
+                bdev->tap.minor, bdev->domid, bdev->devid, NULL);
+        if (err2) {
+            WARN("error disconnecting tapdisk from the shared ring (error "
+                    "ignored): %s\n", strerror(err2));
+        }
+
+        bdev->connected = false;
+    }
+
+    free(gref);
+    free(proto_str);
+    free(persistent_grants_str);
+
+    return err;
+}
+
+/**
+ * Callback that is executed when the front-end goes to StateClosed.
+ *
+ * Instructs the tapdisk to disconnect itself from the shared ring and switches
+ * the back-end state to StateClosed.
+ *
+ * @param bdev the VBD whose tapdisk should be disconnected
+ * @param state unused
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by blkback_frontend_changed.
+ */
+static inline int
+backend_close(vbd_t * const bdev,
+        const XenbusState state __attribute__((unused)))
+{
+    int err = 0;
+
+    assert(bdev);
+
+    if (!bdev->connected) {
+        /*
+         * TODO Is this safe? Shouldn't we report an error?
+         */
+        DBG("tapdisk not connected\n");
+        return 0;
+    }
+
+    DBG("disconnecting vbd-%d-%d from tapdisk %d minor %d\n",
+        bdev->domid, bdev->devid, bdev->tap.pid, bdev->tap.minor);
+
+    if ((err = tap_ctl_disconnect_xenblkif(bdev->tap.pid, bdev->tap.minor,
+            bdev->domid, bdev->devid, NULL))){
+
+        /*
+         * TODO I don't see how tap_ctl_disconnect_xenblkif can return
+         * ESRCH, so this is probably wrong. Probably there's another error
+         * code indicating that there's no tapdisk process.
+         */
+        if (errno == -ESRCH) {
+            WARN("tapdisk not running\n");
+        } else {
+            WARN("error disconnecting tapdisk from front-end: %s\n",
+                    strerror(err));
+            return err;
+        }
+    }
+
+    bdev->connected = false;
+
+    return tapback_device_switch_state(bdev, XenbusStateClosed);
+}
+
+/**
+ * Acts on changes in the front-end state.
+ *
+ * TODO The back-end blindly follows the front-end's state transitions, should
+ * we check whether unexpected transitions are performed?
+ *
+ * @param xbdev the VBD whose front-end state changed
+ * @param state the new state
+ * @returns 0 on success, an error code otherwise
+ *
+ * XXX Only called by tapback_backend_handle_otherend_watch.
+ */
+static inline int
+blkback_frontend_changed(vbd_t * const xbdev, const XenbusState state)
+{
+    /*
+     * XXX The size of the array (9) comes from the XenbusState enum.
+     *
+     * TODO Send a patch that adds XenbusStateMin, XenbusStateMax, and
+     * XenbusStateInvalid to the XenbusState enum (located in xenbus.h).
+     *
+     * The front-end's state is used as the array index. Each element contains
+     * a call-back function to be executed in response, and an optional state
+     * for the back-end to switch to.
+     */
+    struct frontend_state_change {
+        int (*fn)(vbd_t * const, const XenbusState);
+        const XenbusState state;
+    } static const frontend_state_change_map[] = {
+        [XenbusStateUnknown] = {NULL, 0},
+        [XenbusStateInitialising]
+            = {tapback_device_switch_state, XenbusStateInitWait},
+        [XenbusStateInitWait] = {NULL, 0},
+
+        /* blkback_connect_tap switches back-end state to Connected */
+        [XenbusStateInitialised] = {blkback_connect_tap, 0},
+        [XenbusStateConnected] = {blkback_connect_tap, 0},
+
+        [XenbusStateClosing]
+            = {tapback_device_switch_state, XenbusStateClosing},
+        [XenbusStateClosed] = {backend_close, 0},
+        [XenbusStateReconfiguring] = {NULL, 0},
+        [XenbusStateReconfigured] = {NULL, 0}
+    };
+
+    assert(xbdev);
+    assert(state >= XenbusStateUnknown && state <= XenbusStateReconfigured);
+
+    DBG("front-end %d-%s went into state %s\n",
+            xbdev->domid, xbdev->name, XenbusState2str(state));
+
+    if (frontend_state_change_map[state].fn)
+        return frontend_state_change_map[state].fn(xbdev,
+                frontend_state_change_map[state].state);
+    else
+        DBG("ignoring front-end's %d-%s transition to state %s\n",
+                xbdev->domid, xbdev->name, XenbusState2str(state));
+    return 0;
+}
+
+int
+tapback_backend_handle_otherend_watch(const char * const path)
+{
+    vbd_t *device = NULL;
+    int err = 0, state = 0;
+    char *s = NULL, *end = NULL;
+
+    assert(path);
+
+    /*
+     * Find the device that has the same front-end state path.
+     *
+     * There should definitely be such a device in our list, otherwise this
+     * function would not have executed at all, since we would not be waiting
+     * on that XenStore path.  The XenStore path we wait for is:
+     * /local/domain/<domid>/device/vbd/<devname>/state. In order to watch this
+     * path, it means that we have received a device create request, so the
+     * device will be there.
+     *
+     * TODO Instead of this linear search we could do better (hash table etc).
+     */
+    tapback_backend_find_device(device,
+            !strcmp(device->frontend_state_path, path));
+    if (!device) {
+        WARN("path \'%s\' does not correspond to a known device\n", path);
+        return ENODEV;
+    }
+
+    DBG("device: domid=%d name=%s\n", device->domid, device->name);
+
+    /*
+     * Read the front-end's new state.
+     */
+    if (!(s = tapback_xs_read(blktap3_daemon.xs, blktap3_daemon.xst, "%s",
+                    device->frontend_state_path))) {
+        err = errno;
+        goto fail;
+    }
+    state = strtol(s, &end, 0);
+    if (*end != 0 || end == s) {
+        err = EINVAL;
+        goto fail;
+    }
+
+    err = blkback_frontend_changed(device, state);
+
+fail:
+    free(s);
+    return err;
+}
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 6 of 8 v3] blktap3/tapback: Introduce the tapback daemon
This patch introduces the core of the tapback daemon, the user space daemon
that acts as a device's back-end, essentially most of blkback in user space.
Similar to blkback, the daemon monitors XenStore for device creation/removal
requests and front-end state changes, and acts in response to them
(communicates to the front-end necessary information, switches the back-end's
state etc.). The daemon creates/destroys tapdisk processes as needed.
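To make the creation/removal side more concrete, the following is a minimal
sketch of how a triggered back-end watch path of the form
"backend/<backend name>/<domid>/<devname>" could be split into its components.
It is self-contained and does not talk to XenStore; struct watch_target and
parse_backend_watch_path are hypothetical names used only for this example, and
the sketch does not validate the first two components the way the daemon does.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of splitting a back-end watch path. */
struct watch_target {
    int have_domid;      /* non-zero if the path names a domain */
    unsigned long domid; /* valid only if have_domid is set */
    const char *devname; /* NULL if the path does not name a device */
};

/*
 * Splits @path in place (strtok modifies the buffer, so it must be writable).
 *
 * @returns 0 on success, EINVAL if the domain ID component is not a number
 */
static int
parse_backend_watch_path(char *path, struct watch_target *out)
{
    char *s = NULL, *end = NULL;

    assert(path);
    assert(out);

    memset(out, 0, sizeof(*out));

    /* Skip "backend" and the back-end name. */
    if (!strtok(path, "/") || !strtok(NULL, "/"))
        return 0;

    /* Domain ID, if present. */
    if (!(s = strtok(NULL, "/")))
        return 0;
    out->domid = strtoul(s, &end, 0);
    if (*end != '\0' || end == s)
        return EINVAL;
    out->have_domid = 1;

    /* Device name, if present; points into @path. */
    out->devname = strtok(NULL, "/");
    return 0;
}
```

A caller would scan the whole back-end when neither a domain nor a device is
named, and probe a single device when both are present. Since devname points
into the caller's buffer, that buffer must outlive any use of the result.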
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v1:
  * minor code clean up
Changed since v2:
  * Introduce function XenbusState2str for printing XenbusStates in a more
    human-friendly way.
  * Don't print the log identifier if tapback is run in non-daemon mode, in
    order to make debug output more concise.
diff --git a/tools/blktap3/tapback/tapback.c b/tools/blktap3/tapback/tapback.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/tapback/tapback.c
@@ -0,0 +1,301 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ *
+ * This file contains the core of the tapback daemon, the user space daemon
+ * that acts as a device's back-end.
+ */
+
+/*
+ * TODO Some of these includes may be useless.
+ * TODO Replace hard-coded strings with defines/const strings.
+ */
+#include <stdlib.h>
+#include <stdarg.h>
+#include <assert.h>
+#include <syslog.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <libgen.h>
+#include <getopt.h>
+#include <sys/ioctl.h>
+#include <sys/mount.h>
+#include <syslog.h>
+
+#include "blktap3.h"
+#include "stdio.h" /* TODO tap-ctl.h needs to include stdio.h */
+#include "tap-ctl.h"
+#include "tapback.h"
+
+void tapback_log(int prio, const char *fmt, ...);
+void (*tapback_vlog) (int prio, const char *fmt, va_list ap);
+
+struct _blktap3_daemon blktap3_daemon;
+
+char *XenbusState2str(const XenbusState xbs)
+{
+    static char * const str[] = {
+        [XenbusStateUnknown] = "unknown",
+        [XenbusStateInitialising] = "initialising",
+        [XenbusStateInitWait] = "init wait",
+        [XenbusStateInitialised] = "initialised",
+        [XenbusStateConnected] = "connected",
+        [XenbusStateClosing] = "closing",
+        [XenbusStateClosed] = "closed",
+        [XenbusStateReconfiguring] = "reconfiguring",
+        [XenbusStateReconfigured] = "reconfigured"
+    };
+    return str[xbs];
+}
+
+/**
+ * Read changes that occurred on the "backend/<backend name>" XenStore path
+ * or one of the front-end paths and act accordingly.
+ */
+static inline void
+tapback_read_watch(void)
+{
+    char **watch = NULL, *path = NULL, *token = NULL;
+    unsigned int n = 0;
+    int err = 0, _abort = 0;
+
+    /* read the change */
+    watch = xs_read_watch(blktap3_daemon.xs, &n);
+    path = watch[XS_WATCH_PATH];
+    token = watch[XS_WATCH_TOKEN];
+
+    /*
+     * TODO Put the body of "again:" into a loop instead of using goto.
+     */
+again:
+    if (!(blktap3_daemon.xst = xs_transaction_start(blktap3_daemon.xs))) {
+        WARN("error starting transaction\n");
+        goto fail;
+    }
+
+    /*
+     * The token indicates which XenStore watch triggered, the front-end one or
+     * the back-end one.
+     */
+    if (!strcmp(token, BLKTAP3_FRONTEND_TOKEN)) {
+        DBG("front-end triggered\n");
+        err = tapback_backend_handle_otherend_watch(path);
+    } else if (!strcmp(token, BLKTAP3_BACKEND_TOKEN)) {
+        DBG("back-end triggered\n");
+        err = tapback_backend_handle_backend_watch(path);
+    } else {
+        WARN("invalid token \'%s\'\n", token);
+        err = EINVAL;
+    }
+
+    _abort = !!err;
+    if (_abort)
+        DBG("aborting transaction: %s\n", strerror(err));
+
+    err = xs_transaction_end(blktap3_daemon.xs, blktap3_daemon.xst, _abort);
+    blktap3_daemon.xst = 0;
+    if (!err) {
+        err = errno;
+        /*
+         * This is OK according to xs_transaction_end's semantics.
+         */
+        if (EAGAIN == err)
+            goto again;
+        DBG("error ending transaction: %s\n", strerror(err));
+    }
+
+fail:
+    free(watch);
+    return;
+}
+
+static void
+tapback_backend_destroy(void)
+{
+    if (blktap3_daemon.xs) {
+        xs_daemon_close(blktap3_daemon.xs);
+        blktap3_daemon.xs = NULL;
+    }
+}
+
+/**
+ * Initializes the back-end descriptor. There is one back-end per tapback
+ * process. Also, it initiates a watch to XenStore on backend/<backend name>.
+ *
+ * @returns 0 on success, an error code otherwise
+ */
+static inline int
+tapback_backend_create(void)
+{
+    int err;
+
+    TAILQ_INIT(&blktap3_daemon.devices);
+    blktap3_daemon.xst = XBT_NULL;
+
+    if (!(blktap3_daemon.xs = xs_daemon_open())) {
+        err = EINVAL;
+        goto fail;
+    }
+
+    /*
+     * Watch the back-end.
+     */
+    if (!xs_watch(blktap3_daemon.xs, BLKTAP3_BACKEND_PATH,
+                BLKTAP3_BACKEND_TOKEN)) {
+        err = errno;
+        goto fail;
+    }
+
+    return 0;
+
+fail:
+    tapback_backend_destroy();
+
+    return err;
+}
+
+/**
+ * Runs the daemon.
+ *
+ * Watches backend/<backend name> and the front-end devices.
+ */
+static inline int
+tapback_backend_run(void)
+{
+    const int fd = xs_fileno(blktap3_daemon.xs);
+    int err;
+
+    do {
+        fd_set rfds;
+        int nfds = 0;
+
+        FD_ZERO(&rfds);
+        FD_SET(fd, &rfds);
+
+        /* poll the fd for changes in the XenStore path we're interested in */
+        if ((nfds = select(fd + 1, &rfds, NULL, NULL, NULL)) < 0) {
+            err = -errno;
+            perror("error monitoring XenStore");
+            break;
+        }
+
+        if (FD_ISSET(fd, &rfds))
+            tapback_read_watch();
+        DBG("--\n");
+    } while (1);
+
+    return err;
+}
+
+static char *blkback_ident = NULL;
+
+static void
+blkback_vlog_fprintf(const int prio, const char * const fmt, va_list ap)
+{
+    static const char *strprio[] = {
+        [LOG_DEBUG] = "DBG",
+        [LOG_INFO] = "INF",
+        [LOG_WARNING] = "WRN"
+    };
+
+    assert(LOG_DEBUG == prio || LOG_INFO == prio || LOG_WARNING == prio);
+    assert(strprio[prio]);
+
+    fprintf(stderr, "%s[%s] ", blkback_ident, strprio[prio]);
+    vfprintf(stderr, fmt, ap);
+}
+
+/**
+ * Print tapback's usage instructions.
+ */
+static void
+usage(FILE * const stream, const char * const prog)
+{
+    assert(stream);
+    assert(prog);
+
+    fprintf(stream,
+            "usage: %s\n"
+            "\t[-D|--debug]\n"
+            "\t[-h|--help]\n", prog);
+}
+
+int main(int argc, char **argv)
+{
+    const char *prog = NULL;
+    int opt_debug = 0;
+    int err = 0;
+
+    prog = basename(argv[0]);
+
+    opt_debug = 0;
+
+    do {
+        const struct option longopts[] = {
+            {"help", 0, NULL, 'h'},
+            {"debug", 0, NULL, 'D'},
+            {NULL, 0, NULL, 0} /* getopt_long requires a zeroed terminator */
+        };
+        int c;
+        c = getopt_long(argc, argv, "hD", longopts, NULL);
+        if (c < 0)
+            break;
+
+        switch (c) {
+        case 'h':
+            usage(stdout, prog);
+            return 0;
+        case 'D':
+            opt_debug = 1;
+            break;
+        case '?':
+            goto usage;
+        }
+    } while (1);
+
+    if (opt_debug) {
+        blkback_ident = "";
+        tapback_vlog = blkback_vlog_fprintf;
+    }
+    else {
+        blkback_ident = BLKTAP3_BACKEND_TOKEN;
+        openlog(blkback_ident, 0, LOG_DAEMON);
+    }
+
+    if (!opt_debug) {
+        if ((err = daemon(0, 0))) {
+            err = -errno;
+            goto fail;
+        }
+    }
+
+    if ((err = tapback_backend_create())) {
+        WARN("error creating blkback: %s\n", strerror(err));
+        goto fail;
+    }
+
+    err = tapback_backend_run();
+
+    tapback_backend_destroy();
+
+fail:
+    return err ? -err : 0;
+
+usage:
+    usage(stderr, prog);
+    return 1;
+}
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 7 of 8 v3] blktap3/tapback: Introduce tapback daemon Makefile
This patch introduces the Makefile that builds the tapback daemon. This
Makefile is not yet hooked into the build system.
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
---
Changed since v2:
  * Use $(BINDIR) as the daemon's installation directory.
  * Fixed whitespace.
diff --git a/tools/blktap3/tapback/Makefile b/tools/blktap3/tapback/Makefile
--- a/tools/blktap3/tapback/Makefile
+++ b/tools/blktap3/tapback/Makefile
@@ -3,6 +3,10 @@ include $(XEN_ROOT)/tools/Rules.mk
 
 BLKTAP_ROOT := ..
 
+INST_DIR ?= $(BINDIR)
+
+IBIN = tapback
+
 # -D_GNU_SOURCE is required by vasprintf.
 override CFLAGS += \
     -I$(BLKTAP_ROOT)/include \
@@ -25,7 +29,20 @@ override LDFLAGS += \
     $(LDLIBS_libxenstore) \
     $(LDFLAGS_libxenctrl)
 
+TAPBACK-OBJS := log.o xenstore.o frontend.o backend.o
+
+TAPBACK-LIBS := $(BLKTAP_ROOT)/control/libblktapctl.so.1.0.0
+
+all: $(IBIN)
+
+$(IBIN): $(TAPBACK-OBJS) tapback.o
+	$(CC) -o $@ $^ $(TAPBACK-LIBS) $(LDFLAGS)
+
+install: all
+	$(INSTALL_DIR) -p $(DESTDIR)$(INST_DIR)
+	$(INSTALL_PROG) $(IBIN) $(DESTDIR)$(INST_DIR)
+
 clean:
-	rm -f *.o *.o.d .*.o.d
+	rm -f *.o *.o.d .*.o.d $(IBIN)
 
 .PHONY: clean install
Thanos Makatos
2013-Mar-06  13:07 UTC
[PATCH 8 of 8 v3] blktap3/tapback: Add the tapback binary to the Mercurial ignore list
Signed-off-by: Thanos Makatos <thanos.makatos@citrix.com>
diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -371,4 +371,7 @@
 ^unmodified_drivers/linux-2.6/.*\.ko$
 ^unmodified_drivers/linux-2.6/.*\.mod\.c$
 ^LibVNCServer.*
+
+# blktap3
+^tools/blktap3/tapback/tapback$
 ^tools/blktap3/control/_paths.h$