Daniel Stodden
2010-Jan-28 22:37 UTC
[Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
Get blktap2 running on pvops. This mainly adds eventfd support to the userland code. Based on some prior cleanup to tapdisk-queue and the server object. We had most of that in XenServer for a while, so I kept it stacked. 1. Clean up IPC and AIO init in tapdisk-server. [I think tapdisk-ipc in blktap2 is basically obsolete. Pending a later patch to remove it?] 2. Split tapdisk-queue into variable raw I/O backends. This basically makes an ''ops''-struct (struct tio) out of what used to be primarily libaio vs. an if/else hack to resort to canonical read()/write()s where desirable. For now, the one chosen remains as hardcoded as ever. 3. Prefer AIO eventfd support on kernels >= 2.6.22 Mainline Linux after 2.6.22 finally got I/O muxing for AIO. Unfortunately, few systems bring the necessary libaio update (0.3.107), xen/tools included. Since this is just about a bunch of inline macros and an update to reserved space in the iocb struct, let''s add a compat header with private typedefs instead. This should obsolete tools/aio. Misc: - Does a runtime kernel version check. I guess this code will need additional cpp magic on BSDs. - Wants a PERROR macro in blktaplib.h - Fixes a bug in tapdisk-vbd which locks up the sync io mode. - Removed dead code in qcow2raw to make it link again. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-28 22:37 UTC
[Xen-devel] [PATCH 1 of 4] blktap2: Sort out tapdisk IPC init
# HG changeset patch # User Daniel Stodden <daniel.stodden@citrix.com> # Date 1258172773 28800 # Node ID 9da64803841f4d8d8816f6d0218cbc6b672f7903 # Parent 2636e561970898517def148c19e04581b12dc860 blktap2: Sort out tapdisk IPC init. Move I/O and event callbacks setup out of tapdisk-server, into tapdisk-ipc. diff -r 2636e5619708 -r 9da64803841f tools/blktap2/drivers/tapdisk-ipc.c --- a/tools/blktap2/drivers/tapdisk-ipc.c Tue Jan 26 15:54:40 2010 +0000 +++ b/tools/blktap2/drivers/tapdisk-ipc.c Fri Nov 13 20:26:13 2009 -0800 @@ -30,12 +30,86 @@ #include <stdlib.h> #include <unistd.h> #include <string.h> +#include <fcntl.h> #include "tapdisk.h" #include "tapdisk-ipc.h" #include "tapdisk-vbd.h" #include "tapdisk-server.h" +static void +tapdisk_ipc_read_event(event_id_t id, char mode, void *private) +{ + td_ipc_t *ipc = private; + tapdisk_ipc_read(ipc); +} + +static void +__tapdisk_ipc_init(td_ipc_t *ipc) +{ + ipc->rfd = -1; + ipc->wfd = -1; + ipc->rfd_event = -1; +} + +int +tapdisk_ipc_open(td_ipc_t *ipc, const char *read, const char *write) +{ + int err; + + memset(ipc, 0, sizeof(td_ipc_t)); + __tapdisk_ipc_init(ipc); + + if (read) { + ipc->rfd = open(read, O_RDWR | O_NONBLOCK); + if (ipc->rfd < 0) { + err = -errno; + EPRINTF("FD open failed %s: %d\n", read, err); + goto fail; + } + + ipc->rfd_event = + tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, + ipc->rfd, 0, + tapdisk_ipc_read_event, + ipc); + if (ipc->rfd_event < 0) { + err = ipc->rfd_event; + goto fail; + } + } + + if (write) { + ipc->wfd = open(write, O_RDWR | O_NONBLOCK); + if (ipc->wfd < 0) { + err = -errno; + EPRINTF("FD open failed %s, %d\n", write, err); + goto fail; + } + } + + return 0; + +fail: + tapdisk_ipc_close(ipc); + return err; +} + +void +tapdisk_ipc_close(td_ipc_t *ipc) +{ + if (ipc->rfd > 0) + close(ipc->rfd); + + if (ipc->wfd > 0) + close(ipc->wfd); + + if (ipc->rfd_event >= 0) + tapdisk_server_unregister_event(ipc->rfd_event); + + __tapdisk_ipc_init(ipc); +} + static int tapdisk_ipc_write_message(int fd, tapdisk_message_t *message, int timeout) { diff -r 2636e5619708 -r 9da64803841f tools/blktap2/drivers/tapdisk-ipc.h --- a/tools/blktap2/drivers/tapdisk-ipc.h Tue Jan 26 15:54:40 2010 +0000 +++ b/tools/blktap2/drivers/tapdisk-ipc.h Fri Nov 13 20:26:13 2009 -0800 @@ -29,13 +29,17 @@ #define _TAPDISK_IPC_H_ #include "tapdisk-message.h" +#include "scheduler.h" typedef struct td_ipc_handle { int rfd; int wfd; + event_id_t rfd_event; td_uuid_t uuid; } td_ipc_t; +int tapdisk_ipc_open(td_ipc_t *ipc, const char *read, const char *write); +void tapdisk_ipc_close(td_ipc_t *ipc); int tapdisk_ipc_read(td_ipc_t *ipc); int tapdisk_ipc_write(td_ipc_t *ipc, int type); int tapdisk_ipc_write_error(td_ipc_t *ipc, const char *message); diff -r 2636e5619708 -r 9da64803841f tools/blktap2/drivers/tapdisk-server.c --- a/tools/blktap2/drivers/tapdisk-server.c Tue Jan 26 15:54:40 2010 +0000 +++ b/tools/blktap2/drivers/tapdisk-server.c Fri Nov 13 20:26:13 2009 -0800 @@ -26,7 +26,6 @@ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include <stdio.h> -#include <fcntl.h> #include <errno.h> #include <unistd.h> #include <stdlib.h> @@ -223,12 +222,6 @@ } static void -tapdisk_server_read_ipc_message(event_id_t id, char mode, void *private) -{ - tapdisk_ipc_read(&server.ipc); -} - -static void tapdisk_server_aio_queue_event(event_id_t id, char mode, void *private) { tapdisk_complete_tiocbs(&server.aio_queue); @@ -242,6 +235,18 @@ } static int +tapdisk_server_init_ipc(const char *read, const char *write) +{ + return tapdisk_ipc_open(&server.ipc, read, write); +} + +static void +tapdisk_server_close_ipc(void) +{ + tapdisk_ipc_close(&server.ipc); +} + +static int tapdisk_server_initialize_aio_queue(void) { int err; @@ -270,15 +275,7 @@ tapdisk_server_close(void) { tapdisk_server_free_aio_queue(); - - if (server.control_event) - scheduler_unregister_event(&server.scheduler, server.control_event); - - if (server.ipc.rfd != -1) - close(server.ipc.rfd); - - if (server.ipc.wfd != -1) - close(server.ipc.wfd); + tapdisk_server_close_ipc(); } static void @@ -334,63 +331,26 @@ tapdisk_server_initialize(const char *read, const char *write) { int err; - event_id_t event_id; - event_id = 0; memset(&server, 0, sizeof(tapdisk_server_t)); - server.ipc.rfd = server.ipc.wfd = -1; - INIT_LIST_HEAD(&server.vbds); - if (read) { - server.ipc.rfd = open(read, O_RDWR | O_NONBLOCK); - if (server.ipc.rfd < 0) { - err = -errno; - EPRINTF("FD open failed %s: %d\n", read, err); - goto fail; - } - } - - if (write) { - server.ipc.wfd = open(write, O_RDWR | O_NONBLOCK); - if (server.ipc.wfd < 0) { - err = -errno; - EPRINTF("FD open failed %s, %d\n", write, err); - goto fail; - } - } - scheduler_initialize(&server.scheduler); - if (read) { - event_id = scheduler_register_event(&server.scheduler, - SCHEDULER_POLL_READ_FD, - server.ipc.rfd, 0, - tapdisk_server_read_ipc_message, - NULL); - if (event_id < 0) { - err = event_id; - goto fail; - } - } + err = tapdisk_server_init_ipc(read, write); + if (err) + goto fail; err = tapdisk_server_initialize_aio_queue(); if (err) goto fail; - server.control_event = event_id; server.run = 1; return 0; fail: - if (server.ipc.rfd > 0) - close(server.ipc.rfd); - if (server.ipc.wfd > 0) - close(server.ipc.wfd); - if (event_id > 0) - scheduler_unregister_event(&server.scheduler, - server.control_event); + tapdisk_server_close_ipc(); return err; } diff -r 2636e5619708 -r 9da64803841f tools/blktap2/drivers/tapdisk-server.h --- a/tools/blktap2/drivers/tapdisk-server.h Tue Jan 26 15:54:40 2010 +0000 +++ b/tools/blktap2/drivers/tapdisk-server.h Fri Nov 13 20:26:13 2009 -0800 @@ -57,7 +57,6 @@ td_ipc_t ipc; struct list_head vbds; scheduler_t scheduler; - event_id_t control_event; struct tqueue aio_queue; event_id_t aio_queue_event_id; } tapdisk_server_t; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-28 22:37 UTC
[Xen-devel] [PATCH 2 of 4] blktap2: Sort out tapdisk AIO init
# HG changeset patch # User Daniel Stodden <daniel.stodden@citrix.com> # Date 1258172774 28800 # Node ID 2439e941318745b333e63493af247b9558ddebc2 # Parent 9da64803841f4d8d8816f6d0218cbc6b672f7903 blktap2: Sort out tapdisk AIO init. Move event callbacks registration into tapdisk-queue. This should also obsoletes the dummy pollfd pipe in the synchronous I/O case. diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/io-optimize.c --- a/tools/blktap2/drivers/io-optimize.c Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/io-optimize.c Fri Nov 13 20:26:14 2009 -0800 @@ -51,9 +51,16 @@ opio_free(struct opioctx *ctx) { free(ctx->opios); + ctx->opios = NULL; + free(ctx->free_opios); + ctx->free_opios = NULL; + free(ctx->iocb_queue); + ctx->iocb_queue = NULL; + free(ctx->event_queue); + ctx->event_queue = NULL; } int diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/qcow2raw.c --- a/tools/blktap2/drivers/qcow2raw.c Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/qcow2raw.c Fri Nov 13 20:26:14 2009 -0800 @@ -101,12 +101,6 @@ return; } -void -queue_event(event_id_t id, char mode, void *private) -{ - tapdisk_complete_tiocbs(&server.aio_queue); -} - static void debug_output(uint64_t progress, uint64_t size) { //Output progress every PROGRESS_QUANT diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/tapdisk-queue.c --- a/tools/blktap2/drivers/tapdisk-queue.c Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-queue.c Fri Nov 13 20:26:14 2009 -0800 @@ -35,6 +35,7 @@ #include "tapdisk-log.h" #include "tapdisk-queue.h" #include "tapdisk-filter.h" +#include "tapdisk-server.h" #include "atomicio.h" #define WARN(_f, _a...) tlog_write(TLOG_WARN, _f, ##_a) @@ -46,7 +47,7 @@ * so that we can concurrently poll on synchronous and async descriptors. * This is signalled by passing 1 as the io context to io_setup. */ -#define REQUEST_ASYNC_FD 1 +#define REQUEST_ASYNC_FD ((io_context_t)1) static inline void queue_tiocb(struct tqueue *queue, struct tiocb *tiocb) @@ -220,6 +221,8 @@ return split; } +static void tapdisk_tiocb_event(event_id_t id, char mode, void *private); + int tapdisk_init_queue(struct tqueue *queue, int size, int sync, struct tfilter *filter) @@ -232,18 +235,18 @@ queue->sync = sync; queue->filter = filter; - if (sync) { - /* set up a pipe so we can return - * a poll fd that won''t fire. */ - if (pipe(queue->dummy_pipe)) - return -errno; - queue->poll_fd = queue->dummy_pipe[0]; - } else { - queue->aio_ctx = (io_context_t)REQUEST_ASYNC_FD; + queue->event = -1; + queue->aio_ctx = NULL; + + if (!size) + return 0; + + if (!sync) { + queue->aio_ctx = REQUEST_ASYNC_FD; queue->poll_fd = io_setup(size, &queue->aio_ctx); - - if (queue->poll_fd < 0) { - if (queue->poll_fd == -EAGAIN) + err = queue->poll_fd; + if (err < 0) { + if (err == -EAGAIN) DPRINTF("Couldn''t setup AIO context. If you " "are trying to concurrently use a " "large number of blktap-based disks, " @@ -256,8 +259,19 @@ "support. This is probably because " "your kernel does not have the " "aio-poll patch applied.\n"); - return queue->poll_fd; + queue->aio_ctx = NULL; + goto fail; } + + queue->event + tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, + queue->poll_fd, 0, + tapdisk_tiocb_event, + queue); + err = queue->event; + if (err < 0) + goto fail; + } err = -ENOMEM; @@ -280,14 +294,22 @@ void tapdisk_free_queue(struct tqueue *queue) { - if (queue->sync) { - close(queue->dummy_pipe[0]); - close(queue->dummy_pipe[1]); - } else + if (queue->event >= 0) { + tapdisk_server_unregister_event(queue->event); + queue->event = -1; + } + + if (queue->aio_ctx) { io_destroy(queue->aio_ctx); + queue->aio_ctx = NULL; + } free(queue->iocbs); + queue->iocbs = NULL; + free(queue->aio_events); + queue->aio_events = NULL; + opio_free(&queue->opioctx); } @@ -390,7 +412,7 @@ return submitted; } -int +static void tapdisk_complete_tiocbs(struct tqueue *queue) { int i, ret, split; @@ -415,8 +437,13 @@ } queue_deferred_tiocbs(queue); +} - return split; +static void +tapdisk_tiocb_event(event_id_t id, char mode, void *private) +{ + struct tqueue *queue = private; + tapdisk_complete_tiocbs(queue); } /* diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/tapdisk-queue.h --- a/tools/blktap2/drivers/tapdisk-queue.h Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-queue.h Fri Nov 13 20:26:14 2009 -0800 @@ -32,6 +32,7 @@ #include <libaio.h> #include "io-optimize.h" +#include "scheduler.h" struct tiocb; struct tfilter; @@ -57,9 +58,9 @@ int sync; int poll_fd; + event_id_t event; io_context_t aio_ctx; struct opioctx opioctx; - int dummy_pipe[2]; int queued; struct iocb **iocbs; @@ -104,7 +105,6 @@ void tapdisk_queue_tiocb(struct tqueue *, struct tiocb *); int tapdisk_submit_tiocbs(struct tqueue *); int tapdisk_submit_all_tiocbs(struct tqueue *); -int tapdisk_complete_tiocbs(struct tqueue *); int tapdisk_cancel_tiocbs(struct tqueue *); int tapdisk_cancel_all_tiocbs(struct tqueue *); void tapdisk_prep_tiocb(struct tiocb *, int, int, char *, size_t, diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/tapdisk-server.c --- a/tools/blktap2/drivers/tapdisk-server.c Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-server.c Fri Nov 13 20:26:14 2009 -0800 @@ -221,19 +221,6 @@ tapdisk_ipc_write_error(&vbd->ipc, message); } -static void -tapdisk_server_aio_queue_event(event_id_t id, char mode, void *private) -{ - tapdisk_complete_tiocbs(&server.aio_queue); -} - -static void -tapdisk_server_free_aio_queue(void) -{ - tapdisk_server_unregister_event(server.aio_queue_event_id); - tapdisk_free_queue(&server.aio_queue); -} - static int tapdisk_server_init_ipc(const char *read, const char *write) { @@ -247,34 +234,21 @@ } static int -tapdisk_server_initialize_aio_queue(void) +tapdisk_server_init_aio(void) { - int err; - event_id_t id; + return tapdisk_init_queue(&server.aio_queue, TAPDISK_TIOCBS, 0, NULL); +} - err = tapdisk_init_queue(&server.aio_queue, - TAPDISK_TIOCBS, 0, NULL); - if (err) - return err; - - id = tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, - server.aio_queue.poll_fd, 0, - tapdisk_server_aio_queue_event, - NULL); - if (id < 0) { - tapdisk_free_queue(&server.aio_queue); - return id; - } - - server.aio_queue_event_id = id; - - return 0; +static void +tapdisk_server_close_aio(void) +{ + tapdisk_free_queue(&server.aio_queue); } static void tapdisk_server_close(void) { - tapdisk_server_free_aio_queue(); + tapdisk_server_close_aio(); tapdisk_server_close_ipc(); } @@ -341,7 +315,7 @@ if (err) goto fail; - err = tapdisk_server_initialize_aio_queue(); + err = tapdisk_server_init_aio(); if (err) goto fail; diff -r 9da64803841f -r 2439e9413187 tools/blktap2/drivers/tapdisk-server.h --- a/tools/blktap2/drivers/tapdisk-server.h Fri Nov 13 20:26:13 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-server.h Fri Nov 13 20:26:14 2009 -0800 @@ -58,7 +58,6 @@ struct list_head vbds; scheduler_t scheduler; struct tqueue aio_queue; - event_id_t aio_queue_event_id; } tapdisk_server_t; #endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-28 22:37 UTC
[Xen-devel] [PATCH 3 of 4] blktap2: Separate tapdisk raw I/O into different backends
# HG changeset patch # User Daniel Stodden <daniel.stodden@citrix.com> # Date 1264717794 28800 # Node ID b816b4a752d8d2dcc6e7e01ad0cf75404932d94b # Parent 2439e941318745b333e63493af247b9558ddebc2 blktap2: Separate tapdisk raw I/O into different backends. Hide tapdisk support for different raw I/O interfaces behind a new struct tio. Libaio remains to dominate the interface, requiring everyone to dispatch iocb/ioevent structs. Backends: - lio: Kernel AIO via libaio. - rwio: Canonical read/write() mode. Misc: - Fixes a bug in tapdisk-vbd which locks up the sync io mode. - Wants a PERROR macro in blktaplib.h - Removes dead code in qcow2raw to make it link again. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jake Wires <jake.wires@citrix.com> diff -r 2439e9413187 -r b816b4a752d8 tools/blktap2/drivers/tapdisk-queue.c --- a/tools/blktap2/drivers/tapdisk-queue.c Fri Nov 13 20:26:14 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-queue.c Thu Jan 28 14:29:54 2010 -0800 @@ -141,7 +141,7 @@ * use a private linked list to keep track * of the tiocbs we''re cancelling. */ - tiocb = (struct tiocb *)queue->iocbs[0]->data; + tiocb = queue->iocbs[0]->data; queued = queue->queued; queue->queued = 0; @@ -165,8 +165,40 @@ return cancel_tiocbs(queue, err); } +/* + * rwio + */ + +struct rwio { + struct io_event *aio_events; +}; + +static void +tapdisk_rwio_destroy(struct tqueue *queue) +{ + struct rwio *rwio = queue->tio_data; + + if (rwio->aio_events) { + free(rwio->aio_events); + rwio->aio_events = NULL; + } +} + +static int +tapdisk_rwio_setup(struct tqueue *queue, int size) +{ + struct rwio *rwio = queue->tio_data; + int err; + + rwio->aio_events = calloc(size, sizeof(struct io_event)); + if (!rwio->aio_events) + return -errno; + + return 0; +} + static inline ssize_t -iocb_rw(struct iocb *iocb) +tapdisk_rwio_rw(const struct iocb *iocb) { int fd = iocb->aio_fildes; char *buf = iocb->u.c.buf; @@ -177,7 +209,7 @@ if (lseek(fd, off, SEEK_SET) == (off_t)-1) return -errno; - + if (atomicio(func, fd, buf, size) != size) return -errno; @@ -185,8 +217,9 @@ } static int -io_synchronous_rw(struct tqueue *queue) +tapdisk_rwio_submit(struct tqueue *queue) { + struct rwio *rwio = queue->tio_data; int i, merged, split; struct iocb *iocb; struct tiocb *tiocb; @@ -201,18 +234,18 @@ queue->queued = 0; for (i = 0; i < merged; i++) { - ep = queue->aio_events + i; + ep = rwio->aio_events + i; iocb = queue->iocbs[i]; ep->obj = iocb; - ep->res = iocb_rw(iocb); + ep->res = tapdisk_rwio_rw(iocb); } - split = io_split(&queue->opioctx, queue->aio_events, merged); - tapdisk_filter_events(queue->filter, queue->aio_events, split); + split = io_split(&queue->opioctx, rwio->aio_events, merged); + tapdisk_filter_events(queue->filter, rwio->aio_events, split); - for (i = split, ep = queue->aio_events; i-- > 0; ep++) { + for (i = split, ep = rwio->aio_events; i-- > 0; ep++) { iocb = ep->obj; - tiocb = (struct tiocb *)iocb->data; + tiocb = iocb->data; complete_tiocb(queue, tiocb, ep->res); } @@ -221,65 +254,258 @@ return split; } -static void tapdisk_tiocb_event(event_id_t id, char mode, void *private); +static const struct tio td_tio_rwio = { + .name = "rwio", + .data_size = 0, + .tio_setup = NULL, + .tio_destroy = NULL, + .tio_submit = tapdisk_rwio_submit +}; + +/* + * libaio + */ + +struct lio { + io_context_t aio_ctx; + struct io_event *aio_events; + + int poll_fd; + int event_id; +}; + +static void +tapdisk_lio_destroy(struct tqueue *queue) +{ + struct lio *lio = queue->tio_data; + + if (!lio) + return; + + if (lio->event_id >= 0) { + tapdisk_server_unregister_event(lio->event_id); + lio->event_id = -1; + } + + if (lio->aio_ctx) { + io_destroy(lio->aio_ctx); + lio->aio_ctx = NULL; + } + + if (lio->aio_events) { + free(lio->aio_events); + lio->aio_events = NULL; + } +} + +static void +tapdisk_lio_event(event_id_t id, char mode, void *private) +{ + struct tqueue *queue = private; + struct lio *lio; + int i, ret, split; + struct iocb *iocb; + struct tiocb *tiocb; + struct io_event *ep; + + lio = queue->tio_data; + ret = io_getevents(lio->aio_ctx, 0, + queue->size, lio->aio_events, NULL); + split = io_split(&queue->opioctx, lio->aio_events, ret); + tapdisk_filter_events(queue->filter, lio->aio_events, split); + + DBG("events: %d, tiocbs: %d\n", ret, split); + + queue->iocbs_pending -= ret; + queue->tiocbs_pending -= split; + + for (i = split, ep = lio->aio_events; i-- > 0; ep++) { + iocb = ep->obj; + tiocb = iocb->data; + complete_tiocb(queue, tiocb, ep->res); + } + + queue_deferred_tiocbs(queue); +} + +static int +tapdisk_lio_setup(struct tqueue *queue, int qlen) +{ + struct lio *lio = queue->tio_data; + size_t sz; + int err; + + lio->event_id = -1; + lio->aio_ctx = REQUEST_ASYNC_FD; + + lio->poll_fd = io_setup(qlen, &lio->aio_ctx); + err = lio->poll_fd; + if (err < 0) { + lio->aio_ctx = NULL; + + if (err == -EAGAIN) + goto fail_rsv; + + goto fail_fd; + } + + lio->event_id + tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, + lio->poll_fd, 0, + tapdisk_lio_event, + queue); + err = lio->event_id; + if (err < 0) + goto fail; + + lio->aio_events = calloc(qlen, sizeof(struct io_event)); + if (!lio->aio_events) { + err = -errno; + goto fail; + } + + return 0; + +fail: + tapdisk_lio_destroy(queue); + return err; + +fail_rsv: + DPRINTF("Couldn''t setup AIO context. If you are trying to " + "concurrently use a large number of blktap-based disks, you may " + "need to increase the system-wide aio request limit. " + "(e.g. ''echo 1048576 > /proc/sys/fs/aio-max-nr'')\n"); + goto fail; + +fail_fd: + DPRINTF("Couldn''t get fd for AIO poll support. This is probably " + "because your kernel does not have the aio-poll patch " + "applied.\n"); + goto fail; +} + +static int +tapdisk_lio_submit(struct tqueue *queue) +{ + struct lio *lio = queue->tio_data; + int merged, submitted, err = 0; + + if (!queue->queued) + return 0; + + tapdisk_filter_iocbs(queue->filter, queue->iocbs, queue->queued); + merged = io_merge(&queue->opioctx, queue->iocbs, queue->queued); + submitted = io_submit(lio->aio_ctx, merged, queue->iocbs); + + DBG("queued: %d, merged: %d, submitted: %d\n", + queue->queued, merged, submitted); + + if (submitted < 0) { + err = submitted; + submitted = 0; + } else if (submitted < merged) + err = -EIO; + + queue->iocbs_pending += submitted; + queue->tiocbs_pending += queue->queued; + queue->queued = 0; + + if (err) + queue->tiocbs_pending -= + fail_tiocbs(queue, submitted, merged, err); + + return submitted; +} + +static const struct tio td_tio_lio = { + .name = "lio", + .data_size = sizeof(struct lio), + .tio_setup = tapdisk_lio_setup, + .tio_destroy = tapdisk_lio_destroy, + .tio_submit = tapdisk_lio_submit, +}; + +static void +tapdisk_queue_free_io(struct tqueue *queue) +{ + if (queue->tio) { + if (queue->tio->tio_destroy) + queue->tio->tio_destroy(queue); + queue->tio = NULL; + } + + if (queue->tio_data) { + free(queue->tio_data); + queue->tio_data = NULL; + } +} + +static int +tapdisk_queue_init_io(struct tqueue *queue, int drv) +{ + const struct tio *tio; + int err; + + switch (drv) { + case TIO_DRV_LIO: + tio = &td_tio_lio; + break; + case TIO_DRV_RWIO: + tio = &td_tio_rwio; + break; + default: + err = -EINVAL; + goto fail; + } + + queue->tio_data = calloc(1, tio->data_size); + if (!queue->tio_data) { + PERROR("malloc(%zu)", tio->data_size); + err = -errno; + goto fail; + } + + queue->tio = tio; + + if (tio->tio_setup) { + err = tio->tio_setup(queue, queue->size); + if (err) + goto fail; + } + + DPRINTF("I/O queue driver: %s\n", tio->name); + + return 0; + +fail: + tapdisk_queue_free_io(queue); + return err; +} int tapdisk_init_queue(struct tqueue *queue, int size, - int sync, struct tfilter *filter) + int drv, struct tfilter *filter) { int i, err; memset(queue, 0, sizeof(struct tqueue)); queue->size = size; - queue->sync = sync; queue->filter = filter; - queue->event = -1; - queue->aio_ctx = NULL; - if (!size) return 0; - if (!sync) { - queue->aio_ctx = REQUEST_ASYNC_FD; - queue->poll_fd = io_setup(size, &queue->aio_ctx); - err = queue->poll_fd; - if (err < 0) { - if (err == -EAGAIN) - DPRINTF("Couldn''t setup AIO context. If you " - "are trying to concurrently use a " - "large number of blktap-based disks, " - "you may need to increase the " - "system-wide aio request limit. " - "(e.g. ''echo 1048576 > /proc/sys/fs/" - "aio-max-nr'')\n"); - else - DPRINTF("Couldn''t get fd for AIO poll " - "support. This is probably because " - "your kernel does not have the " - "aio-poll patch applied.\n"); - queue->aio_ctx = NULL; - goto fail; - } + err = tapdisk_queue_init_io(queue, drv); + if (err) + goto fail; - queue->event - tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, - queue->poll_fd, 0, - tapdisk_tiocb_event, - queue); - err = queue->event; - if (err < 0) - goto fail; - + queue->iocbs = calloc(size, sizeof(struct iocb *)); + if (!queue->iocbs) { + err = -errno; + goto fail; } - err = -ENOMEM; - queue->iocbs = calloc(size, sizeof(struct iocb *)); - queue->aio_events = calloc(size, sizeof(struct io_event)); - if (!queue->iocbs || !queue->aio_events) - goto fail; - err = opio_init(&queue->opioctx, size); if (err) goto fail; @@ -294,22 +520,11 @@ void tapdisk_free_queue(struct tqueue *queue) { - if (queue->event >= 0) { - tapdisk_server_unregister_event(queue->event); - queue->event = -1; - } - - if (queue->aio_ctx) { - io_destroy(queue->aio_ctx); - queue->aio_ctx = NULL; - } + tapdisk_queue_free_io(queue); free(queue->iocbs); queue->iocbs = NULL; - free(queue->aio_events); - queue->aio_events = NULL; - opio_free(&queue->opioctx); } @@ -319,9 +534,9 @@ struct tiocb *tiocb = queue->deferred.head; WARN("TAPDISK QUEUE:\n"); - WARN("size: %d, sync: %d, queued: %d, iocbs_pending: %d, " + WARN("size: %d, tio: %s, queued: %d, iocbs_pending: %d, " "tiocbs_pending: %d, tiocbs_deferred: %d, deferrals: %"PRIx64"\n", - queue->size, queue->sync, queue->queued, queue->iocbs_pending, + queue->size, queue->tio->name, queue->queued, queue->iocbs_pending, queue->tiocbs_pending, queue->tiocbs_deferred, queue->deferrals); if (tiocb) { @@ -362,42 +577,14 @@ defer_tiocb(queue, tiocb); } + /* * fail_tiocbs may queue more tiocbs */ int tapdisk_submit_tiocbs(struct tqueue *queue) { - int merged, submitted, err = 0; - - if (!queue->queued) - return 0; - - if (queue->sync) - return io_synchronous_rw(queue); - - tapdisk_filter_iocbs(queue->filter, queue->iocbs, queue->queued); - merged = io_merge(&queue->opioctx, queue->iocbs, queue->queued); - submitted = io_submit(queue->aio_ctx, merged, queue->iocbs); - - DBG("queued: %d, merged: %d, submitted: %d\n", - queue->queued, merged, submitted); - - if (submitted < 0) { - err = submitted; - submitted = 0; - } else if (submitted < merged) - err = -EIO; - - queue->iocbs_pending += submitted; - queue->tiocbs_pending += queue->queued; - queue->queued = 0; - - if (err) - queue->tiocbs_pending -= - fail_tiocbs(queue, submitted, merged, err); - - return submitted; + return queue->tio->tio_submit(queue); } int @@ -412,40 +599,6 @@ return submitted; } -static void -tapdisk_complete_tiocbs(struct tqueue *queue) -{ - int i, ret, split; - struct iocb *iocb; - struct tiocb *tiocb; - struct io_event *ep; - - ret = io_getevents(queue->aio_ctx, 0, - queue->size, queue->aio_events, NULL); - split = io_split(&queue->opioctx, queue->aio_events, ret); - tapdisk_filter_events(queue->filter, queue->aio_events, split); - - DBG("events: %d, tiocbs: %d\n", ret, split); - - queue->iocbs_pending -= ret; - queue->tiocbs_pending -= split; - - for (i = split, ep = queue->aio_events; i-- > 0; ep++) { - iocb = ep->obj; - tiocb = (struct tiocb *)iocb->data; - complete_tiocb(queue, tiocb, ep->res); - } - - queue_deferred_tiocbs(queue); -} - -static void -tapdisk_tiocb_event(event_id_t id, char mode, void *private) -{ - struct tqueue *queue = private; - tapdisk_complete_tiocbs(queue); -} - /* * cancel_tiocbs may queue more tiocbs */ diff -r 2439e9413187 -r b816b4a752d8 tools/blktap2/drivers/tapdisk-queue.h --- a/tools/blktap2/drivers/tapdisk-queue.h Fri Nov 13 20:26:14 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-queue.h Thu Jan 28 14:29:54 2010 -0800 @@ -55,16 +55,14 @@ struct tqueue { int size; - int sync; - int poll_fd; - event_id_t event; - io_context_t aio_ctx; + const struct tio *tio; + void *tio_data; + struct opioctx opioctx; int queued; struct iocb **iocbs; - struct io_event *aio_events; /* number of iocbs pending in the aio layer */ int iocbs_pending; @@ -86,6 +84,20 @@ uint64_t deferrals; }; +struct tio { + const char *name; + size_t data_size; + + int (*tio_setup) (struct tqueue *queue, int qlen); + void (*tio_destroy) (struct tqueue *queue); + int (*tio_submit) (struct tqueue *queue); +}; + +enum { + TIO_DRV_LIO = 1, + TIO_DRV_RWIO = 2, +}; + /* * Interface for request producer (i.e., tapdisk) * NB: the following functions may cause additional tiocbs to be queued: @@ -99,7 +111,7 @@ #define tapdisk_queue_empty(q) ((q)->queued == 0) #define tapdisk_queue_full(q) \ (((q)->tiocbs_pending + (q)->queued) >= (q)->size) -int tapdisk_init_queue(struct tqueue *, int size, int sync, struct tfilter *); +int tapdisk_init_queue(struct tqueue *, int size, int drv, struct tfilter *); void tapdisk_free_queue(struct tqueue *); void tapdisk_debug_queue(struct tqueue *); void tapdisk_queue_tiocb(struct tqueue *, struct tiocb *); diff -r 2439e9413187 -r b816b4a752d8 tools/blktap2/drivers/tapdisk-server.c --- a/tools/blktap2/drivers/tapdisk-server.c Fri Nov 13 20:26:14 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-server.c Thu Jan 28 14:29:54 2010 -0800 @@ -236,7 +236,8 @@ static int tapdisk_server_init_aio(void) { - return tapdisk_init_queue(&server.aio_queue, TAPDISK_TIOCBS, 0, NULL); + return tapdisk_init_queue(&server.aio_queue, TAPDISK_TIOCBS, + TIO_DRV_LIO, NULL); } static void diff -r 2439e9413187 -r b816b4a752d8 tools/blktap2/drivers/tapdisk-vbd.c --- a/tools/blktap2/drivers/tapdisk-vbd.c Fri Nov 13 20:26:14 2009 -0800 +++ b/tools/blktap2/drivers/tapdisk-vbd.c Thu Jan 28 14:29:54 2010 -0800 @@ -1260,6 +1260,8 @@ int n; td_ring_t *ring; + tapdisk_vbd_check_state(vbd); + ring = &vbd->ring; if (!ring->sring) return 0; diff -r 2439e9413187 -r b816b4a752d8 tools/blktap2/include/blktaplib.h --- a/tools/blktap2/include/blktaplib.h Fri Nov 13 20:26:14 2009 -0800 +++ b/tools/blktap2/include/blktaplib.h Thu Jan 28 14:29:54 2010 -0800 @@ -43,6 +43,7 @@ #endif #define EPRINTF(_f, _a...) syslog(LOG_ERR, "tap-err:%s: " _f, __func__, ##_a) +#define PERROR(_f, _a...) EPRINTF(_f ": %s", ##_a, strerror(errno)) #define BLK_RING_SIZE __RING_SIZE((blkif_sring_t *)0, XC_PAGE_SIZE) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-28 22:37 UTC
[Xen-devel] [PATCH 4 of 4] blktap2: Prefer AIO eventfd support on kernels >= 2.6.22
# HG changeset patch # User Daniel Stodden <daniel.stodden@citrix.com> # Date 1264717884 28800 # Node ID b7d6bbba856b7a3c5170f7377a3d490429a91ba5 # Parent b816b4a752d8d2dcc6e7e01ad0cf75404932d94b blktap2: Prefer AIO eventfd support on kernels >= 2.6.22 Mainline kernel support for eventfd(2) in linux aio was added between 2.6.21 and 2.6.22. Libaio after 0.3.107 has the header file, but presently few systems support it. Neither do we rely on an up-to-date libc6. Instead, this patch adds a header which defines custom iocb_common struct, and works around a potentially missing sys/eventfd.h. q diff -r b816b4a752d8 -r b7d6bbba856b tools/blktap2/drivers/block-aio.c --- a/tools/blktap2/drivers/block-aio.c Thu Jan 28 14:29:54 2010 -0800 +++ b/tools/blktap2/drivers/block-aio.c Thu Jan 28 14:31:24 2010 -0800 @@ -28,7 +28,6 @@ #include <errno.h> -#include <libaio.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> diff -r b816b4a752d8 -r b7d6bbba856b tools/blktap2/drivers/libaio-compat.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/blktap2/drivers/libaio-compat.h Thu Jan 28 14:31:24 2010 -0800 @@ -0,0 +1,92 @@ +/* + * Copyright (c) 2010, XenSource Inc. + * All rights reserved. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public License + * as published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + */ + +/* + * kernel 2.6.21 added eventfd(2) support, kernel 2.6.22 eventfds for + * aio. libaio 0.3.107 updated the header file, but few systems have + * it. define a custom iocb_common struct instead, and work around a + * potentially missing sys/eventfd.h. this header should vanish over + * time. + */ + +#ifndef __LIBAIO_COMPAT +#define __LIBAIO_COMPAT + +#include <libaio.h> +#include <unistd.h> +#include <sys/syscall.h> + +struct __compat_io_iocb_common { + char __pad_buf[8]; + char __pad_nbytes[8]; + long long offset; + long long __pad3; + unsigned flags; + unsigned resfd; +}; + +static inline void __io_set_eventfd(struct iocb *iocb, int eventfd) +{ + struct __compat_io_iocb_common *c; + c = (struct __compat_io_iocb_common*)&iocb->u.c; + c->flags |= (1 << 0); + c->resfd = eventfd; +} + +#ifndef SYS_eventfd +#ifndef __NR_eventfd +# if defined(__alpha__) +# define __NR_eventfd 478 +# elif defined(__arm__) +# define __NR_eventfd (__NR_SYSCALL_BASE+351) +# elif defined(__ia64__) +# define __NR_eventfd 1309 +# elif defined(__i386__) +# define __NR_eventfd 323 +# elif defined(__m68k__) +# define __NR_eventfd 319 +# elif 0 && defined(__mips__) +# error __NR_eventfd? +# define __NR_eventfd (__NR_Linux + 319) +# define __NR_eventfd (__NR_Linux + 278) +# define __NR_eventfd (__NR_Linux + 282) +# elif defined(__hppa__) +# define __NR_eventfd (__NR_Linux + 304) +# elif defined(__PPC__) || defined(__powerpc64__) +# define __NR_eventfd 307 +# elif defined(__s390__) || defined(__s390x__) +# define __NR_eventfd 318 +# elif defined(__sparc__) +# define __NR_eventfd 313 +# elif defined(__x86_64__) +# define __NR_eventfd 284 +# endif +#else +# error __NR_eventfd? +#endif +#define SYS_eventfd __NR_eventfd +#endif + +static inline int tapdisk_sys_eventfd(int initval) +{ + return syscall(SYS_eventfd, initval, 0); +} + +#endif /* __LIBAIO_COMPAT */ diff -r b816b4a752d8 -r b7d6bbba856b tools/blktap2/drivers/tapdisk-queue.c --- a/tools/blktap2/drivers/tapdisk-queue.c Thu Jan 28 14:29:54 2010 -0800 +++ b/tools/blktap2/drivers/tapdisk-queue.c Thu Jan 28 14:31:24 2010 -0800 @@ -30,12 +30,18 @@ #include <stdlib.h> #include <unistd.h> #include <libaio.h> +#ifdef __linux__ +#include <linux/version.h> +#endif #include "tapdisk.h" #include "tapdisk-log.h" #include "tapdisk-queue.h" #include "tapdisk-filter.h" #include "tapdisk-server.h" +#include "tapdisk-utils.h" + +#include "libaio-compat.h" #include "atomicio.h" #define WARN(_f, _a...) tlog_write(TLOG_WARN, _f, ##_a) @@ -270,10 +276,122 @@ io_context_t aio_ctx; struct io_event *aio_events; - int poll_fd; + int event_fd; int event_id; + + int flags; }; +#define LIO_FLAG_EVENTFD (1<<0) + +static int +tapdisk_lio_check_resfd(void) +{ + return tapdisk_linux_version() >= KERNEL_VERSION(2, 6, 22); +} + +static void +tapdisk_lio_destroy_aio(struct tqueue *queue) +{ + struct lio *lio = queue->tio_data; + + if (lio->event_fd >= 0) { + close(lio->event_fd); + lio->event_fd = -1; + } + + if (lio->aio_ctx) { + io_destroy(lio->aio_ctx); + lio->aio_ctx = 0; + } +} + +static int +__lio_setup_aio_poll(struct tqueue *queue, int qlen) +{ + struct lio *lio = queue->tio_data; + int err, fd; + + lio->aio_ctx = REQUEST_ASYNC_FD; + + fd = io_setup(qlen, &lio->aio_ctx); + if (fd < 0) { + lio->aio_ctx = 0; + err = -errno; + + if (err == -EINVAL) + goto fail_fd; + + goto fail; + } + + lio->event_fd = fd; + + return 0; + +fail_fd: + DPRINTF("Couldn''t get fd for AIO poll support. This is probably " + "because your kernel does not have the aio-poll patch " + "applied.\n"); +fail: + return err; +} + +static int +__lio_setup_aio_eventfd(struct tqueue *queue, int qlen) +{ + struct lio *lio = queue->tio_data; + int err; + + err = io_setup(qlen, &lio->aio_ctx); + if (err < 0) { + lio->aio_ctx = 0; + return err; + } + + lio->event_fd = tapdisk_sys_eventfd(0); + if (lio->event_fd < 0) + return -errno; + + lio->flags |= LIO_FLAG_EVENTFD; + + return 0; +} + +static int +tapdisk_lio_setup_aio(struct tqueue *queue, int qlen) +{ + struct lio *lio = queue->tio_data; + int err; + + lio->aio_ctx = 0; + lio->event_fd = -1; + + /* + * prefer the mainline eventfd(2) api, if available. + * if not, fall back to the poll fd patch. + */ + + err = !tapdisk_lio_check_resfd(); + if (!err) + err = __lio_setup_aio_eventfd(queue, qlen); + if (err) + err = __lio_setup_aio_poll(queue, qlen); + + if (err == -EAGAIN) + goto fail_rsv; +fail: + return err; + +fail_rsv: + DPRINTF("Couldn''t setup AIO context. If you are trying to " + "concurrently use a large number of blktap-based disks, you may " + "need to increase the system-wide aio request limit. " + "(e.g. ''echo 1048576 > /proc/sys/fs/aio-max-nr'')\n"); + goto fail; +} + + static void tapdisk_lio_destroy(struct tqueue *queue) { @@ -287,10 +405,7 @@ lio->event_id = -1; } - if (lio->aio_ctx) { - io_destroy(lio->aio_ctx); - lio->aio_ctx = NULL; - } + tapdisk_lio_destroy_aio(queue); if (lio->aio_events) { free(lio->aio_events); @@ -299,6 +414,27 @@ } static void +tapdisk_lio_set_eventfd(struct tqueue *queue, int n, struct iocb **iocbs) +{ + struct lio *lio = queue->tio_data; + int i; + + if (lio->flags & LIO_FLAG_EVENTFD) + for (i = 0; i < n; ++i) + __io_set_eventfd(iocbs[i], lio->event_fd); +} + +static void +tapdisk_lio_ack_event(struct tqueue *queue) +{ + struct lio *lio = queue->tio_data; + uint64_t val; + + if (lio->flags & LIO_FLAG_EVENTFD) + read(lio->event_fd, &val, sizeof(val)); +} + +static void tapdisk_lio_event(event_id_t id, char mode, void *private) { struct tqueue *queue = private; @@ -308,6 +444,8 @@ struct tiocb *tiocb; struct io_event *ep; + tapdisk_lio_ack_event(queue); + lio = queue->tio_data; ret = io_getevents(lio->aio_ctx, 0, queue->size, lio->aio_events, NULL); @@ -336,22 +474,14 @@ int err; lio->event_id = -1; - lio->aio_ctx = REQUEST_ASYNC_FD; - lio->poll_fd = io_setup(qlen, &lio->aio_ctx); - err = lio->poll_fd; - if (err < 0) { - lio->aio_ctx = NULL; - - if (err == -EAGAIN) - goto fail_rsv; - - goto fail_fd; - } + err = tapdisk_lio_setup_aio(queue, qlen); + if (err) + goto fail; lio->event_id tapdisk_server_register_event(SCHEDULER_POLL_READ_FD, - lio->poll_fd, 0, + lio->event_fd, 0, tapdisk_lio_event, queue); err = lio->event_id; @@ -369,19 +499,6 @@ fail: tapdisk_lio_destroy(queue); return err; - -fail_rsv: - DPRINTF("Couldn''t setup AIO context. If you are trying to " - "concurrently use a large number of blktap-based disks, you may " - "need to increase the system-wide aio request limit. " - "(e.g. ''echo 1048576 > /proc/sys/fs/aio-max-nr'')\n"); - goto fail; - -fail_fd: - DPRINTF("Couldn''t get fd for AIO poll support. This is probably " - "because your kernel does not have the aio-poll patch " - "applied.\n"); - goto fail; } static int @@ -395,6 +512,7 @@ tapdisk_filter_iocbs(queue->filter, queue->iocbs, queue->queued); merged = io_merge(&queue->opioctx, queue->iocbs, queue->queued); + tapdisk_lio_set_eventfd(queue, merged, queue->iocbs); submitted = io_submit(lio->aio_ctx, merged, queue->iocbs); DBG("queued: %d, merged: %d, submitted: %d\n", diff -r b816b4a752d8 -r b7d6bbba856b tools/blktap2/drivers/tapdisk-utils.c --- a/tools/blktap2/drivers/tapdisk-utils.c Thu Jan 28 14:29:54 2010 -0800 +++ b/tools/blktap2/drivers/tapdisk-utils.c Thu Jan 28 14:31:24 2010 -0800 @@ -33,6 +33,10 @@ #include <sys/mman.h> #include <sys/ioctl.h> #include <sys/resource.h> +#include <sys/utsname.h> +#ifdef __linux__ +#include <linux/version.h> +#endif #include "blk.h" #include "tapdisk.h" @@ -183,3 +187,31 @@ return 0; } + +#ifdef __linux__ + +int tapdisk_linux_version(void) +{ + struct utsname uts; + unsigned int version, patchlevel, sublevel; + int n, err; + + err = uname(&uts); + if (err) + return -errno; + + n = sscanf(uts.release, "%u.%u.%u", &version, &patchlevel, &sublevel); + if (n != 3) + return -ENOSYS; + + return KERNEL_VERSION(version, patchlevel, sublevel); +} + +#else + +int tapdisk_linux_version(void) +{ + return -ENOSYS; +} + +#endif diff -r b816b4a752d8 -r b7d6bbba856b tools/blktap2/drivers/tapdisk-utils.h --- a/tools/blktap2/drivers/tapdisk-utils.h Thu Jan 28 14:29:54 2010 -0800 +++ b/tools/blktap2/drivers/tapdisk-utils.h Thu Jan 28 14:31:24 2010 -0800 @@ -38,5 +38,6 @@ int tapdisk_namedup(char **, const char *); int tapdisk_parse_disk_type(const char *, char **, int *); int tapdisk_get_image_size(int, uint64_t *, uint32_t *); +int tapdisk_linux_version(void); #endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jan-29 06:55 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On 28/01/2010 22:37, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:> Get blktap2 running on pvops. > > This mainly adds eventfd support to the userland code. Based on some > prior cleanup to tapdisk-queue and the server object. We had most of > that in XenServer for a while, so I kept it stacked.Patches 1, 2 and 4 are missing a signed-off-by line. I could not apply patch 3 because it was rejected as mangled by ''patch'': not sure why this happens from time to time, but please re-send all 4 patches as attachments, and include signed-off-by on each one. Thanks, Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:44 UTC
[Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
Is --inline right/better? Cheers, Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:44 UTC
[Xen-devel] [PATCH 1 of 4] blktap2: Sort out tapdisk IPC init
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:44 UTC
[Xen-devel] [PATCH 2 of 4] blktap2: Sort out tapdisk AIO init
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:44 UTC
[Xen-devel] [PATCH 3 of 4] blktap2: Separate tapdisk raw I/O into different backends
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:44 UTC
[Xen-devel] [PATCH 4 of 4] blktap2: Prefer AIO eventfd support on kernels >= 2.6.22
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 07:55 UTC
[Xen-devel] Re: [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 02:44 -0500, Daniel Stodden wrote:> Is --inline right/better?Hmm, looks like mailman filters attachment parts anyway. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jan-29 08:09 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
Echo the changeset comment and sign-off into the email body would be better, but your re-send is fine as far as I''m concerned. I should be able to apply the attachments no problem. Thanks, Keir On 29/01/2010 07:44, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:> Is --inline right/better? > > Cheers, > Daniel > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 08:29 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote:> Echo the changeset comment and sign-off into the email body would be better, > but your re-send is fine as far as I''m concerned. I should be able to apply > the attachments no problem.Patchbomb as of hg 1.3.1 doesn''t seem to do this. Xapi-dev promotes a similar patchbomb hack, but it would include the entire diff in the header message (--review) and a dup attached to avoid mangling. Does one exist already, or are you somewhat asking me to write one? Would probably belong on the wiki. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 08:45 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 03:29 -0500, Daniel Stodden wrote:> On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote: > > Echo the changeset comment and sign-off into the email body would be better, > > but your re-send is fine as far as I''m concerned. I should be able to apply > > the attachments no problem. > > Patchbomb as of hg 1.3.1 doesn''t seem to do this.Ah, it does. So --inline implies --attach, but is still different from --inline *and* --attach. Sorry for the noise. :P, Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Jan-29 08:52 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On 29/01/2010 08:45, "Daniel Stodden" <daniel.stodden@citrix.com> wrote:> On Fri, 2010-01-29 at 03:29 -0500, Daniel Stodden wrote: >> On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote: >>> Echo the changeset comment and sign-off into the email body would be better, >>> but your re-send is fine as far as I''m concerned. I should be able to apply >>> the attachments no problem. >> >> Patchbomb as of hg 1.3.1 doesn''t seem to do this. > > Ah, it does. > > So --inline implies --attach, > but is still different from --inline *and* --attach. > > Sorry for the noise.By the way, attaching and inlining a whole patch is also acceptable. Some people like inline patches, so that they can easily review and comment. I like attachments because they JustWork when I''m trying to apply big bundles of patches. For smaller patches, inline only can be okay. For some reason, I find that inline large patches rarely apply for me. It''s really weird, given I only fetchmail and munpack the emails, and yet I have a suspicion the mangling does happen at my end. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 09:22 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 03:52 -0500, Keir Fraser wrote:> On 29/01/2010 08:45, "Daniel Stodden" <daniel.stodden@citrix.com> wrote: > > > On Fri, 2010-01-29 at 03:29 -0500, Daniel Stodden wrote: > >> On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote: > >>> Echo the changeset comment and sign-off into the email body would be better, > >>> but your re-send is fine as far as I''m concerned. I should be able to apply > >>> the attachments no problem. > >> > >> Patchbomb as of hg 1.3.1 doesn''t seem to do this. > > > > Ah, it does. > > > > So --inline implies --attach, > > but is still different from --inline *and* --attach. > > > > Sorry for the noise. > > By the way, attaching and inlining a whole patch is also acceptable. Some > people like inline patches, so that they can easily review and comment. I > like attachments because they JustWork when I''m trying to apply big bundles > of patches.To avoid confusion: By inline I meant the content-disposition. The visual results would combine the best of both. But I suspect the results somwhat depend on the mailer in use. It''s completely up to you. I don''t charge for python cycles. :) So far I got [alias] email-xen = email --attach --inline --to ''Xen <xen-devel@lists.xensource.com>'' email-xapi = email --review --inline --to ''Xen API <xen-api@lists.xensource.com>'' Maybe Pasi knows a decent corner on the wiki where this stuff can converge to gatekeeper preferences.> For smaller patches, inline only can be okay. For some reason, I find that > inline large patches rarely apply for me. It''s really weird, given I only > fetchmail and munpack the emails, and yet I have a suspicion the mangling > does happen at my end.Heard this before. I doubt fetchmail tries to be smart about the message body. I never worked with munpack. I noticed that mailman adds some filtering even when forwarding, because the content-disposition=inline gets stripped. But I also think mailman breaking diffs would be kind of gross.. Cheers, Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Pasi Kärkkäinen
2010-Jan-29 10:06 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, Jan 29, 2010 at 01:22:50AM -0800, Daniel Stodden wrote:> On Fri, 2010-01-29 at 03:52 -0500, Keir Fraser wrote: > > On 29/01/2010 08:45, "Daniel Stodden" <daniel.stodden@citrix.com> wrote: > > > > > On Fri, 2010-01-29 at 03:29 -0500, Daniel Stodden wrote: > > >> On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote: > > >>> Echo the changeset comment and sign-off into the email body would be better, > > >>> but your re-send is fine as far as I''m concerned. I should be able to apply > > >>> the attachments no problem. > > >> > > >> Patchbomb as of hg 1.3.1 doesn''t seem to do this. > > > > > > Ah, it does. > > > > > > So --inline implies --attach, > > > but is still different from --inline *and* --attach. > > > > > > Sorry for the noise. > > > > By the way, attaching and inlining a whole patch is also acceptable. Some > > people like inline patches, so that they can easily review and comment. I > > like attachments because they JustWork when I''m trying to apply big bundles > > of patches. > > To avoid confusion: By inline I meant the content-disposition. The > visual results would combine the best of both. But I suspect the results > somwhat depend on the mailer in use. > > It''s completely up to you. I don''t charge for python cycles. :) > > So far I got > > [alias] > email-xen = email --attach --inline --to ''Xen <xen-devel@lists.xensource.com>'' > email-xapi = email --review --inline --to ''Xen API <xen-api@lists.xensource.com>'' > > Maybe Pasi knows a decent corner on the wiki where this stuff can > converge to gatekeeper preferences. >There''s a note about submitting patches in: http://wiki.xensource.com/xenwiki/XenFaq I can add this stuff there. So those aliases should go to .hgrc ? -- Pasi _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 10:27 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 05:06 -0500, Pasi Kärkkäinen wrote:> On Fri, Jan 29, 2010 at 01:22:50AM -0800, Daniel Stodden wrote: > > On Fri, 2010-01-29 at 03:52 -0500, Keir Fraser wrote: > > > On 29/01/2010 08:45, "Daniel Stodden" <daniel.stodden@citrix.com> wrote: > > > > > > > On Fri, 2010-01-29 at 03:29 -0500, Daniel Stodden wrote: > > > >> On Fri, 2010-01-29 at 03:09 -0500, Keir Fraser wrote: > > > >>> Echo the changeset comment and sign-off into the email body would be better, > > > >>> but your re-send is fine as far as I''m concerned. I should be able to apply > > > >>> the attachments no problem. > > > >> > > > >> Patchbomb as of hg 1.3.1 doesn''t seem to do this. > > > > > > > > Ah, it does. > > > > > > > > So --inline implies --attach, > > > > but is still different from --inline *and* --attach. > > > > > > > > Sorry for the noise. > > > > > > By the way, attaching and inlining a whole patch is also acceptable. Some > > > people like inline patches, so that they can easily review and comment. I > > > like attachments because they JustWork when I''m trying to apply big bundles > > > of patches. > > > > To avoid confusion: By inline I meant the content-disposition. The > > visual results would combine the best of both. But I suspect the results > > somwhat depend on the mailer in use. > > > > It''s completely up to you. I don''t charge for python cycles. :) > > > > So far I got > > > > [alias] > > email-xen = email --attach --inline --to ''Xen <xen-devel@lists.xensource.com>'' > > email-xapi = email --review --inline --to ''Xen API <xen-api@lists.xensource.com>'' > > > > Maybe Pasi knows a decent corner on the wiki where this stuff can > > converge to gatekeeper preferences. > > > > There''s a note about submitting patches in: > http://wiki.xensource.com/xenwiki/XenFaq > > I can add this stuff there. > > So those aliases should go to .hgrc ?May. It''s just a neat way to not memorize them, so I guess it might help. I remember a xensource wiki page where the preferred format for xapi-dev was documented. I send you a copy when I''m back at work. That took a patch to patchbomb though. I pasted the one I''m presently running below. Not sure: I think there''s already a fair number xen-devel people looking into XCP. But is Xapi-related stuff even wanted on the current wiki? Daniel diff -r 13e59a0b3485 -r c60348153509 patchbomb.py --- a/patchbomb.py Fri Jan 29 00:20:30 2010 -0800 +++ b/patchbomb.py Fri Jan 29 00:49:13 2010 -0800 @@ -126,6 +126,9 @@ ''Patch subject is complete summary.'') body += ''\n\n\n'' + if opts.get(''review''): + body += ''\n''.join(patch) + if opts.get(''plain''): while patch and patch[0].startswith(''# ''): patch.pop(0) @@ -494,6 +497,7 @@ ('''', ''bundlename'', ''bundle'', _(''name of the bundle attachment file'')), (''r'', ''rev'', [], _(''a revision to send'')), + (''R'', ''review'', None, _(''add patch to message body for review'')), ('''', ''force'', None, _(''run even when remote repository is unrelated '' ''(with -b/--bundle)'')), _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Jan-29 10:34 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Thu, 2010-01-28 at 22:37 +0000, Daniel Stodden wrote:> - Does a runtime kernel version check. I guess this code will > need additional cpp magic on BSDs.I guess BSD will need cpp magic regardless of the kernel version check since it will have a different interface altogether. But for the Linux version can we not just try eventfd(), check for -ENOSYS and fall back to the old way if necessary, instead of hardcoding the kernel version? I suppose eventfd is an unlikely candidate for backporting to anything older but in general IMHO it''s preferable to check for the actual functionality you want to use rather than hardcoding version numbers etc. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Jan-29 10:52 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Fri, 2010-01-29 at 05:34 -0500, Ian Campbell wrote:> On Thu, 2010-01-28 at 22:37 +0000, Daniel Stodden wrote: > > - Does a runtime kernel version check. I guess this code will > > need additional cpp magic on BSDs. > > I guess BSD will need cpp magic regardless of the kernel version check > since it will have a different interface altogether.Right now the version check is only done ifdef __linux__, or it falls back to the poll patch right away. Whatever BSD is precisely running there, I didn''t change much about the ABI tapdisk is using.> But for the Linux > version can we not just try eventfd(), check for -ENOSYS and fall back > to the old way if necessary, instead of hardcoding the kernel version?That was my initial thought. Then again eventfd went in before AIO caught up. Then again-again it''s just a minor number in between. It''s probably fair to assume that the number of dom0 kernels out there which are prone to surprises is quite limited.> I suppose eventfd is an unlikely candidate for backporting to anything > older but in general IMHO it''s preferable to check for the actual > functionality you want to use rather than hardcoding version numbers > etc.In theory I agree. I guess the right (tm) way to fix is to convince the libaio maintainer to add a proper version define somewhere, instead of doing it in tapdisk. Then let distros fix libaio if they want it. Which they most probably want anyway. The objective was to do without for now, and obsolete tools/libaio asap. The uname() was just to guarantee the least possible amount of unwanted surprise while doing so. If the gap is a non-issue, let''s remove that bit. I''m all for smaller code. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-02 22:53 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
Is this patch set related to the AIO poll support that blocktap2 keeps complaining about when I use tap:tapdisk:aio as the disk type? I found the code that changes io_setup in the Novell Xen patches but its part of a reasonably large patch. Dave _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-02 23:08 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Tue, 2010-02-02 at 15:10 -0800, Daniel Stodden wrote:> On Tue, 2010-02-02 at 17:53 -0500, David P. Quigley wrote: > > Is this patch set related to the AIO poll support that blocktap2 keeps > > complaining about when I use tap:tapdisk:aio as the disk type? > > On pvops, yes. > > > I found > > the code that changes io_setup in the Novell Xen patches but its part of > > a reasonably large patch. > > Shouldn''t depend on the disktype. As long as it''s blktap, not blkback on > a raw vdi, all disk drivers depend on this. > > The io_setup change you''re referring might rather be the old way of > doing things. It made io_setup return an fd to poll, if the io_setup > caller calls with a ''magic'' (1, actually) aio_ctx value. > > Daniel >I believe that is the case. The error message below is what I''m getting from tapdisk2. The ioctl successfully completes and the devices are created. However it just can''t grab the fd to use so it then uses a second ioctl to remove the device shortly after. Dave Feb 1 17:29:20 localhost tapdisk2[2798]: Created /dev/xen/blktap-2/control device Feb 1 17:29:20 localhost tapdisk2[2798]: Created /dev/xen/blktap-2/blktap0 device Feb 1 17:29:20 localhost tapdisk2[2798]: Created /dev/xen/blktap-2/tapdev0 device Feb 1 17:29:20 localhost tapdisk2[2798]: new interface: ring: 251, device: 253, minor: 0 Feb 1 17:29:20 localhost tapdisk2[2798]: Couldn''t get fd for AIO poll support. This is probably because your kernel does not have the aio-poll patch applied. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Feb-02 23:10 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Tue, 2010-02-02 at 17:53 -0500, David P. Quigley wrote:> Is this patch set related to the AIO poll support that blocktap2 keeps > complaining about when I use tap:tapdisk:aio as the disk type?On pvops, yes.> I found > the code that changes io_setup in the Novell Xen patches but its part of > a reasonably large patch.Shouldn''t depend on the disktype. As long as it''s blktap, not blkback on a raw vdi, all disk drivers depend on this. The io_setup change you''re referring might rather be the old way of doing things. It made io_setup return an fd to poll, if the io_setup caller calls with a ''magic'' (1, actually) aio_ctx value. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-02 23:23 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Tue, 2010-02-02 at 15:10 -0800, Daniel Stodden wrote:> On Tue, 2010-02-02 at 17:53 -0500, David P. Quigley wrote: > > Is this patch set related to the AIO poll support that blocktap2 keeps > > complaining about when I use tap:tapdisk:aio as the disk type? > > On pvops, yes. > > > I found > > the code that changes io_setup in the Novell Xen patches but its part of > > a reasonably large patch. > > Shouldn''t depend on the disktype. As long as it''s blktap, not blkback on > a raw vdi, all disk drivers depend on this. > > The io_setup change you''re referring might rather be the old way of > doing things. It made io_setup return an fd to poll, if the io_setup > caller calls with a ''magic'' (1, actually) aio_ctx value. > > Daniel >Seems like I made a mistake and copied the version of the error that was generated before I updated xen-unstable to have your latest changes. Now I''m getting a completely different error. Does the aio driver handle raw disk files or do I need to create the image in some special way? Not the error I get has to do with incorrect magic in the shared info and this results in a segfault. I''ll try to hunt them down. The tapdisk output is listed below. Dave tapdisk2[2366]: Created /dev/xen/blktap-2/blktap0 device tapdisk2[2366]: Created /dev/xen/blktap-2/tapdev0 device tapdisk2[2366]: new interface: ring: 251, device: 253, minor: 0 tapdisk2[2366]: I/O queue driver: lio tapdisk2[2366]: Incorrect magic in shared info. tapdisk2[2366]: Failed to open shared info. tapdisk2[2366]: segfault at 18 ip 0000003e42008dc1 sp 00007fffd29889a0 error 4 in libpthread-2.10.2.so[3e42000000+17000] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Feb-02 23:57 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Tue, 2010-02-02 at 18:23 -0500, David P. Quigley wrote:> On Tue, 2010-02-02 at 15:10 -0800, Daniel Stodden wrote: > > On Tue, 2010-02-02 at 17:53 -0500, David P. Quigley wrote: > > > Is this patch set related to the AIO poll support that blocktap2 keeps > > > complaining about when I use tap:tapdisk:aio as the disk type? > > > > On pvops, yes. > > > > > I found > > > the code that changes io_setup in the Novell Xen patches but its part of > > > a reasonably large patch. > > > > Shouldn''t depend on the disktype. As long as it''s blktap, not blkback on > > a raw vdi, all disk drivers depend on this. > > > > The io_setup change you''re referring might rather be the old way of > > doing things. It made io_setup return an fd to poll, if the io_setup > > caller calls with a ''magic'' (1, actually) aio_ctx value. > > > > Daniel > > > > Seems like I made a mistake and copied the version of the error that was > generated before I updated xen-unstable to have your latest changes. Now > I''m getting a completely different error. Does the aio driver handle raw > disk files or do I need to create the image in some special way? Not the > error I get has to do with incorrect magic in the shared info and this > results in a segfault. I''ll try to hunt them down. The tapdisk output is > listed below. > > Dave > > tapdisk2[2366]: Created /dev/xen/blktap-2/blktap0 device > tapdisk2[2366]: Created /dev/xen/blktap-2/tapdev0 device > tapdisk2[2366]: new interface: ring: 251, device: 253, minor: 0 > tapdisk2[2366]: I/O queue driver: lio > tapdisk2[2366]: Incorrect magic in shared info. > tapdisk2[2366]: Failed to open shared info. > tapdisk2[2366]: segfault at 18 ip 0000003e42008dc1 sp 00007fffd29889a0 error 4 in libpthread-2.10.2.so[3e42000000+17000]Please try disabling -DMEMSHR in the Makefile. [Assuming that helps: Keir, I think there is help underway, but should we turn it off by default in the meantime?] Is this on the pvops kernel? - Make sure it doesn''t have HIGHPTE set. - Apply the bunch of kernel patches posted. ("blktap2 on pvops updates") I don''t think they made it into Jeremy''s tree yet. Let me know if there''s more trouble, we''re presently preparing another bunch of updates, so the time is just right :) Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-03 00:37 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Tue, 2010-02-02 at 15:57 -0800, Daniel Stodden wrote:> On Tue, 2010-02-02 at 18:23 -0500, David P. Quigley wrote: > > On Tue, 2010-02-02 at 15:10 -0800, Daniel Stodden wrote: > > > On Tue, 2010-02-02 at 17:53 -0500, David P. Quigley wrote: > > > > Is this patch set related to the AIO poll support that blocktap2 keeps > > > > complaining about when I use tap:tapdisk:aio as the disk type? > > > > > > On pvops, yes. > > > > > > > I found > > > > the code that changes io_setup in the Novell Xen patches but its part of > > > > a reasonably large patch. > > > > > > Shouldn''t depend on the disktype. As long as it''s blktap, not blkback on > > > a raw vdi, all disk drivers depend on this. > > > > > > The io_setup change you''re referring might rather be the old way of > > > doing things. It made io_setup return an fd to poll, if the io_setup > > > caller calls with a ''magic'' (1, actually) aio_ctx value. > > > > > > Daniel > > > > > > > Seems like I made a mistake and copied the version of the error that was > > generated before I updated xen-unstable to have your latest changes. Now > > I''m getting a completely different error. Does the aio driver handle raw > > disk files or do I need to create the image in some special way? Not the > > error I get has to do with incorrect magic in the shared info and this > > results in a segfault. I''ll try to hunt them down. The tapdisk output is > > listed below. > > > > Dave > > > > tapdisk2[2366]: Created /dev/xen/blktap-2/blktap0 device > > tapdisk2[2366]: Created /dev/xen/blktap-2/tapdev0 device > > tapdisk2[2366]: new interface: ring: 251, device: 253, minor: 0 > > tapdisk2[2366]: I/O queue driver: lio > > tapdisk2[2366]: Incorrect magic in shared info. > > tapdisk2[2366]: Failed to open shared info. > > tapdisk2[2366]: segfault at 18 ip 0000003e42008dc1 sp 00007fffd29889a0 error 4 in libpthread-2.10.2.so[3e42000000+17000] > > Please try disabling -DMEMSHR in the Makefile. > > [Assuming that helps: Keir, I think there is help underway, but should > we turn it off by default in the meantime?] > > Is this on the pvops kernel? > > - Make sure it doesn''t have HIGHPTE set. > - Apply the bunch of kernel patches posted. ("blktap2 on pvops updates") > I don''t think they made it into Jeremy''s tree yet. > > Let me know if there''s more trouble, we''re presently preparing another > bunch of updates, so the time is just right :) > > DanielSo I''m applying the patches and building now so I haven''t have any trouble yet but you have a small issue with your 4th patch in the series you posted late last week. I actually looked into making blktap2 compile as a module and I ran into the two symbols you exported. I don''t see a way around exporting zap_page_range however mm_init isn''t supposed to be used anymore. If you look in Documentation/vm/active_mm.txt there is this text below. [Start Quote] Also, a new rule is that _nobody_ ever has "init_mm" as a real MM any more. "init_mm" should be considered just a "lazy context when no other context is available", and in fact it is mainly used just at bootup when no real VM has yet been created. So code that used to check if (current->mm == &init_mm) should generally just do if (!current->mm) instead (which makes more sense anyway - the test is basically one of "do we have a user context", and is generally done by the page fault handler and things like that). [End Quote] I''m not sure if you really don''t have a real context at that point but I found it weird that init_mm was being used when this documentation seems to indicate no one should really be using it. Dave _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-03 19:41 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Wed, 2010-02-03 at 11:47 -0800, Daniel Stodden wrote:> On Wed, 2010-02-03 at 14:41 -0500, Daniel Stodden wrote: > > Hmm. DEBUG_SG set, I didn''t try that yet, but I didn''t expect issues > > either. Let''s see if I manage a repro. > > Apart from that, thought you might later be willing to give the one > below a spin. > > Daniel >I''ll add it to my guilt patchset and give it a try. Should I turn off DEBUG_SG in my config? Dave _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Feb-03 19:41 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
Hmm. DEBUG_SG set, I didn''t try that yet, but I didn''t expect issues either. Let''s see if I manage a repro. Daniel On Wed, 2010-02-03 at 14:15 -0500, David P. Quigley wrote:> ------------[ cut here ]------------ > kernel BUG at /home/dpquigl/linux-2.6-pvops.git/include/linux/scatterlist.h:65! > invalid opcode: 0000 [#1] SMP > last sysfs file: /sys/devices/virtual/block/tapdevc/removable > CPU 1 > Modules linked in: nfs fscache ipt_MASQUERADE iptable_nat nf_nat nfsd bridge lockd stp nfs_acl llc auth_rpcgss bnep exportfs sco l2cap bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec radeon snd_hwdep snd_seq ttm snd_seq_device drm snd_pcm snd_timer snd ppdev ata_generic soundcore e1000e dcdbas parport_pc i2c_algo_bit i2c_i801 snd_page_alloc iTCO_wdt serio_raw joydev parport pcspkr pata_acpi wmi iTCO_vendor_support i2c_core [last unloaded: microcode] > Pid: 17105, comm: vol_id Not tainted 2.6.31.6-pvops-dom0 #20 OptiPlex 960 > RIP: e030:[<ffffffff81267d75>] [<ffffffff81267d75>] sg_set_page+0x35/0x67 > RSP: e02b:ffff8801bf03f538 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffff8801c482e7a0 RCX: 0000000000000000 > RDX: 0000000000001000 RSI: ffffea000b7dec40 RDI: ffff8801c482e7a0 > RBP: ffff8801bf03f548 R08: 0000000087654321 R09: 0000000000000000 > R10: 0000000000000001 R11: 0000000000000000 R12: ffff8801bf066470 > R13: ffff8801bf078000 R14: ffff8801ddd4f320 R15: ffff8801bf066400 > FS: 00007f1cc9fdf6f0(0000) GS:ffffc900001d9000(0000) knlGS:0000000000000000 > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 0000000001cf75a8 CR3: 00000001ca1df000 CR4: 0000000000002660 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process vol_id (pid: 17105, threadinfo ffff8801bf03e000, task ffff8801c650c880) > Stack: > ffff8801bf03f558 000000000e73a217 ffff8801bf03f5c8 ffffffff8126867d > <0> ffff8801bf03f588 ffffffff8108e686 0000000000000000 000000000173a217 > <0> ffff8801c482e7a0 0000000000000000 ffff8801bf03f5c8 000000000e73a217 > Call Trace: > [<ffffffff8126867d>] blk_rq_map_sg+0x1c8/0x2ff > [<ffffffff8108e686>] ? getnstimeofday+0x6e/0xd8 > [<ffffffff81313d73>] blktap_device_process_request+0xb7/0x973 > [<ffffffff81133d26>] ? check_object+0x184/0x1d3 > [<ffffffff8100f61f>] ? xen_restore_fl_direct_end+0x0/0x1 > [<ffffffff8109aab3>] ? lock_acquired+0x2b4/0x2d7 > [<ffffffff8100f61f>] ? xen_restore_fl_direct_end+0x0/0x1 > [<ffffffff813148b3>] ? blktap_device_do_request+0x1e7/0x2ca > [<ffffffff813148b3>] ? blktap_device_do_request+0x1e7/0x2ca > [<ffffffff813148c1>] blktap_device_do_request+0x1f5/0x2ca > [<ffffffff81074b70>] ? del_timer+0x63/0x85 > [<ffffffff810fb538>] ? sync_page_killable+0x0/0x5e > [<ffffffff81261134>] __generic_unplug_device+0x44/0x5f > [<ffffffff8100f632>] ? check_events+0x12/0x20 > [<ffffffff8126118c>] generic_unplug_device+0x3d/0x64 > [<ffffffff8100eb68>] ? xen_force_evtchn_callback+0x20/0x36 > [<ffffffff8125dfdd>] blk_unplug+0x38/0x53 > [<ffffffff8125e01d>] blk_backing_dev_unplug+0x25/0x3b > [<ffffffff8108430a>] ? prepare_to_wait_exclusive+0x34/0x8b > [<ffffffff811682ce>] block_sync_page+0x47/0x62 > [<ffffffff810fb51b>] sync_page+0x5f/0x7c > [<ffffffff8100f632>] ? check_events+0x12/0x20 > [<ffffffff810fb559>] sync_page_killable+0x21/0x5e > [<ffffffff8151365c>] __wait_on_bit_lock+0x55/0xb2 > [<ffffffff8100f61f>] ? xen_restore_fl_direct_end+0x0/0x1 > [<ffffffff810fb310>] __lock_page_killable+0x73/0x8e > [<ffffffff81084132>] ? wake_bit_function+0x0/0x5a > [<ffffffff810fb376>] lock_page_killable+0x4b/0x66 > [<ffffffff810fd596>] generic_file_aio_read+0x385/0x52f > [<ffffffff81141ed4>] do_sync_read+0xfa/0x14b > [<ffffffff8100eb68>] ? xen_force_evtchn_callback+0x20/0x36 > [<ffffffff810840d3>] ? autoremove_wake_function+0x0/0x5f > [<ffffffff81225eec>] ? security_file_permission+0x29/0x3f > [<ffffffff81142692>] vfs_read+0xba/0x12b > [<ffffffff811427f5>] sys_read+0x59/0x91 > [<ffffffff81014fb2>] system_call_fastpath+0x16/0x1b > Code: 44 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 40 f6 c6 03 48 8b 47 08 74 04 0f 0b eb fe 41 b8 21 43 65 87 4c 39 07 74 04 <0f> 0b eb fe a8 01 74 04 0f 0b eb fe 83 e0 03 89 4f 10 89 57 14 > RIP [<ffffffff81267d75>] sg_set_page+0x35/0x67 > RSP <ffff8801bf03f538> > ---[ end trace c72cd6244d0817d8 ]--- > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Feb-03 19:47 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Wed, 2010-02-03 at 14:41 -0500, Daniel Stodden wrote:> Hmm. DEBUG_SG set, I didn''t try that yet, but I didn''t expect issues > either. Let''s see if I manage a repro.Apart from that, thought you might later be willing to give the one below a spin. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel Stodden
2010-Feb-03 20:02 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Wed, 2010-02-03 at 14:41 -0500, David P. Quigley wrote:> On Wed, 2010-02-03 at 11:47 -0800, Daniel Stodden wrote: > > On Wed, 2010-02-03 at 14:41 -0500, Daniel Stodden wrote: > > > Hmm. DEBUG_SG set, I didn''t try that yet, but I didn''t expect issues > > > either. Let''s see if I manage a repro. > > > > Apart from that, thought you might later be willing to give the one > > below a spin. > > > > Daniel > > > > I''ll add it to my guilt patchset and give it a try. Should I turn off > DEBUG_SG in my config?Nooo, that''s all great, let''s rather see what else you dig up. Put the patch below on top, it worked for me. I''ll fold them later and post an update. Daniel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
David P. Quigley
2010-Feb-03 20:12 UTC
Re: [Xen-devel] [PATCH 0 of 4] aio event fd support to blktap2
On Wed, 2010-02-03 at 12:02 -0800, Daniel Stodden wrote:> On Wed, 2010-02-03 at 14:41 -0500, David P. Quigley wrote: > > On Wed, 2010-02-03 at 11:47 -0800, Daniel Stodden wrote: > > > On Wed, 2010-02-03 at 14:41 -0500, Daniel Stodden wrote: > > > > Hmm. DEBUG_SG set, I didn''t try that yet, but I didn''t expect issues > > > > either. Let''s see if I manage a repro. > > > > > > Apart from that, thought you might later be willing to give the one > > > below a spin. > > > > > > Daniel > > > > > > > I''ll add it to my guilt patchset and give it a try. Should I turn off > > DEBUG_SG in my config? > > Nooo, that''s all great, let''s rather see what else you dig up. > Put the patch below on top, it worked for me. > I''ll fold them later and post an update. > > DanielI still have the same oops even after the init-mm patch. Not quite sure whats causing it but the domain gets built even though I get the kernel oops with blktap. Unfortunately the vm then hangs as it has no disk to use for booting. Dave _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Maybe Matching Threads
- [PATCH 00 of 21 RESEND] blktap3/drivers: Introduce tapdisk server.
- [PATCH 0 of 6 RESEND v2] blktap3/sring: shared ring between tapdisk and the front-end
- [PATCH 00 of 18] [v2] tools: fix bugs and build errors triggered by -O2 -Wall -Werror
- Errors of doing "make install-tools" with xen-4.2-unstable?
- Re: Re: XAPI on debian - sr-create issue