Eric Blake
2018-Jan-19 16:23 UTC
[Libguestfs] [nbdkit PATCH] Update filters to support FUA flags.
From: "Richard W.M. Jones" <rjones@redhat.com> This patch may be worth squashing? --- docs/nbdkit-filter.pod | 65 +++++++++++++++++++++++++++++++++++++++++++------ include/nbdkit-filter.h | 29 ++++++++++++---------- src/filters.c | 55 ++++++++++++++++++++--------------------- 3 files changed, 101 insertions(+), 48 deletions(-) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index 75157ef..a050a1d 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -89,7 +89,12 @@ by the plugin. To see example filters, take a look at the source of nbdkit, in the C<filters> directory. -Filters must be written in C and must be fully thread safe. +Filters must be written in C, must be fully thread safe, and have +tighter rules regarding what callbacks may do. While there is a +guarantee that plugins written against an older version of nbdkit will +still work with newer versions, filters do not have the same stability +guarantee, and nbdkit may refuse to use a filter that was compiled +against a different version rather than risk misbehavior. =head1 C<nbdkit-filter.h> @@ -320,11 +325,15 @@ error message and return C<-1>. =head2 C<.pread> int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, void *buf, uint32_t count, uint64_t offset); + void *handle, void *buf, uint32_t count, uint64_t offset, + uint32_t flags); This intercepts the plugin C<.pread> method and can be used to read or modify data read by the plugin. +At this time, flags will be 0 on input, and the filter should not pass +any flags to C<next_ops->pread>. + If there is an error (including a short read which couldn't be recovered from), C<.pread> should call C<nbdkit_error> with an error message B<and> set C<errno>, then return C<-1>. @@ -333,11 +342,21 @@ message B<and> set C<errno>, then return C<-1>. int (*pwrite) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, - const void *buf, uint32_t count, uint64_t offset); + const void *buf, uint32_t count, uint64_t offset, + uint32_t flags); This intercepts the plugin C<.pwrite> method and can be used to modify data written by the plugin. +At this time, flags may include C<NBDKIT_FLAG_FUA> on input based on +the result of C<.can_flush>. In turn, the filter may only pass +C<NBDKIT_FLAG_FUA> on to C<next_ops->pwrite> if C<next_ops->can_flush> +returned true. + +This function will not be called if C<.can_write> returned false; in +turn, the filter should not call C<next_ops->pwrite> if C<next_ops->can_write> +did not return true. + If there is an error (including a short write which couldn't be recovered from), C<.pwrite> should call C<nbdkit_error> with an error message B<and> set C<errno>, then return C<-1>. @@ -345,35 +364,67 @@ message B<and> set C<errno>, then return C<-1>. =head2 C<.flush> int (*flush) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle); + void *handle, uint32_t flags); This intercepts the plugin C<.flush> method and can be used to modify flush requests. +At this time, flags will be 0 on input, and the filter should not pass +any flags to C<next_ops->flush>. + +This function will not be called if C<.can_flush> returned false; in +turn, the filter should not call C<next_ops->flush> if C<next_ops->can_flush> +did not return true. + If there is an error, C<.flush> should call C<nbdkit_error> with an error message B<and> set C<errno>, then return C<-1>. =head2 C<.trim> int (*trim) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, uint32_t count, uint64_t offset); + void *handle, uint32_t count, uint64_t offset, uint32_t flags); This intercepts the plugin C<.trim> method and can be used to modify trim requests. +At this time, flags may include C<NBDKIT_FLAG_FUA> on input based on +the result of C<.can_flush>. In turn, the filter may only pass +C<NBDKIT_FLAG_FUA> on to C<next_ops->trim> if C<next_ops->can_flush> +returned true. + +This function will not be called if C<.can_trim> returned false; in +turn, the filter should not call C<next_ops->trim> if C<next_ops->can_trim> +did not return true. + If there is an error, C<.trim> should call C<nbdkit_error> with an error message B<and> set C<errno>, then return C<-1>. =head2 C<.zero> int (*zero) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, uint32_t count, uint64_t offset, int may_trim); + void *handle, uint32_t count, uint64_t offset, + uint32_t flags); This intercepts the plugin C<.zero> method and can be used to modify zero requests. +At this time, flags may include C<NBDKIT_FLAG_MAY_TRIM> +unconditionally, and C<NBDKIT_FLAG_FUA> based on the result of +C<.can_flush>. In turn, when calling C<next_ops->zero>, the filter may +pass C<NBDKIT_FLAG_MAY_TRIM> unconditionally, but may only pass +C<NBDKIT_FLAG_FUA> if C<next_ops->can_flush> returned true. + +This function will not be called if C<.can_write> returned false; in +turn, the filter should not call C<next_ops->zero> if C<next_ops->can_write> +did not return true. + If there is an error, C<.zero> should call C<nbdkit_error> with an -error message B<and> set C<errno>, then return C<-1>. +error message B<and> set C<errno>, then return C<-1>; however, +unlike plugins, this function must not return the C<EOPNOTSUPP> +error (the code guarantees that C<next_ops->zero> will have already +done a fallback to C<next_ops->write> rather than fail with that +particular error, and the fallback must not be performed more +than once). =head1 THREADS diff --git a/include/nbdkit-filter.h b/include/nbdkit-filter.h index af79e33..c8b3c8c 100644 --- a/include/nbdkit-filter.h +++ b/include/nbdkit-filter.h @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013-2017 Red Hat Inc. + * Copyright (C) 2013-2018 Red Hat Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -59,17 +59,18 @@ struct nbdkit_next_ops { int (*is_rotational) (void *nxdata); int (*can_trim) (void *nxdata); - int (*pread) (void *nxdata, void *buf, uint32_t count, uint64_t offset); - int (*pwrite) (void *nxdata, - const void *buf, uint32_t count, uint64_t offset); - int (*flush) (void *nxdata); - int (*trim) (void *nxdata, uint32_t count, uint64_t offset); - int (*zero) (void *nxdata, uint32_t count, uint64_t offset, int may_trim); + int (*pread) (void *nxdata, void *buf, uint32_t count, uint64_t offset, + uint32_t flags); + int (*pwrite) (void *nxdata, const void *buf, uint32_t count, + uint64_t offset, uint32_t flags); + int (*flush) (void *nxdata, uint32_t flags); + int (*trim) (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags); + int (*zero) (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags); }; struct nbdkit_filter { /* Do not set these fields directly; use NBDKIT_REGISTER_FILTER. - * They exist so that we can support filters compiled against + * They exist so that we can recognize filters compiled against * one version of the header with a runtime compiled against a * different version with more (or fewer) fields. */ @@ -112,16 +113,18 @@ struct nbdkit_filter { void *handle); int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, void *buf, uint32_t count, uint64_t offset); + void *handle, void *buf, uint32_t count, uint64_t offset, + uint32_t flags); int (*pwrite) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, - const void *buf, uint32_t count, uint64_t offset); + const void *buf, uint32_t count, uint64_t offset, + uint32_t flags); int (*flush) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle); + void *handle, uint32_t flags); int (*trim) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, uint32_t count, uint64_t offset); + void *handle, uint32_t count, uint64_t offset, uint32_t flags); int (*zero) (struct nbdkit_next_ops *next_ops, void *nxdata, - void *handle, uint32_t count, uint64_t offset, int may_trim); + void *handle, uint32_t count, uint64_t offset, uint32_t flags); }; #ifndef NBDKIT_CXX_LANG_C diff --git a/src/filters.c b/src/filters.c index 9a2022c..9768f20 100644 --- a/src/filters.c +++ b/src/filters.c @@ -280,43 +280,40 @@ next_can_trim (void *nxdata) } static int -next_pread (void *nxdata, void *buf, uint32_t count, uint64_t offset) +next_pread (void *nxdata, void *buf, uint32_t count, uint64_t offset, + uint32_t flags) { struct b_conn *b_conn = nxdata; - return b_conn->b->pread (b_conn->b, b_conn->conn, buf, count, offset, 0); + return b_conn->b->pread (b_conn->b, b_conn->conn, buf, count, offset, flags); } static int -next_pwrite (void *nxdata, const void *buf, uint32_t count, uint64_t offset) +next_pwrite (void *nxdata, const void *buf, uint32_t count, uint64_t offset, + uint32_t flags) { struct b_conn *b_conn = nxdata; - return b_conn->b->pwrite (b_conn->b, b_conn->conn, buf, count, offset, 0); + return b_conn->b->pwrite (b_conn->b, b_conn->conn, buf, count, offset, flags); } static int -next_flush (void *nxdata) +next_flush (void *nxdata, uint32_t flags) { struct b_conn *b_conn = nxdata; - return b_conn->b->flush (b_conn->b, b_conn->conn, 0); + return b_conn->b->flush (b_conn->b, b_conn->conn, flags); } static int -next_trim (void *nxdata, uint32_t count, uint64_t offset) +next_trim (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags) { struct b_conn *b_conn = nxdata; - return b_conn->b->trim (b_conn->b, b_conn->conn, count, offset, 0); + return b_conn->b->trim (b_conn->b, b_conn->conn, count, offset, flags); } static int -next_zero (void *nxdata, uint32_t count, uint64_t offset, int may_trim) +next_zero (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags) { struct b_conn *b_conn = nxdata; - uint32_t f = 0; - - if (may_trim) - f |= NBDKIT_FLAG_MAY_TRIM; - - return b_conn->b->zero (b_conn->b, b_conn->conn, count, offset, f); + return b_conn->b->zero (b_conn->b, b_conn->conn, count, offset, flags); } static struct nbdkit_next_ops next_ops = { @@ -448,11 +445,12 @@ filter_pread (struct backend *b, struct connection *conn, assert (flags == 0); - debug ("pread count=%" PRIu32 " offset=%" PRIu64, count, offset); + debug ("pread count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, + count, offset, flags); if (f->filter.pread) return f->filter.pread (&next_ops, &nxdata, handle, - buf, count, offset); + buf, count, offset, flags); else return f->backend.next->pread (f->backend.next, conn, buf, count, offset, flags); @@ -470,12 +468,12 @@ filter_pwrite (struct backend *b, struct connection *conn, assert (!(flags & ~NBDKIT_FLAG_FUA)); - debug ("pwrite count=%" PRIu32 " offset=%" PRIu64 " fua=%d", - count, offset, fua); + debug ("pwrite count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, + count, offset, flags); if (f->filter.pwrite) return f->filter.pwrite (&next_ops, &nxdata, handle, - buf, count, offset); + buf, count, offset, flags); else return f->backend.next->pwrite (f->backend.next, conn, buf, count, offset, flags); @@ -490,10 +488,10 @@ filter_flush (struct backend *b, struct connection *conn, uint32_t flags) assert (flags == 0); - debug ("flush"); + debug ("flush flags=0x%" PRIx32, flags); if (f->filter.flush) - return f->filter.flush (&next_ops, &nxdata, handle); + return f->filter.flush (&next_ops, &nxdata, handle, flags); else return f->backend.next->flush (f->backend.next, conn, flags); } @@ -509,10 +507,12 @@ filter_trim (struct backend *b, struct connection *conn, assert (flags == 0); - debug ("trim count=%" PRIu32 " offset=%" PRIu64, count, offset); + debug ("trim count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, + count, offset, flags); if (f->filter.trim) - return f->filter.trim (&next_ops, &nxdata, handle, count, offset); + return f->filter.trim (&next_ops, &nxdata, handle, count, offset, + flags); else return f->backend.next->trim (f->backend.next, conn, count, offset, flags); } @@ -524,16 +524,15 @@ filter_zero (struct backend *b, struct connection *conn, struct backend_filter *f = container_of (b, struct backend_filter, backend); void *handle = connection_get_handle (conn, f->backend.i); struct b_conn nxdata = { .b = f->backend.next, .conn = conn }; - int may_trim = (flags & NBDKIT_FLAG_MAY_TRIM) != 0; assert (!(flags & ~(NBDKIT_FLAG_MAY_TRIM | NBDKIT_FLAG_FUA))); - debug ("zero count=%" PRIu32 " offset=%" PRIu64 " may_trim=%d", - count, offset, may_trim); + debug ("zero count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, + count, offset, flags); if (f->filter.zero) return f->filter.zero (&next_ops, &nxdata, handle, - count, offset, may_trim); + count, offset, flags); else return f->backend.next->zero (f->backend.next, conn, count, offset, flags); -- 2.14.3
Eric Blake
2018-Jan-19 16:45 UTC
Re: [Libguestfs] [nbdkit PATCH] Update filters to support FUA flags.
On 01/19/2018 10:23 AM, Eric Blake wrote:> From: "Richard W.M. Jones" <rjones@redhat.com>Whoops, that's supposed to be attributed to me as author if we keep this patch separate from your work (I was still attributing it to you when I had squashed it into your earlier patches, so this was a rebase artifact).> > This patch may be worth squashing? > --- > docs/nbdkit-filter.pod | 65 +++++++++++++++++++++++++++++++++++++++++++------ > include/nbdkit-filter.h | 29 ++++++++++++---------- > src/filters.c | 55 ++++++++++++++++++++--------------------- > 3 files changed, 101 insertions(+), 48 deletions(-) >> @@ -333,11 +342,21 @@ message B<and> set C<errno>, then return C<-1>. > > int (*pwrite) (struct nbdkit_next_ops *next_ops, void *nxdata, > void *handle, > - const void *buf, uint32_t count, uint64_t offset); > + const void *buf, uint32_t count, uint64_t offset, > + uint32_t flags); > > This intercepts the plugin C<.pwrite> method and can be used to modify > data written by the plugin. > > +At this time, flags may include C<NBDKIT_FLAG_FUA> on input based on > +the result of C<.can_flush>. In turn, the filter may only pass > +C<NBDKIT_FLAG_FUA> on to C<next_ops->pwrite> if C<next_ops->can_flush> > +returned true. > + > +This function will not be called if C<.can_write> returned false; in > +turn, the filter should not call C<next_ops->pwrite> if C<next_ops->can_write> > +did not return true.I'm wondering if we're missing documentation here (and/or in the plugins document) that if .can_write returns true, the plugin must supply a .pwrite; likewise for .can_trim implying a .trim, and .can_flush implying a .flush. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2018-Jan-19 17:04 UTC
Re: [Libguestfs] [nbdkit PATCH] Update filters to support FUA flags.
On Fri, Jan 19, 2018 at 10:45:51AM -0600, Eric Blake wrote:> I'm wondering if we're missing documentation here (and/or in the plugins > document) that if .can_write returns true, the plugin must supply a > .pwrite; likewise for .can_trim implying a .trim, and .can_flush > implying a .flush.It's a strange one. For example if a plugin returns .can_pwrite == 1 but doesn't implement .pwrite then the NBD connection will appear to be writable, but any attempt to write will return EROFS. On the other hand (a) plugins don't usually implement can_write because the "autosensing" usually works, and (b) it doesn't crash or do anything particularly bad. can_trim/trim and can_flush/flush are similar. However unfortunately src/plugins.c now assert-fails if the FUA flag is present but plugin.flush does not exist. Is this a bug? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Reasonably Related Threads
- [nbdkit PATCH] Update filters to support FUA flags.
- [nbdkit PATCH 0/9] can_FOO caching, more filter validation
- [nbdkit PATCH 0/5] Another round of retry fixes
- Re: [PATCH nbdkit v2 10/10] tests: Test the Python plugin thoroughly.
- [nbdkit PATCH v3 00/15] Add FUA support to nbdkit