François Revol
2020-May-01 19:16 UTC
[Libguestfs] [PATCH] [v2] WIP: ddrescue mapfile filter
This allows to overlay bad sectors according to the mapfile generated by ddrescue, to then see where sectors are used using fsck and trying to copy files around. Signed-off-by: Fran?ois Revol <revol at free.fr> --- configure.ac | 2 + filters/ddrescue/Makefile.am | 75 +++++++ filters/ddrescue/ddrescue.c | 211 ++++++++++++++++++++ filters/ddrescue/nbdkit-ddrescue-filter.pod | 74 +++++++ 4 files changed, 362 insertions(+) create mode 100644 filters/ddrescue/Makefile.am create mode 100644 filters/ddrescue/ddrescue.c create mode 100644 filters/ddrescue/nbdkit-ddrescue-filter.pod diff --git a/configure.ac b/configure.ac index 6c25226e..21e1013f 100644 --- a/configure.ac +++ b/configure.ac @@ -97,6 +97,7 @@ filters="\ cache \ cacheextents \ cow \ + ddrescue \ delay \ error \ exitlast \ @@ -1089,6 +1090,7 @@ AC_CONFIG_FILES([Makefile filters/cache/Makefile filters/cacheextents/Makefile filters/cow/Makefile + filters/ddrescue/Makefile filters/delay/Makefile filters/error/Makefile filters/exitlast/Makefile diff --git a/filters/ddrescue/Makefile.am b/filters/ddrescue/Makefile.am new file mode 100644 index 00000000..2498074c --- /dev/null +++ b/filters/ddrescue/Makefile.am @@ -0,0 +1,75 @@ +# nbdkit +# Copyright (C) 2018 Red Hat Inc. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. + +include $(top_srcdir)/common-rules.mk + +EXTRA_DIST = \ + nbdkit-ddrescue-filter.pod \ + $(NULL) + +filter_LTLIBRARIES = nbdkit-ddrescue-filter.la + +nbdkit_ddrescue_filter_la_SOURCES = \ + ddrescue.c \ + $(top_srcdir)/include/nbdkit-filter.h \ + $(NULL) + +nbdkit_ddrescue_filter_la_CPPFLAGS = \ + -I$(top_srcdir)/include \ + -I$(top_srcdir)/common/include \ + -I$(top_srcdir)/common/sparse \ + -I$(top_srcdir)/common/utils \ + $(NULL) +nbdkit_ddrescue_filter_la_CFLAGS = \ + $(WARNINGS_CFLAGS) \ + $(GNUTLS_CFLAGS) \ + $(NULL) +nbdkit_ddrescue_filter_la_LDFLAGS = \ + -module -avoid-version -shared \ + -Wl,--version-script=$(top_srcdir)/filters/filters.syms \ + $(NULL) +nbdkit_ddrescue_filter_la_LIBADD = \ + $(top_builddir)/common/sparse/libsparse.la \ + $(top_builddir)/common/utils/libutils.la \ + $(GNUTLS_LIBS) \ + $(NULL) + +if HAVE_POD + +man_MANS = nbdkit-ddrescue-filter.1 +CLEANFILES += $(man_MANS) + +nbdkit-ddrescue-filter.1: nbdkit-ddrescue-filter.pod + $(PODWRAPPER) --section=1 --man $@ \ + --html $(top_builddir)/html/$@.html \ + $< + +endif HAVE_POD diff --git a/filters/ddrescue/ddrescue.c b/filters/ddrescue/ddrescue.c new file mode 100644 index 00000000..f6dd9382 --- /dev/null +++ b/filters/ddrescue/ddrescue.c @@ -0,0 +1,211 @@ +/* nbdkit + * Copyright (C) 2018-2020 Fran?ois Revol. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * * Neither the name of Red Hat nor the names of its contributors may be + * used to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> + +#include <pthread.h> + +#include <nbdkit-filter.h> + +#include "cleanup.h" + +struct range { + int64_t start; + int64_t end; + int64_t size; + char status; +}; + +struct mapfile { + int ranges_count; + struct range *ranges; +}; + +static struct mapfile map = { 0, NULL }; + +static int +parse_mapfile (const char *filename) +{ + FILE *fp = NULL; + CLEANUP_FREE char *line = NULL; + size_t linelen = 0; + ssize_t len; + int ret = -1; + int status_seen = 0; + + fp = fopen (filename, "r"); + if (!fp) { + nbdkit_error ("%s: ddrescue: fopen: %m", filename); + goto out; + } + + while ((len = getline (&line, &linelen, fp)) != -1) { + const char *delim = " \t"; + char *sp, *p; + int64_t offset, length; + char status; + + if (len > 0 && line[len-1] == '\n') { + line[len-1] = '\0'; + len--; + } + + if (len > 0 && line[0] == '#') + continue; + + if (len > 0 && !status_seen) { + /* status line, ignore it for now */ + status_seen = 1; + nbdkit_debug ("%s: skipping status line: '%s'", filename, line); + continue; + } + + if (sscanf (line, "%" SCNi64 "\t%" SCNi64 "\t%c", &offset, &length, &status) == 3) { + if (offset < 0) { + nbdkit_error ("block offset must not be negative"); + return -1; + } + if (length < 0) { + nbdkit_error ("block length must not be negative"); + return -1; + } + if (status == '+') { + int i = map.ranges_count++; + map.ranges = realloc(map.ranges, map.ranges_count * sizeof(struct range)); + if (map.ranges == NULL) { + nbdkit_error ("%s: ddrescue: realloc: %m", filename); + goto out; + } + map.ranges[i].start = offset; + map.ranges[i].end = offset + length - 1; + map.ranges[i].size = length; + map.ranges[i].status = status; + } + + nbdkit_debug ("%s: range: 0x%" PRIx64 " 0x%" PRIx64 " '%c'", filename, offset, length, status); + } + } + + ret = 0; + + out: + if (fp) + fclose (fp); + return ret; +} + +/* On unload, free the sparse array. */ +static void +ddrescue_unload (void) +{ + free (map.ranges); + map.ranges = NULL; + map.ranges_count = 0; +} + +static int +ddrescue_config (nbdkit_next_config *next, void *nxdata, + const char *key, const char *value) +{ + if (strcmp (key, "ddrescue-mapfile") == 0) { + if (parse_mapfile (value) == -1) + return -1; + return 0; + } + + else + return next (nxdata, key, value); +} + +#define ddrescue_config_help \ + "ddrescue-mapfile=... Specify ddrescue mapfile to use" + +/* We need this because otherwise the layer below can_write is called + * and that might return true (eg. if the plugin has a pwrite method + * at all), resulting in writes being passed through to the layer + * below. + */ +static int +ddrescue_can_write (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle) +{ + return 0; +} + +static int +ddrescue_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle) +{ + return 0; +} + +/* Read data. */ +static int +ddrescue_pread (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, void *buf, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + int i; + + for (i = 0; i < map.ranges_count; i++) { + if (map.ranges[i].status != '+') + continue; + if (offset >= map.ranges[i].start && offset <= map.ranges[i].end) { + if (offset + count - 1 <= map.ranges[i].end) { + /* entirely contained within this range */ + return next_ops->pread (nxdata, buf, count, offset, flags, err); + } + } + } + /* read was not fully covered */ + *err = EIO; + return -1; +} + +static struct nbdkit_filter filter = { + .name = "ddrescue", + .longname = "nbdkit ddrescue mapfile filter", + .unload = ddrescue_unload, + .config = ddrescue_config, + .config_help = ddrescue_config_help, + .can_write = ddrescue_can_write, + .can_cache = ddrescue_can_cache, + .pread = ddrescue_pread, +}; + +NBDKIT_REGISTER_FILTER(filter) diff --git a/filters/ddrescue/nbdkit-ddrescue-filter.pod b/filters/ddrescue/nbdkit-ddrescue-filter.pod new file mode 100644 index 00000000..8210866b --- /dev/null +++ b/filters/ddrescue/nbdkit-ddrescue-filter.pod @@ -0,0 +1,74 @@ +=head1 NAME + +nbdkit-ddrescue-filter - nbdkit filter for serving from ddrescue dump + +=head1 SYNOPSIS + + nbdkit --filter=ddrescue plugin [plugin-args...] ddrescue-mapfile=file.map + + nbdkit --filter=ddrescue file file=file.img ddrescue-mapfile=file.map [plugin-args...] + +=head1 DESCRIPTION + +C<nbdkit-ddrescue-filter> is a filter for L<nbdkit(1)> which overlays +bad blocks according to a GNU L<ddrescue(1)> mapfile. This is mainly useful +for testing disk images recovered with ddrescue, to detect which files +or filesystem structures are impacted, or attempting fsck on them. + +Note that the current implementation is read-only. + +=head1 EXAMPLES + +=over 4 + +=item Expose a rescued disk image with detected bad sectors: + + nbdkit --filter=ddrescue file file=disk.img ddrescue-mapfile=disk.map + +The above command serves the disk image disk.img and maps the bad +sectors listed in disk.img so that read attempts on them do not return +a valid block full of zeroes. + +=back + +=head1 PARAMETERS + +The C<ddrescue-mapfile> parameter must point to a valid GNU ddrescue +mapfile. + +=head1 DATA FORMAT + +The file pointed to by the C<ddrescue-mapfile> parameter should +conform to the format of a GNU L<ddrescue(1)> mapfile. + +=head1 FILES + +=over 4 + +=item F<$filterdir/nbdkit-ddrescue-filter.so> + +The filter. + +Use C<nbdkit --dump-config> to find the location of C<$filterdir>. + +=back + +=head1 VERSION + +C<nbdkit-ddrescue-filter> first appeared in nbdkit 1.22. + +=head1 SEE ALSO + +L<nbdkit(1)>, +L<nbdkit-file-plugin(1)>, +L<nbdkit-filter(3)>, +L<ddrescue(1)>, +L<https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html>. + +=head1 AUTHORS + +Fran?ois Revol + +=head1 COPYRIGHT + +Copyright (C) 2020 Fran?ois Revol -- 2.26.2
On 5/1/20 2:16 PM, Fran?ois Revol wrote:> This allows to overlay bad sectors according to the mapfile generated by > ddrescue, to then see where sectors are used using fsck and trying to > copy files around. > > Signed-off-by: Fran?ois Revol <revol at free.fr> > ---Focusing on just the docs:> +++ b/filters/ddrescue/nbdkit-ddrescue-filter.pod > @@ -0,0 +1,74 @@ > +=head1 NAME > + > +nbdkit-ddrescue-filter - nbdkit filter for serving from ddrescue dump > + > +=head1 SYNOPSIS > + > + nbdkit --filter=ddrescue plugin [plugin-args...] ddrescue-mapfile=file.map > + > + nbdkit --filter=ddrescue file file=file.img ddrescue-mapfile=file.map [plugin-args...] > + > +=head1 DESCRIPTION > + > +C<nbdkit-ddrescue-filter> is a filter for L<nbdkit(1)> which overlays > +bad blocks according to a GNU L<ddrescue(1)> mapfile. This is mainly useful > +for testing disk images recovered with ddrescue, to detect which files > +or filesystem structures are impacted, or attempting fsck on them.Oh cool, this sounds very similar to the existing extentlist plugin, but with a difference that extentlist uses the input to decide which portions of the file are advertised as sparse, while this one uses which portions of the file cause EIO errors during read. Is it worth trying to combine the two concepts into one filter, or do we want to keep it orthogonal as two separate filters? Can we teach both filters to reuse common code for parsing extent lists in different formats (the extentlist format is the same format documented by the nbdkit-sh-plugin .extents; yours is the ddrescue format, I am also interested in the 'qemu-img map --output=json' format)? In fact, if we introduce a common library for extent list parsing, teaching nbdkit-sh-plugin to accept all formats might be useful. I ask this because one of my ideas for qemu-nbd and the 'qemu:dirty-bitmap:FOO' exposure of a bitmap is to have qemu-nbd report EIO for any part of the file not covered by the dirty bitmap. There is no easy way to emulate that in nbdkit yet (since nbdkit does not understand anything other than the base:allocation context), but this filter plus the nbd plugin would be a way to provide those semantics on top of existing qemu-nbd, if this filter could easily parse qemu-img's extent list.> + > +Note that the current implementation is read-only. > + > +=head1 EXAMPLES > + > +=over 4 > + > +=item Expose a rescued disk image with detected bad sectors: > + > + nbdkit --filter=ddrescue file file=disk.img ddrescue-mapfile=disk.map > + > +The above command serves the disk image disk.img and maps the bad > +sectors listed in disk.img so that read attempts on them do not return > +a valid block full of zeroes.Wording suggestion: and maps the bad sectors listed in disk.img so that read attempts over those portions of the file fail rather than returning a valid block containing garbage. Or even make it configurable (via another .config knob) whether bad blocks read as zero or cause EIO. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2020-May-01 21:14 UTC
Re: [Libguestfs] [PATCH] [v2] WIP: ddrescue mapfile filter
On Fri, May 01, 2020 at 02:41:59PM -0500, Eric Blake wrote:> Oh cool, this sounds very similar to the existing extentlist plugin, > but with a difference that extentlist uses the input to decide which > portions of the file are advertised as sparse, while this one uses > which portions of the file cause EIO errors during read. Is it > worth trying to combine the two concepts into one filter, or do we > want to keep it orthogonal as two separate filters? Can we teach > both filters to reuse common code for parsing extent lists in > different formats (the extentlist format is the same format > documented by the nbdkit-sh-plugin .extents; yours is the ddrescue > format, I am also interested in the 'qemu-img map --output=json' > format)? In fact, if we introduce a common library for extent list > parsing, teaching nbdkit-sh-plugin to accept all formats might be > useful.I think common code for parsing sounds useful (eventually), but the filters are quite different in their effects so fully combining them would be difficult and seems unlikely to achieve much. However ...> Or even make it configurable (via another .config knob) whether bad > blocks read as zero or cause EIO.... this is definitely an interesting idea. I guess some clients will stop at an EIO but would continue if the data read as zeroes (although whether such clients would do the right thing is questionable). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Richard W.M. Jones
2020-May-01 21:16 UTC
Re: [Libguestfs] [PATCH] [v2] WIP: ddrescue mapfile filter
On Fri, May 01, 2020 at 09:16:14PM +0200, François Revol wrote:> +nbdkit_ddrescue_filter_la_LDFLAGS = \ > + -module -avoid-version -shared \ > + -Wl,--version-script=$(top_srcdir)/filters/filters.syms \ > + $(NULL)Still lacking $(SHARED_LDFLAGS) here? The rest seems fine (but see Eric's suggestion about the documentation). We'll have to hold this one until 1.20 has been released, should be soon. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Richard W.M. Jones
2020-May-22 09:45 UTC
Re: [Libguestfs] [PATCH] [v2] WIP: ddrescue mapfile filter
On Fri, May 22, 2020 at 02:13:22AM +0200, François Revol wrote:> Le 01/05/2020 à 23:16, Richard W.M. Jones a écrit : > > On Fri, May 01, 2020 at 09:16:14PM +0200, François Revol wrote: > >> +nbdkit_ddrescue_filter_la_LDFLAGS = \ > >> + -module -avoid-version -shared \ > >> + -Wl,--version-script=$(top_srcdir)/filters/filters.syms \ > >> + $(NULL) > > > > Still lacking $(SHARED_LDFLAGS) here? > > > > The rest seems fine (but see Eric's suggestion about the documentation). > > > > We'll have to hold this one until 1.20 has been released, should > > be soon. > > I see 1.20 is out, I'll send a v3, hopefully it can be merged now. > > Btw, Why doesn't SHARED_LDFLAGS include the -shared? > > Since not all platforms/linkers actually use this flag (BeOS used > -nostart but that was rather hackish since it's a bit different (even > apps where shared objects to allow loading them as replicants in other > apps)). > > So I'd find it better to have this flag in the variable too.This probably needs to be cleaned up, but that's a separate issue. If you duplicate whatever other filters are doing now, then cleaning that up later will be easier. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v