Eric Blake
2019-Apr-24 19:39 UTC
Re: [Libguestfs] [PATCH nbdkit v5 FINAL 15/19] file: Implement extents.
On 3/28/19 11:18 AM, Richard W.M. Jones wrote:> This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and > holes in the underlying file. > --- > plugins/file/file.c | 141 ++++++++++++++++++++++++++++++++++--- > tests/Makefile.am | 5 ++ > tests/test-file-extents.sh | 57 +++++++++++++++ > 3 files changed, 193 insertions(+), 10 deletions(-) >> +#ifdef SEEK_HOLE > +/* Extents. */ > + > +static int > +file_can_extents (void *handle) > +{ > + struct handle *h = handle; > + off_t r; > + > + /* A simple test to see whether SEEK_HOLE etc is likely to work on > + * the current filesystem. > + */ > + pthread_mutex_lock (&lseek_lock); > + r = lseek (h->fd, 0, SEEK_HOLE); > + pthread_mutex_unlock (&lseek_lock); > + if (r == -1) { > + nbdkit_debug ("extents disabled: lseek: SEEK_HOLE: %m"); > + return 0; > + } > + return 1; > +}Should we also return 0 if the lseek() returned the offset of EOF? More concretely, what if we test: r = lseek(h->fd, lseek(h->fd, 0, SEEK_HOLE), SEEK_DATA); If r is non-negative, then we have a sparse file; if the inner SEEK_HOLE failed then the outer lseek(, -1, SEEK_DATA) should fail with EINVAL; while if SEEK_HOLE succeeded but reported the offset of EOF, then SEEK_DATA should fail due to ENXIO and we know the file has no holes at all. In favor of the double check: a completely non-sparse file is just wasting time on further lseek() calls that won't tell us anything different than our default of all data (and which may be helpful for things like tmpfs that have abysmal lseek() performance). Perhaps a reason against: if we expect NBD_CMD_TRIM/NBD_CMD_WRITE_ZEROES or even a third-party to be punching holes in the meantime, then opting out of future lseek() won't see those later additions of holes. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Apr-25 10:37 UTC
Re: [Libguestfs] [PATCH nbdkit v5 FINAL 15/19] file: Implement extents.
On Wed, Apr 24, 2019 at 02:39:03PM -0500, Eric Blake wrote:> On 3/28/19 11:18 AM, Richard W.M. Jones wrote: > > This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and > > holes in the underlying file. > > --- > > plugins/file/file.c | 141 ++++++++++++++++++++++++++++++++++--- > > tests/Makefile.am | 5 ++ > > tests/test-file-extents.sh | 57 +++++++++++++++ > > 3 files changed, 193 insertions(+), 10 deletions(-) > > > > > +#ifdef SEEK_HOLE > > +/* Extents. */ > > + > > +static int > > +file_can_extents (void *handle) > > +{ > > + struct handle *h = handle; > > + off_t r; > > + > > + /* A simple test to see whether SEEK_HOLE etc is likely to work on > > + * the current filesystem. > > + */ > > + pthread_mutex_lock (&lseek_lock); > > + r = lseek (h->fd, 0, SEEK_HOLE); > > + pthread_mutex_unlock (&lseek_lock); > > + if (r == -1) { > > + nbdkit_debug ("extents disabled: lseek: SEEK_HOLE: %m"); > > + return 0; > > + } > > + return 1; > > +} > > Should we also return 0 if the lseek() returned the offset of EOF? More > concretely, what if we test: > r = lseek(h->fd, lseek(h->fd, 0, SEEK_HOLE), SEEK_DATA); > > If r is non-negative, then we have a sparse file; if the inner SEEK_HOLE > failed then the outer lseek(, -1, SEEK_DATA) should fail with EINVAL; > while if SEEK_HOLE succeeded but reported the offset of EOF, then > SEEK_DATA should fail due to ENXIO and we know the file has no holes at > all. In favor of the double check: a completely non-sparse file is just > wasting time on further lseek() calls that won't tell us anything > different than our default of all data (and which may be helpful for > things like tmpfs that have abysmal lseek() performance). Perhaps a > reason against: if we expect NBD_CMD_TRIM/NBD_CMD_WRITE_ZEROES or even a > third-party to be punching holes in the meantime, then opting out of > future lseek() won't see those later additions of holes.This last point seems crucial - the file could become sparse, even if it starts off as non-sparse. The test is designed really to eliminate the case where SEEK_HOLE isn't supported and returns an error. However you rightly point out that there's another case we don't cover: what if we're using tmpfs where SEEK_HOLE is supported but has poor performance? I would say the best solution is to fix tmpfs! But if that's not possible, perhaps we can do {f,}statfs(2) on the file and blacklist certain combinations of f_type and kernel version? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Eric Blake
2019-Apr-25 12:49 UTC
Re: [Libguestfs] [PATCH nbdkit v5 FINAL 15/19] file: Implement extents.
On 4/25/19 5:37 AM, Richard W.M. Jones wrote:> On Wed, Apr 24, 2019 at 02:39:03PM -0500, Eric Blake wrote: >> On 3/28/19 11:18 AM, Richard W.M. Jones wrote: >>> This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and >>> holes in the underlying file.>> Should we also return 0 if the lseek() returned the offset of EOF?>> reason against: if we expect NBD_CMD_TRIM/NBD_CMD_WRITE_ZEROES or even a >> third-party to be punching holes in the meantime, then opting out of >> future lseek() won't see those later additions of holes. > > This last point seems crucial - the file could become sparse, even if > it starts off as non-sparse. > > The test is designed really to eliminate the case where SEEK_HOLE > isn't supported and returns an error.Okay, then let's leave this one as-is.> > However you rightly point out that there's another case we don't > cover: what if we're using tmpfs where SEEK_HOLE is supported but has > poor performance? I would say the best solution is to fix tmpfs! But > if that's not possible, perhaps we can do {f,}statfs(2) on the file > and blacklist certain combinations of f_type and kernel version?Nah, it's easier to just tell the users to apply the noextents filter in that case. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Possibly Parallel Threads
- Re: [PATCH nbdkit v5 FINAL 15/19] file: Implement extents.
- [PATCH nbdkit v5 FINAL 15/19] file: Implement extents.
- [PATCH nbdkit 8/8] file: Implement extents.
- Re: [PATCH nbdkit 8/8] file: Implement extents.
- [nbdkit PATCH 05/10] plugins: Wire up file-based plugin support for NBD_INFO_INIT_STATE