Richard W.M. Jones
2021-May-12 10:37 UTC
[Libguestfs] [PATCH v2] fuse: Allow fallocate(FALLOC_FL_ZERO_RANGE)
libnbd's nbdfuse utility would like to translate fallocate zero requests into NBD_CMD_WRITE_ZEROES. Currently the fuse module filters these out, returning -EOPNOTSUPP. This commit treats these almost the same way as FALLOC_FL_PUNCH_HOLE except not calling truncate_pagecache_range. A way to test this is with the following script: -------------------- #!/bin/bash # Requires fuse >= 3, nbdkit >= 1.8, and latest nbdfuse from # https://gitlab.com/nbdkit/libnbd/-/tree/master/fuse set -e set -x export output=$PWD/output rm -f test.img $output # Create an nbdkit instance that prints the NBD requests seen. nbdkit sh - <<'EOF' case "$1" in get_size) echo 1M ;; can_write|can_trim|can_zero|can_fast_zero) ;; pread) echo "$@" >>$output; dd if=/dev/zero count=$3 iflag=count_bytes ;; pwrite) echo "$@" >>$output; cat >/dev/null ;; trim|zero) echo "$@" >>$output ;; *) exit 2 ;; esac EOF # Fuse-mount NBD instance as a file. touch test.img nbdfuse test.img nbd://localhost & sleep 2 ls -lh test.img # Run a read, write, trim and zero request. dd if=test.img of=/dev/null bs=512 skip=1024 count=1 dd if=/dev/zero of=test.img bs=512 skip=2048 count=1 fallocate -p -l 512 -o 4096 test.img fallocate -z -l 512 -o 8192 test.img # Print the output from the NBD server. cat $output # Clean up. fusermount3 -u test.img killall nbdkit rm test.img $output -------------------- which will print: pread 4096 524288 # number depends on readahead pwrite 512 0 trim 512 4096 zero 512 8192 may_trim The last line indicates that the FALLOC_FL_ZERO_RANGE request was successfully passed through by the kernel module to nbdfuse, translated to NBD_CMD_WRITE_ZEROES and sent through to the server. Signed-off-by: Richard W.M. Jones <rjones at redhat.com> --- fs/fuse/file.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 09ef2a4d25ed..22e8e88c78d4 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2907,11 +2907,13 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset, }; int err; bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) || - (mode & FALLOC_FL_PUNCH_HOLE); + (mode & FALLOC_FL_PUNCH_HOLE) || + (mode & FALLOC_FL_ZERO_RANGE); bool block_faults = FUSE_IS_DAX(inode) && lock_inode; - if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) + if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | + FALLOC_FL_ZERO_RANGE)) return -EOPNOTSUPP; if (fm->fc->no_fallocate) @@ -2926,7 +2928,8 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset, goto out; } - if (mode & FALLOC_FL_PUNCH_HOLE) { + if ((mode & FALLOC_FL_PUNCH_HOLE) || + (mode & FALLOC_FL_ZERO_RANGE)) { loff_t endbyte = offset + length - 1; err = fuse_writeback_range(inode, offset, endbyte); -- 2.31.1
Shachar Sharon
2021-May-12 14:27 UTC
[Libguestfs] [PATCH v2] fuse: Allow fallocate(FALLOC_FL_ZERO_RANGE)
On Wed, May 12, 2021 at 11:37:04AM +0100, Richard W.M. Jones wrote:>libnbd's nbdfuse utility would like to translate fallocate zero >requests into NBD_CMD_WRITE_ZEROES. Currently the fuse module filters >these out, returning -EOPNOTSUPP. This commit treats these almost the >same way as FALLOC_FL_PUNCH_HOLE except not calling >truncate_pagecache_range. >Why don't you call 'truncate_pagecache_range' ?>A way to test this is with the following script: > >-------------------- > #!/bin/bash > # Requires fuse >= 3, nbdkit >= 1.8, and latest nbdfuse from > # https://gitlab.com/nbdkit/libnbd/-/tree/master/fuse > set -e > set -x > > export output=$PWD/output > rm -f test.img $output > > # Create an nbdkit instance that prints the NBD requests seen. > nbdkit sh - <<'EOF' > case "$1" in > get_size) echo 1M ;; > can_write|can_trim|can_zero|can_fast_zero) ;; > pread) echo "$@" >>$output; dd if=/dev/zero count=$3 iflag=count_bytes ;; > pwrite) echo "$@" >>$output; cat >/dev/null ;; > trim|zero) echo "$@" >>$output ;; > *) exit 2 ;; > esac > EOF > > # Fuse-mount NBD instance as a file. > touch test.img > nbdfuse test.img nbd://localhost & sleep 2 > ls -lh test.img > > # Run a read, write, trim and zero request. > dd if=test.img of=/dev/null bs=512 skip=1024 count=1 > dd if=/dev/zero of=test.img bs=512 skip=2048 count=1 > fallocate -p -l 512 -o 4096 test.img > fallocate -z -l 512 -o 8192 test.img > > # Print the output from the NBD server. > cat $output > > # Clean up. > fusermount3 -u test.img > killall nbdkit > rm test.img $output > -------------------- > >which will print: > > pread 4096 524288 # number depends on readahead > pwrite 512 0 > trim 512 4096 > zero 512 8192 may_trim > >The last line indicates that the FALLOC_FL_ZERO_RANGE request was >successfully passed through by the kernel module to nbdfuse, >translated to NBD_CMD_WRITE_ZEROES and sent through to the server. > >Signed-off-by: Richard W.M. Jones <rjones at redhat.com> >--- > fs/fuse/file.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > >diff --git a/fs/fuse/file.c b/fs/fuse/file.c >index 09ef2a4d25ed..22e8e88c78d4 100644 >--- a/fs/fuse/file.c >+++ b/fs/fuse/file.c >@@ -2907,11 +2907,13 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset, > }; > int err; > bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) || >- (mode & FALLOC_FL_PUNCH_HOLE); >+ (mode & FALLOC_FL_PUNCH_HOLE) || >+ (mode & FALLOC_FL_ZERO_RANGE);To stay aligned with existing code style, consider: - (mode & FALLOC_FL_PUNCH_HOLE); +? ? ? (mode & (FALLOC_FL_PUNCH_HOLE | +? ? ? ? FALLOC_FL_ZERO_RANGE));> > bool block_faults = FUSE_IS_DAX(inode) && lock_inode; > >- if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) >+ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | >+ FALLOC_FL_ZERO_RANGE)) > return -EOPNOTSUPP; > > if (fm->fc->no_fallocate) >@@ -2926,7 +2928,8 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset, > goto out; > } > >- if (mode & FALLOC_FL_PUNCH_HOLE) { >+ if ((mode & FALLOC_FL_PUNCH_HOLE) || >+ (mode & FALLOC_FL_ZERO_RANGE)) { > loff_t endbyte = offset + length - 1; > > err = fuse_writeback_range(inode, offset, endbyte); >-- >2.31.1 >
Richard W.M. Jones
2021-May-12 14:49 UTC
[Libguestfs] [PATCH v2] fuse: Allow fallocate(FALLOC_FL_ZERO_RANGE)
On Wed, May 12, 2021 at 05:27:22PM +0300, Shachar Sharon wrote:> On Wed, May 12, 2021 at 11:37:04AM +0100, Richard W.M. Jones wrote: > >libnbd's nbdfuse utility would like to translate fallocate zero > >requests into NBD_CMD_WRITE_ZEROES. Currently the fuse module filters > >these out, returning -EOPNOTSUPP. This commit treats these almost the > >same way as FALLOC_FL_PUNCH_HOLE except not calling > >truncate_pagecache_range. > > > Why don't you call 'truncate_pagecache_range' ?Very good point. I just assumed that it would only be useful when hole-punching, but now I actually read the description of the function I see we need it. Also looking at other filesystems that also support FALLOC_FL_ZERO_RANGE: ext4_zero_range -> calls truncate_pagecache_range f2fs_zero_range -> calls it xfs -> calls it indirectly btrfs_zero_range -> does not call it (?) I'll add this, and retest everything.> >A way to test this is with the following script:In my next version I'll also address this script which is rather long-winded. I think there's an easier way for people to test this:> >-------------------- > > #!/bin/bash > > # Requires fuse >= 3, nbdkit >= 1.8, and latest nbdfuse from > > # https://gitlab.com/nbdkit/libnbd/-/tree/master/fuse > > set -e > > set -x > > > > export output=$PWD/output > > rm -f test.img $output > > > > # Create an nbdkit instance that prints the NBD requests seen. > > nbdkit sh - <<'EOF' > > case "$1" in > > get_size) echo 1M ;; > > can_write|can_trim|can_zero|can_fast_zero) ;; > > pread) echo "$@" >>$output; dd if=/dev/zero count=$3 iflag=count_bytes ;; > > pwrite) echo "$@" >>$output; cat >/dev/null ;; > > trim|zero) echo "$@" >>$output ;; > > *) exit 2 ;; > > esac[etc]> >diff --git a/fs/fuse/file.c b/fs/fuse/file.c > >index 09ef2a4d25ed..22e8e88c78d4 100644 > >--- a/fs/fuse/file.c > >+++ b/fs/fuse/file.c > >@@ -2907,11 +2907,13 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset, > > }; > > int err; > > bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) || > >- (mode & FALLOC_FL_PUNCH_HOLE); > >+ (mode & FALLOC_FL_PUNCH_HOLE) || > >+ (mode & FALLOC_FL_ZERO_RANGE); > To stay aligned with existing code style, consider: > - (mode & FALLOC_FL_PUNCH_HOLE); > +? ? ? (mode & (FALLOC_FL_PUNCH_HOLE | > +? ? ? ? FALLOC_FL_ZERO_RANGE));Good idea. Thanks for the quick review. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org