This patchset adds support for per-file DAX to virtiofs, inspired by
Ira Weiny's work on ext4[1] and xfs[2].

There are four related scenarios:

1. Alloc inode: get per-file DAX flag from fuse_attr.flags. (patch 3)

2. Per-file DAX flag changes when the file has been opened. (patch 3)
In this case, the dentry and inode are all marked as DONT_CACHE, and
the DAX state won't be updated until the file is closed and reopened
later.

3. Users can change the per-file DAX flag inside the guest by chattr(1).
(patch 4)

4. Create new files under directories with DAX enabled. When creating
new files in ext4/xfs on the host, the newly created files inherit the
per-file DAX flag from the directory, and thus the newly created files
in virtiofs also inherit the per-file DAX flag if the fuse server
derives fuse_attr.flags from the underlying ext4/xfs inode's per-file
DAX flag.

Any comments are welcome.

[1] commit 9cb20f94afcd ("fs/ext4: Make DAX mount option a tri-state")
[2] commit 02beb2686ff9 ("fs/xfs: Make DAX mount option a tri-state")

changes since v1:
- add support for changing per-file DAX flags inside guest (patch 4)

v1: https://www.spinics.net/lists/linux-virtualization/msg51008.html

Jeffle Xu (4):
  fuse: add fuse_should_enable_dax() helper
  fuse: Make DAX mount option a tri-state
  fuse: add per-file DAX flag
  fuse: support changing per-file DAX flag inside guest

 fs/fuse/dax.c             | 36 ++++++++++++++++++++++++++++++++++--
 fs/fuse/file.c            |  4 ++--
 fs/fuse/fuse_i.h          | 16 ++++++++++++----
 fs/fuse/inode.c           |  7 +++++--
 fs/fuse/ioctl.c           |  9 ++++++---
 fs/fuse/virtio_fs.c       | 16 ++++++++++++++--
 include/uapi/linux/fuse.h |  5 +++++
 7 files changed, 78 insertions(+), 15 deletions(-)

-- 
2.27.0
This is in preparation for the following per-file DAX checking.

Signed-off-by: Jeffle Xu <jefflexu at linux.alibaba.com>
---
 fs/fuse/dax.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 0e5407f48e6a..c6f4e82e65f3 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -1336,11 +1336,19 @@ static const struct address_space_operations fuse_dax_file_aops = {
 	.invalidatepage	= noop_invalidatepage,
 };
 
-void fuse_dax_inode_init(struct inode *inode)
+static bool fuse_should_enable_dax(struct inode *inode)
 {
 	struct fuse_conn *fc = get_fuse_conn(inode);
 
 	if (!fc->dax)
+		return false;
+
+	return true;
+}
+
+void fuse_dax_inode_init(struct inode *inode)
+{
+	if (!fuse_should_enable_dax(inode))
 		return;
 
 	inode->i_flags |= S_DAX;
-- 
2.27.0
Add 'always', 'never', and 'inode' (default) values for the dax mount
option. A bare '-o dax' keeps its existing behavior, which is equivalent
to 'always'.

As of this patch, 'inode' mode behaves the same as 'always' mode; the
per-file DAX flag is only introduced in the following patch.

Signed-off-by: Jeffle Xu <jefflexu at linux.alibaba.com>
---
 fs/fuse/dax.c       | 13 ++++++++++++-
 fs/fuse/fuse_i.h    | 11 +++++++++--
 fs/fuse/inode.c     |  2 +-
 fs/fuse/virtio_fs.c | 16 ++++++++++++++--
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index c6f4e82e65f3..a478e824c2d0 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -70,6 +70,9 @@ struct fuse_inode_dax {
 };
 
 struct fuse_conn_dax {
+	/** dax mode: FUSE_DAX_MOUNT_* (always, never or per-file) **/
+	unsigned int mode;
+
 	/* DAX device */
 	struct dax_device *dev;
 
@@ -1288,7 +1291,8 @@ static int fuse_dax_mem_range_init(struct fuse_conn_dax *fcd)
 	return ret;
 }
 
-int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev)
+int fuse_dax_conn_alloc(struct fuse_conn *fc, unsigned int mode,
+			struct dax_device *dax_dev)
 {
 	struct fuse_conn_dax *fcd;
 	int err;
@@ -1301,6 +1305,7 @@ int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev)
 		return -ENOMEM;
 
 	spin_lock_init(&fcd->lock);
+	fcd->mode = mode;
 	fcd->dev = dax_dev;
 	err = fuse_dax_mem_range_init(fcd);
 	if (err) {
@@ -1339,10 +1344,16 @@ static const struct address_space_operations fuse_dax_file_aops = {
 static bool fuse_should_enable_dax(struct inode *inode)
 {
 	struct fuse_conn *fc = get_fuse_conn(inode);
+	unsigned int mode;
 
 	if (!fc->dax)
 		return false;
 
+	mode = fc->dax->mode;
+
+	if (mode == FUSE_DAX_MOUNT_NEVER)
+		return false;
+
 	return true;
 }
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 07829ce78695..f29018323845 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -487,6 +487,12 @@ struct fuse_dev {
 	struct list_head entry;
 };
 
+enum {
+	FUSE_DAX_MOUNT_INODE,
+	FUSE_DAX_MOUNT_ALWAYS,
+	FUSE_DAX_MOUNT_NEVER,
+};
+
 struct fuse_fs_context {
 	int fd;
 	unsigned int rootmode;
@@ -503,7 +509,7 @@ struct fuse_fs_context {
 	bool no_control:1;
 	bool no_force_umount:1;
 	bool legacy_opts_show:1;
-	bool dax:1;
+	unsigned int dax;
 	unsigned int max_read;
 	unsigned int blksize;
 	const char *subtype;
@@ -1242,7 +1248,8 @@ ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to);
 ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from);
 int fuse_dax_mmap(struct file *file, struct vm_area_struct *vma);
 int fuse_dax_break_layouts(struct inode *inode, u64 dmap_start, u64 dmap_end);
-int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev);
+int fuse_dax_conn_alloc(struct fuse_conn *fc, unsigned int mode,
+			struct dax_device *dax_dev);
 void fuse_dax_conn_free(struct fuse_conn *fc);
 bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi);
 void fuse_dax_inode_init(struct inode *inode);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index b9beb39a4a18..f6b46395edb2 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1434,7 +1434,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
 	sb->s_subtype = ctx->subtype;
 	ctx->subtype = NULL;
 	if (IS_ENABLED(CONFIG_FUSE_DAX)) {
-		err = fuse_dax_conn_alloc(fc, ctx->dax_dev);
+		err = fuse_dax_conn_alloc(fc, ctx->dax, ctx->dax_dev);
 		if (err)
 			goto err;
 	}
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 8f52cdaa8445..561f711d1945 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -88,12 +88,21 @@ struct virtio_fs_req_work {
 static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
 				 struct fuse_req *req, bool in_flight);
 
+static const struct constant_table dax_param_enums[] = {
+	{"inode",	FUSE_DAX_MOUNT_INODE },
+	{"always",	FUSE_DAX_MOUNT_ALWAYS },
+	{"never",	FUSE_DAX_MOUNT_NEVER },
+	{}
+};
+
 enum {
 	OPT_DAX,
+	OPT_DAX_ENUM,
 };
 
 static const struct fs_parameter_spec virtio_fs_parameters[] = {
 	fsparam_flag("dax", OPT_DAX),
+	fsparam_enum("dax", OPT_DAX_ENUM, dax_param_enums),
 	{}
 };
 
@@ -110,7 +119,10 @@ static int virtio_fs_parse_param(struct fs_context *fc,
 
 	switch (opt) {
 	case OPT_DAX:
-		ctx->dax = 1;
+		ctx->dax = FUSE_DAX_MOUNT_ALWAYS;
+		break;
+	case OPT_DAX_ENUM:
+		ctx->dax = result.uint_32;
 		break;
 	default:
 		return -EINVAL;
@@ -1326,7 +1338,7 @@ static int virtio_fs_fill_super(struct super_block *sb, struct fs_context *fsc)
 
 	/* virtiofs allocates and installs its own fuse devices */
 	ctx->fudptr = NULL;
-	if (ctx->dax) {
+	if (ctx->dax != FUSE_DAX_MOUNT_NEVER) {
 		if (!fs->dax_dev) {
 			err = -EINVAL;
 			pr_err("virtio-fs: dax can't be enabled as filesystem"
-- 
2.27.0
Add a flag to fuse_attr.flags indicating whether DAX shall be enabled
for this file.

When the per-file DAX flag changes for an *opened* file, the DAX state
of the file won't be updated until the file is closed and reopened
later.

Signed-off-by: Jeffle Xu <jefflexu at linux.alibaba.com>
---
 fs/fuse/dax.c             | 21 +++++++++++++++++----
 fs/fuse/file.c            |  4 ++--
 fs/fuse/fuse_i.h          |  5 +++--
 fs/fuse/inode.c           |  5 ++++-
 include/uapi/linux/fuse.h |  5 +++++
 5 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index a478e824c2d0..0e862119757a 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -1341,7 +1341,7 @@ static const struct address_space_operations fuse_dax_file_aops = {
 	.invalidatepage	= noop_invalidatepage,
 };
 
-static bool fuse_should_enable_dax(struct inode *inode)
+static bool fuse_should_enable_dax(struct inode *inode, unsigned int flags)
 {
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	unsigned int mode;
@@ -1354,18 +1354,31 @@ static bool fuse_should_enable_dax(struct inode *inode)
 	if (mode == FUSE_DAX_MOUNT_NEVER)
 		return false;
 
-	return true;
+	if (mode == FUSE_DAX_MOUNT_ALWAYS)
+		return true;
+
+	WARN_ON(mode != FUSE_DAX_MOUNT_INODE);
+	return flags & FUSE_ATTR_DAX;
 }
 
-void fuse_dax_inode_init(struct inode *inode)
+void fuse_dax_inode_init(struct inode *inode, unsigned int flags)
 {
-	if (!fuse_should_enable_dax(inode))
+	if (!fuse_should_enable_dax(inode, flags))
 		return;
 
 	inode->i_flags |= S_DAX;
 	inode->i_data.a_ops = &fuse_dax_file_aops;
 }
 
+void fuse_dax_dontcache(struct inode *inode, bool newdax)
+{
+	struct fuse_conn *fc = get_fuse_conn(inode);
+
+	if (fc->dax && fc->dax->mode == FUSE_DAX_MOUNT_INODE &&
+	    IS_DAX(inode) != newdax)
+		d_mark_dontcache(inode);
+}
+
 bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment)
 {
 	if (fc->dax && (map_alignment > FUSE_DAX_SHIFT)) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 97f860cfc195..cf42af492146 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3142,7 +3142,7 @@ static const struct address_space_operations fuse_file_aops = {
 	.write_end	= fuse_write_end,
 };
 
-void fuse_init_file_inode(struct inode *inode)
+void fuse_init_file_inode(struct inode *inode, struct fuse_attr *attr)
 {
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
@@ -3156,5 +3156,5 @@ void fuse_init_file_inode(struct inode *inode)
 	fi->writepages = RB_ROOT;
 
 	if (IS_ENABLED(CONFIG_FUSE_DAX))
-		fuse_dax_inode_init(inode);
+		fuse_dax_inode_init(inode, attr->flags);
 }
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f29018323845..0793b93d680a 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1000,7 +1000,7 @@ int fuse_notify_poll_wakeup(struct fuse_conn *fc,
 /**
  * Initialize file operations on a regular file
  */
-void fuse_init_file_inode(struct inode *inode);
+void fuse_init_file_inode(struct inode *inode, struct fuse_attr *attr);
 
 /**
  * Initialize inode operations on regular files and special files
@@ -1252,8 +1252,9 @@ int fuse_dax_conn_alloc(struct fuse_conn *fc, unsigned int mode,
 			struct dax_device *dax_dev);
 void fuse_dax_conn_free(struct fuse_conn *fc);
 bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi);
-void fuse_dax_inode_init(struct inode *inode);
+void fuse_dax_inode_init(struct inode *inode, unsigned int flags);
 void fuse_dax_inode_cleanup(struct inode *inode);
+void fuse_dax_dontcache(struct inode *inode, bool newdax);
 bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment);
 void fuse_dax_cancel_work(struct fuse_conn *fc);
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index f6b46395edb2..2ae92798126e 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -269,6 +269,9 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
 		if (inval)
 			invalidate_inode_pages2(inode->i_mapping);
 	}
+
+	if (IS_ENABLED(CONFIG_FUSE_DAX))
+		fuse_dax_dontcache(inode, attr->flags & FUSE_ATTR_DAX);
 }
 
 static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr)
@@ -281,7 +284,7 @@ static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr)
 	inode->i_ctime.tv_nsec = attr->ctimensec;
 	if (S_ISREG(inode->i_mode)) {
 		fuse_init_common(inode);
-		fuse_init_file_inode(inode);
+		fuse_init_file_inode(inode, attr);
 	} else if (S_ISDIR(inode->i_mode))
 		fuse_init_dir(inode);
 	else if (S_ISLNK(inode->i_mode))
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 36ed092227fa..90c9df10d37a 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -184,6 +184,9 @@
  *
  *  7.34
  *  - add FUSE_SYNCFS
+ *
+ *  7.35
+ *  - add FUSE_ATTR_DAX
  */
 
 #ifndef _LINUX_FUSE_H
@@ -449,8 +452,10 @@ struct fuse_file_lock {
  * fuse_attr flags
  *
  * FUSE_ATTR_SUBMOUNT: Object is a submount root
+ * FUSE_ATTR_DAX: Enable DAX for this file in per-file DAX mode
  */
 #define FUSE_ATTR_SUBMOUNT      (1 << 0)
+#define FUSE_ATTR_DAX		(1 << 1)
 
 /**
  * Open flags
-- 
2.27.0
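The decision logic that patches 1 to 3 build up can be modelled outside the kernel. The following is only a minimal userspace sketch, not the kernel code: the enum and macro names are hypothetical stand-ins for FUSE_DAX_MOUNT_* and FUSE_ATTR_*, `have_dax_dev` stands in for the `fc->dax` check, and `assert()` takes the place of `WARN_ON()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the FUSE_DAX_MOUNT_* modes. */
enum dax_mount_mode {
	DAX_MOUNT_INODE,	/* default: follow the per-file flag */
	DAX_MOUNT_ALWAYS,
	DAX_MOUNT_NEVER,
};

/* Hypothetical stand-ins for the fuse_attr.flags bits used in patch 3. */
#define ATTR_SUBMOUNT	(1u << 0)
#define ATTR_DAX	(1u << 1)

/* Model of fuse_should_enable_dax(): decided once, when the inode is set up. */
static bool should_enable_dax(bool have_dax_dev, enum dax_mount_mode mode,
			      unsigned int attr_flags)
{
	if (!have_dax_dev)
		return false;
	if (mode == DAX_MOUNT_NEVER)
		return false;
	if (mode == DAX_MOUNT_ALWAYS)
		return true;

	assert(mode == DAX_MOUNT_INODE);
	return (attr_flags & ATTR_DAX) != 0;
}

/*
 * Model of fuse_dax_dontcache(): the inode is only marked don't-cache in
 * per-file mode, and only when the requested state differs from the
 * current one. Both states are plain bools here, so the comparison is a
 * pure truth-value comparison.
 */
static bool should_mark_dontcache(bool have_dax_dev, enum dax_mount_mode mode,
				  bool is_dax_now, bool newdax)
{
	return have_dax_dev && mode == DAX_MOUNT_INODE && is_dax_now != newdax;
}
```

In this model the per-file flag only matters in 'inode' mode; 'always' and 'never' ignore it, which matches the ordering of the checks in the patch.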
Jeffle Xu
2021-Jul-16 10:47 UTC
[PATCH v2 4/4] fuse: support changing per-file DAX flag inside guest
The fuse client can enable or disable per-file DAX inside the guest by
chattr(1). Similarly, the new state won't take effect until the file is
closed and reopened later.

It is worth noting that this is best-effort, since whether per-file DAX
is enabled is controlled by the fuse_attr.flags retrieved by the FUSE
LOOKUP routine, while the algorithm constructing fuse_attr.flags is
entirely fuse-server specific, not to mention that the ioctl may not be
supported by the fuse server at all.

Signed-off-by: Jeffle Xu <jefflexu at linux.alibaba.com>
---
 fs/fuse/ioctl.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
index 546ea3d58fb4..172e05c3f038 100644
--- a/fs/fuse/ioctl.c
+++ b/fs/fuse/ioctl.c
@@ -460,6 +460,7 @@ int fuse_fileattr_set(struct user_namespace *mnt_userns,
 	struct fuse_file *ff;
 	unsigned int flags = fa->flags;
 	struct fsxattr xfa;
+	bool newdax;
 	int err;
 
 	ff = fuse_priv_ioctl_prepare(inode);
@@ -467,10 +468,9 @@ int fuse_fileattr_set(struct user_namespace *mnt_userns,
 		return PTR_ERR(ff);
 
 	if (fa->flags_valid) {
+		newdax = flags & FS_DAX_FL;
 		err = fuse_priv_ioctl(inode, ff, FS_IOC_SETFLAGS,
 				      &flags, sizeof(flags));
-		if (err)
-			goto cleanup;
 	} else {
 		memset(&xfa, 0, sizeof(xfa));
 		xfa.fsx_xflags = fa->fsx_xflags;
@@ -479,11 +479,14 @@ int fuse_fileattr_set(struct user_namespace *mnt_userns,
 		xfa.fsx_projid = fa->fsx_projid;
 		xfa.fsx_cowextsize = fa->fsx_cowextsize;
 
+		newdax = fa->fsx_xflags & FS_XFLAG_DAX;
 		err = fuse_priv_ioctl(inode, ff, FS_IOC_FSSETXATTR,
 				      &xfa, sizeof(xfa));
 	}
 
-cleanup:
+	if (!err && IS_ENABLED(CONFIG_FUSE_DAX))
+		fuse_dax_dontcache(inode, newdax);
+
 	fuse_priv_ioctl_cleanup(inode, ff);
 
 	return err;
-- 
2.27.0
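For reference, the guest-side request that ends up in fuse_fileattr_set() can be issued directly with the FS_IOC_GETFLAGS / FS_IOC_SETFLAGS ioctls. The following is a minimal userspace sketch under stated assumptions, not part of the patch: `apply_dax_flag()` and `set_file_dax()` are hypothetical helper names, and the fallback FS_DAX_FL value is only there for older uapi headers that lack the definition.

```c
#include <fcntl.h>
#include <linux/fs.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef FS_DAX_FL
#define FS_DAX_FL 0x02000000	/* fallback for older uapi headers */
#endif

/* Hypothetical helper: set or clear the DAX bit, leaving other flags alone. */
static unsigned int apply_dax_flag(unsigned int flags, bool enable)
{
	return enable ? (flags | FS_DAX_FL) : (flags & ~FS_DAX_FL);
}

/*
 * Hypothetical helper: read-modify-write the inode flags of 'path'.
 * Returns 0 on success and -1 on failure (errno is set by the failing call).
 */
static int set_file_dax(const char *path, bool enable)
{
	int flags = 0;
	int ret = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0) {
		flags = (int)apply_dax_flag((unsigned int)flags, enable);
		if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0)
			ret = 0;
	}

	close(fd);
	return ret;
}
```

A successful return only means the server accepted the ioctl. As the commit message says, whether the file actually uses DAX after being closed and reopened still depends on how the server builds fuse_attr.flags.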
On Fri, Jul 16, 2021 at 06:47:49PM +0800, Jeffle Xu wrote:
> This patchset adds support of per-file DAX for virtiofs, which is
> inspired by Ira Weiny's work on ext4[1] and xfs[2].
>
> There are four related scenarios:
> 1. Alloc inode: get per-file DAX flag from fuse_attr.flags. (patch 3)
> 2. Per-file DAX flag changes when the file has been opened. (patch 3)
> In this case, the dentry and inode are all marked as DONT_CACHE, and
> the DAX state won't be updated until the file is closed and reopened
> later.
> 3. Users can change the per-file DAX flag inside the guest by chattr(1).
> (patch 4)
> 4. Create new files under directories with DAX enabled. When creating
> new files in ext4/xfs on host, the new created files will inherit the
> per-file DAX flag from the directory, and thus the new created files in
> virtiofs will also inherit the per-file DAX flag if the fuse server
> derives fuse_attr.flags from the underlying ext4/xfs inode's per-file
> DAX flag.

Thinking a little bit more about this from a requirements perspective, I
think we are trying to address two use cases here.

A. The client does not know which files DAX should be used on. Only the
   server knows it, and the server passes this information to the
   client. I suspect that's your primary use case.

B. The client is driving which files are supposed to be using DAX. This
   is exactly the same as the model ext4/xfs are using, by storing a
   persistent flag on the inode.

The current patches seem to be a hybrid of approaches A and B.

If we were to implement B, then the fuse client probably needs to have
the capability to query FS_XFLAG_DAX on the inode and decide whether to
enable DAX or not (without an extra round trip), or to know it
indirectly by extending GETATTR and requesting this explicitly.

If we were only implementing A, then the server does not have a way to
tell the client to enable DAX. The server can either look at
FS_XFLAG_DAX and decide to enable DAX, or use some other property. Given
that querying FS_XFLAG_DAX will be an extra ioctl() on every inode
lookup/getattr, it will probably be a server option. But enabling it on
the server does not mean the client will enable it.

I think my primary concern with this patch right now is figuring out
which requirement we are trying to cater to first, and how to connect
the server and client well so that they both understand what mode they
are operating in and interact well.

Vivek

>
>
> Any comment is welcome.
>
> [1] commit 9cb20f94afcd ("fs/ext4: Make DAX mount option a tri-state")
> [2] commit 02beb2686ff9 ("fs/xfs: Make DAX mount option a tri-state")
>
>
> changes since v1:
> - add support for changing per-file DAX flags inside guest (patch 4)
>
> v1:https://www.spinics.net/lists/linux-virtualization/msg51008.html
>
> Jeffle Xu (4):
>   fuse: add fuse_should_enable_dax() helper
>   fuse: Make DAX mount option a tri-state
>   fuse: add per-file DAX flag
>   fuse: support changing per-file DAX flag inside guest
>
>  fs/fuse/dax.c             | 36 ++++++++++++++++++++++++++++++++++--
>  fs/fuse/file.c            |  4 ++--
>  fs/fuse/fuse_i.h          | 16 ++++++++++++----
>  fs/fuse/inode.c           |  7 +++++--
>  fs/fuse/ioctl.c           |  9 ++++++---
>  fs/fuse/virtio_fs.c       | 16 ++++++++++++++--
>  include/uapi/linux/fuse.h |  5 +++++
>  7 files changed, 78 insertions(+), 15 deletions(-)
>
> --
> 2.27.0
>
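The server-side option discussed in the reply (derive the attr flag from FS_XFLAG_DAX, at the cost of one extra ioctl per lookup/getattr) might look roughly like the sketch below. This is purely illustrative and assumes a passthrough-style server holding an open fd for the host file: `attr_flags_from_xflags()`, `attr_flags_for_fd()` and the `server_dax_enabled` switch are hypothetical names, and no existing fuse server is claimed to work this way.

```c
#include <linux/fs.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000	/* fallback for older uapi headers */
#endif

#ifndef FUSE_ATTR_DAX
#define FUSE_ATTR_DAX (1 << 1)	/* value proposed in patch 3 of this series */
#endif

/* Hypothetical policy: mirror the host inode's DAX xflag when opted in. */
static uint32_t attr_flags_from_xflags(uint32_t xflags, int server_dax_enabled)
{
	if (!server_dax_enabled)
		return 0;
	return (xflags & FS_XFLAG_DAX) ? FUSE_ATTR_DAX : 0;
}

/*
 * Hypothetical lookup/getattr helper. The ioctl is skipped entirely when
 * the server option is off, and any ioctl failure is treated as "no DAX".
 */
static uint32_t attr_flags_for_fd(int fd, int server_dax_enabled)
{
	struct fsxattr xfa;

	if (!server_dax_enabled)
		return 0;

	memset(&xfa, 0, sizeof(xfa));
	if (ioctl(fd, FS_IOC_FSGETXATTR, &xfa) < 0)
		return 0;

	return attr_flags_from_xflags(xfa.fsx_xflags, 1);
}
```

Keeping the policy in a pure function separate from the ioctl makes it easy to swap in "some other property" on the server side without touching the lookup path.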