Michael S. Tsirkin
2019-Sep-03 13:55 UTC
[PATCH v4 15/16] virtio-fs: add virtiofs filesystem
On Tue, Sep 03, 2019 at 01:42:02PM +0200, Miklos Szeredi wrote:> From: Stefan Hajnoczi <stefanha at redhat.com> > > Add a basic file system module for virtio-fs. This does not yet contain > shared data support between host and guest or metadata coherency speedups. > However it is already significantly faster than virtio-9p. > > Design Overview > ==============> > With the goal of designing something with better performance and local file > system semantics, a bunch of ideas were proposed. > > - Use fuse protocol (instead of 9p) for communication between guest and > host. Guest kernel will be fuse client and a fuse server will run on > host to serve the requests. > > - For data access inside guest, mmap portion of file in QEMU address space > and guest accesses this memory using dax. That way guest page cache is > bypassed and there is only one copy of data (on host). This will also > enable mmap(MAP_SHARED) between guests. > > - For metadata coherency, there is a shared memory region which contains > version number associated with metadata and any guest changing metadata > updates version number and other guests refresh metadata on next access. > This is yet to be implemented. > > How virtio-fs differs from existing approaches > =============================================> > The unique idea behind virtio-fs is to take advantage of the co-location of > the virtual machine and hypervisor to avoid communication (vmexits). > > DAX allows file contents to be accessed without communication with the > hypervisor. The shared memory region for metadata avoids communication in > the common case where metadata is unchanged. > > By replacing expensive communication with cheaper shared memory accesses, > we expect to achieve better performance than approaches based on network > file system protocols. In addition, this also makes it easier to achieve > local file system semantics (coherency). > > These techniques are not applicable to network file system protocols since > the communications channel is bypassed by taking advantage of shared memory > on a local machine. This is why we decided to build virtio-fs rather than > focus on 9P or NFS. > > Caching Modes > ============> > Like virtio-9p, different caching modes are supported which determine the > coherency level as well. The ?cache=FOO? and ?writeback? options control > the level of coherence between the guest and host filesystems. > > - cache=none > metadata, data and pathname lookup are not cached in guest. They are > always fetched from host and any changes are immediately pushed to host. > > - cache=always > metadata, data and pathname lookup are cached in guest and never expire. > > - cache=auto > metadata and pathname lookup cache expires after a configured amount of > time (default is 1 second). Data is cached while the file is open > (close to open consistency). > > - writeback/no_writeback > These options control the writeback strategy. If writeback is disabled, > then normal writes will immediately be synchronized with the host fs. > If writeback is enabled, then writes may be cached in the guest until > the file is closed or an fsync(2) performed. This option has no effect > on mmap-ed writes or writes going through the DAX mechanism. > > Signed-off-by: Stefan Hajnoczi <stefanha at redhat.com> > Signed-off-by: Vivek Goyal <vgoyal at redhat.com> > Signed-off-by: Miklos Szeredi <mszeredi at redhat.com>What's with all of the TODOs? Some of these are really scary, looks like they need to be figured out before this is merged. Endian-ness for fuse header also looks wrong.> --- > fs/fuse/Kconfig | 11 + > fs/fuse/Makefile | 1 + > fs/fuse/fuse_i.h | 5 + > fs/fuse/virtio_fs.c | 1072 +++++++++++++++++++++++++++++++ > include/uapi/linux/virtio_fs.h | 19 + > include/uapi/linux/virtio_ids.h | 1 + > 6 files changed, 1109 insertions(+) > create mode 100644 fs/fuse/virtio_fs.c > create mode 100644 include/uapi/linux/virtio_fs.h > > diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig > index 24fc5a5c1b97..0635cba19971 100644 > --- a/fs/fuse/Kconfig > +++ b/fs/fuse/Kconfig > @@ -27,3 +27,14 @@ config CUSE > > If you want to develop or use a userspace character device > based on CUSE, answer Y or M. > + > +config VIRTIO_FS > + tristate "Virtio Filesystem" > + depends on FUSE_FS > + select VIRTIO > + help > + The Virtio Filesystem allows guests to mount file systems from the > + host. > + > + If you want to share files between guests or with the host, answer Y > + or M. > diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile > index 9485019c2a14..6419a2b3510d 100644 > --- a/fs/fuse/Makefile > +++ b/fs/fuse/Makefile > @@ -5,5 +5,6 @@ > > obj-$(CONFIG_FUSE_FS) += fuse.o > obj-$(CONFIG_CUSE) += cuse.o > +obj-$(CONFIG_VIRTIO_FS) += virtio_fs.o > > fuse-objs := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > index dbf73e5d5b38..85e2dcad68c1 100644 > --- a/fs/fuse/fuse_i.h > +++ b/fs/fuse/fuse_i.h > @@ -444,6 +444,11 @@ struct fuse_req { > > /** Request is stolen from fuse_file->reserved_req */ > struct file *stolen_file; > + > +#if IS_ENABLED(CONFIG_VIRTIO_FS) > + /** virtio-fs's physically contiguous buffer for in and out args */ > + void *argbuf; > +#endif > }; > > struct fuse_iqueue; > diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c > new file mode 100644 > index 000000000000..197e79e536f9 > --- /dev/null > +++ b/fs/fuse/virtio_fs.c > @@ -0,0 +1,1072 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * virtio-fs: Virtio Filesystem > + * Copyright (C) 2018 Red Hat, Inc. > + */ > + > +#include <linux/fs.h> > +#include <linux/module.h> > +#include <linux/virtio.h> > +#include <linux/virtio_fs.h> > +#include <linux/delay.h> > +#include <linux/fs_context.h> > +#include <linux/highmem.h> > +#include "fuse_i.h" > + > +/* List of virtio-fs device instances and a lock for the list */ > +static DEFINE_MUTEX(virtio_fs_mutex); > +static LIST_HEAD(virtio_fs_instances); > + > +enum { > + VQ_HIPRIO, > + VQ_REQUEST > +}; > + > +/* Per-virtqueue state */ > +struct virtio_fs_vq { > + spinlock_t lock; > + struct virtqueue *vq; /* protected by ->lock */ > + struct work_struct done_work; > + struct list_head queued_reqs; > + struct delayed_work dispatch_work; > + struct fuse_dev *fud; > + bool connected; > + long in_flight; > + char name[24];I'd keep names somewhere separate as they are not used on data path.> +} ____cacheline_aligned_in_smp; > + > +/* A virtio-fs device instance */ > +struct virtio_fs { > + struct list_head list; /* on virtio_fs_instances */ > + char *tag; > + struct virtio_fs_vq *vqs; > + unsigned int nvqs; /* number of virtqueues */ > + unsigned int num_queues; /* number of request queues */ > +}; > + > +struct virtio_fs_forget { > + struct fuse_in_header ih; > + struct fuse_forget_in arg;These structures are all native endian. Passing them to host will make cross-endian setups painful to support, and hardware implementations impossible. How about converting everything to LE?> + /* This request can be temporarily queued on virt queue */ > + struct list_head list; > +}; > + > +static inline struct virtio_fs_vq *vq_to_fsvq(struct virtqueue *vq) > +{ > + struct virtio_fs *fs = vq->vdev->priv; > + > + return &fs->vqs[vq->index]; > +} > + > +static inline struct fuse_pqueue *vq_to_fpq(struct virtqueue *vq) > +{ > + return &vq_to_fsvq(vq)->fud->pq; > +} > + > +/* Add a new instance to the list or return -EEXIST if tag name exists*/ > +static int virtio_fs_add_instance(struct virtio_fs *fs) > +{ > + struct virtio_fs *fs2; > + bool duplicate = false; > + > + mutex_lock(&virtio_fs_mutex); > + > + list_for_each_entry(fs2, &virtio_fs_instances, list) { > + if (strcmp(fs->tag, fs2->tag) == 0) > + duplicate = true; > + } > + > + if (!duplicate) > + list_add_tail(&fs->list, &virtio_fs_instances);This is O(N^2) as it's presumably called for each istance. Doesn't scale - please switch to a tree or such.> + > + mutex_unlock(&virtio_fs_mutex); > + > + if (duplicate) > + return -EEXIST; > + return 0; > +} > + > +/* Return the virtio_fs with a given tag, or NULL */ > +static struct virtio_fs *virtio_fs_find_instance(const char *tag) > +{ > + struct virtio_fs *fs; > + > + mutex_lock(&virtio_fs_mutex); > + > + list_for_each_entry(fs, &virtio_fs_instances, list) { > + if (strcmp(fs->tag, tag) == 0) > + goto found; > + } > + > + fs = NULL; /* not found */ > + > +found: > + mutex_unlock(&virtio_fs_mutex); > + > + return fs; > +} > + > +static void virtio_fs_free_devs(struct virtio_fs *fs) > +{ > + unsigned int i; > + > + /* TODO lock */Doesn't inspire confidence, does it?> + > + for (i = 0; i < fs->nvqs; i++) { > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > + > + if (!fsvq->fud) > + continue; > + > + flush_work(&fsvq->done_work); > + flush_delayed_work(&fsvq->dispatch_work); > + > + /* TODO need to quiesce/end_requests/decrement dev_count */Indeed. Won't this crash if we don't?> + fuse_dev_free(fsvq->fud); > + fsvq->fud = NULL; > + } > +} > + > +/* Read filesystem name from virtio config into fs->tag (must kfree()). */ > +static int virtio_fs_read_tag(struct virtio_device *vdev, struct virtio_fs *fs) > +{ > + char tag_buf[sizeof_field(struct virtio_fs_config, tag)]; > + char *end; > + size_t len; > + > + virtio_cread_bytes(vdev, offsetof(struct virtio_fs_config, tag), > + &tag_buf, sizeof(tag_buf)); > + end = memchr(tag_buf, '\0', sizeof(tag_buf)); > + if (end == tag_buf) > + return -EINVAL; /* empty tag */ > + if (!end) > + end = &tag_buf[sizeof(tag_buf)]; > + > + len = end - tag_buf; > + fs->tag = devm_kmalloc(&vdev->dev, len + 1, GFP_KERNEL); > + if (!fs->tag) > + return -ENOMEM; > + memcpy(fs->tag, tag_buf, len); > + fs->tag[len] = '\0'; > + return 0; > +} > + > +/* Work function for hiprio completion */ > +static void virtio_fs_hiprio_done_work(struct work_struct *work) > +{ > + struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > + done_work); > + struct virtqueue *vq = fsvq->vq; > + > + /* Free completed FUSE_FORGET requests */ > + spin_lock(&fsvq->lock); > + do { > + unsigned int len; > + void *req; > + > + virtqueue_disable_cb(vq); > + > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > + kfree(req); > + fsvq->in_flight--; > + } > + } while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq))); > + spin_unlock(&fsvq->lock); > +} > + > +static void virtio_fs_dummy_dispatch_work(struct work_struct *work) > +{ > +} > + > +static void virtio_fs_hiprio_dispatch_work(struct work_struct *work) > +{ > + struct virtio_fs_forget *forget; > + struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > + dispatch_work.work); > + struct virtqueue *vq = fsvq->vq; > + struct scatterlist sg; > + struct scatterlist *sgs[] = {&sg}; > + bool notify; > + int ret; > + > + pr_debug("virtio-fs: worker %s called.\n", __func__); > + while (1) { > + spin_lock(&fsvq->lock); > + forget = list_first_entry_or_null(&fsvq->queued_reqs, > + struct virtio_fs_forget, list); > + if (!forget) { > + spin_unlock(&fsvq->lock); > + return; > + } > + > + list_del(&forget->list); > + if (!fsvq->connected) { > + spin_unlock(&fsvq->lock); > + kfree(forget); > + continue; > + } > + > + sg_init_one(&sg, forget, sizeof(*forget));This passes to host a structure including "struct list_head list"; Not a good idea.> + > + /* Enqueue the request */ > + dev_dbg(&vq->vdev->dev, "%s\n", __func__); > + ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC);This is easier as add_outbuf. Also - why GFP_ATOMIC?> + if (ret < 0) { > + if (ret == -ENOMEM || ret == -ENOSPC) { > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later\n", > + ret); > + list_add_tail(&forget->list, > + &fsvq->queued_reqs); > + schedule_delayed_work(&fsvq->dispatch_work, > + msecs_to_jiffies(1));Can't we we requeue after some buffers get consumed?> + } else { > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Dropping it.\n", > + ret); > + kfree(forget); > + } > + spin_unlock(&fsvq->lock); > + return; > + } > + > + fsvq->in_flight++; > + notify = virtqueue_kick_prepare(vq); > + spin_unlock(&fsvq->lock); > + > + if (notify) > + virtqueue_notify(vq); > + pr_debug("virtio-fs: worker %s dispatched one forget request.\n", > + __func__); > + } > +} > + > +/* Allocate and copy args into req->argbuf */ > +static int copy_args_to_argbuf(struct fuse_req *req) > +{ > + unsigned int offset = 0; > + unsigned int num_in; > + unsigned int num_out; > + unsigned int len; > + unsigned int i; > + > + num_in = req->in.numargs - req->in.argpages; > + num_out = req->out.numargs - req->out.argpages; > + len = fuse_len_args(num_in, (struct fuse_arg *)req->in.args) + > + fuse_len_args(num_out, req->out.args); > + > + req->argbuf = kmalloc(len, GFP_ATOMIC); > + if (!req->argbuf) > + return -ENOMEM; > + > + for (i = 0; i < num_in; i++) { > + memcpy(req->argbuf + offset, > + req->in.args[i].value, > + req->in.args[i].size); > + offset += req->in.args[i].size; > + } > + > + return 0; > +} > + > +/* Copy args out of and free req->argbuf */ > +static void copy_args_from_argbuf(struct fuse_req *req) > +{ > + unsigned int remaining; > + unsigned int offset; > + unsigned int num_in; > + unsigned int num_out; > + unsigned int i; > + > + remaining = req->out.h.len - sizeof(req->out.h); > + num_in = req->in.numargs - req->in.argpages; > + num_out = req->out.numargs - req->out.argpages; > + offset = fuse_len_args(num_in, (struct fuse_arg *)req->in.args); > + > + for (i = 0; i < num_out; i++) { > + unsigned int argsize = req->out.args[i].size; > + > + if (req->out.argvar && > + i == req->out.numargs - 1 && > + argsize > remaining) { > + argsize = remaining; > + } > + > + memcpy(req->out.args[i].value, req->argbuf + offset, argsize); > + offset += argsize; > + > + if (i != req->out.numargs - 1) > + remaining -= argsize; > + } > + > + /* Store the actual size of the variable-length arg */ > + if (req->out.argvar) > + req->out.args[req->out.numargs - 1].size = remaining; > + > + kfree(req->argbuf); > + req->argbuf = NULL; > +} > + > +/* Work function for request completion */ > +static void virtio_fs_requests_done_work(struct work_struct *work) > +{ > + struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > + done_work); > + struct fuse_pqueue *fpq = &fsvq->fud->pq; > + struct fuse_conn *fc = fsvq->fud->fc; > + struct virtqueue *vq = fsvq->vq; > + struct fuse_req *req; > + struct fuse_req *next; > + unsigned int len, i, thislen; > + struct page *page; > + LIST_HEAD(reqs); > + > + /* Collect completed requests off the virtqueue */ > + spin_lock(&fsvq->lock); > + do { > + virtqueue_disable_cb(vq); > + > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > + spin_lock(&fpq->lock); > + list_move_tail(&req->list, &reqs); > + spin_unlock(&fpq->lock); > + } > + } while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq))); > + spin_unlock(&fsvq->lock); > + > + /* End requests */ > + list_for_each_entry_safe(req, next, &reqs, list) { > + /* TODO check unique */ > + /* TODO fuse_len_args(out) against oh.len */ > + > + copy_args_from_argbuf(req); > + > + if (req->out.argpages && req->out.page_zeroing) { > + len = req->out.args[req->out.numargs - 1].size; > + for (i = 0; i < req->num_pages; i++) { > + thislen = req->page_descs[i].length; > + if (len < thislen) { > + WARN_ON(req->page_descs[i].offset); > + page = req->pages[i]; > + zero_user_segment(page, len, thislen); > + len = 0; > + } else { > + len -= thislen; > + } > + } > + } > + > + spin_lock(&fpq->lock); > + clear_bit(FR_SENT, &req->flags); > + list_del_init(&req->list); > + spin_unlock(&fpq->lock); > + > + fuse_request_end(fc, req); > + } > +} > + > +/* Virtqueue interrupt handler */ > +static void virtio_fs_vq_done(struct virtqueue *vq) > +{ > + struct virtio_fs_vq *fsvq = vq_to_fsvq(vq); > + > + dev_dbg(&vq->vdev->dev, "%s %s\n", __func__, fsvq->name); > + > + schedule_work(&fsvq->done_work); > +} > + > +/* Initialize virtqueues */ > +static int virtio_fs_setup_vqs(struct virtio_device *vdev, > + struct virtio_fs *fs) > +{ > + struct virtqueue **vqs; > + vq_callback_t **callbacks; > + const char **names; > + unsigned int i; > + int ret; > + > + virtio_cread(vdev, struct virtio_fs_config, num_queues, > + &fs->num_queues); > + if (fs->num_queues == 0) > + return -EINVAL; > + > + fs->nvqs = 1 + fs->num_queues; > + > + fs->vqs = devm_kcalloc(&vdev->dev, fs->nvqs, > + sizeof(fs->vqs[VQ_HIPRIO]), GFP_KERNEL); > + if (!fs->vqs) > + return -ENOMEM; > + > + vqs = kmalloc_array(fs->nvqs, sizeof(vqs[VQ_HIPRIO]), GFP_KERNEL); > + callbacks = kmalloc_array(fs->nvqs, sizeof(callbacks[VQ_HIPRIO]), > + GFP_KERNEL); > + names = kmalloc_array(fs->nvqs, sizeof(names[VQ_HIPRIO]), GFP_KERNEL); > + if (!vqs || !callbacks || !names) { > + ret = -ENOMEM; > + goto out; > + } > + > + callbacks[VQ_HIPRIO] = virtio_fs_vq_done; > + snprintf(fs->vqs[VQ_HIPRIO].name, sizeof(fs->vqs[VQ_HIPRIO].name), > + "hiprio"); > + names[VQ_HIPRIO] = fs->vqs[VQ_HIPRIO].name; > + INIT_WORK(&fs->vqs[VQ_HIPRIO].done_work, virtio_fs_hiprio_done_work); > + INIT_LIST_HEAD(&fs->vqs[VQ_HIPRIO].queued_reqs); > + INIT_DELAYED_WORK(&fs->vqs[VQ_HIPRIO].dispatch_work, > + virtio_fs_hiprio_dispatch_work); > + spin_lock_init(&fs->vqs[VQ_HIPRIO].lock); > + > + /* Initialize the requests virtqueues */ > + for (i = VQ_REQUEST; i < fs->nvqs; i++) { > + spin_lock_init(&fs->vqs[i].lock); > + INIT_WORK(&fs->vqs[i].done_work, virtio_fs_requests_done_work); > + INIT_DELAYED_WORK(&fs->vqs[i].dispatch_work, > + virtio_fs_dummy_dispatch_work); > + INIT_LIST_HEAD(&fs->vqs[i].queued_reqs); > + snprintf(fs->vqs[i].name, sizeof(fs->vqs[i].name), > + "requests.%u", i - VQ_REQUEST); > + callbacks[i] = virtio_fs_vq_done; > + names[i] = fs->vqs[i].name; > + } > + > + ret = virtio_find_vqs(vdev, fs->nvqs, vqs, callbacks, names, NULL); > + if (ret < 0) > + goto out; > + > + for (i = 0; i < fs->nvqs; i++) { > + fs->vqs[i].vq = vqs[i]; > + fs->vqs[i].connected = true; > + } > +out: > + kfree(names); > + kfree(callbacks); > + kfree(vqs); > + return ret; > +} > + > +/* Free virtqueues (device must already be reset) */ > +static void virtio_fs_cleanup_vqs(struct virtio_device *vdev, > + struct virtio_fs *fs) > +{ > + vdev->config->del_vqs(vdev); > +} > + > +static int virtio_fs_probe(struct virtio_device *vdev) > +{ > + struct virtio_fs *fs; > + int ret; > + > + fs = devm_kzalloc(&vdev->dev, sizeof(*fs), GFP_KERNEL); > + if (!fs) > + return -ENOMEM; > + vdev->priv = fs; > + > + ret = virtio_fs_read_tag(vdev, fs); > + if (ret < 0) > + goto out; > + > + ret = virtio_fs_setup_vqs(vdev, fs); > + if (ret < 0) > + goto out; > + > + /* TODO vq affinity */ > + /* TODO populate notifications vq */what's notifications vq?> + > + /* Bring the device online in case the filesystem is mounted and > + * requests need to be sent before we return. > + */ > + virtio_device_ready(vdev); > + > + ret = virtio_fs_add_instance(fs); > + if (ret < 0) > + goto out_vqs; > + > + return 0; > + > +out_vqs: > + vdev->config->reset(vdev); > + virtio_fs_cleanup_vqs(vdev, fs); > + > +out: > + vdev->priv = NULL; > + return ret; > +} > + > +static void virtio_fs_remove(struct virtio_device *vdev) > +{ > + struct virtio_fs *fs = vdev->priv; > + > + virtio_fs_free_devs(fs); > + > + vdev->config->reset(vdev); > + virtio_fs_cleanup_vqs(vdev, fs); > + > + mutex_lock(&virtio_fs_mutex); > + list_del(&fs->list); > + mutex_unlock(&virtio_fs_mutex); > + > + vdev->priv = NULL; > +} > + > +#ifdef CONFIG_PM_SLEEP > +static int virtio_fs_freeze(struct virtio_device *vdev) > +{ > + return 0; /* TODO */ > +} > + > +static int virtio_fs_restore(struct virtio_device *vdev) > +{ > + return 0; /* TODO */ > +}Is this really a good idea? I'd rather it was implemented, but if not possible at all disabling PM seems better than just keep going.> +#endif /* CONFIG_PM_SLEEP */ > + > +const static struct virtio_device_id id_table[] = { > + { VIRTIO_ID_FS, VIRTIO_DEV_ANY_ID }, > + {}, > +}; > + > +const static unsigned int feature_table[] = {}; > + > +static struct virtio_driver virtio_fs_driver = { > + .driver.name = KBUILD_MODNAME, > + .driver.owner = THIS_MODULE, > + .id_table = id_table, > + .feature_table = feature_table, > + .feature_table_size = ARRAY_SIZE(feature_table), > + /* TODO validate config_get != NULL */Why?> + .probe = virtio_fs_probe, > + .remove = virtio_fs_remove, > +#ifdef CONFIG_PM_SLEEP > + .freeze = virtio_fs_freeze, > + .restore = virtio_fs_restore, > +#endif > +}; > + > +static void virtio_fs_wake_forget_and_unlock(struct fuse_iqueue *fiq) > +__releases(fiq->waitq.lock) > +{ > + struct fuse_forget_link *link; > + struct virtio_fs_forget *forget; > + struct scatterlist sg; > + struct scatterlist *sgs[] = {&sg}; > + struct virtio_fs *fs; > + struct virtqueue *vq; > + struct virtio_fs_vq *fsvq; > + bool notify; > + u64 unique; > + int ret; > + > + link = fuse_dequeue_forget(fiq, 1, NULL); > + unique = fuse_get_unique(fiq); > + > + fs = fiq->priv; > + fsvq = &fs->vqs[VQ_HIPRIO]; > + spin_unlock(&fiq->waitq.lock); > + > + /* Allocate a buffer for the request */ > + forget = kmalloc(sizeof(*forget), GFP_NOFS | __GFP_NOFAIL); > + > + forget->ih = (struct fuse_in_header){ > + .opcode = FUSE_FORGET, > + .nodeid = link->forget_one.nodeid, > + .unique = unique, > + .len = sizeof(*forget), > + }; > + forget->arg = (struct fuse_forget_in){ > + .nlookup = link->forget_one.nlookup, > + }; > + > + sg_init_one(&sg, forget, sizeof(*forget)); > + > + /* Enqueue the request */ > + vq = fsvq->vq; > + dev_dbg(&vq->vdev->dev, "%s\n", __func__); > + spin_lock(&fsvq->lock); > + > + ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC); > + if (ret < 0) { > + if (ret == -ENOMEM || ret == -ENOSPC) { > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later.\n", > + ret); > + list_add_tail(&forget->list, &fsvq->queued_reqs); > + schedule_delayed_work(&fsvq->dispatch_work, > + msecs_to_jiffies(1)); > + } else { > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Dropping it.\n", > + ret); > + kfree(forget); > + } > + spin_unlock(&fsvq->lock); > + goto out; > + }code duplicated from virtio_fs_hiprio_dispatch_work> + > + fsvq->in_flight++; > + notify = virtqueue_kick_prepare(vq); > + > + spin_unlock(&fsvq->lock); > + > + if (notify) > + virtqueue_notify(vq); > +out: > + kfree(link); > +} > + > +static void virtio_fs_wake_interrupt_and_unlock(struct fuse_iqueue *fiq) > +__releases(fiq->waitq.lock) > +{ > + /* TODO */what's needed?> + spin_unlock(&fiq->waitq.lock); > +} > + > +/* Return the number of scatter-gather list elements required */ > +static unsigned int sg_count_fuse_req(struct fuse_req *req) > +{ > + unsigned int total_sgs = 1 /* fuse_in_header */; > + > + if (req->in.numargs - req->in.argpages) > + total_sgs += 1; > + > + if (req->in.argpages) > + total_sgs += req->num_pages; > + > + if (!test_bit(FR_ISREPLY, &req->flags)) > + return total_sgs; > + > + total_sgs += 1 /* fuse_out_header */; > + > + if (req->out.numargs - req->out.argpages) > + total_sgs += 1; > + > + if (req->out.argpages) > + total_sgs += req->num_pages; > + > + return total_sgs; > +} > + > +/* Add pages to scatter-gather list and return number of elements used */ > +static unsigned int sg_init_fuse_pages(struct scatterlist *sg, > + struct page **pages, > + struct fuse_page_desc *page_descs, > + unsigned int num_pages, > + unsigned int total_len) > +{ > + unsigned int i; > + unsigned int this_len; > + > + for (i = 0; i < num_pages && total_len; i++) { > + sg_init_table(&sg[i], 1); > + this_len = min(page_descs[i].length, total_len); > + sg_set_page(&sg[i], pages[i], this_len, page_descs[i].offset); > + total_len -= this_len; > + } > + > + return i; > +} > + > +/* Add args to scatter-gather list and return number of elements used */ > +static unsigned int sg_init_fuse_args(struct scatterlist *sg, > + struct fuse_req *req, > + struct fuse_arg *args, > + unsigned int numargs, > + bool argpages, > + void *argbuf, > + unsigned int *len_used) > +{ > + unsigned int total_sgs = 0; > + unsigned int len; > + > + len = fuse_len_args(numargs - argpages, args); > + if (len) > + sg_init_one(&sg[total_sgs++], argbuf, len); > + > + if (argpages) > + total_sgs += sg_init_fuse_pages(&sg[total_sgs], > + req->pages, > + req->page_descs, > + req->num_pages, > + args[numargs - 1].size); > + > + if (len_used) > + *len_used = len; > + > + return total_sgs; > +} > + > +/* Add a request to a virtqueue and kick the device */ > +static int virtio_fs_enqueue_req(struct virtqueue *vq, struct fuse_req *req) > +{ > + /* requests need at least 4 elements */ > + struct scatterlist *stack_sgs[6]; > + struct scatterlist stack_sg[ARRAY_SIZE(stack_sgs)]; > + struct scatterlist **sgs = stack_sgs; > + struct scatterlist *sg = stack_sg; > + struct virtio_fs_vq *fsvq; > + unsigned int argbuf_used = 0; > + unsigned int out_sgs = 0; > + unsigned int in_sgs = 0; > + unsigned int total_sgs; > + unsigned int i; > + int ret; > + bool notify; > + > + /* Does the sglist fit on the stack? */ > + total_sgs = sg_count_fuse_req(req); > + if (total_sgs > ARRAY_SIZE(stack_sgs)) { > + sgs = kmalloc_array(total_sgs, sizeof(sgs[0]), GFP_ATOMIC); > + sg = kmalloc_array(total_sgs, sizeof(sg[0]), GFP_ATOMIC); > + if (!sgs || !sg) { > + ret = -ENOMEM; > + goto out; > + } > + } > + > + /* Use a bounce buffer since stack args cannot be mapped */ > + ret = copy_args_to_argbuf(req); > + if (ret < 0) > + goto out; > + > + /* Request elements */ > + sg_init_one(&sg[out_sgs++], &req->in.h, sizeof(req->in.h)); > + out_sgs += sg_init_fuse_args(&sg[out_sgs], req, > + (struct fuse_arg *)req->in.args, > + req->in.numargs, req->in.argpages, > + req->argbuf, &argbuf_used); > + > + /* Reply elements */ > + if (test_bit(FR_ISREPLY, &req->flags)) { > + sg_init_one(&sg[out_sgs + in_sgs++], > + &req->out.h, sizeof(req->out.h)); > + in_sgs += sg_init_fuse_args(&sg[out_sgs + in_sgs], req, > + req->out.args, req->out.numargs, > + req->out.argpages, > + req->argbuf + argbuf_used, NULL); > + } > + > + WARN_ON(out_sgs + in_sgs != total_sgs); > + > + for (i = 0; i < total_sgs; i++) > + sgs[i] = &sg[i]; > + > + fsvq = vq_to_fsvq(vq); > + spin_lock(&fsvq->lock); > + > + ret = virtqueue_add_sgs(vq, sgs, out_sgs, in_sgs, req, GFP_ATOMIC); > + if (ret < 0) { > + /* TODO handle full virtqueue */Indeed. What prevents this?> + spin_unlock(&fsvq->lock); > + goto out; > + } > + > + notify = virtqueue_kick_prepare(vq); > + > + spin_unlock(&fsvq->lock); > + > + if (notify) > + virtqueue_notify(vq); > + > +out: > + if (ret < 0 && req->argbuf) { > + kfree(req->argbuf); > + req->argbuf = NULL; > + } > + if (sgs != stack_sgs) { > + kfree(sgs); > + kfree(sg); > + } > + > + return ret; > +} > + > +static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq) > +__releases(fiq->waitq.lock) > +{ > + unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */ > + struct virtio_fs *fs; > + struct fuse_conn *fc; > + struct fuse_req *req; > + struct fuse_pqueue *fpq; > + int ret; > + > + WARN_ON(list_empty(&fiq->pending)); > + req = list_last_entry(&fiq->pending, struct fuse_req, list); > + clear_bit(FR_PENDING, &req->flags); > + list_del_init(&req->list); > + WARN_ON(!list_empty(&fiq->pending)); > + spin_unlock(&fiq->waitq.lock); > + > + fs = fiq->priv; > + fc = fs->vqs[queue_id].fud->fc; > + > + dev_dbg(&fs->vqs[queue_id].vq->vdev->dev, > + "%s: opcode %u unique %#llx nodeid %#llx in.len %u out.len %u\n", > + __func__, req->in.h.opcode, req->in.h.unique, req->in.h.nodeid, > + req->in.h.len, fuse_len_args(req->out.numargs, req->out.args)); > + > + fpq = &fs->vqs[queue_id].fud->pq; > + spin_lock(&fpq->lock); > + if (!fpq->connected) { > + spin_unlock(&fpq->lock); > + req->out.h.error = -ENODEV; > + pr_err("virtio-fs: %s disconnected\n", __func__); > + fuse_request_end(fc, req); > + return; > + } > + list_add_tail(&req->list, fpq->processing); > + spin_unlock(&fpq->lock); > + set_bit(FR_SENT, &req->flags); > + /* matches barrier in request_wait_answer() */ > + smp_mb__after_atomic(); > + /* TODO check for FR_INTERRUPTED? */?> + > +retry: > + ret = virtio_fs_enqueue_req(fs->vqs[queue_id].vq, req); > + if (ret < 0) { > + if (ret == -ENOMEM || ret == -ENOSPC) { > + /* Virtqueue full. Retry submission */ > + usleep_range(20, 30);again, why not respond to completion?> + goto retry; > + } > + req->out.h.error = ret; > + pr_err("virtio-fs: virtio_fs_enqueue_req() failed %d\n", ret); > + fuse_request_end(fc, req); > + return; > + } > +} > + > +static void virtio_fs_flush_hiprio_queue(struct virtio_fs_vq *fsvq) > +{ > + struct virtio_fs_forget *forget; > + > + WARN_ON(fsvq->in_flight < 0); > + > + /* Go through pending forget requests and free them */ > + spin_lock(&fsvq->lock); > + while (1) { > + forget = list_first_entry_or_null(&fsvq->queued_reqs, > + struct virtio_fs_forget, list); > + if (!forget) > + break; > + list_del(&forget->list); > + kfree(forget); > + } > + > + spin_unlock(&fsvq->lock); > + > + /* Wait for in flight requests to finish.*/ > + while (1) { > + spin_lock(&fsvq->lock); > + if (!fsvq->in_flight) { > + spin_unlock(&fsvq->lock); > + break; > + } > + spin_unlock(&fsvq->lock); > + usleep_range(1000, 2000);Same thing here. Can we use e.g. a completion and not usleep?> + } > +} > + > +const static struct fuse_iqueue_ops virtio_fs_fiq_ops = { > + .wake_forget_and_unlock = virtio_fs_wake_forget_and_unlock, > + .wake_interrupt_and_unlock = virtio_fs_wake_interrupt_and_unlock, > + .wake_pending_and_unlock = virtio_fs_wake_pending_and_unlock, > +}; > + > +static int virtio_fs_fill_super(struct super_block *sb) > +{ > + struct fuse_conn *fc = get_fuse_conn_super(sb); > + struct virtio_fs *fs = fc->iq.priv; > + unsigned int i; > + int err; > + struct fuse_req *init_req; > + struct fuse_fs_context ctx = { > + .rootmode = S_IFDIR, > + .default_permissions = 1, > + .allow_other = 1, > + .max_read = UINT_MAX, > + .blksize = 512, > + .destroy = true, > + .no_control = true, > + .no_force_umount = true, > + }; > + > + /* TODO lock */what does this refer to?> + if (fs->vqs[VQ_REQUEST].fud) { > + pr_err("virtio-fs: device already in use\n"); > + err = -EBUSY; > + goto err; > + } > + > + err = -ENOMEM; > + /* Allocate fuse_dev for hiprio and notification queues */ > + for (i = 0; i < VQ_REQUEST; i++) { > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > + > + fsvq->fud = fuse_dev_alloc(); > + if (!fsvq->fud) > + goto err_free_fuse_devs; > + } > + > + init_req = fuse_request_alloc(0); > + if (!init_req) > + goto err_free_fuse_devs; > + __set_bit(FR_BACKGROUND, &init_req->flags); > + > + ctx.fudptr = (void **)&fs->vqs[VQ_REQUEST].fud; > + err = fuse_fill_super_common(sb, &ctx); > + if (err < 0) > + goto err_free_init_req; > + > + fc = fs->vqs[VQ_REQUEST].fud->fc; > + > + /* TODO take fuse_mutex around this loop? */Don't we need to figure this kind of thing out before this code lands upstream?> + for (i = 0; i < fs->nvqs; i++) { > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > + > + if (i == VQ_REQUEST) > + continue; /* already initialized */ > + fuse_dev_install(fsvq->fud, fc); > + atomic_inc(&fc->dev_count); > + } > + > + fuse_send_init(fc, init_req); > + return 0; > + > +err_free_init_req: > + fuse_request_free(init_req); > +err_free_fuse_devs: > + for (i = 0; i < fs->nvqs; i++) > + fuse_dev_free(fs->vqs[i].fud); > +err: > + return err; > +} > + > +static void virtio_kill_sb(struct super_block *sb) > +{ > + struct fuse_conn *fc = get_fuse_conn_super(sb); > + struct virtio_fs *vfs; > + struct virtio_fs_vq *fsvq; > + > + /* If mount failed, we can still be called without any fc */ > + if (!fc) > + return fuse_kill_sb_anon(sb); > + > + vfs = fc->iq.priv; > + fsvq = &vfs->vqs[VQ_HIPRIO]; > + > + /* Stop forget queue. Soon destroy will be sent */ > + spin_lock(&fsvq->lock); > + fsvq->connected = false; > + spin_unlock(&fsvq->lock); > + virtio_fs_flush_hiprio_queue(fsvq); > + > + fuse_kill_sb_anon(sb); > + virtio_fs_free_devs(vfs); > +} > + > +static int virtio_fs_test_super(struct super_block *sb, > + struct fs_context *fsc) > +{ > + struct fuse_conn *fc = fsc->s_fs_info; > + > + return fc->iq.priv == get_fuse_conn_super(sb)->iq.priv; > +} > + > +static int virtio_fs_set_super(struct super_block *sb, > + struct fs_context *fsc) > +{ > + int err; > + > + err = get_anon_bdev(&sb->s_dev); > + if (!err) > + fuse_conn_get(fsc->s_fs_info); > + > + return err; > +} > + > +static int virtio_fs_get_tree(struct fs_context *fsc) > +{ > + struct virtio_fs *fs; > + struct super_block *sb; > + struct fuse_conn *fc; > + int err; > + > + fs = virtio_fs_find_instance(fsc->source); > + if (!fs) { > + pr_info("virtio-fs: tag <%s> not found\n", fsc->source); > + return -EINVAL; > + } > + > + fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL); > + if (!fc) > + return -ENOMEM; > + > + fuse_conn_init(fc, get_user_ns(current_user_ns()), &virtio_fs_fiq_ops, > + fs); > + fc->release = fuse_free_conn; > + fc->delete_stale = true; > + > + fsc->s_fs_info = fc; > + sb = sget_fc(fsc, virtio_fs_test_super, virtio_fs_set_super); > + fuse_conn_put(fc); > + if (IS_ERR(sb)) > + return PTR_ERR(sb); > + > + if (!sb->s_root) { > + err = virtio_fs_fill_super(sb); > + if (err) { > + deactivate_locked_super(sb); > + return err; > + } > + > + sb->s_flags |= SB_ACTIVE; > + } > + > + WARN_ON(fsc->root); > + fsc->root = dget(sb->s_root); > + return 0; > +} > + > +static const struct fs_context_operations virtio_fs_context_ops = { > + .get_tree = virtio_fs_get_tree, > +}; > + > +static int virtio_fs_init_fs_context(struct fs_context *fsc) > +{ > + fsc->ops = &virtio_fs_context_ops; > + return 0; > +} > + > +static struct file_system_type virtio_fs_type = { > + .owner = THIS_MODULE, > + .name = "virtiofs", > + .init_fs_context = virtio_fs_init_fs_context, > + .kill_sb = virtio_kill_sb, > +}; > + > +static int __init virtio_fs_init(void) > +{ > + int ret; > + > + ret = register_virtio_driver(&virtio_fs_driver); > + if (ret < 0) > + return ret; > + > + ret = register_filesystem(&virtio_fs_type); > + if (ret < 0) { > + unregister_virtio_driver(&virtio_fs_driver); > + return ret; > + } > + > + return 0; > +} > +module_init(virtio_fs_init); > + > +static void __exit virtio_fs_exit(void) > +{ > + unregister_filesystem(&virtio_fs_type); > + unregister_virtio_driver(&virtio_fs_driver); > +} > +module_exit(virtio_fs_exit); > + > +MODULE_AUTHOR("Stefan Hajnoczi <stefanha at redhat.com>"); > +MODULE_DESCRIPTION("Virtio Filesystem"); > +MODULE_LICENSE("GPL"); > +MODULE_ALIAS_FS(KBUILD_MODNAME); > +MODULE_DEVICE_TABLE(virtio, id_table); > diff --git a/include/uapi/linux/virtio_fs.h b/include/uapi/linux/virtio_fs.h > new file mode 100644 > index 000000000000..b5e99c217c86 > --- /dev/null > +++ b/include/uapi/linux/virtio_fs.h > @@ -0,0 +1,19 @@ > +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ > + > +#ifndef _UAPI_LINUX_VIRTIO_FS_H > +#define _UAPI_LINUX_VIRTIO_FS_H > + > +#include <linux/types.h> > +#include <linux/virtio_ids.h> > +#include <linux/virtio_config.h> > +#include <linux/virtio_types.h> > + > +struct virtio_fs_config { > + /* Filesystem name (UTF-8, not NUL-terminated, padded with NULs) */ > + __u8 tag[36]; > + > + /* Number of request queues */ > + __u32 num_queues; > +} __attribute__((packed)); > + > +#endif /* _UAPI_LINUX_VIRTIO_FS_H */ > diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h > index 348fd0176f75..585e07b27333 100644 > --- a/include/uapi/linux/virtio_ids.h > +++ b/include/uapi/linux/virtio_ids.h > @@ -44,6 +44,7 @@ > #define VIRTIO_ID_VSOCK 19 /* virtio vsock transport */ > #define VIRTIO_ID_CRYPTO 20 /* virtio crypto */ > #define VIRTIO_ID_IOMMU 23 /* virtio IOMMU */ > +#define VIRTIO_ID_FS 26 /* virtio filesystem */ > #define VIRTIO_ID_PMEM 27 /* virtio pmem */ > > #endif /* _LINUX_VIRTIO_IDS_H */ > -- > 2.21.0
On Tue, Sep 03, 2019 at 09:55:49AM -0400, Michael S. Tsirkin wrote:> On Tue, Sep 03, 2019 at 01:42:02PM +0200, Miklos Szeredi wrote: > Endian-ness for fuse header also looks wrong.[...]> > +struct virtio_fs_forget { > > + struct fuse_in_header ih; > > + struct fuse_forget_in arg; > > These structures are all native endian. > > Passing them to host will make cross-endian setups painful to support, > and hardware implementations impossible. > > How about converting everything to LE?The driver dictates the endianness of the FUSE protocol session. The virtio-fs device specification states that the device looks at the first request's fuse_in_header::opcode field to detect the guest endianness. If it sees FUSE_INIT in its native endianness then no byte-swapping is necessary. If it sees FUSE_INIT in the opposite endianness then byte-swapping is necessary on the device side. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20190904/762190e5/attachment.sig>
Michael S. Tsirkin
2019-Sep-04 18:58 UTC
[PATCH v4 15/16] virtio-fs: add virtiofs filesystem
On Wed, Sep 04, 2019 at 07:16:30PM +0100, Stefan Hajnoczi wrote:> On Tue, Sep 03, 2019 at 09:55:49AM -0400, Michael S. Tsirkin wrote: > > On Tue, Sep 03, 2019 at 01:42:02PM +0200, Miklos Szeredi wrote: > > Endian-ness for fuse header also looks wrong. > [...] > > > +struct virtio_fs_forget { > > > + struct fuse_in_header ih; > > > + struct fuse_forget_in arg; > > > > These structures are all native endian. > > > > Passing them to host will make cross-endian setups painful to support, > > and hardware implementations impossible. > > > > How about converting everything to LE? > > The driver dictates the endianness of the FUSE protocol session. The > virtio-fs device specification states that the device looks at the first > request's fuse_in_header::opcode field to detect the guest endianness. > > If it sees FUSE_INIT in its native endianness then no byte-swapping is > necessary. If it sees FUSE_INIT in the opposite endianness then > byte-swapping is necessary on the device side.You are right. Pls ignore the comment. We need to reserve the byte-swapped FUSE_INIT to make sure future versions of fuse don't try to send that though. I sent a patch to that effect, let's see whether it gets accepted. -- MST
On Tue, Sep 03, 2019 at 09:55:49AM -0400, Michael S. Tsirkin wrote: [..]> What's with all of the TODOs? Some of these are really scary, > looks like they need to be figured out before this is merged.Hi Michael, One of the issue I noticed is races w.r.t device removal and super block initialization. I am about to post a set of patches which take care of these races and also get rid of some of the scary TODOs. Other TODOs like suspend/restore, multiqueue support etc are improvements which we can do over a period of time. [..]> > +/* Per-virtqueue state */ > > +struct virtio_fs_vq { > > + spinlock_t lock; > > + struct virtqueue *vq; /* protected by ->lock */ > > + struct work_struct done_work; > > + struct list_head queued_reqs; > > + struct delayed_work dispatch_work; > > + struct fuse_dev *fud; > > + bool connected; > > + long in_flight; > > + char name[24]; > > I'd keep names somewhere separate as they are not used on data path.Ok, this sounds like a nice to have. Will take care of this once base patch gets merged. [..]> > +struct virtio_fs_forget { > > + struct fuse_in_header ih; > > + struct fuse_forget_in arg; > > These structures are all native endian. > > Passing them to host will make cross-endian setups painful to support, > and hardware implementations impossible. > > How about converting everything to LE?So looks like endianness issue is now resolved (going by the other emails). So I will not worry about it. [..]> > +/* Add a new instance to the list or return -EEXIST if tag name exists*/ > > +static int virtio_fs_add_instance(struct virtio_fs *fs) > > +{ > > + struct virtio_fs *fs2; > > + bool duplicate = false; > > + > > + mutex_lock(&virtio_fs_mutex); > > + > > + list_for_each_entry(fs2, &virtio_fs_instances, list) { > > + if (strcmp(fs->tag, fs2->tag) == 0) > > + duplicate = true; > > + } > > + > > + if (!duplicate) > > + list_add_tail(&fs->list, &virtio_fs_instances); > > > This is O(N^2) as it's presumably called for each istance. > Doesn't scale - please switch to a tree or such.This is O(N) and not O(N^2) right? Addition of device is O(N), search during mount is O(N). This is not a frequent event at all and number of virtiofs instances per guest are expected to be fairly small (say less than 10). So I really don't think there is any value in converting this into a tree (instead of a list). [..]> > +static void virtio_fs_free_devs(struct virtio_fs *fs) > > +{ > > + unsigned int i; > > + > > + /* TODO lock */ > > Doesn't inspire confidence, does it?Agreed. Getting rid of this in set of fixes I am about to post.> > > + > > + for (i = 0; i < fs->nvqs; i++) { > > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > > + > > + if (!fsvq->fud) > > + continue; > > + > > + flush_work(&fsvq->done_work); > > + flush_delayed_work(&fsvq->dispatch_work); > > + > > + /* TODO need to quiesce/end_requests/decrement dev_count */ > > Indeed. Won't this crash if we don't?Took care of this as well. [..]> > +static void virtio_fs_hiprio_dispatch_work(struct work_struct *work) > > +{ > > + struct virtio_fs_forget *forget; > > + struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > > + dispatch_work.work); > > + struct virtqueue *vq = fsvq->vq; > > + struct scatterlist sg; > > + struct scatterlist *sgs[] = {&sg}; > > + bool notify; > > + int ret; > > + > > + pr_debug("virtio-fs: worker %s called.\n", __func__); > > + while (1) { > > + spin_lock(&fsvq->lock); > > + forget = list_first_entry_or_null(&fsvq->queued_reqs, > > + struct virtio_fs_forget, list); > > + if (!forget) { > > + spin_unlock(&fsvq->lock); > > + return; > > + } > > + > > + list_del(&forget->list); > > + if (!fsvq->connected) { > > + spin_unlock(&fsvq->lock); > > + kfree(forget); > > + continue; > > + } > > + > > + sg_init_one(&sg, forget, sizeof(*forget)); > > This passes to host a structure including "struct list_head list"; > > Not a good idea.Ok, host does not have to see "struct list_head list". Its needed for guest. Will look into getting rid of this.> > > > + > > + /* Enqueue the request */ > > + dev_dbg(&vq->vdev->dev, "%s\n", __func__); > > + ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC); > > > This is easier as add_outbuf.Will look into it.> > Also - why GFP_ATOMIC?Hmm..., may be it can be GFP_KERNEL. I don't see atomic context here. Will look into it.> > > + if (ret < 0) { > > + if (ret == -ENOMEM || ret == -ENOSPC) { > > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later\n", > > + ret); > > + list_add_tail(&forget->list, > > + &fsvq->queued_reqs); > > + schedule_delayed_work(&fsvq->dispatch_work, > > + msecs_to_jiffies(1)); > > Can't we we requeue after some buffers get consumed?That's what dispatch work is doing. It tries to requeue the request after a while. [..]> > +static int virtio_fs_probe(struct virtio_device *vdev) > > +{ > > + struct virtio_fs *fs; > > + int ret; > > + > > + fs = devm_kzalloc(&vdev->dev, sizeof(*fs), GFP_KERNEL); > > + if (!fs) > > + return -ENOMEM; > > + vdev->priv = fs; > > + > > + ret = virtio_fs_read_tag(vdev, fs); > > + if (ret < 0) > > + goto out; > > + > > + ret = virtio_fs_setup_vqs(vdev, fs); > > + if (ret < 0) > > + goto out; > > + > > + /* TODO vq affinity */ > > + /* TODO populate notifications vq */ > > what's notifications vq?It has not been implemented yet. At some point of time we want to have a notion of notification queue so that host can send notifications to guest. Will get rid of this comment for now. [..]> > +#ifdef CONFIG_PM_SLEEP > > +static int virtio_fs_freeze(struct virtio_device *vdev) > > +{ > > + return 0; /* TODO */ > > +} > > + > > +static int virtio_fs_restore(struct virtio_device *vdev) > > +{ > > + return 0; /* TODO */ > > +} > > Is this really a good idea? I'd rather it was implemented, > but if not possible at all disabling PM seems better than just > keep going.I agree. Will look into disabling it.> > > +#endif /* CONFIG_PM_SLEEP */ > > + > > +const static struct virtio_device_id id_table[] = { > > + { VIRTIO_ID_FS, VIRTIO_DEV_ANY_ID }, > > + {}, > > +}; > > + > > +const static unsigned int feature_table[] = {}; > > + > > +static struct virtio_driver virtio_fs_driver = { > > + .driver.name = KBUILD_MODNAME, > > + .driver.owner = THIS_MODULE, > > + .id_table = id_table, > > + .feature_table = feature_table, > > + .feature_table_size = ARRAY_SIZE(feature_table), > > + /* TODO validate config_get != NULL */ > > Why?Don't know. Stefan, do you remember why did you put this comment? If not, I will get rid of it.> > > + .probe = virtio_fs_probe, > > + .remove = virtio_fs_remove, > > +#ifdef CONFIG_PM_SLEEP > > + .freeze = virtio_fs_freeze, > > + .restore = virtio_fs_restore, > > +#endif > > +}; > > + > > +static void virtio_fs_wake_forget_and_unlock(struct fuse_iqueue *fiq) > > +__releases(fiq->waitq.lock) > > +{ > > + struct fuse_forget_link *link; > > + struct virtio_fs_forget *forget; > > + struct scatterlist sg; > > + struct scatterlist *sgs[] = {&sg}; > > + struct virtio_fs *fs; > > + struct virtqueue *vq; > > + struct virtio_fs_vq *fsvq; > > + bool notify; > > + u64 unique; > > + int ret; > > + > > + link = fuse_dequeue_forget(fiq, 1, NULL); > > + unique = fuse_get_unique(fiq); > > + > > + fs = fiq->priv; > > + fsvq = &fs->vqs[VQ_HIPRIO]; > > + spin_unlock(&fiq->waitq.lock); > > + > > + /* Allocate a buffer for the request */ > > + forget = kmalloc(sizeof(*forget), GFP_NOFS | __GFP_NOFAIL); > > + > > + forget->ih = (struct fuse_in_header){ > > + .opcode = FUSE_FORGET, > > + .nodeid = link->forget_one.nodeid, > > + .unique = unique, > > + .len = sizeof(*forget), > > + }; > > + forget->arg = (struct fuse_forget_in){ > > + .nlookup = link->forget_one.nlookup, > > + }; > > + > > + sg_init_one(&sg, forget, sizeof(*forget)); > > + > > + /* Enqueue the request */ > > + vq = fsvq->vq; > > + dev_dbg(&vq->vdev->dev, "%s\n", __func__); > > + spin_lock(&fsvq->lock); > > + > > + ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC); > > + if (ret < 0) { > > + if (ret == -ENOMEM || ret == -ENOSPC) { > > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later.\n", > > + ret); > > + list_add_tail(&forget->list, &fsvq->queued_reqs); > > + schedule_delayed_work(&fsvq->dispatch_work, > > + msecs_to_jiffies(1)); > > + } else { > > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Dropping it.\n", > > + ret); > > + kfree(forget); > > + } > > + spin_unlock(&fsvq->lock); > > + goto out; > > + } > > > code duplicated from virtio_fs_hiprio_dispatch_workWill look into reusing the code.> > > + > > + fsvq->in_flight++; > > + notify = virtqueue_kick_prepare(vq); > > + > > + spin_unlock(&fsvq->lock); > > + > > + if (notify) > > + virtqueue_notify(vq); > > +out: > > + kfree(link); > > +} > > + > > +static void virtio_fs_wake_interrupt_and_unlock(struct fuse_iqueue *fiq) > > +__releases(fiq->waitq.lock) > > +{ > > + /* TODO */ > > what's needed?We have not implemented this interrupt thing. It interrupts the existing pending requests. Its not a must to have. [..]> > +static int virtio_fs_enqueue_req(struct virtqueue *vq, struct fuse_req *req) > > +{ > > + /* requests need at least 4 elements */ > > + struct scatterlist *stack_sgs[6]; > > + struct scatterlist stack_sg[ARRAY_SIZE(stack_sgs)]; > > + struct scatterlist **sgs = stack_sgs; > > + struct scatterlist *sg = stack_sg; > > + struct virtio_fs_vq *fsvq; > > + unsigned int argbuf_used = 0; > > + unsigned int out_sgs = 0; > > + unsigned int in_sgs = 0; > > + unsigned int total_sgs; > > + unsigned int i; > > + int ret; > > + bool notify; > > + > > + /* Does the sglist fit on the stack? */ > > + total_sgs = sg_count_fuse_req(req); > > + if (total_sgs > ARRAY_SIZE(stack_sgs)) { > > + sgs = kmalloc_array(total_sgs, sizeof(sgs[0]), GFP_ATOMIC); > > + sg = kmalloc_array(total_sgs, sizeof(sg[0]), GFP_ATOMIC); > > + if (!sgs || !sg) { > > + ret = -ENOMEM; > > + goto out; > > + } > > + } > > + > > + /* Use a bounce buffer since stack args cannot be mapped */ > > + ret = copy_args_to_argbuf(req); > > + if (ret < 0) > > + goto out; > > + > > + /* Request elements */ > > + sg_init_one(&sg[out_sgs++], &req->in.h, sizeof(req->in.h)); > > + out_sgs += sg_init_fuse_args(&sg[out_sgs], req, > > + (struct fuse_arg *)req->in.args, > > + req->in.numargs, req->in.argpages, > > + req->argbuf, &argbuf_used); > > + > > + /* Reply elements */ > > + if (test_bit(FR_ISREPLY, &req->flags)) { > > + sg_init_one(&sg[out_sgs + in_sgs++], > > + &req->out.h, sizeof(req->out.h)); > > + in_sgs += sg_init_fuse_args(&sg[out_sgs + in_sgs], req, > > + req->out.args, req->out.numargs, > > + req->out.argpages, > > + req->argbuf + argbuf_used, NULL); > > + } > > + > > + WARN_ON(out_sgs + in_sgs != total_sgs); > > + > > + for (i = 0; i < total_sgs; i++) > > + sgs[i] = &sg[i]; > > + > > + fsvq = vq_to_fsvq(vq); > > + spin_lock(&fsvq->lock); > > + > > + ret = virtqueue_add_sgs(vq, sgs, out_sgs, in_sgs, req, GFP_ATOMIC); > > + if (ret < 0) { > > + /* TODO handle full virtqueue */ > > Indeed. What prevents this?So for forget requests (VQ_HIPRIO), I already handled this by retrying submitting the request with the help of a worker. For regular requests (VQ_REQUESTS), I have never run into virt queue being full so far. That's why never worried about it. So nothing prevents this. But we have not noticed it yet. So its a TODO item. It will be nice to retry if virtuqueue gets full (instead of returning error to caller). [..]> > +static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq) > > +__releases(fiq->waitq.lock) > > +{ > > + unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */ > > + struct virtio_fs *fs; > > + struct fuse_conn *fc; > > + struct fuse_req *req; > > + struct fuse_pqueue *fpq; > > + int ret; > > + > > + WARN_ON(list_empty(&fiq->pending)); > > + req = list_last_entry(&fiq->pending, struct fuse_req, list); > > + clear_bit(FR_PENDING, &req->flags); > > + list_del_init(&req->list); > > + WARN_ON(!list_empty(&fiq->pending)); > > + spin_unlock(&fiq->waitq.lock); > > + > > + fs = fiq->priv; > > + fc = fs->vqs[queue_id].fud->fc; > > + > > + dev_dbg(&fs->vqs[queue_id].vq->vdev->dev, > > + "%s: opcode %u unique %#llx nodeid %#llx in.len %u out.len %u\n", > > + __func__, req->in.h.opcode, req->in.h.unique, req->in.h.nodeid, > > + req->in.h.len, fuse_len_args(req->out.numargs, req->out.args)); > > + > > + fpq = &fs->vqs[queue_id].fud->pq; > > + spin_lock(&fpq->lock); > > + if (!fpq->connected) { > > + spin_unlock(&fpq->lock); > > + req->out.h.error = -ENODEV; > > + pr_err("virtio-fs: %s disconnected\n", __func__); > > + fuse_request_end(fc, req); > > + return; > > + } > > + list_add_tail(&req->list, fpq->processing); > > + spin_unlock(&fpq->lock); > > + set_bit(FR_SENT, &req->flags); > > + /* matches barrier in request_wait_answer() */ > > + smp_mb__after_atomic(); > > + /* TODO check for FR_INTERRUPTED? */ > > > ?hmm... we don't support FR_INTERRUPTED. Stefan, do you remember why this TODO is here. If not, I will get rid of it.> > > + > > +retry: > > + ret = virtio_fs_enqueue_req(fs->vqs[queue_id].vq, req); > > + if (ret < 0) { > > + if (ret == -ENOMEM || ret == -ENOSPC) { > > + /* Virtqueue full. Retry submission */ > > + usleep_range(20, 30); > > again, why not respond to completion?Using usleep() was a quick fix. Will look into using completion in second round of cleanup. In first round I am taking care of more scary TODOs which deal with races w.r.t device removal.> > > + goto retry; > > + } > > + req->out.h.error = ret; > > + pr_err("virtio-fs: virtio_fs_enqueue_req() failed %d\n", ret); > > + fuse_request_end(fc, req); > > + return; > > + } > > +} > > + > > +static void virtio_fs_flush_hiprio_queue(struct virtio_fs_vq *fsvq) > > +{ > > + struct virtio_fs_forget *forget; > > + > > + WARN_ON(fsvq->in_flight < 0); > > + > > + /* Go through pending forget requests and free them */ > > + spin_lock(&fsvq->lock); > > + while (1) { > > + forget = list_first_entry_or_null(&fsvq->queued_reqs, > > + struct virtio_fs_forget, list); > > + if (!forget) > > + break; > > + list_del(&forget->list); > > + kfree(forget); > > + } > > + > > + spin_unlock(&fsvq->lock); > > + > > + /* Wait for in flight requests to finish.*/ > > + while (1) { > > + spin_lock(&fsvq->lock); > > + if (!fsvq->in_flight) { > > + spin_unlock(&fsvq->lock); > > + break; > > + } > > + spin_unlock(&fsvq->lock); > > + usleep_range(1000, 2000); > > Same thing here. Can we use e.g. a completion and not usleep?Second round cleanup.> > > + } > > +} > > + > > +const static struct fuse_iqueue_ops virtio_fs_fiq_ops = { > > + .wake_forget_and_unlock = virtio_fs_wake_forget_and_unlock, > > + .wake_interrupt_and_unlock = virtio_fs_wake_interrupt_and_unlock, > > + .wake_pending_and_unlock = virtio_fs_wake_pending_and_unlock, > > +}; > > + > > +static int virtio_fs_fill_super(struct super_block *sb) > > +{ > > + struct fuse_conn *fc = get_fuse_conn_super(sb); > > + struct virtio_fs *fs = fc->iq.priv; > > + unsigned int i; > > + int err; > > + struct fuse_req *init_req; > > + struct fuse_fs_context ctx = { > > + .rootmode = S_IFDIR, > > + .default_permissions = 1, > > + .allow_other = 1, > > + .max_read = UINT_MAX, > > + .blksize = 512, > > + .destroy = true, > > + .no_control = true, > > + .no_force_umount = true, > > + }; > > + > > + /* TODO lock */ > > what does this refer to?Took care of in first round of cleanup.> > > + if (fs->vqs[VQ_REQUEST].fud) { > > + pr_err("virtio-fs: device already in use\n"); > > + err = -EBUSY; > > + goto err; > > + } > > + > > + err = -ENOMEM; > > + /* Allocate fuse_dev for hiprio and notification queues */ > > + for (i = 0; i < VQ_REQUEST; i++) { > > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > > + > > + fsvq->fud = fuse_dev_alloc(); > > + if (!fsvq->fud) > > + goto err_free_fuse_devs; > > + } > > + > > + init_req = fuse_request_alloc(0); > > + if (!init_req) > > + goto err_free_fuse_devs; > > + __set_bit(FR_BACKGROUND, &init_req->flags); > > + > > + ctx.fudptr = (void **)&fs->vqs[VQ_REQUEST].fud; > > + err = fuse_fill_super_common(sb, &ctx); > > + if (err < 0) > > + goto err_free_init_req; > > + > > + fc = fs->vqs[VQ_REQUEST].fud->fc; > > + > > + /* TODO take fuse_mutex around this loop? */ > > Don't we need to figure this kind of thing out before this > code lands upstream?I think we don't need this TODO. fuse_connection object and associated super block are still being formed. And my cleanup has taken care of the additional locking. Thanks Vivek
On Thu, Sep 05, 2019 at 03:15:15PM -0400, Vivek Goyal wrote:> On Tue, Sep 03, 2019 at 09:55:49AM -0400, Michael S. Tsirkin wrote: > [..] > > What's with all of the TODOs? Some of these are really scary, > > looks like they need to be figured out before this is merged. > > Hi Michael, > > One of the issue I noticed is races w.r.t device removal and super > block initialization. I am about to post a set of patches which > take care of these races and also get rid of some of the scary > TODOs. Other TODOs like suspend/restore, multiqueue support etc > are improvements which we can do over a period of time. > > [..] > > > +/* Per-virtqueue state */ > > > +struct virtio_fs_vq { > > > + spinlock_t lock; > > > + struct virtqueue *vq; /* protected by ->lock */ > > > + struct work_struct done_work; > > > + struct list_head queued_reqs; > > > + struct delayed_work dispatch_work; > > > + struct fuse_dev *fud; > > > + bool connected; > > > + long in_flight; > > > + char name[24]; > > > > I'd keep names somewhere separate as they are not used on data path. > > Ok, this sounds like a nice to have. Will take care of this once base > patch gets merged. > > [..] > > > +struct virtio_fs_forget { > > > + struct fuse_in_header ih; > > > + struct fuse_forget_in arg; > > > > These structures are all native endian. > > > > Passing them to host will make cross-endian setups painful to support, > > and hardware implementations impossible. > > > > How about converting everything to LE? > > So looks like endianness issue is now resolved (going by the other > emails). So I will not worry about it. > > [..] > > > +/* Add a new instance to the list or return -EEXIST if tag name exists*/ > > > +static int virtio_fs_add_instance(struct virtio_fs *fs) > > > +{ > > > + struct virtio_fs *fs2; > > > + bool duplicate = false; > > > + > > > + mutex_lock(&virtio_fs_mutex); > > > + > > > + list_for_each_entry(fs2, &virtio_fs_instances, list) { > > > + if (strcmp(fs->tag, fs2->tag) == 0) > > > + duplicate = true; > > > + } > > > + > > > + if (!duplicate) > > > + list_add_tail(&fs->list, &virtio_fs_instances); > > > > > > This is O(N^2) as it's presumably called for each istance. > > Doesn't scale - please switch to a tree or such. > > This is O(N) and not O(N^2) right? Addition of device is O(N), search > during mount is O(N). > > This is not a frequent event at all and number of virtiofs instances > per guest are expected to be fairly small (say less than 10). So I > really don't think there is any value in converting this into a tree > (instead of a list). > > [..] > > > +static void virtio_fs_free_devs(struct virtio_fs *fs) > > > +{ > > > + unsigned int i; > > > + > > > + /* TODO lock */ > > > > Doesn't inspire confidence, does it? > > Agreed. Getting rid of this in set of fixes I am about to post. > > > > > > + > > > + for (i = 0; i < fs->nvqs; i++) { > > > + struct virtio_fs_vq *fsvq = &fs->vqs[i]; > > > + > > > + if (!fsvq->fud) > > > + continue; > > > + > > > + flush_work(&fsvq->done_work); > > > + flush_delayed_work(&fsvq->dispatch_work); > > > + > > > + /* TODO need to quiesce/end_requests/decrement dev_count */ > > > > Indeed. Won't this crash if we don't? > > Took care of this as well. > > [..] > > > +static void virtio_fs_hiprio_dispatch_work(struct work_struct *work) > > > +{ > > > + struct virtio_fs_forget *forget; > > > + struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq, > > > + dispatch_work.work); > > > + struct virtqueue *vq = fsvq->vq; > > > + struct scatterlist sg; > > > + struct scatterlist *sgs[] = {&sg}; > > > + bool notify; > > > + int ret; > > > + > > > + pr_debug("virtio-fs: worker %s called.\n", __func__); > > > + while (1) { > > > + spin_lock(&fsvq->lock); > > > + forget = list_first_entry_or_null(&fsvq->queued_reqs, > > > + struct virtio_fs_forget, list); > > > + if (!forget) { > > > + spin_unlock(&fsvq->lock); > > > + return; > > > + } > > > + > > > + list_del(&forget->list); > > > + if (!fsvq->connected) { > > > + spin_unlock(&fsvq->lock); > > > + kfree(forget); > > > + continue; > > > + } > > > + > > > + sg_init_one(&sg, forget, sizeof(*forget)); > > > > This passes to host a structure including "struct list_head list"; > > > > Not a good idea. > > Ok, host does not have to see "struct list_head list". Its needed for > guest. Will look into getting rid of this. > > > > > > > > + > > > + /* Enqueue the request */ > > > + dev_dbg(&vq->vdev->dev, "%s\n", __func__); > > > + ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC); > > > > > > This is easier as add_outbuf. > > Will look into it. > > > > > Also - why GFP_ATOMIC? > > Hmm..., may be it can be GFP_KERNEL. I don't see atomic context here. Will > look into it. > > > > > > + if (ret < 0) { > > > + if (ret == -ENOMEM || ret == -ENOSPC) { > > > + pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later\n", > > > + ret); > > > + list_add_tail(&forget->list, > > > + &fsvq->queued_reqs); > > > + schedule_delayed_work(&fsvq->dispatch_work, > > > + msecs_to_jiffies(1)); > > > > Can't we we requeue after some buffers get consumed? > > That's what dispatch work is doing. It tries to requeue the request after > a while. > > [..] > > > +static int virtio_fs_probe(struct virtio_device *vdev) > > > +{ > > > + struct virtio_fs *fs; > > > + int ret; > > > + > > > + fs = devm_kzalloc(&vdev->dev, sizeof(*fs), GFP_KERNEL); > > > + if (!fs) > > > + return -ENOMEM; > > > + vdev->priv = fs; > > > + > > > + ret = virtio_fs_read_tag(vdev, fs); > > > + if (ret < 0) > > > + goto out; > > > + > > > + ret = virtio_fs_setup_vqs(vdev, fs); > > > + if (ret < 0) > > > + goto out; > > > + > > > + /* TODO vq affinity */ > > > + /* TODO populate notifications vq */ > > > > what's notifications vq? > > It has not been implemented yet. At some point of time we want to have > a notion of notification queue so that host can send notifications to > guest. Will get rid of this comment for now. > > [..] > > > +#ifdef CONFIG_PM_SLEEP > > > +static int virtio_fs_freeze(struct virtio_device *vdev) > > > +{ > > > + return 0; /* TODO */ > > > +} > > > + > > > +static int virtio_fs_restore(struct virtio_device *vdev) > > > +{ > > > + return 0; /* TODO */ > > > +} > > > > Is this really a good idea? I'd rather it was implemented, > > but if not possible at all disabling PM seems better than just > > keep going. > > I agree. Will look into disabling it. > > > > > > +#endif /* CONFIG_PM_SLEEP */ > > > + > > > +const static struct virtio_device_id id_table[] = { > > > + { VIRTIO_ID_FS, VIRTIO_DEV_ANY_ID }, > > > + {}, > > > +}; > > > + > > > +const static unsigned int feature_table[] = {}; > > > + > > > +static struct virtio_driver virtio_fs_driver = { > > > + .driver.name = KBUILD_MODNAME, > > > + .driver.owner = THIS_MODULE, > > > + .id_table = id_table, > > > + .feature_table = feature_table, > > > + .feature_table_size = ARRAY_SIZE(feature_table), > > > + /* TODO validate config_get != NULL */ > > > > Why? > > Don't know. Stefan, do you remember why did you put this comment? If not, > I will get rid of it.This comment can be removed.> > > +static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq) > > > +__releases(fiq->waitq.lock) > > > +{ > > > + unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */ > > > + struct virtio_fs *fs; > > > + struct fuse_conn *fc; > > > + struct fuse_req *req; > > > + struct fuse_pqueue *fpq; > > > + int ret; > > > + > > > + WARN_ON(list_empty(&fiq->pending)); > > > + req = list_last_entry(&fiq->pending, struct fuse_req, list); > > > + clear_bit(FR_PENDING, &req->flags); > > > + list_del_init(&req->list); > > > + WARN_ON(!list_empty(&fiq->pending)); > > > + spin_unlock(&fiq->waitq.lock); > > > + > > > + fs = fiq->priv; > > > + fc = fs->vqs[queue_id].fud->fc; > > > + > > > + dev_dbg(&fs->vqs[queue_id].vq->vdev->dev, > > > + "%s: opcode %u unique %#llx nodeid %#llx in.len %u out.len %u\n", > > > + __func__, req->in.h.opcode, req->in.h.unique, req->in.h.nodeid, > > > + req->in.h.len, fuse_len_args(req->out.numargs, req->out.args)); > > > + > > > + fpq = &fs->vqs[queue_id].fud->pq; > > > + spin_lock(&fpq->lock); > > > + if (!fpq->connected) { > > > + spin_unlock(&fpq->lock); > > > + req->out.h.error = -ENODEV; > > > + pr_err("virtio-fs: %s disconnected\n", __func__); > > > + fuse_request_end(fc, req); > > > + return; > > > + } > > > + list_add_tail(&req->list, fpq->processing); > > > + spin_unlock(&fpq->lock); > > > + set_bit(FR_SENT, &req->flags); > > > + /* matches barrier in request_wait_answer() */ > > > + smp_mb__after_atomic(); > > > + /* TODO check for FR_INTERRUPTED? */ > > > > > > ? > > hmm... we don't support FR_INTERRUPTED. Stefan, do you remember why > this TODO is here. If not, I will get rid of it.We don't support FUSE_INTERRUPT yet. The purpose of this comment is that when we do support FUSE_INTERRUPT we'll need to follow fuse_dev_do_read() in queuing a FUSE_INTERRUPT here. Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20190906/1b6c4a75/attachment-0001.sig>