Richard W.M. Jones
2020-Aug-07  14:07 UTC
Re: [Libguestfs] [PATCH nbdkit] file: Implement cache=none and fadvise=normal|random|sequential.
On Fri, Aug 07, 2020 at 04:43:12PM +0300, Nir Soffer wrote:
> On Fri, Aug 7, 2020, 16:16 Richard W.M. Jones <rjones@redhat.com> wrote:
> > I'm not sure if or even how we could ever do a robust O_DIRECT
> >
>
> We can let the plugin and filter deal with that. The simplest solution is to
> drop it on the user and require aligned requests.

I mean this is very error prone. It requires the end user to know
about the basically unknowable restrictions of O_DIRECT and isn't even
possible in one common case - if the size of the file isn't an exact
multiple of the filesystem block size.

> Maybe a filter can handle alignment?
>
> > implementation, but my idea was that it might be an alternate
> > implementation of cache=none. But if we thought we might use O_DIRECT
> > as a separate mode, then maybe we should rename cache=none.
> > cache=advise? cache=dontneed? I can't think of a good name!
> >
>
> Yes, don't call it none if you use the cache.
>
> How about advise=?
>
> I would keep cache semantics similar to qemu.

qemu uses cache=none as a synonym for O_DIRECT, but AFAIK it has
nothing that tries to use posix_fadvise(DONTNEED) with or without
Linus's double buffering technique. qemu does use
posix_fadvise(DONTNEED) in one place but AFAICT it is only used for
live migration.

...

> We already tried this with dd and the results were not good.

These ones?
https://www.redhat.com/archives/libguestfs/2020-August/msg00078.html

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
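The alignment problem raised above can be made concrete with a little arithmetic. This is only an illustrative sketch (in Python for brevity, not code from nbdkit): the helper names and the 4096-byte block size are assumptions, and it checks offset and length only. Real O_DIRECT also constrains the alignment of the memory buffer, and the required alignment depends on the filesystem and device.

```python
def is_aligned(offset: int, length: int, block_size: int) -> bool:
    """Return True if both offset and length are multiples of block_size.

    Hypothetical helper: an O_DIRECT request that fails this check is
    likely to be rejected (typically with EINVAL) on Linux.
    """
    return offset % block_size == 0 and length % block_size == 0


def unaligned_tail(file_size: int, block_size: int) -> int:
    """Return how many bytes at the end of the file do not fill a block.

    A non-zero result is the "common case" from the message above: the
    last part of the file cannot be covered by whole-block requests.
    """
    return file_size % block_size


BLOCK_SIZE = 4096  # assumed value; the real one must be discovered per device

print(is_aligned(0, 4096, BLOCK_SIZE))      # aligned request
print(is_aligned(512, 4096, BLOCK_SIZE))    # offset not a block multiple
print(unaligned_tail(10000, BLOCK_SIZE))    # bytes left over past the last full block
```

Requiring aligned requests from the client pushes exactly these checks onto the end user, which is the error-prone part.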
Nir Soffer
2020-Aug-07  14:29 UTC
Re: [Libguestfs] [PATCH nbdkit] file: Implement cache=none and fadvise=normal|random|sequential.
On Fri, Aug 7, 2020 at 5:07 PM Richard W.M. Jones <rjones@redhat.com> wrote:
>
> On Fri, Aug 07, 2020 at 04:43:12PM +0300, Nir Soffer wrote:
> > On Fri, Aug 7, 2020, 16:16 Richard W.M. Jones <rjones@redhat.com> wrote:
> > > I'm not sure if or even how we could ever do a robust O_DIRECT
> > >
> >
> > We can let the plugin and filter deal with that. The simplest solution is to
> > drop it on the user and require aligned requests.
>
> I mean this is very error prone. It requires the end user to know
> about the basically unknowable restrictions of O_DIRECT and isn't even
> possible in one common case - if the size of the file isn't an exact
> multiple of the filesystem block size.

Yes, doing direct I/O is hard; even qemu still has bugs in this area
that pop up from time to time.

It is fine to fail at open time if the size of the image is not aligned
to the underlying block size. However, finding the underlying block size
is a can of worms :-)

> > Maybe a filter can handle alignment?
> >
> > > implementation, but my idea was that it might be an alternate
> > > implementation of cache=none. But if we thought we might use O_DIRECT
> > > as a separate mode, then maybe we should rename cache=none.
> > > cache=advise? cache=dontneed? I can't think of a good name!
> > >
> >
> > Yes, don't call it none if you use the cache.
> >
> > How about advise=?
> >
> > I would keep cache semantics similar to qemu.
>
> qemu uses cache=none as a synonym for O_DIRECT, but AFAIK it has
> nothing that tries to use posix_fadvise(DONTNEED) with or without
> Linus's double buffering technique.

Yes, this is the right way. posix_fadvise is not a replacement for O_DIRECT.

> qemu does use
> posix_fadvise(DONTNEED) in one place but AFAICT it is only used for
> live migration.
>
> ...
> > We already tried this with dd and the results were not good.
>
> These ones?
> https://www.redhat.com/archives/libguestfs/2020-August/msg00078.html

No, we had a bug where copying an image from glance caused sanlock
timeouts because of the unpredictable page cache flushes.

We tried to use fadvise but it did not help. The only way to avoid such
issues is with O_SYNC or O_DIRECT. O_SYNC is much slower but this is the
path we took for now in this flow.
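For readers unfamiliar with the O_SYNC option mentioned above, here is a minimal sketch of what such a copy might look like. It is in Python for brevity and is not the code used in the flow Nir describes. The function name and chunk size are assumptions. With O_SYNC each write returns only after the data has been handed to storage, so dirty pages do not pile up to be flushed later in a burst, at the cost of much slower writes.

```python
import os


def copy_with_o_sync(src_path: str, dst_path: str,
                     chunk_size: int = 1024 * 1024) -> int:
    """Copy src_path to dst_path, opening the destination with O_SYNC.

    Hypothetical helper for illustration. Returns the number of bytes
    written.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    dst_fd = os.open(dst_path, flags, 0o644)
    total = 0
    try:
        with open(src_path, "rb") as src:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                # os.write may write fewer bytes than requested, so loop
                # until the whole chunk has been written.
                view = memoryview(chunk)
                while len(view) > 0:
                    n = os.write(dst_fd, view)
                    view = view[n:]
                    total += n
    finally:
        os.close(dst_fd)
    return total
```

Unlike O_DIRECT, this places no alignment requirements on the caller, which is one reason it is the simpler path to take.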
Richard W.M. Jones
2020-Aug-07  14:36 UTC
Re: [Libguestfs] [PATCH nbdkit] file: Implement cache=none and fadvise=normal|random|sequential.
On Fri, Aug 07, 2020 at 05:29:24PM +0300, Nir Soffer wrote:
> On Fri, Aug 7, 2020 at 5:07 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > These ones?
> > https://www.redhat.com/archives/libguestfs/2020-August/msg00078.html
>
> No, we had a bug where copying an image from glance caused sanlock
> timeouts because of the unpredictable page cache flushes.
>
> We tried to use fadvise but it did not help. The only way to avoid such
> issues is with O_SYNC or O_DIRECT. O_SYNC is much slower but this is the
> path we took for now in this flow.

I'm interested in more background about this, because while it is true
that O_DIRECT and POSIX_FADV_DONTNEED are not exactly equivalent, I
think I've shown here that DONTNEED can be used to avoid polluting the
page cache.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
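As a rough illustration of the DONTNEED idea discussed in this thread, the sketch below writes a range, flushes it, and then advises the kernel that the range is no longer needed. It is in Python for brevity and is deliberately simpler than the patch under discussion: it does not implement the double buffering technique mentioned earlier, and the function name is an assumption. The flush comes first because on Linux POSIX_FADV_DONTNEED does not affect pages that have not yet been written out.

```python
import os


def write_and_drop_cache(fd: int, data: bytes, offset: int) -> int:
    """Write data at offset, flush it, then ask the kernel to drop it
    from the page cache.

    Hypothetical helper for illustration. Returns the number of bytes
    written.
    """
    view = memoryview(data)
    written = 0
    # os.pwrite may perform a short write, so loop until done.
    while written < len(data):
        written += os.pwrite(fd, view[written:], offset + written)

    # Dirty pages are not dropped by DONTNEED, so flush them first.
    os.fsync(fd)

    # posix_fadvise is not available on every platform; it is only a
    # hint, so skipping it does not change the data written.
    if hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
    return written
```

Calling fsync after every write is slow, so a real implementation would want to flush asynchronously and drop the previous range while the current one is in flight.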