Nir Soffer
2018-Mar-12 07:13 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Fri, Mar 9, 2018 at 4:25 PM Richard W.M. Jones <rjones@redhat.com> wrote:> It has to be said it would be really convenient to have a 'zero' > and/or 'trim' method of some sort. >'trim' means discard? Currently we cannot support discard on block storage since ovirt may need to wipe lvs when deleting a disk, and discarding may leave unwiped user data. This may change in 4.3 if we switch to wipe on creation instead of wipe after delete.> qemu-img tries hard to trim the whole disk before using it. > Unfortunately it does this in different ways across RHEL 7 and > upstream. With upstream I managed a workaround based on ignoring any > zero requests which arrive before the first write. However it's not > so easy to do this for RHEL 7's qemu which is issuing mixed writes and > zeroes in different orders. > > How hard would it be to implement a special https request in imageio > for zeroing (better still, either zeroing or trimming) a range of > bytes? >Supporting efficient zero makes sense. We plan to support it via a special sparse format, see: https://gerrit.ovirt.org/#/c/85413/ We have a demo here: https://gerrit.ovirt.org/#/c/85468/ This will not help you use case when you want to mix read/write/zero requests, but we can use the same infrastructure. We plan to use fallocate for file based storage: https://gerrit.ovirt.org/#/c/85512/ and BLKZEROOUT for block storage: https://gerrit.ovirt.org/#/c/85537/ and some dumb zero loop if these options are not available. So we need to map the zero operation to http - how about: POST /images/ticket-id ... ... { "op": "zero", "offset": X, "size": Y } I would like to support only aligned offset and size - do you think it should work for qemu-img? Adding this with dumb zero loop can be done quickly. We can make it more efficient later. Nir
Nir Soffer
2018-Mar-12 10:19 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Mon, Mar 12, 2018 at 9:13 AM Nir Soffer <nirsof@gmail.com> wrote:> On Fri, Mar 9, 2018 at 4:25 PM Richard W.M. Jones <rjones@redhat.com> > wrote: > >> It has to be said it would be really convenient to have a 'zero' >> and/or 'trim' method of some sort. >> > > 'trim' means discard? > > Currently we cannot support discard on block storage since ovirt may need > to wipe > lvs when deleting a disk, and discarding may leave unwiped user data. This > may > change in 4.3 if we switch to wipe on creation instead of wipe after > delete. > > >> qemu-img tries hard to trim the whole disk before using it. >> Unfortunately it does this in different ways across RHEL 7 and >> upstream. With upstream I managed a workaround based on ignoring any >> zero requests which arrive before the first write. However it's not >> so easy to do this for RHEL 7's qemu which is issuing mixed writes and >> zeroes in different orders. >> >> How hard would it be to implement a special https request in imageio >> for zeroing (better still, either zeroing or trimming) a range of >> bytes? >> > > Supporting efficient zero makes sense. We plan to support it via a special > sparse format, see: > https://gerrit.ovirt.org/#/c/85413/ > > We have a demo here: > https://gerrit.ovirt.org/#/c/85468/ > > This will not help you use case when you want to mix read/write/zero > requests, > but we can use the same infrastructure. We plan to use fallocate for file > based > storage: > https://gerrit.ovirt.org/#/c/85512/ > and BLKZEROOUT for block storage: > https://gerrit.ovirt.org/#/c/85537/ > and some dumb zero loop if these options are not available. > > So we need to map the zero operation to http - how about: > > POST /images/ticket-id ... > ... > { > "op": "zero", > "offset": X, > "size": Y > } > > I would like to support only aligned offset and size - do you think it > should work > for qemu-img? > > Adding this with dumb zero loop can be done quickly. We can make it more > efficient later. >I posted this https://gerrit.ovirt.org/#/c/88793/ If this work for you I think we can include this in the next 4.2 build. Nir
Richard W.M. Jones
2018-Mar-12 10:32 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Mon, Mar 12, 2018 at 07:13:52AM +0000, Nir Soffer wrote:> On Fri, Mar 9, 2018 at 4:25 PM Richard W.M. Jones <rjones@redhat.com> wrote: > > > It has to be said it would be really convenient to have a 'zero' > > and/or 'trim' method of some sort. > > > > 'trim' means discard?Yes. The 5 functions we could support are: * pread - done * pwrite - done * flush - does fdatasync(2) on the block device * zero - write a range of zeroes without having to send zeroes * trim - punch hole, can be emulated using zero if not possile Also (not implemented in nbdkit today, but coming soon), pwrite, zero and trim can be extended with a FUA (force unit access) flag, which would mean that the range should be persisted to disk before returning. It can be emulated by calling flush after the operation. It wasn't clear if anything in this process flushes the content to disk. Is that what transfer.finalize does?> Currently we cannot support discard on block storage since ovirt may > need to wipe lvs when deleting a disk, and discarding may leave > unwiped user data. This may change in 4.3 if we switch to wipe on > creation instead of wipe after delete.I think this depends on BLKDISCARDZEROES[1] for the block device? Of course if you're worried about data remanence for someone who has access to the physical device then that wouldn't be enough. [1] https://rwmj.wordpress.com/2014/03/11/blkdiscard-blkzeroout-blkdiscardzeroes-blksecdiscard/> POST /images/ticket-id ... > ... > { > "op": "zero", > "offset": X, > "size": Y > } > > I would like to support only aligned offset and size - do you think it > should work > for qemu-img?It depends a bit on what you mean by "aligned" and what the alignment is. We'd probably have to work around it in the plugin so that it can round in the request, issues a zero operation for the aligned part, and writes zeroes at each end. There's no guarantee that qemu-img will be well-behaved in the future even if it is now. Anyway this sounds do-able, is it something we can get for 4.2? How will we detect that the server supports it? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2018-Mar-12 10:34 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Mon, Mar 12, 2018 at 10:19:02AM +0000, Nir Soffer wrote:> I posted this > https://gerrit.ovirt.org/#/c/88793/ > > If this work for you I think we can include this in the next 4.2 build.Looks good, but we need to know something about the alignment required for these requests. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Nir Soffer
2018-Mar-12 12:13 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Mon, Mar 12, 2018 at 12:32 PM Richard W.M. Jones <rjones@redhat.com> wrote:> On Mon, Mar 12, 2018 at 07:13:52AM +0000, Nir Soffer wrote: > > On Fri, Mar 9, 2018 at 4:25 PM Richard W.M. Jones <rjones@redhat.com> > wrote: > > > > > It has to be said it would be really convenient to have a 'zero' > > > and/or 'trim' method of some sort. > > > > > > > 'trim' means discard? > > Yes. The 5 functions we could support are: > > * pread - done > * pwrite - done > * flush - does fdatasync(2) on the block device >Currently we do fsync() on every PUT request, so flush is not very useful.> * zero - write a range of zeroes without having to send zeroes > * trim - punch hole, can be emulated using zero if not possile >Makes sense.> Also (not implemented in nbdkit today, but coming soon), pwrite, zero > and trim can be extended with a FUA (force unit access) flag, which > would mean that the range should be persisted to disk before > returning. It can be emulated by calling flush after the operation.It wasn't clear if anything in this process flushes the content to> disk. Is that what transfer.finalize does? >All PUT requests fsync() before returning. We optimize for complete image trasfer, not for random io.> Currently we cannot support discard on block storage since ovirt may > > need to wipe lvs when deleting a disk, and discarding may leave > > unwiped user data. This may change in 4.3 if we switch to wipe on > > creation instead of wipe after delete. > > I think this depends on BLKDISCARDZEROES[1] for the block device? Of > course if you're worried about data remanence for someone who has > access to the physical device then that wouldn't be enough. >BLKDISCARDZEROES was never reliable and it was removed from the kernel recently. Please check https://patchwork.kernel.org/patch/9903757/ We are not relying on it since ovirt 4.2, hopefully also in 4.1.> [1] > https://rwmj.wordpress.com/2014/03/11/blkdiscard-blkzeroout-blkdiscardzeroes-blksecdiscard/ > > > POST /images/ticket-id ... > > ... > > { > > "op": "zero", > > "offset": X, > > "size": Y > > } > > > > I would like to support only aligned offset and size - do you think it > > should work > > for qemu-img? > > It depends a bit on what you mean by "aligned" and what the alignment > is. We'd probably have to work around it in the plugin so that it can > round in the request, issues a zero operation for the aligned part, > and writes zeroes at each end. There's no guarantee that qemu-img > will be well-behaved in the future even if it is now. >Aligned for direct I/O (we use direct I/O for read/write). We can support non-aligned ranges by doing similar emulation in the server, but I prefer to do it only if we have such requirement. If you need to do this in the client, we probably need to do this in the server otherwise all clients may need to emulate this. I think there is no reason that qemu-img will zero unaligned ranges, but I guess Eric can have a better answer.> Anyway this sounds do-able, is it something we can get for 4.2? >I think it is for 4.2.z. Is zero support in the daemon and proxy enough, or we need the other options now?> How will we detect that the server supports it? >Because we don't support OPTIONS yet, the only way is to send a POST request and checking for 405 response. Nir
Reasonably Related Threads
- Re: [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
- Re: [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
- Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk
- v2v: -o rhv-upload - oVirt imageio random I/O APIs
- Re: [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.