Richard W.M. Jones
2018-Mar-08 12:29 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Thu, Mar 08, 2018 at 12:13:01PM +0000, Nir Soffer wrote:
> On Wed, Mar 7, 2018 at 12:18 AM Richard W.M. Jones <rjones@redhat.com>
> wrote:
>
> > Previous versions:
> > v3: https://www.redhat.com/archives/libguestfs/2018-March/msg00000.html
> > v2: https://www.redhat.com/archives/libguestfs/2018-February/msg00177.html
> > v1: https://www.redhat.com/archives/libguestfs/2018-February/msg00139.html
> >
> > This completely rethinks the approach taken by the previous patches.
> >
> > Instead of trying to involve qemu's curl driver, this uses a small
> > Python 3 nbdkit plugin to interface between qemu and the oVirt server.
> >
> > The data path is:
> >
> >   qemu-img convert -------> nbdkit -------> oVirt imageio
> >                      nbd            https
>
> What is the advantage of this for raw files? Why not:
>
>   v2v -> ovirt imageio?

Not sure I understand what you mean?  virt-v2v always runs ‘qemu-img
convert’ to do the copy (not conversion), so the question is how do we
connect qemu-img to the oVirt server.  One way would be to extend qemu
so it knows how to write on an https connection (which it does not do
now) but that has a number of disadvantages, as well as being hard to
implement.

> And how will qcow2 files be handled?

We'll add ‘-O qcow2’ to the qemu-img convert command line, and qemu
will then write out a qcow2 file.  However it's not quite so
straightforward (and in fact I haven't got it to work yet).  qemu will
first try to read from the target (invoking pread calls in the nbdkit
Python plugin which will try to read from oVirt over https).
Unfortunately it fails here for a couple of reasons:

(1) My pread method is broken.  I saw your suggested fixes to it (and
pwrite) and will try those later.

(2) In any case it won't work because the disk at this point is empty
and full of zeroes, and it's looking for a qcow2 header.  To fix this
we'll have to write a qcow2 header to the disk first (TBD).

> When I tried nbdkit a few months ago I could not make it handle qcow2
> files. Maybe I had to write a plugin?

NBD (the protocol) doesn't "know" about qcow2 files.  You can serve
any file you want as a range of bytes, including qcow2, but that
requires whatever is consuming those bytes to then do the qcow2
en-/decoding.  (Which means effectively the client has to be qemu
because nothing else can parse qcow2 reliably).  In the qemu-img
convert case above this all works because qemu-img (ie. qemu) is the
client, and it does the encoding of qcow2, and we're just shuffling a
byte stream to oVirt imageio.

> We considered using this flow when we download/upload images,
> to support on-the-fly image conversion:
>
>   raw file -> qemu-img convert -> nbdkit -> qcow2 stream -> imageio ->
>   http client
>
> And same for uploading, e.g. uploading qcow2 and writing raw image.
>
> If this is possible using an nbdkit plugin, can we reuse the same plugin
> in different applications, or must we implement the plugin in each
> application?

This should be possible (modulo fixing issues (1) & (2) above).  The
plugin I have written is very specific to the virt-v2v task, but it
could be evolved into something which would handle this case.

nbdkit is designed around the idea that you can make small plugins in
familiar scripting languages for specific tasks.

I did think about having a generic "oVirt plugin" which we'd ship with
upstream nbdkit, but making it generic enough to handle a useful range
of cases seemed difficult.

> > There are two Python scripts included.  One is the nbdkit plugin.  The
> > other creates the VM.  As with the previous patches, these scripts get
> > embedded in virt-v2v at compile time, so effectively there is no API
> > contract between virt-v2v & the Python code.
> >
> > With this patch series I am able to (mostly) successfully convert VMs
> > from local disk to oVirt 4.2, with full end-to-end streaming.  There
> > is some room for optimization -- in particular uploads are currently
> > rather slow because we rely on qemu-img batching small requests into
> > large ones which it doesn't do well, and instead the nbdkit plugin
> > could batch small writes into larger ones.  Also I noticed (but only
> > one time) that very long transfers would cause the oVirt ticket to
> > expire, even though we were writing the whole time.
>
> On the host, the ticket is extended regularly, based on the activity.
>
> On the proxy we currently have a 3600 second timeout, and the ticket
> is never extended. I think we should have the same mechanism as
> we do on the host.

I have only seen this once and never again, so hopefully it was just a
network blip causing a > 3600 second timeout.

If it happens again I'll take a closer look.

Thanks for the review, it does seem like pwrite is rather broken.
Unfortunately my ovirt node hangs hard when I try to boot any guest
(seems like a kernel or even hardware bug) so I have never been able
to test that the transferred guest works :-(

Rich.

> Nir
>
> > There are still a few unresolved issues (see patch 3/3) so this is not
> > quite ready to go upstream yet, but can still be reviewed.  Patches 1
> > & 2 are the same as posted before.
> >
> > I did not yet test qcow2 uploads.  Those are "interestingly" different
> > because qcow2 will require us to read from the remote oVirt server as
> > well as just stream/write to it.  The pread method for that is written
> > but has not been tested.
> >
> > Rich.
> >
> > _______________________________________________
> > Libguestfs mailing list
> > Libguestfs@redhat.com
> > https://www.redhat.com/mailman/listinfo/libguestfs

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
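
The cover letter quoted in this message suggests that the nbdkit plugin could batch small writes into larger ones rather than relying on qemu-img to do so. The following is only a minimal sketch of that idea, not code from the patch series. `WriteBatcher`, `send` and `max_batch` are hypothetical names, and `send` stands in for whatever would issue the real https request to imageio.

```python
class WriteBatcher:
    """Coalesce contiguous small writes into larger ones (illustrative only)."""

    def __init__(self, send, max_batch=4 * 1024 * 1024):
        self.send = send            # hypothetical callback: send(offset, data)
        self.max_batch = max_batch  # flush once this many bytes are pending
        self.start = None           # offset of the first pending byte
        self.pending = bytearray()

    def pwrite(self, buf, offset):
        contiguous = (self.start is not None
                      and offset == self.start + len(self.pending))
        if not contiguous:
            self.flush()            # a non-sequential write ends the batch
            self.start = offset
        self.pending += buf
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(self.start, bytes(self.pending))
        self.start = None
        self.pending = bytearray()


# Demo: two adjacent writes are merged, the distant one is sent separately.
sent = []
batcher = WriteBatcher(lambda off, data: sent.append((off, data)))
batcher.pwrite(b"aa", 0)
batcher.pwrite(b"bb", 2)
batcher.pwrite(b"cc", 100)
batcher.flush()
```

A plugin built this way would presumably also have to call `flush()` before serving any read and when the connection closes, otherwise pending data could be lost or read back stale.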
Nir Soffer
2018-Mar-08 14:31 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Thu, 8 Mar 2018 at 14:29, Richard W.M. Jones <rjones@redhat.com> wrote:
> (2) In any case it won't work because the disk at this point is empty
> and full of zeroes, and it's looking for a qcow2 header.  To fix this
> we'll have to write a qcow2 header to the disk first (TBD).

When you create a disk using qcow2 format, oVirt creates an empty qcow2
image with the specified virtual size, so it should work.

> I did think about having a generic "oVirt plugin" which we'd ship with
> upstream nbdkit, but making it generic enough to handle a useful range
> of cases seemed difficult.

I agree. When we have a working plugin we will try to adapt it to our
use case.
Richard W.M. Jones
2018-Mar-08 14:35 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Thu, Mar 08, 2018 at 02:31:48PM +0000, Nir Soffer wrote:
> When you create a disk using qcow2 format, oVirt creates an empty qcow2
> image with the specified virtual size, so it should work.

Ah didn't know that.  That nicely solves #2.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
Eric Blake
2018-Mar-08 15:41 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On 03/08/2018 06:29 AM, Richard W.M. Jones wrote:
> NBD (the protocol) doesn't "know" about qcow2 files.  You can serve
> any file you want as a range of bytes, including qcow2, but that
> requires whatever is consuming those bytes to then do the qcow2
> en-/decoding.  (Which means effectively the client has to be qemu
> because nothing else can parse qcow2 reliably).  In the qemu-img
> convert case above this all works because qemu-img (ie. qemu) is the
> client, and it does the encoding of qcow2, and we're just shuffling a
> byte stream to oVirt imageio.

One caveat: NBD cannot (yet) resize disks (there is a proposal to
implement a new command that would optionally allow an NBD client to
request a resize, and/or a server to advertise an updated size back
to the client beyond the initial size learned at connect, but that
proposal still needs ironing out and an initial implementation).  As
such, if you are serving a qcow2 file as raw bytes over NBD, you
MUST be sure that the NBD server already has a sufficient size for
the file it is serving, because the client interpreting qcow2 will
not be able to do anything if its qcow2 usage patterns require more
space than the NBD server already advertised.

When pairing qemu-nbd as server with qemu or qemu-io as client, your
options are:

1. 'qemu-nbd -f qcow2' + 'qemu-io -f raw': the client sees only what
   the guest would see, and no qcow2 metadata.  The server may resize
   the host's qcow2 image as needed to continue to provide the same
   size guest image to the client, but the client can't take direct
   advantage of any qcow2 features.  This is easiest.

2. 'qemu-nbd -f raw' + 'qemu-io -f qcow2': the client sees the qcow2
   metadata directly, and must interpret from that what the guest
   would see.  The server never resizes the image, so hopefully you
   preallocated it to be large enough for anything the guest wants to
   do with qcow2.  The client can use whatever qcow2 features it
   wants.  This is where I've seen people fail to preallocate, and
   then wonder why the client fails.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Richard W.M. Jones
2018-Mar-08 16:21 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Thu, Mar 08, 2018 at 09:41:38AM -0600, Eric Blake wrote:
> On 03/08/2018 06:29 AM, Richard W.M. Jones wrote:
>
> > NBD (the protocol) doesn't "know" about qcow2 files.  You can serve
> > any file you want as a range of bytes, including qcow2, but that
> > requires whatever is consuming those bytes to then do the qcow2
> > en-/decoding.  (Which means effectively the client has to be qemu
> > because nothing else can parse qcow2 reliably).  In the qemu-img
> > convert case above this all works because qemu-img (ie. qemu) is the
> > client, and it does the encoding of qcow2, and we're just shuffling a
> > byte stream to oVirt imageio.
>
> One caveat: NBD cannot (yet) resize disks (there is a proposal to
> implement a new command that would optionally allow an NBD client to
> request a resize, and/or a server to advertise an updated size back
> to the client beyond the initial size learned at connect, but that
> proposal still needs ironing out and an initial implementation).  As
> such, if you are serving a qcow2 file as raw bytes over NBD, you
> MUST be sure that the NBD server already has a sufficient size for
> the file it is serving, because the client interpreting qcow2 will
> not be able to do anything if its qcow2 usage patterns require more
> space than the NBD server already advertised.

That's a good point actually.  The current patch set assumes size ==
virtual size, which is OK for raw, but could be insufficient (in
extremis) for qcow2.  Added to the to-do list.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
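
To illustrate why "size == virtual size" can fall short for qcow2, here is a rough back-of-the-envelope estimate of how large a fully allocated qcow2 file could get. It is a sketch under stated assumptions, not what the patch set does: it assumes 64 KiB clusters, 16-bit refcounts, 8-byte L1/L2/refcount-table entries, and no snapshots or other optional features. `qcow2_max_size` is a hypothetical helper name. The `qemu-img measure` subcommand is the more trustworthy way to get a real figure.

```python
def qcow2_max_size(virtual_size, cluster_size=65536, refcount_bits=16):
    """Rough estimate of a fully allocated qcow2 file size, in bytes."""

    def div_up(a, b):
        return -(-a // b)

    data = div_up(virtual_size, cluster_size)    # guest data clusters
    l2 = div_up(data, cluster_size // 8)         # L2 tables, 8 bytes per entry
    l1 = div_up(l2 * 8, cluster_size)            # L1 table, 8 bytes per L2 table
    header = 1                                   # header cluster
    fixed = header + l1 + l2 + data

    # Refcount blocks and the refcount table must themselves be refcounted,
    # so iterate until the cluster count stops growing.
    entries_per_block = cluster_size * 8 // refcount_bits
    refblocks = reftable = 0
    while True:
        total = fixed + refblocks + reftable
        new_blocks = div_up(total, entries_per_block)
        new_table = div_up(new_blocks * 8, cluster_size)
        if (new_blocks, new_table) == (refblocks, reftable):
            return total * cluster_size
        refblocks, reftable = new_blocks, new_table


# Metadata overhead, in bytes, for a 1 GiB virtual disk under these assumptions.
overhead = qcow2_max_size(1 << 30) - (1 << 30)
```

The overhead is small relative to the disk, which matches the "in extremis" remark above, but a target sized exactly to the virtual size leaves no room for it at all.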
Richard W.M. Jones
2018-Mar-09 14:25 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
It has to be said it would be really convenient to have a 'zero'
and/or 'trim' method of some sort.

qemu-img tries hard to trim the whole disk before using it.
Unfortunately it does this in different ways across RHEL 7 and
upstream.  With upstream I managed a workaround based on ignoring any
zero requests which arrive before the first write.  However it's not
so easy to do this for RHEL 7's qemu which is issuing mixed writes and
zeroes in different orders.

How hard would it be to implement a special https request in imageio
for zeroing (better still, either zeroing or trimming) a range of
bytes?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
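
The workaround described in this message, ignoring zero requests that arrive before the first write, can be sketched roughly as follows. This is an illustration only, not the code from the patch series. `ZeroFilter` and the `backend` object with `pwrite()` and `zero()` methods are hypothetical, and the sketch assumes the newly created target disk already reads back as zeroes.

```python
class ZeroFilter:
    """Drop zero requests seen before the first write (illustrative only)."""

    def __init__(self, backend):
        self.backend = backend   # hypothetical object with pwrite() and zero()
        self.written = False

    def pwrite(self, buf, offset):
        self.written = True
        self.backend.pwrite(buf, offset)

    def zero(self, count, offset):
        if not self.written:
            # Assumption: a fresh disk is already all zeroes, so skip this.
            return
        self.backend.zero(count, offset)


class RecordingBackend:
    """Test double that records which calls reach the backend."""

    def __init__(self):
        self.calls = []

    def pwrite(self, buf, offset):
        self.calls.append(("pwrite", offset, len(buf)))

    def zero(self, count, offset):
        self.calls.append(("zero", offset, count))


backend = RecordingBackend()
f = ZeroFilter(backend)
f.zero(1 << 20, 0)        # before any write: ignored
f.pwrite(b"data", 0)
f.zero(512, 4096)         # after a write: passed through
```

As the message notes, this only helps when all the zeroing comes first. Once writes and zeroes are interleaved, every later zero request still has to be carried out by the backend.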
Nir Soffer
2018-Mar-12 07:13 UTC
Re: [Libguestfs] [PATCH v4 0/3] v2v: Add -o rhv-upload output mode.
On Fri, Mar 9, 2018 at 4:25 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> It has to be said it would be really convenient to have a 'zero'
> and/or 'trim' method of some sort.

'trim' means discard?

Currently we cannot support discard on block storage since oVirt may
need to wipe LVs when deleting a disk, and discarding may leave
unwiped user data.

This may change in 4.3 if we switch to wipe on creation instead of
wipe after delete.

> qemu-img tries hard to trim the whole disk before using it.
> Unfortunately it does this in different ways across RHEL 7 and
> upstream.  With upstream I managed a workaround based on ignoring any
> zero requests which arrive before the first write.  However it's not
> so easy to do this for RHEL 7's qemu which is issuing mixed writes and
> zeroes in different orders.
>
> How hard would it be to implement a special https request in imageio
> for zeroing (better still, either zeroing or trimming) a range of
> bytes?

Supporting efficient zero makes sense.

We plan to support it via a special sparse format, see:
https://gerrit.ovirt.org/#/c/85413/

We have a demo here:
https://gerrit.ovirt.org/#/c/85468/

This will not help your use case when you want to mix read/write/zero
requests, but we can use the same infrastructure.

We plan to use fallocate for file based storage:
https://gerrit.ovirt.org/#/c/85512/

and BLKZEROOUT for block storage:
https://gerrit.ovirt.org/#/c/85537/

and some dumb zero loop if these options are not available.

So we need to map the zero operation to http - how about:

    POST /images/ticket-id
    ...
    ...
    {
        "op": "zero",
        "offset": X,
        "size": Y
    }

I would like to support only aligned offset and size - do you think
it should work for qemu-img?

Adding this with a dumb zero loop can be done quickly. We can make
it more efficient later.

Nir
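
As a rough illustration of the proposal above, the sketch below builds the suggested JSON body on the client side and shows what a "dumb zero loop" fallback might look like on the server side. None of this is imageio code. `zero_request` and `zero_loop` are hypothetical names, and the 512-byte alignment is an assumption, since the thread does not name a value.

```python
import io
import json

ALIGN = 512  # assumed alignment; the thread does not specify one


def zero_request(offset, size, align=ALIGN):
    """Build the JSON body of the proposed zero operation."""
    if offset % align or size % align:
        raise ValueError("offset and size must be aligned")
    return json.dumps({"op": "zero", "offset": offset, "size": size})


def zero_loop(f, offset, size, chunk=1024 * 1024):
    """Dumb fallback: write explicit zeroes to a seekable binary file."""
    f.seek(offset)
    zeroes = bytes(chunk)
    remaining = size
    while remaining > 0:
        n = min(chunk, remaining)
        f.write(zeroes[:n])
        remaining -= n


# Demo on an in-memory "disk" filled with 0xff bytes.
disk = io.BytesIO(b"\xff" * 4096)
body = zero_request(512, 1024)
req = json.loads(body)
zero_loop(disk, req["offset"], req["size"], chunk=300)
```

The loop is slow but works on any storage, which fits the plan of shipping it first and switching to fallocate or BLKZEROOUT where those are available.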