Martin Kletzander
2019-Oct-11 07:37 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Thu, Oct 10, 2019 at 03:33:25PM +0100, Richard W.M. Jones wrote:>On Wed, Oct 09, 2019 at 02:19:46PM +0200, Martin Kletzander wrote: >> Even though this option is not to be used according to the manual, it: >> >> a) still might be useful even for machine-readable logs >> >> b) should not break the machine-readable output > >I'm a bit confused what you're trying to do here. >When working on a conversion where the disk is already copied from the source hypervisor to the destination storage (due to various reasons, e.g. having faster access to one of the storages or being able to use the disk_sync script to prefetch some of the data before the machine gets shut down) I need to utilize virt-v2v for two specific purposes: 1) to create the output metadata and 2) to make the changes in the guest disks (which are already copied) but not for the third thing that virt-v2v does as well, which is copying the disks from source to destination. For case #1 I initially used virt-v2v --no-copy, which creates the metadata, but discards the changes made to the disks in the overlays and for #2 I used --in-place which changes the data on the source (for which I crafted a specific XML to be used with -i libvirtxml), but does not output any metadata for the destination. There are couple of issues I am facing with this: a) I need to run the conversion twice b) --in-place does not know about the destination (and it does not do the same thing as if it knew) c) It should only be done using --no-copy first and --in-place second just to make sure the modifications are not done twice (although this is just to stay on the safe side). It still felt like a massive hack to me and while I still held on to the idea of keeping the overlays around without actually copying the data anywhere I found the --debug-overlays option. It fits my use case perfectly because what I am doing now (with this patch) is: i) run virt-v2v --no-copy with the proper destination ii) take the overlays and run qemu-img commit on them iii) remove the overlays This way all of my concerns are solved and the only one that arises is the fact that --debug-overlays is documented as "... only used for debugging and may be removed in a future version." One interesting thing is that I only found that information in the man page after I wrote this patch and started using it. The reason for it was that I started actually writing something very similar to this and I stumbled upon the code that already does that (with --debug-overlays). I also thought that if someone runs virt-v2v --machine-readable --debug-overlays that the machine readable output would be broken because the output is not json, but it looks like (I'm just trying it now) --machine-readable does not write any data to stdout or stderr, even when specified as --machine-readable=stream:stdout (or stderr) for simple libvirt2libvirt migration. So point (b) from the commit message is probably invalid. Oh, actually, I don't see it being used for anything else than printing apabilities and estimates. I wonder where the json messages in machine readable logfile that virt-v2v-wrapper requests are coming from. One last thing is that currently any application that wants to run virt-v2v (virt-v2v-wrapper in my case) needs to parse the output to get any meaningful data from it. What I tried here was using a new type of message called 'data' that upper applications could just parse and use. In the long run I could add more data to the output so that it is easier to use from virt-v2v-wrapper and other applications. But that is not something that is necessary for this particular patch (or the discussion about the --debug-overlays being supported in the long term), it just made it easier. Does that make sense? For some more context/info, see How I used it in a PoC here: https://github.com/nertpinx/v2v-conversion-host/blob/asdf/wrapper/log_parser.py#L71>Rich. > >> Signed-off-by: Martin Kletzander <mkletzan@redhat.com> >> --- >> v2v/v2v.ml | 29 ++++++++++++++++++++++------- >> 1 file changed, 22 insertions(+), 7 deletions(-) >> >> diff --git a/v2v/v2v.ml b/v2v/v2v.ml >> index 4ee15663f261..508a2b4f39a5 100644 >> --- a/v2v/v2v.ml >> +++ b/v2v/v2v.ml >> @@ -815,13 +815,28 @@ and actual_target_size target_file disk_stats >> >> (* Save overlays if --debug-overlays option was used. *) >> and preserve_overlays overlays src_name >> - List.iter ( >> - fun ov -> >> - let saved_filename >> - sprintf "%s/%s-%s.qcow2" overlay_dir src_name ov.ov_sd in >> - rename ov.ov_overlay_file saved_filename; >> - info (f_"Overlay saved as %s [--debug-overlays]") saved_filename >> - ) overlays >> + let filenames = List.map ( >> + fun ov -> >> + let saved_filename >> + sprintf "%s/%s-%s.qcow2" overlay_dir src_name ov.ov_sd in >> + rename ov.ov_overlay_file saved_filename; >> + saved_filename >> + ) overlays in >> + match machine_readable () with >> + | None -> >> + List.iter ( >> + fun filename -> >> + info (f_"Overlay saved as %s [--debug-overlays]") filename >> + ) filenames >> + | Some {pr} -> >> + let json = [ >> + "type", JSON.String "data"; >> + "data", JSON.Dict [ >> + "saved_overlays", >> + JSON.List (List.map (fun s -> JSON.String s) filenames); >> + ] >> + ] in >> + pr "%s\n" (JSON.string_of_doc json) >> >> (* Request guest caps based on source configuration. *) >> and rcaps_from_source source >> -- >> 2.23.0 > >-- >Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones >Read my programming and virtualization blog: http://rwmj.wordpress.com >virt-top is 'top' for virtual machines. Tiny program with many >powerful monitoring features, net stats, disk stats, logging, etc. >http://people.redhat.com/~rjones/virt-top
Martin Kletzander
2019-Oct-11 07:47 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Fri, Oct 11, 2019 at 09:37:18AM +0200, Martin Kletzander wrote:>On Thu, Oct 10, 2019 at 03:33:25PM +0100, Richard W.M. Jones wrote: >>On Wed, Oct 09, 2019 at 02:19:46PM +0200, Martin Kletzander wrote: >>> Even though this option is not to be used according to the manual, it: >>> >>> a) still might be useful even for machine-readable logs >>> >>> b) should not break the machine-readable output >> >>I'm a bit confused what you're trying to do here. >> > >When working on a conversion where the disk is already copied from the source >hypervisor to the destination storage (due to various reasons, e.g. having >faster access to one of the storages or being able to use the disk_sync script >to prefetch some of the data before the machine gets shut down) I need to >utilize virt-v2v for two specific purposes: > > 1) to create the output metadata and > > 2) to make the changes in the guest disks (which are already copied) > >but not for the third thing that virt-v2v does as well, which is copying the >disks from source to destination. > >For case #1 I initially used virt-v2v --no-copy, which creates the metadata, but >discards the changes made to the disks in the overlays and for #2 I used >--in-place which changes the data on the source (for which I crafted a specific >XML to be used with -i libvirtxml), but does not output any metadata for the >destination. > >There are couple of issues I am facing with this: > > a) I need to run the conversion twice > > b) --in-place does not know about the destination (and it does not do the same > thing as if it knew) > > c) It should only be done using --no-copy first and --in-place second just to > make sure the modifications are not done twice (although this is just to > stay on the safe side). > >It still felt like a massive hack to me and while I still held on to the idea of >keeping the overlays around without actually copying the data anywhere I found >the --debug-overlays option. It fits my use case perfectly because what I am >doing now (with this patch) is: > > i) run virt-v2v --no-copy with the proper destination > > ii) take the overlays and run qemu-img commit on them > > iii) remove the overlays > >This way all of my concerns are solved and the only one that arises is the fact >that --debug-overlays is documented as "... only used for debugging and may be >removed in a future version." > >One interesting thing is that I only found that information in the man page >after I wrote this patch and started using it. The reason for it was that I >started actually writing something very similar to this and I stumbled upon the >code that already does that (with --debug-overlays). > >I also thought that if someone runs virt-v2v --machine-readable --debug-overlays >that the machine readable output would be broken because the output is not json, >but it looks like (I'm just trying it now) --machine-readable does not write any >data to stdout or stderr, even when specified as >--machine-readable=stream:stdout (or stderr) for simple libvirt2libvirt >migration. So point (b) from the commit message is probably invalid. Oh, actually, I don't see it being used for anything else than printing apabilities and estimates. I wonder where the json messages in machine readable logfile that virt-v2v-wrapper requests are coming from. > >One last thing is that currently any application that wants to run virt-v2v (virt-v2v-wrapper in my case) needs to parse the output to get any meaningful data from it. What I tried here was using a new type of message called 'data' that upper applications could just parse and use. In the long run I could add more data to the output so that it is easier to use from virt-v2v-wrapper and other applications. But that is not something that is necessary for this particular patch (or the discussion about the --debug-overlays being supported in the long term), it just made it easier. > >Does that make sense? > >For some more context/info, see How I used it in a PoC here: > > https://github.com/nertpinx/v2v-conversion-host/blob/asdf/wrapper/log_parser.py#L71 >It's also a way easier and sustainable way of what I implemented in a different and weird way on top of this ancient patch: https://www.redhat.com/archives/libguestfs/2019-April/msg00253.html where virt-v2v was the one who actually ran qemu-img commit.
Richard W.M. Jones
2019-Oct-11 12:29 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
Thanks for explaining how this was going to work on IRC. Stepping back I think the problem is we're shoehorning features into a tool (virt-v2v) which was designed to do cold conversions, and was never meant to do either warm or in-place conversions. The tool was hacked a while back for in-place but that mode is very awkward to use and we have never enabled it in RHEL for good reason. What would a tool which reused virt-v2v code and did what we really want look like? The basic flow of objects in existing v2v/v2v.ml is: * let input, output = parse_cmdline () The -i and -o options from the command line are parsed, and "input" and "output" objects constructed. * let source = open_source input The input metadata is read (eg. libvirt XML, VMware VMX) and an internal "source" object is created which models this metadata, plus (almost as a side-effect) links to the disks. In the following discussion remember that "source" == "metadata". * let overlays = create_overlays source.s_disks in let g = Guestfs.guestfs () in populate_overlays g overlays The overlays are created and added as disks to a guestfs handle * let inspect = Inspect_source.inspect_source g in let guestcaps = do_convert g inspect output in This does inspection + conversion. (The real do_convert function takes the source struct as parameter, but it is not really used.) * let targets = output#prepare_targets overlays in copy_targets targets input output This creates the target disks and copies them. (The real output#prepare_targets method takes the source struct as a parameter, but the actual objects only use the source.s_name and .s_hypervisor fields). * output#create_metadata source targets guestcaps inspect This creates the target metadata. What you want -- copy done elsewhere, convert in place, create metadata -- is a very different flow. It could look something like this: * let overlays = create_overlays source.s_disks in let g = Guestfs.guestfs () in populate_overlays g overlays let inspect = Inspect_source.inspect_source g in let guestcaps = do_convert g inspect output in let targets = output#prepare_targets overlays in for each overlay: qemu-img commit it output#create_metadata source targets guestcaps inspect We need the source (ie. metadata) struct in the last step in order to create the metadata, so that still needs to be read from somewhere, and maybe -i libvirtxml is as good a place as any. I was going to come to a proper conclusion here, but I'm not yet sure what it is at the moment. I'll think about this a bit more. If you have any comments I'd like to hear them. However there are some action items: (1) do_convert does not need the source parameter. I think it could just be passed the .s_name field instead. (2) output#prepare_targets doesn't really need the source struct, but could probably get away with being passed just the .s_name field and maybe .s_hypervisor. (3) source.s_disks and the rest of the source struct seem to have sufficiently different usage that we can consider separating them. These changes would reduce coupling between stages. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Martin Kletzander
2019-Oct-11 14:17 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Fri, Oct 11, 2019 at 01:29:53PM +0100, Richard W.M. Jones wrote:>Thanks for explaining how this was going to work on IRC. > >Stepping back I think the problem is we're shoehorning features into a >tool (virt-v2v) which was designed to do cold conversions, and was >never meant to do either warm or in-place conversions. The tool was >hacked a while back for in-place but that mode is very awkward to use >and we have never enabled it in RHEL for good reason. > >What would a tool which reused virt-v2v code and did what we really >want look like? The basic flow of objects in existing v2v/v2v.ml is: >Ideally there would be distinction between metadata and disks, so that I could specify separately for disks and metadata where do they come from and where they should go. That way we could do things like "convert metadata from source to destination, but take disk data from the destination already and keep them there" and other things like that.>* let input, output = parse_cmdline () > > The -i and -o options from the command line are parsed, and > "input" and "output" objects constructed. > >* let source = open_source input > > The input metadata is read (eg. libvirt XML, VMware VMX) and an > internal "source" object is created which models this metadata, > plus (almost as a side-effect) links to the disks. In the > following discussion remember that "source" == "metadata". > >* let overlays = create_overlays source.s_disks in > let g = Guestfs.guestfs () in > populate_overlays g overlays > > The overlays are created and added as disks to a guestfs handle > >* let inspect = Inspect_source.inspect_source g in > let guestcaps = do_convert g inspect output in > > This does inspection + conversion. (The real do_convert function > takes the source struct as parameter, but it is not really used.) > >* let targets = output#prepare_targets overlays in > copy_targets targets input output > > This creates the target disks and copies them. (The real > output#prepare_targets method takes the source struct as a > parameter, but the actual objects only use the source.s_name and > .s_hypervisor fields). > >* output#create_metadata source targets guestcaps inspect > > This creates the target metadata. > >What you want -- copy done elsewhere, convert in place, create metadata -- >is a very different flow. It could look something like this: > >* let overlays = create_overlays source.s_disks in > let g = Guestfs.guestfs () in > populate_overlays g overlays > let inspect = Inspect_source.inspect_source g in > let guestcaps = do_convert g inspect output in > let targets = output#prepare_targets overlays in > for each overlay: qemu-img commit it > output#create_metadata source targets guestcaps inspect >From the calling application point of view the only difference here is that virt-v2v calls qemu-img commit instead of qemu-img convert. It is essentially what I would like to have, but what you are describing above only fits my use case and is not generally usable for upstream (at least how I see it, I don't know how else it is used).>We need the source (ie. metadata) struct in the last step in order to >create the metadata, so that still needs to be read from somewhere, >and maybe -i libvirtxml is as good a place as any. >If metadata and disks have separate input+output settings (which is an overkill in my opinion, at least in the current state) then this can come from the source while disks are coming from the destination already. But getting the xml and modifying it is not *that* big of a deal (until we find problems there).>I was going to come to a proper conclusion here, but I'm not yet sure >what it is at the moment. I'll think about this a bit more. If you >have any comments I'd like to hear them. >Purely from the usage point of view it would be the nicest thing if virt-v2v did not do so many things at once. I mean keep the functionality as is, but expose the separate steps so that they do not need to be duplicated. It would mean exposing the guest disk conversion as an API or a CLI tool, converting metadata between generic and specific formats as another one. Generic format would be for example similar to what -o json generates, keeping in mind that it would have to have *all* the data that virt-v2v wants to use. Then the composition of all the steps in different order and/or with changed input data would be way nicer. Well, I think it would, but it might also not be the case.>However there are some action items: > >(1) do_convert does not need the source parameter. I think it could >just be passed the .s_name field instead. > >(2) output#prepare_targets doesn't really need the source struct, but >could probably get away with being passed just the .s_name field and >maybe .s_hypervisor. > >(3) source.s_disks and the rest of the source struct seem to have >sufficiently different usage that we can consider separating them. > >These changes would reduce coupling between stages. >To be honest I do not know what you are trying to do here. If that helps getting supported version of --debug-overlays, then great. But to make it absolutely clear, what --debug-overlays is so close to what I wanted the whole time that the only better thing than that would be --commit-overlays and the only thing that it would help with would be that I don't have to run one easy command per disk. Given the only "problem" with --debug-overlays I can think of is the fact that it is marked as unsupported. Other than that, I'm very happy with this patch. And given the fact that it does not need scary code changes it also feels safe to use.>Rich. > >-- >Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones >Read my programming and virtualization blog: http://rwmj.wordpress.com >virt-builder quickly builds VMs from scratch >http://libguestfs.org/virt-builder.1.html
Richard W.M. Jones
2019-Oct-11 15:08 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Fri, Oct 11, 2019 at 01:29:53PM +0100, Richard W.M. Jones wrote:> However there are some action items: > > (1) do_convert does not need the source parameter. I think it could > just be passed the .s_name field instead. > > (2) output#prepare_targets doesn't really need the source struct, but > could probably get away with being passed just the .s_name field and > maybe .s_hypervisor.OK I fixed (1) and (2) already - trivial commits pushed.> (3) source.s_disks and the rest of the source struct seem to have > sufficiently different usage that we can consider separating them.I'll have a look at this. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2019-Oct-15 17:26 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Tue, Oct 15, 2019 at 04:53:03PM +0000, Roman Kagan wrote:> On Fri, Oct 11, 2019 at 01:29:53PM +0100, Richard W.M. Jones wrote: > > Stepping back I think the problem is we're shoehorning features into a > > tool (virt-v2v) which was designed to do cold conversions, and was > > never meant to do either warm or in-place conversions. The tool was > > hacked a while back for in-place but that mode is very awkward to use > > and we have never enabled it in RHEL for good reason. > > As I'm guilty for introducing --in-place, let me chime in with my 2c > and humbly disagree with it being awkward to use. I think the only > thing that's awkward is its tight integration into virt-v2v; it'd > make more sense being a separate tool whose sole purpose would be to > tune the guest to be bootable/operational in the provided > configuration.My original comments there are rather rude. What I should have said is the use of the input metadata (eg. --inplace -i libvirtxml) as a kind of configuration file for selecting the rcaps (eg. whether to install virtio drivers) are somewhat non-obvious, although as you say in the context of virt-v2v that's all that can be done. I would - same as you - like to split out virt-v2v much more, into more streamlined tools (but without breaking backwards compatibility or removing features). Still don't know how, but I'll read the rest of your email in more detail as part of thinking about this. At the moment I'm working on separating virt-v2v from the huge libguestfs repository, which I think is the first step to really refactoring it properly. Thanks, Rich.> > Such a tool (let's call it, say, virt-adopt or virt-adapt, depending on > whose POV you take, the guest's or the host's :) could have a number of > benefits: > > 1) it can be used outside of the current virt-v2v, e.g. > > a) v2v where the rest of migration is taken care of by something > else. E.g. we use it to migrate from our own older releases that > used a different hypervisor, and we're sure to do a better job > converting configurations and copying data than the generic tool. > Overall data/metadata migration is the place where plenty of > customization can be necessary, and the monolithic tool is too > rigid for that, so we've had a few other cases where it was easier > to write custom migration code and call the tool to adopt the > resulting guest. In particular, QCOW2 overlays are not always > possible or conveninent, data migration via the source hypervisor > control url is not necessarily the fastest, and so on. > > b) we use it to restore backups created with the older releases which > used incompatible virtual hardware. > > c) every now and then I wanted to change the hardware configuration > of an existing VM in an incompatible way, like changing the > chipset, the type of the storage or network controller, etc, and > this tool came in handy. > > 2) it can allow splitting the current virt-v2v into reusable steps. > E.g. it's a pity, having migrated a huge guest disk image, to find > out that due to some minor network misconfiguration on the host you > can't run yum in the guestfs appliance to install some missing bits > and thus can't complete the conversion, so, once the problem is > fixed, you have to start over. > > 3) it fits better with the libguestfs' purpose, which is to poke in the > guest's files. As a result, in particular, it has smaller testing > scope, and is thus easier to keep tested. (I wonder how many > contributors have the opportunity to test all virt-v2v's input and > output hypervisor combinations.) > > The only downside I see is that it requires some work to be done, and > I'm short of spare cycles to do that :( > > > What you want -- copy done elsewhere, convert in place, create metadata -- > > is a very different flow. It could look something like this: > > > > * let overlays = create_overlays source.s_disks in > > let g = Guestfs.guestfs () in > > populate_overlays g overlays > > let inspect = Inspect_source.inspect_source g in > > let guestcaps = do_convert g inspect output in > > let targets = output#prepare_targets overlays in > > for each overlay: qemu-img commit it > > output#create_metadata source targets guestcaps inspect > > FWIW our scenarios look somewhat different: > > 1) transfer vm disks and metadata > 2) (optional) inspect migrated guest > 3) create target vm based on (1) and (2) > 4) (optional) snapshot target vm (using overlays or whatever is appropriate) > 5) "adopt" target vm (using virt-v2v --in-place) > 6) (optional) boot target vm and let firstboot scripts run > 7) (optional) merge snapshots with the original disks > 8) release target vm to user > > Note, in particular, that step (5) makes no decisions re. the eventual > vm configuration; they are all made at step (3). Restore from old > backups just changes step (1). > > Thanks, and sorry for the long mail without patches, > Roman.-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Martin Kletzander
2019-Oct-16 13:33 UTC
Re: [Libguestfs] [PATCH] v2v: Output saved overlays in a machine-readable fashion
On Tue, Oct 15, 2019 at 06:26:23PM +0100, Richard W.M. Jones wrote:>On Tue, Oct 15, 2019 at 04:53:03PM +0000, Roman Kagan wrote: >> On Fri, Oct 11, 2019 at 01:29:53PM +0100, Richard W.M. Jones wrote: >> > Stepping back I think the problem is we're shoehorning features into a >> > tool (virt-v2v) which was designed to do cold conversions, and was >> > never meant to do either warm or in-place conversions. The tool was >> > hacked a while back for in-place but that mode is very awkward to use >> > and we have never enabled it in RHEL for good reason. >> >> As I'm guilty for introducing --in-place, let me chime in with my 2c >> and humbly disagree with it being awkward to use. I think the only >> thing that's awkward is its tight integration into virt-v2v; it'd >> make more sense being a separate tool whose sole purpose would be to >> tune the guest to be bootable/operational in the provided >> configuration. > >My original comments there are rather rude. What I should have said >is the use of the input metadata (eg. --inplace -i libvirtxml) as a >kind of configuration file for selecting the rcaps (eg. whether to >install virtio drivers) are somewhat non-obvious, although as you say >in the context of virt-v2v that's all that can be done. > >I would - same as you - like to split out virt-v2v much more, into >more streamlined tools (but without breaking backwards compatibility >or removing features). Still don't know how, but I'll read the rest >of your email in more detail as part of thinking about this. >I agree with both of you, as I also mentioned this idea in one of my previous e-mails. Of course I do not know how exactly to do that nicely and Rich will know way more on how to approach this. Martin P.S.: If you want some ideas for some usage scenarios and you cannot think of any, just let me know.
Possibly Parallel Threads
- Re: [PATCH] v2v: Output saved overlays in a machine-readable fashion
- Re: [PATCH] v2v: Output saved overlays in a machine-readable fashion
- Re: [PATCH] v2v: Output saved overlays in a machine-readable fashion
- Re: [PATCH v2 15/17] v2v: add --in-place mode
- Re: [PATCH] v2v: Output saved overlays in a machine-readable fashion