Nir Soffer
2022-Feb-12 20:49 UTC
[Libguestfs] [PATCH] v2v/v2v.ml: Use larger request size for -o rhv-upload
rhv-upload plugin is translating every NBD command to HTTP request,
translated back to NBD command on imageio server. The HTTP client and
server, and the NBD client on the imageio server side are synchronous
and implemented in python, so they have high overhead per request. To
get good performance we need to use larger request size.

Testing shows that request size of 8MiB is best, speeding up the copy
disk phase from 14.7 seconds to 7.7 seconds (1.9x faster).

Here are stats extracted from imageio log when importing Fedora 35 image
with 3 GiB of random data. For each copy, we have 4 connection stats.

Before:

connection 1 ops, 14.767843 s
dispatch 4023 ops, 11.427662 s
zero 38 ops, 0.053840 s, 327.91 MiB, 5.95 GiB/s
write 3981 ops, 8.975877 s, 988.61 MiB, 110.14 MiB/s
flush 4 ops, 0.001023 s

connection 1 ops, 14.770026 s
dispatch 4006 ops, 11.408732 s
zero 37 ops, 0.057205 s, 633.21 MiB, 10.81 GiB/s
write 3965 ops, 8.907420 s, 986.65 MiB, 110.77 MiB/s
flush 4 ops, 0.000280 s

connection 1 ops, 14.768180 s
dispatch 4057 ops, 11.430712 s
zero 42 ops, 0.030011 s, 470.47 MiB, 15.31 GiB/s
write 4011 ops, 9.002055 s, 996.98 MiB, 110.75 MiB/s
flush 4 ops, 0.000261 s

connection 1 ops, 14.770744 s
dispatch 4037 ops, 11.462050 s
zero 45 ops, 0.026668 s, 750.82 MiB, 27.49 GiB/s
write 3988 ops, 9.002721 s, 989.36 MiB, 109.90 MiB/s
flush 4 ops, 0.000282 s

After:

connection 1 ops, 7.776159 s
dispatch 181 ops, 6.701100 s
zero 27 ops, 0.219959 s, 5.97 MiB, 27.15 MiB/s
write 150 ops, 6.266066 s, 983.13 MiB, 156.90 MiB/s
flush 4 ops, 0.000299 s

connection 1 ops, 7.805616 s
dispatch 187 ops, 6.643718 s
zero 30 ops, 0.227808 s, 809.01 MiB, 3.47 GiB/s
write 153 ops, 6.306260 s, 1.02 GiB, 165.81 MiB/s
flush 4 ops, 0.000306 s

connection 1 ops, 7.780301 s
dispatch 191 ops, 6.535249 s
zero 47 ops, 0.228495 s, 693.31 MiB, 2.96 GiB/s
write 140 ops, 6.033484 s, 958.23 MiB, 158.82 MiB/s
flush 4 ops, 0.001618 s

connection 1 ops, 7.829294 s
dispatch 213 ops, 6.594207 s
zero 56 ops, 0.297876 s, 674.12 MiB, 2.21 GiB/s
write 153 ops, 6.070786 s, 974.56 MiB, 160.53 MiB/s
flush 4 ops, 0.000318 s

This is an ugly hack; the preferred request size should be a function of
the output module that only output_rhv_upload will override, but I don't
know how to implement this with the current code.

Another way is to add this as an output option; this will make it easier
to test and find the best setting that works in a real environment, or
tweak the value in a specific environment if needed.
---
 v2v/v2v.ml | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/v2v/v2v.ml b/v2v/v2v.ml
index fddb0742..b21e2737 100644
--- a/v2v/v2v.ml
+++ b/v2v/v2v.ml
@@ -578,37 +578,45 @@ read the man page virt-v2v(1).
     let input_socket = sprintf "%s/in%d" tmpdir i
     and output_socket = sprintf "%s/out%d" tmpdir i in
     if Sys.file_exists input_socket && Sys.file_exists output_socket then
       loop ((i, input_socket, output_socket) :: acc) (i+1)
     else
       List.rev acc
   in
   let disks = loop [] 0 in
   let nr_disks = List.length disks in
 
+  (* XXX This is a hack for -o rhv-upload that works best with larger
+   * request size.
+   *)
+  let request_size =
+    match output_mode with
+    | `RHV_Upload -> 8*1024*1024
+    | _ -> 0 in
+
   (* Copy the disks. *)
   List.iter (
     fun (i, input_socket, output_socket) ->
       message (f_"Copying disk %d/%d") (i+1) nr_disks;
 
       let input_uri = nbd_uri_of_socket input_socket
       and output_uri = nbd_uri_of_socket output_socket in
 
       (* In verbose mode print some information about each
        * side of the pipeline.
        *)
       if verbose () then (
         nbdinfo ~content:true input_uri;
         nbdinfo ~content:false output_uri
       );
 
-      nbdcopy output_alloc input_uri output_uri
+      nbdcopy output_alloc input_uri output_uri request_size
   ) disks;
 
   (* End of copying phase. *)
   unlink (tmpdir // "copy");
 
   (* Do the finalization step. *)
   message (f_"Creating output metadata");
   Output_module.finalize tmpdir output_poptions output_t
     source inspect target_meta;
 
@@ -627,26 +635,28 @@ read the man page virt-v2v(1).
  * appliance may be created there. (RHBZ#1316479, RHBZ#2051394)
  *)
 and check_host_free_space () =
   let free_space = StatVFS.free_space (StatVFS.statvfs large_tmpdir) in
   debug "check_host_free_space: large_tmpdir=%s free_space=%Ld"
         large_tmpdir free_space;
   if free_space < 1_073_741_824L then
     error (f_"insufficient free space in the conversion server temporary directory %s (%s).\n\nEither free up space in that directory, or set the LIBGUESTFS_CACHEDIR environment variable to point to another directory with more than 1GB of free space.\n\nSee also the virt-v2v(1) manual, section \"Minimum free space check in the host\".")
           large_tmpdir (human_size free_space)
 
-and nbdcopy output_alloc input_uri output_uri =
+and nbdcopy output_alloc input_uri output_uri request_size =
   (* XXX It's possible that some output modes know whether
    * --target-is-zero which would be a useful optimization.
    *)
   let cmd = ref [] in
   List.push_back_list cmd [ "nbdcopy"; input_uri; output_uri ];
+  if request_size != 0 then
+    List.push_back_list cmd ["--request-size"; string_of_int request_size];
   List.push_back cmd "--flush";
   (*List.push_back cmd "--verbose";*)
   if not (quiet ()) then List.push_back cmd "--progress";
   if output_alloc = Types.Preallocated then List.push_back cmd "--allocated";
   let cmd = !cmd in
 
   if run_command cmd <> 0 then
     error (f_"nbdcopy command failed, see earlier error messages")
 
 (* Run nbdinfo on a URI and dump the information to stderr.
-- 
2.34.1
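
A rough cross-check of the log figures quoted in the message above, as a small standalone OCaml sketch. It takes the write stats from the first connection of each run (copied by hand from the logs) and derives the average payload per write request and the write throughput. The helper names avg_write_kib and throughput are made up for this illustration and are not part of virt-v2v or imageio.

```ocaml
(* Figures copied from the first connection in the "Before" and "After"
 * logs: (write ops, seconds spent in write, MiB written). *)
let before = (3981, 8.975877, 988.61)
let after = (150, 6.266066, 983.13)

(* Average payload per write request, in KiB. *)
let avg_write_kib (ops, _, mib) = mib *. 1024. /. float_of_int ops

(* Write throughput in MiB/s. *)
let throughput (_, secs, mib) = mib /. secs

let () =
  Printf.printf "before: %.0f KiB/write, %.2f MiB/s\n"
    (avg_write_kib before) (throughput before);
  Printf.printf "after:  %.0f KiB/write, %.2f MiB/s\n"
    (avg_write_kib after) (throughput after);
  (* Whole-connection times from the same two log entries. *)
  Printf.printf "speedup: %.2fx\n" (14.767843 /. 7.776159)
```

On these numbers the average write grows from roughly 254 KiB to about 6.5 MiB while the number of write requests drops from 3981 to 150, which fits the explanation that per-request overhead dominates.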
Richard W.M. Jones
2022-Feb-13 09:40 UTC
[Libguestfs] [PATCH] v2v/v2v.ml: Use larger request size for -o rhv-upload
On Sat, Feb 12, 2022 at 10:49:42PM +0200, Nir Soffer wrote:
> rhv-upload plugin is translating every NBD command to HTTP request,
> translated back to NBD command on imageio server. The HTTP client and
> server, and the NBD client on the imageio server side are synchronous
> and implemented in python, so they have high overhead per request. To
> get good performance we need to use larger request size.
>
> Testing shows that request size of 8MiB is best, speeding up the copy
> disk phase from 14.7 seconds to 7.7 seconds (1.9x faster).

Unfortunately this will break VDDK since it cannot handle very large
requests (I think 4M is about the max without reconfiguring the
server).  Also larger requests have adverse performance effects in
other configurations, although I understand this patch tries to
restrict the change to when the output mode is rhv-upload.

We need to think of some other approach, but I'm not sure what it is.
I'd really like to be able to talk to imageio's NBD server directly!

Other relevant commits:

https://github.com/libguestfs/virt-v2v/commit/7ebb2c8db9d4d297fbbef116a9828a9dde700de6
https://github.com/libguestfs/virt-v2v/commit/08e764959ec9dadd71a95d22d3d88d647a18d165

[...]
> This is an ugly hack; the preferred request size should be a function of
> the output module that only output_rhv_upload will override, but I don't
> know how to implement this with the current code.

Just add a new value to output/output.ml{,i}.  There is no
superclassing (this is not OO) so you'll have to add the value to
every output module implementation, defaulting to None.

However I'd like to think of another approach first.

 - Have nbdcopy split and combine requests so request size for input
   and output can be different?  Sounds complicated but might be
   necessary one day to support minimum block size.

 - More efficient Python plugin that might combine requests?  Also
   complicated ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
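
The suggestion in the reply above (a new per-output-module value, defaulting to None) could look roughly like the following standalone sketch. Everything here is an assumption made for illustration: the OUTPUT signature is heavily simplified, and the names Rhv_upload, Other_output and request_size_args are hypothetical and do not come from output/output.ml. Only the --request-size flag is taken from the patch itself.

```ocaml
(* Hypothetical, simplified signature: the real output modules in
 * virt-v2v have many more members than this. *)
module type OUTPUT = sig
  val name : string

  (* Preferred nbdcopy request size in bytes.  None means "do not pass
   * --request-size", i.e. keep nbdcopy's own default. *)
  val request_size : int option
end

(* Only this module overrides the default. *)
module Rhv_upload : OUTPUT = struct
  let name = "rhv-upload"
  let request_size = Some (8 * 1024 * 1024)
end

(* Placeholder for every other output module. *)
module Other_output : OUTPUT = struct
  let name = "other"
  let request_size = None
end

(* Extra nbdcopy arguments derived from the chosen output module. *)
let request_size_args (module O : OUTPUT) =
  match O.request_size with
  | Some n -> [ "--request-size"; string_of_int n ]
  | None -> []

let () =
  List.iter
    (fun (module O : OUTPUT) ->
      Printf.printf "%s: [%s]\n" O.name
        (String.concat " " (request_size_args (module O : OUTPUT))))
    [ (module Rhv_upload : OUTPUT); (module Other_output : OUTPUT) ]
```

Using an int option instead of the 0 sentinel from the patch makes the "no preference" case explicit in the type, at the cost of touching every output module once.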