Nir Soffer
2022-Apr-13 18:32 UTC
[Libguestfs] [PATCH] -o rhv-upload: wait for VM creation task
On Tue, Apr 12, 2022 at 9:35 PM Tom?? Golembiovsk? <tgolembi at redhat.com> wrote:> > oVirt API call for VM creation finishes before the VM is actually > created. Entities may be still locked after virt-v2v terminates and if > user tries to perform (scripted) actions after virt-v2v those operations > may fail. To prevent this it is useful to monitor the task and wait for > the completion. This will also help to prevent some corner case > scenarios (that would be difficult to debug) when the VM creation job > fails after virt-v2v already termintates with success. > > Thanks: Nir Soffer > Signed-off-by: Tom?? Golembiovsk? <tgolembi at redhat.com> > Reviewed-by: Arik Hadas <ahadas at redhat.com> > --- > output/rhv-upload-createvm.py | 57 ++++++++++++++++++++++++++++++++++- > 1 file changed, 56 insertions(+), 1 deletion(-) > > diff --git a/output/rhv-upload-createvm.py b/output/rhv-upload-createvm.py > index 50bb7e34..c6a6fbd6 100644 > --- a/output/rhv-upload-createvm.py > +++ b/output/rhv-upload-createvm.py > @@ -19,12 +19,54 @@ > import json > import logging > import sys > +import time > +import uuid > > from urllib.parse import urlparse > > import ovirtsdk4 as sdk > import ovirtsdk4.types as types > > + > +def debug(s): > + if params['verbose']: > + print(s, file=sys.stderr) > + sys.stderr.flush() > + > + > +def jobs_completed(system_service, correlation_id): > + jobs_service = system_service.jobs_service() > + > + try: > + jobs = jobs_service.list( > + search="correlation_id=%s" % correlation_id) > + except sdk.Error as e: > + debug( > + "Error searching for jobs with correlation id %s: %s" % > + (correlation_id, e)) > + # We dont know, assume that jobs did not complete yet.don't?> + return False > + > + # STARTED is the only "in progress" status, other mean the job has"other" -> "anything else"?> + # already terminatedMissing . at the end of the comment.> + if all(job.status != types.JobStatus.STARTED for job in jobs): > + failed_jobs = [(job.description, str(job.status)) > + for job in jobs > + if job.status != types.JobStatus.FINISHED] > + if failed_jobs: > + raise RuntimeError( > + "Failed to create a VM! Failed jobs: %r" % failed_jobs) > + return True > + else: > + jobs_status = [(job.description, str(job.status)) for job in jobs]jobs_status is a little confusing since this is a list of (description, status) tuples. Maybe "running_jobs"? It is also more consistent with "failed_jobs" above.> + debug("Some jobs with correlation id %s are running: %s" % > + (correlation_id, jobs_status)) > + return False > + > + > +# Seconds to wait for the VM import job to complete in oVirt. > +timeout = 5 * 60 > + > # Parameters are passed in via a JSON doc from the OCaml code. > # Because this Python code ships embedded inside virt-v2v there > # is no formal API here. > @@ -67,6 +109,7 @@ system_service = connection.system_service() > cluster = system_service.clusters_service().cluster_service(params['rhv_cluster_uuid']) > cluster = cluster.get() > > +correlation_id = str(uuid.uuid4()) > vms_service = system_service.vms_service() > vm = vms_service.add( > types.Vm( > @@ -77,5 +120,17 @@ vm = vms_service.add( > data=ovf, > ) > ) > - ) > + ), > + query={'correlation_id': correlation_id}, > ) > + > +# Wait for the import job to finish > +endt = time.time() + timeoutSince we use python 3, it is better to use time.monotonic() which is affected by system time changes.> +while True: > + time.sleep(1)Since we wait up to 300 seconds, maybe use a longer delay? Or maybe we don't need to wait for 300 seconds?> + if jobs_completed(system_service, correlation_id): > + break > + if time.time() > endt: > + raise RuntimeError( > + "Timed out waiting for VM creation!" > + " Jobs still running for correlation id %s" % correlation_id) > -- > 2.35.1 > > _______________________________________________ > Libguestfs mailing list > Libguestfs at redhat.com > https://listman.redhat.com/mailman/listinfo/libguestfsWith or without suggested minor improvements, Reviewed-by: Nir Soffer <nsoffer at redhat.com> Nir
Richard W.M. Jones
2022-Apr-14 08:09 UTC
[Libguestfs] [PATCH] -o rhv-upload: wait for VM creation task
On Wed, Apr 13, 2022 at 09:32:14PM +0300, Nir Soffer wrote:> > +# Wait for the import job to finish > > +endt = time.time() + timeout > > Since we use python 3, it is better to use time.monotonic() > which is affected by system time changes.Interesting - we use time.time() quite a lot at the moment. How about the following patch? It passes tests for me locally (so only against the test harness, not real RHV). Rich. diff --git a/output/rhv-upload-finalize.py b/output/rhv-upload-finalize.py index 4d1dcfb2f4..1221e766ac 100644 --- a/output/rhv-upload-finalize.py +++ b/output/rhv-upload-finalize.py @@ -73,7 +73,7 @@ def finalize_transfer(connection, transfer_id, disk_id): .image_transfers_service() .image_transfer_service(transfer_id)) - start = time.time() + start = time.monotonic() transfer_service.finalize() @@ -125,14 +125,14 @@ def finalize_transfer(connection, transfer_id, disk_id): raise RuntimeError( "transfer %s was paused by system" % (transfer.id,)) - if time.time() > start + timeout: + if time.monotonic() > start + timeout: raise RuntimeError( "timed out waiting for transfer %s to finalize, " "transfer is %s" % (transfer.id, transfer.phase)) debug("transfer %s finalized in %.3f seconds" - % (transfer_id, time.time() - start)) + % (transfer_id, time.monotonic() - start)) # Parameters are passed in via a JSON doc from the OCaml code. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org