Nir Soffer
2021-Aug-03 18:18 UTC
[Libguestfs] [PATCH virt-v2v 1/2] v2v: -o rhv-upload: Split plugin functions from create/finalize
On Tue, Aug 3, 2021 at 8:48 PM Richard W.M. Jones <rjones at redhat.com> wrote:
>
> On Tue, Aug 03, 2021 at 07:51:03PM +0300, Nir Soffer wrote:
> > On Tue, Aug 3, 2021 at 3:10 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > >
> > > The existing plugin created a new disk in open() and tried to finalize
> > > it in close().  A correct NBD client may make several connections
> > > (eg. connecting just to query for multi-conn or other metadata).  This
> > > could result in multiple disks being created.
> > >
> > > Let's fix this by using Nir's suggestion to split out the create and
> > > finalize functions.  The nbdkit plugin becomes a much more
> > > straightforward plugin - it no longer depends on the oVirt SDK at all
> > > so could in theory be rewritten in another programming language.  The
> > > other functions have been moved into new scripts ("transfer",
> > > "finalize", "cancel") which are run separately.
> > >
> > > Some subtler points about this commit:
> > >
> > > - We no longer use or need the @failing decorator.  I have extended
> > >   the cleanup code so as well as deleting disks it also cancels the
> > >   transfers.  Any failure detected by qemu-img convert will result in
> > >   virt-v2v exiting and cancelling the transfer, which is much cleaner
> > >   and more robust.
> >
> > Canceling the transfer on errors is fine, but deleting a disk *used* by
> > a transfer will fail since the transfer owns the disk.  Once the transfer
> > has been created there is no need to delete the disk; the transfer will
> > delete the disk on errors.
>
> (Commented on below)
>
> > > - We still require the HTTP pool.  The reason is a bit subtle: Since
> > >   we are using the parallel thread model, pread and pwrite (etc) can
> > >   be called at any time, even overlapping, so storing an HTTP handle
> > >   per NBD connection does not work as the handle gets reused (which
> > >   results in a fatal error).
> >
> > Do we need the parallel threading model now?  If we create a python
> > instance per connection, one http connection to imageio is best.
>
> I did consider this, but I thought it would be better to keep the
> parallel thread model and multiplex the requests down to a fixed number
> of HTTP connections (in the pool).  In the eventual multi-conn case it
> would look like:
>
>          +----------+        +---------+
>     -----| nbdkit  p|--------| imageio |
>     -----|         o|--------|         |
>     -----|         o|  (8)   |         |
>     -----|         l|--------|         |
>          +----------+        +---------+
>     4 multi-conn        HTTP
>     NBD connections
>
> On the left you'd have up to 4 x 64 NBD requests coming in.  In the
> middle (assuming no HTTP pipelining is possible) the requests are
> issued by multiple threads on to the 8 HTTP connections to imageio.
>
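For illustration, a queue-based pool along these lines might look like the
following (a simplified sketch, not the actual plugin code; it assumes "url"
is a parsed urllib URL and it omits TLS setup and error handling):

    import queue
    import http.client

    MAX_CONNECTIONS = 8

    def create_http_pool(url, options):
        # One pool shared by the whole nbdkit process; any worker thread
        # borrows a connection, sends one request, and returns it.
        pool = queue.Queue()
        count = min(options["max_readers"],
                    options["max_writers"],
                    MAX_CONNECTIONS)
        for _ in range(count):
            pool.put(http.client.HTTPSConnection(url.hostname, url.port))
        return pool

    def http_request(pool, method, path, body=None):
        # Borrow a connection for the duration of a single request.
        conn = pool.get()
        try:
            conn.request(method, path, body=body)
            resp = conn.getresponse()
            resp.read()
            return resp.status
        finally:
            pool.put(conn)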
> > > diff --git a/v2v/rhv-upload-deletedisks.py b/v2v/rhv-upload-cancel.py
> > > similarity index 80%
> > > rename from v2v/rhv-upload-deletedisks.py
> > > rename to v2v/rhv-upload-cancel.py
> > > index 8ed8cb88b..f08713b8c 100644
> > > --- a/v2v/rhv-upload-deletedisks.py
> > > +++ b/v2v/rhv-upload-cancel.py
> > > @@ -1,6 +1,6 @@
> > >  # -*- python -*-
> > > -# oVirt or RHV upload delete disks used by ‘virt-v2v -o rhv-upload’
> > > -# Copyright (C) 2019 Red Hat Inc.
> > > +# oVirt or RHV upload cancel used by ‘virt-v2v -o rhv-upload’
> > > +# Copyright (C) 2019-2021 Red Hat Inc.
> > >  #
> > >  # This program is free software; you can redistribute it and/or modify
> > >  # it under the terms of the GNU General Public License as published by
> > > @@ -57,13 +57,22 @@ connection = sdk.Connection(
> > >  )
> >
> > You should close the connection (helps engine manage its sessions).
> > The best way is using:
> >
> >     from contextlib import closing
> >
> >     with closing(connection):
> >         code using the connection here...
> >
> > I'm not sure if connection.close() does more than socket.close(), but it
> > is better to be explicit.
>
> OK I'll add this.
>
> It's worth noting that in my local copy I'm now closing the HTTP
> connections in the plugin on exit, which I think wasn't in the patch
> posted.  It does unfortunately require a small change to nbdkit to do
> this (https://gitlab.com/nbdkit/nbdkit/-/commit/f2fe99e4b0f54467ab8028eaf2d039cf918b2961).
>
> > >  system_service = connection.system_service()
> > > +image_transfers_service = system_service.image_transfers_service()
> > > +
> > > +# Try to cancel the transfers.  This will probably delete the disk.
> >
> > This *will* delete the disk, not probably.
> >
> > > +for id in params['transfer_ids']:
> > > +    try:
> > > +        transfer_service = image_transfers_service.image_transfer_service(id)
> > > +        transfer_service.cancel()
> > > +    except sdk.NotFoundError:
> > > +        pass
> >
> > This can fail for other reasons too, and in that case we will fail on the
> > first failure instead of trying to cancel all the transfers.  Previously
> > we always cancelled all transfers on all errors.
> >
> > I think this should be:
> >
> >     for id in params['transfer_ids']:
> >         try:
> >             cancel the transfer...
> >         except sdk.NotFoundError:
> >             log warning - we don't expect a non-existing transfer
> >         except Exception:
> >             log traceback with the transfer id, someone will have to do
> >             manual cleanup using this id
>
> OK I'll see if I can fix this.
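A concrete version of that suggestion could look like this (an illustrative
sketch only, using the Python logging module; it is not the code from the
patch):

    import logging
    import ovirtsdk4 as sdk

    def cancel_transfers(connection, transfer_ids):
        # Try to cancel every transfer; log failures instead of stopping
        # at the first error so all transfers get a chance to be cancelled.
        transfers_service = connection.system_service().image_transfers_service()
        for transfer_id in transfer_ids:
            transfer_service = transfers_service.image_transfer_service(transfer_id)
            try:
                transfer_service.cancel()
            except sdk.NotFoundError:
                logging.warning("transfer %s not found, skipping", transfer_id)
            except Exception:
                logging.exception(
                    "cannot cancel transfer %s, manual cleanup may be required",
                    transfer_id)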
> > Note that cancel is async, so the transfer is still in the cleanup
> > flow when we reach this line, so...
> >
> > >  disks_service = system_service.disks_service()
> > >
> > > +# As a last resort, try to remove the disks.
> > >  for uuid in params['disk_uuids']:
> > > -    # Try to get and remove the disk, however do not fail
> > > -    # if it does not exist (maybe removed in the meanwhile).
> > >      try:
> > >          disk_service = disks_service.disk_service(uuid)
> > >          disk_service.remove()
> > > -    except sdk.NotFoundError:
> > > +    except (sdk.NotFoundError, sdk.Error):
> > >          pass
> >
> > This will likely fail since the transfer still holds a lock on the disk.
> >
> > We don't need to delete the disks; the transfers will delete their disks.
>
> Right, but I believe there is a window between creating the disk and
> adding it to the transfer, and therefore this is an attempt to have a
> "backup plan" to delete the disk.
>
> I did discover that the transfer deletes the disk, which was why I
> changed the exception to ignore sdk.Error.
>
> ...
>
> > >  # Using version 2 supporting the buffer protocol for better performance.
> > >  API_VERSION = 2
> > >
> > > @@ -43,31 +34,47 @@ API_VERSION = 2
> > >  # client, this give best performance.
> > >  MAX_CONNECTIONS = 4
> >
> > This is not needed now since we use multi-conn.  It can also cause a
> > deadlock, since the underlying qemu-nbd for this transfer supports up
> > to 8 clients (--shared 8).  If we create 4 http connections per nbdkit
> > connection, we will have 16 connections.  8 of these will be blocked
> > in imageio on qemu-nbd, which can lead to deadlock or make some
> > nbdkit connections ineffective.
> >
> > To simplify this huge refactoring, I would use MAX_CONNECTIONS = 1
> > for now.
>
> Since the pool is a global (there is only one plugin instance, but
> open() is called 4 times), I believe this actually limits the pool to
> 4, unless max_readers/writers overrides it.

I see, so if I understand how it works correctly, we have:

after fork:
- called once
- create pool of connections

open:
- called once per nbd connection
- does nothing since we already opened the connections

write/zero:
- called once per nbd command
- pick random connection and send the request

flush:
- called once per nbd connection
- flush all connections

close:
- called once per nbd connection
- close all connections on first call
- does nothing on later calls

So we have some issues:
- calling flush 16 times instead of 4 times

Can we disable the parallel threading model with multi-conn, or must we
use it to have concurrent calls?

> > > +def after_fork():
> > > +    global options, pool
> >
> > This works, but hiding the globals here makes it hard to understand the
> > code.  The globals should be initialized at the top of the module.
>
> OK.
>
> PS. The way "global" works in Python is dumb!
>
> ...
>
> > > -def create_http_pool(url, host, options):
> > > +# Connection pool.
> > > +def create_http_pool(url, options):
> > >      pool = queue.Queue()
> > >
> > >      count = min(options["max_readers"],
> > >                  options["max_writers"],
> > >                  MAX_CONNECTIONS)
> >
> > Note that this logic must move to virt-v2v now.  If imageio does not
> > support multiple writers, you should not use multi-conn.
> >
> > Since we create the transfer with format="raw", we should always
> > support 8 writers, but for correctness this must be enforced using the
> > max_writers option.
>
> I don't understand this comment.

In create_transfer we create the image transfer with:

    format="raw"

This enables the nbd backend in imageio.  The system starts qemu-nbd,
and imageio connects to qemu-nbd instead of using the file backend
opening the actual disk.  The nbd backend supports multiple writers,
but the file backend does not.

> ...
>
> > > +# Parameters are passed in via a JSON doc from the OCaml code.
> > > +# Because this Python code ships embedded inside virt-v2v there
> > > +# is no formal API here.
> > > +params = None
> > > +
> > > +if len(sys.argv) != 2:
> > > +    raise RuntimeError("incorrect number of parameters")
> > > +
> > > +# Parameters are passed in via a JSON document.
> > > +with open(sys.argv[1], 'r') as fp:
> > > +    params = json.load(fp)
> >
> > This works, but why not read the json from stdin in the same way
> > we write the output to stdout?
>
> It's a bit complicated to have two pipes to an external process, and
> the other programs pass a params file as a parameter, that's the only
> reason.

This is simple in python, here is an example:
https://github.com/oVirt/vdsm/blob/511d7a4aea9beb6e375d05dfffa0c135a15bb3b4/lib/vdsm/storage/managedvolume.py#L186
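For comparison, reading the parameters from stdin would look something like
this (a minimal sketch, not the linked vdsm code; the OCaml side would write
the JSON document to the child's stdin and read the results from its stdout):

    import json
    import sys

    def main():
        # Parameters come from stdin instead of a file named on the
        # command line; results still go to stdout as a JSON document.
        params = json.load(sys.stdin)
        results = {"received_keys": sorted(params)}  # placeholder work
        json.dump(results, sys.stdout)

    if __name__ == "__main__":
        main()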
> > > +# Send the destination URL, transfer ID, and host flag back to OCaml code.
> > > +results = {
> > > +    "transfer_id": transfer.id,
> > > +    "destination_url": destination_url,
> > > +    "is_host": host is not None,
> >
> > is_host is not clear, maybe "is_local"?
>
> Yes, I agree.  Or "is_ovirt_host" perhaps.
>
> Thanks,
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> Fedora Windows cross-compiler. Compile Windows programs, test, and
> build Windows installers. Over 100 libraries supported.
> http://fedoraproject.org/wiki/MinGW
Richard W.M. Jones
2021-Aug-03 19:25 UTC
[Libguestfs] [PATCH virt-v2v 1/2] v2v: -o rhv-upload: Split plugin functions from create/finalize
On Tue, Aug 03, 2021 at 09:18:50PM +0300, Nir Soffer wrote:
> On Tue, Aug 3, 2021 at 8:48 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > Since the pool is a global (there is only one plugin instance, but
> > open() is called 4 times), I believe this actually limits the pool to
> > 4, unless max_readers/writers overrides it.
>
> I see, so if I understand how it works correctly, we have:
>
> after fork:
> - called once
> - create pool of connections

Yes.  Note there's only one nbdkit process and one Python interpreter
running even if the client connects multiple times.

> open:
> - called once per nbd connection
> - does nothing since we already opened the connections

Yup.

> write/zero:
> - called once per nbd command
> - pick random connection and send the request

Yes.  But crucially because the thread model is parallel, and because
certain Python http requests will block and release the Python GIL, it
can be that nbdkit will call into the plugin in parallel even on the
same handle.  But when running Python code we are never preempted
unless we call some C extension that blocks and releases the GIL.

> flush:
> - called once per nbd connection
> - flush all connections

Yes.  However this is controlled by the client.  It happens that both
qemu-img convert and nbdcopy do a flush operation right at the end.
qemu-img convert doesn't support multi-conn, so there's only one NBD
connection and one flush.  nbdcopy happens to call flush on every NBD
connection (but that's just because we are being over-cautious, I think
multi-conn guarantees mean we don't have to do this).

https://gitlab.com/nbdkit/libnbd/-/blob/3d9d23b1b0d1df049782555ad602476c35aad6b0/copy/nbd-ops.c#L174

> close:
> - called once per nbd connection
> - close all connections on first call
> - does nothing on later calls

I think the patch I posted had a mistake in it where I closed the whole
thread pool in close().  In my own copy the thread pool is destroyed on
cleanup() (when nbdkit exits).  However this requires a small change to
nbdkit
(https://gitlab.com/nbdkit/nbdkit/-/commit/f2fe99e4b0f54467ab8028eaf2d039cf918b2961)

> So we have some issues:
> - calling flush 16 times instead of 4 times

Yes I think so, but probably the later flushes don't do very much work?

> Can we disable the parallel threading model with multi-conn, or must we
> use it to have concurrent calls?

The client decides how many connections to make, and that happens after
we've made the decision to choose a thread model.

> > > > -def create_http_pool(url, host, options):
> > > > +# Connection pool.
> > > > +def create_http_pool(url, options):
> > > >      pool = queue.Queue()
> > > >
> > > >      count = min(options["max_readers"],
> > > >                  options["max_writers"],
> > > >                  MAX_CONNECTIONS)
> > >
> > > Note that this logic must move to virt-v2v now.  If imageio does
> > > not support multiple writers, you should not use multi-conn.
> > > Since we create the transfer with format="raw", we should always
> > > support 8 writers, but for correctness this must be enforced
> > > using the max_writers option.
> >
> > I don't understand this comment.
>
> In create_transfer we create the image transfer with:
>
>     format="raw"
>
> This enables the nbd backend in imageio.  The system starts qemu-nbd
> and imageio connects to qemu-nbd instead of using the file backend
> opening the actual disk.  The nbd backend supports multiple writers,
> but the file backend does not.

I see.  However I'm not sure I understand why the logic for choosing
"count" has to move to virt-v2v.

Thanks,

Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
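To keep the lifecycle discussed above in one place, here is a very reduced
skeleton of how a pool-based nbdkit Python plugin can be laid out
(illustrative only: it omits the actual imageio requests, option parsing and
error handling, and assumes nbdkit's Python API version 2 and the cleanup()
hook referenced above):

    import queue
    import nbdkit

    API_VERSION = 2

    pool = None        # one pool per nbdkit process
    disk_size = 0

    def thread_model():
        return nbdkit.THREAD_MODEL_PARALLEL

    def after_fork():
        # Called once after nbdkit forks: build the HTTP pool here.
        global pool
        pool = queue.Queue()
        # ... fill the pool with HTTP connections to imageio ...

    def open(readonly):
        # Called once per NBD connection; the pool already exists, so
        # there is nothing per-connection to set up.
        return 1

    def get_size(h):
        return disk_size

    def pread(h, buf, offset, flags):
        conn = pool.get()          # borrow any pooled connection
        try:
            pass                   # ... GET this byte range and fill buf ...
        finally:
            pool.put(conn)

    def pwrite(h, buf, offset, flags):
        conn = pool.get()
        try:
            pass                   # ... PUT this byte range ...
        finally:
            pool.put(conn)

    def flush(h, flags):
        # Called for each NBD connection that flushes; see the thread
        # for whether one or all pooled connections must be flushed.
        pass

    def cleanup():
        # Called once when nbdkit exits (needs the nbdkit change above):
        # close all pooled HTTP connections here.
        pass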
Eric Blake
2021-Aug-03 19:39 UTC
[Libguestfs] [PATCH virt-v2v 1/2] v2v: -o rhv-upload: Split plugin functions from create/finalize
On Tue, Aug 03, 2021 at 09:18:50PM +0300, Nir Soffer wrote:
> I see, so if I understand how it works correctly, we have:
>
> after fork:
> - called once
> - create pool of connections
>
> open:
> - called once per nbd connection
> - does nothing since we already opened the connections
>
> write/zero:
> - called once per nbd command
> - pick random connection and send the request
>
> flush:
> - called once per nbd connection
> - flush all connections

If the other end of the http connections is multi-conn consistent
(right now, qemu-nbd doesn't yet advertise multi-conn for writable
exports, but everything I've seen in the code says that it IS
consistent because it serializes flush requests at the block layer
backend regardless of which client front-end sent them - it's still on
my todo list to get qemu 6.2 to advertise multi-conn for writable
qemu-nbd), then you only have to flush on one random connection.

If the other end of the http connection is not multi-conn consistent,
then you do indeed have to flush ALL of the http connections; at which
point, you are somewhat recreating the effects of the nbdkit multi-conn
filter.  However, the multi-conn filter only knows how to propagate
flush across openings into the plugin, and not into the pool of http
connections embedded within the plugin, so the duplication is probably
necessary rather than relying on the filter.

> close:
> - called once per nbd connection
> - close all connections on first call
> - does nothing on later calls

Or do you want it to be more reference-counted, where it leaves all
connections open until the LAST nbd connection is closed?

> So we have some issues:
> - calling flush 16 times instead of 4 times

Are the later flushes expensive, or is the server smart enough to
optimize that if nothing has changed since the first flush, the later
flushes return immediately?  And whether it is called 1, 4, or 16 times
(assuming 4 NBD clients each talking to a pool of 4 http connections)
depends on which parts of the chain advertise (or assume) multi-conn
semantics.  If you have multi-conn all the way through, a single flush
on any one http connection is immediately visible to all other
connections, and you don't need the remaining flushes.  If you do not
have multi-conn consistency natively, then yes, replicating the flush
across all connections is necessary, although the multi-conn filter may
let you tweak whether that replication has to be done by the client or
can be emulated by nbdkit.  Also, the multi-conn filter can turn a
single flush from the NBD client into multiple flushes to the plugin;
but it also has an operational mode where it recognizes multiple
flushes from NBD clients and turns the latter flushes into no-ops if
nothing has changed in the meantime from the plugin.

> Can we disable the parallel threading model with multi-conn, or must we
> use it to have concurrent calls?

Parallel threading is necessary if you want concurrent calls (more than
one outstanding read or write at a time).  That's independent of
whether you have multi-conn.
Note that you can have a semi-serialized connection that advertises
multi-conn but does not allow parallel requests on a given connection
(you have to use parallel NBD clients to get any parallelism); and
conversely you can have a parallel connection that does not advertise
multi-conn (parallel requests are supported on any connection, but
flush on one connection does not necessarily affect the other
connections, and you may see stale data from a per-connection cache if
you don't flush on all connections).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
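To illustrate the two flush strategies discussed in this thread, a small
sketch (pool, pool_size and send_flush are hypothetical stand-ins for the
plugin's connection pool and its per-connection flush request):

    def flush_pool(pool, pool_size, multi_conn_consistent, send_flush):
        # Take every connection out of the pool so no in-flight request
        # races with the flush, then return them all.
        conns = [pool.get() for _ in range(pool_size)]
        try:
            if multi_conn_consistent:
                # The server serializes flushes across connections, so
                # flushing any one pooled connection is enough.
                send_flush(conns[0])
            else:
                # No multi-conn guarantee: flush every connection.
                for conn in conns:
                    send_flush(conn)
        finally:
            for conn in conns:
                pool.put(conn)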