thr3ads.net - Libguestfs - Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273). [Mar 2018]

If this information is useful, please help other people find it:
Share via:

Nir Soffer

2018-Mar-24 22:36 UTC

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones <rjones@redhat.com>
wrote:
> PROBLEMS:
>  - -of qcow2 does not work, with multiple problems
>     * needs to set NBD size to something larger than virtual size
>
How is this related to the actual file size specified when you create a
disk?

In block storage, qcow2 image is always on a thin provisioned disk, which
in oVirt is a regular logical volume, created at the requested
"initial_size":

From:

https://github.com/oVirt/ovirt-engine-sdk/blob/aba2a83ec94ecac1594adfff62d3cfcfdbdef416/sdk/examples/upload_disk.py#L113


if image_info["format"] == "qcow2":
    disk_format = types.DiskFormat.COW
else:
    disk_format = types.DiskFormat.RAW

disks_service = connection.system_service().disks_service()
disk = disks_service.add(
    disk=types.Disk(
        name=os.path.basename(image_path),
        description='Uploaded disk',
        format=disk_format,
        initial_size=image_size,
        provisioned_size=image_info["virtual-size"],
        sparse=disk_format == types.DiskFormat.COW,
        storage_domains=[
            types.StorageDomain(
                name='mydata'
            )
        ]
    )
)

Internally we do allocated 10% more then the requested initial_size
to leave room for qcow2 metadata.

See:

https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/blockVolume.py#L328




https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/volume.py#L666





Do we have other problems?

>  - Cannot choose the datacenter.
>
The storage domain belongs to some data center, so I don't think you
need to select a data center.

[snipped]

 diff --git a/v2v/rhv-upload-plugin.py b/v2v/rhv-upload-plugin.py
> new file mode 100644
> index 000000000..979cbd63b
> --- /dev/null
> +++ b/v2v/rhv-upload-plugin.py
> @@ -0,0 +1,414 @@
> +# -*- python -*-
> +# oVirt or RHV upload nbdkit plugin used by ‘virt-v2v -o rhv-upload’
> +# Copyright (C) 2018 Red Hat Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License along
> +# with this program
>
<https://maps.google.com/?q=with+this+program&entry=gmail&source=g>;
if
> not, write to the Free Software Foundation, Inc.,
> +# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> +
> +import builtins
> +import json
> +import logging
> +import ovirtsdk4 as sdk
> +import ovirtsdk4.types as types
>
It is good practice to separate stdlib imports and 3rd party imports
like ovirtsdk, helps to understand the dependencies in modules.

[snipped]
> +from http.client import HTTPSConnection
> +from urllib.parse import urlparse
>
These will not work in python 2, but you can use six.moves to have
code that works on both 2 and 3.

[snipped]
> +    # Connect to the server.
> +    connection = sdk.Connection(
> +        url = params['output_conn'],
> +        username = username,
> +        password = password,
> +        ca_file = params['rhv_cafile'],
>
Can this be None?

> +        log = logging.getLogger(),
> +        insecure = params['insecure'],
>
If ca_file cannot be None, then insecure is not needed, based
on Ondra review from earlier version.

[snipped]
> +    # Create the disk.
> +    disks_service = system_service.disks_service()
> +    if params['disk_format'] == "raw":
> +        disk_format = types.DiskFormat.RAW
> +    else:
> +        disk_format = types.DiskFormat.COW
> +    disk = disks_service.add(
> +        disk = types.Disk(
> +            name = params['disk_name'],
> +            description = "Uploaded by virt-v2v",
> +            format = disk_format,
> +            provisioned_size = params['disk_size'],
>
This must be the virtual size.

You don't specify initial_size - in this case you get 1G, and most
images will fail to upload.

> +            sparse = params['output_sparse'],
>
The user cannot configure that. This must be based on the image
format. The current coded may create images in unsupported
combinations, e.g. raw/sparse on block storage, or fail validation
in engine.

[snipped]
> +# Can we issue zero, trim or flush requests?
>
+def get_options(h):> +    if h['got_options']:
> +        return
> +    h['got_options'] = True
> +
> +    http = h['http']
> +    transfer=h['transfer']
> +
> +    http.putrequest("OPTIONS", h['path'])
> +    http.putheader("Authorization", transfer.signed_ticket)
> +    http.endheaders()
> +
> +    r = http.getresponse()
> +    if r.status == 200:
> +        j = json.loads(r.read())
> +        h['can_zero'] = "zero" in j['features']
> +        h['can_trim'] = "trim" in j['features']
> +        h['can_flush'] = "flush" in
j['features']
> +
> +        # If can_zero is true then it means we're using the new
imageio
> +        # which never needs the Authorization header.
> +        if h['can_zero']:
> +            h['needs_auth'] = False
>
If we got here, we are working with new daemon or proxy, and both
of them do not need auth, so we can set 'needs_auth' to False
if OPTIONS returned 200 OK.

> +    # Old imageio servers returned 405 Method Not Allowed.  If
> +    # we see that we just say that no operations are allowed and
> +    # we will emulate them.
> +    elif r.status == 405:
> +        h['can_zero'] = False
> +        h['can_trim'] = False
> +        h['can_flush'] = False
>
I would set all the defaults when creating the sate dict. This ensures
that we don't forget needs_auth or other keys and crash later with
KeyError, and make it easier to understand what is the state managed
by the plugin.

[snipped]

+def pwrite(h, buf, offset):> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    count = len(buf)
> +    h['highestwrite'] = max(h['highestwrite'],
offset+count)
> +
> +    http.putrequest("PUT", h['path'] +
"?flush=n")
>
"?flush=n" has effect only if h["can_flush"] is True. Older
daemon/proxy
do not know support disabling flushing. The docs mention that this
query string will be added in 1.3, we are now at 1.2.

> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
+    # The oVirt server only uses the first part of the range, and
the> +    # content-length.

[snipped]

> +def zero(h, count, offset, may_trim):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Unlike the trim and flush calls, there is no 'can_zero'
method
> +    # so nbdkit could call this even if the server doesn't support
> +    # zeroing.  If this is the case we must emulate.
> +    if not h['can_zero']:
> +        emulate_zero(h, count, offset)
> +        return
> +
> +    # Construct the JSON request for zeroing.
> +    buf = json.dumps({'op': "zero",
> +                      'offset': offset,
> +                      'size': count,
> +                      'flush': False})
> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>
Only GET and PUT on a server that does not implement OPTIONS need auth.

This will work if h['needs_auth'] is set correctly, but I think it is
better to include
this check only in pread() and pwrite(), and add a comment there that this
is
need to support legacy versions.

> +    http.putheader("Content-Length", len(buf))
> +    http.endheaders()
> +    http.send(buf)
> +
> +    r = http.getresponse()
> +    if r.status != 200:
> +        transfer_service.pause()
> +        h['failed'] = True
> +        raise RuntimeError("could not zero sector (%d, %d): %d:
%s" %
> +                           (offset, count, r.status, r.reason))
> +
> +# qemu-img convert starts by trying to zero/trim the whole device.
> +# Since we've just created a new disk it's safe to ignore these
> +# requests as long as they are smaller than the highest write seen.
> +# After that we must emulate them with writes.
>
I think this comment is not related to this code. Maybe it belongs to
write() where we compute the highwrite?

> +def emulate_zero(h, count, offset):
> +    if offset+count < h['highestwrite']:
> +        http.putrequest("PUT", h['path'] +
"?flush=n")
>
This is is used only on old daemon/proxy, the flush has no effect.

[snipped]

+def trim(h, count, offset):> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Construct the JSON request for trimming.
> +    buf = json.dumps({'op': "trim",
> +                      'offset': offset,
> +                      'size': count,
> +                      'flush': False})
> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>
Never needed.

[snipped]
> +def flush(h):
> +    http = h['http']
> +    transfer=h['transfer']
> +    transfer_service=h['transfer_service']
> +
> +    # Construct the JSON request for flushing.  Note the offset
> +    # and count must be present but are ignored by imageio.
> +    buf = json.dumps({'op': "flush",
> +                      'offset': 0,
> +                      'size': 0,
> +                      'flush': True})
>
Should be (discussed in v6)

    buf = json.dumps({"op": "flush"})

> +
> +    http.putrequest("PATCH", h['path'])
> +    http.putheader("Content-Type", "application/json")
> +    if h['needs_auth']:
> +        http.putheader("Authorization", transfer.signed_ticket)
>
Never needed.

[snipped]
> +def close(h):
> +    http = h['http']
> +    connection = h['connection']
> +
> +    http.close()
> +
> +    # If we didn't fail, then finalize the transfer.
> +    if not h['failed']:
> +        disk = h['disk']
> +        transfer_service=h['transfer_service']
> +
> +        transfer_service.finalize()
>
If this raises, we never clean up

[snipped]
> +        # Write the disk ID file.  Only do this on successful completion.
> +        with builtins.open(params['diskid_file'], 'w') as
fp:
> +            fp.write(disk.id)
>
If this raises, we will not remove the disk, or close the connection.

> +
> +    # Otherwise if we did fail then we should delete the disk.
> +    else:
> +        disk_service = h['disk_service']
> +        disk_service.remove()
> +
> +    connection.close()
>
[snipped]

I did not check all the SDK related code, I'm not very familiar with the
SDK.

Thanks for creating this, and sorry for the bad documentation on our side
:-)

Nir

Nir Soffer

2018-Mar-25 00:05 UTC

head link

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

On Sun, Mar 25, 2018 at 1:36 AM Nir Soffer <nirsof@gmail.com> wrote:
> On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones
<rjones@redhat.com>
> wrote:
>
>> PROBLEMS:
>>  - -of qcow2 does not work, with multiple problems
>>     * needs to set NBD size to something larger than virtual size
>>
>
> How is this related to the actual file size specified when you create a
> disk?
>
> In block storage, qcow2 image is always on a thin provisioned disk, which
> in oVirt is a regular logical volume, created at the requested
> "initial_size":
>
> From:
>
>
https://github.com/oVirt/ovirt-engine-sdk/blob/aba2a83ec94ecac1594adfff62d3cfcfdbdef416/sdk/examples/upload_disk.py#L113
>
>
> if image_info["format"] == "qcow2":
>     disk_format = types.DiskFormat.COW
> else:
>     disk_format = types.DiskFormat.RAW
>
> disks_service = connection.system_service().disks_service()
> disk = disks_service.add(
>     disk=types.Disk(
>         name=os.path.basename(image_path),
>         description='Uploaded disk',
>         format=disk_format,
>         initial_size=image_size,
>         provisioned_size=image_info["virtual-size"],
>         sparse=disk_format == types.DiskFormat.COW,
>         storage_domains=[
>             types.StorageDomain(
>                 name='mydata'
>             )
>         ]
>     )
> )
>
> Internally we do allocated 10% more then the requested initial_size
> to leave room for qcow2 metadata.
>
But best do not depend on this, I would like to remove it in the future.

> [snipped]
>
>> +    # Create the disk.
>> +    disks_service = system_service.disks_service()
>> +    if params['disk_format'] == "raw":
>> +        disk_format = types.DiskFormat.RAW
>> +    else:
>> +        disk_format = types.DiskFormat.COW
>> +    disk = disks_service.add(
>> +        disk = types.Disk(
>> +            name = params['disk_name'],
>> +            description = "Uploaded by virt-v2v",
>> +            format = disk_format,
>> +            provisioned_size = params['disk_size'],
>>
>
> This must be the virtual size.
>
> You don't specify initial_size - in this case you get 1G, and most
> images will fail to upload.
>
Since you are using qemu-img convert, you must give initial_size that it big
enough to hold the image after conversion. The source image size may not
be enough, since it may contain compressed clusters.

The best way to estimate the size it to use the new "qemu-img measure"
command.

The oVirt sdk examples do not care about this since we just copy the image
as is to storage.

Nir

Richard W.M. Jones

2018-Mar-25 11:41 UTC

head link

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

On Sat, Mar 24, 2018 at 10:36:06PM +0000, Nir Soffer
wrote:> On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones
<rjones@redhat.com>
> wrote:
> 
> > PROBLEMS:
> >  - -of qcow2 does not work, with multiple problems
> >     * needs to set NBD size to something larger than virtual size
> >
> 
> How is this related to the actual file size specified when you create a
> disk?
> 
> In block storage, qcow2 image is always on a thin provisioned disk, which
> in oVirt is a regular logical volume, created at the requested
> "initial_size":
[...]>         initial_size=image_size,
>         provisioned_size=image_info["virtual-size"],
Can you describe exactly what these two sizes mean?

Remember that virt-v2v works by streaming.  At the point where we are
calling this API virt-v2v only knows the virtual size.  We never know
the size of the final qcow2 file.
>         sparse=disk_format == types.DiskFormat.COW,
>         storage_domains=[
>             types.StorageDomain(
>                 name='mydata'
>             )
>         ]
>     )
> )
> 
> Internally we do allocated 10% more then the requested initial_size
> to leave room for qcow2 metadata.
10% more than what?
> See:
> 
>
https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/blockVolume.py#L328
> 
>
https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/volume.py#L666
This leave me more confused than before :-(
> >  - Cannot choose the datacenter.
> 
> The storage domain belongs to some data center, so I don't think you
> need to select a data center.
OK I didn't know this.
> > +from http.client import HTTPSConnection
> > +from urllib.parse import urlparse
> >
> 
> These will not work in python 2, but you can use six.moves to have
> code that works on both 2 and 3.
For RHEL I'm applying a second downstream-only patch which converts
the Python 3 code to Python 2:

https://github.com/libguestfs/libguestfs/commit/bff845c86dcb12c720b38fc60dcdaa5a10373081
> [snipped]
> 
> > +    # Connect to the server.
> > +    connection = sdk.Connection(
> > +        url = params['output_conn'],
> > +        username = username,
> > +        password = password,
> > +        ca_file = params['rhv_cafile'],
> >
> 
> Can this be None?
We could allow that, but in the current code it must be present.
> > +        log = logging.getLogger(),
> > +        insecure = params['insecure'],
> >
> 
> If ca_file cannot be None, then insecure is not needed, based
> on Ondra review from earlier version.
Is this really true?  My reading of the code is that the insecure flag
verifies the server to the client, whereas the certificate bundle is
for verifying the client to the server.
> [snipped]
> 
> > +    # Create the disk.
> > +    disks_service = system_service.disks_service()
> > +    if params['disk_format'] == "raw":
> > +        disk_format = types.DiskFormat.RAW
> > +    else:
> > +        disk_format = types.DiskFormat.COW
> > +    disk = disks_service.add(
> > +        disk = types.Disk(
> > +            name = params['disk_name'],
> > +            description = "Uploaded by virt-v2v",
> > +            format = disk_format,
> > +            provisioned_size = params['disk_size'],
> >
> 
> This must be the virtual size.
> 
> You don't specify initial_size - in this case you get 1G, and most
> images will fail to upload.
This works for me, and I'm using a guest which is larger than 1G.  As
above can you explain what these numbers are supposed to be, and note
that we only know the virtual size (params['disk_size']).  We cannot
know any other information because we're streaming the data, so
anything that requires us to know the final size of the image is a
non-starter.
> > +            sparse = params['output_sparse'],
> >
> The user cannot configure that. This must be based on the image
> format. The current coded may create images in unsupported
> combinations, e.g. raw/sparse on block storage, or fail validation
> in engine.
In virt-v2v this can be configured using ‘-oa’.  What are the possible
combinations?
> [snipped]
> 
> > +# Can we issue zero, trim or flush requests?
> >
> +def get_options(h):
> > +    if h['got_options']:
> > +        return
> > +    h['got_options'] = True
> > +
> > +    http = h['http']
> > +    transfer=h['transfer']
> > +
> > +    http.putrequest("OPTIONS", h['path'])
> > +    http.putheader("Authorization", transfer.signed_ticket)
> > +    http.endheaders()
> > +
> > +    r = http.getresponse()
> > +    if r.status == 200:
> > +        j = json.loads(r.read())
> > +        h['can_zero'] = "zero" in
j['features']
> > +        h['can_trim'] = "trim" in
j['features']
> > +        h['can_flush'] = "flush" in
j['features']
> > +
> > +        # If can_zero is true then it means we're using the new
imageio
> > +        # which never needs the Authorization header.
> > +        if h['can_zero']:
> > +            h['needs_auth'] = False
> >
> 
> If we got here, we are working with new daemon or proxy, and both
> of them do not need auth, so we can set 'needs_auth' to False
> if OPTIONS returned 200 OK.
Which is what this code does, unless I'm misunderstanding things.
> > +    # Old imageio servers returned 405 Method Not Allowed.  If
> > +    # we see that we just say that no operations are allowed and
> > +    # we will emulate them.
> > +    elif r.status == 405:
> > +        h['can_zero'] = False
> > +        h['can_trim'] = False
> > +        h['can_flush'] = False
> >
> 
> I would set all the defaults when creating the sate dict. This ensures
> that we don't forget needs_auth or other keys and crash later with
> KeyError, and make it easier to understand what is the state managed
> by the plugin.
Yup, strong typing FTW ...
> [snipped]
> 
> +def pwrite(h, buf, offset):
> > +    http = h['http']
> > +    transfer=h['transfer']
> > +    transfer_service=h['transfer_service']
> > +
> > +    count = len(buf)
> > +    h['highestwrite'] = max(h['highestwrite'],
offset+count)
> > +
> > +    http.putrequest("PUT", h['path'] +
"?flush=n")
> >
> 
> "?flush=n" has effect only if h["can_flush"] is True.
Older daemon/proxy
> do not know support disabling flushing. The docs mention that this
> query string will be added in 1.3, we are now at 1.2.
But it doesn't seem to break the old imageio?

Eventually we'll have nbdkit > 1.1.25 which will support pwrite-with-
FUA-flag.  Then we'll be able to set flush=(n|y) correctly.  However
this is not likely to happen until after RHEL 7.6.
> > +def zero(h, count, offset, may_trim):
> > +    http = h['http']
> > +    transfer=h['transfer']
> > +    transfer_service=h['transfer_service']
> > +
> > +    # Unlike the trim and flush calls, there is no 'can_zero'
method
> > +    # so nbdkit could call this even if the server doesn't
support
> > +    # zeroing.  If this is the case we must emulate.
> > +    if not h['can_zero']:
> > +        emulate_zero(h, count, offset)
> > +        return
> > +
> > +    # Construct the JSON request for zeroing.
> > +    buf = json.dumps({'op': "zero",
> > +                      'offset': offset,
> > +                      'size': count,
> > +                      'flush': False})
> > +
> > +    http.putrequest("PATCH", h['path'])
> > +    http.putheader("Content-Type",
"application/json")
> > +    if h['needs_auth']:
> > +        http.putheader("Authorization",
transfer.signed_ticket)
> >
> 
> Only GET and PUT on a server that does not implement OPTIONS need auth.
> 
> This will work if h['needs_auth'] is set correctly, but I think it
is
> better to include
> this check only in pread() and pwrite(), and add a comment there that this
> is
> need to support legacy versions.
So I think you mean that we can remove the if h['needs_auth']
and following line completely?
> > +    http.putheader("Content-Length", len(buf))
> > +    http.endheaders()
> > +    http.send(buf)
> > +
> > +    r = http.getresponse()
> > +    if r.status != 200:
> > +        transfer_service.pause()
> > +        h['failed'] = True
> > +        raise RuntimeError("could not zero sector (%d, %d): %d:
%s" %
> > +                           (offset, count, r.status, r.reason))
> > +
> > +# qemu-img convert starts by trying to zero/trim the whole device.
> > +# Since we've just created a new disk it's safe to ignore
these
> > +# requests as long as they are smaller than the highest write seen.
> > +# After that we must emulate them with writes.
> >
> 
> I think this comment is not related to this code. Maybe it belongs to
> write() where we compute the highwrite?
It relates to the conditional in this function:
> > +def emulate_zero(h, count, offset):
> > +    if offset+count < h['highestwrite']:
> > +        http.putrequest("PUT", h['path'] +
"?flush=n")
> >
> 
> This is is used only on old daemon/proxy, the flush has no effect.
OK.
> > +def flush(h):
> > +    http = h['http']
> > +    transfer=h['transfer']
> > +    transfer_service=h['transfer_service']
> > +
> > +    # Construct the JSON request for flushing.  Note the offset
> > +    # and count must be present but are ignored by imageio.
> > +    buf = json.dumps({'op': "flush",
> > +                      'offset': 0,
> > +                      'size': 0,
> > +                      'flush': True})
> >
> 
> Should be (discussed in v6)
> 
>     buf = json.dumps({"op": "flush"})
This contradicts the documentation, but I can change the code.
> > +def close(h):
> > +    http = h['http']
> > +    connection = h['connection']
> > +
> > +    http.close()
> > +
> > +    # If we didn't fail, then finalize the transfer.
> > +    if not h['failed']:
> > +        disk = h['disk']
> > +        transfer_service=h['transfer_service']
> > +
> > +        transfer_service.finalize()
> >
> 
> If this raises, we never clean up
Understood, I'll try to make this function a bit more robust.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Richard W.M. Jones

2018-Mar-25 11:44 UTC

head link

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

On Sun, Mar 25, 2018 at 12:05:36AM +0000, Nir Soffer
wrote:> On Sun, Mar 25, 2018 at 1:36 AM Nir Soffer <nirsof@gmail.com> wrote:
> 
> > On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones
<rjones@redhat.com>
> > wrote:
> >
> >> PROBLEMS:
> >>  - -of qcow2 does not work, with multiple problems
> >>     * needs to set NBD size to something larger than virtual size
> >>
> >
> > How is this related to the actual file size specified when you create
a
> > disk?
> >
> > In block storage, qcow2 image is always on a thin provisioned disk,
which
> > in oVirt is a regular logical volume, created at the requested
> > "initial_size":
> >
> > From:
> >
> >
https://github.com/oVirt/ovirt-engine-sdk/blob/aba2a83ec94ecac1594adfff62d3cfcfdbdef416/sdk/examples/upload_disk.py#L113
> >
> >
> > if image_info["format"] == "qcow2":
> >     disk_format = types.DiskFormat.COW
> > else:
> >     disk_format = types.DiskFormat.RAW
> >
> > disks_service = connection.system_service().disks_service()
> > disk = disks_service.add(
> >     disk=types.Disk(
> >         name=os.path.basename(image_path),
> >         description='Uploaded disk',
> >         format=disk_format,
> >         initial_size=image_size,
> >         provisioned_size=image_info["virtual-size"],
> >         sparse=disk_format == types.DiskFormat.COW,
> >         storage_domains=[
> >             types.StorageDomain(
> >                 name='mydata'
> >             )
> >         ]
> >     )
> > )
> >
> > Internally we do allocated 10% more then the requested initial_size
> > to leave room for qcow2 metadata.
> >
> 
> But best do not depend on this, I would like to remove it in the future.
> 
> 
> > [snipped]
> >
> >> +    # Create the disk.
> >> +    disks_service = system_service.disks_service()
> >> +    if params['disk_format'] == "raw":
> >> +        disk_format = types.DiskFormat.RAW
> >> +    else:
> >> +        disk_format = types.DiskFormat.COW
> >> +    disk = disks_service.add(
> >> +        disk = types.Disk(
> >> +            name = params['disk_name'],
> >> +            description = "Uploaded by virt-v2v",
> >> +            format = disk_format,
> >> +            provisioned_size = params['disk_size'],
> >>
> >
> > This must be the virtual size.
> >
> > You don't specify initial_size - in this case you get 1G, and most
> > images will fail to upload.
> >
> 
> Since you are using qemu-img convert, you must give initial_size that it
big
> enough to hold the image after conversion. The source image size may not
> be enough, since it may contain compressed clusters.
> 
> The best way to estimate the size it to use the new "qemu-img
measure"
> command.
virt-v2v doesn't really work in a way that we can do that.  Basically
we know the virtual size and will stream the data to the end, and
there's no way to know how big the final image will be, especially for
qcow2.  Data copied is typically much smaller than the virtual size
because we fstrim the source and skip zeroes.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Nir Soffer

2018-Mar-25 20:05 UTC

head link

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

On Sun, Mar 25, 2018 at 2:41 PM Richard W.M. Jones <rjones@redhat.com>
wrote:
> On Sat, Mar 24, 2018 at 10:36:06PM +0000, Nir Soffer wrote:
> > On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones
<rjones@redhat.com>
> > wrote:
> >
> > > PROBLEMS:
> > >  - -of qcow2 does not work, with multiple problems
> > >     * needs to set NBD size to something larger than virtual size
> > >
> >
> > How is this related to the actual file size specified when you create
a
> > disk?
> >
> > In block storage, qcow2 image is always on a thin provisioned disk,
which
> > in oVirt is a regular logical volume, created at the requested
> > "initial_size":
> [...]
> >         initial_size=image_size,
> >         provisioned_size=image_info["virtual-size"],
>
> Can you describe exactly what these two sizes mean?
>
- provisioned_size is the virtual size of the image.
- initial_size is used only for thin provisioned disk on block storage.
This is
the size of the logical volume we created for this disk.

On thin disk on block storage you will not be able to write or zero more
then initial_size bytes.

When a vm is running, vdsm monitor the allocated space and ask the SPM
host to allocated more space when the highest write offset is too high, so
the disk looks like a thin provisioned disk to the vm. This mechanism is
not available for storage operations, and all of them specify initial_size
when converting images to qcow2 format.
>
> Remember that virt-v2v works by streaming.  At the point where we are
> calling this API virt-v2v only knows the virtual size.  We never know
> the size of the final qcow2 file.
>
If you don't have a way to estimate the final size you need to allocated
the entire image when you create a disk.

But I don't understand why you don't have access to the image. I
understood
that the flow is:

image file -> qemu-img convert -> nbdkit -> ovirt-imageio -> ovirt
disk

In this flow you can estimate the size using the image file before starting
the streaming process.

If there is no way to estimate the size and you must allocate the entire
image, we can add a shrink step in upload finalization, to shrink the image
size to optimal size. We are already doing this in several flows that
cannot estimate the final disk size and do over allocation.

[snipped]
> > > +        log = logging.getLogger(),
>
> > +        insecure = params['insecure'],
> > >
> >
> > If ca_file cannot be None, then insecure is not needed, based
> > on Ondra review from earlier version.
>
> Is this really true?  My reading of the code is that the insecure flag
> verifies the server to the client, whereas the certificate bundle is
> for verifying the client to the server.
>
Maybe, I'm not familiar with the SDK.

What is the motivation for disabling this flag?

> > [snipped]
> >
> > > +    # Create the disk.
> > > +    disks_service = system_service.disks_service()
> > > +    if params['disk_format'] == "raw":
> > > +        disk_format = types.DiskFormat.RAW
> > > +    else:
> > > +        disk_format = types.DiskFormat.COW
> > > +    disk = disks_service.add(
> > > +        disk = types.Disk(
> > > +            name = params['disk_name'],
> > > +            description = "Uploaded by virt-v2v",
> > > +            format = disk_format,
> > > +            provisioned_size = params['disk_size'],
> > >
> >
> > This must be the virtual size.
> >
> > You don't specify initial_size - in this case you get 1G, and most
> > images will fail to upload.
>
> This works for me, and I'm using a guest which is larger than 1G.  As
> above can you explain what these numbers are supposed to be, and note
> that we only know the virtual size (params['disk_size']).  We
cannot
> know any other information because we're streaming the data, so
> anything that requires us to know the final size of the image is a
> non-starter.
>
Discussed above.

> > > +            sparse = params['output_sparse'],
> > >
> > The user cannot configure that. This must be based on the image
> > format. The current coded may create images in unsupported
> > combinations, e.g. raw/sparse on block storage, or fail validation
> > in engine.
>
> In virt-v2v this can be configured using ‘-oa’.  What are the possible
> combinations?
>
I could not find documentation for this in the ovirt engine sdk.
There is http://ovirt.github.io/ovirt-engine-sdk/master/  but it is
not very useful for anything.

The ovirt-engine REST API has real documentation here:
http://ovirt.github.io/ovirt-engine-api-model/master

But I could not find any documentation about creating disks there, so
the above is based on current system behavior.

When creating disks from the UI, the user select the storage domain
(implicitly selecting the file/block), and the allocation policy
(thin/preallocated). The user cannot select the image format when
creating disk, ovirt will choose the image format and sparse for you.

storage type  allocation   | image format  sparse  creating
---------------------------|-----------------------------------------------------------------
file          thin         | raw           true    sparse file of
provisioned_size bytes
file          preallocated | raw           false   preallocated file of
provisioned_size bytes
block         thin         | qcow2         true    logical volume of
initial_size bytes
block         preallocated | raw           false   logical volume of
provisioned_size bytes

When uploading disk, the user select a file, implicitly selecting the
image format (raw/qcow2), and the storage domain, implicitly selecting
the storage type (file/block).  oVirt selects the allocation and sparse
value for you.

storage type  image format | allocation    sparse   creating
---------------------------|---------------------------------------------------------------
file          qcow2        | thin          true     sparse file of image
size bytes
file          raw          | thin          true     sparse file of
provisioned_size bytes
block         qcow2        | thin          true     logical volume of
initial_size bytes
block         raw          | preallocated  false    logical volume of
provisioned_size bytes

Notes:
- file,qcow2 cannot be created from the UI.
- no way to create preallocated uploading from the UI (looks like a bug to
me)

When using the sdk, you can select both the image format and sparse, so you
can create invalid combinations. oVirt selects the allocation for you.

storage type  image format  sparse | allocation      creating
-----------------------------------|--------------------------------------------------------
file          qcow2         true   | thin           sparse file of image
size bytes
file          raw           false  | preallocated   preallocated file of
provisioned_size bytes
file          raw           true   | thin           sparse file of
provisioned_size
bytes
file          qcow2         false  | -              unsupported
block         qcow2         true   | thin           logical volume of
initial_size bytes
block         raw           false  | preallocated   logical volume of
provisioned_size bytes
block         qcow2         false  | -              unsupported
block         raw           true   | -              unsupported

The only case when selecting the sparse value is useful is raw file
on file based storage domain, allowing creation of sparse or preallocated
disk.

I guess you can check if a storage domain is file based or block basedu
using the SDK, but I'm not familiar with it.

Daniel, do we expose this info?
> [snipped]
> >
> > > +# Can we issue zero, trim or flush requests?
> > >
> > +def get_options(h):
> > > +    if h['got_options']:
> > > +        return
> > > +    h['got_options'] = True
> > > +
> > > +    http = h['http']
> > > +    transfer=h['transfer']
> > > +
> > > +    http.putrequest("OPTIONS", h['path'])
> > > +    http.putheader("Authorization",
transfer.signed_ticket)
> > > +    http.endheaders()
> > > +
> > > +    r = http.getresponse()
> > > +    if r.status == 200:
> > > +        j = json.loads(r.read())
> > +        h['can_zero'] = "zero" in
j['features']
> > > +        h['can_trim'] = "trim" in
j['features']
> > > +        h['can_flush'] = "flush" in
j['features']
> > > +
> > > +        # If can_zero is true then it means we're using the
new
> imageio
> > > +        # which never needs the Authorization header.
> > > +        if h['can_zero']:
> > > +            h['needs_auth'] = False
>
In older proxy, we do support OPTIONS, used for allowing access
from enigne UI, but it does not return any content. Fortunately, we
return httplib.NO_CONTENT so it is easy to detect.

So we need to handle gracefully r.status = 204

> > >
> >
> > If we got here, we are working with new daemon or proxy, and both
> > of them do not need auth, so we can set 'needs_auth' to False
> > if OPTIONS returned 200 OK.
>
> Which is what this code does, unless I'm misunderstanding things.
>
Accessing readonly ticket will return empty features list, and
h['needs_auth']
will be True, while this server does not need auth. Yes, it cannot happen
with this code but it is not the right check.

> > [snipped]
> >
> > +def pwrite(h, buf, offset):
> > > +    http = h['http']
> > > +    transfer=h['transfer']
> > > +    transfer_service=h['transfer_service']
> > > +
> > > +    count = len(buf)
> > > +    h['highestwrite'] = max(h['highestwrite'],
offset+count)
> > > +
> > > +    http.putrequest("PUT", h['path'] +
"?flush=n")
> > >
> >
> > "?flush=n" has effect only if h["can_flush"] is
True. Older daemon/proxy
> > do not know support disabling flushing. The docs mention that this
> > query string will be added in 1.3, we are now at 1.2.
>
> But it doesn't seem to break the old imageio?
>
Old imageio just ignores this, so it is safe to use as is, the issue is
again
confusing the reader that this has some effect on the server.

[snipped]
> > > +    # Construct the JSON request for zeroing.
> > > +    buf = json.dumps({'op': "zero",
> > > +                      'offset': offset,
> > > +                      'size': count,
> > > +                      'flush': False})
> > > +
> > > +    http.putrequest("PATCH", h['path'])
> > > +    http.putheader("Content-Type",
"application/json")
> > > +    if h['needs_auth']:
> > > +        http.putheader("Authorization",
transfer.signed_ticket)
> > >
> >
> > Only GET and PUT on a server that does not implement OPTIONS need
auth.
> >
> > This will work if h['needs_auth'] is set correctly, but I
think it is
> > better to include
> > this check only in pread() and pwrite(), and add a comment there that
> this
> > is
> > need to support legacy versions.
>
> So I think you mean that we can remove the if h['needs_auth']
> and following line completely?
>
Yes, probably can be removed.

[snipped]
> > > +    buf = json.dumps({'op': "flush",
> > > +                      'offset': 0,
> > > +                      'size': 0,
> > > +                      'flush': True})
> > >
> >
> > Should be (discussed in v6)
> >
> >     buf = json.dumps({"op": "flush"})
>
> This contradicts the documentation, but I can change the code.
>
Agree, I'll fix the docs.

Nir

Seemingly Similar Threads

Search for more seemingly similar threads

Libguestfs - Mar 2018 - Re: [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Seemingly Similar Threads