thr3ads.net - Libguestfs - [Libguestfs] Extracting files from OVA is bad [Sep 2016]

Tomáš Golembiovský

2016-Sep-09 11:03 UTC

[Libguestfs] Extracting files from OVA is bad

Hi,

recently we (oVirt) have started discussing whether the way virt-v2v
handles import from OVA files is good. And I would be interested in
ideas how it can be improved. It is likely somebody already gave some
thought to this problem.

TL;DR: Extracting the OVA before import is a problem for large VMs (in
sizes of TBs). Can we change something to prevent the extraction and
work directly over OVA?


What we consider a huge shortcoming is the fact that the whole OVA is
extracted into a temporary directory prior to the import and processed
afterwards. In a normal situation the user can have up to three copies
of the VM on his drive at the end of the import:

  * original OVA,
  * temporary extracted files (will be deleted when virt-v2v terminates),
  * converted VM.


This is not a good idea for large VMs that are hundreds of GBs or even
TBs in size. The storage requirements can be lessened with proper
partitioning, i.e. the source OVA and the converted VM don't end up on
the same drive and TMPDIR is set to put the temporary files somewhere
else as well. But this is not a general solution. And sometimes the
necessary space may not be available at all.


The question is how to change the import path so that virt-v2v doesn't
have to extract the OVA. I can see the following solutions:

 1) Solve it in virt-v2v: create a layer for directly accessing the
    files in the archive.

 2) Solve it in QEMU: create a backing method that would allow creating
    a qemu disk backed by the archive.

 3) Solve it on oVirt side: use some FUSE-based tool to provide
    access to the archive and pass the OVA to virt-v2v not as a file but
    as a directory.


Does anyone have any other ideas or suggestions?


Best regards,

    Tomas Golembiovsky

-- 
Tomáš Golembiovský <tgolembi@redhat.com>

Richard W.M. Jones

2016-Sep-09 12:02 UTC

Re: [Libguestfs] Extracting files from OVA is bad

On Fri, Sep 09, 2016 at 01:03:49PM +0200, Tomáš Golembiovský wrote:
> Hi,
> 
> recently we (oVirt) have started discussing whether the way virt-v2v
> handles import from OVA files is good. And I would be interested in
> ideas how it can be improved. It is likely somebody already gave some
> thought to this problem.
> 
> TL;DR: Extracting the OVA before import is a problem for large VMs (in
> sizes of TBs). Can we change something to prevent the extraction and
> work directly over OVA?

Specifically virt-v2v needs to do:

qemu-img create -b <source-file-within-the-tarball> -f qcow2 overlay.qcow2
qemu-img convert overlay.qcow2 output

> What we consider a huge shortcoming is the fact that the whole OVA is
> extracted into a temporary directory prior to the import and processed
> afterwards. In a normal situation the user can have up to three copies
> of the VM on his drive at the end of the import:
> 
>   * original OVA,
>   * temporary extracted files (will be deleted when virt-v2v terminates),
>   * converted VM.
> 
> 
> This is not a good idea for large VMs that are hundreds of GBs or even
> TBs in size. The storage requirements can be lessened with proper
> partitioning, i.e. the source OVA and the converted VM don't end up on
> the same drive and TMPDIR is set to put the temporary files somewhere
> else as well. But this is not a general solution. And sometimes the
> necessary space may not be available at all.
> 
> 
> The question is how to change the import path so that virt-v2v doesn't
> have to extract the OVA. I can see the following solutions:
> 
>  1) Solve it in virt-v2v: create a layer for directly accessing the
>     files in the archive.
>
>  2) Solve it in QEMU: create a backing method that would allow creating
>     a qemu disk backed by the archive.

As long as the tar file is not compressed, accessing a file within it
should be trivial.

I asked Kevin if there is a way to get qemu to access a disk image at
an offset within another file, but there is no such feature at the
moment.  It's possible with `losetup', but that requires root :-(

(At this point I would normally grumble about how easy this would be
with a microkernel, but I won't do that now.)

David Gilbert suggested looking at qemu-nbd which has an --offset
option, allowing a particular offset within another file to be accessed.
If we wanted to do it entirely within virt-v2v, I think this would be
the way to go - the complex logic could be hidden inside v2v/input_ova.ml

The second problem is to work out the right offset to use.  I suspect
this is something that http://www.libarchive.org/ can do, and that
package is also in RHEL.
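
As a rough illustration of working out the offset (a sketch only, not
what virt-v2v does): Python's standard tarfile module records where each
member's data starts. The function name member_extent and the ".vmdk"
suffix are assumptions made for this example, and it assumes an
uncompressed tar whose disk member is stored as plain, non-sparse data.

```python
import tarfile


def member_extent(ova_path, suffix=".vmdk"):
    """Return (name, data_offset, size) of the first regular member
    whose name ends with `suffix` (hypothetical helper).

    Only the tar headers are parsed; the member data itself is skipped.
    """
    # Mode "r:" refuses compressed archives, where a byte offset into
    # the file on disk would be meaningless.
    with tarfile.open(ova_path, mode="r:") as tar:
        for info in tar:
            if info.isfile() and info.name.endswith(suffix):
                # offset_data is the byte position in the archive at
                # which this member's contents begin.
                return info.name, info.offset_data, info.size
    raise LookupError("no member ending in %r in %s" % (suffix, ova_path))
```

The returned offset and size are the two numbers that something like
losetup or qemu-nbd --offset would need in order to expose the member
without extracting it.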

We could even imagine a qemu block backend based on libarchive.

>  3) Solve it on oVirt side: use some FUSE-based tool to provide
>     access to the archive and pass the OVA to virt-v2v not as a file but
>     as a directory.

http://www.cybernoia.de/software/archivemount/ is one such tool which
can do this.  It's not in RHEL, but it seems to be based on
libarchive.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Michal Skrivanek

2016-Sep-09 12:32 UTC

Re: [Libguestfs] Extracting files from OVA is bad

> On 09 Sep 2016, at 14:02, Richard W.M. Jones <rjones@redhat.com> wrote:
> 
> On Fri, Sep 09, 2016 at 01:03:49PM +0200, Tomáš Golembiovský wrote:
>> Hi,
>> 
>> recently we (oVirt) have started discussing whether the way virt-v2v
>> handles import from OVA files is good. And I would be interested in
>> ideas how it can be improved. It is likely somebody already gave some
>> thought to this problem.
>> 
>> TL;DR: Extracting the OVA before import is a problem for large VMs (in
>> sizes of TBs). Can we change something to prevent the extraction and
>> work directly over OVA?
> 
> Specifically virt-v2v needs to do:
> 
> qemu-img create -b <source-file-within-the-tarball> -f qcow2 overlay.qcow2
> qemu-img convert overlay.qcow2 output
> 
>> What we consider a huge shortcoming is the fact that the whole OVA is
>> extracted into a temporary directory prior to the import and processed
>> afterwards. In a normal situation the user can have up to three copies
>> of the VM on his drive at the end of the import:
>> 
>>  * original OVA,
>>  * temporary extracted files (will be deleted when virt-v2v terminates),
>>  * converted VM.
>> 
>> 
>> This is not a good idea for large VMs that are hundreds of GBs or even
>> TBs in size. The storage requirements can be lessened with proper
>> partitioning, i.e. the source OVA and the converted VM don't end up on
>> the same drive and TMPDIR is set to put the temporary files somewhere
>> else as well. But this is not a general solution. And sometimes the
>> necessary space may not be available at all.
>> 
>> 
>> The question is how to change the import path so that virt-v2v doesn't
>> have to extract the OVA. I can see the following solutions:
>> 
>> 1) Solve it in virt-v2v: create a layer for directly accessing the
>>    files in the archive.
>> 
>> 2) Solve it in QEMU: create a backing method that would allow creating
>>    a qemu disk backed by the archive.
> 
> As long as the tar file is not compressed, accessing a file within it
> should be trivial.

The OVA standard [1] talks about compression, but it looks like it’s
meant only for individual disks inside the archive. It doesn’t seem to
be clear about it. The OVF XML is guaranteed to be at the beginning, so
even if the archive were compressed, not much of it would have to be
read to get at the descriptor. Well, I guess it’s reasonable to start
with plain tar regardless.
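
A minimal sketch of reading just the descriptor, assuming (as said
above) that it is the first member of the archive. It uses Python's
standard tarfile module in stream mode, which reads sequentially and
decompresses transparently, so it never scans past the first member.
The name read_first_member is made up for this example.

```python
import tarfile


def read_first_member(ova_path):
    """Return (name, contents) of the first member of the archive
    (hypothetical helper).

    Mode "r|*" opens the archive as a forward-only stream and detects
    gzip/bzip2/xz compression automatically.
    """
    with tarfile.open(ova_path, mode="r|*") as tar:
        first = tar.next()
        if first is None or not first.isfile():
            raise LookupError("archive is empty or does not start "
                              "with a regular file")
        # In stream mode a member has to be read before moving on to
        # the next one, which is all that is needed here.
        return first.name, tar.extractfile(first).read()
```

Whether the contents really are the OVF descriptor still has to be
checked by the caller, e.g. by looking at the member name.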

Thanks,
michal

[1] http://www.dmtf.org/sites/default/files/standards/documents/DSP0243_2.1.1.pdf
> 
> I asked Kevin if there is a way to get qemu to access a disk image at
> an offset within another file, but there is no such feature at the
> moment.  It's possible with `losetup', but that requires root :-(
> 
> (At this point I would normally grumble about how easy this would be
> with a microkernel, but I won't do that now.)
> 
> David Gilbert suggested looking at qemu-nbd which has an --offset
> option, allowing a particular offset within another file to be accessed.
> If we wanted to do it entirely within virt-v2v, I think this would be
> the way to go - the complex logic could be hidden inside v2v/input_ova.ml
> 
> The second problem is to work out the right offset to use.  I suspect
> this is something that http://www.libarchive.org/ can do, and that
> package is also in RHEL.
> 
> We could even imagine a qemu block backend based on libarchive.
> 
>> 3) Solve it on oVirt side: use some FUSE-based tool to provide
>>    access to the archive and pass the OVA to virt-v2v not as a file but
>>    as a directory.
> 
> http://www.cybernoia.de/software/archivemount/ is one such tool which
> can do this.  It's not in RHEL, but it seems to be based on
> libarchive.
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> libguestfs lets you edit virtual machines.  Supports shell scripting,
> bindings from many languages.  http://libguestfs.org
