We were seeing 100% failure rates transferring 10G disk images from ESX on a particular setup. We also weren't spotting the transfer failure, and dying with a strange error from libguestfs. These 2 patches fix the error check which should have made it obvious what was failing, and the underlying error.
Matthew Booth
2010-Apr-28 16:18 UTC
[Libguestfs] [PATCH 1/2] ESX: Fix check that full volume was transferred
The check of transferred size against expected size was done after returning the
volume, so was never usefully happening.
---
 lib/Sys/VirtV2V/Transfer/ESX.pm |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/Sys/VirtV2V/Transfer/ESX.pm b/lib/Sys/VirtV2V/Transfer/ESX.pm
index 4d65d5e..fb0c6ea 100644
--- a/lib/Sys/VirtV2V/Transfer/ESX.pm
+++ b/lib/Sys/VirtV2V/Transfer/ESX.pm
@@ -131,16 +131,18 @@ sub get_volume
         my $died = $r->header('X-Died');
         die($died) if (defined($died));
 
+        # It reports success even if we didn't receive the whole file
+        die(user_message(__x("Didn't receive full volume. Received {received} ".
+                             "of {total} bytes.",
+                             received => $self->{_v2v_received},
+                             total => $self->{_v2v_volsize})))
+            unless ($self->{_v2v_received} == $self->{_v2v_volsize});
+
         my $vol = $self->{_v2v_vol};
         $vol->close();
         return $vol;
     }
 
-    die(user_message(__x("Didn't receive full volume. Received {received} of ".
-                         "{total} bytes.",
-                         received => $self->{_v2v_received},
-                         total => $self->{_v2v_volsize})))
-        unless ($self->{_v2v_received} == $self->{_v2v_volsize});
 
     if ($r->code == 401) {
         die(user_message(__x("Authentication error connecting to ".
-- 
1.6.6.1
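
The control-flow point of this patch can be shown in isolation. The following is a minimal sketch, not code from ESX.pm: finish_transfer and its arguments are hypothetical names chosen for illustration. It assumes only what the patch describes, namely that a transfer can report success while delivering fewer bytes than expected, so the size comparison has to run inside the success branch, before the return.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ESX.pm): decide what to do once the
# HTTP layer has finished. $ok is its success flag, $received the number
# of bytes actually written, $total the expected volume size.
sub finish_transfer {
    my ($ok, $received, $total) = @_;

    if ($ok) {
        # The check must sit here. Placed after this block it would only
        # run on the failure path, because the success path returns first.
        die "Didn't receive full volume. Received $received of $total bytes.\n"
            unless $received == $total;

        return 'volume';
    }

    die "Transfer failed\n";
}
```

With the check in this position, a short transfer that the HTTP layer reports as successful is rejected instead of being handed on as a complete volume.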
Matthew Booth
2010-Apr-28 16:18 UTC
[Libguestfs] [PATCH 2/2] ESX: Get volume size in a separate HEAD request
We used to fetch the volume in a single GET request. After receiving the header,
which contains the content length, we created the target volume and opened it
for writing before continuing to receive data. This created a problem if volume
creation took long enough for the transfer to time out.

To fix this, we first get the size with a HEAD request, create the volume, then
do a separate GET request to receive the data.

This fixes RHBZ#586816
---
 lib/Sys/VirtV2V/Transfer/ESX.pm |   30 ++++++++++++++++++++++++------
 1 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/lib/Sys/VirtV2V/Transfer/ESX.pm b/lib/Sys/VirtV2V/Transfer/ESX.pm
index fb0c6ea..f638149 100644
--- a/lib/Sys/VirtV2V/Transfer/ESX.pm
+++ b/lib/Sys/VirtV2V/Transfer/ESX.pm
@@ -121,10 +121,22 @@ sub get_volume
         return $target->get_volume($volname);
     }
 
+    # Head request to get the size and create the volume
+    # We could do this with a single GET request. The problem with this is that
+    # you have to create the volume before writing to it. If the volume creation
+    # takes a very long time, the transfer may fail in the mean time.
+    my $r = $self->head($url);
+    if ($r->is_success) {
+        $self->verify_certificate($r) unless ($self->{_v2v_noverify});
+        $self->create_volume($r);
+    } else {
+        $self->report_error($r);
+    }
+
     $self->{_v2v_received} = 0;
-    my $r = $self->SUPER::get($url,
-                              ':content_cb' => sub { $self->handle_data(@_); },
-                              ':read_size_hint' => 64 * 1024);
+    $r = $self->get($url,
+                    ':content_cb' => sub { $self->handle_data(@_); },
+                    ':read_size_hint' => 64 * 1024);
 
     if ($r->is_success) {
         # It reports success even if one of the callbacks died
@@ -143,6 +155,13 @@ sub get_volume
         return $vol;
     }
 
+    $self->report_error($r);
+}
+
+sub report_error
+{
+    my $self = shift;
+    my ($r) = @_;
 
     if ($r->code == 401) {
         die(user_message(__x("Authentication error connecting to ".
@@ -173,10 +192,9 @@ sub handle_data
 
     my ($data, $response) = @_;
 
-    # Create the volume if it hasn't been created already
-    if (!defined($self->{_v2v_vol}) && $response->is_success) {
+    # Verify the certificate of the get request the first time we're called
+    if ($self->{_v2v_received} == 0) {
         $self->verify_certificate($response) unless ($self->{_v2v_noverify});
-        $self->create_volume($response);
     }
 
     $self->{_v2v_received} += length($data);
-- 
1.6.6.1
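
The ordering this patch introduces (HEAD for the size, create the volume, then GET the data) can be sketched without any network access. Everything named below is an assumption made for illustration and none of it comes from ESX.pm: fetch_volume, the $client hash of coderefs standing in for the HTTP layer, the $create_volume callback, and the example URL. The real module subclasses an HTTP user agent instead.

```perl
use strict;
use warnings;

# Sketch only. $client is a hypothetical hash of coderefs standing in for
# the HTTP layer; $create_volume stands in for a possibly slow allocation.
sub fetch_volume {
    my ($client, $url, $create_volume) = @_;

    # 1. HEAD: learn the size without opening a data transfer.
    my $head = $client->{head}->($url);
    die "HEAD $url failed\n" unless $head->{success};
    my $total = $head->{content_length};

    # 2. Create the target. No transfer is in flight yet, so a slow
    #    allocation cannot cause the transfer to time out.
    my $vol = $create_volume->($total);

    # 3. GET: stream the data into the volume, counting bytes as they arrive.
    my $received = 0;
    my $get = $client->{get}->($url, sub {
        my ($data) = @_;
        $received += length($data);
        push @{ $vol->{chunks} }, $data;
    });
    die "GET $url failed\n" unless $get->{success};

    # A successful status does not prove that the whole body arrived.
    die "Didn't receive full volume. Received $received of $total bytes.\n"
        unless $received == $total;

    return $vol;
}

# Usage with an in-memory fake client that delivers the body in two chunks.
my $body = 'x' x 10;
my %fake = (
    head => sub { return +{ success => 1, content_length => length($body) } },
    get  => sub {
        my ($url, $cb) = @_;
        $cb->(substr($body, 0, 4));
        $cb->(substr($body, 4));
        return +{ success => 1 };
    },
);
my $vol = fetch_volume(\%fake, 'https://esx.example/vol',
                       sub { return +{ size => $_[0], chunks => [] } });
```

The cost of this design is one extra round trip per volume; in exchange, the time spent creating the volume no longer counts against an open connection.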
On Wed, Apr 28, 2010 at 05:18:57PM +0100, Matthew Booth wrote:
> We were seeing 100% failure rates transferring 10G disk images from ESX on a
> particular setup. We also weren't spotting the transfer failure, and dying with
> a strange error from libguestfs. These 2 patches fix the error check which
> should have made it obvious what was failing, and the underlying error.

Heh, I thought for a minute you were saying libguestfs wasn't
reporting an error!

ACK to both patches, they both make sense.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora