We were seeing 100% failure rates transferring 10G disk images from ESX on a particular setup. We also weren't spotting the transfer failure, and dying with a strange error from libguestfs. These 2 patches fix the error check which should have made it obvious what was failing, and the underlying error.
Matthew Booth
2010-Apr-28 16:18 UTC
[Libguestfs] [PATCH 1/2] ESX: Fix check that full volume was transferred
The check of transferred size against expected size was done after returning the
volume, so it never ran when the request reported success.
---
lib/Sys/VirtV2V/Transfer/ESX.pm | 12 +++++++-----
1 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/lib/Sys/VirtV2V/Transfer/ESX.pm b/lib/Sys/VirtV2V/Transfer/ESX.pm
index 4d65d5e..fb0c6ea 100644
--- a/lib/Sys/VirtV2V/Transfer/ESX.pm
+++ b/lib/Sys/VirtV2V/Transfer/ESX.pm
@@ -131,16 +131,18 @@ sub get_volume
my $died = $r->header('X-Died');
die($died) if (defined($died));
+ # It reports success even if we didn't receive the whole file
+ die(user_message(__x("Didn't receive full volume. Received {received} ".
+ "of {total} bytes.",
+ received => $self->{_v2v_received},
+ total => $self->{_v2v_volsize})))
+ unless ($self->{_v2v_received} == $self->{_v2v_volsize});
+
my $vol = $self->{_v2v_vol};
$vol->close();
return $vol;
}
- die(user_message(__x("Didn't receive full volume. Received {received} of ".
- "{total} bytes.",
- received => $self->{_v2v_received},
- total => $self->{_v2v_volsize})))
- unless ($self->{_v2v_received} == $self->{_v2v_volsize});
if ($r->code == 401) {
die(user_message(__x("Authentication error connecting to ".
--
1.6.6.1
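
The ordering problem fixed by this patch can be sketched in isolation. This is a minimal illustration, not code from virt-v2v: finish_transfer and its arguments are hypothetical names, and a plain die stands in for user_message(__x(...)). The point is only that the size check has to run before the return, otherwise a short transfer is handed back as if it were complete.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tail of get_volume: verify the byte
# count before handing the volume back to the caller.
sub finish_transfer {
    my ($received, $total, $vol) = @_;

    # This must come before the return. Placed after it, the check is
    # never reached on the success path.
    die "Didn't receive full volume. Received $received of $total bytes.\n"
        unless $received == $total;

    return $vol;
}

# A complete transfer returns the volume.
my $ok = eval { finish_transfer(1024, 1024, 'vol0') };

# A short transfer dies; eval leaves the message in $@.
my $short = eval { finish_transfer(512, 1024, 'vol0') };
my $error = $@;
```

Here $ok holds the volume, $short is undefined, and $error carries the byte counts that make the failure obvious to the caller.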
Matthew Booth
2010-Apr-28 16:18 UTC
[Libguestfs] [PATCH 2/2] ESX: Get volume size in a separate HEAD request
We used to fetch the volume in a single GET request. After receiving the header, which
contains the content length, we created the target volume and opened it for
writing before continuing to receive data. This created a problem if volume
creation took long enough for the transfer to time out.
To fix this, we first get the size with a HEAD request, create the volume, and
then issue a separate GET request to receive the data.
This fixes RHBZ#586816
---
lib/Sys/VirtV2V/Transfer/ESX.pm | 30 ++++++++++++++++++++++++------
1 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/lib/Sys/VirtV2V/Transfer/ESX.pm b/lib/Sys/VirtV2V/Transfer/ESX.pm
index fb0c6ea..f638149 100644
--- a/lib/Sys/VirtV2V/Transfer/ESX.pm
+++ b/lib/Sys/VirtV2V/Transfer/ESX.pm
@@ -121,10 +121,22 @@ sub get_volume
return $target->get_volume($volname);
}
+ # Head request to get the size and create the volume
+ # We could do this with a single GET request. The problem with this is that
+ # you have to create the volume before writing to it. If the volume creation
+ # takes a very long time, the transfer may fail in the mean time.
+ my $r = $self->head($url);
+ if ($r->is_success) {
+ $self->verify_certificate($r) unless ($self->{_v2v_noverify});
+ $self->create_volume($r);
+ } else {
+ $self->report_error($r);
+ }
+
$self->{_v2v_received} = 0;
- my $r = $self->SUPER::get($url,
- ':content_cb' => sub { $self->handle_data(@_); },
- ':read_size_hint' => 64 * 1024);
+ $r = $self->get($url,
+ ':content_cb' => sub { $self->handle_data(@_); },
+ ':read_size_hint' => 64 * 1024);
if ($r->is_success) {
# It reports success even if one of the callbacks died
@@ -143,6 +155,13 @@ sub get_volume
return $vol;
}
+ $self->report_error($r);
+}
+
+sub report_error
+{
+ my $self = shift;
+ my ($r) = @_;
if ($r->code == 401) {
die(user_message(__x("Authentication error connecting to ".
@@ -173,10 +192,9 @@ sub handle_data
my ($data, $response) = @_;
- # Create the volume if it hasn't been created already
- if (!defined($self->{_v2v_vol}) && $response->is_success) {
+ # Verify the certificate of the get request the first time we're called
+ if ($self->{_v2v_received} == 0) {
$self->verify_certificate($response) unless ($self->{_v2v_noverify});
- $self->create_volume($response);
}
$self->{_v2v_received} += length($data);
--
1.6.6.1
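
The HEAD-then-GET flow described in this patch can be sketched outside the ESX transfer class. This is a rough illustration under stated assumptions, not the virt-v2v implementation: fetch_volume and the $create callback are hypothetical, $ua is assumed to be any object with LWP::UserAgent-style head() and get() methods, and certificate verification and the 401 handling from report_error are left out.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of virt-v2v. $ua is any object offering
# LWP::UserAgent-style head() and get(); $create is a callback that
# allocates the target and returns a writable filehandle.
sub fetch_volume {
    my ($ua, $url, $create) = @_;

    # Request 1: HEAD returns headers only, so no data transfer is in
    # flight (and able to time out) while the volume is being created.
    my $head = $ua->head($url);
    die "HEAD failed: " . $head->status_line . "\n" unless $head->is_success;

    my $total = $head->header('Content-Length');
    die "Server did not report a Content-Length\n" unless defined $total;

    my $out = $create->($total);

    # Request 2: GET streams the body to the callback in chunks.
    my $received = 0;
    my $get = $ua->get($url,
        ':content_cb' => sub {
            my ($data) = @_;
            print {$out} $data;
            $received += length($data);
        },
        ':read_size_hint' => 64 * 1024);
    die "GET failed: " . $get->status_line . "\n" unless $get->is_success;

    # LWP reports success even if the callback died; the message is
    # stored in the X-Died header.
    my $died = $get->header('X-Died');
    die $died if defined $died;

    # Success is also reported for a truncated body, so compare sizes.
    die "Received $received of $total bytes\n" unless $received == $total;
    return $received;
}
```

With libwww-perl installed, a call might look like fetch_volume(LWP::UserAgent->new, $url, sub { ... }), where the callback creates the target and returns a handle opened for writing. The remaining cost is one extra round trip per volume, which is small next to a multi-gigabyte transfer.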
On Wed, Apr 28, 2010 at 05:18:57PM +0100, Matthew Booth wrote:
> We were seeing 100% failure rates transferring 10G disk images from ESX on a
> particular setup. We also weren't spotting the transfer failure, and dying with
> a strange error from libguestfs. These 2 patches fix the error check which
> should have made it obvious what was failing, and the underlying error.

Heh, I thought for a minute you were saying libguestfs wasn't
reporting an error!

ACK to both patches, they both make sense.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora