thr3ads.net - Libguestfs - [Libguestfs] [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption. [Jul 2012]

If this information is useful, please help other people find it:
Share via:

Richard W.M. Jones

2012-Jul-03 18:03 UTC

[Libguestfs] [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption.

https://bugzilla.redhat.com/show_bug.cgi?id=836710
https://bugzilla.redhat.com/show_bug.cgi?id=836913

There are at least two related bugs going on:

(1) Linux sync(2) system call doesn't send a write barrier to the
disk, so in effect it doesn't force the hard disk to flush its cache.
libguestfs used sync(2) to force changes to disk.  We didn't expect
that qemu was caching anything because we used 'cache=none' for all
writable disks, but it turns out that qemu creates a writeback cache
anyway when you do this (you need to use 'cache=directsync' when you
don't want a cache at all).

(2) qemu's qcow2 disk cache code is buggy.  If there are I/Os in
flight when qemu shuts down, then qemu segfaults or assert fails.
This can result in unwritten data.  Unfortunately libguestfs ignored
the result of waitpid(2) so we didn't see this problem happening.

Patch 1/7 fixes the first problem by issuing fsync(2) on each whole
block device when we sync.

Patches 2/7 - 7/7 are needed to fix the second problem.  We add a new
API (guestfs_shutdown) so that we can actually catch the case where
qemu is segfaulting instead of just ignoring it.  Since qemu itself
isn't likely to be fixed any time soon, patch 7/7 adds a crude but
effective workaround to virt-resize.

Rich.

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 1/7] daemon: Run fsync on block devices after sync (RHBZ#836710).

From: "Richard W.M. Jones" <rjones at redhat.com>

On Linux, sync(2) does not actually issue a write barrier, thus it
doesn't force a flush of the underlying hardware write cache (or
qemu's disk cache in the virtual case).

This can be a problem, because libguestfs relies on running sync in
the appliance, followed by killing qemu (using SIGTERM).

In most cases, this is fine, because killing qemu with SIGTERM should
cause it to flush out the disk cache before it exits.  However we have
found various bugs in qemu which cause qemu to crash while doing the
flush, leaving the data unwritten (see RHBZ#836913).

The solution is to issue fsync(2) to the block devices.  This has a
write barrier, so it ensures that qemu writes out its cache long
before we get around to killing qemu.
---
 configure.ac  |    1 +
 daemon/sync.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/configure.ac b/configure.ac
index c37b7f5..e4b3e70 100644
--- a/configure.ac
+++ b/configure.ac
@@ -204,6 +204,7 @@ AC_CHECK_HEADERS([\
 
 dnl Functions.
 AC_CHECK_FUNCS([\
+        fsync \
         futimens \
         getxattr \
         htonl \
diff --git a/daemon/sync.c b/daemon/sync.c
index fcb887e..2338a3d 100644
--- a/daemon/sync.c
+++ b/daemon/sync.c
@@ -23,7 +23,11 @@
 #endif
 
 #include <stdio.h>
+#include <stdlib.h>
 #include <unistd.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <sys/types.h>
 
 #include "daemon.h"
 #include "actions.h"
@@ -32,6 +36,10 @@
 static int sync_win32 (void);
 #endif
 
+#ifdef HAVE_FSYNC
+static void fsync_devices (void);
+#endif
+
 int
 do_sync (void)
 {
@@ -52,6 +60,18 @@ sync_disks (void)
 {
 #if defined(HAVE_SYNC)
   sync ();
+
+  /* On Linux, sync(2) doesn't perform a barrier, so qemu (which may
+   * have a writeback cache, even with cache=none) will still have
+   * some unwritten data.  Force the data out of any qemu caches, by
+   * calling fsync on all block devices.  Note we still need the
+   * call to sync above in order to schedule the writes.
+   * Thanks to: Avi Kivity, Kevin Wolf.
+   */
+#ifdef HAVE_FSYNC
+  fsync_devices ();
+#endif
+
   return 0;
 #elif defined(WIN32)
   return sync_win32 ();
@@ -60,6 +80,64 @@ sync_disks (void)
 #endif
 }
 
+#ifdef HAVE_FSYNC
+static void
+fsync_devices (void)
+{
+  DIR *dir;
+  struct dirent *d;
+  char dev_path[256];
+  int fd;
+
+  dir = opendir ("/sys/block");
+  if (!dir) {
+    perror ("opendir: /sys/block");
+    return;
+  }
+
+  for (;;) {
+    errno = 0;
+    d = readdir(dir);
+    if (!d) break;
+
+    if (STREQLEN (d->d_name, "sd", 2) ||
+        STREQLEN (d->d_name, "hd", 2) ||
+        STREQLEN (d->d_name, "vd", 2) ||
+        STREQLEN (d->d_name, "sr", 2)) {
+      snprintf (dev_path, sizeof dev_path, "/dev/%s", d->d_name);
+
+      /* Ignore the root device. */
+      if (is_root_device (dev_path))
+        continue;
+
+      fd = open (dev_path, O_RDONLY|O_CLOEXEC);
+      if (fd == -1) {
+        perror (dev_path);
+        continue;
+      }
+
+      /* fsync the device. */
+      if (verbose)
+        fprintf (stderr, "fsync %s\n", dev_path);
+
+      if (fsync (fd) == -1)
+        perror ("fsync");
+
+      if (close (fd) == -1)
+        perror ("close");
+    }
+  }
+
+  /* Check readdir didn't fail */
+  if (errno != 0)
+    perror ("readdir: /sys/block");
+
+  /* Close the directory handle */
+  if (closedir (dir) == -1)
+    perror ("closedir");
+}
+#endif /* HAVE_FSYNC */
+
 #ifdef WIN32
 static int
 sync_win32 (void)
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 2/7] examples: In create_disk example, don't call set_autosync.

From: "Richard W.M. Jones" <rjones at redhat.com>

This is now set by default in all supported versions of libguestfs.
It's just confusing if the examples refer to it.
---
 erlang/examples/create_disk.erl |    8 --------
 examples/create_disk.c          |    9 ---------
 java/examples/CreateDisk.java   |    8 --------
 ocaml/examples/create_disk.ml   |   11 +----------
 perl/examples/create_disk.pl    |    7 -------
 python/examples/create_disk.py  |    7 -------
 ruby/examples/create_disk.rb    |    7 -------
 7 files changed, 1 insertion(+), 56 deletions(-)

diff --git a/erlang/examples/create_disk.erl b/erlang/examples/create_disk.erl
index 231c398..a0c9044 100755
--- a/erlang/examples/create_disk.erl
+++ b/erlang/examples/create_disk.erl
@@ -16,10 +16,6 @@ main(_) ->
     % Set the trace flag so that we can see each libguestfs call.
     ok = guestfs:set_trace(G, true),
 
-    % Set the autosync flag so that the disk will be synchronized
-    % automatically when the libguestfs handle is closed.
-    ok = guestfs:set_autosync(G, true),
-
     % Attach the disk image to libguestfs.
     ok = guestfs:add_drive_opts(G, Output,
                                 [{format, "raw"}, {readonly,
false}]),
@@ -55,10 +51,6 @@ main(_) ->
     % the disk image.
     ok = guestfs:upload(G, "/etc/resolv.conf",
"/foo/resolv.conf"),
 
-    % Because 'autosync' was set (above) we can just close the handle
-    % and the disk contents will be synchronized.  You can also do
-    % this manually by calling guestfs:umount_all and guestfs:sync.
-    %
     % Note also that handles are automatically closed if they are
     % reaped by the garbage collector.  You only need to call close
     % if you want to close the handle right away.
diff --git a/examples/create_disk.c b/examples/create_disk.c
index bcad6d8..5380888 100644
--- a/examples/create_disk.c
+++ b/examples/create_disk.c
@@ -37,11 +37,6 @@ main (int argc, char *argv[])
   /* Set the trace flag so that we can see each libguestfs call. */
   guestfs_set_trace (g, 1);
 
-  /* Set the autosync flag so that the disk will be synchronized
-   * automatically when the libguestfs handle is closed.
-   */
-  guestfs_set_autosync (g, 1);
-
   /* Add the disk image to libguestfs. */
   if (guestfs_add_drive_opts (g, "disk.img",
         GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw", /* raw format */
@@ -104,10 +99,6 @@ main (int argc, char *argv[])
   if (guestfs_upload (g, "/etc/resolv.conf",
"/foo/resolv.conf") == -1)
     exit (EXIT_FAILURE);
 
-  /* Because 'autosync' was set (above) we can just close the handle
-   * and the disk contents will be synchronized.  You can also do
-   * this manually by calling guestfs_umount_all and guestfs_sync.
-   */
   guestfs_close (g);
 
   /* Free up the lists. */
diff --git a/java/examples/CreateDisk.java b/java/examples/CreateDisk.java
index 4742b5a..41f8739 100644
--- a/java/examples/CreateDisk.java
+++ b/java/examples/CreateDisk.java
@@ -22,10 +22,6 @@ public class CreateDisk
             // Set the trace flag so that we can see each libguestfs call.
             g.set_trace (true);
 
-            // Set the autosync flag so that the disk will be synchronized
-            // automatically when the libguestfs handle is closed.
-            g.set_autosync (true);
-
             // Attach the disk image to libguestfs.
             Map<String, Object> optargs = new HashMap<String,
Object>() {
                 {
@@ -70,10 +66,6 @@ public class CreateDisk
             // the disk image.
             g.upload ("/etc/resolv.conf",
"/foo/resolv.conf");
 
-            // Because 'autosync' was set (above) we can just close the
handle
-            // and the disk contents will be synchronized.  You can also do
-            // this manually by calling g#umount_all and g#sync.
-            //
             // Note also that handles are automatically closed if they are
             // reaped by the garbage collector.  You only need to call close
             // if you want to close the handle right away.
diff --git a/ocaml/examples/create_disk.ml b/ocaml/examples/create_disk.ml
index 0f9f941..75dbdc3 100644
--- a/ocaml/examples/create_disk.ml
+++ b/ocaml/examples/create_disk.ml
@@ -16,11 +16,6 @@ let ()    (* Set the trace flag so that we can see each
libguestfs call. *)
   g#set_trace true;
 
-  (* Set the autosync flag so that the disk will be synchronized
-   * automatically when the libguestfs handle is closed.
-   *)
-  g#set_autosync true;
-
   (* Attach the disk image to libguestfs. *)
   g#add_drive_opts ~format:"raw" ~readonly:false output;
 
@@ -62,11 +57,7 @@ let ()     *)
   g#upload "/etc/resolv.conf" "/foo/resolv.conf";
 
-  (* Because 'autosync' was set (above) we can just close the handle
-   * and the disk contents will be synchronized.  You can also do
-   * this manually by calling g#umount_all and g#sync.
-   *
-   * Note also that handles are automatically closed if they are
+  (* Note also that handles are automatically closed if they are
    * reaped by the garbage collector.  You only need to call close
    * if you want to close the handle right away.
    *)
diff --git a/perl/examples/create_disk.pl b/perl/examples/create_disk.pl
index 5a81663..49538a5 100755
--- a/perl/examples/create_disk.pl
+++ b/perl/examples/create_disk.pl
@@ -17,10 +17,6 @@ close FILE or die "$output: $!";
 # Set the trace flag so that we can see each libguestfs call.
 $g->set_trace (1);
 
-# Set the autosync flag so that the disk will be synchronized
-# automatically when the libguestfs handle is closed.
-$g->set_autosync (1);
-
 # Attach the disk image to libguestfs.
 $g->add_drive_opts ($output, format => "raw", readonly =>
0);
 
@@ -61,7 +57,4 @@ $g->mkdir ("/foo");
 # the disk image.
 $g->upload ("/etc/resolv.conf", "/foo/resolv.conf");
 
-# Because 'autosync' was set (above) we can just exit here
-# and the disk contents will be synchronized.  You can also do
-# this manually by calling $g->umount_all and $g->sync.
 exit 0
diff --git a/python/examples/create_disk.py b/python/examples/create_disk.py
index c80fc6d..bf47c91 100644
--- a/python/examples/create_disk.py
+++ b/python/examples/create_disk.py
@@ -15,10 +15,6 @@ f.close ()
 # Set the trace flag so that we can see each libguestfs call.
 g.set_trace (1)
 
-# Set the autosync flag so that the disk will be synchronized
-# automatically when the libguestfs handle is closed.
-g.set_autosync (1)
-
 # Attach the disk image to libguestfs.
 g.add_drive_opts (output, format = "raw", readonly = 0)
 
@@ -55,7 +51,4 @@ g.mkdir ("/foo")
 # the disk image.
 g.upload ("/etc/resolv.conf", "/foo/resolv.conf")
 
-# Because 'autosync' was set (above) we can just close the handle
-# and the disk contents will be synchronized.  You can also do
-# this manually by calling g.umount_all and g.sync.
 g.close ()
diff --git a/ruby/examples/create_disk.rb b/ruby/examples/create_disk.rb
index 16871cb..4e42cf6 100644
--- a/ruby/examples/create_disk.rb
+++ b/ruby/examples/create_disk.rb
@@ -14,10 +14,6 @@ File.open(output, "w") {
 # Set the trace flag so that we can see each libguestfs call.
 g.set_trace(1)
 
-# Set the autosync flag so that the disk will be synchronized
-# automatically when the libguestfs handle is closed.
-g.set_autosync(1)
-
 # Attach the disk image to libguestfs.
 g.add_drive_opts(output, :format => "raw")
 
@@ -58,7 +54,4 @@ g.mkdir("/foo")
 # the disk image.
 g.upload("/etc/resolv.conf", "/foo/resolv.conf")
 
-# Because 'autosync' was set (above) we can just close the handle
-# and the disk contents will be synchronized.  You can also do
-# this manually by calling g#umount_all and g#sync.
 g.close()
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 3/7] perl, python, ruby: Fix comments on call to close method.

From: "Richard W.M. Jones" <rjones at redhat.com>

Make the comments consistent.

Also make the Perl example call $g->close explicitly so it is
consistent with the other examples.
---
 perl/examples/create_disk.pl   |    5 ++++-
 python/examples/create_disk.py |    3 +++
 ruby/examples/create_disk.rb   |    3 +++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/perl/examples/create_disk.pl b/perl/examples/create_disk.pl
index 49538a5..fa15bc7 100755
--- a/perl/examples/create_disk.pl
+++ b/perl/examples/create_disk.pl
@@ -57,4 +57,7 @@ $g->mkdir ("/foo");
 # the disk image.
 $g->upload ("/etc/resolv.conf", "/foo/resolv.conf");
 
-exit 0
+# Note also that handles are automatically closed if they are
+# reaped by reference counting.  You only need to call close
+# if you want to close the handle right away.
+$g->close ();
diff --git a/python/examples/create_disk.py b/python/examples/create_disk.py
index bf47c91..0885383 100644
--- a/python/examples/create_disk.py
+++ b/python/examples/create_disk.py
@@ -51,4 +51,7 @@ g.mkdir ("/foo")
 # the disk image.
 g.upload ("/etc/resolv.conf", "/foo/resolv.conf")
 
+# Note also that handles are automatically closed if they are
+# reaped by reference counting.  You only need to call close
+# if you want to close the handle right away.
 g.close ()
diff --git a/ruby/examples/create_disk.rb b/ruby/examples/create_disk.rb
index 4e42cf6..026ea2e 100644
--- a/ruby/examples/create_disk.rb
+++ b/ruby/examples/create_disk.rb
@@ -54,4 +54,7 @@ g.mkdir("/foo")
 # the disk image.
 g.upload("/etc/resolv.conf", "/foo/resolv.conf")
 
+# Note also that handles are automatically closed if they are
+# reaped by the garbage collector.  You only need to call close
+# if you want to close the handle right away.
 g.close()
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 4/7] close: Rearrange the order in which the handle is closed and freed.

From: "Richard W.M. Jones" <rjones at redhat.com>

The order is now:

 - remove the handle from the list of handles
 - send close trace message
 - sync and shutdown qemu
 - run user close callback
 - free temporary directory
 - free memory

This commit ought to be no functional change.
---
 src/guestfs.c |   64 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/src/guestfs.c b/src/guestfs.c
index 32bcbef..16d11da 100644
--- a/src/guestfs.c
+++ b/src/guestfs.c
@@ -178,6 +178,19 @@ guestfs_close (guestfs_h *g)
     return;
   }
 
+  /* Remove the handle from the handles list. */
+  gl_lock_lock (handles_lock);
+  if (handles == g)
+    handles = g->next;
+  else {
+    guestfs_h *gg;
+
+    for (gg = handles; gg->next != g; gg = gg->next)
+      ;
+    gg->next = g->next;
+  }
+  gl_lock_unlock (handles_lock);
+
   if (g->trace) {
     const char trace_msg[] = "close";
 
@@ -203,19 +216,6 @@ guestfs_close (guestfs_h *g)
     ignore_value (guestfs_kill_subprocess (g));
 #endif
 
-  /* Run user close callbacks. */
-  guestfs___call_callbacks_void (g, GUESTFS_EVENT_CLOSE);
-
-  /* Remove all other registered callbacks.  Since we've already
-   * called the close callbacks, we shouldn't call any others.
-   */
-  free (g->events);
-  g->nr_events = 0;
-  g->events = NULL;
-
-  guestfs___free_inspect_info (g);
-  guestfs___free_drives (&g->drives);
-
   /* Close sockets. */
   if (g->fd[0] >= 0)
     close (g->fd[0]);
@@ -231,9 +231,25 @@ guestfs_close (guestfs_h *g)
   if (g->pid > 0) waitpid (g->pid, NULL, 0);
   if (g->recoverypid > 0) waitpid (g->recoverypid, NULL, 0);
 
+  /* Run user close callbacks. */
+  guestfs___call_callbacks_void (g, GUESTFS_EVENT_CLOSE);
+
   /* Remove whole temporary directory. */
   guestfs___remove_tmpdir (g->tmpdir);
-  free (g->tmpdir);
+
+  /* Mark the handle as dead and then free up all memory. */
+  g->state = NO_HANDLE;
+
+  free (g->events);
+  g->nr_events = 0;
+  g->events = NULL;
+
+#if HAVE_FUSE
+  guestfs___free_fuse (g);
+#endif
+
+  guestfs___free_inspect_info (g);
+  guestfs___free_drives (&g->drives);
 
   if (g->cmdline) {
     size_t i;
@@ -243,27 +259,9 @@ guestfs_close (guestfs_h *g)
     free (g->cmdline);
   }
 
-  /* Mark the handle as dead before freeing it. */
-  g->state = NO_HANDLE;
-
-  gl_lock_lock (handles_lock);
-  if (handles == g)
-    handles = g->next;
-  else {
-    guestfs_h *gg;
-
-    for (gg = handles; gg->next != g; gg = gg->next)
-      ;
-    gg->next = g->next;
-  }
-  gl_lock_unlock (handles_lock);
-
-#if HAVE_FUSE
-  guestfs___free_fuse (g);
-#endif
-
   if (g->pda)
     hash_free (g->pda);
+  free (g->tmpdir);
   free (g->last_error);
   free (g->path);
   free (g->qemu);
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 5/7] close: Warn if qemu exits unsuccessfully.

From: "Richard W.M. Jones" <rjones at redhat.com>

Currently guestfs_close has no method to return an error indication,
so this commit simply prints the error on stderr.
---
 src/guestfs.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/guestfs.c b/src/guestfs.c
index 16d11da..e36ad46 100644
--- a/src/guestfs.c
+++ b/src/guestfs.c
@@ -172,6 +172,8 @@ guestfs_create (void)
 void
 guestfs_close (guestfs_h *g)
 {
+  int status, sig;
+
   if (g->state == NO_HANDLE) {
     /* Not safe to call ANY callbacks here, so ... */
     fprintf (stderr, _("guestfs_close: called twice on the same
handle\n"));
@@ -228,7 +230,23 @@ guestfs_close (guestfs_h *g)
   g->sock = -1;
 
   /* Wait for subprocess(es) to exit. */
-  if (g->pid > 0) waitpid (g->pid, NULL, 0);
+  if (g->pid > 0) {
+    if (waitpid (g->pid, &status, 0) == -1)
+      perror ("waitpid (qemu)");
+    if (WIFEXITED (status) && WEXITSTATUS (status) != 0)
+      fprintf (stderr, "libguestfs: close: qemu failed (status
%d)\n",
+               WEXITSTATUS (status));
+    else if (WIFSIGNALED (status)) {
+      sig = WTERMSIG (status);
+      fprintf (stderr, "libguestfs: close: qemu terminated by signal %d
(%s)\n",
+               sig, strsignal (sig));
+    }
+    else if (WIFSTOPPED (status)) {
+      sig = WSTOPSIG (status);
+      fprintf (stderr, "libguestfs: close: qemu stopped by signal %d
(%s)\n",
+               sig, strsignal (sig));
+    }
+  }
   if (g->recoverypid > 0) waitpid (g->recoverypid, NULL, 0);
 
   /* Run user close callbacks. */
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 6/7] New API: guestfs_shutdown: Cleanly shutdown the backend.

From: "Richard W.M. Jones" <rjones at redhat.com>

The new API splits orderly close into a two-step process:

  if (guestfs_shutdown (g) == -1) {
    /* handle the error, eg. qemu error */
  }
  guestfs_close (g);

Note that the explicit shutdown step is only necessary in the case
where you have made changes to the disk image and want to handle write
errors.  Read the documentation for further information.

This change also:

 - deprecates guestfs_kill_subprocess

 - turns guestfs_kill_subprocess into the same as guestfs_shutdown

 - changes guestfish and other tools to call shutdown + close
   where necessary (not for read-only tools)

 - updates documentation

 - updates examples
---
 edit/virt-edit.c                            |    2 +-
 erlang/examples/create_disk.erl             |    5 ++
 examples/copy_over.c                        |    6 +-
 examples/create_disk.c                      |    7 ++
 examples/mount_local.c                      |    4 +-
 fish/fish.c                                 |    3 +
 fish/reopen.c                               |    3 +
 fish/test-events.sh                         |    3 +
 format/format.c                             |    4 ++
 fuse/guestmount.c                           |    2 +
 generator/generator_actions.ml              |   26 ++++++-
 java/examples/CreateDisk.java               |    5 ++
 java/t/GuestFS010Basic.java                 |    1 +
 ocaml/examples/create_disk.ml               |    6 ++
 ocaml/t/guestfs_010_basic.ml                |    1 +
 ocaml/t/guestfs_070_threads.ml              |    3 +-
 ocaml/t/guestfs_500_parallel_mount_local.ml |    1 +
 perl/examples/create_disk.pl                |    5 ++
 python/examples/create_disk.py              |    5 ++
 python/t/010-basic.py                       |    1 +
 rescue/virt-rescue.c                        |    4 +-
 resize/resize.ml                            |    8 +--
 ruby/examples/create_disk.rb                |    5 ++
 sparsify/sparsify.ml                        |    1 +
 src/guestfs.c                               |  103 +++++++++++++++++----------
 src/guestfs.pod                             |   25 ++++---
 src/launch.c                                |   12 +---
 sysprep/main.ml                             |    1 +
 test-tool/test-tool.c                       |    5 ++
 tests/btrfs/test-btrfs-subvolume-default.pl |    1 +
 tests/disks/test-max-disks.pl               |    2 +-
 tests/guests/guest-aux/make-fedora-img.pl   |    3 +
 tests/selinux/run-test.pl                   |    1 +
 tools/virt-make-fs                          |    5 +-
 tools/virt-tar                              |    6 +-
 tools/virt-win-reg                          |    6 +-
 36 files changed, 193 insertions(+), 88 deletions(-)

diff --git a/edit/virt-edit.c b/edit/virt-edit.c
index c042745..3d23da8 100644
--- a/edit/virt-edit.c
+++ b/edit/virt-edit.c
@@ -298,7 +298,7 @@ main (int argc, char *argv[])
   free (root);
 
   /* Cleanly unmount the disks after editing. */
-  if (guestfs_umount_all (g) == -1 || guestfs_sync (g) == -1)
+  if (guestfs_shutdown (g) == -1)
     exit (EXIT_FAILURE);
 
   guestfs_close (g);
diff --git a/erlang/examples/create_disk.erl b/erlang/examples/create_disk.erl
index a0c9044..d192435 100755
--- a/erlang/examples/create_disk.erl
+++ b/erlang/examples/create_disk.erl
@@ -51,6 +51,11 @@ main(_) ->
     % the disk image.
     ok = guestfs:upload(G, "/etc/resolv.conf",
"/foo/resolv.conf"),
 
+    % Because we wrote to the disk and we want to detect write
+    % errors, call guestfs:shutdown.  You don't need to do this:
+    % guestfs:close will do it implicitly.
+    ok = guestfs:shutdown(G),
+
     % Note also that handles are automatically closed if they are
     % reaped by the garbage collector.  You only need to call close
     % if you want to close the handle right away.
diff --git a/examples/copy_over.c b/examples/copy_over.c
index 9cff00b..45b8fa6 100644
--- a/examples/copy_over.c
+++ b/examples/copy_over.c
@@ -145,7 +145,7 @@ main (int argc, char *argv[])
   }
 
   /* Clean up. */
-  if (guestfs_umount_all (destg) == -1)
+  if (guestfs_shutdown (destg) == -1)
     exit (EXIT_FAILURE);
   guestfs_close (destg);
 
@@ -193,10 +193,6 @@ start_srcthread (void *arg)
   }
 
   /* Clean up. */
-  if (guestfs_umount_all (srcg) == -1) {
-    pthread_cancel (threaddata->mainthread);
-    exit (EXIT_FAILURE);
-  }
   guestfs_close (srcg);
 
   return NULL;
diff --git a/examples/create_disk.c b/examples/create_disk.c
index 5380888..e314328 100644
--- a/examples/create_disk.c
+++ b/examples/create_disk.c
@@ -99,6 +99,13 @@ main (int argc, char *argv[])
   if (guestfs_upload (g, "/etc/resolv.conf",
"/foo/resolv.conf") == -1)
     exit (EXIT_FAILURE);
 
+  /* Because we wrote to the disk and we want to detect write
+   * errors, call guestfs_shutdown.  You don't need to do this:
+   * guestfs_close will do it implicitly.
+   */
+  if (guestfs_shutdown (g) == -1)
+    exit (EXIT_FAILURE);
+
   guestfs_close (g);
 
   /* Free up the lists. */
diff --git a/examples/mount_local.c b/examples/mount_local.c
index 240e31f..5c24cd1 100644
--- a/examples/mount_local.c
+++ b/examples/mount_local.c
@@ -186,8 +186,8 @@ main (int argc, char *argv[])
 
   waitpid (pid, NULL, 0);
 
-  /* Unmount and close. */
-  if (guestfs_umount (g, "/") == -1)
+  /* Shutdown the handle explicitly so write errors can be detected. */
+  if (guestfs_shutdown (g) == -1)
     exit (EXIT_FAILURE);
 
   guestfs_close (g);
diff --git a/fish/fish.c b/fish/fish.c
index 80b3364..ded80ec 100644
--- a/fish/fish.c
+++ b/fish/fish.c
@@ -553,6 +553,9 @@ main (int argc, char *argv[])
   else
     cmdline (argv, optind, argc);
 
+  if (guestfs_shutdown (g) == -1)
+    exit (EXIT_FAILURE);
+
   guestfs_close (g);
 
  out_after_handle_close:
diff --git a/fish/reopen.c b/fish/reopen.c
index 3fc9510..6b3707e 100644
--- a/fish/reopen.c
+++ b/fish/reopen.c
@@ -40,6 +40,9 @@ run_reopen (const char *cmd, size_t argc, char *argv[])
     return -1;
   }
 
+  if (guestfs_shutdown (g) == -1)
+    return -1;
+
   /* Open the new handle first, so we can copy the settings from the
    * old one to the new one, and also so if it fails we still have an
    * open handle.
diff --git a/fish/test-events.sh b/fish/test-events.sh
index 76d3823..b3684a8 100755
--- a/fish/test-events.sh
+++ b/fish/test-events.sh
@@ -53,6 +53,9 @@ if [ "$(cat test.out)" != '"ev1" (0):
*: echo $EVENT $@
 "ev1" (1): *: echo $EVENT $@
 "ev2" (2): *: echo $EVENT $@
 "ev2" (2): *: echo $EVENT $@
+enter shutdown
+trace shutdown
+trace shutdown = 0
 enter get_autosync
 trace get_autosync
 trace get_autosync = 1
diff --git a/format/format.c b/format/format.c
index b9bab4d..bbe7f5c 100644
--- a/format/format.c
+++ b/format/format.c
@@ -260,6 +260,8 @@ main (int argc, char *argv[])
       guestfs_set_verbose (g2, guestfs_get_verbose (g));
       guestfs_set_trace (g2, guestfs_get_trace (g));
 
+      if (guestfs_shutdown (g) == -1)
+        exit (EXIT_FAILURE);
       guestfs_close (g);
       g = g2;
     }
@@ -278,6 +280,8 @@ main (int argc, char *argv[])
   /* Free up data structures. */
   free_drives (drvs);
 
+  if (guestfs_shutdown (g) == -1)
+    exit (EXIT_FAILURE);
   guestfs_close (g);
 
   exit (EXIT_SUCCESS);
diff --git a/fuse/guestmount.c b/fuse/guestmount.c
index b6c6467..0c2e2dc 100644
--- a/fuse/guestmount.c
+++ b/fuse/guestmount.c
@@ -431,6 +431,8 @@ main (int argc, char *argv[])
   r = guestfs_mount_local_run (g);
 
   /* Cleanup. */
+  if (guestfs_shutdown (g) == -1)
+    r = -1;
   guestfs_close (g);
 
   exit (r == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
diff --git a/generator/generator_actions.ml b/generator/generator_actions.ml
index e6219d1..5baa9b2 100644
--- a/generator/generator_actions.ml
+++ b/generator/generator_actions.ml
@@ -137,11 +137,13 @@ If you see any calls to this function in code then you can
just
 remove them, unless you want to retain compatibility with older
 versions of the API.");
 
-  ("kill_subprocess", (RErr, [], []), -1, [],
+  ("kill_subprocess", (RErr, [], []), -1, [DeprecatedBy
"shutdown"],
    [],
    "kill the qemu subprocess",
    "\
-This kills the qemu subprocess.  You should never need to call this.");
+This kills the qemu subprocess.
+
+Do not call this.  See: C<guestfs_shutdown> instead.");
 
   ("add_drive", (RErr, [String "filename"], []), -1,
[ConfigOnly],
    [],
@@ -1766,6 +1768,26 @@ Converted to C</dev/VG/LV> form using
C<guestfs_lvm_canonical_lvm_name>.
 
 Other strings are returned unmodified.");
 
+  ("shutdown", (RErr, [], []), -1, [],
+   [],
+   "shutdown the qemu subprocess",
+   "\
+This is the opposite of C<guestfs_launch>.  It performs an orderly
+shutdown of the backend process(es).  If the autosync flag is set
+(which is the default) then the disk image is synchronized.
+
+If the subprocess exits with an error then this function will return
+an error, which should I<not> be ignored (it may indicate that the
+disk image could not be written out properly).
+
+It is safe to call this multiple times.  Extra calls are ignored.
+
+This call does I<not> close or free up the handle.  You still
+need to call C<guestfs_close> afterwards.
+
+C<guestfs_close> will call this if you don't do it explicitly,
+but note that any errors are ignored in that case.");
+
 ]
 
 (* daemon_functions are any functions which cause some action
diff --git a/java/examples/CreateDisk.java b/java/examples/CreateDisk.java
index 41f8739..a6fcfed 100644
--- a/java/examples/CreateDisk.java
+++ b/java/examples/CreateDisk.java
@@ -66,6 +66,11 @@ public class CreateDisk
             // the disk image.
             g.upload ("/etc/resolv.conf",
"/foo/resolv.conf");
 
+            // Because we wrote to the disk and we want to detect write
+            // errors, call g.shutdown.  You don't need to do this:
+            //g.close will do it implicitly.
+            g.shutdown ();
+
             // Note also that handles are automatically closed if they are
             // reaped by the garbage collector.  You only need to call close
             // if you want to close the handle right away.
diff --git a/java/t/GuestFS010Basic.java b/java/t/GuestFS010Basic.java
index 21f88dd..a44fde1 100644
--- a/java/t/GuestFS010Basic.java
+++ b/java/t/GuestFS010Basic.java
@@ -55,6 +55,7 @@ public class GuestFS010Basic
             assert m.get ("/dev/VG/LV1").equals ("ext2");
             assert m.get ("/dev/VG/LV2").equals
("unknown");
 
+            g.shutdown ();
             g.close ();
 
             File f2 = new File ("test.img");
diff --git a/ocaml/examples/create_disk.ml b/ocaml/examples/create_disk.ml
index 75dbdc3..4437b1c 100644
--- a/ocaml/examples/create_disk.ml
+++ b/ocaml/examples/create_disk.ml
@@ -57,6 +57,12 @@ let ()     *)
   g#upload "/etc/resolv.conf" "/foo/resolv.conf";
 
+  (* Because we wrote to the disk and we want to detect write
+   * errors, call g#shutdown.  You don't need to do this:
+   * g#close will do it implicitly.
+   *)
+  g#shutdown ();
+
   (* Note also that handles are automatically closed if they are
    * reaped by the garbage collector.  You only need to call close
    * if you want to close the handle right away.
diff --git a/ocaml/t/guestfs_010_basic.ml b/ocaml/t/guestfs_010_basic.ml
index b6750a1..bb96737 100644
--- a/ocaml/t/guestfs_010_basic.ml
+++ b/ocaml/t/guestfs_010_basic.ml
@@ -61,5 +61,6 @@ let ()                 "q", 'r' ] then
     failwith "Guestfs.readdir returned incorrect result";
 
+  g#shutdown ();
   g#close ();
   unlink "test.img"
diff --git a/ocaml/t/guestfs_070_threads.ml b/ocaml/t/guestfs_070_threads.ml
index 8b22f93..419f853 100644
--- a/ocaml/t/guestfs_070_threads.ml
+++ b/ocaml/t/guestfs_070_threads.ml
@@ -65,8 +65,7 @@ let ()    let s = String.copy "/test" in
   Guestfs.touch g s;
 
-  Guestfs.umount_all g;
-  Guestfs.sync g;
+  Guestfs.shutdown g;
   Guestfs.close g;
   unlink "test.img";
   Gc.compact ();
diff --git a/ocaml/t/guestfs_500_parallel_mount_local.ml
b/ocaml/t/guestfs_500_parallel_mount_local.ml
index 5a432f0..a26ff9d 100644
--- a/ocaml/t/guestfs_500_parallel_mount_local.ml
+++ b/ocaml/t/guestfs_500_parallel_mount_local.ml
@@ -145,6 +145,7 @@ and start_thread (filename, mp)    in
   loop ();
 
+  g#shutdown ();
   g#close ()
 
 (* This is run in a child program. *)
diff --git a/perl/examples/create_disk.pl b/perl/examples/create_disk.pl
index fa15bc7..11b9146 100755
--- a/perl/examples/create_disk.pl
+++ b/perl/examples/create_disk.pl
@@ -57,6 +57,11 @@ $g->mkdir ("/foo");
 # the disk image.
 $g->upload ("/etc/resolv.conf", "/foo/resolv.conf");
 
+# Because we wrote to the disk and we want to detect write
+# errors, call $g->shutdown.  You don't need to do this:
+# $g->close will do it implicitly.
+$g->shutdown ();
+
 # Note also that handles are automatically closed if they are
 # reaped by reference counting.  You only need to call close
 # if you want to close the handle right away.
diff --git a/python/examples/create_disk.py b/python/examples/create_disk.py
index 0885383..86f5042 100644
--- a/python/examples/create_disk.py
+++ b/python/examples/create_disk.py
@@ -51,6 +51,11 @@ g.mkdir ("/foo")
 # the disk image.
 g.upload ("/etc/resolv.conf", "/foo/resolv.conf")
 
+# Because we wrote to the disk and we want to detect write
+# errors, call g.shutdown.  You don't need to do this:
+# g.close will do it implicitly.
+g.shutdown ()
+
 # Note also that handles are automatically closed if they are
 # reaped by reference counting.  You only need to call close
 # if you want to close the handle right away.
diff --git a/python/t/010-basic.py b/python/t/010-basic.py
index a97f3d2..779af48 100644
--- a/python/t/010-basic.py
+++ b/python/t/010-basic.py
@@ -31,6 +31,7 @@ g.lvcreate ("LV1", "VG", 200)
 g.lvcreate ("LV2", "VG", 200)
 if (g.lvs () != ["/dev/VG/LV1", "/dev/VG/LV2"]):
     raise "Error: g.lvs() returned incorrect result"
+g.shutdown ()
 g.close ()
 
 os.unlink ("test.img")
diff --git a/rescue/virt-rescue.c b/rescue/virt-rescue.c
index 684baeb..f1f8d85 100644
--- a/rescue/virt-rescue.c
+++ b/rescue/virt-rescue.c
@@ -341,7 +341,9 @@ main (int argc, char *argv[])
    */
   guestfs_set_error_handler (g, NULL, NULL);
 
-  /* We expect launch to fail, so ignore the return value. */
+  /* We expect launch to fail, so ignore the return value, and don't
+   * bother with explicit guestfs_shutdown either.
+   */
   ignore_value (guestfs_launch (g));
 
   guestfs_close (g);
diff --git a/resize/resize.ml b/resize/resize.ml
index 5e323ba..cd4c9d6 100644
--- a/resize/resize.ml
+++ b/resize/resize.ml
@@ -836,7 +836,7 @@ let g      if ok then g, true
     else if attempts > 0 then (
       g#zero "/dev/sdb";
-      g#sync ();
+      g#shutdown ();
       g#close ();
 
       let g = connect_both_disks () in
@@ -1120,8 +1120,7 @@ let to_be_expanded  
 let g    if to_be_expanded then (
-    g#umount_all ();
-    g#sync ();
+    g#shutdown ();
     g#close ();
 
     let g = new G.guestfs () in
@@ -1192,8 +1191,7 @@ let ()  
 (* Finished.  Unmount disks and exit. *)
 let () -  g#umount_all ();
-  g#sync ();
+  g#shutdown ();
   g#close ();
 
   if not quiet then (
diff --git a/ruby/examples/create_disk.rb b/ruby/examples/create_disk.rb
index 026ea2e..32fb117 100644
--- a/ruby/examples/create_disk.rb
+++ b/ruby/examples/create_disk.rb
@@ -54,6 +54,11 @@ g.mkdir("/foo")
 # the disk image.
 g.upload("/etc/resolv.conf", "/foo/resolv.conf")
 
+# Because we wrote to the disk and we want to detect write
+# errors, call g.shutdown.  You don't need to do this:
+# g.close will do it implicitly.
+g.shutdown()
+
 # Note also that handles are automatically closed if they are
 # reaped by the garbage collector.  You only need to call close
 # if you want to close the handle right away.
diff --git a/sparsify/sparsify.ml b/sparsify/sparsify.ml
index ec3512a..6f1635b 100644
--- a/sparsify/sparsify.ml
+++ b/sparsify/sparsify.ml
@@ -295,6 +295,7 @@ let ()  
 (* Don't need libguestfs now. *)
 let () +  g#shutdown ();
   g#close ()
 
 (* What should the output format be?  If the user specified an
diff --git a/src/guestfs.c b/src/guestfs.c
index e36ad46..93ae247 100644
--- a/src/guestfs.c
+++ b/src/guestfs.c
@@ -172,8 +172,6 @@ guestfs_create (void)
 void
 guestfs_close (guestfs_h *g)
 {
-  int status, sig;
-
   if (g->state == NO_HANDLE) {
     /* Not safe to call ANY callbacks here, so ... */
     fprintf (stderr, _("guestfs_close: called twice on the same
handle\n"));
@@ -202,10 +200,6 @@ guestfs_close (guestfs_h *g)
 
   debug (g, "closing guestfs handle %p (state %d)", g, g->state);
 
-  /* Try to sync if autosync flag is set. */
-  if (g->autosync && g->state == READY)
-    ignore_value (guestfs_internal_autosync (g));
-
   /* If we are valgrinding the daemon, then we *don't* want to kill
    * the subprocess because we want the final valgrind messages sent
    * when we close sockets below.  However for normal production use,
@@ -213,42 +207,10 @@ guestfs_close (guestfs_h *g)
    * daemon or qemu is not responding).
    */
 #ifndef VALGRIND_DAEMON
-  /* Kill the qemu subprocess. */
   if (g->state != CONFIG)
-    ignore_value (guestfs_kill_subprocess (g));
+    ignore_value (guestfs_shutdown (g));
 #endif
 
-  /* Close sockets. */
-  if (g->fd[0] >= 0)
-    close (g->fd[0]);
-  if (g->fd[1] >= 0)
-    close (g->fd[1]);
-  if (g->sock >= 0)
-    close (g->sock);
-  g->fd[0] = -1;
-  g->fd[1] = -1;
-  g->sock = -1;
-
-  /* Wait for subprocess(es) to exit. */
-  if (g->pid > 0) {
-    if (waitpid (g->pid, &status, 0) == -1)
-      perror ("waitpid (qemu)");
-    if (WIFEXITED (status) && WEXITSTATUS (status) != 0)
-      fprintf (stderr, "libguestfs: close: qemu failed (status
%d)\n",
-               WEXITSTATUS (status));
-    else if (WIFSIGNALED (status)) {
-      sig = WTERMSIG (status);
-      fprintf (stderr, "libguestfs: close: qemu terminated by signal %d
(%s)\n",
-               sig, strsignal (sig));
-    }
-    else if (WIFSTOPPED (status)) {
-      sig = WSTOPSIG (status);
-      fprintf (stderr, "libguestfs: close: qemu stopped by signal %d
(%s)\n",
-               sig, strsignal (sig));
-    }
-  }
-  if (g->recoverypid > 0) waitpid (g->recoverypid, NULL, 0);
-
   /* Run user close callbacks. */
   guestfs___call_callbacks_void (g, GUESTFS_EVENT_CLOSE);
 
@@ -290,6 +252,69 @@ guestfs_close (guestfs_h *g)
   free (g);
 }
 
+/* Shutdown the backend. */
+int
+guestfs__shutdown (guestfs_h *g)
+{
+  int ret = 0;
+  int status, sig;
+
+  if (g->state == CONFIG)
+    return 0;
+
+  /* Try to sync if autosync flag is set. */
+  if (g->autosync && g->state == READY) {
+    if (guestfs_internal_autosync (g) == -1)
+      ret = -1;
+  }
+
+  /* Signal qemu to shutdown cleanly, and kill the recovery process. */
+  if (g->pid > 0) {
+    debug (g, "sending SIGTERM to process %d", g->pid);
+    kill (g->pid, SIGTERM);
+  }
+  if (g->recoverypid > 0) kill (g->recoverypid, 9);
+
+  /* Close sockets. */
+  if (g->fd[0] >= 0)
+    close (g->fd[0]);
+  if (g->fd[1] >= 0)
+    close (g->fd[1]);
+  if (g->sock >= 0)
+    close (g->sock);
+  g->fd[0] = -1;
+  g->fd[1] = -1;
+  g->sock = -1;
+
+  /* Wait for subprocess(es) to exit. */
+  if (g->pid > 0) {
+    if (waitpid (g->pid, &status, 0) == -1) {
+      perrorf (g, "waitpid (qemu)");
+      ret = -1;
+    }
+    else if (WIFEXITED (status) && WEXITSTATUS (status) != 0) {
+      error (g, "qemu failed (status %d)", WEXITSTATUS (status));
+      ret = -1;
+    }
+    else if (WIFSIGNALED (status)) {
+      sig = WTERMSIG (status);
+      error (g, "qemu terminated by signal %d (%s)", sig, strsignal
(sig));
+      ret = -1;
+    }
+    else if (WIFSTOPPED (status)) {
+      sig = WSTOPSIG (status);
+      error (g, "qemu stopped by signal %d (%s)", sig, strsignal
(sig));
+      ret = -1;
+    }
+  }
+  if (g->recoverypid > 0) waitpid (g->recoverypid, NULL, 0);
+
+  g->pid = g->recoverypid = 0;
+  g->state = CONFIG;
+
+  return ret;
+}
+
 /* Close all open handles (called from atexit(3)). */
 static void
 close_handles (void)
diff --git a/src/guestfs.pod b/src/guestfs.pod
index 6959f50..9538f38 100644
--- a/src/guestfs.pod
+++ b/src/guestfs.pod
@@ -14,6 +14,7 @@ guestfs - Library for accessing and modifying virtual machine
images
  guestfs_mount (g, "/dev/sda1", "/");
  guestfs_touch (g, "/hello");
  guestfs_umount (g, "/");
+ guestfs_shutdown (g);
  guestfs_close (g);
 
  cc prog.c -o prog -lguestfs
@@ -83,13 +84,10 @@ this:
   */
  guestfs_touch (g, "/hello");
  
- /* This is only needed for libguestfs < 1.5.24.  Since then
-  * it is done automatically when you close the handle.  See
-  * discussion of autosync in this page.
-  */
- guestfs_sync (g);
+ /* Synchronize the disk.  This is the opposite of guestfs_launch. */
+ guestfs_shutdown (g);
  
- /* Close the handle 'g'. */
+ /* Close and free the handle 'g'. */
  guestfs_close (g);
 
 The code above doesn't include any error checking.  In real code you
@@ -1424,9 +1422,18 @@ L</ERROR HANDLING> section below.
 
 This closes the connection handle and frees up all resources used.
 
-If autosync was set on the handle and the handle was launched, then
-this implicitly calls various functions to unmount filesystems and
-sync the disk.  See L</guestfs_set_autosync> for more details.
+The correct way to close the handle is:
+
+ /* guestfs_shutdown is only needed if ALL of the following are true:
+  * (1) one or more disks were added in read-write mode,
+  * (2) guestfs_launch was called,
+  * (3) you made some changes,
+  * (4) you want to handle write errors properly.
+  */
+ if (guestfs_shutdown (g) == -1) {
+   /* handle write errors here */
+ }
+ guestfs_close (g);
 
 If a close callback was set on the handle, then it is called.
 
diff --git a/src/launch.c b/src/launch.c
index 95694ff..af5ca9f 100644
--- a/src/launch.c
+++ b/src/launch.c
@@ -1629,17 +1629,7 @@ guestfs__wait_ready (guestfs_h *g)
 int
 guestfs__kill_subprocess (guestfs_h *g)
 {
-  if (g->state == CONFIG) {
-    error (g, _("no subprocess to kill"));
-    return -1;
-  }
-
-  debug (g, "sending SIGTERM to process %d", g->pid);
-
-  if (g->pid > 0) kill (g->pid, SIGTERM);
-  if (g->recoverypid > 0) kill (g->recoverypid, 9);
-
-  return 0;
+  return guestfs__shutdown (g);
 }
 
 /* Maximum number of disks. */
diff --git a/sysprep/main.ml b/sysprep/main.ml
index ca5664d..05dd57d 100644
--- a/sysprep/main.ml
+++ b/sysprep/main.ml
@@ -239,6 +239,7 @@ let ()  
 (* Finished. *)
 let () +  g#shutdown ();
   g#close ();
 
   if debug_gc then
diff --git a/test-tool/test-tool.c b/test-tool/test-tool.c
index f0e3a31..865720e 100644
--- a/test-tool/test-tool.c
+++ b/test-tool/test-tool.c
@@ -255,6 +255,11 @@ main (int argc, char *argv[])
   }
 
   /* Close the handle. */
+  if (guestfs_shutdown (g) == -1) {
+    fprintf (stderr, _("libguestfs-test-tool: shutdown failed\n"));
+    exit (EXIT_FAILURE);
+  }
+
   guestfs_close (g);
 
   /* Booted and performed some simple operations -- success! */
diff --git a/tests/btrfs/test-btrfs-subvolume-default.pl
b/tests/btrfs/test-btrfs-subvolume-default.pl
index 013291b..2c09ce2 100755
--- a/tests/btrfs/test-btrfs-subvolume-default.pl
+++ b/tests/btrfs/test-btrfs-subvolume-default.pl
@@ -90,6 +90,7 @@ $g->mount ("/dev/sda1", "/");
 # Now we're back to the original default volume, so this should work:
 $g->mkdir ("/test1/foo/bar/baz");
 
+$g->shutdown ();
 $g->close ();
 
 unlink $testimg or die "$testimg: unlink: $!";
diff --git a/tests/disks/test-max-disks.pl b/tests/disks/test-max-disks.pl
index 3ced971..a0bba7f 100755
--- a/tests/disks/test-max-disks.pl
+++ b/tests/disks/test-max-disks.pl
@@ -140,7 +140,7 @@ for ($i = 0, $j = 0; $i < $max_disks; ++$i) {
     }
 }
 
-$g->umount_all ();
+$g->shutdown ();
 $g->close ();
 
 for ($i = 0; $i < $max_disks; ++$i) {
diff --git a/tests/guests/guest-aux/make-fedora-img.pl
b/tests/guests/guest-aux/make-fedora-img.pl
index 96e4168..6a0a183 100755
--- a/tests/guests/guest-aux/make-fedora-img.pl
+++ b/tests/guests/guest-aux/make-fedora-img.pl
@@ -195,6 +195,9 @@ $g->mkfs_opts ('ext2', '/dev/VG/LV2',
blocksize => 1024);
 $g->mkfs_opts ('ext2', '/dev/VG/LV3', blocksize => 2048);
 
 # Cleanup
+$g->shutdown ();
+$g->close ();
+
 unlink ("fstab.tmp.$$") or die;
 foreach my $img (@images) {
   $img =~ /^(.*)\.tmp\.\d+$/ or die;
diff --git a/tests/selinux/run-test.pl b/tests/selinux/run-test.pl
index 6b0ffb5..bbe3a2f 100755
--- a/tests/selinux/run-test.pl
+++ b/tests/selinux/run-test.pl
@@ -154,6 +154,7 @@ if ($test_via eq "direct") {
 }
 
 # Finish up.
+$g->shutdown ();
 $g->close ();
 unlink $testimg or die "$testimg: $!";
 
diff --git a/tools/virt-make-fs b/tools/virt-make-fs
index 7d03280..b53e860 100755
--- a/tools/virt-make-fs
+++ b/tools/virt-make-fs
@@ -539,9 +539,8 @@ eval {
     }
 
     print STDERR "finishing off\n" if $debug;
-    $g->umount_all ();
-    $g->sync ();
-    undef $g;
+    $g->shutdown ();
+    $g->close ()
 };
 if ($@) {
     # Error: delete the output before exiting.
diff --git a/tools/virt-tar b/tools/virt-tar
index 777dbbd..801104f 100755
--- a/tools/virt-tar
+++ b/tools/virt-tar
@@ -279,10 +279,8 @@ if ($mode eq "x") {
     }
 }
 
-$g->sync ();
-$g->umount_all ();
-
-undef $g;
+$g->shutdown ();
+$g->close ();
 
 exit 0;
 
diff --git a/tools/virt-win-reg b/tools/virt-win-reg
index ab5daa7..2caa954 100755
--- a/tools/virt-win-reg
+++ b/tools/virt-win-reg
@@ -337,9 +337,9 @@ else {                          # Import mode.
         upload_hive ($hiveshortname, $hives{$hiveshortname}->{hivefile})
     }
 
-    # Sync everything.
-    $g->umount_all ();
-    $g->sync ();
+    # Close.
+    $g->shutdown ();
+    $g->close ();
 }
 
 exit 0;
-- 
1.7.10.4

Richard W.M. Jones

2012-Jul-03 18:03 UTC

head link

[Libguestfs] [PATCH 7/7] resize: Add workaround for qcow2 disk cache bug in qemu (RHBZ#836710).

From: "Richard W.M. Jones" <rjones at redhat.com>

Best to read the comment.
---
 resize/Makefile.am |    2 +-
 resize/resize.ml   |   33 ++++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/resize/Makefile.am b/resize/Makefile.am
index 70ace37..1234c96 100644
--- a/resize/Makefile.am
+++ b/resize/Makefile.am
@@ -56,7 +56,7 @@ bin_SCRIPTS = virt-resize
 # -I $(top_builddir)/src/.libs is a hack which forces corresponding -L
 # option to be passed to gcc, so we don't try linking against an
 # installed copy of libguestfs.
-OCAMLPACKAGES = -package str -I $(top_builddir)/src/.libs -I ../ocaml
+OCAMLPACKAGES = -package str,unix -I $(top_builddir)/src/.libs -I ../ocaml
 if HAVE_OCAML_PKG_GETTEXT
 OCAMLPACKAGES += -package gettext-stub
 endif
diff --git a/resize/resize.ml b/resize/resize.ml
index cd4c9d6..3118d74 100644
--- a/resize/resize.ml
+++ b/resize/resize.ml
@@ -1010,6 +1010,36 @@ let ()  
 (* Copy over the data. *)
 let () +  (* Obviously when you've got a function called
'hack_for_...' you
+   * know it cannot be good.
+   *
+   * This works around a bug in qemu's qcow2 block driver.  If there
+   * are I/Os in flight and you send SIGTERM to qemu, then qemu
+   * segfaults.  This particularly happens when the output file is
+   * growing rapidly (because of the partition copies below) and we
+   * close the handle (which kills qemu).
+   *
+   * The ugly workaround is to monitor the disk file and wait until it
+   * stops growing before closing the handle.
+   *
+   * https://bugzilla.redhat.com/show_bug.cgi?id=836710
+   * https://bugzilla.redhat.com/show_bug.cgi?id=836913
+   *)
+  let hack_for_rhbz836710 g ?format outfile +    match format with
+    | None | Some "qcow2" -> (* only for qcow2 or auto *)
+      let get_size () = (Unix.LargeFile.stat outfile).Unix.LargeFile.st_size in
+      let rec loop size +        g#sync ();
+        g#sleep 1;
+        let size' = get_size () in
+        if size <> size' then
+          loop size'
+      in
+      loop (get_size ())
+    | _ -> ()
+  in
+
   List.iter (
     fun p ->
       match p.p_operation with
@@ -1033,7 +1063,8 @@ let ()  
         (match p.p_type with
          | ContentUnknown | ContentPV _ | ContentFS _ ->
-           g#copy_device_to_device ~size:copysize source target
+           g#copy_device_to_device ~size:copysize source target;
+           hack_for_rhbz836710 g ?format:output_format outfile
 
          | ContentExtendedPartition ->
            (* You can't just copy an extended partition by name, eg.
-- 
1.7.10.4

Pádraig Brady

2012-Jul-04 13:38 UTC

head link

[Libguestfs] [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption.

On 07/03/2012 07:03 PM, Richard W.M. Jones wrote:> https://bugzilla.redhat.com/show_bug.cgi?id=836710
> https://bugzilla.redhat.com/show_bug.cgi?id=836913
> 
> There are at least two related bugs going on:
> 
> (1) Linux sync(2) system call doesn't send a write barrier to the
> disk, so in effect it doesn't force the hard disk to flush its cache.
> libguestfs used sync(2) to force changes to disk.
Surprising. So sync(2) is currently async. Ho hum.
I just noticed Jan Kara's patch set today actually:
https://lkml.org/lkml/2012/7/3/272
Would fix the issue at the kernel level?
>  We didn't expect
> that qemu was caching anything because we used 'cache=none' for all
> writable disks, but it turns out that qemu creates a writeback cache
> anyway when you do this (you need to use 'cache=directsync' when
you
> don't want a cache at all).
And we're not using 'directsync' for performance reasons?
> (2) qemu's qcow2 disk cache code is buggy.  If there are I/Os in
> flight when qemu shuts down, then qemu segfaults or assert fails.
> This can result in unwritten data.  Unfortunately libguestfs ignored
> the result of waitpid(2) so we didn't see this problem happening.
> 
> Patch 1/7 fixes the first problem by issuing fsync(2) on each whole
> block device when we sync.
> 
> Patches 2/7 - 7/7 are needed to fix the second problem.  We add a new
> API (guestfs_shutdown) so that we can actually catch the case where
> qemu is segfaulting instead of just ignoring it.  Since qemu itself
> isn't likely to be fixed any time soon, patch 7/7 adds a crude but
> effective workaround to virt-resize.
thanks for looking into this tricky issue so thoroughly,
P?draig.

Reasonably Related Threads

Search for more possibly parallel threads

Libguestfs - Jul 2012 - [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption.

[Libguestfs] [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption.

[Libguestfs] [PATCH 1/7] daemon: Run fsync on block devices after sync (RHBZ#836710).

[Libguestfs] [PATCH 2/7] examples: In create_disk example, don't call set_autosync.

[Libguestfs] [PATCH 3/7] perl, python, ruby: Fix comments on call to close method.

[Libguestfs] [PATCH 4/7] close: Rearrange the order in which the handle is closed and freed.

[Libguestfs] [PATCH 5/7] close: Warn if qemu exits unsuccessfully.

[Libguestfs] [PATCH 6/7] New API: guestfs_shutdown: Cleanly shutdown the backend.

[Libguestfs] [PATCH 7/7] resize: Add workaround for qcow2 disk cache bug in qemu (RHBZ#836710).

[Libguestfs] [PATCH 0/7 v2] Fix and workaround for qcow2 issues in qemu causing data corruption.

Reasonably Related Threads