2015-05-28 10:10 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> On Thu, May 28, 2015 at 09:48:41AM +0300, NoxDaFox wrote:
> > 2015-05-27 15:21 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:
> > >
> > > On Wed, May 27, 2015 at 09:38:38AM +0300, NoxDaFox wrote:
> > > > * RuntimeError: file receive cancelled by daemon - On r =
> > > > libguestfsmod.checksums_out (self._o, csumtype, directory, sumsfile)
> > > > * RuntimeError: hivex_close: do_hivex_close: you must call 'hivex-open'
> > > > first to initialize the hivex handle - On r =
> > > > libguestfsmod.inspect_os (self._o)
> > >
> > > This error is likely to be -EIO (it's actually a bug in libguestfs
> > > that it doesn't report these properly in the error message). However
> > > we cannot be sure unless you enable debugging and get the complete
> > > messages.
> > >
> > > http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs
> > >
> > > Rich.
> >
> > I'm starting to wonder whether these errors are due to the fact that I
> > compare snapshots of inconsistent disks. If so, is there a way to
> > instruct guestfs to ignore corrupted files?
>
> Are the snapshots "consistent"? - ie. taken in such a way that they
> provide a single point-in-time view across the whole disk? You
> mentioned using 'qemu-img convert' before. 'qemu-img convert' on its
> own will not take a consistent snapshot (well, not unless you pause
> the guest during the copy, or you use some fancy new backup features
> recently added to qemu).
>
> > It's a bit challenging to generate such logs as the error appears
> > every now and then.
> > Here's the log related to a "RuntimeError: file receive cancelled by
> > daemon".
> [...]
> > mount -o /dev/sda2 /sysroot/
> > The disk contains an unclean file system (0, 0).
> > The file system wasn't safely closed on Windows. Fixing.
> > libguestfs: trace: mount = 0
> > libguestfs: trace: checksums_out "sha1" "/" "/tmp/tmpAWHkYv"
> > guestfsd: main_loop: proc 1 (mount) took 2.02 seconds
> > guestfsd: main_loop: new request, len 0x38
> > cd /sysroot/ && find -type f -print0 | xargs -0 sha1sum
> > [ 25.580340] perf interrupt took too long (2540 > 2500), lowering
> > kernel.perf_event_max_sample_rate to 50000
> > sha1sum: ./Windows/Prefetch/ReadyBoot/Trace7.fx: Value too large for
> > defined data type
> > [ 67.835952] perf interrupt took too long (5048 > 5000), lowering
> > kernel.perf_event_max_sample_rate to 25000
> > [ 143.304037] perf interrupt took too long (10010 > 10000), lowering
> > kernel.perf_event_max_sample_rate to 12500
> > pclose: /: Success
> > guestfsd: main_loop: proc 244 (checksums_out) took 245.25 seconds
> > libguestfs: trace: checksums_out = -1 (error)
> [...]
> >   File "/usr/lib/python2.7/dist-packages/guestfs.py", line 1427, in
> > checksums_out
> >     r = libguestfsmod.checksums_out (self._o, csumtype, directory,
> > sumsfile)
> > RuntimeError: file receive cancelled by daemon
>
> The error is confusing, but I think you are correct that it happens
> because the filesystem is unclean at the point at which it was
> snapshotted, maybe combined with partially written metadata which to
> the ntfs-3g driver looks like disk corruption.
>
> This is just what happens when you make inconsistent snapshots of a
> disk, unfortunately.
>
> My best suggestion would be:
>
> - Catch the exception in Python
>
> - When you hit this error, skip this snapshot and go on to the next one
>
> That may involve rearchitecting your application a bit, but if the
> error is rare, it seems like the best way to handle it.
>
> An alternative, if you're not doing it already, would be to take a
> consistent snapshot. Assuming the guest is well-behaved and the
> filesystem uses journalling and the journalling is implemented
> correctly, a consistent snapshot should not have such errors.
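The catch-and-skip approach described above can be sketched as below. This is only an illustration: `scan_snapshot` stands in for the real guestfs work (creating a handle, launching it, calling `checksums_out`), and the snapshot names are made up.

```python
def scan_snapshots(snapshots, scan_snapshot):
    """Scan each snapshot; collect results, skipping any that fail."""
    results = {}
    skipped = []
    for snap in snapshots:
        try:
            results[snap] = scan_snapshot(snap)
        except RuntimeError as err:
            # e.g. "file receive cancelled by daemon" from an
            # inconsistent filesystem: record it and move on.
            skipped.append((snap, str(err)))
    return results, skipped

def fake_scan(snap):
    """Hypothetical scanner: fails on one snapshot to show the skip path."""
    if snap == "snapshot_2":
        raise RuntimeError("file receive cancelled by daemon")
    return "ok"

results, skipped = scan_snapshots(
    ["snapshot_1", "snapshot_2", "snapshot_3"], fake_scan)
```

As a side benefit, the `skipped` list gives a cheap way to measure how often the error actually occurs.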
> Rich.

To create the snapshots I'm using the libvirt command snapshotCreateXML
with no flags set. Does libvirt support consistent snapshotting, or shall
I rely only on the new QEMU backup feature?
Richard W.M. Jones
2015-May-28 07:40 UTC
Re: [Libguestfs] Concurrent scanning of same disk
On Thu, May 28, 2015 at 10:33:48AM +0300, NoxDaFox wrote:
> To create the snapshots I'm using the libvirt command snapshotCreateXML
> with no flags set. Does libvirt support consistent snapshotting, or
> shall I rely only on the new QEMU backup feature?

According to: http://wiki.libvirt.org/page/Snapshots
virDomainSnapshotCreateXML is only consistent if the guest is paused
during the operation.

The new qemu feature is called drive-backup
(http://wiki.qemu.org/Features/IncrementalBackup). Unless things
changed recently, it is not exposed through libvirt, so the only way
to use it is by sending qemu monitor commands
(http://kashyapc.com/2013/03/31/multiple-ways-to-access-qemu-monitor-protocol-qmp/).

This is all pretty bleeding edge. I still think you'd be better off
just ignoring snapshots that fail and moving on to the next one.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
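For reference, a full drive-backup request over QMP looks roughly like the message built below; the device name and target path are illustrative placeholders, and whether your qemu version accepts the command depends on how recent it is.

```python
import json

# Build a QMP message for a full drive backup (see the qemu
# IncrementalBackup wiki page). "drive-virtio-disk0" and the
# target path are placeholders, not values from this thread.
cmd = {
    "execute": "drive-backup",
    "arguments": {
        "device": "drive-virtio-disk0",
        "sync": "full",
        "format": "qcow2",
        "target": "/var/tmp/guest-backup.qcow2",
    },
}
payload = json.dumps(cmd)
```

One way to deliver such a message without abandoning libvirt entirely is `virsh qemu-monitor-command <domain> '<payload>'`, though libvirt may then mark the domain as tainted.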
2015-05-28 10:40 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> On Thu, May 28, 2015 at 10:33:48AM +0300, NoxDaFox wrote:
> > To create the snapshots I'm using the libvirt command snapshotCreateXML
> > with no flags set. Does libvirt support consistent snapshotting, or
> > shall I rely only on the new QEMU backup feature?
>
> According to: http://wiki.libvirt.org/page/Snapshots
> virDomainSnapshotCreateXML is only consistent if the guest is paused
> during the operation.
>
> The new qemu feature is called drive-backup
> (http://wiki.qemu.org/Features/IncrementalBackup). Unless things
> changed recently, it is not exposed through libvirt, so the only way
> to use it is by sending qemu monitor commands
> (http://kashyapc.com/2013/03/31/multiple-ways-to-access-qemu-monitor-protocol-qmp/).
>
> This is all pretty bleeding edge. I still think you'd be better off
> just ignoring snapshots that fail and moving on to the next one.
>
> Rich.

I might be missing something then, as the guest is actually paused during
the acquisition of the snapshot. I pause the guest, take a screenshot, a
core dump and a snapshot, then I resume the guest. Proof is that I can
clearly analyse the memory core dump without any problem.

Maybe I am breaking the disk's consistency once I extract the dump
through the qemu-img command? The command is:

qemu-img convert -f qcow2 -o backing_file=guest_disk.qcow2 -O qcow2 -s snapshot_n guest_disk.qcow2 new_disk_for_libguestfs.qcow2

Could it be that, as the backing file points to the guest's disk, which
will evolve in time, guestfs sees inconsistencies when it tries to read
the data? The guest_disk.qcow2 is a COW clone of a base_disk.qcow2; what
if I rebase the new_disk_for_libguestfs.qcow2 to the base_disk.qcow2?
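To make the two variants concrete, here is a small sketch that builds the `qemu-img convert` argument list from the thread, with and without the `backing_file` option. Dropping `backing_file` makes qemu-img write a standalone copy that no longer references the live guest disk, which would be one way to test the consistency hypothesis; the file names are the thread's own placeholders, and the helper function is illustrative.

```python
def convert_cmd(src, dst, snapshot, backing=None):
    """Build a qemu-img argv extracting internal snapshot `snapshot`.

    With `backing`, the output is a thin overlay on that file (the
    command used in the thread); without it, the output is a fully
    copied, standalone image.
    """
    cmd = ["qemu-img", "convert", "-f", "qcow2"]
    if backing is not None:
        cmd += ["-o", "backing_file=%s" % backing]
    cmd += ["-O", "qcow2", "-s", snapshot, src, dst]
    return cmd

# The thread's command: overlay on the (still-evolving) guest disk.
overlay_cmd = convert_cmd("guest_disk.qcow2",
                          "new_disk_for_libguestfs.qcow2",
                          "snapshot_n",
                          backing="guest_disk.qcow2")

# Standalone variant: slower and larger, but independent of the
# running guest once the copy completes.
standalone_cmd = convert_cmd("guest_disk.qcow2",
                             "new_disk_for_libguestfs.qcow2",
                             "snapshot_n")
```

If the inconsistencies disappear with the standalone copy, that would point at the overlay's dependence on the live guest disk rather than at the snapshot itself.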