2015-05-28 10:10 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> On Thu, May 28, 2015 at 09:48:41AM +0300, NoxDaFox wrote:
> > 2015-05-27 15:21 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:
> > >
> > > On Wed, May 27, 2015 at 09:38:38AM +0300, NoxDaFox wrote:
> > > > * RuntimeError: file receive cancelled by daemon - On r =
> > > > libguestfsmod.checksums_out (self._o, csumtype, directory, sumsfile)
> > > > * RuntimeError: hivex_close: do_hivex_close: you must call 'hivex-open'
> > > > first to initialize the hivex handle - On r =
> > > > libguestfsmod.inspect_os (self._o)
> > >
> > > This error is likely to be -EIO (it's actually a bug in libguestfs
> > > that it doesn't report these properly in the error message). However
> > > we cannot be sure unless you enable debugging and get the complete
> > > messages.
> > >
> > > http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs
> > >
> > > Rich.
> >
> > I'm starting to wonder whether these errors are due to the fact that I
> > compare snapshots of inconsistent disks. If so, is there a way to
> > instruct guestfs to ignore corrupted files?
>
> Are the snapshots "consistent"? - ie. taken in such a way that they
> provide a single point-in-time view across the whole disk? You
> mentioned using 'qemu-img convert' before. 'qemu-img convert' on its
> own will not take a consistent snapshot (well, not unless you pause
> the guest during the copy, or you use some fancy new backup features
> recently added to qemu).
>
> > It's a bit challenging to generate such logs as the error appears
> > every now and then.
> > Here's the log related to a "RuntimeError: file receive cancelled by
> > daemon".
> [...]
> > mount -o /dev/sda2 /sysroot/
> > The disk contains an unclean file system (0, 0).
> > The file system wasn't safely closed on Windows. Fixing.
> > libguestfs: trace: mount = 0
> > libguestfs: trace: checksums_out "sha1" "/" "/tmp/tmpAWHkYv"
> > guestfsd: main_loop: proc 1 (mount) took 2.02 seconds
> > guestfsd: main_loop: new request, len 0x38
> > cd /sysroot/ && find -type f -print0 | xargs -0 sha1sum
> > [ 25.580340] perf interrupt took too long (2540 > 2500), lowering
> > kernel.perf_event_max_sample_rate to 50000
> > sha1sum: ./Windows/Prefetch/ReadyBoot/Trace7.fx: Value too large for
> > defined data type
> > [ 67.835952] perf interrupt took too long (5048 > 5000), lowering
> > kernel.perf_event_max_sample_rate to 25000
> > [ 143.304037] perf interrupt took too long (10010 > 10000), lowering
> > kernel.perf_event_max_sample_rate to 12500
> > pclose: /: Success
> > guestfsd: main_loop: proc 244 (checksums_out) took 245.25 seconds
> > libguestfs: trace: checksums_out = -1 (error)
> [...]
> >   File "/usr/lib/python2.7/dist-packages/guestfs.py", line 1427, in
> > checksums_out
> >     r = libguestfsmod.checksums_out (self._o, csumtype, directory,
> > sumsfile)
> > RuntimeError: file receive cancelled by daemon
>
> The error is confusing, but I think you are correct that it happens
> because the filesystem is unclean at the point at which it was
> snapshotted, maybe combined with partially written metadata which to
> the ntfs-3g driver looks like disk corruption.
>
> This is just what happens when you make inconsistent snapshots of a
> disk, unfortunately.
>
> My best suggestion would be:
>
> - Catch the exception in Python
>
> - When you hit this error, skip this snapshot and go on to the next one
>
> That may involve rearchitecting your application a bit, but if the
> error is rare, it seems like the best way to handle it.
>
> An alternative, if you're not doing it already, would be to take a
> consistent snapshot. Assuming the guest is well-behaved and the
> filesystem uses journalling and the journalling is implemented
> correctly, a consistent snapshot should not have such errors.
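The catch-and-skip approach described above can be sketched as below. This is only an illustration: `scan_snapshot` stands in for the real guestfs work (creating a handle, launching it, calling `checksums_out`), and the snapshot names are made up.

```python
def scan_snapshots(snapshots, scan_snapshot):
    """Scan each snapshot; collect results, skipping any that fail."""
    results = {}
    skipped = []
    for snap in snapshots:
        try:
            results[snap] = scan_snapshot(snap)
        except RuntimeError as err:
            # e.g. "file receive cancelled by daemon" from an
            # inconsistent filesystem: record it and move on.
            skipped.append((snap, str(err)))
    return results, skipped

def fake_scan(snap):
    """Hypothetical scanner: fails on one snapshot to show the skip path."""
    if snap == "snapshot_2":
        raise RuntimeError("file receive cancelled by daemon")
    return "ok"

results, skipped = scan_snapshots(
    ["snapshot_1", "snapshot_2", "snapshot_3"], fake_scan)
```

As a side benefit, the `skipped` list gives a cheap way to measure how often the error actually occurs.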
> Rich.

To create the snapshots I'm using the libvirt command snapshotCreateXML
with no flags set. Does libvirt support consistent snapshotting, or shall
I rely only on the new QEMU backup feature?
Richard W.M. Jones
2015-May-28 07:40 UTC
Re: [Libguestfs] Concurrent scanning of same disk
On Thu, May 28, 2015 at 10:33:48AM +0300, NoxDaFox wrote:
> To create the snapshots I'm using the libvirt command snapshotCreateXML
> with no flags set. Does libvirt support consistent snapshotting, or
> shall I rely only on the new QEMU backup feature?

According to: http://wiki.libvirt.org/page/Snapshots
virDomainSnapshotCreateXML is only consistent if the guest is paused
during the operation.

The new qemu feature is called drive-backup
(http://wiki.qemu.org/Features/IncrementalBackup). Unless things
changed recently, it is not exposed through libvirt, so the only way
to use it is by sending qemu monitor commands
(http://kashyapc.com/2013/03/31/multiple-ways-to-access-qemu-monitor-protocol-qmp/).

This is all pretty bleeding edge. I still think you'd be better off
just ignoring snapshots that fail and moving on to the next one.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
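For reference, a full drive-backup request over QMP looks roughly like the message built below; the device name and target path are illustrative placeholders, and whether your qemu version accepts the command depends on how recent it is.

```python
import json

# Build a QMP message for a full drive backup (see the qemu
# IncrementalBackup wiki page). "drive-virtio-disk0" and the
# target path are placeholders, not values from this thread.
cmd = {
    "execute": "drive-backup",
    "arguments": {
        "device": "drive-virtio-disk0",
        "sync": "full",
        "format": "qcow2",
        "target": "/var/tmp/guest-backup.qcow2",
    },
}
payload = json.dumps(cmd)
```

One way to deliver such a message without abandoning libvirt entirely is `virsh qemu-monitor-command <domain> '<payload>'`, though libvirt may then mark the domain as tainted.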
2015-05-28 10:40 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> On Thu, May 28, 2015 at 10:33:48AM +0300, NoxDaFox wrote:
> > To create the snapshots I'm using the libvirt command snapshotCreateXML
> > with no flags set. Does libvirt support consistent snapshotting, or
> > shall I rely only on the new QEMU backup feature?
>
> According to: http://wiki.libvirt.org/page/Snapshots
> virDomainSnapshotCreateXML is only consistent if the guest is paused
> during the operation.
>
> The new qemu feature is called drive-backup
> (http://wiki.qemu.org/Features/IncrementalBackup). Unless things
> changed recently, it is not exposed through libvirt, so the only way
> to use it is by sending qemu monitor commands
> (http://kashyapc.com/2013/03/31/multiple-ways-to-access-qemu-monitor-protocol-qmp/).
>
> This is all pretty bleeding edge. I still think you'd be better off
> just ignoring snapshots that fail and moving on to the next one.
>
> Rich.

I might be missing something then, as the guest is actually paused during
the acquisition of the snapshot. I pause the guest, take a screenshot, a
core dump and a snapshot, then I resume the guest. Proof is that I can
clearly analyse the memory core dump without any problem.

Maybe I am breaking the disk's consistency once I extract the dump
through the qemu-img command? The command is:

qemu-img convert -f qcow2 -o backing_file=guest_disk.qcow2 -O qcow2 -s snapshot_n guest_disk.qcow2 new_disk_for_libguestfs.qcow2

Could it be that, as the backing file points to the guest's disk, which
will evolve in time, guestfs sees inconsistencies when it tries to read
the data? The guest_disk.qcow2 is a COW clone of a base_disk.qcow2; what
if I rebase the new_disk_for_libguestfs.qcow2 to the base_disk.qcow2?
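To make the two variants concrete, here is a small sketch that builds the `qemu-img convert` argument list from the thread, with and without the `backing_file` option. Dropping `backing_file` makes qemu-img write a standalone copy that no longer references the live guest disk, which would be one way to test the consistency hypothesis; the file names are the thread's own placeholders, and the helper function is illustrative.

```python
def convert_cmd(src, dst, snapshot, backing=None):
    """Build a qemu-img argv extracting internal snapshot `snapshot`.

    With `backing`, the output is a thin overlay on that file (the
    command used in the thread); without it, the output is a fully
    copied, standalone image.
    """
    cmd = ["qemu-img", "convert", "-f", "qcow2"]
    if backing is not None:
        cmd += ["-o", "backing_file=%s" % backing]
    cmd += ["-O", "qcow2", "-s", snapshot, src, dst]
    return cmd

# The thread's command: overlay on the (still-evolving) guest disk.
overlay_cmd = convert_cmd("guest_disk.qcow2",
                          "new_disk_for_libguestfs.qcow2",
                          "snapshot_n",
                          backing="guest_disk.qcow2")

# Standalone variant: slower and larger, but independent of the
# running guest once the copy completes.
standalone_cmd = convert_cmd("guest_disk.qcow2",
                             "new_disk_for_libguestfs.qcow2",
                             "snapshot_n")
```

If the inconsistencies disappear with the standalone copy, that would point at the overlay's dependence on the live guest disk rather than at the snapshot itself.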