2015-05-28 10:40 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> On Thu, May 28, 2015 at 10:33:48AM +0300, NoxDaFox wrote:
> > To create the snapshots I'm using the libvirt command snapshotCreateXML
> > with no flag set. Does libvirt support consistent snapshotting, or shall I
> > rely only on the new QEMU backup feature?
>
> According to: http://wiki.libvirt.org/page/Snapshots
> virDomainSnapshotCreateXML is only consistent if the guest is paused
> during the operation.
>
> The new qemu feature is called drive-backup
> (http://wiki.qemu.org/Features/IncrementalBackup). Unless things
> changed recently, it is not exposed through libvirt, so the only way
> to use it is by sending qemu monitor commands
> (http://kashyapc.com/2013/03/31/multiple-ways-to-access-qemu-monitor-protocol-qmp/).
>
> This is all pretty bleeding edge. I still think you'd be better off
> just ignoring snapshots that fail and moving on to the next one.
>
> Rich.

I might be missing something then, as the guest is actually paused during
the acquisition of the snapshot.

I pause the guest, take a screenshot, a core dump and a snapshot, then I
resume the guest. Proof is that I can clearly analyse the memory core dump
without any problem.

Maybe I am breaking the disk's consistency once I extract the dump through
the qemu-img command? The command is:

    qemu-img convert -f qcow2 -o backing_file=guest_disk.qcow2 -O qcow2 \
        -s snapshot_n guest_disk.qcow2 new_disk_for_libguestfs.qcow2

Could it be that, as the backing file points to the guest's disk, which
will evolve over time, guestfs sees inconsistencies when it tries to read
the data?

The guest_disk.qcow2 is a COW clone of a base_disk.qcow2; what if I rebase
the new_disk_for_libguestfs.qcow2 onto the base_disk.qcow2?
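The pause/snapshot/resume sequence described above can be sketched with the libvirt Python bindings. This is an illustrative sketch, not code from the thread: the snapshot XML is a minimal assumed example, and `dom` is assumed to be a libvirt virDomain handle obtained elsewhere (e.g. `conn.lookupByName("guest")`).

```python
# Sketch of the described workflow: suspend the guest, take an internal
# snapshot via snapshotCreateXML with no flags, then resume.  The XML
# below is a minimal assumed example, not the poster's actual XML.

SNAPSHOT_XML = """<domainsnapshot>
  <name>{name}</name>
  <description>snapshot taken while the guest is paused</description>
</domainsnapshot>"""

def build_snapshot_xml(name):
    # Minimal snapshot description passed to snapshotCreateXML.
    return SNAPSHOT_XML.format(name=name)

def snapshot_paused(dom, name):
    # Guest stops executing, so memory and disk stop changing.
    dom.suspend()
    try:
        dom.snapshotCreateXML(build_snapshot_xml(name), 0)  # no flags set
    finally:
        dom.resume()  # always resume, even if snapshotting fails
```

As the rest of the thread shows, pausing only around snapshotCreateXML is not sufficient on its own if the disk contents are copied out later while the guest is running again.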
Richard W.M. Jones
2015-May-28 08:10 UTC
Re: [Libguestfs] Concurrent scanning of same disk
On Thu, May 28, 2015 at 10:57:51AM +0300, NoxDaFox wrote:
> I might be missing something then, as the guest is actually paused during
> the acquisition of the snapshot.
>
> I pause the guest, take a screenshot, a core dump and a snapshot, then I
> resume the guest. Proof is that I can clearly analyse the memory core dump
> without any problem.

Note a core dump doesn't normally include the guest's disk.  It just
contains the guest's memory, so it's not relevant for consistency.

> Maybe I am breaking the disk's consistency once I extract the dump through
> the qemu-img command? The command is:
>
>     qemu-img convert -f qcow2 -o backing_file=guest_disk.qcow2 -O qcow2 \
>         -s snapshot_n guest_disk.qcow2 new_disk_for_libguestfs.qcow2

Is the guest paused when you do this?  If not, then this will create
an inconsistent snapshot.

> Could it be that, as the backing file points to the guest's disk, which
> will evolve over time, guestfs sees inconsistencies when it tries to read
> the data?

qemu-img convert makes a full copy, so it's not guestfs that's the
problem, but qemu-img.  The copy is not done instantaneously.

> The guest_disk.qcow2 is a COW clone of a base_disk.qcow2; what if I rebase
> the new_disk_for_libguestfs.qcow2 onto the base_disk.qcow2?

Many copies and snapshots.  I'm thoroughly confused ...

Anyhow, unless you either make a full copy while the guest is paused,
*or* you use a point-in-time snapshot feature of either qemu
(drive-backup) or your host filesystem, you're not making a consistent
snapshot.

Rich.

--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
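Rich's advice above amounts to: keep the guest suspended for the *entire* qemu-img convert, not just for the snapshot call. A sketch of that, assuming `dom` is a libvirt virDomain handle; `convert_cmd` only rebuilds the exact command line quoted in the thread as an argv list.

```python
import subprocess

def convert_cmd(src, dst, snapshot):
    # The qemu-img convert command quoted in the thread, as an argv list.
    return ["qemu-img", "convert",
            "-f", "qcow2",
            "-o", "backing_file=%s" % src,
            "-O", "qcow2",
            "-s", snapshot,
            src, dst]

def extract_paused(dom, src, dst, snapshot):
    # Keep the guest paused for the whole copy, so the backing file
    # cannot change underneath qemu-img while it reads the snapshot.
    dom.suspend()
    try:
        subprocess.check_call(convert_cmd(src, dst, snapshot))
    finally:
        dom.resume()
```

The trade-off is guest downtime proportional to the copy duration; avoiding that is exactly what the point-in-time mechanisms Rich mentions (qemu's drive-backup, or a host-filesystem snapshot) are for.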
2015-05-28 11:10 GMT+03:00 Richard W.M. Jones <rjones@redhat.com>:

> Note a core dump doesn't normally include the guest's disk.  It just
> contains the guest's memory, so it's not relevant for consistency.
>
> Is the guest paused when you do this?  If not, then this will create
> an inconsistent snapshot.
>
> qemu-img convert makes a full copy, so it's not guestfs that's the
> problem, but qemu-img.  The copy is not done instantaneously.
>
> Anyhow, unless you either make a full copy while the guest is paused,
> *or* you use a point-in-time snapshot feature of either qemu
> (drive-backup) or your host filesystem, you're not making a consistent
> snapshot.
>
> Rich.

Ok, I definitely got confused with the APIs then. I thought the guest was
supposed to be paused only when calling virDomainSnapshotCreateXML, not
also afterwards when creating the new disk file with qemu-img.

I made a couple of changes and the hive corruption issue seems to be gone.

The "RuntimeError: file receive cancelled by daemon" still persists. From
the guestfs trace I can't see any evidence other than what looks like some
sort of overflow:

    sha1sum: ./Windows/Prefetch/ReadyBoot/Trace2.fx: Value too large for defined data type
    [   65.347452] perf interrupt took too long (5124 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
    [  139.668206] perf interrupt took too long (10140 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
    pclose: /: Success
    guestfsd: main_loop: proc 244 (checksums_out) took 244.89 seconds
    libguestfs: trace: checksums_out = -1 (error)

Is there something else wrong?
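"Value too large for defined data type" is EOVERFLOW, which suggests the sha1sum run inside the appliance tripped over one particular file and took the whole checksums_out call down with it. A possible workaround, sketched with the libguestfs Python bindings, is to checksum files one at a time with g.checksum() and skip the failures instead of aborting the scan. This is an assumption-laden sketch: `g` is assumed to be an already-launched guestfs.GuestFS handle with the guest filesystem mounted on "/".

```python
# Sketch (assumed handle `g`): per-file checksumming that tolerates
# individual failures, as an alternative to the all-or-nothing
# checksums_out call that fails in the trace above.
def checksum_tree(g, root="/"):
    results = {}
    for name in g.find(root):          # paths returned relative to root
        path = root.rstrip("/") + "/" + name
        if not g.is_file(path):
            continue                   # skip directories, symlinks, etc.
        try:
            results[path] = g.checksum("sha1", path)
        except RuntimeError as err:
            # Skip problem files (e.g. EOVERFLOW) instead of failing
            # the whole run; record the error for later inspection.
            results[path] = "ERROR: %s" % err
    return results
```

This is slower than checksums_out (one daemon round-trip per file), but it isolates which file breaks and lets the rest of the scan complete.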