The short summary of my problem: the batch file rsync creates is HUGE for a
very small change. The idea is to create a workstation image with
partimage, update it with some software, and send the image update diff to
a large number of destinations over a satellite link, but the batch file
updates are several orders of magnitude too large. I don't know exactly how
partimage creates image files, so the bytes/blocks may be ordered
differently between my two variants, but they should otherwise be
identical, so rsync _should_ be able to handle that, right?
Software used: Ubuntu 9.10, fogproject.org v.28, partimage ??, rsync 3.0.6
Hardware: Running as VM in ESXi 4.1 U2, 4 x vCPU and 16 GB RAM, 200 GB disk
(150+ GB free)
My testing process:
1. Use FOG .28 / partimage to capture an image of an already configured
   Windows XP workstation.
2. Log in to the workstation as a normal user, download WinSCP (2.9 MB
   file), and shut down the machine gracefully.
3. Use FOG .28 / partimage to capture the same system again, to a new
   image file.
4. FOG uses gzip to compress the partimage file, and we need to compare
   uncompressed images.
   1. Commands:
      1. mv image1 image1.gz && mv image2 image2.gz && gunzip image1.gz
         && gunzip image2.gz
   2. Resultant files:
      1. image1 size in bytes: 17,062,442,700
      2. image2 size in bytes: 16,993,256,652
      3. Difference in raw size in bytes: 69,186,048 (somewhat larger
         than the 2.9 MB difference I expect due to downloading WinSCP,
         but not the end of the world)
5. Create rsync diff package
   1. Command:
      1. rsync --only-write-batch=img1toimg2_diff image2 image1
   2. Resultant files:
      1. img1toimg2_diff size in bytes: 7,315,408,780
      2. img1toimg2_diff.sh size in bytes: 58
      3. The diff is WAY bigger than the difference in raw file size.
         This HAS to be a bug!
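
As a quick sanity check on the figures above, the following sketch (Python, used here purely as a calculator; the variable names are illustrative) compares the batch file size with the image size, the raw size difference, and the 2.9 MB download. It assumes "2.9 MB" means 2.9 * 10**6 bytes.

```python
# Sizes in bytes, copied from the test results above.
IMAGE1 = 17_062_442_700
IMAGE2 = 16_993_256_652
BATCH = 7_315_408_780
WINSCP = 2_900_000  # ASSUMPTION: "2.9 MB" read as 2.9 * 10**6 bytes


def ratio(a, b):
    """Return how many times larger a is than b."""
    return a / b


size_diff = IMAGE1 - IMAGE2
print(f"raw size difference : {size_diff:,} bytes")
print(f"batch / image2      : {ratio(BATCH, IMAGE2):.1%}")
print(f"batch / size diff   : {ratio(BATCH, size_diff):.0f}x")
print(f"batch / WinSCP file : {ratio(BATCH, WINSCP):.0f}x")
```

On these numbers the batch file is roughly 43% of the image and about three orders of magnitude larger than the download.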
I thought specifying the block size might help (it does significantly in
non-batch mode), but I get an error and cannot proceed. I have tried
specifying the block size in both rsync v3.0.6 and v3.0.7, but the result
is the same:
1. Command:
   1. rsync --block-size=512 --only-write-batch=img1toimg2_diff image2
      image1
2. Error message:
   1. ERROR: Out of memory in receive_sums [sender]
   2. rsync error: error allocating core memory buffers (code 22) at
      util.c(117) [sender=3.0.7]
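
One possible reading of the out-of-memory error is that a 512-byte block size makes the per-block checksum table too large to allocate in one piece. The sketch below estimates that table for image1 (the basis file in the command above). The 40 bytes per block entry and the 1 GiB single-allocation cap are assumptions for illustration, not values confirmed in this thread; check them against the rsync source for your version. The square-root block size is likewise only an approximation of rsync's default heuristic.

```python
import math

IMAGE1_BYTES = 17_062_442_700  # basis file size, from the test results
BYTES_PER_BLOCK_ENTRY = 40     # ASSUMPTION: rough size of one checksum record
ALLOC_CAP_BYTES = 1 << 30      # ASSUMPTION: 1 GiB cap on a single allocation


def checksum_table(file_size, block_size):
    """Return (number of blocks, estimated table size in bytes)."""
    blocks = (file_size + block_size - 1) // block_size  # ceiling division
    return blocks, blocks * BYTES_PER_BLOCK_ENTRY


# ASSUMPTION: the default block size is roughly sqrt(file size).
approx_default = math.isqrt(IMAGE1_BYTES)

for block_size in (512, approx_default):
    blocks, table = checksum_table(IMAGE1_BYTES, block_size)
    verdict = "exceeds" if table > ALLOC_CAP_BYTES else "fits under"
    print(f"block size {block_size:>7,}: {blocks:>10,} blocks, "
          f"{table:>13,} bytes, {verdict} the assumed cap")
```

Under these assumptions the 512-byte case needs about 33 million entries, while a block size near the square root of the file size needs only around 130 thousand.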
I looked at the changelog and haven't seen any updates to util.c since
rsync v3.0.6 was released that might address this issue. So I think that I
might be seeing two bugs: 1) the huge diff size and 2) an ungraceful crash
when trying to use a block size with batch mode.
Has anyone experienced this before? Am I allowed to specify a block size
with batch mode? Any words of wisdom?
Thanks,
Matt Van Mater
Matt,
It's probably not an rsync bug. It's likely that after booting to
create the second image, a large number of updates happened in many
different parts of the filesystem. You may have added only a few MB of
data, but a lot of little things go on in an active system, such as
filesystem timestamp updates, registry updates, etc. It could also have to
do with the internal structure of the image. If it stores metadata about
each part of the system, that metadata could differ between runs,
causing a large number of differences.
A 7 GB diff of a 16 GB file tells me about half the blocks were
modified between runs, which isn't completely unbelievable in an active,
booted system.
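
To illustrate the point about many small scattered updates, here is a simplified model, not rsync's actual algorithm: it assumes every in-place write spoils each fixed-size block of the old file that it touches, and that a spoiled block must be sent in full as literal data. The function name and the numbers in the demo are hypothetical.

```python
import random


def literal_bytes_needed(block_size, write_offsets, write_len):
    """Bytes sent verbatim if every block touched by a write is resent whole."""
    dirty = set()
    for off in write_offsets:
        first = off // block_size
        last = (off + write_len - 1) // block_size
        dirty.update(range(first, last + 1))
    return len(dirty) * block_size


# Demo: 1000 scattered 16-byte writes into a 64 MiB file (made-up numbers).
rng = random.Random(0)
file_size = 64 * 1024 * 1024
write_len = 16
offsets = [rng.randrange(0, file_size - write_len) for _ in range(1000)]

print(f"bytes actually changed: {len(offsets) * write_len:,}")
for block_size in (512, 4096, 131072):
    needed = literal_bytes_needed(block_size, offsets, write_len)
    print(f"block size {block_size:>7,}: about {needed:,} literal bytes")
```

In this model the delta grows with the number of touched blocks rather than with the number of changed bytes, which is one way a few MB of real changes could turn into a much larger diff.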
Eric Bambach | Discover
Senior Assoc. Programmer, Warehouse Infrastructure and Tools
2500 Lake Cook Road, Riverwoods IL 60015
P: 224.405.2896 ericbambach1 at discover.com
From: Matt Van Mater <matt.vanmater at gmail.com>
To: <rsync at lists.samba.org>
Date: 03/20/2012 12:55 PM
Subject: Batch mode creates huge diffs, bug(s)?
Sent by: <rsync-bounces at lists.samba.org>
Let me restate my last email regarding rdiff:

All of my image files are from the same Windows XP VM, created using
FOG/partimage. Image1 is the "baseline"; Image2 is Image1 plus the
downloaded WinSCP binary (not even installed). I am not imaging an Ubuntu
machine. I am using the Ubuntu machine only as a means of creating the
batch file for rsync and/or rdiff. I chose that platform because it is a
common distribution and would make it easy for others to reproduce my
problem.

I agree the 400 MB still looks big, but no, the ONLY intentional difference
between image1 and image2 is the 2.9 MB WinSCP binary I downloaded. My
guess is that the difference is due 1) partially to the default block size
rdiff uses (512b?) AND 2) to the fact that the Windows XP VM image source
only had 256 MB RAM, and by default Windows XP creates a pagefile of
1.5 x RAM size = 384 MB. That is close enough to 400 MB for me.

I am currently running rdiff with a smaller block size to test #1 above;
hopefully that will force the delta to get smaller (at the expense of
longer computation time).

Matt

On Tue, Mar 20, 2012 at 3:41 PM, Joachim Otahal (privat) <Jou at gmx.net> wrote:

> Matt Van Mater wrote:
>
>> Alternate assessment - I ran a similar comparison against the two image
>> files using rdiff that comes with Ubuntu 10.04.4 LTS (shown up as
>> librsync 0.9.7) and have a significantly smaller delta file (closer to
>> what I expect).
>
> Just plain luck. If Ubuntu wrote most of the new files close to the last
> used blocks and only changed a few bytes (this time literally) in the
> middle, then the desync happens later. The 400 MB delta still looks big,
> or did you install something big like LibreOffice?
>
> regards,
>
> Joachim Otahal