samba-bugs at samba.org
2011-Nov-16 16:07 UTC
[Bug 8615] New: feature request 'update by reference'
https://bugzilla.samba.org/show_bug.cgi?id=8615
Summary: feature request 'update by reference'
Product: rsync
Version: 3.0.9
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: core
AssignedTo: wayned at samba.org
ReportedBy: info at ecsystems.nl
QAContact: rsync-qa at samba.org
This is a new feature request for rsync.
I call it an 'update by reference' option.
What is it supposed to do?
Update (or create) a target file by using a file local to the server that is
similar to the target being updated.
Scenario:
On an rsync server:
/Folder1/huge_file_1
On an rsync client:
/FolderLocal/huge_file_2
Command-line example (--byref is the proposed option):
rsync -rvt --inplace /FolderLocal/huge_file_2 root@127.0.0.1::Folder1/huge_file_2 --byref Folder1/huge_file_1
The file 'huge_file_2' will be created on the server as
'/Folder1/huge_file_2', but the actual delta is taken from the difference
between '/FolderLocal/huge_file_2' and '/Folder1/huge_file_1' (the reference
file).
Rationale:
A remote site holds large backup files that are created far away each day.
These backup files do not differ much from day to day, but they have to be
made because of a backup policy.
Each day's backup is a new file that the remote rsync site knows nothing
about, so the entire backup file has to be sent across, which wastes a lot of
bandwidth and transfer time.
However, the remote rsync site does know about previous backup files that
might contain many similar data blocks.
If we could tell the remote rsync site (the server) to create the new backup
file but compute the delta against another file, we might save a lot of
bandwidth and time.
Personally, I have done this manually with DVD images, and still do: I copy
the DVD image most similar to the one being rsynced to the new target name
and then 'delta overwrite' it with the real image file. It is not perfect,
but it saves me about 35% bandwidth and 45% time. These are just 4-5 GB
files; backup files run into the hundreds of GB, so a 'quick copy' like in
this DVD example is not an option.
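The manual workaround described above can be sketched as two steps: seed the new target from the similar file on the server, then let rsync overwrite it in place. The sketch below only prints the two commands instead of running them. The function name, the SSH host, and the assumption that the server is reachable over ssh are illustrative and not taken from the report.

```shell
#!/bin/sh
# Print (rather than run) the two steps of the manual seed-then-overwrite
# workaround. Host names and paths are placeholders; names containing
# whitespace would need extra quoting before the commands could be run.
seed_and_sync_plan() {
    remote_host=$1   # host reachable over ssh that also runs the rsync daemon
    module_dir=$2    # directory on the server that backs the rsync module
    reference=$3     # existing server file that resembles the new one
    local_file=$4    # new file on the client
    module=$5        # rsync daemon module name
    name=${local_file##*/}

    # Step 1: seed the new target on the server from the reference file.
    printf 'ssh %s cp %s/%s %s/%s\n' \
        "$remote_host" "$module_dir" "$reference" "$module_dir" "$name"
    # Step 2: overwrite the seeded copy in place so that only differing
    # blocks have to cross the network.
    printf 'rsync -vt --inplace %s %s::%s/%s\n' \
        "$local_file" "$remote_host" "$module" "$name"
}

seed_and_sync_plan backup.example.net /Folder1 huge_file_1 \
    /FolderLocal/huge_file_2 Folder1
```

As the report notes, the server-side copy in step 1 is itself expensive for files of several hundred GB, which is the motivation for the request.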
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
samba-bugs at samba.org
2011-Nov-18 11:35 UTC
[Bug 8615] feature request 'update by reference'
https://bugzilla.samba.org/show_bug.cgi?id=8615
--- Comment #1 from itpp11 <info at ecsystems.nl> 2011-11-18 11:35:57 UTC ---
While trying to find a workaround via ssh, such as:
plink -ssh -v -C -L 875:localhost:873 -l root %NASDEST% -pw %rootPW% -m sshcmds.txt
where sshcmds.txt contains cp (copy) commands that prepare copies of the
upcoming backup files for a delta overwrite on the remote (server) side, I
stumbled on rsync's --fuzzy option, which does what I was looking for!
For example:
rsync -vtrz --inplace --delete-after --fuzzy --copy-dest="/88_John_Cleese2" "/88_John_Cleese" "test@127.0.0.1::test/test/"
This copies a new file to its destination, and when a similar file is found
in the destination, it uses that existing file as the delta base.
The --copy-dest option also tells the rsync server to look there for possible
matches to the new file.
One thing isn't documented: is it possible to use
--copy-dest="/*"
so that the entire destination is searched for a match?
And how do you provide multiple destinations? E.g. is this allowed:
--copy-dest="/88_John_Cleese2";"/88_John_Cleese4"
Or do we need to repeat the option?
--copy-dest="/88_John_Cleese2" --copy-dest="/88_John_Cleese4"
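On the last question: the rsync man page states that multiple --copy-dest directories may be given by repeating the option, and that they are searched in the order specified. A minimal sketch of building such a command line follows. The helper name copy_dest_opts is made up for illustration, and the command is printed rather than executed.

```shell
#!/bin/sh
# Emit one --copy-dest option per directory argument. rsync searches the
# directories in the order given. Simplification: directory names must not
# contain whitespace, because the result is later word-split by the shell.
copy_dest_opts() {
    opts=
    for dir in "$@"; do
        opts="$opts --copy-dest=$dir"
    done
    printf '%s\n' "${opts# }"
}

# Print the resulting command line instead of running it.
printf '%s\n' "rsync -vtrz --inplace --fuzzy $(copy_dest_opts /88_John_Cleese2 /88_John_Cleese4) /88_John_Cleese test@127.0.0.1::test/test/"
```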
samba-bugs at samba.org
2011-Nov-23 20:40 UTC
[Bug 8615] feature request 'update by reference'
https://bugzilla.samba.org/show_bug.cgi?id=8615
Wayne Davison <wayned at samba.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
--- Comment #2 from Wayne Davison <wayned at samba.org> 2011-11-23 20:40:07 UTC ---
Yeah, --fuzzy helps out in this situation. The current option only scans the
destination directory for similar/moved files that have a different name. I
have committed an enhancement to the 3.1.0dev git that lets the repeating of
the --fuzzy option (e.g. -yy) ask rsync to also look through the matching
alt-dest dir(s) that were specified. Hopefully that will meet your needs?
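Since the repeated form (-yy) only exists in builds that carry this enhancement, a script may want to pick the flag based on the installed version. The sketch below assumes the enhancement is available from 3.1.0 onward, as the comment above suggests. The function name fuzzy_flag and the fallback to plain -y are illustrative choices.

```shell
#!/bin/sh
# Choose between -y (scan the destination directory only) and -yy (also
# scan the matching alt-dest directories) from an rsync version string
# such as "3.0.9" or "3.1.0dev". Assumption: -yy is usable from 3.1.0 on.
fuzzy_flag() {
    version=$1
    major=${version%%.*}
    case $version in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;
    esac
    # Fall back to the older flag when the version cannot be parsed.
    case $major in ''|*[!0-9]*) printf '%s\n' -y; return ;; esac
    case $minor in ''|*[!0-9]*) minor=0 ;; esac
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; }; then
        printf '%s\n' -yy
    else
        printf '%s\n' -y
    fi
}

# If rsync is installed, report which flag this sketch would choose.
if command -v rsync >/dev/null 2>&1; then
    installed=$(rsync --version | awk 'NR == 1 { print $3 }')
    printf 'rsync %s: use %s\n' "$installed" "$(fuzzy_flag "$installed")"
fi
```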
samba-bugs at samba.org
2011-Nov-23 21:19 UTC
[Bug 8615] feature request 'update by reference'
https://bugzilla.samba.org/show_bug.cgi?id=8615
--- Comment #3 from itpp11 <info at ecsystems.nl> 2011-11-23 21:19:41 UTC ---
Thanks! Yes, it would help if fuzzy had a wider search area. For example,
ntbackup backups are single files, whereas wbadmin backup files are stored
inside their own folder structure, which changes with each new backup.
If possible, allow "../" to narrow the fuzzy search to the tree where the
backups are stored, e.g.
ntbackup:
\backupFULL\server-021\20111120
\backupFULL\server-021\20111130
While you are in 20111130, a fuzzy search including ../ would cover all known
FULL backup files, but only those of server-021, without having to know the
folder names.
wbadmin:
\backupFULL\server-041\20111114\windowsimagebackup\server-041\backup 2011-11-14 013016
The principle is the same: the next backup's rsync position would be
"\backupFULL\server-041\201111xx", so a fuzzy include of ../ would eventually
search upwards into the tree of the last FULL backups and find a similar VHD
backup file to delta-compare with.
Hopefully you can think of some kind of dynamic solution for this.
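Until something dynamic exists, one possible approximation for the ntbackup layout is to turn a listing of the sibling date folders into repeated --copy-dest options and combine them with -yy. This relies on the documented rule that a relative --copy-dest path is taken relative to the destination directory. The helper name sibling_copy_dests is hypothetical, and the folder names still have to come from somewhere (for example, a directory listing made beforehand). It is unlikely to help with the wbadmin layout, where the nested folder names differ between backups.

```shell
#!/bin/sh
# Read sibling directory names (one per line) on stdin and print a
# --copy-dest=../NAME option for each of the newest $2 entries, skipping
# the directory that is currently being written ($1).
# Assumption: names are yyyymmdd dates, so a plain sort is chronological.
sibling_copy_dests() {
    current=$1   # directory being written, e.g. 20111130
    keep=$2      # number of most recent siblings to reference
    grep -v -x -F -- "$current" | sort | tail -n "$keep" |
        sed 's|^|--copy-dest=../|'
}

# Example listing for \backupFULL\server-021 while writing into 20111130.
printf '%s\n' 20111120 20111130 20111114 | sibling_copy_dests 20111130 5
```

The printed options would then be appended to an rsync command that also passes -yy.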