On 11.02.2015 14:03, QUBE RUBBIK wrote:
> Hello
>
> I was just thinking about a killer feature for rsync: the ability to detect
> file name changes or moves within the source and destination.
> At this time rsync has to re-transfer a file if it has been renamed or
> moved inside a subfolder, with a heavy waste of resources and bandwidth.
>
> It could be smarter:
> with a --smart switch, rsync could take a hash of every file within the
> source and destination BEFORE TRANSFERRING;
> then for existing (matching-hash) files it would only need to alter
> metadata (name, location, chmod etc.), saving plenty of bandwidth.
Imagine doing that for a couple of GB of data. The hashing might take
longer than the time saved by not copying it.
This would only work with a persistence layer that remembers the hashes
of unchanged files. This has been a topic in the past, although I don't
remember the details. (And I'm too lazy to google for it.)
Otherwise the only time it really saves time is when you have really
asymmetric bandwidths:
fast local access on both sides (to create the hashes), and a terrible
bandwidth on the link in between (for the copying of new/changed files).
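To make that matching step concrete, here is a rough sketch of what the
proposal implies; sha256sum and join from GNU coreutils are assumed, and
the directory names are just examples:

```shell
#!/bin/sh
# Illustrative only: hash both trees, then pair identical content that
# lives under different names. Note that this reads every byte on both
# sides before any transfer happens - that is the cost being discussed.
mkdir -p src dest
echo "big payload" > src/new-name
echo "big payload" > dest/old-name

# One "<hash>  <path>" line per file, sorted by hash for join:
(cd src  && find . -type f -exec sha256sum {} +) | sort > src.sums
(cd dest && find . -type f -exec sha256sum {} +) | sort > dest.sums

# One line per content match: <hash> <src path> <dest path>
join src.sums dest.sums
```

(A real implementation would also have to handle hash collisions,
paths with whitespace, and files that change mid-scan.)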
> Okay, the destination has to handle this; I expect the rsync daemon
> would have to handle server-side file hashing.
>
> We would have a clever tool to replicate data that has only been
> reorganised, with no changes to the files themselves.
> No need to resync the whole structure if you added a dir to the path, or
> someone renamed that particular heavy file.
>
> This may save a lot of data on automatic backups, FTP mirrors, etc.
>
>
> What do you think about it?
The 'workaround' I personally use is hardlinks: just hardlink all files
into a directory that sorts alphabetically before everything else. I
personally use a '.z' directory in the root of each tree I treat
that way.
The reason for that name is that rsync has to work through that directory
first, otherwise the trick wouldn't work as intended.
After that you can move the files around, and when you run:
rsync ... -H --delete ... ...
rsync just deletes and re-hardlinks the moved file(s) on the destination
instead of re-transferring them.
If you remove a file:
find .z -type f -links 1 -delete
removes the 'dangling' file(s) with only 1 link remaining.
(And in the meantime you have a backup, in case you accidentally deleted
a file.)
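Put together, the whole workflow looks roughly like this (illustrative
paths, GNU find assumed; the rsync calls are commented out because the
point here is how the link counts behave):

```shell
#!/bin/sh
set -e
# Set up a tree with a .z hardlink farm (example layout).
rm -rf demo && mkdir -p demo/src/.z
echo "payload" > demo/src/bigfile
ln demo/src/bigfile demo/src/.z/bigfile    # second name, same inode: 2 links

# rsync -aH --delete demo/src/ demo/dest/  # initial sync: data sent once

mv demo/src/bigfile demo/src/renamed       # rename: still 2 links, no data moved
# rsync -aH --delete demo/src/ demo/dest/  # rsync re-links under the new name

rm demo/src/renamed                        # real deletion: .z entry drops to 1 link
find demo/src/.z -type f -links 1 -delete  # prune the now-dangling .z entry
```

After the last step, demo/src/.z/bigfile is gone, because its only
remaining name was the one in the .z directory.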
You would also need a plan for maintaining the .z directory:
initial creation, adding new files, what happens if files change, and so on.
The solution has these caveats, but it works fine for me.
--
Matthias