thr3ads.net - rsync - Implementing a conditional branch within rsync based on modified time of a file [Jan 2009]

Jeff Allen

2009-Jan-10 21:07 UTC

Implementing a conditional branch within rsync based on modified time of a file

Greetings,

I've been looking through archives, googling, and reading through man pages
to no avail for some time now. I believe I need some combination of rsync -u
(update) and rsync --del, and I'm not quite sure how to get it.

I'm looking to build a rough implementation of a multi-client rdiff-backup
system; in order to do this I'm using rsync before rdiff-backup.

(We'll say there's a server, Client A, and Client B. Files should be
synced between A and B but the server should keep a master list of all
differences and changes made in any file, by any client in the directory I'm
syncing).

Essentially, I envision that syncing Client A would go something like this:
1. Rsync down from the server to Client A in order to ensure that any
newly created files added recently by Client B (which would have already been
uploaded - via rdiff-backup - to the server) are added to the local directory on
Client A.
2. Rdiff-backup from Client A to the server. This will not increment the freshly
downloaded files created by Client B, as the modified times are equal. However,
it would update any files newly created or edited on Client A since the last sync.

However, I will run into problems when I delete a file.
If I delete a file off of either client, the file will be un-deleted when I
rsync down in step one, as the file would still exist on the server. But if I
use rsync --del, it would just delete any and all new files created on a client
since the last sync.
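
To make the dilemma concrete, here is a toy model in Python. It is only a sketch: directories are represented as sets of file names, the file names are made up, and `rsync_down` is a hypothetical stand-in for the real rsync run, modelling nothing beyond "copy everything the server has" and the effect of --del.

```python
def rsync_down(server, client, delete=False):
    """Model 'rsync server -> client' on sets of file names (toy model)."""
    result = set(client) | set(server)  # every server file ends up on the client
    if delete:
        result &= set(server)           # --del drops anything the server lacks
    return result


# Hypothetical state: the server still reflects the last sync.
server = {"a.txt", "b.txt"}
# Since then, this client deleted b.txt and created new.txt.
client = {"a.txt", "new.txt"}

without_del = rsync_down(server, client)
with_del = rsync_down(server, client, delete=True)

print(sorted(without_del))  # ['a.txt', 'b.txt', 'new.txt']  (b.txt is un-deleted)
print(sorted(with_del))     # ['a.txt', 'b.txt']             (new.txt is lost)
```

Neither result matches what I want, which would be a.txt and new.txt only.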

The best solution I can envision is to write a shell script (or modify the rsync
source) which would alter step 1 above to the following:

global variable lastSync; //last synchronization for this client
function syncFile(file, modifiedDate){
  if (modifiedDate > lastSync){
     //this must be a new file created from another client.
     download the file from the server
  }
  else{
     //the file has been deleted on the client since the last sync, delete it.
     delete the file.
  }
}

I suppose I would first be interested to hear whether anyone sees any pitfalls or
logical errors in the above implementation.

More pertinently to this list, what approach should I take with this? Would it
be possible to implement something of this nature with shell scripts or would I
really need to modify the source? Has anyone tried anything comparable?
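
For what it is worth, the branch described above can be prototyped outside rsync. The following Python sketch only computes a plan and touches no files. The function name `plan_downsync`, the action labels, and the sample data are all made up for illustration, and the mapping of server paths to modification times is assumed to have been gathered some other way (for example from a file listing).

```python
from typing import Dict, Set


def plan_downsync(server_mtimes: Dict[str, float],
                  client_paths: Set[str],
                  last_sync: float) -> Dict[str, str]:
    """Decide what to do with each server file that is missing on the client.

    server_mtimes maps relative paths to modification times (seconds).
    Files present on both sides are left to the normal transfer.
    """
    plan = {}
    for path, mtime in server_mtimes.items():
        if path in client_paths:
            continue  # exists locally; not part of this decision
        if mtime > last_sync:
            # Modified after our last sync: assume another client added it.
            plan[path] = "download"
        else:
            # Older than our last sync: assume we deleted it locally.
            plan[path] = "delete"
    return plan


if __name__ == "__main__":
    server = {"notes.txt": 100.0, "report.txt": 250.0, "shared.txt": 50.0}
    client = {"shared.txt"}
    print(plan_downsync(server, client, last_sync=200.0))
    # {'notes.txt': 'delete', 'report.txt': 'download'}
```

One caveat with this test: the modification time is not a creation time, so a file that another client adds with an old, preserved mtime would land in the "delete" branch.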

Thank you for your help and time

Jeff

Matt McCutchen

2009-Jan-13 01:27 UTC

(Synchronization among clients with history)

On Sat, 2009-01-10 at 15:01 -0600, Jeff Allen wrote:
> I'm looking to build a rough implementation of a multi-client
> rdiff-backup system; in order to do this I'm using rsync before
> rdiff-backup.
> 
> (We'll say there's a server, Client A, and Client B. Files should be
> synced between A and B but the server should keep a master list of all
> differences and changes made in any file, by any client in the
> directory I'm syncing).
> Essentially, I envision that syncing client A would go something like
> this:
> 1. Rsync down from the server to Client A in order to ensure that any
> newly-created files added recently by Client B (which would have
> already been uploaded - via rdiff-backup - to the server) is added to
> the local directory on Client A.
> 2. Rdiff-backup from Client A to the server. This will not increment
> the freshly downloaded files created by client B, as the modified
> times are equal. However, it would update those newly-created/edit
> files on Client A since the last sync.

Do I understand correctly that you're taking advantage of the fact that
rdiff-backup leaves the latest files in an ordinary tree that you can
read via rsync, provided that you --exclude=/rdiff-backup-data ?

> However, I will run into problems when I delete a file.
> If I delete a file off of either client, the file will be un-deleted
> when I rsync down in step one, as the file would still exist on the
> server. But if I use rsync --del, it would just delete any and all new
> files created on a client since the last sync.
> 
> The best solution I can envision is to write a shell script (or modify
> the rsync source) which would alter step 1 above to the following:
> 
> global variable lastSync; //last synchronization for this client
> function syncFile(file, modifiedDate){
>   if (modifiedDate > lastSync){
>      //this must be a new file created from another client.
>      download the file from the server
>   }
>   else{
>      //the file has been deleted on the client since the last sync, delete it.
>      delete the file.
>   }
> }

It just so happens that I had a similar need a few years ago (but
without the need to save history) and made a similar proposal as my
first rsync bug:

https://bugzilla.samba.org/show_bug.cgi?id=2094

Wayne wisely advised me to use a real two-way synchronization tool such
as unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) instead, and I
would give you the same advice.  But what makes your case more difficult
is that you don't want to write directly to the rdiff-backup dir with
unison.

If unison had an option to propagate changes in one direction and skip
any changes detected in the other direction, you could use that in step
1 and count on the next run of unison to recognize the changes made by
rdiff-backup as convergent.  Unfortunately, unison has no such option,
though you may be able to rig up a script to accomplish this in unison's
interactive mode.

Alternatively, you could introduce an intermediate directory containing
another copy of the data (which could be on either each client or the
server) and use the following procedure:

1. Rsync from rdiff-backup dir to intermediate dir.
2. Synchronize intermediate dir with client via unison.
3. Back up intermediate dir to rdiff-backup dir.

But this uses extra space.
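
As a rough illustration of that three-step procedure, here is a Python sketch that only builds and prints the command lines rather than running them. The directory paths and the function name `build_sync_commands` are hypothetical. The choice of `-a --delete` for rsync and `-batch` for unison is an assumption about how one might run the steps unattended, and the rdiff-backup line uses the plain "source destination" form.

```python
import shlex
from typing import List


def build_sync_commands(backup_dir: str,
                        intermediate_dir: str,
                        client_dir: str) -> List[List[str]]:
    """Return argv lists for the three steps; nothing is executed here."""
    # Trailing slashes make rsync copy directory contents, not the directory.
    src = backup_dir.rstrip("/") + "/"
    dst = intermediate_dir.rstrip("/") + "/"
    return [
        # 1. Mirror the latest tree, skipping rdiff-backup's own metadata.
        ["rsync", "-a", "--delete", "--exclude=/rdiff-backup-data", src, dst],
        # 2. Two-way synchronization between intermediate dir and client.
        ["unison", intermediate_dir, client_dir, "-batch"],
        # 3. Record the result as a new increment in the rdiff-backup dir.
        ["rdiff-backup", intermediate_dir, backup_dir],
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for argv in build_sync_commands("/srv/backup", "/srv/stage", "/home/user/data"):
        print(shlex.join(argv))
```

Each list could be handed to `subprocess.run(argv, check=True)` once the paths and options have been checked against your own setup.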

Given your requirements for both history and synchronization, you may be
better served by using a full version-control tool in place of both
rdiff-backup and unison.  My personal favorite is git
( http://git.or.cz/ ).  The downside is that you'll have to jump through
extra hoops if you care about file attributes.  See this thread for some
ideas (written with reference to git but may apply to other tools too):

http://www.gelato.unsw.edu.au/archives/git/0612/index.html#34154

I hope one of these approaches works for you.  If not, give me some more
information and I will see if I can come up with anything else.

-- 
Matt
