Jeff Allen
2009-Jan-10 21:07 UTC
Implementing a conditional branch within rsync based on modified time of a file
Greetings,

I've been looking through archives, googling, and reading through man
pages to no avail for some time now. I believe I need some combination
of rsync -u (update) and rsync --del, and I'm not quite sure how to get
it.

I'm looking to build a rough implementation of a multi-client
rdiff-backup system; in order to do this I'm using rsync before
rdiff-backup.

(We'll say there's a server, Client A, and Client B. Files should be
synced between A and B, but the server should keep a master list of all
differences and changes made in any file, by any client, in the
directory I'm syncing.)

Essentially, I envision that syncing Client A would go something like
this:

1. Rsync down from the server to Client A in order to ensure that any
   newly created files added recently by Client B (which would have
   already been uploaded, via rdiff-backup, to the server) are added to
   the local directory on Client A.
2. Rdiff-backup from Client A to the server. This will not create new
   increments for the freshly downloaded files created by Client B, as
   the modified times are equal. However, it would update the files
   newly created or edited on Client A since the last sync.

However, I will run into problems when I delete a file. If I delete a
file off of either client, the file will be un-deleted when I rsync
down in step 1, as the file would still exist on the server. But if I
use rsync --del, it would just delete any and all new files created on
a client since the last sync.

The best solution I can envision is to write a shell script (or modify
the rsync source) which would alter step 1 above to the following:

    global variable lastSync; //last synchronization for this client
    function syncFile(file, modifiedDate){
        if (modifiedDate > lastSync){
            //this must be a new file created from another client.
            download the file from the server
        }
        else{
            //the file has been deleted on the client since the last sync, delete it.
            delete the file.
        }
    }

I suppose I would first be interested to hear if anyone sees any
pitfalls or logical errors in the above implementation.

More pertinently to this list, what approach should I take with this?
Would it be possible to implement something of this nature with shell
scripts, or would I really need to modify the source? Has anyone tried
anything comparable?

Thank you for your help and time,
Jeff
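
The decision rule in the pseudocode above can be sketched in Python as follows. This is only an illustration under stated assumptions, not something taken from rsync or rdiff-backup: it assumes the server's copy of the tree is readable as a local path (for example over a network mount), that `last_sync` is stored somewhere as a POSIX timestamp, and the names `plan_sync`, `server_dir`, `client_dir`, and `last_sync` are hypothetical. It only classifies files that exist on the server but not on the client; carrying out the download or deletion is left to the caller.

```python
import os


def plan_sync(server_dir, client_dir, last_sync):
    """Classify files that exist on the server but are missing on the client.

    last_sync is the POSIX timestamp of this client's previous sync
    (hypothetical; the caller must record it somewhere).
    Returns a dict mapping relative path -> "download" or "delete".
    """
    plan = {}
    for root, dirs, files in os.walk(server_dir):
        # Do not descend into rdiff-backup's own metadata directory.
        if root == server_dir and "rdiff-backup-data" in dirs:
            dirs.remove("rdiff-backup-data")
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, server_dir)
            if os.path.exists(os.path.join(client_dir, rel)):
                # Present on both sides: not covered by the rule above.
                continue
            if os.path.getmtime(src) > last_sync:
                # Modified after the last sync: assume another client added it.
                plan[rel] = "download"
            else:
                # Older than the last sync: assume it was deleted locally.
                plan[rel] = "delete"
    return plan
```

Note that the rule relies on modification times alone: a file added by another client with an old, preserved modification time would be classified as "delete" rather than "download".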
On Sat, 2009-01-10 at 15:01 -0600, Jeff Allen wrote:
> I'm looking to build a rough implementation of a multi-client
> rdiff-backup system; in order to do this I'm using rsync before
> rdiff-backup.
>
> (We'll say there's a server, Client A, and Client B. Files should be
> synced between A and B but the server should keep a master list of all
> differences and changes made in any file, by any client in the
> directory I'm syncing).
> Essentially, I envision that syncing client A would go something like
> this:
> 1. Rsync down from the server to Client A in order to ensure that any
> newly-created files added recently by Client B (which would have
> already been uploaded - via rdiff-backup - to the server) is added to
> the local directory on Client A.
> 2. Rdiff-backup from Client A to the server. This will not increment
> the freshly downloaded files created by client B, as the modified
> times are equal. However, it would update those newly-created/edit
> files on Client A since the last sync.

Do I understand correctly that you're taking advantage of the fact that
rdiff-backup leaves the latest files in an ordinary tree that you can
read via rsync, provided that you --exclude=/rdiff-backup-data ?

> However, I will run into problems when I delete a file.
> If I delete a file off of either client, the file will be un-deleted
> when I rsync down in step one, as the file would still exist on the
> server. But if I use rsync --del, it would just delete any and all new
> files created on a client since the last sync.
>
> The best solution I can envision is to write a shell script (or modify
> the rsync source) which would alter step 1 above to the following:
>
> global variable lastSync; //last synchronization for this client
> function syncFile(file, modifiedDate){
>     if (modifiedDate > lastSync){
>         //this must be a new file created from another client.
>         download the file from the server
>     }
>     else{
>         //the file has been deleted on the client since the last sync,
>         //delete it.
>         delete the file.
>     }
> }

It just so happens that I had a similar need a few years ago (but
without the need to save history) and made a similar proposal as my
first rsync bug:

https://bugzilla.samba.org/show_bug.cgi?id=2094

Wayne wisely advised me to use a real two-way synchronization tool such
as unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) instead, and I
would give you the same advice. But what makes your case more difficult
is that you don't want to write directly to the rdiff-backup dir with
unison.

If unison had an option to propagate changes in one direction and skip
any changes detected in the other direction, you could use that in
step 1 and count on the next run of unison to recognize the changes
made by rdiff-backup as convergent. Unfortunately, unison has no such
option, though you may be able to rig up a script to accomplish this in
unison's interactive mode.

Alternatively, you could introduce an intermediate directory containing
another copy of the data (which could be on either each client or the
server) and use the following procedure:

1. Rsync from rdiff-backup dir to intermediate dir.
2. Synchronize intermediate dir with client via unison.
3. Back up intermediate dir to rdiff-backup dir.

But this uses extra space.

Given your requirements for both history and synchronization, you may
be better served by using a full version-control tool in place of both
rdiff-backup and unison. My personal favorite is git
( http://git.or.cz/ ). The downside is that you'll have to jump through
extra hoops if you care about file attributes. See this thread for some
ideas (written with reference to git but may apply to other tools too):

http://www.gelato.unsw.edu.au/archives/git/0612/index.html#34154

I hope one of these approaches works for you. If not, give me some more
information and I will see if I can come up with anything else.
-- Matt
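
The three-step intermediate-directory procedure described in the reply above could be scripted roughly as follows. This is a minimal sketch under several assumptions that are not in the thread: all three directories are local paths on the machine running the script, rsync is given `-a --delete` so the intermediate copy mirrors the backup tree exactly, unison is run with `-batch` so it asks no questions, and rdiff-backup is invoked in its classic `rdiff-backup SOURCE DEST` form. The names `build_commands` and `run_cycle` and the example paths are hypothetical.

```python
import subprocess


def build_commands(backup_dir, intermediate_dir, client_dir):
    """Return the three command lines for one sync cycle, in order."""
    # Trailing slashes make rsync copy directory contents, not the directory.
    src = backup_dir.rstrip("/") + "/"
    dst = intermediate_dir.rstrip("/") + "/"
    return [
        # 1. Mirror the current tree out of the rdiff-backup dir,
        #    leaving out rdiff-backup's own metadata (assumed flags).
        ["rsync", "-a", "--delete", "--exclude=/rdiff-backup-data", src, dst],
        # 2. Two-way synchronization between intermediate dir and client.
        ["unison", intermediate_dir, client_dir, "-batch"],
        # 3. Record the result as a new rdiff-backup increment.
        ["rdiff-backup", intermediate_dir, backup_dir],
    ]


def run_cycle(backup_dir, intermediate_dir, client_dir):
    """Run the cycle, stopping at the first command that fails."""
    for cmd in build_commands(backup_dir, intermediate_dir, client_dir):
        subprocess.run(cmd, check=True)
```

For example, `run_cycle("/srv/backup", "/srv/intermediate", "/home/jeff/data")` would run the three commands in order. Stopping on the first failure means a failed rsync or unison run does not get recorded as a backup increment.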