Grzegorz Borowiak
2015-Aug-21 22:12 UTC
I would like to add features to rsync: tags and saving local modifications
Hello!

My name is Grzegorz Borowiak and I am a programmer. I work for a company which uses rsync internally to distribute our continuously changing development environment. The environment weighs several gigabytes and consists of over 100000 files, most of them binary, so VCSes like git and subversion are not an option, but rsync performs very efficiently.

However, I would like to add some features which we need. They are generic enough to be useful to others, so I would like to add them in a way that would allow them to be contributed.

Feature 1: tags

Our environment is large but modularised, i.e. every file in it belongs to some module. Not every user needs every module, so the rsync download is parametrised by checking or unchecking modules. Currently this is implemented as filters, which include or exclude files by their path or, in some cases, by substrings in file names.

To make modularisation more straightforward, and not limited by the need to distinguish files by path or name, I propose to introduce the concept of tags. Every file could be tagged with some string as an xattr (for example, user.rsync.tag=TAG), and in the downloading rsync invocation you could specify a parameter --tag=TAG. This option could be specified more than once. rsync, once invoked this way, would affect:

- all files without any tag
- all files which match any of the specified tags

Another approach would be to use multiple tags for each file. This would be achieved by setting or unsetting xattrs like user.rsync.tag.TAG. If a file is tagged with "a" and "b", it has the xattrs user.rsync.tag.a and user.rsync.tag.b. This would allow a finer division and the use of logical expressions: for example, --tag-expr='a || (b && !c)' would select all files which have tag "a", or have tag "b" but not "c".

rsync already uses xattrs for storing metadata in fake super mode, so this seems a natural way to implement tags.
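The selection semantics described above can be sketched in a few lines of Python. This is only an illustration of the proposal, not rsync code: the function names are hypothetical, and the expression grammar (precedence "!" over "&&" over "||", tag names limited to letters, digits, "_", "." and "-") is an assumption, since the post does not define one. The xattr names are passed in as a plain list; on Linux they could come from os.listxattr(path).

```python
import re

TAG_PREFIX = "user.rsync.tag."  # xattr naming proposed in the post


def selected_single(file_tag, wanted):
    """Single-tag variant: untagged files always match, tagged ones
    match if their tag is among the requested --tag values."""
    return file_tag is None or file_tag in wanted


def tags_from_xattr_names(names):
    """Multi-tag variant: derive the tag set from xattr names."""
    return {n[len(TAG_PREFIX):] for n in names
            if n.startswith(TAG_PREFIX) and len(n) > len(TAG_PREFIX)}


_TOKEN = re.compile(r"\s*(\|\||&&|!|\(|\)|[A-Za-z0-9_.-]+)")


def tokenize(expr):
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        m = _TOKEN.match(expr, pos)
        if not m:
            raise ValueError(f"bad tag expression at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def matches_expr(expr, file_tags):
    """Evaluate an expression such as 'a || (b && !c)' against a tag set
    with a small recursive-descent parser (assumed grammar, see above)."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of tag expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok in ("||", "&&", "!", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in file_tags  # a bare name is true if the file has that tag

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```

For instance, matches_expr("a || (b && !c)", {"b"}) selects the file, while the same expression against {"b", "c"} does not.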
In both approaches, the filtering could be integrated with filter rules. If a modifier "t" were appended after "+", "-", "H", "S", "P" or "R", the rule would treat the following expression not as a path-matching pattern, but as a tag or a logical combination of tags. For example:

"+t base" would include all files with tag "base"
"Ht gui" would hide all files with tag "gui"
"Ht a && !b" would hide all files tagged with "a" but not "b"

Feature 2: saving local modifications

Our users frequently make local modifications. They always get lost when they rsync with a newer version. I would like to make it possible to detect these modifications and back up the affected files. There is already the --backup option, but it is insufficient, as it saves too many files -- also those which were not locally modified.

To solve this problem, I would like to use xattrs again and introduce user.rsync.md5sum, which would store the md5sum of the file. When a file is about to be overwritten or deleted by rsync, rsync first calculates its md5sum, and if it differs from the value in the xattr, the file is saved to backup. If a file has no md5sum xattr at all, it is also saved to backup, as it was certainly created locally.

Another method, quicker and less demanding but imperfect, would be to create a special file after each downloading rsync run, which would serve as a timestamp, and to treat all files with a newer mtime as locally modified.

And here go my questions:

- is any of the above features already implemented in some form, or being implemented now (in progress)?
- for feature 1, which solution would you prefer: single or multiple tagging?
- for feature 1, is it a good idea to extend filter rules to handle tags, or is it better to stay with standalone arguments?
- for feature 2, which solution would you prefer: md5sum, timestamp, or both (both can be implemented)?
- 'fake super' uses the user.rsync.%stat xattr; is the percent sign part of some convention which my xattrs should also follow?
- did I miss something?
- do you have other ideas how to provide these features?
- what are the coding guidelines for rsync development?
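
The two detection methods proposed for feature 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not rsync behaviour: the helper names are hypothetical, user.rsync.md5sum is the xattr name proposed in the post, and the recorded checksum is passed in as a plain string (or None when the xattr is absent) so the sketch does not depend on xattr support; on Linux it could be read with os.getxattr.

```python
import hashlib
import os

MD5_XATTR = "user.rsync.md5sum"  # xattr name proposed in the post


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_backup(path, recorded_md5):
    """md5sum method: back up a file that is about to be overwritten or
    deleted if it has no recorded checksum (created locally) or if its
    current checksum differs from the recorded one (modified locally)."""
    if recorded_md5 is None:
        return True
    return file_md5(path) != recorded_md5


def modified_since(path, stamp_path):
    """Timestamp method: treat a file as locally modified if its mtime is
    newer than that of the stamp file written after the last download."""
    return os.stat(path).st_mtime > os.stat(stamp_path).st_mtime
```

The timestamp check is cheaper because it reads no file contents, but it misses edits that preserve or reset the mtime, which is the imperfection mentioned above.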
ray vantassle
2015-Aug-22 14:56 UTC
I would like to add features to rsync: tags and saving local modifications
"Feature 1: tags"

Sounds like this would be no less work for the user than just having include/exclude filter list(s). But right now rsync works poorly with large filter lists. That's something I have fixed, but I just haven't submitted the patches yet.

Also: is xattr supported on all OSes and filesystems? If not, is it worth the effort to make a (possibly large) change to rsync?

"Feature 2: saving local modifications Our users frequently do some local modifications. They always get lost when they rsync with newer version."

The purpose of rsync is to make exact file copies from one machine to another. What you are asking for is for the recipient system to sometimes refuse to accept an updated copy. There is already a mechanism to do this -- dir-merge.

In your proposal, you would have to do some sort of extra work -- presumably a script or whatever -- to detect these modified files and fiddle with xattrs for this rsync feature to know how to handle them. Since you are already going to have to run a special-purpose script before/after the rsync, why not just have it create a dir-merge file? That sounds less complex than having a bunch of special xattr attributes and a set of special handling based on those xattrs.