Martin Pool
2003-Nov-25 12:14 UTC
rsync-bugs and unclear semantics when copying multiple source-dirs to one target
On 24 Nov 2003, Dirk Pape <pape@inf.fu-berlin.de> wrote:
> Dear Martin Pool,
>
> I tried to ask via the rsync mailing list but never got an answer. So I
> contact you directly.
>
> I refer to the rsync syntax
>
>   rsync [OPTION]... SRC [SRC]... DEST
>
> with more than one SRC, which is mentioned in the man pages.
> We use this form to "overlay" a target directory tree from more than one
> source (class, group1, group2, ..., machine) to yield a customized
> "cloned" directory.
>
> There are some glitches and bugs when using this form of rsync command,
> one of which I have described in the mail to the rsync mailing list
> attached here. This is a platform-specific bug.

The heart of the problem is that you are trying to write the same file
from several different source directories. I think this just will not
work predictably in the current design of rsync, because it builds a
single list of all files at the start of the transfer. Furthermore, the
order in which files are transferred is rather strange, for reasons of
historical compatibility.

I think we do not make any guarantees about what happens if the same
relative path occurs in several source directories; the behaviour is
undefined. I agree that it would be nice if it processed the source
directories in the order they are given, but that is not how it works.

At the moment your options are:

 - Fix rsync to support this behaviour.

 - Transfer the directories one at a time to build up the destination.
   This has several problems, one being that there may be many redundant
   transfers and another that the state will be inconsistent for longer.

 - Make a single source directory that has the state you want.

 - Ditto, but use union bind mounts to synthesize it from several
   directories, assuming that your OS supports that.

 - Use some other tool.

 - Do several rsync transfers using exclude/include options to pick the
   right directories from each overlay.

The last is possibly the most promising.
You could even write a little Perl script to build exactly the right
include lists.

> There is another glitch, which I will describe here:
>
> if you have the following directory structure (-> is softlink)
>
>   ./dir1/dir/a
>   ./dir2/dir -> ../dir3/dir
>   ./dir3/dir/b
>
> and do
>
>   rsync -av --delete dir1/ dir2/ target
>
> you get
>
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/a
>   ./dir3/dir/b
>
> I would expect either
>
> Variant 1:
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/b
>
> (contents of /dir1/dir are ignored because dir is "overlayed" with a
> symlink in dir2)
> or
>
> Variant 2:
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/b
>
> (./dir1/dir/a is copied following the overlayed symlink *but* the --delete
> then also has to follow the symlink)
>
> I would strongly prefer to see variant 1 or a new option to protect target
> directories from having their contents changed by linking into them.
>
> For your motivation:
>
> Our more complex scenario is like this: We have
>
>   class/usr/share/bugzilla/<some_files>
>   machine/usr/share/bugzilla -> /local/usr/share/bugzilla
>
> and we do something like
>
>   rsync -av --delete --exclude local class/ machine/ targethost:/
>
> the "--exclude local" protects files in targethost:/local from being
> deleted but not from being overwritten with files which are present in
> class/usr/share/bugzilla/ on the src-host.
>
> I would like to see an option (or standard semantics) for simply "killing"
> a directory's "sub"-filelist when the directory is overlayed by a symlink
> in a source directory given later in the command line. Maybe it would
> suffice to do that only if the symlink points to a directory which is
> "outside" all source dirs or is an element of an exclude list.
>
> I hope you understand and can help me.
>
> Thanks,
> Dirk Pape.

> From: Dirk Pape <pape@inf.fu-berlin.de>
> Subject: bug (filelist) for platforms solaris and darwin (macosx) and *not* linuxi386
> Date: Sun, 28 Sep 2003 13:19:45 +0200
> To: rsync@lists.samba.org
> X-Mailer: Mulberry/3.1.0b7 (Mac OS X)
>
> I have found a nasty bug when a file, which is in some of many sources,
> shall be copied to a target.
>
> The linux version works well but rsync 2.5.{2|5|6} under solaris9 (gcc
> 2.95.3) and darwin (gcc 3.1) do not. The decision which file (out of which
> src) shall be copied depends on the number of src dirs given on the command
> line.
>
> This bug bites us very hard, because we decided to rely on rsync to build
> local directories by "overlaying" different directories from a server and
> need to be sure to have consistent semantics in what version of the file
> appears in the local directory.
>
> I stripped our situation down to a (yet fairly complex) test archive, so
> you can reproduce the situation.
>
> Here is the script, which is also in the archive:
>
> #!/bin/bash
> rsyncpath=rsync
> $rsyncpath -av --delete dir1/ dir2/ merged12
> $rsyncpath -av --delete dir1/ dir2/ dir3/ merged123
> # as dir3 only consists of an empty dir "subdir" we expect
> # that merged12 and merged123 have identical files in them
> # but merged*/subdir/s0/LOOKATTHIS differ as they come from different sources:
> diff -c merged*/subdir/s0/LOOKATTHIS
> # this has been reproduced for rsync Version 2.5.2, 2.5.5 and 2.5.6 under
> # solaris9 (gcc 2.95.3) and darwin (gcc 3.1)
> # this bug *cannot* be reproduced under linuxi386 (gcc 2.95.4)

> From: Dirk Pape <pape@inf.fu-berlin.de>
> Subject: Re: bug (filelist) for platforms solaris and darwin (macosx) and *not* linuxi386
> Date: Mon, 06 Oct 2003 15:55:09 +0200
> To: rsync@lists.samba.org
> X-Mailer: Mulberry/3.1.0b8 (Mac OS X)
>
> Hello,
>
> I wrote this report one and a half weeks ago.
> There is a bug in the "flist" module, which can be consistently
> reproduced on at most two Unix platforms: Solaris 9 and MacOS X.
>
> Can anybody help me with the email address of the developer of the
> flist module, so I can contact him/her directly?
>
> Thanks,
> Dirk.

--
Martin
linux.conf.au -- Adelaide, January 2004
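The last option in the reply above (several transfers, each restricted to the paths that one overlay should supply) needs a per-source list of winning paths. The following is a minimal sketch of how such lists might be computed, written in Python rather than Perl. The policy it implements is an assumption taken from what Dirk asks for, not rsync's actual behaviour: later sources win, and a symlink (or any non-directory) in a later source discards everything earlier sources placed beneath that path. The names `resolve_overlay`, `files_per_source`, and `demo` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def resolve_overlay(sources):
    """Map each relative path to the source directory that should supply it.

    Assumed policy (NOT what rsync does): sources are processed in the
    order given, later sources win, and a symlink or other non-directory
    in a later source drops all entries recorded beneath that path.
    """
    winners = {}  # relative path -> source directory (as a string)
    for src in sources:
        src = Path(src)
        # os.walk does not descend into symlinked directories by default.
        for root, dirs, files in os.walk(src):
            for name in dirs + files:
                full = Path(root) / name
                rel = full.relative_to(src).as_posix()
                if full.is_symlink() or not full.is_dir():
                    # "Kill" the sub-filelist contributed by earlier sources.
                    prefix = rel + "/"
                    for old in [p for p in winners if p.startswith(prefix)]:
                        del winners[old]
                winners[rel] = str(src)
    return winners


def files_per_source(winners):
    """Group the winning relative paths by the source that supplies them."""
    lists = {}
    for rel, src in sorted(winners.items()):
        lists.setdefault(src, []).append(rel)
    return lists


def demo():
    """Rebuild the dir1/dir2/dir3 layout from the mail and resolve it."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "dir1" / "dir").mkdir(parents=True)
        (base / "dir1" / "dir" / "a").write_text("a")
        (base / "dir3" / "dir").mkdir(parents=True)
        (base / "dir3" / "dir" / "b").write_text("b")
        (base / "dir2").mkdir()
        (base / "dir2" / "dir").symlink_to("../dir3/dir")
        winners = resolve_overlay([base / "dir1", base / "dir2"])
        return {rel: Path(src).name for rel, src in winners.items()}


if __name__ == "__main__":
    print(demo())
```

For the layout in the mail, only `dir` from dir2 survives and `dir/a` from dir1 is dropped, which corresponds to Variant 1. Each per-source list could then drive one rsync run, either through include rules or through `--files-from` in rsync versions that provide it. Deletion handling is left out of the sketch.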
Dirk Pape
2003-Nov-25 18:46 UTC
rsync-bugs and unclear semantics when copying multiple source-dirs to one target
Hello Martin,

Thanks for your fast answer to my problem. I am now happy to have a clear
position from the developer of rsync (though I am not very happy with the
position itself ;-).

On Tuesday, 25 November 2003 12:14 +1100, Martin Pool <mbp@samba.org> wrote:
> At the moment your options are:
>
>  - Fix rsync to support this behaviour.

I would like to take a quick look at this first option you mentioned
before I consider the others. Do you have, and can you send me, some API
documentation for the flist module, which I believe is the one I have to
modify?

Thanks,
Dirk Pape.

--
Dr. Dirk Pape (Head of Computing Operations and IT Officer)
Department of Mathematics and Computer Science, FU Berlin
Takustr. 9, 14195 Berlin
Tel. +49 (30) 838 75143, Fax. +49 (30) 838 75190