I believe I have encountered a nasty bug in the rsync 2.6.6 "--delete-excluded" option. When it is enabled, rsync's operation appears to break entirely.

I have a script that I've been using for a few years. The script hasn't changed in the last year or so. Recently I upgraded from SuSE 9.3 to SUSE 10.0, which means that I went from rsync 2.6.3 (+ patches, I assume) to rsync 2.6.6. It was at this point that the script stopped working. I haven't had time to diagnose it until today. These are my findings:

To make sure it wasn't some non-rsync aspect of the system, I forcibly downgraded the rsync installation to the one from SuSE 9.3. The script worked great! I then installed the one from SUSE 10.0 and the script was broken again. I eventually boiled everything down to this:

The invocation on the client machine is:

    rsync -avvvvv --delete-excluded --exclude-from=excludes \
        -e 'ssh' remote_machine:

This is what I get:

    [client] parse_filter_file(excludes,0,3)
    [client] add_rule(- *.mp3)
    ...
    (Client) Protocol versions: remote=29, negotiated=29
    (Server) Protocol versions: remote=29, negotiated=29
    building file list ...
    [sender] make_file(....)    <- lots of this
    ...
    recv_file_list done
    rsync: writefd_unbuffered failed to write 4092 bytes: phase "send_file_entry" [sender]: Broken pipe (32)
    get_local_name count=0 notebook.tmp
    generator starting pid=8361 count=0
    delta-transmission enabled
    generate_files phase=1
    recv_files(0) starting
    Invalid file index: 1701734919 (count=0) [receiver]
    _exit_cleanup(code=2, file=sender.c, line=163): entered
    rsync error: protocol incompatibility (code 2) at sender.c(163)
    _exit_cleanup(code=2, file=sender.c, line=163): about to call exit(2)
    rsync: connection unexpectedly closed (9 bytes received so far) [generator]
    _exit_cleanup(code=12, file=io.c, line=434): entered
    rsync error: error in rsync protocol data stream (code 12) at io.c(434)
    _exit_cleanup(code=12, file=io.c, line=434): about to call exit(12)
    rsync: connection unexpectedly closed (8 bytes received so far) [sender]
    _exit_cleanup(code=12, file=io.c, line=434): entered
    rsync error: error in rsync protocol data stream (code 12) at io.c(434)
    _exit_cleanup(code=12, file=io.c, line=434): about to call exit(12)

If I remove the --delete-excluded part from the *client* invocation, everything appears to work just fine (except that the excluded files aren't deleted).

It doesn't always die in the same place (same filename), but it always dies, and does so fairly quickly. I can supply portions of straces, etc., if anybody is interested.

This seems like a pretty serious bug.

-- 
Jon Nelson <jnelson-rsync@jamponi.net>

On Fri, Dec 09, 2005 at 08:06:26PM -0600, Jon Nelson wrote:
> rsync -avvvvv --delete-excluded --exclude-from=excludes \
>     -e 'ssh' remote_machine:

I see no destination directory there, so is that command just listing the remote files? The verbose output you mention later seems to be trying to transfer a file, so I think you left something out. I'm going to assume that you left out a local source directory because that's the only scenario that makes sense to me.

Would you please double-check rsync's version on both machines? I can't get rsync to fail, so I'm left to assume that the server side really has a version that is somewhere in between 2.6.3 and 2.6.4 (in that it could say that it supports protocol 29 when it doesn't really). If so, either upgrade the server version or re-compile its current in-between-releases source with the rsync.h define "PROTOCOL_VERSION" set to "28", not "29".

..wayne..

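The version check requested above could be sketched roughly as follows. This is an illustration only and not part of the original exchange: `parse_rsync_version` is a hypothetical helper name, and the sketch assumes the first line of `rsync --version` has the form `rsync  version X.Y.Z  protocol version N`.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `rsync --version` output
# on stdin and print "<version> <protocol>" taken from its first line.
# Assumes that line looks like: "rsync  version 2.6.6  protocol version 29".
parse_rsync_version() {
    sed -n '1s/^rsync  *version  *\([^ ]*\)  *protocol version  *\([0-9][0-9]*\).*/\1 \2/p'
}

# Usage (commented out so the sketch runs without rsync or a network):
#   rsync --version | parse_rsync_version                      # local side
#   ssh remote_machine rsync --version | parse_rsync_version   # remote side
# Both commands should report the same release and protocol number.
```

Comparing the two one-line results side by side makes a mismatch between the advertised release and the protocol number easy to spot.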
On Sat, 10 Dec 2005, Wayne Davison wrote:
> On Fri, Dec 09, 2005 at 08:06:26PM -0600, Jon Nelson wrote:
> > rsync -avvvvv --delete-excluded --exclude-from=excludes \
> >     -e 'ssh' remote_machine:
>
> I see no destination directory there, so is that command just listing
> the remote files? The verbose output you mention later seems to be
> trying to transfer a file, so I think you left something out. I'm going
> to assume that you left out a local source directory because that's the
> only scenario that makes sense to me.

You are right! I did mis-transcribe. Mea culpa. However, it's really there (just not in my email): the source directory is /home/jnelson.

The important thing is that with the 2.6.3 binary everything works fine, and with the 2.6.6 binary it breaks, with the only thing changing between invocations being the binary. I can supply you with straces or portions of straces.

> Would you please double-check rsync's version on both machines? I
> can't get rsync to fail, so I'm left to assume that the server side
> really has a version that is somewhere in between 2.6.3 and 2.6.4 (in
> that it could say that it supports protocol 29 when it doesn't
> really). If so, either upgrade the server version or re-compile its
> current in-between-releases source with the rsync.h define
> "PROTOCOL_VERSION" set to "28", not "29".

I've confirmed that the source is in fact 2.6.6. I definitely see both rsyncs negotiate protocol version 29.

-- 
Jon Nelson <jnelson-rsync@jamponi.net>

On Sat, Dec 10, 2005 at 10:05:59AM -0600, Jon Nelson wrote:
> The important thing is with the 2.6.3 binary everything works fine

That's to be expected because 2.6.3 only implements protocol 28. The problem that I believe is occurring is that the destination host is not supporting the final version of protocol 29 -- if it claims that it is implementing protocol 29 but it was based on a CVS snapshot that does not implement all of the final protocol, it would fail in the way you described (due to the official protocol 29 transmitting an empty exclude list, and the server not expecting this info, causing a protocol failure).

> I've confirmed that the source is in fact 2.6.6.

Yes, but what is the destination rsync version? That is the important info that I asked about in my prior email. I would also like to know if it is a version that was distributed by an OS vendor (e.g. if it was packaged in a SUSE Linux distribution) or if it is a custom installation that an individual installed and just forgot to upgrade to a non-CVS version.

..wayne..

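The reasoning above rests on how the two ends settle on a protocol number: each side advertises the highest protocol it implements, and both then speak the lower of the two. A minimal sketch of that rule, using a hypothetical function name and only the numbers mentioned in the thread (2.6.3 implements 28, 2.6.6 implements 29):

```shell
#!/bin/sh
# Hypothetical illustration (not rsync source code): the negotiated
# protocol is the lower of the two advertised protocol versions.
negotiated_protocol() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

negotiated_protocol 29 28   # 2.6.6 talking to 2.6.3
negotiated_protocol 29 29   # 2.6.6 on both ends
```

With 2.6.3 on either end the session drops to protocol 28, so any exchange specific to protocol 29 never takes place. That is why a successful run against 2.6.3 says nothing about whether the other end implements the final protocol 29.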
On Sat, 10 Dec 2005, Wayne Davison wrote:
> On Sat, Dec 10, 2005 at 10:05:59AM -0600, Jon Nelson wrote:
> > The important thing is with the 2.6.3 binary everything works fine
>
> That's to be expected because 2.6.3 only implements protocol 28. The
> problem that I believe is occurring is that the destination host is not
> supporting the final version of protocol 29 -- if it claims that it is
> implementing protocol 29 but it was a based on a CVS snapshot that does
> not implement all of the final protocol, it would fail in the way you
> described (due to the official protocol 29 transmitting an empty exclude
> list, and the server not expecting this info, causing a protocol failure).
>
> > I've confirmed that the source is in fact 2.6.6.
>
> Yes, but what is the destination rsync version? That is the important
> info that I asked about in my prior email. I would also like to know if
> it is a version that was distributed by an OS vendor (e.g. if it was
> packaged in a SUSE Linux distribution) or if it is a custom installation
> that an individual installed and just forgot to upgrade to a non-CVS
> version.

The binaries on both machines are identical. I only need to change the destination binary to 2.6.3 for things to work - the source binary can stay at 2.6.6 without issue.

I downloaded the source for 2.6.6 from the rsync website and diffed it against the patched source from the SUSE srpm - the only differences are acl and srp things and I am quite sure they aren't the problem. If it makes you feel better I'll rebuild the rpm without the patches (I'm quite familiar with the process) and try that, although I'm positive that they aren't a problem.

-- 
Jon Nelson <jnelson-rsync@jamponi.net>

On Sat, 10 Dec 2005, Jon Nelson wrote:
> The binaries on both machines are identical. I only need to change the
> destination binary to 2.6.3 for things to work - the source binary can
> stay at 2.6.6 without issue. I downloaded the source for 2.6.6 from the
> rsync website and diffed it against the patched source from the SUSE
> srpm - the only differences are acl and srp things and I am quite sure
> they aren't the problem. If it makes you feel better I'll rebuild the
> rpm without the patches (I'm quite familiar with the process) and try
> that although I'm positive that they aren't a problem.

More wrinkles. The only time I can make it happen is when using ssh forced commands.

With the destination machine using rsync 2.6.3, everything works (the source machine remains at 2.6.6 throughout all tests).

If I change nothing except to switch the destination machine to use rsync 2.6.6, I get weird errors. I don't understand why this works perfectly with 2.6.3 and fails utterly with 2.6.6. What information can I provide to help?

-- 
Jon Nelson <jnelson-rsync@jamponi.net>

On Sat, Dec 17, 2005 at 12:44:29AM -0600, Jon Nelson wrote:
> If I change nothing except to switch the destination machine to
> use rsync 2.6.6, I get weird errors.

I'd suggest the following. I'd be interested in seeing the data that is being sent from the client to the server. First, work up a very simple test case (using a small number of files that don't require any updates, if possible) so that this data is small and innocuous. Then, we'll use the "savetransfer" program from the support dir to grab the data on both ends of the connection. You can build it (if you have the rsync source present and configured) by just "cd"ing into the support dir and typing "make". Then, copy it somewhere on both machines and run it like this:

    rsync -av --rsh="/path/savetransfer -i /tmp/to.server ssh" \
        --rsync-path="/path/savetransfer -i /tmp/from.client rsync" SOURCE DEST

That will put a to.server file on the client, and a from.client file on the server. The data in the files should be identical, or the transport is at fault (e.g. ssh). If it is identical, please gzip it and send it to me off-list.

While you've got the source there, you might also try using a stock 2.6.6 on both machines (you can run it without installing it using the --rsync-path option), just to be sure that this is a bug in the stock rsync.

Thanks for your help.

..wayne..
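
The comparison of the two capture files described above could be done along these lines. This is a sketch under assumptions, not something from the thread: `compare_captures` is a hypothetical helper, and it presumes the server's from.client file has already been copied next to the client's to.server file.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report whether two capture
# files are byte-for-byte identical.
#   $1 = to.server   (written on the client)
#   $2 = from.client (written on the server, copied over for comparison)
compare_captures() {
    # Refuse to guess if either file is missing or unreadable.
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "missing capture file" >&2
        return 2
    fi
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "differ"
        return 1
    fi
}

# Usage, assuming both files are now on the same machine:
#   compare_captures /tmp/to.server /tmp/from.client && gzip -9 /tmp/to.server
```

An "identical" result means the transport delivered the client's bytes unchanged, so the capture is worth compressing and sending; a "differ" result points at the transport (e.g. ssh) instead.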