I had a friend run some Cygwin tests and we found that --modify-window=1 works just as well as --modify-window=2 on FAT filesystems to copy files from Unix and detect the difference in granularity. FAT filesystems always have timestamps that have an even number of seconds. On the other hand, NTFS filesystems can store the modification time down to the second, whereas previously people on this mailing list thought it was just like FAT for modification time. Nevertheless, I decided to change the default value for --modify-window to 1 on Cygwin rather than 0, under the thinking that even on NTFS filesystems it wouldn't do any harm because it's highly unlikely that any individual file will be modified again within one second after it is copied by rsync.

While doing the tests we too experienced hangs at the end of copies. We were going over openssh from a Solaris 9 box to Windows 2000 Cygwin. We tried the test from http://lists.samba.org/pipermail/rsync/2002-August/008130.html but it still experienced hangs. It wasn't clear if the patch reduced the frequency or not. Has *anybody* been able to figure out a fix for this that really works?

It sure would be nice to get something into 2.5.6 but we're about out of time for that because I need to put it out this weekend before I start a new (temporary) job on Monday. I imagine that the issue I was experiencing could be a separate one and this patch really would help other hangs people are having; can anybody give me an argument for putting in the calls to "shutdown()" anyway? I would make it only happen on Cygwin, because it is unknown if they will cause harm on other platforms.

- Dave Dykstra
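To make the granularity argument concrete, here is a minimal Python sketch (not rsync's C code). It assumes a FAT timestamp lands on an even second, either rounded down or rounded up, and that two times count as equal when they differ by no more than the window. The names `fat_round_down`, `fat_round_up` and `within_window` are hypothetical and exist only for this illustration.

```python
def fat_round_down(mtime: int) -> int:
    """Assumed FAT behaviour: nearest even second at or below mtime."""
    return mtime - (mtime % 2)


def fat_round_up(mtime: int) -> int:
    """Alternative assumption: nearest even second at or above mtime."""
    return mtime + (mtime % 2)


def within_window(a: int, b: int, window: int) -> bool:
    """Treat two modification times as equal if they differ by <= window seconds."""
    return abs(a - b) <= window


if __name__ == "__main__":
    # Compare a run of consecutive Unix timestamps against their FAT-stored value.
    for unix_mtime in range(1043450280, 1043450286):
        for store in (fat_round_down, fat_round_up):
            fat_mtime = store(unix_mtime)
            print(store.__name__, unix_mtime, fat_mtime,
                  "window=0:", within_window(unix_mtime, fat_mtime, 0),
                  "window=1:", within_window(unix_mtime, fat_mtime, 1))
```

Under either rounding assumption the stored time is never more than one second away from the original, which is consistent with a window of 1 behaving the same as a window of 2 in the tests described above.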
Dave Dykstra wrote:
> While doing the tests we too experienced hangs at the end of copies.
> We were going over openssh from a Solaris 9 box to Windows 2000
> Cygwin.
> We tried the test from
> http://lists.samba.org/pipermail/rsync/2002-August/008130.html
> but it still experienced hangs. It wasn't clear if the patch reduced
> the frequency or not.

Data point: I regularly rsync from sources.redhat.com (Linux, I presume. Not Cygwin, anyway) to my local machine (Cygwin). I have never experienced a hang.

> Has *anybody* been able to figure out a fix for this that really
> works?
> It sure would be nice to get something into 2.5.6 but we're about out
> of time for that because I need to put it out this weekend before I
> start a new (temporary) job on Monday. I imagine that the issue I
> was experiencing could be a separate one and this patch really would
> help other hangs people are having; can anybody give me an argument
> for putting in the calls to "shutdown()" anyway? I would make it
> only happen on Cygwin, because it is unknown if they will cause harm
> on other platforms.

Those platforms would have to be fairly broken. The behaviour of shutdown is pretty clearly defined.

Max.
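For readers unfamiliar with the call under discussion, the following Python sketch shows the generally documented effect of shutting down the write side of a stream socket: data already sent is still delivered, and the peer then reads end-of-file. It does not reproduce the patch from the linked message and says nothing about Cygwin's behaviour; the function name `half_close_demo` is made up for the example.

```python
import socket


def half_close_demo() -> bytes:
    """Send data, half-close the sender, and return everything the peer reads."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"last block")
        # Announce "no more data from this side" without closing the descriptor.
        sender.shutdown(socket.SHUT_WR)

        chunks = []
        while True:
            chunk = receiver.recv(1024)
            if not chunk:
                # An empty read means end-of-file: the loop ends only
                # because the sender shut down its write side.
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(half_close_demo())
```

The reading loop terminates only because of the `shutdown()` call; without it the receiver would block waiting for more data while the sender's descriptor stayed open.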
On Fri, Jan 24, 2003 at 05:18:07PM -0600, you [Dave Dykstra] wrote:
>
> While doing the tests we too experienced hangs at the end of copies.
> We were going over openssh from a Solaris 9 box to Windows 2000 Cygwin.
> We tried the test from
> http://lists.samba.org/pipermail/rsync/2002-August/008130.html
> but it still experienced hangs. It wasn't clear if the patch reduced
> the frequency or not.

I've been using rsync cygwin->linux, 100Mbit ethernet for years and never seen this.

When I add encryption (zebedee, ssh) to the equation, I get hangs. The buffering patch that was sent to the list a couple of weeks back seems to help, but I think it just makes it harder to trigger. (The buffering patch helps a great deal with performance.)

As for the right solution, I'm afraid I have no ideas.

-- v --

v@iki.fi
> Has *anybody* been able to figure out a fix for this that really works?

Why does the receiving child wait in a loop to get killed, rather than just exit()? I presume cygwin has some problem or race condition in the wait loop, kill and wait_process().

The pipe to the parent will read 0 bytes (EOF) on the parent side after the child exits. Although I haven't tried it, I would guess this should be the reliable solution on all platforms.

But there must be some good reason the wait loop, kill and wait_process() contortions appeared in the code (maybe some race condition with the remote side?)...

Craig
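The EOF behaviour mentioned above can be sketched in a few lines of Python on a POSIX system. This is only an illustration of the general pipe semantics, under the assumption that the child simply writes and exits; it does not model rsync's receiver and generator processes or anything Cygwin-specific, and the function name `read_until_child_exits` is hypothetical.

```python
import os


def read_until_child_exits():
    """Fork a child that writes to a pipe and exits; return (data, exit status)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send a final message, then exit instead of waiting to be killed.
        os.close(read_fd)
        os.write(write_fd, b"done")
        os._exit(0)

    # Parent: close its own copy of the write end, otherwise EOF never arrives.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:
            break  # 0 bytes read: every writer has closed, i.e. the child is gone
        chunks.append(chunk)
    os.close(read_fd)

    # Reap the child so it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(read_until_child_exits())
```

Note that the parent sees EOF only once all copies of the write end are closed, which is why it closes its own copy right after the fork.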
> http://lists.samba.org/pipermail/rsync/2002-August/008130.html
> but it still experienced hangs. It wasn't clear if the patch reduced
> the frequency or not.

It didn't fix it for us. We sync Win9x clients to a Win2k server running rsync as a service. Hangs and "connection reset by peer" errors happened almost daily. Adding -B 16384 seemed to help, but we still get the error.

I lack the knowledge to debug this problem by myself, but if somebody wants to run some tests or try a patch with our setup, I'll be very happy to help.

Regards,
Carlos Gutiérrez
carlosg@sca.com.mx
On Fri, Jan 24, 2003 at 05:18:07PM -0600, Dave Dykstra wrote:
> I had a friend run some Cygwin tests and we found that --modify-window=1
> works just as well as --modify-window=2 on FAT filesystems to copy files
> from Unix and detect the difference in granularity. FAT filesystems always
> have timestamps that have an even number of seconds. On the other hand,
> NTFS filesystems can store the modification time down to the second,
> whereas previously people on this mailing list thought it was just like
> FAT for modification time. Nevertheless, I decided to change the default
> value for --modify-window to 1 on Cygwin rather than 0, under the thinking
> that even on NTFS filesystems it wouldn't do any harm because it's highly
> unlikely that any individual file will be modified again within one second
> after it is copied by rsync.

This modify-window default of 1 has been causing some trouble on the rsync test suite on the Cygwin test machine on build.samba.org. The problem is that some files get created and immediately copied within one second, and then the rsync code that implements '-p' checks to see if the copied file's time is within one second before deciding whether or not to change it. The test machine is presumably using an NTFS filesystem so it has one second granularity.

Last night I considered 4 possible solutions:

1. Change the test suite to always wait 2 seconds before copying, at the beginning of the "checkit" function.
2. Change the test suite to always pass --modify-window=0 to rsync by including it in the $RSYNC variable in testsuite/rsync.fns.
3. Change the set_perms() function in rsync.c to check for exact time rather than calling cmp_modtime.
4. Back out the default of --modify-window=1 on Cygwin and go back to a default of 0.

I implemented solution 1, but I'm not very comfortable with it because it slows the tests on all platforms. I'm now leaning toward solution 3. Discussion?

- Dave
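The difference between the current behaviour and solution 3 can be sketched as follows. This is a Python illustration, not the C code in rsync.c: `cmp_modtime_sketch` only assumes, from the description above, that the comparison treats times within the window as equal, and the two `needs_time_update_*` helpers are hypothetical stand-ins for the decision made when setting a copied file's time.

```python
def cmp_modtime_sketch(a: int, b: int, modify_window: int) -> int:
    """Return 0 if the times are within modify_window seconds, else -1 or 1."""
    if abs(a - b) <= modify_window:
        return 0
    return -1 if a < b else 1


def needs_time_update_windowed(src_mtime: int, dst_mtime: int, window: int) -> bool:
    """Assumed current behaviour: skip the update when within the window."""
    return cmp_modtime_sketch(src_mtime, dst_mtime, window) != 0


def needs_time_update_exact(src_mtime: int, dst_mtime: int) -> bool:
    """Solution 3 as described: update unless the times match exactly."""
    return src_mtime != dst_mtime


if __name__ == "__main__":
    # A file created at t=1000 and copied right away, so the new copy
    # carries the time at which it was written, t=1001.
    src, dst = 1000, 1001
    print("window=1:", needs_time_update_windowed(src, dst, 1))
    print("window=0:", needs_time_update_windowed(src, dst, 0))
    print("exact:   ", needs_time_update_exact(src, dst))
```

With a window of 1 the copy is left one second off, which is the discrepancy the test suite trips over; an exact comparison in that one place would correct it while leaving the window in effect for deciding which files to transfer.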