On Mon, Sep 23, 2002 at 01:06:22PM -0700, Sudheer Tumuluru wrote:
>
> I am having the same problem with rsync 2.5.5-1. I am trying to rsync
> a couple of short text files between a Linux server and Win2k
> Professional boxes with cygwin. About 20% of the time, rsync freezes
> at the end of the transfer, and I can't kill the rsync process in
> cygwin even if I send it signal 9 (SIGKILL). This happens mostly on
> dual-processor machines, but it did happen once on the
> single-processor machine as well.

Me too. I spent this afternoon debugging. There doesn't appear to be
anything wrong with rsync; it looks rather more like something in
cygwin signal delivery is ill.

In my case I'm trying to pull files onto Windows (XP) from Unix (Solaris).

rsync forks in this case; the parent process generates the file list
while a child process does the receiving. (Something like that, at
least; I guess it's for deadlock avoidance.)

At the end, the parent process waves farewell to the remote server,
and then does a kill(..., SIGUSR2) on the child pid to tell it to exit.

This signal seems to get lost, as suggested above, some moderate
percentage of the time.

The child process is supposedly waiting for this signal inside
msleep(), which calls select() to wait in 20ms bursts. In the
cases where the child managed to reach the select() in time to
start waiting, I didn't observe any hangs. But consistently, if
the kill arrived before that point, the child process simply
locked up.

This suggests the hack workaround of adding a call to, say,
msleep(30) just before the line kill(pid, SIGUSR2) in
main.c:do_recv().

With that kludge in, I haven't seen any hangs in a few hundred
trials. YMMV, but it might be a helpful bandaid until some
cygwin expert has the chance to fix things properly.

Rgds

Anthony
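A minimal sketch of the shape of that workaround, assuming a plain POSIX system and not taken from rsync's source: a child polls for SIGUSR2 in 20 ms select() bursts, and the parent pauses briefly before sending the signal. The names on_usr2, sleep_ms and signal_child_after_delay are hypothetical, chosen only for this illustration. On a conforming POSIX system the delay should not be needed, since the child rechecks its flag after every burst; the sketch only shows where the delay sits relative to kill().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler once SIGUSR2 has arrived. */
static volatile sig_atomic_t got_usr2 = 0;

static void on_usr2(int signo)
{
    (void)signo;
    got_usr2 = 1;
}

/* Sleep for roughly ms milliseconds using select() with no descriptors.
 * The call may return early if a signal interrupts it. */
void sleep_ms(int ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    select(0, NULL, NULL, NULL, &tv);
}

/*
 * Hypothetical demo, not rsync code: fork a child that waits for SIGUSR2
 * in 20 ms bursts, pause in the parent for delay_ms, then send the signal.
 * Returns the child's exit status, or -1 on error.
 */
int signal_child_after_delay(int delay_ms)
{
    struct sigaction sa;
    pid_t pid;
    int status;

    /* Install the handler before fork() so the child inherits it. */
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR2, &sa, NULL) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: poll the flag between short sleeps, then exit. */
        while (!got_usr2)
            sleep_ms(20);
        _exit(0);
    }

    /* Parent: the delay before kill() is the workaround from the post. */
    sleep_ms(delay_ms);
    if (kill(pid, SIGUSR2) < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling signal_child_after_delay(30) from a test program mirrors the msleep(30) suggestion; a return value of 0 means the child saw the signal and exited normally.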
On Tue, 24 Sep 2002, Anthony Heading wrote:
> [...]
> This suggests the hack workaround of adding a call to, say,
> msleep(30) just before the line kill(pid, SIGUSR2) in
> main.c:do_recv().
>
> With that kludge in, I haven't seen any hangs in a few hundred
> trials. YMMV, but it might be a helpful bandaid until some
> cygwin expert has the chance to fix things properly.
>
> Rgds
>
> Anthony

Hmm, how about a patch?

Igor
--
Igor Pechtchanski
http://cs.nyu.edu/~pechtcha/
pechtcha@cs.nyu.edu, igor@watson.ibm.com

"Water molecules expand as they grow warmer" (C) Popular Science, Oct'02, p.51
On Tue, Sep 24, 2002 at 07:29:40PM +0900, Anthony Heading wrote:
> [...]
> The child process is supposedly waiting for this signal inside
> msleep(), which calls select() to wait in 20ms bursts. In the
> cases where the child managed to reach the select() in time to
> start waiting, I didn't observe any hangs. But consistently, if
> the kill arrived before that point, the child process simply
> locked up.
>
> This suggests the hack workaround of adding a call to, say,
> msleep(30) just before the line kill(pid, SIGUSR2) in
> main.c:do_recv().
>
> With that kludge in, I haven't seen any hangs in a few hundred
> trials. YMMV, but it might be a helpful bandaid until some
> cygwin expert has the chance to fix things properly.

I looked at this briefly last night and reached the conclusion that
Windows or cygwin is indeed somehow dropping the signal.

Then today, as a result of a question asked by someone else, I
happened to read the select_tut(2) manpage and noticed this little gem.

----------------------------------------------------------------
10. I have heard that the Windows socket layer does not cope
    with OOB data properly. It also does not cope with select
    calls when no file descriptors are set at all. Having no
    file descriptors set is a useful way to sleep the process
    with sub-second precision by using the timeout. (See
    further on.)

USLEEP EMULATION
    On systems that do not have a usleep function, you can call
    select with a finite timeout and no file descriptors as
    follows:

        struct timeval tv;
        tv.tv_sec = 0;
        tv.tv_usec = 200000;  /* 0.2 seconds */
        select (0, NULL, NULL, NULL, &tv);

    This is only guaranteed to work on Unix systems, however.
----------------------------------------------------------------

Hmm, that is exactly what rsync is doing here.

I don't use cygwin, so I'm strictly a spectator on this issue, but it
seems clear that something is amiss with cygwin and the cygwin
developers should be brought in on it.

Until cygwin is fixed, you might want to create a patch that, once
given further testing, could be included in the patches directory and
referenced in the rsync FAQ.

--
________________________________________________________________
J.W. Schultz            Pegasystems Technologies
email address:          jw@pegasys.ws

Remember Cernan and Schmitt
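As a rough illustration of the select()-as-sleep idiom quoted from the manpage, here is a sketch for POSIX systems that also restarts the wait when a signal interrupts it. The names elapsed_ms and msleep_select are hypothetical and this is not rsync's msleep(); it says nothing about how Cygwin or Winsock treat a select() call with no descriptors.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Whole milliseconds elapsed between two gettimeofday() samples. */
long elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    long long usec = (long long)(end->tv_sec - start->tv_sec) * 1000000LL
                   + (long long)(end->tv_usec - start->tv_usec);
    return (long)(usec / 1000);
}

/*
 * Sleep for at least ms milliseconds using select() with no descriptors.
 * If a signal interrupts the call (EINTR), wait again for the remainder.
 * Returns 0 on success, -1 on any other error.
 * Caveat: gettimeofday() follows the wall clock, so a clock adjustment
 * during the sleep can lengthen or shorten the wait.
 */
int msleep_select(long ms)
{
    struct timeval start, now, tv;
    long remaining = ms;

    assert(ms >= 0);
    if (gettimeofday(&start, NULL) < 0)
        return -1;

    while (remaining > 0) {
        /* Rebuild the timeout each time; select() may modify it. */
        tv.tv_sec = remaining / 1000;
        tv.tv_usec = (remaining % 1000) * 1000;
        if (select(0, NULL, NULL, NULL, &tv) < 0 && errno != EINTR)
            return -1;

        if (gettimeofday(&now, NULL) < 0)
            return -1;
        remaining = ms - elapsed_ms(&start, &now);
    }
    return 0;
}
```

A call such as msleep_select(200) corresponds to the 0.2 second example in the manpage excerpt. Recomputing the remaining time after EINTR is a design choice for this sketch: it keeps a signal handler from cutting the sleep short, which is the opposite of what a caller polling for a signal in short bursts would want.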
On Tue, 24 Sep 2002, Anthony Heading wrote:
> On Mon, Sep 23, 2002 at 01:06:22PM -0700, Sudheer Tumuluru wrote:
> >
> > I am having the same problem with rsync 2.5.5-1. I am trying to rsync
> > a couple of short text files between a Linux server and Win2k
> > Professional boxes with cygwin. About 20% of the time, rsync freezes
> > at the end of the transfer, and I can't kill the rsync process in
> > cygwin even if I send it signal 9 (SIGKILL). This happens mostly on
> > dual-processor machines, but it did happen once on the
> > single-processor machine as well.
>
> Me too. I spent this afternoon debugging. There doesn't appear to be
> anything wrong with rsync; it looks rather more like something in
> cygwin signal delivery is ill.
> [...]

I use rsync every day on Tru64 Unix and Linux systems. I have
frequently seen this problem there, too.

-steve