I've noticed this problem for a while and thought my own changes to rsync were causing it. I've recently done some tests with vanilla versions of rsync, though, and found the same problem. Here are the details.

The generator process on an rsync server seems to get stuck in an infinite loop after a client process dies. (The receiver process dies just fine.) I can produce this behavior by starting rsync on the client sending files to the server, waiting for the file generation process to end and the file sending to start, then killing the rsync process on the client end.

I see this bug on rsync 2.6.0-2.6.6. Version 2.5.7 doesn't exhibit the same behavior.

I'm running rsync over stunnel, though I doubt that has any effect on the bug, since rsync is running in daemon mode on the server. (Stunnel thus passes packets on to rsync via loopback.) The module I'm testing under isn't running as root, and the timeout on the server is set to 120 seconds.

Strace output for the generator process looks like this:

    select(5, NULL, [4], NULL, {33, 310000}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
    select(5, NULL, [4], NULL, {60, 0}) = 0 (Timeout)
On Fri, Sep 09, 2005 at 08:53:48PM -0500, Steve Sether wrote:
> The generator process on an rsync server seems to get stuck in
> an infinite loop after a client process dies.

That's very strange. The generator is trying to write data down the socket, and if the other end of the socket connection goes away, the select() call should report that the fd is now "ready" so that rsync will try to write to it and get an error. It's very weird that this does not happen.

Older rsync versions probably avoid this scenario by having an extra descriptor in the select() call (to check on the pipe from the receiver to the generator), but newer rsync versions leave that descriptor out when the generator is trying to finish writing a packet of data, which avoids the potential for protocol corruption.

So, it seems to me that stunnel is the root cause of the hang, since it is not letting the process that is attempting to write to the socket know that it is closed.

..wayne..
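For reference, the select() behavior described above can be checked on a plain local socket with a small script. This is only a sketch under stated assumptions: it uses a Python socketpair as a stand-in for the daemon's socket, with no stunnel or OpenVPN in between, and the function name probe_closed_peer and its result keys are made up for the illustration. It fills the send buffer so the writer has to wait (as the stuck generator does), closes the peer, and then checks whether select() reports the descriptor writable and whether the next write fails.

```python
import select
import socket


def probe_closed_peer(timeout=2.0):
    """Fill a socket's send buffer, close the peer, and report what select() says.

    Hypothetical helper for illustration; not part of rsync.
    """
    ours, peer = socket.socketpair()
    try:
        ours.setblocking(False)

        # Fill the send buffer so that a writer would have to wait,
        # which is the state the stuck generator is in.
        try:
            while True:
                ours.send(b"x" * 65536)
        except BlockingIOError:
            pass

        # The peer is alive but not reading: check writability with a short wait.
        _, writable, _ = select.select([], [ours], [], 0.2)
        writable_while_full = bool(writable)

        # Simulate the client process dying.
        peer.close()

        # With the peer gone, select() is expected to wake the writer up.
        _, writable, _ = select.select([], [ours], [], timeout)
        writable_after_close = bool(writable)

        # The write itself should now fail instead of blocking.
        error = None
        try:
            ours.send(b"x")
        except OSError as exc:
            error = type(exc).__name__

        return {
            "writable_while_full": writable_while_full,
            "writable_after_close": writable_after_close,
            "error": error,
        }
    finally:
        ours.close()
        peer.close()


if __name__ == "__main__":
    print(probe_closed_peer())
```

If the same kind of probe were pointed at a connection that goes through a tunnel and select() kept timing out after the far end was killed, that would be consistent with the tunnel not propagating the close to the local socket.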
Hi,

(Sorry for starting a new thread; I found it in the web archives, and I don't have the list history because I've just subscribed.)

I just wanted to say I'm having exactly the same problem, through OpenVPN, when the rsync client is stopped with SIGINT (and possibly by other means, such as a network outage?). The rsync daemon process endlessly loops on select() timeouts but never exits (the timeout is set to 300 seconds, i.e. 5 minutes).

Since the rsync client is launched frequently from cron (files change frequently and I need recent backups), I've configured max connections on the daemon to 1 in order to avoid overlapping rsync runs, but the timeout problem breaks this. Moreover, I couldn't find an easy way to detect and terminate the faulty daemon process (SIGKILL doesn't stop it), so I'm stuck :(

Is there a way to resolve this in rsync, even though this doesn't seem to be an rsync-related problem?

Thanks
--
vv
> The problem disappeared since I've set up a second module on the rsync daemon.
> Maybe it finally is an rsync-related problem?

Strange. I have about 10 modules configured and I still get this problem. Any advice on finding out what's going on, Wayne?