bugzilla-daemon@dp3.samba.org
2005-Nov-18 13:15 UTC
[Bug 3271] New: Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 Summary: Rsync instances stay in memory when using in daemon mode Product: rsync Version: 2.6.7 Platform: x86 OS/Version: Linux Status: NEW Severity: normal Priority: P3 Component: core AssignedTo: wayned@samba.org ReportedBy: akalach@itos.eu.org QAContact: rsync-qa@samba.org Hello, all We are using rsync as file transfer tool to make run-time synchronization of two remote servers. So rsync is started each time a file is modified to make changes be propagated to the remote peer. Receiving side. The rsync is working in daemon mode (rsync --daemon) Sending site. There is a frequently modified file, which is sended by rsync to receiving side after each modification. Before sending the file our system performs check for existing rsync process, which are already sending the same file. If it is so, this rsync process is terminated by sending to it TERM signal, and next starting new rsync process. Problem. Sometimes on the receiving side appear child rsync processes which permanently hang in memory and never finish. I observed that these processes write to log file string "rsync: read error: Connection reset by peer" and then hang. After I removed logging from rsync (see log.c file) this problem is not appeared anymore. This problem is encountered only when rsync is in the daemon mode. When we are using rsync in the standalone mode all works fine. I assume that the cause of the problem is the usage of string formatting functions in the signal handlers in the rsync source code. Many of such formattings functions are using static buffers/variables and when invoked in signal handler may caused to inpredicable results. All testing was done on the SuSe 9.0 Linux x86 with rsync 2.6.6 release and with the latest CVS snapshot. :/var/log/trace/rsync> ps ax|grep rsync 18998 pts/8 S 0:36 strace -f rsync --daemon --no-detach 18999 pts/8 S 0:00 rsync --daemon --no-detach 19450 pts/8 S 0:00 rsync --daemon --no-detach 19945 pts/8 S 0:00 rsync --daemon --no-detach 20049 pts/8 S 0:00 rsync --daemon --no-detach 20077 pts/8 S 0:00 rsync --daemon --no-detach -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-18 13:21 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #1 from akalach@itos.eu.org 2005-11-18 06:21 MST ------- Created an attachment (id=1580) --> (https://bugzilla.samba.org/attachment.cgi?id=1580&action=view) log file of strace, log file of rsync and log.c file -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-24 05:09 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #2 from euclid80@users.sf.net 2005-11-23 22:07 MST ------- I have found that adding "#define SHUTDOWN_ALL_SOCKETS" to config.h (at least on the client side) solves the problem. Without it, the connection is closed with an error (connection reset), rather than with an end-of-file. The server's receiving code handles the error, and it goes into _exit_cleanup and gets all the way to the exit() call, but exit() hangs because it tries to flush a socket that is in an error state. If the client properly closes its sockets, the problem does not occur. Of course, this is only a workaround. It does not handle the case where the connection is legitimately reset because of network errors. It just keeps rsync from foiling itself. A more complete solution in the receiving code is required. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-25 02:31 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #3 from euclid80@users.sf.net 2005-11-24 19:30 MST ------- OK, further research has shown that I was not quite right about the cause of the problem. In my tests, the network i/o process does die, and it's the "generator" process that hangs. I traced it all the way into the exit_cleanup routine, and it hangs when it calls log_exit(). -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-26 17:07 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #4 from euclid80@users.sf.net 2005-11-26 10:05 MST ------- (In reply to comment #3)> In my tests, the network i/o process does die, and it's the > "generator" process that hangs. I traced it all the way into the exit_cleanup > routine, and it hangs when it calls log_exit().The generator process tries to write its error messages over the network socket, which has gone into an error state. However, it does not detect that because select() keeps saying that the descriptor is not available for writing, so it enters an infinite loop. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-26 17:09 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #5 from euclid80@users.sf.net 2005-11-26 10:08 MST ------- Created an attachment (id=1594) --> (https://bugzilla.samba.org/attachment.cgi?id=1594&action=view) patch to prevent lingering processes This causes the writefd_unbuffered routine to check if the socket has died. However, it seems to break syslog when chrooted, so let's be careful. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-26 19:23 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 euclid80@users.sf.net changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #1594 is|0 |1 obsolete| | ------- Comment #6 from euclid80@users.sf.net 2005-11-26 12:23 MST ------- Created an attachment (id=1595) --> (https://bugzilla.samba.org/attachment.cgi?id=1595&action=view) different way of handling a broken socket Perhaps this is better. The generator process doesn't use the timeout given in the config file anyway. It is designed to block! The other patch may give up on the socket prematurely. This one tries to write, though it may block for a long time. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Dec-02 02:25 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 euclid80@users.sf.net changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #1595 is|0 |1 obsolete| | ------- Comment #7 from euclid80@users.sf.net 2005-12-01 19:24 MST ------- Created an attachment (id=1604) --> (https://bugzilla.samba.org/attachment.cgi?id=1604&action=view) even better way of handling a dead socket This patch doesn't break the contract of writefd_unbuffered because it risks a blocking write only when no timeout has been specified and there is no io_error_fd to read. It avoids the messy attempts to detect a dead socket. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Dec-03 01:06 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 wayned@samba.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED ------- Comment #8 from wayned@samba.org 2005-12-02 18:05 MST ------- Thanks for the suggested patches. I agree that those continue statements after select() returns -1 look like they should be improved, but I'm not sure why you allowed the code in the read_timeout() function to fall through after an error: the bit-field values will be left undefined by the error, so I think we need more code to figure out what to do. Perhaps we should change the code to only continue if count is 0 or if errno is EINTR, EWOULDBLOCK, or EAGAIN (and exit with an error on all other errno values). In the writefd_unbuffered() function, what errno is being returned when your new code is being triggered? -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Dec-03 04:24 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #9 from euclid80@users.sf.net 2005-12-02 21:23 MST ------- (In reply to comment #8)> Thanks for the suggested patches. I agree that those continue statements after > select() returns -1 look like they should be improved, but I'm not sure why you > allowed the code in the read_timeout() function to fall through after an error:I didn't change read_timeout().> In the writefd_unbuffered() function, what errno is being returned when your > new code is being triggered?Good point. When select() returns an error, we ought not to write. So, my code could be put right after check_timeout(). The bug is caused by the fact that select() doesn't return an error. It keeps returning 0. In a perfect world, select() would detect that the socket is in an error state, but it doesn't. It sees a full transmit buffer and says that the socket is not ready. How about this: if (count == 0) { check_timeout(); --- my patch --- } if (count <= 0) { --- leave this as is --- } -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Dec-03 05:36 UTC
[Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 euclid80@users.sf.net changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #1604 is|0 |1 obsolete| | ------- Comment #10 from euclid80@users.sf.net 2005-12-02 22:35 MST ------- Created an attachment (id=1605) --> (https://bugzilla.samba.org/attachment.cgi?id=1605&action=view) avoids select() call when possible Forget what I said in my last post. It doesn't fix the bug!! Perhaps this patch is clearer? It avoids the select() call entirely when it is safe to block. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2006-Feb-01 19:53 UTC
DO NOT REPLY [Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 ------- Comment #11 from wayned@samba.org 2006-02-01 12:53 MST ------- Sorry for forgetting to get back to this bug for so long. I've finally analyzed the strace output, and it shows that the generator received an error about the socket closing from the receiver (who was obviously trying to read it), but when the generator tries to select on that socket to write out the error message, the select never returns that the dead socket is writeable. I have chosen to deal with this situation by adding a new sibling message: when the receiver is reporting a socket-read error, the generator will note this fact and avoid writing an error down the socket, allowing it to close down gracefully when the pipe from the receiver closes. I have checked this fix into CVS. However, one can imagine a situation where the generator got a read-error on the socket and then died before being able to tell this to the generator. To handle this case, I am looking into a modification of the idea that James was working on in his patches. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2006-Oct-15 18:33 UTC
DO NOT REPLY [Bug 3271] Rsync instances stay in memory when using in daemon mode
https://bugzilla.samba.org/show_bug.cgi?id=3271 wayned@samba.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED Version|2.6.7 |2.6.8 ------- Comment #12 from wayned@samba.org 2006-10-15 13:33 MST ------- The latest code now has the select() return exceptions, which should hopefully stop this from happening. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
Reasonably Related Threads
- [Bug 2654] timeout is always triggered with 2.6.4
- Rsync: Segmentation fault
- DO NOT REPLY [Bug 5701] New: deadlock on local rsyncing, bisected to commit f303b749f2843433c9acd8218a4b9096d0d1bb8d
- DO NOT REPLY [Bug 4139] New: Generator process (running as daemon) hangs after kill SIGINT.
- DO NOT REPLY [Bug 6175] New: write last transfer status when timeout or other error happens