samba-bugs@samba.org
2005-Feb-08 23:01 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 abartlet@samba.org changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |abartlet@samba.org Status|RESOLVED |REOPENED Resolution|WORKSFORME | ------- Additional Comments From abartlet@samba.org 2005-02-08 15:45 ------- I appear to be seeing exactly the same issue, particularly on hosts using 2.6.3 (to the 2.6.3 server) However, using the rsync-debug script, I find that the issue goes away! It also seem potentially timing or transport induced (just a gut feeling, that load and using/not using SSH deamon transport have made the problem come and go). This has happend more with more recent versions. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-10 18:23 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From ej.grace@imperial.ac.uk 2005-02-10 11:10 ------- I'm glad that it was not just a product of my imagination! For some reason I havn't been able to reliably reproduce the problem. I pretty much gave up. I am currently using Using rsync-2.6.3 on a FC3 machine which is connecting to rsync-2.6.3 on an FC2 based machine. It doesn't seem to display the problem any more, even with large file trees, but I am not convinced the problem has gone, rather that due to some mystic alignment of the stars it is not manifesting itself. Your experience seems to suggest the same things as my guesswork, that there may be some timing problem or strange interaction with SSH. If you can reliably reproduce the problem perhaps it's worth trying it with rsh as the transport mechanism. The other possibility is that there is some issue with the timeout option. I When I used to set the timeout option to (say) 10 minutes (far longer than the actual transfer took) it would occur. I have since set the timeout to 0 (indefinite) and have never seen this problem reoccur. These type of intermittent s/w problem always fox me! I will try and make it go wrong by setting a timeout of 10mins. If that breaks things again it may be informative to people "in the know". I suggest you try the same, see if setting the timeout option has any effect.. Good luck. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-10 19:40 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From abartlet@samba.org 2005-02-10 12:20 ------- I havn't specified the timeout option, and still get the issue. I also have double-checked, and reliably suceeed on all the 'straced' backups so far this week. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-11 09:02 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From ej.grace@imperial.ac.uk 2005-02-11 01:44 ------- (In reply to comment #5)> I havn't specified the timeout option, and still get the issue. > > I also have double-checked, and reliably suceeed on all the 'straced' backups so > far this week.Hmm. I've not managed to make it break. I will try "loading" the machines while doing the rsyncs and see if that has any effect. The previous incarnation was running on a machine with a slower processor and less RAM than the current version. Did you try rsh as the transport on the non straced version? -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-11 14:42 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From ej.grace@imperial.ac.uk 2005-02-11 07:38 -------> Hmm. I've not managed to make it break. I will try "loading" the machines > while doing the rsyncs and see if that has any effect.Well. It seems to have no effect (apart from slowing things down)... It's strange, I really can't make it break. I tried thrashing the disk and cpu of the client machine and ran the one initiating the rsync request on a lower powered slower machine with less RAM. I'm afraid I am going to just have to shrug and say "don't know".... I never could make it break consistently... The only thing left at the back of my head is the ssh transport, I know the behaviour of ssh when disconnected changed a while back. Now, when one connects to a server and some I/O could come back the connection remains open and you have to explicitly disconnect with the ~. escape sequence... man ssh See the section "Escape Characters" Perhaps this is the interaction that's keeping the channel open and causing rsync to barf? I don't know enough about the process to really have an idea... Good luck.. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-25 20:01 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From dteed@artistic.ca 2005-02-25 12:53 ------- I've had rsync problems intermittantly in Gentoo. Gentoo uses rsync within portage package management system. Typically users run "emerge sync" to update their portage tree prior to building or maintaining any packages. After spending a whole day trying to make emerge sync run without a timeout in rsync/io.c I finally found my solution. I killed the frozen X window and mozilla on the console and then emerge sync worked reliably. I've come to the conclusion that the timeout error is completely bogus and the real issue is something that falsely looks like a timeout. See my bug report at Gentoo: http://bugs.gentoo.org/show_bug.cgi?id=83254 Many Gentoo users suffer from this bug. When a system is borderline, one can keep trying and eventually the rsync works. Such users don't report bugs, they just plug along and live with it. Yesterday was the first day I've seen it fail very consistantly. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-25 20:41 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 wayned@samba.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|REOPENED |ASSIGNED ------- Additional Comments From wayned@samba.org 2005-02-25 13:22 ------- (In reply to comment #8)> See my bug report at Gentoo: > http://bugs.gentoo.org/show_bug.cgi?id=83254That may be a different timeout than this bug is talking about. In this bug the rsync gets done, but doesn't exit, so it then times out. The Gentoo problem appears to be a timeout in the middle of the transfer. I can easily imagine a scenario where this could occur: The generator is busily looking for files to download and sending the index of any file that needs to be updated (followed by checksum data, if the file exists). If enough time elapses between finding changed files, it is possible for a timeout to occur. I'll look into potential fixes for this (which should be pretty easy using the latest additions to protocol-29). -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-25 23:22 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From wayned@samba.org 2005-02-25 16:07 ------- The CVS code now handles the case where the generator is taking too long between writes -- it either flushes any buffered data in the output buffer after timeout/2 seconds have gone by, or it sends a new "no-op" packet that will keep the connection alive. This requires protocol 29 (i.e. 2.6.4 on both ends of the connection). The latest "nightly" tar file has this code in it, if you'd care to try it. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-26 08:40 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From vanes002@umn.edu 2005-02-26 01:26 ------- I believe that using this "no-op" packet effectively implements the "keep alive" option requested in bug 1889 without requiring an explicit option be given. Nice. I'm glad you decided to do this, given that the protocol number is being bumped anyway due to other changes. But just to be clear - this is not a fix for the bug originally reported, where the hang occurs at the end, correct? Also - is there a default for timeout/2 if no --timeout value has been given? -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-26 21:01 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 wayned@samba.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Additional Comments From wayned@samba.org 2005-02-26 13:45 ------- (In reply to comment #11)> I believe that using this "no-op" packet effectively implements > the "keep alive" option requested in bug 1889 without requiring > an explicit option be given.It did once I moved the calls to the file-deletion passes into the generator (so that things were setup properly for the new keep-alive packet to be used).> But just to be clear - this is not a fix for the bug originally reported, > where the hang occurs at the end, correct?It might have. I am now theorizing that the original bug's timeout could have occurred during the directory-time-touching loop at the end of the generator's run. If so, the latest CVS code now handles this for a protocol-29 transfer. On this assumption, I'm closing this bug. Please reopen it if there is still a problem with 2.6.4 (you can try out the latest "nightly" tar file to test it now).> Also - is there a default for timeout/2 if no --timeout value has been given?If no --timeout has been given, no keep-alive packets are currently sent. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-26 21:21 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From dteed@artistic.ca 2005-02-26 14:12 ------- I may be able to reproduce the original situation. I can have a browser and X that are running with 90% load on CPU 1 and 98% load on CPU 2. Once I have waited for the emerge timestamp to age, I can try the sync again and possibly reproduce the bug. The question is, once I'm there and can reproduce the timeout from io.c, what do we want to know? Is there any stat from lsof, for example, that would help? If you want to know what I've done to reproduce the scenario: start X, launch 2 mozilla browsers (maybe one would work), enter a continuous stream of some character into the address bar of mozilla. I can do this with double clicking scroll lock and hitting some digit after. It is the same thing as putting an object on the keyboard (or just one key) so it autorepeats some character forever. I'm only using this weird scenario to reproduce what happened to me. I believe it will also simulate other types of system saturation with different causes that also lead to the io.c timeout error. Please let me know any data you'd like to gather, or if there is something I can add to the source to help debug it. I'm working with 2.6.3 right now. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2005-Feb-26 22:01 UTC
[Bug 1476] Unexplained timout at end of file transfer.
https://bugzilla.samba.org/show_bug.cgi?id=1476 ------- Additional Comments From wayned@samba.org 2005-02-26 14:52 ------- (In reply to comment #13)> I may be able to reproduce the original situation.The original situation is a timeout at the end of all transferred data, not in the middle. Also note that you'd need to be using the very latest 2.6.4 from CVS (or the upcoming 2.6.4pre2) on BOTH ends of the transfer for any reproduction to be noteworthy. Any other scenario does not have the keep-alive fix. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug, or are watching the QA contact.
Possibly Parallel Threads
- [Bug 1476] New: Unexplained timout at end of file transfer.
- DO NOT REPLY [Bug 6461] New: rsync occassionally issues the message "rsync error: unexplained error (code 255) at main.c(1506) [generator=3.0.4]"
- DO NOT REPLY [Bug 4016] New: literal IPv6 addresses not supported in ssh protocol
- DO NOT REPLY [Bug 7454] New: assertion failed in finish_hard_link()
- DO NOT REPLY [Bug 6175] New: write last transfer status when timeout or other error happens