bugzilla-daemon at mindrot.org
2020-Jun-04 11:04 UTC
[Bug 3177] New: sshd process had became <defunct> and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

            Bug ID: 3177
           Summary: sshd process had became <defunct> and could not accept requests any more after many count sftp accesses.
           Product: Portable OpenSSH
           Version: 8.2p1
          Hardware: amd64
                OS: Linux
            Status: NEW
          Severity: critical
          Priority: P5
         Component: sshd
          Assignee: unassigned-bugs at mindrot.org
          Reporter: zhlonggang at aliyun.com

Created attachment 3405
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3405&action=edit
sshd log which has a DEBUG3 log level

I have an sftp file server in our production environment, running CentOS 6.8 with openssh-5.3p1. Because openssh-5.3p1 is too old, I recently wanted to upgrade it to the newest release, openssh-8.2p1. After downloading, compiling, deploying, and a few ssh/sftp command-line tests, everything was OK, and openssh-8.2p1 seemed to work as expected.

But when I tested with a Java client program that calls the Jch-5.4 SDK to upload a file to the openssh-8.2p1 sftp file server in a loop, my problem occurred. After many successful sftp accesses, a few hours in, the Java client program failed to access the sftp server with the following exception: "Session.connect: java.net.SocketTimeoutException: Read timed out". When I used an sftp or ssh command to access the server, the command blocked for a long time and never returned, although I could still telnet to the port successfully. When I logged in to the sftp file server and ran ps, I found the following information:

    root     11742     1  0 6?03  ?        00:00:02 sshd: /usr/local/openssh-8.2p1/sbin/sshd -p 99 -f /etc/ssh/sshd_config_sftp.bad [listener] 0 of 10-100 startups
    root     11775 11742  0 6?03  ?        00:00:00 sshd: testuser1 [priv]
    testuse+ 11777 11775  0 6?03  ?        00:00:13 sshd: testuser1 at notty
    root     15110   923  0 18:06 ?        00:00:00 sshd: root at pts/0
    root     15128 15112  0 18:06 pts/0    00:00:00 grep --color=auto sshd
    root     21936 11742  0 6?03  ?        00:00:00 [sshd] <defunct>

I could see an sshd process that had become a defunct process.
That is the problem. I have run the Java client program many times and got the same result every time, and I could not find any error information in the sshd log.

--
You are receiving this mail because:
You are watching the assignee of the bug.
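When reading a listing like the one in the report, it helps to pair each <defunct> entry with its parent, because the parent is the process that has not yet collected the exit status. The following is a minimal sketch, assuming the usual `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD). The sample lines are copied from the report; the function names are made up for illustration.

```python
# Sample lines copied from the `ps -ef` output in the report.
PS_OUTPUT = """\
root 11742 1 0 6?03 ? 00:00:02 sshd: /usr/local/openssh-8.2p1/sbin/sshd -p 99 -f /etc/ssh/sshd_config_sftp.bad [listener] 0 of 10-100 startups
root 11775 11742 0 6?03 ? 00:00:00 sshd: testuser1 [priv]
root 15128 15112 0 18:06 pts/0 00:00:00 grep --color=auto sshd
root 21936 11742 0 6?03 ? 00:00:00 [sshd] <defunct>
"""


def parse_ps(ps_text):
    """Map pid -> (ppid, command) for each process line of `ps -ef` output."""
    rows = {}
    for line in ps_text.splitlines():
        # Assumed columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip the header line and anything malformed
        rows[int(fields[1])] = (int(fields[2]), fields[7])
    return rows


def defunct_children(ps_text):
    """Return (pid, ppid, parent command) for every <defunct> entry."""
    rows = parse_ps(ps_text)
    return [
        (pid, ppid, rows.get(ppid, (None, "?"))[1])
        for pid, (ppid, cmd) in rows.items()
        if "<defunct>" in cmd
    ]


for pid, ppid, parent_cmd in defunct_children(PS_OUTPUT):
    print(f"defunct pid {pid}, parent {ppid}: {parent_cmd}")
```

For the sample above, the only defunct entry is pid 21936, and its parent, pid 11742, is the sshd listener.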
bugzilla-daemon at mindrot.org
2020-Jun-04 11:05 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

longgang <zhlonggang at aliyun.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|sshd process had became     |sshd process had became
                   |<defunct> and could not     |into a defunct process and
                   |accept requests any more    |could not accept requests
                   |after many count sftp       |any more after many count
                   |accesses.                   |sftp accesses.
bugzilla-daemon at mindrot.org
2020-Jun-04 11:49 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

Darren Tucker <dtucker at dtucker.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at dtucker.net

--- Comment #1 from Darren Tucker <dtucker at dtucker.net> ---
A <defunct> process is a child process that has exited but whose parent has not yet called wait() for it. There is a (usually) brief period between when the process exits and when its parent retrieves its exit status. Having a very small number of them appearing briefly is not necessarily a problem. Having the same pid appearing for any significant time (say, more than a second) is probably a problem.

I had a look at the log, but did not see anything that would explain what you're describing.

If you increase MaxStartups, does the behaviour change? The default is 10:30:100, so I'd suggest trying "100:30:1000".
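The wait() behaviour described in comment #1 can be reproduced in a few lines. This is a sketch only, not sshd code. It assumes Linux (it reads /proc) and Python 3, and the names proc_state and zombie_demo are made up for illustration.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the entry is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state field is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Fork a child that exits at once; report its state before and after wait()."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # exited but not yet waited for
    os.waitpid(pid, 0)        # parent retrieves the exit status
    after = proc_state(pid)   # process table entry has been released
    return before, after


if __name__ == "__main__":
    print(zombie_demo())
```

On a typical Linux system the child should show state Z (what ps labels <defunct>) until the parent calls waitpid(), after which its /proc entry disappears. A parent that never reaches its wait call leaves the entry in place indefinitely.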
bugzilla-daemon at mindrot.org
2020-Jun-05 02:18 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

--- Comment #2 from longgang <zhlonggang at aliyun.com> ---
Darren Tucker, thanks for your suggestion. I'll try it today and post my test result here.

I have uploaded the last 500 lines of the log from when the problem occurred, along with the sshd_config file used. I hope you can find some useful information in them.

I have also run the following tests:

1. I ran openssh-5.3p1 with the same sshd_config file that I uploaded, except that the MACs, KexAlgorithms and Ciphers settings were commented out. The same Java client program has been running for more than 24 hours so far; everything is OK and the problem has not occurred.

2. I ran openssh-7.9p1 with the same sshd_config file that I uploaded. The same Java client program has been running for 10 hours so far; everything is also OK and the problem has not occurred.
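For reference, the MaxStartups value suggested in comment #1 has the form start:rate:full. As sshd_config(5) describes it, sshd refuses new connections with probability rate/100 once there are start unauthenticated connections, the probability rises linearly from there, and every attempt is refused at full. The sketch below only illustrates that description. The function names are hypothetical, and the real sshd uses integer arithmetic and a random draw rather than returning a percentage.

```python
def parse_max_startups(value):
    """Parse a 'start:rate:full' MaxStartups value into three integers."""
    start, rate, full = (int(part) for part in value.split(":"))
    return start, rate, full


def drop_probability(unauthenticated, value="10:30:100"):
    """Percent chance that a new connection is refused, per the man page wording."""
    start, rate, full = parse_max_startups(value)
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 100.0
    # Linear ramp from `rate` percent at `start` to 100 percent at `full`.
    return rate + (100 - rate) * (unauthenticated - start) / (full - start)


if __name__ == "__main__":
    for setting in ("10:30:100", "100:30:1000"):
        for n in (5, 10, 55, 100):
            print(setting, n, drop_probability(n, setting))
```

With the default 10:30:100, early drops begin at 10 pending unauthenticated connections; with 100:30:1000 they do not begin until 100.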
bugzilla-daemon at mindrot.org
2020-Jun-05 02:21 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

--- Comment #3 from longgang <zhlonggang at aliyun.com> ---
Created attachment 3406
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3406&action=edit
the last 500 lines log
bugzilla-daemon at mindrot.org
2020-Jun-05 02:23 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

--- Comment #4 from longgang <zhlonggang at aliyun.com> ---
Created attachment 3407
  --> https://bugzilla.mindrot.org/attachment.cgi?id=3407&action=edit
sshd_config file used
bugzilla-daemon at mindrot.org
2020-Jun-05 02:28 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

--- Comment #5 from Darren Tucker <dtucker at dtucker.net> ---
One possibility is the SA_RESTART signal change. It's worth trying 8.3p1 both as is and with "./configure --with-cflags=-DNO_SA_RESTART" (NO_SA_RESTART wasn't added until 8.3p1, so it won't do anything on 8.2p1).
bugzilla-daemon at mindrot.org
2020-Jun-08 09:25 UTC
[Bug 3177] sshd process had became into a defunct process and could not accept requests any more after many count sftp accesses.
https://bugzilla.mindrot.org/show_bug.cgi?id=3177

--- Comment #6 from longgang <zhlonggang at aliyun.com> ---
Darren Tucker, I ran the tests over the weekend. Here are the results.

1. openssh-8.3p1, compiled both with and without the NO_SA_RESTART flag defined, ran into the same problem as openssh-8.2p1. After a few hours, the Java client program failed with the following exception: "Session.connect: java.net.SocketTimeoutException: Read timed out". The sshd process went into defunct status.

    [root at bjm7-9-10 ~]# ps -ef|grep sshd
    root      1118     1  0  2019 ?        00:02:24 /usr/sbin/sshd
    root     14170 16380  0 16:28 ?        00:00:00 [sshd] <defunct>
    root     16380     1  0 08:57 ?        00:00:26 sshd: /usr/local/openssh-8.3p1/sbin/sshd -p 99 -f /etc/ssh/sshd_config_sftp.bad [listener] 0 of 10-100 startups

2. I also ran openssh-8.1p1. The sshd process behaved normally and did not go into defunct status, but the Java client program failed occasionally with the following exception: "java.io.IOException: inputstream is closed". If the Java client program catches the exception and ignores the error, the next attempt succeeds.

3. My test with openssh-7.9p1 has been running for 3 days so far. Everything is OK, the problem has not occurred, and the Java client program has seen none of the errors it hit with openssh-8.1p1.

It seems like openssh-7.9p1 is my best choice.
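The catch-and-retry workaround mentioned for openssh-8.1p1 can be sketched generically as below. The reporter's client is Java; this Python version only illustrates the pattern. Both upload_with_retry and make_flaky are hypothetical names, and make_flaky is a stand-in for an upload that fails a given number of times before succeeding.

```python
import time


def upload_with_retry(upload, attempts=3, delay=0.0, retry_on=(OSError,)):
    """Call upload() until it succeeds or `attempts` tries have been used."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # optional pause before the next try
    raise last_error


def make_flaky(failures):
    """Hypothetical stand-in: an upload that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def upload():
        if state["left"] > 0:
            state["left"] -= 1
            raise OSError("inputstream is closed")
        return "ok"

    return upload


if __name__ == "__main__":
    print(upload_with_retry(make_flaky(1)))
```

A retry of this kind only masks the occasional failure seen with 8.1p1. It would not help with the hang seen on 8.2p1 and 8.3p1, where the server stops answering new connections.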