Tristan Evans
2016-Feb-19 14:46 UTC
[Samba] winbindd: Exceeding 200 client connections, no idle connection found
Hello! This is my first email to the Samba mailing list :)

We have been having an issue on our servers where the winbind service begins to use ~100% CPU and logs the following error:

*winbindd: Exceeding 200 client connections, no idle connection found*

We have been running Samba *3.6.3-0.52.5*. When we first investigated the problem, we found that *3.6.3-0.58.1* includes a patch described as:

*Prune idle or hung connections older than "winbind request timeout" (bso#3204, bnc#872912)*

*BSO #3204* covers what appears to be the exact problem we're having. The Bugzilla report is here: https://bugzilla.samba.org/show_bug.cgi?id=3204.

So we've been deploying this patch slowly to all the servers that have this problem (around 1500 physical servers running the same image). I've been keeping a close eye on my email, because the machines send an email every time this error occurs. We have deployed the patch to about 1/3 of the machines, and today I got an email from a system that had already installed it.

The patch is deployed via a script that stops the Samba services (smb, nmb, winbind), installs the RPMs, and then starts the services.

*The winbindd service stopped: 2/18 02:00:04 AM*
*The patch installed: 2/18 02:00:08 AM*
*The winbindd service started: 2/18 02:00:49 AM*

The error was reported in the logs: 2/18 12:56 PM

It could be that the service wasn't fully restarted, but it logged as if it was:

*[2016/02/18 02:00:04.379281, 0] winbindd/winbindd.c:212(winbindd_sig_term_handler)*
* Got sig[15] terminate (is_parent=1)*
*[2016/02/18 02:00:49.910364, 0] winbindd/winbindd_cache.c:3178(initialize_winbindd_cache)*
* initialize_winbindd_cache: clearing cache and re-creating with version number 2*

Unfortunately I don't have the system in its bad state right now (we have a support team who connects and restarts the service when these emails roll in, so that our customers do not experience problems).
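One way to double-check the restart timeline from the log itself is to parse the header timestamps and measure how long winbindd had been up when the error appeared. The following is only a sketch: it assumes the header format shown in the excerpt above, and it treats the `initialize_winbindd_cache` message as a marker for a fresh start, which is an assumption rather than documented behaviour. The helper names are hypothetical.

```python
import re
from datetime import datetime

# Header format assumed from the excerpt in this email, e.g.
# [2016/02/18 02:00:04.379281,  0] winbindd/winbindd.c:212(winbindd_sig_term_handler)
HEADER_RE = re.compile(
    r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6}),\s*\d+\]\s+(\S+)"
)


def parse_header(line):
    """Return (timestamp, source_location) for a log header line, else None."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")
    return stamp, match.group(2)


def last_start_before(lines, error_time):
    """Return the latest cache-initialisation timestamp at or before error_time.

    Assumption: an initialize_winbindd_cache header indicates a daemon start.
    """
    starts = []
    for line in lines:
        parsed = parse_header(line)
        if parsed is None:
            continue  # message body line, not a header
        stamp, location = parsed
        if "initialize_winbindd_cache" in location and stamp <= error_time:
            starts.append(stamp)
    return max(starts) if starts else None


if __name__ == "__main__":
    log_lines = [
        "[2016/02/18 02:00:04.379281,  0] winbindd/winbindd.c:212(winbindd_sig_term_handler)",
        "  Got sig[15] terminate (is_parent=1)",
        "[2016/02/18 02:00:49.910364,  0] winbindd/winbindd_cache.c:3178(initialize_winbindd_cache)",
        "  initialize_winbindd_cache: clearing cache and re-creating with version number 2",
    ]
    error_time = datetime(2016, 2, 18, 12, 56)  # "2/18 12:56 PM" from the report
    started = last_start_before(log_lines, error_time)
    print("winbindd uptime when the error was logged:", error_time - started)
```

For the excerpt above this gives an uptime of just under 11 hours, which is consistent with the error coming from the newly started daemon rather than a leftover process.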
The next time this happens, I will try to preserve its state for a while and collect information such as the winbind processes' open file descriptors. Is there anything else I should look for that would help diagnose an issue like this? Thanks.
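As a rough illustration of the file-descriptor capture mentioned above, the sketch below walks `/proc` on Linux, finds processes whose `comm` is `winbindd`, and counts their open descriptors by kind. The function names are hypothetical, it assumes a Linux `/proc` layout, and it generally needs to run as root to read another user's `fd` directory. A large socket count close to the 200-client limit would be the thing to look for.

```python
import os
from collections import Counter


def find_pids(name, proc_root="/proc"):
    """Return sorted PIDs whose /proc/<pid>/comm equals name."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to inspect
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                if handle.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return sorted(pids)


def classify_fds(pid, proc_root="/proc"):
    """Count the open descriptors of pid by kind (socket, pipe, ...)."""
    counts = Counter()
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # descriptor was closed while we were looking
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        elif target.startswith("anon_inode:"):
            counts["anon_inode"] += 1
        else:
            counts["file"] += 1
    return counts


if __name__ == "__main__":
    for pid in find_pids("winbindd"):
        try:
            print(pid, dict(classify_fds(pid)))
        except OSError as exc:
            print(pid, "unreadable:", exc)
```

Saving this output before the service is restarted gives a snapshot that can be compared against a healthy machine.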
Jeremy Allison
2016-Feb-19 18:34 UTC
[Samba] winbindd: Exceeding 200 client connections, no idle connection found
On Fri, Feb 19, 2016 at 09:46:19AM -0500, Tristan Evans wrote:
> Hello! This is my first email to the Samba mailing list :)
>
> We have been having an issue on our servers where the winbind service
> begins to utilize 100%~ CPU and logs the following error:
>
> *winbindd: Exceeding 200 client connections, no idle connection found*
>
> We have been running Samba *3.6.3-0.52.5*, however when originally
> investigating the problem, it was found that *3.6.3-0.58.1* included a
> patch for the following:
>
>
> *Prune idle or hung connections older than "winbind request
> timeout"(bso#3204, bnc#872912)*
>
> *BSO #3204* covers what appears to be the exact problem we're having. The
> Bugzilla report is here: https://bugzilla.samba.org/show_bug.cgi?id=3204.
>
> So we've been deploying this patch slowly to all the servers that are
> having this problem (around 1500 physical servers running the same image).
> I've been keeping a close eye on my email, because the machines will send
> an email every time this error occurs. We have deployed it to about 1/3 of
> the machines, and today I just got a email from a system that already
> installed that patch.
>
> The patch is being deployed via a script that stops the Samba services
> (smb, nmb, winbind), installs the RPMs and then starts the services.
>
> *The winbindd service stopped: 2:00:04 AM*
>
> *The patch installed: 2/18 02:00:08 AM*
> *The winbindd service started: 2/18 02:00:49 AM*
>
> The error was reported in the logs: 2/18 12:56 PM
>
> Now it could be the service wasn't restarted fully, however it logged as if
> it was:
>
> *[2016/02/18 02:00:04.379281, 0]
> winbindd/winbindd.c:212(winbindd_sig_term_handler)*
> * Got sig[15] terminate (is_parent=1)*
> *[2016/02/18 02:00:49.910364, 0]
> winbindd/winbindd_cache.c:3178(initialize_winbindd_cache)*
> * initialize_winbindd_cache: clearing cache and re-creating with version
> number 2*
>
> Unfortunately I don't have the system in it's bad state right now (we have
> a support team who connects and restarts the service when these emails roll
> in, so that our customers do not experience problems).

3.6.3 is old enough that it's out of support from the Samba Team, I'm afraid. There have been significant bug fixes to winbindd in subsequent releases, so I'd really recommend testing with a more up-to-date Samba. If you really need to stay on 3.6.x, you'll have to go through the git commit logs and back-port all relevant winbindd patches (this would be a lot of work).

Sorry,

Jeremy.
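To get a feel for how much work going through the git commit logs would be, a small script can list the commits that touch the winbindd sources between two refs in a Samba checkout. This is only a sketch: the ref names used in the example (`samba-3.6.3` and `origin/v3-6-stable`) and the `source3/winbindd` path are assumptions that should be checked against the actual repository, and the helper names are hypothetical. It relies only on standard `git log` options.

```python
import subprocess
import sys


def git_log_command(old_ref, new_ref, path="source3/winbindd"):
    """Build the git command listing commits in old_ref..new_ref that touch path."""
    return [
        "git", "log", "--oneline", "--no-merges",
        f"{old_ref}..{new_ref}", "--", path,
    ]


def candidate_commits(repo_dir, old_ref, new_ref, path="source3/winbindd"):
    """Run git in repo_dir and return one summary line per candidate commit."""
    result = subprocess.run(
        git_log_command(old_ref, new_ref, path),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    if len(sys.argv) == 4:
        # usage: script.py <repo_dir> <old_ref> <new_ref>
        for line in candidate_commits(sys.argv[1], sys.argv[2], sys.argv[3]):
            print(line)
    else:
        # Example refs are assumptions; verify the names in your checkout.
        print(" ".join(git_log_command("samba-3.6.3", "origin/v3-6-stable")))
```

The resulting list is only a starting point. Each commit still has to be reviewed for relevance and for dependencies on changes outside the winbindd directory.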