Derek Zhang
2013-Nov-15 15:28 UTC
[Samba] NT_STATUS_NO_LOGON_SERVERS when winbindd is under heavy traffic
Hi list,

Thanks in advance.

I have written a winbindd client that performs NTLM authentication using the WINBINDD_PAM_AUTH_CRAP message. In other words, my client is similar to ntlm_auth, but with higher performance. When I test it under heavy traffic, it works well most of the time and reaches more than 900 transactions per second. Sometimes, however, winbindd returns NT_STATUS_NO_LOGON_SERVERS. Below are some relevant logs.

----------------------
[2013/11/15 22:08:24.477711, 0, pid=4905] winbindd/winbindd_cm.c:835(cm_prepare_connection)
  cm_prepare_connection: getpeername failed with: Transport endpoint is not connected
[2013/11/15 22:08:24.478174, 0, pid=4904] winbindd/winbindd_cm.c:835(cm_prepare_connection)
  cm_prepare_connection: getpeername failed with: Transport endpoint is not connected
[2013/11/15 22:09:04.482330, 1, pid=4900] ../lib/util/tdb_wrap.c:65(tdb_wrap_log)
[2013/11/15 22:09:04.482336, 1, pid=4901] ../lib/util/tdb_wrap.c:65(tdb_wrap_log)
[2013/11/15 22:09:04.482336, 1, pid=4899] ../lib/util/tdb_wrap.c:65(tdb_wrap_log)
  tdb(/opt/MY/contrib/samba/var/locks/mutex.tdb): tdb_lock failed on list 45 ltype=1 (Interrupted system call)
  tdb(/opt/MY/contrib/samba/var/locks/mutex.tdb): tdb_lock failed on list 45 ltype=1 (Interrupted system call)
  tdb(/opt/MY/contrib/samba/var/locks/mutex.tdb): tdb_lock failed on list 45 ltype=1 (Interrupted system call)
[2013/11/15 22:09:04.482475, 0, pid=4900] lib/util_tdb.c:72(tdb_chainlock_with_timeout_internal)
  tdb_chainlock_with_timeout_internal: alarm (40) timed out for key AD1.MY.perf in tdb /opt/MY/contrib/samba/var/locks/mutex.tdb
[2013/11/15 22:09:04.482521, 0, pid=4901] lib/util_tdb.c:72(tdb_chainlock_with_timeout_internal)
[2013/11/15 22:09:04.482525, 0, pid=4899] lib/util_tdb.c:72(tdb_chainlock_with_timeout_internal)
[2013/11/15 22:09:04.482560, 1, pid=4900] lib/server_mutex.c:74(grab_named_mutex)
  tdb_chainlock_with_timeout_internal: alarm (40) timed out for key AD1.MY.perf in tdb /opt/MY/contrib/samba/var/locks/mutex.tdb
  tdb_chainlock_with_timeout_internal: alarm (40) timed out for key AD1.MY.perf in tdb /opt/MY/contrib/samba/var/locks/mutex.tdb
  Could not get the lock for AD1.MY.perf
[2013/11/15 22:09:04.482613, 1, pid=4901] lib/server_mutex.c:74(grab_named_mutex)
[2013/11/15 22:09:04.482630, 1, pid=4899] lib/server_mutex.c:74(grab_named_mutex)
  Could not get the lock for AD1.MY.perf
[2013/11/15 22:09:04.482646, 0, pid=4900] winbindd/winbindd_cm.c:810(cm_prepare_connection)
  Could not get the lock for AD1.MY.perf
  cm_prepare_connection: mutex grab failed for AD1.MY.perf
[2013/11/15 22:09:04.482701, 0, pid=4901] winbindd/winbindd_cm.c:810(cm_prepare_connection)
[2013/11/15 22:09:04.482718, 0, pid=4899] winbindd/winbindd_cm.c:810(cm_prepare_connection)
  cm_prepare_connection: mutex grab failed for AD1.MY.perf
  cm_prepare_connection: mutex grab failed for AD1.MY.perf
[2013/11/15 22:09:04.483073, 2, pid=4899] winbindd/winbindd_pam.c:1931(winbindd_dual_pam_auth_crap)
[2013/11/15 22:09:04.483086, 2, pid=4901] winbindd/winbindd_pam.c:1931(winbindd_dual_pam_auth_crap)
[2013/11/15 22:09:04.483105, 2, pid=4900] winbindd/winbindd_pam.c:1931(winbindd_dual_pam_auth_crap)
  NTLM CRAP authentication for user [MY.perf]\[user00016] returned NT_STATUS_NO_LOGON_SERVERS (PAM: 9)
  NTLM CRAP authentication for user [MY.perf]\[user00017] returned NT_STATUS_NO_LOGON_SERVERS (PAM: 9)
-------------------------

My questions are: why does this happen, and what do the logs above mean? Where is the bottleneck? Since my AD is alive, CPU (4 cores) usage is around 30%, and memory usage is 1.3GB (6.0GB total), I think AD may not be the bottleneck. If this means winbindd has reached its processing capacity, how can I make it more stable?
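Because several winbindd processes write to the same log, the entries above are interleaved and hard to read by eye. The following is a minimal Python sketch for extracting only the header lines and counting them per function and per pid. It assumes the header layout shown in the log above (timestamp, level, pid, then file:line(function)), which depends on "debug pid = yes". The names HEADER, parse_headers and summarize are my own for illustration and are not part of Samba. The sample text is copied from the log above.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed header layout, taken from the log excerpt above, e.g.
# [2013/11/15 22:08:24.477711, 0, pid=4905] winbindd/winbindd_cm.c:835(cm_prepare_connection)
HEADER = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(?P<level>\d+),"
    r"\s*pid=(?P<pid>\d+)\]\s+(?P<file>\S+?):(?P<line>\d+)\((?P<func>\w+)\)"
)


def parse_headers(text):
    """Return one dict per log header found in text (message bodies are ignored)."""
    entries = []
    for m in HEADER.finditer(text):
        entries.append({
            "time": datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S.%f"),
            "level": int(m["level"]),
            "pid": int(m["pid"]),
            "func": m["func"],
        })
    return entries


def summarize(entries):
    """Count headers per function and per (pid, function) pair."""
    by_func = Counter(e["func"] for e in entries)
    by_pid_func = Counter((e["pid"], e["func"]) for e in entries)
    return by_func, by_pid_func


# A few headers and messages copied from the log above.
SAMPLE = """\
[2013/11/15 22:08:24.477711, 0, pid=4905] winbindd/winbindd_cm.c:835(cm_prepare_connection)
  cm_prepare_connection: getpeername failed with: Transport endpoint is not connected
[2013/11/15 22:09:04.482475, 0, pid=4900] lib/util_tdb.c:72(tdb_chainlock_with_timeout_internal)
  tdb_chainlock_with_timeout_internal: alarm (40) timed out for key AD1.MY.perf in tdb /opt/MY/contrib/samba/var/locks/mutex.tdb
[2013/11/15 22:09:04.482560, 1, pid=4900] lib/server_mutex.c:74(grab_named_mutex)
  Could not get the lock for AD1.MY.perf
[2013/11/15 22:09:04.482646, 0, pid=4900] winbindd/winbindd_cm.c:810(cm_prepare_connection)
  cm_prepare_connection: mutex grab failed for AD1.MY.perf
"""

entries = parse_headers(SAMPLE)
by_func, by_pid_func = summarize(entries)

# Seconds between the first getpeername failure and the first lock timeout.
first_conn = min(e["time"] for e in entries if e["func"] == "cm_prepare_connection")
first_timeout = min(
    e["time"] for e in entries if e["func"] == "tdb_chainlock_with_timeout_internal"
)
gap_seconds = (first_timeout - first_conn).total_seconds()

for func, count in by_func.most_common():
    print(f"{count:3d}  {func}")
print(f"gap between first getpeername failure and first lock timeout: {gap_seconds:.3f}s")
```

On this sample the gap comes out at roughly 40 seconds, the same value as the "alarm (40)" in the lock timeout message. Since the message bodies of different processes are interleaved, the sketch only counts headers and does not try to attribute each message line to a pid.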
My Samba version is 3.6.13, and my smb.conf is below.

[global]
    kerberos method = system keytab
    workgroup = perf
    security = ads
    passdb backend = tdbsam
    realm = MY.perf
    password server = *
    log level = 2
    debug pid = yes
    winbind use default domain = yes
    winbind max domain connections = 15
    client ntlmv2 auth = yes
    client ldap sasl wrapping = sign
    winbind max clients = 1024

Thank you,
Derek