Rowland Penny
2017-Feb-01 16:19 UTC
[Samba] net ads and wbinfo are painfully slow -- but they work
On Wed, 1 Feb 2017 07:30:19 -0800 Chris Stankevitz <chrisstankevitz at gmail.com> wrote:> On Wed, Feb 1, 2017 at 1:12 AM, Rowland Penny via samba > <samba at lists.samba.org> wrote: > > He is also unlikely to be running avahi, he is using Freebsd 10.3 > > truss (like strace) showed that wbinfo, net, and sshd were all hanging > after system calls to getuid() and getpid(). At the time I thought it > was those calls that were slow but today I realize (although I don't > have access to the system today) that the slowdown was *after* > getuid()/getpid(). > > My suspicion is that "passwd: files winbind" is causing some low level > function to be very slow although I'm not sure which one yet. > > ChrisThere is definitely something strange going on here, if I flush the winbind cache, then ask for a users info with getent, I get this: time getent passwd rowland rowland:*:10000:10000:Rowland Penny:/home/rowland:/bin/bash real 0m0.054s user 0m0.004s sys 0m0.000s then if I ask for it again: real 0m0.001s user 0m0.000s sys 0m0.000s is PAM set up correctly ? Rowland
Chris Stankevitz
2017-Feb-02 17:28 UTC
[Samba] net ads and wbinfo are painfully slow -- but they work
On Wed, Feb 1, 2017 at 8:19 AM, Rowland Penny via samba <samba at lists.samba.org> wrote:> There is definitely something strange going on here, if I flush > the winbind cache, then ask for a users info with getent, I get this:I suspect now that the problem is not getent/nsswitch/pam and that it is either winbindd/krb5/AD. I launched winbindd in debug99/foreground/no_cache mode. I waited about 10 seconds, then issued "wbinfo -u". wbinfo -u took 4 minutes to complete. It completed and displayed the correct/valid list of users. During the several minutes it took to display the users, winbindd was logging lots activity. The activity was interrupted by long waits. These long waits of course summed to be the ~4 minutes it took for "wbinfo -u" to return. The debug log is probably too large for this list. I snipped out of the debug output A) the lines that preceeded long waits and B) the lines that look like error messages. Please let me know if anything stands out as a configuration problem. The windows domain is just a couple days old and contains only a handful of users. Please keep in mind that the wbinfo -u is working and returns correct results. It is just outrageously slow. Thank you again, Chris === A. log entries tha preceeded long waits == Opening connection to LDAP server '192.168.1.4:389', timeout 15 seconds Starting GENSEC submechanism gse_krb5 list_users MYDOMAIN wbint_QueryUserList: struct wbint_QueryUserList in: struct wbint_QueryUserList Opening connection to LDAP server '192.168.1.4:389', timeout 15 seconds Starting GENSEC submechanism gse_krb5 Opening connection to LDAP server '192.168.1.4:389', timeout 15 seconds Starting GENSEC submechanism gse_krb5 Starting GENSEC submechanism gse_krb5 Opening connection to LDAP server '192.168.1.4:389', timeout 15 seconds Starting GENSEC submechanism gse_krb5 Opening connection to LDAP server '192.168.1.4:389', timeout 15 seconds Starting GENSEC submechanism gse_krb5 Starting GENSEC submechanism gse_krb5 Starting GENSEC submechanism gse_krb5 Starting GENSEC submechanism gse_krb5 Already reaped child 89263 died === B. log entries that look like error messages == 1. A complaint about addrchange_context_create addrchange_context_create failed: NT_STATUS_NOT_SUPPORTED 2. A complaint about broken pipe after starting gse_krb5. This happened multiple times. Starting GENSEC submechanism gse_krb5 Client request timed out, shutting down sock 23, pid 89266 final write to client failed: Broken pipe 3. A complainted about gss_acquire_creds. This happened multiple times. gss_acquire_creds failed for GSS_C_NO_NAME with [ No credentials were supplied, or the credentials w ere unavailable or inaccessible.: unknown mech-code 0 for mech 1 2 840 113554 1 2 2] -the caller may retry after a kinit. Failed to start GENSEC client mech gse_krb5: NT_STATUS_INTERNAL_ERROR Failed to setup SPNEGO negTokenInit request: NT_STATUS_INTERNAL_ERROR ads_sasl_spnego_gensec_bind(KRB5) failed with: An internal error occurred., calling kinit
Geert Lorang
2017-Feb-02 18:20 UTC
[Samba] net ads and wbinfo are painfully slow -- but they work
On 2/02/2017 18:28, Chris Stankevitz via samba wrote: This:> wbinfo -u took 4 minutes to complete. It completed and displayed the > correct/valid list of users.and this:> gss_acquire_creds failed for GSS_C_NO_NAME with [ No credentials were > supplied, or the credentials w > ere unavailable or inaccessible.: unknown mech-code 0 for mech 1 2 840 > 113554 1 2 2] -the caller may > retry after a kinit.Remind me too much of https://bugzilla.samba.org/show_bug.cgi?id=11852 which was fixed after 4.2.10 but before 4.2.14 - so you shouldn't be impacted actually, but it might be worth to double check it. Check in your winbindd debug log for the following: "Current tickets expire in 36000 seconds (at <timestamp>, time is now <timestamp>)" If it states anything negative (e.g. "Current ticket expire in -1474558139 seconds") you're hitting the above mentioned bug which results in a super slow wbinfo (because it has to kinit for every user you want to list). Just my 2 cents! Regards, Geert