We have a problem with our setup that exposed another problem in winbindd. Setup: Samba 3.0.1, openLDAP 1.2.23 for idmap backend using port 1389, security=ads (w2k). All works after we get the daemons started successfully. The first issue has to do with the initial setup of both a Samba server [smbd, nmbd, winbindd, slapd], and a Samba client [just nmbd and winbindd]. We do the following: 1) Configure smb.conf, krb5.conf, slapd.conf 2) kinit domadmin@TST.SAMBA17.TEST 3) net ads join At this point, a klist shows the tickets, and net ads testjoin returns success. Then we try to start all the processes as root via: slapd -h ldap://smbdev.tst.samba17.test:1389 smbd -D nmbd -D winbindd Usually, and this is the first problem, the log.winbindd reports the following: nsswitch/winbindd.c:main(842) winbindd version 3.0.1 started. Copyright The Samba Team 2000-2003 nsswitch/winbindd_util.c:add_trusted_domain(150) Added domain TST TST.SAMBA17.TEST libsmb/clikrb5.c:ads_krb5_mk_req(269) krb5_cc_get_principal failed (No credentials cache found) libads/kerberos.c:ads_kinit_password(133) kerberos_kinit_password HOST/smbdev@TST.SAMBA17.TEST failed: Client not found in Kerberos database nsswitch/winbindd_ads.c:ads_cached_connection(65) ads_connect for domain TST failed: Operations error nsswitch/winbindd_util.c:add_trusted_domains(207) scanning trusted domain list If the error "Client not found in Kerberos database" appears, winbindd will not connect to the AD, giving the "Operations error". If we leave winbindd running it appears not to ever reconnect. Instead, we have to kill and restart winbindd. So now we have an ugly scripted loop that does something to the effect of the following: 1) start all daemons 2) Wait 30 seconds for winbindd to go through its initialization 3) try a wbinfo -u. 4) If no users are returned by wbinfo, kill all daemons and go back to step 1. Winbindd successfully connects after an average of 2.5 minutes. On this success, the same section of the log will show: nsswitch/winbindd.c:main(842) winbindd version 3.0.1 started. Copyright The Samba Team 2000-2003 nsswitch/winbindd_util.c:add_trusted_domain(150) Added domain TST TST.SAMBA17.TEST libsmb/clikrb5.c:ads_krb5_mk_req(269) krb5_cc_get_principal failed (No credentials cache found) nsswitch/winbindd_util.c:add_trusted_domains(207) scanning trusted domain list I would truly like to know what we are doing wrong here. I've searched the web and newsgroups and have found similar errors, but no solutions that worked here. Code purusals haven't found anything (yet). The time between the Samba server and the AD are within a second of each other. Permissions are correct and files exist as expected. It definitely appears to be related to time, though. If I do the kinit and net ads join, then wait around five minutes before starting the daemons, winbindd starts up fine. What could need the time? Is there any command that I can run before starting Samba to verify all is ready? If you have suggestions for this, please pass them on, because this causes another problem that is a killer for us. ---- First I'll give an abbreviated description of what I'm about to describe. I hope this will make the next paragraph easier to follow: with user testuser1 and domain TST: 0) client tries to access a global share 1) client GETPWNAM request for tst+testuser1 to the Samba server 2) client GETPWNAM request for TST+testuser1 to the Samba server 3) client GETPWNAM request for TST+TESTUSER1 to the Samba server 4) client GETPWNAM request for testuser1 to the Samba server 5) client GETPWNAM request for TESTUSER1 to the Samba server 6) client CREATE_USER request for testuser1, which returns success and uses up a uid. 7) Process repeats since the winbindd server still can't authenticate using the AD -- including the CREATE_USER. ---- Now to elaborate: There have been situations where a PC tries to access a Samba share. In this case, the user is an AD global user that is not yet in the openLDAP idmap backend. I expect that even if it was in the idmap, the problem would still exist. At the time of these client requests, winbindd is up and running on the Samba server, but received that Kerberos error above (we are at step 2 of the startup loop, so winbindd can't talk with AD yet). The client tries five variations of the user name, attempting to authenticate using the winbindd server. This failing, it sends a CREATE_USER request to the winbindd server. The winbindd server pulls the next available id from the openLDAP backend, incrementing the openLDAP's next-available stored in uidNumber (<= issue!!), then calls wb_storepwnam. wb_storepwnam returns success (hmmmm, recall that we authenticate to AD and don't have a working connection to AD). The client then repeats this cycle, requesting authentication, then doing another CREATE_USER. This cycle is repeated roughly 30 times per second. The very bad part of this is that while winbindd is up and can't connect to AD, 30 uids are lost every second since the CREATE_USER request increments the openLDAP uidNumber every time it is called. We have a limited uid range, so this is very painful for us. If we leave winbindd running 30 seconds only to find out it didn't connect to the AD, we've lost 900 uids in the interim. I really want to solve the first, startup problem, and know it would likely mask the second, uid problem. However, I am willing to fix the second, as it still appears to be a valid issue. Before I work on it, though, I'd like some input from other Samba developers. =============================== Here is the section of the log.winbindd that is repeated 30 times a second, each time with a new, consecutive uid in the wb_storepwnam output. The timestamp has been removed for brevity and were all "2004/02/12 18:00:20". [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:client_write(556) client_write: need to write 512 extra data bytes. [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 512 bytes. [10] nsswitch/winbindd.c:client_write(545) client_write: client_write: complete response written. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn GETPWNAM [3] nsswitch/winbindd_user.c:winbindd_getpwnam(113) [ 5263]: getpwnam tst+testuser1 [10] nsswitch/winbindd_cache.c:refresh_sequence_number(343) refresh_sequence_number: TST time ok [10] nsswitch/winbindd_cache.c:refresh_sequence_number(368) refresh_sequence_number: TST seq number is now -1 [1] nsswitch/winbindd_user.c:winbindd_getpwnam(148) user 'testuser1' does not exist [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn GETPWNAM [3] nsswitch/winbindd_user.c:winbindd_getpwnam(113) [ 5263]: getpwnam TST+testuser1 [10] nsswitch/winbindd_cache.c:refresh_sequence_number(343) refresh_sequence_number: TST time ok [10] nsswitch/winbindd_cache.c:refresh_sequence_number(368) refresh_sequence_number: TST seq number is now -1 [1] nsswitch/winbindd_user.c:winbindd_getpwnam(148) user 'testuser1' does not exist [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn GETPWNAM [3] nsswitch/winbindd_user.c:winbindd_getpwnam(113) [ 5263]: getpwnam TST+TESTUSER1 [10] nsswitch/winbindd_cache.c:refresh_sequence_number(343) refresh_sequence_number: TST time ok [10] nsswitch/winbindd_cache.c:refresh_sequence_number(368) refresh_sequence_number: TST seq number is now -1 [1] nsswitch/winbindd_user.c:winbindd_getpwnam(148) user 'TESTUSER1' does not exist [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn GETPWNAM [3] nsswitch/winbindd_user.c:winbindd_getpwnam(113) [ 5263]: getpwnam testuser1 [10] nsswitch/winbindd_cache.c:refresh_sequence_number(343) refresh_sequence_number: TST time ok [10] nsswitch/winbindd_cache.c:refresh_sequence_number(368) refresh_sequence_number: TST seq number is now -1 [1] nsswitch/winbindd_user.c:winbindd_getpwnam(148) user 'testuser1' does not exist [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn GETPWNAM [3] nsswitch/winbindd_user.c:winbindd_getpwnam(113) [ 5263]: getpwnam TESTUSER1 [10] nsswitch/winbindd_cache.c:refresh_sequence_number(343) refresh_sequence_number: TST time ok [10] nsswitch/winbindd_cache.c:refresh_sequence_number(368) refresh_sequence_number: TST seq number is now -1 [1] nsswitch/winbindd_user.c:winbindd_getpwnam(148) user 'TESTUSER1' does not exist [10] nsswitch/winbindd.c:client_write(511) client_write: wrote 1300 bytes. [10] nsswitch/winbindd.c:winbind_client_read(457) client_read: read 1568 bytes. Need 0 more for a full request. [10] nsswitch/winbindd.c:process_request(307) process_request: request fn CREATE_USER [3] nsswitch/winbindd_acct.c:winbindd_create_user(876) [ 5263]: create_user: user=>(testuser1), group=>() [5] nsswitch/winbindd_acct.c:wb_getgrnam(522) wb_getgrnam: Did not find group (nobody) [10] nsswitch/winbindd_acct.c:passwd2string(158) passwd2string: converting passwd struct for testuser1 [10] nsswitch/winbindd_acct.c:wb_storepwnam(487) wb_storepwnam: Success -> "testuser1:x:11144:60001:testuser1:/home1/testuser1:/bin/csh" Big thanks for all inputs, John