Hi Marc,

>> The cause is that the password change didn't reach both AD DCs, but only
>> one. The other one still had the old value as could be seen by
>> samba-tool ldapcmp. Restarting the DCs and waiting for a couple of
>> seconds brings them back to sync and Windows logons work as they used to.
>> Any idea, what I should do next time to obtain valuable output for
>> debugging?
>
> * What Samba version are you running?

The DCs are 4.1.17-Debian.

> * How many DCs?

Just two.

> * Can you force this problem to appear?

Need some more investigation here - I have not found a way to reproduce it
on demand.

> Just an idea: AD problems are often caused by DNS problems and we got
> the keyword "DNS islanding" in another thread at the moment: Which DNS
> do your DCs use as primary? Their own or a different one? See
> http://retrohack.com/a-word-or-two-about-dns-islanding/

As I understand Linux name resolution, there is no static primary-secondary
concept for DNS. So I'll try to remove the self-dependence altogether
and see if it improves the situation.

Regards,
  - lars.
Unsure whether this is another symptom of the same disease:

While configuring a member CUPS print server and checking the syslog for
an entirely different reason, I was surprised to see the following log
entries (and many more similar ones):

Mar 13 11:36:10 snorri nslcd[11752]: [4a481a] <passwd="mgr"> ldap_result() failed: Can't contact LDAP server
Mar 13 11:36:10 snorri nslcd[11752]: [4a481a] <passwd="mgr"> ldap_abandon() failed to abandon search: Can't contact LDAP server: Transport endpoint is not connected
Mar 13 11:36:10 snorri nslcd[11752]: [9abb43] <passwd=1001> ldap_result() failed: Can't contact LDAP server
Mar 13 11:36:10 snorri nslcd[11752]: [9abb43] <passwd=1001> ldap_abandon() failed to abandon search: Can't contact LDAP server: Transport endpoint is not connected

Okay, doing:

ldapsearch -LLL -D "CN=Administrator,CN=Users,DC=ad,DC=microsult,DC=de" -H ldap://ad.microsult.de -x -W '(uid=mgr)' uid uidNumber gidNumber sAMAccountName name gecos

works nicely. I can also specify each DC separately as LDAP URI. Login to
the machine, id, getent: everything works, but sometimes produces the said
log entries and then takes a considerable time. =nscd= is stopped on the
machine.

Currently everything is running smoothly. During the time when I saw the
most entries I also had several brief pauses in my music - served via
Kerberized NFS4 with AD serving NSS and Kerberos. Some time before that, I
applied today's Debian security updates to both DCs and changed
/etc/resolv.conf on the primary DC to not point to itself anymore.

However, silences of a second or so are not uncommon in my setup. When they
become more frequent, this is usually a dire indication that something is
about to break. And it generally does not coincide with any work on the DC.

>>> Any idea, what I should do next time to obtain valuable output for
>>> debugging?

Which is still the challenging question! ;)

>> * What Samba version are you running?
>
> The DCs are 4.1.17-Debian.
>
>> * How many DCs?
>
> Just two.

Regards,
  - lars.
It did happen again and this time I was a little less panicked and took 
some time to figure out what happened.
On my primary DC (SAMBA) I did not notice anything extraordinary. 
However, my secondary (VERDANDI) reported issues:
root@verdandi:~# samba-tool drs showrepl
Default-First-Site-Name\VERDANDI
DSA Options: 0x00000001
DSA object GUID: a03bbb51-1dca-44ae-a4d9-7aa8cb4a1ace
DSA invocationId: 8bdb4f85-1da2-4f5a-b9a9-e8369d202745
==== INBOUND NEIGHBORS ===
CN=Schema,CN=Configuration,DC=ad,DC=microsult,DC=de
         Default-First-Site-Name\SAMBA via RPC
                 DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
                 Last attempt @ Wed Apr 22 00:12:36 2015 CEST failed, result 5 (WERR_ACCESS_DENIED)
                 1265 consecutive failure(s).
                 Last success @ Fri Apr 17 14:47:18 2015 CEST
[...]
==== OUTBOUND NEIGHBORS ===
[... everything OK, since no attempts were ever made, but ...]
DC=ad,DC=microsult,DC=de
         Default-First-Site-Name\SAMBA via RPC
                 DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
                 Last attempt @ Wed Apr 22 00:14:00 2015 CEST failed, result 5 (WERR_ACCESS_DENIED)
                 31 consecutive failure(s).
                 Last success @ NTTIME(0)
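
To catch this earlier next time, the failure counters in this output could be watched automatically. What follows is only a minimal sketch in Python, not a tested monitoring tool: the function name failing_links, the threshold parameter and the abridged SAMPLE text are assumptions made for illustration, and the parser only knows the line shapes visible in the output above.

```python
import re

# Abridged sample shaped like the `samba-tool drs showrepl` output above.
SAMPLE = r"""CN=Schema,CN=Configuration,DC=ad,DC=microsult,DC=de
        Default-First-Site-Name\SAMBA via RPC
                DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
                Last attempt @ Wed Apr 22 00:12:36 2015 CEST failed, result 5 (WERR_ACCESS_DENIED)
                1265 consecutive failure(s).
                Last success @ Fri Apr 17 14:47:18 2015 CEST
DC=ad,DC=microsult,DC=de
        Default-First-Site-Name\SAMBA via RPC
                DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
                Last attempt @ Wed Apr 22 00:14:00 2015 CEST failed, result 5 (WERR_ACCESS_DENIED)
                31 consecutive failure(s).
                Last success @ NTTIME(0)
"""

FAILURES = re.compile(r"^(\d+) consecutive failure\(s\)")


def failing_links(text, threshold=1):
    """Return (naming_context, partner, failures) for links at or above threshold."""
    context = partner = None
    found = []
    for line in text.splitlines():
        stripped = line.strip()
        if re.match(r"(?:CN|DC)=", line):
            # An unindented DN starts a new naming context.
            context, partner = stripped, None
        elif stripped.endswith(" via RPC"):
            # Replication partner, e.g. Default-First-Site-Name\SAMBA.
            partner = stripped[: -len(" via RPC")]
        else:
            match = FAILURES.match(stripped)
            if match and int(match.group(1)) >= threshold:
                found.append((context, partner, int(match.group(1))))
    return found


for context, partner, count in failing_links(SAMPLE):
    print(f"{partner} -> {context}: {count} consecutive failure(s)")
```

Fed with the real command output from a periodic job instead of SAMPLE, something like this would flag a link once its counter rises above zero, well before a missed password change shows up at a Windows logon.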
And consequently the password update that happened the previous day was 
out of sync:
samba-tool ldapcmp ldap://samba ldap://verdandi -Uadministrator
Password for [AD\administrator]:
* Comparing [DOMAIN] context...
* Objects to be compared: 289
Comparing:
'CN=Builtin,DC=ad,DC=microsult,DC=de' [ldap://samba]
'CN=Builtin,DC=ad,DC=microsult,DC=de' [ldap://verdandi]
     Attributes found only in ldap://samba:
         serverState
     FAILED
Comparing:
'CN=Lars LH. Hanke,CN=Users,DC=ad,DC=microsult,DC=de' [ldap://samba]
'CN=Lars LH. Hanke,CN=Users,DC=ad,DC=microsult,DC=de' [ldap://verdandi]
     Difference in attribute values:
         pwdLastSet =>
['130740170160000000']
['130703672860000000']
     FAILED
[...]
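
For reading the two pwdLastSet values: AD stores this attribute as a count of 100-nanosecond intervals since 1601-01-01 UTC. A small Python sketch of the conversion follows; the helper name nttime_to_datetime is made up for illustration, and the two input values are the ones from the ldapcmp output above.

```python
from datetime import datetime, timedelta, timezone

# AD timestamps count 100 ns ticks from this epoch.
NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def nttime_to_datetime(value):
    """Convert an AD timestamp (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    # Ten ticks make one microsecond, the finest unit timedelta supports.
    return NT_EPOCH + timedelta(microseconds=int(value) // 10)


on_samba = nttime_to_datetime("130740170160000000")
on_verdandi = nttime_to_datetime("130703672860000000")

print(on_samba.isoformat())     # 2015-04-20T15:23:36+00:00
print(on_verdandi.isoformat())  # 2015-03-09T09:34:46+00:00
print(on_samba - on_verdandi)   # 42 days, 5:48:50
```

So the value on VERDANDI dates from 9 March, roughly six weeks before the value on SAMBA.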
After I restarted the secondary DC some 34h ago, it synchronized
immediately and still does, i.e. drs showrepl shows its last success 5
minutes ago and no failures.
It looks a little like an expired ticket, which fails to renew after 
several weeks. But this is pure speculation.
Any ideas for troubleshooting?
Regards,
  - lars.
On 13.03.2015 at 00:43, Lars Hanke wrote:
> Hi Marc,
>
>>> The cause is that the password change didn't reach both AD DCs, but only
>>> one. The other one still had the old value as could be seen by
>>> samba-tool ldapcmp. Restarting the DCs and waiting for a couple of
>>> seconds brings them back to sync and Windows logons work as they used
>>> to.
>>> Any idea, what I should do next time to obtain valuable output for
>>> debugging?
>>
>> * What Samba version are you running?
>
> The DCs are 4.1.17-Debian.
>
>> * How many DCs?
>
> Just two.
>
>> * Can you force this problem to appear?
>
> Need some more investigation here - I did not find any way reproducible
> under arbitrary conditions.
>
>> Just an idea: AD problems are often caused by DNS problems and we got
>> the keyword "DNS islanding" in another thread at the moment: Which DNS
>> do your DCs use as primary? Their own or a different one? See
>> http://retrohack.com/a-word-or-two-about-dns-islanding/
>
> As I understood Linux resolving there is no static primary-secondary
> concept for DNS. So I'll try to remove the self-dependence altogether
> and see, if it enhances the situation.
>
> Regards,
>   - lars.
>
Greetings, Dr. Lars Hanke!

> It did happen again and this time I was a little less panicked and took
> some time to figure out what happened.

> On my primary DC (SAMBA) I did not notice anything extraordinary.
> However, my secondary (VERDANDI) reported issues:

> root@verdandi:~# samba-tool drs showrepl
> Default-First-Site-Name\VERDANDI
> DSA Options: 0x00000001
> DSA object GUID: a03bbb51-1dca-44ae-a4d9-7aa8cb4a1ace
> DSA invocationId: 8bdb4f85-1da2-4f5a-b9a9-e8369d202745

> ==== INBOUND NEIGHBORS ===
> CN=Schema,CN=Configuration,DC=ad,DC=microsult,DC=de
>          Default-First-Site-Name\SAMBA via RPC
>                  DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
>                  Last attempt @ Wed Apr 22 00:12:36 2015 CEST failed,
> result 5 (WERR_ACCESS_DENIED)
>                  1265 consecutive failure(s).
>                  Last success @ Fri Apr 17 14:47:18 2015 CEST

> [...]
> ==== OUTBOUND NEIGHBORS ===

> [... everything OK for no attempts were ever made, but ...]

> DC=ad,DC=microsult,DC=de
>          Default-First-Site-Name\SAMBA via RPC
>                  DSA object GUID: b19509be-c3ee-4a58-9fc9-afd61759a23f
>                  Last attempt @ Wed Apr 22 00:14:00 2015 CEST failed,
> result 5 (WERR_ACCESS_DENIED)
>                  31 consecutive failure(s).
>                  Last success @ NTTIME(0)

> And consequently the password update that happened the previous day was
> out of sync:

> samba-tool ldapcmp ldap://samba ldap://verdandi -Uadministrator
> Password for [AD\administrator]:

> * Comparing [DOMAIN] context...

> * Objects to be compared: 289

> Comparing:
> 'CN=Builtin,DC=ad,DC=microsult,DC=de' [ldap://samba]
> 'CN=Builtin,DC=ad,DC=microsult,DC=de' [ldap://verdandi]
>      Attributes found only in ldap://samba:
>          serverState
>      FAILED

> Comparing:
> 'CN=Lars LH. Hanke,CN=Users,DC=ad,DC=microsult,DC=de' [ldap://samba]
> 'CN=Lars LH. Hanke,CN=Users,DC=ad,DC=microsult,DC=de' [ldap://verdandi]
>      Difference in attribute values:
>          pwdLastSet =>
> ['130740170160000000']
> ['130703672860000000']

Looks very much like an hour off.
I suggest checking tzdata configuration.

>      FAILED

> [...]

> Having restarted the secondary DC some 34h ago, it synchronized
> immediately and still does, i.e. drs showrepl has its last success 5
> minutes ago, no failures.

> It looks a little like an expired ticket, which fails to renew after
> several weeks. But this is pure speculation.

> Any ideas for troubleshooting?

--
With best regards,
Andrey Repin
Friday, April 24, 2015 00:04:34

Sorry for my terrible english...