On 31/10/14 13:55, Donaldson Jeff wrote:
>
> Recently one of my Samba4 (version 4.2.0) Domain Controllers started acting up.
> Authentication against it would time out and fail, but until recently the
> internal DNS was still working. Now the internal DNS fails as well. If I use
> nslookup and set the server to it, then look up any hostname, I get "connection
> timed out; no servers could be reached". This DC is my primary and holds all
> FSMO roles. I need to get it working again in order to seize those roles on one
> of my other DCs. Here are some of the things I found while troubleshooting.
>
>
> If I nslookup the IP address of my primary DC on one of my other servers, I
> get two records:
>
>
> 25.2.xxx.xx.in-addr.arpa name = hostname.
>
> 25.2.xxx.xx.in-addr.arpa name = FQDN.
>
>
> I only get the FQDN when I look up my other DCs. When I found this, I tried
> to use samba-tool to delete the "hostname." record, but I get a message that
> the record doesn't exist. If I then run samba-tool dns serverinfo hostname, I
> get the following error...
>
>
> ERROR(runtime): uncaught exception - (-1073741643, 'NT_STATUS_IO_TIMEOUT')
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py", line 175, in _run
>     return self.run(*args, **kwargs)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/dns.py", line 703, in run
>     dns_conn = dns_connect(server, self.lp, self.creds)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/dns.py", line 37, in dns_connect
>     dns_conn = dnsserver.dnsserver(binding_str, lp, creds)
>
>
> I then tried checking sam.ldb to see how the record is entered, using
> ldbedit --url=sam.ldb. When I look at its record, there are at least 10
> additional servicePrincipalName lines pointing to an old orphaned DC
> that I had to remove manually using ADSI and AD Sites and Services several
> months back. They are somehow attached to the primary DC's record in sam.ldb
> now. Could this be causing the DNS failure? If so, what if I were to take every
> DC down (over a weekend, of course), then manually edit the record in sam.ldb
> on each DC in turn, making sure that only the one being edited is up at a time,
> and bring them all back online once the changes are complete? The database
> record would then be the same on all DCs, so replication shouldn't cause any
> further damage.
>
Is there any chance you could post the record that you want to alter
(suitably sanitized)? It may be easier to just change the record on the
first DC and then let replication change the others.
Rowland
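
As a rough aid for picking out the stale servicePrincipalName values before posting or editing the record, the Python sketch below lists the SPNs in an LDIF dump whose host component names a given machine. It assumes the record has already been exported as plain LDIF text (for example with ldbsearch). The function names, the `olddc` hostname and the `example.com` domain are invented for illustration and are not taken from the setup described here.

```python
def spn_values(ldif_text):
    """Return the servicePrincipalName values found in one LDIF record."""
    # Unfold LDIF continuation lines: a line starting with a single
    # space continues the previous line.
    unfolded = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    values = []
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "serviceprincipalname":
            continue
        if value.startswith(":"):
            # "attr:: ..." marks a base64 value; not handled in this sketch.
            continue
        values.append(value.strip())
    return values


def spns_naming_host(ldif_text, hostname):
    """Return SPNs whose host part is `hostname` or `hostname.<domain>`."""
    needle = hostname.lower()
    hits = []
    for spn in spn_values(ldif_text):
        # An SPN looks like serviceclass/host[:port][/servicename].
        _, _, rest = spn.partition("/")
        host = rest.split("/")[0].split(":")[0].lower()
        if host == needle or host.startswith(needle + "."):
            hits.append(spn)
    return hits


# Hypothetical, already sanitized record used only to exercise the functions.
SAMPLE = """dn: CN=DC1,OU=Domain Controllers,DC=example,DC=com
servicePrincipalName: HOST/dc1.example.com
servicePrincipalName: HOST/DC1
servicePrincipalName: ldap/olddc.example.com/example.com
servicePrincipalName: HOST/OLDDC
"""

if __name__ == "__main__":
    for spn in spns_naming_host(SAMPLE, "olddc"):
        print(spn)
```

The match is on the host component only, so GUID-based SPNs and entries for the DC's own name are left alone. Whether the flagged entries are safe to remove still has to be judged against the real record.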
> Oddly enough, despite all of this I can still connect to this DC via DNS
> Manager. It's really slow, but I can see all of the records, and I even
> attempted to delete the PTR record for the odd hostname. I got a similar error
> saying that the record does not exist. I can only assume that there is a
> timeout when querying DNS via nslookup that DNS Manager doesn't hit.
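
One way to check whether the DC answers DNS queries on UDP port 53 at all, independently of nslookup and resolv.conf, is to send it a single query with an explicit timeout. The following is a minimal Python sketch using only the standard library. `build_query` and `probe` are illustrative names, not part of Samba, and the sketch only looks at the reply's response code rather than parsing the answer.

```python
import socket
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet: one question, recursion desired."""
    # Header: ID, flags (only the RD bit set), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server, name, timeout=3.0):
    """Send one query to server:53 over UDP and describe what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return "error: %s" % exc
    if len(data) < 4:
        return "malformed reply"
    _reply_id, flags = struct.unpack("!HH", data[:4])
    # The low four bits of the flags word carry the response code.
    return "answered, rcode=%d" % (flags & 0x000F)
```

Calling `probe("192.0.2.10", "dc1.example.com")`, with the DC's real address and a name it should know in place of these placeholders, returns "timeout" if nothing replies within the limit. A longer `timeout` value can show whether the server is merely slow rather than silent.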
>
>
> Is there anything else I may be missing in troubleshooting this problem? If
> needed, I can provide info from resolv.conf and hosts. Any help is appreciated.
>
> Regards,
>
> Jeff
>
> Jeff Donaldson
> Technology Director
> Newark Charter School
> jeff.donaldson at ncs.k12.de.us
> (302) 369-2001 ext: 425