On 05.06.2018 20:39, lingpanda101 wrote:
> On 6/5/2018 2:11 PM, Ole Traupe via samba wrote:
>> Hi list,
>>
>> I have a domain in production on two sites (subnets, via "Sites and
>> Services") with originally two DCs. One went down due to HDD (-> old
>> hardware) error. Now, occasionally, clients can't access/find the file
>> server (domain member). This does not occur on all clients at the
>> same time, however, so I am rather sure it is not the file server
>> itself, but a DNS problem.
>>
>> I couldn't find anything diagnostic in the logs. Default log level
>> was not informative, I think, while log level 10 I just could not
>> handle/analyze properly.
>>
>> Can someone recommend a log level? Should I look on the DC or on the
>> file server?
>>
>> Do I have to remove the offline DC completely from DNS and Sites and
>> Services for this mess to stop?
>>
>> I appreciate any advice.
>>
>> Cheers,
>> Ole
>>
> Ole,
>
> If you haven't already removed the dead DC from your network you
> should do that first.
>
> https://wiki.samba.org/index.php/Demoting_a_Samba_AD_DC
>
> Your clients' DNS may still be pointing to the offline DC, causing
> lookup delays. Also, did you have your DCs pointing to themselves for
> DNS or each other?
>

Thank you for your help!

I had trouble with fail-safe tests regarding DC redundancy a while ago.
Some time after discussing it here on the list I finally got it working
(it had something to do with IPv6). So I can say I have tested the
absence of a DC, and it did not lead to any trouble (except for a very
short moment, supposedly due to DNS caching). Now it does, which is
weird.

When the drive errors on the now broken DC manifested, the domain acted
weirdly. When I took that DC completely offline, everything went back to
normal. Now issues are showing up. So much for the background.

The current situation is very much like in the fail-safe tests, with two
exceptions: the remaining DC (FSMO role holder) is the primary DNS
server on all Windows machines, and I updated the resolv.conf on that DC
to point only to itself. This DC and several Windows clients got
restarted after that, but the issues persist.

Actually, the DCs (resolv.conf) were pointing to each other initially,
and I think that was at least one root of the evil. I think this advice
in the Samba wiki actually is rather bad (and unnecessary with Samba, as
has been pointed out before?).

Regarding demoting the dead DC: My Samba version is rather old (4.2.5).
The problem is that I chose the uid/gid scopes unwisely. And I read in
some patch notes that I can't update anymore, because newer versions of
Samba actually require those scopes to be set in a very specific way. So
perhaps demoting via the newly available method is not an option here.

What I can think of is:
- removing the dead DC from the clients' DNS config, of course
- removing it from AD DNS
- removing it from AD Sites and Services
- and removing it from AD Users and Computers

What else does the Samba script for demoting a DC do? Can I do that
manually, too? I repeat: it was not the FSMO role holder.

Thanks again for any advice!
Ole

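The resolv.conf change described above (the remaining DC pointing only to itself) can be double-checked mechanically. What follows is only a minimal sketch, not anything taken from the Samba wiki: the function names, the sample file content and the IP addresses (10.0.1.10 for the live DC, 10.0.2.10 for the dead one) are all assumptions made up for illustration.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments (resolv.conf uses '#' or ';') and surrounding whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def stale_nameservers(resolv_conf_text, dead_ips):
    """Return the listed nameservers that belong to servers known to be offline."""
    dead = set(dead_ips)
    return [ip for ip in nameservers(resolv_conf_text) if ip in dead]


# Hypothetical file content: the dead DC (10.0.2.10) is still listed first.
SAMPLE = """\
search samdom.example.com
nameserver 10.0.2.10
nameserver 10.0.1.10
"""

if __name__ == "__main__":
    print(nameservers(SAMPLE))
    print(stale_nameservers(SAMPLE, ["10.0.2.10"]))
```

On a real machine the text would come from reading /etc/resolv.conf. An empty result from `stale_nameservers` only says that the resolver configuration no longer names the dead DC; it says nothing about records inside AD DNS.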
On 6/6/2018 4:54 AM, Ole Traupe via samba wrote:
>
>
> On 05.06.2018 20:39, lingpanda101 wrote:
>> On 6/5/2018 2:11 PM, Ole Traupe via samba wrote:
>>> Hi list,
>>>
>>> I have a domain in production on two sites (subnets, via "Sites and
>>> Services") with originally two DCs. One went down due to HDD (-> old
>>> hardware) error. Now, occasionally, clients can't access/find the
>>> file server (domain member). This does not occur on all clients at
>>> the same time, however, so I am rather sure it is not the file
>>> server itself, but a DNS problem.
>>>
>>> I couldn't find anything diagnostic in the logs. Default log level
>>> was not informative, I think, while log level 10 I just could not
>>> handle/analyze properly.
>>>
>>> Can someone recommend a log level? Should I look on the DC or on the
>>> file server?
>>>
>>> Do I have to remove the offline DC completely from DNS and Sites and
>>> Services for this mess to stop?
>>>
>>> I appreciate any advice.
>>>
>>> Cheers,
>>> Ole
>>>
>> Ole,
>>
>> If you haven't already removed the dead DC from your network you
>> should do that first.
>>
>> https://wiki.samba.org/index.php/Demoting_a_Samba_AD_DC
>>
>> Your clients' DNS may still be pointing to the offline DC, causing
>> lookup delays. Also, did you have your DCs pointing to themselves for
>> DNS or each other?
>>
>
> ** SNIP **
>
> Actually, the DCs (resolv.conf) were pointing to each other initially,
> and I think that was at least one root of the evil. I think this
> advice in the Samba wiki actually is rather bad (and unnecessary with
> Samba, as has been pointed out before?).

Using BIND, I find it's necessary to point the DC to itself. I had no
issues pointing to another DC with the internal DNS. The wiki actually
mentions best practice for a multi-DC environment as it relates to a
Windows setup. I do think it's unnecessary with Samba, however.

>
> Regarding demoting the dead DC: My Samba version is rather old
> (4.2.5). The problem is that I chose the uid/gid scopes unwisely. And
> I read in some patch notes that I can't update anymore, because newer
> versions of Samba actually require those scopes to be set in a very
> specific way. So perhaps demoting via the newly available method is
> not an option here.

Can you repair or replace the dead DC with a current Samba version?
Join, then transfer the FSMO roles? I would advise not using the same
hostname.

>
> What I can think of is:
> - removing the dead DC from the clients' DNS config, of course
> - removing it from AD DNS
> - removing it from AD Sites and Services
> - and removing it from AD Users and Computers

Yes to all of the above. The key is to remove all service records in DNS
that reference the bad DC. It's easier to use RSAT for this. Make sure
you also remove all NTDS connections that reference the dead DC. See the
wiki, as it does a good job of showing an example of running
'# samba-tool domain demote --remove-other-dead-server=DC2'. It shows
all that seems necessary.

>
> What else does the Samba script for demoting a DC do? Can I do that
> manually, too? I repeat: it was not the FSMO role holder.

I don't know.

> Thanks again for any advice!
> Ole
>

-JAMES

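To get an overview of which service records still reference the bad DC before deleting them in RSAT, the output of SRV lookups can be filtered. This is a rough sketch under stated assumptions: it expects lines in the style printed by `host -t SRV <name>`, and the zone `samdom.example.com`, the host names and the function name are hypothetical. It only lists matches and does not delete anything.

```python
def srv_records_for_host(host_output, dead_fqdn):
    """Return (owner_name, port) for every SRV line whose target is dead_fqdn.

    Assumed line format (as printed by `host -t SRV`):
    "<owner> has SRV record <priority> <weight> <port> <target>."
    """
    wanted = dead_fqdn.rstrip(".").lower()
    hits = []
    for line in host_output.splitlines():
        fields = line.split()
        # Skip anything that is not an SRV answer line in the assumed format.
        if len(fields) != 8 or fields[1:4] != ["has", "SRV", "record"]:
            continue
        owner, port, target = fields[0], int(fields[6]), fields[7]
        # DNS names are case-insensitive; the trailing dot is optional.
        if target.rstrip(".").lower() == wanted:
            hits.append((owner, port))
    return hits


# Hypothetical lookup results for a domain with DC1 (alive) and DC2 (dead).
SAMPLE = """\
_ldap._tcp.samdom.example.com has SRV record 0 100 389 dc1.samdom.example.com.
_ldap._tcp.samdom.example.com has SRV record 0 100 389 dc2.samdom.example.com.
_kerberos._tcp.samdom.example.com has SRV record 0 100 88 dc2.samdom.example.com.
"""

if __name__ == "__main__":
    for owner, port in srv_records_for_host(SAMPLE, "DC2.samdom.example.com"):
        print(owner, port)
```

The A, CNAME and NS records that name the dead DC are not covered by this filter and would have to be checked separately.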
On 06.06.2018 14:44, lingpanda101 wrote:
>
>> ** SNIP **
>>
>> Actually, the DCs (resolv.conf) were pointing to each other
>> initially, and I think that was at least one root of the evil. I
>> think this advice in the Samba wiki actually is rather bad (and
>> unnecessary with Samba, as has been pointed out before?).
> Using BIND, I find it's necessary to point the DC to itself. I had no
> issues pointing to another DC with the internal DNS. The wiki actually
> mentions best practice for a multi-DC environment as it relates to a
> Windows setup. I do think it's unnecessary with Samba, however.

I fear it is counterproductive in case you lose the other DC that the
one DC is pointing to.

>>
>> Regarding demoting the dead DC: My Samba version is rather old
>> (4.2.5). The problem is that I chose the uid/gid scopes unwisely. And
>> I read in some patch notes that I can't update anymore, because newer
>> versions of Samba actually require those scopes to be set in a very
>> specific way. So perhaps demoting via the newly available method is
>> not an option here.
> Can you repair or replace the dead DC with a current Samba version?
> Join, then transfer the FSMO roles? I would advise not using the same
> hostname.

I plan on replacing the dead DC very soon; the hardware is being
shipped. I seem to remember having read here on the list that it is not
a good idea to mix Samba versions in a domain. If there is sound advice
to do it anyway, I would be up for trying it. However, as I have written
above, I messed up the uid/gid ranges. To my understanding, later
versions of Samba (like 4.5) _require_ the ranges to comply with the
defaults as given in the wiki.

>>
>> What I can think of is:
>> - removing the dead DC from the clients' DNS config, of course
>> - removing it from AD DNS
>> - removing it from AD Sites and Services
>> - and removing it from AD Users and Computers
> Yes to all of the above. The key is to remove all service records in
> DNS that reference the bad DC. It's easier to use RSAT for this. Make
> sure you also remove all NTDS connections that reference the dead DC.
> See the wiki, as it does a good job of showing an example of running
> '# samba-tool domain demote --remove-other-dead-server=DC2'.
> It shows all that seems necessary.

I will do that. I am using RSAT. Would I remove the complete site
associated with the dead DC? Or which containers/objects in particular?

>>
>> What else does the Samba script for demoting a DC do? Can I do that
>> manually, too? I repeat: it was not the FSMO role holder.
> I don't know.

Thank you very much, James!

>> Thanks again for any advice!
>> Ole
>>
>
> -JAMES
>
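Regarding the question above about which containers/objects are involved: in a standard AD layout a DC is typically referenced by its computer account, by a server object under its site, and by the NTDS Settings object below that server object. The site container itself is a separate object. The sketch below just builds those distinguished names so they can be looked up in RSAT or ADSI Edit. It assumes a single-domain forest (so the Configuration partition sits directly under the domain DN); `DC2`, `Site2` and `samdom.example.com` are made-up names, and it does not claim to list everything the demote script touches.

```python
def domain_dn(dns_domain):
    """Turn 'samdom.example.com' into 'DC=samdom,DC=example,DC=com'."""
    return ",".join("DC=" + label for label in dns_domain.split("."))


def dc_object_dns(dc_name, site_name, dns_domain):
    """Return the usual directory objects tied to one DC, keyed by role.

    Assumes a single-domain forest and the default 'Domain Controllers' OU.
    """
    base = domain_dn(dns_domain)
    server = (
        f"CN={dc_name},CN=Servers,CN={site_name},"
        f"CN=Sites,CN=Configuration,{base}"
    )
    return {
        # Shown in AD Users and Computers.
        "computer_account": f"CN={dc_name},OU=Domain Controllers,{base}",
        # Shown in AD Sites and Services under the DC's site.
        "site_server": server,
        # Child of the server object; replication connections hang off it.
        "ntds_settings": f"CN=NTDS Settings,{server}",
    }


if __name__ == "__main__":
    # Hypothetical names for the dead DC and its site.
    for role, dn in dc_object_dns("DC2", "Site2", "samdom.example.com").items():
        print(f"{role}: {dn}")
```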