On 18/07/19 13:19, Rowland penny via samba wrote:
> OK, from my understanding DC1 is using the internal dns and DC2 is
> using Bind9.

It's the other way round.
On dc1 port 53 is mapped to /usr/sbin/named -u bind.
On dc2 it's /usr/sbin/samba.
I wasn't sure what to do when I deployed dc2.
I remember installing bind9 on dc2 but then purging it.

BTW - does it matter for replication which backend is being used?
Or is everything expected to fully populate regardless of the DNS
backend choice?

> I would ensure your clients only use DC1

What's the best way to achieve it?
Through a local firewall?

> turn off Bind9 on DC2 and then run samba-upgradedns to use the
> internal dns server, this will cure one of your problems. You may have
> to delete the 'dns-dc2' user manually. There is more to it than just
> renaming 'dns-dc2' to 'dns-dc1'.
>
> If you then want to demote DC2, you will need to get into idmap.ldb
> and make some changes, I would start by trying to change the FSMO role
> holders to DC1, the ultimate aim will be to get replication working
>

I thought the plan was to forcefully demote dc2 and dc1 suffers from too
many config issues to rely on replication.

> speaking of which, have you tried this command:
>
> samba-tool drs replicate ldap://DC2 ldap://DC1 all

Is it safe to run knowing data on both might be over a week out of sync?
What's the worst that can happen?
On 18/07/2019 15:35, Adam Weremczuk via samba wrote:
> On 18/07/19 13:19, Rowland penny via samba wrote:
>
>> OK, from my understanding DC1 is using the internal dns and DC2 is
>> using Bind9.
>
> It's the other way round.
> On dc1 port 53 is mapped to /usr/sbin/named -u bind.
> On dc2 it's /usr/sbin/samba.
> I wasn't sure what to do when I deployed dc2.
> I remember installing bind9 on dc2 but then purging it.

Then you do not need the user 'dns-DC2'.

>
> BTW - does it matter for replication which backend is being used?

All DCs are supposed to replicate to all other DCs.

> Or is everything expected to fully populate regardless of the DNS
> backend choice?

As long as a DC can find the other DCs, replication should occur.

>
>> I would ensure your clients only use DC1
>
> What's the best way to achieve it?
> Through a local firewall?
>
>> turn off Bind9 on DC2 and then run samba-upgradedns to use the
>> internal dns server, this will cure one of your problems. You may
>> have to delete the 'dns-dc2' user manually. There is more to it than
>> just renaming 'dns-dc2' to 'dns-dc1'.
>>
>> If you then want to demote DC2, you will need to get into idmap.ldb
>> and make some changes, I would start by trying to change the FSMO
>> role holders to DC1, the ultimate aim will be to get replication working
>>
> I thought the plan was to forcefully demote dc2 and dc1 suffers from
> too many config issues to rely on replication.

I thought that was the plan as well, but you then seemed to want to try
and fix DC2 so you could demote it. My plan would be to:

TURN OFF DC2

Remove any trace of DC2 from DC1

Run 'samba-tool dbcheck --fix --yes --cross-ncs'

Hopefully this will fix DC1, but your Samba is so old that I cannot
remember if that will run on your DC.

Your main problem is that your DC is in production, which is why I said
to back everything up before you start. I would also do all of this when
your network is down, at the weekend maybe?

Rowland
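The dbcheck step in the plan above can be approached cautiously. What follows is a minimal sketch, assuming samba-tool is on root's PATH on DC1 and that dbcheck exists in the installed version (which, as noted, is not certain). It runs dbcheck once without --fix, which only reports problems, and then runs the repair command from the plan. The `run` helper, the `check_then_fix` function and the `DRY_RUN` variable are hypothetical names introduced for illustration; by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Sketch only: order of the dbcheck passes on DC1, run as root.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_then_fix() {
    # 1. Read-only pass: without --fix, dbcheck reports but changes nothing.
    run samba-tool dbcheck --cross-ncs
    # 2. Repair pass, exactly as suggested in the plan.
    run samba-tool dbcheck --fix --yes --cross-ncs
}

# Example:
#   check_then_fix              # prints the two commands
#   DRY_RUN=0; check_then_fix   # actually runs them (after a backup)
```

Saving the output of the read-only pass before running the repair pass gives a record of what dbcheck found, which may help if the repair has to be discussed on the list afterwards.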
Hi Rowland,

On 18/07/19 15:52, Rowland penny via samba wrote:
> my plan would be to:
>
> TURN OFF DC2

I did it on Friday afternoon after my numerous attempts to demote DC2
failed.
This fixed one issue: the network shares appear again across all
clients.
A new one has been discovered, though, on one of our CentOS 5.11 boxes.
Any command (like sudo or ssh) that needs authentication or user name
lookup takes a long time to complete.
This not only makes working with this machine very difficult but also
causes lots of complex scripts to fail due to timeouts.
Even though DC2 (192.168.8.125) has been powered off for almost 3 days,
I can still see this client trying to connect to it when I ssh from
another terminal:

[root@centos log]# lsof | grep 192.168.8.125
sshd      6630      root    7u     IPv4              24776 0t0        TCP centos.company.co.uk:57423->192.168.8.125:ldap (SYN_SENT)
sshd      6642      root    7u     IPv4              24812 0t0        TCP centos.company.co.uk:57425->192.168.8.125:ldap (SYN_SENT)

At the same time I can see a lot of successful TCP states (ESTABLISHED,
CLOSE_WAIT) against DC1.
Since no configuration changes have been made on this CentOS box, I'm
assuming it must be DC1 advertising DC2 to clients.
Is removing references to DC2 from DC1 the only option to resolve it, or
are there any quick tricks available to try?
E.g. some cache still needs to expire or needs to be forced to do so.

>
> Remove any trace of DC2 from DC1

I'm assuming I need to try exactly the same thing as last time?

ldbedit -e vim -H /var/lib/samba/private/sam.ldb --cross-ncs

Any difference running it with samba running vs samba stopped?
Apart from DDNS updates there should be no modifications made to AD
during the edit process (e.g. no machines or users added or removed, no
passwords changed etc.).

>
> Run 'samba-tool dbcheck --fix --yes --cross-ncs'
>
> Hopefully this will fix DC1, but your Samba is that old, I cannot
> remember if that will run on your DC.
>
> Your main problem is that your DC is in production, that is why I said
> to back everything up before you start.

I've skimmed through:
https://wiki.samba.org/index.php/Back_up_and_Restoring_a_Samba_AD_DC
and my understanding is both online and offline samba-tool backups are
only available in the very latest versions 4.9 and 4.10.
So the only option I have is a manual data backup.
Is it sufficient to back up /var/lib/samba folder (containing *.ldb,
sysvol and netlogon) and restore it entirely if a disaster strikes?
Any benefit of stopping samba before creating a tarball?

Thanks,
Adam
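For the manual backup asked about above, here is a minimal sketch, not a vetted procedure. It assumes the whole state lives under /var/lib/samba as described, and that Samba is stopped first so that no database file is archived mid-write. The function name `backup_samba_dir`, the destination directory and the service name in the usage comment are assumptions made for illustration. Note that the NT ACLs on sysvol are stored as extended attributes, which a plain tar does not capture; whether the tar on this system can preserve them is something to check separately.

```shell
#!/bin/sh
# Sketch only: archive a Samba state directory into a timestamped tarball.
# Paths and the service name in the example are assumptions.

# backup_samba_dir SRC_DIR DEST_DIR
# Writes DEST_DIR/samba-backup-YYYYmmdd-HHMMSS.tar.gz and prints its path.
backup_samba_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/samba-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C makes archived paths relative to the parent of SRC_DIR;
    # -p keeps file permissions, which matter for the private/ directory.
    tar -C "$(dirname "$src")" -czpf "$archive" "$(basename "$src")" || return 1
    # Listing the archive is a cheap check that it can be read back.
    tar -tzf "$archive" >/dev/null || return 1
    echo "$archive"
}

# Example (as root, with Samba stopped):
#   service samba stop          # assumed service name
#   backup_samba_dir /var/lib/samba /root/backups
#   service samba start
```

Configuration that lives outside /var/lib/samba, such as smb.conf, is not covered by this sketch and would need to be copied as well.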