Fabio Muzzi
2021-May-25 14:47 UTC
[Samba] AD malfunction after an accident - Please help, I'm stuck
I accidentally ran a file-level restore instead of a backup, and all of /etc/samba and /var/lib/samba (along with the whole system disk of my Samba AD DC) got restored to a previous state. The accident happened on May 12, the accidentally restored state is from April, and I have a file-level backup from May 11 that I can use to try to re-revert things.

Samba is the Debian 10 package (version 4.9.5-Debian), and we have only one DC (the corrupted one).

After the accident some weird things began to happen to the domain: some users' passwords were reset to the old ones. That is what made us investigate and discover the accident, and when and how it happened.

It's now May 25 and the current situation is:

- The DC has NOT been restarted, for fear of total disaster.
- PCs that were in the domain before April can still log in to the domain.
- PCs that were joined after April lost their trust relationship (of course).
- Users that were in the domain before April got their passwords rolled back, but can still log in.
- New PCs can join the domain, but then no domain user can log in on them: I get a missing trust relationship for the PC in the domain.
- New users can be created and can access file shares from a non-domain PC, but cannot log on from a domain PC (user does not exist or wrong password).
- On the DC, the command "id <username>" gives the expected result for an "old" user, but says "no such user" for a "new" user. The same happens for PCs joined to the domain.
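
The "id" check described above could be scripted roughly as follows. This is only a sketch: the function name check_accounts is made up for illustration, and the account names passed to it are placeholders, not real accounts from this domain. It does nothing beyond calling "id" for each name and reporting whether the lookup succeeds.

```shell
#!/bin/sh
# Hypothetical helper: report, for each account name given as an argument,
# whether `id` can resolve it on this host. Read-only; it changes nothing.
check_accounts() {
    for name in "$@"; do
        if id "$name" >/dev/null 2>&1; then
            echo "resolves: $name"
        else
            echo "no such user: $name"
        fi
    done
}

# Run the check on whatever names were passed to the script (none = no output).
check_accounts "$@"
```

Saved under a name of your choosing and run on the DC with one "old" and one "new" account name as arguments, it prints one line per name, which makes it easy to list exactly which accounts fail to resolve.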

On the DC, samba-tool dbcheck and samba-tool dbcheck --cross-ncs give no errors:

    # samba-tool dbcheck
    Checking 369 objects
    Checked 369 objects (0 errors)
    # samba-tool dbcheck --cross-ncs
    Checking 3597 objects
    Checked 3597 objects (0 errors)

I have made a backup using samba-tool:

    samba-tool domain backup online --targetdir=/root/samba-backup --server=silos -UAdministrator

It completes without errors, but I get 2 anomalies. The first is this:

    Unable to determine the DomainSID, can not enforce uniqueness constraint on local domainSIDs

And the other is this:

    Replicating critical objects from the base DN of the domain
    Partition[DC=ad,DC=galileo,DC=lan] objects[96/96] linked_values[23/23]
    Partition[DC=ad,DC=galileo,DC=lan] objects[465/369] linked_values[435/435]

You see that the last line says objects[465/369]? That is more objects than the expected total.

Now I'm here, with 50 people working on this domain, and I have to fix it.

New users and new PCs do appear in "Active Directory Users and Computers" (run from a Windows PC that still works on the domain), but they do not work properly: "id" does not find them on the Linux DC, the PCs have no trust relationship, and the users cannot log in from a domain PC. It seems to me that if I could fix this, I could recover from the damage. But I don't know where to start; I'm no AD expert.

Reverting to May 11 is another option, but only if it fixes the new users / new PCs issue; otherwise I'll be in the same situation as now. The problem is that my May 11 backup is a file-level backup and may be inconsistent domain-wise, just like a database that is backed up while it is running.

What can I try?

-- 
Fabio Muzzi