The issue is (almost) solved.
As the repair process I described earlier shows, it is not clear what
actually fixed it. Perhaps only the big clean-up was necessary, perhaps the
forced synchronisation of a first DC was necessary; I have no idea.
Anyway, replication is working, almost.
On 4 of the 5 DCs:
ldbsearch -H $sam 'objectclass=*' dn | tail -3
# returned 50968 records
# 50965 entries
# 3 referrals
On the fifth DC one entry is missing. This DC is one of the two most
recently added. I expect that entry won't stay missing for long :)
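
To spot which DC lags behind, the trailer printed by ldbsearch can be
reduced to one number per DC and the numbers compared. The following is a
minimal sketch, not something that was run in this thread: the helper names
entry_count and report_lagging are made up for illustration, and the
"dcname count" input format of report_lagging is an assumption.

```shell
#!/bin/sh
# Sketch only: helper names and the "dcname count" input format are
# illustrative assumptions, not part of Samba.

# Read ldbsearch output on stdin and print N from the "# N entries"
# trailer line (prints nothing if no such line is found).
entry_count() {
    awk '/^# [0-9]+ entries$/ { n = $2 } END { if (n != "") print n }'
}

# Read "dcname count" lines on stdin and print every DC whose count is
# below the highest count seen, with the size of the gap.
report_lagging() {
    awk '
        { name[NR] = $1; count[NR] = $2 + 0
          if (count[NR] > max) max = count[NR] }
        END {
            for (i = 1; i <= NR; i++)
                if (count[i] < max)
                    printf "%s: %d missing\n", name[i], max - count[i]
        }
    '
}
```

On each DC, piping the ldbsearch command above into entry_count (instead
of tail -3) gives the number to collect; the collected "dcname count" lines
can then be fed to report_lagging. Equal counts do not prove the databases
are identical, they only make an obvious gap visible.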
What's good about all that? A broken DC can, in certain cases, be repaired
without much knowledge of Samba or AD: Samba is robust enough.
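
For anyone who wants to retry the forced pull quoted below on another DC,
here is a minimal sketch that only assembles the command line. The helper
name replicate_cmd is hypothetical; the switches are copied as-is from the
quoted message, and whether all of them are accepted may depend on the
Samba version.

```shell
#!/bin/sh
# Sketch only: replicate_cmd is an illustrative helper, not a Samba tool.
# It prints the command instead of running it, so it can be reviewed first.
# Arguments: destination DC, source DC, naming context (partition DN).
# The switches are the ones used in the quoted message.
replicate_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: replicate_cmd DEST_DC SOURCE_DC NC" >&2
        return 1
    fi
    printf 'samba-tool drs replicate %s %s --add-ref --sync-forced --sync-all --full-sync --local --kerberos yes %s\n' \
        "$1" "$2" "$3"
}
```

Calling replicate_cmd m704 m702 DC=samba,DC=domain,DC=tld prints the same
command as in the quoted message; it has to be run on the destination DC
because of --local.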
Have a nice day all!
mathias
2015-11-24 15:29 GMT+01:00 mathias dufresne <infractory at gmail.com>:
> Hi all,
>
> Thank you for the tip, Andrew. Unfortunately I have only vague notions of
> C, and these notions are growing old. I was not able to understand anything
> there except that the function descriptor_modify(module, sub_req) was not
> successful, without the slightest idea of what the variables module or
> sub_req could contain.
>
> Anyway, I have been working on this for the last few hours.
> I first opened the DNS tool from RSAT on Windows and removed all traces of
> the old DCs, everywhere.
> I searched the DB for '(invocationId=*)' --cross-ncs objectguid and also
> manually removed references to the old DCs.
> I looked for CN=oldDCname and also removed the few remaining entries and
> their children.
>
> I also used an awk script to force creation of DNS entries mentioned by
> "samba_dnsupdate --verbose --all-names", on all DCs.
>
> And I have no idea if this helps.
>
> Anyway I finally tried to run:
> samba-tool drs replicate m704 m702 --add-ref --sync-forced --sync-all
> --full-sync *--local* --kerberos yes DC=samba,DC=domain,DC=tld
>
> Before that I was trying different ways to run drs replicate but always
> without that --local switch.
>
> And with that --local switch the DB was eventually replicated from m702
> (the apparently broken FSMO owner) to m704 (the local server where the
> command was run).
>
> Better: after that I added some new users on m702 using an LDIF file and
> ldbadd, and this change was automatically replicated to m704.
>
> So I installed another DC, joined it and waited several minutes, less than
> 30 minutes, which should have been enough, and no replication happened. The
> whole DB on that new DC contained 264 entries.
> As I'm not too patient, I also ran a "samba-tool drs replicate..."
> including --local on that new server, and I stopped the command a few
> seconds after launching it. The replication process started after that. Did
> the replication start because of the command or because it just needed time
> to start? No idea yet. That's why I've installed another DC and joined it,
> and now I will wait until tomorrow morning to see whether the replication
> process starts by itself or not.
>
> Of course, in the middle of all that I restarted samba and also the
> servers, both the newly added DC and the old ones.
>
> I'll be back tomorrow or earlier to say whether the replication process
> started by itself or not.
>
> Best regards,
>
> mathias
>
>
> 2015-11-24 8:51 GMT+01:00 Andrew Bartlett <abartlet at samba.org>:
>
>> On Mon, 2015-11-16 at 16:50 +0100, mathias dufresne wrote:
>> > transaction: operations error at
>> > ../source4/dsdb/samdb/ldb_modules/descriptor.c:1147
>>
>> Looking at that line in your version of Samba may give you some idea
>> why it failed.
>>
>> Andrew Bartlett
>>
>> --
>> Andrew Bartlett http://samba.org/~abartlet/
>> Authentication Developer, Samba Team http://samba.org
>> Samba Developer, Catalyst IT
>> http://catalyst.net.nz/services/samba