f2012 f1
2025-Dec-15 07:40 UTC
[Samba] Question on Safety of Removing TDB Read Locks for LDAP Queries in Large Samba AD DC
Hello Samba Team,

I am running several Samba Active Directory Domain Controller clusters in a cloud environment, each serving approximately 80,000 users.

We have an application that performs LDAP queries requiring a full directory scan. Unfortunately, the application vendor is unwilling to modify or optimize their query. When these LDAP queries are executed, the Samba process frequently reaches 100% CPU usage, which in turn causes other directory operations (such as modifications and writes) to be delayed by up to 10 seconds.

During investigation, I observed that TDB read locking appears to be a significant bottleneck. As an experiment, I modified the locking behavior to remove the read lock for LDAP queries, and this change effectively eliminated the CPU spike and the associated delays.

My questions are:

1. Is it safe to remove or relax TDB read locks for LDAP queries in Samba AD DC?
2. What potential data consistency, correctness, or replication risks might this introduce, especially in a multi-DC environment?
3. Is there an officially supported or recommended approach to handle heavy full-directory LDAP scans at this scale (e.g., configuration tuning, indexing strategies, MDB usage, read replicas, or architectural changes)?
4. Are there any known design constraints in Samba that would make such locking changes fundamentally unsafe?

I understand that modifying Samba's locking behavior is not ideal, and I would strongly prefer a supported solution or architectural recommendation if one exists. However, given the constraints imposed by the application vendor, I would appreciate guidance on whether this approach is fundamentally unsafe, or if there are alternative mitigations I should consider.

Thank you very much for your time and for the excellent work on Samba.

Samba version: The latest

Best regards
Douglas Bagnall
2025-Dec-17 03:25 UTC
[Samba] Question on Safety of Removing TDB Read Locks for LDAP Queries in Large Samba AD DC
On 15/12/2025 20:40, f2012 f1 via samba wrote:
> Hello Samba Team,
>
> I am running several Samba Active Directory Domain Controller clusters in a cloud environment, each serving approximately 80,000 users.
>
> We have an application that performs LDAP queries requiring a full directory scan. Unfortunately, the application vendor is unwilling to modify or optimize their query. When these LDAP queries are executed, the Samba process frequently reaches 100% CPU usage, which in turn causes other directory operations (such as modifications and writes) to be delayed by up to 10 seconds.
>
> During investigation, I observed that TDB read locking appears to be a significant bottleneck. As an experiment, I modified the locking behavior to remove the read lock for LDAP queries, and this change effectively eliminated the CPU spike and the associated delays.
>
> My questions are:
>
> 1. Is it safe to remove or relax TDB read locks for LDAP queries in Samba AD DC?

No. The problem is an ldap search can cause multiple tdb searches, but the ldb layer has to assume consistency across these. It might work for a while, but perhaps only a short while on a domain where locks are causing 10 second delays.

> 2. What potential data consistency, correctness, or replication risks might this introduce, especially in a multi-DC environment?

There is a bit about this under "NEW FEATURES/CHANGES" in https://www.samba.org/samba/history/samba-4.7.0.html which can be summarised as broken replication, nonsense results, and segfaults. That is comparing whole DB locks to the old per-record locks. No-locks will likely have new problems.

> 3. Is there an officially supported or recommended approach to handle heavy full-directory LDAP scans at this scale (e.g., configuration tuning, indexing strategies, MDB usage, read replicas, or architectural changes)?

Switching to the LMDB backend has worked very well for other people in similar situations. I have heard of two minute delays being turned into nothing, and CPU load dropping "significantly".

> 4. Are there any known design constraints in Samba that would make such locking changes fundamentally unsafe?

Linked attributes and indexes require synchronised access. This is an AD constraint that we can't get out of, though of course the way we do it can be changed.

> I understand that modifying Samba's locking behavior is not ideal, and I would strongly prefer a supported solution or architectural recommendation if one exists. However, given the constraints imposed by the application vendor, I would appreciate guidance on whether this approach is fundamentally unsafe, or if there are alternative mitigations I should consider.

Try LMDB! Beyond that you'd want to pay someone to look at what is going on. There are certainly still performance improvements we could make.

cheers,
Douglas Bagnall
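
To illustrate the consistency point made in the reply (one LDAP search can turn into several lower-level lookups that must all see the same database state), here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyStore`, `search`, `unlocked_search` and `locked_search` are hypothetical names invented for this example, and nothing here is the ldb or tdb API. The "read lock" is modelled simply by making a writer wait until the search has finished.

```python
# Toy model only: hypothetical names, not Samba's ldb/tdb API.

class ToyStore:
    """A record table plus a secondary index, as two separate lookups."""

    def __init__(self):
        self.records = {}  # dn -> record
        self.index = {}    # department -> list of dns

    def add(self, dn, dept):
        self.records[dn] = {"dn": dn, "department": dept}
        self.index.setdefault(dept, []).append(dn)

    def delete(self, dn):
        # A delete touches both the record table and the index.
        dept = self.records.pop(dn)["department"]
        self.index[dept].remove(dn)


def search(store, dept, between_steps=None):
    """A two-step search: index lookup, then one fetch per matching dn."""
    dns = list(store.index.get(dept, []))      # step 1: read the index
    if between_steps is not None:
        between_steps()                        # whatever happens mid-search
    return [store.records.get(dn) for dn in dns]  # step 2: fetch records


def build_store():
    store = ToyStore()
    store.add("cn=alice", "eng")
    store.add("cn=bob", "eng")
    return store


def unlocked_search():
    """No read lock: a writer deletes a record between the two steps."""
    store = build_store()
    return search(store, "eng", between_steps=lambda: store.delete("cn=bob"))


def locked_search():
    """Read lock held: the writer is queued and runs after the search."""
    store = build_store()
    pending = []
    results = search(store, "eng",
                     between_steps=lambda: pending.append("cn=bob"))
    for dn in pending:  # lock released, queued write now applies
        store.delete(dn)
    return results


if __name__ == "__main__":
    print("without lock:", unlocked_search())
    print("with lock:   ", locked_search())
```

In the unlocked case the index says two entries match, but one of them has vanished by the time it is fetched, so the search returns a hole where a record should be. In this toy that is just a `None`; the reply describes the real-world equivalents as broken replication, nonsense results, and segfaults.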