Hi Hubert,
I missed this earlier.
On 19/09/24 03:22, Hubert Kr?ss wrote:
> Hi
>
> We have been experiencing performance issues with the 4 Samba4 (currently
> installed version 4.20.4) AD DC domain controllers for a while. We have tested
> various options, including hidden ones that were not documented. The LDB indices
> help a bit, but typically on Mondays, when the most customers log in, there is
> 100% CPU usage by one LDAP process and high RPC process load.
> The domain controller appears to freeze for a period of time and cannot
> handle requests. Even logging in via SSH with the root user (which authenticates
> through the local passwd) is not possible during this time. The server
> eventually recovers after a while, but of course, this has restrictive effects
> on the entire production environment.
> We believe the issue lies with the LDAP processes, which, it seems, do not
> scale very well. We have up to 16 CPU cores and plenty of RAM on each system. By
> starting with prefork children=64, we spawn some subprocesses, but this
> didn't really solve the performance issue either.
> The underlying operating system is Debian 11, and the ulimits (open files)
> don't seem to be a problem because otherwise dmesg would show kernel
> messages. We also do not have any I/O, RAM, or CPU issues.
> Our environment comprises around 4000-5000 clients spread across 4 domain
> controllers. Most of the clients use Samba file server services and many
> third-party applications that authenticate through LDAP and via saslauthd on
> Active Directory.
> our smb.conf:
>
> [global]
> netbios name = ###
> realm = ###
> workgroup = ###
> server role = active directory domain controller
> idmap_ldb:use rfc2307 = yes
> comment > template homedir = /home/%U
> template shell = /bin/bash
> ldap server require strong auth = No
> # WICHTIG: Radius ntlm_auth
> ntlm auth = Yes
> log level = auth_json_audit:0 auth_audit:3
> #ldb:3@/var/log/ldb.log
> logging = syslog
> password hash gpg key ids = "xyz"
> dns forwarder = a.b.c.d
> dns update command = /usr/local/samba/sbin/samba_dnsupdate --use-samba-tool
> logon script = login.bat
> panic action = /opt/samba/bin/panicRestartSamba.sh
> dns zone transfer clients allow = aaa bbbb
> prefork children = 64
This sounds like a lot, perhaps too many, but I am no expert.
If things are working well, there is no way you need 256 listeners (4 DCs × 64),
and if things are going badly you are surely better off with say 15 overwhelmed
processes instead of 64 (leaving one core for ssh).
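To make the arithmetic concrete, here is a rough Python sketch of the numbers
above. The function names are made up for illustration, and "one worker per
core, minus one kept free" is only the rule of thumb from this reply, not an
official Samba sizing formula.

```python
def total_listeners(num_dcs: int, prefork_children: int) -> int:
    """Worker processes across all DCs, counted the same way as above."""
    return num_dcs * prefork_children


def suggested_prefork_children(cpu_cores: int, reserved_cores: int = 1) -> int:
    """Rule of thumb only: one worker per core, keeping some cores free
    (e.g. one for ssh). Never returns less than 1."""
    return max(1, cpu_cores - reserved_cores)


if __name__ == "__main__":
    # 4 DCs with "prefork children = 64" each
    print(total_listeners(4, 64))
    # 16 cores, one left free for ssh
    print(suggested_prefork_children(16))
```

With 16 cores this suggests trying "prefork children = 15" rather than 64; the
right value for your load would still need measuring.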
> dbindex:objectClass = yes
> dbindex:uid = yes
> dbindex:uidNumber = yes
> dbindex:gidNumber = yes
> dbindex:memberUid = yes
> dbindex:sAMAccountName = yes
> ldb:max-cachesize = 10000000
I'm not sure these options do anything. Something like this

    ldbsearch -s base -b @INDEXLIST

will show you what is indexed. The index list is set by the schema.
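If you want to compare that output against the attributes you listed as
dbindex: options, a small Python sketch like the following might help. It
assumes the search prints an LDIF record whose indexed attributes appear as
@IDXATTR lines; the SAMPLE text is invented for illustration and is not output
from a real DC, and the helper names are hypothetical.

```python
def indexed_attributes(ldif_text: str) -> set[str]:
    """Collect lower-cased attribute names from @IDXATTR lines in LDIF text.

    Base64-encoded values ("@IDXATTR:: ...") are skipped, not decoded.
    """
    attrs = set()
    for line in ldif_text.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "@idxattr":
            continue
        if value.startswith(":"):
            continue  # base64 value, not handled in this sketch
        attrs.add(value.strip().lower())
    return attrs


def missing_indexes(ldif_text: str, wanted: list[str]) -> list[str]:
    """Return the wanted attributes that have no @IDXATTR entry."""
    indexed = indexed_attributes(ldif_text)
    return [attr for attr in wanted if attr.lower() not in indexed]


# Invented sample, standing in for real ldbsearch output.
SAMPLE = """\
dn: @INDEXLIST
@IDXATTR: objectClass
@IDXATTR: sAMAccountName
@IDXATTR: uidNumber
"""

# The attributes from the dbindex: lines in the quoted smb.conf.
WANTED = ["objectClass", "uid", "uidNumber", "gidNumber",
          "memberUid", "sAMAccountName"]

if __name__ == "__main__":
    print(missing_indexes(SAMPLE, WANTED))
```

For real use, replace SAMPLE with the text printed by the ldbsearch command
run against the DC's database.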
> ldap timeout = 2
> ldap replication sleep = 1000
These also are not relevant to the AD DC, as far as I know.
> Are there any performance parameters for LDB databases or an alternative to
LDB for better scalability?
No, there is no alternative to LDB.
Using LMDB as a backend might give you a slight improvement.
4.21 has some LDB optimisations
(https://bugzilla.samba.org/show_bug.cgi?id=15590),
but I don't think that will affect the Monday morning login rush.
Is this the same problem Heinz Hölzl reported two years ago?
Andrew Bartlett offered some ideas for diagnosis:
https://lists.samba.org/archive/samba/2022-September/242020.html
5000 users is not really a huge number, and it should work.
It seems likely some clients are habitually making expensive ldap queries,
but we don't know what they are.
cheers,
Douglas