hi Hubert,

I missed this earlier.

On 19/09/24 03:22, Hubert Kr?ss wrote:
> Hi
>
> We have been experiencing performance issues with the 4 Samba4 (currently installed version 4.20.4) AD DC domain controllers for a while. We have tested various options, including hidden ones that were not documented. The LDB indices help a bit, but typically on Mondays, when most customers log in, there is 100% CPU usage by one LDAP process and high RPC process load.
> The domain controller appears to freeze for a period of time and cannot handle requests. Even logging in via SSH with the root user (which authenticates through the local passwd) is not possible during this time. The server eventually recovers after a while, but of course, this has restrictive effects on the entire production environment.
> We believe the issue lies with the LDAP processes, which, it seems, do not scale very well. We have up to 16 CPU cores and plenty of RAM on each system. By starting with prefork children=64, we spawn some subprocesses, but this didn't really solve the performance issue either.
> The underlying operating system is Debian 11, and the ulimits (open files) don't seem to be a problem because otherwise dmesg would show kernel messages. We also do not have any I/O, RAM, or CPU issues.
> Our environment comprises around 4000-5000 clients spread across 4 domain controllers. Most of the clients use Samba file server services and many third-party applications that authenticate through LDAP and via saslauthd on Active Directory.
>
> our smb.conf:
>
> [global]
> netbios name = ###
> realm = ###
> workgroup = ###
> server role = active directory domain controller
> idmap_ldb:use rfc2307 = yes
> comment
> template homedir = /home/%U
> template shell = /bin/bash
> ldap server require strong auth = No
> # WICHTIG: Radius ntlm_auth
> ntlm auth = Yes
> log level = auth_json_audit:0 auth_audit:3
> #ldb:3@/var/log/ldb.log
> logging = syslog
> password hash gpg key ids = "xyz"
> dns forwarder = a.b.c.d
> dns update command = /usr/local/samba/sbin/samba_dnsupdate --use-samba-tool
> logon script = login.bat
> panic action = /opt/samba/bin/panicRestartSamba.sh
> dns zone transfer clients allow = aaa bbbb
> prefork children = 64

This sounds like a lot, perhaps too many, but I am no expert.

If things are working well, there is no way you need 256 listeners (4 DCs × 64), and if things are going badly you are surely better off with say 15 overwhelmed processes instead of 64 (leaving one core for ssh).

> dbindex:objectClass = yes
> dbindex:uid = yes
> dbindex:uidNumber = yes
> dbindex:gidNumber = yes
> dbindex:memberUid = yes
> dbindex:sAMAccountName = yes
> ldb:max-cachesize = 10000000

I'm not sure these options do anything. Something like this

    ldbsearch -s base -b @INDEXLIST

will show you what is indexed. It is set by the schema.

> ldap timeout = 2
> ldap replication sleep = 1000

These also are not relevant to the AD DC, as far as I know.

> Are there any performance parameters for LDB databases or an alternative to LDB for better scalability?

No, there is no alternative to LDB.

Using LMDB as a backend might give you a slight improvement.

4.21 has some LDB optimisations (https://bugzilla.samba.org/show_bug.cgi?id=15590), but I don't think that will affect the Monday morning login rush.

Is this the same problem Heinz Hölzl reported two years ago?
Andrew Bartlett offered some ideas for diagnosis:

https://lists.samba.org/archive/samba/2022-September/242020.html

5000 users is not really a huge number, and it should work.

It seems likely some clients are habitually making expensive ldap queries, but we don't know what they are.

cheers,
Douglas
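
For reference, the arithmetic behind the "prefork children" remark can be written out as a small Python sketch. The function name and the reserved_cores parameter are made up for illustration, and "cores minus one" is only the rule of thumb suggested in the message above, not an official Samba sizing formula.

```python
def prefork_summary(dcs, children_per_dc, cores_per_dc, reserved_cores=1):
    """Return the current total listener count and a per-DC worker
    count capped by the cores left after reserving some (e.g. for ssh)."""
    total_listeners = dcs * children_per_dc
    # Never go below one worker, even on a single-core machine.
    capped = max(1, cores_per_dc - reserved_cores)
    return {
        "total_listeners": total_listeners,
        "suggested_children": min(children_per_dc, capped),
    }


# Numbers from the thread: 4 DCs, prefork children = 64, 16 cores each.
summary = prefork_summary(dcs=4, children_per_dc=64, cores_per_dc=16)
print(summary)
```

With the figures from the thread this gives 256 listeners in total and a suggested value of 15 per DC, matching the numbers quoted above.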
On 05/10/2024 at 02:48, Douglas Bagnall via samba wrote:
> hi Hubert,
>
> I missed this earlier.

Hello,

I experienced the same kind of problem. Our DCs were overloaded by certain requests.

Running the DC with a sufficient debug level immediately showed two problems:

- requests on big groups (70 000 members) with member attributes
- requests with * in filters.

These requests were taking from 1 to 10 s.

Reconfiguring applications (Keycloak in our case) and rewriting our custom PHP application to avoid these kinds of requests where possible definitely solved the problem: all requests are now below 10 ms, and everything works.

Denis
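
A rough way to spot the two patterns described above in an application's own searches, before they reach the DC, is sketched below in Python. This is only an illustration under assumptions: the function names are hypothetical, the regular expression handles simple "(attr=value)" assertions only, and "with member attributes" is read here as "the search asks for the member attribute". It does not talk to Samba or to any LDAP library.

```python
import re

# Matches simple "(attr=value)" assertions inside an LDAP filter string.
# Extensible-match and other exotic forms are deliberately ignored.
ASSERTION = re.compile(r"\(([A-Za-z][\w;.-]*)=([^()]*)\)")


def wildcard_assertions(ldap_filter):
    """Return (attribute, value) pairs whose value contains '*'."""
    return [(a, v) for a, v in ASSERTION.findall(ldap_filter) if "*" in v]


def review_search(ldap_filter, attributes):
    """Return human-readable warnings for one search request."""
    warnings = []
    for attr, value in wildcard_assertions(ldap_filter):
        warnings.append(f"wildcard in filter: ({attr}={value})")
    lowered = [a.lower() for a in attributes]
    if not attributes or "*" in lowered:
        # An empty attribute list means "all user attributes" in LDAP.
        warnings.append("all attributes requested; 'member' of large groups is included")
    elif "member" in lowered:
        warnings.append("'member' requested; expensive on very large groups")
    return warnings


for warning in review_search("(&(objectClass=user)(cn=*smith*))", ["member"]):
    print(warning)
```

A search such as (sAMAccountName=jdoe) asking only for mail produces no warnings, which is the kind of narrow request the applications were reconfigured to send.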
On 05-10-2024 02:48, Douglas Bagnall via samba wrote:
> 5000 users is not really a huge number, and it should work.
>
> It seems likely some clients are habitually making expensive ldap queries,
> but we don't know what they are.

There was a thread on the mailing list not so long ago with someone from Tranquil IT about an LDB performance issue. Apart from that, they have a lot of experience with big Samba environments. Perhaps it is worth a query.

- Kees.