Sören Busse
2024-Feb-03 17:26 UTC
[Samba] Slow ldap membership query in large active directory
Hey there,

we've been using Samba AD DC successfully for about 4 years in our school with about 1000 people. 4 years ago we decided to create a group for each class + subject combination, so we have about 1400 groups with ~30 members each (some are much bigger, with up to 800 people, and others have only a few members).

One of our systems, which uses LDAP, needs to retrieve the gidNumber of all the groups a user is a member of. This request is sent about 3 to 4 times per second (yes, this is a design flaw, but we cannot easily change it or enable caching).

We noticed that the query to get all the gidNumbers of the courses the user is a member of takes about 370ms, while a simple query takes 47ms (including bind/unbind). See the test results below.

Why is a query on the member attribute so expensive? I would have assumed that this very common query would be optimised with something like an index user => [groups], so that you only need to get the gidNumber attribute of the remaining groups. Or maybe there's a faster way to do the query / optimise the LDAP database?

Thank you very much in advance!

---

When doing a very simple LDAP lookup using ldapsearch we get around 47ms of execution time (incl. bind and unbind):

    # time ldapsearch -H ldaps://10.12.100.1:636 -D "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b "OU=myou,DC=subdomain,DC=example,DC=de" "(cn=user.name)"

    real 0m0.047s
    user 0m0.026s
    sys  0m0.009s

When trying to get the gidNumber of all groups the user is a member of, the request takes around 378ms (minus roughly 45ms of bind/unbind overhead):

    # time ldapsearch -H ldaps://10.12.100.1:636 -D "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b "OU=courses,OU=myou,DC=subdomain,DC=example,DC=de" "(&(objectclass=group)(member=CN=user.name,OU=Employees,OU=Users,OU=myou,DC=subdomain,DC=example,DC=de))" gidNumber

    # numResponses: 68
    # numEntries: 67

    real 0m0.378s
    user 0m0.029s
    sys  0m0.012s

When trying to get the gidNumber of all groups (courses), it only takes around 249ms (minus roughly 45ms of bind/unbind overhead). So querying the gidNumber of 1280 groups is faster than querying the gidNumber of only the groups the user is a member of:

    # time ldapsearch -H ldaps://10.12.100.1:636 -D "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b "OU=courses,OU=myou,DC=subdomain,DC=example,DC=de" "(&(objectclass=group))" gidNumber

    # numResponses: 1281
    # numEntries: 1280

    real 0m0.249s
    user 0m0.051s
    sys  0m0.047s

---
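For readers unsure what the "index user => [groups]" in the question means, here is a minimal Python sketch of such a reverse lookup. It only illustrates the idea the poster assumed; it is not a description of how Samba's LDB stores or indexes the member attribute. The group records, DNs and gidNumber values are made up, and lower-casing the DN is a simplification of real DN matching.

```python
from collections import defaultdict

def build_member_index(groups):
    """Map each member DN to the gidNumbers of the groups that list it."""
    index = defaultdict(set)
    for group in groups:
        for member_dn in group["member"]:
            # Simplification: treat DNs as case-insensitive strings.
            index[member_dn.lower()].add(group["gidNumber"])
    return index

def gid_numbers_for(index, user_dn):
    """Return the sorted gidNumbers of all groups containing user_dn."""
    return sorted(index.get(user_dn.lower(), ()))

# Hypothetical sample data, not taken from the poster's directory.
USER = "CN=user.name,OU=Employees,DC=example,DC=de"
GROUPS = [
    {"cn": "math-7a", "gidNumber": 10001, "member": [USER, "CN=other,DC=example,DC=de"]},
    {"cn": "physics-9b", "gidNumber": 10002, "member": ["CN=other,DC=example,DC=de"]},
    {"cn": "art-5c", "gidNumber": 10003, "member": [USER.upper()]},
]

INDEX = build_member_index(GROUPS)
print(gid_numbers_for(INDEX, USER))
```

With an index like this, the cost of a lookup depends on how many groups the user is in rather than on how many groups exist, which is the behaviour the question expected.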
Andrew Bartlett
2024-Feb-05 08:00 UTC
[Samba] Slow ldap membership query in large active directory
On Sat, 2024-02-03 at 18:26 +0100, Sören Busse via samba wrote:
> Hey there,
> we've been using Samba AD DC successfully for about 4 years in our
> school with about 1000 people. 4 years ago we decided to create a
> group for each class + subject combination, so we have about 1400
> groups with ~30 members each (some are much bigger up to 800 people
> and others have only a few members). One of our systems, which uses
> LDAP, needs to retrieve the gidNumber of all the groups a user is a
> member of. This request is sent about 3 to 4 times per second (yes,
> this is a design flaw, but we cannot easily change it or enable
> caching):
> We noticed that the query to get all the gidNumbers of the courses
> the user is a member of takes about 370ms, while a simple query takes
> 47ms (including bind/unbind). See the test results below.
> Why is a query on the member attribute so expensive? I would have
> assumed that this very common query would be optimised like an index
> user => [groups], so that you only need to get the gidNumber
> attribute of the remaining groups. Or maybe there's a faster way to
> do the query / optimise the ldap database?
> Thank you very much in advance!

I'm not totally shocked: DN-based searches are much more expensive, as we have to confirm that the DN exists and isn't deleted, not just match it like an integer.

You can work out where we spend the time if you run the same search locally, using ldbsearch -H /path/to/sam.ldb on a Samba build with debug symbols, and use Brendan Gregg's FlameGraph utility:

https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#Instructions

You can confirm that an index is in use by setting the environment variable LDB_WARN_UNINDEXED=1. This will warn on unindexed searches, but won't warn on poorly indexed searches.

> ---
> When doing a very simple LDAP lookup using ldapsearch we get around
> 47ms of execution time (incl. bind and unbind):
> # time ldapsearch -H ldaps://10.12.100.1:636 -D
> "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b
> "OU=myou,DC=subdomain,DC=example,DC=de" "(cn=user.name)"
> real 0m0.047s
> user 0m0.026s
> sys 0m0.009s
> When trying to get the gidNumber of all groups the user is member of
> this request takes around 378ms (- 45ms roughly bind/unbind
> overhead):
> # time ldapsearch -H ldaps://10.12.100.1:636 -D
> "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b
> "OU=courses,OU=myou,DC=subdomain,DC=example,DC=de"
> "(&(objectclass=group)(member=CN=user.name,OU=Employees,OU=Users,OU=myou,DC=subdomain,DC=example,DC=de))"
> gidNumber

Do try with the terms in the other order. We match left-to-right, so the term that matches the fewest objects should be first. I don't think this is the problem, however.

Try a search with just
'(member=CN=user.name,OU=Employees,OU=Users,OU=myou,DC=subdomain,DC=example,DC=de)'
and LDB_WARN_UNINDEXED=1 to confirm that this term is indexed for you.

> # numResponses: 68
> # numEntries: 67
> real 0m0.378s
> user 0m0.029s
> sys 0m0.012s
> When trying to get the gidNumber of all groups (courses) it only
> takes around 249ms (-45ms bind/unbind overhead). So querying the
> gidNumber of 1280 groups is faster then querying the gidNumber of
> groups where the user is a member:
> # time ldapsearch -H ldaps://10.12.100.1:636 -D
> "CN=Auth-User,CN=Users,DC=subdomain,DC=example,DC=de" -w xxxx -b
> "OU=courses,OU=myou,DC=subdomain,DC=example,DC=de"
> "(&(objectclass=group))" gidNumber
> # numResponses: 1281
> # numEntries: 1280
> real 0m0.249s
> user 0m0.051s
> sys 0m0.047s
> ---

-- 
Andrew Bartlett (he/him)        https://samba.org/~abartlet/
Samba Team Member (since 2001)  https://samba.org
Samba Team Lead                 https://catalyst.net.nz/services/samba
Catalyst.Net Ltd

Proudly developing Samba for Catalyst.Net Ltd - a Catalyst IT group company

Samba Development and Support: https://catalyst.net.nz/services/samba

Catalyst IT - Expert Open Source Solutions
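The checks suggested in this reply (put the most selective term first, then test the member term on its own with LDB_WARN_UNINDEXED=1) can be sketched as a small Python helper. This is only an illustration under stated assumptions: the helper names are made up, DB_PATH is a placeholder that must be replaced with the real location of sam.ldb, and the script only runs ldbsearch if both the binary and that file exist. The -H and -b options and the LDB_WARN_UNINDEXED variable are the ones already used in this thread; the escaping follows RFC 4515.

```python
import os
import shutil
import subprocess

DB_PATH = "/path/to/sam.ldb"  # placeholder, replace with the real sam.ldb path
BASE_DN = "OU=courses,OU=myou,DC=subdomain,DC=example,DC=de"
USER_DN = "CN=user.name,OU=Employees,OU=Users,OU=myou,DC=subdomain,DC=example,DC=de"

def escape_filter_value(value):
    """Escape the characters that RFC 4515 reserves inside filter values."""
    special = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(special.get(ch, ch) for ch in value)

def member_only_filter(user_dn):
    """Filter with just the member term, for the index check."""
    return f"(member={escape_filter_value(user_dn)})"

def membership_filter(user_dn, member_first=True):
    """AND filter with the (more selective) member term first by default."""
    terms = [member_only_filter(user_dn), "(objectClass=group)"]
    if not member_first:
        terms.reverse()
    return "(&" + "".join(terms) + ")"

def ldbsearch_command(db_path, base_dn, ldap_filter, attrs=("gidNumber",)):
    """Build the argv for a local ldbsearch run."""
    return ["ldbsearch", "-H", db_path, "-b", base_dn, ldap_filter, *attrs]

def run_with_unindexed_warning(argv):
    """Run the command with LDB_WARN_UNINDEXED=1 added to the environment."""
    env = dict(os.environ, LDB_WARN_UNINDEXED="1")
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)

for ldap_filter in (member_only_filter(USER_DN), membership_filter(USER_DN)):
    argv = ldbsearch_command(DB_PATH, BASE_DN, ldap_filter)
    print(" ".join(argv))
    # Only execute where ldbsearch and the database are actually present.
    if shutil.which("ldbsearch") and os.path.exists(DB_PATH):
        result = run_with_unindexed_warning(argv)
        print(result.stderr or "no warnings on stderr")
```

Reading sam.ldb directly normally requires root on the DC. Timing each variant several times gives a fairer comparison than a single run.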
Marco Gaiarin
2024-Feb-05 13:41 UTC
[Samba] Slow ldap membership query in large active directory
Hello! Sören Busse via samba wrote on that day:

> We noticed that the query to get all the gidNumbers of the courses the
> user is a member of takes about 370ms, while a simple query takes 47ms
> (including bind/unbind). See the test results below.

I haven't measured it, but I have the impression that I see the same here.

You mention 'gidNumber', so I suppose you are using RFC2307. I don't know whether it is related (though I suspect it is), but offline caching also does not work in RFC2307 mode:

https://bugzilla.samba.org/show_bug.cgi?id=15405

So I can only suppose there's plenty of room for optimization in RFC2307...

-- 
So many people who don't count, and yet they do count, and they are already counting themselves
They are only waiting for a sign, Capatàz (F. De Gregori)