George Daswani
2006-Aug-02 22:58 UTC
[Fedora-directory-users] Odd performance problem, server not using indeces
Hello,
I have around 350K users in my test directory, and I'm running into an
odd problem with the directory not using indexes for ldapsearches.

For example, using the following search filter:

(&(objectClass=organizationalPerson)(employeeNumber=*))

Looking at the console, there's a system index on objectClass (which is
set to equality), and there's also an index on employeeNumber (both
equality and presence).

There are around 5K icasOrgPersons (which can hold the employeeNumber
attribute); the rest can't. When the actual search is performed (really
slow, as if it were doing a full scan), the access log shows "notes=U",
meaning that the search was unindexed. The question is why, considering
that indexes were built for the attributes in the search filter?
Thanks.
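
A minimal sketch of how the "notes=U" check described above could be automated. It pairs each SRCH line with its RESULT line by connection and operation number and reports the filters of searches flagged as unindexed. The sample log lines are made up for illustration and only modeled on the Fedora DS access log layout; the function name and regular expressions are assumptions, not part of any server tooling.

```python
import re

# Illustrative sample only: these lines imitate the access log layout and
# were not copied from a real server.
SAMPLE_LOG = """\
[02/Aug/2006:15:40:01 -0700] conn=12 op=1 SRCH base="ou=Users,ou=Internal,o=TEST,o=US" scope=2 filter="(&(employeeNumber=*)(objectClass=icasOrgPerson))" attrs=ALL
[02/Aug/2006:15:42:35 -0700] conn=12 op=1 RESULT err=0 tag=101 nentries=4778 etime=154 notes=U
[02/Aug/2006:15:43:02 -0700] conn=12 op=2 SRCH base="ou=Users,ou=Internal,o=TEST,o=US" scope=2 filter="(&(employeeNumber=2549)(objectClass=icasOrgPerson))" attrs=ALL
[02/Aug/2006:15:43:02 -0700] conn=12 op=2 RESULT err=0 tag=101 nentries=1 etime=0
"""

SRCH_RE = re.compile(r'conn=(\d+) op=(\d+) SRCH .*?filter="([^"]*)"')
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) RESULT .*?notes=(\S+)')


def unindexed_filters(log_text):
    """Return the filters of searches whose RESULT line carries notes=U."""
    pending = {}   # (conn, op) -> filter string seen on the SRCH line
    flagged = []
    for line in log_text.splitlines():
        match = SRCH_RE.search(line)
        if match:
            pending[(match.group(1), match.group(2))] = match.group(3)
            continue
        match = RESULT_RE.search(line)
        # notes may hold several comma-separated flags; U marks "unindexed".
        if match and "U" in match.group(3).split(","):
            search_filter = pending.get((match.group(1), match.group(2)))
            if search_filter is not None:
                flagged.append(search_filter)
    return flagged


for search_filter in unindexed_filters(SAMPLE_LOG):
    print(search_filter)
```

Run against a real access log, the same function would take the file contents as its argument; the log path depends on the installation.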
Richard Megginson
2006-Aug-02 23:12 UTC
Re: [Fedora-directory-users] Odd performance problem, server not using indeces
George Daswani wrote:
> Hello,
>
> I have around 350K users in my test directory, and I'm running
> into an odd problem with the directory not using indexes for
> ldapsearches.
>
> For example, using the following search string
>
> (&(objectClass=organizationalPerson)(employeeNumber=*))
>
> Looking at the console, there's a system index on objectClass (which is
> set to equality), there's also an index on employeeNumber (both equality,
> and presence).
>
> There are around 5K icasOrgPersons (which can hold the employeeNumber
> attribute), the rest can't.

How many entries match (objectClass=organizationalPerson)? If this number is large, then I think what's happening is that the database first looks up how many entries match this component and decides there are too many. Try using icasOrgPerson, or reverse the order of the filter components.

> When the actual search (really slow as if it
> was using a full scan) is performed, the access log files shows "notes=U"
> meaning that the search was unindexed. The question is why considering
> there were indexes built for the attributes in the search filter?
>
> Thanks.
>
> --
> Fedora-directory-users mailing list
> Fedora-directory-users@redhat.com
> https://www.redhat.com/mailman/listinfo/fedora-directory-users
George Daswani
2006-Aug-03 00:21 UTC
Re: [Fedora-directory-users] Odd performance problem, server not using indeces
Richard Megginson wrote:
> How many entries match (objectClass=organizationalPerson)? If this
> number is large, then I think what's happening is that the database
> first looks up how many match this, and says there are too many. Try
> using icasOrgPerson or reverse the order of the filters.

I did the following, per your suggestion above:

    ldapsearch -D "cn=Directory Manager" -x -W \
        -b "ou=Users,ou=Internal,o=TEST,o=US" \
        "(&(employeeNumber=*)(objectClass=icasOrgPerson))"

ou=Users,ou=Internal,o=TEST,o=US holds only icasOrgPerson users (4778 in total), and all of those entries have an employeeNumber. The rest of the users live in ou=Users,ou=External,o=TEST,o=US (around 345K+, none of which are icasOrgPersons).

With the filter above, the search is still unindexed (the access log shows nentries=4778 notes=U) and is slow.

Searches like the following are very fast (indexed, per the access log):

    "(&(employeeNumber=2549)(objectClass=icasOrgPerson))"

It's weird that the searches are so slow (not using indexes), considering that the number of actual icasOrgPerson entries is quite low (5K out of the 350K users), and that there's a presence index on the employeeNumber attribute (which only exists in icasOrgPerson objects), along with a search base.

The index files aren't corrupt; I even recreated the database using ldif2db just to make sure everything was fine, with the same result.
Richard Megginson
2006-Aug-03 01:47 UTC
Re: [Fedora-directory-users] Odd performance problem, server not using indeces
George Daswani wrote:
> Searches like the following are very fast (indexed per the access log)
>
> "(&(employeeNumber=2549)(objectClass=icasOrgPerson))"

Right. Because there is only one matching entry in the index for employeeNumber.

> it's weird that searches are so slow (not using indexes) considering the
> number of actual icasOrgPerson (objectClass) is quite low (5K out of
> the 350K users), and that there's a presence index on the employeeNumber
> attribute (which only exists in icasOrgPerson objects) along with a
> searchbase.

Well, in this case, it has to iterate through the employeeNumber index and return each one of several thousand entries.

> The index files aren't corrupt and I even recreated the database using
> ldif2db just to make sure everything was fine with the same result.

If you really need to perform searches like this that return a very large result set, I suggest you look into the Fedora DS Virtual List View feature, which allows you to page through a sorted result set, or increase your nsslapd-idlistscanlimit.

See http://www.redhat.com/docs/manuals/dir-server/ag/7.1/index1.html#1095569 for more details.
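
The behavior discussed in this message can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not the server's actual code: it assumes that an index lookup returning more IDs than the scan limit is treated as "all IDs", and that an AND filter whose components all end up as "all IDs" is evaluated as an unindexed search. The names `ALLIDS`, `lookup`, `and_candidates` and `search` are hypothetical, the entry IDs are invented, and whether the real comparison is strict at exactly the limit is not covered here.

```python
ALLIDS = None  # sentinel: the ID list was too long, so the index is ignored


def lookup(id_list, scan_limit):
    """Return the ID list, or ALLIDS if it is longer than scan_limit."""
    return ALLIDS if len(id_list) > scan_limit else set(id_list)


def and_candidates(id_lists):
    """Intersect the usable ID lists of an AND filter.

    Components that came back as ALLIDS are skipped. If every component
    is ALLIDS, there is nothing to intersect and the result is ALLIDS.
    """
    usable = [ids for ids in id_lists if ids is not ALLIDS]
    if not usable:
        return ALLIDS
    return set.intersection(*usable)


def search(component_id_lists, scan_limit):
    """Describe how this toy model would evaluate an AND filter."""
    looked_up = [lookup(ids, scan_limit) for ids in component_id_lists]
    candidates = and_candidates(looked_up)
    if candidates is ALLIDS:
        return "notes=U (unindexed)"
    return f"indexed, {len(candidates)} candidate(s)"


# Invented entry IDs standing in for the 4778 icasOrgPerson entries.
INTERNAL_IDS = range(1, 4779)
OBJECTCLASS_ICASORGPERSON = set(INTERNAL_IDS)   # objectClass=icasOrgPerson
EMPLOYEENUMBER_PRESENT = set(INTERNAL_IDS)      # employeeNumber=*
EMPLOYEENUMBER_2549 = {2549}                    # employeeNumber=2549

print(search([EMPLOYEENUMBER_PRESENT, OBJECTCLASS_ICASORGPERSON], 4000))
print(search([EMPLOYEENUMBER_2549, OBJECTCLASS_ICASORGPERSON], 4000))
print(search([EMPLOYEENUMBER_PRESENT, OBJECTCLASS_ICASORGPERSON], 5000))
```

In this model, both components of the slow filter exceed a limit of 4000, so nothing usable is left to intersect; the equality match on a single employeeNumber stays well under the limit, which matches the fast search reported in the thread.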
George Daswani
2006-Aug-03 07:53 UTC
Re: [Fedora-directory-users] Odd performance problem, server not using indeces
Richard Megginson wrote:
> If you really need to perform searches like this that return a very
> large result set, I suggest you look into the Fedora DS Virtual List
> View feature which allows you to page through a sorted result set, or
> increase your nsslapd-idlistscanlimit.
>
> See
> http://www.redhat.com/docs/manuals/dir-server/ag/7.1/index1.html#1095569
> for more details.

Richard, thanks for the tip. The default value of nsslapd-idlistscanlimit is 4K, and the result set I'm looking at is 4778 entries, so it's past the tipping point and the server is not using the indexes.

I originally found it odd because I was expecting index handling to be somewhat like that of OpenLDAP 2.3.25. I loaded the same data set with the same indexes on the same hardware and OS, and OpenLDAP didn't break a sweat returning the result set (15 seconds, versus 154+ seconds on FDS).

I'll bump nsslapd-idlistscanlimit up to 5K or so and try again (I'll also do some further research on vlvindex).

It's normal in our use case for the LDAP server to return such large sets of user entries. Non-LDAP-aware systems import this data nightly. Such is life, I guess.

G
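
A small sketch of the change described above, rendered as LDIF text that could be fed to ldapmodify. The helper name `idlistscanlimit_ldif` is hypothetical, and the default DN is an assumption about where the ldbm database configuration entry lives; check it against your own server before using it. Depending on the server version, the new value may only take effect after a restart.

```python
def idlistscanlimit_ldif(
    new_limit,
    config_dn="cn=config,cn=ldbm database,cn=plugins,cn=config",  # assumed DN
):
    """Build an LDIF modify record that replaces nsslapd-idlistscanlimit."""
    if new_limit <= 0:
        raise ValueError("new_limit must be a positive integer")
    return "\n".join([
        f"dn: {config_dn}",
        "changetype: modify",
        "replace: nsslapd-idlistscanlimit",
        f"nsslapd-idlistscanlimit: {new_limit}",
        "",  # trailing newline ends the record
    ])


# 5000 is the "5K or so" mentioned above, comfortably above 4778 entries.
print(idlistscanlimit_ldif(5000))
```

Raising the limit makes larger ID lists usable at the cost of more work per index lookup, so it is worth re-checking the access log for notes=U after the change rather than assuming the search is now indexed.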