Hello,
On Thu, Jan 22, 2009 at 01:41, ankush grover <ankushcentos at gmail.com> wrote:
> We are running some of the Centos 5 32 bit, 5.2 64-bit systems. These
> systems are ldap clients and the ldap server is Windows 2003 Server.
How exactly? Are you using nss_ldap to get user ids from AD? Are you
authenticating to AD using LDAP? What are the lines that contain
"ldap" in your /etc/nsswitch.conf? What are the lines that contain
"pam_ldap.so" in your /etc/pam.d/system-auth and the other files in
that directory?
> Sometimes 1 or 2 services on these servers sucks 100% cpu and the load
> becomes high on the server.
Only Apache or other daemons as well?
> Below is an example where one the httpd process was eating 100% cpu
> and we took dump of this process
Do you have any LDAP authentication configured in Apache? Or any other
kind of authentication (PAM? System?) that might end up being served
by LDAP? Do you have an application, such as a PHP application that
would run inside an Apache process, that might be using LDAP?
> #0 0x00002ad1849cd997 in ldap_chase_v3referrals () from /usr/lib64/libldap-2.3.so.0
It looks like it is stuck in a loop chasing referrals, but it is hard
to tell for sure from a single backtrace.
You could take several backtraces over time and check whether the
process is always in that same function. If it is, that would point to
a loop.
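If it helps, here is a rough Python sketch for comparing several
backtraces once you have saved them as text. It only counts which
function shows up in frame #0 of each one. The function name
top_frames and the regular expression are my own and assume the usual
gdb "#0 ... in function ()" layout, so adjust them if your dumps look
different.

```python
import re
from collections import Counter

# Matches the innermost frame of a gdb-style backtrace, with or without
# an address, e.g.
#   "#0 0x00002ad1849cd997 in ldap_chase_v3referrals () from /usr/lib64/..."
# (assumed layout; adapt the pattern if your debugger prints frames differently)
FRAME0 = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(", re.MULTILINE)


def top_frames(backtraces):
    """Count how often each function appears in frame #0.

    `backtraces` is an iterable of strings, one per saved backtrace.
    Backtraces without a recognizable frame #0 are skipped.
    """
    counts = Counter()
    for text in backtraces:
        match = FRAME0.search(text)
        if match:
            counts[match.group(1)] += 1
    return counts
```

If nearly all samples land in ldap_chase_v3referrals, the loop theory
looks more likely. If frame #0 keeps changing, the process is probably
busy with something else.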
Can you get a log of queries that the LDAP server is receiving (if it
is receiving LDAP queries at all while your process is in that loop)?
Can you use tcpdump to determine if you get a lot of LDAP traffic and
if the traffic stops when you kill the process?
Can you see what that Apache process was serving at that time, using
/server-status or something like it? That might give you a clue as to
why the problem appeared.
> /etc/ldap.conf file:
> [...]
> timelimit 0
> sizelimit 0
Did you try changing those? As far as I know, 0 means no limit for
both, so setting a finite timelimit might be worth a try.
> There are 2 bugs listed on the redhat site but no solution for this
> problem has been provided.
> https://bugzilla.redhat.com/show_bug.cgi?id=222667
> https://bugzilla.redhat.com/show_bug.cgi?id=474181
These do not seem related to your problem: they report processes that
hang in a deadlock, which is not your case. If it were, the process
would be using 0% CPU instead of 100% CPU.
HTH,
Filipe