Jeremy,
The good news is that the CVS winbind from last night has not blinked once
since I started it this morning, and I have been running 'while wbinfo -u >
/dev/null; do /bin/true; done' against it. (Still not sure what the
'process_loop' messages are which appear every minute. I disabled crond and
they stopped, as I suspected it might be my script which checks winbindd's
function every minute. However, if I run this script manually it does not
create these messages in the logs.)
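For reference, a minimal sketch of the kind of per-minute check meant here. It is not the actual script: the function names, the 30-second timeout and the rule "exit status 0 plus non-empty output" are assumptions for illustration. The commands themselves ('getent passwd', 'wbinfo -t', 'wbinfo -u', 'wbinfo -g') are the ones mentioned in this thread.

```python
import subprocess
import sys

def command_ok(argv, timeout=30):
    """Return True if argv exits with status 0 and writes something to stdout.

    The timeout value is an assumption, not taken from the real script.
    """
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Command missing, not executable, or hung past the timeout.
        return False
    return result.returncode == 0 and len(result.stdout) > 0

# Checks named in the thread; the dictionary layout is hypothetical.
CHECKS = {
    "getent passwd": ["getent", "passwd"],
    "wbinfo -t": ["wbinfo", "-t"],
    "wbinfo -u": ["wbinfo", "-u"],
    "wbinfo -g": ["wbinfo", "-g"],
}

def run_checks(checks=CHECKS):
    """Run every check once and return a name -> bool mapping."""
    return {name: command_ok(argv) for name, argv in checks.items()}

if __name__ == "__main__":
    for name, ok in run_checks().items():
        if ok:
            print(f"{name}: ok")
        else:
            print(f"{name}: FAILED", file=sys.stderr)
```

Run from cron once a minute, this would report each lookup separately, which helps tell "trust secret fine but user/group enumeration failing" apart from a completely dead winbindd.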
The bad news is that there still seems to be a small memory leak in there:
Winbind is 2744K in memory (1.0%) Sat Mar 23 09:35:01 GMT 2002
Winbind is 2764K in memory (1.0%) Sat Mar 23 09:36:01 GMT 2002
Winbind is 2840K in memory (1.1%) Sat Mar 23 09:37:00 GMT 2002
Winbind is 2772K in memory (1.0%) Sat Mar 23 09:38:00 GMT 2002
Winbind is 2844K in memory (1.1%) Sat Mar 23 09:39:01 GMT 2002
Winbind is 2844K in memory (1.1%) Sat Mar 23 09:40:00 GMT 2002
Winbind is 2784K in memory (1.0%) Sat Mar 23 09:41:01 GMT 2002
Winbind is 2848K in memory (1.1%) Sat Mar 23 09:42:00 GMT 2002
Winbind is 2852K in memory (1.1%) Sat Mar 23 09:43:00 GMT 2002
Winbind is 2972K in memory (1.1%) Sat Mar 23 09:44:01 GMT 2002
Winbind is 2972K in memory (1.1%) Sat Mar 23 09:45:00 GMT 2002
Winbind is 2976K in memory (1.1%) Sat Mar 23 09:46:01 GMT 2002
.
.
Winbind is 4608K in memory (1.7%) Sat Mar 23 22:20:01 GMT 2002
Winbind is 4536K in memory (1.7%) Sat Mar 23 22:21:01 GMT 2002
Winbind is 4608K in memory (1.7%) Sat Mar 23 22:22:01 GMT 2002
Winbind is 4616K in memory (1.7%) Sat Mar 23 22:23:01 GMT 2002
Winbind is 4920K in memory (1.9%) Sat Mar 23 22:24:00 GMT 2002
Winbind is 4564K in memory (1.7%) Sat Mar 23 22:25:00 GMT 2002
Winbind is 4616K in memory (1.7%) Sat Mar 23 22:26:01 GMT 2002
Winbind is 4548K in memory (1.7%) Sat Mar 23 22:27:00 GMT 2002
Winbind is 4620K in memory (1.7%) Sat Mar 23 22:28:01 GMT 2002
Winbind is 4620K in memory (1.7%) Sat Mar 23 22:29:00 GMT 2002
Winbind is 4572K in memory (1.7%) Sat Mar 23 22:30:01 GMT 2002
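To put a number on the growth, the samples above can be reduced to an average rate. The sketch below is only an illustration: it assumes the exact line format shown in this mail, an English/C locale for the day and month names, and it simply compares the first and last sample, so minute-to-minute noise is ignored. The function names are made up for the example.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Winbind is 2744K in memory (1.0%) Sat Mar 23 09:35:01 GMT 2002
LINE_RE = re.compile(r"Winbind is (\d+)K in memory \([\d.]+%\) (.+)$")

def parse_line(line):
    """Return (timestamp, size_in_K) for a log line, or None if it does not match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    size_kb = int(match.group(1))
    # "GMT" is matched literally; day and month names assume an English locale.
    when = datetime.strptime(match.group(2), "%a %b %d %H:%M:%S GMT %Y")
    return when, size_kb

def growth_kb_per_hour(lines):
    """Average growth in K per hour between the first and last parsed sample."""
    samples = [s for s in (parse_line(line) for line in lines) if s is not None]
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, kb0), (t1, kb1) = samples[0], samples[-1]
    hours = (t1 - t0).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("samples must span a positive time interval")
    return (kb1 - kb0) / hours

sample = [
    "Winbind is 2744K in memory (1.0%) Sat Mar 23 09:35:01 GMT 2002",
    "Winbind is 4572K in memory (1.7%) Sat Mar 23 22:30:01 GMT 2002",
]
print(f"{growth_kb_per_hour(sample):.1f} K/hour")
```

With the first and last lines quoted above this comes out at roughly 140K per hour, under constant 'wbinfo -u' load.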
winbindd_cache.tdb is now a constant 64K rather than creeping to 80K prior
to a failure.
-rw------- 1 root root 64k Mar 23 08:57 /usr/local/samba/var/locks/winbindd_cache.tdb
Cheers
Noel
-----Original Message-----
From: Noel Kelly
Sent: 23 March 2002 09:02
To: 'jra@samba.org'; Noel Kelly
Cc: 'Tim Potter'
Subject: RE: FW: [Samba] Memory leak in winbindd
Jeremy,
Have got this running now. Have had to run with my existing libnss_wins.so
as I got this during compile:
[root@liver source]# make nsswitch/libnss_wins.so
Compiling nsswitch/wins.c with -fpic
nsswitch/wins.c: In function `nss_wins_init':
nsswitch/wins.c:88: `dyn_CONFIGFILE' undeclared (first use in this function)
nsswitch/wins.c:88: (Each undeclared identifier is reported only once
nsswitch/wins.c:88: for each function it appears in.)
make: *** [nsswitch/wins.po] Error 1
Not sure how much of an impact an older libnss_wins.so will have but I can
see you have made changes to wins.c.
Winbindd is looking solid at the moment and no hint of a memory leak but am
getting this in the logs every minute:
[2002/03/23 08:56:59, 0] nsswitch/winbindd.c:process_loop(606)
process_loop: Invalid request size (8) send, should be (1304)
[2002/03/23 08:56:59, 0] nsswitch/winbindd.c:process_loop(606)
process_loop: Invalid request size (8) send, should be (1304)
[2002/03/23 08:56:59, 0] nsswitch/winbindd.c:process_loop(606)
process_loop: Invalid request size (8) send, should be (1304)
[2002/03/23 08:56:59, 2] libsmb/namequery.c:name_query(420)
Got a positive name query response from 192.168.5.4 ( 192.168.5.4 )
[2002/03/23 08:57:00, 0] nsswitch/winbindd.c:process_loop(606)
process_loop: Invalid request size (8) send, should be (1304)
Noel
-----Original Message-----
From: jra@samba.org [mailto:jra@samba.org]
Sent: 22 March 2002 19:15
To: Noel Kelly
Cc: 'jra@samba.org'; 'Tim Potter'
Subject: Re: FW: [Samba] Memory leak in winbindd
On Fri, Mar 22, 2002 at 04:17:09PM -0000, Noel Kelly wrote:
> Jeremy hope this helps. I got the stack trace using the script at the
> bottom. Actual winbind output at the time of failure was:
>
> Secret is good
> Error looking up domain users
> Error looking up domain groups
>
> So wbinfo -t works but the -u & -g don't. As I said before, according to
> the logs, winbindd seems to recover sometimes in that the user info is not
> returned but the group info is - so my test for failure passes. The user
> information then appears on the next pass.
>
> So maybe it has not died. My script runs every minute on a cron job, doing
> a 'getent passwd' to a file and checking if the file is not zero bytes. I
> suppose it could be that winbind is merely having a hiccup on getting the
> information from the DC - if it was left for another minute then it would
> recover ? Perhaps it is the DC which is busy ?
>
> However, in this experiment 'getent passwd' has failed, and then with a
> 'sleep 10' between, both 'wbinfo -u' and 'wbinfo -g' also failed immediately
> as well. 'wbinfo -t' worked - so the DC is there and responding.
>
> Let me know if this doesn't make sense or you think of a better way we can
> produce this data.
Errr - ok. So I did something rather radical at tridge's urging yesterday.
I back ported the winbindd from HEAD (minus the ADS stuff for which
we'd need the krb5 and ldap spnego stuff) to 2.2.4pre yesterday.
I'd urge you to cvs checkout and try and reproduce this with the
very latest winbindd - it's been significantly changed and improved....
Jeremy.