Hi,

as you are aware, there is a unix domain socket leak in 6-STABLE, which AFAIK is not yet fully fixed. I wanted to ask about the status or any possible fixes, as I know a way to reproduce the problem in a matter of minutes.

We are running Cyrus and Postfix with the user DB in OpenLDAP. When using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as the connection URL for both Postfix's user lookup and Cyrus's user lookup (via nss_ldap), slapd quickly runs out of file descriptors, as it is not closing any unix sockets (judging by ever-increasing lsof output). Using TCP sockets is just fine.

If there are patches I could try, don't hesitate to send them to me.

Cheers,
Uli
Ulrich Spoerlein wrote:
> Hi,
>
> as you are aware, there is a unix domain socket leak in 6-STABLE,
> which AFAIK is not yet fully fixed.

> We are running Cyrus and Postfix with the user DB in OpenLDAP. When
> using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as a connection URL for
> both Postfix' user lookup and cyrus' user lookup (via nss_ldap). slapd
> quickly runs out of filedescriptors as it is not closing any unix
> sockets (judging by ever increasing lsof output).

Can you perhaps isolate the bug or give more information on it? I'm asking because I'm currently running an application in production that uses unix domain sockets and handles lots of connects/disconnects per second, and it doesn't seem to show any leakage.
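One way to start isolating the problem is a small connect/disconnect churn test that compares the number of open descriptors before and after. The following is only a sketch under stated assumptions: the socket path, the round count, and all function names are hypothetical, and it only counts descriptors held by the test process itself, so it says nothing about sockets lingering inside the kernel or inside slapd.

```python
import os
import socket
import tempfile
import threading


def count_open_fds(limit=4096):
    """Count this process's open descriptors by probing fd numbers 0..limit-1."""
    count = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # fd number is not in use
        count += 1
    return count


def serve(listener, rounds):
    """Accept `rounds` connections and close each one immediately."""
    for _ in range(rounds):
        conn, _ = listener.accept()
        conn.close()


def churn(path, rounds):
    """Connect to the unix socket at `path` and disconnect, `rounds` times."""
    for _ in range(rounds):
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            # Wait for the server to close its end (recv returns b"" at EOF),
            # so the listen backlog never fills up.
            client.recv(1)
        finally:
            client.close()


def run_churn_test(rounds=500):
    """Return (fds_before, fds_after) around a connect/disconnect churn."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "churn.sock")  # hypothetical socket path
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(16)
        before = count_open_fds()
        worker = threading.Thread(target=serve, args=(listener, rounds))
        worker.start()
        churn(path, rounds)
        worker.join()
        after = count_open_fds()
        listener.close()
    return before, after
```

If both ends close their sockets, the two counts returned by `run_churn_test()` should match. A count that grows with the number of rounds would point at the application; matching counts alongside growing system-wide socket usage would point elsewhere.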
On Wed, Jun 13, 2007 at 04:22:45PM +0200, Ulrich Spoerlein wrote:
> Hi,
>
> as you are aware, there is a unix domain socket leak in 6-STABLE,
> which AFAIK is not yet fully fixed.
>
> I wanted to ask about the status or some possible fixes, as I know a
> way to reproduce the problem in a matter of minutes.
>
> We are running Cyrus and Postfix with the user DB in OpenLDAP. When
> using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as a connection URL for
> both Postfix' user lookup and cyrus' user lookup (via nss_ldap). slapd
> quickly runs out of filedescriptors as it is not closing any unix
> sockets (judging by ever increasing lsof output).
>
> Using TCP sockets is just fine. If there are patches I could try,
> don't hesitate to send them to me.

Might be a red herring, but worth mentioning as a possibility:

I've seen this kind of problem with domain sockets (at least on Linux, with a multi-use tool called busybox), where on error conditions the code never bothered to close the socket it had opened, resulting in leaks and resource exhaustion over time. The code later got fixed, but it was a pretty nasty bug, especially since the program is used in a lot of embedded products...

With regard to FreeBSD, I remember reading some mails from Robert Watson last month about UNIX domain socket code changes:

http://monkey.org/freebsd/archive/freebsd-stable/200705/msg00200.html

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
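The error-path pattern described above can be sketched as follows. This is an illustration only, not code from busybox, slapd, or nss_ldap; the function name `connect_unix` and the timeout value are assumptions. Note that CPython would eventually close an unreferenced socket on its own, whereas the equivalent C code has no such safety net, which is why a missing close() on the error path leaks a descriptor there.

```python
import socket


def connect_unix(path, timeout=5.0):
    """Connect to a unix domain socket, closing the socket if connect fails."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect(path)
    except OSError:
        # Error path: release the descriptor before propagating the error.
        # Skipping this step is the kind of bug described above.
        sock.close()
        raise
    return sock
```

The caller owns the returned socket and is responsible for closing it, for example with `with connect_unix(path) as sock: ...`.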
On Wed, 13 Jun 2007 16:22:45 +0200, Ulrich Spoerlein <uspoerlein@gmail.com> wrote:
> Hi,
>
> as you are aware, there is a unix domain socket leak in 6-STABLE,
> which AFAIK is not yet fully fixed.
>
> I wanted to ask about the status or some possible fixes, as I know a
> way to reproduce the problem in a matter of minutes.
>
> We are running Cyrus and Postfix with the user DB in OpenLDAP. When
> using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as a connection URL for
> both Postfix' user lookup and cyrus' user lookup (via nss_ldap). slapd
> quickly runs out of filedescriptors as it is not closing any unix
> sockets (judging by ever increasing lsof output).

Shouldn't slapd close its unix socket? Or am I misreading this?

> Using TCP sockets is just fine. If there are patches I could try,
> don't hesitate to send them to me.
>
> Cheers,
> Uli

Ronald.

-- 
Ronald Klop
Amsterdam, The Netherlands
--On Wednesday, June 13, 2007 09:17:36 -0700 Jeremy Chadwick <koitsu@FreeBSD.org> wrote:

> I've seen this kind of problem with domain sockets (at least on Linux
> with a multi-use tool called busybox) where on error conditions the
> code never bothered to close the existing socket it opened, thus
> resulting in leaks/resource exhaustion over time. The code later got
> fixed, but a pretty nasty bug especially when the program is used in
> a lot of embedded products...
>
> In regards to FreeBSD, I remember reading some mails from Robert Watson
> last month in regards to UNIX domain socket code changes:
>
> http://monkey.org/freebsd/archive/freebsd-stable/200705/msg00200.html

'k, just to chime in here ... I can definitely attest to there being a leak here, as it was me that was originally burned by it. In my case, I was eventually able to isolate which VPS/jail was causing it and haven't run it since, but I was never able to determine exactly what was causing it, since there wasn't really anything unusual running in that jail :(

But, based on the discussions that were had at the time, it was my understanding that if all applications on the server were shut down (to the bare minimum), the kernel GC should eventually clean up all residual sockets. When I did this (shut down all applications but the very bare minimum) and waited for 10+ minutes, socket usage never dropped below about 4k sockets in use, or something like that.

Unlike Ulrich, I wasn't running LDAP at the time, so that wasn't the cause for me.

I could easily enough restart that jail if there was some more useful information I could get from it, but the thread kinda dwindled off over time, and rebooting a server every 3 days was getting a wee bit annoying to my clients :)

But, if someone has something they'd like me to do to provide more info, I'm willing to do it (short of anything that requires DDB / console access ... that server is remote).

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org                              MSN . scrappy@hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
On 6/13/07, Ulrich Spoerlein <uspoerlein@gmail.com> wrote:
> Hi,
>
> as you are aware, there is a unix domain socket leak in 6-STABLE,
> which AFAIK is not yet fully fixed.
>
> I wanted to ask about the status or some possible fixes, as I know a
> way to reproduce the problem in a matter of minutes.
>
> We are running Cyrus and Postfix with the user DB in OpenLDAP. When
> using ldapi://%2fvar%2frun%2fopenldap%2fldapi/ as a connection URL for
> both Postfix' user lookup and cyrus' user lookup (via nss_ldap). slapd
> quickly runs out of filedescriptors as it is not closing any unix
> sockets (judging by ever increasing lsof output).
>
> Using TCP sockets is just fine. If there are patches I could try,
> don't hesitate to send them to me.

Ohhh!! I had exactly the same problem last night.

After changing the line in /usr/local/etc/nss_ldap.conf from

    uri ldap://127.0.0.1/

to

    uri ldapi://%2fvar%2frun%2fopenldap%2fldapi/

the number of open sockets on this machine started to increase until it reached the maxfiles limit, producing messages like this:

    kernel: kern.maxfiles limit exceeded by uid 65534, please see tuning(7).

and slapd stopped accepting new connections.

During the day (production hours) the number of connections to OpenLDAP (using TCP sockets) ranges from 16 to 45. Last night, after changing the connection type to a Unix domain socket, the number of connections rose rapidly to about 4000. I got these numbers using the sockstat -c command.

This machine is our Samba PDC, running 6.2-STABLE compiled on Apr 5 13:33:50, with samba-3.0.24,1, nss_ldap-1.255, and openldap-server-2.3.34_1.

I can provide more information if needed. Any advice or patches?

Best Regards,
Alexandre Biancalana
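To track how the connection count grows over time, the sockstat output can be tallied per process. The following is a rough sketch, not a tested monitoring tool: the function name is hypothetical, and it assumes the usual column layout in which each line starts with USER and COMMAND fields and the header row begins with the word USER.

```python
from collections import Counter


def count_by_command(sockstat_text):
    """Count sockets per COMMAND column in sockstat-style text."""
    counts = Counter()
    for line in sockstat_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (assumed to start with USER).
        if len(fields) < 2 or fields[0] == "USER":
            continue
        counts[fields[1]] += 1
    return counts
```

The input would typically be the captured output of `sockstat -c`. Calling the function at intervals and logging `counts["slapd"]` should show whether the count climbs steadily after the switch to ldapi://.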