Hi all.

I have Samba 4.2.3 on FreeBSD 10.1 servers. There are three DCs and about 350 PCs in the domain. Each DC has 3 CPUs and 3 GB of RAM. Several servers running services such as apache, exim and dovecot use the samba4 LDAP server (port 389) for user authentication. Some time ago, after adding more services that use LDAP, I found that samba4 cannot serve all LDAP requests. Every 10-30 minutes I see this in the DC logs:

dc1 kernel: sonewconn: pcb 0xfffff800753d6ab8: Listen queue overflow: 16 already in queue awaiting acceptance (28 occurrences)

I then recorded the LDAP traffic with tcpdump and saw that, after the TCP handshake, the server sometimes suddenly sends a TCP RST to close the connection. I have increased the DCs' resources (CPU and RAM), raised kern.ipc.somaxconn and done some other system tuning, but none of that helped. The load average on the DCs is permanently around 0.9-1.0 and samba cannot serve all LDAP connections. The LDAP clients keep working because each of them uses at least two domain controllers as LDAP servers.

Is there a performance problem in samba4, such as slow processing of LDAP requests, or something else?

Thanks for any help.

--
With best regards,
Tabolin Yuriy
System administrator
Speech Technology Center
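To judge how often the listen queue actually overflows, the "(N occurrences)" counters in the kernel messages quoted above can be added up. The following is a minimal sketch, not part of the original post: the function name count_overflows is hypothetical, and it assumes that a matching line without an "(N occurrences)" suffix stands for a single overflow.

```shell
#!/bin/sh
# Hypothetical helper: sum the "(N occurrences)" counters of all
# "Listen queue overflow" kernel messages read from standard input.
count_overflows() {
    awk '/Listen queue overflow/ {
        n = 1                       # assumption: no counter means one overflow
        if (match($0, /\([0-9]+ occurrences?\)/)) {
            # skip the opening parenthesis; "+ 0" keeps the leading number
            n = substr($0, RSTART + 1) + 0
        }
        total += n
    }
    END { print total + 0 }'        # prints 0 when nothing matched
}
```

It could be fed from whichever file receives kernel messages on the DC, for example `count_overflows < /var/log/messages` (the path is an assumption; the post only says "DC logs").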
Andrew Bartlett
2015-Nov-25 07:21 UTC
[Samba] samba4 ldap high load and port queue overflow
On Mon, 2015-11-23 at 16:50 +0300, Yuriy Tabolin wrote:
> Is there a performance problem in samba4, such as slow processing of
> LDAP requests, or something else?
> Thanks for any help.

G'Day,

Samba typically handles installations such as yours without much difficulty, so I'm surprised.

However, if your web applications hit the LDAP server hard, this could certainly be an issue. Samba's LDAP server isn't designed for high performance - frankly we were glad when it worked at all - but works very well for most of our deployments at your scale.

We certainly could re-architect the LDAP server for better performance, but at your scale it really shouldn't be hitting limits. Perhaps your web applications are doing a lot of unindexed operations, or are inefficient in their authentication processing? Try using PAM backed by pam_winbind instead of LDAP, for example; this would put authentication through a different protocol, and would not open a TCP socket per authentication.

I don't normally suggest changing platform, and if you use FreeBSD, then I understand your passion for that platform, but you might also wish to compare with a trial server running on Ubuntu, to see if there is a platform-specific issue.

Finally, there certainly is hope for improved performance, but it needs some backing from the users who need it most, either in engineering effort or by working with a commercial support vendor.

Those of us involved in Samba day to day know there are many, many things we can work on; we just haven't had the bandwidth or client requirement to focus on it yet. For example, until recently we had a 'prefork' process mode that would allow, for example, one LDAP process per port. It hasn't ever been used in that combination, but interested parties could put it back, add a test and see if it helps.

More likely to help (but more effort) are merging the ldb_perfs branch, profiling the rest of the code in search of performance issues, and replacing the ldb_tdb backend with ldb_lmdb.

However, I would go back to looking at the sources of the queries - if your DB is constantly under full-DB scan, it will very likely show symptoms like the ones you see.

Thanks,

Andrew Bartlett

--
Andrew Bartlett http://samba.org/~abartlet/
Authentication Developer, Samba Team http://samba.org
Samba Developer, Catalyst IT http://catalyst.net.nz/services/samba
25.11.2015 10:21, Andrew Bartlett wrote:
> However, I would go back to looking at the sources of the queries - if
> your DB is constantly under full-DB scan, it will very likely show
> symptoms like you see.
>
> Thanks,
>
> Andrew Bartlett

Thanks for the answer. I found that the high LDAP load was generated by nss on the mail server. While resolving one user, nss asks LDAP for all the groups the user belongs to and then for all the members of those groups. This generates many LDAP queries. OpenLDAP (which I used earlier) works fine with this load, but Samba's LDAP server has problems with it. For now I have switched nss on the mail server back to my old OpenLDAP and it works well. If I can't resolve the performance problem with Samba's LDAP server, I will try using nscd to cache the nss queries, or something else.

Interestingly, when I had the problem with Samba's LDAP server I tried to balance the load between the DCs by sending all nss queries to the second DC (dc2). The problem did not go away. I then saw high load on both DCs at the same site, dc1 and dc2: the samba processes on both were permanently loaded at 20-40%. After I switched nss to OpenLDAP, the load on both DCs dropped and is now permanently about 3-7%, even on dc1, which does not serve nss queries. dc1 holds all FSMO roles in my domain, so maybe the reason for the high load on it is permanent replication with dc2, or queries forwarded from dc2?

I now have a test domain and can easily reproduce the problem: set up a server with nss pointing to LDAP and run this simple script:

for i in `cat file-with-all-users-in-domain` ; do id $i & done

The script takes minutes to complete, and on the samba server I see messages like:

kernel: sonewconn: pcb 0xfffff800753d6ab8: Listen queue overflow: 16 already in queue awaiting acceptance (3018 occurrences)

When I have free time, I will try adding a Linux DC to my test domain and run the same test on Linux.

--
With best regards,
Tabolin Yuriy
System administrator
Speech Technology Center
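The reproduction loop above starts one background id process per user all at once, so the number of simultaneous LDAP connections is limited only by the size of the user list. For comparing servers under a repeatable load, a variant with a fixed number of parallel lookups may be easier to reason about. This is only a sketch under stated assumptions: the function name run_lookups, the default of 8 parallel jobs and the optional third argument are hypothetical, and it assumes one user name per line with no whitespace or quote characters in the names.

```shell
#!/bin/sh
# Hypothetical helper: run a lookup command once per name in a file,
# with at most a fixed number of lookups running in parallel.
#   $1  file with one user name per line
#   $2  maximum number of parallel lookups (assumed default: 8)
#   $3  lookup command (default: id)
run_lookups() {
    userfile=$1
    jobs=${2:-8}
    lookup=${3:-id}
    # -n 1 passes one name per invocation; -P caps the parallel invocations
    xargs -n 1 -P "$jobs" "$lookup" < "$userfile"
}
```

For example, `run_lookups file-with-all-users-in-domain 16 > /dev/null` would keep at most 16 id lookups in flight; raising the second argument step by step shows at which concurrency the overflow messages start to appear.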