Hello,

as mentioned before, we are migrating our mailboxes from a 0.99 cluster to a 1.0.0 one. With 0.99, dovecot-auth (with LDAP as backend) was leaking quite happily, and the dovecot-auth processes frequently hit their size limit and were thus killed and restarted. In 0.99 at least, this led to authentication failures on a busy server, as the dovecot master process just killed off the auth process without first disabling new connections to it and letting any current authentications finish.

The activity per cluster node on the 0.99 boxes was about 700k logins (mostly pop3) per day, with 5 dovecot-auth processes and a 64MB limit each.

The migrated users so far are of the less active kind. On one node of the 1.0.0 cluster we see about 70k logins/day, and the dovecot-auth process there is currently hovering at 123/84MB (VIRT/RES in top). It was at 121/82MB when I checked 4 hours ago, and no new users have been added to that node for 2 days. The authentication backend is still LDAP, with no caching and a 256MB process size limit. There are just 16k users on that node at this point in time, with only 1200 distinct users generating those 70k logins/day.

We will start migrating the rest of the users tomorrow and now I'm wondering:

1. How and why would the memory footprint of dovecot-auth grow when there is no change in the number of users in the DB?

2. What will happen when the single dovecot-auth process reaches 256MB in the end? Internal housekeeping attempts of that process? A whack to the head from the master process like in 0.99 and thus more erroneous authentication failures, potentially aggravated by the fact that there is just a single dovecot-auth process?

I guess at the current rate of leakage we shall know the answers to #2 soon one way or another. ;)

Regards,

Christian
--
Christian Balzer Network/Systems Engineer NOC
chibi at gol.com Global OnLine Japan/Fusion Network Services
http://www.gol.com/
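For a rough feel of "soon", the figures in the message above can be extrapolated. This is only a sketch under two assumptions that the message does not confirm: that growth is linear, and that the 256MB limit applies to the virtual size (the VIRT column in top). The function name is hypothetical.

```python
def hours_until_limit(previous_mb, current_mb, interval_hours, limit_mb):
    """Straight-line estimate of the hours left before a size limit is hit.

    Returns None when the process did not grow over the interval.
    Assumes linear growth, which a real leak need not follow.
    """
    rate_mb_per_hour = (current_mb - previous_mb) / interval_hours
    if rate_mb_per_hour <= 0:
        return None
    return (limit_mb - current_mb) / rate_mb_per_hour


# Figures from the message: 121MB -> 123MB VIRT over 4 hours, 256MB limit.
eta_hours = hours_until_limit(121, 123, 4, 256)
print("about %.0f hours (%.1f days)" % (eta_hours, eta_hours / 24))
```

With those numbers the estimate comes out at roughly 11 days, but login volume will rise once the remaining users are migrated, so the real figure is likely to be shorter.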
On Tue, 2007-06-19 at 13:40 +0900, Christian Balzer wrote:
> Hello,
>
> as mentioned before, we are migrating our mailboxes from a 0.99 cluster
> to a 1.0.0 one. With 0.99 dovecot-auth (with LDAP as backend) was leaking
> quite happily and the dovecot-auth processes frequently did hit their
> size limit and thus were killed and restarted. Which in 0.99 at least
> lead to authentication failures on a busy server, as the dovecot
> master process just killed off the auth process w/o disabling new
> connections to it first and letting any current authentications finish.

It's actually your kernel that kills the process.

> 1. How and why would the memory footprint of dovecot-auth grow when
> there is no change in the amount of users in the DB?

The only thing that's changing the size of dovecot-auth is how many requests it's simultaneously handling. For example if you try to login with invalid user/pass 1000 times within a second, dovecot-auth keeps those 1000 requests in memory for a couple of seconds until it returns with failure. But this happens also with normal requests, just not for so long.

You could try http://dovecot.org/patches/debug/mempool-accounting.diff and send USR1 signal to dovecot-auth after a while. It logs how much memory is used by all existing memory pools. Each auth request has its own pool, so if it's really leaking them it's probably logging a lot of lines. If not, then the leak is elsewhere.

> 2. What will happen when the single dovecot-auth process reaches 256MB
> in the end? Internal housekeeping attempts of that process? A whack to
> the head from the master process like in 0.99 and thus more erroneous
> authentication failures, potentially aggravated by the fact that there
> is just single dovecot-auth process?

The same as 0.99. You could also kill -HUP dovecot when dovecot-auth is nearing the limit. That makes it a bit nicer, although not perfectly safe either (should fix this some day..).
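The "kill -HUP dovecot when dovecot-auth is nearing the limit" suggestion could be automated along these lines. This is a minimal sketch, not something shipped with Dovecot: it assumes a Linux /proc filesystem, assumes the limit applies to the virtual size (VmSize), and takes both PIDs from the caller. The function names and the 90% threshold are hypothetical choices.

```python
import os
import signal


def vm_size_mb(status_text):
    """Return VmSize in MB, parsed from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            # The line looks like "VmSize:   125952 kB".
            return int(line.split()[1]) / 1024.0
    raise ValueError("no VmSize line found")


def should_reload(size_mb, limit_mb, threshold=0.9):
    """True once the process has reached the given fraction of its limit."""
    return size_mb >= limit_mb * threshold


def check_and_reload(auth_pid, master_pid, limit_mb=256):
    """Send SIGHUP to the master if dovecot-auth is close to its limit.

    Both PIDs are supplied by the caller; how they are looked up is
    deliberately left out of this sketch.
    """
    with open("/proc/%d/status" % auth_pid) as status_file:
        size_mb = vm_size_mb(status_file.read())
    if should_reload(size_mb, limit_mb):
        os.kill(master_pid, signal.SIGHUP)
        return True
    return False
```

Run periodically (for example from cron), this would trigger the reload before the limit is reached. As noted above, the reload itself is not perfectly safe, so this narrows the window for failed logins without closing it.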
On Tue, 19 Jun 2007 14:39:59 +0300 Timo Sirainen <tss at iki.fi> wrote:
> On Tue, 2007-06-19 at 13:40 +0900, Christian Balzer wrote:
>
> > 1. How and why would the memory footprint of dovecot-auth grow when
> > there is no change in the amount of users in the DB?
>
> The only thing that's changing the size of dovecot-auth is how many
> requests it's simultaneously handling. For example if you try to login
> with invalid user/pass 1000 times within a second, dovecot-auth keeps
> those 1000 requests in memory for a couple of seconds until it returns
> with failure. But this happens also with normal requests, just not for
> so long.
>

There is nowhere near this kind of concurrency here, and memory should be returned after a while, but that is clearly not happening. The dovecot-auth process is now at 200/160MB and thus prone to blow up over the weekend, I guess.

> You could try http://dovecot.org/patches/debug/mempool-accounting.diff
> and send USR1 signal to dovecot-auth after a while. It logs how much
> memory is used by all existing memory pools. Each auth request has its
> own pool, so if it's really leaking them it's probably logging a lot of
> lines. If not, then the leak is elsewhere.
>

I grabbed the Debian package source on a test machine (not gonna chance anything on the production servers), applied the patch, added --enable-debug to the debian/rules file (and got the #define DEBUG in config.h), created the binary packages, installed, configured and started them, tested a few logins and... nothing gets logged in mail.* if I send a USR1 to dovecot-auth. Anything I'm missing?

But no matter, it is clearly leaking just as badly as 0.99, and I venture that this is the largest installation with LDAP as authentication backend. I wonder if this leak would be avoided by having LDAP lookups performed by worker processes, as with SQL.

> > 2. What will happen when the single dovecot-auth process reaches 256MB
> > in the end? Internal housekeeping attempts of that process? A whack to
> > the head from the master process like in 0.99 and thus more erroneous
> > authentication failures, potentially aggravated by the fact that there
> > is just single dovecot-auth process?
>
> The same as 0.99. You could also kill -HUP dovecot when dovecot-auth is
> nearing the limit. That makes it a bit nicer, although not perfectly
> safe either (should fix this some day..).
>

If that leak can't be found, I would very much appreciate a solution that at least avoids failed and/or delayed logins.

Regards,

Christian
--
Christian Balzer Network/Systems Engineer NOC
chibi at gol.com Global OnLine Japan/Fusion Network Services
http://www.gol.com/