We use Dovecot 1.0 RC10 on a cluster (10,000 domains, 30,000 accounts) without any problems with POP/IMAP.

Currently we use the virtual transport from Postfix for mail delivery, but we need the Dovecot LDA for Sieve support.

We tested this setup on a few domains with success. But when we switched all our domains to this configuration, we got a lot of bounces with this error in the Postfix log:

    postfix/pipe[24573]: 4F6D7CF11: to=<info at XXXX.com>, relay=dovecot, delay=1200, status=bounced (Command time limit exceeded: "/usr/local/dovecot/libexec/dovecot/deliver")

When I watch the deliver activity with ps/awk/watch, I see that once a deliver process has been running for more than a few seconds, it stays in memory but does nothing. Here is the strace output in that case:

    Process 349 attached - interrupt to quit
    gettimeofday({1161965973, 768478}, {0, 0}) = 0
    poll(

strace prints nothing more until the SIGKILL from Postfix (command time limit of 1200 seconds exceeded). The end of the strace:

    [{fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=9, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}], 2, 2147483647) = -1 EINTR (Interrupted system call)
    +++ killed by SIGKILL +++

I'm not sure whether the following is related to our delivery problem, but in the deliver log /var/log/dovecot/deliver I have a lot of messages like these:

    deliver(info at labomex.com): 2006.10.27 18:29:28 Error: file_dotlock_replace(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes) failed: No such file or directory
    deliver(info at labomex.com): 2006.10.27 18:29:28 Error: rename(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes.lock, /var/mail/labomex.com/mails/info/.dovecot.lda-dupes) failed: No such file or directory

Our dovecot.conf (lda parts):

    protocol lda {
      postmaster_address = postmaster at clm.net4all.ch
      #hostname
      mail_plugins = cmusieve
      mail_plugin_dir = /usr/local/dovecot/lib/dovecot/lda
      sendmail_path = /usr/sbin/sendmail
      auth_socket_path = /var/run/dovecot/auth-master
      log_timestamp = %Y.%m.%d %H:%M:%S%t
      log_path = /var/log/dovecot/deliver
      info_log_path = /var/log/dovecot/deliver.info
    }

We use one UID/GID per domain, so every account in a domain uses the same UID/GID.

The mail storage is on a Debian Sarge NFS server with kernel 2.4.27-3-686-smp. Indexes are stored locally on each POP/IMAP/LDA server.

I don't understand why we have this problem. If somebody can help us, thanks a lot.

Dominique Feyer
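To get a feel for how many recipients are affected, the bounce lines can be counted per address. The following is only a minimal sketch: the regular expression is an assumption derived from the single log line quoted above (not a full Postfix log grammar), and the second sample line with its queue ID is hypothetical, added just to show a non-matching entry.

```python
import re
from collections import Counter

# Pattern modelled on the one postfix/pipe bounce line quoted in the post.
# It is an assumption, not a complete description of Postfix log output.
TIMEOUT_RE = re.compile(
    r"postfix/pipe\[\d+\]: (?P<queue_id>[0-9A-F]+): to=<(?P<rcpt>[^>]+)>, "
    r"relay=(?P<relay>[^,]+), delay=(?P<delay>[\d.]+), status=bounced "
    r"\(Command time limit exceeded"
)


def count_timeouts(lines):
    """Count, per recipient, deliveries via the dovecot relay that hit the time limit."""
    hits = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group("relay") == "dovecot":
            hits[match.group("rcpt")] += 1
    return hits


SAMPLE = [
    # Line taken from the post.
    'postfix/pipe[24573]: 4F6D7CF11: to=<info at XXXX.com>, relay=dovecot, '
    'delay=1200, status=bounced (Command time limit exceeded: '
    '"/usr/local/dovecot/libexec/dovecot/deliver")',
    # Hypothetical successful delivery, which must not be counted.
    'postfix/pipe[24574]: 4F6D7CF12: to=<other at XXXX.com>, relay=dovecot, '
    'delay=2, status=sent (delivered via dovecot service)',
]

if __name__ == "__main__":
    for rcpt, count in count_timeouts(SAMPLE).items():
        print(rcpt, count)
```

In practice the lines would come from the real mail log instead of `SAMPLE`, for example by passing an open file object to `count_timeouts`.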
After a lot of tests, I found that deliver (the Dovecot LDA) puts the message in the user's mailbox, but the process doesn't quit. After the maximum command time (set in Postfix) has elapsed, Postfix kills the process, logs an error and bounces the message.

But I still have no idea why the deliver process doesn't quit after delivery.

Thanks

On Fri, 27 Oct 2006 18:47:37 +0200, Dominique Feyer <dfeyer at net4all.ch> wrote:
> We use Dovecot on a cluster (10'000 domains, 30'000 account) without
> any problem with pop/imap. We use Dovecot 1.0 RC10
>
> Now we use virtual transport from postfix for mail delivery.
>
> We need the Dovecot LDA for sieve support.
>
> We test this solution on some domains with success.
>
> But if we change the configuration of all our domains, we have a lots
> of bounce with this error in postfix log:
>
> postfix/pipe[24573]: 4F6D7CF11: to=<info at XXXX.com>,
> relay=dovecot, delay=1200, status=bounced (Command time limit
> exceeded: "/usr/local/dovecot/libexec/dovecot/deliver")
>
> If I watch the deliver activity with ps/awk/watch, I see that if
> deliver duration time exceed some seconds, the deliver process stay in
> memory but do nothing. Here a strace output of this case:
>
> Process 349 attached - interrupt to quit
> gettimeofday({1161965973, 768478}, {0, 0}) = 0
> poll(
>
> Strace output nothing before the SIGKILL from postfix (command time
> exceed 1200 second), the end of the strace:
>
> [{fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=9,
> events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}], 2, 2147483647) = -1
> EINTR (Interrupted system call) +++ killed by SIGKILL +++
>
> I'm not sure is this problem has a relation with our deliver problem.
> But in the deliver log /var/log/dovecot/deliver, I have a lots of
> message:
>
> deliver(info at labomex.com): 2006.10.27 18:29:28 Error:
> file_dotlock_replace(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes)
> failed: No such file or directory deliver(info at labomex.com):
> 2006.10.27 18:29:28 Error:
> rename(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes.lock, /var/mail/labomex.com/mails/info/.dovecot.lda-dupes)
> failed: No such file or directory
>
> Our dovecot.conf (lda parts):
>
> protocol lda {
> postmaster_address = postmaster at clm.net4all.ch
> #hostname
> mail_plugins = cmusieve
> mail_plugin_dir = /usr/local/dovecot/lib/dovecot/lda
> sendmail_path = /usr/sbin/sendmail
> auth_socket_path = /var/run/dovecot/auth-master
> log_timestamp = %Y.%m.%d %H:%M:%S%t
> log_path = /var/log/dovecot/deliver
> info_log_path = /var/log/dovecot/deliver.info
> }
>
> We use on UID/GID per domain so every account of a domain use the same
> UID/GID.
>
> The mail storage is on a Debian Sarge NFS server with
> 2.4.27-3-686-smp. Indexes are stored localy on each POP/IMAP/LDA
> server.
>
> I don't understand why we have this problem. If somebody can help us,
> thanks a lots.
>
> Dominique Feyer
On Mon, 2006-10-30 at 14:43 +0100, Dominique Feyer wrote:
> After a lots of test, I found that deliver (LDA Dovecot) put the
> message in the mailbox of the user, but the process dont quit. After
> the max command time (from postfix), postfix kill the process, return
> an error in the log and bounce the message.

This conflicts with the only reason that I can see for this:

> > Process 349 attached - interrupt to quit
> > gettimeofday({1161965973, 768478}, {0, 0}) = 0
> > poll(

This should only happen at startup, when deliver is connecting to dovecot-auth. So my guess would have been that dovecot-auth is busy and not answering our requests. I guess I should add some kind of timeout to this myself as well.

> > deliver(info at labomex.com): 2006.10.27 18:29:28 Error:
> > file_dotlock_replace(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes)
> > failed: No such file or directory deliver(info at labomex.com):
> > 2006.10.27 18:29:28 Error:
> > rename(/var/mail/labomex.com/mails/info/.dovecot.lda-dupes.lock, /var/mail/labomex.com/mails/info/.dovecot.lda-dupes)
> > failed: No such file or directory

Hmm. Something seems to be overriding or deleting the dotlocks, probably because deliver hangs for a long time somewhere. Possibly when trying to send mails?
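The strace above shows poll() being called with a timeout of 2147483647 ms, which is effectively unbounded, so a peer that never answers keeps the caller blocked until something kills it. As an illustration of the "some kind of timeout" idea, here is a minimal Python sketch. It is not Dovecot's code: `wait_for_reply` is a hypothetical helper, and a `socketpair()` stands in for the connection to the auth socket.

```python
import select
import socket


def wait_for_reply(sock, timeout_ms):
    """Wait up to timeout_ms for sock to become readable.

    Returns the received bytes, or None if nothing arrived in time.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    # A finite timeout: poll() returns an empty list when it expires,
    # instead of blocking until the peer finally answers.
    events = poller.poll(timeout_ms)
    if not events:
        return None
    return sock.recv(4096)


# socketpair() is only a stand-in for a client/auth-server connection.
client, server = socket.socketpair()

silent = wait_for_reply(client, 200)    # the "server" never answers here

server.sendall(b"OK\n")                 # now the "server" does answer
answered = wait_for_reply(client, 200)

client.close()
server.close()

print("no reply case:", silent)
print("reply case:", answered)
```

With a bounded wait the caller can log an error and exit with a temporary failure, rather than sitting in poll() until the MTA's command time limit expires. `select.poll` is available on Unix-like systems only.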