We've just cut over to a new mail server running Dovecot 1.0.5. The underlying OS is RHEL 5. User mailboxes are stored in Unix mbox format on a local ext3 file system. The MTA on the system is the default RedHat version of Sendmail 8.13.8 with procmail for local delivery. We're using a combination of dotlock and fcntl style locking. The output of "dovecot -n" for this system appears below my signature. Periodically we see errors like the following in our logs: Oct 21 19:36:01 postoffice1 dovecot: POP3(someuser): file mbox-sync.c: line 1433 (mbox_sync_handle_eof_updates): assertion failed: (file_size >= sync_ctx->expunged_space + trailer_size) Oct 21 19:36:01 postoffice1 dovecot: POP3(someuser): Raw backtrace: pop3 [0x809eda1] -> pop3 [0x809ecbc] -> pop3(mbox_sync+0x1088) [0x806c4e8] -> pop3(mbox_storage_sync_init+0x3e) [0x806cb6e] -> pop3(client_create+0x1a4) [0x80552a4] -> pop3(main+0x556) [0x8057236] -> /lib/libc.so.6(__libc_start_main+0xdc) [0x304dec] -> pop3 [0x8054a11] Oct 21 19:36:01 postoffice1 dovecot: child 24409 (pop3) killed with signal 6 Note that while the above log is for a POP3 user, we also see similar behavior when some users access mail folders via IMAP as well. The real problem with this behavior is that sometimes, but not in all cases, the process will leave behind a dotlock when it exits. If the abandoned dotlock is on the user's inbox, then procmail won't deliver new mail to the user until the lock is cleared. This is a big problem. Anybody seen this behavior before and/or know what's causing it? Can anybody suggest ideas for further debugging this issue? We'd appreciate any help that you all can provide. Thanks! -- Hal Pomeranz, Founder/CEO Deer Run Associates hal at deer-run.com Network Connectivity and Security, Systems Management, Training -- # /hpsw/mail/dovecot/sbin/dovecot -n # 1.0.5: /hpsw/mail/dovecot/etc/dovecot.conf syslog_facility: local3 protocols: imap imaps pop3 pop3s ssl_cert_file: /hpsw/mail/configs/dovecot/pki/certs/dovecot.pem ssl_key_file: /hpsw/mail/configs/dovecot/pki/private/dovecot.pem disable_plaintext_auth: no login_dir: /hpsw/mail/dovecot/var/run/dovecot/login login_executable(default): /hpsw/mail/dovecot/libexec/dovecot/imap-login login_executable(imap): /hpsw/mail/dovecot/libexec/dovecot/imap-login login_executable(pop3): /hpsw/mail/dovecot/libexec/dovecot/pop3-login mail_location: mbox:/hpsw/mail/users/%u/dovecot:INBOX=/hpsw/mail/users/%u/MBOX mbox_read_locks: dotlock fcntl mail_executable(default): /hpsw/mail/dovecot/libexec/dovecot/imap mail_executable(imap): /hpsw/mail/dovecot/libexec/dovecot/imap mail_executable(pop3): /hpsw/mail/dovecot/libexec/dovecot/pop3 mail_plugin_dir(default): /hpsw/mail/dovecot/lib/dovecot/imap mail_plugin_dir(imap): /hpsw/mail/dovecot/lib/dovecot/imap mail_plugin_dir(pop3): /hpsw/mail/dovecot/lib/dovecot/pop3 pop3_uidl_format(default): pop3_uidl_format(imap): pop3_uidl_format(pop3): %08Xu%08Xv auth default: passdb: driver: pam userdb: driver: passwd
On 22.10.2007, at 5.46, Hal Pomeranz wrote:> Oct 21 19:36:01 postoffice1 dovecot: POP3(someuser): file mbox- > sync.c: line 1433 (mbox_sync_handle_eof_updates): assertion failed: > (file_size >= sync_ctx->expunged_space + trailer_size)Does the file have CRLFs as linefeeds instead of plain LFs? CRLF handling is probably still a bit buggy. If that's not the problem then I'm not really sure.. It's difficult to do anything about this unless I can reproduce the problem. I've been using mboxes for my own mails fine for years. So if you can figure out a way to easily reproduce this, I'd like to know. If this happens every time for a specific mbox, could you put it through http://dovecot.org/tools/mbox-anonymize.pl and then send the output to me and also the mailbox's dovecot.index and dovecot.index.log files?> The real problem with this behavior is that sometimes, but not in all > cases, the process will leave behind a dotlock when it exits. If the > abandoned dotlock is on the user's inbox, then procmail won't deliver > new mail to the user until the lock is cleared. This is a big > problem.Dovecot writes the process's PID to the dotlock file. Procmail would be able to check if the PID still exists and override the dotlock immediately if it does. But since it doesn't do this, I guess you'd have to modify its sources or create some wrapper script to delete stale dotlocks. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part URL: <http://dovecot.org/pipermail/dovecot/attachments/20071022/856383bb/attachment-0002.bin>
> >Oct 21 19:36:01 postoffice1 dovecot: POP3(someuser): file mbox- > >sync.c: line 1433 (mbox_sync_handle_eof_updates): assertion failed: > >(file_size >= sync_ctx->expunged_space + trailer_size) > > Does the file have CRLFs as linefeeds instead of plain LFs? CRLF > handling is probably still a bit buggy.The mbox files are standard Unix text, so LF only, right? We haven't turned on the CRLF options in dovecot.conf, that's for sure.> If that's not the problem then I'm not really sure.. It's difficult > to do anything about this unless I can reproduce the problem. I've > been using mboxes for my own mails fine for years. So if you can > figure out a way to easily reproduce this, I'd like to know.We've done some more debugging since my original posting. The way to make the problem happen reliably seems to be to attempt to delete all messages from the mbox. Thus, users who POP all their mail off the server are affected, as are users who are using an IMAP client to purge all their email from a given folder. This problem does not seem to be related to a specific email client, since we can make the problem happen with different client programs. But there must be some underlying problem with our setup somehow. I can't believe that Dovecot's code for mailbox truncation is somehow inherently flawed or some other site surely would have noticed this problem before now. A file system buffering issue? I'm not sure. I'm open to suggestions on this problem. If anybody else can reproduce the issue, I'd be interested to hear about it, but in the meantime I'm going to assume that something's hosed with our specific configuration.> >The real problem with this behavior is that sometimes, but not in all > >cases, the process will leave behind a dotlock when it exits. If the > >abandoned dotlock is on the user's inbox, then procmail won't deliver > >new mail to the user until the lock is cleared. This is a big > >problem. > > Dovecot writes the process's PID to the dotlock file. Procmail would > be able to check if the PID still exists and override the dotlock > immediately if it does. But since it doesn't do this, I guess you'd > have to modify its sources or create some wrapper script to delete > stale dotlocks.Right. We've been using the PIDs to manually decide which dotlocks are stranded, but as you say getting procmail to do the right thing would require source code hacks. And the reality is that this wouldn't fix the underlying issue, just sort of "paper over" the problem. I'd prefer to fix the root cause. -- Hal Pomeranz, Founder/CEO Deer Run Associates hal at deer-run.com Network Connectivity and Security, Systems Management, Training