I wanted to follow up on my NFS lock issue with dovecot-uidlist.

After doing some research, the current FreeBSD NFS client (as of
6.2-STABLE at least) appears to have a long-standing bug with caching
of files that have high create/removal rates. Whether the NFS access
cache is enabled or disabled, the NFS client still uses another cache
for certain file attributes, and it requires at least a second to go
by before it will invalidate the entry for a deleted file. If the file
attributes are accessed before the second is up, the timer is
restarted.

Since the dotlocking code in Dovecot micro-sleeps for less than a
second between each check for the .lock file, the entry is never
removed from the client's cache, so the lstat() on the lock file
always returns 0 (success). This prevents the lock file from being
re-created until the stall timeout is reached. All Dovecot processes
(IMAP, POP3, deliver) hang until the kernel invalidates the entry,
which is the hang I reported. Using a sleep() call of more than 1
second after removing the lock and before attempting to use it again
helps, but is obviously not very performance-friendly for a
high-volume mail server.
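
To make the failure mode concrete, here is a rough C sketch of the kind of polling loop described above. It is not Dovecot's actual code: the function name wait_for_lock_release(), its parameters, and its return codes are made up for illustration. If the client keeps answering lstat() from a cached attribute entry, and each call restarts the cache timer, a loop like this only ends when the timeout expires.

```c
#define _POSIX_C_SOURCE 200809L  /* expose lstat() and nanosleep() */

#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical simplification of a dotlock wait loop (not Dovecot code).
 * Polls lock_path every poll_msecs milliseconds.
 * Returns 1 once the lock file is gone, 0 if timeout_secs elapsed while
 * lstat() still reported the file, and -1 on any other lstat() error.
 */
static int wait_for_lock_release(const char *lock_path,
                                 unsigned int timeout_secs,
                                 unsigned int poll_msecs)
{
    struct stat st;
    struct timespec ts;
    time_t deadline = time(NULL) + (time_t)timeout_secs;

    ts.tv_sec = poll_msecs / 1000;
    ts.tv_nsec = (long)(poll_msecs % 1000) * 1000000L;

    for (;;) {
        if (lstat(lock_path, &st) < 0)
            return errno == ENOENT ? 1 : -1;

        /* lstat() succeeded: on the affected NFS client this may be a
           stale cached answer, and asking again restarts the timer. */
        if (time(NULL) >= deadline)
            return 0;

        nanosleep(&ts, NULL);  /* sub-second sleep between checks */
    }
}
```

With a sub-second poll_msecs, every iteration touches the cached attributes again before the one-second window closes, which matches the behaviour seen with truss.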
The other solution I've found that seems to work is updating the
mtime on the .lock file if all other dotlocking checks fail in
check_lock() in src/lib/file-dotlock.c (see attached patch). This
invalidates the cached entry in the kernel and allows lstat() to
return the correct response (-1), as the .lock file no longer
exists. I don't check whether the utime() call fails: a failure just
means the kernel invalidated the entry when it should have, so it can
be ignored.
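
For readers without the attachment, the idea can be sketched roughly as follows. This is not the attached patch and not the real check_lock() code: the helper name lock_file_still_exists() and its return codes are hypothetical, and it only shows the order of operations (touch the mtime, then lstat()).

```c
#define _POSIX_C_SOURCE 200809L  /* expose lstat() */

#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <utime.h>

/*
 * Hypothetical helper (illustration only, not the attached patch).
 * Returns 1 if lock_path still exists, 0 if it is gone, and -1 on any
 * other lstat() error.
 */
static int lock_file_still_exists(const char *lock_path)
{
    struct stat st;

    /* Set atime/mtime to "now". The result is ignored on purpose:
       if the file is already gone this fails, which is fine. The
       point is to make the client drop its cached attributes. */
    (void)utime(lock_path, NULL);

    if (lstat(lock_path, &st) == 0)
        return 1;

    return errno == ENOENT ? 0 : -1;
}
```

Note that this changes the mtime of a lock file that really does exist, so any logic that judges staleness by the lock file's mtime would see it as freshly touched.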

I have performed some high-volume delivery (deliver) and pickup
testing (imap and pop3) using the workaround, and so far everything
has worked as expected for all Dovecot control files, including indexes.

Does anyone know of any side effects the forced mtime update may have
that I may not be seeing?

Thanks again for any assistance.

-Doug
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-dotlock.c.diff
Type: application/octet-stream
Size: 463 bytes
Desc: not available
URL:
<http://dovecot.org/pipermail/dovecot/attachments/20070520/d0079212/attachment-0002.obj>
-------------- next part --------------
On May 17, 2007, at 10:45 AM, Doug Council wrote:
> We are in the process of migrating away from Courier-IMAP/POP3 and
> Maildrop. I want to use Dovecot (LDA, IMAP, POP3). During my
> testing, it has worked great except for dotlocking on the
> dovecot-uidlist file.
>
> The problem:
>
> When a delivery is being made with deliver and a mail client has
> the mailbox open (Thunderbird in this case), neither Thunderbird nor
> deliver can get a dotlock on the dovecot-uidlist file, causing both
> deliver and Thunderbird to hang until the dotlock timeout runs out
> and the lock gets replaced. Once the lock is replaced, both will
> go about their business until the next lock miss and hang again.
> Eventually, everything is delivered and Thunderbird wakes up.
>
> Looking at each of the processes with truss, they are looping
> trying to stat the dovecot-uidlist.lock file, which no longer exists.
>
> We are using NFS, and based on reading through the mailing list
> archives, it can be a little difficult to get working reliably.
> But, I've read quite a few posts with our same or similar
> configuration having good luck with the setup. To rule out
> multi-client access issues for now, I've been doing all testing
> with a single NFS client.
>
> Our configuration:
>
> NetApp filers for storage
> FreeBSD 6.2-RELEASE NFS clients
> Postfix 2.3.9 MTA
> Dovecot 1.0.0 LDA for local deliveries
> Dovecot 1.0.0 IMAP for pickup
>
> My dovecot.conf file is at the end of this message. NFS access
> caching on the FreeBSD client has been disabled
> (vfs.nfs.access_cache_timeout = 0, see NFS mount options below).
> The Postfix destination recipient and concurrency limits for the
> Dovecot LDA are both set to 1.
>
> The NFS mount options:
>
> rw,tcp,-r=32768,-w=32768,nfsv3,dumbtimer,noatime,acregmin=0,
> acregmax=0,acdirmin=0,acdirmax=0
>
> The dovecot.conf file:
>
> protocols = imap imaps pop3 pop3s
> disable_plaintext_auth = no
> syslog_facility = local0
> ssl_cert_file = /nethere/conf/dovecot/ssl-nh-cert.pem
> ssl_key_file = /nethere/conf/dovecot/ssl-nh-key.pem
> login_greeting = Server ready.
> login_log_format_elements = user=<%u> ip=[%r] method=%m encryption=%c pid=%p
> login_log_format = %U$: %s
> mail_location = maildir:~/Maildir:INDEX=MEMORY
> mmap_disable = yes
> dotlock_use_excl = no
> lock_method = dotlock
> first_valid_uid = 200
> last_valid_uid = 200
> first_valid_gid = 200
> last_valid_gid = 200
> maildir_copy_with_hardlinks = yes
>
> namespace private {
>   prefix = INBOX.
>   inbox = yes
> }
>
> protocol imap {
>   login_executable = /usr/local/libexec/dovecot/imap-login
>   mail_executable = /usr/local/libexec/dovecot/imap
>   imap_client_workarounds = outlook-idle delay-newmail
> }
>
> protocol pop3 {
>   login_executable = /usr/local/libexec/dovecot/pop3-login
>   mail_executable = /usr/local/libexec/dovecot/pop3
>   pop3_uidl_format = UID%u-%v
>   pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
> }
>
> protocol lda {
>   postmaster_address = postmaster at nethere.com
>   sendmail_path = /usr/sbin/sendmail
>   auth_socket_path = /var/run/dovecot/auth-master
>   syslog_facility = mail
> }
>
> auth_executable = /usr/local/libexec/dovecot/dovecot-auth
>
> auth default {
>   mechanisms = plain digest-md5 cram-md5
>   passdb ldap {
>     args = /nethere/conf/dovecot/dovecot-ldap.conf
>   }
>   userdb ldap {
>     args = /nethere/conf/dovecot/dovecot-ldap.conf
>   }
>   user = root
>   socket listen {
>     master {
>       path = /var/run/dovecot/auth-master
>       mode = 0600
>       user = mailuser
>       group = mailuser
>     }
>   }
> }
>
> It may just be "how it works", but the lock contention seems a
> little too fragile for busy mailboxes.
>
> Does anyone have any ideas? Thanks in advance for any assistance.
>
> -Doug