Hi,

We are using mbox and UW IMAP, and are having issues because UW IMAP does not support multiple client access when you use NFS for the mail files. I've done some research (see the comments after my 2 questions below), and most of the other major IMAP servers simply don't support it (courier is about the only one that has a positive comment).

How much testing have you done w/NFS-mounted mail in mbox format?
Have you found a reliable way to handle NFS mbox files?

The UW IMAP docs say that multiple client access to mail spools that are NFS mounted is not safe, and the source has comments like this about NFS mail spools:

    /* Make fcntl() locking of NFS files be a no-op the way it is with flock()
     * on BSD.  This is because the rpc.statd/rpc.lockd daemons don't work very
     * well and cause cluster-wide hangs if you exercise them at all.  The
     * result of this is that you lose the ability to detect shared mail_open()
     * on NFS-mounted files.  If you are wise, you'll use IMAP instead of NFS
     * for mail files.
     *
     * Sun alleges that it doesn't matter, because they say they have fixed all
     * the rpc.statd/rpc.lockd bugs.  This is absolutely not true; huge amounts
     * of user and support time have been wasted in cluster-wide hangs.
     */

Cyrus' FAQ says this about NFS mail spools:
(http://asg.web.cmu.edu/cyrus/imapd/install-FAQ.html#nfs)

    NFS (don't)

    Q: Weird things are happening. The mailspool is mounted over NFS, and...

    A: Don't mount the mail spool over NFS. Due to locking problems with NFS, this doesn't work (and for various other reasons, probably isn't worth the bother; IMAP appears to be I/O bound, not CPU bound, and scaling in this way is not a good idea). More specifically, if you use NFS and it looks to work, you may have locking problems in the future which will result in silently lost mail. If you use NFS and things go berzerk, it's because mmap(2) apparently has different semantics on local disk than it has on NFS.

You can get the above behavior (silently lost mail) if you work at it.

Courier is the only one that seems friendly (excerpt from their web site):

    Courier-IMAP is popular on Qmail/Exim/Postfix sites that are configured to use maildirs. The primary advantage of maildirs is that multiple applications can access the same Maildir simultaneously without requiring any kind of locking whatsoever. Maildir is a faster and more efficient way to store mail. It works particularly well over NFS, which has a long history of locking-related woes.

--
Anthony Kay
University Computing Center
(541) 346-1719
GPG Fingerprint: B0DB D46A 60AF FAE7 A94A 5075 0CB4 4D88 9F4F 7F09

When they discover the center of the universe, a lot of people will be disappointed to discover they are not it.
    Bernard Bailey

Tony Kay wrote:
> Courier is the only one that seems friendly (excerpt from their web
> site):
>
> Courier-IMAP is popular on Qmail/Exim/Postfix sites that are
> configured to use maildirs. The primary advantage of maildirs is that
> multiple applications can access the same Maildir simultaneously
> without requiring any kind of locking whatsoever. Maildir is a faster
> and more efficient way to store mail. It works particularly well over
> NFS, which has a long history of locking-related woes.

Please note that this is in reference to Maildir NOT in reference to mbox format mail stores.

Leeman

On Mon, 2006-01-30 at 11:50 -0800, Tony Kay wrote:
> Hi,
>
> We are using mbox and UW IMAP, and are having issues because UW IMAP does not support multiple client access when you use NFS for the mail files. I've done some research (see the comments after my 2 questions below), and most of the other major IMAP servers simply don't support it (courier is about the only one that has a positive comment).
>
> How much testing have you done w/NFS-mounted mail in mbox format?
> Have you found a reliable way to handle NFS mbox files?

Dovecot allows you to configure how to do locking for mbox files, so if you're confident enough in your NFS lockd daemon you can just use it. But I think using dotlocks for reading should work well enough too. The only problem it causes is that if a client is reading one large mail for a long time, other clients/connections can't read the mbox at the same time. Probably doesn't matter much.

Dovecot's indexes currently don't behave perfectly with NFS, so it might be a better idea to store them locally (especially if all IMAP access goes through one computer). Although I'm going to fix that soon too..
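
For readers unfamiliar with dotlocks, here is a minimal sketch of the general technique in Python. It is not Dovecot's implementation. The function names, the timeout handling, and the temp-file naming are all assumptions made for illustration. It uses the widely documented link() trick: create a uniquely named file, hard-link it to `<mbox>.lock`, and treat a link count of 2 as proof the lock was obtained, since the return value of link() alone may not be trustworthy over NFS.

```python
import os
import socket
import time


def acquire_dotlock(mbox_path, timeout=10.0, poll=0.1):
    """Try to create mbox_path + '.lock'. Return True on success, False on timeout.

    Hypothetical helper for illustration only; real servers also handle
    stale locks, which this sketch does not.
    """
    lock_path = mbox_path + ".lock"
    # A name that should be unique per host and process.
    tmp_path = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                os.link(tmp_path, lock_path)
            except OSError:
                pass  # Lock probably held; the link count below decides.
            # Two names for the same inode means our link() really happened.
            if os.stat(tmp_path).st_nlink == 2:
                return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
    finally:
        # The temp name is no longer needed; on success the lock file remains.
        os.unlink(tmp_path)


def release_dotlock(mbox_path):
    """Remove the lock file created by acquire_dotlock()."""
    os.unlink(mbox_path + ".lock")
```

A reader or writer would call `acquire_dotlock(path)` before touching the mbox and `release_dotlock(path)` afterwards. The drawback mentioned above follows directly: while one connection holds the lock for a long read, every other connection on any host waits in the polling loop.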

On Tue, 2006-01-31 at 22:42 -0800, Tony Kay wrote:
> How does dovecot handle this kind of conflict. I.e. how would one dovecot IMAP on server A detect that dovecot on server B had rewritten something like flags on the mailbox they are both accessing? I'm sure it doesn't leave the dot lock sitting around, since that would block mail delivery.

It just checks if mbox's mtime has changed. If it has, it checks if there are new mails or if it needs to do some other synchronization (mbox_dirty_syncs / mbox_very_dirty_syncs causes it to delay it as long as possible).

UW-IMAP also checks if mtime has changed, but instead of trying to figure out what changed it just disconnects the client with an "unexpected mbox change" error.
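
The mtime check described above can be sketched roughly as follows, in Python. This is only an illustration of the idea, not Dovecot's actual sync logic. The names `snapshot` and `classify_change` are hypothetical, and the "appended" test is a deliberately simple heuristic (the file grew and the bytes at the old end start a new `From ` line). It does not prove that the earlier part of the file is untouched, so a real server has to do more work than this.

```python
import os


def snapshot(path):
    """Record the state we last synced against: (mtime in ns, size in bytes)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def classify_change(path, old):
    """Compare the mbox on disk with an earlier snapshot.

    Returns "unchanged", "appended" (looks like new mail only), or
    "rewritten" (anything else, so a full resync would be needed).
    """
    old_mtime, old_size = old
    new_mtime, new_size = snapshot(path)
    if (new_mtime, new_size) == (old_mtime, old_size):
        return "unchanged"
    if new_size > old_size:
        with open(path, "rb") as f:
            f.seek(old_size)
            # New data starting with an mbox "From " separator suggests
            # that a message was appended after our last sync.
            if f.read(5) == b"From ":
                return "appended"
    return "rewritten"
```

The difference between the two servers in the message above is what happens on a result other than "unchanged": one tries to resynchronize, the other drops the connection.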

On Wed, 01 Feb 2006 11:45:16 +0200, Timo Sirainen <tss at iki.fi> wrote:
> On Tue, 2006-01-31 at 22:42 -0800, Tony Kay wrote:
> > How does dovecot handle this kind of conflict. I.e. how would one dovecot IMAP on server A detect that dovecot on server B had rewritten something like flags on the mailbox they are both accessing? I'm sure it doesn't leave the dot lock sitting around, since that would block mail delivery.
>
> It just checks if mbox's mtime has changed. If it has, it checks if
> there are new mails or if it needs to do some other synchronization
> (mbox_dirty_syncs / mbox_very_dirty_syncs causes it to delay it as long
> as possible).
>
> UW-IMAP also checks if mtime has changed, but instead of trying to
> figure out what changed it just disconnects the client with an
> "unexpected mbox change" error.

UW uses a "kiss of death" signal for collaborative locking...the mtime changes are all assumed to be appends as far as I know. The problem (which I can reproduce) comes with a situation like this:

- IMAP host A and B access INBOX for the same user from different physical servers which share the inbox via NFS
- Neither sees the other (because UW imap assumes lingering opens come from c-client apps, and it uses signals/flock/tmp to sync them, which are not available when A and B are on separate hosts).
- Delete and expunge a message on B
- 'A' can still read the expunged message.
- Mark two messages for delete on A, including the one previously expunged by B
- New message delivered to inbox (possibly by yet another host)
- Expunge on A
- New message is lost

My theory is that the two IMAPs are both holding a fd open, and this causes one of the copies to become stale (in the sense it no longer has a filesystem name). The new mail is delivered to the non-stale box, and then the expunge on 'A' causes the stale open file to be "rewritten" over the top of the non-stale one. Result: mail loss. We also see things like message flags that magically reset themselves, etc.

I did some testing w/dovecot 0.99 earlier in this same config, and it seems like it has some issues. For example a delete/expunge on IMAP A was "sort-of" seen by B (fetch complained that the UIDs had changed, and would not work until a NOOP was done...this is technically OK), but B magically restored the deleted message with a new UID (which is not so good, since a client that caches stuff may now have a strange view of the world...I need to examine the set of "update" notices that went along with these events to better understand the implications).

This also worries me because the "magic" reappearance indicates a copy from a stale fd, which could exhibit the same mail loss I produced w/UW IMAP.

I am planning on updating to your latest source tree tomorrow (I am out of time today), to see if I can make it "lose" new mail.

Opinions???

Thanks!

Tony
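
The lost-mail sequence and the stale-fd theory in the message above can be imitated on a local filesystem with a short Python sketch. This is purely a simulation of the theory as stated, not UW IMAP's code and not a reproduction of real NFS behavior. The function name, the one-line "messages", and the use of rename to make host A's open file lose its name are all assumptions chosen for illustration. It relies on POSIX rename semantics.

```python
import os


def simulate_stale_fd_loss(directory):
    """Replay the A/B/delivery sequence; return the messages left in the inbox."""
    path = os.path.join(directory, "inbox")
    with open(path, "w") as f:
        f.write("msg1\nmsg2\nmsg3\n")

    # Host A opens the mailbox and keeps the descriptor around.
    fd_a = open(path, "r")

    # Host B expunges msg1 by writing a new file and renaming it into place.
    # A's descriptor now refers to a file that no longer has a name.
    tmp = path + ".new"
    with open(tmp, "w") as f:
        f.write("msg2\nmsg3\n")
    os.replace(tmp, path)

    # A new message is delivered to the named (non-stale) file.
    with open(path, "a") as f:
        f.write("new\n")

    # Host A still sees msg1, marks msg1 and msg2 deleted, and expunges by
    # rewriting the mailbox from its stale view.
    stale_view = fd_a.read().splitlines()
    fd_a.close()
    kept = [m for m in stale_view if m not in ("msg1", "msg2")]
    with open(path, "w") as f:
        f.write("".join(m + "\n" for m in kept))

    with open(path) as f:
        return f.read().splitlines()
```

Running it in an empty scratch directory leaves only `msg3` in the inbox: the delivered message is gone because A never read it before rewriting, which is the outcome the scenario describes.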

On Thu, 02 Feb 2006 11:48:02 +0200, Timo Sirainen <tss at iki.fi> wrote:
> -- SNIP --
> As long as both hosts keep using dotlocking, Dovecot should be able to
> handle the above without any corruption problems.
>
> > I did some testing w/dovecot 0.99 earlier in this same config,
>
> Forget those results and try with 1.0beta instead :)
>

I ran tests on 1.0b2. I could _not_ make dovecot misbehave on NFS w/two physical servers accessing the same mailbox. All of the "undesirable" things that happened (i.e. really poorly written clients suffering strange views of the mailbox due to their own stupidity) were well within IMAP spec., and are non-destructive/lossy.

Kudos Timo! Thanks for writing some really excellent code.

--
Anthony Kay
University Computing Center
(541) 346-1719
GPG Fingerprint: B0DB D46A 60AF FAE7 A94A 5075 0CB4 4D88 9F4F 7F09

The camera makes everyone a tourist in other people's reality, and eventually in one's own.
    Susan Sontag