Apps Lists
2006-Apr-28 13:25 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
I'm running beta7 on two machines, with maildir on NFS. I have lockd running on all machines. I've found that Dovecot is highly unstable with NFS when accessing a mailbox on more than one machine at the same time. Both dovecot machines have: mmap_disable = yes lock_method = fcntl NFS is version 3, exported from a third linux machine. All machines are running 2.6.9 kernel. Any ideas what's going wrong here? Steve *** dev4 *** Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230258.P2889Q0M407851.dev4.neonova.net:2, Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0 Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line 883 (mail_index_sync_from_transactions): assertion failed: (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset) Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6 *** dev3 *** Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146230254.P2888Q0M651598.dev4.neonova.net Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4036 -> 1772 Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4036 -> 1774) Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2, Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4038 -> 1774 Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4038 -> 1775) Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: 1146145480.P5976Q0M245663.dev3.neonova.net:2, Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775) Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension introduction for unknown id 0 Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line 881 (mail_index_sync_from_transactions): assertion failed: (hdr.messages_count == (*map)->hdr.messages_count) Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered (4039 -> 1775) Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039 Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID 1775, but next_uid = 4039 Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is in inconsistent state, please relogin. Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file /var/mailstore/72/af/375887/Maildir/dovecot.index: first_recent_uid_lowwater 4039 -> 1776
Timo Sirainen
2006-Apr-28 13:30 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
On Fri, 2006-04-28 at 09:25 -0400, Apps Lists wrote:> Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file > /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: > 1146230258.P2889Q0M407851.dev4.neonova.net:2,I saw these kind of problems with 2.6.16 kernel. They got fixed after I applied utime() fix. Also in general with current code you're going to have problems if you have attribute cache enabled. See http://wiki.dovecot.org/NFS -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: <http://dovecot.org/pipermail/dovecot/attachments/20060428/3e937012/attachment.bin>
Luis Melendez
2006-Apr-28 13:59 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
Hi all. We also had problems with client NFS locking using RHEL4, kernel 2.6.9 Some of them were solved updating the kelnel from RedHat. Hope this helps. Regards. Apps Lists wrote:>I'm running beta7 on two machines, with maildir on NFS. I have lockd >running on all machines. I've found that Dovecot is highly unstable with >NFS when accessing a mailbox on more than one machine at the same time. > >Both dovecot machines have: > >mmap_disable = yes >lock_method = fcntl > >NFS is version 3, exported from a third linux machine. All machines are >running 2.6.9 kernel. > >Any ideas what's going wrong here? > >Steve > > >*** dev4 *** > >Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist file >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >1146230258.P2889Q0M407851.dev4.neonova.net:2, > >Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log file >/var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >introduction for unknown id 0 > >Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line 883 >(mail_index_sync_from_transactions): assertion failed: >(hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset) > > >Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6 > > >*** dev3 *** > > > >Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >1146230254.P2888Q0M651598.dev4.neonova.net > >Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file >/var/mailstore/72/af/375887/Maildir/dovecot.index: >first_recent_uid_lowwater 4036 -> 1772 > >Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered >(4036 -> 1774) > >Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >1146145480.P5976Q0M245663.dev3.neonova.net:2, > >Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file >/var/mailstore/72/af/375887/Maildir/dovecot.index: >first_recent_uid_lowwater 4038 -> 1774 > >Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered >(4038 -> 1775) > >Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist file >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >1146145480.P5976Q0M245663.dev3.neonova.net:2, > >Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered >(4039 -> 1775) > >Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log file >/var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >introduction for unknown id 0 > >Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line 881 >(mail_index_sync_from_transactions): assertion failed: (hdr.messages_count >== (*map)->hdr.messages_count) > >Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): >/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was lowered >(4039 -> 1775) > >Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file >/var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID >1775, but next_uid = 4039 > >Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log file >/var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with UID >1775, but next_uid = 4039 > >Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log >desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index > >Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is in >inconsistent state, please relogin. > >Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file >/var/mailstore/72/af/375887/Maildir/dovecot.index: >first_recent_uid_lowwater 4039 -> 1776 > > > > > > >-- +----------------------------------------------^-----------------------+ | Luis Mel?ndez Aganzo ^ Email: luism at uco.es | | Servicio de Inform?tica ^ Tlf: 34-(9)57-211022 | | Analista - ?rea de Sistemas ^ Fax: 34-(9)57-218116 | | Universidad de C?rdoba (SPAIN) ^ http://www.uco.es | +----------------------------------------------^-----------------------+
Apps Lists
2006-Apr-28 14:58 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
> Of course. > We upgraded to the last kernel available (in that moment) for RHEL4: > 2.6.9-34.ELsmp > The release notes mentioned that some important issues related to > locking in client > NFS environment were missing in the kernel included in the distribution. >Darn. That's the same kernel I'm running. Must not be my lucky day. Steve> Regards. > > Apps Lists wrote: > >>>Hi all. >>>We also had problems with client NFS locking using RHEL4, kernel >>> 2.6.9 Some of them were solved updating the kelnel from RedHat. >>>Hope this helps. >>> >>> >>> >> >>Hi Luis. >> >>Can you tell me which kernel you upgraded to? >> >>Thanks. >> >>Steve >> >> >> >> >>>Regards. >>> >>>Apps Lists wrote: >>> >>> >>> >>>>I'm running beta7 on two machines, with maildir on NFS. I have lockd >>>> running on all machines. I've found that Dovecot is highly unstable >>>> with NFS when accessing a mailbox on more than one machine at the >>>> same time. >>>> >>>>Both dovecot machines have: >>>> >>>>mmap_disable = yes >>>>lock_method = fcntl >>>> >>>>NFS is version 3, exported from a third linux machine. All machines >>>> are running 2.6.9 kernel. >>>> >>>>Any ideas what's going wrong here? >>>> >>>>Steve >>>> >>>> >>>>*** dev4 *** >>>> >>>>Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in uidlist >>>> file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>1146230258.P2889Q0M407851.dev4.neonova.net:2, >>>> >>>>Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction log >>>> file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>> introduction for unknown id 0 >>>> >>>>Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line >>>> 883 (mail_index_sync_from_transactions): assertion failed: >>>>(hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset) >>>> >>>> >>>>Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with signal 6 >>>> >>>> >>>>*** dev3 *** >>>> >>>> >>>> >>>>Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist >>>> file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>1146230254.P2888Q0M651598.dev4.neonova.net >>>> >>>>Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file >>>>/var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>first_recent_uid_lowwater 4036 -> 1772 >>>> >>>>Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): >>>>/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>> lowered (4036 -> 1774) >>>> >>>>Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist >>>> file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>> >>>>Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file >>>>/var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>first_recent_uid_lowwater 4038 -> 1774 >>>> >>>>Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): >>>>/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>> lowered (4038 -> 1775) >>>> >>>>Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in uidlist >>>> file /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>> >>>>Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): >>>>/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>> lowered (4039 -> 1775) >>>> >>>>Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction log >>>> file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>> introduction for unknown id 0 >>>> >>>>Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line >>>> 881 (mail_index_sync_from_transactions): assertion failed: >>>>(hdr.messages_count == (*map)->hdr.messages_count) >>>> >>>>Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): >>>>/var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>> lowered (4039 -> 1775) >>>> >>>>Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log >>>> file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append >>>> with UID 1775, but next_uid = 4039 >>>> >>>>Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction log >>>> file /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append >>>> with UID 1775, but next_uid = 4039 >>>> >>>>Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected transaction log >>>> desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index >>>> >>>>Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: Mailbox is >>>> in inconsistent state, please relogin. >>>> >>>>Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file >>>>/var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>first_recent_uid_lowwater 4039 -> 1776 >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>-- >>>+----------------------------------------------^-----------------------+ >>> | Luis Mel?ndez Aganzo ^ Email: luism at uco.es >>> | | Servicio de Inform?tica ^ Tlf: >>> 34-(9)57-211022 | | Analista - ?rea de Sistemas ^ >>> Fax: 34-(9)57-218116 | | Universidad de C?rdoba (SPAIN) >>> ^ http://www.uco.es | >>> +----------------------------------------------^-----------------------+ >>> >>> >> >> >> >> >> > > > -- > +----------------------------------------------^-----------------------+ > | Luis Mel?ndez Aganzo ^ Email: luism at uco.es | > | Servicio de Inform?tica ^ Tlf: 34-(9)57-211022 | > | Analista - ?rea de Sistemas ^ Fax: 34-(9)57-218116 | > | Universidad de C?rdoba (SPAIN) ^ http://www.uco.es | > +----------------------------------------------^-----------------------+
Apps Lists
2006-Apr-28 18:27 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
Rich, We're also in the process of setting up a GFS/Cluster with CentOS. I'm hopeful this will eliminate some of the issues. I'll keep everyone posted. Steve> Hi Steve, > > We'll give that a try. I'm sure users would prefer a slightly slower > mail experience if it ensured accurate mail lists and reduced the > chance for lost mail. > > Rich Sandberg > richs at whidbey.net > Systems & Network Administrator > Whidbey Telecom Network Operations > > > > On Apr 28, 2006, at 9:37 AM, Apps Lists wrote: > >> Hi Rich. >> >> I've seen a lot of the same IMAP errors... haven't played with POP >> yet. >> >> You might want to try adding "noac" to the mount options... your >> line.. >> >> nfshost:/var/mail /var/mail nfs >> rw,hard,intr,rsize=8192,wsize=8192 0 0 >> >> .... would become: >> >> >> nfshost:/var/mail /var/mail nfs >> rw,hard,intr,rsize=8192,wsize=8192,noac 0 0 >> >> >> .... and see if that helps. I haven't seen much if any difference. >> >> What I don't understand is how some people are running what appears >> to be >> the same configuration ... RHEL/CentOS ... NFS3, 2+ Dovecot >> machines with >> IMAP and have no problems. >> >> I'd love to get to the bottom of this. >> >> Steve >> >>> We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3 >>> Dovecot servers. Our most concerning episode was a failed APPEND >>> operation (with a set of folders containing 200 messages): >>> >>> IMAP(username): Corrupted transaction log file /var/mail/username/ >>> Maildir/dovecot.index.log: unexpected end of file while reading >>> header >>> >>> That resulted in one of the new APPENDed mail folders and content >>> being lost. We see those errors at other times too, although they >>> don't always end in a noticeable failure. >>> >>> Continuous other errors we see that might be NFS related: >>> >>> IMAP(username): close() failed with index cache file /var/mail/ >>> username/Maildir/.Deleted Messages/dovecot.index.cache: Stale NFS >>> file >>> handle >>> IMAP(username): Corrupted index cache file /var/mail/username/ >>> Maildir/.OldMail/dovecot.index.cache: invalid record size >>> IMAP(username): close() failed with index file /var/mail/username/ >>> Maildir/.INBOX.Security/dovecot.index: Bad file descriptor >>> >>> We also see errors with POP3, which would rarely benefit from indexes >>> (since 92% of our users RETR-and-DELE messages): >>> >>> POP3(username): file_dotlock_create() failed with file /var/mail/ >>> username/Maildir/dovecot.index.log: Stale NFS file handle >>> POP3(username): Corrupted transaction log file /var/mail/username/ >>> Maildir/dovecot.index.log: record size too small (type=0x8080, >>> offset=9056, size=0) >>> POP3(username): Our dotlock file /var/mail/username/Maildir/ >>> dovecot.index.log.lock was deleted (kept it -118 secs) >>> POP3(username): Timeout while waiting for release of dotlock for >>> transaction log file /var/mail/username/Maildir/dovecot.index.log >>> >>> We've tried manually removing "dovecot*" from all folders, allowing >>> the indexes to be re-built, but the errors return. >>> >>> Excerpt from our config: >>> >>> mail_cache_fields mail_never_cache_fields mail_cache_min_mail_count >>> >> 1000 mailbox_idle_check_interval = 10 mail_full_filesystem_access >>> = no >>> mail_read_mmaped = no >>> mmap_disable = yes >>> mmap_no_write = no >>> lock_method = dotlock >>> maildir_stat_dirs = yes >>> maildir_copy_with_hardlinks = no >>> >>> Our NFS mount: >>> >>> nfshost:/var/mail /var/mail nfs >>> rw,hard,intr,rsize=8192,wsize=8192 0 0 >>> >>> I've read some of the list archives about disabling the index files. >>> We realize that indexes should be beneficial, but we'd like to at >>> least experiment with a "no index" install. This would solve some of >>> the NFS locking issues, eliminate the need to modify our MTA >>> behavior, and remove (potentially) millions of files from the system. >>> >>> Did anyone discover a method to cleanly disable the indexes? At least >>> for POP3? >>> >>> Thanks! >>> >>> Rich Sandberg >>> richs at whidbey.net >>> >>> >>> On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote: >>> >>>> Hi all. >>>> We also had problems with client NFS locking using RHEL4, kernel >>>> 2.6.9 >>>> Some of them were solved updating the kelnel from RedHat. >>>> Hope this helps. >>>> >>>> Regards. >>>> >>>> Apps Lists wrote: >>>> >>>>> I'm running beta7 on two machines, with maildir on NFS. I have >>>>> lockd >>>>> running on all machines. I've found that Dovecot is highly >>>>> unstable with >>>>> NFS when accessing a mailbox on more than one machine at the same >>>>> time. >>>>> >>>>> Both dovecot machines have: >>>>> >>>>> mmap_disable = yes >>>>> lock_method = fcntl >>>>> >>>>> NFS is version 3, exported from a third linux machine. All >>>>> machines are >>>>> running 2.6.9 kernel. >>>>> >>>>> Any ideas what's going wrong here? >>>>> >>>>> Steve >>>>> >>>>> >>>>> *** dev4 *** >>>>> >>>>> Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in >>>>> uidlist file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>> 1146230258.P2889Q0M407851.dev4.neonova.net:2, >>>>> >>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction >>>>> log file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>>> introduction for unknown id 0 >>>>> >>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: line >>>>> 883 >>>>> (mail_index_sync_from_transactions): assertion failed: >>>>> (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset) >>>>> >>>>> >>>>> Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with >>>>> signal 6 >>>>> >>>>> >>>>> *** dev3 *** >>>>> >>>>> >>>>> >>>>> Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>> uidlist file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>> 1146230254.P2888Q0M651598.dev4.neonova.net >>>>> >>>>> Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>> first_recent_uid_lowwater 4036 -> 1772 >>>>> >>>>> Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>> lowered >>>>> (4036 -> 1774) >>>>> >>>>> Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>> uidlist file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>>> >>>>> Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>> first_recent_uid_lowwater 4038 -> 1774 >>>>> >>>>> Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>> lowered >>>>> (4038 -> 1775) >>>>> >>>>> Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>> uidlist file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>>> >>>>> Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>> lowered >>>>> (4039 -> 1775) >>>>> >>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>> log file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>>> introduction for unknown id 0 >>>>> >>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: line >>>>> 881 >>>>> (mail_index_sync_from_transactions): assertion failed: >>>>> (hdr.messages_count >>>>> == (*map)->hdr.messages_count) >>>>> >>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): >>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>> lowered >>>>> (4039 -> 1775) >>>>> >>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>> log file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with >>>>> UID >>>>> 1775, but next_uid = 4039 >>>>> >>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>> log file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append with >>>>> UID >>>>> 1775, but next_uid = 4039 >>>>> >>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected >>>>> transaction log >>>>> desync with index /var/mailstore/72/af/375887/Maildir/dovecot.index >>>>> >>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: >>>>> Mailbox is >>>>> in >>>>> inconsistent state, please relogin. >>>>> >>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file >>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>> first_recent_uid_lowwater 4039 -> 1776 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> +---------------------------------------------- >>>> ^-----------------------+ >>>> | Luis Mel?ndez Aganzo ^ Email: >>>> luism at uco.es | >>>> | Servicio de Inform?tica ^ Tlf: 34-(9) >>>> 57-211022 | >>>> | Analista - ?rea de Sistemas ^ Fax: 34-(9) >>>> 57-218116 | >>>> | Universidad de C?rdoba (SPAIN) ^ http:// >>>> www.uco.es | >>>> +---------------------------------------------- >>>> ^-----------------------+
richs at whidbey.net
2006-May-01 15:05 UTC
[Dovecot] Maildir + NFS + multiple machines = spectacular failure
Adding "noac" to our /etc/fstab NFS mounts seems to have helped our index errors so far: nfsserver:/var/mail /var/mail nfs rw,hard,intr,rsize=8192,wsize=8192,noac 0 0 Over the weekend we had 1,465,031 POP3 sessions across 3 servers, and only logged 15 "Corrupted" errors. They were of these types: Corrupted transaction log file /var/mail/(user)/Maildir/ dovecot.index.log: unexpected end of file while reading header Corrupted transaction log file /var/mail/(user)/Maildir/ dovecot.index.log: end_offset (236) > current sync_offset (200) Corrupted transaction log file /var/mail/(user)/Maildir/ dovecot.index.log: record size too small (type=0x8080, offset=7260, size=0) Corrupted transaction log file /var/mail/(user)/Maildir/ dovecot.index.log: hdr.size too large (0) We also had 1,929 IMAP sessions from our testing support staff, and only 2 errors: Corrupted transaction log file /var/mail/(user)/Maildir/ dovecot.index.log: record size too small (type=0x5ef9, offset=200, size=0) Corrupted transaction log file /var/mail/(user)/Maildir/.Junk E-mail/ dovecot.index.log: end_offset (15292) > current sync_offset (15236) Rich On Apr 28, 2006, at 11:27 AM, Apps Lists wrote:> Rich, > > We're also in the process of setting up a GFS/Cluster with > CentOS. I'm > hopeful this will eliminate some of the issues. > > I'll keep everyone posted. > > Steve > >> Hi Steve, >> >> We'll give that a try. I'm sure users would prefer a slightly slower >> mail experience if it ensured accurate mail lists and reduced the >> chance for lost mail. >> >> Rich Sandberg >> richs at whidbey.net >> Systems & Network Administrator >> Whidbey Telecom Network Operations >> >> >> >> On Apr 28, 2006, at 9:37 AM, Apps Lists wrote: >> >>> Hi Rich. >>> >>> I've seen a lot of the same IMAP errors... haven't played with POP >>> yet. >>> >>> You might want to try adding "noac" to the mount options... your >>> line.. >>> >>> nfshost:/var/mail /var/mail nfs >>> rw,hard,intr,rsize=8192,wsize=8192 0 0 >>> >>> .... would become: >>> >>> >>> nfshost:/var/mail /var/mail nfs >>> rw,hard,intr,rsize=8192,wsize=8192,noac 0 0 >>> >>> >>> .... and see if that helps. I haven't seen much if any difference. >>> >>> What I don't understand is how some people are running what appears >>> to be >>> the same configuration ... RHEL/CentOS ... NFS3, 2+ Dovecot >>> machines with >>> IMAP and have no problems. >>> >>> I'd love to get to the bottom of this. >>> >>> Steve >>> >>>> We're using 2.6.9-34, the latest RHEL4 kernel, over NFSv3 with 3 >>>> Dovecot servers. Our most concerning episode was a failed APPEND >>>> operation (with a set of folders containing 200 messages): >>>> >>>> IMAP(username): Corrupted transaction log file /var/mail/username/ >>>> Maildir/dovecot.index.log: unexpected end of file while reading >>>> header >>>> >>>> That resulted in one of the new APPENDed mail folders and content >>>> being lost. We see those errors at other times too, although they >>>> don't always end in a noticeable failure. >>>> >>>> Continuous other errors we see that might be NFS related: >>>> >>>> IMAP(username): close() failed with index cache file /var/mail/ >>>> username/Maildir/.Deleted Messages/dovecot.index.cache: Stale NFS >>>> file >>>> handle >>>> IMAP(username): Corrupted index cache file /var/mail/username/ >>>> Maildir/.OldMail/dovecot.index.cache: invalid record size >>>> IMAP(username): close() failed with index file /var/mail/username/ >>>> Maildir/.INBOX.Security/dovecot.index: Bad file descriptor >>>> >>>> We also see errors with POP3, which would rarely benefit from >>>> indexes >>>> (since 92% of our users RETR-and-DELE messages): >>>> >>>> POP3(username): file_dotlock_create() failed with file /var/mail/ >>>> username/Maildir/dovecot.index.log: Stale NFS file handle >>>> POP3(username): Corrupted transaction log file /var/mail/username/ >>>> Maildir/dovecot.index.log: record size too small (type=0x8080, >>>> offset=9056, size=0) >>>> POP3(username): Our dotlock file /var/mail/username/Maildir/ >>>> dovecot.index.log.lock was deleted (kept it -118 secs) >>>> POP3(username): Timeout while waiting for release of dotlock for >>>> transaction log file /var/mail/username/Maildir/dovecot.index.log >>>> >>>> We've tried manually removing "dovecot*" from all folders, allowing >>>> the indexes to be re-built, but the errors return. >>>> >>>> Excerpt from our config: >>>> >>>> mail_cache_fields mail_never_cache_fields mail_cache_min_mail_count >>>>>> 1000 mailbox_idle_check_interval = 10 mail_full_filesystem_access >>>> = no >>>> mail_read_mmaped = no >>>> mmap_disable = yes >>>> mmap_no_write = no >>>> lock_method = dotlock >>>> maildir_stat_dirs = yes >>>> maildir_copy_with_hardlinks = no >>>> >>>> Our NFS mount: >>>> >>>> nfshost:/var/mail /var/mail nfs >>>> rw,hard,intr,rsize=8192,wsize=8192 0 0 >>>> >>>> I've read some of the list archives about disabling the index >>>> files. >>>> We realize that indexes should be beneficial, but we'd like to at >>>> least experiment with a "no index" install. This would solve >>>> some of >>>> the NFS locking issues, eliminate the need to modify our MTA >>>> behavior, and remove (potentially) millions of files from the >>>> system. >>>> >>>> Did anyone discover a method to cleanly disable the indexes? At >>>> least >>>> for POP3? >>>> >>>> Thanks! >>>> >>>> Rich Sandberg >>>> richs at whidbey.net >>>> >>>> >>>> On Apr 28, 2006, at 6:59 AM, Luis Melendez wrote: >>>> >>>>> Hi all. >>>>> We also had problems with client NFS locking using RHEL4, kernel >>>>> 2.6.9 >>>>> Some of them were solved updating the kelnel from RedHat. >>>>> Hope this helps. >>>>> >>>>> Regards. >>>>> >>>>> Apps Lists wrote: >>>>> >>>>>> I'm running beta7 on two machines, with maildir on NFS. I have >>>>>> lockd >>>>>> running on all machines. I've found that Dovecot is highly >>>>>> unstable with >>>>>> NFS when accessing a mailbox on more than one machine at the same >>>>>> time. >>>>>> >>>>>> Both dovecot machines have: >>>>>> >>>>>> mmap_disable = yes >>>>>> lock_method = fcntl >>>>>> >>>>>> NFS is version 3, exported from a third linux machine. All >>>>>> machines are >>>>>> running 2.6.9 kernel. >>>>>> >>>>>> Any ideas what's going wrong here? >>>>>> >>>>>> Steve >>>>>> >>>>>> >>>>>> *** dev4 *** >>>>>> >>>>>> Apr 28 09:17:40 dev4 dovecot: IMAP(jtest): Duplicate file in >>>>>> uidlist file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>>> 1146230258.P2889Q0M407851.dev4.neonova.net:2, >>>>>> >>>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): Corrupted transaction >>>>>> log file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>>>> introduction for unknown id 0 >>>>>> >>>>>> Apr 28 09:17:41 dev4 dovecot: IMAP(jtest): file mail-index.c: >>>>>> line >>>>>> 883 >>>>>> (mail_index_sync_from_transactions): assertion failed: >>>>>> (hdr.log_file_int_offset == (*map)->hdr.log_file_int_offset) >>>>>> >>>>>> >>>>>> Apr 28 09:17:41 dev4 dovecot: child 2889 (imap) killed with >>>>>> signal 6 >>>>>> >>>>>> >>>>>> *** dev3 *** >>>>>> >>>>>> >>>>>> >>>>>> Apr 28 09:17:39 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>>> uidlist file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>>> 1146230254.P2888Q0M651598.dev4.neonova.net >>>>>> >>>>>> Apr 28 09:17:41 dev3 dovecot: IMAP(jtest): Fixed index file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>>> first_recent_uid_lowwater 4036 -> 1772 >>>>>> >>>>>> Apr 28 09:17:44 dev3 dovecot: IMAP(jtest): >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>>> lowered >>>>>> (4036 -> 1774) >>>>>> >>>>>> Apr 28 09:17:45 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>>> uidlist file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>>>> >>>>>> Apr 28 09:17:46 dev3 dovecot: IMAP(jtest): Fixed index file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>>> first_recent_uid_lowwater 4038 -> 1774 >>>>>> >>>>>> Apr 28 09:17:47 dev3 dovecot: IMAP(jtest): >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>>> lowered >>>>>> (4038 -> 1775) >>>>>> >>>>>> Apr 28 09:17:48 dev3 dovecot: IMAP(jtest): Duplicate file in >>>>>> uidlist file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: >>>>>> 1146145480.P5976Q0M245663.dev3.neonova.net:2, >>>>>> >>>>>> Apr 28 09:17:49 dev3 dovecot: IMAP(jtest): >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>>> lowered >>>>>> (4039 -> 1775) >>>>>> >>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>>> log file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Extension >>>>>> introduction for unknown id 0 >>>>>> >>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): file mail-index.c: >>>>>> line >>>>>> 881 >>>>>> (mail_index_sync_from_transactions): assertion failed: >>>>>> (hdr.messages_count >>>>>> == (*map)->hdr.messages_count) >>>>>> >>>>>> Apr 28 09:17:50 dev3 dovecot: IMAP(jtest): >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot-uidlist: next_uid was >>>>>> lowered >>>>>> (4039 -> 1775) >>>>>> >>>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>>> log file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append >>>>>> with >>>>>> UID >>>>>> 1775, but next_uid = 4039 >>>>>> >>>>>> Apr 28 09:17:51 dev3 dovecot: IMAP(jtest): Corrupted transaction >>>>>> log file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index.log: Append >>>>>> with >>>>>> UID >>>>>> 1775, but next_uid = 4039 >>>>>> >>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Unexpected >>>>>> transaction log >>>>>> desync with index /var/mailstore/72/af/375887/Maildir/ >>>>>> dovecot.index >>>>>> >>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Disconnected: >>>>>> Mailbox is >>>>>> in >>>>>> inconsistent state, please relogin. >>>>>> >>>>>> Apr 28 09:17:52 dev3 dovecot: IMAP(jtest): Fixed index file >>>>>> /var/mailstore/72/af/375887/Maildir/dovecot.index: >>>>>> first_recent_uid_lowwater 4039 -> 1776 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> +---------------------------------------------- >>>>> ^-----------------------+ >>>>> | Luis Mel?ndez Aganzo ^ Email: >>>>> luism at uco.es | >>>>> | Servicio de Inform?tica ^ Tlf: 34-(9) >>>>> 57-211022 | >>>>> | Analista - ?rea de Sistemas ^ Fax: 34-(9) >>>>> 57-218116 | >>>>> | Universidad de C?rdoba (SPAIN) ^ http:// >>>>> www.uco.es | >>>>> +---------------------------------------------- >>>>> ^-----------------------+ > > >