William L. Thomson Jr.
2016-Mar-22 19:30 UTC
Replication issues master <-> master nfs backend
I keep having replication issues and am not sure what can be done to resolve them. It does not happen all the time, though for the last ~30 minutes and many messages it has been happening consistently for me.

I have 2 mail servers, basically clones, and thus master-master replication. Most of the time things work fine. But many times one or several emails will arrive on one server and never replicate to the other. I am less concerned about the mail never replicating than about the user never being notified.

Mail arrives on, say, server 1, users are checking mail on server 2, and they never see the email on server 2. This is not always the case, but it is happening often enough daily. I then log into one server and run a sync manually, which usually syncs the mail on both servers, and then it arrives in the inbox.

Here is an example: mail is on mail2, but not mail1. I am checking email on mail1, so I am not seeing the 1 email.

Mail1
/home/wlt-ml/.maildir/new:
total 0

Mail2
/home/wlt-ml/.maildir/new:
total 12
-rw------- 1 wlt-ml site1 8502 Mar 22 14:57 1458673024.7643.mail2

Then I manually log into mail2 and run this command, though usually I can run it from either side and just change the name to the other server.

doveadm sync -u "*" remote:mail1

And then I end up with the missing email on mail1, and it arrives in my email client shortly thereafter.

Mail1
/home/wlt-ml/.maildir/new:
total 12
-rw------- 1 wlt-ml site1 8502 Mar 22 14:57 1458673051.M838843P26735.mail1,S=8502,W=8678:2,T

I have no idea why it does this. It seems to happen when a full sync has taken place, per doveadm replicator status wlt-ml. There do not seem to be any settings to force a full vs fast sync more often. I have no clue whether this is even a full vs fast issue or something else.

I think it tends to happen more when people stay connected to the IMAP server. I had a theory that closing the email client and opening it again would get dovecot to sync. I believe this is still the case, but I am not able to confirm it 100%. Also, users are reporting that they have closed Thunderbird. I can see them logging out and back in in the logs, but email does not replicate or show up until I run doveadm sync manually.

I am tempted to have cron invoke that regularly, but that seems very hackish and will likely have its own issues, since it is not the right way or how things were designed. I am not sure if this is a bug or what. Hopefully it is a misconfiguration on my end.

Open to any feedback, advice, etc. I can provide the replicator configuration, but it is pretty straightforward and mostly copy/paste from the replication page. Replication works; it just seems it is not triggered to replicate at times or something.

dovecot --version
2.2.22 (fe789d2)

--
William L. Thomson Jr.
Obsidian-Studios, Inc.
http://www.obsidian-studios.com
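One way to spot the divergence described above without eyeballing directory listings is to compare message counts on the two replicas. The following is only a rough sketch: the function names maildir_count and compare_counts are made up for illustration, it assumes the Maildir layout shown in the example (new/ and cur/ under the Maildir root), and it compares counts only, because the filenames differ between mail1 and mail2 as the listings show.

```shell
#!/bin/sh
# Hypothetical helpers, not part of dovecot.

# Count message files in a Maildir: regular files under new/ and cur/.
# A missing subdirectory is silently ignored.
maildir_count() {
    dir=$1
    find "$dir/new" "$dir/cur" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Compare two counts and print a one-line verdict.
# Returns 0 when they match, 1 when they differ.
compare_counts() {
    if [ "$1" -eq "$2" ]; then
        echo "in sync ($1 messages)"
    else
        echo "MISMATCH: local=$1 remote=$2"
        return 1
    fi
}

# Example use on mail1 (assumes ssh access to mail2, which the thread
# does not confirm):
#   local_n=$(maildir_count /home/wlt-ml/.maildir)
#   remote_n=$(ssh mail2 'find /home/wlt-ml/.maildir/new /home/wlt-ml/.maildir/cur -type f | wc -l | tr -d " "')
#   compare_counts "$local_n" "$remote_n"
```

A mismatch that persists for more than a few minutes would match the symptom reported here. A brief mismatch right after delivery is expected and is not a sign of a problem by itself.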
On 22.03.2016 21:30, William L. Thomson Jr. wrote:
> [...]
> Tempted to have cron invoke that on the regular, but seems very hackish and likely will have
> its own issues doing that. Since its not the right way or how things were designed. Not sure
> if this is a bug or what. Hopefully miss-configuration on my end.

You should still include your doveconf -n output. Any errors and warnings logged by dovecot could also be useful.

br,
Teemu Huovila

> Open to any feedback, advice, etc. I can provide replicator configuration but its pretty
> straight forward and mostly copy/paste from the replication page. Replication works, just
> seems it is not triggered to replicate at times or something.
>
> dovecot --version
> 2.2.22 (fe789d2)
William L. Thomson Jr.
2016-Mar-23 19:44 UTC
Replication issues master <-> master nfs backend
I forgot to mention before that I run 2 NFS servers, and each mail server uses a different NFS server. It is not the same NFS server for both. Just to clarify: I am not trying to replicate using the same NFS server with 2 mail servers. I have 2 of each, mail + NFS, and they are not at the same location.

On Wednesday, March 23, 2016 11:19:07 AM Teemu Huovila wrote:
>
> You should still include your doveconf -n output.

Below, at the end of this email.

> Also any errors and warnings logged by dovecot, could be useful.

Not many errors are logged, and nothing gives me anything to go on for the replication issues. Apart from some assertion errors during an initial deployment fubar, the only error logged is due to nobody, an account I use for nagios but which does not exist.

Mar 23 13:01:44 Error: dsync-local(nobody): Couldn't create lock /var/empty/.dovecot-sync.lock: Permission denied

I messed with changing nobody's home directory, but that screwed up other things like ssh. So I just do

doveadm replicator remove nobody

I am not sure if I can put that in a config file or somewhere else so it is more permanent. I could not figure out how to ignore that user while bringing in all the others.

Occasionally there are some other errors from being unable to reach the other mail server due to VPN issues. But that is not happening when I am experiencing the replication issue, and it is very rare. It just logs a few times when it happens.

doveconf -n
# 2.2.22 (fe789d2): /etc/dovecot/dovecot.conf
# OS: Linux 4.3.3-hardened-r1 x86_64 Gentoo Base System release 2.2
disable_plaintext_auth = no
doveadm_password = # hidden, use -P to show it
first_valid_gid = 1000
first_valid_uid = 1000
listen = *,[::]
login_greeting = Mail server ready.
login_log_format_elements = user=<%u> ip=[%r] port=[%b] method=[%m] security=[%c]
mail_fsync = always
mail_location = maildir:~/.maildir
mail_nfs_index = yes
mail_plugins = " notify replication"
mmap_disable = yes
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
}
passdb {
  args = *
  driver = pam
}
plugin {
  mail_replica = tcp:mail2.obsidian-studios.com:12345
}
service aggregator {
  fifo_listener replication-notify-fifo {
    mode = 0666
    user = root
  }
  unix_listener replication-notify {
    mode = 0666
    user = root
  }
}
service doveadm {
  inet_listener {
    port = 12345
  }
}
service imap-login {
  inet_listener imap {
    port = 0
  }
  inet_listener imaps {
    port = 993
    ssl = yes
  }
}
service pop3-login {
  inet_listener pop3 {
    port = 0
  }
  inet_listener pop3s {
    port = 995
    ssl = yes
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
William L. Thomson Jr.
2016-Mar-24 19:10 UTC
Replication issues master <-> master nfs backend
On Tuesday, March 22, 2016 03:30:38 PM William L. Thomson Jr. wrote:
>
> Then I manually log into mail2 and run this command, though usually I can
> run it from either side, and just change the name to the other server.
>
> doveadm sync -u "*" remote:mail1
>
> Tempted to have cron invoke that on the regular, but seems very hackish and
> likely will have its own issues doing that.

I broke down and went with the hackish approach of having cron run doveadm sync every 15 minutes during business hours. It is not ideal, but it seems to correct the syncing delays. Hopefully there are no side effects.

*/15 8-18 * * * root /usr/bin/doveadm sync -u "*" remote:mail2

It corrects, or at least band-aids, the problem where mail remains on only one server for extended periods (several hours) while users check email on the other server and never see it. Meanwhile the sync status shows both fast and full sync as completed, and there are no errors in the logs.

--
William L. Thomson Jr.
Obsidian-Studios, Inc.
http://www.obsidian-studios.com
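One possible side effect of the cron approach is that a slow sync could still be running when the next 15-minute run starts. A minimal sketch of a guard against that follows. The function name run_locked and the lock path in the usage note are hypothetical, and the sketch relies only on mkdir being atomic rather than on any dovecot feature.

```shell
#!/bin/sh
# run_locked LOCKDIR COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND only if LOCKDIR can be created.
# mkdir either creates the directory or fails, so two overlapping cron
# runs cannot both acquire the lock.
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held, skipping this run" >&2
        return 75   # same value as EX_TEMPFAIL in sysexits.h
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return "$status"
}
```

Assuming the function is saved in a file such as /usr/local/lib/run_locked.sh (a made-up path), the cron command could become `. /usr/local/lib/run_locked.sh && run_locked /run/doveadm-sync.lock /usr/bin/doveadm sync -u "*" remote:mail2`. Note that if the job is killed before rmdir runs, the lock directory stays behind and has to be removed by hand.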