Hi,

I'm currently debugging some replication issues between two dovecot 2.3.9.2 servers, where one is live and the other is just a copy used for backup with no imap user access. After initial alignment (which produced various error messages, such as the stalled I/O and fcntl lock messages below) I am seeing replication miss messages or stop altogether on some mailboxes, even when no further errors are logged.

doveadm: Error: dsync(REMOTE_HOSTNAME): I/O has stalled, no activity for 600 seconds (last sent=mail_change (EOL), last recv=mailbox)

doveadm: Error: Couldn't lock /var/vmail/DOMAIN/USER//.dovecot-sync.lock: fcntl(/var/vmail/DOMAIN/USER//.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds (WRITE lock held by pid 30307)

I was surprised by this because, although I know there were replication issues in 2.3.8, I understood these were resolved in 2.3.9 once both servers were running 2.3.9.

I am still investigating and will post further if I get any useful insights.

However, I have a question which, despite using dovecot for many years in this configuration, has never occurred to me before. I configured dovecot using the wiki https://wiki.dovecot.org/Replication using tcp and ssl. Both servers have an identical dovecot configuration except for:

1. different hostnames

2. on the backup server I have removed the expire and quota plugins from the global mail_plugins

3. in the configuration of mail_replica tcps://hostname:port each server points to the other server's hostname

What I just realized is that nowhere in the wiki does it state that both servers should be set up for replication. I had always assumed that was the logical thing to do. So the question is: for successful replication, is it sufficient to set up one master configuration and just have a replication process listening on the other master, or should both servers be set up for replication in an almost identical way (with the 3 exceptions above)?

Thanks for any insights.

John

On 12.1.2020 13.49, John wrote:
> [...]
> doveadm: Error: Couldn't lock
> /var/vmail/DOMAIN/USER//.dovecot-sync.lock:
> fcntl(/var/vmail/DOMAIN/USER//.dovecot-sync.lock, write-lock, F_SETLKW)
> locking failed: Timed out after 30 seconds (WRITE lock held by pid 30307)
> [...]
> So the question is, for successful replication
> is it sufficient to setup one master configuration and just have a
> replication process listening on the other master, or should both
> servers be set up for replication in an almost identical way (with the 3
> exceptions above)?
>
> thanks for any insights.
>
> John

Did you check what the process 30307 is?

It is enough for the backup server to have only the doveadm server configured.

Aki
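
For illustration, a one-way setup along these lines might look roughly like the sketch below. It is modelled on the settings shown on the wiki page cited above, not on John's actual configuration, which is not shown in this thread. The hostname backup.example.com, the port 12345, the password placeholder, the CA directory and the vmail user are all assumptions. The mail_replica value uses the tcps:host:port form from the wiki.

```
# --- backup server: only the doveadm server is configured (assumed values) ---
service doveadm {
  inet_listener {
    port = 12345        # assumed port, must match the live server
    ssl = yes           # needed when the live server connects with tcps:
  }
}
doveadm_password = CHANGE_ME   # placeholder, same value on both servers

# --- live server: replication plugins plus the replicator services ---
mail_plugins = $mail_plugins notify replication

service replicator {
  process_min_avail = 1
}

service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail        # assumed mail user
  }
  unix_listener replication-notify {
    user = vmail
  }
}

doveadm_port = 12345
doveadm_password = CHANGE_ME
ssl_client_ca_dir = /etc/ssl/certs   # assumed CA location for verifying the backup's certificate

plugin {
  mail_replica = tcps:backup.example.com:12345   # hypothetical hostname
}
```

With a layout like this, changes made on the live server are pushed to the backup, while the backup itself does not initiate replication, which fits a copy that has no imap user access.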