Still not quite right for me.

Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds (last sent=mail, last recv=mail (EOL))
Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout during state=sync_mails (send=mails recv=recv_last_common)

I'm not sure if there is an underlying replication error or if the message is just cosmetic, though.

Reuben

On 7/06/2018 4:55 AM, Remko Lodder wrote:
> Hi Timo,
>
> Yes this seems to work fine so far. I'll ask the people to add it to the
> current FreeBSD version.
>
> Cheers
> Remko
>
>> On 6 Jun 2018, at 19:34, Timo Sirainen <tss at iki.fi> wrote:
>>
>> Should be fixed by
>> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
> On 7 Jun 2018, at 07:21, Reuben Farrelly <reuben-dovecot at reub.net> wrote:
>
> Still not quite right for me.
>
> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds (last sent=mail, last recv=mail (EOL))
> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout during state=sync_mails (send=mails recv=recv_last_common)
>
> I'm not sure if there is an underlying replication error or if the message is just cosmetic, though.
>
> Reuben

Hi,

Admittedly, I also had a few occurrences of this behaviour last night. It happens more sporadically now and seems to be a conflict with my user settings. (My users get added to the system twice, as user-domain.tld and user@domain.tld, and both are being replicated. The noreplicate flag is not yet honored in the version I am using, so I cannot bypass that yet.)

I do see messages that arrived on the other machine show up on the machine that I am using to read these emails. So replication seems to work in that regard (where it obviously did not do that well before).

Cheers
Remko
On 2018-06-07 07:34, Remko Lodder wrote:
> On 7 Jun 2018, at 07:21, Reuben Farrelly <reuben-dovecot at reub.net> wrote:
>> Still not quite right for me.
>>
>> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error:
>> dsync(lightning.reub.net): I/O has stalled, no activity for 600
>> seconds (last sent=mail, last recv=mail (EOL))
>> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout
>> during state=sync_mails (send=mails recv=recv_last_common)
>>
>> I'm not sure if there is an underlying replication error or if the
>> message is just cosmetic, though.
>
> Admittedly I have had a few occurrences of this behaviour as well last
> night. It happens more sporadic now and seems to be a conflict with my
> user settings. (My users get added twice to the system, user-domain.tld
> and user@domain.tld, both are being replicated, the noreplicate flag is
> not yet honored in the version I am using so I cannot bypass that yet).
>
> I do see messages that came on the other machine on the machine that I
> am using to read these emails. So replication seems to work in that
> regard (where it obviously did not do that well before).

First of all: this patch, applied to 2.3.1, is a major improvement; there are no more hanging processes.

But: I do find quite a number of error messages like:

  Jun 7 06:34:20 mail dovecot: doveadm: Error: Failed to lock mailbox NAME for dsyncing: \
    file_create_locked(/.../USER/mailboxes/NAME/dbox-Mails/.dovecot-box-sync.lock) \
    failed: fcntl(/.../USER/mailboxes/NAME/dbox-Mails/.dovecot-box-sync.lock, write-lock, F_SETLKW) \
    locking failed: Timed out after 30 seconds (WRITE lock held by pid 79452)

These messages are found only on the server that normally receives synced messages (because almost all mail is received via the other master due to MX priorities).

Conclusion: After 12 hours of running a patched FreeBSD port I still get those error messages, but replication seems to work now. Even so, I still have the feeling that there might be something else going wrong.

@Timo: Wouldn't it be worth adding replicator/aggregator error messages to head, like the ones Aki sent to Remko? That might shed some light on replication issues today and in the future.

Regards,
Michael
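To judge whether error messages like the ones quoted above become rarer after the patch, one could tally them per category from the syslog. What follows is only a minimal Python sketch: the regular expressions are derived solely from the log lines quoted in this thread (other Dovecot versions may word them differently), the lock error is assumed to appear as a single line in the real log (the backslashes above being mail line-wrapping), and the names PATTERNS, SAMPLE_LINES, classify and tally are made up for illustration and are not part of Dovecot.

```python
import re
from collections import Counter

# Hypothetical patterns, derived only from the log lines quoted in this thread.
PATTERNS = {
    "io_stalled": re.compile(
        r"dsync\((?P<peer>[^)]+)\): I/O has stalled, "
        r"no activity for (?P<seconds>\d+) seconds"
    ),
    "state_timeout": re.compile(r"Timeout during state=(?P<state>\w+)"),
    "lock_timeout": re.compile(
        r"Failed to lock mailbox .*? for dsyncing: "
        r".*Timed out after (?P<seconds>\d+) seconds"
    ),
}

# Sample input copied from the thread; the elided paths (/.../) are kept as posted.
# Assumption: the lock error is one physical line in the actual syslog.
SAMPLE_LINES = [
    "Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: "
    "dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds "
    "(last sent=mail, last recv=mail (EOL))",
    "Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: "
    "Timeout during state=sync_mails (send=mails recv=recv_last_common)",
    "Jun 7 06:34:20 mail dovecot: doveadm: Error: Failed to lock mailbox NAME "
    "for dsyncing: file_create_locked(/.../USER/mailboxes/NAME/dbox-Mails/"
    ".dovecot-box-sync.lock) failed: fcntl(/.../USER/mailboxes/NAME/dbox-Mails/"
    ".dovecot-box-sync.lock, write-lock, F_SETLKW) locking failed: "
    "Timed out after 30 seconds (WRITE lock held by pid 79452)",
]


def classify(line):
    """Return the category name for one log line, or None if nothing matches."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def tally(lines):
    """Count how many lines fall into each known dsync error category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(tally(SAMPLE_LINES).items()):
        print(f"{name}: {count}")
```

Comparing such counts between the two replication partners, or before and after applying the patch, would not explain the cause, but it might show whether the lock timeouts are confined to the receiving server as described.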
On 2018-06-07 08:48, Remko Lodder wrote:
> On Thu, Jun 07, 2018 at 08:04:49AM +0200, Michael Grimm wrote:
>> Conclusion: After 12 hours of running a patched FBSD port I do get those
>> error messages but replication seems to work now. But, I still have the
>> feeling that there might something else going wrong.
>
> I agree with that. Are you using the new pkg that ler@ prepared? That
> includes the patch and is a 'native' package.

Yes, I am running this new port from ler@. And: thanks to him for the very fast modification!

Regards,
Michael
Regarding my comment below: it looks like a false alarm on my part.

The commit referenced below hasn't gone into master-2.3 yet, which meant it wasn't included when I rebuilt earlier today. Assuming it was included was a mistake on my part.

I have since manually patched it into master-2.3 and it looks to be OK so far, touch wood, after 4 hours of testing.

Reuben

On 7/06/2018 3:21 pm, Reuben Farrelly wrote:
> Still not quite right for me.
>
> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error:
> dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds
> (last sent=mail, last recv=mail (EOL))
> Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout
> during state=sync_mails (send=mails recv=recv_last_common)
>
> I'm not sure if there is an underlying replication error or if the
> message is just cosmetic, though.
>
> Reuben
>
> On 7/06/2018 4:55 AM, Remko Lodder wrote:
>> Hi Timo,
>>
>> Yes this seems to work fine so far. I'll ask the people to add it to
>> the current FreeBSD version.
>>
>> Cheers
>> Remko
>>
>>> On 6 Jun 2018, at 19:34, Timo Sirainen <tss at iki.fi> wrote:
>>>
>>> Should be fixed by
>>> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336