Should be fixed by https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
Hi Timo,

Yes this seems to work fine so far. I'll ask the people to add it to the current FreeBSD version.

Cheers
Remko

> On 6 Jun 2018, at 19:34, Timo Sirainen <tss at iki.fi> wrote:
>
> Should be fixed by https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
Still not quite right for me.

Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds (last sent=mail, last recv=mail (EOL))
Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout during state=sync_mails (send=mails recv=recv_last_common)

I'm not sure if there is an underlying replication error or if the message is just cosmetic, though.

Reuben

On 7/06/2018 4:55 AM, Remko Lodder wrote:
> Hi Timo,
>
> Yes this seems to work fine so far. I'll ask the people to add it to the
> current FreeBSD version.
>
> Cheers
> Remko
>
>> On 6 Jun 2018, at 19:34, Timo Sirainen <tss at iki.fi> wrote:
>>
>> Should be fixed by
>> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
Aaaaand I forgot to CC the list, sorry for that, it's way too early in the morning :P

On 07.06.18 - 07:39, Thore Bödecker wrote:
> What does the output of these two commands show after that error has
> been logged?
>
>   doveadm replicator status
>
>   doveadm replicator dsync-status
>
> If there are *waiting failed* requests that never make it "through"
> (after being temporarily in state *queued failed* and then returning
> to *waiting failed*), this means there is something wrong with the
> replication.
>
> You can try forcing replication of all known users using
>
>   doveadm replicator replicate '*'
>
> and see if that resolves the failed requests, but I doubt it.
>
> Please let us know what your status outputs look like.
>
> Cheers,
> Thore
>
> --
> Thore Bödecker
>
> GPG ID: 0xD622431AF8DB80F3
> GPG FP: 0F96 559D 3556 24FC 2226 A864 D622 431A F8DB 80F3
>
> On 07.06.18 - 15:21, Reuben Farrelly wrote:
> > Still not quite right for me.
> >
> > Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error:
> > dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds
> > (last sent=mail, last recv=mail (EOL))
> > Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout
> > during state=sync_mails (send=mails recv=recv_last_common)
> >
> > I'm not sure if there is an underlying replication error or if the message
> > is just cosmetic, though.
> >
> > Reuben
Timo Sirainen:
> Should be fixed by
> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336

Please excuse my ignorance, but shouldn't one add this commit regarding src/doveadm/client-connection-tcp.c ...

https://github.com/dovecot/core/commit/2a3b7083ce4e62a8bd836f9983b223e98e9bc157

... to a vanilla 2.3.1 source tree as well?

I have to admit that I am absolutely clueless when it comes to understanding dovecot's source code, but I came across this commit because it was committed the very same day as the one solving those hanging processes.

I could test it myself, but I am not sure whether that commit would break my production dovecot instances.

Regards,
Michael
On 6/7/18, 3:43 AM, "dovecot on behalf of Michael Grimm" <dovecot-bounces at dovecot.org on behalf of trashcan at ellael.org> wrote:

> Timo Sirainen:
>> Should be fixed by
>> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
>
> please ignore my ignorance but shouldn't one add this commit regarding src/doveadm/client-connection-tcp.c ...
>
> https://github.com/dovecot/core/commit/2a3b7083ce4e62a8bd836f9983b223e98e9bc157
>
> ... to a vanilla 2.3.1 source tree as well?

I'm happy to add https://github.com/dovecot/core/commit/2a3b7083ce4e62a8bd836f9983b223e98e9bc157 to the FreeBSD port if folks think it should help.

--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640             E-Mail: larryrtx at gmail.com
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
On 7 Jun 2018, at 11.43, Michael Grimm <trashcan at ellael.org> wrote:
>
> Timo Sirainen:
>
>> Should be fixed by
>> https://github.com/dovecot/core/commit/a952e178943a5944255cb7c053d970f8e6d49336
>
> please ignore my ignorance but shouldn't one add this commit regarding src/doveadm/client-connection-tcp.c ...
>
> https://github.com/dovecot/core/commit/2a3b7083ce4e62a8bd836f9983b223e98e9bc157
>
> ... to a vanilla 2.3.1 source tree as well?

That's a code simplification / cleanup commit. It doesn't fix anything.
Hey all,

almost 48h ago I upgraded both my instances to 2.3.1 again to see if the new patches would fix the replication issues for me.

So far, the result is: great.

I haven't been able to provoke any kind of I/O stall or persisting queued/failed resync requests in my replication setup.

Newly added users are replicated instantly upon the first received mails and the home directory gets created without issues now too.

For reference: I'm using the official 2.3.1 tarball together with the 3 attached patches that have been taken from GitHub diffs/commits linked to me by Aki in the #dovecot channel.

I can only encourage everyone to try out 2.3.1 again with these 3 patches to make sure it is rock-solid so that we might get a proper and stable 2.3.2 release soon-ish :)

PS: For the Arch Linux users among you, the dovecot-2.3.1-5 package in the official repo contains said three patches :)

Cheers,
Thore

--
Thore Bödecker

GPG ID: 0xD622431AF8DB80F3
GPG FP: 0F96 559D 3556 24FC 2226 A864 D622 431A F8DB 80F3
-------------- next part --------------
commit 890883f12e8d8dd3309743eb95cf0b04f6e39ea0
Author: Aki Tuomi <aki.tuomi at dovecot.fi>
Date:   Mon Mar 19 18:39:27 2018 +0200

    dsync: Revert to /tmp if home does not exist

    Fixes doveadm: Error: Couldn't lock .dovecot-sync.lock: safe_mkstemp(.dovecot-sync.lock) failed: No such file or directory

diff --git a/src/doveadm/dsync/dsync-brain.c b/src/doveadm/dsync/dsync-brain.c
index c2b8169..1e84182 100644
--- a/src/doveadm/dsync/dsync-brain.c
+++ b/src/doveadm/dsync/dsync-brain.c
@@ -401,6 +401,7 @@ dsync_brain_lock(struct dsync_brain *brain, const char *remote_hostname)
 		.lock_method = FILE_LOCK_METHOD_FCNTL,
 	};
 	const char *home, *error, *local_hostname = my_hostdomain();
+	struct stat st;
 	bool created;
 	int ret;
 
@@ -437,8 +438,21 @@ dsync_brain_lock(struct dsync_brain *brain, const char *remote_hostname)
 
 	if (brain->verbose_proctitle)
 		process_title_set(dsync_brain_get_proctitle_full(brain, DSYNC_BRAIN_TITLE_LOCKING));
-	brain->lock_path = p_strconcat(brain->pool, home,
-				       "/"DSYNC_LOCK_FILENAME, NULL);
+
+	/* if homedir does not yet exist, create lock under tmpdir */
+	if (stat(home, &st) < 0) {
+		if (errno != ENOENT) {
+			i_error("stat(%s) failed: %m", home);
+			return -1;
+		}
+		brain->lock_path = p_strdup_printf(brain->pool, "%s/%s-%s",
+						   brain->user->set->mail_temp_dir,
+						   brain->user->username,
+						   "/"DSYNC_LOCK_FILENAME);
+	} else {
+		brain->lock_path = p_strconcat(brain->pool, home,
+					       "/"DSYNC_LOCK_FILENAME, NULL);
+	}
 	brain->lock_fd = file_create_locked(brain->lock_path, &lock_set,
 					    &brain->lock, &created, &error);
 	if (brain->lock_fd == -1)
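For readers who want to see the idea of the patch above in isolation: the following is a minimal standalone sketch, not Dovecot code. It only illustrates the decision the patch adds (stat() the home directory, and fall back to a temporary directory when it does not exist yet). The helper name choose_lock_path, its parameters, and the LOCK_FILENAME value (guessed from the error message in the commit log) are assumptions made for illustration. The sketch also joins the fallback name as <tmpdir>/<username>-<filename>, whereas the quoted patch passes "/"DSYNC_LOCK_FILENAME as the last argument, so the exact path it produces may differ.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Assumed from the error message in the commit log; not taken from Dovecot headers. */
#define LOCK_FILENAME ".dovecot-sync.lock"

/*
 * Hypothetical helper: writes the chosen lock file path into out.
 * Returns 0 on success, -1 if stat() fails for a reason other than
 * ENOENT or if the result does not fit into out.
 */
static int choose_lock_path(const char *home, const char *tmp_dir,
                            const char *username, char *out, size_t out_len)
{
    struct stat st;
    int n;

    if (stat(home, &st) < 0) {
        if (errno != ENOENT) {
            /* Unexpected error (e.g. permissions): report and give up. */
            perror("stat");
            return -1;
        }
        /* Home does not exist yet: place a per-user lock under tmp_dir. */
        n = snprintf(out, out_len, "%s/%s-%s", tmp_dir, username, LOCK_FILENAME);
    } else {
        /* Home exists: keep the lock inside it. */
        n = snprintf(out, out_len, "%s/%s", home, LOCK_FILENAME);
    }
    /* snprintf returns the length it wanted to write; treat truncation as an error. */
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Called with an existing home directory, the helper fills out with a path inside that directory; called with a home that has not been created yet, it fills out with a per-user name under tmp_dir, which is the case newly added users hit before their first sync.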