Can you provide any details on this instability?

Aki

On April 27, 2021 7:58:01 PM UTC, Joan Moreau <jom at grosjo.net> wrote:
> OK, a third regression is that it becomes highly unstable with the
> patch you sent.
>
> I had to go back to 2.3.14.
>
> On 2021-04-27 17:07, Joan Moreau wrote:
>
>> Indeed, latest git works much better :)
>>
>> On 2021-04-27 05:58, Aki Tuomi wrote:
>> Can you try with latest git? We did some improvements on the systemd
>> configure parts.
>>
>> Aki
>>
>> On 26/04/2021 23:32 Joan Moreau <jom at grosjo.net> wrote:
>>
>> Looking at config.log, there is #define HAVE_LIBSYSTEMD 1
>> But "Type=notify" does not appear
>> My systemd is version 248
>>
>> On 2021-04-26 12:05, Joan Moreau wrote:
>> I have
>> # sudo systemctl status dovecot
>> ● dovecot.service - Dovecot IMAP/POP3 email server
>>      Loaded: loaded (/usr/lib/systemd/system/dovecot.service; enabled; vendor preset: disabled)
>>      Active: active (running) since Sun 2021-04-25 20:13:25 UTC; 14h ago
>>        Docs: man:dovecot(1)
>>              https://doc.dovecot.org/
>>    Main PID: 2559364 (dovecot)
>>       Tasks: 28 (limit: 76912)
>>      Memory: 1.0G
>>         CPU: 7min 18.342s
>>      CGroup: /system.slice/dovecot.service
>>              ├─2559364 /usr/sbin/dovecot -F
>>              ├─2559366 dovecot/imap-login
>>              ├─2559367 dovecot/anvil [11 connections]
>>              └─2559368 dovecot/log
>>
>> On 2021-04-26 08:32, Aki Tuomi wrote:
>> I don't know then. It works for me and I just tried it again. The only
>> reason it would fail would be that HAVE_LIBSYSTEMD is not defined, so
>> it would not be using libsystemd for notify support.
>>
>> $ sudo systemctl status dovecot
>> ● dovecot.service - Dovecot IMAP/POP3 email server
>>      Loaded: loaded (/lib/systemd/system/dovecot.service; disabled; vendor preset: enabled)
>>      Active: active (running) since Mon 2021-04-26 10:30:02 EEST; 2s ago
>>        Docs: man:dovecot(1)
>>              https://doc.dovecot.org/
>>    Main PID: 30213 (dovecot)
>>      Status: "v2.4.devel (98a1cca054) running"
>>       Tasks: 4 (limit: 4701)
>>      Memory: 3.3M
>>      CGroup: /system.slice/dovecot.service
>>              ├─30213 /home/cmouse/dovecot/sbin/dovecot -F
>>              ├─30214 dovecot/anvil
>>              ├─30215 dovecot/log
>>              └─30216 dovecot/config
>>
>> You can tell from the "Status" line that it's using Type=notify.
>>
>> Aki
>>
>> On 26/04/2021 10:29 Joan Moreau <jom at grosjo.net> wrote:
>>
>> Yes, I do run autogen.sh after every "git pull"
>>
>> On 2021-04-26 08:21, Aki Tuomi wrote:
>> The current autoconf code is a bit buggy, but if you do indeed have
>> libsystemd-dev installed it should do the right thing and will work
>> with systemd even if you have Type=notify.
>>
>> This has actually been tested, so if it's not working, then something
>> else is wrong.
>>
>> Did you remember to run ./autogen.sh after pulling from git to make
>> sure you get the new configure script?
>>
>> Aki
>>
>> On 26/04/2021 10:11 Joan Moreau <jom at grosjo.net> wrote:
>>
>> Yes systemd is installed (and the "dev" files as well)
>>
>> On 2021-04-26 06:23, Aki Tuomi wrote:
>> This is because you are not compiling with libsystemd-dev installed.
>> I guess we need to make some service template that uses type simple
>> when you don't use libsystemd.
>>
>> Aki
>>
>> On 25/04/2021 22:53 Joan Moreau <jom at grosjo.net> wrote:
>>
>> Yes, it seems fixed with this patch :)
>>
>> Another bug with git is that the "type=" in systemd is switched from
>> "simple" to "notify". The latter does not work and reverting to
>> "simple" does work
>>
>> On 2021-04-25 17:53, Aki Tuomi wrote:
>> On 24/04/2021 21:56 Joan Moreau <jom at grosjo.net> wrote:
>>
>> chroot= does not resolve the issue
>> I have "chroot = login" in my conf
>>
>> Thanks!
>>
>> The chroot was needed to get the core dump.
>>
>> Can you try whether this fixes the crash?
>>
>> Aki
>>
>> From 1df4e02cbff710ce8938480b07a5690e37f661f6 Mon Sep 17 00:00:00 2001
>> From: Timo Sirainen <timo.sirainen at open-xchange.com>
>> Date: Fri, 23 Apr 2021 16:43:36 +0300
>> Subject: [PATCH] login-common: Fix handling destroyed_clients linked list
>>
>> The client needs to be removed from destroyed_clients linked list before
>> it's added to client_fd_proxies linked list.
>>
>> Broken by 1c622cdbe08df2f642e28923c39894516143ae2a
>> ---
>>  src/login-common/client-common.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/login-common/client-common.c b/src/login-common/client-common.c
>> index bdb6e9c798..1d264d9f75 100644
>> --- a/src/login-common/client-common.c
>> +++ b/src/login-common/client-common.c
>> @@ -289,8 +289,9 @@ void client_disconnect(struct client *client, const char *reason,
>>  	/* Login was successful. We may now be proxying the connection,
>>  	   so don't disconnect the client until client_unref(). */
>>  	if (client->iostream_fd_proxy != NULL) {
>> +		i_assert(!client->fd_proxying);
>>  		client->fd_proxying = TRUE;
>> -		i_assert(client->prev == NULL && client->next == NULL);
>> +		DLLIST_REMOVE(&destroyed_clients, client);
>>  		DLLIST_PREPEND(&client_fd_proxies, client);
>>  		client_fd_proxies_count++;
>>  	}
>> @@ -307,8 +308,9 @@ void client_destroy(struct client *client, const char *reason)
>>
>>  	if (last_client == client)
>>  		last_client = client->prev;
>> -	/* remove from clients linked list before it's added to
>> -	   client_fd_proxies. */
>> +	/* move to destroyed_clients linked list before it's potentially
>> +	   added to client_fd_proxies. */
>> +	i_assert(!client->fd_proxying);
>>  	DLLIST_REMOVE(&clients, client);
>>  	DLLIST_PREPEND(&destroyed_clients, client);
>>
>> @@ -409,13 +411,14 @@ bool client_unref(struct client **_client)
>>  		DLLIST_REMOVE(&client_fd_proxies, client);
>>  		i_assert(client_fd_proxies_count > 0);
>>  		client_fd_proxies_count--;
>> +	} else {
>> +		DLLIST_REMOVE(&destroyed_clients, client);
>>  	}
>>  	i_stream_unref(&client->input);
>>  	o_stream_unref(&client->output);
>>  	i_close_fd(&client->fd);
>>  	event_unref(&client->event);
>>
>> -	DLLIST_REMOVE(&destroyed_clients, client);
>>  	i_free(client->proxy_user);
>>  	i_free(client->proxy_master_user);
>>  	i_free(client->virtual_user);
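
The commit message in the patch above says the client has to leave destroyed_clients before it joins client_fd_proxies. The reason is that a node in an intrusive doubly linked list has only one prev/next pair, so it can belong to one list at a time. Below is a minimal sketch of that "remove first, then prepend" move. The names struct node, list_prepend(), list_remove(), list_move() and list_count() are made up for illustration; they are simplified stand-ins and not Dovecot's DLLIST_* macros or its struct client.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list member such as Dovecot's struct client.
   The single prev/next pair means a node can sit in only one list at a time. */
struct node {
	struct node *prev, *next;
	int id;
};

/* Insert item at the head of *list. The item must not be linked anywhere. */
static void list_prepend(struct node **list, struct node *item)
{
	item->prev = NULL;
	item->next = *list;
	if (*list != NULL)
		(*list)->prev = item;
	*list = item;
}

/* Unlink item from *list and clear its pointers. */
static void list_remove(struct node **list, struct node *item)
{
	if (item->prev != NULL)
		item->prev->next = item->next;
	else if (*list == item)
		*list = item->next;
	if (item->next != NULL)
		item->next->prev = item->prev;
	item->prev = item->next = NULL;
}

/* Move item between lists: always unlink from the old list first. */
static void list_move(struct node **from, struct node **to, struct node *item)
{
	list_remove(from, item);
	list_prepend(to, item);
}

/* Count the nodes reachable from the head. */
static size_t list_count(const struct node *list)
{
	size_t n = 0;

	for (; list != NULL; list = list->next)
		n++;
	return n;
}
```

With two nodes on a "destroyed" list, calling list_move() on one of them leaves exactly one node on each list, and neither list still points into the other.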
Not many details.

The Git version (including the patch you sent) raised CPU load very, very high.

I can't experiment too much on my production server.

Let me know if I can help.

On 2021-04-28 06:12, Aki Tuomi wrote:
> Can you provide any details on this instability?
>
> Aki
Did you see whether the problem was that the imap-login process was using 100% CPU, or was the issue something else? I can't find a bug in the patch itself. But attached is another patch that adds some more asserts to make sure the linked lists are being used correctly, so if there is a bug it should now assert-crash instead of doing something weird, like going into an infinite loop. But maybe the high CPU usage was unrelated to this patch?

> On 28. Apr 2021, at 21.57, Joan Moreau <jom at grosjo.net> wrote:
>
> Not many details.
>
> The Git version (including the patch you sent) raised CPU load very, very high.
>
> I can't experiment too much on my production server.
>
> Let me know if I can help.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 3679.diff
Type: application/octet-stream
Size: 3948 bytes
Desc: not available
URL: <https://dovecot.org/pipermail/dovecot/attachments/20210503/e4bd0702/attachment-0001.obj>
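
To illustrate why list misuse can show up as an infinite loop (and 100% CPU) rather than a crash, here is a small sketch. It is not the content of 3679.diff, which is not shown in this thread. All names (struct node, unchecked_prepend(), can_prepend(), checked_prepend(), list_terminates()) are hypothetical. Prepending a node that is already the list head makes its next pointer refer to itself, so any traversal spins forever; a cheap assert before the prepend turns that into an immediate abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list member; not Dovecot's struct client. */
struct node {
	struct node *prev, *next;
};

/* Plain prepend with no sanity checks. */
static void unchecked_prepend(struct node **list, struct node *item)
{
	item->prev = NULL;
	item->next = *list;
	if (*list != NULL)
		(*list)->prev = item;
	*list = item;
}

/* O(1) check that item looks unlinked. It cannot detect a node that is
   the only element of some other list, since such a node also has NULL
   prev and next. */
static bool can_prepend(const struct node *list, const struct node *item)
{
	return item->prev == NULL && item->next == NULL && list != item;
}

/* Prepend that aborts on misuse instead of corrupting the list. */
static void checked_prepend(struct node **list, struct node *item)
{
	assert(can_prepend(*list, item));
	unchecked_prepend(list, item);
}

/* Walk at most limit nodes; returns true if the end of the list was
   reached, false if the walk was cut off (e.g. because of a cycle). */
static bool list_terminates(const struct node *list, size_t limit)
{
	for (size_t i = 0; i < limit && list != NULL; i++)
		list = list->next;
	return list == NULL;
}
```

In this sketch, adding the same node twice with unchecked_prepend() leaves a one-node cycle that list_terminates() reports, while checked_prepend() would have stopped at the assert on the second call.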