Simone Lazzaris
2018-Sep-11 07:57 UTC
Auth process sometimes stop responding after upgrade
On Monday, 10 September 2018 09:58:50 CEST, Timo Sirainen wrote:
> On 8 Sep 2018, at 15.18, Simone Lazzaris <simone.lazzaris at qcom.it> wrote:
> > Timo, unfortunately the patch doesn't compile; I've moved the declaration
> > of "conn" one line up to make it work.
>
> Oops, I guess I was too much in a hurry to even compile it. Here's a new
> patch that compiles and passes our director CI tests.

Hi Timo; after 24 hours of field testing, I can say that the issue is mostly gone. I say "mostly" because the service is working as far as the user is concerned, but I see something strange going on in the logs. Grepping for "director" in the log file, I can see some panics and some communication errors:

Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.143 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.219 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.218 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.216 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: director(212.183.164.161:9090/right): Host 192.168.1.145 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.217 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.144 vhost count changed from 100 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.145 vhost count changed from 0 to 0
Sep 11 03:24:55 imap-front4 dovecot: director: doveadm: Host 192.168.1.142 vhost count changed from 100 to 0
Sep 11 03:25:09 imap-front4 dovecot: director: director(212.183.164.161:9090/right): Host 192.168.1.143 vhost count changed from 0 to 100
Sep 11 03:25:09 imap-front4 dovecot: director: Error: Director 212.183.164.161:9090/right disconnected: Connection closed (bytes in=1116368, bytes out=1182555, 0+27319 USERs received, last input 0.000 s ago, last output 0.000 s ago, connected 4602.589 s ago, 481 peak output buffer size, 1.948 CPU secs since connected)
Sep 11 03:25:09 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Reconnecting after disconnection
Sep 11 03:25:09 imap-front4 dovecot: director: Error: Director 212.183.164.161:9090/out disconnected: Connection closed: read(size=968) failed: Connection reset by peer (bytes in=56, bytes out=59143, 0+0 USERs received, 1556 USERs sent in handshake, last input 0.002 s ago, last output 0.002 s ago, connected 0.024 s ago, 8190 peak output buffer size, 0.004 CPU secs since connected, handshake DONE not received)
Sep 11 03:25:09 imap-front4 dovecot: director: Connecting to 212.183.164.162:9090 (as 212.183.164.164): Reconnecting after disconnection
Sep 11 03:25:09 imap-front4 dovecot: director: director(212.183.164.162:9090/out): Handshake finished in 0.006 secs (bytes in=61, bytes out=59173, 0+0 USERs received, 1556 USERs sent in handshake, last input 0.000 s ago, last output 0.003 s ago, connected 0.006 s ago, 8190 peak output buffer size, 0.000 CPU secs since connected)
Sep 11 03:25:10 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Received CONNECT request from 212.183.164.162:9090/right - replacing current right 212.183.164.162:9090/right
Sep 11 03:25:10 imap-front4 dovecot: director: director(212.183.164.161:9090/out): Handshake finished in 0.004 secs (bytes in=61, bytes out=59332, 0+0 USERs received, 1561 USERs sent in handshake, last input 0.000 s ago, last output 0.004 s ago, connected 0.004 s ago, 8190 peak output buffer size, 0.000 CPU secs since connected)
Sep 11 03:25:10 imap-front4 dovecot: director: director(212.183.164.161:9090/right): Host 192.168.1.216 vhost count changed from 0 to 100
Sep 11 03:25:10 imap-front4 dovecot: director: Error: Director 212.183.164.161:9090/right disconnected: Connection closed: read(size=558) failed: Connection reset by peer (bytes in=466, bytes out=60271, 0+6 USERs received, 1561 USERs sent in handshake, last input 0.001 s ago, last output 0.000 s ago, connected 0.553 s ago, 8190 peak output buffer size, 0.000 CPU secs since connected)
Sep 11 03:25:10 imap-front4 dovecot: director: Connecting to 212.183.164.162:9090 (as 212.183.164.164): Reconnecting after disconnection
Sep 11 03:25:10 imap-front4 dovecot: director: director(212.183.164.162:9090/out): Handshake finished in 0.005 secs (bytes in=61, bytes out=59372, 0+0 USERs received, 1562 USERs sent in handshake, last input 0.000 s ago, last output 0.005 s ago, connected 0.005 s ago, 8192 peak output buffer size, 0.000 CPU secs since connected)
Sep 11 03:25:10 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Received CONNECT request from 212.183.164.162:9090/right - replacing current right 212.183.164.162:9090/right
Sep 11 03:25:10 imap-front4 dovecot: director: director(212.183.164.161:9090/out): Handshake finished in 0.007 secs (bytes in=61, bytes out=59372, 0+0 USERs received, 1562 USERs sent in handshake, last input 0.000 s ago, last output 0.003 s ago, connected 0.007 s ago, 8516 peak output buffer size, 0.004 CPU secs since connected)
Sep 11 03:25:25 imap-front4 dovecot: director: doveadm: Host 192.168.1.144 vhost count changed from 0 to 100
Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
Sep 11 03:25:25 imap-front4 dovecot: director: Fatal: master: service(director): child 2237 killed with signal 6 (core dumps disabled)
Sep 11 03:25:25 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Initial connection
Sep 11 03:25:25 imap-front4 dovecot: director: Incoming connection from director 212.183.164.163/in
Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
Sep 11 03:25:25 imap-front4 dovecot: director: Fatal: master: service(director): child 4392 killed with signal 6 (core dumps disabled)
Sep 11 03:25:25 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Initial connection
Sep 11 03:25:25 imap-front4 dovecot: director: Incoming connection from director 212.183.164.163/in
Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
Sep 11 03:25:25 imap-front4 dovecot: director: Fatal: master: service(director): child 4393 killed with signal 6 (core dumps disabled)
Sep 11 03:25:25 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Initial connection
Sep 11 03:25:25 imap-front4 dovecot: director: Incoming connection from director 212.183.164.163/in
Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
Sep 11 03:25:25 imap-front4 dovecot: director: Fatal: master: service(director): child 4394 killed with signal 6 (core dumps disabled)
Sep 11 03:25:25 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Initial connection
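For anyone triaging a similar log, here is a minimal Python sketch that tallies director messages by severity instead of reading the grep output by eye. It assumes syslog-style lines shaped like the ones quoted in this thread; the function name `summarize_director_log` and the "Info" bucket for lines without a severity prefix are illustrative choices, not anything Dovecot defines.

```python
import re
from collections import Counter

# Syslog-style line as quoted in this thread, e.g.
#   Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file ...
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\sdovecot:\sdirector:\s(?P<msg>.*)$"
)
# Severity prefixes seen in the excerpt (Warning is included as an assumption).
SEVERITY_RE = re.compile(r"^(Panic|Fatal|Error|Warning):")


def summarize_director_log(lines):
    """Count director messages by severity.

    Lines that do not look like director log lines are skipped;
    director lines without a severity prefix are counted as "Info".
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        severity = SEVERITY_RE.match(match.group("msg"))
        counts[severity.group(1) if severity else "Info"] += 1
    return counts


# Four lines copied from the log excerpt in this thread.
SAMPLE = [
    "Sep 11 03:25:09 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Reconnecting after disconnection",
    "Sep 11 03:25:25 imap-front4 dovecot: director: doveadm: Host 192.168.1.144 vhost count changed from 0 to 100",
    "Sep 11 03:25:25 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)",
    "Sep 11 03:25:25 imap-front4 dovecot: director: Fatal: master: service(director): child 2237 killed with signal 6 (core dumps disabled)",
]

for severity, count in summarize_director_log(SAMPLE).most_common():
    print(f"{severity}: {count}")
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log path depends on the local syslog configuration.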
On 11 Sep 2018, at 10.57, Simone Lazzaris <s.lazzaris at interactive.eu> wrote:
> Sep 11 03:25:55 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
> Sep 11 03:25:55 imap-front4 dovecot: director: Fatal: master: service(director): child 4395 killed with signal 6 (core dumps disabled)

It's crashing. Can you get a gdb backtrace? First enable core dumps: https://dovecot.org/bugreport.html#coredumps
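As a rough aid for the "enable core dumps" step, the Python sketch below checks three common Linux settings that can prevent a core file from being written. It is an assumption-laden illustration: the helper names `core_dump_blockers` and `read_proc` are hypothetical, the limit it reads is that of the Python process itself (not of the running director), and the linked bug report page remains the authoritative procedure.

```python
import resource


def core_dump_blockers(core_soft_limit, suid_dumpable, core_pattern):
    """Return reasons why a crashing daemon may not leave a core file."""
    problems = []
    if core_soft_limit == 0:
        problems.append("core size limit is 0 (raise it, e.g. 'ulimit -c unlimited')")
    if suid_dumpable == 0:
        problems.append("fs.suid_dumpable is 0: processes that changed uid are not dumped")
    if not core_pattern.strip():
        problems.append("kernel.core_pattern is empty")
    return problems


def read_proc(path, default=""):
    """Read a /proc setting, falling back to a default if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return default


# Note: this is the limit of the current process, which is only a hint
# about what a service started by the init system will inherit.
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_CORE)
suid_dumpable = int(read_proc("/proc/sys/fs/suid_dumpable", "0") or 0)
core_pattern = read_proc("/proc/sys/kernel/core_pattern")

for problem in core_dump_blockers(soft_limit, suid_dumpable, core_pattern):
    print("possible blocker:", problem)
```

Once a core file exists, opening it with `gdb <binary> <core>` and running `bt full` produces the backtrace; the binary and core paths depend on the installation.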
Simone Lazzaris
2018-Sep-18 07:44 UTC
Auth process sometimes stop responding after upgrade
On Tuesday, 11 September 2018 10:46:30 CEST, Timo Sirainen wrote:
> On 11 Sep 2018, at 10.57, Simone Lazzaris <s.lazzaris at interactive.eu> wrote:
> > Sep 11 03:25:55 imap-front4 dovecot: director: Panic: file
> > doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion
> > failed: (conn->to_ring_sync_abort == NULL) Sep 11 03:25:55 imap-front4
> > dovecot: director: Fatal: master: service(director): child 4395 killed
> > with signal 6 (core dumps disabled)
> It's crashing. Can you get gdb backtrace? First enable core dumps.
> https://dovecot.org/bugreport.html#coredumps

Hi all, again; I've enabled core dumps and let it run for a few days, waiting for the issue to recur. In the meantime I've also upgraded the poolmon script, as Sami suggested. The upgrade seems to have scared the issue away, because it has not occurred since. Maybe the problem is related to the way the old poolmon talked to the director daemon? I'm not very inclined to downgrade poolmon to catch a backtrace, but I can do so if necessary.

--
Simone Lazzaris
Qcom S.p.A.
simone.lazzaris at qcom.it | www.qcom.it