Simone Lazzaris
2018-Sep-18 07:46 UTC
Auth process sometimes stop responding after upgrade
On Tuesday, 11 September 2018 10:46:30 CEST, Timo Sirainen wrote:
> On 11 Sep 2018, at 10.57, Simone Lazzaris <s.lazzaris at interactive.eu> wrote:
> > Sep 11 03:25:55 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
> > Sep 11 03:25:55 imap-front4 dovecot: director: Fatal: master: service(director): child 4395 killed with signal 6 (core dumps disabled)
>
> It's crashing. Can you get gdb backtrace? First enable core dumps.
> https://dovecot.org/bugreport.html#coredumps

Hi all, again;

I've enabled core dumps and let it run for a few days, waiting for the issue to recur.

In the meantime I've also upgraded the poolmon script, as Sami suggested.

It seems that the upgrade has scared the issue away, because it has not occurred since.

Maybe the problem is related to the way the old poolmon talked to the director daemon? I'm not very inclined to downgrade poolmon to catch a traceback, but I can do so if necessary.

--
Simone Lazzaris
Qcom S.p.A.
simone.lazzaris at qcom.it | www.qcom.it
Simone Lazzaris
2018-Sep-18 10:29 UTC
Auth process sometimes stop responding after upgrade
> Hi all, again;
>
> I've enabled core dumps and let it run for a few days, waiting for the issue
> to recur.
>
> In the meantime I've also upgraded the poolmon script, as Sami suggested.
>
> It seems that the upgrade has scared the issue away, because it has not
> occurred since.
>
> Maybe the problem is related to the way the old poolmon talked to the
> director daemon? I'm not very inclined to downgrade poolmon to catch a
> traceback, but I can do so if necessary.

Well, maybe it's not necessary ;)

I've performed some maintenance operations on the backends and that triggered the crash. It seems that something goes wrong when one backend comes back online.

Unfortunately, the core was not dumped, and I don't know what else to do: the director service is not chrooted, and ulimit -c is unlimited.

From the log file:

Sep 18 12:21:46 imap-front4 dovecot: director: Panic: file doveadm-connection.c: line 1097 (doveadm_connection_deinit): assertion failed: (conn->to_ring_sync_abort == NULL)
Sep 18 12:21:46 imap-front4 dovecot: director: Error: Raw backtrace:
    /usr/local/lib/dovecot/libdovecot.so.0(+0xa15be) [0xb77345be]
    -> /usr/local/lib/dovecot/libdovecot.so.0(+0xa1641) [0xb7734641]
    -> /usr/local/lib/dovecot/libdovecot.so.0(i_fatal+0) [0xb76ba35e]
    -> dovecot/director() [0x80574f7]
    -> dovecot/director() [0x8057f03]
    -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x6b) [0xb774d3db]
    -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0xfe) [0xb774ee1e]
    -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x46) [0xb774d496]
    -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x48) [0xb774d658]
    -> /usr/local/lib/dovecot/libdovecot.so.0(master_service_run+0x2e) [0xb76c645e]
    -> dovecot/director(main+0x49e) [0x804cf5e]
    -> /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb753be46]
    -> dovecot/director() [0x804d081]
Sep 18 12:21:46 imap-front4 dovecot: director: Fatal: master: service(director): child 7941 killed with signal 6 (core not dumped)
Sep 18 12:21:46 imap-front4 dovecot: director: Connecting to 212.183.164.161:9090 (as 212.183.164.164): Initial connection
Sep 18 12:21:46 imap-front4 dovecot: director: Incoming connection from director 212.183.164.163/in

My current config:

root@imap-front4:~# doveconf -n
# 2.2.36 (1f10bfa63): /usr/local/etc/dovecot/dovecot.conf
# OS: Linux 3.2.0-4-686-pae i686 Debian 7.11
# Hostname: imap-front4
auth_mechanisms = plain login digest-md5 cram-md5 apop scram-sha-1
auth_verbose = yes
auth_verbose_passwords = plain
base_dir = /var/run/dovecot/
default_login_user = nobody
director_doveadm_port = 9091
director_mail_servers = 192.168.1.142 192.168.1.143 192.168.1.216 192.168.1.217 192.168.1.218 192.168.1.219
director_servers = 212.183.164.161 212.183.164.162 212.183.164.163 212.183.164.164
disable_plaintext_auth = no
listen = *
passdb {
  args = /usr/local/etc/dovecot/sql.conf
  driver = sql
}
protocols = imap pop3
service director {
  chroot =
  fifo_listener login/proxy-notify {
    mode = 0666
  }
  inet_listener {
    port = 9090
  }
  unix_listener director-userdb {
    mode = 0600
  }
  unix_listener login/director {
    mode = 0666
  }
}
service imap-login {
  executable = imap-login director
  service_count = 0
  vsz_limit = 128 M
}
service pop3-login {
  executable = pop3-login director
  service_count = 0
  vsz_limit = 128 M
}
ssl_cert = </usr/local/etc/dovecot/imapd.pem
ssl_key = # hidden, use -P to show it
ssl_protocols = !SSlv2 !SSLv3
syslog_facility = local5
userdb {
  driver = prefetch
}

--
Simone Lazzaris
Qcom S.p.A.
simone.lazzaris at qcom.it | www.qcom.it
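When no core file is available, the "Raw backtrace" log entry is the only record of the crash, so it can help to split it into individual frames before looking anything up. The following is a minimal sketch, not a Dovecot tool: the names `Frame` and `parse_raw_backtrace` are made up for illustration, and the parser assumes frames are separated by "->" and have the shape `object(symbol+offset) [address]` seen in the log above. It also strips whitespace that mail line wrapping may have inserted inside a path.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    """One frame of a raw backtrace (hypothetical helper type)."""
    obj: str                 # binary or shared library the frame belongs to
    symbol: Optional[str]    # function name, if the log included one
    offset: Optional[int]    # offset inside the symbol or the object
    address: int             # absolute address in the crashed process


# Splits "<object>(<symbol>+<offset>)"; symbol and offset are both optional.
_LOCATION_RE = re.compile(
    r"^(?P<obj>[^()]+)\((?P<sym>[^+()]*)(?:\+(?P<off>\w+))?\)$"
)
# Splits "<location> [0xADDRESS]".
_CHUNK_RE = re.compile(r"^(.*)\[(0x[0-9a-fA-F]+)\]$", re.S)


def parse_raw_backtrace(text: str) -> List[Frame]:
    """Parse the text following 'Raw backtrace:' into a list of frames."""
    frames = []
    for chunk in re.split(r"\s*->\s*", text.strip()):
        chunk_match = _CHUNK_RE.match(chunk.strip())
        if chunk_match is None:
            continue  # not a frame; skip rather than guess
        # Remove any whitespace that wrapping inserted inside the location.
        location = "".join(chunk_match.group(1).split())
        loc_match = _LOCATION_RE.match(location)
        if loc_match is None:
            continue
        off = loc_match.group("off")
        frames.append(Frame(
            obj=loc_match.group("obj"),
            symbol=loc_match.group("sym") or None,
            offset=int(off, 0) if off is not None else None,
            address=int(chunk_match.group(2), 16),
        ))
    return frames


if __name__ == "__main__":
    sample = (
        "/usr/local/lib/dovecot/libdovecot.so.0(i_fatal+0) [0xb76ba35e] -> "
        "dovecot/director() [0x80574f7] -> "
        "dovecot/director(main+0x49e) [0x804cf5e]"
    )
    for frame in parse_raw_backtrace(sample):
        print(frame)
```

Frames without a symbol name, such as the two `dovecot/director()` entries, would still need a symbolizer (for example `addr2line -e` against the exact same binary built with debug symbols) to be resolved to source lines.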
On 18 Sep 2018, at 13.29, Simone Lazzaris <s.lazzaris at interactive.eu> wrote:
> > Hi all, again;
> >
> > I've enabled core dumps and let it run for a few days, waiting for the issue
> > to recur.
> >
> > In the meantime I've also upgraded the poolmon script, as Sami suggested.
> >
> > It seems that the upgrade has scared the issue away, because it has not
> > occurred since.
> >
> > Maybe the problem is related to the way the old poolmon talked to the
> > director daemon? I'm not very inclined to downgrade poolmon to catch a
> > traceback, but I can do so if necessary.
>
> Well, maybe it's not necessary ;)
> I've performed some maintenance operations on the backends and that triggered the crash. It seems that something goes wrong when one backend comes back online.

It's weird how easily you can reproduce the crash. I've run all kinds of (stress) tests and I can't reproduce this crash. I was able to reproduce the original hang, though.

> Unfortunately, the core was not dumped, and I don't know what else to do: the director service is not chrooted, and ulimit -c is unlimited.

Do you have:

sysctl -w fs.suid_dumpable=2
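The core dump prerequisites mentioned in this thread (the core size limit and fs.suid_dumpable) can be checked in one go. The script below is a rough sketch under stated assumptions, not an official Dovecot procedure: `read_proc` and `diagnose` are hypothetical helper names, the checks cover only general Linux behaviour, and the core size limit it reads is that of the shell running the script, which may differ from the limit the Dovecot master passes to the director process.

```python
import resource
from pathlib import Path
from typing import List, Optional


def read_proc(path: str) -> Optional[str]:
    """Return the stripped contents of a /proc file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def diagnose(suid_dumpable: Optional[str],
             core_soft_limit: int,
             core_pattern: Optional[str]) -> List[str]:
    """Return human-readable reasons why a core file might not be written."""
    problems = []
    if suid_dumpable == "0":
        # With 0, processes that changed credentials do not dump core.
        problems.append("fs.suid_dumpable is 0; try: sysctl -w fs.suid_dumpable=2")
    if core_soft_limit == 0:
        problems.append("core size limit is 0; raise it with: ulimit -c unlimited")
    if core_pattern and not core_pattern.startswith(("/", "|")):
        # A relative pattern writes the core into the process's working
        # directory, which must be writable by the crashing process.
        problems.append(
            "kernel.core_pattern is relative (%r); the core goes to the "
            "process's current directory, which must be writable" % core_pattern
        )
    return problems


if __name__ == "__main__":
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_CORE)
    findings = diagnose(
        read_proc("/proc/sys/fs/suid_dumpable"),
        soft_limit,
        read_proc("/proc/sys/kernel/core_pattern"),
    )
    if findings:
        for finding in findings:
            print("WARNING:", finding)
    else:
        print("No obvious obstacle to core dumps found for this shell.")
```

An empty result does not guarantee that a core will be written; it only rules out the settings discussed above.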