Tim Mohlmann
2018-Dec-19 18:00 UTC
Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
Dear list,

We have been having some issues where the indexer-worker is crashing. This happens on production servers which are handling a small amount of mail, but it is also reproducible by moving messages. Also, users on my server are complaining about "Trashed" items coming back etc.

Some details:

- Dovecot 2.3.2.1 in an alpine:3.8 based Docker container, as part of the Mailu distribution (https://github.com/Mailu/Mailu).
- I first saw this issue on my production server, which stores mail on GlusterFS.
- I've been able to reproduce it by running the Docker container on a virtual machine, using local storage.
- There is another Mailu user reporting the same problem on a different VM provider / disk infrastructure: https://github.com/Mailu/Mailu/issues/751
- Libc: musl-1.1.19-r10

Output of dovecot -n:

# dovecot -n
# 2.3.2.1 (0719df592): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.2 (7704de5e)
# OS: Linux 4.16.3-301.fc28.x86_64 x86_64 ext4
# Hostname: 98a2726271d3
auth_mechanisms = plain login
disable_plaintext_auth = no
first_valid_gid = 8
first_valid_uid = 8
hostname = mail.usrpro.io
log_path = /dev/stderr
mail_access_groups = mail
mail_gid = mail
mail_home = /mail/%u
mail_location = maildir:/mail/%u
mail_plugins = " fts fts_lucene quota quota_clone zlib"
mail_privileged_group = mail
mail_uid = mail
mail_vsize_bg_after_count = 100
maildir_stat_dirs = yes
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext spamtest spamtestplus editheader imapsieve vnd.dovecot.imapsieve
namespace inbox {
  inbox = yes
  location
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Junk {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox Trash {
    auto = subscribe
    special_use = \Trash
  }
  prefix
}
passdb {
  args = /etc/dovecot/auth.conf
  driver = dict
}
plugin {
  fts = lucene
  fts_autoindex = yes
  fts_autoindex_exclude = \Junk
  fts_lucene = whitespace_chars=@.
  imapsieve_mailbox1_before = file:/conf/report-spam.sieve
  imapsieve_mailbox1_causes = COPY
  imapsieve_mailbox1_name = Junk
  imapsieve_mailbox2_before = file:/conf/report-ham.sieve
  imapsieve_mailbox2_causes = COPY
  imapsieve_mailbox2_from = Junk
  imapsieve_mailbox2_name = *
  quota = count:User quota
  quota_clone_dict = proxy:/tmp/podop.socket:quota
  quota_vsizes = yes
  sieve = file:~/sieve;active=~/.dovecot.sieve
  sieve_before = dict:proxy:/tmp/podop.socket:sieve
  sieve_execute_bin_dir = /conf/bin
  sieve_extensions = +spamtest +spamtestplus +editheader
  sieve_global_extensions = +vnd.dovecot.execute
  sieve_plugins = sieve_imapsieve sieve_extprograms
  sieve_spamtest_max_value = 15
  sieve_spamtest_status_header = X-Spam-Level
  sieve_spamtest_status_type = strlen
  sieve_vacation_dont_check_recipient = yes
  sieve_vacation_send_from_recipient = yes
}
postmaster_address = admin at usrpro.io
protocols = imap pop3 lmtp sieve
service auth-worker {
  unix_listener auth-worker {
    group = mail
    mode = 0660
    user = dovecot
  }
  user = mail
}
service auth {
  user = dovecot
}
service imap-login {
  inet_listener imap {
    port = 143
  }
}
service lmtp {
  inet_listener lmtp {
    port = 2525
  }
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
}
submission_host = 192.168.203.6
userdb {
  args = /etc/dovecot/auth.conf
  driver = dict
}
protocol imap {
  mail_plugins = " fts fts_lucene quota quota_clone zlib imap_quota imap_sieve"
}
protocol lmtp {
  mail_plugins = " fts fts_lucene quota quota_clone zlib sieve"
  recipient_delimiter = +
}

And the actual error log:

imap_1       | Dec 19 16:31:08 indexer-worker(admin at usrpro.io)<490><m+t5VmJ93K7AqMsG:grc9HUxyGlzqAQAANEhNiw>: Fatal: master: service(indexer-worker): child 490 killed with signal 11 (core dumped)
imap_1       | Dec 19 16:31:09 indexer-worker(admin at usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: lucene index /mail/admin at usrpro.io/lucene-indexes: IndexWriter() failed (#1): Lock obtain timed out
imap_1       | Dec 19 16:31:09 indexer-worker(admin at usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: Mailbox INBOX: Mail search failed: Internal error occurred. Refer to server log for more information. [2018-12-19 16:31:08]
imap_1       | Dec 19 16:31:09 indexer-worker(admin at usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: Mailbox INBOX: Transaction commit failed: FTS transaction commit failed: backend deinit (attempted to index 1 messages (UIDs 1299..1299))
imap_1       | Dec 19 16:31:10 indexer-worker(admin at usrpro.io)<493><m+t5VmJ93K7AqMsG:GKWdMU1yGlztAQAANEhNiw>: Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
imap_1       | Dec 19 16:31:10 indexer: Error: Indexer worker disconnected, discarding 1 requests for admin at usrpro.io
imap_1       | Dec 19 16:31:11 indexer-worker(admin at usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: lucene index /mail/admin at usrpro.io/lucene-indexes: IndexWriter() failed (#1): Lock obtain timed out
imap_1       | Dec 19 16:31:11 indexer-worker(admin at usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: Mailbox INBOX: Mail search failed: Internal error occurred. Refer to server log for more information. [2018-12-19 16:31:10]
imap_1       | Dec 19 16:31:11 indexer-worker(admin at usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: Mailbox INBOX: Transaction commit failed: FTS transaction commit failed: backend deinit (attempted to index 1 messages (UIDs 1310..1310))
imap_1       | Dec 19 16:31:11 indexer: Error: Indexer worker disconnected, discarding 1 requests for admin at usrpro.io

I managed to find a core dump file, which appeared outside of the container. So I copied it back in, installed gdb and ran it:

GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/dovecot/indexer-worker...(no debugging symbols found)...done.
[New LWP 1075]

warning: Can't read pathname for load map: No error information.
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fbd9a31c11a in free () from /lib/ld-musl-x86_64.so.1

So this seems musl related. I installed musl-dbg and ran again:

GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/dovecot/indexer-worker...(no debugging symbols found)...done.
[New LWP 1075]

warning: Can't read pathname for load map: No error information.
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
108     ./arch/x86_64/atomic_arch.h: No such file or directory.

Now I'm kinda lost in space. I don't know where that header file is. I tried running "find" on the filesystem and a Google search, but nothing specific showed up.

I am starting to feel this bug is more musl related than Dovecot related. Since musl has bitten our project before, I'm considering moving the project to Debian based images. But I want to be 100% sure this is not a Dovecot bug.

Note: earlier I created an image from alpine:edge with musl 1.1.20-r2 and Dovecot 2.3.3. Running that image I saw the same error symptoms (it was just hopeful trial and error). I did not do any debugging on that one.

Thanks in advance!

Tim
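
The log excerpt above mixes several different failures: the segfault itself, the Lucene lock timeout, the failed FTS commit and the discarded indexer requests. The following is a minimal Python sketch for tallying them in a longer log. The category names and the helpers `classify`, `summarize` and `crashed_pids` are made up for illustration and are not part of Dovecot. The regular expressions are taken only from the lines shown above, so other log formats may need different patterns.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt in this thread.
# The category names on the left are illustrative only.
PATTERNS = {
    "segfault": re.compile(r"child (\d+) killed with signal 11"),
    "lucene_lock": re.compile(r"IndexWriter\(\) failed .*Lock obtain timed out"),
    "fts_commit": re.compile(r"FTS transaction commit failed"),
    "discarded": re.compile(r"discarding (\d+) requests"),
}


def classify(line):
    """Return the name of the first pattern matching the line, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many log lines fall into each category."""
    counts = Counter()
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts


def crashed_pids(lines):
    """Return the PIDs of children reported as killed with signal 11."""
    pids = []
    for line in lines:
        match = PATTERNS["segfault"].search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids
```

Feeding the container log into `summarize` line by line gives a quick view of whether every lock timeout is preceded by a crash, which is what the excerpt suggests but does not prove.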
Aki Tuomi
2018-Dec-19 18:08 UTC
Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
> On 19 December 2018 at 20:00 Tim Mohlmann via dovecot <dovecot@dovecot.org> wrote:
> [...]

Can you run bt full on gdb and post that?

---
Aki Tuomi
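
For anyone reproducing this, the full backtrace can also be captured without an interactive gdb session. The following is a minimal sketch that assumes gdb is installed in the container. The helper names and the core file path used in the usage note are hypothetical. The binary path is the one gdb reports in the session output earlier in the thread.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a full backtrace and exits.

    -batch makes gdb exit after running its commands;
    -ex runs the given gdb command ("bt full") after loading the core.
    """
    return ["gdb", "-batch", "-ex", "bt full", binary, core]


def full_backtrace(binary, core):
    """Run gdb on the core file and return whatever it printed to stdout."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the output is still useful
    )
    return result.stdout
```

A call such as `full_backtrace("/usr/libexec/dovecot/indexer-worker", "/tmp/core")` would return the text to paste into a reply, with `/tmp/core` standing in for wherever the core dump was copied.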
Tim Mohlmann
2018-Dec-19 18:15 UTC
Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
Hope this helps. There are no debugging symbols available for dovecot in the alpine repository.

(gdb) bt full
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
No locals.
#1  free (p=0x55d949250660) at src/malloc/malloc.c:467
        extra = 65
        base = 0x55d94925060f ""
        len = 94391723427649
        self = 0x55d949250650
        next = <optimized out>
        final_size = <optimized out>
        new_size = <optimized out>
        size = <optimized out>
        reclaim = 0
        i = <optimized out>
#2  0x00007fbd98ae7e29 in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#3  0x00007fbd98ae87ee in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#4  0x00007fbd98b3639d in lucene::index::IndexWriter::init(lucene::store::Directory*, lucene::analysis::Analyzer*, bool, bool, lucene::index::IndexDeletionPolicy*, bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#5  0x00007fbd98b36883 in lucene::index::IndexWriter::IndexWriter(char const*, lucene::analysis::Analyzer*, bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#6  0x00007fbd98df7cea in lucene_index_build_init () from /usr/lib/dovecot/lib21_fts_lucene_plugin.so
No symbol table info available.
#7  0x00007fbd98df677b in ?? () from /usr/lib/dovecot/lib21_fts_lucene_plugin.so
No symbol table info available.
#8  0x00007fbd99836706 in fts_backend_update_set_build_key () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#9  0x00007fbd99837c25 in fts_build_mail () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#10 0x00007fbd9983c63a in ?? () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#11 0x00007fbd9a008ea7 in mail_precache () from /usr/lib/dovecot/libdovecot-storage.so.0
No symbol table info available.
#12 0x000055d9471ce6c0 in ?? ()
No symbol table info available.
#13 0x00007fbd99d41beb in io_loop_call_io () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#14 0x00007fbd99d430ab in io_loop_handler_run_internal () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#15 0x00007fbd99d41d56 in io_loop_handler_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#16 0x00007fbd99d41e69 in io_loop_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#17 0x00007fbd99cd5d32 in master_service_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#18 0x000055d9471cdfa9 in main ()
No symbol table info available.

On 12/19/18 8:08 PM, Aki Tuomi wrote:
>> On 19 December 2018 at 20:00 Tim Mohlmann via dovecot <dovecot at dovecot.org> wrote:
>> [...]
>> Type "show configuration" for configuration details. >> For bug reporting instructions, please see: >> < ">http://www.gnu.org/software/gdb/bugs/> >> <http://www.gnu.org/software/gdb/bugs/>>. >> Find the GDB manual and other documentation resources online at: >> < ">http://www.gnu.org/software/gdb/documentation/> >> <http://www.gnu.org/software/gdb/documentation/>>. >> For help, type "help". >> Type "apropos word" to search for commands related to "word"... >> Reading symbols from /usr/libexec/dovecot/indexer-worker...(no debugging >> symbols found)...done. >> [New LWP 1075] >> >> warning: Can't read pathname for load map: No error information. >> Core was generated by `dovecot/indexer-worker'. >> Program terminated with signal SIGSEGV, Segmentation fault. >> #0? a_crash () at ./arch/x86_64/atomic_arch.h:108 >> 108???? ./arch/x86_64/atomic_arch.h: No such file or directory. >> >> Now I'm kinda lost in space. I don't know where that header file is. >> Tried running "find" on the filesystem and a google search. But nothing >> specific showed up. >> >> I am starting to feel this bug is more musl related than Dovecot. Since >> this has bitten our project more in the past, I'm considering to move >> the project to Debian based images. But I want to be 100% sure this is >> not a dovecot bug. >> >> Note, earlier I created an image with Alpine:edge with musl 1.1.20-r2 >> and Dovecot 2.3.3. Running that image I saw the same error symptoms (Was >> just a hopeful trail and error). I did not do any debugging on that one. >> >> Thanks in advance! Tim >> >> > Can you run bt full on gdb and post that? > --- > Aki Tuomi-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20181219/8eb10540/attachment-0001.html>
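For anyone following along, the "bt full" request above can also be run non-interactively, which makes the output easier to paste into a reply. The following is only a rough sketch: the binary path is the one gdb printed in the session above, but the core file path and the output file name (bt-full.txt) are placeholders, since the thread does not say where the core file ended up. It uses gdb's standard -batch and -ex options. Without debug symbols for the binary, the trace will likely still be sparse.

```shell
#!/bin/sh
# Sketch: write a full backtrace from a core file to a text file.
# BINARY is taken from the gdb session quoted above; the core path passed
# to full_backtrace is an assumption and must be adjusted locally.
BINARY=/usr/libexec/dovecot/indexer-worker

full_backtrace() {
    core="$1"                  # path to the core file (placeholder)
    out="${2:-bt-full.txt}"    # hypothetical output file name
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    # -batch makes gdb exit after running the commands given with -ex.
    gdb -batch -ex "bt full" "$BINARY" "$core" > "$out" 2>&1
}

# Example call (hypothetical path):
# full_backtrace /tmp/core.indexer-worker
```

The function refuses to start gdb when the core file is missing, so a wrong path shows up as a clear message instead of an empty trace.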