Phillip Macey
2009-Aug-13 02:10 UTC
[Dovecot] Crash in v1.2.3: istream.c: assertion failed on line 99
Hi, I have a couple of people bumping into an issue with their imap process crashing - from the users perspective, their mail client (thunderbird) just stops recieving new mail. It still seems to be possible for them to read their mail using squirell mail. When it crashes, it leaves behind a log message like this on the server: Aug 12 15:52:25 fury dovecot: imap-login: Login: user=<zhi>, method=PLAIN, rip=10.8.5.15, lip=10.10.0.26 Aug 12 15:52:25 fury dovecot: IMAP(zhi): Panic: file istream.c: line 99 (i_stream_read): assertion failed: ((size_t)ret+old_size == _stream->pos - _stream->skip) Aug 12 15:52:25 fury dovecot: IMAP(zhi): Raw backtrace: imap [0x49ddc0] -> imap [0x49de23] -> imap [0x49d4a6] -> imap [0x4a252e] -> imap [0x4a4691] -> imap(i_stream_read+0x48) [0x4a23f8] -> imap(i_stream_read_data+0x28) [0x4a2558] -> imap [0x4ab22e] -> imap(o_stream_send_istream+0x2e) [0x4aa63e] -> imap(maildir_save_continue+0x32) [0x443d12] -> imap(mail_storage_copy+0x6a) [0x462a1a] -> imap(maildir_copy+0x5e) [0x441d2e] -> imap(cmd_copy+0x1fe) [0x41a94e] -> imap(cmd_uid+0x78) [0x41f8d8] -> imap [0x42053c] -> imap [0x4205f2] -> imap(client_handle_input+0x3f) [0x42072f] -> imap(client_input+0x5f) [0x42127f] -> imap(io_loop_handler_run+0xf8) [0x4a5b08] -> imap(io_loop_run+0x1d) [0x4a4c1d] -> imap(main+0x620) [0x428b80] -> /lib64/libc.so.6(__libc_start_main+0xf4) [0x353e41d974] -> imap [0x4198f9] Aug 12 15:52:25 fury dovecot: dovecot: child 14759 (imap) killed with signal 6 (core dumps disabled) I was able to fairly reliably trigger the same behaviour myself by attempting to move a bunch (thousands) of emails in my account using thunderbird - I dont know exactly what it is that triggers the crash though. I tried enabling core dumps using 'ulimit -c unlimited' and messing around with some environment variables to stop the startup scripts from forcing core dumps off again (DAEMON_COREFILE_LIMIT=18446744073709551615 /etc/init.d/dovecot restart). After doing that, I do not see anything in the logs any more - but I still seem to be having problems moving the emails around in my account. It would seem that it is still broken but Im just not getting any logs telling me so. There don't seem to be any core dumps sitting around anywhere either. My 'dovecot -n' is: # 1.2.3: /etc/dovecot.conf # OS: Linux 2.6.18-128.1.16.el5.centos.plus x86_64 CentOS release 5.3 (Final) xfs syslog_facility: local0 shutdown_clients: no login_dir: /var/run/dovecot/login login_executable(default): /usr/libexec/dovecot/imap-login login_executable(imap): /usr/libexec/dovecot/imap-login login_executable(pop3): /usr/libexec/dovecot/pop3-login max_mail_processes: 1024 first_valid_uid: 300 mail_location: maildir:/u/%u/Maildir mail_drop_priv_before_exec: yes mail_executable(default): /usr/libexec/dovecot/imap mail_executable(imap): /usr/libexec/dovecot/imap mail_executable(pop3): /usr/libexec/dovecot/pop3 mail_plugin_dir(default): /usr/lib64/dovecot/imap mail_plugin_dir(imap): /usr/lib64/dovecot/imap mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3 lda: postmaster_address: example at example.com auth default: mechanisms: gssapi gss-spnego plain krb5_keytab: /etc/opt/quest/vas/imap.keytab gssapi_hostname: $ALL passdb: driver: passwd-file args: /etc/dovecot.deny # empty file deny: yes passdb: driver: pam args: dovecot userdb: driver: passwd userdb: driver: static args: home=/u/%u -- Thanks, Phill Macey
Charles Marcus
2009-Aug-13 12:49 UTC
[Dovecot] Crash in v1.2.3: istream.c: assertion failed on line 99
On 8/12/2009, Phillip Macey (phillip.macey at cisra.canon.com.au) wrote:> I was able to fairly reliably trigger the same behaviour myself by > attempting to move a bunch (thousands) of emails in my account using > thunderbirdWhat version of TBird? The upcoming v3 is supposed to be bringing a *lot* of IMAP fixes... can you try it to see if this is at least limited to 2.0.0.x? -- Best regards, Charles
Timo Sirainen
2009-Aug-13 23:00 UTC
[Dovecot] Crash in v1.2.3: istream.c: assertion failed on line 99
On Thu, 2009-08-13 at 12:10 +1000, Phillip Macey wrote:> Aug 12 15:52:25 fury dovecot: IMAP(zhi): Panic: file istream.c: line 99 > (i_stream_read): assertion failed: ((size_t)ret+old_size == _stream->pos > - _stream->skip)This is kind of annoying problem. I've been adding more of these asserts to try to catch it, but no one's really found an easy way to reproduce it. If you manage to get a core dump, it would also help to see what those values are: fr 6 p ret p old_size p _stream.pos p _stream.skip (You may need to compile Dovecot without -O2 parameter to get some of those values. Easiest way would be to just remove -O2 from src/lib/Makefile; touch src/lib/istream.c; make; sudo make install)> imap(i_stream_read_data+0x28) [0x4a2558] -> imap [0x4ab22e] -> > imap(o_stream_send_istream+0x2e) [0x4aa63e] -> > imap(maildir_save_continue+0x32) [0x443d12] -> > imap(mail_storage_copy+0x6a) [0x462a1a] -> imap(maildir_copy+0x5e)This is another thing I'm wondering. It's copying the mails by actually copying the data, instead of doing a hard link. But your dovecot -n output shows that you have it set to defaults, so it should have used a hard link. Or did you copy to a directory that was in another mount point? Or doesn't your filesystem support hard links?> I was able to fairly reliably trigger the same behaviour myself by > attempting to move a bunch (thousands) of emails in my account using > thunderbird - I dont know exactly what it is that triggers the crash > though.If you're still able to reproduce this, can you also try if this does it: telnet localhost 143 a login user pass b select inbox c create testbox d copy 1:* testbox Instead of getting a core dump you can also attach gdb to an already running process: gdb -p <imap process pid> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: <http://dovecot.org/pipermail/dovecot/attachments/20090813/afc32ee4/attachment-0002.bin>
Phillip Macey
2009-Aug-14 01:36 UTC
[Dovecot] Crash in v1.2.3: istream.c: assertion failed on line 99
On 13/08/2009 10:49 PM, Charles Marcus wrote:> What version of TBird? >2.0.0.22. It is not a thunderbird issue - I can trigger the crash using telnet. -- Thanks, Phill Macey (CiSRA IT Services)
Phillip Macey
2009-Aug-14 02:20 UTC
[Dovecot] Crash in v1.2.3: istream.c: assertion failed on line 99
On 14/08/2009 9:00 AM, Timo Sirainen wrote:> On Thu, 2009-08-13 at 12:10 +1000, Phillip Macey wrote: > >> Aug 12 15:52:25 fury dovecot: IMAP(zhi): Panic: file istream.c: line 99 >> (i_stream_read): assertion failed: ((size_t)ret+old_size == _stream->pos >> - _stream->skip) >> > > This is kind of annoying problem. I've been adding more of these asserts to try to catch it, but no one's really found an easy way to reproduce it. >Til now. It would seem that I can reproduce it at will - except that it doesnt get logged if I have core dumps enabled :-) It could be specific to my maildir or an email in my maildir?> If you manage to get a core dump, it would also help to see what those values are: > > fr 6 > p ret > p old_size > p _stream.pos > p _stream.skip > > (You may need to compile Dovecot without -O2 parameter to get some of those values. Easiest way would be to just remove -O2 from src/lib/Makefile; touch src/lib/istream.c; make; sudo make install) >Is this what you are after? I got to this point using telnet to send a copy command as you suggested below and then attaching gdb to the process and stepping through. 0x080f2c2c 337 while (ioloop->running) (gdb) 338 io_loop_handler_run(ioloop); (gdb) 0x080f2c23 338 io_loop_handler_run(ioloop); (gdb) Program received signal SIGABRT, Aborted. 0xb7f55402 in __kernel_vsyscall () (gdb) Aug 14 11:26:36 eyck dovecot: IMAP(pmacey): Panic: file istream.c: line 99 (i_stream_read): assertion failed: ((size_t)ret+old_size =_stream->pos - _stream->skip) Aug 14 11:26:36 eyck dovecot: IMAP(pmacey): Raw backtrace: imap [0x80eb000] -> imap [0x80eb05a] -> imap [0x80ea8ac] -> imap [0x80f0098] -> imap [0x80f25ca] -> imap(i_stream_read+0x3f) [0x80eff3f] -> imap(i_stream_read_data+0x1d) [0x80f00bd] -> imap [0x80e341f] -> imap [0x80e393e] -> imap(message_parser_parse_next_block+0x1d) [0x80e355d] -> imap(index_mail_cache_parse_continue+0x22) [0x809d3b2] -> imap(maildir_save_continue+0x45) [0x808ab55] -> imap(mail_storage_copy+0x6d) [0x80ac86d] -> imap(maildir_copy+0x4b) [0x80888db] -> imap(cmd_copy+0x1e0) [0x805e760] -> imap [0x80647ac] -> imap [0x806485b] -> imap(client_handle_input+0x33) [0x80649b3] -> imap(client_input+0x5f) [0x80654ff] -> imap(io_loop_handler_run+0x100) [0x80f3b40] -> imap(io_loop_run+0x28) [0x80f2c28] -> imap(main+0x738) [0x806da48] -> /lib/libc.so.6(__libc_start_main+0xdc) [0x58ae8c] -> imap [0x805d591] bt #0 0xb7f55402 in __kernel_vsyscall () #1 0x0059dd80 in raise () from /lib/libc.so.6 #2 0x0059f691 in abort () from /lib/libc.so.6 #3 0x080eb00d in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:160 #4 0x080eb05a in i_internal_fatal_handler (type=LOG_TYPE_PANIC, status=0, fmt=0x8102a90 "file %s: line %d (%s): assertion failed: (%s)", args=0xbfa95a34 "Kb\021\bc") at failures.c:440 #5 0x080ea8ac in i_panic (format=0x8102a90 "file %s: line %d (%s): assertion failed: (%s)") at failures.c:207 #6 0x080f0098 in i_stream_read (stream=0x9b284c0) at istream.c:99 #7 0x080f25ca in i_stream_tee_read (stream=0x9b28578) at istream-tee.c:130 #8 0x080eff3f in i_stream_read (stream=0x9b285a0) at istream.c:80 #9 0x080f00bd in i_stream_read_data (stream=0x9b285a0, data_r=0xbfa95bb8, size_r=0xbfa95bbc, threshold=1) at istream.c:361 #10 0x080e341f in message_parser_read_more (ctx=0x9b2fda0, block_r=0xbfa95bb0, full_r=0xbfa95b4f) at message-parser.c:118 #11 0x080e393e in parse_next_body_to_eof (ctx=0x9b2fda0, block_r=0xbfa95bb0) at message-parser.c:412 #12 0x080e355d in message_parser_parse_next_block (ctx=0x9b2fda0, block_r=0xbfa95bb0) at message-parser.c:768 #13 0x0809d3b2 in index_mail_cache_parse_continue (_mail=0x9b29ca0) at index-mail.c:1367 #14 0x0808ab55 in maildir_save_continue (_ctx=0x9b28698) at maildir-save.c:426 #15 0x080ac86d in mail_storage_copy (ctx=0x9b28698, mail=0x9b23cb0) at mail-copy.c:40 #16 0x080888db in maildir_copy (ctx=0x9b28698, mail=0x9b23cb0) at maildir-copy.c:282 #17 0x0805e760 in cmd_copy (cmd=0x9ac0ea0) at cmd-copy.c:73 #18 0x080647ac in client_command_input (cmd=0x9ac0ea0) at client.c:611 #19 0x0806485b in client_command_input (cmd=0x9ac0ea0) at client.c:660 #20 0x080649b3 in client_handle_input (client=0x9ac0c10) at client.c:701 #21 0x080654ff in client_input (client=0x9ac0c10) at client.c:752 #22 0x080f3b40 in io_loop_handler_run (ioloop=0x9abf298) at ioloop-epoll.c:208 #23 0x080f2c28 in io_loop_run (ioloop=0x9abf298) at ioloop.c:338 #24 0x0806da48 in main (argc=Cannot access memory at address 0x3c10 ) at main.c:327 (gdb) fr 6 #6 0x080f0098 in i_stream_read (stream=0x9b284c0) at istream.c:99 99 i_assert((size_t)ret+old_size == _stream->pos - _stream->skip); (gdb) p ret $4 = 15376 (gdb) p old_size No symbol "old_size" in current context. (gdb) p _stream $5 = (struct istream_private *) 0x9b28498 (gdb) p _stream.pos $6 = 8191 (gdb) p _stream.skip $7 = 0 (gdb)>> imap(i_stream_read_data+0x28) [0x4a2558] -> imap [0x4ab22e] -> >> imap(o_stream_send_istream+0x2e) [0x4aa63e] -> >> imap(maildir_save_continue+0x32) [0x443d12] -> >> imap(mail_storage_copy+0x6a) [0x462a1a] -> imap(maildir_copy+0x5e) > > This is another thing I'm wondering. It's copying the mails by actually copying the data, instead of doing a hard link. But your dovecot -n output shows that you have it set to defaults, so it should have used a hard link. Or did you copy to a directory that was in another mount point? Or doesn't your filesystem support hard links? >Mail storage is on an XFS filesystem - it should support hard links. I have explicitly set maildir_copy_with_hardlinks=yes in the config file rather than just relying on the defaults because I was certain that I wanted it turned on. I just jumped onto the server and created a hardlink by hand without any problems. 'ls -lni' looks like it works: 22972186 -rw-r--r-- 2 30332 60 81 Jul 13 16:38 dot.forward 22972186 -rw-r--r-- 2 30332 60 81 Jul 13 16:38 dot.forward.hardlink 1575505744 drwx------ 99 30332 60 8192 Aug 14 09:08 Maildir>> I was able to fairly reliably trigger the same behaviour myself by >> attempting to move a bunch (thousands) of emails in my account using >> thunderbird - I dont know exactly what it is that triggers the crash >> though. > > If you're still able to reproduce this, can you also try if this does > it: >Doing this also triggers the crash. See the gdb output above. The only thing I did differently was that 'testbox' already existed and had some emails in it so I didnt do the 'create'.> telnet localhost 143 > a login user pass > b select inbox > c create testbox > d copy 1:* testbox > > Instead of getting a core dump you can also attach gdb to an already > running process: gdb -p <imap process pid>-- Thanks, Phill Macey (CiSRA IT Services)