I have some archived mail in gzipped mboxes. I could use them with Dovecot 1.x without problems, but I recently installed Dovecot 2.0.12 and they are slow. Very slow.

Creating index files takes about 10 minutes for a ~20 MB bzipped mbox with 560 messages. A gzipped mbox is a little better, but still unusable :(

Stracing the dovecot process shows that roughly every 20 messages it rereads the complete mbox file.

Am I doing something wrong?

KJ
On 5/6/2011 3:07 PM, Kamil Jońca wrote:
> I have some archive mails in gzipped mboxes. I could use them with
> dovecot 1.x without problems.
> But recently I have installed dovecot 2.0.12, and they are slow. very
> slow.
>
> Creating index files takes about 10 minutes for ~20M file with 560
> messages for bzipped mbox, for gzipped is little better but still
> unusable :(

What other software, if any, was also upgraded or changed when you upgraded to Dovecot 2.0.12? Libraries? Filesystem? Daemons? What OS/version? Was the OS upgraded? Is this a new machine as well as new software? If so, how did you copy the files to the new system? Could they have been mildly corrupted along the way?

Did this bad behavior start directly after the upgrade, or did 2.0.12 handle the zipped mbox files at acceptable speed for a while? Did you add or enable any new Dovecot plugins that you weren't running in 1.2.x?

> Stracing dovecot process shows that every ~ 20 messages it rereads
> complete mbox file.

Can you be a bit more specific here? What do you mean by "rereads complete mbox file"? I'm not a dev, but that sounds suspiciously like an error handling mechanism, i.e. an error occurred while processing, or the file may have changed while processing, so we start over.

Could you have a buggy inotify/dnotify or something along those lines? Do you now have something else running, say at the filesystem level, that is making Dovecot think the file has changed even though it hasn't? Are you zipping these mbox files via a cron job that is running every few seconds instead of every few hours or days?

Something is apparently causing Dovecot to reread these files regularly, and I'd guess it's probably not a Dovecot bug. Did you run strace when accessing a non-compressed mbox file for comparison?

--
Stan
Here are some fixes:

http://hg.dovecot.org/dovecot-2.0/rev/15a0687ec9d0
http://hg.dovecot.org/dovecot-2.0/rev/66ec075a49d3
kjonca at o2.pl (Kamil Jońca) writes:

> I have some archive mails in gzipped mboxes. I could use them with
> dovecot 1.x without problems.
> But recently I have installed dovecot 2.0.12, and they are slow. very
> slow.

Recently I had to read some compressed mboxes again, and there is no progress :(

I took the 2.0.17 sources and put some

i_debug ("#kjonca["__FILE__",%d,%s] %d", __LINE__,__func__,...some parameters ...);

lines into istream-bzlib.c, istream-raw-mbox.c and istream-limit.c, and found the following.

In istream-limit.c, in the function around lines 40-45, this code:
--8<---------------cut here---------------start------------->8---
i_stream_seek(stream->parent, lstream->istream.parent_start_offset +
              stream->istream.v_offset);
stream->pos -= stream->skip;
stream->skip = 0;
--8<---------------cut here---------------end--------------->8---
seeks the parent stream (calling i_stream_raw_mbox_seek in istream-raw-mbox.c),

and then (line 50):
--8<---------------cut here---------------start------------->8---
if ((ret = i_stream_read(stream->parent)) == -2)
        return -2;
--8<---------------cut here---------------end--------------->8---
tries to read some data from earlier in the stream. With compressed mboxes this causes the file to be reread from the beginning.

I then commented out lines 40-45 of istream-limit.c (just for testing), and the bzipped mbox could be opened in reasonable time. Moreover, I could read some randomly picked mails without problems.

Unfortunately, the meaning of the fields in the istream* structures (especially skip, pos and offset) is too unclear to me to write proper code myself.

KJ
On 12/01/2012 10:39, Kamil Jońca wrote:
> kjonca at o2.pl (Kamil Jońca) writes:
>
>> I have some archive mails in gzipped mboxes. I could use them with
>> dovecot 1.x without problems.
>> But recently I have installed dovecot 2.0.12, and they are slow. very
>> slow.
>
> Recently I have to read some compressed mboxes again, and no progress :(
> I took 2.0.17 sources and put some
> i_debug ("#kjonca["__FILE__",%d,%s] %d", __LINE__,__func__,...some parameters ...);
>
> lines into istream-bzlib.c, istream-raw-mbox.c and istream-limit.c
> and found that:
>
> in istream-limit.c in function around lines 40-45:
> --8<---------------cut here---------------start------------->8---
> i_stream_seek(stream->parent, lstream->istream.parent_start_offset +
> stream->istream.v_offset);
> stream->pos -= stream->skip;
> stream->skip = 0;
> --8<---------------cut here---------------end--------------->8---
> seeks stream, (calling i_stream_raw_mbox_seek in file istream-raw-mbox.c )
>
> and then (line 50 )
> --8<---------------cut here---------------start------------->8---
> if ((ret = i_stream_read(stream->parent)) == -2)
> return -2;
> --8<---------------cut here---------------end--------------->8---
>
> tries to read some data earlier in stream, and with compressed mboxes it
> cause reread file from the beginning.
>

Just wanted to bump this since it seems interesting. Timo, do you have a comment?

I definitely see your point that skipping backwards in a compressed stream is going to be very CPU intensive.

Ed W