Nate
2007-Feb-01 01:39 UTC
[Dovecot] Indexing Performance Question (was tpop3d vs dovecot)
Since posting the previous thread we've setup a new system (Opteron 2.0ghz, 1gb ram, Fedora 6) for testing. I am hoping somebody very familiar with the indexing portion of dovecot can shed some light on this for me. After much testing, I've come to one primary conclusion that dovecot is possibly unnecessarily scanning or reading files within the maildir directories. Take a mailbox for instance (my test one has 71570 emails in the cur/ folder so the process runs long enough to get data of what it's doing). Follow these steps: 1. First pop3 connection. With no index files whatsoever, it takes nearly a full 10 minutes at 100% io load to scan the 71570 files (748M) directory and build the index files from scratch which look like this: -rw------- 1 postfix postdrop 839K Feb 1 00:43 dovecot.index -rw------- 1 postfix postdrop 1.9M Feb 1 00:57 dovecot.index.cache -rw------- 1 postfix postdrop 560K Feb 1 00:43 dovecot.index.log -rw------- 1 postfix postdrop 3.1M Feb 1 00:43 dovecot-uidlist While it's building index i see a lot of these fly by in lsof pop3 2023 postfix 8r REG 8,3 14425 285947 /var/spool/mail/test.visp.net/natetest/cur/1128488047.V902I7d42e.s3.visp.net:2, No problem here, the disk can only go as fast as the disk can go and thats a *lot* of mail. Although, one curious thing here, it appears that dovecot is reading or at least scanning the entire contents of the files within the cur/ directory to generate an index. I assume this by watching my disk cache go from 100MB to 600MB. Is it necessary to do that or only read the top X lines of each file, is that even possible? 2. Second connection. Near instantaneous results from a login and LIST. Cannot perform an lsof fast enough to see what files it's opening. This step can be repeated with nearly the same results. *rebooted machine to clear all cache* 3. Third connection. It takes ~40 seconds to display a LIST after login. The index files are untouched; however, during the 40 seconds I'm waiting, I can catch a few of these in lsof. pop3 1971 postfix 10r REG 8,3 4196 205963 /var/spool/mail/test.visp.net/natetest/cur/1157863512.V902I1f3867.s3.visp.net:2,S This tells me even though there is no new email, it is re-scanning the cur/ directory somewhat. It doesn't take near as long as the initial index build, but certainly takes longer than it would for the system to read any or all of the listed index files above. Why would dovecot need to even enter cur/ if the index file is up to date? Is it verifying that the index file is up to date by re-reading some or all of the many files in cur/? If so, could it not get the info it needs by simply doing the equivalent of a ls, which would take considerably less time? The disk cache is only filled up an additional 30meg vs an additional 500meg from connection 1, so it's certainly not reading as much data. (also makes me wonder why the initial scan/index build could not use this lesser time consuming function if thats what it's doing). Additional information: Running strace on the initial indexing looks like this... ------------------------------------------- open("/var/spool/mail/test.visp.net/natetest/cur/1115929738.V902I67c13.test.visp.net:2,", O_RDONLY) = 8 fstat(8, {st_mode=S_IFREG|0600, st_size=4667, ...}) = 0 pread(8, "Return-Path: <mehl8nl3n8osm3r83a"..., 4096, 0) = 4096 pread(8, "zdik8dyrnffv6bbrue.itownshipno12"..., 4095, 4096) = 571 pread(8, "", 3524, 4667) = 0 close(8) = 0 open("/var/spool/mail/test.visp.net/natetest/cur/1115931742.V902I67c42.test.visp.net:2,", O_RDONLY) = 8 fstat(8, {st_mode=S_IFREG|0600, st_size=3881, ...}) = 0 pread(8, "Return-Path: <sender-5-8759012-9"..., 4096, 0) = 3881 pread(8, "", 215, 3881) = 0 close(8) = 0 ------------------------------------------- Running strace on the 3rd connection looks like this... ------------------------------------------- open("/var/spool/mail/test.visp.net/natetest/cur/1114490671.V902I6011f.test.visp.net:2,", O_RDONLY) = 10 fstat(10, {st_mode=S_IFREG|0600, st_size=5367, ...}) = 0 pread(10, "Return-Path: <qflfmq82vz8zg-u6ci"..., 4096, 0) = 4096 pread(10, "gs/rpoc_District6Spauldings.php?"..., 4095, 4096) = 1271 pread(10, "", 2824, 5367) = 0 close(10) = 0 open("/var/spool/mail/test.visp.net/natetest/cur/1114496568.V902I60bf5.test.visp.net:2,", O_RDONLY) = 10 fstat(10, {st_mode=S_IFREG|0600, st_size=2047, ...}) = 0 pread(10, "Return-Path: <removed at aaa-calif.c"..., 4096, 0) = 2047 pread(10, "", 2049, 2047) = 0 close(10) = 0 Thank you for the help! - Nate
10 minutes to read 748Mb? Kinda slow, isn't it ? For me reading 2Gb mailbox (not maildir) takes MUCH less time than that (around 20 sec). It should be within 40secs to read the whole maildir. FiL Nate wrote:> > 1. First pop3 connection. With no index files whatsoever, it takes > nearly a full 10 minutes at 100% io load to scan the 71570 files > (748M) directory and build the index files from scratch which look > like this:
Timo Sirainen
2007-Feb-01 22:24 UTC
[Dovecot] Indexing Performance Question (was tpop3d vs dovecot)
On 1.2.2007, at 3.39, Nate wrote:> 3. Third connection. It takes ~40 seconds to display a LIST after > login. The index files are untouched; however, during the 40 > seconds I'm waiting, I can catch a few of these in lsof. > pop3 1971 postfix 10r REG 8,3 4196 205963 /var/spool/ > mail/test.visp.net/natetest/cur/1157863512.V902I1f3867.s3.visp.net:2,S > > This tells me even though there is no new email, it is re-scanning > the cur/ directory somewhat. It doesn't take near as long as the > initial index build, but certainly takes longer than it would for > the system to read any or all of the listed index files above. Why > would dovecot need to even enter cur/ if the index file is up to > date? Is it verifying that the index file is up to date by re- > reading some or all of the many files in cur/? If so, could it not > get the info it needs by simply doing the equivalent of a ls, which > would take considerably less time?The reason why the messages are read in the first place is because the message sizes need to be returned so that linefeeds are counted in CR+LF format, while they're typically stored in LF-only format. So if Dovecot just stat()ed the file and returned that as the message's size, it would be violating the POP3 spec. tpop3d seems to be doing that. It sounds like dovecot.index.cache file doesn't contain all the messages' sizes for some reason, and Dovecot goes and re-reads those files again to get the sizes. I'm not sure why it wouldn't contain the sizes though, so it sounds like a bug somewhere. I think this fixes it: http://dovecot.org/list/dovecot-cvs/2007- February/007582.html -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part URL: <http://dovecot.org/pipermail/dovecot/attachments/20070202/1f9bcc2b/attachment.bin>