Dear list,

I am experimenting with a new mail handling setup that involves a single IMAP folder with just under 70'000 messages. When OfflineIMAP connects to the server, the imap process starts to eat up a lot of memory:

    PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
  15607 madduck  35  19  283m 244m 239m D 16.9 49.3  0:09.96 imap

In contrast, when an "online" client such as Thunderbird connects, memory usage is around 10m, which is entirely acceptable.

The way OfflineIMAP reads mail is by FETCHing metadata, then APPENDing new local mail, SEARCHing for the UIDs of each uploaded mail, and finally FETCHing new remote mail.

Memory use seems to be O(n) in the size of the folder. On the folder with 70k messages, dovecot seems to allocate 280m of memory, which it then fills to about 70% during the metadata FETCH, and which then keeps growing while APPENDing and SEARCHing for the new local messages.

The 70k mailbox is just short of 600MB in size on disk. Dovecot uses 280MB to serve it. Is it possible that dovecot is reading too much into memory, or over-optimising? Can I somehow tweak this to lower the memory footprint?

Cheers,

-- 
martin; (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net at madduck

"the unexamined life is not worth living"
                                                             -- platon

spamtraps: madduck.bogus at madduck.net

also sprach martin f krafft <madduck at madduck.net> [2007.08.13.2259 +0200]:
> Memory use seems to be O(n) in the size of the folder. On the folder
> with 70k messages, dovecot seems to allocate 280m of memory, which

I just saw in the logs:

  mmap() failed with index cache file /home/madduck/.maildir/.store/dovecot.index.cache: Cannot allocate memory

and looking at the file, it's in fact 280m in size. Does dovecot need to read/mmap the entire file? Is there a way to vacuum/reduce/optimise the cache?

-- 
martin

On Mon, 2007-08-13 at 22:59 +0200, martin f krafft wrote:
> The way offlineimap reads mail is by FETCHing metadata, then
> APPENDing new local mail, SEARCHing for the UIDs of each uploaded
> mail, and finally FETCHing new remote mail.

What exactly do you mean by FETCHing metadata? Something like ENVELOPE or BODYSTRUCTURE? And is this fetched for all messages instead of just new ones? That could easily explain why the cache is so large.

[back to list]

On Tue, 2007-08-14 at 14:50 +0300, Timo Sirainen wrote:
> On Tue, 2007-08-14 at 13:47 +0200, martin f krafft wrote:
> > also sprach Timo Sirainen <tss at iki.fi> [2007.08.14.1250 +0200]:
> > > mmap()ing a 200MB file doesn't mean that it immediately uses 200MB
> > > of memory. It just reserves that much virtual space. Even if it
> > > happens to read the whole 200MB, it shouldn't be much different
> > > from simply reading the file in small parts, in which case kernel
> > > places the same 200MB to page cache.
> >
> > I am aware of that. I just wonder why dovecot has to read the entire
> > file into memory, via read() or mmap().
>
> It doesn't. It mmaps it entirely so it doesn't waste CPU mmaping and
> unmapping parts of it. It's then accessed only when and where needed.

It probably does access the whole file though. This is because all the cached data for a message is stored together, one message after another. So now that your client fetches only INTERNALDATE, it doesn't really need to read much, but because mmaped pages are read 4kB at a time, it practically reads the whole file.

Perhaps this could use some work in the future too, so that fields that are accessed together are stored close together in the cache file. But if multiple clients are used, there can be conflicting access patterns, and what's good for one client would be bad for another. Data could also be duplicated to reduce disk reads for different access patterns, but that comes at the cost of disk space.
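
To put rough numbers on the page-granularity argument above, here is a minimal sketch in Python. It is not Dovecot's actual cache format: it assumes the 280m cache file is 280 MiB, that it holds 70,000 equally sized records laid out back to back, and that the one field being read (standing in for INTERNALDATE) is 8 bytes at the start of each record. The function names are made up for illustration.

```python
import math

PAGE_SIZE = 4096  # 4 kB pages, as mentioned in the thread


def pages_touched_scattered(file_size, n_records, field_size, page_size=PAGE_SIZE):
    """Count distinct pages hit when one small field is read from every record.

    Assumes equally sized records stored back to back, with the field at the
    start of each record. This is a simplification, not Dovecot's layout.
    """
    record_size = file_size // n_records
    touched = set()
    for i in range(n_records):
        start = i * record_size
        end = start + field_size - 1
        # A field may straddle a page boundary, so add every page it overlaps.
        touched.update(range(start // page_size, end // page_size + 1))
    return len(touched)


def pages_touched_packed(n_records, field_size, page_size=PAGE_SIZE):
    """Pages needed if the same fields were stored contiguously instead."""
    return math.ceil(n_records * field_size / page_size)


file_size = 280 * 1024 * 1024  # assumed: "280m" read as 280 MiB
n_records = 70_000             # folder size from the thread
field_size = 8                 # assumed size of the single cached field

total_pages = math.ceil(file_size / PAGE_SIZE)
scattered = pages_touched_scattered(file_size, n_records, field_size)
packed = pages_touched_packed(n_records, field_size)

print(f"pages in file:            {total_pages}")
print(f"pages touched, scattered: {scattered} ({scattered / total_pages:.1%})")
print(f"pages touched, packed:    {packed}")
```

Under these assumptions each record is slightly larger than one page, so reading even a few bytes per message faults in nearly every page of the file, while the same fields packed together would fit in on the order of a hundred pages. That is the trade-off described above: grouping fields helps one access pattern but may hurt another.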