Peer Heinlein
2012-Oct-06 21:32 UTC
[Dovecot] Large subjects increase memory-usage and enlarge index-files
Several times already we have had the problem that accounts with more than 1.3 or 1.7 million e-mails in one folder run out of memory, even with a vsize_limit of 750 MB set.

In these cases the lmtpd process was unable to allocate more memory to read/write/update the index files and crashed (and the index files ended up corrupted).

[Please don't discuss the need for INBOXes with 1.7 million (unread) e-mails, at least not with ME. Personally, I agree that there's NO need for that...]

But: we also noticed accounts with ~300,000 e-mails running out of memory in the same situations. This happens if the subject (or some other header field) is very large.

And: we've been able to reproduce out-of-memory problems with just 13,000 e-mails with VERY long subjects (e.g. network monitoring status information), even with a vsize_limit of 750 MB (which is already a lot).

13,000 e-mails isn't very many, and it's easy to inject several thousand prepared e-mails.

Having many mails for accounts with huge (and broken) index files slows down the delivery rate VERY much and greatly increases the need for memory, CPU resources and I/O.

So: this could be used for a very easy denial-of-service attack against Dovecot-based mail servers.

I don't have a clear solution for that; Dovecot needs the subject information in its index files. But it looks like it isn't a good idea to put the whole subject into the index. Maybe it's better/necessary to use just the first 50-70 characters for that and to keep the rest out of the index?

I think I would prefer that even if it means that accessing those folders with "special" e-mails becomes slower because Dovecot has to get that information directly from the e-mail.

That performance issue is just a problem for the user. But crashing lmtpd processes and a lower delivery rate are a *real* problem for the whole IMAP cluster.

Peer

--
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-42
Fax: 030 / 405051-19
Mandatory disclosures per §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing director: Peer Heinlein -- Registered office: Berlin
Steve Litt
2012-Oct-06 23:44 UTC
[Dovecot] Large subjects increase memory-usage and enlarge index-files
On Sat, 06 Oct 2012 23:32:56 +0200, Peer Heinlein said:

> Several times already we have had the problem that accounts with more
> than 1.3 or 1.7 million e-mails in one folder run out of memory, even
> with a vsize_limit of 750 MB set.
>
> In these cases the lmtpd process was unable to allocate more memory to
> read/write/update the index files and crashed (and the index files
> ended up corrupted).
>
> [Please don't discuss the need for INBOXes with 1.7 million (unread)
> e-mails, at least not with ME. Personally, I agree that there's NO
> need for that...]
>
> But: we also noticed accounts with ~300,000 e-mails running out of
> memory in the same situations. This happens if the subject (or some
> other header field) is very large.
>
> And: we've been able to reproduce out-of-memory problems with just
> 13,000 e-mails with VERY long subjects (e.g. network monitoring
> status information), even with a vsize_limit of 750 MB (which is
> already a lot).
>
> 13,000 e-mails isn't very many, and it's easy to inject several
> thousand prepared e-mails.
>
> Having many mails for accounts with huge (and broken) index files
> slows down the delivery rate VERY much and greatly increases the need
> for memory, CPU resources and I/O.
>
> So: this could be used for a very easy denial-of-service attack
> against Dovecot-based mail servers.
>
> I don't have a clear solution for that; Dovecot needs the subject
> information in its index files. But it looks like it isn't a good
> idea to put the whole subject into the index. Maybe it's
> better/necessary to use just the first 50-70 characters for that and
> to keep the rest out of the index?
>
> I think I would prefer that even if it means that accessing those
> folders with "special" e-mails becomes slower because Dovecot has to
> get that information directly from the e-mail.
>
> That performance issue is just a problem for the user. But crashing
> lmtpd processes and a lower delivery rate are a *real* problem for
> the whole IMAP cluster.
>
> Peer

While the real solution is being decided, can I avoid this possible DoS attack by using procmail to send anything with a subject longer than 256 bytes to /dev/null before it ever gets to Dovecot IMAP?

Thanks

SteveT

Steve Litt * http://www.troubleshooters.com/ * http://twitter.com/stevelitt
Troubleshooting Training * Human Performance
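The check Steve asks about can be sketched outside procmail as well. The following is only an illustration in Python of the "Subject longer than 256 bytes" test, not a procmail recipe and not anything Dovecot ships; the function name `subject_too_long` and the constant are hypothetical, and the 256-byte threshold is simply the figure from the question.

```python
from email import message_from_bytes

# Threshold taken from the question above; not a Dovecot or procmail default.
MAX_SUBJECT_BYTES = 256


def subject_too_long(raw_message: bytes, limit: int = MAX_SUBJECT_BYTES) -> bool:
    """Return True if any Subject header in the message exceeds `limit` bytes.

    Hypothetical helper: a pre-delivery filter could call this and drop or
    quarantine the message before it reaches the mail store.
    """
    msg = message_from_bytes(raw_message)
    for value in msg.get_all("Subject", []):
        # Join folded continuation lines so the whole header value is measured.
        unfolded = "".join(str(value).splitlines())
        if len(unfolded.encode("utf-8", "surrogateescape")) > limit:
            return True
    return False
```

A filter like this only covers the Subject header; other long headers would pass through unchanged.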
Timo Sirainen
2012-Oct-08 00:11 UTC
[Dovecot] Large subjects increase memory-usage and enlarge index-files
On 7.10.2012, at 0.32, Peer Heinlein wrote:

> Several times already we have had the problem that accounts with more
> than 1.3 or 1.7 million e-mails in one folder run out of memory, even
> with a vsize_limit of 750 MB set.
>
> In these cases the lmtpd process was unable to allocate more memory to
> read/write/update the index files and crashed (and the index files
> ended up corrupted).

I don't think the dovecot.index file is much of a problem. With 1M mails it usually takes only something like 8-32 MB of memory, depending on which mailbox format is used. The dovecot.index.log file doesn't depend on the mailbox size at all. The main problem is the dovecot.index.cache file.

I've thought about the cache file problems before, but it's a bit difficult to figure out the best solution. And since nobody had actually complained about it, I hadn't really done anything about it. I also hadn't previously thought of LMTP/LDA processes crashing because of it; that's a bigger problem than an IMAP process crashing. Although I think you're getting a lot more "mmap(dovecot.index.cache) failed: Out of memory" errors than crashes for large mailboxes?

So, subproblems related to this:

1. Filling up dovecot.index.cache too easily. A rather simple approach that would catch all the possible ways would be to limit the maximum size of a single message's cache entry to X kilobytes (64?). If an entry becomes larger, it's simply not written to the cache file.

2. Filling up memory too easily. If a long header is to be cached or used for other purposes (e.g. Message-ID), it's still read fully into memory. Add some reasonable limit on the maximum length of a single header. It can't be too small, because some headers are legitimately pretty long (DKIM and such). Maybe something like 10kB would be safe enough for everyone?

3. If the existing dovecot.index.cache is larger than X MB, shrink it below X first. Shrinking could begin by trying to do it the nice way, removing only unneeded data, but if that fails it could just forcibly remove some old messages. X would have to be related to the process's VSZ limit.

4. Dovecot currently doesn't close index files immediately when a mailbox is closed, because it assumes that IMAP clients might reopen the index soon anyway. A maximum of 3 indexes can be kept open, so 3 different very large indexes can already be too much. I'm not sure if this is actually useful at all. Maybe I should disable it for LMTP, or maybe just remove it completely.

Item 3 is the change I like the least. An alternative solution would be to not map the entire cache file into memory all at once. The code was actually originally designed to do just that, but munmap()ing + mmap()ing again wasn't very efficient. For LMTP, though, there's really no need to map the whole file. All it really wants is to read a couple of header records and then append to the file. Maybe it could use an alternative code path that would simply do that instead of mmap()ing anything. It wouldn't solve the problem for IMAP, though.

> I don't have a clear solution for that; Dovecot needs the subject
> information in its index files. But it looks like it isn't a good
> idea to put the whole subject into the index. Maybe it's
> better/necessary to use just the first 50-70 characters for that and
> to keep the rest out of the index?

50-70 is way too little. The cached subject gets sent to the IMAP client. I think 200 bytes would be the minimum, and 1000 is something I could probably even hardcode. But anyway, the subject isn't the only way to trigger this, and 1000 bytes is too low for some headers.
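The limits proposed in subproblems 1 and 2 above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Dovecot's code or its cache file format: `build_cache_entry` and the "name: value" entry layout are hypothetical, the 64 kB and 10 kB figures are the tentative values from the message, and skipping an overlong header (rather than truncating it) is an assumption, since the message leaves that open.

```python
from typing import Iterable, Optional, Tuple

# Tentative figures from the message above; neither is an actual Dovecot setting.
MAX_HEADER_BYTES = 10 * 1024        # subproblem 2: cap on a single header
MAX_CACHE_ENTRY_BYTES = 64 * 1024   # subproblem 1: cap on one message's cache entry


def build_cache_entry(headers: Iterable[Tuple[str, bytes]]) -> Optional[bytes]:
    """Build a cache entry for one message, or return None if nothing is cached.

    Hypothetical layout: one "name: value" line per cached header.
    """
    kept = []
    for name, value in headers:
        if len(value) > MAX_HEADER_BYTES:
            # Assumption: an overlong header is skipped instead of truncated.
            continue
        kept.append(name.encode("ascii") + b": " + value + b"\n")
    entry = b"".join(kept)
    if not entry or len(entry) > MAX_CACHE_ENTRY_BYTES:
        # Too large (or empty): simply do not write it to the cache file.
        return None
    return entry
```

With limits like these, a message whose headers are abusive still gets delivered; it just costs a slower, uncached lookup later instead of growing the cache file.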