Allen Belletti
2008-Sep-24 19:03 UTC
[Dovecot] Dovecot performance on GFS clustered filesystem
Hello All,

We are using Dovecot 1.1.3 to serve IMAP on a pair of clustered Postfix servers which share a fiber array via the GFS clustered filesystem. This all works very well for the most part, except that certain operations are so inefficient on GFS that they generate significant I/O load and hurt performance. We are using the Maildir format on disk. We're also using Dovecot's deliver from Postfix to handle local delivery.

As best I can determine, the worst problems occur when certain users with very large Inboxes (~10k messages) receive new mail and their client looks up information about that message. GFS doesn't seem to efficiently handle the large directories that hold folders like this. As a result, lots of I/O ops are generated and performance suffers for everyone.

I am beginning to wonder if it might be more efficient to revert to the old mbox format, with one file per folder (plus whatever indices are created). It seems that this ought to work better with GFS, which is geared toward smaller numbers of larger files. Is anyone on the list currently doing that? Alternately, any thoughts regarding tuning or other options would be appreciated.

Thanks,
Allen

--
Allen Belletti
allen at isye.gatech.edu
404-894-6221 Phone
Industrial and Systems Engineering
404-385-2988 Fax
Georgia Institute of Technology
Diego Liziero
2008-Sep-24 19:18 UTC
[Dovecot] Dovecot performance on GFS clustered filesystem
I've read somewhere that one of the goals of gfs2 was to improve performance for directory access with many files.

I've tested it: doing a simple ls in a directory with many empty test files on gfs was _really_ slow, while doing the ls on gfs2 with the same number of empty files is actually faster.

But when I tested gfs2 with bonnie++ I got lower sequential I/O speed than on gfs (consider that I tested a beta version of gfs2 some months ago; maybe things are better now).

So my conclusion from the tests was that gfs is best with mbox, and gfs2 beta with maildir. But, again, I haven't tested gfs2 improvements recently.

Regards,
Diego.
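
For anyone who wants to repeat this kind of directory test, here is a rough sketch of one way to do it in Python. It is not the test that was actually run above: the function name `time_listing`, the file naming, and the choice to stat every entry (closer to `ls -l` than to a plain `ls`) are all assumptions made for illustration. Point it at a directory on the filesystem under test, and keep in mind that listing right after creating the files may be served largely from cache.

```python
import os
import tempfile
import time


def time_listing(directory, n_files):
    """Create n_files empty files in directory, then time one full listing.

    Hypothetical helper for illustration; returns (entry_count, seconds).
    """
    for i in range(n_files):
        # Empty files, similar in spirit to the test described above.
        open(os.path.join(directory, f"msg{i:06d}"), "w").close()

    start = time.perf_counter()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Assumption: also stat each entry, since a mail server needs
            # per-file metadata and not only the names.
            entry.stat()
            count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


if __name__ == "__main__":
    # tempfile defaults to the system temp directory; pass dir=... pointing
    # at the clustered mount to measure that filesystem instead.
    with tempfile.TemporaryDirectory() as d:
        count, elapsed = time_listing(d, 10_000)
        print(f"listed {count} entries in {elapsed:.3f}s")
```

Running the same script against a gfs mount and a gfs2 mount with the same file count would give a like-for-like comparison of directory listing only; it says nothing about sequential I/O, which is what bonnie++ measures.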
Timo Sirainen
2008-Sep-24 19:19 UTC
[Dovecot] Dovecot performance on GFS clustered filesystem
On Sep 24, 2008, at 10:03 PM, Allen Belletti wrote:

> As best I can determine, the worst problems occur when certain users
> with very large Inboxes (~10k messages) receive new mail and their
> client looks up information about that message. GFS doesn't seem to
> efficiently handle the large directories that contain folders like
> this. As a result, lots of I/O ops are generated and performance
> suffers for everyone.
>
> I am beginning to wonder if it might be more efficient to revert to the
> old mbox format, with one file per folder (plus whatever indices are
> created.) It seems that this ought to work better with GFS which is
> geared toward smaller numbers of larger files. Is anyone on the list
> currently doing that? Alternately, any thoughts regarding tuning or
> other options would be appreciated.

One possibility would be to use dbox format with hashed directories, so that for each mailbox it could create n directories in which to store the messages. Two problems here though:

1. dbox code hasn't been tested all that much yet in the real world (but it works well in my stress tests)
2. dbox doesn't yet support directory hashing, but it would be pretty easy to implement.
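
To make the hashed-directory idea concrete, here is a minimal Python sketch of spreading a mailbox's message files over n subdirectories. This is only an illustration of the general technique, not dbox's on-disk layout (as noted above, dbox does not support directory hashing yet). The function names, the use of CRC32 as the hash, the two-digit hex bucket names, and the example path and file name are all assumptions.

```python
import os
import zlib


def hashed_subdir(message_name, n_dirs=16):
    """Map a message file name to one of n_dirs bucket directory names.

    Hypothetical scheme: CRC32 of the name modulo n_dirs, written as
    two hex digits (so n_dirs is assumed to be at most 256).
    """
    bucket = zlib.crc32(message_name.encode("utf-8")) % n_dirs
    return f"{bucket:02x}"


def message_path(mailbox_root, message_name, n_dirs=16):
    """Return the path of a message inside its hashed subdirectory."""
    return os.path.join(
        mailbox_root, hashed_subdir(message_name, n_dirs), message_name
    )


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(message_path("/mail/user/INBOX", "1222282980.M1P2.host"))
```

With n = 16, a 10k-message Inbox would end up with roughly 600 files per directory instead of 10,000 in one, which is the effect the suggestion is after. The mapping depends only on the file name, so any node in the cluster can compute where a message lives without consulting shared state.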
Allen Belletti
2009-Feb-05 23:38 UTC
[Dovecot] Dovecot performance on GFS clustered filesystem
Hi All,

I wanted to follow up on my own message from September now that I've got more information. As of RHEL 5.3, GFS2 was finally advertised as "production ready" and the servers discussed below have been upgraded from GFS to GFS2.

The difference is night and day. Essentially GFS2 has completely eliminated the long periods of heavy I/O load that were seen before. In addition, the user experience is markedly better.

For anyone who is considering something like this, feel free to contact me as I'll be glad to pass along whatever wisdom I've accumulated.

Thanks,
Allen

Allen Belletti wrote:

> Hello All,
>
> We are using Dovecot 1.1.3 to serve IMAP on a pair of clustered Postfix
> servers which share a fiber array via the GFS clustered filesystem.
> This all works very well for the most part, with the exception that
> certain operations are so inefficient on GFS that they generate
> significant I/O load and hurt performance. We are using the Maildir
> format on disk. We're also using Dovecot's deliver from Postfix to
> handle local delivery.
>
> As best I can determine, the worst problems occur when certain users
> with very large Inboxes (~10k messages) receive new mail and their
> client looks up information about that message. GFS doesn't seem to
> efficiently handle the large directories that contain folders like
> this. As a result, lots of I/O ops are generated and performance
> suffers for everyone.
>
> I am beginning to wonder if it might be more efficient to revert to the
> old mbox format, with one file per folder (plus whatever indices are
> created.) It seems that this ought to work better with GFS which is
> geared toward smaller numbers of larger files. Is anyone on the list
> currently doing that? Alternately, any thoughts regarding tuning or
> other options would be appreciated.
>
> Thanks,
> Allen

--
Allen Belletti
allen at isye.gatech.edu
404-894-6221 Phone
Industrial and Systems Engineering
404-385-2988 Fax
Georgia Institute of Technology