Cory Meyer
2009-Jan-22 04:28 UTC
[Gluster-users] Email storage backend for Sendmail, Maildir, and Courier-imap
I'm working on a project to use GlusterFS as the backend for email storage, replacing the current NFS implementation. The goal is to configure GlusterFS with AFR to replicate the files across all 3 storage nodes. Each storage node will also act as an email server behind a load balancer, running Sendmail, Maildrop, and Courier-IMAP.

The main issue so far seems to be related to Courier-IMAP: when moving messages between IMAP folders, some messages are duplicated when io-threads is enabled on the client side. The issue looks to be on the Courier-IMAP side, though I haven't seen this with NFS, and the duplicate messages within the Maildir have unique file names.

Has anyone had similar experiences with email services backed by GlusterFS?

Node test hardware x 3:
Quad-core Xeon 2 GHz with 4 x 7200 rpm SATA drives
RAID5 across all 4 drives (RAID0 will also be tested if additional speed is necessary)
8 GB RAM

Network: Currently 100 Mbit, though production will be 1 Gbit.

OS: Debian Etch (2.6.18-6-686-bigmem)
GlusterFS: 1.3.12
Fuse: 2.7.3glfs10
Courier-imap 4.1.1.20060828-5

Partition layout:
sda1 --> OS (ext3)
sda2 --> Swap
sda5 --> glusterfs_data (ext3)

The GlusterFS-patched Fuse kernel module, utilities, and libraries have been installed on each of my 3 storage nodes.

Here are the basics of my Gluster configuration.
Server: Brick --> TCP
Client: Bricks --> AFR --> io-thread --> write-behind --> io-cache --> read-ahead

Follow the Pastebin URL for my raw config file: http://glusterfs.pastebin.com/f7814657c

Any suggestions?

-- Cory
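Because the duplicated messages described above have unique file names, comparing names will not find them. A minimal sketch of one way to spot them is to group the message files by a hash of their content. It assumes a standard Maildir layout with `cur/` and `new/` subdirectories; the function name and the choice of SHA-256 are illustrative and not part of the original setup.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_messages(maildir):
    """Group message files under cur/ and new/ by the SHA-256 of their content.

    Returns a list of groups; each group is a list of paths whose bytes are
    identical even though the file names differ.
    """
    by_digest = defaultdict(list)
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_digest[digest].append(path)
    # Keep only content that appears under more than one file name.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Running this against the destination folder after a move (for example a hypothetical `/var/mail/user/Maildir/.Archive`) would show whether the extra files are byte-for-byte copies, which is what a repeated COPY would be expected to produce.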
Anand Avati
2009-Jan-22 06:06 UTC
[Gluster-users] Email storage backend for Sendmail, Maildir, and Courier-imap
> Main issue so far seems to be related with Courier-Imap in that when moving
> messages between IMAP folders some messages are duplicated with the
> io-threads enabled on the client side. Issue looks to be on the
> Courier-IMAP side though I haven't seen this with NFS and the duplicate
> messages within the Maildir have unique file names.

> GlusterFS: 1.3.12

Can you try upgrading to the 2.0 rc release and see if the issue still persists?

avati
Keith Freedman
2009-Jan-22 08:58 UTC
[Gluster-users] Email storage backend for Sendmail, Maildir, and Courier-imap
We ran into this problem. It seems related to timestamps being off by microseconds. When someone checked their email on one machine and then hit another whose time was off by even microseconds, it would think all the messages were suddenly new or different from the ones it had already checked.

My guess is that this problem doesn't exist in 2.0 because of the way it manages timestamps on files in HA. However, I can't tell you for sure, since we switched to Dovecot, which tracks messages by message ID instead of timestamps and is also much more efficient.

Hope that helps.
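One way to check whether timestamps really differ between replicas is to compare the modification times of the same files on two copies. The sketch below assumes that two copies of a Maildir tree are reachable as local directories (for instance the backend export directories of two nodes); the function name and that access method are assumptions, not something described in the thread. The granularity you can observe also depends on what the backend filesystem stores.

```python
import os


def mtime_mismatches(root_a, root_b):
    """Return (relative_path, mtime_ns_a, mtime_ns_b) for files whose
    modification times differ between two copies of the same tree."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root_a):
        for name in sorted(filenames):
            path_a = os.path.join(dirpath, name)
            rel = os.path.relpath(path_a, root_a)
            path_b = os.path.join(root_b, rel)
            if not os.path.isfile(path_b):
                continue  # present on one copy only; not reported here
            mtime_a = os.stat(path_a).st_mtime_ns
            mtime_b = os.stat(path_b).st_mtime_ns
            if mtime_a != mtime_b:
                mismatches.append((rel, mtime_a, mtime_b))
    return mismatches
```

An empty result would suggest the replicas agree on timestamps for the files present on both sides; any entries give the exact nanosecond values to compare.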
Cory Meyer
2009-Jan-27 13:16 UTC
[Gluster-users] Email storage backend for Sendmail, Maildir, and Courier-imap
I've upgraded GlusterFS from 1.3.12 to 2.0rc1 and ran into the same duplicated messages. According to the packet captures, the duplicate-message problem looks to be more client related: the operation takes too long and times out, and the client then reconnects and issues the same IMAP COPY command again. I haven't been able to recreate this with an NFS backend.

Also, is there anything specific that can be done to improve small-file performance (14 KB) on GlusterFS? So far my benchmarks show that read/write performance on a few large (100 MB) files is quite a bit better than on hundreds of small 14 KB files.

There was some talk of using GlusterFS for email storage last November. Is anyone else successfully using this in a production environment?

Thanks,
Cory
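The small-file versus large-file comparison can be reproduced with a short script. This is a minimal sketch, not the benchmark used in the thread: the function name, the file naming scheme, and the decision to `fsync` every file are assumptions made for illustration.

```python
import os
import time


def write_files(directory, count, size):
    """Write `count` files of `size` bytes each and return throughput in MB/s.

    Each file is flushed and fsync'ed so the timing includes pushing the data
    to storage rather than only into the local page cache.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"bench_{i:06d}.dat")
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
    # Guard against a zero interval on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (count * size) / (1024 * 1024) / elapsed
```

To compare like with like, write roughly the same number of bytes both ways into separate directories on the GlusterFS mount, for example `write_files(small_dir, 7314, 14 * 1024)` against `write_files(large_dir, 1, 100 * 1024 * 1024)`, where `small_dir` and `large_dir` are hypothetical paths. Running the same pair on a local disk and on the NFS mount gives a baseline for how much of the gap is per-file overhead.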
Keith Freedman
2009-Jan-27 19:52 UTC
[Gluster-users] Email storage backend for Sendmail, Maildir, and Courier-imap
At 05:16 AM 1/27/2009, Cory Meyer wrote:
>There was some talk of using GlusterFS for email storage last
>November. Is anyone else successfully using this in a production enviroment?

I use GlusterFS 2.0rc1 in a production web hosting environment, which includes email, IMAP, web, and FTP. We don't really overload our servers, so we haven't run into any performance issues. There are still some performance issues related to small files, but this is mostly a Fuse issue, I believe.

We're going to be testing mod_glusterfs in Apache shortly to see whether we get any noticeable web performance improvements. On the mail side things seem OK, although people using IMAP with >1500 messages in a folder sometimes experience long loading times; this is mostly due to the IMAP server being stupid. On the systems we've switched from Courier to Dovecot, this is less of an issue. For POP it all seems to work just fine.

I'd definitely avoid Courier, not because of Gluster but because it's terribly inefficient: as far as I can tell it doesn't cache mail headers, so it scans the filesystem every time someone establishes an IMAP connection. This is inefficient no matter what your filesystem.

For SMTP services we use Exim instead of Sendmail. Sendmail is terribly inefficient. It also depends on how you configure it. Presumably the cache/spool files would be on a local filesystem, and the delivered mail on a shared GlusterFS. If you want to spool into Gluster you might be OK, but you'll possibly have multiple Sendmail daemons fighting over the right to send a message, and you'll likely have problems with multiple machines attempting to send messages more frequently than the receiving servers want to hear from you.

Our email load also isn't too out of hand, and the only thing that ever causes a problem is the anti-spam software (which processes messages outside of Gluster, so that's not an issue).

Hope that helps.
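The difference between rescanning every message and caching headers can be illustrated with a small sketch. It is only a toy model of the idea: it is not how Dovecot's index files or Courier's folder scan are actually implemented, and the class name and the choice of the Subject header are assumptions. It keys the cache on the part of the Maildir file name before the colon, since Maildir stores message flags after `:2,` and the name changes when flags change.

```python
import os
from email.parser import BytesHeaderParser


class HeaderCache:
    """Remember parsed headers per message so repeat scans skip known files."""

    def __init__(self):
        self._headers = {}  # unique name part -> Subject header (or None)
        self.parsed = 0     # number of files actually opened and parsed

    def scan(self, folder):
        # Map the unique part of each file name to the current full name.
        current = {name.split(":", 1)[0]: name for name in os.listdir(folder)}
        # Forget messages that were deleted or moved out of the folder.
        for gone in set(self._headers) - set(current):
            del self._headers[gone]
        # Open and parse only messages that are not cached yet.
        for key in sorted(set(current) - set(self._headers)):
            with open(os.path.join(folder, current[key]), "rb") as fh:
                msg = BytesHeaderParser().parse(fh)
            self._headers[key] = msg["Subject"]
            self.parsed += 1
        return dict(self._headers)
```

The directory listing still happens on every scan, but each message file is opened only once. On a network filesystem where per-file opens are the expensive part, that is where the saving would be expected to come from.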