As someone who is about to begin the process of moving from maildir to mdbox on NFS (and therefore just about to start the 'director-ization' of everything) for ~6.5m mailboxes, I'm curious whether anyone can share any experiences with it. The list is surprisingly quiet on this subject, and articles on google are mainly just about setting director up. I've yet to stumble across an article about someone's experiences with it.

* How big of a director cluster do you use? I'm going to have millions of mailboxes behind 10 directors. I'm guessing that's plenty. It's actually split over two datacenters. In the larger one, we've got about 200k connections currently, so in a perfectly-balanced world, each director would have 20k connections on it. I'm guessing that's child's play. Any good rule of thumb for the ratio of 'backend servers::director servers'? In my larger DC, it's about 5::1.

* Do you use the perl poolmon script or something else? The perl script was being weird for me, so I rewrote it in python, but it basically does the exact same things.

* Seen any issues with director? In testing, I managed to wedge things by having my poolmon script running on all the cluster boxes (I think). I've since rewritten it to run *only* on the lowest-numbered director. When it wedged, I had piles (read: hundreds per second) of log entries that said:

Feb 12 06:25:03 director: Warning: director(10.1.20.5:9090/right): Host 10.1.17.3 is being updated before previous update had finished (down -> up) - setting to state=up vhosts=0
Feb 12 06:25:03 director: Warning: director(10.1.20.5:9090/right): Host 10.1.17.3 is being updated before previous update had finished (up -> down) - setting to state=down vhosts=0
Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (down -> up) - setting to state=up vhosts=0
Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (up -> down) - setting to state=down vhosts=0

Because it was in testing, I didn't notice it, and it stayed like that for several days until dovecot was restarted on all the director nodes. I'm not 100% sure what happened, but my *guess* is that two boxes tried to update the status of the same backend server in rapid succession.

* Assuming you're using NFS, do you still see non-trivial amounts of indexes getting corrupted?

* Again, assuming NFS and assuming at least some corrupted indexes, what's your guess at the success rate (%) of dovecot recovering them automatically? And the success rate (%) for the ones dovecot couldn't fix automatically, where you had to repair them with doveadm? Really what I'm trying to figure out is 1) how often sysops will need to manually recover indexes, and 2) how often admins *can't* manually recover indexes.

* If you have unrecoverable indexes (and assuming you have snapshots on your NFS server), does grabbing the most recent indexes from the snapshots always work for recovery (obviously, only up to the point the snapshot was taken)?

* Any gotchas you've seen anywhere in a director-fied stack? I realize that's a broad question :)

* Does one of your director nodes going down cause any issues? E.g. issues with the left and right nodes syncing with each other? Or when the director node comes back up?

* Does a backend node going down cause a storm of reconnects? In the time between deploying director and getting mailboxes converted to mdbox, reconnects for us will mean cold local-disk dovecot caches. But hopefully consistent hashing helps with that?

* Do you have consistent hashing turned on? I can't think of any reason not to have it turned on, but who knows.

* Any other configuration knobs (including sysctl) that you needed to futz with, vs the defaults? (See the config sketch right after this message for the ones I mean.)

I appreciate any feedback!
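For readers following along, the knobs in question look roughly like this on a Dovecot 2.2-era director node. This is a minimal sketch assuming the standard director wiring from the example configs, not anyone's actual setup: the addresses, ranges, and port are placeholders, and only the setting names (director_servers, director_mail_servers, director_consistent_hashing, director_user_expire) and the service blocks follow the stock configuration.

    # conf.d/10-director.conf (illustrative values only)
    director_servers = 10.1.20.1 10.1.20.2 10.1.20.3 10.1.20.4 10.1.20.5
    director_mail_servers = 10.1.17.1-10.1.17.50
    director_consistent_hashing = yes    # the setting discussed below
    director_user_expire = 15 min        # default; how long an idle user->backend mapping is kept

    service director {
      unix_listener login/director {
        mode = 0666
      }
      fifo_listener login/proxy-notify {
        mode = 0666
      }
      inet_listener {
        port = 9090                      # the ring port seen in the log lines above
      }
    }

    # Route logins through the director:
    service imap-login {
      executable = imap-login director
    }
    service pop3-login {
      executable = pop3-login director
    }

In the usual setup the director nodes also need a passdb that returns proxy=y (for example a static passdb), since the director only chooses the backend; the proxying itself is still driven by the passdb.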
On 24 Feb 2017, at 0.08, Mark Moseley <moseleymark at gmail.com> wrote:
> As someone who is about to begin the process of moving from maildir to
> mdbox on NFS (and therefore just about to start the 'director-ization' of
> everything) for ~6.5m mailboxes, I'm curious if anyone can share any
> experiences with it. The list is surprisingly quiet about this subject, and
> articles on google are mainly just about setting director up. I've yet to
> stumble across an article about someone's experiences with it.
>
> * How big of a director cluster do you use? I'm going to have millions of
> mailboxes behind 10 directors.

I wouldn't use more than 10.

> I'm guessing that's plenty. It's actually split over two datacenters.

Two datacenters in the same director ring? This is dangerous. If there's a network connectivity problem between them, they split into two separate rings and start redirecting users to different backends.

> * Do you have consistent hashing turned on? I can't think of any reason not
> to have it turned on, but who knows

Definitely turn it on. The setting only exists for backwards compatibility and will be removed at some point.
On Thu, Feb 23, 2017 at 3:15 PM, Timo Sirainen <tss at iki.fi> wrote:
>> On 24 Feb 2017, at 0.08, Mark Moseley <moseleymark at gmail.com> wrote:
>>
>> As someone who is about to begin the process of moving from maildir to
>> mdbox on NFS (and therefore just about to start the 'director-ization' of
>> everything) for ~6.5m mailboxes, I'm curious if anyone can share any
>> experiences with it. The list is surprisingly quiet about this subject, and
>> articles on google are mainly just about setting director up. I've yet to
>> stumble across an article about someone's experiences with it.
>>
>> * How big of a director cluster do you use? I'm going to have millions of
>> mailboxes behind 10 directors.
>
> I wouldn't use more than 10.

Cool

>> I'm guessing that's plenty. It's actually split over two datacenters.
>
> Two datacenters in the same director ring? This is dangerous. If there's a
> network connectivity problem between them, they split into two separate
> rings and start redirecting users to different backends.

I was unclear. The two director rings are unrelated and won't ever need to talk to each other. I only mentioned the two rings to point out that all 6.5m mailboxes aren't behind one ring, but rather split between two.

>> * Do you have consistent hashing turned on? I can't think of any reason not
>> to have it turned on, but who knows
>
> Definitely turn it on. The setting only exists for backwards compatibility
> and will be removed at some point.

Out of curiosity (and possibly extremely naive): unless you've moved a mailbox via 'doveadm director', if someone is pointed to a box via consistent hashing, why would the directors need to share that mailbox mapping? Assuming the mailbox isn't moved (and assuming it always hashes to the same value in the consistent hash), isn't the hashing all that's needed to get to the right backend? I.e. "I know what the mailbox hashes to, and I know what backend that hash points at, so I'm done", in which case there's no need to communicate with the other directors. I could see that if you moved someone, it *would* need to communicate that mapping. Then the only maps traded by directors would be the consistent hash boundaries *plus* any "moved" mailboxes. Again, just curious.
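To make the mapping question concrete, here is a toy sketch of the ring-hashing idea: with the same hash function and the same host list, every director computes the same user-to-backend mapping on its own, so in this simplified model only explicit overrides ("moved" users) and host up/down state would need to be exchanged. This is not Dovecot's actual implementation; the hash choice, virtual-node count, host list, and the moved dict are all made up for illustration.

    import hashlib
    from bisect import bisect

    # Illustrative backend list and virtual-node count (not Dovecot internals).
    BACKENDS = ["10.1.17.1", "10.1.17.2", "10.1.17.3"]
    VNODES = 100

    def _hash(key: str) -> int:
        # Any stable hash shared by all directors works for this illustration.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    # Every director that builds the ring from the same inputs gets the same
    # ring, with no communication needed.
    ring = sorted((_hash("%s-%d" % (host, i)), host)
                  for host in BACKENDS for i in range(VNODES))
    points = [p for p, _ in ring]

    # Overrides created by explicit moves; in this model, only these (plus
    # host state changes) would have to be shared between directors.
    moved = {}  # e.g. {"bob@example.com": "10.1.17.2"}

    def backend_for(user: str) -> str:
        if user in moved:
            return moved[user]
        idx = bisect(points, _hash(user)) % len(ring)
        return ring[idx][1]

    print(backend_for("alice@example.com"))  # same answer on every director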
> On Feb 24, 2017, at 6:08 AM, Mark Moseley <moseleymark at gmail.com> wrote:
>
> * Do you use the perl poolmon script or something else? The perl script was
> being weird for me, so I rewrote it in python but it basically does the
> exact same things.

Would you mind sharing it? :)

----
Zhang Huangbin, founder of iRedMail project: http://www.iredmail.org/
Time zone: GMT+8 (China/Beijing).
Available on Telegram: https://t.me/iredmail
On Thu, Feb 23, 2017 at 3:45 PM, Zhang Huangbin <zhb at iredmail.org> wrote:
>> On Feb 24, 2017, at 6:08 AM, Mark Moseley <moseleymark at gmail.com> wrote:
>>
>> * Do you use the perl poolmon script or something else? The perl script was
>> being weird for me, so I rewrote it in python but it basically does the
>> exact same things.
>
> Would you mind sharing it? :)

Attached. No claims are made on the quality of my code :)

[Attachment: poolmon (application/octet-stream, 8595 bytes) -
http://dovecot.org/pipermail/dovecot/attachments/20170224/cc9d2f4d/attachment-0001.obj]
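Since the archive strips the attachment body, here is a minimal sketch of what a poolmon-style checker does, for anyone who can't grab the file above: probe each backend's IMAP port and tell the local director to put the host in or out of rotation. This is not the attached script; the host list, port, timeout, and the choice to shell out to 'doveadm director add/remove' are illustrative assumptions.

    #!/usr/bin/env python3
    import socket
    import subprocess

    # Illustrative values; a real checker would read these from config.
    BACKENDS = ["10.1.17.1", "10.1.17.2", "10.1.17.3"]
    IMAP_PORT = 143
    TIMEOUT = 5  # seconds

    def backend_is_healthy(host):
        """Healthy = the backend's IMAP port answers with an OK greeting."""
        try:
            with socket.create_connection((host, IMAP_PORT), timeout=TIMEOUT) as s:
                s.settimeout(TIMEOUT)
                return s.recv(256).startswith(b"* OK")
        except OSError:
            return False

    def main():
        for host in BACKENDS:
            if backend_is_healthy(host):
                # Adding a host that is already in the ring just keeps/puts it in rotation.
                subprocess.call(["doveadm", "director", "add", host])
            else:
                # Take the host out of rotation so new sessions stop landing on it.
                subprocess.call(["doveadm", "director", "remove", host])

    if __name__ == "__main__":
        main()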
On 23/02/2017 23:08, Mark Moseley wrote:
> As someone who is about to begin the process of moving from maildir to
> mdbox on NFS (and therefore just about to start the 'director-ization' of
> everything) for ~6.5m mailboxes, I'm curious if anyone can share any
> experiences with it. The list is surprisingly quiet about this subject, and
> articles on google are mainly just about setting director up. I've yet to
> stumble across an article about someone's experiences with it.

Hi,

in the past I did some consulting for ISPs with 4-5 million mailboxes; they had "only" 6 directors and about 30 or more Dovecot backends.

About NFS: I had some trouble with Maildir, director and NFSv4. I don't know whether the problem was the client (Debian 6) or the storage (NetApp ONTAP 8.1), but with NFSv3 it worked fine. Now we should try again with CentOS 6/7 and NFSv4.1.

--
Alessio Cecchi
Postmaster @ http://www.qboxmail.it
https://www.linkedin.com/in/alessice
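For anyone wondering what "staying on NFSv3" looks like in practice, a minimal fstab sketch follows. The server, export path, and most options here are assumptions for illustration, not Alessio's setup; the only point being made is pinning the protocol with vers=3. The Dovecot NFS documentation discusses further mount and caching options worth reading before copying anything.

    # /etc/fstab (illustrative; adjust server, path and options for your environment)
    nfs-server:/vol/mail  /var/vmail  nfs  vers=3,hard,noatime,nordirplus  0  0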