On Thu, Feb 23, 2017 at 3:15 PM, Timo Sirainen <tss at iki.fi> wrote:

> On 24 Feb 2017, at 0.08, Mark Moseley <moseleymark at gmail.com> wrote:
> >
> > As someone who is about to begin the process of moving from maildir to
> > mdbox on NFS (and therefore just about to start the 'director-ization'
> > of everything) for ~6.5m mailboxes, I'm curious if anyone can share any
> > experiences with it. The list is surprisingly quiet about this subject,
> > and articles on google are mainly just about setting director up. I've
> > yet to stumble across an article about someone's experiences with it.
> >
> > * How big of a director cluster do you use? I'm going to have millions
> > of mailboxes behind 10 directors.
>
> I wouldn't use more than 10.

Cool

> > I'm guessing that's plenty. It's actually split over two datacenters.
>
> Two datacenters in the same director ring? This is dangerous. If there's
> a network connectivity problem between them, they split into two separate
> rings and start redirecting users to different backends.

I was unclear. The two director rings are unrelated and won't ever need to
talk to each other. I only mentioned the two rings to point out that all
6.5m mailboxes weren't behind one ring, but rather split between two.

> > * Do you have consistent hashing turned on? I can't think of any reason
> > not to have it turned on, but who knows
>
> Definitely turn it on. The setting only exists because of backwards
> compatibility and will be removed at some point.

Out of curiosity (and possibly extremely naive), unless you've moved a
mailbox via 'doveadm director', if someone is pointed to a box via
consistent hashing, why would the directors need to share that mailbox
mapping? Again, assuming they're not moved (I'm also assuming that the
mailbox would always, by default, hash to the same value in the consistent
hash), isn't their hashing all that's needed to get to the right backend?
I.e. "I know what the mailbox hashes to, and I know what backend that hash
points at, so I'm done", in which case, no need to communicate to the other
directors. I could see that if you moved someone, it *would* need to
communicate that mapping. Then the only maps traded by directors would be
the consistent hash boundaries *plus* any "moved" mailboxes. Again, just
curious.

> On Thu, Feb 23, 2017 at 3:15 PM, Timo Sirainen <tss at iki.fi> wrote:
> [...]

Timo,

Incidentally, on that error I posted:

Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (up -> down) - setting to state=down vhosts=0
Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (down -> up) - setting to state=up vhosts=0

any idea what would cause that? Is my guess that multiple directors tried
to update the status simultaneously correct?
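
The consistent-hashing question above (whether hashing alone is enough to
find the backend unless a mailbox was explicitly moved) can be illustrated
with a toy hash ring. This is only a sketch of the general technique, not
Dovecot's director implementation: the class name HashRing, the backend
names, the use of MD5, and the vnodes parameter are all assumptions made
for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 32-bit integer (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Toy consistent-hash ring with explicit per-user overrides.

    Hypothetical illustration only; not how Dovecot directors are written.
    """

    def __init__(self, backends, vnodes=100):
        # Each backend gets `vnodes` points on the ring to even out the load.
        self._points = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]
        self._moved = {}  # user -> backend, filled only by an explicit move

    def move(self, user, backend):
        """Record an explicit move; this is per-user state, not derivable."""
        self._moved[user] = backend

    def lookup(self, user):
        if user in self._moved:
            return self._moved[user]
        # First ring point at or after the user's hash, wrapping around.
        idx = bisect.bisect_left(self._keys, _hash(user)) % len(self._keys)
        return self._points[idx][1]


# Two independent "directors" with the same backend list (hypothetical names).
backends = ["backend-1", "backend-2", "backend-3"]
director_a = HashRing(backends)
director_b = HashRing(backends)

user = "user@example.com"
print(director_a.lookup(user) == director_b.lookup(user))
```

In this toy model two rings built from the same backend list agree on every
user without exchanging anything, and only an entry added via move() is
state that the other ring cannot derive by itself. Whether the real
directors need to share more than that is exactly the open question in the
message above.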

In our experience, a ring with more than 4 servers is bad; we have sync
problems everywhere. Using 4 or fewer works perfectly.

On 24 Feb 2017 4:30 PM, "Mark Moseley" <moseleymark at gmail.com> wrote:
> [...]

On 24 Feb 2017, at 21.29, Mark Moseley <moseleymark at gmail.com> wrote:
>
> Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (up -> down) - setting to state=down vhosts=0
> Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): Host 10.1.17.3 is being updated before previous update had finished (down -> up) - setting to state=up vhosts=0
>
> any idea what would cause that? Is my guess that multiple directors tried
> to update the status simultaneously correct?

Most likely, yes. I'm not sure if it might happen also if the same server
issues conflicting commands rapidly.
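
To see how often such overlapping updates occur, the warnings can be
grouped by timestamp and host. The following is a rough sketch that assumes
the log lines look exactly like the two quoted above; the regular
expression and the function name conflicting_updates are made up for this
example, and other Dovecot versions may word the warning differently.

```python
import re
from collections import defaultdict

# Pattern derived only from the two warning lines quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) director: Warning: "
    r"director\((?P<peer>[^)]+)\): Host (?P<host>\S+) is being updated "
    r"before previous update had finished \((?P<old>\w+) -> (?P<new>\w+)\)"
)


def conflicting_updates(lines):
    """Return {(timestamp, host): [(old, new), ...]} for hosts that had
    more than one overlapping state change within the same second."""
    seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            seen[(m["ts"], m["host"])].append((m["old"], m["new"]))
    return {key: moves for key, moves in seen.items() if len(moves) > 1}


sample = [
    "Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): "
    "Host 10.1.17.3 is being updated before previous update had finished "
    "(up -> down) - setting to state=down vhosts=0",
    "Feb 12 06:25:03 director: Warning: director(10.1.20.3:9090/left): "
    "Host 10.1.17.3 is being updated before previous update had finished "
    "(down -> up) - setting to state=up vhosts=0",
]

for (ts, host), moves in conflicting_updates(sample).items():
    print(ts, host, moves)
```

This only shows that opposite transitions for one host arrived in the same
second. It cannot tell whether they came from several directors or from one
server issuing conflicting commands, which are the two possibilities raised
above.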