Hi all,

Is anyone having problems with restarting the director? Every time I bring down one of the director servers, reboot it, or just restart it for whatever reason, I'm seeing all kinds of problems. Dovecot generally gives me this error:

Jan 20 22:49:55 imapdirector3 dovecot: director: Error: Director 194.109.26.173:444/right disconnected before handshake finished

It seems the directors can't agree on forming a ring anymore, and this may be leading to problems with clients. I mostly have to resort to bringing down all directors and restarting them all at once. Not really a workable solution.

As an example, last night for a few hours we were getting complaints from customers about being disconnected, and the only obvious error in the log was the one above, after one of my colleagues had to restart a director because of some changes in the syslog daemon. After I restarted all directors within a few seconds of each other, all complaints disappeared.

Timo, I know I've asked similar questions before, but the answer just eludes me. If I have 3 director servers and need to take one down and restart it, what is the proper method to reconnect the ring? In practice, I can't seem to work it out, and I mostly end up with the above error until I just restart them all. Not fun with 20.000 clients connected.

Cor

On Fri, 2011-01-21 at 13:42 -0400, Cor Bosman wrote:
> Hi all, anyone having any problems with restarting the director? Every
> time I bring down one of the director servers, reboot it, or just
> restart it for whatever reason, I'm seeing all kinds of problems.
> Dovecot generally gives me this error:
>
> Jan 20 22:49:55 imapdirector3 dovecot: director: Error: Director
> 194.109.26.173:444/right disconnected before handshake finished

I'm not sure that error by itself is a problem.

> It seems the directors can't agree on forming a ring anymore, and this
> may be leading to problems with clients. I mostly have to resort to
> bringing down all directors and restarting them all at once. Not
> really a workable solution. As an example, last night for a few hours
> we were getting complaints from customers about being disconnected,
> and the only obvious error in the log was the one above, after one of
> my colleagues had to restart a director because of some changes in the
> syslog daemon. After I restarted all directors within a few seconds
> of each other, all complaints disappeared.

I can take a look at it, but it would help if you were able to reproduce the problem. I'm still lagging far behind on emails (= bugfixes).