Paul Kudla
2023-Jan-05 01:45 UTC
Migrating, syncing, maybe load-balancing/failover two dovecot servers?
OK, just a few quick things about replication:

1. You should upgrade both servers to at least dovecot-2.3.19.1.tar.gz (2.3.18 had issues with large folder counts - you will probably run into this on smaller servers but just sharing the experience).

2. I found replication worked better without using SSL.

3. I went through the sync failures etc. as well and found that NOT using NFS etc. is the way to go.

4. I can provide my config for SCOM (or you can find it on the mailing lists) - it took a month of tweaking but I finally got a good config that worked.

5. One thing I just remembered: you really should run a pgsql database for user auth, this way the two systems will stay up to date automatically every time a mailbox is modified. The replicator service selects users from a database to keep the mailboxes in sync automatically.

The above are the basics, but I find dovecot runs extremely well vs. the cyrus setup I was running previously.

Good job to the designers!

Happy Wednesday !!!
Thanks - paul

Paul Kudla

Scom.ca Internet Services <http://www.scom.ca>
004-1009 Byron Street South
Whitby, Ontario - Canada
L1N 4S3

Toronto 416.642.7266
Main 1.866.411.7266
Fax 1.888.892.7266
Email paul at scom.ca

On 1/4/2023 4:24 PM, Gerben Wierda wrote:
> So, I did set it up.
>
> As I am not using real users (but a cram md5 passwd db file with every user uid=dovecot, gid=mail) and my dovecots own everything in the mail store, I had to synchronise the uid/gid of the dovecots on both ends.
>
> After I did that, I tested the sync. And while it has worked (I now have an equal sized store at both ends), one side (running 2.3.17, the sending 'old server') was throwing up quite a bit of this:
>
> Jan 04 20:13:15 doveadm(74435): Error: write(<local>) failed: Timed out after 60 seconds
> Jan 04 20:13:15 doveadm(74435): Panic: file ioloop.c: line 865 (io_loop_destroy): assertion failed: (ioloop == current_ioloop)
> Jan 04 20:13:15 doveadm(74435): Error: Raw backtrace: 0 libdovecot.0.dylib 0x000000010db6d157 backtrace_append + 58 -> 1 libdovecot.0.dylib 0x000000010db6d255 backtrace_get + 31 -> 2 libdovecot.0.dylib 0x000000010db79ff3 default_fatal_finish + 60 -> 3 libdovecot.0.dylib 0x000000010db78afa default_error_handler + 0 -> 4 libdovecot.0.dylib 0x000000010db7973b i_internal_error_handler + 0 -> 5 libdovecot.0.dylib 0x000000010db78c
> Jan 04 20:13:15 doveadm(74435): Error: b8 i_fatal + 0 -> 6 libdovecot.0.dylib 0x000000010db8fa1f io_loop_destroy + 826 -> 7 doveadm-server 0x000000010d3445fc doveadm_print_server_flush + 254 -> 8 doveadm-server 0x000000010d33df1e doveadm_print + 44 -> 9 doveadm-server 0x000000010d32bd5b cmd_dsync_run + 1618 -> 10 doveadm-server 0x000000010d32db67 doveadm_mail_next_user + 479 -> 11 doveadm-server 0x000000010
> Jan 04 20:13:15 doveadm(74435): Error: d32e8bb doveadm_cmd_ver2_to_mail_cmd_wrapper + 2439 -> 12 doveadm-server 0x000000010d33dc0c doveadm_cmd_run_ver2 + 1083 -> 13 doveadm-server 0x000000010d34224a client_connection_tcp_input + 1579 -> 14 libdovecot.0.dylib 0x000000010db8efe1 io_loop_call_io + 114 -> 15 libdovecot.0.dylib 0x000000010db910cf io_loop_handler_run_internal + 314 -> 16 libdovecot.0.dylib 0x000000010db8f3fb io_loop_handler_run +
> Jan 04 20:13:15 doveadm(74435): Error: 212 -> 17 libdovecot.0.dylib 0x000000010db8f2e6 io_loop_run + 81 -> 18 libdovecot.0.dylib 0x000000010db075e0 master_service_run + 24 -> 19 doveadm-server 0x000000010d344c3f main + 292 -> 20 dyld 0x000000011c73952e start + 462
> Jan 04 20:13:15 doveadm(74435): Fatal: master: service(doveadm): child 74435 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
> Jan 04 20:16:05 lmtp(pid 74518 user gerben): Warning: replication(gerben): Sync failure: Timeout in 2 secs
> Jan 04 20:17:05 doveadm(74522): Error: write(<local>) failed: Timed out after 60 seconds
> Jan 04 20:17:05 doveadm(74522): Panic: file ioloop.c: line 865 (io_loop_destroy): assertion failed: (ioloop == current_ioloop)
> Jan 04 20:17:05 doveadm(74522): Error: Raw backtrace: 0 libdovecot.0.dylib 0x00000001050d3157 backtrace_append + 58 -> 1 libdovecot.0.dylib 0x00000001050d3255 backtrace_get + 31 -> 2 libdovecot.0.dylib 0x00000001050dfff3 default_fatal_finish + 60 -> 3 libdovecot.0.dylib 0x00000001050deafa default_error_handler + 0 -> 4 libdovecot.0.dylib 0x00000001050df73b i_internal_error_handler + 0 -> 5 libdovecot.0.dylib 0x00000001050dec
> Jan 04 20:17:05 doveadm(74522): Error: b8 i_fatal + 0 -> 6 libdovecot.0.dylib 0x00000001050f5a1f io_loop_destroy + 826 -> 7 doveadm-server 0x00000001048aa5fc doveadm_print_server_flush + 254 -> 8 doveadm-server 0x00000001048a3f1e doveadm_print + 44 -> 9 doveadm-server 0x0000000104891d5b cmd_dsync_run + 1618 -> 10 doveadm-server 0x0000000104893b67 doveadm_mail_next_user + 479 -> 11 doveadm-server 0x000000010
> Jan 04 20:17:05 doveadm(74522): Error: 48948bb doveadm_cmd_ver2_to_mail_cmd_wrapper + 2439 -> 12 doveadm-server 0x00000001048a3c0c doveadm_cmd_run_ver2 + 1083 -> 13 doveadm-server 0x00000001048a824a client_connection_tcp_input + 1579 -> 14 libdovecot.0.dylib 0x00000001050f4fe1 io_loop_call_io + 114 -> 15 libdovecot.0.dylib 0x00000001050f70cf io_loop_handler_run_internal + 314 -> 16 libdovecot.0.dylib 0x00000001050f53fb io_loop_handler_run +
> Jan 04 20:17:05 doveadm(74522): Error: 212 -> 17 libdovecot.0.dylib 0x00000001050f52e6 io_loop_run + 81 -> 18 libdovecot.0.dylib 0x000000010506d5e0 master_service_run + 24 -> 19 doveadm-server 0x00000001048aac3f main + 292 -> 20 dyld 0x000000011487652e start + 462
> Jan 04 20:17:05 doveadm(74522): Fatal: master: service(doveadm): child 74522 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
>
> Turns out, this is a known (and pretty old) problem (https://www.mail-archive.com/dovecot%40dovecot.org/msg85388.html) and my dovecot on the old server (macOS + MacPorts) is newer than the dovecot on the new one. I should go back to a 2.3.16 on the old server.
>
> It seems the syncing works (or has worked) nonetheless, but it doesn't feel good.
>
> Gerben Wierda (LinkedIn <https://www.linkedin.com/in/gerbenwierda>)
> R&A IT Strategy <https://ea.rna.nl/> (main site)
> Book: Chess and the Art of Enterprise Architecture <https://ea.rna.nl/the-book/>
> Book: Mastering ArchiMate <https://ea.rna.nl/the-book-edition-iii/>
>
>> On 4 Jan 2023, at 13:54, Paul Kudla <paul at scom.ca> wrote:
>>
>> maybe look at replicator / replication
>>
>> it's designed to do exactly that
>>
>> On 1/4/2023 7:46 AM, Gerben Wierda wrote:
>>> I am in the process of migrating from dovecot on one OS (macOS/darwin) to a new server running dovecot with another OS (Ubuntu Linux 22.04). I have mostly copied/adapted the setup of the old server to the new. I am in the process of finishing that and adding some stuff that still needs to be added/migrated, like rspamd. And the data of course, before the new one takes over from the old.
>>>
>>> I have done a migration before (MacOS X Server dovecot to MacPorts dovecot on macOS), many years ago. I recall that I used dovecot syncing but also rsync, and I don't really recall the details (and anyway, the software has changed since).
>>>
>>> I have been thinking about keeping them both alive, with one as a failover for the other. They will not share their storage (e.g. NFS). So, I was wondering if I can do something with syncing between instances and dovecot director. I have been looking at the documentation, but after a quick scan I cannot locate any sort of tutorial and I am uncertain what will work and what not.
>>>
>>> If keeping both alive in parallel is too problematic, it is OK to have regular syncing in one direction (old to new) at first and then switch over and have syncing in the other direction (new to old).
>>>
>>> Can someone enlighten me?
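
A small sketch related to point 1 above (bringing both ends to at least 2.3.19.1 before replicating): the helper below compares a version string, such as the one printed by `dovecot --version`, against that minimum. The function names and the sample hash suffix are made up for illustration and are not part of dovecot; the only thing taken from the thread is the 2.3.19.1 threshold.

```python
import re

# Minimum version suggested in the thread (point 1 of Paul's reply).
MIN_VERSION = (2, 3, 19, 1)


def parse_version(text):
    """Extract the leading dotted version from text like '2.3.17 (abcdef1234)'.

    The hash suffix in that sample is illustrative only.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum=MIN_VERSION):
    """Return True if the version in `text` is at least `minimum`."""
    version = parse_version(text)
    # Pad with zeros so that (2, 3, 19) is compared as (2, 3, 19, 0).
    width = max(len(version), len(minimum))
    padded = version + (0,) * (width - len(version))
    wanted = tuple(minimum) + (0,) * (width - len(minimum))
    return padded >= wanted


if __name__ == "__main__":
    # The versions mentioned in the thread, plus the suggested minimum.
    for reported in ("2.3.16", "2.3.17", "2.3.19.1"):
        status = "ok" if meets_minimum(reported) else "upgrade first"
        print(f"{reported}: {status}")
```

Running it on the output of `dovecot --version` from each host before enabling replication gives a quick yes/no per server. It only checks the version number and says nothing about whether the two builds are otherwise compatible.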
Joachim Lindenberg
2023-Jan-05 15:55 UTC
Fwd: Migrating, syncing, maybe load-balancing/failover two dovecot servers?
In my experiments I also saw replication stall when running with SSL. Is this being looked into?

Thanks,
Joachim

-----Original Message-----
From: dovecot <dovecot-bounces at dovecot.org> On Behalf Of Paul Kudla
Sent: Thursday, 5 January 2023 02:46
To: dovecot at dovecot.org
Subject: Re: Migrating, syncing, maybe load-balancing/failover two dovecot servers?

> 2. i found replication worked better without using ssl