Sebastian Marske
2022-Aug-02 13:30 UTC
Replication not working - GUIDs conflict - will be merged later
On 8/1/22 11:15, Patrick Westenberg wrote:
> Very interesting new insights:
>
> When I use imapsync and let it synchronize mails from INBOX to
> INBOX/testfolder, the automatic replication works fine.
> All mails are synchronized between my two backends.
>
> When I move the mails to the INBOX (doveadm move -u mail at example.com
> INBOX mailbox INBOX/testfolder all), these mails are lost on the
> replica! They are neither in INBOX, nor in INBOX/testfolder
>
> Regards
> Patrick

Hi,

every now and then I have the same problem on our servers. Currently, I'm running Dovecot 2.3.19.1 as well, but I upgraded directly from 2.3.16 due to other issues with the versions in between.

Last time I observed a de-sync due to a GUID change, it appeared like the user had moved a folder around in their mailbox. And indeed, the output of 'doveadm mailbox status -u someuser guid "*"' listed different GUIDs. Dovecot actually logged some errors for this case:

Dovecot log from replica1:
Jul 27 12:06:08 replica1 dovecot[3431]: doveadm(someuser)<10206><s1aFMQ8O4WLeJwAAyQQkNg>: Error: Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes path/to/folder and path/to/folder-temp-1 - giving a new GUID b0053e390f0ee162de270000c9042436 to path/to/folder
Jul 27 12:06:08 replica1 dovecot[3431]: doveadm(someuser)<10208><fgWCCRAO4WLgJwAAyQQkNg>: Error: Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes path/to/folder and path/to/folder-temp-1 - giving a new GUID 5823fe0d100ee162e0270000c9042436 to path/to/folder

Dovecot log from replica2:
Jul 27 12:06:04 replica2 dovecot[47018]: doveadm(someuser)<2239><TD9EDAwO4WK/CAAADKIhQg>: Warning: Failed to do incremental sync for mailbox path/to/folder, retry with a full sync (uidnext 1 < 13)
Jul 27 12:06:04 replica2 dovecot[47018]: doveadm(someuser)<2241><ix0uKQwO4WLBCAAADKIhQg>: Error: Duplicate mailbox GUID 0ccaab01079031620e1e00000ca22142 for mailboxes path/to/folder and some/folder - giving a new GUID 78c9dc2c0c0ee162c10800000ca22142 to path/to/folder

At that time, only replica2 was accepting imap connections.
In this particular case, Dovecot eventually managed to get things back in sync after way over 24h, but I also had users out of sync for multiple days.
Running 'doveadm -Dv sync -u someuser -d' manually gave me the same error message, but didn't change anything.

Other things I've observed:
* it's not limited to a fixed set of users (unlike the too-many-folders-thing with Dovecot 2.3.1[78])
* it's not limited to newly created users, but also affects users, that have been in sync for months/years
* it's not limited to mailboxes with lots of imap operations going on
* it's not specific to very large or very small mailboxes (although I've only seen it for folders with a small number of mails in them)
* in most cases, Dovecot doesn't log any errors
* it does seem to be related to something an imap client can trigger

As of now, my "fix" is to
* make sure that one of the replicas has all mails for that folder (we're using maildir, so I can just rsync the individual mails/folders)
* create a full copy of the complete folder as backup
* remove the user from replication
* 'doveadm mailbox delete' the folder on one replica to get rid of one of the conflicting guids (one time, Dovecot replicated the deletion despite removing the user from replication, so the backup came in handy)
  * alternatively, you might be fine by deleting the folder's index files
* add the user back to replication
* let dsync replicate the user
-> fixed

It's not a very convenient way to resolve this, but maybe it helps. Any better solutions are greatly appreciated!

Best
Sebastian
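To see at a glance which folders were handed a new GUID, the "Duplicate mailbox GUID" errors can be pulled out of the log with a small script. The following is only a sketch in Python: the regular expression is modelled on the log lines quoted in this message, not on any format documented by Dovecot, and the names DUPLICATE_GUID_RE and parse_duplicate_guid are made up for illustration. It also assumes that mailbox names do not themselves contain " and ".

```python
import re

# Pattern modelled on the "Duplicate mailbox GUID" errors quoted above.
# Assumption: mailbox names do not contain the literal text " and ".
DUPLICATE_GUID_RE = re.compile(
    r"Duplicate mailbox GUID (?P<old_guid>[0-9a-f]{32}) "
    r"for mailboxes (?P<first>.+?) and (?P<second>.+?) "
    r"- giving a new GUID (?P<new_guid>[0-9a-f]{32}) to (?P<target>.+?)\s*$"
)


def parse_duplicate_guid(line):
    """Return the GUIDs and mailbox names from one log line, or None."""
    match = DUPLICATE_GUID_RE.search(line)
    return match.groupdict() if match else None


# One of the replica1 log lines from this message.
sample = (
    "Jul 27 12:06:08 replica1 dovecot[3431]: "
    "doveadm(someuser)<10206><s1aFMQ8O4WLeJwAAyQQkNg>: Error: "
    "Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes "
    "path/to/folder and path/to/folder-temp-1 - giving a new GUID "
    "b0053e390f0ee162de270000c9042436 to path/to/folder"
)

info = parse_duplicate_guid(sample)
print(info["target"], info["old_guid"], "->", info["new_guid"])
```

Running this over the logs of both replicas and grouping by target folder should show whether the two sides keep assigning different GUIDs to the same folder.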
Paul Kudla (SCOM.CA Internet Services Inc.)
2022-Aug-02 13:49 UTC
Replication not working - GUIDs conflict - will be merged later
OK, I went through this as well, a bit.

There is a replication full sync variable (I had trouble finding it). 24h is the default, but I did rebuild Dovecot with a modified default at one point. After I got things working I put everything back to the default code.

From dovecot-2.3.19/src/replication, see:

aggregator/replicator-connection.c:#define MAX_INBUF_SIZE 1024
aggregator/replicator-connection.c:#define REPLICATOR_MEMBUF_MAX_SIZE 1024*1024
aggregator/replicator-connection.c: conn->queue[i] = buffer_create_dynamic(default_pool, 1024);
Binary file replicator/replicator-brain.o matches
replicator/replicator-settings.c: .replication_full_sync_interval = 60*60*24,
replicator/notify-connection.c:#define MAX_INBUF_SIZE (1024*64)
Binary file replicator/doveadm-connection.o matches
Binary file replicator/.libs/replicator matches
replicator/replicator-brain.c: pool = pool_alloconly_create("replication brain", 1024);
replicator/replicator-queue.c: queue->user_queue = priorityq_init(user_priority_cmp, 1024);
replicator/replicator-queue.c: hash_table_create(&queue->user_hash, default_pool, 1024,
Binary file replicator/notify-connection.o matches
Binary file replicator/dsync-client.o matches

I do not believe there is a settable variable in dovecot.conf? I could be wrong.

The actual code containing the variable is below. Change it and recompile everything, and that should/might help:

replicator/replicator-settings.c: .replication_full_sync_interval = 60*60*24,

Change the 24 to something more practical?

Notes:
* 60*60*24 is math (i.e. the number of seconds between full syncs), so do not change 24 to 24h, for example.
* Do this on both servers.
* How much stress a full sync interval puts on the server depends on how much physical mail you have in the mbox.
* The full resync syncs both accounts from scratch.
* 6 hours is not a bad place to start?

The replicator service will deal with this in the background.

There are also other hard-set variables (I believe 15m for the retry-bad-sync interval, for example). You will need to dig through the replicator code to find these.

Happy Tuesday !!!
Thanks - paul

Paul Kudla
Scom.ca Internet Services <http://www.scom.ca>
004-1009 Byron Street South
Whitby, Ontario - Canada
L1N 4S3

Toronto 416.642.7266
Main 1.866.411.7266
Fax 1.888.892.7266
Email paul at scom.ca

On 8/2/2022 9:30 AM, Sebastian Marske wrote:
> [...]
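For anyone unsure about the arithmetic behind 60*60*24, here is a tiny Python sketch that converts an interval in hours into the plain number of seconds the source line expects. The helper name full_sync_interval_seconds is invented for this example and is not part of Dovecot. As far as I know, Dovecot's replication documentation also describes a replication_full_sync_interval setting for dovecot.conf, so it is probably worth checking there before patching and recompiling.

```python
SECONDS_PER_HOUR = 60 * 60


def full_sync_interval_seconds(hours):
    """Convert a whole number of hours into seconds (hypothetical helper)."""
    if hours <= 0:
        raise ValueError("interval must be a positive number of hours")
    return SECONDS_PER_HOUR * hours


# The compiled-in default quoted above: 60*60*24
print(full_sync_interval_seconds(24))  # 86400

# The suggested starting point of 6 hours: 60*60*6
print(full_sync_interval_seconds(6))  # 21600
```

So a 6 hour interval would be written in the source as 60*60*6 (or 21600), never as "6h".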
Gerald Galster
2022-Aug-02 18:24 UTC
Replication not working - GUIDs conflict - will be merged later
> (we're using maildir, so I can just rsync the individual mails/folders)

I'm curious if anybody experienced this issue using mdbox. As far as I remember it's better suited for replication as filenames and location do not change on disk (index only).

Best regards
Gerald