On 13.01.2014 12:48, Markus Weippert wrote:
> Hi,
>
> I'm having some issues with replicating public namespaces. Everything
> seems to work fine for private namespaces, but while importing some huge
> mailboxes (many small mails) into a public namespace via imapsync,
> something goes wrong.
>
> The expected mail flow is:
> old-server --(imapsync)--> new-server1 --(replication)--> new-server2
>
> But then, dovecot seems to run into race conditions when the
> replication process tries to sync the same public mailbox under two or
> more different users at the same time. As a result, messages get
> duplicated, new-server2 sends those back to new-server1 which then
> starts to produce duplicates too. If I don't kill the processes in time
> and delete the faulty mailbox, they start to produce thousands of mails.
> In fact, server2 should not export messages at all, since it's not
> in production yet and does not get any mail except from the replication.
>
> The only thing getting logged (only a few entries compared to the huge
> amount of duplicates produced):
> "dsync-server(user at example.com): Warning: Maildir /...: Expunged
> message reappeared, giving a new UID"
>
> Is there any way to fix this?
>
> Regards,
> Markus
I looked into this a bit more. The problem seems to be that replication
locking is only done at the user level. For public namespaces, this
allows two replication processes to sync the same mailbox in parallel.
So I wrote a (poor) implementation of mailbox-level locking. It locks
the mailbox with a lock file in the control directory on both sides (not
sure if that's necessary) and skips locked mailboxes immediately instead
of waiting, because they are currently being synced anyway.
It actually works in my setup. The duplicate messages are gone. It logs
some warnings when two replication processes try to access the same
mailbox at once, which seems to happen quite frequently in public
namespaces.
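
To illustrate the idea only (this is not the attached patch, which is C
code inside dsync), here is a minimal Python sketch of "try to lock the
mailbox, skip it if someone else holds the lock". The lock file name,
the function names and the warning text are made up for the example, and
it only covers the local side.

```python
import os
import tempfile
from contextlib import contextmanager

LOCK_NAME = "dsync-mailbox.lock"  # hypothetical name, not from the patch


@contextmanager
def try_lock_mailbox(control_dir):
    """Yield True if we obtained the lock, False if it is already held."""
    path = os.path.join(control_dir, LOCK_NAME)
    try:
        # O_CREAT | O_EXCL makes creation atomic: only one process succeeds.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        # Someone else is syncing this mailbox; do not wait for them.
        yield False
        return
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner for debugging
        yield True
    finally:
        os.unlink(path)  # release the lock even if the sync failed


def sync_mailbox(control_dir, do_sync):
    """Run do_sync() under the mailbox lock; return False if skipped."""
    with try_lock_mailbox(control_dir) as locked:
        if not locked:
            print(f"Warning: {control_dir} is locked, skipping")
            return False
        do_sync()
        return True


if __name__ == "__main__":
    mailbox = tempfile.mkdtemp()
    # The nested call stands in for a second replication process that
    # reaches the same public mailbox while the first sync is running.
    sync_mailbox(mailbox, lambda: sync_mailbox(mailbox, lambda: None))
```

A lock file like this stays behind if the process crashes, so a real
implementation would also need some way to detect stale locks.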
Maybe someone more experienced can clean this up and get it adopted
upstream? I really like the replication idea and it would be nice if it
were as stable for shared/public namespaces as it is for private ones...
Regards,
Markus
P.S.:
> replication_dsync_parameters = -d -l 60 -N -x virtual -x ns_public -U
That was a typo; it actually looks like this:
replication_dsync_parameters = -d -l 60 -N -x virtual -x legacy -U
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dsync-lock.patch
Type: text/x-patch
Size: 5672 bytes
Desc: not available
URL: <http://dovecot.org/pipermail/dovecot/attachments/20140118/8a4d3741/attachment.bin>