> On Feb 22, 2017, at 2:44 PM, Timo Sirainen <tss at iki.fi> wrote:
>
> I guess mainly the message sequence numbers in IMAP protocol makes this more difficult, but it's not an impossible problem to solve.

Any thoughts on the wisdom of supporting an external database for session state or even mailbox state (like using Redis or even MySQL)?

Also, would it help reliability or scalability to store a copy of the index data in an external database?

I want to use mdbox format but I have heard that these index files do get corrupted occasionally and have to be rebuilt (possibly using an older version of the index file to construct a new one). I worry that using mdbox might cause my users to see the IMAP flags suddenly reset back to a previous state (like seeing previously read messages becoming unread in their mail clients).

If a copy of the index data were stored in an external database, such problems of duplicate messages occurring in a dovecot cluster could be handled by having the cluster "lookup" the index data using the external database instead of the local copy stored on the server. An external database could easily implement unique serial numbers cluster-wide.

In the site I'm working on building, I even use Redis to implement "message queues" between Postfix and Dovecot (via redis push/pop feature). Currently, I am only delivering new messages via IMAP instead of LMTP (no LMTP will be available to my backend mail servers, only IMAP).

If you stored the MD5 checksum of the index files (and even the message files) in the external database, you could also run a background process that would periodically check for corruption of the local index files using the checksums from the database, making mdbox format even more bulletproof.

And, the best thing about using an external database is that making the external database highly available is not a problem (as most sites already do that). The index data stored in the database would become the "source of truth" with the local index files/session data being an efficient cache for the mailstore. And, re-caching could occur as needed to make the whole cluster more reliable.

Kevin
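As an illustration of the checksum idea in the message above, here is a minimal sketch of what such a background check might look like. It is not part of Dovecot: the `expected` mapping is a hypothetical stand-in for whatever the external database would return, and the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt(expected: dict[str, str], root: Path) -> list[str]:
    """List relative paths that are missing or do not match the recorded digest.

    `expected` is a hypothetical stand-in for the external database:
    it maps a path relative to `root` to the hex MD5 recorded at write time.
    """
    bad = []
    for rel, want in sorted(expected.items()):
        path = root / rel
        if not path.is_file() or file_md5(path) != want:
            bad.append(rel)
    return bad
```

One consequence worth noting: index files presumably change on every legitimate flag update, so the recorded digest would have to be refreshed on each write, otherwise a check like this would report ordinary changes as corruption.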
On 2/22/2017, 3:46:08 PM, KT Walrus <kevin at my.walr.us> wrote:
> I want to use mdbox format but I have heard that these index files do
> get corrupted occasionally and have to be rebuilt (possibly using an
> older version of the index file to construct a new one). I worry that
> using mdbox might cause my users to see the IMAP flags suddenly reset
> back to a previous state (like seeing previously read messages
> becoming unread in their mail clients).

This is the only reason I haven't moved to mdbox myself. I really, really wish there was a way to not have to worry about losing flags.
Timo Sirainen
2017-Feb-23 21:00 UTC
Scaling to 10 Million IMAP sessions on a single server
On 22 Feb 2017, at 22.46, KT Walrus <kevin at my.walr.us> wrote:
>
>> On Feb 22, 2017, at 2:44 PM, Timo Sirainen <tss at iki.fi> wrote:
>>
>> I guess mainly the message sequence numbers in IMAP protocol makes this more difficult, but it's not an impossible problem to solve.
>
> Any thoughts on the wisdom of supporting an external database for session state or even mailbox state (like using Redis or even MySQL)?
>
> Also, would it help reliability or scalability to store a copy of the index data in an external database?

I mainly see such external databases as additional reasons for things to break. And even if not, additional extra layers of latency. The thoughts I've had about storing such internal state in the Dovecot Proxy layer make sense because the IMAP sessions have to have active TCP connections. All the state can be stored by the process that is responsible for the TCP connection itself. There's not much point storing such state outside the process: If the process or the TCP connection dies, the state needs to be forgotten about in any case since there's no "state resume" command in IMAP (and even if there were, the state probably should then be stored in that command itself rather than on the server side).

> I want to use mdbox format but I have heard that these index files do get corrupted occasionally and have to be rebuilt (possibly using an older version of the index file to construct a new one). I worry that using mdbox might cause my users to see the IMAP flags suddenly reset back to a previous state (like seeing previously read messages becoming unread in their mail clients).

Both sdbox and mdbox formats have this problem in theory. Practically, there are many huge mdbox/sdbox installations and I don't think they see such problems much, if ever. Dovecot attempts pretty hard already not to lose flags with sdbox/mdbox. There are also separate dovecot.index.backup files that are kept just for this purpose.

> If a copy of the index data were stored in an external database, such problems of duplicate messages occurring in a dovecot cluster could be handled by having the cluster "lookup" the index data using the external database instead of the local copy stored on the server.

This sounds a bit similar to the "obox" format that we use for storing emails and indexes to object storage in Dovecot Pro. That isn't open source though.

> If you stored the MD5 checksum of the index files (and even the message files) in the external database, you could also run a background process that would periodically check for corruption of the local index files using the checksums from the database, making mdbox format even more bulletproof.

I don't see why this would need an external database. I've long had in my TODO to add hashes/checksums to all of the Dovecot index files so it could properly detect corruption and ignore that. Hopefully that's not too far into the future anymore.

> And, the best thing about using an external database is that making the external database highly available is not a problem (as most sites already do that). The index data stored in the database would become the "source of truth" with the local index files/session data being an efficient cache for the mailstore. And, re-caching could occur as needed to make the whole cluster more reliable.

In my opinion an external database is just shifting the problem from one place to another. Yes, sometimes it's still useful. Dovecot supports all kinds of databases for all kinds of purposes, like with the dict API you can access LDAP, SQL or Cassandra. I mostly like Cassandra nowadays, but it has its problems as well (tombstones). I'm not aware of any highly available database that actually scales and really just works without problems. (I'm talking about clusters with more than just 2 servers. Ideally more than just 2 datacenters.)
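To illustrate the point that corruption detection does not need an external database, here is a rough sketch of a file that carries its own checksum, with a fallback to a backup copy when the primary fails verification. The layout (a SHA-256 digest followed by the payload) and all function names are assumptions made for this example; this is not Dovecot's index format, and it does not describe how dovecot.index.backup is actually used.

```python
import hashlib
from pathlib import Path
from typing import Optional

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def write_checked(path: Path, payload: bytes) -> None:
    """Write the payload prefixed by its SHA-256 digest (hypothetical layout)."""
    path.write_bytes(hashlib.sha256(payload).digest() + payload)


def read_checked(path: Path) -> Optional[bytes]:
    """Return the payload, or None if the file is missing, truncated, or corrupt."""
    try:
        raw = path.read_bytes()
    except OSError:
        return None
    if len(raw) < DIGEST_LEN:
        return None
    stored, payload = raw[:DIGEST_LEN], raw[DIGEST_LEN:]
    return payload if hashlib.sha256(payload).digest() == stored else None


def load_with_backup(primary: Path, backup: Path) -> Optional[bytes]:
    """Prefer the primary file; use the backup if the primary fails its check."""
    data = read_checked(primary)
    return data if data is not None else read_checked(backup)
```

Because the digest travels with the data, a reader can notice a damaged file and ignore it without consulting anything outside the local filesystem.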
Timo Sirainen
2017-Feb-23 21:21 UTC
Scaling to 10 Million IMAP sessions on a single server
On 23 Feb 2017, at 23.00, Timo Sirainen <tss at iki.fi> wrote:
>
> I mainly see such external databases as additional reasons for things to break. And even if not, additional extra layers of latency.

Oh, just thought that I should clarify this and I guess other things I said. I think there are two separate things we're possibly talking about in here:

1) Temporary state: This is what I was mainly talking about. State related to a specific IMAP session. This doesn't take much space and can be stored in the proxy's memory since it's specific to the TCP session anyway.

2) Permanent state: This is mainly about the storage. A lot of people use Dovecot with NFS. So one possibility for storing the permanent state is NFS. Another possibility with Dovecot Pro is to store it to object storage as blobs and keep a local cache of the state. A 3rd possibility might be to use some kind of a database for storing the permanent state. I'm fine with the first two, but with 3rd I see a lot of problems and not a whole lot of benefit. But if you think of the databases (or even NFS) as blob storage, you can think of them the same as any object storage and use the same obox format with them. What I'm mainly against is attempting to create some kind of a database that has structured format like (imap_uid, flags, ...) - I'm sure that can be useful for various purposes but performance or scalability isn't one of them.
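The "temporary state" in point 1 can be pictured with a small sketch of the per-connection mapping from IMAP message sequence numbers to UIDs. This is only an illustration of the protocol behaviour (sequence numbers are 1-based positions, and an expunge shifts every later message down by one); the class and method names are hypothetical and do not come from Dovecot.

```python
class SessionView:
    """Per-connection view of one selected mailbox: sequence number -> UID."""

    def __init__(self, uids):
        # Sequence number n is the 1-based position of a UID in ascending order.
        self._uids = sorted(uids)

    def uid_of(self, seq: int) -> int:
        """Translate a sequence number into the UID it currently refers to."""
        if not 1 <= seq <= len(self._uids):
            raise IndexError(f"no message with sequence number {seq}")
        return self._uids[seq - 1]

    def expunge(self, seq: int) -> int:
        """Remove a message; every later message moves down by one sequence number."""
        uid = self.uid_of(seq)
        del self._uids[seq - 1]
        return uid

    def append(self, uid: int) -> int:
        """Add a newly arrived message and return its sequence number."""
        if self._uids and uid <= self._uids[-1]:
            raise ValueError("UIDs must be strictly ascending")
        self._uids.append(uid)
        return len(self._uids)
```

For example, with UIDs 10, 12 and 15 selected, expunging sequence number 2 makes sequence number 2 refer to UID 15 afterwards. The mapping depends on which expunges have been reported to this particular client, which is why it belongs to the process holding the TCP connection and can simply be discarded when that connection ends.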