Hi,

I'm reading the past topics related to archiving and scalability of Dovecot; they are all very interesting. Here, I'm using two Dovecot proxies in front of five storage pairs, and we split the domain's accounts among those servers. This way we share the I/O load, and if one server goes down only a few of the domain's accounts stop working (not all of them).

But we have begun to have space problems, and the solution would be to add more and more storage servers. So I was searching for archiving solutions (hard links at the OS level, or some Dovecot extension). A friend told me that he knows an ISP that splits even a single user's mailbox among many servers.

This is a very weird and (at the same time) very interesting approach. Instead of putting all messages into one maildir and this maildir on one server, this "maildir" (?) is split among many servers. So, if one server fails the account is still accessible, and they move old/big messages to a new "cheap" storage, archiving transparently.

Well, maybe it's a stupid idea, but couldn't the Dovecot IMAP/POP proxy do the same? I mean, imagine the following scenario:

1- The proxy does an SQL lookup for the user account. Today the result is (among other data) the final server IP, but it could be the list of storage servers for this account.
2- The proxy opens parallel connections (instead of today's single connection to the final server). It retrieves the messages and caches locally which server each message is on. If the cache is lost, no problem: it just connects again and rebuilds it.
3- When a message is deleted, flagged, etc., the proxy uses the cache to know where to send the command. When a message is inserted (as in the Sent folder), the system can have a 'default storage' where messages are delivered and saved; the other storages are just for archiving (or even round-robin delivery!? Hard to control).

Problems:
1- Keeping the cache at the proxy level.
2- Quota calculation (how can the delivery process check this?)
3- Mixing the existing proxy connection with this new approach may not be easy.
4- The more servers, the more time is spent waiting for their answers (imagine one of them answering slowly).

Advantages:
1- No local mounts at the storage level (such as NFS and other network-mounted partitions).
2- Server independence: if space reaches a critical value, add a pair of servers, move messages there, and add the servers to the SQL answer (server1, server2, server3, ... , serverN).
3- Scalability: the more servers, the more I/O load you can share.

Well, sorry for the long post... but I hope to contribute some ideas (even if they are stupid :( )

Regards,
Fernando Bertasso Figaro
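The cache in steps 2 and 3 of the scenario can be illustrated with a minimal Python sketch. It is not Dovecot code and uses no Dovecot API: the class `LocationCache` and the callable `list_messages` (which stands in for "connect to one storage server and list the message IDs it holds") are hypothetical names introduced only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class LocationCache:
    """Remembers which storage server holds each message.

    The cache is disposable: if it is lost or stale, it is rebuilt by
    asking every storage server again, as described in step 2.
    """

    def __init__(self, servers, list_messages):
        self.servers = list(servers)
        # Hypothetical callable: list_messages(server) -> iterable of message IDs.
        self.list_messages = list_messages
        self.locations = {}

    def rebuild(self):
        """Query all storage servers in parallel and rebuild the map."""
        workers = max(1, len(self.servers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map returns results in the same order as self.servers.
            listings = list(pool.map(self.list_messages, self.servers))
        self.locations = {}
        for server, message_ids in zip(self.servers, listings):
            for message_id in message_ids:
                self.locations[message_id] = server

    def server_for(self, message_id):
        """Return the server to send a delete/flag command to (step 3)."""
        if message_id not in self.locations:
            self.rebuild()  # cache miss or lost cache: re-query the servers
        return self.locations[message_id]  # KeyError if no server has it
```

Note that `rebuild` waits for every server before returning, so the sketch also exhibits problem 4 from the list above: one slow storage server delays the whole lookup.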
On Tue, 2009-08-25 at 11:00 -0300, fernando at dfcom.com.br wrote:
> this is very weird and (at same time) very interesting approach. Instead
> of put all messages into one maildir and this maildir into one server,
> this "maildir" (?) is split among many servers - so, if one server
> fails the account is still accessible and they move old/big messages to a
> new "cheap" storage - archiving transparently.

Well, this is somewhat related to the filesystem abstraction that I'm planning. You'll just need to implement a filesystem that allows distributing a single user's mails to multiple servers. That's actually also what I was planning on doing by using some existing database for that (Cassandra?). And sure, it would be possible to implement all of that on my own, but probably it's too much trouble.
Hi Timo,

Yes, it's related, but I don't understand '... You'll just need to implement a filesystem that allows distributing a single user's mails to multiple servers ...'.

My idea goes in the direction that we don't need to care about the filesystem, and we don't need any distributed filesystem. Let it be whatever the user wants: the distribution happens at the proxy level, so the end storage can be ext3, ReiserFS, Linux, FreeBSD, and so on. I think that the more elements we add, the more complex and the harder to set up and debug the solution becomes.

The administrator maintains storage pairs, with any OS/filesystem he wants. His only work would be to create the accounts and folders on each storage server (if you create a folder, you create it on all three servers; the same goes for accounts) and to set up a database listing the servers involved in the process. The account structure must be kept in sync, and messages will be stored wherever the users want.

I also like the idea of using some database to store the message index.

Fernando

> On Tue, 2009-08-25 at 11:00 -0300, fernando at dfcom.com.br wrote:
>> this is very weird and (at same time) very interesting approach. Instead
>> of put all messages into one maildir and this maildir into one server,
>> this "maildir" (?) is split among many servers - so, if one server
>> fails the account is still accessible and they move old/big messages to a
>> new "cheap" storage - archiving transparently.
>
> Well, this is somewhat related to the filesystem abstraction that I'm
> planning. You'll just need to implement a filesystem that allows
> distributing a single user's mails to multiple servers. That's actually
> also what I was planning on doing by using some existing database for
> that (Cassandra?) And sure it would be possible to implement all of that
> on my own, but probably it's too much trouble..
fernando at dfcom.com.br wrote:
> I'm reading the past topics related to archive and scalability of dovecot,
> they are all very interesting. Here, I'm using two dovecot proxies in
> front of five storage pairs, and we split the domain's accounts among
> those servers. So, we can share the i/o load and if one server goes down
> only few accounts of the domain stop (not all of them).
>
> But, we began to have space problems - and the solution would be insert
> more and more storage servers. So I was searching for some archive
> solutions (hard links - at OS level, or some dovecot extension). A friend
> told me that he knows an ISP that share even the mailbox of the users
> among many servers -
>
> this is very weird and (at same time) very interesting approach. Instead
> of put all messages into one maildir and this maildir into one server,
> this "maildir" (?) is split among many servers - so, if one server
> fails the account is still accessible and they move old/big messages to a
> new "cheap" storage - archiving transparently.

Surely, other than the possibility of archiving a copy in a separate location at delivery time, everything else here is better done by a high-availability clustered SAN and *not* by an application?

Archival is a valuable thing to have. Being able to deposit a separate copy elsewhere on delivery (without necessarily indexing it etc.) allows for near-line back-up or storage where legal or corporate regulation requires it.

(I'm currently doing this using a cron job and a program I've written. Once a minute it checks whether there are any new messages in everyone's inbox Maildirs and hard-links them into a separate directory structure. Messages which disappear from the true inbox are then kept for a further 90 days. This allows users to recover messages that they may have accidentally deleted from their inbox.)

Oh, and with reference to the second paragraph... hard links only work on a single filesystem, not across multiple filesystems or servers.

Steve

--
---------------------------------------------------------------------------
IT Systems Administrator,                      E-Mail:- steve at earth.ox.ac.uk
Department of Earth Sciences,                  Tel:-    +44 (0)1865 282110
University of Oxford, Parks Road, Oxford, UK.  Fax:-    +44 (0)1865 272072
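The hard-link archive described in the message above could look roughly like the following Python sketch. It is not Steve's actual program, which is not shown in the thread. The function names, the flat archive layout and the way deletion is detected are assumptions made for illustration: a message is treated as gone from the inbox when its archive copy has a link count of 1, and the inode change time (which POSIX updates when another link is removed) is used as the approximate deletion time. This only holds if nothing else hard-links the same message, for example a copy in another folder, and if the archive is on the same filesystem as the Maildir.

```python
import os
import time

RETENTION_SECONDS = 90 * 24 * 3600  # keep deleted messages for a further 90 days


def archive_new_messages(inbox_dir, archive_dir):
    """Hard-link every message in inbox_dir/new and inbox_dir/cur into archive_dir.

    Returns the number of newly created links.
    """
    os.makedirs(archive_dir, exist_ok=True)
    linked = 0
    for sub in ("new", "cur"):
        src_dir = os.path.join(inbox_dir, sub)
        if not os.path.isdir(src_dir):
            continue
        for name in os.listdir(src_dir):
            # Maildir appends ":2,<flags>" in cur/; strip it so the archive
            # name stays the same when flags change.
            base = name.split(":", 1)[0]
            dst = os.path.join(archive_dir, base)
            if os.path.exists(dst):
                continue
            try:
                os.link(os.path.join(src_dir, name), dst)
                linked += 1
            except FileNotFoundError:
                pass  # message was moved or deleted while we were scanning
    return linked


def purge_expired(archive_dir, now=None, retention=RETENTION_SECONDS):
    """Remove archive copies whose inbox link has been gone longer than retention.

    Returns the number of files removed.
    """
    now = time.time() if now is None else now
    removed = 0
    for name in os.listdir(archive_dir):
        path = os.path.join(archive_dir, name)
        st = os.stat(path)
        # st_nlink == 1 means only the archive copy is left (assumption above).
        if st.st_nlink == 1 and now - st.st_ctime > retention:
            os.remove(path)
            removed += 1
    return removed
```

Run from cron once a minute, a wrapper would call `archive_new_messages` for each user's inbox and then `purge_expired` on the matching archive directory.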