Dear Timo,

Would there be any sense in giving Dovecot the option to split folders
into multiple subfolders when they reached a specified size (probably
message count) limit?

Dovecot would monitor folders and when they reached, say, 10,000
messages, silently split the folder on the filesystem to ensure that
access remains fast.

I know that Dovecot scales very well but this would give practically
unlimited storage capability and also keep things fast. You could even
have it so that the latest 100 messages are kept in their own folder for
fast access.

.Folder.new
.Folder.cur
.Folder.tmp

could become:

.Folder__1.new
.Folder__1.cur
.Folder__1.tmp
and
.Folder__2.new
.Folder__2.cur
.Folder__2.tmp

with Dovecot merging them before display as just "Folder" within the
mail client.

This could be further extended so that Dovecot could be configured to
store 'old' message folders in a separate location. We could then have
slower+cheaper+larger storage mounted so that 'old mail' does not take
up the expensive local SCSI disks on the machine. Mail from 2 years ago
is much less likely to be accessed than mail from the last week.

This would provide very neat behind-the-scenes archiving functionality.

Looking forward to hearing your thoughts.

Best,
Daniel

--
Squirrelmail Stable 1.4.8 (and developing on 1.5.2)
PHP 5.x Hardened with Eaccelerator
Apache 2.x
Mysql 5.0.x
Imapproxy over Dovecot 1.0.rc27 with Maildir
all running on Gentoo Linux for ~5,000 users.
Daniel Watts wrote:
> Dear Timo,
>
> Would there be any sense in giving Dovecot the option to split folders
> into multiple subfolders when they reached a specified size (probably
> message count) limit?

My understanding is that this is partially covered by Timo's "dbox"
format, which tries to take the best features of mbox and Maildir.

> .Folder.new
> .Folder.cur
> .Folder.tmp
>
> could become:
>
> .Folder__1.new
> .Folder__1.cur
> .Folder__1.tmp
> and
> .Folder__2.new
> .Folder__2.cur
> .Folder__2.tmp

You would only need to split "cur", unless you expect someone to have
over 10,000 new messages waiting. "tmp" is only used _whilst_ messages
are being delivered, so mail clients don't see a partially written
message.

> This could be further extended so that Dovecot could be configured to
> store 'old' message folders in a separate location. We could then have
> slower+cheaper+larger storage mounted so that 'old mail' does not take
> up the expensive local SCSI disks on the machine. Mail from 2 years ago
> is much less likely to be accessed than mail from the last week.

Also, instead of __N, you could try a different path, so /foo/bar/User/
is for new mail, and /old/slow/disk/User is for older stuff.

> This would provide very neat behind-the-scenes archiving functionality.

There are really two ideas here: one is the mechanism of multi-directory
folders, the other is the policy of separating by age.

--
Curtis Maloney
cmaloney at cardgate.net
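To illustrate the point about "tmp": the usual Maildir pattern is to write the message fully into tmp/ and only then rename it into new/, so a reader scanning new/ never sees a half-written file. The following is a minimal sketch of that idea, not Dovecot's delivery code; the function name `deliver` and the caller-supplied `unique_name` are assumptions made for illustration.

```python
import os


def deliver(maildir, unique_name, data):
    """Write a message into tmp/, then rename it into new/.

    `maildir` must already contain tmp/ and new/ subdirectories.
    `unique_name` is a hypothetical caller-supplied unique filename.
    """
    tmp_path = os.path.join(maildir, "tmp", unique_name)
    new_path = os.path.join(maildir, "new", unique_name)

    # "x" mode fails if the name already exists, so two deliveries
    # cannot silently overwrite each other.
    with open(tmp_path, "xb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk first

    # The rename is atomic within one filesystem: the message appears
    # in new/ complete, or not at all.
    os.rename(tmp_path, new_path)
    return new_path
```

Because readers only ever look at new/ and cur/, nothing in this scheme requires tmp/ to be split along with cur/.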
On 10/11/07, Daniel Watts <d at nielwatts.com> wrote:
> Dear Timo,
>
> Would there be any sense in giving Dovecot the option to split folders
> into multiple subfolders when they reached a specified size (probably
> message count) limit?
>

Many modern file systems offer the possibility to use optimized
directory indexes. Listing these directories scales very well.

Splitting files into subdirectories would have a negative effect: You
have to walk through every directory and merge all file names into one
data table.

Chris
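The merge step described above can be sketched as follows. This is only an illustration of the extra work a split folder would need, not anything Dovecot does; the function name `merged_listing` and the idea of passing in a list of part directories are assumptions.

```python
import os


def merged_listing(split_dirs):
    """Return one sorted list of (filename, directory) pairs.

    `split_dirs` is a hypothetical list of the directories that together
    make up a single logical folder.
    """
    table = []
    for directory in split_dirs:
        # One full directory scan per part: this is the additional
        # cost compared with listing a single large directory.
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file():
                    table.append((entry.name, directory))
    # The parts have to be combined into one table before the folder
    # can be presented as a single list.
    table.sort()
    return table
```

Every listing of the logical folder pays for one scan per part plus the merge, which is why an indexed single directory may well come out ahead.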
On Thu, 2007-10-11 at 10:00 +0100, Daniel Watts wrote:
> .Folder__1.new
> .Folder__1.cur
> .Folder__1.tmp
> and
> .Folder__2.new
> .Folder__2.cur
> .Folder__2.tmp
>
> with Dovecot merging them before display as just "Folder" within the
> mail client.

Virtual folders would enable this, if they're implemented one day..

> This could be further extended so that Dovecot could be configured to
> store 'old' message folders in a separate location. We could then have
> slower+cheaper+larger storage mounted so that 'old mail' does not take
> up the expensive local SCSI disks on the machine. Mail from 2 years ago
> is much less likely to be accessed than mail from the last week.

dbox format will support this soon. So that you can configure two (or
more) directories for it and then Dovecot will look up the mail files
from each of them in order. It would also support automatically moving
non-recently accessed mails to the slower dirs.

The current dbox implementation in v1.1 supports only
one-message-per-file mode so it's quite similar to maildir. The main
problem with implementing fast/slow storage for maildir is that the
maildir filenames change all the time, so it would waste the slow
storage's I/O all the time when trying to figure out if a file is there
or not. dbox doesn't have this problem.
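The "look up in each directory in order" and "move non-recently accessed mails" ideas can be sketched roughly as below. This is a hedged illustration only and says nothing about how dbox actually implements them: the function names `find_message` and `move_stale`, the use of file access time as the staleness signal, and the flat one-file-per-message layout are all assumptions.

```python
import os
import shutil
import time


def find_message(name, tiers):
    """Return the path of `name` in the first tier that holds it.

    `tiers` is an ordered list of directories, fastest storage first.
    """
    for directory in tiers:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            return path
    return None


def move_stale(fast_dir, slow_dir, max_age_seconds, now=None):
    """Move files not accessed within `max_age_seconds` to `slow_dir`.

    Staleness is judged by st_atime here, which is an assumption of
    this sketch (and unreliable on filesystems mounted with noatime).
    """
    now = time.time() if now is None else now
    moved = []
    for name in sorted(os.listdir(fast_dir)):
        src = os.path.join(fast_dir, name)
        if os.path.isfile(src) and now - os.stat(src).st_atime > max_age_seconds:
            shutil.move(src, os.path.join(slow_dir, name))
            moved.append(name)
    return moved
```

The lookup relies on a message keeping the same filename wherever it lives. Maildir encodes flags in the filename, so a name can change whenever flags change, and an exact-name probe like this would miss and fall through to the slow tier; that appears to be the wasted slow-storage I/O described above.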
Timo Sirainen wrote:
> On Thu, 2007-10-11 at 10:00 +0100, Daniel Watts wrote:
>
>> .Folder__1.new
>> .Folder__1.cur
>> .Folder__1.tmp
>> and
>> .Folder__2.new
>> .Folder__2.cur
>> .Folder__2.tmp
>>
>> with Dovecot merging them before display as just "Folder" within the
>> mail client.
>
> Virtual folders would enable this, if they're implemented one day..
>
>> This could be further extended so that Dovecot could be configured to
>> store 'old' message folders in a separate location. We could then have
>> slower+cheaper+larger storage mounted so that 'old mail' does not take
>> up the expensive local SCSI disks on the machine. Mail from 2 years ago
>> is much less likely to be accessed than mail from the last week.
>
> dbox format will support this soon. So that you can configure two (or
> more) directories for it and then Dovecot will look up the mail files
> from each of them in order. It would also support automatically moving
> non-recently accessed mails to the slower dirs.
>
> The current dbox implementation in v1.1 supports only
> one-message-per-file mode so it's quite similar to maildir. The main
> problem with implementing fast/slow storage for maildir is that the
> maildir filenames change all the time, so it would waste the slow
> storage's I/O all the time when trying to figure out if a file is there
> or not. dbox doesn't have this problem.
>

Hi Timo!

Digging up this thread from 2007.

Just had another conversation in my company about how to spread old
non-accessed files to cheaper slower storage. Is this now feasible? I
noticed dbox is now v2.0 but see no reference to virtual folders or
auto-archiving etc.

Hope you're having a good time State-side!

Best wishes,
Dan