thr3ads.net - dovecot - What's a Reasonable Inbox Size? [May 2020]


asai at globalchangemusic.org

2020-May-08 18:54 UTC

What's a Reasonable Inbox Size?

> It depends on what you consider reasonable.
>
> The processing time of any file operation that iterates through a mailbox
> will generally go up proportionately with size.  If you do a text search
> without some indexing system like Solr, it will take a very long time.
>
> If the mailbox is just some archive that you pile up and forget about
> except for once-in-a-blue-moon retrieval, then it might be reasonable.
>
> If it's an active mailbox, it will be a pain to navigate, in the same
> way as a single folder with 100K files or a file cabinet with huge stacks
> of envelopes.
>
> I would guess some partitioning of the large mailboxes into smaller
> mailboxes would help with active mailboxes.  Most people spend most of
> their time on new/recent messages, so making time-, size-, or
> subject-based volumes wouldn't be a bad idea.
>
> If the bulk of the size is redundant copies of attachments, then
> Dovecot's *dbox formats support de-duping, which would also help.
>
So, generally speaking, you don't want to have inboxes that just sync
all day long, due to massive amounts of small files in the inbox.  This
may be OK in the case of a rarely accessed archive folder, but not good
for regularly accessed inboxes, etc.?

Joseph Tam

2020-May-08 23:16 UTC

What's a Reasonable Inbox Size?

On Fri, 8 May 2020, asai at globalchangemusic.org wrote:
>
>> It depends on what you consider reasonable.
>>
>> The processing time of any file operation that iterates through a mailbox
>> will generally go up proportionately with size.  If you do a text search
>> without some indexing system like Solr, it will take a very long time.
>>
>> If the mailbox is just some archive that you pile up and forget about
>> except for once-in-a-blue-moon retrieval, then it might be reasonable.
>>
>> If it's an active mailbox, it will be a pain to navigate, in the same
>> way as a single folder with 100K files or a file cabinet with huge stacks
>> of envelopes.
>>
>> I would guess some partitioning of the large mailboxes into smaller
>> mailboxes would help with active mailboxes.  Most people spend most of
>> their time on new/recent messages, so making time-, size-, or
>> subject-based volumes wouldn't be a bad idea.
>>
>> If the bulk of the size is redundant copies of attachments, then
>> Dovecot's *dbox formats support de-duping, which would also help.
>>
>
> So, generally speaking, you don't want to have inboxes that just sync
> all day long, due to massive amounts of small files in the inbox.  This
> may be OK in the case of a rarely accessed archive folder, but not good
> for regularly accessed inboxes, etc.?
>
Joseph Tam <jtam.home at gmail.com>

Joseph Tam

2020-May-08 23:18 UTC

What's a Reasonable Inbox Size?

On Fri, 8 May 2020, Joseph Tam wrote:
>>> It depends on what you consider reasonable.

Whoops.  Editing error.  Here is what I wanted to send.

On Fri, 8 May 2020, asai at globalchangemusic.org wrote:
> So, generally speaking, you don't want to have inboxes that just sync
> all day long, due to massive amounts of small files in the inbox.

I don't know enough about what is involved when your client tries
to sync to comment on your particular situation.  If the exchange of
information involves only delta changes (e.g. the list of items that have
been added/removed since the last sync), and if this information is readily
available in Dovecot's caches, then this operation might be optimized
to take minimal time.
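
As a rough illustration of why the delta case is cheap, here is a minimal
Python sketch that compares two snapshots of message UIDs.  The function
name and the snapshot values are hypothetical, and this is not how Dovecot
or any particular client implements syncing; it only shows how small the
exchanged data can be compared with the full list.

```python
def compute_delta(previous_uids, current_uids):
    """Return (added, removed) UIDs between two sync snapshots."""
    previous, current = set(previous_uids), set(current_uids)
    added = sorted(current - previous)
    removed = sorted(previous - current)
    return added, removed


if __name__ == "__main__":
    # Hypothetical snapshots: the client last saw UIDs 1..100000; since
    # then two messages arrived and one (UID 42) was expunged.
    last_seen = range(1, 100_001)
    now = [uid for uid in range(1, 100_003) if uid != 42]

    added, removed = compute_delta(last_seen, now)
    # Only three UIDs need to be communicated, not ~100,000.
    print(len(added) + len(removed), "changes out of", len(now), "messages")
```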

If, however, it involves exchanging entire lists of many message IDs,
or worse, involves Dovecot accessing each message, it will result in
large amounts of time spent in I/O (network, disk or both).  With Maildir
(many small messages in a folder), this causes seeking all over the disk.
Some filesystems (XFS?) may be better at this than others.

The description of your problem seems to suggest the latter, so breaking
up gigantic mailboxes into manageable volumes will help.
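
One way to break a large Maildir into time-based volumes can be sketched
with Python's standard mailbox module.  This is only a sketch under stated
assumptions: the "Archive-<year>" folder naming and the keep_from_year
cutoff are made up for illustration, messages without a parseable Date
header are left in place, and the script does not coordinate with a running
Dovecot (locking, index files, subscriptions), so it should be tried on a
copy rather than a live mailbox.

```python
import mailbox
from email.utils import parsedate_to_datetime


def message_year(msg):
    """Return the year from the Date header, or None if missing/unparseable."""
    date_header = msg["Date"]
    if date_header is None:
        return None
    try:
        return parsedate_to_datetime(str(date_header)).year
    except (TypeError, ValueError):
        return None


def partition_by_year(maildir_path, keep_from_year):
    """Move messages older than keep_from_year into per-year subfolders.

    Returns the number of messages moved.  Folder names are hypothetical.
    """
    inbox = mailbox.Maildir(maildir_path, create=False)
    moved = 0
    for key in list(inbox.keys()):
        msg = inbox[key]
        year = message_year(msg)
        if year is None or year >= keep_from_year:
            continue  # keep recent or undatable messages where they are
        name = f"Archive-{year}"
        if name in inbox.list_folders():
            folder = inbox.get_folder(name)
        else:
            folder = inbox.add_folder(name)
        folder.add(msg)    # copy into the per-year folder (flags preserved)
        inbox.remove(key)  # then drop it from the big mailbox
        moved += 1
    return moved
```

For example, partition_by_year("/path/to/copy/of/Maildir", 2020) would
leave messages dated 2020 or later in place and move older ones into
Archive-2019, Archive-2018, and so on.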

If you really want to see what's going on when a client syncs, you
can network trace, process trace, or use Dovecot's rawlog feature

 	https://wiki.dovecot.org/Debugging/Rawlog

to directly observe the interaction between server and client.

> This may be OK in the case of a rarely accessed archive folder, but not
> good for regularly accessed inboxes, etc.?

This is not really so much technical advice as a rule of thumb: there's
not a lot of payoff to optimizing rare operations.

Joseph Tam <jtam.home at gmail.com>

@lbutlr

2020-May-09 07:52 UTC

What's a Reasonable Inbox Size?

On 08 May 2020, at 12:54, asai at globalchangemusic.org wrote:
>> It depends on what you consider reasonable.
>>
>> The processing time of any file operation that iterates through a mailbox
>> will generally go up proportionately with size.  If you do a text search
>> without some indexing system like Solr, it will take a very long time.
>>
>> If the mailbox is just some archive that you pile up and forget about
>> except for once-in-a-blue-moon retrieval, then it might be reasonable.
>>
>> If it's an active mailbox, it will be a pain to navigate, in the same
>> way as a single folder with 100K files or a file cabinet with huge stacks
>> of envelopes.
>>
>> I would guess some partitioning of the large mailboxes into smaller
>> mailboxes would help with active mailboxes.  Most people spend most of
>> their time on new/recent messages, so making time-, size-, or
>> subject-based volumes wouldn't be a bad idea.
>>
>> If the bulk of the size is redundant copies of attachments, then
>> Dovecot's *dbox formats support de-duping, which would also help.
>
> So, generally speaking, you don't want to have inboxes that just sync
> all day long, due to massive amounts of small files in the inbox.  This
> may be OK in the case of a rarely accessed archive folder, but not good
> for regularly accessed inboxes, etc.?

Not really, since most GUI clients keep all folders synced, so moving
messages into separate, smaller mailboxes doesn't reduce the number of
files accessed.

The issue is that most file systems don't deal well with a folder that
has millions of files in it.
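
To see whether any folder is actually approaching that range, a minimal
Python sketch like the following could count message files per Maildir
folder.  It assumes the usual Maildir layout, where each folder has cur/
and new/ subdirectories; the function names are made up for illustration.

```python
import os


def count_entries(directory):
    """Count regular files directly inside directory (0 if it is missing)."""
    try:
        with os.scandir(directory) as entries:
            return sum(1 for entry in entries if entry.is_file())
    except FileNotFoundError:
        return 0


def maildir_message_counts(maildir_root):
    """Map each Maildir folder under maildir_root to its message-file count."""
    counts = {}
    for dirpath, dirnames, _ in os.walk(maildir_root):
        if "cur" in dirnames and "new" in dirnames:
            counts[dirpath] = sum(
                count_entries(os.path.join(dirpath, sub)) for sub in ("cur", "new")
            )
        # Do not descend into cur/new/tmp: they hold messages, not folders.
        dirnames[:] = [d for d in dirnames if d not in ("cur", "new", "tmp")]
    return counts
```

Sorting the result by count, e.g. sorted(counts.items(), key=lambda kv:
kv[1], reverse=True), puts the largest folders first.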

But with mbox, each "folder" is a single file, and a single multi-GB text
file that has to be parsed is definitely an issue on any file system.


-- 
ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES
	BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart
	chalkboard Ep. 1F07
