I am having trouble scaling some regular dovecot cleanup operations on
our servers. On a daily basis, I want to run the following on each server,
which contains its own isolated set of user storage:

/usr/bin/doveadm expunge -A mailbox Trash* savedbefore 21d
/usr/bin/doveadm expunge -A mailbox Spam savedbefore 7d
/usr/bin/doveadm expunge -A mailbox Sent savedbefore 120d

but these are very expensive operations. For example, just doing the
Spam expunge takes 30 minutes (or more, depending on the load) of heavy
disk operations, on each machine it is run on. There are approximately
20k users on each machine. 
It seems like it does not use the iterate query; instead, it looks
into the expires table in the database and iterates over every
user mentioned there. This is a problem because I've got multiple
dovecot machines with different sets of users on them, using the same
table, so that means that it's doing stat() calls on each system for
every user mentioned there and failing to find the user on the
filesystem (because the user is on another system). In these cases it
logs a message: doveadm(user): Info: User no longer exists, skipping
For users that do exist on this system, it seems to do something on the
order of 15 stat() calls, at minimum. 
What are some things I can do to make this less of an expensive process?
If I had a shared storage system that each machine used, this would cut
down on the resource waste because I'd only need to run the query once,
but unfortunately, that isn't how these systems were designed.
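
One workaround I could imagine, sketched here and untested against a
real installation: skip -A entirely and run the expunge once per user with
-u, feeding it only the users that actually live on this machine. The
function name expunge_local_users, the DOVEADM variable, and the idea of a
per-host user list on stdin are all assumptions of this sketch, not
anything dovecot provides. It also bypasses the expires table, so every
local user's mailbox gets opened, which may or may not turn out cheaper.

```shell
#!/bin/sh
# Sketch only: per-user expunge for users local to this host.
# DOVEADM can be overridden (assumption; defaults to the path used above).
DOVEADM="${DOVEADM:-/usr/bin/doveadm}"

# expunge_local_users MAILBOX AGE
# Reads one username per line from stdin and runs
#   doveadm expunge -u USER mailbox MAILBOX savedbefore AGE
# for each of them. Blank lines are skipped.
expunge_local_users() {
    mailbox="$1"
    age="$2"
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        # </dev/null keeps the command from consuming the user list.
        "$DOVEADM" expunge -u "$user" mailbox "$mailbox" savedbefore "$age" </dev/null
    done
}
```

It would be called as something like
"expunge_local_users Spam 7d < local-users.txt", where local-users.txt is
a hypothetical file listing this host's users, one per line.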
thanks for any ideas, tips etc.
micah