Jeff Kletsky
2015-Feb-12 06:21 UTC
Processing Maildir contents on message-by-message basis
I (finally) moved over to Maildir storage here and would like to implement some "scripts" to act on emails that I have manually identified as misclassified as spam/ham. I read through the Dovecot 2 description of how it works to see how it interacts with other processes changing the files, and I'm concerned that I would corrupt the message indexes if I just go hog-wild and run the scripts on the filesystem, rather than through Dovecot in some way.

The types of actions taken would likely be:

* Select a message from a given mailbox (the "source")
* Potentially modify it drastically (remove spamassassin markup, for example)
* Pipe the modified message to a mail-delivery agent (still running procmail here) and/or to sa-learn
* Assuming successful completion of the pipe action(s), remove the source message from the mailbox

While I can use doveadm to do bulk move/delete actions, I don't see a clear way to iterate through a set of messages and perform actions on them.

First off, if I wrangle and mangle the message files directly, do I have to worry about the indexes, or do the indexes "magically" repair themselves in cases where the messages are either altered (including headers) or removed?

Have I missed a way to iterate over messages and process them with external tools using the Dovecot tools?

Has this been discussed ad nauseam and I somehow missed it? (If so, I apologize profusely.)

Thanks,

Jeff
Steffen Kaiser
2015-Feb-12 08:19 UTC
Processing Maildir contents on message-by-message basis
On Wed, 11 Feb 2015, Jeff Kletsky wrote:

> First off, if I wrangle and mangle the message files directly, do I have to
> worry about the indexes, or do the indexes "magically" repair themselves in
> cases where the messages are either altered (including headers) or removed?

1) Never ever modify a message in place on the file system.

2) You can remove and add messages with no problem; the next time the mailbox is accessed, the indexes are repaired.

3) You can move a message to ../tmp (that counts as a remove as far as the indexes are concerned), change the message there, and modify the filename a bit, just to be sure. For example, I add a counter after the hostname part:

1222364652.P11383Q0M620284.<hostname><counter>,S=7215,W=7294:2,

Then adjust S= and W=, and finally move the message back into '.../new' or '.../cur'. That way the message is seen as a new one (an add). If you do so and you have more than 26 keywords in the mailbox, the 27th and up are lost, because they cannot be tagged on the filename itself. The indexes are repaired as well.

-- 
Steffen Kaiser
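
The move-to-tmp procedure in point 3 could be scripted roughly as follows. This is only a minimal sketch, not a tested tool: the function names (sizes, renamed, rework) and the transform callback are hypothetical, and it assumes that S= holds the file size in bytes and W= the size with every line ending counted as CRLF, which is how Dovecot's Maildir filename fields are generally documented. It also assumes nothing else touches the file while the script runs.

```python
import os
from pathlib import Path


def sizes(data: bytes) -> tuple[int, int]:
    """Return (S, W): size as stored, and size with all line endings as CRLF."""
    s = len(data)
    bare_lf = data.count(b"\n") - data.count(b"\r\n")  # LFs lacking a CR
    return s, s + bare_lf


def renamed(filename: str, counter: int, data: bytes) -> str:
    """New filename: counter appended after the hostname part, S=/W= recomputed."""
    base, sep, flags = filename.partition(":2,")  # flags stay untouched
    fields = base.split(",")
    unique = fields[0] + str(counter)
    s, w = sizes(data)
    kept = [f for f in fields[1:] if not f.startswith(("S=", "W="))]
    return ",".join([unique, *kept, f"S={s}", f"W={w}"]) + sep + flags


def rework(maildir: Path, subdir: str, filename: str, transform, counter: int = 1) -> Path:
    """Move a message to tmp/, rewrite it there, and re-add it under a new name."""
    tmp_old = maildir / "tmp" / filename
    os.rename(maildir / subdir / filename, tmp_old)  # a "remove" for the indexes
    data = transform(tmp_old.read_bytes())           # e.g. strip spam markup
    new_name = renamed(filename, counter, data)
    tmp_new = maildir / "tmp" / new_name
    tmp_new.write_bytes(data)
    tmp_old.unlink()
    dest = maildir / subdir / new_name
    os.rename(tmp_new, dest)                         # an "add" for the indexes
    return dest
```

A call such as rework(Path("/path/to/Maildir"), "cur", name, strip_markup) would then return the path of the re-added message, where strip_markup is whatever bytes-to-bytes function does the cleanup. The keyword caveat above still applies: only what is encoded in the filename survives the round trip.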