I'm trying to do a rewrite of the dspam_plugin for dovecot 1.1b1. There are some API changes that warranted an update of the plugin. Also, I wanted the dspam_plugin to be able to handle pristine mails for dspam retraining, as opposed to the signature-based retraining.

Question: How can I retrieve the full Unix path for a specific mail?

The original code uses mail_get_first_header() to retrieve the signature header. I need something like mail_get_mail_file_path(&path) which I could then pass on to dspam. Is there some such function available?

Thanks in advance
/Lars
On Sat, 2007-09-29 at 12:44 +0200, Lars Stavholm wrote:
> I'm trying to do a rewrite of the dspam_plugin for dovecot 1.1b1.

Cool. I never imagined that the plugin would find such wide-spread use :)

> How can I retrieve the full unix path for a specific mail?
>
> The original code uses mail_get_first_header() to retrieve the
> signature header. I need something like mail_get_mail_file_path(&path)
> which I then could pass on to dspam. Is there some such function
> available?

I don't think you can, since mail might be stored in any kind of format like mbox, dbox, ... Only in maildir would this be possible. You can probably get the raw text of a message somehow, though I don't know how off-hand, and then write it to a temporary file. In any case, I suggest doing that only when no signature is available, and I still don't see how you would end up with mails w/o signature at all except maybe during conversion to a new dspam installation.

johannes
Johannes Berg wrote:
> On Sat, 2007-09-29 at 12:44 +0200, Lars Stavholm wrote:
>> I'm trying to do a rewrite of the dspam_plugin for dovecot 1.1b1.
>
> Cool. I never imagined that the plugin would find such wide-spread
> use :)

Well, it's only me, don't know if anyone else uses it. Still, I think it's a brilliant idea. Doesn't get any more user friendly.

>> How can I retrieve the full unix path for a specific mail?
>>
>> The original code uses mail_get_first_header() to retrieve the
>> signature header. I need something like mail_get_mail_file_path(&path)
>> which I then could pass on to dspam. Is there some such function
>> available?
>
> I don't think you can since mail might be stored in any kind of format
> like mbox, dbox, ... Only in maildir would this be possible. You can

OK, obviously I'm using Maildir format, but let's not restrict the functionality to that fact.

> probably somehow get the raw text of a message though, but I don't know
> off-hand, and then write it to a temporary file. In any case, I suggest

There you go, that's my solution then, should work with all storage formats.

> doing that only when no signature is available, and I still don't see
> how you would end up with mails w/o signature at all except maybe during
> conversion to a new dspam installation.

Who said anything about signatures not being available? As far as I can tell from my tests, the signatures are picked up nicely by the dspam plugin. However, I'm used to a dspam setup where TrainPristine=on, and the retraining/reclassification requires pristine mail sources, without the X-DSPAM-... stuff, including the signature.

So, basically, I would read the mail in error, be it spam or ham, and pipe it to the dspam client for retraining/reclassification. The --user option of dspam is used to point dspam to the correct user (since we don't have a signature).

I saw some mail_get_istream() or similar, that seems to be a way to open up some sort of byte stream reading the mail contents.
That might be what I'm looking for.

BTW, I would like to keep the previous functionality with the dspam plugin using signatures. In order to do that I need to be able to set dspam-plugin-specific options somehow. Any idea?

Cheers
/Lars
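For illustration, here is a minimal sketch of how the signature mode and the pristine mode could share one command-line builder. The flag names (--source=error, --class=..., --signature=..., --user) are assumed from the dspam command-line client and should be checked against the installed version. build_dspam_argv() and the enum are hypothetical names, not part of the plugin or of Dovecot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum retrain_class { RETRAIN_SPAM, RETRAIN_INNOCENT };

/*
 * Fill argv[] with a dspam retraining command line (hypothetical helper).
 * Returns the number of arguments written, not counting the terminating
 * NULL, or -1 if one of the buffers is too small.
 *
 * signature == NULL selects pristine mode: no --signature argument is
 * added, and the caller is expected to feed the whole message to dspam
 * on stdin. sigbuf is only used when a signature is given.
 */
int build_dspam_argv(const char *user, enum retrain_class cls,
                     const char *signature,
                     char *sigbuf, size_t sigbuf_len,
                     const char *argv[], size_t argv_len)
{
    size_t n = 0;

    assert(user != NULL && argv != NULL);
    if (argv_len < 7)           /* worst case: 6 arguments + NULL */
        return -1;

    argv[n++] = "dspam";
    argv[n++] = "--source=error";   /* assumed flag: retrain on error */
    argv[n++] = (cls == RETRAIN_SPAM) ? "--class=spam" : "--class=innocent";
    if (signature != NULL) {
        int len = snprintf(sigbuf, sigbuf_len, "--signature=%s", signature);
        if (len < 0 || (size_t)len >= sigbuf_len)
            return -1;              /* signature did not fit */
        argv[n++] = sigbuf;
    }
    argv[n++] = "--user";
    argv[n++] = user;
    argv[n] = NULL;
    return (int)n;
}
```

The resulting array is in the form execv() expects. In pristine mode the message itself would still have to be written to the child's stdin, which this sketch leaves out.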
Let's try to keep this on the list, shall we.

Fábio M. Catunda wrote:
> Lars Stavholm wrote:
>> Question:
>> How can I retrieve the full unix path for a specific mail?
>>
> I am trying the same thing, but I have a different idea.
>
> I want that when the user moves a message the transaction is logged into
> a file, then I can write external programs that will read this file and
> control dspam.

Have a look at Johannes Berg's reply above: the Maildir format would have a file for each mail message, other formats would not.

> My idea is based on the following:
> Q: Scalability: And if a user moves 400 messages to Spam folder at once?
> A: Well, it's not a problem to write 400 lines into a file, then the
> external program will control how much resources dspam can use to
> classify all those messages.

Sounds OK, not really a problem.

> Q: Why not a FIFO:
> A: In case of a crash I need to classify all messages anyway, so, a FIFO
> is not a good idea here. (I think so)

Didn't quite understand the reasoning there, but never mind, it's just me :)

> Q: And if the user moves a message from Spam folder to trash folder,
> will it be considered innocent?
> A: That's a big problem. When a message is deleted by a MUA it's usually
> copied from one folder to another and then deleted, but there is no
> default trash folder in imap, so, you have to be able to configure a lot
> of "possible trash folders" to ignore them, that's why I prefer to have an
> external program controlling dspam.

I don't see this as a problem at all (why create one when there's none to be found :):

* Move message into Spam: it's a spam that should be reclassified.
* Move message out of Spam: it's a ham that should be reclassified.
  Don't really care where the mail comes from or where it is moved.
  This is the beauty of it all, count the key or mouse clicks,
  can't be less than this :)
* Using the expire plugin, the Spam folder will be emptied
  automatically in due time (typically 30 days maybe) without
  user intervention.
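The move rule above can be sketched as a small decision function. This is only an illustration of the rule as stated: the function name, the enum, and passing the Spam folder name as a parameter are assumptions, and the folder comparison is a plain case-sensitive string match.

```c
#include <string.h>

enum retrain_action { RETRAIN_NONE, RETRAIN_AS_SPAM, RETRAIN_AS_HAM };

/*
 * Decide what a move between two folders means for retraining
 * (hypothetical helper):
 *   into the Spam folder   -> retrain as spam
 *   out of the Spam folder -> retrain as ham, whatever the destination
 *   anything else          -> nothing to do
 */
enum retrain_action classify_move(const char *src_folder,
                                  const char *dst_folder,
                                  const char *spam_folder)
{
    int from_spam = strcmp(src_folder, spam_folder) == 0;
    int to_spam = strcmp(dst_folder, spam_folder) == 0;

    if (to_spam && !from_spam)
        return RETRAIN_AS_SPAM;
    if (from_spam && !to_spam)
        return RETRAIN_AS_HAM;
    return RETRAIN_NONE;    /* Spam -> Spam, or Spam not involved */
}
```

Note that this treats a move from Spam to a trash folder as ham, exactly as the rule says. Ignoring trash folders would need an extra list of folder names, which is the configuration burden discussed in the quoted text.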
All close to zero maintenance for sysadmin as well as end-user.

> Well, by now I don't have much, but I really would like to know how to
> find the filename of a message being copied from a folder to another.

No such luck (I think): to my understanding (with the help of Johannes' previous reply in this thread), you'd have to create a temporary file with the mail message using tmpnam() + mail_get_stream() or similar, and then do your thing.

I'm aiming towards that exact functionality: I want to be able to do training using "pristine" source (so I'll need the whole message), and keep the previous functionality using signatures. We'll see how it goes.

Good luck and thanks for your input
/Lars
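A rough sketch of the temporary-file step, under two stated assumptions. First, it uses mkstemp() rather than tmpnam(), because mkstemp() creates and opens the file in one step and so avoids the race between picking a name and opening it. Second, a plain stdio FILE * stands in for the Dovecot message stream, since the exact stream-reading calls are not settled in this thread. spool_to_tempfile() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Copy everything readable from `in` into a newly created temporary file.
 * `path_template` must be a writable buffer ending in "XXXXXX"; mkstemp()
 * overwrites the X's with the generated name, so on success the buffer
 * holds the path that can be handed to an external program.
 * Returns 0 on success, -1 on error (the temporary file is removed).
 */
int spool_to_tempfile(FILE *in, char *path_template)
{
    char buf[4096];
    size_t n;
    int failed = 0;
    FILE *out;
    int fd = mkstemp(path_template);

    if (fd < 0)
        return -1;

    out = fdopen(fd, "w");
    if (out == NULL) {
        close(fd);
        unlink(path_template);
        return -1;
    }

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            failed = 1;
            break;
        }
    }
    if (ferror(in))
        failed = 1;
    if (fclose(out) != 0)       /* also flushes buffered data */
        failed = 1;

    if (failed) {
        unlink(path_template);
        return -1;
    }
    return 0;
}
```

A caller would pass something like a char array initialised to "/tmp/dspam-retrain-XXXXXX" (the prefix is arbitrary) and unlink() the file once dspam has finished with it.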