Karsten Bräckelmann
2009-Sep-01 20:20 UTC
[Dovecot] antispam-plugin 1.2 and trailing carriage-returns
Guys,

Dovecot 1.0.15 [1], just built the latest antispam-plugin 1.2 (tarball) for testing, with the mailtrain backend for SA integration. Both built from custom spec files.

The mail that is being trained differs from its respective source in the mbox file. The trained one shows added, trailing carriage-return chars on all headers, which are not present in the headers in the mbox file.

This breaks sa-learn: the two variations are different, and SA would learn *both* when run against each one separately.

How come? Any insight? How could I fix this, other than wrapping sa-learn inside another shell script and having sed strip off the noise? This becomes more of an issue once I switch from sa-learn to the lightning-fast spamc training variant.

TIA

guenther


[1] Yes, I know, sorry. Don't want to change everything at the same time, and the target system I'm experimenting for runs that version, too.

-- 
char *t="\10pse\0r\0dtu\0. at ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Karsten Bräckelmann
2009-Oct-17 21:52 UTC
[Dovecot] antispam-plugin 1.2 and trailing carriage-returns
*nudge* Anyone? Since Timo seems to be on a list processing spree lately, here's hoping. :)

On Tue, 2009-09-01 at 22:20 +0200, Karsten Bräckelmann wrote:
> Guys,
>
> Dovecot 1.0.15 [1], just built the latest antispam-plugin 1.2 (tarball)
> for testing, mailtrain backend for SA integration. Both built from
> custom spec files.
>
> The mail that is being trained is different than its respective source
> in the mbox file. The trained one shows added, trailing carriage-return
> chars for all headers, which are not in the headers in the mbox file.
>
> This breaks sa-learn -- both these variations are different, and SA
> would learn *both* when run against each one separately.
>
> How comes? Any insight? How could I fix this, other than wrapping the
> sa-learn inside another shell script and have sed strip off the noise?
> This becomes more of an issue, once I switch from sa-learn to the
> lightning-fast spamc training variant.
>
> TIA
>
> guenther
>
>
> [1] Yes, I know, sorry. Don't want to change everything at the same
> time, and the target system I'm experimenting for runs that version,
> too.

-- 
char *t="\10pse\0r\0dtu\0. at ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Timo Sirainen
2009-Oct-27 23:28 UTC
[Dovecot] antispam-plugin 1.2 and trailing carriage-returns
On Tue, 2009-09-01 at 22:20 +0200, Karsten Bräckelmann wrote:
> The mail that is being trained is different than its respective source
> in the mbox file. The trained one shows added, trailing carriage-return
> chars for all headers, which are not in the headers in the mbox file.
>
> This breaks sa-learn -- both these variations are different, and SA
> would learn *both* when run against each one separately.
>
> How comes? Any insight?

Probably because incoming mails have CRLF linefeeds. The antispam plugin could drop these by wrapping the input stream returned by mail_get_stream() in i_stream_create_lf().
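For illustration only, here is a minimal standalone C sketch of the transformation being discussed: dropping the CR from each CRLF pair so that the trained copy matches the LF-only mbox source. This is not Dovecot's i_stream_create_lf() and not the plugin's code; the names strip_crlf() and filter_stream() are hypothetical, and keeping a lone CR (one not followed by LF) is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Dovecot or the antispam plugin.
 * Copies len bytes from in to out, dropping every CR that is directly
 * followed by LF. A lone CR is kept. out must have room for len bytes.
 * Returns the number of bytes written. */
size_t strip_crlf(const char *in, size_t len, char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\r' && i + 1 < len && in[i + 1] == '\n')
            continue;               /* skip the CR of a CRLF pair */
        out[n++] = in[i];
    }
    return n;
}

/* Streaming variant of the same idea: reads from in, writes to out.
 * Returns 0 on success, -1 if either stream reported an error. */
int filter_stream(FILE *in, FILE *out)
{
    int c;
    int pending_cr = 0;             /* saw a CR, waiting for next byte */

    while ((c = getc(in)) != EOF) {
        if (pending_cr) {
            if (c != '\n')
                putc('\r', out);    /* the CR was not part of CRLF */
            pending_cr = 0;
        }
        if (c == '\r') {
            pending_cr = 1;
            continue;
        }
        putc(c, out);
    }
    if (pending_cr)
        putc('\r', out);            /* trailing lone CR at end of input */

    return (ferror(in) || ferror(out)) ? -1 : 0;
}
```

Calling filter_stream(stdin, stdout) from a small main() would give a filter that could sit in front of sa-learn as a stopgap, much like the sed wrapper mentioned in the original question. Doing the conversion inside the plugin, as suggested above, avoids the extra process.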