Kevin Duraj
2007-Aug-01 22:09 UTC
[Xapian-discuss] Xapian based spam filter using Bayesian algorithm.
Hi,

I am building a Xapian-based spam filter using a Bayesian algorithm. The plan is to build two separate search engines, one for the spam corpus and one for the ham corpus, so that the filter can efficiently determine whether a message is spam or ham. Let me know if there is already a spam filter implementation using Xapian, thanks.

Bayesian algorithm ...

p = Spam probability of a term
s = Number of occurrences in Spam Corpus
m = Number of messages in Spam Corpus
h = Number of occurrences in Ham Corpus
n = Number of messages in Ham Corpus

                 (s / m)
p = -----------------------------
    ( (s / m) + ( (h * 2) / n ) )

--
Cheers,
Kevin Duraj
http://pacificair.com
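
For reference, the formula above can be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not an existing implementation: the function name term_spam_probability is hypothetical, and the 0.5 fallback for a term seen in neither corpus is an assumption that the formula itself does not specify.

```python
def term_spam_probability(s, m, h, n):
    """Per-term spam probability, following the formula in the post.

    s: number of occurrences of the term in the spam corpus
    m: number of messages in the spam corpus
    h: number of occurrences of the term in the ham corpus
    n: number of messages in the ham corpus
    """
    if m <= 0 or n <= 0:
        raise ValueError("both corpora must contain at least one message")

    spam_rate = s / m
    # Ham occurrences are counted twice, as in the (h * 2) term of the formula.
    ham_rate = (h * 2) / n

    denominator = spam_rate + ham_rate
    if denominator == 0:
        # Assumption: a term unseen in both corpora is treated as neutral.
        return 0.5
    return spam_rate / denominator


if __name__ == "__main__":
    # A term occurring 30 times in 100 spam messages and 5 times in 100 ham messages.
    print(term_spam_probability(30, 100, 5, 100))
```

With one Xapian database per corpus, the counts could plausibly be read from each database: Database.get_doccount() for m and n, and either Database.get_collection_freq(term) (total occurrences) or Database.get_termfreq(term) (number of documents containing the term) for s and h. Which of the two matches "number of occurrences" here is an assumption to check against the intended algorithm.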