I would like to announce a new module for Ruby called Classifier. It is available from:

http://rubyforge.org/projects/classifier/

or simply

gem install classifier

With it, you can do things like:

==
require 'classifier'
b = Classifier::Bayes.new 'Interesting', 'Uninteresting' # supports any number of categories of any name
b.train_interesting "here are some good words. I hope you love them"
b.train_uninteresting "here are some bad words, I hate you"
b.classify "I hate bad words and you" # returns 'Uninteresting'
==

Or if you would like persistence:

==
require 'classifier'
require 'madeleine'
m = SnapshotMadeleine.new("bayes_data") {
  Classifier::Bayes.new 'Interesting', 'Uninteresting'
}

m.system.train_interesting "here are some good words. I hope you love them"
m.system.train_uninteresting "here are some bad words, I hate you"
m.take_snapshot
m.system.classify "I love you" # returns 'Interesting'
==

Please send me any feedback about this library, including how you plan to use it or extend it.

Currently, I am working with Tobias to integrate it into Typo for comment and trackback spam blocking.

Can anyone think of a good reason to include it in Rails for automatic tagging functionality?

Thank you!
-Lucas Carlson
http://rufy.com/
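For readers wondering what a Bayesian classifier of this kind does with the training text, here is a minimal sketch in plain Ruby. It is not the Classifier gem's implementation: the class name TinyBayes, its method names, the tokenizer and the add-one smoothing are all assumptions made for illustration. It only shows the general idea of counting words per category and picking the category with the highest log-probability.

```ruby
# Toy multinomial naive Bayes with add-one smoothing.
# TinyBayes is a hypothetical name; this is NOT the Classifier gem's code.
class TinyBayes
  def initialize(*categories)
    # category name => (word => number of times seen in training)
    @counts = categories.to_h { |c| [c, Hash.new(0)] }
    # category name => number of training texts seen
    @docs = Hash.new(0)
  end

  def train(category, text)
    tokenize(text).each { |w| @counts.fetch(category)[w] += 1 }
    @docs[category] += 1
  end

  # Log-probability score per category; higher means more likely.
  def scores(text)
    vocab = @counts.values.flat_map(&:keys).uniq.size
    total_docs = @docs.values.sum
    words = tokenize(text)
    @counts.to_h do |category, counts|
      total = counts.values.sum
      prior = Math.log((@docs[category] + 1.0) / (total_docs + @counts.size))
      likelihood = words.sum { |w| Math.log((counts[w] + 1.0) / (total + vocab)) }
      [category, prior + likelihood]
    end
  end

  def classify(text)
    scores(text).max_by { |_, score| score }.first
  end

  private

  # Assumed tokenizer: lowercase runs of letters and apostrophes.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

b = TinyBayes.new("Interesting", "Uninteresting")
b.train "Interesting", "here are some good words. I hope you love them"
b.train "Uninteresting", "here are some bad words, I hate you"
b.classify "I hate bad words and you" # => "Uninteresting"
b.classify "I love you"               # => "Interesting"
```

With only two training sentences the scores are driven by a handful of words ("hate", "bad", "love"), so this is a toy; a real filter needs far more training data.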
On Apr 11, 2005 1:17 PM, Lucas Carlson <rails-1eRuzFDw/cg@public.gmane.org> wrote:
> Currently, I am working with Tobias to integrate it into Typo for
> comment and trackback SPAM blocking.
>
> Can anyone think of a good reason to include it into Rails for
> automatic tagging functionalities?

I'm wondering if this could be useful for 43things to help group similar goals?

--
Urban Artography
http://artography.ath.cx
Woah, this looks cool! Do you have any performance statistics on this? How well does it perform with different "word-database" sizes?

I'm looking for a spam-filtering mechanism that is easy to integrate. We are currently using SpamAssassin, but it's hell to "integrate"... So maybe I'll go ahead and try to put this to use in a mail filter.

Kind Regards,
Flurin

_______________________________________________
Rails mailing list
Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org
http://lists.rubyonrails.org/mailman/listinfo/rails
I don't have any performance stats on it yet, but if you do use it, please send me any comments you have. If you want to patch anything, the Subversion repository is at:

http://rufy.com/classifier-svn/trunk/

-Lucas
http://www.rufy.com/

On Apr 11, 2005, at 1:47 PM, Flurin Egger wrote:
> Woah, this looks cool! Do you have any performance statistics on this?
> How well does it go on different "word-database" sizes?
This is really cool. I was reading http://joelonsoftware.com/articles/FogBugzII.html the other day, and they explain how they use their classifier to figure out which type of ticket or request an incoming mail is: is it spam, is it about billing, service, etc. Glad that you were able to package something like this up.

One thing that I'm curious about: does it handle the spam case differently from the subgroups? That is, it seems like it's important to first distinguish spam from non-spam (or ham, I guess) before trying to classify it further. The paper referenced above talks about a tournament algorithm to work it out.

-w

On 4/11/05, Lucas Carlson <rails-1eRuzFDw/cg@public.gmane.org> wrote:
> I don't have any performance stats on it yet, but if you do use it,
> please send me any comments you have. If you want to patch anything,
> the subversion repository is at:
>
> http://rufy.com/classifier-svn/trunk/

--
Will Schenk
http://www.sublimeguile.com
http://www.myelinate.com
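The "spam first, subgroups second" idea raised above can be sketched as two classifiers chained together. This is only one possible arrangement, written in plain Ruby under stated assumptions: it is not the tournament algorithm from the linked article, and it is not something the Classifier library is known to do. ToyBayes and TwoStageFilter are hypothetical names, and ToyBayes stands in for any object that offers train and classify.

```ruby
# Hypothetical toy classifier (add-one smoothed word counts, no priors).
# It is a stand-in for a real classifier, not the Classifier gem itself.
class ToyBayes
  def initialize(*categories)
    @counts = categories.to_h { |c| [c, Hash.new(0)] }
  end

  def train(category, text)
    words(text).each { |w| @counts.fetch(category)[w] += 1 }
  end

  def classify(text)
    vocab = @counts.values.flat_map(&:keys).uniq.size
    tokens = words(text)
    scores = @counts.to_h do |category, counts|
      total = counts.values.sum
      [category, tokens.sum { |w| Math.log((counts[w] + 1.0) / (total + vocab)) }]
    end
    scores.max_by { |_, score| score }.first
  end

  private

  def words(text)
    text.downcase.scan(/[a-z']+/)
  end
end

# Stage 1 (gate) separates spam from ham; stage 2 (router) only sees ham.
class TwoStageFilter
  def initialize(*ham_categories)
    @gate = ToyBayes.new("Spam", "Ham")
    @router = ToyBayes.new(*ham_categories)
  end

  def train_spam(text)
    @gate.train("Spam", text)
  end

  # Every legitimate message trains both the gate and the router.
  def train_ham(category, text)
    @gate.train("Ham", text)
    @router.train(category, text)
  end

  def classify(text)
    return "Spam" if @gate.classify(text) == "Spam"

    @router.classify(text)
  end
end

filter = TwoStageFilter.new("Billing", "Service")
filter.train_spam "cheap pills buy now, cheap watches"
filter.train_ham "Billing", "my invoice shows the wrong amount, please refund"
filter.train_ham "Service", "the server is down and I cannot log in"

filter.classify "buy cheap pills"          # => "Spam"
filter.classify "please refund my invoice" # => "Billing"
```

The point of the split is that spam never trains or reaches the router, so the billing and service vocabularies are not polluted by spam words. Whether this beats a single classifier with a "Spam" category is an open question that would need real mail to answer.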