Guillaume Differenthink
2007-Sep-07 14:21 UTC
[Ferret-talk] Custom Analyser .. where to put it ??
Hi,

I'm trying to use a custom analyser to add my French stop words. I'm reading the tutorial at:
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage

My problem is that I have no idea where to put my custom analyser class, for example:

    class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer
      include Ferret::Analysis

      def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
        @stop_words = stop_words
      end

      def token_stream(field, str)
        StemFilter.new(
          StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words),
          'de'
        )
      end
    end

Any clue?

Thanks a lot,
Guillaume.
--
Posted via http://www.ruby-forum.com/.
Hey,

You mean where to place it in your directory structure? I put mine in lib/, but any place is fine, maybe model/. Just make sure it's in the Rails load path. :-)

Ben
Hi,

#{RAILS_ROOT}/lib is a good place for things like this. If you name your file correctly, i.e. german_stemming_analyzer.rb, Rails will auto-load it when you use the class name.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
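To make the naming rule above concrete, here is a minimal sketch of how a class name maps to the file name that auto-loading looks for. The helper `expected_file_name` is hypothetical and only approximates ActiveSupport's `String#underscore`; it ignores namespaces and special acronym rules.

```ruby
# Simplified stand-in for ActiveSupport's String#underscore (an assumption:
# the real method also handles namespaces and configured acronyms).
def expected_file_name(class_name)
  snake = class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split runs of capitals, e.g. "XMLParser"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split at lower-to-upper boundaries
    .downcase
  "#{snake}.rb"
end

puts expected_file_name("GermanStemmingAnalyzer")
puts expected_file_name("FrenchStemmingAnalyzer")
```

By the same convention, a class called FrenchStemmingAnalyzer would be expected in lib/french_stemming_analyzer.rb.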
Guillaume Differenthink
2007-Sep-07 15:27 UTC
[Ferret-talk] Custom Analyser .. where to put it ??
Thanks, I'll try. :)
Guillaume Differenthink
2007-Sep-09 22:09 UTC
[Ferret-talk] Custom Analyser .. where to put it ??
Hi,

I've tried, but without success. Here is what I did; please tell me if something is wrong.

In my class:

    acts_as_ferret({ :fields => {
      :nom            => {},
      :description    => {:boost => 0},
      :logiciel_nom   => {},
      :logiciel_id    => {},
      :difficulte_id  => {},
      :systeme_nom    => {},
      :fai_nom        => {},
      :fai_id         => {},
      :lversion_id    => {},
      :site_nom       => {},
      :siteutilise_id => {},
      :nom_for_sort   => {:index => :untokenized},
      :note           => {:index => :untokenized},
      :visions_count  => {:index => :untokenized},
      :nb_vu          => {:index => :untokenized},
      :date_sort      => {:index => :untokenized}
    }}, :analyzer => FrenchStemmingAnalyzer.new)

My FrenchStemmingAnalyzer.rb (in /lib):

    class FrenchStemmingAnalyzer < Ferret::Analysis::Analyzer
      include Ferret::Analysis

      def initialize(stop_words = FULL_FRENCH_STOP_WORDS)
        @stop_words = stop_words
      end

      def token_stream(field, str)
        StemFilter.new(
          StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words),
          'fr'
        )
      end
    end

There is no error. I erased the /index/ directory from the previous Ferret analysis, restarted, and ran a search, but I still get the same results as before, even when searching only for stop words.

I do a fuzzy search with Ferret (via find_by_content). Could that be the cause of the problem? Or maybe there is an error in the code?

Thank you for helping,
Guillaume.
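For reference, the behaviour expected from the filter chain can be sketched in plain Ruby. This is only an illustration under stated assumptions: `analyze` and `SAMPLE_FRENCH_STOP_WORDS` are hypothetical stand-ins, not Ferret's classes or its FULL_FRENCH_STOP_WORDS list, and stemming is left out. If the analyser is really applied, a query made up only of stop words should reduce to no terms at all.

```ruby
# Plain-Ruby stand-ins for the tokenizer -> lowercase -> stop-word chain.
# These are NOT Ferret's classes, and the stemming step is omitted.
SAMPLE_FRENCH_STOP_WORDS = %w[le la les de des un une et].freeze # hypothetical list

def analyze(text, stop_words = SAMPLE_FRENCH_STOP_WORDS)
  text.scan(/[[:alpha:]]+/)                     # split the text into word tokens
      .map(&:downcase)                          # roughly what LowerCaseFilter does
      .reject { |t| stop_words.include?(t) }    # roughly what StopFilter does
end

p analyze("Le guide de la configuration")
p analyze("le la les")
```

The second call returns an empty list, which is the outcome a stop-word-only search should produce once the custom analyser is in effect.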
The syntax for the acts_as_ferret options is a bit odd. The call you posted should be written as follows; notice the extra {} braces around :analyzer => FrenchStemmingAnalyzer.new:

    acts_as_ferret({ :fields => {
      :nom            => {},
      :description    => {:boost => 0},
      :logiciel_nom   => {},
      :logiciel_id    => {},
      :difficulte_id  => {},
      :systeme_nom    => {},
      :fai_nom        => {},
      :fai_id         => {},
      :lversion_id    => {},
      :site_nom       => {},
      :siteutilise_id => {},
      :nom_for_sort   => {:index => :untokenized},
      :note           => {:index => :untokenized},
      :visions_count  => {:index => :untokenized},
      :nb_vu          => {:index => :untokenized},
      :date_sort      => {:index => :untokenized}
    }}, { :analyzer => FrenchStemmingAnalyzer.new })

--
Posted via http://www.ruby-forum.com/.
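A small sketch may help show why the position of the :analyzer key matters. It assumes, as the call with two hashes suggests, that acts_as_ferret takes the general options first and the Ferret options second. `acts_as_ferret_stub` is a hypothetical stand-in that only records where each key lands; it is not the plugin's code.

```ruby
# Hypothetical stub, NOT the plugin's implementation: it only mirrors an
# assumed two-hash signature (general options first, Ferret options second).
def acts_as_ferret_stub(options = {}, ferret_options = {})
  { :options => options, :ferret_options => ferret_options }
end

# :analyzer placed inside the first hash ends up among the general options.
nested = acts_as_ferret_stub({ :fields => { :nom => {} }, :analyzer => :my_analyzer })

# :analyzer passed as its own second hash ends up among the Ferret options.
split = acts_as_ferret_stub({ :fields => { :nom => {} } }, { :analyzer => :my_analyzer })

p nested[:ferret_options].key?(:analyzer)
p split[:ferret_options].key?(:analyzer)
```

Under that assumption, only the second form hands the analyser to the part of the configuration that Ferret itself reads.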