thr3ads.net - search: "stopfilter"

Displaying 20 results from an estimated 22 matches for "stopfilter".

2007 Sep 07

Custom Analyser .. where to put it ??

...oblem is that I've no idea where to put my custom Analyser class like : class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = FULL_GERMAN_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words), 'de') end end Any clue ? Thanks a lot Guillaume. -- Posted via http://www.ruby-forum.com/.

2006 Oct 23

Trouble with custom Analyzer

Hi! I wanted to build my own custom Analyzer like so: class Analyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, string) StopFilter.new(LetterTokenizer.new(string, true), @stop_words) end end As one can easily spot, I essentially want a LetterAnalyzer with stop word filtering. However, using that analyzer (for indexing) results in a segmentation fault. /opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferre...

2006 Nov 02

Indexing and searching across multiple locales

Hi - I'm currently investigating support for Ferret and content that spans multiple locales. I am particularly interested in using stemming and fuzzy searches (e.g. with slop factor) across multiple locales. So far I've followed the online docs for implementing a Stemming Analyzer, and it is working for English terms just fine. I've also written a method to import data

2007 Jan 11

stop words in query

...llo all, Quick question, I'm using AAF and the following custom analyzer: class StemmedAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)) end However when my search term includes a stop word I never get any results back. Once I remove the stop word I get the normal results back. Do I need to do a search of my query for stop words and remove them myself? Or is th...
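
The excerpt above does not settle whether Ferret needs the query cleaned by hand. If one did choose to strip stop words from a query string before searching, a minimal plain-Ruby sketch might look like the following. The `SAMPLE_STOP_WORDS` list and the `strip_stop_words` helper are hypothetical names made up for illustration; they are not part of Ferret or acts_as_ferret.

```ruby
# Hypothetical, abbreviated stop-word list for illustration only;
# it is NOT Ferret's ENGLISH_STOP_WORDS constant.
SAMPLE_STOP_WORDS = %w[a an and of the to].freeze

# Returns the query with stop words removed (case-insensitive match),
# keeping the remaining terms in their original order.
def strip_stop_words(query, stop_words = SAMPLE_STOP_WORDS)
  query.split.reject { |term| stop_words.include?(term.downcase) }.join(' ')
end

puts strip_stop_words('The history of Ruby')
```

The cleaned string could then be handed to whatever search call the application already uses; the sketch deliberately leaves that part out.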

2005 Nov 17

indexing source code

Hi again, I'm using ferret to index source code - DamageControl will allow users to search for text in source code. Currently I'm using the default index with no custom analyzer (I'm using the StandardAnalyzer). Do you have any recommendations about how to write an analyzer that will index source code in a more 'optimal' way? I.e. disregard common

2008 May 12

Using StemFilter with PhraseQuery

...t I'm expecting I could parse the phrase and build up a query to be used by QueryParser but I'd like a more succinct solution for now. I use a StemFilter in my analyzer as follows: def token_stream(field, str) ... ts = LowerCaseFilter.new(ts) if @lower ts = StopFilter.new(ts, @stop_words) ts = StemFilter.new(ts) ... end My use of PhraseQuery is as follows: def generate_query(phrase) phrase = phrase.downcase phrase_parts = phrase.split(' ') query = Ferret::Search::PhraseQuery.new(:content, 2) phrase_parts.each...

2007 Jan 21

A few questions: Tweaking StemFilter, indexes, ...

...ns that I haven't been able to figure out after messing around with ferret and going through the documentation. StemFilter ------ I am trying to improve the quality of my searches in the context of the content of my application. I have created an analyzer using the following: StemFilter.new StopFilter.new( LowerCaseFilter.new(StandardTokenizer.new(text)), @stop_words ) This has been pretty good so far; however, I really would like a search for "plumber" to match "plumbing", at maybe a lower score than it would match "plumbers". The thing is that plumber(s) is fi...

2007 Nov 09

Problem with stemming and AAF

...follows: require 'rubygems' require 'ferret' class StemmedAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)) end end And added the call to the analyzer in my model file: acts_as_ferret( :fields => { :name => { :boost => 1, :store => :yes }, :product_number => { :boost => ...

2007 Nov 13

acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)

...e, :fields => { :name => {:store => :yes}} } , {:analyzer => PlainAsciiAnalyzer.new} ) end ANALYZER lib : plain_ascii_analyzer.rb class PlainAsciiAnalyzer < ::Ferret::Analysis::Analyzer include ::Ferret::Analysis def token_stream(field, str) StopFilter.new( StandardTokenizer.new(str) , ["fax", "gsm"] ) # raise <<<----- is never executed when uncommented !! end end In the console, I rebuild the index + search for a stop word => I get results, when I should not : &...

2007 Jun 25

Ignore apostrophes in words

Hi, I just started using ferret and the aaf plugin and it seems to work quite nicely. However, my fields are very short (titles of music) and I don't think many users will be typing in apostrophes when they are looking for something. Right now, for a simple document such as "what i've done" I'd like it to be indexed as "what ive done" instead. Right
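
One way to picture the normalization asked for above is to strip apostrophes from the text before it reaches the index. The following is only a plain-Ruby sketch; `strip_apostrophes` is a hypothetical helper, not a Ferret or acts_as_ferret API, and where to hook it into the indexing pipeline is left open.

```ruby
# Removes straight (') and typographic (U+2019) apostrophes,
# so that "i've" is rewritten as "ive". Hypothetical helper.
def strip_apostrophes(text)
  text.delete("'\u2019")
end

puts strip_apostrophes("what i've done")
```

For searches to match, the same normalization would presumably have to be applied to incoming queries as well as to indexed text.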

2006 Dec 08

Using custom stem analyzer giving mongrel errors

...ustom stem analyzer: require 'rubygems' require 'ferret' include Ferret module Ferret::Analysis class FerretAnalyzer def initialize(stop_words = FULL_ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, text) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(text)), @stop_words)) end end end and I'm simply setting the :analyzer option in AAF. However, I get odd behavior. The first search that I do will go through and display the proper results, but any subsequent request starts to produce od...

2006 Jul 26

tweaking minimum word length?

Hi, Can Ferret be configured to change the minimum word length of what it indexes? Right now it seems to drop words 3 characters or less, but I''d like to include words going down to 2 characters. How would I do that? Francis

2006 Dec 06

AAF - Stem Analyzer

I'm not on AAF. Can someone else help Raymond with an example? On 12/6/06, Raymond O'connor <nappin713 at yahoo.com> wrote: > > Matt Schnitz wrote: > > You also need to stem-analyze the incoming query. > > > > I had this same problem. :^> > > > > > > Schnitz > > Do you have an example of how to do this? I'm using

2006 Sep 22

Query Objects vs. Query Strings

Hi .. I tried to build some query objects to get some documents from my index.. without success.. Is something wrong here? q = Ferret::Search::BooleanQuery.new q1 = Ferret::Search::TermQuery.new(:type, "movie") q2 = Ferret::Search::TermQuery.new(:name, "Indiana") q.add_query(q1, :should) q.add_query(q2, :should) Indexer.index.search_each(q) do |doc, score| puts doc end 0

2006 Aug 18

Portuguese Stemming

Today while compiling ferret I noticed there was a Portuguese stemmer being compiled. How do I enable its use for my index? Pedro.

2007 Jul 14

performance bottleneck

I have got my database in MySQL. I used ferret to index a table with 10 million rows. When I limit the selection to an initial retrieval of 1000 rows, it takes 200 seconds, but for the whole table it took more than four hours, after which I had to close my indexing application. I used the StandardAnalyzer for it. There is no problem from the database side as retrieval of all the data in the table

2007 Mar 01

Need help creating my own Filter in Ruby

Hi, I posted a Trac ticket about it, but I thought I'd ask the mailing list to reach more people. I'm using these filters together in my analyzer (with acts_as_ferret + Ferret 0.11.1). HyphenFilter.new( StopFilter.new( LowerCaseFilter.new( MappingFilter.new( StandardTokenizer.new(str), mapping)), FULL_FRENCH_STOP_WORDS + FULL_ENGLISH_STOP_WORDS) ) The mapping filter maps pretty much all the f...

2007 Jul 07

Extending/Modifying QueryParser

...initialize(synonym_engine, stop_words = FULL_ENGLISH_STOP_WORDS, lower = true) @synonym_engine = synonym_engine @lower = lower @stop_words = stop_words end def token_stream(field, str) ts = StandardTokenizer.new(str) ts = LowerCaseFilter.new(ts) if @lower ts = StopFilter.new(ts, @stop_words) ts = SynonymTokenFilter.new(ts, @synonym_engine) end end class SynonymTokenFilter < Ferret::Analysis::TokenStream include Ferret::Analysis def initialize(token_stream, synonym_engine) @token_stream = token_stream @synonym_stack = [] @synonym_en...

2007 Apr 08

Ferret and non latin characters support

I've successfully installed ferret and acts_as_ferret and have no problem with utf-8 for accented characters. It returns correct results for e.g. français. My problem is with non-Latin characters (Persian indeed). I have tested different locales with no success both on Debian and Mac. Any idea? (ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6) -- Posted via http://www.ruby-forum.com/.

2006 Nov 25

Metaphone analysis

...module Analysis class MetaphoneAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(version = :double, stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words @version = version end def token_stream(field, str) MetaphoneFilter.new(StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)), @version) end end end end I saved both of these files, 'metaphone_filter.rb' and 'metaphone_analyzer.rb' to RAILS_ROOT/extras. Next I added the following line to my 'config/enviro...
