Displaying 20 results from an estimated 22 matches for "stopfilter".
2007 Sep 07
5
Custom Analyser .. where to put it ??
...oblem is that I've no idea where to put my custom Analyser class
like:
class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
      @stop_words), 'de')
  end
end
Any clue ?
Thanks a lot
Guillaume.
--
Posted via http://www.ruby-forum.com/.
2006 Oct 23
2
Trouble with custom Analyzer
Hi!
I wanted to build my own custom Analyzer like so:
class Analyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, string)
    StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
  end
end
As one can easily spot, I essentially want a LetterAnalyzer with stop
word filtering. However, using that analyzer (for indexing) results
in a segmentation fault.
/opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferre...
2006 Nov 02
3
Indexing and searching across multiple locales
Hi -
I'm currently investigating support for Ferret and content that spans
multiple locales. I am particularly interested in using stemming and fuzzy
searches (e.g. with slop factor) across multiple locales.
So far I've followed the online docs for implementing a Stemming Analyzer,
and it is working for English terms just fine. I've also written a method to
import data
2007 Jan 11
5
stop words in query
...llo all,
Quick question, I'm using AAF and the following custom analyzer:
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
      @stop_words))
  end
end
However when my search term includes a stop word I never get any results
back. Once I remove the stop word I get the normal results back. Do I
need to do a search of my query for stop words and remove them myself?
Or is th...
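As a rough illustration of the question above, the sketch below runs the same stop-word step over both the indexed text and the query, in plain Ruby, so a stop word in the query is dropped before any matching happens. It does not use Ferret: `analyze` and `STOP_WORDS` are hypothetical names, and the word list is an assumption, not Ferret's ENGLISH_STOP_WORDS.

```ruby
# Hypothetical stop list for illustration only; not Ferret's ENGLISH_STOP_WORDS.
STOP_WORDS = %w[a an and of the to].freeze

# Lowercase the text, split it into runs of letters, then drop stop words.
def analyze(text, stop_words = STOP_WORDS)
  text.downcase.scan(/[a-z]+/).reject { |token| stop_words.include?(token) }
end

indexed_terms = analyze("The History of the Plumber")
query_terms   = analyze("the plumber")

p indexed_terms  # => ["history", "plumber"]
p query_terms    # => ["plumber"]
```

Because both sides go through the same steps, the query never asks for a term that the index could not contain.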
2005 Nov 17
1
indexing source code
Hi again,
I'm using ferret to index source code - DamageControl will allow users
to search for text in source code.
Currently I'm using the default index with no custom analyzer (I'm
using the StandardAnalyzer). Do you have any recommendations about how
to write an analyzer that will index source code in a more 'optimal'
way? I.e. disregard common
2008 May 12
1
Using StemFilter with PhraseQuery
...t I'm expecting I could parse
the phrase and build up a query to be used by QueryParser but I'd like a
more succinct solution for now.
I use a StemFilter in my analyzer as follows:
def token_stream(field, str)
  ...
  ts = LowerCaseFilter.new(ts) if @lower
  ts = StopFilter.new(ts, @stop_words)
  ts = StemFilter.new(ts)
  ...
end
My use of PhraseQuery is as follows:
def generate_query(phrase)
  phrase = phrase.downcase
  phrase_parts = phrase.split(' ')
  query = Ferret::Search::PhraseQuery.new(:content, 2)
  phrase_parts.each...
2007 Jan 21
2
A few questions: Tweaking StemFilter, indexes, ...
...ns that I haven't been able to figure out after
messing around with ferret and going through the documentation.
StemFilter ------
I am trying to improve the quality of my searches in context of the
content of my application. I have created an analyzer using the
following:
StemFilter.new StopFilter.new(
LowerCaseFilter.new(StandardTokenizer.new(text)), @stop_words )
This has been pretty good so far; however, I would really like
a search for "plumber" to match "plumbing", perhaps at a lower score than it
would match "plumbers". The thing is that plumber(s) is fi...
2007 Nov 09
2
Problem with stemming and AAF
...follows:
require 'rubygems'
require 'ferret'
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
      @stop_words))
  end
end
And added the call to the analyzer in my model file:
acts_as_ferret( :fields => { :name => { :boost => 1,
:store => :yes },
:product_number => { :boost => ...
2007 Nov 13
8
acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)
...e, :fields => { :name => {:store => :yes}} } ,
{:analyzer => PlainAsciiAnalyzer.new}
)
end
ANALYZER
lib : plain_ascii_analyzer.rb
class PlainAsciiAnalyzer < ::Ferret::Analysis::Analyzer
  include ::Ferret::Analysis
  def token_stream(field, str)
    StopFilter.new(
      StandardTokenizer.new(str),
      ["fax", "gsm"]
    )
    # raise <<<----- is never executed when uncommented !!
  end
end
In the console, I rebuild the index + search for a stop word => I get
results when I should not:
&...
2007 Jun 25
4
Ignore apostrophes in words
Hi, I just started using ferret and the aaf plugin and it seems to work
quite nicely. However, my fields are very short (titles of music) and I
don't think many users will be typing in apostrophes when they are
looking for something. Right now, for a simple document such as "what
i've done" I'd like it to be indexed as "what ive done" instead. Right
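One way to picture the normalization asked for above is to strip apostrophes from the text before it is tokenized. This is only a plain-Ruby sketch: `strip_apostrophes` is a hypothetical helper, not part of Ferret or acts_as_ferret, and it assumes the same step would also be applied to incoming queries.

```ruby
# Hypothetical helper: delete ASCII apostrophes so "i've" and "ive"
# end up as the same text before tokenizing.
def strip_apostrophes(text)
  text.delete("'")
end

puts strip_apostrophes("what i've done")  # prints "what ive done"
```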
2006 Dec 08
4
Using custom stem analyzer giving mongrel errors
...ustom stem analyzer:
require 'rubygems'
require 'ferret'
include Ferret
module Ferret::Analysis
  class FerretAnalyzer
    def initialize(stop_words = FULL_ENGLISH_STOP_WORDS)
      @stop_words = stop_words
    end
    def token_stream(field, text)
      StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(text)),
        @stop_words))
    end
  end
end
and I'm simply setting the :analyzer option in AAF.
However, I get odd behavior. The first search that I do will go through
and display the proper results, but any subsequent request starts to
produce od...
2006 Jul 26
13
tweaking minimum word length?
Hi,
Can Ferret be configured to change the minimum word length of what it
indexes? Right now it seems to drop words 3 characters or less, but
I'd like to include words going down to 2 characters. How would I do
that?
Francis
2006 Dec 06
1
AAF - Stem Analyzer
I'm not on AAF. Can someone else help Raymond with an example?
On 12/6/06, Raymond O'connor <nappin713 at yahoo.com> wrote:
>
> Matt Schnitz wrote:
> > You also need to stem-analyze the incoming query.
> >
> > I had this same problem. :^>
> >
> >
> > Schnitz
>
> Do you have an example of how to do this? I'm using
2006 Sep 22
1
Query Objects vs. Query Strings
Hi ..
I tried to build some query objects to get some documents from my
index.. without success.. Is something wrong here?
q = Ferret::Search::BooleanQuery.new
q1 = Ferret::Search::TermQuery.new(:type, "movie")
q2 = Ferret::Search::TermQuery.new(:name, "Indiana")
q.add_query(q1, :should)
q.add_query(q2, :should)
Indexer.index.search_each(q) do |doc, score| puts doc end
0
2006 Aug 18
1
Portuguese Stemming
Today while compiling ferret I noticed there was a Portuguese stemmer
being compiled. How do I enable its use for my index?
Pedro.
2007 Jul 14
1
performance bottleneck
My database is in MySQL. I used ferret to index a table with 10
million rows. With the selection limited to an initial retrieval of 1000 rows,
it takes 200 seconds, but for the whole table it took more than four hours,
after which I had to close my indexing application. I used the
StandardAnalyzer for it. There is no problem on the database side, as
retrieval of all the data in the table
2007 Mar 01
4
Need help creating my own Filter in Ruby
Hi,
I posted a Trac ticket about it, but I thought I'd ask the mailing
list to reach more people.
I'm using these filters together in my analyzer (with acts_as_ferret
+ Ferret 0.11.1).
HyphenFilter.new(
  StopFilter.new(
    LowerCaseFilter.new(
      MappingFilter.new(
        StandardTokenizer.new(str),
        mapping)),
    FULL_FRENCH_STOP_WORDS + FULL_ENGLISH_STOP_WORDS)
)
The mapping filter maps pretty much all the f...
2007 Jul 07
2
Extending/Modifying QueryParser
...initialize(synonym_engine, stop_words =
FULL_ENGLISH_STOP_WORDS, lower = true)
@synonym_engine = synonym_engine
@lower = lower
@stop_words = stop_words
end
def token_stream(field, str)
ts = StandardTokenizer.new(str)
ts = LowerCaseFilter.new(ts) if @lower
ts = StopFilter.new(ts, @stop_words)
ts = SynonymTokenFilter.new(ts, @synonym_engine)
end
end
class SynonymTokenFilter < Ferret::Analysis::TokenStream
include Ferret::Analysis
def initialize(token_stream, synonym_engine)
@token_stream = token_stream
@synonym_stack = []
@synonym_en...
2007 Apr 08
10
Ferret and non latin characters support
I've successfully installed ferret and acts_as_ferret and have no
problem with utf-8 for accented characters. It returns correct results
for e.g. français. My problem is with non-Latin characters (Persian
indeed). I have tested different locales with no success both on Debian
and Mac. Any idea?
(ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6)
2006 Nov 25
5
Metaphone analysis
...module Analysis
    class MetaphoneAnalyzer < Ferret::Analysis::Analyzer
      include Ferret::Analysis
      def initialize(version = :double, stop_words = ENGLISH_STOP_WORDS)
        @stop_words = stop_words
        @version = version
      end
      def token_stream(field, str)
        MetaphoneFilter.new(StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
          @stop_words)), @version)
      end
    end
  end
end
I saved both of these files, 'metaphone_filter.rb' and 'metaphone_analyzer.rb'
to RAILS_ROOT/extras. Next I added the following line to my
'config/enviro...