Displaying 20 results from an estimated 45 matches for "token_stream".
2007 Mar 28
6
trouble with PerFieldAnalyzer
I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14).
Script:
require 'rubygems'
require 'ferret'
require 'pp'
include Ferret::Analysis
include Ferret::Index
class TestAnalyzer
  def token_stream field, input
    pp field
    pp input
    LetterTokenizer.new(input)
  end
end
pfa = PerFieldAnalyzer.new(StandardAnalyzer.new())
pfa[:test] = TestAnalyzer.new
index = Index.new(:analyzer => pfa)
index << {:test => 'foo'}
index.search_each('bar')...
2007 Aug 03
0
StandardTokenizer Doesn't Support token_stream method
...ret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html
I ought to be able to construct a StandardTokenizer like this:
t = StandardTokenizer.new( true) # true to downcase tokens
and then later:
stream = token_stream( ignored_field_name, some_string)
to create a new TokenStream from some_string. This approach would be
valuable for my application since I am analyzing many short strings, so
I'm thinking that not having to rebuild my 5-deep analyzer chain for each
small string would be a nice saving.
Unfortunately...
2006 Apr 20
1
Creating my own analyzer
I created this analyzer:
class DescriptionAnalyzer < Ferret::Analysis::Analyzer
  def token_stream(field, string)
    if field == "code"
      return CodeTokenStream.new(string)
    else
      return Ferret::Analysis::Analyzer.new.token_stream(field, string)
    end
  end
end
and created an IndexWriter with it:
Ferret::Index::IndexWriter.new(get_index_path,...
2007 Jan 11
5
stop words in query
Hello all,
Quick question: I'm using AAF and the following custom analyzer:
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
                                  @stop_words))
  end
end
However, when my search term includes a stop word I never get any results
back. Once I remove the stop word I get the normal results back. Do I
need to do a search of my query for s...
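One possible explanation for a symptom like this, offered as an assumption and not confirmed by the truncated thread, is that the query is analyzed differently from the indexed text. The sketch below uses plain Ruby and no Ferret calls. `STOP_WORDS`, `analyze`, and `matches_all?` are hypothetical names, and it assumes every query term is required to match.

```ruby
# Hypothetical stop-word list, used only for this illustration.
STOP_WORDS = %w[the a of].freeze

# Lowercase the text, split it into words, and drop the given stop words.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z]+/).reject { |word| stop_words.include?(word) }
end

# Assumption: every query term must be present in the document's terms.
def matches_all?(doc_terms, query_terms)
  query_terms.all? { |term| doc_terms.include?(term) }
end

doc_terms = analyze("The History of Rome", STOP_WORDS)

# Query analyzed with the same stop words as the index: the stop word is dropped.
same_analysis = analyze("history of rome", STOP_WORDS)
# Query analyzed without stop-word removal: "of" survives but was never indexed.
raw_analysis = analyze("history of rome", [])

puts matches_all?(doc_terms, same_analysis)
puts matches_all?(doc_terms, raw_analysis)
```

If this is the cause, the usual remedy is to make sure the query side uses the same analyzer as the indexing side. How that is configured in Ferret or acts_as_ferret is not shown here.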
2007 Apr 08
3
How to make custom TokenFilter?
In the O'Reilly Ferret Short Cut, I found an example that was very useful to me.
It explains how to make a custom Tokenizer.
But that book doesn't explain how to make a custom Filter
(especially how to implement the #text=() method).
I'm a newbie and I don't understand how to create my own custom
Filter.
Are there any good source code examples?
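Several other posts in this listing write a filter as a class that stores the wrapped stream in `@input` and defines `next`. The following is a minimal sketch of that pattern in plain Ruby, so it runs without Ferret. `Token`, `WordStream`, `ReverseFilter`, and `drain` are hypothetical stand-ins, not Ferret classes; a real filter would subclass `Ferret::Analysis::TokenStream` and receive Ferret tokens.

```ruby
# Stand-in for a token: only the text attribute matters in this sketch.
Token = Struct.new(:text)

# Hypothetical source stream: yields one Token per whitespace-separated word.
class WordStream
  def initialize(str)
    @words = str.split
  end

  # Returns the next Token, or nil once the input is exhausted.
  def next
    word = @words.shift
    word && Token.new(word)
  end
end

# A filter wraps another stream and rewrites each token as it passes through.
class ReverseFilter
  def initialize(input)
    @input = input
  end

  def next
    t = @input.next
    return nil if t.nil?
    t.text = t.text.reverse
    t # return the token itself, not the value of the assignment above
  end
end

# Collects the text of every token a stream produces.
def drain(stream)
  texts = []
  while (t = stream.next)
    texts << t.text
  end
  texts
end
```

With these definitions, `drain(ReverseFilter.new(WordStream.new("abc de")))` returns `["cba", "ed"]`. Filters compose by nesting, so the output of one filter can be passed as the input of the next.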
2007 Jul 07
2
Extending/Modifying QueryParser
...and SynonymTokenFilter:
class SynonymAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(synonym_engine, stop_words = FULL_ENGLISH_STOP_WORDS,
                 lower = true)
    @synonym_engine = synonym_engine
    @lower = lower
    @stop_words = stop_words
  end
  def token_stream(field, str)
    ts = StandardTokenizer.new(str)
    ts = LowerCaseFilter.new(ts) if @lower
    ts = StopFilter.new(ts, @stop_words)
    ts = SynonymTokenFilter.new(ts, @synonym_engine)
  end
end
class SynonymTokenFilter < Ferret::Analysis::TokenStream
  include Ferret::Analysis
  def in...
2006 Sep 05
15
ferret finds 'tests' but not 'test'
Hello all,
Quick question (possibly!) - I've got a few records indexed and doing a
search for 'test' results in no hits even though I know the word 'tests'
exists in the indexed field. Doing a search for 'tests' produces a
result. I would have thought that 'test' would match 'tests' but no such
2009 Apr 09
4
Weird analyzer issue with the word 'fly'
...:analyzer => Ferret::Analysis::StemmingAnalyzer.new,
:fields => {:name => { :boost => 2.0 },
...
}})
And this analyzer is defined in a module thus:
module Ferret::Analysis
  class StemmingAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end
Now, here's a search without using the analyzer:
>> TeachingObject.find_with_ferret("flea fly", :per_page => 2000).size
=> 14
And with the analyzer:
>> TeachingObject.find_with...
2007 Sep 07
5
Custom Analyser .. where to put it ??
...net/acts_as_ferret/wiki/AdvancedUsage
My problem is that I've no idea where to put my custom Analyser class,
like:
class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
                                  @stop_words), 'de')
  end
end
Any clue ?
Thanks a lot
Guillaume.
--
Posted via http://www.ruby-forum.com/.
2006 Oct 23
2
Trouble with custom Analyzer
Hi!
I wanted to build my own custom Analyzer like so:
class Analyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, string)
    StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
  end
end
As one can easily spot, I essentially want a LetterAnalyzer with stop
word filtering. However, using that analyzer (for indexing) results
in a segmentation fault.
/opt/local/lib/ruby/gems/...
2007 Nov 13
8
acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)
...".
test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a
stop word.
=> I get a result when I should not.
(note : I delete the index directory => I can see the index is recreated,
index/develop
).
test 2 : insert a 'raise' in the token_stream() method => it's never thrown.
test 3 : use the standard analyzer, to exclude the 2 stop words => same
wrong result.
class AccessPointKind2 < ActiveRecord::Base
set_table_name "access_point_kinds2"
acts_as_ferret(
{:remote => true, :fi...
2006 Sep 23
8
svn problems
I can consistently segfault the 0.10.4 gem, so I'm trying to get the
subversion version working with hopes towards tracking the problem down.
I have a fresh SVN checkout but:
a) the version (in ferret.rb) claims to be 0.9.6; and
b) Ferret::Index::FieldInfos and a couple other classes are missing at
run time. It looks like this is because they're not exported in the C
2007 Nov 09
2
Problem with stemming and AAF
...ed_analyzer.rb file in the lib directory,
as follows:
require 'rubygems'
require 'ferret'
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
                                  @stop_words))
  end
end
And added the call to the analyzer in my model file:
acts_as_ferret( :fields => { :name => { :boost => 1,
:store => :yes },...
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in Ferret. I'm seeing this error on a
TextDrive Container (aka Joyent Accelerator) running OpenSolaris with Ferret
0.11.4.
It happens while searching for some terms with accented or special
characters.
This causes Ferret to allocate lots of memory (usually reaching 3+ GB)
and to fail if another query like this is executed.
Any ideas on that, could this be locale
2006 Nov 25
5
Metaphone analysis
...m. It's a fairly simple class, but does
require that the 'Text' gem be installed.
require 'ferret'
require 'text'
module Curtis
  module Analysis
    # TODO write tests!
    class MetaphoneFilter < Ferret::Analysis::TokenStream
      def initialize(token_stream, version = :double)
        @input = token_stream
        @version = version
      end
      def next
        t = @input.next
        return nil if t.nil?
        t.text = @version.eql?(:double) ?
          Text::Metaphone.double_metaphone(t.text) :
          Text::Metaphone.metaphone(t.text)
        t # return the token; the assignment above evaluates to the new text
      end
    end
  end
end
Second I created a...
2007 Mar 06
1
case-sensitivity of analyzer
Is there anything about this analyzer that says "case-sensitive" to you?
module Ferret::Analysis
  class StemmingAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end
Just wondering how I can force my index to be case-insensitive.
Thanks,
-Adam
--
Posted via http://www.ruby-forum.com/.
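Other analyzers in this listing wrap the tokenizer in a LowerCaseFilter before stemming, which suggests that case folding is a separate stage that has to be added to the chain explicitly. The sketch below illustrates that idea in plain Ruby, without Ferret. Every name in it is hypothetical, and `NAIVE_STEM` is a deliberately crude stand-in for a real stemmer.

```ruby
# Each stage is a lambda standing in for a tokenizer or a filter.
TOKENIZE   = ->(text)  { text.scan(/[[:alpha:]]+/) }
LOWERCASE  = ->(terms) { terms.map(&:downcase) }
# Crude stand-in for a stemmer: strips a single trailing "s".
NAIVE_STEM = ->(terms) { terms.map { |term| term.sub(/s\z/, "") } }

# Runs the text through each stage in order, like a nested filter chain.
def analyze(text, stages)
  stages.reduce(text) { |value, stage| stage.call(value) }
end

CASE_SENSITIVE   = [TOKENIZE, NAIVE_STEM].freeze
CASE_INSENSITIVE = [TOKENIZE, LOWERCASE, NAIVE_STEM].freeze

p analyze("Tests test", CASE_SENSITIVE)
p analyze("Tests test", CASE_INSENSITIVE)
```

Without the lowercase stage, "Tests" and "test" end up as the distinct terms "Test" and "test". With it, both become "test", so a search for either form would hit the same term.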
2006 Nov 13
1
Stemming, stop words, acts_as_ferret
...image" needs to hit "thermal
imaging."
2. Stop words. Searches for "failing to instruct the jury" should come up
with hits on a search for "fail to instruct."
3. Case-insensitive.
What I tried was:
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  def token_stream(field, reader)
    return Ferret::Analysis::PorterStemFilter.new(
      Ferret::Analysis::LowerCaseTokenizer.new(reader))
  end
end
class Summary < ActiveRecord::Base
  acts_as_ferret(:analyzer => StemmedAnalyzer.new)
But this doesn't appear to give me either stemming or stopwords. It d...
2006 Oct 19
2
How to deal with accentuated chars in 0.10.8?
I'm starting to use Ferret and acts_as_ferret.
I need to use something like EuropeanAnalyzer
(http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars).
For example, if the user searches for "gonzalez", they should find documents that
contain the term "gonzález".
The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter,
2008 May 12
1
Using StemFilter with PhraseQuery
...'m doing wrong or is the above description
what I should expect? To get the response that I'm expecting I could parse
the phrase and build up a query to be used by QueryParser but I'd like a
more succinct solution for now.
I use a StemFilter in my analyzer as follows:
def token_stream(field, str)
  ...
  ts = LowerCaseFilter.new(ts) if @lower
  ts = StopFilter.new(ts, @stop_words)
  ts = StemFilter.new(ts)
  ...
end
My use of PhraseQuery is as follows:
def generate_query(phrase)
  phrase = phrase.downcase
  phrase_parts = phrase.split(' &...
2006 Sep 15
1
Custom analyzer not invoked?
Hello,
I'm trying to define my own analyzer by doing something like:
#-----------------------------------------------------
require 'ferret'
include Ferret
class MyAnalyzer < Analysis::Analyzer
  def token_stream(field, str)
    # Display results of analysis
    puts 'Analyzing: field:%s str:%s' % [field, str]
    t = Analysis::LowerCaseFilter.new(Analysis::StandardTokenizer.new(str))
    while true
      n = t.next()
      break if n == nil
      puts n.to_s
    end
    return Analys...