search for: token_stream

Displaying 20 results from an estimated 45 matches for "token_stream".

2007 Mar 28
6
trouble with PerFieldAnalyzer
I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14). Script: require 'rubygems' require 'ferret' require 'pp' include Ferret::Analysis include Ferret::Index class TestAnalyzer def token_stream field, input pp field pp input LetterTokenizer.new(input) end end pfa = PerFieldAnalyzer.new(StandardAnalyzer.new()) pfa[:test] = TestAnalyzer.new index = Index.new(:analyzer => pfa) index << {:test => 'foo'} index.search_each('bar')...
2007 Aug 03
0
StandardTokenizer Doesn't Support token_stream method
...ret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html I ought to be able to construct a StandardTokenizer like this: t = StandardTokenizer.new(true) # true to downcase tokens and then later: stream = token_stream(ignored_field_name, some_string) to create a new TokenStream from some_string. This approach would be valuable for my application since I am analyzing many short strings -- so I'm thinking that building my 5-deep analyzer chain for each small string will be a nice savings. Unfortunately...
2006 Apr 20
1
Creating my own analyzer
I created this analyzer: class DescriptionAnalyzer < Ferret::Analysis::Analyzer def token_stream(field, string) if field == "code" return CodeTokenStream.new(string) else return Ferret::Analysis::Analyzer.new.token_stream(field,string) end end end and created an IndexWriter with it: Ferret::Index::IndexWriter.new(get_index_path,...
2007 Jan 11
5
stop words in query
Hello all, Quick question: I'm using AAF and the following custom analyzer: class StemmedAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)) end end However, when my search term includes a stop word I never get any results back. Once I remove the stop word I get the normal results back. Do I need to do a search of my query for s...
2007 Apr 08
3
How to make custom TokenFilter?
In the O'Reilly Ferret Short Cuts, I found a very useful example. It explains how to make a custom Tokenizer, but the book doesn't explain how to make a custom Filter (especially how to implement the #text=() method). I'm a newbie and I don't understand how to create my own custom Filter. Are there any good source code examples?
2007 Jul 07
2
Extending/Modifying QueryParser
...and SynonymTokenFilter: class SynonymAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(synonym_engine, stop_words = FULL_ENGLISH_STOP_WORDS, lower = true) @synonym_engine = synonym_engine @lower = lower @stop_words = stop_words end def token_stream(field, str) ts = StandardTokenizer.new(str) ts = LowerCaseFilter.new(ts) if @lower ts = StopFilter.new(ts, @stop_words) ts = SynonymTokenFilter.new(ts, @synonym_engine) end end class SynonymTokenFilter < Ferret::Analysis::TokenStream include Ferret::Analysis def in...
2006 Sep 05
15
ferret finds 'tests' but not 'test'
Hello all, Quick question (possibly!) - I've got a few records indexed and doing a search for 'test' reports no hits even though I know the word 'tests' exists in the indexed field. Doing a search for 'tests' produces a result. I would have thought that 'test' would match 'tests' but no such
2009 Apr 09
4
Weird analyzer issue with the word 'fly'
...:analyzer => Ferret::Analysis::StemmingAnalyzer.new, :fields => {:name => { :boost => 2.0 }, ... }}) And this analyzer is defined in a module thus: module Ferret::Analysis class StemmingAnalyzer def token_stream(field, text) StemFilter.new(StandardTokenizer.new(text)) end end end Now, here's a search without using the analyzer: >> TeachingObject.find_with_ferret("flea fly", :per_page => 2000).size => 14 And with the analyzer: >> TeachingObject.find_with...
2007 Sep 07
5
Custom Analyser .. where to put it ??
...net/acts_as_ferret/wiki/AdvancedUsage My problem is that I've no idea where to put my custom Analyser class, like: class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = FULL_GERMAN_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words), 'de') end end Any clue? Thanks a lot, Guillaume. -- Posted via http://www.ruby-forum.com/.
2006 Oct 23
2
Trouble with custom Analyzer
Hi! I wanted to build my own custom Analyzer like so: class Analyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, string) StopFilter.new(LetterTokenizer.new(string, true), @stop_words) end end As one can easily spot, I essentially want a LetterAnalyzer with stop word filtering. However, using that analyzer (for indexing) results in a segmentation fault. /opt/local/lib/ruby/gems/...
2007 Nov 13
8
acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)
...". test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a stop word. => I get a result when I should not. (note : I delete the index directory => I can see the index is recreated, index/develop ). test 2 : insert a 'raise' in the token_stream() method => it's never thrown. test 3 : use the standard analyzer, to exclude the 2 stop words => same wrong result. class AccessPointKind2 < ActiveRecord::Base set_table_name "access_point_kinds2" acts_as_ferret( {:remote => true, :fi...
2006 Sep 23
8
svn problems
I can consistently segfault the 0.10.4 gem, so I'm trying to get the subversion version working with hopes towards tracking the problem down. I have a fresh SVN checkout but: a) the version (in ferret.rb) claims to be 0.9.6; and b) Ferret::Index::FieldInfos and a couple other classes are missing at run time. It looks like this is because they're not exported in the C
2007 Nov 09
2
Problem with stemming and AAF
...ed_analyzer.rb file in the lib directory, as follows: require 'rubygems' require 'ferret' class StemmedAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)) end end And added the call to the analyzer in my model file: acts_as_ferret( :fields => { :name => { :boost => 1, :store => :yes },...
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in ferret. I'm having this error on TextDrive Container (aka Joyent Accelerators) OpenSolaris with Ferret 0.11.4. It happens while searching for some terms with accented or special characters. This makes ferret allocate lots of memory (usually reaching 3+ GB) and fail if another query like this is executed. Any ideas on that, could this be locale
2006 Nov 25
5
Metaphone analysis
...m. It's a fairly simple class, but does require the 'Text' gem be installed. require 'ferret' require 'text' module Curtis module Analysis # TODO write tests! class MetaphoneFilter < Ferret::Analysis::TokenStream def initialize(token_stream, version = :double) @input = token_stream @version = version end def next t = @input.next return nil if t.nil? t.text = @version.eql?(:double) ? Text::Metaphone.double_metaphone(t.text) : Text::Metaphone.metaphone(t.text) end end end end Second I created a...
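One point worth checking in a filter like the one quoted above: in Ruby, a method whose last expression is an attribute assignment evaluates to the assigned value, so `next` would hand back the converted text rather than the token. The sketch below uses hypothetical pure-Ruby stand-ins (`SketchToken`, `SketchStream`, `SketchUpcaseFilter`), not Ferret's classes, and assumes the consumer of the stream expects a token object back. Upcasing stands in for the Metaphone call.

```ruby
# Hypothetical stand-ins for illustration only; these are NOT Ferret classes.
SketchToken = Struct.new(:text)

# A trivial token source: hands out one token per word, then nil.
class SketchStream
  def initialize(words)
    @tokens = words.map { |w| SketchToken.new(w) }
  end

  def next
    @tokens.shift
  end
end

# A filter that rewrites each token's text and returns the token itself.
class SketchUpcaseFilter
  def initialize(input)
    @input = input
  end

  def next
    t = @input.next
    return nil if t.nil?
    t.text = t.text.upcase # stand-in for the Metaphone conversion
    t                      # return the token, not the assigned string
  end
end

filter = SketchUpcaseFilter.new(SketchStream.new(%w[foo bar]))
first  = filter.next
second = filter.next
third  = filter.next
```

Without the trailing `t`, `first` would be the String `"FOO"` instead of a `SketchToken`.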
2007 Mar 06
1
case-sensitivity of analyzer
Is there anything about this analyzer that says "case-sensitive" to you? module Ferret::Analysis class StemmingAnalyzer def token_stream(field, text) StemFilter.new(StandardTokenizer.new(text)) end end end Just wondering how I can force my index to be case-insensitive. Thanks, -Adam -- Posted via http://www.ruby-forum.com/.
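One possible reading of the analyzer quoted above is that nothing in the chain lower-cases tokens, so differently cased inputs would produce different terms. A minimal pure-Ruby sketch of that effect follows. `SketchTokenizer`, `SketchLowerCaseFilter` and `drain` are hypothetical stand-ins, not Ferret's classes, and whether Ferret's `StandardTokenizer` lower-cases on its own is not assumed either way.

```ruby
# Hypothetical stand-ins for illustration only; these are NOT Ferret classes.

# Splits text into alphabetic tokens and preserves their original case.
class SketchTokenizer
  def initialize(text)
    @tokens = text.scan(/[[:alpha:]]+/)
  end

  # Returns the next token string, or nil when the input is exhausted.
  def next
    @tokens.shift
  end
end

# Wraps another stream and lower-cases every token passing through.
class SketchLowerCaseFilter
  def initialize(input)
    @input = input
  end

  def next
    token = @input.next
    token && token.downcase
  end
end

# Collects every token from a stream into an array.
def drain(stream)
  terms = []
  while (t = stream.next)
    terms << t
  end
  terms
end

case_sensitive   = drain(SketchTokenizer.new("Fly fly"))
case_insensitive = drain(SketchLowerCaseFilter.new(SketchTokenizer.new("Fly fly")))
```

In the first chain the two spellings stay distinct terms; in the second they collapse into one, which is what a case-insensitive index needs at both indexing and query time.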
2006 Nov 13
1
Stemming, stop words, acts_as_ferret
...image" needs to hit "thermal imaging." 2. Stop words. Searches for "failing to instruct the jury" should come up with hits on a search for "fail to instruct." 3. Case-insensitive. What I tried was: class StemmedAnalyzer < Ferret::Analysis::Analyzer def token_stream(field, reader) return Ferret::Analysis::PorterStemFilter.new(Ferret::Analysis::LowerCaseTokenizer.new(reader)) end end class Summary < ActiveRecord::Base acts_as_ferret(:analyzer => StemmedAnalyzer.new) But this doesn't appear to give me either stemming or stopwords. It d...
2006 Oct 19
2
How to deal with accentuated chars in 0.10.8?
I'm starting to use Ferret and acts_as_ferret. I need to use something like EuropeanAnalyzer (http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars). For example, if the user searches for "gonzalez", they should find documents that contain the term "gonzález". The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter,
2008 May 12
1
Using StemFilter with PhraseQuery
...'m doing wrong or is the above description what I should expect? To get the response that I'm expecting I could parse the phrase and build up a query to be used by QueryParser but I'd like a more succinct solution for now. I use a StemFilter in my analyzer as follows: def token_stream(field, str) ... ts = LowerCaseFilter.new(ts) if @lower ts = StopFilter.new(ts, @stop_words) ts = StemFilter.new(ts) ... end My use of PhraseQuery is as follows: def generate_query(phrase) phrase = phrase.downcase phrase_parts = phrase.split(' &...
2006 Sep 15
1
Custom analyzer not invoked?
Hello, I'm trying to define my own analyzer by doing something like: #----------------------------------------------------- require 'ferret' include Ferret class MyAnalyzer < Analysis::Analyzer def token_stream(field, str) # Display results of analysis puts 'Analyzing: field:%s str:%s' % [field, str] t = Analysis::LowerCaseFilter.new(Analysis::StandardTokenizer.new(str)) while true n = t.next() break if n == nil puts n.to_s end return Analys...