Hi!

I wanted to build my own custom Analyzer like so:

    class Analyzer < Ferret::Analysis::Analyzer

      include Ferret::Analysis

      def initialize(stop_words = ENGLISH_STOP_WORDS)
        @stop_words = stop_words
      end

      def token_stream(field, string)
        StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
      end

    end

As one can easily spot, I essentially want a LetterAnalyzer with stop word filtering. However, using that analyzer (for indexing) results in a segmentation fault.

    /opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:281: [BUG] Segmentation fault
    ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]

This is admittedly a rather naive implementation which is extrapolated from those I found in the docs. So what am I missing here?

Cheers,
Andy
On 10/23/06, Andreas Korth <andreas.korth at gmx.net> wrote:
> Hi!
>
> I wanted to build my own custom Analyzer like so:
>
>     class Analyzer < Ferret::Analysis::Analyzer
>
>       include Ferret::Analysis
>
>       def initialize(stop_words = ENGLISH_STOP_WORDS)
>         @stop_words = stop_words
>       end
>
>       def token_stream(field, string)
>         StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
>       end
>
>     end
>
> As one can easily spot, I essentially want a LetterAnalyzer with stop
> word filtering. However, using that analyzer (for indexing) results
> in a segmentation fault.
>
>     /opt/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:281: [BUG] Segmentation fault
>     ruby 1.8.5 (2006-08-25) [powerpc-darwin8.8.0]
>
> This is admittedly a rather naive implementation which is
> extrapolated from those I found in the docs. So what am I missing here?

Hi Andy,

This works for me so I'll need a little more info to solve the problem. First, try running this:

    require 'rubygems'
    require 'ferret'

    class Analyzer < Ferret::Analysis::Analyzer
      include Ferret::Analysis

      def initialize(stop_words = ENGLISH_STOP_WORDS)
        @stop_words = stop_words
      end

      def token_stream(field, string)
        StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
      end
    end

    i = Ferret::I.new(:analyzer => Analyzer.new)

    i << "A sentence to analyze"

    puts i.search("analyze")

If that works, try and track down where in your code ferret is seg-faulting.

Cheers,
Dave

--
Dave Balmain
http://www.davebalmain.com/
On 23.10.2006, at 06:32, David Balmain wrote:
> Hi Andy,
>
> This works for me so I'll need a little more info to solve the
> problem. First, try running this:
>
> [...]
>
>     i = Ferret::I.new(:analyzer => Analyzer.new)
>
>     i << "A sentence to analyze"
>
>     puts i.search("analyze")
>
> If that works, try and track down where in your code ferret is
> seg-faulting.

Dave,

thanks for the hint. I was using the add_document method instead of << to add documents to the index. Changing the above code to

    i = Ferret::I.new()
    i.add_document("A sentence to analyze", Analyzer.new)

still works fine. However, changing my original code to use the << method (and specifying the Analyzer with Index.new) solves the problem.

I didn't manage to distill a concise test case from my code to reproduce the segfault. And hey, why bother, it works just fine now :)

Thanks again,
Andy
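
For readers who want to see what the analyzer in this thread is meant to produce, here is a minimal plain-Ruby sketch of the same idea: split the text into runs of letters, lowercase them, and drop stop words. It does not use Ferret at all. The names SKETCH_STOP_WORDS and sketch_analyze are hypothetical, the stop-word list is a small stand-in rather than Ferret's ENGLISH_STOP_WORDS, and matching only ASCII letters is a simplifying assumption.

```ruby
require 'set'

# Hypothetical stand-in list; NOT Ferret's ENGLISH_STOP_WORDS.
SKETCH_STOP_WORDS = Set.new(%w[a an and the to of in is it])

# Approximates a lowercasing letter tokenizer followed by a stop filter.
# Assumption: only ASCII letters count as token characters.
def sketch_analyze(text, stop_words = SKETCH_STOP_WORDS)
  text.scan(/[A-Za-z]+/)                        # runs of letters become tokens
      .map { |token| token.downcase }           # lowercase each token
      .reject { |token| stop_words.include?(token) } # drop stop words
end

p sketch_analyze("A sentence to analyze")
# prints ["sentence", "analyze"]
```

With this stand-in list, "A" and "to" are filtered out, which is why a search for "analyze" in Dave's test script can still match the indexed sentence.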