similar to: AAF - Stem Analyzer

Displaying 20 results from an estimated 400 matches similar to: "AAF - Stem Analyzer"

2007 Sep 07
5
Custom Analyser .. where to put it ??
Hi, I'm trying to use a custom analyser to add my French stop words... I'm reading the tutorial at: http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage My problem is that I've no idea where to put my custom Analyser class, like: class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
2006 Dec 06
10
Stem Analyzer
Hi all, I am trying to implement a search that will use the Stem Analyzer. I added the Stem Analyzer from the examples shown in another post http://ruby-forum.com/topic/80178#147014 module Ferret::Analysis class StemmingAnalyzer def token_stream(field, text) StemFilter.new(StandardTokenizer.new(text)) end end end The problem with the Stem analyzer is that when I search for a
2007 Mar 05
2
Is indexing slower?
Hi - I upgraded to Ferret 0.11.3 from 0.10.13. I used to index 10,000 records in 10 secs. Now it takes 13 minutes. (That's a factor of ~75x) Did something change in the flush semantics, or something? Thanks! Schnitz
2007 Nov 09
2
Problem with stemming and AAF
I'm sure I'm missing something completely obvious here, so I hope someone can point me in the right direction! I've implemented a basic search with AAF, which works as expected; I'm running a ferret drb server, and using will_paginate to page results. The code in my search_controller.rb: search_text = params[:query] || " " @products =
2006 Dec 08
4
Using custom stem analyzer giving mongrel errors
I'm using the custom stem analyzer: require 'rubygems' require 'ferret' include Ferret module Ferret::Analysis class FerretAnalyzer def initialize(stop_words = FULL_ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, text) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(text)),
2007 Nov 13
8
acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)
Hi all, I cannot make aaf (rev. 220) use my custom analyzer, despite following the indications @ http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage To pinpoint the problem, I created a model + a simple analyzer with 2 stop words : "fax" and "gsm". test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a stop word. => I get a
2006 Oct 23
2
Trouble with custom Analyzer
Hi! I wanted to build my own custom Analyzer like so: class Analyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, string) StopFilter.new(LetterTokenizer.new(string, true), @stop_words) end end As one can easily spot, I essentially want
2006 Dec 07
8
crash on repeated search
I have found another crash in ferret; this one just uses a regular search. It's similar to an issue reported by Matt Schnitz a while ago, but unlike his, mine does not go away if I turn off omit_norms. It does go away if I turn on the garbage collector more often, but I'm not sure that's a stable workaround under the circumstances. This one isn't a
2007 Jun 25
4
Ignore apostrophes in words
Hi, I just started using ferret and the aaf plugin and it seems to work quite nicely. However, my fields are very short (titles of music) and I don't think many users will be typing in apostrophes when they are looking for something. Right now, for a simple document such as "what i've done" I'd like it to be indexed as "what ive done" instead. Right
2007 Jan 11
5
stop words in query
Hello all, Quick question, I'm using AAF and the following custom analyzer: class StemmedAnalyzer < Ferret::Analysis::Analyzer include Ferret::Analysis def initialize(stop_words = ENGLISH_STOP_WORDS) @stop_words = stop_words end def token_stream(field, str) StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words)) end However when
2007 Apr 19
1
DRb examples for vanilla Ferret?
Hi folks! Does anyone have any example code for using a DRb Ferret server? No AAF. Dave - is yours ready? I know Jens has one, but I was hoping for something more stand-alone. I assume Jens has a lot of other code in there. Schnitz
2007 Jan 31
6
GeoQuery with acts_as_ferret involved
So, I'm working on a search engine of sorts that restricts results to your local area. I can successfully return all entries within 15 miles of a particular point, and I can successfully return all entries that match a search query, but I'm having trouble combining the two together and doing pagination on them. Basically, for the range query, you do a SQL query that returns all
2006 Apr 13
3
QueryParser doesn't use StandardAnalyzer correctly?
I am having a bit of a problem with my search queries being parsed correctly it seems, and I wonder if anyone else has experienced this. I have written an index using StandardAnalyzer for analysis. I want to search that index by passing my user query through a QueryParser instance which is also using a StandardAnalyzer. However the resultant query does not seem to be a valid term query and
2007 Aug 20
2
can't stop stop_words
I have looked at the documentation and done some searching, but I can't seem to stop the STOP_WORDS from cutting out common words. I am using acts_as_ferret and I have added the following to my code: STOP_WORDS = [] acts_as_ferret({ :fields => { :name => { :boost => 10 }, :project_client_company_id => { :boost => 0
2005 Nov 17
1
indexing source code
Hi again, I'm using ferret to index source code - DamageControl will allow users to search for text in source code. Currently I'm using the default index with no custom analyzer (I'm using the StandardAnalyzer). Do you have any recommendations about how to write an analyzer that will index source code in a more 'optimal' way? I.e. disregard common
2007 Feb 27
3
segfault in ferret 0.11.0
Hi, Just downloaded the new ferret 0.11. I'm on OSX btw. I get this error every time I run my unit tests: Loaded suite ferret_updater_unit_test Started E/usr/local/lib/ruby/1.8/erb.rb:504: [BUG] Segmentation fault ruby 1.8.4 (2005-12-24) [i686-darwin8.7.1] Abort trap When I revert back to 10.14 I don't get this error. When I comment out the line: Ferret::Index::Index.new({:path =>
2007 Sep 27
5
QueryParser.parse question
Hi there, I am stumped as to why QueryParser's parse method behaves differently between query 'a' and 'b'. See http://pastie.caboo.se/private/4rlwrecyyow3yl6qtf4tq Could someone please help me understand why that is the case. p.s. I also found 'i' produces the same behaviour as 'a' Cheers, Andy
2008 May 12
1
Using StemFilter with PhraseQuery
Hi, I'm having difficulty getting the StemFilter and PhraseQuery to work properly together. When I use a StemFilter with a PhraseQuery, searches only work if the phrase consists of stems. For example, the search phrase "reduces health care" will not work but the phrase "reduce health care" will work even though the exact text "reduces health care" is
2007 Jan 21
2
A few questions: Tweaking StemFilter, indexes, ...
Hello all, I am new to the list, but I have been using ferret for a little bit already. I would first like to thank Dave for all his work on ferret. I had a few questions that I haven't been able to figure out after messing around with ferret and going through the documentation. StemFilter ------ I am trying to improve the quality of my searches in context of the content of my
2020 Apr 28
3
Stopwords: Topic modelling with LDA
Good morning, I am carrying out a topic-model analysis using the LDA method. To begin with, I removed the universal stopwords from the analysis. When I look at the topics and their most frequent words, I find that they are very similar and that some words appear in every topic. The texts I am analysing are consumer opinions about a specific category of