Displaying 20 results from an estimated 400 matches similar to: "AAF - Stem Analyzer"
2007 Sep 07
5
Custom Analyser .. where to put it ??
Hi,
I'm trying to use a custom analyser to add my French stop words... I'm
reading the tutorial at:
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
My problem is that I've no idea where to put my custom Analyser class,
like:
class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
2006 Dec 06
10
Stem Analyzer
Hi all,
I am trying to implement a search that will use the Stem Analyzer. I
added the Stem Analyzer from the examples shown in another post
http://ruby-forum.com/topic/80178#147014
module Ferret::Analysis
  class StemmingAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end
The problem with the Stem analyzer is that when I search for a
2007 Mar 05
2
Is indexing slower?
Hi - I upgraded to Ferret 0.11.3 from 0.10.13.
I used to index 10,000 records in 10 secs. Now it takes 13 minutes.
(That's a factor of ~75x)
Did something change in the flush semantics, or something?
Thanks!
Schnitz
2007 Nov 09
2
Problem with stemming and AAF
I'm sure I'm missing something completely obvious here, so I hope
someone can point me in the right direction!
I've implemented a basic search with AAF, which works as expected; I'm
running a ferret drb server, and using will_paginate to page results.
The code in my search_controller.rb:
search_text = params[:query] || " "
@products =
2006 Dec 08
4
Using custom stem analyzer giving mongrel errors
I'm using the custom stem analyzer:
require 'rubygems'
require 'ferret'
include Ferret
module Ferret::Analysis
  class FerretAnalyzer
    def initialize(stop_words = FULL_ENGLISH_STOP_WORDS)
      @stop_words = stop_words
    end
    def token_stream(field, text)
      StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(text)),
2007 Nov 13
8
acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)
Hi all,
I cannot make aaf (rev. 220) use my custom analyzer, despite following the
instructions at
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
To pinpoint the problem, I created a model + a simple analyzer with 2 stop
words: "fax" and "gsm".
Test 1: model.rebuild_index + model.find_by_contents("fax") # "fax" is a
stop word.
=> I get a
2006 Oct 23
2
Trouble with custom Analyzer
Hi!
I wanted to build my own custom Analyzer like so:
class Analyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, string)
    StopFilter.new(LetterTokenizer.new(string, true), @stop_words)
  end
end
As one can easily spot, I essentially want
2006 Dec 07
8
crash on repeated search
I have found another crash in ferret; this one just uses a regular
search. It's similar to an issue reported by Matt Schnitz a while ago,
but unlike his, mine does not go away if I turn off omit_norms. It does
go away if I turn on the garbage collector more often, but I'm not sure
that's a stable workaround under the circumstances.
This one isn't a
2007 Jun 25
4
Ignore apostrophes in words
Hi, I just started using ferret and the aaf plugin and it seems to work
quite nicely. However, my fields are very short (titles of music) and I
don't think many users will be typing in apostrophes when they are
looking for something. Right now, for a simple document such as "what
i've done" I'd like it to be indexed as "what ive done" instead. Right
2007 Jan 11
5
stop words in query
Hello all,
Quick question, I'm using AAF and the following custom analyzer:
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
                                  @stop_words))
  end
end
However when
2007 Apr 19
1
DRb examples for vanilla Ferret?
Hi folks!
Does anyone have any example code for using a DRb Ferret server? No AAF.
Dave - is yours ready?
I know Jens has one, but I was hoping for something more stand-alone. I
assume Jens has a lot of other code in there.
Schnitz
2007 Jan 31
6
GeoQuery with acts_as_ferret involved
So, I'm working on a search engine of sorts that restricts results to
your local area. I can successfully return all entries within 15 miles
of a particular point, and I can successfully return all entries that
match a search query, but I'm having trouble combining the two together
and doing pagination on them.
Basically, for the range query, you do a SQL query that returns all
2006 Apr 13
3
QueryParser doesn't use StandardAnalyzer correctly?
I am having a bit of a problem with my search queries being parsed
correctly it seems, and I wonder if anyone else has experienced this.
I have written an index using StandardAnalyzer for analysis. I want to
search that index by passing my user query through a QueryParser
instance which is also using a StandardAnalyzer. However the resultant
query does not seem to be a valid term query and
2007 Aug 20
2
can't stop stop_words
I have looked at the documentation and done some searching, but I can't
seem to stop the STOP_WORDS from cutting out common words. I am using
acts_as_ferret and I have added the following to my code:
STOP_WORDS = []
acts_as_ferret({ :fields => { :name => { :boost => 10 },
                              :project_client_company_id => { :boost => 0
2005 Nov 17
1
indexing source code
Hi again,
I'm using ferret to index source code - DamageControl will allow users
to search for text in source code.
Currently I'm using the default index with no custom analyzer (I'm
using the StandardAnalyzer). Do you have any recommendations about how
to write an analyzer that will index source code in a more 'optimal'
way? I.e. disregard common
2007 Feb 27
3
segfault in ferret 0.11.0
Hi,
Just downloaded the new ferret 0.11. I'm on OSX btw. I get this error
every time I run my unit tests:
Loaded suite ferret_updater_unit_test
Started
E/usr/local/lib/ruby/1.8/erb.rb:504: [BUG] Segmentation fault
ruby 1.8.4 (2005-12-24) [i686-darwin8.7.1]
Abort trap
When I revert back to 10.14 I don't get this error. When I comment out
the line:
Ferret::Index::Index.new({:path =>
2007 Sep 27
5
QueryParser.parse question
Hi there,
I am stumped as to why QueryParser's parse method behaves differently
between query 'a' and 'b'.
See http://pastie.caboo.se/private/4rlwrecyyow3yl6qtf4tq
Could someone please help me understand why that is the case?
P.S. I also found that 'i' produces the same behaviour as 'a'.
Cheers,
Andy
2008 May 12
1
Using StemFilter with PhraseQuery
Hi,
I'm having difficulty getting the StemFilter and PhraseQuery to work
properly together. When I use a StemFilter with a PhraseQuery, searches only
work if the phrase consists of stems. For example, the search phrase
"reduces health care" will not work but the phrase "reduce health care" will
work even though the exact text "reduces health care" is
2007 Jan 21
2
A few questions: Tweaking StemFilter, indexes, ...
Hello all,
I am new to the list, but I have been using ferret for a little bit
already. I would first like to thank Dave for all his work on ferret.
I had a few questions that I haven't been able to figure out after
messing around with ferret and going through the documentation.
StemFilter
------
I am trying to improve the quality of my searches in context of the
content of my
2020 Apr 28
3
Stopwords: Topic modelling with LDA
Good morning,
I am carrying out a topic model analysis using the LDA method. To begin
with, I removed the universal "stopwords" from the analysis. When I look
at the topics and their most frequent words, I find that they are
very similar and that some words appear in every topic. The texts
I am analysing are consumer opinions about a specific category
of