Displaying 20 results from an estimated 71 matches for "standardanalyzer".
2007 May 05
4
Stop words, fields, StandardAnalyzer quagmire
...=> 10},
:bio => {:store => :no},
:status_id => {:boost => 1}},
:store_class_name => true,
:remote => true,
:ferret => { :analyzer =>
Ferret::Analysis::StandardAnalyzer.new([]) }
} )
With the StandardAnalyzer added, I do find results with "no" or "the".
The complicating factor is that as you can see, I have a field
"status_id". This field lets me filter for profiles that are
published or draft in my CMS.
Before I...
2006 Apr 13
3
QueryParser doesn't use StandardAnalyzer correctly?
I am having a bit of a problem with my search queries being parsed
correctly it seems, and I wonder if anyone else has experienced this.
I have written an index using StandardAnalyzer for analysis. I want to
search that index by passing my user query through a QueryParser
instance which is also using a StandardAnalyzer. However the resultant
query does not seem to be a valid term query and therefore the search
produces no hits.
Specifically I have a bunch of docs with the p...
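A common cause of exactly this symptom (offered as a general sketch, not a confirmed diagnosis of this report) is the query terms not being run through the same analysis as the indexed text. In plain Ruby, with a toy `analyze` standing in for the analyzer:

```ruby
# Toy stand-in for an analyzer: lowercase and split on whitespace.
# (Hypothetical helper; Ferret's real StandardAnalyzer does more.)
def analyze(text)
  text.downcase.split
end

index_terms = analyze("The Quick Ferret")  # what went into the index

# A raw, unanalyzed query term misses, because the index only holds
# lowercased tokens:
raw_miss = index_terms.include?("Quick")

# Running the query through the same analyzer makes it match:
analyzed_hit = analyze("Quick").all? { |t| index_terms.include?(t) }
```

The fix, in Ferret terms, is usually to hand the QueryParser the very same analyzer instance (or at least an identically configured one) that built the index.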
2006 Jul 26
13
tweaking minimum word length?
Hi,
Can Ferret be configured to change the minimum word length of what it
indexes? Right now it seems to drop words of 3 characters or fewer, but
I'd like to include words down to 2 characters. How would I do
that?
Francis
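For what it's worth, the dropped short words here are more likely victims of the default stop list ("the", "for", and so on) than of a hard length limit. But if a length cutoff is genuinely wanted, the filtering rule itself is trivial; a plain-Ruby sketch of the idea, not Ferret's API (`MIN_TOKEN_LEN` is a made-up name):

```ruby
MIN_TOKEN_LEN = 2  # made-up cutoff: keep tokens of two or more characters

# Lowercase, split into alphanumeric runs, then drop anything too short.
def tokenize_with_min_length(text)
  text.downcase.scan(/[a-z0-9]+/).select { |t| t.length >= MIN_TOKEN_LEN }
end

tokenize_with_min_length("a db x yz")  # single-character tokens are dropped
```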
2007 Mar 22
3
Noise words...
Hi
I use acts_as_ferret on an app that is in Danish and English. In
Danish, English words like "and" and "under" have meaning. Is it
possible to make Ferret search for these words? As it is now, a search
for "under" returns nothing, even though I know the word is present in
the index.
Cheers
Mattias
2006 Jul 18
4
Some basic questions
...ted Ferret or really looked at the
Ferret-related code since probably January, but I recently started
thinking about trying out the latest version (we were using 0.3.2, I
think). I got the latest (0.9.4) and have noticed things break. In
particular, I used to refer to the constant
Ferret::Analysis::StandardAnalyzer::ENGLISH_STOP_WORDS, but now when I
try to reference it I get an uninitialized constant error for
StopAnalyzer. Here's an example IRB session:
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require_gem 'ferret'...
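It looks like the list moved to `Ferret::Analysis::ENGLISH_STOP_WORDS` in later versions (compare the 2006 Oct 24 thread below). A defensive lookup that works either way might look like this sketch, where the stub module stands in for the real gem so the example is self-contained:

```ruby
# Stub standing in for the ferret gem; the real list is much longer.
module Ferret
  module Analysis
    ENGLISH_STOP_WORDS = %w[a an and are as at be but by for if].freeze
  end
end

# Look the constant up in its newer home first, then fall back to the
# pre-0.9 location inside StandardAnalyzer if that's all we have.
stop_words =
  if Ferret::Analysis.const_defined?(:ENGLISH_STOP_WORDS)
    Ferret::Analysis::ENGLISH_STOP_WORDS
  elsif Ferret::Analysis.const_defined?(:StandardAnalyzer) &&
        Ferret::Analysis::StandardAnalyzer.const_defined?(:ENGLISH_STOP_WORDS)
    Ferret::Analysis::StandardAnalyzer::ENGLISH_STOP_WORDS
  end
```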
2006 Aug 16
1
StandardAnalyzer not indexing "some"
Hi everybody,
In the basic setup acts_as_ferret uses a StandardAnalyzer. How come
it won't index the headline "some headline" as "some" and "headline"?
It only uses LetterTokenizer and LowerCaseFilter.
Thanks for your help.
Michael
--
Posted via http://www.ruby-forum.com/.
2006 Jul 07
4
How to add Asia token analyzer to ferret simply?
Hi David,
Can you give me an example of how to add an analyzer to Ferret for Asian
languages?
My web application will have to support multi-language search, which
means, for example, that both Chinese and English will be searched through
the form.
Currently I have decided to use the simple token principle, which means
that every Chinese character will be a token, although this does not work so
well in some
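The one-character-per-token rule for Chinese can be sketched with a regexp that keeps Latin runs whole and emits each Han character separately. This is an illustration of the tokenization rule only, not Ferret's API; a regexp-driven analyzer, if your Ferret version provides one, could be driven by a similar pattern:

```ruby
# Keep runs of ASCII letters/digits as single tokens; emit every Han
# (Chinese) character as a token of its own.
def tokenize_mixed(text)
  text.scan(/[A-Za-z0-9]+|\p{Han}/).map do |tok|
    tok =~ /[A-Za-z]/ ? tok.downcase : tok
  end
end

tokenize_mixed("Ruby搜索")  # mixed English and Chinese input
```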
2006 Oct 24
2
Problem with stop words
...rd to the standard analyzer:
require 'rubygems'
require 'ferret'
index = Ferret::I.new(:or_default => false)
index << 'you'
puts index.search('you')
returns no hits.
I assumed from the docs that StandardAnalyzer was using stop words
as defined by:
Ferret::Analysis::ENGLISH_STOP_WORDS
but when I print that to the console I get:
["a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if"...
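Whatever list the analyzer actually consults (the default may be a fuller list than the truncated one printed above; worth checking the docs for your version), the stop filter itself is simple to sketch in plain Ruby. The short `STOPS` array is a stand-in for the real list:

```ruby
# Tiny stand-in for the analyzer's stop list; the real default list may
# be larger than ENGLISH_STOP_WORDS.
STOPS = %w[a an and are you].freeze

# What a stop filter does: drop any token found on the list.
def stop_filter(tokens)
  tokens.reject { |t| STOPS.include?(t) }
end

stop_filter(%w[you and me])  # "you" and "and" never reach the index
```

If "you" is on the list the analyzer actually uses, the document above was indexed as zero tokens, which would explain the empty result.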
2007 Apr 08
10
Ferret and non latin characters support
I've successfully installed ferret and acts_as_ferret and have no
problem with UTF-8 for accented characters. It returns correct results
for e.g. français. My problem is with non-Latin characters (Persian
indeed). I have tested different locales with no success, both on Debian
and Mac. Any idea?
(ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6)
--
Posted via http://www.ruby-forum.com/.
2005 Dec 29
5
Short words not indexed?
I noticed that if I have a field that contains something like "Institute
for medicine", then if I search using any of these queries:
for
*for*
for~
Nothing shows up. If I search for either of the other two words, though,
that term would show up in the result set. Does this indicate that short
words like "for" are not indexed?
Thanks!
Jen
2007 Apr 10
8
ferret-0.11.4-mswin32 not compatible with Ruby1.8.4
Just a quick note for future reference - at least for me, ferret won't
work on Ruby 1.8.4.
gem install ferret
Successfully installed ferret-0.11.4-mswin32
ruby -v
ruby 1.8.4 (2005-12-24) [i386-mswin32]
irb
irb(main):001:0> require 'ferret'
A windows error message box appears -
ruby.exe - Entry Point Not Found
The procedure entry point rb_w32_write could not be
2007 Mar 28
6
trouble with PerFieldAnalyzer
...:
require 'rubygems'
require 'ferret'
require 'pp'
include Ferret::Analysis
include Ferret::Index
class TestAnalyzer
  def token_stream(field, input)
    pp field
    pp input
    LetterTokenizer.new(input)
  end
end
pfa = PerFieldAnalyzer.new(StandardAnalyzer.new())
pfa[:test] = TestAnalyzer.new
index = Index.new(:analyzer => pfa)
index << {:test => 'foo'}
index.search_each('bar')
Output:
:test
""
:test
"bar"
Why is input "" the first time token_stream is called?
I hope...
2007 Aug 20
2
can't stop stop_words
...WORDS = []
acts_as_ferret({ :fields => { :name => { :boost => 10 },
                              :project_client_company_id => { :boost => 0 } } },
               { :analyzer =>
                 Ferret::Analysis::StandardAnalyzer.new(STOP_WORDS) })
Regardless, words like 'into' are not being indexed (I have looked at
the index files). I have been re-indexing, so it isn't a problem like
that.
If anyone can point out what I am doing wrong that would be great.
--
Posted via http://www.ruby-forum.c...
2007 Jan 19
9
Double-quoted query with "and" fails.
Hi,
We're using Ferret 0.9.4 and we've observed the following behavior.
Searching for 'fieldname: foo and bar' works fine while 'fieldname:
"foo and bar"' doesn't return any results. Is there a way to make
Ferret recognize the 'and' inside the query as a search term and not
an operator? (I hope I got the
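A plausible explanation (a sketch of the mechanics, not a confirmed diagnosis of this thread): the stop filter removes "and" on both sides, but the surviving tokens keep their original positions in the index, so the phrase no longer lines up as adjacent terms. In plain Ruby:

```ruby
PHRASE_STOPS = %w[and the].freeze

# Return [token, position] pairs with stop words removed but original
# word positions preserved, mimicking the gap a stop filter leaves.
def positions(text)
  text.downcase.split.each_with_index
      .reject { |tok, _| PHRASE_STOPS.include?(tok) }
      .map { |tok, pos| [tok, pos] }
end

indexed = positions("foo and bar")  # => [["foo", 0], ["bar", 2]]

# An exact phrase match needs consecutive positions, which fails here:
adjacent = indexed.each_cons(2).all? { |(_, p1), (_, p2)| p2 == p1 + 1 }
```

If this is what is happening, indexing and querying with an analyzer that keeps "and" (e.g. an empty stop-word list) would make the quoted phrase match.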
2007 May 09
3
bug when assigning new analyzer?
require ''rubygems''
require ''ferret''
include Ferret
PATH = ''/tmp/ferret_stopwords_test''
index = Index::IndexWriter.new(:path => PATH, :create => true)
index.analyzer = Analysis::StandardAnalyzer.new([])
index << {:title => ''a few good men'', :language => ''en''}
index.analyzer = Analysis::StandardAnalyzer.new([''men''])
index << {:title => ''a few good men'', :language => ''nl''}
inde...
2007 Mar 31
4
not understanding search results
I'm getting some results that I don't understand from a search.
The code, based on the tutorial, and the results are below.
Everything makes sense to me, except the results for
the 'title:"Some"' query. I would think that it should
match the first two documents, but not the third.
What am I missing here?
Thanks for any help!
--- code
2007 Jun 25
4
Ignore apostrophes in words
...to be indexed as "what ive done" instead. Right
now I'm using this for my aaf line (I don't want any stop words either,
as with smaller docs each word, even articles, can have some significance):
acts_as_ferret( { :fields => [ :name ] }, { :analyzer =>
Ferret::Analysis::StandardAnalyzer.new([]) } )
How should I go about removing the apostrophes when docs are added to
the index?
Thanks,
Chris
--
Posted via http://www.ruby-forum.com/.
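One low-tech option is to normalize the text before it ever reaches the index, e.g. in a model callback (hypothetical helper name; depending on the Ferret version there may also be a token-filter class suited to this):

```ruby
# Hypothetical helper: delete apostrophes so "I've" becomes "Ive",
# which the analyzer then lowercases to a single token "ive".
def strip_apostrophes(text)
  text.gsub(/['\u2019]/, "")
end

strip_apostrophes("what I've done")  # straight and curly quotes removed
```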
2006 Sep 09
3
Per field analyzer
Is there a way to add a per-field analyzer? I can't seem to find a way to do that.
Thanks
--
Kent
---
http://www.datanoise.com
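Ferret does expose a PerFieldAnalyzer (see the 2007 Mar 28 thread above). The dispatch idea behind it is simple enough to sketch in plain Ruby: a wrapper that routes token_stream calls by field name, falling back to a default analyzer (this mirrors the concept, not the gem's actual implementation):

```ruby
# Plain-Ruby sketch of per-field analyzer dispatch.
class PerFieldDispatch
  def initialize(default_analyzer)
    @default = default_analyzer
    @per_field = {}
  end

  # Assign a dedicated analyzer for one field: pfa[:field] = analyzer
  def []=(field, analyzer)
    @per_field[field] = analyzer
  end

  def token_stream(field, input)
    (@per_field[field] || @default).token_stream(field, input)
  end
end

# Stub "analyzers" that just transform the input so we can see who ran:
class ShoutAnalyzer
  def token_stream(_field, input)
    input.upcase
  end
end

class QuietAnalyzer
  def token_stream(_field, input)
    input.downcase
  end
end

pfa = PerFieldDispatch.new(QuietAnalyzer.new)
pfa[:title] = ShoutAnalyzer.new
pfa.token_stream(:title, "Hello")  # handled by the per-field analyzer
pfa.token_stream(:body, "Hello")   # falls back to the default
```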
2005 Nov 17
1
indexing source code
Hi again,
I'm using ferret to index source code - DamageControl will allow users
to search for text in source code.
Currently I'm using the default index with no custom analyzer (I'm
using the StandardAnalyzer). Do you have any recommendations about how
to write an analyzer that will index source code in a more 'optimal'
way? I.e. disregard common sourcecode tokens and take into account
dots and such when tokenizing.
For example, if the source code looks like this:
def foo(bar)
bar....
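A starting point might be a tokenizer that splits on dots, breaks snake_case identifiers apart, and lowercases. This is a plain-Ruby sketch of the tokenization rule only; plugging it into Ferret would mean wrapping it in a custom analyzer's token_stream method:

```ruby
# Split source into identifier-like tokens, then break snake_case apart
# and lowercase, so foo_bar is findable as both "foo" and "bar".
def code_tokens(src)
  src.scan(/[A-Za-z_][A-Za-z0-9_]*/)
     .flat_map { |ident| ident.split("_") }
     .reject(&:empty?)
     .map(&:downcase)
end

code_tokens("def foo(bar)\n  bar.baz_qux")
```

A stop list of common keywords (`def`, `end`, `if`, ...) could then be layered on top, analogous to English stop words.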
2006 Jun 16
2
indexing large tokens
Hi,
I'm using the StandardAnalyzer to build an index, and passing in Documents
that have Fields that contain large tokens (22+ characters) interspersed with
normal English words. This seems to cause the IndexWriter to slow to a
crawl. Is this a known issue, or am I doing something wrong?
If this is a known issue I don't hav...
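If overlong tokens turn out to be the bottleneck, one workaround is to drop them before they reach the writer. A plain-Ruby sketch of that filter (the 22-character cutoff is taken from the post; whether this actually cures the slowdown is untested):

```ruby
MAX_TOKEN_LEN = 22  # cutoff length mentioned in the post

# Drop tokens longer than the cutoff before they reach the index writer.
def drop_long_tokens(tokens)
  tokens.reject { |t| t.length > MAX_TOKEN_LEN }
end

drop_long_tokens(["normal", "x" * 30])  # the 30-character token is dropped
```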