Displaying 20 results from an estimated 1000 matches similar to: "Ferret and non latin characters support"
2006 Jul 18
10
searching with chinese chars
Hi all,
maybe not a Ferret question, but I assume someone here might have come across
that already.
I wrote a simple CGI app that adds docs into a Ferret index. The idea
is to test Asian-language input and searching.
The script that does the input seems to be OK. As David mentioned in a
question I asked a little while ago, Ferret's index is agnostic, in the
sense that you can store anything in
2006 Jul 07
4
How to add Asia token analyzer to ferret simply?
Hi, David
Can you give me an example of how to add an analyzer to Ferret for Asian
languages?
My web application will have to support multi-language search, which
means, for example, that both Chinese and English will be searched through the
form.
Currently, I have decided to use the simple token principle, which means
that every Chinese character will be a token, although this is not so
well in some
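A minimal sketch of the "every Chinese character is a token" idea from this post, written in plain Ruby and not using Ferret at all. The helper name `simple_mixed_tokens` is hypothetical, and treating runs of ASCII letters and digits as single lowercased tokens is an assumption added here for the English side of the search. Whether the same pattern can be handed to one of Ferret's analyzers is not verified here.

```ruby
# Hypothetical helper, not part of Ferret.
# Each Han character becomes its own token; each run of ASCII letters or
# digits is kept together as one lowercased token. Everything else
# (spaces, punctuation) is dropped.
def simple_mixed_tokens(text)
  text.scan(/\p{Han}|[A-Za-z0-9]+/).map { |token| token.downcase }
end

tokens = simple_mixed_tokens("Ferret 中文搜索")
puts tokens.inspect
```

Single-character tokens keep indexing simple but lose word boundaries, which is the weakness the post itself hints at.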
2006 Sep 06
9
Which analyzer to use
Lucene's standard analyzer splits words separated with underscores.
Ferret doesn't do this. For example, if I create an index with only
document 'test_case' and search for 'case' it doesn't find anything.
Lucene on the other hand finds it. The same story goes for words
separated by colons.
Which analyzer should I use to emulate
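The splitting behaviour asked about in this thread can be illustrated in plain Ruby, independent of either library. This is only a sketch: the helper name `split_on_punctuation` is hypothetical, and it simply treats any run of ASCII letters or digits as a token, so underscores and colons act as separators. It does not claim to reproduce Lucene's actual tokenizer rules.

```ruby
# Hypothetical helper, not part of Ferret or Lucene.
# Any character that is not an ASCII letter or digit (underscore, colon,
# whitespace, ...) ends the current token.
def split_on_punctuation(text)
  text.downcase.scan(/[a-z0-9]+/)
end

puts split_on_punctuation("test_case").inspect
puts split_on_punctuation("host:port").inspect
```

With tokens produced this way, a search for `case` would match a document containing `test_case`, because `case` is indexed as its own term.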
2007 May 09
3
bug when assigning new analyzer?
require 'rubygems'
require 'ferret'
include Ferret
PATH = '/tmp/ferret_stopwords_test'
index = Index::IndexWriter.new(:path => PATH, :create => true)
index.analyzer = Analysis::StandardAnalyzer.new([])
index << {:title => 'a few good men', :language => 'en'}
index.analyzer =
2007 Apr 19
5
Chinese full text searching by acts_as_ferret?
How to add Chinese language full text searching function by using
acts_as_ferret?
RegExpAnalyzer.new(/./,false)
I don't know how to use this analyzer!
Does it work like this:
user searching---->acts_as_ferret---->ferret
????
--
Posted via http://www.ruby-forum.com/.
2008 Jun 13
2
strip out non-alphanumeric characters before saving to index
Does anyone know a simple way, with ferret or a_a_f, to strip out
everything that's not a letter, number or space before saving to the
index? I know that I could do a custom method for every indexed field
that regexes them out but I thought that there might be a universal
option for it...
thanks
max
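The per-field regex approach mentioned in this post can be sketched in plain Ruby as below. Both helper names (`strip_non_alphanumeric`, `clean_fields`) are hypothetical, and the sketch assumes the document is an ordinary hash of field names to values that is cleaned before being handed to the index. It is not a built-in Ferret or acts_as_ferret option.

```ruby
# Hypothetical helpers, not part of Ferret or acts_as_ferret.

# Removes every character that is not a Unicode letter, a Unicode number,
# or whitespace.
def strip_non_alphanumeric(text)
  text.gsub(/[^\p{L}\p{N}\s]/, "")
end

# Applies the cleanup to every String value of a document hash and leaves
# non-string values (ids, numbers) untouched.
def clean_fields(doc)
  doc.each_with_object({}) do |(field, value), cleaned|
    cleaned[field] = value.is_a?(String) ? strip_non_alphanumeric(value) : value
  end
end

doc = clean_fields({ :title => "a few: good men!", :id => 7 })
puts doc.inspect
```

Cleaning the whole hash in one place avoids writing a separate method for each indexed field.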
2007 Mar 28
6
trouble with PerFieldAnalyzer
I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14).
Script:
require 'rubygems'
require 'ferret'
require 'pp'
include Ferret::Analysis
include Ferret::Index
class TestAnalyzer
  def token_stream field, input
    pp field
    pp input
    LetterTokenizer.new(input)
  end
end
pfa =
2005 Nov 17
1
indexing source code
Hi again,
I'm using ferret to index source code - DamageControl will allow users
to search for text in source code.
Currently I'm using the default index with no custom analyzer (I'm
using the StandardAnalyzer). Do you have any recommendations about how
to write an analyzer that will index source code in a more 'optimal'
way? I.e. disregard common
2006 Jul 05
3
Is there any schema of full-text search that support utf-8?
Is there any full-text search scheme that supports UTF-8, especially
for Asian languages such as Chinese, Japanese, etc.?
Ferret/acts_as_ferret does not work when keywords in these languages are
searched, and it is also difficult to implement pagination, which needs
both the count of search results and an offset.
Very grateful!
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in ferret. I'm having this error on
TextDrive Container (aka. Joyent Accelerators) OpenSolaris with Ferret
0.11.4
It happens while searching for some terms with accented or special
characters.
This makes ferret allocate lots of memory (usually reaching 3+ GB)
and fail if another query like this is executed.
Any ideas on that, could this be locale
2008 Jan 03
1
properly escaping special characters in AAF?
For most cases, I've got search working in Rails as follows:
## controller:
term = params[:search][:term]
@results = MyModel.find_by_contents "#{term}*"
The '*' character is appended to the search term so that searches match
anything that begins with 'term'. For the most part, this is great, but
let's say term is equal to
2007 Apr 13
4
[Ferret] QueryParser memory leak bug (Joyent/OpenSolaris)
QueryParser fails badly, allocating an enormous amount of memory when
processing query strings with special/accented characters. See:
irb(main):002:0> require 'rubygems'
irb(main):003:0> require 'ferret'
irb(main):004:0> include Ferret
irb(main):005:0> index = Index::Index.new
irb(main):008:0> index << "something"
# Now the error
2007 Mar 31
4
not understanding search results
I'm getting some results that I don't understand from a search.
The code, based on the tutorial, and the results are below.
Everything makes sense to me, except the results for
the 'title:"Some"' query. I would think that it should
match the first two documents, but not the third.
What am I missing here?
Thanks for any help!
--- code
2007 Apr 10
8
ferret-0.11.4-mswin32 not compatible with Ruby1.8.4
Just a quick note for future reference - at least for me, ferret won't
work on Ruby 1.8.4.
gem install ferret
Successfully installed ferret-0.11.4-mswin32
ruby -v
ruby 1.8.4 (2005-12-24) [i386-mswin32]
irb
irb(main):001:0> require 'ferret'
A windows error message box appears -
ruby.exe - Entry Point Not Found
The procedure entry point rb_w32_write could not be
2007 May 05
4
Stop words, fields, StandardAnalyzer quagmire
Hello,
I'm using: Ruby 1.8.6, Rails 1.2.3, ferret 0.11.4, acts_as_ferret from
svn stable.
I've had quite a day wrestling with trying to remove the use of
stopwords. The problem was that when searching for words like "no" or
"the", no results were found. I found a confusing behavior that
has taken me some time to figure out, and I hope sharing it
2006 Oct 27
1
Regexpr. analyzer
Hi!
I want to index HTML files, but without the tags, so I was thinking either I
remove them before I index it (expensive), or put up a RegExpAnalyzer.
BTW, when using an analyzer, does that mean that everything which it
declines (i.e. the RegExpAnalyzer doesn't match) won't be put into the
index files (i.e. blows it up)?
I came up with a simple test, which didn't
2007 Jan 19
9
Double-quoted query with "and" fails.
Hi,
We're using Ferret 0.9.4 and we've observed the following behavior.
Searching for 'fieldname: foo and bar' works fine while 'fieldname:
"foo and bar"' doesn't return any results. Is there a way to make
ferret recognize the ''and'' inside the query as a search term and not
an operator? (I hope I got the
2012 Jan 15
1
Correct Localized Numbers on Plots, related to glibc!
Dear R Helpers,
I want to localize my plots, i.e. the numbers by x & y axis be
Persian, using Persian numerals and Persian decimal separator. I
change the locale to fa_IR.utf8, but nothing on plots change. I can
change the numerals shaping to Persian ones (۱۲۳۴ instead of 1234)
using some non-standard fonts but the decimal point is a problem. I
asked about that in Persian-Computing mailing
2007 May 18
3
issues with : in the content
Hi,
I've discovered ferret and aaf this evening, I've just done some tests
and it seems perfect for my needs.
I'm indexing text data (title, description, etc) and also ethernet
hardware addresses (MAC).
Sorry if that sounds trivial but I can't find the way to correctly
index and achieve correct searches on MAC addresses.
If I do something like this:
index =
2006 Oct 31
3
No search results using Searcher
I just started using Ferret and I successfully indexed some documents. I
can search this index using the following code:
index = Index::Index.new(:path => path)
index.search_each("something") do |doc, score|
  print "##{doc} #{index[doc]['url']} - #{score}"
  print "\n"
end
However, when I try to use Search::Searcher and QueryParser