similar to: Ferret and non-Latin characters support

Displaying 20 results from an estimated 1000 matches similar to: "Ferret and non-Latin characters support"

2006 Jul 18
10
searching with Chinese chars
Hi all, maybe not a Ferret question, but I assume someone here might have come across this already. I wrote a simple CGI app that adds docs into a Ferret index. The idea is to test Asian-language input and searching. The script that does the input seems to be OK. As David mentioned in a question I asked a little while ago, Ferret's index is agnostic, in the sense that you can store anything in
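
For reference, a minimal sketch of such a test script, assuming Ruby 1.8 and an on-disk index (the field names and the sample Japanese string are made up for illustration). Ferret stores whatever bytes it is given; whether a later search finds them is entirely a question of how the analyzer tokenizes the text at index time and the query at search time.

    require 'rubygems'
    require 'ferret'

    $KCODE = 'u'   # Ruby 1.8: treat strings and regexps as UTF-8

    index = Ferret::Index::Index.new(:path => '/tmp/asian_input_test')

    # The index is content-agnostic: the UTF-8 bytes round-trip unchanged.
    index << {:id => '1', :content => '日本語のテキスト'}
    puts index[0][:content]   # => 日本語のテキスト
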
2006 Jul 07
4
How to add an Asian token analyzer to Ferret simply?
Hi, David. Can you give me an example of how to add an analyzer to Ferret for Asian languages? My web application will have to support multi-language search, which means, for example, that both Chinese and English will be searched through the form. Currently, I have decided to use a simple tokenization principle, which means that every Chinese character will be a token, although this is not so good in some
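
One way to get the "one token per Chinese character" behaviour described here is Ferret's RegExpAnalyzer with a single-character pattern. This is only a sketch and assumes Ruby 1.8 with $KCODE set so that /./ sees characters rather than bytes; for mixed Chinese/English content a PerFieldAnalyzer (see further down this page) may be a better fit, since ASCII words are also split into single letters.

    require 'rubygems'
    require 'ferret'

    $KCODE = 'u'

    # One token per character; false = do not lowercase.
    cjk_analyzer = Ferret::Analysis::RegExpAnalyzer.new(/./, false)

    index = Ferret::Index::Index.new(:path     => '/tmp/cjk_index',
                                     :analyzer => cjk_analyzer)
    index << {:content => '中文搜索测试'}

    # A phrase query matches the two single-character tokens in sequence.
    index.search_each('content:"中文"') do |doc_id, score|
      puts "doc #{doc_id}: #{score}"
    end
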
2006 Sep 06
9
Which analyzer to use
Lucene's standard analyzer splits words separated by underscores. Ferret doesn't do this. For example, if I create an index with only the document 'test_case' and search for 'case', it doesn't find anything. Lucene, on the other hand, finds it. The same story goes for words separated by colons. Which analyzer should I use to emulate
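
One option (a sketch, not necessarily the only answer) is a RegExpAnalyzer whose token pattern excludes underscores and colons, so that 'test_case' is indexed as the two tokens 'test' and 'case':

    require 'rubygems'
    require 'ferret'

    # Tokens are runs of letters and digits; '_' and ':' act as separators.
    splitter = Ferret::Analysis::RegExpAnalyzer.new(/[a-zA-Z0-9]+/, true)

    index = Ferret::Index::Index.new(:analyzer => splitter)
    index << {:content => 'test_case'}

    index.search_each('content:case') do |doc_id, score|
      puts "matched doc #{doc_id}"   # the document is now found
    end
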
2007 May 09
3
bug when assigning new analyzer?
require 'rubygems' require 'ferret' include Ferret PATH = '/tmp/ferret_stopwords_test' index = Index::IndexWriter.new(:path => PATH, :create => true) index.analyzer = Analysis::StandardAnalyzer.new([]) index << {:title => 'a few good men', :language => 'en'} index.analyzer =
2007 Apr 19
5
Chinese full text searching by acts_as_ferret?
How can I add Chinese full-text searching with acts_as_ferret? I don't know how to use this analyzer: RegExpAnalyzer.new(/./, false). Does it work like this: user searches ----> acts_as_ferret ----> ferret? -- Posted via http://www.ruby-forum.com/.
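
As far as I recall, the acts_as_ferret versions of that era took a second options hash that is passed through to Ferret's index, which is where a per-character analyzer can be plugged in; the model and field names below are invented for illustration, so treat this as a hedged sketch. The flow at search time is indeed user query -> acts_as_ferret (find_by_contents) -> Ferret.

    class Article < ActiveRecord::Base
      # Second hash goes to Ferret::Index::Index.new (aaf API of that era).
      acts_as_ferret({ :fields => [:title, :body] },
                     { :analyzer => Ferret::Analysis::RegExpAnalyzer.new(/./, false) })
    end

    # e.g. in a controller: Article.find_by_contents("中文")
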
2008 Jun 13
2
strip out non-alphanumeric characters before saving to index
Does anyone know a simple way, with Ferret or a_a_f, to strip out everything that's not a letter, number or space before saving to the index? I know that I could write a custom method for every indexed field that regexes them out, but I thought there might be a universal option for it... thanks, max -- Posted via http://www.ruby-forum.com/.
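
I am not aware of a single "strip everything" switch, but an analyzer whose token pattern only admits letters and digits has much the same effect on what reaches the index (note that stored field values keep the original text; stripping those too would still need a pre-processing step). A sketch:

    require 'rubygems'
    require 'ferret'

    # Only runs of letters/digits become tokens; punctuation is never indexed.
    clean_analyzer = Ferret::Analysis::RegExpAnalyzer.new(/[a-zA-Z0-9]+/, true)

    index = Ferret::Index::Index.new(:path     => '/tmp/clean_index',
                                     :analyzer => clean_analyzer)
    index << {:content => "Hello, world!!! (v2.0)"}
    # Indexed tokens: hello, world, v, 2, 0

    # To also clean the stored value, pre-process the string first, e.g.:
    # text.gsub(/[^a-zA-Z0-9 ]/, '')
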
2007 Mar 28
6
trouble with PerFieldAnalyzer
I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14). Script: require 'rubygems' require 'ferret' require 'pp' include Ferret::Analysis include Ferret::Index class TestAnalyzer def token_stream field, input pp field pp input LetterTokenizer.new(input) end end pfa =
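
For comparison, the usual PerFieldAnalyzer pattern looks roughly like the sketch below (field names are illustrative): the constructor takes the default analyzer, and per-field analyzers are then assigned with []= (or add_field).

    require 'rubygems'
    require 'ferret'
    include Ferret::Analysis

    # StandardAnalyzer everywhere except :code, which keeps raw whitespace tokens.
    pfa = PerFieldAnalyzer.new(StandardAnalyzer.new)
    pfa[:code] = WhiteSpaceAnalyzer.new(false)

    index = Ferret::Index::Index.new(:analyzer => pfa)
    index << {:title => 'A Title', :code => 'Foo_Bar::baz'}
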
2005 Nov 17
1
indexing source code
Hi again, I'm using ferret to index source code - DamageControl will allow users to search for text in source code. Currently I'm using the default index with no custom analyzer (I'm using the StandardAnalyzer). Do you have any recommendations about how to write an analyzer that will index source code in a more 'optimal' way? I.e. disregard common
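
One hedged approach for source code is a custom analyzer whose tokenizer keeps whole identifiers (underscores included) and which filters out language keywords the way stop words are filtered in prose; the keyword list below is purely illustrative.

    require 'rubygems'
    require 'ferret'
    include Ferret::Analysis

    class SourceCodeAnalyzer
      KEYWORDS = %w(def end class module if else elsif return)  # example stop list

      def initialize
        # Identifiers like foo_bar stay whole; punctuation is skipped.
        @inner = RegExpAnalyzer.new(/[A-Za-z_][A-Za-z0-9_]*/, true)
      end

      def token_stream(field, text)
        StopFilter.new(@inner.token_stream(field, text), KEYWORDS)
      end
    end

    index = Ferret::Index::Index.new(:analyzer => SourceCodeAnalyzer.new)
    index << {:content => "def damage_control_run(options = {})\n  run!\nend"}
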
2006 Jul 05
3
Is there any scheme of full-text search that supports UTF-8?
Is there any full-text search scheme that supports UTF-8, especially for Asian languages such as Chinese, Japanese, etc.? Ferret/acts_as_ferret does not work when keywords in these languages are searched, and it is also difficult to implement pagination, which needs both the count of search results and an offset. Very grateful! -- Posted via http://www.ruby-forum.com/.
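
On the pagination half of the question: Ferret's Index#search accepts :limit and :offset, and the result object it returns carries the total hit count, so a single query can drive a paginated view; a sketch (the path and query are placeholders):

    require 'rubygems'
    require 'ferret'

    index = Ferret::Index::Index.new(:path => '/path/to/index')

    page     = 3
    per_page = 10

    # One call returns both the requested page of hits and the overall count.
    top_docs = index.search('content:ferret',
                            :limit  => per_page,
                            :offset => (page - 1) * per_page)

    puts "#{top_docs.total_hits} matches"
    top_docs.hits.each do |hit|
      puts "doc #{hit.doc} score #{hit.score}"
    end
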
2007 Apr 13
5
[Ferret] Serious memory leak on Joyent / TextDrive / Solaris
There is a serious memory leak bug in Ferret. I'm having this error on a TextDrive container (aka Joyent Accelerator) running OpenSolaris with Ferret 0.11.4. It happens while searching for some terms with accented or special characters. This makes Ferret allocate lots of memory (usually reaching 3+ GB) and fail if another query like this is executed. Any ideas on that? Could this be locale
2008 Jan 03
1
properly escaping special characters in AAF?
For most cases, I've got search working in Rails as follows: ## controller: term = params[:search][:term] @results = MyModel.find_by_contents "#{term}*" The '*' character is appended to the search term so that searches match anything that begins with 'term'. For the most part, this is great, but let's say term is equal to
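
I do not recall a built-in escape helper in Ferret of that vintage, so a common workaround is to backslash-escape the query-syntax characters before appending the wildcard. The helper and the character list below are assumptions based on Ferret's query syntax, not an exhaustive or official list.

    # Hypothetical helper: escape characters the QueryParser treats specially.
    FERRET_SPECIAL = /([\\+\-!\(\)\{\}\[\]^"~*?:|<>=])/

    def escape_for_ferret(term)
      term.gsub(FERRET_SPECIAL) { |ch| "\\#{ch}" }
    end

    term = params[:search][:term]
    @results = MyModel.find_by_contents("#{escape_for_ferret(term)}*")
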
2007 Apr 13
4
[Ferret] QueryParser memory leak bug (Joyent/OpenSolaris)
QueryParser fails badly, allocating an enormous amount of memory when processing query strings with special/accented characters. See: irb(main):002:0> require 'rubygems' irb(main):003:0> require 'ferret' irb(main):004:0> include Ferret irb(main):005:0> index = Index::Index.new irb(main):008:0> index << "something" # Now the error
2007 Mar 31
4
not understanding search results
I'm getting some results that I don't understand from a search. The code, based on the tutorial, and the results are below. Everything makes sense to me, except the results for the 'title:"Some"' query. I would think that it should match the first two documents, but not the third. What am I missing here? Thanks for any help! --- code
2007 Apr 10
8
ferret-0.11.4-mswin32 not compatible with Ruby 1.8.4
Just a quick note for future reference - at least for me, Ferret won't work on Ruby 1.8.4. gem install ferret Successfully installed ferret-0.11.4-mswin32 ruby -v ruby 1.8.4 (2005-12-24) [i386-mswin32] irb irb(main):001:0> require 'ferret' A Windows error message box appears - ruby.exe - Entry Point Not Found The procedure entry point rb_w32_write could not be
2007 May 05
4
Stop words, fields, StandardAnalyzer quagmire
Hello, I'm using: Ruby 1.8.6, Rails 1.2.3, Ferret 0.11.4, acts_as_ferret from svn stable. I've had quite a day wrestling with trying to remove the use of stopwords. The problem was that when searching for words like "no" or "the", no results were found. I found some confusing behavior that has taken me some time to figure out, and I hope sharing it
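
The usual trap (hedged, but consistent with what tends to get reported on this list) is that the empty stop list has to be used on both sides: in the analyzer the index/writer uses and in the one the QueryParser uses, otherwise "no" and "the" are dropped at one end or the other. A sketch:

    require 'rubygems'
    require 'ferret'
    include Ferret

    no_stop_words = Analysis::StandardAnalyzer.new([])   # empty stop list

    index = Index::Index.new(:path     => '/tmp/no_stopwords',
                             :analyzer => no_stop_words)
    index << {:title => 'no country for old men'}

    parser = QueryParser.new(:fields   => [:title],
                             :analyzer => no_stop_words)
    index.search_each(parser.parse('title:no')) do |doc_id, score|
      puts "found doc #{doc_id}"
    end
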
2006 Oct 27
1
Regexpr. analyzer
Hi! I want to index HTML files, but without the tags, so I was thinking I could either remove them before indexing (expensive) or set up a RegExpAnalyzer. BTW, when using an analyzer, does that mean that everything it declines (i.e. that the RegExpAnalyzer doesn't match) won't be put into the index files (i.e. won't blow them up in size)? I came up with a simple test, which didn't
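
On the first point, a crude regexp-based pre-strip is usually cheap enough in practice; and as far as I understand Ferret's analyzers, text the tokenizer's pattern does not match never becomes a token (though any stored field keeps its full original string). A sketch of the pre-strip route; the gsub below is an approximation, not a real HTML parser:

    require 'rubygems'
    require 'ferret'

    html  = '<html><body><h1>Title</h1><p>Some &amp; more text</p></body></html>'

    # Naive tag strip: good enough for indexing, not for rendering.
    plain = html.gsub(/<[^>]*>/, ' ').gsub('&amp;', '&').squeeze(' ').strip

    index = Ferret::Index::Index.new(:path => '/tmp/html_index')
    index << {:content => plain}   # only the stripped text is indexed
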
2007 Jan 19
9
Double-quoted query with "and" fails.
Hi, We're using Ferret 0.9.4 and we've observed the following behavior. Searching for 'fieldname: foo and bar' works fine while 'fieldname: "foo and bar"' doesn't return any results. Is there a way to make ferret recognize the 'and' inside the query as a search term and not an operator? (I hope I got the
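
A hedged guess at the cause: 'and' is in the StandardAnalyzer's default stop list, so it is dropped from the indexed (and parsed) phrase, and the quoted phrase can then fail to match. Indexing and parsing with an empty stop list, as in the stop-words sketch above, keeps 'and' as a real term; printing the parsed query is a quick way to see what the parser actually builds.

    require 'rubygems'
    require 'ferret'
    include Ferret

    parser = QueryParser.new(:fields   => [:fieldname],
                             :analyzer => Analysis::StandardAnalyzer.new)

    # Inspect what the phrase becomes after analysis / stop-word removal.
    puts parser.parse('fieldname:"foo and bar"').to_s
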
2012 Jan 15
1
Correct Localized Numbers on Plots, related to glibc!
Dear R Helpers, I want to localize my plots, i.e. have the numbers on the x & y axes be Persian, using Persian numerals and the Persian decimal separator. I change the locale to fa_IR.utf8, but nothing on the plots changes. I can change the numeral shaping to Persian (۱۲۳۴ instead of 1234) using some non-standard fonts, but the decimal point is a problem. I asked about that on the Persian-Computing mailing
2007 May 18
3
issues with : in the content
Hi, I've discovered Ferret and aaf this evening; I've just done some tests and it seems perfect for my needs. I'm indexing text data (title, description, etc.) and also Ethernet hardware addresses (MAC). Sorry if this sounds trivial, but I can't find a way to correctly index MAC addresses and get correct searches on them. If I do something like this: index =
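
For exact-match data like MAC addresses, one hedged option is to declare the field untokenized via FieldInfos, so the whole address is indexed as a single term, and then query it with a TermQuery (which sidesteps the query parser's treatment of ':'); the field names are illustrative.

    require 'rubygems'
    require 'ferret'

    field_infos = Ferret::Index::FieldInfos.new
    # :mac is stored and indexed as one exact, untokenized term.
    field_infos.add_field(:mac,   :index => :untokenized, :store => :yes)
    field_infos.add_field(:title, :index => :yes,         :store => :yes)

    index = Ferret::Index::Index.new(:path        => '/tmp/mac_index',
                                     :field_infos => field_infos)
    index << {:title => 'eth0 on server1', :mac => '00:11:22:33:44:55'}

    # TermQuery avoids having to escape the colons in a parsed query string.
    query = Ferret::Search::TermQuery.new(:mac, '00:11:22:33:44:55')
    index.search_each(query) do |doc_id, score|
      puts "doc #{doc_id}"
    end
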
2006 Oct 31
3
No search results using Searcher
I just started using Ferret and I successfully indexed some documents. I can search this index using the following code: index = Index::Index.new(:path => path) index.search_each("something") do |doc, score| print "##{doc} #{index[doc]['url']} - #{score}" print "\n" end However, when I try to use Search::Searcher and QueryParser
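
For comparison, a Searcher/QueryParser version would look roughly like the sketch below. A common cause of empty results with this combination is a parser that has not been told which fields to search (or that uses a different analyzer than the index was built with), so both are passed explicitly here; the field names are assumed.

    require 'rubygems'
    require 'ferret'
    include Ferret

    searcher = Search::Searcher.new(path)   # same path as Index::Index.new above
    parser   = QueryParser.new(:fields   => [:url, :content],
                               :analyzer => Analysis::StandardAnalyzer.new)

    query = parser.parse('something')
    searcher.search_each(query) do |doc_id, score|
      puts "##{doc_id} - #{score}"   # stored fields can be read via the Index or an IndexReader
    end
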