thr3ads.net - similar to: "Ferret and non latin characters support"

Displaying 20 results from an estimated 1000 matches similar to: "Ferret and non latin characters support"

2006 Jul 18

searching with chinese chars

Hi all, maybe not a Ferret question, but I assume someone here might have come across this already. I wrote a simple CGI app that adds docs to a Ferret index. The idea is to test Asian-language input and searching. The script that does the input seems to be OK. As David mentioned in reply to a question I asked a little while ago, Ferret's index is agnostic, in the sense that you can store anything in

2006 Jul 07

How to add Asia token analyzer to ferret simply?

Hi David, can you give me an example of how to add an analyzer to Ferret for Asian languages? My web application will have to support multi-language search, which means that, for example, both Chinese and English will be searched through the form. For now I have decided to use a simple tokenizing rule in which every Chinese character is a token, although this does not work so well in some
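
The one-token-per-character rule described in this post can be sketched in plain Ruby. This is only an illustration of the rule, not Ferret's API: `simple_tokens` is a hypothetical helper, and wiring such a pattern into a Ferret analyzer is left out because it would be an assumption about the library.

```ruby
# Hypothetical helper, not part of Ferret: split mixed text into tokens.
# Each Han character becomes its own token; each run of ASCII letters or
# digits becomes one lowercased token. Everything else is dropped.
def simple_tokens(text)
  text.scan(/\p{Han}|[A-Za-z0-9]+/).map(&:downcase)
end

tokens = simple_tokens("Ruby 全文搜索 test")
puts tokens.inspect
```

Single-character tokens keep the rule simple, at the cost of matching characters that only happen to be adjacent, which is presumably the weakness the poster goes on to mention.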

2006 Sep 06

Which analyzer to use

Lucene's standard analyzer splits words separated by underscores. Ferret doesn't do this. For example, if I create an index with only the document 'test_case' and search for 'case', it doesn't find anything. Lucene, on the other hand, finds it. The same goes for words separated by colons. Which analyzer should I use to emulate
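
The splitting behavior asked for here can be illustrated without either library. The sketch below is plain Ruby and makes no claim about how Ferret or Lucene tokenize internally; `split_tokens` is a hypothetical name. It treats any character that is not a letter or digit, including '_' and ':', as a separator.

```ruby
# Hypothetical tokenizer sketch in plain Ruby; it does not call Ferret or Lucene.
# Tokens are runs of letters and digits, so '_' and ':' act as separators.
def split_tokens(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

puts split_tokens("test_case").inspect
puts split_tokens("title:Some").inspect
```

By contrast, a pattern such as `/\w+/` counts the underscore as a word character and would keep 'test_case' as a single token.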

2007 May 09

bug when assigning new analyzer?

require 'rubygems'
require 'ferret'
include Ferret
PATH = '/tmp/ferret_stopwords_test'
index = Index::IndexWriter.new(:path => PATH, :create => true)
index.analyzer = Analysis::StandardAnalyzer.new([])
index << {:title => 'a few good men', :language => 'en'}
index.analyzer =

2007 Apr 19

Chinese full text searching by acts_as_ferret?

How can I add Chinese full-text search using acts_as_ferret? There is the analyzer RegExpAnalyzer.new(/./, false), but I don't know how to use it. Does it work like this: user search ----> acts_as_ferret ----> ferret? -- Posted via http://www.ruby-forum.com/.

2008 Jun 13

strip out non-alphanumeric characters before saving to index

Does anyone know a simple way, with ferret or a_a_f, to strip out everything that's not a letter, number or space before saving to the index? I know that I could write a custom method for every indexed field that regexes them out, but I thought there might be a universal option for it... Thanks, max -- Posted via http://www.ruby-forum.com/.
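
The per-field regex approach the poster mentions can be sketched as a single plain-Ruby helper. `strip_non_alnum` is a hypothetical name; whether ferret or a_a_f offers a built-in option for this is not stated in the snippet, so none is assumed here.

```ruby
# Hypothetical helper: remove every character that is not a letter,
# a digit, or whitespace. The POSIX bracket classes follow the string's
# encoding, so accented and non-Latin letters in UTF-8 text are kept.
def strip_non_alnum(text)
  text.gsub(/[^[:alnum:][:space:]]/, "")
end

puts strip_non_alnum("Hello, world! 42")
```

Calling one shared helper from each indexed field's accessor keeps the cleaning rule in a single place instead of repeating the regex per field.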

2007 Mar 28

trouble with PerFieldAnalyzer

I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14). Script:
require 'rubygems'
require 'ferret'
require 'pp'
include Ferret::Analysis
include Ferret::Index
class TestAnalyzer
  def token_stream field, input
    pp field
    pp input
    LetterTokenizer.new(input)
  end
end
pfa =

2005 Nov 17

indexing source code

Hi again, I'm using ferret to index source code - DamageControl will allow users to search for text in source code. Currently I'm using the default index with no custom analyzer (I'm using the StandardAnalyzer). Do you have any recommendations about how to write an analyzer that will index source code in a more 'optimal' way? I.e. disregard common

2006 Jul 05

Is there any schema of full-text search that support utf-8?

Is there any full-text search scheme that supports UTF-8, especially for Asian languages such as Chinese, Japanese, etc.? Ferret/acts_as_ferret does not work when keywords in these languages are searched, and it is also difficult to implement pagination, which needs both the count of search results and an offset. Very grateful! -- Posted via http://www.ruby-forum.com/.

2007 Apr 13

[Ferret] Serious memory leak on Joyent / TextDrive / Solaris

There is a serious memory leak bug in ferret. I'm seeing this error on a TextDrive Container (aka Joyent Accelerator) running OpenSolaris with Ferret 0.11.4. It happens while searching for some terms with accented or special characters. This makes ferret allocate lots of memory (usually reaching 3+ GB) and fail if another query like this is executed. Any ideas on that, could this be locale

2008 Jan 03

properly escaping special characters in AAF?

For most cases, I've got search working in Rails as follows:
## controller:
term = params[:search][:term]
@results = MyModel.find_by_contents "#{term}*"
The '*' character is appended to the search term so that searches match anything that begins with 'term'. For the most part, this is great, but let's say term is equal to
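
One way to make user input safe before the '*' is appended is to backslash-escape characters that a query parser may treat as syntax. The sketch below is plain Ruby; `escape_term` is a hypothetical helper, and the character list is an assumption that should be checked against the QueryParser documentation for the Ferret version in use.

```ruby
# Assumed set of query-syntax characters (an assumption, not taken from
# the thread): each one is prefixed with a backslash.
SPECIAL_CHARS = /[()\[\]{}:!+"~^\-|<>=*?\\]/

# Hypothetical helper: escape a raw user term, leaving other text as is.
def escape_term(term)
  term.gsub(SPECIAL_CHARS) { |ch| "\\#{ch}" }
end

# The trailing '*' is added after escaping so it still acts as a wildcard.
query = escape_term("foo:bar") + "*"
puts query
```

Escaping first and appending the wildcard second keeps the prefix-match behavior described above while preventing characters in the user's term from being read as operators.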

2007 Apr 13

[Ferret] QueryParser memory leak bug (Joyent/OpenSolaris)

QueryParser fails badly, allocating an enormous amount of memory when processing query strings with special/accented characters. See:
irb(main):002:0> require 'rubygems'
irb(main):003:0> require 'ferret'
irb(main):004:0> include Ferret
irb(main):005:0> index = Index::Index.new
irb(main):008:0> index << "something"
# Now the error

2007 Mar 31

not understanding search results

I'm getting some results that I don't understand from a search. The code, based on the tutorial, and the results are below. Everything makes sense to me, except the results for the 'title:"Some"' query. I would think that it should match the first two documents, but not the third. What am I missing here? Thanks for any help! --- code

2007 Apr 10

ferret-0.11.4-mswin32 not compatible with Ruby1.8.4

Just a quick note for future reference - at least for me, ferret won't work on Ruby 1.8.4.
gem install ferret
Successfully installed ferret-0.11.4-mswin32
ruby -v
ruby 1.8.4 (2005-12-24) [i386-mswin32]
irb
irb(main):001:0> require 'ferret'
A Windows error message box appears - ruby.exe - Entry Point Not Found The procedure entry point rb_w32_write could not be

2007 May 05

Stop words, fields, StandardAnalyzer quagmire

Hello, I'm using: Ruby 1.8.6, Rails 1.2.3, ferret 0.11.4, acts_as_ferret from svn stable. I've had quite a day wrestling with trying to remove the use of stopwords. The problem was that when searching for words like "no" or "the", no results were found. I found a confusing behavior that has taken me some time to figure out, and I hope sharing it

2006 Oct 27

Regexpr. analyzer

Hi! I want to index HTML files, but without the tags, so I was thinking that I either remove them before indexing (expensive) or set up a RegExpAnalyzer. BTW, when using an analyzer, does that mean that everything it declines (i.e. what the RegExpAnalyzer doesn't match) won't be put into the index files (i.e. blows it up)? I came up with a simple test, which didn't

2007 Jan 19

Double-quoted query with "and" fails.

Hi, we're using Ferret 0.9.4 and we've observed the following behavior. Searching for 'fieldname: foo and bar' works fine while 'fieldname: "foo and bar"' doesn't return any results. Is there a way to make ferret recognize the 'and' inside the query as a search term and not an operator? (I hope I got the

2012 Jan 15

Correct Localized Numbers on Plots, related to glibc!

Dear R Helpers, I want to localize my plots, i.e. have the numbers on the x and y axes be Persian, using Persian numerals and the Persian decimal separator. I changed the locale to fa_IR.utf8, but nothing on the plots changes. I can change the numeral shaping to Persian (۱۲۳۴ instead of 1234) using some non-standard fonts, but the decimal point is a problem. I asked about that in Persian-Computing mailing

2007 May 18

issues with : in the content

Hi, I've discovered ferret and aaf this evening, I've just done some tests and it seems perfect for my needs. I'm indexing text data (title, description, etc) and also ethernet hardware addresses (MAC). Sorry if that sounds trivial but I can't find the way to correctly index and achieve correct searches on MAC addresses. If I do something like this: index =

2006 Oct 31

No search results using Searcher

I just started using Ferret and I successfully indexed some documents. I can search this index using the following code:
index = Index::Index.new(:path => path)
index.search_each("something") do |doc, score|
  print "##{doc} #{index[doc]['url']} - #{score}"
  print "\n"
end
However, when I try to use Search::Searcher and QueryParser
