thr3ads.net - similar to: "How to add Asia token analyzer to ferret simply?"

Displaying 20 results from an estimated 4000 matches similar to: "How to add Asia token analyzer to ferret simply?"

Is there any schema of full-text search that support utf-8?

2006 Jul 05

Is there any full-text search scheme that supports UTF-8, especially for Asian languages such as Chinese and Japanese? Ferret/acts_as_ferret does not work when keywords in these languages are searched, and it is also difficult to implement pagination, which needs both the count of search results and an offset. Very grateful! -- Posted via http://www.ruby-forum.com/.

searching with chinese chars

2006 Jul 18

Hi all, maybe not a Ferret question, but I assume someone here might have come across this already. I wrote a simple CGI app that adds docs into a Ferret index. The idea is to test Asian-language input and searching. The script that does the input seems to be OK. As David mentioned in a question I asked a little while ago, Ferret's index is agnostic, in the sense that you can store anything in

Ferret and non latin characters support

2007 Apr 08

I've successfully installed ferret and acts_as_ferret and have no problem with UTF-8 for accented characters. It returns correct results for e.g. français. My problem is with non-Latin characters (Persian, in fact). I have tested different locales with no success on both Debian and Mac. Any ideas? (ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6) -- Posted via http://www.ruby-forum.com/.

Per field analyzer

2006 Sep 09

Is there a way to add a per-field analyzer? I can't seem to find a way to do that. Thanks -- Kent --- http://www.datanoise.com

Which analyzer to use

2006 Sep 06

Lucene's standard analyzer splits words separated by underscores. Ferret doesn't do this. For example, if I create an index with only the document 'test_case' and search for 'case', it doesn't find anything. Lucene, on the other hand, finds it. The same goes for words separated by colons. Which analyzer should I use to emulate
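The behaviour asked about in this post (treating underscores and colons as word separators) can be illustrated in plain Ruby. This is only a sketch of the tokenization idea, not a Ferret analyzer; the helper name letter_tokens and the choice to lowercase tokens are assumptions made for the example.

```ruby
# Plain-Ruby illustration only; this does not call Ferret.
# letter_tokens is a hypothetical helper name.

# Split text into lowercase runs of letters, so that underscores,
# colons and other non-letter characters act as separators.
def letter_tokens(text)
  text.downcase.scan(/[[:alpha:]]+/)
end

puts letter_tokens("test_case").inspect
puts letter_tokens("Foo:bar").inspect
```

With tokens produced this way, a search for 'case' would have a 'case' term to match against a document containing 'test_case'.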

Chinese full text searching by acts_as_ferret?

2007 Apr 19

How can I add Chinese full-text search using acts_as_ferret? I don't know how to use this analyzer: RegExpAnalyzer.new(/./,false). Does it work like this: user search ----> acts_as_ferret ----> ferret? -- Posted via http://www.ruby-forum.com/.
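The /./ pattern in this post matches one character at a time, so a regex-driven tokenizer built on it would treat each Chinese character as its own token. The sketch below shows that idea in plain Ruby without calling Ferret; char_tokens and phrase_match? are hypothetical helper names, and skipping whitespace is an assumption (the bare /./ pattern would match spaces too).

```ruby
# Plain-Ruby illustration only; this does not call Ferret.
# char_tokens and phrase_match? are hypothetical helper names.

# Emit one token per non-whitespace character.
def char_tokens(text)
  text.scan(/[^[:space:]]/)
end

# True when the query's characters appear consecutively in the document.
def phrase_match?(document, query)
  query_tokens = char_tokens(query)
  return false if query_tokens.empty?
  char_tokens(document).each_cons(query_tokens.size).include?(query_tokens)
end

puts char_tokens("全文搜索").inspect
puts phrase_match?("支持中文全文搜索", "全文")
```

The point of the sketch is that the document and the query must be tokenized the same way; a multi-character query then behaves like a phrase of single-character tokens.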

trouble with PerFieldAnalyzer

2007 Mar 28

I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14). Script:

    require 'rubygems'
    require 'ferret'
    require 'pp'
    include Ferret::Analysis
    include Ferret::Index

    class TestAnalyzer
      def token_stream field, input
        pp field
        pp input
        LetterTokenizer.new(input)
      end
    end

    pfa =

tweaking minimum word length?

2006 Jul 26

Hi, Can Ferret be configured to change the minimum word length of what it indexes? Right now it seems to drop words of 3 characters or fewer, but I'd like to include words down to 2 characters. How would I do that? Francis

Querying against numeric fields? e.g. price:( >= min_price)

2006 Sep 12

Using acts_as_ferret I'm trying to do a query like: active:(true) title|body:(#{params[:s]}) product_price:( >= #{params[:min]}) where I want to return only the active products that contain the search term in the title or body and have a price >= params[:min]. I'm finding that even though I'm indexing the product price as an integer (so no .00 to cause

strip out non-alphanumeric characters before saving to index

2008 Jun 13

Does anyone know a simple way, with ferret or a_a_f, to strip out everything that's not a letter, number or space before saving to the index? I know that I could write a custom method for every indexed field that regexes them out, but I thought there might be a universal option for it... thanks max -- Posted via http://www.ruby-forum.com/.
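One way to avoid a separate method per field is a single shared helper applied to every value before it is handed to the indexer. The sketch below is plain Ruby and is not a ferret or a_a_f option; strip_non_alnum and clean_fields are hypothetical names, and the field hash is made-up example data.

```ruby
# Plain-Ruby illustration only; not a ferret or acts_as_ferret option.
# strip_non_alnum and clean_fields are hypothetical helper names.

# Remove every character that is not a letter, digit or whitespace.
def strip_non_alnum(text)
  text.gsub(/[^[:alnum:][:space:]]/, "")
end

# Apply the same cleanup to every string value in a hash of fields.
def clean_fields(fields)
  fields.each_with_object({}) do |(name, value), cleaned|
    cleaned[name] = value.is_a?(String) ? strip_non_alnum(value) : value
  end
end

doc = { :title => "Hello, world!", :body => "price: $42 (approx.)", :id => 7 }
puts clean_fields(doc).inspect
```

Note that the underscore is also removed, because [:alnum:] covers only letters and digits.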

Regexpr. analyzer

2006 Oct 27

Hi! I want to index HTML files, but without the tags, so I was thinking I could either remove them before indexing (expensive) or set up a RegExpAnalyzer. BTW, when using an analyzer, does that mean that everything it rejects (i.e. what the RegExpAnalyzer doesn't match) won't be put into the index files (i.e. blows it up)? I came up with a simple test, which didn't

Query question

2005 Dec 14

I have an index in which I want different records to be accessible to different users. I think I can do this by adding a "users" field to each record in the index and narrow down my queries to only those records matching the current user's userid. I have the userids separated by commas. What would be the right way to query for a certain user? I have to make sure that I

Ferret::Analysis::PerFieldAnalyzerWrapper is not exported

2006 Jun 15

Hi, I am on Ferret 0.9.3 and it seems to me that Ferret::Analysis::PerFieldAnalyzerWrapper is not available in ferret_ext. -- Sergei Serdyuk Red Leaf Software LLC web: http://redleafsoft.com -- Posted via http://www.ruby-forum.com/.

Chinese full-text support! Still fail-_-

2007 Apr 29

Hi all, I want to use ferret on my website, but when I input Chinese words, I have the same symptoms as Chengcai. In order to fix it, I have reviewed all the topics about Chinese support in our forum and tried everything you guys suggested, but still haven't made any progress. I downloaded the latest version of ferret from svn. Thanks and regards. captain Chengcai He wrote: > Hello everyone!

performance bottleneck

2007 Jul 14

I have my database in MySQL. I used ferret to index a table with 10 million rows. When I limit the selection to the first 1000 rows, it takes 200 seconds, but for the whole table it took more than four hours, after which I had to close my indexing application. I used the StandardAnalyzer for it. There is no problem on the database side, as retrieval of all the data in the table

bug when assigning new analyzer?

2007 May 09

    require 'rubygems'
    require 'ferret'
    include Ferret

    PATH = '/tmp/ferret_stopwords_test'
    index = Index::IndexWriter.new(:path => PATH, :create => true)
    index.analyzer = Analysis::StandardAnalyzer.new([])
    index << {:title => 'a few good men', :language => 'en'}
    index.analyzer =

Numeric Range or comparision doesn't work

2007 May 03

Hi, it looks like Ferret still compares numeric fields by lexical ordering, not numerical ordering. I am using Ferret 0.11.4 (I tried on both Linux and Windows; the results are the same).

    index = Ferret::Index::Index.new()
    docs = [
      {:num => 1, :data => "yes"},
      {:num => 1, :data => "no"},
      {:num => 10, :data => "yes"},
      {:num => 10, :data
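When values are compared as strings, as this post reports, a common workaround is to store numbers zero-padded to a fixed width so that lexical order and numerical order agree. The sketch below shows the effect in plain Ruby; it does not call Ferret, pad_number and WIDTH are hypothetical names, and it assumes non-negative integers that fit in the chosen width.

```ruby
# Plain-Ruby illustration only; this does not call Ferret.
# pad_number and WIDTH are hypothetical names chosen for the example.

WIDTH = 10 # assumed maximum number of digits

# Render a non-negative integer as a fixed-width, zero-padded string.
def pad_number(n, width = WIDTH)
  format("%0#{width}d", n)
end

nums = [1, 10, 2, 100]

# Plain string sort puts "10" and "100" before "2".
puts nums.map(&:to_s).sort.inspect

# Padded strings sort in the same order as the numbers themselves.
puts nums.map { |n| pad_number(n) }.sort.inspect
```

Query bounds would need the same padding as the indexed values for a range comparison to behave numerically.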

Memory leak in PerFieldAnalyzer

2007 Jul 29

Hello everyone, we've recently discovered a memory leak in the PerFieldAnalyzer. If you use the PerFieldAnalyzer (which you actually should), you should switch to a pure Ruby version of that analyzer. The C version of the analyzer consumes memory on every analysis request. You can find an example script to verify the leak[1]. Furthermore, we've added a workaround, building

acts_as_ferret : cannot use a customized Analyzer (as indicated in the AdvancedUsageNotes)

2007 Nov 13

Hi all, I cannot make aaf (rev. 220) use my custom analyzer, despite following the indications @ http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage To pinpoint the problem, I created a model + a simple analyzer with 2 stop words : "fax" and "gsm". test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a stop word. => I get a

No search results using Searcher

2006 Oct 31

I just started using Ferret and I successfully indexed some documents. I can search this index using the following code:

    index = Index::Index.new(:path => path)
    index.search_each("something") do |doc, score|
      print "##{doc} #{index[doc]['url']} - #{score}"
      print "\n"
    end

However, when I try to use Search::Searcher and QueryParser
