similar to: How to handle non-ascii characters

Displaying 20 results from an estimated 30000 matches similar to: "How to handle non-ascii characters"

2006 Jun 01
8
Windows progress
Hi there, What's the current status of the Windows port? I may be in a position to lend a hand over the next couple of weeks - where should I start looking? And what's the best way to get SVN HEAD? This happens: $ svn checkout svn://www.davebalmain.com/ferret/trunk ferret svn: Can't connect to host 'www.davebalmain.com': Connection refused --
2005 Dec 02
1
cFerret ETA?
I'm noticing some long delays when optimizing my index. I know this is terribly inefficient, but in order to make sure that my ActiveRecord model is in sync with my index, I'm optimizing after every new record that I store, like so: class Resume < ActiveRecord::Base include Ferret has_and_belongs_to_many :users SEARCH_INDEX = File.dirname(__FILE__) +
2005 Dec 02
1
Compile error on FreeBSD 4.10 gcc 2.95.4
FYI, I tried installing ferret on my freebsd virtual server and got this: retango# gem install ferret --include-dependencies Attempting local installation of 'ferret' Local gem file not found: ferret*.gem Attempting remote installation of 'ferret' Updating Gem source index for: http://gems.rubyforge.org Building native extensions. This could take a while...
2006 May 15
16
Ferret not able to read a Lucene Index?
Hi all, Having problems trying to get Ferret to read an index generated by Lucene. Am I right in thinking Ferret should be able to read a Lucene generated index no problem? Using the code snippets detailed in http://www.ruby-forum.com/topic/64099#new Any advice gratefully received. Many Thanks, Steven -- Posted via http://www.ruby-forum.com/.
2005 Dec 19
17
Indexing so slow......
I am indexing over 10,000 rows of data, it is very slow when it is indexing the 100,1000,10000 row, and now it is over 1 hour passed on the row 10,000. how to make it faster? here is my code: ================== doc = Document.new doc << Field.new("id", t.id, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("title", t.title,
2006 Jun 16
2
indexing large tokens
Hi, I'm using the StandardAnalyzer to build an index, and passing in Documents that have Fields that contain large tokens (22+ characters) interspersed with normal English words. This seems to cause the IndexWriter to slow to a crawl. Is this a known issue, or am I doing something wrong? If this is a known issue I don't have any problem just not indexing tokens longer than a
2006 Jan 02
11
aligning Ferret's IndexSearcher.search API with Lucene's
Recently I've been revisiting some of my search code. With a greater understanding of how Java Lucene implements its search methods, I realized that one level of abstraction is not present in the Ferret classes/methods. Here are the relevant method signatures: Ferret's search methods: in Ferret::Index::Index: search(query, options = {}) -> returns a TopDocs
2006 Jan 23
7
Search functionality and CMS
Hello, I am planning to build a bigger Internet platform and actually evaluating Java EE and Rails. I have a lot of Java experience and I am quite new to Rails. After playing some weeks with Rails I am sure that it is a mature web framework and I really like the productivity of Rails. One of the key advantages is that new developers will understand this platform much quicker than all the Java
2006 Jan 10
18
Ferret with IMAP dirs
I'd like to use ferret to build an imap indexer and search utility, but want to check first to see if anyone else is working on this and offer my help. Anyone? Also, if you could provide any helpful pointers on indexing directories via ferret, it'll be very much appreciated. I'm a lucene nuby. Thanks! John -- Posted via http://www.ruby-forum.com/.
2005 Nov 17
1
indexing source code
Hi again, I'm using ferret to index source code - DamageControl will allow users to search for text in source code. Currently I'm using the default index with no custom analyzer (I'm using the StandardAnalyzer). Do you have any recommendations about how to write an analyzer that will index source code in a more 'optimal' way? I.e. disregard common
2006 Mar 14
6
cFerret nearing completion
Hey folks, Some good news. I've finished cFerret and its ruby bindings to the point where I can run all of the unit tests. I still have to work out how I'm going to package and release it but it shouldn't be long now. If you can't wait you might like to try it from the subversion repository. It'll probably only work on linux at the moment and
2006 May 12
2
Benchmark - Thanks Dave for making this gnawer this FAST!!
Hi List, I've taken some time and made some tests on the performance of java-lucene, hyperestraier and ferret as Dave encourages the community of ferret to do so. Quite interesting numbers. Ferret indeed deserves to be called a high-performance port!! It's MyFirstBenchmark ( http://ferret.davebalmain.com/trac/wiki/MyFirstBenchmark ) so please don't be too cruel on
2006 May 04
5
How to install Ferret to get the best performance
Hey all, After dabbling with ActiveSearch, we're coming back around to take another look at Ferret. ActiveSearch slowed to a crawl after indexing about 20k documents, each 20 lines each. This time we may attempt to create multiple Ferret indexes (isolating each organization's data individually), since we eventually could have upwards of 20k documents for some
2006 May 17
8
How to implement full-text search with OR just like google?
The current full-text search will return the AND collection results, for example, if we use Article.search("aa bb"), then the articles that include "aa" and "bb" in the fields will be returned, how to return the articles that include "aa" OR "bb" effectively? A stumb method is to set up two queries respectively and collect them together with remove the
2006 Feb 07
1
setting of :key to :id in cFerret
Hi Dave, I've been reading this post below back in December 2005. Is it possible to set :key to :id in cFerret like suggested below? Thanks, Mac On 12/3/05, Carl Youngblood <carl at youngbloods.org <http://rubyforge.org/mailman/listinfo/ferret-talk>> wrote: >* I seem to be getting the same document multiple times in my search *>* results. I'm wondering if
2006 Jun 30
4
Substantial problems with write locking (and other flux)
I am having some great trouble keeping my Ferret indexer for ActiveRecord working. First the get_field_names disappears (now back), then I am collecting some major trouble with locking. Same thing here: exception 6 not handled: Could not obtain write lock when trying to write index A snippet like this just deadlocks retrying endlessly: begin @ferret_index << doc
2005 Oct 26
1
index compatibility
Hi, first of all: great work! I'd like to know which Lucene Version Ferret is based on, in other words: will I be able to read/write indexes created with current lucene trunk ? Thanks in advance, Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Telefon +49 351
2007 Jul 18
3
Ferret doesn't work with Luke
Hi, Does anyone know why the indexes created by Ferret can't be opened by Luke (http://www.getopt.org/luke/)? When I do development with Clucene I use Luke all the time to understand what is going on in the index. It is especially useful when trying to diagnose analyzer issues. When I try to open a Ferret index with Luke I get the message "Invalid or corrupted index". I
2006 Feb 07
15
So, this search thing...
I am using ferret right now, and it works great for all my regular text documents/information. My problem arises when I want to index/search all of our assets (mostly pdf files). Currently, there is no way to READ pdfs from Ruby. Because of this I have to resort to using Java to read the PDFs and then Lucene to index them. My problem here is a couple things. One, to index an asset I have
2006 Mar 19
3
Ferret 0.9.0-alpha (port of Apache Lucene to pure ruby)
Hi Folks, I've just released version 0.9.0. This latest version of Ferret is an alpha release. I have removed the old c extension and Ferret is now running on a fully ported C library. This has allowed some huge performance improvements both with regard to memory and CPU usage. There will probably be a few portability issues to start with. It has been developed on Linux so it should