similar to: indexing large tokens

Displaying 20 results from an estimated 1000 matches similar to: "indexing large tokens"

2007 Apr 10
8
ferret-0.11.4-mswin32 not compatible with Ruby1.8.4
Just a quick note for future reference - at least for me, ferret won''t work on Ruby 1.8.4. gem install ferret Successfully installed ferret-0.11.4-mswin32 ruby -v ruby 1.8.4 (2005-12-24) [i386-mswin32] irb irb(main):001:0> require ''ferret'' A windows error message box appears - ruby.exe - Entry Point Not Found The procedure entry point rb_w32_write could not be
2006 Jun 14
3
In memory IndexReader bug?
Hi All, Hope all is going well. I''m having trouble with the following code creating an in memory index reader - it seems to be attempting to read from a file regardless. Here''s the simple code: require ''rubygems'' require ''ferret'' a = Ferret::Index::Index.new r = Ferret::Index::IndexReader.new(nil) Running the code on my OS X machine
2007 May 09
3
bug when assigning new analyzer?
require ''rubygems'' require ''ferret'' include Ferret PATH = ''/tmp/ferret_stopwords_test'' index = Index::IndexWriter.new(:path => PATH, :create => true) index.analyzer = Analysis::StandardAnalyzer.new([]) index << {:title => ''a few good men'', :language => ''en''} index.analyzer =
2007 Jan 19
9
Double-quoted query with "and" fails.
Hi, We''re using Ferret 0.9.4 and we''ve observed the following behavior. Searching for ''fieldname: foo and bar'' works fine while ''fieldname: "foo and bar"'' doesn''t return any results. Is there a way to make ferret recognize the ''and'' inside the query as a search term and not an operator? (I hope I got the
2006 Apr 19
2
How to do case-sensitive searches
Forgive me if this topic has already been discussed on the list. I googled but couldn''t find much. I''d like to search through text for US state abbreviations that are written in capitals. What is the best way to do this? I read somewhere that tokenized fields are stored in the index in lowercase, so I am concerned that I will lose precision. What is the best way to store a
2006 Jul 07
4
How to add Asia token analyzer to ferret simply?
Hi,David Can you give me an example of how to add analyzer to ferret to Asian languages? My web application will have to support multi language search,which means,for example,both Chinese and English will be searched through the form. Currently,I have decided to use the simple token principles,which means that every Chinese character will be a token,although this is not so well in some
2006 Apr 25
3
Migrating to 0.9.1
After migrating to 0.9.1, I got: usr/local/lib/ruby/gems/1.8/gems/activesupport-1.3.1/lib/active_support/dependencies.rb:123:in `const_missing'': uninitialized constant TokenFilter (NameError) Here is a snapshot of my code: ... require ''ferret'' class MyFilter < Analysis::TokenFilter ... I works fine on my dev machine, but not a production server (shared host). Any
2006 Oct 19
2
How to deal with accentuated chars in 0.10.8?
I''m startin to use Ferret and acts_as_ferret. I need to use something like EuropeanAnalyzer (http://olivier.liquid-concept.com/fr/pages/2006_acts_as_ferret_accentuated_chars). By example, if the user search by "gonzalez" you can find documents taht contents the term "gonz?lez" (gonz&aacute;lez) The EuropeanAnalyzer is based on Ferret::Analysis::TokenFilter,
2007 Apr 08
3
How to make custom TokenFilter?
In the O''reilly Ferret short cuts, I found very useful example for me. It explains how to make custom Tokenizer. But that book doesn''t explain how to make custom Filter. (especially, how to implement the #text=() method) I''m a newbee and I don''t understand how do I create my own custom Filter. Are there some good source code examples?? -- Posted via
2006 May 08
3
Index::Index.new vs. Readers and Writers
Hey gang, A post on the Rails forum a while back had it sound like you pretty much had to use the Index Readers & Writers if you were going to be potentially accessing an index from more than one process. (i.e. multiple dispatch.fcgi''s, etc) Is this still the case, or does the main Index class do that black magic behind the scenes? =) I was having trouble implementing the
2006 Jul 26
13
tweaking minimum word length?
Hi, Can Ferret be configured to change the minimum word length of what it indexes? Right now it seems to drop words 3 characters or less, but I''d like to include words going down to 2 characters. How would I do that? Francis
2007 Mar 22
3
Noice words...
Hi I use acts_as_ferret on an app that is in Danish and English. In Danish english words like "and" and "under" has meaning. Is it possible to make ferret search for these words? As it is now a seach for "under" returns nothing even-though I know the word is present in the index. Cheers Mattias
2006 Sep 14
1
Possiible Bug ? indexWriter#doc_count counts deleted docs after #commit
I''m playing with "updating" docs in my index, and I think I''ve found bug with IndexWriter counting deleted docs. Script and output follow: ===== require ''rubygems'' require ''ferret'' p Ferret::VERSION @doc = {:id => ''44'', :name => ''fred'', :email => ''abc at
2007 May 05
4
Stop words, fields, StandardAnalyzer quagmire
Hello, I''m using: Ruby 1.8.6, Rails 1.2.3, ferret 0.11.4, acts_as_ferret from svn stable. I''ve had quite a day wrestling with trying to remove the use of stopwords. The problem was that when searching for words like "no" or "the", no results were found. I found a confusing thing behavior that has taken me some time to figure out, and I hope sharing it
2006 Jul 18
4
Some basic questions
Hi, David and everyone, I''ve had Ferret running fine in a production Rails application for a while now. I haven''t updated Ferret or really looked at the Ferret-related code since probably January, but I recently started thinking about trying out the latest version (we were using 0.3.2, I think). I got the latest (0.9.4) and have noticed things break. In particular, I used to
2006 Sep 14
2
Possiible Bug ? indexWriter#doc_count countsdeleted docs after #commit
Hi David, > Deleted documents don''t get deleted until commit is called Ok, but FYI, my experiments show that #commit doesn''t affect #doc_count, even across ruby sessions. On a different note, I''d like to request a variation of #add_document which returns the doc_id of the document added, as opposed to self. I''m trying to track down an issue with a large
2006 Apr 25
12
Newb Gem Install Help!
This seems newbish, but I can''t seem to get this gem to install no matter what I do. The gem in question is login_generator, and no matter what folder I put the .gem file in, it can''t read locally - am I missing something? Remote install hangs at updating the source. -- Posted via http://www.ruby-forum.com/.
2007 Apr 28
6
Determine how many documents a term occurs in
Is there a fast way to determine how many documents a term occurs in, besides iterating through every document with TermDocEnum? -- Best regards, Stian Gryt?yr
2007 Apr 08
10
Ferret and non latin characters support
I''ve successfully installed ferret and acts_as_ferret and have no problem with utf-8 for accented characters. It returns correct results fot e.g. fran?ais. My problem is with non latin characters (Persian indeed). I have tested different locales with no success both on Debian and Mac. Any idea? (ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6) -- Posted via http://www.ruby-forum.com/.
2005 Dec 29
5
Short words not indexed?
I noticed that if I have a field that contains something like "Institute for medicine", that if I search using nay of these queries: for *for* for~ Nothing shows up. If I search for either of the other two words, though, that term would show up in the result set. Does this indicate that short words like "for" are not indexed? Thanks! Jen