thr3ads.net - similar to: "Getting non-stemmed terms from IndexReader"

Displaying 20 results from an estimated 2000 matches similar to: "Getting non-stemmed terms from IndexReader"

2007 Apr 09

IndexReader#terms for all fields?

Is it possible to query the index for a TermEnum for all fields in the index instead of just ? Thanks, John

2007 Apr 28

Determine how many documents a term occurs in

Is there a fast way to determine how many documents a term occurs in, besides iterating through every document with TermDocEnum? -- Best regards, Stian Grytøyr
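For context on why a full scan should not be necessary: an inverted index already keeps, for each term, the set of documents that contain it, so the document count is just the size of that set. The sketch below is a toy, pure-Ruby illustration of that idea and is not Ferret's API; `build_postings` and `document_frequency` are hypothetical helper names, and the whitespace/word tokenisation is an assumption.

```ruby
require 'set'

# Toy inverted index: maps each term to the set of document ids containing it.
# Hypothetical helper for illustration only, not part of Ferret.
def build_postings(docs)
  postings = Hash.new { |hash, term| hash[term] = Set.new }
  docs.each_with_index do |text, doc_id|
    text.downcase.scan(/\w+/).each { |term| postings[term] << doc_id }
  end
  postings
end

# Document frequency is the size of the term's posting set,
# so no per-document iteration is needed at lookup time.
def document_frequency(postings, term)
  postings.key?(term) ? postings[term].size : 0
end

docs = ["red car", "blue car", "green house"]
postings = build_postings(docs)
puts document_frequency(postings, "car")
puts document_frequency(postings, "ship")
```

A real search library would normally store this count alongside the term dictionary, which is what makes the lookup cheap.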

2008 Jan 06

Did you mean ...? with act_as_ferret

Hello, does anybody know how to implement a "Did you mean ...?" feature like Google's with act_as_ferret? I think this is a possible way: 1. Generate a keyword list with no stop words from the first index, e.g. (car, ship, plant, house). This is my difficulty: I don't know how to build such a list from the index. 2. Build a second index from this word list where we store the word in
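As a rough illustration of the suggestion step only, here is a minimal pure-Ruby sketch. It assumes the keyword list has already been extracted, and it swaps the poster's second index for a plain linear scan using Levenshtein edit distance. `edit_distance` and `did_you_mean` are hypothetical names, and the threshold of 2 is an arbitrary assumption.

```ruby
# Levenshtein edit distance between two strings (row-by-row dynamic programming).
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Return the closest keyword within max_distance, or nil when the query is
# already a keyword or nothing is close enough. Hypothetical helper.
def did_you_mean(query, keywords, max_distance = 2)
  best = keywords.min_by { |word| edit_distance(query, word) }
  return nil if best.nil? || best == query
  edit_distance(query, best) <= max_distance ? best : nil
end

keywords = %w[car ship plant house]
puts did_you_mean("hause", keywords).inspect
puts did_you_mean("house", keywords).inspect
```

A linear scan is fine for a small word list; for a large vocabulary, an index over character n-grams (closer to the two-index plan in the post) would narrow the candidates first.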

2007 Dec 05

Term frequency doesn't decrement after document is deleted.

Hey all, The frequency count returned by my ferret reader doesn't decrement after I remove documents with those terms. Using the example from http://ferret.davebalmain.com/api/classes/Ferret/Index/TermEnum.html the frequency increments after a document is added but stays the same after a document is deleted. index.reader.terms(:tags).each do |term, freq| "#{term} appears

2008 Mar 27

Proper noun stemming

Hi All, I was wondering if anyone had a solution for the following problem. I use QueryParser to stem my documents before adding them to a database. During the stemming process I would like to find a way of keeping proper nouns that span two or more words together as a phrase. For example, "New York" or "Gordon Brown" or "Prime Minister" get split up. I see

2006 Sep 09

Per field analyzer

Is there a way to add a per-field analyzer? I can't seem to find a way to do that. Thanks -- Kent --- http://www.datanoise.com

2007 Apr 03

How can I count frequency of terms in a document?

Hi there. I need some help. Is there a way to count the frequencies of terms in a document in Ferret? I know that Ferret has the IndexReader#terms_docs_for method, which counts across all documents. I need to count the frequencies of terms in a specific document. Is there some way? -- Posted via http://www.ruby-forum.com/.
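To make the distinction concrete: a per-document count only looks at one document's own tokens, independent of the rest of the index. The following is a minimal pure-Ruby sketch of that counting, not a Ferret call; `term_frequencies` is a hypothetical helper, and the lowercasing and `\w+` tokenisation are assumptions that will differ from whatever analyzer the index actually uses.

```ruby
# Count how often each term occurs in a single document's text.
# Hypothetical helper; tokenisation here is a simple assumption.
def term_frequencies(text)
  counts = Hash.new(0)
  text.downcase.scan(/\w+/).each { |term| counts[term] += 1 }
  counts
end

freqs = term_frequencies("The cat saw the other cat")
freqs.each { |term, count| puts "#{term}: #{count}" }
```

If the index stores term vectors for the field, the same numbers can usually be read from the index instead of re-tokenising the stored text.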

2015 Jul 26

Get term from document by position

> Snippet highlighting is something that was worked on for a GSoC project a few years ago, and is mentioned in our FAQ: <http://trac.xapian.org/wiki/FAQ/Snippets>. It's not available in the 1.2 series, but as I understand it should work out of the box in 1.3.3.

I tried it; this approach returns snippets that have nothing to do with the search string. Moreover, it takes too

2006 Jun 14

In memory IndexReader bug?

Hi All, Hope all is going well. I'm having trouble with the following code creating an in-memory index reader - it seems to be attempting to read from a file regardless. Here's the simple code: require 'rubygems' require 'ferret' a = Ferret::Index::Index.new r = Ferret::Index::IndexReader.new(nil) Running the code on my OS X machine

2017 Jun 14

KMeans Clusterer - Going forward

Hello, I have finished moving the API to PIMPL classes and will fix issues within the current code over the next week, based on reviews from mentors. The next step is to start forming document vectors that are reduced and more useful. This helps considerably in saving run time (since the time for distance calculation depends on the number of terms). Getting the useful terms within a

2011 May 27

Does OP_NEAR works with stemming?

Hi All, I used the OP_NEAR operator for queryparser, and when I searched for "apple store" in my own collection, the query was parsed as "Zappl:(pos=1) NEAR 11 Zstore:(pos=2)" but retrieved nothing. However, if I type in "Apple Store", the query is parsed as Xapian::Query((apple:(pos=1) NEAR 11 store:(pos=2))) and some results are shown. I'm not sure whether

2006 Aug 11

Proposed changes to omindex

Currently Available Items
=========================
1) Have the Q prefix contain the 16 byte MD5 of the full file name used for document lookup during indexing.
2) Add the document's last modified time to the value table (ID 0). This would allow incremental indexing based on the timestamp and also sorting by date in omega (SORT=0)
   a. Currently I store the timestamp

2007 Apr 14

Error on optimize leads to corrupt index?

The following exception occurred while trying to optimize a large index: vendor/gems/rdig-0.3.4/lib/rdig/index.rb:46:in `optimize': End-of- File Error occured at <except.c>:93 in xraise (EOFError) Error occured in store.c:216 - is_refill current pos = 0, file length = 0 Now, I get the following error any time I try to create a new index on the directory that I was trying

2007 Feb 20

ferret webpage down

The ferret webpage at http://ferret.davebalmain.com/ has been down for a number of days. Any idea what's going on? Or how to notify the webmaster? -- Posted via http://www.ruby-forum.com/.

2007 Jul 26

doubts in ferret

I am using ferret to build a search application for my site. I used a stemming analyzer to build the index. When I search for "market" I get hits, but when I search for "marketing" I get no hits, even though there are fields containing the word marketing. I am using the stemming analyzer while searching as well. Is the problem with the analyzer? Or am I missing something?

2007 Mar 28

trouble with PerFieldAnalyzer

I'm having trouble with PerFieldAnalyzer (ferret version 0.10.14). Script: require 'rubygems' require 'ferret' require 'pp' include Ferret::Analysis include Ferret::Index class TestAnalyzer def token_stream field, input pp field pp input LetterTokenizer.new(input) end end pfa =

2006 Feb 17

IndexReader NotImplemented

Hi there, Sorry if this has come up before, but I couldn't see it obviously addressed anywhere. There are a few methods in IndexReader that raise NotImplementedErrors. I'm specifically interested in get_term_vector, but there are a number of others. Is there anything specific holding these back, or would patches to implement them be accepted? Thanks, -- Alex

2007 Apr 08

Ferret and non latin characters support

I've successfully installed ferret and acts_as_ferret and have no problem with UTF-8 for accented characters. It returns correct results for, e.g., français. My problem is with non-Latin characters (Persian, in fact). I have tested different locales with no success on both Debian and Mac. Any idea? (ferret 0.11.4, acts_as_ferret 0.4.0, rails 1.1.6) -- Posted via http://www.ruby-forum.com/.