thr3ads.net - similar to: "Determine how many documents a term occurs in"

Displaying 20 results from an estimated 800 matches similar to: "Determine how many documents a term occurs in"

Possible Bug? indexWriter#doc_count counts deleted docs after #commit

2006 Sep 14

I'm playing with "updating" docs in my index, and I think I've found a bug with IndexWriter counting deleted docs. Script and output follow: ===== require 'rubygems' require 'ferret' p Ferret::VERSION @doc = {:id => '44', :name => 'fred', :email => 'abc at

Possible Bug? indexWriter#doc_count counts deleted docs after #commit

2006 Sep 14

Hi David, > Deleted documents don't get deleted until commit is called Ok, but FYI, my experiments show that #commit doesn't affect #doc_count, even across ruby sessions. On a different note, I'd like to request a variation of #add_document which returns the doc_id of the document added, as opposed to self. I'm trying to track down an issue with a large

Parallel indexing doesn't work?

2008 Jan 09

Hi, I'm trying to get parallelized ferret indexing working for my AAF indices, based on the example in the O'Reilly Ferret shortcut. However, the resulting indices after merging seem to have no actual documents. I went and made minimal changes to the example in the Ferret shortcut pdf, and indeed can't get that to work either. I'd appreciate any help

How to make a tag cloud with Ferret?

2007 May 14

Hello, I want to make a tag cloud using Ferret. How can I do so? I would need to know the number of occurrences of each word in the index. Thank you -- Posted via http://www.ruby-forum.com/.
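
The counting this question asks about can be sketched in plain Ruby. This is only an illustration under stated assumptions: the `docs` array is hypothetical sample text, and nothing here calls Ferret's API. It tallies how often each word occurs overall and how many documents each word occurs in.

```ruby
# Hypothetical in-memory documents standing in for indexed text.
# This is plain Ruby and does not use any Ferret calls.
docs = [
  "ruby search library",
  "ruby tag cloud",
  "tag cloud with ruby tags"
]

term_freq = Hash.new(0)  # total occurrences of each word
doc_freq  = Hash.new(0)  # number of documents containing each word

docs.each do |text|
  words = text.downcase.scan(/\w+/)
  words.each { |w| term_freq[w] += 1 }
  # uniq ensures a word is counted once per document
  words.uniq.each { |w| doc_freq[w] += 1 }
end

# Largest counts first, ties broken alphabetically.
cloud = term_freq.sort_by { |word, count| [-count, word] }
```

A tag cloud would then scale each word's font size by its count in `cloud`; with a real index the counts would come from the index's own term statistics rather than from rescanning the text.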

Lock errors and segfaults

2007 Apr 19

Greetings, I've been using ferret with great results now for a while, but in the last week, I've been running into some issues. I will occasionally see this message: Exception Message: Lock Error occured at <except.c>:103 in xpop_context Error occured in index.c:5368 - iw_open Couldn't obtain write lock when opening IndexWriter Which is accompanied by

Memory concerns ferret 11.4.

2007 Jun 04

Hi list, We just built our own ferret drb server (mostly because we don't do any indexing from within Rails). The ferret drb server only handles index inserts and some deletes. Usually we make batch inserts where we retrieve a couple of hundred or thousands of documents from a database and then insert them into ferret one by one. We call flush every 50th file. We are very impressed

Ferret progress update

2007 Feb 22

Hi folks, Just thought I better let you all know that I'm still working on the next release of Ferret. I've been working the last 7 days doing nothing but Ferret development. The last iteration generated a diff of almost 5000 lines so there are some pretty major changes. Most people won't notice these changes however as the API remains unchanged. But if you were having

Benchmark - Thanks Dave for making this gnawer this FAST!!

2006 May 12

Hi List, I've taken some time and made some tests on the performance of java-lucene, hyperestraier and ferret as Dave encourages the community of ferret to do so. Quite interesting numbers. Ferret indeed deserves to be called a high-performance port!! It's MyFirstBenchmark ( http://ferret.davebalmain.com/trac/wiki/MyFirstBenchmark ) so please don't be too cruel on

Assignment of values with different indexes

2012 Dec 06

I would like to take the values of observations and map them to a new index. I am not sure how to accomplish this. The result would look like so: x[1,2,3,4,5,6,7,8,9,10] becomes y[2,4,6,8,10,12,14,16,18,20] The "newindex" would not necessarily be this sequence, but a sequence I have stored in a vector, so it could be all kinds of values. here is what happens: > x <- rnorm(10)

aligning Ferret's IndexSearcher.search API with Lucene's

2006 Jan 02

Recently I've been revisiting some of my search code. With a greater understanding of how Java Lucene implements its search methods, I realized that one level of abstraction is not present in the Ferret classes/methods. Here are the relevant method signatures: Ferret's search methods: in Ferret::Index::Index: search(query, options = {}) -> returns a TopDocs

Modifying a data frame based on a vector that contains column numbers

2013 Mar 14

Hello! # I have a data frame: mydf<-data.frame(c1=rep(NA,5),c2=rep(NA,5),c3=rep(NA,5)) # I have an index whose length is always the same as nrow(mydf): myindex<-c(1,2,3,2,1) # I need c1 to have 1s in rows 1 and 5 (based on the information in myindex) # I need c2 to have 1s in rows 2 and 4 (also based on myindex) # I need c3 to have 1 in row 3 # In other words, I am trying to achieve this

Finding out all terms from search results. How?

2006 Jun 15

Hi everybody, I need to find out all terms (field values) from one of the fields from a set of documents returned by search. In other words, I have indexed documents with two fields. I do search on one field and then want to know all the other field's values from the found documents. How? -- Sergei Serdyuk Red Leaf Software LLC web: http://redleafsoft.com -- Posted via

oddness when adding to index -

2006 Oct 10

I was having some odd results when working with acts_as_ferret (current trunk), so I decided to test with the current version of ferret to see if I encountered the same problem. I did. Here are the details: installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret' (btw, same results on OSX) opened up an irb session: irb(main):001:0> require

How to do indexing after splitting my data-frame?

2008 Dec 20

Hello, after splitting a data-frame I want to access the results. Maybe the problem is that the factor/index is a string... ...or am I missing details of the index-usage? Please look and help: ======================================= > weblog <- read_weblog("web.log") > > > str(weblog) 'data.frame': 2247 obs. of 18 variables: $ host : Factor w/ 77

IndexReader#terms for all fields?

2007 Apr 09

Is it possible to query the index for a TermEnum for all fields in the index instead of just ? Thanks, John

Numeric Range or comparison doesn't work

2007 May 03

Hi, it looks like Ferret still compares numeric fields by lexical ordering, not numerical ordering. I am using Ferret 0.11.4 (I tried in both linux and windows, the results are the same). index = Ferret::Index::Index.new() docs = [ {:num => 1, :data => "yes"}, {:num => 1, :data => "no"}, {:num => 10, :data => "yes"}, {:num => 10, :data
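
The difference between lexical and numerical ordering that this post describes can be seen in plain Ruby, without any Ferret calls. The sketch below is an illustration only: the sample numbers and the `width` value are assumptions, and zero-padding to a fixed width is a general workaround for string-ordered values, not something the truncated post confirms.

```ruby
# Plain Ruby illustration; no Ferret API is used here.
nums = [1, 2, 10, 100]

# Compared as strings, values are ordered character by character,
# so "10" and "100" sort before "2".
lexical = nums.map(&:to_s).sort

# Zero-padding every value to the same width (width is an assumed value)
# makes string order agree with numeric order.
width = 5
padded = nums.map { |n| n.to_s.rjust(width, "0") }.sort

p lexical  # ["1", "10", "100", "2"]
p padded   # ["00001", "00002", "00010", "00100"]
```

The padding width has to be at least as large as the longest number that will ever be stored, otherwise the ordering breaks again for longer values.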

Getting non-stemmed terms from IndexReader

2007 Mar 04

I need to get a set of terms being indexed using Ferret. I used IndexReader.terms and it returns a list of TermEnum nicely. The only problem is that my analyzer includes a stemming filter. So now, the terms I'm getting back are all stemmed. Is there any way to get the original unstemmed terms back from the index somehow? Thanks. -- Posted via http://www.ruby-forum.com/.

Index::Index.new vs. Readers and Writers

2006 May 08

Hey gang, A post on the Rails forum a while back had it sound like you pretty much had to use the Index Readers & Writers if you were going to be potentially accessing an index from more than one process. (i.e. multiple dispatch.fcgi's, etc) Is this still the case, or does the main Index class do that black magic behind the scenes? =) I was having trouble implementing the

Did you mean ...? with act_as_ferret

2008 Jan 06

Hello, does anybody know how to implement a "Did you mean ...?" like Google with act_as_ferret? I think this is a possible way: 1. Generate a keyword-list (this is my difficulty. I don't know how to build such a list from the index) with no stop-words from the first index. e. g. (car, ship, plant, house) 2. Build a second index from this word-list where we store the word in

Ferret 0.11.0-rc1

2007 Feb 25

Hey folks, Sorry for cross posting like this but this is an important announcement for all Ferret users. ** Description ** Firstly for those who don't know, Ferret is a full-text search library which makes adding search to your application a breeze. It's much faster than MySQL full-text search as well as most other search libraries out there. It allows you to do Boolean (+ruby +
