similar to: Determine how many documents a term occurs in

Displaying 20 results from an estimated 800 matches similar to: "Determine how many documents a term occurs in"

2006 Sep 14
1
Possiible Bug ? indexWriter#doc_count counts deleted docs after #commit
I'm playing with "updating" docs in my index, and I think I've found a bug with IndexWriter counting deleted docs. Script and output follow: ===== require 'rubygems' require 'ferret' p Ferret::VERSION @doc = {:id => '44', :name => 'fred', :email => 'abc at
2006 Sep 14
2
Possiible Bug ? indexWriter#doc_count countsdeleted docs after #commit
Hi David, > Deleted documents don't get deleted until commit is called Ok, but FYI, my experiments show that #commit doesn't affect #doc_count, even across ruby sessions. On a different note, I'd like to request a variation of #add_document which returns the doc_id of the document added, as opposed to self. I'm trying to track down an issue with a large
2008 Jan 09
5
Parallel indexing doesn't work?
Hi, I'm trying to get parallelized ferret indexing working for my AAF indices, based on the example in the O'Reilly Ferret shortcut. However, the resulting indices after merging seem to have no actual documents. I went and made minimal changes to the example in the Ferret shortcut pdf, and indeed can't get that to work either. I'd appreciate any help
2007 May 14
3
How to make a Tag cloud with Ferret ?
Hello, I want to make a tag cloud using Ferret. How can I do so? I would need to know the count for each keyword in the index. Thank you -- Posted via http://www.ruby-forum.com/.
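The counts a tag cloud needs can be sketched in plain Ruby, independent of Ferret. This is only a minimal illustration, assuming the documents are available as an in-memory array of strings; `term_counts` is a hypothetical helper and not part of Ferret's API, and the simple `\w+` tokenizer stands in for a real analyzer.

```ruby
# Hypothetical helper (not part of Ferret): for each term, count the total
# number of occurrences and the number of documents that contain it.
def term_counts(docs)
  counts = Hash.new { |hash, term| hash[term] = { occurrences: 0, documents: 0 } }
  docs.each do |text|
    terms = text.downcase.scan(/\w+/)                     # naive tokenizer (assumption)
    terms.each { |term| counts[term][:occurrences] += 1 } # every occurrence counts
    terms.uniq.each { |term| counts[term][:documents] += 1 } # once per document
  end
  counts
end

docs = ["Ruby search with Ferret", "Ferret is a search library", "ruby ruby ruby"]
counts = term_counts(docs)
counts.each { |term, c| puts "#{term}: #{c[:occurrences]} occurrences in #{c[:documents]} documents" }
```

The `:documents` figure answers "how many documents does a term occur in", while `:occurrences` is the usual weight for sizing words in a tag cloud.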
2007 Apr 19
7
Lock errors and segfaults
Greetings, I've been using ferret with great results now for a while, but in the last week, I've been running into some issues. I will occasionally see this message: Exception Message: Lock Error occured at <except.c>:103 in xpop_context Error occured in index.c:5368 - iw_open Couldn't obtain write lock when opening IndexWriter Which is accompanied by
2007 Jun 04
2
Memory concerns ferret 11.4.
Hi list, We just built our own ferret drb server (mostly because we don't do any indexing from within rails). The ferret drb server only handles index inserts and some deletes. Usually we make batch inserts where we retrieve a couple of hundred or thousands of documents from a database and then insert them into ferret one by one. We call flush every 50th file. We are very impressed
2007 Feb 22
4
Ferret progress update
Hi folks, Just thought I better let you all know that I'm still working on the next release of Ferret. I've been working the last 7 days doing nothing but Ferret development. The last iteration generated a diff of almost 5000 lines so there are some pretty major changes. Most people won't notice these changes however as the API remains unchanged. But if you were having
2006 May 12
2
Benchmark - Thanks Dave for making this gnawer this FAST!!
Hi List, I've taken some time and made some tests on the performance of java-lucene, hyperestraier and ferret as Dave encourages the community of ferret to do so. Quite interesting numbers. Ferret indeed deserves to be called a high-performance port!! It's MyFirstBenchmark ( http://ferret.davebalmain.com/trac/wiki/MyFirstBenchmark ) so please don't be too cruel on
2012 Dec 06
4
Assignment of values with different indexes
I would like to take the values of observations and map them to a new index. I am not sure how to accomplish this. The result would look like so: x[1,2,3,4,5,6,7,8,9,10] becomes y[2,4,6,8,10,12,14,16,18,20] The "newindex" would not necessarily be this sequence, but a sequence I have stored in a vector, so it could be all kinds of values. here is what happens: > x <- rnorm(10)
2006 Jan 02
11
aligning Ferret's IndexSearcher.search API with Lucene's
Recently I've been revisiting some of my search code. With a greater understanding of how Java Lucene implements its search methods, I realized that one level of abstraction is not present in the Ferret classes/methods. Here are the relevant method signatures: Ferret's search methods: in Ferret::Index::Index: search(query, options = {}) -> returns a TopDocs
2013 Mar 14
2
Modifying a data frame based on a vector that contains column numbers
Hello! # I have a data frame: mydf<-data.frame(c1=rep(NA,5),c2=rep(NA,5),c3=rep(NA,5)) # I have an index whose length is always the same as nrow(mydf): myindex<-c(1,2,3,2,1) # I need c1 to have 1s in rows 1 and 5 (based on the information in myindex) # I need c2 to have 1s in rows 2 and 4 (also based on myindex) # I need c3 to have 1 in row 3 # In other words, I am trying to achieve this
2006 Jun 15
10
Finding out all terms from search results. How?
Hi everybody, I need to find out all terms (field values) from one of the fields from a set of documents returned by search. In other words, I have indexed documents with two fields. I do search on one field and then want to know all of the other field's values from the found documents. How? -- Sergei Serdyuk Red Leaf Software LLC web: http://redleafsoft.com -- Posted via
2006 Oct 10
5
oddness when adding to index -
I was having some odd results when working with acts_as_ferret (current trunk), so I decided to test with the current version of ferret to see if I encountered the same problem. I did. Here are the details: installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret' (btw, same results on OSX) opened up an irb session: irb(main):001:0> require
2008 Dec 20
1
How to do indexing after splitting my data-frame?
Hello, after splitting a data-frame I want to access the results. Maybe the problem is that the factor/index is a string... or am I missing some details of the index usage? Please look and help: ======================================= > weblog <- read_weblog("web.log") > > > str(weblog) 'data.frame': 2247 obs. of 18 variables: $ host : Factor w/ 77
2007 Apr 09
5
IndexReader#terms for all fields?
Is it possible to query the index for a TermEnum for all fields in the index instead of just ? Thanks, John
2007 May 03
1
Numeric Range or comparision doesn't work
Hi, it looks like Ferret still compares numeric fields by lexical ordering, not numerical ordering. I am using Ferret 0.11.4 (I tried on both Linux and Windows; the results are the same). index = Ferret::Index::Index.new() docs = [ {:num => 1, :data => "yes"}, {:num => 1, :data => "no"}, {:num => 10, :data => "yes"}, {:num => 10, :data
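The lexical-versus-numerical ordering described in this post can be reproduced in plain Ruby without Ferret. The sketch below is an illustration only: the sample values and `PAD_WIDTH` are assumptions, and zero-padding is shown as a generic workaround for string-ordered fields, not as something the thread confirms for Ferret.

```ruby
nums = [1, 10, 2, 100]

# Compared as strings, "10" sorts before "2" because '1' < '2' character by character.
lexical = nums.map(&:to_s).sort

# A range check on the raw strings misses values that are numerically in range.
lexical_in_range = nums.map(&:to_s).select { |s| s >= "2" && s <= "10" }

# Zero-padding to a fixed width makes string order agree with numeric order.
PAD_WIDTH = 5 # hypothetical width, must cover the largest expected value
padded = nums.map { |n| format("%0#{PAD_WIDTH}d", n) }.sort
padded_in_range = padded.select { |s| s >= format("%0#{PAD_WIDTH}d", 2) && s <= format("%0#{PAD_WIDTH}d", 10) }

p lexical          # ["1", "10", "100", "2"]
p lexical_in_range # []
p padded           # ["00001", "00002", "00010", "00100"]
p padded_in_range  # ["00002", "00010"]
```

If a field is only ever compared as text, storing the padded form keeps range queries and sorting consistent with the numeric values.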
2007 Mar 04
5
Getting non-stemmed terms from IndexReader
I need to get a set of terms being indexed using Ferret. I used IndexReader.terms and it returns a list of TermEnum nicely. The only problem is that my analyzer includes a stemming filter. So now, the terms I'm getting back are all stemmed. Is there any way to get the original unstemmed terms back from the index somehow? Thanks. -- Posted via http://www.ruby-forum.com/.
2006 May 08
3
Index::Index.new vs. Readers and Writers
Hey gang, A post on the Rails forum a while back had it sound like you pretty much had to use the Index Readers & Writers if you were going to be potentially accessing an index from more than one process. (i.e. multiple dispatch.fcgi's, etc) Is this still the case, or does the main Index class do that black magic behind the scenes? =) I was having trouble implementing the
2008 Jan 06
3
Did you mean ...? with act_as_ferret
Hello, does anybody know how to implement a "Did you mean ...?" like Google with act_as_ferret? I think this is a possible way: 1. Generate a keyword-list (this is my difficulty. I don't know how to build such a list from the index) with no stop-words from the first index. e. g. (car, ship, plant, house) 2. Build a second index from this word-list where we store the word in
2007 Feb 25
9
Ferret 0.11.0-rc1
Hey folks, Sorry for cross posting like this but this is an important announcement for all Ferret users. ** Description ** Firstly for those who don't know, Ferret is a full-text search library which makes adding search to your application a breeze. It's much faster than MySQL full-text search as well as most other search libraries out there. It allows you to do Boolean (+ruby +