Displaying 20 results from an estimated 800 matches similar to: "Determine how many documents a term occurs in"
2006 Sep 14
1
Possible Bug? IndexWriter#doc_count counts deleted docs after #commit
I'm playing with "updating" docs in my index, and I think I've found a bug
with IndexWriter counting deleted docs. Script and output follow:
=====
require 'rubygems'
require 'ferret'
p Ferret::VERSION
@doc = {:id => '44', :name => 'fred', :email => 'abc at
2006 Sep 14
2
Possible Bug? IndexWriter#doc_count counts deleted docs after #commit
Hi David,
> Deleted documents don't get deleted until commit is called
Ok, but FYI, my experiments show that #commit doesn't affect #doc_count,
even across ruby sessions.
On a different note, I'd like to request a variation of #add_document
which returns the doc_id of the document added, as opposed to self.
I'm trying to track down an issue with a large
2008 Jan 09
5
Parallel indexing doesn't work?
Hi,
I'm trying to get parallelized ferret indexing working for my AAF
indices, based on the example in the O'Reilly Ferret shortcut.
However, the resulting indices after merging seem to have no actual
documents.
I went and made minimal changes to the example in the Ferret shortcut
pdf, and indeed can't get that to work either. I'd appreciate any help
2007 May 14
3
How to make a Tag cloud with Ferret?
Hello,
I want to make a tag cloud using Ferret.
How can I do so?
I need to know the keyword count for each word in the
index.
Thank you
--
Posted via http://www.ruby-forum.com/.
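The counting step this question asks about can be sketched in plain Ruby. This is only an illustration, not Ferret's API: the sample documents, the \w+ tokenization, and the 1 to 5 size buckets are all assumptions made for the example. Other threads in this list mention IndexReader#terms and TermEnum as the way to enumerate terms in a real index; the sketch below deliberately avoids them and counts over in-memory strings instead.

```ruby
# Hypothetical sample documents standing in for indexed content.
docs = [
  'ruby search library',
  'ruby on rails search',
  'fast search'
]

# Document frequency: the number of documents each word occurs in.
# uniq ensures a word repeated inside one document is counted once.
doc_freq = Hash.new(0)
docs.each do |text|
  text.downcase.scan(/\w+/).uniq.each { |word| doc_freq[word] += 1 }
end

# Map each count onto a size bucket from 1 (rare) to 5 (most common).
# The bucket range is an arbitrary choice for this sketch.
max = doc_freq.values.max
cloud = doc_freq.map do |word, n|
  [word, 1 + (4.0 * (n - 1) / [max - 1, 1].max).round]
end.to_h

cloud.sort.each { |word, size| puts "#{word}: size #{size}" }
```

With these sample documents, "search" lands in the largest bucket and words that appear in a single document land in the smallest.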
2007 Apr 19
7
Lock errors and segfaults
Greetings,
I've been using ferret with great results now for a while, but in the last week, I've
been running into some issues.
I will occasionally see this message:
Exception Message: Lock Error occured at <except.c>:103 in xpop_context
Error occured in index.c:5368 - iw_open
Couldn't obtain write lock when opening IndexWriter
Which is accompanied by
2007 Jun 04
2
Memory concerns ferret 11.4.
Hi list,
We just built our own ferret drb server (mostly because we don't do
any indexing from within Rails).
The ferret drb server only handles index inserts and some deletes.
Usually we make batch inserts where we retrieve a couple of hundred or
thousands of documents from a database and then insert them into
ferret one by one.
We call flush every 50th file. We are very impressed
2007 Feb 22
4
Ferret progress update
Hi folks,
Just thought I better let you all know that I'm still working on the
next release of Ferret. I've been working the last 7 days doing
nothing but Ferret development. The last iteration generated a diff of
almost 5000 lines so there are some pretty major changes. Most people
won't notice these changes however as the API remains unchanged. But
if you were having
2006 May 12
2
Benchmark - Thanks Dave for making this gnawer this FAST!!
Hi List,
I took some time and ran some tests on the performance of
java-lucene, hyperestraier and ferret, as Dave encourages the community
of ferret to do so.
Quite interesting numbers. Ferret indeed deserves to be called a
high-performance port!!
It's MyFirstBenchmark (
http://ferret.davebalmain.com/trac/wiki/MyFirstBenchmark ) so please
don't be too cruel on
2012 Dec 06
4
Assignment of values with different indexes
I would like to take the values of observations and map them to a new index. I am not sure how to accomplish this. The result would look like so:
x[1,2,3,4,5,6,7,8,9,10]
becomes
y[2,4,6,8,10,12,14,16,18,20]
The "newindex" would not necessarily be this sequence, but a sequence I have stored in a vector, so it could be all kinds of values. Here is what happens:
> x <- rnorm(10)
2006 Jan 02
11
aligning Ferret's IndexSearcher.search API with Lucene's
Recently I've been revisiting some of my search code. With a greater
understanding of how Java Lucene implements its search methods, I
realized that one level of abstraction is not present in the Ferret
classes/methods. Here are the relevant method signatures:
Ferret's search methods:
in Ferret::Index::Index:
search(query, options = {}) -> returns a TopDocs
2013 Mar 14
2
Modifying a data frame based on a vector that contains column numbers
Hello!
# I have a data frame:
mydf<-data.frame(c1=rep(NA,5),c2=rep(NA,5),c3=rep(NA,5))
# I have an index whose length is always the same as nrow(mydf):
myindex<-c(1,2,3,2,1)
# I need c1 to have 1s in rows 1 and 5 (based on the information in myindex)
# I need c2 to have 1s in rows 2 and 4 (also based on myindex)
# I need c3 to have 1 in row 3
# In other words, I am trying to achieve this
2006 Jun 15
10
Finding out all terms from search results. How?
Hi everybody,
I need to find out all terms (field values) from one of the fields from
a set of documents returned by search.
In other words, I have indexed documents with two fields. I search on
one field and then want to know all of the other field's values from the found
documents.
How?
--
Sergei Serdyuk
Red Leaf Software LLC
web: http://redleafsoft.com
--
Posted via
2006 Oct 10
5
oddness when adding to index -
I was having some odd results when working with acts_as_ferret (current
trunk), so I decided to test with the current version of ferret to see
if I encountered the same problem. I did. Here are the details:
installed ferret 0.10.10 on debian sarge with 'sudo gem install ferret'
(btw, same results on OSX)
opened up an irb session:
irb(main):001:0> require
2008 Dec 20
1
How to do indexing after splitting my data-frame?
Hello,
after splitting a data-frame I want to access the results.
Maybe the problem is that the factor/index is a string...
...or am I missing some details of index usage?
Please look and help:
=======================================
> weblog <- read_weblog("web.log")
>
>
> str(weblog)
'data.frame': 2247 obs. of 18 variables:
$ host : Factor w/ 77
2007 Apr 09
5
IndexReader#terms for all fields?
Is it possible to query the index for a TermEnum for all fields in
the index instead of just ?
Thanks,
John
2007 May 03
1
Numeric Range or comparison doesn't work
Hi,
It looks like Ferret still compares numeric fields by lexical ordering,
not numerical ordering. I am using Ferret 0.11.4 (I tried on both Linux
and Windows; the results are the same).
index = Ferret::Index::Index.new()
docs = [
{:num => 1, :data => "yes"},
{:num => 1, :data => "no"},
{:num => 10, :data => "yes"},
{:num => 10, :data
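The lexical-versus-numeric ordering problem described in this post can be reproduced without Ferret at all. The sketch below is plain Ruby and makes no claim about how Ferret stores or compares fields; it only shows why string comparison misorders numbers, and one common workaround (zero-padding values to a fixed width before indexing). The width of 5 and the helper name pad are assumptions for the example.

```ruby
nums = [1, 10, 2, 100]

# Compared as strings, "10" sorts before "2" because strings are
# compared character by character.
lexical = nums.map(&:to_s).sort
p lexical  # => ["1", "10", "100", "2"]

# Hypothetical helper: zero-pad to a fixed width so that string order
# matches numeric order. The width (5) is an arbitrary choice.
pad = ->(n) { format('%05d', n) }
padded = nums.map(&pad).sort
p padded   # => ["00001", "00002", "00010", "00100"]

# A string range "1".."10" does not contain "2", but the padded range does.
in_lexical_range = ('1'..'10').cover?('2')
in_padded_range  = (pad.(1)..pad.(10)).cover?(pad.(2))
p in_lexical_range  # => false
p in_padded_range   # => true
```

Padding only helps if every value is written with the same width, and this sketch does not handle negative numbers.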
2007 Mar 04
5
Getting non-stemmed terms from IndexReader
I need to get a set of terms being indexed using Ferret. I used
IndexReader.terms and it returns a list of TermEnum nicely. The only
problem is that my analyzer includes a stemming filter.
So now, the terms I'm getting back are all stemmed. Is there any way to
get the original unstemmed terms back from the index somehow? Thanks.
--
Posted via http://www.ruby-forum.com/.
2006 May 08
3
Index::Index.new vs. Readers and Writers
Hey gang,
A post on the Rails forum a while back had it sound like you pretty much
had to use the Index Readers & Writers if you were going to be
potentially accessing an index from more than one process. (i.e.
multiple dispatch.fcgi's, etc)
Is this still the case, or does the main Index class do that black magic
behind the scenes? =)
I was having trouble implementing the
2008 Jan 06
3
Did you mean ...? with acts_as_ferret
Hello,
does anybody know how to implement a "Did you mean ...?" like Google
with acts_as_ferret?
I think this is a possible way:
1. Generate a keyword-list (this is my difficulty. I don't know how to
build such a list from the index) with no stop-words from the first
index.
e.g. (car, ship, plant, house)
2. Build a second index from this word-list where we store the word in
2007 Feb 25
9
Ferret 0.11.0-rc1
Hey folks,
Sorry for cross posting like this but this is an important
announcement for all Ferret users.
** Description **
Firstly for those who don't know, Ferret is a full-text search library
which makes adding search to your application a breeze. It's much
faster than MySQL full-text search as well as most other search libraries
out there. It allows you to do Boolean (+ruby +