Displaying 7 results from an estimated 7 matches for "max_doc".
2006 Nov 22
2
crash while retrieving term vectors
This program reliably crashes for me (usually a segfault):
require 'rubygems'
require 'ferret'
reader=Ferret::Index::IndexReader.new ARGV
fields=reader.field_infos.fields
reader.max_doc.times{|n|
fields.each{|field|
reader.term_vector(n,field)
} unless reader.deleted?(n)
print "."; STDOUT.flush
}
As you can see, it just goes through the index, retrieving all the term
vectors. I imagine term vectors must be enabled in at least one field to
trigger this......
2006 Jun 22
3
Partition results based on field
Hello all
I'm using Ferret for a site wide search where I have several kinds of
(similar) objects in a central index (using a "type" field containing
the class name). This works great, and I can search all objects with one
query.
What I'd like to do now is to limit the results so that there will be a
maximum of 10 (or 5 or whatever) results for each type.. I
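A per-type cap like the one asked about here can be applied to the hits after the search returns. A minimal sketch in plain Ruby (no Ferret calls; the hit hashes and the :type key are illustrative assumptions, not the poster's actual schema):

```ruby
# Sketch: keep at most `limit` hits per value of the :type field,
# preserving the original (score) order of the hits.
def cap_per_type(hits, limit)
  counts = Hash.new(0)  # how many hits seen so far per type
  hits.select { |hit| (counts[hit[:type]] += 1) <= limit }
end

hits = [
  { :id => 1, :type => 'Article' },
  { :id => 2, :type => 'Article' },
  { :id => 3, :type => 'Comment' },
  { :id => 4, :type => 'Article' },  # third Article, dropped when limit is 2
]
p cap_per_type(hits, 2).map { |h| h[:id] }  # => [1, 2, 3]
```

Post-filtering this way does one query and trims in memory; for very large result sets, one query per type with its own `:limit` may be cheaper.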
2006 Sep 14
2
Possible Bug? IndexWriter#doc_count counts deleted docs after #commit
Hi David,
> Deleted documents don't get deleted until commit is called
Ok, but FYI, my experiments show that #commit doesn't affect #doc_count,
even across ruby sessions.
On a different note, I'd like to request a variation of #add_document
which returns the doc_id of the document added, as opposed to self.
I'm trying to track down an issue with a large
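The behaviour reported in this thread can be pictured with a toy model in plain Ruby. This models only the bookkeeping; the class and method names are made up for illustration and are not Ferret's API:

```ruby
# Toy model of the reported behaviour: doc_count keeps counting
# deleted docs, while a "live" count subtracts the deletion map.
class ToyIndex
  def initialize
    @docs    = []
    @deleted = {}
  end

  # Returns the new doc_id instead of self (the variation requested above).
  def add_document(doc)
    @docs << doc
    @docs.size - 1
  end

  def delete(doc_id)
    @deleted[doc_id] = true
  end

  def doc_count   # counts deleted docs too, as observed in the report
    @docs.size
  end

  def live_count  # what one might expect doc_count to return after commit
    @docs.size - @deleted.size
  end
end

idx = ToyIndex.new
id = idx.add_document(:id => '44', :name => 'fred')
idx.add_document(:id => '45', :name => 'joe')
idx.delete(id)
p idx.doc_count   # => 2  (the deleted doc is still counted)
p idx.live_count  # => 1
```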
2006 Sep 22
3
Error with :create => true and existing index
I implemented a "reindex" command which simply creates an IndexWriter
with :create => true for a pre-existing index.
The "reindexing" seems to start out ok, with several thousand docs
added, then Ferret throws an exception:
IO Error occured: couldn't rename file "index\_0.tmp" to "index\_0.cfs":
<File exists>
I guess that _0.cfs is held
2006 Sep 14
1
Possible Bug? IndexWriter#doc_count counts deleted docs after #commit
I'm playing with "updating" docs in my index, and I think I've found a bug
with IndexWriter counting deleted docs. Script and output follow:
=====
require 'rubygems'
require 'ferret'
p Ferret::VERSION
@doc = {:id => '44', :name => 'fred', :email => 'abc at
2007 Mar 01
2
FerretHash
...keys
reader=Ferret::Index::IndexReader.new(@name)
result=reader.terms(:key).extend(Enumerable).map{|term,freq|
freq==1 or fail
term
}
reader.close
return result
end
def values
result=[]
reader=Ferret::Index::IndexReader.new(@name)
reader.max_doc.times{|n|
result << reader[n][:value] unless reader.deleted? n
}
reader.close
result
end
def each_key
reader=Ferret::Index::IndexReader.new(@name)
result=reader.terms(:key).extend(Enumerable).each{|term,freq|
freq==1 or fail
yield term...
2007 Mar 09
5
memory leak in index build?
...Z/.match(File.basename(manfile,
".gz"))
tttt=`man #{section} #{name}`.gsub(/.\b/m, '')
i << {
:data=>tttt.to_s,
:name=>name,
:field1=>name,
:field2=>name,
:field3=>name,
}
}
i.close
i=Ferret::Index::IndexReader.new dir
i.max_doc.times{|n|
i.term_vector(n,:data).terms \
.inject(0){|sum,tvt| sum + tvt.positions.size } > 1_000_000 and
puts "heinous term count for #{i[n][:name]}"
}
seenterms=Set[]
begin
i.terms(:data).each{|term,df|
seenterms.include? term and next
i.term_docs_for(:data,term)
seenterms <...