Shane Hanna
2007-Dec-05 23:47 UTC
[Ferret-talk] Term frequency doesn't decrement after document is deleted.
Hey all,

The frequency count returned by my Ferret reader doesn't decrement after I remove a document containing those terms. Using the example from http://ferret.davebalmain.com/api/classes/Ferret/Index/TermEnum.html the frequency increments after a document is added but stays the same after a document is deleted.

    index.reader.terms(:tags).each do |term, freq|
      "#{term} appears #{freq} times"
    end

If I iterate through each document matched by terms_for I get the correct frequency, but I assume at a higher performance cost.

    index.reader.terms(:tags).each do |term|
      freq = index.reader.terms_for(:tags, term).each{}
      "#{term} appears #{freq} times"
    end

I'm wondering if I'm just plain doing something wrong. I'm running the gem version 0.11.6 (ruby) on i686-darwin9.1.0 and I can provide a unit test if it'd help.

Cheers,
Shane.
Jens Kraemer
2007-Dec-06 00:05 UTC
[Ferret-talk] Term frequency doesn't decrement after document is deleted.
Hi!

I'm not sure if this is the intended behaviour, so it might be a Ferret bug indeed.

However, you should get the correct term frequency again after optimizing the index.

Cheers,
Jens
Julio Cesar Ody
2007-Dec-06 00:23 UTC
[Ferret-talk] Term frequency doesn't decrement after document is deleted.
Indeed looks like a bug. I've gone through a small hell recently because of a similar issue =)

index.size also suffers from the same problem. Apparently values for num_docs (or you tell me what it is exactly if I'm getting it wrong) get cached in IndexReader, and when you call it, it returns values that are not necessarily consistent with what's actually in the index.

Also, in this same situation, calling index.optimize before index.size solves the problem.
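Editorial sketch: one plausible reading of the behaviour described in this thread is that deleting a document only marks it as deleted, while the per-term document counts stay as written until an optimize rewrites the index. That is how Lucene-style indexes generally work, but whether it is the exact cause in Ferret 0.11.6 is an assumption here, not something the thread confirms. The toy model below is plain Ruby and does not use Ferret at all; the class ToySegmentIndex and all of its methods are hypothetical names made up for illustration.

```ruby
# Toy model only: this is NOT Ferret's implementation.
# Every class and method name here is hypothetical.
class ToySegmentIndex
  def initialize
    @docs = {}               # doc_id => unique terms in that document
    @doc_freq = Hash.new(0)  # term => number of documents written with it
    @deleted = {}            # doc_id => true (deletion marks)
    @next_id = 0
  end

  # Store a document and bump the stored count for each of its terms.
  def add(terms)
    id = @next_id
    @next_id += 1
    @docs[id] = terms.uniq
    @docs[id].each { |t| @doc_freq[t] += 1 }
    id
  end

  # Deleting only marks the document; the stored term counts are untouched.
  def delete(id)
    @deleted[id] = true if @docs.key?(id)
  end

  # Cheap lookup from the stored counts; may still include deleted documents.
  def doc_freq(term)
    @doc_freq[term]
  end

  # Slower lookup: walk every document and skip the ones marked deleted.
  def live_doc_freq(term)
    @docs.count { |id, terms| !@deleted[id] && terms.include?(term) }
  end

  # Drop deleted documents and rebuild the stored counts from what is left.
  def optimize
    @docs.reject! { |id, _| @deleted[id] }
    @deleted.clear
    @doc_freq = Hash.new(0)
    @docs.each_value { |terms| terms.each { |t| @doc_freq[t] += 1 } }
    self
  end
end

index = ToySegmentIndex.new
first = index.add(%w[ruby ferret])
index.add(%w[ruby search])
index.delete(first)

stale = index.doc_freq("ruby")       # stored count, ignores the deletion mark
live  = index.live_doc_freq("ruby")  # counted by walking live documents
index.optimize
fresh = index.doc_freq("ruby")       # stored count after the rebuild

puts "stored: #{stale}, live: #{live}, after optimize: #{fresh}"
```

In this model the cheap lookup and the slow walk disagree until optimize runs, which mirrors the trade-off raised in the thread: iterate the matching documents for an accurate count at a higher cost, or optimize first and then trust the stored frequency.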