Our ferret 0.10.13 index has been slowly growing on our debian server
and has just got up over 14,000 records. Yesterday I randomly noticed
that one search I did was suddenly giving whack, unexpected results. I
have spent much time trying to track the problem.

Tried ferret 0.10.9 - no change.
Tried on a windows machine - where it works fine, and doesn't give weird
results (which just adds to the strangeness - anyway I need it to work
on the debian server)

Narrowed it down to one single entry that when you add or delete from
the index completely changes results in unrelated searches.
A little console output shows this best.

index = Ferret::Index::Index.new(FerretConfig::INDEXOPTIONS)

puts index.search("westpac").total_hits
286
puts index.search("westpac branch").total_hits
277

doc = Entry.find(1094481).make_entry_ferret_doc
=> {:latitude1d=>"36.9", :address=>"61 Remuera Rd, Newmarket",
:longitude1d=>"174.8", :name=>"Spiro's Florists", :precision=>"1
number", :tags=>"Flowers, bouquets, gift baskets, permanent floral
arrangements, inter-flora", :zid=>1094481}
index << doc
index.flush
index.optimize

puts index.search("westpac").total_hits
286
puts index.search("westpac branch").total_hits
3

index.delete("1094481")
index.flush
index.optimize

puts index.search("westpac").total_hits
286
puts index.search("westpac branch").total_hits
277

I'm completely lost on this. It makes no sense to me at all.
Rebuilding the index doesn't help. It happens the same on 2 similar but
independent debian boxes.

Anyone got any clues as to where to start?
While it's fine to just remove this entry and presume everything is
working - without knowing why this breaks it's pretty hard to have faith
in the index not breaking again...

Really appreciate any thoughts,
Sam

--
Posted via http://www.ruby-forum.com/.
On Sat, Feb 10, 2007 at 06:03:47AM +0100, Sam wrote:
> Our ferret 0.10.13 index has been slowly growing on our debian server
> and has just got up over 14,000 records. Yesterday I randomly noticed
> that one search I did was suddenly giving whack, unexpected results. I
> have spent much time trying to track the problem.
>
> Tried ferret 0.10.9 - no change.
> Tried on a windows machine - where it works fine, and doesn't give weird
> results (which just adds to the strangeness - anyway I need it to work
> on the debian server)

Could you try Ferret 0.10.14?

> narrowed it down to one single entry that when you add or delete from
> the index completely changes results in unrelated searches.
> a little console output shows this best.
>
> index = Ferret::Index::Index.new(FerretConfig::INDEXOPTIONS)
>
> puts index.search("westpac").total_hits
> 286
> puts index.search("westpac branch").total_hits
> 277
>
> doc = Entry.find(1094481).make_entry_ferret_doc
> => {:latitude1d=>"36.9", :address=>"61 Remuera Rd, Newmarket",
> :longitude1d=>"174.8", :name=>"Spiro's Florists", :precision=>"1
> number", :tags=>"Flowers, bouquets, gift baskets, permanent floral
> arrangements, inter-flora", :zid=>1094481}
> index << doc
> index.flush
> index.optimize
>
> puts index.search("westpac").total_hits
> 286
> puts index.search("westpac branch").total_hits
> 3

Really strange. To further track this down I'd try with variations of
this record, i.e. leave one field empty, then the other, to find out
which field's value is causing this problem.

Btw, what number of hits does index.search("branch").total_hits yield
with/without that record?

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66
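The field-by-field test suggested here could be scripted roughly as follows. This is only a sketch under stated assumptions: `field_culprits` is a hypothetical helper, not part of Ferret, and it assumes an index object that responds to `<<`, `delete`, `flush` and `search(query).total_hits` in the way the index in the console session does, with documents deleted by their `:zid` given as a string.

```ruby
# Hypothetical helper (not part of Ferret): re-add the document with one
# field left out at a time and collect the fields whose absence brings the
# hit count back to the expected value.
#
# Assumptions: `index` responds to <<, delete, flush and
# search(query).total_hits as in the console session, and a document is
# deleted by passing its id field value as a string.
def field_culprits(index, doc, query, expected_hits, id_field = :zid)
  candidates = doc.keys.reject { |key| key == id_field }
  candidates.select do |field|
    variant = doc.reject { |key, _| key == field } # copy without this field
    index << variant
    index.flush
    hits = index.search(query).total_hits
    index.delete(variant[id_field].to_s)           # put the index back
    index.flush
    hits == expected_hits
  end
end
```

With the values from the console session the call would be `field_culprits(index, doc, "westpac branch", 277)`. The sketch leaves out `index.optimize`; if the behaviour only shows up after optimizing, that call would need to be added after each `flush`.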
Everything happens the same with 0.10.14.

index.search("branch").total_hits is constant at 811 through all tests.

I guessed that it was something to do with the tags field, removing it
before adding the doc made everything ok - so I played with changing the
values in the tags field. I narrowed it down to this.

If tags is or contains any of the following words

baskets
basket
ba
ball
baloney
basketcase
babaracchus

then the search numbers for westpac branch drop from 277 to 3.

If tags is any of

b
ba
baracchus

then the search numbers for westpac branch stay at 277.

Looks like even the A-team can't help me...
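The tag-value experiment described above could be repeated over a list of candidate words with something like the following. It is a rough sketch, not the method actually used: `hits_by_tag_value` is a hypothetical helper, and it assumes the same index interface as the console session (`<<`, `delete`, `flush`, `search(query).total_hits`) plus deletion by the `:zid` value as a string.

```ruby
# Hypothetical helper (not part of Ferret): index the same document once per
# candidate tags value and record how many hits `query` returns each time.
#
# Assumptions: `index` responds to <<, delete, flush and
# search(query).total_hits, and documents are deleted by id string.
def hits_by_tag_value(index, base_doc, candidates, query, id_field = :zid)
  candidates.each_with_object({}) do |value, results|
    index << base_doc.merge(:tags => value) # same doc, different tags
    index.flush
    results[value] = index.search(query).total_hits
    index.delete(base_doc[id_field].to_s)   # remove it before the next round
    index.flush
  end
end
```

Called as `hits_by_tag_value(index, doc, %w[baskets basket ball b baracchus], "westpac branch")`, it returns a hash from each tags value to the hit count seen with that value indexed, which makes the two groups of words easy to compare side by side.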
Hi Sam,

Do you think it would be possible to send me a copy of the index (if the
data isn't sensitive)? It would be really helpful as I can't seem to
reproduce the problem. I'm on Ubuntu here so I should be able to
replicate the problem with the index.

Cheers,
Dave

On 2/10/07, Sam <samuelgiffney at gmail.com> wrote:
> Our ferret 0.10.13 index has been slowly growing on our debian server
> and has just got up over 14,000 records. Yesterday I randomly noticed
> that one search I did was suddenly giving whack, unexpected results. I
> have spent much time trying to track the problem.
>
> Tried ferret 0.10.9 - no change.
> Tried on a windows machine - where it works fine, and doesn't give weird
> results (which just adds to the strangeness - anyway I need it to work
> on the debian server)
>
> narrowed it down to one single entry that when you add or delete from
> the index completely changes results in unrelated searches.
> a little console output shows this best.
>
> index = Ferret::Index::Index.new(FerretConfig::INDEXOPTIONS)
>
> puts index.search("westpac").total_hits
> 286
> puts index.search("westpac branch").total_hits
> 277
>
> doc = Entry.find(1094481).make_entry_ferret_doc
> => {:latitude1d=>"36.9", :address=>"61 Remuera Rd, Newmarket",
> :longitude1d=>"174.8", :name=>"Spiro's Florists", :precision=>"1
> number", :tags=>"Flowers, bouquets, gift baskets, permanent floral
> arrangements, inter-flora", :zid=>1094481}
> index << doc
> index.flush
> index.optimize
>
> puts index.search("westpac").total_hits
> 286
> puts index.search("westpac branch").total_hits
> 3
>
> index.delete("1094481")
> index.flush
> index.optimize
>
> puts index.search("westpac").total_hits
> 286
> puts index.search("westpac branch").total_hits
> 277
>
> I'm completely lost on this. It makes no sense to me at all.
> Rebuilding the index doesn't help. It happens the same on 2 similar but
> independent debian boxes.
>
> Anyone got any clues as to where to start?
> While it's fine to just remove this entry and presume everything is
> working - without knowing why this breaks it's pretty hard to have faith
> in the index not breaking again...
>
> Really appreciate any thoughts,
> Sam
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

--
Dave Balmain
http://www.davebalmain.com/
Hey Sam,

Dave said he is going to look into this in the near future. We'll
hopefully get some information about your problem soon.

Ben

> I guessed that it was something to do with the tags field, removing it
> before adding the doc made everything ok - so I played with
> changing the
> values in the tags field. I narrowed it down to this.
It's open source data so no problem there. Index sent off list...
On 2/13/07, Sam <samuelgiffney at gmail.com> wrote:
> It's open source data so no problem there. Index sent off list...

Thanks Sam, problem fixed. Ben emailed me privately about this bug
suggesting that it might be serious. He was quite correct. When I put
out the fix for this it will require everyone to rebuild their indexes.

I'm going to add another fix to get rid of the FileNotFound bug that a
lot of people have been getting (yes, I've finally found the cause of
this one) and then I'll put another release out. I was going to make
this change backwards compatible but since there is a bug in the current
index format and everyone will need to rebuild anyway, I guess it
probably isn't necessary. If anyone can't rebuild their indexes for some
reason, please let me know and I'll try and come up with a solution
before I put the next release out.

Once these fixes are out and I'm happy I haven't introduced any new bugs
I'll be releasing Ferret 1.0 so look out for it.

Cheers,
Dave

--
Dave Balmain
http://www.davebalmain.com/
David Balmain wrote:
> Once these fixes are out and I'm happy I haven't introduced any new
> bugs I'll be releasing Ferret 1.0 so look out for it.

Awesome Dave! Lovely to have you back.