Sam Giffney
2006-Aug-01 00:26 UTC
[Ferret-talk] Per field boost values - possible? working?
I'm making a simple business directory search and I want to boost the relevance of the 'name' field over the 'address' field - both stored in the same document in the same index.

Here is some console code to demonstrate what I am actually doing:

>> include Ferret::Document
=> Object
>> doc = Document.new
=> Document {
}
>> doc << Field.new(:name, "Business Search", Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 2.0)
=> nil
>> doc << Field.new("physical_address", "New Zealand", Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
=> nil
>> doc
=> Document {
  stored/uncompressed,indexed,tokenized,<name:Business Search>
  stored/uncompressed,indexed,tokenized,<physical_address:New Zealand>
}

I realise the docs say: "Note: this value is not stored directly with the document in the index." so I guess that's why the boost field isn't shown here.

However, browsing the index in Luke shows that the boost value on each field is still set to the default 1.0. Also, empirical testing suggests the boost value I'm entering isn't taken into account at all.

Am I doing something wrong or is the boost functionality not working? I'm running ferret 0.9.4 with ruby 1.8.2 on debian sarge.

--
Posted via http://www.ruby-forum.com/.
David Balmain
2006-Aug-01 01:35 UTC
[Ferret-talk] Per field boost values - possible? working?
On 8/1/06, Sam Giffney <samuelgiffney at gmail.com> wrote:
> I'm making a simple business directory search and I want to boost the
> relevance of the 'name' field over the 'address' field - both stored in
> the same document in the same index.
>
> Here is some console code to demonstrate what I am actually doing
>
> >> include Ferret::Document
> => Object
> >> doc = Document.new
> => Document {
> }
> >> doc << Field.new(:name, "Business Search", Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 2.0)
> => nil
> >> doc << Field.new("physical_address", "New Zealand", Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
> => nil
> >> doc
> => Document {
>   stored/uncompressed,indexed,tokenized,<name:Business Search>
>   stored/uncompressed,indexed,tokenized,<physical_address:New Zealand>
> }
>
> I realise the docs say: "Note: this value is not stored directly with
> the document in the index." so I guess that's why the boost field isn't
> shown here.

The boost isn't shown here simply because I forgot to add it to the output. It is stored with the document when you create it. However, it isn't stored with the document in the index. It is stored in a "norms" file. There is a norms file for every indexed field in the index (unless you chose Field::Index::OMIT_NORMS) and the norms file contains a single byte for every document in the index.

> However, browsing the index in Luke shows that the boost value on each
> field is still set to the default 1.0. Also empirical testing suggests
> the boost value I'm entering isn't taken into account at all.

I'm not sure why it doesn't show up in Luke. The boost is definitely working. I'm not sure what kinds of empirical tests you did. Try this:

require 'rubygems'
require 'ferret'
include Ferret::Index
include Ferret::Document

index = Index.new
doc = Document.new
doc << Field.new(:name, "Business Search", Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO)

# Doc 0: indexed with the default boost
index << doc

# Doc 1: the same content, indexed with the name field boosted to 2.0
doc.field(:name).boost = 2.0
index << doc

puts "Explanation for Doc 0"
puts index.explain("business", 0)
puts ""
puts "Explanation for Doc 1"
puts index.explain("business", 1)

The explain method explains the score for a query and a particular document. You'll notice the score is doubled for the second document.

Hope that helps,
Dave

PS: anyone interested in porting Luke to ruby? Luke won't work on future versions of the Ferret index. I'd be happy to help out but I don't have time to do it by myself.
Sam Giffney
2006-Aug-01 04:48 UTC
[Ferret-talk] Per field boost values - possible? working?
Thanks Dave,

Using the explain method proved it was definitely working. The boost value I was using, 2.0, just wasn't enough to change the placing in the test I was using.

What are the (highlights of the) changes to the index that make it incompatible with Luke? Just wondering what would be involved...

--
Posted via http://www.ruby-forum.com/.
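
To illustrate why a boost of 2.0 may not be enough to change the placing, here is a toy sketch. It is not Ferret's scoring formula: it simply assumes each hit's score is the sum of its per-field scores, with the name score multiplied by the boost. The Hit struct, the total and ranking helpers, and all the numbers are invented for illustration.

```ruby
# Toy model only: a hit's score is name_score * name_boost + address_score.
# The per-field scores below are invented numbers, not Ferret output.
Hit = Struct.new(:label, :name_score, :address_score)

def total(hit, name_boost)
  hit.name_score * name_boost + hit.address_score
end

# Returns the hit labels ordered from highest to lowest total score.
def ranking(hits, name_boost)
  hits.sort_by { |h| -total(h, name_boost) }.map(&:label)
end

HITS = [
  Hit.new("match in name", 0.2, 0.0),
  Hit.new("match in address", 0.0, 0.5)
]

puts ranking(HITS, 2.0).inspect  # name hit scores 0.4, still below 0.5
puts ranking(HITS, 3.0).inspect  # name hit scores about 0.6, now on top
```

In this toy model the boost only changes the order once it exceeds the ratio between the two unboosted scores (2.5 here), which is consistent with a 2.0 boost leaving the placing unchanged in some tests.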
David Balmain
2006-Aug-01 05:09 UTC
[Ferret-talk] Per field boost values - possible? working?
On 8/1/06, Sam Giffney <samuelgiffney at gmail.com> wrote:
> Thanks Dave,
> Using the explain method proved it was definitely working. The boost
> value I was using, 2.0, just wasn't enough to change the placing in the
> test I was using.

Great. One thing I neglected to mention was that the field_norm value that you see in the Index#explain output is actually the field boost (I may change the name as it's not really clear). You'll notice that 1.0 and 2.0 get converted to 0.625 and 1.25 respectively. This is because the boost gets compressed into a single byte, so it loses a lot of its precision. This is just something to keep in mind when setting boost values.

> What are the (highlights of the) changes to the index that make it
> incompatible with Luke? Just wondering what would be involved...

The only thing staying the same is the field norms files. Everything else is changing so it wouldn't be worth doing it in Java using any of the existing Luke code. It'd have to be completely rewritten in Ruby.

I haven't done any GUI stuff in ruby before so I'm not sure which library would be best. If anyone has any recommendations I could probably start something and then others could play around with it.

Cheers,
Dave
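
The 0.625 and 1.25 figures quoted above can be reproduced with a small sketch. It rests on two assumptions that this thread does not confirm: that Ferret 0.9.x stores norms in the same one-byte float format as Lucene of that era (the low three bits of the byte hold the lowest exponent bit plus two fraction bits, the high five bits hold the rest of the exponent), and that the stored value is the boost multiplied by a length factor of 1/sqrt(number of terms), which is 1/sqrt(2) for the two-term field "Business Search". The names encode_norm and decode_norm are made up for illustration and are not part of the Ferret API.

```ruby
# Sketch of a Lucene-style one-byte norm encoding (assumed, not taken
# from the Ferret source). Values are truncated, not rounded.
NORM_FLOOR = (63 - 15) << 3  # smallest representable exponent, shifted

# Compresses a non-negative float into an integer in 0..255.
def encode_norm(f)
  return 0 if f <= 0.0
  bits = [f].pack('g').unpack('N').first  # IEEE 754 single-precision bits
  small = bits >> 21                      # keep exponent and top 2 fraction bits
  return 1 if small <= NORM_FLOOR         # underflow: smallest positive value
  return 255 if small >= NORM_FLOOR + 0x100  # overflow: largest value
  small - NORM_FLOOR
end

# Expands a byte produced by encode_norm back into a float.
def decode_norm(b)
  return 0.0 if b == 0
  bits = ((b & 0xff) << 21) + ((63 - 15) << 24)
  [bits].pack('N').unpack('g').first
end

length_factor = 1.0 / Math.sqrt(2)  # two terms in "Business Search" (assumed)

puts decode_norm(encode_norm(1.0 * length_factor))  # prints 0.625
puts decode_norm(encode_norm(2.0 * length_factor))  # prints 1.25
```

Under these assumptions the representable values between 1 and 2 are 1.0, 1.25, 1.5 and 1.75, so two boosts that differ by less than about 25% can end up stored as the same norm.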
Jens Kraemer
2006-Aug-01 08:22 UTC
[Ferret-talk] Per field boost values - possible? working?
On Tue, Aug 01, 2006 at 02:09:36PM +0900, David Balmain wrote:
[..]
> The only thing staying the same is the field norms files. Everything
> else is changing so it wouldn't be worth doing it in Java using any of
> the existing Luke code. It'd have to be completely rewritten in Ruby.
>
> I haven't done any GUI stuff in ruby before so I'm not sure which
> library would be best. If anyone has any recommendations I could
> probably start something and then others could play around with it.

I've started porting Luke to Ruby/Gtk a while ago. It's far from complete but I could make available what I have so far.

But don't expect anything too fancy, I haven't done any Gui stuff with Ruby or Gtk before that, too ;-)

Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66
David Balmain
2006-Aug-01 08:44 UTC
[Ferret-talk] Per field boost values - possible? working?
On 8/1/06, Jens Kraemer <kraemer at webit.de> wrote:
> On Tue, Aug 01, 2006 at 02:09:36PM +0900, David Balmain wrote:
> [..]
> > The only thing staying the same is the field norms files. Everything
> > else is changing so it wouldn't be worth doing it in Java using any of
> > the existing Luke code. It'd have to be completely rewritten in Ruby.
> >
> > I haven't done any GUI stuff in ruby before so I'm not sure which
> > library would be best. If anyone has any recommendations I could
> > probably start something and then others could play around with it.
>
> I've started porting Luke to Ruby/Gtk a while ago. It's far from
> complete but I could make available what I have so far.
>
> But don't expect anything too fancy, I haven't done any Gui stuff
> with Ruby or Gtk before that, too ;-)

Cool, I'd love to see it.