I am trying to build up a filtered search using the logic below.

  bq = Ferret::Search::BooleanQuery.new
  bq.add_query(Ferret::Search::TermQuery.new(Ferret::Index::Term.new("section", section.downcase!)),
               Ferret::Search::BooleanClause::Occur::MUST)

  filter = Ferret::Search::QueryFilter.new(bq)
  @vobjects = VoObject.find_by_contents(search_input, :filter => filter,
                                        :sort => ["section", "sale_category"])

This works fine when the "section" is a single word like "book", but when there is white space in the query, like "paperback book", it does not find the appropriate result and comes back with zero hits.

I changed this to use FuzzyQuery and it works, but I sometimes get segmentation errors (this was reported in another topic).

Does anyone have a solution to this problem for me?

Thanks very much.

--
Posted via http://www.ruby-forum.com/.
It's hard to know for sure without seeing how your index is built, but if you are using TOKENIZED on that field, then when the index is built the text is split on whitespace and each element is added as a separate term. When you search, it looks like you are trying to find the entire text as a single term, which would never match, since the index would hold only the individual words.

To solve this, I believe you can either construct your query using QueryParser, which will use the analyzer / tokenizer and split the terms out for you, or you can split the 'section' string on whitespace yourself, build a Term for each resulting element, and build a PhraseQuery from that set.

I hope this is some help,

Jeremy

On 7/14/06, BlueJay <clare.cavanagh at btclick.com> wrote:
> This works fine when the "section" is a single word like "book" but when
> there is white spaces in the query like "paperback book" it does not
> find the appropriate result and comes back with zero hits.
>
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
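As a rough illustration of the tokenization point above, here is a minimal plain-Ruby sketch. It does not use Ferret at all: `tokenize`, `build_index`, `term_search`, and `phrase_search` are hypothetical names invented for this example, and a real analyzer does more than lowercase and split on whitespace. It only shows why a single-term lookup for "paperback book" comes back empty once the field has been split into words, while a phrase-style lookup over the individual tokens still finds the document.

```ruby
# Toy inverted index in plain Ruby (not Ferret), for illustration only.

# Hypothetical stand-in for an analyzer: lowercase and split on whitespace.
def tokenize(text)
  text.downcase.split
end

# Map each term to { doc_id => [positions where the term occurs] }.
def build_index(docs)
  index = Hash.new { |h, term| h[term] = Hash.new { |d, id| d[id] = [] } }
  docs.each do |id, text|
    tokenize(text).each_with_index { |term, pos| index[term][id] << pos }
  end
  index
end

# Exact single-term lookup, analogous to a term query.
def term_search(index, term)
  index.key?(term) ? index[term].keys : []
end

# Phrase lookup: all tokens must occur at consecutive positions in a document.
def phrase_search(index, phrase)
  terms = tokenize(phrase)
  return [] if terms.empty? || terms.any? { |t| !index.key?(t) }

  index[terms.first].select do |id, positions|
    positions.any? do |start|
      terms.each_with_index.all? do |t, offset|
        index[t].key?(id) && index[t][id].include?(start + offset)
      end
    end
  end.keys
end

DOCS  = { 1 => "Paperback Book", 2 => "Book", 3 => "Book Paperback" }
INDEX = build_index(DOCS)

p term_search(INDEX, "book")              # prints [1, 2, 3]
p term_search(INDEX, "paperback book")    # prints [] (no such single term)
p phrase_search(INDEX, "paperback book")  # prints [1]
```

The whole string "paperback book" is never stored as one term, so the exact lookup has nothing to match; the phrase lookup succeeds because it goes through the same tokenization as the index did.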
Jeremy Bensley wrote:
> It's hard to know for sure without seeing how your index is built, but
> if you are using TOKENIZED on that field, then whenever the index is
> built the text is split on whitespace, and each element is added as a
> separate term.

Jeremy

Thanks for the reply. I am building the index like this...

  class VoObject < ActiveRecord::Base
    acts_as_ferret :fields => ['short_description', 'section', 'sale_category', 'sale_type', 'outcode']

> It looks like when you are searching, you are trying to find the entire
> text as a single term.
>
> In order to solve this, I believe you can either construct your query
> using QueryParser, which will use the analyzer / tokenizer and split the
> terms out for you, or you can simply split the 'section' string on
> whitespace and build a Term and TermQuery for each resulting element and
> build a PhraseQuery from that set.

Sorry for asking a silly question but how would I go about doing this?

> I hope this is some help,
>
> Jeremy

--
Posted via http://www.ruby-forum.com/.
Method #1 should be shorter / easier, and would look something like this:

  # "section" is the default field used when building the query
  qp = Ferret::QueryParser.new("section")
  query = qp.parse("\"#{section}\"")

  # modified boolean query
  bq = Ferret::Search::BooleanQuery.new
  bq.add_query(query, Ferret::Search::BooleanClause::Occur::MUST)

  filter = Ferret::Search::QueryFilter.new(bq)
  @vobjects = VoObject.find_by_contents(search_input, :filter => filter,
                                        :sort => ["section", "sale_category"])

Unless you have more than one query in the boolean query, you should probably just skip it entirely and build your filter directly from the query returned by the parser.

On 7/14/06, BlueJay <clare.cavanagh at btclick.com> wrote:
> Sorry for asking a silly question but how would I go about doing this?
>
> --
> Posted via http://www.ruby-forum.com/.
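One detail that may be worth guarding against when building the phrase string for Method #1: the original snippet calls `section.downcase!`, and in Ruby `String#downcase!` returns nil when the string is already lowercase, so the query could end up built from nil. The sketch below is a hypothetical helper (`section_phrase` is not part of Ferret) that uses the non-destructive `downcase` and drops embedded double quotes so the phrase delimiters stay balanced. Whether lowercasing is needed at all depends on the analyzer in use, so treat this as an assumption rather than a requirement.

```ruby
# Hypothetical helper (not part of Ferret): turn a section name into a
# double-quoted phrase string that could be handed to a query parser.
def section_phrase(section)
  # String#downcase! returns nil if nothing changed, so use the
  # non-destructive String#downcase to always get a String back.
  cleaned = section.to_s.downcase

  # Remove embedded double quotes so the surrounding quotes stay balanced,
  # then trim the ends and collapse runs of spaces.
  cleaned = cleaned.delete('"').strip.squeeze(" ")

  %("#{cleaned}")
end

puts section_phrase("Paperback Book")  # prints "paperback book" (with the quotes)
```

With this in place, the parse call in Method #1 would become something like `qp.parse(section_phrase(section))`.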