I am getting strange results when I reverse sort a query. I am sorting by date, but the problem doesn't seem to be specific to dates (I have also tried plain integers). I also paginate the results. Items in the result set are sometimes duplicated and not ordered at all. When I try a non-reverse sort I don't see duplicates and the ordering is correct. Any ideas what is going on? Thanks

-- Posted via http://www.ruby-forum.com/.
On 7/13/06, Floyd Morgan <Floyd_Morgan at intuit.com> wrote:
> Items in the result set are sometimes duplicated and not ordered at all.
> When I try a non-reverse sort I don't see duplicates and the ordering is
> correct. Any ideas what is going on? Thanks

No idea. Could you show us some example code, preferably with a short test case?

Cheers,
Dave
Here is the index snippet:

    doc = Ferret::Document::Document.new
    # insert the id
    doc << Ferret::Document::Field.new("id", post.id,
      Ferret::Document::Field::Store::YES,
      Ferret::Document::Field::Index::UNTOKENIZED)
    # insert the date
    doc << Ferret::Document::Field.new("created_at", post.created_at,
      Ferret::Document::Field::Store::NO,
      Ferret::Document::Field::Index::UNTOKENIZED)
    # add some other stuff ...
    # write to the index
    index << doc

Here is the query snippet:

    sort_fields = []
    sort_fields << Ferret::Search::SortField.new("created_at",
      :sort_type => Ferret::Search::SortField::SortType::INTEGER,
      :reverse => true)
    # search the index
    top_docs = index.search(query,
      { :first_doc => first_doc, :num_docs => 5, :sort => sort_fields })

On Jul 12, 2006, at 4:57 PM, David Balmain wrote:
> No idea. Could you show us some example code. Preferably with a
> short test case.
>
> Cheers,
> Dave
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
On 7/13/06, Floyd Morgan <Floyd_Morgan at intuit.com> wrote:
> # insert the date
> doc << Ferret::Document::Field.new( "created_at", post.created_at,
>   Ferret::Document::Field::Store::NO,
>   Ferret::Document::Field::Index::UNTOKENIZED )

I'm not exactly sure what post.created_at is, but if it's a Time object then you need to convert it to a string that will sort correctly as a string, i.e. use strftime("%Y%m%d") (use whatever precision you need). Here is an example which adds 100 documents with 100 random dates in the last 100 days:

    require 'rubygems'
    require 'ferret'
    include Ferret::Index
    include Ferret::Search

    index = Index.new
    t = Time.now

    # every document matches the query "x"; only the date differs
    100.times do
      index << {:id => "x",
                :date => (t - 24*60*60*rand(100)).strftime("%Y%m%d")}
    end

    sort_fields = [SortField.new(:date,
                                 :sort_type => SortField::SortType::INTEGER,
                                 :reverse => true)]

    # print the results ten at a time, newest first
    10.times do |start|
      index.search_each("x",
                        :first_doc => start*10,
                        :num_docs => 10,
                        :sort => sort_fields) do |doc_id, score|
        puts index[doc_id][:date]
      end
    end
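To illustrate why the format matters, here is a minimal sketch using only Ruby's standard library (no Ferret involved). The sample dates are made up for illustration. It compares zero-padded "%Y%m%d" keys with unpadded "%Y%-m%-d" keys and checks whether sorting the keys reproduces chronological order.

```ruby
# Hypothetical sample dates, not taken from the thread.
dates = [
  Time.utc(2006, 10, 2),
  Time.utc(2006, 2, 10),
  Time.utc(2005, 12, 31)
]

# The order we want: oldest to newest.
chronological = dates.sort

# Zero-padded keys, most significant part first: "20061002", "20060210", ...
padded_order = dates.sort_by { |t| t.strftime("%Y%m%d") }

# The same keys compared as integers instead of strings.
padded_int_order = dates.sort_by { |t| t.strftime("%Y%m%d").to_i }

# Unpadded keys: "2006102" (Oct 2) and "2006210" (Feb 10) compare the wrong way round.
unpadded_order = dates.sort_by { |t| t.strftime("%Y%-m%-d") }

padded_ok     = padded_order == chronological
padded_int_ok = padded_int_order == chronological
unpadded_ok   = unpadded_order == chronological

puts "padded keys, string compare:  #{padded_ok}"
puts "padded keys, integer compare: #{padded_int_ok}"
puts "unpadded keys:                #{unpadded_ok}"
```

With fixed-width keys the string order and the integer order agree with the date order, which is why the precision can be extended (for example with "%H%M%S") as long as every part stays zero-padded.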
The field is a DateTime.

So I tried what you suggested and had no luck. I noticed that when I remove the :first_doc and :num_docs options it appears to work correctly (I get all of the docs in the right order).

On Jul 12, 2006, at 7:13 PM, David Balmain wrote:
> I'm not exactly sure what post.created_at but if it's a Time object
> then you need to convert it to a string that will sort correctly as a
> string. ie use strftime("%Y%m%d") (use whatever precision you need.
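For a DateTime field the same conversion applies. The sketch below uses only Ruby's standard library; `created_at` is a hypothetical stand-in for post.created_at. It shows that the default string form of a DateTime contains non-digit characters, while a strftime key is digits only. Whether Ferret converts a non-string field value through its default string form is an assumption here, not something the thread confirms.

```ruby
require 'date'

# Hypothetical stand-in for post.created_at.
created_at = DateTime.new(2006, 7, 12, 16, 57, 0)

# Default string form: ISO 8601 text with '-', 'T', ':' and a UTC offset.
default_string = created_at.to_s

# Fixed-width, digits-only keys at two precisions.
day_key    = created_at.strftime("%Y%m%d")
second_key = created_at.strftime("%Y%m%d%H%M%S")

# True when the string consists of digits only.
digits_only = lambda { |s| !(s =~ /\A\d+\z/).nil? }

puts "#{default_string} digits only? #{digits_only.call(default_string)}"
puts "#{day_key} digits only? #{digits_only.call(day_key)}"
puts "#{second_key} digits only? #{digits_only.call(second_key)}"
```

Converting explicitly before the value goes into the index removes any dependence on how the library stringifies the object.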
On 7/13/06, Floyd Morgan <Floyd_Morgan at intuit.com> wrote:
> The field is a DateTime.
>
> So I tried what you suggested and no luck. I noticed that when I
> remove the first_doc and num_doc options it appears to work correctly
> (getting all of the docs in the right order).

I'm sorry, I can't really help unless I can see an example that isn't working. Try modifying the code that I posted previously to match more closely what you are doing. Then send back the broken example snippet and I'll be able to tell you what is wrong with it, or fix the bug if there is one. As long as the strings going into the index are the same (i.e., in the format "%Y%m%d") then I can't see how your results could be any different.

Cheers,
Dave
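One way to turn the symptoms into a short test case is to collect the paginated results and check them mechanically. The sketch below is plain Ruby and does not call Ferret; `check_pages` is a hypothetical helper, and the sample pages are invented data standing in for whatever the paged searches return as [id, date key] pairs.

```ruby
# Hypothetical helper: `pages` is an array of pages, each page an array of
# [id, key] pairs in the order the search returned them.
def check_pages(pages)
  rows = pages.flatten(1)
  ids  = rows.map { |id, _key| id }
  keys = rows.map { |_id, key| key }

  # ids that appear on more than one row across all pages
  duplicates = ids.select { |id| ids.count(id) > 1 }.uniq

  # a reverse sort should never place a smaller key before a larger one
  descending = keys.each_cons(2).all? { |a, b| a >= b }

  { :duplicates => duplicates, :descending => descending }
end

# Invented sample data: two pages of two results each.
good_pages = [[[1, "20060712"], [2, "20060711"]],
              [[3, "20060710"], [4, "20060709"]]]
bad_pages  = [[[1, "20060712"], [2, "20060711"]],
              [[2, "20060711"], [5, "20060713"]]]

good_report = check_pages(good_pages)
bad_report  = check_pages(bad_pages)

puts good_report.inspect
puts bad_report.inspect
```

Feeding the helper the pairs gathered from each paged search, with and without :reverse, gives a compact report to post back to the list alongside the failing snippet.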