I had sort-by-date working almost perfectly in my app. It behaved as expected for most records but had hiccups with a few. I investigated and discovered that the correct records were storing this in my Ferret index: "1999-10-18 00:00:00", while the incorrect records were storing this: "Mon Oct 18 00:00:00 EDT 1999" (oops...).

So of course I had to fix the incorrect data, and I figured that while I was at it, I would normalize and minimize everything to this format: "19991018000000".

Now it seems that sorting on this column does not work at all. I have not changed how the field is configured in the index; it has always been:

    :search_date => {:term_vectors => :no, :index => :untokenized, :store => :yes}

Any ideas?

Thanks.
John
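For readers following along, the normalization step described above could be sketched roughly like this. The helper name `normalize_index_date` and the constant `INDEX_DATE_FORMAT` are hypothetical, and the post does not say how the data was actually converted; this only assumes Ruby's standard `date` library.

```ruby
require 'date'

# Hypothetical target format; the thread ends up settling on this one.
INDEX_DATE_FORMAT = '%Y-%m-%d %H:%M:%S'

# Hypothetical helper (not part of Ferret): turn either of the two formats
# seen in the index into one fixed-width string.
def normalize_index_date(raw)
  # DateTime.parse keeps the wall-clock time as written in the string,
  # so the "EDT" zone does not shift the hours.
  DateTime.parse(raw).strftime(INDEX_DATE_FORMAT)
end

puts normalize_index_date('Mon Oct 18 00:00:00 EDT 1999')
puts normalize_index_date('1999-10-18 00:00:00')
```

Running every value through one such function before indexing keeps the field fixed-width, which matters for the sorting discussion later in the thread.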
On Mar 31, 2007, at 8:07 PM, John Bachir wrote:
> I investigated and discovered that the correct data was
> storing this in my ferret index: "1999-10-18 00:00:00" and the
> incorrect data was storing this: "Mon Oct 18 00:00:00 EDT
> 1999" (oops...)
>
> So I of course had to fix the incorrect data, and I figured while I
> was at it, I would normalize and minimize everything to this format:
> "19991018000000".
>
> Now it seems that sorting on this column does not work at all.

I just normalized everything to the "1999-10-18 00:00:00" format, and it is working again.

My guess is that Ferret treats the data differently if it contains only numeric characters? I've been using Ferret for quite some time and have never come across a type issue like this.

Also, on that same model, I have another Ferret field, configured the very same way, that is always a number; sorting works perfectly. However, those numbers are much smaller (the number of messages in the discussion thread).

So maybe Ferret has a problem with big numbers?

Anyway, I'm glad it's working again, but I would be very interested to know what the problem was.

Cheers,
John
(shamelessly bumping this thread up...)

On Mar 31, 2007, at 8:30 PM, John Bachir wrote:
> My guess is that ferret is treating the data differently if it is
> only numeric characters? I've been using ferret for quite some time
> and have never come across a type issue like this.
>
> Also, on that same model, I have another ferret field, configured the
> very same way, that is always a number; sorting works perfectly.
> However, those numbers are much smaller (number of messages in the
> discussion thread).
>
> So maybe ferret has a problem with big numbers?
> Anyway, I'm glad it's working again, but would be very interested to
> know what the problem was.

Does anyone have any insight into what may be causing this behavior?

Thanks,
John
On 4/4/07, John Bachir <john at digitalpulp.com> wrote:
>
> (shamelessly bumping this thread up...)

Shame on you. :D

Bumping this up won't help anything, because there's probably only one person, David, who can answer your question, and he seems to be in and out, with long stretches of no net connectivity. It's not that people aren't willing to help, it's just that most of us can't.

-ryan
On 4/5/07, Ryan King <ryansking at gmail.com> wrote:
> On 4/4/07, John Bachir <john at digitalpulp.com> wrote:
> >
> > (shamelessly bumping this thread up...)
>
> Shame on you. :D
>
> Bumping this up won't help anything, because there's probably only one
> person, David, who can answer your question, and he seems to be in and
> out, with large periods of time with no net connectivity. It's not
> that people aren't willing to help, it's just that most of us can't.

And I'm back again. :-)

You can actually specify how you want fields to be sorted, i.e. whether you want to sort by string, bytes, integer or float:

    sort_field = SortField.new(:field_name,
                               {:type => :float, :reverse => true})
    hits = index.search(query,
                        :sort => Sort.new([sort_field, SortField::SCORE]))

So, John, in your case you will want to set the type to :string or, even better, :byte. Sorting by :byte basically does a strcmp, ignoring locale and encoding, making it faster than sorting by :string. Actually, sorting by integer would be even better and can be done if you store the dates with day precision (e.g. 19770905). Unfortunately 19991018000000 won't fit into a single 4-byte integer, so integer sorting won't work at this precision.

Now, if you don't specify the sort type, Ferret will try to determine the sort type for you. It will first try to parse the field as an integer and then as a float before defaulting to a string type. My guess is that John's sort isn't working because Ferret is detecting an integer field and trying to sort by integer, but the values don't fit in a 4-byte integer, hence the problem.

Hope that explains it. If not, let me know and I'll try to make it a little clearer. I'm in a bit of a rush to get through all the emails on the list to see if there are any issues I need to deal with before putting out another release.

Cheers,
Dave

--
Dave Balmain
http://www.davebalmain.com/
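The size argument above can be checked with a few lines of plain Ruby. This is only an illustrative sketch, not Ferret code: it assumes the "4 byte integer" is a signed 32-bit value, and the helper `fits_in_int32?` is a made-up name.

```ruby
# Assumption: a 4-byte integer here means a signed 32-bit int.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1 # 2_147_483_647

# Hypothetical helper: does an all-digit term fit in 32 bits?
def fits_in_int32?(digits)
  Integer(digits, 10).between?(INT32_MIN, INT32_MAX)
end

puts fits_in_int32?('19991018000000') # second precision: 14 digits
puts fits_in_int32?('19770905')       # day precision: 8 digits

# Fixed-width timestamps compare correctly as plain strings, which is
# why a string or byte comparison is enough to get chronological order.
stamps = ['20070331200700', '19991018000000', '19770905000000']
sorted = stamps.sort
puts sorted.inspect
```

The same string comparison works for the "1999-10-18 00:00:00" format, since it is also fixed-width with the most significant part first.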
On Apr 5, 2007, at 10:45 PM, David Balmain wrote:
> Now, if you don't specify the sort type, Ferret will try and determine
> the sort type for you. It will first try to parse the field as an
> integer and then as a float before defaulting to a string type. My
> guess is that the reason John's sort isn't working is that Ferret is
> detecting an integer field so it is trying to sort by integer but the
> integers don't fit in a 4 byte integer, hence the problem.
>
> Hope that explains it. If not, let me know and I'll try and make it a
> little clearer.

That explains it very well, thanks. I wasn't aware of the default sort behavior. It might be worth mentioning that (more?) prominently in the documentation.

Thanks again,
John
On 4/9/07, John Joseph Bachir <john at digitalpulp.com> wrote:
> That explains it very well, thanks. I wasn't aware of the default
> sort behavior. It might be worth it to mention that (more?)
> prominently in the documentation.

Point taken. I've added this to SortField:

 * Note 1: Care should be taken when using the :auto sort-type since
 * numbers will occur before other strings in the index so if you are sorting
 * a field with both numbers and strings (like a title field which might have
 * "24" and "Prison Break") then the sort_field will think it is sorting
 * integers when it really should be sorting strings.
 *
 * Note 2: When sorting by integer, integers are only 4 bytes so anything
 * larger will cause strange sorting behaviour.

Plus this, where the :sort parameter is mentioned:

 * :sort:: A Sort object or sort string describing how the field
 *         should be sorted. A sort string is made up of field names
 *         which cannot contain spaces and the word "DESC" if you
 *         want the field reversed, all separated by commas. For
 *         example; "rating DESC, author, title". Note that Ferret
 *         will try to determine a field's type by looking at the
 *         first term in the index and seeing if it can be parsed as
 *         an integer or a float. Keep this in mind as you may need
 *         to specify a field's type to sort it correctly. For more
 *         on this, see the documentation for SortField

Let me know if you have any suggestions for where else you might expect to see something about this.

Cheers,
Dave

--
Dave Balmain
http://www.davebalmain.com/
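The :auto behaviour described in these notes can be imitated in plain Ruby to see why both the "24" / "Prison Break" case and the all-digit date case end up treated as integers. This is a hypothetical stand-in, not Ferret's implementation: `guess_sort_type` is an invented name, and the only rule it encodes is the one stated above (look at the first term in the index, try integer, then float, else string).

```ruby
# Hypothetical imitation of the :auto detection described above.
def guess_sort_type(terms)
  # Terms in an index are ordered; String comparison is bytewise, so
  # digit-leading terms such as "24" come before "Prison Break".
  first = terms.min
  return :string if first.nil?

  begin
    Integer(first, 10)
    return :integer
  rescue ArgumentError
    # not an integer, fall through to the float check
  end

  begin
    Float(first)
    :float
  rescue ArgumentError
    :string
  end
end

puts guess_sort_type(['Prison Break', '24'])  # mixed title field
puts guess_sort_type(['19991018000000'])      # all-digit timestamp
puts guess_sort_type(['1999-10-18 00:00:00']) # dashed timestamp
```

Under these assumptions the dashed format falls through to a string sort, which matches what was observed earlier in the thread; specifying the :type on the SortField explicitly avoids relying on the guess at all.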