Max Williams
2008-Apr-25 10:45 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is breaking my search results...
Hi I have a problem with a search where i want to get some results according to some conditions, sort the results, and then paginate over the sorted collection. My search looks like this: @results = TeachingObject.find_with_ferret(search_term, #(ferret) options {:page => options[:page], :per_page => options[:per_page], :sort => Ferret::Search::SortField.new(:asset_count, :type => :integer, :reverse => true )}, #find options { :conditions => ["id in (?)", @ids] } ) where @ids is an array of ids from which the results must come (ie a collection of ''allowed'' results of which @results will be a subset): often the search term is set to * to get all of this collection in @results. '':asset_count'' is an untokenized ferret field that stores integers. Through debugging and experimenting, i''ve observed the following: - The overall results set, without sorting/pagination, is correct (therefore :conditions is being taken into account). - If :per_page is set to be so large that no pagination is required, then the sorting occurs properly (therefore sorting is being taken into account) - If :per_page is reduced so that pagination is required, then the sorting of the overall set breaks: it seems as if the results are ordered by id, then paginated. - However, on every individual page, the results are sorted properly for that page, ie each page-size subset is internally sorted. - If i sort on a different untokenized field, the problem persists. It seems as if the pagination is happening and THEN the sorting is happening, which obviously doesn''t give the expected results. This is just a theory on my part though. Can anyone tell me how to fix this problem? I''ve been gnashing my teeth over it for over a day now and can''t find any solutions... thanks max -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080425/a9f6e4bb/attachment-0001.html>
Jens Kraemer
2008-Apr-25 11:59 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is breaking my search results...
Hi Max, thanks for your detailed report. Might well be that I broke one or more of the various combinations of pagination / sorting / active record conditions (where you might specify :order, too, btw) in trunk. I''ll look into it asap. Cheers, Jens On Fri, Apr 25, 2008 at 11:45:30AM +0100, Max Williams wrote:> Hi > > I have a problem with a search where i want to get some results according to > some conditions, sort the results, and then paginate over the sorted > collection. > > My search looks like this: > > @results = TeachingObject.find_with_ferret(search_term, > #(ferret) options > {:page => options[:page], > :per_page => options[:per_page], > :sort => Ferret::Search::SortField.new(:asset_count, > :type => :integer, :reverse => true )}, > #find options > { :conditions => ["id in (?)", @ids] } ) > > where @ids is an array of ids from which the results must come (ie a > collection of ''allowed'' results of which @results will be a subset): often > the search term is set to * to get all of this collection in @results. > '':asset_count'' is an untokenized ferret field that stores integers. > > Through debugging and experimenting, i''ve observed the following: > > - The overall results set, without sorting/pagination, is correct > (therefore :conditions is being taken into account). > - If :per_page is set to be so large that no pagination is required, then > the sorting occurs properly (therefore sorting is being taken into account) > - If :per_page is reduced so that pagination is required, then the > sorting of the overall set breaks: it seems as if the results are ordered by > id, then paginated. > - However, on every individual page, the results are sorted properly for > that page, ie each page-size subset is internally sorted. > - If i sort on a different untokenized field, the problem persists. > > It seems as if the pagination is happening and THEN the sorting is > happening, which obviously doesn''t give the expected results. This is just > a theory on my part though. > > Can anyone tell me how to fix this problem? I''ve been gnashing my teeth > over it for over a day now and can''t find any solutions... > > thanks > max> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk-- Jens Kr?mer webit! Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold
Max Williams
2008-Apr-25 12:35 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is breaking my search results...
Fantastic, thanks Jens. BTW, I can''t :order by :asset_count as it''s a method, rather than an instance variable. I wish AR::find would let me order by method returns. While we''re talking, i just tried another way of searching and found another bit of weirdness: instead of find_with_ferret, i tried using ActsAsFerret::find(term, class_array) instead, like so: #sorting/pagination is broken here @results = ActsAsFerret::find(search_term, [TeachingObject], #(ferret) options { :page => options[:page], :per_page => options[:per_page], :sort => Ferret::Search::SortField.new(:asset_count, :type => :integer, :reverse => true ) }, #find options - need to specify conditions for each searched class individually {:conditions => { :teaching_object => ["id in (?)", @ids] } } ) This gave exactly the same results as the previous search. However, when i added another class to the search, it works! #this works! @results = ActsAsFerret::find(search_term, [TeachingObject, LearningObject], #(ferret) options { :page => options[:page], :per_page => options[:per_page], :sort => Ferret::Search::SortField.new(:asset_count, :type => :integer, :reverse => true ) }, #find options - need to specify conditions for each searched class individually {:conditions => { :teaching_object => ["id in (?)", @ids], :learning_object => ["id in (?)", @ids] } } ) So, this works while the previous doesn''t. It so happens in this case that LearningObject is a ''sister'' class of TeachingObject (they both extend a class called Resource), where both are saved in a table called resources using STI, and at the moment i don''t actually have any LearningObject records, so adding LearningObject doesn''t harm my results. Obviously though this isn''t a nice workaround. Sorry to pile bug reports on you, i just mention it in case it''s relevant. :/ Thanks a lot max 2008/4/25 Jens Kraemer <kraemer at webit.de>:> Hi Max, > > thanks for your detailed report. Might well be that I broke one or more > of the various combinations of pagination / sorting / active record > conditions (where you might specify :order, too, btw) in trunk. > > I''ll look into it asap. > > Cheers, > Jens > > On Fri, Apr 25, 2008 at 11:45:30AM +0100, Max Williams wrote: > > Hi > > > > I have a problem with a search where i want to get some results according > to > > some conditions, sort the results, and then paginate over the sorted > > collection. > > > > My search looks like this: > > > > @results = TeachingObject.find_with_ferret(search_term, > > #(ferret) options > > {:page => options[:page], > > :per_page => options[:per_page], > > :sort => > Ferret::Search::SortField.new(:asset_count, > > :type => :integer, :reverse => true )}, > > #find options > > { :conditions => ["id in (?)", @ids] } ) > > > > where @ids is an array of ids from which the results must come (ie a > > collection of ''allowed'' results of which @results will be a subset): > often > > the search term is set to * to get all of this collection in @results. > > '':asset_count'' is an untokenized ferret field that stores integers. > > > > Through debugging and experimenting, i''ve observed the following: > > > > - The overall results set, without sorting/pagination, is correct > > (therefore :conditions is being taken into account). > > - If :per_page is set to be so large that no pagination is required, > then > > the sorting occurs properly (therefore sorting is being taken into > account) > > - If :per_page is reduced so that pagination is required, then the > > sorting of the overall set breaks: it seems as if the results are > ordered by > > id, then paginated. > > - However, on every individual page, the results are sorted properly > for > > that page, ie each page-size subset is internally sorted. > > - If i sort on a different untokenized field, the problem persists. > > > > It seems as if the pagination is happening and THEN the sorting is > > happening, which obviously doesn''t give the expected results. This is > just > > a theory on my part though. > > > > Can anyone tell me how to fix this problem? I''ve been gnashing my teeth > > over it for over a day now and can''t find any solutions... > > > > thanks > > max > > > _______________________________________________ > > Ferret-talk mailing list > > Ferret-talk at rubyforge.org > > http://rubyforge.org/mailman/listinfo/ferret-talk > > -- > Jens Kr?mer > webit! Gesellschaft f?r neue Medien mbH > Schnorrstra?e 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080425/ec26879f/attachment.html>
Sheldon Maloff
2008-May-02 02:24 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Hello Jens, I think I know what''s going on here, because our descending sort searches are broken too and I have started to investigate what''s causing the problem and trying to fix it. I have a January ''08 version of the trunk. I believe it''s changed quite a lot since that time. Jens, I don''t think it''s anything you "broke" but rather an artifiact of how MySQL works. At least, I''m using MySQL and this is the behaviour I see. I created 6 records, whose ids are 1 to 6 in my database. I am paginating on every 5 records. In my reverse sort I would expect to see records 6, 5, 4, 3, 2 on page 1 of the results. And id 1 on page 2 of the results. What I see is a method called ar_find_by_contents. It calls find_id_by_content that returns an array that in turn calls ferret. The array that comes back from ferret is actually correctly sorted: 6 0.928179502487183 5 0.928179502487183 4 0.928179502487183 3 0.928179502487183 2 0.928179502487183 1 0.928179502487183 The first number is the id, the second is the rank. Now what happens is ar_find_by_contents calls retrieve_records. And retrieve_records produces a SELECT statement like so: SELECT * FROM model WHERE id IN (6, 1, 2, 3, 4, 5) LIMIT 0, 5 It took me a while to figure out that things are being passed around as a hash, and hence the wacky order of the ids in the IN clause. Now the problem with this statement is that MySQL doesn''t return records in the order that the ids appear in the IN clause. MySQL returns records in the order of the Primary Key on the table, which happens to be the id column. So MySQL is returning records 1, 2, 3, 4, 5, 6, in that order. Then the LIMIT clause kicks in and truncates the results to 1 through 5. Now the rest of ar_find_by_contents valiantly tries to order the AR results with the rank returned by ferret (my first table above). The problem is, record 6, the youngest, is no longer in the results because LIMIT took it out. So AAF sorts records 1 through 5 descending. Following along we can see how page two returns only record 6. On page two, the limit changes to SELECT * FROM model WHERE id IN (6, 1, 2, 3, 4, 5) LIMIT 5, 5 Once again, My SQL returns records 1, 2, 3, 4, 5, 6, but this time the limit returns only the last record, id 6. And then AAF sorts that descending. I working on a patch for the version I have by making MySQL return only the correct set of records in the first place. In other words, ensuring that the only ids present in the IN clause are the ones that should appear on page 1 of the results, or page 2, or pane N. So my AR query for page 1 looks like SELECT * FROM model WHERE id IN (6, 5, 4, 3, 2) LIMIT 0, 5 and the AR query for page 2 looks like SELECT * FROM model WHERE id IN (1) LIMIT 0, 5 I got it working, but in the process have made every other search, not work. Funny. I''m sure I''ll figure it out. Anyway, Jens, that''s the gist of the problem at least how it relates to MySQL. Other databases may vary. Regards Sheldon Maloff veer.com Jens Kraemer wrote:> Hi Max, > > thanks for your detailed report. Might well be that I broke one or more > of the various combinations of pagination / sorting / active record > conditions (where you might specify :order, too, btw) in trunk. > > I''ll look into it asap. > > Cheers, > Jens-- Posted via http://www.ruby-forum.com/.
Jeff Webb
2008-Jul-09 04:01 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Sheldon/Jens, Any chance that this has been figured out and patched? Great analysis Sheldon! Jeff On Thu, May 1, 2008 at 7:24 PM, Sheldon Maloff <smaloff at veer.com> wrote:> Hello Jens, > > I think I know what''s going on here, because our descending sort > searches are broken too and I have started to investigate what''s causing > the problem and trying to fix it. I have a January ''08 version of the > trunk. I believe it''s changed quite a lot since that time. > > Jens, I don''t think it''s anything you "broke" but rather an artifiact of > how MySQL works. At least, I''m using MySQL and this is the behaviour I > see. > > I created 6 records, whose ids are 1 to 6 in my database. I am > paginating on every 5 records. In my reverse sort I would expect to see > records 6, 5, 4, 3, 2 on page 1 of the results. And id 1 on page 2 of > the results. > > What I see is a method called ar_find_by_contents. It calls > find_id_by_content that returns an array that in turn calls ferret. The > array that comes back from ferret is actually correctly sorted: > > 6 0.928179502487183 > 5 0.928179502487183 > 4 0.928179502487183 > 3 0.928179502487183 > 2 0.928179502487183 > 1 0.928179502487183 > > The first number is the id, the second is the rank. > > Now what happens is ar_find_by_contents calls retrieve_records. And > retrieve_records produces a SELECT statement like so: > > SELECT * FROM model WHERE id IN (6, 1, 2, 3, 4, 5) LIMIT 0, 5 > > It took me a while to figure out that things are being passed around as > a hash, and hence the wacky order of the ids in the IN clause. Now the > problem with this statement is that MySQL doesn''t return records in the > order that the ids appear in the IN clause. MySQL returns records in the > order of the Primary Key on the table, which happens to be the id > column. So MySQL is returning records 1, 2, 3, 4, 5, 6, in that order. > Then the LIMIT clause kicks in and truncates the results to 1 through 5. > > Now the rest of ar_find_by_contents valiantly tries to order the AR > results with the rank returned by ferret (my first table above). The > problem is, record 6, the youngest, is no longer in the results because > LIMIT took it out. So AAF sorts records 1 through 5 descending. > > Following along we can see how page two returns only record 6. On page > two, the limit changes to > > SELECT * FROM model WHERE id IN (6, 1, 2, 3, 4, 5) LIMIT 5, 5 > > Once again, My SQL returns records 1, 2, 3, 4, 5, 6, but this time the > limit returns only the last record, id 6. And then AAF sorts that > descending. > > I working on a patch for the version I have by making MySQL return only > the correct set of records in the first place. In other words, ensuring > that the only ids present in the IN clause are the ones that should > appear on page 1 of the results, or page 2, or pane N. > > So my AR query for page 1 looks like > > SELECT * FROM model WHERE id IN (6, 5, 4, 3, 2) LIMIT 0, 5 > > and the AR query for page 2 looks like > > SELECT * FROM model WHERE id IN (1) LIMIT 0, 5 > > I got it working, but in the process have made every other search, not > work. Funny. I''m sure I''ll figure it out. > > Anyway, Jens, that''s the gist of the problem at least how it relates to > MySQL. Other databases may vary. > > Regards > Sheldon Maloff > veer.com > > > Jens Kraemer wrote: > > Hi Max, > > > > thanks for your detailed report. Might well be that I broke one or more > > of the various combinations of pagination / sorting / active record > > conditions (where you might specify :order, too, btw) in trunk. > > > > I''ll look into it asap. > > > > Cheers, > > Jens > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk >-- Jeff Webb jeff at boowebb.com http://boowebb.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080708/109f552e/attachment.html>
Jens Kraemer
2008-Jul-09 09:24 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Hi folks, On 09.07.2008, at 06:01, Jeff Webb wrote:> Sheldon/Jens, > > Any chance that this has been figured out and patched?I just committed a fix to this problem to trunk, so pagination with sorting and AR conditions should work now. cheers, Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database
Max Williams
2008-Jul-18 11:11 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Hi Jens/all I''ve been playing with latest version, and it seems that the combination of AR conditions, paginating and AR order (eg ":order => "name") seems to be fine now, even on our server, which was were it was broken before (it always worked locally, weirdly). However, i''ve just tried it with results that are sorted by ferret on the basis of their boost score, and it seems broken. I''m sorry i don''t have any tests set up but if i walk through an example can anyone see something that maybe i''m doing wrong? Or shed any light? I use AR conditions to filter the results through a set of ids that an individual user is allowed to view. I just have a method User#allowed_ids for this that returns an array of integers. For the purpose of illustration, though, let''s say that the allowed ids are everything between 1000 & 2000. allowed_ids = (1000..2000).to_a => [1000, 1001, 1002, etc, 2000] For a particular search (on ''rhythm'') i get these results back - the order is the order calculated by ferret/aaf on the basis of boost values: i''ve collected them by id for clarity. In this example i''ve set a big per_page to get all the results.>> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page =>1000 }, {} ).collect(&:id) => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411, 1834, 702, 1827, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 2129, 2130, 2131, 1858, 1859, 1860, 1861, 1865, 2132, 2141, 2345, 2350, 2352, 2353, 2356, 2360, 2362, 2366, 2368, 2371, 2372, 2373, 2376, 2377, 2378, 2384, 2385, 2389, 2407, 2414, 2417, 2419, 2427, 2428, 2438, 2439, 2458, 2459, 2473, 2474, 2475, 2477, 2478, 2133, 2349, 2361, 2363, 2365, 2370, 2375, 2383, 2386, 2392, 2415, 2430, 2431, 2440, 2441, 2442, 2472, 2142, 3751, 2161, 1236] Now, when i apply the condition of only being in the allowed ids, i''d expect the remaining ids to be in the same order as above, and that is in fact the case:>> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page =>1000 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] Just to check, let''s do the original again (with no AR conditions) and just do set intersection with the allowed_ids>> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page =>1000 }, {} ).collect(&:id) & allowed_ids => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] OK - looks good. But, when i try to paginate into some actual pages, the order breaks: first, without AR conditions:>> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page =>10 }, {} ).collect(&:id) => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411] That all seems to be in order - i get the first ten results from the big list, above. Now, if we were to apply the allowed_ids condition here, we''d expect the results to start [1830, 1831, 1826, 1832, 1825, 1833, ...] - right? Because we should have the same ordering applied to the remaining resources, and then we get the first ten for page 1. But, the ordering is different ->> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page =>10 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) => [1826, 1825, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 1236] So, it seems that pagination + ferret score ordering + AR conditions is a bad combination Again, sorry to not supply test cases but we don''t use them (i know, i know!!!). Can anyone shed any light? thanks - max 2008/7/9 Jens Kraemer <jk at jkraemer.net>:> Hi folks, > > On 09.07.2008, at 06:01, Jeff Webb wrote: > >> Sheldon/Jens, >> >> Any chance that this has been figured out and patched? >> > > I just committed a fix to this problem to trunk, so pagination with sorting > and AR conditions should work now. > > > cheers, > Jens > > -- > Jens Kr?mer > http://www.jkraemer.net/ - Blog > http://www.omdb.org/ - The new free film database > > > > > > > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080718/51fac4b5/attachment.html>
Jeff Webb
2008-Jul-18 22:28 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Max, Here is what I changed in the current stable version of AAF in my local plugin install to get conditions and pagination working. I have not fully tested it except for my particular use cases but it seems to work out. Sorting is not factored into these changes. Basically if there is no pagination OR no AR conditions you can let ferret do all the work for you. If you have both of these then you need to return all results, merge with ferret IDs, then do your offset and limit. I replicated what Jens has done on the AAF trunk. Hope this works for you. I have not heard if Jens is going to add this to the current branch as a hotfix. replace the existing ar_by_contents method in class_methods.rb with: #################################### #changes to ar_find_by_contents AAF 0.4.3# #################################### def ar_find_by_contents(q, options = {}, find_options = {}) result_ids = {} has_conditions = !find_options[:conditions].blank? || caller.find{ |call| call =~ %r{active_record/associations} } # odd case - cannot do pagination combo with AR & Ferret # must retrieve all then paginate after if options[:per_page] && has_conditions late_paginate = true offset = find_options.delete(:offset) limit = find_options.delete(:limit) options.delete(:page) options.delete(:per_page) find_options.delete(:offset) find_options.delete(:limit) options[:limit] = :all end total_hits = find_id_by_contents(q, options) do |model, id, score, data| # stores ids, index and score of each hit for later ordering of # results result_ids[id] = [ result_ids.size + 1, score ] end result = retrieve_records( { self.name => result_ids }, find_options ) if has_conditions # what we got from the database is our full result set, so take it''s size total_hits = result.length if late_paginate result = result[offset..offset+limit-1] end end [ total_hits, result ] end Jeff On Fri, Jul 18, 2008 at 4:11 AM, Max Williams <toastkid.williams at gmail.com> wrote:> Hi Jens/all > > I''ve been playing with latest version, and it seems that the combination of > AR conditions, paginating and AR order (eg ":order => "name") seems to be > fine now, even on our server, which was were it was broken before (it always > worked locally, weirdly). > > However, i''ve just tried it with results that are sorted by ferret on the > basis of their boost score, and it seems broken. > > I''m sorry i don''t have any tests set up but if i walk through an example > can anyone see something that maybe i''m doing wrong? Or shed any light? > > I use AR conditions to filter the results through a set of ids that an > individual user is allowed to view. I just have a method User#allowed_ids > for this that returns an array of integers. For the purpose of > illustration, though, let''s say that the allowed ids are everything between > 1000 & 2000. > allowed_ids = (1000..2000).to_a > => [1000, 1001, 1002, etc, 2000] > > For a particular search (on ''rhythm'') i get these results back - the order > is the order calculated by ferret/aaf on the basis of boost values: i''ve > collected them by id for clarity. In this example i''ve set a big per_page > to get all the results. > > >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page => > 1000 }, {} ).collect(&:id) > => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411, 1834, 702, > 1827, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 2129, 2130, 2131, 1858, > 1859, 1860, 1861, 1865, 2132, 2141, 2345, 2350, 2352, 2353, 2356, 2360, > 2362, 2366, 2368, 2371, 2372, 2373, 2376, 2377, 2378, 2384, 2385, 2389, > 2407, 2414, 2417, 2419, 2427, 2428, 2438, 2439, 2458, 2459, 2473, 2474, > 2475, 2477, 2478, 2133, 2349, 2361, 2363, 2365, 2370, 2375, 2383, 2386, > 2392, 2415, 2430, 2431, 2440, 2441, 2442, 2472, 2142, 3751, 2161, 1236] > > Now, when i apply the condition of only being in the allowed ids, i''d > expect the remaining ids to be in the same order as above, and that is in > fact the case: > > >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page => > 1000 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) > => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, 1679, > 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] > > Just to check, let''s do the original again (with no AR conditions) and just > do set intersection with the allowed_ids > >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page => > 1000 }, {} ).collect(&:id) & allowed_ids > => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, 1679, > 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] > > OK - looks good. But, when i try to paginate into some actual pages, the > order breaks: first, without AR conditions: > > >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page => > 10 }, {} ).collect(&:id) > => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411] > > That all seems to be in order - i get the first ten results from the big > list, above. > > Now, if we were to apply the allowed_ids condition here, we''d expect the > results to start > [1830, 1831, 1826, 1832, 1825, 1833, ...] - right? Because we should have > the same ordering applied to the remaining resources, and then we get the > first ten for page 1. But, the ordering is different - > > >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page => > 10 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) > => [1826, 1825, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 1236] > > So, it seems that pagination + ferret score ordering + AR conditions is a > bad combination > > Again, sorry to not supply test cases but we don''t use them (i know, i > know!!!). Can anyone shed any light? > > thanks - max > > 2008/7/9 Jens Kraemer <jk at jkraemer.net>: > > Hi folks, >> >> On 09.07.2008, at 06:01, Jeff Webb wrote: >> >>> Sheldon/Jens, >>> >>> Any chance that this has been figured out and patched? >>> >> >> I just committed a fix to this problem to trunk, so pagination with >> sorting and AR conditions should work now. >> >> >> cheers, >> Jens >> >> -- >> Jens Kr?mer >> http://www.jkraemer.net/ - Blog >> http://www.omdb.org/ - The new free film database >> >> >> >> >> >> >> >> >> _______________________________________________ >> Ferret-talk mailing list >> Ferret-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/ferret-talk >> > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk >-- Jeff Webb jeff at boowebb.com http://boowebb.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080718/0a5f98b4/attachment-0001.html>
Max Williams
2008-Jul-19 10:51 UTC
[Ferret-talk] Pagination, sorting and conditions: the combination is b
Hi Jeff - thanks a lot for looking at this. I discovered that since last august, will paginate lets you paginate any arbitrary array - so you can paginate an existing set of results. so what i''m doing now is this - get the results using ferret find, passing through the AR conditions and a large per page which means there''s no pagination, effectively. The sorting (by ferret score) is correct. - cache the results with memcache, using params (minus the page k-v pair) as a cache key - this means that the retrieve will be slow the first time but very quick if the user just wants to look at another page - then paginate the results, simply doing "@all_results.paginate(:page => params[:page], :per_page => 20)" This is pretty similar to your patch i think - get results with ferret and then paginate - and it''s working fine for me. The size of my result set returned by ferret is rarely more than 150 big, so getting all results isn''t a problem. Obviously it''s not very scaleable though. thanks! max 2008/7/18 Jeff Webb <jeff at boowebb.com>:> Max, > > Here is what I changed in the current stable version of AAF in my local > plugin install to get conditions and pagination working. I have not fully > tested it except for my particular use cases but it seems to work out. > Sorting is not factored into these changes. > > Basically if there is no pagination OR no AR conditions you can let ferret > do all the work for you. If you have both of these then you need to return > all results, merge with ferret IDs, then do your offset and limit. I > replicated what Jens has done on the AAF trunk. > > Hope this works for you. I have not heard if Jens is going to add this to > the current branch as a hotfix. > > replace the existing ar_by_contents method in class_methods.rb with: > > #################################### > #changes to ar_find_by_contents AAF 0.4.3# > #################################### > > def ar_find_by_contents(q, options = {}, find_options = {}) > result_ids = {} > has_conditions = !find_options[:conditions].blank? || caller.find{ > |call| call =~ %r{active_record/associations} } > > # odd case - cannot do pagination combo with AR & Ferret > # must retrieve all then paginate after > if options[:per_page] && has_conditions > late_paginate = true > offset = find_options.delete(:offset) > limit = find_options.delete(:limit) > options.delete(:page) > options.delete(:per_page) > find_options.delete(:offset) > find_options.delete(:limit) > options[:limit] = :all > end > > total_hits = find_id_by_contents(q, options) do |model, id, score, > data| > # stores ids, index and score of each hit for later ordering of > # results > result_ids[id] = [ result_ids.size + 1, score ] > end > > result = retrieve_records( { self.name => result_ids }, find_options > ) > > if has_conditions > # what we got from the database is our full result set, so take > it''s size > total_hits = result.length > > if late_paginate > result = result[offset..offset+limit-1] > end > end > > [ total_hits, result ] > end > > > Jeff > > > > > On Fri, Jul 18, 2008 at 4:11 AM, Max Williams <toastkid.williams at gmail.com> > wrote: > >> Hi Jens/all >> >> I''ve been playing with latest version, and it seems that the combination >> of AR conditions, paginating and AR order (eg ":order => "name") seems to be >> fine now, even on our server, which was were it was broken before (it always >> worked locally, weirdly). >> >> However, i''ve just tried it with results that are sorted by ferret on the >> basis of their boost score, and it seems broken. >> >> I''m sorry i don''t have any tests set up but if i walk through an example >> can anyone see something that maybe i''m doing wrong? Or shed any light? >> >> I use AR conditions to filter the results through a set of ids that an >> individual user is allowed to view. I just have a method User#allowed_ids >> for this that returns an array of integers. For the purpose of >> illustration, though, let''s say that the allowed ids are everything between >> 1000 & 2000. >> allowed_ids = (1000..2000).to_a >> => [1000, 1001, 1002, etc, 2000] >> >> For a particular search (on ''rhythm'') i get these results back - the order >> is the order calculated by ferret/aaf on the basis of boost values: i''ve >> collected them by id for clarity. In this example i''ve set a big per_page >> to get all the results. >> >> >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page >> => 1000 }, {} ).collect(&:id) >> => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411, 1834, 702, >> 1827, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 2129, 2130, 2131, 1858, >> 1859, 1860, 1861, 1865, 2132, 2141, 2345, 2350, 2352, 2353, 2356, 2360, >> 2362, 2366, 2368, 2371, 2372, 2373, 2376, 2377, 2378, 2384, 2385, 2389, >> 2407, 2414, 2417, 2419, 2427, 2428, 2438, 2439, 2458, 2459, 2473, 2474, >> 2475, 2477, 2478, 2133, 2349, 2361, 2363, 2365, 2370, 2375, 2383, 2386, >> 2392, 2415, 2430, 2431, 2440, 2441, 2442, 2472, 2142, 3751, 2161, 1236] >> >> Now, when i apply the condition of only being in the allowed ids, i''d >> expect the remaining ids to be in the same order as above, and that is in >> fact the case: >> >> >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page >> => 1000 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) >> => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, >> 1679, 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] >> >> Just to check, let''s do the original again (with no AR conditions) and >> just do set intersection with the allowed_ids >> >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page >> => 1000 }, {} ).collect(&:id) & allowed_ids >> => [1830, 1831, 1826, 1832, 1825, 1833, 1834, 1827, 1689, 1680, 1688, >> 1679, 1686, 1684, 1676, 1858, 1859, 1860, 1861, 1865, 1236] >> >> OK - looks good. But, when i try to paginate into some actual pages, the >> order breaks: first, without AR conditions: >> >> >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page >> => 10 }, {} ).collect(&:id) >> => [4038, 698, 4039, 1830, 1831, 1826, 1832, 1825, 1833, 411] >> >> That all seems to be in order - i get the first ten results from the big >> list, above. >> >> Now, if we were to apply the allowed_ids condition here, we''d expect the >> results to start >> [1830, 1831, 1826, 1832, 1825, 1833, ...] - right? Because we should have >> the same ordering applied to the remaining resources, and then we get the >> first ten for page 1. But, the ordering is different - >> >> >> ActsAsFerret::find("rhythm", [TeachingObject],{ :page => 1, :per_page >> => 10 }, {:conditions => ["id in (?)", allowed_ids] } ).collect(&:id) >> => [1826, 1825, 1689, 1680, 1688, 1679, 1686, 1684, 1676, 1236] >> >> So, it seems that pagination + ferret score ordering + AR conditions is a >> bad combination >> >> Again, sorry to not supply test cases but we don''t use them (i know, i >> know!!!). Can anyone shed any light? >> >> thanks - max >> >> 2008/7/9 Jens Kraemer <jk at jkraemer.net>: >> >> Hi folks, >>> >>> On 09.07.2008, at 06:01, Jeff Webb wrote: >>> >>>> Sheldon/Jens, >>>> >>>> Any chance that this has been figured out and patched? >>>> >>> >>> I just committed a fix to this problem to trunk, so pagination with >>> sorting and AR conditions should work now. >>> >>> >>> cheers, >>> Jens >>> >>> -- >>> Jens Kr?mer >>> http://www.jkraemer.net/ - Blog >>> http://www.omdb.org/ - The new free film database >>> >>> >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Ferret-talk mailing list >>> Ferret-talk at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/ferret-talk >>> >> >> >> _______________________________________________ >> Ferret-talk mailing list >> Ferret-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/ferret-talk >> > > > > -- > > Jeff Webb > jeff at boowebb.com > http://boowebb.com/ > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://rubyforge.org/pipermail/ferret-talk/attachments/20080719/2bd5c82e/attachment.html>