My question is about object.collections (AssociationProxy). We all know that it pretends to be an Array, but in reality it''s actually an AssociationProxy. But, Why does it have its own db intensive include? method? Example: @product = Product.find 1 (product habtm categories) We have 1000 categories in the system. In the product edit admin, we display a list of checkboxes including all categories, and the ones that are associated to the product are checked. <% Category.all do |category| %> <%= check_box_tag "product[category_ids][]", category.id, @product.categories.include?(category) %> <% end %> but it triggers 1000 sql queries like this (very time consuming): SELECT `categories`.id FROM `categories` INNER JOIN `categories_products` ON `categories`.id `categories_products`.category_id WHERE (`categories`.`id` = 1) AND (`categories_products`.product_id = 1 ) LIMIT 1 Why doesn''t it use the include? method of the Array? (way faster and doesn''t need db queries at all, since the association is already loaded.) Why does it need its own include? method? If I change my code to this it skips those unnecessary db queries, and return the result lightning fast: @product.categories.to_a.include?(category) what do you guys think? -- Posted via http://www.ruby-forum.com/.
On Aug 3, 5:40 pm, Mark Ron <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> Why doesn''t it use the include? method of the Array? (way faster and > doesn''t need db queries at all, since the association is already > loaded.) Why does it need its own include? method?Presumably the association isn''t actually loaded - the code in association_collection.rb checks for that and uses Array#include? if the association is loaded (at least that is the intent). The key thing is that active record doesn''t know ahead of time if you are going to do this once (in which case loading 1000 objects just to do this test would be wasteful) or if you are going to loop over a long list of products in which case not loading the array is wasteful). You can however (as you have found) force AR to go down the more efficient route (but yes, it would be nicer if it was better at guessing what you wanted to do). Fred> > If I change my code to this it skips those unnecessary db queries, and > return the result lightning fast: > > @product.categories.to_a.include?(category) > > what do you guys think? > -- > Posted viahttp://www.ruby-forum.com/.
Frederick Cheung wrote:> On Aug 3, 5:40�pm, Mark Ron <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote: > >> Why doesn''t it use the include? method of the Array? (way faster and >> doesn''t need db queries at all, since the association is already >> loaded.) Why does it need its own include? method? > > Presumably the association isn''t actually loaded - the code in > association_collection.rb checks for that and uses Array#include? if > the association is loaded (at least that is the intent). > > The key thing is that active record doesn''t know ahead of time if you > are going to do this once (in which case loading 1000 objects just to > do this test would be wasteful) or if you are going to loop over a > long list of products in which case not loading the array is > wasteful). You can however (as you have found) force AR to go down the > more efficient route (but yes, it would be nicer if it was better at > guessing what you wanted to do). > > Fredso basically it''s the same 1+N query problem we face when we try to access attributes of an associated object when we iterate through the collection of the parent. but that one does have a solution, if you include the associated model in the original query. (eager loading associations) for post in Post.find(:all, :include => :author) # 2 db queries) p post.author.name #no db query end but in case of object.collections, AR caches the association automatically like: project.milestones # fetches milestones from the database project.milestones.size # uses the milestone cache project.milestones.empty? # uses the milestone cache project.milestones(true).size # fetches milestones from the database project.milestones # uses the milestone cache So technically AR should know that the associated objects are loaded already (and cached) so the include? method should use the faster Array#include? method instead of AssociationCollection#inlcude? or to be more accurate #exists? method, right? project.milestones.include?(milestone) #should be fetched from cache too, no db query should be needed) It seems that it''s programmed in the framework, but for some reason in my example it''s always getting the result from the database instead of the already loaded association cache. Why is that? Thanks, Mark -- Posted via http://www.ruby-forum.com/.
On Aug 3, 9:28 pm, Mark Ron <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> project.milestones # fetches milestones from the database > project.milestones.size # uses the milestone cache > project.milestones.empty? # uses the milestone cache > project.milestones(true).size # fetches milestones from the database > project.milestones # uses the milestone cacheActually that''s not quite true: project.milestones does nothing (except possibly create an instance of AssociationCollection. In the console it looks like this loads the array but this is only because the console calls #inspect to display it (which causes it to be loaded). project.milestones.size will do a count(*) as will .empty? unless something else had caused the association to be loaded.> > So technically AR should know that the associated objects are loaded > already (and cached) so the include? method should use the faster > Array#include? method instead of AssociationCollection#inlcude? or to be > more accurate #exists? method, right? > > project.milestones.include?(milestone) #should be fetched from cache > too, no db query should be needed) > > It seems that it''s programmed in the framework, but for some reason in > my example it''s always getting the result from the database instead of > the already loaded association cache. > > Why is that?I suspect that the association cache isn''t actually loaded but without seeing what you''re doing this is mere speculation. Fred> > Thanks, > Mark > > -- > Posted viahttp://www.ruby-forum.com/.
Optimization really depends on the developer. The ORM framework has limitations. You could move the fetch outside the loop and use specific fields to reduce memory usage as well. For example (illustrative) <% avail_category_ids = @product.categories.all(:select => "id").map{| category| category.id}) Category.all(:select => "id").map{|category| category.id} do | category| %> <%= check_box_tag "product[category_ids][]", category.id, avail_category_ids.include?(category.id) %> <% end %> On Aug 4, 1:28 am, Mark Ron <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> Frederick Cheung wrote: > > On Aug 3, 5:40 pm, Mark Ron <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote: > > >> Why doesn''t it use the include? method of the Array? (way faster and > >> doesn''t need db queries at all, since the association is already > >> loaded.) Why does it need its own include? method? > > > Presumably the association isn''t actually loaded - the code in > > association_collection.rb checks for that and uses Array#include? if > > the association is loaded (at least that is the intent). > > > The key thing is that active record doesn''t know ahead of time if you > > are going to do this once (in which case loading 1000 objects just to > > do this test would be wasteful) or if you are going to loop over a > > long list of products in which case not loading the array is > > wasteful). You can however (as you have found) force AR to go down the > > more efficient route (but yes, it would be nicer if it was better at > > guessing what you wanted to do). > > > Fred > > so basically it''s the same 1+N query problem we face when we try to > access attributes of an associated object when we iterate through the > collection of the parent. but that one does have a solution, if you > include the associated model in the original query. (eager loading > associations) > > for post in Post.find(:all, :include => :author) # 2 db queries) > p post.author.name #no db query > end > > but in case of object.collections, AR caches the association > automatically like: > > project.milestones # fetches milestones from the database > project.milestones.size # uses the milestone cache > project.milestones.empty? # uses the milestone cache > project.milestones(true).size # fetches milestones from the database > project.milestones # uses the milestone cache > > So technically AR should know that the associated objects are loaded > already (and cached) so the include? method should use the faster > Array#include? method instead of AssociationCollection#inlcude? or to be > more accurate #exists? method, right? > > project.milestones.include?(milestone) #should be fetched from cache > too, no db query should be needed) > > It seems that it''s programmed in the framework, but for some reason in > my example it''s always getting the result from the database instead of > the already loaded association cache. > > Why is that? > > Thanks, > Mark > > -- > Posted viahttp://www.ruby-forum.com/.