So eager loading via :include is far from perfect: as soon as there are more than one has_many then you''re getting a big cartesian join and end up processing 10000 rows of information in order to get out 200 rows of extra stuff (assuming two 100 object has_manys) I''ve knocked together a patch (http://dev.rubyonrails.org/ticket/ 9640) which takes a different approach: Author.find :all, :conditions => blah, :preload => [:posts, :comments] will first load authors, then one query for posts and one for comments You can nest things like you can with eager loading i.e. Author.find :all, :conditions => blah, :preload => [:comments, {:posts => :categories}] The number of queries will be n+1 (except if you have a has_many :through, which count double since we first load the :through association) The one thing you definitely can''t do is have conditions on the preloaded tables, so Author.find :all, :conditions => "posts.body like ''%foo%'', :preload => [:comments, {:posts => :categories}] is right out. I''ve put together some unit tests (and by ''put together'' I mean ''nicked the tests for eager loading and replaced :include with :preload'') that at least cover basic use. I think most of what is achievable via :include should also also be doable in this way, but it''s definitely the case that right now I''ve only handled the simple case for a lot of stuff. I''m sure there are various combinations of options that aren''t handled or covered by the tests although I don''t see why most of it can''t be fixed up. I''m not sure where all this fits in but thought it was interesting enough that others might want to have a look. In the specific case I had in mind (large has_many loads) it''s way faster than :include, and seems to be as fast or faster in other cases too. Fred --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
This is kinda similar to Rick''s ar_context plugin - http://activereload.net/2007/5/23/spend-less-time-in-the-database-and-more-time-outdoors I guess this will be a very good addition. I''m +1 on the idea. But we need to come up with a bit clearer distiction about when should one use :include and when :preload. Fred : It would be interesting if you can compare results of :preload with :include after applying Gabe''s patch - http://darwinweb.net/ Just a thought. -Pratik On 9/24/07, Frederick Cheung <frederick.cheung@gmail.com> wrote:> > So eager loading via :include is far from perfect: as soon as there > are more than one has_many then you''re getting a big cartesian join > and end up processing 10000 rows of information in order to get out > 200 rows of extra stuff (assuming two 100 object has_manys) > > I''ve knocked together a patch (http://dev.rubyonrails.org/ticket/ > 9640) which takes a different approach: > Author.find :all, :conditions => blah, :preload => [:posts, :comments] > > will first load authors, then one query for posts and one for comments > > You can nest things like you can with eager loading i.e. > > Author.find :all, :conditions => blah, :preload => [:comments, > {:posts => :categories}] > The number of queries will be n+1 (except if you have a > has_many :through, which count double since we first load > the :through association) > > The one thing you definitely can''t do is have conditions on the > preloaded tables, so > > Author.find :all, :conditions => "posts.body like ''%foo%'', :preload > => [:comments, {:posts => :categories}] > > is right out. > > I''ve put together some unit tests (and by ''put together'' I mean > ''nicked the tests for eager loading and replaced :include > with :preload'') that at least cover basic use. I think most of what > is achievable via :include should also also be doable in this way, > but it''s definitely the case that right now I''ve only handled the > simple case for a lot of stuff. I''m sure there are various > combinations of options that aren''t handled or covered by the tests > although I don''t see why most of it can''t be fixed up. > > I''m not sure where all this fits in but thought it was interesting > enough that others might want to have a look. In the specific case I > had in mind (large has_many loads) it''s way faster than :include, and > seems to be as fast or faster in other cases too. > > Fred > > > > >-- Cheers! - Pratik http://m.onkey.org --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
On 24 Sep 2007, at 10:24, Pratik wrote:> > This is kinda similar to Rick''s ar_context plugin - > http://activereload.net/2007/5/23/spend-less-time-in-the-database- > and-more-time-outdoors >Yes, that was part of my inspiration in a slightly circular way> I guess this will be a very good addition. I''m +1 on the idea. But we > need to come up with a bit clearer distiction about when should one > use :include and when :preload.Sure - this is still somewhat experimental. I''m going to do a little benchmarking sometime today or tomorrow to work out when it''s a winner and when it isn''t> > Fred : It would be interesting if you can compare results of :preload > with :include after applying Gabe''s patch - http://darwinweb.net/ Just > a thought. >I think that what Gabe is saying is slow :include = slow prequery + lots of rows, and then fixes the slow prequery part. With :preload there isn''t really a prequery (unless you count the initial find). Either way you end up with the same constraint: conditions only on the base table, except that :preload goes on to fix the next part of the problem too, which is the bit that was biting me (we were already sidestepping the prequery issues by doing foo_ids = Foo.find (:all, :conditions => [...], :limit => 10; Foo.find(foo_ids, :include => [...]) What preload can do that I don''t think you can easily do with :include is make conditions more intuitive, so as gabe says Article.find(:all, :include => :tags, :conditions => "tags.name = ''foo''") is somewhat counter intuitive since it zaps articles with no tags matching the conditions. You could (if you came up with a better syntax) imagine preload doing Article.find(:all, :preload => {:name => ''tags'', :conditions => "name = ''foo''}) which might be a little more intuitive and dead simple to add. Fred> -Pratik > > On 9/24/07, Frederick Cheung <frederick.cheung@gmail.com> wrote: >> >> So eager loading via :include is far from perfect: as soon as there >> are more than one has_many then you''re getting a big cartesian join >> and end up processing 10000 rows of information in order to get out >> 200 rows of extra stuff (assuming two 100 object has_manys) >> >> I''ve knocked together a patch (http://dev.rubyonrails.org/ticket/ >> 9640) which takes a different approach: >> Author.find :all, :conditions => blah, :preload => >> [:posts, :comments] >> >> will first load authors, then one query for posts and one for >> comments >> >> You can nest things like you can with eager loading i.e. >> >> Author.find :all, :conditions => blah, :preload => [:comments, >> {:posts => :categories}] >> The number of queries will be n+1 (except if you have a >> has_many :through, which count double since we first load >> the :through association) >> >> The one thing you definitely can''t do is have conditions on the >> preloaded tables, so >> >> Author.find :all, :conditions => "posts.body like ''%foo%'', :preload >> => [:comments, {:posts => :categories}] >> >> is right out. >> >> I''ve put together some unit tests (and by ''put together'' I mean >> ''nicked the tests for eager loading and replaced :include >> with :preload'') that at least cover basic use. I think most of what >> is achievable via :include should also also be doable in this way, >> but it''s definitely the case that right now I''ve only handled the >> simple case for a lot of stuff. I''m sure there are various >> combinations of options that aren''t handled or covered by the tests >> although I don''t see why most of it can''t be fixed up. >> >> I''m not sure where all this fits in but thought it was interesting >> enough that others might want to have a look. In the specific case I >> had in mind (large has_many loads) it''s way faster than :include, and >> seems to be as fast or faster in other cases too. >> >> Fred >> >> >>> >> > > > -- > Cheers! > - Pratik > http://m.onkey.org > > >--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
>> Article.find(:all, :preload => {:name => ''tags'', :conditions => "name= ''foo''}) which might be a little more intuitive and dead simple to add. I was dreaming about this last night. But you have to do it cleverly. eg. the above statement should do which of the following? (i''m gonna write these out in pseudo-AR/SQL) a) Article.find(:all), Tag.find(:all, "id IN *what_we_just_got* AND name=''foo''") b) Article.find(:all, "EXISTS (SELECT * FROM tags WHERE article_idarticles.id)"). Tag.find(:all, "id IN *what_we_just_got*) c) Article.find(:all, "EXISTS (SELECT * FROM tags WHERE article_idarticles.id)"). Tag.find(:all, "id IN *what_we_just_got* AND name="foo") but I fear that offering too many options will make the situation worse. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
On 24 Sep 2007, at 11:36, matthewrudyjacobs@gmail.com wrote:> >>> Article.find(:all, :preload => {:name => ''tags'', :conditions => >>> "name > = ''foo''}) which might be a little more intuitive and dead simple to > add. > > I was dreaming about this last night. > But you have to do it cleverly. > > eg. the above statement should do which of the following? (i''m gonna > write these out in pseudo-AR/SQL) >Well I was thinking of a) - certainly in the light of the pseudo syntax i wrote above it''s what I would expect. c) is what you currently get if you do Article.find :all, :include => :tags, :conditions => "tags.name = ''foo" which can be counterintuitive - at least to me, moving the condition inside :preload reads to me like ''load articles and then preload matching tags I''ve no idea what the common/useful case is though. Fred> a) Article.find(:all), Tag.find(:all, "id IN *what_we_just_got* AND > name=''foo''") > b) Article.find(:all, "EXISTS (SELECT * FROM tags WHERE article_id> articles.id)"). Tag.find(:all, "id IN *what_we_just_got*) > c) Article.find(:all, "EXISTS (SELECT * FROM tags WHERE article_id> articles.id)"). Tag.find(:all, "id IN *what_we_just_got* AND > name="foo") > > but I fear that offering too many options will make the situation > worse. > > > >--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
I''ve updated the patch at http://dev.rubyonrails.org/ticket/9640 (I''d managed to omit the tests I''d written from the patch). I''ve also made class Post has_many :comments, :preload => :author end and related work. At this point :preload basically covers everything that :include does, except that conditions can''t be specified on the :preloaded tables. Conditions on the associations do however work, ie class Post has_many :unapproved_comments, :class_name => ''Comment'', :conditions => ''approved = 0'' end Post.find(:all, :preload => :unapproved_comments) works fine, but Post.find(:all, :preload =>:comments, :conditions => ''approved = 0'') doesn''t work. The ticket also has a few benchmarks, drawn from some of the stuff I''ve worked on. :preload seems to be faster than :include (not always by much) in all cases, but the big win is multiple has_manys (because of the cartesian product being loaded etc...). That''s just on data that I have access to, it would be great to see how it works in other circumstances. Anyway, to conclude the ramble it would be nice to see what difference this makes to other apps/tasks (regardless of what happens to the patch itself). Fred --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---