Victor Lin
2011-Jul-06 18:17 UTC
Need help porting limited eager loading optimization to Rails 3
I'm fairly certain this 2.x feature never made it to Rails 3. You can view
the original ticket and patches here:
http://web.archive.org/web/20081005125758/http://dev.rubyonrails.org/ticket/9560
The gist of it is, when you run something like this:
Article.includes(:comments, :author).where('authors.id = 1').limit(10)
You end up with two queries. The first is a "SELECT DISTINCT authors.id...",
and the next will actually load the comments and authors associations. In
Rails 2.x, AR was smart enough to only join against the tables that actually
limited the result set (e.g., anything in the where or order clauses). Rails 3
will blindly join all the tables, which kills performance when you have
several eager loaded associations.
I started working on a patch to apply_join_dependency but ran into a problem
with table aliasing. The diff is here:
https://gist.github.com/1067917
The approach is basically to scan the order and where clauses for table
names. Then scan the included associations for these table names, adding
them (and any intermediate joins) to a list, and only joining those
associations. The problem is when the arel object is built for
clean_relation, it has fewer joins than the original. AR builds a
JoinDependency object and JoinDependency#graft's all my joins to it. That
object never actually sees any of the original joins, and so the alias
tracker hands out the default table name instead of the aliased table name.
I don't know how to deal with this without hacking up more of the source
than I want to - anyone have any ideas about how to deal with this? Or maybe
a completely different approach?
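A minimal plain-Ruby sketch of the selection step described above, for illustration only. It does not use ActiveRecord, it is not the code from the gist, and it does not address the aliasing problem. The `ASSOCIATIONS` hash, the `:profile` association, and both helper methods are hypothetical stand-ins for the included associations and the scanning logic.

```ruby
# Hypothetical, simplified description of the included associations:
# association name => { table:, parent: }. `parent` is the association
# that has to be joined first (nil when it hangs off the base model).
ASSOCIATIONS = {
  comments: { table: "comments", parent: nil },
  author:   { table: "authors",  parent: nil },
  profile:  { table: "profiles", parent: :author }
}.freeze

# Collect every identifier that appears as `table.column` in the given
# where/order fragments. This is naive string scanning, nothing more.
def referenced_tables(*sql_fragments)
  sql_fragments.compact
               .flat_map { |sql| sql.scan(/\b([a-z_][a-z0-9_]*)\.[a-z_]/i).flatten }
               .map(&:downcase)
               .uniq
end

# Return the associations whose tables are referenced, each preceded by
# any intermediate associations needed to reach it.
def limiting_associations(associations, tables)
  needed = []
  associations.each do |name, info|
    next unless tables.include?(info[:table])

    chain = []
    current = name
    while current
      chain.unshift(current)
      current = associations.fetch(current)[:parent]
    end
    needed.concat(chain)
  end
  needed.uniq
end

tables = referenced_tables("authors.id = 1", nil)
p limiting_associations(ASSOCIATIONS, tables)
```

With the where clause from the example query, only `:author` is selected, so the prequery would not need to join `comments`.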
--
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Core" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/rubyonrails-core/-/As1BiWORxRAJ.
To post to this group, send email to rubyonrails-core@googlegroups.com.
To unsubscribe from this group, send email to
rubyonrails-core+unsubscribe@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/rubyonrails-core?hl=en.
Oriol Gual
2011-Jul-06 19:35 UTC
Re: Need help porting limited eager loading optimization to Rails 3
I'm not really sure, but maybe hacking on Arel (https://github.com/rails/arel) directly, instead of AR, would be an easier way to solve this issue.
Ernie Miller
2011-Jul-06 20:26 UTC
Re: Need help porting limited eager loading optimization to Rails 3
Victor, I feel your pain, but in my opinion, moving further down the road of scanning queries for strings to determine a table's inclusion takes things in the wrong direction.

The trend in core (and one that I hope continues) has been toward using ARel objects representing the various parts of a query. For instance, your order-scanning code is subject to the same bug that this commit fixes:
https://github.com/rails/rails/commit/08f3f30994d37f6f44acfac801f82fc43127fc78

I don't think there's a quick (and also correct) way to fix this behavior without modifying more than this one method. Jon or Aaron might tell me I'm wrong, though. :)

--
Ernie Miller
http://metautonomo.us
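As a general illustration of why string scanning is fragile (this is not necessarily the specific bug the linked commit addresses), here is a small plain-Ruby sketch. The regular expression and the `scanned_tables` helper are hypothetical and only stand in for this kind of scanning.

```ruby
# Hypothetical scanner: treats anything shaped like `name.column` as a
# reference to the table `name`.
TABLE_REFERENCE = /\b([a-z_][a-z0-9_]*)\.[a-z_]/i

def scanned_tables(sql)
  sql.scan(TABLE_REFERENCE).flatten.map(&:downcase).uniq
end

# False positive: text inside a string literal looks like a column
# reference, so "comments" is reported although it is never queried.
p scanned_tables("articles.title = 'see comments.id for details'")

# False negative: quoted identifiers do not match the pattern at all.
p scanned_tables('"authors"."id" = 1')

# False negative: an unqualified column gives no table name to find.
p scanned_tables("published_at DESC")
```

Working from objects that represent the query parts avoids guessing at what a fragment of SQL text refers to.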
Victor Lin
2011-Jul-06 21:21 UTC
Re: Need help porting limited eager loading optimization to Rails 3
I totally agree - scanning SQL strings is pretty crappy. Extracting the necessary tables from the ARel object should be more robust; this was just a quick hack to get things going.

I think there's a related performance issue here too, where the second query uses the LEFT OUTER JOIN method of preloading associations instead of issuing separate queries. We have the IDs already from the prequery, so there's no need to use the join strategy anymore, right?
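The idea of reusing the prequery IDs can be sketched in plain Ruby, with in-memory arrays standing in for tables. This is only an illustration of the separate-queries strategy under made-up data; the constants and both helper methods are hypothetical and nothing here is ActiveRecord's actual preloading code.

```ruby
# Made-up rows standing in for the articles, authors and comments tables.
ARTICLES = [
  { id: 1, author_id: 1 },
  { id: 2, author_id: 2 },
  { id: 3, author_id: 1 }
].freeze
AUTHORS = [{ id: 1, name: "A" }, { id: 2, name: "B" }].freeze
COMMENTS = [
  { id: 10, article_id: 1 },
  { id: 11, article_id: 3 },
  { id: 12, article_id: 2 }
].freeze

# Stand-in for the prequery: only the limiting table (authors) is
# consulted, and only the matching article IDs come back.
def limited_article_ids(author_id:, limit:)
  ARTICLES.select { |a| a[:author_id] == author_id }
          .map { |a| a[:id] }
          .uniq
          .first(limit)
end

# Stand-in for the second step: one lookup per table, each filtered by
# the IDs already known, instead of one wide LEFT OUTER JOIN.
def load_with_separate_queries(ids)
  articles = ARTICLES.select { |a| ids.include?(a[:id]) }
  comments = COMMENTS.select { |c| ids.include?(c[:article_id]) }
                     .group_by { |c| c[:article_id] }
  author_ids = articles.map { |a| a[:author_id] }.uniq
  authors = AUTHORS.select { |au| author_ids.include?(au[:id]) }
                   .map { |au| [au[:id], au] }
                   .to_h

  articles.map do |a|
    a.merge(comments: comments.fetch(a[:id], []), author: authors[a[:author_id]])
  end
end

ids = limited_article_ids(author_id: 1, limit: 10)
p load_with_separate_queries(ids)
```

Each association is fetched with its own ID-filtered lookup, so no row is repeated once per combination of associated records.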