When to-many associations are eagerly loaded, a "base" object and its
associated objects can be represented by an unpredictable number of
rows in the result set of the query. As a consequence, retrieving a
certain range of objects including associations cannot be achieved by
using OFFSET and LIMIT as they refer to rows in the result set, not
objects.
In order to enable :offset and :limit in cases like this, the current
code splits the task of retrieving objects. In a first query it
retrieves a list of ids of objects, properly offset and limited. Then,
in a second query, these ids are used in a condition to restrict the
returned rows from which objects are instantiated. The code looks like
this (slightly reformatted).
def add_limited_ids_condition!(sql, options, join_dependency)
unless (id_list = select_limited_ids_list(options,
join_dependency)).empty?
sql << "#{condition_word(sql)} #{table_name}.#{primary_key} IN
(#{id_list}) "
else
throw :invalid_query
end
end
def select_limited_ids_list(options, join_dependency)
connection.select_all(
construct_finder_sql_for_association_limiting(options,
join_dependency),
"#{name} Load IDs For Limited Eager Loading"
).collect { |row| connection.quote(row[primary_key]) }.join(", ")
end
The advantage of this approach is that it works even on less capable
DBMS, specifically those that don''t support subselects. The
disadvantage is that on DBMS that do support subselects it incurs an
unnecessary roundtrip to the database and robs the optimizer of a
chance to do its job.
A solution is rather simple, or so it seems to me. Apart from naming
issues, it would look like this, for databases that support subselects
def select_limited_ids_list(options, join_dependency)
construct_finder_sql_for_association_limiting(options,
join_dependency)
end
Instead of eagerly executing a query for the ids, the requisite SQL is
simply inserted in the main query to be executed as a subselect. I''ve
applied this change and except for a test that explicitly checks
select_limited_ids_list all test still pass on PostgreSQL.
Obviously, this change should not be applied unconditionally. Rather, I
think it should be conditional on the capabilities of the used DBMS.
Information, which in turn could be supplied by the respective
adapters.
Michael
--
Michael Schuerig
mailto:michael@schuerig.de
http://www.schuerig.de/michael/
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Core" group.
To post to this group, send email to rubyonrails-core@googlegroups.com
To unsubscribe from this group, send email to
rubyonrails-core-unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/rubyonrails-core?hl=en
-~----------~----~----~----~------~----~------~--~---
On Dec 15, 2006, at 1:22 AM, Michael Schuerig wrote:> Obviously, this change should not be applied unconditionally. > Rather, I > think it should be conditional on the capabilities of the used DBMS. > Information, which in turn could be supplied by the respective > adapters.Imong others I heard that MySQL now does support subselects but it does not optimize anything inside a subselect (up to the point of it not using indexes and such). Maybe AbstractAdapter#supports_subselects? -- Julian ''Julik'' Tarkhanov please send all personal mail to me at julik.nl --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
Michael Koziarski
2006-Dec-16 04:40 UTC
Re: Included associations, limiting and subselects
> Maybe AbstractAdapter#supports_subselects?In the ''long run'' it''d be nice if all the conditional stuff for query generation was removed from AR::Base and moved into the adapters themselves. With custom databases using polymorphism to handle the quirks and supported / unsupported behaviour. However in the meantime, I think a patch to make use a real subselect would be good, but probably not a 1.2 thing. -- Cheers Koz --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
On Saturday 16 December 2006 05:40, Michael Koziarski wrote:> > Maybe AbstractAdapter#supports_subselects? > > In the ''long run'' it''d be nice if all the conditional stuff for query > generation was removed from AR::Base and moved into the adapters > themselves. With custom databases using polymorphism to handle the > quirks and supported / unsupported behaviour.Agreed.> However in the meantime, I think a patch to make use a real subselect > would be good, but probably not a 1.2 thing.I''ve timed (script/performance/benchmarker) find''s with and without subselect. To my very big surprise, there was only a small difference and if anything, using two queries (i.e. the current state of things) was slightly faster than using a subselect. I may well have made a measuring error, though. I''ve tested on PostgreSQL with 100_000 "base" objects with 10_000 objects "belonged to" and 400_000 has many objects. Generally, I''ve used a low LIMIT up to 100 and OFFSETs from 0 to 9_000. The showstopper, however, was when I noticed that at least on PostgreSQL, columns referred to in the ORDER BY clause have to be contained in the SELECT clause. As a result, the subselect returns not just id and the IN check in the surrounding statement becomes invalid. Wrapping the subselect in another select that maps to the id only would be possible, but I didn''t bother to try. I hope someone else will independently check this behavior, to confirm that the current implementation without subselect does not hurt performance. Michael -- Michael Schuerig mailto:michael@schuerig.de http://www.schuerig.de/michael/ --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---
Ultimately, trying to become more clever to deal with query conditions can become self defeating. I ended up solving my issues with having to deal with limited included query results by allowing a finder_sql option to be used along with a column mapping to provide for an optimized query that allows me to limit the result sets however I like. Check out my plugin that extends the eager loading code here: http://kellogg-assoc.com/articles/2006/11/05/eager-finder-sql. Another useful addition would be able to use the results of a MySQL-like GROUP_CONCAT. I''ve done it on a case-by-case basis in the past, but haven''t yet tried to do something that is more integrated. Gregg --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group. To post to this group, send email to rubyonrails-core@googlegroups.com To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en -~----------~----~----~----~------~----~------~--~---