Adam Stegman
2009-Oct-16 03:26 UTC
size of array returned by named_scope with :group/:select distinct
Here''s the example: $ ruby script/generate scaffold thing name:string class Thing < ActiveRecord::Base named_scope :distinct, :group => :name end :select => "DISTINCT things.*" would work in the same way. Start up script/console:>> Thing.new(:name => "a").save!=> true>> Thing.new(:name => "a").save!=> true>> Thing.all=> [#<Thing id: 1, name: "a", created_at: "2009-10-16 02:34:28", updated_at: "2009-10-16 02:34:28">, #<Thing id: 2, name: "a", created_at: "2009-10-16 02:34:28", updated_at: "2009-10-16 02:34:28">]>> Thing.distinct=> [#<Thing id: 2, name: "a", created_at: "2009-10-16 02:34:28", updated_at: "2009-10-16 02:34:28">]>> Thing.distinct.size=> 2>> Thing.distinct.all.size=> 1 As you can see, Thing.distinct.size is mis-reporting the size of the result. I can''t figure out what''s causing this, or why adding .all fixes it. Running the query straight into the database predictably returns just one record: sqlite> SELECT * FROM "things" GROUP BY name; id = 2 name = a created_at = 2009-10-16 02:34:28 updated_at = 2009-10-16 02:34:28 Am I missing something obvious?
Brandon Dimcheff
2009-Oct-16 15:08 UTC
Re: size of array returned by named_scope with :group/:select distinct
I just stumbled upon the same problem... From what I can tell, ActiveRecord doesn''t add the GROUP BY clause when it''s doing a COUNT. Take a look in your log. You should see two nearly identical queries, but one will be a SELECT with the GROUP BY and one will be a COUNT without the GROUP BY. Something like this: SELECT things.* FROM things GROUP BY name and then SELECT COUNT(*) AS count_all FROM things; # NO GROUP BY! The thing is, GROUP BY with an aggregate function will return the number of items in each group. So if you have 3 Things, "a", "a", and "b", adding a GROUP BY at the end of the COUNT query will give you something like this: +-----------+ | count_all | +-----------+ | 2 | # <= a''s | 1 | # <= b +-----------+ What you really want to know is the number of rows in that resultset, since that''s the number of groups in your query. I''m not sure if there''s a good way of doing that without running a subquery. Any ideas? Also, the reason why Thing.distinct.all.size works is because .all converts the association proxy returned by the named scope into an array. Rather than running an SQL query to determine the size of the dataset, after calling .all it just returns the number of items in the actual array of data. Since the GROUP is added properly to the actual data fetch query, .all returns the proper number because that array has the proper data in it. This seems like a bug to me... Any AR hackers out there know why this might be intended behavior, or should I work on a patch? - Brandon On Oct 15, 11:26 pm, Adam Stegman <adam.steg...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> Here''s the example: > $ ruby script/generate scaffold thing name:string > class Thing < ActiveRecord::Base > named_scope :distinct, :group => :name > end > > :select => "DISTINCT things.*" would work in the same way. > > Start up script/console: > > >> Thing.new(:name => "a").save! > => true > >> Thing.new(:name => "a").save! > => true > >> Thing.all > > => [#<Thing id: 1, name: "a", created_at: "2009-10-16 02:34:28", > updated_at: "2009-10-16 02:34:28">, #<Thing id: 2, name: "a", > created_at: "2009-10-16 02:34:28", updated_at: "2009-10-16 02:34:28">]>> Thing.distinct > > => [#<Thing id: 2, name: "a", created_at: "2009-10-16 02:34:28", > updated_at: "2009-10-16 02:34:28">]>> Thing.distinct.size > => 2 > >> Thing.distinct.all.size > > => 1 > > As you can see, Thing.distinct.size is mis-reporting the size of the > result. I can''t figure out what''s causing this, or why adding .all > fixes it. Running the query straight into the database predictably > returns just one record: > > sqlite> SELECT * FROM "things" GROUP BY name; > id = 2 > name = a > created_at = 2009-10-16 02:34:28 > updated_at = 2009-10-16 02:34:28 > > Am I missing something obvious?
Adam Stegman
2009-Oct-16 20:55 UTC
Re: size of array returned by named_scope with :group/:select distinct
That makes sense - I was thinking of what named_scope returns as an Array, when really it''s ActiveRecord::NamedScope::Scope. I see now in the source that it just delegates Array methods to find(:all), which is why you can do things like Thing.distinct.each and why Thing.distinct.to_s returns an Array''s string notation. I''m not sure why it has to define a size method, that seems like something it could just hand off to find(:all) too. I''ll have to think about that subquery. On Oct 16, 10:08 am, Brandon Dimcheff <brandon.dimch...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> I just stumbled upon the same problem... From what I can tell, > ActiveRecord doesn''t add the GROUP BY clause when it''s doing a COUNT. > Take a look in your log. You should see two nearly identical queries, > but one will be a SELECT with the GROUP BY and one will be a COUNT > without the GROUP BY. > > Something like this: > > SELECT things.* FROM things GROUP BY name > > and then > > SELECT COUNT(*) AS count_all FROM things; # NO GROUP BY! > > The thing is, GROUP BY with an aggregate function will return the > number of items in each group. So if you have 3 Things, "a", "a", and > "b", adding a GROUP BY at the end of the COUNT query will give you > something like this: > > +-----------+ > | count_all | > +-----------+ > | 2 | # <= a''s > | 1 | # <= b > +-----------+ > > What you really want to know is the number of rows in that resultset, > since that''s the number of groups in your query. I''m not sure if > there''s a good way of doing that without running a subquery. Any > ideas? > > Also, the reason why Thing.distinct.all.size works is because .all > converts the association proxy returned by the named scope into an > array. Rather than running an SQL query to determine the size of the > dataset, after calling .all it just returns the number of items in the > actual array of data. Since the GROUP is added properly to the actual > data fetch query, .all returns the proper number because that array > has the proper data in it. > > This seems like a bug to me... Any AR hackers out there know why this > might be intended behavior, or should I work on a patch? > > - Brandon > > On Oct 15, 11:26 pm, Adam Stegman <adam.steg...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > Here''s the example: > > $ ruby script/generate scaffold thing name:string > > class Thing < ActiveRecord::Base > > named_scope :distinct, :group => :name > > end > > > :select => "DISTINCT things.*" would work in the same way. > > > Start up script/console: > > > >> Thing.new(:name => "a").save! > > => true > > >> Thing.new(:name => "a").save! > > => true > > >> Thing.all > > > => [#<Thing id: 1, name: "a", created_at: "2009-10-16 02:34:28", > > updated_at: "2009-10-16 02:34:28">, #<Thing id: 2, name: "a", > > created_at: "2009-10-16 02:34:28", updated_at: "2009-10-16 02:34:28">]>> Thing.distinct > > > => [#<Thing id: 2, name: "a", created_at: "2009-10-16 02:34:28", > > updated_at: "2009-10-16 02:34:28">]>> Thing.distinct.size > > => 2 > > >> Thing.distinct.all.size > > > => 1 > > > As you can see, Thing.distinct.size is mis-reporting the size of the > > result. I can''t figure out what''s causing this, or why adding .all > > fixes it. Running the query straight into the database predictably > > returns just one record: > > > sqlite> SELECT * FROM "things" GROUP BY name; > > id = 2 > > name = a > > created_at = 2009-10-16 02:34:28 > > updated_at = 2009-10-16 02:34:28 > > > Am I missing something obvious?
Jeff Sutherland
2010-Aug-05 17:28 UTC
Re: size of array returned by named_scope with :group/:select distinct
I found that calling named_scope.all.size resolves this issue. It''s not perfect, but it does the job. -- Posted via http://www.ruby-forum.com/. -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.