HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2006-Nov-09 21:58 UTC
Slow ActiveRecord - MySQL
I use find_by_sql to execute a query which takes around 120 seconds to execute in Ruby. When I execute the same request using SQLyog (GUI), the requests takes around 15 seconds to execute. Here''s roughly what the query looks like: SELECT DAY(date_time) "day", COUNT( * ) "count" FROM table WHERE (id = some_id) AND(YEAR(date_time) = 2006) AND(MONTH(date_time) = 8) AND(some_string IN (''some short string'') ) GROUP BY DAY( date_time ); I''m querying about a million records. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
I use find_by_sql to execute a query which takes around 120 seconds to execute in Ruby. When I execute the same request using SQLyog (GUI), the requests takes around 15 seconds to execute. Here''s roughly what the query looks like: SELECT DAY(date_time) "day", COUNT( * ) "count" FROM table WHERE (id = some_id) AND(YEAR(date_time) = 2006) AND(MONTH(date_time) = 8) AND(some_string IN (''some short string'') ) GROUP BY DAY( date_time ); I''m querying about a million records. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On 11/9/06, HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org <HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> > I use find_by_sql to execute a query which takes around 120 seconds to > execute in Ruby. When I execute the same request using SQLyog (GUI), > the requests takes around 15 seconds to execute. > > Here''s roughly what the query looks like: > > SELECT DAY(date_time) "day", COUNT( * ) "count" > FROM table > WHERE (id = some_id) > AND(YEAR(date_time) = 2006) > AND(MONTH(date_time) = 8) > AND(some_string IN (''some short string'') ) > GROUP BY DAY( date_time ); > > I''m querying about a million records.How many results? Compare find_by_sql with connection.select_all timing. An index such as (id, YEAR(date_time), MONTH(date_time)) would really help, in any case. jeremy --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Heist wrote:> I use find_by_sql to execute a query which takes around 120 seconds to > execute in Ruby. When I execute the same request using SQLyog (GUI), > the requests takes around 15 seconds to execute. > > Here''s roughly what the query looks like: > > SELECT DAY(date_time) "day", COUNT( * ) "count" > FROM table > WHERE (id = some_id) > AND(YEAR(date_time) = 2006) > AND(MONTH(date_time) = 8) > AND(some_string IN (''some short string'') ) > GROUP BY DAY( date_time ); > > I''m querying about a million records. >Do you need all of the million records in memory at once? You''re getting overhead of ActiveRecord creating one million records and keeping them in memory, and your program is probably ending up using swap which is forcing it to take sooo long. I often query a few million records, but I don''t pull them all into memory a once, it''s too inefficient, and not just in ruby. Zach ActiveRecord::Extensions - http://www.continuousthinking.com/are -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFU6hbMyx0fW1d8G0RAliUAJ9ZKjCgqjkkKV/woBQP8ZQl5LHyxACeOzW7 c2QQUvmdeg4bapbkz6OMepg=lvq7 -----END PGP SIGNATURE----- --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On 11/9/06, zdennis <zdennis-aRAREQmnvsAAvxtiuMwx3w@public.gmane.org> wrote:> > Do you need all of the million records in memory at once? You''re getting > overhead of ActiveRecord creating one million records and > keeping them in memory, and your program is probably ending up using swap > which is forcing it to take sooo long. > > I often query a few million records, but I don''t pull them all into memory > a once, it''s too inefficient, and not just in ruby.I think he''s saying there are a million rows in the table. A million results grouped by day would mean he has records spanning nearly three millenia ;) jeremy --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On Nov 9, 2006, at 11:14 PM, zdennis wrote:> I often query a few million records, but I don''t pull them all into > memory a once, it''s too inefficient, and not just in ruby.What do you do then? Paginate by hand? -- fxn --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
> On Nov 9, 2006, at 11:14 PM, zdennis wrote: > >> I often query a few million records, but I don''t pull them all into >> memory a once, it''s too inefficient, and not just in ruby. > > What do you do then? Paginate by hand?Yes :) --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Around 6. select_all gave results similar to SQLyog (around 15 seconds). It''s not possible to use YEAR(date_time) as an index, is it? If I alter my table to have different columns for the year, month and day, will there be a good performance gain? And while this is all good advice, I still don''t understand why Rails would have such problems with the query. Jeremy Kemper wrote:> On 11/9/06, HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org <HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > > I use find_by_sql to execute a query which takes around 120 seconds to > > execute in Ruby. When I execute the same request using SQLyog (GUI), > > the requests takes around 15 seconds to execute. > > > > Here''s roughly what the query looks like: > > > > SELECT DAY(date_time) "day", COUNT( * ) "count" > > FROM table > > WHERE (id = some_id) > > AND(YEAR(date_time) = 2006) > > AND(MONTH(date_time) = 8) > > AND(some_string IN (''some short string'') ) > > GROUP BY DAY( date_time ); > > > > I''m querying about a million records. > > > How many results? Compare find_by_sql with connection.select_all timing. > > An index such as (id, YEAR(date_time), MONTH(date_time)) would really help, > in any case. > > jeremy > > ------=_Part_56743_7586538.1163110279712 > Content-Type: text/html; charset=ISO-8859-1 > X-Google-AttachSize: 1140 > > On 11/9/06, <b class="gmail_sendername"><a href="mailto:HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org">HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org</a></b> <<a href="mailto:HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org">HeistTheAlmighty-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org</a>> wrote:<div><span class="gmail_quote"> > </span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I use find_by_sql to execute a query which takes around 120 seconds to<br>execute in Ruby. When I execute the same request using SQLyog (GUI), > <br>the requests takes around 15 seconds to execute.<br><br>Here''s roughly what the query looks like:<br><br>SELECT DAY(date_time) "day", COUNT( * ) "count"<br>FROM table<br>WHERE (id = some_id)<br>AND(YEAR(date_time) = 2006) > <br>AND(MONTH(date_time) = 8)<br>AND(some_string IN (''some short string'') )<br>GROUP BY DAY( date_time );<br><br>I''m querying about a million records.</blockquote><div><br>How many results? Compare find_by_sql with connection.select_all > timing.<br><br>An index such as (id, YEAR(date_time), MONTH(date_time)) would really help, in any case.<br><br>jeremy</div></div> > > ------=_Part_56743_7586538.1163110279712----~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Heist wrote:>SELECT DAY(date_time) "day", COUNT( * ) "count" > FROM table > WHERE (id = some_id) > AND(YEAR(date_time) = 2006) > AND(MONTH(date_time) = 8) > AND(some_string IN (''some short string'') ) > GROUP BY DAY( date_time );> Around 6. select_all gave results similar to SQLyog (around 15 > seconds). > > It''s not possible to use YEAR(date_time) as an index, is it?Nope. You''re losing ability for the query optimizer to take advantages of indexes, by determining the name of your column using a function. Functions should always go on the right side of a comparison in a query, so you can take advantage of indexes. The way you''re doing the query above forces that you''re going to do a FULL table scan each time.> If I alter > my table to have different columns for the year, month and day, will > there be a good performance gain?Yes, and add indexes on those columns.> And while this is all good advice, I still don''t understand why Rails > would have such problems with the query.ActiveRecord doesn''t have to bad of overhead for running a query. ActiveRecord thinly wraps a MySQL database connection. The overhead that ActiveRecord adds is in instantiating all of the results into ActiveRecord model instances. This is why in the first response to your post I commented on NOT trying to query results where the result set was a million objects. I dont'' think you''re seeing an ActiveRecord issue here, now that I read your query more closely, the query needs to be written better and you need to use take advantage of indexes. Zach -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFVKZtMyx0fW1d8G0RAkzcAJ48BaW+o/XJwvCecPrnFC3WTsgf3ACfcxNj etliGjLlH/i78WnvhjlnZCk=8lma -----END PGP SIGNATURE----- --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Jeremy Kemper wrote:> On 11/9/06, *zdennis* <zdennis-aRAREQmnvsAAvxtiuMwx3w@public.gmane.org <mailto:zdennis-aRAREQmnvsAAvxtiuMwx3w@public.gmane.org>> wrote: > > Do you need all of the million records in memory at once? You''re > getting overhead of ActiveRecord creating one million records and > keeping them in memory, and your program is probably ending up using > swap which is forcing it to take sooo long. > > I often query a few million records, but I don''t pull them all into > memory a once, it''s too inefficient, and not just in ruby. > > > I think he''s saying there are a million rows in the table. A million > results grouped by day would mean he has records spanning nearly three > millenia ;) >My bad. =) Zach -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFVKavMyx0fW1d8G0RAmUTAJ41LSS002nIj25G4I3OvgzJYA2YVACfXKwD hPzN8yzBDhE1For8BWYS0Xg=8rut -----END PGP SIGNATURE----- --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---