The following method in my Rails app is called very, very frequently on incoming requests and results in model objects being rendered as JSON. All is working fine, but this thing has to be as efficient and fast as possible. So far I've added an index on updated_at and I've enabled mod_deflate on the server so the JSON responses are compressed.

I've heard that it's possible to rewrite Rails queries as raw SQL to eke extra performance out of the app. Is this true, and is it a good idea (if the gain is marginal I'd stay as I am)? Given the code in my method, is it worth doing? I'm not so familiar with SQL syntax, so if it is something I should do, I'd appreciate any tips on the syntax.

Here's the method as it stands, working fine - circa 5,000 records, growing I hope to 50,000 in a few weeks, maybe 100,000 records ultimately:

def self.time_query(date_from, date_to, limit = 40)
  self.all(:conditions => { :updated_at => date_from..date_to },
           :order => "updated_at DESC", :limit => limit)
end

Here's some output from the logs. I guess the SQL I could use directly would be based on:

SELECT * FROM `apps` WHERE (`apps`.`updated_at` BETWEEN '2011-02-15T18:25:53Z' AND '2011-02-15T18:35:53Z') ORDER BY updated_at DESC, updated_at ASC LIMIT 40

Is this worth doing - are there significant performance gains to be had by bypassing the AR layer? If it matters, this is a Rails 2.3.8 app.

Processing AppsController#index to json (for 127.0.0.1 at 2011-02-15 18:35:53) [GET]
  Parameters: {"date_to"=>"2011-02-15T18:35:53Z", "date_from"=>"2011-02-15T18:25:53Z"}
  App Load (27.5ms)  SELECT * FROM `apps` WHERE (`apps`.`updated_at` BETWEEN '2011-02-15T18:25:53Z' AND '2011-02-15T18:35:53Z') ORDER BY updated_at DESC, updated_at ASC LIMIT 40
Completed in 34ms (View: 1, DB: 28) | 200 OK
  SQL (0.2ms)  SET NAMES 'utf8'
  SQL (0.1ms)  SET SQL_AUTO_IS_NULL=0

-- Posted via http://www.ruby-forum.com/.
-- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
Frederick Cheung
2011-Feb-15 21:50 UTC
Re: Bypassing ORM for performance, is it worth it?
On Feb 15, 8:28 pm, bingo bob <li...-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> The following method in my rails app is called very, very frequently on
> incoming requests and results in model objects being rendered as JSON.
> All is working fine, but this thing has to be as efficient and fast as
> possible. [...]

You might try connection.select_all (which will return an array of hashes); the default to_json might then be sufficient. It is also worth selecting only the columns you need in your output and, if applicable, creating an index that includes those columns (known as a covering index).

Fred
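Fred's connection.select_all suggestion could be sketched roughly like this - an untested, hypothetical example assuming a Rails 2.3 App model over the `apps` table, where the id and name columns stand in for whatever the JSON output actually needs:

```ruby
# Hypothetical sketch: trim the SELECT list to the columns the JSON
# output needs and skip ActiveRecord object instantiation entirely.
class App < ActiveRecord::Base
  def self.time_query(date_from, date_to, limit = 40)
    # sanitize_sql_array quotes the date bounds; the LIMIT is
    # interpolated via to_i because a quoted limit is invalid in MySQL.
    sql = sanitize_sql_array([
      "SELECT id, name, updated_at FROM apps " \
      "WHERE updated_at BETWEEN ? AND ? " \
      "ORDER BY updated_at DESC LIMIT #{limit.to_i}",
      date_from, date_to
    ])
    connection.select_all(sql) # => array of hashes, no AR objects
  end
end
```

The returned array of plain hashes can then be serialized directly, avoiding both the cost of pulling unused columns over the wire and the per-row allocation of ActiveRecord objects.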
Nikos Dimitrakopoulos
2011-Feb-16 05:40 UTC
Re: Bypassing ORM for performance, is it worth it?
I've done it before in a very specific case where we wanted to load a very big graph of objects and performance was the #1 priority. Using an explicit "select" (i.e. not loading all the data) and not instantiating AR objects (through "connection.select_all") *does make a huge* difference, but it's only worth the noise in the code once all other optimizations have taken place (indexes, no N+1 queries, etc.) and the bottleneck really is Ruby and ActiveRecord.

And in general, don't optimize prematurely: let your records grow to the numbers you expect and then make it faster ;)

P.S.: Another thing you might want to check out is ActiveSupport's to_json, since that was also a big bottleneck (with all that magic it has) - if what you have in memory for serialization is a pure hash (and not an object which needs special handling), you can try the original ::JSON.generate instead and see if that makes any difference for you.

On Tuesday, February 15, 2011 11:50:55 PM UTC+2, Frederick Cheung wrote:
> You might try connection.select_all (which will return an array of
> hashes), the default to_json might be sufficient.
> [...]
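Nikos's ::JSON.generate tip can be shown in plain Ruby - when the data is already an array of plain hashes (e.g. straight from connection.select_all), the json gem can serialize it without going through ActiveSupport's to_json machinery. The row contents here are made-up sample data:

```ruby
require 'json'

# Sample rows shaped like connection.select_all output: plain hashes
# with string keys, no ActiveRecord objects involved.
rows = [
  { "id" => 1, "name" => "first app",  "updated_at" => "2011-02-15T18:30:00Z" },
  { "id" => 2, "name" => "second app", "updated_at" => "2011-02-15T18:31:00Z" }
]

# Serialize directly with the json gem instead of ActiveSupport's to_json.
payload = JSON.generate(rows)
```

Whether this beats ActiveSupport's to_json in any given app is worth benchmarking rather than assuming, as Nikos says.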
Thanks for your replies, I hear your ideas. I'm not hearing anything major that I haven't already done or need to change, so I think I'll stay where I am for now and see how it goes - the updated_at column is indexed.

Thanks for the tips on the JSON. I had already tweaked that in the model by overriding the default to_json method to include only the attributes I need to serve.

I figure I'm in reasonable shape, and I'm now looking at how to benchmark and monitor performance. I got hold of the rawk.rb script yesterday, which looks helpful. Does anyone know of anything that will monitor a production log live and report requests/sec?

-- Posted via http://www.ruby-forum.com/.
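The attribute-whitelisting idea behind the to_json override mentioned above can be illustrated in plain Ruby. The class, attribute names, and values here are hypothetical; in the real app this would live on the ActiveRecord model (e.g. by passing :only to the default to_json):

```ruby
require 'json'

# Hypothetical stand-in for an ActiveRecord model: it carries an
# internal_notes attribute that should never reach the API response.
class AppRecord
  attr_reader :id, :name, :updated_at, :internal_notes

  def initialize(id, name, updated_at, internal_notes)
    @id = id
    @name = name
    @updated_at = updated_at
    @internal_notes = internal_notes
  end

  # Emit a whitelist of attributes rather than the whole object.
  def to_json(*)
    JSON.generate("id" => id, "name" => name, "updated_at" => updated_at)
  end
end

record  = AppRecord.new(1, "demo app", "2011-02-15T18:30:00Z", "keep private")
payload = record.to_json
```

Besides keeping private fields out of the response, serializing fewer attributes also shrinks the payload that mod_deflate has to compress on every request.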