JimCifarelli
2008-Sep-19 21:28 UTC
Updating a Field in All Records of a Model, Efficiently?
Hello Everyone, I have a Model for which there are many thousands (hundreds of thousands) of records. I''d like to update the value of one field for each record in a rake task. A ruby based function must be applied to this field for all records (i.e. can''t perform the algorithm in an SQL UPDATE alone). First Thought: MyModel.find(:all).each { |m| m.my_field = function_call_here; m.save } Load all the models into memory, and I''d like to avoid that. Second Thought: Get the maximum "id" of the Model''s records and program a loop, doing a "find_by_id" for each record. While the model does have an "id" field, some records have been deleted, so I would have to check that the "find_by_id" is nil... I''m not worried about new records being added to the database as this update will be run during a "maintenance" period. Anyone have any thoughts on how you would update every record for a model when you have a large number of records? Thank you for your time and help. -Jim --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Pardee, Roy
2008-Sep-19 22:55 UTC
Re: Updating a Field in All Records of a Model, Efficiently?
I may not understand your second thought, but I have a hard time imagining it will be faster than the loop over all records in your first thought. Since you''ve got to retrieve the record & make it an AR instance in any case, you may as well not mess around guessing IDs... Are you sure you can''t translate your ruby function to SQL? That''s going to be loads faster... Can you break your records up into chunks that should all get the same value of my_field & do a series of subset UPDATEs maybe? -----Original Message----- From: rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org [mailto:rubyonrails-talk@googlegroups.com] On Behalf Of JimCifarelli Sent: Friday, September 19, 2008 2:28 PM To: Ruby on Rails: Talk Subject: [Rails] Updating a Field in All Records of a Model, Efficiently? Hello Everyone, I have a Model for which there are many thousands (hundreds of thousands) of records. I''d like to update the value of one field for each record in a rake task. A ruby based function must be applied to this field for all records (i.e. can''t perform the algorithm in an SQL UPDATE alone). First Thought: MyModel.find(:all).each { |m| m.my_field = function_call_here; m.save } Load all the models into memory, and I''d like to avoid that. Second Thought: Get the maximum "id" of the Model''s records and program a loop, doing a "find_by_id" for each record. While the model does have an "id" field, some records have been deleted, so I would have to check that the "find_by_id" is nil... I''m not worried about new records being added to the database as this update will be run during a "maintenance" period. Anyone have any thoughts on how you would update every record for a model when you have a large number of records? Thank you for your time and help. -Jim --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Rob Biedenharn
2008-Sep-19 23:09 UTC
Re: Updating a Field in All Records of a Model, Efficiently?
On Sep 19, 2008, at 6:55 PM, Pardee, Roy wrote:> I may not understand your second thought, but I have a hard time > imagining it will be faster than the loop over all records in your > first thought. Since you''ve got to retrieve the record & make it an > AR instance in any case, you may as well not mess around guessing > IDs... > > Are you sure you can''t translate your ruby function to SQL? That''s > going to be loads faster... Can you break your records up into > chunks that should all get the same value of my_field & do a series > of subset UPDATEs maybe? > > -----Original Message----- > From: rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org [mailto:rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org > ] On Behalf Of JimCifarelli > Sent: Friday, September 19, 2008 2:28 PM > To: Ruby on Rails: Talk > Subject: [Rails] Updating a Field in All Records of a Model, > Efficiently? > > Hello Everyone, > > I have a Model for which there are many thousands (hundreds of > thousands) of records. I''d like to update the value of one field > for each record in a rake task. > > A ruby based function must be applied to this field for all records > (i.e. can''t perform the algorithm in an SQL UPDATE alone). > > First Thought: > MyModel.find(:all).each { |m| m.my_field = function_call_here; > m.save } Load all the models into memory, and I''d like to avoid that. > > Second Thought: > Get the maximum "id" of the Model''s records and program a loop, > doing a "find_by_id" for each record. > While the model does have an "id" field, some records have been > deleted, so I would have to check that the "find_by_id" is nil... > > I''m not worried about new records being added to the database as > this update will be run during a "maintenance" period. > > Anyone have any thoughts on how you would update every record for a > model when you have a large number of records? > > Thank you for your time and help. > > -JimJim, When I''ve had to do a similar thing over a table with many (100,000+) records, I''ve done something like: total = Model.count limit = [ 100, total ].min 0.step(total-1, limit) do |offset| Model.find(:all, :limit => limit, :offset => offset).each do |model| # do stuff end end If you have a condition that limits what comes back, you might have to tweak the offset if the "stuff" you do causes records to fall out of the condition. Roy''s idea of a series of Model.update_all() calls for each unique set of whatever goes into your function sounds like a winner if the function is predictable enough. -Rob Rob Biedenharn http://agileconsultingllc.com Rob-xa9cJyRlE0mWcWVYNo9pwxS2lgjeYSpx@public.gmane.org --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---