I'm a beginning programmer using ActiveRecord outside of Rails to do conditional processing of database records. So far, I've been successful. However, my script loads all matching records into memory first. There are hundreds of thousands of matching records, so the script quickly consumes over 500MB of RAM before any processing is done. Is there a way to avoid preloading all the row objects into memory?

Below is an example of the type of thing I'm trying to do (the actual row-level processing is more complicated than the example, but that isn't relevant to my question):

#example code
require 'rubygems'
require 'mysql'
require 'active_record'

ActiveRecord::Base.establish_connection(
  :adapter  => "mysql",
  :username => "root",
  :password => "password",
  :database => "my_schema"
)

class MyTable < ActiveRecord::Base
end

for m in MyTable.find(:all, :conditions => "some_column='criterion to match'")
  m.other_column = "new value"
  m.save
end
#thanks for any tips

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
Well, for some reason it took a few days for my first message to appear in google-groups. In the meantime, I figured out an imperfect but useable solution. I changed the loop so that it loads only the id values into memory and then issues a SELECT query for each id, so that it creates a full object (with all column values) for only one row at a time. Now my loop looks like this:

for m in MyTable.find(:all, :select => "id", :conditions => "some_column='criterion to match'")
  p = MyTable.find(m.id)
  p.other_column = "new value"
  p.save
end

In my particular case, this saved me from creating 500MB+ of objects in memory; the script now uses only ~70MB, holding just the id values. Granted, the script has to issue a SELECT query for every id, but in my case this is acceptable because the loop is on a timer anyway (it only queries the database every couple of seconds).

If anyone else has a more elegant solution to this problem, please chime in.

You'd be better off writing custom SQL for this particular case. If your database supports subselects, you could do:

MyTable.connection.execute("UPDATE my_table SET other_column = 'new_value' WHERE id IN (SELECT id FROM my_table WHERE some_column = 'criterion to match')")

David Rose

On 12/4/06, John-Scott <john.scott.atlakson-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Well, for some reason it took a few days for my first message to appear
> in google-groups. In the meantime, I figured out an imperfect but
> useable solution. I changed the loop so that it only loads the id
> values into memory, and then issues a select query for each id, so that
> it only creates an object with all column values for one row at a
> time. Now my loop looks like this:
>
> for m in MyTable.find(:all, :select => "id", :conditions =>
> "some_column='criterion to match'")
>   p = MyTable.find(m.id)
>   p.other_column = "new value"
>   p.save
> end
>
> In my particular case, this saved me from creating 500MB+ of objects in
> memory and instead only used ~70MB using only the id values.
> Granted the script has to issue a SELECT query for every id, but in my
> case this is acceptable as the loop is on a timer anyway (only querying
> the database every couple of seconds).
>
> If anyone else has a more elegant solution to this problem, please
> chime in.

David...thanks for the reply. However, I'm using the google-geocode gem, so I have a timed loop (the Google Maps API limit of 50,000 geocodes per day works out to one request every 1.728 seconds) that grabs an address (:conditions => "latitude=''") and then geocodes the parcel. I just didn't want to clutter up my example or distract the discussion with the particulars. For what it's worth, here is the actual loop:

for m in Property.find(:all, :select => "id", :conditions => "latitude = '0'")
  p = Property.find(m.id)
  address = p.address + ", " + p.city.name + ", " + p.state + " " + p.zip
  begin
    location = gg.locate address
  rescue GoogleGeocode::AddressError
  else
    p.latitude = location.latitude
    p.longitude = location.longitude
    p.save
  end
  Time.new
  sleep 1.728
end

Using the 'google-geocode' gem and Active Record to interact with the database was the cleanest and easiest way for me to accomplish my specific task. And it works. Again, I'm a beginner programmer, so that has a lot to do with my choices here. If there is a smarter way to do this, I'm all ears and eager to learn.

John-Scott

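The 1.728-second figure follows directly from spreading the daily quota evenly over 24 hours. A minimal sketch of that arithmetic, where `seconds_between_requests` is a hypothetical helper name and not part of the gem or the original script:

```ruby
# Seconds in one day: 24 hours * 60 minutes * 60 seconds = 86,400.
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper (not from the thread): the even spacing between
# requests that uses up exactly `daily_limit` requests in one day.
def seconds_between_requests(daily_limit)
  SECONDS_PER_DAY / daily_limit.to_f
end

# 86,400 / 50,000 = 1.728, the value passed to sleep in the loop above.
interval = seconds_between_requests(50_000)
puts interval
```

Computing the interval rather than hard-coding it makes it easier to adjust the loop if the quota changes.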
John-Scott wrote:
> David...thanks for the reply. However, I'm using the google-geocode
> gem and so I have a timed loop (50,000 Google Maps API daily geocode
> limit = request every 1.728 seconds) that grabs an address (:conditions
> => "latitude=''") and then geocodes the parcel. I just didn't want to
> clutter up my example or distract the discussion with the particulars.
> For what it's worth, here is the actual loop:
>
> for m in Property.find(:all, :select => "id", :conditions => "latitude
> = '0'")
>   p = Property.find(m.id)
>   address = p.address + ", " + p.city.name + ", " + p.state + " " +
> p.zip
>   begin
>     location = gg.locate address
>   rescue GoogleGeocode::AddressError
>   else
>     p.latitude = location.latitude
>     p.longitude = location.longitude
>     p.save
>   end
>   Time.new
>   sleep 1.728
> end
>
> Using the 'google-geocode' gem and Active Record to interact with the
> database was the cleanest and easiest way for me to accomplish my
> specific task. And it works. Again, I'm a beginner programmer so that
> has a lot to do with my choices here.
> So if there is a smarter way to do this, I'm all ears and eager to
> learn.
>
> John-Scott

You may need an outer loop like:

main = Property.find(:all, :select => "id", :conditions => "latitude = '0'", :order => "id", :limit => 1000)

Loop through that batch, keeping count set to the highest id seen so far, and then fetch the next batch:

main = Property.find(:all, :select => "id", :conditions => "latitude = '0' and id > #{count}", :order => "id", :limit => 1000)

--
Posted via http://www.ruby-forum.com/.
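The outer loop suggested above could be sketched roughly as follows. To keep the sketch runnable without a database, `ROWS` and `fetch_batch` are hypothetical stand-ins for the properties table and for the `Property.find` call with `id > #{count}` and `:limit`; they are assumptions for illustration, not code from the thread.

```ruby
# Stand-in for the table (assumption): 2,500 rows that still need geocoding.
ROWS = (1..2500).map { |id| { :id => id, :latitude => "0" } }

# Hypothetical stand-in for:
#   Property.find(:all, :select => "id",
#                 :conditions => "latitude = '0' and id > #{last_id}",
#                 :order => "id", :limit => limit)
def fetch_batch(last_id, limit)
  ROWS.select { |row| row[:latitude] == "0" && row[:id] > last_id }
      .sort_by { |row| row[:id] }
      .first(limit)
end

# Yields every matching id while holding at most batch_size rows at a time.
def each_id_in_batches(batch_size)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row[:id] }
    last_id = batch.last[:id] # the next query starts after this id
  end
end

seen = []
each_id_in_batches(1000) { |id| seen << id }
puts seen.length
```

Because each batch starts after the last id already handled, a row whose geocode fails and keeps its old latitude is not fetched again in the same run. Ordering by id is what makes "the last id in the batch" a safe starting point for the next query.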