I'm a beginning programmer using ActiveRecord outside of Rails to do
conditional processing of database records. So far, I've been
successful. However, my script loads all matching records into memory
first. There are hundreds of thousands of matching records, so the
script quickly consumes over 500MB of RAM before any processing is
done. Is there a way to avoid this preloading of row objects in memory?

Below is an example of the type of thing I'm trying to do (the actual
row-level processing is more complicated than the example, but that
isn't relevant to my question):

    # example code
    require 'rubygems'
    require 'mysql'
    require 'active_record'

    ActiveRecord::Base.establish_connection(
      :adapter  => "mysql",
      :username => "root",
      :password => "password",
      :database => "my_schema"
    )

    class MyTable < ActiveRecord::Base
    end

    for m in MyTable.find(:all, :conditions => "some_column='criterion to match'")
      m.other_column = "new value"
      m.save
    end

Thanks for any tips.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Talk" group.
To post to this group, send email to
rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to
rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at
http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
Well, for some reason it took a few days for my first message to appear
in google-groups. In the meantime, I figured out an imperfect but
usable solution. I changed the loop so that it only loads the id
values into memory and then issues a SELECT query for each id, so that
it only creates an object with all column values for one row at a
time. Now my loop looks like this:

    for m in MyTable.find(:all, :select => "id", :conditions => "some_column='criterion to match'")
      p = MyTable.find(m.id)
      p.other_column = "new value"
      p.save
    end

In my particular case, this saved me from creating 500MB+ of objects in
memory; loading only the id values used ~70MB instead. Granted, the
script has to issue a SELECT query for every id, but in my case this is
acceptable as the loop is on a timer anyway (only querying the database
every couple of seconds).

If anyone else has a more elegant solution to this problem, please
chime in.
You'd be better off writing custom SQL for this particular case. If
your database supports subselects, you could do:

    MyTable.connection.execute(
      "UPDATE my_table SET other_column = 'new_value' " +
      "WHERE id IN (SELECT id FROM my_table " +
      "WHERE some_column = 'criterion to match')")

David Rose
David, thanks for the reply. However, I'm using the google-geocode
gem, so I have a timed loop (the Google Maps API allows 50,000 geocode
requests per day, which works out to one request every 1.728 seconds)
that grabs an address (:conditions => "latitude=''") and then geocodes
the parcel. I just didn't want to clutter up my example or distract the
discussion with the particulars. For what it's worth, here is the
actual loop:

    for m in Property.find(:all, :select => "id", :conditions => "latitude = '0'")
      p = Property.find(m.id)
      address = p.address + ", " + p.city.name + ", " + p.state + " " + p.zip
      begin
        location = gg.locate address
      rescue GoogleGeocode::AddressError
      else
        p.latitude = location.latitude
        p.longitude = location.longitude
        p.save
      end
      Time.new
      sleep 1.728
    end

Using the 'google-geocode' gem and Active Record to interact with the
database was the cleanest and easiest way for me to accomplish my
specific task. And it works. Again, I'm a beginner programmer, so that
has a lot to do with my choices here.

So if there is a smarter way to do this, I'm all ears and eager to
learn.

John-Scott
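
For readers following the timing arithmetic above, here is a minimal sketch of where the 1.728-second figure comes from, plus one possible way to pace a loop so that time spent inside each request counts toward the interval. The names request_interval and throttled are hypothetical helpers made up for this illustration; they are not part of the google-geocode gem or ActiveRecord.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
DAILY_LIMIT     = 50_000         # daily geocode limit quoted in the post

# Seconds to wait between requests so that a full day of requests
# stays within the daily limit.
def request_interval(daily_limit, seconds_per_day = SECONDS_PER_DAY)
  seconds_per_day.to_f / daily_limit
end

# Hypothetical helper: runs the block, then sleeps only for whatever part
# of the interval the block itself did not already use up.
def throttled(interval)
  started   = Time.now
  result    = yield
  remaining = interval - (Time.now - started)
  sleep(remaining) if remaining > 0
  result
end

INTERVAL = request_interval(DAILY_LIMIT)
```

In the loop above, the geocode-and-save step could be wrapped as `throttled(INTERVAL) { ... }` in place of the fixed `sleep 1.728`; whether that is worth doing depends on how long each request actually takes.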
You may need an outer loop like:

    main = Property.find(:all, :select => "id", :conditions => "latitude = '0'", :limit => 1000)

Loop through that batch and keep track of the last id in count, then
fetch the next batch:

    main = Property.find(:all, :select => "id", :conditions => "latitude = '0' and id > #{count}", :limit => 1000)

--
Posted via http://www.ruby-forum.com/.
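
The outer-loop idea above can be sketched as follows. This is only an illustration under some assumptions: each_id_in_batches and fetch_batch are hypothetical names, count from the post is called last_id here, and an in-memory array stands in for the database so the sketch runs on its own. It also assumes each batch comes back ordered by id, which the queries above would only guarantee with an added :order => "id".

```ruby
# Walk a large set of ids in ascending order, one batch at a time, so that
# at most batch_size ids are held in memory at once.
# fetch_batch is a hypothetical callable: given the last id already seen and
# a limit, it returns the next ids in ascending order ([] when exhausted).
def each_id_in_batches(fetch_batch, batch_size = 1000)
  last_id = 0
  loop do
    ids = fetch_batch.call(last_id, batch_size)
    break if ids.empty?
    ids.each { |id| yield id }
    last_id = ids.last   # the next batch starts after the highest id seen
  end
end

# Stand-in for the table: 1,250 ids with gaps, as real tables often have.
ALL_IDS = (1..2500).select { |i| i.odd? }
in_memory_fetch = lambda do |last_id, limit|
  ALL_IDS.select { |id| id > last_id }.first(limit)
end

# Against the real table, the callable might instead wrap something like:
#   Property.find(:all, :select => "id", :order => "id", :limit => limit,
#                 :conditions => ["latitude = '0' and id > ?", last_id]).map { |r| r.id }

seen = []
each_id_in_batches(in_memory_fetch, 1000) { |id| seen << id }
```

Tracking the last id, instead of re-running the same query with only a limit, means rows that fail to geocode (and so still match latitude = '0') are not fetched again in the same pass. Later ActiveRecord releases added find_each and find_in_batches, which do this kind of id-ordered batching for you.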