Hi,

I'm using ActiveRecord in a batch processing environment. My application
needs to process several hundred thousand records, eventually up to around
a million. It works fine for the first 50,000-70,000 records, but then
performance starts to drop. If I stop the application and restart it,
performance is fine again. The application's maximum memory usage is about
100 MB, so I suspect the issue may be memory related.

The client and server machines run Linux (Fedora Core 3 and 4). The
database is PostgreSQL 8.0.3.

Is there a way to prevent this performance drop?

TIA for your help.

Werner Bohl
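For context, here is a minimal sketch of the kind of batch loop being
described, assuming a hypothetical Record model, placeholder connection
settings, and the :limit/:offset finder options of the ActiveRecord
releases of that era; how ActiveRecord is loaded and the per-record work
are placeholders as well:

require 'rubygems'
require 'active_record'

# Placeholder connection settings for the PostgreSQL setup mentioned above
ActiveRecord::Base.establish_connection(
  :adapter  => 'postgresql',
  :database => 'batch_db',
  :username => 'batch_user'
)

# Hypothetical model standing in for the table actually being processed
class Record < ActiveRecord::Base
end

BATCH_SIZE = 10_000
offset = 0
loop do
  batch = Record.find(:all, :limit => BATCH_SIZE, :offset => offset)
  break if batch.empty?
  batch.each { |record| record.process! }   # placeholder per-record work
  offset += BATCH_SIZE
end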
On Fri, 2005-09-09 at 14:57 -0600, Werner Bohl wrote:
> I'm using ActiveRecord in a batch processing environment. My application
> needs to process several hundred thousand records, eventually up to
> around a million. It works fine for the first 50,000-70,000 records,
> but then performance starts to drop.
> [...]
> Is there a way to prevent this performance drop?

I cannot offer direct help, but I am curious about this problem. I assume
you have test records and a test db you can experiment with, so could you
count the cumulative number of objects in existence for every 5,000th
record you commit?

Perhaps there is a memory leak. To count objects in memory (I know this is
biased, because I'm relying on Ruby to report correct information, but I am
assuming it would be a Rails-related memory leak), use ObjectSpace:

# Count live objects by class
h = {}
ObjectSpace.each_object { |o| h[o.class.to_s] ||= 0; h[o.class.to_s] += 1 }

# Find the next unused log file name (object.log.1, object.log.2, ...)
base_name, i = 'object.log.', 1
i += 1 while File.exist?(filename = base_name + i.to_s)

# Write the per-class counts, sorted by class name
File.open(filename, 'w') do |outf|
  h.keys.sort.each { |k| outf.print k, ' = ', h[k], "\n" }
end

This doesn't pinpoint the issue directly, but if all you're doing is one
thing, it should show you where there is a memory leak if one exists. You
should also run top and/or memstat during this test to see whether memory
just continually climbs. You may want to call GC.start each time before you
run this, so Ruby gets rid of any objects already marked for garbage
collection.

Zach
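A sketch of how the counting snippet above might be hooked into a batch
loop, following Zach's suggestion to call GC.start first so only live
objects are counted; dump_object_counts, each_record, and process are
illustrative names, not part of ActiveRecord or Rails:

def dump_object_counts
  GC.start   # collect first, so only live objects are counted
  counts = Hash.new(0)
  ObjectSpace.each_object { |o| counts[o.class.to_s] += 1 }
  base_name, i = 'object.log.', 1
  i += 1 while File.exist?(filename = base_name + i.to_s)
  File.open(filename, 'w') do |outf|
    counts.keys.sort.each { |k| outf.puts "#{k} = #{counts[k]}" }
  end
end

processed = 0
each_record do |record|      # each_record stands in for the batch read
  process(record)            # placeholder for the real per-record work
  processed += 1
  dump_object_counts if processed % 5_000 == 0
end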
On Sep 9, 2005, at 3:57 PM, Werner Bohl wrote:
> I'm using ActiveRecord in a batch processing environment. My application
> needs to process several hundred thousand records, eventually up to
> around a million. It works fine for the first 50,000-70,000 records,
> but then performance starts to drop.
> [...]

Is there any chance you're running it in development mode, as opposed to
production mode?

Cheers

Dave
Thanks for your help. I'm calling GC.start after each batch read (10,000
records) and have no more memory issues. BTW, I'm using ActiveRecord in a
TUI application; this particular application does not run on full Rails.

On Fri, 2005-09-09 at 16:37, Zach Dennis wrote:
> I cannot offer direct help, but I am curious about this problem. I assume
> you have test records and a test db you can experiment with, so could you
> count the cumulative number of objects in existence for every 5,000th
> record you commit?
> [...]
> You may want to call GC.start each time before you run this, so Ruby gets
> rid of any objects already marked for garbage collection.

-- 
Werner Bohl <WernerBohl-g6+PG5E9EnRBDgjK7y7TUQ@public.gmane.org>
Infutor de Costa Rica S.A.
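A minimal sketch of the fix described above, reusing the placeholder Record
model and process! method from the earlier sketch; only the placement of
GC.start after each 10,000-record batch reflects what the post reports:

BATCH_SIZE = 10_000
offset = 0
loop do
  batch = Record.find(:all, :limit => BATCH_SIZE, :offset => offset)
  break if batch.empty?
  batch.each { |record| record.process! }   # placeholder per-record work
  offset += BATCH_SIZE
  GC.start   # force a collection after every batch, as reported to fix the slowdown
end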