John Kopanas
2006-Nov-23 22:37 UTC
Inserting A Large Amount of Records into DB using Rake, Ruby and ActiveRecord? Help :-)
I have a text file with a couple hundred thousand rows that I use Ruby to massage, and then ActiveRecord to insert each row into the DB after it has been massaged. Now... inserting a couple hundred thousand rows using ActiveRecord and Ruby is crazy. I know.

It was beautiful to write my rake task to do everything for me, but now the problem is time. It is taking me on average one minute to process 1,000 records. I have an old PowerBook 867.

Now, I noticed when looking at the log file that the COMMIT operation is what is taking so much time. I was wondering if I can defer the commit so that it is done at the end, or after every X updates?

Or any other suggestions on how to speed this up?

Thanks my friends :-).

--
John Kopanas
john-Iau1QiYlxLpBDgjK7y7TUQ@public.gmane.org

http://www.kopanas.com
http://www.cusec.net
http://www.soen.info
Anthony Eden
2006-Nov-23 23:28 UTC
Re: Inserting A Large Amount of Records into DB using Rake, Ruby and ActiveRecord? Help :-)
You might want to consider exporting to a file which can be bulk loaded into the DB. We are dealing with data sources which have tens of millions of rows, and bulk loading is the only way to load them efficiently.

If you absolutely want to manage the commit manually you can, via the ActiveRecord::Base.connection methods:

  begin_db_transaction
  commit_db_transaction
  rollback_db_transaction

Or you may be able to wrap the inserts in a transaction block (see the sketch below). Still, I'd highly recommend considering bulk load.

V/r
Anthony Eden
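A batched-transaction version of the import loop could look roughly like this. This is only a sketch: the massaged_rows array, the ImportedRecord model, and the 1,000-row batch size are made-up placeholders for whatever the rake task actually builds, not names from this thread.

  require 'enumerator'  # Ruby 1.8 needs this for each_slice

  # Commit once per batch of 1,000 rows instead of once per row.
  massaged_rows.each_slice(1_000) do |batch|
    ActiveRecord::Base.transaction do
      batch.each do |attributes|
        ImportedRecord.create(attributes)  # hypothetical model name
      end
    end
  end

Each transaction block turns 1,000 individual COMMITs into one, which is usually where most of the per-row overhead goes.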
zdennis
2006-Nov-23 23:45 UTC
Re: Inserting A Large Amount of Records into DB using Rake, Ruby and ActiveRecord? Help :-)
ActiveRecord::Extensions (for MySQL) handles bulk inserts using multi-value INSERT statements. It is up to 40x faster than the current ActiveRecord behavior. The API looks like:

  MyModel.import( columns, array_of_value_sets, options_hash )

(A usage sketch follows below.)

Documentation on it can be found here:
http://www.continuousthinking.com/are/import

Here are some MySQL benchmarks for the current ActiveRecord#create vs. ActiveRecord::Extensions#import, comparing MyISAM to InnoDB to Memory tables:
http://www.continuousthinking.com/are/import-benchmarks-for-mysql

You can download the latest version from:
http://rubyforge.org/projects/arext/

What DB adapter do you use? I am currently adding PostgreSQL support over this holiday weekend.

Zach
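A rough usage sketch following the import signature quoted above. The Contact model, the column names, and the :validate option are illustrative assumptions only; check the documentation linked above for the options the plugin actually supports.

  # Hypothetical example of ActiveRecord::Extensions' multi-value insert,
  # per the MyModel.import(columns, array_of_value_sets, options_hash) API.
  columns = [ :first_name, :last_name, :email ]
  values  = [
    [ 'Jane', 'Doe',   'jane@example.com' ],
    [ 'John', 'Smith', 'john@example.com' ]
  ]
  Contact.import( columns, values, :validate => false )

A single multi-value INSERT like this avoids both the per-row statement overhead and the per-row COMMIT, which is why it benchmarks so much faster than looping over ActiveRecord#create.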