Guanliang Liu
2007-May-21 11:44 UTC
how to insert a batch of records into db more efficiently
I am currently using the standard AR method create to insert records into oracle db. It seems that the create method generates an INSERT sql command for each record, and execute them separately. I felt that it was a bit slow.(For a record with about 50 columns, the insert speed is 25.3 records per second, and for a record with 4 columns, the insert speed is 115.25 records per second) Since all the data have been stored in an array/hash (as the structure in db) before the inserting, I wonder if there is a more efficient way to write all the data into the db.(e.g. a table level command/an one-off command to insert or update a table) Is there anyone who is experienced in that field? I really appreciate your help. cheers -- Posted via http://www.ruby-forum.com/. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
MichaelLatta
2007-May-21 15:22 UTC
Re: how to insert a batch of records into db more efficiently
I would not say this is the best style (pretty ugly in fact), but for bulk inserts I have gotten much better performance using a raw SQL insert. You can get the connection to the DB using the ''connection'' method on an AR class. Then there are methods to execute raw SQL commands documented under "database statements" module at http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/DatabaseStatements.html. Using this you can insert multiple rows, or just avoid the AR overhead for creating and dealing with objects. On May 21, 4:44 am, Guanliang Liu <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> I am currently using the standard AR method create to insert records > into oracle db. It seems that the create method generates an INSERT sql > command for each record, and execute them separately. I felt that it was > a bit slow.(For a record with about 50 columns, the insert speed is 25.3 > records per second, and for a record with 4 columns, the insert speed is > 115.25 records per second) > > Since all the data have been stored in an array/hash (as the structure > in db) before the inserting, I wonder if there is a more efficient way > to write all the data into the db.(e.g. a table level command/an one-off > command to insert or update a table) > > Is there anyone who is experienced in that field? I really appreciate > your help. > > cheers > > -- > Posted viahttp://www.ruby-forum.com/.--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Ball, Donald A Jr (Library)
2007-May-21 16:47 UTC
Re: how to insert a batch of records into db more efficiently
> I am currently using the standard AR method create to insert > records into oracle db. It seems that the create method > generates an INSERT sql command for each record, and execute > them separately. I felt that it was a bit slow.(For a record > with about 50 columns, the insert speed is 25.3 records per > second, and for a record with 4 columns, the insert speed is > 115.25 records per second) > > Since all the data have been stored in an array/hash (as the > structure in db) before the inserting, I wonder if there is a > more efficient way to write all the data into the db.(e.g. a > table level command/an one-off command to insert or update a table)You could either use the AR connection directly to write raw sql, or you could try using the ActiveRecord Extensions module. I don''t know that it supports true batch insertion, but it does obviate the need to construct a model object for each new row, which is where most of the cost of mass imports seems to be. My mass imports went from hours to minutes once I switched to it. - donald --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
dasil003
2007-May-21 17:59 UTC
Re: how to insert a batch of records into db more efficiently
Yes, ar::extensions does do a true batch insert, it even supports MySQL''s ON DUPLICATE KEY UPDATE syntax, and it also verifies the size of the query and breaks it down into individual queries as necessary (ie. if you are inserting more than the MAX_PACKET_SIZE). It also does a lot of other cool stuff. The hash-based condition extensions via suffix are my favorite: http://www.continuousthinking.com/tags/arext On May 21, 10:47 am, "Ball, Donald A Jr (Library)" <donald.b...-GjtI+QwuxAR68HQyEA6aog@public.gmane.org> wrote:> > I am currently using the standard AR method create to insert > > records into oracle db. It seems that the create method > > generates an INSERT sql command for each record, and execute > > them separately. I felt that it was a bit slow.(For a record > > with about 50 columns, the insert speed is 25.3 records per > > second, and for a record with 4 columns, the insert speed is > > 115.25 records per second) > > > Since all the data have been stored in an array/hash (as the > > structure in db) before the inserting, I wonder if there is a > > more efficient way to write all the data into the db.(e.g. a > > table level command/an one-off command to insert or update a table) > > You could either use the AR connection directly to write raw sql, or you > could try using the ActiveRecord Extensions module. I don''t know that it > supports true batch insertion, but it does obviate the need to construct > a model object for each new row, which is where most of the cost of mass > imports seems to be. My mass imports went from hours to minutes once I > switched to it. > > - donald--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Guanliang Liu
2007-May-22 13:55 UTC
Re: how to insert a batch of records into db more efficientl
Thanks a lot to all you guys :) I will try the AR::extensions when I finish the current module. It sounds a neater way than writing raw sql directly. -- Posted via http://www.ruby-forum.com/. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Guanliang Liu
2007-May-22 15:35 UTC
Re: how to insert a batch of records into db more efficientl
OK, here is another quick question. I have installed the AR-E using the following command gem install ar-extensions the prompted message said the installation was successful. but when I try to invoke the *import* method, I got NoMethodError in PasController#import undefined method `import'' for Paspatient:Class I did exactly the same thing as the example showed in http://www.continuousthinking.com/are/import. Is there any setups I should do before using it? or do I need to add a *require ''...''* somewhere in my Paspatient class or in my controller? Many thanks. -- Posted via http://www.ruby-forum.com/. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
zdennis
2007-Jun-08 03:53 UTC
Re: how to insert a batch of records into db more efficientl
I answered this in a private email, but in case someone searches archives and has this problem. The issue was that import was being called on the controller and not on the model. It must be called on the model, ie: Book.import See http://www.continuousthinking.com/tags/arext for more information Zach On May 22, 11:35 am, Guanliang Liu <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> OK, here is another quick question. > > I have installed the AR-E using the following command > > gem install ar-extensions > > the prompted message said the installation was successful. but when I > try to invoke the *import* method, I got > > NoMethodError in PasController#import > undefined method `import'' for Paspatient:Class > > I did exactly the same thing as the example showed inhttp://www.continuousthinking.com/are/import. Is there any setups I > should do before using it? or do I need to add a *require ''...''* > somewhere in my Paspatient class or in my controller? > > Many thanks. > > -- > Posted viahttp://www.ruby-forum.com/.--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Hi, I don''t know if anyone can help me but I was trying to do a batch insert but got an error saying that there was a problem with the generated SQL. Anyone know how to fix this? <2007-06-15 00:10:46><DEBUG> SQL (0.012701) INSERT INTO numbers (`num1`,`num2`) VALUES (509081924,1620644),(509081924,500231348),(509081924,509659840),(509081924,524396516),(509081924,528220398),(509081924,534085839),(509081924,547781856),(509081924,573547171),(509081924,574312014),(509081924,577805238),(509081924,582805398),(509081924,589756118),(509081924,607286192),(509081924,607425581),(509081924,619176048),(509081924,620010268),(509081924,634632018),(509081924,642685358),(509081924,657185365),(509081924,692756207),(509081924,692790495),(509081924,704706222),(509081924,771555161),(509081924,772265233),(509081924,791020130),(509081924,898075257) <2007-06-15 00:10:46><DEBUG> Review Load (0.000000) Mysql::Error: #42000You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''=> 509081924,1620644) LIMIT 1'' at line 1: SELECT * FROM reviews WHERE (user_id => 509081924,1620644) LIMIT 1 zdennis wrote:> I answered this in a private email, but in case someone searches > archives and has this problem. The issue was that import was being > called on the controller and not on the model. It must be called on > the model, > > ie: Book.import > > See http://www.continuousthinking.com/tags/arext for more information > > Zach > > On May 22, 11:35 am, Guanliang Liu <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>-- Posted via http://www.ruby-forum.com/. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
zdennis
2007-Jul-11 02:12 UTC
Re: how to insert a batch of records into db more efficientl
Can you send me the import code you had written that produced this error? Email me privately, thanks, Zach On Jun 15, 3:34 am, EP <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:> Hi, I don''t know if anyone can help me but I was trying to do a batch > insert but got an error saying that there was a problem with the > generated SQL. > Anyone know how to fix this? > > <2007-06-15 00:10:46><DEBUG> SQL (0.012701) INSERT INTO numbers > (`num1`,`num2`) VALUES > (509081924,1620644),(509081924,500231348),(509081924,509659840),(509081924,524396516),(509081924,528220398),(509081924,534085839),(509081924,547781856),(509081924,573547171),(509081924,574312014),(509081924,577805238),(509081924,582805398),(509081924,589756118),(509081924,607286192),(509081924,607425581),(509081924,619176048),(509081924,620010268),(509081924,634632018),(509081924,642685358),(509081924,657185365),(509081924,692756207),(509081924,692790495),(509081924,704706222),(509081924,771555161),(509081924,772265233),(509081924,791020130),(509081924,898075257) > <2007-06-15 00:10:46><DEBUG> Review Load (0.000000) Mysql::Error: > #42000You have an error in your SQL syntax; check the manual that > corresponds to your MySQL server version for the right syntax to use > near ''=> 509081924,1620644) LIMIT 1'' at line 1: SELECT * FROM reviews > WHERE (user_id => 509081924,1620644) LIMIT 1 > > zdennis wrote: > > I answered this in a private email, but in case someone searches > > archives and has this problem. The issue was that import was being > > called on the controller and not on the model. It must be called on > > the model, > > > ie: Book.import > > > Seehttp://www.continuousthinking.com/tags/arext for more information > > > Zach > > > On May 22, 11:35 am, Guanliang Liu <rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> > > -- > Posted viahttp://www.ruby-forum.com/.--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Karthi kn
2007-Dec-07 06:11 UTC
Re: how to insert a batch of records into db more efficientl
zdennis wrote:> Can you send me the import code you had written that produced this > error? Email me privately, thanks, > > ZachCan any one show the sample code here for how to use the ActiveRecord::extensions methods? It will be useful for every one to get some idea on that. -- Posted via http://www.ruby-forum.com/. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Michael Glass
2010-Mar-26 16:31 UTC
Re: how to insert a batch of records into db more efficientl
My friend Jonah wrote Crewait, which is easier and faster than ARExtensions for bulk insertion. http://www.jonah.org/articles/crewait_go_.html -- Posted via http://www.ruby-forum.com/. -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
Fitz Fitz
2010-Apr-01 15:50 UTC
Re: how to insert a batch of records into db more efficientl
I have been facing this problem for a while now. ar-extensions is really good but it doesn''t preserve the links between models like one-to-one many-to-many etc. I am trying to insert ~1M records daily. ActiveRecord hasn''t a hope of doing this in a reasonable time. I am hoping that I am using the plugin in the wrong way. Anyone else have similar problems? Cheers, Patrick -- Posted via http://www.ruby-forum.com/. -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
Josh _
2010-Apr-01 19:23 UTC
Re: Re: how to insert a batch of records into db more efficientl
On Thu, Apr 1, 2010 at 8:50 AM, Fitz Fitz <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:> I have been facing this problem for a while now. ar-extensions is > really good but it doesn''t preserve the links between models like > one-to-one many-to-many etc. >Have you tried doing batches via transactions? By default activerecord will make every save operation it''s own transaction which adds overhead to the DB in many setups. See ActiveRecord::Transactions::ClassMethods, and put ten or a hundred records at a time in a transaction wrapper and see what you gain. The improvement you see from this depends on the DB engine in use as well as how much the DB is actually the bottleneck versus the processing you''re doing application-side per record. I suspect your problem is actually a combination of both DB overhead and unoptimized application logic. You might want to try some profiling of your application to see where the bottleneck really is - at the scale of 1M operations per day it''s worth doing some of that. That''s my 2 cents, the disclaimer is that I am by no means a rails expert. Just trying to help out where I can. jsw -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.