I need to insert ~100k rows into the database, but using "Table.new(...).save"
is painfully slow (I notice in the logs that it's wrapping each insert in a
transaction -- that can't be fast).

Any suggestions on how I can improve the speed of bulk inserts like this?

Thanks ...

-- Steve
On 12/15/05, Steve Sloan <steve-2FdKsI0tZ45AfugRpC6u6w@public.gmane.org> wrote:
> Any suggestions on how I can improve the speed of bulk inserts like this?

I don't know if this is the best option, but since each insert is wrapped in
a transaction anyway, you could just start a transaction yourself, insert
all the rows, and then commit. Is that going to be significantly faster? I'm
not sure.

Pat
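For concreteness, a minimal sketch of Pat's suggestion, using AR's
class-level transaction method. The model name Thing and the rows array are
placeholders, not anything from Steve's app:

  # Wrap the whole load in a single transaction instead of one per save.
  # "Thing" and "rows" are placeholder names.
  Thing.transaction do
    rows.each do |attrs|
      Thing.new(attrs).save
    end
  end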
On 12/15/05, Pat Maddox <pergesu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> ... you could just start a transaction yourself, insert all the rows,
> and then commit. Is that going to be significantly faster? I'm not sure.

+1. It will be significantly faster, in my experience. I used a REXML and
ActiveRecord Ruby script to do a mass import of XML data into one of my
app's tables back in June. Not necessarily the fastest performance, but
quick and easy to write. Probably faster now, since AR has had some
performance optimizations since then.
Pat Maddox wrote:
> I don't know if this is the best option, but since each insert is
> wrapped in a transaction anyway, you could just start a transaction
> yourself, insert all the rows, and then commit. Is that going to be
> significantly faster? I'm not sure.

Yup, had the same thought, and after some benchmarking, wrapping the whole
thing in a single transaction gives about a 22% speed increase. However,
I'm still only (?) getting about 186 rows/sec insert velocity on
MySQL/InnoDB. Any other ideas?

Also, I'm wondering if this method will work on DBs that don't support
transactions, i.e. MySQL/MyISAM. Or are we not bothering to support those
silly people? ;)

-- Steve
Steve Sloan wrote:
> Yup, had the same thought, and after some benchmarking, wrapping the
> whole thing in a single transaction gives about a 22% speed increase.
> However, I'm still only (?) getting about 186 rows/sec insert velocity
> on MySQL/InnoDB. Any other ideas?

Try disabling indexes.
Andreas S. wrote:
>> However, I'm still only (?) getting about 186 rows/sec insert velocity
>> on MySQL/InnoDB. Any other ideas?
>
> Try disabling indexes.

Uh, how? Short of dropping/adding them, that is.

-- Steve
Steve Sloan wrote:
> Andreas S. wrote:
>> Try disabling indexes.
>
> Uh, how? Short of dropping/adding them, that is.

Dropping, then re-adding the indexes is correct. That's the only way I know
how on mysql and postgresql.

Jason
Steve Sloan wrote:
> Andreas S. wrote:
>> Try disabling indexes.
>
> Uh, how? Short of dropping/adding them, that is.

MySQL has a function for it. I can't tell you off-hand what it's called;
you have to look it up in the manual.
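For the record, the statement Andreas probably has in mind is MySQL's ALTER
TABLE ... DISABLE KEYS. Note that it only defers non-unique index updates
on MyISAM tables, so it shouldn't help the InnoDB case here. Roughly, from
Ruby (the table name "things" is a placeholder):

  # DISABLE KEYS defers non-unique index updates on MyISAM tables;
  # it is a no-op for InnoDB. "things" is a placeholder table name.
  conn = ActiveRecord::Base.connection
  conn.execute("ALTER TABLE things DISABLE KEYS")
  # ... do the bulk inserts here ...
  conn.execute("ALTER TABLE things ENABLE KEYS")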
Jason Edgecombe wrote:
> Steve Sloan wrote:
>> Andreas S. wrote:
>>> Try disabling indexes.
>
> Dropping, then re-adding the indexes is correct. That's the only way I
> know how on mysql and postgresql.

Dropping the indexes seems to have a negligible impact (5:23 instead of
5:29), but that's not too surprising, as the table only has 3 columns (and
the index is on two of them).

Any other ideas? E.g., is there any way to insert multiple rows with a
single SQL statement? I know MySQL will do it, but PgSQL won't, and I don't
think it's ANSI anyway.

-- Steve
On 16/12/2005, at 12:51 PM, Steve Sloan wrote:
> Any other ideas? E.g., is there any way to insert multiple rows with a
> single SQL statement? I know MySQL will do it, but PgSQL won't, and I
> don't think it's ANSI anyway.

PostgreSQL has a handy COPY command which imports TSV or CSV data, either
from stdin or a file:

  COPY my_table (column1, column2) FROM STDIN WITH CSV;
  "thing","stuff"
  "foo","bar"

etc. See postgresql.org/docs/8.1/interactive/sql-copy.html for more
details.

-- Phillip Hutchings
   phillip.hutchings-QrR4M9swfipWk0Htik3J/w@public.gmane.org
On 12/15/05, Steve Sloan <steve-2FdKsI0tZ45AfugRpC6u6w@public.gmane.org> wrote:
> I need to insert ~100k rows into the database, but using
> "Table.new(...).save" is painfully slow (I notice in the logs that it's
> wrapping each insert in a transaction -- that can't be fast).
>
> Any suggestions on how I can improve the speed of bulk inserts like this?

Is using the native database tools an option, or do you need to use
ActiveRecord? Both Postgresql and Mysql have fast methods for bulk loading
data.

Chris
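On the MySQL side, the native bulk loader Chris is presumably referring to
is LOAD DATA INFILE, which reads a delimited file directly into a table. A
rough sketch, issued through AR's raw connection -- the file path, table
name, and column list are placeholders:

  # Bulk-load a comma-delimited file with MySQL's native loader.
  # File path, table, and columns are placeholders; note the MySQL
  # server (not the Rails process) reads the file, so it needs
  # access to that path.
  ActiveRecord::Base.connection.execute(<<-SQL)
    LOAD DATA INFILE '/tmp/things.csv'
    INTO TABLE things
    FIELDS TERMINATED BY ','
    (col_a, col_b)
  SQL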
snacktime wrote:
> Is using the native database tools an option, or do you need to use
> ActiveRecord? Both Postgresql and Mysql have fast methods for bulk
> loading data.

Does anybody know if ActiveRecord automatically takes advantage of
Postgresql prepared statements?

postgresql.org/docs/8.1/interactive/sql-prepare.html

Jeroen
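For illustration, using PREPARE by hand from AR (independent of whatever AR
does internally) would look roughly like this -- the table, the columns,
and the statement name bulk_ins are all placeholders:

  # Parse the INSERT once, then re-execute it per row.
  # "things", "col_a"/"col_b", and "bulk_ins" are placeholder names.
  conn = ActiveRecord::Base.connection
  conn.execute("PREPARE bulk_ins (text, text) AS
                INSERT INTO things (col_a, col_b) VALUES ($1, $2)")
  rows.each do |a, b|
    conn.execute("EXECUTE bulk_ins (#{conn.quote(a)}, #{conn.quote(b)})")
  end
  conn.execute("DEALLOCATE bulk_ins")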
Steve Sloan wrote:
> Dropping the indexes seems to have a negligible impact (5:23 instead of
> 5:29), but that's not too surprising, as the table only has 3 columns
> (and the index is on two of them).
>
> Any other ideas? E.g., is there any way to insert multiple rows with a
> single SQL statement?

Are you still using TableModel.new for each record? If so, this is
incredibly slow. If you are using MySQL, contact me off-list if you want my
ActiveRecord extension (which only works for mysql right now); it increases
the speed 100x when inserting large sets of data into mysql.

Zach
On 15-Dec-05, at 2:33 PM, Steve Sloan wrote:
> I need to insert ~100k rows into the database, but using
> "Table.new(...).save" is painfully slow (I notice in the logs that it's
> wrapping each insert in a transaction -- that can't be fast).
>
> Any suggestions on how I can improve the speed of bulk inserts like this?

It's pretty easy to add a class method #batch_insert to your AR class, or
to all classes. If you are using mysql, the syntax is VALUES (...), (...),
etc. I suggest saving a few thousand records per statement; the string for
100k records would get pretty big. :-)
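A minimal sketch of that idea, under some assumptions: the model name
Thing, its two columns, and the batch size are placeholders, and the
multi-row VALUES syntax shown is MySQL's:

  require 'enumerator'  # for each_slice on Ruby 1.8

  class Thing < ActiveRecord::Base
    # Build one multi-row INSERT per batch instead of one statement
    # per record. "Thing", the columns, and batch_size are placeholders.
    def self.batch_insert(rows, batch_size = 2000)
      conn = connection
      rows.each_slice(batch_size) do |batch|
        values = batch.map do |a, b|
          "(#{conn.quote(a)}, #{conn.quote(b)})"
        end.join(", ")
        conn.execute("INSERT INTO #{table_name} (col_a, col_b) VALUES #{values}")
      end
    end
  end

  # Usage, e.g.: Thing.batch_insert([["foo", 1], ["bar", 2]])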