I want to upload lines of 95000 users (email and login) from a text file to mysql table. There is an upload form on my admin page. How can I make it efficiently including existence pre-check? Is it suitable for Rails or shoud I put it to a separate mysql script? Gábor
Define "Suitable" Can rails do it. Sure you can upload the file and the process it. ruby has some great csv libraries. Is this going to happen often? or one time? if its often you may want to write a "prcessor" rubyscript to load the data and sanity check it. This is what I do for alot of dataloads. Sam On 8/9/05, Gábor SEBESTYÉN <segabor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> I want to upload lines of 95000 users (email and login) from a text > file to mysql table. There is an upload form on my admin page. > How can I make it efficiently including existence pre-check? > > Is it suitable for Rails or shoud I put it to a separate mysql script? > > Gábor > > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
Sam Mayes said:> Define "Suitable"I think he''s asking if it''s going to be horribly inefficient, and if so, if there is there a more efficient way. It''s been a while since I worked with MySQL, but in general, iterating through a file and doing a round trip to the database for every inserted row is a bad idea. PostgreSQL has a bulk insert command that allows you to import data. I''m pretty sure MySQL has a similar feature. You can probably either 1) provide a reference to a file in the query or 2) include the contents of the file in the query. The first approach will be faster, but it''s more complicated, because you need to provide a filename that''s correct from the server''s perspective, which may not be possible if your MySQL is running on a different server. And even if it is possible to do so, you''ll need to document that somewhere, so when you someday move the database to a different machine and your application stops working, you''ll know why. Regards, Ed> On 8/9/05, Gábor SEBESTYÉN <segabor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >> I want to upload lines of 95000 users (email and login) from a text >> file to mysql table. There is an upload form on my admin page. >> How can I make it efficiently including existence pre-check? >> >> Is it suitable for Rails or shoud I put it to a separate mysql script?-- Transmogrify, LLC * <http://xmog.com>
A middle ground would be to process the file doing validation only, perhaps writing good records to one file, bad to another. Then use your favorite load utility to handle the sql inserts. So you could use Ruby/ActiveRecord for validations.
Gábor SEBESTYÉN
2005-Aug-10 07:59 UTC
Re: efficient mass record "insert or update" on mysql
On 2005.08.09., at 16:29, Sam Mayes wrote:> Define "Suitable" >Suitable: is it worth doing with Ruby or it is much more efficienty to generate a do-everything sql script and execute with mysql.> Can rails do it. Sure you can upload the file and the process it. > ruby has some great csv libraries. >Actually it works but it takes 2 hours to "join" 95000 new entries. I get the text file, iterate by applying .each_line and do a SELECT and INSERT (if record does not exist yet). It is a very dumb solution I know. But I doubt it could be done efficiently with Ruby and ActiveRecord.> Is this going to happen often? or one time?It will happen occassionally, maybe monthly.> if its often you may want to write a "prcessor" rubyscript to load > the data and sanity check it. > This is what I do for alot of dataloads.I actually about to transform all the task to native SQL :) Perhaps it is the ideal solution. Gábor "Everything begins with choice... " _______________________________________________ Rails mailing list Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org http://lists.rubyonrails.org/mailman/listinfo/rails
Its allways fastest to use the native db tools if you can. taking 2 hrs to validate and do an insert seems a bit much. If you post some code maybe we can help with what you have. Sam On 8/10/05, Gábor SEBESTYÉN <segabor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> > > On 2005.08.09., at 16:29, Sam Mayes wrote: > > > Define "Suitable" > > > Suitable: is it worth doing with Ruby or it is much more efficienty to > generate a do-everything sql script and execute with mysql. > > > Can rails do it. Sure you can upload the file and the process it. ruby has > some great csv libraries. > > > Actually it works but it takes 2 hours to "join" 95000 new entries. I get > the text file, iterate by applying .each_line and do a SELECT and INSERT (if > record does not exist yet). It is a very dumb solution I know. But I doubt > it could be done efficiently with Ruby and ActiveRecord. > > > > Is this going to happen often? or one time? > It will happen occassionally, maybe monthly. > > > if its often you may want to write a "prcessor" rubyscript to load the data > and sanity check it. > > This is what I do for alot of dataloads.I actually about to transform all > the task to native SQL :) Perhaps it is the ideal solution. > > > Gábor > > > "Everything begins with choice... " > > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails > > >
Zachery Hostens
2005-Aug-11 00:18 UTC
Re: efficient mass record "insert or update" on mysql
Why are you selecting the record before inserting it? with table constraints as your database should have, you shouldnt have to select anything before inserting (the insert should merely fail.) i dont see how your automagically switching to sql-only and doing select/insert. If you can i would get a list of all records that already exist, drop those and insert ALL remaining records in one large transaction in which will be a lot faster then inserting one record at a time (which equals 1 transaction for each entry) On 8/10/05, Sam Mayes <codeslave-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> Its allways fastest to use the native db tools if you can. taking 2 > hrs to validate and do an insert seems a bit much. If you post some > code maybe we can help with what you have. > > Sam > > On 8/10/05, Gábor SEBESTYÉN <segabor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > > > > On 2005.08.09., at 16:29, Sam Mayes wrote: > > > > > > Define "Suitable" > > > > > > Suitable: is it worth doing with Ruby or it is much more efficienty to > > generate a do-everything sql script and execute with mysql. > > > > > > Can rails do it. Sure you can upload the file and the process it. ruby has > > some great csv libraries. > > > > > > Actually it works but it takes 2 hours to "join" 95000 new entries. I get > > the text file, iterate by applying .each_line and do a SELECT and INSERT (if > > record does not exist yet). It is a very dumb solution I know. But I doubt > > it could be done efficiently with Ruby and ActiveRecord. > > > > > > > > Is this going to happen often? or one time? > > It will happen occassionally, maybe monthly. > > > > > > if its often you may want to write a "prcessor" rubyscript to load the data > > and sanity check it. > > > > This is what I do for alot of dataloads.I actually about to transform all > > the task to native SQL :) Perhaps it is the ideal solution. > > > > > > Gábor > > > > > > "Everything begins with choice... " > > > > _______________________________________________ > > Rails mailing list > > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > > http://lists.rubyonrails.org/mailman/listinfo/rails > > > > > > > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >-- Zachery Hostens <zacheryph-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>