I have a table with the list of business name and address in it. It may contain 1 million rows. The users want to look up into this table based on the business name. The simple solution is to use LIKE in mysql, but I am worried about performance, since the user load will be very high for this system. Any suggestion on how to handle the potential performance issue? How can i use a search components (like lucene or something else) here? Does ruby have any in-built plugin or gems for search? Thanks -- Posted via http://www.ruby-forum.com/.
Thila: I''m no expert but let me give you an answer in two parts, the general and the specific. MySQL was built for that sort of situation. It is not unusual. I used to run a db with 2.5 million rows and we could find any one row virtually instantly. I think you''ll be fine, just remember to build indexes. With an index, the search is surprisingly undemanding. The indexes work in a binary way. Say you are looking for the letter ''C'' from all the letters in the alphabet. As best I understand it, it starts by grabbing the middle letter ''m''. too high it says, let''s half what''s left, that is roughly the letter ''F'', too high, it says (again),I''ll half what is left, which is ''C'' and voila. Three steps to find one from 26. Now more mathematically it take n search steps where the number of records is 2^n. And that is a remarkable series. 2^5 = 32, 2^10 = 1024 and 2^20 = just over 1,000,000 - the number of rows you''ll have. So it is only twice as hard for MySQL to find one row in a million as it is for it to find one row in 1000. And if your db swells to 2 million records, then it will take 21 steps instead of 20. Another 5% effort to find something in 100% more records. bruce On 18-Jan-06, at 9:34 AM, thila thila wrote:> I have a table with the list of business name and address in it. It > may > contain 1 million rows. The users want to look up into this table > based > on the business name. The simple solution is to use LIKE in mysql, > but I > am worried about performance, since the user load will be very high > for > this system. > > Any suggestion on how to handle the potential performance issue? > How can > i use a search components (like lucene or something else) here? Does > ruby have any in-built plugin or gems for search? > > Thanks > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails
Except that the SQL fragment thila is envisioning probably looks similar to: "... business_name like ''%SEARCH%'' ..." (or ["... business_name like ''%?%'' ...", params[:search] ] ;-) An index isn''t going to help you much there, except that the db engine may be able to search the business_name index rather than the table itself. Don''t be "worried" about a "potential performance issue" -- just be "aware". Wait and see how the query really performs before you decide that it''s an "ISSUE". You may be surprised! -Rob At 1/18/2006 11:23 AM, you wrote:>Thila: > >I''m no expert but let me give you an answer in two parts, the general >and the specific. > >MySQL was built for that sort of situation. It is not unusual. I used >to run a db with 2.5 million rows and we could find any one row >virtually instantly. I think you''ll be fine, just remember to build >indexes. > >With an index, the search is surprisingly undemanding. The indexes >work in a binary way. Say you are looking for the letter ''C'' from all >the letters in the alphabet. As best I understand it, it starts by >grabbing the middle letter ''m''. too high it says, let''s half what''s >left, that is roughly the letter ''F'', too high, it says (again),I''ll >half what is left, which is ''C'' and voila. Three steps to find one >from 26. > >Now more mathematically it take n search steps where the number of >records is 2^n. And that is a remarkable series. 2^5 = 32, 2^10 >1024 and 2^20 = just over 1,000,000 - the number of rows you''ll >have. So it is only twice as hard for MySQL to find one row in a >million as it is for it to find one row in 1000. And if your db >swells to 2 million records, then it will take 21 steps instead of >20. Another 5% effort to find something in 100% more records. > >bruce > > >On 18-Jan-06, at 9:34 AM, thila thila wrote: > >>I have a table with the list of business name and address in it. It >>may >>contain 1 million rows. The users want to look up into this table >>based >>on the business name. The simple solution is to use LIKE in mysql, >>but I >>am worried about performance, since the user load will be very high >>for >>this system. >> >>Any suggestion on how to handle the potential performance issue? >>How can >>i use a search components (like lucene or something else) here? Does >>ruby have any in-built plugin or gems for search? >> >>Thanks >> >>-- >>Posted via http://www.ruby-forum.com/. >>_______________________________________________ >>Rails mailing list >>Rails@lists.rubyonrails.org >>http://lists.rubyonrails.org/mailman/listinfo/rails > >_______________________________________________ >Rails mailing list >Rails@lists.rubyonrails.org >http://lists.rubyonrails.org/mailman/listinfo/rails
Hi Thila I have written a search engine for rails apps that might help you. As others have noted, an index isn''t going to help you when using LIKE in a full text search. One approach that many search engines use is to parse, tokenize and stem data on its way into the database and store each term in a separate table. The TERMS table is then indexed to provide very quick lookups. This is the approach I have taken in my search engine. Please feel free to take it for a spin: http://lance.langwell-ball.com/pages/indexed-search I''d be very happy to help you out if you have questions or run into problems. There are, of course, other options as well such as Ferret ( http://ferret.davebalmain.com/trac). This may be more what you are looking for - it''s certainly more mature. I''ve written my search engine, however, to be very easy to incorporate and use in a rails app. Please do let me know if you try it out and/or end up using it. Lance On 1/18/06, thila thila <isputnik_98@yahoo.com> wrote:> > I have a table with the list of business name and address in it. It may > contain 1 million rows. The users want to look up into this table based > on the business name. The simple solution is to use LIKE in mysql, but I > am worried about performance, since the user load will be very high for > this system. > > Any suggestion on how to handle the potential performance issue? How can > i use a search components (like lucene or something else) here? Does > ruby have any in-built plugin or gems for search? > > Thanks > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Rails mailing list > Rails@lists.rubyonrails.org > http://lists.rubyonrails.org/mailman/listinfo/rails >-- Lance Ball http://lance.langwell-ball.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://wrath.rubyonrails.org/pipermail/rails/attachments/20060119/b0da4830/attachment-0001.html