I tried adding a FULLTEXT search index to a table of mine in MySQL, only to discover that the InnoDB table format doesn''t seem to support this feature. Switching to the MyISAM table type seemed to work, but I seem to have some recollection that ActiveRecord transactions aren''t fully atomic on MyISAM tables. Is this true or am I just remembering wrong? If so, does anyone have any suggestions for doing fulltext searches in rails? I found a couple of links on the wiki but they only use LIKE syntax, which doesn''t rank searches by word frequency like FULLTEXT searches do. Thanks, Carl
Yeah, you have to use a separate MyISAM table for searches. But you can run both MyISAM and InnoDB tables in the same database, so it isn''t really a hindrance. Just build a specific table for searches. Hope that helps! caleb On 11/18/05, Carl Youngblood <carl-MJzSGySFh6ZUfOvSQQQpYw@public.gmane.org> wrote:> I tried adding a FULLTEXT search index to a table of mine in MySQL, > only to discover that the InnoDB table format doesn''t seem to support > this feature. Switching to the MyISAM table type seemed to work, but > I seem to have some recollection that ActiveRecord transactions aren''t > fully atomic on MyISAM tables. Is this true or am I just remembering > wrong? If so, does anyone have any suggestions for doing fulltext > searches in rails? I found a couple of links on the wiki but they > only use LIKE syntax, which doesn''t rank searches by word frequency > like FULLTEXT searches do. > > Thanks, > Carl > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >-- caleb http://www.ruejulesverne.com
I''m not sure what you mean. Do you mean store duplicate instances of my data, one for updating and keeping track of with ActiveRecord, and one for searching? On 11/18/05, Caleb Rutan <crutan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> Yeah, you have to use a separate MyISAM table for searches. But you > can run both MyISAM and InnoDB tables in the same database, so it > isn''t really a hindrance. Just build a specific table for searches.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2005, at 6:24 PM, Carl Youngblood wrote:> I tried adding a FULLTEXT search index to a table of mine in MySQL, > only to discover that the InnoDB table format doesn''t seem to support > this feature. Switching to the MyISAM table type seemed to work, but > I seem to have some recollection that ActiveRecord transactions aren''t > fully atomic on MyISAM tables. Is this true or am I just remembering > wrong? If so, does anyone have any suggestions for doing fulltext > searches in rails? I found a couple of links on the wiki but they > only use LIKE syntax, which doesn''t rank searches by word frequency > like FULLTEXT searches do.I don''t know how involved you want to get, but a common way is to use InnoDB in the master database then replicate to a slave which uses MyISAM. Then perform your fulltext queries against the slave. This eliminates search load from your main database as well. Other possibilities: - - mirror your data in the InnoDB table foobars to an auxiliary MyISAM table foobars_search with callbacks on the Foo model to keep the two in sync. - - dump MySQL fulltext in favor of Odeum or Ferret. Best, jeremy -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (Darwin) iD8DBQFDfpRoAQHALep9HFYRAhYoAKCTFjtY/olDCEqeHoLQZaa3Br+lbQCfQ1s4 yW/XK1xBg8B9yW/wbux3vpk=L3G6 -----END PGP SIGNATURE-----
Wow. I''m surprised to find that it is so difficult. Thanks for the info. This makes me wonder... Back in the early days of mysql they recommended using LOCK/UNLOCK TABLES to achieve transaction-like functionality. Monty''s argument was that myisam is about 3 times faster than fully ACID-compliant tables and the act of locking tables is really low overhead. Also, you only need to lock the tables that are needed for the transaction. How hard would it be to add support for MyISAM tables in ActiveRecord by locking the tables needed for a query instead of using InnoDB transactions? I think this would be a useful alternative to InnoDB, not just because it would allow people to use FULLTEXT but also because it might actually be faster in some cases. Thoughts? Carl On 11/18/05, Jeremy Kemper <jeremy-w7CzD/W5Ocjk1uMJSBkQmQ@public.gmane.org> wrote:> -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Nov 18, 2005, at 6:24 PM, Carl Youngblood wrote: > > I tried adding a FULLTEXT search index to a table of mine in MySQL, > > only to discover that the InnoDB table format doesn''t seem to support > > this feature. Switching to the MyISAM table type seemed to work, but > > I seem to have some recollection that ActiveRecord transactions aren''t > > fully atomic on MyISAM tables. Is this true or am I just remembering > > wrong? If so, does anyone have any suggestions for doing fulltext > > searches in rails? I found a couple of links on the wiki but they > > only use LIKE syntax, which doesn''t rank searches by word frequency > > like FULLTEXT searches do. > > I don''t know how involved you want to get, but a common way is to use > InnoDB in the master database then replicate to a slave which uses > MyISAM. > Then perform your fulltext queries against the slave. This eliminates > search load from your main database as well. > > Other possibilities: > - - mirror your data in the InnoDB table foobars to an auxiliary MyISAM > table > foobars_search with callbacks on the Foo model to keep the two in > sync. > - - dump MySQL fulltext in favor of Odeum or Ferret. > > Best, > jeremy > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.2 (Darwin) > > iD8DBQFDfpRoAQHALep9HFYRAhYoAKCTFjtY/olDCEqeHoLQZaa3Br+lbQCfQ1s4 > yW/XK1xBg8B9yW/wbux3vpk> =L3G6 > -----END PGP SIGNATURE----- > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 18, 2005, at 7:14 PM, Carl Youngblood wrote:> Wow. I''m surprised to find that it is so difficult. Thanks for the > info. This makes me wonder... Back in the early days of mysql they > recommended using LOCK/UNLOCK TABLES to achieve transaction-like > functionality. Monty''s argument was that myisam is about 3 times > faster than fully ACID-compliant tables and the act of locking tables > is really low overhead. Also, you only need to lock the tables that > are needed for the transaction.It''s fast for one writer but very slow for many. InnoDB often outperforms MyISAM due to writer concurrency alone (it has row locks whereas MyISAM has table locks only.)> How hard would it be to add support for MyISAM tables in ActiveRecord > by locking the tables needed for a query instead of using InnoDB > transactions? I think this would be a useful alternative to InnoDB, > not just because it would allow people to use FULLTEXT but also > because it might actually be faster in some cases.Table locking cannot emulate transactions. Otherwise, MyISAM would have them. It gives you a small portion of ACID (isolation only) and no rollback capability.> Thoughts?I think the best option is for MySQL to implement fulltext indexes on InnoDB tables. jeremy -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (Darwin) iD8DBQFDfqIVAQHALep9HFYRAkFEAJ4h/N7R+0Z7HWTZ2g7cNY/UDccE/QCfalOx ndefj9U8xhapUhSko0A3f5I=RbOj -----END PGP SIGNATURE-----
Hi, Carl, is there a specific reason why you don''t think about using a ''fulltext search specialist'' for the job? MySQL-Fulltext is often (have a look at http://www.oreillynet.com/pub/wlg/8168 for example) considered inferior to ''real'' search engines performance- and feature-wise. And the great part: You already have the choice: Try ferret, try ruby-odeum, try lucene with bindings or as a web-service. I''m sure there will be the perfect fit for you! regards Jan Prill Carl Youngblood wrote:>Wow. I''m surprised to find that it is so difficult. Thanks for the >info. This makes me wonder... Back in the early days of mysql they >recommended using LOCK/UNLOCK TABLES to achieve transaction-like >functionality. Monty''s argument was that myisam is about 3 times >faster than fully ACID-compliant tables and the act of locking tables >is really low overhead. Also, you only need to lock the tables that >are needed for the transaction. > >How hard would it be to add support for MyISAM tables in ActiveRecord >by locking the tables needed for a query instead of using InnoDB >transactions? I think this would be a useful alternative to InnoDB, >not just because it would allow people to use FULLTEXT but also >because it might actually be faster in some cases. > >Thoughts? > >Carl > >
Hi Carl, Please do check out Ferret. Jan has written a great howto on integrating Ferret with Rails here; http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails Also, you might like to check out the quickstart tutorial here; http://ferret.davebalmain.com/api/files/TUTORIAL.html I''ve tried to make the API as simple as possible so I don''t think it''d be too much more work than using MySQL full text search and the results you''ll be getting will be a lot better. And if it''s speed you''re worried about, I expect to have the full C backed version out by the end of the month which should be faster than Java Lucene. I don''t know how it will compare with MySQL full text search but I''m pretty sure it wouldn''t be the bottleneck in your app. Cheers, Dave On 11/19/05, Carl Youngblood <carl-MJzSGySFh6ZUfOvSQQQpYw@public.gmane.org> wrote:> Wow. I''m surprised to find that it is so difficult. Thanks for the > info. This makes me wonder... Back in the early days of mysql they > recommended using LOCK/UNLOCK TABLES to achieve transaction-like > functionality. Monty''s argument was that myisam is about 3 times > faster than fully ACID-compliant tables and the act of locking tables > is really low overhead. Also, you only need to lock the tables that > are needed for the transaction. > > How hard would it be to add support for MyISAM tables in ActiveRecord > by locking the tables needed for a query instead of using InnoDB > transactions? I think this would be a useful alternative to InnoDB, > not just because it would allow people to use FULLTEXT but also > because it might actually be faster in some cases. > > Thoughts? > > Carl > > On 11/18/05, Jeremy Kemper <jeremy-w7CzD/W5Ocjk1uMJSBkQmQ@public.gmane.org> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > On Nov 18, 2005, at 6:24 PM, Carl Youngblood wrote: > > > I tried adding a FULLTEXT search index to a table of mine in MySQL, > > > only to discover that the InnoDB table format doesn''t seem to support > > > this feature. Switching to the MyISAM table type seemed to work, but > > > I seem to have some recollection that ActiveRecord transactions aren''t > > > fully atomic on MyISAM tables. Is this true or am I just remembering > > > wrong? If so, does anyone have any suggestions for doing fulltext > > > searches in rails? I found a couple of links on the wiki but they > > > only use LIKE syntax, which doesn''t rank searches by word frequency > > > like FULLTEXT searches do. > > > > I don''t know how involved you want to get, but a common way is to use > > InnoDB in the master database then replicate to a slave which uses > > MyISAM. > > Then perform your fulltext queries against the slave. This eliminates > > search load from your main database as well. > > > > Other possibilities: > > - - mirror your data in the InnoDB table foobars to an auxiliary MyISAM > > table > > foobars_search with callbacks on the Foo model to keep the two in > > sync. > > - - dump MySQL fulltext in favor of Odeum or Ferret. > > > > Best, > > jeremy > > -----BEGIN PGP SIGNATURE----- > > Version: GnuPG v1.4.2 (Darwin) > > > > iD8DBQFDfpRoAQHALep9HFYRAhYoAKCTFjtY/olDCEqeHoLQZaa3Br+lbQCfQ1s4 > > yW/XK1xBg8B9yW/wbux3vpk> > =L3G6 > > -----END PGP SIGNATURE----- > > _______________________________________________ > > Rails mailing list > > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > > http://lists.rubyonrails.org/mailman/listinfo/rails > > > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
Thanks Jan and Dave for the encouragement. The mere thought of abandoning an RDBMS for the barbaric wastes of the filesystem filled me with dread. But now that I read that tutorial it really isn''t all that hard. And I think it will probably improve the performance a good deal. I''ll give it a try. I especially like how the tutorial explains how to keep your index in sync with your database. Carl On 11/19/05, Jan Prill <JanPrill-sTn/vYlS8ieELgA04lAiVw@public.gmane.org> wrote:> Hi, Carl, > > is there a specific reason why you don''t think about using a ''fulltext > search specialist'' for the job? MySQL-Fulltext is often (have a look at > http://www.oreillynet.com/pub/wlg/8168 for example) considered inferior > to ''real'' search engines performance- and feature-wise. And the great > part: You already have the choice: Try ferret, try ruby-odeum, try > lucene with bindings or as a web-service. I''m sure there will be the > perfect fit for you! > > regards > Jan PrillOn 11/19/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> Hi Carl, > > Please do check out Ferret. Jan has written a great howto on > integrating Ferret with Rails here; > > http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails > > Also, you might like to check out the quickstart tutorial here; > > http://ferret.davebalmain.com/api/files/TUTORIAL.html > > I''ve tried to make the API as simple as possible so I don''t think it''d > be too much more work than using MySQL full text search and the > results you''ll be getting will be a lot better. And if it''s speed > you''re worried about, I expect to have the full C backed version out > by the end of the month which should be faster than Java Lucene. I > don''t know how it will compare with MySQL full text search but I''m > pretty sure it wouldn''t be the bottleneck in your app. > > Cheers, > Dave
GREAT! Please notice, that Dave made big process since I''ve posted the howto. Notice further the thread on the mailing list about "best practices" of the integration of the indexer. You might mix it into your models. And finally: Just ask if you have problems. I''m sure you won''t regret your decision of stepping into the world of Ferret/Lucene. In my java-projects Lucene always happens to be one of the most important parts, mostly because of it''s performance and its capability of fuzzy searches! regards Jan Prill Carl Youngblood wrote:>Thanks Jan and Dave for the encouragement. The mere thought of >abandoning an RDBMS for the barbaric wastes of the filesystem filled >me with dread. But now that I read that tutorial it really isn''t all >that hard. And I think it will probably improve the performance a >good deal. I''ll give it a try. I especially like how the tutorial >explains how to keep your index in sync with your database. > >Carl > >On 11/19/05, Jan Prill <JanPrill-sTn/vYlS8ieELgA04lAiVw@public.gmane.org> wrote: > > >>Hi, Carl, >> >>is there a specific reason why you don''t think about using a ''fulltext >>search specialist'' for the job? MySQL-Fulltext is often (have a look at >>http://www.oreillynet.com/pub/wlg/8168 for example) considered inferior >>to ''real'' search engines performance- and feature-wise. And the great >>part: You already have the choice: Try ferret, try ruby-odeum, try >>lucene with bindings or as a web-service. I''m sure there will be the >>perfect fit for you! >> >>regards >>Jan Prill >> >> > >On 11/19/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > >>Hi Carl, >> >>Please do check out Ferret. Jan has written a great howto on >>integrating Ferret with Rails here; >> >> http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails >> >>Also, you might like to check out the quickstart tutorial here; >> >> http://ferret.davebalmain.com/api/files/TUTORIAL.html >> >>I''ve tried to make the API as simple as possible so I don''t think it''d >>be too much more work than using MySQL full text search and the >>results you''ll be getting will be a lot better. And if it''s speed >>you''re worried about, I expect to have the full C backed version out >>by the end of the month which should be faster than Java Lucene. I >>don''t know how it will compare with MySQL full text search but I''m >>pretty sure it wouldn''t be the bottleneck in your app. >> >>Cheers, >>Dave >> >> >_______________________________________________ >Rails mailing list >Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org >http://lists.rubyonrails.org/mailman/listinfo/rails > > >
Hey Dave, have you considered using RubyInline to speed up Ferret? It might be a lot easier than writing a C module. On 11/19/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> results you''ll be getting will be a lot better. And if it''s speed > you''re worried about, I expect to have the full C backed version out > by the end of the month which should be faster than Java Lucene. I > don''t know how it will compare with MySQL full text search but I''m > pretty sure it wouldn''t be the bottleneck in your app.
On 11/20/05, Carl Youngblood <carl-MJzSGySFh6ZUfOvSQQQpYw@public.gmane.org> wrote:> Hey Dave, have you considered using RubyInline to speed up Ferret? It > might be a lot easier than writing a C module.Hey Carl, I''ve looked into this and I can''t get the same speed up as I can get by writing the full indexer in pure C. In fact, I even started writing each Ruby class in C (which should be quicker than ruby inline) but even this was nowhere near as quick as going the pure C route. The current version has the 10 most heavily used classes rewritten in C already, but I''m still expecting a 50 times speed up when I integrate the full C version. Dave> > On 11/19/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > results you''ll be getting will be a lot better. And if it''s speed > > you''re worried about, I expect to have the full C backed version out > > by the end of the month which should be faster than Java Lucene. I > > don''t know how it will compare with MySQL full text search but I''m > > pretty sure it wouldn''t be the bottleneck in your app. > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >
On 11/20/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> already, but I''m still expecting a 50 times speed up when I integrate > the full C version.Just to clarify this statement, I mean a 50 times speed up on what I was able to achieve with my class by class C extensions, not 50 times faster than Apache Lucene. I don''t want to make any claims about the speed up there until I''ve actually finished the work, but I''m fairly confident that it will be faster. Cheers, Dave
On 19 Nov 2005, at 23:15, David Balmain wrote:> On 11/20/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > >> already, but I''m still expecting a 50 times speed up when I integrate >> the full C version. >> > > Just to clarify this statement, I mean a 50 times speed up on what I > was able to achieve with my class by class C extensions, not 50 times > faster than Apache Lucene. I don''t want to make any claims about the > speed up there until I''ve actually finished the work, but I''m fairly > confident that it will be faster.Fair enough... but I''ve already seen folks taking your speed-up claims to be against Java Lucene. And while I''m perhaps being a bit defensive of my good pal "Java Lucene", I think that folks are getting the wrong impression about your speed-up predictions. I don''t want to put a damper on your work or enthusiasm at all, just want to let you know the word of mouth impressions I''m getting from folks about Ferret vs. Java Lucene. An interesting aspect of it is that as long as Ferret is fast enough, even if it was somewhat faster than Java Lucene (which I''m still cautiously skeptical of, but certainly optimistic), that alone wouldn''t likely be the criteria for folks switching from Java. I really like your confidence! Very impressive what you''ve done so far, and I''m thrilled beyond belief! Thank you for Ferret, and keep up the great work. Erik
Hi Erik, Thanks for the feedback. I''ll definitely try and be a bit more careful what I say in future. Hopefully my results will do the talking. ;-) Cheers, Dave On 11/20/05, Erik Hatcher <erik-LIifS8st6VgJvtFkdXX2HpqQE7yCjDx5@public.gmane.org> wrote:> > On 19 Nov 2005, at 23:15, David Balmain wrote: > > > On 11/20/05, David Balmain <dbalmain.ml-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > > > >> already, but I''m still expecting a 50 times speed up when I integrate > >> the full C version. > >> > > > > Just to clarify this statement, I mean a 50 times speed up on what I > > was able to achieve with my class by class C extensions, not 50 times > > faster than Apache Lucene. I don''t want to make any claims about the > > speed up there until I''ve actually finished the work, but I''m fairly > > confident that it will be faster. > > Fair enough... but I''ve already seen folks taking your speed-up > claims to be against Java Lucene. And while I''m perhaps being a bit > defensive of my good pal "Java Lucene", I think that folks are > getting the wrong impression about your speed-up predictions. > > I don''t want to put a damper on your work or enthusiasm at all, just > want to let you know the word of mouth impressions I''m getting from > folks about Ferret vs. Java Lucene. > > An interesting aspect of it is that as long as Ferret is fast enough, > even if it was somewhat faster than Java Lucene (which I''m still > cautiously skeptical of, but certainly optimistic), that alone > wouldn''t likely be the criteria for folks switching from Java. > > I really like your confidence! Very impressive what you''ve done so > far, and I''m thrilled beyond belief! Thank you for Ferret, and keep > up the great work. > > Erik > > _______________________________________________ > Rails mailing list > Rails-1W37MKcQCpIf0INCOvqR/iCwEArCW2h5@public.gmane.org > http://lists.rubyonrails.org/mailman/listinfo/rails >