I've been trying to implement acts_as_ferret in my latest project and ran into a snag. If I search for 'auditor state', the search works perfectly. If I include a stop word, as in 'auditor of state', I get no results. I'd prefer not to set stop words to nil and index everything.

The solution, which I have yet to attempt, is to use Ferret::QueryParser instead of passing the query as a string to the search method. I couldn't find a way to do this with the current acts_as_ferret plugin and was wondering if modifying the plugin to have a "ferret_query_parser" method would be better than trying to use Ferret directly from my app model.

Also, wouldn't this approach be necessary if I implement my own analyzer? I was thinking of possibly using the double metaphone algorithm, and I suspect that unless the query parser analyzes the search string with my custom analyzer, I wouldn't get any results.

I hope that I haven't missed something obvious in aaf's API.

On a side note, is there any recommended place to put custom analyzers in Rails apps?

Thanks,
Curtis
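A toy illustration of the analyzer concern raised above, written in plain Ruby rather than against Ferret's or acts_as_ferret's API. It is only a sketch: the stop list, the `analyze`, `build_index` and `search` helpers, and the sample documents are all made up for the example. It shows that when stop words are dropped at index time, a query that is split naively and ANDed together finds nothing, while the same query run through the same analysis step does match.

```ruby
# Toy model only: none of this is Ferret's or acts_as_ferret's API.
STOP_WORDS = %w[a an and of the to].freeze # hypothetical stop list

# Lowercase, split on non-alphanumerics, drop stop words.
def analyze(text)
  text.downcase.scan(/[a-z0-9]+/).reject { |term| STOP_WORDS.include?(term) }
end

# Build an inverted index: term => list of document ids.
def build_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    analyze(text).uniq.each { |term| index[term] << id }
  end
  index
end

# Return ids of documents that contain every given term (AND semantics).
def search(index, terms)
  return [] if terms.empty?
  terms.map { |term| index.fetch(term, []) }.reduce(:&)
end

DOCS  = { 1 => "Auditor of State", 2 => "State parks" }.freeze
INDEX = build_index(DOCS)

# Query split on whitespace only: "of" is looked up but was never indexed.
RAW_HITS = search(INDEX, "auditor of state".split)

# Query passed through the same analysis as the documents.
ANALYZED_HITS = search(INDEX, analyze("auditor of state"))
```

The same reasoning would apply to a phonetic analyzer such as double metaphone: if documents are indexed as phonetic codes, the query terms have to be converted to the same codes before lookup.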
Hi!

On Wed, Nov 01, 2006 at 09:54:25AM -0500, Curtis Hatter wrote:
> I've been trying to implement acts_as_ferret in my latest project and ran
> into a snag. If I do a search for 'auditor state' then the search works
> perfectly. If I include a stop word, as in 'auditor of state', then I get
> no results. I'd prefer not to set stop words to nil and index everything.

What version of AAF/Ferret do you use? Afair that issue isn't new, and should have been fixed some time ago.

cheers,
Jens

-- 
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66
Currently I'm using AAF 0.10 and the Windows build of Ferret version 0.10.9.

I'm currently moving my development platform to a FreeBSD machine, which is why I haven't been able to do much testing. The FreeBSD version will be 0.10.13.

I looked into the archives I have, but the only solution I found was to set the stopwords to nil.

Thanks,
Curtis

----- Original Message -----
From: "Jens Kraemer" <kraemer at webit.de>
To: <ferret-talk at rubyforge.org>
Sent: Wednesday, November 01, 2006 12:27 PM
Subject: Re: [Ferret-talk] aaf and stop words; query parser

> what version of AAF/Ferret do you use ? Afair that issue isn't new, and
> should have been fixed some time ago.

_______________________________________________
Ferret-talk mailing list
Ferret-talk at rubyforge.org
http://rubyforge.org/mailman/listinfo/ferret-talk
I'm using the same versions of AAF and Ferret, 0.3.0 and 0.10.9 respectively. I sent David Balmain my index so he could analyze it. I posted a similar message here:

http://www.ruby-forum.com/topic/84909

Every index I built with AAF seemed to demonstrate this problem. I checked the code, but I couldn't see where it might have been modifying the query string in any way.

Any help?

Charlie

Curtis Hatter wrote:
> Currently I'm using AAF 0.10 and windows build of Ferret version 0.10.9
>
> I looked into the archives I have but only solution I found was to set
> the stopwords to nil.

-- 
Posted via http://www.ruby-forum.com/.
Charlie Hubbard wrote:
> Any index I built with AAF seemed to demostrate this problem.
>
> Any help?

I should also say that I was not able to reproduce this when I created an index using just Ferret. Doing something similar to what David suggested in the other thread, I got hits when I submitted queries with stop words. Hope that helps.

Charlie
I believe the problem was in how I was creating my index. My acts_as_ferret declaration was as follows:

    acts_as_ferret( :fields => {
      :name => {},
      :desc => {:index => :untokenized_omit_norms},
      :body => {:store => :yes},
      :role => {},
    })

With the above, a search that used stop words, e.g. "auditor of state", would return no hits. When I removed the ":index => :untokenized_omit_norms" and rebuilt the index, that same search started to work with acts_as_ferret. I haven't tried using Ferret on its own to see what would happen because of time constraints on this current project.

If there are any suggestions, I'd gladly try them. I would like to keep the "desc" field untokenized and omit the norms because I don't do boosting and may wish to sort by the "desc" field.

Thanks,
Curtis

----- Original Message -----
From: "Charlie Hubbard" <charlie.hubbard at gmail.com>
To: <ferret-talk at rubyforge.org>
Sent: Tuesday, November 07, 2006 9:59 AM
Subject: Re: [Ferret-talk] aaf and stop words; query parser

> I should also say that I was not able to reproduce this when I created
> an index using just ferret. So doing something similar to what David
> suggested in the other thread. I got hits when I submitted queries with
> stop words. Hope that helps.
On Tue, Nov 07, 2006 at 06:25:20PM -0500, Curtis Hatter wrote:
> With the above a search that used stop words, ex. "auditor of state", would
> return no hits. When I removed the ":index => :untokenized_omit_norms" and
> rebuilt the index that same search started to work with acts_as_ferret.
>
> If there's any suggestions or anything I'd gladly try them. I would like to
> keep the "desc" untokenized and omit the norms because I don't do boosting
> and may wish to sort by the "desc" field.

You really should tokenize the desc field if you want to run searches across it. If you have to sort by the desc field and therefore can't tokenize it, you could index it twice, once tokenized for searching and once untokenized (and maybe truncated to save some space in your index) for sorting.

Jens
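The "index it twice" suggestion can be sketched in plain Ruby. This is a toy model, not Ferret's or acts_as_ferret's API: the `desc_sort` field name, the truncation length, and the helper methods are all assumptions made for the example. Each record gets a tokenized copy of desc that searches run against and an untokenized, truncated copy that is only used as a sort key.

```ruby
# Toy model only: plain Ruby, not Ferret's or acts_as_ferret's API.
SORT_KEY_LENGTH = 20 # hypothetical truncation length for the sort copy

# Tokenized form: one term per word, used for searching.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Turn a record into an indexed document holding two copies of desc.
def to_document(record)
  {
    id:        record[:id],
    desc:      tokenize(record[:desc]),                    # tokenized copy
    desc_sort: record[:desc].downcase[0, SORT_KEY_LENGTH]  # untokenized, truncated
  }
end

# Search the tokenized copy, then order the hits by the untokenized copy.
def search_sorted(documents, term)
  documents
    .select  { |doc| doc[:desc].include?(term.downcase) }
    .sort_by { |doc| doc[:desc_sort] }
    .map     { |doc| doc[:id] }
end

RECORDS = [
  { id: 1, desc: "State auditor annual report" },
  { id: 2, desc: "Auditor of State press release" },
  { id: 3, desc: "County parks budget" }
].freeze

DOCUMENTS  = RECORDS.map { |record| to_document(record) }
SORTED_IDS = search_sorted(DOCUMENTS, "auditor")
```

In this model the untokenized copy is a single string, so a one-word query never equals it. That is why the search has to go through the tokenized copy while the sort uses the other one.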
Jens Kraemer wrote:
> you really should tokenize the desc field if you want to run searches
> across it. If you have to sort by the desc field and therefore
> can't tokenize it, you could index it twice, once tokenized for
> searching and once untokenized (and maybe truncated to save some space
> in your index) for sorting.

Jens,

I'm seeing the same behavior as Curtis, but here is how I'm building my index:

    acts_as_ferret( { :additional_fields => [:content] } )

See my other thread for some observations from what I initially tested.

http://www.ruby-forum.com/topic/84909

However, when I tried to reproduce this using just Ferret, I couldn't. Any ideas?

Charlie
Hi!

On Sat, Nov 18, 2006 at 04:29:43PM +0100, Charlie Hubbard wrote:
> I'm seeing this same behavior as Curtis, but here is how I'm building my
> index:
>
> acts_as_ferret( { :additional_fields => [:content] } )
>
> However, when I tried to reproduce this using just ferret I couldn't.
> Any ideas?

Yes, I think it's a Ferret bug that was introduced some time after 0.10.1. Have a look at this script:

http://pastie.caboo.se/22886

This reproduces the problem by adding an untokenized field to the index. If there is no untokenized field, everything is fine. As AAF uses untokenized fields to store IDs and class names, the problem is always present.

I checked the following versions of Ferret:

working: 0.10.1
not working: 0.10.9, 0.10.11, 0.10.13

I already tried to contact Dave about this, but he still seems to be offline. Hope he's fine and back soon to help us out here ;-)

cheers,
Jens