Liam Morley
2008-Jan-03 20:35 UTC
[Ferret-talk] properly escaping special characters in AAF?
For most cases, I've got search working in Rails as follows:

    ## controller:
    term = params[:search][:term]
    @results = MyModel.find_by_contents "#{term}*"

The '*' character is appended to the search term so that searches match anything that begins with 'term'. For the most part, this is great, but let's say term is equal to "Title: Some subtitle". This will match anything that has a 'title' attribute equal to "some subtitle", instead of any attribute equal to "Title: Some subtitle", which is what I'm hoping for.

If I run my search from within a double-quotes expression, like

    MyModel.find_by_contents "'\"#{term}*\"'"

then it looks like I can get matches for "Title: Some subtitle", but I can't get matches if I search for "Titl" without the 'e', presumably because the '*' is escaped as well? I'm not quite sure.

I want something that works in all cases, where I can include a search term that has a special character, but still get matches when my search term isn't equal to an entire word. I'm hoping that my situation is a typical one, and that someone out there has already dealt with this? Thanks very much for any advice.

Liam Morley
Julio Cesar Ody
2008-Jan-03 23:11 UTC
[Ferret-talk] properly escaping special characters in AAF?
Hey,

these are in fact two separate problems. Let me try to explain.

When a query like "title: foo bar" is passed straight to Ferret and parsed as FQL, it basically becomes "look for 'foo bar' in the 'title' field", as you figured. Quoting the whole lot sends the query to the index as a single phrase, and the default_field value decides which fields are searched for "title: foo bar". Since it defaults to "*", every field in every document is queried.

Now, the reason a quoted query like "Titl" won't match results with "title" in them is, simply put, the analyzer you're using. If you're using the StandardAnalyzer (if you didn't specify otherwise, that's what you're using), you can expect it to catch whole words, separated by spaces, minus stop words (or, and, by, etc.). So "titl" will never match "title".

If you're looking for something that gives you partial-string matches, I'd go for a RegExpAnalyzer with a regex like /./, which would turn every character into a token. This is a bit nightmarish because you'll get an insane number of matches for everything, but right now I can't think of a better way (maybe require a minimum number of characters per query and filter out results with a very low score?).

Or, if you're looking for stemming (queries for "titles" or "titling" returning results with "title"), have a look at http://rubyforge.org/pipermail/ferret-talk/2007-March/002782.html

Hope that helps.
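
For the field-name half of the problem, one option is to backslash-escape the special characters in the user's term before appending the wildcard, so that "Title:" is not read as a field prefix. The following is only a rough sketch in plain Ruby: escape_fql and prefix_query are hypothetical helper names, not part of Ferret or acts_as_ferret, and the character list is my assumption based on the FQL documentation, so check it against your Ferret version.

```ruby
# Characters that FQL treats as syntax. This list is an assumption taken
# from the Ferret query language docs; verify it for your Ferret version.
FQL_SPECIAL = /[:()\[\]{}!+"~^\-|<>=*?\\]/

# Hypothetical helper: backslash-escape every special character in a
# user-supplied term so the query parser reads it as literal text.
def escape_fql(term)
  term.gsub(FQL_SPECIAL) { |ch| "\\#{ch}" }
end

# Hypothetical helper: build a prefix query from raw user input.
# The trailing '*' is appended after escaping, so it is not escaped itself.
def prefix_query(term)
  "#{escape_fql(term.strip)}*"
end

puts prefix_query("Title: Some subtitle")
puts prefix_query("Titl")
```

In the controller this would presumably be used as MyModel.find_by_contents(prefix_query(term)). Note that the wildcard then only applies to the last word of a multi-word term, and how the escaped text is tokenized still depends on the analyzer, so this may not cover every case described above.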