Is there any way to write a search function that finds words containing accented characters when the user types the words without accents?

My database has a lot of names in it that have characters with accents and other non-keyboard characters. When users search the database, I would like them to be able to find records with accented characters even if they don't type in the accent. For instance, a user might be searching for a text by the author Chrétien de Troyes. Right now, they have to type "Chrétien" into the search form to find him; I would like a search for "Chretien" to also find "Chrétien."

This strikes me as a rather common problem: is there a good solution for it?

--
Posted via http://www.ruby-forum.com/.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
On 29 Sep 2008, at 21:26, Morgan Kay <rails-mailing-list@andreas-s.net> wrote:
> Is there any way to write a search function that can search for words
> that contain accented characters when the user types in words without
> accented characters?

Play with your database collation settings.

Fred
Quoting Frederick Cheung <frederick.cheung-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>:
> On 29 Sep 2008, at 21:26, Morgan Kay <rails-mailing-list@andreas-s.net> wrote:
> > Is there any way to write a search function that can search for words
> > that contain accented characters when the user types in words without
> > accented characters?
>
> Play with your database collation settings
>
> Fred

This works if you have only one language. With multiple languages, you need to keep the locale, switching as needed. Generic Latin1 may do what you need.

Jeffrey
I have been reading up on collation settings, and I'm not sure they will do what I want. I don't want to get rid of accented characters (which is what would happen if I changed character sets); I just don't want searches to get thrown off by them. In other words, a search for "Chrétien" or "Chretien" should still find "Chrétien", and he should still have the accent in his name. Can collation settings do this for me, or is there some other solution?

Thanks!

--
Posted via http://www.ruby-forum.com/.
On Oct 1, 2008, at 8:48 PM, Morgan Kay wrote:
> In other words, a search for "Chrétien" or "Chretien" should still
> find "Chrétien", and he should still have the accent in his name.
> Can collation settings do this for me, or is there some other solution?

One approach is to transliterate your input, e.g.:

http://interglacial.com/~sburke/tpj/as_html/tpj22.html
-- Sean M. Burke, Unidecode!, 2001

That way, "Chrétien" becomes "chretien" or some such for the purpose of your search, but remains "Chrétien" in the text.

For example, both El-Aaiún and El-Aaiun could reference the same underlying text:

http://svr225.stepx.com:3388/El-Aaiún
http://svr225.stepx.com:3388/El-Aaiun

Cheers,

--
PA.
http://alt.textdrive.com/nanoki/
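As a rough illustration of the transliteration idea in Ruby, here is a minimal sketch. It is not Unidecode: it assumes Ruby 2.2 or later (for `String#unicode_normalize`) and only strips combining accents from letters that decompose, so characters such as "ø" or "ß" pass through unchanged. The helper name `fold_accents` is made up for this example.

```ruby
# Hypothetical helper: fold accented letters to their unaccented,
# lowercase form so that "Chrétien" and "Chretien" compare equal.
def fold_accents(text)
  # NFD splits "é" into "e" followed by a combining acute accent (U+0301).
  decomposed = text.unicode_normalize(:nfd)
  # \p{Mn} matches nonspacing combining marks; remove them, then lowercase.
  decomposed.gsub(/\p{Mn}/, "").downcase
end

puts fold_accents("Chrétien")  # prints "chretien"
puts fold_accents("El-Aaiún")  # prints "el-aaiun"
```

The stored name is never modified; only the value used for comparison is folded.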
> One approach is to transliterate your input, e.g.:
>
> http://interglacial.com/~sburke/tpj/as_html/tpj22.html
> -- Sean M. Burke, Unidecode!, 2001
>
> That way, "Chrétien" becomes "chretien" or some such for the purpose
> of your search, but remains "Chrétien" in the text.

This looks really promising, but after reading up on this for a while, I don't see how to get it to work with Rails. Could you give me a few pointers or direct me to some documentation?

Thank you!!

--
Posted via http://www.ruby-forum.com/.
On Oct 2, 2008, at 2:25 AM, Morgan Kay wrote:
> This looks really promising, but after reading up on this for a
> while, I don't see how to get it to work with Rails... could you
> give me a few pointers or direct me to some documentation?

At its core, Unidecode is simply a lookup table. It should be rather straightforward to port to Ruby if that hasn't been done already.

Here is the original Perl implementation:

http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm

And below is a Lua port of it:

http://dev.alt.textdrive.com/browser/HTTP/Unidecode.lua

As well as the lookup tables themselves:

http://dev.alt.textdrive.com/browser/HTTP/Unidecode

Usage example:

    local Unidecode = require( 'Unidecode' )

    print( 1, 'Москва́', Unidecode( 'Москва́' ) )
    print( 2, '北京', Unidecode( '北京' ) )
    print( 3, 'Ἀθηνᾶ', Unidecode( 'Ἀθηνᾶ' ) )
    print( 4, '서울', Unidecode( '서울' ) )
    print( 5, '東京', Unidecode( '東京' ) )
    print( 6, '京都市', Unidecode( '京都市' ) )
    print( 7, 'नेपाल', Unidecode( 'नेपाल' ) )

    > 1 Москва́ Moskva
    > 2 北京 beijing
    > 3 Ἀθηνᾶ Athena
    > 4 서울 seoul
    > 5 東京 dongjing
    > 6 京都市 jingdushi
    > 7 नेपाल nepaal

If Unidecode is too much of a good thing, one could use iconv translit or such, e.g. iconv( 'utf-8', 'us-ascii//TRANSLIT' )...

One way or another, the crux of it is to transliterate your data as well as your query, and then use the latter to search the former.

Cheers,

--
PA.
http://alt.textdrive.com/nanoki/
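The "transliterate your data as well as your query" step could look roughly like this in plain Ruby. This is only a sketch under assumptions: `Author`, `fold_accents`, `build_index`, and `search` are names invented here, the accent folding relies on `String#unicode_normalize` (Ruby 2.2+) rather than Unidecode, and in a Rails application the folded key would more likely live in an extra database column than in an in-memory array.

```ruby
# Each record keeps the original name plus a folded key used only for search.
Author = Struct.new(:name, :search_key)

# Hypothetical helper: strip combining accents and lowercase.
def fold_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# Transliterate the data once, when it is stored.
def build_index(names)
  names.map { |name| Author.new(name, fold_accents(name)) }
end

# Transliterate the query the same way, then match key against key.
def search(index, query)
  key = fold_accents(query)
  index.select { |author| author.search_key.include?(key) }.map(&:name)
end

INDEX = build_index(["Chrétien de Troyes", "Marie de France"])

p search(INDEX, "Chretien")  # prints ["Chrétien de Troyes"]
p search(INDEX, "Chrétien")  # prints ["Chrétien de Troyes"]
```

Because both sides go through the same function, the accented and unaccented spellings find the same record, and the name shown to the user keeps its accent.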
Petite Abeille [2008-10-02 19:56]:
> At its core, Unidecode is simply a lookup table. Should be rather
> straightforward to port to Ruby if it hasn't been done already.

I wanted to do it, but it's been there for over a year now:

<http://rubyforge.org/projects/unidecode>

cheers
jens