Hi all,

I have indexed a huge amount of data with text from several European languages. The index contains values like Georg Friedrich Händel.

I would like a search phrase like "Georg Friedrich Handel" to find records with the real spelling, Händel, but it doesn't seem to work.

Can anyone give me an idea of what I need to do to make this happen? I'm a bit lost here and can't find anything on Google to help out. I suspect it might be a locale issue, but I'm not sure.

Thanks,

Chad.
--
Posted via http://www.ruby-forum.com/.
I'm 100% sure this has been asked before on this list, but I know it's not exactly trivial to search for. I'd say give the archives a try: http://rubyforge.org/pipermail/ferret-talk/

On 8/14/07, Chad Thatcher <chad at zulu.net> wrote:
> I would like a search phrase like "Georg Friedrich Handel" to find
> records with the real spelling of Händel but it doesn't seem to work.
On Tue, Aug 14, 2007 at 02:24:54AM +0200, Chad Thatcher wrote:
> I have indexed a huge amount of data with text from several european
> languages. In the index are values like Georg Friedrich Händel.
>
> I would like a search phrase like "Georg Friedrich Handel" to find
> records with the real spelling of Händel but it doesn't seem to work.

To achieve this, simply replace all occurrences of 'ä' with 'a' in both the indexed content and the queries. MappingFilter [1] is your friend :-)

Cheers,
Jens

[1] http://ferret.davebalmain.com/api/classes/Ferret/Analysis/MappingFilter.html

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
Chad,

you should use a mapping filter to transform special characters like German umlauts into their ASCII counterparts. Take a look at this analyzer, maybe it helps:

http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret/analysis.rb

Ben
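
For illustration, here is a minimal sketch of the mapping idea suggested in the replies, written in plain Ruby instead of going through Ferret's MappingFilter. The names FOLD_MAP, FOLD_PATTERN and fold_accents are hypothetical, the table covers only a handful of characters, and case folding is left out. The point it tries to show is that the same folding has to be applied to the text before indexing and to the query before searching, so both sides end up with the same terms.

```ruby
# encoding: utf-8

# Hypothetical mapping table (an assumption, far from complete):
# each accented character maps to a plain ASCII replacement.
FOLD_MAP = {
  'ä' => 'a', 'à' => 'a', 'á' => 'a', 'â' => 'a',
  'ö' => 'o', 'ó' => 'o', 'ô' => 'o',
  'ü' => 'u', 'ú' => 'u', 'û' => 'u',
  'é' => 'e', 'è' => 'e', 'ê' => 'e',
  'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U',
  'ß' => 'ss'
}.freeze

# One pattern that matches any key of the table.
FOLD_PATTERN = Regexp.union(FOLD_MAP.keys)

# Replace every mapped character with its ASCII counterpart.
# Characters that are not in the table are left untouched.
def fold_accents(text)
  text.gsub(FOLD_PATTERN, FOLD_MAP)
end

indexed = fold_accents("Georg Friedrich Händel") # what would go into the index
query   = fold_accents("Georg Friedrich Handel") # what would be searched for

puts indexed
puts indexed == query
```

Inside Ferret, the same table would presumably live in a custom analyzer that wraps its token stream in a MappingFilter, and that analyzer would need to be used for both indexing and query parsing.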