Ben Jackson
2005-Jun-20 22:23 UTC
Convert international chars to ASCII for directory names
Not sure if this is possible, but...

We are trying to write an export function that puts all of our content in neatly organized folders named after the title of the collection. The catch is that our sections are all in Portuguese. So, for example:

Sections:
Apresentação do Projeto
Educação Ambiental
etc...

Would have to go into folders:
apresentacao_do_projeto
educacao_ambiental
etc...

Has anyone seen a situation like this before? In the worst case we could do a series of gsub operations on the string for all the characters we'd need to substitute and use a hash for the character mapping, but if it's already been done I see no reason to reinvent the wheel :)

___________________
Ben Jackson
Diretor de Desenvolvimento
http://www.incomumdesign.com
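For reference, the "gsub plus a hash" fallback described above might look roughly like the sketch below. It is only an illustration: the names ACCENT_MAP and folder_name are made up here, the mapping covers just the accented letters common in Portuguese, and it assumes a Ruby recent enough (2.4 or later) that String#downcase handles non-ASCII letters.

```ruby
# Hypothetical mapping table: only the accented letters common in Portuguese.
ACCENT_MAP = {
  "á" => "a", "à" => "a", "â" => "a", "ã" => "a",
  "é" => "e", "ê" => "e",
  "í" => "i",
  "ó" => "o", "ô" => "o", "õ" => "o",
  "ú" => "u", "ü" => "u",
  "ç" => "c"
}.freeze

# Hypothetical helper: turns a section title into an ASCII folder name.
def folder_name(title)
  # Lowercase first (assumes Unicode-aware downcase), then replace every
  # non-ASCII character via the table; unmapped characters are dropped.
  ascii = title.downcase.gsub(/[^\x00-\x7F]/) { |ch| ACCENT_MAP.fetch(ch, "") }
  # Collapse anything that is not a letter or digit into a single underscore
  # and trim underscores from both ends.
  ascii.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
end

puts folder_name("Apresentação do Projeto")
puts folder_name("Educação Ambiental")
```

Dropping unmapped characters silently is a design choice of this sketch; raising an error instead would make gaps in the table easier to spot.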
Rick Olson
2005-Jun-21 00:43 UTC
Re: Convert international chars to ASCII for directory names
> Has anyone seen a situation like this before? In the worst case we
> could do a series of gsub operations on the string for all the
> characters we'd need to substitute and use a hash for the character
> mapping, but if it's already been done I see no reason to reinvent the
> wheel :)

Tuxxxie from #rubyonrails wrote an americanize method for the String class to do just that, but for email addresses. I don't believe he posted the code anywhere, though.

--
rick
http://techno-weenie.net
Julian 'Julik' Tarkhanov
2005-Jun-21 02:31 UTC
Re: Convert international chars to ASCII for directory names
On 21-jun-2005, at 0:23, Ben Jackson wrote:
> [...]
> Has anyone seen a situation like this before? In the worst case we
> could do a series of gsub operations on the string for all the
> characters we'd need to substitute and use a hash for the character
> mapping, but if it's already been done I see no reason to reinvent
> the wheel :)

When I need to do this I use a process called "transliteration": the conversion of letters to their Latin equivalents. Essentially, the Unicode Consortium should have established transliteration conventions for different alphabets, but I am not aware of a library implementing this, let alone Ruby bindings for such a library. For me, with my Russian alphabet, the process is straightforward though.

For "inspiration" look at the dirify method in MovableType.

--
Julian "Julik" Tarkhanov
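For Latin-script text such as the Portuguese titles in this thread, one possible shortcut is Unicode decomposition rather than a full transliteration table. The sketch below assumes a much later Ruby than the one current when this thread was written (String#unicode_normalize arrived in Ruby 2.2), and the method name strip_diacritics is made up for illustration. It does not transliterate other alphabets such as Cyrillic; those still need an explicit mapping.

```ruby
# Hypothetical helper: removes accents from Latin-script text.
def strip_diacritics(text)
  # NFD splits each accented letter into a base letter plus combining marks;
  # \p{Mn} matches those nonspacing marks, which are then removed.
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

puts strip_diacritics("Educação Ambiental")
```

Characters that have no decomposition (for example "ø" or "ß") pass through unchanged, so a small mapping table may still be needed on top of this.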
Rick Olson
2005-Jun-21 04:45 UTC
Re: Convert international chars to ASCII for directory names
This is what I got from Tuxxxie on #rubyonrails:

http://rafb.net/paste/results/wLKrUB46.html

--
rick
http://techno-weenie.net
Michael Koziarski
2005-Jun-21 04:52 UTC
Re: Convert international chars to ASCII for directory names
On 6/21/05, Rick Olson wrote:
> This is what I got from Tuxxxie on #rubyonrails:
>
> http://rafb.net/paste/results/wLKrUB46.html

Sure, that works for European languages, but what about Elvish?! ;)

--
Cheers

Koz
Erik Hatcher
2005-Jun-21 10:28 UTC
Re: Convert international chars to ASCII for directory names
On Jun 21, 2005, at 12:45 AM, Rick Olson wrote:
> This is what I got from Tuxxxie on #rubyonrails:
>
> http://rafb.net/paste/results/wLKrUB46.html

Just to validate this approach, here is something similar in Java (within the context of a Lucene filter that will normalize terms to be indexed):

http://rubyurl.com/zigoB