Hi,

I'm not entirely satisfied with the way ActiveSupport::Inflector.transliterate works:
- "œuf" (egg in French) is transliterated into "uf" instead of the more logical "oeuf"
- "Straße" is transliterated into "Strae" instead of "Strasse"
- "€" is transliterated into nothing (blank string)

The result is that you end up with meaningless URLs when you generate them with parameterize, which uses transliterate.

The "unidecode" gem (http://rubyforge.org/projects/unidecode/) has a different approach:
- any ligature is expanded into separate characters: "œ" is transliterated into "oe", "ß" into "ss", etc.
- more generally, unidecode always tries to find a replacement. For example, "€" is transliterated into "EU".

What do you think: do you prefer the transliterate approach that ignores any fancy character, or the unidecode gem that always tries to have a meaningful replacement? Would it make sense to propose a patch that includes and uses the unidecode gem for the transliterate method?

Martin

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To post to this group, send email to rubyonrails-core@googlegroups.com
To unsubscribe from this group, send email to rubyonrails-core+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en
-~----------~----~----~----~------~----~------~--~---
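To make the difference between dropping and expanding characters concrete, here is a minimal sketch of the expand-first idea in plain Ruby. The EXPANSIONS table and the expand_and_transliterate method are hypothetical names used only for illustration; this is neither the ActiveSupport implementation nor the unidecode gem's table, and it assumes Ruby 2.2 or later for String#unicode_normalize.

```ruby
# Hypothetical replacement table (an assumption for this sketch, not
# taken from ActiveSupport or from the unidecode gem).
EXPANSIONS = {
  "œ" => "oe", "Œ" => "OE",
  "æ" => "ae", "Æ" => "AE",
  "ß" => "ss",
  "€" => "EU"
}.freeze

# Expands known ligatures and symbols first, then strips accents, and
# only drops whatever is still non-ASCII at the very end.
def expand_and_transliterate(string)
  expanded = string.gsub(Regexp.union(EXPANSIONS.keys), EXPANSIONS)
  # NFD splits "é" into "e" plus a combining accent; \p{Mn} removes the accent.
  without_accents = expanded.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
  without_accents.gsub(/[^[:ascii:]]/, "")
end

puts expand_and_transliterate("œuf")
puts expand_and_transliterate("Straße")
puts expand_and_transliterate("€")
```

With a table like this, the three examples from the message keep a readable ASCII form instead of losing characters. The obvious cost is that the table has to be maintained by hand, which is what a dedicated gem would take care of.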
> I'm not entirely satisfied with the way
> ActiveSupport::Inflector.transliterate works:
> - "œuf" (egg in French) is transliterated into "uf" instead of the
> more logical "oeuf"
> - "Straße" is transliterated into "Strae" instead of "Strasse"
> - "€" is transliterated into nothing (blank string)

Agreed, it has absolutely no value unless you have an English application and want to get rid of 'those pesky unreadable characters'. Because sensible transliteration depends on both the source and the destination locale, it's really hard to solve. I think it shouldn't be included in Rails at all and should be solved in a separate library or gem.

Manfred
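As an illustration of the locale point, here is a small sketch in which the same character gets a different replacement depending on the locale. The LOCALE_RULES table and the transliterate_for method are hypothetical; the German rules follow the common "ü" to "ue" convention, and the default rules are just an assumption for the example.

```ruby
# Hypothetical per-locale rules (illustration only, not a Rails API).
LOCALE_RULES = {
  de:      { "ä" => "ae", "ö" => "oe", "ü" => "ue", "ß" => "ss" },
  default: { "ä" => "a",  "ö" => "o",  "ü" => "u",  "ß" => "ss" }
}.freeze

# Looks up the rules for the given locale, falling back to :default
# when the locale has no table of its own.
def transliterate_for(string, locale)
  rules = LOCALE_RULES.fetch(locale, LOCALE_RULES[:default])
  string.gsub(Regexp.union(rules.keys), rules)
end

puts transliterate_for("Müller", :de)
puts transliterate_for("Müller", :fr)
```

Even this toy version needs one table per locale, which gives an idea of why a general solution is hard to ship inside a framework.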
On Fri, Jan 16, 2009 at 1:24 PM, Manfred Stienstra <manfred@gmail.com> wrote:
>> I'm not entirely satisfied with the way
>> ActiveSupport::Inflector.transliterate works:
>> - "œuf" (egg in French) is transliterated into "uf" instead of the
>> more logical "oeuf"
>> - "Straße" is transliterated into "Strae" instead of "Strasse"
>> - "€" is transliterated into nothing (blank string)
>
> Agreed, it has absolutely no value unless you have an English
> application and want to get rid of 'those pesky unreadable
> characters'. Because sensible transliteration is dependent on both the
> source and destination locale it's really hard to solve. I think it
> shouldn't be included in Rails at all and should be solved in a
> separate library or gem.

I agree with you. While I may not agree with € => "EU", I think it would be nice if transliterate relied on Unidecode when the gem is present (very similar to the approach taken with the textilize helper). Basically it means "Rails is running in an environment that cares about these characters: rely on the gem".
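The "use the gem if it is there" idea could be sketched roughly as below. This is not a patch against Rails: transliterate_with_optional_unidecode is a hypothetical name, the sketch assumes that the unidecode gem adds a String#to_ascii method once it has been required, and the fallback branch is only a simple stand-in for the built-in behaviour.

```ruby
# Try to load the optional gem; carry on quietly if it is not installed.
begin
  require "unidecode"
rescue LoadError
  # unidecode is not available, the fallback below will be used
end

def transliterate_with_optional_unidecode(string)
  if string.respond_to?(:to_ascii)
    # Assumption: String#to_ascii is provided by the unidecode gem.
    string.to_ascii
  else
    # Stand-in fallback: split accents off their base letters, then
    # drop everything that is still not ASCII.
    string.unicode_normalize(:nfd).gsub(/[^[:ascii:]]/, "")
  end
end

puts transliterate_with_optional_unidecode("café")
```

Applications that do not install the gem keep the simple behaviour, and nothing in the framework has to depend on unidecode directly.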
> Agreed, it has absolutely no value unless you have an English
> application and want to get rid of 'those pesky unreadable
> characters'. Because sensible transliteration is dependent on both the
> source and destination locale it's really hard to solve. I think it
> shouldn't be included in Rails at all and should be solved in a
> separate library or gem.

When you step outside the Latin-derived languages, the transliterate code is even more problematic. 馬鹿 should return baka but returns '', and the same is true of any Korean, Thai or Cyrillic text.

This kind of thing is completely outside the scope of parameterize, and I can't imagine we'll ever get a decent solution baked into Rails.

For applications which care about this kind of thing, parameterize won't ever be a decent solution. They can and should just put the values into the URL like Wikipedia does. If there are bugs with routes matching non-ASCII values, we'll fix them.

--
Cheers

Koz
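For reference, putting the raw value into the URL comes down to percent-encoding its UTF-8 bytes and decoding them again on the way in. The sketch below uses only the Ruby standard library (ERB::Util.url_encode and URI.decode_www_form_component); the /articles/ path is a made-up example, not a route from any real application.

```ruby
require "erb"
require "uri"

title = "馬鹿"

# Percent-encode the UTF-8 bytes so the value is safe in a path segment.
encoded = ERB::Util.url_encode(title)
path = "/articles/#{encoded}" # hypothetical route, for illustration only

# Decoding the segment gives back the original string.
decoded = URI.decode_www_form_component(encoded)

puts path
puts decoded == title
```

Nothing is lost in the round trip, so no transliteration table is needed at all, and browsers generally display the decoded characters in the address bar.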