Multilingual Rails v0.6 is released!
Here is the changelog. Documentation and download at the homepage:
http://www.tuxsoft.se/oss/rails/multilingual

v0.6 - 2005-08-13

* String case-manipulation functions replaced with ruby-unicode
  equivalents (if ruby-unicode is installed): String#downcase,
  String#upcase and String#capitalize now fully handle Unicode.

* String normalization functions from ruby-unicode integrated into
  the String class:

    String#compose(compat=false)    # Unicode::compose(str) or Unicode::compose_compat(str)
    String#decompose(compat=false)  # Unicode::decompose(str) or Unicode::decompose_compat(str)
    String#normalize(compat=false)  # Unicode::normalize_C(str) or Unicode::normalize_KC(str)

  Unicode::normalize_D/KD are the same as str.decompose.compose (=D)
  and str.decompose(true).compose (=KD).
  All methods above have method!-equivalents for in-place manipulation.

* Multilingual Rails no longer uses ruby-locale! You may uninstall it. :)
  Instead we use precalculated .rb-files which are platform independent.
  Overloaded Time#strftime to use translated day/month-names.

* Support for pluralization of non-Germanic languages. Algorithm (not
  code) borrowed from ri18n and modified slightly.

* Some words (meant for interface buttons) are standardized and
  supposed to be available in all languages:
  :true, :false, :yes, :no, :on, :off, :next, :previous, :back,
  :forward, :skip, :cancel, :abort, :quit, :save, :discard, :reset,
  :delete, :edit

* Spanish translation added

* No longer overload the Kconv methods because they handle a few
  cases that Iconv doesn't.

My knowledge of Japanese and other Asian languages is VERY limited.
If you Japanese guys think that the way MLR integrates
compose/decompose/normalize right now is wrong/backwards, please tell
me right now so I can make a more natural-to-use implementation for
the next version.

Translators wanted!
Please look at lib/multilingual/data/README for information about how
to make translated ISO3166/639-files and use
lib/multilingual/locales/en.rb as a template default translation file
for your language, then send the files back to me!

// Per Wigren
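To make the compose/decompose entries in the changelog concrete, here is a minimal sketch of what composition and decomposition do to a string. It does not use MLR or ruby-unicode: it assumes a modern Ruby where String#unicode_normalize is built in, and the module name NormalizeSketch is hypothetical.

```ruby
# Sketch only: stands in for the String#compose / String#decompose
# methods described in the changelog, using Ruby's built-in
# String#unicode_normalize instead of the ruby-unicode library.
module NormalizeSketch
  module_function

  # Canonical composition (NFC), or compatibility composition (NFKC)
  # when compat is true.
  def compose(str, compat = false)
    str.unicode_normalize(compat ? :nfkc : :nfc)
  end

  # Canonical decomposition (NFD), or compatibility decomposition (NFKD)
  # when compat is true.
  def decompose(str, compat = false)
    str.unicode_normalize(compat ? :nfkd : :nfd)
  end
end

decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
composed   = NormalizeSketch.compose(decomposed)
puts composed.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
```

The compat flag matters for characters such as the "fi" ligature (U+FB01), which canonical composition leaves alone but compatibility composition expands to the two letters "f" and "i".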
I just noticed that MLR v0.6 doesn't run unless the constant SITE_ROOT
is set, so as a temporary hack you can add:

  SITE_ROOT ||= RAILS_ROOT

in your environment.rb before the require-line.

// Per
Hi,

On Sat, 13 Aug 2005 17:22:34 +0200
Per Wigren <tuxie-0tmT37jx+ZsOP4wsBPIw7w@public.gmane.org> wrote:

> Multilingual Rails v0.6 is released!
> Here is the changelog. Documentation and download at the homepage:
> http://www.tuxsoft.se/oss/rails/multilingual

Pluralization seems better. But not still enough for me.

In Japanese, sometimes we use number "2"
But, in some cases, we want to use number "1".
It depends on the context.

So we need to be able to choose them.

Also it's better "character encoding" is selectable.
UTF-8 has some problems in Japan. So we prefer to use
other encoding such as EUC-JP, Shift_JIS.

--
.:% Masao Mutoh<mutoh-+e5RZkbjevhHfZP73Gtkiw@public.gmane.org>
On 14 Aug 2005, at 04:16, Masao Mutoh wrote:

> Pluralization seems better. But not still enough for me.
>
> In Japanese, sometimes we use number "2"
> But, in some cases, we want to use number "1".
> It depends on the context.
>
> So we need to be able to choose them.

I'm not sure exactly what you mean here. Can you give me some more
details? Does ri18n have the same problem?

> Also it's better "character encoding" is selectable.
> UTF-8 has some problems in Japan. So we prefer to use
> other encoding such as EUC-JP, Shift_JIS.

Are you sure the problems aren't with your text editor as opposed to
the UTF-8 encoding? :)

Do you have an example file that is no longer the same if encoded
from EUC-JP to UTF-8 back to EUC-JP?

Using Japanese-specific encodings is against the multilingual nature
of MLR because they can only represent Japanese (and maybe ASCII), so
if you use EUC-JP for your database you won't be able to store data in
non-Japanese character sets.

Anyway, if you have concrete ideas how to implement it, please tell!
Preferably in the .diff-language. :)

// Per
Hi,

On Sun, 14 Aug 2005 05:41:08 +0200
Per Wigren <tuxie-0tmT37jx+ZsOP4wsBPIw7w@public.gmane.org> wrote:

> On 14 Aug 2005, at 04:16, Masao Mutoh wrote:
>
> > Pluralization seems better. But not still enough for me.
> >
> > In Japanese, sometimes we use number "2"
> > But, in some cases, we want to use number "1".
> > It depends on the context.
> >
> > So we need to be able to choose them.
>
> I'm not sure exactly what you mean here.. Can you give me some
> more details? Does ri18n have the same problem?

Sorry, but it's difficult for me to examine it unless Japanese.

I don't know ri18n well.
But with GetText and Ruby-GetText, the translators can set the rule
in *.po file.

> > Also it's better "character encoding" is selectable.
> > UTF-8 has some problems in Japan. So we prefer to use
> > other encoding such as EUC-JP, Shift_JIS.
>
> Are you sure the problems aren't with your texteditor as opposed
> to the UTF-8 encoding? :)
>
> Do you have an example file that is
> no longer the same if encoded from EUC-JP to UTF-8 back to EUC-JP?
>
> Using Japanese-specific encodings is against the multilingual
> nature of MLR because it can only represent Japanese (and maybe
> ASCII) so if you use EUC-JP for your database you won't be able
> to store data in non-japanese charactersets.

I don't think I tell the details here.
It's very complex problem, and I don't know I can tell you them
correctly in English.
So, I said, "selectable is better" like ruby.

--
.:% Masao Mutoh<mutoh-+e5RZkbjevhHfZP73Gtkiw@public.gmane.org>
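For readers unfamiliar with the gettext approach mentioned above: a *.po file carries a per-language rule that maps a count to the index of a plural form, so each language can have one, two, or more forms. The following is a rough sketch of that idea in plain Ruby. It is not MLR, ri18n, or Ruby-GetText code; PLURAL_RULES and plural_form are hypothetical names, and the rules follow the conventional gettext Plural-Forms expressions for these languages.

```ruby
# Sketch only: per-locale plural rules in the spirit of gettext's
# Plural-Forms header. Each rule maps a count to a form index.
PLURAL_RULES = {
  # English: two forms, singular only for exactly 1.
  "en" => ->(n) { n == 1 ? 0 : 1 },
  # Japanese: a single form, whatever the count.
  "ja" => ->(_n) { 0 },
  # Polish: three forms.
  "pl" => lambda do |n|
    if n == 1
      0
    elsif (2..4).cover?(n % 10) && !(12..14).cover?(n % 100)
      1
    else
      2
    end
  end
}.freeze

# Pick the translated form for a count. `forms` is the list of forms
# supplied by the translator, ordered by form index.
def plural_form(locale, count, forms)
  forms.fetch(PLURAL_RULES.fetch(locale).call(count))
end

puts plural_form("en", 3, %w[file files])
puts plural_form("pl", 22, %w[plik pliki plików])
```

Because the rule lives next to the translation, a translator can choose how many forms the language needs without any change to the application code.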
Mutoh-San,

I believe Per would like to help, but he needs your help in describing
the problem. Perhaps a friend could translate for you? Better
understanding leads to better solutions.

-San

Masao Mutoh wrote:
> Hi,
>
> On Sun, 14 Aug 2005 05:41:08 +0200
> Per Wigren <tuxie-0tmT37jx+ZsOP4wsBPIw7w@public.gmane.org> wrote:
>
>>On 14 Aug 2005, at 04:16, Masao Mutoh wrote:
>>
>>>Pluralization seems better. But not still enough for me.
>>>
>>>In Japanese, sometimes we use number "2"
>>>But, in some cases, we want to use number "1".
>>>It depends on the context.
>>>
>>>So we need to be able to choose them.
>>
>>I'm not sure exactly what you mean here.. Can you give me some
>>more details? Does ri18n have the same problem?
>
> Sorry, but it's difficult for me to examine it unless Japanese.
>
> I don't know ri18n well.
> But with GetText and Ruby-GetText, the translators can set the rule
> in *.po file.
>
>>>Also it's better "character encoding" is selectable.
>>>UTF-8 has some problems in Japan. So we prefer to use
>>>other encoding such as EUC-JP, Shift_JIS.
>>
>>Are you sure the problems aren't with your texteditor as opposed
>>to the UTF-8 encoding? :)
>>
>>Do you have an example file that is
>>no longer the same if encoded from EUC-JP to UTF-8 back to EUC-JP?
>>
>>Using Japanese-specific encodings is against the multilingual
>>nature of MLR because it can only represent Japanese (and maybe
>>ASCII) so if you use EUC-JP for your database you won't be able
>>to store data in non-japanese charactersets.
>
> I don't think I tell the details here.
> It's very complex problem, and I don't know I can tell you them
> correctly in English.
> So, I said, "selectable is better" like ruby.
Hi.

> > Also it's better "character encoding" is selectable.
> > UTF-8 has some problems in Japan. So we prefer to use
> > other encoding such as EUC-JP, Shift_JIS.
>
> Are you sure the problems aren't with your texteditor as opposed
> to the UTF-8 encoding? :)

For example, in Japan, most of the browsers in portable phone
can handle only Shift_JIS. So, if you choose UTF-8 as ja_JP locale,
we can't use Rails in order to service to portable phones.

In your MLR, this ruby script is acceptable as translation file?

  string 'foobarbaz'
    to :"ja_JP.eucjp", "xxxx"
    to :"ja_JP.utf8", "yyyy"
    to :"ja_JP.sjis", "zzzz"
  end

If MLR can accept it, we may use Japanese encodings other than UTF-8.

> Do you have an example file that is
> no longer the same if encoded from EUC-JP to UTF-8 back to EUC-JP?

I have already suggested the site where the problem of
Shift_JIS/EUC-JP/iso-2022-jp/UCS conversion is explained.
http://www.miraclelinux.com/english/technet/samba30/iconv_issues.html
As I said, some of the problems explained here exists now.

Also, you can see the list of characters that are not the same
if encoded from Shift_JIS to Unicode back to Shift_JIS.

> There are some codes that are not matched one-to-one
> between Shift-JIS (Japanese character set supported by MS) and Unicode.
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559

We, Japanese, have difficulty to encode Unicode
from/to other Japanese encodings.

> Using Japanese-specific encodings is against the multilingual
> nature of MLR because it can only represent Japanese (and maybe
> ASCII)

This idea of MLR is against the M17N Ruby.
If possible, please tell me why you use UTF-8
as internal character encoding against M17N Ruby?

****
NISHIO Mizuho
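The round-trip question raised above can be checked mechanically. The sketch below is one possible way to do it, assuming a modern Ruby with built-in transcoding (String#encode) rather than the Iconv library discussed in this thread; round_trips? is a hypothetical helper name. Which byte sequences fail depends on the conversion tables of the converter in use, so no particular failing result is claimed here.

```ruby
# Sketch only: report whether a legacy-encoded byte string survives a
# trip to UTF-8 and back unchanged.
def round_trips?(bytes, legacy_encoding, via: "UTF-8")
  original = bytes.dup.force_encoding(legacy_encoding)
  back = original.encode(via).encode(legacy_encoding)
  back.bytes == original.bytes
rescue EncodingError
  # Covers undefined conversions and invalid byte sequences.
  false
end

# Ordinary text is expected to survive the trip.
sjis = "日本語".encode("Shift_JIS")
puts round_trips?(sjis.b, "Shift_JIS")

# Candidate problem case: 0x87 0x9A is a vendor-extension code in CP932
# (called Windows-31J in Ruby). Whether it comes back as the same bytes
# depends on the converter's mapping tables.
puts round_trips?("\x87\x9A".b, "Windows-31J")
```

Running a check like this over every two-byte code of an encoding would produce a list comparable to the one in the Microsoft article linked above, but specific to the converter actually deployed.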
NISHIO Mizuho wrote:
> This idea of MLR is against the M17N Ruby.
> If possible, please tell me why you use UTF-8
> as internal character encoding against M17N Ruby?

I wanted to chime in on this with my 2 cents. I can think of 2 things:

1) As far as I understand it, M17N is not in the current Ruby 1.8.x.
Rails is not compatible with Ruby > 1.8.x so it cannot be supported,
*yet*. There is not even a release date for Ruby 2.0 yet...

2) There is no support for M17N in databases, which are the most
important data store in Rails and web applications in general. This
means that you cannot use it to save your data into MySQL, for example.
This will probably not change for a loooooong time. So another solution
is needed.

So far the solution to use UTF-8 as an internal encoding mechanism is
the one with the fewest problems for the overall situation of character
encodings. First and foremost, Rails (and I suspect Nitro and Wee and
others) are web application frameworks that run server-side and serve
clients that are browsers and aggregators. These are must-haves and
everything else is nice to have.

And, as Per already mentioned multiple times: if you have a solution,
then just send in a patch ;)

Sascha Ebach
>>> Also it's better "character encoding" is selectable.
>>> UTF-8 has some problems in Japan. So we prefer to use
>>> other encoding such as EUC-JP, Shift_JIS.
>>
>> Are you sure the problems aren't with your texteditor as opposed
>> to the UTF-8 encoding? :)
>
> For example, in Japan, most of the browsers in portable phone
> can handle only Shift_JIS. So, if you choose UTF-8 as ja_JP locale,
> we can't use Rails in order to service to portable phones.

  after_filter { |c| c.response.body.iconv_to!('shift-jis') if Locale.current =~ /^ja/ }

> Also, you can see the list of characters that are not the same
> if encoded from Shift_JIS to Unicode back to Shift_JIS.
>
>> There are some codes that are not matched one-to-one
>> between Shift-JIS (Japanese character set supported by MS) and Unicode.
> http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559
>
> We, Japanese, have difficulty to encode Unicode
> from/to other Japanese encodings.

From what I've heard, Unicode uses the same codes for some Chinese and
Japanese kanji characters that do not look exactly the same, but if you
specify <html lang="ja" xml:lang="ja"> the Japanese version will be
used by the browser. Correct me if I'm wrong.

>> Using Japanese-specific encodings is against the multilingual
>> nature of MLR because it can only represent Japanese (and maybe
>> ASCII)
>
> This idea of MLR is against the M17N Ruby.
> If possible, please tell me why you use UTF-8
> as internal character encoding against M17N Ruby?

Maybe because M17N isn't released yet? AFAIK it doesn't even have a
release date set...

// Per
Hi.

> >>> Also it's better "character encoding" is selectable.
> >>> UTF-8 has some problems in Japan. So we prefer to use
> >>> other encoding such as EUC-JP, Shift_JIS.
> >>
> >> Are you sure the problems aren't with your texteditor as opposed
> >> to the UTF-8 encoding? :)
> >
> > For example, in Japan, most of the browsers in portable phone
> > can handle only Shift_JIS. So, if you choose UTF-8 as ja_JP locale,
> > we can't use Rails in order to service to portable phones.
>
> after_filter { |c| c.response.body.iconv_to!('shift-jis') if
> Locale.current =~ /^ja/ }

Maybe, it is better to use iconv_to!('cp932') if iconv supports cp932.
If not, I will use Kconv.tosjis().

In addition to it, before_filter is needed.

  before_filter :set_charset

  protected
  def set_charset
    # EUC-JP
    @headers['Content-Type'] = 'text/html; charset=Shift_JIS'
  end

Strictly speaking, we, Japanese, don't use Shift_JIS.
We often use CP932 as Shift_JIS.
There are differences between the two character encodings.

> > Also, you can see the list of characters that are not the same
> > if encoded from Shift_JIS to Unicode back to Shift_JIS.
> >
> >> There are some codes that are not matched one-to-one
> >> between Shift-JIS (Japanese character set supported by MS) and Unicode.
> > http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559
> >
> > We, Japanese, have difficulty to encode Unicode
> > from/to other Japanese encodings.
>
> From what I've heard Unicode use the same codes for some chinese and
> japanese kanji-characters that does not look exactly similar, but if
> you specify <html lang="ja" xml:lang="ja"> the Japanese version will
> be used by the browser. Correct me if I'm wrong.

To tell the truth, I am not familiar with CJK unification.
And, I don't see the problem of CJK unification.
But, CJK unification is one of the problems of Unicode.
The CJK unification problem may be solved by variation selector.
Version 3.2 of the Unicode Standard defines variation selector.

Other than CJK unification, Unicode has some problems.
Conversion table of Unicode is variable.
So, for example, depending on implementation of iconv(),
the result of "xxxx(utf-8 string)".to_iconv('shift-jis')
is changed even if the iconv() supports 'shift-jis'.

> >> Using Japanese-specific encodings is against the multilingual
> >> nature of MLR because it can only represent Japanese (and maybe
> >> ASCII)
> >
> > This idea of MLR is against the M17N Ruby.
> > If possible, please tell me why you use UTF-8
> > as internal character encoding against M17N Ruby?
>
> Maybe because M17N isn't released yet? AFAIK it doesn't even have a
> release date set...

OK. As you said in the past mail, you use UTF-8 as internal encoding
until M17N Ruby is released.
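The two steps discussed in this exchange, converting the UTF-8 body to CP932 and announcing the charset in the Content-Type header, can be sketched without Rails or MLR as follows. This is only an illustration under stated assumptions: it uses modern Ruby's String#encode instead of Iconv or Kconv, prepare_mobile_response and the two constants are hypothetical names, and replacing unconvertible characters with "?" is one possible policy, not something proposed in the thread.

```ruby
# Sketch only: framework-free version of the after_filter/before_filter
# pair discussed above.
MOBILE_CHARSET  = "Shift_JIS"    # label announced to the client
MOBILE_ENCODING = "Windows-31J"  # Ruby's name for CP932

# Returns the converted body and a copy of the headers with the
# Content-Type set. Characters that CP932 cannot represent become "?".
def prepare_mobile_response(body, headers)
  converted = body.encode(MOBILE_ENCODING,
                          invalid: :replace, undef: :replace, replace: "?")
  content_type = "text/html; charset=#{MOBILE_CHARSET}"
  [converted, headers.merge("Content-Type" => content_type)]
end

body, headers = prepare_mobile_response("こんにちは", {})
puts body.encoding
puts headers["Content-Type"]
```

Keeping the conversion at the response boundary leaves UTF-8 as the internal encoding, which is the arrangement the thread settles on until M17N Ruby is available.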