Hi,

we are using Rails with UTF-8. Unfortunately, Ruby's UTF-8 support isn't perfect yet: length/size on a UTF-8 string returns the number of bytes in the string, not the number of characters. This makes validates_length_of give wrong results for values containing non-ASCII characters.

How should one work around this?

I think I could
a) patch validations.rb to use value.jlength (from jcode) instead of value.size
b) do the same as a) without modification of Rails by overwriting the validates_length_of method in some common superclass of our models
c) overwrite String's size method to use jlength (size can be overwritten, length cannot, since jlength uses length)

a and b are basically the same change made in different ways (the first has the disadvantage that I have to redo the change whenever I update Rails, the second has the disadvantage that I'll have my own validates_length_of method). c is more general and might fix or worsen things in other places.

Currently I'm favouring b.

Are there any other possibilities? How do others handle this problem?

Morus
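To make the discrepancy concrete, here is a minimal sketch of the byte count versus the character count for one UTF-8 value. The helper name utf8_char_count is hypothetical (it is not part of Rails or jcode), and decoding with unpack('U*') is just one assumed way to count characters; it is not the jlength approach mentioned above.

```ruby
# Hypothetical helper, not part of Rails or jcode.
# Counts UTF-8 characters by decoding the string into code points.
def utf8_char_count(str)
  str.unpack('U*').length
end

name = "M\303\274ller"                  # "Müller" written as raw UTF-8 bytes

byte_count = name.unpack('C*').length   # number of bytes in the string
char_count = utf8_char_count(name)      # number of characters

puts "bytes: #{byte_count}, characters: #{char_count}"
```

The "ü" occupies two bytes, so the byte count is one higher than the character count, and a maximum length of 6 would reject this six-character name if the check counts bytes.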
On 24-okt-2005, at 10:22, Morus Walter wrote:

> How should one work around this?
>
> I think I could
> a) patch validations.rb to use value.jlength (from jcode) instead
> of value.size
> b) do the same as a) without modification of rails by overwriting the
> validates_length_of method in some common superclass of our models
> c) overwrite strings size method to use jlength (size can be
> overwritten, length not, since jlength uses length)
>
> a and b are basically the same change in different ways

do a) and wrap it into a patch so that we all can use it

--
Julian "Julik" Tarkhanov
On Mon, 24 Oct 2005 13:52:28 +0200
Julian 'Julik' Tarkhanov <listbox-RY+snkucC20@public.gmane.org> wrote:

> > How should one work around this?
> >
> > I think I could
> > a) patch validations.rb to use value.jlength (from jcode) instead
> > of value.size
>
> do a) and wrap it into a patch so that we all can use it

Hmm. How would you do that in a way that doesn't require everyone (including those not using UTF-8) to use jcode?

If you look at jcode's implementation of jlength, it's something you don't really want to have unless there's no alternative (it replaces all non-ASCII characters by blanks and calls length on the result).

Morus
On 25-okt-2005, at 10:43, Morus Walter wrote:

> On Mon, 24 Oct 2005 13:52:28 +0200
> Julian 'Julik' Tarkhanov <listbox-RY+snkucC20@public.gmane.org> wrote:
>
>>> How should one work around this?
>>>
>>> I think I could
>>> a) patch validations.rb to use value.jlength (from jcode) instead
>>> of value.size
>>
>> do a) and wrap it into a patch so that we all can use it
>
> hmm. How would you do that in a way that doesn't require everyone
> (that is those, not using utf8) to use jcode?
>
> If you look at jcode's implementation of jlength, it's something
> you don't really want to have unless there's no alternative (it
> replaces all non-ascii characters by blanks and calls length
> on the result).

I know that Unicode in Ruby sucks big big time. It's my pain as well. As for your solution: you can check $KCODE. If it's set to 'u' (which Ruby reports back as 'UTF8'), then you can act like this:

    ($KCODE == 'UTF8' and str.respond_to?(:jsize)) ? str.jsize : str.size

I believe Shugo Maeda had some more info in his blog on the ways jcode is bad.

--
Julian "Julik" Tarkhanov
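A rough sketch of how such a conditional count could feed a length check so that non-UTF-8 users keep the plain byte count. Everything here is an assumption for illustration: validation_length and length_ok? are hypothetical names, the utf8 flag stands in for the $KCODE test above, and unpack('U*') replaces jsize so the sketch does not depend on jcode. It is not the code in validations.rb.

```ruby
# Hypothetical helpers for illustration; not Rails or jcode API.

# Counts characters when utf8 is true, raw bytes otherwise.
# The utf8 flag is a stand-in for the $KCODE == 'UTF8' check.
def validation_length(value, utf8)
  utf8 ? value.unpack('U*').length : value.unpack('C*').length
end

# Simplified stand-in for the range check a length validation performs.
def length_ok?(value, range, utf8)
  range.include?(validation_length(value, utf8))
end

name = "M\303\274ller"                       # "Müller" as raw UTF-8 bytes

ok_as_utf8  = length_ok?(name, 1..6, true)   # counted as 6 characters
ok_as_bytes = length_ok?(name, 1..6, false)  # counted as 7 bytes

puts "UTF-8 aware: #{ok_as_utf8}, byte based: #{ok_as_bytes}"
```

With the flag set, the six-character name fits a 1..6 limit; without it, the same value is measured as seven bytes and falls outside the range, which is the behaviour the original question describes.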