Hello,

With Rails 3.0.3:

"Café Noir ".strip => "Café noir"

but

"Café ".strip => "Caf\303\251"

In fact, strip() doesn't work if the last printable character is accented. Surprisingly, " écologie".strip works fine.

I've tried to dig deeper into the active_support multibyte source code but didn't find any solution.

Any help?

--
Posted via http://www.ruby-forum.com/.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
On 14 December 2010 16:22, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> Hello, with Rails 3.0.3
>
> "Café Noir ".strip => "Café noir"
> but
> "Café ".strip => "Caf\303\251"
>
> In fact, strip() doesn't work if the last printable character is
> accented.

Strange, I get:

$ rails console
Loading development environment (Rails 3.0.3)
ruby-1.9.2-p0 > "Café Noir ".strip
 => "Café Noir"
ruby-1.9.2-p0 > "Café ".strip
 => "Café"

Which Ruby are you using?

Colin
I don't see the 'é' in your code snippet. Did you try with real accented characters?

I'm using Ruby Enterprise Edition 1.8.x; I hadn't thought about a possible bug in Ruby itself. I might try a more recent 1.8 version or REE... I don't want to switch to 1.9 just for such a small (but annoying) problem...
On 15 December 2010 00:02, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> I don't see the 'é' in your code snippet. Did you try with real
> accented characters?

Is this in reply to my response? You have not quoted anything and have changed the subject line, so Gmail has not linked up the thread. If so, I don't understand what you mean when you say you do not see the accented character. Copying from my previous post:

$ rails console
Loading development environment (Rails 3.0.3)
ruby-1.9.2-p0 > "Café Noir ".strip
 => "Café Noir"
ruby-1.9.2-p0 > "Café ".strip
 => "Café"

I see the accented character é. It is interesting, though, that the é in your mail looks different from the one here, even though I have just copied and pasted it from your email into mine. It does look like yours when I paste it into the Ruby console. What happens if you copy it from here and use it in your console?

> I'm using Ruby Enterprise Edition 1.8.x; I hadn't thought about a
> possible bug in Ruby itself. I might try a more recent 1.8 version or
> REE... I don't want to switch to 1.9 just for such a small (but annoying)
> problem...

This is the result in 1.8.7:

$ ruby script/console
Loading development environment (Rails 2.3.2)
ruby-1.8.7-p302 > "Café Noir ".strip
 => "Café Noir"
ruby-1.8.7-p302 > "Café ".strip
 => "Café"
ruby-1.8.7-p302 >

Of course, maybe your response was not to my mail at all, in which case I have been wasting my time.

Colin
Frederick Cheung
2010-Dec-15 10:27 UTC
Re: UTF-8 String.strip bug (and several over methods)
On Dec 14, 4:22 pm, Bob Mundane <li...-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> Hello, with Rails 3.0.3
>
> "Café Noir ".strip => "Café noir"
> but
> "Café ".strip => "Caf\303\251"

While it may not look pretty, this is accurate if you are using UTF-8: é is 0xC3 0xA9 in UTF-8, which is 0o303 0o251 in octal. I'm not sure why inspect is choosing to show the octal escape codes, but your string does contain the correct bytes. (Maybe there is some heuristic that tries to determine whether the string is UTF-8 and should be displayed as such, or whether it just contains random binary gunk.)

Fred

> In fact, strip() doesn't work if the last printable character is
> accented.
> Surprisingly, " écologie".strip works fine.
>
> I've tried to dig deeper into the active_support multibyte source code but
> didn't find any solution.
>
> Any help?
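To make the byte arithmetic in Fred's reply easy to check, here is a minimal sketch that lists each byte of the stripped string in both hex and octal. The helper name byte_table is hypothetical (it is not part of Ruby or Rails), and the sketch assumes the source file is saved as UTF-8.

```ruby
# encoding: utf-8

# Hypothetical helper: returns one [hex, octal] pair per byte of +str+,
# formatted the way the escapes appear in the thread (0xC3 and \303).
def byte_table(str)
  str.bytes.to_a.map { |b| [format("0x%02X", b), format("\\%03o", b)] }
end

stripped = "Café ".strip

# The last two rows are the UTF-8 encoding of "é".
byte_table(stripped).each { |hex, octal| puts "#{hex}  #{octal}" }
```

If the last two rows read 0xC3 \303 and 0xA9 \251, the trailing space is gone and the é is intact, so only the way the console displays the string differs.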
Colin Law wrote in post #968503:
> Is this in reply to my response? You have not quoted anything and
> have changed the subject line so gmail has not linked up the thread.

I'm posting through ruby-forum, so maybe something got mixed up during the process?

So, in your case, String.strip() does work correctly with both versions of Ruby. I really don't understand why it goes wrong for me. Maybe it is a bug in the REE code.
On 15 December 2010 11:09, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> Colin Law wrote in post #968503:
>> Is this in reply to my response? You have not quoted anything and
>> have changed the subject line so gmail has not linked up the thread.
>
> I'm posting through ruby-forum, so maybe something got mixed up during
> the process?
>
> So, in your case, String.strip() does work correctly with both versions
> of Ruby. I really don't understand why it goes wrong for me. Maybe it is a
> bug in the REE code.

Have you seen Fred's reply back in your original thread?

Colin
Colin Law wrote in post #968534:
> On 15 December 2010 11:09, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
>> of Ruby. I really don't understand why it goes wrong for me. Maybe it is a
>> bug in the REE code.
>
> Have you seen Fred's reply back in your original thread?
>
> Colin

Yes, I did. But I still don't see how to avoid having all my right-stripped strings garbled by octal escapes. I would like to avoid needing a regexp call to revert them to something readable.

Thanks
On 15 December 2010 13:54, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
> Colin Law wrote in post #968534:
>> On 15 December 2010 11:09, Bob Mundane <lists-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
>>> of Ruby. I really don't understand why it goes wrong for me. Maybe it is a
>>> bug in the REE code.
>>
>> Have you seen Fred's reply back in your original thread?
>>
>> Colin
>
> Yes, I did. But I still don't see how to avoid having all my right-stripped
> strings garbled by octal escapes. I would like to avoid needing a regexp
> call to revert them to something readable.

If I understand Fred correctly, there is nothing wrong with the string; it is just the display in the console that is wrong. Are you seeing the same thing when you show it on a web page?

Colin
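The point that only the display differs can be illustrated with a rough sketch. It assumes Ruby 1.9 or later, where every string carries an encoding tag; Ruby 1.8 and REE, which the original poster uses, have no such tag, so this shows the general idea and is not a reproduction of the 1.8 behaviour. The helper name as_binary is hypothetical.

```ruby
# encoding: utf-8

# Hypothetical helper: a copy of +str+ with identical bytes, but tagged
# as raw binary instead of UTF-8 (assumes Ruby 1.9+ for force_encoding).
def as_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

stripped = "Café ".strip
binary   = as_binary(stripped)

puts binary.inspect                             # escaped form: "Caf\xC3\xA9"
puts stripped.bytes.to_a == binary.bytes.to_a   # true: same bytes either way
puts stripped                                   # writes the raw bytes, no escapes
```

inspect is a debugging representation, and it is what the console uses to echo results. Output written with puts, or rendered into a view, carries the bytes themselves, so no regexp should be needed to undo the escapes as long as the page is served as UTF-8.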
Peter Vandenabeele
2010-Dec-16 17:00 UTC
Re: UTF-8 String.strip bug (and several over methods)
Frederick Cheung wrote in post #968521:
> On Dec 14, 4:22 pm, Bob Mundane <li...-fsXkhYbjdPsEEoCn2XhGlw@public.gmane.org> wrote:
>> Hello, with Rails 3.0.3
>>
>> "Café Noir ".strip => "Café noir"
>> but
>> "Café ".strip => "Caf\303\251"
>
> While it may not look pretty, this is accurate if you are using UTF-8:
> é is 0xC3 0xA9 in UTF-8, which is 0o303 0o251 in octal. I'm not sure
> why inspect is choosing to show the octal escape codes, but your string
> does contain the correct bytes. (Maybe there is some heuristic that tries
> to determine whether the string is UTF-8 and should be displayed as such,
> or whether it just contains random binary gunk.)
>
> Fred

I tried this in 3 different versions of Ruby, and the way it is rendered in irb is indeed different (and confusing):

ruby-1.8.7-p302 > "Caf\303\251"
 => "Caf\303\251"
...
ree-1.8.7-2010.02 > "Caf\303\251"
 => "Caf\303\251"
...
ruby-1.9.2-head > "Caf\303\251"
 => "Café"

@Bob, are you sure you use UTF-8 encoding for your web page?

HTH,

Peter