Ruby's String methods assume one byte per character, which, as you
know, is not the case with Unicode strings. A multibyte character in
your string therefore throws off any index or length you pass in,
because those count bytes, not characters. Such is the nature of Ruby.
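
To make the byte/character mismatch concrete, here is a small sketch. The
helper names byte_count and char_count are made up for this example; the
only library call is String#unpack, where "C*" reads raw bytes and "U*"
decodes UTF-8 code points.

```ruby
# Hypothetical helpers for illustration; not part of Ruby or Rails.

# Number of raw bytes in the string.
def byte_count(text)
  text.unpack("C*").length
end

# Number of Unicode code points, decoding the bytes as UTF-8.
def char_count(text)
  text.unpack("U*").length
end

word = "caf\xC3\xA9"   # "café" written out as raw UTF-8 bytes

puts "bytes: #{byte_count(word)}, characters: #{char_count(word)}"
# prints "bytes: 5, characters: 4"
```

Any slice that lands between the 4th and 5th byte cuts the last character
in half, which is where the weird output comes from.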
As a starting point, I suggest you check out:
http://wiki.rubyonrails.org/rails/pages/HowToUseUnicodeStrings
http://julik.textdriven.com/svn/tools/rails_plugins/unicode_hacks
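
If you only need the truncation, something along these lines may be
enough. This is a rough sketch, not code taken from either of the links
above: truncate_utf8 is a made-up name, and it assumes the input is valid
UTF-8. It counts code points, so a letter followed by a combining accent
still counts as two.

```ruby
# Hypothetical helper for illustration; not part of Ruby or Rails.
# Truncates by Unicode code points instead of bytes, assuming the
# input string is valid UTF-8.
def truncate_utf8(text, max_chars, suffix = "...")
  codepoints = text.unpack("U*")              # decode UTF-8 into code points
  return text if codepoints.length <= max_chars
  codepoints[0, max_chars].pack("U*") + suffix  # re-encode the first max_chars
end

puts truncate_utf8("very long string in utf8", 8)
# prints "very lon..."
```

Because the cut is made on the array of code points and only then packed
back into a string, it can never land in the middle of a multibyte
character.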
Chris
On 8/29/06, harper
<rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>
wrote:
> Hi,
>
> there is a problem i've been trying to solve for a couple of hours now,
> and after some useless googling and searching around, i haven't come up
> with anything substantial - i thought the forum might help me.
>
> say i have a string and want to display only the first 10 characters or
> so:
>
> shortstring = "this is a very long string object"[0..10]
> # shortstring == "this is a v" (0..10 is inclusive) # which is great
>
> but if i use the same method on a utf8 string, i get some weird
> characters popping in there, sometimes yes, sometimes no. from looking
> around it seems that because every character is two bytes (as opposed to
> 1 in regular encoding) there is sometimes a sum of odd/even characters,
> and then the [0..10] doesn't work correctly, producing weird
> characters. (same deal goes for the String#slice method)
>
> the final result i need, in essence, is this:
>
> "very long string in utf8" to become
> "very lon..."
>
> without weird characters.
> any help, much appreciated.
>
> thanks,
> harp
>
> --
> Posted via http://www.ruby-forum.com/.
>