Hi all, I''m trying to store the following in a mysql DB but am having no luck: 1. Pär 2. Björn 3. gösta (hopefully the encodings came out correct in this message) I''ve looked into Ruby''s Unicode library and have tried things such as: Unicode.normalize_D(nickname) Unicode.normalize_KD(nickname) Unicode.normalize_C(nickname) Unicode.normalize_KC(nickname) All those methods either seem to strip out the European characters or replace them with their English equivalents. All I want to do is store the text in its original form. If I try to store the string as it currently is, I get question marks showing up in all my views where the European characters are. Does anyone have any ideas on how to accomplish such a thing? Thanks! -Dave Hoefler --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
> I'm trying to store the following in a mysql DB but am having no luck: > > 1. Pr > 2. Bjrn > 3. gsta<...>> If I try to store the string as it currently is, I get question marks > showing up in all my views where the European characters are. > Does anyone have any ideas on how to accomplish such a thing?It may well be, that your characters are ok in DB, but there is different charset specified in your HTTP headers. Check what headers server is sending (you can do it with the help of LiveHTTPHeaders or Firebug extensions for Firefox). Or the simplest test - check out, what menu item is checked on View-Character Encoding menu in your browser. If UTF is not checked, try to check it and see if characters are OK. If yes, that means you have to fix HTTP headers. If no, then problems may indeed be in the database. Regards, Rimantas -- http://rimantas.com/ --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk@googlegroups.com To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Thanks Rimantas, I checked out the headers and they are sending: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*,q=0.7 My database is using ISO Latin 1. I failed to mention before that I'm loading a ruby script of users in a text file and executing it from the terminal. If I were to enter Unicode characters through a form in my Rails view then everything works fine. "Bjorn" is properly saved as "Bj\303\266rn". It's just loading the text file and looping through the records that seems to be the problem. The important parts of the script: $KCODE = 'u' require 'jcode' require 'unicode' arr = IO.readlines(member_file) # member_file is actually a text file member = Member.new(nickname) member.save That obviously happens within a loop. I just wanted to uber-simplify it. Still kind of confused... -Dave On 1/2/07, Rimantas Liubertas <rimantas@gmail.com> wrote:> > > I'm trying to store the following in a mysql DB but am having no luck: > > > > 1. P r > > 2. Bj rn > > 3. g sta > <...> > > If I try to store the string as it currently is, I get question marks > > showing up in all my views where the European characters are. > > Does anyone have any ideas on how to accomplish such a thing? > > It may well be, that your characters are ok in DB, but there is > different charset specified in > your HTTP headers. Check what headers server is sending (you can do it > with the help of LiveHTTPHeaders or Firebug extensions for Firefox). > Or the simplest test - check out, what > menu item is checked on View-Character Encoding menu in your browser. > If UTF is not checked, try to check it and see if characters are OK. > If yes, that means you have to fix HTTP headers. If no, then problems > may indeed be in the database. > > Regards, > Rimantas > -- > http://rimantas.com/ > > > >--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk@googlegroups.com To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On 1/2/07, Dave Hoefler <dhoefler-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:> Thanks Rimantas,> My database is using ISO Latin 1.You might want to change that to UTF-8. Your accented unicode characters (western european) use a single byte under latin-1 and 2 under utf-8. Every link in the unicode chain (the round-trip) has to be fixed. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
> > I checked out the headers and they are sending: > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*,q=0.7These are sent by browser, how does server response look like? Look for the 'Content-type' header.> > My database is using ISO Latin 1.<...> This may or may not be the problem. It is possible (but not advisable) to keep utf data in database tables with Latin 1 encoding. However it would be better to convert your tables (and data) to UTF, then make sure that you have line 'encoding: utf8' in your config/database.yml This part is confusing, so I'll try to describe some scenarios: a) Database's encoding is set as Latin 1 (default) and there is no encoding line in config/database.yml. This will work, at least data can be saved/retrieved and displayed correctly in the browser - given the HTTP headers are right. b) Database's encoding is Latin 1, but database.yml sets encoding to UTF. This will not work, because that line makes database client to issue "SET NAMES UTF8" when connecting to the database. In this case MySQL will try to convert data which it thinks is in Latin 1 to UTF. Alas, data is already UTF, but wrong encoding was specifiedwhat you get is useless conversion and character junk. c) Like b), but in this case database is set to UTF, and data is in UTF, but there is no encoding line in database.yml. In this case connection's encoding will be Latin 1 and MySQL will try to convert data from UTF to Latin 1. This will mess things, assuming HTTP headers/meta tag says that encoding is UTF. So what would I do in your situation: 1) Try to change view-encoding to UTF-8 in the browser. If that fixes how characters are displayed, then, most likely, HTTP headers are messed up. 2) Check 'Content-type' header of the server response. If it says anything else than UTF - that must be fixed. 3) When headers are ok and page is displayed the way it should - convert tables and data in DB to UTF and then add encoding: utf8 to config/database.yml 4) Enjoy. Of course other scenarios are likely too, which would require additional exploration. Regards, Rimantas -- http://rimantas.com/ --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk@googlegroups.com To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Thank you very much Rimantas for that explanation. That was just what I needed. I will try your suggestions. -Dave On 1/2/07, Rimantas Liubertas <rimantas@gmail.com> wrote:> > > > > I checked out the headers and they are sending: > > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*,q=0.7 > > These are sent by browser, how does server response look like? > Look for the 'Content-type' header. > > > > > My database is using ISO Latin 1. > <...> > > This may or may not be the problem. It is possible (but not advisable) > to keep utf data in > database tables with Latin 1 encoding. > However it would be better to convert your tables (and data) to UTF, > then make sure that you > have line 'encoding: utf8' in your config/database.yml > > This part is confusing, so I'll try to describe some scenarios: > a) Database's encoding is set as Latin 1 (default) and there is no > encoding line in config/database.yml. This will work, at least data > can be saved/retrieved and displayed correctly > in the browser - given the HTTP headers are right. > > b) Database's encoding is Latin 1, but database.yml sets encoding to > UTF. This will not work, because that line makes database client to > issue "SET NAMES UTF8" when connecting to the database. In this case > MySQL will try to convert data which it thinks is in Latin 1 to UTF. > Alas, data is already UTF, but wrong encoding was specified what you > get is useless conversion and character junk. > > c) Like b), but in this case database is set to UTF, and data is in > UTF, but there is no encoding line in database.yml. In this case > connection's encoding will be Latin 1 and MySQL will try to convert > data from UTF to Latin 1. This will mess things, assuming HTTP > headers/meta tag says that encoding is UTF. > > So what would I do in your situation: > 1) Try to change view-encoding to UTF-8 in the browser. If that fixes > how characters are displayed, then, most likely, HTTP headers are > messed up. > 2) Check 'Content-type' header of the server response. If it says > anything else than UTF - that must be fixed. > 3) When headers are ok and page is displayed the way it should - > convert tables and data in DB to UTF and then add encoding: utf8 to > config/database.yml > 4) Enjoy. > > Of course other scenarios are likely too, which would require > additional exploration. > > Regards, > Rimantas > -- > http://rimantas.com/ > > > >--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk@googlegroups.com To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
On Jan 2, 2007, at 3:34 PM, Dave Hoefler wrote:> Thank you very much Rimantas for that explanation. That was just > what I needed. I will try your suggestions. > > -Dave > > On 1/2/07, Rimantas Liubertas <rimantas-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > I checked out the headers and they are sending: > > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*,q=0.7 > > These are sent by browser, how does server response look like? > Look for the ''Content-type'' header. > > > > > My database is using ISO Latin 1. > <...> > > This may or may not be the problem. It is possible (but not advisable) > to keep utf data in > database tables with Latin 1 encoding. > However it would be better to convert your tables (and data) to UTF, > then make sure that you > have line ''encoding: utf8'' in your config/database.yml > > This part is confusing, so I''ll try to describe some scenarios: > a) Database''s encoding is set as Latin 1 (default) and there is no > encoding line in config/database.yml. This will work, at least data > can be saved/retrieved and displayed correctly > in the browser - given the HTTP headers are right. > > b) Database''s encoding is Latin 1, but database.yml sets encoding to > UTF. This will not work, because that line makes database client to > issue "SET NAMES UTF8" when connecting to the database. In this case > MySQL will try to convert data which it thinks is in Latin 1 to UTF. > Alas, data is already UTF, but wrong encoding was specified what you > get is useless conversion and character junk. > > c) Like b), but in this case database is set to UTF, and data is in > UTF, but there is no encoding line in database.yml. In this case > connection''s encoding will be Latin 1 and MySQL will try to convert > data from UTF to Latin 1. This will mess things, assuming HTTP > headers/meta tag says that encoding is UTF. > > So what would I do in your situation: > 1) Try to change view-encoding to UTF-8 in the browser. If that fixes > how characters are displayed, then, most likely, HTTP headers are > messed up. > 2) Check ''Content-type'' header of the server response. If it says > anything else than UTF - that must be fixed. > 3) When headers are ok and page is displayed the way it should - > convert tables and data in DB to UTF and then add encoding: utf8 to > config/database.yml > 4) Enjoy. > > Of course other scenarios are likely too, which would require > additional exploration. > > Regards, > Rimantas > -- > http://rimantas.com/ >This response is someone late, but if you''re using migrations, here''s one to put your MySql tables and database to UTF8: class UseUtf8AsDefaultCharset < ActiveRecord::Migration def self.my_number_is say File.basename(__FILE__).to_i end def self.database_charset(charset=''utf8'') execute "ALTER DATABASE `#{current_database}` DEFAULT CHARACTER SET #{charset}" end def self.table_charset(table, charset=''utf8'') execute "ALTER IGNORE TABLE `#{table}` CONVERT TO CHARACTER SET # {charset}, DEFAULT CHARACTER SET #{charset}" end def self.up my_number_is database_charset(''utf8'') select_values("SHOW TABLES").each do |table| table_charset(table, ''utf8'') end end def self.down my_number_is database_charset(''latin1'') select_values("SHOW TABLES").each do |table| table_charset(table, ''latin1'') end end end Note that the down migration puts the tables back to their "original" character set which, in my case, was latin1. -Rob Rob Biedenharn http://agileconsultingllc.com Rob-xa9cJyRlE0mWcWVYNo9pwxS2lgjeYSpx@public.gmane.org --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
Thanks Rob! That will come in handy. -Dave On 1/7/07, Rob Biedenharn <Rob-GBZH0y1GwQfnZcttdmLDtcI/UQi/AW5J@public.gmane.org> wrote:> > > > On Jan 2, 2007, at 3:34 PM, Dave Hoefler wrote: > > > Thank you very much Rimantas for that explanation. That was just > > what I needed. I will try your suggestions. > > > > -Dave > > > > On 1/2/07, Rimantas Liubertas <rimantas-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > > > I checked out the headers and they are sending: > > > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*,q=0.7 > > > > These are sent by browser, how does server response look like? > > Look for the ''Content-type'' header. > > > > > > > > My database is using ISO Latin 1. > > <...> > > > > This may or may not be the problem. It is possible (but not advisable) > > to keep utf data in > > database tables with Latin 1 encoding. > > However it would be better to convert your tables (and data) to UTF, > > then make sure that you > > have line ''encoding: utf8'' in your config/database.yml > > > > This part is confusing, so I''ll try to describe some scenarios: > > a) Database''s encoding is set as Latin 1 (default) and there is no > > encoding line in config/database.yml. This will work, at least data > > can be saved/retrieved and displayed correctly > > in the browser - given the HTTP headers are right. > > > > b) Database''s encoding is Latin 1, but database.yml sets encoding to > > UTF. This will not work, because that line makes database client to > > issue "SET NAMES UTF8" when connecting to the database. In this case > > MySQL will try to convert data which it thinks is in Latin 1 to UTF. > > Alas, data is already UTF, but wrong encoding was specified what you > > get is useless conversion and character junk. > > > > c) Like b), but in this case database is set to UTF, and data is in > > UTF, but there is no encoding line in database.yml. In this case > > connection''s encoding will be Latin 1 and MySQL will try to convert > > data from UTF to Latin 1. This will mess things, assuming HTTP > > headers/meta tag says that encoding is UTF. > > > > So what would I do in your situation: > > 1) Try to change view-encoding to UTF-8 in the browser. If that fixes > > how characters are displayed, then, most likely, HTTP headers are > > messed up. > > 2) Check ''Content-type'' header of the server response. If it says > > anything else than UTF - that must be fixed. > > 3) When headers are ok and page is displayed the way it should - > > convert tables and data in DB to UTF and then add encoding: utf8 to > > config/database.yml > > 4) Enjoy. > > > > Of course other scenarios are likely too, which would require > > additional exploration. > > > > Regards, > > Rimantas > > -- > > http://rimantas.com/ > > > > This response is someone late, but if you''re using migrations, here''s > one to put your MySql tables and database to UTF8: > > class UseUtf8AsDefaultCharset < ActiveRecord::Migration > def self.my_number_is > say File.basename(__FILE__).to_i > end > > def self.database_charset(charset=''utf8'') > execute "ALTER DATABASE `#{current_database}` DEFAULT CHARACTER > SET #{charset}" > end > > def self.table_charset(table, charset=''utf8'') > execute "ALTER IGNORE TABLE `#{table}` CONVERT TO CHARACTER SET # > {charset}, DEFAULT CHARACTER SET #{charset}" > end > > def self.up > my_number_is > database_charset(''utf8'') > select_values("SHOW TABLES").each do |table| > table_charset(table, ''utf8'') > end > end > > def self.down > my_number_is > database_charset(''latin1'') > select_values("SHOW TABLES").each do |table| > table_charset(table, ''latin1'') > end > end > end > > Note that the down migration puts the tables back to their "original" > character set which, in my case, was latin1. > > -Rob > > Rob Biedenharn http://agileconsultingllc.com > Rob-xa9cJyRlE0mWcWVYNo9pwxS2lgjeYSpx@public.gmane.org > > > > > > >--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---