I'm trying to get basic Unicode support working using the "Iteration
A1" sample application from the "Agile Web Development With Rails"
book.
Following the "HowToUseUnicodeStrings" wiki document, I have made the
following changes:
config/environment.rb:
# Include your application configuration below
$KCODE = 'u'
require 'jcode'
admin.rhtml:
<head><meta http-equiv="content-type"
content="text/html; charset=utf-8" />
database.yml:
development:
  adapter: mysql
  encoding: utf8
application.rb:
class ApplicationController < ActionController::Base
  before_filter :set_charset
  after_filter :fix_unicode_for_safari

  # declare every response as utf-8
  def set_charset
    @headers["Content-Type"] = "text/html; charset=utf-8"
  end

  # automatically and transparently fixes utf-8 bug
  # with Safari when using xmlhttp
  def fix_unicode_for_safari
    if @headers["Content-Type"] == "text/html; charset=utf-8" and
       @request.env['HTTP_USER_AGENT'].to_s.include?('AppleWebKit') and
       request.xhr?
      # replace each character above U+00A0 with a numeric entity
      @response.body = @response.body.gsub(/([^\x00-\xa0])/u) { |s|
        "&#x%x;" % $1.unpack('U')[0] }
    end
  end
end
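To make the effect of the after_filter above easier to see, here is a standalone sketch of the same substitution. It assumes Ruby 1.9 or later (native string encodings, so no $KCODE or jcode), and the helper name escape_non_ascii is made up for this illustration; it is not part of Rails or of the wiki recipe.

```ruby
# Hypothetical helper, not part of Rails: replaces every character above
# U+00A0 with a hexadecimal numeric character reference, which is what
# the Safari workaround does to the response body.
def escape_non_ascii(html)
  html.gsub(/[^\u0000-\u00a0]/) { |char| "&#x%x;" % char.ord }
end

# "café 日本" written with escapes so the source file stays ASCII.
puts escape_non_ascii("caf\u00e9 \u65e5\u672c")
```

Plain ASCII text passes through unchanged, so the filter only alters responses that actually contain such characters.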
And finally create.sql:
) ENGINE=MyISAM DEFAULT CHARSET=utf8
The last step was not mentioned in the wiki guide, but was
nevertheless required in my testing.
So, having performed the above steps, I can now successfully view,
add, and edit entries with Unicode text (try, for example, inserting
Chinese characters).
I run into problems, however, when I try to import existing Unicode
data into the MySQL table. I have a utf8-encoded SQL file that
inserts some Unicode records into the table. When I try to view these
new records with the admin interface of the sample application, I get
garbage instead of the correct Unicode characters. Editing a record,
pasting in the correct Unicode strings, and updating again (all
through the interface) works correctly.
At this point I'm not sure what is causing the problem. It seems
that any data inserted outside of the Rails application is
treated incorrectly once it is read inside Rails.
Has anyone had any experience with this?
Cheers, M
martin aatmaa wrote:
> And finally create.sql:
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8

This step isn't necessary if you set the default character set for
your schema to utf8:

CREATE DATABASE mydatabase DEFAULT CHARSET utf8

Now every table you create will use utf8 unless you specify otherwise.

> I run into problems, however, when I try to import existing unicode
> data into the mysql table. I have a utf8 encoded sql query file that
> inserts some unicode records into the table. When I try to view these
> new records with the admin interface of the sample application, I get
> garbage instead of the correct unicode characters. Editing a record
> and pasting the correct unicode strings and updating again (all
> through the interface) works correctly.

My guess is that the problem is that your data is not being inserted
correctly by your "utf8 encoded sql". If you view the data within the
MySQL Query Browser does it appear as you would expect (my guess is
that you'll see the same garbled data that you see in the admin
interface)?

The character set handling within MySQL is complicated to say the
least. It took me quite a lot of reading the documentation and
experimenting to work it out. If your problem is what I think it is,
then what's happening is that MySQL thinks that your utf8 encoded SQL
is actually ISO-8859 encoded. You can tell it that it's utf8 by adding
this line to the top of your SQL:

SET NAMES utf8

It's worth reading the MySQL documentation about character sets
carefully. The way that it works isn't obvious (to me, at least!).

Hope this is some help,

paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?

--
Posted via http://www.ruby-forum.com/.
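The kind of garbling Paul describes can be reproduced outside MySQL. The sketch below assumes Ruby 1.9 or later and uses two made-up helper names; it only illustrates what happens when UTF-8 bytes are read as ISO-8859-1 and is not a model of MySQL's exact conversion rules.

```ruby
# Hypothetical helper: treat the bytes of a UTF-8 string as ISO-8859-1
# and transcode the result to UTF-8, as a misinformed reader would.
def misread_as_latin1(utf8_text)
  utf8_text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Hypothetical helper: undo the misreading above.
def repair_misread(garbled)
  garbled.encode("ISO-8859-1").force_encoding("UTF-8")
end

original = "caf\u00e9"                      # "café", written with an escape
garbled  = misread_as_latin1(original)
puts garbled                                # two characters where one was
puts repair_misread(garbled) == original
```

Each multi-byte character turns into several unrelated ones, which matches the "garbage" seen in the admin interface when a script is loaded under the wrong character set.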
Paul, thank you for the response.

Your suggestion that it is a MySQL problem turned out to be true. It
seems that I am using the command line mysql client incorrectly: when
I load my SQL script (located in a file) with the "source" command,
the script is somehow read incorrectly. For the time being I run the
script from mySQLFront, which handles the data correctly.

Cheers, Martin
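One possible way to apply Paul's SET NAMES suggestion to an existing script is to prepend the statement before loading the file. This is only a sketch: the helper name with_set_names and the sample INSERT statement are invented for illustration, and whether it cures the "source" problem described above has not been verified in this thread.

```ruby
# Hypothetical helper: make sure an SQL script starts with a SET NAMES
# statement so the server is told which character set the script uses.
def with_set_names(sql, charset = "utf8")
  header = "SET NAMES #{charset};\n"
  sql.start_with?(header) ? sql : header + sql
end

# Made-up example statement; the table and column names are assumptions.
script = "INSERT INTO products (title) VALUES ('caf\u00e9');\n"
puts with_set_names(script)
```

Calling the helper twice leaves the script unchanged, so it is safe to run over files that already carry the header.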