Hi there!

I don't know what to do here; any advice is appreciated.

My DB is InnoDB with UTF-8, and I have no problems reading information from the DB, displaying it in the browser, or saving it.

Except in one case: I'm saving text with an umlaut. This works fine if my browser's character encoding is set to UTF-8. But if I switch the encoding to Western (ISO-8859-1), then copy the string with the umlaut into the field and click save, I get the following error:

    ActiveRecord::StatementInvalid

    Mysql::Error: #22001Data too long for column '_text' at row 1: UPDATE company_description SET `creation_time` = '2006-10-04 21:17:23', `_text` = 'Stefan Ha�', <...>

The POST request is sent with Content-Type: application/x-www-form-urlencoded, so it looks like the request is sent in the Western encoding. The POST content differs between the case where the browser encoding is set to UTF-8 and the case where it is set to Western.

That would be fine if it weren't for the error. BTW, I don't get this error under Linux, only under Windows (probably because the Ruby version is 1.8.4 under Linux and 1.8.5 under Windows?).

I've tried everything I could find about Unicode, and I tried to find a way to escape the string before saving it, but with no success.

Any ideas, please?

--
Posted via http://www.ruby-forum.com/.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk
-~----------~----~----~----~------~----~------~--~---
I am scratching my head here, but isn't there a way to have the form force a character encoding in the POST request? I.e., can you explicitly state that the POST character data is UTF-8?

That way the user can mess around with their character encodings all they want, but whatever characters are submitted will be valid UTF-8 characters (though possibly, or probably, not the characters they expected).

On 10/4/06, Dmitry Hazin <rails-mailing-list@andreas-s.net> wrote:
> [...]
> But: if I switch encoding to western ( ISO-8859-1 ), then copy the
> string with umlaut to the field, and click on save I get the following
> error:
> [...]
Update: the same error occurs with Ruby 1.8.4 under Windows.
Richard Conroy wrote:
> I am scratching my head here, but isn't there a way to have the form
> force a character encoding in the POST request?
>
> i.e. you can explicitly state that the POST character data is utf-8?
>
> That way the user can mess around with their character encodings
> all they want, but whatever characters submitted will be valid utf-8
> characters (but possibly/probably not the characters they expected).

But how can I do that, i.e. have the form force a character encoding in the POST request?
On 10/4/06, Dmitry Hazin <rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:
> upd: the same error with ruby1.8.4 under windows..

Dmitry, my gut feeling is that you have to enforce the POST encoding in the form at least, or otherwise detect when you have not received a UTF-8 encoded POST data string.

I am at a loss as to how a Latin-1 string ended up *bigger* than a UTF-8 one, but it's possible that you have encountered some cut-and-paste artifacts. Try entering an umlaut using the character map (i.e. more naturally).
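As a hedged illustration of the size question above: a Latin-1 string is never larger than its UTF-8 equivalent, but its high-bit bytes do not form valid UTF-8 on their own. The sketch below uses modern Ruby (1.9 or later, which postdates this thread) and a made-up sample string, so it does not reproduce the poster's actual input. One plausible reading, stated here as an assumption only, is that the database is objecting to an invalid byte sequence rather than to a value that is really too long.

```ruby
# Hypothetical sample text; "\u00FC" is the letter u with an umlaut.
SAMPLE_UTF8   = "M\u00FCller"
SAMPLE_LATIN1 = SAMPLE_UTF8.encode(Encoding::ISO_8859_1)

puts SAMPLE_UTF8.bytesize     # 7: the umlaut takes two bytes in UTF-8
puts SAMPLE_LATIN1.bytesize   # 6: the umlaut is the single byte 0xFC in ISO-8859-1

# The same Latin-1 bytes, mislabelled as UTF-8, are not a valid string.
MISLABELLED = SAMPLE_LATIN1.dup.force_encoding(Encoding::UTF_8)
puts MISLABELLED.valid_encoding?   # false
```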
Richard Conroy wrote:
> Dmitry, my gut feeling is that you have to enforce POST encoding in the
> form at least, or otherwise detect when you have not received a utf-8
> encoded POST data string.
>
> I am at a loss as to how a latin-1 string ended up *bigger* than a UTF-8
> one, but its possible that you might have encountered some cut&paste
> artifacts. Try entering an umlaut using the character map (i.e. more
> naturally).

OK, I'll try this; anyway, I think it's bad that it's possible to pass such data to the application.
On 10/4/06, Dmitry Hazin <rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org> wrote:
> ok, i`ll try this; anyway i think it`s bad that it`s possible to pass
> such data to application....

Well yes, but it's not Rails' fault. In fact, anyone can pass any kind of information to any kind of web system. Your system has to be robust enough to handle it.

Despite your best efforts to ensure everything comes across as UTF-8, users can still force it to be something that won't display properly, like Latin-1, Shift-JIS or whatever. In those cases you have to detect that you have received an invalid encoding and either convert it to UTF-8 or send back an error message.

I just thought that a particularly clever hacker might be able to exploit encoding confusion with multi-byte encoding systems to get around cross-site-scripting defences. It's just a thought, and I am thinking in general, not in a Rails context (which has some fairly serious XSS defences).
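The two policies mentioned above (convert to UTF-8, or reject with an error) could look roughly like this in modern Ruby (1.9 or later, newer than the versions discussed in this thread). This is a minimal sketch under stated assumptions: the method names are hypothetical, and treating invalid input as ISO-8859-1 is a guess that fits this thread's case, not a general detection method.

```ruby
# Returns a valid UTF-8 copy of +raw+.
# Assumption (not from the thread): bytes that are not valid UTF-8 are
# treated as ISO-8859-1, the encoding discussed above.
def normalize_to_utf8(raw, fallback = Encoding::ISO_8859_1)
  as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return as_utf8 if as_utf8.valid_encoding?

  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

# Alternative policy: refuse the input so the caller can report an error.
def require_utf8!(raw)
  as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "input is not valid UTF-8" unless as_utf8.valid_encoding?

  as_utf8
end

puts normalize_to_utf8("M\xFCller".b)   # Latin-1 byte 0xFC is converted
puts normalize_to_utf8("M\u00FCller")   # already UTF-8, returned unchanged
```

Converting is friendlier to users, while rejecting avoids storing text whose original encoding was only guessed.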
Julian 'Julik' Tarkhanov
2006-Oct-04 15:56 UTC
Re: utf-8 or encoding problems, need help!
On 4 Oct 2006, at 16:54, Dmitry Hazin wrote:
> but how can I do that? i.e. have the form force a character
> encoding in
> the POST request???

That's what the accept-charset attribute on the form element is for. However, if your page is itself explicitly UTF-8 (via output headers or the <head> element, or both), all forms that you post back or GET should automagically be in UTF-8 as well.
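For reference, a form carrying the accept-charset attribute might be generated like this. It is only a sketch: the action URL and field name are placeholders, the helper name is hypothetical, and browsers are not guaranteed to honor the attribute in every situation, so server-side validation is still worth having.

```ruby
require "erb"

# Builds a minimal HTML form that asks the browser to submit UTF-8.
# The field name is a placeholder for illustration.
def utf8_form(action)
  <<~HTML
    <form action="#{ERB::Util.html_escape(action)}" method="post" accept-charset="UTF-8">
      <textarea name="description[text]"></textarea>
      <input type="submit" value="Save">
    </form>
  HTML
end

puts utf8_form("/company_description/update")
```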
Julian 'Julik' Tarkhanov wrote:
> That's what the accept-charset is for on the form element. However,
> if your page is by itself explicitly UTF-8 (via output headers or the
> <head> element, or both) all forms that you postback or get should be
> automagically in UTF-8 as well.

So if it's possible to fake the encoding and headers, should I check that all the contents of the params hash are in UTF-8, say by using a before_filter in application.rb? Wouldn't that affect the performance of the whole application? Maybe there is a better way to avoid such errors?

Thanks
Dmitry Hazin wrote:
> So if it's possible to fake the encoding and headers, should I check
> that all the contents of the params hash is in the utf-8; say by using
> before_filter in application.rb? Wouldn't it affect performance of the
> whole application? Maybe there is a better way to avoid such errors?

I created the following filter to check incoming requests. Is there a better and faster way to do the same?

    class ApplicationController < ActionController::Base
      require 'iconv'
      # UTF-8 to UTF-8 conversion: raises Iconv::Failure on invalid input
      ICONV = Iconv.new( 'UTF-8', 'UTF-8' )

      before_filter :convert_request

      def convert_request
        convert_hash(params) #if request.post?
      end

      def convert_hash(hash)
        begin
          for k, v in hash
            case v
              when String: ICONV.iconv(v)
              when Array:  v.collect { |v| ICONV.iconv(v) }
              when Hash:   convert_hash(v)
            end
          end
        rescue Iconv::Failure => iconv_exception
          # keep only the part that converted successfully
          hash[k] = iconv_exception.success
          flash[:error] = 'Request was sent in invalid encoding (not utf-8). Text was truncated.'
        end
      end
    end
I just found a better solution :)

    require 'kconv'

    class ApplicationController < ActionController::Base
      before_filter :convert_request

      def convert_request
        convert_hash(params) #if request.post?
      end

      def convert_hash(hash)
        for k, v in hash
          case v
            when String: hash[k] = Kconv.toutf8(v).to_s
            when Array:  hash[k] = v.collect { |v| Kconv.toutf8(v).to_s }
            when Hash:   convert_hash(v)
          end
        end
      end
    end
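For comparison, the same recursive walk over a params-style hash can be written with the String encoding API of Ruby 1.9 and later, which was not available in the Ruby 1.8 versions used in this thread. This is a sketch under assumptions rather than a drop-in replacement: `deep_utf8` is a hypothetical name, it works on plain hashes and arrays instead of a Rails controller, and it assumes that any string which is not valid UTF-8 is ISO-8859-1. Note also that Kconv was designed around Japanese encodings, so its guess for Latin-1 input may not be what one expects.

```ruby
# Recursively returns a copy of +value+ in which every String is valid UTF-8.
# Assumption: input that is not valid UTF-8 is ISO-8859-1 (a guess that
# matches the case described in this thread).
def deep_utf8(value, fallback = Encoding::ISO_8859_1)
  case value
  when String
    utf8 = value.dup.force_encoding(Encoding::UTF_8)
    if utf8.valid_encoding?
      utf8
    else
      value.dup.force_encoding(fallback).encode(Encoding::UTF_8)
    end
  when Array
    value.map { |item| deep_utf8(item, fallback) }
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = deep_utf8(v, fallback) }
  else
    value
  end
end

# Hypothetical request parameters containing Latin-1 bytes.
params = { "company" => { "name" => "M\xFCller".b, "tags" => ["caf\xE9".b, "ok"] } }
p deep_utf8(params)
```

Returning a new structure instead of mutating the hash in place keeps the original input available for logging or for an error message.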
cubiqsys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2006-Nov-23 09:45 UTC
Re: utf-8 or encoding problems, need help!
Hi, I'm having a related UTF-8 problem that I would like to share in this topic.

When I submit Swedish characters (such as åäö) in an Ajax call (:observe_field), the åäö characters get translated into weird characters that cause PostgreSQL to report the error:

    PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xf6f6f6f6

Does anyone know a fix? :)
On 11/23/06, cubiqsys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> Hi, I'm having a related utf8-problem I would like to share in this
> topic.
>
> When I'm submitting swedish characters (such as åäö) in a ajaxcall
> (:observe_field) the åäö-characters gets translated into weird
> characters that causes the postgresql to display the error:
> PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xf6f6f6f6

It's pretty clear that you are inserting high-bit Latin-1 (?) characters into your UTF-8 database.

(?) I am assuming that the Scandinavian languages are covered by the Latin-1 (ISO-8859-1) character set.

> Anyone know a fix? :)

Are you (1) typing these characters into your source file, or are you (2) letting the user enter them directly from a form?

(1) I suspect you will need to escape the characters in your source file. I don't know enough about using non-ASCII characters in Ruby source to help you further.

(2) This should work as long as you just let Rails pass them through and screen them for being valid UTF-8. Make sure your browser knows the page is UTF-8 (you will need to do something in your Rails config to enforce this) and that any POSTs are encoding the data as UTF-8.
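The Latin-1 diagnosis can be checked against the bytes quoted in the error message. The sketch below, written for modern Ruby (1.9 or later) with a hypothetical helper name, decodes the reported hex string as ISO-8859-1 and shows that the same bytes are not valid UTF-8.

```ruby
# Interprets a hex string as ISO-8859-1 bytes and returns UTF-8 text.
def latin1_hex_to_utf8(hex)
  [hex].pack("H*").force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Bytes copied from the PGError message above: 0xf6f6f6f6
reported = "f6f6f6f6"

# Read as ISO-8859-1, every 0xF6 byte is the letter o with an umlaut.
puts latin1_hex_to_utf8(reported)

# Read as UTF-8, the bytes are invalid: 0xF6 is not a legal lead byte.
puts [reported].pack("H*").force_encoding(Encoding::UTF_8).valid_encoding?   # false
```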
cubiqsys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2006-Nov-23 13:02 UTC
Re: utf-8 or encoding problems, need help!
Hi, thanks for the reply.

The answer is (2): the user is entering the text. I don't understand why Rails processes the characters correctly through normal posts, but not when I do the Ajax observe_field call. Anyhow, how do I specify UTF-8 for that Ajax call? My database is UTF-8.

This is what I have right now. In ApplicationController:

    before_filter :set_charset

    # Sets default character set to UTF-8
    def set_charset
      if request.xhr?
        @headers["Content-Type"] = "text/javascript; charset=utf-8"
      else
        @headers["Content-Type"] = "text/html; charset=utf-8"
      end
    end

At the bottom of environment.rb I have:

    $KCODE = 'u'
    require 'jcode'

> (2) should work as long as you just let Rails pass them through,
> and screen them for being valid utf-8. Make sure your browser
> knows the page is UTF-8 (you will need to do something in your
> Rails config to enforce this) and that any POSTs are encoding
> the data as UTF-8.
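Note that a Content-Type header set in a filter like this describes the response; it does not control how the browser encodes the parameters of the Ajax request. One possible cause, offered purely as an assumption since the thread does not show the observe_field options, is that the value is percent-encoded on the client with JavaScript's `escape()`, which emits `%F6` for ö, instead of `encodeURIComponent()`, which emits the UTF-8 form `%C3%B6`. The sketch below, for modern Ruby with a hypothetical helper name, shows how the two forms look once decoded on the server.

```ruby
require "uri"

# Decodes one form-encoded value and reports whether the result is valid UTF-8.
def decoded_utf8?(component)
  URI.decode_www_form_component(component).valid_encoding?
end

puts decoded_utf8?("%C3%B6")   # true:  UTF-8 percent-encoding of the letter
puts decoded_utf8?("%F6")      # false: Latin-1 style percent-encoding
```

If the second form is what arrives, the fix belongs on the client side of the Ajax call rather than in the response headers.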