Currently, it appears to_xml will automatically escape any entities into their corresponding &XXX representation. There's a piece in the documentation that says "If $KCODE is set to u and encoding set to UTF8, then escaping will NOT be performed."

Unfortunately, this doesn't appear to be the case. Even after following the docs and ensuring that default_charset is indeed UTF-8 (actually the default for Rails nowadays), we still get encoded characters in to_xml output.

Since our client is UTF-8 aware, we need to pass thru the UTF-8 data intact. The only way we've found to do this is thru the following horrible monkey-patch:

    module Builder
      class XmlBase
        def _escape(text)
          text
        end
      end
    end

What's the proper way to do this?

Thanks,
Nate

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
prutser-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2008-Oct-16 17:11 UTC
Re: Disabling XML character escaping for to_xml
I had the same issue, but eventually putting $KCODE='UTF8' in my config/environment.rb solved the issue.

Greetings,
Wouter
prutser-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2008-Oct-16 19:23 UTC
Re: Disabling XML character escaping for to_xml
Just deployed to a production server, but it doesn't work there, although the Rails version is the same. Maybe it's the Ruby version (1.8.7 locally and 1.8.6 on the server).
Any word on whether this is fixed in Edge/Rails 2.2?

Cheers,
Walter

On Oct 21, 10:28 pm, BobiJo <bob...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I have the same issue,
> $KCODE='UTF8' by default, but I set it anyway in environment.rb
> This didn't solve my problem, I applied the patch and it worked,
> It's not the ideal solution, but it gets the job done :)
> I've tried the multibyte chars thing and it didn't work either.
>
> May the source be with you
Actually, the monkey patch solution sort of sucks. It turns off ALL escaping, not just the UTF-8-to-entities escaping.

So this is fine:

    <dc:description>māori</dc:description>

but this is not:

    <dc:description><p>āēīōū</p> <p> </p></dc:description>

The HTML tags SHOULD be escaped, while the Unicode characters shouldn't be. My workaround will simply be to strip out the embedded HTML, but this is a problem that people should be aware of when using the monkey patch.

Cheers,
Walter
On Nov 11, 10:16 pm, mcginniwa <walter.mcgin...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> The html tags SHOULD be escaped, while the unicode characters
> shouldn't be. My work around will simply be to strip out the embedded
> HTML, but this is a problem that people should be aware of when using the
> monkey patch.
>

Many moons ago I overrode the String#to_xs method that Builder adds so that it escapes just the vitals (i.e. & < > ' ") instead of all the extra stuff it does.

Fred

> Cheers,
> Walter
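For anyone wanting to try the approach Fred describes, here is a minimal sketch of what such an override might look like. It is not Fred's actual code: the MinimalXmlEscape module and its VITALS table are hypothetical names made up for illustration, and it assumes the Builder version in use escapes text by calling String#to_xs (check your copy of Builder before relying on that).

```ruby
# -*- coding: utf-8 -*-

# Hypothetical helper (not part of Builder): escapes only the five
# XML-significant characters and leaves everything else, including
# non-ASCII UTF-8 characters, untouched.
module MinimalXmlEscape
  VITALS = {
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;'
  }.freeze

  def self.escape(text)
    # Single pass, so the "&" in an already-produced entity is never re-escaped.
    text.to_s.gsub(/[&<>"']/) { |ch| VITALS[ch] }
  end
end

# Assumption: Builder routes text through String#to_xs when escaping.
# Load this after Builder so it replaces Builder's own definition.
# The splat tolerates Builder versions that pass an argument to to_xs.
class String
  def to_xs(*_ignored)
    MinimalXmlEscape.escape(self)
  end
end

puts "<p>māori & co</p>".to_xs
# prints: &lt;p&gt;māori &amp; co&lt;/p&gt;
```

Unlike the _escape monkey patch earlier in the thread, this keeps markup characters escaped, so embedded HTML in a value does not corrupt the XML, while UTF-8 characters pass through as-is.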
Yeah, I ended up doing that basically, but in some specific helpers. My coworker refined it, though, using the htmlentities plugin. You can see it here:

http://github.com/kete/kete/tree/master/lib/oai_dc_helpers.rb#L135

Long term we may do this for all the XML values, not just our dc:description element. So it might move up to monkey patching Builder or some more general spot.

Cheers,
Walter

On Wed, Nov 12, 2008 at 1:20 PM, Frederick Cheung <frederick.cheung-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Nov 11, 10:16 pm, mcginniwa <walter.mcgin...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > The html tags SHOULD be escaped, while the unicode characters
> > shouldn't be. My work around will simply be to strip out the embedded
> > HTML, but this is a problem that people should be aware of when using the
> > monkey patch.
> >
> Many moons ago I overrode the String#to_xs method that builder adds to
> just escape the vitals (i.e. & < > ' ") instead of all the extra stuff it
> does.
>
> Fred
>
> > Cheers,
> > Walter