I like the various sanitizers available in Rails since 2.0, but after some pondering I'm not quite sure how to use them. The goal is to allow users to enter some markup in form fields and have it displayed as valid markup later on.

Now, assuming that the user enters valid markup to begin with, the sanitizers do a nice job of picking the allowed parts. For invalid input, the output can be messy. As far as I can tell, the dangerous bits and pieces are reliably removed, but the remains are far from valid markup. That makes the white_list and link sanitizers unfortunately unusable in this case, because HTML-escaping the remaining markup and "debris" (mostly '<', '>', '&') defeats the purpose of letting some markup through in the first place.

I may be suffering from some misunderstanding or another, but I really don't see how to use the sanitizers as they currently are.

Michael

--
Michael Schuerig
mailto:michael@schuerig.de
http://www.schuerig.de/michael/

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To post to this group, send email to rubyonrails-core@googlegroups.com
To unsubscribe from this group, send email to rubyonrails-core-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/rubyonrails-core?hl=en
-~----------~----~----~----~------~----~------~--~---
To me, a sanitizer doesn't imply that it is a parser as well. If you wanted to ensure well-formed (this isn't the same as valid!) markup, you could parse and cache the output with Hpricot.

Hpricot("<p>Foo <b>bar</i></div>").to_html
#-> "<p>Foo <b>bar</b></p>"

But if you want to transform user input into *valid* HTML, I can't really give any suggestions.
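
To make "well-formed" concrete, here is a minimal sketch in plain Ruby of what such a repair pass does for the example above. It is not how Hpricot works internally; `balance_tags`, `VOID_TAGS`, `OPEN_TAG` and `CLOSE_TAG` are hypothetical names made up for this illustration, and the approach is deliberately naive (no comments, CDATA, or nesting rules).

```ruby
# Naive tag balancer, for illustration only. It is NOT a real HTML parser.
# Elements that never take a closing tag (partial list, an assumption).
VOID_TAGS = %w[br hr img input].freeze

OPEN_TAG  = /\A<([A-Za-z][A-Za-z0-9]*)(?:\s[^<>]*)?\/?>\z/
CLOSE_TAG = /\A<\/([A-Za-z][A-Za-z0-9]*)\s*>\z/

def balance_tags(html)
  open_tags = [] # stack of element names that are currently open
  out = +""

  # Tokens are: something shaped like <...>, a run of text, or a lone "<".
  html.scan(/<[^<>]*>|[^<]+|</) do |token|
    if (m = CLOSE_TAG.match(token))
      name = m[1].downcase
      next unless open_tags.include?(name) # stray closing tag: drop it
      # Close everything opened after `name`, then `name` itself.
      loop do
        top = open_tags.pop
        out << "</#{top}>"
        break if top == name
      end
    elsif (m = OPEN_TAG.match(token))
      name = m[1].downcase
      out << token
      open_tags << name unless VOID_TAGS.include?(name) || token.end_with?("/>")
    elsif token.start_with?("<")
      # A lone "<" or something like "<3>": escape it instead of guessing.
      out << token.gsub("<", "&lt;").gsub(">", "&gt;")
    else
      out << token # plain text is passed through unchanged
    end
  end

  # Close whatever is still open, innermost element first.
  open_tags.reverse_each { |name| out << "</#{name}>" }
  out
end

puts balance_tags("<p>Foo <b>bar</i></div>") # prints <p>Foo <b>bar</b></p>
```

The output is well-formed in the sense that every opened element is closed in order, but nothing here checks that the result is valid HTML, which is the distinction drawn above.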
On Sat, May 10, 2008 at 10:32 PM, Michael Schuerig <michael@schuerig.de> wrote:
> I like the various sanitizers available in Rails since 2.0, but after
> some pondering I'm not quite sure how to use them. The goal is to allow
> users to enter some markup in form fields and have it displayed as
> valid markup later on.
>
> Now, assuming that the user enters valid markup to begin with, the
> sanitizers do a nice job of picking the allowed parts. For invalid
> input, the output can be messy. As far as I can tell, the dangerous
> bits and pieces are reliably removed, but the remains are far from
> valid markup. And therefore in the case of the white_list and link
> sanitizers, unfortunately unusable, for HTML escaping the remaining
> markup and "debris" (mostly '<', '>', '&') defeats the purpose of
> letting through some markup in the first place.
>
> I may be suffering from some misunderstanding or another, but I really
> don't see how to use the sanitizers as they currently are.
>
> Michael
On Saturday 10 May 2008, Mislav Marohnić wrote:
> To me, a sanitizer doesn't imply that it is a parser as well. If you
> wanted to ensure well-formed (this isn't same as valid!) markup, you
> could parse and cache the output with Hpricot.
>
> Hpricot("<p>Foo <b>bar</i></div>").to_html
> #-> "<p>Foo <b>bar</b></p>"
>
> But if you want to transform user input into *valid* HTML, I can't
> really give any suggestions.

You are completely right regarding well-formed vs. valid. I'd settle for well-formed; however, I still don't see how to achieve this with the existing sanitizers out of the box. I could do some post-processing, of course, but the fact that this is necessary detracts mightily from the usefulness of the sanitizers. As they are, they don't fit the purpose of deriving sane (well-formed) HTML from arbitrary input. I may be misunderstanding what their purpose is, though.

Michael
Mislav Marohnić wrote:
> To me, a sanitizer doesn't imply that it is a parser as well. If you
> wanted to ensure well-formed (this isn't same as valid!) markup, you
> could parse and cache the output with Hpricot.

I use a similar setup with ruby-tidy to ensure valid markup.

Jonathan

--
Jonathan Weiss
http://blog.innerewut.de
http://twitter.com/jweiss
>> But if you want to transform user input into *valid* HTML, I can't
>> really give any suggestions.

Their intention is to prevent the various XSS attacks which happen when a user provides markup, not to guarantee that the page looks good coming out the other side.

I've done this at one client and the rig we used was:

* html tidy to make whatever it is valid html
* xslt to strip out unsupported tags (almost everything) and transform some to others.
* Store the result because the transformation takes *ages* (relatively speaking)

This has been remarkably reliable, but also incredibly cumbersome to set up.

The other option is html5lib, which is supposed to treat invalid html the same way browsers do.

--
Cheers

Koz
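
The "strip out unsupported tags" step of the rig above can be sketched in plain Ruby. This swaps a regular expression in for the XSLT stage purely for illustration, assumes the input has already been normalized by something like html tidy, and is not a substitute for the Rails sanitizers. `ALLOWED` and `strip_unsupported` are hypothetical names, and the allowlist is an arbitrary example.

```ruby
# Hypothetical allowlist; a real one would be chosen per application.
ALLOWED = %w[p b i em strong].freeze

# Keeps allowlisted tags, removes every other tag, and leaves text alone.
# Assumes `html` is already well-formed; unterminated tags such as "<script"
# are not handled here.
def strip_unsupported(html, allowed = ALLOWED)
  html.gsub(/<\/?([A-Za-z][A-Za-z0-9]*)[^<>]*>/) do |tag|
    name = Regexp.last_match(1).downcase
    next "" unless allowed.include?(name)
    # Rebuild the tag from its name only, so attributes (onclick, style, ...)
    # never reach the output.
    tag.start_with?("</") ? "</#{name}>" : "<#{name}>"
  end
end

puts strip_unsupported('<p onclick="x()">Hi <script>alert(1)</script><b>there</b></p>')
# prints <p>Hi alert(1)<b>there</b></p>
```

Note that the text inside a removed element stays in the output as plain text; a real pipeline would probably want to drop the contents of elements like script entirely.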
On Sun, May 11, 2008 at 12:59 AM, Michael Koziarski <michael@koziarski.com> wrote:
> ... Store the result because the transformation takes *ages* (relatively
> speaking)

Yeah, I can imagine. I'd do this (sanitize + force well-formedness) immediately after user input and before saving into the database. This way, this potentially expensive process has minimal impact.

Well, this thread is quickly spinning into a usage question and not core discussion. I guess the answer to Michael's question, "should the sanitizers in Rails ensure well-formedness", is ultimately "no"?
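
The "process once, before saving" idea can be sketched without any Rails dependency. `Comment`, `body_html` and the `cleaner` callable are hypothetical names for this illustration; in a Rails model the same step would presumably live in a `before_save` callback, with the processed HTML stored in its own column.

```ruby
# Hypothetical model-like class (not a Rails API): the cleanup runs once,
# when the raw body is assigned, and the result is kept alongside it.
class Comment
  attr_reader :body, :body_html

  # cleaner: any object responding to #call(raw_string) and returning HTML.
  def initialize(cleaner)
    @cleaner = cleaner
  end

  def body=(raw)
    @body = raw
    @body_html = @cleaner.call(raw) # the expensive step happens here, once
  end
end

# Stand-in for the real sanitize + repair step; it only trims whitespace.
cleaner = ->(raw) { raw.strip }

comment = Comment.new(cleaner)
comment.body = "  <p>Foo <b>bar</b></p>  "
puts comment.body_html # rendering reads the stored copy, no reprocessing
```

Keeping the raw input next to the processed copy means the stored HTML can be regenerated later if the cleanup rules change.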
> Well, this thread is quickly spinning into a usage question and not core
> discussion. I guess the answer to Michael's question, "should the sanitizers
> in Rails ensure well-formedness", is ultimately "no"?

That's my take on the matter. If html5lib evolves into something that's stable and reliable, we could look at bundling it. But for now, I'd avoid adding that complexity at all costs.

--
Cheers

Koz
On Sunday 11 May 2008, Mislav Marohnić wrote:
> Well, this thread is quickly spinning into a usage question and not
> core discussion. I guess the answer to Michael's question, "should
> the sanitizers in Rails ensure well-formedness", is ultimately "no"?

The original "core" aspect of the question was that other developers might get the idea, as I did initially, that they could use the sanitize helper and be done. Well, the result is safe at least, if potentially not well-formed.

At this time I don't see much for core to do. Possibly add a stern warning to the docs about what the sanitize helper does and doesn't do.

Michael
That warning sounds like a good idea, care to whip something up?

Cheers

Koz

On 11/05/2008, at 8:31 AM, Michael Schuerig <michael@schuerig.de> wrote:
> The original "core" aspect of the question was that other developers
> might get the idea, as I did initially, that they could use the
> sanitize helper and be done. Well, the result is safe at least, if
> potentially not well-formed.
>
> At this time I don't see much for core to do. Possibly add a stern
> warning to the docs, what the sanitize does and doesn't do.
> On 11/05/2008, at 8:31 AM, Michael Schuerig <michael@schuerig.de> wrote:
> > At this time I don't see much for core to do. Possibly add a stern
> > warning to the docs, what the sanitize does and doesn't do.

On Sunday 11 May 2008, Michael Koziarski wrote:
> That warning sounds like a good idea, care to whip something up?

http://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/167

Michael