thr3ads.net - Rails - sanitizing and stripping some html? [Apr 2007]

Thomas Mango

2007-Apr-22 15:32 UTC

sanitizing and stripping some html?

I have an application that manages a list of feeds. In a scheduled
BackgrounDRb worker, I parse each of these feeds and post the content
to the same site. Some of these feeds contain HTML in the description
of each item in the feed. I would like to first sanitize the HTML to
remove anything particularly harmful, then I would like to strip
certain tags, leaving the content.

I first tested Rick Olson's white_list plugin. It seems that this
simply strips tags and their content. For example, if I say p is a bad
tag, <p>content</p> gets completely stripped. I would actually like to
keep the 'content' and simply remove the HTML. Certain tags are
alright, such as b, em, strong, but most I would like stripped out.

I then tested http://ideoplex.com/id/1138/sanitize-html-in-ruby and it
seems to do the trick. I was just wondering if anyone else had been
interested in stripping HTML but leaving the content and how they went
about doing so. Thanks for your input.
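
To make the goal concrete, here is a minimal sketch of the behaviour described above: drop script and style elements entirely, keep a short list of allowed tags (without their attributes), and remove every other tag while leaving its text in place. The helper name strip_unlisted_tags and the ALLOWED_TAGS constant are hypothetical and are not part of white_list or the ideoplex code. It relies on regular expressions rather than a real HTML parser, so it is an illustration and should not be treated as a complete sanitizer.

```ruby
# Hypothetical helper for illustration only; not taken from any plugin.
ALLOWED_TAGS = %w[b em strong].freeze

def strip_unlisted_tags(html, allowed = ALLOWED_TAGS)
  # Remove script and style elements together with their content.
  cleaned = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1\s*>}mi, "")

  # For every remaining tag: re-emit allowed tags without attributes,
  # and drop all other tags while keeping the text between them.
  cleaned.gsub(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>}) do
    closing = $1
    name = $2.downcase
    allowed.include?(name) ? "<#{closing}#{name}>" : ""
  end
end

puts strip_unlisted_tags("<p>content</p>")
# prints: content
puts strip_unlisted_tags('<p>Hello <b onclick="x()">bold</b> <a href="#">link</a></p>')
# prints: Hello <b>bold</b> link
```

Attribute values that contain a literal ">" character, HTML comments, and malformed markup are not handled here, which is one reason a parser-based approach is usually safer for untrusted feeds.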

