I have an application that manages a list of feeds. In a scheduled BackgrounDRb worker, I parse each feed and post its content to the same site. Some feeds contain HTML in the description of each item.

I would like to first sanitize the HTML to remove anything harmful (scripts, event-handler attributes and the like), and then strip most tags while keeping the text they wrap. A few tags are fine to keep, such as b, em, and strong, but I want the rest removed.

I first tested Rick Olson's white_list plugin. It seems to strip both the tag and its content. For example, if I mark p as a bad tag, <p>content</p> is removed completely. I would like to keep 'content' and remove only the markup.

I then tested http://ideoplex.com/id/1138/sanitize-html-in-ruby and it seems to do the trick.

Has anyone else needed to strip HTML tags but keep the content, and how did you go about it?

Thanks for your input.
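To make the desired behavior concrete, here is a minimal plain-Ruby sketch of "remove the tag, keep the content". The names `ALLOWED_TAGS` and `strip_tags_keep_content` are made up for illustration and do not come from white_list or the linked article. It relies on regular expressions, so it assumes reasonably well-formed markup (for example, no `>` inside attribute values) and should be read as an illustration of the idea rather than a hardened sanitizer.

```ruby
# Hypothetical allowlist of tags that may stay in the output.
ALLOWED_TAGS = %w[b em strong].freeze

# Removes every tag that is not in the allowlist but keeps the text it wraps.
# Allowed tags are re-emitted without attributes, so things like onclick=
# do not survive. script/style elements and comments are dropped together
# with their bodies, since their content is not display text.
# Assumption: the input is reasonably well-formed HTML.
def strip_tags_keep_content(html, allowed = ALLOWED_TAGS)
  text = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1\s*>}mi, "")
  text = text.gsub(/<!--.*?-->/m, "")
  text.gsub(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>}) do
    closing = $1
    name = $2.downcase
    allowed.include?(name) ? "<#{closing}#{name}>" : ""
  end
end

puts strip_tags_keep_content('<p>Hello <b onclick="x()">world</b></p>')
# prints: Hello <b>world</b>
```

Dropping attributes from the tags that are kept is a deliberate simplification: it avoids having to judge which attributes are safe, at the cost of losing harmless ones. For untrusted feeds, a parser-based sanitizer is likely the safer choice.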