bluescreen303-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2008-Oct-01 22:31 UTC
convert html to plain text in ruby
Hi,

I'm looking for a way to convert HTML to plain text. Now, I know about strip_tags, but - as the name says - that only strips the tags.

What I need is to get stuff like &amp; and &lt; back to & and < too. Any help?

Thanks,
Mathijs

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
You could use a regexp together with the hash ERB::Util::HTML_ESCAPE, inverted, to map the entities back to their unescaped characters.

- Richard
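A rough sketch of that suggestion, in case it helps. It assumes ERB::Util::HTML_ESCAPE is only available when Rails is loaded, so it falls back to a hand-written table with the same four basic entities otherwise. The constant names and the helper name unescape_entities are made up for illustration.

```ruby
require "erb"

# Assumption: ERB::Util::HTML_ESCAPE is defined by Rails, not by plain Ruby,
# so fall back to an equivalent hand-written table when it is missing.
ESCAPE_TABLE =
  if defined?(ERB::Util::HTML_ESCAPE)
    ERB::Util::HTML_ESCAPE
  else
    { "&" => "&amp;", ">" => "&gt;", "<" => "&lt;", '"' => "&quot;" }
  end

# Invert the table so entities map back to characters: "&amp;" => "&", ...
UNESCAPE_TABLE = ESCAPE_TABLE.invert.freeze

# One alternation that matches any entity present in the table.
ENTITY_PATTERN = Regexp.union(UNESCAPE_TABLE.keys)

# Hypothetical helper: replaces every known entity in a single pass,
# so "&amp;lt;" becomes "&lt;" rather than being unescaped twice.
def unescape_entities(text)
  text.gsub(ENTITY_PATTERN) { |entity| UNESCAPE_TABLE[entity] }
end

puts unescape_entities("Tom &amp; Jerry &lt;3")  # prints: Tom & Jerry <3
```

This only handles the entities that are in the table, so things like &nbsp; or numeric entities pass through unchanged. Ruby's standard library also has CGI.unescapeHTML, which covers the same basic entities plus numeric ones, and may be simpler if that is all you need after strip_tags.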
You might be able to check out some example code in the convert_attachment_to plugin:

http://github.com/kete/convert_attachment_to/tree/master

Depending on configuration, it will take an uploaded HTML file (or PDF, MS doc...) and convert it into a plain text attribute, etc. Probably overkill for what you are after, but it might have something you can learn from.

Cheers,
Walter