I've got a fairly basic problem here that I'm hoping there is an easy solution for.

I have a chunk of html code that I want to truncate to a given length... say 20 characters or so.

If I use the 'truncate' helper function I end up with unbalanced tags. For example,

<a href=www.someplace.com>A really long string of words</a>

becomes

<a href=www.someplace.com>A really long...

when run through the 'truncate' function, leaving off the closing tag and causing untold trouble and chaos. On top of that, the truncate function counts the characters in the tag, so you end up getting somewhat less text than you asked for.

So... is there a way to truncate html text properly?

By this I mean a function or set of functions that returns a chunk of html with the tags properly closed and where the length of the text outside the tags is the specified amount.

-- Posted via http://www.ruby-forum.com/.
Kevin-

How about this:

    truncate(html_text.gsub(/(<[^>]+>)/, ''), 20)

That will just do a naive regex to remove the html tags from html_text and pass the result to truncate with a length of 20.

Cheers-
-Ezra

On Dec 22, 2005, at 8:00 PM, Kevin Olbrich wrote:
> So... is there a way to truncate html text properly?
>
> By this I mean a function or set of functions that returns a chunk of
> html with the tags properly closed and where the length of the text
> outside the tags is the specified amount.

-Ezra Zygmuntowicz
WebMaster
Yakima Herald-Republic Newspaper
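A minimal sketch of the strip-then-truncate idea above, written in plain Ruby so it runs outside Rails. The helper name plain_excerpt is hypothetical, and its cut-off rule (the ellipsis counts toward the length) is an assumption made to resemble the 'truncate' helper, not a copy of it.

```ruby
# Hypothetical helper, not part of Rails: strip tags with the same naive
# regex, then cut the plain text to at most `length` characters,
# ellipsis included.
def plain_excerpt(html, length = 20, omission = "...")
  text = html.gsub(/<[^>]+>/, "")
  return text if text.length <= length
  text[0, length - omission.length] + omission
end

puts plain_excerpt("<a href=www.someplace.com>A really long string of words</a>")
# prints: A really long str...
```

The result is always balanced because no tags survive, which is also the drawback: links and formatting are gone.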
Kevin Olbrich wrote:
> So... is there a way to truncate html text properly?

Try this:
http://www.bigbold.com/snippets/posts/show/295
> Try this:
> http://www.bigbold.com/snippets/posts/show/295

@Ezra... I would like to retain the HTML formatting if possible. Stripping the tags out would work, but then the formatting gets lost. Not ideal, but functional.

Closing the broken tags might work. I need to see how this works if a tag gets chopped in half. Something like "<a href=www.someplace.com>My tag link</a..." might make that algorithm upset. I'm still stuck with the fact that the truncated length will be totally wrong.

Something to work with anyway. Thanks, guys.

_Kevin
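The "chopped in half" case can be handled before any closing is attempted. The sketch below is not the linked snippet; it is an illustration under stated assumptions: close_open_tags and VOID_TAGS are hypothetical names, the input is a fragment already cut at an arbitrary point, and any literal "<" in the text is escaped as &lt;.

```ruby
# Elements that never take a closing tag, so they are not tracked.
VOID_TAGS = %w[br hr img input meta link].freeze

# Hypothetical helper: repair a fragment that was cut at an arbitrary point.
def close_open_tags(fragment)
  # 1. Drop a tag that was chopped in half, e.g. a trailing "</a" or "<a hre".
  #    Anything after the unfinished "<" (such as an ellipsis) goes with it.
  html = fragment.sub(/<[^>]*\z/, "")

  # 2. Walk the complete tags, keeping a stack of the ones still open.
  open_tags = []
  html.scan(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/) do |slash, name|
    name = name.downcase
    next if VOID_TAGS.include?(name)
    if slash.empty?
      open_tags.push(name)
    else
      # Pop back to the matching opener, if there is one.
      idx = open_tags.rindex(name)
      open_tags.slice!(idx..-1) if idx
    end
  end

  # 3. Close whatever is left, innermost first.
  html + open_tags.reverse.map { |t| "</#{t}>" }.join
end

puts close_open_tags('<a href=www.someplace.com>My tag link</a...')
# prints: <a href=www.someplace.com>My tag link</a>
```

This only balances the tags. It does nothing about the length problem, since the cut was still made while counting tag characters.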
Why don't you truncate the text before you wrap it in the tags? This would be the easiest way to avoid the problem.

-----Original Message-----
From: Kevin Olbrich
Sent: Friday, December 23, 2005 7:00 AM
Subject: [Rails] Re: truncating html text

@Ezra... I would like to retain the HTML formatting if possible. Stripping them out would work, but then the formatting gets lost. Not ideal, but functional.
Kevin Olbrich wrote:
> Something like "<a href=www.someplace.com>My tag link</a..." might make
> that algorithm upset. I'm still stuck with the fact that the truncated
> length will be totally wrong.
>
> Something to work with anyway, Thanks guys.

Perhaps if you explained *why* you want to truncate an HTML string, that would help...

regards

Justin
Kevin Olbrich wrote:
> Closing the broken tags might work. I need to see how this works if a
> tag gets chopped in half.
>
> Something like "<a href=www.someplace.com>My tag link</a..." might make
> that algorithm upset.

A regex for removing open tags from the end should be quite trivial.

> I'm still stuck with the fact that the truncated
> length will be totally wrong.

You'll probably have to write your own truncate function with String#scan and make it count only non-tag characters.
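One way such a String#scan-based function could look is sketched below. It is only an illustration: truncate_html and VOID are hypothetical names, the ellipsis is deliberately not counted toward the limit, and entities such as &amp; are counted character by character (so one could be cut in the middle).

```ruby
# Elements that never take a closing tag, so they are not tracked.
VOID = %w[br hr img input meta link].freeze

# Hypothetical helper: keep at most `limit` characters of text outside tags,
# copy tags through uncounted, and close anything still open at the cut.
def truncate_html(html, limit = 20, omission = "...")
  out = String.new
  open_tags = []
  remaining = limit

  # Each token is either one complete tag or one run of text.
  html.scan(/<[^>]+>|[^<]+/) do |token|
    if token.start_with?("<")
      name = token[/\A<\/?([a-zA-Z][a-zA-Z0-9]*)/, 1].to_s.downcase
      if token.start_with?("</")
        # Closing tag: pop back to the matching opener, if any.
        idx = open_tags.rindex(name)
        open_tags.slice!(idx..-1) if idx
      elsif !name.empty? && !VOID.include?(name) && !token.end_with?("/>")
        open_tags.push(name)
      end
      out << token
    elsif token.length <= remaining
      out << token
      remaining -= token.length
    else
      # The text budget runs out inside this token: cut it and stop.
      out << token[0, remaining] << omission
      break
    end
  end

  # Close whatever is still open, innermost first.
  out + open_tags.reverse.map { |t| "</#{t}>" }.join
end

puts truncate_html("<a href=www.someplace.com>A really long string of words</a>", 20)
# prints: <a href=www.someplace.com>A really long string...</a>
```

Because the cut always falls inside a text token, a tag can never be chopped in half, so no separate clean-up regex is needed with this approach.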
Justin Forder wrote:
> Perhaps if you explained *why* you want to truncate an HTML string, that
> would help...

@Justin: The goal is to have an 'article' model. I would like the 'list' view to generate a brief excerpt of the article body as a teaser. For now the html itself is being generated from text using textile. I have considered simply truncating the textile source and then generating html from that, but you run into similar problems with unbalanced decorations (sort of like my Christmas tree).

@Andreas: yes, removing the malformed tag at the end is easy. The rest of it is a bit tricky, but I am making progress. It is a good learning exercise for regex judo.

_Kevin
Tom Fakes wrote:
> Why don't you truncate the text before you wrap it in the tags? This
> would be the easiest way to avoid the problem.

I'm starting with textile markup text, so there will be some decoration in there even before I convert it to html. Doing the truncate on the markup is better because you are less likely to cut a tag in half, and because the conversion to html then doesn't produce malformed html tags. This is an intermediate solution for now.
Kevin Olbrich wrote:
> I have considered simply truncating the textile source and then generating html
> from that, but you run into similar problems with unbalanced decorations
> (sort of like my Christmas tree).

Thanks, that's useful. Have you looked at the feasibility of altering the textile-to-html conversion, so that it works with a bound on the number of content characters? On reaching the bound, it would just need to emit closing tags for all currently unclosed HTML tags.

regards

Justin
Justin Forder wrote:
> Have you looked at the feasibility of altering
> the textile-to-html conversion, so that it works with a bound on the
> number of content characters? On reaching the bound, it would just need
> to emit closing tags for all currently unclosed HTML tags.

Thanks, that's a good suggestion. This may solve my immediate problem so long as I continue to use textile. However, I'm still interested in finding a more general solution to the problem.

_Kevin