I've found 2 bugs that produce (imho) incorrect rendering results:

1) The regexp for strong (*) and bold (**) is greedy, which produces
very strange results.

The simplest way to show the problem is to give an example.

This is the original code:

====
Strong:
Lets do a little test *t*
this should not be strong *u*.

Bold:
Lets do another test **t**
this should not be bold **u**.
====

And this is the (relevant part of) the html that is produced:

====
<p>Strong:
Lets do a little test <strong>t*
this should not be strong *u</strong>.</p>


<p>Bold:
Lets do another test <b>t<strong>*
this should not be bold *</strong>u</b>.</p>
====

As you can see, the html produced is not exactly what you would expect.

2) Using _TEXT_ to emphasize a string doesn't work if TEXT spans
multiple lines.

If you want to emphasize a piece of text that spans multiple lines, then
_TEXT_ does not work: the underscores are simply shown in the generated
text, even if there are no hard linebreaks.

I filed both bugs with the Debian bugtracker and the tracker on
rubyforge a couple of weeks ago, but there was no response. Though I'm a
pretty decent ruby coder, the redcloth code is way over my head, so I
was wondering if anyone here has a solution to one or both of the above
problems?

Thanks,
Bas

> I've found 2 bugs that produce (imho) incorrect rendering results:
>
> 1) The regexp for strong (*) and bold (**) is greedy, which produces
> very strange results.

*snip*

I haven't looked at the relevant RedCloth code, but the non-greedy
modifier in Ruby is "?". In other words: .* is greedy while .*? isn't.

--
Christoffer Sawicki
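
To make the greedy versus non-greedy distinction concrete, here is a minimal Ruby sketch. The sample string is invented for illustration, and the two patterns are deliberately much simpler than anything in RedCloth; they only show how `.*` and `.*?` differ.

```ruby
# Illustrative input only; not taken from RedCloth or its test suite.
text = "*t* not strong *u*"

# Greedy: .* grabs as much as possible, then backs off to the LAST asterisk.
greedy = text[/\*(.*)\*/, 1]

# Non-greedy: .*? grabs as little as possible and stops at the FIRST asterisk.
lazy = text[/\*(.*?)\*/, 1]

puts greedy.inspect  # "t* not strong *u"
puts lazy.inspect    # "t"
```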

On Mon, Jul 03, 2006 at 12:20:53AM +0200, Christoffer Sawicki wrote:
> > I've found 2 bugs that produce (imho) incorrect rendering results:
> >
> > 1) The regexp for strong (*) and bold (**) is greedy, which produces
> > very strange results.
>
> *snip*
>
> I haven't looked at the relevant RedCloth code, but the non-greedy
> modifier in Ruby is "?". In other words: .* is greedy while .*? isn't.
>

Thanks, but that's not my real problem. I have a pretty good idea of
where in the code this is happening, and I know basic regular expression
syntax, but the following regexp code is just a bit too complicated for
me:

=====
QTAGS = [
    ['**', 'b'],
    ['*', 'strong'],
    ['??', 'cite', :limit],
    ['-', 'del', :limit],
    ['__', 'i'],
    ['_', 'em', :limit],
    ['%', 'span', :limit],
    ['+', 'ins', :limit],
    ['^', 'sup'],
    ['~', 'sub']
]
QTAGS.collect! do |rc, ht, rtype|
    rcq = Regexp::quote rc
    re =
        case rtype
        when :limit
            /(\W)
            (#{rcq})
            (#{C})
            (?::(\S+?))?
            (\S.*?\S|\S)
            #{rcq}
            (?=\W)/x
        else
            /(#{rcq})
            (#{C})
            (?::(\S+))?
            (\S.*?\S|\S)
            #{rcq}/xm
        end
    [rc, ht, re, rtype]
end
=====

My main problem is that any trial-and-error modification of the code to
fix one problem spawns 2 new ones.

I hope this makes my problem a bit clearer.

Thanks,
Bas

On Sun, Jul 02, 2006 at 12:13:43PM +0200, Bas Kloet wrote:
> I've found 2 bugs that produce (imho) incorrect rendering results:
>
> 1) The regexp for strong (*) and bold (**) is greedy, which produces
> very strange results.
>
> The simplest way to show the problem is to give an example.
>
> This is the original code:
>
> ====
> Strong:
> Lets do a little test *t*
> this should not be strong *u*.
>
> Bold:
> Lets do another test **t**
> this should not be bold **u**.
> ====
>
> And this is the (relevant part of) the html that is produced:
>
> ====
> <p>Strong:
> Lets do a little test <strong>t*
> this should not be strong *u</strong>.</p>
>
>
> <p>Bold:
> Lets do another test <b>t<strong>*
> this should not be bold *</strong>u</b>.</p>
> ====
>
> As you can see, the html produced is not exactly what you would expect.

I've taken a quick look at it and minimized your example a bit:

====
*t* not strong *u*.
====

This produces:

====
<p><strong>t* not *u</strong></p>
====

But the funny thing is that the following:

====
*tt* not strong *u*.
====

produces:

====
<p><strong>tt</strong> not <strong>u</strong></p>
====

So I don't think the matching is really greedy. It just doesn't handle
1-character cases very well.

Mark
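
The one-character behaviour can be reproduced outside RedCloth with a cut-down version of the non-:limit pattern quoted earlier in the thread. The sketch below is an approximation under stated assumptions: the attribute groups `(#{C})` and `(?::(\S+))?` are dropped, and the helper names `qtag_regexp` and `strong_spans` are hypothetical. It suggests why `*t*` misbehaves: the first alternative `\S.*?\S` needs at least two characters, and because it can still succeed by stretching to a later asterisk, the one-character alternative `\S` is never tried.

```ruby
# Simplified stand-in for the non-:limit QTAGS pattern. The attribute groups
# from the original are omitted (an assumption, to keep this self-contained).
def qtag_regexp(marker, body)
  rcq = Regexp.quote(marker)
  /#{rcq}(#{body})#{rcq}/m
end

# Hypothetical helper: returns the captured span bodies for the "*" marker.
def strong_spans(text, body)
  text.scan(qtag_regexp('*', body)).flatten
end

ORIGINAL_BODY  = '\S.*?\S|\S'  # alternation order as in the quoted code
REORDERED_BODY = '\S|\S.*?\S'  # experiment: try the one-character case first

p strong_spans("*t* not strong *u*.", ORIGINAL_BODY)    # ["t* not strong *u"]
p strong_spans("*tt* not strong *u*.", ORIGINAL_BODY)   # ["tt", "u"]
p strong_spans("*t* not strong *u*.", REORDERED_BODY)   # ["t", "u"]
p strong_spans("*tt* not strong *u*.", REORDERED_BODY)  # ["tt", "u"]
```

Swapping the alternatives is only an experiment on these two inputs. It has not been checked against the full RedCloth pattern or its other markers, so it may well change behaviour elsewhere.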

On Wed, Jul 05, 2006 at 11:56:57AM +0200, Mark van Eijk wrote:
>
> So I don't think the matching is really greedy. It just doesn't handle
> 1-character cases very well.
>

Thanks, that makes the problem a lot clearer.

I found a fix for the _ problem when text spans multiple lines myself. I
removed the :limit from the following line:

---
['_', 'em', :limit]
---

There's probably a good reason why the :limit was there, but all the
tests I've run produced correct results, so for the moment I'm happy
about that.

Thanks for looking into the problem further.

Greetings,
Bas
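
A rough sketch of what removing :limit changes, judging only from the two patterns quoted earlier in the thread: the :limit branch uses the /x flag alone and wraps the span in `(\W)` and `(?=\W)`, while the other branch uses /xm and has no such boundaries. The patterns below are simplified stand-ins (attribute groups omitted) and the sample strings are invented. Whether the word-boundary behaviour is the actual reason for :limit is an assumption, not something confirmed in this thread.

```ruby
# Simplified stand-ins for the two QTAGS branches, specialised to "_".
# The attribute groups from the original patterns are omitted (assumption).
limited   = /(\W)_(\S.*?\S|\S)_(?=\W)/  # :limit branch: no /m, \W boundaries
unlimited = /_(\S.*?\S|\S)_/m           # other branch: /m, no boundaries

# Without /m, "." cannot cross the newline, so a two-line span is not matched.
two_lines = " _emphasis that\nspans two lines_ "
limited_span   = two_lines[limited, 2]
unlimited_span = two_lines[unlimited, 1]
p limited_span    # nil
p unlimited_span  # "emphasis that\nspans two lines"

# Possible side effect: without the boundaries, underscores inside a word
# are treated as emphasis markers.
identifier = "foo_bar_baz"
limited_word   = identifier[limited, 2]
unlimited_word = identifier[unlimited, 1]
p limited_word    # nil
p unlimited_word  # "bar"
```

If identifiers such as `foo_bar_baz` appear in the text being rendered, it may be worth testing them after removing :limit.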