Hey guys,

I have been working with RedCloth for the past week or so to upgrade a university's blog system (blogs.warwick.ac.uk) to Textile2; until now it has used a bespoke Java-based Textile1 implementation. An important part of this is that existing Textile1 markup renders as closely as possible to how it did before, while gaining the added features of Textile2 (footnotes, tables).

The RedCloth 3.0.4 release gave me too many headaches, so I've been working with the latest code from SVN. I think I've uncovered a few bugs, and I'd be happy to share my fixes if anyone wants them. (All of the changes are to base.rb.)

Auto-magic URLs:

A couple of things were a little iffy. Firstly, I changed it so that the auto-magic linking of both normal links AND emails is done twice, by adding each rule to TEXTILE_RULES twice:

    TEXTILE_RULES = [:refs_textile, :block_textile_table, :block_textile_lists,
                     :block_textile_defs, :block_textile_prefix, :inline_textile_image,
                     :inline_textile_link, :inline_textile_code, :inline_textile_span,
                     :glyphs_textile,
                     # do these twice to accommodate for one space/line diffs
                     :inline_textile_autolink_urls, :inline_textile_autolink_urls,
                     :inline_textile_autolink_emails, :inline_textile_autolink_emails]

This fixes any problems with two URLs appearing with only a single space or line break between them.

Also, the AUTO_LINK_RE regular expression has a bug: its leading-punctuation branch matches almost any character. (I think the intention was to list punctuation that should not precede a link, but a negated list matches every character that isn't in it, including word characters.)
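For anyone who wants to reproduce the two issues above in isolation, here is a minimal sketch. The patterns are simplified, hypothetical stand-ins, not the actual expressions from base.rb: NEGATED_LEAD and WHITELIST_LEAD only model the leading-text group, and SIMPLE_URL and link_once are made-up names for a cut-down autolink pass. The second half assumes the cause of the adjacent-URL problem is that the trailing-text group consumes the separator, leaving the next URL with no leading text to match on the first pass.

```ruby
# Hypothetical, simplified patterns for illustration only (not from base.rb).

# A negated class accepts ANY character not listed, including word characters.
NEGATED_LEAD   = /([^=!:'"\/])(www\.\w+)/
# A whitelist accepts only the listed punctuation, whitespace, or line start.
WHITELIST_LEAD = /([\(]|\s|^)(www\.\w+)/

glued = "foowww.example"
negated_match   = NEGATED_LEAD.match(glued)
whitelist_match = WHITELIST_LEAD.match(glued)
puts "negated lead matched:   #{negated_match ? negated_match[2] : 'nothing'}"
puts "whitelist lead matched: #{whitelist_match ? whitelist_match[2] : 'nothing'}"

# Cut-down autolink pass: leading text, URL, trailing text.
SIMPLE_URL = /(\s|^)(www\.\w+(?:\.\w+)*)(\s|$)/

# Wrap every URL found in one left-to-right scan of the text.
def link_once(text)
  text.gsub(SIMPLE_URL) { "#{$1}<a>#{$2}</a>#{$3}" }
end

two_urls    = "www.a.com www.b.com"
first_pass  = link_once(two_urls)    # the space is consumed as trailing text
second_pass = link_once(first_pass)  # now the space is free to act as leading text

puts first_pass   # prints: <a>www.a.com</a> www.b.com
puts second_pass  # prints: <a>www.a.com</a> <a>www.b.com</a>
```

In this sketch the negated class happily treats the "o" in "foowww.example" as leading text, while the whitelist version finds nothing, and the second URL is only wrapped on the second pass.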
I changed this to use a list of "whitelisted" punctuation (I only needed the open bracket at the moment) and also to match any whitespace preceding the link:

    AUTO_LINK_RE = /
        (                          # leading text
          <\w+.*?>|                #   leading HTML tag, or
          [\(]|                    #   selected punctuation, or
          \s|                      #   whitespace, or
          ^                        #   beginning of line
        )
        (
          (?:http[s]?:\/\/)|       # protocol spec, or
          (?:www\.)                # www.*
        )
        (
          ([\w]+[=?&:%\/\.\~\-]*)* # url segment
          \w+[\/]?                 # url tail
          (?:\#\w*)?               # trailing anchor
        )
        ([[:punct:]]|\s|<|$)       # trailing text
      /x

The IMAGE_RE regular expression also has a bug: it matches any character (.) for the start of a line. I replaced this with any whitespace. I also made it so that the src URL *must* have a . in it:

    IMAGE_RE = /
        (\<p\>|\s|^)            # start of line?
        \!                      # opening
        (\<|\=|\>)?             # optional alignment atts
        (#{C})                  # optional style,class atts
        (?:\. )?                # optional dot-space
        ([^\s(!]+?\.[^\s(!]+?)  # presume this is the src
        \s?                     # optional space
        (?:\(((?:[^\(\)]|\([^\)]+\))+?)\))? # optional title
        \!                      # closing
        (?::#{ HYPERLINK })?    # optional href
      /x

I made a couple of other minor alterations, such as adding the sterling (£) symbol to the glyphs that are replaced with their HTML entity, and adding a newline after <br /> when hard_breaks is turned on, to help readability of the output.

In order to work well with our implementation, I also added an escape capability, so that a user can write \*text\* without the asterisks being converted to <strong> tags.

If anyone wants the code for this then feel free to ask. I'm wary about committing anything to SVN unless it's really wanted/needed.

Regards,

Mat

-- 
Mat Mannion
Web Developer
e-lab, University of Warwick