1.) What is everyone's preference on NCRs or character entities? Textile 2 uses decimal NCRs, so a less-than character becomes &#60; whereas RedCloth (3.04 and prior) used &lt;. What is your preference? It gets tough because ' (a straight single quote) doesn't have a character entity equivalent.

2.) How do you feel about encoding characters like quotes in blockcode and pre blocks? Textile 2 does it, but the old RedCloth never did. Example:

> This <code>is some code, "isn't it"</code>.

under Textile 2 becomes

> This <code>is some code, &#34;isn&#39;t it&#34;</code>.

Thanks!
Jason

References:
http://www.w3.org/International/questions/qa-escapes
http://textile.thresholdstate.com/

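The two escaping policies in question 1 can be compared side by side. The following is a minimal sketch, assuming HTML 4's named entity set (where the straight single quote has no entity); `escape_with_entities` and `escape_with_ncrs` are hypothetical helper names chosen for illustration and are not part of RedCloth or Textile.

```ruby
# Hypothetical helpers for illustration only; not RedCloth's or Textile's API.
NAMED_ENTITIES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;"
}.freeze

SPECIAL_CHARS = /[&<>"']/

# Escape with named character entities where HTML 4 defines one.
# The straight single quote has no HTML 4 entity, so it falls back to an NCR.
def escape_with_entities(text)
  text.gsub(SPECIAL_CHARS) do |ch|
    NAMED_ENTITIES.fetch(ch) { format("&#%d;", ch.ord) }
  end
end

# Escape every special character as a decimal numeric character reference.
def escape_with_ncrs(text)
  text.gsub(SPECIAL_CHARS) { |ch| format("&#%d;", ch.ord) }
end

sample = %q(is some code, "isn't it")
puts escape_with_entities(sample)
puts escape_with_ncrs(sample)
```

The entity version needs a fallback for the single quote, which is the inconsistency the question points at; the NCR version treats every character the same way.
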
At 6:07 PM -0500 2/21/08, Jason Garber wrote:
> 1.) What is everyone's preference on NCRs or character entities?
> Textile 2 uses decimal NCRs, so a less-than character becomes &#60;
> whereas RedCloth (3.04 and prior) used &lt;. What is your
> preference? It gets tough because ' (a straight single quote)
> doesn't have a character entity equivalent.
>
> 2.) How do you feel about encoding characters like quotes in
> blockcode and pre blocks? Textile 2 does it, but the old RedCloth
> never did. Example:
>
> > This <code>is some code, "isn't it"</code>.
>
> under Textile 2 becomes
>
> > This <code>is some code, &#34;isn&#39;t it&#34;</code>.

I prefer Unicode character references rather than entities.

See: http://rubyforge.org/pipermail/redcloth-upwards/2007-August/000161.html

ENTITIES

I think HTML entities are more readable in case someone reads the raw code, but as you mentioned, some characters have no entity and need Unicode character references.

For my needs, it does not really matter. Consistency is probably good, in which case we have to go for Unicode.

A much more important thing is that this entity escaping should be optional. I wrote a "to_latex" grammar, and escaping works differently in LaTeX. I proposed a patch that tried to make as few changes to the overall design as possible (http://code.whytheluckystiff.net/redcloth/ticket/35). But in essence, I think there should be some universal hooks to do this kind of escaping. I would propose the following hooks:

pre: before anything happens
escape: before raw text is written out (entity escaping in HTML, for example)
post: after parsing

It would then be easy to alter the grammar for HTML by writing:

class << SuperRedCloth::HTML
  def escape(text)
    html_unicode_escape(text)
  end
end

The method "html_entity_escape" would be the C function "rb_str_cat_escaped"; "html_unicode_escape" could be another C function, so there is no speed loss.

The "pre" and "post" hooks could be used to extract custom tags and put the parsed result back after parsing. The benefit of having them "inside" RedCloth is that we can alter the extracted data during parsing. This might seem mad, but it might be the only way to solve footnotes or tables when producing LaTeX output without parsing the whole text once more.

CODE ESCAPING

This seems bad to me. Code should stay as raw as possible. It would be terrible to write code in ASCII, put it on a website, and have a user copy it only to end up with UTF-8 data that doesn't compile, just because of quotes that look pretty but have no meaning in the language the code was written in.

OK, that's it.

Gaspard

2008/3/10, Stephen Bannasch <stephen.bannasch at deanbrook.org>:
> At 6:07 PM -0500 2/21/08, Jason Garber wrote:
> > 1.) What is everyone's preference on NCRs or character entities? Textile 2
> > uses decimal NCRs, so a less-than character becomes &#60; whereas RedCloth
> > (3.04 and prior) used &lt;. What is your preference? It gets tough because
> > ' (a straight single quote) doesn't have a character entity equivalent.
> >
> > 2.) How do you feel about encoding characters like quotes in blockcode and
> > pre blocks? Textile 2 does it, but the old RedCloth never did. Example:
> >
> > > This <code>is some code, "isn't it"</code>.
> >
> > under Textile 2 becomes
> >
> > > This <code>is some code, &#34;isn&#39;t it&#34;</code>.
>
> I prefer Unicode character references rather than entities.
>
> See:
> http://rubyforge.org/pipermail/redcloth-upwards/2007-August/000161.html

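The proposed pre/escape/post hooks could look roughly like this. It is only a sketch under stated assumptions: `SketchFormatter`, `HtmlSketch`, `LatexSketch`, `render`, and `paragraph` are hypothetical names, the "parser" is a stand-in that splits on blank lines, and none of it reflects how SuperRedCloth is actually structured.

```ruby
# Hypothetical sketch of the pre/escape/post hooks; not SuperRedCloth's API.
class SketchFormatter
  # Hook: runs once on the raw input before any parsing.
  def pre(text)
    text
  end

  # Hook: runs on each run of raw text just before it is written out.
  def escape(text)
    text
  end

  # Hook: runs once on the finished output.
  def post(output)
    output
  end

  # Output-specific wrapper for one paragraph; subclasses override it.
  def paragraph(text)
    text
  end

  # Stand-in for the real parser: each blank-line-separated chunk
  # is treated as one paragraph.
  def render(source)
    chunks = pre(source).split(/\n{2,}/)
    post(chunks.map { |chunk| paragraph(escape(chunk.strip)) }.join("\n"))
  end
end

class HtmlSketch < SketchFormatter
  # Decimal NCRs for the characters that are special in HTML text.
  def escape(text)
    text.gsub(/[&<>]/) { |ch| format("&#%d;", ch.ord) }
  end

  def paragraph(text)
    "<p>#{text}</p>"
  end
end

class LatexSketch < SketchFormatter
  LATEX_SPECIALS = /[&%\$\#_{}]/

  # LaTeX escapes these characters with a leading backslash instead.
  def escape(text)
    text.gsub(LATEX_SPECIALS) { |ch| "\\#{ch}" }
  end

  def paragraph(text)
    "#{text}\n"
  end
end

puts HtmlSketch.new.render("1 < 2 & 3")
puts LatexSketch.new.render("100% of $5")
```

With the escaping isolated in one overridable method, the same parsing code can serve both outputs; only the `escape` and `paragraph` bodies differ.
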
This idea about hooks is a good one. I'd wished for it myself when outputting HTML vs. XHTML (because while <br /> works, it isn't truly valid HTML).

A patch to do this would be most welcome!

Jason

On Mar 10, 2008, at 6:30 AM, Gaspard Bucher wrote:
> But in essence,
> I think there should be some universal hooks to do this kind of
> escaping. I would propose the following hooks:
>
> pre: before anything happens
> escape: before raw text is written out (entity escaping in html for
> example)
> post: after parsing
>
> It would then be easy to alter the grammar for HTML by writing:
>
> class << SuperRedCloth::HTML
>   def escape(text)
>     html_unicode_escape(text)
>   end
> end
>
> The method "html_entity_escape" would be the C function
> "rb_str_cat_escaped", "html_unicode_escape" could be another C
> function so there is no speed loss.

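The HTML vs. XHTML case mentioned in this reply is the same kind of per-output variation. A minimal sketch, assuming a hypothetical `br` method supplied by a mixed-in module (`HtmlBreaks`, `XhtmlBreaks`, and `LineJoiner` are illustrative names, not RedCloth classes):

```ruby
# Hypothetical modules for illustration; the method name `br` is an assumption.
module HtmlBreaks
  def br
    "<br>"
  end
end

module XhtmlBreaks
  def br
    "<br />"
  end
end

# Joins lines using whatever line-break markup the chosen module provides.
class LineJoiner
  def initialize(flavor)
    extend(flavor) # mix the output flavor into this one instance
  end

  def join(lines)
    lines.join("#{br}\n")
  end
end

puts LineJoiner.new(HtmlBreaks).join(%w[one two])
puts LineJoiner.new(XhtmlBreaks).join(%w[one two])
```
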
I will start by listing the parts of the parser that are specific to HTML and are not covered by the Ruby methods ("def p", "def h1", etc.), as well as the parts that could need two-step parsing (1. build a tree, 2. render): typically parts that need some knowledge of what is coming further down before we can render them (footnotes, tables).

From this list we can discuss the best ways to enable multiple outputs without losing speed.

Gaspard

2008/3/11, Jason Garber <jg at jasongarber.com>:
> This idea about hooks is a good one. I'd wished for it myself when
> outputting HTML vs. XHTML (because while <br /> works, it isn't truly valid
> HTML)
>
> A patch to do this would be most welcome!
>
> Jason
>
> On Mar 10, 2008, at 6:30 AM, Gaspard Bucher wrote:
> > But in essence,
> > I think there should be some universal hooks to do this kind of
> > escaping. I would propose the following hooks:
> >
> > pre: before anything happens
> > escape: before raw text is written out (entity escaping in html for example)
> > post: after parsing
> >
> > It would then be easy to alter the grammar for HTML by writing:
> >
> > class << SuperRedCloth::HTML
> >   def escape(text)
> >     html_unicode_escape(text)
> >   end
> > end
> >
> > The method "html_entity_escape" would be the C function
> > "rb_str_cat_escaped", "html_unicode_escape" could be another C
> > function so there is no speed loss.

On Mar 11, 2008, at 5:24 PM, Gaspard Bucher wrote:
> also the parts that could need a 2 step parsing (1 build a
> tree, 2 render) typically parts that need some knowledge of what is
> coming further down before we can render them (footnotes, tables).

That would be really handy. Right now I re-parse the document if any link aliases were found and I assume that footnotes always come after they're referenced. Re-parsing isn't that slow, but it is unnecessary.

Jason
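
The two-step idea (build a tree, then render) can be illustrated with link aliases, which may be defined after the link that uses them. This is a simplified sketch, not RedCloth's implementation: `build_tree` and `render` are hypothetical names, the "tree" is just a flat list of lines, and only the `"text":alias` and `[alias]url` forms are handled.

```ruby
# Hypothetical two-step sketch; this is not how RedCloth is implemented.
ALIAS_DEFINITION = /\A\[(\w+)\](\S+)\z/ # a line such as: [rc]http://example.com/
ALIAS_LINK = /"([^"]+)":(\w+)/          # inline text such as: "RedCloth":rc

# Step 1: walk the source once, collecting alias definitions and keeping
# every other line as a node to render later.
def build_tree(source)
  aliases = {}
  nodes = []
  source.each_line do |raw|
    line = raw.chomp
    if (match = ALIAS_DEFINITION.match(line))
      aliases[match[1]] = match[2] # remember the alias, emit nothing
    else
      nodes << line
    end
  end
  [nodes, aliases]
end

# Step 2: render the nodes. Every alias is already known at this point,
# so no second pass over the source is needed.
def render(nodes, aliases)
  rendered = nodes.map do |line|
    line.gsub(ALIAS_LINK) do
      whole = Regexp.last_match(0)
      text = Regexp.last_match(1)
      url = aliases[Regexp.last_match(2)]
      # Unknown aliases are left untouched in this sketch.
      url ? %(<a href="#{url}">#{text}</a>) : whole
    end
  end
  rendered.join("\n")
end

source = %("RedCloth":rc is a Textile parser.\n[rc]http://example.com/)
nodes, aliases = build_tree(source)
puts render(nodes, aliases)
```

Because rendering only starts once the whole input has been seen, it no longer matters whether a definition comes before or after its first use; the same approach would apply to footnotes.
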