thr3ads.net - Redcloth upwards - Question about entities [Feb 2008]

Jason Garber

2008-Feb-21 23:07 UTC

Question about entities

1.) What is everyone's preference on NCRs or character entities?
Textile 2 uses decimal NCRs, so a less-than character becomes &#60;
whereas RedCloth (3.04 and prior) used &lt;.  What is your
preference?  It gets tough because &#39; (a straight single quote)
doesn't have a character entity equivalent.

2.) How do you feel about encoding characters like quotes in
blockcode and pre blocks?  Textile 2 does it, but the old RedCloth
never did.  Example:
> This <code>is some code, "isn't it"</code>.

under Textile 2 becomes
> This <code>is some code, &#34;isn&#39;t it&#34;</code>.
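
To make the two options concrete, here is a minimal Ruby sketch of both escaping styles. The helper names escape_named and escape_ncr and the NAMED_ENTITIES table are hypothetical; they are not functions from RedCloth or Textile 2, and the set of escaped characters is an assumption chosen to match the example above.

```ruby
# Hypothetical helpers for illustration only; not RedCloth or Textile 2 code.

# Characters that have a named entity in HTML 4.
NAMED_ENTITIES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;"
}.freeze

# Option A: named entities where one exists, decimal NCR otherwise
# (the straight single quote falls back to &#39;).
def escape_named(text)
  text.gsub(/[&<>"']/) { |ch| NAMED_ENTITIES.fetch(ch) { format("&#%d;", ch.ord) } }
end

# Option B: decimal NCRs for every escaped character.
def escape_ncr(text)
  text.gsub(/[&<>"']/) { |ch| format("&#%d;", ch.ord) }
end
```

With option A the output mixes two notations as soon as a single quote appears; option B stays uniform at the cost of readability in the raw markup.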


Thanks!
Jason

References:
http://www.w3.org/International/questions/qa-escapes
http://textile.thresholdstate.com/

Stephen Bannasch

2008-Mar-10 04:34 UTC

At 6:07 PM -0500 2/21/08, Jason Garber wrote:
> 1.) What is everyone's preference on NCRs or character entities?
> Textile 2 uses decimal NCRs, so a less-than character becomes &#60;
> whereas RedCloth (3.04 and prior) used &lt;.  What is your
> preference?  It gets tough because &#39; (a straight single quote)
> doesn't have a character entity equivalent.
>
> 2.) How do you feel about encoding characters like quotes in
> blockcode and pre blocks?  Textile 2 does it, but the old RedCloth
> never did.  Example:
>
>> This <code>is some code, "isn't it"</code>.
>
> under Textile 2 becomes
>
>> This <code>is some code, &#34;isn&#39;t it&#34;</code>.

I prefer Unicode character references rather than entities.

See: http://rubyforge.org/pipermail/redcloth-upwards/2007-August/000161.html

Gaspard Bucher

2008-Mar-10 10:30 UTC

ENTITIES

I think HTML entities are more readable in case someone reads the raw
code, but as you mentioned, some characters have no named entity and
need Unicode character references.
For my needs, it does not really matter. Maybe consistency is good;
then we have to go for Unicode.

A much more important thing is that this entity escaping should be
optional. I wrote a "to_latex" grammar, and escaping entities is
different in LaTeX. I proposed a patch that tried to make as few
changes to the overall design as possible
(http://code.whytheluckystiff.net/redcloth/ticket/35). But in essence,
I think there should be some universal hooks to do this kind of
escaping. I would propose the following hooks:

pre: before anything happens
escape: before raw text is written out (entity escaping in html for example)
post: after parsing

It would then be easy to alter the grammar for HTML by writing:

class << SuperRedCloth::HTML
  def escape(text)
    html_unicode_escape(text)
  end
end

The method "html_entity_escape" would be the C function
"rb_str_cat_escaped"; "html_unicode_escape" could be another C
function, so there is no speed loss.
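
As a rough illustration of why an overridable escape hook helps with multiple output formats, here is a pure-Ruby sketch. The classes SketchFormatter, SketchHTML and SketchLatex and the emit method are hypothetical stand-ins, not the SuperRedCloth API, and the escaped character sets are assumptions kept small for brevity. A real implementation would presumably delegate to C functions as described above.

```ruby
# Hypothetical sketch; class and method names are illustrative only.
class SketchFormatter
  # Hook: called on raw text just before it is written out.
  # The default performs no escaping.
  def escape(text)
    text
  end

  # Stand-in for the point where the parser writes out a run of raw text.
  def emit(text)
    escape(text)
  end
end

class SketchHTML < SketchFormatter
  # Escape markup-significant characters as decimal NCRs.
  def escape(text)
    text.gsub(/[&<>]/) { |ch| format("&#%d;", ch.ord) }
  end
end

class SketchLatex < SketchFormatter
  # LaTeX needs a different set of characters escaped, with a backslash.
  def escape(text)
    text.gsub(/[&%$#_]/) { |ch| "\\#{ch}" }
  end
end
```

The parser only ever calls emit, so switching output formats means swapping the formatter object rather than touching the grammar.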

The "pre" and "post" hooks could be used to extract custom tags and
put the parsed result back after parsing. The benefit of having them
"inside" RedCloth is that we can alter the extracted data during
parsing. This might seem mad, but it might be the only way to solve
footnotes or tables when producing LaTeX output without parsing the
whole text once more.
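
A minimal sketch of the extract-and-restore idea behind the "pre" and "post" hooks follows. Everything in it is an assumption made for illustration: the HookedPipeline class, the <math> custom tag, the @@STASHn@@ placeholder format, and the trivial parse method that stands in for the real parser.

```ruby
# Hypothetical pipeline; the tag name and placeholder format are made up.
class HookedPipeline
  def initialize
    @stash = []
  end

  # pre: pull custom tags out before parsing and leave a placeholder behind.
  def pre(text)
    text.gsub(%r{<math>(.*?)</math>}m) do
      @stash << Regexp.last_match(1)
      "@@STASH#{@stash.size - 1}@@"
    end
  end

  # Stand-in for the real parser: wraps the text in a paragraph.
  def parse(text)
    "<p>#{text}</p>"
  end

  # post: put the extracted (and possibly altered) data back after parsing.
  def post(text)
    text.gsub(/@@STASH(\d+)@@/) do
      "[math: #{@stash[Regexp.last_match(1).to_i]}]"
    end
  end

  def to_html(text)
    post(parse(pre(text)))
  end
end
```

Because the stash lives on the same object as the parser, code running during parsing could inspect or modify the stored entries before post restores them.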

CODE ESCAPING

This seems bad to me. Code should be kept as raw as possible. It
would be terrible to write code in ASCII and put it on a website, only
for a user to copy the code and end up with UTF-8 data that doesn't
compile, just because of quotes that look pretty but have no meaning
in the language the code was written in.
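
One reading of this concern is that typographic quote substitution should never reach code spans. The sketch below shows that behaviour under stated assumptions: the function name curl_quotes_outside_code is hypothetical, only double quotes are handled, and code spans are assumed to be literal <code>...</code> pairs without nesting.

```ruby
# Hypothetical helper: curl double quotes in prose, leave code spans untouched.
def curl_quotes_outside_code(html)
  # Splitting on a captured group keeps the <code>...</code> spans in the result.
  html.split(%r{(<code>.*?</code>)}m).map do |part|
    next part if part.start_with?("<code>")

    part.gsub(/"([^"]*)"/) { "\u201C#{Regexp.last_match(1)}\u201D" }
  end.join
end
```

Text copied out of the code span still contains plain ASCII quotes, while the surrounding prose gets the typographic ones.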

OK, that's it.

Gaspard

Jason Garber

2008-Mar-11 19:31 UTC

This idea about hooks is a good one.  I'd wished for it myself when
outputting HTML vs. XHTML (because while <br /> works, it isn't truly
valid HTML).

A patch to do this would be most welcome!

Jason

On Mar 10, 2008, at 6:30 AM, Gaspard Bucher wrote:
> But in essence,
> I think there should be some universal hooks to do this kind of
> escaping. I would propose the following hooks:
>
> pre: before anything happens
> escape: before raw text is written out (entity escaping in html for  
> example)
> post: after parsing
>
> It would then be easy to alter the grammar for HTML by writing:
>
> class << SuperRedCloth::HTML
>   def escape(text)
>     html_unicode_escape(text)
>   end
> end
>
> The method "html_entity_escape" would be the C function
> "rb_str_cat_escaped", "html_unicode_escape" could be another C
> function so there is no speed loss.

Gaspard Bucher

2008-Mar-11 21:24 UTC

I will start by listing the parts of the parser that are specific
to HTML and are not covered by the Ruby methods ("def p", "def h1",
etc.), and also the parts that could need two-step parsing (1. build a
tree, 2. render), typically parts that need some knowledge of what is
coming further down before we can render them (footnotes, tables).
From this list we can discuss the best ways to enable multiple outputs
without losing speed.
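
The two-step idea (build a tree, then render) can be sketched for footnotes as follows. This is only an illustration under simplifying assumptions: build_tree and render are hypothetical names, the "tree" is a flat list of paragraph strings, and only the Textile-style "fn1." definition and "[1]" reference forms are recognised.

```ruby
# Step 1: collect paragraphs and footnote definitions in a single pass.
def build_tree(source)
  notes = {}
  nodes = []
  source.each_line(chomp: true) do |line|
    if (m = line.match(/\Afn(\d+)\.\s+(.*)\z/))
      notes[m[1]] = m[2]            # footnote definition, e.g. "fn1. The note."
    elsif !line.strip.empty?
      nodes << line                 # ordinary paragraph
    end
  end
  [nodes, notes]
end

# Step 2: render with full knowledge of which footnotes exist,
# regardless of where in the source they were defined.
def render(nodes, notes)
  body = nodes.map do |text|
    linked = text.gsub(/\[(\d+)\]/) do
      id = Regexp.last_match(1)
      notes.key?(id) ? %(<sup><a href="#fn#{id}">#{id}</a></sup>) : "[#{id}]"
    end
    "<p>#{linked}</p>"
  end
  footer = notes.map { |id, text| %(<p id="fn#{id}"><sup>#{id}</sup> #{text}</p>) }
  (body + footer).join("\n")
end
```

Since rendering only starts once every definition has been seen, a reference can be resolved without re-parsing the source, and an undefined reference such as "[2]" is left as plain text.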

Gaspard

Jason Garber

2008-Mar-11 21:44 UTC

On Mar 11, 2008, at 5:24 PM, Gaspard Bucher wrote:
>  also the parts that could need a 2 step parsing (1 build a
> tree, 2 render) typically parts that need some knowledge of what is
> coming further down before we can render them (footnotes, tables).
That would be really handy.  Right now I re-parse the document if any
link aliases were found, and I assume that footnotes always come after
they're referenced.  Re-parsing isn't that slow, but it is
unnecessary.

Jason