John Gruber
2006-Sep-25 15:09 UTC
Tightening the rules for literal `[` and `]` chars in link ids
So here's an interesting bug I just discovered:
[Like this][d]: [here][h].
[d]: foo
[h]: bar
The output here should be:
<a href="foo">Like this</a>: <a
href="bar">here</a>.
But instead the output is completely empty. I see this bug in both
Markdown.pl and PHP Markdown.
The problem is that all three lines are being treated as link
definitions. The first line is being matched as though
Like this][d
is the link id, and
[here][h].
is the URL.
You can trigger it in other, simpler ways as well:
[Like this][d]: here.
[Like this][]: here.
And with the magic implicit autolinks, even:
The next line will disappear
[like this]: here.
The current pattern for identifying link references, translated to
English and simplified slightly for this discussion, is:
An opening bracket `[`
Followed by anything other than a newline
A closing bracket `]`
A colon `:`
Zero or more spaces and tabs
Followed by the URL
The URL is defined, simply, as a run of non-space characters.
So, we *won't* trigger this bug with this:
[Like this][d]: two words.
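
The behaviour above can be reproduced with a minimal Python sketch of the simplified pattern. This is an approximation built from the English description, not the regex Markdown.pl actually uses; the names `LINK_DEF` and `parse_definition` are made up for illustration, and the sketch assumes nothing may follow the URL on the line.

```python
import re

# Simplified link-definition pattern, following the English description.
# Illustrative approximation only, not the actual Markdown.pl regex.
LINK_DEF = re.compile(
    r"""
    ^\[(.+)\]:      # opening bracket, anything but a newline (greedy), closing bracket, colon
    [ \t]*          # zero or more spaces and tabs
    (\S+)           # the URL: a run of non-space characters
    [ \t]*$         # assumption: nothing else may follow on the line
    """,
    re.VERBOSE | re.MULTILINE,
)


def parse_definition(line):
    """Return (link_id, url) if the line looks like a link definition, else None."""
    m = LINK_DEF.match(line)
    return (m.group(1), m.group(2)) if m else None


print(parse_definition("[Like this][d]: [here][h]."))  # ('Like this][d', '[here][h].')
print(parse_definition("[like this]: here."))          # ('like this', 'here.')
print(parse_definition("[Like this][d]: two words."))  # None
```

The greedy `.+` is what lets `Like this][d` pass as a link id: the pattern only needs to find some `]:` later on the line.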
* * *
My thinking is that this can be solved by changing the rules for
link IDs to state that they can only contain embedded literal
brackets if (a) they're properly nested; or (b) they're backslash
escaped.
Objections or suggestions?
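
As a rough sketch of the proposed rule, a link id could be checked like this in Python. The function name `is_valid_link_id` is hypothetical, and treating a backslash as an escape only when it directly precedes a bracket is an assumption of this sketch, not something settled in this thread.

```python
def is_valid_link_id(link_id):
    """Accept literal brackets only if backslash-escaped or properly nested."""
    depth = 0
    i = 0
    while i < len(link_id):
        ch = link_id[i]
        # Assumption: a backslash escapes only a directly following bracket.
        if ch == "\\" and i + 1 < len(link_id) and link_id[i + 1] in "[]":
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            if depth == 0:
                return False  # closing bracket with no matching opener
            depth -= 1
        i += 1
    return depth == 0  # every opener must have been closed


print(is_valid_link_id("Like this][d"))          # False
print(is_valid_link_id("his [distorted] view"))  # True
print(is_valid_link_id(r"his \[own\] view"))     # True
```

Under this check, `Like this][d` is rejected as a link id, so the first example line would no longer be read as a definition.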
This change won't solve the problem for the magic implicit link
references:
The next line will disappear
[like this]: here.
One way to address this might be to tighten up the rules for what
a URL is.
I should also note that this entire problem has never been
reported to me by anyone, so it doesn't seem to be something
people are stumbling upon frequently.
-J.G.
John Gruber
2006-Sep-25 15:19 UTC
Tightening the rules for literal `[` and `]` chars in link ids
John Gruber <gruber@fedora.net> wrote on 9/25/06 at 3:09 PM:
> My thinking is that this can be solved by changing the rules for
> link IDs to state that they can only contain embedded literal
> brackets if (a) they're properly nested; or (b) they're backslash
> escaped.
>
> Objections or suggestions?
My other thoughts, I should add:
* We could also just tighten the rules for link ref IDs all the
  way, and outright ban literal `[` and `]` chars in the IDs.
* Even if we allow them, it might be easier, both in terms of code
  and documentation, to simply state that they must be
  backslash-escaped.
After a few more minutes of thought, I'm having a hard time coming
up with a good reason why `[` and `]` shouldn't just be banned
characters for link ref IDs.
-J.G.
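
For comparison, the outright ban could be sketched in Python by excluding brackets from the id's character class. This is again a simplified, hypothetical pattern (`STRICT_DEF` and `is_definition` are made-up names), not the Markdown.pl regex, and it assumes nothing may follow the URL on the line.

```python
import re

# Link id may not contain '[' or ']' at all; otherwise the same simplified shape.
STRICT_DEF = re.compile(r"^\[([^\[\]\n]+)\]:[ \t]*(\S+)[ \t]*$")


def is_definition(line):
    """True if the line is a link definition under the 'no brackets in ids' rule."""
    return STRICT_DEF.match(line) is not None


print(is_definition("[Like this][d]: [here][h]."))  # False
print(is_definition("[d]: foo"))                    # True
print(is_definition("[like this]: here."))          # True
```

The last case shows that banning brackets does nothing for the implicit-reference form, since `like this` contains no brackets to reject.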
A. Pagaltzis
2006-Sep-25 15:38 UTC
Tightening the rules for literal `[` and `]` chars in link ids
* John Gruber <gruber@fedora.net> [2006-09-25 21:15]:
> This change won't solve the problem for the magic implicit link
> references:
>
> The next line will disappear
> [like this]: here.
>
> One way to address this might be to tighten up the rules for
> what a URL is.
I doubt you can do that without undue restrictions. `here.` might
just as well be a word as an actual relative link I want to use.
* John Gruber <gruber@fedora.net> [2006-09-25 21:25]:
> After a few more minutes of thought, I'm having a hard time
> coming up with a good reason why `[` and `]` shouldn't just be
> banned characters for link ref IDs.
That was my first thought after reading the previous mail. They
just make short link names harder to read.
However, I am using Markdown on a wiki where I currently implement
internal links simply by doing
[frobnicate the weeblefitzer]: /doc/42
where the link name is just the title of the page at `/doc/42`.
That way I get all the Markdown linking features for intrawiki
links without any effort [^1], but there might well be square
brackets in the title. It is a rare enough case that requiring
backslashes shouldn't be onerous, though. So that's what I'd lean
toward.
[^1]: Would that Markdown.pl had an API; then I could simply
`$mkd->add_link( $title => $href);` instead of the current dirty
hack of appending generated Markdown to documents.
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Michel Fortin
2006-Sep-25 17:22 UTC
Tightening the rules for literal `[` and `]` chars in link ids
On 25 Sept. 2006 at 15:09, John Gruber wrote:
> I should also note that this entire problem has never been
> reported to me by anyone, so it doesn't seem to be something
> people are stumbling upon frequently.
That's why I think it'd be better to solve this while still
allowing nested brackets. The other reason for supporting nested
brackets is that they are already supported for the content of the
link, which is often the same as the definition name. How unnatural
would this be if you had to:
In [his [distorted] view][], this isn't something to worry about.
[his \[own\] view]: http://something.com/
Michel Fortin
michel.fortin@michelf.com
http://www.michelf.com/
Michel Fortin
2006-Sep-25 17:29 UTC
Tightening the rules for literal `[` and `]` chars in link ids
I sent the last message before I had finished it, sorry. (Note to
self: don't hit Cmd-Shift-D while writing a message in OS X's Mail.)
What I was trying to say is that in some situations you have to
escape brackets and in others you don't... and while things like this
are pretty rare, it's probably better to keep things similar.
In [his [distorted] view][], this isn't something to worry about.
[his \[own\] view]: http://something.com/
On the other hand, since this was never really reported to John, nor
to me for that matter, whether or not we choose to support unescaped
but correctly nested brackets is probably not that important.
Michel Fortin
michel.fortin@michelf.com
http://www.michelf.com/