John Gruber
2006-Sep-25 15:09 UTC
Tightening the rules for literal `[` and `]` chars in link ids
So here's an interesting bug I just discovered:

    [Like this][d]: [here][h].
    [d]: foo
    [h]: bar

The output here should be:

    <a href="foo">Like this</a>: <a href="bar">here</a>.

But instead the output is completely empty. I see this bug in both Markdown.pl and PHP Markdown.

The problem is that all three lines are being treated as link definitions. The first line is being matched as though `Like this][d` is the link id, and `[here][h].` is the URL.

You can trigger it in other, simpler ways as well:

    [Like this][d]: here.

    [Like this][]: here.

And with the magic implicit autolinks, even:

    The next line will disappear
    [like this]: here.

The current pattern for identifying link references, translated to English and simplified slightly for this discussion, is:

1. An opening bracket `[`
2. Followed by anything other than a newline
3. A closing bracket `]`
4. A colon `:`
5. Zero or more spaces and tabs
6. Followed by the URL

The URL is defined, simply, as a run of non-space characters. So, we *won't* trigger this bug with this:

    [Like this][d]: two words.

* * *

My thinking is that this can be solved by changing the rules for link IDs to state that they can only contain embedded literal brackets if (a) they're properly nested; or (b) they're backslash escaped.

Objections or suggestions?

This change won't solve the problem for the magic implicit link references:

    The next line will disappear
    [like this]: here.

One way to address this might be to tighten up the rules for what a URL is.

I should also note that this entire problem has never been reported to me by anyone, so it doesn't seem to be something people are stumbling upon frequently.

-J.G.
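As a rough illustration of the pattern described in this message, here is a minimal Python sketch. It is a simplified stand-in, not the actual regex from Markdown.pl or PHP Markdown: it assumes the URL must be followed only by optional spaces or tabs before the end of the line, and it ignores titles and leading indentation. The names `LINK_DEF` and `find_link_defs` are made up for this example.

```python
import re

# Simplified stand-in for the link-definition pattern described above:
# "[", anything but a newline, "]", ":", optional spaces/tabs, then a run
# of non-space characters as the URL. Assumption: nothing but spaces/tabs
# may follow the URL on that line. Titles are not handled.
LINK_DEF = re.compile(r"^\[(.+)\]:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def find_link_defs(text):
    """Return (id, url) pairs for every line the pattern treats as a definition."""
    return [(m.group(1), m.group(2)) for m in LINK_DEF.finditer(text)]


sample = "[Like this][d]: [here][h].\n[d]: foo\n[h]: bar\n"
for link_id, url in find_link_defs(sample):
    print(repr(link_id), "->", repr(url))
```

Under these assumptions all three lines of the sample are reported as definitions, with `Like this][d` as the id of the first one, while `[Like this][d]: two words.` is not matched because the URL cannot contain a space.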
John Gruber
2006-Sep-25 15:19 UTC
Tightening the rules for literal `[` and `]` chars in link ids
John Gruber <gruber@fedora.net> wrote on 9/25/06 at 3:09 PM:

> My thinking is that this can be solved by changing the rules for
> link IDs to state that they can only contain embedded literal
> brackets if (a) they're properly nested; or (b) they're backslash
> escaped.
>
> Objections or suggestions?

My other thoughts, I should add:

* We could also just tighten the rules for link ref IDs all the way, and outright ban literal `[` and `]` chars in the IDs.

* Even if we allow them, it might be easier, both in terms of code and documentation, to simply state that they must be backslash-escaped.

After a few more minutes of thought, I'm having a hard time coming up with a good reason why `[` and `]` shouldn't just be banned characters for link ref IDs.

-J.G.
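To compare the three candidate rules side by side, here is a hedged Python sketch. The function `id_ok` and the rule labels `"nested_or_escaped"`, `"escaped_only"`, and `"banned"` are hypothetical names chosen for this example. It also assumes that an outright ban would reject backslash-escaped brackets too, which the discussion does not settle.

```python
def id_ok(link_id, rule):
    """Check a link id against one of the candidate rules (hypothetical labels).

    "nested_or_escaped": brackets must be properly nested or backslash-escaped.
    "escaped_only":      every bracket must be backslash-escaped.
    "banned":            no brackets at all (assumption: escaped ones included).
    """
    if rule not in ("nested_or_escaped", "escaped_only", "banned"):
        raise ValueError(f"unknown rule: {rule}")
    if rule == "banned":
        return "[" not in link_id and "]" not in link_id

    depth = 0
    i = 0
    while i < len(link_id):
        ch = link_id[i]
        # A backslash followed by a bracket is an escaped bracket: skip both.
        if ch == "\\" and i + 1 < len(link_id) and link_id[i + 1] in "[]":
            i += 2
            continue
        if ch in "[]":
            if rule == "escaped_only":
                return False  # unescaped bracket is not allowed
            depth += 1 if ch == "[" else -1
            if depth < 0:
                return False  # closing bracket without a matching opener
        i += 1
    return depth == 0  # every opener must have been closed
```

With this sketch, the problematic id `Like this][d` is rejected under all three rules, while `his [distorted] view` is accepted only under the nesting rule.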
A. Pagaltzis
2006-Sep-25 15:38 UTC
Tightening the rules for literal `[` and `]` chars in link ids
* John Gruber <gruber@fedora.net> [2006-09-25 21:15]:

> This change won't solve the problem for the magic implicit link
> references:
>
>     The next line will disappear
>     [like this]: here.
>
> One way to address this might be to tighten up the rules for
> what a URL is.

I doubt you can do that without undue restrictions. `here.` might just as soon be a word as an actual relative link I want to use.

* John Gruber <gruber@fedora.net> [2006-09-25 21:25]:

> After a few more minutes of thought, I'm having a hard time
> coming up with a good reason why `[` and `]` shouldn't just be
> banned characters for link ref IDs.

That was my first thought after reading the previous mail. They just make short link names harder to read.

However, I am using Markdown on a wiki where I currently implement internal links simply by doing

    [frobnicate the weeblefitzer]: /doc/42

where the link name is just the title of the page at `/doc/42`. That way I get all the Markdown linking features for intrawiki links without any effort [^1], but there might well be square brackets in the title.

It is a rare enough case that requiring backslashes shouldn't be onerous, though. So that's what I'd lean toward.

[^1]: Would that Markdown.pl had an API; then I could simply `$mkd->add_link( $title => $href);` instead of the current dirty hack of appending generated Markdown to documents.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
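If backslash escaping became the rule, a generator like the wiki hack described in this message could escape page titles before appending definitions. The following is only a sketch in Python (the wiki itself is presumably driven from Perl); `escape_brackets` and `link_definition` are hypothetical helpers, and the lookbehind is a simplification that does not handle a literal backslash that is itself escaped.

```python
import re


def escape_brackets(title):
    """Backslash-escape literal [ and ] that are not already preceded by a backslash."""
    # Simplification: a bracket after an escaped backslash ("\\\\[") is left alone.
    return re.sub(r"(?<!\\)([\[\]])", r"\\\1", title)


def link_definition(title, href):
    """Build a reference-style link definition line for a page title."""
    return f"[{escape_brackets(title)}]: {href}"


print(link_definition("frobnicate the weeblefitzer", "/doc/42"))
print(link_definition("frobnicate [the] weeblefitzer", "/doc/42"))
```

Titles without brackets pass through unchanged, so existing generated definitions would be unaffected.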
Michel Fortin
2006-Sep-25 17:22 UTC
Tightening the rules for literal `[` and `]` chars in link ids
On 25 Sept. 2006 at 15:09, John Gruber wrote:

> I should also note that this entire problem has never been
> reported to me by anyone, so it doesn't seem to be something
> people are stumbling upon frequently.

That's why I think it'd be better to solve this while still allowing nested brackets.

The other reason for supporting nested brackets is that they are already supported for the content of the link, which is often the same as the definition name. How unnatural would this be if you had to:

    In [his [distorted] view][], this isn't something to worry about.

    [his \[own\] view]: http://something.com/

Michel Fortin
michel.fortin@michelf.com
http://www.michelf.com/
Michel Fortin
2006-Sep-25 17:29 UTC
Tightening the rules for literal `[` and `]` chars in link ids
I sent the last message before I had finished it, sorry. (Note to self: don't hit Cmd-Shift-D while writing a message in OS X's Mail.)

What I was trying to say is that in some situations you would have to escape brackets and in others you wouldn't. While cases like this are pretty rare, it's probably better to keep the two consistent.

    In [his [distorted] view][], this isn't something to worry about.

    [his \[own\] view]: http://something.com/

On the other hand, since this was never really reported to John, nor to me for that matter, whether or not we choose to support unescaped but correctly nested brackets is probably not that important.

Michel Fortin
michel.fortin@michelf.com
http://www.michelf.com/