Matt Kraai
2007-May-23 03:48 UTC
Markdown generates invalid html for a list immediately followed by a quote
Howdy,

[Please preserve the CC to 424919-forwarded at bugs.debian.org on any replies.]

The following bug in Markdown was reported to the Debian bug tracking system. In short, running both the released version of Markdown and the latest beta on

    * foo
    > bar
    > baz

produces invalid HTML.

----- Forwarded message from Joey Hess <joeyh at debian.org> -----

From: Joey Hess <joeyh at debian.org>
To: Debian Bug Tracking System <submit at bugs.debian.org>
Subject: Bug#424919: generates invalid html for a list element immediately followed by a quote
Date: Thu, 17 May 2007 16:07:35 -0400
X-Spam-Status: No, hits=-8.5 required=4.0 tests=BAYES_00,FROMDEVELOPER,
    HAS_PACKAGE,HTML_MESSAGE,RCVD_IN_SORBS autolearn=ham
    version=2.60-bugs.debian.org_2005_01_02

Package: markdown
Version: 1.0.1-6
Severity: normal

joey at kodama:~>cat foo
* foo
> bar
> baz
joey at kodama:~>markdown foo
<p><ul>
<li>foo</p>
<blockquote>
<p>bar
baz</li>
</ul></p>
</blockquote>

Notice that the closing tags are not in the right order.. If a newline is added before the quote, it closes the list before starting the blockquote, so that's a workaround.

(This also happens with markdown 1.0.2~b8-1)

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.20-1-686 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages markdown depends on:
ii  perl  5.8.8-7  Larry Wall's Practical Extraction

markdown recommends no packages.

-- no debconf information

-- 
see shy jo

----- End forwarded message -----

-- 
Matt
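To make "the closing tags are not in the right order" concrete, here is a minimal sketch in Python (not part of the original report) that walks the reported output and returns the first closing tag that does not match the innermost open tag. The names `NestingChecker` and `first_nesting_error` are made up for illustration, and the check is deliberately stricter than HTML itself, which allows some end tags to be omitted; it also does not report tags left open at the end of the input.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Remember the first end tag that does not close the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, innermost last
        self.error = None    # (closing_tag, tag_that_was_actually_open)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.error is not None:
            return  # only the first mismatch is recorded
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.error = (tag, self.stack[-1] if self.stack else None)


def first_nesting_error(html):
    """Return (closing_tag, innermost_open_tag) for the first mismatch, or None."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.error


# Output quoted in the bug report (exact whitespace is an assumption).
REPORTED = """<p><ul>
<li>foo</p>
<blockquote>
<p>bar
baz</li>
</ul></p>
</blockquote>
"""

# One well-nested rendering of the same input, for comparison.
WELL_NESTED = """<ul>
<li>foo</li>
</ul>
<blockquote>
<p>bar
baz</p>
</blockquote>
"""

print(first_nesting_error(REPORTED))     # ('p', 'li')
print(first_nesting_error(WELL_NESTED))  # None
```

The first mismatch is the `</p>` that arrives while `<li>` is still open, which matches the problem described in the report.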
Michel Fortin
2007-May-23 14:01 UTC
Markdown generates invalid html for a list immediately followed by a quote
On 2007-05-22 at 23:48, Matt Kraai wrote:

> * foo
> > bar
> > baz

Although it should certainly be valid HTML, the output Markdown should generate for that is a pretty tricky question in my opinion. I see three valid interpretations according to the Markdown syntax documentation. Here is the simplest:

    <ul>
    <li>foo
    > bar
    > baz</li>
    </ul>

It also happens to be PHP Markdown's output. bar and baz are taken as part of the list item since the following lines do not need to be indented, and since the list item does not contain any blank line, the content gets treated as span-level, hence no blockquote.

The next one is what most people would expect, I think:

    <ul>
    <li>foo</li>
    </ul>

    <blockquote>
    <p>bar
    baz</p>
    </blockquote>

Blockquote markers are obeyed and are on the same level as the list since they aren't indented.

Third option:

    <ul>
    <li><p>foo</p>

    <blockquote>
    <p>bar
    baz</p>
    </blockquote></li>
    </ul>

Blockquote markers are seen as inside the list item since adjacent lines do not need indentation, and are obeyed, making the list item's content block-level.

I think, as a general rule, the explicit syntax should take precedence over the lazy one. This would make the second option above the preferred one over the others.

Other tricky cases could work like the following.

A list item containing a "foo" paragraph and a "bar baz" blockquote:

    * foo
      > bar
      > baz

A list item containing a "foo" paragraph and a "bar" blockquote, followed by a "baz" blockquote:

    * foo
      > bar
    > baz

A list item containing a "foo" paragraph, followed by a "bar baz" blockquote:

    * foo
    > bar
    baz

A list item containing "foo" (no paragraph), followed by a blockquote containing "bar", followed by a list item containing "baz" (no paragraph):

    * foo
    > bar
    * baz

Basically, I'd eliminate any "half-lazy" syntax where you can be lazy about list item indentation while not being lazy on blockquote markers. This just creates confusion; syntax markers shouldn't be allowed to be lazy.
Removing half-lazy things would also fix a surprising issue with blockquotes:

    > foo
    > > bar
    > baz

This would be seen as a blockquote containing a "foo" paragraph, a nested "bar" blockquote and a "baz" paragraph, instead of the completely counter-intuitive output produced today. To make "baz" part of the nested blockquote, you would either go the explicit route:

    > foo
    > > bar
    > > baz

or the lazy route:

    > foo
    > > bar
    baz

but not something in between.

Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/
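The "explicit or fully lazy, nothing in between" rule for blockquotes can be sketched as a per-line depth calculation. This is only one reading of the proposal, written in Python for illustration; `quote_depths` is a hypothetical helper, not code from Markdown.pl or PHP Markdown. A line that carries `>` markers sets the depth to the number of markers it shows; a line with no markers at all is lazy and keeps the current depth; a blank line resets the depth to zero.

```python
import re

# One or more ">" markers at the start of a line, each optionally
# preceded by spaces or tabs.
MARKERS = re.compile(r"^(?:[ \t]*>)+")


def quote_depths(lines):
    """Return the blockquote depth assigned to each line under the sketched rule."""
    depths = []
    current = 0
    for line in lines:
        if not line.strip():
            current = 0                  # a blank line ends any lazy continuation
            depths.append(0)
            continue
        prefix = MARKERS.match(line)
        if prefix:
            # Explicit markers are obeyed exactly as written.
            current = prefix.group(0).count(">")
        # A line without markers is fully lazy: it keeps the current depth.
        depths.append(current)
    return depths


print(quote_depths(["> foo", "> > bar", "> baz"]))    # [1, 2, 1]
print(quote_depths(["> foo", "> > bar", "> > baz"]))  # [1, 2, 2]
print(quote_depths(["> foo", "> > bar", "baz"]))      # [1, 2, 2]
```

Under this reading the half-lazy `> baz` line returns to depth 1, while both the explicit and the fully lazy spellings keep "baz" in the nested quote.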
Matt Kraai
2007-Jun-13 06:00 UTC
Markdown generates invalid html for a list immediately followed by a quote
Howdy,

The attached patch prevents Markdown from generating invalid HTML if a list is followed immediately by a quote, e.g.,

    * foo
    > bar
    > baz

-- 
Matt
-------------- next part --------------
diff -Nru /tmp/ZHVRVKY7mc/markdown-1.0.2~b8/Markdown.pl /tmp/dDW4tYqXOK/markdown-1.0.2~b8/Markdown.pl
--- /tmp/ZHVRVKY7mc/markdown-1.0.2~b8/Markdown.pl 2007-06-12 22:58:14.000000000 -0700
+++ /tmp/dDW4tYqXOK/markdown-1.0.2~b8/Markdown.pl 2007-06-12 22:58:15.000000000 -0700
@@ -858,6 +858,11 @@
             (                       # $4
                 \z
               |
+                \n
+                (?=                 # Lookahead for a blockquote
+                    [ \t]*>
+                )
+              |
                 \n{2,}
                 (?=\S)
                 (?!                 # Negative lookahead for another list item marker
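The idea behind the added alternative can be tried in isolation. The following Python sketch is a heavily simplified stand-in, not the actual Markdown.pl pattern: it keeps only the "where does the list end" alternation and drops the negative lookahead for another list item marker. `LIST_END` and `split_list_from_rest` are names invented for this example.

```python
import re

# Simplified stand-in for the list-terminator alternation touched by the patch.
LIST_END = re.compile(
    r"""
    (?:
        \Z                      # end of input
      |
        \n (?= [ \t]* > )       # added by the patch: newline followed by a blockquote marker
      |
        \n{2,} (?= \S )         # blank line(s) followed by non-whitespace
    )
    """,
    re.VERBOSE,
)


def split_list_from_rest(text):
    """Cut text at the first place the simplified pattern says the list ends."""
    match = LIST_END.search(text)   # always matches, at the latest at end of input
    return text[:match.start()], text[match.end():]


print(split_list_from_rest("* foo\n> bar\n> baz\n"))
# ('* foo', '> bar\n> baz\n')
print(split_list_from_rest("* foo\n  continued\n"))
# ('* foo\n  continued\n', '')
```

Because the lookahead is `[ \t]*>`, this sketch also ends the list before an indented `>` line, so a quote indented under the list item is split off as well.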