Esteemed human authors and robotic parse-bots:

I recently discovered that most or all Markdown implementations, including Gruber's original in Perl, have an odd behavior with regard to lists that follow each other. Namely, a bulleted list followed by a numbered list, or vice versa, is treated as if it were part of the first list (and of the first list's type).

For example, consider the following input:

~~~~~~

- Bulleted item
- Second bulleted item

1. Numbered list
2. Second numbered item

~~~~~~

It will yield an output of one UL element with four LI children (themselves containing some number of P tags, varying by implementation).

Now, I realize full well that a blank line between list items causes the list items to be given <p> tags. But the blank line above, to any reasonable *human*, isn't separating list items but rather *lists.*

There is a fundamental problem in the above code: it triggers **non-obvious data loss.**

The data is of course the numbering.

The non-obviousness is due to the way the output formatting is essentially correct, and only the list item markers are unexpected. A cursory scan of the Markdown-transformed text (e.g., looking over a blog post before publishing) will show no structural problems. Success, publish! ... How long until the author realizes his/her reference to "step #2" is actually referring to the fourth bullet in an awkward list?

One of the nicest things about Markdown is that once you get it, and it doesn't take long, there is precious little by way of surprises it will throw at you.

If for no other reason, I think the counter-intuitiveness and the "crap, do I really have to remember that you can't follow a list by a list?" moment are in and of themselves reasons to change the behavior. Besides the data loss.

I also struggle to imagine anyone who would be upset at the change. After all, what end-user would *rely* on this feature to munge their list types?

Alan
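The behavior reported above can be modelled with a few lines of Python. This is only a toy sketch of the symptom, not the code of Markdown.pl or of any real implementation; the names `ITEM` and `naive_group` are made up for illustration. It assumes the first marker fixes the list type and that blank lines never end a list.

```python
import re

# Toy model only: NOT the code of Markdown.pl or any real implementation.
ITEM = re.compile(r"^(?:[-*+]|\d+\.)\s+(.*)$")


def naive_group(text):
    """Group list items the way the post describes: the first marker
    fixes the list type, and blank lines never end the list."""
    list_type = None
    items = []
    for line in text.splitlines():
        match = ITEM.match(line)
        if not match:
            continue  # blank lines are skipped, so they cannot split lists
        if list_type is None:
            list_type = "ol" if line[0].isdigit() else "ul"
        items.append(match.group(1))
    return list_type, items


source = """- Bulleted item
- Second bulleted item

1. Numbered list
2. Second numbered item
"""

list_type, items = naive_group(source)
print(list_type, len(items))  # prints "ul 4"
```

Under these assumptions the four items end up in a single `ul`, and the numbering of the last two is silently dropped, which is the data loss the message describes.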
Hello,

while I agree that this is technically an issue, I don't think it is an often seen issue in actual human-written text. Markdown is plain text formatted by and for humans. I don't think there are many cases where you would want to put two lists after each other without an introduction of sorts.

And on a side note: Gruber notes in the Markdown spec that the actual numbers used in a numbered list are ignored. So data loss is already occurring here.

Greetings,
_Lasar

On 2011-06-06, at 19:20, Alan Hogan wrote:

> Esteemed human authors and robotic parse-bots:
>
> I recently discovered that most or all Markdown implementations, including Gruber's original in Perl, have an odd behavior with regard to lists that follow each other.
>
> [...]
>
> _______________________________________________
> Markdown-Discuss mailing list
> Markdown-Discuss at six.pairlist.net
> http://six.pairlist.net/mailman/listinfo/markdown-discuss

--
_Lasar Liepins
lasar at liepins.net
http://liepins.net/
http://10110101.net/
I agree with Lasar that such cases arise infrequently. I do support such a change in theory, but I'm not sure how difficult it would be to implement, given that double line breaks can be used to have list items wrapped in `p` tags.

Alan Hogan <contact at alanhogan.com> wrote:

> After all, what end-user would *rely* on this feature to munge their list
> types?

Good point.

David
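One way to picture the implementation concern: a parser could look at the markers on both sides of a blank line before deciding what the blank line means. The sketch below is a hypothetical illustration in Python, not taken from any Markdown implementation; `MARKER`, `marker_kind`, and `after_blank_line` are invented names, and indented continuation paragraphs and nested lists are ignored.

```python
import re

# Hypothetical helpers for illustration; not from any Markdown implementation.
MARKER = re.compile(r"^([-*+]|\d+\.)\s+")


def marker_kind(line):
    """Return 'ul', 'ol', or None for a line of text."""
    match = MARKER.match(line)
    if match is None:
        return None
    return "ol" if match.group(1)[0].isdigit() else "ul"


def after_blank_line(previous_item, next_line):
    """Classify what a blank line between two lines means."""
    prev_kind = marker_kind(previous_item)
    next_kind = marker_kind(next_line)
    if next_kind is None:
        return "not-an-item"
    if prev_kind == next_kind:
        return "loose-item"  # same list; items get wrapped in <p>
    return "new-list"        # list type changed, so start a new list


print(after_blank_line("- first", "- second"))  # prints "loose-item"
print(after_blank_line("- first", "1. second"))  # prints "new-list"
```

Under this assumption the existing "blank line means `p` tags" rule is kept for items with the same kind of marker, and only a change of marker kind starts a new list. Whether that is easy to retrofit into a given parser is a separate question.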
Quoth _Lasar:

> while I agree that this is technically an issue, I don't think it
> is an often seen issue in actual human-written text. Markdown is
> plain text formatted by and for humans. I don't think there are
> many cases where you would want to put two lists after each other
> without an introduction of sorts.

I must of course agree that it is not an exceedingly common case, or a terribly sensible decision to make. That said:

- Consider a student quickly taking notes, or a liveblogger publishing quickly. They may not have time to write an intro for each list, or realize that they skipped it...
- I personally have experienced this issue, so it does happen.
- Even if a small fraction of users run into this issue (half a percent, say), if I am providing a service to two hundred thousand users (and I do), that's a thousand people affected.

> And on a side note: Gruber notes in the markdown spec that the
> actual numbers used in a numbered list are ignored. So data loss
> is already occurring here.

Now that is true. However:

- Existing data loss doesn't mean we should be okay with more data loss.
- The numbers couldn't really always be matched in the output given how HTML works, anyway...
- I personally made a mistake by starting a paragraph with "1999." today, so this too can cause problems. (At least it's part of the Markdown spec, though.)
- I am personally disappointed that the `start` attribute (?) isn't used, based off the first number in the list; this would also help catch mistakes.

Given that I still struggle to see a downside to making my proposed change, I'm really hoping we can achieve a rough consensus here.
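For reference, here is roughly what using the `start` attribute could look like on the output side. This is a minimal Python sketch, not how any existing Markdown implementation renders lists; `render_ordered_list` is a hypothetical helper, and it assumes the parser has already extracted the first number of the list.

```python
from html import escape


def render_ordered_list(items, start=1):
    """Render an <ol>, emitting start= only when the list does not begin at 1."""
    start_attr = "" if start == 1 else f' start="{start}"'
    body = "".join(f"<li>{escape(item)}</li>" for item in items)
    return f"<ol{start_attr}>{body}</ol>"


print(render_ordered_list(["third step", "fourth step"], start=3))
# prints: <ol start="3"><li>third step</li><li>fourth step</li></ol>
```

With output like this, a paragraph that accidentally begins with "1999." would render as a list numbered from 1999, which is much easier to spot in a preview than a list that quietly starts at 1.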
+++ Alan Hogan [Jun 06 11 10:20 ]:

> Esteemed human authors and robotic parse-bots:
>
> I recently discovered that most or all Markdown implementations, including Gruber's original in Perl, have an odd behavior with regards to lists that follow each other. Namely, a bulleted list followed by a numbered list, or vice-versa, is masked as if it were part of the first list (and of the first list's type.)
>
> For example, consider the following input:
>
> ~~~~~~
>
> - Bulleted item
> - Second bulleted item
>
> 1. Numbered list
> 2. Second numbered item
>
> ~~~~~~

I strongly agree that this should be parsed as an unordered list followed by an ordered list. That is how any normal person would construe it.

I also think that the following should be interpreted as two different unordered lists (that is, the change of bullet character should be significant):

~~~~~~
* one
* two

- new
- list
~~~~~~

Finally, I think that the starting number of an ordered list should be significant. Otherwise there is no way to have a running list with commentary in between (but not part of) the items.

(By the way, pandoc implements this last feature, and the next major version will implement the first two.)

John
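The three rules proposed in this message can be sketched together in Python. This is an illustration under simplifying assumptions (one item per line, no nesting, no indented continuation paragraphs) and is not pandoc's implementation; `ITEM` and `split_lists` are hypothetical names.

```python
import re

# Illustrative sketch only; this is not pandoc's implementation.
ITEM = re.compile(r"^([-*+]|\d+\.)\s+(.*)$")


def split_lists(text):
    """Group items into lists, starting a new list when the marker changes."""
    lists = []
    current_key = None
    for line in text.splitlines():
        match = ITEM.match(line)
        if match is None:
            if line.strip():
                current_key = None  # a paragraph ends the current list
            continue  # a blank line alone never splits a list here
        marker, content = match.groups()
        # All numbers share one key; each bullet character is its own key.
        key = "ol" if marker[0].isdigit() else "ul" + marker
        if key != current_key:
            is_ordered = key == "ol"
            lists.append({
                "kind": "ol" if is_ordered else "ul",
                # Keep the first number so the renderer could honor it.
                "start": int(marker[:-1]) if is_ordered else None,
                "items": [],
            })
            current_key = key
        lists[-1]["items"].append(content)
    return lists


running = "3. third\n\nSome commentary.\n\n4. fourth\n"
print([(lst["kind"], lst["start"]) for lst in split_lists(running)])
# prints: [('ol', 3), ('ol', 4)]
```

In this sketch, `- a` followed by a blank line and `- b` still forms one list, while a switch from `*` to `-`, or from bullets to numbers, starts a new one. The recorded `start` value is what would let a running list resume at 4 after intervening commentary.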
John MacFarlane <jgm at berkeley.edu> wrote:

> I also think that the following should be interpreted as two different
> unordered lists (that is, the change of bullet character should be
> significant):
>
> ~~~~~~
> * one
> * two
>
> - new
> - list
> ~~~~~~

+1 to this, too.

David