> Anyway, a spec for Markdown Extra would contain a spec for Markdown as
> well, wouldn't it?

I think the whole enterprise would be a lot more valuable if we produce a combined spec, which would be self-contained, and call it Markdown 2.0.

I don't think we necessarily need a formal grammar. What we need is to create a document, starting with "Markdown Syntax" perhaps, throw a bunch of questions at it, settle on the answers, and incorporate them into a spec. Perhaps we can use the wiki at http://markdown.infogami.com/ for this.

(BTW, I just cleaned up the wiki, removing links to unrelated sites and reorganizing the rest into what seemed like a more coherent set of categories.)

- yuri

--
http://sputnik.freewisdom.org/
On 2008-02-29 at 3:49, Yuri Takhteyev wrote:

>> Anyway, a spec for Markdown Extra would contain a spec for Markdown
>> as well, wouldn't it?
>
> I think the whole enterprise would be a lot more valuable, if we
> produce a combined spec, which would be self-contained, and call it
> Markdown 2.0.

I also think the Markdown Extra spec should be usable as a spec for how to parse plain Markdown. But I'm not convinced calling it "Markdown 2.0" will make the spec much more valuable. Can we even do that without John Gruber's blessing?

> I don't think we necessarily need a formal grammar. What we need is
> to create a document, starting with "Markdown Syntax" perhaps, throw a
> bunch of questions at it, settle on the answers, incorporate them into
> a spec. Perhaps we can use the wiki at http://markdown.infogami.com/
> for this.

I think the syntax needs to be defined unambiguously, not necessarily as a formal grammar, but certainly not with code either. My idea, currently, is to write a parsing procedure which is easy to read and implement in various ways, using a formal grammar to define various constructs of the syntax and plain English to link things together. I also intend to keep the spec implementable as an incremental parser, but that will require backtracking.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
On Feb 29, 2008, at 10:13 AM, Michel Fortin wrote:

> I think the syntax needs to be defined unambiguously, not
> necessarily as a formal grammar, but certainly not with code either.
> My idea, currently, is to write a parsing procedure which is easy to
> read and implement in various ways, using a formal grammar to define
> various constructs of the syntax and plain english to link things
> together. I also intend to keep the spec implementable as an
> incremental parser, but that will require backtracking.

I agree that Markdown needs to be defined unambiguously, but I don't think that's feasible with plain English in the loop. For something as complex and flighty as Markdown, we need working code.

One possibility I've been thinking about is to define Markdown as a dressed-up parsing expression grammar, which would be fed to a simple, portable packrat parser. The result would be a "spec" that makes sense to machines as well as humans, so implementors would have something to test against.

We could take it a step further and define rewrite rules to transform the parse tree into the final output. Then we'd have a readable definition of both syntax and semantics that doubles as a reference implementation. It might be too slow for many uses; I'd probably have to do a hand-written JavaScript parser for [Showdown] 2.0, for example. But it's much easier to optimize a working system than a paper spec.

[Showdown]: http://www.attacklab.net/showdown-gui.html

The biggest win here would be extensibility: adding rules to a PEG is simple enough that Markdown would finally be flexible. Users could add custom rules for wiki links, footnotes, or whatever else they need -- without the risk of fragmenting the language.

If there's interest in this approach, I'll try to do a quick hand-wavy prototype next week to show what I'm talking about.

John Fraser
http://wmd-editor.com/ (going open source soon)
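A minimal sketch of the packrat idea described above, in Python. The tuple-based grammar encoding, the names `make_parser` and `GRAMMAR`, and the toy emphasis rule are all assumptions made for illustration; this is neither the prototype mentioned in the message nor a real Markdown grammar. It only shows how ordered choice, lookahead, and memoization fit together:

```python
from functools import lru_cache


def make_parser(grammar, text):
    """Return apply(rule, pos) -> end position, or None if the rule fails."""

    @lru_cache(maxsize=None)  # memoizing (rule, pos) is what makes this "packrat"
    def apply(rule, pos):
        return match(grammar[rule], pos)

    def match(expr, pos):
        kind = expr[0]
        if kind == "lit":      # literal string
            return pos + len(expr[1]) if text.startswith(expr[1], pos) else None
        if kind == "any":      # any single character
            return pos + 1 if pos < len(text) else None
        if kind == "ref":      # reference to a named rule
            return apply(expr[1], pos)
        if kind == "seq":      # every part must match, one after the other
            for sub in expr[1:]:
                pos = match(sub, pos)
                if pos is None:
                    return None
            return pos
        if kind == "choice":   # ordered choice: the first success wins
            for sub in expr[1:]:
                end = match(sub, pos)
                if end is not None:
                    return end
            return None
        if kind == "not":      # negative lookahead, consumes nothing
            return pos if match(expr[1], pos) is None else None
        if kind == "star":     # zero or more repetitions, greedy
            while True:
                end = match(expr[1], pos)
                if end is None or end == pos:
                    return pos
                pos = end
        raise ValueError(f"unknown expression kind: {kind}")

    return apply


# Hypothetical toy grammar: *emphasis* whose body contains no asterisk.
GRAMMAR = {
    "Emph": ("seq", ("lit", "*"), ("ref", "Body"), ("lit", "*")),
    "Body": ("seq", ("ref", "Char"), ("star", ("ref", "Char"))),
    "Char": ("seq", ("not", ("lit", "*")), ("any",)),
}

parse = make_parser(GRAMMAR, "*hi* there")
print(parse("Emph", 0))  # end offset of the emphasis span, or None
```

Because the grammar is plain data, adding a rule for wiki links or footnotes would mean adding a dictionary entry rather than editing parser code, which is the extensibility argument made in the message.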
In article <6CDFE6FE-CE6C-461D-BB95-18C2A86CB75F at attacklab.net>, John Fraser <markdown-discuss at six.pairlist.net> wrote:

> On Feb 29, 2008, at 10:13 AM, Michel Fortin wrote:
>
>> I think the syntax needs to be defined unambiguously, not
>> necessarily as a formal grammar, but certainly not with code either.
>> My idea, currently, is to write a parsing procedure which is easy to
>> read and implement in various ways, using a formal grammar to define
>> various constructs of the syntax and plain english to link things
>> together. I also intend to keep the spec implementable as an
>> incremental parser, but that will require backtracking.
>
> I agree that Markdown needs to be defined unambiguously, but I don't
> think that's feasible with plain English in the loop. For something
> as complex and flighty as Markdown, we need working code.

I'm not so sure about this. I managed to write a markdown implementation without using anything other than the daring fireball syntax document and MarkdownTest_1.0. And I am by no means a Perl programmer.

If it's possible to write a Markdown (that passes MarkdownTest) with the current documentation, describing it in plain English isn't as difficult as it might seem. And plain English has the advantage that it doesn't require knowledge of the implementation language to understand the document.

-david parsons
On Mar 1, 2008, at 1:19 PM, david parsons wrote:

>> I agree that Markdown needs to be defined unambiguously, but I don't
>> think that's feasible with plain English in the loop. For something
>> as complex and flighty as Markdown, we need working code.
>
> I'm not so sure about this. I managed to write a markdown
> implementation without using anything other than the daring fireball
> syntax document and MarkdownTest_1.0. And I am by no means a
> Perl programmer.

Okay, but I'd argue that your success had a lot more to do with the test suite than the syntax document.

I'll admit it: I'm probably more suspicious of paper specs than I should be. But I can't help thinking that (1) any natural-language Markdown spec will have holes; (2) any test suite will have littler holes; and (3) the most popular implementation will always be the de facto standard.

My JavaScript port of Markdown needs to match up perfectly with a server-side version in order to be useful, so I'm probably a little more sensitive to underspecification than most. But a spec's not worth much if implementations aren't interchangeable. And since Markdown has to continue silently when it gets confused, we'd need to define all the corner cases completely -- or risk locking users into a particular reading of the spec.

I'm all for writing a specification, but I think its purpose should be to inform and to justify a reference implementation and test suite.

- John Fraser
In article <EB61C125-47FB-4DC7-ABE2-BBD576D95F8A at attacklab.net>, John Fraser <markdown-discuss at six.pairlist.net> wrote:

> On Mar 1, 2008, at 1:19 PM, david parsons wrote:
>>> I agree that Markdown needs to be defined unambiguously, but I don't
>>> think that's feasible with plain English in the loop. For something
>>> as complex and flighty as Markdown, we need working code.
>>
>> I'm not so sure about this. I managed to write a markdown
>> implementation without using anything other than the daring fireball
>> syntax document and MarkdownTest_1.0. And I am by no means a
>> Perl programmer.
>
> Okay, but I'd argue that your success had a lot more to do with the
> test suite than the syntax document.

I'm not sure. The test suite kicked out about 7 failures where I spazzed out and misread the syntax document, but there was only one place where I actually had to hack the compiler to generate test-matching code (the first line of a code block needs to have trailing whitespace trimmed).

If I didn't have a test suite, it would have taken a lot longer to dust out the corners, but if I had been confronted with nothing but a mass of Perl code and a test suite, I don't think I would have even bothered.

> I'll admit it: I'm probably more suspicious of paper specs than I
> should be. But I can't help thinking that (1) any natural-language
> Markdown spec will have holes; (2) any test suite will have littler
> holes; and (3) the most popular implementation will always be the de
> facto standard.

Which means that you end up having to code-peek to determine how the language works. And this restricts the audience to people who are proficient in that language (or, worse yet, to people who are proficient in the particular dialect that the developers used when they wrote the code).

(And it still needs documentation so people can actually use the code; if Markdown.pl didn't have the syntax document or the dingus, I suspect that the userbase would be the two people who developed the language in the first place. So if you're going to have a document describing how the language works, why not say that document is the definition of the language and have it fully describe the process that is already being done?)

-david parsons
On 1 Mar 2008, at 19:19, David Parsons wrote:

>> I agree that Markdown needs to be defined unambiguously, but I don't
>> think that's feasible with plain English [...]
>
> I'm not so sure about this. I managed to write a markdown
> implementation without using anything other than the daring fireball
> syntax document and MarkdownTest_1.0. And I am by no means a
> Perl programmer.

And no offense, but there must be hundreds of edge cases where your implementation disagrees with Markdown.pl.

Have a look e.g. at http://six.pairlist.net/pipermail/markdown-discuss/2006-August/000151.html for some edge cases related to nesting block elements, where the outcome is ambiguous, and I am quite sure there are no tests, as Markdown.pl generates bad HTML for most of them.

A formal grammar defines _exactly_ how things should be, as I argue here http://six.pairlist.net/pipermail/markdown-discuss/2007-August/000746.html where I also show how Markdown.pl presently has unintuitive precedence which is definitely not defined in the syntax document and something I doubt is in the tests (as the behavior to me is unattractive, but stems from how Markdown.pl is implemented, and thus most of the ports as well).

The problem so far has been that the formal syntax normally used to define grammars does not support Markdown's notion of embedding, but as mentioned here http://six.pairlist.net/pipermail/markdown-discuss/2008-February/001002.html I have had some success with a rule-based implementation that uses a stack for aggregating rules that need to be applied to the current line before it is handed to the regular parser -- this allows a specification without code which is unambiguous about edge cases, since the rules are exhaustive, unlike a document written in English.

Though without changing a lot of edge-case behavior, I find it hard to see Markdown using such a rule-based implementation, so personally I am favoring a new Markdown-inspired language.
In article <4845-67907 at sneakemail.com>, Allan Odgaard <markdown-discuss at six.pairlist.net> wrote:

> On 1 Mar 2008, at 19:19, David Parsons wrote:
>
>>> I agree that Markdown needs to be defined unambiguously, but I don't
>>> think that's feasible with plain English [...]
>>
>> I'm not so sure about this. I managed to write a markdown
>> implementation without using anything other than the daring fireball
>> syntax document and MarkdownTest_1.0. And I am by no means a
>> Perl programmer.
>
> And no offense, but there must be hundreds of edge-cases where your
> implementation disagrees with Markdown.pl.

I'm sure there are, and that's a good reason to have a better language definition. About all I can say for my implementation is "I think it follows everything in the spec, because it passes MarkdownTest" (including, alas, the one place where I don't think the test suite actually follows the spec), and that's a fairly imprecise definition.

But the point is that I could write a markdown from the spec as it sits now, so there's nothing in the language that prevents it from being described in text. All of those hypothetical edge cases? The ones that aren't defects will just go away when the language defines how they're supposed to work.

> The problem so far has been that the formal syntax normally used to
> define grammars does not support Markdown's notion of embedding,

The simple solution to that is to describe the language with a different syntax. Programming languages exist with different rules at different scopes, so it's not as if there isn't a precedent for describing such things.

The daringfireball syntax document isn't that bad, so expanding on it would seem to be the ideal starting point.

-david parsons
Allan Odgaard wrote:

> Though without changing a lot of edge-case behavior, I find it hard
> to see Markdown using such rule-based implementation, so personally
> I am favoring a new Markdown-inspired language.

For my part, I'm currently trying to specify parsing rules for Markdown Extra, and to make the specification usable to parse Markdown too. The idea is to preserve the way it is working now, but to handle edge cases in a consistent and predictable manner. What I want to achieve is interoperability between implementations for the current Markdown and Markdown Extra languages, not a new look-alike language.

> The problem so far has been that the formal syntax normally used to
> define grammars does not support Markdown's notion of embedding, but
> as mentioned here
> http://six.pairlist.net/pipermail/markdown-discuss/2008-February/001002.html
> I have had some success with a rule-based implementation that uses
> a stack for aggregating rules that needs to be applied to the
> current line before it is handed to the regular parser -- this
> allows a specification without code and which is unambiguous to
> edge-cases since the rules are exhaustive, unlike a document
> written in English.

I'd like to point out one thing: you can always write in English what you can with a formal grammar; if you write things correctly, they'll be precise and unambiguous. This has the disadvantage of being more verbose, but the advantage that you don't need to learn a new "language", which is the grammar.

That said, I'm currently looking at how to specify Markdown formally. Whether to use a grammar or English is to be decided later. I'm looking at the general form of a rule, and I find the post you linked above gives pretty good insight into what I need. Each rule in your lost rule-based implementation had this (quoting):

> 1. A regexp that makes the parser enter the context the rule
>    represents (e.g. block quote, list, raw, etc.).
>
> 2. A list of which rules are allowed in the context of this rule.
>
> 3. A regexp for leaving the context of this rule.
>
> 4. A regexp which is pushed onto a stack when entering the context of
>    this rule, and popped again when leaving this rule.
>
> The fourth item here is really the interesting part, because it is
> what made Markdown nesting work (99% of the time) despite this being
> 100% rule-driven.

I'm not sure what the regular expression in 4 does, besides being pushed and popped from the stack (perhaps it's the end-of-block expression), but overall it looks pretty good, and is pretty similar to how I'm currently approaching the problem. There are a couple of subtleties I'm not sure these rules can catch, though.

In my idea, you'd have parametrized rules. For instance, the list of allowed rules (2) should change depending on the context: you shouldn't have a link within a link, but you can have emphasis in your link; therefore, the emphasis rule when within a link shouldn't have a link rule in its list of sub-rules (2). You also need a way for the regular expression in 3 to be variable depending on what you caught in 1 (to match the same number of backticks in a code span, for instance; to catch a matching closing HTML tag, etc.).

The way I see it, rules need to be parametrized so the above can be changed without having to define 2^(number of syntax elements) rules, such as EmphasisWithinLink, LinkWithinEmphasis, CodeSpanWithinLinkWithinEmphasis, and so on.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
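The two kinds of parametrization asked for above can be sketched in a few lines of Python. Everything here is hypothetical and chosen only for illustration: the rule names, the `NOT_SELF_NESTING` table, and the helper functions are not taken from any of the implementations discussed in this thread. The first half carries a set of banned rules down through nested contexts instead of defining one rule per combination; the second half builds the closing regexp of a code span from what the opening regexp captured:

```python
import re

INLINE_RULES = frozenset({"emphasis", "link", "code_span"})

# Hypothetical table: rules that must not reappear anywhere beneath themselves.
NOT_SELF_NESTING = frozenset({"link"})


def enter(rule, banned=frozenset()):
    """Rules banned inside `rule`, given the rules banned in the enclosing context."""
    return banned | ({rule} & NOT_SELF_NESTING)


def allowed(banned):
    """Item 2 of a rule, computed from the context instead of hard-coded."""
    return INLINE_RULES - banned


CODE_OPEN = re.compile(r"`+")


def code_span_close(open_match):
    """Item 3 of the code-span rule, derived from what item 1 captured."""
    ticks = open_match.group(0)
    # Exactly the same number of backticks, not part of a longer run.
    return re.compile(r"(?<!`)" + re.escape(ticks) + r"(?!`)")


in_link = enter("link")
in_emphasis_in_link = enter("emphasis", in_link)
print(sorted(allowed(in_emphasis_in_link)))

text = "``a ` b`` c"
opening = CODE_OPEN.match(text)
closing = code_span_close(opening).search(text, opening.end())
print(repr(text[opening.end():closing.start()]))
```

With this shape there is one emphasis rule and one link rule; the context a rule is entered from decides what it may contain, so no EmphasisWithinLink-style variants are needed.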
On Mar 3, 2008, at 7:30 AM, Michel Fortin wrote:

> Allan Odgaard wrote:
>> 4. A regexp which is pushed onto a stack when entering the context of
>>    this rule, and popped again when leaving this rule.
>>
>> The fourth item here is really the interesting part, because it is
>> what made Markdown nesting work (99% of the time) despite this being
>> 100% rule-driven.
>
> I'm not sure what the regular expression in 4 does, besides being
> pushed and popped from the stack (perhaps it's the end of block
> expression), but overall it looks pretty good, and is pretty similar
> to how I'm currently approaching the problem. There are a couple of
> subtleties I'm not sure if these rules can catch though.

I assume Allan let the grammar refer back to this stack as if it were an ordinary rule, so you could use the stack to collect levels of indentation. It's like a limited kind of parameterization.

I'd been planning to use recursive transformation to handle nesting, since it makes memoization easier and ought to be a little more readable. But I'll try Allan's idea if mine gets hairy.

I like the direction you're both going, and I'm hoping we can come up with a definition that doesn't use any English at all. Admittedly, that'll be a lot easier for a version that does change some behavior at the edges -- like ditching Markdown's undocumented precedence rules (<http://six.pairlist.net/pipermail/markdown-discuss/2007-August/000746.html>).

I'm going to build my own little prototype to experiment with this stuff (<http://six.pairlist.net/pipermail/markdown-discuss/2008-February/001042.html>). My goal is to come up with a formal grammar that doubles as a (slow) reference implementation. You'll feed a grammar and an input file into a generic text-munging tool, which will spit out either the transformed output or an AST. The tool will be small, easy to port, and completely general -- you could use it to implement html2txt or smartypants or an HTML sanitizer, for example.

That's the plan, anyway; we'll see how the first iteration turns out.

> The way I see it, rules need to be parametrized so the above can be
> changed without having to define 2^(number of syntax elements)
> rules, such as EmphasisWithinLink, LinkWithinEmphasis,
> CodeSpanWithinLinkWithinEmphasis, and so on.

Since I'm doing something packrat-ish, I'm hoping I can use lookahead to keep the rules from exploding.

John Fraser
> Since I'm doing something packrat-ish, I'm hoping I can use lookahead
> to keep the rules from exploding.

I didn't know what packrat was, so I googled it. It looks interesting: from what I understand, it's a more user-friendly way of specifying languages.

Here's a good link, if anyone else is interested:

http://pdos.csail.mit.edu/~baford/packrat/

--
Andrea Censi
PhD student, Control & Dynamical Systems, Caltech
http://www.cds.caltech.edu/~andrea/
"Life is too important to be taken seriously" (Oscar Wilde)
On Mar 3, 2008, at 3:36 PM, Andrea Censi wrote:

> I didn't know what packrat was, so I googled it. It looks interesting:
> from what I understand, it's a more user-friendly way of specifying
> languages.
> Here's a good link, if anyone else is interested:
>
> http://pdos.csail.mit.edu/~baford/packrat/

Sorry, I should have provided a link. And since you're looking at the papers, I should mention that what I'm building isn't strictly a packrat parser -- specifically, it won't guarantee O(n) running time. Two reasons:

1) it lets you match against regular expressions as well as literal strings; and

2) I'll probably handle nesting by modifying the results of one rule (i.e. stripping one level of indentation) and recursively calling a parser on the result.
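The second point, handling nesting by stripping one level of prefix and re-running the parser on the result, can be illustrated with a small Python sketch. It is an assumption-laden toy, not the tool described in the message: it knows only about block quotes, the `QUOTE` pattern and the tuple-shaped tree are invented for the example, and lazy continuation lines are ignored.

```python
import re

# Hypothetical block-quote prefix: up to 3 spaces, ">", one optional space.
QUOTE = re.compile(r" {0,3}> ?")


def parse(lines):
    """Return a nested tree of ("text", line) and ("blockquote", children) nodes."""
    tree, i = [], 0
    while i < len(lines):
        if QUOTE.match(lines[i]):
            inner = []
            # Collect the run of quoted lines, stripping one prefix level from each.
            while i < len(lines):
                m = QUOTE.match(lines[i])
                if m is None:
                    break
                inner.append(lines[i][m.end():])
                i += 1
            # Recurse: the stripped lines are parsed as a document of their own.
            tree.append(("blockquote", parse(inner)))
        else:
            tree.append(("text", lines[i]))
            i += 1
    return tree


print(parse(["> a", "> > b", "c"]))
```

Each level of nesting re-scans its own lines, which is one reason a parser built this way cannot promise linear running time.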
On 3 Mar 2008, at 13:30, Michel Fortin wrote:

> [...]
>> 1. A regexp that makes the parser enter the context the rule
>>    represents (e.g. block quote, list, raw, etc.).
>>
>> 2. A list of which rules are allowed in the context of this rule.
>>
>> 3. A regexp for leaving the context of this rule.
>>
>> 4. A regexp which is pushed onto a stack when entering the context of
>>    this rule, and popped again when leaving this rule.
>>
>> The fourth item here is really the interesting part, because it is
>> what made Markdown nesting work (99% of the time) despite this being
>> 100% rule-driven.
>
> I'm not sure what the regular expression in 4 does, besides being
> pushed and popped from the stack

Yeah, I accidentally sent the letter without noticing I forgot to explain the fourth rule.

The regexps which end up on this stack are used to preprocess the current line, so for example the rule for code blocks is:

    RAW[1] = /\g {4}/            # Four spaces start raw.

    RAW[2] = [ RAW_TEXT ]        # No other rules are active inside raw;
                                 # RAW_TEXT is a dummy .+ rule.

    RAW[4] = /\g( {4}| {,3}$)/   # While in the raw context, we need to
                                 # eat the first four spaces of each
                                 # line, or the line must be empty.

Two things to notice here:

1. I don't use an explicit "end" rule, since we automatically leave the context if RAW[4] doesn't successfully match.

2. I use \g instead of ^ since we need to anchor to where the last block rule stopped matching, not necessarily BOL.

Now take the rule for block quote:

    BQ[1] = /\g {,3}> {,3}/      # We start it for lines with > allowing
                                 # up to 3 spaces before/after.

    BQ[2] = [ BQ, RAW, PAR, ... ]  # Basically all block elements
                                   # can go inside block quote.

    BQ[3] = /\g( *$|«hr»)/       # We leave block quote at empty lines or
                                 # horizontal rules¹. The actual pattern
                                 # for «hr» is something like:
                                 # [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$

    BQ[4] = /\g( {,3}> ?)?/      # While in BQ eat leading quote characters.

¹ I am actually not sure if this is "the spec" or just a bug. But placing a horizontal rule just below a block quoted paragraph does not trigger the expected "lazy mode" (which would place the <hr> inside the block quote); instead it leaves the block quote.

Just to make the example more complete, let us also have a paragraph rule:

    PAR[1] = /\g {,3}(?=[^ >])/  # Any non-special character with less than
                                 # 4 leading spaces starts a paragraph.

    PAR[2] = [ B, EM, LINK, TEXT, ... ]  # All the inline stuff works
                                         # in this context.

    PAR[3] = /\g(?= {4}| {,3}>| {,3}$)/  # We exit the paragraph when the
                                         # line is starting raw, block
                                         # quote, or is empty. In practice
                                         # paragraphs do end with block
                                         # quote, but not with raw.

Now we have 3 rules. Be aware that I typed all this just now without actual testing, and the goal is not to replicate Markdown.pl 100%, just to give an example of how the rule system works.

So our ROOT rule looks like this:

    ROOT[1] = //
    ROOT[2] = [ RAW, BQ, PAR ]

So when we start to process a document using this root rule, we will get a match (without actually advancing our position in the document, since zero characters were matched). After this match we have RAW, BQ, and PAR as active rules.

Say our document looks like this:

    > A normal paragraph
    >     Some raw text
    > Normal text again
    Out of the block quote

The first line is "> A normal paragraph" and we have 3 rules to apply: BQ[1], RAW[1], and PAR[1]. Since all of these regexps start with \g, they are anchored to the first byte of the document, and only BQ[1] will match. This "eats" the "> " prefix, pushes BQ[4] on our stack, and makes BQ, RAW, and PAR our new active rules (yeah, the same as before).

So we now have "A normal paragraph" and again apply our 3 active rules. This time PAR[1] will match; it won't actually eat any characters, and it won't push additional rules onto our stack, but it will change the active rules to: B, EM, LINK, TEXT, ...

I didn't define TEXT, but that is a fallback rule for non-special text runs. We apply these rules to the line, and TEXT will match the line.

Now comes the special part. When we move to the next line, which is ">     Some raw text", we start by applying the rules from our stack to this line. We have BQ[4] on the stack, which will eat the leading "> ". The line is now "    Some raw text" (with four leading spaces) and we have no more rules on the stack.

Before we apply the active rules though, we need to check if we need to leave the current context, which is PAR. Thus we try to apply PAR[3], and we do get a match, so we leave PAR. The active rules now revert to those active before we entered PAR, i.e. RAW, BQ, and PAR. Applying these will give a match for RAW, so we eat the match (the leading four spaces), push RAW[4] on the stack, and set the new active rules to RAW[2], i.e. RAW_TEXT.

The line is now "Some raw text", which will be eaten by the RAW_TEXT rule.

The next line is "> Normal text again" and we have both BQ[4] and RAW[4] on the stack. We apply these in FIFO order, so first BQ[4], which eats "> ", then RAW[4], which fails to match, instructing us to leave RAW, ...

Okay, enough writing -- I hope the above gives a better understanding of how the rules are used.

> [...] You also need a way for the regular expression in 3 to be
> variable depending on what you caught in 1 (to match the same number
> of backticks in a code span for instance; to catch a matching
> closing HTML tag, etc.).

I allow captures from the match done by 1 to be referenced in 3.
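The line-by-line procedure walked through above can be approximated in Python. This is a simplified sketch under several assumptions, not a reconstruction of the implementation described in the message: only the BQ and RAW block contexts are modelled (paragraphs and inline rules are left out), the block-quote opener eats a single optional space after ">" so that nested raw text stays detectable, `\g` is emulated by passing a start offset to `Pattern.match`, and `{,3}` is written as `{0,3}`. The names `Rule`, `RULES`, `CHILDREN`, and `trace` are made up for the example, and a blank line has been added to the sample document so that the block quote is actually left.

```python
import re
from collections import namedtuple

# enter = item 1, leave = item 3 (may be None), cont = item 4.
Rule = namedtuple("Rule", "enter leave cont")

RULES = {
    "BQ": Rule(re.compile(r" {0,3}> ?"), re.compile(r" *$"),
               re.compile(r"(?: {0,3}> ?)?")),
    "RAW": Rule(re.compile(r" {4}"), None, re.compile(r" {4}| {0,3}$")),
}

# Item 2: which block rules may start inside each context.
CHILDREN = {"ROOT": ["RAW", "BQ"], "BQ": ["RAW", "BQ"], "RAW": []}


def trace(lines):
    """Return (context path, remaining text) for every line."""
    stack, result = [], []
    for line in lines:
        pos = 0
        # Replay the item-4 regexps, outermost context first (FIFO).
        for depth in range(len(stack)):
            m = RULES[stack[depth]].cont.match(line, pos)
            if m is None:
                del stack[depth:]  # leave this context and everything inside it
                break
            pos = m.end()
        # Item 3: does the innermost remaining context end on this line?
        if stack:
            leave = RULES[stack[-1]].leave
            if leave is not None and leave.match(line, pos):
                stack.pop()
        # Item 1: enter as many nested contexts as this line opens.
        opened = True
        while opened:
            opened = False
            for name in CHILDREN[stack[-1] if stack else "ROOT"]:
                m = RULES[name].enter.match(line, pos)
                if m:
                    stack.append(name)
                    pos = m.end()
                    opened = True
                    break
        result.append(("/".join(stack) or "ROOT", line[pos:]))
    return result


DOC = [
    "> A normal paragraph",
    ">     Some raw text",
    "> Normal text again",
    "",
    "Out of the block quote",
]
for context, rest in trace(DOC):
    print(f"{context:8} {rest!r}")
```

The stack holds context names rather than the regexps themselves, which amounts to the same thing here because each context contributes exactly one item-4 regexp.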
On 2008-03-04 at 0:49, Allan Odgaard wrote:

> On 3 Mar 2008, at 13:30, Michel Fortin wrote:
>
>> [...]
>>> 1. A regexp that makes the parser enter the context the rule
>>>    represents (e.g. block quote, list, raw, etc.).
>>>
>>> 2. A list of which rules are allowed in the context of this rule.
>>>
>>> 3. A regexp for leaving the context of this rule.
>>>
>>> 4. A regexp which is pushed onto a stack when entering the context
>>>    of this rule, and popped again when leaving this rule.
>>>
>>> The fourth item here is really the interesting part, because it is
>>> what made Markdown nesting work (99% of the time) despite this being
>>> 100% rule-driven.
>>
>> I'm not sure that the regular expression in 4 does, beside being
>> pushed and popped from the stack
>
> Yeah, I accidentally sent the letter w/o noticing I forgot to
> explain the fourth rule.
>
> [big explanation]

So you're basically using a line-by-line approach. I was thinking about that as a possibility for parsing blocks, but I don't think I'll do that because I need backtracking to be able to rewind beyond the current line. Or can you do it? I'm particularly curious about how you can handle headers of this form:

    Header
    =====

> Now take the rule for block quote:
>
>     BQ[1] = /\g {,3}> {,3}/      # We start it for lines with > allowing
>                                  # up to 3 spaces before/after.
>
>     BQ[2] = [ BQ, RAW, PAR, ... ]  # Basically all block elements
>                                    # can go inside block quote.
>
>     BQ[3] = /\g( *$|«hr»)/       # We leave block quote at empty lines or
>                                  # horizontal rulers¹. The actual pattern
>                                  # for «hr» is something like:
>                                  # [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$
>
>     BQ[4] = /\g( {,3}> ?)?/      # While in BQ eat leading quote
>                                  # characters.
>
> ¹ I am actually not sure if this is "the spec" or just a bug. But
> placing a horizontal ruler just below a block quoted paragraph does
> not give the expected "lazy mode" and places the <hr> inside the
> block quote, instead it leaves the block quote.

I'm not sure what the problem with horizontal rules in blockquotes is. I've tried many variations of:

    > test
    >
    > ***
    >
    > test

and couldn't make it end the blockquote prematurely. If it did, I'd say it'd be a bug, because I see no way the user would expect the horizontal rule to break the blockquote and no reason for the parser to do so either.

> [...]
>
> Okay, enough writing -- I hope the above gives a better understanding
> of how the rules are used.

Indeed, it was quite insightful. Thank you.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
On Tue, Mar 4, 2008 at 8:02 PM, Michel Fortin <michel.fortin at michelf.com> wrote:

> I'm not sure what's the problem with horizontal rules in blockquotes.
> I've tried many variations of:

...trying to reproduce this on the Daring Fireball Markdown dingus led to a weird result. Here's the input:

    > test
    >----

and here's the output:

    <blockquote>
    <p>test</p>
    <h2>></h2>
    </blockquote>
On 5 Mar 2008, at 05:02, Michel Fortin wrote:

>> [big explanation]
> So you're basically using a line by line approach.

Yes, seeing how the block-level nesting stuff affects things "line by line", this seems like the best approach :)

> I was thinking about that as a possibility for parsing blocks, but I
> don't think I'll do that because I need backtracking to be able to
> rewind beyond the current line. Or can you do it?

Backtracking? I am not sure when you want to do it and why you think looking at things line by line will prevent you from doing it.

> I'm particularly curious about how you can handle headers of this
> form:
>
>     Header
>     =====

One approach is (for the regexp which starts a context) to allow an array instead of just a single regexp¹. The index of the regexp in this array is the relative line offset (to the current line) that the regexp should be matched against.

So for the setext-style header the rule would be:

    H1[1][0] = /\g.+/      # anything can go into a heading
    H1[1][1] = /\g={3,}/   # but line below must have at least
                           # three equal signs

Of course when testing the regexps on lines, these must be preprocessed as if in the current context, i.e. all the regexps on the stack are applied to that line.

¹ In practice one could allow a composite regexp that uses \n and simply call split on that, then insert the regexps from the stack before all but the first regexp resulting from this split. This would have the advantage of hiding the implementation from the actual rules, making things simpler for the person tweaking the rules.

>> [...] placing a horizontal ruler just below a block quoted
>> paragraph does not give the expected "lazy mode" and places the
>> <hr> inside the block quote, instead it leaves the block quote.
>
> I'm not sure what's the problem with horizontal rules in blockquotes
> [...]

This is what I was referring to:

    > Test bla bla

    > Test
    - - -

The result becomes:

    <blockquote>
    <p>Test bla bla</p>
    <p>Test</p>
    </blockquote>
    <hr>
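A short Python sketch of the array-of-start-regexps idea for setext headers, including the preprocessing of each candidate line through the context stack. The helper names (`strip_context`, `starts_here`) and the `BQ_CONT` pattern are assumptions for illustration, and `\g` is emulated by matching at the start of the already-stripped remainder:

```python
import re

# One start regexp per relative line offset, as in H1[1][0] and H1[1][1].
H1_START = [re.compile(r".+"), re.compile(r"={3,}")]

# Hypothetical item-4 regexp for block quotes, used to preprocess lines.
BQ_CONT = re.compile(r"(?: {0,3}> ?)?")


def strip_context(line, stack):
    """Apply every regexp on the stack in order; return what is left, or None."""
    pos = 0
    for cont in stack:
        m = cont.match(line, pos)
        if m is None:
            return None
        pos = m.end()
    return line[pos:]


def starts_here(start, lines, i, stack=()):
    """True if each regexp in `start` matches the line at its relative offset."""
    for offset, regexp in enumerate(start):
        if i + offset >= len(lines):
            return False
        rest = strip_context(lines[i + offset], stack)
        if rest is None or regexp.match(rest) is None:
            return False
    return True


plain = ["Header", "=====", "text"]
quoted = ["> Header", "> ====="]
print(starts_here(H1_START, plain, 0))
print(starts_here(H1_START, plain, 1))
print(starts_here(H1_START, quoted, 0, stack=[BQ_CONT]))
```

Looking ahead by a fixed number of lines this way needs no backtracking: the rule either matches at the current line or it does not, and nothing has been consumed until it does.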
With all this discussion about evolving the spec, I think we want to remember the philosophy behind Markdown to begin with. Go re-read the Overview[1] of the syntax rules.

[1]: http://daringfireball.net/projects/markdown/syntax#overview

As the very first line says:

> Markdown is intended to be as easy-to-read and easy-to-write as is feasible.

Personally, I think some of the "holes" in the current syntax rules are actually the "features" that make this statement true. As implementors, we want a strict spec because it's easier to implement, but that does not always result in something easier to read and/or write.

Take the discussion a short time ago on this list regarding whitespace allowed at the start of a list item. A quick read of the rules would indicate that the `*` or number should be the first item on that line. In practice, markdown.pl allows up to 3 spaces at the start of a list item. While J.G. agreed (IIRC) that that probably is a bug that should be fixed, we learned through the course of that conversation that a number of people actually are relying on that "bug" as a "feature", and in fact, if the "bug" was "fixed", their documents would break.

Admittedly, why those three spaces were allowed to begin with is beyond me, but when we consider the philosophy behind Markdown, we realize that it is *easy* for a writer to inadvertently add a space to the beginning of a line of text, but *hard* for that same writer (or future editor) to find that space to remove it later. Therefore, as crazy as it sounds, this "bug" is a "feature" when the philosophy is taken into consideration.

My guess is that this is also why J.G. "doesn't give a rip" about a spec. A spec doesn't fit his understanding of the philosophy behind Markdown (which he wrote).

Now, before you all write me off as insane, this is actually why I think Markdown 2.0 is a good idea. By moving to 2.0, we don't have to worry about backward compatibility (Markdown 2.0 should not allow those 3 spaces).
There have been various situations (some edge cases, some not) that are not addressed in the current rules, and AFAIK those rules have never been updated to address them. A new set of rules would open the doors for all kinds of possibilities. However, in writing those rules, I think we need to keep that philosophy at the **forefront**. For example, many people will propose various additional syntax to accomplish different things. In general I would be opposed to nearly every one when considering this excerpt from the current rules:> Markdown is not a replacement for HTML, or even close to it. Its syntax is > very small, corresponding only to a very small subset of HTML tags. The > idea is not to create a syntax that makes it easier to insert HTML tags. > In my opinion, HTML tags are already easy to insert. The idea for > Markdown is to make it easy to read, write, and edit prose.That's not to say that there are no valid arguments to add additional syntax, but the arguments for those new rules would need to be very convincing. After all, that's what attracted me to Markdown in the first place. I hate editing HTML. Don't get me wrong, I know my way around an html document, but even standards compliant, well structured html can start to look like tag-soup the more you stare at it. On the other hand, I can send a Markdown document to someone that has never seen html and they **should** be able to read and understand most, if not all, of the "markup" immediately. Lets keep it that way! If you notice, I never suggest that Markdown 2.0 should be a "spec", but rather an updated set of syntax *rules*. I've already explained my reasons above, but just wanted to make sure that's clear. I suspect that is also what Micheal Fortin is trying to say in his response to Yuri's suggestion of a Markdown 2.0 spec. 
Personally, I believe that if a spec is created (with all the strictness that that entails), then we will have moved to far from the philosophy behind Markdown and what we have will no longer be Markdown but some derivative that subscribes to a different philosophy. That's not something I'm interested in. On Fri, Feb 29, 2008 at 3:49 AM, Yuri Takhteyev <qaramazov at gmail.com> wrote:> > Anyway, a spec for Markdown Extra would contain a spec for Markdown as > > well, wouldn't it? > > I think the whole enterprise would be a lot more valuable, if we > produce a combined spec, which would be self-contained, and call it > Markdown 2.0. > > I don't think we necessarily need a formal grammar. What we need is > to create a document, starting with "Markdown Syntax" perhaps, throw a > bunch of questions at it, settle on the answers, incorporate them into > a spec. Perhaps we can use the wiki at http://markdown.infogami.com/ > for this. > > (BTW, I just cleaned up the wiki removing links to unrelated sites and > reorganizing the rest into what seemed like a more coherent set of > categories.) > > - yuri > > -- > http://sputnik.freewisdom.org/ > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss >-- ---- Waylan Limberg waylan at gmail.com
I wholeheartedly agree. The main attractions of Markdown to me are:
1. It is easy to read
I use Markdown for personal info and stuff, then convert and read in
a browser. But for me it is ALSO important to be able to easily read the
original source. That is where Markdown excels over the other
text-to-HTML conversion tools. I have tried other methods (generally
wikis) but find their markups generally nonintuitive or hard to read in
source (especially the use of apostrophes: '''''ugh''''').
2. It is fault-tolerant
The rules are loose enough that if I don't use the exact number of
spaces I still get what I intended rather than what I actually entered.
Or if I add a space or two in front of the bullets (often by cutting and
pasting) it still works. Actually that is a good point too - cutting and
pasting with Markdown requires less after-the-fact cleanup than the
other systems I've tried.
We need to keep these points in mind when discussing the future of
Markdown. I do use PHP Markdown Extra, but ONLY for the tables feature
('cause HTML tables are tedious and I'm lazy). Other than that I stick
pretty much to the original and just use HTML if something extra is needed.
Less is more when it comes to Markdown's syntax. Markdown is intended to
make text writing easier and more legible than is currently possible
with HTML. The least (and most flexible) syntax rules required for that
should be the goal.
Waylan Limberg wrote:
> With all this discussion about evolving the spec, I think we want to
> remember the philosophy behind Markdown to begin with. Go re-read the
> Overview[1] of the syntax rules.
> [snip]
On Fri, Feb 29, 2008 at 10:56 AM, Waylan Limberg <waylan at gmail.com> wrote:

> That's not to say that there are no valid arguments to add additional
> syntax, but the arguments for those new rules would need to be very
> convincing.

Just wanted to note that this does not mean that I'm opposed to additional syntax. In fact, last I checked I have written more extensions to python-markdown than any other single person. But those extensions are just that, extensions, which can easily be included or excluded with python-markdown's extension api. In fact, that api is the feature of python-markdown we are most proud of. However, the core of python-markdown *mostly* (there are still a few bugs we haven't ironed out) conforms to the syntax rules and adds little to no additional features.

Perhaps, if a new set of rules were created, we should also define the rules for various extensions that don't make it into the core for some reason or another. It looks like this is what Michel Fortin is trying to do with his spec for php markdown extra. The thing is, I see his documentation for extra as just that. Not sure what we need the spec for unless it's his 2.0.

Although I think it would be helpful if everything was all in one place. There have been various proposals on this list that have received general approval from the community, but as they have never been added to the rules, they soon become lost and forgotten unless another implementation takes it upon itself to maintain its own rules. I'm with Yuri here that perhaps a public location not associated with any one implementation would be the answer.

However, I imagine the current syntax rules are copyrighted by J.G., so we can't just copy them over and start honing them. We'd have to start from scratch. Ugh. Perhaps the extensions alone could be defined at http://markdown.infogami.com/ for all implementations, but being a publicly editable wiki, who gets final say?

--
----
Waylan Limberg
waylan at gmail.com
Waylan Limberg wrote on 2008/02/29 15:56:

> With all this discussion about evolving the spec, I think we want to
> remember the philosophy behind Markdown to begin with. Go re-read the
> Overview[1] of the syntax rules.
>
... snip ...
> Take the discussion a short time ago on this list regarding whitespace
> allowed at the start of a list item. A quick read of the rules would
> indicate the the `*` or number should be the first item on that line.
> In practice, markdown.pl allows up to 3 spaces at the start of a list
> item. While J.G. agreed (IIRC) that that probably is a bug that should
> be fixed, we learned through the course of that conversation that a
> number of people actually are relying on that "bug" as a "feature",
> and in fact, if the "bug" was "fixed", their documents would break.
>

FWIW, I (as a humble Markdown user) am in that group. Why? Because it is how I _expect_ a list to be formatted in ASCII, and I tentatively suggest may be what many others expect also. Certainly it's a form I've seen used widely. If I'm not thinking about "correct markdown syntax", but just "what comes naturally" when writing a quick email, I might say

Cases in point:
  * Feynman
  * Dirac
  * Bohr

without thinking about inserting an extra line before the list to ensure that it gets correctly processed, aligning asterisks with zero indent so they get correctly processed, yada yada. Part of the joy of markdown (that sounds a little over-caffeinated) is precisely the laxity that makes it, I gather, so hard to implement.

> think Markdown 2.0 is a good idea. By moving to 2.0, we don't have to
> worry about backward compatibility (Markdown 2.0 should not allow
> those 3 spaces).

That's one of the scarier suggestions I've read today. So all my documents would need to be pre-processed to conform to the new markdown-2-0-strict syntax? May I ask why? Having a spec/ruleset/syntax definition seems an admirable goal; does this necessarily imply that, for example, you should not be able to begin a list item with zero to three spaces, at your discretion? This seems rather at odds with the overall theme of your mail, with which I heartily agree.

Please bear in mind I know nothing about the implementation complexity of this: if it is infeasible to have such a loose approach, I'll still write in Markdown instead of DocBook/HTML, and will simply learn the "new" syntax.

-- Thomas
On 29 Feb 2008, at 16:52, Thomas Nichols wrote:

> Cases in point:
>   * Feynman
>   * Dirac
>   * Bohr
>
> without thinking about inserting an extra line before the list to
> ensure that it gets correctly processed, aligning asterisks with
> zero indent so they get correctly processed, yada yada. Part of the
> joy of markdown (that sounds a little over-caffeinated) is
> precisely the laxity that makes it, I gather, so hard to implement.

That is also how I do it. :)

When I mentioned in a previous mail on the thread that I have found lists one of the hardest cases, I wasn't joking, as I consider this to be a valid case (albeit very much a psychotic edge case), and I'd expect it to do "the right thing":

* L1I1
  1. L2I1
  2. L2I2
* L1I2
* L1I3
    * L3I1
    * L3I2

> Please bear in mind I know nothing about the implementation
> complexity of this: if it is infeasible to have such a loose
> approach, I'll still write in Markdown instead of DocBook/HTML, and
> will simply learn the "new" syntax.

I, personally, feel that trading implementation complexity for 'correctness' and ease of use is a good trade off in Markdown! Getting a parser that is loose enough to do "the right thing" in the above edge case is much less trivial than writing a strict parser, but *well worth it*.

Cheers
Tom
On Fri, Feb 29, 2008 at 8:52 AM, Thomas Nichols <nichols7 at googlemail.com> wrote:

> Having a spec/ruleset/syntax definition seems an admirable goal; does
> this necessarily imply that, for example, you should not be able to
> begin a list item with zero to three spaces, at your discretion? This
> seems rather at odds with the overall theme of your mail, with which I
> heartily agree.

As a slightly-OT aside, there's another view on this "spaces before a list item" issue that sees it as a bug.

When I write a list of references in an academic paper, I do so with list items. I do a hanging indent where the rest of the reference is indented by two or three spaces, like so:

* Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the
  size and confidence of a statistical audit, USENIX/ACCURATE
  Electronic Voting Technology Workshop 2007. Retrieved February 24,
  2008. from
  <http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf>.

Markdown sees that "  2008." as a list item.

    <ul>
    <li>Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the
    size and confidence of a statistical audit, USENIX/ACCURATE
    Electronic Voting Technology Workshop 2007. Retrieved February 24,
    <ol><li>from <a href="http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf">http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf</a>.</li></ol></li>
    </ul>

best, Joe

--
Joseph Lorenzo Hall
UC Berkeley School of Information
http://josephhall.org/
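The ambiguity in the reference example above can be reproduced with a small sketch. The two patterns here are assumptions written for illustration, not code taken from Markdown.pl or any other implementation: one tolerates up to three leading spaces before a list marker, the other requires the marker in the first column.

```python
import re

# Assumed, simplified list-marker tests (illustrative only).
# Lenient: up to three leading spaces are allowed before the marker.
LENIENT_MARKER = re.compile(r"^ {0,3}(?:[*+-]|\d+\.)[ \t]+")
# Strict: the marker must sit in the first column.
STRICT_MARKER = re.compile(r"^(?:[*+-]|\d+\.)[ \t]+")

reference = [
    "* Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the",
    "  size and confidence of a statistical audit, USENIX/ACCURATE",
    "  Electronic Voting Technology Workshop 2007. Retrieved February 24,",
    "  2008. from",
]

def marker_lines(lines, pattern):
    """Return the indexes of the lines that look like the start of a list item."""
    return [i for i, line in enumerate(lines) if pattern.match(line)]

print(marker_lines(reference, LENIENT_MARKER))  # [0, 3]: "2008." reads as an ordered-list marker
print(marker_lines(reference, STRICT_MARKER))   # [0]
```

The strict pattern avoids the false positive, but it would also stop recognizing lists whose bullets are indented by a space or two, which is exactly the trade-off being argued over in this thread.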
On Fri, Feb 29, 2008 at 12:14 PM, Joseph Lorenzo Hall <joehall at gmail.com> wrote:

> On Fri, Feb 29, 2008 at 8:52 AM, Thomas Nichols <nichols7 at googlemail.com> wrote:
> >
> > Having a spec/ruleset/syntax definition seems an admirable goal; does
> > this necessarily imply that, for example, you should not be able to
> > begin a list item with zero to three spaces, at your discretion? This
> > seems rather at odds with the overall theme of your mail, with which I
> > heartily agree.
>
[snip]
>
> Markdown sees that "  2008." as a list item.
>

And this is why I think the spaces should go. It leaves things a little *too* ambiguous.

In any event, this thread is not about list syntax, but whether we want/need a new spec/rule set. I knew the list issue would bring a few more opinions to the table, so thanks for sharing (yeah, it was bait).

Sometimes, as one of those implementers, I have to remind myself of that originating philosophy behind Markdown. That's the only thing keeping me from making the implementation I work on more strict. Well, the only thing except those real world use cases like the ones all those people passionate about keeping whitespace before list items have.

Btw, the more I think about this, the more I don't see a *need* for a new rule-set as much as I *want* one. When looked at in the context of the philosophy, the current rules pretty much stand on their own. However, a central (perhaps non-implementation specific) location for various extensions (alternate behaviors) to define their syntax would be nice.

--
----
Waylan Limberg
waylan at gmail.com
Joseph Lorenzo Hall wrote on 2008/02/29 17:14:

> On Fri, Feb 29, 2008 at 8:52 AM, Thomas Nichols <nichols7 at googlemail.com> wrote:
>
>> Having a spec/ruleset/syntax definition seems an admirable goal; does
>> this necessarily imply that, for example, you should not be able to
>> begin a list item with zero to three spaces, at your discretion? This
>> seems rather at odds with the overall theme of your mail, with which I
>> heartily agree.
>
> As a slightly-OT aside, there's another view on this "spaces before a
> list item" issue that sees it as a bug.
>
> When I write a list of references in a academic paper, I do so with
> list items. I do a hanging indent where the rest of the reference is
> indented by two or three spaces, like so:
>
> * Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the
>   size and confidence of a statistical audit, USENIX/ACCURATE
>   Electronic Voting Technology Workshop 2007. Retrieved February 24,
>   2008. from
>   <http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf>.
>
> Markdown sees that "  2008." as a list item.

Perhaps a first step to resolving the much broader question of whether to define a formal grammar, a ruleset, a textual description or whatever could be just to reach consensus on some of these questions? As Yuri mentioned, "code block syntax" is still an open loop, as is "indentation before list item marker" and many others.

In the interests of starting to close some of these loops, I'll kick this one to a separate thread, and if we can reach consensus, it can be documented and referenced in any spec/docs/implementations anyone cares to create. Once we have a set of these "consensus opinions" hammered out, it makes some sense to me that we then start talking about a spec, a set of rules and so on: IETF-style, but perhaps with rather shorter RFCs...

-- Thomas.
Since Joe called for procedural suggestions, here is what I think we should do:
1. First, I think there were valid concerns about whether it would be
ok for us to come up with a spec and call it "Markdown 2.0". I
suggest we put the question of naming aside. Once we agree on a spec,
we'll ask for John's permission to call it "Markdown 2.0" or
Markdown-Something-Else, and in the worst case we'll call it something
different, like "M-Spec" or "FooBar7.0". For now, let me refer to it
as the M-Spec.
2. As Thomas suggested, we should first reach some agreement as to
what if anything needs to change from the original Markdown or how the
"holes" are to be filled. _Then_ try writing a grammar. I suggest
that we do this first part in plain English (and perhaps some
pseudo-code), using the wiki to record the decisions and the
highlights of the dissenting views, and using the mailing list for the
actual discussion. Let's _try_ to do it by consensus. It might just
work. If we can't agree and have to resort to voting, we can then
figure out how to handle that. (We've got a voting expert among us:
Joe.)
3. I suggest that we start by breaking "M-Spec" into two levels.
Level 1 will aim to clarify the original Markdown syntax and "fix" it
in cases where we agree it is broken. This may involve incorporating
some ideas from Markdown Extra (like emphasis_in_the_middle fix). It
will try to stay true to the "spirit" of Markdown and to not add any
new "features." Level 2 will add certain features, such as footnotes,
tables, definition lists, unindented code blocks, etc.
4. We'll agree that Level 1 is required for all "M-Spec"
implementations, while Level 2 is optional. We can either ask that all
M-Spec implementations return literally the same HTML for Level-1 text
or say that the output should match apart from "irrelevant"
whitespace. M-Spec implementations can choose whether to implement
some (or all) Level 2 features, but they should avoid implementing
similar-but-not quite features. E.g., you don't have to implement
footnotes, but if you do, please do it as specified in Level 2 spec.
All implementations should also be clear as to which Level 2 features
are supported and which are not. It could be as simple as giving the
user a check list. All implementations that support Level 2 features
should have an option of turning them off.
5. I suggest that for everyone's sanity we divide both levels of the
specs into a "macro" and "micro" part. The first ("macro") part will
tell us how the text is to be chunked into headers, paragraphs, lists,
quotations, etc. I think this part would be best described as an
algorithm for turning markdown text into a tree of "block-level"
nodes, where each node has certain "type" ("paragraph", "list item",
"quotation", "code"). The second ("micro") part will tell us what to
do with the text inside those nodes. I think that this would be best
described as a list of substitution rules that would be run in a
particular order (to clarify precedence). This also would allow us to
divide the work: we could have a "macro" working group and a "micro"
working group.
6. Once we have a reasonable draft and have worked out the main
disagreements, we can do another pass over the spec looking for
ambiguities, and once we are satisfied try to write a formal grammar.
If we manage, great. If not, I think that's ok too. A detailed and
agreed upon spec "in English" will already be a big step forward.
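
To make the "macro"/"micro" split in point 5 concrete, here is a minimal sketch. It is an illustration under assumptions, not a proposed spec: the `Block` type, the function names and the tiny rule set (ATX headers, paragraphs, indented code, two emphasis rules) are all hypothetical, and real block parsing is far more involved.

```python
import html
import re
from dataclasses import dataclass, field

@dataclass
class Block:
    kind: str                      # "document", "header", "paragraph" or "code"
    text: str = ""
    children: list = field(default_factory=list)

def parse_blocks(source):
    """Block-level ("macro") pass: chunk the text into typed nodes."""
    root = Block("document")
    for chunk in re.split(r"\n{2,}", source.strip("\n")):
        if chunk.startswith("    "):
            root.children.append(Block("code", chunk))
        elif chunk.startswith("# "):
            root.children.append(Block("header", chunk[2:]))
        else:
            root.children.append(Block("paragraph", chunk))
    return root

# Inline ("micro") pass: substitution rules applied in list order, so the
# precedence is explicit. Strong emphasis has to run before plain emphasis.
INLINE_RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
]

def render_inline(text):
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text

def render(node):
    if node.kind == "document":
        return "\n".join(render(child) for child in node.children)
    if node.kind == "code":
        # Code nodes skip the inline rules: only de-indent and escape.
        code = "\n".join(line[4:] for line in node.text.split("\n"))
        return "<pre><code>" + html.escape(code) + "</code></pre>"
    tag = "h1" if node.kind == "header" else "p"
    return f"<{tag}>{render_inline(node.text)}</{tag}>"

print(render(parse_blocks("# Title\n\nSome **bold** and *em* text.\n\n    x = a * b * c")))
```

Keeping the node type on each block is what lets the inline rules be skipped for code, and swapping the two entries in `INLINE_RULES` shows why the order of substitution rules has to be part of any written description.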
- yuri
--
http://sputnik.freewisdom.org/
On Fri, Feb 29, 2008 at 3:04 PM, Yuri Takhteyev <qaramazov at gmail.com> wrote:

> Since Joe called for procedural suggestion, here is what I think we should do:
> 1. ...
> 2. ...

Yuri's suggestion looks reasonable to me.

--
Andrea Censi
PhD student, Control & Dynamical Systems, Caltech
http://www.cds.caltech.edu/~andrea/
"Life is too important to be taken seriously" (Oscar Wilde)
Well, if it's going to happen, this will probably make it happen. Further thoughts below:

On Fri, Feb 29, 2008 at 6:04 PM, Yuri Takhteyev <qaramazov at gmail.com> wrote:

> Since Joe called for procedural suggestions, here is what I think we should do:
>
> 1. First, I think there were valid concerns about whether it would be
> ok for us to come up with a spec and call it "Markdown 2.0". I
> suggest we put the question of naming aside. Once we agree on a spec,
> we'll ask for John's permission to call it "Markdown 2.0" or
> Markdown-Something-Else, and in the worst case we'll call it something
> different, like "M-Spec" or "FooBar7.0". For now, let me refer to it
> as the M-Spec.
>
> 2. As Thomas suggested, we should first reach some agreement as to
> what if anything needs to change from the original Markdown or how the
> "holes" are to be filled. _Then_ try writing a grammar. I suggest
> that we do this first part in plain English (and perhaps some
> pseudo-code), using the wiki to record the decisions and the
> highlights of the dissenting views, and using the mailing list for the
> actual discussion. Let's _try_ to do it by consensus. It might just
> work. If we can't agree and have to resort to voting, we can then
> figure out how to handle that. (We've got a voting expert among us:
> Joe.)

Something tells me this is going to be the ugly part. As long as it doesn't turn into something like writing html specs has become. Ugh.

> 3. I suggest that we start by breaking "M-Spec" into two levels.
> Level 1 will aim to clarify the original Markdown syntax and "fix" it
> in cases where we agree it is broken. This may involve incorporating
> some ideas from Markdown Extra (like emphasis_in_the_middle fix). It
> will try to stay true to the "spirit" of Markdown and to not add any
> new "features." Level 2 will add certain features, such as footnotes,
> tables, definition lists, unindented code blocks, etc.

Excellent. Without this separation between Level 1 and 2, I'd say the spec is a really bad idea.

> 4. We'll agree that Level 1 is required for all "M-Spec"
> implementations, while Level 2 is optional. We can either ask that all
> M-Spec implementations return literally the same HTML for Level-1 text
> or say that the output should match apart from "irrelevant"
> whitespace. M-Spec implementations can choose whether to implement
> some (or all) Level 2 features, but they should avoid implementing
> similar-but-not quite features. E.g., you don't have to implement
> footnotes, but if you do, please do it as specified in Level 2 spec.
> All implementations should also be clear as to which Level 2 features
> are supported and which are not. It could be as simple as giving the
> user a check list. All implementations that support Level 2 features
> should have an option of turning them off.

I really like every detail of this. If any part of this gets left out in the end, I, for one, would be disappointed. I would like to add one thing, though: not only should the various implementations be able to turn Level 2 on or off, it should be preferred (or maybe required? Thoughts, anyone?) that each Level 2 feature can be turned on or off individually. For example, assuming wikilinks become a Level 2 feature, I dislike them (yes I know, I wrote the wikilink extension for python-markdown), so I don't want them at all even if I need footnotes.

> 5. I suggest that for everyone's sanity we divide both levels of the
> specs into a "macro" and "micro" part. The first ("macro") part will
> tell us how the text is to be chunked into headers, paragraphs, lists,
> quotations, etc. I think this part would be best described as an
> algorithm for turning markdown text into a tree of "block-level"
> nodes, where each node has certain "type" ("paragraph", "list item",
> "quotation", "code"). The second ("micro") part will tell us what to
> do with the text inside those nodes. I think that this would be best
> described as a list of substitution rules that would be run in a
> particular order (to clarify precedence). This also would allow us to
> divide the work: we could have a "macro" working group and a "micro"
> working group.

Works for me. But why not just call them "block-level" and "inline" rather than "macro" and "micro"?

> 6. Once we have a reasonable draft and have worked out the main
> disagreements, we can do another pass over the spec looking for
> ambiguities, and once we are satisfied try to write a formal grammar.
> If we manage, great. If not, I think that's ok too. A detailed and
> agreed upon spec "in English" will already be a big step forward.
>
> - yuri
>
> --
> http://sputnik.freewisdom.org/

--
----
Waylan Limberg
waylan at gmail.com
I'm going to comment on both Yuri's and Waylan's messages here.

On 2008-03-01 at 0:31, Waylan Limberg wrote:

> On Fri, Feb 29, 2008 at 6:04 PM, Yuri Takhteyev <qaramazov at gmail.com> wrote:
>> Since Joe called for procedural suggestions, here is what I think we should do:
>>
>> 1. First, I think there were valid concerns about whether it would be ok for us to come up with a spec and call it "Markdown 2.0". I suggest we put the question of naming aside. Once we agree on a spec, we'll ask for John's permission to call it "Markdown 2.0" or Markdown-Something-Else, and in the worst case we'll call it something different, like "M-Spec" or "FooBar7.0". For now, let me refer to it as the M-Spec.

My solution to this problem was to call it the Markdown Extra spec. What do you think?

>> 2. As Thomas suggested, we should first reach some agreement as to what if anything needs to change from the original Markdown or how the "holes" are to be filled. _Then_ try writing a grammar.

I don't think that's a good process. My idea for writing a Markdown Extra spec was to write something as a draft, call for comments, improve things, call for new comments, and so on until we have something stable enough (a few months later).

>> I suggest that we do this first part in plain English (and perhaps some pseudo-code), using the wiki to record the decisions and the highlights of the dissenting views, and using the mailing list for the actual discussion. Let's _try_ to do it by consensus. It might just work. If we can't agree and have to resort to voting, we can then figure out how to handle that. (We've got a voting expert among us: Joe.)
>
> Something tells me this is going to be the ugly part. As long as it doesn't turn into something like writing html specs has become. Uhg.

I think the HTML spec is going well, thanks to an editor who can make decisions. Voting on an issue-by-issue basis isn't something I'd like to try. The problem is that once something has been voted in or out of the spec, it's problematic and counter-productive to ask everyone to vote again on that decision some time later because something has changed (unexpected side effects are found, new research data shows it's a bad idea, some change elsewhere made the thing a little silly; those things happen all the time). Besides, on a list like this one, simply voting again on an issue at a later time can cause the result to change, since people will have joined and others left. I wouldn't want to be the editor of a spec with such a strict voting process. Besides, it's a better idea in my opinion to make a decision based on the technical merit of an argument rather than on a popularity contest.

>> 3. I suggest that we start by breaking "M-Spec" into two levels. Level 1 will aim to clarify the original Markdown syntax and "fix" it in cases where we agree it is broken. This may involve incorporating some ideas from Markdown Extra (like emphasis_in_the_middle fix). It will try to stay true to the "spirit" of Markdown and to not add any new "features." Level 2 will add certain features, such as footnotes, tables, definition lists, unindented code blocks, etc.

In my Markdown Extra spec, level 1 is "Markdown" and level 2 is "Markdown Extra".

> Excellent. Without this separation between Level1 and 2, I'd say the spec is a really bad idea.

I disagree with this statement: having a spec with no separation between the two is better than no spec at all, since one could take the given spec, remove some features, and get a good specification for how to parse plain Markdown. Most of the work would already have been done. That said, I agree that it's a good idea to have a way to read the spec ignoring the extra features if you want to base yourself on it to implement plain Markdown.

>> 4. We'll agree that Level 1 is required for all "M-Spec" implementations, while Level 2 is optional.

I wonder where you would put PHP Markdown's no-markup mode (ignoring HTML) into this. Is this a level 2 feature?

>> M-Spec implementations can choose whether to implement some (or all) Level 2 features, but they should avoid implementing similar-but-not-quite features. E.g., you don't have to implement footnotes, but if you do, please do it as specified in the Level 2 spec. All implementations should also be clear as to which Level 2 features are supported and which are not.

That's a good idea in general, but I'm not sure it should be explicitly forbidden in the spec. If the spec defines something and you decide to do it differently, then you're obviously not following the spec, and probably for a reason. But anyway, I'd rather begin by defining something that works, and leave conformance requirements for later.

>> It could be as simple as giving the user a check list. All implementations that support Level 2 features should have an option of turning them off.

Again, I'm not keen to get into conformance requirements before having a working spec. But while I agree with your sentiment that implementors should make an effort in that direction (as I'm doing by maintaining PHP Markdown and PHP Markdown Extra side by side), I don't think it's fair to implementors to require them to implement that kind of customizability. Keep in mind that having various modes makes a program harder to test, more bug-prone, and may prevent some optimizations.

> Although I would like to add one thing: not only should the various implementations be able to turn Level2 on or off, it should be preferred (or maybe required?? thoughts, anyone?) that each Level2 feature can be turned on or off individually. For example, assuming wikilinks become a Level2 feature, I dislike them (yes I know, I wrote the wikilink extension for python-markdown), so I don't want them at all even if I need footnotes.

While I partly agree with that sentiment, I don't think the spec should require this, for the reason above. The more modes a program has, the more difficult it will be to implement and test. I don't feel the spec should burden implementors more than necessary.

>> 5. I suggest that for everyone's sanity we divide both levels of the specs into a "macro" and "micro" part. The first ("macro") part will tell us how the text is to be chunked into headers, paragraphs, lists, quotations, etc. I think this part would be best described as an algorithm for turning markdown text into a tree of "block-level" nodes, where each node has a certain "type" ("paragraph", "list item", "quotation", "code"). The second ("micro") part will tell us what to do with the text inside those nodes.

The tree is a good idea (and is how I intend to spec Markdown Extra).

>> I think that this would be best described as a list of substitution rules that would be run in a particular order (to clarify precedence). This also would allow us to divide the work: we could have a "macro" working group and a "micro" working group.

Two separate groups working on block-level and inline-level syntaxes in parallel would be like having two people each engraving a different side of a coin at the same time: there are too many interactions between the two for them to be defined separately, and I don't think the spec is going to be big enough to justify this anyway.

> Works for me. But why not just call them "block-level" and "inline" rather than "macro" and "micro"?

Indeed.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
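To make the "tree of block-level nodes" idea from point 5 concrete, here is a minimal sketch in Python. It is not taken from any existing implementation: the `Block` class and `parse_blocks` function are hypothetical names, and the only block types handled are paragraphs and quotations, on the assumption that blocks are separated by blank lines. Inline ("micro") markup is deliberately left untouched inside each node.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """One node of the block-level tree (hypothetical structure)."""
    kind: str                                             # "document", "paragraph", "quotation"
    lines: List[str] = field(default_factory=list)        # raw text of a leaf block
    children: List["Block"] = field(default_factory=list)  # nested blocks


def parse_blocks(text: str) -> Block:
    """Block-level pass only: split on blank lines and nest quotations."""
    root = Block("document")
    for chunk in re.split(r"\n(?:[ \t]*\n)+", text.strip("\n")):
        if not chunk.strip():
            continue
        lines = chunk.split("\n")
        if all(line.startswith(">") for line in lines):
            # Strip one level of "> " and parse the inside recursively.
            inner = "\n".join(
                line[2:] if line.startswith("> ") else line[1:] for line in lines
            )
            root.children.append(
                Block("quotation", children=parse_blocks(inner).children)
            )
        else:
            root.children.append(Block("paragraph", lines=lines))
    return root


tree = parse_blocks("Hello\nworld\n\n> quoted\n> text\n\nBye")
print([block.kind for block in tree.children])
```

A second pass over the `lines` of each leaf would then apply the inline rules, which is where the interaction between the two levels that Michel describes would have to be handled.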
I think many of Michel's points are quite reasonable. Given that we agree on many parts, how about we start off with those things we agree on and deal with other issues later? Specifically, I suggest that we start with hammering out the "macro" part of "Level 1" of the spec. I.e., let's not worry about all the inline markup for now (unless it turns out to be relevant to macro parsing), and let's put aside all additional features. Let's first unambiguously specify how markdown text ought to be parsed into paragraphs, quotes, lists, etc. Michel, do you want to do a first draft and circulate it?

> My solution to this problem was to call it the Markdown Extra spec. What do you think?

Sure, assuming that JG will give us his blessing for calling this "Markdown." Though this also creates a bit of confusion between the original "Markdown Extra" as implemented in PHP Markdown and the new Markdown Extra defined by the spec.

> I don't think that's a good process. My idea for writing a Markdown Extra spec was to write something as a draft, call for comments, improve things, call for new comments, and so on until we have something stable enough (a few months later).

Ok, but we should start with core markdown, I think, to have a stable foundation.

> I think the HTML spec is going on well, thanks to an editor who can make decisions. Voting on an issue-to-issue basis isn't something I'd like to try. The problem being that once something has been elected in or out of the spec, it's problematic and counter-productive to ask everyone to vote again on that decision some time later because

I didn't suggest voting. I said let's try to do it by consensus, and if this doesn't work out then we'll figure out what to do, maybe some kind of voting.

> Beside, it's a better idea in my opinion to make a decision considering the technical merit of an argument rather than a popularity contest.

Well sure, but in the end someone will have to make that decision. Hopefully after discussing the technical merits we'll reach general agreement. But if we don't, someone will have to make a call, in which case we'll have to decide how to do that. But as I said, let's not worry about this for now.

> I disagree with this statement: having a spec with no separation between the two is better than no spec at all since one could take the given spec, remove some features, and get a good specification for how to parse plain Markdown. Most of the work would already have been done.

Well, it's either all required, some required and some optional, or all optional. I think making all features "required" will be asking too much of the implementors. Making all of them "optional" might undermine compatibility even further. Which is why I think there should be a "required" part and an "optional" part.

> I wonder where you would put PHP Markdown's no-markup mode (ignoring HTML) into this. Is this a level 2 feature?

I would treat it as "Level 2". It's a great feature, but it's a new feature. I would rather leave Level 1 as a clarified and fixed spec for the original markdown.

> That's a good idea in general, but I'm not sure it should be explicitly forbidden in the spec. If the spec defines something and you decide to do it differently, then you're obviously not following the spec, and probably for a reason.

Not forbidden, just frowned upon.

> But anyway, I'd rather begin by defining something that works, and leave conformance requirements for later.

Agreed.

> Again, I'm not keen to enter conformance requirement before having a working spec. But while I agree with you sentiment that implementors should make an effort in that direction (as I'm doing by maintaining PHP Markdown and PHP Markdown Extra side by side), I don't think it's fair for implementors to require them to implement that kind of customizability. Keep in mind that having various modes makes a program harder to test, more bug-prone, and may prevent some optimizations.

Let's make it "recommended". We "recommend" that implementations that add Level 2 features give the user a simple way to turn them off.

> > Although I would like to add one thing, not only should the various implementations be able to turn Level2 on or off, is should be preferred (or maybe required?? thought anyone?) that each Level2 feature can be turned on or off individually. For example, assuming wikilinks become a Level2 feature, I dislike them (yes I know, I wrote the wikilink extension for python-markdown), so I don't want them at all even if I need footnotes.

I think that's too much to ask. It should be good enough to recommend a single switch that activates or deactivates L2 features. Any further customizability should be up to the implementation.

> Two separate groups working on block-level and inline-level syntaxes in parallel would be like having two people each engraving a different side of a coin at the same time: there are too much interactions between the two for them to be defined separately, and I don't think the spec is going to be big enough to justify this anyway.

I don't agree, but let's not fight over this and do them sequentially instead.

> > Works for me. But why not just call them "block-level" and "inline" rather than "macro" and "micro"?

Sure.

- yuri
--
http://sputnik.freewisdom.org/
* Yuri Takhteyev <qaramazov at gmail.com> [2008-03-01 00:05]:
> 2. As Thomas suggested, we should first reach some agreement as to what if anything needs to change from the original Markdown or how the "holes" are to be filled. _Then_ try writing a grammar. I suggest that we do this first part in plain English (and perhaps some pseudo-code), using the wiki to record the decisions and the highlights of the dissenting views, and using the mailing list for the actual discussion.

That won't work. You can try to do it this way, but you will discover that in step 2, you will throw away everything you did in step 1, or very nearly.

Why would I say that, you ask? I submit the following excerpt from the foreword of [Structure and Interpretation of Classical Mechanics][SICM] for consideration:

    This book is the result of teaching classical mechanics at MIT for the past six years. The contents of our class began with ideas from a class on nonlinear dynamics and solar system dynamics by Wisdom and ideas about how computation can be used to formulate methodology developed in an introductory computer science class by Abelson and Sussman. When we started we expected that using this approach to formulate mechanics would be easy. We quickly learned that many things we thought we understood we did not in fact understand. Our requirement that our mathematical notations be explicit and precise enough that they can be interpreted automatically, as by a computer, is very effective in uncovering puns and flaws in reasoning. The resulting struggle to make the mathematics precise, yet clear and computationally effective, lasted far longer than we anticipated. We learned a great deal about both mechanics and computation by this process.

[SICM]: http://mitpress.mit.edu/SICM/

And isn't this your experience when you go to write code? Every time I sit down to write any non-trivial program, I discover that my idea of how it would be architected was flawed; I overlooked large areas, the APIs I envisioned wouldn't work, or they are horribly inconvenient, the approach I imagined cannot even work, and so on. The sooner you start writing code, the sooner you find these problems.

Further, imagine: the book authors are talking about *classical mechanics*. That is formulated in *mathematical notation* and there have been *over three centuries* of work on it, yet there are big flaws in that notation. How much less useful must natural language be if you want to be precise?

> Let's _try_ to do it by consensus. It might just work. If we can't agree and have to resort to voting, we can then figure out how to handle that.

The IETF uses a consensus process for all their work, so in the Atom WG we did it that way. There were a few cases where it was extremely difficult to achieve consensus, because there were multiple positions with wide support, but a voting process would only have aggravated the problem, even if it had led to a faster decision, and would have been a drag on the majority of the work that was largely uncontroversial.

The cornerstone to make this work is a fair moderator who is capable of ignoring his own opinions and biases while making a consensus assessment.

> I suggest that for everyone's sanity we divide both levels of the specs into a "macro" and "micro" part.

Agree with everyone else about calling this "block" and "inline."

I'm not sure a strict separation is actually feasible. Given some of the bugs in Markdown.pl, I think sometimes you'll have to peek inside the block in order to decide where and how things start and end. Or maybe some of these cases can be legislated out of existence by changing Markdown slightly.

> A detailed and agreed upon spec "in English" will already be a big step forward.

Not very, honestly. If the group fails to produce a grammar specification, it would help more to try and make a really big test suite than it would to write more English.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Thomas Nichols
2008-Feb-29 22:46 UTC
spaces and newlines before list markers (was: evolving the spec)
Joseph Lorenzo Hall wrote on 2008/02/29 17:14:
> As a slightly-OT aside, there's another view on this "spaces before a list item" issue that sees it as a bug.
>
> When I write a list of references in an academic paper, I do so with list items. I do a hanging indent where the rest of the reference is indented by two or three spaces, like so:
>
> * Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the
>   size and confidence of a statistical audit, USENIX/ACCURATE
>   Electronic Voting Technology Workshop 2007. Retrieved February 24,
>   2008. from
>   <http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf>.
>
> Markdown sees that " 2008." as a list item.

Ok: that's what Markdown.pl sees. Do any of _us_ see it like that? For a human, I think this is easy. Even removing the comma from the preceding line, I think there's enough 'ASCII layout' information here for an untrained reader to tell from a casual glance that '2008' is not intended to be a list item.

So do we need a full-on AI engine to be able to build a parser to handle this? Or can we think up some simple algorithm - there must be a change of indent for a sub-bullet, perhaps, or a new bullet must be at the same level as the preceding one? If "2008" were indented by a space, then I think it **should** be a new item, a sub-bullet.

Maybe that's because I'm used to working with suboptimal tools (web browsers for blog comments have already been mentioned) which don't automatically indent the paragraph beneath the bullet point as in this example. In fact, I often see a blank line separating bullets instead.

----
  * Inertial-Electrodynamic Fusion (IEF) Device - Energy/Matter
Conversion Corporation (EMC2). The fusion process recommended by Dr.
Bussard takes boron-11 and fuses a proton to it, producing, in its
excited state, a carbon-12 atom. This excited carbon-12 atom decays to
beryllium-8 and helium-4.

  * Bussard's website, asking for donations to fund further research

  * American Scientist article mentioning the founding of EMC2
----

Again, I think each of us understands these are bullets. I'm writing this in Thunderbird, just typing '*' instead of pressing the 'bullet' button, so I'm getting word-wrap but no automatic indentation (though MTAs etc. may reformat the message later). They follow the typographic approach of indenting the first line of the para; compare with the example Joe quoted, which outdents the bullet points and then indents the whole paragraph beneath them. Can we consider these to be two equally valid approaches?

On a personal note, what inspired me about Markdown was John Gruber's [Dive into Markdown][] article. Possibly relevant here:

----
In fact, I love writing email. Email is my favorite writing medium. I've sent over 16,000 emails in the last five years. The conventions of plain text email allow me to express myself clearly and precisely, without ever getting in my way.

Thus, Markdown. Email-style writing for the web.
...
The typographic constraints of plain text -- a single typeface, in a single size, with no true italics or bold -- are very much similar to the constraints of a typewriter.
----

That's what I'm after ... to be able to use "the conventions of plain text email" when creating content; and conventions are often pretty woolly, so creating a formal ruleset from them is probably going to be tough. In this instance, though, I haven't yet understood why Markdown should not continue to support the following from the [syntax page][]:

"List markers typically start at the left margin, but may be indented by up to three spaces. List markers must be followed by one or more spaces or a tab."

As a (weak) analogy, SQL has both a laboriously detailed specification and a surprisingly loose query syntax, allowing noise words and using intelligent defaults to capture intent wherever possible. TIMTOWTDI, especially when writing an email.

So -- any reasons why we should need to "tighten the spec"? Or can we simply document it formally, with a grammar, test suite and so on to make sure that the expected behaviour is always known, and ideally is consistent with "email convention"?

-- Thomas.

[Dive into Markdown]: http://daringfireball.net/2004/03/dive_into_markdown
[syntax page]: http://daringfireball.net/projects/markdown/syntax#list
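The sentence quoted from the syntax page can be written down as a single pattern, which also shows why Joe's wrapped reference line is picked up. This is only a sketch in Python: `LIST_MARKER` and `list_marker` are made-up names, the set of bullet characters (`*`, `+`, `-`) and the "digits followed by a period" form are assumed from the syntax page, and no claim is made that Markdown.pl implements the rule this way.

```python
import re
from typing import Optional

# Up to three spaces of indent, a bullet or "digits.", then spaces or a tab.
LIST_MARKER = re.compile(r"^ {0,3}(?P<marker>[*+-]|\d+\.)(?: +|\t)")


def list_marker(line: str) -> Optional[str]:
    """Return the list marker that starts this line, or None."""
    match = LIST_MARKER.match(line)
    return match.group("marker") if match else None


for line in ["* red", "   - item", "    * too deep", "*emphasis* text", "  2008. from"]:
    print(repr(line), "->", list_marker(line))
```

Under this reading the hanging-indent line `  2008. from` qualifies as an ordered-list item, so any fix has to come from extra context (the previous line, the numbering) rather than from the marker pattern alone.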
Yuri Takhteyev
2008-Feb-29 23:18 UTC
spaces and newlines before list markers (was: evolving the spec)
> So -- any reasons why we should need to "tighten the spec"? Or can we simply document it formally, with a grammar, test suite and so on to make sure that the expected behaviour is always known, and ideally is consistent with "email convention"?

I don't think we should generally try to tighten the spec for the sake of tightening it. Where possible, we should just document what we decided the behaviour should be for all the edge cases. If we agree that examples like Joe's ought to be turned into numbered lists, then so be it.

However, we should also consider that those might be "bugs" in the current spec, in the sense that simple, email-like writing gets turned into markup in cases where the users would not want or expect it. I think there are a few cases where Markdown is a bit trigger-happy. Numbered lists are one such case. My suggestion would be to actually restrict the range of what is turned into ordered lists, either by requiring that ordered lists start with "1.", or that they have at least two items with consecutive numbers, or even to require both. I frankly don't see a point in allowing the user to number their list items starting with an arbitrary number if the list will start with "1." when displayed to the viewer anyway. But again, the idea is not to tighten the spec, but to find a solution for the "bug" that Joe described.

- yuri
--
http://sputnik.freewisdom.org/
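The restriction suggested above is small enough to state as a predicate. The sketch below is one possible reading of it in Python, not an agreed rule: `starts_ordered_list` is a hypothetical name, its input is assumed to be the numbers of the candidate item lines in document order, and `require_both` selects the stricter "or even to require both" variant.

```python
from typing import List


def starts_ordered_list(numbers: List[int], require_both: bool = False) -> bool:
    """Decide whether a run of numbered lines should become an ordered list."""
    if not numbers:
        return False
    starts_at_one = numbers[0] == 1
    # Only the first two items need to be consecutive, not the whole list.
    consecutive = len(numbers) >= 2 and numbers[1] == numbers[0] + 1
    if require_both:
        return starts_at_one and consecutive
    return starts_at_one or consecutive


print(starts_ordered_list([2008]))      # Joe's stray "2008." line
print(starts_ordered_list([3, 4, 9]))   # starts at 3, first two consecutive
```

With either variant a lone line beginning with `2008.` no longer opens a list, while a list that is reordered after its first two items is still accepted.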
Vinay Augustine
2008-Feb-29 23:39 UTC
spaces and newlines before list markers (was: evolving the spec)
> However, we should also consider that those might be "bugs" in the current spec, in the sense that simple, email-like writing gets turned into markup in cases where the users would not want or expect it. I think there are a few cases where Markdown is a bit trigger-happy. Numbered lists are one such case. My suggestion would be to actually restrict the range of what is turned into ordered lists, either by requiring that ordered lists start with "1.", or that they have at least two items with consecutive numbers, or even to require both. I frankly don't see a point in allowing the user to number their list items starting with an arbitrary number if the list will start with "1." when displayed to the viewer anyway. But again, the idea is not to tighten the spec, but to find a solution for a "bug" that Joe described.

I can think of an obvious use-case against this: what if I type an ordered list up, and then decide to cut & paste to re-order it or add items to it? Now, in the optimal case I would want to go through and change the numbering of all the elements. In the meantime, though, I think it's great that I can rapidly reorder my list and markdown will *do what makes sense*.

-V
Yuri Takhteyev
2008-Feb-29 23:58 UTC
spaces and newlines before list markers (was: evolving the spec)
> I can think of an obvious use-case against this: what if I type an ordered list up, and then decide to cut & paste to re-order it or add items to it? Now, in the optimal case I would want to go through and change the numbering of all the elements. In the meantime, though, I think it's great that I can rapidly reorder my list and markdown will *do what makes sense*.

Is it so much work, though, to then change the numbers of just the first two items to "1." and "2."? Note that I am not suggesting that all numbers must be consecutive - just the first two. In fact, I would also throw in the idea of making the numbers themselves optional for all subsequent elements. I.e. the following should be enough:

1. Item one
2. Item two
 . Item three
 . Item four
 . Item five

or:

1. Item one
2. Item two
n. Item three
n. Item four
n. Item five

- yuri
--
http://sputnik.freewisdom.org/
david parsons
2008-Mar-05 16:22 UTC
spaces and newlines before list markers (was: evolving the spec)
In article <fa4efbc00802291558t35aeed62o329ed3bde495e3a0 at mail.gmail.com>, Yuri Takhteyev <markdown-discuss at six.pairlist.net> wrote:
>Is it so much work, though, to then change the numbers of just the first two items to "1." and "2."? Note that I am not suggesting that all numbers must be consecutive - just the first two. In fact, I would also throw in the idea of making the numbers themselves optional for all subsequent elements. I.e. the following should be enough:
>
>1. Item one
>2. Item two
> . Item three
> . Item four
> . Item five

But that's not very readable, is it?

When I look at a list in the source document, I'd expect to see a list. I don't know of very many (if any) cases where the numbers in the list just sort of trail off and leave every item prefixed with a bullet.

-david parsons
Seumas Mac Uilleachan
2008-Mar-05 16:42 UTC
spaces and newlines before list markers (was: evolving the spec)
david parsons wrote:
> In article <fa4efbc00802291558t35aeed62o329ed3bde495e3a0 at mail.gmail.com>, Yuri Takhteyev <markdown-discuss at six.pairlist.net> wrote:
>
>> Is it so much work, though, to then change the numbers of just the first two items to "1." and "2."? Note that I am not suggesting that all numbers must be consecutive - just the first two. In fact, I would also throw in the idea of making the numbers themselves optional for all subsequent elements. I.e. the following should be enough:
>>
>> 1. Item one
>> 2. Item two
>>  . Item three
>>  . Item four
>>  . Item five
>
> But that's not very readable, is it?
>
> When I look at a list in the source document, I'd expect to see a list. I don't know of very many (if any) cases where the numbers in the list just sort of trail off and leave every item prefixed with a bullet.

It is also hard to distinguish between . for numbered list and - for bullet list when reading.

> -david parsons
Waylan Limberg
2008-Mar-05 18:29 UTC
spaces and newlines before list markers (was: evolving the spec)
On Wed, Mar 5, 2008 at 11:42 AM, Seumas Mac Uilleachan <seumas at idirect.ca> wrote:
> david parsons wrote:
> > In article <fa4efbc00802291558t35aeed62o329ed3bde495e3a0 at mail.gmail.com>, Yuri Takhteyev <markdown-discuss at six.pairlist.net> wrote:
> >
> >> Is it so much work, though, to then change the numbers of just the first two items to "1." and "2."? Note that I am not suggesting that all numbers must be consecutive - just the first two. In fact, I would also throw in the idea of making the numbers themselves optional for all subsequent elements. I.e. the following should be enough:
> >>
> >> 1. Item one
> >> 2. Item two
> >>  . Item three
> >>  . Item four
> >>  . Item five
> >
> > But that's not very readable, is it?
> >
> > When I look at a list in the source document, I'd expect to see a list. I don't know of very many (if any) cases where the numbers in the list just sort of trail off and leave every item prefixed with a bullet.
>
> It is also hard to distinguish between . for numbered list and - for bullet list when reading.

I agree. Remembering the philosophy that markdown is to be readable first - this doesn't fit. Additionally, Markdown is meant to be a format to write email in, which can later be easily converted to html. I'll never format a list in an email that way, and I doubt anyone else would either.

I realize many markdown documents will never be viewed by the public in their raw form, but that's beside the point. The above suggested syntax would just be an excuse for lazy authors and adds no real value IMO.

--
----
Waylan Limberg
waylan at gmail.com
Yuri Takhteyev
2008-Mar-05 19:21 UTC
spaces and newlines before list markers (was: evolving the spec)
> >1. Item one
> >2. Item two
> > . Item three
> > . Item four
> > . Item five
>
> But that's not very readable, is it?

Note the word "optional" in the paragraph preceding this example. I think markdown's ideal of allowing the source to be readable should be understood as making it possible and easy for the author to write markdown source that is readable. I don't see why it would make sense to _require_ them to make it readable. Sometimes I care that my source is easy to read, and in cases like that I can number all of my list items. In other cases I am the only person who sees the source, and numbering the items is just a waste of time.

And tell me that the example above is more confusing than:

1. Item one
2. Item two
5. Item five
4. Item four
3. Item five

> When I look at a list in the source document, I'd expect to see a list. I don't know of very many (if any) cases where the

Yes, I expect to see a list, but I also expect the numbers to be in order. A list without numbers is better in my opinion than a list with numbers out of order. And fixing the numbers after reordering the list is a pain.

- yuri
--
http://sputnik.freewisdom.org/
Aristotle Pagaltzis
2008-Mar-03 00:49 UTC
spaces and newlines before list markers (was: evolving the spec)
* Yuri Takhteyev <qaramazov at gmail.com> [2008-03-01 00:20]:
> My suggestion would be to actually restrict the range of what is turned into ordered lists, either by requiring that ordered lists start with "1.", or that they have at least two items with consecutive numbers, or even to require both.

Not both, but lists not starting at 1. having to have more than one item is a nice idea for solving the problem.

> I frankly don't see a point in allowing the user to number their list items starting with an arbitrary number if the list will start with "1." when displayed to the viewer anyway.

This is actually my second biggest complaint with Markdown. As I understand, John was originally going to allow numbered lists to start at numbers other than 1, but then discovered that the HTML4 WG labelled the `start` attribute of lists deprecated, so he removed this feature.

I find the WG's decision utterly baffling and wrong-headed. *How on Earth* is that information presentational?!

So while I don't disagree with the basic desire to output valid HTML, I think John's decision in this one case was mistaken. IMO, in this one case Markdown should do what is sensible, validation be damned.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Yuri Takhteyev
2008-Mar-03 01:15 UTC
spaces and newlines before list markers (was: evolving the spec)
> This is actually my second biggest complaint with Markdown. As I understand, John was originally going to allow numbered lists to start at numbers other than 1, but then discovered that the HTML4 WG labelled the `start` attribute of lists deprecated, so he removed this feature.

Not sure what you mean. Markdown _does_ allow lists to start with any number; at least that's what "Markdown Syntax" says very unambiguously. Or do you mean to say that John originally was thinking of setting the "start" attribute based on the value of the first item and then decided not to?

What about setting "value" on each "li" instead?

- yuri
--
http://sputnik.freewisdom.org/
Aristotle Pagaltzis
2008-Mar-03 11:13 UTC
spaces and newlines before list markers (was: evolving the spec)
* Yuri Takhteyev <qaramazov at gmail.com> [2008-03-03 02:20]:
> What about setting "value" on each "li" instead?

Equally deprecated.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
Már Örlygsson
2008-Mar-03 12:04 UTC
spaces and newlines before list markers (was: evolving the spec)
Aristotle Pagaltzis <pagaltzis at gmx.de> wrote:
> > What about setting "value" on each "li" instead?
>
> Equally deprecated.

The HTML spec authors have admitted that they believe this particular change was a mistake on their part.

All browsers support the value attribute on <ol> <li>s, and there are numerous real-world use-cases for allowing continuations of ordered lists.

I, for one, would like to see markdown support...

1. ordered
2. lists

that interrupt and then continue

3. like
4. this

Basic support for implied <ol type=""> attributes would also be nice...

A. like
B. this
C. where the third item refers to list-item "A" above (which becomes nonsensical if the browser renders the list as 1. 2. 3...)

or

i. like
ii. this
iii. as well

...although I realise that over-zealous auto-detection algorithms would cause problems with sentences/lines starting with an abbreviated proper name like this:

A. Lincoln was a US president, and Lyndon B. Johnson was another.

--
Már
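One way to picture the trade-off between implied `type` detection and the "A. Lincoln" problem is a deliberately conservative classifier. The Python sketch below is an assumption-laden illustration only: `implied_ol_type` is a hypothetical function, it looks only at the markers of a candidate list, it accepts lettered or roman lists only when they begin at "a", "A", "i" or "I", and it refuses a lettered list with a single item. It does not describe how any existing implementation makes this decision.

```python
from typing import List, Optional


def implied_ol_type(markers: List[str]) -> Optional[str]:
    """Guess the HTML <ol type="..."> value from a list's markers, or None."""
    if not markers:
        return None
    first = markers[0].rstrip(".")
    if first.isdigit():
        return "1"
    # A lone "A." is more likely an initial than a one-item list.
    if len(markers) < 2:
        return None
    if first in ("a", "A", "i", "I"):
        return first
    return None


print(implied_ol_type(["A.", "B.", "C."]))
print(implied_ol_type(["A."]))  # e.g. "A. Lincoln was a US president"
```

A real heuristic would also need to check that later markers continue the sequence; that part is left out here.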
John MacFarlane
2008-Mar-03 15:59 UTC
spaces and newlines before list markers (was: evolving the spec)
I agree with all this. Pandoc's extended markdown syntax supports both these features (with some heuristics to avoid capturing initials in names).

http://johnmacfarlane.net/pandoc/README.html#lists

John

> The HTML spec authors have admitted that they believe this particular change was a mistake on their behalf.
>
> All browsers support the value attribute on <ol> <li>s, and there are numerous real world use-cases for allowing continuations of ordered lists.
>
> I for one, would like to see markdown support...
>
> 1. ordered
> 2. lists
>
> that interrupt and then continue
>
> 3. like
> 4. this
>
> Basic support for implied <ol type=""> attributes would also be nice...
>
> A. like
> B. this
> C. where the third item refers to list-item "A" above (which becomes non-sensial if the browser renders the list as 1. 2. 3...)
>
> or
>
> i. like
> ii. this
> iii. as well
>
> ...although I realise that over-zealous auto-detection algorithms would cause problems with sencences/lines starting with an abbrivated proper name like this:
>
> A. Lincoln was an US president, and Lyndon B. Johnson was another.
>
> --
> Már
Michel Fortin
2008-Mar-03 12:14 UTC
spaces and newlines before list markers (was: evolving the spec)
On 2008-03-02 at 19:49, Aristotle Pagaltzis wrote:
> This is actually my second biggest complaint with Markdown. As I understand, John was originally going to allow numbered lists to start at numbers other than 1, but then discovered that the HTML4 WG labelled the `start` attribute of lists deprecated, so he removed this feature.
>
> I find the WG's decision utterly baffling and wrong-headed. *How on Earth* is that information presentational?!

I agree with your sentiment. We can't be the only ones thinking that: the start attribute is coming back in HTML5:
<http://www.whatwg.org/specs/web-apps/current-work/multipage/section-grouping.... >

> So while I don't disagree with the basic desire to output valid HTML, I think John's decision in this one case was mistaken. IMO, in this one case Markdown should do what is sensible, validation be damned.

I agree with you.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
John Gruber
2008-Mar-18 05:46 UTC
spaces and newlines before list markers (was: evolving the spec)
On Mar 2, 2008, at 7:49 PM, Aristotle Pagaltzis wrote:

> This is actually my second biggest complaint with Markdown. As I
> understand, John was originally going to allow numbered lists to
> start at numbers other than 1, but then discovered that the HTML4
> WG labelled the `start` attribute of lists deprecated, so he
> removed this feature.
>
> I find the WG's decision utterly baffling and wrong-headed. *How
> on Earth* is that information presentational?!
>
> So while I don't disagree with the basic desire to output valid
> HTML, I think John's decision in this one case was mistaken. IMO,
> in this one case Markdown should do what is sensible, validation
> be damned.

Agreed. This will work correctly in the next version. Plus, the "start" attribute is in HTML5 for <ol>.

Back when I was working on Markdown 1.0, I was completely under the spell of the W3C's "strict" specs.

-J.G.
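For illustration only, here is a minimal sketch of what honouring the first number of an ordered list could look like. It is not taken from Markdown.pl or any other implementation; `ORDERED_ITEM` and `render_ordered_list` are hypothetical names, the input is assumed to be a run of already-isolated numbered lines, and no HTML escaping is attempted.

```python
import re

# Assumed item shape: 0-3 leading spaces, digits, a period, then spaces or a tab.
ORDERED_ITEM = re.compile(r"^ {0,3}(\d+)\.[ \t]+(.*)$")

def render_ordered_list(lines):
    """Render numbered lines as an <ol>, keeping the first item's number."""
    items = []
    start = None
    for line in lines:
        match = ORDERED_ITEM.match(line)
        if match is None:
            continue  # not a numbered item; ignored in this sketch
        if start is None:
            start = int(match.group(1))  # only the first number matters
        items.append("<li>%s</li>" % match.group(2))
    # Emit the start attribute only when the list does not begin at 1.
    attr = "" if start in (None, 1) else ' start="%d"' % start
    return "<ol%s>%s</ol>" % (attr, "".join(items))
```

Under these assumptions, a list written as "3. like" and "4. this" keeps its numbering through the `start` attribute, while a list beginning at 1 produces a plain `<ol>`.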
Waylan Limberg
2008-Mar-01 06:12 UTC
spaces and newlines before list markers (was: evolving the spec)
On Fri, Feb 29, 2008 at 5:46 PM, Thomas Nichols <nichols7 at googlemail.com> wrote:

> tough. In this instance, though, I haven't yet understood why Markdown
> should not continue to support the following from the [syntax page][],
>
> "List markers typically start at the left margin, but may be indented by
> up to three spaces. List markers must be followed by one or more spaces
> or a tab."

Hmm, I don't remember reading that before. Was it always there?

Anyway, to be honest this has been the hardest thing about markdown for me to wrap my head around (and probably why I picked it as an example in the other discussion). The way I always understood it, indenting in Markdown is done in increments of 4. Therefore, in my mind, the only amounts of indent allowed should be 0, 4, 8, 12, 16, ... and so on. It would never even occur to me to use any other amounts of indent for any reason - ever. And, in fact, I had never tried it until someone brought it up here on the list. With the exception of the "it makes copy and pasting easier" argument, I'll probably never understand it. That's just the way my mind works. Nothing against those who think differently.

And, in fact, reading Gruber's "Dive into Markdown" it would seem reasonable that one could conceivably take any well-crafted email and run it through a markdown parser and get some decent html. Obviously, in practice things don't work so well. In the end concessions have to be made on both sides. I won't enforce my restrictive view of indentation on the community at large, and perhaps those who like to indent secondary lines of list items will need to watch their line breaks to avoid errant list items.

We could always require a blank line between list items, but consider this list of one-word items:

    * red
    * blue
    * green

Do we really want to require the author to add a blank line between each item? I don't.

--
----
Waylan Limberg
waylan at gmail.com
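The syntax-page rule quoted in this message can be written down as a pattern. What follows is a minimal sketch in Python under one assumed reading of that sentence: zero to three leading spaces, then a bullet (`*`, `+` or `-`) or a number with a period, then at least one space or tab. `LIST_MARKER` and `parse_list_marker` are names invented for this example and do not come from any Markdown implementation.

```python
import re

# Assumed reading of the rule: 0-3 spaces of indent, a marker, then
# one or more spaces or a tab before the item text.
LIST_MARKER = re.compile(r"^( {0,3})([*+-]|\d+\.)([ \t]+)(.*)$")

def parse_list_marker(line):
    """Return (indent, marker, text) if the line starts a list item, else None."""
    match = LIST_MARKER.match(line)
    if match is None:
        return None
    indent, marker, _gap, text = match.groups()
    return len(indent), marker, text
```

With this pattern a marker indented by three spaces is still a list item, while one indented by four is not, which is exactly the boundary the thread is arguing about.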
Thomas Nichols
2008-Mar-01 07:46 UTC
spaces and newlines before list markers (was: evolving the spec)
Waylan Limberg wrote on 2008/03/01 6:12:

> Hmm, I don't remember reading that before. Was it always there?

Not sure. Been there for a couple of years at least, I think.

> Anyway, to be honest this has been the hardest thing about markdown
> for me to wrap my head around (and probably why I picked it as an
> example in the other discussion). The way I always understood it,
> indenting in Markdown is done in increments of 4. Therefore, in my
> mind, the only amounts of indent allowed should be 0, 4, 8, 12, 16,
> ... and so on. It would never even occur to me to use any other
> amounts of indent for any reason - ever. And, in fact, I had never
> tried it until someone brought it up here on the list. With the
> exception of the "it makes copy and pasting easier" argument, I'll
> probably never understand it. That's just the way my mind works.
> Nothing against those who think differently.

I use increments of two - but the same principle applies I think. Using ~ for spaces for clarity:

~~1. Item One
~~~~a. One A
~~~~b. One B
~~2. Item Two
...

... and unsurprisingly some (maybe all) of the Markdown implementations I use have a fit about this -- they interpret the 4 space indent as implying a <pre><code> block. In this context, though, I think a human would see immediately what is intended. Perhaps M-Spec parts 1 or 2 could as well?

> And, in fact, reading Gruber's "Dive into Markdown" it would seem
> reasonable that one could conceivably take any well-crafted email and
> run it through a markdown parser and get some decent html. Obviously,
> in practice things don't work so well. In the end concessions have to
> be made on both sides. I won't enforce my restrictive view of
> indentation on the community at large, and perhaps those who like to
> indent secondary lines of list items will need to watch their line
> breaks to avoid errant list items.

Sounds good to me :-)

> We could always require a blank line between list items, but consider
> this list of one-word items:
>
> * red
> * blue
> * green
>
> Do we really want to require the author to add a blank line between
> each item? I don't.

Nor me - I agree, compact lists like this are much better without blank lines. When list items require more than one line (in whatever tool I'm using to edit at the time) it can make sense to insert blank lines. To me, at least...

-- Thomas
Joseph Lorenzo Hall
2008-Mar-01 19:34 UTC
spaces and newlines before list markers (was: evolving the spec)
Is there any use case other than the copy and pasting example that gets you whitespace before a list item marker? Do people do this on purpose? best, Joe -- Joseph Lorenzo Hall UC Berkeley School of Information http://josephhall.org/
david parsons
2008-Mar-01 20:20 UTC
spaces and newlines before list markers (was: evolving the spec)
In article <928946aa0803011134y6054af3dq64a00834bfced75a at mail.gmail.com>, Joseph Lorenzo Hall <markdown-discuss at six.pairlist.net> wrote:>Is there any use case other than the copy and pasting example that >gets you whitespace before a list item marker? Do people do this on >purpose?I do it all the time. I indent text by 4 spaces, and do * text or 1. text Worse yet, I do 1. text 2. text . . . 10. text 11. text A revised markdown syntax that forbade this would be slightly catastrophic for me. -david parsons
Thomas Nichols
2008-Mar-01 20:31 UTC
spaces and newlines before list markers (was: evolving the spec)
Joseph Lorenzo Hall wrote on 2008/03/01 19:34:> Is there any use case other than the copy and pasting example that > gets you whitespace before a list item marker? Do people do this on > purpose? > > best, Joe >I do. Which is probably a guarantee that it is outlandish and uncivilised -- but I prefer it. I can possibly scrape together some logical arguments when I have a brain that functions, but for me it's an aesthetic reaction, an asterisk indented two spaces "feels right" to start a list item. In much the same way, on a typewriter I would indent the first line of a para. -- Thomas
Tomas Doran
2008-Mar-02 11:41 UTC
spaces and newlines before list markers (was: evolving the spec)
On 1 Mar 2008, at 19:34, Joseph Lorenzo Hall wrote:> Is there any use case other than the copy and pasting example that > gets you whitespace before a list item marker? Do people do this on > purpose?Yes, I use 1 or 2 spaces in all of my documents, so I'd be strongly against this being 'fixed'. Cheers Tom
Joseph Lorenzo Hall
2008-Mar-02 16:56 UTC
spaces and newlines before list markers (was: evolving the spec)
Sounds like there are quite a few people who write intuitively by placing a space or two before a list marker. Next question: what if we only allowed a fixed number of whitespaces before a list marker? For example, what if the spec said 0-1 whitespace characters before a list marker? Is that too rigid? This would accommodate those of us that write with a space before a list marker, but wouldn't capture my edge case (with the "2008." indented three spaces on a newline). best, Joe -- Joseph Lorenzo Hall UC Berkeley School of Information http://josephhall.org/
John Fraser
2008-Mar-02 18:00 UTC
spaces and newlines before list markers (was: evolving the spec)
On Mar 2, 2008, at 11:56 AM, Joseph Lorenzo Hall wrote:> Sounds like there are quite a few people who write intuitively by > placing a space or two before a list marker. Next question: what if > we only allowed a fixed number of whitespaces before a list marker? > For example, what if the spec said 0-1 whitespace characters before a > list marker? > > Is that too rigid?Tightening up indentation rules is definitely a breaking change, and I don't see any payoff for users here. If anything, we should be making indentation rules more lenient. I made a suggestion about indentation last July: <http://six.pairlist.net/pipermail/markdown-discuss/2007-July/000690.html > My idea is to consider a list item to be a child of the most recent list item whose bullet is indented less than its own. If there's no such parent, then the item is part of a top-level list. It's harder to parse than a two-space rule, but that's our problem as implementors. Anyway, this whole discussion of spaces and bullets is a good preview of the spec-writing process: we'll go back and forth for days, and end up with English-language pseudocode that nobody can follow without pen and paper. +1 for something formal-grammar-y. John Fraser
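The parent rule described in this message (an item is a child of the most recent item whose bullet is indented less than its own, otherwise it is top-level) is short enough to sketch. This is only an illustration of the rule as worded here, not code from any implementation; `assign_parents` is a hypothetical name and the input is assumed to be the bullet indent, in spaces, of each list item in document order.

```python
def assign_parents(indents):
    """For each item, return the index of its parent item, or None for top level."""
    parents = []
    for i, indent in enumerate(indents):
        parent = None
        # Walk backwards to the most recent item indented less than this one.
        for j in range(i - 1, -1, -1):
            if indents[j] < indent:
                parent = j
                break
        parents.append(parent)
    return parents
```

For the two-space style shown earlier in the thread (bullets at 2, 4, 4 and 2 spaces), this gives one top-level item with two children followed by a second top-level item, regardless of the absolute indent used.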
Joseph Lorenzo Hall
2008-Mar-02 18:26 UTC
spaces and newlines before list markers (was: evolving the spec)
On Sun, Mar 2, 2008 at 10:00 AM, John Fraser <john at attacklab.net> wrote:

> Tightening up indentation rules is definitely a breaking change, and I
> don't see any payoff for users here. If anything, we should be making
> indentation rules more lenient.

My only desire is to figure out a way to allow the whitespace-before-list-marker and also avoid the more general class of "bugs" where a list is triggered by a sentence ending with a number on an indented newline.

The reference citation I sent out on another thread is one example but anything of the following form will trigger this:

    * This is a list item with a hanging indent ending with a number,
      4. The rest is considered a child of a new ordered list, no matter
      what I do to this paragraph (other than rephrase to get rid of the
      hanging-indented digit+dot).

Which produces

    <ul>
    <li>This is a list item with a hanging indent ending with a number,
    <ol><li>The rest is considered a child of a new ordered list, no matter
    what I do to this paragraph (other than rephrase to get rid of the
    hanging-indented digit+dot).</li></ol></li>
    </ul>

Is this something we're comfortable with? If not, can we come up with something that avoids this?

best, Joe

--
Joseph Lorenzo Hall
UC Berkeley School of Information
http://josephhall.org/
Thomas Nichols
2008-Mar-03 00:03 UTC
spaces and newlines before list markers (was: evolving the spec)
Joseph Lorenzo Hall wrote on 2008/03/02 18:26:> On Sun, Mar 2, 2008 at 10:00 AM, John Fraser <john at attacklab.net> wrote: > >> Tightening up indentation rules is definitely a breaking change, and I >> don't see any payoff for users here. If anything, we should be making >> indentation rules more lenient. >> > > My only desire is to figure out a way to allow the > whitspace-before-list-marker and also avoid the more general class of > "bugs" where a list is triggered by a sentence ending with a number on > an indented newline. > > The reference citation I sent out on another thread is one example but > anything of the following form will trigger this: > > * This is a list item with a hanging indent ending with a number, > 4. The rest is considered a child of a new ordered list, no matter > what I do to this paragraph (other than rephrase to get rid of the > hanging-indented digit+dot). > > Which produces > > <ul> > <li>This is a list item with a hanging indent ending with a number, > <ol><li>The rest is considered a child of a new ordered list, no matter > what I do to this paragraph (other than rephrase to get rid of the > hanging-indented digit+dot).</li></ol></li> > </ul> > > Is this something we're comfortable with? If not, can we come up with > something that avoids this? best, Joe > >Actually, when I first read your example I was confused -- I thought '4.' was a second-level bullet point, despite the comma on the preceding line. If a human (admittedly a very tired one) can make this interpretation, I can live with a Markdown processor making it also. John's proposed approach in http://six.pairlist.net/pipermail/markdown-discuss/2007-July/000690.html seems to fit well with what a naive (tired) human might expect to happen. As ever, YMMV. -- Thomas.
Seumas Mac Uilleachan
2008-Mar-03 03:14 UTC
spaces and newlines before list markers (was: evolving the spec)
Joseph Lorenzo Hall wrote:> Sounds like there are quite a few people who write intuitively by > placing a space or two before a list marker. Next question: what if > we only allowed a fixed number of whitespaces before a list marker? > For example, what if the spec said 0-1 whitespace characters before a > list marker? > > Is that too rigid? > > This would accommodate those of us that write with a space before a > list marker, but wouldn't capture my edge case (with the "2008." > indented three spaces on a newline). best, Joe > >What's needed is a way to distinguish your edge case from the general case where it would be a list. Do you use two white spaces to preserve the line breaks? Perhaps that could be the trigger in this case - a line ending in two white spaces prevents the next line from being formatted as a new list. I just tested this edge case in PHP Markdown Extra and it does the same thing (both with and without the two white spaces for newlines).
Joseph Lorenzo Hall
2008-Mar-03 04:02 UTC
spaces and newlines before list markers (was: evolving the spec)
On Sun, Mar 2, 2008 at 7:14 PM, Seumas Mac Uilleachan <seumas at idirect.ca> wrote:> > What's needed is a way to distinguish your edge case from the general > case where it would be a list. Do you use two white spaces to preserve > the line breaks? Perhaps that could be the trigger in this case - a line > ending in two white spaces prevents the next line from being formatted > as a new list.I don't tend to end each line with two spaces (it's emacs' justification). best, Joe
Michel Fortin
2008-Mar-03 11:48 UTC
spaces and newlines before list markers (was: evolving the spec)
On 2008-03-02 at 22:14, Seumas Mac Uilleachan wrote:

> What's needed is a way to distinguish your edge case from the
> general case where it would be a list. Do you use two white spaces
> to preserve the line breaks? Perhaps that could be the trigger in
> this case - a line ending in two white spaces prevents the next line
> from being formatted as a new list.

I don't think that's a good idea. Two spaces at the end of a line means a line break, not an end of the current paragraph.

> I just tested this edge case in PHP Markdown Extra and it does the
> same thing (both with and without the two white spaces for newlines).

Indeed. I'm not sure what could be done here, but here is an idea.

John changed things a long time ago now so that it doesn't pose a problem for text at the root of the document by forcing a blank line to be present before a list when not inside a list. I'm thinking that we could do the same for the content of list items parsed as block-level content. For instance, here you would have a nested ordered list:

    * Blah blah blah
        1. blah blah
    * Blah blah blah

Here too:

    * Blah blah blah

        1. blah blah

    * Blah blah blah

But not in the next examples. Here the "1." list marker wouldn't be accepted because we're in a block-level list element (since there is a blank line between the two items):

    * Blah blah blah
        1. blah blah

    * Blah blah blah

Same here, because there is a blank line inside the list item:

    * Blah blah blah

        Blah blah blah
        1. blah blah
    * Blah blah blah

Perhaps that's too subtle a distinction, but it's my preferred solution to date.

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
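The distinction proposed in this message can be sketched as a small check. This is a hedged illustration rather than how PHP Markdown works: `NESTED_MARKER` and `accepted_nested_markers` are hypothetical names, nested markers are assumed to be indented, and an item is assumed to be block-level whenever a blank line appears inside it or between it and the next item.

```python
import re

# Hypothetical helper pattern: an indented bullet or numbered marker.
NESTED_MARKER = re.compile(r"^\s+([*+-]|\d+\.)[ \t]+")

def accepted_nested_markers(item_lines):
    """Return indices of lines accepted as nested list markers.

    item_lines holds one list item: its first line, its continuation
    lines, and any blank line separating it from the next item.
    """
    # Assumption: any blank line makes the item block-level.
    block_level = any(line.strip() == "" for line in item_lines)
    accepted = []
    for i in range(1, len(item_lines)):
        if not NESTED_MARKER.match(item_lines[i]):
            continue
        previous_blank = item_lines[i - 1].strip() == ""
        # Compact items accept any nested marker; block-level items
        # require a blank line directly before it.
        if not block_level or previous_blank:
            accepted.append(i)
    return accepted
```

Applied to the four cases in the message, the sketch accepts the nested "1." in the compact item and in the item where the marker follows a blank line, and rejects it in the two block-level items where the marker directly follows a line of text.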
Seumas Mac Uilleachan
2008-Mar-03 14:32 UTC
spaces and newlines before list markers (was: evolving the spec)
Michel Fortin wrote:> Le 2008-03-02 ? 22:14, Seumas Mac Uilleachan a ?crit : > >> What's needed is a way to distinguish your edge case from the general >> case where it would be a list. Do you use two white spaces to >> preserve the line breaks? Perhaps that could be the trigger in this >> case - a line ending in two white spaces prevents the next line from >> being formatted as a new list. > > I don't think that's a good idea. Two spaces at the end of a line > means a line break, not an end of the current paragraph.Yes, my mistake was in thinking that the lines as indented and separated in the source was what was desired in the output, but the output removes the leading spaces (as it should). I have always wondered why the arbitrary starting point for numbered lists was desirable. If Markdown is supposed to be intuitive, then it doesn't really make sense to either start a list with a number other than one or, if so, to then force that number to become one. The first case is not something I have ever done myself and the second really would be conflicting with the user's intent (imho). Except in the case of: * Foo 1. Blah 2. Blah * Foo 3. Blah or # Header 1. Point one 2. Point two # Header 3. Point three (both variations on the same idea) I can't really think of where not starting with one is desirable (and here it's a continuation of a list that did), and this could conceivably be covered by tracking the previous list numbers, right? Currently the 3. is changed to 1. but should it not remain 3.? In that case, changing the syntax to say that lists start with 1. unless there is a previous (same level) list being continued would make a lot of sense to me. Following numbers don't have to be consecutive (for editing and changing orders of lists etc) just ensure first item is 1. Then again, you still have the edge case where the 1. is intended to be the end of a sentence in the list item with another sentence following. This is what is happening with the "2008." 
that started all this. It is not the "2008." but the "from" following that creates the problem. (remove the "from" and the 2008. is preserved as is) For that you probably need the subtle distinction you proposed below (which works for the above as well if you want to keep the arbitrary starting number).>> I just tested this edge case in PHP Markdown Extra and it does the >> same thing (both with and without the two white spaces for newlines). > > Indeed. I'm not sure what could be done here however, but here is an > idea. > > John changed things a long time ago now so that it doesn't pose a > problem for text at the root of the document by forcing a blank line > to be present before a list when not inside a list. I'm thinking that > we could do the same for the content of list item parsed as > block-level content. For instance, here you would have a nested > ordered list: > > * Blah blah blah > 1. blah blah > * Blah blah blah > > Here too: > > * Blah blah blah > > 1. blah blah > > * Blah blah blah > > But not in the next examples. Here the "1." list marker wouldn't be > accepted because we're in a block-level list element (since there is a > blank line between the two items): > > > * Blah blah blah > 1. blah blah > > * Blah blah blah > > Same here, because there is a blank line inside the list item: > > * Blah blah blah > > Blah blah blah > 1. blah blah > * Blah blah blah > > Perhaps that's a too subtle distinction, but it's my preferred > solution to date. > > > Michel Fortin > michel.fortin at michelf.com > http://michelf.com/ > > > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss > >
Joseph Lorenzo Hall
2008-Mar-04 00:14 UTC
spaces and newlines before list markers (was: evolving the spec)
I'm a big fan of this proposal... seems to nicely take care of my dorky edge case. best, Joe On Mon, Mar 3, 2008 at 3:48 AM, Michel Fortin <michel.fortin at michelf.com> wrote:> Le 2008-03-02 ? 22:14, Seumas Mac Uilleachan a ?crit : > > > > What's needed is a way to distinguish your edge case from the > > general case where it would be a list. Do you use two white spaces > > to preserve the line breaks? Perhaps that could be the trigger in > > this case - a line ending in two white spaces prevents the next line > > from being formatted as a new list. > > I don't think that's a good idea. Two spaces at the end of a line > means a line break, not an end of the current paragraph. > > > > I just tested this edge case in PHP Markdown Extra and it does the > > same thing (both with and without the two white spaces for newlines). > > Indeed. I'm not sure what could be done here however, but here is an > idea. > > John changed things a long time ago now so that it doesn't pose a > problem for text at the root of the document by forcing a blank line > to be present before a list when not inside a list. I'm thinking that > we could do the same for the content of list item parsed as block- > level content. For instance, here you would have a nested ordered list: > > * Blah blah blah > 1. blah blah > * Blah blah blah > > Here too: > > * Blah blah blah > > 1. blah blah > > * Blah blah blah > > But not in the next examples. Here the "1." list marker wouldn't be > accepted because we're in a block-level list element (since there is a > blank line between the two items): > > > * Blah blah blah > 1. blah blah > > * Blah blah blah > > Same here, because there is a blank line inside the list item: > > * Blah blah blah > > Blah blah blah > 1. blah blah > * Blah blah blah > > Perhaps that's a too subtle distinction, but it's my preferred > solution to date. 
> > > Michel Fortin > michel.fortin at michelf.com > http://michelf.com/ > > > > > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss >-- Joseph Lorenzo Hall UC Berkeley School of Information http://josephhall.org/
> Personally, some of the "holes" in the current syntax rules are
> actually the "features" that make this statement true. As
> implementors, we want a strict spec because it's easier to implement,
> but that does not always result in easier to read and/or write.

I don't see how ambiguity of the spec and the ease of reading and writing are inherently linked. The spec can say that the list must start with a "*" right after the new line, or it can say that the * may be preceded by up to 3 spaces, or it can say that the * can be preceded by any number of spaces. Let's pick the option that we find most readable and writable. But let's decide and make it clear. Otherwise, whenever you move from one implementation to another you'll have to learn which of those features are supported and which are not.

> item. While J.G. agreed (IIRC) that that probably is a bug that should
> be fixed, we learned through the course of that conversation that a
> number of people actually are relying on that "bug" as a "feature",

Well, let's have more of those conversations. At this point many of us have used markdown for several years and have opinions about what works and what doesn't. E.g., I am quite convinced (as I think Michel is too) that italicizing_parts_of_words was a really bad idea. I can agree that historically the ambiguity of the spec has allowed for some experimentation that was good. But at this point I think we might be better off settling on a spec.

> > Markdown is not a replacement for HTML, or even close to it. Its syntax is
> > very small, corresponding only to a very small subset of HTML tags. The
> > idea is not to create a syntax that makes it easier to insert HTML tags.
> > In my opinion, HTML tags are already easy to insert. The idea for
> > Markdown is to make it easy to read, write, and edit prose.

Well, except that people use it for blogs and wikis and they need some of those extra features, and if we don't agree on some of them (like definition lists and tables) then we end up with a hundred different ways of doing the same thing. E.g., as was discussed a few weeks ago, markdown's handling of code blocks is nice if you are editing your text in an editor that supports block indentation, but is quite dysfunctional if you are doing it in a web form, especially if most of your content is snippets of code. We talked about this, and what now? Some people might go and implement an extension for {{{...}}} for their favourite implementation. Others will go and implement something else.

- yuri

--
http://sputnik.freewisdom.org/
On Fri, Feb 29, 2008 at 11:59 AM, Yuri Takhteyev <qaramazov at gmail.com> wrote:

> Well, except that people use it for blogs and wikis and they need some
> of those extra features, and if we don't agree on some of them (like
> definition lists and tables) then we end up with a hundred different
> ways of doing the same thing. E.g., as was discussed a few weeks ago,
> markdown's handling of code blocks is nice if you are editing your text
> in an editor that supports block indentation, but is quite
> dysfunctional if you are doing it in a web form, especially if most of
> your content is snippets of code. We talked about this, and what now?
> Some people might go and implement an extension for {{{...}}} for
> their favourite implementation. Others will go and implement
> something else.

So, as a policy person, what I haven't seen is a proposal for making decisions about all of this stuff. What say you, markdown-discuss?

Should we have a council of implementation maintainers? Yikes. Probably not. What about an IETF-like "rough consensus and running code/spec/etc." model?

best, Joe

--
Joseph Lorenzo Hall
UC Berkeley School of Information
http://josephhall.org/
* Waylan Limberg <waylan at gmail.com> [2008-02-29 17:00]:

> As implementors, we want a strict spec because it's easier to
> implement, but that does not always result in easier to read
> and/or write.

You have "strict" and "simplistic" confused.

If the spec for the syntax is rigorous that does not mean the syntax has to be rigid. It just means it is well-defined how an implementation of Markdown should parse particular constructs. That has nothing whatsoever to do with the flexibility in the syntax.

The current syntax summary leaves a lot of grey areas where you could reasonably do any of several things. The upshot is that whenever someone implements Markdown from scratch, most likely his implementation *does* do something other than the reference implementation for that case. And experience confirms this.

Therefore, I don't quite follow this argument:

> Now, before you all write me off as insane, this is actually
> why I think Markdown 2.0 is a good idea. By moving to 2.0, we
> don't have to worry about backward compatibility (Markdown 2.0
> should not allow those 3 spaces).

You *already* can't move documents from one implementation to another without expecting breakage. Heck, you can't move them from one version of an implementation to a newer version of the same implementation without expecting some breakage.

The question is, how much breakage would conformance to the more rigorous spec cause?

If it isn't much: do you think any users will care about the subtleties of Markdown 1.0 vs Markdown 2.0? Don't you think they'll blithely grind their Markdown 1.0 documents through a Markdown 2.0 processor if this works most of the time?

If it would cause a lot of breakage: isn't that maybe because a lot of people actually like this "unintended feature"? Does that not possibly mean that it's worthwhile to try preserving it as an actual feature in the spec? (If several implementations have the same accidental feature, particularly?) Remember, it's always easier to change the spec to fit existing fact rather than the other way around. (Cf. HTML5.)

Now if you insist on causing so much breakage that people *cannot* just grind their 1.0 documents through a 2.0 processor: what do you expect this to imply for the adoption of 2.0? Don't you think it would rather cause a lot of people not to "upgrade" the way they did with XML 2.0 and Windows Vista? Some upgrades those were.

Now turn around and consider the two calls you make in relation to each other. I don't know about you, but it seems contradictory to me that first you argue for a spec that allows documents to be written in a very lax syntax, then turn around and argue that backward compatibility should be abandoned so that we can have the freedom to reduce the flexibility of the syntax.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
On Fri, Feb 29, 2008 at 6:18 PM, Aristotle Pagaltzis <pagaltzis at gmx.de> wrote:

> [snip]

Thanks Aristotle. You bring up many good points which I will not argue with. That post was meant to generate conversation and you, among others, took the bait (although you may have bitten a little harder). These are all things we need to discuss and consider if and when a spec is developed, and I hadn't seen that conversation yet. Additionally, I may not have used the best words at times and you called me on that, so thank you.

--
----
Waylan Limberg
waylan at gmail.com