Benchmarking the pure-ruby version of the parser, which is a new option in 4.2... ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.2.0 compiled in Ruby... Finished in 30.703538 seconds 30 seconds is a long time compared to... ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 3.0.4 compiled in ruby-regex... Finished in 0.346966 seconds ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.0.0 compiled in C... Finished in 0.055147 seconds ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.1.1 compiled in C... Finished in 0.05756 seconds ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.1.9 compiled in C... Finished in 0.140867 seconds ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.2.0 compiled in C... Finished in 0.13385 seconds Uh, yeah. I knew it was slow, but until now I had no idea it was _that_ slow compared to 3.0.4. The tests in 3.0.4 took a long time to run, so I just figured the two all-ruby versions would be about the same. Now that I think about it, it does make sense that even lots of Regexes would be faster than doing all the looping, character comparison, and concatenation in Ruby. Also note that RedCloth has been getting slower with each version, except this last one where I tried hard to reduce complexity in some areas. Fixing bugs usually means making the machine more specific, which means more complexity. The parser binary has more than doubled in size since 4.0.0. Let''s see about Ruby 1.8 vs 1.9... ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 3.0.4 compiled in ruby-regex... Finished in 0.320067 seconds ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 3.0.4 compiled in ruby-regex... Finished in 0.566467 seconds So 3.0.4 is _slower_ in Ruby 1.9. Interesting. Is 4.2.0? ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.2.0 compiled in C... Finished in 0.148924 seconds ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O spec/ spec.opts Benchmarking version 4.2.0 compiled in C... Finished in 0.107696 seconds Thankfully, no. My conclusion is, I need to do something different with RedCloth in the long term if I want it available as a pure-Ruby library and if I don''t want the size and parse time to keep ballooning. Jason
Thanks for the very interesting comparaisons Jason. Pure ruby could definitely be an option. As I said, I really need a good, ruby extendable textile parser for zena in the long run so I''d be glad to help. I have implemented several pure ruby (regex based) parsers (parser for http://zenadmin.org/zafu is one) and other ragel based (json/scripts for http://rubyk.org for example). Regarding 3.0.4: I see most regexp are not left anchored (starting with /\A.../). I think we could get a nice speedup (and a more robust parser) if we used some kind of state machine instead of raw split and gsub. The zafu parser for example uses contextual regexps and eats input data as it matches. This means that input data is only parsed a minimal amount of times. Gaspard On Thu, Jun 11, 2009 at 4:49 PM, Jason Garber<jg at jasongarber.com> wrote:> Benchmarking the pure-ruby version of the parser, which is a new option in > 4.2... > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.2.0 compiled in Ruby... > Finished in 30.703538 seconds > > 30 seconds is a long time compared to... > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 3.0.4 compiled in ruby-regex... > Finished in 0.346966 seconds > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.0.0 compiled in C... > Finished in 0.055147 seconds > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.1.1 compiled in C... > Finished in 0.05756 seconds > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.1.9 compiled in C... > Finished in 0.140867 seconds > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.2.0 compiled in C... > Finished in 0.13385 seconds > > Uh, yeah. ?I knew it was slow, but until now I had no idea it was _that_ > slow compared to 3.0.4. ?The tests in 3.0.4 took a long time to run, so I > just figured the two all-ruby versions would be about the same. ?Now that I > think about it, it does make sense that even lots of Regexes would be faster > than doing all the looping, character comparison, and concatenation in Ruby. > > Also note that RedCloth has been getting slower with each version, except > this last one where I tried hard to reduce complexity in some areas. ?Fixing > bugs usually means making the machine more specific, which means more > complexity. ?The parser binary has more than doubled in size since 4.0.0. > > Let''s see about Ruby 1.8 vs 1.9... > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 3.0.4 compiled in ruby-regex... > Finished in 0.320067 seconds > > ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O > spec/spec.opts > Benchmarking version 3.0.4 compiled in ruby-regex... > Finished in 0.566467 seconds > > So 3.0.4 is _slower_ in Ruby 1.9. ?Interesting. ?Is 4.2.0? > > ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/spec.opts > Benchmarking version 4.2.0 compiled in C... > Finished in 0.148924 seconds > > ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O > spec/spec.opts > Benchmarking version 4.2.0 compiled in C... > Finished in 0.107696 seconds > > Thankfully, no. > > My conclusion is, I need to do something different with RedCloth in the long > term if I want it available as a pure-Ruby library and if I don''t want the > size and parse time to keep ballooning. > > Jason > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards >
On Jun 11, 2009, at 1:24 PM, Hugh Sasse wrote:> Is the ruby code to do this in the RedCloth-4.2.0 distro? I can''t see > any parsers in there (grep -lir parser .) except the C code.You have to get the source code yourself and build it with rake pureruby compile. I didn''t include it in any package/distro because I want to discourage its use (it is 100x slower, after all!).
Gaspard, thanks for offering to help! I''ll surely take you up on it? seems like you have some very helpful experience. Want to do a little more research on it myself first, though. Then maybe you and I can get together and work out some prototypes. Jason On Jun 11, 2009, at 12:44 PM, Gaspard Bucher wrote:> Thanks for the very interesting comparaisons Jason. Pure ruby could > definitely be an option. > > As I said, I really need a good, ruby extendable textile parser for > zena in the long run so I''d be glad to help. I have implemented > several pure ruby (regex based) parsers (parser for > http://zenadmin.org/zafu is one) and other ragel based (json/scripts > for http://rubyk.org for example). > > Regarding 3.0.4: I see most regexp are not left anchored (starting > with /\A.../). I think we could get a nice speedup (and a more robust > parser) if we used some kind of state machine instead of raw split and > gsub. The zafu parser for example uses contextual regexps and eats > input data as it matches. This means that input data is only parsed a > minimal amount of times. > > Gaspard > > On Thu, Jun 11, 2009 at 4:49 PM, Jason Garber<jg at jasongarber.com> > wrote: >> Benchmarking the pure-ruby version of the parser, which is a new >> option in >> 4.2... >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.2.0 compiled in Ruby... >> Finished in 30.703538 seconds >> >> 30 seconds is a long time compared to... >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 3.0.4 compiled in ruby-regex... >> Finished in 0.346966 seconds >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.0.0 compiled in C... >> Finished in 0.055147 seconds >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.1.1 compiled in C... >> Finished in 0.05756 seconds >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.1.9 compiled in C... >> Finished in 0.140867 seconds >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.2.0 compiled in C... >> Finished in 0.13385 seconds >> >> Uh, yeah. I knew it was slow, but until now I had no idea it was >> _that_ >> slow compared to 3.0.4. The tests in 3.0.4 took a long time to >> run, so I >> just figured the two all-ruby versions would be about the same. >> Now that I >> think about it, it does make sense that even lots of Regexes would >> be faster >> than doing all the looping, character comparison, and concatenation >> in Ruby. >> >> Also note that RedCloth has been getting slower with each version, >> except >> this last one where I tried hard to reduce complexity in some >> areas. Fixing >> bugs usually means making the machine more specific, which means more >> complexity. The parser binary has more than doubled in size since >> 4.0.0. >> >> Let''s see about Ruby 1.8 vs 1.9... >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 3.0.4 compiled in ruby-regex... >> Finished in 0.320067 seconds >> >> ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O >> spec/spec.opts >> Benchmarking version 3.0.4 compiled in ruby-regex... >> Finished in 0.566467 seconds >> >> So 3.0.4 is _slower_ in Ruby 1.9. Interesting. Is 4.2.0? >> >> ~/Documents/redcloth(master) $ spec spec/benchmark_spec.rb -O spec/ >> spec.opts >> Benchmarking version 4.2.0 compiled in C... >> Finished in 0.148924 seconds >> >> ~/Documents/redcloth(master) $ spec19 spec/benchmark_spec.rb -O >> spec/spec.opts >> Benchmarking version 4.2.0 compiled in C... >> Finished in 0.107696 seconds >> >> Thankfully, no. >> >> My conclusion is, I need to do something different with RedCloth in >> the long >> term if I want it available as a pure-Ruby library and if I don''t >> want the >> size and parse time to keep ballooning. >> >> Jason >> _______________________________________________ >> Redcloth-upwards mailing list >> Redcloth-upwards at rubyforge.org >> http://rubyforge.org/mailman/listinfo/redcloth-upwards >> > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards
For benchmarking, I think it could make sense to have some representative data of a typical application. This might mean large amounts of pure text without any textile tags (except paragraphs), or it might be a mixture of very short texts with a few very long ones. I could give you the data from http://zenadmin.org, but the links and images are not very typical: "":45 (internal link to node 45)> Want to do a little more research on it myself first, though.Of course ! Please continue to give us feedback. This kind of experimenting can provide very useful information on the different parsing solutions. G.
On Thu, 11 Jun 2009, Jason Garber wrote:> On Jun 11, 2009, at 1:24 PM, Hugh Sasse wrote: > > Is the ruby code to do this in the RedCloth-4.2.0 distro? I can''t see > > any parsers in there (grep -lir parser .) except the C code. > > You have to get the source code yourself and build it with rake pureruby > compile. I didn''t include it in any package/distro because I want to > discourage its use (it is 100x slower, after all!).Ok, just wondered if I could see any speedups in a few mins. I''ll leave this for now. I will have a bit more time in a few weeks I think. Hugh> > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards
Hugh Sasse wrote:> Is the ruby code to do this in the RedCloth-4.2.0 distro? I can''t see > any parsers in there (grep -lir parser .) except the C code. > > I''m wondering, much of the work of a parser will be looking for > fixed tokens, won''t it? I know for numbers and arbitrary strings > this isn''t true... I ask because one of the few weaknesses in Ruby > is that there is an over-emphasis on Regexps for processing strings > in the language, and the only way I know to test whether some string > (str) begins with a substring (sought), other than regexps, is > str.index(sought).zero? . Have I missed something obvious? It seems > to me that constructing something as complex as a regexp for such an > operation would be expensive. Even str.scan which will accept a string > rather than a regexp seems to treat such a string as a regexp, from a > quick look at the c code. > > Hugh >Using index() to check the beginning of the string will be very slow, since will search the entire string for the substring, although you are only interested in the beginning of the string. I would suggest str[0,sought.length] == sought, which is considerably faster, especially for short substrings. For longer ones, it seems (from some quick benchmarking) that each approach is about equal. -Justin require ''benchmark'' include Benchmark str = "adiaodibasd" * 1000 sought = "hello" bmbm do |t| t.report "Index" do 100000.times do str.index(sought) == 0 end end t.report "Regexp" do reg = /^#{sought}/ 100000.times do str =~ reg end end t.report "Slice" do 100000.times do str[0,sought.length] == sought end end end
I followed some of the discussions on the treetop mailing list and it sure looks great. I''ll have a look at the subset you are parsing and if that''s not too much, I''ll try an Oniguruma based parser.>From what I understand, Treetop builds an abstract syntax tree withrendering methods attached. This means that the parser knows about everything there is to know about parsing *and* rendering (Interpreter pattern). I''ll try to build something that could be totally agnostic of what we actually do with the AST (or S-expression), may it be pdf, html, latex or other SM gears (Visitor pattern). Gaspard On Tue, Jun 16, 2009 at 10:02 PM, Jason Garber<jg at jasongarber.com> wrote:> Gaspard (and others who are interested), > I''ve been researching some tools in this domain to figure out what they''re > good at and what they''re not. ?I like the advantages of parsing expression > grammars (PEGs) over regular expressions so I started a test rewrite with > Treetop to find out what the gotchas are (there are always gotchas). ?I''ve > hit surprisingly few and mostly they were indicative that I was thinking > about the problem wrongly. ?Overall it''s going great, it''s fast (so far) and > very elegant. ?I can get down and spec individual parts of the grammar or > even individual rules! > I''m interested to see what you would come up with using contextual regexps > like you mentioned for the zafu parser. ?I''d welcome you to build a little > prototype like I''m building with Treetop > (http://github.com/jgarber/redcloth-treetop). ?If yours implements the same > subset of Textile, then we can do some benchmarking, compare the grammars > and so forth. ?Don''t feel obligated?just know that I''m open-minded about it. > I think I''d really like to stay away from C/Java unless someone demonstrates > that it''s the only way to parse with any speed. ?My experience so far with > Treetop indicates a Ruby parser can?be fast enough (and if speed were really > more important to us than some other factors, we wouldn''t have picked Ruby > in the first place!). > Either way, I also agree that it needs to be durable over the long-term. > ?I''m making websites in Textile right and left and I can''t be updating them > every time the parser changes. ?I think lots of unit tests are the only sure > way to prevent regression (like happened in 4.2.0) and I''ve been very > pleased with how I can spec the behavior of individual rules and grammars in > my treetop prototype. > So, if you want to start a prototype regexp parser, that''s cool, or if you > like what you see on my prototype project, you''re welcome to join that as > well. > Regards, > Jason > On Jun 11, 2009, at 2:25 PM, Gaspard Bucher wrote: > > For benchmarking, I think it could make sense to have some > representative data of a typical application. This might mean large > amounts of pure text without any textile tags (except paragraphs), or > it might be a mixture of very short texts with a few very long ones. I > could give you the data from http://zenadmin.org, but the links and > images are not very typical: > > "":45 (internal link to node 45) > > Want to do a little more research on it myself first, though. > > Of course ! Please continue to give us feedback. This kind of > experimenting can provide very useful information on the different > parsing solutions. > > G. > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards > > > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards >
Yes, you can use Treetop for the interpreter pattern out of the box, but I''m doing following the visitor pattern like Cucumber did. Just started working on the visitor, actually. And my posts to the treetop list are a little old?I''ve moved past the problems I described there? so don''t read them as the current state of things. Hope I''m using all the terminology right. I''m new to all this (no CompSci background). On Jun 16, 2009, at 4:29 PM, Gaspard Bucher wrote:> I followed some of the discussions on the treetop mailing list and it > sure looks great. I''ll have a look at the subset you are parsing and > if that''s not too much, I''ll try an Oniguruma based parser. > >> From what I understand, Treetop builds an abstract syntax tree with > rendering methods attached. This means that the parser knows about > everything there is to know about parsing *and* rendering (Interpreter > pattern). > > I''ll try to build something that could be totally agnostic of what we > actually do with the AST (or S-expression), may it be pdf, html, latex > or other SM gears (Visitor pattern). > > Gaspard > > > On Tue, Jun 16, 2009 at 10:02 PM, Jason Garber<jg at jasongarber.com> > wrote: >> Gaspard (and others who are interested), >> I''ve been researching some tools in this domain to figure out what >> they''re >> good at and what they''re not. I like the advantages of parsing >> expression >> grammars (PEGs) over regular expressions so I started a test >> rewrite with >> Treetop to find out what the gotchas are (there are always >> gotchas). I''ve >> hit surprisingly few and mostly they were indicative that I was >> thinking >> about the problem wrongly. Overall it''s going great, it''s fast (so >> far) and >> very elegant. I can get down and spec individual parts of the >> grammar or >> even individual rules! >> I''m interested to see what you would come up with using contextual >> regexps >> like you mentioned for the zafu parser. I''d welcome you to build a >> little >> prototype like I''m building with Treetop >> (http://github.com/jgarber/redcloth-treetop). If yours implements >> the same >> subset of Textile, then we can do some benchmarking, compare the >> grammars >> and so forth. Don''t feel obligated?just know that I''m open-minded >> about it. >> I think I''d really like to stay away from C/Java unless someone >> demonstrates >> that it''s the only way to parse with any speed. My experience so >> far with >> Treetop indicates a Ruby parser can be fast enough (and if speed >> were really >> more important to us than some other factors, we wouldn''t have >> picked Ruby >> in the first place!). >> Either way, I also agree that it needs to be durable over the long- >> term. >> I''m making websites in Textile right and left and I can''t be >> updating them >> every time the parser changes. I think lots of unit tests are the >> only sure >> way to prevent regression (like happened in 4.2.0) and I''ve been very >> pleased with how I can spec the behavior of individual rules and >> grammars in >> my treetop prototype. >> So, if you want to start a prototype regexp parser, that''s cool, or >> if you >> like what you see on my prototype project, you''re welcome to join >> that as >> well. >> Regards, >> Jason >> On Jun 11, 2009, at 2:25 PM, Gaspard Bucher wrote: >> >> For benchmarking, I think it could make sense to have some >> representative data of a typical application. This might mean large >> amounts of pure text without any textile tags (except paragraphs), or >> it might be a mixture of very short texts with a few very long >> ones. I >> could give you the data from http://zenadmin.org, but the links and >> images are not very typical: >> >> "":45 (internal link to node 45) >> >> Want to do a little more research on it myself first, though. >> >> Of course ! Please continue to give us feedback. This kind of >> experimenting can provide very useful information on the different >> parsing solutions. >> >> G. >> _______________________________________________ >> Redcloth-upwards mailing list >> Redcloth-upwards at rubyforge.org >> http://rubyforge.org/mailman/listinfo/redcloth-upwards >> >> >> _______________________________________________ >> Redcloth-upwards mailing list >> Redcloth-upwards at rubyforge.org >> http://rubyforge.org/mailman/listinfo/redcloth-upwards >> > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards
Hi Hugh ! I went through a description of the top down parser using Python (http://effbot.org/zone/simple-top-down-parsing.htm) and it''s funny how we keep reinventing the wheel: I just implemented something similar for querybuilder, using a ruby array as stack and poping the stack depending on operator precedence (http://tinyurl.com/mgj6et). Using operator precedence could be an interesting solution to solve inherent textile ambiguities but it does not remove the curse of tokenizing. I''ll keep this operator precedence idea in mind when writing a regexp based prototype. G. On Wed, Jun 17, 2009 at 12:45 AM, Hugh Sasse<hgs at dmu.ac.uk> wrote:> It sounds like you are not looking for alternative techniques any > more. That''s good, but I''ve found Top Down Operator Precedence > parsers, as explained in Chapter 9 of "Beautiful Code" [Oram and > Wilson, O''Reilly] particularly easy to understand, and relatively OK > to debug as well. ?The chapter is an implementation and explanation > of a paper by Vaughan Pratt in 1973, but it''s an ACM paper, so access > to it may be a problem. > > ? ? ? ?HTH > ? ? ? ?Hugh > > On Tue, 16 Jun 2009, Jason Garber wrote: > >> Yes, you can use Treetop for the interpreter pattern out of the box, but I''m >> doing following the visitor pattern like Cucumber did. ?Just started working >> on the visitor, actually. ?And my posts to the treetop list are a little >> old?I''ve moved past the problems I described there?so don''t read them as the >> current state of things. >> >> Hope I''m using all the terminology right. ?I''m new to all this (no CompSci >> background). >> >> On Jun 16, 2009, at 4:29 PM, Gaspard Bucher wrote: >> >> > I followed some of the discussions on the treetop mailing list and it >> > sure looks great. I''ll have a look at the subset you are parsing and >> > if that''s not too much, I''ll try an Oniguruma based parser. >> > >> > > From what I understand, Treetop builds an abstract syntax tree with >> > rendering methods attached. This means that the parser knows about >> > everything there is to know about parsing *and* rendering (Interpreter >> > pattern). >> > >> > I''ll try to build something that could be totally agnostic of what we >> > actually do with the AST (or S-expression), may it be pdf, html, latex >> > or other SM gears (Visitor pattern). >> > >> > Gaspard >> > >> > >> > On Tue, Jun 16, 2009 at 10:02 PM, Jason Garber<jg at jasongarber.com> wrote: >> > > Gaspard (and others who are interested), >> > > I''ve been researching some tools in this domain to figure out what they''re >> > > good at and what they''re not. ?I like the advantages of parsing expression >> > > grammars (PEGs) over regular expressions so I started a test rewrite with >> > > Treetop to find out what the gotchas are (there are always gotchas). ?I''ve >> > > hit surprisingly few and mostly they were indicative that I was thinking >> > > about the problem wrongly. ?Overall it''s going great, it''s fast (so far) >> > > and >> > > very elegant. ?I can get down and spec individual parts of the grammar or >> > > even individual rules! >> > > I''m interested to see what you would come up with using contextual regexps >> > > like you mentioned for the zafu parser. ?I''d welcome you to build a little >> > > prototype like I''m building with Treetop >> > > (http://github.com/jgarber/redcloth-treetop). ?If yours implements the >> > > same >> > > subset of Textile, then we can do some benchmarking, compare the grammars >> > > and so forth. ?Don''t feel obligated?just know that I''m open-minded about >> > > it. >> > > I think I''d really like to stay away from C/Java unless someone >> > > demonstrates >> > > that it''s the only way to parse with any speed. ?My experience so far with >> > > Treetop indicates a Ruby parser can be fast enough (and if speed were >> > > really >> > > more important to us than some other factors, we wouldn''t have picked Ruby >> > > in the first place!). >> > > Either way, I also agree that it needs to be durable over the long-term. >> > > I''m making websites in Textile right and left and I can''t be updating them >> > > every time the parser changes. ?I think lots of unit tests are the only >> > > sure >> > > way to prevent regression (like happened in 4.2.0) and I''ve been very >> > > pleased with how I can spec the behavior of individual rules and grammars >> > > in >> > > my treetop prototype. >> > > So, if you want to start a prototype regexp parser, that''s cool, or if you >> > > like what you see on my prototype project, you''re welcome to join that as >> > > well. >> > > Regards, >> > > Jason >> > > On Jun 11, 2009, at 2:25 PM, Gaspard Bucher wrote: >> > > >> > > For benchmarking, I think it could make sense to have some >> > > representative data of a typical application. This might mean large >> > > amounts of pure text without any textile tags (except paragraphs), or >> > > it might be a mixture of very short texts with a few very long ones. I >> > > could give you the data from http://zenadmin.org, but the links and >> > > images are not very typical: >> > > >> > > "":45 (internal link to node 45) >> > > >> > > Want to do a little more research on it myself first, though. >> > > >> > > Of course ! Please continue to give us feedback. This kind of >> > > experimenting can provide very useful information on the different >> > > parsing solutions. >> > > >> > > G. >> > > _______________________________________________ >> > > Redcloth-upwards mailing list >> > > Redcloth-upwards at rubyforge.org >> > > http://rubyforge.org/mailman/listinfo/redcloth-upwards >> > > >> > > >> > > _______________________________________________ >> > > Redcloth-upwards mailing list >> > > Redcloth-upwards at rubyforge.org >> > > http://rubyforge.org/mailman/listinfo/redcloth-upwards >> > > >> > _______________________________________________ >> > Redcloth-upwards mailing list >> > Redcloth-upwards at rubyforge.org >> > http://rubyforge.org/mailman/listinfo/redcloth-upwards >> >> _______________________________________________ >> Redcloth-upwards mailing list >> Redcloth-upwards at rubyforge.org >> http://rubyforge.org/mailman/listinfo/redcloth-upwards > _______________________________________________ > Redcloth-upwards mailing list > Redcloth-upwards at rubyforge.org > http://rubyforge.org/mailman/listinfo/redcloth-upwards >