Gaspard Bucher
2009-Jun-07 19:17 UTC
Close to a 4.2 release; experimenting with Ragel alternatives
Hi Jason!

Hmmm, this is good and bad news:

Good: Ruby hooks mean I could use a single pass to parse Textile customizations in Zena instead of running two parsers. Nice.

Bad: I have just switched to Ragel for QueryBuilder to parse pseudo-SQL, and I fear running into the same shortcomings (if that's an English phrase).

Could you describe more precisely what you are missing with Ragel? I'm parsing just about anything I want with this thing, but maybe I'm too dumb to see the walls I'm running into...

Gaspard

On Sun, Jun 7, 2009 at 12:59 PM, Jason Garber <jg at jasongarber.com> wrote:
> I just went through the ticket list and dropped a bunch from the 4.2
> milestone that are just too difficult with Ragel. Many of them I've poked
> at and they've left me saying, "how the heck am I supposed to do that!?"
> Multi-byte content will probably never work because the Ragel docs say it
> won't work with conditionals (actions that return true or false to
> determine if a state should be accepted), which I see no way around. Not
> recognizing vertical pipes escaped with notextile tags in tables, exiting
> the HTML machine on the first closing block tag it sees, leaving pre blocks
> prematurely... all these bugs would require a lot of time and code to fix.
> And they're just the tip of the iceberg. If I walk through the code and
> look at it through the lens of nondeterminism, I can see lots more problems
> that people just haven't run into yet.
>
> I'd like to release RedCloth 4.2 once I fix the low-hanging fruit. Then I
> plan to poke around for alternatives to Ragel. It's been great, but
> RedCloth has gotten really difficult to maintain because:
>
> 1.) It has to compile
> 2.) It compiles to three languages, has a couple binary gem distributions,
>     and needs to work with Ruby 1.8 and 1.9, which is always a challenge
> 3.) Many reported bugs involve nondeterminism and require things that
>     DFA-based tools like Ragel have a hard time doing
> 4.) Not that many people can fix bugs themselves because they don't know
>     Ragel or they don't understand the code.
> 5.) It's hard to tell people they can't mix in extensions. Right now
>     RedCloth is a black box and you have to pre- or post-parse for extra
>     patterns, like wiki links. I want people to be able to use it how they
>     want. If that means mixing in their own cruddy patterns, awesome.
>
> A PEG might be the way to go. I'm looking at Treetop, which is nice,
> decently maintained, has some history, and is used by Cucumber. It doesn't
> let me manipulate the parser's acceptance of expressions in code, though.
> It's a known problem, which is why you don't see any YAML parsers in
> Treetop yet (they have a proposal on Global Parsing State and Semantic
> Backtrack Triggering). Also, without backreferences or the equivalent in
> code, it would be hard to match things like HTML tags.
>
> I'm also looking at James Edward Gray II's Ghost Wheel. I like the grammar
> syntax better and he says it "provides hooks for Ruby code that can be used
> to make parsing decisions or transform parsed results," but it's less
> widely used and less well documented, and I haven't tried it out, so I
> don't know its limitations.
>
> If anyone else has suggestions of things I should explore, do let me know!
> I want to keep RedCloth fast, but it also needs to be maintainable.
>
> Jason
> _______________________________________________
> Redcloth-upwards mailing list
> Redcloth-upwards at rubyforge.org
> http://rubyforge.org/mailman/listinfo/redcloth-upwards
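
For readers following the thread, the remark in the quoted message about backreferences and HTML tags can be illustrated with a minimal sketch. This is not RedCloth, Ragel, or Treetop code; the constant `PAIRED_TAG` and the helper `paired_tags` are hypothetical names made up for this illustration. It only shows why matching a closing tag needs the name captured from the opening tag, which a plain regex engine provides through `\1`.

```ruby
# Hypothetical illustration only (not taken from RedCloth).
# Group 1 captures the opening tag name; \1 requires the closing tag
# to repeat exactly that name. Group 2 is the (lazily matched) body.
PAIRED_TAG = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>/m

# Returns an array of [tag_name, body] pairs for each outermost paired
# tag found while scanning left to right.
def paired_tags(text)
  text.scan(PAIRED_TAG).map { |name, body| [name, body] }
end
```

Calling `paired_tags("<pre>a|b</pre>")` would pair the `pre` tag with its body, while a mismatched input such as `<b>x</i>` yields no pairs. A grammar without backreferences or code hooks has to enumerate every tag name as a separate rule to get the same effect.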
Jason Garber
2009-Jun-08 04:28 UTC
Close to a 4.2 release; experimenting with Ragel alternatives
It's probably me who's too dumb for Ragel. :) Take a look at the bugs tagged "difficult" on the tracker. I'll also forward you what I sent to why describing the problems.