Before anything else, let me state this: of course it's going to be PAINFULLY slow on MRI. That's not the point :)

I thought I'd try writing a Ruby version of the parser for the purposes of Rubinius. For those of you who aren't aware, Ragel supports a goto-driven FSM on Rubinius by injecting assembly directly, and Rubinius head honcho guy Evan Phoenix is working on a patch for Ragel to update it to the new compiler semantics. So really, there is a purpose for trying this out.

Anyway, here's my initial hack. It's nasty, and presently jams the entire FSM into instance-specific data. Aieee! But it more or less seems to generate similar (albeit not identical) output to the C one:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=508f9bd42b4aad322f357637d52576f780707a2f;hb=868732662abbf4aa571bf2f3d598152467f6f4da

I've thought about having a Mongrel::HttpParser::FSM module to store the actual Ragel-generated state machine, and having Mongrel::HttpParser pass all of its ivars to an execute method, then recapture them as return values, or something to that effect.

Thoughts? Suggestions? Complete rewrites? I'd appreciate them all.

-- 
Tony Arcieri
medioh.com
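To make the FSM-module idea above concrete, here is a minimal sketch of the shape it might take. Everything in it is an assumption for illustration: ToyParser, its state numbers, and the toy transition (stop at the first space) are hypothetical and stand in for the Ragel-generated machine, which is not reproduced here. The point is only that the machine lives in a stateless module, and the parser hands its ivars in and takes the updated values back.

```ruby
# Hypothetical sketch: ToyParser and its toy transition logic are
# illustrative stand-ins, not the Ragel-generated Mongrel parser.
module ToyParser
  module FSM
    # Stand-in for the generated machine. It holds no state of its own:
    # the current state (cs) and offset (nread) come in as arguments and
    # the updated values go back out as the return value.
    def self.execute(cs, data, nread)
      p = nread
      while p < data.length && cs == 1
        cs = 2 if data[p, 1] == " "   # toy rule: first space ends the token
        p += 1
      end
      [cs, p]
    end
  end

  class Parser
    attr_reader :cs, :nread

    def initialize
      @cs = 1      # start state
      @nread = 0   # bytes consumed so far
    end

    # Pass the ivars to the shared machine, then recapture them.
    def execute(data)
      @cs, @nread = FSM.execute(@cs, data, @nread)
      @nread
    end

    def finished?
      @cs == 2
    end
  end
end

parser = ToyParser::Parser.new
parser.execute("GE")              # partial input, parser is not finished yet
parser.execute("GET / HTTP/1.1")  # resumes from the saved offset
```

With this layout the per-instance data shrinks to a couple of integers, while the transition logic is defined once on the module.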
One could replace http11's parser with some regular expressions and out-of-bounds checking rather easily. I think Kirk Haines did this (?) and said it was rather comparable in speed to the C/Ragel state machine. I guess that wasn't really the point of your exercise, but it's worth noting, if anyone actually wants a pure Ruby HTTP parser.

ry

On Thu, Apr 24, 2008 at 2:50 AM, Tony <tony at clickcaster.com> wrote:
> [...]
I pushed an updated version here:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11_parser.rb.rl;h=60c8f3d2519dc1673ef0b4107d40a9df9eca0662;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

It's now confirmed working with Mongrel::HttpServer on Rubinius with a "Hello, world!" Mongrel::HttpHandler.

It can be used to generate a goto-driven FSM using Rubinius assembly:

http://git.rubini.us/?p=code;a=blob;f=lib/mongrel/http11.rb;h=435f643ea105f7adc486dc06ab960392c3dfeab5;hb=7d246b17efc0ac37db6c241729f6b0e298f49950

Some performance figures:

MRI + C extension, parsing 10,000 requests:
  0.150000   0.000000   0.150000 (  0.152268)

Rubinius + Rubinius.asm parser, parsing 10,000 requests:
 20.500086   0.000000  20.500086 ( 20.500085)

So, presently ~135x slower than the C extension on MRI :)

-- 
Tony Arcieri
medioh.com
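The figures above are in the layout printed by Ruby's standard Benchmark library (user, system, total, then real time in parentheses). The script that produced them is not shown in the thread, so the following is only a sketch of how such a measurement might look. parse_once is a hypothetical stand-in for the parser under test, and REQUEST is an assumed sample request, not the one used for the numbers above.

```ruby
require 'benchmark'

# Assumed sample request; the thread does not say what input was used.
REQUEST = "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"

# Hypothetical stand-in for the parser under test. Swap in the real
# parser call here; this one only splits the request line.
def parse_once(data)
  data.split("\r\n").first.split(" ")
end

# Time `count` parses and return a Benchmark::Tms
# (user, system, total and real time).
def time_parses(count)
  Benchmark.measure do
    count.times { parse_once(REQUEST) }
  end
end

# Ratio of two wall-clock ("real") times.
def slowdown(slow_real, fast_real)
  slow_real / fast_real
end

puts time_parses(10_000)

# Using the real times quoted in the message above:
puts slowdown(20.500085, 0.152268).round   # prints 135
```

The ratio is taken on the real (wall-clock) column, which is where the ~135x figure comes from.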
On Sat, Apr 26, 2008 at 8:33 PM, Tony <tony at clickcaster.com> wrote:
> [...]
> So, presently ~135x slower than the C extension on MRI :)

Hey Tony, how does that compare with Rubinius + subtend?

-- 
Luis Lavena
Multimedia systems
-
Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so.
Douglas Adams
On Thu, 24 Apr 2008 13:38:03 +0200 "ry dahl" <ry at tinyclouds.org> wrote:
> One could replace http11's parser with some regular expressions and
> out-of-bounds checking rather easily. I think Kirk Haines did this (?)
> and said it was rather comparable in speed to the C/Ragel state
> machine. I guess that wasn't really the point of your exercise, but
> it's worth noting, if anyone actually wants a pure ruby http parser.

Yes, fast, but not correct. The main difference between a generated parser based on algorithms and hand-crafted regexes is that when the parser blows up, it says:

"Syntax error at character #34 expecting BLAH, FOO, and BAR symbols."

Regexen do this:

"Hi, oh thanks, I *love* hacks like this. You crafted this shellcode really well so that it looks mildly like a payload. Super awesome I'll just pass this vaguely HTTP string right on to our app."

:-)

-- 
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/
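The contrast drawn above can be illustrated with a small sketch. This is not Ragel output and not Mongrel's grammar: parse_request_line is a hypothetical hand-written scanner built on Ruby's StringScanner, covering only a simplified request line, and LOOSE is an assumed example of the kind of unanchored regex being criticized. It shows only the difference in failure behavior: the strict version reports a position and the expected token, while the loose one accepts input with junk around it.

```ruby
require 'strscan'

# Assumed example of a loose check: unanchored, so it matches anywhere
# in the input and says nothing about what surrounds the match.
LOOSE = /(GET|POST) (\S+) HTTP\/1\.[01]/

class RequestLineError < StandardError; end

# Hypothetical strict parse of a simplified request line:
#   METHOD SP PATH SP HTTP-VERSION CRLF
# Each token must appear at the current position, or we fail with the
# offset and the name of the token we expected.
def parse_request_line(line)
  s = StringScanner.new(line)
  expect = lambda do |pattern, name|
    s.scan(pattern) or raise RequestLineError,
      "Syntax error at character ##{s.pos}, expecting #{name}"
  end

  method  = expect.call(/[A-Z]+/, "METHOD")
  expect.call(/ /, "SP")
  path    = expect.call(/\/[^ \r\n]*/, "PATH")
  expect.call(/ /, "SP")
  version = expect.call(/HTTP\/1\.[01]/, "HTTP-VERSION")
  expect.call(/\r\n/, "CRLF")
  unless s.eos?
    raise RequestLineError,
      "Syntax error at character ##{s.pos}, expecting end of input"
  end

  [method, path, version]
end

junk = "junk; GET / HTTP/1.1 more junk"
puts "loose regex accepts junk-wrapped input" if LOOSE =~ junk

begin
  parse_request_line(junk)
rescue RequestLineError => e
  puts e.message   # Syntax error at character #0, expecting METHOD
end
```

A generated parser gives this kind of report for every state in the grammar, which is the property being argued for; the scanner here only mimics it for one line.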