Eric Wong
2009-Sep-12 23:57 UTC
[Mongrel-development] [RFC Mongrel2] simpler response API + updated HTTP parser
Hi all,

I've pushed out some changes based on fauna/master[1] to
git://git.bogomips.org/ur-mongrel that include a good chunk of the
platform-independent stuff found in Unicorn.

The new HTTP parser is named "mongrel_http" to avoid load-time conflicts
with the old one ("http11") but maintains the same class name
(Mongrel::HttpParser). This one even supports HTTP/0.9, so "http11"
wasn't an appropriate name for it :)

Problems:

I'm having some trouble with Rake+Echoe 3.2 with an "uninitialized
constant Platform" error, but everything seems to work by hand without
Rake+Echoe.

I'm also getting some test failures under 1.9.1-p243 with the
semaphore/threading tests. I haven't looked too hard at the current
threading model, but my gut feeling is that it's too complicated and
that a "dumber" model as in mongrel 1.x *or* a fixed number of worker
threads doing accept() is sufficient...

One thing that may be cool is to support multiple threading/concurrency
models, since 1.8/1.9/jruby/rubinius all implement threads differently
and we can also get Actors with 1.9/Rubinius.
shortlog and diffstat below:

Eric Wong (6):
      http_response: replace old API with simpler one
      http_response: drop old API compatibility
      remove HeaderOut class
      Add new HTTP/{0.9,1.0,1.1} parser
      Start using the new HTTP parser + TeeInput
      Remove unused Const::HTTP_STATUS_CODES hash

 Manifest                                     |   10 +-
 ext/mongrel_http/c_util.h                    |  107 ++++
 ext/mongrel_http/common_field_optimization.h |  111 ++++
 ext/mongrel_http/ext_help.h                  |   48 ++
 ext/mongrel_http/extconf.rb                  |    8 +
 ext/mongrel_http/global_variables.h          |   91 ++++
 ext/mongrel_http/mongrel_http.rl             |  708 ++++++++++++++++++++++++++
 ext/mongrel_http/mongrel_http_common.rl      |   74 +++
 lib/mongrel.rb                               |   64 +---
 lib/mongrel/const.rb                         |   46 +--
 lib/mongrel/header_out.rb                    |   34 --
 lib/mongrel/http_request.rb                  |  147 ++----
 lib/mongrel/http_response.rb                 |  202 ++------
 lib/mongrel/tee_input.rb                     |  144 ++++++
 test/unit/test_http_parser.rb                |  425 ++++++++++++++--
 test/unit/test_http_parser_ng.rb             |  307 +++++++++++
 test/unit/test_response.rb                   |   12 +-
 test/unit/test_server.rb                     |    3 +
 18 files changed, 2101 insertions(+), 440 deletions(-)

Full changelog:

commit 4e6ab7b7d608bd074107c6a1804401d8165062d4
Author: Eric Wong <normalperson at yhbt.net>
Date:   Sat Sep 12 16:38:22 2009 -0700

    Remove unused Const::HTTP_STATUS_CODES hash

    It's no longer used when we generate responses, instead we just
    use the one found in Rack (which was originally "stolen" from us)
    so it's one less thing for us to maintain.

commit 46ca4a1c35b92109cedd59808908e7ad1d289abb
Author: Eric Wong <normalperson at yhbt.net>
Date:   Sat Sep 12 10:40:30 2009 -0700

    Start using the new HTTP parser + TeeInput

    The new HTTP parser minimizes the amount of Ruby support code
    needed and the HttpRequest class has been changed to a single
    class method: HttpRequest.read

    As a result, this hooks up the TeeInput class into the request
    processing cycle. TeeInput lets us read the request body off the
    socket while the Rack application is being called (instead of
    being buffered before-hand) while providing rewindable semantics
    that the Rack spec requires.

commit c5a63522bc7e323c706609f7d99ed9f09fe9975d
Author: Eric Wong <normalperson at yhbt.net>
Date:   Fri Sep 11 13:55:20 2009 -0700

    Add new HTTP/{0.9,1.0,1.1} parser

    This is descended from the Mongrel parser but modified to support:

      * chunked transfer-encoding
      * trailers after chunked request bodies
      * HTTP/0.9
      * absolute URI requests
      * multi-line headers with continuation lines
      * repeated headers (joined by commas)
      * #keepalive? boolean method
      * better integration with Rack

    This is not yet hooked into any existing parts of Mongrel, that
    is the next step.

commit 8c1c7bdd3c1767708f8507d5aef8ded03b6f1796
Author: Eric Wong <normalperson at yhbt.net>
Date:   Fri Sep 11 13:16:25 2009 -0700

    remove HeaderOut class

    HttpResponse has been rewritten to just iterate through the
    headers Rack gives us in a GC-friendly way so we have no need
    for this any longer.

commit 392ea08624e39faec8d5e10ba04b21dfd9ca19a1
Author: Eric Wong <normalperson at yhbt.net>
Date:   Fri Sep 11 12:58:42 2009 -0700

    http_response: drop old API compatibility

    Avoid needless overhead in allocating a HttpResponse object and
    instead just use a class method. This is alright with Rack
    applications since Rack specifies the response is already a tuple
    for writing. Of course the headers and body of the response can
    both be generated iteratively with #each.

commit 469a507133bd20034df485f03b6eb7b0e82080d6
Author: Eric Wong <normalperson at yhbt.net>
Date:   Fri Sep 11 12:52:20 2009 -0700

    http_response: replace old API with simpler one

    The old API is completely dropped, a compatibility layer for the
    old one will be added as Rack middleware instead. This allows
    newly-written applications to go through fewer layers of
    abstraction.
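
The "class method plus Rack tuple" idea from the http_response commits can be sketched roughly as follows. This is not the code from the branch: the module name ResponseSketch, the tiny STATUS_TEXT hash (standing in for the status table taken from Rack), and the handling of newline-separated header values are all assumptions made for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in for the status-code table that the commit
# says is now taken from Rack.
STATUS_TEXT = { 200 => 'OK', 404 => 'Not Found' }.freeze

# Hypothetical module; the real class in the branch is HttpResponse.
module ResponseSketch
  # Writes a Rack-style [status, headers, body] tuple to +io+ without
  # allocating a response object. Headers and body are only required
  # to respond to #each.
  def self.write(io, rack_response)
    status, headers, body = rack_response
    io.write("HTTP/1.1 #{status} #{STATUS_TEXT.fetch(status, 'Unknown')}\r\n")
    headers.each do |key, value|
      # Assumption: multiple values for one header arrive newline-separated.
      value.to_s.split("\n").each { |v| io.write("#{key}: #{v}\r\n") }
    end
    io.write("\r\n")
    body.each { |chunk| io.write(chunk) }
  ensure
    body.close if body.respond_to?(:close)
  end
end

out = StringIO.new
ResponseSketch.write(out, [200, { 'Content-Type' => 'text/plain' }, ["hello\n"]])
puts out.string
```

A StringIO stands in for the client socket here; anything that responds to #write would do.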
git: git://git.bogomips.org/ur-mongrel
cgit: http://git.bogomips.org/cgit/ur-mongrel.git

[1] 9f9a9d488ed32a2891dc3dd7d50a17a16357042d

-- 
Eric Wong
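
Two of the parser features listed in the changelog above, continuation lines and repeated headers joined by commas, can be illustrated in plain Ruby. The real parser is generated from the Ragel sources in ext/mongrel_http; the function name fold_headers, the single-space join for continuation lines, and the HTTP_ key naming below are assumptions for this sketch only.

```ruby
# Hypothetical pure-Ruby illustration of header folding; it does not
# validate header names and is not the Ragel-based parser.
def fold_headers(raw)
  env = {}
  last_key = nil
  raw.split("\r\n").each do |line|
    if line =~ /\A[ \t]+/ && last_key
      # Continuation line: append to the previous header's value
      # (joined with a single space, which is an assumption).
      env[last_key] << ' ' << line.strip
    else
      name, value = line.split(':', 2)
      next if value.nil? # not a "Name: value" line, skip it
      key = 'HTTP_' + name.strip.upcase.tr('-', '_')
      value = value.strip
      if env.key?(key)
        # Repeated header: join the values with a comma.
        env[key] << ',' << value
      else
        env[key] = value
      end
      last_key = key
    end
  end
  env
end

raw = "Accept: text/html\r\n" \
      "X-Long: part one\r\n" \
      "\tpart two\r\n" \
      "Accept: text/plain\r\n"
headers = fold_headers(raw)
headers.each { |k, v| puts "#{k}=#{v}" }
```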
Eric Wong
2009-Oct-08 01:35 UTC
[Mongrel-development] [RFC Mongrel2] simpler response API + updated HTTP parser
Eric Wong <normalperson at yhbt.net> wrote:
> Hi all,
>
> I've pushed out some changes based on fauna/master[1] to
> git://git.bogomips.org/ur-mongrel that include a good chunk of the
> platform-independent stuff found in Unicorn.

We hit one bug/weird interaction with Rails in Unicorn, so here's a fix
I put in. Unfortunately the test cases for TeeInput in Unicorn currently
rely on fork() + pipe() (it was just more natural for me to write), but
if there's interest I could be persuaded to write a non-*nix version.

This issue could arguably be considered a bug in Rails:

  https://rails.lighthouseapp.com/projects/8994/tickets/3343

Just in case, I'm also asking for Rack to allow the readpartial method
into the "rack.input" spec here:

  http://groups.google.com/group/rack-devel/browse_thread/thread/3dfccb68172a6ed6

From 87254d37c519b63a1d39c938cd4a53b08e2a1065 Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson at yhbt.net>
Date: Wed, 7 Oct 2009 18:24:27 -0700
Subject: [PATCH] more-compatible TeeInput#read for POSTs with Content-Length

There are existing applications and libraries that don't check
the return value of env['rack.input'].read(length) (like Rails :x).
Those applications became broken under the IO#readpartial semantics
of TeeInput#read when handling larger request bodies.

We'll preserve the IO#readpartial semantics _only_ when handling
chunked requests (as long as Rack allows it, it's useful for
real-time processing of audio/video streaming uploads,
especially with Rainbows! and mobile clients) but use
read-in-full semantics for TeeInput#read on requests with a known
Content-Length.
---
 lib/mongrel/tee_input.rb |   43 +++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/lib/mongrel/tee_input.rb b/lib/mongrel/tee_input.rb
index 442c55a..3605e20 100644
--- a/lib/mongrel/tee_input.rb
+++ b/lib/mongrel/tee_input.rb
@@ -44,6 +44,26 @@ module Mongrel
       @size = tmp_size
     end
 
+    # call-seq:
+    #   ios = env['rack.input']
+    #   ios.read([length [, buffer ]]) => string, buffer, or nil
+    #
+    # Reads at most length bytes from the I/O stream, or to the end of
+    # file if length is omitted or is nil. length must be a non-negative
+    # integer or nil. If the optional buffer argument is present, it
+    # must reference a String, which will receive the data.
+    #
+    # At end of file, it returns nil or "" depend on length.
+    # ios.read() and ios.read(nil) returns "".
+    # ios.read(length [, buffer]) returns nil.
+    #
+    # If the Content-Length of the HTTP request is known (as is the common
+    # case for POST requests), then ios.read(length [, buffer]) will block
+    # until the specified length is read (or it is the last chunk).
+    # Otherwise, for uncommon "Transfer-Encoding: chunked" requests,
+    # ios.read(length [, buffer]) will return immediately if there is
+    # any data and only block when nothing is available (providing
+    # IO#readpartial semantics).
     def read(*args)
       socket or return @tmp.read(*args)
 
@@ -58,9 +78,9 @@ module Mongrel
         rv = args.shift || @buf2.dup
         diff = tmp_size - @tmp.pos
         if 0 == diff
-          tee(length, rv)
+          ensure_length(tee(length, rv), length)
         else
-          @tmp.read(diff > length ? length : diff, rv)
+          ensure_length(@tmp.read(diff > length ? length : diff, rv), length)
         end
       end
     end
@@ -140,5 +160,24 @@ module Mongrel
       tmp
     end
 
+    # tee()s into +buf+ until it is of +length+ bytes (or until
+    # we've reached the Content-Length of the request body).
+    # Returns +buf+ (the exact object, not a duplicate)
+    # To continue supporting applications that need near-real-time
+    # streaming input bodies, this is a no-op for
+    # "Transfer-Encoding: chunked" requests.
+    def ensure_length(buf, length)
+      # @size is nil for chunked bodies, so we can't ensure length for those
+      # since they could be streaming bidirectionally and we don't want to
+      # block the caller in that case.
+      return buf if buf.nil? || @size.nil?
+
+      while buf.size < length && @size != @tmp.pos
+        buf << tee(length - buf.size, @buf2)
+      end
+
+      buf
+    end
+
   end
 end
-- 
Eric Wong
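
The difference between the two read behaviours described in the patch above can be seen in a small standalone sketch. None of this is Mongrel or Unicorn code: TrickleSource, read_partial and read_full are hypothetical names, and the 4-byte limit merely imitates a socket that delivers a body in small pieces.

```ruby
require 'stringio'

# Hypothetical source that hands back at most 4 bytes per call.
class TrickleSource
  def initialize(data)
    @io = StringIO.new(data)
  end

  # Returns whatever is "available" (up to 4 bytes here), and raises
  # EOFError at end of input, loosely like IO#readpartial.
  def readpartial(max)
    @io.read([max, 4].min) or raise EOFError
  end
end

# Partial-read semantics: a single call, possibly shorter than +length+.
def read_partial(src, length)
  src.readpartial(length)
end

# Read-in-full semantics: keep reading until +length+ bytes have been
# collected or the whole body (+content_length+ bytes) has been consumed.
def read_full(src, length, content_length)
  buf = src.readpartial(length)
  consumed = buf.size
  while buf.size < length && consumed != content_length
    chunk = src.readpartial(length - buf.size)
    consumed += chunk.size
    buf << chunk
  end
  buf
end

body = '0123456789'
short = read_partial(TrickleSource.new(body), 8)
full  = read_full(TrickleSource.new(body), 8, body.size)
puts "partial read: #{short.size} bytes, full read: #{full.size} bytes"
```

A caller that ignores the return value of read(8) only gets what it expects in the second case, which is the situation the commit message describes for requests with a known Content-Length.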
Eric Wong
2009-Oct-27 21:59 UTC
[Mongrel-development] [RFC Mongrel2] simpler response API + updated HTTP parser
Eric Wong <normalperson at yhbt.net> wrote:
> Eric Wong <normalperson at yhbt.net> wrote:
> > Hi all,
> >
> > I've pushed out some changes based on fauna/master[1] to
> > git://git.bogomips.org/ur-mongrel that include a good chunk of the
> > platform-independent stuff found in Unicorn.

One more that I just pushed out to git://git.bogomips.org/ur-mongrel

From f1e493e98a76345b4a05b29e037826626138776b Mon Sep 17 00:00:00 2001
From: Eric Wong <normalperson at yhbt.net>
Date: Tue, 27 Oct 2009 14:38:51 -0700
Subject: [PATCH] tee_input: avoid IO#sync=true to workaround BSD stdio issue

IO#sync = true causes bad things with Ruby 1.8.x and stdio in
*BSDs. Since Mongrel 1.x originally didn't use IO#sync=true and
needs to work on slow clients and a wider number of OSes than
Unicorn, it may be better to just avoid IO#sync=true instead
of an explicit seek-after-write (like Unicorn does).

This issue was tracked (and fixed) in ruby-core:26300[1], but a
MRI 1.8 release may be a while off and people have a tendency to
upgrade MRI slowly.

[1] http://redmine.ruby-lang.org/issues/show/2267
---
 lib/mongrel/tee_input.rb |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/lib/mongrel/tee_input.rb b/lib/mongrel/tee_input.rb
index 3605e20..cf20613 100644
--- a/lib/mongrel/tee_input.rb
+++ b/lib/mongrel/tee_input.rb
@@ -134,6 +134,10 @@ module Mongrel
         begin
           if parser.filter_body(dst, socket.readpartial(length, buf)).nil?
             @tmp.write(dst)
+            # This seek is to workaround a BSD stdio + MRI 1.8.x issue,
+            # [ruby-core:26300] but currently not needed unless we've
+            # set @tmp.sync=true
+            # @tmp.seek(0, IO::SEEK_END) if @tmp.sync
             return dst
           end
         rescue EOFError
@@ -155,7 +159,6 @@ module Mongrel
 
     def tmpfile
       tmp = Tempfile.new(Const::MONGREL_TMP_BASE)
-      tmp.sync = true
       tmp.binmode
       tmp
     end
-- 
Eric Wong
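
For readers unfamiliar with the tee step this patch touches, here is a heavily simplified sketch of copying a body into a Tempfile while streaming it, without setting sync = true on the Tempfile. The tee helper, the 'tee-sketch' basename and the StringIO standing in for a socket are assumptions; the real method also runs the data through the parser's filter_body. The sketch does not reproduce the BSD stdio problem, which is specific to MRI 1.8.x.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical, simplified tee: copies up to +length+ bytes from +src+
# into +tmp+ and returns them, or nil once +src+ is exhausted.
def tee(src, tmp, length)
  chunk = src.read(length)
  return nil if chunk.nil?
  tmp.write(chunk)
  chunk
end

src = StringIO.new('field=value&other=1') # stands in for the client socket
tmp = Tempfile.new('tee-sketch')          # hypothetical basename
tmp.binmode                               # note: no tmp.sync = true

# First pass: the application streams the body as it arrives.
streamed = String.new
while (chunk = tee(src, tmp, 8))
  streamed << chunk
end

# Second pass: rewinding replays the body from the Tempfile, which is
# what makes the input rewindable.
tmp.rewind
replayed = tmp.read
tmp.close!

puts streamed == replayed
```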