Tim Snowhite
2013-Aug-15  22:32 UTC
[PATCH] http_response: reattempt writing body chunks to the socket under situations of high EINVAL/EAGAIN load.
We run Unicorn with ruby 1.8.7 on FreeBSD and began experiencing many early
terminations of the body stream when attempting to transfer files over a couple
hundred Kb. The body stream would terminate with …data…HTTP 1.1 500 Internal
Service Error, due to raising an Errno::EINVAL and catching this handler:
 # if we get any error, try to write something back to the client
 # assuming we haven't closed the socket, but don't get hung up
 # if the socket is already closed or broken.  We'll always ensure
 # the socket is closed at the end of this function
 def handle_error(client, e)
   code = case e
   when EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::EINVAL,Errno::EBADF,
        Errno::ENOTCONN
     500
   when Unicorn::RequestURITooLongError
     414
   when Unicorn::RequestEntityTooLargeError
     413
   when Unicorn::HttpParserError # try to tell the client they're bad
     400
   else
     Unicorn.log_error(@logger, "app error", e)
     500
   end
   client.kgio_trywrite(err_response(code, @request.response_start_sent))
   client.close
   rescue
 end
 By reattempting the write to the socket over and over we were able to get past
the error and send the appropriate data along. (It didn't usually take more
than 10 retries per _connection_, I bumped the number up to 50 tries per _write_
to be far beyond anything we'd ever need.)
It appears that Kgio is used throughout unicorn to serve this exact purpose, a
possible better solution might be to use something along the lines of:
  body.each {|chunk|
     50.times {
	     failure = socket.kgio_trywrite(chunk)
         case failure
	     when :wait_writable then
           next;
         when String then
           chunk = failure
           next;
         when nil then
           break;
         end
     }
     Unicorn.log_error(@logger, "response write error",
Exception.new(maybe)) if @logger.respond_to?(:error)
  }
Sadly I was not able to create an example scenario under which the socket.write
would throw Errno::EINVAL. Any suggestions would be welcome for how to generate
such a scenario.
---
 lib/unicorn/http_response.rb | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/lib/unicorn/http_response.rb b/lib/unicorn/http_response.rb
index 083951c..edaceee 100644
--- a/lib/unicorn/http_response.rb
+++ b/lib/unicorn/http_response.rb
@@ -57,7 +57,20 @@ module Unicorn::HttpResponse
       body = nil # ensure we do not close body
       hijack.call(socket)
     else
-      body.each { |chunk| socket.write(chunk) }
+      body.each { |chunk| 
+        tries = 0
+        begin
+          socket.write(chunk)
+        rescue Errno::EINVAL => e
+          @logger.error "response write retryable error einval attempt
#{tries}: #{e.message} (#{e.class}) #{e.backtrace.first}" if
@logger.respond_to?(:error)
+          tries += 1
+          retry if tries < 50
+          raise
+        rescue => e
+          Unicorn.log_error(@logger, "response write error", e) if
@logger.respond_to?(:error)
+          raise
+        end
+      }
     end
   ensure
     body.respond_to?(:close) and body.close
-- 
1.8.1.5
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
Eric Wong
2013-Aug-15  23:46 UTC
Re: [PATCH] http_response: reattempt writing body chunks to the socket under situations of high EINVAL/EAGAIN load.
Tim Snowhite <tsnowhite@taximagic.com> wrote:> We run Unicorn with ruby 1.8.7 on FreeBSD and began experiencing many > early terminations of the body stream when attempting to transfer > files over a couple hundred Kb. The body stream would terminate with > …data…HTTP 1.1 500 Internal Service Error, due to raising an > Errno::EINVAL and catching this handler:Having to retry a write() (or any syscall) on EINVAL seems completely wrong... More analysis below.> # if we get any error, try to write something back to the client > # assuming we haven't closed the socket, but don't get hung up > # if the socket is already closed or broken. We'll always ensure > # the socket is closed at the end of this function > def handle_error(client, e) > code = case e > when EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::EINVAL,Errno::EBADF, > Errno::ENOTCONNWe inherited the EINVAL/EBADF capture from Mongrel, but really we shouldn't be swallowing that error/backtrace. Perhaps we should just let those exceptions fall through to the "app error" below> 500 > when Unicorn::RequestURITooLongError > 414 > when Unicorn::RequestEntityTooLargeError > 413 > when Unicorn::HttpParserError # try to tell the client they're bad > 400 > elseLet EINVAL/EBADF hit here:> Unicorn.log_error(@logger, "app error", e) > 500 > end > client.kgio_trywrite(err_response(code, @request.response_start_sent)) > client.close > rescue > end >> By reattempting the write to the socket over and over we were able to > get past the error and send the appropriate data along. (It didn't > usually take more than 10 retries per _connection_, I bumped the > number up to 50 tries per _write_ to be far beyond anything we'd ever > need.)> It appears that Kgio is used throughout unicorn to serve this exact > purpose, a possible better solution might be to use something along > the lines of: > > body.each {|chunk| > 50.times { > failure = socket.kgio_trywrite(chunk) > case failure > when :wait_writable then > next;That's just busy looping, at least check for writability via kgio_wait_writable or IO.select. But really, a non-blocking socket write doesn't make sense for unicorn.> when String then > chunk = failure > next; > when nil then > break; > end > } > Unicorn.log_error(@logger, "response write error", Exception.new(maybe)) if @logger.respond_to?(:error) > }> Sadly I was not able to create an example scenario under which the > socket.write would throw Errno::EINVAL. Any suggestions would be > welcome for how to generate such a scenario.Is there any chance stringifying the chunk from your Rack response body could be causing Errno::EINVAL? Otherwise, this is a FreeBSD bug which needs to be fixed. In the mean time (if you can confirm the EINVAL is really coming from the socket.write and not the chunk stringification, perhaps modifying Kgio::Socket#write is less intrusive: # workaround possibly broken write() on FreeBSD class Kgio::Socket def write(chunk) tries = 50 begin super rescue Errno::EINVAL tries -= 1 raise if tries < 0 retry rescue => e # ... end end end if RUBY_PLATFORM =~ /freebsd/ _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying