We have a service that clients use to upload files. We have a couple of clients that are on slow links and so their upload times out. We get errors in the logs that I''d like to get rid of. I was hoping that the recent commit 24b9f66dcdda44378b4053645333ce9ce336b413 would help us, but it does not. After digging in a bit, I have some ideas about why and a patch I''d like comments on. The commit above attempts to ignore exceptions of types EOFError, Errno::ECONNRESET, Errno::EPIPE, and Errno::ENOTCONN. However, that doesn''t address our issue because our exception comes when trying to write the response. It is a generic IOError and thus is not handled. In addition, I believe that change is unsafe because I believe client code could raise those exceptions. Think about a client with a database connection: that could also raise e.g. EOFError. I believe unicorn needs to check the status of the client connection instead of the type of exception that is raised. The patch below has two new test cases: one where the client code raises an EOFError and one that replicates the condition we see where the client socket is closed before the response can be written. It changes the behavior of the ''test_client_shutdown_write_truncates'' test, but I believe the change makes sense. If the client socket is still open for writing, unicorn should write the response. Thoughts? --drew diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb index 402f133..3620427 100644 --- a/lib/unicorn/http_server.rb +++ b/lib/unicorn/http_server.rb @@ -547,8 +547,6 @@ class Unicorn::HttpServer # the socket is closed at the end of this function def handle_error(client, e) code = case e - when EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::ENOTCONN - # client disconnected on us and there''s nothing we can do when Unicorn::RequestURITooLongError 414 when Unicorn::RequestEntityTooLargeError @@ -556,8 +554,12 @@ class Unicorn::HttpServer when Unicorn::HttpParserError # try to tell the client they''re bad 400 else - Unicorn.log_error(@logger, "app error", e) - 500 + if client.closed? + # client disconnected on us and there''s nothing we can do + else + Unicorn.log_error(@logger, "app error", e) + 500 + end end if code client.kgio_trywrite(err_response(code, @request.response_start_sent)) diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb index e5b335f..2fca1b4 100644 --- a/test/unit/test_server.rb +++ b/test/unit/test_server.rb @@ -145,8 +145,7 @@ class WebServerTest < Test::Unit::TestCase # processing on us even during app dispatch sock.shutdown(Socket::SHUT_WR) IO.select([sock], nil, nil, 60) or raise "Timed out" - buf = sock.read - assert_equal "", buf + assert_match %r{\AHTTP/1.[01] 500\b}, sock.sysread(4096) next_client = Net::HTTP.get(URI.parse("http://127.0.0.1:#@port/")) assert_equal ''hello!\n'', next_client lines = File.readlines("test_stderr.#$$.log") @@ -265,4 +264,55 @@ class WebServerTest < Test::Unit::TestCase def test_listener_names assert_equal [ "127.0.0.1:#@port" ], Unicorn.listener_names end + + # ensure that EOFError from client code is not ignored + def test_eof_app + teardown + app = lambda { |env| raise EOFError } + # [200, {}, []] } + redirect_test_io do + @server = HttpServer.new(app, :listeners => [ "127.0.0.1:#@port"] ) + @server.start + end + sock = TCPSocket.new(''127.0.0.1'', @port) + sock.syswrite("GET / HTTP/1.0\r\n\r\n") + assert_match %r{\AHTTP/1.[01] 500\b}, sock.sysread(4096) + assert_nil sock.close + lines = File.readlines("test_stderr.#$$.log") + assert lines.grep(/app error:/) + end + + def test_file_streamed_request_close + teardown + # do a funky dance so that the socket is closed partway though the request + tmp = Tempfile.new(''test_file_streamed_request_close'') + app = lambda { |env| + while File.zero?(tmp.path) + sleep(0.2) + end + tmp.unlink + o = env[''rack.input''].instance_variable_get(:@socket) + o.close + hdrs = { ''Content-Type'' => ''text/plain'' } + [ 200, hdrs, [ "#$$\n" ] ] + } + # [200, {}, []] } + redirect_test_io do + @server = HttpServer.new(app, :listeners => [ "127.0.0.1:#@port"] ) + @server.start + end + sock = TCPSocket.new(''127.0.0.1'', @port) + body = "a" * 10 + long = "PUT /test HTTP/1.1\r\nContent-length: #{body.length + 10}\r\n\r\n" + body + sock.syswrite(long) + sock.close + tmp.puts "ok" + tmp.flush + while File.exists?(tmp.path) + sleep(0.2) + end + lines = File.readlines("test_stderr.#$$.log") + assert lines.grep(/app error:/).empty? + end + end _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying
Andrew Hobson <ahobson@gmail.com> wrote:> We have a service that clients use to upload files. We have a couple > of clients that are on slow links and so their upload times out. We > get errors in the logs that I''d like to get rid of.Those clients should really be hitting nginx, first.> I was hoping that the recent commit > 24b9f66dcdda44378b4053645333ce9ce336b413 would help us, but it does > not. After digging in a bit, I have some ideas about why and a patch > I''d like comments on.Bubbling a socket level error during an application dispatch is tricky because the application might not be prepared to handle it. Unfortunately this error is unique to unicorn since it implements streaming input. Fwiw, wrapping the app Unicorn::PrereadInput middleware may help in this situation if you''re dealing with a buggy local client.> The commit above attempts to ignore exceptions of types EOFError, > Errno::ECONNRESET, Errno::EPIPE, and Errno::ENOTCONN. However, that > doesn''t address our issue because our exception comes when trying to > write the response. It is a generic IOError and thus is not handled.IOError usually means an attempt to use the socket when it was already closed (possibly after it hit ECONNRESET/EPIPE/ENOTCONN). The only place we close the client socket where it might be visible to a Rack app is in the eof! method of StreamInput. Based on what I''m reading, this is what''s happening. However, lately, I''m thinking merely calling .shutdown on the socket is sufficient (patch below), and the close just confuses things.> In addition, I believe that change is unsafe because I believe client > code could raise those exceptions. Think about a client with a > database connection: that could also raise e.g. EOFError. I believe > unicorn needs to check the status of the client connection instead of > the type of exception that is raised.The Rack application should _always_ be trapping exceptions it generates, including DB. Where we log "app error" is only to tell the app author to fix their code and prevent a buggy app from completely breaking a worker. This also means the app needs to abort application processing if env["rack.input"] operations return EOFError (Unicorn::ClientShutdown).> The patch below has two new test cases: one where the client code > raises an EOFError and one that replicates the condition we see where > the client socket is closed before the response can be written.Your patch is badly whitespace mangled so I can''t apply it. Anyways, I think closing the socket while app dispatch is running is sufficient to avoid IOError (you''ll end up hitting ENOTCONN instead, I think). diff --git a/lib/unicorn/stream_input.rb b/lib/unicorn/stream_input.rb index c8a4240..d4aa5f9 100644 --- a/lib/unicorn/stream_input.rb +++ b/lib/unicorn/stream_input.rb @@ -141,7 +141,6 @@ private # raised EOFError on us. if @socket @socket.shutdown - @socket.close end ensure raise Unicorn::ClientShutdown, "bytes_read=#{@bytes_read}", [] _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying
(Please don''t cull Cc:, I''m assuming you''re not subscribed to the mailing list since we don''t require subscriptions) Andrew Hobson <ahobson@gmail.com> wrote:> Eric Wong <normalperson@yhbt.net> writes: > > > Those clients should really be hitting nginx, first. > > I apologize for not being clear. They are hitting apache first.Heh, just as bad, since I''m not aware of any Apache config which will protect unicorn from slow clients.> > Fwiw, wrapping the app Unicorn::PrereadInput middleware may help in > > this situation if you''re dealing with a buggy local client. > > Hmm, I might give that a try.I expect that to consolidate your errors to one place and your application wont try to write errors. But yeah, you''re still opening up yourself to slow clients limiting your concurrency without nginx.> > IOError usually means an attempt to use the socket when it was already > > closed (possibly after it hit ECONNRESET/EPIPE/ENOTCONN). > > > > The only place we close the client socket where it might be visible to a > > Rack app is in the eof! method of StreamInput. Based on what I''m > > reading, this is what''s happening. > > Our application does not use the socket directly. It is a relatively > simple sinatra application that is accepting a file upload. As far as I > can tell from putting in some debugging statements in our code, the > error happens when we return a 400 error status when certain parameters > are missing.Right, however your upload processing will indirectly trigger socket reads: env["rack.input"].read -> Unicorn::StreamInput#eof! -> Unicorn::ClientShutdown.> I should have included this before, so here''s an example stack trace > (using a pre-release gem with the fixes from 24b9f66dcdda44378b4053645333ce9ce336b413): > > ERROR -- : app error: closed stream (IOError) > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_response.rb:53:in `write'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_response.rb:53:in `http_response_write'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_server.rb:563:in `process_client'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_server.rb:633:in `worker_loop'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_server.rb:500:in `spawn_missing_workers'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_server.rb:511:in `maintain_worker_count'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/lib/unicorn/http_server.rb:277:in `join'' > ERROR -- : gems/unicorn-4.6.3.5.ga9df/bin/unicorn:126:in `<top (required)>'' > ERROR -- : bin/unicorn:21:in `load'' > ERROR -- : bin/unicorn:21:in `<main>''Yep, I had your stack trace in my mind already :)> So it looks to me like what happens is that unicorn tries to write to > the socket to report the error and then it hits a generic IOError, > possibly from kgio: > > http://bogomips.org/kgio.git/tree/ext/kgio/my_fileno.h#n34Yes, but it would really happen from anything which attempted to use the socket.> > However, lately, I''m thinking merely calling .shutdown on the socket is > > sufficient (patch below), and the close just confuses things. > > I tried it out on my test case below. > > > The Rack application should _always_ be trapping exceptions it > > generates, including DB. Where we log "app error" is only to tell the > > app author to fix their code and prevent a buggy app from completely > > breaking a worker. > > I agree that applications should be trapping exceptions. I speculated > that the change is "incompatible" because now buggy applications that > raise EOFError will not report a 500 error. In fact, they won''t return > anything at all. Maybe that''s ok, but it seems like a pretty big change. > Or maybe I am misunderstanding entirely.Applications need to be aware of who raises EOFError. It would handle it differently if it''s a backend connection it makes or if a client triggered it (Unicorn::ClientShutdown). Since Unicorn::ClientShutdown is a subclass of EOFError, that should make things a little easier to distinguish (but unfortunately, forces some client code to be Unicorn-specific).> > Your patch is badly whitespace mangled so I can''t apply it. > > Please accept my apologies for the sloppiness. I have included a non > mangled version below. > > I think the test case I added called test_file_streamed_request_close > accurately reproduces the situation we are encountering, even though it > does so in a very heavy handed way. The change to handle_error fixes > the test case, but it does change the behavior of another test case. > > > Anyways, I think closing the socket while app dispatch is running > > is sufficient to avoid IOError (you''ll end up hitting ENOTCONN > > instead, I think). > > I applied the patch, but it does not fix the > test_file_streamed_request_close test case I have created.Right. The socket is no longer closed with my patch, so your test case does not reproduce what unicorn might do internally.> Thank you for your feedback, > > diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb > index 402f133..3620427 100644 > --- a/lib/unicorn/http_server.rb > +++ b/lib/unicorn/http_server.rb > @@ -547,8 +547,6 @@ class Unicorn::HttpServer > # the socket is closed at the end of this function > def handle_error(client, e) > code = case e > - when EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::ENOTCONN > - # client disconnected on us and there''s nothing we can doRemoving this is bad, at least the Errno::* portions since we might be outside of app dispatch and log_error would spew a backtrace for every single dropped/incomplete connection (this is still an issue on a LAN for me, but I use the worst LANs in the world :x) I''m trying to minimize the places we call IO#close on the socket, since we''ll trigger IOError in other places (as you''ve experienced).> when Unicorn::RequestURITooLongError > 414 > when Unicorn::RequestEntityTooLargeError > @@ -556,8 +554,12 @@ class Unicorn::HttpServer > when Unicorn::HttpParserError # try to tell the client they''re bad > 400 > else > - Unicorn.log_error(@logger, "app error", e) > - 500 > + if client.closed? > + # client disconnected on us and there''s nothing we can do > + else > + Unicorn.log_error(@logger, "app error", e) > + 500 > + end > end > if code > client.kgio_trywrite(err_response(code, @request.response_start_sent))In other words, your change to handle_error would fail the following case: --- a/test/unit/test_server.rb +++ b/test/unit/test_server.rb @@ -265,4 +267,29 @@ class WebServerTest < Test::Unit::TestCase def test_listener_names assert_equal [ "127.0.0.1:#@port" ], Unicorn.listener_names end + + def test_eof_headers + teardown + app = lambda { |env| [200, {}, []] } + redirect_test_io do + @server = HttpServer.new(app, :listeners => [ "127.0.0.1:#@port"] ) + @server.start + end + + sock = TCPSocket.new(''127.0.0.1'', @port) + sock.syswrite("GET / HTTP/1.0\r\n") # partial request + sock.shutdown + assert_raises(EOFError) { sock.readpartial(666) } + sock.close + + # just send another complete request to ensure the log message from the + # first (failed) request is flushed out to FS + sock = TCPSocket.new(''127.0.0.1'', @port) + sock.syswrite("GET / HTTP/1.0\r\n\r\n") + assert_match %r{200 OK}, sock.read + sock.close + + lines = File.readlines("test_stderr.#$$.log") + assert lines.grep(/app error:/).empty? + end end (I thought we already had a test for this, but apparently not..)> diff --git a/test/unit/test_server.rb b/test/unit/test_server.rb > index e5b335f..2fca1b4 100644 > --- a/test/unit/test_server.rb > +++ b/test/unit/test_server.rb > @@ -145,8 +145,7 @@ class WebServerTest < Test::Unit::TestCase > # processing on us even during app dispatch > sock.shutdown(Socket::SHUT_WR) > IO.select([sock], nil, nil, 60) or raise "Timed out" > - buf = sock.read > - assert_equal "", buf > + assert_match %r{\AHTTP/1.[01] 500\b}, sock.sysread(4096)I don''t agree with this at all. 500 means something in the server broke, but in this case the _client_ broke.> next_client = Net::HTTP.get(URI.parse("http://127.0.0.1:#@port/")) > assert_equal ''hello!\n'', next_client > lines = File.readlines("test_stderr.#$$.log") > @@ -265,4 +264,55 @@ class WebServerTest < Test::Unit::TestCase > def test_listener_names > assert_equal [ "127.0.0.1:#@port" ], Unicorn.listener_names > end > + > + # ensure that EOFError from client code is not ignored > + def test_eof_app > + teardown > + app = lambda { |env| raise EOFError } > + # [200, {}, []] } > + redirect_test_io do > + @server = HttpServer.new(app, :listeners => [ "127.0.0.1:#@port"] ) > + @server.start > + endI don''t think defining behavior here is good (other than being able to handle subsequent, unrelated requests with the process). This is a application bug if it can''t handle the exceptions it raises.> + sock = TCPSocket.new(''127.0.0.1'', @port) > + sock.syswrite("GET / HTTP/1.0\r\n\r\n") > + assert_match %r{\AHTTP/1.[01] 500\b}, sock.sysread(4096) > + assert_nil sock.close > + lines = File.readlines("test_stderr.#$$.log") > + assert lines.grep(/app error:/) > + end > + > + def test_file_streamed_request_close > + teardown > + # do a funky dance so that the socket is closed partway though the request > + tmp = Tempfile.new(''test_file_streamed_request_close'') > + app = lambda { |env| > + while File.zero?(tmp.path) > + sleep(0.2) > + end > + tmp.unlink > + o = env[''rack.input''].instance_variable_get(:@socket) > + o.closeWith my proposed patch to eliminate IO#close from StreamInput, this test is no longer an accurate representation of unicorn behavior.> + hdrs = { ''Content-Type'' => ''text/plain'' } > + [ 200, hdrs, [ "#$$\n" ] ] > + } > + # [200, {}, []] } > + redirect_test_io do > + @server = HttpServer.new(app, :listeners => [ "127.0.0.1:#@port"] ) > + @server.start > + end > + sock = TCPSocket.new(''127.0.0.1'', @port) > + body = "a" * 10 > + long = "PUT /test HTTP/1.1\r\nContent-length: #{body.length + 10}\r\n\r\n" + body > + sock.syswrite(long) > + sock.close > + tmp.puts "ok" > + tmp.flush > + while File.exists?(tmp.path) > + sleep(0.2) > + end > + lines = File.readlines("test_stderr.#$$.log") > + assert lines.grep(/app error:/).empty? > + end > + > end_______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying
Andrew Hobson <ahobson@gmail.com> wrote:> Eric Wong <normalperson@yhbt.net> writes: > > > (Please don''t cull Cc:, I''m assuming you''re not subscribed to the > > mailing list since we don''t require subscriptions) > > Sorry, that was unintentional.No worries, and it is good to also send a copy to each recipient in the thread in case Rubyforge is down (like it is right now). If it stays down, I''ll have to find/make a replacement myself. ...And move to something more decentralized and resilient to downtime while I''m at it.> > With my proposed patch to eliminate IO#close from StreamInput, > > this test is no longer an accurate representation of unicorn behavior. > > I applied that one line patch a day and a half ago and we haven''t seen > the error in the field (yet). I am optimistic you have elegantly fixed > the problem. > > If we do see an error, I will send another email to the list. > > Thanks again for your help,No problem, I''ll push that out later today. _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying
Eric Wong <normalperson@yhbt.net> wrote:> Andrew Hobson <ahobson@gmail.com> wrote: > > I applied that one line patch a day and a half ago and we haven''t seen > > the error in the field (yet). I am optimistic you have elegantly fixed > > the problem. > > > > If we do see an error, I will send another email to the list. > > > > Thanks again for your help, > > No problem, I''ll push that out later today.Pushed, thanks again:>From f4005d5efc608e7d75371f0d0527041facd33f89 Mon Sep 17 00:00:00 2001From: Eric Wong <normalperson@yhbt.net> Date: Thu, 7 Nov 2013 20:10:01 +0000 Subject: [PATCH] stream_input: avoid IO#close on client disconnect This can avoid IOError from being seen by the application, and also reduces points where IO#close may be called. This is a good thing if we eventually port this code into a low-level server like cmogstored where per-client memory space is defined by FD number of a client. Reported-by: Andrew Hobson <ahobson@gmail.com> --- lib/unicorn/stream_input.rb | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/lib/unicorn/stream_input.rb b/lib/unicorn/stream_input.rb index c8a4240..9278f47 100644 --- a/lib/unicorn/stream_input.rb +++ b/lib/unicorn/stream_input.rb @@ -139,10 +139,7 @@ private # we do support clients that shutdown(SHUT_WR) after the # _entire_ request has been sent, and those will not have # raised EOFError on us. - if @socket - @socket.shutdown - @socket.close - end + @socket.shutdown if @socket ensure raise Unicorn::ClientShutdown, "bytes_read=#{@bytes_read}", [] end -- 1.8.4.483.g7fe67e6.dirty _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying
Maybe Matching Threads
- app error: Socket is not connected (Errno::ENOTCONN)
- [PATCH] construct listener_fds Hash in 1.8 compatible way
- [PATCH] Start the server if another user has a PID matching our stale pidfile.
- Unicorn 4.7.0 tests fail on OSX 10.7.5 and debian squeeze
- Fwd: Issue starting unicorn with non-ActiveRecord Rails app